Vetted Apache Hive Professionals

Pre-screened and vetted.

VM

Senior Data Scientist specializing in GenAI, LLMs and RAG

Dallas, TX5y exp
Texas InstrumentsTrine University

Built and deployed a production LLM-powered RAG assistant for semiconductor manufacturing failure analysis, reducing engineer triage effort by grounding outputs in retrieved evidence and gating responses with SPC + ML signals (LSTM anomaly scores, XGBoost probabilities). Experienced with LangChain/LangGraph to ship reliable, observable multi-step agents with branching/fallback logic, and evaluates impact using both technical metrics and business KPIs like mean time to triage and downtime reduction.

View profile
JV

Mid-level Generative AI Engineer specializing in enterprise RAG and multimodal NLP

Iselin, NJ5y exp
Wells FargoSt. Francis College

Built and deployed a production LLM/RAG chatbot at Wells Fargo for securely querying regulated financial and compliance documents, emphasizing low hallucination rates, explainability, and strict governance. Experienced with LangChain multi-agent orchestration plus Airflow/Prefect pipelines for ingestion, embeddings, evaluation, and retraining, and partnered closely with compliance/operations to drive adoption through demos and feedback-driven retrieval rules.

View profile
HK

Mid-level Data/ML Engineer specializing in NLP, GenAI, and scalable data pipelines

5y exp
AbbottClarkson University

AI/ML engineer with production experience building LLM-powered document intelligence and customer support systems in healthcare/insurance, emphasizing high-accuracy RAG, long-document processing, and robust monitoring/fallback mechanisms. Also automates and scales ML lifecycle workflows using Apache Airflow and Kubeflow, and partners closely with non-technical operations stakeholders to drive adoption.

View profile
BG

Senior Data Scientist / ML Engineer specializing in cloud ML pipelines and GenAI

Baltimore, MD17y exp
IntelIllinois Institute of Technology

ML/NLP practitioner with experience building a transformer-failure prediction system that combines sensor signals with unstructured maintenance comments using LLM-based extraction and similarity validation. Strong emphasis on production readiness—data leakage controls, SQL-driven data quality tiers, and rigorous bias/fairness validation (including contract/spec evaluation across diverse company profiles).

View profile
DJ

Mid-level Full-Stack Developer specializing in cloud microservices and internal tooling

4y exp
The Home DepotUniversity of Central Missouri

LLM/RAG engineer who has shipped production systems in high-stakes domains (fraud analytics at Mastercard and security compliance as a CI/CD gate). Strong focus on reliability: hybrid retrieval for latency, citation-backed outputs for trust, and code-driven eval/regression pipelines using golden datasets. Also built scalable OCR-based ingestion for messy classroom artifacts (handwriting, PDFs, whiteboard photos) using Go/Python and cloud services.

View profile
YL

Yu Liu

Screened

Senior Big Data Engineer specializing in AML/KYC compliance and cloud data platforms

New York, NY17y exp
CitigroupUniversity of Missouri

Data engineer with experience delivering an end-to-end pipeline handling ~3.5TB in a star-schema setup (fact + dimensions) and producing business-facing tables in Hive/Spark. Identified and resolved UAT-reported duplicate issues caused by joins through root-cause analysis, and also built automation to run Spark SQL metrics on weekly/monthly/quarterly cadences and distribute results to users.

View profile
BS

Senior Data Engineer specializing in cloud lakehouse platforms and streaming analytics

Pittsburgh, PA8y exp
First National BankTexas A&M University-Corpus Christi

Data engineer focused on fraud and banking analytics who has owned end-to-end batch + streaming pipelines at very large scale (hundreds of millions of records/day). Built robust data quality/observability layers (schema validation, anomaly detection, alerting) and delivered low-latency serving via AWS Lambda/API Gateway with DynamoDB + Redis, plus external data ingestion/scraping pipelines orchestrated in Airflow with anti-bot protections.

View profile
Aditya Jhaveri - Mid-level Software Engineer specializing in AI, big data, and distributed systems in Jersey City, NJ

Mid-level Software Engineer specializing in AI, big data, and distributed systems

Jersey City, NJ3y exp
New York UniversityNYU

Software Developer at NYU (GEMSS) focused on scaling and optimizing a data-heavy asset management web app, including migrating/optimizing data access via Google Sheets API and Firestore. Previously an SDE at Sainapse working on Spring Boot microservices POCs (Kafka, Hadoop at 2B+ record scale). Built an end-to-end Apple Wallet coupon generation/redemption system using PassKit + Google Apps Script with measurable ops impact (40% efficiency gain).

View profile
DM

Mid-level Data Engineer specializing in real-time analytics and regulated domains

NC, USA5y exp
JPMorgan ChaseSaint Louis University

Data platform engineer focused on large-scale, real-time fraud systems, with hands-on ownership of streaming architectures using Kafka, Spark, Snowflake, and Databricks. Stands out for combining performance tuning and platform automation with LLM/RAG-based enrichment, delivering measurable gains in latency, fraud accuracy, false positives, and analyst decision speed.

View profile
SB

Mid-level Data Engineer specializing in scalable pipelines, Spark, and cloud data warehousing

Boston, USA3y exp
Fidelity InvestmentsNortheastern University

Backend/data platform engineer who recently owned an end-to-end large-scale financial data platform delivering real-time decision support for finance and operations. Has hands-on experience modernizing legacy batch pipelines into AWS cloud-native ELT with parallel-run cutovers, strong data quality controls (dbt-style tests, reconciliation), and measurable improvements in runtime, cost, and SLA compliance. Also builds scalable, secure FastAPI microservices using Docker, ALB-based horizontal scaling, Redis caching, and managed auth with Cognito/Supabase plus Postgres RLS.

View profile
BC

Bhuvan Chandi

Screened

Mid-level Data Engineer specializing in AI/ML data platforms

NY, NY6y exp
BlackRockWebster University

Built and productionized an LLM-powered PDF document Q&A system to eliminate manual searching through long documents, focusing on scalability and answer reliability. Implemented semantic chunking (using headings/paragraphs/tables), overlap, and preprocessing/quality checks to reduce hallucinations, and orchestrated the end-to-end pipeline with Airflow using retries, alerts, and parallel tasks.

View profile
SR

Senior Data Scientist specializing in machine learning and customer analytics

Illinois, USA7y exp
Northern TrustBradley University

Data/ML practitioner with experience applying NLP and classical ML to large-scale customer data (2B+ records) for segmentation, prediction, and survey-text classification, delivering measurable business impact (~18% engagement efficiency). Has hands-on entity resolution across multi-source datasets and has built embedding-based semantic search using SentenceBERT + a vector database with domain fine-tuning (~20% relevance improvement), plus production workflow experience with Spark/Airflow and cloud tooling (AWS/Azure).

View profile
DK

Senior Data Engineer specializing in Azure Lakehouse, Databricks/Spark, and Snowflake

Richardson, TX6y exp
PwCUniversity of Central Missouri

Data engineer/platform builder with experience across PwC and Liberty Mutual delivering high-volume, production-grade pipelines and real-time data services. Has owned end-to-end streaming + batch architectures on AWS and Azure, including web scraping systems, with quantified reliability gains (99.9% availability, 90%+ error reduction, 30% latency reduction) and strong observability/CI-CD practices.

View profile
Vaibhav Sharma - Mid-level Software Engineer specializing in AI/ML and data platforms in Remote, USA

Mid-level Software Engineer specializing in AI/ML and data platforms

Remote, USA5y exp
GoogleIndiana University Bloomington

AI/ML engineer who built a production agentic system to automate computational research experiments (simulation execution, parameter exploration, and numerical analysis) and mitigated context-window failures using constrained tool-calling/prompt-chaining patterns in LangChain with OpenAI tool-enabled models. Also has adtech/big-data pipeline experience at InMobi, orchestrating Spark jobs in Airflow to filter bot-like user IDs and publish clean IDs to an online NoSQL store for live serving, plus Apache open-source collaboration experience.

View profile
Harshavardhan Reddy - Mid-level AI/ML Data Scientist specializing in NLP, computer vision, and risk analytics in Albany, NY

Mid-level AI/ML Data Scientist specializing in NLP, computer vision, and risk analytics

Albany, NY5y exp
Capital OnePace University

ML/AI engineer with Capital One experience building production-grade customer segmentation and fraud detection systems combining NLP (transformers) and anomaly detection. Strong MLOps and orchestration background (PySpark ETL, MLflow, Airflow, Docker/Kubernetes, Azure ML) with real-time monitoring/alerting and performance optimizations like quantization and caching, plus proven ability to deliver business-facing insights through Power BI/Tableau for marketing stakeholders.

View profile
YY

Yinghai Yu

Screened

Mid-level Data Engineer specializing in cloud data platforms and AI/ML pipelines

San Mateo, CA6y exp
Bubbles and BooksGeorgia Tech

Data-engineering-oriented candidate with hands-on experience building an agentic AI product and operational automation workflows. They described automating inventory-to-ERP discrepancy reconciliation with anomaly detection and daily reporting, and also have practical scraping/automation experience dealing with Cloudflare-protected sites using Selenium and Puppeteer.

View profile
SP

Junior AI/ML Software Engineer specializing in LLMs and data-intensive systems

New York, NY3y exp
NYU Langone HealthNYU

AI/backend engineer who has owned production applied-ML systems end to end, including a Jitsi meeting intelligence platform with custom RoBERTa boundary detection, LLM summarization, and automated retraining from user feedback. Also has healthcare AI experience building a diabetes medication titration system with strict validation, drift monitoring, and safety guardrails—showing both product speed and high-stakes engineering rigor.

View profile
RW

Junior Data Engineer and Analyst specializing in ETL, analytics, and e-commerce data

Walnut, CA3y exp
Dreamstream, LLCUC Irvine

Data engineer with a Master's in Data Science who has owned 30+ customer-facing K-12 SIS migrations end-to-end, building ETL, validation, and SOP-driven deployment processes in a PII-sensitive environment. Also brings recent hands-on agentic AI experience from a biotech capstone, where they led a production-oriented NLP-to-SQL + RAG support system that handled about 30% of support queries in testing.

View profile
AB

Anuj Bubna

Screened

Senior DevOps/SRE Engineer specializing in cloud automation, reliability, and data pipelines

10y exp
IntuitUniversity of Texas at Dallas

Hands-on technical professional experienced in taking LLM/AI-adjacent integrations from prototype to production, using customer observation to refine UX and uncover edge cases. Diagnoses workflow issues in real time using logs and Sankey-style workflow analysis, and communicates fixes with clear short/long-term plans plus proactive alerting. Also partners cross-functionally to drive adoption and cost savings, including a POC around IBM Sterling Integrator that reduced licensing costs by $30K/year.

View profile
GN

Giri Nathan

Screened

Executive Technology Leader (CTO/CIO) specializing in cloud, AI/ML, and cybersecurity

38y exp
Production Resource GroupCharter Oak State College

CTO who ties technology strategy directly to business outcomes, building multi-year roadmaps with measurable ROI. Led major modernization (cloud, data platform, unified API, microservices + CI/CD) delivering 5x faster releases/deployments, 99.8% uptime, and 40% user growth without headcount increases, while scaling engineering from 15 to 80+ in ~18 months.

View profile
PK

Prem Kumar

Screened

Senior Data Engineer specializing in cloud data platforms and regulated analytics

McLean, VA6y exp
Capital OneRowan University

Data engineer at Capital One building AWS-based real-time and batch pipelines and backend data services for financial/fraud use cases. Has owned end-to-end pipelines processing millions of records/day, implemented dbt/Great Expectations quality gates, and tuned Redshift/Snowflake workloads (cutting query latency ~22–25% and reducing pipeline failures ~30–40%) while supporting 15+ downstream consumers.

View profile
SB

Mid-level Data Engineer specializing in cloud data platforms and big data pipelines

5y exp
Molina HealthcareUniversity of Michigan-Dearborn

Healthcare data engineer with hands-on ownership of claims/member data pipelines on a cloud analytics platform, spanning batch and streaming ingestion (Airflow/Kafka/Spark/Databricks) through serving for reporting. Emphasizes reliability and data quality via embedded validation, schema-drift detection, deduplication, and operational monitoring/incident response, plus pragmatic CI/CD and observability setup in early-stage/ambiguous projects.

View profile
Dinesh Kumar Patibandla - Mid-level Machine Learning Engineer specializing in LLMs and RAG for finance and healthcare in Texas, USA

Mid-level Machine Learning Engineer specializing in LLMs and RAG for finance and healthcare

Texas, USA4y exp
Goldman SachsUniversity of North Texas

ML Engineer with recent Goldman Sachs experience building and deploying a production RAG/LLM assistant for summarization, drafting, and internal knowledge retrieval across financial, risk, and compliance documents. Designed for heavy regulatory constraints and scaled to 10,000+ concurrent users using Kubernetes-based orchestration, dynamic LLM routing, and rigorous testing (adversarial prompts, A/B tests, load simulations) with privacy controls like differential privacy.

View profile
Pavan Kumar Malasani - Mid-level AI/ML Engineer specializing in financial risk, fraud detection, and GenAI in Remote, USA

Mid-level AI/ML Engineer specializing in financial risk, fraud detection, and GenAI

Remote, USA4y exp
CitigroupUniversity of Colorado Boulder

GenAI/ML engineer in Citigroup’s finance environment who has deployed production RAG systems for investment banking under strict privacy and model-risk constraints. Built an internal-VPC Llama2 + Pinecone + LangChain solution with NER redaction and citation-based verification to prevent hallucinations, delivering major time savings, and also partnered with global finance executives to ship an AI early-warning indicator for treasury/liquidity risk.

View profile

Need someone specific?

AI Search