Vetted Apache Hadoop Professionals

Pre-screened and vetted.

SB

Mid-level Data Engineer specializing in scalable pipelines, Spark, and cloud data warehousing

Boston, USA · 3y exp
Fidelity Investments · Northeastern University

Backend/data platform engineer who recently owned an end-to-end large-scale financial data platform delivering real-time decision support for finance and operations. Has hands-on experience modernizing legacy batch pipelines into AWS cloud-native ELT with parallel-run cutovers, strong data quality controls (dbt-style tests, reconciliation), and measurable improvements in runtime, cost, and SLA compliance. Also builds scalable, secure FastAPI microservices using Docker, ALB-based horizontal scaling, Redis caching, and managed auth with Cognito/Supabase plus Postgres RLS.

Jash Shah

Screened

Mid-level Data Scientist specializing in LLMs, MLOps, and predictive analytics in healthcare and finance

New Jersey, USA · 4y exp
Johnson & Johnson · Stevens Institute of Technology

Built and deployed a production LLM/RAG clinical decision support system that enables real-time semantic search over unstructured EHR notes and delivers patient risk insights. Strong in healthcare-grade MLOps and compliance (HIPAA, PHI handling, encryption, RBAC, audit logs) and scaled embedding/retrieval pipelines using Spark/Databricks and Airflow. Partnered with clinicians via Power BI dashboards and explainability, contributing to an 18% reduction in patient readmissions.

Bhuvan Chandi

Screened

Mid-level Data Engineer specializing in AI/ML data platforms

NY, NY · 6y exp
BlackRock · Webster University

Built and productionized an LLM-powered PDF document Q&A system to eliminate manual searching through long documents, focusing on scalability and answer reliability. Implemented semantic chunking (using headings/paragraphs/tables), overlap, and preprocessing/quality checks to reduce hallucinations, and orchestrated the end-to-end pipeline with Airflow using retries, alerts, and parallel tasks.

SK

Mid-level Machine Learning Engineer specializing in NLP and cloud MLOps

CT, USA · 4y exp
ServiceNow · Rivier University

Built and deployed a production LLM-powered internal documentation assistant using embeddings, a vector database, and a RAG pipeline to reduce time spent searching PDFs/manuals. Experienced in orchestrating end-to-end LLM workflows with Airflow/LangChain, improving reliability via monitoring/error handling, and driving measurable quality through retrieval and hallucination-focused evaluation metrics.

Kevin Fang

Screened

Intern Software Engineer specializing in full-stack and data systems

Beverly Hills, CA · 1y exp
Alo Yoga · UC Irvine

Software developer with healthcare operations experience at Epic Systems (Referrals & Authorizations), delivering customer-facing tooling to speed manual insurance authorization/denial documentation and support future automation. Also supported an HRIS migration to Workday at Alo Yoga, solving legacy ID interoperability via scripting and mapping, and demonstrates strong production debugging and test-driven maintainability practices.

Rahul Hatkar

Screened

Mid-level AI/ML Engineer specializing in LLMs, RAG pipelines, and MLOps

San Francisco, CA · 6y exp
Scale AI · Webster University

AI/ML engineer who has shipped production AI systems end-to-end, including an automated multi-channel (Gmail/WhatsApp/voice) candidate interviewing workflow and an enterprise RAG knowledge search platform. Demonstrates strong production rigor (monitoring, A/B tests, guardrails, schema validation, shadow testing) with quantified impact: ~60–70% reduction in interview evaluation time and ~20–30% relevance gains in RAG retrieval.

DK

Senior Data Engineer specializing in Azure Lakehouse, Databricks/Spark, and Snowflake

Richardson, TX · 6y exp
PwC · University of Central Missouri

Data engineer/platform builder with experience across PwC and Liberty Mutual delivering high-volume, production-grade pipelines and real-time data services. Has owned end-to-end streaming + batch architectures on AWS and Azure, including web scraping systems, with quantified reliability gains (99.9% availability, 90%+ error reduction, 30% latency reduction) and strong observability/CI-CD practices.

Harshavardhan Reddy

Mid-level AI/ML Data Scientist specializing in NLP, computer vision, and risk analytics

Albany, NY · 5y exp
Capital One · Pace University

ML/AI engineer with Capital One experience building production-grade customer segmentation and fraud detection systems combining NLP (transformers) and anomaly detection. Strong MLOps and orchestration background (PySpark ETL, MLflow, Airflow, Docker/Kubernetes, Azure ML) with real-time monitoring/alerting and performance optimizations like quantization and caching, plus proven ability to deliver business-facing insights through Power BI/Tableau for marketing stakeholders.

Kanaka Chalam Volety

Staff DevOps/SRE Engineer specializing in AWS, Kubernetes, and GitOps

San Jose, CA · 24y exp
Zoom · Thompson Rivers University

Infrastructure-focused engineer with Vonage experience modernizing early-stage cloud architecture (Terraform modularization, blue-green deployments, containerization, and zero-downtime database migration planning to Aurora). Also built a local end-to-end side project, Vastu AI, combining a custom-trained YOLO model (Roboflow-labeled data) with a locally hosted LLM via Ollama to generate a vastu compliance report from floor-plan images.

Utkarsh Mittal

Intern Data Scientist specializing in computer vision and LLM agents

Sunnyvale, CA · 0y exp
Covalent Metrology · NYU

Software engineering candidate with hands-on experience building and shipping LLM agents: created a production AI enrichment/coding agent at Covalent Metrology using Apollo.io + OpenAI, and built a Mistral hackathon router that dynamically selects among models to reduce token cost while maintaining quality. Also developed a real-time financial margin analysis agent that emails actionable insights and iterated on reliability issues (e.g., fixing misrouted emails, improving news relevance filtering).

Vaibhav Sharma

Mid-level Software Engineer specializing in AI/ML and data platforms

Remote, USA · 5y exp
Google · Indiana University Bloomington

AI/ML engineer who built a production agentic system to automate computational research experiments (simulation execution, parameter exploration, and numerical analysis) and mitigated context-window failures using constrained tool-calling/prompt-chaining patterns in LangChain with OpenAI tool-enabled models. Also has adtech/big-data pipeline experience at InMobi, orchestrating Spark jobs in Airflow to filter bot-like user IDs and publish clean IDs to an online NoSQL store for live serving, plus Apache open-source collaboration experience.

Binghan Zhang

Screened

Intern Data Analyst specializing in business intelligence and financial analytics

San Francisco, CA · 1y exp
Innova AI Tech · UCLA

Analytics candidate with hands-on experience in both fraud and churn use cases, including SQL-based preparation of 6.5M transaction records and reproducible Python modeling workflows. Stands out for combining technical rigor in data quality, feature engineering, and imbalance handling with strong stakeholder alignment, metric definition, and dashboard adoption.

Yinghai Yu

Screened

Mid-level Data Engineer specializing in cloud data platforms and AI/ML pipelines

San Mateo, CA · 6y exp
Bubbles and Books · Georgia Tech

Data-engineering-oriented candidate with hands-on experience building an agentic AI product and operational automation workflows. They described automating inventory-to-ERP discrepancy reconciliation with anomaly detection and daily reporting, and also have practical scraping/automation experience dealing with Cloudflare-protected sites using Selenium and Puppeteer.

SP

Junior AI/ML Software Engineer specializing in LLMs and data-intensive systems

New York, NY · 3y exp
NYU Langone Health · NYU

AI/backend engineer who has owned production applied-ML systems end to end, including a Jitsi meeting intelligence platform with custom RoBERTa boundary detection, LLM summarization, and automated retraining from user feedback. Also has healthcare AI experience building a diabetes medication titration system with strict validation, drift monitoring, and safety guardrails, showing both product speed and high-stakes engineering rigor.

RW

Junior Data Engineer and Analyst specializing in ETL, analytics, and e-commerce data

Walnut, CA · 3y exp
Dreamstream, LLC · UC Irvine

Data engineer with a Master's in Data Science who has owned 30+ customer-facing K-12 SIS migrations end-to-end, building ETL, validation, and SOP-driven deployment processes in a PII-sensitive environment. Also brings recent hands-on agentic AI experience from a biotech capstone, where they led a production-oriented NLP-to-SQL + RAG support system that handled about 30% of support queries in testing.

SP

Junior Robotics & AI Researcher specializing in soft robotics and real-time ML control

Boston, MA · 2y exp
Boston University · Boston University

Early-career robotics engineer who has integrated LLM/NLP command interfaces (OpenAI/LLaMA) into ROS-controlled industrial manipulators and built data-driven controls for underwater soft robotic actuators. Combines hands-on fabrication (balloon actuator with embedded copper traces) with sensor debugging (IMU/Aurora) and simulation work in Gazebo, with practical exposure to edge deployment constraints on Jetson Nano and model quantization.

KK

Intern-level Software Engineer specializing in AI/ML and time-series forecasting for finance

Bangalore, Karnataka, India · 0y exp
Cisco · NJIT

Built a production AI-driven QA automation platform using a multi-agent architecture (MCPs + LangGraph) to run parallel website tests across multiple device environments via automated image building and containerization. Currently collaborating with restaurant operators and managers to deliver an agentic restaurant analytics system, emphasizing deep domain discovery with non-technical stakeholders.

Giri Nathan

Screened

Executive Technology Leader (CTO/CIO) specializing in cloud, AI/ML, and cybersecurity

38y exp
Production Resource Group · Charter Oak State College

CTO who ties technology strategy directly to business outcomes, building multi-year roadmaps with measurable ROI. Led major modernization (cloud, data platform, unified API, microservices + CI/CD) delivering 5x faster releases/deployments, 99.8% uptime, and 40% user growth without headcount increases, while scaling engineering from 15 to 80+ in ~18 months.

Prem Kumar

Screened

Senior Data Engineer specializing in cloud data platforms and regulated analytics

McLean, VA · 6y exp
Capital One · Rowan University

Data engineer at Capital One building AWS-based real-time and batch pipelines and backend data services for financial/fraud use cases. Has owned end-to-end pipelines processing millions of records/day, implemented dbt/Great Expectations quality gates, and tuned Redshift/Snowflake workloads (cutting query latency ~22–25% and reducing pipeline failures ~30–40%) while supporting 15+ downstream consumers.

SB

Mid-level Data Engineer specializing in cloud data platforms and big data pipelines

5y exp
Molina Healthcare · University of Michigan-Dearborn

Healthcare data engineer with hands-on ownership of claims/member data pipelines on a cloud analytics platform, spanning batch and streaming ingestion (Airflow/Kafka/Spark/Databricks) through serving for reporting. Emphasizes reliability and data quality via embedded validation, schema-drift detection, deduplication, and operational monitoring/incident response, plus pragmatic CI/CD and observability setup in early-stage/ambiguous projects.

Dinesh Kumar Patibandla

Mid-level Machine Learning Engineer specializing in LLMs and RAG for finance and healthcare

Texas, USA · 4y exp
Goldman Sachs · University of North Texas

ML Engineer with recent Goldman Sachs experience building and deploying a production RAG/LLM assistant for summarization, drafting, and internal knowledge retrieval across financial, risk, and compliance documents. Designed for heavy regulatory constraints and scaled to 10,000+ concurrent users using Kubernetes-based orchestration, dynamic LLM routing, and rigorous testing (adversarial prompts, A/B tests, load simulations) with privacy controls like differential privacy.

HEMANTH KUMAR KOTTAPALLI

Mid-level Machine Learning Engineer specializing in GPU-accelerated LLMs and MLOps

GA, USA · 4y exp
BlackRock · Mercer University

Built and deployed a production LLM-powered decision-support system for supply-chain planners that explains demand forecast changes using grounded retrieval from sales, promotion, inventory, and supplier data. Implemented strict anti-hallucination guardrails and latency optimizations, deployed as a real-time AWS API with monitoring, and reported ~15% forecast accuracy improvement and ~12% supply-chain risk reduction. Experienced orchestrating data/ML/LLM workflows with Airflow, LangChain/LangGraph-style patterns, and AWS Step Functions while partnering closely with non-technical business users via demos and example-based requirements.

Pandari G

Screened

Mid-level Machine Learning Engineer specializing in Generative AI and RAG systems

San Francisco, USA · 5y exp
Sephora · Saint Mary's College of California

GenAI/LLM engineer with production deployments in both fintech and retail: built an AI-powered mortgage document analysis/automated underwriting pipeline at Fannie Mae (OCR + custom LLM) cutting underwriting review from 3–4 hours to under an hour with privacy-by-design controls. Also helped build Sephora’s GenAI product advisory bot using LangChain-orchestrated RAG (Azure GPT-4, Azure AI Search, MySQL HeatWave vector search), focusing on grounding, evaluation, and compliance-aware architecture choices.

Zhiwen Zhao

Screened

Junior Data Engineer specializing in cloud ETL and big data platforms

New York, NY · 3y exp
Bank of China · NYU

Data engineer focused on transit/transportation datasets, building Spark-based pipelines that ingest from Oracle/APIs, apply PySpark data-quality fixes, and publish star-schema fact tables to Azure Data Lake. Experienced troubleshooting complex Spark failures (using checkpointing to manage long lineage) and operating Airflow-driven backfills and GitLab CI deployments for production DAGs.
