Vetted Apache Hadoop Professionals

Pre-screened and vetted.

Yaoxin Liu

Screened

Intern Full-Stack Software Engineer specializing in real-time web systems

New York, NY | 0y exp
VenuePilot | NYU

Built and iterated an end-to-end virtual waiting room for a real-time ticketing prototype, making concrete architecture tradeoffs (polling + Redis Pub/Sub) and improving performance post-launch with Redis caching (+30% throughput, -15% p99 latency). Also has hands-on experience building Spark/HDFS ETL pipelines with strong reliability/observability patterns and running disciplined NLP model evaluation loops on review-rating classification.

KR

Mid-level AI/ML Engineer specializing in LLMs, NLP, and MLOps

Texas, USA | 4y exp
McKesson | University of Texas at Arlington

AI/ML engineer with healthcare domain depth who led a HIPAA-compliant, production LLM system at McKesson to automate clinical document understanding—extracting entities, summarizing provider notes, and supporting authorization decisions. Hands-on across Spark/Python ETL, Hugging Face + LoRA/QLoRA fine-tuning, RAG, and cloud-native MLOps (Airflow/Kubernetes/Step Functions, MLflow, blue-green on EKS/GKE), with explicit work on PHI handling and hallucination reduction.

BB

Mid-level Data Analyst specializing in healthcare and finance analytics

New Jersey, USA | 5y exp
Omada Health | Rowan University

Built an end-to-end Alexa smart-home IoT application controlling a Wi-Fi bulb, including ESP32 firmware (MQTT) and an AWS serverless backend (IoT Core/Device Shadow, Lambda, DynamoDB) with a REST API. Demonstrates strong real-time scalability patterns (streaming ingestion, stateless processing, partition-key design) and full-stack delivery with Spring Boot + React (JWT auth, CORS, data-heavy dashboards).

SP

Mid-level Data Analyst specializing in AI/ML and advanced analytics

USA | 3y exp
Accenture | Murray State University

Accenture data/ML practitioner who deployed a retail churn prediction and BERT-based sentiment analysis system to production, integrating behavioral + feedback data and operationalizing it with ETL automation, orchestration, and CI/CD. Experienced managing 2TB+ multi-source data, monitoring drift in Databricks, and translating results into Power BI dashboards for marketing teams (including K-means customer segmentation).

Rohan Gore

Screened

Intern AI/ML Engineer specializing in agentic systems and full-stack development

New York City, NY | 0y exp
MARV Capital | NYU

Built and scaled a multi-agent LLM automation pipeline during a fintech internship, growing from a rapid 1-week proof-of-concept to a 15+ agent hierarchical system that cut market brief report generation time from ~5 hours to under 30 minutes. Hands-on with agent frameworks (Haystack, CrewAI, LangChain) and experienced in debugging agent communication issues via sandboxed modular testing and context/token management; also regularly gives architecture-first technical demos at multiple hackathons and university events.

KK

Mid-level Data Scientist specializing in MLOps, LLM/RAG applications, and deep learning

United States | 5y exp
Citigroup | University of North Texas

Built and deployed a production compliance automation RAG system (at Citi) that generates citation-backed, schema-validated risk summaries for regulatory document review. Emphasizes regulated-environment reliability with retrieval-only grounding, abstention, confidence thresholds, and immutable audit logging, plus orchestration using LangChain/LangGraph and Airflow. Reported ~60% reduction in compliance review effort while maintaining high precision and traceability.

Ram Kottala

Screened

Mid-level Data & GenAI Engineer specializing in lakehouse, streaming, and RAG platforms

Michigan, USA | 5y exp
Ford | Webster University

Built a production internal LLM-powered knowledge assistant using a RAG architecture (Python, LLM APIs, cloud services) that answers employee questions with sourced, grounded responses from internal documents. Demonstrates strong practical depth in retrieval tuning (chunking/metadata filters), orchestration with LangChain, and production reliability practices (latency optimization, automated embedding refresh, evaluation metrics, logging/monitoring) while partnering closely with non-technical operations teams.

Naga Yanala

Screened

Mid-level Data Engineer specializing in cloud data pipelines and analytics platforms

Texas, USA | 5y exp
Molina Healthcare | Southeast Missouri State University

Data engineer with healthcare and enterprise experience (Molina Healthcare, Dell Technologies) building and operating high-volume batch + streaming pipelines across AWS and Azure. Strong focus on data quality (schema validation, fail-fast checks), reliability (monitoring/alerts, retries), and performance tuning (Spark/partitioning), with measurable runtime reduction and improved downstream trust.

Jaideep Bommidi

Senior ML Engineer & Data Scientist specializing in LLM agents, retrieval/ranking, and MLOps

Denton, TX | 8y exp
Webster Bank | University of North Texas

Machine Learning Engineer currently at Webster Bank building an enterprise-scale LLM agent for Temenos Journey Manager/Maestro, using RAG-style multi-stage retrieval with FAISS/Pinecone, hybrid dense+sparse search, and LoRA fine-tuning optimized via NDCG/MAP and A/B testing. Previously handled messy incident/telemetry data at Deuta Werke GmbH with deterministic + fuzzy entity resolution, and has strong production data engineering experience across Spark/Hadoop and Python ETL systems.

Mohan Naik Megavath

Mid-level Data Engineer specializing in real-time pipelines and cloud data platforms

Remote, USA | 4y exp
Truist | Elmhurst University

Backend engineer with hands-on experience building secure Python/Flask services (sessions, JWT, RBAC) and optimizing PostgreSQL/SQLAlchemy performance, including custom SQL using CTEs/window functions profiled via EXPLAIN ANALYZE. Also integrates LLM features via OpenAI/Azure into backend systems and improves scalability with RabbitMQ-driven async processing, caching, and multi-tenant data isolation patterns.

Preetham Reddy Konuganti

Junior Full-Stack Engineer specializing in AI applications and scalable web platforms

San Jose, CA | 2y exp
Cognia Security | Arizona State University

Full-stack engineer with customer-facing delivery experience who built and deployed a multi-platform social media automation product (Next.js/Node/MongoDB) and optimized it using BullMQ/Redis background jobs, retries, and rate limiting for reliable posting at scale. Also delivered an AI-powered false-positive analysis service in a cybersecurity context, resolving production pipeline stalls via log-driven debugging, parallelization, caching, and LLM guardrails.

Uday Kumar Swamy

Senior Machine Learning Engineer specializing in MLOps and NLP/GenAI

Chicago, USA | 9y exp
UnitedHealth Group | Illinois Institute of Technology

Built a production LLM-agent framework for a startup that performs daily financial/trading analysis by combining live market data with internal tools, including a centralized memory module to prevent context drift and reduce hallucinations. Also implemented an Airflow-orchestrated retail price forecasting pipeline deployed to AWS endpoints, scaling parallel workloads via Kubernetes Executor and validating systems with rigorous functional + LLM-specific metrics and cross-team collaboration.

Sri Harsha Patallapalli

Mid-level Machine Learning & Data Infrastructure Engineer specializing in MLOps on AWS

Boston, MA | 5y exp
Dextr.ai | Northeastern University

Built and deployed a fine-tuned Qwen 2.5 14B model into production at Dextr.ai as the backbone for hotel-operations agentic workflows, running on AWS EKS with Triton and TensorRT-LLM. Demonstrates strong cost-aware LLM engineering (QLoRA, FP8/BF16 on H100) plus rigorous benchmarking/observability (Prometheus, LangSmith) with reported sub-30ms TTNT. Previously handled long-running ETL orchestration with Airflow at GE Healthcare and Lowe's.

Umesh Kamisetty

Mid-level Data Engineer specializing in cloud lakehouse and streaming platforms

Seattle, WA | 5y exp
First United Bank | Cleveland State University

Data engineer focused on building production-grade pipelines on AWS (Kafka/Kinesis/Glue/S3) through to curated serving layers in Snowflake and Delta Lake. Emphasizes automated data quality validation (PySpark + CI/CD), modular dbt transformations for analytics (customer spending, risk metrics), and operational reliability with CloudWatch and DLQs; data consumed by BI tools and ML pipelines for fraud detection and risk analytics.

Harshitha Parupalli

Mid-level Data Engineer specializing in multi-cloud real-time and batch data pipelines

Jersey City, NJ | 4y exp
Elevance Health | NJIT

Data engineer with healthcare domain experience who owned 100M+ record pipelines end-to-end (Kafka/Kinesis/ADF → PySpark/dbt validation → Spark SQL transforms → Snowflake/Power BI serving). Built production-grade reliability practices (Airflow orchestration, CloudWatch/Grafana monitoring, pytest + contract/regression tests, idempotent ingestion/backfills) and delivered measurable improvements: 35% lower latency and 40% better query performance.

KP

Mid-level Data Engineer specializing in capital markets post-trade data platforms

Whippany, NJ | 3y exp
Barclays | University of Connecticut

Data/streaming engineer in capital markets who led an end-to-end trade settlement data product (Kafka→MongoDB→data lake) with rigorous data-quality logic and ~$175K first-year operational impact. Also built a low-latency Go-based CME market data engine feeding SOFR curve generation, using MSK on EKS with performance tuning (idempotency, compression, partitioning) to achieve sub-100ms delivery.

Mounika Sai Mekala

Junior Data Analyst specializing in financial and operational analytics

Kansas, USA | 3y exp
KPMG | University of Central Missouri

Analytics professional with experience at KPMG turning messy operational and financial data from SQL Server and AWS S3 into clean reporting datasets and automated Python workflows. They combine SQL, Python, Power BI, and experimentation methods to deliver stakeholder-aligned KPI dashboards and marketing performance insights with a strong focus on data integrity and reproducibility.

Saiteja Mallempudi

Senior Data Scientist and AI/ML Engineer specializing in GenAI and cloud ML

Chicago, IL | 6y exp
BMO | Lewis University

ML/AI engineer with hands-on experience owning systems from experimentation through deployment and monitoring, including a Bank of Montreal project that improved timely interventions by 12%. Also brings GenAI/RAG experience with evaluation and safety guardrails, plus clinical NLP pipeline work extracting medication data from notes for patient risk prediction.

Neel Patel

Screened

Mid-level Python Backend Engineer specializing in cloud-native and AI-powered systems

USA | 4y exp
Comcast | University at Buffalo

Backend/AI engineer who has shipped an LLM-powered enterprise support-ticket agent at Comcast, building a production-grade microservices pipeline (FastAPI, SQS, Redis) with strong observability (OpenTelemetry/Splunk/Prometheus/Grafana) and reliability patterns (async, caching, circuit breakers, idempotency). Demonstrated quantified impact at scale—processing 10k+ tickets/day while improving response SLAs and routing accuracy through evaluation and human feedback loops.

DL

Senior Python Developer specializing in data engineering, MLOps, and cloud platforms

Dallas, TX | 13y exp
CBRE | Anna University

Backend/data engineer with production experience building secure Django/DRF APIs (JWT RS256 + rotating refresh tokens), background processing with Celery, and strong reliability practices (timeouts, retries/backoff, structured logging, audit trails). Has delivered AWS solutions spanning Lambda + ECS with IaC/CI-CD and built Glue/PySpark ETL pipelines with schema evolution and data-quality quarantine patterns; also modernized a legacy SAS pipeline to Python/PySpark with parallel-run parity validation and phased rollout.

Tejal Mane

Screened

Mid-level Machine Learning Engineer specializing in GenAI, LLMs, and real-time ML systems

Moundsville, WV | 4y exp
CitiusTech | University of Michigan

Built and deployed a production long-form article summarization system using BART/T5/PEGASUS, tackling real-world constraints like token limits, latency/quality tradeoffs, and factual drift via chunking/merge logic and constrained decoding. Uses pragmatic Python-based pipeline orchestration (scheduled jobs, modular scripts, logging/retries) and iterates with stakeholder feedback to make outputs genuinely useful for content workflows.

KK

Mid-level Generative AI Engineer specializing in LLM apps, RAG, and MLOps

Remote, United States | 6y exp
Accenture | Eastern Illinois University

LLM/GenAI engineer with US Bank experience building a production financial-document intelligence platform using LangChain/LangGraph, GPT-4, and Amazon OpenSearch. Delivered a RAG-based assistant for compliance/audit teams with grounded, cited answers, focusing on reducing hallucinations and latency, and deployed securely on AWS (SageMaker/EKS) with CI/CD and evaluation tooling (LangSmith, RAGAS).

Ansh Krishna

Screened

Intern Data Scientist specializing in ML systems and LLM-powered analytics

Noida, India | 1y exp
Data Security Council of India | USC

Built an autonomous decision analytics LLM agent for end-to-end tabular binary classification, using RAG (FAISS) to retain context across multi-step queries. Deployed as a FastAPI service with production-style reliability features (schema-aware validation, fallbacks, retries, structured outputs) plus offline/online evaluation and monitoring to reduce analysis time and improve consistency versus stateless approaches.

Vedant Jagtap

Screened

Junior AI/NLP Engineer specializing in LLM systems and RAG

New York, NY | 1y exp
NYU’s Center for Social Media, AI, and Politics | NYU

LLM/agent engineer who shipped a two-stage AI recruitment screening platform at Foursquare that automated resume ingestion through behavioral assessment, delivering an 85% reduction in screening time across 5,000+ applications with auditability and confidence-gated decisions. Also built a multi-agent benchmarking framework using MCP tool interfaces and a RAGAS + LangSmith evaluation/observability stack, including async re-architecture that cut production latency by 50%.
