Browse Talent Find Talent Open Jobs Pricing FAQsGet Started

Vetted Apache Hadoop Professionals

Pre-screened and vetted.

Apache Hadoop Python SQL Docker Apache Spark AWS

Yaoxin Liu

Screened

Intern Full-Stack Software Engineer specializing in real-time web systems

New York, NY0y exp

VenuePilotNYU

“Built and iterated an end-to-end virtual waiting room for a real-time ticketing prototype, making concrete architecture tradeoffs (polling + Redis Pub/Sub) and improving performance post-launch with Redis caching (+30% throughput, -15% p99 latency). Also has hands-on experience building Spark/HDFS ETL pipelines with strong reliability/observability patterns and running disciplined NLP model evaluation loops on review-rating classification.”

Python Java JavaScript TypeScript SQL C+89

View profile

Krishnakaanth Reddy Yeduguru

Screened

Mid-level AI/ML Engineer specializing in LLMs, NLP, and MLOps

Texas, USA4y exp

McKessonUniversity of Texas at Arlington

“AI/ML engineer with healthcare domain depth who led a HIPAA-compliant, production LLM system at McKesson to automate clinical document understanding—extracting entities, summarizing provider notes, and supporting authorization decisions. Hands-on across Spark/Python ETL, Hugging Face + LoRA/QLoRA fine-tuning, RAG, and cloud-native MLOps (Airflow/Kubernetes/Step Functions, MLflow, blue-green on EKS/GKE), with explicit work on PHI handling and hallucination reduction.”

Python C++SQL Bash TensorFlow PyTorch+129

View profile

BHARATH BHOOTHPUR

Screened

Mid-level Data Analyst specializing in healthcare and finance analytics

New Jersey, USA5y exp

Omada HealthRowan University

“Built an end-to-end Alexa smart-home IoT application controlling a Wi-Fi bulb, including ESP32 firmware (MQTT) and an AWS serverless backend (IoT Core/Device Shadow, Lambda, DynamoDB) with a REST API. Demonstrates strong real-time scalability patterns (streaming ingestion, stateless processing, partition-key design) and full-stack delivery with Spring Boot + React (JWT auth, CORS, data-heavy dashboards).”

Python SQL R NumPy Pandas Matplotlib+113

View profile

Sushma Puchakayala

Screened

Mid-level Data Analyst specializing in AI/ML and advanced analytics

USA3y exp

AccentureMurray State University

“Accenture data/ML practitioner who deployed a retail churn prediction and BERT-based sentiment analysis system to production, integrating behavioral + feedback data and operationalizing it with ETL automation, orchestration, and CI/CD. Experienced managing 2TB+ multi-source data, monitoring drift in Databricks, and translating results into Power BI dashboards for marketing teams (including K-means customer segmentation).”

Python Pandas NumPy Matplotlib Scikit-learn Seaborn+122

View profile

Rohan Gore

Screened

Intern AI/ML Engineer specializing in agentic systems and full-stack development

New York City, NY0y exp

MARV CapitalNYU

“Built and scaled a multi-agent LLM automation pipeline during a fintech internship, growing from a rapid 1-week proof-of-concept to a 15+ agent hierarchical system that cut market brief report generation time from ~5 hours to under 30 minutes. Hands-on with agent frameworks (Haystack, CrewAI, LangChain) and experienced in debugging agent communication issues via sandboxed modular testing and context/token management; also regularly gives architecture-first technical demos at multiple hackathons and university events.”

Apache Cassandra Apache Hadoop Apache Kafka AWS AWS Lambda C#+93

View profile

Krishna Kandlakunta

Screened

Mid-level Data Scientist specializing in MLOps, LLM/RAG applications, and deep learning

United States5y exp

CitigroupUniversity of North Texas

“Built and deployed a production compliance automation RAG system (at Citi) that generates citation-backed, schema-validated risk summaries for regulatory document review. Emphasizes regulated-environment reliability with retrieval-only grounding, abstention, confidence thresholds, and immutable audit logging, plus orchestration using LangChain/LangGraph and Airflow. Reported ~60% reduction in compliance review effort while maintaining high precision and traceability.”

A/B Testing Agile Anomaly Detection Apache Hadoop Apache Hive Apache Kafka+167

View profile

Ram Kottala

Screened

Mid-level Data & GenAI Engineer specializing in lakehouse, streaming, and RAG platforms

Michigan, USA5y exp

FordWebster University

“Built a production internal LLM-powered knowledge assistant using a RAG architecture (Python, LLM APIs, cloud services) that answers employee questions with sourced, grounded responses from internal documents. Demonstrates strong practical depth in retrieval tuning (chunking/metadata filters), orchestration with LangChain, and production reliability practices (latency optimization, automated embedding refresh, evaluation metrics, logging/monitoring) while partnering closely with non-technical operations teams.”

Python PySpark Scala Java R SQL+173

View profile

Naga Yanala

Screened

Mid-level Data Engineer specializing in cloud data pipelines and analytics platforms

Texas, USA5y exp

Molina HealthcareSoutheast Missouri State University

“Data engineer with healthcare and enterprise experience (Molina Healthcare, Dell Technologies) building and operating high-volume batch + streaming pipelines across AWS and Azure. Strong focus on data quality (schema validation, fail-fast checks), reliability (monitoring/alerts, retries), and performance tuning (Spark/partitioning), with measurable runtime reduction and improved downstream trust.”

Python SQL PySpark Bash ETL Data pipelines+85

View profile

Jaideep bommidi

Screened

Senior ML Engineer & Data Scientist specializing in LLM agents, retrieval/ranking, and MLOps

Denton, TX8y exp

Webster BankUniversity of North Texas

“Machine Learning Engineer currently at Webster Bank building an enterprise-scale LLM agent for Temenos Journey Manager/Maestro, using RAG-style multi-stage retrieval with FAISS/Pinecone, hybrid dense+sparse search, and LoRA fine-tuning optimized via NDCG/MAP and A/B testing. Previously handled messy incident/telemetry data at Deuta Werke GmbH with deterministic + fuzzy entity resolution, and has strong production data engineering experience across Spark/Hadoop and Python ETL systems.”

A/B Testing Agile Amazon EC2 Amazon EKS Amazon ECS Amazon Kinesis+181

View profile

Mohan Naik Megavath

Screened

Mid-level Data Engineer specializing in real-time pipelines and cloud data platforms

Remote, USA4y exp

TruistElmhurst University

“Backend engineer with hands-on experience building secure Python/Flask services (sessions, JWT, RBAC) and optimizing PostgreSQL/SQLAlchemy performance, including custom SQL using CTEs/window functions profiled via EXPLAIN ANALYZE. Also integrates LLM features via OpenAI/Azure into backend systems and improves scalability with RabbitMQ-driven async processing, caching, and multi-tenant data isolation patterns.”

Amazon Athena Amazon DynamoDB Amazon EC2 Amazon Redshift Amazon S3 AngularJS+137

View profile

Preetham Reddy Konuganti

Screened

Junior Full-Stack Engineer specializing in AI applications and scalable web platforms

San Jose, CA2y exp

Cognia SecurityArizona State University

“Full-stack engineer with customer-facing delivery experience who built and deployed a multi-platform social media automation product (Next.js/Node/MongoDB) and optimized it using BullMQ/Redis background jobs, retries, and rate limiting for reliable posting at scale. Also delivered an AI-powered false-positive analysis service in a cybersecurity context, resolving production pipeline stalls via log-driven debugging, parallelization, caching, and LLM guardrails.”

Agile Amazon EC2 Amazon S3 Amazon SNS Amazon SQS Angular+126

View profile

Uday kumar swamy

Screened

Senior Machine Learning Engineer specializing in MLOps and NLP/GenAI

Chicago, USA9y exp

UnitedHealth GroupIllinois Institute of Technology

“Built a production LLM-agent framework for a startup that performs daily financial/trading analysis by combining live market data with internal tools, including a centralized memory module to prevent context drift and reduce hallucinations. Also implemented an Airflow-orchestrated retail price forecasting pipeline deployed to AWS endpoints, scaling parallel workloads via Kubernetes Executor and validating systems with rigorous functional + LLM-specific metrics and cross-team collaboration.”

Python SQL R Java Scikit-learn TensorFlow+126

View profile

Sri Harsha patallapalli

Screened

Mid-level Machine Learning & Data Infrastructure Engineer specializing in MLOps on AWS

Boston, MA5y exp

Dextr.aiNortheastern University

“Built and deployed a fine-tuned Qwen 2.5 14B model into production at Dextr.ai as the backbone for hotel-operations agentic workflows, running on AWS EKS with Triton and TensorRT-LLM. Demonstrates strong cost-aware LLM engineering (QLoRA, FP8/BF16 on H100) plus rigorous benchmarking/observability (Prometheus, LangSmith) with reported sub-30ms TTNT. Previously handled long-running ETL orchestration with Airflow at GE Healthcare and Lowe's.”

Python Java C++SQL JavaScript Bash+113

View profile

UMESH KAMISETTY

Screened

Mid-level Data Engineer specializing in cloud lakehouse and streaming platforms

Seattle, WA5y exp

First United BankCleveland State University

“Data engineer focused on building production-grade pipelines on AWS (Kafka/Kinesis/Glue/S3) through to curated serving layers in Snowflake and Delta Lake. Emphasizes automated data quality validation (PySpark + CI/CD), modular dbt transformations for analytics (customer spending, risk metrics), and operational reliability with CloudWatch and DLQs; data consumed by BI tools and ML pipelines for fraud detection and risk analytics.”

Python PySpark SQL Shell Scripting AWS Amazon S3+146

View profile

Harshitha Parupalli

Screened

Mid-level Data Engineer specializing in multi-cloud real-time and batch data pipelines

Jersey City, NJ4y exp

Elevance HealthNJIT

“Data engineer with healthcare domain experience who owned 100M+ record pipelines end-to-end (Kafka/Kinesis/ADF → PySpark/dbt validation → Spark SQL transforms → Snowflake/Power BI serving). Built production-grade reliability practices (Airflow orchestration, CloudWatch/Grafana monitoring, pytest + contract/regression tests, idempotent ingestion/backfills) and delivered measurable improvements: 35% lower latency and 40% better query performance.”

Python SQL Shell Scripting R Scala Java+160

View profile

Kamalesh Ponnivalavan

Screened

Mid-level Data Engineer specializing in capital markets post-trade data platforms

Whippany, NJ3y exp

BarclaysUniversity of Connecticut

“Data/streaming engineer in capital markets who led an end-to-end trade settlement data product (Kafka→MongoDB→data lake) with rigorous data-quality logic and ~$175K first-year operational impact. Also built a low-latency Go-based CME market data engine feeding SOFR curve generation, using MSK on EKS with performance tuning (idempotency, compression, partitioning) to achieve sub-100ms delivery.”

Amazon Athena Amazon DynamoDB Amazon Redshift Amazon S3 Apache Hadoop Apache Kafka+118

View profile

MOUNIKA SAI MEKALA

Screened

Junior Data Analyst specializing in financial and operational analytics

Kansas, USA3y exp

KPMGUniversity of Central Missouri

“Analytics professional with experience at KPMG turning messy operational and financial data from SQL Server and AWS S3 into clean reporting datasets and automated Python workflows. They combine SQL, Python, Power BI, and experimentation methods to deliver stakeholder-aligned KPI dashboards and marketing performance insights with a strong focus on data integrity and reproducibility.”

SQL Python Pandas NumPy SciPy R+103

View profile

SAITEJA MALLEMPUDI

Screened

Senior Data Scientist and AI/ML Engineer specializing in GenAI and cloud ML

Chicago, IL6y exp

BMOLewis University

“ML/AI engineer with hands-on experience owning systems from experimentation through deployment and monitoring, including a Bank of Montreal project that improved timely interventions by 12%. Also brings GenAI/RAG experience with evaluation and safety guardrails, plus clinical NLP pipeline work extracting medication data from notes for patient risk prediction.”

Python SQL PySpark Scala Bash Shell Scripting+153

View profile

Neel Patel

Screened

Mid-level Python Backend Engineer specializing in cloud-native and AI-powered systems

USA4y exp

ComcastUniversity at Buffalo

“Backend/AI engineer who has shipped an LLM-powered enterprise support-ticket agent at Comcast, building a production-grade microservices pipeline (FastAPI, SQS, Redis) with strong observability (OpenTelemetry/Splunk/Prometheus/Grafana) and reliability patterns (async, caching, circuit breakers, idempotency). Demonstrated quantified impact at scale—processing 10k+ tickets/day while improving response SLAs and routing accuracy through evaluation and human feedback loops.”

Python Go Java SQL FastAPI Flask+124

View profile

Dharanidharan Loganathan

Screened

Senior Python Developer specializing in data engineering, MLOps, and cloud platforms

Dallas, TX13y exp

CBREAnna University

“Backend/data engineer with production experience building secure Django/DRF APIs (JWT RS256 + rotating refresh tokens), background processing with Celery, and strong reliability practices (timeouts, retries/backoff, structured logging, audit trails). Has delivered AWS solutions spanning Lambda + ECS with IaC/CI-CD and built Glue/PySpark ETL pipelines with schema evolution and data-quality quarantine patterns; also modernized a legacy SAS pipeline to Python/PySpark with parallel-run parity validation and phased rollout.”

Python C#C++Go Java JavaScript+170

View profile

Tejal Mane

Screened

Mid-level Machine Learning Engineer specializing in GenAI, LLMs, and real-time ML systems

Moundsville, WV4y exp

CitiusTechUniversity of Michigan

“Built and deployed a production long-form article summarization system using BART/T5/PEGASUS, tackling real-world constraints like token limits, latency/quality tradeoffs, and factual drift via chunking/merge logic and constrained decoding. Uses pragmatic Python-based pipeline orchestration (scheduled jobs, modular scripts, logging/retries) and iterates with stakeholder feedback to make outputs genuinely useful for content workflows.”

Agile Apache Hadoop Apache Kafka AWS CI/CD CUDA+112

View profile

Kranthi Kumar Karupati

Screened

Mid-level Generative AI Engineer specializing in LLM apps, RAG, and MLOps

Remote, United States6y exp

AccentureEastern Illinois University

“LLM/GenAI engineer with US Bank experience building a production financial-document intelligence platform using LangChain/LangGraph, GPT-4, and Amazon OpenSearch. Delivered a RAG-based assistant for compliance/audit teams with grounded, cited answers, focusing on reducing hallucinations and latency, and deployed securely on AWS (SageMaker/EKS) with CI/CD and evaluation tooling (LangSmith, RAGAS).”

Amazon API Gateway Amazon Bedrock Amazon CloudWatch Amazon DynamoDB Amazon EKS Amazon ECS+168

View profile

Ansh Krishna

Screened

Intern Data Scientist specializing in ML systems and LLM-powered analytics

Noida, India1y exp

Data Security Council of IndiaUSC

“Built an autonomous decision analytics LLM agent for end-to-end tabular binary classification, using RAG (FAISS) to retain context across multi-step queries. Deployed as a FastAPI service with production-style reliability features (schema-aware validation, fallbacks, retries, structured outputs) plus offline/online evaluation and monitoring to reduce analysis time and improve consistency versus stateless approaches.”

A/B Testing Artificial Intelligence Backend Development C++Cloud Computing Data Structures and Algorithms+76

View profile

Vedant Jagtap

Screened

Junior AI/NLP Engineer specializing in LLM systems and RAG

New York, NY1y exp

NYU’s Center for Social Media, AI, and PoliticsNYU

“LLM/agent engineer who shipped a two-stage AI recruitment screening platform at Foursquare that automated resume ingestion through behavioral assessment, delivering an 85% reduction in screening time across 5,000+ applications with auditability and confidence-gated decisions. Also built a multi-agent benchmarking framework using MCP tool interfaces and a RAGAS + LangSmith evaluation/observability stack, including async re-architecture that cut production latency by 50%.”

Python C C++JavaScript TypeScript SQL+114

View profile

Machine Learning Engineers Software Engineers Data Engineers Data Scientists Data Analysts Software Developers AI & Machine Learning Engineering Data & Analytics Education

Need someone specific?

AI Search

Related

Need someone specific?