
Vetted Apache Hadoop Professionals

Pre-screened and vetted.

Apache Hadoop, Python, SQL, Docker, AWS, Apache Spark

Gowthami chilukuru

Screened

Mid-Level Full-Stack Software Engineer specializing in healthcare, cloud, and data platforms

Sunnyvale, CA · 5y exp

Intuitive Surgical · Stevens Institute of Technology

“Backend/platform engineer who owned a real-time customer analytics microservice stack in Python/FastAPI with Kafka streaming into PostgreSQL, including schema enforcement (Avro) and high-throughput optimizations. Strong Kubernetes + GitOps practitioner (EKS/GKE, Helm, Argo CD) who has handled CI/CD reliability issues with automated pre-deploy checks and rollbacks, and supported major migrations (on-prem to AWS; VM to EKS) with blue-green cutover planning.”

Python, R, Java, C, JavaScript, TypeScript, +200 more

Narayanaroyal Marisetty

Screened

Mid-level Data Scientist/ML Engineer specializing in healthcare AI and MLOps

USA · 4y exp

CVS Health · University at Buffalo

“Designed and deployed an enterprise LLM-powered clinical/pharmacy policy knowledge assistant at CVS Health, replacing manual searches across PDFs/Word/SharePoint with a HIPAA-compliant RAG system. Built end-to-end ingestion and orchestration (Airflow + Azure ML/Data Lake + vector index) with PHI masking, versioned re-embedding, and production monitoring (Prometheus/Grafana), and partnered closely with clinicians/compliance to ensure policy-grounded, auditable answers.”

A/B Testing, Apache Airflow, Apache Hadoop, Apache Hive, Apache Kafka, Apache Spark, +132 more

Siva Manikanta Lakumarapu

Screened

Mid-level AI/ML Engineer specializing in Generative AI and NLP

Dallas, TX · 5y exp

Gilead Sciences · University of North Texas

“AI/LLM engineer with production experience building secure, scalable compliance-focused generative AI systems (GPT-3/4, BERT) including RAG over internal regulatory document bases. Has delivered end-to-end pipelines on AWS with PySpark/Airflow/Kubernetes/FastAPI, emphasizing privacy controls, monitoring, and iterative evaluation (A/B testing). Also partnered closely with bank compliance officers using prototypes to refine NLP summarization/classification and reduce document review time.”

A/B Testing, Agile, Amazon EC2, Amazon Redshift, Amazon S3, Apache Airflow, +164 more

kesav boob

Screened

Mid-Level Full-Stack Java Engineer specializing in microservices and cloud

San Francisco, California · 5y exp

Dell Technologies · Cal State LA

“Full-stack developer who built an end-to-end Hotel Management System using React and Spring Boot with MongoDB and AWS. Has hands-on experience debugging API/data-fetching issues with Postman and validating results against the database, plus exposure to handling large data workloads with chunking and monitoring via Grafana/Tabula.”

Java, SQL, C, C++, C#, Python, +129 more

Niveditha A

Screened

Mid-level AI/ML Engineer specializing in healthcare ML and LLM/RAG systems

USA · 4y exp

UnitedHealth Group · Bowling Green State University

“AI/LLM engineer with recent production experience at UnitedHealth Group building an end-to-end RAG system over structured EMR data and unstructured clinical notes, including evidence retrieval, GPT/LLaMA-based reasoning, and a validation layer for reliability. Strong in orchestration (Kubeflow/Airflow/MLflow), prompt engineering for noisy healthcare text, and rigorous evaluation/monitoring with gold-standard benchmarking, plus close collaboration with clinical operations stakeholders.”

Python, NumPy, Pandas, JSON, SQL, PostgreSQL, +152 more

Hrishikesh Raghunath

Screened

Mid-level Data Engineer specializing in scalable ETL, streaming analytics, and cloud data platforms

Remote, USA · 7y exp

Dreamline AI · California State University, Fullerton

“At Dreamline AI, built and productionized an AWS-based incentive intelligence platform that uses Llama-2/GPT-4 to extract eligibility rules from unstructured state policy documents into structured JSON, then processes them with Glue/PySpark and serves results via Lambda/SageMaker/API Gateway. Designed state-specific ingestion connectors plus schema validation and automated checks/alerts to handle frequent policy/format changes without breaking the pipeline, and partnered with business/analytics stakeholders to deliver interpretable eligibility decisions via explanations and dashboards.”

A/B Testing, Amazon CloudWatch, Amazon Kinesis, Amazon Redshift, Amazon S3, Amazon SageMaker, +114 more

Rakesh Kolagani

Screened

Mid-level AI/ML Engineer specializing in MLOps and LLM-powered applications

Mountain View, CA · 5y exp

Intuit · University of Central Missouri

“AI/ML engineer with production experience building a RAG-based internal analytics assistant (Databricks + ADF ingestion, Pinecone vector store, LangChain orchestration) deployed via Docker on AWS SageMaker with CI/CD and MLflow. Strong focus on real-world constraints—latency/cost optimization (LoRA ~60% compute reduction), hallucination control with citation grounding, and enterprise security/governance. Previously at Intuit, delivered an interpretable churn prediction system (PySpark/Databricks, Airflow/Azure ML) that improved retention targeting ~12%.”

A/B Testing, Amazon S3, Apache Airflow, AWS Glue, AWS Lambda, AWS Step Functions, +126 more

Pooja Murigappa

Screened

Mid-level AI/ML Engineer specializing in NLP, Generative AI, and MLOps in Financial Services

Austin, TX · 5y exp

Charles Schwab · University of Central Missouri

“ML/LLM engineer at Charles Schwab who built a production loan-advisor chatbot integrated with internal knowledge and loan-calculator APIs, adding strict numeric validation to prevent rate hallucinations and optimizing context to control costs. Also runs ~40 Airflow DAGs orchestrating retraining/ETL/drift monitoring with an automated Snowflake→SageMaker→auto-deploy pipeline, and uses rigorous testing plus canary rollouts tied to business metrics and compliance constraints.”

Amazon DynamoDB, Apache Airflow, Apache Kafka, Apache Spark, AWS, AWS Glue, +183 more

HarshaSree gudapati

Screened

Senior Data Engineer specializing in cloud-native data platforms for finance and healthcare

Charlotte, NC · 4y exp

Bank of America · University of Cincinnati

“Data engineer/backend data services practitioner with Bank of America experience building real-time and batch transaction-monitoring pipelines and APIs (Kafka + databases, REST/GraphQL). Highlights include a reported 45% response-time improvement through performance optimizations and use of Delta Lake schema evolution plus CI/CD (GitHub Actions/Jenkins) and operational reliability patterns like CloudWatch monitoring and dead-letter queues.”

Azure Data Factory, AWS, Amazon S3, AWS Glue, Amazon Redshift, AWS Lambda, +125 more

Madhav Vaddepalli

Screened

Senior Data Engineer specializing in cloud data platforms and big data pipelines

Seattle, WA · 8y exp

Safeco · Fitchburg State University

“Data engineer focused on building reliable, production-grade pipelines and external data collection systems on AWS (S3/Lambda/SQS/Glue/EMR) using PySpark/SQL, serving curated datasets to Snowflake/Redshift for finance and fraud teams. Has operated a large-scale crawler ingesting millions of records/day with anti-bot tactics, schema versioning/quarantine, and CloudWatch/Datadog monitoring, and also shipped a versioned REST API with caching and query optimization.”

Agile, Amazon CloudWatch, Amazon DynamoDB, Amazon EC2, Amazon Redshift, Amazon RDS, +192 more

Aayush Anand

Screened

Intern Full-Stack/Software Engineer specializing in web apps, cloud, and data/ML systems

New York, NY · 1y exp

The NorthStar Group · NYU

“Built and productionized LLM-driven content intelligence/SEO agents for a high-traffic media platform, automating tagging/summarization/metadata with FastAPI + async orchestration and strict JSON-schema outputs. Demonstrated measurable impact (40% faster publishing, +20% organic traffic in 3 months) and strong reliability practices (offline evals, shadow mode, canaries, fallbacks, idempotency, and monitoring).”

Agile, Apache Hadoop, Apache Hive, Apache Kafka, Apache Spark, AWS, +112 more

Molli Dinesh

Screened

Mid-level AI/ML Engineer specializing in NLP, LLMs, and MLOps

Remote, USA · 4y exp

Marsh McLennan · Illinois Institute of Technology

“Built an AI-driven insurance policy summarization platform at Marsh, taking it end-to-end from messy PDF ingestion/OCR and custom extraction through LLM fine-tuning and AWS SageMaker deployment. Delivered measurable impact (25% reduction in manual review time, 99% uptime) and demonstrated strong production MLOps/LLMOps practices with Airflow/Step Functions orchestration, rigorous evaluation (ROUGE + human review), and continuous monitoring for drift, latency, and hallucinations.”

Python, Pandas, NumPy, Scikit-learn, R, SQL, +132 more

SUMIT MAMTANI

Screened

Mid-level Data Scientist specializing in ML, MLOps, and customer analytics

Tempe, AZ · 4y exp

Qlik · Arizona State University

“ML/NLP practitioner focused on insurance/claims analytics for a large financial firm, working with millions of fragmented structured and unstructured records. Built production-grade pipelines for entity extraction, entity resolution, and semantic search using Sentence-BERT + vector DB, including fine-tuning with contrastive learning (reported ~15% recall lift) and scalable ETL/containerized deployment on Kubernetes.”

Python, Pandas, NumPy, Scikit-learn, TensorFlow, PyTorch, +117 more

Yuvraj Singh Chauhan

Screened

Entry-level AI/ML Engineer specializing in LLMs, RAG, and DevOps automation

Bangalore, India · 1y exp

RapidFort · Thapar Institute of Engineering and Technology

“Built and owned a production-scale AI-driven software release/version intelligence platform orchestrated via GitHub Actions that tracks 1000+ upstream repositories and automatically generates SLA-bound JIRA upgrade tickets for hardened container images. Replaced brittle regex/PEP440 parsing with an LLM-based semantic filtering layer plus deterministic validation to handle noisy/inconsistent GitHub tags at scale, with monitoring for coverage, latency, and correctness validated against upstream ground truth.”

API Integration, Bash, Computer Vision, C, C++, Data Analytics, +71 more

Sanjay Mandru

Screened

Mid-Level Full-Stack Software Engineer specializing in cloud microservices and real-time analytics

Buffalo, NY · 3y exp

Samsung · University at Buffalo

“Software engineer who built a reusable React component package (UI modules, auth helpers, API client wrappers) for an AI SaaS background-removal project, emphasizing performance (tree shaking/dynamic imports) and reliability (Jest + Storybook). Also delivered a unified REST API for Samsung Big Data Portal, resolving cross-team issues by standardizing schemas, improving validation/logging, and operating effectively amid shifting requirements.”

Agile, Ansible, Apache Kafka, Apache Spark, Authentication, AWS, +123 more

Nikitha Margadi

Screened

Mid-level Data Engineer specializing in cloud lakehouse, streaming, and MLOps

Texas, USA · 5y exp

AT&T · Cal State Fullerton

“Data engineer at AT&T focused on large-scale telecom (5G/IoT) data platforms, owning end-to-end pipelines from Kafka/Azure ingestion through Databricks/Delta Lake transformations to serving analytics and ML. Has operated at very high volumes (~50+ TB/day) and delivered measurable performance gains (25–30% faster processing) plus improved reliability via Airflow monitoring, robust data quality checks, and resilient external data collection patterns (rate limiting, retries, dynamic schemas).”

Python, SQL, PL/SQL, PySpark, Apache Spark, Apache Kafka, +114 more

Lakshmi Nallani

Screened

Junior Data Analyst specializing in analytics, BI, and machine learning

College Park, MD · 1y exp

USA TODAY · University of Maryland, College Park

“Analytics-focused candidate with experience owning end-to-end data projects across AI transcription, retail forecasting, and transportation revenue analytics. They combine strong SQL/Python pipeline skills with dashboarding and stakeholder alignment, citing measurable impact including 60% lower ETL latency, 18% better forecast accuracy, and 25% operational efficiency gains.”

SQL, Python, Tableau, Power BI, ETL Pipelines, Predictive Modeling, +105 more

Rami Jaloudi

Screened

Senior Applications Engineer specializing in legal technology and eDiscovery

New York, NY · 16y exp

Conduent · NJIT

“Early-stage founder candidate exploring an AI-enabled legal tech startup focused on document intelligence, secure workflows, and enterprise automation. Brings a rare blend of technical architecture fluency and product/business thinking, with clear firsthand insight into legal and document-heavy operational pain points.”

Analytics, Workflow automation, Risk management, RabbitMQ, Microsoft Azure, OAuth 2.0, +221 more

Prathamesh Pramod Dhawale

Screened

Mid-level Software Engineer specializing in backend systems, AI, and FinTech

Remote, US · 4y exp

Easley-Dunn Productions · USC

“Backend engineer with experience at HSBC and Machinations who has delivered major production performance wins (cutting large trade-file upload times from ~13–15s to ~2s) using chunked parallel processing with strong reliability controls. Also built and shipped an applied AI RAG workflow using Langflow + Cohere embeddings + FAISS with hosted/local LLM fallbacks (Hugging Face, Ollama) and production-grade guardrails, observability, and evaluation.”

Generative AI, LangGraph, FastAPI, Java, Python, Spring Boot, +163 more

Jareena kowsar shaik

Screened

Mid-level Machine Learning & GenAI Engineer specializing in LLMs, RAG, and NLP

New York, NY · 6y exp

Morgan Stanley

“Built and deployed an LLM-powered customer support assistant (‘Notable Assistant’) focused on automating common post-customer queries while maintaining multi-turn context and meeting scalability/latency needs. Experienced with production orchestration and operations using Kubernetes and Apache Airflow (DAG-based ETL, scheduling, monitoring/alerts), and has partnered closely with customer service stakeholders to align chatbot behavior with brand voice through iterative testing.”

A/B Testing, Agile, Amazon Bedrock, Amazon Redshift, AWS, AWS Glue, +209 more

Ramya Latha

Screened

Senior AI/ML & Data Engineer specializing in Generative AI and RAG systems

Birmingham, AL · 8y exp

Regions Bank

“GenAI/RAG engineer who has deployed a production policy/regulatory search assistant for a financial client using LangChain + Vertex AI, FastAPI, Docker/Kubernetes, and Airflow-orchestrated data pipelines. Demonstrated measurable impact with 50–60% latency reduction and 70% fewer pipeline failures, plus KPI-driven grounding evaluation (90%+ target) and strong cross-functional collaboration with compliance/business teams.”

Amazon Redshift, Amazon S3, Apache Airflow, Apache Cassandra, Apache Hadoop, Apache Hive, +200 more

Radhe KC

Screened

Senior Engineering Manager specializing in Big Data and Cloud Data Platforms

18y exp

NetApp

“Engineering leader focused on developer platforms and open-source frameworks/SDKs, with strong community and release-engineering chops. Drove major reliability and DX improvements (30–50% faster release cycles; 2–3x repeat contributors; ~50% faster onboarding) and led an incremental Python monolith to TypeScript event-driven migration using Protobuf contracts, feature flags, and a plugin architecture to preserve backward compatibility.”

Agile, Automation, AWS, AWS CloudFormation, AWS Lambda, Budgeting, +171 more
