
Vetted Apache Hadoop Professionals

Pre-screened and vetted.

Apache Hadoop, Python, Docker, SQL, Apache Spark, AWS

Kalyan Pavuluri

Screened

Mid-level Full-Stack Java Developer specializing in cloud-native microservices and React

5y exp
Northern Trust, Central Michigan University

“Full-stack engineer who owned enterprise workflow platforms end-to-end at Northern Trust and Elevance Health—building NestJS/Java Spring Boot APIs, React UIs, and cloud deployments on GCP Cloud Run. Strong in data-heavy applications (hundreds of thousands of records) with proven production performance tuning (indexing/query rewrites, Cloud Run concurrency/min instances) and secure RBAC via Azure AD.”

Agile, AJAX, Amazon API Gateway, Amazon CloudWatch, Amazon DynamoDB, Amazon EC2, +169 more

Alekya Battu

Screened

Mid-level Data Scientist specializing in ML, NLP, and MLOps

USA, 5y exp
Wells Fargo, Wilmington University

“Senior data scientist with ~5 years’ experience building production ML/NLP systems in finance (Wells Fargo) and deep learning for sensor analytics in connected vehicles (Medtronic). Has delivered end-to-end platforms combining time-series forecasting with transformer-based NLP, including automated drift monitoring/retraining (MLflow + Airflow) and standardized Docker/CI/CD deployments; achieved a reported 22% precision improvement after domain fine-tuning.”

Agile, Scrum, Kanban, SDLC, CI/CD, Waterfall, +144 more

Ankush Banthia

Screened

Senior Data & Platform Engineer specializing in cloud-native streaming and distributed systems

USA, 10y exp
JPMorgan Chase, New York Institute of Technology

“Financial data engineer who has built and operated high-volume batch + streaming pipelines (200–300 GB/day; 5–10k events/sec) using AWS, Spark/Delta, Airflow, Kafka, and Snowflake, with strong emphasis on data quality and reliability. Demonstrated measurable impact via 99.9% SLA adherence, major reductions in bad records/nulls, MTTR improvements, and significant latency/runtime/query performance gains; also built a distributed web-scraping system processing 5–10M records/day with anti-bot and schema-drift defenses.”

Onboarding, Mentoring, Agile, Scrum, Jira, Confluence, +150 more

Mrunal Kakirwar

Screened

Mid-level Full-Stack Engineer specializing in cloud-native microservices and AI automation

USA, 5y exp
Fuel AI, California State University

“Software engineer/product owner who has led end-to-end delivery of AI and content-management platforms, including building RAG-based reliability improvements and migrating fragile systems to containerized AWS ECS/Kubernetes with Terraform-managed CI/CD. Experienced designing event-driven microservices (SQS/SNS/RabbitMQ), scaling queue consumers with autoscaling, and creating internal Python tooling to standardize data connectors (e.g., BigQuery/Airtable/internal APIs) to speed iteration.”

Python, JavaScript, TypeScript, Shell Scripting, Java, SQL, +108 more

BHEEMA SABILLA

Screened

Mid-level Data Engineer specializing in Lakehouse, Streaming, and ML/LLM data systems

Remote, USA, 3y exp
Discover, University of South Dakota

“Built and productionized an enterprise retrieval-augmented generation platform for internal knowledge over large unstructured corpora, emphasizing trust via strict citation/grounding and hybrid retrieval (BM25 + FAISS + cross-encoder re-ranking). Demonstrates strong scaling and cost/latency optimization through incremental indexing/embedding and index partitioning, plus disciplined evaluation/observability practices. Has experience operationalizing pipelines with Airflow/Databricks/GitHub Actions and partnering closely with risk & compliance stakeholders on auditability requirements.”

Python, PySpark, SQL, Scala, Pandas, NumPy, +157 more

Lingyi Wu

Screened

Mid-level Financial/Data Analyst specializing in analytics, forecasting, and healthcare/MarTech data

Los Angeles, CA, 4y exp
MINISO, Westcliff University

“Growth/creative marketer from Esleydunn Games who uses Google Analytics to integrate cross-channel performance data (TikTok, YouTube, LinkedIn, Facebook) and run structured A/B tests on video ad length and layout. Reported reducing CPA by 20 per customer when leveraging YouTube and TikTok, and improved CTR through CTA/button placement testing and ongoing user-feedback loops (forum/WeChat topics).”

Python, SQL, R, Machine Learning, Deep Learning, Feature Engineering, +104 more

Varun Kumar Kota

Screened

Mid-level Software Engineer specializing in cloud, data engineering, and AI/ML

Remote, 3y exp
Handshake, University at Buffalo

“Backend/platform engineer who owned an AI-powered resume optimization service end-to-end (FastAPI + Celery + Redis/Postgres) and optimized it for unpredictable LLM task latency. Strong Kubernetes/GitOps practitioner (Helm, autoscaling, probes, ArgoCD rollbacks) with experience in on-prem-to-cloud migrations using Terraform and CDC-based replication, plus real-time Kafka pipelines monitored via Prometheus/Grafana.”

Python, SQL, R, Java, JavaScript, Jira, +125 more

Hema Edavalapati

Screened

Mid-level AI/ML Engineer specializing in cloud data engineering and GenAI

Florida, USA, 6y exp
LexisNexis, University of South Florida

“AI/LLM engineer with production experience in legal tech: built a GPT-4 + LangChain RAG summarization system at Govpanel that reduced legal case-file review time by 50%+. Previously at LexisNexis, orchestrated end-to-end Airflow data/AI pipelines processing 5M+ legal documents daily, improving ETL runtime by 35% with robust validation, monitoring, and SLAs.”

SQL, SQL query optimization, Python, Pandas, NumPy, PySpark, +159 more

Vardhan Addakattu

Screened

Mid-level Data Scientist specializing in Generative AI and NLP for financial risk

Glassboro, NJ, 4y exp
S&P Global, Rowan University

“Built and shipped production generative AI/RAG assistants in regulated financial contexts (S&P Global), automating compliance-oriented Q&A over earnings reports/filings with grounded answers and citations. Experienced across the full stack—AWS-based ingestion (PySpark/Glue), vector retrieval + LangChain agents, GPT-4/Claude model selection, and production reliability (monitoring, caching, retries) plus rigorous evaluation and regression testing.”

Python, R, SQL, PySpark, Pandas, Apache Spark, +111 more

Nandini Kosgi

Screened

Mid-level AI/ML Engineer specializing in NLP, RAG systems, and real-time risk modeling

PA, USA, 4y exp
Capital One, Robert Morris University

“AI/ML Engineer with 4+ years of experience (Capital One, Odin Technologies) and a master’s in Data Analytics (4.0 GPA) who has deployed LLM/RAG systems to production for compliance/risk and document review. Strong in orchestration and MLOps (Airflow, Kubernetes, MLflow, GitHub Actions) and in tackling real-world LLM constraints like latency, context limits, and data privacy, with measurable impact (20%+ manual review reduction; 33% faster release cycles).”

Anomaly Detection, Apache Hadoop, Apache Hive, Apache Kafka, Apache Spark, AWS, +115 more

Sridharan Kairmaknoda

Screened

Mid-level Data Engineer specializing in cloud data platforms and real-time analytics

Saint Louis, MO, 5y exp
Cigna, Saint Louis University

“Customer-facing data engineering professional who builds and deploys real-time reporting/dashboard solutions, gathering reporting and compliance requirements through direct stakeholder engagement. Experienced with Google Cloud IAM governance, secure integrations (encryption, audit logging), and fast production troubleshooting of ETL/pipeline failures with follow-on monitoring and automated recovery improvements; motivated by hands-on, travel-oriented customer work.”

SDLC, Agile, Waterfall, Python, SQL, Jupyter Notebook, +137 more

Varun Gattamaneni

Screened

Mid-level GenAI Engineer specializing in LLM fine-tuning, RAG, and MLOps

Glassboro, NJ, 5y exp
HCLTech, Rowan University

“Healthcare-focused LLM engineer who deployed a production triage and clinical knowledge retrieval assistant using RAG and LangGraph-orchestrated multi-agent workflows. Emphasizes clinical safety and compliance with robust hallucination controls, HIPAA/PHI protections (tokenization, encryption, audit logging, zero-retention), and human-in-the-loop escalation; reports a 75% latency reduction in a healthcare agent system.”

Python, Pandas, NumPy, R, SQL, Bash, +150 more

Aishwarya Thorat

Screened

Intern Data Scientist specializing in ML engineering and LLM agentic workflows

San Francisco, CA, 6y exp
Contentstack, San José State University

“Built an agentic, multi-step LLM system that generates full-stack code for API integrations using LangChain orchestration, Pinecone/SentenceBERT RAG, and a human-in-the-loop feedback loop for iterative code refinement. Also collaborated with non-technical content writers and PMs during a Contentstack internship to deliver a Slack-based AI workflow that generates and brand-checks articles with one-click approvals.”

A/B Testing, Amazon Redshift, Amazon S3, API Integration, AWS, AWS Glue, +129 more

Venkatesh Sanaboina

Screened

Senior AI/ML Engineer specializing in Generative AI, LLMs, and MLOps

Tampa, FL, 9y exp
Verizon, Jawaharlal Nehru Technological University

“Telecom (Verizon) AI/ML practitioner who built a production multimodal system that ingests messy customer issue reports (calls, chats, emails, screenshots, videos) and turns them into confidence-scored incident summaries with reproducible steps and evidence links. Also built KPI/alarm-to-ticket correlation to rank likely root-cause domains (RAN/Core/Transport), cutting triage from hours to minutes and improving MTTR.”

A/B Testing, Agile, Amazon Redshift, Amazon S3, Amazon SageMaker, Anomaly Detection, +168 more

Ishaan Umesh Mandliya

Screened

Mid-Level Full-Stack Software Engineer specializing in AI/ML and cloud-native systems

Los Angeles, CA, 3y exp
DevolvedAI, USC

“At BondiTech, built and deployed customer-facing backend improvements for enterprise dashboards handling 1M+ records, redesigning a .NET/Entity Framework API with server-side pagination/filtering and feature-flagged rollout to cut latency from ~15s to ~2s. Experienced integrating customer systems into existing APIs, including stabilizing a legacy CRM sync by normalizing inconsistent IDs, handling strict rate limits with batching, and adding DLQs plus reconciliation reporting.”

Agile, Amazon DynamoDB, Amazon EC2, Amazon S3, Amazon SQS, Amazon SNS, +158 more

Dyuti Vartak

Screened

Junior Data Scientist/Data Engineer specializing in ML pipelines and analytics

Seattle, WA, 1y exp
Docsumo, University of Washington

“Machine Learning Intern at Docsumo who delivered a customer-facing fraud-detection solution end-to-end: rebuilt the pipeline, deployed a Random Forest model, and shipped a Python/Flask microservice on AWS SageMaker. Drove measurable production impact (precision +30%, processing time cut in half, manual review -60%, customer satisfaction +15%) and demonstrated strong customer integration and live-incident response skills.”

AWS, Bash, BigQuery, C, C++, CSS, +103 more

Harsha Sikha

Screened

Mid-level AI/ML Engineer specializing in Generative AI and data engineering

Armonk, New York, 4y exp
IBM, Saint Peter's University

“IBM engineer who built and deployed a production RAG-based LLM assistant using LangChain/FAISS with a fine-tuned LLaMA model, served via FastAPI microservices on Kubernetes, achieving 99%+ uptime. Demonstrates strong practical expertise in reducing hallucinations (semantic chunking + metadata-driven retrieval) and managing latency, plus mature MLOps practices (Airflow/dbt pipelines, MLflow tracking, monitoring, A/B and shadow deployments) and effective collaboration with non-technical stakeholders.”

A/B Testing, Agile, Anomaly Detection, API Development, Apache Hadoop, Apache Hive, +157 more

Yun-Hao Lee

Screened

Junior Machine Learning Engineer specializing in LLM deployment and computer vision

Dallas, TX, 2y exp
Lab for Intelligent Storage and Computing, University of Texas at Dallas

“Robotics/AI candidate who built an AI-driven landmark location tool during a summer internship at Mobile Drive, combining YOLOv5 object detection with OpenStreetMap-based geolocation to handle dense, cluttered urban environments. Also researched deploying LLM-based agents on constrained hardware using quantization plus LoRA/continuous learning, improving accuracy from ~80% to ~92%, with an emphasis on production logging for reliability.”

Python, C, C++, R, SQL, Java, +91 more

Harsha Chimirala

Screened

Mid-level Data Engineer specializing in cloud data platforms and scalable ETL pipelines

USA, 3y exp
HCLTech, University of New Haven

“Data engineer (~4 years) with full-stack delivery experience (Next.js App Router/TypeScript + React) building a real-time operations monitoring dashboard backed by Kafka and orchestrated data pipelines. Strong production focus: Airflow + CloudWatch monitoring, automated Python/SQL validation (99.5% accuracy), and CI/CD with Jenkins/Docker; has delivered measurable improvements in latency, pipeline reliability, and query performance (Postgres/Redshift).”

Python, SQL, PySpark, Scala, Bash, Apache Spark, +80 more

Tharun Kshathriya Sangaraju

Screened

Mid-level AI Engineer specializing in LLM orchestration, RAG, and multi-agent systems

Houston, TX, 4y exp
University of Houston, University of Houston

“Research Assistant at the University of Houston who built and live-deployed a production RAG system for 1000+ research documents, using hybrid retrieval (dense+BM25+RRF) with cross-encoder reranking and RAGAS-based evaluation; reported 66% MRR, 0.85+ faithfulness, and 68% lower LLM inference costs. Also built a deployed LangGraph multi-agent research system (Researcher/Critic/Writer) with tool integrations (Tavily, arXiv) and dual memory (ChromaDB + Neo4j), plus freelance automation work delivering a WhatsApp chatbot and n8n workflows for a wholesale clothing business.”

API Integration, Apache Airflow, Apache Hadoop, Apache Kafka, Apache Spark, ChromaDB, +118 more

Lokesh Jain

Screened

Senior Data Engineer specializing in cloud data platforms and ML pipelines

5y exp
Wayfair, University at Buffalo

“Built and deployed AcademiQ Ai, a production LLM-based teaching assistant using GPT/BERT with RAG (LangChain + Pinecone) to handle large student notes and generate adaptive explanations/quizzes. Demonstrated measurable retrieval-quality gains (18% precision improvement, 22% less irrelevant context) by tuning similarity thresholds and chunking based on user satisfaction signals. Also orchestrated terabyte-scale, real-time demand forecasting pipelines using Airflow and Kubeflow on GCP with strong monitoring, shadow deployment, and feedback-loop practices.”

A/B Testing, Agile, Angular, Apache Hadoop, Apache Kafka, AWS, +91 more

Nandini Reinthala

Screened

Mid-Level Full-Stack Python Developer specializing in AI and data platforms

Dallas, TX, 5y exp
Fannie Mae, University of Central Missouri

“Full-stack engineer who builds TypeScript/React SPAs on Python (Flask/FastAPI) backends and has hands-on experience integrating AI components (Azure OpenAI, LangChain, vector databases) into user workflows. Has built internal AI-enabled dashboards/search tools for analysts and business users, emphasizing typed API contracts, CI/CD-driven quality, and microservices reliability patterns (monitoring, retries, idempotency) at scale.”

Agile, AJAX, Amazon CloudFront, Amazon EC2, Amazon EMR, Amazon RDS, +146 more