Vetted Apache Hadoop Professionals

Pre-screened and vetted.

Apache Hadoop, Python, Docker, SQL, AWS, Apache Spark

Subhash Krishnamoorthy

Screened

Executive Technology Leader specializing in digital transformation, headless e-commerce, and cloud architecture

Chesterfield, VA | 25y exp

Hamilton Beach | University of Phoenix

“Technology leader focused on business-aligned roadmaps and integration-heavy ecommerce platforms. Recently delivered an on-time launch for lutusooking.com (a premium Hamilton Beach brand) by coordinating UX/UI, component-based middleware, BigCommerce, Algolia search, personalization/recommendations, payments, and supply chain integrations, and later improved scalability via a Jitterbit iPaaS approach proven during Black Friday/Cyber Monday traffic.”

Agile, Android, Ansible, AWS, Business Intelligence, CI/CD, +334 more

Abhishek Gawali

Screened

Mid-level Data Engineer specializing in cloud ETL and real-time streaming

New York, NY | 6y exp

PNC | Rochester Institute of Technology

“Data engineer focused on AWS + Spark/Databricks pipelines, including an end-to-end nightly loan-data ingestion flow (~2.2M records) from Postgres/S3 through Glue and Databricks into a DWH with layered validation and alerting. Also built real-time streaming with Kafka + Spark Structured Streaming and a master’s project streaming Reddit data for sentiment analysis under ambiguous requirements and tight budget constraints.”

SDLC, Agile, Waterfall, Python, SQL, R, +105 more

Srijitha Katkuri

Screened

Mid-level Data Analyst specializing in healthcare and business intelligence

Michigan, USA | 4y exp

Banner Health | Trine University

“Healthcare analytics candidate with hands-on experience turning messy EHR, billing, and operational data into validated SQL datasets and automated Python/Airflow pipelines. They appear strongest in hospital KPI reporting—especially length of stay, readmissions, retention, and bed utilization—and have owned projects from metric definition through Power BI delivery and impact measurement.”

SQL, Python, Pandas, NumPy, Power BI, Tableau, +70 more

Alekya Battu

Screened

Mid-level Data Scientist specializing in machine learning, MLOps, and cloud analytics

USA | 5y exp

Wells Fargo | Wilmington University

“Senior data scientist with ~5 years’ experience building production ML/NLP systems in finance (Wells Fargo) and deep learning for sensor analytics in connected vehicles (Medtronic). Has delivered end-to-end platforms combining time-series forecasting with transformer-based NLP, including automated drift monitoring/retraining (MLflow + Airflow) and standardized Docker/CI/CD deployments; achieved a reported 22% precision improvement after domain fine-tuning.”

Python, SQL, R, Classification, XGBoost, Random Forest, +171 more

Hard Parikh

Screened

Mid-level Software Engineer specializing in data platforms, distributed systems, and applied AI

Austin, TX | 3y exp

Compass Group | UC Riverside

“AI/full-stack product engineer currently owning Fleck Intelligent Survey Chatbot at E15, a production RAG analytics assistant embedded in Compass Group dashboards for 300+ field operators. Stands out for combining LLM orchestration, analytics engineering, and strong systems thinking—cutting hallucinated numeric answers from 14% to 2%, reducing backlog 62%, and previously delivering a low-level protocol redesign at Amadeus that cut P99 latency by 56%.”

Python, SQL, C++, Java, TypeScript, JavaScript, +113 more

Mrunal Kakirwar

Screened

Mid-level Full-Stack Engineer specializing in cloud-native microservices and AI automation

USA | 5y exp

Fuel AI | California State University

“Software engineer/product owner who has led end-to-end delivery of AI and content-management platforms, including building RAG-based reliability improvements and migrating fragile systems to containerized AWS ECS/Kubernetes with Terraform-managed CI/CD. Experienced designing event-driven microservices (SQS/SNS/RabbitMQ), scaling queue consumers with autoscaling, and creating internal Python tooling to standardize data connectors (e.g., BigQuery/Airtable/internal APIs) to speed iteration.”

Python, JavaScript, TypeScript, Shell Scripting, Java, SQL, +108 more

BHEEMA SABILLA

Screened

Mid-level Data Engineer specializing in Lakehouse, Streaming, and ML/LLM data systems

Remote, USA | 3y exp

Discover | University of South Dakota

“Built and productionized an enterprise retrieval-augmented generation platform for internal knowledge over large unstructured corpora, emphasizing trust via strict citation/grounding and hybrid retrieval (BM25 + FAISS + cross-encoder re-ranking). Demonstrates strong scaling and cost/latency optimization through incremental indexing/embedding and index partitioning, plus disciplined evaluation/observability practices. Has experience operationalizing pipelines with Airflow/Databricks/GitHub Actions and partnering closely with risk & compliance stakeholders on auditability requirements.”

Python, PySpark, SQL, Scala, Pandas, NumPy, +157 more

Lingyi Wu

Screened

Mid-level Financial/Data Analyst specializing in analytics, forecasting, and healthcare/MarTech data

Los Angeles, CA | 4y exp

MINISO | Westcliff University

“Growth/creative marketer from Esleydunn Games who uses Google Analytics to integrate cross-channel performance data (TikTok, YouTube, LinkedIn, Facebook) and run structured A/B tests on video ad length and layout. Reported reducing CPA by 20 per customer when leveraging YouTube and TikTok, and improved CTR through CTA/button placement testing and ongoing user-feedback loops (forum/WeChat topics).”

Python, SQL, R, Machine Learning, Deep Learning, Feature Engineering, +104 more

Nandini Kosgi

Screened

Mid-level AI/ML Engineer specializing in NLP, RAG systems, and real-time risk modeling

PA, USA | 4y exp

Capital One | Robert Morris University

“AI/ML Engineer with 4+ years of experience (Capital One, Odin Technologies) and a master’s in Data Analytics (4.0 GPA) who has deployed LLM/RAG systems to production for compliance/risk and document review. Strong in orchestration and MLOps (Airflow, Kubernetes, MLflow, GitHub Actions) and in tackling real-world LLM constraints like latency, context limits, and data privacy, with measurable impact (20%+ manual review reduction; 33% faster release cycles).”

Agentic AI, Anomaly Detection, Apache Hadoop, Apache Hive, Apache Kafka, Apache Spark, +115 more

Vardhan Addakattu

Screened

Mid-level Data Scientist specializing in Generative AI and NLP for financial risk

Glassboro, NJ | 4y exp

S&P Global | Rowan University

“Built and shipped production generative AI/RAG assistants in regulated financial contexts (S&P Global), automating compliance-oriented Q&A over earnings reports/filings with grounded answers and citations. Experienced across the full stack—AWS-based ingestion (PySpark/Glue), vector retrieval + LangChain agents, GPT-4/Claude model selection, and production reliability (monitoring, caching, retries) plus rigorous evaluation and regression testing.”

Python, R, SQL, PySpark, Pandas, Apache Spark, +111 more

Hema Edavalapati

Screened

Mid-level AI/ML Engineer specializing in cloud data engineering and GenAI

Florida, USA | 6y exp

LexisNexis | University of South Florida

“AI/LLM engineer with production experience in legal tech: built a GPT-4 + LangChain RAG summarization system at Govpanel that reduced legal case-file review time by 50%+. Previously at LexisNexis, orchestrated end-to-end Airflow data/AI pipelines processing 5M+ legal documents daily, improving ETL runtime by 35% with robust validation, monitoring, and SLAs.”

SQL, SQL query optimization, Python, Pandas, NumPy, PySpark, +159 more

Sridharan Kairmaknoda

Screened

Mid-level Data Engineer specializing in cloud data platforms and real-time analytics

Saint Louis, MO | 5y exp

Cigna | Saint Louis University

“Customer-facing data engineering professional who builds and deploys real-time reporting/dashboard solutions, gathering reporting and compliance requirements through direct stakeholder engagement. Experienced with Google Cloud IAM governance, secure integrations (encryption, audit logging), and fast production troubleshooting of ETL/pipeline failures with follow-on monitoring and automated recovery improvements; motivated by hands-on, travel-oriented customer work.”

SDLC, Agile, Waterfall, Python, SQL, Jupyter Notebook, +137 more

Dharam Kharwar

Screened

Senior Site Reliability Engineer specializing in hybrid cloud infrastructure and DevOps

Sanford, FL | 15y exp

Fulcrum Analytics | NYU

“Infrastructure/platform engineer with hands-on ownership of on-prem OpenStack/VMware and Kubernetes clusters deployed via Kubespray, including OpenStack-integrated networking and Ceph-backed storage. Built a GitOps-style Terraform delivery model in GitLab with CI/CD and security scanning (SAST/DAST/Checkov), and operated a hybrid on-prem/AWS ecommerce architecture with PCI-segmented payment systems over site-to-site VPN.”

Site Reliability Engineering, DevOps, Automation, Agile, SaaS, Incident Response, +115 more

Varun Gattamaneni

Screened

Mid-level GenAI Engineer specializing in LLM fine-tuning, RAG, and MLOps

Glassboro, NJ | 5y exp

HCLTech | Rowan University

“Healthcare-focused LLM engineer who deployed a production triage and clinical knowledge retrieval assistant using RAG and LangGraph-orchestrated multi-agent workflows. Emphasizes clinical safety and compliance with robust hallucination controls, HIPAA/PHI protections (tokenization, encryption, audit logging, zero-retention), and human-in-the-loop escalation; reports a 75% latency reduction in a healthcare agent system.”

Python, Pandas, NumPy, R, SQL, Bash, +150 more

Venkatesh Sanaboina

Screened

Senior AI/ML Engineer specializing in Generative AI, LLMs, and MLOps

Tampa, FL | 9y exp

Verizon | Jawaharlal Nehru Technological University

“Telecom (Verizon) AI/ML practitioner who built a production multimodal system that ingests messy customer issue reports (calls, chats, emails, screenshots, videos) and turns them into confidence-scored incident summaries with reproducible steps and evidence links. Also built KPI/alarm-to-ticket correlation to rank likely root-cause domains (RAN/Core/Transport), cutting triage from hours to minutes and improving MTTR.”

A/B Testing, Agile, Amazon Redshift, Amazon S3, Amazon SageMaker, Anomaly Detection, +168 more

Dyuti Vartak

Screened

Junior Data Scientist/Data Engineer specializing in ML pipelines and analytics

Seattle, WA | 1y exp

Docsumo | University of Washington

“Machine Learning Intern at Docsumo who delivered a customer-facing fraud-detection solution end-to-end: rebuilt the pipeline, deployed a Random Forest model, and shipped a Python/Flask microservice on AWS SageMaker. Drove measurable production impact (precision +30%, processing time cut in half, manual review -60%, customer satisfaction +15%) and demonstrated strong customer integration and live-incident response skills.”

AWS, Bash, BigQuery, C, C++, CSS, +103 more

Harsha Sikha

Screened

Mid-level AI/ML Engineer specializing in Generative AI and data engineering

Armonk, New York | 4y exp

IBM | Saint Peter's University

“IBM engineer who built and deployed a production RAG-based LLM assistant using LangChain/FAISS with a fine-tuned LLaMA model, served via FastAPI microservices on Kubernetes, achieving 99%+ uptime. Demonstrates strong practical expertise in reducing hallucinations (semantic chunking + metadata-driven retrieval) and managing latency, plus mature MLOps practices (Airflow/dbt pipelines, MLflow tracking, monitoring, A/B and shadow deployments) and effective collaboration with non-technical stakeholders.”

A/B Testing, Agile, Anomaly Detection, API Development, Apache Hadoop, Apache Hive, +157 more

Yun-Hao Lee

Screened

Junior Machine Learning Engineer specializing in LLM deployment and computer vision

Dallas, TX | 2y exp

Lab for Intelligent Storage and Computing | University of Texas at Dallas

“Robotics/AI candidate who built an AI-driven landmark location tool during a summer internship at Mobile Drive, combining YOLOv5 object detection with OpenStreetMap-based geolocation to handle dense, cluttered urban environments. Also researched deploying LLM-based agents on constrained hardware using quantization plus LoRA/continuous learning, improving accuracy from ~80% to ~92%, with an emphasis on production logging for reliability.”

Python, C, C++, R, SQL, Java, +91 more

Harsha Chimirala

Screened

Mid-level Data Engineer specializing in cloud data platforms and scalable ETL pipelines

USA | 3y exp

HCLTech | University of New Haven

“Data engineer (~4 years) with full-stack delivery experience (Next.js App Router/TypeScript + React) building a real-time operations monitoring dashboard backed by Kafka and orchestrated data pipelines. Strong production focus: Airflow + CloudWatch monitoring, automated Python/SQL validation (99.5% accuracy), and CI/CD with Jenkins/Docker; has delivered measurable improvements in latency, pipeline reliability, and query performance (Postgres/Redshift).”

Python, SQL, PySpark, Scala, Bash, Apache Spark, +80 more

Tharun Kshathriya Sangaraju

Screened

Mid-level AI Engineer specializing in LLM orchestration, RAG, and multi-agent systems

Houston, TX | 4y exp

University of Houston | University of Houston

“Research Assistant at the University of Houston who built and live-deployed a production RAG system for 1000+ research documents, using hybrid retrieval (dense+BM25+RRF) with cross-encoder reranking and RAGAS-based evaluation; reported 66% MRR, 0.85+ faithfulness, and 68% lower LLM inference costs. Also built a deployed LangGraph multi-agent research system (Researcher/Critic/Writer) with tool integrations (Tavily, arXiv) and dual memory (ChromaDB + Neo4j), plus freelance automation work delivering a WhatsApp chatbot and n8n workflows for a wholesale clothing business.”

Agentic AI, AI Agents, API Integration, Apache Airflow, Apache Hadoop, Apache Kafka, +118 more

Sai Harshith Varma Pericherla

Screened

Mid-level Data Engineer specializing in cloud ETL/ELT and lakehouse architecture

Jersey City, NJ | 4y exp

State Street | University of New Haven

“Data engineer focused on sales/marketing analytics pipelines, owning ingestion from CRMs/ad platforms through warehouse serving and dashboards at ~hundreds of thousands of records/day. Built reliability-focused systems including dbt/SQL/Python data quality gates with alerting, a resilient web-scraping pipeline (retries/backoff, anti-bot tactics, schema-change detection, backfills), and a versioned internal REST API with caching and strong developer usability.”

SQL, Python, Pandas, NumPy, Scikit-learn, Java, +151 more

Sheshikanth Pothuganti

Screened

Mid-level Data Engineer specializing in real-time streaming and cloud data platforms

New York, NY | 4y exp

Wells Fargo | University of Birmingham

“Data engineer with Wells Fargo experience owning an end-to-end lakehouse ETL pipeline on Databricks/Azure Data Factory, processing ~480GB daily and implementing robust data quality/reconciliation across 40+ tables to reach ~99.3% reliability. Strong in performance optimization (cut runtime 5.5h→3.8h), CI/CD and monitoring, and resilient external/API ingestion with retries, schema validation, and backfills.”

Python, SQL, Java, Scala, R, PostgreSQL, +122 more

Swathi Reddy

Screened

Mid-Level Full-Stack Software Engineer specializing in AWS cloud and Python/Java

New York, NY | 4y exp

Rebecca Everlene Trust Company | NJIT

“Accenture consultant who shipped an LLM-based production solution during a client cloud migration to parse application code and identify only the database objects actually used, cutting migration time by 30% and accelerating realization of cloud cost benefits. Emphasizes production robustness with timeouts/retries/fallback routing, validation, observability, and a disciplined eval/monitoring loop that turns failures into regression tests.”

Python, Java, JavaScript, Shell Scripting, SQL, PowerShell, +97 more

Ishaan Umesh Mandliya

Screened

Mid-Level Full-Stack Software Engineer specializing in AI/ML and cloud-native systems

Los Angeles, CA | 3y exp

DevolvedAI | USC

“At BondiTech, built and deployed customer-facing backend improvements for enterprise dashboards handling 1M+ records, redesigning a .NET/Entity Framework API with server-side pagination/filtering and feature-flagged rollout to cut latency from ~15s to ~2s. Experienced integrating customer systems into existing APIs, including stabilizing a legacy CRM sync by normalizing inconsistent IDs, handling strict rate limits with batching, and adding DLQs plus reconciliation reporting.”

Agile, Amazon DynamoDB, Amazon EC2, Amazon S3, Amazon SQS, Amazon SNS, +158 more
