Vetted Apache Spark Professionals

Pre-screened and vetted.

Apache Spark, Python, Docker, SQL, AWS, CI/CD

Sharath Kumar

Screened

Mid-level AI/ML Engineer specializing in LLM fine-tuning, RAG, and MLOps

Remote, USA | 5y exp

HP | Wilmington University

“AI/ML engineer with HP experience building and productionizing an LLM-powered document intelligence platform (LangChain + Pinecone) to deliver semantic search and contextual Q&A across millions of enterprise support documents. Demonstrates strong MLOps and scaling expertise (Airflow, Kubernetes autoscaling, Triton GPU inference, monitoring with Prometheus/W&B) plus a structured approach to evaluation (A/B tests, shadow deployments, failover) and effective collaboration with non-technical stakeholders.”

Python, SQL, PostgreSQL, BigQuery, Snowflake, Bash (+142 more)

Harini Kv

Screened

Mid-level AI/ML Engineer specializing in GenAI, NLP, and MLOps

Dallas, TX | 7y exp

Equinix | Fitchburg State University

“GenAI/data engineering practitioner with production experience across Equinix, Optum, and Citibank—built an Azure OpenAI (GPT-4) + LangChain document intelligence platform processing 1.5M+ docs/month and a HIPAA-compliant Airflow healthcare pipeline handling 5M+ claims/day. Also delivered a real-time fraud detection + explainability system using LightGBM and a fine-tuned T5 NLG component, improving fraud accuracy by 15%+ while partnering closely with compliance stakeholders.”

Python, SQL, PySpark, Bash, Java, JavaScript (+169 more)

Edin Samuel Joselyn Chandrakumar

Screened

Senior Engineering Manager specializing in cloud platforms and risk systems

16y exp

Capital One | Government College of Technology, Coimbatore

“Engineering leader who proposed and delivered a new API-based document management platform to replace a vendor-dependent system, improving latency by ~1s and availability to 99.9% while migrating legacy data. Also drove Python-based automation of ~12 workflows via third-party API integrations and led an SSO/auth integration focused on backward compatibility and high login success rates.”

A/B Testing, Agile, Amazon CloudWatch, Amazon DynamoDB, Amazon ECS, Amazon RDS (+88 more)

Mahesh Babu

Screened

Mid-level Full-Stack Developer specializing in cloud-native FinTech systems

New York, NY | 4y exp

Goldman Sachs | Clemson University

“Built a lightweight internal JavaScript analytics tracker capturing user interactions (clicks, page views, custom events) with debounced batching, automatic session tracking, and offline event caching via a localStorage-backed append-only queue. Demonstrates practical performance optimization mindset (profiling, memoization/caching) and React performance tuning.”

Agile, Amazon EC2, Amazon EKS, Amazon RDS, Amazon S3, Angular (+97 more)

Siva Sai Kumar Mogalluru

Screened

Mid-level AI Engineer specializing in Generative AI, MLOps, and NLP for finance and healthcare

Remote, USA | 4y exp

EY | University of South Florida

“Built and deployed a secure, production LLM-based document summarization and risk-highlighting tool for financial auditors, running inside a private Azure environment to protect confidential data. Focused on reliability (hallucination mitigation via retrieval-based prompts and source citations) and validated performance through comparisons to auditor summaries plus a user pilot, cutting review time by about half.”

A/B Testing, Agile, Anomaly Detection, Apache Airflow, Apache Spark, Azure DevOps (+138 more)

Uday Chilakala

Screened

Mid-level Machine Learning Engineer specializing in NLP, computer vision, and RAG systems

Atlanta, GA | 5y exp

Morgan Stanley | Kennesaw State University

“Machine learning/NLP engineer who built a production-oriented retrieval-based AI system at Morgan Stanley for healthcare use cases, combining RAG over unstructured patient records with deep-learning medical image segmentation (U-Net/Mask R-CNN). Strong in end-to-end pipelines and MLOps (Spark/MongoDB, AWS SageMaker, CI/CD, monitoring, automated retraining) and in entity resolution/data quality validation for noisy clinical data.”

Python, SQL, Flask, Apache Spark, gRPC, TensorFlow (+125 more)

Allan Farinas

Screened

Senior Full-Stack Software Engineer specializing in Python and AWS

West Covina, CA | 11y exp

CareRev | Cal Poly Pomona

“Backend/data engineer who has built production Python microservices (FastAPI) and AWS-native platforms for event ingestion and analytics, combining ECS/Fargate + Lambda with CloudFormation-driven environments and strong secrets/IAM practices. Experienced modernizing legacy logic with parallel-run parity validation and safe phased cutovers, and has demonstrated measurable SQL tuning wins (20–30s down to 1–2s) plus incident ownership in Glue/Step Functions ETL pipelines.”

Python, JavaScript, SQL, AWS, AWS Lambda, Amazon API Gateway (+193 more)

Sahithi M

Screened

Mid-level GenAI/ML Engineer specializing in LLM applications and enterprise automation

5y exp

UnitedHealth Group | Rivier University

“Built and shipped a production LLM-powered healthcare support agent at UnitedHealth Group, using LangChain + FAISS RAG on AWS SageMaker with CloudWatch monitoring and human-in-the-loop fallbacks for safety. Strong focus on reliability engineering (confidence gating, retries/timeouts, caching) and continuous evaluation loops; reported ~40% improvement in query resolution efficiency while reducing manual support workload.”

A/B Testing, Anomaly Detection, Apache Spark, Automation, AWS, AWS Lambda (+115 more)

Sai Gowtham Madaka

Screened

Mid-level Data Engineer specializing in streaming and cloud data platforms for financial services

Edison, NJ | 3y exp

Morgan Stanley | Pace University

“Data engineering-focused candidate (internship/project experience) who built end-to-end pipelines processing a few million transactional records/day for fraud detection and reporting, using Airflow, Python/SQL, and PySpark with strong emphasis on data quality gates, idempotency, and monitoring. Also implemented an external web/API data collection system with anti-bot tactics and schema-change quarantine, and shipped a versioned Flask API to serve curated warehouse data.”

Apache Airflow, Apache Hadoop, Apache Hive, Apache Kafka, Apache Spark, AWS (+82 more)

Ayyappan Manikandan

Screened

Senior Software Engineer specializing in backend microservices and distributed systems

United States | 7y exp

Walmart | Cleveland State University

“Senior software engineer (5+ years) from Walmart Global Tech who owned and operated high-scale supplier inventory submission systems, including a microservice handling submissions up to 500k items and a data platform processing ~10TB/day. Strong in AWS/Kubernetes (EKS), Kafka/Spark streaming + batch pipelines, and production operations (on-call, metrics/alerting), with demonstrated performance wins (30% faster responses, 50% faster processing).”

Java, Spring Boot, Spring Framework, Microservices, Distributed Systems, REST APIs (+97 more)

Vineeth Reddy Vallapureddy

Screened

Mid-level Full-Stack Software Engineer specializing in backend microservices and enterprise AI tools

Redwood City, California | 5y exp

C3 AI | University at Buffalo

“Backend/platform engineer with experience across C3.ai (supply chain demand planning) and Amdocs (telecom), working on large-scale data systems and microservices. Has driven first-time adoption experiments of Snowflake + Spark to handle billion-record workloads, built Jenkins-to-Kubernetes delivery pipelines with Nexus artifact management, and implemented Kafka streaming between microservices with HA and retry/error-handling patterns.”

AWS, Backend Development, C, C++, CI/CD, Debugging (+117 more)

Pooja Dokuri

Screened

Mid-level AI/ML Engineer specializing in GenAI, RAG pipelines, and cloud MLOps

Remote, USA | 4y exp

UnitedHealth Group | East Texas A&M University

“Built and deployed a production LLM + vector search clinical decision support system at UnitedHealth Group, retrieving medical evidence and patient context in real time for prior authorization and risk scoring. Strong in end-to-end RAG architecture (Hugging Face embeddings, Pinecone/FAISS, SageMaker, Redis) plus orchestration (Airflow/Kubeflow) and rigorous evaluation/monitoring, with demonstrated ability to align solutions with clinical operations stakeholders.”

Python, Pandas, NumPy, PySpark, Scikit-learn, SQL (+133 more)

Prasanna Chelliboyina

Screened

Mid-level Machine Learning Engineer specializing in forecasting, NLP, and GenAI

United States | 6y exp

Walgreens | Syracuse University

“GenAI/ML engineer with production experience building multilingual LLM systems (English/Spanish) and RAG-based clinical documentation summarization at Walgreens, combining prompt engineering, structured output validation, and rigorous evaluation (ROUGE + pharmacist review). Also orchestrated end-to-end ML pipelines for demand forecasting using Apache Airflow, PySpark, and MLflow with scheduled retraining and production monitoring.”

A/B Testing, Agile, Anomaly Detection, Apache Spark, AWS, Azure Machine Learning (+114 more)

Sai Charan Kolla

Screened

Mid-level AI/ML Engineer specializing in LLMs, RAG, and MLOps on AWS

TX, USA | 5y exp

BlackRock | Texas A&M University-Kingsville

“LLM engineer who built a production document intelligence/RAG pipeline to extract structured data from thousands of unstructured PDFs, cutting manual review time by 60%. Experienced with LangChain and Airflow orchestration plus rigorous evaluation (labeled datasets, prompt testing, HITL review, monitoring) to improve accuracy and reduce hallucinations while partnering closely with non-technical operations stakeholders.”

Python, SQL, R, Java, C++, Machine Learning (+99 more)

Samatha Amsala

Screened

Mid-level Data Engineer specializing in cloud data warehousing and analytics

Omaha, NE | 6y exp

American Express | Bellevue University

“Data engineer at American Express who owned end-to-end pipelines for transaction and customer data used in finance reporting and risk analytics, processing ~5–8M records/day. Built Airflow-orchestrated ingestion (including external APIs/web sources) with strong data quality controls, monitoring/alerts, and resilient backfill/retry patterns, and also shipped a versioned REST API serving aggregated metrics to analytics teams.”

Data Engineering, Data Warehousing, Analytics, Fraud Detection, ETL, Data Validation (+167 more)

Ganesh Bandi

Screened

Mid-level AI Engineer specializing in LLMs, RAG, and MLOps

USA | 6y exp

Capital One | University of North Texas

“LLM engineer who has deployed production RAG systems for regulated document QA (PDFs/knowledge bases), emphasizing grounded answers with citations, RBAC, monitoring, and continuous feedback. Demonstrates deep practical expertise in retrieval quality (semantic chunking, hybrid BM25+embeddings, re-ranking), reliability (guardrails, deterministic workflows), and measurable evaluation (golden sets, log replay, A/B tests) while partnering closely with compliance/operations stakeholders.”

A/B Testing, Agile, Amazon EKS, Amazon S3, Anomaly Detection, Apache Spark (+128 more)

John Hoffman

Screened

Senior Data Engineer specializing in Databricks, Spark, and AWS for government healthcare data systems

Windsor Mill, MD | 12y exp

GDIT | University of Virginia

“Python/AWS engineer focused on batch-processing and data workflows, including building reusable S3/boto3 utilities with reliability features and IAM-based auth. Has led low-risk legacy modernizations using parity testing plus a month of parallel production runs, and has owned production issues end-to-end (including fixing a client-side Excel macro) while contributing to significant AWS cost reductions (~$10k/month).”

Python, SQL, Bash, Databricks, Apache Spark, PySpark (+66 more)

Dhyey Desai

Screened

Intern AI/ML Engineer specializing in RAG, multimodal AI, and LLM systems

Los Angeles, California | 0y exp

Nala | USC

“Built and shipped 'PetPulse,' a production AI pet-health note system that records voice notes, transcribes them, converts transcripts into structured symptom/event data, and supports grounded Q&A over a user’s notes and vet PDFs. Demonstrates full-stack LLM product execution (FastAPI + GPT-4 + Firebase), with concrete reliability/performance work (async endpoints, caching, RAG/embeddings, function calling) and user-centered iteration with a non-technical product stakeholder.”

AI Agents, Apache Hadoop, BERT, C, Caching, Data Visualization (+87 more)

SaiTeasmitha Kaja

Screened

Mid-level Full-Stack Software Engineer specializing in Java/Spring Boot and cloud microservices

Houston, TX | 4y exp

HPE | University of Houston

“Backend-focused Python/Flask engineer who has built authentication/profile services with clean modular architecture (blueprints + service layer) and tuned SQLAlchemy/Postgres for scale using indexing, query rewrites, and pagination. Has production-style integration experience for AI/ML via TensorFlow Serving and OpenAI APIs (batching, rate limiting, caching), plus multi-tenant data isolation and high-throughput background processing with Celery/Redis and idempotent jobs.”

Agile, Angular, API Gateway, Argo CD, Audit Logging, AWS (+168 more)

Divyam Agrawal

Screened

Mid-level Machine Learning Engineer specializing in LLMs and NLP classification systems

Seattle, WA | 4y exp

Affinity Solutions | University of Washington

“Internship experience building a production RAG+LLM pipeline to map messy card transaction descriptions to merchant brands, including a custom modified-ROUGE evaluation approach for weak/variant ground truth. Improved scalability and cost by moving from a managed LLM endpoint (e.g., Bedrock) to self-hosted vLLM, and orchestrated massive embedding backfills (5,000+ files, 10B+ rows) using an Airflow-triggered SQS + ECS worker architecture with robust retry/DLQ handling.”

A/B Testing, API Design, AWS, AWS CloudFormation, AWS Lambda, Auto-scaling (+110 more)

Durga Mahesh Boppani

Screened

Mid-level Software Engineer specializing in distributed cloud-native backend systems

Gainesville, FL | 4y exp

Silicon Assurance | University of Florida

“Backend/AI workflow engineer who built production-grade orchestration systems for hardware security verification at Silicon Assurance (Nextflow/Python/Postgres) and a multi-agent LLM-driven regulatory code checking system at the University of Florida. Emphasizes reliability: strict plan/execute/verify boundaries, queue-based isolation, and strong observability/auditability with Prometheus/Grafana and persisted prompts/tool calls.”

Python, Java, C, C++, JavaScript, SQL (+130 more)

Anirban Ghosh

Screened

Mid-level Machine Learning Engineer specializing in data science and cloud systems

Seattle, WA | 4y exp

Amazon | Stony Brook University

“ML engineer who independently pitched and built a recommendation engine at Danske Bank in a legacy fintech environment, creating compliant data pipelines and deployment infrastructure from scratch and delivering a 62% engagement lift with 70%+ advisor adoption. Also worked at AWS on classification and GenAI-powered reporting systems, with strengths spanning production ML, platform setup, monitoring, and research-to-production optimization.”

Machine Learning, Data Science, Business Intelligence, Backend Development, Cloud Computing, Python (+140 more)

Jing Yang

Screened

Senior Machine Learning Engineer specializing in NLP and generative AI

McLean, VA | 8y exp

Capital One | University of Utah

“ML/AI engineer focused on production NLP and voice AI systems in the restaurant tech space, with hands-on work spanning ASR, intent classification, LLM fine-tuning, and deployment monitoring at Presto AI. They highlight a 15% improvement in full-AI ordering rate and also built a restaurant sentiment analysis product at Wisely that they say became a standout feature in a $10M acquisition context.”

Deep Learning, TensorFlow, PyTorch, AWS, Amazon SageMaker, OpenAI (+107 more)

Chaitanya Prasad Reddy Narala

Screened

Mid-level AI/ML Engineer specializing in FinTech risk and fraud systems

USA | 4y exp

ServiceNow | Saint Louis University

“Senior AI/ML engineer focused on production LLM systems, combining RAG, fine-tuning, distributed training, and AI safety to ship scalable real-time moderation and conversational AI platforms. Stands out for pairing deep AWS/Kubernetes MLOps expertise with measurable impact: 40% lower latency/cost, 30-50% fewer hallucinations, and major reliability gains through observability and automation.”

Python, Java, SQL, R, Scikit-learn, XGBoost (+139 more)
