Vetted Apache Spark Professionals

Pre-screened and vetted.

Apache Spark, Python, Docker, SQL, AWS, CI/CD

Ramtin Khorrami

Screened

Principal Software Engineer specializing in AI/ML and cloud-native backend systems

New York, NY | 16y exp

McKinsey & Company | NJIT

“McKinsey data/ML practitioner who led production deployment of an entity resolution + semantic search platform for unstructured finance and healthcare data, integrating with legacy systems under HIPAA constraints. Deep hands-on stack across transformers (spaCy/HF BERT), embeddings + FAISS, and production MLOps/workflow tooling (Airflow, Docker, CI/CD, Prometheus/Grafana), with reported gains of +30% decision speed and +25% search relevance.”

Python, SQL, R, Ruby, Java, JavaScript (+124 more)

Sai Raja Ramya Bhavana Thota

Screened

Senior Data Scientist specializing in machine learning and customer analytics

Illinois, USA | 7y exp

Northern Trust | Bradley University

“Data/ML practitioner with experience applying NLP and classical ML to large-scale customer data (2B+ records) for segmentation, prediction, and survey-text classification, delivering measurable business impact (~18% engagement efficiency). Has hands-on entity resolution across multi-source datasets and has built embedding-based semantic search using SentenceBERT + a vector database with domain fine-tuning (~20% relevance improvement), plus production workflow experience with Spark/Airflow and cloud tooling (AWS/Azure).”

A/B Testing, Analytics, Azure Machine Learning, Bash, BigQuery, C (+195 more)

Guna Jaswanth Maduri

Screened

Mid-level Machine Learning Engineer specializing in MLOps, NLP, and Computer Vision

USA | 5y exp

Walmart | University of New Haven

“ML/AI engineer with production experience across retail and healthcare: built a real-time computer-vision shelf monitoring system at Walmart and optimized edge inference latency by ~30% using TensorRT/ONNX and pruning. Also partnered with CVS Health clinical/pharmacy teams to deliver a medication-adherence predictive model, using Streamlit explainability dashboards and achieving an 18% adherence improvement.”

Python, C++, SQL, Shell Scripting, TensorFlow, PyTorch (+102 more)

Yufan Wei

Screened

Intern AI Engineer specializing in LLM agents, RAG, and applied biostatistics

Beijing, China | 0y exp

Siemens | Emory University

“Siemens AI engineer who shipped production multi-agent LLM systems across cybersecurity and sustainability, including a vulnerability automation agent that cut manual work 70%. Deep in orchestration (LangGraph supervisor-worker state machines), reliability engineering (async fault tolerance, retries, spike handling), and rigorous evaluation (offline benchmarks, LLM-as-a-Judge improving label agreement 28.9%) with measurable production guardrails.”

Python, JavaScript, TypeScript, SQL, R, HTML (+70 more)

Rahul Hatkar

Screened

Mid-level AI/ML Engineer specializing in LLMs, RAG pipelines, and MLOps

San Francisco, CA | 6y exp

Scale AI | Webster University

“AI/ML engineer who has shipped production AI systems end-to-end, including an automated multi-channel (Gmail/WhatsApp/voice) candidate interviewing workflow and an enterprise RAG knowledge search platform. Demonstrates strong production rigor (monitoring, A/B tests, guardrails, schema validation, shadow testing) with quantified impact: ~60–70% reduction in interview evaluation time and ~20–30% relevance gains in RAG retrieval.”

A/B Testing, Agile, Anomaly Detection, Ansible, Apache Hadoop, Apache Spark (+167 more)

Deepthi Mundarinti

Screened

Mid-level Generative AI Engineer specializing in decision intelligence and RAG for regulated enterprises

5y exp

JPMorgan Chase | Saint Louis University

“Healthcare GenAI engineer who built a HIPAA-compliant, auditable RAG-based claims decision support system at Molina Healthcare, processing 3M claims and delivering major impact (48% faster manual reviews, 43% higher decision accuracy). Deep hands-on experience with LangChain orchestration, vector search (ChromaDB/FAISS), embedding fine-tuning, and safety controls (confidence scoring, rule validation, human-in-the-loop escalation) for clinical workflows.”

Generative AI, GPT-4, OpenAI API, Prompt Engineering, Retrieval-Augmented Generation (RAG), Machine Learning (+96 more)

Devender Kunta

Screened

Senior Data Engineer specializing in Azure Lakehouse, Databricks/Spark, and Snowflake

Richardson, TX | 6y exp

PwC | University of Central Missouri

“Data engineer/platform builder with experience across PwC and Liberty Mutual delivering high-volume, production-grade pipelines and real-time data services. Has owned end-to-end streaming + batch architectures on AWS and Azure, including web scraping systems, with quantified reliability gains (99.9% availability, 90%+ error reduction, 30% latency reduction) and strong observability/CI-CD practices.”

AWS, Databricks, Apache Spark, PySpark, Scala, Python (+109 more)

Avinash Pancheneni

Screened

Mid-level Machine Learning Engineer specializing in fraud detection and LLM applications

Charlotte, NC | 5y exp

Bank of America | University of North Carolina at Charlotte

“Unreal Engine UI engineer focused on scalable, production-ready UI architecture (C++/Slate/UMG/CommonUI) with strong designer enablement via decoupled, interface-driven patterns and MVVM. Demonstrated measurable performance wins: replaced 200+ per-frame Blueprint bindings to cut UI prepass/paint from 4.2ms to 0.5ms and reduced VRAM by ~120MB using texture streaming proxies.”

Machine Learning, Artificial Intelligence, Supervised Learning, Predictive Modeling, Fraud Detection, XGBoost (+119 more)

Palash Gharde

Screened

Mid-level Software Development Engineer specializing in backend, data engineering, and ML systems

Arizona, USA | 5y exp

ServiceNow | Arizona State University

“ML/Backend engineer with ServiceNow experience building production-grade inference services on FastAPI with Docker/Kubernetes (autoscaling, health checks) and strong reliability practices (monitoring, retries/timeouts, fallbacks). Delivered measurable improvements including 30% lower API latency and 18% higher model accuracy, and built A/B testing plus drift-triggered retraining loops to keep models stable in production.”

A/B Testing, Amazon CloudWatch, Apache Kafka, Apache Spark, Asynchronous Processing, Authentication (+92 more)

Srinivas Matta

Screened

Mid-Level Full-Stack Software Developer specializing in cloud-native web platforms

Paducah, KY | 4y exp

Intuit | Southeast Missouri State University

“Software engineer at Capital One who owned and shipped AI-driven personalization and internal insights dashboards end-to-end, emphasizing fast iteration with feature flags and tight user feedback loops. Built a TypeScript/React + Spring Boot/Python document automation platform with compute-heavy NLP microservices, async workflows, and production-scale reliability/performance practices (Kafka/RabbitMQ-style queues, Redis caching, tracing).”

Ajax, Apache Airflow, Apache Kafka, Apache Spark, AWS CloudFormation, AWS Lambda (+130 more)

Junhui Huang

Screened

Intern Machine Learning Engineer specializing in LLMs, MLOps, and NLP

Providence, RI | 1y exp

Harvard University | Brown University

“Built and deployed a production LLM-driven Dungeons & Dragons game where the model acts as a dungeon master, adding a structured combat system and a macro-state tree to ensure campaigns converge to a clear ending. Fine-tuned Gemini 2.5 Flash on Vertex AI and deployed on GCP with Kubernetes, using RAG over DnD rules/spells plus multi-agent orchestration (intent-based routing between narrative and combat agents) to reduce hallucinations and improve reliability.”

A/B Testing, Agile, Analytics, API Development, CI/CD, ChromaDB (+109 more)

Vaibhav Sharma

Screened

Mid-level Software Engineer specializing in AI/ML and data platforms

Remote, USA | 5y exp

Google | Indiana University Bloomington

“AI/ML engineer who built a production agentic system to automate computational research experiments (simulation execution, parameter exploration, and numerical analysis) and mitigated context-window failures using constrained tool-calling/prompt-chaining patterns in LangChain with OpenAI tool-enabled models. Also has adtech/big-data pipeline experience at InMobi, orchestrating Spark jobs in Airflow to filter bot-like user IDs and publish clean IDs to an online NoSQL store for live serving, plus Apache open-source collaboration experience.”

A/B Testing, Apache Airflow, Apache Hadoop, Apache Hive, Apache Kafka, Apache Spark (+100 more)

Prasannakumar B Vardi

Screened

Senior Software Engineer specializing in low-latency ad targeting and distributed backend systems

Santa Clara, CA | 9y exp

Cardlytics | Stony Brook University

“Backend/platform engineer who built a high-scale audience segmentation and real-time targeting system using Spark/Glue + S3/Hudi and low-latency API services backed by Redis/relational stores. Demonstrates strong production rigor: Spark performance tuning to eliminate OOM failures, API idempotency/caching to cut p95 latency ~40%, and careful dual-run/feature-flag migrations with reconciliation and rollback runbooks. Experienced implementing layered security with JWT/OAuth, RBAC/ABAC, and database row-level security to prevent privilege escalation.”

Java, Python, Go, .NET, C#, Scala (+114 more)

Kanaka Chalam Volety

Screened

Staff DevOps/SRE Engineer specializing in AWS, Kubernetes, and GitOps

San Jose, CA | 24y exp

Zoom | Thompson Rivers University

“Infrastructure-focused engineer with Vonage experience modernizing early-stage cloud architecture (Terraform modularization, blue-green deployments, containerization, and zero-downtime database migration planning to Aurora). Also built a local end-to-end side project, Vastu AI, combining a custom-trained YOLO model (Roboflow-labeled data) with a locally hosted LLM via Ollama to generate a vastu compliance report from floor-plan images.”

Agile, Amazon CloudWatch, Amazon EC2, Amazon EKS, Amazon RDS, Amazon S3 (+190 more)

Pavan Kalyan Padala

Screened

Mid-level Data Scientist specializing in predictive and generative AI

Daytona Beach, Florida | 4y exp

2725 Hospitality LLC | Yeshiva University

“AI/ML engineer with production LLM experience in regulated financial services (J.P. Morgan Chase), building a customer response engine to automate first-contact resolution while addressing privacy, bias, compliance, and scale. Strong MLOps/orchestration background (Airflow, Docker/Kubernetes, AWS Step Functions, Azure ML/SageMaker) plus proven ability to integrate with legacy systems and drive stakeholder adoption through dashboards, auditability, and training.”

Python, Pandas, NumPy, Scikit-learn, TensorFlow, PyTorch (+98 more)

Harshavardhan Reddy

Screened

Mid-level AI/ML Data Scientist specializing in NLP, computer vision, and risk analytics

Albany, NY | 5y exp

Capital One | Pace University

“ML/AI engineer with Capital One experience building production-grade customer segmentation and fraud detection systems combining NLP (transformers) and anomaly detection. Strong MLOps and orchestration background (PySpark ETL, MLflow, Airflow, Docker/Kubernetes, Azure ML) with real-time monitoring/alerting and performance optimizations like quantization and caching, plus proven ability to deliver business-facing insights through Power BI/Tableau for marketing stakeholders.”

Python, R, SQL, PySpark, Scala, Java (+105 more)

Akshit Modi

Screened

Mid-level AI/ML Engineer specializing in healthcare NLP and MLOps

Remote, USA | 5y exp

Tempus | Arizona State University

“Healthcare/clinical ML practitioner who built and productionized ClinicalBERT-based pipelines to extract and standardize oncology EHR data, improving downstream model F1 from 0.81 to 0.92 while controlling training cost via LoRA/QLoRA. Experienced orchestrating real-time AWS ETL/ML workflows (Glue, Lambda, SageMaker) and partnering with clinicians using SHAP-based interpretability, contributing to an 18% reduction in readmissions and full adoption.”

Python, SQL, C++, Java, NumPy, Pandas (+166 more)

Aditya Jaiswal

Screened

Intern Software Engineer specializing in cloud, DevOps, and applied AI

Carlsbad, CA | 1y exp

Viasat | USC

“Full-stack engineer with startup ownership experience (Aiir) building 15+ TypeScript/Go microservice APIs on GCP Cloud Run with Kafka-based async event streaming and React CRM integrations for billing/analytics. Strong post-launch operator who tuned Oracle performance (partitioning/indexing/query optimization) and validated a 23% retrieval-time reduction via AWR, and has a quality/DevSecOps mindset (94% Pytest coverage, GitHub Actions, SonarQube, Twistlock, CloudWatch) including migrating 18+ production CI/CD pipelines.”

A/B Testing, Apache Kafka, Apache Spark, Artificial Intelligence, AWS, AWS IAM (+125 more)

Utkarsh Mittal

Screened

Intern Data Scientist specializing in computer vision and LLM agents

Sunnyvale, CA | 0y exp

Covalent Metrology | NYU

“Software engineering candidate with hands-on experience building and shipping LLM agents: created a production AI enrichment/coding agent at Covalent Metrology using Apollo.io + OpenAI, and built a Mistral hackathon router that dynamically selects among models to reduce token cost while maintaining quality. Also developed a real-time financial margin analysis agent that emails actionable insights and iterated on reliability issues (e.g., fixing misrouted emails, improving news relevance filtering).”

Python, C++, C, HTML, CSS, JavaScript (+96 more)

Bhavyasree Chinthala

Screened

Mid-level Data Engineer specializing in cloud data pipelines and real-time streaming

USA, USA | 5y exp

PNC | Saint Peter's University

“Data engineer with PNC Bank experience owning high-volume financial transaction pipelines end-to-end (Kafka/REST ingestion through Spark/Glue transformations to Redshift serving) for risk and fraud analytics. Built strong reliability and data quality practices (Great Expectations, reconciliation, Airflow alerting, idempotent retries, incremental/windowed processing), reporting 40% ingestion efficiency gains and ~99.9% data accuracy.”

Python, SQL, Apache Spark, PySpark, Apache Kafka, Apache Airflow (+72 more)

Suloni Praveen

Screened

Entry-Level Software Engineer specializing in data engineering and ML systems

Los Angeles, CA | 0y exp

Easley-Dunn Productions | USC

“Built an end-to-end Next.js/TypeScript LLM-based scientific PDF analyzer using local Ollama/Llama inference to prioritize privacy and cost, producing structured research artifacts (e.g., authors/methods/findings) with ~92% extraction accuracy. At Qualtrics, helped replace a batch pipeline with a real-time, low-latency ML inference service (Python/Go on Kubernetes) using Redis caching, Grafana-based observability, and graceful fallbacks to protect UX during failures.”

Python, C, C++, C#, Go, Swift (+131 more)

Kristina Shen

Screened

Intern-level Data Scientist and ML Engineer specializing in analytics and AI systems

Long Island City, NY | 1y exp

DataLynn | University of Chicago

“Early-career analytics candidate with hands-on experience in SQL/Python data pipelines, Tableau reporting, and marketing engagement analytics across internship and startup settings. Stands out for combining rigorous data quality practices with practical AI system design, including an end-to-end GPT-4 grading capstone that emphasized explainability and human oversight.”

Data Engineering, Machine Learning, Artificial Intelligence, Data Science, Data Structures, PostgreSQL (+100 more)

Yinghai Yu

Screened

Mid-level Data Engineer specializing in cloud data platforms and AI/ML pipelines

San Mateo, CA | 6y exp

Bubbles and Books | Georgia Tech

“Data-engineering-oriented candidate with hands-on experience building an agentic AI product and operational automation workflows. They described automating inventory-to-ERP discrepancy reconciliation with anomaly detection and daily reporting, and also have practical scraping/automation experience dealing with Cloudflare-protected sites using Selenium and Puppeteer.”

Python, Pandas, NumPy, Scikit-learn, Scala, Java (+87 more)

Hao Liang

Screened

Mid-level Data Scientist specializing in GenAI, customer insights, and forecasting

Durham, NC | 5y exp

BASF | University of North Carolina at Chapel Hill

“ML/AI practitioner with hands-on experience shipping production time-series forecasting and RAG-based customer insights platforms in an enterprise setting. At BASF, he improved seed sales forecasting beyond naive baselines using model selection tailored by brand size, and he also led a RAG solution over Salesforce reports, complaints, and surveys that reached 2,000+ users with strong daily engagement.”

Python, SQL, R, JavaScript, Java, AWS (+92 more)
