Vetted Apache Spark Professionals

Pre-screened and vetted.

RK

Principal Software Engineer specializing in AI/ML and cloud-native backend systems

New York, NY | 16y exp
McKinsey & Company | NJIT

McKinsey data/ML practitioner who led production deployment of an entity resolution + semantic search platform for unstructured finance and healthcare data, integrating with legacy systems under HIPAA constraints. Deep hands-on stack across transformers (spaCy/HF BERT), embeddings + FAISS, and production MLOps/workflow tooling (Airflow, Docker, CI/CD, Prometheus/Grafana), with reported gains of +30% decision speed and +25% search relevance.

SR

Senior Data Scientist specializing in machine learning and customer analytics

Illinois, USA | 7y exp
Northern Trust | Bradley University

Data/ML practitioner with experience applying NLP and classical ML to large-scale customer data (2B+ records) for segmentation, prediction, and survey-text classification, delivering measurable business impact (~18% engagement efficiency). Has hands-on entity resolution across multi-source datasets and has built embedding-based semantic search using SentenceBERT + a vector database with domain fine-tuning (~20% relevance improvement), plus production workflow experience with Spark/Airflow and cloud tooling (AWS/Azure).

GJ

Mid-level Machine Learning Engineer specializing in MLOps, NLP, and Computer Vision

USA | 5y exp
Walmart | University of New Haven

ML/AI engineer with production experience across retail and healthcare: built a real-time computer-vision shelf monitoring system at Walmart and optimized edge inference latency by ~30% using TensorRT/ONNX and pruning. Also partnered with CVS Health clinical/pharmacy teams to deliver a medication-adherence predictive model, using Streamlit explainability dashboards and achieving an 18% adherence improvement.

Yufan Wei

Screened

Intern AI Engineer specializing in LLM agents, RAG, and applied biostatistics

Beijing, China | 0y exp
Siemens | Emory University

Siemens AI engineer who shipped production multi-agent LLM systems across cybersecurity and sustainability, including a vulnerability automation agent that cut manual work 70%. Deep in orchestration (LangGraph supervisor-worker state machines), reliability engineering (async fault tolerance, retries, spike handling), and rigorous evaluation (offline benchmarks, LLM-as-a-Judge improving label agreement 28.9%) with measurable production guardrails.

Rahul Hatkar

Screened

Mid-level AI/ML Engineer specializing in LLMs, RAG pipelines, and MLOps

San Francisco, CA | 6y exp
Scale AI | Webster University

AI/ML engineer who has shipped production AI systems end-to-end, including an automated multi-channel (Gmail/WhatsApp/voice) candidate interviewing workflow and an enterprise RAG knowledge search platform. Demonstrates strong production rigor (monitoring, A/B tests, guardrails, schema validation, shadow testing) with quantified impact: ~60–70% reduction in interview evaluation time and ~20–30% relevance gains in RAG retrieval.

DM

Mid-level Generative AI Engineer specializing in decision intelligence and RAG for regulated enterprises

5y exp
JPMorgan Chase | Saint Louis University

Healthcare GenAI engineer who built a HIPAA-compliant, auditable RAG-based claims decision support system at Molina Healthcare, processing 3M claims and delivering major impact (48% faster manual reviews, 43% higher decision accuracy). Deep hands-on experience with LangChain orchestration, vector search (ChromaDB/FAISS), embedding fine-tuning, and safety controls (confidence scoring, rule validation, human-in-the-loop escalation) for clinical workflows.

DK

Senior Data Engineer specializing in Azure Lakehouse, Databricks/Spark, and Snowflake

Richardson, TX | 6y exp
PwC | University of Central Missouri

Data engineer/platform builder with experience across PwC and Liberty Mutual delivering high-volume, production-grade pipelines and real-time data services. Has owned end-to-end streaming + batch architectures on AWS and Azure, including web scraping systems, with quantified reliability gains (99.9% availability, 90%+ error reduction, 30% latency reduction) and strong observability/CI-CD practices.

AP

Mid-level Machine Learning Engineer specializing in fraud detection and LLM applications

Charlotte, NC | 5y exp
Bank of America | University of North Carolina at Charlotte

Unreal Engine UI engineer focused on scalable, production-ready UI architecture (C++/Slate/UMG/CommonUI) with strong designer enablement via decoupled, interface-driven patterns and MVVM. Demonstrated measurable performance wins: replaced 200+ per-frame Blueprint bindings to cut UI prepass/paint from 4.2ms to 0.5ms and reduced VRAM by ~120MB using texture streaming proxies.

Palash Gharde

Screened

Mid-level Software Development Engineer specializing in backend, data engineering, and ML systems

Arizona, USA | 5y exp
ServiceNow | Arizona State University

ML/Backend engineer with ServiceNow experience building production-grade inference services on FastAPI with Docker/Kubernetes (autoscaling, health checks) and strong reliability practices (monitoring, retries/timeouts, fallbacks). Delivered measurable improvements including 30% lower API latency and 18% higher model accuracy, and built A/B testing plus drift-triggered retraining loops to keep models stable in production.

Srinivas Matta

Mid-Level Full-Stack Software Developer specializing in cloud-native web platforms

Paducah, KY | 4y exp
Intuit | Southeast Missouri State University

Software engineer at Capital One who owned and shipped AI-driven personalization and internal insights dashboards end-to-end, emphasizing fast iteration with feature flags and tight user feedback loops. Built a TypeScript/React + Spring Boot/Python document automation platform with compute-heavy NLP microservices, async workflows, and production-scale reliability/performance practices (Kafka/RabbitMQ-style queues, Redis caching, tracing).

Junhui Huang

Screened

Intern Machine Learning Engineer specializing in LLMs, MLOps, and NLP

Providence, RI | 1y exp
Harvard University | Brown University

Built and deployed a production LLM-driven Dungeons & Dragons game where the model acts as a dungeon master, adding a structured combat system and a macro-state tree to ensure campaigns converge to a clear ending. Fine-tuned Gemini 2.5 Flash on Vertex AI and deployed on GCP with Kubernetes, using RAG over DnD rules/spells plus multi-agent orchestration (intent-based routing between narrative and combat agents) to reduce hallucinations and improve reliability.

Vaibhav Sharma

Mid-level Software Engineer specializing in AI/ML and data platforms

Remote, USA | 5y exp
Google | Indiana University Bloomington

AI/ML engineer who built a production agentic system to automate computational research experiments (simulation execution, parameter exploration, and numerical analysis) and mitigated context-window failures using constrained tool-calling/prompt-chaining patterns in LangChain with OpenAI tool-enabled models. Also has adtech/big-data pipeline experience at InMobi, orchestrating Spark jobs in Airflow to filter bot-like user IDs and publish clean IDs to an online NoSQL store for live serving, plus Apache open-source collaboration experience.

Prasannakumar B Vardi

Senior Software Engineer specializing in low-latency ad targeting and distributed backend systems

Santa Clara, CA | 9y exp
Cardlytics | Stony Brook University

Backend/platform engineer who built a high-scale audience segmentation and real-time targeting system using Spark/Glue + S3/Hudi and low-latency API services backed by Redis/relational stores. Demonstrates strong production rigor: Spark performance tuning to eliminate OOM failures, API idempotency/caching to cut p95 latency ~40%, and careful dual-run/feature-flag migrations with reconciliation and rollback runbooks. Experienced implementing layered security with JWT/OAuth, RBAC/ABAC, and database row-level security to prevent privilege escalation.

Kanaka Chalam Volety

Staff DevOps/SRE Engineer specializing in AWS, Kubernetes, and GitOps

San Jose, CA | 24y exp
Zoom | Thompson Rivers University

Infrastructure-focused engineer with Vonage experience modernizing early-stage cloud architecture (Terraform modularization, blue-green deployments, containerization, and zero-downtime database migration planning to Aurora). Also built a local end-to-end side project, Vastu AI, combining a custom-trained YOLO model (Roboflow-labeled data) with a locally hosted LLM via Ollama to generate a vastu compliance report from floor-plan images.

Pavan Kalyan Padala

Mid-level Data Scientist specializing in predictive and generative AI

Daytona Beach, Florida | 4y exp
2725 Hospitality LLC | Yeshiva University

AI/ML engineer with production LLM experience in regulated financial services (J.P. Morgan Chase), building a customer response engine to automate first-contact resolution while addressing privacy, bias, compliance, and scale. Strong MLOps/orchestration background (Airflow, Docker/Kubernetes, AWS Step Functions, Azure ML/SageMaker) plus proven ability to integrate with legacy systems and drive stakeholder adoption through dashboards, auditability, and training.

Harshavardhan Reddy

Mid-level AI/ML Data Scientist specializing in NLP, computer vision, and risk analytics

Albany, NY | 5y exp
Capital One | Pace University

ML/AI engineer with Capital One experience building production-grade customer segmentation and fraud detection systems combining NLP (transformers) and anomaly detection. Strong MLOps and orchestration background (PySpark ETL, MLflow, Airflow, Docker/Kubernetes, Azure ML) with real-time monitoring/alerting and performance optimizations like quantization and caching, plus proven ability to deliver business-facing insights through Power BI/Tableau for marketing stakeholders.

Akshit Modi

Screened

Mid-level AI/ML Engineer specializing in healthcare NLP and MLOps

Remote, USA | 5y exp
Tempus | Arizona State University

Healthcare/clinical ML practitioner who built and productionized ClinicalBERT-based pipelines to extract and standardize oncology EHR data, improving downstream model F1 from 0.81 to 0.92 while controlling training cost via LoRA/QLoRA. Experienced orchestrating real-time AWS ETL/ML workflows (Glue, Lambda, SageMaker) and partnering with clinicians using SHAP-based interpretability, contributing to an 18% reduction in readmissions and full adoption.

Aditya Jaiswal

Intern Software Engineer specializing in cloud, DevOps, and applied AI

Carlsbad, CA | 1y exp
Viasat | USC

Full-stack engineer with startup ownership experience (Aiir) building 15+ TypeScript/Go microservice APIs on GCP Cloud Run with Kafka-based async event streaming and React CRM integrations for billing/analytics. Strong post-launch operator who tuned Oracle performance (partitioning/indexing/query optimization) and validated a 23% retrieval-time reduction via AWR, and has a quality/DevSecOps mindset (94% Pytest coverage, GitHub Actions, SonarQube, Twistlock, CloudWatch) including migrating 18+ production CI/CD pipelines.

Utkarsh Mittal

Intern Data Scientist specializing in computer vision and LLM agents

Sunnyvale, CA | 0y exp
Covalent Metrology | NYU

Software engineering candidate with hands-on experience building and shipping LLM agents: created a production AI enrichment/coding agent at Covalent Metrology using Apollo.io + OpenAI, and built a Mistral hackathon router that dynamically selects among models to reduce token cost while maintaining quality. Also developed a real-time financial margin analysis agent that emails actionable insights and iterated on reliability issues (e.g., fixing misrouted emails, improving news relevance filtering).

Bhavyasree Chinthala

Mid-level Data Engineer specializing in cloud data pipelines and real-time streaming

USA | 5y exp
PNC | Saint Peter's University

Data engineer with PNC Bank experience owning high-volume financial transaction pipelines end-to-end (Kafka/REST ingestion through Spark/Glue transformations to Redshift serving) for risk and fraud analytics. Built strong reliability and data quality practices (Great Expectations, reconciliation, Airflow alerting, idempotent retries, incremental/windowed processing), reporting 40% ingestion efficiency gains and ~99.9% data accuracy.

Suloni Praveen

Entry-Level Software Engineer specializing in data engineering and ML systems

Los Angeles, CA | 0y exp
Easley-Dunn Productions | USC

Built an end-to-end Next.js/TypeScript LLM-based scientific PDF analyzer using local Ollama/Llama inference to prioritize privacy and cost, producing structured research artifacts (e.g., authors/methods/findings) with ~92% extraction accuracy. At Qualtrics, helped replace a batch pipeline with a real-time, low-latency ML inference service (Python/Go on Kubernetes) using Redis caching, Grafana-based observability, and graceful fallbacks to protect UX during failures.

Kristina Shen

Screened

Intern-level Data Scientist and ML Engineer specializing in analytics and AI systems

Long Island City, NY | 1y exp
DataLynn | University of Chicago

Early-career analytics candidate with hands-on experience in SQL/Python data pipelines, Tableau reporting, and marketing engagement analytics across internship and startup settings. Stands out for combining rigorous data quality practices with practical AI system design, including an end-to-end GPT-4 grading capstone that emphasized explainability and human oversight.

Yinghai Yu

Screened

Mid-level Data Engineer specializing in cloud data platforms and AI/ML pipelines

San Mateo, CA | 6y exp
Bubbles and Books | Georgia Tech

Data-engineering-oriented candidate with hands-on experience building an agentic AI product and operational automation workflows. They described automating inventory-to-ERP discrepancy reconciliation with anomaly detection and daily reporting, and also have practical scraping/automation experience dealing with Cloudflare-protected sites using Selenium and Puppeteer.

Hao Liang

Screened

Mid-level Data Scientist specializing in GenAI, customer insights, and forecasting

Durham, NC | 5y exp
BASF | University of North Carolina at Chapel Hill

ML/AI practitioner with hands-on experience shipping production time-series forecasting and RAG-based customer insights platforms in an enterprise setting. At BASF, he improved seed sales forecasting beyond naive baselines using model selection tailored by brand size, and he also led a RAG solution over Salesforce reports, complaints, and surveys that reached 2,000+ users with strong daily engagement.
