Vetted Apache Spark Professionals

Pre-screened and vetted.

Principal Data Scientist specializing in LLMs, RAG, and enterprise AI products

Winchester, TN | 9y exp
SambaNova | Sewanee: The University of the South

Mid-level Data Engineer specializing in GCP, Spark, and healthcare analytics

New York, NY | 3y exp
CVS Health | Columbia University

Senior AI/ML Engineer specializing in LLMs and enterprise conversational AI

Northbrook, IL | 16y exp
CVS Health | University of Illinois Chicago

Mid-level AI/ML Engineer specializing in NLP, LLMs, and MLOps

USA | 4y exp
Scale AI | University of Texas at Arlington

Senior Software Engineer specializing in backend systems and data engineering

Remote, USA | 10y exp
General Dynamics Information Technology | University of Texas at Austin

Senior Software Engineer specializing in backend data platforms for FinTech

Irving, TX | 9y exp
Cottonwood Financial | University of Texas at Austin

Senior Full-Stack Engineer specializing in backend, cloud, and AI systems

New York, NY | 8y exp
Istream Solution

Anurag Patil

Screened

Mid-level Data Analyst specializing in machine learning, ETL, and real-world evidence analytics

California, USA | 6y exp
AbbVie | UC Irvine

Developed and productionized an AI-driven "indication finding" system for AbbVie to identify additional diseases a drug could target, working closely with clinical research teams on cohort inclusion/exclusion criteria and disease rollups. Leveraged an LLM to map clinical inputs to ICD codes and built configuration-driven ML pipelines (Cloudera ML, YAML, scheduled jobs) with structured testing and evaluation for reliability.

Mid-Level AI/ML Software Engineer specializing in agentic LLM systems

Dallas, Texas | 6y exp
Datatron | University of West Florida

Built and deployed a production LLM-powered multi-agent compliance copilot (life sciences/finance) using LangChain/LangGraph + RAG over vector databases, delivered via async FastAPI on Kubernetes. Emphasizes audit-ready, deterministic outputs with schema constraints and citations, plus rigorous evaluation/monitoring; reports 60%+ reduction in manual research time and successful production adoption.

Jingyi Tian

Screened

Junior Machine Learning Engineer specializing in MLOps and LLM/RAG systems

Houston, TX | 2y exp
Daxwell, LLC | Columbia University

LLM/agentic workflow builder focused on productionizing document-processing systems. Redesigned pipelines with LangGraph + RAG, schema-aware validation, and eval/monitoring loops; known for fast incident diagnosis (restored accuracy from ~70% to >95% same day). Partners closely with sales and stakeholders to deliver tailored demos and drive adoption (reported +40%).

Shravya M

Screened

Senior AI/ML Engineer specializing in NLP, LLMs, and MLOps

Texas, USA | 6y exp
CVS Health | University of North Texas

LLM/agent workflow engineer with healthcare experience (CVS Health) who built and deployed a production call-insights platform using Azure OpenAI + LangChain/LangGraph, including sentiment and compliance checks. Demonstrates deep HIPAA/PHI handling (tenant-contained processing, redaction, RBAC/encryption/audit logging) and production rigor (testing, eval sets, validation/retries, autoscaling) to scale to thousands of transcripts.

Mid-level Data Scientist / AI-ML Engineer specializing in Generative AI and LLM applications

Dallas, TX | 5y exp
Baylor Scott & White | University of North Texas

Built a production GenAI-powered analytics assistant to reduce reliance on data analysts by enabling natural-language Q&A over Databricks/Power BI dashboards, backed by vector search (Pinecone/Milvus) and a Neo4j knowledge graph, including multimodal support via OpenAI Vision. Demonstrates strong real-world LLM reliability engineering with strict RAG, LangGraph multi-step verification, and Guardrails/custom validators, plus broad orchestration and production monitoring experience (Airflow, ADF, Step Functions, Kubernetes, Prometheus/CloudWatch).

Mid-level AI/ML Engineer specializing in deep learning, NLP/LLMs, and MLOps

MA, USA | 6y exp
Flatiron Health | Clark University

Built and shipped a real-time oncology risk prediction system used by doctors during patient visits, trained on clinical data in AWS SageMaker and deployed via FastAPI with sub-second responses. Emphasizes clinician-trust features (SHAP explainability, validation checks) and HIPAA-compliant controls (encryption, RBAC, audit logging), plus Kubernetes-based production operations with autoscaling, monitoring, and drift/retraining workflows; collaborated closely with oncologists at Flatiron Health.

Bharath Kumar

Screened

Director-level AI & Data Science leader specializing in GenAI, LLMs, and MLOps

Draper, UT | 12y exp
Thorne | Bharathiar University

ML/NLP engineer currently working in NYC on a system that connects complex unstructured data sources to deliver personalized insights, using embeddings + vector DB retrieval and a RAG architecture (LangChain, Pinecone/OpenSearch). Strong focus on production constraints—especially low-latency retrieval—using FAISS/ANN, PCA, index partitioning, and Redis caching, plus PEFT fine-tuning (LoRA/QLoRA) and KPI/SLA-driven promotion to production.

Silpa Bhavani

Screened

Mid-level Full-Stack Java Developer specializing in cloud-native microservices

Oakland, CA | 5y exp
Block | Lamar University

Software engineer with strong compliance-domain experience who built a customer-facing compliance and reporting dashboard using React/TypeScript with Spring Boot microservices. Demonstrates mature production engineering practices—contract-first APIs, event-driven architecture (Kafka/RabbitMQ), caching (Redis), and robust CI/CD + observability (Prometheus/Grafana/ELK)—and also created a Python-based audit automation tool adopted into the standard release process.

Mid-level AI/ML Engineer specializing in Generative AI, RAG, and Conversational AI

3y exp
Aetna | Indiana Tech

Built a production RAG-based GenAI copilot backend at Aetna using Python/FastAPI, GPT-4, LangChain, and Azure AI Search, deployed on AKS with Prometheus/Grafana observability. Owned the system end-to-end (ingestion through deployment) and improved peak-time reliability by addressing vector search and embedding bottlenecks with Redis caching, index optimization, and async processing, plus added anti-hallucination guardrails via retrieval confidence thresholds.

Navya Panyala

Screened

Senior Software Engineer specializing in identity, cloud-native microservices, and reactive web apps

Bentonville, AR | 6y exp
Walmart | University at Albany

Product-focused full-stack engineer with Walmart and Dell experience who built and shipped a real-time engagement dashboard end-to-end (Kafka Streams, Spring Boot, React/TypeScript/D3) used daily by business teams, moving them from next-day reports to real-time decisioning. Strong in performance/reliability (Redis caching cut latency ~40%, 90%+ test coverage, Prometheus/CloudWatch monitoring) and production operations on AWS/EKS including handling a cascading failure from a memory leak with zero-downtime rollback and redeploy.

Mid-level GenAI Engineer specializing in production RAG and LLM fine-tuning

San Jose, California | 5y exp
eBay | Texas Tech University

LLM engineer who built a production seller-support RAG system at eBay using hybrid retrieval (BM25 + Pinecone vectors) with Cohere reranking, LangGraph orchestration, and citation-grounded answers. Strong focus on reliability: semantic/structure-aware chunking, automated Ragas-based evaluation with nightly regressions, and production observability (LangSmith) plus drift monitoring (Arize). Also implemented a multi-agent fraud pipeline with AutoGen using JSON-schema contracts and explicit termination conditions.

Mid-level AI/ML Engineer specializing in LLMs, RAG, and MLOps on AWS

TX, USA | 5y exp
BlackRock | Texas A&M University-Kingsville

AI engineer who built a production RAG-based internal analyst tool at BlackRock, fine-tuning an LLM on proprietary financial data and adding four layers of guardrails (input/retrieval/generation/output) to improve grounding and reduce hallucinations. Implemented a LangChain-based multi-agent orchestration (7 major agents) deployed on AWS ECS, with reliability measured via internal human evaluation, LLM-as-judge, and RLHF/drift monitoring.

Yupeng Tang

Screened

Junior Machine Learning Engineer specializing in LLM systems and GPU inference

Atlanta, GA | 1y exp
GMI Cloud | Georgia Tech

LLM/agent engineer who shipped a production RAG-based recommendation + explanation system that replaced a traditional recommender stack, delivering ~20% CTR lift (and +8% after a reliability iteration) with strong cold-start performance. Demonstrates strong production rigor: schema-constrained generation, typed tool calling, explicit state/orchestration, deep monitoring/feedback loops, and safe integration with messy ERP inventory/order data using normalization, idempotency, and conflict-resolution guardrails.

Intern Software Engineer specializing in edge AI deployment and distributed systems

San Francisco, CA | 1y exp
Zetic AI | San José State University

Full-stack engineer who built an enterprise search platform (Codlens) delivering natural-language Q&A over Jira/Slack using embeddings, vector DB search, re-ranking (RRF), and LLM responses with source grounding. Also designed and benchmarked a distributed IAM system with Postgres transaction-log replication and Raft-based quorum consistency, reporting ~253 TPS at ~60ms latency in a multi-node setup. Experience spans early-stage startups (Zetic AI, Sagwara Capital) and large-scale orgs (Akamai, Atlassian).

Mid-level Data Engineer specializing in cloud data pipelines and enterprise data platforms

4y exp
ConnectiveRx | University of Pennsylvania

Data engineer/backend engineer who owns large-scale, real-time event pipelines on AWS end-to-end, including a petabyte-scale CDC ingestion flow from multiple Postgres DBs into Redshift. Re-architected a legacy DynamoDB+S3 approach into a Delta Lake + DuckDB/PyArrow-compatible design, improving performance dramatically (e.g., ~600s to ~10s for 1k records) and increasing reliability at high file volumes.

Jincheng Pang

Screened

Principal Data Scientist specializing in healthcare analytics and medical imaging AI

Sudbury, MA | 11y exp
AccessHope | Tufts University

Developed an LLM-driven recommendation agent in Azure Databricks to triage oncology patients and trigger second-opinion case creation using medical claims and EHR data. Uses ICD-10/CPT/J-code features in prompts, embeddings + vector DB similarity, and a backtesting framework emphasizing recall to avoid missing clinically relevant cases while supporting business revenue.

Aashna Kunkolienker

Junior AI Engineer specializing in agentic workflows and ML platforms

San Ramon, CA | 2y exp
Searce | NYU

Building a production LLM/agent system for a leading US dental provider that extracts rules from payer handbooks/portals and EDI 271 responses to validate and improve patient cost estimates. Combines GCP stack (BigQuery, GKE, Cloud Run, Pub/Sub, Vertex AI) with strong agent reliability practices (observability, validator agents, grounding, PII/hallucination guardrails, confidence scoring) and has led non-technical customer stakeholders on enterprise ServiceNow↔Aha sync and AI-powered enterprise search/summarization.
