Vetted Apache Spark Professionals

Pre-screened and vetted.

SK

Mid-level Full-Stack Software Engineer specializing in cloud-native and AI-driven applications

6y exp
Fidelity Investments
University of Texas at Dallas
NN

Mid-level Data Engineer specializing in real-time streaming and cloud data platforms

Green Bay, WI
5y exp
Stripe
New England College
VA

Senior Data Engineer specializing in cloud-scale pipelines and legal data utilities

Austin, TX
6y exp
IBM
University of North Texas
NK

Junior AI Engineer specializing in enterprise LLM and FinTech systems

New York, NY
4y exp
IBM
Cornell University
VC

Mid-level Software Engineer specializing in distributed systems and data platforms

San Francisco, CA
4y exp
Databricks
University of Central Florida
TM

Senior Data Engineer specializing in cloud data platforms and big data pipelines

Austin, TX
11y exp
Accenture
DV

Senior Software Engineer specializing in cloud backend systems and LLM-powered agents

Seattle, WA
5y exp
Amazon
San José State University

Amazon Fire TV Devices engineer who built and shipped a production LLM-powered lab triage and validation system that grounds recommendations in internal runbooks/known-issue data and pushes evidence-based actions via dashboards and Slack. Emphasizes safety and measurability with structured JSON outputs, replay-based evaluation on historical incidents, and production metrics (e.g., disagreement rate and time-to-first-action), plus cost/latency optimizations like caching, batching, and rule-based fast paths.

NK

Nandini Kosgi

Screened

Mid-level AI/ML Engineer specializing in LLMs, RAG, and fraud/risk analytics in Financial Services

PA, USA
4y exp
Capital One
Robert Morris University

Built and shipped a production-grade GenAI Fraud & Compliance Investigation Copilot for a large US bank, integrating OCR docs, structured data, and prior case history to generate grounded, regulator-friendly summaries and red-flag highlights. Demonstrates strong end-to-end LLM systems engineering (LangGraph/LangChain, hybrid retrieval with FAISS+BM25, guardrails/citations, streaming/latency optimization) plus rigorous evaluation and close partnership with compliance stakeholders.

VD

Mid-level Software Engineer specializing in AWS, full-stack development, and AI data systems

Seattle, Washington
3y exp
Amazon
Arizona State University

Backend engineer who built a Python-based data profiling/statistics platform processing up to 50M rows and ~300 metrics, using a DAG execution model, multithreading, and smart caching to cut processing time by up to 70%. Also improved PostgreSQL query performance from 12s to 2s via indexing/query rewrites, integrated an LLM (LangChain + OpenAI) for explainable “chat with the pipeline” functionality, and designed an AWS EC2+SQS architecture for scalable, isolated per-user processing.

LT

Mid-level Software Engineer specializing in ML platforms and cloud-native backend systems

San Francisco, CA
5y exp
City and County of San Francisco
San Francisco State University

Software engineer with experience at Google and the City and County of San Francisco building production AI systems, including a RAG-based internal support chatbot and ML-driven ticket priority tagging. Has scaled data/ML platforms with Airflow on GCP (1M+ records/day, 99.9% SLA) and deployed multi-component systems with Docker and Kubernetes (GKE), using modern LLM tooling (LangChain/CrewAI, Claude/OpenAI, Pinecone/ChromaDB, Bedrock/Ollama).

SK

Mid-level AI/ML Engineer specializing in healthcare NLP, real-time risk systems, and ML platforms

Massachusetts, USA
5y exp
Johnson & Johnson
Rivier University

LLM-focused customer-facing engineer who repeatedly takes document Q&A and agentic prototypes into secure, monitored production systems. Experienced in reducing hallucinations via RAG + guardrails, diagnosing retrieval/embedding issues in real time, and partnering with sales to run metrics-driven PoCs that overcome accuracy/security objections and drive adoption.

SK

Sahithi K

Screened

Mid-level Data Engineer specializing in cloud data platforms and streaming pipelines

Boston, MA
4y exp
Moderna
University of Massachusetts Dartmouth

Data engineer with experience at Moderna and Block owning high-volume (≈10TB/day) production pipelines on AWS, using Kafka/S3/Glue/dbt/Snowflake with strong data quality and observability practices (schema validation, anomaly detection, CloudWatch monitoring). Also built external financial API ingestion with Airflow retries, throttling/token rotation, and schema versioning, and helped stand up an early-stage biomedical data platform with CI/CD and incident debugging.

JA

Mid-level AI/ML Engineer specializing in NLP, RAG, and MLOps

McKinney, TX
6y exp
Globe Life
Texas A&M University

Built a production LLM/RAG-based “model excellence scoring” system at Uber to automatically evaluate hundreds of ML models, standardizing quality assessment and cutting evaluation time from days to minutes on GCP. Also delivered an NLP document classification solution for insurance claims at Globe Life, partnering closely with compliance/operations and raising routing accuracy from ~85% under manual routing to 93% with the model.

LM

Senior Data Engineer specializing in cloud ETL and real-time streaming pipelines

Austin, TX
5y exp
eBay
Texas Tech University

Data engineer with eBay experience owning end-to-end pipelines for real-time order and user behavior analytics at 10M+ records/day. Strong in PySpark/SQL transformations, Airflow reliability patterns, and production observability (CloudWatch), with measurable outcomes including improved data quality and 30–40% query performance gains. Also built Python data APIs for analytics/ML consumers with versioning and backward compatibility.

Travoy Spelling

Senior Data Scientist / ML Engineer specializing in GenAI, LLMs, and NLP

Texarkana, TX
10y exp
Tredence
University of Texas at Austin

ML/NLP engineer focused on production GenAI and data linking systems: built a large-scale RAG pipeline over millions of support docs using LangChain/Pinecone and added a LangGraph-based validation layer to cut hallucinations ~40%. Also built scalable PySpark entity resolution (95%+ accuracy) and fine-tuned Sentence-BERT embeddings with contrastive learning for ~30% relevance lift, with strong CI/CD and observability practices (OpenTelemetry, Prometheus/Grafana).

Byron Pineda

Screened

Staff/Lead Data Scientist specializing in Generative AI, NLP/LLMs, and MLOps

Pascagoula, MS
10y exp
Turing
Mississippi State University

Lead Data Scientist (10+ years) with recent work in healthcare data: built production pipelines that unify EHR, genomics, and clinical notes using NLP (spaCy/BERT/BioBERT) and scalable Spark-based processing. Also led development of domain-specific LLM/NLP systems for chatbots and semantic search, deploying models via FastAPI/Flask and improving retrieval with FAISS-backed, fine-tuned clinical embeddings and RAG-style workflows.

Vismay Patel

Screened

Senior AI & Machine Learning Engineer specializing in NLP, GenAI, and MLOps

Berkeley, CA
7y exp
Kaiser Permanente
San Francisco State University

ML/GenAI practitioner with healthcare domain depth who built and deployed a production cervical-cancer EMR classification system using a hybrid rules + medical BERT approach, optimized for high recall under severe class imbalance and PHI constraints. Experienced running end-to-end production ML/LLM pipelines with Apache Airflow (validation, promotion/rollback, monitoring, retraining) and partnering closely with clinicians to calibrate thresholds and implement human-in-the-loop review.

Rishitha Madipelli

Mid-level Software Engineer specializing in cloud-native distributed systems and streaming data

Austin, TX
7y exp
Tesla
George Mason University

Backend/product engineer with Tesla experience building and operating a real-time OTA update monitoring and fleet analytics platform at massive scale (telemetry from 3M+ vehicles). Delivered end-to-end systems across Kafka-based ingestion, TimescaleDB/Postgres analytics modeling, FastAPI/GraphQL APIs, and React/TypeScript dashboards, and handled production scaling incidents on AWS EKS during major rollout spikes.

Saiteja Gaddam

Mid-Level Data Engineer specializing in cloud data platforms and streaming analytics

3y exp
Intuit
University at Buffalo

Data engineer (Intuit) who owned an end-to-end telemetry and subscription analytics platform processing ~22M events/day, built on Kinesis/S3/Glue/Spark/Airflow/Redshift. Strong focus on reliability and data quality (schema drift controls, quarantine layers, idempotent reruns) and performance tuning, cutting reporting latency from ~15 minutes to under 4 minutes while enabling revenue and churn analytics for business teams.

Ranganayak Meravath

Mid-level Generative AI Engineer specializing in RAG, agentic copilots, and regulated AI

5y exp
LPL Financial
University of North Texas

Engineer who built and productionized an Azure-based Enterprise AI Copilot for financial/compliance teams, focused on grounded, auditable answers with citations to reduce hallucinations in regulated workflows. Experienced in designing multi-step agent orchestration and improving reliability through targeted iterations (e.g., fixing chunking/parsing to materially improve citation accuracy), plus building defensive pipelines for messy ERP/operational finance data.

HK

Mid-level Full-Stack Software Engineer specializing in cloud and data platforms

Boston, MA
5y exp
Northeastern University
Penn State University

Full-stack engineer with experience spanning Amazon IMDb and Northeastern’s NeuroJSON portal, combining consumer product work with complex scientific data applications. Built IMDb’s streaming providers feature—described as the company’s most impactful feature of 2023—and has hands-on experience with React/Angular, GraphQL, AWS, Python services, and production monitoring.
