Vetted Apache Spark Professionals

Pre-screened and vetted.

Pavanika Thotakura

Senior Data Engineer specializing in cloud big data pipelines and real-time streaming

Seattle, WA | 6y exp
Amazon | University of North Texas

Amazon data engineer who built a real-time fraud detection pipeline for AWS Lambda, tackling multi-region telemetry quality issues and scaling stream processing for billions of daily requests. Strong in production-grade data/ML workflows on AWS (EMR, Glue, Kinesis, SageMaker) with hands-on entity resolution and anomaly detection.

Sourabh Jain

Screened

Director of Software Engineering specializing in enterprise Data, ML & AI platforms

Bay Area, CA | 23y exp
RSA Security | Shri G. S. Institute of Technology and Science

Former Walmart Director of Software Engineering who left in March 2025 to build products for clients. Recently delivered an LLM/RAG-based UNSPSC classification solution for an MRO client using a multi-stage retrieval + web search + prompt-engineering workflow, and has led large-scale retail forecasting initiatives and high-severity cloud-migration incidents end-to-end.

Kevin Allen

Screened

Senior AI/ML Engineer specializing in conversational and generative AI

Austin, TX | 12y exp
General Motors | University of Kentucky

Built and productionized an LLM-based support assistant end-to-end, including RAG, APIs, monitoring, guardrails, and agent feedback loops. Stands out for translating GenAI prototypes into reliable production systems with structured evaluation, safety controls, and reusable Python infrastructure that improved both support quality and engineering velocity.

Balpreet Kaur

Screened

Junior Machine Learning Engineer specializing in LLMs and data pipelines

Amherst, MA | 2y exp
Google DeepMind | University of Massachusetts Amherst

Research Extern at Google DeepMind and former AWS Software Development Engineer Intern with a strong focus on practical, trustworthy AI engineering. Built a multi-agent RAG system for personalized news headline generation using a fine-tuned Flan-T5 model, parallel critic agents, FAISS retrieval, and style embeddings, while also leading a 3-person team on the project.

Yunqi Dong

Screened

Intern Software Engineer specializing in AI, data systems, and recommendation platforms

Pittsburgh, PA | 0y exp
Meituan | Carnegie Mellon University

Full-stack engineer with a strong mix of real-time product engineering and applied AI experience. Built and deployed a production stock trading simulator on AWS and an LLM-based customer support agent with RAG/tooling, and also shipped a zero-to-one in-store detection feature at Meituan that improved CTR by 7% and conversion by 11%.

BS

Mid-level Full-Stack Developer specializing in cloud-native backend services and real-time data platforms

Remote, USA | 4y exp
Netflix | University of Dayton

Backend/data engineering candidate with Netflix experience designing and migrating analytics platforms from batch to real-time streaming (Kafka/Flink) across AWS and GCP. Delivered measurable improvements (40% lower data delay, 99.9% accuracy) using phased rollouts, automated data validation (Great Expectations), and strong observability (Prometheus/Grafana), and proactively hardened pipelines with idempotency to prevent duplicate Kafka processing.

Sara Fang

Screened

Mid-level Software Engineer specializing in cloud data platforms and distributed systems

Remote | 6y exp
Terra Byte X | University of Delaware

Backend/data engineer with production experience building FastAPI services with strong reliability patterns (circuit breaker, rate limiting, caching, graceful degradation) and JWT/OAuth2 auth. Has delivered AWS EKS deployments via Terraform with Secrets Manager/IRSA and HPA autoscaling, and built Glue/Spark ETL pipelines on S3 Parquet with schema-evolution and idempotent reruns; also demonstrated measurable SQL tuning impact (20–30s to <10s).

BK

Mid-level Full-Stack Software Engineer specializing in cloud microservices and AI integration

Jersey City, NJ | 3y exp
Uber | Pace University

Backend/distributed-systems engineer with Uber experience building real-time telemetry and safety signal pipelines. Strong in Kafka-based event-driven architectures, low-latency processing under peak load, and production reliability via monitoring, retries, and fallback logic; has Docker/Kubernetes and CI/CD deployment experience.

CR

Senior Machine Learning Engineer specializing in conversational AI and Generative AI

San Francisco, CA | 6y exp
Scale AI | Dallas Baptist University

ML/AI engineer with experience at Uber and Scale AI, focused on customer service automation across both classical NLP and generative AI systems. Has owned systems from experimentation through production on AWS, including LLM fine-tuning, RAG optimization, safety evaluation, and internal Python platform tooling that improved consistency and engineering velocity.

SK

Intern Software Engineer specializing in developer productivity and data/AI systems

Los Angeles, California | 1y exp
Intuit | UC Berkeley

Internship experience at Intuit building an LLM-grounded QA system for internal microservice data across 100+ microservices, using a graph database approach (evaluated Neo4j and selected AWS Neptune for production alignment). Also has UC Berkeley research experience (including work with Prof. Dawn Song / Berkeley Eye Research Lab) and cross-functional collaboration with bioinformatics/biology teams to deploy software systems on research servers.

MO

Mid-Level Software Engineer specializing in cloud-native distributed systems

Bellevue, WA | 7y exp
Amazon | University of Washington

Gameplay engineer with hands-on ownership of a real-time C++ combat ability system, including diagnosing and eliminating large-scale combat frame spikes by refactoring hit detection to an event-driven, animation-notify approach (cut collision checks ~80%). Also implemented UE5 networked abilities (dash) with client-side prediction and server-authoritative reconciliation, plus projectile ballistics validated through debug spline visualizations and unit tests.

SB

Mid-level Backend & Reliability Engineer specializing in AWS, Kubernetes, and automation

New Mexico, US | 5y exp
Meta | University of North Carolina at Charlotte

Meta engineer focused on reliability/operations tooling who built a unified real-time health dashboard and scalable telemetry pipelines (AWS + Datadog) for thousands of devices. Also shipped an internal LLM-powered knowledge assistant using RAG over wikis/runbooks/logs with strong guardrails and a rigorous eval loop that drove measurable accuracy improvements via automated doc ingestion and embedding updates.

YR

Senior Data Engineer specializing in cloud-native data pipelines and lakehouse platforms

6y exp
Microsoft | University of North Texas

Data engineer at Microsoft who owned an end-to-end subscription analytics platform processing 7TB+ daily across 40+ pipelines, combining ADF batch ingestion with Kafka/Spark streaming and rigorous Great Expectations quality gates. Built a Fabric-based self-service ingestion platform with CI/CD and observability, plus a Databricks feature store serving near-real-time ML inference with Delta Lake reliability and versioning.

Dheeraj Kumar

Screened

Intern Data Scientist specializing in marketing analytics and data engineering

Tucson, Arizona | 2y exp
Roche | Purdue University

AI/LLM practitioner with internships at Dell Technologies and Roche who built and deployed a healthcare-focused "Doctor LLM" by fine-tuning Meta Llama 3.2 on healthcaremagic.json, emphasizing safety guardrails to prevent harmful medical advice. Experienced in productionizing AI workflows with monitoring, testing, and orchestration (Airflow, Kubernetes), and in delivering AI-agent-driven competitive landscape insights to non-technical business stakeholders.

Shreya Roy Koneri

Mid-level Software Engineer specializing in backend microservices and real-time payments

Phoenix, AZ | 5y exp
American Express | University of Dayton

Product-minded full-stack engineer who has owned customer-facing platforms end-to-end, including a unified web UI platform that increased adoption by 30% using feature flags and phased rollouts. Experienced designing TypeScript/React systems with microservices and RabbitMQ at scale, addressing reliability issues with DLQs, retries, and idempotent consumers, and building internal analytics tooling adopted company-wide within weeks.

Zheng Wu

Screened

Junior Software Engineer specializing in backend systems and cloud messaging

Mountain View, CA | 1y exp
NewsBreak | Rice University

Data/ML engineer who has owned end-to-end systems across email deliverability/segmentation and production LLM apps. Built a Spark+Airflow segmentation engine that materially improved deliverability (99.9%) and open rates (>50%), and shipped a PDF-to-quiz RAG product using LangChain/Vertex AI/Chroma with strong guardrails and an eval loop that cut hallucinations to <5%.

Sri Charan Reddy Mallu

Mid-Level Software Development Engineer specializing in GenAI and full-stack cloud systems

Redwood City, CA | 5y exp
C3 AI | San José State University

Full-stack engineer with experience across Magna, C3.ai, and Amazon, building GenAI-enabled products and finance transaction systems. Has shipped Next.js (App Router) + TypeScript features backed by Go/Python RAG pipelines, and emphasizes production quality via load testing, Selenium regression coverage, LLM-aware integration testing, and Azure observability. Also built LangGraph-orchestrated multi-step content generation workflows with robust retry/idempotency strategies.

Jehanzeb Khan

Screened

Director-level Engineering Manager specializing in large-scale data and compute platforms

Sunnyvale, CA | 20y exp
Amazon | Institute of Business Administration

Platform and distributed-systems leader (player-coach) who owned architecture and reliability for an Amazon analytics/data platform serving ~100K internal users at exabyte scale. Built an ML-driven “Lakeflow” optimization layer that cut pipeline completion times ~20–25% and reduced compute waste >15%, and led major incident response/redesign efforts (e.g., deletion storm) with strong rollout/observability/rollback practices.

HR

Mid-level Data Analytics professional specializing in BI, data engineering, and applied AI

California, USA | 6y exp
Amazon | San Jose State University

Built GenMedX, a multi-module clinical AI system for emergency department decision support spanning triage prediction, diagnosis, medication Q&A, and visit summarization. Stands out for combining medical LLM fine-tuning, RAG, and rigorous evaluation/monitoring to drive a major triage recall improvement from 38.5% to 76.6%, with a strong focus on safety, edge-case detection, and production reliability.

KaMing Cheung

Screened

Junior Software Engineer specializing in full-stack and machine learning

Pittsburgh, United States | 1y exp
Carnegie Mellon University | Carnegie Mellon University

CMU IoT coursework project builder who implemented an end-to-end TinyML gesture recognition system on a Particle Photon + ADXL345, streaming data via MQTT/Node-RED to a real-time Node.js frontend and deploying a quantized logistic regression model on-device. Also explored multi-drone coordination, implementing leader-follower offset control and a pivot/arc turning strategy to avoid collisions, and brings practical Docker/Kubernetes plus CI/CD workflow experience from internships.

Vishal Mittal

Screened

Director-level Engineering Manager specializing in cloud security platforms and AI-driven automation

Fremont, CA | 18y exp
Palo Alto Networks | Stanford University

Senior engineering leader in the Bay Area with experience spanning VMware, Hortonworks/Cloudera, Barracuda, and Palo Alto Networks, including leading open-source work (Apache Knox) and architecting large-scale security platforms. Has driven disaster recovery and cloud security products, designed Python microservices for Microsoft 365 security, and scaled teams (3x) while formalizing enterprise readiness practices with automated documentation using NotebookLM.

Nikhel Parkar

Screened

Executive engineering leader specializing in FinTech, data platforms, and cloud modernization

Reston, VA | 23y exp
Fannie Mae | PowerMBA Business School

Aspiring founder building an AI governance and compliance startup for the finance industry, focused on agents that monitor data lakes/lakehouses to detect security vulnerabilities, PII exposure, and governance issues in real time. Has already formed an S-corp, has not raised capital yet, and approaches idea validation through Minimum Lovable Product testing with potential clients.

Nitin Sunda

Screened

Mid-level Software Engineer specializing in FinTech and GenAI platforms

Seattle, WA | 4y exp
Amazon | Northeastern University

Candidate describes a development approach centered on AI-assisted coding, testing, and agent-driven workflows, including production exposure to multi-agent systems and governance-oriented logging. They appear particularly focused on combining AI speed with structured validation through unit tests, boundary tests, and edge-case monitoring.
