Vetted Apache Spark Professionals

Pre-screened and vetted.

PP

Senior Backend Software Engineer specializing in cloud, microservices, and AI systems

Richardson, TX | 8y exp
The University of Texas at Dallas | University of Texas at Dallas

Built an AI-powered job outreach application for his own job search and took it from idea to production use, owning architecture, FastAPI backend, retrieval/generation pipeline, frontend workflow, deployment, and iteration. Especially compelling for teams needing a pragmatic full-stack engineer who can turn LLM-based product ideas into usable, maintainable tools with measurable workflow impact.

Timothy Yeav

Screened

Senior AI/ML Engineer specializing in Generative AI and FinTech

Bronx, NY | 8y exp
Insitro | New York City College of Technology (CUNY)

Built end-to-end LLM/RAG systems for biological data and scientific literature analysis in a drug discovery setting, helping researchers explore disease insights and treatment hypotheses faster. Combines applied GenAI product work with strong production engineering, including monitoring, retrieval optimization, reusable Python services, and scalable deployment on AWS/Kubeflow.

SR

Mid-level Software Engineer specializing in backend systems, AI automation, and SaaS

Sunnyvale, CA | 5y exp
FlashyFables | University of Texas at Dallas

Full-stack engineer who built and owned a production real-estate search platform (advanced search + saved-search alerts) using Next.js App Router/TypeScript with a NestJS + Postgres + Elasticsearch/Kafka backend. Demonstrated strong performance engineering (map search FPS ~20→60, ~80% latency reduction) and backend scalability (optimized alert-matching queries and orchestrated notification workflows with Airflow/Redis), with measurable post-launch engagement gains (+27% returning users).

Aakash Khepar

Screened

Mid-level Full-Stack AI Engineer specializing in agentic AI systems

Tempe, AZ | 4y exp
Arizona State University | Arizona State University

Full-stack engineer with strong ownership across production SaaS and AI agent systems, including a multi-tenant enterprise analytics product at Fractal Analytics and an archive intelligence platform for a real nonprofit. Stands out for combining deep backend/system design, secure AI/RAG implementation, and rapid zero-to-one execution—plus multiple hackathon wins and leadership roles.

SA

Mid-level Full-Stack Engineer specializing in AI-driven data platforms

Santa Barbara, CA | 5y exp
Uber | University of Alabama at Birmingham

Full-stack engineer with 5+ years of experience who built real-time data visualization and analytics systems at Uber, spanning React/TypeScript frontends, Node/GraphQL services, Kafka pipelines, and PostgreSQL. Particularly compelling for teams needing a hands-on builder who can turn ambiguous customer needs into scalable products, and who has also applied RAG with LangChain/OpenAI over 1.8M support files to surface actionable insights.

SD

Mid-level Data Scientist specializing in business intelligence and machine learning

Pittsburgh, PA | 2y exp
Armada Partners | Carnegie Mellon University

Internship experience building a production LLM-powered podcast operations agent that automated lead intake (HubSpot), guest research, scheduling (Calendly), meeting-summary evaluation (Gemini), and human approval via Slack bot—while retaining rejected candidates for future outreach. Also contributed to ideation of a multi-agent orchestration framework with parsing and task routing, and emphasized reliability via structured prompts, HITL feedback, and prompt-based test sets.

Mengyu Liu

Screened

Senior Data Scientist specializing in GenAI agents and causal inference

Remote, USA | 10y exp
Humana | University of Miami

Built and deployed a production healthcare medical review agent that automates call-transcript summarization and medication reconciliation using a hybrid deterministic + LangGraph-orchestrated LLM workflow. Demonstrates strong reliability engineering (guardrails, schema validation, confidence thresholds, golden/adversarial eval, Langfuse monitoring) in a regulated environment, delivering 60% lower latency and 70%+ efficiency gains while partnering closely with care managers and operations.

YK

Junior AI/ML Engineer specializing in applied LLMs, security, and reinforcement learning

New York, USA | 2y exp
New York University | NYU

Built and shipped a production LLM-powered investor research feature for a fintech product, focused on grounded answers and minimizing hallucinations. Implemented retrieval-quality and evidence-coverage gating with clear refusal fallbacks, and evaluates systems with regression tests and metrics like correct-refusal rate, hallucination rate, and latency. Comfortable orchestrating workflows with LangChain or custom Python depending on production needs.

VS

Mid-level Data Scientist/ML Engineer specializing in GenAI agents and MLOps

5y exp
Capital One | University of the Cumberlands

AI/LLM engineer at Capital One who deployed a production RAG-powered fraud analysis and document intelligence platform using LangChain, OpenAI, Pinecone, Kafka, and AWS. Focused on reliability in real-time investigations via hybrid retrieval, schema-validated outputs, and LLM verification loops, reporting review-time reduction from hours to minutes and ~99% fraud detection precision.

LK

Junior Full-Stack & Data Engineer specializing in cloud platforms and cybersecurity ML

New York, NY | 2y exp
Accenture | NYU

Built a hackathon "Patient Summary Assistant" backend focused on healthcare workflows, combining RAG-based summarization with HIPAA-minded privacy controls (NER redaction + encryption). Demonstrated strong infra skills by deploying on Kubernetes with Helm/HPA and GitOps (ArgoCD), plus migrating from OpenAI to an on-prem Llama 3 stack (vLLM, quantization, shadow-mode testing) and adding real-time Kafka ingestion for patient vitals/anomaly alerts.

YP

Mid-level AI/ML Engineer specializing in Databricks, MLOps, and real-time fraud detection

The Colony, TX | 4y exp
Databricks | University of North Texas

ML/LLM engineer building production, real-time fraud detection for financial transactions using a two-tier architecture (fast ML + GPT) to deliver both low-latency decisions and analyst-friendly risk explanations. Experienced orchestrating end-to-end retraining, drift monitoring, and automated model promotion with Databricks Jobs/Workflows and MLflow, and partnering closely with fraud analysts to tune alerts, thresholds, and dashboards.

NV

Junior Data & Machine Learning Engineer specializing in MLOps and NLP

Los Angeles, United States | 1y exp
WorkUp | USC

ML/LLM practitioner with production experience building a healthcare review sentiment pipeline (RateMDs) using Hugging Face Transformers plus a LangChain+FAISS RAG layer for interactive querying. Also led orchestration-driven optimization of Nike’s Fusion ETL pipeline, improving runtime efficiency by 20%, and has experience translating ML outputs into Tableau dashboards for non-technical healthcare stakeholders (e.g., readmission risk).

ZI

Senior Machine Learning Engineer specializing in LLMs, RAG, and computer vision

San Diego, CA | 10y exp
SOTER AI | UC San Diego

Built an "AskMyVideo" system that turns YouTube videos into queryable knowledge graphs by transcribing audio (Whisper), chunking and embedding content, and enabling traceable answers back to exact timestamps. Strong in entity resolution (rules + fuzzy matching + TF-IDF/cosine with PR-curve thresholding) and modern retrieval stacks (FAISS, hybrid dense/sparse, domain fine-tuning with ~12% precision gain), with a production mindset using Airflow/Prefect, Docker/FastAPI, and LangSmith/Prometheus/Grafana observability.

Sai Venkata

Screened

Senior Data Engineer specializing in cloud lakehouse and real-time streaming pipelines

Texas, USA | 6y exp
CVS Health | University of Central Missouri

Senior data engineer with experience in both healthcare (CVS Health) and financial services (Bank of America), building large-scale Azure lakehouse pipelines (30+ EHR sources, ~5TB) and real-time streaming services (Event Hubs/Kafka) for patient vitals. Strong focus on reliability and data quality (Great Expectations, monitoring/alerting, schema drift automation), with measurable outcomes like 50% runtime reduction and 99%+ uptime for regulatory reporting pipelines.

JV

Mid-level Data Engineer specializing in cloud data platforms and streaming pipelines

San Diego, CA | 6y exp
Intuit | Cleveland State University

Data engineer with Intuit experience owning end-to-end, high-volume financial data pipelines (API/S3 ingestion, Airflow orchestration, Spark/PySpark + SQL transforms, Snowflake marts). Strong focus on reliability and data quality—achieved 99.8% SLA and cut discrepancies by 35% using Great Expectations, reconciliation, schema versioning, and automated backfills; also built near real-time Kafka/API data services with CI/CD and observability.

Rohit Kumar

Screened

Mid-level Data Engineer specializing in large-scale analytics platforms

San Jose, CA | 5y exp
Nutanix | USC

Data/Backend engineer with experience at Naukri building large-scale analytics products over a 130M+ user base, including Spark/Airflow pipelines and Kafka-based clickstream validation with Confluent Schema Registry. Also built an audience segmentation backend (Athena/S3 + Spring Boot APIs) for non-technical internal teams and recently shipped a GenAI customer data audit system (FastAPI/Postgres/Llama) that cut sales-planning validation from ~3 months to ~1 week.

Shanmukha Koganti

Mid-level AI/ML Engineer specializing in recommender systems and edge computer vision

Bay Area, CA | 6y exp
Shopify | University of North Texas

ML/AI engineer with production experience at Shopify and Intel, building a deep learning product ranking system that lifted add-to-cart ~14% and serving real-time similarity search via FAISS+Redis under <20ms latency at massive scale. Also deployed computer vision models to 100+ retail edge locations using Docker/Ansible/k3s with zero-downtime rollouts, and applies strong MLOps practices (A/B testing, canary/shadow, observability) plus performance optimization (OpenVINO, INT8).

Nagarjuna Vaddineni

Mid-level Full-Stack Software Engineer specializing in cloud-native microservices and data pipelines

Seattle, WA | 6y exp
Amazon | Texas A&M University-Kingsville

Amazon backend engineer who built and operated high-scale Java Spring Boot microservices on AWS (EKS/EC2) handling millions of daily transactions, with deep experience debugging p95 latency and database/ORM bottlenecks. Shipped an AI-driven real-time personalization feature by integrating SageMaker model inference end-to-end with low-latency caching and graceful fallbacks, and designed robust order/payment orchestration with retries, compensations, and DLQ-based escalation.

Sai Dinesh Pusapati

Senior AI/ML Engineer specializing in GenAI agents and LLM workflows

San Francisco, CA | 6y exp
Scale AI | Belhaven University

LLM/AI engineer with production experience building a retrieval-based document intelligence system that extracts information from PDFs/emails, backed by Python + Spark pipelines. Focused on reliability and cost/latency optimization (caching, batch processing) and has hands-on orchestration experience with Airflow (sensors, retries, alerts). Also partnered with business stakeholders to deliver customer feedback classification/summarization for faster sentiment insights.

Biplob Bidari

Screened

Senior Data Engineer specializing in FinTech analytics and ML data platforms

USA | 5y exp
Goldman Sachs | University of the Cumberlands

ML/AI engineer with Goldman Sachs experience building production fraud detection and RAG-based trading insights systems end-to-end. Stands out for combining real-time ML infrastructure, GenAI retrieval systems, and compliance-aware design, with measurable impact including nearly 25% false-positive reduction and improved analyst productivity.

Kevin Cruz

Screened

Senior Gen AI Engineer specializing in agentic LLM systems

Tempe, AZ | 15y exp
Opendoor | USC

Built and owned end-to-end production systems for a healthcare platform, including a predictive task recommendation feature (React + FastAPI + ML on AWS ECS) that cut backlog 20% and saved coordinators ~10 hours/week. Also productionized an AI-native RAG system (vector DB + LLM) delivering 40% faster query resolution, and led phased modernization of a monolithic FastAPI service into async microservices using feature flags and canary releases.

Sai Karthik Chittamuru

Senior Salesforce Developer specializing in AI systems and enterprise cloud solutions

Pittsburgh, PA | 15y exp
CRMIT Solutions | Carnegie Mellon University

Salesforce-focused engineer with hands-on experience building Sales Cloud and Service Cloud solutions, including a Zoho billing integration for quote/contract workflows and a multi-panel LWC case management dashboard. Stands out for making practical architecture decisions around middleware vs. custom REST, handling idempotency with upsert patterns, and modernizing legacy Aura patterns with Lightning Message Service.

Ruby Medeiros

Screened

Staff SRE and Software Engineer specializing in distributed systems and cloud reliability

11y exp
Arena | NOVA University Lisbon

Built a production B2C behavioral interview system for job seekers using LangGraph/LangChain on AWS Bedrock with Nova models, plus a FastAPI backend and Vercel AI SDK frontend. Stands out for practical agent reliability work: local stress testing, OpenTelemetry-to-Datadog observability, token/cost monitoring, and guardrails to keep conversations on track and resistant to instruction override.

