Vetted Apache Spark Professionals

Pre-screened and vetted.

Apache Spark, Python, Docker, SQL, AWS, CI/CD

Prafull Prajapati

Screened

Senior Backend Software Engineer specializing in cloud, microservices, and AI systems

Richardson, TX | 8y exp

The University of Texas at Dallas | University of Texas at Dallas

“Built an AI-powered job outreach application for his own job search and took it from idea to production use, owning architecture, FastAPI backend, retrieval/generation pipeline, frontend workflow, deployment, and iteration. Especially compelling for teams needing a pragmatic full-stack engineer who can turn LLM-based product ideas into usable, maintainable tools with measurable workflow impact.”

C, C++, JavaScript, Java, Python, TypeScript (+162 more)

Timothy Yeav

Screened

Senior AI/ML Engineer specializing in Generative AI and FinTech

Bronx, NY | 8y exp

Insitro | New York City College of Technology (CUNY)

“Built end-to-end LLM/RAG systems for biological data and scientific literature analysis in a drug discovery setting, helping researchers explore disease insights and treatment hypotheses faster. Combines applied GenAI product work with strong production engineering, including monitoring, retrieval optimization, reusable Python services, and scalable deployment on AWS/Kubeflow.”

Generative AI, LLaMA, GPT, Agentic AI, BERT, Transformers (+204 more)

Sushmitha Vishwanath Rao

Screened

Mid-level Software Engineer specializing in backend systems, AI automation, and SaaS

Sunnyvale, CA | 5y exp

FlashyFables | University of Texas at Dallas

“Full-stack engineer who built and owned a production real-estate search platform (advanced search + saved-search alerts) using Next.js App Router/TypeScript with a NestJS + Postgres + Elasticsearch/Kafka backend. Demonstrated strong performance engineering (map search FPS ~20→60, ~80% latency reduction) and backend scalability (optimized alert-matching queries and orchestrated notification workflows with Airflow/Redis), with measurable post-launch engagement gains (+27% returning users).”

AI Agents, LangChain, Claude, Prompt Engineering, Keras, TensorFlow (+87 more)

Aakash Khepar

Screened

Mid-level Full-Stack AI Engineer specializing in agentic AI systems

Tempe, AZ | 4y exp

Arizona State University | Arizona State University

“Full-stack engineer with strong ownership across production SaaS and AI agent systems, including a multi-tenant enterprise analytics product at Fractal Analytics and an archive intelligence platform for a real nonprofit. Stands out for combining deep backend/system design, secure AI/RAG implementation, and rapid zero-to-one execution—plus multiple hackathon wins and leadership roles.”

Python, TypeScript, JavaScript, Java, SQL, NoSQL (+236 more)

Sai Anuhya Bandi

Screened

Mid-level Full-Stack Engineer specializing in AI-driven data platforms

Santa Barbara, CA | 5y exp

Uber | University of Alabama at Birmingham

“Full-stack engineer with 5+ years of experience who built real-time data visualization and analytics systems at Uber, spanning React/TypeScript frontends, Node/GraphQL services, Kafka pipelines, and PostgreSQL. Particularly compelling for teams needing a hands-on builder who can turn ambiguous customer needs into scalable products, and who has also applied RAG with LangChain/OpenAI over 1.8M support files to surface actionable insights.”

TypeScript, JavaScript, Python, Java, SQL, React (+232 more)

Shreyas Darade

Screened

Mid-level Data Scientist specializing in business intelligence and machine learning

Pittsburgh, PA | 2y exp

Armada Partners | Carnegie Mellon University

“Internship experience building a production LLM-powered podcast operations agent that automated lead intake (HubSpot), guest research, scheduling (Calendly), meeting-summary evaluation (Gemini), and human approval via Slack bot—while retaining rejected candidates for future outreach. Also contributed to ideation of a multi-agent orchestration framework with parsing and task routing, and emphasized reliability via structured prompts, HITL feedback, and prompt-based test sets.”

A/B Testing, Analytics, Business Intelligence, Classification, Clustering, Data Analytics (+84 more)

Mengyu Liu

Screened

Senior Data Scientist specializing in GenAI agents and causal inference

Remote, USA | 10y exp

Humana | University of Miami

“Built and deployed a production healthcare medical review agent that automates call-transcript summarization and medication reconciliation using a hybrid deterministic + LangGraph-orchestrated LLM workflow. Demonstrates strong reliability engineering (guardrails, schema validation, confidence thresholds, golden/adversarial eval, Langfuse monitoring) in a regulated environment, delivering 60% lower latency and 70%+ efficiency gains while partnering closely with care managers and operations.”

Python, R, SQL, NumPy, Pandas, Matplotlib (+129 more)

Yukta Kulkarni

Screened

Junior AI/ML Engineer specializing in applied LLMs, security, and reinforcement learning

New York, USA | 2y exp

New York University | NYU

“Built and shipped a production LLM-powered investor research feature for a fintech product, focused on grounded answers and minimizing hallucinations. Implemented retrieval-quality and evidence-coverage gating with clear refusal fallbacks, and evaluated systems with regression tests and metrics like correct-refusal rate, hallucination rate, and latency. Comfortable orchestrating workflows with LangChain or custom Python depending on production needs.”

Python, C, C++, SQL, TypeScript, JavaScript (+82 more)

Venkata Sai Pavan Dema

Screened

Mid-level Data Scientist/ML Engineer specializing in GenAI agents and MLOps

5y exp

Capital One | University of the Cumberlands

“AI/LLM engineer at Capital One who deployed a production RAG-powered fraud analysis and document intelligence platform using LangChain, OpenAI, Pinecone, Kafka, and AWS. Focused on reliability in real-time investigations via hybrid retrieval, schema-validated outputs, and LLM verification loops, reporting review-time reduction from hours to minutes and ~99% fraud detection precision.”

A/B Testing, Amazon EC2, Amazon Redshift, Amazon S3, Amazon SageMaker, Azure App Service (+163 more)

Lakshmi Kiranmayi Chelluboyina

Screened

Junior Full-Stack & Data Engineer specializing in cloud platforms and cybersecurity ML

New York, NY | 2y exp

Accenture | NYU

“Built a hackathon "Patient Summary Assistant" backend focused on healthcare workflows, combining RAG-based summarization with HIPAA-minded privacy controls (NER redaction + encryption). Demonstrated strong infra skills by deploying on Kubernetes with Helm/HPA and GitOps (ArgoCD), plus migrating from OpenAI to an on-prem Llama 3 stack (vLLM, quantization, shadow-mode testing) and adding real-time Kafka ingestion for patient vitals/anomaly alerts.”

Agile, Apache Spark, C, C#, C++, CI/CD (+93 more)

Yeshwanth Pulapa

Screened

Mid-level AI/ML Engineer specializing in Databricks, MLOps, and real-time fraud detection

The Colony, TX | 4y exp

Databricks | University of North Texas

“ML/LLM engineer building production, real-time fraud detection for financial transactions using a two-tier architecture (fast ML + GPT) to deliver both low-latency decisions and analyst-friendly risk explanations. Experienced orchestrating end-to-end retraining, drift monitoring, and automated model promotion with Databricks Jobs/Workflows and MLflow, and partnering closely with fraud analysts to tune alerts, thresholds, and dashboards.”

A/B Testing, Apache Airflow, Apache Kafka, Apache Spark, AWS, AWS Lambda (+93 more)

Nikita Vivek Kolhe

Screened

Junior Data & Machine Learning Engineer specializing in MLOps and NLP

Los Angeles, United States | 1y exp

WorkUp | USC

“ML/LLM practitioner with production experience building a healthcare review sentiment pipeline (RateMDs) using Hugging Face Transformers plus a LangChain+FAISS RAG layer for interactive querying. Also led orchestration-driven optimization of Nike’s Fusion ETL pipeline, improving runtime efficiency by 20%, and has experience translating ML outputs into Tableau dashboards for non-technical healthcare stakeholders (e.g., readmission risk).”

Python, SQL, C, C++, R, MATLAB (+90 more)

Zufeshan Imran

Screened

Senior Machine Learning Engineer specializing in LLMs, RAG, and computer vision

San Diego, CA | 10y exp

SOTER AI | UC San Diego

“Built an "AskMyVideo" system that turns YouTube videos into queryable knowledge graphs by transcribing audio (Whisper), chunking and embedding content, and enabling traceable answers back to exact timestamps. Strong in entity resolution (rules + fuzzy matching + TF-IDF/cosine with PR-curve thresholding) and modern retrieval stacks (FAISS, hybrid dense/sparse, domain fine-tuning with ~12% precision gain), with a production mindset using Airflow/Prefect, Docker/FastAPI, and LangSmith/Prometheus/Grafana observability.”

Machine Learning, Deep Learning, Generative AI, Transformers, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG) (+120 more)

Sai Venkata

Screened

Senior Data Engineer specializing in cloud lakehouse and real-time streaming pipelines

Texas, USA | 6y exp

CVS Health | University of Central Missouri

“Senior data engineer with experience in both healthcare (CVS Health) and financial services (Bank of America), building large-scale Azure lakehouse pipelines (30+ EHR sources, ~5TB) and real-time streaming services (Event Hubs/Kafka) for patient vitals. Strong focus on reliability and data quality (Great Expectations, monitoring/alerting, schema drift automation), with measurable outcomes like 50% runtime reduction and 99%+ uptime for regulatory reporting pipelines.”

Python, SQL, Scala, Java, Shell Scripting, Apache Spark (+117 more)

Jahnavi Vasala

Screened

Mid-level Data Engineer specializing in cloud data platforms and streaming pipelines

San Diego, CA | 6y exp

Intuit | Cleveland State University

“Data engineer with Intuit experience owning end-to-end, high-volume financial data pipelines (API/S3 ingestion, Airflow orchestration, Spark/PySpark + SQL transforms, Snowflake marts). Strong focus on reliability and data quality—achieved 99.8% SLA and cut discrepancies by 35% using Great Expectations, reconciliation, schema versioning, and automated backfills; also built near real-time Kafka/API data services with CI/CD and observability.”

Python, SQL, PySpark, Scala, Shell scripting, Apache Spark (+87 more)

Rohit Kumar

Screened

Mid-level Data Engineer specializing in large-scale analytics platforms

San Jose, CA | 5y exp

Nutanix | USC

“Data/Backend engineer with experience at Naukri building large-scale analytics products over a 130M+ user base, including Spark/Airflow pipelines and Kafka-based clickstream validation with Confluent Schema Registry. Also built an audience segmentation backend (Athena/S3 + Spring Boot APIs) for non-technical internal teams and recently shipped a GenAI customer data audit system (FastAPI/Postgres/Llama) that cut sales-planning validation from ~3 months to ~1 week.”

Algorithms, Amazon S3, Apache Hadoop, Apache Hive, Apache Kafka, Apache Spark (+95 more)

Shanmukha Koganti

Screened

Mid-level AI/ML Engineer specializing in recommender systems and edge computer vision

Bay Area, CA | 6y exp

Shopify | University of North Texas

“ML/AI engineer with production experience at Shopify and Intel, building a deep learning product ranking system that lifted add-to-cart ~14% and serving real-time similarity search via FAISS+Redis under <20ms latency at massive scale. Also deployed computer vision models to 100+ retail edge locations using Docker/Ansible/k3s with zero-downtime rollouts, and applies strong MLOps practices (A/B testing, canary/shadow, observability) plus performance optimization (OpenVINO, INT8).”

A/B Testing, Agile, Ansible, Apache Kafka, Apache Spark, AWS (+170 more)

Nagarjuna Vaddineni

Screened

Mid-level Full-Stack Software Engineer specializing in cloud-native microservices and data pipelines

Seattle, WA | 6y exp

Amazon | Texas A&M University-Kingsville

“Amazon backend engineer who built and operated high-scale Java Spring Boot microservices on AWS (EKS/EC2) handling millions of daily transactions, with deep experience debugging p95 latency and database/ORM bottlenecks. Shipped an AI-driven real-time personalization feature by integrating SageMaker model inference end-to-end with low-latency caching and graceful fallbacks, and designed robust order/payment orchestration with retries, compensations, and DLQ-based escalation.”

Agile, Ansible, Apache Kafka, Apache Spark, AWS, AWS CloudFormation (+122 more)

Sai Dinesh Pusapati

Screened

Senior AI/ML Engineer specializing in GenAI agents and LLM workflows

San Francisco, CA | 6y exp

Scale AI | Belhaven University

“LLM/AI engineer with production experience building a retrieval-based document intelligence system that extracts information from PDFs/emails, backed by Python + Spark pipelines. Focused on reliability and cost/latency optimization (caching, batch processing) and has hands-on orchestration experience with Airflow (sensors, retries, alerts). Also partnered with business stakeholders to deliver customer feedback classification/summarization for faster sentiment insights.”

Python, TypeScript, Java, C#, JavaScript, R (+103 more)

Biplob Bidari

Screened

Senior Data Engineer specializing in FinTech analytics and ML data platforms

USA | 5y exp

Goldman Sachs | University of the Cumberlands

“ML/AI engineer with Goldman Sachs experience building production fraud detection and RAG-based trading insights systems end-to-end. Stands out for combining real-time ML infrastructure, GenAI retrieval systems, and compliance-aware design, with measurable impact including nearly 25% false-positive reduction and improved analyst productivity.”

Python, Pandas, NumPy, PySpark, SQL, Bash (+139 more)

Kevin Cruz

Screened

Senior Gen AI Engineer specializing in agentic LLM systems

Tempe, AZ | 15y exp

Opendoor | USC

“Built and owned end-to-end production systems for a healthcare platform, including a predictive task recommendation feature (React + FastAPI + ML on AWS ECS) that cut backlog 20% and saved coordinators ~10 hours/week. Also productionized an AI-native RAG system (vector DB + LLM) delivering 40% faster query resolution, and led phased modernization of a monolithic FastAPI service into async microservices using feature flags and canary releases.”

Generative AI, Multi-Agent Systems, Prompt Engineering, Vector Databases, LangChain, LangGraph (+396 more)

Sai Karthik Chittamuru

Screened

Senior Salesforce Developer specializing in AI systems and enterprise cloud solutions

Pittsburgh, PA | 15y exp

CRMIT Solutions | Carnegie Mellon University

“Salesforce-focused engineer with hands-on experience building Sales Cloud and Service Cloud solutions, including a Zoho billing integration for quote/contract workflows and a multi-panel LWC case management dashboard. Stands out for making practical architecture decisions around middleware vs. custom REST, handling idempotency with upsert patterns, and modernizing legacy Aura patterns with Lightning Message Service.”

REST APIs, LangChain, LangGraph, PyTorch, TensorFlow, Transformers (+131 more)

Ruby Medeiros

Screened

Staff SRE and Software Engineer specializing in distributed systems and cloud reliability

11y exp

Arena | NOVA University Lisbon

“Built a production B2C behavioral interview system for job seekers using LangGraph/LangChain on AWS Bedrock with Nova models, plus a FastAPI backend and Vercel AI SDK frontend. Stands out for practical agent reliability work: local stress testing, OpenTelemetry-to-Datadog observability, token/cost monitoring, and guardrails to keep conversations on track and resistant to instruction override.”

Distributed Systems, AWS, Kubernetes, Docker, Terraform, Ansible (+108 more)
