
Vetted Apache Spark Professionals

Pre-screened and vetted.

Apache Spark, Python, Docker, SQL, AWS, CI/CD

Sanketh Reddy

Screened

Senior Data Engineer specializing in cloud data platforms and large-scale ETL

Jersey City, NJ

6y exp

JPMorgan Chase

University of Texas at Dallas

“Data engineer focused on large-scale ETL/ELT pipelines across cloud stacks (GCP and AWS), including Spark-based transformations and orchestration with Airflow. Has experience loading up to ~2TB per BigQuery target table and designing atomic loads to multiple downstream systems (Elasticsearch + Kafka), with Kubernetes deployment and Jenkins CI/CD.”

Python, SQL, Scala, Java, R, C++, +81 more

Vikas Ravula

Screened

Senior Data Engineer specializing in cloud data platforms and real-time streaming for financial services

Chicago, IL

6y exp

Bloomberg

University of Illinois Urbana-Champaign

“Data engineer with experience at Bloomberg, UBS, and Bank of America building high-volume financial data platforms and services. Owned an end-to-end pipeline processing ~150–200M records/day (Kafka/Cassandra/S3 → Spark/PySpark → Snowflake) with strong data quality controls and Airflow reliability practices, reporting ~99% reliability and major performance gains. Also built large-scale external API ingestion with compliance-minded rate limiting, schema versioning, and quarantine/validation layers.”

Python, SQL, Scala, Java, Shell Scripting, Apache Spark, +100 more

Sriram Rajaraman

Screened

Senior Infrastructure Platform Architect specializing in Kubernetes and hybrid cloud

Chicago, IL

9y exp

Exelon

George Mason University

“Platform/infra engineer with strong ownership of Kubernetes on VMware and day-to-day hybrid on-prem-to-AWS operations. Has hands-on experience automating infrastructure delivery with Terraform/Ansible/CI-CD, and has resolved real production issues spanning CSI storage reattachment during upgrades, vSphere storage-latency performance degradation, and hybrid connectivity/routing failures with improved validation, monitoring, and failover.”

Kubernetes, AWS, Infrastructure as Code, Terraform, DevOps, Git, +204 more

Poorna Pedapudi

Screened

Mid-Level Software Engineer specializing in distributed backend systems and cloud-native microservices

Seattle, WA

5y exp

Uber

George Mason University

“Software engineer focused on data platforms and applied LLM systems: built an internal data quality monitoring layer to catch silent data drift and iterated post-launch after finding ~30% false-positive alerts, reducing noise via dynamic baselines and improved structured logging. Also shipped a production RAG-based internal knowledge assistant over Jira/Confluence with citations, confidence-based fallbacks, and nightly automated evals to prevent regressions.”

Go, Python, Java, JavaScript, TypeScript, C++, +115 more

Xinyuan Lin

Screened

Intern Software Engineer specializing in LLMs, RAG, and full-stack systems

San Jose, CA

1y exp

eBay

University of Washington

“Built and productionized a multi-agent LLM analytics assistant at eBay that routes natural-language questions to retrieval or text-to-SQL, dynamically retrieves relevant schemas via a vector DB, and executes against a data warehouse. Drove a major quality lift (text-to-SQL accuracy 60%→85%) and materially reduced time engineers/PMs spent getting data insights through strong eval/monitoring, tracing, and reliability-focused design (schema retrieval, strict JSON outputs, retries/clarifications).”

Python, Java, C, C++, Go, JavaScript, +98 more

Shalin Bhavsar

Screened

Mid-level Software Engineer specializing in cloud backend and distributed systems

Seattle, WA

3y exp

Amazon

USC

“Built a production GenAI support agent at Amazon for FBA on-call operations, using Bedrock, Lambda, RAG, and confidence-based human fallback to safely automate ticket triage. The system materially reduced ticket volume and manual workload while improving MTTR, showing strong depth in reliable LLM agent architecture under real operational constraints.”

Python, Java, C++, JavaScript, SQL, Django, +70 more

Anuj Vakil

Screened

Mid-level Software Engineer specializing in distributed data infrastructure

Palo Alto, CA

3y exp

Amazon

San Jose State University

“Engineer who uses AI in a disciplined, practical way—leveraging it to speed debugging, generate edge-case tests, and improve coverage while retaining ownership of system design and production validation. Has experimented with chained AI tools but prefers simpler workflows when they reduce noise and review overhead.”

Java, Python, JavaScript, TypeScript, Scala, Spring Boot, +62 more

Alexander Smith

Screened

Junior Software Engineer and Data Scientist specializing in AI/ML systems

California, USA

3y exp

Dun & Bradstreet

UC Berkeley

“Built production-grade automation and ML/data pipelines at Dun & Bradstreet and ThreadNotion, spanning large-scale document classification, country risk report automation, and resilient Playwright testing for dynamic AI chat workflows. Particularly strong in turning brittle or ambiguous systems into reliable, observable, end-to-end automated platforms.”

Python, Go, Java, C, TypeScript, JavaScript, +151 more

Shuju Sun

Screened

Mid-Level Software Engineer specializing in real-time data pipelines and ML deployment

PA, USA

4y exp

Vanguard

USC

“Ticketmaster data engineer who built CDC-driven Kafka pipelines feeding Snowflake for analytics and data science teams. Hands-on in production operations—scaled Kafka during sudden playoff-driven transaction spikes and improved monitoring for preemptive scaling. Known for using small-batch experiments and quantitative metrics to align stakeholders and drive cost-saving architecture changes (e.g., buffering to reduce AWS Lambda invocation frequency).”

Python, Java, C, C++, Scala, Go, +132 more

Grusha Shetty

Screened

Senior Data Analyst specializing in product analytics and experimentation

Berkeley, CA

3y exp

Games24x7

UC Berkeley

“Analytics candidate with strong product and growth analytics experience across SQL, Spark, Python, and Tableau. They have built clickstream funnel pipelines, automated Bayesian experiment evaluation, and used Markov chain journey modeling to uncover onboarding friction that led to a 5% conversion improvement. They also show strong cross-functional influence by standardizing churn definitions across product and marketing teams and operationalizing adoption in shared dashboards.”

Python, SQL, A/B Testing, Logistic Regression, Random Forest, Databricks, +44 more

Sairaghav Nissankula

Screened

Senior AI/ML Engineer specializing in LLMs, NLP, and enterprise conversational AI

Sunnyvale, CA

10y exp

Walmart

University of Illinois Urbana-Champaign

“ML/GenAI engineer with strong end-to-end production ownership across predictive ML, RAG systems, and LLM routing. They pair solid platform engineering skills with measurable business impact, including 15% churn reduction, 35% support ticket deflection, 45% GenAI cost savings, and a shared inference library that cut deployment time from weeks to days.”

Generative AI, Python, PyTorch, FAISS, Elasticsearch, LangChain, +149 more

Joseph Rivas

Screened

Senior AI/ML Engineer specializing in GenAI, MLOps, and computer vision

Boston, MA

9y exp

Jaxon.AI

Georgia Tech

“ML/AI engineer with hands-on ownership of production document intelligence and GenAI systems, spanning model experimentation, AWS deployment, monitoring, and iterative optimization. Stands out for turning document-heavy workflows into reliable, near real-time products with measurable gains in accuracy, latency, and manual-effort reduction, while also shipping citation-grounded RAG features that drove user trust and adoption.”

Agentic AI, Generative AI, Multi-Agent Systems, ReAct, Computer Vision, Natural Language Processing, +282 more

Darsh Sharma

Screened

Mid-level Software Engineer specializing in ML systems and microservices

Madison, WI

2y exp

Teradata

University of Wisconsin–Madison

“Teradata Text Security intern who built a production LLM-powered planner agent that decomposes complex tasks into dependency-aware subtasks (DAG/topological graph) and executes them via a custom orchestrator with parallelism, status tracking, and error handling. Also contributed to an HR-facing internal document chatbot concept to streamline onboarding, showing cross-functional collaboration.”

C, C++, Python, Java, SQL, PyTorch, +101 more

Kaustubh Rai

Screened

Junior Software Engineer specializing in scalable distributed systems and cloud platforms

Pittsburgh, PA

2y exp

eParts Services LLC

Carnegie Mellon University

“Backend engineer with experience at UnitedHealth Group redesigning a high-traffic Spring Boot microservice from blocking to reactive architecture during peak season, cutting median latency by 47% for a service used by ~10M customers annually. Strong in Kubernetes-based deployment/scaling and pragmatic rollout strategies (blue-green/incremental traffic shifting) with performance and database troubleshooting.”

.NET, Apache Hadoop, Apache Kafka, AWS, AWS Lambda, Azure Data Factory, +70 more

Sri Lekha Kandadai

Screened

Mid-level Machine Learning Engineer specializing in MLOps, monitoring, and multimodal AI

Kansas, USA

4y exp

Apple

University of Central Missouri

“ML/AI engineer focused on production-grade model reliability: built a monitoring and validation framework to detect drift, trigger anomaly alerts/retraining, and maintain consistent performance for device intelligence workflows at scale. Strong MLOps background with Python pipelines, Docker/Kubernetes deployments, Airflow orchestration, and real-time monitoring dashboards; experienced partnering with product managers to deliver business-facing insights.”

Python, SQL, R, C++, Java, Machine Learning, +85 more

Vikranth Gurram

Screened

Machine learning engineer and software developer with experience across fintech, e-commerce, and gaming

Dallas, Texas, USA

6y exp

Fidelity Investments

University of the Cumberlands

“ML/AI engineer with hands-on ownership of production systems spanning classical ML fraud detection and GenAI agent workflows. At Fidelity, they built an end-to-end fraud platform that improved review queue Precision@K by 15-20% while reducing false positives 10-15%, and they also shipped RAG-based agent systems that cut manual workflow effort by 30-40%.”

Python, SQL, C++, TypeScript, Machine Learning, Scikit-learn, +66 more

Nilesh Dixit

Screened

Executive AI engineering leader specializing in agentic AI and enterprise platforms

San Francisco, CA

24y exp

Zeehub AI

Centre for Development of Advanced Computing

“Bay Area engineering leader and startup co-founder with a rare mix of deep hands-on architecture experience, large-scale people leadership, and cross-functional product ownership. He helped launch GE Digital's industrial IoT efforts, holds multiple patents in the space, has scaled teams to 60-70 people, and has led both enterprise platform modernization and AI startup product development.”

Agentic AI, Multi-Agent Systems, Go-to-Market Strategy, AI Agents, LangChain, LangGraph, +87 more

Jiacheng Yin

Screened

Intern Software Engineer specializing in data engineering and AI agent systems

Beijing, China

1y exp

JD.com

Cornell University

“AI engineer at Anote.ai who built and shipped a production multi-agent LangGraph/LangChain/Ray RAG platform for enterprise search and workflow automation, supporting 3 commercial products and 100+ developers. Drove measurable gains (30% accuracy improvement, lower latency) and improved reliability with Redis-based state checkpointing, message-queue synchronization, and Milvus retrieval optimizations, while partnering with PMs/clients to add transparency features like confidence scores and real-time logs.”

.NET, Agile, AI agents, Amazon CloudFront, Angular, Anomaly detection, +158 more

Priya Kotha

Screened

Mid-level Data Engineer specializing in real-time pipelines across FinTech and Healthcare

USA

4y exp

Plaid

Sacred Heart University

“Data engineer at Plaid who built greenfield, end-to-end real-time transaction pipelines and FastAPI data services for fraud detection and analytics, handling millions of events per day. Strong focus on reliability and data integrity via Great Expectations validation, Airflow-based monitoring/SLAs, quarantine/staging patterns, and robust external data ingestion with schema versioning and backfills (reported 50% fewer anomalies and ~40% fewer failures).”

Python, SQL, Pandas, NumPy, Apache Spark, PySpark, +97 more

Manjory Saran

Screened

Senior Backend & Infrastructure Engineer specializing in cloud-native distributed systems

5y exp

Walmart

San José State University

“LLM infrastructure engineer who built a production-critical real-time personalization and memory retrieval system for a user-facing product, adding <100ms P99 latency while improving relevance ~20–25% and holding SLA through 3x traffic. Experienced designing tiered retrieval backends (Redis + vector store), deploying on Kubernetes with autoscaling/circuit breakers, and running rigorous observability, incident response, and agent evaluation (shadow traffic, A/B tests, regression/replay).”

API Design, Asynchronous Processing, AWS, AWS CloudFormation, Caching, CI/CD, +105 more

Feras Alsaiari

Screened

Senior Software Engineer specializing in AWS data platforms and event-driven systems

4y exp

Capital One

Georgia Tech

“Capital One engineer leading the architecture and delivery of a large-scale AWS Glue/Spark/Delta Lake batch messaging pipeline that decoupled batch from real-time flows, added multi-region failover and automated retries, and delivered ~40% AWS cost savings with ~3x performance gains. Currently building an LLM-powered Slack bot using RAG to automate message investigations by querying CloudWatch, Snowflake, and internal documentation with privacy-aware masking of NPI/PII.”

Python, Java, JavaScript, SQL, C, C++, +91 more

Pranav Puranik

Screened

Senior AI Engineer specializing in LLMs, RAG, and multimodal NLP

Austin, TX

5y exp

Health Care Service Corporation

University of Florida

“Built a production LLM/RAG assistant for insurance/health claims agents that ingests 100–200 page patient PDFs via OCR (migrated from local Tesseract to Azure Document Intelligence) and delivers grounded claim detail retrieval plus summaries with PII/PHI guardrails. Experienced orchestrating large workflows with Celery worker pipelines and AWS Step Functions (S3-triggered, Fargate-based batch inference/accuracy aggregation), and collaborates closely with non-technical SMEs (claims agents/nurses) through shadowing, iterative demos, and SME-defined evaluation.”

Python, SQL, Java, TypeScript, Bash, Unix, +120 more

Shanay Wadhwani

Screened

Mid-level Data Scientist specializing in NLP, computer vision, and applied ML

Washington, DC

6y exp

World Bank

Georgetown University

“AI/ML engineer with impactful work for the World Bank across both LLM systems and computer vision, including a PRAI evaluator-assistance platform and a production UNet model for slum detection from multispectral satellite imagery. Earlier built multilingual NLP-based borrower segmentation and credit scoring at Creditmate through its acquisition by Paytm, showing strong experience in ambiguous, high-impact environments.”

Python, SQL, R, JavaScript, Git, Pandas, +93 more
