Vetted Apache Spark Professionals

Pre-screened and vetted.

Sanketh Reddy

Screened

Senior Data Engineer specializing in cloud data platforms and large-scale ETL

Jersey City, NJ | 6y exp
JPMorgan Chase | University of Texas at Dallas

Data engineer focused on large-scale ETL/ELT pipelines across cloud stacks (GCP and AWS), including Spark-based transformations and orchestration with Airflow. Has experience loading up to ~2TB per BigQuery target table and designing atomic loads to multiple downstream systems (Elasticsearch + Kafka), with Kubernetes deployment and Jenkins CI/CD.

Vikas Ravula

Screened

Senior Data Engineer specializing in cloud data platforms and real-time streaming for financial services

Chicago, IL | 6y exp
Bloomberg | University of Illinois Urbana-Champaign

Data engineer with experience at Bloomberg, UBS, and Bank of America building high-volume financial data platforms and services. Owned an end-to-end pipeline processing ~150–200M records/day (Kafka/Cassandra/S3 → Spark/PySpark → Snowflake) with strong data quality controls and Airflow reliability practices, reporting ~99% reliability and major performance gains. Also built large-scale external API ingestion with compliance-minded rate limiting, schema versioning, and quarantine/validation layers.

SR

Senior Infrastructure Platform Architect specializing in Kubernetes and hybrid cloud

Chicago, IL | 9y exp
Exelon | George Mason University

Platform/infra engineer with strong ownership of Kubernetes on VMware and day-to-day hybrid on-prem-to-AWS operations. Has hands-on experience automating infrastructure delivery with Terraform/Ansible/CI-CD, and has resolved real production issues spanning CSI storage reattachment during upgrades, vSphere storage-latency performance degradation, and hybrid connectivity/routing failures with improved validation, monitoring, and failover.

Poorna Pedapudi

Mid-Level Software Engineer specializing in distributed backend systems and cloud-native microservices

Seattle, WA | 5y exp
Uber | George Mason University

Software engineer focused on data platforms and applied LLM systems: built an internal data quality monitoring layer to catch silent data drift and iterated post-launch after finding ~30% false-positive alerts, reducing noise via dynamic baselines and improved structured logging. Also shipped a production RAG-based internal knowledge assistant over Jira/Confluence with citations, confidence-based fallbacks, and nightly automated evals to prevent regressions.

Xinyuan Lin

Screened

Intern Software Engineer specializing in LLMs, RAG, and full-stack systems

San Jose, CA | 1y exp
eBay | University of Washington

Built and productionized a multi-agent LLM analytics assistant at eBay that routes natural-language questions to retrieval or text-to-SQL, dynamically retrieves relevant schemas via a vector DB, and executes against a data warehouse. Drove a major quality lift (text-to-SQL accuracy 60%→85%) and materially reduced time engineers/PMs spent getting data insights through strong eval/monitoring, tracing, and reliability-focused design (schema retrieval, strict JSON outputs, retries/clarifications).

SB

Mid-level Software Engineer specializing in cloud backend and distributed systems

Seattle, WA | 3y exp
Amazon | USC

Built a production GenAI support agent at Amazon for FBA on-call operations, using Bedrock, Lambda, RAG, and confidence-based human fallback to safely automate ticket triage. The system materially reduced ticket volume and manual workload while improving MTTR, showing strong depth in reliable LLM agent architecture under real operational constraints.

Anuj Vakil

Screened

Mid-level Software Engineer specializing in distributed data infrastructure

Palo Alto, CA | 3y exp
Amazon | San Jose State University

Engineer who uses AI in a disciplined, practical way—leveraging it to speed debugging, generate edge-case tests, and improve coverage while retaining ownership of system design and production validation. Has experimented with chained AI tools but prefers simpler workflows when they reduce noise and review overhead.

Alexander Smith

Junior Software Engineer and Data Scientist specializing in AI/ML systems

California, USA | 3y exp
Dun & Bradstreet | UC Berkeley

Built production-grade automation and ML/data pipelines at Dun & Bradstreet and ThreadNotion, spanning large-scale document classification, country risk report automation, and resilient Playwright testing for dynamic AI chat workflows. Particularly strong in turning brittle or ambiguous systems into reliable, observable, end-to-end automated platforms.

Shuju Sun

Screened

Mid-Level Software Engineer specializing in real-time data pipelines and ML deployment

PA, USA | 4y exp
Vanguard | USC

Ticketmaster data engineer who built CDC-driven Kafka pipelines feeding Snowflake for analytics and data science teams. Hands-on in production operations—scaled Kafka during sudden playoff-driven transaction spikes and improved monitoring for preemptive scaling. Known for using small-batch experiments and quantitative metrics to align stakeholders and drive cost-saving architecture changes (e.g., buffering to reduce AWS Lambda invocation frequency).

Grusha Shetty

Screened

Senior Data Analyst specializing in product analytics and experimentation

Berkeley, CA | 3y exp
Games24x7 | UC Berkeley

Analytics candidate with strong product and growth analytics experience across SQL, Spark, Python, and Tableau. They have built clickstream funnel pipelines, automated Bayesian experiment evaluation, and used Markov chain journey modeling to uncover onboarding friction that led to a 5% conversion improvement. They also show strong cross-functional influence by standardizing churn definitions across product and marketing teams and operationalizing adoption in shared dashboards.

SN

Senior AI/ML Engineer specializing in LLMs, NLP, and enterprise conversational AI

Sunnyvale, CA | 10y exp
Walmart | University of Illinois Urbana-Champaign

ML/GenAI engineer with strong end-to-end production ownership across predictive ML, RAG systems, and LLM routing. They pair solid platform engineering skills with measurable business impact, including 15% churn reduction, 35% support ticket deflection, 45% GenAI cost savings, and a shared inference library that cut deployment time from weeks to days.

Joseph Rivas

Screened

Senior AI/ML Engineer specializing in GenAI, MLOps, and computer vision

Boston, MA | 9y exp
Jaxon.AI | Georgia Tech

ML/AI engineer with hands-on ownership of production document intelligence and GenAI systems, spanning model experimentation, AWS deployment, monitoring, and iterative optimization. Stands out for turning document-heavy workflows into reliable, near real-time products with measurable gains in accuracy, latency, and manual-effort reduction, while also shipping citation-grounded RAG features that drove user trust and adoption.

Darsh Sharma

Screened

Mid-level Software Engineer specializing in ML systems and microservices

Madison, WI | 2y exp
Teradata | University of Wisconsin–Madison

Teradata Text Security intern who built a production LLM-powered planner agent that decomposes complex tasks into dependency-aware subtasks (DAG/topological graph) and executes them via a custom orchestrator with parallelism, status tracking, and error handling. Also contributed to an HR-facing internal document chatbot concept to streamline onboarding, showing cross-functional collaboration.

Kaustubh Rai

Screened

Junior Software Engineer specializing in scalable distributed systems and cloud platforms

Pittsburgh, PA | 2y exp
eParts Services LLC | Carnegie Mellon University

Backend engineer with experience at UnitedHealth Group redesigning a high-traffic Spring Boot microservice from blocking to reactive architecture during peak season, cutting median latency by 47% for a service used by ~10M customers annually. Strong in Kubernetes-based deployment/scaling and pragmatic rollout strategies (blue-green/incremental traffic shifting) with performance and database troubleshooting.

SL

Mid-level Machine Learning Engineer specializing in MLOps, monitoring, and multimodal AI

Kansas, USA | 4y exp
Apple | University of Central Missouri

ML/AI engineer focused on production-grade model reliability: built a monitoring and validation framework to detect drift, trigger anomaly alerts/retraining, and maintain consistent performance for device intelligence workflows at scale. Strong MLOps background with Python pipelines, Docker/Kubernetes deployments, Airflow orchestration, and real-time monitoring dashboards; experienced partnering with product managers to deliver business-facing insights.

VG

Machine learning engineer and software developer with experience across fintech, e-commerce, and gaming.

Dallas, Texas, USA | 6y exp
Fidelity Investments | University of the Cumberlands

ML/AI engineer with hands-on ownership of production systems spanning classical ML fraud detection and GenAI agent workflows. At Fidelity, they built an end-to-end fraud platform that improved review queue Precision@K by 15-20% while reducing false positives 10-15%, and they also shipped RAG-based agent systems that cut manual workflow effort by 30-40%.

Nilesh Dixit

Screened

Executive AI engineering leader specializing in agentic AI and enterprise platforms

San Francisco, CA | 24y exp
Zeehub AI | Centre for Development of Advanced Computing

Bay Area engineering leader and startup co-founder with a rare mix of deep hands-on architecture experience, large-scale people leadership, and cross-functional product ownership. He helped launch GE Digital's industrial IoT efforts, holds multiple patents in the space, has scaled teams to 60-70 people, and has led both enterprise platform modernization and AI startup product development.

Jiacheng Yin

Screened

Intern Software Engineer specializing in data engineering and AI agent systems

Beijing, China | 1y exp
JD.com | Cornell University

AI engineer at Anote.ai who built and shipped a production multi-agent LangGraph/LangChain/Ray RAG platform for enterprise search and workflow automation, supporting 3 commercial products and 100+ developers. Drove measurable gains (30% accuracy improvement, lower latency) and improved reliability with Redis-based state checkpointing, message-queue synchronization, and Milvus retrieval optimizations, while partnering with PMs/clients to add transparency features like confidence scores and real-time logs.

Priya Kotha

Screened

Mid-level Data Engineer specializing in real-time pipelines across FinTech and Healthcare

USA | 4y exp
Plaid | Sacred Heart University

Data engineer at Plaid who built greenfield, end-to-end real-time transaction pipelines and FastAPI data services for fraud detection and analytics, handling millions of events per day. Strong focus on reliability and data integrity via Great Expectations validation, Airflow-based monitoring/SLAs, quarantine/staging patterns, and robust external data ingestion with schema versioning and backfills (reported 50% fewer anomalies and ~40% fewer failures).

Manjory Saran

Screened

Senior Backend & Infrastructure Engineer specializing in cloud-native distributed systems

5y exp
Walmart | San José State University

LLM infrastructure engineer who built a production-critical real-time personalization and memory retrieval system for a user-facing product, adding <100ms P99 latency while improving relevance ~20–25% and holding SLA through 3x traffic. Experienced designing tiered retrieval backends (Redis + vector store), deploying on Kubernetes with autoscaling/circuit breakers, and running rigorous observability, incident response, and agent evaluation (shadow traffic, A/B tests, regression/replay).

Feras Alsaiari

Senior Software Engineer specializing in AWS data platforms and event-driven systems

4y exp
Capital One | Georgia Tech

Capital One engineer leading the architecture and delivery of a large-scale AWS Glue/Spark/Delta Lake batch messaging pipeline that decoupled batch from real-time flows, added multi-region failover and automated retries, and delivered ~40% AWS cost savings with ~3x performance gains. Currently building an LLM-powered Slack bot using RAG to automate message investigations by querying CloudWatch, Snowflake, and internal documentation with privacy-aware masking of NPI/PII.

Pranav Puranik

Senior AI Engineer specializing in LLMs, RAG, and multimodal NLP

Austin, TX | 5y exp
Health Care Service Corporation | University of Florida

Built a production LLM/RAG assistant for insurance/health claims agents that ingests 100–200 page patient PDFs via OCR (migrated from local Tesseract to Azure Document Intelligence) and delivers grounded claim detail retrieval plus summaries with PII/PHI guardrails. Experienced orchestrating large workflows with Celery worker pipelines and AWS Step Functions (S3-triggered, Fargate-based batch inference/accuracy aggregation), and collaborates closely with non-technical SMEs (claims agents/nurses) through shadowing, iterative demos, and SME-defined evaluation.

Shanay Wadhwani

Mid-level Data Scientist specializing in NLP, computer vision, and applied ML

Washington, DC | 6y exp
World Bank | Georgetown University

AI/ML engineer with impactful work for the World Bank across both LLM systems and computer vision, including a PRAI evaluator-assistance platform and a production UNet model for slum detection from multispectral satellite imagery. Earlier built multilingual NLP-based borrower segmentation and credit scoring at Creditmate through its acquisition by Paytm, showing strong experience in ambiguous, high-impact environments.
