Vetted PySpark Professionals

Pre-screened and vetted.

Sara Fang

Screened

Mid-level Software Engineer specializing in cloud data platforms and distributed systems

Remote | 6y exp
Terra Byte X | University of Delaware

Backend/data engineer with production experience building FastAPI services with strong reliability patterns (circuit breaker, rate limiting, caching, graceful degradation) and JWT/OAuth2 auth. Has delivered AWS EKS deployments via Terraform with Secrets Manager/IRSA and HPA autoscaling, and built Glue/Spark ETL pipelines on S3 Parquet with schema-evolution and idempotent reruns; also demonstrated measurable SQL tuning impact (20–30s to <10s).

DB

Staff Software Engineer specializing in Healthcare platforms and AI data pipelines

Remote | 10y exp
Drwell | Binghamton University

Backend/data engineer with hands-on production AWS experience spanning serverless APIs (Chalice/Lambda/API Gateway/Cognito) and data pipelines (Glue PySpark + Step Functions). Has modernized a legacy SAS reporting system into AWS microservices and implemented schema-drift detection and incident prevention for ETL workflows, plus measurable SQL tuning wins (30 min to <10 min runtime).

Shweta Chavan

Screened

Junior Computer Vision & ML Engineer specializing in autonomous perception systems

Pittsburgh, PA | 2y exp
Magna International | Carnegie Mellon University

LLM/RAG engineer who built a production-style multi-agent orchestrator for resume-to-recommendation workflows (PDF ingestion through screening and recommendations), emphasizing prompt tuning and strict JSON output contracts. Currently building a RAG application for an NGO using Airflow (DAGs + embeddings) and tackling messy, missing/imbalanced data; has hands-on retrieval stack experience (FAISS/HNSW, bge embeddings) and uses rigorous evaluation metrics for groundedness and hallucination control.

YR

Senior Data Engineer specializing in cloud-native data pipelines and lakehouse platforms

6y exp
Microsoft | University of North Texas

Data engineer at Microsoft who owned an end-to-end subscription analytics platform processing 7TB+ daily across 40+ pipelines, combining ADF batch ingestion with Kafka/Spark streaming and rigorous Great Expectations quality gates. Built a Fabric-based self-service ingestion platform with CI/CD and observability, plus a Databricks feature store serving near-real-time ML inference with Delta Lake reliability and versioning.

Karthikbabu Vadloori

Mid-level Full-Stack Developer specializing in Spring Boot, React, and cloud microservices

San Francisco, CA | 5y exp
Meta | University of Texas at Arlington

Backend engineer with experience at Meta and Accenture building regulated-data systems (healthcare/financial) using Python/Flask and Postgres. Has scaled high-throughput services to millions of daily requests, delivering measurable latency wins (~40% API latency reduction; ~35% faster DB-backed endpoints), and has productionized ML inference services using Docker/Kubernetes and AWS (ECS/SageMaker).

Asrith Velireddy

Mid-level AI/ML Engineer specializing in MLOps, LLMs, and scalable ML systems

Harrison, NJ | 4y exp
Adobe | NJIT

ML/LLM engineer at Adobe who deployed a transformer-based personalization and campaign-targeting recommender system end-to-end, including PySpark/Airflow pipelines processing 12M+ events/day and containerized inference on AWS SageMaker (Docker/Kubernetes). Also has hands-on LLM workflow experience (RAG, semantic search, prompt optimization, hallucination mitigation) with a metrics-driven approach to reliability, drift monitoring, and reproducible retraining via MLflow.

HR

Mid-level Data Analytics professional specializing in BI, data engineering, and applied AI

California, USA | 6y exp
Amazon | San Jose State University

Built GenMedX, a multi-module clinical AI system for emergency department decision support spanning triage prediction, diagnosis, medication Q&A, and visit summarization. Stands out for combining medical LLM fine-tuning, RAG, and rigorous evaluation/monitoring to drive a major triage recall improvement from 38.5% to 76.6%, with a strong focus on safety, edge-case detection, and production reliability.

Praveen V

Screened

Mid-Level Software Engineer specializing in Generative AI and RAG systems

Remote, USA | 5y exp
Meta | University of North Carolina at Charlotte

Built a production RAG-based natural-language-to-SQL system at Global Atlantic to replace slow, expensive manual analytics ticket workflows, focusing heavily on retrieval quality and measurable evaluation (200-question ground-truth set; recall@5 improved 0.65→0.78 via semantic chunking). Also built a custom MCP-style agent orchestrator for a personal project (arxiv-ai) to improve flexibility and Langfuse-aligned observability, and has hands-on experience with LangGraph, CrewAI, and n8n.

Sagnik Mazumder

Executive ML/AI Founder specializing in agentic analytics and data infrastructure

10y exp
Photosphere Labs | University of Texas at Dallas

Founder of Photosphere Labs (agentic AI for ecommerce data synthesis/analysis) who worked directly with customers to scope, build, demo, and iterate LLM-based solutions, including an AI chat product for brand owners. Previously at Block, built and explained a nuanced causal inference/propensity model tied to Square POS integrations, translating model specs and outputs into business impact for varied client contexts.

PT

Mid-level Software Engineer specializing in full-stack backend systems and FinTech

Austin, TX | 4y exp
Intuit | University of Central Missouri

Engineer who uses AI thoughtfully as a productivity multiplier rather than a crutch, with hands-on experience applying agent-based workflows to coding, debugging, documentation, and testing. Particularly strong in rapid backend and data-processing development, with a clear emphasis on validation, architecture, and scalability.

Joseph Lee

Screened

Staff Software Engineer specializing in cloud platforms for healthcare and financial workflows

Dallas, TX | 10y exp
Optum | University of Texas at Dallas

Backend/data engineer with Optum healthcare claims domain experience building high-reliability Python microservices (FastAPI/Kafka/Postgres) and AWS data platforms (EKS, Glue, Redshift). Demonstrated strong production ownership: fixed duplicate Kafka processing via transactional outbox/idempotency, scaled to millions of daily events, and delivered major SQL performance gains (40+ min to <5 min, ~60% CPU reduction). Seeking remote-only work; targets $130k base.

CK

Senior Software Engineer specializing in Python, cloud platforms, and distributed systems

Nashville, TN | 13y exp
i3 Verticals | University of Chicago

Backend/data engineer with production experience at Walmart and HealthSnap building Python services and data pipelines on AWS (EKS, Lambda, Glue, Airflow). Strong reliability and operations focus: implemented idempotency and circuit breakers to fix peak-traffic consistency issues, and set up GitOps CI/CD and observability. Demonstrated measurable performance wins (Postgres p95 45s to <5s, ~60% CPU reduction) and modernized SAS batch workflows to Python with parallel-run parity validation and feature-flagged rollout.

Chappidi Sasi

Screened

Mid-level Machine Learning Engineer specializing in GPU-accelerated LLM training and inference

Bay Area, CA | 5y exp
NVIDIA | Webster University

ML/LLM engineer with production experience building a multi-GPU LLM inference platform using TensorRT and vLLM, achieving ~40% p95 latency reduction through batching/KV caching, quantization, and CUDA/runtime tuning. Also has end-to-end orchestration experience (Kubernetes, Airflow) and has delivered real-time fraud detection systems at Accenture in close collaboration with non-technical risk and product stakeholders.

Greg Farhadian

Senior Software Engineer specializing in cloud data platforms and Java microservices

Remote | 4y exp
IBM | UC Irvine

Backend/data engineer with experience building Kafka-driven real-time pipelines that support ML code deployment and downstream integrations. Currently migrating high-throughput mainframe (COBOL/assembly) processing to Java, using Spark/Databricks to preserve performance and employing rigorous A/B testing across dev/pre-prod/prod with years of historical data.

Chaithanya Konda

Mid-level Data Engineer specializing in multi-cloud analytics platforms

Waltham, MA | 6y exp
Fresenius Medical Care | University of Arizona

Data engineer with hands-on GCP platform experience spanning BigQuery, Cloud SQL, Dataflow, and Cloud Composer, including both production operations and cloud migration work. They led a migration from legacy SQL Server/Oracle systems to a cloud-native BigQuery architecture and cite measurable impact: processing reduced from hours to minutes, query latency improved 60%+, and ingestion time improved 40%.

NH

Mid-level Full-Stack Engineer specializing in cloud-native data and enterprise platforms

USA | 5y exp
Amazon | University of Cincinnati

Software engineer with practical, day-to-day experience embedding AI into development workflows across coding, testing, code review, and AWS data pipelines. Uses tools like Claude, Cline, JUnit, Mockito, and Amazon Bedrock, and stands out for having a realistic, mature view of agent limitations, hallucinations, and the need for strong prompting and human validation.

Srilatha Bala

Mid-level Python Developer specializing in cloud data engineering and ETL/real-time pipelines

USA | 6y exp
Intel | Campbellsville University

RS

Mid-level AI & ML Engineer specializing in NLP, LLMs, and scalable ML systems

Cupertino, CA | 6y exp
Apple | Visvesvaraya Technological University

YK

Junior Machine Learning Engineer specializing in LLMs and retrieval-augmented generation

Pittsburgh, PA | 3y exp
Panasonic | Carnegie Mellon University

Yashwant Dontam

Mid-level Data Analytics Engineer specializing in cloud data platforms and FinTech

Boston, MA | 5y exp
Amazon | Northeastern University

NP

Mid-level Backend/Full-Stack Software Engineer specializing in cloud-native microservices and APIs

San Francisco, CA | 5y exp
Meta | Concordia University Wisconsin

Emma Nguyen

Mid-level Software Engineer specializing in backend microservices, data pipelines, and QA

Foster City, CA | 5y exp
Accenture | University of Pennsylvania

Ravi Kaswan

Senior AI/ML Software Engineer specializing in LLM and RAG systems

New York City, NY | 5y exp
Palantir | University of Texas at Dallas

Sicheng Cao

Junior Software Engineer specializing in backend and distributed systems

Mountain View, CA | 2y exp
Intuit | Rice University