Vetted PySpark Professionals

Pre-screened and vetted.

PySpark, Python, SQL, Docker, AWS, CI/CD

Sara Fang

Screened

Mid-level Software Engineer specializing in cloud data platforms and distributed systems

Remote · 6y exp

Terra Byte X · University of Delaware

“Backend/data engineer with production experience building FastAPI services with strong reliability patterns (circuit breaker, rate limiting, caching, graceful degradation) and JWT/OAuth2 auth. Has delivered AWS EKS deployments via Terraform with Secrets Manager/IRSA and HPA autoscaling, and built Glue/Spark ETL pipelines on S3 Parquet with schema-evolution and idempotent reruns; also demonstrated measurable SQL tuning impact (20–30s to <10s).”

Java, Python, Scala, Go, SQL, JavaScript, +101 more

Damik Bermudez

Screened

Staff Software Engineer specializing in Healthcare platforms and AI data pipelines

Remote · 10y exp

Drwell · Binghamton University

“Backend/data engineer with hands-on production AWS experience spanning serverless APIs (Chalice/Lambda/API Gateway/Cognito) and data pipelines (Glue PySpark + Step Functions). Has modernized a legacy SAS reporting system into AWS microservices and implemented schema-drift detection and incident prevention for ETL workflows, plus measurable SQL tuning wins (30 min to <10 min runtime).”

Python, JavaScript, TypeScript, C#, Django, Flask, +93 more

Shweta Chavan

Screened

Junior Computer Vision & ML Engineer specializing in autonomous perception systems

Pittsburgh, PA · 2y exp

Magna International · Carnegie Mellon University

“LLM/RAG engineer who built a production-style multi-agent orchestrator for resume-to-recommendation workflows (PDF ingestion through screening and recommendations), emphasizing prompt tuning and strict JSON output contracts. Currently building a RAG application for an NGO using Airflow (DAGs + embeddings) and tackling messy, missing/imbalanced data; has hands-on retrieval stack experience (FAISS/HNSW, bge embeddings) and uses rigorous evaluation metrics for groundedness and hallucination control.”

Python, C++, OpenCV, MATLAB, PyTorch, TensorFlow, +126 more

Yashwanth Reddy

Screened

Senior Data Engineer specializing in cloud-native data pipelines and lakehouse platforms

6y exp

Microsoft · University of North Texas

“Data engineer at Microsoft who owned an end-to-end subscription analytics platform processing 7TB+ daily across 40+ pipelines, combining ADF batch ingestion with Kafka/Spark streaming and rigorous Great Expectations quality gates. Built a Fabric-based self-service ingestion platform with CI/CD and observability, plus a Databricks feature store serving near-real-time ML inference with Delta Lake reliability and versioning.”

Amazon Athena, Amazon CloudWatch, Amazon DynamoDB, Amazon EC2, Amazon EKS, Amazon Kinesis, +136 more

Karthikbabu Vadloori

Screened

Mid-level Full-Stack Developer specializing in Spring Boot, React, and cloud microservices

San Francisco, CA · 5y exp

Meta · University of Texas at Arlington

“Backend engineer with experience at Meta and Accenture building regulated-data systems (healthcare/financial) using Python/Flask and Postgres. Has scaled high-throughput services to millions of daily requests, delivering measurable latency wins (~40% API latency reduction; ~35% faster DB-backed endpoints), and has productionized ML inference services using Docker/Kubernetes and AWS (ECS/SageMaker).”

Agile, Ansible, AWS CodePipeline, AWS Lambda, Azure App Service, Azure Functions, +165 more

Asrith Velireddy

Screened

Mid-level AI/ML Engineer specializing in MLOps, LLMs, and scalable ML systems

Harrison, NJ · 4y exp

Adobe · NJIT

“ML/LLM engineer at Adobe who deployed a transformer-based personalization and campaign-targeting recommender system end-to-end, including PySpark/Airflow pipelines processing 12M+ events/day and containerized inference on AWS SageMaker (Docker/Kubernetes). Also has hands-on LLM workflow experience (RAG, semantic search, prompt optimization, hallucination mitigation) with a metrics-driven approach to reliability, drift monitoring, and reproducible retraining via MLflow.”

A/B Testing, Apache Airflow, Auto Scaling, AWS, AWS IAM, AWS Lambda, +123 more

Hamsalakshmi Ramachandran

Screened

Mid-level Data Analytics professional specializing in BI, data engineering, and applied AI

California, USA · 6y exp

Amazon · San Jose State University

“Built GenMedX, a multi-module clinical AI system for emergency department decision support spanning triage prediction, diagnosis, medication Q&A, and visit summarization. Stands out for combining medical LLM fine-tuning, RAG, and rigorous evaluation/monitoring to drive a major triage recall improvement from 38.5% to 76.6%, with a strong focus on safety, edge-case detection, and production reliability.”

SQL, PostgreSQL, MySQL, Snowflake, Python, Pandas, +167 more

Praveen V

Screened

Mid-Level Software Engineer specializing in Generative AI and RAG systems

Remote, USA · 5y exp

Meta · University of North Carolina at Charlotte

“Built a production RAG-based natural-language-to-SQL system at Global Atlantic to replace slow, expensive manual analytics ticket workflows, focusing heavily on retrieval quality and measurable evaluation (200-question ground-truth set; recall@5 improved 0.65→0.78 via semantic chunking). Also built a custom MCP-style agent orchestrator for a personal project (arxiv-ai) to improve flexibility and Langfuse-aligned observability, and has hands-on experience with LangGraph, CrewAI, and n8n.”

Python, Java, C#, JavaScript, TypeScript, PostgreSQL, +105 more

Sagnik Mazumder

Screened

Executive ML/AI Founder specializing in agentic analytics and data infrastructure

10y exp

Photosphere Labs · University of Texas at Dallas

“Founder of Photosphere Labs (agentic AI for ecommerce data synthesis/analysis) who worked directly with customers to scope, build, demo, and iterate LLM-based solutions, including an AI chat product for brand owners. Previously at Block, built and explained a nuanced causal inference/propensity model tied to Square POS integrations, translating model specs and outputs into business impact for varied client contexts.”

A/B Testing, Agentic AI, AWS, AWS Glue, BERT, Data Analysis, +63 more

Pranjal Tiwari

Screened

Mid-level Software Engineer specializing in full-stack backend systems and FinTech

Austin, TX · 4y exp

Intuit · University of Central Missouri

“Engineer who uses AI thoughtfully as a productivity multiplier rather than a crutch, with hands-on experience applying agent-based workflows to coding, debugging, documentation, and testing. Particularly strong in rapid backend and data-processing development, with a clear emphasis on validation, architecture, and scalability.”

Python, Django, Flask, FastAPI, Node.js, React, +174 more

Joseph Lee

Screened

Staff Software Engineer specializing in cloud platforms for healthcare and financial workflows

Dallas, TX · 10y exp

Optum · University of Texas at Dallas

“Backend/data engineer with Optum healthcare claims domain experience building high-reliability Python microservices (FastAPI/Kafka/Postgres) and AWS data platforms (EKS, Glue, Redshift). Demonstrated strong production ownership: fixed duplicate Kafka processing via transactional outbox/idempotency, scaled to millions of daily events, and delivered major SQL performance gains (40+ min to <5 min, ~60% CPU reduction). Seeking remote-only work; targets $130k base.”

React, Next.js, Angular, Vue.js, TypeScript, JavaScript, +167 more

Christopher Khan

Screened

Senior Software Engineer specializing in Python, cloud platforms, and distributed systems

Nashville, TN · 13y exp

i3 Verticals · University of Chicago

“Backend/data engineer with production experience at Walmart and HealthSnap building Python services and data pipelines on AWS (EKS, Lambda, Glue, Airflow). Strong reliability and operations focus—implemented idempotency + circuit breakers for peak-traffic consistency issues, GitOps CI/CD, and observability. Demonstrated measurable performance wins (Postgres p95 45s to <5s, ~60% CPU reduction) and modernized SAS batch workflows to Python with parallel-run parity validation and feature-flagged rollout.”

Python, R, Django, Flask, FastAPI, React, +153 more

Chappidi Sasi

Screened

Mid-level Machine Learning Engineer specializing in GPU-accelerated LLM training and inference

Bay Area, CA · 5y exp

NVIDIA · Webster University

“ML/LLM engineer with production experience building a multi-GPU LLM inference platform using TensorRT and vLLM, achieving ~40% p95 latency reduction through batching/KV caching, quantization, and CUDA/runtime tuning. Also has end-to-end orchestration experience (Kubernetes, Airflow) and has delivered real-time fraud detection systems at Accenture in close collaboration with non-technical risk and product stakeholders.”

A/B Testing, Apache Spark, AWS, AWS Lambda, BigQuery, Claude, +141 more

Greg Farhadian

Screened

Senior Software Engineer specializing in cloud data platforms and Java microservices

Remote · 4y exp

IBM · UC Irvine

“Backend/data engineer with experience building Kafka-driven real-time pipelines that support ML code deployment and downstream integrations. Currently migrating high-throughput mainframe (COBOL/assembly) processing to Java, using Spark/Databricks to preserve performance and employing rigorous A/B testing across dev/pre-prod/prod with years of historical data.”

Java, Spring Boot, Python, PySpark, JavaScript, React, +59 more

Chaithanya Konda

Screened

Mid-level Data Engineer specializing in multi-cloud analytics platforms

Waltham, MA · 6y exp

Fresenius Medical Care · University of Arizona

“Data engineer with hands-on GCP platform experience spanning BigQuery, Cloud SQL, Dataflow, and Cloud Composer, including both production operations and cloud migration work. They led a migration from legacy SQL Server/Oracle systems to a cloud-native BigQuery architecture and cite measurable impact: processing reduced from hours to minutes, query latency improved 60%+, and ingestion time improved 40%.”

Python, PySpark, SQL, Java, Scala, Microsoft Azure, +177 more

Naga Harshita B

Screened

Mid-level Full-Stack Engineer specializing in cloud-native data and enterprise platforms

USA · 5y exp

Amazon · University of Cincinnati

“Software engineer with practical, day-to-day experience embedding AI into development workflows across coding, testing, code review, and AWS data pipelines. Uses tools like Claude, Cline, JUnit, Mockito, and Amazon Bedrock, and stands out for having a realistic, mature view of agent limitations, hallucinations, and the need for strong prompting and human validation.”

Java, Scala, Python, SQL, JavaScript, TypeScript, +157 more
