Vetted PySpark Professionals

Pre-screened and vetted.

PySpark, Python, SQL, Docker, AWS, CI/CD

Harikiran Jangam

Screened

Mid-level AI/ML Engineer specializing in NLP, LLMs, and RAG systems

California, USA | 3y exp

McKesson | California Lutheran University

“Backend engineer who built and evolved a PHI-compliant RAG system (FastAPI + LangChain + embeddings/FAISS) for internal document search and summarization, delivering <400ms p95 latency at ~2,500 daily requests and measurable impact (30% faster investigations, +17% retrieval relevance). Demonstrates strong security and rollout discipline (RBAC/RLS/JWT, redaction/audits, shadow mode, dual writes, canaries) and a focus on reducing hallucination risk via grounded guardrails and confidence-based fallbacks.”

AI Agents, Amazon Bedrock, Apache Airflow, Apache Kafka, Apache Spark, AWS, +119 more

Ketan Verma

Screened

Junior Applied AI Engineer specializing in data pipelines and ML systems

College Station, TX | 2y exp

Elysi | Texas A&M University

“Built an end-to-end wafer-data anomaly detection and reporting system at Samsung using PySpark, Random Forest models, SQL, and Grafana to help engineers track faults and take corrective action. Also has strong UX prototyping and validation practices in Figma plus hands-on front-end/full-stack experience (HTML/CSS/TypeScript), including a student project recognized as best design out of 25 teams, and early-stage startup experience pivoting a product based on user interviews into a real-time in-context feedback overlay.”

Python, SQL, C++, Java, Git, PySpark, +59 more

Abhishek Soni

Screened

Mid-level Full-Stack Developer specializing in React and scalable web applications

Mumbai, India | 3y exp

Taurus Technologies | Dr. A. P. J. Abdul Kalam Technical University

“Backend/data engineer with hands-on production experience across FastAPI microservices and AWS data platforms. Has delivered serverless and Glue/EMR-based ETL pipelines with strong observability (Prometheus/Grafana/Sentry, CloudWatch/SNS), schema-evolution resilience, and measurable SQL performance wins (5 min to <30 sec). Open to onsite meetings in the Bethesda, MD area and flexible on remote arrangements.”

JavaScript, TypeScript, Python, Java, C++, C, +80 more

Nisarg Shah

Screened

Junior Machine Learning Engineer specializing in geospatial analytics and computer vision

Tempe, Arizona | 1y exp

Arizona State University | Arizona State University

“Built and evolved a geospatial ETL + API platform that processes pixel-wise satellite imagery in PostgreSQL/PostGIS into low-latency farm-level time-series metrics for an interactive dashboard, using precomputed hotspot analysis to reduce latency by 75–80%. Experienced in FastAPI-style API contract design (OpenAPI), caching, server-side filtering/compression, and production-minded security patterns (RBAC, session-derived authorization, password hashing) with disciplined rollback/versioning practices.”

Python, Java, JavaScript, TypeScript, React, SQL, +102 more

Pranita Agrawal

Screened

Mid-level Software Engineer specializing in Java microservices and AWS

California, USA | 5y exp

City of Modesto | Wayne State University

“TypeScript backend/full-stack engineer who owned an internal business workflow platform end-to-end in production, including API/data design, relational DB integration, and enterprise integrations. Has hands-on experience operating workflow processing services with Kafka-style event-driven patterns, idempotency, exponential backoff retries, dead-letter queues, and strong observability, plus API design with OpenAPI/Swagger and token-based auth.”

Java, Python, JavaScript, SQL, C#, Bash, +116 more

Prasanth Sai

Screened

Mid-level Data Engineer specializing in cloud lakehouse/warehouse pipelines

4y exp

Wells Fargo | Christian Brothers University

“Data engineer with HCA Healthcare experience building and operating end-to-end AWS-based pipelines for clinical and operational reporting (50–100 GB/day), serving curated data into Redshift/Snowflake for Power BI/Tableau. Emphasizes production reliability (Airflow SLAs/retries/alerting, logging/observability) and strong data quality controls (reconciliations, schema/null/duplicate checks), and has shipped versioned REST APIs to expose warehouse data to downstream systems.”

Amazon EC2, Amazon EKS, Amazon Kinesis, Amazon Redshift, Amazon S3, Ansible, +98 more

Shashank R

Screened

Senior Data Engineer specializing in cloud data platforms and real-time analytics

Las Vegas, NV | 6y exp

Credit One Bank | University of North Texas

“Data engineer (Credit One) who built and owned real-time financial transaction and credit risk/fraud data systems end-to-end on AWS + Snowflake. Delivered high-scale pipelines (150k events/hour; ~2TB/week), raised data accuracy to 99%, and cut Snowflake costs 42% while adding strong observability, schema-drift handling, and production-grade APIs/documentation.”

Agile, Amazon EC2, Amazon Redshift, Amazon RDS, Amazon S3, Apache Airflow, +199 more

Esha Pothukanuru

Screened

Mid-level Data Engineer specializing in cloud lakehouse platforms and ETL/ELT

Charlotte, NC | 4y exp

Accenture | University of North Carolina at Charlotte

“Accenture data engineer who greenfielded a supply-chain lakehouse platform, building an end-to-end medallion/Delta pipeline ingesting ~1.4TB/day from 17+ ERP/WMS/TMS/shipment sources. Delivered Gold datasets to Redshift/Synapse/Databricks SQL powering Power BI/Tableau with a 99.5% SLA, while cutting runtime 30% and cloud costs 16% through Spark/Delta optimizations and robust data quality controls.”

Python, PySpark, SQL, Bash, Apache Spark, Databricks, +126 more

Ashrita Mishra

Screened

Mid-level Data Analyst specializing in analytics, ETL, and cloud data platforms

Jersey City, NJ | 4y exp

Citigroup | Pace University

“Data analyst with 4 years of experience spanning banking and retail/marketing analytics. Has hands-on experience building churn analytics pipelines in SQL and Python, optimizing large-query performance, and turning stakeholder-aligned metrics into recurring dashboards and business actions.”

SDLC, Agile, Waterfall, Python, Java, R, +88 more

Shashwat Negi

Screened

Mid-level Software Engineer specializing in AI/ML and full-stack systems

San Jose, CA | 3y exp

Infrrd | University of Wisconsin–Madison

“Data Scientist (2–3 years) at ZS Associates who has built and productionized agentic LLM systems, including a LangGraph-based multi-LLM prompt-optimization pipeline for entity extraction deployed as a Spring Boot microservice via Jenkins. Also built an Insightmate.ai chatbot and improved its RAG accuracy by diagnosing vector retrieval issues and implementing HyDE query expansion, while partnering with sales and pharma stakeholders to drive adoption (e.g., Zimmer Biomet platform migration into a multi-year partnership).”

Python, C++, Go, JavaScript, TypeScript, HTML, +155 more

YaswithaSai Atluri

Screened

Mid-level Data Analyst specializing in BI, analytics automation, and cloud data platforms

Charlotte, NC | 4y exp

SkyWest Airlines | University of North Carolina at Charlotte

“Analytics professional with hands-on experience building SQL/Python pipelines, customer ID mapping logic, and self-serve BI dashboards across marketing/CRM and regulated aviation reporting environments. Particularly strong in turning messy multi-source data into trusted reporting assets, with repeated claims of major efficiency gains, faster decision-making, and high-confidence stakeholder adoption.”

SQL, Python, Power BI, Tableau, Workflow Automation, ETL, +119 more

Sagar Sidhwa

Screened

Senior AI/ML Engineer specializing in LLMs, MLOps, and predictive analytics

Jamestown, NY | 6y exp

Cummins | Binghamton University

“ML/AI engineer with hands-on experience building production MLOps systems for predictive maintenance and demand forecasting, including deployment, monitoring, and iterative retraining. Also shipped a RAG-based employee onboarding chatbot integrated with ServiceNow APIs and reports business impact of roughly $300k/month in reduced stockout and overstock costs.”

Python, SQL, NoSQL, JavaScript, TypeScript, C, +210 more

Lokesh Jain

Screened

Senior AI/ML Engineer specializing in supply chain and healthcare systems

Bentonville, AR | 6y exp

Forman Technology | University at Buffalo

“Built and deployed AcademiQ Ai, a production LLM-based teaching assistant using GPT/BERT with RAG (LangChain + Pinecone) to handle large student notes and generate adaptive explanations/quizzes. Demonstrated measurable retrieval-quality gains (18% precision improvement, 22% less irrelevant context) by tuning similarity thresholds and chunking based on user satisfaction signals. Also orchestrated terabyte-scale, real-time demand forecasting pipelines using Airflow and Kubeflow on GCP with strong monitoring, shadow deployment, and feedback-loop practices.”

Python, SQL, Scala, PyTorch, TensorFlow, Scikit-learn, +169 more

Meet Jhaveri

Screened

Mid-level Data Scientist specializing in AI/ML, LLMs, and healthcare analytics

California, USA | 3y exp

Johnson & Johnson | California State University, Fullerton

“Built and shipped enterprise AI products including a conversational SQL analytics platform and a production RAG system at Johnson & Johnson. Combines full-stack engineering with LLM systems expertise, and has delivered measurable impact at scale, including 48% lower retrieval latency and 37% better response relevance across 12M+ records.”

Python, SQL, R, Java, JavaScript, TypeScript, +101 more
