Vetted PySpark Professionals

Pre-screened and vetted.

Mohan Naik Megavath

Mid-level Data Engineer specializing in real-time pipelines and cloud data platforms

Remote, USA · 4y exp
Truist · Elmhurst University

Backend engineer with hands-on experience building secure Python/Flask services (sessions, JWT, RBAC) and optimizing PostgreSQL/SQLAlchemy performance, including custom SQL using CTEs/window functions profiled via EXPLAIN ANALYZE. Also integrates LLM features via OpenAI/Azure into backend systems and improves scalability with RabbitMQ-driven async processing, caching, and multi-tenant data isolation patterns.

Esha Gangam

Screened

Mid-level AI/ML Engineer specializing in LLMs, RAG, and MLOps

USA · 4y exp
Deloitte · University at Albany

GenAI/ML engineer from Deloitte who built and shipped a production RAG-based internal search assistant for support teams, delivering quantified operational gains (20% effort reduction, 35% faster manual lookup). Experienced in enterprise-grade LLM reliability (grounding/hallucination control), compliance/security constraints, and rapid release cycles using CI/CD, MLflow, and orchestration tools (Airflow, Databricks Jobs, LangChain).

Sri Harsha Patallapalli

Mid-level Machine Learning & Data Infrastructure Engineer specializing in MLOps on AWS

Boston, MA · 5y exp
Dextr.ai · Northeastern University

Built and deployed a fine-tuned Qwen 2.5 14B model into production at Dextr.ai as the backbone for hotel-operations agentic workflows, running on AWS EKS with Triton and TensorRT-LLM. Demonstrates strong cost-aware LLM engineering (QLoRA, FP8/BF16 on H100) plus rigorous benchmarking/observability (Prometheus, LangSmith) with reported sub-30ms TTNT. Previously handled long-running ETL orchestration with Airflow at GE Healthcare and Lowe's.

Umesh Kamisetty

Mid-level Data Engineer specializing in cloud lakehouse and streaming platforms

Seattle, WA · 5y exp
First United Bank · Cleveland State University

Data engineer focused on building production-grade pipelines on AWS (Kafka/Kinesis/Glue/S3) through to curated serving layers in Snowflake and Delta Lake. Emphasizes automated data quality validation (PySpark + CI/CD), modular dbt transformations for analytics (customer spending, risk metrics), and operational reliability with CloudWatch and DLQs; data consumed by BI tools and ML pipelines for fraud detection and risk analytics.

Harshitha Parupalli

Mid-level Data Engineer specializing in multi-cloud real-time and batch data pipelines

Jersey City, NJ · 4y exp
Elevance Health · NJIT

Data engineer with healthcare domain experience who owned 100M+ record pipelines end-to-end (Kafka/Kinesis/ADF → PySpark/dbt validation → Spark SQL transforms → Snowflake/Power BI serving). Built production-grade reliability practices (Airflow orchestration, CloudWatch/Grafana monitoring, pytest + contract/regression tests, idempotent ingestion/backfills) and delivered measurable improvements: 35% lower latency and 40% better query performance.

Mounika Sai Mekala

Junior Data Analyst specializing in financial and operational analytics

Kansas, USA · 3y exp
KPMG · University of Central Missouri

Analytics professional with experience at KPMG turning messy operational and financial data from SQL Server and AWS S3 into clean reporting datasets and automated Python workflows. They combine SQL, Python, Power BI, and experimentation methods to deliver stakeholder-aligned KPI dashboards and marketing performance insights with a strong focus on data integrity and reproducibility.

Saiteja Mallempudi

Senior Data Scientist and AI/ML Engineer specializing in GenAI and cloud ML

Chicago, IL · 6y exp
BMO · Lewis University

ML/AI engineer with hands-on experience owning systems from experimentation through deployment and monitoring, including a Bank of Montreal project that improved timely interventions by 12%. Also brings GenAI/RAG experience with evaluation and safety guardrails, plus clinical NLP pipeline work extracting medication data from notes for patient risk prediction.

Akhila Kannegari

Mid-level AI/ML Engineer specializing in FinTech and retail ML systems

Alabama, USA · 4y exp
Wells Fargo · Auburn University at Montgomery

ML-focused candidate with strong Wells Fargo experience building production fraud systems and internal GenAI tools for fraud analysts. Stands out for measurable impact in fraud detection—raising recall from 71% to 88%—while also demonstrating hands-on depth across streaming infrastructure, MLOps, LLM/RAG implementation, and Python service architecture.

DL

Senior Python Developer specializing in data engineering, MLOps, and cloud platforms

Dallas, TX · 13y exp
CBRE · Anna University

Backend/data engineer with production experience building secure Django/DRF APIs (JWT RS256 + rotating refresh tokens), background processing with Celery, and strong reliability practices (timeouts, retries/backoff, structured logging, audit trails). Has delivered AWS solutions spanning Lambda + ECS with IaC/CI-CD and built Glue/PySpark ETL pipelines with schema evolution and data-quality quarantine patterns; also modernized a legacy SAS pipeline to Python/PySpark with parallel-run parity validation and phased rollout.

LK

Mid-level AI/ML Engineer specializing in NLP, fraud detection, and MLOps

New York, NY · 4y exp
AIG · University of Texas at Arlington

LLM/ML platform engineer with hands-on experience taking an LLM document summarization prototype into a production-grade service on AWS EKS, emphasizing low-latency inference, drift monitoring, and safe CI/CD rollouts (canary + rollback). Strong in real-time debugging of agentic/RAG systems (tracing, retrieval/index drift fixes) and in developer enablement through practical workshops (Docker/Kubernetes/FastAPI) plus pre-sales support via demos and benchmarks to close pilots.

Anuj Shah

Screened

Senior Data Analyst specializing in cloud data platforms, experimentation, and predictive analytics

GA, USA · 9y exp
UnitedHealth Group · Northwestern Polytechnic University

Healthcare data/ML practitioner with experience at UnitedHealth Group building production ETL and streaming pipelines (Python, BigQuery, Kafka) that unify EHR, IoT device, and lab data for patient risk prediction. Also implemented embedding-based semantic search/linking for noisy clinical notes via domain adaptation and rigorous validation with clinical stakeholders; previously built churn prediction at DirecTV using XGBoost.

Kamal Ede

Screened

Mid-level Data Engineer specializing in cloud data platforms, Spark, and streaming pipelines

MO, USA · 4y exp
S&P Global · University of Central Missouri

Data/MLOps engineer (Cognizant background) who owned an AWS/Airflow/Snowflake healthcare transactions pipeline processing ~8–10M records/day and cut pipeline/data-quality incidents by ~33%. Also built and deployed a production FastAPI model-inference service on Kubernetes (Docker, HPA) with strong observability (Prometheus/Grafana), versioned endpoints, and resilient backfill/idempotent external data ingestion patterns.

OL

Mid-level Data Engineer specializing in cloud data pipelines and streaming

Charlotte, NC · 5y exp
Wells Fargo · University of North Texas

Data engineer with experience at Wells Fargo and Accenture owning end-to-end production pipelines processing hundreds of millions of transactional/risk records daily. Strong focus on data quality and reliability (reconciliation checks, schema drift detection, CloudWatch alerting) plus Spark performance tuning and idempotent backfills using Delta Lake/merge logic across AWS (S3/EMR/Databricks/Redshift) and Azure (ADF/Azure DevOps/Azure Monitor).

MR

Mid-level Data Engineer specializing in AWS/Azure pipelines and streaming analytics

VA, USA · 5y exp
UnitedHealth Group · George Mason University

Data engineer with experience across healthcare and geospatial risk systems, owning end-to-end pipelines from ingestion through serving on AWS/Azure stacks. Built HIPAA-compliant data quality gates and CDC for millions of daily claims, and also delivered a real-time wildfire risk platform with 20-minute refresh cycles and a 60% data accuracy lift. Strong in streaming (Kafka), Spark performance tuning, and production-grade orchestration/CI/CD (Airflow, Docker, Jenkins, GitHub Actions, Terraform).

AR

Senior Data Engineer specializing in cloud data platforms and automated data quality

Houston, TX · 4y exp
CenterPoint Energy · University of Central Missouri

Data engineer at CenterPoint Energy who built and operated multiple production-grade GCP data systems: a daily Snowflake→BigQuery replication framework (150+ tables) with Monte Carlo/Atlan-driven observability and schema-drift protection, plus a FastAPI metrics service for pipeline health. Demonstrated measurable impact (40% faster dashboard queries, 70% less manual refresh work, zero data loss) and strong operational rigor (scaling Cloud Run jobs, SAP SLT reconciliation, quarantine patterns, CI/CD via GitHub Actions + Terraform).

Sasirekha Gulipalli

Mid-level Data Analyst specializing in procurement, supply chain analytics, and applied machine learning

Alpharetta, GA · 4y exp
Motrex · Georgia State University

Strategic sourcing professional specializing in seasonal apparel supply chains, combining Coupa/JD Edwards analytics with Excel/Python modeling and Power BI dashboards to drive cost reduction and OTIF gains. Notable for rapid mitigation of a 10-day factory delay affecting 12 holiday SKUs (preserved 95% of revenue) and for automating PO workflows to cut cycle time by 4.2 days and improve OTIF by 15%.

Rushir Bhavsar

Intern AI/ML Engineer specializing in LLMs, MLOps, and distributed training

1y exp
Cadence Design Systems · Arizona State University

Founding AI engineer (June 2024) at Talon Labs who built and productionized an LLM-powered chatbot for interacting with proprietary supply-chain documents, deployed at large scale (25–100,000 users). Experienced with RAG/LLM orchestration (LangChain, LlamaIndex, Groq AI) and production ops tooling (Kubernetes, Docker, Kubeflow, Airflow), with a metrics-driven approach to evaluation, observability, and stakeholder alignment.

Sri Lalitha

Screened

Senior Full-Stack Java Engineer specializing in cloud-native microservices and FinTech

California, USA · 6y exp
Joydrop · Jawaharlal Nehru Technological University

Backend engineer who owned a Python task management API with JWT auth, async notifications, and performance work (DB optimization/caching) to handle high volumes. Led an on-prem to Azure private cloud migration at Morgan Stanley using GitOps and IaC (Terraform/ARM) with phased rollout and rollback planning. Also built a Kafka real-time streaming pipeline with exactly-once/idempotent consumers and Prometheus/Grafana monitoring.

Sana Khan

Screened

Mid-level AI/ML Engineer specializing in MLOps, LLMs, and real-time inference in FinTech

Oklahoma, USA · 4y exp
Capital One · Oklahoma Christian University

ML/LLM engineer who has deployed a production LLM-powered assistant for intent classification and query routing (order recommendation/support deflection), combining BERT fine-tuning with an embedding-based retrieval layer and optimizing for low-latency inference. Experienced with end-to-end reliability practices—Airflow-orchestrated ETL, data validation/alerting, MLflow experiment tracking, and iterative improvements driven by user feedback and monitoring.

Ankita A Khartmol

Junior Backend Software Engineer specializing in conversational AI and cloud APIs

Bangalore, India · 1y exp
Harman · USC

Backend/ML-focused software engineer who built and evolved a Python/FastAPI backend for a large-scale conversational AI platform, decoupling API and inference services to improve stability and deployment velocity. Experienced in production hardening (timeouts/fallbacks/monitoring), secure multi-tenant systems (JWT/RBAC/RLS), and low-risk migrations using shadow deployments and incremental traffic ramp-ups.

Chandra Shekar Akkandra

Mid-level AI/ML Engineer specializing in fraud detection and risk analytics in Financial Services

Newark, CA · 5y exp
JPMorgan Chase · University of Missouri-Kansas City

Finance-domain ML/LLM engineer who has shipped production systems including a RAG-based financial insights assistant with a custom post-generation validation layer that verifies atomic claims against retrieved source text to prevent hallucinations in compliance-critical workflows. Also built large-scale MLOps automation on AWS using Kubeflow + MLflow + CI/CD for fraud detection and credit risk models processing 500M+ transactions/day with a 99.99% uptime goal, and partnered closely with JP Morgan risk/compliance stakeholders on NLP-driven compliance monitoring.

Meghanath Kethireddy

Mid-level Full-Stack/Backend Engineer specializing in Java microservices and cloud platforms

Dallas, TX · 5y exp
Copart · University of Texas at Dallas

PayPal ML/AI practitioner who built and productionized a hybrid recommendation engine (BERT/LLM embeddings + collaborative filtering + XGBoost ranking) on AWS with end-to-end MLOps and orchestration. Addressed real-world issues like cold start and embedding latency (ONNX, clustering, caching, PySpark/Delta Lake) and drove a 27% lift in upsell conversion via A/B testing and stakeholder collaboration with marketing.

Mukul Sai Pendem

Mid-level Full-Stack Engineer specializing in cloud-native microservices and DevOps

United States (Remote) · 4y exp
Saayam for All · Northeastern University

Backend engineer with strong Python/FastAPI microservices ownership, including an ML-serving service with embeddings, async DB access, and Redis caching to reduce latency under high load. Experienced deploying and operating containerized services on Kubernetes using GitOps (Argo CD/Helm) with automated CI/CD, plus hands-on Kafka streaming pipeline tuning and enterprise migration work (Infosys) using blue-green/active-passive strategies.

SC

Mid-level Data Engineer specializing in cloud ETL and financial data platforms

Virginia, USA · 3y exp
Capital One · Avila University

Data engineer with experience at Capital One and HSBC building and operating GCP-based data platforms. Led an end-to-end Oracle-to-BigQuery migration processing ~200–300GB/day using Dataflow/Beam, Airflow, Dataproc/PySpark, and Looker, achieving ~99.5% pipeline success and ~30% fewer data quality issues. Strong in production reliability, schema drift handling for external APIs, and BigQuery performance/serving patterns (materialized views, authorized views, versioned datasets).
