Vetted PySpark Professionals

Pre-screened and vetted.

PySpark, Python, SQL, Docker, AWS, CI/CD

Nidhish Rao Bairineni

Screened

Mid-level AI Engineer specializing in LLMs, RAG, and MLOps

5y exp

Wells Fargo

Southern Methodist University

“Built and deployed a production RAG-based internal knowledge assistant that let analysts query company documents in natural language, using LangChain/LangGraph with Pinecone and a FastAPI service for integration. Emphasizes reliability in production through hallucination mitigation (retrieval tuning + prompt guardrails) and measurable evaluation/monitoring (accuracy, latency, task completion, hallucination rate), iterating based on user feedback.”

Artificial Intelligence, Machine Learning, Generative AI, Large Language Models, OpenAI, Claude, +173 more

Diwita Banerjee

Screened

Mid-level AI/ML Engineer specializing in LLMs, RAG, and enterprise AI

Fairfax, VA

5y exp

Freddie Mac

George Mason University

“Built an enterprise RAG-based document intelligence system at Freddie Mac for regulatory and financial documents, helping analysts cut search time from hours to minutes while improving retrieval accuracy by ~30%. Stands out for combining LLM product delivery with compliance-grade auditability, production monitoring, and scalable Python/FastAPI service design.”

Python, MATLAB, Jupyter Notebook, NumPy, Pandas, Matplotlib, +129 more

Sai Sri Kolanu

Screened

Mid-level AI Engineer specializing in LLMs, RAG, and production ML systems

Dearborn, MI

4y exp

Ford

University at Buffalo

“Built and shipped an AI-powered RAG diagnostic assistant at Ford for EV technicians, integrating GPT-based models with LangChain, FAISS, and SageMaker into real technician workflows. Stands out for combining strong production LLM architecture with practical safety guardrails, monitoring, and measurable impact: 45% better diagnostic accuracy and roughly 30 minutes saved per case.”

LangChain, LangGraph, LlamaIndex, Hugging Face, GPT, Claude, +109 more

Adit Shah

Screened

Mid-level AI/ML Engineer specializing in computer vision, NLP, and LLM systems

USA

4y exp

Omnic.AI

Northeastern University

“AI/full-stack engineer in gaming analytics who joined Omnic.ai at the 2-person stage, grew with the company, and built both backend and frontend for real-time gameplay analysis products. He combines computer vision production experience with LLM/RAG systems work, and has already led 4 employees while shipping 12 models in a fast-moving startup environment.”

Python, SQL, Data Structures, Algorithms, REST APIs, FastAPI, +143 more

Dev PARIKH

Screened

Mid-level Software Engineer specializing in backend systems and applied AI

Baltimore, MD

4y exp

Qualcomm

University of Maryland, Baltimore County

“Backend/full-stack engineer at Qualcomm who built and operated a drift monitoring platform for 10k+ edge AI models. Stands out for combining strong TypeScript/React/Node execution with production-grade systems thinking across PostgreSQL tuning, Redis caching, ECS deployments, and Kafka-based architectural improvements that measurably improved reliability and release speed.”

Python, JavaScript, TypeScript, SQL, FastAPI, Django, +177 more

Manali Shetye

Screened

Mid-level Applied AI & Data Engineer specializing in automation and enterprise analytics

Irving, Texas

4y exp

Trend Micro

University of Texas at Arlington

“Backend engineer with experience evolving a high-volume agricultural loan processing platform (APMS) at HDFC Bank, emphasizing transactional integrity, auditability, and modularity while integrating with credit bureaus, document management, and risk engines. Also improved automation/reporting robustness at Trend Micro by catching duplicate-event retry edge cases and adding idempotency safeguards.”

Python, R, C#, SQL, JavaScript, C, +95 more

Keerthana Priya

Screened

Mid-level Data Analytics & ML Engineer specializing in NLP, LLMs, and cloud data platforms

Dallas, TX

5y exp

Mattel

Kennesaw State University

“At KPMG, built and productionized a secure RAG-based LLM assistant that lets business and risk stakeholders query data warehouses in natural language, reducing dependence on data engineers for ad-hoc analysis. Demonstrates strong production rigor (Airflow orchestration, CI/CD, containerization), retrieval/embedding tuning (rechunking, semantic abstraction for structured data), and reliability controls (confidence thresholds, refusal behavior, monitoring and canary evals).”

SQL, Python, R, PySpark, Apache Spark, Pandas, +123 more

Sharath Bandi

Screened

Mid-level Generative AI Engineer specializing in LLMs, RAG, and multimodal generation

Saint Louis, Missouri

4y exp

LSEG

Avila University

“Open-source JavaScript contributor focused on performance and maintainability in data visualization libraries—refactored legacy ES5 into modular ES6, added tests/docs, and delivered ~30% faster load times with positive community adoption. Also optimized a React dashboard (~40% load-time reduction) and took ownership in an ambiguous AI product initiative by setting milestones, standing up an initial ML pipeline, and shipping a prototype in ~6 weeks that became the basis for production.”

A/B Testing, Apache Airflow, Apache Hadoop, Apache Hive, Apache Kafka, Apache Spark, +225 more

Manav Bhasin

Screened

Junior Full-Stack Machine Learning Engineer specializing in production ML systems

San Jose, CA

2y exp

AgroFocal Technologies Inc

San José State University

“Software engineer who owned end-to-end delivery of customer-facing agricultural forecast reporting (crop yield/health) and iterated quickly via rigorous edge-case testing and customer feedback. Also built an internal ML training platform (TypeScript/React + Flask/Python + MongoDB) used by every developer, with architecture designed to stay responsive under heavy compute load.”

Python, SQL, JavaScript, TypeScript, C, C++, +65 more

Sravan Kumar Jajam

Screened

Mid-level Data Scientist / ML Engineer specializing in streaming ML systems for healthcare and IoT

Urbandale, IA

4y exp

John Deere

Auburn University at Montgomery

“ML/GenAI engineer with production experience building an LLM-powered governance layer that summarizes verified drift/performance signals into validation reports and release notes, designed for regulated environments with de-identification and non-blocking fallbacks. Strong Airflow-based orchestration background across healthcare and finance, integrating Databricks/Spark and MLflow for scalable retraining/monitoring. Demonstrated ability to partner with non-technical healthcare operations teams to deliver actionable risk-scoring outputs via dashboards and automated reporting.”

Python, R, SQL, Bash, Pandas, NumPy, +127 more

Sowmya Sree

Screened

Mid-level Machine Learning Engineer specializing in LLM agents, RAG, and MLOps

Dallas, TX

5y exp

Bank of America

University of North Texas

“Built production LLM systems including a real-time customer feedback analysis and workflow automation platform using RAG and multi-agent orchestration with confidence-based human escalation, addressing privacy and legacy integration challenges. Also automated ML operations with Airflow/Kubernetes (e.g., daily churn model retraining) cutting retraining time to under 30 minutes, and demonstrates a rigorous testing/monitoring approach plus strong non-technical stakeholder collaboration.”

Python, Java, Spring Boot, JavaScript, R, Bash, +148 more

Hanish Kukkala

Screened

Mid-level Data Scientist specializing in Generative AI and NLP

USA

6y exp

CVS Health

University of Central Missouri

“ML/GenAI engineer with recent CVS Health experience building a production RAG system over unstructured financial/research documents using LangChain, FAISS, and Pinecone, plus LoRA/PEFT fine-tuning of GPT/LLaMA for domain-aware summarization. Demonstrates strong applied MLOps and data engineering skills (Airflow/Prefect, Docker/Kubernetes, CI/CD, MLflow) and measurable impact (sub-second retrieval, ~40% better context retrieval, ~25% entity matching improvement).”

A/B Testing, Apache Hadoop, Apache Hive, Apache Kafka, Apache Spark, AWS, +170 more

Sailaja Lokasani

Screened

Mid-level Data Engineer specializing in cloud ETL/ELT and healthcare analytics

Dallas, TX

5y exp

Lightbeam Health Solutions

Syracuse University

“Healthcare-focused data engineer/ML practitioner with experience at Lightbeam Health Solutions and Humana building production entity-resolution and semantic similarity pipelines across EMR, lab, and claims data. Uses NLP/ML (spaCy, scikit-learn, BioBERT/LightGBM) plus Snowflake/Airflow and vector search (Pinecone) to improve linkage accuracy (reported 90%) and semantic match quality (reported +12–15%), while reducing manual cleanup by 40%+.”

Apache Airflow, AWS, AWS Glue, AWS Lambda, Agile, C++, +134 more

Mayur Komaravelly

Screened

Senior Data Analyst specializing in data pipelines, web scraping, and legal data enrichment

Illinois, USA

5y exp

The Hartford

Indiana Wesleyan University

“Data engineer focused on reliable, scalable analytics pipelines and external data collection. Has owned end-to-end pipelines processing 5–10M records/day, serving Snowflake data marts to Power BI/Tableau, and reports ~99% reliability through strong validation/monitoring. Also shipped versioned REST APIs for curated data with query optimization and caching.”

Apache Airflow, Apache Kafka, Apache Spark, Ansible, API Design, AWS Glue, +140 more

Ansh Harjai

Screened

Junior Software Engineer specializing in AI, RAG systems, and backend development

Brooklyn, NY

1y exp

New York University

NYU

“Built an NYU software engineering capstone called “Smart Cash AI,” a multi-agent LLM-powered web app that curates offline-ready podcasts/articles/videos/news based on user preferences and commute schedules. Architected agent orchestration (discovery/downloader/summarizer), real-time progress via WebSockets, and an ETL normalization layer across RSS/YouTube and other sources with GUID-based deduplication, retries, and failure isolation to keep the system predictable.”

Python, C++, SQL, JavaScript, HTML, CSS, +79 more

Rishitha reddy katamareddy

Screened

Mid-level Generative AI & Machine Learning Engineer specializing in agentic LLM systems

USA

4y exp

Optum

University at Buffalo

“Built and deployed a production agentic LLM knowledge assistant that answers complex questions over internal documents, APIs, and databases using a RAG architecture (FAISS/Pinecone) and LangChain/LangGraph orchestration. Emphasizes production-grade reliability and hallucination control through grounding, confidence thresholds, validation, retries/fallbacks, and full observability (logging/metrics/traces) with continuous evaluation and feedback loops.”

Agentic AI, Generative AI, Large Language Models (LLMs), LangChain, LangGraph, Multi-Agent Systems, +175 more

Daniel Jin

Screened

Intern Site Reliability Engineer specializing in Kubernetes, AWS, and observability

New York, NY

1y exp

Woori America Bank

NYU

“Backend/data engineering candidate specializing in Python/Flask services and ML-enabled systems, deploying containerized workloads on AWS ECS/EKS with strong observability (Prometheus/Grafana) and PostgreSQL performance tuning. Built multi-tenant architectures with row- and schema-level isolation and optimized a Kubernetes-based Airflow + Spark nightly ETL pipeline for an e-commerce client, improving performance by 250%+ and reliably beating morning reporting deadlines; also contributed to Apache Airflow (SQLAlchemy/PostgreSQL area).”

Alerting, Apache Airflow, Automation, AWS, AWS Lambda, Bash, +89 more

Sai Nekkanti

Screened

Mid-level Data Scientist / ML Engineer specializing in secure GenAI and financial compliance

Mount Laurel, NJ

4y exp

MetLife

Rowan University

“Built a production "sentinel insight engine" to tame information overload from millions of product reviews and support transcripts, combining Azure OpenAI (GPT-3.5) zero-shot classification with a fine-tuned T5 summarizer to generate weekly actionable product insights. Demonstrated strong MLOps/production engineering by adding drift monitoring with embedding-based detection, integrating REST with legacy SOAP/queue-based CRM via FastAPI middleware, and scaling reliably on Kubernetes with HPA.”

SDLC, Agile, Waterfall, Python, C, C++, +155 more

Nishad Kane

Screened

Mid-level Data Scientist & AI Engineer specializing in RAG, agentic AI, and production ML

5y exp

Xtrium AI

Arizona State University

“AI/data engineer who built a production LLM-powered schema drift detection system (LangChain/LangGraph) to catch semantic data changes before they break downstream analytics/ML. Deployed on AWS with Docker/S3 and implemented an LLM-as-a-judge evaluation framework to improve trust, reduce hallucinations, and control false positives/alert fatigue. Collaborated with non-technical risk/business analytics stakeholders at EY by delivering human-readable drift explanations that improved confidence in financial analytics dashboards.”

A/B Testing, Agentic AI, Amazon EC2, Amazon EKS, Amazon Redshift, Amazon S3, +104 more

Revanth Goli

Screened

Senior Data & Backend Engineer specializing in cloud data pipelines and LLM/RAG systems

Morrisville, NC

6y exp

Syneos Health

University of Alabama at Birmingham

“Data engineer with end-to-end ownership of large-scale retail and clinical data ingestion/processing on AWS, including real-time streaming and batch pipelines. Delivered measurable outcomes: 20M daily transactions processed, latency cut from 4 hours to 5 minutes, ~70% fewer failures, and 120+ pipelines running at 99.8% reliability with full audit compliance.”

Python, Pandas, PySpark, FastAPI, LangChain, SQL, +97 more

Bhargavi Kondaveeti

Screened

Mid-level Data Engineer specializing in big data pipelines and real-time streaming

Dallas, TX

6y exp

Johnson & Johnson

University of North Texas

“Data engineer who has owned end-to-end production pipelines processing a few million records/day, using Python/Airflow/SQL/PySpark with Snowflake serving to BI (Power BI). Built resilient external web data collection systems (anti-bot, schema-change detection, backfills) and shipped versioned REST APIs for internal consumers, improving pipeline success rates to 99% through monitoring, retries, and idempotent design.”

Agile, Amazon CloudWatch, Amazon DynamoDB, Amazon Redshift, Amazon S3, Amazon SQS, +101 more

Sai Vardhan Reddy

Screened

Mid-level Data Engineer specializing in cloud data platforms and governed analytics

5y exp

Optum

University of Central Missouri

“Data engineer with Optum experience building end-to-end healthcare data pipelines for HL7/FHIR, processing millions of records daily across Kafka streaming and Databricks/Spark batch. Strong focus on data quality (schema enforcement/validations), reliability (Airflow monitoring/alerts), and analytics-ready serving in Snowflake powering Power BI/Tableau, with CI/CD via Git and Jenkins.”

AWS, Amazon EC2, AWS Lambda, AWS Glue, Amazon S3, Amazon Kinesis, +94 more

Tanvi Dasaripally

Screened

Mid-level Cloud Data Engineer specializing in Azure/AWS pipelines and medallion architecture

USA

4y exp

UnitedHealth Group

Southern Illinois University Carbondale

“Data engineer focused on reliability and data quality, owning end-to-end pipelines processing ~100k–300k records/day. Implemented robust validation and monitoring that cut reporting issues by ~30%, and built stable external data collection with anti-bot measures, backfills, and schema-change detection while maintaining backward-compatible internal data services.”

Python, SQL, PySpark, Apache Kafka, Azure Data Factory, AWS, +72 more

Pramathesh Shukla

Screened

Senior Data Analyst specializing in marketing, BI, and financial analytics

Illinois, USA

6y exp

WPP

DePaul University

“Marketing analytics candidate with experience at WPP and on a global Coca-Cola campaign, focused on turning messy multi-platform media data into trusted reporting and decision systems. They combine hands-on SQL/Python pipeline building with stakeholder KPI alignment, and cite a 22% improvement in media effectiveness plus faster budget reallocation through daily automated reporting.”

SQL, Python, R, PySpark, Root-cause analysis, A/B testing, +100 more
