Vetted PySpark Professionals

Pre-screened and vetted.

PySpark Python SQL Docker AWS CI/CD

Nikshitha Aella

Screened

Mid-level Full-Stack Software Engineer specializing in AI platforms and microservices

Mooresville, NC | 6y exp

Lowe's | University of North Carolina at Charlotte

“Backend engineer currently building an AWS Lambda/FastAPI inventory recommendation system using a LangChain + GPT-4 RAG pipeline and MongoDB vector search; drove major cost optimization via Redis caching (60% reduction) while sustaining 10k+ daily requests under 2s latency. Previously deployed Node.js microservices on AWS OpenShift with Jenkins/Helm at UnitedHealth Group and led a zero-downtime monolith-to-microservices migration at Verizon, including RabbitMQ-based real-time messaging with DLQs and idempotency.”

Agile, Angular, API Gateway, AWS, AWS Lambda, CI/CD, +83 more

Koushik Gunjala

Screened

Senior AI Engineer specializing in Agentic AI and distributed systems

Charlotte, NC | 4y exp

UnitedHealth Group | University of North Carolina at Charlotte

“LLM/agentic workflow engineer with healthcare domain experience who built a HIPAA-compliant multi-agent RAG system for clinical review automation at UnitedHealth Group, achieving 92% precision and cutting latency 40% through async orchestration and Redis semantic caching. Also has strong data engineering orchestration background (Airflow on AWS EMR with Great Expectations) and a proven clinician-in-the-loop feedback process that improved model faithfulness by 18%.”

Agentic AI, Distributed Systems, Retrieval-Augmented Generation (RAG), GPT-4, LangChain, LangGraph, +95 more

Vardhan Addakattu

Screened

Mid-level Data Scientist specializing in Generative AI and NLP for financial risk

Glassboro, NJ | 4y exp

S&P Global | Rowan University

“Built and shipped production generative AI/RAG assistants in regulated financial contexts (S&P Global), automating compliance-oriented Q&A over earnings reports/filings with grounded answers and citations. Experienced across the full stack—AWS-based ingestion (PySpark/Glue), vector retrieval + LangChain agents, GPT-4/Claude model selection, and production reliability (monitoring, caching, retries) plus rigorous evaluation and regression testing.”

Python, R, SQL, PySpark, Pandas, Apache Spark, +111 more

Hema Edavalapati

Screened

Mid-level AI/ML Engineer specializing in cloud data engineering and GenAI

Florida, USA | 6y exp

LexisNexis | University of South Florida

“AI/LLM engineer with production experience in legal tech: built a GPT-4 + LangChain RAG summarization system at Govpanel that reduced legal case-file review time by 50%+. Previously at LexisNexis, orchestrated end-to-end Airflow data/AI pipelines processing 5M+ legal documents daily, improving ETL runtime by 35% with robust validation, monitoring, and SLAs.”

SQL, SQL query optimization, Python, Pandas, NumPy, PySpark, +159 more

Sridharan Kairmaknoda

Screened

Mid-level Data Engineer specializing in cloud data platforms and real-time analytics

Saint Louis, MO | 5y exp

Cigna | Saint Louis University

“Customer-facing data engineering professional who builds and deploys real-time reporting/dashboard solutions, gathering reporting and compliance requirements through direct stakeholder engagement. Experienced with Google Cloud IAM governance, secure integrations (encryption, audit logging), and fast production troubleshooting of ETL/pipeline failures with follow-on monitoring and automated recovery improvements; motivated by hands-on, travel-oriented customer work.”

SDLC, Agile, Waterfall, Python, SQL, Jupyter Notebook, +137 more

Hritvik Gupta

Screened

Mid-level AI Engineer specializing in LLMs, RAG, and healthcare AI

San Francisco, CA | 3y exp

Penn Medicine | UC Riverside

“Built and scaled an AI-powered voice/chat patient engagement platform at Penn Medicine from early prototype into production clinical workflows, focusing on latency, edge cases, and user trust. Strong in LLM reliability engineering (structured prompts, validation/fallbacks), real-time troubleshooting with observability, and cross-functional enablement through pilots, demos, and sales/customer partnership.”

AWS, AWS Lambda, C++, CI/CD, Communication, Data Engineering, +78 more

Bala Venkateswarlu K

Screened

Mid-level Data Scientist specializing in Generative AI, NLP, and MLOps

USA | 5y exp

MetLife | Harrisburg University of Science and Technology

“Built and deployed an LLM-powered claims-document summarization system (insurance domain) that cut agent review time from 4–5 minutes to under 2 minutes and saved 1,200+ hours per quarter. Hands-on across orchestration and production infrastructure (Airflow retraining DAGs, Kubernetes, SageMaker endpoints, FastAPI) and recent RAG workflows using n8n + Pinecone, with a strong focus on reliability, cost, and explainability for non-technical stakeholders.”

A/B Testing, Agile, Apache Kafka, Apache Spark, Auto Scaling, AWS, +148 more

Sudeep Govathoti

Screened

Mid-level Data Analyst/Data Engineer specializing in BI, ETL pipelines, and cloud analytics

4y exp

Verizon | Lindsey Wilson College

“Data engineer focused on marketing/web analytics and external API pipelines, handling ~10M records/week. Built Azure-based ingestion and PySpark transformations with rigorous data quality checks, then served curated datasets into Synapse/Redshift for Power BI. Also designed an Airflow-orchestrated crypto REST API pipeline with monitoring, retries/exponential backoff, schema-change detection, and backfill-friendly reprocessing.”

SQL, Python, R, PySpark, Pandas, Scikit-learn, +71 more

Krish Shah

Screened

Junior AI Engineer specializing in LLM systems and analytics

Miami, FL | 2y exp

CoUnderscore | Purdue University

“Analytics-focused candidate with internship and project experience at Recotap and CoUnderscore, combining SQL, Python, and BI dashboards to turn messy marketing and engagement data into decision-ready reporting. Stands out for tying analytics work to business outcomes, including ~15% CTR improvement, identifying ~40% misattributed spend, and enabling a ~$75K budget shift through better targeting.”

Python, SQL, PostgreSQL, R, Java, JavaScript, +75 more

Daniel Izquierdo

Screened

Mid-level Data Analyst specializing in financial risk and data automation

McLean, VA | 5y exp

Capital One | Florida International University

“Analytics professional from Capital One with strong experience automating risk, reconciliation, and regulatory reporting workflows in financial services. They combine deep SQL/Python pipeline skills with stakeholder-facing dashboard and KPI design, delivering measurable impact like 30% performance gains, sub-24-hour anomaly detection, and 100% data integrity for regulatory filings.”

SQL, Snowflake, Python, Pandas, PySpark, Databricks, +57 more

Chethan Thimapuram

Screened

Mid-level AI Engineer specializing in LLMs, MLOps, and healthcare NLP

4y exp

HCA Healthcare | University of South Florida

“Built a production, real-time clinical documentation system at HCA that converts doctor–patient conversations into structured clinical summaries using speech-to-text, LLM summarization, and RAG. Demonstrated measurable gains from medical-domain fine-tuning (clinical concept recall +18%, ROUGE-L 0.62 to 0.74) while meeting HIPAA constraints via PHI anonymization and encryption, and deployed via Docker/FastAPI with CI/CD and monitoring.”

Python, PyTorch, Machine Learning, Generative AI, Large Language Models, OpenAI, +182 more

Robert Kennedy

Screened

Senior AI/ML Engineer specializing in LLMs, generative AI, and applied research

Boca Raton, FL | 10y exp

ModMed | Florida Atlantic University

“Research-heavy ML/AI candidate with a PhD/publications background who translated LLM evaluation and clinical summarization techniques into production at ModMed. They owned an end-to-end healthcare GenAI pipeline that cut clinician documentation time from ~22 minutes to ~7-8 minutes, reduced token costs by ~30%, and built an internal evaluation framework later adopted by multiple teams.”

Python, SQL, PyTorch, TensorFlow, Scikit-learn, Hugging Face, +76 more

SathwikReddy Nethani

Screened

Mid-level AI/ML Engineer specializing in GenAI, NLP, and financial systems

Texas, USA | 5y exp

Citibank | Concordia University, St. Paul

“GenAI/ML engineer with hands-on experience building production financial intelligence and document summarization systems at Citibank. Stands out for combining LLM fine-tuning, hybrid RAG, multi-agent workflows, and strong MLOps/observability practices to deliver measurable business impact, including 60% faster analyst retrieval, 31% higher precision, and 99%+ uptime.”

Python, Pandas, NumPy, SQL, PySpark, Shell Scripting, +144 more

Venkatesh Sanaboina

Screened

Senior AI/ML Engineer specializing in Generative AI, LLMs, and MLOps

Tampa, FL | 9y exp

Verizon | Jawaharlal Nehru Technological University

“Telecom (Verizon) AI/ML practitioner who built a production multimodal system that ingests messy customer issue reports (calls, chats, emails, screenshots, videos) and turns them into confidence-scored incident summaries with reproducible steps and evidence links. Also built KPI/alarm-to-ticket correlation to rank likely root-cause domains (RAN/Core/Transport), cutting triage from hours to minutes and improving MTTR.”

A/B Testing, Agile, Amazon Redshift, Amazon S3, Amazon SageMaker, Anomaly Detection, +168 more

Meghana P

Screened

Mid-level AI/ML Engineer specializing in Generative AI, LLMs, and NLP

Illinois, USA | 5y exp

State Farm | Saint Louis University

“AI/ML engineer with forensic analytics and healthcare claims experience (Optum), building production LLM/RAG systems to surface context-driven fraud patterns from unstructured claim notes and explain risk to investigators. Strong in large-scale retrieval performance tuning, legacy API integration with reliability patterns (SQS, circuit breakers), and MLOps orchestration on Airflow/Kubernetes with rigorous testing, monitoring, and stakeholder-friendly interpretability.”

A/B Testing, Apache Spark, AWS, AWS Lambda, Azure Data Factory, Azure Functions, +125 more

Sahithi Mogudala

Screened

Mid-level Full-Stack Software Developer specializing in cloud-native microservices

WI, USA | 3y exp

Cardinal Health | Anderson University

“Full-stack engineer with enterprise experience at Metasystems Inc. (and Qualcomm) building high-traffic, security-sensitive systems—owned a secure transaction processing module end-to-end using Java/Spring Boot, Python/Django, and React. Strong AWS production operations (EKS/ECS/Lambda/RDS/DynamoDB) with IaC (Terraform/CloudFormation), observability, and reliability patterns; also delivered resilient ETL/integration pipelines with idempotency/retries/backfills and achieved a 50% deployment-time reduction through CI/CD and modular refactoring.”

Ajax, Amazon CloudFront, Amazon CloudWatch, Amazon DynamoDB, Amazon EC2, Amazon ECS, +284 more

Harsha Sikha

Screened

Mid-level AI/ML Engineer specializing in Generative AI and data engineering

Armonk, New York | 4y exp

IBM | Saint Peter's University

“IBM engineer who built and deployed a production RAG-based LLM assistant using LangChain/FAISS with a fine-tuned LLaMA model, served via FastAPI microservices on Kubernetes, achieving 99%+ uptime. Demonstrates strong practical expertise in reducing hallucinations (semantic chunking + metadata-driven retrieval) and managing latency, plus mature MLOps practices (Airflow/dbt pipelines, MLflow tracking, monitoring, A/B and shadow deployments) and effective collaboration with non-technical stakeholders.”

A/B Testing, Agile, Anomaly Detection, API Development, Apache Hadoop, Apache Hive, +157 more

Yun-Hao Lee

Screened

Junior Machine Learning Engineer specializing in LLM deployment and computer vision

Dallas, TX | 2y exp

Lab for Intelligent Storage and Computing | University of Texas at Dallas

“Robotics/AI candidate who built an AI-driven landmark location tool during a summer internship at Mobile Drive, combining YOLOv5 object detection with OpenStreetMap-based geolocation to handle dense, cluttered urban environments. Also researched deploying LLM-based agents on constrained hardware using quantization plus LoRA/continuous learning, improving accuracy from ~80% to ~92%, with an emphasis on production logging for reliability.”

Python, C, C++, R, SQL, Java, +91 more

Ashok Sai Doredla

Screened

Mid-level AI/ML Engineer specializing in Generative AI and production ML systems

United States | 5y exp

CVS Health | University of Maryland, Baltimore County

“At CVS Health, the candidate productionized a RAG-based LLM solution in a regulated healthcare setting, emphasizing reliable data pipelines, LoRA fine-tuning, monitoring, safety guardrails, and A/B testing. They have hands-on experience troubleshooting real-time RAG failures (e.g., chunking/embedding issues) and regularly lead developer-focused demos/workshops while translating technical architecture into business value for stakeholders.”

A/B Testing, Asynchronous Processing, AWS, AWS Lambda, Azure Blob Storage, Azure Functions, +142 more

Harsha Chimirala

Screened

Mid-level Data Engineer specializing in cloud data platforms and scalable ETL pipelines

USA | 3y exp

HCLTech | University of New Haven

“Data engineer (~4 years) with full-stack delivery experience (Next.js App Router/TypeScript + React) building a real-time operations monitoring dashboard backed by Kafka and orchestrated data pipelines. Strong production focus: Airflow + CloudWatch monitoring, automated Python/SQL validation (99.5% accuracy), and CI/CD with Jenkins/Docker; has delivered measurable improvements in latency, pipeline reliability, and query performance (Postgres/Redshift).”

Python, SQL, PySpark, Scala, Bash, Apache Spark, +80 more

Tharun Kshathriya Sangaraju

Screened

Mid-level AI Engineer specializing in LLM orchestration, RAG, and multi-agent systems

Houston, TX | 4y exp

University of Houston | University of Houston

“Research Assistant at the University of Houston who built and live-deployed a production RAG system for 1000+ research documents, using hybrid retrieval (dense+BM25+RRF) with cross-encoder reranking and RAGAS-based evaluation; reported 66% MRR, 0.85+ faithfulness, and 68% lower LLM inference costs. Also built a deployed LangGraph multi-agent research system (Researcher/Critic/Writer) with tool integrations (Tavily, arXiv) and dual memory (ChromaDB + Neo4j), plus freelance automation work delivering a WhatsApp chatbot and n8n workflows for a wholesale clothing business.”

Agentic AI, AI Agents, API Integration, Apache Airflow, Apache Hadoop, Apache Kafka, +118 more

Sai Charan Reddy Kothakapu

Screened

Mid-level Full-Stack Developer specializing in React/Node, GraphQL, and Databricks lakehouse

Dallas, TX | 6y exp

Southern Glazer's Wine & Spirits | Webster University

“Full-stack engineer currently at Southern Glazer’s who built and owned a real-time commercial finance expense analytics dashboard end-to-end (Next.js App Router + TypeScript), including post-launch monitoring, data quality checks, and stakeholder-driven iteration. Strong data/analytics backend experience (Postgres modeling and Databricks Delta Lake pipelines) with demonstrated performance wins—e.g., cutting a key reconciliation query from 8–12s to <400ms and improving frontend load time ~40% with a 25% bounce-rate drop at Verizon.”

React, Next.js, JavaScript, TypeScript, Tailwind CSS, Redux, +99 more

Sai Harshith Varma Pericherla

Screened

Mid-level Data Engineer specializing in cloud ETL/ELT and lakehouse architecture

Jersey City, NJ | 4y exp

State Street | University of New Haven

“Data engineer focused on sales/marketing analytics pipelines, owning ingestion from CRMs/ad platforms through warehouse serving and dashboards at ~hundreds of thousands of records/day. Built reliability-focused systems including dbt/SQL/Python data quality gates with alerting, a resilient web-scraping pipeline (retries/backoff, anti-bot tactics, schema-change detection, backfills), and a versioned internal REST API with caching and strong developer usability.”

SQL, Python, Pandas, NumPy, Scikit-learn, Java, +151 more

Sheshikanth Pothuganti

Screened

Mid-level Data Engineer specializing in real-time streaming and cloud data platforms

New York, NY | 4y exp

Wells Fargo | University of Birmingham

“Data engineer with Wells Fargo experience owning an end-to-end lakehouse ETL pipeline on Databricks/Azure Data Factory, processing ~480GB daily and implementing robust data quality/reconciliation across 40+ tables to reach ~99.3% reliability. Strong in performance optimization (cut runtime 5.5h→3.8h), CI/CD and monitoring, and resilient external/API ingestion with retries, schema validation, and backfills.”

Python, SQL, Java, Scala, R, PostgreSQL, +122 more
