Vetted PySpark Professionals

Pre-screened and vetted.

PySpark, Python, SQL, Docker, AWS, CI/CD

Pavan Punna

Screened

Mid-level AI/ML Engineer specializing in LLMs, MLOps, and healthcare-fintech AI

Dallas, TX | 5y exp

Federal Soft Systems | Concordia University

“Built and owned a production GPT-4 RAG assistant for clinical and enterprise query resolution, taking it from initial experiment to deployment, monitoring, and iterative improvement. Their work cut resolution time from 45 minutes to under 2 minutes, achieved roughly 95% accuracy, and scaled to thousands of additional monthly queries while emphasizing safety and trust in a sensitive clinical domain.”

Python, SQL, Java, Scala, Bash, PyTorch, +124

Ponaganti Sanjana

Screened

Mid-level AI/ML Engineer specializing in NLP, MLOps, and FinTech

Remote, USA | 4y exp

Accenture | University of Houston

“ML/AI engineer with production experience at S&P Global and Accenture, focused on regulated, enterprise-grade systems. Built end-to-end financial risk and credit default models with >90% precision and 12% fewer false positives, and is currently developing secure RAG pipelines on AWS SageMaker for enterprise insight extraction.”

Python, R, SQL, Java, JavaScript, TypeScript, +110

Duncan Freeman

Screened

Staff Machine Learning Engineer specializing in NLP, LLMs, and document intelligence

Austin, TX | 9y exp

PNC | University of Cincinnati

“ML/AI engineer at PNC who has shipped enterprise-grade RAG and document intelligence systems for compliance and policy workflows. Stands out for combining LLM product thinking with production rigor—owning FastAPI/Kubernetes deployments, monitoring, evaluation, and human-feedback loops that drove measurable gains like 40% faster policy search and 30% faster compliance review.”

Machine Learning, Data Science, Natural Language Processing, Large Language Models, Computer Vision, Time-Series Forecasting, +169

Rushabh Thakkar

Screened

Mid-level Machine Learning Engineer specializing in NLP, computer vision, and LLMs

New York City, NY | 3y exp

Wayfair | Stevens Institute of Technology

“Wayfair ML/AI engineer who has shipped and operated production LLM systems for both internal analytics and customer-facing assistants. Stands out for combining strong RAG/retrieval engineering with production-grade platform work—improving trust, reducing latency by ~30%, and cutting ad hoc reporting demand by ~50%.”

Machine Learning, Deep Learning, Natural Language Processing, Computer Vision, Large Language Models, Python, +168

Siva Harini Sri Janaki Raman

Screened

Mid-level Data Engineer specializing in cloud data platforms

Dallas, TX | 3y exp

CVS Health | Texas Tech University

“Built an AI-powered internal support assistant at CVS Health using GPT-4, LangChain, and Pinecone, applying RAG, validation, and monitoring to reduce repetitive support tickets while protecting sensitive healthcare data. Stands out for a pragmatic approach to AI engineering: using multi-agent and LLM workflows to accelerate development while keeping systems constrained, observable, and production-friendly.”

Python, SQL, R, AWS, Amazon S3, AWS Glue, +110

Christopher Prokop

Screened

Director of Software Engineering specializing in AI, data platforms, and cloud architecture

Washington, DC | 29y exp

ZipRecruiter | American University

“Veteran software engineering leader who started as an early internet engineer in the mid-1990s and has since grown into Director/VP-level leadership across legacy web platforms, logistics systems, and modern data engineering. Particularly compelling for companies needing a hands-on leader who can modernize complex Perl/UNIX monoliths, manage large cross-functional teams, and deliver operational systems in warehouse, marketplace, and reverse-logistics environments.”

Agentic AI, Claude, AWS, Terraform, Kubernetes, Amazon S3, +78

Krishna Sristi

Screened

Entry-level Data Scientist specializing in ML, NLP, and GenAI

Hyderabad, India | 1y exp

Kofluence | Rowan University

“AI/full-stack engineer who has built a production-style LLM knowledge assistant from scratch, combining FastAPI, LangChain, FAISS, semantic retrieval, and a user-facing chat interface. Stands out for owning both the technical architecture and the product usability layer, including latency optimization, prompt refinement, and source-backed responses to improve trust for non-technical users.”

Python, Java, C++, SQL, PyTorch, TensorFlow, +111

Evan Teague

Screened

Senior Software Engineer specializing in backend and data platforms

Bethesda, MD | 10y exp

Spatial Data Logic | University of Virginia

“Series A startup engineer with broad full-stack ownership across backend, data, and frontend, including a real-time ingestion platform that scaled to 10x higher daily volume without downtime while cutting latency from minutes to seconds. Brings strong fintech and B2B SaaS experience building auditable, high-throughput systems for analysts, operations, and compliance teams in regulated environments.”

Python, Go, JavaScript, TypeScript, SQL, Bash, +159

Rohith Kollu

Screened

Senior Software Engineer specializing in backend microservices, cloud, and full-stack systems

Dallas, TX | 7y exp

Cisco | Indiana Wesleyan University

“Backend/platform engineer who has built and scaled production Java/Spring Boot + Kafka services on AWS/Kubernetes (1M+ msgs/day) and led reliability/performance fixes that restored SLAs (25–30% latency improvement; 99.9% uptime). Also shipped an AI customer-support chatbot end-to-end using retrieval + guardrails and rigorous evaluation/observability, improving resolution time 40% and satisfaction 25%, with a strong plan/execute/verify approach to agentic workflow reliability.”

Amazon CloudFront, Amazon CloudWatch, Amazon EC2, Amazon RDS, Amazon S3, Apache Hadoop, +154

Priya Shah

Screened

Mid-level DevOps Engineer specializing in AWS cloud infrastructure and CI/CD automation

OH | 6y exp

ServiceNow | Sardar Patel University

“Backend/data engineer with production experience building a SaaS analytics platform: FastAPI-based microservices with Redis caching and reliability patterns (RBAC, retries/backoff, centralized error handling). Also delivered AWS data pipelines (Glue/PySpark to Redshift) and owned real production incidents using CloudWatch/SNS, plus hands-on PostgreSQL query tuning on multi-million-row reporting workloads.”

SDLC, Agile, DevOps, CI/CD, Git, GitHub, +79

PHANINDRA KETHAMUKKALA

Screened

Senior GenAI/ML Engineer specializing in LLMs, RAG, and multimodal generative AI

USA | 4y exp

GE HealthCare | Franklin University

“LLM/RAG engineer with production deployments in highly regulated domains (Frost Bank and GE Healthcare). Built secure, explainable document-grounded Q&A systems using LoRA fine-tuning, strict RAG with confidence thresholds, and citation-based responses; also established evaluation/monitoring (golden QA sets, hallucination tracking, drift) and achieved ~40% latency reduction through retrieval/prompt tuning.”

A/B Testing, Agile, AI Agents, Apache Kafka, Apache Spark, AWS Glue, +170

Aditya Sairam

Screened

Mid-level Software Engineer specializing in cloud data platforms and AI search

Troy, MI | 6y exp

Robotics Technologies LLC | Cleveland State University

“Open-source JavaScript contributor focused on data visualization, extending Chart.js/React with custom plugins for real-time streaming dashboards. Designed an end-to-end telemetry pipeline using Apache Kafka and Azure Cosmos DB, optimizing partitioning, batching, caching, and client throttling to keep latency low and support thousands of concurrent users. Demonstrates strong ownership in fast-changing environments, including building full-stack AI applications and ingestion/ETL pipelines at Robotics Technologies LLC.”

Apache Kafka, AWS, AWS Lambda, Azure Functions, C#, Cloud Computing, +89

PAVAN VARMA PENMETHSA

Screened

Mid-level Machine Learning Engineer specializing in LLM agents, RAG, and MLOps

New York City, NY | 6y exp

Avanade | University of North Texas

“Built a production AI-driven contract/document extraction system combining OCR, normalization, and LLM schema-guided extraction, orchestrated with PySpark and Azure Data Factory and loaded into PostgreSQL for analytics. Emphasizes reliability at scale—using strict JSON schemas, confidence scoring, targeted retries, and multi-layer validation to control hallucinations while processing thousands of PDFs per hour—and partners closely with non-technical business teams to refine fields and deliver usable dashboards.”

Machine Learning, Generative AI, Large Language Models (LLMs), Prompt Engineering, Retrieval-Augmented Generation (RAG), Embeddings, +131

Vigneshwaran Moorthi

Screened

Mid-level Machine Learning Engineer specializing in LLMs, RAG, and Clinical AI

Chicago, Illinois | 4y exp

Optum | Illinois Institute of Technology

“Built and productionized a HIPAA-compliant LLM+RAG Clinical AI assistant at Optum, fine-tuning GPT/LLaMA on de-identified patient notes and integrating FAISS/Pinecone for sub-second retrieval; reported to cut diagnosis time by ~20 minutes per case. Experienced in orchestrating ML pipelines (Airflow, AWS Step Functions, Azure Data Factory) and in reliability techniques for LLM systems (grounding, citations, confidence filters, monitoring) while partnering closely with clinicians and compliance teams.”

A/B Testing, Amazon CloudWatch, Amazon EC2, Amazon Redshift, Amazon S3, Apache Airflow, +138

BHARATH BHOOTHPUR

Screened

Mid-level Data Analyst specializing in healthcare and finance analytics

New Jersey, USA | 5y exp

Omada Health | Rowan University

“Built an end-to-end Alexa smart-home IoT application controlling a Wi-Fi bulb, including ESP32 firmware (MQTT) and an AWS serverless backend (IoT Core/Device Shadow, Lambda, DynamoDB) with a REST API. Demonstrates strong real-time scalability patterns (streaming ingestion, stateless processing, partition-key design) and full-stack delivery with Spring Boot + React (JWT auth, CORS, data-heavy dashboards).”

Python, SQL, R, NumPy, Pandas, Matplotlib, +113

Sushma Puchakayala

Screened

Mid-level Data Analyst specializing in AI/ML and advanced analytics

USA | 3y exp

Accenture | Murray State University

“Accenture data/ML practitioner who deployed a retail churn prediction and BERT-based sentiment analysis system to production, integrating behavioral + feedback data and operationalizing it with ETL automation, orchestration, and CI/CD. Experienced managing 2TB+ multi-source data, monitoring drift in Databricks, and translating results into Power BI dashboards for marketing teams (including K-means customer segmentation).”

Python, Pandas, NumPy, Matplotlib, Scikit-learn, Seaborn, +122

ASHWINKUMAR PACHIPALA

Screened

Mid-level Full-Stack Java Developer specializing in cloud-native microservices

USA | 4y exp

Epic Systems | Webster University

“Full-stack Java developer with IBM and Epic Systems experience modernizing legacy enterprise apps into microservices and delivering customer-facing healthcare claims workflows at very high scale (2M+ transactions/day). Strong blend of product engineering (APIs + React/TypeScript UI) and production operations on AWS, including performance incident remediation via query optimization, indexing, and autoscaling.”

Java, Python, C#, Spring Boot, Spring MVC, Flask, +136

Rohan Gore

Screened

Intern AI/ML Engineer specializing in agentic systems and full-stack development

New York City, NY | 0y exp

MARV Capital | NYU

“Built and scaled a multi-agent LLM automation pipeline during a fintech internship, growing from a rapid 1-week proof-of-concept to a 15+ agent hierarchical system that cut market brief report generation time from ~5 hours to under 30 minutes. Hands-on with agent frameworks (Haystack, CrewAI, LangChain) and experienced in debugging agent communication issues via sandboxed modular testing and context/token management; also regularly gives architecture-first technical demos at multiple hackathons and university events.”

Apache Cassandra, Apache Hadoop, Apache Kafka, AWS, AWS Lambda, C#, +93

Naga Venkata Padala

Screened

Mid-level AI/ML Engineer specializing in Generative AI, RAG, and real-time fraud detection

4y exp

U.S. Bank | University of Massachusetts Dartmouth

“GenAI/ML engineer who has shipped production agentic systems in highly regulated and high-throughput environments, including an AWS Bedrock-based fraud/compliance workflow at U.S. Bank with PII redaction and hallucination detection that cut investigation time by 50%+. Also built and evaluated RAG and recommendation systems at Target, using RAGAS-driven testing, hybrid retrieval with re-ranking, and SHAP explainability dashboards to align model behavior with merchandising business KPIs.”

AWS, AWS CloudFormation, AWS Glue, AWS Lambda, AI agents, Apache Airflow, +143

Krishna Kandlakunta

Screened

Mid-level Data Scientist specializing in MLOps, LLM/RAG applications, and deep learning

United States | 5y exp

Citigroup | University of North Texas

“Built and deployed a production compliance automation RAG system (at Citi) that generates citation-backed, schema-validated risk summaries for regulatory document review. Emphasizes regulated-environment reliability with retrieval-only grounding, abstention, confidence thresholds, and immutable audit logging, plus orchestration using LangChain/LangGraph and Airflow. Reported ~60% reduction in compliance review effort while maintaining high precision and traceability.”

A/B Testing, Agile, Anomaly Detection, Apache Hadoop, Apache Hive, Apache Kafka, +167

Manichandra Reddy Bethi

Screened

Mid-level GenAI Engineer specializing in production AI agents and evaluation pipelines

Overland Park, Kansas | 5y exp

Minutentag | Wilmington University

“Built and shipped a production LLM-powered internal operations automation platform using LangChain RAG (Pinecone) and FastAPI microservices, deployed on AWS EKS, serving 10k+ daily interactions. Implemented a rigorous evaluation/observability stack (golden datasets, prompt regression tests, MLflow, retrieval metrics, hallucination monitoring) that drove hallucinations below 2% and improved reliability, and partnered closely with non-technical ops leaders to cut manual lookup work by 60%+.”

A/B Testing, Alerting, AWS, AWS Lambda, BERT, CI/CD, +120

Ram Kottala

Screened

Mid-level Data & GenAI Engineer specializing in lakehouse, streaming, and RAG platforms

Michigan, USA | 5y exp

Ford | Webster University

“Built a production internal LLM-powered knowledge assistant using a RAG architecture (Python, LLM APIs, cloud services) that answers employee questions with sourced, grounded responses from internal documents. Demonstrates strong practical depth in retrieval tuning (chunking/metadata filters), orchestration with LangChain, and production reliability practices (latency optimization, automated embedding refresh, evaluation metrics, logging/monitoring) while partnering closely with non-technical operations teams.”

Python, PySpark, Scala, Java, R, SQL, +173

Naga Yanala

Screened

Mid-level Data Engineer specializing in cloud data pipelines and analytics platforms

Texas, USA | 5y exp

Molina Healthcare | Southeast Missouri State University

“Data engineer with healthcare and enterprise experience (Molina Healthcare, Dell Technologies) building and operating high-volume batch + streaming pipelines across AWS and Azure. Strong focus on data quality (schema validation, fail-fast checks), reliability (monitoring/alerts, retries), and performance tuning (Spark/partitioning), with measurable runtime reduction and improved downstream trust.”

Python, SQL, PySpark, Bash, ETL, Data pipelines, +85

Sai Kavyusha Ponnagant

Screened

Mid-level Data Engineer specializing in cloud data pipelines and financial services warehousing

Chicago, IL | 4y exp

Charles Schwab | DePaul University

“Data engineer (Charles Schwab) who took ownership of an unstable, ambiguous nightly financial data pipeline and rebuilt it into a reliable, incremental AWS Glue/Airflow/Redshift system feeding Power BI. Created a custom Python data-quality framework with hard-stop gating and schema drift detection, improving integrity (99.9%), cutting runtime (~20%), and reducing incidents/tickets (35% fewer schema-related dashboard incidents; 30% fewer investigations).”

Python, SQL, Amazon S3, AWS Glue, Amazon Redshift, AWS Lambda, +73
