Vetted PySpark Professionals

Pre-screened and vetted.

PySpark Python SQL Docker AWS CI/CD

Anvith Reddy T

Screened

Mid-level AI/ML Engineer specializing in Generative AI and MLOps

Kansas City, MO | 5y exp

NAIC | University of Central Missouri

“ML/AI engineer with hands-on ownership of fraud detection and investigator-assist systems, combining anomaly detection with RAG-based LLM summarization in production. Stands out for translating research ideas into reliable cloud-deployed workflows that improved precision to 92%, cut review time by 25-30%, and increased investigator throughput by roughly 30% while also building reusable Python infrastructure for team-wide velocity.”

Python SQL PySpark R Java JavaScript +148

Chin-yu Wu

Screened

Junior Data Analyst specializing in sports analytics and business intelligence

Indianapolis, IN | 2y exp

Indianapolis Colts | Indiana University Indianapolis

“Analytics professional in the sports industry who has owned high-impact revenue and compliance data projects for the Colts, turning fragmented Ticketmaster and Salesforce data into trusted real-time reporting. Stands out for combining strong SQL/Snowflake engineering, rigorous validation practices, and stakeholder-facing metric design that drove a record 98% compliance rate and meaningful revenue recovery.”

Python Pandas Scikit-learn SQL Snowflake PostgreSQL +79

Rohan Chodapunedi

Screened

Entry-level Data Scientist specializing in LLMs and analytics

Folsom, CA | 1y exp

App Orchid | Virginia Tech

“Built a zero-to-one AI contract/policy QA agent for compliance and data teams, with a strong emphasis on trust, traceability, and clause-level citations rather than just fluent answers. They combine full-stack product ownership with practical LLM systems design, including hybrid retrieval, structured outputs, and evaluation pipelines to improve reliability, latency, and cost.”

Python SQL PostgreSQL MySQL Java R +83

Meet Doshi

Screened

Mid-level Data Engineer specializing in cloud data platforms and AI/ML analytics

Chicago, IL | 4y exp

EDNA | Northeastern University

“Backend/data engineer in healthcare who built an AWS-based clinical analytics platform from scratch (DynamoDB/S3/Airflow/dbt) with sub-second clinician query goals, 99.9% uptime, and HIPAA-grade controls (KMS encryption, IAM RBAC, audit trails). Also modernized ML delivery by replacing a manual 4-hour deployment with a 30-minute Docker/GitHub Actions CI/CD pipeline using parallel runs, parity testing, and rollback, and caught critical EHR data edge cases (date formats/timezones) that could have impacted patient care.”

Python PySpark SQL R Java Scala +120

Harsha KeladiGanapathi

Screened

Intern Data Scientist specializing in robotics localization and SLAM

Lexington, KY | 1y exp

Infineon | University of New Haven

“Robotics/embodied-AI practitioner who built a TurtleBot3 LiDAR-fingerprint localization pipeline end-to-end (autonomous data collection + multi-head NN) achieving ~30 cm error in a 10x10 m space. Also has industry experience at Infineon building large-scale production data/AI pipelines and rapidly fixing a deployed recommendation system by correcting upstream data normalization, improving accuracy by 20%+.”

Bash C C++ Deep Learning Git Linux +143

Chaitanya Kalagara

Screened

Mid-level Machine Learning Engineer specializing in LLMs, GenAI, and Computer Vision

Boston, MA | 3y exp

Camp4 Therapeutics | Northeastern University

“LLM/agent engineer who built a production multi-agent research automation system using LangGraph (planner, retriever with FAISS, supervisor, evaluator) with structured outputs and citation tracking for traceable reports. Emphasizes reliability and operations—LangSmith-based observability, multi-level testing, hallucination mitigation, and latency/cost controls—plus prior experience as a Computer Vision Software Engineer at Deepsight AI Labs working directly with non-technical customers.”

A/B Testing Amazon EC2 Amazon S3 Amazon SageMaker AWS AWS Lambda +87

Sai Charan C

Screened

Mid-level Generative AI Engineer specializing in LLMs, RAG, and multimodal AI on AWS

CT, USA | 3y exp

HCLTech | University of New Haven

“Built and deployed a production RAG-based enterprise document intelligence platform for financial/compliance/operational documents on AWS (Spark/Glue ingestion, embeddings + vector DB, LangChain orchestration, REST APIs on Docker/Kubernetes). Deep hands-on experience orchestrating multi-step and multi-agent LLM workflows (LangChain, LangGraph, CrewAI) with strong focus on grounding, evaluation, observability, and cost/latency optimization, and has partnered closely with non-technical finance/compliance teams to drive adoption.”

A/B Testing Agile Amazon CloudWatch Amazon DynamoDB Amazon S3 Apache Airflow +139

Alekhya Parimala Koppolu

Screened

Mid-level AI/ML Software Engineer specializing in data pipelines, BI dashboards, and computer vision

Wichita, Kansas | 3y exp

Friends University | Friends University

“Graduate Assistant Intern at Friends University who built and deployed a GenAI-driven requirement understanding system that automates extraction and semantic grouping of technical requirements from large unstructured documents. Demonstrates strong LLM engineering rigor (golden datasets, regression testing, post-processing validation) and production-minded delivery using LangChain/LlamaIndex orchestration, FastAPI microservices, Docker, and cloud deployment.”

Python SQL R Java C C++ +119

Khushbu Kakdiya

Screened

Mid-level AI/ML Engineer specializing in LLMs, RAG pipelines, and cloud MLOps

California, USA | 6y exp

CVS Health | Cleveland State University

“Built and deployed a production LLM/RAG system at CVS to automate clinical documents, addressing PHI compliance, retrieval accuracy, and latency; achieved a 35–40% reduction in review effort through chunking and FP16/INT8 optimization. Also has experience translating AI outputs into actionable insights for non-technical stakeholders (sports analysts).”

Python SQL PySpark R Bash Scikit-learn +114

Nitin Shivakumar

Screened

Senior Data Scientist specializing in healthcare ML, LLMs, and responsible AI

Morris Plains, NJ | 4y exp

Cigna | University at Buffalo

“Clinical data scientist who has built an agentic LLM-powered literature review assistant (with RAG-style storage/retrieval) to identify predictors for downstream predictive modeling. Also delivered a patient-focused progression analysis model using Databricks + Airflow orchestration, partnering closely with clinicians to define targets and validate that model insights aligned with clinical expectations.”

A/B Testing AWS Classification Computer Vision Databricks Data Analysis +72

Akash Shanmuganathan

Screened

Mid-level GenAI & Data Engineer specializing in agentic AI systems and AWS Bedrock

Fort Mill, SC | 4y exp

OneData Software Solutions | Northeastern University

“At OneData, built and deployed an LLM-powered, multi-agent analytics platform on AWS Bedrock that lets users create Amazon QuickSight dashboards through natural-language conversation, cutting dashboard build time from ~30 minutes to ~5 minutes. Strong in production concerns (observability, token/cost tracking, model tradeoffs) and in bridging business + technical work, owning pre-sales pitching through delivery with an engineering management background focused on AI product management.”

Agentic AI Amazon Bedrock Amazon Redshift Amazon RDS Amazon S3 Amazon SNS +95

HemaSri Perumalla

Screened

Mid-level AI/ML Engineer specializing in fraud detection and healthcare predictive analytics

Reston, VA | 4y exp

Truist | University of Central Missouri

“ML/AI engineer with production experience in high-scale banking fraud detection at Truist, building an end-to-end pipeline (Airflow/AWS Glue/Snowflake, PyTorch/sklearn) with automated retraining and Kubernetes-based deployment; delivered measurable gains (22% fewer false positives, 15% higher recall) and reduced manual ops ~40%. Also partnered with clinicians at Kellton to deploy an LLM system for summarizing/classifying clinical notes, improving review time and decision speed.”

A/B Testing Agile Apache Kafka Apache Spark AWS Glue AWS Lambda +108

Shruti Rawat

Screened

Mid-level AI/ML Engineer specializing in LLMs, RAG, and MLOps for financial services

Jersey City, NJ | 4y exp

State Street | Pace University

“Built and deployed a production Llama 3-based RAG document Q&A system using FAISS, addressing context-window limits through chunking and keeping retrieval accurate by regularly refreshing embeddings. Has hands-on orchestration experience with LangChain and LlamaIndex for multi-step LLM workflows (including memory management) and collaborates with non-technical teams (e.g., marketing) to deliver AI solutions like recommendation systems.”

A/B Testing API Integration Apache Airflow AWS AWS Glue AWS Lambda +112

Uchechukwu Okechukwu

Screened

Mid-Level Software Engineer specializing in backend, distributed systems, and AI/LLM platforms

Prairie View, TX | 4y exp

Prairie View A&M University | Prairie View A&M University

“Built and shipped AI-powered workflow automation at Oracle, including an MCP-based agentic workflow with tool-calling and guardrails, plus Grafana monitoring and Confluence documentation. Also led a Django monolith-to-microservices migration at Chamsmobile using blue-green deployment and load balancer traffic splitting to avoid regressions while modernizing production systems.”

AI Agents Algorithms Apache Kafka Artificial Intelligence AWS AWS Lambda +105

Bhavya Sri Gunnapaneni

Screened

Mid-level AI/ML Engineer specializing in fraud detection and NLP

United States | 4y exp

AIG | Lewis University

“Built production AI/RAG-style systems for message Q&A and insurance claims workflows, combining data ingestion, indexing/retrieval, and LLM integration with fallback modes. Has hands-on orchestration experience (Airflow, Prefect, LangChain) and cites large operational gains (claims processing reduced to ~45 seconds; manual review -50%; false alerts -30%) through automated, monitored pipelines and close collaboration with non-technical stakeholders.”

Python SQL R Java TensorFlow PyTorch +125

Swati Swati

Screened

Senior Data Scientist/Software Engineer specializing in ML systems and cloud DevOps

Florida, United States | 5y exp

Voltihost LLC | Stony Brook University

“AI software engineer with experience spanning LLM/RAG production systems and regulated fintech infrastructure. Built an end-to-end natural-language-to-SQL analytics assistant (Weaviate + GPT-4 + Supabase) shipped as an API with 92% accuracy and major time savings for non-technical users, and also owned demand-forecasting and CI/CD/containerization improvements for a Bank of America core banking deployment at Infosys.”

Python R C++ Java Shell Scripting Bash +172

Raj Patel

Screened

Junior Machine Learning Engineer specializing in LLMs and RAG systems

Remote, USA | 1y exp

Emotionall | NYU Tandon School of Engineering

“Production-focused applied ML/LLM engineer who has deployed an LLM-powered RAG assistant and improved reliability through rigorous retrieval evaluation (recall/MRR), reranking, and guardrails that prevent confident wrong answers. Experienced running containerized ML/LLM services on Kubernetes (including AWS-managed layers) with CI/CD and observability, and has delivered a real-time predictive maintenance system using streaming sensor data and time-series anomaly detection in close partnership with maintenance teams.”

Python Java TensorFlow PyTorch Scikit-Learn Large Language Models (LLMs) +86

Haritha Kuraparthi

Screened

Mid-level Full-Stack Developer specializing in cloud data engineering and analytics

West Haven, CT | 4y exp

Blackbaud | University of Bridgeport

“Software developer with hands-on experience owning customer-facing work end-to-end (requirements, implementation, testing, and feedback-driven iteration) using Python and React.js. Also described remodeling an internal legacy page/tool to improve performance and accuracy, and has exposure to microservices and RabbitMQ plus ETL-based system work.”

Python NumPy Pandas JavaScript Node.js Java +81

Hari Billa

Screened

Mid-level Data Scientist specializing in machine learning, NLP, and healthcare AI

USA | 3y exp

HCA Healthcare | Southern Arkansas University

“Senior data scientist with hands-on ownership of production ML and GenAI systems across enterprise churn, clinical Q&A, and real-time fraud detection. Stands out for combining strong MLOps discipline with measurable business impact, including $2M+ retained revenue, 10K TPS low-latency fraud infrastructure, and a clinician-reviewed RAG system that improved retrieval accuracy by ~38%.”

Python Pandas NumPy Scikit-learn Matplotlib Seaborn +107

Ruturaj Dixit

Screened

Junior Data Scientist specializing in AI/ML and product analytics

New York, NY | 2y exp

Pace University | Pace University

“Applied ML/data scientist who has owned backend-heavy AI systems end-to-end, including a market-signal platform on FastAPI/AWS and rapid MVP delivery in medical computer vision. Particularly interesting for teams needing someone who can combine model development, backend APIs, production debugging, and pragmatic low-latency architecture decisions.”

Data Science Machine Learning Artificial Intelligence A/B Testing SQL Python +138

Aditya Anil Raut

Screened

Junior Software Engineer specializing in AI/ML, data pipelines, and cloud APIs

San Jose, CA | 3y exp

TCS | California State University, Chico

“Hands-on AI/LLM practitioner who built a RAG-based customer support chatbot and tackled production issues like data chunking complexity and response-time lag. Uses techniques such as overlapping chunks, semantic search, context engineering, and query routing, and has experience presenting technical demos/workshops to developer audiences.”

AWS AWS Lambda Bootstrap C C++ ChromaDB +106

Abdul Tanimu

Screened

Senior Full-Stack Software Engineer specializing in cloud-native web applications

Houston, TX | 7y exp

Techwave | University of North Texas

“Backend/data engineer who built a production booking platform on FastAPI microservices (Postgres/Redis/gRPC) and delivered AWS infrastructure spanning Lambda, ECS, SQS, and Glue-to-Redshift analytics. Demonstrated measurable SQL optimization (10 minutes to <40 seconds) and strong operational ownership through monitoring, incident response, and schema-evolution hardening.”

Python JavaScript TypeScript PHP SQL NoSQL +81

Yash Sanap

Screened

Junior Data Scientist specializing in ML, geospatial analytics, and LLM applications

Virginia Beach, VA | 2y exp

City of Virginia Beach | George Mason University

“Built and deployed a production AI “term explainer” agent that adapts explanations to beginner/intermediate/expert users by combining multi-step LLM reasoning with grounded Wikipedia retrieval. Owns end-to-end agent orchestration (smolagents/Python), reliability patterns (fallback across LLM providers, retries, guardrails), and observability/metrics-driven evaluation; also partnered with a non-technical researcher to deliver a plain-language research assistant agent.”

Python SQL Java Go Bash JavaScript +95

Vaishnavi K

Screened

Mid-level AI/ML Engineer specializing in GenAI, MLOps, and anomaly detection

USA | 5y exp

TCS | University of New Haven

“LLM/MLOps engineer who has shipped a production RAG-based technical documentation assistant (FastAPI) cutting manual review by 45%, with deep hands-on retrieval optimization in Pinecone/LangChain (HNSW, hybrid + multi-query search, caching). Also brings healthcare domain experience—building Airflow-orchestrated EHR pipelines and delivering FDA-auditability-friendly predictive maintenance solutions using SHAP/LIME explainability surfaced in Power BI.”

A/B Testing Amazon EC2 Amazon S3 Amazon SageMaker Apache Airflow Apache Hadoop +135
