Vetted PySpark Professionals

Pre-screened and vetted.

NK

Senior Data Scientist and AI Engineer specializing in NLP, LLMs, and MLOps

Milwaukee, WI | 10y exp
Caterpillar | West Virginia University
View profile
KR

Mid-level AI/ML Engineer specializing in Financial Services

Atlanta, GA | 4y exp
American Express | University at Buffalo
View profile
WL

Senior Machine Learning Engineer specializing in GenAI, LLMs, and MLOps

Houston, TX | 11y exp
Paramount+ | University of Houston
View profile
AL

Senior Machine Learning Engineer specializing in GenAI, LLMs, and MLOps

Houston, TX | 11y exp
Paramount+ | University of Houston
View profile
RK

Mid-level Software Engineer specializing in distributed backend systems for FinTech

Los Angeles, CA | 5y exp
BlackRock | California State University, Long Beach
View profile
DT

Mid-level Full-Stack Engineer specializing in cloud-native enterprise and FinTech systems

Sunnyvale, CA | 6y exp
Walmart | California State University, East Bay
View profile
SH

Senior AI Architect specializing in Generative AI and LLM systems

New York City, NY | 8y exp
Rezolve AI
View profile
MJ

Mid-level Data Engineer specializing in AWS data lakes for healthcare and financial services

5y exp
Cigna
View profile
SD

Senior Data Scientist specializing in NLP, MLOps, and cloud ML platforms

Westfield Center, OH | 7y exp
Westfield Insurance
View profile
HS

Mid-level Java Full-Stack Developer specializing in cloud-native microservices

Dallas, TX | 4y exp
Baylor Scott & White
View profile
KR

Senior AI Python Engineer specializing in Generative AI and MLOps

San Francisco, CA | 8y exp
Silicon Valley Bank
View profile
DP

Daniel Parraga

Screened References | Moderate rec.

Director-level engineering leader specializing in platform architecture and cloud modernization

Kirkland, WA | 30y exp
Deep Sync | Escuela Superior Politécnica del Litoral

Senior engineering leader with 8+ years of hands-on and people leadership experience across data-intensive enterprise platforms. He has led legacy-to-AWS modernization for mission-critical identity data workflows at Deep Sync, built and scaled teams rapidly, and previously helped create a 0-to-1 enterprise analytics platform at Kantar that later scaled to handle 10x more data with major performance gains.

View profile
MS

Mid-Level Software Engineer specializing in Cloud Infrastructure and Full-Stack Platforms

San Jose, CA | 6y exp
Gembizz | San José State University

Built and shipped a production LLM-powered grading platform that automates rubric-aligned scoring and feedback, with strong guardrails (RAG grounding, structured JSON, validation/retries) and operational rigor (metrics, drift monitoring). Experienced using CrewAI to orchestrate multi-agent workflows end-to-end and validating quality via gold-set benchmarking against human graders with regression testing on every prompt/model change.

View profile
JB

Mid-level Python Developer specializing in cloud-native microservices for FinTech and Insurance

Charlotte, NC | 6y exp
Wells Fargo | University of Bridgeport

Backend/data engineer who has maintained high-traffic FastAPI microservices and delivered a hybrid AWS serverless+containers platform using Terraform and GitHub Actions, with secrets managed via Secrets Manager/SSM. Also led modernization of a mission-critical 10,000+ line SAS financial reporting engine into Python microservices and built AWS Glue ETL pipelines feeding a centralized data lake.

View profile
PK

Mid-level AI/ML Engineer specializing in NLP, GenAI, and MLOps in healthcare and finance

USA | 5y exp
CVS Health | University of Houston

AI/ML engineer with CVS Health experience deploying production LLM systems in regulated healthcare settings, including a large-scale RAG solution (1M+ documents) built for compliance-grade, auditable policy/regulatory Q&A with strong anti-hallucination controls. Also delivered an NLP summarization system for physician notes/case narratives by partnering closely with non-technical care operations stakeholders and iterating via prototypes, dashboards, and feedback loops.

View profile
GS

Mid-level Data Scientist & Generative AI Engineer specializing in LLMs and RAG

Auburn Hills, MI | 4y exp
Stellantis | University of Cincinnati

ML/NLP practitioner who built a retrieval-augmented generation (RAG) system for large financial and operational document sets using Sentence-Transformers (all-mpnet-base-v2) and a vector DB (e.g., Pinecone), with a strong focus on retrieval evaluation and chunking strategy optimization. Experienced in entity resolution (rules + embedding similarity with type-specific thresholds) and in productionizing scalable Python data workflows using Airflow/Dagster and Spark.

View profile
RP

Ruudra Patel

Screened

Junior Data Scientist specializing in ML, LLMs, and RAG applications

Atlanta, GA | 3y exp
Georgia State University | Georgia State University

University hackathon finalist (2nd place) who built CareerSpark, a production-style multi-agent career guidance app in 24 hours using a hierarchical debate architecture with a moderator/judge agent. Has startup internship experience at LiveSpheres AI using LangChain for multi-LLM orchestration, and demonstrates a structured approach to testing/evaluation (golden sets, integration sims, latency/accuracy KPIs) plus strong non-technical stakeholder communication.

View profile
RK

Rahul Karanam

Screened

Senior Computer Vision & Robotics Engineer specializing in perception and warehouse automation

San Jose, CA | 5y exp
Roboteon | University of Maryland, College Park

Robotics engineer with hands-on experience scaling a multi-vendor heterogeneous warehouse robot fleet, building a distributed “traffic manager” for collision avoidance and real-time rerouting using CBS/MAPF and DCOP-style negotiation. Strong real-time/safety-critical systems background (RTOS, deterministic lock-free multithreading) plus modern perception and simulation tooling (CNN-LSTM/transformers, CARLA/Isaac Sim, VIO/GTSAM, camera-IMU calibration). Startup-oriented and comfortable moving quickly from prototype to production.

View profile
AR

Mid-level AI/ML Engineer specializing in Generative AI, RAG, and MLOps

3y exp
State Farm | Cleveland State University

Built a secure, on-prem/private GPT assistant to replace manual SharePoint-style search across thousands of policies/SOPs/engineering docs, using a production RAG stack (LangChain/LangGraph, FAISS/Chroma, PyMuPDF+OCR, vLLM). Implemented layout-aware ingestion (including table-to-JSON) and a multi-agent retrieval/generation/verification workflow with strong observability and compliance guardrails, delivering ~70% reduction in search time.

View profile
SN

Senior Data Engineer specializing in cloud data platforms and ML pipelines

Atlanta, GA | 8y exp
Berkshire Hathaway | University of Alabama at Birmingham

Data engineer focused on AWS-based enterprise data platforms, owning end-to-end pipelines from multi-source batch/stream ingestion (Glue/Kinesis/StreamSets/Airflow) through PySpark transformations into curated datasets for Redshift/Snowflake. Emphasizes production reliability with strong monitoring/observability and data quality gates, and reports ~30% performance improvement plus improved SLAs and latency after optimization.

View profile
RG

Mid-level Backend Python Engineer specializing in APIs, microservices, and data pipelines

USA | 4y exp
Marsh McLennan | Florida Atlantic University

Backend engineer (Marsh McLennan) who evolved a high-volume claims automation pipeline in Python, emphasizing thin APIs with background job processing, strong validation/retries, and production-grade observability. Experienced in secure FastAPI API design (centralized JWT/RBAC), multi-tenant Postgres/Supabase-style row-level security, and low-risk refactors using parallel runs and feature flags; targeting founding-engineer scope roles.

View profile
GK

Mid-level Backend Software Engineer specializing in cloud-native distributed systems (Healthcare IT)

USA | 3y exp
UnitedHealth Group | NJIT

Data engineer with healthcare domain experience who has owned end-to-end pipelines and APIs at UnitedHealth Group, processing ~8M records per batch. Strong focus on data quality (multi-layer validation), reliability (monitoring/logging, retries/idempotency), and performance (Spark/SQL tuning, caching), with experience standing up early-stage systems using Python, Docker, and CI/CD.

View profile
