Vetted PySpark Professionals

Pre-screened and vetted.

NA

Mid-level Full-Stack Software Engineer specializing in AI platforms and microservices

Mooresville, NC · 6y exp
Lowe's · University of North Carolina at Charlotte

Backend engineer currently building an AWS Lambda/FastAPI inventory recommendation system using a LangChain + GPT-4 RAG pipeline and MongoDB vector search; drove major cost optimization via Redis caching (60% reduction) while sustaining 10k+ daily requests under 2s latency. Previously deployed Node.js microservices on AWS OpenShift with Jenkins/Helm at UnitedHealth Group and led a zero-downtime monolith-to-microservices migration at Verizon, including RabbitMQ-based real-time messaging with DLQs and idempotency.

KG

Senior AI Engineer specializing in Agentic AI and distributed systems

Charlotte, NC · 4y exp
UnitedHealth Group · University of North Carolina at Charlotte

LLM/agentic workflow engineer with healthcare domain experience who built a HIPAA-compliant multi-agent RAG system for clinical review automation at UnitedHealth Group, achieving 92% precision and cutting latency 40% through async orchestration and Redis semantic caching. Also has strong data engineering orchestration background (Airflow on AWS EMR with Great Expectations) and a proven clinician-in-the-loop feedback process that improved model faithfulness by 18%.

VA

Mid-level Data Scientist specializing in Generative AI and NLP for financial risk

Glassboro, NJ · 4y exp
S&P Global · Rowan University

Built and shipped production generative AI/RAG assistants in regulated financial contexts (S&P Global), automating compliance-oriented Q&A over earnings reports/filings with grounded answers and citations. Experienced across the full stack: AWS-based ingestion (PySpark/Glue), vector retrieval with LangChain agents, GPT-4/Claude model selection, production reliability (monitoring, caching, retries), and rigorous evaluation and regression testing.

HE

Mid-level AI/ML Engineer specializing in cloud data engineering and GenAI

Florida, USA · 6y exp
LexisNexis · University of South Florida

AI/LLM engineer with production experience in legal tech: built a GPT-4 + LangChain RAG summarization system at Govpanel that reduced legal case-file review time by 50%+. Previously at LexisNexis, orchestrated end-to-end Airflow data/AI pipelines processing 5M+ legal documents daily, improving ETL runtime by 35% with robust validation, monitoring, and SLAs.

SK

Mid-level Data Engineer specializing in cloud data platforms and real-time analytics

Saint Louis, MO · 5y exp
Cigna · Saint Louis University

Customer-facing data engineering professional who builds and deploys real-time reporting/dashboard solutions, gathering reporting and compliance requirements through direct stakeholder engagement. Experienced with Google Cloud IAM governance, secure integrations (encryption, audit logging), and fast production troubleshooting of ETL/pipeline failures with follow-on monitoring and automated recovery improvements; motivated by hands-on, travel-oriented customer work.


Hritvik Gupta

Screened

Mid-level AI Engineer specializing in LLMs, RAG, and healthcare AI

San Francisco, CA · 3y exp
Penn Medicine · UC Riverside

Built and scaled an AI-powered voice/chat patient engagement platform at Penn Medicine from early prototype into production clinical workflows, focusing on latency, edge cases, and user trust. Strong in LLM reliability engineering (structured prompts, validation/fallbacks), real-time troubleshooting with observability, and cross-functional enablement through pilots, demos, and sales/customer partnership.

Bala Venkateswarlu K

Mid-level Data Scientist specializing in Generative AI, NLP, and MLOps

USA · 5y exp
MetLife · Harrisburg University of Science and Technology

Built and deployed an LLM-powered claims-document summarization system (insurance domain) that cut agent review time from 4–5 minutes to under 2 minutes and saved 1,200+ hours per quarter. Hands-on across orchestration and production infrastructure (Airflow retraining DAGs, Kubernetes, SageMaker endpoints, FastAPI) and recent RAG workflows using n8n + Pinecone, with a strong focus on reliability, cost, and explainability for non-technical stakeholders.

SG

Mid-level Data Analyst/Data Engineer specializing in BI, ETL pipelines, and cloud analytics

4y exp
Verizon · Lindsey Wilson College

Data engineer focused on marketing/web analytics and external API pipelines, handling ~10M records/week. Built Azure-based ingestion and PySpark transformations with rigorous data quality checks, then served curated datasets into Synapse/Redshift for Power BI. Also designed an Airflow-orchestrated crypto REST API pipeline with monitoring, retries/exponential backoff, schema-change detection, and backfill-friendly reprocessing.

KS

Krish Shah

Screened

Junior AI Engineer specializing in LLM systems and analytics

Miami, FL · 2y exp
CoUnderscore · Purdue University

Analytics-focused candidate with internship and project experience at Recotap and CoUnderscore, combining SQL, Python, and BI dashboards to turn messy marketing and engagement data into decision-ready reporting. Stands out for tying analytics work to business outcomes, including ~15% CTR improvement, identifying ~40% misattributed spend, and enabling a ~$75K budget shift through better targeting.

DI

Mid-level Data Analyst specializing in financial risk and data automation

McLean, VA · 5y exp
Capital One · Florida International University

Analytics professional from Capital One with strong experience automating risk, reconciliation, and regulatory reporting workflows in financial services. They combine deep SQL/Python pipeline skills with stakeholder-facing dashboard and KPI design, delivering measurable impact like 30% performance gains, sub-24-hour anomaly detection, and 100% data integrity for regulatory filings.

CT

Mid-level AI Engineer specializing in LLMs, MLOps, and healthcare NLP

4y exp
HCA Healthcare · University of South Florida

Built a production, real-time clinical documentation system at HCA that converts doctor–patient conversations into structured clinical summaries using speech-to-text, LLM summarization, and RAG. Demonstrated measurable gains from medical-domain fine-tuning (clinical concept recall +18%, ROUGE-L 0.62 to 0.74) while meeting HIPAA constraints via PHI anonymization and encryption, and deployed via Docker/FastAPI with CI/CD and monitoring.

RK

Senior AI/ML Engineer specializing in LLMs, generative AI, and applied research

Boca Raton, FL · 10y exp
ModMed · Florida Atlantic University

Research-heavy ML/AI candidate with a PhD/publications background who translated LLM evaluation and clinical summarization techniques into production at ModMed. They owned an end-to-end healthcare GenAI pipeline that cut clinician documentation time from ~22 minutes to ~7-8 minutes, reduced token costs by ~30%, and built an internal evaluation framework later adopted by multiple teams.

SN

Mid-level AI/ML Engineer specializing in GenAI, NLP, and financial systems

Texas, USA · 5y exp
Citibank · Concordia University, St. Paul

GenAI/ML engineer with hands-on experience building production financial intelligence and document summarization systems at Citibank. Stands out for combining LLM fine-tuning, hybrid RAG, multi-agent workflows, and strong MLOps/observability practices to deliver measurable business impact, including 60% faster analyst retrieval, 31% higher precision, and 99%+ uptime.

VS

Senior AI/ML Engineer specializing in Generative AI, LLMs, and MLOps

Tampa, FL · 9y exp
Verizon · Jawaharlal Nehru Technological University

Telecom (Verizon) AI/ML practitioner who built a production multimodal system that ingests messy customer issue reports (calls, chats, emails, screenshots, videos) and turns them into confidence-scored incident summaries with reproducible steps and evidence links. Also built KPI/alarm-to-ticket correlation to rank likely root-cause domains (RAN/Core/Transport), cutting triage from hours to minutes and improving MTTR.

MP

Meghana P

Screened

Mid-level AI/ML Engineer specializing in Generative AI, LLMs, and NLP

Illinois, USA · 5y exp
State Farm · Saint Louis University

AI/ML engineer with forensic analytics and healthcare claims experience (Optum), building production LLM/RAG systems to surface context-driven fraud patterns from unstructured claim notes and explain risk to investigators. Strong in large-scale retrieval performance tuning, legacy API integration with reliability patterns (SQS, circuit breakers), and MLOps orchestration on Airflow/Kubernetes with rigorous testing, monitoring, and stakeholder-friendly interpretability.

SM

Mid-level Full-Stack Software Developer specializing in cloud-native microservices

WI, USA · 3y exp
Cardinal Health · Anderson University

Full-stack engineer with enterprise experience at Metasystems Inc. (and Qualcomm) building high-traffic, security-sensitive systems, including end-to-end ownership of a secure transaction processing module built with Java/Spring Boot, Python/Django, and React. Strong AWS production operations (EKS/ECS/Lambda/RDS/DynamoDB) with IaC (Terraform/CloudFormation), observability, and reliability patterns; also delivered resilient ETL/integration pipelines with idempotency/retries/backfills and achieved a 50% deployment-time reduction through CI/CD and modular refactoring.

HS

Harsha Sikha

Screened

Mid-level AI/ML Engineer specializing in Generative AI and data engineering

Armonk, New York · 4y exp
IBM · Saint Peter's University

IBM engineer who built and deployed a production RAG-based LLM assistant using LangChain/FAISS with a fine-tuned LLaMA model, served via FastAPI microservices on Kubernetes, achieving 99%+ uptime. Demonstrates strong practical expertise in reducing hallucinations (semantic chunking + metadata-driven retrieval) and managing latency, plus mature MLOps practices (Airflow/dbt pipelines, MLflow tracking, monitoring, A/B and shadow deployments) and effective collaboration with non-technical stakeholders.

YL

Yun-Hao Lee

Screened

Junior Machine Learning Engineer specializing in LLM deployment and computer vision

Dallas, TX · 2y exp
Lab for Intelligent Storage and Computing · University of Texas at Dallas

Robotics/AI candidate who built an AI-driven landmark location tool during a summer internship at Mobile Drive, combining YOLOv5 object detection with OpenStreetMap-based geolocation to handle dense, cluttered urban environments. Also researched deploying LLM-based agents on constrained hardware using quantization plus LoRA/continuous learning, improving accuracy from ~80% to ~92%, with an emphasis on production logging for reliability.

AS

Mid-level AI/ML Engineer specializing in Generative AI and production ML systems

United States · 5y exp
CVS Health · University of Maryland, Baltimore County

At CVS Health, the candidate productionized a RAG-based LLM solution in a regulated healthcare setting, emphasizing reliable data pipelines, LoRA fine-tuning, monitoring, safety guardrails, and A/B testing. They have hands-on experience troubleshooting real-time RAG failures (e.g., chunking/embedding issues) and regularly lead developer-focused demos/workshops while translating technical architecture into business value for stakeholders.

HC

Mid-level Data Engineer specializing in cloud data platforms and scalable ETL pipelines

USA · 3y exp
HCLTech · University of New Haven

Data engineer (~4 years) with full-stack delivery experience (Next.js App Router/TypeScript + React) building a real-time operations monitoring dashboard backed by Kafka and orchestrated data pipelines. Strong production focus: Airflow + CloudWatch monitoring, automated Python/SQL validation (99.5% accuracy), and CI/CD with Jenkins/Docker; has delivered measurable improvements in latency, pipeline reliability, and query performance (Postgres/Redshift).

TK

Mid-level AI Engineer specializing in LLM orchestration, RAG, and multi-agent systems

Houston, TX · 4y exp
University of Houston · University of Houston

Research Assistant at the University of Houston who built and live-deployed a production RAG system for 1000+ research documents, using hybrid retrieval (dense+BM25+RRF) with cross-encoder reranking and RAGAS-based evaluation; reported 66% MRR, 0.85+ faithfulness, and 68% lower LLM inference costs. Also built a deployed LangGraph multi-agent research system (Researcher/Critic/Writer) with tool integrations (Tavily, arXiv) and dual memory (ChromaDB + Neo4j), plus freelance automation work delivering a WhatsApp chatbot and n8n workflows for a wholesale clothing business.

SC

Mid-level Full-Stack Developer specializing in React/Node, GraphQL, and Databricks lakehouse

Dallas, TX · 6y exp
Southern Glazer's Wine & Spirits · Webster University

Full-stack engineer currently at Southern Glazer’s who built and owned a real-time commercial finance expense analytics dashboard end-to-end (Next.js App Router + TypeScript), including post-launch monitoring, data quality checks, and stakeholder-driven iteration. Strong data/analytics backend experience (Postgres modeling and Databricks Delta Lake pipelines) with demonstrated performance wins—e.g., cutting a key reconciliation query from 8–12s to <400ms and improving frontend load time ~40% with a 25% bounce-rate drop at Verizon.

SH

Mid-level Data Engineer specializing in cloud ETL/ELT and lakehouse architecture

Jersey City, NJ · 4y exp
State Street · University of New Haven

Data engineer focused on sales/marketing analytics pipelines, owning ingestion from CRMs/ad platforms through warehouse serving and dashboards at a scale of hundreds of thousands of records/day. Built reliability-focused systems including dbt/SQL/Python data quality gates with alerting, a resilient web-scraping pipeline (retries/backoff, anti-bot tactics, schema-change detection, backfills), and a versioned internal REST API with caching and strong developer usability.

SP

Mid-level Data Engineer specializing in real-time streaming and cloud data platforms

New York, NY · 4y exp
Wells Fargo · University of Birmingham

Data engineer with Wells Fargo experience owning an end-to-end lakehouse ETL pipeline on Databricks/Azure Data Factory, processing ~480GB daily and implementing robust data quality/reconciliation across 40+ tables to reach ~99.3% reliability. Strong in performance optimization (cut runtime 5.5h→3.8h), CI/CD and monitoring, and resilient external/API ingestion with retries, schema validation, and backfills.
