Vetted PySpark Professionals

Pre-screened and vetted.

Bhuvan Chandi

Screened

Mid-level Data Engineer specializing in AI/ML data platforms

NY, NY · 6y exp
BlackRock · Webster University

Built and productionized an LLM-powered PDF document Q&A system to eliminate manual searching through long documents, focusing on scalability and answer reliability. Implemented semantic chunking (using headings/paragraphs/tables), overlap, and preprocessing/quality checks to reduce hallucinations, and orchestrated the end-to-end pipeline with Airflow using retries, alerts, and parallel tasks.

Rohit Khoja

Screened

Mid-level Full-Stack Engineer specializing in cloud microservices and NLP/LLM systems

Tempe, AZ · 4y exp
Citigroup · Arizona State University

Full-stack engineer with 3+ years using Java/Spring Boot (Citi) and React, who built a production observability dashboard monitoring 53 microservices across 17 clusters with real-time health/latency tracing and significant performance improvements (cut load time from ~10s). Also designed a serverless AWS face-recognition system (Lambda/S3/SQS) built to handle burst traffic (~1000 concurrent requests), demonstrating strength in scalable, event-driven architectures.

SS

Mid-level Data Engineer specializing in real-time pipelines and cloud analytics

Chicago, IL · 5y exp
JPMorgan Chase · University of South Dakota

Researcher from the University of South Dakota who built a production medical RAG system to help interpret model predictions by retrieving relevant clinical notes and medical literature, overcoming retrieval accuracy and imaging-dataset challenges through semantic chunking and metadata-driven indexing. Also has hands-on orchestration experience with Airflow and Azure Data Factory, plus a pragmatic approach to LLM evaluation and stakeholder-driven iteration.

SR

Senior Data Scientist specializing in machine learning and customer analytics

Illinois, USA · 7y exp
Northern Trust · Bradley University

Data/ML practitioner with experience applying NLP and classical ML to large-scale customer data (2B+ records) for segmentation, prediction, and survey-text classification, delivering measurable business impact (~18% engagement efficiency). Has hands-on entity resolution across multi-source datasets and has built embedding-based semantic search using SentenceBERT + a vector database with domain fine-tuning (~20% relevance improvement), plus production workflow experience with Spark/Airflow and cloud tooling (AWS/Azure).

GJ

Mid-level Machine Learning Engineer specializing in MLOps, NLP, and Computer Vision

USA · 5y exp
Walmart · University of New Haven

ML/AI engineer with production experience across retail and healthcare: built a real-time computer-vision shelf monitoring system at Walmart and optimized edge inference latency by ~30% using TensorRT/ONNX and pruning. Also partnered with CVS Health clinical/pharmacy teams to deliver a medication-adherence predictive model, using Streamlit explainability dashboards and achieving an 18% adherence improvement.

Rahul Hatkar

Screened

Mid-level AI/ML Engineer specializing in LLMs, RAG pipelines, and MLOps

San Francisco, CA · 6y exp
Scale AI · Webster University

AI/ML engineer who has shipped production AI systems end-to-end, including an automated multi-channel (Gmail/WhatsApp/voice) candidate interviewing workflow and an enterprise RAG knowledge search platform. Demonstrates strong production rigor (monitoring, A/B tests, guardrails, schema validation, shadow testing) with quantified impact: ~60–70% reduction in interview evaluation time and ~20–30% relevance gains in RAG retrieval.

DM

Mid-level Generative AI Engineer specializing in decision intelligence and RAG for regulated enterprises

5y exp
JPMorgan Chase · Saint Louis University

Healthcare GenAI engineer who built a HIPAA-compliant, auditable RAG-based claims decision support system at Molina Healthcare, processing 3M claims and delivering major impact (48% faster manual reviews, 43% higher decision accuracy). Deep hands-on experience with LangChain orchestration, vector search (ChromaDB/FAISS), embedding fine-tuning, and safety controls (confidence scoring, rule validation, human-in-the-loop escalation) for clinical workflows.

DK

Senior Data Engineer specializing in Azure Lakehouse, Databricks/Spark, and Snowflake

Richardson, TX · 6y exp
PwC · University of Central Missouri

Data engineer/platform builder with experience across PwC and Liberty Mutual delivering high-volume, production-grade pipelines and real-time data services. Has owned end-to-end streaming + batch architectures on AWS and Azure, including web scraping systems, with quantified reliability gains (99.9% availability, 90%+ error reduction, 30% latency reduction) and strong observability/CI-CD practices.

AP

Mid-level Machine Learning Engineer specializing in fraud detection and LLM applications

Charlotte, NC · 5y exp
Bank of America · University of North Carolina at Charlotte

Unreal Engine UI engineer focused on scalable, production-ready UI architecture (C++/Slate/UMG/CommonUI) with strong designer enablement via decoupled, interface-driven patterns and MVVM. Demonstrated measurable performance wins: replaced 200+ per-frame Blueprint bindings to cut UI prepass/paint from 4.2ms to 0.5ms and reduced VRAM by ~120MB using texture streaming proxies.

Palash Gharde

Screened

Mid-level Software Development Engineer specializing in backend, data engineering, and ML systems

Arizona, USA · 5y exp
ServiceNow · Arizona State University

ML/Backend engineer with ServiceNow experience building production-grade inference services on FastAPI with Docker/Kubernetes (autoscaling, health checks) and strong reliability practices (monitoring, retries/timeouts, fallbacks). Delivered measurable improvements including 30% lower API latency and 18% higher model accuracy, and built A/B testing plus drift-triggered retraining loops to keep models stable in production.

Sayali Patil

Screened

Mid-level Python Full-Stack Developer specializing in Healthcare and FinTech

Everett, MA · 6y exp
Kaiser Permanente · Harrisburg University of Science and Technology

Backend engineer with hands-on experience building a fraud-transaction monitoring system in Python/Flask, architected as Dockerized microservices and integrated with Kafka for high-volume streaming. Demonstrates strong performance and reliability chops across PostgreSQL/SQLAlchemy tuning (EXPLAIN ANALYZE, N+1 fixes, bulk ops), multi-tenant data isolation, and scaling via background workers + Redis caching, plus real-time ML inference deployment using TensorFlow on AWS.

Harshavardhan Reddy

Mid-level AI/ML Data Scientist specializing in NLP, computer vision, and risk analytics

Albany, NY · 5y exp
Capital One · Pace University

ML/AI engineer with Capital One experience building production-grade customer segmentation and fraud detection systems combining NLP (transformers) and anomaly detection. Strong MLOps and orchestration background (PySpark ETL, MLflow, Airflow, Docker/Kubernetes, Azure ML) with real-time monitoring/alerting and performance optimizations like quantization and caching, plus proven ability to deliver business-facing insights through Power BI/Tableau for marketing stakeholders.

Akshit Modi

Screened

Mid-level AI/ML Engineer specializing in healthcare NLP and MLOps

Remote, USA · 5y exp
Tempus · Arizona State University

Healthcare/clinical ML practitioner who built and productionized ClinicalBERT-based pipelines to extract and standardize oncology EHR data, improving downstream model F1 from 0.81 to 0.92 while controlling training cost via LoRA/QLoRA. Experienced orchestrating real-time AWS ETL/ML workflows (Glue, Lambda, SageMaker) and partnering with clinicians using SHAP-based interpretability, contributing to an 18% reduction in readmissions and full adoption.

Bhavyasree Chinthala

Mid-level Data Engineer specializing in cloud data pipelines and real-time streaming

USA · 5y exp
PNC · Saint Peter's University

Data engineer with PNC Bank experience owning high-volume financial transaction pipelines end-to-end (Kafka/REST ingestion through Spark/Glue transformations to Redshift serving) for risk and fraud analytics. Built strong reliability and data quality practices (Great Expectations, reconciliation, Airflow alerting, idempotent retries, incremental/windowed processing), reporting 40% ingestion efficiency gains and ~99.9% data accuracy.

Lance Chou

Screened

Intern Machine Learning Engineer specializing in NLP and MLOps

Canada · 1y exp
Vosyn · Columbia University

PhD-led research engineer who has shipped LLM-powered agents for automated knowledge extraction from STEM textbooks/papers into a graph database, reporting a 90% accuracy improvement and major reductions in manual curation time. Also built an end-to-end multi-agent news aggregation/sentiment pipeline using the Agno framework with Pydantic-structured outputs, retries, and monitoring, and has experience processing messy SEC filings.

AC

Mid-level Business Data Analyst specializing in healthcare analytics

USA · 6y exp
Johnson & Johnson · Governors State University

Analytics-focused candidate with strong SQL, Excel, Python, and Tableau skills who supports payroll-, compensation-, and finance-adjacent processes through rigorous data validation and reconciliation. Stands out for uncovering a duplicate-record mapping issue that exposed roughly $250K in revenue leakage and for building repeatable controls, dashboards, and automated checks to improve reporting accuracy.

Kristina Shen

Screened

Intern-level Data Scientist and ML Engineer specializing in analytics and AI systems

Long Island City, NY · 1y exp
DataLynn · University of Chicago

Early-career analytics candidate with hands-on experience in SQL/Python data pipelines, Tableau reporting, and marketing engagement analytics across internship and startup settings. Stands out for combining rigorous data quality practices with practical AI system design, including an end-to-end GPT-4 grading capstone that emphasized explainability and human oversight.

Hao Liang

Screened

Mid-level Data Scientist specializing in GenAI, customer insights, and forecasting

Durham, NC · 5y exp
BASF · University of North Carolina at Chapel Hill

ML/AI practitioner with hands-on experience shipping production time-series forecasting and RAG-based customer insights platforms in an enterprise setting. At BASF, he improved seed sales forecasting beyond naive baselines using model selection tailored by brand size, and he also led a RAG solution over Salesforce reports, complaints, and surveys that reached 2,000+ users with strong daily engagement.

Amit Dharam

Screened

Junior AI/ML Software Engineer specializing in backend systems and cloud deployment

Tempe, AZ · 3y exp
Arizona State University · Arizona State University

Built multiple end-to-end automation and data systems, including an Accio RAG pipeline combining PDF parsing, FastAPI, Neo4j, and vector search, plus Selenium-based scraping for a virtual try-on product. Stands out for reliability-minded engineering: automated testing, structured logging, validation layers, and a data-driven approach to debugging flaky automation that improved CI pass rates to over 98%.

SP

Junior AI/ML Software Engineer specializing in LLMs and data-intensive systems

New York, NY · 3y exp
NYU Langone Health · NYU

AI/backend engineer who has owned production applied-ML systems end to end, including a Jitsi meeting intelligence platform with custom RoBERTa boundary detection, LLM summarization, and automated retraining from user feedback. Also has healthcare AI experience building a diabetes medication titration system with strict validation, drift monitoring, and safety guardrails—showing both product speed and high-stakes engineering rigor.

PM

Mid-level AI/ML Engineer specializing in LLM agents and workflow automation

4y exp
UnitedHealth Group · Kansas State University

AI/LLM engineer with strong healthcare domain depth who has shipped production-grade agents for care coordination and clinical workflow automation. Stands out for combining Knowledge Graph RAG, LangGraph orchestration, and rigorous eval/guardrail systems to improve reliability in high-stakes environments, with measurable gains in review time, hallucination reduction, latency, and clinician adoption.

Mounya Bonuga

Screened

Mid-level AI/ML Engineer specializing in multimodal AI and recommendation systems

USA · 4y exp
Goldman Sachs · University of Central Oklahoma

ML/AI engineer with hands-on ownership of a production LLM/RAG system at Goldman Sachs, focused on workflow automation and large-scale document search for operational teams. They combine strong MLOps and backend engineering skills with practical GenAI evaluation and safety practices, and cite measurable impact including 22% better task guidance accuracy and sub-second search across millions of records.

Diptesh Mool

Screened

Mid-level Full-Stack Developer specializing in backend microservices

5y exp
First Citizens Bank · University of North Carolina at Charlotte

Frontend-focused product engineer with hands-on experience designing and shipping real-time dashboards and alerting systems in regulated domains like banking and healthcare. Has led both UX/design and implementation work, combining React/TypeScript and Angular frontend expertise with Kafka-driven event architectures, performance optimization, and production monitoring.

ST

Mid-Level AI Engineer specializing in NLP, computer vision, and LLM applications

Austin, TX · 3y exp
BookedBy · University of Maryland, Baltimore County

LLM/RAG practitioner who productionized an LLM-driven customer communication and transaction understanding system at PayPal, emphasizing privacy/compliance guardrails and large-scale data normalization. Experienced in real-time debugging of hallucinations via retrieval pipeline tuning and in leading hands-on developer workshops and sales-aligned POCs to drive adoption.
