Vetted PySpark Professionals

Pre-screened and vetted.

CK

Mid-level Data Engineer specializing in financial data engineering and scalable pipelines

Jersey City, NJ · 4y exp
JPMorgan Chase

TS

Senior Full-Stack Developer specializing in cloud-native microservices (AWS)

Irving, TX · 8y exp
U.S. Bank

Mounika Nadendla

Screened · References · Strong rec.

Senior Data Engineer specializing in cloud data platforms and real-time streaming

5y exp
CVS Health · University of Cincinnati

Data engineer focused on building reliable, production-grade data systems end-to-end: batch and real-time pipelines (Airflow/Kafka/Spark) with strong data quality, monitoring/alerting, and incident response. Has experience integrating external API/web data with retries, throttling, and schema-change handling, and serving curated datasets to analytics (Power BI) and backend consumers with performance optimizations like Redis caching.

PK

Senior Data Engineer specializing in multi-cloud data platforms and generative AI

Weston, FL · 5y exp
UKG · University of Alabama at Birmingham

YR

Mid-level AI/ML Developer specializing in FinTech fraud detection and GenAI assistants

MO, USA · 4y exp
Edward Jones · University of Central Missouri

RR

Mid-level Data Scientist specializing in financial ML, NLP, and MLOps

San Diego, CA · 5y exp
Morgan Stanley · San Diego State University

Srijan Dokania

Screened · References · Moderate rec.

Junior Robotics & Machine Learning Engineer specializing in perception, SLAM, and edge AI

Boston, MA · 2y exp
Field Robotics Lab (Northeastern University) · Northeastern University

Built and deployed an Azure-based, fine-tuned CLIP visual retrieval system at Staples for a ~300k-item product catalog, improving edge-case recall by 12% by engineering a custom delta-similarity/dynamic-margin loss. Also has robotics experience using ROS2 for sensor/compute orchestration, including GPS-time-synchronized sensor triggering for robot swarms and latency-bounded optical-flow benchmarking for edge deployment.


Naga Gayatri Bandaru

Screened · References · Moderate rec.

Mid-level AI/ML Engineer specializing in MLOps and production ML systems

Cleveland, Ohio · 3y exp
Cleveland Clinic · San José State University

Backend/ML engineer who has shipped high-scale real-time systems across e-commerce and healthcare: built a PharmEasy real-time recommendation engine for ~2M monthly users (cut feature latency 5 min→30 sec; +15% cross-sell) and architected a HIPAA-compliant multimodal clinical diagnostic workflow (DICOM+EHR) with XAI, MLOps (MLflow/Airflow/K8s), and drift/monitoring guardrails supporting 10k+ daily predictions.

AM

Junior AI/ML Engineer specializing in anomaly detection and LLM/RAG systems

Fort Mill, SC · 2y exp
Honeywell · Northeastern University

Built and productionized a tool-first, multi-agent framework that augments an anomaly detection model with domain context to generate trustworthy, evidence-backed anomaly explanations (including false-positive likelihood). Architected the platform to be model/orchestration/vectorDB agnostic (e.g., GPT + CrewAI + ChromaDB vs Claude + LangGraph + other vector DB) with strong performance, reliability, and OpenTelemetry-based observability. Also built a personal LangGraph-based "mock interviewer" agent that asynchronously fuses voice + live code input using state reducers, stop conditions, and fallback routing.


Soham Patel

Screened

Mid-level Machine Learning Engineer specializing in healthcare NLP and MLOps

Piscataway, NJ · 3y exp
Syneos Health · Rutgers University - New Brunswick

ML/AI practitioner in healthcare (Syneos Health) who has deployed production clinical NLP and risk models. Built a BERT-based physician-note information extraction system on Docker + AWS SageMaker (reported ~42% retrieval improvement) and automated retraining/deployment with Airflow and drift detection, while partnering closely with clinicians to drive adoption (reported ~18% readmission reduction).

SK

Mid-level AI/ML Engineer specializing in Generative AI and healthcare data

NJ, USA · 6y exp
Johnson & Johnson · Wichita State University

Built and deployed a production RAG-based document Q&A system on Azure OpenAI to help business teams search thousands of PDFs/Word files, using Qdrant vector search, MongoDB, and a Flask API. Demonstrates strong production engineering (streaming large-file ingestion, parallel preprocessing, monitoring/retries) plus systematic prompt/embedding/chunking experimentation to improve accuracy and reduce hallucinations, and has hands-on orchestration experience with ADF/Airflow/Databricks/Synapse.


Anurag Reddy

Screened

Mid-level Data Scientist specializing in ML, MLOps, and Generative AI

TX, USA · 5y exp
Caterpillar · University of Illinois Chicago

ML/NLP engineer who built a RAG-based technical assistant for Caterpillar field engineers, transforming PDF keyword search into intent-based semantic retrieval across manuals, logs, sensor reports, and technician notes. Strong in productionizing data/ML systems (Airflow, PySpark) with rigorous preprocessing, entity resolution, and evaluation—delivering measurable gains in accuracy, relevance, and duplicate reduction.


Kavya Paluvai

Screened

Mid-level Data Scientist specializing in fraud detection and healthcare ML

North Carolina, USA · 4y exp
Wells Fargo · University of North Carolina at Charlotte

Applied NLP/ML in healthcare and financial services, including fine-tuning BERT on unstructured EHR text and building embedding-based similarity search for clinical concepts. Also redesigned a Wells Fargo fraud detection data pipeline using modular Python + AWS Glue/Step Functions, cutting runtime ~40% with improved monitoring and reliability.


Ananya Bojja

Screened

Mid-level AI/ML Engineer specializing in healthcare analytics and MLOps

USA · 4y exp
Cigna · University of New Hampshire

AI/ML engineer at Cigna Healthcare building a production, HIPAA-compliant LLM-powered clinical insights platform that summarizes unstructured medical notes using a fine-tuned transformer + RAG on AWS. Demonstrates strong end-to-end MLOps and cloud optimization (distillation, Spot/Lambda/Auto Scaling) with quantified outcomes (~28% accuracy lift, ~40% less manual review, ~25% lower ops cost) and strong clinician-facing explainability via SHAP and dashboards.

RN

Mid-Level Software Engineer specializing in Python backend, data engineering, and cloud microservices

New Jersey, USA · 6y exp
Abacus Insights · NJIT

Backend-leaning full-stack engineer with production experience in both healthcare (claims enrichment/interoperability at Abacus) and finance (Goldman Sachs pricing/risk APIs + React dashboards). Built an event-driven AI grading platform using Postgres Debezium CDC + Kafka + FastAPI on AWS that cut manual grading ~70% and served 1000+ students, with strong emphasis on reliability, testing, and performance tuning.

NK

Senior Data Engineer specializing in Palantir Foundry and Snowflake for regulated industries

USA · 5y exp
American Express · University of Massachusetts Boston

Data engineer focused on high-volume transaction pipelines (2M+ per day) using Snowflake/Snowpipe, Spark/PySpark, Kafka, and Airflow, with a strong emphasis on schema/data-quality enforcement and reliability improvements. Also built a greenfield compliance-focused RAG solution, using CloudWatch monitoring and adding ingestion validation to prevent malformed OCR documents from degrading search quality.

MG

Senior Data Engineer specializing in cloud data platforms and real-time streaming

6y exp
HCA Healthcare · Wright State University

Data engineer in healthcare (HCA) who owned end-to-end Azure-based pipelines at very large scale (50M+ daily claims/patient records). Strong focus on reliability: schema-drift fail-fast validation, quarantine layers, and Python/SQL data quality checks that reduced issues ~25%, plus performance tuning in Databricks/PySpark and versioned serving in Synapse for downstream consumers.


Cristian Vega

Screened

Senior AI/ML Engineer specializing in Generative AI and RAG

California · 9y exp
Morf Health · University of Texas at Austin

ML/NLP practitioner at Morf Health focused on unifying fragmented healthcare data by linking structured patient/encounter records with unstructured clinical notes. Has hands-on experience with transformer embeddings, vector databases, and domain fine-tuning, plus rigorous evaluation (precision/recall) and human-in-the-loop validation with clinical SMEs to make pipelines production-grade.


Cary Burdick

Screened

Senior Data Scientist specializing in data engineering and analytics

Chicago, IL · 6y exp
USDA · Auburn University

Data/NLP practitioner with experience in both financial services (Truist) and government (USDA), including an NLP-driven analysis of EU regulations to anticipate US regulatory focus and a major redesign/cleaning of complex pathogen lab-test public datasets. Built production data-quality pipelines with Dagster, Pandera, and Azure Synapse, and is comfortable validating hypotheses with historical backtesting and SME-driven quality controls.

Himanshu Sharma

Mid-level AI Solutions Engineer specializing in enterprise GenAI and automation

Orlando, FL · 6y exp
Kore.ai · University of South Florida

Built and shipped multiple production LLM/agentic systems, including an agentic RAG NL-to-SQL analytics app that cut manual reporting from 9 hours/week to 15 minutes by grounding on schema-aware retrieval and robust fallback/monitoring. Also implemented a LangChain supervisor-orchestrated enterprise IT automation agent that routes requests for search, identity validation, and action execution, and created a RAG search tool spanning Jira/Confluence/SharePoint for operations stakeholders.

Nikitha Kommidi

Mid-level AI/ML Engineer specializing in fraud detection, NLP, and MLOps

6y exp
Citibank · University of Texas at Arlington

Built a production real-time fraud detection and customer-support automation platform at Citibank, tackling extreme class imbalance (reported ~1:5000) and strict latency constraints. Combines hands-on MLOps (Airflow, Kubernetes, MLflow; Snowflake/Spark/S3 integrations; CI/CD model promotion) with cross-functional delivery to Risk & Compliance focused on interpretability and reducing false positives.

KD

Mid-level Business Analyst specializing in banking analytics and data engineering

Hollywood, FL · 4y exp
Santander · Indiana University Bloomington

Analytics professional at Santander Bank with hands-on experience building SQL and Python workflows for transaction reporting, reconciliation, and monitoring across messy multi-source financial data. They combine strong data validation and exception-handling practices with stakeholder-friendly dashboards, and also bring digital analytics experience from a Google Analytics UI optimization project focused on funnel drop-off and engagement.

SK

Mid-level Data Analyst and Data Engineer specializing in healthcare and financial analytics

3y exp
UnitedHealth Group · University of North Texas

Analytics professional with healthcare and operations experience who turns messy enterprise data from platforms like Teradata, GCP, SQL Server, and Snowflake into trusted reporting layers and reproducible analysis workflows. They combine SQL, Python, PySpark, Power BI, and Tableau to improve reporting accuracy and performance, including a 30% dashboard refresh improvement and 20-25% accuracy gains in healthcare reporting.

