Vetted PySpark Professionals

Pre-screened and vetted.

Aishwarya Thorat

Intern Data Scientist specializing in ML engineering and LLM agentic workflows

San Francisco, CA | 6y exp
Contentstack | San José State University

Built an agentic, multi-step LLM system that generates full-stack code for API integrations using LangChain orchestration, Pinecone/SentenceBERT RAG, and a human-in-the-loop feedback loop for iterative code refinement. Also collaborated with non-technical content writers and PMs during a Contentstack internship to deliver a Slack-based AI workflow that generates and brand-checks articles with one-click approvals.

Srikanth Reddy

Mid-level AI/ML Engineer specializing in GenAI and financial risk & compliance analytics

Plainsboro, NJ | 7y exp
State Street | Wilmington University

Built and deployed a production LLM-powered financial risk and compliance platform to reduce manual trade exception handling and speed up insights from regulatory documents. Implemented a LangChain multi-agent workflow with structured/unstructured data integration (Redshift + vector DB) and emphasized hallucination reduction for regulatory safety using Amazon Bedrock. Strong MLOps/orchestration background across Kubernetes, Airflow, Jenkins, and monitoring/testing with MLflow, Evidently AI, and PyTest.

Sai Swetha Bodlapati

Senior Data Engineer specializing in Spark, Kafka, and Databricks Lakehouse platforms

Dallas, TX | 5y exp
Fidelity Investments | Northwest Missouri State University

Data engineer at Fidelity who built and operated a real-time financial transactions lakehouse on AWS/Databricks, processing millions of records daily with Kafka streaming. Demonstrated strong reliability and data quality practices (watermarking, idempotent Delta writes, validation/reconciliation, observability) and delivered measurable improvements (~30% faster jobs and ~30% fewer data issues) while enabling trusted gold-layer analytics for downstream teams.

Atharv Sankpal

Mid-level Data Analyst specializing in financial and healthcare analytics

Baltimore, MD | 4y exp
AIG | UMBC

Analytics professional with experience at JPMorgan and Deloitte, focused on financial and risk data. They stand out for building scalable SQL/Python data pipelines, KPI and forecasting dashboards, and retention/cohort metrics that improved reporting reliability, forecast accuracy, and planning speed.

SG

Mid-level Data Analyst specializing in business intelligence and cloud data platforms

Stamford, CT | 4y exp
Franklin Templeton | University of Bridgeport

Healthcare analytics professional with TCS/Humana experience turning messy claims and eligibility data into reliable reporting assets using SQL and Python. They combine strong data engineering and analytics execution with stakeholder management, including automating monthly claims reporting from half a day to under 5 minutes and driving a provider outreach effort that reduced claim rejection rates by about 20%.

Rakesh Eleti

Screened

Mid-level AI/ML Engineer specializing in LLMs, RAG, and healthcare ML systems

Florida, USA | 4y exp
Cigna | University of Florida

Healthcare ML/AI engineer at Cigna who has owned a clinical RAG pipeline from prototype through production, monitoring, compliance, and iteration. Stands out for combining LLM product delivery with healthcare-grade safety and explainability, driving a 38% retrieval precision gain, 42% hallucination reduction, and meaningful improvements in team velocity and system reliability.

Sree Damineni

Screened

Mid-level AI/ML Engineer specializing in financial and insurance analytics

USA | 3y exp
FIS | University of Missouri-Kansas City

AI/ML engineer focused on production ML, LLMs, and MLOps, with concrete experience shipping fraud detection and enterprise RAG systems. They combine strong deployment and monitoring discipline with measurable business impact, including a 31% precision improvement in fraud detection and 37% better answer relevance in a financial-document QA system.

SB

Senior AI/ML Engineer specializing in Generative AI, NLP, and regulated industries

Illinois, USA | 7y exp
Northern Trust | University of New Haven

Built end-to-end ML and GenAI systems at Northern Trust, including a production RAG-based document intelligence platform for financial reports and contracts. Stands out for combining strong MLOps execution with practical product judgment—improving forecast accuracy by 22%, document review accuracy by 38%, and cutting deployment time by 45% while keeping latency and reliability production-ready.

Niranjaan Munuswamy

Mid-level Full-Stack Software Engineer specializing in cloud and data engineering

Chicago, IL | 4y exp
Cigna | Illinois Institute of Technology

Backend engineer with experience at Cigna evolving REST API services backed by PostgreSQL, emphasizing reliability/correctness, scalability, and observability. Has hands-on production experience with FastAPI (contract-first design, Pydantic schemas), performance tuning (indexes, caching), and secure auth patterns (OAuth/JWT, RBAC, row-level security via Supabase), plus low-risk incremental rollouts using feature flags and dual writes.

Nisarg Shah

Screened

Junior Software Engineer specializing in data, systems, and AI engineering

Arizona, USA | 2y exp
Arizona State University | Arizona State University

Early-career/new-grad candidate who built TrendScout AI, an evidence-first market intelligence agent that ingests messy news, extracts entities/events, builds a Neo4j knowledge graph, and answers questions via RAG with citations. Achieved ~95% retrieval relevance by combining ChromaDB semantic search with graph-based retrieval and validating outputs through human evaluation and guardrails to prevent hallucinations.

RE

Mid-level AI/ML Engineer specializing in NLP and Generative AI

Indiana, USA | 6y exp
Elevance Health | Indiana University Indianapolis

Built and deployed a production LLM-powered RAG assistant for healthcare teams (care managers/support) to answer questions from clinical and policy documentation, emphasizing trustworthiness via improved retrieval, reranking, and strict grounding prompts to reduce hallucinations. Also has hands-on orchestration experience with Apache Airflow for end-to-end ETL/ML workflows and applies rigorous testing/metrics (hallucination rate, tool-call accuracy, latency, cost) to ensure reliable AI agent behavior.

Joseph Wonesh

Screened

Senior Full-Stack Software Engineer specializing in modern web apps and cloud platforms

Los Angeles, CA | 11y exp
SmartiStack | University of Florida

Backend/data engineer focused on production-grade Python microservices and AWS platforms, including a hybrid Lambda + ECS Fargate architecture managed with Terraform and CI/CD. Has hands-on reliability experience (JWT/OAuth, timeouts, retries, centralized error classification) and built AWS Glue/PySpark ETL pipelines consolidating PostgreSQL/RDS, MongoDB, and S3 sources into curated partitioned Parquet datasets. Demonstrated measurable SQL tuning impact (8 minutes to 25 seconds) and disciplined legacy-to-modern migrations with parity validation and UAT sign-off.

AK

Mid-level AI/ML Engineer specializing in healthcare NLP and MLOps

USA | 4y exp
Cigna | Texas Tech University

ML/AI engineer with healthcare payer experience (Signal Healthcare, Cigna) who has shipped production fraud/claims prediction systems using Python/TensorFlow and exposed them via FastAPI/Flask microservices integrated with EHR and Salesforce. Emphasizes operational reliability and trust—Airflow-orchestrated pipelines with data quality gates plus SHAP-based interpretability, A/B testing, and drift/debug workflows—backed by reported outcomes of 22% lower false payouts and 17% higher model accuracy.

MD

Mid-level Full-Stack Developer specializing in web platforms and cloud (AWS)

United States | 4y exp
Lincoln Financial | California State University, Long Beach

Full-stack engineer with financial services experience (Lincoln Financial) who owned a customer-facing financial portal end-to-end using TypeScript/React and Node/Express. Has hands-on experience with microservices and RabbitMQ event-driven workflows, addressing scale issues like retries/duplicates with idempotency and traceable logging, and built an internal real-time ops/support dashboard to improve monitoring and incident response.

OR

Mid-level Data Scientist specializing in predictive modeling, NLP/LLMs, and RAG search systems

Des Moines, IA | 6y exp
CDS Global | University of Massachusetts

Built production LLM/RAG platforms for financial services to enable natural-language Q&A over large policy/compliance document sets stored in Snowflake and SharePoint. Strong in MLOps and orchestration (Airflow, ADF, Step Functions, MLflow) and in solving real production issues like stale embeddings and model performance, including an incremental Snowflake Streams sync that cut processing time from hours to minutes.

Rahul Alle

Screened

Mid-level Machine Learning Engineer specializing in NLP, LLMs, and MLOps

USA | 4y exp
CVS Health | Anderson University

Built a production internal LLM/RAG assistant at CVS Health to cut time spent searching long policy and clinical guideline PDFs, combining fine-tuned BERT/GPT models with FAISS retrieval and a FastAPI service on AWS. Demonstrates strong real-world reliability work (document cleanup, hallucination controls, monitoring/drift tracking with MLflow) and close collaboration with non-technical clinical operations teams via demos and feedback-driven iteration.

TN

Mid-level Data Scientist & AI/ML Engineer specializing in GenAI and cloud ML

Harrison, NJ | 5y exp
State Farm | Monroe University

GenAI/LLM engineer who recently built a production compliance assistant at State Farm for KYC/AML and regulatory teams, using AWS Bedrock + LangChain with Textract/Lambda pipelines to extract fields, tag risk, and summarize long documents. Implemented RAG, strict structured outputs, and human-in-the-loop guardrails, and reports automating ~80% of documentation work while reducing review time by ~40%.

Yash Tobre

Screened

Mid-level AI/ML Engineer specializing in computer vision, NLP/LLMs, and MLOps

Bentonville, AR | 4y exp
Dynetics | University of Texas at Arlington

ML/AI engineer with defense and commercial analytics experience: deployed a real-time aerial object detection system at Dynetics (YOLOv5 + TorchServe in Docker on AWS EC2) with drift-triggered retraining and 99.5% uptime, tackling ambiguous targets and weather degradation. Previously at Fractal Analytics, built and explained a churn prediction model for marketing stakeholders using SHAP and delivered it via a Flask API into dashboards, driving a reported 22% attrition reduction.

Vasanthi N.

Screened

Senior AI/ML Engineer and Data Scientist specializing in Generative AI and MLOps

Los Angeles, CA | 9y exp
Pacific Community Bank | Aurora University

ML/NLP practitioner focused on financial-services document intelligence and compliance workflows—built an end-to-end pipeline to classify documents and extract financial entities from loan applications, emails, and statements stored in S3/internal databases. Strong in entity resolution/record linkage and in productionizing pipelines with GitHub Actions CI/CD, testing, data validation, and Docker, plus semantic search using OpenAI embeddings and a vector database.

HK

Mid-level Data Analyst specializing in cloud ETL, BI, and machine learning

Texas, 75223 | 5y exp
UnitedHealth Group | University of Texas at Arlington

Data/ML practitioner with experience at UnitedHealth Group building a fraud claims detection solution combining structured claims data and unstructured notes, validated with compliance stakeholders to improve actionable accuracy. Also applied embeddings, vector databases, and fine-tuned language models in a Bank of America capstone to detect threats/anomalies in financial documents, with production-minded Python ETL workflows using Airflow.

UO

Principal Data Scientist specializing in Generative AI, NLP, and MLOps

San Francisco, CA | 12y exp
Cognizant | University at Buffalo

ML/NLP practitioner with banking experience (M&T Bank) who has built a GPT-4 RAG system using LangChain and Pinecone to connect unstructured customer data with internal knowledge bases, improving accuracy and reducing manual lookup time by 50%+. Strong in entity resolution and productionizing scalable Python data workflows, including major performance wins by migrating bottleneck joins from Pandas to Dask.

JM

Mid-level Data Scientist / ML Engineer specializing in FinTech and Healthcare ML systems

4y exp
Fiserv | San Diego State University

AI/LLM engineer who has shipped production RAG systems (including a 250K-document compliance knowledge tool on AWS) and focuses on reliability via citations, guardrails, and rigorous evaluation (Ragas/Opik/DeepEval). Also built a LangGraph-orchestrated web-crawler agent that cut research paper extraction from hours to minutes, and collaborated with clinical teams to deliver patient volume forecasting with an optimization layer for staffing.

Kavita Tamire

Screened

Mid-level Data Engineer specializing in AWS cloud data platforms

California, USA | 3y exp
Charter Communications | University of South Florida

Data engineer with Charter Communications experience modernizing large-scale AWS data lake pipelines: ingesting S3 data, validating against legacy systems, transforming with PySpark/Spark SQL, and serving via Iceberg/Delta tables. Worked at 50M–300M record scale, delivered >99.5% data match, and built monitoring/alerting (CloudWatch/SNS) plus retry orchestration (Step Functions) and data quality gates (Great Expectations).

DP

Mid-level AI/ML Engineer specializing in LLMs, RAG, and enterprise MLOps

Baltimore, MD | 4y exp
CVS Health | University of Maryland, Baltimore County

Backend engineer who built an AI-driven "Smart Feedback Analyzer" API (Flask → FastAPI) that processes user feedback with NLP (Hugging Face + OpenAI) and returns structured insights. Demonstrates strong production-minded architecture: stateless services, Cloud Run + Docker deployment, Redis/Celery background processing, and Postgres/SQLAlchemy performance tuning (EXPLAIN ANALYZE, indexing, N+1 fixes), plus multi-tenant data isolation via JWT/API-key derived tenant IDs.
