Vetted PySpark Professionals

Pre-screened and vetted.

Sanjay Santhanam

Mid-level AI Software Engineer specializing in LLMs and FinTech data systems

San Jose, CA | 4y exp
Scry AI | Westcliff University

Backend/AI systems engineer focused on productionizing agentic document-processing workflows for large financial PDFs. They describe owning deployments end-to-end, combining Python, Redis, LLM function calling, RAG/ReAct-style orchestration, and strong reliability practices to deliver 80% faster processing, reduce parsing errors from 12% to ~1%, and sustain 99.9% uptime in high-concurrency environments.

SG

Mid-level AI/ML Engineer specializing in NLP, LLMs, and MLOps for healthcare and finance

6y exp
CVS Health | University of New Haven

Built a production LLM-powered RAG agent for healthcare/insurance operations that retrieves and summarizes patient medical documents with grounded citations, scaling to ~4.5M records. Addressed medical shorthand and terminology by fine-tuning ~120 lightweight DistilBERT models by specialty and validating entities against SNOMED/RxNorm, while using SHAP/LIME and human-in-the-loop review to make decisions explainable to stakeholders.

CS

Intern Data Scientist specializing in generative AI and forecasting

San Francisco, CA | 5y exp
Aurora AI | University of Chicago

ML/NLP practitioner working across healthcare and business/finance use cases: currently fine-tuning a domain-specific Llama 3.1 model for safe reasoning over EHRs/clinical notes using RAG + RL/DPO and RAGAS-based evaluation. Has built UMLS-driven entity normalization pipelines with quantified quality gains and developed embedding/vector-DB systems (FAISS) for semantic matching and forecasting/recommendation applications at Aurora AI and Banxico.

GV

Mid-level Full-Stack Software Engineer specializing in Java/Spring Boot and Angular

Frisco, TX | 5y exp
Cisco | Purdue University

Full-stack engineer with Cisco supply-chain and Wipro internal platform experience, focused on customer-facing UI performance and secure backend services. Built a bulk Excel inventory upload feature (Spring Boot/Apache POI) that cut manual effort ~80%, and delivered high-scale Angular/React dashboards with strong reliability/observability (FastAPI, JWT, Docker, AWS, AppDynamics).

Vamshikrishna Bandi

Senior AI/ML Engineer specializing in Generative AI and agentic multi-agent systems

6y exp
PayPal | Trine University

Built and shipped a production LLM-powered multi-agent RAG system to automate complex internal support workflows, integrating tool execution (SQL/APIs) with validation guardrails to reduce hallucinations. Optimized for real-world latency and cost via model routing, caching, and async parallel tool calls, and enforced reliability with CI-gated golden test sets derived from anonymized production queries.

Praveen Nutulapati

Mid-level Generative AI Engineer specializing in LLM fine-tuning, RAG, and agentic systems

New York, NY | 6y exp
JPMorgan Chase | University of Central Missouri

Built and deployed a production multi-agent RAG system at JPMorgan Chase to automate regulated credit analysis and compliance clause discovery across large internal policy/document libraries. Implemented LangGraph-based supervisor orchestration with structured state management (Azure OpenAI) to support long-running, resumable workflows, plus hybrid retrieval + re-ranking and guardrails for reliability. Strong at evaluation/observability (trace logging, LLM-judge, HITL) and at communicating results to non-technical stakeholders via Power BI embeds and Streamlit prototypes.

Vagmin Yadav

Screened

Junior Software Engineer specializing in backend systems, ML pipelines, and DevOps

Pune, India | 2y exp
Unbox Robotics | Georgia Tech

TypeScript backend engineer in the robotics domain with hands-on experience building low-latency (20–40ms) production systems using RabbitMQ, Redis, and HA PostgreSQL (Patroni). Has owned end-to-end services supporting 15 clients via config-driven architecture, with strong CI/CD, automated testing, and observability (OpenTelemetry) practices, plus API versioning/deprecation using Keycloak auth.

YP

Mid-level Software Engineer specializing in backend, distributed systems, and AI infrastructure

Menlo Park, CA | 4y exp
Snowflake | USC

Built Baioniq, an enterprise LLM platform for automating extraction from massive unstructured documents like contracts and insurance claims. They demonstrate unusually strong production depth in agentic AI—scaling to 100k+ requests/day, processing 1M+ claim documents, and improving extraction accuracy through rigorous RAG architecture, evaluation, and fallback design.

Sirisha Maddikunta

Mid-level Generative AI Engineer specializing in enterprise LLM and healthcare AI solutions

O'Fallon, MO | 6y exp
Mastercard | University of Texas at Arlington

Built and owned an end-to-end LLM-powered fraud investigation assistant that automated case summaries and risk analysis, cutting analyst investigation/documentation time by 40%. Stands out for translating RAG concepts into a production-grade internal platform with strong evaluation, monitoring, and reusable Python service architecture that improved both analyst trust and engineering velocity.

BM

Mid-level AI/ML Engineer specializing in fraud detection and recommendation systems

California, USA | 3y exp
PayPal | Florida Atlantic University

ML engineer with production experience at PayPal and Flipkart, owning high-scale systems across fraud detection, recommendations, and LLM tooling. Stands out for combining strong modeling judgment with practical platform engineering, delivering measurable impact like 22% fewer fraud false positives, 18% CTR lift, 40% less LLM manual review, and 30% faster redeployments.

VS

Mid-Level Software Engineer specializing in LLM agents and real-time data streaming

8y exp
Amazon | Rutgers University–New Brunswick

Software engineer with experience at Striim and Amazon who ships end-to-end production systems across UI, backend, ML, and operations. Built a real-time PII detection capability for a streaming data platform by integrating Python ML inference into a Java monolith via gRPC sidecars, achieving ~3M events/hour throughput and ~93% accuracy, and helped drive enterprise adoption (Fiserv, CVS). Also modernized internal Amazon tooling for multi-region scale with modularization and fully automated deployments.

Tanveer Nazir

Screened

Senior Cloud & DevOps Engineer specializing in enterprise cloud automation and Kubernetes

Remote, NY | 11y exp
Bank of America | College of Staten Island, CUNY

Infrastructure/DevOps engineer with primary ownership in enterprise Linux and AWS/Azure production environments (including financial systems). Built secure, repeatable CI/CD pipelines deploying containerized workloads to EKS/ECS and implemented Terraform/CloudFormation IaC with drift detection and rollback practices; lacks direct IBM Power/AIX/PowerHA experience.

Rahul Reddy

Screened

Senior Data Engineer specializing in cloud data platforms and big data pipelines

New York, NY | 6y exp
CVS Health | Southern Arkansas University

Data engineer with healthcare (CVS Health) experience who migrated production PySpark workloads to native BigQuery SQL and built a Great Expectations-based validation microservice on GKE (Flask + REST) integrated into Cloud Composer. Has operated high-volume pipelines (~300–400GB/day) and designed external vendor ingestion on AWS (Lambda/Step Functions/Glue) with schema-drift detection, alerting, and backfill-safe controls to protect downstream Snowflake/BigQuery tables.

RK

Mid-level AI/ML Engineer specializing in Generative AI, Conversational AI, and RAG systems

NJ, USA | 4y exp
Scale AI | Rowan University

Built and shipped a production enterprise RAG knowledge assistant that returns grounded, cited answers and uses confidence-based fallbacks (clarifying questions/abstention) with monitoring and compliance controls for sensitive data. Implemented end-to-end agent orchestration (function calling, structured JSON, state, retries/rate limits) plus eval/feedback loops, and achieved a reported 30–40% improvement in knowledge-task completion time while reducing hallucinations via retrieval improvements.

Vidhi Upadhyay

Senior Software Engineer specializing in AI/ML, computer vision, and cloud-native systems

Remote | 8y exp
Saayam for All | Carnegie Mellon University

Independently built a production-grade, containerized enterprise agentic AI platform (stateful orchestration + RAG) focused on real-world reliability—guardrails, citation-based outputs, reranking, query rewriting, and evaluation harnesses to reduce hallucinations. Hands-on with OpenAI SDK, CrewAI, and LangGraph, and has delivered AI solutions for non-technical NGO stakeholders via demos and practical POCs.

Bhanu Chander

Screened

Senior Data Engineer specializing in cloud data platforms and real-time pipelines

New York, NY | 6y exp
Disney | Indiana Wesleyan University

Data engineer focused on reliability and observability, building end-to-end pipelines processing millions of records/day from sources like S3 and Kafka. Has hands-on experience with Airflow-based data quality automation, PySpark/Databricks transformations, and shipping versioned Python REST APIs deployed via Docker/Kubernetes with CI/CD (Jenkins) and monitoring (CloudWatch/Azure Logs).

Parth Tusham

Screened

Intern Software Engineer specializing in systems, cloud, and security

Sunnyvale, CA | 1y exp
Nokia | Texas A&M University

Systems and infrastructure engineer pivoting toward robotics software; brings strong low-level debugging, multithreaded systems, and networking experience where correctness and timing matter. Has hands-on experience using Docker and CI/CD to build reproducible test/evaluation environments (thesis), and proposes a disciplined, contract-driven approach to distributed communication and real-time performance debugging.

BC

Mid-level GenAI Engineer specializing in RAG, LLMs, and enterprise AI

4y exp
Cardinal Health | Rivier University

Built and shipped production LLM agents that automate document processing and decision workflows, with a strong focus on reliability, guardrails, and measurable business impact. Stands out for combining RAG, tool calling, evals/monitoring, and ERP integration to deliver 30-35% manual effort reduction and higher throughput without additional headcount.

AC

Mid-level AI/ML Engineer specializing in NLP, Generative AI, and predictive analytics

New Jersey, USA | 5y exp
JPMorgan Chase | Stevens Institute of Technology

GenAI/LLM engineer who architected and deployed a production RAG “research assistant” for JPMorgan Chase’s regulatory compliance team, focused on safety-critical behavior (mandatory citations, refusal when evidence is missing). Deep hands-on experience with LlamaIndex, Pinecone, Hugging Face embeddings, LangGraph agent workflows, and metric-driven evaluation (golden sets, TruLens), including a reported 28% relevancy lift via cross-encoder re-ranking.

MI

Mid-level Data Scientist specializing in machine learning and big data analytics

Bentonville, AR | 6y exp
Walmart | University of North Texas

Walmart engineer who built and shipped a production LLM+RAG system to automate triage and analysis of computer support chats/tickets, producing grounded, schema-constrained JSON outputs for summaries, urgency, and routing recommendations. Emphasizes reliability (hallucination control, confidence thresholds, human-in-the-loop) and runs end-to-end pipelines with Airflow and AWS-native orchestration, plus rigorous evaluation and monitoring tied to business KPIs.

Chau An

Screened

Senior Full-Stack Software Engineer specializing in Healthcare IT and FinTech

United States | 14y exp
Ciklum | California Lutheran University

Backend/platform engineer building HIPAA-compliant, real-time healthcare systems: owned a Python/Flask API layer for an AI-enabled patient engagement and risk scoring service, implemented PHI-safe logging and cross-service auditability, and delivered Kubernetes microservices via ArgoCD GitOps. Also has experience with Kafka streaming pipelines and hybrid cloud-to-on-prem migrations in regulated healthcare/fintech environments.

SM

Mid-level Data Scientist specializing in NLP, LLMs, and cloud ML platforms

Remote, USA | 5y exp
Wells Fargo | University of Illinois Urbana-Champaign

LLM/MLOps engineer who has shipped production systems for complaint intelligence and contact-center NLU, including LoRA/RLHF-tuned LLaMA models deployed on GKE with vLLM and Vertex AI batch pipelines to BigQuery. Demonstrates strong practical focus on hallucination control, data imbalance mitigation, and production monitoring (Langfuse) with regression testing and canary rollouts, plus experience orchestrating complex workflows with AWS Step Functions.

Vishnu Varma

Screened

Senior AI/ML Engineer specializing in LLMs, GenAI, and MLOps

Milpitas, California | 8y exp
Databricks | Campbellsville University

AI/ML engineer (Cognizant) who built a production, real-time credit card fraud detection platform combining deep-learning anomaly detection with an LLM-based explanation layer. Strong focus on regulated deployment: addressed class imbalance and feature drift, and added guardrails (SHAP/structured inputs, fine-tuning on analyst reports, rule-based validation) to keep explanations accurate and compliant. Orchestrated the full pipeline with Airflow + Databricks/Spark and used MLflow/Prometheus plus A/B and shadow deployments for measurable reliability.
