
Vetted PySpark Professionals

Pre-screened and vetted.

PySpark, Python, Docker, SQL, CI/CD, AWS

Anurag Patil

Screened

Mid-level Data Analyst specializing in machine learning, ETL, and real-world evidence analytics

California, USA | 6y exp
AbbVie | UC Irvine

“Developed and productionized an AI-driven "indication finding" system for AbbVie to identify additional diseases a drug could target, working closely with clinical research teams on cohort inclusion/exclusion criteria and disease rollups. Leveraged an LLM to map clinical inputs to ICD codes and built configuration-driven ML pipelines (Cloudera ML, YAML, scheduled jobs) with structured testing and evaluation for reliability.”

Python, SQL, R, Machine Learning, Deep Learning, Neural Networks, +65

Anvith Reddy Dodda

Screened

Mid-level AI Engineer specializing in GenAI, NLP, and MLOps

Remote, USA | 3y exp
PayPal | University of Central Missouri

“LLM/agentic-systems engineer with PayPal experience hardening an LLM-powered fraud support assistant from prototype to production, focusing on low-latency distributed architecture, rigorous evaluation/testing, and security/compliance. Comfortable in customer-facing and GTM contexts—runs technical demos/workshops, builds tailored pilots, and aligns sales/CS with engineering to close deals and drive adoption.”

Python, PySpark, SQL, NoSQL, NumPy, Pandas, +200

Shravya M

Screened

Senior AI/ML Engineer specializing in NLP, LLMs, and MLOps

Texas, USA | 6y exp
CVS Health | University of North Texas

“LLM/agent workflow engineer with healthcare experience (CVS Health) who built and deployed a production call-insights platform using Azure OpenAI + LangChain/LangGraph, including sentiment and compliance checks. Demonstrates deep HIPAA/PHI handling (tenant-contained processing, redaction, RBAC/encryption/audit logging) and production rigor (testing, eval sets, validation/retries, autoscaling) to scale to thousands of transcripts.”

A/B Testing, Agile, Anomaly Detection, Apache Airflow, Azure Data Factory, Azure Machine Learning, +139

Kartikeya Anand

Screened

Mid-level Machine Learning Engineer specializing in NLP, LLMs, and multimodal modeling

Ann Arbor, USA | 3y exp
University of Michigan | University of Michigan

“Built and productionized a telecom-focused RAG assistant by LoRA fine-tuning LLaMA-2 and integrating LangChain+FAISS behind a FastAPI service, with dashboards and a human feedback UI for engineers. Demonstrated measurable impact (≈40% faster document lookup, +8–10% retrieval precision) and strong MLOps rigor via Airflow orchestration, CI/CD, and monitoring for drift and failures.”

Anomaly Detection, AWS, BERT, CI/CD, CUDA, C++, +111

Sasi Katamneni

Screened

Mid-level Data Scientist / AI-ML Engineer specializing in Generative AI and LLM applications

Dallas, TX | 5y exp
Baylor Scott & White | University of North Texas

“Built a production GenAI-powered analytics assistant to reduce reliance on data analysts by enabling natural-language Q&A over Databricks/Power BI dashboards, backed by vector search (Pinecone/Milvus) and a Neo4j knowledge graph, including multimodal support via OpenAI Vision. Demonstrates strong real-world LLM reliability engineering with strict RAG, LangGraph multi-step verification, and Guardrails/custom validators, plus broad orchestration and production monitoring experience (Airflow, ADF, Step Functions, Kubernetes, Prometheus/CloudWatch).”

A/B Testing, Agile, Ajax, Amazon API Gateway, Amazon Bedrock, Amazon CloudWatch, +267

Santhosh Reddy

Screened

Mid-level AI/ML Engineer specializing in deep learning, NLP/LLMs, and MLOps

MA, USA | 6y exp
Flatiron Health | Clark University

“Built and shipped a real-time oncology risk prediction system used by doctors during patient visits, trained on clinical data in AWS SageMaker and deployed via FastAPI with sub-second responses. Emphasizes clinician-trust features (SHAP explainability, validation checks) and HIPAA-compliant controls (encryption, RBAC, audit logging), plus Kubernetes-based production operations with autoscaling, monitoring, and drift/retraining workflows; collaborated closely with oncologists at Flatiron Health.”

Python, R, SQL, Java, C++, Bash, +123

Bharath Kumar

Screened

Director-level AI & Data Science leader specializing in GenAI, LLMs, and MLOps

Draper, UT | 12y exp
Thorne | Bharathiar University

“ML/NLP engineer currently working in NYC on a system that connects complex unstructured data sources to deliver personalized insights, using embeddings + vector DB retrieval and a RAG architecture (LangChain, Pinecone/OpenSearch). Strong focus on production constraints—especially low-latency retrieval—using FAISS/ANN, PCA, index partitioning, and Redis caching, plus PEFT fine-tuning (LoRA/QLoRA) and KPI/SLA-driven promotion to production.”

A/B Testing, API Development, API Testing, Apache Hadoop, Apache Hive, Apache Kafka, +251

Rebecca Witmer

Screened

Principal Data Scientist specializing in NLP and Generative AI

Chicago, IL | 9y exp
Witmer Consulting Corporation | Georgetown University

“ML/NLP practitioner with experience building an embedding-based ad matching and search system at Vericast (BERT embeddings + similarity search) to replace a third-party taxonomy approach, evaluated via a human-curated gold standard. Also built a custom NER pipeline at Allstate for auto accident claims calls using a bidirectional LSTM and achieved 90%+ F1, with a strong emphasis on production-grade ML workflows (testing, CI/CD, orchestration, versioning, validation).”

Python, PySpark, Retrieval Augmented Generation (RAG), SQL, OpenAI, ChatGPT, +81

Raja Gurugubelli

Screened

Mid-level GenAI Engineer specializing in production RAG and LLM fine-tuning

San Jose, California | 5y exp
eBay | Texas Tech University

“LLM engineer who built a production seller-support RAG system at eBay using hybrid retrieval (BM25 + Pinecone vectors) with Cohere reranking, LangGraph orchestration, and citation-grounded answers. Strong focus on reliability: semantic/structure-aware chunking, automated Ragas-based evaluation with nightly regressions, and production observability (LangSmith) plus drift monitoring (Arize). Also implemented a multi-agent fraud pipeline with AutoGen using JSON-schema contracts and explicit termination conditions.”

Python, SQL, Bash, GPT-4, LoRA, LangChain, +130

John Hoffman

Screened

Senior Data Engineer specializing in Databricks, Spark, and AWS for government healthcare data systems

Windsor Mill, MD | 12y exp
GDIT | University of Virginia

“Python/AWS engineer focused on batch-processing and data workflows, including building reusable S3/boto3 utilities with reliability features and IAM-based auth. Has led low-risk legacy modernizations using parity testing plus a month of parallel production runs, and has owned production issues end-to-end (including fixing a client-side Excel macro) while contributing to significant AWS cost reductions (~$10k/month).”

Python, SQL, Bash, Databricks, Apache Spark, PySpark, +66

Mohan Shri Harsha Guntu

Screened

Mid-level Data Scientist / Machine Learning Engineer specializing in fraud, risk, and MLOps

Remote, MO | 7y exp
Northern Trust | Webster University

“AI/ML practitioner with Northern Trust experience who has shipped production LLM systems (internal support assistant) using RAG, vector databases, orchestration (LangChain/custom pipelines), and rigorous monitoring/feedback loops. Also built AI-driven fraud detection/risk monitoring solutions in a regulated financial environment, emphasizing explainability (SHAP), audit readiness, and stakeholder trust through dashboards and clear communication.”

Python, R, SQL, Pandas, NumPy, Scikit-learn, +137

Geetha Bommareddy

Screened

Mid-level AI/ML Engineer specializing in fraud detection and risk analytics in Financial Services

USA | 5y exp
JPMorgan Chase | Trine University

“At JPMorgan Chase, built and deployed a production LLM-powered RAG knowledge assistant to help fraud investigators and risk analysts quickly navigate regulatory updates and internal policies, reducing investigation delays and compliance risk. Strong focus on secure retrieval (RBAC filtering), reliability (layered testing + observability), and production constraints (latency/SLOs), with Airflow-orchestrated, auditable ML pipelines.”

Amazon EC2, Amazon EKS, Amazon Redshift, Amazon S3, Amazon SageMaker, Anomaly Detection, +159

Yash Pise

Screened

Mid-level Data Scientist specializing in Generative AI, LLMOps, and clinical data pipelines

5y exp
Novartis | Stevens Institute of Technology

“LLM/RAG engineer who has built and deployed corporate-scale systems at Novartis and Johnson & Johnson, including a healthcare AI agent that generates day-to-day treatment schedules. Recently handled a high-stakes safety incident (LLM suggesting overdose) by tightening model instructions and validating with ~200 test prompts, and has strong end-to-end data/embedding/vector DB pipeline experience (PySpark, FAISS, Pinecone) plus SME-in-the-loop evaluation (RLHF).”

Python, R, JavaScript, MySQL, PostgreSQL, NumPy, +88

Santhosh Kumar

Screened

Mid-level GenAI/ML Engineer specializing in LLM agents and RAG for Financial Services & Healthcare

5y exp
Bank of America | Virginia Commonwealth University

“Built and deployed a production GenAI internal support agent at Bank of America (“Ask GPS/AskGPT”) using RAG on Azure, focused on reducing escalations and improving response quality for repetitive knowledge-based queries. Demonstrates strong production LLM engineering: custom LangChain orchestration, retrieval tuning to reduce hallucinations, rigorous offline/online evaluation, and model benchmarking with dynamic routing (e.g., GPT-4 vs Claude).”

AWS, AWS Lambda, CI/CD, Claude, Databricks, Decision Trees, +97

Nikita Prasad

Screened

Mid-level AI/ML Engineer specializing in NLP, MLOps, and scalable data pipelines

Remote, USA | 5y exp
JPMorgan Chase | University of Dayton

“Built and shipped a production LLM-powered personalized client engagement assistant in the financial domain, balancing real-time recommendations with strict privacy/compliance requirements. Demonstrates strong MLOps/LLMOps depth (Airflow + MLflow, containerized microservices, drift monitoring) and a privacy-by-design approach validated in collaboration with risk and compliance teams.”

Python, Pandas, spaCy, R, SQL, PySpark, +199

Ethan Lam

Screened

Junior Software Engineer specializing in data platforms and full-stack development

Toronto, Ontario | 3y exp
Warner Music Group | University of Toronto

“Software engineer with Warner Music Group experience owning and shipping analyst-facing data products (marketing/streaming data dashboards) end-to-end with high adoption through continuous stakeholder feedback. Also builds side projects with TypeScript/React and domain-driven API design, emphasizing flexibility (including swapping databases mid-development) and pragmatic microservices reliability patterns (logging, timeouts, retry backoff).”

Python, Java, SQL, Scala, JavaScript, TypeScript, +72

Likhith Sai Kumar Pasupuleti

Screened

Mid-level Software Engineer specializing in cloud-native microservices and workflow automation

TX, USA | 5y exp
ServiceNow | California State University, Long Beach

“Enterprise platform engineer/product owner who led end-to-end delivery of customer-facing ServiceNow Service Catalog/workflow solutions, emphasizing reliability, security, and fast iteration. Built React/TypeScript portals with Node.js and Spring Boot backends, and improved microservices reliability at scale using Kafka, monitoring, and robust retry/timeout patterns.”

Java, Python, SQL, C, C++, R, +154

Pooja Dokuri

Screened

Mid-level AI/ML Engineer specializing in GenAI, RAG pipelines, and cloud MLOps

Remote, USA | 4y exp
UnitedHealth Group | East Texas A&M University

“Built and deployed a production LLM + vector search clinical decision support system at UnitedHealth Group, retrieving medical evidence and patient context in real time for prior authorization and risk scoring. Strong in end-to-end RAG architecture (Hugging Face embeddings, Pinecone/FAISS, SageMaker, Redis) plus orchestration (Airflow/Kubeflow) and rigorous evaluation/monitoring, with demonstrated ability to align solutions with clinical operations stakeholders.”

Python, Pandas, NumPy, PySpark, Scikit-learn, SQL, +133

Sharath Kumar

Screened

Mid-level AI/ML Engineer specializing in LLM fine-tuning, RAG, and MLOps

Remote, USA | 5y exp
HP | Wilmington University

“AI/ML engineer with HP experience building and productionizing an LLM-powered document intelligence platform (LangChain + Pinecone) to deliver semantic search and contextual Q&A across millions of enterprise support documents. Demonstrates strong MLOps and scaling expertise (Airflow, Kubernetes autoscaling, Triton GPU inference, monitoring with Prometheus/W&B) plus a structured approach to evaluation (A/B tests, shadow deployments, failover) and effective collaboration with non-technical stakeholders.”

Python, SQL, PostgreSQL, BigQuery, Snowflake, Bash, +142

Harini Kv

Screened

Mid-level AI/ML Engineer specializing in GenAI, NLP, and MLOps

Dallas, TX | 7y exp
Equinix | Fitchburg State University

“GenAI/data engineering practitioner with production experience across Equinix, Optum, and Citibank—built an Azure OpenAI (GPT-4) + LangChain document intelligence platform processing 1.5M+ docs/month and a HIPAA-compliant Airflow healthcare pipeline handling 5M+ claims/day. Also delivered a real-time fraud detection + explainability system using LightGBM and a fine-tuned T5 NLG component, improving fraud accuracy by 15%+ while partnering closely with compliance stakeholders.”

Python, SQL, PySpark, Bash, Java, JavaScript, +169

Uday Chilakala

Screened

Mid-level Machine Learning Engineer specializing in NLP, computer vision, and RAG systems

Atlanta, GA | 5y exp
Morgan Stanley | Kennesaw State University

“Machine learning/NLP engineer who built a production-oriented retrieval-based AI system at Morgan Stanley for healthcare use cases, combining RAG over unstructured patient records with deep-learning medical image segmentation (U-Net/Mask R-CNN). Strong in end-to-end pipelines and MLOps (Spark/MongoDB, AWS SageMaker, CI/CD, monitoring, automated retraining) and in entity resolution/data quality validation for noisy clinical data.”

Python, SQL, Flask, Apache Spark, gRPC, TensorFlow, +125

Prasanna Chelliboyina

Screened

Mid-level Machine Learning Engineer specializing in forecasting, NLP, and GenAI

United States | 6y exp
Walgreens | Syracuse University

“GenAI/ML engineer with production experience building multilingual LLM systems (English/Spanish) and RAG-based clinical documentation summarization at Walgreens, combining prompt engineering, structured output validation, and rigorous evaluation (ROUGE + pharmacist review). Also orchestrated end-to-end ML pipelines for demand forecasting using Apache Airflow, PySpark, and MLflow with scheduled retraining and production monitoring.”

A/B Testing, Agile, Anomaly Detection, Apache Spark, AWS, Azure Machine Learning, +114

Annie Chang

Screened

Senior Full-Stack/Backend Software Engineer specializing in cloud-native automation and microservices

San Francisco, CA | 9y exp
Booz Allen Hamilton | UC Davis

“Backend/data engineer with strong AWS production experience across containers (ECS) and serverless (API Gateway/Lambda/SQS), plus Glue-based ETL to Parquet for Athena/Redshift. Demonstrates hands-on reliability and security depth (Cognito OAuth2/JWT with JWKS rotation, idempotency/DLQs, monitoring) and measurable performance wins (Redis caching + query tuning), along with legacy-to-services modernization using parallel-run parity and feature-flagged cutovers.”

API Design, API Gateway, Angular, Asynchronous Processing, Authentication, Authorization, +108

Dheeraj Vajjarapu

Screened

Mid-level AI/ML Engineer specializing in MLOps, NLP/LLMs, and computer vision

Remote, USA | 4y exp
Barclays | Yeshiva University

“Built and shipped a production LLM/RAG risk-case summarization and triage system used by fraud/compliance analysts, with strong grounding controls (evidence-cited outputs and refusal on low confidence). Demonstrates end-to-end ownership across retrieval quality, Airflow-orchestrated indexing pipelines, and compliance-grade privacy (PII redaction, RBAC, encrypted redacted logging, and auditable prompt/model versioning) plus a tight feedback loop with non-technical domain experts.”

Python, SQL, Bash, Machine Learning, Deep Learning, Scikit-learn, +124
