Vetted PySpark Professionals

Pre-screened and vetted.

PySpark, Python, Docker, SQL, CI/CD, AWS

Ajay Madhusudhan Thumala

Screened

Junior Software Engineer specializing in data engineering and LLM applications

Irvine, CA · 1y exp
Geisinger · UC Irvine

“Computer science engineer and master’s graduate who independently built a mechatronics-heavy capstone prototype: a smartphone concept for deafblind users using micro-actuator arrays for braille reading. Also has platform engineering experience at Quantiphi, deploying webhooks to Kubernetes and implementing GitOps-based CI/CD using AWS CodeCommit/CodeBuild and ECR.”

API Development, API Gateway, AWS, Bash, C, C++, +206

Akshitha Akula

Screened

Mid-Level Full-Stack Python Engineer specializing in cloud APIs and data/ML platforms

Bentonville, AR · 4y exp
Walmart · University of Central Missouri

“Backend engineer at Goldman Sachs who deployed internal LLM-powered utilities to summarize operational logs/tickets, with a strong emphasis on data sensitivity and reliability. Built deterministic workflows with template-based prompts, confidence checks, and rule-based fallbacks, and used monitoring plus failure-rate metrics to tune performance; also has hands-on Temporal orchestration experience for resilient async backend jobs.”

Python, JavaScript, TypeScript, SQL, FastAPI, Flask, +116

Vasavi Mittapalli

Screened

Senior Data Scientist specializing in GenAI, LLMs and RAG

Dallas, TX · 5y exp
Texas Instruments · Trine University

“Built and deployed a production LLM-powered RAG assistant for semiconductor manufacturing failure analysis, reducing engineer triage effort by grounding outputs in retrieved evidence and gating responses with SPC + ML signals (LSTM anomaly scores, XGBoost probabilities). Experienced with LangChain/LangGraph to ship reliable, observable multi-step agents with branching/fallback logic, and evaluates impact using both technical metrics and business KPIs like mean time to triage and downtime reduction.”

A/B Testing, Agile, Amazon DynamoDB, Amazon EC2, Amazon EMR, Amazon Kinesis, +195

Jaswanth Vakkala

Screened

Mid-level Generative AI Engineer specializing in enterprise RAG and multimodal NLP

Iselin, NJ · 5y exp
Wells Fargo · St. Francis College

“Built and deployed a production LLM/RAG chatbot at Wells Fargo for securely querying regulated financial and compliance documents, emphasizing low hallucination rates, explainability, and strict governance. Experienced with LangChain multi-agent orchestration plus Airflow/Prefect pipelines for ingestion, embeddings, evaluation, and retraining, and partnered closely with compliance/operations to drive adoption through demos and feedback-driven retrieval rules.”

A/B Testing, Anomaly Detection, Apache Hadoop, Apache Hive, Apache Spark, AWS, +224

Harshitha Kotari

Screened

Mid-level Data/ML Engineer specializing in NLP, GenAI, and scalable data pipelines

5y exp
Abbott · Clarkson University

“AI/ML engineer with production experience building LLM-powered document intelligence and customer support systems in healthcare/insurance, emphasizing high-accuracy RAG, long-document processing, and robust monitoring/fallback mechanisms. Also automates and scales ML lifecycle workflows using Apache Airflow and Kubeflow, and partners closely with non-technical operations stakeholders to drive adoption.”

Python, R, SQL, Java, MATLAB, HTML, +148

Sahar Zargarzadeh

Screened

Junior AI/Backend Software Engineer specializing in ML and scalable systems

Dallas, TX · 2y exp
PMG · University of Maryland, College Park

“Backend engineer with strong AWS/CI/CD experience (multi-repo deployments, Lambda + core app, immutable ECR and image promotion) and a published master’s thesis building an ML framework for Solar PV energy prediction and CO2 reduction impact modeling using ensemble and meta-learning approaches benchmarked against SAM.”

Python, Node.js, Terraform, Java, R, JavaScript, +99

Uday Kumar Gattu

Screened

Mid-level Generative AI Engineer specializing in LLM agents and RAG systems

4y exp
Capital One · Lindsey Wilson College

“Built and deployed a production LLM/RAG knowledge assistant integrating internal docs, wikis, and ticket histories to reduce tribal-knowledge dependency and repetitive questions. Emphasizes reliability via grounding + a validation layer, and achieved major latency gains (>50%) through vector index optimization, caching, quantization, and selective re-validation. Comfortable orchestrating end-to-end LLM/data workflows with Airflow, Prefect, and Dagster, including monitoring and alerting.”

A/B Testing, Amazon CloudWatch, Amazon DynamoDB, Amazon EKS, Amazon Redshift, Amazon S3, +129

Sanjana Duvva

Screened

Mid-level AI/ML Engineer specializing in Generative AI, LLMOps, and MLOps

5y exp
Wells Fargo · University of North Texas

“Built and deployed an AWS-based LLM/RAG ticket triage and knowledge retrieval system (Pinecone/FAISS + Step Functions + MLflow) that cut support resolution time by 20%. Demonstrates strong production focus on hallucination reduction, PII security, and low-latency orchestration, with measurable evaluation improvements (e.g., ~25% grounding accuracy gain via re-ranking) and proven collaboration with support operations stakeholders.”

Python, SQL, Java, Scala, Shell Scripting, TypeScript, +153

Anagha Rumade

Screened

Senior Applied AI/ML Engineer specializing in GenAI, LLMs, RAG and agents

Palo Alto, California · 9y exp
JPMorgan Chase · Stevens Institute of Technology

“Applied AI/ML Engineer at JPMorgan Chase who led a banker-facing LLM chatbot from an OpenAI-API POC to a production RAG workflow, including hallucination mitigation, automated evaluation in SageMaker, and operational monitoring with Dynatrace. Also delivers external technical education—hosted a hands-on Grace Hopper Celebration 2025 workshop teaching LangChain/LangGraph agentic workflows.”

AWS, AWS Lambda, CI/CD, Compliance, Data Analysis, Data Ingestion, +58

Abhinav Gupta

Screened

Junior Machine Learning Engineer specializing in LLMs and applied data science

2y exp
Esri · USC

“Built and shipped multiple production AI systems, including Auto DocGen (LLM-generated OpenAPI docs kept in sync via AST diffs, schema-constrained generation, and CI/CD on Render) and a multimodal sign-language recognition pipeline at USC orchestrated with FastAPI, MediaPipe, and PyTorch. Also partnered with Esri’s non-technical community team to fine-tune a LLaMA-based spam classifier with a review UI, cutting moderation time by 70%.”

Python, Pandas, NumPy, Scikit-learn, JavaScript, TypeScript, +126

Bernard Griffin

Screened

Senior Data Scientist / ML Engineer specializing in cloud ML pipelines and GenAI

Baltimore, MD · 17y exp
Intel · Illinois Institute of Technology

“ML/NLP practitioner with experience building a transformer-failure prediction system that combines sensor signals with unstructured maintenance comments using LLM-based extraction and similarity validation. Strong emphasis on production readiness—data leakage controls, SQL-driven data quality tiers, and rigorous bias/fairness validation (including contract/spec evaluation across diverse company profiles).”

A/B Testing, Amazon Bedrock, Amazon EC2, Amazon EMR, Amazon Kinesis, Amazon Redshift, +130

Harshavardhan Garikala

Screened

Mid-level AI/ML Engineer specializing in LLMs, RAG, and MLOps

NJ, USA · 4y exp
Red Hat · Oklahoma Christian University

“Red Hat ML/LLM engineer who designed and deployed a production LLM-powered customer support automation system using RAG, improving latency by 30% via PEFT and vector search optimization. Built security and governance into retrieval (access-level filtering, encrypted Pinecone/ChromaDB) and delivered SHAP-based explainability via a dashboard for non-technical stakeholders. Experienced orchestrating distributed ML/RAG pipelines across AWS SageMaker and OpenShift with Airflow/Prefect, plus multi-agent workflows using CrewAI and LangGraph.”

Python, PySpark, SQL, TensorFlow, PyTorch, Hugging Face, +127

Subhasmita Maharana

Screened

Mid-level Data Scientist specializing in NLP/LLMs, time series forecasting, and MLOps

New York, NY · 6y exp
Citigroup · Kent State University

“Data/ML practitioner with hands-on experience building NLP systems from prototype to production: delivered a Twitter sentiment classifier with robust preprocessing, SVM modeling, and Power BI reporting, and built entity-resolution pipelines for messy multi-source customer data (reporting ~95% improvement in unique entity identification). Also implemented semantic linking/search using SBERT embeddings with FAISS vector retrieval and domain fine-tuning (reported ~15% precision lift), and applies production workflow best practices (Airflow/Prefect, Docker, Azure ML/Databricks, Great Expectations).”

A/B Testing, Apache Airflow, Azure Machine Learning, BERT, CI/CD, Clustering, +170

Bhavana Reddy Ponnapati

Screened

Mid-Level Software Engineer specializing in cloud-native distributed systems

Sunnyvale, CA · 5y exp
Walmart · Arizona State University

“Backend/platform engineer who has built and run production Python/Flask + Kafka microservices processing RFID and camera/RFID fusion streams for near-real-time retail cart updates at ~4–5M events/day. Strong in reliability/performance debugging (p99 latency, Kafka lag, Cosmos DB RU hot partitions) with measurable impact including ~30% database cost reduction, and has also shipped an end-to-end vulnerability scanning workflow with DynamoDB-backed state, idempotency, and robust retry/verification guardrails.”

Python, Java, SQL, JavaScript, TypeScript, Kotlin, +162

Sathyavarthan Balachandar

Screened

Mid-level Data Engineer specializing in scalable pipelines, Spark, and cloud data warehousing

Boston, USA · 3y exp
Fidelity Investments · Northeastern University

“Backend/data platform engineer who recently owned an end-to-end large-scale financial data platform delivering real-time decision support for finance and operations. Has hands-on experience modernizing legacy batch pipelines into AWS cloud-native ELT with parallel-run cutovers, strong data quality controls (dbt-style tests, reconciliation), and measurable improvements in runtime, cost, and SLA compliance. Also builds scalable, secure FastAPI microservices using Docker, ALB-based horizontal scaling, Redis caching, and managed auth with Cognito/Supabase plus Postgres RLS.”

Python, SQL, Go, Apache Spark, PySpark, Databricks, +125

Sayali Patil

Screened

Mid-level Python Full-Stack Developer specializing in Healthcare and FinTech

Everett, MA · 6y exp
Kaiser Permanente · Harrisburg University of Science and Technology

“Backend engineer with hands-on experience building a fraud-transaction monitoring system in Python/Flask, architected as Dockerized microservices and integrated with Kafka for high-volume streaming. Demonstrates strong performance and reliability chops across PostgreSQL/SQLAlchemy tuning (EXPLAIN ANALYZE, N+1 fixes, bulk ops), multi-tenant data isolation, and scaling via background workers + Redis caching, plus real-time ML inference deployment using TensorFlow on AWS.”

Python, FastAPI, Django, Flask, JavaScript, TypeScript, +131

Jash Shah

Screened

Mid-level Data Scientist specializing in LLMs, MLOps, and predictive analytics in healthcare and finance

New Jersey, USA · 4y exp
Johnson & Johnson · Stevens Institute of Technology

“Built and deployed a production LLM/RAG clinical decision support system that enables real-time semantic search over unstructured EHR notes and delivers patient risk insights. Strong in healthcare-grade MLOps and compliance (HIPAA, PHI handling, encryption, RBAC, audit logs) and scaled embedding/retrieval pipelines using Spark/Databricks and Airflow. Partnered with clinicians via Power BI dashboards and explainability, contributing to an 18% reduction in patient readmissions.”

A/B Testing, API Integration, Apache Airflow, Apache Hadoop, Apache Kafka, Apache Spark, +102

SaiTeja Alavala

Screened

Mid-level AI/ML Engineer specializing in risk, fraud detection, and Generative AI

Lawrenceville, NJ · 4y exp
TD Bank · Indiana Wesleyan University

“Built and deployed an LLM-powered RAG document intelligence/search platform for banking risk & compliance teams, emphasizing sensitive-data handling, traceability, and conservative fallback logic to minimize hallucinations; deployed via Docker/REST on AWS and cut manual review effort by 35%. Also partnered with TD Bank marketing to deliver an AI customer segmentation solution that improved targeted campaign engagement by 18%.”

Anomaly Detection, AWS, Azure Machine Learning, CI/CD, Classification, Containerization, +77

Harshavardhan Reddy

Screened

Mid-level AI/ML Data Scientist specializing in NLP, computer vision, and risk analytics

Albany, NY · 5y exp
Capital One · Pace University

“ML/AI engineer with Capital One experience building production-grade customer segmentation and fraud detection systems combining NLP (transformers) and anomaly detection. Strong MLOps and orchestration background (PySpark ETL, MLflow, Airflow, Docker/Kubernetes, Azure ML) with real-time monitoring/alerting and performance optimizations like quantization and caching, plus proven ability to deliver business-facing insights through Power BI/Tableau for marketing stakeholders.”

Python, R, SQL, PySpark, Scala, Java, +105

Bhuvan Chandi

Screened

Mid-level Data Engineer specializing in AI/ML data platforms

NY, NY · 6y exp
BlackRock · Webster University

“Built and productionized an LLM-powered PDF document Q&A system to eliminate manual searching through long documents, focusing on scalability and answer reliability. Implemented semantic chunking (using headings/paragraphs/tables), overlap, and preprocessing/quality checks to reduce hallucinations, and orchestrated the end-to-end pipeline with Airflow using retries, alerts, and parallel tasks.”

Python, SQL, Shell Scripting, Apache Spark, PySpark, Apache Hadoop, +103

Rohit Khoja

Screened

Mid-level Full-Stack Engineer specializing in cloud microservices and NLP/LLM systems

Tempe, AZ · 4y exp
Citigroup · Arizona State University

“Full-stack engineer with 3+ years using Java/Spring Boot (Citi) and React, who built a production observability dashboard monitoring 53 microservices across 17 clusters with real-time health/latency tracing and significant performance improvements (cut load time from ~10s). Also designed a serverless AWS face-recognition system (Lambda/S3/SQS) built to handle burst traffic (~1000 concurrent requests), demonstrating strength in scalable, event-driven architectures.”

Agile, Amazon EC2, Amazon S3, Amazon SQS, Apache Kafka, AWS Lambda, +106

Shanmukh Sai Madhu

Screened

Mid-level Data Engineer specializing in real-time pipelines and cloud analytics

Chicago, IL · 5y exp
JPMorgan Chase · University of South Dakota

“Researcher from the University of South Dakota who built a production medical RAG system to help interpret model predictions by retrieving relevant clinical notes and medical literature, overcoming retrieval accuracy and imaging-dataset challenges through semantic chunking and metadata-driven indexing. Also has hands-on orchestration experience with Airflow and Azure Data Factory, plus a pragmatic approach to LLM evaluation and stakeholder-driven iteration.”

Agile, Amazon EMR, Apache Airflow, Apache Kafka, Apache Spark, AWS, +122

Akshit Modi

Screened

Mid-level AI/ML Engineer specializing in healthcare NLP and MLOps

Remote, USA · 5y exp
Tempus · Arizona State University

“Healthcare/clinical ML practitioner who built and productionized ClinicalBERT-based pipelines to extract and standardize oncology EHR data, improving downstream model F1 from 0.81 to 0.92 while controlling training cost via LoRA/QLoRA. Experienced orchestrating real-time AWS ETL/ML workflows (Glue, Lambda, SageMaker) and partnering with clinicians using SHAP-based interpretability, contributing to an 18% reduction in readmissions and full adoption.”

Python, SQL, C++, Java, NumPy, Pandas, +166

Sai Raja Ramya Bhavana Thota

Screened

Senior Data Scientist specializing in machine learning and customer analytics

Illinois, USA · 7y exp
Northern Trust · Bradley University

“Data/ML practitioner with experience applying NLP and classical ML to large-scale customer data (2B+ records) for segmentation, prediction, and survey-text classification, delivering measurable business impact (~18% engagement efficiency). Has hands-on entity resolution across multi-source datasets and has built embedding-based semantic search using SentenceBERT + a vector database with domain fine-tuning (~20% relevance improvement), plus production workflow experience with Spark/Airflow and cloud tooling (AWS/Azure).”

A/B Testing, Analytics, Azure Machine Learning, Bash, BigQuery, C, +195