
Vetted PySpark Professionals

Pre-screened and vetted.

PySpark, Python, Docker, SQL, CI/CD, AWS

Anvesh Reddy Narra

Screened

Mid-level AI/ML Engineer specializing in Generative AI, RAG, and MLOps

3y exp
State Farm · Cleveland State University

“Built a secure, on-prem/private GPT assistant to replace manual SharePoint-style search across thousands of policies/SOPs/engineering docs, using a production RAG stack (LangChain/LangGraph, FAISS/Chroma, PyMuPDF+OCR, vLLM). Implemented layout-aware ingestion (including table-to-JSON) and a multi-agent retrieval/generation/verification workflow with strong observability and compliance guardrails, delivering ~70% reduction in search time.”

Anomaly Detection, Ansible, Apache Kafka, Apache Spark, AWS, BERT, +184

Rahul Karanam

Screened

Senior Computer Vision & Robotics Engineer specializing in perception and warehouse automation

San Jose, CA · 5y exp
Roboteon · University of Maryland, College Park

“Robotics engineer with hands-on experience scaling a multi-vendor heterogeneous warehouse robot fleet, building a distributed ‘traffic manager’ for collision avoidance and real-time rerouting using CBS/MAPF and DCOP-style negotiation. Strong real-time/safety-critical systems background (RTOS, deterministic lock-free multithreading) plus modern perception and simulation tooling (CNN-LSTM/transformers, CARLA/Isaac Sim, VIO/GTSAM, camera-IMU calibration). Startup-oriented and comfortable moving quickly from prototype to production.”

Angular, AWS, AWS Lambda, C++, CI/CD, Computer Vision, +147

Manpreet Kour

Screened

Senior Data Scientist specializing in Generative AI and NLP

Seattle, USA · 6y exp
SOTI · Dr. B. R. Ambedkar National Institute of Technology, Jalandhar

“ML/NLP engineer with recent Scotiabank experience building production-grade indexing automation over large-scale emails and customer databases, combining LLM fine-tuning (Mistral, XLM-R) with fuzzy matching to exceed 95% accuracy under strict banking constraints. Also built a RAG-based chat agent using Gecko embeddings, Vertex AI Search, Gemini, and cross-encoder reranking, and delivered a text-to-SQL chatbot at SOTI through iterative fine-tuning and benchmark-driven experimentation.”

Machine Learning, Deep Learning, Generative AI, Computer Vision, PyTorch, PySpark, +92

Sri Niyati Kompella

Screened

Senior Data Engineer specializing in cloud data platforms and ML pipelines

Atlanta, GA · 8y exp
Berkshire Hathaway · University of Alabama at Birmingham

“Data engineer focused on AWS-based enterprise data platforms, owning end-to-end pipelines from multi-source batch/stream ingestion (Glue/Kinesis/StreamSets/Airflow) through PySpark transformations into curated datasets for Redshift/Snowflake. Emphasizes production reliability with strong monitoring/observability and data quality gates, and reports ~30% performance improvement plus improved SLAs and latency after optimization.”

Amazon DynamoDB, Amazon EMR, Amazon EKS, Amazon Kinesis, Amazon Redshift, Amazon S3, +138

Nishchal Gante

Screened

Mid-level Data Scientist specializing in MLOps and Generative AI

Illinois, IL · 4y exp
BNY Mellon · Illinois Institute of Technology

“Robotics software/ML engineer who built perception and navigation-related ML systems for autonomous supermarket carts, including object detection, shelf recognition, and obstacle avoidance. Strong ROS/ROS2 practitioner who optimized real-time performance (reported 50% latency reduction) and deployed containerized ROS/ML pipelines at scale using Docker, Kubernetes, and CI/CD.”

A/B Testing, Agile, Amazon API Gateway, Amazon Bedrock, Amazon EC2, Amazon RDS, +133

Tanishq Nimale

Screened

Junior Software Engineer specializing in Cloud, Full-Stack, and Data Engineering

Virginia, USA · 2y exp
Strategy INC · University of Texas at Dallas

“Software engineer with experience across data engineering and backend/platform work: owned a Databricks/PySpark real-time pipeline powering customer dashboards with a 15-minute SLA, and helped modernize an investor web app from JSP to React/TypeScript with API + SQL/materialized-view performance improvements. Also contributed to breaking a Java monolith into microservices (Redis + gRPC on AWS EKS) and built an EC2-deployed Play Store/App Store crawler that reduced third-party data costs.”

AWS, AWS Lambda, Apache Kafka, API Development, Authentication, C#, +84

Harideep Balusa

Screened

Mid-level AI/ML Engineer specializing in FinTech risk, fraud detection, and GenAI/RAG systems

USA · 6y exp
Freddie Mac · University of Wisconsin

“Built and productionized Azure-based LLM/RAG systems for regulatory/compliance use cases, including automating analyst research and compliance report generation across large unstructured document sets. Demonstrates strong practical depth in hallucination mitigation, hybrid retrieval tuning (BM25 + embeddings), and production MLOps (Databricks, Cognitive Search, AKS, Airflow/MLflow), plus proven ability to deliver auditable, explainable solutions with non-technical compliance teams.”

Python, R, SQL, Scala, Machine Learning, Deep Learning, +125

Alekya Battu

Screened

Mid-level Data Scientist specializing in ML, NLP, and MLOps

USA · 5y exp
Wells Fargo · Wilmington University

“Senior data scientist with ~5 years’ experience building production ML/NLP systems in finance (Wells Fargo) and deep learning for sensor analytics in connected vehicles (Medtronic). Has delivered end-to-end platforms combining time-series forecasting with transformer-based NLP, including automated drift monitoring/retraining (MLflow + Airflow) and standardized Docker/CI/CD deployments; achieved a reported 22% precision improvement after domain fine-tuning.”

Agile, Scrum, Kanban, SDLC, CI/CD, Waterfall, +144

Sai Krishna Chittanuri

Screened

Mid-level Data Scientist specializing in real-time fraud detection and MLOps

San Francisco, CA · 5y exp
Charles Schwab · CUNY Graduate Center

“ML/NLP engineer with experience at Charles Schwab building an NLP + graph (Neo4j) entity-resolution system to unify fragmented user/device/transaction data and improve downstream model quality and analyst querying. Has applied embeddings (SentenceTransformers + FAISS) with domain fine-tuning to boost hard-case matching recall by ~12% while maintaining precision, and has a track record of hardening scalable Python/Spark pipelines and productionizing fraud models via A/B tests and shadow-mode monitoring.”

Python, R, SQL, Pandas, NumPy, PySpark, +120

Ankush Banthia

Screened

Senior Data & Platform Engineer specializing in cloud-native streaming and distributed systems

USA · 10y exp
JPMorgan Chase · New York Institute of Technology

“Financial data engineer who has built and operated high-volume batch + streaming pipelines (200–300 GB/day; 5–10k events/sec) using AWS, Spark/Delta, Airflow, Kafka, and Snowflake, with strong emphasis on data quality and reliability. Demonstrated measurable impact via 99.9% SLA adherence, major reductions in bad records/nulls, MTTR improvements, and significant latency/runtime/query performance gains; also built a distributed web-scraping system processing 5–10M records/day with anti-bot and schema-drift defenses.”

Onboarding, Mentoring, Agile, Scrum, Jira, Confluence, +150

Hritvik Gupta

Screened

Mid-level AI Engineer specializing in LLMs, RAG, and healthcare AI

San Francisco, CA · 3y exp
Penn Medicine · UC Riverside

“Built and scaled an AI-powered voice/chat patient engagement platform at Penn Medicine from early prototype into production clinical workflows, focusing on latency, edge cases, and user trust. Strong in LLM reliability engineering (structured prompts, validation/fallbacks), real-time troubleshooting with observability, and cross-functional enablement through pilots, demos, and sales/customer partnership.”

AWS, AWS Lambda, C++, CI/CD, Communication, Data Engineering, +78

Mrunal Kakirwar

Screened

Mid-level Full-Stack Engineer specializing in cloud-native microservices and AI automation

USA · 5y exp
Fuel AI · California State University

“Software engineer/product owner who has led end-to-end delivery of AI and content-management platforms, including building RAG-based reliability improvements and migrating fragile systems to containerized AWS ECS/Kubernetes with Terraform-managed CI/CD. Experienced designing event-driven microservices (SQS/SNS/RabbitMQ), scaling queue consumers with autoscaling, and creating internal Python tooling to standardize data connectors (e.g., BigQuery/Airtable/internal APIs) to speed iteration.”

Python, JavaScript, TypeScript, Shell Scripting, Java, SQL, +108

BHEEMA SABILLA

Screened

Mid-level Data Engineer specializing in Lakehouse, Streaming, and ML/LLM data systems

Remote, USA · 3y exp
Discover · University of South Dakota

“Built and productionized an enterprise retrieval-augmented generation platform for internal knowledge over large unstructured corpora, emphasizing trust via strict citation/grounding and hybrid retrieval (BM25 + FAISS + cross-encoder re-ranking). Demonstrates strong scaling and cost/latency optimization through incremental indexing/embedding and index partitioning, plus disciplined evaluation/observability practices. Has experience operationalizing pipelines with Airflow/Databricks/GitHub Actions and partnering closely with risk & compliance stakeholders on auditability requirements.”

Python, PySpark, SQL, Scala, Pandas, NumPy, +157

Thrinesh Thode

Screened

Mid-level AI/ML Engineer specializing in MLOps and LLM applications

New York, NY · 4y exp
BNY Mellon · University at Albany

“BNY Mellon engineer who has built and operated production AI systems end-to-end: a LangChain/Pinecone RAG platform scaled via FastAPI + Kubernetes to 1000 RPM with 99.9% uptime, supported by monitoring and data-drift detection. Also deep in data/infra orchestration (Airflow, Dagster, Terraform on AWS/EMR/EC2), processing 500GB+ daily and delivering measurable reliability and performance gains, plus strong compliance-facing model explainability using SHAP and Tableau.”

A/B Testing, Apache Kafka, Apache Spark, AWS, AWS Lambda, BERT, +86

Nikshitha Aella

Screened

Mid-level Full-Stack Software Engineer specializing in AI platforms and microservices

Mooresville, NC · 6y exp
Lowe's · University of North Carolina at Charlotte

“Backend engineer currently building an AWS Lambda/FastAPI inventory recommendation system using a LangChain + GPT-4 RAG pipeline and MongoDB vector search; drove major cost optimization via Redis caching (60% reduction) while sustaining 10k+ daily requests under 2s latency. Previously deployed Node.js microservices on AWS OpenShift with Jenkins/Helm at UnitedHealth Group and led a zero-downtime monolith-to-microservices migration at Verizon, including RabbitMQ-based real-time messaging with DLQs and idempotency.”

Agile, Angular, API Gateway, AWS, AWS Lambda, CI/CD, +83

Varun Kumar Kota

Screened

Mid-level Software Engineer specializing in cloud, data engineering, and AI/ML

Remote · 3y exp
Handshake · University at Buffalo

“Backend/platform engineer who owned an AI-powered resume optimization service end-to-end (FastAPI + Celery + Redis/Postgres) and optimized it for unpredictable LLM task latency. Strong Kubernetes/GitOps practitioner (Helm, autoscaling, probes, ArgoCD rollbacks) with experience in on-prem-to-cloud migrations using Terraform and CDC-based replication, plus real-time Kafka pipelines monitored via Prometheus/Grafana.”

Python, SQL, R, Java, JavaScript, Jira, +125

Koushik Gunjala

Screened

Senior AI Engineer specializing in Agentic AI and distributed systems

Charlotte, NC · 4y exp
UnitedHealth Group · University of North Carolina at Charlotte

“LLM/agentic workflow engineer with healthcare domain experience who built a HIPAA-compliant multi-agent RAG system for clinical review automation at UnitedHealth Group, achieving 92% precision and cutting latency 40% through async orchestration and Redis semantic caching. Also has strong data engineering orchestration background (Airflow on AWS EMR with Great Expectations) and a proven clinician-in-the-loop feedback process that improved model faithfulness by 18%.”

Distributed Systems, Retrieval-Augmented Generation (RAG), GPT-4, LangChain, LangGraph, Hugging Face, +95

Bala Venkateswarlu K

Screened

Mid-level Data Scientist specializing in Generative AI, NLP, and MLOps

USA · 5y exp
MetLife · Harrisburg University of Science and Technology

“Built and deployed an LLM-powered claims-document summarization system (insurance domain) that cut agent review time from 4–5 minutes to under 2 minutes and saved 1,200+ hours per quarter. Hands-on across orchestration and production infrastructure (Airflow retraining DAGs, Kubernetes, SageMaker endpoints, FastAPI) and recent RAG workflows using n8n + Pinecone, with a strong focus on reliability, cost, and explainability for non-technical stakeholders.”

A/B Testing, Agile, Apache Kafka, Apache Spark, Auto Scaling, AWS, +148

Hema Edavalapati

Screened

Mid-level AI/ML Engineer specializing in cloud data engineering and GenAI

Florida, USA · 6y exp
LexisNexis · University of South Florida

“AI/LLM engineer with production experience in legal tech: built a GPT-4 + LangChain RAG summarization system at Govpanel that reduced legal case-file review time by 50%+. Previously at LexisNexis, orchestrated end-to-end Airflow data/AI pipelines processing 5M+ legal documents daily, improving ETL runtime by 35% with robust validation, monitoring, and SLAs.”

SQL, SQL query optimization, Python, Pandas, NumPy, PySpark, +159

Vardhan Addakattu

Screened

Mid-level Data Scientist specializing in Generative AI and NLP for financial risk

Glassboro, NJ · 4y exp
S&P Global · Rowan University

“Built and shipped production generative AI/RAG assistants in regulated financial contexts (S&P Global), automating compliance-oriented Q&A over earnings reports/filings with grounded answers and citations. Experienced across the full stack—AWS-based ingestion (PySpark/Glue), vector retrieval + LangChain agents, GPT-4/Claude model selection, and production reliability (monitoring, caching, retries) plus rigorous evaluation and regression testing.”

Python, R, SQL, PySpark, Pandas, Apache Spark, +111

Sridharan Kairmaknoda

Screened

Mid-level Data Engineer specializing in cloud data platforms and real-time analytics

Saint Louis, MO · 5y exp
Cigna · Saint Louis University

“Customer-facing data engineering professional who builds and deploys real-time reporting/dashboard solutions, gathering reporting and compliance requirements through direct stakeholder engagement. Experienced with Google Cloud IAM governance, secure integrations (encryption, audit logging), and fast production troubleshooting of ETL/pipeline failures with follow-on monitoring and automated recovery improvements; motivated by hands-on, travel-oriented customer work.”

SDLC, Agile, Waterfall, Python, SQL, Jupyter Notebook, +137

Chethan Thimapuram

Screened

Mid-level AI/ML Engineer specializing in LLM systems, RAG, and MLOps

5y exp
HCA Healthcare · University of South Florida

“Built a production, real-time clinical documentation system at HCA that converts doctor–patient conversations into structured clinical summaries using speech-to-text, LLM summarization, and RAG. Demonstrated measurable gains from medical-domain fine-tuning (clinical concept recall +18%, ROUGE-L 0.62 to 0.74) while meeting HIPAA constraints via PHI anonymization and encryption, and deployed via Docker/FastAPI with CI/CD and monitoring.”

Amazon CloudWatch, Apache Airflow, Apache Kafka, Apache Spark, AWS Glue, AWS IAM, +125

Niranjaan Munuswamy

Screened

Mid-level Full-Stack & Data Engineer specializing in AWS cloud and real-time streaming

Chicago, IL · 4y exp
Cigna · Illinois Institute of Technology

“Backend engineer with experience at Cigna evolving REST API services backed by PostgreSQL, emphasizing reliability/correctness, scalability, and observability. Has hands-on production experience with FastAPI (contract-first design, Pydantic schemas), performance tuning (indexes, caching), and secure auth patterns (OAuth/JWT, RBAC, row-level security via Supabase), plus low-risk incremental rollouts using feature flags and dual writes.”

Python, JavaScript, TypeScript, SQL, Java, Redux, +105

Aishwarya Thorat

Screened

Intern Data Scientist specializing in ML engineering and LLM agentic workflows

San Francisco, CA · 6y exp
Contentstack · San José State University

“Built an agentic, multi-step LLM system that generates full-stack code for API integrations using LangChain orchestration, Pinecone/SentenceBERT RAG, and a human-in-the-loop feedback loop for iterative code refinement. Also collaborated with non-technical content writers and PMs during a Contentstack internship to deliver a Slack-based AI workflow that generates and brand-checks articles with one-click approvals.”

A/B Testing, Amazon Redshift, Amazon S3, API Integration, AWS, AWS Glue, +129