Vetted Apache Spark Professionals

Pre-screened and vetted.

Apache Spark, Python, Docker, SQL, AWS, CI/CD

Avantik Tiwari

Screened

Junior Data Scientist / Big Data Engineer specializing in ML, LLMs, and analytics platforms

Tempe, Arizona · 3y exp
Arizona State University · Arizona State University

“Backend/data platform engineer who led a major redesign of a hybrid streaming+batch analytics platform processing 10+ TB/day (Airflow/Hive/BigQuery) with strong data-quality automation. Also built a production RAG PDF assistant with concrete mitigations for hallucinations and prompt injection (re-ranking, grounding, verifier step) and has deep experience executing low-risk migrations (dual-write, blue-green, rapid rollback) and implementing JWT-based row-level security.”

Python, SQL, Java, JavaScript, MySQL, PostgreSQL (+112 more)

Keerthi Kalluri

Screened

Senior Full-Stack & GenAI Engineer specializing in healthcare and financial services

6y exp
Kaiser Permanente · Texas Tech University

“Built and deployed a production LLM-powered customer support assistant using a RAG backend in Python, focused on deflecting repetitive Tier-1 tickets and reducing resolution time. Demonstrates strong production engineering instincts around reliability (confidence scoring + human fallback), scalability/cost optimization (multi-stage pipelines), and workflow orchestration/observability (LangChain, custom DAGs, structured logging, step metrics).”

Agile, AJAX, Amazon EC2, Amazon EKS, Amazon RDS, Amazon Redshift (+220 more)

Shashwat Negi

Screened

Junior Full-Stack & ML Engineer specializing in LLM applications

San Jose, CA · 2y exp
Infrrd · University of Wisconsin–Madison

“Data Scientist (2–3 years) at ZS Associates who has built and productionized agentic LLM systems, including a LangGraph-based multi-LLM prompt-optimization pipeline for entity extraction deployed as a Spring Boot microservice via Jenkins. Also built an Insightmate.ai chatbot and improved its RAG accuracy by diagnosing vector retrieval issues and implementing HyDE query expansion, while partnering with sales and pharma stakeholders to drive adoption (e.g., Zimmer Biomet platform migration into a multi-year partnership).”

Python, JavaScript, TypeScript, SQL, R, PHP (+81 more)

Shanmukha Jayavarapu

Screened

Mid-level AI/ML Engineer specializing in fraud detection and healthcare predictive analytics

Missouri, USA · 4y exp
KPMG · University of Central Missouri

“Built and deployed a production LLM-powered calorie-counting chatbot that turns plain-English meal descriptions into normalized food entities, quantities, and calorie estimates using a hybrid transformer + rule-engine pipeline. Emphasizes reliability with schema/constraint guardrails, confidence-based routing (including embedding similarity search fallbacks), and strong observability/metrics (hallucination rate, calibration, latency, cost). Partnered closely with nutritionists to encode domain standards into mappings and validation logic.”

Python, PyTorch, TensorFlow, Scikit-learn, XGBoost, LightGBM (+97 more)

Pravalika Kasojjala

Screened

Mid-level AI/ML Engineer specializing in LLM, RAG/GraphRAG, and fraud analytics

Charlotte, NC · 5y exp
Bank of America · University of Wisconsin–Milwaukee

“LLM/agent engineer who has deployed a production internal assistant to reduce employee inquiry resolution time while maintaining regulatory compliance. Experienced with RAG, hallucination risk triage, and graph-based orchestration (LangGraph) for enterprise/banking-style workflows, emphasizing schema-validated, citation-backed, tool-constrained agent designs and tight collaboration with non-technical business/compliance stakeholders.”

A/B Testing, Agile, Amazon Bedrock, Amazon CloudWatch, Amazon EC2, Amazon ECS (+190 more)

Saniya Shinde

Screened

Mid-level Data Scientist specializing in NLP, LLMs, and RAG systems

Washington, DC · 4y exp
World Bank · George Washington University

“Built and deployed a production-style vision-language pipeline that generates structured medical reports from chest X-rays using BioViLT embeddings, an image-text alignment module, and BiGPT fine-tuned with LoRA, delivered via Streamlit and hosted on AWS EC2. Also has collaboration experience, presenting EDA findings, feature importance, and model performance to Ford managers while working with vehicle parts data at Bimcon.”

Python, SQL, R, C++, PyTorch, TensorFlow (+93 more)

Jahnavi Lasyapriya Vavilala

Screened

Junior Machine Learning Engineer specializing in LLMs, NLP, and computer vision

Bengaluru, Karnataka · 2y exp
PwC · Arizona State University

“Built a production, agentic multi-agent pharmaceutical intelligence system for US oncology (breast cancer) conference/news intelligence, automating MSL-style information gathering and summarization for pharma and healthcare stakeholders. Uses CrewAI + LangChain orchestration, custom scraping across ~15 pharma newsrooms, and a grounding-score evaluation approach (sentence transformers/cosine similarity) to mitigate hallucinations.”

Python, SQL, R, Java, JavaScript, Snowflake (+121 more)

Narayanaroyal Marisetty

Screened

Mid-level Data Scientist/ML Engineer specializing in healthcare AI and MLOps

USA · 4y exp
CVS Health · University at Buffalo

“Designed and deployed an enterprise LLM-powered clinical/pharmacy policy knowledge assistant at CVS Health, replacing manual searches across PDFs/Word/SharePoint with a HIPAA-compliant RAG system. Built end-to-end ingestion and orchestration (Airflow + Azure ML/Data Lake + vector index) with PHI masking, versioned re-embedding, and production monitoring (Prometheus/Grafana), and partnered closely with clinicians/compliance to ensure policy-grounded, auditable answers.”

A/B Testing, Apache Airflow, Apache Hadoop, Apache Hive, Apache Kafka, Apache Spark (+132 more)

Sanjay Mandru

Screened

Mid-Level Full-Stack Software Engineer specializing in cloud microservices and real-time analytics

Buffalo, NY · 3y exp
Samsung · University at Buffalo

“Software engineer who built a reusable React component package (UI modules, auth helpers, API client wrappers) for an AI SaaS background-removal project, emphasizing performance (tree shaking/dynamic imports) and reliability (Jest + Storybook). Also delivered a unified REST API for Samsung Big Data Portal, resolving cross-team issues by standardizing schemas, improving validation/logging, and operating effectively amid shifting requirements.”

Agile, Ansible, Apache Kafka, Apache Spark, Authentication, AWS (+123 more)

Neeraj Jawahirani

Screened

Mid-level Data & AI Engineer specializing in healthcare data pipelines and MLOps

FL, USA · 4y exp
Humana · Florida State University

“Built and deployed a production LLM-powered clinical note summarization system used by care managers to speed review of 5–20 page unstructured medical records. Implemented safety-focused validation (prompt constraints, rule-based and section-level checks, human-in-the-loop) to reduce hallucinations while maintaining low latency and meeting privacy/regulatory constraints, integrating via APIs into existing clinical tools.”

Agile, Amazon CloudWatch, Amazon EMR, Amazon Redshift, Amazon S3, Amazon SageMaker (+122 more)

Kiran M

Screened

Mid-Level Full-Stack Software Engineer specializing in cloud-native microservices and data platforms

Bentonville, AR · 5y exp
Walmart · Northern Arizona University

“Backend/ML integration engineer with experience at Accenture and Walmart building Flask-based analytics and prediction APIs on PostgreSQL/MySQL. Strong focus on performance and scalability—uses precomputed aggregates, Redis caching, query tuning (indexes/partitioning/EXPLAIN), and async/background processing; also designs secure multi-tenant isolation with JWT and schema/db-per-tenant strategies.”

API Gateway, AWS, AWS Glue, AWS Lambda, Bitbucket, BigQuery (+145 more)

Siva Manikanta Lakumarapu

Screened

Mid-level AI/ML Engineer specializing in Generative AI and NLP

Dallas, TX · 5y exp
Gilead Sciences · University of North Texas

“AI/LLM engineer with production experience building secure, scalable compliance-focused generative AI systems (GPT-3/4, BERT) including RAG over internal regulatory document bases. Has delivered end-to-end pipelines on AWS with PySpark/Airflow/Kubernetes/FastAPI, emphasizing privacy controls, monitoring, and iterative evaluation (A/B testing). Also partnered closely with bank compliance officers using prototypes to refine NLP summarization/classification and reduce document review time.”

A/B Testing, Agile, Amazon EC2, Amazon Redshift, Amazon S3, Apache Airflow (+164 more)

Kesav Boob

Screened

Mid-Level Full-Stack Java Engineer specializing in microservices and cloud

San Francisco, California · 5y exp
Dell Technologies · Cal State LA

“Full-stack developer who built an end-to-end Hotel Management System using React and Spring Boot with MongoDB and AWS. Has hands-on experience debugging API/data-fetching issues with Postman and validating results against the database, plus exposure to handling large data workloads with chunking and monitoring via Grafana/Tabula.”

Java, SQL, C, C++, C#, Python (+129 more)

Niveditha A

Screened

Mid-level AI/ML Engineer specializing in healthcare ML and LLM/RAG systems

USA · 4y exp
UnitedHealth Group · Bowling Green State University

“AI/LLM engineer with recent production experience at UnitedHealth Group building an end-to-end RAG system over structured EMR data and unstructured clinical notes, including evidence retrieval, GPT/LLaMA-based reasoning, and a validation layer for reliability. Strong in orchestration (Kubeflow/Airflow/MLflow), prompt engineering for noisy healthcare text, and rigorous evaluation/monitoring with gold-standard benchmarking, plus close collaboration with clinical operations stakeholders.”

Python, NumPy, Pandas, JSON, SQL, PostgreSQL (+152 more)

Hrishikesh Raghunath

Screened

Mid-level Data Engineer specializing in scalable ETL, streaming analytics, and cloud data platforms

Remote, USA · 7y exp
Dreamline AI · California State University, Fullerton

“At Dreamline AI, built and productionized an AWS-based incentive intelligence platform that uses Llama-2/GPT-4 to extract eligibility rules from unstructured state policy documents into structured JSON, then processes them with Glue/PySpark and serves results via Lambda/SageMaker/API Gateway. Designed state-specific ingestion connectors plus schema validation and automated checks/alerts to handle frequent policy/format changes without breaking the pipeline, and partnered with business/analytics stakeholders to deliver interpretable eligibility decisions via explanations and dashboards.”

A/B Testing, Amazon CloudWatch, Amazon Kinesis, Amazon Redshift, Amazon S3, Amazon SageMaker (+114 more)

Arthi R

Screened

Mid-level Full-Stack Software Engineer specializing in FinTech and cloud-native microservices

Remote – Washington, D.C. · 5y exp
Fannie Mae · Wright State University

“Backend engineer with fintech/banking experience (e.g., Canara Bank) building secure Python/Flask microservices for financial reporting and unified data access. Strong in Postgres/SQLAlchemy performance optimization (including materialized views) and in productionizing ML services on AWS (Lambda/ECS/CloudWatch) with Docker, model registries, and blue-green deployments, plus multi-tenant isolation via JWT-based middleware.”

Python, JavaScript, TypeScript, C, C++, Go (+129 more)

Rakesh Kolagani

Screened

Mid-level AI/ML Engineer specializing in MLOps and LLM-powered applications

Mountain View, CA · 5y exp
Intuit · University of Central Missouri

“AI/ML engineer with production experience building a RAG-based internal analytics assistant (Databricks + ADF ingestion, Pinecone vector store, LangChain orchestration) deployed via Docker on AWS SageMaker with CI/CD and MLflow. Strong focus on real-world constraints—latency/cost optimization (LoRA ~60% compute reduction), hallucination control with citation grounding, and enterprise security/governance. Previously at Intuit, delivered an interpretable churn prediction system (PySpark/Databricks, Airflow/Azure ML) that improved retention targeting ~12%.”

A/B Testing, Amazon S3, Apache Airflow, AWS Glue, AWS Lambda, AWS Step Functions (+126 more)

Pooja Murigappa

Screened

Mid-level AI/ML Engineer specializing in NLP, Generative AI, and MLOps in Financial Services

Austin, TX · 5y exp
Charles Schwab · University of Central Missouri

“ML/LLM engineer at Charles Schwab who built a production loan-advisor chatbot integrated with internal knowledge and loan-calculator APIs, adding strict numeric validation to prevent rate hallucinations and optimizing context to control costs. Also runs ~40 Airflow DAGs orchestrating retraining/ETL/drift monitoring with an automated Snowflake→SageMaker→auto-deploy pipeline, and uses rigorous testing plus canary rollouts tied to business metrics and compliance constraints.”

Amazon DynamoDB, Apache Airflow, Apache Kafka, Apache Spark, AWS, AWS Glue (+183 more)

Sumit Mamtani

Screened

Mid-level Data Scientist specializing in ML, MLOps, and customer analytics

Tempe, AZ · 4y exp
Qlik · Arizona State University

“ML/NLP practitioner focused on insurance/claims analytics for a large financial firm, working with millions of fragmented structured and unstructured records. Built production-grade pipelines for entity extraction, entity resolution, and semantic search using Sentence-BERT + vector DB, including fine-tuning with contrastive learning (reported ~15% recall lift) and scalable ETL/containerized deployment on Kubernetes.”

Python, Pandas, NumPy, Scikit-learn, TensorFlow, PyTorch (+117 more)

Molli Dinesh

Screened

Mid-level AI/ML Engineer specializing in NLP, LLMs, and MLOps

Remote, USA · 4y exp
Marsh McLennan · Illinois Institute of Technology

“Built an AI-driven insurance policy summarization platform at Marsh, taking it end-to-end from messy PDF ingestion/OCR and custom extraction through LLM fine-tuning and AWS SageMaker deployment. Delivered measurable impact (25% reduction in manual review time, 99% uptime) and demonstrated strong production MLOps/LLMOps practices with Airflow/Step Functions orchestration, rigorous evaluation (ROUGE + human review), and continuous monitoring for drift, latency, and hallucinations.”

Python, Pandas, NumPy, Scikit-learn, R, SQL (+132 more)

Shweta Gupta

Screened

Senior Backend Software Engineer specializing in Java microservices, Kafka, and AWS

Seattle, WA · 6y exp
EasyBee AI · UC Irvine

“AI engineer who shipped a production chat assistant for a storage company by building the underlying RAG-style knowledge base (document ingestion, chunking/embeddings, FAISS vector store) and an admin update interface to keep content current. Also has full-stack delivery experience (Python REST APIs + React/TypeScript UI) and AWS operations using Terraform/Jenkins, including handling a real production performance incident by optimizing DB queries and adding auto-scaling.”

A/B Testing, Agile, API Testing, AWS, Bash, Batch Processing (+111 more)

Supriya Mattapelly

Screened

Mid-level AI/ML Engineer specializing in GenAI agents, RAG pipelines, and MLOps

USA · 6y exp
UnitedHealthcare · Kent State University

“AI/ML engineer who built a production RAG-based internal document intelligence assistant (LangChain + Pinecone) to let employees query enterprise reports in natural language. Demonstrated hands-on pipeline orchestration with Apache Airflow and tackled real production issues like retrieval grounding and latency using tuning, caching, and token optimization, while partnering closely with non-technical business stakeholders through iterative demos.”

A/B Testing, Amazon CloudWatch, Amazon EC2, Amazon EMR, Amazon Redshift, Amazon S3 (+152 more)

Arya Santosh Thorat

Screened

Mid-Level Software Engineer specializing in cloud-native backend systems

Texas, State · 4y exp
PNC · University of Texas at Dallas

“Full-stack/backend engineer with deep experience building real-time fraud and credit-risk systems. Shipped an event-driven fraud monitoring platform (Kafka→MongoDB/Redis→WebSockets) delivering sub-200ms updates to 3000+ concurrent internal users, and built a Java/Spring Boot credit risk decisioning API that improved turnaround time by 30–40%. Strong AWS production operations (ECS Fargate/RDS/Redis) with proven incident response and performance tuning.”

Java, Python, C++, Scala, JavaScript, Spring Boot (+103 more)

Ruijing Wang

Screened

Intern Data Scientist specializing in healthcare AI and experimentation

Boulder, CO · 1y exp
EchoPlus AI · Stevens Institute of Technology

“Human-AI Design Lab practitioner who productionized a wearable-health anomaly detection system by evolving a standalone autoencoder into a hybrid autoencoder + GPT-based approach, backed by PySpark ETL and MLOps on AWS SageMaker/MLflow. Also has applied LLM troubleshooting experience (fine-tuned FLAN-T5 summarization) and partnered with BI teams to run A/B tests and improve retention via feature stores and experimentation.”

Python, Pandas, Scikit-Learn, PyTorch, TensorFlow, SQL (+97 more)