Reval

Vetted PySpark Professionals

Pre-screened and vetted.

SK

Mid-level AI/ML Engineer specializing in Generative AI and healthcare data

NJ, USA | 6y exp
Johnson & Johnson | Wichita State University

Built and deployed a production RAG-based document Q&A system on Azure OpenAI to help business teams search thousands of PDFs/Word files, using Qdrant vector search, MongoDB, and a Flask API. Demonstrates strong production engineering (streaming large-file ingestion, parallel preprocessing, monitoring/retries) plus systematic prompt/embedding/chunking experimentation to improve accuracy and reduce hallucinations, and has hands-on orchestration experience with ADF/Airflow/Databricks/Synapse.

AR

Anurag Reddy

Screened

Mid-level Data Scientist specializing in ML, MLOps, and Generative AI

TX, USA | 5y exp
Caterpillar | University of Illinois Chicago

ML/NLP engineer who built a RAG-based technical assistant for Caterpillar field engineers, transforming PDF keyword search into intent-based semantic retrieval across manuals, logs, sensor reports, and technician notes. Strong in productionizing data/ML systems (Airflow, PySpark) with rigorous preprocessing, entity resolution, and evaluation—delivering measurable gains in accuracy, relevance, and duplicate reduction.

CV

Cristian Vega

Screened

Senior AI/ML Engineer specializing in Generative AI and RAG

California | 9y exp
Morf Health | University of Texas at Austin

ML/NLP practitioner at Morf Health focused on unifying fragmented healthcare data by linking structured patient/encounter records with unstructured clinical notes. Has hands-on experience with transformer embeddings, vector databases, and domain fine-tuning, plus rigorous evaluation (precision/recall) and human-in-the-loop validation with clinical SMEs to make pipelines production-grade.

CB

Cary Burdick

Screened

Senior Data Scientist specializing in data engineering and analytics

Chicago, IL | 6y exp
USDA | Auburn University

Data/NLP practitioner with experience in both financial services (Truist) and government (USDA), including an NLP-driven analysis of EU regulations to anticipate US regulatory focus and a major redesign/cleaning of complex pathogen lab-test public datasets. Built production data-quality pipelines with Dagster, Pandera, and Azure Synapse, and is comfortable validating hypotheses with historical backtesting and SME-driven quality controls.

KP

Kavya Paluvai

Screened

Mid-level Data Scientist specializing in fraud detection and healthcare ML

North Carolina, USA | 4y exp
Wells Fargo | University of North Carolina at Charlotte

Applied NLP/ML in healthcare and financial services, including fine-tuning BERT on unstructured EHR text and building embedding-based similarity search for clinical concepts. Also redesigned a Wells Fargo fraud detection data pipeline using modular Python + AWS Glue/Step Functions, cutting runtime ~40% with improved monitoring and reliability.

AB

Ananya Bojja

Screened

Mid-level AI/ML Engineer specializing in healthcare analytics and MLOps

USA | 4y exp
Cigna | University of New Hampshire

AI/ML engineer at Cigna Healthcare building a production, HIPAA-compliant LLM-powered clinical insights platform that summarizes unstructured medical notes using a fine-tuned transformer + RAG on AWS. Demonstrates strong end-to-end MLOps and cloud optimization (distillation, Spot/Lambda/Auto Scaling) with quantified outcomes (~28% accuracy lift, ~40% less manual review, ~25% lower ops cost) and strong clinician-facing explainability via SHAP and dashboards.

RN

Mid-level Software Engineer specializing in Python backend, data engineering, and cloud microservices

New Jersey, USA | 6y exp
Abacus Insights | NJIT

Backend-leaning full-stack engineer with production experience in both healthcare (claims enrichment/interoperability at Abacus) and finance (Goldman Sachs pricing/risk APIs + React dashboards). Built an event-driven AI grading platform using Postgres Debezium CDC + Kafka + FastAPI on AWS that cut manual grading ~70% and served 1000+ students, with strong emphasis on reliability, testing, and performance tuning.

RK

Rohith Kollu

Screened

Senior Software Engineer specializing in backend microservices, cloud, and full-stack systems

Dallas, TX | 7y exp
Cisco | Indiana Wesleyan University

Backend/platform engineer who has built and scaled production Java/Spring Boot + Kafka services on AWS/Kubernetes (1M+ msgs/day) and led reliability/performance fixes that restored SLAs (25–30% latency improvement; 99.9% uptime). Also shipped an AI customer-support chatbot end-to-end using retrieval + guardrails and rigorous evaluation/observability, improving resolution time 40% and satisfaction 25%, with a strong plan/execute/verify approach to agentic workflow reliability.

PS

Priya Shah

Screened

Mid-level DevOps Engineer specializing in AWS cloud infrastructure and CI/CD automation

OH | 6y exp
ServiceNow | Sardar Patel University

Backend/data engineer with production experience building a SaaS analytics platform: FastAPI-based microservices with Redis caching and reliability patterns (RBAC, retries/backoff, centralized error handling). Also delivered AWS data pipelines (Glue/PySpark to Redshift) and owned real production incidents using CloudWatch/SNS, plus hands-on PostgreSQL query tuning on multi-million-row reporting workloads.

SH

Mid-level Machine Learning & Data Infrastructure Engineer specializing in MLOps on AWS

Boston, MA | 5y exp
Dextr.ai | Northeastern University

Built and deployed a fine-tuned Qwen 2.5 14B model into production at Dextr.ai as the backbone for hotel-operations agentic workflows, running on AWS EKS with Triton and TensorRT-LLM. Demonstrates strong cost-aware LLM engineering (QLoRA, FP8/BF16 on H100) plus rigorous benchmarking/observability (Prometheus, LangSmith) with reported sub-30ms TTNT. Previously handled long-running ETL orchestration with Airflow at GE Healthcare and Lowe's.

EG

Esha Gangam

Screened

Mid-level AI/ML Engineer specializing in LLMs, RAG, and MLOps

USA | 4y exp
Deloitte | University at Albany

GenAI/ML engineer from Deloitte who built and shipped a production RAG-based internal search assistant for support teams, delivering quantified operational gains (20% effort reduction, 35% faster manual lookup). Experienced in enterprise-grade LLM reliability (grounding/hallucination control), compliance/security constraints, and rapid release cycles using CI/CD, MLflow, and orchestration tools (Airflow, Databricks Jobs, LangChain).

PK

Senior GenAI/ML Engineer specializing in LLMs, RAG, and multimodal generative AI

USA | 4y exp
GE HealthCare | Franklin University

LLM/RAG engineer with production deployments in highly regulated domains (Frost Bank and GE Healthcare). Built secure, explainable document-grounded Q&A systems using LoRA fine-tuning, strict RAG with confidence thresholds, and citation-based responses; also established evaluation/monitoring (golden QA sets, hallucination tracking, drift) and achieved ~40% latency reduction through retrieval/prompt tuning.

AS

Aditya Sairam

Screened

Mid-level Software Engineer specializing in cloud data platforms and AI search

Troy, MI | 6y exp
Robotics Technologies LLC | Cleveland State University

Open-source JavaScript contributor focused on data visualization, extending Chart.js/React with custom plugins for real-time streaming dashboards. Designed an end-to-end telemetry pipeline using Apache Kafka and Azure Cosmos DB, optimizing partitioning, batching, caching, and client throttling to keep latency low and support thousands of concurrent users. Demonstrates strong ownership in fast-changing environments, including building full-stack AI applications and ingestion/ETL pipelines at Robotics Technologies LLC.

PV

Mid-level Machine Learning Engineer specializing in LLM agents, RAG, and MLOps

New York City, NY | 6y exp
Avanade | University of North Texas

Built a production AI-driven contract/document extraction system combining OCR, normalization, and LLM schema-guided extraction, orchestrated with PySpark and Azure Data Factory and loaded into PostgreSQL for analytics. Emphasizes reliability at scale—using strict JSON schemas, confidence scoring, targeted retries, and multi-layer validation to control hallucinations while processing thousands of PDFs per hour—and partners closely with non-technical business teams to refine fields and deliver usable dashboards.

VM

Mid-level Machine Learning Engineer specializing in LLMs, RAG, and Clinical AI

Chicago, Illinois | 4y exp
Optum | Illinois Institute of Technology

Built and productionized a HIPAA-compliant LLM+RAG Clinical AI assistant at Optum, fine-tuning GPT/LLaMA on de-identified patient notes and integrating FAISS/Pinecone for sub-second retrieval; reported to cut diagnosis time by ~20 minutes per case. Experienced in orchestrating ML pipelines (Airflow, AWS Step Functions, Azure Data Factory) and in reliability techniques for LLM systems (grounding, citations, confidence filters, monitoring) while partnering closely with clinicians and compliance teams.

MN

Mid-level Data Engineer specializing in real-time pipelines and cloud data platforms

Remote, USA | 4y exp
Truist | Elmhurst University

Backend engineer with hands-on experience building secure Python/Flask services (sessions, JWT, RBAC) and optimizing PostgreSQL/SQLAlchemy performance, including custom SQL using CTEs/window functions profiled via EXPLAIN ANALYZE. Also integrates LLM features via OpenAI/Azure into backend systems and improves scalability with RabbitMQ-driven async processing, caching, and multi-tenant data isolation patterns.

BB

Mid-level Data Analyst specializing in healthcare and finance analytics

New Jersey, USA | 5y exp
Omada Health | Rowan University

Built an end-to-end Alexa smart-home IoT application controlling a Wi-Fi bulb, including ESP32 firmware (MQTT) and an AWS serverless backend (IoT Core/Device Shadow, Lambda, DynamoDB) with a REST API. Demonstrates strong real-time scalability patterns (streaming ingestion, stateless processing, partition-key design) and full-stack delivery with Spring Boot + React (JWT auth, CORS, data-heavy dashboards).

SP

Mid-level Data Analyst specializing in AI/ML and advanced analytics

USA | 3y exp
Accenture | Murray State University

Accenture data/ML practitioner who deployed a retail churn prediction and BERT-based sentiment analysis system to production, integrating behavioral + feedback data and operationalizing it with ETL automation, orchestration, and CI/CD. Experienced managing 2TB+ multi-source data, monitoring drift in Databricks, and translating results into Power BI dashboards for marketing teams (including K-means customer segmentation).

AP

Mid-level Full-Stack Java Developer specializing in cloud-native microservices

USA | 4y exp
Epic Systems | Webster University

Full-stack Java developer with IBM and Epic Systems experience modernizing legacy enterprise apps into microservices and delivering customer-facing healthcare claims workflows at very high scale (2M+ transactions/day). Strong blend of product engineering (APIs + React/TypeScript UI) and production operations on AWS, including performance incident remediation via query optimization, indexing, and autoscaling.

RG

Rohan Gore

Screened

Intern AI/ML Engineer specializing in agentic systems and full-stack development

New York City, NY | 0y exp
MARV Capital | NYU

Built and scaled a multi-agent LLM automation pipeline during a fintech internship, growing from a rapid 1-week proof-of-concept to a 15+ agent hierarchical system that cut market brief report generation time from ~5 hours to under 30 minutes. Hands-on with agent frameworks (Haystack, CrewAI, LangChain) and experienced in debugging agent communication issues via sandboxed modular testing and context/token management; also regularly gives architecture-first technical demos at multiple hackathons and university events.

KK

Mid-level Data Scientist specializing in MLOps, LLM/RAG applications, and deep learning

United States | 5y exp
Citigroup | University of North Texas

Built and deployed a production compliance automation RAG system (at Citi) that generates citation-backed, schema-validated risk summaries for regulatory document review. Emphasizes regulated-environment reliability with retrieval-only grounding, abstention, confidence thresholds, and immutable audit logging, plus orchestration using LangChain/LangGraph and Airflow. Reported ~60% reduction in compliance review effort while maintaining high precision and traceability.

NV

Mid-level AI/ML Engineer specializing in Generative AI, RAG, and real-time fraud detection

4y exp
U.S. Bank | University of Massachusetts Dartmouth

GenAI/ML engineer who has shipped production agentic systems in highly regulated and high-throughput environments, including an AWS Bedrock-based fraud/compliance workflow at U.S. Bank with PII redaction and hallucination detection that cut investigation time by 50%+. Also built and evaluated RAG and recommendation systems at Target, using RAGAS-driven testing, hybrid retrieval with re-ranking, and SHAP explainability dashboards to align model behavior with merchandising business KPIs.

MR

Mid-level GenAI Engineer specializing in production AI agents and evaluation pipelines

Overland Park, Kansas | 5y exp
Minutentag | Wilmington University

Built and shipped a production LLM-powered internal operations automation platform using LangChain RAG (Pinecone) and FastAPI microservices, deployed on AWS EKS, serving 10k+ daily interactions. Implemented a rigorous evaluation/observability stack (golden datasets, prompt regression tests, MLflow, retrieval metrics, hallucination monitoring) that drove hallucinations below 2% and improved reliability, and partnered closely with non-technical ops leaders to cut manual lookup work by 60%+.

RK

Ram Kottala

Screened

Mid-level Data & GenAI Engineer specializing in lakehouse, streaming, and RAG platforms

Michigan, USA | 5y exp
Ford | Webster University

Built a production internal LLM-powered knowledge assistant using a RAG architecture (Python, LLM APIs, cloud services) that answers employee questions with sourced, grounded responses from internal documents. Demonstrates strong practical depth in retrieval tuning (chunking/metadata filters), orchestration with LangChain, and production reliability practices (latency optimization, automated embedding refresh, evaluation metrics, logging/monitoring) while partnering closely with non-technical operations teams.
