Vetted Model Evaluation Professionals

Pre-screened and vetted.

Model Evaluation, Python, SQL, Docker, AWS, PyTorch

Aaditey Pillai

Screened

Intern AI/ML Engineer specializing in LLM applications, RAG, and model evaluation

Atlanta, GA | 1y exp
PRGX | Duke University

“Backend/ML engineer who built production LLM-enabled systems at PRGX, including an interpretable contract opportunity scoring engine (Bradley-Terry pairwise ranking) that reached 0.82 weighted Spearman agreement with SME auditors and was integrated into workflow. Also built a Duke student advisor chatbot and hardened it for real-world reliability/security with schema-driven tool calling, normalization, and off-domain defenses; led staged production rollouts with shadow testing and achieved 0.90 F1 on a new extraction field before shipping.”

Python, Pandas, NumPy, Scikit-Learn, Object-Oriented Programming (OOP), Feature Engineering, +94 more

Santhosh Kumar

Screened

Mid-level GenAI/ML Engineer specializing in LLM agents and RAG for Financial Services & Healthcare

5y exp
Bank of America | Virginia Commonwealth University

“Built and deployed a production GenAI internal support agent at Bank of America (“Ask GPS/AskGPT”) using RAG on Azure, focused on reducing escalations and improving response quality for repetitive knowledge-based queries. Demonstrates strong production LLM engineering: custom LangChain orchestration, retrieval tuning to reduce hallucinations, rigorous offline/online evaluation, and model benchmarking with dynamic routing (e.g., GPT-4 vs Claude).”

AWS, AWS Lambda, CI/CD, Claude, Databricks, Decision Trees, +97 more

Utkarsh Srivastava

Screened

Junior Machine Learning Engineer specializing in LLMs, RAG, and medical imaging

New York City, USA | 3y exp
NYU Langone Health | NYU

“At Fileread, the candidate built and deployed an LLM-powered legal document classification and retrieval layer for an agentic extraction system that turns unstructured legal PDFs into structured tables with line-level citations. They productionized a RAG-style pipeline (ingestion, embeddings, retrieval, reranking, generation) and report 95%+ F1 across 70+ legal categories, emphasizing rigorous evaluation and close collaboration with legal domain experts for high-stakes precision.”

Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), OpenAI API, Embeddings, Prompt engineering, Vector databases, +94 more

Shiva Adusumilli

Screened

Mid-level Software Engineer specializing in AI agents, backend systems, and data engineering

4y exp
Amazon | Georgia State University

“Amazon engineer who built a production AI agent platform (Python/AWS Strands on Bedrock) that lets teams create tool-using, multi-agent workflows, e.g., agents that auto-triage and resolve customer support tickets by reading internal documentation and collaborating with a research agent. Previously worked at Deloitte on IAM using Ping Identity/Ping DaVinci orchestration, and applies orchestration thinking plus structured evaluation (LLM-as-judge, surveys, automated tests) to improve agent reliability.”

Python, C++, Java, JavaScript, TypeScript, MySQL, +82 more

Saptarshi Sengupta

Screened

Mid-level NLP/LLM Researcher specializing in question answering and retrieval-augmented generation

State College, PA | 6y exp
Bosch | Penn State University

“Built ToolDreamer, a framework for selecting relevant tools for LLM agents by training a retriever on LLM-generated reasoning traces, and has hands-on experience building multi-agent systems in AutoGen (MAG-V) focused on question generation and tool-trajectory verification. Currently works as an AI-guides supervisor at Penn State, regularly communicating AI concepts to non-technical stakeholders.”

Python, C++, MATLAB, SQL, PyTorch, Hugging Face, +51 more

Pooja Dokuri

Screened

Mid-level AI/ML Engineer specializing in GenAI, RAG pipelines, and cloud MLOps

Remote, USA | 4y exp
UnitedHealth Group | East Texas A&M University

“Built and deployed a production LLM + vector search clinical decision support system at UnitedHealth Group, retrieving medical evidence and patient context in real time for prior authorization and risk scoring. Strong in end-to-end RAG architecture (Hugging Face embeddings, Pinecone/FAISS, SageMaker, Redis) plus orchestration (Airflow/Kubeflow) and rigorous evaluation/monitoring, with demonstrated ability to align solutions with clinical operations stakeholders.”

Python, Pandas, NumPy, PySpark, Scikit-learn, SQL, +133 more

Sragvi Vadali

Screened

Junior Software Engineer specializing in AI/ML and real-time systems

2y exp
University of Southern California | USC

“Backend/AI engineer who built a real-time vector database system for high-frequency financial data using Kafka/Flink on Kubernetes, achieving sub-100ms similarity search at 10k+ concurrent load and resolving tricky duplication issues with idempotency/versioning. Also shipped an end-to-end LLM-based travel itinerary feature (profiling + prompt workflows + APIs) with a focus on quality consistency and low latency.”

Java, C++, Python, JavaScript, TypeScript, Flask, +86 more

Bernie Miao

Screened

Junior Full-Stack Software Engineer specializing in EdTech and AI-powered learning tools

Berkeley, CA | 2y exp
CollegeNET | UC Berkeley

“Edtech/education-focused engineer who took an accessibility-critical LLM/vision feature from concept to production: built an OpenCV-gated whiteboard capture pipeline feeding Gemini Vision for handwriting-to-LaTeX, improving math transcription 80% while cutting inference costs 60%. Also built RAG observability and retrieval fixes that stabilized inconsistent answers, and partnered directly with sales to reshape demos and open a new K-12 revenue pipeline aligned to California Digital Divide grant requirements.”

Apache Tomcat, AWS, C, CSS, D3.js, Database schema design, +82 more

Shishir Yadav

Screened

Mid-level Full-Stack Java Developer specializing in financial services and cloud-native microservices

New York, NY | 3y exp
Freddie Mac | Purdue University

“Software engineer in the mortgage/financial services domain (Freddie Mac) who builds end-to-end loan origination and credit risk capabilities using Spring Boot microservices, Angular dashboards, and data pipelines. Delivered measurable impact (30% reduction in underwriting turnaround time) and emphasizes production reliability/compliance with strong guardrails, observability, and evaluation loops for risk scoring systems.”

Java, Spring Boot, Spring Cloud, Spring MVC, REST APIs, Microservices Architecture, +127 more

Prasanna Chelliboyina

Screened

Mid-level Machine Learning Engineer specializing in forecasting, NLP, and GenAI

United States | 6y exp
Walgreens | Syracuse University

“GenAI/ML engineer with production experience building multilingual LLM systems (English/Spanish) and RAG-based clinical documentation summarization at Walgreens, combining prompt engineering, structured output validation, and rigorous evaluation (ROUGE + pharmacist review). Also orchestrated end-to-end ML pipelines for demand forecasting using Apache Airflow, PySpark, and MLflow with scheduled retraining and production monitoring.”

A/B Testing, Agile, Anomaly Detection, Apache Spark, AWS, Azure Machine Learning, +114 more

Sai Charan Kolla

Screened

Mid-level AI/ML Engineer specializing in LLMs, RAG, and MLOps on AWS

TX, USA | 5y exp
BlackRock | Texas A&M University-Kingsville

“LLM engineer who built a production document intelligence/RAG pipeline to extract structured data from thousands of unstructured PDFs, cutting manual review time by 60%. Experienced with LangChain and Airflow orchestration plus rigorous evaluation (labeled datasets, prompt testing, HITL review, monitoring) to improve accuracy and reduce hallucinations while partnering closely with non-technical operations stakeholders.”

Python, SQL, R, Java, C++, Machine Learning, +99 more

Siddhardha Kanamatha

Screened

Mid-level AI/ML Engineer specializing in LLMs, RAG, and MLOps

USA | 4y exp
ServiceNow | Valparaiso University

“ServiceNow engineer who built and launched a production LLM-powered ticket resolution/knowledge assistant using RAG (LangChain + Hugging Face embeddings + vector search) integrated into internal support dashboards via REST APIs. Optimized the system from ~6–8s to ~2–3s latency while improving usability with concise, cited answers and guardrails (grounding + similarity thresholds), delivering ~30–35% reduction in manual ticket investigation effort.”

Python, SQL, R, Java, Machine Learning, Deep Learning, +93 more

Dheeraj Vajjarapu

Screened

Mid-level AI/ML Engineer specializing in MLOps, NLP/LLMs, and computer vision

Remote, USA | 4y exp
Barclays | Yeshiva University

“Built and shipped a production LLM/RAG risk-case summarization and triage system used by fraud/compliance analysts, with strong grounding controls (evidence-cited outputs and refusal on low confidence). Demonstrates end-to-end ownership across retrieval quality, Airflow-orchestrated indexing pipelines, and compliance-grade privacy (PII redaction, RBAC, encrypted redacted logging, and auditable prompt/model versioning) plus a tight feedback loop with non-technical domain experts.”

Python, SQL, Bash, Machine Learning, Deep Learning, Scikit-learn, +124 more

Sai Dev

Screened

Mid-level AI/ML Engineer specializing in MLOps, computer vision, and NLP

Newark, CA | 4y exp
Lucid Motors | Cleveland State University

“GenAI/ML engineer from Lucid Motors who built and productionized an LLM-powered RAG diagnostic assistant for manufacturing and maintenance teams, deployed on AWS with Docker/Kubernetes and MLflow. Demonstrates end-to-end ownership from retrieval/prompt design to scalability, monitoring, and workflow integration via APIs, plus production ML pipeline orchestration with Kubeflow (Spark/Kafka + TensorFlow) for predictive maintenance use cases.”

Python, C++, R, SQL, Scala, TensorFlow, +121 more

Akshitha Akula

Screened

Mid-Level Full-Stack Python Engineer specializing in cloud APIs and data/ML platforms

Bentonville, AR | 4y exp
Walmart | University of Central Missouri

“Backend engineer at Goldman Sachs who deployed internal LLM-powered utilities to summarize operational logs/tickets, with a strong emphasis on data sensitivity and reliability. Built deterministic workflows with template-based prompts, confidence checks, and rule-based fallbacks, and used monitoring plus failure-rate metrics to tune performance; also has hands-on Temporal orchestration experience for resilient async backend jobs.”

Python, JavaScript, TypeScript, SQL, FastAPI, Flask, +116 more

Mahesh Kumar Duvvuri

Screened

Senior Full-Stack Software Engineer specializing in microservices and cloud-native systems

New York City, NY | 4y exp
JPMorgan Chase | University of Dayton

“Backend/infra engineer with experience across Nestle, J.P. Morgan, and Capgemini, combining ML systems work (YOLOv8/PyTorch object detection with TFLite edge deployment) with production-grade cloud/Kubernetes operations. Has delivered measurable impact via AWS migrations (25% cost reduction, 99.9% availability), microservice modernization (35% faster processing), and low-latency Kafka streaming for financial dashboards (<100ms) using DLQs and idempotent consumers.”

Java, Python, TypeScript, JavaScript, C, C++, +158 more

Harshitha Kotari

Screened

Mid-level Data/ML Engineer specializing in NLP, GenAI, and scalable data pipelines

5y exp
Abbott | Clarkson University

“AI/ML engineer with production experience building LLM-powered document intelligence and customer support systems in healthcare/insurance, emphasizing high-accuracy RAG, long-document processing, and robust monitoring/fallback mechanisms. Also automates and scales ML lifecycle workflows using Apache Airflow and Kubeflow, and partners closely with non-technical operations stakeholders to drive adoption.”

Python, R, SQL, Java, MATLAB, HTML, +148 more

Yuan Xu

Screened

Junior Machine Learning Engineer specializing in multimodal AI and audio deepfake detection

Berkeley, California | 3y exp
Scam AI | Carnegie Mellon University

“Internship experience building production-oriented AI systems, including a real-time voice scam/spoof detector (RawNet + AASIST) hardened for noisy audio via aggressive augmentation and Zoom-based noise simulation, evaluated with EER on clean and wild datasets. Also built an LLM-driven UI automation agent using Playwright for apps like Linear/Notion with modular tool design, unit tests, and replayable scripted scenarios, and has AWS Step Functions experience orchestrating Lambda/Cognito workflows.”

Python, C, C++, Java, Linux, SQL, +78 more

Sanjana Duvva

Screened

Mid-level AI/ML Engineer specializing in Generative AI, LLMOps, and MLOps

5y exp
Wells Fargo | University of North Texas

“Built and deployed an AWS-based LLM/RAG ticket triage and knowledge retrieval system (Pinecone/FAISS + Step Functions + MLflow) that cut support resolution time by 20%. Demonstrates strong production focus on hallucination reduction, PII security, and low-latency orchestration, with measurable evaluation improvements (e.g., ~25% grounding accuracy gain via re-ranking) and proven collaboration with support operations stakeholders.”

Python, SQL, Java, Scala, Shell Scripting, TypeScript, +153 more

Harshavardhan Garikala

Screened

Mid-level AI/ML Engineer specializing in LLMs, RAG, and MLOps

NJ, USA | 4y exp
Red Hat | Oklahoma Christian University

“Red Hat ML/LLM engineer who designed and deployed a production LLM-powered customer support automation system using RAG, improving latency by 30% via PEFT and vector search optimization. Built security and governance into retrieval (access-level filtering, encrypted Pinecone/ChromaDB) and delivered SHAP-based explainability via a dashboard for non-technical stakeholders. Experienced orchestrating distributed ML/RAG pipelines across AWS SageMaker and OpenShift with Airflow/Prefect, plus multi-agent workflows using CrewAI and LangGraph.”

Python, PySpark, SQL, TensorFlow, PyTorch, Hugging Face, +127 more

Sahithya Godishala

Screened

Mid-level AI/ML Engineer specializing in GenAI, LLMs, RAG, and MLOps

St. Louis, MO | 5y exp
Centene | Saint Louis University

“Built and deployed a production LLM-powered RAG document intelligence/Q&A system for healthcare prior authorization, reducing manual medical document review time and improving decision efficiency. Strong in end-to-end LLM application engineering (LangChain/LangGraph), retrieval quality improvements (hybrid search, embedding tuning, chunking strategies), and rigorous evaluation/monitoring for reliability.”

Python, SQL, PostgreSQL, REST APIs, FastAPI, Flask, +108 more

Saksham Khatwani

Screened

Mid-level Software Engineer specializing in NLP and search systems

Aurora, United States | 3y exp
University of Colorado Anschutz Medical Campus | University of Colorado Boulder

“Built an AI journaling app at HackCU 2025 featuring a speaking AI avatar with long-term memory via RAG (ChromaDB) and low-latency microservices coordinated through Kafka, including deployment under AMD/non-CUDA constraints using a quantized Llama 8B model. Also has Goldman Sachs experience deploying a Trade UI on Kubernetes with CI/CD rollback automation, plus a healthcare AI internship at CU Anschutz collaborating closely with physicians on diagnostic reasoning and dataset annotation.”

Python, Java, SQL, JavaScript, TypeScript, HTML, +83 more