Vetted Model Evaluation Professionals

Pre-screened and vetted.

Kshitij Singh

Screened References | Moderate rec.

Intern Software Developer specializing in healthcare data and systems analysis

Telangana, India | 0y exp
Apollo Hospitals | IIT Jodhpur

Candidate comes from SaaS and healthcare analytics rather than game development, but has strong end-to-end ownership experience building real-time, high-availability systems in Python/AWS. They highlight measurable impact across performance, throughput, uptime, and cost reduction, including queue optimization and predictive ICU utilization pipelines, and are looking to transfer that systems engineering foundation into Unity/gameplay work.

Phillip Martini

Screened References | Moderate rec.

Director-level Product and Data Executive specializing in B2B SaaS and analytics

Zionsville, IN | 21y exp
Avionté | University of Cincinnati

Product leader with a strong track record of modernizing legacy SaaS platforms, including an API-first rebuild that increased engagement by 40% and reduced support burden. Also led AI-powered workflow automation that delivered 80-90% time savings through human-in-the-loop design, showing a pragmatic, user-centered approach to applied AI.

KM

Mid-Level AI/ML Software Engineer specializing in agentic LLM systems

Dallas, Texas | 6y exp
Datatron | University of West Florida

Built and deployed a production LLM-powered multi-agent compliance copilot (life sciences/finance) using LangChain/LangGraph + RAG over vector databases, delivered via async FastAPI on Kubernetes. Emphasizes audit-ready, deterministic outputs with schema constraints and citations, plus rigorous evaluation/monitoring; reports 60%+ reduction in manual research time and successful production adoption.

TK

Mid-level AI/ML Engineer specializing in Generative AI, RAG, and Conversational AI

3y exp
Aetna | Indiana Tech

Built a production RAG-based GenAI copilot backend at Aetna using Python/FastAPI, GPT-4, LangChain, and Azure AI Search, deployed on AKS with Prometheus/Grafana observability. Owned the system end-to-end (ingestion through deployment) and improved peak-time reliability by addressing vector search and embedding bottlenecks with Redis caching, index optimization, and async processing, plus added anti-hallucination guardrails via retrieval confidence thresholds.

CS

Junior AI/ML Engineer specializing in real-time computer vision and tracking systems

2y exp
Credence Management Solutions | University of Maryland, College Park

Full-stack engineer who built and owned a production real-time computer-vision inference platform at Credence, spanning Next.js App Router/TypeScript frontend with SSE/WebSocket streaming, a Flask backend, and Postgres analytics. Demonstrated measurable performance wins (70% fewer re-renders; latency cut to ~40–50ms) and strong production rigor (durable orchestration, idempotency, observability, AWS EC2 + CI/CD) with tight post-launch UX iteration based on analyst feedback.

DB

Mid-level AI/ML Engineer specializing in LLMs, RAG, and MLOps on AWS

TX, USA | 5y exp
BlackRock | Texas A&M University-Kingsville

AI engineer who built a production RAG-based internal analyst tool at BlackRock, fine-tuning an LLM on proprietary financial data and adding four layers of guardrails (input/retrieval/generation/output) to improve grounding and reduce hallucinations. Implemented a LangChain-based multi-agent orchestration (7 major agents) deployed on AWS ECS, with reliability measured via internal human evaluation, LLM-as-judge, and RLHF/drift monitoring.

Dylan Tang

Screened

Intern Software Developer and ML Researcher specializing in medical imaging and computer vision

Chicago, IL | 0y exp
IDX Exchange | University of Chicago

AI/ML practitioner with experience spanning audio/LLM applications (built "Iota" using Whisper, tiktoken, and a local Ollama-served LLM) and healthcare ML (Facemed.ai; UChicago Radiology). Demonstrates a production-oriented mindset (focus on data/model fit, deterministic field testing, and operational safeguards) and has improved research evaluation workflows via a hash-table-based concurrent model tracking approach.

Chia-En Lu

Screened

Junior AI/ML Systems Engineer specializing in LLM infrastructure and distributed training

1y exp
GenseeAI | UC San Diego

Built and shipped a production NMT system translating medical documentation for a rare/low-resource language, tackling data scarcity with retrieval-driven pattern matching plus dictionary/grammar- and LLM-based augmentation and validating quality with a linguistic expert. Also develops agentic LLM workflows with LangChain/LangGraph (including a deep-research style system) and has experience aligning medical AI deployments with clinician-defined risk metrics and human-in-the-loop decision making.

Prateeksha Ranjan

Mid-level Software Engineer specializing in embedded AI and full-stack systems

Irvine, California | 4y exp
Synaptics | UC Irvine

Robotics software engineer who built and owned core navigation components for a TurtleBot in ROS/ROS2 and Gazebo, including an RRT-based planner, waypoint-to-velocity motion planning, and PID trajectory tracking. Demonstrates strong real-time debugging skills (control-loop timing under CPU load), costmap/occupancy-grid tuning, and distributed ROS2 communication design using DDS/QoS, plus Docker and CI/CD automation experience from Keysight.

FM

Junior ML research engineer specializing in evaluation platforms and applied machine learning

New York, NY | 3y exp
Arthur AI | Emory University

ML/LLM infrastructure engineer who built and shipped a production internal evaluation + failure-analysis agent (Arthur AI / R3AI context) that orchestrated end-to-end benchmarks with deterministic lineage, regression detection, and root-cause reporting at 5,000+ benchmarks/week. Also built backend observability and data validation systems for analytics pipelines at FullStory processing ~3.4B weekly events, emphasizing schema validation, quarantine fallbacks, and idempotent operations.

Haolin Li

Screened

Entry-level Data Analyst specializing in marketing analytics and business intelligence

Los Angeles, CA | 1y exp
Helios & Partners | USC

CRM/lifecycle marketer with hands-on ownership of high-volume, multi-channel programs across email, SMS, and push, including Braze journey design, QA, deployment, and post-campaign analysis. Stands out for combining strong campaign operations with incrementality measurement and experimentation, including a 20% conversion improvement from journey optimization and 12% incremental revenue lift identified through holdout-based analysis.

Aarushi Mahajan

Mid-level AI/ML Engineer specializing in NLP, Generative AI, and MLOps

New York, USA | 4y exp
Intuit | University of Massachusetts Amherst

Internship experience shipping production AI systems: built an end-to-end RAG platform (Python/FastAPI + LangChain/LangGraph + vector search) to answer support questions from unstructured internal docs, with a strong focus on hallucination prevention through confidence gating and rigorous offline/online evaluation. Also delivered an AI-driven personalization/analytics feature using an unsupervised clustering pipeline, iterating with PMs to align statistically strong clusters with actionable business segmentation.

Russell Gagnon

Executive product leader specializing in digital health, AI, and B2B2C platforms

Los Angeles, CA | 21y exp
Wellth | Georgia Tech

Healthcare product leader who built AI-powered, human-in-the-loop experiences that personalized patient journeys while escalating risk to clinicians. Combines strong ML product judgment with disciplined UX simplification and cross-functional leadership, and ties that work to concrete outcomes like 10-20% better blood pressure control, lower A1c, stronger adherence, and higher engagement.

AP

Intern AI/ML Engineer specializing in LLM applications, RAG, and model evaluation

Atlanta, GA | 1y exp
PRGX | Duke University

Backend/ML engineer who built production LLM-enabled systems at PRGX, including an interpretable contract opportunity scoring engine (Bradley-Terry pairwise ranking) that reached 0.82 weighted Spearman agreement with SME auditors and was integrated into workflow. Also built a Duke student advisor chatbot and hardened it for real-world reliability/security with schema-driven tool calling, normalization, and off-domain defenses; led staged production rollouts with shadow testing and achieved 0.90 F1 on a new extraction field before shipping.

SK

Mid-level GenAI/ML Engineer specializing in LLM agents and RAG for Financial Services & Healthcare

5y exp
Bank of America | Virginia Commonwealth University

Built and deployed a production GenAI internal support agent at Bank of America (“Ask GPS/AskGPT”) using RAG on Azure, focused on reducing escalations and improving response quality for repetitive knowledge-based queries. Demonstrates strong production LLM engineering: custom LangChain orchestration, retrieval tuning to reduce hallucinations, rigorous offline/online evaluation, and model benchmarking with dynamic routing (e.g., GPT-4 vs Claude).

SA

Mid-level Software Engineer specializing in AI agents, backend systems, and data engineering

4y exp
Amazon | Georgia State University

Amazon engineer who built a production AI agent platform (Python/AWS Strands on Bedrock) that lets teams create tool-using, multi-agent workflows, e.g., agents that auto-triage and resolve customer support tickets by reading internal documentation and collaborating with a research agent. Previously worked at Deloitte on IAM using Ping Identity/Ping DaVinci orchestration, and applies orchestration thinking plus structured evaluation (LLM-as-judge, surveys, automated tests) to improve agent reliability.

SS

Mid-level NLP/LLM Researcher specializing in question answering and retrieval-augmented generation

State College, PA | 6y exp
Bosch | Penn State University

Built ToolDreamer, a framework for selecting relevant tools for LLM agents by training a retriever on LLM-generated reasoning traces, and has hands-on experience building multi-agent systems in AutoGen (MAG-V) focused on question generation and tool-trajectory verification. Currently works as an AI-guides supervisor at Penn State, regularly communicating AI concepts to non-technical stakeholders.

Shishir Yadav

Screened

Mid-level Full-Stack Java Developer specializing in financial services and cloud-native microservices

New York, NY | 3y exp
Freddie Mac | Purdue University

Software engineer in the mortgage/financial services domain (Freddie Mac) who builds end-to-end loan origination and credit risk capabilities using Spring Boot microservices, Angular dashboards, and data pipelines. Delivered measurable impact (30% reduction in underwriting turnaround time) and emphasizes production reliability/compliance with strong guardrails, observability, and evaluation loops for risk scoring systems.

SK

Mid-level AI/ML Engineer specializing in LLMs, RAG, and MLOps

USA | 4y exp
ServiceNow | Valparaiso University

ServiceNow engineer who built and launched a production LLM-powered ticket resolution/knowledge assistant using RAG (LangChain + Hugging Face embeddings + vector search) integrated into internal support dashboards via REST APIs. Optimized the system from ~6–8s to ~2–3s latency while improving usability with concise, cited answers and guardrails (grounding + similarity thresholds), delivering ~30–35% reduction in manual ticket investigation effort.

Pooja Dokuri

Screened

Mid-level AI/ML Engineer specializing in GenAI, RAG pipelines, and cloud MLOps

Remote, USA | 4y exp
UnitedHealth Group | East Texas A&M University

Built and deployed a production LLM + vector search clinical decision support system at UnitedHealth Group, retrieving medical evidence and patient context in real time for prior authorization and risk scoring. Strong in end-to-end RAG architecture (Hugging Face embeddings, Pinecone/FAISS, SageMaker, Redis) plus orchestration (Airflow/Kubeflow) and rigorous evaluation/monitoring, with demonstrated ability to align solutions with clinical operations stakeholders.

Sragvi Vadali

Screened

Junior Software Engineer specializing in AI/ML and real-time systems

2y exp
University of Southern California | USC

Backend/AI engineer who built a real-time vector database system for high-frequency financial data using Kafka/Flink on Kubernetes, achieving sub-100ms similarity search at 10k+ concurrent load and resolving tricky duplication issues with idempotency/versioning. Also shipped an end-to-end LLM-based travel itinerary feature (profiling + prompt workflows + APIs) with a focus on quality consistency and low latency.

Bernie Miao

Screened

Junior Full-Stack Software Engineer specializing in EdTech and AI-powered learning tools

Berkeley, CA | 2y exp
CollegeNET | UC Berkeley

Edtech/education-focused engineer who took an accessibility-critical LLM/vision feature from concept to production: built an OpenCV-gated whiteboard capture pipeline feeding Gemini Vision for handwriting-to-LaTeX, improving math transcription 80% while cutting inference costs 60%. Also built RAG observability and retrieval fixes that stabilized inconsistent answers, and partnered directly with sales to reshape demos and open a new K-12 revenue pipeline aligned to California Digital Divide grant requirements.

Utkarsh Srivastava

Junior Machine Learning Engineer specializing in LLMs, RAG, and medical imaging

New York City, USA | 3y exp
NYU Langone Health | NYU

At Fileread, the candidate built and deployed an LLM-powered legal document classification and retrieval layer for an agentic extraction system that turns unstructured legal PDFs into structured tables with line-level citations. They productionized a RAG-style pipeline (ingestion, embeddings, retrieval, reranking, generation) and report 95%+ F1 across 70+ legal categories, emphasizing rigorous evaluation and close collaboration with legal domain experts for high-stakes precision.

Prasanna Chelliboyina

Mid-level Machine Learning Engineer specializing in forecasting, NLP, and GenAI

United States | 6y exp
Walgreens | Syracuse University

GenAI/ML engineer with production experience building multilingual LLM systems (English/Spanish) and RAG-based clinical documentation summarization at Walgreens, combining prompt engineering, structured output validation, and rigorous evaluation (ROUGE + pharmacist review). Also orchestrated end-to-end ML pipelines for demand forecasting using Apache Airflow, PySpark, and MLflow with scheduled retraining and production monitoring.
