Vetted Latency Optimization Professionals

Pre-screened and vetted.

Latency Optimization, Python, Docker, CI/CD, SQL, AWS

Omkarnath Thakur

Screened

Intern AI/Data Scientist specializing in LLMs, RAG, and MLOps

Maryland, USA · 2y exp

University of Maryland · University of Maryland, College Park

“Internship project at Builder Market: built an end-to-end production multimodal LLM application that estimates renovation/replacement costs from appliance photos (CLIP embeddings) or text descriptions, combining fine-tuning with agentic RAG. Focused heavily on real-world performance constraints—latency and cost—using parallel agent workflows, model routing to smaller/open-source models, re-ranking, and retrieval chunking, and collaborated closely with CEO/co-founders to deliver the solution.”

Python, Java, SQL, R, Machine Learning, Deep Learning, +142 more

Shubham Patil

Screened

Mid-level AI Engineer specializing in Generative AI, RAG systems, and fraud analytics

New York, NY · 4y exp

Syracuse University · Syracuse University

“Built and deployed a RAG-based student/faculty support chatbot at a university that answers from official syllabus/policy documents and now supports 4,000+ students while reducing repetitive support requests. Hands-on with LangChain, LangGraph, and CrewAI to orchestrate reliable agentic workflows, with a strong focus on testing/monitoring in production and cross-functional delivery (e.g., marketing analytics automation at Steve Madden).”

A/B Testing, Anomaly Detection, API Development, AWS, Azure Machine Learning, CI/CD, +91 more

Yashi Agarwal

Screened

Mid-level Machine Learning Engineer specializing in NLP, Generative AI, and RAG systems

Los Angeles, CA · 4y exp

Kaiyros · California State University, East Bay

“Built and deployed a production LLM-powered phone assistant for a healthcare clinic, combining streaming STT/TTS with RAG over approved clinic documents and strict safety guardrails to prevent unverified medical advice, plus seamless human handoff. Also has hands-on Apache Airflow experience building robust daily ML/data pipelines with data validation, retries/timeouts, monitoring, and metric-gated model deployment, and iterates closely with clinic staff using real call reviews.”

A/B Testing, Apache Airflow, Apache Spark, Azure Machine Learning, Bash, BERT, +103 more

Ooha Chintakunta

Screened

Mid-level Full-Stack Developer specializing in Python/Java and cloud-native web apps

Texas, USA · 4y exp

BNY Mellon · University of North Texas

“Robotics-focused full-stack engineer with hands-on ROS experience building sensor-processing and control nodes, plus a track record of debugging and optimizing real-time robot responsiveness via profiling and message-timing analysis. Uses Webots for pre-hardware validation and Docker/CI/CD to standardize deployments and catch issues early.”

SDLC, Agile, Waterfall, Python, Java, JavaScript, +105 more

Jay Patel

Screened

Mid-level AI/ML Engineer specializing in NLP, Document AI, and MLOps

USA · 6y exp

State Street · Pace University

“ML/LLM engineer with production experience building a RAG-based LLM support assistant (FastAPI, Redis, Kafka) with multi-layer validation and human-in-the-loop feedback loops to improve accuracy over time. Has orchestration and MLOps depth using Airflow and Kubeflow on Kubernetes (autoscaling, alerting, monitoring) and delivered measurable ops impact (40% ticket efficiency improvement) by partnering closely with customer support teams.”

Python, R, SQL, PyTorch, TensorFlow, scikit-learn, +106 more

Rajeev Reddy

Screened

Mid-level AI/ML Engineer specializing in NLP and production ML on cloud

4y exp

The Hartford · Florida Atlantic University

“ML engineer/data scientist who deployed a production credit risk + insurance claims triage platform at Hartford Financial, combining XGBoost default prediction with BERT-based document classification. Demonstrated strong MLOps by cutting inference latency to sub-500ms and building drift monitoring plus automated retraining/deployment pipelines (MLflow, CloudWatch, GitHub Actions, SageMaker) with human-in-the-loop review and SHAP-based explainability for underwriting adoption.”

A/B Testing, Agile, Amazon EC2, Amazon Redshift, Amazon S3, Anomaly Detection, +115 more

Ranxin Li

Screened

Mid-Level AI/Full-Stack Engineer specializing in agentic LLM systems and RAG

San Jose, USA · 2y exp

RevoAgent Solution · UC Davis

“Built and deployed Clyra.AI, an AI-driven daily scheduling product that uses a LangGraph-based multi-agent LLM pipeline (task extraction, verification, reflection) grounded with strict RAG over emails/documents/calendars and real-world signals like health metrics. Designed a custom agent orchestrator with bounded loops/termination conditions and a self-auditing verification/reflection layer to reduce hallucinations while controlling latency and cost via caching and model distillation.”

C, C++, Kotlin, Java, Python, JavaScript, +119 more

Chaitanya Kalagara

Screened

Mid-level Machine Learning Engineer specializing in LLMs, GenAI, and Computer Vision

Boston, MA · 3y exp

Camp4 Therapeutics · Northeastern University

“LLM/agent engineer who built a production multi-agent research automation system using LangGraph (planner, retriever with FAISS, supervisor, evaluator) with structured outputs and citation tracking for traceable reports. Emphasizes reliability and operations—LangSmith-based observability, multi-level testing, hallucination mitigation, and latency/cost controls—plus prior experience as a Computer Vision Software Engineer at Deepsight AI Labs working directly with non-technical customers.”

A/B Testing, Amazon EC2, Amazon S3, Amazon SageMaker, AWS, AWS Lambda, +87 more

Youssef Briki

Screened

Intern AI Researcher specializing in NLP, LLMs, and knowledge graphs

Montreal, QC · 1y exp

Acceleration Consortium · University of Montreal

“Built and shipped “LabMate,” a production AI assistant specialized in laboratory hardware, using a weighted multi-source RAG pipeline with reranking and reasoning-focused query decomposition to handle complex user questions. Deployed on a local GPU cluster with vLLM and NVIDIA MPS (plus OCR/VLM components), and established evaluation using synthetic + public reasoning datasets while collaborating weekly with non-technical admins to align requirements and resource constraints.”

API Development, Authentication, BERT, C, C++, CUDA, +94 more

Khushbu Kakdiya

Screened

Mid-level AI/ML Engineer specializing in LLMs, RAG pipelines, and cloud MLOps

California, USA · 6y exp

CVS Health · Cleveland State University

“Built and deployed a production LLM/RAG system at CVS to automate clinical documents, addressing PHI compliance, retrieval accuracy, and latency; achieved a 35–40% reduction in review effort through chunking and FP16/INT8 optimization. Also has experience translating AI outputs into actionable insights for non-technical stakeholders (sports analysts).”

Python, SQL, PySpark, R, Bash, Scikit-learn, +114 more

Ninad Walanj

Screened

Intern Software Engineer specializing in full-stack and LLM/RAG systems

Seattle, USA · 1y exp

Capria Ventures · Syracuse University

“Full-stack engineer who built "Workstream AI," an AI-powered engineering visibility product that converts GitHub activity into real-time insights using an event-driven microservices stack (RabbitMQ/Postgres/Express) and GPT-4 with a React frontend. Previously a Founding SWE at a health & wellness startup, building data-driven user management tooling, and also delivered a real-time shuttle tracking/ride request system using Java Spring Boot/Hibernate + React; comfortable owning production deployment details (AWS EC2, DNS, SSL).”

Agile, Angular, AWS, CI/CD, Caching, C, +76 more

PremKumar Gandla

Screened

Mid-level AI/ML Engineer specializing in MLOps, NLP, and scalable model deployment

Texas, USA · 4y exp

Blackbaud · Southern Arkansas University

“Built and deployed a production autonomous AI data analyst agent (LangChain + GPT + Streamlit on AWS) that turns natural-language questions into validated SQL, visualizations, and insights, cutting manual analysis time by ~50%. Emphasizes reliability and MLOps: schema-aware validation/guardrails to prevent hallucinations, scalable large-data processing, and Azure DevOps CI/CD + MLflow for automated deployment and experiment tracking.”

Python, SQL, R, TensorFlow, PyTorch, Scikit-learn, +87 more

Ramya Konda

Screened

Mid-level AI/ML Engineer specializing in healthcare ML and generative AI

Remote, USA · 5y exp

Humana · University of New Haven

“AI/LLM engineer at Humana who built and deployed a HIPAA-aware RAG system for clinical record retrieval, cutting search time dramatically and improving retrieval efficiency by 30%. Experienced with Spark-scale data preprocessing, QLoRA fine-tuning, LangChain orchestration, and MLflow+SageMaker integration, with a strong testing/evaluation discipline (A/B tests, human eval) to hit 95%+ accuracy and production latency targets.”

Python, R, SQL, PostgreSQL, BigQuery, Snowflake, +108 more

Nikhil Chagi

Screened

Intern Data Analyst specializing in data pipelines and LLM/RAG applications

San Francisco, CA · 1y exp

Cigna · University of North Texas

“Built and deployed LLM-powered analytics and reporting systems, including a RAG-based assistant over Snowflake that let business users ask questions in plain English instead of writing SQL. Experienced orchestrating LLM agents (LangChain) and serverless reporting pipelines (AWS Lambda/S3/RDS), with a strong focus on grounded outputs, monitoring/evaluation, and data quality—used daily by non-technical finance and operations teams at Cigna.”

Amazon EC2, Amazon RDS, AWS, AWS Lambda, Analytics, Anomaly Detection, +55 more

Sumanth Gottipati

Screened

Mid-level Full-Stack Software Engineer specializing in cloud-native microservices and FinTech

New York, NY · 4y exp

Delta Air Lines · Virginia University of Science and Technology

“At Delta Air Lines, built and shipped a production LLM-powered semantic search/troubleshooting assistant over maintenance logs and operational documentation using OpenAI embeddings and a vector database. Implemented hybrid ranking, query enrichment, and structured filters to improve relevance ~35% while optimizing latency via caching and vector tuning. Also designed a scalable Kafka + AWS (Lambda/SQS) ingestion pipeline with strong reliability/observability and an eval loop using real engineer queries and human review.”

Amazon CloudWatch, Amazon DynamoDB, Amazon EC2, Amazon S3, Amazon SQS, Asynchronous Processing, +111 more

Aryaa Deshpande

Screened

Junior AI Engineer specializing in ML, LLM systems, and RAG

Bangalore, India · 2y exp

NxtGen Cloud Technologies · University at Buffalo

“Built and deployed an LLM/applied-ML system enabling efficient extraction of useful information from large unstructured multimodal datasets, owning the full pipeline from ingestion to inference and APIs with a strong emphasis on production reliability, latency, and monitoring. Also delivered a voice-based AI workflow for Hindi policy document access for the Election Commission of India by translating non-technical usability needs into iterative demos and a successful implementation.”

Python, SQL, HTML, CSS, JavaScript, C, +83 more

Pranav Marla

Screened

Mid-level AI/ML & Full-Stack Engineer specializing in LLM agents and generative AI

Dallas, United States · 5y exp

Kalpa · Northeastern University

“LLM/agent builder who shipped a live consumer AI-agent app (kalpa.chat) that visualizes complex reasoning as interactive graphs and abstracts multi-provider model usage via a unified wallet. Professionally has applied LangChain/LangGraph to IVR parsing and to scaling a football video-generation pipeline at DAZN, including shipping a VAR-specific retrieval/order fix via SQL after iterating with a non-technical PM.”

Python, Java, C++, JavaScript, TypeScript, SQL, +80 more

Jitesh Kumar S

Screened

Junior Machine Learning Engineer specializing in NLP, computer vision, and MLOps

Lafayette, IN · 3y exp

Yaarcubes · University of Maryland, College Park

“ML/LLM engineer with Meta experience building production AI systems for near real-time user-report classification and summarization under strict latency (<250ms), safety, cost, and privacy constraints. Has hands-on MLOps/orchestration experience (Airflow, Spark, MLflow, Kubernetes, Docker, GitHub Actions) plus observability (Prometheus/Grafana) and applies rigorous evaluation, staged rollouts, and A/B testing to keep agent workflows reliable in production.”

Python, SQL, Bash, Shell Scripting, Java, C++, +99 more

Akshay Katageri

Screened

Mid-level AI Engineer specializing in multi-agent systems and RAG

Jersey City, NJ · 4y exp

Elevance Health · Pace University

“Built and shipped a production LangGraph-based multi-agent LLM analytics/decision copilot that answers questions across SQL/BI systems and unstructured docs, emphasizing grounded, tool-verified outputs with citations and confidence gating. Deep hands-on experience with orchestration (LangGraph, CrewAI, OpenAI Assistants, MCP) plus real-world latency/cost optimization (vLLM batching/KV caching, speculative decoding, quantization) and rigorous eval/observability. Partnered closely with business/ops stakeholders to deliver explainable reporting automation, cutting manual reporting time by 50%+.”

Cross-Functional Collaboration, Data Pipelines, Docker, FAISS, Feature Engineering, Flask, +106 more

Snehitha Penumaka

Screened

Mid-level AI/ML Engineer specializing in predictive modeling and cloud ML pipelines

Dallas, TX · 3y exp

Cambard LLC · University of Texas at Dallas

“LLM engineer/data engineer who has deployed production RAG systems for internal-document Q&A, building end-to-end ingestion, embedding, vector search, and FastAPI serving while actively reducing hallucinations and latency through rigorous retrieval tuning and caching. Also experienced in orchestrating cloud data pipelines (Airflow, AWS Glue, Azure Data Factory) and partnering with non-technical business teams to deliver AI solutions like automated document review.”

A/B Testing, Agile, Anomaly Detection, Apache Spark, AWS Lambda, Classification, +93 more

Sivapriya Rachakonda

Screened

Mid-Level Software Engineer specializing in cloud-native microservices on AWS and Kubernetes

Remote, USA · 5y exp

Optum · University of South Dakota

“Backend engineer who built a stateless Python/Flask service supporting a healthcare-document ETL pipeline, offloading heavy processing to Celery workers and adding strong observability (metrics, structured logs, audits). Demonstrates practical performance/reliability work: batch chunking, priority queues, autoscaling by queue depth/CPU, DLQ routing, and PostgreSQL tuning (indexes, pagination) to cut slow API responses. Also has experience deploying real-time ML classification via TensorFlow Serving behind a FastAPI wrapper and integrating models via REST/gRPC.”

A/B Testing, Agile, AWS, AWS CloudFormation, AWS Lambda, Batch Processing, +120 more

Jeet Ashwin Shah

Screened

Junior Full-Stack Engineer specializing in backend systems and agentic AI

San Francisco, CA · 2y exp

ASANTe · University of Colorado Boulder

“Founding/early engineer experience across Asante and a Series A startup (Adgency), shifting from data science/ML into owning production full-stack systems end-to-end. Built core product flows (registration, business profiles, map service), AWS-deployed gRPC microservices with CI/CD, and operated low-latency agent/video ad generation workflows with retries/fallbacks and PostHog-based observability.”

AWS, Bash, CI/CD, Claude, Containerization, Data Modeling, +69 more

Lakshmi Priya Ramisetty

Screened

Mid-level ML & Data Engineer specializing in GenAI, graph modeling, and fraud/risk analytics

Redwood City, CA · 5y exp

BlueArc · Yeshiva University

“Built a production AI fraud/risk scoring platform at BlueArc that ingests web business/product/site data, generates text+image embeddings, and connects entities in a graph to detect reuse patterns and links to known bad actors. Optimized for scale with incremental graph re-scoring and delivered investigator-friendly explainability by surfacing the exact signals/relationships behind each score; orchestrated workflows with Airflow and GCP event-driven components (Pub/Sub, Dataflow, Cloud Run) and has recent LLM workflow orchestration experience (retrieval, prompting, scoring).”

Python, SQL, PySpark, Apache Airflow, ETL, PostgreSQL, +92 more

Sai Addala

Screened

Mid-level AI/ML Engineer specializing in financial risk, fraud analytics, and forecasting

USA · 4y exp

Northern Trust · Syracuse University

“Built and productionized an LLM-powered financial intelligence and forecasting platform at Northern Trust using a RAG architecture (LangChain + Hugging Face + FAISS) with end-to-end MLOps (Docker/Kubernetes, Airflow, MLflow). Emphasized regulatory-grade explainability (SHAP/Power BI) and hallucination control (retrieval-only grounding), achieving ~30% forecasting accuracy improvement and ~65% reduction in analyst research time, with sub-second inference and 95% uptime on EKS/AKS.”

Python, NumPy, Pandas, JSON, SQL, PostgreSQL, +116 more
