“Machine learning/NLP practitioner at J.P. Morgan who led development of a production RAG system and an entity resolution pipeline for complex financial data. Deep hands-on experience with embeddings (Sentence-BERT), vector search (FAISS/pgvector), LLM fine-tuning (LoRA/PEFT), and rigorous evaluation (human-in-the-loop + A/B testing) backed by strong MLOps on AWS (Docker/Kubernetes, MLflow, Prometheus/Datadog).”

Python, R, SQL, JavaScript, REST APIs, gRPC, +124 more

Sai Supriya

Screened

Mid-level AI/ML Engineer specializing in LLM alignment, safety, and scalable inference

St. Louis, MO · 7y exp

Anthropic · Saint Louis University

“Built and productionized an AWS-hosted, Kubernetes-orchestrated RAG assistant that enables natural-language Q&A over internal document repositories with grounded answers and citations. Demonstrates strong applied LLM engineering: hallucination mitigation, hybrid retrieval + re-ranking, and rigorous evaluation via benchmarks and A/B testing, plus real-world scaling of compute-heavy inference with dynamic batching and monitoring.”

Apache Spark, AWS, CI/CD, Data Ingestion, Data Pipelines, Data Preprocessing, +127 more

Nishitha Thummala

Screened

Mid-level AI/ML Engineer specializing in LLMs, RAG, and scalable inference

San Francisco, CA · 6y exp

Perplexity · University of Nebraska Omaha

“Backend/retrieval-focused engineer with production experience at Perplexity building a large-scale real-time Q&A system using retrieval-augmented generation, emphasizing low-latency, high-quality answers through ranking, context optimization, and caching. Also has orchestration experience from both product-facing LLM pipelines and large-scale infrastructure workflows at Meta, and has partnered with non-technical stakeholders to align AI trade-offs with business goals.”

Python, FastAPI, Flask, Django, gRPC, JavaScript, +167 more

Nikhil Reddy

Screened

Mid-level AI/ML Engineer specializing in GPU inference and LLM platforms

San Francisco, CA · 5y exp

NVIDIA · Saint Louis University

“Built and deployed an LLM-powered platform that turns models into scalable REST/gRPC APIs, focusing on keeping GPU-backed inference fast and stable during traffic spikes. Experienced with AWS orchestration (EKS/ECS/Step Functions), safe model rollouts, and production-grade monitoring/testing for reliable AI agents and workflows.”

Python, Java, Spring Boot, JavaScript, TypeScript, React, +129 more

Krishna Reddy

Screened

Mid-level AI/ML Engineer specializing in fraud detection and clinical LLM assistants

New York, NY · 6y exp

Stripe · Indiana Wesleyan University

“Built and deployed a production clinical support LLM assistant at Mayo Clinic using a LangChain-orchestrated RAG architecture (Llama 2/PaLM) over de-identified clinical records, integrating BigQuery with Pinecone for semantic retrieval. Focused on healthcare-critical reliability by reducing hallucinations through grounding, implementing HIPAA-aligned privacy controls (Cloud DLP, VPC Service Controls), and running structured evaluations with clinician feedback.”

Agile, Amazon Bedrock, Apache Hadoop, Apache Hive, Apache Kafka, Apache Spark, +143 more

Graham Lutz

Screened

Director of Engineering specializing in platform, AI, and cloud-native SaaS

Atlanta, GA · 10y exp

Rula · Georgia State University

Digital Transformation, Generative AI, DevOps, SDLC, Automation, AWS, +123 more

Yuxin Xiong

Screened

Intern Machine Learning Engineer specializing in LLM reasoning, agents, and deployment

0y exp

Nexa AI · UC San Diego

“AWS AI Lab engineer who deployed a production Chain-of-Thought analytical agent for tabular reasoning, emphasizing grounded tool-constrained workflows with schema-validated intermediate outputs. Built robust evaluation/logging with step-level observability to catch regressions across model versions, and has experience scaling distributed LLM training via Slurm + DeepSpeed/FSDP with checkpointing and failure recovery.”

Large Language Models (LLMs), Model deployment, PyTorch, Reinforcement learning, Feature engineering, XGBoost, +91 more

Dexin Huang

Screened

Junior AI Engineer specializing in LLM systems, RAG, and full-stack automation

Guilford, CT · 1y exp

Slothful LLC (Iris) · Columbia University

“Built and deployed an AI receptionist product for field-service businesses (HVAC/electrician), including real-time Jobber scheduling integrations and Twilio-based calling. Combines hands-on customer/operator shadowing with strong production engineering (queueing to handle API limits, rigorous testing/mocking, mirrored prod environment) and cross-layer troubleshooting, driving user adoption through review/override workflows.”

A/B Testing, Analytics, API Design, Authentication, AWS, AWS Lambda, +99 more

Gabrielle Burns

Screened

Senior AI/ML Engineer specializing in computer vision, NLP, and enterprise ML systems

Chicago, IL · 11y exp

Motorola Solutions · Princeton University

“ML/AI engineer with hands-on ownership of production computer vision and GenAI systems, spanning real-time public safety video analytics and RAG-based knowledge assistants. Stands out for translating research-oriented approaches into scalable, monitored production systems with clear business impact, including 50% latency reductions, 25% faster response times, and 40% lower document search time.”

Python, R, SQL, Scikit-learn, TensorFlow, Keras, +165 more

Jiawei Li

Screened

Intern Applied Scientist specializing in LLM agents for software engineering

0y exp

Amazon · UC Irvine

“Applied Scientist intern at Amazon who built a production-adopted LLM-judge to evaluate an agentic chatbot’s intermediate reasoning and tool calls using a knowledge-graph grounding approach. Also published award-winning work (ACM SIGSOFT Distinguished Paper) using LangChain + GPT-4 tools to generate factually grounded commit messages, with rigorous human-centered evaluation metrics.”

Python, Java, R, PyTorch, Scikit-Learn, XGBoost, +69 more

Krishna Sahith Poruri

Screened

Mid-level AI/ML Engineer specializing in LLMs, RAG, and MLOps

CA, USA · 4y exp

Anthropic · California State University, Long Beach

“ML/LLM engineer who built a production RAG system (GPT-4 + FAISS + FastAPI) to deliver fast, grounded answers from proprietary documents, optimizing for sub-200ms latency and high-concurrency scale. Strong MLOps/observability background: drift monitoring with Prometheus + Streamlit, automated retraining via Airflow, Kubernetes autoscaling, and MLflow-managed model lifecycle, plus inference cost reduction through quantization and structured pruning.”

Python, SQL, R, C++, Git, Classification, +101 more
