Vetted Observability Professionals

Pre-screened and vetted.

Ashi Sinha

Screened

Junior Software Engineer specializing in full-stack and ML/NLP systems

New York City, NY · 2y exp
IBM · University of Massachusetts Amherst

Entry-level full-stack engineer with internship experience at Amazon (Appstore IAP flow + uninstall recommendation workflow) and a health-tech startup (OneVector) where they built a DSUR reporting workflow end-to-end, including document generation, S3-backed versioning/metadata, and secure preview/download. Demonstrates strong production debugging and reliability mindset (instrumentation, deterministic retrieval, idempotent writes) and focuses on UX/performance in high-stakes user flows.

Osvaldo Calles - Senior Software Engineer specializing in developer tools, cloud automation, and generative AI in Redmond, WA

Redmond, WA · 13y exp
Amdocs · Universidad Autónoma de Guadalajara

Built and deployed a production chatbot on osvaldocalles.com and iterated through real-world LLM engineering issues: model quota/cost tradeoffs (migrating to Nova Pro), RAG accuracy via semantic chunking, AWS IAM/guardrail/security pitfalls, and Lambda/API Gateway streaming constraints (prefers JS for streaming layer). Experienced with agent orchestration using Strands SDK (AWS-focused) and LangGraph (Vercel/container deployments), plus evaluation pipelines using LLM-as-evaluator, dashboards, and staged model rollouts.

Nagsen Dahat

Screened

Executive engineering leader specializing in FinTech platforms and cloud-native systems

New York, NY · 18y exp
Solve · Brown University

Motivated by a desire to work independently after spending significant time working for others. Demonstrated notable discipline and follow-through by completing a Master's program with strong grades while managing a full-time job, a family with two children, and a social life.

Jared Hoffen

Screened

Senior AI Engineer specializing in LLM agents, RAG, and ML infrastructure

Las Vegas, USA · 12y exp
AI Research Lab · California State University, Northridge

Production-focused AI/ML engineer who has owned LLM agent and RAG systems end-to-end, from experimentation through deployment, monitoring, and iterative optimization. Stands out for building evaluation and observability layers around GenAI systems and delivering measurable gains in task success, regression detection speed, and token efficiency in production.

SC

Senior Cloud Infrastructure Architect specializing in multi-cloud, DevOps, and AI/ML platforms

San Francisco, California · 25y exp
Amazon · American River College

Engineering leader (Director of Development) with hands-on cloud and product experience who builds business-aligned technology roadmaps and scales teams. Delivered an enterprise cloud-migration enabler at UHG by implementing AD authentication and Terraform-based IaC for custom VM images while meeting 90-day InfoSec patch/rotation requirements, and drove a 20% lift in user consumption/retention by designing an interactive branded media portal experience for Sunkist.

Frank Goodman

Screened

Executive Engineering & Product Leader specializing in Cloud/SaaS observability and security

San Jose, CA · 31y exp
Gigamon · UC Berkeley

Product/technology leader with deep security and cloud infrastructure expertise who drove a major shift from hardware-based networking/security appliances to cloud-native capabilities, growing cloud revenue from $0 to $400M in 4.5 years. Led an innovative eBPF-based approach (“precryption”) to enable lightweight cloud TLS interception/decryption, and has hands-on coding interest (recent Rust work on a personal cybersecurity identity/trust platform).

Kaushik Sriram - Mid-level Software Engineer specializing in event-driven FinTech backend systems in San Francisco, CA

San Francisco, CA · 5y exp
Stripe · University of Central Missouri

Senior/Staff-level backend/platform engineer who owned Stripe’s global payout settlement system end-to-end, building an event-driven Python/Kafka platform processing millions of events daily across 30+ countries. Deep experience operating high-reliability distributed systems in production (incidents, replays/backfills, schema evolution, observability) and scaling on AWS/EKS with strong testing and deployment practices.

Kenneth Young

Screened

Senior Site Reliability Engineer specializing in production LLM/RAG deployments

Fremont, CA · 21y exp
FM Industries · Udacity

Built and operationalized an internal LLM/RAG system for engineering specs—starting with an at-home prototype using real ERP documents, then securing hardware, standing up a GPU/software stack, and deploying through UAT to production. Identified organizational gaps (no shared spec repository) and created a queryable RAG database that reportedly cut document discovery from days/weeks to minutes, while also resolving retrieval issues via improved PDF-aware chunking.

UB

Principal Data Scientist specializing in machine learning and generative AI

New York, NY · 12y exp
Atlassian · Rutgers University

Atlassian ML/AI engineer who has shipped end-to-end production systems combining classical ML, streaming infrastructure, and LLM-based personalization to improve onboarding and free-to-paid conversion. Particularly strong in turning research-style RAG and reranking ideas into low-latency, reliable product systems with robust evaluation, safety guardrails, and reusable platform services for other teams.

Jonathan Wang

Screened

Senior Software Engineer specializing in platform, authentication, and developer infrastructure

9y exp
Indeed · UC Davis

Software engineer who has deeply integrated AI into day-to-day development, using Claude Code, ChatGPT, and coding agents to speed up boilerplate generation, system design, and tradeoff analysis. Stands out for a pragmatic multi-model workflow focused on faster delivery and quicker architectural feedback.

Timothy Lee

Screened

Senior Full-Stack Engineer specializing in AI platforms and scalable web systems

Live Oak, FL · 11y exp
Parker AI · University of Florida

Built and shipped production agentic/LLM systems that could safely perform real customer and subscription operations, not just answer questions. Demonstrates unusually strong depth in agent orchestration, tool safety, evals, tracing, and backend workflow design across Node.js/TypeScript, Go, Redis, Postgres, Kafka, and GPT-4.

CS

Mid-level Machine Learning Engineer specializing in fraud detection and real-time personalization

San Francisco, CA · 6y exp
Stripe · University of Tampa

ML/LLM engineer with Stripe and Adobe experience who productionized a transformer-based Payments Foundation Model for real-time fraud detection at global scale (billions of transactions). Built petabyte-scale ETL/feature pipelines (Spark/EMR, Airflow, dbt, Kafka/Flink) and achieved <100ms multi-region inference (EKS, TorchServe, edge/Lambda, GPU/CPU routing) with strong PCI-DSS/GDPR compliance and explainability (SHAP/LIME), reporting a 64% fraud accuracy improvement.

NM

Staff Software Engineer specializing in headless commerce and developer platforms

New York, NY · 10y exp
Shopify · University of Florida

End-to-end product engineer who built and shipped Shopify Magic, an LLM-powered product-description generator on Amazon Bedrock with RAG over a tenant-isolated vector database, achieving 50% faster content creation, sub-2s latency, and 70%+ merchant adoption. Also led a Flexport migration from a monolithic Rails app to microservices using feature flags and parallel runs, delivering zero downtime and a 60% improvement in development speed.

Keerthana Senthilnathan - Junior Machine Learning Engineer specializing in LLM systems and inference reliability in California, USA

California, USA · 1y exp
llm-d · UC San Diego

ML/LLM infrastructure-focused engineer who built a production stateful LLM inference service that cuts latency and GPU compute for repeated/overlapping prompts via caching with correctness guardrails. Strong in Kubernetes-based deployment and reliability engineering, using A/B testing and similarity-based evaluation to quantify performance gains without sacrificing output quality.

Bennett Smith

Screened

Senior Full-Stack Engineer specializing in cloud-native microservices and React

Los Angeles, CA · 14y exp
Universal Studios · NYU

Backend/data engineer with strong AWS production experience spanning high-traffic FastAPI APIs (Postgres/Redis/Kafka) and serverless+container deployments (Lambda/ECS) managed via Terraform and CI/CD. Has built Glue-based data lake ETL (S3 Parquet, Athena/Redshift) with schema drift/data quality controls, modernized legacy batch systems via parallel-run parity validation, and demonstrated measurable SQL performance wins (60–90s down to 3–5s).

Rohini Rajagopalan - Director-level Engineering Leader specializing in SaaS, Cloud Migration, and Cybersecurity in Santa Clara, CA

Santa Clara, CA · 8y exp
Cisco · Texas Tech University

Senior engineering leader with experience at Cisco, Amazon, and startup Shopkick, operating at high scale (e.g., Secure Web Gateway handling ~40M QPS). Known for measurable impact across reliability and cost (85% efficacy improvement; Datadog spend cut from ~$500k/month to ~$15k/month) and for leading complex platform modernization (1-year monolith-to-microservices/event-driven migration with zero customer impact) plus compatibility-focused API design that cut device onboarding from a month to a day.

Durgaprasad G

Screened

Mid-level AI/ML Engineer specializing in LLM infrastructure, RAG, and agentic systems

New York City, NY · 3y exp
Stripe · NJIT

Stripe engineer who owned and unified multiple team RAG systems into a shared production platform used by 200+ internal operators, deployed on EKS with Kafka ingestion and hybrid retrieval. Drove measurable business outcomes including <400ms latency, ~35% inference cost reduction, ~25% accuracy lift via fine-tuning, and real-time auto-approval of 80%+ merchant compliance applications through strong observability and reliability patterns.

CH

Senior Unity/Full-Stack Engineer specializing in distributed systems, VR, and AI/LLM integration

Springdale, AR · 13y exp
Tecton · USC

Unity/C# gameplay engineer who has shipped a modular, data-driven combat ability system with strong measurable outcomes (≈80% fewer GC allocations, 15–20% better frame times, 10–12% higher early retention). Also integrated an LLM-driven NPC dialogue/quest hint system with a C#/.NET backend, caching/guardrails, and telemetry-driven iteration, and shipped Photon PUN real-time 4-player co-op plus a shared codebase across Meta Quest VR and iOS/Android.

AC

Senior Data Scientist specializing in machine learning, NLP, and MLOps

Dallas, TX · 8y exp
AstroSirens · University of Houston

ML/NLP engineer with experience building production-grade legal-tech and data platforms, including a GPT-4/LangChain contract review system using Elasticsearch embeddings (RAG) deployed on AWS EKS. Strong in entity resolution and scalable batch/streaming pipelines (Kafka/Spark), with measurable impact (70%+ reduction in contract review time) and a focus on monitoring and CI/CD for reliable delivery.

SM

Mid-level Machine Learning Engineer specializing in NLP, federated learning, and fraud detection

CA, USA · 6y exp
Apple · USC

ML/robotics engineer with Apple experience who built a computer-vision-driven industrial defect detection system integrating a robotic arm with ROS-based real-time inference on an edge GPU. Drove major performance gains (cut inference time ~60% via quantization + TensorRT) and improved robustness to lighting/material variation, with strong emphasis on production reliability (health checks, watchdogs, observability, CI/CD) and interest in shaping early-stage startup engineering culture.

Aesha Choksi

Screened

Director-level Engineering Leader specializing in Cloud Security and Data Platforms

San Francisco, CA · 20y exp
Sysdig · California State University, East Bay

Engineering leader in cloud security at SysTech with player-coach experience spanning cross-team data/ownership standardization and reporting platform user-journey improvements. Stays technically deep through observability (SLA/SLOs, dashboards, alerting), rigorous code reviews (including AI-assisted coding), and end-to-end incident ownership in IAM/agentless cloud event collection. Targeting $270K–$300K base plus bonus/equity.

Sandeep Rohilla - Principal Backend/Platform Engineer specializing in GenAI agent orchestration and LLM pipelines in San Francisco, CA

San Francisco, CA · 19y exp
MyResumeStar.com · USC

LLM-focused engineer/sales-engineering profile with hands-on experience productionizing complex systems: scalable distributed architecture, multi-tenant monitoring, canary/shadow rollouts, and robust fallback strategies. Demonstrated real-time troubleshooting depth (p99 latency spikes traced to DB connection limits causing retry storms) and strong developer-facing communication via RAG workshops and live, customer-specific demos that helped close deals quickly.

AR

Mid-level Software Engineer specializing in robotics, AI, and full-stack systems

Remote, USA · 5y exp
Mira Mace · Georgia Tech

Matthew Joseph - Staff Software Engineer specializing in SaaS and E-commerce platforms in Remote

Remote · 10y exp
Calendly · University of Texas at Austin