
Vetted Observability Professionals

Pre-screened and vetted.

Observability, Python, Docker, CI/CD, AWS, Kubernetes

Sourabh Jain

Screened

Director of Software Engineering specializing in enterprise Data, ML & AI platforms

Bay Area, CA | 23y exp

RSA Security | Shri G. S. Institute of Technology and Science

“Former Walmart Director of Software Engineering who left in March 2025 to build products for clients. Recently delivered an LLM/RAG-based UNSPSC classification solution for an MRO client using a multi-stage retrieval + web search + prompt-engineering workflow, and has led large-scale retail forecasting initiatives and high-severity cloud-migration incidents end-to-end.”

Apache Kafka, Apache Spark, Authentication, Authorization, CI/CD, Coaching (+85 more)


Jagadeesh Kakamolu

Screened

Mid-level Software Engineer specializing in backend, cloud, and AI systems

Seattle, WA | 4y exp

Amazon | Saint Louis University

“Engineer with hands-on experience across backend, full-stack, cloud, and AI/ML systems, with particular depth in Python, FastAPI, AWS Bedrock, SageMaker, and RAG-based architectures. Stands out for treating AI and agents as accelerators within disciplined production engineering, emphasizing guardrails, observability, latency/cost monitoring, and scalable system design.”

Python, C, JavaScript, TypeScript, React, HTML (+101 more)


Suraj Botcha

Screened

Intern AI/ML Engineer specializing in LLM systems and industrial AI

Remote | 1y exp

ControlRooms.AI | Carnegie Mellon University

“Full-stack AI engineer who has built both document-intelligence products and agentic investigation systems end to end. At ControlRooms.AI, they helped ship a production-facing root cause investigation workflow for industrial operations using Neo4j, FastMCP, RAG, OCR/VLM inputs, and multiple LLMs, contributing to roughly a 10x reduction in manual investigation time. They stand out for designing explainable, traceable AI systems that surface evidence, uncertainty, and missing context rather than forcing overconfident answers.”

Python, SQL, PyTorch, TensorFlow, Scikit-learn, Hugging Face (+103 more)


Jay Yepuri

Screened

Mid-level Software Engineer specializing in AWS backend and cloud infrastructure

Seattle, WA | 4y exp

Amazon | Indiana University Bloomington

“Full-stack engineer with AWS Skill Builder experience building internal content-management and search workflows in TypeScript across React and Node.js. They drove a shift from keyword to semantic search using OpenSearch and Bedrock Titan Embeddings, delivering a 5x reduction in discovery time while also improving production reliability and observability for large-scale content workflows.”

Java, Python, JavaScript, TypeScript, SQL, C++ (+87 more)


Sara Fang

Screened

Mid-level Software Engineer specializing in cloud data platforms and distributed systems

Remote | 6y exp

Terra Byte X | University of Delaware

“Backend/data engineer with production experience building FastAPI services with strong reliability patterns (circuit breaker, rate limiting, caching, graceful degradation) and JWT/OAuth2 auth. Has delivered AWS EKS deployments via Terraform with Secrets Manager/IRSA and HPA autoscaling, and built Glue/Spark ETL pipelines on S3 Parquet with schema-evolution and idempotent reruns; also demonstrated measurable SQL tuning impact (20–30s to <10s).”

Java, Python, Scala, Go, SQL, JavaScript (+101 more)


Bhagya Sunkara

Screened

Mid-level Full-Stack Developer specializing in cloud-native backend services and real-time data platforms

Remote, USA | 4y exp

Netflix | University of Dayton

“Backend/data engineering candidate with Netflix experience designing and migrating analytics platforms from batch to real-time streaming (Kafka/Flink) across AWS and GCP. Delivered measurable improvements (40% lower data delay, 99.9% accuracy) using phased rollouts, automated data validation (Great Expectations), and strong observability (Prometheus/Grafana), and proactively hardened pipelines with idempotency to prevent duplicate Kafka processing.”

Agile, Apache Kafka, Apache Spark, Authentication, AWS, AWS Lambda (+150 more)


Bhargav Koduru

Screened

Mid-level Full-Stack Software Engineer specializing in cloud microservices and AI integration

Jersey City, NJ | 3y exp

Uber | Pace University

“Backend/distributed-systems engineer with Uber experience building real-time telemetry and safety signal pipelines. Strong in Kafka-based event-driven architectures, low-latency processing under peak load, and production reliability via monitoring, retries, and fallback logic; has Docker/Kubernetes and CI/CD deployment experience.”

Java, JavaScript, TypeScript, Python, SQL, Spring Boot (+121 more)


Matthew Clarke

Screened

Intern Firmware Validation & Systems Test Engineer specializing in embedded and full-stack tooling

Palo Alto, CA | 1y exp

Tesla | Oregon State University

“Safety-critical firmware validation engineer with Tesla autonomous vehicle experience who built Python-based HIL/SIL automation and dashboards, cutting regression time by 30% while maintaining an auditable risk-tradeoff process with safety and engineering teams. Also deployed an inventory management system across 8+ R&D teams in 3 countries at FUJIFILM, troubleshooting a major cross-site sync issue to a timezone root cause with strong documentation and interim mitigations.”

Test Automation, Regression Testing, System Design, Data Analysis, Full-Stack Development, React (+87 more)


Tianyi Wang

Screened

Entry-Level Backend/Cloud Engineer specializing in distributed systems and AI platforms

Seattle, WA | 1y exp

Amazon | University of Michigan

“Full-stack engineer with deep serverless AWS experience who built VidToNote, an AI video analysis platform, end-to-end using Next.js App Router/TypeScript and an event-driven pipeline (API Gateway, Lambda, DynamoDB, S3, Step Functions, SQS). Strong on production reliability and observability (CloudWatch, X-Ray, structured logging), plus data/analytics work in Postgres with measurable query optimizations and durable LLM evaluation workflows. Amazon background; integrated 22 AWS services and completed AWS Solutions Architect Professional certification within a month.”

API Gateway, AWS, AWS CloudFormation, AWS Lambda, AWS Step Functions, Bash (+87 more)


Tzu-Chieh Huang

Screened

Mid-level Software Engineer specializing in backend systems, IoT, and AI security

Pittsburgh, PA | 3y exp

Naptic | Carnegie Mellon University

“Full-stack engineer in the investment tracking/financial reporting space who built an automated reporting dashboard and compliance/reporting pipeline end-to-end using Next.js (App Router, server/client components), REST, and Postgres. Demonstrated measurable performance wins (~30% faster loads) through caching and query optimization, and built durable orchestrated workflows in n8n with retries, idempotency, and reconciliation checks.”

Python, Java, C++, C#, JavaScript, SQL (+74 more)


Prudhvi Yalamanchili

Screened

Mid-level Software Development Engineer specializing in AWS telemetry and DDoS mitigation

Seattle, WA | 3y exp

Amazon Web Services | Texas A&M University-Commerce

“Amazon engineer who built an Amazon Bedrock-powered summarization layer over large-scale network/service telemetry (“top talker” insights) to help security engineers triage anomalies faster. Emphasizes production-grade design patterns for LLM features—non-blocking enrichment, deterministic fallbacks, strict structured outputs, and monitoring to preserve trust in source-of-truth telemetry.”

AWS, Alerting, Automation, CI/CD, Data modeling, Distributed systems (+70 more)


Andrew Liang

Screened

Intern Software Engineer specializing in full-stack and AI/ML systems

2y exp

Amazon | UCLA

“Software engineer with experience at Amazon and Agora building end-to-end systems: a knowledge-base AI chatbot (React/TypeScript UI + retrieval/response backend + Docker deployment) and an internal approval governance platform using AWS Step Functions and DynamoDB. Emphasizes fast iteration without sacrificing trust via feature-flag rollouts, citation-required answers, abstention on low-confidence retrieval, regression query sets, and strong observability (request IDs, structured logs, latency/error monitoring).”

A/B Testing, Algorithms, Audit Logging, AWS, AWS Step Functions, Bash (+93 more)


Elizabeth Xu

Screened

Entry-Level Software Engineer specializing in ML/NLP and security

Evanston, IL | 1y exp

Rakuten | Northwestern University

“Early-career engineer (internship background) who built a production-style notes product using Next.js App Router with Server Components/Server Actions and a Postgres-backed analytics model. Demonstrates strong performance and reliability instincts—measured DB latency improvements via indexing and cursor pagination, plus durable orchestration with Temporal using idempotency and deterministic workflows.”

C, C#, C++, Python, SQL, Java (+77 more)


Jacqueline Zhang

Screened

Mid-level Machine Learning Engineer specializing in LLMs, fairness, and healthcare ML

Illinois, USA | 4y exp

iSchool Statistical ML & AI Lab | University of Illinois Urbana-Champaign

“ML/NLP practitioner with a master’s thesis focused on domain-adaptive knowledge distillation for LLMs (LLaMA2/sheared LLaMA), showing improved perplexity and ROUGE-L on biomedical data. Also built real-world data linking and search systems: integrated ClinicalTrials.gov with FAERS using fuzzy matching + embeddings, and delivered an LLM-powered FAQ recommender at Hyperledger using sentence-transformers, FAISS, and fine-tuning to mitigate embedding drift.”

A/B Testing, API Development, CI/CD, Computer Vision, C, Data Engineering (+93 more)


Yeshwanth Sai Pala

Screened

Mid-level Full-Stack Developer specializing in cloud microservices and AI-driven FinTech

Remote, USA | 4y exp

Stripe | Southern Arkansas University

“Stripe engineer who shipped an end-to-end merchant fraud insights dashboard, spanning Spring Boot/Kafka risk-scoring services and a React+TypeScript UI. Focused on low-latency, high-volume transaction processing and production operations on AWS (EKS/CloudWatch), including handling a real traffic-spike latency incident via query optimization, indexing, and rate limiting.”

AI Agents, Agentic AI, Amazon DynamoDB, Amazon EC2, Amazon EKS, Amazon Kinesis (+143 more)


Likhitha Bethi

Screened

Mid-level Software Engineer specializing in backend systems, distributed systems, and applied AI

Stony Brook, NY | 4y exp

Stony Brook University | Stony Brook University

“Goldman Sachs engineer who owned end-to-end features for an internal onboarding and case management platform, spanning React/TypeScript UI, a GraphQL gateway, and Node + Spring WebFlux microservices. Built and operated a Kafka-based ingestion and search pipeline with DLQs, retries, idempotency, and strong observability, and improved developer experience via backward-compatible GraphQL API design and schema-driven documentation.”

Agile, Authorization, BERT, C++, Computer Vision, Data Science (+125 more)


Xicheng Liang

Screened

Intern AI/Full-Stack Engineer specializing in backend systems and applied machine learning

Chicago, IL | 1y exp

Becker’s Healthcare | University of Pennsylvania

“Built and shipped a production agentic RAG system for healthcare analysts that automated compliance/operations knowledge retrieval across PDFs, reports, and databases. Emphasizes production reliability (monitoring, retries, fallbacks, async queues), strong evaluation/iteration loops, and measurable impact (3–10s responses and ~98% top-k retrieval accuracy).”

Java, JavaScript, Python, C++, C, SQL (+145 more)


Steven Schoen

Screened

Staff Android Engineer specializing in mobile platform and design systems

Berkeley, CA | 12y exp

Reddit | University of Central Florida

“Built and shipped a production internal framework-adoption agent for design system leadership, using Temporal, Google ADK, and a Slack bot interface. They appear to be an early internal builder of agentic systems at their company, with practical experience in prompt/process design, lightweight orchestration, and reliability tradeoffs for real-world LLM workflows.”

Android, Kotlin, Java, GraphQL, Design Systems, Accessibility (+54 more)


Samhith Kakarla

Screened

Intern Software Engineer specializing in developer productivity and data/AI systems

Los Angeles, California | 1y exp

Intuit | UC Berkeley

“Internship experience at Intuit building an LLM-grounded QA system for internal microservice data across 100+ microservices, using a graph database approach (evaluated Neo4j and selected AWS Neptune for production alignment). Also has UC Berkeley research experience (including work with Prof. Dawn Song / Berkeley Eye Research Lab) and cross-functional collaboration with bioinformatics/biology teams to deploy software systems on research servers.”

Agile, Algorithms, AWS, CI/CD, C, C++ (+86 more)


Yue Yang

Screened

Intern Data Scientist specializing in GenAI (LLMs, RAG) and ML model optimization

Sunnyvale, CA | 1y exp

Synopsys | Columbia University

“Built and deployed a production LLM-powered risk assistant for KPMG and Freddie Mac that lets analysts query a confidential Neo4j risk graph in natural language (no Cypher), turning multi-day analysis into minutes with traceable, cited answers. Implemented rigorous guardrails, deterministic verification, RBAC/security controls, and a full eval/observability stack, cutting query error rate by ~50% and iterating through weekly UAT with non-technical risk analysts.”

Generative AI, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), Machine Learning, Deep Learning, Data Science (+113 more)


Jagadeeshwar Reddy Thiyyagura

Screened

Senior Software Engineer specializing in distributed systems and AI workflow orchestration

Austin, TX | 5y exp

Apple | University of Central Missouri

“Backend owner at Apple for an AI workflow orchestration service, with hands-on experience stabilizing peak-traffic production systems using OpenTelemetry-style tracing, bounded async concurrency, and database performance tuning. Built and shipped a Python LLM-agent orchestration layer to automate multi-step operational workflows, emphasizing guardrails, auditability, and deterministic fallbacks to keep non-deterministic AI behavior production-safe.”

Python, Go, Java, TypeScript, SQL, AWS (+73 more)


Sairam Banavathu

Screened

Mid-level Backend & Reliability Engineer specializing in AWS, Kubernetes, and automation

New Mexico, US | 5y exp

Meta | University of North Carolina at Charlotte

“Meta engineer focused on reliability/operations tooling who built a unified real-time health dashboard and scalable telemetry pipelines (AWS + Datadog) for thousands of devices. Also shipped an internal LLM-powered knowledge assistant using RAG over wikis/runbooks/logs with strong guardrails and a rigorous eval loop that drove measurable accuracy improvements via automated doc ingestion and embedding updates.”

Amazon EC2, AWS Lambda, Amazon S3, Amazon EKS, Kubernetes, Docker (+87 more)


Pankaj Gautam

Screened

Senior Cloud Infrastructure & TechOps Leader specializing in AWS, Kubernetes, and SRE

San Francisco, CA | 27y exp

Amazon | Cal State East Bay

“Infrastructure/platform engineer with hands-on experience running production and non-production Amazon EKS clusters, including upgrade processes and reliability monitoring via Prometheus/Grafana. Also administered on-prem VMware vSphere/vCloud Director and handled a significant vSwitch/VLAN outage, and uses Terraform + Terragrunt with S3 remote state and release-based drift detection across dev/stage/prod.”

DevOps, Kubernetes, Terraform, Ansible, AWS, Cost Optimization (+95 more)


