Vetted Observability Professionals

Pre-screened and vetted.

CY

Staff Software Engineer specializing in distributed systems and platform architecture

Aldie, VA | 15y exp
Provi | University of Maryland, College Park

Built a production LLM-powered data ingestion workflow at Provi, an online alcohol marketplace, to clean and match millions of distributor inventory items against a product catalog. Their experience is strongest in applying LLMs to real-world, large-scale data operations with AWS Glue, S3, batching, API integration, human review, and drift detection.

TB

Thomas Baker

Screened

Senior Full-Stack Engineer specializing in serverless AWS and event-driven systems

Dallas, TX | 12y exp
Amazon | University of Texas at Austin

Backend/data engineer with experience at AWS and Intuit building and operating production serverless systems and data pipelines. Delivered an internal AWS TV video-processing platform using Step Functions/Lambda/S3/DynamoDB with strong reliability and cost controls, and built Glue-based ETL for compliance/risk events (Kafka to partitioned Parquet). Also modernized legacy compliance systems into Java/Node event-driven services and has demonstrated measurable SQL tuning impact (200s to 20s).

HC

Hernan Chalco

Screened

Senior Software Engineer specializing in eCommerce payments and integrations

San Jose, CA | 7y exp
Adyen | UC Berkeley

Solutions/implementation-focused engineer with payments expertise (Adyen headless Magento integrations, 3DS components) who also builds and troubleshoots agentic LLM workflows using the OpenAI Agents SDK. Experienced in pre-sales technical validation and in tailoring live demos/workshops—e.g., pivoted a Quantum Metric workshop from custom JavaScript instrumentation to no-code analytics based on audience needs.

CH

Chengzhu He

Screened

Staff/Principal Cloud Infrastructure Engineer specializing in Kubernetes and OpenStack

14y exp
TikTok | Shanghai University

Platform/backend engineer focused on Kubernetes at scale: built a Java control-plane service for multi-region cluster provisioning/monitoring/upgrades using Kafka-driven async workers, and solved peak-load provisioning failures by eliminating blocking I/O and dynamically scaling consumers. Also shipped an LLM-assisted Kubernetes troubleshooting/remediation feature that pulls Prometheus logs/metrics into prompts and uses guardrails (confidence thresholds + human-in-the-loop) to prevent risky actions.

IH

Ian Holsman

Screened

Executive Engineering Leader (VP/CTO) specializing in Blockchain, DeFi, and FinTech platforms

Remote, USA | 19y exp
Hedera | Melbourne Business School

CTO-focused candidate with experience at foundations evaluating startups, including reviewing technical architectures and coaching teams to refine ideas for better platform fit and synergies. Prioritizes company culture and integrity when choosing leadership roles.

PA

Senior engineering leader specializing in AI-first full-stack SaaS platforms

11y exp
DocuSign | Carnegie Mellon University

Engineering leader with hands-on product and platform experience spanning AI-powered web forms, multi-region Azure/OpenAI architecture, and full-stack team scaling. They also remain close to production systems, citing a concrete debugging example involving iOS WebView local storage behavior causing token loss in embedded mobile web forms.

SN

Swami Nigam

Screened

Director-level software engineering leader specializing in AI/ML, analytics, and enterprise platforms

Pleasanton, CA | 24y exp
Oracle | Rensselaer Polytechnic Institute

Senior software engineering leader with 20+ years of management exposure who has alternated between IC and director-level roles, leading teams of up to 25 across AI platform, analytics, Salesforce, and systems software projects. Particularly compelling for roles needing both technical depth and organizational leadership: they have architected systems themselves, built teams in new geographies, and coordinated platform, AI/data, and consumer engineering groups to deliver successful turnkey AI solutions.

RN

Ronald Nap

Screened

Intern Machine Learning & AI Engineer specializing in computer vision and ML systems

San Jose, CA | 2y exp
AMD | UC Berkeley

Robotics/ML engineer with internship experience at Valeo building a deep-learning prototype to replace parts of a legacy SLAM backend for autonomous parking, focused on making models run reliably in real time on embedded hardware (quantization/distillation + TensorRT). Also brings strong MLOps/deployment experience (Docker, Kubernetes on AWS EKS, CI via GitHub Actions) and has supported patent filing by explaining the technical approach to legal stakeholders.

AS

Director-level Customer Success & GTM leader specializing in Cloud, AI, and Enterprise SaaS

Sunnyvale, CA | 30y exp
Google | Keller Graduate School of Management

Commercial/GTM leader with GCP experience managing multi-year, multi-megawatt AI/GPU infrastructure commitments, owning segment P&L and governance for take-or-pay/reserved capacity. Drove a major client partnership scaling ARR from $50M to $100M in 18 months by aligning Product/Engineering, GTM, and infra teams and building flexible, margin-protective commercial structures. Known for speeding hyperscaler procurement/security reviews (FedRAMP/SOC2, IAM, data residency) and operationalizing multi-region delivery with landing zones and IaC.

SK

Mid-Level Software Engineer specializing in data pipelines, observability, and analytics

San Francisco, CA | 2y exp
Meta | Arizona State University

Meta engineer who improved a critical revenue estimation dataset pipeline that was arriving ~6 days late—diagnosed via raw logs/lineage, redesigned legacy scans to only process the needed window, and shipped validation plus freshness/lag dashboards. Delivered ~50% latency reduction (to ~3 days) and regained adoption by running old/new pipelines in parallel with gated cutover and evidence-based customer communication. Applies incident-response rigor to real-time LLM/agentic workflow debugging and regularly runs developer demos/workshops.

JH

Jiahua Huang

Screened

Intern Full-Stack Software Engineer specializing in web apps and cloud-native systems

1y exp
Amazon | University of Illinois Urbana-Champaign

Backend engineer who scaled a food delivery platform by migrating from a single-service architecture to Spring Cloud microservices with an API gateway and Kafka-based event-driven order pipeline. Reported outcomes include ~50% latency reduction, stable ~2K RPS throughput, and 99.8% uptime, with strong emphasis on safe migrations (dual writes, canaries, schema versioning) and security (JWT/RBAC/Postgres RLS).


Suparna Roy

Screened

Executive Cloud Infrastructure & SRE Leader specializing in AI-driven reliability and security

Austin, TX | 14y exp
IBM | USC

Engineering/technology leader with IBM Cloud experience leading large-scale infrastructure modernization from classic architecture to a standardized VPC/next-generation DC platform. Reports major outcomes including cutting region launch time from ~18 months to ~3 months and reducing operating costs by ~80% via automation, modular undercloud services, and platform standardization, while scaling a globally distributed org with clear service ownership and accountability.


Sara Rubacha

Screened

Engineering Manager specializing in databases and distributed systems

Weston, FL | 21y exp
UKG | University of Buenos Aires

Aspiring founder exploring an AI automation startup focused on automating the processes involved in building companies. Has not yet developed specific use cases or raised capital, but describes a clear plan to validate ideas through use-case research, building a pilot, and testing with early customers. Is not yet familiar with the VC/accelerator landscape.


Kieron Ong

Screened

Junior Software Engineer specializing in AI platforms and full-stack systems

New York, NY | 2y exp
Headway | UC Berkeley

Frontend/product engineer with strong experience building sophisticated AI-assisted browser UIs for customer support operations in healthcare/therapy contexts. Particularly compelling for teams needing someone who can combine modern web architecture, observability, typed systems, and human-in-the-loop AI UX to improve both reliability and agent efficiency.

PP

Engineering Manager / Senior Backend Platform Engineer specializing in microservices and CI/CD

Houston, TX | 14y exp
Fitbit | Cornell University

Fitbit engineer who has taken multiple projects from concept to release, including architecting a new warranty-evaluation system that achieved 100% accuracy and saved the company $6M. Interested in exploring startup ideas and emphasizes mission alignment and building strong cross-functional teams.

AS

Mid-level DevOps Engineer specializing in cloud-native infrastructure on AWS and Azure

CA, USA | 5y exp
Stripe | Stevens Institute of Technology

DevOps/SRE focused on cloud-based distributed systems, with strong hands-on Kubernetes production experience (microservices deployments, Helm, probes, resource tuning, CI/CD and Docker build standardization). Demonstrated end-to-end troubleshooting across application, infrastructure, and networking layers—e.g., isolating degraded storage via node disk I/O metrics and restoring performance by draining the node and replacing the volume. Builds Python automation for operational reliability, including scheduled Kubernetes secrets rotation integrated with an external secret manager.

Sergey Pustovit

Director-level Data Platform & Analytics Engineering Leader specializing in distributed systems

Irvine, CA | 31y exp
SentinelOne | National University "Odessa Maritime Academy"

Entrepreneurially minded builder focused on proving architecture concepts via minimal demo prototypes for marketing. Has hands-on experience improving an A/B experimentation framework by interviewing stakeholders, identifying system limits and bottlenecks, and defining success criteria to scale experimentation and speed up analysis.

MR

Mid-level Full-Stack Developer specializing in cloud-native web applications

5y exp
Amazon | University of Central Missouri

Frontend-leaning full-stack engineer who built an internal real-time operations dashboard from 0→1 using React, TypeScript, Redux Toolkit, Material UI, and Node.js integrations. Stands out for hands-on performance tuning at scale—profiling and fixing excessive re-renders, optimizing live-update UIs, and iterating post-launch with caching, pagination, and observability.

JO

Director-level Engineering Leader specializing in AI platforms and FinTech systems

San Francisco, CA | 27y exp
EarthXCG | Cal Poly San Luis Obispo

Fintech and AI product engineer who has owned major production rollouts, including Lending Club's banking-arm launch, and has since built LLM-powered decision systems for finance and climate use cases. Particularly strong in combining stakeholder management with pragmatic architecture choices like observability, deterministic pipeline design, RAG, and document-to-structured-data workflows.

DB

Junior Full-Stack Software Engineer specializing in scalable web platforms and AI integration

New York, NY | 2y exp
Amazon | Georgia Tech

Frontend engineer from Amazon Advertising who owned a sophisticated React/TypeScript ad creative builder used by advertisers and ad ops teams. Stands out for combining deep browser-level debugging with product-minded UX improvements that reduced support escalations and made complex multi-placement ad configuration faster and more reliable for power users.

AM

Apparao Metta

Screened

Director-level QA Engineering Manager specializing in cloud platform quality & reliability

San Francisco, CA | 22y exp
Amazon Web Services | Acharya Nagarjuna University

AWS engineering manager leading delivery for an end-to-end encrypted communications product (calling/messaging/screen sharing), including shipping read receipts with full design/engineering/QA ownership. Demonstrated strong customer-driven problem solving (enrolling offline/mission users via admin one-time codes with account allowlisting) and reliability improvements (data retention bot crash RCA, monitoring/notification, and high-volume test simulation).

SS

Mid-level Python Backend Developer specializing in cloud-native microservices and AI/ML platforms

USA | 4y exp
NVIDIA | Santa Clara University

Backend/AI engineer who built a production GPU-backed real-time inference API at Nvidia and debugged burst-induced tail latency, cutting P95 by ~29% through dynamic batching and backpressure. Also shipped an end-to-end RAG + agentic operational diagnostics assistant with strict tool controls, evidence citation, confidence gating, and strong production guardrails, plus demonstrated hands-on Postgres optimization (900ms to 40–60ms).

Pratham Thukral

Mid-level Software Engineer specializing in distributed systems on AWS

Seattle, WA | 3y exp
Amazon | University of Waterloo

Data/infra engineer with AWS DynamoDB experience who has shipped reliability-critical systems (Global Tables replica repair protocol) and customer-facing service rollouts using canary/percentage-based deployments, strong observability, and rollback strategies. Also built end-to-end Airflow pipelines producing weekly automated reports over ~10TB of advertising segment data, with rigorous week-over-week data quality validation.


Pankaj Goyal

Screened

Director-level Engineering Leader specializing in FinTech, IAM, and AI/ML platforms

SF Bay Area, CA | 22y exp
PostLo | Shri Govindram Seksaria Institute of Technology and Science

Player-coach backend leader at PostLo who led a major backend architecture upgrade to enable AI-driven features by separating transactional systems from AI workloads (vector embeddings/image validation) and adding async processing for heavy jobs. Also owned production reliability improvements (query/index optimization, workload isolation, monitoring and load testing) and translated an ambiguous retention goal into a shipped cashback rewards feature with auditable transactions.
