Vetted Distributed Systems Professionals

Pre-screened and vetted.

Ashraf Jahangeer

Executive engineering leader specializing in AI, SaaS, and large-scale platform transformation

Pleasanton, CA | 19y exp
BlueCargo | UC Davis

Senior engineering executive with 20+ years of experience scaling and restructuring product engineering organizations across startups and larger companies. Over the last 8 years, they grew teams significantly at Stitch Fix and EasyPost, led monolith-to-microservices modernization, and launched products associated with 50M-100M+ ARR. Currently a hands-on VP Engineering at BlueCargo, combining executive leadership with direct technical contribution when customer-critical issues arise.


Suraj Botcha

Screened

Intern AI/ML Engineer specializing in LLM systems and industrial AI

Remote | 1y exp
ControlRooms.AI | Carnegie Mellon University

Full-stack AI engineer who has built both document-intelligence products and agentic investigation systems end to end. At ControlRooms.AI, they helped ship a production-facing root cause investigation workflow for industrial operations using Neo4j, FastMCP, RAG, OCR/VLM inputs, and multiple LLMs, contributing to roughly a 10x reduction in manual investigation time. They stand out for designing explainable, traceable AI systems that surface evidence, uncertainty, and missing context rather than forcing overconfident answers.


Sara Fang

Screened

Mid-level Software Engineer specializing in cloud data platforms and distributed systems

Remote | 6y exp
Terra Byte X | University of Delaware

Backend/data engineer with production experience building FastAPI services with strong reliability patterns (circuit breaker, rate limiting, caching, graceful degradation) and JWT/OAuth2 auth. Has delivered AWS EKS deployments via Terraform with Secrets Manager/IRSA and HPA autoscaling, and built Glue/Spark ETL pipelines on S3 Parquet with schema-evolution and idempotent reruns; also demonstrated measurable SQL tuning impact (20–30s to <10s).


Tianyi Wang

Screened

Entry-Level Backend/Cloud Engineer specializing in distributed systems and AI platforms

Seattle, WA | 1y exp
Amazon | University of Michigan

Full-stack engineer with deep serverless AWS experience who built VidToNote, an AI video analysis platform, end-to-end using Next.js App Router/TypeScript and an event-driven pipeline (API Gateway, Lambda, DynamoDB, S3, Step Functions, SQS). Strong on production reliability and observability (CloudWatch, X-Ray, structured logging), plus data/analytics work in Postgres with measurable query optimizations and durable LLM evaluation workflows. Amazon background; integrated 22 AWS services and completed AWS Solutions Architect Professional certification within a month.


Tommy Tomaye

Screened

Senior DevSecOps & Cloud Security Engineer specializing in AWS and application security

San Diego, CA | 10y exp
Sony | University of Mosul

IBM Power/AIX infrastructure engineer who has owned a large enterprise footprint (40 Power8/9 frames, 400+ AIX LPARs) with deep hands-on VIOS/HMC, NIM, performance tuning, and PowerHA recovery. Demonstrated high-impact incident response (avoided DB reboot saving ~4 hours; restored clustered services in <20 minutes) plus strong RCA and preventative remediation. Also brings modern DevOps/IaC experience building GitHub Actions pipelines and Terraform-managed AWS EKS/VPC/RDS/S3 environments.

TH

Mid-level Software Engineer specializing in backend systems, IoT, and AI security

Pittsburgh, PA | 3y exp
Naptic | Carnegie Mellon University

Full-stack engineer in the investment tracking/financial reporting space who built an automated reporting dashboard and compliance/reporting pipeline end-to-end using Next.js (App Router, server/client components), REST, and Postgres. Demonstrated measurable performance wins (~30% faster loads) through caching and query optimization, and built durable orchestrated workflows in n8n with retries, idempotency, and reconciliation checks.

PY

Mid-level Software Development Engineer specializing in AWS telemetry and DDoS mitigation

Seattle, WA | 3y exp
Amazon Web Services | Texas A&M University-Commerce

Amazon engineer who built an Amazon Bedrock-powered summarization layer over large-scale network/service telemetry (“top talker” insights) to help security engineers triage anomalies faster. Emphasizes production-grade design patterns for LLM features—non-blocking enrichment, deterministic fallbacks, strict structured outputs, and monitoring to preserve trust in source-of-truth telemetry.

Likhitha Bethi

Mid-level Software Engineer specializing in backend systems, distributed systems, and applied AI

Stony Brook, NY | 4y exp
Stony Brook University | Stony Brook University

Goldman Sachs engineer who owned end-to-end features for an internal onboarding and case management platform, spanning React/TypeScript UI, a GraphQL gateway, and Node + Spring WebFlux microservices. Built and operated a Kafka-based ingestion and search pipeline with DLQs, retries, idempotency, and strong observability, and improved developer experience via backward-compatible GraphQL API design and schema-driven documentation.

Vela Sivasankaran

Executive technology leader specializing in cloud, telecom, and digital transformation

Virginia, USA | 16y exp
AFCEA International | University of Pennsylvania

Former founder of a financial services technology startup that is currently on hold for family reasons. Has hands-on startup fundraising exposure from employee roles, including presenting proof-of-concept demos to venture capital firms, and brings a strong focus on fraud prevention, safeguards, and regulatory compliance.

VK

Senior Software Engineer specializing in backend systems, cloud, and AI automation

Houston, TX | 5y exp
Netflix | University of Houston-Clear Lake

Built a production AI-powered workflow automation system at Netflix that integrated OpenAI and LangChain with FastAPI services on AWS, cutting roughly 320 hours of manual operational effort. Brings a mix of full-stack product development and practical AI systems experience, with strong attention to reliability, maintainability, and non-technical user adoption.

SK

Intern Software Engineer specializing in developer productivity and data/AI systems

Los Angeles, California | 1y exp
Intuit | UC Berkeley

Internship experience at Intuit building an LLM-grounded QA system for internal microservice data across 100+ microservices, using a graph database approach (evaluated Neo4j and selected AWS Neptune for production alignment). Also has UC Berkeley research experience (including work with Prof. Dawn Song / Berkeley Eye Research Lab) and cross-functional collaboration with bioinformatics/biology teams to deploy software systems on research servers.

CW

Mid-level Robotics & Autonomy Engineer specializing in MPC, RL, and GPU-accelerated optimization

4y exp
Georgia Institute of Technology | UC Berkeley

Robotics software engineer from Ati Motors who brought a Linear MPC approach (based on Kuhne et al.) into production, rebuilding parts of the planning stack to eliminate oscillations and safely double AMR speed from 0.8 m/s to 1.6 m/s. Also delivered an end-to-end point-cloud detection pipeline (PointPillars) including synthetic data generation in Isaac Sim and TensorRT deployment for real-time human/trolley detection, with a strong focus on production reliability via iterative hardening and nightly SIL.

JR

Senior Software Engineer specializing in distributed systems and AI workflow orchestration

Austin, TX | 5y exp
Apple | University of Central Missouri

Backend owner at Apple for an AI workflow orchestration service, with hands-on experience stabilizing peak-traffic production systems using OpenTelemetry-style tracing, bounded async concurrency, and database performance tuning. Built and shipped a Python LLM-agent orchestration layer to automate multi-step operational workflows, emphasizing guardrails, auditability, and deterministic fallbacks to keep non-deterministic AI behavior production-safe.

MO

Mid-Level Software Engineer specializing in cloud-native distributed systems

Bellevue, WA | 7y exp
Amazon | University of Washington

Gameplay engineer with hands-on ownership of a real-time C++ combat ability system, including diagnosing and eliminating large-scale combat frame spikes by refactoring hit detection to an event-driven, animation-notify approach (cut collision checks ~80%). Also implemented UE5 networked abilities (dash) with client-side prediction and server-authoritative reconciliation, plus projectile ballistics validated through debug spline visualizations and unit tests.

SB

Mid-level Backend & Reliability Engineer specializing in AWS, Kubernetes, and automation

New Mexico, US | 5y exp
Meta | University of North Carolina at Charlotte

Meta engineer focused on reliability/operations tooling who built a unified real-time health dashboard and scalable telemetry pipelines (AWS + Datadog) for thousands of devices. Also shipped an internal LLM-powered knowledge assistant using RAG over wikis/runbooks/logs with strong guardrails and a rigorous eval loop that drove measurable accuracy improvements via automated doc ingestion and embedding updates.

Shreya Roy Koneri

Mid-level Software Engineer specializing in backend microservices and real-time payments

Phoenix, AZ | 5y exp
American Express | University of Dayton

Product-minded full-stack engineer who has owned customer-facing platforms end-to-end, including a unified web UI platform that increased adoption by 30% using feature flags and phased rollouts. Experienced designing TypeScript/React systems with microservices and RabbitMQ at scale, addressing reliability issues with DLQs, retries, and idempotent consumers, and building internal analytics tooling adopted company-wide within weeks.

Muhan Zhang

Screened

Junior AI Software Engineer specializing in LLM pipelines, OCR, and RAG

Palo Alto, USA | 2y exp
Platflow.AI | Cornell University

Built and shipped a production LLM pipeline for nursing home Medicare reimbursement (PDF OCR + fact extraction + keyword RAG + QA) that reportedly increased payouts by ~$1K/month per patient. Strong in LLM ops/benchmarking (ground truth, LLM-as-judge, cost/I-O tracking) and pragmatic optimization—swapped retrieval approaches, fine-tuned a small model to cut OCR cost 90%, and migrated workloads to Azure/Temporal to scale nightly processing 10x.

Jehanzeb Khan

Screened

Director-level Engineering Manager specializing in large-scale data and compute platforms

Sunnyvale, CA | 20y exp
Amazon | Institute of Business Administration

Platform and distributed-systems leader (player-coach) who owned architecture and reliability for an Amazon analytics/data platform serving ~100K internal users at exabyte scale. Built an ML-driven “Lakeflow” optimization layer that cut pipeline completion times ~20–25% and reduced compute waste >15%, and led major incident response/redesign efforts (e.g., deletion storm) with strong rollout/observability/rollback practices.

Jingyao Chen

Screened

Junior Backend/Platform Engineer specializing in AI microservices and cloud-native systems

Pittsburgh, PA | 2y exp
MeowyAI | Carnegie Mellon University

Cofounder at MeowyAI who shipped a production multimodal (vision/voice/text) AI task manager using Gemini, tackling real-world issues like hallucinations, tool-calling safety, and RAG-based preference memory. Also built a production multi-agent RAG system orchestrated with LangGraph (and contributes to LangChain), with strong emphasis on latency optimization, observability (OpenTelemetry), and rigorous testing/evaluation including A/B tests and adversarial prompting.

KARTHIKBABU VADLOORI

Mid-level Full-Stack Developer specializing in Spring Boot, React, and cloud microservices

San Francisco, CA | 5y exp
Meta | University of Texas at Arlington

Backend engineer with experience at Meta and Accenture building regulated-data systems (healthcare/financial) using Python/Flask and Postgres. Has scaled high-throughput services to millions of daily requests, delivering measurable latency wins (~40% API latency reduction; ~35% faster DB-backed endpoints), and has productionized ML inference services using Docker/Kubernetes and AWS (ECS/SageMaker).


Tirath Shah

Screened

Senior Software Engineer specializing in Unity game development, multiplayer networking, and VR

San Francisco, CA | 13y exp
Roku | University of Pittsburgh

Unity/C# gameplay engineer from Roku who led and shipped a cross-platform real-time multiplayer system spanning Meta Quest VR and iOS/Android, including AI/LLM-driven NPC behavior. Reported strong post-launch outcomes (+40% VR retention, +25% engagement) with stable networking (server-authoritative, delta compression/prediction) and robust debugging/observability via logging and replay tools.

HR

Mid-level Data Analytics professional specializing in BI, data engineering, and applied AI

California, USA | 6y exp
Amazon | San Jose State University

Built GenMedX, a multi-module clinical AI system for emergency department decision support spanning triage prediction, diagnosis, medication Q&A, and visit summarization. Stands out for combining medical LLM fine-tuning, RAG, and rigorous evaluation/monitoring to drive a major triage recall improvement from 38.5% to 76.6%, with a strong focus on safety, edge-case detection, and production reliability.

