Vetted Distributed Systems Professionals

Pre-screened and vetted.

Mengyu Liu

Screened

Senior Data Scientist specializing in GenAI agents and causal inference

Remote, USA · 10y exp
Humana · University of Miami

Built and deployed a production healthcare medical review agent that automates call-transcript summarization and medication reconciliation using a hybrid deterministic + LangGraph-orchestrated LLM workflow. Demonstrates strong reliability engineering (guardrails, schema validation, confidence thresholds, golden/adversarial eval, Langfuse monitoring) in a regulated environment, delivering 60% lower latency and 70%+ efficiency gains while partnering closely with care managers and operations.

CD

Mid-Level Software Developer specializing in Java microservices and cloud-native systems

St. Louis, MO · 5y exp
Epsilon · Saint Louis University

Backend engineer focused on cloud/distributed systems, deploying Java 17/Spring Boot microservices on AWS EKS with RDS and Kafka. Demonstrated strong production readiness work (DB lock mitigation, Kafka idempotency, gradual rollouts) and delivered a major latency improvement (~400ms to ~100ms). Also has proven cross-layer troubleshooting skills, isolating intermittent API timeouts to a specific Kubernetes node’s network interface issue, and partners closely with ops teams to build dashboards and workflow automation (including Python scripts).

Lisa Li

Screened

Director-level Engineering Leader specializing in SaaS, Cloud, and AI/ML delivery

Katy, Texas · 19y exp
Sainsbury's · The Open University

Engineering leader who has led 100+ engineers at Sainsbury’s Tech and previously scaled an org from 6 to 60+ at AND Digital. Drove a high-impact modernization of a pricing/decisioning platform serving 1,700 stores—moving from batch monolith to real-time Kafka-based event-driven microservices with MLOps, IaC (Terraform), and zero-trust—delivering £18m+ annual profit uplift and 10+ deploys/day.

KK

Mid-Level Software Engineer specializing in AWS distributed systems and microservices

Chico, CA · 4y exp
Amazon · California State University, Chico

Backend/ML-systems engineer with experience (including Amazon) building real-time face recognition services using PyTorch (MTCNN/FaceNet) and AWS (SQS/S3/Lambda/EC2) with a focus on low latency, burst handling, and cost control. Also led a revenue-critical legacy pricing workflow migration to a serverless event-driven architecture using strangler-pattern rollout, simulation-based validation, and strong security practices (JWT/RBAC/RLS).

Rohit Kumar

Screened

Mid-level Data Engineer specializing in large-scale analytics platforms

San Jose, CA · 5y exp
Nutanix · USC

Data/Backend engineer with experience at Naukri building large-scale analytics products over a 130M+ user base, including Spark/Airflow pipelines and Kafka-based clickstream validation with Confluent Schema Registry. Also built an audience segmentation backend (Athena/S3 + Spring Boot APIs) for non-technical internal teams and recently shipped a GenAI customer data audit system (FastAPI/Postgres/Llama) that cut sales-planning validation from ~3 months to ~1 week.

Shanmukha Koganti

Mid-level AI/ML Engineer specializing in recommender systems and edge computer vision

Bay Area, CA · 6y exp
Shopify · University of North Texas

ML/AI engineer with production experience at Shopify and Intel, building a deep learning product ranking system that lifted add-to-cart ~14% and serving real-time similarity search via FAISS+Redis under <20ms latency at massive scale. Also deployed computer vision models to 100+ retail edge locations using Docker/Ansible/k3s with zero-downtime rollouts, and applies strong MLOps practices (A/B testing, canary/shadow, observability) plus performance optimization (OpenVINO, INT8).

Ruby Medeiros

Screened

Staff SRE and Software Engineer specializing in distributed systems and cloud reliability

11y exp
Arena · NOVA University Lisbon

Built a production B2C behavioral interview system for job seekers using LangGraph/LangChain on AWS Bedrock with Nova models, plus a FastAPI backend and Vercel AI SDK frontend. Stands out for practical agent reliability work: local stress testing, OpenTelemetry-to-Datadog observability, token/cost monitoring, and guardrails to keep conversations on track and resistant to instruction override.

SP

Mid-level Software Engineer specializing in machine learning and full-stack AI systems

Seattle, WA · 4y exp
SakuraMedTech · University of Washington

Built production-grade Python systems in a medical/imaging context, including an image feature extraction and survival prediction microservice with strong testing, validation, and observability practices. Also developed a Playwright-based autonomous job application agent that handled dynamic UIs and anti-bot challenges with stealth tooling, proxies, and human-in-the-loop escalation.

Akhil Kunala

Screened

Mid-level Software Engineer specializing in backend systems and cloud-native FinTech

Seattle, WA · 5y exp
Amazon · University of North Texas

Amazon engineer with 5+ years of experience who built an AI-assisted log investigation and triage workflow that cut debugging time by about 30% during on-call incidents. Combines observability tooling like CloudWatch and Splunk with Python, prompt engineering, and RAG-based diagnostics, and has practical experience orchestrating agentic AI workflows with a strong human-in-the-loop reliability focus.

Prakash Bhanu

Screened

Director of Software Engineering specializing in cloud, platform, and FinTech systems

Sunnyvale, CA · 22y exp
Cast & Crew · Sofia University

Senior software engineering leader with broad 0-to-1 product experience spanning web apps, microservices, monoliths, messaging platforms, ML/AI products, and large-scale distributed systems. Notable examples include building a payroll/finance product for cast and crew, a distributed messaging platform, and a Walmart application deployed across multiple CDNs and clouds handling hundreds of TPS, with personal ownership across architecture, design, coding, and support.

S Latha Naidu

Screened

Mid-level Software Engineer specializing in AI-powered full-stack systems

Seattle, WA · 4y exp
Amazon · University of Colorado Denver

Backend-focused engineer with experience at AWS building a global alarm processing platform (Python, Lambda/SQS/DynamoDB) handling traffic spikes and reliability issues; resolved duplicate alerts and latency under load by fixing hot partitions and enforcing idempotency. Previously at Cognizant, built Java/PostgreSQL backend workflows for healthcare dashboards using pre-aggregated summary tables, strong SQL optimization, and state-driven job orchestration with ELK-based observability and production guardrails.

Srinivas Vasudevan

Junior Software Engineer specializing in distributed systems and FinTech

Durham, NC · 3y exp
Troxler Electronic Laboratories · North Carolina State University

Built an end-to-end payment fraud monitoring dashboard with a React/TypeScript frontend, GraphQL backend, Redis hot path, and a production RAG chatbot, while solving real-time latency and scaling issues. Also shipped an OCR system on AWS EKS for a live manufacturing line at Troxler, improving production accuracy by 15% with custom preprocessing and model tuning.

Tanny Wang

Screened

Mid-Level Software Engineer specializing in AI agents and Generative AI

San Diego, CA · 8y exp
ServiceNow · UC San Diego

Backend engineer who built and evolved an internal multi-agent AI research platform (Electron + FastAPI) integrating OpenAI, focused on fast, reproducible experimentation with strong observability and run metadata for debugging. Has led incremental backend refactors with feature flags and parallel validation, and brings production-grade access control expertise from ServiceNow (table/field ACLs and row-level-style enforcement).

PK

Junior Software Engineer specializing in full-stack systems and distributed log analytics

Miami, FL · 1y exp
Neocis · Carnegie Mellon University

CMU candidate with hands-on experience taking LLM concepts from research prototypes toward production-ready designs (structured outputs, guardrails, failure-scenario evaluation). Also partnered with sales/customer teams at Mazecare to drive adoption with Dontia Alliance (largest dental clinic chain in Singapore) and engaged Singapore government stakeholders, bridging clinical workflow needs with IT security/integration concerns.

Rutuja Kawade

Screened

Mid-level Software Engineer specializing in cloud infrastructure and distributed systems

Atlanta, GA · 3y exp
Rakuten · Georgia Tech

Cloud infrastructure/product engineer with end-to-end ownership of cloud-native storage/observability products, including taking an internal CMS to Google Cloud Marketplace and scaling to ~40,000 deployments. Strong in Kubernetes-based platforms (Operators, microservices, RabbitMQ) and performance/scalability work (e.g., 200% cluster capacity increase) plus internal tooling that materially improved SRE/QA debugging and release velocity.

Ke Liu

Screened

Mid-Level Software Engineer specializing in search platforms and distributed systems

New York, NY · 4y exp
Fitch Ratings · Columbia University

JavaScript/React-focused engineer with meaningful open-source impact: redesigned cache key normalization for a client-side data fetching/caching library using deterministic hashing, added robust test coverage, and collaborated closely with maintainers through GitHub PRs/issues. Also drives measurable runtime improvements by profiling hot paths, refactoring core abstractions, and validating with benchmarks/load tests; has taken ownership of unowned initiatives like improving relevance/ranking in an internal search platform.

VS

Mid-Level Software Engineer specializing in full-stack web, AI telemetry, and real-time graphics

San Francisco, CA · 3y exp
C3 AI · Northeastern University

Product-focused full-stack engineer building a GenAI-powered case summarization workflow for a telemetry dashboard, spanning React/TypeScript UI (confidence indicators, reasoning traces) and Python/FastAPI backend with caching to control LLM latency/cost. Has operated services on AWS (ECS Fargate, RDS Postgres, S3) and Kubernetes, and has hands-on experience resolving real production latency incidents through query/index optimization and caching.

PP

Mid-level Cloud Support Engineer specializing in AWS microservices and payments APIs

Anaheim, CA · 4y exp
Stripe · California State University, Fullerton

Customer-facing technical support/solutions professional with experience at Stripe and Intuit helping developers take payment API and webhook integrations from testing to production. Uses Datadog and AWS CloudWatch to diagnose real-time production issues (e.g., webhook signature validation errors causing retries/delays) and unblocks customer deployments through hands-on, developer-oriented guidance.

Vikhyath D

Screened

Mid-Level Software Development Engineer specializing in distributed microservices on AWS

Dallas, TX · 5y exp
Amazon · University of North Texas

LLM/agent engineer who has shipped multiple autonomous, multi-step agents to production (document-to-SOP conversion, test generation, code generation) using a custom Python DAG orchestrator with persistent state, tool-calling permissions, and structured outputs (Pydantic/JSON Schema). Demonstrates strong production hardening practices (semantic contracts, golden-dataset prompt regression tests, circuit breakers, and multi-level monitoring) and delivered large productivity wins (34 hours of manual writing reduced to ~20 minutes of review; ~15–20 engineering hours/week saved).

Alex ZhuZhou

Screened

Intern Full-Stack Software Engineer specializing in AI/LLM platforms and data systems

Berkeley, CA · 2y exp
Embraer · UC Davis

Backend/LLM engineer with experience productionizing RAG systems (legal-case natural language querying) and optimizing for latency/cost, including a reported ~40% reduction via Redis caching and batching. Built monitoring and real-time debugging workflows (FastAPI, structured logging, correlation IDs, sandbox repro) and regularly delivered technical demos/workshops. Also partners with BD/sales to translate LLM capabilities into business value, including ESG-metric extraction from corporate filings.

Anandapadmanabhan Santhosh

Junior Software Engineer specializing in distributed systems and backend microservices

Bangalore, India · 1y exp
Nykaa · Stony Brook University

Distributed systems engineer (ex-Nykaa, Licious) who built a PBFT-based Byzantine fault-tolerant consensus system in Go for a multi-node banking-style application, including checkpointing and automated failover/leader election. Strong production reliability background with Docker, Jenkins CI/CD, and monitoring/on-call troubleshooting using Grafana and New Relic; no direct ROS/robotics hardware experience yet but has highly transferable multi-node coordination expertise.

Praveen Nutulapati

Mid-level Generative AI Engineer specializing in LLM fine-tuning, RAG, and agentic systems

New York, NY · 6y exp
JPMorgan Chase · University of Central Missouri

Built and deployed a production multi-agent RAG system at JPMorgan Chase to automate regulated credit analysis and compliance clause discovery across large internal policy/document libraries. Implemented LangGraph-based supervisor orchestration with structured state management (Azure OpenAI) to support long-running, resumable workflows, plus hybrid retrieval + re-ranking and guardrails for reliability. Strong at evaluation/observability (trace logging, LLM-judge, HITL) and at communicating results to non-technical stakeholders via Power BI embeds and Streamlit prototypes.

Kunal Singh Pundir

Mid-level Full-Stack Developer specializing in cloud microservices and GenAI systems

USA · 5y exp
Uber · Northeastern University

Built and owned an end-to-end AI-driven decisioning platform at Uber, combining LLM orchestration with typed tool contracts and a Snowflake-based RAG pipeline to make decisions fully auditable. Delivered large-scale measurable impact (120k requests/day, 18k cases auto-resolved/month) while improving ops SLA from 3 days to 6 hours and cutting incident response time nearly in half. Previously led a high-risk strangler-fig modernization of a legacy insurance platform across 120+ microsites at Accenture, coordinating across multiple squads with feature-flagged parallel cutovers.

PS

Senior Software Engineer specializing in backend infrastructure, cloud automation, and reliability

Mountain View, CA · 8y exp
Oracle · Stony Brook University

End-to-end deployment owner for Oracle document delivery/print services in a hospital-like production environment, focused on reliability/performance at scale (thousands of systems). Also describes implementing event-driven RAG/agentic LLM workflows with attention to embeddings/index consistency, latency, and measurable improvements in response relevance and operational efficiency.
