Mid-level Site Reliability Engineer specializing in cloud infrastructure, Kubernetes, and LLM applications
San Jose, CA | Site Reliability Engineer | 6 years experience | Mid-Level | Cloud Computing | Technology | Media & Entertainment
Screened | Identity Verified
About
SRE-focused engineer with experience at Sony Interactive Entertainment productionizing high-throughput LLM/agentic systems on Kubernetes, including GPU-aware autoscaling and warm-pool strategies to manage latency and cost under traffic spikes. Demonstrates strong incident response using Prometheus and Grafana metrics with Jaeger tracing (e.g., resolving recursive agent loops and restoring 99.9% availability within minutes), and partners closely with sales and customer teams through proof-of-value (PoV) demos and developer workshops.
Experience
Site Reliability Engineer, Amazon
Site Reliability Engineer (Research Assistant), San Jose State University
Site Reliability Engineer, Accenture
Site Reliability Engineer, Sony Interactive Entertainment
Education
San Jose State University, Master's, Software Engineering (2024)
Lovely Professional University, Bachelor's, Computer Science (2021)
Key Strengths
Scaled high-throughput LLM prototypes to production using Kubernetes autoscaling and GPU-aware metrics
Cost-efficient handling of unpredictable traffic spikes via warm pools and preemptive autoscaling triggers
Real-time, multi-layer debugging of LLM/agentic workflows (infrastructure, dataflow, model behavior)
Used distributed tracing to identify recursive agent logic loops and schema-mismatch retry failures
Maintained service reliability by deploying circuit breakers and deterministic fallbacks under incident pressure
Restored 99.9% availability within minutes during a high-traffic latency incident