Uday Kumar Surabhi

Senior Site Reliability Engineer specializing in cloud observability and incident response

CA, USA | Site Operations Engineer | 6 years experience | Senior | Technology | Cloud Computing | SaaS
Screened | Identity Verified

Connect with Uday

Uday already has a relationship with Reval, so a warm intro from us gets a much better response than cold outreach.

About

Backend engineer experienced in evolving high-scale legacy on-prem systems into cloud-native, event-driven microservices on AWS/Kubernetes, with a noted peak traffic of about 1.5M QPS. Strong focus on reliability engineering and operational excellence: SLO-driven observability, GitOps/canary rollouts, chaos testing, and preventing cascading failures (e.g., retry-storm mitigation).

Experience

Site Operations Engineer, Pyramid Consulting
Site Reliability Engineer, Capgemini
Site Reliability Engineer, Etyme
DevOps Engineer, The Ramco Cements

Education

University of North Texas, Master's, Information Systems and Technology (2023)
Osmania University, Master's, Business Administration (2018)

Key Strengths

  • Modernized a legacy on-prem backend into a cloud-native, event-driven AWS architecture without a full rewrite
  • Designed microservices fault isolation at scale (Kubernetes namespaces/limits/PDBs, async messaging) for high-traffic systems (~1.5M QPS peak)
  • SLO-based observability and alerting focused on user impact (Prometheus/Grafana, distributed tracing)
  • Resiliency engineering: circuit breakers, strict timeouts, limited retries, retry budgets to prevent cascading failures
  • Safe delivery practices: GitOps rollouts, canaries, automated rollbacks, chaos testing
  • Migration risk management via small reversible changes, feature flags, parallel runs, and traffic shifting with close SLO monitoring
  • Identified and mitigated a retry-storm failure mode, improving peak-traffic stability

Contact

candidate@example.com | (555) 123-4567 | LinkedIn Profile

Languages

English

Skills

Site Reliability Engineering | Observability | DevOps | Incident Management | Incident Response | Escalation Management | On-call Support | Postmortems | Root Cause Analysis (RCA) | SLO | SLI | MTTR Reduction | Disaster Recovery (DR) | DR Planning | Chaos Engineering