Site Reliability Engineer (SRE) – AWS + Docker

Bengaluru - Bellandur (GTP), India · Cloud Operations / SRE

Role Number: JR1035257
Work mode: Onsite
Employment: Full-time
Experience: 7+ years

Summary

Synechron is seeking a Site Reliability Engineer (SRE) to improve the reliability, scalability, and performance of cloud-native systems. This role supports production operations through AWS infrastructure management, containerized workload operations, CI/CD enablement, observability, and incident response.

The position contributes to business goals by improving availability, reducing operational risk, and supporting cost-efficient system performance.

Description

You will define and maintain reliability standards, build AWS infrastructure for scalable systems, operate containerized services, and partner with development teams to improve reliability by design.

Responsibilities

  • Define and maintain SLOs, SLIs, SLAs, and error budgets
  • Build and manage AWS infrastructure for scalable, highly available systems
  • Operate containerized services using Docker and ECS/EKS/Kubernetes
  • Implement and optimize CI/CD pipelines and deployment strategies
  • Establish observability through metrics, logs, and traces
  • Automate infrastructure and operations using IaC and scripting
  • Manage incident response, runbooks, root-cause analysis, and remediation
  • Drive performance tuning, capacity planning, and cost optimization
  • Implement security best practices across infrastructure and deployments
  • Partner with development teams to improve reliability by design
  • Maintain AWS environments and containerized services daily
  • Participate in incident response, troubleshooting, and postmortems
  • Work with Dev, QA, and Security teams on resilience and operational readiness

Minimum Qualifications

  • 7+ years of experience in SRE, DevOps, or Cloud Operations
  • Strong hands-on experience with AWS: EC2, ECS/EKS, IAM, VPC, ALB/NLB, Route 53, S3, CloudWatch
  • Docker and container orchestration using ECS or EKS/Kubernetes
  • CI/CD using GitHub Actions, Jenkins, or Azure DevOps
  • IaC using Terraform or CloudFormation
  • Observability: CloudWatch, Prometheus/Grafana, ELK/OpenSearch, X-Ray
  • Automation using Python and/or Bash
  • Linux system administration and troubleshooting
  • Networking: DNS, TCP/IP, TLS, security groups, NACLs
  • Bachelor's degree in CS, Engineering, IT, or equivalent practical experience
  • Experience owning production infrastructure and improving MTTR and availability

Preferred Qualifications

  • Experience with CloudFront, RDS, ElastiCache, Auto Scaling Groups
  • Blue/green and canary deployment strategies
  • Artifact management and release approval workflows
  • Vulnerability scanning and secrets management tools
  • AWS, Kubernetes, Terraform, or cloud operations certifications
  • Reliability patterns: circuit breakers, retries, backoff, health checks