Site Reliability Engineer (SRE) – AWS + Docker

Bengaluru - Bellandur (GTP), India · Cloud Operations / SRE

Role Number: JR1035257
Posted: 24 Jun 2026
Work mode: Onsite
Employment: Full-time
Experience: 7+ years

Summary

Synechron is seeking a Site Reliability Engineer (SRE) to improve the reliability, scalability, and performance of cloud-native systems. This role supports production operations through AWS infrastructure management, containerized workload operations, CI/CD enablement, observability, and incident response.

The position contributes to business goals by improving availability, reducing operational risk, and supporting cost-efficient system performance.

Description

You will define and maintain reliability standards, build AWS infrastructure for scalable systems, operate containerized services, and partner with development teams to improve reliability by design.

Responsibilities

Define and maintain SLOs, SLIs, SLAs, and error budgets
Build and manage AWS infrastructure for scalable, highly available systems
Operate containerized services using Docker and ECS/EKS/Kubernetes
Implement and optimize CI/CD pipelines and deployment strategies
Establish observability through metrics, logs, and traces
Automate infrastructure and operations using IaC and scripting
Manage incident response, runbooks, root-cause analysis, and remediation
Drive performance tuning, capacity planning, and cost optimization
Implement security best practices across infrastructure and deployments
Partner with development teams to improve reliability by design
Maintain AWS environments and containerized services as part of day-to-day operations
Participate in incident response, troubleshooting, and postmortems
Work with Dev, QA, and Security teams on resilience and operational readiness

Minimum Qualifications

7+ years of experience in SRE, DevOps, or Cloud Operations
Strong hands-on AWS experience: EC2, ECS/EKS, IAM, VPC, ALB/NLB, Route 53, S3, CloudWatch
Docker and container orchestration using EKS/Kubernetes or ECS
CI/CD using GitHub Actions, Jenkins, or Azure DevOps
IaC using Terraform or CloudFormation
Observability: CloudWatch, Prometheus/Grafana, ELK/OpenSearch, X-Ray
Automation using Python and/or Bash
Linux system administration and troubleshooting
Networking: DNS, TCP/IP, TLS, security groups, NACLs
Bachelor's degree in CS, Engineering, IT, or equivalent practical experience
Experience owning production infrastructure and improving MTTR and availability

Preferred Qualifications

Experience with CloudFront, RDS, ElastiCache, Auto Scaling Groups
Blue/green and canary deployment strategies
Artifact management and release approval workflows
Vulnerability scanning and secrets management tools
AWS, Kubernetes, Terraform, or cloud operations certifications
Reliability patterns: circuit breakers, retries, backoff, health checks

Technologies

EC2
EKS
ECS
IAM
VPC
CloudWatch
Terraform
GitHub Actions
Jenkins
ELK
AWS
Docker
Kubernetes
Python
Bash
Prometheus
Grafana