**Important Information**
Experience: 5+ years
Job Mode: Full-time
Work Mode: Work from home
**Job Summary**
As a **_Senior Site Reliability Engineer (6632)_**, you will join a highly skilled, agile technology team, supporting and developing cutting-edge solutions that meet our business requirements. You will help accelerate our customers' business results by building innovative digital products.
Your responsibilities will include leading and actively participating in the design, development, and delivery of our software projects.
**Responsibilities and Duties**
- Design, implement, and maintain highly available and scalable cloud infrastructure on the AWS platform.
- Develop and implement automated monitoring, alerting, and incident response mechanisms to ensure proactive identification and resolution of system issues.
- Collaborate with software engineering teams to establish Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to measure system reliability and performance.
- Conduct regular performance analysis and capacity planning to anticipate and address scaling requirements.
- Implement and maintain disaster recovery and failover strategies to mitigate service disruptions and ensure business continuity.
- Lead incident response and post-mortem analysis to identify root causes and implement preventive measures.
- Continuously improve system reliability through automation, optimization, and implementation of best practices.
- Stay updated with the latest AWS services and technologies, and evaluate their applicability to enhance our infrastructure and operations.
- Mentor junior team members and foster a culture of collaboration, learning, and continuous improvement.
**Qualifications and Skills**
- Bachelor's degree in Computer Science, Engineering, or a related field. Master's degree preferred.
- AWS Certified Solutions Architect - Professional or AWS Certified DevOps Engineer - Professional certification is required.
- 8+ years of experience in Site Reliability Engineering, DevOps, or related roles, with a focus on AWS cloud technologies.
- Strong understanding of cloud architecture principles and experience with AWS services such as EC2, S3, RDS, Lambda, DynamoDB, etc.
- Proficiency in scripting and automation using languages such as Python, Bash, or PowerShell.
- Experience with infrastructure as code (IaC) tools such as Terraform or CloudFormation for provisioning and configuration management.
- Hands-on experience with monitoring, logging, and observability tools such as CloudWatch, Prometheus, Grafana, ELK stack, etc.
- Solid understanding of CI/CD principles and experience with related tools like Jenkins, GitLab CI/CD, or AWS CodePipeline.
- Excellent problem-solving skills and the ability to troubleshoot complex issues in distributed systems.
- Strong communication and collaboration skills, with the ability to work effectively in cross-functional teams and influence stakeholders at all levels.
**About Encora**
Encora is the preferred digital engineering and modernization partner of some of the world's leading enterprises and digital native companies. With over 9,000 experts in 47+ offices and innovation labs worldwide, Encora's technology practices include Product Engineering & Development, Cloud Services, Quality Engineering, DevSecOps, Data & Analytics, Digital Experience, Cybersecurity, and AI & LLM Engineering.
**At Encora, we hire professionals based solely on their skills and qualifications, and do not discriminate based on age, disability, religion, gender, sexual orientation, socioeconomic status, or nationality.**