Site Reliability Engineer Manager

Job details

**Introduction**
At IBM, work is more than a job - it's a calling: To build.
To design.
To code.
To consult.
To think along with clients and sell.
To make markets.
To invent.
To collaborate.
Not just to do something better, but to attempt things you've never thought possible.
Are you ready to lead in this new era of technology and solve some of the world's most challenging problems?
If so, let's talk.
**Your Role and Responsibilities**
The shift toward the consumption of IT as a service, i.e., the cloud, is one of the most important changes to happen to our industry in decades.
At IBM, we are driven to shift our technology to an as-a-service model and to help our clients transform themselves to take full advantage of the cloud.
With industry leadership in analytics, security, commerce, and cognitive computing and with unmatched hardware and software design and industrial research capabilities, no other company is as well positioned to address the full opportunity of cloud computing.
We are looking for an experienced Cloud SRE Manager to join our team, someone who innovates and shares our passion for winning in the cloud marketplace.
IaaS Operations is a team dedicated to ensuring that the IBM Cloud is at the forefront of cloud technology, from data center design to network architecture to storage and compute clusters to flexible infrastructure services.
We are building IBM's next generation cloud platform to deliver performance and predictability for our customers' most demanding workloads, at global scale and with leadership efficiency, resiliency, and security.
It is an exciting time, and as a team we are driven by this incredible opportunity to thrill our clients.
In this role, you will also drive requirements for automation, monitoring, and tooling for a team whose core mission is to use automation and IBM Watson AIOps to predict and prevent incidents before they are visible to clients.

As the Cloud SRE Manager your key responsibilities will include:

- Managing a group of Site Reliability Engineers, including the team's day-to-day operations, quarterly reviews, evaluations, and career development
- Managing critical customer issues, which requires ongoing communication with customers
- Preparing and delivering training sessions and other presentations
- Providing weekly quality reviews for the team
- Providing a weekly status report showing metrics
- Analyzing current operational processes and performance, and recommending improvements where necessary
- Liaising with the development, operations, network, and storage teams and driving customer ticket resolution and SLA with these organizations

As the Cloud SRE Manager, you should possess:

- Proven leadership skills
- Strong organization and effective time-management skills
- The ability to respond promptly to production issues and alerts
- Comfort operating in a fast-paced environment
- Comfort using and navigating a Linux environment

**Required Technical and Professional Expertise**
- Five (5) years of experience in a technical support/operations manager role; at least three (3) years of experience in a technical support or development environment (preferably cloud or managed servers)
- History of process improvement, problem solving skills, customer advocacy orientation, and leadership in a cross-functional team environment.
- Excellent leadership and management skills with emphasis on mentoring, motivating, and driving a large team to success.
- Experience implementing team processes and monitoring effectiveness.
- Ability to identify, analyze, prioritize, and resolve daily operational problems and issues.
- Strong written and verbal communication skills.
- Demonstrated leadership and team building skills.
- Energetic, motivated, and customer focused.
- Ability to quickly adapt to a rapidly changing technology environment.
- Ability to hire, train, and retain quality team members is critical.
- Experience using Splunk and/or other dashboards
- Understanding of web technologies and the technology stack
- Working knowledge of network and storage technologies
- Working knowledge of ServiceNow, Jira, Confluence, and GitHub
- ITIL Foundation V4 certification is a plus

**Preferred Technical and Professional Expertise**
- Understanding of business continuity, fault tolerant design, and fail-over architecture
- Automation of production monitoring
- Experience with configuration management systems
- Experience with service management tools such as ServiceNow, Jira, and Confluence
- Experience writing scripts
- Experience as a support engineer
- Experience with Kubernetes
- Experience with GitHub, Perl, and Python

**About Business Unit**

Digitization is accelerating the ongoing evolution of business, and clouds - public, private, and hybrid - enable companies to extend their existing infrastructure and integrate across systems.
IBM Cloud provides the security, control, and visibility that our clients have come to expect.
We are working to provide the right to


Nominal salary: Negotiable

Source: Whatjobs_Ppc

