Job Description - Sr. Site Reliability Engineer
**Title**:
Sr. Site Reliability Engineer
**Location**:
Remote, based in Costa Rica
**Job Overview**:
**Key responsibilities include**:
- Leadership and Mentorship: Direct and mentor junior SREs, fostering a culture of excellence, continuous improvement, and learning within the team.
- Strategy Development: Lead the creation and execution of sophisticated strategies for system optimization, ensuring scalability, reliability, and security at all levels.
- System Architecture Contribution: Engage in the design and review of system architecture, advocating for security and reliability best practices.
- Advanced Incident Management: Manage complex security incidents with expertise, guiding the team during crisis situations and ensuring swift, effective resolutions.
- Cross-Functional Collaboration: Serve as a primary liaison between the SRE team, IT department, and other technical and business units, driving cohesive efforts towards shared organizational goals.
- Innovation and Research: Champion innovation by researching, advocating for, and implementing cutting-edge technologies and methodologies to enhance system reliability and security.
- Procedure Development: Formulate and maintain up-to-date incident response procedures and playbooks, ensuring their effectiveness and compliance with industry standards.
- Post-Incident Analysis: Conduct thorough post-incident reviews, deriving insights and recommendations to prevent recurrence and improve system security and reliability.
- Collaboration and Detection: Work closely with the virtual Security Operations Center (vSOC) to improve detection and reporting mechanisms so that incidents are identified and handled promptly.
- Threat and Vulnerability Assessment: Provide expertise in threat analysis, conduct vulnerability assessments, and perform penetration testing using leading-edge tools and techniques.
- Security Measures Implementation: Partner with the IT team to deploy security controls and measures that safeguard against future incidents while ensuring system compliance and reliability.
- Stakeholder Engagement: Develop and maintain relationships with key external stakeholders, staying abreast of the latest security trends and practices.
- Technology Proficiency: Use and manage advanced incident response and reliability tools such as Splunk, CrowdStrike Falcon Complete, and Microsoft Defender.
**Preferred Qualifications and Experience**:
- Educational Background: Bachelor's degree in Computer Science or Information Technology, or equivalent experience. Advanced degrees or specialized certifications in site reliability engineering or cybersecurity are preferred.
- Professional Experience: 5-7 years of experience in cybersecurity, including extensive site reliability engineering work and either leadership roles or substantial project management experience.
- Technical Expertise: Deep understanding of cyber threats, attack methodologies, and incident response techniques; a solid grasp of the NIST and ISO 27001 frameworks; and the ability to lead architecture design, advanced troubleshooting, and performance optimization.
- Leadership Skills: Demonstrated leadership capabilities, with experience in guiding projects, mentoring team members, and leading by example in a high-stakes environment.
- Strategic Planning: Proven track record in strategic planning and execution, aligning technical projects with broader business objectives.
- Tools Proficiency: Expertise in incident response tools and technologies such as SIEM, XDR, and threat intelligence platforms, with advanced knowledge of Splunk administration.
- Analytical Skills: Exceptional analytical and problem-solving abilities, capable of sifting through large data sets to identify and address security incidents effectively.
- Communication: Strong communication skills, with the capacity to articulate complex technical information clearly to both technical and non-technical stakeholders.
- Adaptability: Ability to thrive in a fast-paced, ever-changing environment, showing flexibility and a commitment to continuous learning and improvement.
- Additional Tools and Compliance: Familiarity with Qualys, Contrast Security, and KnowBe4 PhishER, knowledge of PCI and SOX compliance, and experience with PagerDuty, Jira, and Confluence are advantageous.
**Desirable Skills**:
- Advanced Technical Skills: Experience with leading-edge technologies or methodologies, such as cloud-native technologies, Kubernetes, or advanced automation and orchestration platforms.
- Industry Leadership: Contributions to the field through speaking engagements, publications, or active participation in relevant professional communities are highly valued.