Manager, Site Reliability - Madrid, Spain - Nexthink

Nexthink

Madrid, Spain

3 weeks ago

Posted by:

Isabel García

beBee Recruiter

Description

Company Description

Hi, we're Nexthink. We're not just the leader in the digital employee experience category, we invented the category.

Our solutions combine real-time analytics, automation and employee feedback across all endpoints to help IT teams delight people at work.

Our cloud-native platform pinpoints issues and solutions, automates response, and helps companies continuously improve their employees' experience, making them more productive, efficient, and happy at work.

We have millions of endpoints deployed, we've surpassed $200M in ARR, and in 2021 secured $180M in Series D financing for a company valuation of $1.1B, but we're just getting started.

#LI-Hybrid

Job Description:

Ready to take your AWS expertise to new heights? We're scouting for a forward-thinking and inspired Cloud Operations leader to anchor our Site Reliability Engineering (SRE) team.

If you have a knack for coordinating and planning, have a robust technical background, and have a deep understanding of GitOps and DevOps practices, we're eager to hear from you.

As our on-the-ground champion of cloud operations, you'll be the primary guardian of our AWS cloud infrastructure.

You will ensure it operates with stability, performance, and ironclad security while reporting to the VP of SRE and inspiring your team across the EU, India, and the US.

It's time to roll up your sleeves. Your day-to-day will involve hands-on AWS management, facilitating collaboration among cross-functional teams, and driving our journey to a fully implemented Site Reliability Engineering (SRE) model.

Cloud Infrastructure Management:

Manage and maintain our AWS cloud infrastructure, ensuring high availability, scalability, and reliability.
Deploy and optimize AWS services and resources to meet business and technical requirements.
Monitor cloud infrastructure performance, proactively identifying and resolving issues to minimize downtime and disruptions.
Implement appropriate security measures and compliance standards to protect data and systems.
Collaborate with development teams to understand upcoming releases and their impact on the cloud infrastructure.

Operations Team Leadership:

Lead a team of cloud operations professionals under the SRE organization, providing guidance, mentorship, and support.
Foster a collaborative and inclusive work environment, encouraging knowledge sharing and continuous learning.
Define and enforce operational processes and procedures, ensuring adherence to industry best practices.
Conduct regular performance evaluations and provide feedback to team members.

Incident Management and Troubleshooting:

Manage and respond to incidents and service disruptions, following established incident management processes.
Coordinate with internal teams, external vendors, and stakeholders to resolve critical issues in a timely manner.
Conduct root cause analysis of incidents, identify areas for improvement, and implement preventive measures.
Develop and maintain incident response documentation, including runbooks and playbooks.

Cost Optimization and Resource Planning:

Monitor AWS resource utilization and identify opportunities for cost optimization.
Collaborate with the Cloud Costs Team to forecast cloud infrastructure expenses and budget accordingly.
Optimize resource allocation, ensuring efficient utilization of cloud services to achieve cost-effective operations.

Continuous Improvement and Automation:

Identify opportunities for process automation and optimization, leveraging AWS services and tools.
Collaborate on and improve infrastructure as code (IaC) practices using modern tools.
Drive initiatives to improve operational efficiency, scalability, and reliability of the cloud infrastructure.
Stay up to date with emerging cloud technologies and industry trends, and evaluate their potential benefits for the organization.

Qualifications:

Bachelor's degree in computer science, information technology, or a related field (or equivalent experience).
Extensive experience managing and operating cloud infrastructure, with a strong focus on AWS.
Solid understanding of AWS services, including EKS, EC2, S3, RDS, VPC, IAM, ALB/NLB.
Hands-on experience with infrastructure as code (IaC) tools like Crossplane, Terraform, or similar.
Proficient in troubleshooting and resolving complex technical issues in a timely manner.
Strong leadership and team management skills, with the ability to effectively motivate and guide a team.
Excellent communication and collaboration skills, with the ability to work closely with cross-functional teams and stakeholders.
Strong analytical and problem-solving abilities, with a proactive and hands-on mindset.
Proficiency in English, both written and spoken.

Additional Information

We are a fast-paced, growing company, and we are hiring and growing a lot in our Madrid office. If you are looking for a change and like a nice at