Site Reliability Engineer - Barcelona, Spain - Nexxiot

Nexxiot

Verified company

Barcelona, Spain

1 week ago

Posted by:

Isabel García

beBee Recruiter

Description

Nexxiot is a TradeTech leader with hardware-enabled data solutions and a Vision to Reduce Uncertainty in Cargo.

Nexxiot operates the most significant global digital fleet, with around 300,000 rail cars and 800,000 intermodal containers as of 2023, and is pursuing an ambitious growth plan to quadruple the number of digitized assets by 2027.

Nexxiot empowers carriers, railroads, and shippers to monitor the location, status, and condition of their assets and cargo in real time, provides forensic analysis of past events, and delivers predictive, actionable insights.

Sophisticated big data and AI-based analytics deliver business intelligence at scale to drive efficiency, process automation and achieve sustainability targets.

As a Site Reliability Engineer (SRE) at Nexxiot, you are part of an interdisciplinary agile SRE team responsible for implementing highly available, scalable, compliant and secure cloud infrastructure according to the requirements and priorities set by the principal site reliability engineer. Working closely with the rest of the team, you design, implement and test cloud infrastructure solutions, and you operate and maintain the resulting software and cloud infrastructure services according to our Site Reliability Engineering (SRE) practices. As an SRE you are pragmatic: you pick the right tool for the job, you are never afraid to read up on and learn a new programming language or tool to the degree needed to get the job done, and you improve existing toolchains and processes along the way.

Your main responsibilities are:

Enable DevOps teams to deliver secure, compliant and resilient software services with short time to market.
Collaborate with Product Owners (PO) and Software Architects (SA) from product teams.
Deploy infrastructure components and services to the development, testing and production environments using continuous deployment practices according to the principles of Site Reliability Engineering.
Infrastructure Strategy (evaluation of external services).
Conduct regular architecture reviews, (re)evaluations and participate in surveillance and compliance activities.
Provide consultancy services for macro and security architecture to DevOps teams.
Identity and Access Management (IAM) for cloud infrastructure.
Onboarding and offboarding of staff members in collaboration with the internal IT department.
Act as onboarding buddy for new SRE team members.
Participate in agile team activities (e.g. standups, planning meetings, demos, retrospectives).
Participate in the team's on-call rotation during office hours to provide 3rd level support and to ensure system and service availability.
Actively participate in discussions, peer collaboration and solution reviews.
Develop and operate the Kubernetes platform.
Develop and operate the GitLab CI/CD platform.
Operation of persistent database systems and storage layers for primary data stores.
Define, develop, test and practice disaster recovery procedures.
Develop, maintain and improve monitoring (metrics, logs, etc.) for our infrastructure and platform services.
Develop and operate VPN services to provide secure and reliable connection to our infrastructure.
Security Information and Event Management (SIEM) for cloud infrastructure.
Participate in internal and external security audits and review activities (e.g. pentests).

Requirements:

Experience in writing and operating infrastructure as code with focus on reliable day two operations.
Experience with at least one of the following infrastructure management tools: Pulumi, Terraform, AWS CDK, AWS CloudFormation or Ansible.
Experience in writing software in at least one of the following programming languages: TypeScript, Rust, Go or Python.
Linux/Unix shell know-how is a great plus.
Familiarity with public cloud infrastructure concepts; experience with Amazon Web Services (AWS) is a big plus.
Acquaintance with the Git version control system (GitLab) and CI/CD best practices (GitLab pipelines).
Experience in writing and operating highly available and scalable containerized software services on top of Kubernetes, AWS ECS, AWS Fargate or Docker is a great plus.
Experience in operating Kubernetes clusters is a great plus.
Experience with monitoring systems such as Grafana, Prometheus, Datadog or CloudWatch is a great plus.
Fluent in spoken and written English. German is a plus, but not mandatory.

What we offer:

Flexibility: We offer a 4-day (80%) or 5-day (100%) working week.
Remote: We offer unlimited working from home; however, please bear in mind that some roles require a certain degree of office presence.
Work-Life Balance: We offer 30 days' paid leave.
Sustenance: Coffee, tea, water, fruit, nuts, snacks and fridges usually stocked with delicious smoothies, all free of charge at our office locations.

Our Values

Contribute Actively
Be Transparent, Do Not BS
Promote Mutual Respect
Keep Cool and Have Fun
Fail For
