Site Reliability Engineer
Transportation drives humanity forward. At Stratio we have a purpose: to change the transportation industry. We believe in a future with no disruptions, where vehicles never break down: a zero-downtime future. For that, we rely on great individuals and great teams.
The Site Reliability Engineer is responsible for keeping all production systems running smoothly. An SRE at Stratio is a blend of pragmatic operator and software craftsperson who applies sound engineering principles, operational discipline, and mature automation to our environments and the GitLab codebase. Your main focus will be developing automated solutions for operational aspects such as on-call monitoring, performance and capacity planning, and disaster response.
What you will do:
- Run the production environment by monitoring availability and taking a holistic view of system health;
- Build software and systems to manage platform infrastructure and applications;
- Improve reliability, quality, and time-to-market of our suite of software solutions;
- Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve;
- Provide primary operational support and engineering for multiple large distributed software applications;
- Document technical knowledge in the form of notes and manuals;
- Manage and scale complex distributed systems;
- Analyse and implement security improvements by assessing the current situation, evaluating trends, and anticipating requirements.
What we are looking for:
- Solid foundation in deployment and management for large-scale Linux systems;
- SRE experience and comfort operating software in a Linux-based environment;
- Understanding of large-scale complex systems from a reliability perspective;
- Experience collecting system and application metrics for observability (Nagios, Prometheus, Syslog);
- Deep network analysis experience;
- Knowledge of and experience with highly available and scalable architectures;
- Familiarity with container orchestration and containerization services, especially Kubernetes and Docker;
- Experience with web servers (Apache, Nginx, HAProxy);
- Familiarity with Ansible, Puppet or Chef configuration management tools;
- Familiarity with at least one cloud environment, for example AWS, GCP, or Azure (AWS certifications are a plus);
- Experience in software engineering and automation;
- Familiarity with Infrastructure as Code (Terraform or similar);
- Ability to work under pressure;
- Fluency in English.
Nice to have:
- Experience in managing and deploying Apache Kafka;
- Experience in managing and deploying Elasticsearch;
- Experience with Redis cache service;
- Experience in managing databases (Postgres, SQL Server);
- Knowledge of cybersecurity.
We expect you to:
- Mentor and grow less experienced members of the team;
- Gather and analyse metrics from both operating systems and applications to assist in performance tuning and fault-finding;
- Partner with development teams to improve services through rigorous testing and release procedures;
- Participate in system design consulting, platform management, and capacity planning;
- Create sustainable systems and services through automation and uplifts;
- Balance feature development speed and reliability with well-defined service level objectives;
- Be able to work completely autonomously.
What we offer:
- Health Insurance;
- Fringe Benefits Policy;
- Flexible Work Hours - adjust your schedule to your needs;
- Work Setup - remote, hybrid, on-site - if your job can be done remotely, and you prefer to, you’re free to choose;
- Hardware and software for a full remote setup;
- Monthly All-Hands;
- Quarterly Events to discuss Strategy;
- Autonomy and Ownership Culture;
- Continuous feedback culture;
- Innovation Mindset;
- Career Acceleration.
Location: Remote / Hybrid / Coimbra / Lisbon
We want inspiring individuals in our teams, where age, race, gender, sexual orientation, politics and religion do not matter, and we seek to create a tolerant and open space for everyone. We strive to provide an inclusive and trustworthy environment.
You can find our Culture Manifesto and more team information here.
Take the road with us!