Keep running smoothly with Site Reliability Engineering
As your product grows, it's crucial to balance site reliability with new feature development. We can help you adopt Site Reliability Engineering (SRE) tenets and upgrade your team and processes to effectively manage SLOs and error budgets.
Let's make your product resilient
What we do
We'll help you establish good SRE practices, then support you when needed
We bring the tenets of SRE to your product team, sharing ways of working and building product resilience. Once the team is empowered to manage SLOs and error budgets on their own, thoughtbot moves into the background to provide on-call and long-term support.
Services
Full-time Site Reliability Engineering
For projects with significant reliability and operations needs, we can assign a full-time SRE or DevOps Engineer to your team.
- Pitch SRE tenets and help product teams and stakeholders adopt the SRE mindset
- Establish SLOs and Error Budgets
- Implement monitoring and alerting to ensure Error Budgets are not exhausted
- Improve performance and scaling for applications to meet SLOs
- Improve CI/CD pipelines to allow continuous, fearless deployment to production environments
- Deploy new infrastructure to meet scaling, security, and compliance needs
- Implement infrastructure as code to ensure long-term maintainability
Clients in the UK public sector can access our services as part of the G-Cloud 13 purchasing framework.
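
For teams new to the idea, the relationship between an SLO and its error budget in the services above can be sketched in a few lines of Python. This is a minimal illustration, not thoughtbot's tooling: the `ErrorBudget` class, the `should_alert` helper, and the 25% alert threshold are all hypothetical names and values chosen for the example, and it assumes a simple request-based SLO measured over a single window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorBudget:
    """Hypothetical request-based error budget for one SLO window."""

    slo_target: float      # e.g. 0.999 means 99.9% of requests should succeed
    total_requests: int    # requests observed in the SLO window
    failed_requests: int   # requests that violated the SLO in that window

    @property
    def allowed_failures(self) -> float:
        # The budget is the share of requests the SLO permits to fail.
        return (1.0 - self.slo_target) * self.total_requests

    @property
    def remaining_fraction(self) -> float:
        # Fraction of the budget still unspent; negative means overspent.
        if self.allowed_failures == 0:
            # No failures are permitted (100% target or no traffic yet).
            return 1.0 if self.failed_requests == 0 else 0.0
        return 1.0 - self.failed_requests / self.allowed_failures


def should_alert(budget: ErrorBudget, threshold: float = 0.25) -> bool:
    # Assumed policy: alert once less than `threshold` of the budget remains.
    return budget.remaining_fraction < threshold


if __name__ == "__main__":
    budget = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=400)
    print(f"allowed failures: {budget.allowed_failures:.0f}")
    print(f"budget remaining: {budget.remaining_fraction:.0%}")
    print(f"alert: {should_alert(budget)}")
```

With a 99.9% target over one million requests, roughly 1,000 failures are allowed, so 400 failures leave about 60% of the budget and no alert fires under the assumed threshold. In practice the window, the definition of a failed request, and the alerting policy are decisions the product team and stakeholders make together.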