“Devopsification & the Cloud”, Phil from Skyscanner
Who are Skyscanner? 800 staff, 10 offices, 50% of staff are engineers
Rapid growth necessitates change.
Adopted a squads / tribes model (à la Spotify) – teams of 1 to 10 people, localised ownership of sections of functionality from design through to development & running it, faster decision making – “You build it, you run it” – drives engineers to do things properly.
Delegation of previously centralised capabilities
Can’t get enough good ops engineers. Adopted cloud (AWS); currently still running hybrid because some traditional systems remain.
Give control of resources to squads “More curator, less gatekeeper”.
Our most expensive resource is engineer time, not infrastructure.
Microservices beats monolith
Works well with the squad model, as a small squad can’t work effectively on a large monolith
Fail fast, fail forward etc
Data driven experiences
Optimise, standardise and automate all the things
Super important: lets us spend more time & energy building user-facing product
“Some companies have moved away from squads as led to war”
There was some empire building at the start, but we quickly took steps to avoid this, kept tribal leadership, and aggressively moved people between squads. Not always easy due to skill sets.
Short term focus.
Biggest challenge when adopting devops & cloud
Trying to find people with the right skill set. You can’t; part of the reason is that consultancies poach those people & pay large salaries. You need the skills in house.
Develop staff internally
Developer Enablement Tribe
Define & promote standards
Create common tooling
Support common platforms
AWS architectural guidance
Set and enforce best practices
Provide internal training
Get people who are enthusiastic and have a mix of development & operations experience
How to make development faster vs the risks?
We split development and production into separate AWS accounts
Production = secure by default, only permit the bare minimum to get the job done
Sandbox = open permissions by default
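The two account defaults above can be sketched as a toy permission model. This is not AWS IAM and not Skyscanner's actual setup: `AccountPolicy`, `is_allowed` and the chosen exception lists are hypothetical names picked for illustration, and the action strings are only examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccountPolicy:
    """Toy model of an account's permission stance (hypothetical, not IAM)."""
    name: str
    default_allow: bool            # what happens when no exception matches
    exceptions: frozenset = frozenset()  # actions that flip the default

    def is_allowed(self, action: str) -> bool:
        # An exception inverts the account's default for that one action.
        if action in self.exceptions:
            return not self.default_allow
        return self.default_allow


# Production: deny by default, allow only the bare minimum (example actions).
production = AccountPolicy(
    name="production",
    default_allow=False,
    exceptions=frozenset({"s3:GetObject", "sqs:SendMessage"}),
)

# Sandbox: allow by default, with a small deny list (example action).
sandbox = AccountPolicy(
    name="sandbox",
    default_allow=True,
    exceptions=frozenset({"iam:CreateUser"}),
)

if __name__ == "__main__":
    for account in (production, sandbox):
        for action in ("s3:GetObject", "ec2:RunInstances", "iam:CreateUser"):
            print(account.name, action, account.is_allowed(action))
```

The point of the sketch is only the asymmetry: in production anything not listed is refused, while in the sandbox anything not listed goes through.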
Confession – we still have (small) dedicated operations, security and network teams
Self service portal
Administer cost management
Does this actually deliver value?
3 years ago it took 6-8 weeks to release a new feature; not viable when you need to be competitive
Today – 6-8 minutes
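To put the two figures side by side, a rough back-of-envelope calculation, assuming both numbers are elapsed calendar time (the talk did not say how they were measured; `speedup` is a hypothetical helper):

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # 10,080 minutes in a calendar week


def speedup(weeks_before: float, minutes_after: float) -> float:
    """Ratio of the old release time to the new one (assumes calendar time)."""
    return weeks_before * MINUTES_PER_WEEK / minutes_after


if __name__ == "__main__":
    # Pair the extremes of the two quoted ranges to bound the improvement.
    print("lowest ratio :", speedup(6, 8))   # 6 weeks before, 8 minutes now
    print("highest ratio:", speedup(8, 6))   # 8 weeks before, 6 minutes now
```

Under that assumption the improvement is in the region of four orders of magnitude, whichever ends of the ranges are paired.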