CitySwift

DevOps for buses

Imagine yourself designing a highly complex technical product.

You're a domain expert, but working in isolation with very little hard data. You crave a deeper understanding of how things really work. If you had more information and a means to interpret it, you could improve efficiency and reduce costs.

An operations team puts your designs into production. Everything is reactive rather than proactive – and they only get feedback when things go wrong. If only there were a way to capture all their understanding of how things really work and build it into the design process!

You may be a software developer with sweaty palms thinking of some horror stories, but I’m actually describing how some bus operations work! 

DevOps vs ‘BusOps’

I hate to pick two random things and say they’re the same, but I hope you’ll allow me to present this analogy. A bus schedule is like software that runs on buses. Schedulers are developers, passengers are your users and a bus arriving on time is similar to your website being up! 

Buses | Software
Schedulers | Developers
Operations | Operations
Schedules | Software
Buses & stops | Servers & containers
GPS pings | Instrumentation
Passengers | Users
Reliability | Uptime
Efficiency | Latency

Great, they’re the same! Let’s do DevOps. 

Hold your horses. Unfortunately, the analogy breaks down when it comes to tools. Until recently, there was no bus equivalent for popular tools such as Git, AWS, Kubernetes or Terraform. And that's where CitySwift comes in.

CitySwift is a data-driven bus scheduling and business intelligence platform. You could call it 'DevOps for buses'.

We developed CitySwift as a toolkit for bus operators, purposefully avoiding creating something that slavishly followed existing bus scheduling processes – because we believe many of those processes are now ripe for transformation.

People over process, over tools

But aren’t we doing it the wrong way round? Shouldn't it be people over process, over tools? Aren't we creating yet another tool, when we should really be ensuring that people understand their function and responsibilities in the bus company – and making sure that the process works?

Well, we’re not just a software company, and we didn’t just start with the tools. 

We know a bit about buses. Our team has more than 40 years' experience across the industry and a strong understanding of the challenges it faces. Using this domain knowledge, we want to transform the way buses are scheduled. But we don't claim to be the only experts.

Bus people are specialists in a difficult domain. So we spoke in-depth with lots of them in order to fully understand their processes and know what works well and what doesn’t. Then we collaborated with them to work out a process that delivers a clearer picture of their bus networks, and the insights and functionality they need to optimise them.

A modern BusOps process

A modern operational process for the bus sector should be:
• Data-driven. All decisions should be driven by the full set of data.
• Automated (to an extent). Boring, time-consuming, error-prone tasks should be automated, freeing up experts for where they can add the most value.
• Version controlled. Bus operators should be able to treat their schedules as software developers would treat code, with the ability to try new optimisations and easily roll back.

It should offer:
• Collaboration. All relevant parties should be able to access relevant information and work together cross-functionally to deliver optimisations across the organisation.
• Continuous delivery. Automation should enable more frequent schedule changes and improvements. More frequent updates allow iterative change and less risk.
• Monitoring and feedback. Bus operators should be able to quickly and easily correlate schedules, KPIs and unexpected events – dealing with incidents before they become issues.

Three CitySwift tools and their DevOps analogues

It’s a bad idea for developers to dive headlong into creating tools to solve problems, without first understanding the process. But we should create tools that encourage (and simplify) effective processes. Also, tools shouldn’t simply automate everything. And they definitely shouldn’t be intended to replace experienced, clever people. But they should make it easier for those experts to skip the boring stuff, spot patterns and make informed decisions.

So what are some good tools that encourage a good process (and here I’m assuming the process is DevOps)? To begin with, you can’t improve it until you measure it. For software, that means instrumenting everything and finding relations. How does CPU usage relate to response times, the number of servers and so on? Tools like Stackdriver and Datadog help us easily explore this dimensional data, ideally allowing us to compare different environments (A/B testing).

The instrumentation is already there for bus operators in the form of GPS pings and ticketing data. They would like to put this data to work and get insights into how their network is performing. How many passengers are using the buses? Are buses late? What effect did a schedule change make? Are Fridays different from the rest of the week? SwiftInsight lets you explore the answers to all these questions.
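To make a question like "Are Fridays different from the rest of the week?" concrete, here is a minimal sketch in Python. It is not how SwiftInsight works internally. The record format (pairs of scheduled and actual arrival times), the five-minute lateness threshold and the function name are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: an arrival counts as late if it is more than 5 minutes
# behind schedule. Real punctuality windows vary by operator.
LATE_AFTER = timedelta(minutes=5)

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def late_share_by_weekday(arrivals):
    """Return {weekday name: fraction of arrivals that were late}.

    arrivals is an iterable of (scheduled, actual) datetime pairs,
    a hypothetical shape for data derived from GPS pings.
    """
    totals = defaultdict(int)
    late = defaultdict(int)
    for scheduled, actual in arrivals:
        day = WEEKDAYS[scheduled.weekday()]
        totals[day] += 1
        if actual - scheduled > LATE_AFTER:
            late[day] += 1
    return {day: late[day] / totals[day] for day in totals}


# Made-up sample data: one Thursday arrival and two Friday arrivals.
sample = [
    (datetime(2024, 3, 7, 8, 0), datetime(2024, 3, 7, 8, 2)),
    (datetime(2024, 3, 8, 8, 0), datetime(2024, 3, 8, 8, 9)),
    (datetime(2024, 3, 8, 9, 0), datetime(2024, 3, 8, 9, 1)),
]
shares = late_share_by_weekday(sample)
```

With this sample, nothing was late on Thursday and half of the Friday arrivals were. A real analysis would need far more data, and would probably break the numbers down by route and time of day as well.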

Version control is fortunately the norm for most software projects now, and its benefits seem like truisms. Infrastructure as Code, using tools such as Terraform and Ansible, is an extension of these benefits. Infrastructure, such as cloud resources, is modelled in code, and creating it is automated and repeatable. SwiftSchedule is our tool for automatically generating bus schedules. It was important for us to treat schedules as we would code. Each schedule is persisted and versioned, and it automates all the boring parts.
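As a rough illustration of what "persisted and versioned" can mean for a schedule, here is a small Python sketch of an append-only history with rollback. It is a toy model, not SwiftSchedule's actual design. The class name, the timetable shape (stop name mapped to departure times) and the stop names are hypothetical.

```python
import copy


class ScheduleHistory:
    """Append-only history of schedule versions (hypothetical model)."""

    def __init__(self):
        self._versions = []  # list of (message, timetable) tuples

    def commit(self, timetable, message):
        # Deep-copy so later edits by the caller cannot alter history.
        self._versions.append((message, copy.deepcopy(timetable)))
        return len(self._versions)  # 1-based version number

    def current(self):
        return copy.deepcopy(self._versions[-1][1])

    def rollback(self, version):
        # Rolling back re-commits the old timetable as a new version,
        # so earlier history is never rewritten.
        if not 1 <= version <= len(self._versions):
            raise ValueError(f"unknown version {version}")
        _, timetable = self._versions[version - 1]
        return self.commit(timetable, f"Roll back to v{version}")


history = ScheduleHistory()
v1 = history.commit({"Stop A": ["08:00", "08:20"]}, "Baseline")
v2 = history.commit({"Stop A": ["08:00", "08:15", "08:30"]}, "Try more frequency")
v3 = history.rollback(v1)  # the experiment did not work out
```

Treating a rollback as just another commit mirrors how `git revert` behaves: the failed experiment stays in the record, which is useful when you later ask why a change was abandoned.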

Even with all that, things can go wrong! What makes the difference is how we respond, and to respond at all you first need a tool for monitoring your software in flight. Driving without a speedometer, fuel gauge and rev-counter is unnerving, and the same goes for software in production without some dashboards. We could read the logs to find out what’s happening, but that’s the equivalent of listening to the engine and opening the bonnet to see what’s spinning (definitely not advised while in motion!). Dashboards work well here, but developers ideally need automated alerts based on metrics with tools like Datadog and OpsGenie, removing the need to keep a constant eye on dashboards. Once the issue is resolved, we need to make sure it doesn’t happen again. A key idea in DevOps is continuous improvement, and part of this is capturing all the data and doing a no-blame post-mortem. SwiftOps (coming soon) gives bus operators this visibility, with automatic alerts and incident logging.
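The idea of an automated alert based on a metric can be sketched in a few lines of Python. This is only an illustration of the general technique (a threshold on a rolling average). It is not the Datadog or OpsGenie API, and not how SwiftOps is built. The class name, the five-minute threshold and the window of three readings are assumptions.

```python
from collections import deque


class DelayAlert:
    """Fire when the mean of the last `window` delay readings exceeds
    `threshold_minutes`. Hypothetical names and defaults."""

    def __init__(self, threshold_minutes=5.0, window=3):
        self.threshold = threshold_minutes
        self.recent = deque(maxlen=window)  # drops the oldest reading

    def observe(self, delay_minutes):
        """Record one reading; return True if an alert should fire."""
        self.recent.append(delay_minutes)
        # Wait for a full window so one bad reading cannot trigger it.
        window_full = len(self.recent) == self.recent.maxlen
        mean = sum(self.recent) / len(self.recent)
        return window_full and mean > self.threshold


alert = DelayAlert(threshold_minutes=5.0, window=3)
# Made-up delay readings, in minutes, for one bus route.
fired = [alert.observe(delay) for delay in [2, 9, 3, 8, 10]]
# fired == [False, False, False, True, True]
```

Averaging over a window is a deliberate choice here: a single late bus is noise, while a run of late buses is an incident worth waking someone up for.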

Buses | Software
SwiftInsight: view historical data and future predictions for journey times and passenger demand; get detailed feedback on the impact of operational changes | Stackdriver, Datadog, etc.: view historical data for latency and throughput. A/B testing: get detailed feedback on the impact of technical changes
SwiftSchedule: automate the generation of optimised schedules; version control scheduling | Terraform / Ansible: provision infrastructure. Git: version control code
SwiftOps: view operational information using dashboards and solve potential issues before they occur; log incidents for future improvement and no-blame post-mortems | OpsGenie / Datadog alerts: monitor your software in production using dashboards and respond to issues when they do occur. Incident logging: future improvement and no-blame post-mortems

Why should bus operators embrace DevOps?

We’ve all been on the bus, or waiting patiently at a bus stop, when things have gone wrong. Whether it’s down to the weather, a mechanical fault, a traffic incident or an overcrowded bus home from a football match or music gig, it’s not good for passengers. And what’s bad for passengers is bad for bus operators.

Although they’re not expected to deliver nine-nines reliability, failing to hit their SLAs costs bus operators dearly. They lose passengers. They get fined by local authorities. They may even lose their licence to operate certain routes.

But what options do they have? Add more frequency for improved reliability? That’s inefficient and hugely expensive. You can’t just run ‘kubectl make more buses’ (but if you know how to solve that problem, you’re hired). And even if you could (after we've hired you of course), it's trickier again to find experienced bus drivers.

What’s needed is a happy balance (think number of servers versus response times in our software analogy). But to reach and maintain that happy balance, bus operators need data and the right tools to interpret it.

Work with us

Are you a software developer or data scientist interested in helping build the future of public transportation? We’re looking for talented people to join our growing technical team in Galway.

Are you a bus operator interested in exploring new tools and processes to make the most of your data? We’d love to show you the CitySwift platform and discuss what it can do for your bus network.

Frank Farrell is Lead Developer and Cloud Architect at CitySwift.