Documentation might not be the flashiest tool, but it’s the one that’s had the biggest impact on data democratization at internet testing and analytics company Ookla.
The team uses software tool Confluence to maintain information from all of their data sources and fields, and this unification makes it easy for anyone to quickly find the information they care about, be it expected values, example queries or historical notes.
Another tried-and-true tool the Seattle-based company relies on? Consistent communication. Brian Connelly, a data science and analytics lead, said collaboration goes a long way in uniting siloed departments, but the work is ongoing. Instead of preaching a project’s completion, leadership rewards ongoing partnership, he said.
Collaboration is only useful, however, if employees understand how the newly unlocked data impacts customers. That’s why Ookla employees rely on Apache Airflow to generate reports they can use in client conversations and Shiny for data analysis.
The hope? That all team members — even the non-technical — can confidently speak to the speed and quality of mobile and fixed broadband connections.
What were the initial steps you took to break down data silos across your organization?
Surprisingly, the biggest challenge that we face at Ookla isn’t the huge number of records that we receive each day. It’s the breadth of our data. Modern data warehouses and data lakes make working with billions of records fast and easy. But you can only take advantage of those tools if you know which data points best address the question at hand. That often means you need to know both where those data points are stored and what they mean.
Our goal is to enable anyone in the company to quickly get answers to their questions without having to first become a specialist in network engineering, systems administration, radio frequency engineering, geography and more. This mission has required help from teams across the company. Data engineers, who manage the incoming flow and transformation of raw data; domain experts, who help give meaning and context to the data; and data scientists and software engineers, who build standards and tools for working with the data, all play a role.
Fortunately, everyone across these teams has been enthusiastic about democratizing our data, so it’s been easy to create a culture and a set of processes around doing so. There are definitely some time and effort costs, but we’ve all enjoyed the benefits, which have been much larger.
What are some of the tools Ookla uses to integrate its data and make it more user-friendly for non-technical employees?
Although documentation is the least flashy of the tools we use, it’s had the biggest impact. We use Confluence to collaboratively maintain information about all of our data sources and each of the fields they contain. This storage and unification makes it easy for someone to quickly find the piece of information they care about. Any employee can see its expected values, historical notes, example queries and more.
We’ve tried to create a culture where making small, incremental changes is the norm. Flagging items as “to-do” or requesting help from others is totally fine. Although we may never have complete documentation due to our continually expanding data set, we have steady progress.
Before employees can begin using our data to answer questions, they first have to be familiar with what it says. We’ve used Apache Airflow extensively to produce automated reports that we share daily or weekly in different Slack channels. Not only does this provide additional context to those conversations (and lead to more interesting questions), but it also helps team members get a good feel for the normal patterns that occur in our data.
While good documentation is helpful for those who work directly with the data, it may not be enough for those who are more focused on the information that the data provides. We use Shiny to quickly build and deploy internal, self-serve apps that combine proper use of our data, statistical analysis and domain expertise. This resource empowers anyone in the company to perform complex analyses.
Tell us about a specific win one of your teams saw as a result of having better access to data.
Overall, increasing access to data has produced a lot of wins, both big and small. Automated reports have alerted us to significant events, such as natural disasters (and recoveries!) in near real-time. Since we work with data from every mobile and fixed broadband network around the world, it’d otherwise be extremely difficult to catch everything. Similarly, because more people are familiar with our data, they notice when interesting things happen and add new perspectives.
Documentation and self-serve apps have allowed people from all corners of the company to get answers to their questions, which has decreased both their turnaround time and the load on our data scientists, analysts and engineers.
Because all of our complex knowledge and information is available in a centralized place, new employees are able to onboard much more quickly. They require less help and assistance from others. Ookla spans many time zones, so not having to wait to synchronize with others makes a big difference.