Becoming an Experimentation Organization

The more I read about online experimentation, the more I realize that it represents a paradigm shift in the way that companies do business. When I first heard about A/B testing, I assumed that it was just a limited-purpose tool for tweaking designs. What I didn’t understand at the time is that controlled experimentation is on the rise because companies have realized that the scientific method is broadly applicable to business decision-making. When fully adopted, the experimentalist approach leads to a categorically different way of running a company.

Stefan Thomke’s book Experimentation Works: The Power of Business Experiments caught my attention because it gives experimentation the breadth of treatment that it deserves – it recognizes that experimentation, when fully adopted, is a game-changer that opens the door to a new kind of company. The book is intended as a how-to for managers interested in increasing their organization’s maturity with respect to experimentation capabilities. A lot of emphasis is placed on how companies might need to reinvent themselves in order to take full advantage of experimentation.

I found the book to be most compelling in its discussions of the benefits of experimentation, and in its recommendations on how to better incorporate experimentation into the standard operations of a company. This blog post summarizes the ideas in the book that I found most useful.

The Benefits of Experimentation

According to the author, companies should embrace experimentation because it’s a cost-effective way to decide on the best course of action in the face of uncertainty. Suppose for a moment that a company has a potentially promising idea, but is unsure about how well it will work. Testing that idea with an experiment would be valuable, regardless of whether the results end up showing that the idea is likely to be successful or unsuccessful. If the results suggest that the idea will be successful, then that allows the company to justify committing more resources to pursuing the idea. On the other hand, if the results suggest that the idea will be unsuccessful, then that allows the company to pull back and explore alternative ideas. In this way, both successful and unsuccessful experiments produce information that is useful in guiding next steps. Experiments are only wasteful if they are poorly designed and lead to inconclusive results.

The author emphasizes that experimentation remains valuable even if a company’s product has extensive analytics that capture large amounts of data. Observational data is problematic because, unlike data from a controlled experiment, it is purely correlational and doesn’t support an inference of causality. The caveat that correlation does not imply causation is particularly important to remember in a business context. Most companies operate in a “high causal density” environment, which means that a large number of interacting variables could potentially influence any outcome of interest. Controlled experiments isolate each independent variable from confounding variables, which lets us formulate theories of causation that we can’t get from analytics data.
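
To make the confounding problem concrete, here is a small simulation sketch in Python. Everything in it is hypothetical: the `simulate` function, the two user types, and all of the rates are my own assumptions for illustration, not figures from the book. The feature has no real effect on conversion, yet the observational comparison makes it look helpful, because heavy users are more likely both to adopt the feature and to convert.

```python
import random


def simulate(n_users: int, randomized: bool, seed: int = 0) -> float:
    """Return conversion rate with the feature minus conversion rate without it.

    The feature has NO true effect in this toy model; only being a heavy
    user changes the chance of converting. All rates are made up.
    """
    rng = random.Random(seed)
    converted = {True: 0, False: 0}
    total = {True: 0, False: 0}
    for _ in range(n_users):
        heavy_user = rng.random() < 0.5
        if randomized:
            # Controlled experiment: a coin flip decides who gets the feature.
            has_feature = rng.random() < 0.5
        else:
            # Observational data: heavy users opt in far more often.
            has_feature = rng.random() < (0.8 if heavy_user else 0.2)
        # Conversion depends only on user type, never on the feature.
        converts = rng.random() < (0.30 if heavy_user else 0.10)
        total[has_feature] += 1
        converted[has_feature] += converts
    return converted[True] / total[True] - converted[False] / total[False]


print(f"observational gap: {simulate(20_000, randomized=False):+.3f}")
print(f"randomized gap:    {simulate(20_000, randomized=True):+.3f}")
```

With these made-up rates, the observational gap comes out at roughly 12 percentage points, while the randomized gap hovers near zero. Random assignment breaks the link between user type and feature exposure, which is exactly what analytics data alone cannot do.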

Experimentation also opens the door to a product development approach known as “high-velocity incrementalism”. Under this approach, companies progressively refine their product by running a series of iterative experiments that each focus on some small improvement. For example, large tech companies can test tens of thousands of candidate design changes per year on their flagship product. Only a small fraction of those end up proving successful, and those that do typically only generate a fraction of a percent of lift in some metric. Despite the small chance of success, and the small relative benefit of each success, all of those tiny improvements nevertheless compound to produce a large cumulative effect on the companies’ bottom lines.
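
The compounding arithmetic is easy to check with a short sketch. The function name `cumulative_lift` and the input numbers are hypothetical, chosen by me rather than taken from the book, and the calculation assumes that each winning change multiplies the metric independently of the others, which real products will only approximate.

```python
def cumulative_lift(tests_per_year: int, win_rate: float, lift_per_win: float) -> float:
    """Return the compounded relative lift from one year of winning tests.

    Assumes each win multiplies the metric by (1 + lift_per_win) and that
    wins do not interact with one another (a simplifying assumption).
    """
    wins = round(tests_per_year * win_rate)
    return (1 + lift_per_win) ** wins - 1


# Hypothetical inputs: 1,000 tests, 1 in 10 succeeds, each success adds 0.2%.
print(f"{cumulative_lift(1_000, 0.10, 0.002):.1%}")
```

Under those assumptions, 100 wins of 0.2% each compound to a lift of roughly 22%, even though no single change would be noticeable on its own.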

Experimentation isn’t just for small incremental changes, though. It’s equally well-suited for companies attempting to innovate with big, radical changes. Innovations, by virtue of their novelty, often go against conventional wisdom and accepted practices, making stakeholders skeptical and reluctant to buy in. As another complicating factor, if an innovation is truly novel, then there won’t be much data available about it. Running in-house experiments allows companies to collect data that can demonstrate the value of an innovative idea, which can then be used to persuade skeptical stakeholders to support the new idea.

Thomke makes convincing arguments for the numerous benefits of experimentation, but they can only be achieved if companies incorporate experimentation into their standard ways of working. In the next section, we’ll explore what that looks like.

Becoming an “Experimentation Organization”

Thomke uses the term “experimentation organization” to describe companies that test all new product changes through experimentation before general release. This is the gold standard that he encourages his readers to aspire to. He cites Booking.com as an example of this type of organization, saying that 75% of the company’s employees are involved in running experiments. To reach this level of maturity, there are some prerequisites that need to be in place: the majority of employees must have the incentives, tools, and know-how to independently run experiments.

According to Thomke, the biggest challenges associated with becoming an experimentation organization are usually related to people, rather than technology. Developing or installing the technology is usually the easy part; successfully integrating it into company culture and processes is the hard part.

Experimentation Has a Democratizing Effect on Company Culture

Experimentation changes company culture by making it less authoritarian and more democratic, and that usually requires a corresponding shift in management style.

When companies make decisions based on the opinions of their highest-paid individuals, there’s a sense in which they are authoritarian, because outcomes are determined by the rank of the people involved, rather than the strength of their arguments for taking a certain course of action. If the big boss calls the shots regardless of what the evidence says, then employees have little incentive to advocate for what they think is best. Conversely, leaders also have little incentive to collect evidence to support their intuitions.

On the other hand, when a company defers to experimental data over the opinions of the highest-paid people, decision-making becomes more participatory. This gives all employees the opportunity to make an impact, regardless of their rank, which can be very motivating for those who previously may have lacked decision-making power. This can also be somewhat confusing for managers, because it “upends their traditional role of translating executive direction into action”.

As a company starts to embrace experimentation, the role of the manager becomes to “create systems and a culture where [decisions are made] by fast-cycle experiments instead of by PowerPoint, politics, and position in the hierarchy.” As more decision-making becomes delegated to the teams running the experiments, the role of the manager becomes more about framing strategic goals, motivating employees, and ensuring that teams have the required tools and training to run experiments.

Encouraging Experimentation By Emphasizing the Value of Learning From Failure

Another tricky aspect of the cultural transition involved in becoming an “experimentation organization” is incentivizing employees to run experiments. According to the author’s research, employees run more experiments when managers stress the value of learning from failure and do not punish employees for testing a valid idea that didn’t end up working out. The vast majority of experiments fail, even in highly sophisticated companies. The author’s research shows that stigmatizing failures has a chilling effect on experimentation, particularly among less senior staff who lack organizational clout. Remember, the purpose of experimentation is to learn in the face of uncertainty, not to get extra confirmation about a hypothesis that we’re already certain is true. While an experiment can be wasteful if the hypothesis is poorly researched or the experimental design is flawed, a failure of the idea being tested doesn’t mean that the experiment was a waste. Failures are valuable because they allow companies to redirect their focus to more promising alternatives. Emphasizing the value of failure motivates teams to run more experiments, which is the goal.

Invest in Tooling to Make Experimentation Easy

To facilitate the adoption of experimentation across an organization, companies need to provide easy-to-use tooling that automates away some of the sticking points of experimentation. For example, most big companies that have developed their own in-house experimentation platform have a feature that automatically handles splitting users into a treatment group and a control group so that the groups are comparable along a wide range of parameters. Tools typically also have automated guardrails, which continuously monitor the effect of all live experiments on a company’s key metrics. If a negative impact is detected, the tool should be able to send push alerts or automatically roll back the experiment.
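
As a rough illustration of what such tooling does under the hood, here is a minimal Python sketch. It is not how any particular platform works, and the book doesn’t prescribe an implementation: hashing user IDs is simply one common way to get a stable, effectively random split, and the names `assign_variant` and `guardrail_breached`, along with the 5% threshold, are my own hypothetical choices.

```python
import hashlib


def assign_variant(user_id: str, experiment: str) -> str:
    """Deterministically place a user in 'treatment' or 'control'.

    Hashing the experiment name together with the user ID gives a stable,
    roughly 50/50 split that is independent across experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "treatment" if int(digest, 16) % 2 == 0 else "control"


def guardrail_breached(control_rate: float, treatment_rate: float,
                       max_relative_drop: float = 0.05) -> bool:
    """Return True if the treatment's key metric fell too far below control.

    This is a simplified threshold check; a real guardrail would also
    account for statistical noise before alerting or rolling back.
    """
    if control_rate <= 0:
        return False  # nothing meaningful to compare against
    return (control_rate - treatment_rate) / control_rate > max_relative_drop


print(assign_variant("user-123", "new-checkout-button"))
print(guardrail_breached(control_rate=0.10, treatment_rate=0.09))
```

Because the assignment is random with respect to user characteristics, the two groups end up comparable on average without anyone having to balance them by hand. The guardrail call flags the example above, since a drop from 10% to 9% is a 10% relative decline, which exceeds the assumed 5% limit.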

Training

In order to properly use the tools, companies need some in-house expertise in experimental design and statistics. For big companies, Thomke recommends two different levels of training: a basic level of training for all employees who run experiments, and then a more advanced level of training for experimentation specialists who work in a “center of excellence” and are available for consultations.

The basic level of training for general employees would cover topics like what constitutes a solid experimental design, and basic statistical techniques for analyzing experimental results. Employees would learn the basics of setting up an experiment, like what makes for a testable hypothesis, how to control for confounding variables, and how the required sample size can vary depending on the anticipated size of the effect. Employees would also learn the basics of interpreting experimental results, e.g., p-values and Bayes’ theorem.
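
To give a flavor of what that basic training might cover, here is a hedged Python sketch of two textbook calculations for conversion-rate experiments: an approximate sample size for comparing two proportions, and a two-sided p-value from a pooled z-test. The function names and the example numbers are my own assumptions, and these normal-approximation formulas are standard statistics rather than anything specific to the book.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(baseline: float, lift: float,
                          alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate users needed per group to detect a relative lift.

    Uses the normal-approximation formula for comparing two proportions.
    """
    p1, p2 = baseline, baseline * (1 + lift)
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil(z ** 2 * variance / (p2 - p1) ** 2)


def two_sided_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for a difference in conversion rates (pooled z-test)."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical 10% baseline conversion rate; smaller effects need far more users.
print(sample_size_per_group(baseline=0.10, lift=0.05))
print(sample_size_per_group(baseline=0.10, lift=0.20))
print(two_sided_p_value(conv_a=100, n_a=1000, conv_b=150, n_b=1000))
```

With a 10% baseline, detecting a 5% relative lift takes roughly 58,000 users per group, while a 20% lift takes fewer than 4,000. That gap is why the anticipated effect size matters so much when planning an experiment.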

The more advanced level of expertise would only be possessed by a small group of experimentation experts who work in a center of excellence. The purpose of this group would be to remain available for consultation about advanced topics for staff conducting experiments.

Red-Tape Removal

After employees have the tools and training, it’s important that they actually have the authority to run experiments without going through a lengthy approval process. The author emphasizes that employees should be able to run experiments without seeking authorization provided that certain minimum requirements are met. For example, the experiment must not deal with a legally sensitive matter, and there must be automated guardrails in place to shut off the experiment in case something goes haywire.

Key Takeaways

The book helped me to appreciate that fully embracing experimentation transforms the way that businesses are run. It changes the way that planning is done, it changes the way that decisions are made, and it changes the role of managers. The book opened my eyes to the fact that the biggest challenge in adopting experimentation is not the technology itself, but rather integrating it into company culture and processes.

Dan Elbaum's Blog