Canary testing is a technique for evaluating the impact of new features or products while minimizing risk. Features can be slowly rolled out to a small subsets of users and their effect on metrics measured. Users not included in this segment do not see the change and are therefore a control group.
Let’s look at an example:
- An e-commerce company wants to change their “Add to cart” button to say “Shopping cart”.
- They have an idea how this change will affect metrics like ‘add to cart’, ‘purchase event’, and ‘revenue’ but want to test to ensure the feature has the intended effect.
- The team ships the “Shopping cart” change to 5% of users in the US.
- Over time, metrics for the canary users are compared to those of the control group to determine the impact of the feature.
- If there is a positive impact, the team could increase the % of users receiving the change. If there is a negative impact, the feature can be iterated on or abandoned.
The term canary testing comes from the old practice of coal miners sending canary birds into mining tunnels as an alert for toxic gases. Tragically, the gas would kill the canary before miners and this would warn the miners to leave the tunnels.