Picture a normal triangle in your head. Now picture a triangle without its area. Can you? In mathematics, a degenerate triangle is defined as follows:
A degenerate triangle is formed by three collinear points. It doesn’t look like a triangle; it looks like a line segment.
It’s as if the three angles of the triangle were flattened. In theory, the triangle exists, but all that remains is a set of overlapping segments on a single line.
While this essay is not about mathematics, it will attempt to classify certain A/B testing methodologies as degenerate: patterns that cross the line into user deception, also known as dark patterns. Then we will review questionable large-scale tests in the tech industry. Finally, we will conclude with recommendations for avoiding this deception by making clear statements to our users.
Before we begin, let’s discuss what A/B testing is.
A/B testing is grounded in statistics and experimental psychology. There is a control, “A,” and a variation, “B.” The author of a test serves these two variants to a controlled percentage of users for a period of time in the product under test. The percentage and the groups of users can be controlled through targeting.
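To make the mechanics concrete, here is a minimal sketch of deterministic variant assignment; the hash function and the 50/50 split are illustrative assumptions, not any particular vendor’s API. Hashing the user ID means the same user always lands in the same bucket across sessions.

```typescript
// Minimal sketch: deterministically assign a user to "A" or "B".
// The hash function and the 50/50 split are illustrative assumptions.
function hashUserId(input: string): number {
  let hash = 0;
  for (const char of input) {
    hash = (hash * 31 + char.charCodeAt(0)) >>> 0; // keep it unsigned 32-bit
  }
  return hash;
}

function assignVariant(userId: string, experiment: string): "A" | "B" {
  // Salting with the experiment name keeps buckets independent
  // across experiments for the same user.
  const bucket = hashUserId(`${experiment}:${userId}`) % 100;
  return bucket < 50 ? "A" : "B"; // 50% control, 50% variation
}
```

Targeting, in this sketch, amounts to deciding which user IDs are eligible before `assignVariant` is ever called.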
Once the A/B test is complete, the data is analyzed. The analysis lets the team steer the product in a direction supported by the results. From these data points, the team infers movement in performance indicators such as customer retention, growth, call-to-action response, revenue, and conversion, or discovers new indicators altogether.
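As a sketch of what that analysis can look like in practice, here is a two-proportion z-test comparing conversion between the groups; the function and all of the counts are invented for illustration, not a reference implementation.

```typescript
// Two-proportion z-test: is B's conversion rate significantly
// different from A's? All counts below are hypothetical.
function zTest(
  convA: number, totalA: number,
  convB: number, totalB: number
): number {
  const pA = convA / totalA;
  const pB = convB / totalB;
  const pooled = (convA + convB) / (totalA + totalB);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / totalA + 1 / totalB));
  return (pB - pA) / se; // |z| > 1.96 is significant at the 5% level
}

console.log(zTest(120, 2400, 156, 2400).toFixed(2)); // ≈ 2.23
```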
Traditionally, A/B testing alters the user experience, visual design, or customer journey of the targeted system to reduce user friction and maximize engagement. Most tests stay on the surface, where colors, layouts, or navigation flow and behavior are altered.
A/B testing has gained a particular reputation. Consultancies have formed businesses around the discipline, just as they did around search engine optimization (SEO). SEO gained popularity in the 2000s, and A/B testing followed soon after. But while SEO is beholden to secret algorithms under the direct control of companies like Google, A/B test logic is owned by the authors who execute the tests to meet goals for their stakeholders.
Data collection and tooling have improved dramatically over the past few years, and the authors of A/B tests now have the power to walk a fine ethical line. Let’s explore these practices by defining high-level dark patterns, starting with the first classification: platform tools.
Anyone who has used an Android or iOS application knows what a notification badge is. Notification badges are visual indicators on the app icon that cue the user to important events. The badge is a brightly colored circle containing a count of the notifications waiting.
Imagine we push a badge with a deceptive count that represents no real notification activity. Even logged-out users receive badges without any meaningful events behind them. The badge is engagement bait.

Targeted users will open the application to find out what was so urgent. They’ll engage many times, and their mere presence increases the chances of a measurable outcome.

On mobile platforms, the notification badge is one of many routes for an interesting A/B test. The platform tools vary, but each platform has high-level engagement indicators that users know and trust. Users are wired to respond and engage.
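Here is a hypothetical sketch of the badge-bait variant on the server side. The payload shape loosely mirrors the APNs `aps.badge` convention, but the inflation logic and the floor of three are invented for illustration.

```typescript
// Hypothetical sketch: the "B" variant inflates the badge count
// even when nothing meaningful is waiting for the user.
type Variant = "A" | "B";

interface PushPayload {
  aps: { badge: number };
}

function buildBadgePush(variant: Variant, realUnreadCount: number): PushPayload {
  // Control sees the honest count; the variant sees engagement bait.
  const badge = variant === "B" ? Math.max(realUnreadCount, 3) : realUnreadCount;
  return { aps: { badge } };
}

console.log(buildBadgePush("B", 0)); // { aps: { badge: 3 } }
```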
Next, we will fake things.
Of course, many parts of a digital product haven’t been fully built yet. Still, we want to increase engagement in some way.

Recently, a group of users navigated a site looking to watch a particular movie genre. As the users typed into the search box, they were steered to partial matches.
The users engaged with the content.
We can fake the final polished product, including search results. Instead of investing heavily in the perfect feature, we assemble the results into simple categories based on previous research.
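A sketch of that faked search, assuming a hypothetical hand-assembled catalog in place of a real search backend:

```typescript
// Hypothetical sketch: steer a query toward partial matches grouped
// into categories assembled from previous research.
const catalog: Record<string, string[]> = {
  "film noir": ["The Big Heat", "Laura"],
  "noir thrillers": ["Blood Simple"],
};

function fakeSearch(query: string): Record<string, string[]> {
  const results: Record<string, string[]> = {};
  for (const [category, titles] of Object.entries(catalog)) {
    // A crude partial match stands in for a real search engine.
    if (category.includes(query.toLowerCase())) {
      results[category] = titles;
    }
  }
  return results;
}

console.log(fakeSearch("noir")); // both categories partially match
```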
Now, let’s test with scarcity.
Manipulating critical numbers such as item stock, content counts, and ratings is well known in consumer marketing. While these tricks work temporarily, they will backfire.

We have the power to adjust numbers to drive engagement, up to a point.
Recently, a group of users had been shopping on a website, and these customers were experienced with commerce sites.

Let’s say that one of the items the customer absolutely needed was in stock, but marked “only one left.” Of course, scarcity will motivate the user to act, purchasing in a timely manner.

The next day, the user returned to the site to research a similar purchase and found sixteen of the item available. Trust erodes quickly.

This idea can be extended to whatever the product desires, and it infringes on consumer protections. Low storage space, stock counts, ratings, and many other figures are adjustable. If your team owns a video platform, “expiring” content is another angle. Finally, anything expressed as a number can be rounded.
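A sketch of the scarcity variant, with an invented clamp to “one left”:

```typescript
// Hypothetical sketch: the "B" variant lies about availability by
// clamping the displayed stock, whatever the real inventory says.
type Variant = "A" | "B";

function displayedStock(variant: Variant, realStock: number): number {
  if (variant === "B" && realStock > 1) {
    return 1; // the dark pattern: "only one left"
  }
  return realStock; // control sees the truth
}

console.log(displayedStock("B", 16)); // 1 -- and trust erodes the next day
```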
Next, let’s test by giving away.
Sometimes, giving away or taking back major parts of the platform can manipulate engagement.

Consider a group of customers on a freemium application that offered paid content behind a paywall.

Access became “unlocked,” and a group of users could view the content for free, without the paid subscription. Then time passed, and the access was gone.

Some users liked the content, and so they purchased the product.

Opening the gates in an A/B test is a potential opportunity to engage users. The process here is to select identified groups of users and unlock access for them.

Additionally, moving or removing features unexpectedly will stimulate engagement and return visits. If a service users rely on goes missing, they will return frequently until it comes back.
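A sketch of the temporary-unlock test; the cohort flag and the window cutoff are hypothetical parameters:

```typescript
// Hypothetical sketch: the test cohort sees paywalled content
// unlocked for a limited window, then loses access again.
type Variant = "A" | "B";

function canViewPremium(
  variant: Variant,
  hasSubscription: boolean,
  now: Date,
  windowEnd: Date
): boolean {
  if (hasSubscription) return true; // paying users always have access
  // Only the "B" cohort gets the temporary unlock.
  return variant === "B" && now < windowEnd;
}
```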
Finally, we can hide things.
One final example is the creative use of information. Here, we conduct an A/B test that manipulates information relevant to the users at a given time and context. We either shift the information upstream or dilute it. We adjust the friction of information to obtain a result; sometimes a lack of information is that friction.

An example is a service that relies on the live location of the available livery vehicles in an area. Showing more of them, perhaps including cars that are parked or on break, will deliver engagement.

We alter information, giving it or taking it away at measured points, to direct the user toward a desired action. Taking it a step further, information manipulation can be targeted at identified groups of users.
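A sketch of that supply-inflation variant, with an invented fleet model:

```typescript
// Hypothetical sketch: the "B" variant pads the map with vehicles
// that are parked or on break, inflating the apparent supply.
type Variant = "A" | "B";

interface Vehicle {
  id: string;
  available: boolean;
  onBreak: boolean;
}

function visibleVehicles(variant: Variant, fleet: Vehicle[]): Vehicle[] {
  return fleet.filter((v) =>
    variant === "B" ? v.available || v.onBreak : v.available
  );
}
```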
This wraps up the pattern examples. Urgency, faking, scarcity, giving, and hiding are divisive, to say the least. Are they unethical?
In mathematics, the degenerate triangle falls under the overarching term of degeneracy:
A degenerate case is one where an element of a class of objects is qualitatively different from the rest of the class and hence belongs to another, usually simpler, class. Degeneracy is the condition of being a degenerate case.
The degenerate triangle is deceptive, and so are the patterns defined above. All of these tests can be run without the user ever understanding that anything happened and without platforms catching the authors in the act. These A/B tests are a degenerate class of the typical e-commerce tricks of urgency, scarcity, and human fallibility. The techniques are, in most cases, harmless, but they act on deception. The question posed here: when a human is unknowingly participating, is it ethical to run these tests?
The question has already been answered by large technology companies that published their results.

Some years ago, both Meta and OkCupid deceived their users by running controversial A/B tests around content engagement and companion matching, then posted the results publicly. Opponents argued that manipulating emotions and deliberately matching incompatible companions was wrong. This started a long chain of responses from the community.
One excellent research paper focused on the fallout from these tests. Raquel Benbunan-Fich dubbed this class of tests C/D experimentation:
This is a deep form of testing, which I propose to call Code/Deception or C/D experimentation to distinguish it from the surface level testing associated with A/B testing.
Are the tests ethical? Well, the debate continues today.
The position of this author is to communicate with your customers. Let me suggest ways in which we can state our testing code of conduct.

While the industry rolls out and scales A/B testing to its advantage, I’d like to define how we can avoid these types of C/D tests.
Here are a few ideas that are centered around communicating the use of live user testing.
- Clearly specify an A/B testing section in the Terms of Service
- Publish an ethical statement and guiding principles for A/B testing
- Visually indicate when an A/B test is being performed, with an opt-out
The messaging is up to the stakeholders and should be in plain language. The boundaries of testing must be clear.
For item three, test evangelists will agree that such an indicator would disrupt the results and invalidate the test: in either “A” or “B” mode, the author needs the user to act as normal to measure correctly.

While that might be true, we would want to offer the option at an appropriate time, such as in user preferences or at sign-up. What we ultimately want to measure and satisfy is the user’s real concern: the product’s value.
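A sketch of what honoring that opt-out could look like; the preference field and the assignment hook are hypothetical:

```typescript
// Hypothetical sketch: check the user's opt-out preference before
// any experiment assignment happens.
type Variant = "A" | "B";

interface UserPrefs {
  optOutOfExperiments: boolean;
}

function enroll(prefs: UserPrefs, assign: () => Variant): Variant | null {
  if (prefs.optOutOfExperiments) {
    return null; // opted-out users always get the default experience
  }
  return assign();
}
```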
In the future, government agencies will define C/D testing limitations as public knowledge increases and problematic tests are uncovered. Companies have already been caught performing highly degenerate tests that few of us could fathom.

What would be the consequence of being deceptive? At this time, there are no clear examples of a line being crossed and punished. What we do have is the power to choose: whether to associate with an authoring team that knowingly deploys these questionable tests, or to opt out of such deceptive practices ourselves.
Time will tell how deceptive testing shakes out. Data, tooling, and their alignment will continue to improve testing. And increased public awareness will spur hard questions.
If there is one piece of lasting advice: try not to deceive... too much.
If you are interested in more information on the subject of dark patterns, check out https://darkpatterns.org/.
Thanks to Hazem Saleh and Dan Leonardis.