Archive for September, 2009
How to keep visual design consistent while A/B testing like crazy

If you don’t watch out, after a couple months of A/B testing, your product will end up looking like Las Vegas!
Why A/B testing and visual design come into conflict
It’s great to build consistent A/B testing into your product process, but it then becomes even harder to keep a consistent visual design while doing test after test. This tension comes from the fact that A/B tests push you towards local maxima, making the particular section of the page you’re testing high-performance, but at the expense of the overall experience. As a result, there’s a lot of temptation to “hack in” a new design, the way that software engineers “hack in” a feature – but this is a short-term fix at best. This often means adding a bold, colored link to the top of the page with “NEW!” or adding yet another tab. These are all band-aid solutions: once you get to the next set of features, a design with 100 tabs doesn’t scale.
Each of these competing features, taken by itself, moves the needle positively. However, there isn’t a great way to measure the gradual “tragedy of the commons” effect on the overall user experience. Each new loud page element competes with all previous page elements, and must be louder as a result – this leads to the Vegas effect that many Facebook apps end up with.
To really solve this problem, you need a central design vision – there’s no way around that. It also helps a lot to have a flexible design that embraces A/B testing – you can work with your designers to make this happen through modular, open elements.
Closed designs make it hard to add or remove content
Let’s look at a particular example – this might be a standard element on a page for a video or similar content:

It looks nice, but it is also tremendously sensitive to its content, and its inflexible design makes it hard to test new content. To be more specific, ask yourself the following questions:
- If you wanted to add a comments count in addition to views and votes, how would you do that?
- What happens when the views number gets beyond 10,000?
- What if you wanted to add favorites, or flagging for inappropriate content?
- If we decided to hide the thumbs down, how would this visually balance?
- If we wanted to fit more thumbnails onto a browse page, how easy is it to shrink the main thumbnail?
- etc.
The above design is an example of a “closed” design where everything fits just right, but makes it very difficult to add or remove elements. There’s an exact balancing of all the parts of the element, which makes it very sensitive.
Many of the solutions to these questions require building out new pieces next to the element, which throws it off balance. Thus, if the above were used in an A/B test, the visual look would be immediately ruined.
Open designs that are A/B test-friendly
Let’s compare this to the elements below, which have a more modular design that can scale vertically:

The above elements don’t have the same “just right” visual appeal, but make it much easier to add and remove content. The key design decision is to add multiple bands of content which can be grouped together and extended vertically. Ideally, you would never end up with a repeating tile of 4-buttons and 3-stats, but you could certainly test it much more easily than with the closed design.
Here are some of the variations that can easily be tested:
- Switch the title section and the stats/buttons sections
- Add and remove buttons (or no buttons!)
- Add and remove stats (or no stats!)
- Combine price tags with other stats
- Try different buttons
- etc.
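To make the list above concrete, here is a minimal sketch of how an open element could be described as data, so that every variation is just a different combination of bands. The band names, stat names, and button names are all hypothetical, picked only for illustration.

```python
from itertools import product

# Hypothetical options for each band of an "open" thumbnail element.
BAND_ORDERS = [
    ("title", "stats", "buttons"),   # title section first
    ("stats", "buttons", "title"),   # stats/buttons sections first
]
STAT_SETS = [(), ("views", "votes"), ("views", "votes", "comments")]
BUTTON_SETS = [(), ("thumbs_up",), ("thumbs_up", "thumbs_down", "favorite", "flag")]


def build_variants():
    """Return every combination of band order, stats, and buttons."""
    return [
        {"order": order, "stats": stats, "buttons": buttons}
        for order, stats, buttons in product(BAND_ORDERS, STAT_SETS, BUTTON_SETS)
    ]


def render(variant):
    """Return the bands to draw, top to bottom, skipping empty bands."""
    content = {
        "title": ["title"],
        "stats": list(variant["stats"]),
        "buttons": list(variant["buttons"]),
    }
    return [(band, content[band]) for band in variant["order"] if content[band]]


variants = build_variants()
print(len(variants))  # 2 orders * 3 stat sets * 3 button sets = 18
```

Because an empty band simply disappears from the stack, “no stats” or “no buttons” is a valid variant rather than a hole in the layout, which is the whole point of the open design.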
Following an open design on page elements enables substantial A/B testing within some flexible constraints. Now you may still be tempted to do something crazy like big hover overlays, <BLINK> tags, and other stuff, but at least you can make it easy to test a wide variety of low-hanging fruit. It also lets the owner of the overall visual design maintain a central “style guide” while still offering enough flexibility to keep people creative.
This same idea of open designs with horizontal bands of content can be applied to whole pages too – let’s examine a page from the king of A/B testing, Amazon.com.
Open page layouts
From the snapshot below, you can see that Amazon groups the center column of content into bands – each element has a title explaining what it is, a list of items, and a navigation link to see more. This is also true of the item detail pages, which use a similar grouping to show everything from similar books to reviews to other elements. These pages can get very long, but because most of the content is below the fold, it’s easy to get away with.
I’ve been told that this modular design enables Amazon to take a “King of the Hill” approach to testing each horizontal band of content against each other. Different software teams will create different kinds of navigation and recommendation, and if it causes people to click through to buy, then it floats up higher in the page. This systematic A/B testing is much more easily enabled when there’s the design flexibility for that sort of thing.
Here’s a snapshot for a reminder of what this looks like:

While you may argue that Amazon’s design is cluttered and actually sucks, this approach lets them be very experimental in pushing out features. It makes it very low-cost to implement a new recommendations approach and try it out without needing to figure out how to design it into the UX.
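I have no inside knowledge of how Amazon actually implements this, so treat the following as a rough sketch of the “King of the Hill” idea only. It assumes each horizontal band is tracked with an impression count and a purchase count, and that bands with enough data are ordered by conversion rate. The `Band` class, the `rank_bands` function, and the `min_impressions` threshold are all made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Band:
    """One horizontal band of content, with hypothetical tracking counts."""
    name: str
    impressions: int
    purchases: int

    @property
    def conversion_rate(self) -> float:
        return self.purchases / self.impressions if self.impressions else 0.0


def rank_bands(bands, min_impressions=1000):
    """Order bands top to bottom.

    Bands with at least min_impressions are sorted by conversion rate,
    highest first. Bands with less data keep their original order below them.
    """
    proven = [b for b in bands if b.impressions >= min_impressions]
    unproven = [b for b in bands if b.impressions < min_impressions]
    proven.sort(key=lambda b: b.conversion_rate, reverse=True)
    return proven + unproven
```

A real system would also want a statistical test before reshuffling the page, since a raw conversion rate on small samples is noisy. The threshold here is only a crude stand-in for that.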
What’s next? Modular user flows?
Of course, if you can take a modular approach to scaling individual page elements or entire pages, the next question is whether you can take this approach to user flows.
I’ve never seen anyone do this, but this is how it might work:
- Any linear user flow is identified in a product (like new registration, payment flow, etc)
- This flow might be 1 page, or broken into N pages
- Similarly, every individual page might have a bunch of fields (like photo, about me, etc.)
- As part of the A/B testing process, you might want to drop a new page (or new fields) into the flow
- Then an optimization process shuffles pages throughout the flow to identify the best page sequence and page-by-page configuration
You might imagine something like this could be a very powerful process as it would allow you to identify whether you should offer a coupon pre-transaction or post-transaction, or on any given page, where an input field should be placed.
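Since I’ve never seen this built, here is only a small sketch of what the shuffling step might look like, assuming a flow is just an ordered list of page names and that you record how many users started and completed each ordering. The function names, page names, and counts are hypothetical.

```python
from itertools import permutations


def candidate_flows(pages, pinned_last=None):
    """Return every ordering of pages, optionally pinning one page to the end."""
    movable = [p for p in pages if p != pinned_last]
    tail = [pinned_last] if pinned_last is not None else []
    return [list(order) + tail for order in permutations(movable)]


def best_flow(results):
    """Pick the flow with the highest completion rate.

    results maps a flow (tuple of page names) to (started, completed) counts.
    """
    def completion_rate(item):
        started, completed = item[1]
        return completed / started if started else 0.0

    flow, _ = max(results.items(), key=completion_rate)
    return flow


# Coupon before or after the payment page, with the confirmation page pinned last.
flows = candidate_flows(["coupon", "payment", "confirmation"], pinned_last="confirmation")
```

Note that the number of orderings grows factorially with the number of pages, so in practice you would pin most pages in place and only shuffle the one or two you are curious about.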
For those who want to know more, I have written a bunch more about A/B testing here.
Want more?
If you liked this post, please subscribe or follow me on Twitter. You can also find more essays here.
Netflix on their Freedom and Responsibility culture
Pretty fascinating slides – it’s 128 pages, but worth flipping through – see below. Those on RSS feeds, you can find a link to the presentation here. I found this via Bob Sutton, who writes:
This slideshow was on a number of blogs over the summer (see here), but I wanted to make sure that everyone saw it and, frankly, to get a post here so I have a record of it. Apparently, Reed Hastings, the amazing CEO of Netflix, put up a set of 128 slides that is a “reference guide to our freedom and responsibility culture.” I realize that most 128 page slide decks are deadly dull, but this is an exception. You may not agree with all their values and approaches, but on the whole I think you will be fascinated by the detail and thought. Now, I have no inside knowledge of what it is like to work at Netflix, but if this is accurate, it is a pretty impressive company – frankly far more enlightened than most in Silicon Valley.
Why low-fidelity prototyping kicks butt for customer-driven design

Low-fidelity prototyping versus high-fidelity prototyping
In my discussions with designers, one of the interesting recurring conversations is the tools and process they use to prototype and mock up experiences. In particular, there’s a lot of divergence on how high or low-fidelity to go with a prototype.
For designers that primarily come from agency backgrounds, I’ve found that there’s an emphasis on quickly getting to a near pixel-perfect mockup, and the variations are minor in detail. In that worldview, the ideal deliverable is a single version of something that feels high-quality and gets minor feedback from clients. In the client-agency model, if you give your clients a bunch of rough mockups that seem low-fidelity, then you risk looking unprofessional. Or worse yet, you might get a ton of diverging comments that you then have to work out and iterate – in some cases that’s the last thing you want to do.
This is an especially bad fit for companies that focus on products delivered to customers – in that case, you mostly want your product to be the right one, no matter how many iterations it takes. As a result, low-fidelity prototyping can be really useful because it aids an iterative, customer-focused approach rather than one where the Great Designer comes up with something directly from his brain.
Here’s a couple of the main advantages:
- Get better and more honest feedback
- It’s great for A/B testing
- Make the cost of mistakes cheap, not expensive
- Refine the page flow, not the pages
- Figure out the interaction design rather than the visual design
In addition, after I’m done arguing my point, I’ll recommend some of the tools that have been useful for me in doing low-fidelity prototyping.
Get better and more honest feedback
The first time I really understood the power of low-fidelity prototyping was when I started doing usability tests on consumer products I had built (this was years ago). Initially, I wondered why anyone did paper prototyping. I immediately concluded that it was due to a deficiency of many designers: they couldn’t write code, and thus couldn’t do HTML mockups of the products they wanted to build.
But once I started getting people to view and interact with my prototypes, I realized that one of the big problems was that people didn’t give good feedback when the prototype you present to them is too perfect. Rather than telling me about the really high-level things, like “does the value proposition make sense?” instead they would focus on colors, fonts, the layout of the page, etc. And furthermore, they didn’t feel that they could really jump in and build on top of the ideas you showed them, because it was far beyond their capability to duplicate.
Compare this to a simple exercise where you are using hand-drawn cards or drawing paper and are literally sketching stuff out during a customer interview – you’re much more likely to try something out, and have the person you’re interviewing grab your pencil and say “no, more like this!” And that’s exactly the kind of interaction you want.
It’s great for A/B testing
As far as a metrics-driven approach goes, you have to remember that techniques like A/B testing fundamentally thrive off of variety. In particular, they thrive off of variety at the UI layer, where many small UI changes that cost very little technically can be tried out and optimized. As a result, you don’t want 2-3 pixel-perfect mockups; you actually want 10 or 20 rough mockups from which you can select only the most high-variance candidates.
Some of the highest variance stuff has to do with changing the order in which you do things, or opt-in versus opt-out, or richer AJAXy interactions. These are all things where it’s easier to generate many candidates through low-fidelity prototypes since you’re often looking at things from a systems level.
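Once you have that many candidates, you need a way to split traffic across them. One common way to do this (a sketch, not the only option) is to hash the user ID together with an experiment name, so each user lands in the same variant on every visit without storing anything. The `assign_variant` function and the experiment and variant names below are hypothetical.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, variants: list) -> str:
    """Deterministically map a user to one variant of an experiment.

    Including the experiment name in the hash means the same user can land
    in different buckets for different experiments.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % len(variants)
    return variants[bucket]


# Twelve rough mockups that made it into a (hypothetical) signup-page test.
mockups = [f"mockup_{i}" for i in range(12)]
chosen = assign_variant("user-42", "signup_page", mockups)
```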
Make the cost of mistakes cheap, not expensive
One of the hidden benefits of having a low-fidelity prototyping process is that it makes changing directions much easier, which naturally facilitates a collaborative design discussion. When you’re using a customer-driven product philosophy that incorporates a lot of outside metrics and qualitative feedback, you’ll probably get multiple people involved in the design process. If the design is done by one person or a small group, and is polished significantly before it reaches the greater group, it discourages collaboration. It’s very hard for people not to get defensive when they’ve spent a lot of time polishing something only for it to get changed significantly. Using a low-cost process makes it so that you can try a lot of variations cheaply, without any of the emotions involved.
Refine the page flow, not the pages
One of the highest-leverage design decisions you can make is not about the look of an individual page, but what happens before and after it. For example, you can take a multi-step process and condense it onto one page, or change the ordering so that the user does something and then registers, rather than the other way around. These kinds of design decisions ultimately focus on the order and flow of the user, rather than the look or interactions of any specific page. If you go with a low-fidelity approach, it’s easy to draw lots of small pages and link them up in a flow, cross pages off, change the ordering of a funnel, and do lots of other things that feel natural when the prototype is very rough. Otherwise, it’s too easy to get caught up in the details of the “right” way something works without exploring the options.
Work your way up from low-fidelity to high-fidelity
Of course, you want to make sure that you use the right process for the right job. So there’s nothing wrong with high-fidelity prototyping, especially when you are in the later stages and thinking about issues like branding, look and feel, and all those other details. One way to keep this process going is to have multiple rough prototyping checkpoints so that design decisions are constantly getting refined – maybe the first step is a sketch on a paper, the next is a rough mockup on the computer, then a detailed mockup, then a rough built-out version, and then iterate to the final product. These steps make it so that all the design decisions are well understood, refined, and debated all the way through.
Tools I recommend
Finally, a couple recommendations on tools for paper prototyping:
- Number 2 Pencil
- Giant art pad for drawings – you can get these at an art shop or office store
- Balsamiq (check out the video on the linked page)
- Adobe Fireworks (formerly Macromedia Fireworks)
In general, nothing beats pencil and paper, but that’s just me.
I’ve been told that for people who aren’t comfortable drawing, using tools like Balsamiq helps quite a bit.
