Futuristic Play by @Andrew_Chen

Analysis on viral marketing, user experience, game design, and online ads

Are you new? Here's 5 steps to start exploring:

  1. View the "Best Of" list with 50+ essays on viral marketing, gaming, and ads »
  2. Get introduced: About this blog, why entrepreneurs and marketers recommend it »
  3. Receive updates by email or RSS feed or Twitter
  4. See the books I recommend to friends
  5. Write me an e-mail and let me know what you're up to!

Facebook viral marketing: When and why do apps “jump the shark?”

Comments

Excel spreadsheet download
For those of you who are interested in the gory details, please download the following spreadsheet here:

Viral and Retention Excel Model (Click to download)

Math warning!
This blog post will be a little more technical than usual, so I apologize to those of you who are bored by this. Anyway, let’s get started.

See this image before? Many would describe that as, EPIC FAIL ;-)

That’s what happens when you "jump the shark" and your app goes from successful to completely not successful. Why does this happens? This blog post is to dissect that exact issue.

Modeling user acquisition
First off, let’s look at some ways to model user acquisition. For those of you with the spreadsheet, this is the second tab. You first start with a couple constants:

  • Invite conversion rate % = 10%
  • Average invites per person = 8.00
  • Initial user base = 10,000
  • Carrying capacity = 100,000

(note that these are just example numbers)

To understand how these constants work, you basically want to think about how viral marketing works. What happens is that you start out with an initial userbase (=10k), and every time your userbase grows, each user ends up sending out invites (=8.00), which then have a specific conversion rate (=10%).

That means that in the first time period, you have 10k. In the second time period, you get 10k*8*10% more users, which equals 8k more users, who are the next round of users who send invites. Then in the third time period, it’s 8k*8*10%, and so on. Note that the new batch of users needs to exceed the previous batch, in order to "go viral." That ratio is often referred to as the viral coefficient. In fact, here’s the equation for this unbounded viral equation:

u(t) = u(0) * (1 + i * conv)^t
where u(0) = 10k, i = 8.00, conv = 10%, and t is the # of time periods

However, note that this assumes that your "carrying capacity," that is, how many users are in the total network, is unlimited. However, on Facebook, that’s not true - once you burn through the 60 million new users, then you don’t have any left. Similarly, it doesn’t reflect the reality that as you saturate the network, your invites may end up going towards people who have already evaluated or installed your app, and they are unlikely to install it again.

A simple model for network saturation
Thus, one simplifying assumption is that as you saturate the network, the conversion rate on your invites goes down. In one possible model, you’d argue:

  • If you have installs on 0% of the network, then your natural conversion rate (10%) holds
  • If you have installs on 50%, then your natural conversion rate is discounted 50%, which equals 5%
  • If you have installs on 99%, then your natural conversion rate is discounted 99%, and etc.

Note that you might even argue that this is an optimistic view. You might argue, for example, that the "discount" on your conversion rate should be related to the total % of the userbase that’s been invited, not the total % that’s installed something.

In that version, if someone hates your app and doesn’t want to install it, it’s unlikely that they will ever install it. In the version I’m describing, the only people who won’t install your app are the people who have already done so.

To describe this mathematically, you might say that at each point, there’s an "adjusted conversion rate" which looks like:

adjusted conversion rate
= natural conversion rate * saturation %
= natural conversion rate * (current installs / total Facebook population)

so if you agree that’s true, then you can combine the this last equation into the initial one:

u(t) can be defined as:
= u(0) * (1 + i * adjusted_conv)^t
= u(0) * (1 + i * conv * u(t-1) / carrying_capacity)^t

(This can then be simplified further, but I’ll leave the math to the reader - the spreadsheet reflects this thinking already)

As a result of this, you see that your cumulative install base kinda looks like a logistic curve:

Now that you see that the cumulative users follows an interesting trend, where it starts to grow exponentially, but then starts to hit saturation. Then it eventually takes some time, but it starts to plateau as you reach the carrying capacity of the network.

Quick break for Cohort analysis re-introduction
Before reading through this post, you might want to glance over a previous blog I wrote on cohort analysis and its relationship to user retention reports

 You may want to read that before going further…

Back to our story…
Previously, I discussed how you can mathematically model the viral acquisition
process, particularly as you hit the network saturation point. However,
while the model shows a growth curve for cumulative users, it doesn’t
take into account how retention metrics fit in.

In the spreadsheet linked above, you can flip to the "User
retention" tab, which shows a cohort analysis perspective of the
hypothetical site. Here’s how to read it:

  • On the Y-axis are "Time period cohorts" which are defined by
    the group of users that joined in a particular time period. So #1
    means, the users that joined in period #1
  • On the X-axis are the "Time period" which defines the time period that the specific cohort is in

So for example, in 1×1, there are an initial 3,000 active users on the site.

However, by the next time period, the 3,000 active users have
declined to 1,500 users. However, because there are a bunch of virally
generated users, there’s a new cohort of 2,328 users who have joined as
cohort 2. The number of "new" cohorts is defined by the rows in the
other spreadsheet tab, "Viral acquisition."

Then notice that at the bottom of each time period, there’s a count
for how many users are active in total, in each specific time period.

Does this make sense? If not, shoot me an email at voodoo[at]gmail
with what you’re confused by, and I’ll update this blog with more
clarifications!

Introducing the retention coefficient
So the key driver
for retention is the % of users that stay alive in a specific cohort,
between one period to the next. If it’s 50%, then if you start out with
3k users, in the next period you’ll be left with 1.5k users. If it’s
100% retention, then 3k users ends up with 3k users.

So let’s play around with the numbers.

At 99% retention, which means that over 20 periods you are losing
very few users, you get a graph of total active users that looks like
this:

This chart looks pretty good, of course. You start with exponential
growth, then hit a plateau, and you have a very slow burn on your
userbase. I suspect that the Facebook site, among other highly popular
sites, essentially have >99.999% retention between days. I say that
because people seem to use the site for years at a time, and probably
the early users of the site are probably mostly still on it.

Now for the EPIC FAIL.

OK, here’s the fun part, which is when you drop the retention coefficient down to 50%:

Ouch. Doesn’t look good. If you’ve read all the way this, far it’s pretty clear why this happens, but let’s summarize:

Key conclusion
The key in this calculation, if you look at the stats, is that:

  • Early
    on, the growth of the curve is carried by the invitations
  • However, over time the invitations start to slow down as you hit network saturation
  • The retention coefficient affects your system by creating a "lagging indicator" on your acquisition - if you have good retention, even as your invites slow down, you won’t feel it as much
  • If your retention sucks, then look out: The new invites can’t
    sustain the growth, and you end up with a rather dire "shark fin."

Things look great at first, but if you can’t retain users long-term, then you don’t have a business.

Improvements to the model
I want to make a couple comments on how the simplified model contained within the spreadsheet could be improved dramatically:

  • Don’t just model invites, model multiple viral channels
  • Include "usage loops" not just the "invite loops," which are triggered by users trying out the product
  • Try both a global carrying capacity, as well as a "niche discount"
    for the number, if your app is super-niche and focused on a particular
    demographic or user behavior
  • Be able to handle realistic numbers - perhaps even retrofit it onto Adonomics data, for example
  • Factor in re-engagement channels
  • etc.

Obviously if anyone would like to think about this more, feel free to and shoot me an email.

Questions and comments?
I built this model very quickly while on the plane ride back from Graphing Social Patterns, but if anybody wants to discuss the model, make improvements, etc., please e-mail me:

voodoo[at]gmail

Thanks!

UPDATE: Dave Fry sent in a correction on the fact that only the new delta of users sends out new invites, the old guys have already done so, and are unlikely to in the next period. Thanks Dave!

Written by Andrew Chen

March 5th, 2008 at 10:42 pm

Posted in Uncategorized

Viewing 22 Comments