IDEA: Investigate YouTube user attention spans.
Today I watched a Dutch documentary on Stephen Batchelor on YouTube. It was split into three parts. Something I noticed was that the number of views registered against each part varied widely – it seems that most people who started never got to the end (although strangely, more people viewed the last part than the one in the middle).
I thought I would investigate whether this is generally the case on YouTube, and to what degree. My results suggest that you should seriously think twice about presenting videos that have to be split into multiple parts, and should therefore frontload any information you wish to convey.
Investigation design:
- Search YouTube for videos that match the following criteria:
  - Their name includes “part 1 of ”.
  - They have a working statistics block when accessed through the Google Data Protocol.
- Collect about 10,000 of these (or until Google cuts you off).
- Work out how many parts each one has by looking at the end of “part 1 of X”.
- For each one, search the user who put up the first video to see if every other part is also in their account, and has a working statistics block. We’ll assume that the parts of the video follow the same general pattern for their titles.
- Look at the data.
- ???
- Conclusions.
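The title-parsing step in the design above can be sketched roughly as follows. This is an illustration only, not the collection program itself: the regular expression and the helper names `part_count` and `expected_titles` are my own assumptions, and it relies on every part following the same “part N of X” title pattern.

```python
import re

# Assumed title pattern: "... part 1 of X ..." (case-insensitive).
PART_RE = re.compile(r"(part\s+)1(\s+of\s+)(\d+)", re.IGNORECASE)


def part_count(title):
    """Return X from a title containing 'part 1 of X', or None if absent."""
    match = PART_RE.search(title)
    return int(match.group(3)) if match else None


def expected_titles(title):
    """Guess the titles of every part by swapping the part number in place."""
    match = PART_RE.search(title)
    if match is None:
        return []
    prefix, middle, total = match.group(1), match.group(2), match.group(3)
    start, end = match.span()
    return [
        title[:start] + prefix + str(n) + middle + total + title[end:]
        for n in range(1, int(total) + 1)
    ]


# Hypothetical title, not one from the collected data.
print(part_count("Some Documentary (Part 1 of 3)"))
print(expected_titles("Some Documentary (Part 1 of 3)"))
```

The guessed titles can then be looked up in the uploader's account; any set with a missing part or a missing statistics block gets dropped.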
Collection:
I wrote a Python program to collect the data, which you can find under a cut at the end of this post. Running it on the 6th of February yielded 188 sets of videos that satisfied all the criteria we put in place.
Preparation:
One simple way of looking at the data is to normalise each count of views as a fraction of the initial number. First, some data massage has to take place.
Excel seems to have trouble dealing with Unicode strings in my tab-delimited text, so I have romanised a few that had problems:
- “طاش ماطاش ارهاب اكاديمي”, which appears to be an episode of the Saudi Arabian comedy “Tash ma Tash” about terrorism training camps.
- “恶作剧2吻 They Kiss Again Episode 20” / “Episode 11”, which are episodes of the Taiwanese romcom “They Kiss Again”.
- “ドラゴンクエスト4 ピアノメドレー”, a sentence which more or less contains no actual Japanese words: “Dragon Quest 4 Piano Medley”.
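The normalisation mentioned at the top of this section (each count as a fraction of the first part's count) could be done along these lines. This is a minimal sketch rather than what I actually ran in Excel; the function name `normalise` and the view counts are made up for illustration.

```python
def normalise(views):
    """Express each part's view count as a fraction of part 1's count."""
    first = views[0]
    if first <= 0:
        raise ValueError("the first part needs a positive view count")
    return [count / first for count in views]


# Made-up counts for a three-part video, not real data.
print(normalise([1000, 400, 500]))
```

After this step every series starts at 1.0, so series with wildly different audience sizes can share one chart.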
Observations:
Right out of the gate, the average ratio of second-part viewers to first-part viewers is very low – approximately 51%. This suggests to me that if you want to say something on YouTube, you should make sure you can cram it into one upload before the audience runs out of patience. Once you get viewers through this transition from part one to two, however, they seem to stick around. Another observation is that the final installment often does better than the intermediate ones, which suggests people are jumping to the end.
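Both observations are easy to check mechanically. The sketch below is an assumption about how one might do it, not the method behind the 51% figure: `mean_second_to_first` and `final_beats_previous` are hypothetical helpers, and the two series are invented numbers.

```python
from statistics import mean


def mean_second_to_first(all_views):
    """Average, across series, of part 2 views divided by part 1 views."""
    return mean(views[1] / views[0] for views in all_views)


def final_beats_previous(views):
    """True if the last part drew more views than the part just before it.

    Only meaningful for series with at least three parts, since otherwise
    the 'previous' part is part 1 itself.
    """
    return len(views) >= 3 and views[-1] > views[-2]


# Invented view counts for two series, not real data.
series = [
    [1000, 400, 500],      # last part beats the middle one
    [200, 120, 100, 90],   # steady decline
]
print(mean_second_to_first(series))
print([final_beats_previous(views) for views in series])
```

Comparing the final part only against its immediate predecessor is one choice among several; comparing it against the average of all intermediate parts would be a stricter test.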
Check out my huge frickin’ chart (click for massive):
I think the chart works well – it shows the precipitous dropoff right out of the gate, and also the little “ski jump” effect for some last episodes. You might have noticed that one video in particular has a very severe case of this – They Kiss Again, Episode 20. Wikipedia lists this series as having 20 episodes, so I guess people were tuning in for the finale!






Nicholas 1:14 am on February 7, 2010
What happens if you normalise each series along the x axis as well, so that all shows take up the same width? I mean, if a series has 10 parts, plot at 0.1, 0.2, etc., and if it has 20, plot at 0.05, 0.1, and so on.