My original High Impact formula had a fundamental flaw, which I think I may have fixed.
I spent the last post talking about the albums that made the biggest “impact” on me during 2007, but what exactly does that mean? Over the summer, I came up with the general concept, which basically defined impact as the average number of times any particular song from an album or artist in my iTunes library had been played, showing who has received the most attention relative to their presence in my library. Add together the play counts of all songs by an artist or on an album, then divide by the number of songs. There’s the impact score.
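As a rough illustration, the original formula can be sketched in a few lines of Python. The function name `original_impact` and the play counts are made up for the example; they are not taken from my actual library.

```python
def original_impact(play_counts):
    """Original impact score: total plays divided by the number of songs."""
    if not play_counts:
        raise ValueError("need at least one song")
    return sum(play_counts) / len(play_counts)


# Hypothetical play counts for the five songs on one album.
example_album = [12, 7, 3, 9, 4]

# 35 total plays over 5 songs gives an impact of 7.0.
example_score = original_impact(example_album)
```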
While this method produced some interesting results, it suffers from a substantial deficiency: it grossly inflates the relative impact of artists and albums that have a low number of songs. Artists or albums with a lot of songs have to be played a lot more in order to keep up. I noted this in my original post on the subject:
In cases where an artist has a low number of songs, each play count is “worth more” in relation to other artists. A single play by an artist with only one song nets that artist a full “point,” whereas a single play by an artist with 20 songs would only gain .05 points.
The solution I devised at the time was to set a threshold for inclusion, e.g., artists must have more than 5 songs in my library to be ranked.
I never really liked that workaround because the threshold was arbitrary and it excluded artists who really should have been ranked. So I spent a bit of time recently tweaking the formula, and I think I found something that works to my satisfaction:
Total Play Counts [squared] divided by the number of songs.
This is essentially the same as the original formula, except that the total play count an artist or album has received is multiplied by itself. The effect is to multiply the old average by the total play count, so an artist with more songs, and therefore more total plays at the same average, scores higher. My thinking goes like this: albums with a lot of songs should rightly have a larger impact vs. those which have fewer, even when the average play count is the same.
Suppose we have an album with 10 songs on it and an EP that has 4. Each song on both albums has been played twice.
Using the original formula:
Album: total play count (10 * 2) / number of songs (10) = 2
EP: total play count (4 * 2) / number of songs (4) = 2
Eight (8) plays give the EP the same impact as the album, which has received twenty (20).
Now the new formula:
Album: total play count squared (10 * 2)(10 * 2) / 10 = 40
EP: total play count squared (4 * 2)(4 * 2) / 4 = 16
Even though every song on both recordings has been played the same number of times, the album’s larger footprint leaves a greater impact score than the EP.
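The album vs. EP comparison above can be reproduced with a short Python sketch. The function names are my own labels for the two formulas, chosen for this example only.

```python
def original_impact(total_plays, num_songs):
    """Original formula: total plays divided by number of songs."""
    return total_plays / num_songs


def revised_impact(total_plays, num_songs):
    """New formula: total plays squared, divided by number of songs."""
    return total_plays ** 2 / num_songs


# Every song played twice: a 10-song album and a 4-song EP.
album_plays, album_songs = 10 * 2, 10
ep_plays, ep_songs = 4 * 2, 4

album_old = original_impact(album_plays, album_songs)  # 2.0
ep_old = original_impact(ep_plays, ep_songs)           # 2.0

album_new = revised_impact(album_plays, album_songs)   # 40.0
ep_new = revised_impact(ep_plays, ep_songs)            # 16.0
```

Note that the new score always works out to the old average multiplied by the total play count (2 * 20 = 40 for the album, 2 * 8 = 16 for the EP).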
The best analogy I can think of is mass vs. speed. More songs equal greater “mass.” More plays equal greater “speed.” Just as lighter objects have to travel faster to hit with the same amount of force as heavier objects, an artist with a lighter presence in my library has to be played more times to have the same impact as an artist with a lot of songs. A ping pong ball must travel at a higher speed to hit with the same force as a baseball.
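To put a number on the analogy, here is a small sketch that works out how hard the 4-song EP from the earlier example would have to be played to match the 10-song album. It simply solves the new formula for total plays; the helper name `plays_needed` is hypothetical.

```python
import math


def plays_needed(target_impact, num_songs):
    """Total plays required so that plays**2 / num_songs equals target_impact."""
    return math.sqrt(target_impact * num_songs)


# Impact of the 10-song album with every song played twice.
album_impact = (10 * 2) ** 2 / 10

# Total plays the 4-song EP needs to reach the same impact.
ep_total_plays = plays_needed(album_impact, 4)

# The same figure expressed as plays per song.
ep_plays_per_song = ep_total_plays / 4
```

Under these assumptions the EP needs roughly 12.6 total plays, or a little over 3 plays per song, compared with the album’s 2 plays per song. The lighter record has to “travel faster” per song, even though it needs fewer plays in total.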
This table below shows the formula in action.
[Table not recovered. Surviving column header: # of Songs. Surviving row labels: To Rococo Rot; Molotov vs Dub Pistols.]
The two singles by Tomoyasu Hotei and Molotov vs Dub Pistols happen to be in the top 20 most played songs in my library (out of ~16000). That showing places each of them relatively high on the list, but not overwhelmingly so, considering the differences between the various AVG play counts. That’s an equitable result I’m pretty happy with.