This is a snippet I jotted down from listening to a podcast with Tim Ferriss on his ways to cope with various overloads, including information. The remark in the title asks why work until you are 65 then become fully retired? Start to take some of that now. I got cut short of hearing the whole interview but will get back to it.
Charles Arthur has a great post on the value of links and your personal linking brand
That currency is the hyperlink, a pointer to somewhere on the internet that holds some information that someone else might find useful. Like any currency, it can be debased, and lose its value. You’ve heard of the dollar/yen/pound/euro exchange rate, of course (and watched in amazement as they gyrate, and yet the price of American hardware and software never alters from a $1 = £1 translation). But in the link economy, when everyone’s passing around links, every person is their own central bank, determining the value of their own currency.
He also mentions Rickrolling, which is a specific yet humorous instance of link fraud, where links promising further content send you to a Rick Astley video.
Link Inflation: too many links chasing too little content. A measure of how many links you need to follow to find content of interest.
Given that there are about 7 billion people at the moment, my answer to this question was about 15 billion. Here I was thinking about exponential growth, but I was way off. The estimated value is just over 100 billion. The article sets out to debunk something of a myth in population studies
One reason the question keeps coming up is that somewhere, at some time back in the 1970s, a now-forgotten writer made the statement that 75 percent of the people who had ever been born were alive at that moment.
This article formed the main technical content of a post I made called The Wisdom of a Random Crowd of One. The point of the post was to show how IT data can be used to produce meaningful models, with PageRank as an example. However PageRank turned out to be vastly interesting, and essentially took over the post. Here you will find just the PageRank material withoout the IT data modelling context for easier referencing and indexing.
Incidentally, PageRank gets its name from being the ranking algorithm of Larry Page rather than because it ranks web pages.
The Random Surfer
The hero of the PageRank story is an anonymous and robotic random surfer. He selects a random (arbitrary) starting page on the internet, looks at links on that page, and then selects one to follow at random. On the new page, he again looks over the links and surfs along a random link. He happily continues following links in this fashion. However, every now and again, the random surfer decides to jump to a totally random page where he then follows random links once again. If we could stand back and watch our random surfer, we would see him follow a series of random links, then teleport to another part of the internet, follow another series of links, teleport, follow links, teleport, follow links, and so on, ad infinitum.
Let’s assume that as our random surfer performs this mix of random linking and teleporting, he also takes the time to cast a vote of importance for each page he visits. So if he visits a page 10 times, then the random surfer allocates 10 votes to the page. Surprisingly, the PageRank metric is directly derived from the relative sizes of the page votes cast by this (infinite) random surfer process.
This seems deeply counterintuitive. Why would we expect the surfing habits of a random process to yield a useful guideline to the importance of pages on the Internet? While the surfing habits of people may be time consuming, and sometimes downright wasteful, we probably all think of ourselves as more than random-clicking automatons. However the proof of the pudding is in the searching, and Google has 70% of the search market. So apparently when all of the erratic meanderings of the random surfer are aggregated over a sufficiently long period, they do in fact provide a practical measure of internet page importance. We cannot explain this phenomenon any better than by simply labelling it as the wisdom of a random crowd of one.
Breathing Life into the Random Surfer
The data that can be gathered relatively easily by web crawlers are the set of links on a given page and the set of pages they point to. Let’s assume that there are M pages currently on the internet, where M is several billion or so. We can arrange the link information into M x M matrix P = [ Pij ] where Pij is the probability that page Pi links to page Pj (Pij is just the number of links on Pi to Pj divided by the total number of links on Pi).
The matrix P is called stochastic since the sum of each row is 1, which simply means that any page must link to somewhere (if Pi has no links then it links to itself with probability 1). So P represents the probability of surfing (linking) from Pi to Pj in one click by a random surfer. The nice property of P is that P^2 = P*P gives the probability that Pj can be reached from Pi in two clicks by the random surfer. In general P^N gives the probability that the random surfer ends up on page Pj after N clicks, starting from page Pi.
Can we say anything about P^N as N becomes large? Well if P is ergodic (defined below) then there will exist a probability vector
L = (p1, p2, …, pM)
such that as N becomes large then
P^N = (L, L, …, L)^t
This says that for large N, the rows of P^N are all tending to the common distribution L. So no matter what page Pi the random surfer starts surfing from, his long run page visiting behaviour is described by L. We learn quite a bit about the random surfer from L.
As we said above, the long run probability vector L only exists for matrices that are ergodic. Ergodic matrices are described by 3 properties; they are finite, irreducible, and aperiodic. Our matrix P is large but certainly finite. Two pages Pi and Pj are said to communicate if it is possible to reach Pj by following a series of links beginning at page Pi. The matrix P is irreducible if all pairs of pages communicate. But this is clearly not the case, since some pages have no links (so-called dangling pages). If our random surfer hits such a page then he gets stuck, we don’t get irreducibility and we don’t get L.
To get the random surfer surfing again we make the following adjustment to P. Recall that we have M pages and let R be the M x M matrix for which each entry is 1/M (that is, R models the ultimate random surfer who can jump from any page to any page in one click). Let d be a value less than one and create a new matrix G (the Google matrix) where
G = d*P + (1-d)*R
That is, G is a combination of P (real link data) and R (random link data). Our random surfer than follows links in P with probability d or follows jumps (teleports) to a totally random page with probability (1-d). The value of d will be something like 0.85.
Its should be easy to see that G is irreducible since R enables any two pages to communicate. Without going into details, G is also aperiodic since it is possible for a page to link to itself (which is also possible in P as well). So G is ergodic and we can in theory compute the long run page distribution L of the random surfer.
So now that we know L exists for G, it remains to compute it. We have not as yet considered that the number of pages M is several billion and growing. So a direct representation of G as an M x M matrix would require storage on the order of 10^(18), or in the exabyte range (giga, tera, peta, exa). Luckily most pages are likely to have only a few links (say less than 20) and we can represent G using lists which will bring us back into the gigabyte range.
Computing L from G is a large but tractable computation. L is an eigenvector of G and there is an iterative algorithm for computing L from G called the power method. The power method begins with an approximation for L and improves on each iteration. The rate of convergence to the true value of L is geometrically fast in terms of the parameter d. Therefore we can compute the long run behaviour of our random surfer.
The diagram below (available from Wikipedia) shows the PageRank analysis for a simple collection of pages. The arrows represent the link data and the pages are drawn in size relative to their PageRank importance. If we divided the number on each page by 100 this would be our long run probability vector L.
Market predictions from the man himself, Lou Seefer, with a short excerpt below
Larry: Now I’m guessing you are a pretty traditional guy Lou. What’s your take on all this Internet technology?
Lou: Couldn’t support it more Larry. One of our big plays is to convince people that the place to be is “on top of things” rather than “at the bottom of things” – that is, to focus on the fleeting, not the foundational. It’s a win-win situation: people get to find a few cheap holidays and outsmart their doctor on something like the glycemic index, while we get mindshare that nothing is really relevant unless it arrives in your mailbox personally addressed to you as part of a competition.
Larry: Short term memory can be measured in mouse clicks.
Lou: Precisely. History becomes a hobby, not a lesson.
This is just a quick analogy between IT Security and high school teachers.
High school teachers have a very difficult job, that all parents agree is very important to do well. Parents want teachers to be both saints and sages, and often to be modestly renumerated for these qualities both in respect and financial reward. IT security has similar characteristics in that the complexity of performing this role is increasing while at the same time its perceived value is decreasing.
In academic publishing there is a pejorative notion of the least publishable unit (LPU). This refers to the practice of researchers subdividing results into the smallest credible units suitable for publication, as opposed to gathering smaller intermediate results towards a more significant publication. LPU is described in some detail on Wikipedia, where the entry states that the
Least publishable unit (LPU) is the smallest amount of information that can generate a publication in a peer-reviewed journal. The term is often used as a joking, ironic, or sometimes derogatory reference to the strategy of pursuing the greatest quantity of publications at the expense of their quality.
In considering the amount of blogs and tweets I now receive I wonder if this can be explained in terms of LPU behaviour. Does it make sense to speak of a Least Bloggable Unit (LBU) or a Least Tweetable Unit (LTU)?
In the case of Twitter, the medium itself at 140 characters is already small unit to work with – simply an observation or comment with a supporting link. My one-month experience with Twitter suggests that most tweets follow the comment-link couplet, which I would then define as the LTU. One could argue that just the comment is enough but it seems poor etiquette to tweet linkless comments.
For blogs, the LBU can essentially be the same as the LTU, a comment-link couplet, but more is expected since blogs are only restricted in length by the attention a reader is prepared to outlay. And how would we measure that?
Following a retweet from timoreilly, I tracked down a post from Amy Graham which proposes that the main “so what?” must be conveyed in the first 62 words of your online content. This magic value is the number of words that can be read in 15 seconds, assuming a reading speed of 250 words per minute (this is how web analytic urban legends are created). So we surmise that readers are prepared to grant 15 seconds of attention fame to any content that manifests on their screen.
The 62-word theory also spawned the Miniskirt theory which states that
Good writing should be like a skirt: Long enough to cover subject, but short enough to stay interesting
So the LBU is 15 seconds of attention or, for the visual thinkers, a head-turning skirt on a nice pair of legs.
