Saturday, June 7, 2008

Wiki Relation - Six Degrees

Wikipedia has now become a standard reference for the definition of almost anything in this universe. Almost all bloggers link back to a Wikipedia page to define the terms they use. It is much like magazines in the late '90s referring to the Oxford dictionary for any unusual word in their articles. I recently read an article on the new Mars rover in a local daily paper; it used a new term for a property of the Martian surface, and the explanation began, "As per Wikipedia..." That shows the power and growth of Wikipedia as a standard reference over the last couple of years.

Today I heard about the idea of finding the relationship between any two articles in Wikipedia. It is not a logical relation; it is the number of links between the two articles. In simple words, it is the smallest number of clicks required to reach one article from the other. We can call it the distance between the two articles. The biggest distance between any two articles then defines the diameter of Wikipedia. I have read that this project began as an attempt to find the diameter of Wikipedia, but it has also given us a handy tool for finding the distance between two articles.
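To make the "number of clicks" idea concrete, here is a minimal sketch using a breadth-first search, which is the usual way to find the fewest hops in a link graph. The function name click_distance and the toy_links graph are my own inventions for illustration; the links in it are made up and are not the real Wikipedia links, and this is not necessarily how Stephen's tool does it.

```python
from collections import deque

def click_distance(links, start, goal):
    """Return the fewest clicks from start to goal, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, clicks = queue.popleft()
        for nxt in links.get(page, ()):
            if nxt == goal:
                return clicks + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, clicks + 1))
    return None

# Hypothetical toy link graph (invented): page -> pages it links to
toy_links = {
    "Y2K": ["Calendar", "Computer"],
    "Calendar": ["India"],
    "Computer": ["Calendar"],
    "India": ["Tamil literature"],
    "Tamil literature": ["Purananuru"],
    "Purananuru": [],
}

print(click_distance(toy_links, "Y2K", "Purananuru"))  # 4 in this toy graph
print(click_distance(toy_links, "Purananuru", "Y2K"))  # None: no way back here
```

Note that links only go one way, so the distance from A to B need not equal the distance from B to A.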

Example (you can also give it a try)

I gave it two vague, unrelated topics, but see! Wikipedia has an awesome relationship between its articles.

               Wiki Blog

                                  Wiki relation between Y2K and Purananuru

According to Stephen Dolan, Wikipedia has 2301486 articles with 55550003 links between them. The largest "strongly-connected-component" of Wikipedia has 2111480 articles. That is, there are 2111480 articles with the property that from any of them, it is possible to get to any other one. The rest are mostly pages that no one has linked to, or disambiguation pages.
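If "strongly connected component" sounds scary, here is a small sketch of the idea: the component of a page is the set of pages you can reach from it that can also reach it back. The helper names and the five-page toy graph are assumptions of mine, and this brute-force version would be far too slow for two million articles; it is only meant to show the definition at work.

```python
def reachable(links, start):
    """All pages reachable from start by following links (including start)."""
    seen = {start}
    stack = [start]
    while stack:
        page = stack.pop()
        for nxt in links.get(page, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def reverse_links(links):
    """Flip every link so that page -> pages that link TO it."""
    rev = {page: [] for page in links}
    for page, targets in links.items():
        for target in targets:
            rev.setdefault(target, []).append(page)
    return rev

def largest_component(links):
    """Largest set of pages where every page can reach every other one."""
    rev = reverse_links(links)
    pages = set(links) | set(rev)
    best, assigned = set(), set()
    for page in pages:
        if page in assigned:
            continue
        # Pages we can get to AND that can get back to us
        comp = reachable(links, page) & reachable(rev, page)
        assigned |= comp
        if len(comp) > len(best):
            best = comp
    return best

# Hypothetical toy graph: A, B, C form a cycle; D is a dead end; E is unlinked-to
toy = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": [], "E": ["A"]}
print(sorted(largest_component(toy)))  # ['A', 'B', 'C']
```

Here D plays the role of a page with no way back into the cycle, and E is like a page that no one has linked to.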

Stephen used graph theory to represent the relations between articles as a graph. He got the XML from Wikipedia's database and used scripts to parse it and build the graph. Finally, distributed computing algorithms were used to identify:

  • The diameter of Wikipedia
  • The distance between any two articles.
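The parse-then-measure pipeline above can be sketched in a few lines. This is only my rough guess at the shape of it, not Stephen's actual scripts: I assume links appear in the article text as [[Target]] or [[Target|label]], the three-page toy_dump is invented, and the diameter is found by plain single-machine brute force rather than anything distributed. Unreachable pairs are simply ignored.

```python
import re
from collections import deque

# Matches the target of [[Target]], [[Target|label]] or [[Target#Section]]
LINK_RE = re.compile(r"\[\[([^\]|#]+)")

def extract_links(wikitext):
    """Return the link targets found in one article's text."""
    return [m.strip() for m in LINK_RE.findall(wikitext)]

def distances_from(links, start):
    """Breadth-first search: fewest clicks from start to every reachable page."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for nxt in links.get(page, ()):
            if nxt not in dist:
                dist[nxt] = dist[page] + 1
                queue.append(nxt)
    return dist

def diameter(links):
    """Largest finite click distance over all ordered pairs of pages."""
    return max(max(distances_from(links, page).values()) for page in links)

# Hypothetical stand-in for the parsed XML dump: title -> article text
toy_dump = {
    "A": "See [[B]] and [[C|the C page]].",
    "B": "Goes to [[C#Intro]].",
    "C": "Back to [[A]].",
}
links = {title: extract_links(text) for title, text in toy_dump.items()}

print(links)            # {'A': ['B', 'C'], 'B': ['C'], 'C': ['A']}
print(diameter(links))  # 2 (for example, B -> C -> A)
```

Running one search per page is fine for a toy graph, but with millions of articles it is easy to see why the work had to be spread out.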

The technical concepts, such as parsing, graph theory and distributed computing, are clearly explained on Stephen's site.

It is a really interesting game to play. I tried combinations of completely different articles, but the maximum distance I was able to get was 4. If anyone can get more than four, please comment on this blog.

