Wednesday, June 6, 2007

Playing with Wikipedia

I was working with Wisam Dakka on a Wikipedia project, and I was puzzled by some Wikipedia entries that had really long titles. The first one that I noticed was a term with 163 characters: "Krungthepmahanakornamornratanakosinmahintarayutthayamahadilokphopnopparatrajathaniburiromudomrajaniwesmahasatharnamornphimarnavatarnsathitsakkattiyavi", which redirects to Bangkok. I did not know whether this was a prank or a valid entry. (Update: It is a correct entry, according to the talk page of the entry.)

Then, I noticed another term with 255 characters: "Wolfeschlegelsteinhausenbergerdorffvoralternwarengewissenhaftschaferswessenschafewarenwohlgepflegeundsorgfaltigkeitbeschutzenvonangreifendurchihrraubgierigfeindewelchevoralternzwolftausendjahresvorandieerescheinenwanderersteerdemenschderraumschiffgebrauchl", which is in fact a valid term; the 255-character title is simply a shortened form of the 580-character entry :-)

Finally, there is a term with 182 characters: "Lopadotemachoselachogaleokranioleipsanodrimhypotrimmatosilphioparaomelitokatakechymenokichlepikossyphophattoperisteralektryonoptekephalliokigklopeleiolagoiosiraiobaphetraganopterygon", which has Greek roots. I will let you look it up to find out its exact meaning.

These entries also seem to trigger some buggy behavior on Google. If you do a web search for the above terms, you will find no web page with these words. However, Google still returns a set of product matches, none of which are really correct. The joy of large-scale data processing!