Wednesday, June 6, 2007

Playing with Wikipedia

I was working with Wisam Dakka on a Wikipedia project, and I was puzzled by some Wikipedia entries that had really long titles.

The first one that I noticed was a term with 163 characters: "Krungthepmahanakornamornratanakosinmahintarayutthayamahadilokphopnopparatrajathaniburiromudomrajaniwesmahasatharnamornphimarnavatarnsathitsakkattiyavi",
which redirects to Bangkok. I do not know if this is a prank, or a valid entry. (Update: It is a correct entry, according to the talk page of the entry.)

Then, I noticed another term with 255 characters:
"Wolfeschlegelsteinhausenbergerdorffvoralternwarengewissenhaftschaferswessenschafewarenwohlgepflegeundsorgfaltigkeitbeschutzenvonangreifendurchihrraubgierigfeindewelchevoralternzwolftausendjahresvorandieerscheinenwanderersteerdemenschderraumschiffgebrauchl",
which is in fact a valid term; the 255 characters are simply a shortcut for the 580-character entry :-)

Finally, there is a term with 182 characters:
"Lopadotemachoselachogaleokranioleipsanodrimhypotrimmatosilphioparaomelitokatakechymenokichlepikossyphophattoperisteralektryonoptekephalliokigklopeleiolagoiosiraiobaphetraganopterygon",
that has Greek roots, and I will let you click to find out its exact meaning.

Also, these entries seem to trigger some buggy behavior on Google. If you do a web search for the above terms, you will find no web pages containing these words. However, Google still returns a set of product matches, none of which is really correct.

The joy of large-scale data processing!