Wednesday, February 02, 2011

Decoding URLs


Why is it that some web addresses carry a long tail while others do not? How much does a web address reveal?

For instance, a Yahoo mail page has the following address:

http://us.mc369.mail.yahoo.com/mc/welcome?.gx=1&.tm=1296694680&.rand=e6tpkm8umv6mv#_pg=showFolder&fid=Inbox&order=down&tt=3571&pSize=100&.rand=850448145&hash=fa5c91801af9f69d3189c024f2592cd2&.jsrand=8243756

Facebook pages, on the other hand, come up with something a lot simpler, like:

http://www.facebook.com/home.php#!/

Even though most web addresses (aka URLs, uniform resource locators) follow the format <<transfer-protocol://servername.domain/directory/subdirectory/filename.filetype>>, they contain clues that tell you a number of interesting things:
  • <~> - a tilde: Usually indicates a personal folder, perhaps belonging to the customer of a Web host or a student at a university.
  • <?> - a question mark: Typically means that behind the scenes, a script will call information from the server or a database. E.g., the Yahoo mail address above, where the text after the ? is handed to the script as parameters.
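  • <=> - an equals sign: Pairs a parameter name with its value, as in fid=Inbox in the Yahoo address above. Several such pairs are strung together with an ampersand (&).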
Some gurus recommend using cURL tools to dig out more information from URLs...
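
The clues listed above can also be read by a program instead of by eye. Here is a minimal sketch in Python, using only the standard urllib.parse module, that splits the two example addresses from this post into their parts. The function name describe_url and the keys of the dictionary it returns are made up for illustration; they are not part of any library.

```python
from urllib.parse import urlsplit, parse_qs

def describe_url(url):
    """Split a web address into the parts discussed in this post.

    describe_url and its dictionary keys are hypothetical names
    chosen for this sketch.
    """
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,            # e.g. "http"
        "server": parts.hostname,            # servername.domain
        "path": parts.path,                  # /directory/.../filename.filetype
        "has_tilde": "~" in parts.path,      # hint of a personal folder
        "query": parse_qs(parts.query),      # name=value pairs after the ?
        "fragment": parts.fragment,          # everything after the #
    }

yahoo = ("http://us.mc369.mail.yahoo.com/mc/welcome"
         "?.gx=1&.tm=1296694680&.rand=e6tpkm8umv6mv"
         "#_pg=showFolder&fid=Inbox&order=down&tt=3571&pSize=100"
         "&.rand=850448145&hash=fa5c91801af9f69d3189c024f2592cd2"
         "&.jsrand=8243756")
facebook = "http://www.facebook.com/home.php#!/"

yahoo_info = describe_url(yahoo)
facebook_info = describe_url(facebook)

# The Yahoo fragment happens to use the same name=value&name=value
# layout as a query string, so parse_qs can read it as well.
yahoo_fragment_pairs = parse_qs(yahoo_info["fragment"])

for name, info in (("Yahoo", yahoo_info), ("Facebook", facebook_info)):
    print(name)
    for key, value in info.items():
        print("   ", key, "=", value)
```

Run on the two addresses, this shows why the Yahoo one is so long: it carries three parameters after the ? and several more after the #, while the Facebook address has no parameters at all, only a short fragment.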
