Lachlan Brown on Wed, 3 Jul 2002 22:02:02 +0200 (CEST)


[Nettime-bold] Re: Alexa(ndria).com - the portable web




Alexa, acquired by Amazon in 1999 (Amazon being the only company doing well
on the stock exchange these days), showed a little bit of cultural foresight in 1996 by starting to collect the WWW, now accessible via the Wayback Machine.
The WWW, now a portable 12 Terabytes, has been collected at the Library
of Alexandria by our Arabian friends. (Knowledge wants to be distributed knowledge, I suppose.)

My Difference Engine, held in a folder called /difference/ on the
Goldsmiths College server, was erased there in an academic
contest over control of my research (and of the site). It is now held, along
with much of the rest of the WWW from 1996 onwards, at the Library
of Alexandria. I like this instance of cultural foresight and deep historical correlations.

Alexa's search engine, partnered with Google I believe, ranks sites
by how often they are accessed (it's really curious to run some comparisons and to
see how some individuals' sites rank higher than those of newspapers with global circulation), and it also shows the networks of other sites that the same
visitors access.

I note that Nettime's network of 'also visited' is Digital Online 
Industrial (I think I referenced this constituency in the Fall when I referred to 'heavyweight lurkers' - I was not referring to senior 
academics or to art curators as some seemed to think).

The combination of the archived WWW with the 'current' Web, ranking, and audience/user tracing provides new ways of 'negotiating and navigating' the WWW.


Lachlan



> Data Services
>  
> Massive web crawl is available to researchers, historians and commercial enterprises
> 
> 
> Imagine the entire contents of the world wide web... on disk. 
> Alexa Data Services gives you the ability to tap the vast Alexa crawl index and master the ephemeral and performance issues that make creating collections so difficult for the Web Information Architect. 
> 
> Massive Index. 
> Spanning five years, filling 200 Terabytes of online storage and expanding at a rate of 12 Terabytes per month, the Alexa crawl represents the largest collection of Web information in the world today. 
> 
> Powerful Tools. 
> To explore information that is six times the size of the Library of Congress, Alexa has developed a proprietary operating system and a powerful set of data mining tools that leverage excess process capacity on hundreds of parallel computers. 
> 
> Specialized Collections
> Specialized collections of web data may be developed on request and, on a subscription basis, updated up to several times per day. Collections can be used as a one-off research-oriented collection or as a continuous up-to-date collection for Archivists and Search Engines. 
> 
> 
> --------------------------------------------------------------------------------
> 
> Researchers, historians, and commercial enterprises may access Alexa's massive crawl of the web in one of the following ways:
> 
> 
>  Free
> Alexa, in partnership with the Internet Archive, offers free access to an archive of Alexa's crawl, going back to 1996 via the Internet Archive Wayback Machine. This unique service, the first of its kind, provides public access to over 10 Billion archived web pages. 
> 
>  Special Collection - Hosted
> Specialized archive collections can be made for a reasonable cost. Working with you, Alexa would generate and maintain a custom index of web content available via web interface. This service is perfect for archivists or historians who would like to create a special collection of web documents available via the web. Example: September 11th Archive, commissioned by the Library of Congress.
> 
>  Special Collection - Portable
> When having a copy of the crawl at your location is the only option, Portable is for you. Alexa generates a special collection of archived documents, places it on disk and ships it to your location. Collections may be as small as a few hundred web pages or as large as several billion, depending on your needs. 
> 
>  Entire Web - Hosted
> Alexa's entire crawl of the web can be made available to you on a subscription basis with access to Alexa's specialized set of datamining tools. This product provides the maximum performance, access and update frequency. 
> 
>  Entire Web - Portable
> For organizations capable of hosting or mining an entire crawl index that exceeds 12 Terabytes in size, Alexa can ship the contents of the crawl to your location. Current customers include the Internet Archive and the Library of Alexandria in Egypt.
> 
> Frequently Asked Questions
> 
> Q: How large is the crawl index?
> A: Very, very large. The crawl index is over 12 Terabytes in size, spanning over 2 billion documents. This is Google sized, and approximately 4 times larger than Altavista and Inktomi's published sizes.
>  
> Q: How often is the crawl updated?
> A: The web-wide crawl takes approximately 2 months to complete. News sites and other sites of interest may be re-crawled several times per day. Special collections may be created on request and updated as often as needed.
>  
>  
> 
> Contact: 
> Paula Keezer 
> (415) 561-6928 
> paula@alexa.com 
> fax: (415) 561-6795 
> 
> Alexa Internet
> www.alexa.com 
> Building 37 
> Presidio of San Francisco 
> PO Box 29141 
> San Francisco, CA 
> 94129-0141 
>  
> 
> 
> 




Lachlan Brown
T(416) 826 6937
VM (416) 822 1123

                                       


_______________________________________________
Nettime-bold mailing list
Nettime-bold@nettime.org
http://amsterdam.nettime.org/cgi-bin/mailman/listinfo/nettime-bold