Mike Richter and others have added some good comments to this thread
concerning my comment that something could be learned from considering what
Google does. My writing was subtle. I may not have even gotten it totally
myself. :-) 

When I noticed this thread there was a civil debate in progress over whether
we should try to save / preserve everything or whether we should just save /
preserve the best and most representative of everything. A counter to the
first idea was that the volume of the material available would make it
impossible. A counter to the latter idea (that we should preserve the best /
most representative) was that we might make the wrong choices.

Rather than making a lot of decisions about which sites are worth
indexing, Google seems to index most of them and most of the words and
phrases in those. True, Google does not index sites blocked to it (many of
those I cannot access either), but for domains that it can reach it seems to
index them to a rather deep page level. For me Google represents by analogy
(not by its actual function) one approach in the debate.

Here is the analogy: Google seems to me to be as successful and useful as it
is because it does not try to make value choices in what it indexes. If it
is there, Google indexes it. (Google's ranking algorithm is a different
matter. Although important to Google, the ranking algorithm is not part of
the analogy. Anyone who considers only the first few of many Google hits is
letting the ranking algorithm choose what they consider and in some cases
will not be taking full advantage of what Google offers. In any case, I
choose not to treat something being hard to find as meaning that it does
not exist.)

Concerning the preservation of recordings, if we follow the example of
Google, then if it is available we save / preserve it. (It is an analogy:
Google does not save and preserve, but through access makes things
available. More about that role of Google is coming up. The analogy rests
specifically on my perception that Google does not make value choices. If
Google does, then consider a Google that didn't and use that for my analogy.)

It is easy to confuse Google as an analogy with the service that Google
performs, indexing. It is true that without a Google it would be hard to
access the information on the Web; it would be there but we could not find
(much of) it. Similarly, if an attempt is made to save everything,
initially we will suspect it is out there but will have no way to find it.
Eventually some way to organize the collection will be needed so that access
can be achieved. I feel we should not be impatient that we don't yet have a
way to organize it all because someday the capability will exist. But if the
materials are lost (were not saved / preserved) it won't matter.

About the objection that we cannot store all the information so we must
choose: It may be true that today we cannot store everything in one
place. But if a myriad of folks save the pieces that they have access to,
someday the pieces of the collection can be brought together in one place or,
more importantly (and much more safely), all of it can be made accessible
from many places. The storage problems will be resolved in the course of
time. I vote to save / preserve all we can. I recommend that we not try to
cleverly discern and select only what seems important.

Some / many seem to have missed that I mentioned Google as an analogy. Some
thought I was suggesting that because of Google all we have to do is save /
preserve and Google can do the rest. That is not true; Google cannot do
this, as some have mentioned. Google can help archivists though: If each
person who has saved some things would put them on the Internet on a high
level web site (not too many layers below the main domain), today's Google
would at least alert others to the purported existence of the materials.
(Other search engines that do not exclude such sites from indexing on the
grounds that the sites are not important can also provide this service.)

Let's be aware of the search engines' limitations today (several have
pointed them out): The actual content of the saved material is not searched
by Google. The content of actual images cannot be searched by Google
either, so what Google does with images probably could be done with sound
files and video / moving image files. (My experience is that Google does not
yet have even a good database of all the Internet images.) The actual
content of media cannot be searched by Google though (as I understand it, as
I have experienced it). Rather, Google only deals with and indexes based on
what is said about the media content in the sites in which it is embedded,
in properties included with the media, or in sites that link to the media.

Regards,
Ralph
July 8, 2003