Peter, we didn't envisage this as something that servers self-report,
but as something measured from the outside.  To a client, it's
irrelevant whether a server is running off a UPS, for example: all we
care about is whether we can contact the server, search in it and
retrieve from it.

In IRSpy (http://irspy.indexdata.com/) the reliability measurement
simply indicates how often the server responded to connection
requests.  It's a trivialised notion of what it means to be
"available", but as is so often the case, 10% of the work gets you 90%
of that functionality, and it's functionality that we need.  (I
deliberately proposed a vague semantics statement for availability
because I don't want to enforce that rather dumb definition of
availability on everyone who uses it.)
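
As a rough illustration of that connection-success measure, here is a
minimal Python sketch. It is not IRSpy's actual code: the function name
and the list-of-booleans input are hypothetical, chosen only to show the
arithmetic behind an integer in the range 0-100.

```python
def reliability_percent(attempts):
    """Return the share of successful connection attempts as an
    integer in the range 0-100, or None if nothing was attempted.

    `attempts` is a hypothetical input shape: an iterable of booleans,
    True where a connection request got a response.
    """
    results = list(attempts)
    total = len(results)
    if total == 0:
        # No data yet: report "unknown" rather than guessing 0 or 100.
        return None
    successes = sum(1 for ok in results if ok)
    # Integer arithmetic, rounding half up, keeps the result in 0..100.
    return (100 * successes + total // 2) // total


if __name__ == "__main__":
    history = [True] * 9 + [False]
    print(reliability_percent(history))
```

This counts attempts irrespective of time, which is only one possible
reading; the vague semantics statement leaves room for others.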

Similarly, Ed Zimmerman's message hugely over-complicates this simple
concept, and ends up concluding, as such messages so often do, that we
should do nothing.  Sorry, Ed, not having that.  As a compromise, I am
perfectly happy for _you_ to do nothing :-)



On 14 April 2010 01:18, Peter Noerr <[log in to unmask]> wrote:
> Back to the original suggestion, after the rather ironic detour this thread took...
>
> Such numbers would be useful to us as a fed search service. We actually maintain this sort of data for all the Sources we connect to, by means of an active checking program of our own, so it would not add greatly to our own practices, but it would be useful to have the site's own idea of how often it thought it was available, and it would be useful to the vast majority of systems which had no justification to set up monitoring programs.
>
> Which leads to the question of what this "percentage reliability" is actually measuring and how? The aforementioned power outage and servers playing doorstops obviously counts as "unavailable", but what if they were still happily running on their (long life) UPS, while the router was down? From the outside world's point of view both are bad, but how does the server check itself from outside? And is this a time average, a moving average, a snapshot, based on number of tries irrespective of time, or just whatever the server thinks is a good idea (better than nothing - probably)?
>
> Peter
>
>> -----Original Message-----
>> From: SRU (Search and Retrieve Via URL) Implementors
>> [mailto:[log in to unmask]] On Behalf Of John Harrison
>> Sent: Tuesday, April 13, 2010 7:29 AM
>> To: [log in to unmask]
>> Subject: Re: Add "reliability" index to CQL's "zeerex" context set
>>
>> Hi all
>>
>> These contextSet sites haven't been lost - there was a power cut over
>> the weekend affecting all cheshire3.org servers. Some came back on with
>> the power and some didn't. I've been in the US, but will get round to
>> fixing them tomorrow.
>>
>> Sorry for any inconvenience this has caused.
>>
>> Personally, I think the reliability index sounds like it could be useful
>> so I'll monitor this mailing list for a couple of days to see if there
>> are any other comments, then add it to the ZeeRex context set.
>>
>> All the best,
>>
>> John
>>
>>
>> On Mon, 2010-04-12 at 17:22 +0100, Mike Taylor wrote:
>> > We have found it useful, in our IRSpy register of Z39.50 and SRU
>> > targets, to add a measure of "reliability" for each server, expressed
>> > as a percentage and measuring what proportion of all the connections
>> > we've tried to make have been successful.  Using this, we can search
>> > for only those targets that are up, say, 90% of the time.  (This
>> > searching facility is not yet wired out to the public Web UI at
>> > http://irspy.indexdata.com/ but it will be.)
>> >
>> > In order to enable searching in this way via SRU, we need to add a
>> > "reliability" index -- so far as we can determine, there is no such
>> > index in any of the existing context sets.  This seems like a good
>> > match for ZeeRex, which is all about describing databases and the
>> > services that provide them, so we propose that the new index be added
>> > to the ZeeRex context set.  We propose a brief, non-prescriptive
>> > semantics statement like "an integer in the range 0-100 indicating how
>> > reliable the server had been found to be".
>> >
>> > --
>> >
>> > As an aside, the LC page about context sets,
>> >         http://www.loc.gov/standards/sru/resources/context-sets.html
>> > links the ZeeRex set to the location:
>> >         http://srw.cheshire3.org/contextSets/ZeeRex/
>> > but this URL has gone away since Rob Sanderson left the Cheshire
>> > project.  So have the Record Metadata set ("rec"), the Network
>> > Resource Information set ("net"), the Collectable Card Games set
>> > ("ccg") though that one will probably not cause so many problems, and
>> > the Relevance Ranking set ("rel").  This is very bad.
>> >
>> > Some, but not all, of those sets are available as old versions on the
>> > WayBack Machine: for example, there is an old "rec" set at
>> >
>> >         http://web.archive.org/web/20060717085701/http://srw.cheshire3.org/contextSets/net/1.0/
>> > but I have not been able to get it to give me an old "zeerex" set.
>> >
>> > For that reason, I have resurrected an old copy of the ZeeRex site as
>> > it was before I foolishly handed it over to Rob, and it is now
>> > available on
>> >         http://zeerex.z3950.org/
>> > In particular, the ZeeRex context set for CQL is at:
>> >         http://zeerex.z3950.org/search/contextset/2.0/
>> > I hope this is useful to more than just me.
>