On Wed, 14 Apr 2010 11:51:36 +0100, Mike Taylor wrote
> Peter, we didn't envisage this as something that servers self-report,
> but as something measured from the outside.  To a client, it's

Measured by whom? Are you proposing that we, as a group, develop a globally
distributed reference monitoring service? That could be quite nice and very
useful. Many of us have multi-homed networks, and together I think we span
the globe.

> irrelevant whether a server is running off a UPS, for example: all we
> care about is whether we can contact and search in and retrieve from
> the server.

But from where? 

> In IRSpy ( the reliability measurement
> simply indicates how often the server responded to connection
> requests.  It's a trivialised notion of what it means to be

Responded to YOUR requests. If they are your servers then it's self-reporting,
and (beyond your own network) with wholly personal semantics.
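
To make that concrete, here is a minimal Python sketch of a "responded to
connection requests" measurement. The names probe() and availability() are
illustrative only and are not taken from IRSpy; the point is that the number
describes what one prober saw from one network.

```python
import socket


def probe(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only says the server answered THIS machine's connection request.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def availability(results):
    """Fraction of probes that succeeded, or None if nothing was probed."""
    results = list(results)
    if not results:
        return None
    return sum(1 for ok in results if ok) / len(results)
```

For example, availability([True, True, False, True]) gives 0.75, but only as
seen from wherever probe() was run.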

> "available", but as is so often the case, 10% of the work gets you 
> 90% of that functionality, and it's functionality that we need.  (I 

That's fine. It may be what you feel you need, but looking at what people in
large numbers think they need, for instance in the Web sector... it's not.

> deliberately proposed a vague semantics statement for availability 
> because I don't want to enforce that rather dumb definition of 
> availability on everyone who uses it.)

If you really need to say something without meaning... why bother?

> Similarly, Ed Zimmerman's message hugely over-complicates this simple
> concept, and ends up concluding, as such messages so often do, that 

It's not complicated. It's standard practice today for a number of services.
It ranges from, at the simplest level, service pings from multiple networks
up through sophisticated traffic-flow and metric analysis (in the Web sector
this is a very big sub-industry).
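
The simplest level, pings from multiple networks, can be sketched in Python
roughly as follows. This is an assumption-laden illustration, not any
existing service: the vantage-point names and the (vantage, ok) observation
format are hypothetical.

```python
from collections import defaultdict


def availability_by_vantage(observations):
    """Map each vantage point to the fraction of its probes that succeeded.

    observations: iterable of (vantage, ok) pairs, where vantage names the
    network a probe was sent from and ok is True if the server responded.
    """
    counts = defaultdict(lambda: [0, 0])  # vantage -> [successes, total]
    for vantage, ok in observations:
        counts[vantage][1] += 1
        if ok:
            counts[vantage][0] += 1
    return {v: good / total for v, (good, total) in counts.items()}


def worst_vantage(observations):
    """Return the (vantage, fraction) pair with the lowest availability."""
    per_vantage = availability_by_vantage(observations)
    return min(per_vantage.items(), key=lambda item: item[1])
```

With probes from two hypothetical sites, [("munich", True), ("munich", True),
("london", True), ("london", False)], the same server is 1.0 available from
one network and 0.5 from the other, which is the "from where?" problem.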

> we should do nothing.  Sorry, Ed, not having that.  As a compromise, 
> I am perfectly happy for _you_ to do nothing :-)

Collecting data makes a lot of sense. Sure, we do it (and even use it in our
models). But for anything short of the kind of global reference monitoring
service I've suggested above, I'm not sure it belongs anywhere other than in
our internal monitoring databases.

> On 14 April 2010 01:18, Peter Noerr wrote:
> > Back to the original suggestion, after the rather ironic detour this
> > thread took...
> >
> > Such numbers would be useful to us as a fed search service. We actually
> > maintain this sort of data for all the Sources we connect to, by means
> > of an active checking program of our own, so it would not add greatly
> > to our own practices, but it would be useful to have the site's own
> > idea of how often it thought it was available, and it would be useful
> > to the vast majority of systems which had no justification to set up
> > monitoring programs.
> >
> > Which leads to the question of what this "percentage reliability" is
> > actually measuring and how? The aforementioned power outage and servers
> > playing doorstops obviously counts as "unavailable", but what if they
> > were still happily running on their (long life) UPS, while the router
> > was down? From the outside world's point of view both are bad, but how
> > does the server check itself from outside? And is this a time average,
> > a moving average, a snapshot, based on number of tries irrespective of
> > time, or just whatever the server thinks is a good idea (better than
> > nothing - probably)?
> >
> > Peter
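
Peter's question about what the percentage actually averages can be made
concrete with a small Python sketch. The sample format, a list of
(timestamp, ok) pairs, and the function names are assumptions for
illustration only; the same probe history gives three different answers.

```python
def ratio_of_tries(samples):
    """Share of successful tries, irrespective of when they happened."""
    if not samples:
        return None
    return sum(1 for _, ok in samples if ok) / len(samples)


def moving_ratio(samples, now, window):
    """Share of successful tries within the last `window` time units."""
    recent = [ok for t, ok in samples if now - window < t <= now]
    if not recent:
        return None
    return sum(1 for ok in recent if ok) / len(recent)


def snapshot(samples):
    """Result of the most recent try only, or None if there are no tries."""
    if not samples:
        return None
    return max(samples, key=lambda sample: sample[0])[1]
```

For samples [(0, True), (10, True), (20, False), (30, False)], the ratio of
tries is 0.5, the moving ratio over the last 15 time units at time 30 is 0.0,
and the snapshot is False.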


Edward C. Zimmermann, NONMONOTONIC LAB
Basis Systeme netzwerk, Munich Ges. des buergerl. Rechts
Umsatz-St-ID: DE130492967