The suggestion at the March 2006 meeting was that there should be a
new diagnostic:
    The numberOfRecords is approximate
    (e.g. Google's 18 million results for 'the')
and an extension that allows the client to tell the server that it
doesn't care about the numberOfRecords.
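
A minimal sketch of how a server might combine the two ideas. Everything named here is an assumption for illustration: the parameter `x-skip-count` (SRU extension parameters conventionally start with `x-`), the helper `build_count_fields`, and the use of `None` for "count not computed" are all hypothetical, and no diagnostic URI is shown because none is given in this thread.

```python
from typing import Callable, Dict

# Hypothetical names: neither the parameter nor the diagnostic wording is
# standardised anywhere in this thread.
SKIP_COUNT_PARAM = "x-skip-count"
APPROXIMATE_COUNT_MESSAGE = "The numberOfRecords is approximate"


def build_count_fields(
    params: Dict[str, str],
    exact_count: Callable[[], int],
    estimate_count: Callable[[], int],
    can_count_exactly: bool,
) -> Dict[str, object]:
    """Return the count-related parts of a response as a plain dict."""
    if params.get(SKIP_COUNT_PARAM) == "1":
        # The client said it does not care, so no counting is done at all.
        # What a real server would put in the field is left open here.
        return {"numberOfRecords": None, "diagnostics": []}
    if can_count_exactly:
        return {"numberOfRecords": exact_count(), "diagnostics": []}
    # Fall back to a cheap estimate and flag it as approximate.
    return {
        "numberOfRecords": estimate_count(),
        "diagnostics": [APPROXIMATE_COUNT_MESSAGE],
    }
```

The count functions are passed in as callables so that the expensive one is only invoked when the client has not opted out.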
Rob
On Mon, 6 Aug 2007, Roger Wallin wrote:
> On Fri, 20 Jul 2007 16:30:20 +0200, Matthew J. Dovey <[log in to unmask]>
> wrote:
>
>>> I'm writing a very "simple" SRU-server. I'm worried about the response
>>> parameter "numberOfRecords". For me it would seem natural to request
>>> the "numberOfRecords" only when you need it, e.g. by
>>> using "maximumRecords=0".
>>
>> A request for maximumRecords=0 is useful for when the client *only*
>> wants a count, but the assumption is that numberOfRecords is always
>> returned regardless of however many records are requested/returned.
>> There is no need for the client to actually use this data (it could just
>> ignore it), and intelligent servers may cache the count (and the result
>> set) say keyed on the query term, rather than repeat the count (and
>> query) should the client ask for a different set of records for the same
>> query.
>>
>> There may be situations where a server cannot return an accurate count
>> for whatever reason (e.g. too expensive, or in the case of
>> relevance-based text retrieval engines à la Google). In this case, it is
>> suggested that you return an arbitrarily large value in numberOfRecords.
>>
>> Matthew
>
> Thanks for your answers,
>
> I'm sorry for not having responded earlier, but I have been out of the
> office for a couple of weeks.
> I suppose that I have to cope with the fact that the server shall always
> return the numberOfRecords of the whole record-set. Nevertheless I think
> that this is a weakness in the server logic. Modern databases usually have
> an efficient way to do custom paging (limited pages), and then the need for
> the numberOfRecords of the whole record-set will always be a burden; it
> will also be a burden for the server to decide when an arbitrarily large
> value should be returned. I think that the best solution to this would be
> the possibility for the client to use a request parameter
> "numberOfRecords", and if used/filled by the client it would tell the
> server that it doesn't have to return the "numberOfRecords".
> Yes, you usually do want to know the (total) "numberOfRecords" when
> requesting the first limited page (although not necessarily), so I have to
> look at how to cache the count (and the query). For me it would have been
> much easier to forget about state/caching, but I don't think that it's a
> good solution to count the total "numberOfRecords" for every limited page.
> Is this server logic a consequence of servers building/saving the whole
> result-set (of course in an internally efficient way) even though just a
> limited page should be returned? Comments appreciated.
>
> Regards RogerW :).
>
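
The caching approach discussed above (keep the count keyed on the query, so that later pages of the same query do not trigger a recount) could look roughly like the sketch below. It is only an illustration under stated assumptions: `CountCache` and `fetch_page` are hypothetical names, the "database" is an in-memory list of titles matched by substring, and there is no expiry or invalidation, which a real server would need.

```python
from typing import Callable, Dict, List, Tuple


class CountCache:
    """Remember the total hit count per query string (hypothetical helper)."""

    def __init__(self, count_fn: Callable[[str], int]) -> None:
        self._count_fn = count_fn
        self._counts: Dict[str, int] = {}
        self.misses = 0  # how often the expensive count actually ran

    def number_of_records(self, query: str) -> int:
        if query not in self._counts:
            self.misses += 1
            self._counts[query] = self._count_fn(query)
        return self._counts[query]


def fetch_page(
    titles: List[str],
    query: str,
    start_record: int,
    maximum_records: int,
    cache: CountCache,
) -> Tuple[int, List[str]]:
    """Return (numberOfRecords, records) for one limited page."""
    total = cache.number_of_records(query)
    if maximum_records == 0:
        # Count-only request: no records are fetched.
        return total, []
    # Stand-in for a paged database query (e.g. LIMIT/OFFSET); the list
    # comprehension is only here to keep the sketch self-contained.
    hits = [t for t in titles if query in t]
    first = start_record - 1  # startRecord is 1-based
    return total, hits[first:first + maximum_records]
```

Requesting several pages for the same query then runs the count once, while each page request only fetches its own slice.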