Edward C. Zimmermann wrote:
> On Mon, 3 Dec 2007 23:05:10 -0500, Ross Singer wrote
>
>> On Dec 3, 2007 4:24 PM, Dr R. Sanderson wrote:
>>
>>> Amen.
>>>
>>> Also, the semantics of what they're actually describing aren't the
>>> clearest.
>>> The <id> element, for example, is not dc:identifier for the object
>>> described in the data, it's an arbitrary id of that particular entry in
>>> that particular feed. (As I understand it)
>>>
>>> Which makes perfect sense in an ATOM feed. And is totally meaningless in
>>> SRU.
>>>
>>>
>> Well, no. From:
>> http://www.atomenabled.org/developers/syndication/#requiredEntryElements
>>
>> id Identifies the entry using a *universally unique and
>> permanent URI*. Suggestions on how to make a good id can be found
>> here. Two entries in a feed can have the same value for id if they
>> represent the same entry at different points in time.
>>
>
> In Atom it's important but (as is the case with most feeds, Atom or RSS) not very
> reliable.
>
>
>> So, what this means is that every search result would have a unique
>>
>
> In some systems our ids were generated from the system process-id. This
> proved reliable, persistent, sufficiently long lived and pragmatic but not
> permanent (and not intended to be). Designed to be able to live without access
> to the index they would be the result of a given query at a moment in time. Their
> main use was for named result sets and in web interfaces for paging. Other
> objects would track the state of the index to be able to recycle some results
> as fast cached. These, by contrast, were never de-coupled from the index and
> to the outside world NEVER persistent but extremely volatile.
>
> We also have persistent URIs-- for example in IBU News--- that do nothing
> more than run queries against a defined target (lets canned queries become
> news feeds). Here the URI is persistent but the content (result) is in flux.
>
>
>> and permanent URI, to which I say "hallelujah!" but you might not be
>> as overjoyed.
>>
>
> A result of a search is composed of two variables
> - the query
> - the state of the target (which can be in flux)
>
> Permanent result sets?
> How can we have a permanent result against a target that's in flux?
> Since we don't want to save these sets **forever** we can provide
> permanent response to persistent URIs that produce an error response--
> no need to keep stale stuff around.
>
>
>> It might be a lot of work, but it's by no stretch of the imagination
>> "meaningless".
>>
>
> There is no problem making an Id..but of what?
>
> Unique set or non-unique query id as the basis?
>
> Would we not maybe want a persistent URI for the query and ignore the state
> as is the case sometimes with updated news stories. Different sets with the
> same Id.. Id really just being a signature for the search (if not the
> search as URI).
>
> Before we start to demand this we need to be very clear in our specifications
> what the Id means to us. If we want this we should specify what it is so that
> it can have value!
>
To me the id for each feed item represents the resource that it matched.
If the underlying system does not support the notion of resource ids (a
very questionable information management design IMHO if it does not
support unique ids for each resource), then the implementation should
try to generate a URN based on a system-specific unique URN generation
algorithm. If this is not possible then the last resort would be to
require that the id be a UUID urn generated using the UUID generation
algorithm. This behavior could be specified in the Search-WS spec.
I expect that the vast majority of information management systems would not
have to resort to the UUID urn of last resort.
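
The fallback order described above could be sketched roughly as follows.
This is only an illustration, not anything taken from the Search-WS or
Atom specs: the function name `entry_id`, its parameters, and the
`x-example` namespace are all hypothetical.

```python
import uuid
from typing import Optional


def entry_id(native_id: Optional[str] = None,
             system_nid: Optional[str] = None,
             local_key: Optional[str] = None) -> str:
    """Pick an id for a feed entry (hypothetical helper, not from any spec).

    Order of preference:
      1. the resource id the underlying system already has
      2. a URN built from a system-specific namespace and local key
      3. a UUID URN as the last resort
    """
    if native_id:
        # The system supports resource ids: use the id as-is.
        return native_id
    if system_nid and local_key:
        # Assumed system-specific scheme: urn:<namespace>:<local key>.
        return f"urn:{system_nid}:{local_key}"
    # Last resort: a random (version 4) UUID expressed as a URN.
    return uuid.uuid4().urn


print(entry_id(native_id="http://example.org/records/42"))
print(entry_id(system_nid="x-example", local_key="rec42"))
print(entry_id())
```

Note that a randomly generated UUID is unique but only stays the same
for a given resource if the system stores it once generated, so the
last resort still assumes some place to keep the id.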
--
Regards,
Farrukh Najmi
Web: http://www.wellfleetsoftware.com