I support Rob in his argument for a session id. In July last year, I
submitted the following to the list:
> We would also like the Search response to include a "server user id
> substitute". This can be returned in the ZNG:user field for subsequent
> requests and avoid the user id and password having to be sent in multiple
> requests. It can also serve to guarantee the "actual user" (there may be
> more than 1 user able to use the same user ID/password) a time slot on the
> server, because the system has a finite number of concurrent users. Pica
> also uses this for being able to return search history, i.e. a notion of
> session, but this is out of ZNG
At the time "session id" was a politically incorrect phrase. In our
implementation of ZING/SRU we are adding session id into the actual request,
within the URL. We need it in the request so that we can control channels;
it doesn't work if it is in a different layer.
It's not complicated; it's just an optional element. Our server will send
it with the result set id and it expects it to be returned.
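As a rough illustration of that round trip, the sketch below builds a request URL that carries an optional session id next to the result set id, and reads it back on the server side. The parameter names `sessionId` and `resultSetId`, the example base URL, and both helper functions are hypothetical; none of them come from the ZING/SRU drafts.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_request_url(base_url, query, session_id=None, result_set_id=None):
    """Client side: build a request URL; optional elements are omitted when absent."""
    params = {"query": query}
    if result_set_id is not None:
        params["resultSetId"] = result_set_id  # hypothetical parameter name
    if session_id is not None:
        params["sessionId"] = session_id  # hypothetical parameter name
    return f"{base_url}?{urlencode(params)}"


def read_session_id(url):
    """Server side: return the session id carried in the URL, or None."""
    values = parse_qs(urlsplit(url).query).get("sessionId")
    return values[0] if values else None


# First request has no session id; the follow-up echoes the one the server sent.
first = build_request_url("http://example.org/sru", "dinosaur")
follow_up = build_request_url(
    "http://example.org/sru", "dinosaur", session_id="s-8f3a", result_set_id="rs-1"
)
```

Because the element is optional, a client that ignores sessions still produces a valid request.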
In July last year we also had a lot of discussion about deleting result
sets. Ray's prose accurately reflects this discussion and I think it is
correct and reflects reality. Our server (along with lots of others)
decides when it's time to do its own housekeeping with result sets unless
told otherwise, i.e. with the TTL. It will even override the TTL if it has
Consultant OCLC PICA ITC
Schipholweg 99, 2300 AW Leiden, The Netherlands
+ 31 71 524 65 00
+ 31 71 522 31 19 (fax)
[log in to unmask]
From: Robert Sanderson [mailto:[log in to unmask]]
Sent: Friday, 14 June 2002 12:37
To: [log in to unmask]
Subject: Re: result set model for srw
> You just need the request parameters.
No. This would work if resultsets only last for one response/request.
However if I were to do a search, do a second search and then try to
combine the resultsets from those two searches, the first would have
already disappeared. You need a session id (or to return ALL of the
parameters from ALL of the requests to date made by the client, which
would quickly overflow most URL implementations in SRU).
> > I could send continuous (SOAP is HTTP/1.1 so includes
> > pipelining and gzipping, making this even more effective)
> > requests to trash random resultset names.
> Yes, but I could send continuous requests to trash random session ids or
> client ids too! Or even better sniff the wire for the session ids/client
> ids and use those. So introducing a session id in addition to a result
> set name just adds complexity without actually solving the problem.
Packet sniffing is much harder these days than it was, due to switched
networks. It's still possible using ARP poisoning, but I digress.
Yes, introducing a session id doesn't make it impossible to do, but it
does make it exponentially more difficult. Difficult enough that an
attack is unlikely to succeed, whereas without it it is very possible.
Compare basic authentication of username and password. Guessing a
username and password combination is much harder than guessing one of the
two. And for large robust servers (the reason we're using SOAP right?)
there will be a lot of users. Unless the server is designed with this
sort of security in mind, the resultsets will probably just be identified
with an incrementing scheme rather than a random set of characters, so
once a pattern has been found, the server is completely compromised.
User/password authentication has served for many years, and will likely
continue to do so for all but financially important traffic.
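To make the username/password comparison concrete, here is a minimal sketch in which result set names come from an incrementing counter (easy to guess) while the session id is random, and a result set can only be fetched with the matching pair. All names here (`sequential_name`, `new_session_id`, `store`, `fetch`) are made up for illustration and do not describe any particular server.

```python
import itertools
import secrets

_counter = itertools.count(1)


def sequential_name():
    """Predictable: once one name is seen, the next ones are easy to guess."""
    return f"rs{next(_counter)}"


def new_session_id(nbytes=16):
    """Unpredictable: nbytes random bytes, URL-safe encoded."""
    return secrets.token_urlsafe(nbytes)


# Result sets are keyed by the (session id, result set name) pair.
result_sets = {}


def store(session_id, records):
    """Save records under a new sequential name for this session."""
    name = sequential_name()
    result_sets[(session_id, name)] = records
    return name


def fetch(session_id, name):
    """Return the records only if both halves of the key match, else None."""
    return result_sets.get((session_id, name))
```

Guessing the name alone gets an attacker nothing here; they would also have to guess the random session id.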
> A solution would be for the server to keep tracking of the source IP
> Addresses - using that to identify clients and using result set names as
> at present for maintaining state. Even that isn't foolproof since you can
> spoof ip addresses.
> If you really want more resilience against this attack then you should
> use SSL (HTTPS) with client certificates to identify the client.
Yes. But then you drastically limit the number of potential clients to
those that support encrypted SOAP. It also makes it more difficult to
just throw together a server; I wouldn't bother... Z39.50 is less complex
and more functional.
> In any case client identification (if required) is an Out of Band issue
> which can be resolved at other layers in the system and needn't be
> included in the SRW spec.
Unless you have a web proxy, or specialised SRW proxy that multiple
clients send their requests through, not an uncommon event I would have
thought.
For example, if I were to set up a CGI script that interrogated SRW
servers, I would want to map my local users to session ids. Except they're
all coming from the same IP address so the server will map them all to the
same client. At this point you once again need some sort of session id.
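A small sketch of what such a script might keep, assuming the server hands back a session id that the script stores per local user. The class `SessionMap`, its methods, and the `sessionId` parameter name are hypothetical, not part of SRW.

```python
class SessionMap:
    """Map each local user of the gateway to the session id the server issued."""

    def __init__(self):
        self._by_user = {}

    def remember(self, local_user, session_id):
        """Record the session id the server returned for this local user."""
        self._by_user[local_user] = session_id

    def params_for(self, local_user, query):
        """Build request parameters, adding the session id if one is known."""
        params = {"query": query}
        session_id = self._by_user.get(local_user)
        if session_id is not None:
            params["sessionId"] = session_id  # hypothetical parameter name
        return params


# Two local users behind the same IP address stay distinct on the server.
sessions = SessionMap()
sessions.remember("alice", "s1")
sessions.remember("bob", "s2")
```

All requests leave from one IP address, so only the session id lets the server tell the two users apart.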
,'/:. Rob Sanderson ([log in to unmask])
,'--/::(@)::. Special Collections and Archives, extension 3142
,'---/::::::::::. Twin Cathedrals: telnet: liverpool.o-r-g.org 7777
____/:::::::::::::. WWW: http://liverpool.o-r-g.org:8000/
I L L U M I N A T I