Michele,
This is no easy task.
You are basically asking to re-order the document based on an alphabetical sort of the content of some nodes (a two-level sort).
I want to remind you that unless these strings are all ASCII, you might not get the sort order you are looking for in the end (not to mention uppercase and lowercase).
I would probably create a node set as a pre-sorted key. Then I would use this node set key to call and generate the component nodes from the document.
Blocking the dups at level 1 would be easy. (Are there dups at level 2?)
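One way to read the "pre-sorted key" idea is the usual XSLT 1.0 grouping pattern: an xsl:key on <origination>, an outer loop over the first <c0x> of each key value, and an inner loop over every <c0x> sharing that value. The sketch below is only an illustration under assumptions. The <dsc> wrapper, the key name by-orig, the class name GroupedSort, and the plain-text output are all hypothetical, and <unittitle> and <origination> are assumed to be direct children of <c0x>. It is driven from Java through the JDK's built-in javax.xml.transform API so it can be run as is.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class GroupedSort {
    // XSLT 1.0 stylesheet: group <c0x> by <origination>, sort both levels.
    static final String XSLT = String.join("\n",
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>",
        "  <xsl:output method='text'/>",
        "  <!-- index every c0x by the text of its origination -->",
        "  <xsl:key name='by-orig' match='c0x' use='origination'/>",
        "  <xsl:template match='/'>",
        "    <!-- level 1: keep only the first c0x for each distinct origination -->",
        "    <xsl:for-each select=\"//c0x[generate-id() = generate-id(key('by-orig', origination)[1])]\">",
        "      <xsl:sort select='origination'/>",
        "      <xsl:value-of select='origination'/>",
        "      <xsl:text>&#10;</xsl:text>",
        "      <!-- level 2: every c0x with this origination, sorted by title -->",
        "      <xsl:for-each select=\"key('by-orig', origination)\">",
        "        <xsl:sort select='unittitle'/>",
        "        <xsl:text>  </xsl:text>",
        "        <xsl:value-of select='unittitle'/>",
        "        <xsl:text>&#10;</xsl:text>",
        "      </xsl:for-each>",
        "    </xsl:for-each>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    // The revised sample data, with the elided "..." parts left out and a
    // hypothetical <dsc> root element added so the input is well-formed.
    static final String SAMPLE = String.join("\n",
        "<dsc>",
        "<c0x><unittitle>Horse, still life</unittitle><origination>Picasso</origination></c0x>",
        "<c0x><unittitle>Apple, still life</unittitle><origination>Picasso</origination></c0x>",
        "<c0x><unittitle>Giraffe, still life</unittitle><origination>Holbein</origination></c0x>",
        "<c0x><unittitle>Baby, still life</unittitle><origination>Fra Angelico</origination></c0x>",
        "<c0x><unittitle>Frog, still life</unittitle><origination>Michelangelo</origination></c0x>",
        "<c0x><unittitle>Elephant, still life</unittitle><origination>Holbein</origination></c0x>",
        "<c0x><unittitle>Chair, still life</unittitle><origination>Picasso</origination></c0x>",
        "<c0x><unittitle>Duck, still life</unittitle><origination>Michelangelo</origination></c0x>",
        "</dsc>");

    // Apply the stylesheet to an XML string and return the text result.
    static String transform(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(transform(SAMPLE));
    }
}
```

Run against the revised sample data, this should list each origination once (Fra Angelico, Holbein, Michelangelo, Picasso) with its titles in alphabetical order underneath, regardless of document order. Note that the key compares raw strings, so "Picasso" and "picasso" would count as different originations, and xsl:sort falls back on the processor's default collation.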
But the real questions are:
1. What constitutes a dup?
2. What is the audience for this "sorted" list? (English, French, etc.)
This last question is the "rub".
Locale-aware sorting is not a strong point of XSLT 1.0.
I would also use a Java binding (e.g., ICU) to deal with some of these issues.
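To make the collation point concrete, here is a minimal sketch using the JDK's own java.text.Collator as a stand-in. ICU4J's com.ibm.icu.text.Collator offers a similar getInstance/setStrength/compare interface, but how either would be bound into a particular XSLT processor is processor-specific and not shown. The class name LocaleSort, the method names, and the sample strings are made up for illustration.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

public class LocaleSort {
    // Sort names the way a reader in the given locale would expect,
    // instead of by raw character code.
    static List<String> sortFor(Locale audience, List<String> names) {
        Collator collator = Collator.getInstance(audience);
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(collator);
        return sorted;
    }

    // One possible answer to "what constitutes a dup?": strings that differ
    // only in case or accents. PRIMARY strength ignores both differences.
    static List<String> dedupe(Locale audience, List<String> names) {
        Collator collator = Collator.getInstance(audience);
        collator.setStrength(Collator.PRIMARY);
        TreeSet<String> unique = new TreeSet<>(collator);
        unique.addAll(names); // the first spelling seen is the one kept
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        // "\u00C9luard" is "Eluard" with an acute accent on the E.
        List<String> names = Arrays.asList("Zola", "\u00C9luard", "apple");

        List<String> byCodePoint = new ArrayList<>(names);
        Collections.sort(byCodePoint); // uppercase first, accented letters last
        System.out.println("code point order: " + byCodePoint);
        System.out.println("English collation: " + sortFor(Locale.ENGLISH, names));

        System.out.println("deduped: " + dedupe(Locale.ENGLISH,
            Arrays.asList("Picasso", "picasso", "PICASSO", "Holbein")));
    }
}
```

The choice of locale and collation strength is exactly where the two questions above come in: a different audience or a stricter definition of a dup means a different Collator configuration.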
Anyway, just some thoughts for you.
It looks like you have some excellent suggestions already.
Mike Ferrando
IT Specialist
Library of Congress
Washington, DC
(202) 707-4454
----- Original Message ----
From: Michele R Combs <[log in to unmask]>
To: [log in to unmask]
Sent: Friday, February 8, 2008 10:08:20 AM
Subject: Re: Removing duplicates from reordered list
Thanks to those who have posted ideas on tackling this problem.
A couple of suggestions involved using keys based on the first occurrence of each unique <origination>, but I think my sample data may have inadvertently confused things -- my apologies!!
I put the sample data in alpha order by <unittitle> for my own convenience, but I need the code to work regardless of whether the original document is in alpha order by unittitle or not.
In other words, the document-order first occurrence of <origination>Picasso</origination> is not necessarily associated with the alphabetical-order first <unittitle>.
Herewith the revised sample data:
<c0x>...<unittitle>Horse, still life</unittitle><origination>Picasso</origination>...</c0x>
<c0x>...<unittitle>Apple, still life</unittitle><origination>Picasso</origination>...</c0x>
<c0x>...<unittitle>Giraffe, still life</unittitle><origination>Holbein</origination>...</c0x>
<c0x>...<unittitle>Baby, still life</unittitle><origination>Fra Angelico</origination>...</c0x>
<c0x>...<unittitle>Frog, still life</unittitle><origination>Michelangelo</origination>...</c0x>
<c0x>...<unittitle>Elephant, still life</unittitle><origination>Holbein</origination>...</c0x>
<c0x>...<unittitle>Chair, still life</unittitle><origination>Picasso</origination>...</c0x>
<c0x>...<unittitle>Duck, still life</unittitle><origination>Michelangelo</origination>...</c0x>
Michele
+-----+-----+-----+-----+-----+-----+
Michele Combs.
Librarian for Manuscripts and Archives Processing.
Special Collections Research Center.
Syracuse University Library.
222 Waverly Avenue.
Syracuse, NY 13244
+-----+-----+-----+-----+-----+-----+