> From: Geoff Mottram
> Sent: Tuesday, April 02, 2002 4:15 PM
>
> I agree with the above statement as a general design philosophy
> provided you permit the occasional exception.
>
> I have been having some discussions with MARC listserv regarding
> non-sorting portions of a title and I believe there needs to be one
> exception to not mixing PCDATA and elements. Assume you add a "nonsort"
> element as a method of marking up the portions of a title that are not
> to be sorted. If you don't allow it to intermix with PCDATA within the
> title, you will have problems distinguishing between the parts of the
> title. For example:
>
> <title>
>   <part>first part of the title</part>
>   <nonsort>I belong to first title part but should be ignored</nonsort>
>   <part>second part of the title</part>
> </title>
>
> With the above approach, you can't tell which part of the title the
> "nonsort" element belongs to. It should be marked up as follows to
> avoid any ambiguity:
>
> <title>
>   <part>first part of the title
>     <nonsort>I belong to first title part but should be ignored</nonsort>
>   </part>
>   <part>second part of the title</part>
> </title>

Actually, I disagree that you can't tell which <part> the <nonsort> element belongs to. XML nodes (elements) are kept in sequence order. In the first example, the <nonsort> element falls between the first <part> and the second. Since it occurs before the second, by the definition of XML node sequencing, it belongs to the first. Using XML's built-in sequencing solves the problem.

This also goes along with your original thought that you can pull out all the <part> elements and that's your sort string. What you didn't point out is that it works because of XML node sequencing. When I pull out all <part> elements with XPath, XPath will always give me the elements in sequence order unless I specify otherwise.
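
To make the sequencing argument concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. It is only an illustration: the function names sort_string and nonsort_owners are made up for this example, and the "a <nonsort> belongs to the nearest preceding <part>" rule is the convention argued for in this message, not something an XML parser enforces.

```python
import xml.etree.ElementTree as ET

# The first (flat) example from the quoted message.
TITLE_XML = """
<title>
  <part>first part of the title</part>
  <nonsort>I belong to first title part but should be ignored</nonsort>
  <part>second part of the title</part>
</title>
"""

def sort_string(title):
    """Join the text of all <part> children; findall returns document order."""
    return " ".join(p.text.strip() for p in title.findall("part"))

def nonsort_owners(title):
    """Pair each <nonsort> with the nearest preceding <part> sibling.

    Hypothetical helper: the pairing rule is a convention, assumed here.
    """
    owners = []
    last_part = None
    for child in title:  # children are iterated in document order
        if child.tag == "part":
            last_part = child
        elif child.tag == "nonsort":
            owner_text = last_part.text.strip() if last_part is not None else None
            owners.append((child.text.strip(), owner_text))
    return owners

title = ET.fromstring(TITLE_XML)
print(sort_string(title))
print(nonsort_owners(title))
```

The same walk works for any number of parts, as long as the application and the markup agree on the "nearest preceding <part>" convention.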

Your second example can easily be converted to the non-mixed content model I described in my earlier message. It's my personal opinion that mixed-content modeling is a result of not fully thinking out what you are marking up. At the cost of one additional element, which makes the markup clearer, you could transform your example to:

<title>
  <part>
    <sort>first part of the title</sort>
    <nonsort>I belong to first title part but should be ignored</nonsort>
  </part>
  <part>second part of the title</part>
</title>

Note that the above still follows my original content rule: every element should contain either #PCDATA or one or more refinement elements. In the case of the first <part>, I added the refinement elements <sort> and <nonsort>. In the case of the second <part>, I decided it was not necessary to refine, thus implicitly saying that the whole <part> is of element <sort>. You could, if you really wanted to, enclose the entire contents of the second <part> in a <sort> element.

BTW, the content model for <part> as I just described it would be:

<!ENTITY % partRefinement "|sort|nonsort">
<!ELEMENT part (#PCDATA%partRefinement;)*>

(XML 1.0 only permits #PCDATA in the (#PCDATA|a|b)* form, so the DTD by itself cannot forbid mixing; the either/or rule is a convention applied on top of it.)

With due sincerity, no exceptions are needed.

Andy.
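
For anyone who wants to check the "either #PCDATA or refinement elements" rule mechanically, here is a rough sketch in Python with the standard library's xml.etree.ElementTree. The helper names (has_mixed_content, violations, sort_key) are invented for this illustration, and treating whitespace-only text between child elements as insignificant is an assumption of the sketch.

```python
import xml.etree.ElementTree as ET

# Mixed-content markup from the quoted message.
MIXED = """
<title>
  <part>first part of the title
    <nonsort>I belong to first title part but should be ignored</nonsort>
  </part>
  <part>second part of the title</part>
</title>
"""

# The non-mixed rewrite using <sort>/<nonsort> refinement elements.
REFINED = """
<title>
  <part>
    <sort>first part of the title</sort>
    <nonsort>I belong to first title part but should be ignored</nonsort>
  </part>
  <part>second part of the title</part>
</title>
"""

def has_mixed_content(elem):
    """True if elem has child elements AND non-whitespace text around them."""
    if len(elem) == 0:
        return False  # text-only (or empty) element
    chunks = [elem.text] + [child.tail for child in elem]
    return any(chunk and chunk.strip() for chunk in chunks)

def violations(root):
    """Tags of all elements, in document order, that break the rule."""
    return [e.tag for e in root.iter() if has_mixed_content(e)]

def sort_key(title):
    """Sort string: <sort> children if a part is refined, else the part's text."""
    pieces = []
    for part in title.findall("part"):
        refined = part.findall("sort")
        if refined:
            pieces.extend((s.text or "").strip() for s in refined)
        elif len(part) == 0 and part.text:
            pieces.append(part.text.strip())
    return " ".join(pieces)

print(violations(ET.fromstring(MIXED)))
print(violations(ET.fromstring(REFINED)))
print(sort_key(ET.fromstring(REFINED)))
```

An unrefined <part> is treated here as implicitly <sort>, matching the reading given above for the second <part>.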