* Regarding evaluation of embedded XML schema:
I believe that past incompatibilities have involved the XLINK namespace or schema; for an example, see http://www.oxygenxml.com/pipermail/oxygen-user/2006-September/000928.html
In general, because all embedded XML in METS uses 'lax' validation, the only absolute requirement is that the embedded XML be well-formed. An XML Schema for the embedded XML is optional. Even if one exists, you can simply not specify how it can be found; the XML Schema validator will then not attempt to validate the embedded XML and will not issue an error.
However, if there is an XML Schema for the embedded XML and the processor can locate that schema, then the embedded XML must be validated against that schema and it is considered an error if it is not valid.
Thus the three requirements are:
1) There are no xlink namespace incompatibilities between METS and the embedded XML.
2) Embedded XML is well-formed.
3) If there is an XML Schema for the embedded XML and the processor can locate it, the embedded XML is valid according to that schema. If there is no XML Schema for the embedded XML, that is not an error.
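
To make the well-formedness and schema rules concrete, here is a rough sketch in Python. It is only an illustration of the decision logic, not how any particular validator implements 'lax' processing. The function name `check_embedded_xml` and the `validator` callback are made up for this example; the callback stands in for a real XML Schema validator, which the Python standard library does not provide.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Optional


def check_embedded_xml(
    xml_text: str,
    validator: Optional[Callable[[ET.Element], bool]] = None,
) -> str:
    """Classify embedded XML as 'not well-formed', 'invalid', or 'ok'.

    `validator` is a hypothetical stand-in for a schema validator.
    Pass None when no schema exists or none can be located.
    """
    try:
        # Well-formedness is the only absolute requirement.
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return "not well-formed"

    if validator is None:
        # No schema available: under lax validation this is not an error.
        return "ok"

    # A schema was located, so the content must be valid against it.
    return "ok" if validator(root) else "invalid"
```

For example, `check_embedded_xml("<a><b/></a>")` is accepted with no schema at all, while the same call with a validator that rejects the content reports it as invalid.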
* Regarding the xml.xsd (or other external schema):
I believe the Library of Congress has adopted the practice of hosting its own copies of the various external XML Schemas that are imported by its own schemas. For example, that is why you will see the MODS schema pointing to http://www.loc.gov/mods/xml.xsd for the XML schema and to http://www.loc.gov/standards/xlink/xlink.xsd for the XLINK schema, instead of to the more canonical locations maintained by the W3C. I believe the reason for this is so that the LC XML Schemas are isolated from any outages that might occur at the W3C web sites. I also believe that there have been cases in the past when the W3C has banned certain IP addresses because they appeared to be abusing its web servers by repeatedly requesting these same schema URLs hundreds of times per second for long periods of time. This would occur, for example, if someone is attempting to validate a large batch of MODS files and is not caching the needed schemas. Unfortunately, this means that the LC schemas can get out of sync with the standard W3C schemas.
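
As an illustration of the caching point, here is a minimal sketch in Python of fetching each schema URL only once per batch. The `SchemaCache` class and its `fetch` parameter are invented for this example and are not part of any validator's API; real tools usually offer their own mechanism, such as an XML catalog.

```python
import urllib.request
from typing import Callable, Dict


def fetch_over_http(url: str) -> bytes:
    """Download a schema document from its URL."""
    with urllib.request.urlopen(url) as response:
        return response.read()


class SchemaCache:
    """Hypothetical in-memory cache: each schema URL is fetched at most once."""

    def __init__(self, fetch: Callable[[str], bytes] = fetch_over_http) -> None:
        self._fetch = fetch
        self._documents: Dict[str, bytes] = {}
        self.fetch_count = 0  # number of real fetches performed

    def get(self, url: str) -> bytes:
        if url not in self._documents:
            # First request for this URL: go to the server once.
            self._documents[url] = self._fetch(url)
            self.fetch_count += 1
        # Later requests are served from memory, not from the network.
        return self._documents[url]
```

With something like this in place, validating a thousand MODS files asks the remote server for xml.xsd once instead of a thousand times.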
Hope this helps explain.
Tom
> -----Original Message-----
> From: Metadata Encoding and Transmission Standard
> [mailto:[log in to unmask]] On Behalf Of Saašha Metsärantala
> Sent: Monday, May 06, 2013 8:04 AM
> To: [log in to unmask]
> Subject: Re: [METS] METS schema change requests
>
> Hello!
>
> > Part of our evaluation is confirming that any xml or binary data that
> > can be embedded in an mdWrap is compatible with the METS schema.
> > Usually this is not a problem, but there have been past issues with
> > XML schema incompatibilities.
> I consider that a clarification would be welcome here. Could you provide a
> link or otherwise refer to or describe which past issues were related to XML
> schema incompatibilities?
>
> > https://github.com/mets/wiki/wiki/Schema-Change-Requests
> Let's notice that the contents of the xml.xsd file referred to in the
> @schemaLocation attribute on the xs:import element in this documentation
> is different from the w3c's (most recent) xml.xsd file available from
> http://www.w3.org/2001/xml.xsd particularly when it comes to the fact that
> the @xml:id attribute is included in the specialAttrs xs:attributeGroup in the
> w3c's version of xml.xsd whereas it is not in LoC's version of this file. I
> consider that any decision to diverge from the w3c on this point would need
> to be motivated and clearly documented.
>
> Regards!
>
> Saašha,