Namespaces and Includes

Namespaces make working with XML and XML Schemas a bit difficult. The main issue, in my opinion, is dealing with the mapping from namespace prefix to namespace URI. The software used to generate Visual Schemas has been tested with several different standards (as can be seen from this blog’s history), yet every so often an issue still turns up. While generating the Visual Schemas for Election Markup Language (EML), two additional issues were uncovered. One was an outright implementation bug: the name of the element was used instead of its QName. The other one was more interesting.

EML schemas include additional schemas (core and external), and these included schemas in turn include a few more schemas that have their own namespaces. The current Visual Schema implementation computed the namespace prefix to URI mapping based only on the outermost schema. So far, this approach worked for all the other standards (either because they didn’t have multiple namespaces or because they declared all their namespaces in the outer schema). However, it didn’t work with EML, since additional namespaces were declared in the included files. This is a bit complicated to deal with because developers usually don’t bother to load the included documents explicitly and just rely on the XML Schema parsing API to do it for them. So, there are two approaches to deal with this issue:

1) Recursively navigate through the XML Schema DOM, load all the included schemas and gather the namespaces declared in them. Of course, this means making sure that relative paths are properly converted to absolute paths.
2) Implement a custom input resolver and gather the namespaces on the fly as the XML Schema parser fetches the included schemas.
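To make approach 1 a little more concrete, here is a minimal sketch in Python. It is not the Visual Schema implementation (which is Java based); the function name `collect_namespaces` is made up for illustration, and the sketch assumes every schemaLocation points to a local file rather than a URL.

```python
import os
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
# Schema elements that pull in another document via a schemaLocation attribute.
REFERENCING_TAGS = {
    "{%s}%s" % (XSD_NS, name) for name in ("include", "import", "redefine")
}


def collect_namespaces(schema_path, seen=None, mapping=None):
    """Gather prefix -> URI pairs from a schema and every schema it references.

    Hypothetical helper: handles local files only, not remote locations.
    """
    seen = set() if seen is None else seen
    mapping = {} if mapping is None else mapping

    # Convert to an absolute path so relative includes resolve correctly.
    schema_path = os.path.abspath(schema_path)
    if schema_path in seen:  # guard against circular includes
        return mapping
    seen.add(schema_path)

    base_dir = os.path.dirname(schema_path)
    for event, payload in ET.iterparse(schema_path, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload  # the default namespace has prefix ""
            mapping.setdefault(prefix, uri)  # first declaration wins on a clash
        elif payload.tag in REFERENCING_TAGS:
            location = payload.get("schemaLocation")
            if location:
                nested = os.path.join(base_dir, location)
                collect_namespaces(nested, seen, mapping)
    return mapping
```

Note that this sketch keeps the first URI it sees for a prefix. If two included files bind the same prefix to different URIs, a real tool would have to track the mapping per file instead.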

I have yet to decide which approach to take, though I will most likely go with approach 2. For now, to deal with EML, the included namespaces are hard-coded for the EML schemas.

My recommendation for anyone creating XML Schemas is to declare, in the outer schema, all the namespaces used by the included schemas (recursively). Otherwise, developers working with the schemas might face challenges depending on the sophistication of the tools they are using.

Visual Schema For Election Markup Language (EML)

Right now I am watching Senator Barack Obama speak in Minnesota after winning the required number of delegates for the nomination. Well, that has nothing to do with this post. But the fact that the US Presidential Elections take place this year prompted me to look for XML Schema specifications related to elections, and I found them on the oasis-open.org website.

VisualSchema.com is pleased to offer the Visual Schemas for Election Markup Language (EML) XML Schemas.

While creating the visual schemas for EML, a few issues were encountered. They will be listed in the next post.

XML Schema Annotation Documentation In Visual Schemas

Just as well-commented code is easier to understand, XML Schemas can be augmented with comments as well. This is done using the documentation element within the annotation element. The annotation element can be part of many of the XML Schema constructs.
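As a rough illustration of where these comments live, the sketch below pulls the documentation text out of a tiny schema with Python’s standard library. The sample schema, its element names (Ballot, Count) and the helper `documentation_by_name` are all made up for this example; it only looks at components that carry a name attribute.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema used only for this example.
SAMPLE_SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Ballot" type="xs:string">
    <xs:annotation>
      <xs:documentation>A single ballot cast by a voter.</xs:documentation>
    </xs:annotation>
  </xs:element>
  <xs:element name="Count" type="xs:int"/>
</xs:schema>
"""


def documentation_by_name(schema_text):
    """Map each named, annotated schema component to its documentation text."""
    docs = {}
    root = ET.fromstring(schema_text)
    for component in root.iter():
        name = component.get("name")
        if name is None:
            continue
        # Only direct annotation children belong to this component.
        texts = [
            (doc.text or "").strip()
            for doc in component.findall(XS + "annotation/" + XS + "documentation")
        ]
        if texts:
            docs[name] = " ".join(texts)
    return docs


print(documentation_by_name(SAMPLE_SCHEMA))
```

Components without an annotation (Count in this sample) are simply left out of the result. A tooltip display could use the same lookup keyed by component name.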

Visual Schema already provided a way to visualize XML Schema definitions. Now, Visual Schemas can also display the annotations as inline tooltips so that the schemas can be studied along with their comments. The OpenTravel Visual Schemas are the first to get this functionality on our website.

Visual Schema For Office 2003

A while back, Microsoft released XML schemas for its Office 2003 suite of products. The download contains XML schemas for Microsoft Word, Spreadsheet, Visio and Project.

VisualSchema is pleased to announce Visual Schemas for Office 2003. Note that the visual schema is created only for Microsoft Project and not for the other documents. The technology used in Visual Schemas is capable of generating HTML for any arbitrary XML Schema. However, when the dynamically generated forms are converted into static HTML forms for browsing on visualschema.com, the number of HTML pages generated is huge for a certain class of XML schemas. These are mainly document-layout-centric schemas such as XHTML, which are very flexible and allow many loops and much nesting, and that translates into a very large number of unique paths. This is understandable, because a definition that captures a document layout naturally has this characteristic. For example, in XHTML a table cell can contain another table. Because of this, the other XML Schemas in the Office 2003 product suite have not been converted into Visual Schemas.
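To get a feel for why nesting inflates the page count, here is a small sketch that counts the distinct element paths in a toy content model. The model is a made-up, heavily simplified stand-in for XHTML, and `count_paths` is a hypothetical helper, not part of the Visual Schema software. It assumes roughly one static page per path.

```python
# Hypothetical, heavily simplified content model: element -> allowed children.
CONTENT_MODEL = {
    "table": ["tr"],
    "tr": ["td"],
    "td": ["p", "table", "ul"],  # a cell may hold text, another table or a list
    "ul": ["li"],
    "li": ["p", "table", "ul"],
    "p": [],
}


def count_paths(element, max_depth):
    """Count distinct paths that start at element and use at most max_depth levels."""
    if max_depth == 0:
        return 0
    # One path ends at this element; the rest continue into each allowed child.
    return 1 + sum(
        count_paths(child, max_depth - 1) for child in CONTENT_MODEL[element]
    )


for depth in (3, 6, 9, 12):
    print(depth, count_paths("table", depth))
```

Because a table can reach itself again through a cell or a list item, the count keeps growing every time the depth limit is raised, so a static site has to cut the recursion off somewhere.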

Visual Schemas For Java EE XML Schemas

VisualSchema is pleased to offer Visual Schemas For Java EE XML Schemas.

Java EE (Java Enterprise Edition) defines a few XML Schemas for web applications, tag libraries and a few other artifacts that are part of the Java EE specifications.

Visual Schemas is built using Java, JSP, Xerces and a few other technologies. So, we are happy to make the Java EE XML schemas available as Visual Schemas.

Visual Schemas For Amazon SimpleDB

Amazon SimpleDB is a special storage system optimized for a certain class of problems. This service is certainly a welcome addition to the already existing Amazon Simple Storage. While Amazon Simple Storage is, as the name indicates, meant simply to store small and large files alike, SimpleDB is also simple but better suited to storing database-like data (tables, columns).

VisualSchema is pleased to offer Visual Schemas For Amazon SimpleDB.

VisualSchemas For papiNet

papiNet has standards for the paper and forest industry. VisualSchema is pleased to announce the availability of Visual Schemas for the papiNet XML Schema standards.

I am going to write up in more detail how exactly the Visual Schemas are created later on. But the thing to note is that the UI is all pre-generated static HTML files. So, depending on the number of complex nodes and the various navigation paths, the number of these static files could be huge. So far, OAGIS™ used to be the most complex, with several complex nodes for a few of the messages. papiNet has now overtaken that complexity. The entire pre-generated HTML size is about 3 GB. Visual Schemas for OAGIS™ Nouns, on the other hand, occupy only about 0.4 GB. papiNet would have occupied even more, but a few frequent and simple nodes have been deliberately excluded to reduce the storage requirements. This shouldn’t reduce the usability, since the nodes that have been excluded are simple text fields and text fields with UOM.

A few reasons why some of the messages have so many nodes are:

1) Product definition is quite elaborate and recursive (packaging structure).
2) Paper & Wood, the key products of the papiNet standard, have tolerances on their values with min and max values. One thing though: instead of defining a complexType of “value with UOM” and using it for the value, the range min and the range max, an alternative is to define a single complexType that captures the value and UOM as well as the range min and max. In other words, instead of having to say

<Value UOM="Inch">20</Value>
<RangeMin UOM="Inch">18</RangeMin>
<RangeMax UOM="Inch">21</RangeMax>

saying

<Value UOM="Inch" min="18" max="21">20</Value>

would have made the payload smaller.
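A quick way to check the size difference, and to see that nothing is lost in the compact form, is sketched below. The two fragments are the ones from this post; the helper `read_combined` is made up for illustration and assumes the min and max attributes share the UOM of the value.

```python
import xml.etree.ElementTree as ET

# Three separate elements, each repeating the UOM attribute.
SEPARATE = (
    '<Value UOM="Inch">20</Value>'
    '<RangeMin UOM="Inch">18</RangeMin>'
    '<RangeMax UOM="Inch">21</RangeMax>'
)
# One element carrying the value, the UOM and the tolerance range.
COMBINED = '<Value UOM="Inch" min="18" max="21">20</Value>'


def read_combined(fragment):
    """Read the value, UOM and range from the compact single-element form."""
    element = ET.fromstring(fragment)
    return {
        "value": float(element.text),
        "uom": element.get("UOM"),
        "min": float(element.get("min")),
        "max": float(element.get("max")),
    }


print(len(SEPARATE), len(COMBINED))  # character counts of the two encodings
print(read_combined(COMBINED))
```

The compact form also means one node instead of three for every toleranced value, which is what matters most for the number of generated pages.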

Anyway, even though papiNet has overtaken OAGIS in terms of storage requirements for VisualSchemas, there is another open standard with a single message that is probably the largest of all the standards. Because it’s very large, I haven’t published it yet. I need to think a bit more about how to reduce the storage requirement, perhaps by changing the current architecture of VisualSchemas to suit the type of XML Schema that shares some of the papiNet message characteristics.