June 3rd, 2009
Anyone familiar with the World Wide Web will recognize the simple hyperlink, consisting of a ‘hot-spot’ on a page that, when clicked, takes the user to a destination indicated in the target address embedded in the hyperlink code.
XML is not limited to use in displayed Web pages, but the linking concept remains essential. Whereas HTML is limited to simple one-way links, XLink allows developers to link two or more content ‘chunks’ together in such a way that the links can be traversed in either direction.
This requires that the source, as well as the target, be uniquely identifiable: if a two-way link is to be created, clear ‘labels’ are needed at both ends of the link. This in turn raises the need for coherent resource identification and naming conventions for all information content, which is a cornerstone of a solid information architecture and a scalable XML-centred system, as we will see.
While an Internet-accessible file can be identified and addressed by reference to its URI, XPointer allows a specific location or fragment within a file to be addressed. Using the XPath language, a pointer – either in the form of a specific fragment identifier, or expressed as a query that is resolved when addressed – allows parts of a document to be addressed by context or nature, rather than by means of an explicit static label.
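As a rough illustration of addressing a fragment rather than a whole file, the sketch below splits a URI into its resource and fragment parts and then looks up the element that carries the matching label. It only approximates the simplest form of pointer. The document, the `id` attribute convention, the URL and the `resolve_pointer` helper are all assumptions made for the example; a real XPointer processor determines identifiers from a DTD or schema and supports much richer expressions.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

# Hypothetical document: element names and id values are invented.
DOC = """<report>
  <section id="intro"><para>Opening remarks.</para></section>
  <section id="findings"><para>Main results.</para></section>
</report>"""


def resolve_pointer(uri, document_text):
    """Return (resource, element) for a URI whose fragment names an id.

    Assumes that an attribute literally called "id" acts as the label.
    Returns None for the element when no label matches.
    """
    resource, fragment = urldefrag(uri)
    root = ET.fromstring(document_text)
    for element in root.iter():
        if element.get("id") == fragment:
            return resource, element
    return resource, None


resource, target = resolve_pointer("http://example.com/report.xml#findings", DOC)
print(resource)
print(target.find("para").text)
```

The part before the `#` identifies the file; the part after it identifies a location inside that file.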
Posted in Uncategorized | 6 Comments »
May 27th, 2009
Unlike most other members of the XML family, XPath does not itself use XML’s basic syntax. It is a syntax for identifying specific parts of an XML document, known as ‘nodes’, and locating them by means of a path relative to a defined point in the document. XPath is not strictly an XML application in its own right, but is intended for use with XSLT and XPointer.
These two standards use this path to point to or process a particular part of a document that conforms to specified criteria. For example, an XPath expression could be constructed to identify every block of information contained within a particular element type, or all content contained within an element with a certain attribute value. Two other standards can be considered here, although technically speaking they are specific XML applications – SVG and MathML.
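The two kinds of XPath expression just described can be tried out with Python's standard `xml.etree.ElementTree` module, which supports a limited subset of XPath. This is only a sketch: the sample document and its element and attribute names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: names and values are invented for the example.
DOC = """<book>
  <chapter status="draft"><title>Linking</title><para>First.</para></chapter>
  <chapter status="final"><title>Paths</title><para>Second.</para><para>Third.</para></chapter>
</book>"""

root = ET.fromstring(DOC)

# Every block of content held in one element type, anywhere below the root.
all_paras = [p.text for p in root.findall(".//para")]

# Content inside an element that carries a particular attribute value.
final_titles = [t.text for t in root.findall(".//chapter[@status='final']/title")]

print(all_paras)     # ['First.', 'Second.', 'Third.']
print(final_titles)  # ['Paths']
```

Both paths are evaluated relative to `root`, which plays the role of the defined point in the document.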
Scalable Vector Graphics (SVG) is a standard that encapsulates graphics and describes them using an XML vocabulary. While initially this might seem a long-winded way of encoding graphics, the programmability and transformability of XML mean that graphics so described can be generated, manipulated and transformed in the same way as any other XML document. Further, the elements of a graphic can, like any other XML element, carry valuable semantic information to describe themselves.
As graphics are often used to represent either real-life objects or concepts (think of the graphics in the simplest workflow diagram), the potential for SVG starts to become apparent.
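To make the workflow-diagram idea concrete, here is a minimal sketch that generates an SVG graphic from a list of step names, again using Python's standard library. The `workflow_svg` helper, the step names and the layout numbers are assumptions for the example. Each box is wrapped in a group with its own `title` element, so the graphic carries a description of what each shape stands for.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # serialize SVG as the default namespace


def workflow_svg(steps):
    """Build one labelled box per workflow step, laid out left to right."""
    svg = ET.Element(f"{{{SVG_NS}}}svg",
                     {"width": str(140 * len(steps)), "height": "60"})
    for index, label in enumerate(steps):
        x = 10 + 140 * index
        group = ET.SubElement(svg, f"{{{SVG_NS}}}g", {"id": f"step-{index + 1}"})
        # The title element records what this group of shapes means.
        ET.SubElement(group, f"{{{SVG_NS}}}title").text = label
        ET.SubElement(group, f"{{{SVG_NS}}}rect",
                      {"x": str(x), "y": "10", "width": "120", "height": "40",
                       "fill": "none", "stroke": "black"})
        text = ET.SubElement(group, f"{{{SVG_NS}}}text",
                             {"x": str(x + 10), "y": "35"})
        text.text = label
    return svg


svg = workflow_svg(["Draft", "Review", "Publish"])
markup = ET.tostring(svg, encoding="unicode")
print(markup)
```

Because the result is ordinary XML, the same graphic can be queried or transformed with the tools used for any other XML document.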
May 25th, 2009
HTML has already introduced the idea of a base reference to allow a Web page developer to avoid repeating the full URL everywhere it is used, but instead use addresses relative to a defined ‘base’. The same concept is used with XML, allowing other related standards and constructs to use similar relative addressing methods.
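Relative addressing against a declared base can be sketched in a few lines of Python. The example reads an `xml:base` attribute and resolves relative references against it with the standard `urljoin` function. The `links` and `link` elements, the `href` attribute and the URLs are invented for illustration, and the sketch handles only a single base declared on the root element, not nested declarations.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the reserved XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Hypothetical document: element names, href attribute and URLs are invented.
DOC = """<links xml:base="http://example.com/docs/">
  <link href="intro.xml"/>
  <link href="chapters/one.xml"/>
  <link href="/index.xml"/>
</links>"""

root = ET.fromstring(DOC)
base = root.get(XML_BASE)

# Resolve each relative reference against the declared base.
resolved = [urljoin(base, link.get("href")) for link in root.findall("link")]

for address in resolved:
    print(address)
# http://example.com/docs/intro.xml
# http://example.com/docs/chapters/one.xml
# http://example.com/index.xml
```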
A logical XML document can be composed of any number of physical components. For example, a complete XML document might be broken down into, say, chapters, with each fragment being maintained and stored separately. At the same time, a schema might specify the overall structure. A single chapter, read alone by an XML-conformant tool and checked against the schema, would probably throw errors unless the tool was given some clue that the fragment is part of a larger whole.
It is necessary, therefore, that the different parts – or fragments – can be distinctly identified and related to each other independently of their context. The XML Fragment standard allows for different components to be organized or managed in separate environments. The standard therefore allows developers to manage the component parts of a document without fear of losing the logical connections between them.
May 23rd, 2009
As we saw earlier, labels can often be ambiguous when different people use the same term to signify different ideas. When processing a document, elements may well be checked against a schema. In a schema, each element’s content could additionally be validated against a datatype. Remember the example of: <title>
If the schema is expecting ‘Dr.’, ‘Ms.’, or ‘Prof.’, but is instead presented with a book title, the processor would probably throw an error. The XML Namespace standard is a simple but powerful mechanism that allows us to use the same term in different contexts.
By declaring that XML elements belong to a specific vocabulary and identifying them as such through a namespace declaration, we eliminate any possible ambiguity.
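A small sketch shows the `title` ambiguity being resolved. Two elements share the local name `title` but belong to different vocabularies, so a program can ask for exactly the one it means. The namespace URIs, the prefixes and the sample values are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: namespace URIs and values are invented.
DOC = """<record xmlns:p="http://example.com/ns/person"
        xmlns:b="http://example.com/ns/book">
  <p:title>Dr.</p:title>
  <b:title>An Example Book</b:title>
</record>"""

# Prefixes used in queries are local to this code; only the URIs must match.
NS = {
    "person": "http://example.com/ns/person",
    "book": "http://example.com/ns/book",
}

root = ET.fromstring(DOC)
honorific = root.find("person:title", NS).text
book_title = root.find("book:title", NS).text

print(honorific)   # Dr.
print(book_title)  # An Example Book
```

The prefixes in the document (`p`, `b`) and in the query (`person`, `book`) differ deliberately: it is the namespace URI, not the prefix, that identifies the vocabulary.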
The Infoset – or XML Information Set – can be thought of as a normalized inventory of, or reference model for, the parts of an XML document. It is intended mainly for developers who need a conforming XML processor to expose a data model and related information – the information items that make up an XML document’s Infoset – for use and manipulation by an XML-aware application. It should not be confused with the ‘Document Object Model’