XML Tutorial

XML declarations

Syntax

<?xml version='1.0' encoding='character encoding' standalone='yes|no'?>

XML documents can contain an XML declaration that, if present, must be the first construct in the document. An XML declaration is made up of as many as three name/value pairs, syntactically identical to attributes. The three attributes are a mandatory version attribute and optional encoding and standalone attributes. The order of these attributes within an XML declaration is fixed: version first, then encoding, then standalone.

The XML declaration begins with the character sequence <?xml and ends with the character sequence ?>. Note that although this syntax is identical to that for processing instructions, the XML declaration is not considered to be a processing instruction. All XML declarations have a version attribute; for XML 1.0 documents its value must be 1.0.
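The fixed attribute order can be made concrete with a short Python sketch. The regular expression below only approximates the declaration grammar of the XML 1.0 recommendation, and the names _XML_DECL and parse_xml_declaration are invented for this illustration; a real XML parser should be used for anything beyond experimentation.

```python
import re

# Hypothetical helper: approximates the XML declaration syntax described above.
# version is mandatory; encoding and standalone are optional but must appear
# in exactly this order. This sketch accepts only version 1.0.
_XML_DECL = re.compile(
    r"""<\?xml
        \s+version\s*=\s*(?P<q1>['"])(?P<version>1\.0)(?P=q1)
        (?:\s+encoding\s*=\s*(?P<q2>['"])(?P<encoding>[A-Za-z][A-Za-z0-9._-]*)(?P=q2))?
        (?:\s+standalone\s*=\s*(?P<q3>['"])(?P<standalone>yes|no)(?P=q3))?
        \s*\?>""",
    re.VERBOSE,
)


def parse_xml_declaration(text):
    """Return the three attribute values as a dict, or None if text does not
    start with a declaration in the expected form."""
    match = _XML_DECL.match(text)
    if match is None:
        return None
    return {name: match.group(name) for name in ("version", "encoding", "standalone")}


ok = parse_xml_declaration("<?xml version='1.0' encoding='UTF-8' standalone='yes'?>")
wrong_order = parse_xml_declaration("<?xml encoding='UTF-8' version='1.0'?>")
```

Under these assumptions, ok holds all three values, while wrong_order is None because the encoding attribute precedes the version attribute.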

The character encoding used for the document content can be specified through the encoding attribute. XML documents are inherently Unicode, even when stored in a non-Unicode character encoding. The XML recommendation mentions several possible values for the encoding attribute. For example, UTF-8, UTF-16, ISO-10646-UCS-2, and ISO-10646-UCS-4 all refer to Unicode/ISO-10646 encodings, whereas ISO-8859-1 and ISO-8859-2 refer to 8-bit Latin character encodings. Encodings for other character sets, including Chinese, Japanese, and Korean characters, are also supported. It is recommended that encodings be referred to using the names registered with the Internet Assigned Numbers Authority (IANA).

All XML processors are required to be able to process documents encoded using UTF-8 or UTF-16, with or without an XML declaration. A UTF-16 document is recognized by the Unicode byte-order mark it must begin with; a UTF-8 document may begin with a byte-order mark but need not, because a document with neither a byte-order mark nor an encoding attribute is read as UTF-8. For any other encoding, an XML declaration with an encoding attribute is mandatory unless the encoding is supplied externally, for example by a transport protocol such as HTTP. In practice, this means that documents encoded using US-ASCII can also omit the XML declaration, because every US-ASCII document is also a valid UTF-8 document.
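As a rough illustration of these encoding rules, the Python sketch below checks for a byte-order mark and then parses the same text stored in two different encodings. The function name sniff_bom and the sample documents are assumptions made for this example, and the check covers only the UTF-8 and UTF-16 marks discussed here, not UTF-32.

```python
import codecs
import xml.etree.ElementTree as ET


def sniff_bom(data):
    """Return the encoding implied by a leading byte-order mark, or None.

    Hypothetical helper covering only UTF-8 and UTF-16; a fuller check
    would also have to consider the UTF-32 marks.
    """
    if data.startswith(codecs.BOM_UTF8):
        return "UTF-8"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "UTF-16BE"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "UTF-16LE"
    return None


# The same content stored in two encodings. The ISO-8859-1 copy needs an
# encoding attribute; the UTF-8 copy can omit the declaration entirely.
latin1_doc = "<?xml version='1.0' encoding='ISO-8859-1'?><name>caf\u00e9</name>".encode("iso-8859-1")
utf8_doc = "<name>caf\u00e9</name>".encode("utf-8")

# After parsing, both yield the same Unicode string.
latin1_text = ET.fromstring(latin1_doc).text
utf8_text = ET.fromstring(utf8_doc).text
```

Both parsed values compare equal, which is one way to see that the stored encoding is a serialization detail while the document content itself is Unicode.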

Only one encoding can be used for an entire XML document. It is not possible to “redefine” the encoding part of the way through. If data in different encodings needs to be represented, then external entities should be used, each of which may declare its own encoding.

If an XML document can be read with no reference to external sources, it is said to be a stand-alone document. Such documents can be annotated with a standalone attribute with a value of yes in the XML declaration. If an XML document requires external sources to be resolved to parse correctly and/or to construct the entire data tree (for example, a document with references to external general entities), then it is not a stand-alone document. Such documents may be marked standalone='no', but because this is the default, such an annotation rarely appears in XML documents.
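To see how a parser reports the standalone attribute, the following Python sketch uses the XmlDeclHandler callback of the standard xml.parsers.expat module. The wrapper function read_declaration is an assumption of this example, not part of any library.

```python
import xml.parsers.expat


def read_declaration(data):
    """Return the XML declaration attributes of a document as a dict.

    Hypothetical wrapper: the dict is empty when the document has no
    XML declaration, because the handler is then never called.
    """
    found = {}

    def on_declaration(version, encoding, standalone):
        found["version"] = version
        found["encoding"] = encoding
        # expat reports 1 for 'yes', 0 for 'no', and -1 when the
        # standalone attribute is omitted.
        found["standalone"] = {1: "yes", 0: "no"}.get(standalone)

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_declaration
    parser.Parse(data, True)
    return found


marked = read_declaration(b"<?xml version='1.0' standalone='yes'?><a/>")
unmarked = read_declaration(b"<?xml version='1.0' encoding='UTF-8'?><a/>")
```

Here marked["standalone"] is "yes", while unmarked["standalone"] is None, reflecting that the attribute was simply left out rather than set to no.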

Examples of XML declarations

<?xml version='1.0' ?>
<?xml version='1.0' encoding='US-ASCII' ?>
<?xml version='1.0' encoding='US-ASCII' standalone='yes' ?>
<?xml version='1.0' encoding='UTF-8' ?>
<?xml version='1.0' encoding='UTF-16' ?>
<?xml version='1.0' encoding='ISO-10646-UCS-2' ?>
<?xml version='1.0' encoding='ISO-8859-1' ?>
<?xml version='1.0' encoding='Shift_JIS' ?>

XML declaration priority

If you use an XML declaration, it must come first. The XML declaration simply declares that a document is an XML document and describes its version. It is optional, but if you use it (and by convention you should, unless you are working with a document fragment for inclusion in another XML document), it must, unequivocally, be the first statement in the XML document, with nothing before it, not even white space or a comment:

<?xml version='1.0'?>
<anElement>content</anElement>

The XML declaration is part of the document prolog, as you will discover later, and not part of the document instance (the main body of the document that holds the data you are working with). It has no bearing on the ordering or nesting of elements and is, in fact, not an element itself. Therefore, it is not subject to the rule that dictates that a root element must contain all other elements. This is not an exception to any rules; it is part of the rules. Because an XML declaration does not qualify as an element, it is not subject to the rules to which elements must adhere. It is also not a processing instruction, although it looks like one. A processing instruction hands off instructions to another application. An XML declaration does not do that; it is addressed to the XML processor itself.
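Both points, that the declaration must come first and that it is not a processing instruction, can be checked with a short Python sketch built on the standard xml.parsers.expat module. The helper names is_well_formed and collect_pi_targets, and the processing instruction target my-app, are invented for this illustration.

```python
import xml.parsers.expat


def is_well_formed(data):
    """Return True if expat parses the document without error."""
    try:
        xml.parsers.expat.ParserCreate().Parse(data, True)
        return True
    except xml.parsers.expat.ExpatError:
        return False


def collect_pi_targets(data):
    """Return the targets of all processing instructions the parser reports."""
    targets = []
    parser = xml.parsers.expat.ParserCreate()
    parser.ProcessingInstructionHandler = lambda target, pi_data: targets.append(target)
    parser.Parse(data, True)
    return targets


# A declaration at the very start is accepted; one preceded by white space
# or by a comment is rejected.
first = is_well_formed(b"<?xml version='1.0'?><a/>")
after_space = is_well_formed(b" <?xml version='1.0'?><a/>")
after_comment = is_well_formed(b"<!-- note --><?xml version='1.0'?><a/>")

# Only the genuine processing instruction is reported, not the declaration.
targets = collect_pi_targets(b"<?xml version='1.0'?><?my-app refresh='daily'?><a/>")
```

In this sketch only first is True, and targets contains my-app but no entry for the XML declaration, since the parser handles the declaration separately from processing instructions.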