XML Tutorial

XML prohibited character literals


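The following characters cannot always appear literally in XML. Each one has a built-in entity that stands in for it:

  • &lt; (less-than, <)
  • &amp; (ampersand, &)
  • &gt; (greater-than, >)
  • &apos; (apostrophe, ')
  • &quot; (quote, ")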

Certain characters cause problems when used as element content or inside attribute values. Specifically, the less-than character cannot appear literally either in element content or inside an attribute value, because it is interpreted as the start of a tag. The same restriction applies to the ampersand, because it marks the start of an entity reference. If a less-than or ampersand character is needed in element content or inside an attribute value, a character entity must be used.

Entities begin with an ampersand (&) and end with a semicolon (;). The name of the entity appears between the two. The entity for the less-than character is &lt; and the entity for the ampersand is &amp;.

The apostrophe (') and quote (") characters may also need to be encoded as entities when used in attribute values. If the delimiter for the attribute value is the apostrophe, then the quote character is legal but the apostrophe character is not, because it would signal the end of the attribute value. If an apostrophe is needed, the character entity &apos; must be used. Similarly, if a quote character is needed in an attribute value that is delimited by quotes, then the character entity &quot; must be used.
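The escaping rules above can be sketched in Python with the standard library's xml.sax.saxutils.escape function. This is only an illustration: the helper names escape_text and escape_attribute and the sample strings are assumptions made for this example and are not part of XML itself.

```python
from xml.sax.saxutils import escape

# escape() always replaces &, < and >; the quote characters must be requested.
QUOTE_ENTITIES = {"'": "&apos;", '"': "&quot;"}


def escape_text(value: str) -> str:
    """Escape a string for use as element content (hypothetical helper)."""
    return escape(value)


def escape_attribute(value: str) -> str:
    """Escape a string for an attribute value with either delimiter (hypothetical helper)."""
    return escape(value, QUOTE_ENTITIES)


text = escape_text("Ben & Jerry <ice cream>")
attr = escape_attribute("'Hi' & \"Bye\"")
print(text)
print(attr)
```

Escaping both quote characters in attribute values is a conservative choice: the result is safe whichever delimiter the surrounding markup uses.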

A fifth built-in entity, &gt;, is also provided for the greater-than character. Although, strictly speaking, the greater-than character seldom needs to be "escaped," many people prefer to "escape" it for consistency with the less-than character.
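The difference between the two characters can be checked with a small sketch using Python's xml.etree.ElementTree parser. The helper name parses and the <expr> element are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def parses(xml: str) -> bool:
    """Return True if the string is accepted as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(xml)
        return True
    except ET.ParseError:
        return False


unescaped_gt = parses("<expr>a > b</expr>")     # literal greater-than in content
escaped_gt = parses("<expr>a &gt; b</expr>")    # escaped greater-than
unescaped_lt = parses("<expr>a < b</expr>")     # literal less-than in content
print(unescaped_gt, escaped_gt, unescaped_lt)
```

With this parser, both greater-than forms are accepted, while the literal less-than is rejected as not well-formed.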

Example of a built-in entity in element content:

<name>Cherry Garcia</name>
<manufacturer>Ben &amp; Jerry</manufacturer>

Example of a built-in entity in an attribute value:

<sayhello word='&apos;Hi&apos;' />

Use of the built-in entity &apos; inside attribute content
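On the reading side, a parser turns the built-in entities back into the characters they stand for. The following sketch uses Python's xml.etree.ElementTree. The <greeting> element name is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# &amp; in element content is decoded to a plain ampersand.
manufacturer = ET.fromstring("<manufacturer>Ben &amp; Jerry</manufacturer>")

# &apos; inside an apostrophe-delimited attribute value is decoded to an apostrophe.
greeting = ET.fromstring("<greeting word='&apos;Hi&apos;' />")

# A literal quote is legal when the delimiter is the apostrophe.
quoted = ET.fromstring("<greeting word='\"Hi\"' />")

print(manufacturer.text)
print(greeting.get("word"))
print(quoted.get("word"))
```

The application sees only the decoded characters, so the choice between a literal character and its entity does not change the parsed value.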

CDATA stands for character data. XML also provides processing instructions.