Chapter 11.6.1 – Basic XML Syntax | Introduction to Programming Using Java


11.6.1 – Basic XML Syntax

 


An XML document looks a lot like an HTML document (see Subsection 6.2.3). HTML is not itself an XML language, since it does not follow all the strict XML syntax rules, but the basic ideas are similar. Here is a short, well-formed XML document:

 

[Example listing not preserved in this copy: an XML document whose root element, <simplepaint version="1.0">, contains <curve> elements, each holding a <symmetric> element and a series of <point> elements.]

 

The first line, which is optional, merely identifies this as an XML document. This line can also specify other information, such as the character encoding that was used to encode the characters in the document into binary form. If this document had an associated DTD, it would be specified in a “DOCTYPE” directive on the next line of the file.
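To illustrate why the encoding named on the first line matters, here is a small sketch using the standard DOM parser in javax.xml.parsers. The class name XmlDeclarationDemo, the method rootText, and the <note> element are made up for this illustration. The document is deliberately stored as ISO-8859-1 bytes, and the declaration tells the parser how to turn those bytes back into characters.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class XmlDeclarationDemo {

    // Parses raw bytes as XML and returns the text inside the root element.
    // The parser reads the encoding from the XML declaration, if there is one.
    static String rootText(byte[] xmlBytes) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical one-element document containing a non-ASCII character.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                   + "<note>caf\u00e9</note>";
        // Encode the document into binary form using the declared encoding.
        byte[] latin1 = xml.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(rootText(latin1));
    }
}
```

If the bytes and the declared encoding disagree, the parser may report an error or decode the wrong characters, which is why the declaration should match the encoding actually used to save the file.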

Aside from the first line, the document is made up of elements, attributes, and textual content. An element starts with a tag, such as <curve>, and ends with a matching end-tag, such as </curve>. Between the tag and end-tag is the content of the element, which can consist of text and nested elements. (In the example, the only textual content is the true or false in the <symmetric> elements.) If an element has no content, then the opening tag and end-tag can be combined into a single empty tag, such as <point x='83' y='96'/>, which is an abbreviation for <point x='83' y='96'></point>. A tag can include attributes such as the x and y in <point x='83' y='96'/> or the version in <simplepaint version="1.0">. A document can also include a few other things, such as comments, that I will not discuss here.
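As a minimal sketch of how a Java program might see elements, attributes, and textual content, the following code parses a small document with the standard DOM parser and summarizes it. The class name SimplePaintReader, the method names, and the SAMPLE data are assumptions made up for illustration; the sample only imitates the tags mentioned above. One point is written as an empty tag and the other in the long form, and the program treats them the same way.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SimplePaintReader {

    // Hypothetical sample data, modeled on the tags named in the text.
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>\n" +
        "<simplepaint version=\"1.0\">\n" +
        "   <curve>\n" +
        "      <symmetric>false</symmetric>\n" +
        "      <point x='83' y='96'/>\n" +
        "      <point x='116' y='149'></point>\n" +
        "   </curve>\n" +
        "</simplepaint>\n";

    // Parses a string as XML, wrapping any failure in an unchecked exception.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Builds a one-line summary of the root element and of each <curve>.
    // Assumes every <curve> contains exactly one <symmetric> element.
    static String describe(String xml) {
        Element root = parse(xml).getDocumentElement();
        StringBuilder out = new StringBuilder();
        out.append(root.getTagName()).append(" version ")
           .append(root.getAttribute("version")).append('\n');
        NodeList curves = root.getElementsByTagName("curve");
        for (int i = 0; i < curves.getLength(); i++) {
            Element curve = (Element) curves.item(i);
            Element symmetric =
                (Element) curve.getElementsByTagName("symmetric").item(0);
            // Textual content of the <symmetric> element.
            out.append("curve symmetric=")
               .append(symmetric.getTextContent().trim());
            NodeList points = curve.getElementsByTagName("point");
            for (int j = 0; j < points.getLength(); j++) {
                Element p = (Element) points.item(j);
                // Attribute values are always delivered as strings.
                out.append(" (").append(p.getAttribute("x")).append(',')
                   .append(p.getAttribute("y")).append(')');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(SAMPLE));
    }
}
```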

The basic structure should look familiar to anyone who knows HTML. The most striking difference is that in XML, you get to choose the tags. Whereas HTML comes with a fixed, finite set of tags, with XML you can make up meaningful tag names that are appropriate to your application and that describe the data that is being represented. (For an XML document that uses a DTD, it’s the author of the DTD who gets to choose the tag names.)

Every well-formed XML document follows a strict syntax. Here are some of the most important syntax rules:

Tag names and attribute names in XML are case sensitive. A name must begin with a letter and can contain letters, digits and certain other characters.

Spaces and ends-of-line are significant only in textual content.

Every tag must either be an empty tag or have a matching end-tag. By “matching” here, I mean that elements must be properly nested; if a tag is inside some element, then the matching end-tag must also be inside that element.

 


A document must have a root element, which contains all the other elements. The root element in the above example has tag name simplepaint.

Every attribute must have a value, and that value must be enclosed in quotation marks; either single quotes or double quotes can be used for this.

The special characters < and &, if they appear in attribute values or textual content, must be written as &lt; and &amp;. “&lt;” and “&amp;” are examples of entities. The entities &gt;, &quot;, and &apos; are also defined, representing >, double quote, and single quote.

(Additional entities can be defined in a DTD.)
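The rules above can be tried out with a short program. This is only a sketch: the class name WellFormedCheck and the method isWellFormed are made up for illustration, and the test strings are hypothetical. It relies on the fact that the standard Java DOM parser throws a SAXException when it is given a document that is not well-formed.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {

    // Returns true if the parser accepts the string as a well-formed document.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler keeps the parser from printing its own error
            // messages; fatal errors are still thrown as exceptions.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;   // the document broke a syntax rule
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] tests = {
            "<a><b/></a>",                 // properly nested
            "<Curve></curve>",             // names are case sensitive
            "<a><b></a></b>",              // improperly nested
            "<a/><b/>",                    // more than one root element
            "<point x=83/>",               // attribute value not quoted
            "<point x/>",                  // attribute without a value
            "<p x='1' y=\"2\"/>",          // either kind of quote is fine
            "<t>1 < 2 & 3</t>",            // raw < and & in text
            "<t>1 &lt; 2 &amp; 3</t>"      // the same text, using entities
        };
        for (String xml : tests) {
            System.out.println(isWellFormed(xml) + "  " + xml);
        }
    }
}
```

Only the first, seventh, and last test strings are accepted; each of the others violates one of the rules listed above.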

While this description will not enable you to understand everything that you might encounter in XML documents, it should allow you to design well-formed XML documents to represent data structures used in Java programs.
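For example, a program that writes its own XML output has to replace the special characters with entities as it goes. The following is a rough sketch of one way to do that, not a complete XML writer. The class name XmlEscape, the methods escape and curveToXml, and the name attribute on <curve> are all hypothetical, introduced only for this illustration.

```java
class XmlEscape {

    // Replaces the five special characters with their predefined entities,
    // so the result is safe in textual content and in quoted attribute values.
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '<':  out.append("&lt;");   break;
                case '&':  out.append("&amp;");  break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // Writes a named list of (x,y) points as a <curve> element.
    // Each row of the points array is assumed to hold exactly two numbers.
    static String curveToXml(String name, int[][] points) {
        StringBuilder out = new StringBuilder();
        out.append("<curve name=\"").append(escape(name)).append("\">\n");
        for (int[] p : points) {
            out.append("   <point x=\"").append(p[0])
               .append("\" y=\"").append(p[1]).append("\"/>\n");
        }
        out.append("</curve>\n");
        return out.toString();
    }

    public static void main(String[] args) {
        int[][] points = { {83, 96}, {116, 149} };
        System.out.print(curveToXml("A & B <draft>", points));
    }
}
```

The numeric attribute values need no escaping, since digits and a minus sign are never special characters; only the free-form name is passed through escape.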

 
