The Java(TM) Web Services Tutorial

Defining Attributes and Entities in the DTD

The DTD you've defined so far is fine for use with the nonvalidating parser. It tells where text is expected and where it isn't, which is all the nonvalidating parser is going to pay attention to. But for use with the validating parser, the DTD needs to specify the valid attributes for the different elements. You'll do that in this section, after which you'll define one internal entity and one external entity that you can reference in your XML file.

Defining Attributes in the DTD

Let's start by defining the attributes for the elements in the slide presentation.


Note: The XML written in this section is contained in slideshow1b.dtd. (The browsable version is slideshow1b-dtd.html.)

Add the text highlighted below to define the attributes for the slideshow element:

<!ELEMENT slideshow (slide+)>
<!ATTLIST slideshow 
    title    CDATA    #REQUIRED
    date     CDATA    #IMPLIED
    author   CDATA    "unknown"
>
<!ELEMENT slide (title, item*)>
 

The DTD tag ATTLIST begins the series of attribute definitions. The name that follows ATTLIST specifies the element for which the attributes are being defined. In this case, the element is the slideshow element. (Note once again the lack of hierarchy in DTD specifications.)

Each attribute is defined by a series of three space-separated values. Commas and other separators are not allowed, so formatting the definitions as shown above is helpful for readability. The first value in each line is the name of the attribute: title, date, or author, in this case. The second value indicates the type of the data: CDATA is character data, which is to say an ordinary text string. (A literal left-angle bracket (<) is not allowed in an attribute value; write it as &lt; instead.) Table 6-3 presents the valid choices for the attribute type.

Table 6-3 Attribute Types

  Attribute Type            Specifies...
  (value1 | value2 | ...)   A list of values separated by vertical bars. (Example below)
  CDATA                     "Unparsed character data". (For normal people, a text string.)
  ID                        A name that no other ID attribute shares.
  IDREF                     A reference to an ID defined elsewhere in the document.
  IDREFS                    A space-separated list containing one or more ID references.
  ENTITY                    The name of an unparsed entity declared in the DTD.
  ENTITIES                  A space-separated list of entity names.
  NMTOKEN                   A name token composed of letters, digits, hyphens, underscores, periods, and colons. (Unlike an XML name, it can start with any of these characters.)
  NMTOKENS                  A space-separated list of name tokens.
  NOTATION                  The name of a DTD-specified notation, which describes a non-XML data format, such as those used for image files.*

*This is a rapidly obsolescing specification which will be discussed at greater length towards the end of this section.

When the attribute type consists of a parenthesized list of choices separated by vertical bars, the attribute must use one of the specified values. For an example, add the text highlighted below to the DTD:

<!ELEMENT slide (title, item*)>
<!ATTLIST slide 
    type   (tech | exec | all) #IMPLIED
>
<!ELEMENT title (#PCDATA)>
<!ELEMENT item (#PCDATA | item)* >
 

This specification says that the slide element's type attribute must be given as type="tech", type="exec", or type="all". No other values are acceptable. (DTD-aware XML editors can use such specifications to present a pop-up list of choices.)
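
To see the enumerated type enforced, you can run a small check like the one below. It is only a sketch under a few assumptions: the DTD is placed in the document's internal subset so that no separate file is needed, and the class name SlideTypeCheck and its helper methods are invented for this illustration. It uses the standard JAXP DocumentBuilderFactory with validation turned on and collects whatever validity errors the parser reports.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class SlideTypeCheck {
    // Hypothetical helper: a tiny slide show whose DTD sits in the internal subset.
    static String document(String type) {
        return "<?xml version=\"1.0\"?>\n"
             + "<!DOCTYPE slideshow [\n"
             + "  <!ELEMENT slideshow (slide+)>\n"
             + "  <!ELEMENT slide (title, item*)>\n"
             + "  <!ATTLIST slide type (tech | exec | all) #IMPLIED>\n"
             + "  <!ELEMENT title (#PCDATA)>\n"
             + "  <!ELEMENT item (#PCDATA | item)*>\n"
             + "]>\n"
             + "<slideshow><slide type=\"" + type + "\">"
             + "<title>Overview</title></slide></slideshow>";
    }

    // Returns the validity errors reported for a slide with the given type value.
    static List<String> validityErrors(String type) {
        final List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // ask for a validating parser
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored in this sketch */ }
                public void error(SAXParseException e) { errors.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(document(type))));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println("tech:  " + validityErrors("tech"));
        System.out.println("sales: " + validityErrors("sales"));
    }
}
```

The first line should show an empty list, and the second should show at least one message. The exact wording of that message depends on the parser implementation.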

The last entry in the attribute specification determines the attribute's default value, if any, and tells whether or not the attribute is required. Table 6-4 shows the possible choices.

Table 6-4 Attribute-Specification Parameters

  Specification          Specifies...
  #REQUIRED              The attribute value must be specified in the document.
  #IMPLIED               The value need not be specified in the document. If it isn't, the parser supplies no default; the application decides what value, if any, to use.
  "defaultValue"         The default value to use, if a value is not specified in the document.
  #FIXED "fixedValue"    The value to use. If the document specifies any value at all, it must be the same.
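
The difference between a declared default and #IMPLIED is easiest to see from the application's side. The sketch below is an illustration under stated assumptions, not part of the tutorial's sample files: the class name AttributeDefaults is invented, the slideshow element is declared as ANY to keep the document short, and the DTD is placed in the internal subset. It parses a slideshow element that specifies only the title attribute and then asks the DOM what it sees.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class AttributeDefaults {
    // Assumption: simplified content model (ANY) so the example stays small.
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE slideshow [\n"
        + "  <!ELEMENT slideshow ANY>\n"
        + "  <!ATTLIST slideshow\n"
        + "      title    CDATA    #REQUIRED\n"
        + "      date     CDATA    #IMPLIED\n"
        + "      author   CDATA    \"unknown\"\n"
        + "  >\n"
        + "]>\n"
        + "<slideshow title=\"Sample Slide Show\"/>";

    // Parses the document above and returns its root element.
    static Element root() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element slideshow = root();
        // Specified in the document.
        System.out.println("title:  " + slideshow.getAttribute("title"));
        // Not specified: the parser fills in the declared default.
        System.out.println("author: " + slideshow.getAttribute("author"));
        // Not specified and #IMPLIED: the attribute is simply absent.
        System.out.println("has date? " + slideshow.hasAttribute("date"));
        // A defaulted attribute is flagged as not specified in the document.
        System.out.println("author specified? "
                + slideshow.getAttributeNode("author").getSpecified());
    }
}
```

The author attribute comes back as "unknown" even though the document never mentions it, while date is absent, leaving it to the application to choose a value.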

Defining Entities in the DTD

So far, you've seen predefined entities like &amp; and you've seen that an attribute can reference an entity. It's time now for you to learn how to define entities of your own.


Note: The XML defined here is contained in slideSample06.xml. The output is shown in Echo09-06.txt. (The browsable versions are slideSample06-xml.html and Echo09-06.html.)

Add the text highlighted below to the DOCTYPE tag in your XML file:

<!DOCTYPE slideshow SYSTEM "slideshow.dtd" [
  <!ENTITY product  "WonderWidget">
  <!ENTITY products "WonderWidgets">
]>
 

The ENTITY tag name says that you are defining an entity. Next comes the name of the entity and its definition. In this case, you are defining an entity named "product" that will take the place of the product name. Later, when the product name changes (as it most certainly will), you will only have to change the name in one place, and all your slides will reflect the new value.

The last part is the substitution string that replaces the entity name whenever it is referenced in the XML document. The substitution string is defined in quotes, which are not included when the text is inserted into the document.

Just for good measure, we defined two versions, one singular and one plural, so that when the marketing mavens come up with "Wally" for a product name, you will be prepared to enter the plural as "Wallies" and have it substituted correctly.


Note: Truth be told, this is the kind of thing that really belongs in an external DTD. That way, all your documents can reference the new name when it changes. But, hey, this is an example...

Now that you have the entities defined, the next step is to reference them in the slide show. Make the changes highlighted below to do that:

<slideshow 
  title="&product; Slide Show" 
  ...
 
  <!-- TITLE SLIDE -->
  <slide type="all">
    <title>Wake up to &products;!</title>
  </slide>
 
   <!-- OVERVIEW -->
  <slide type="all">
    <title>Overview</title>
    <item>Why <em>&products;</em> are great</item>
    <item/>
    <item>Who <em>buys</em> &products;</item>
  </slide>
 

The points to notice here are that entities you define are referenced with the same syntax (&entityName;) that you use for predefined entities, and that the entity can be referenced in an attribute value as well as in an element's contents.

Echoing the Entity References

When you run the Echo program on this version of the file, here is the kind of thing you see:

ELEMENT:        <title>
CHARS:        Wake up to WonderWidgets!
END_ELM:        </title>
 

Note that the product name has been substituted for the entity reference.
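
If you want to confirm the substitution without the full Echo program, a stripped-down SAX handler is enough. The following is a minimal sketch, not the tutorial's Echo code: the class name EntityEcho is invented, and the SYSTEM "slideshow.dtd" reference is left out of the DOCTYPE so that the example does not depend on any file. It records the slideshow element's title attribute and the characters inside the title element.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class EntityEcho {
    // Assumption: entities declared in the internal subset only, no external DTD.
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE slideshow [\n"
        + "  <!ENTITY product  \"WonderWidget\">\n"
        + "  <!ENTITY products \"WonderWidgets\">\n"
        + "]>\n"
        + "<slideshow title=\"&product; Slide Show\">"
        + "<slide type=\"all\"><title>Wake up to &products;!</title></slide>"
        + "</slideshow>";

    // Returns { value of slideshow's title attribute, text of the title element }.
    static String[] echo() {
        final String[] result = new String[2];
        final StringBuilder text = new StringBuilder();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(XML)),
                new DefaultHandler() {
                    private boolean inTitle;

                    @Override
                    public void startElement(String uri, String local,
                                             String qName, Attributes atts) {
                        if (qName.equals("slideshow")) {
                            result[0] = atts.getValue("title");
                        }
                        inTitle = qName.equals("title");
                    }

                    @Override
                    public void endElement(String uri, String local, String qName) {
                        inTitle = false;
                    }

                    @Override
                    public void characters(char[] ch, int start, int length) {
                        // The parser may deliver text in several chunks, so append.
                        if (inTitle) {
                            text.append(ch, start, length);
                        }
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        result[1] = text.toString();
        return result;
    }

    public static void main(String[] args) {
        String[] echoed = echo();
        System.out.println("ATTR:  " + echoed[0]);
        System.out.println("CHARS: " + echoed[1]);
    }
}
```

The handler never sees &product; or &products; as such. By the time the attribute value and the character data reach it, the parser has already replaced each reference with its substitution string.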

Additional Useful Entities

Here are several other examples for entity definitions that you might find useful when you write an XML document:

<!ENTITY ldquo  "&#8220;"> <!-- Left Double Quote --> 
<!ENTITY rdquo  "&#8221;"> <!-- Right Double Quote -->
<!ENTITY trade  "&#8482;"> <!-- Trademark Symbol (TM) -->
<!ENTITY rtrade "&#174;"> <!-- Registered Trademark (R) -->
<!ENTITY copyr  "&#169;"> <!-- Copyright Symbol --> 
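
A numeric character reference such as &#169; names a Unicode code point, so it is worth checking which character a given number actually produces before you put it in an entity definition. The sketch below is one way to do that, with an invented class name (CharRefCheck) and helper method: it wraps a reference in a throwaway element, parses it, and reports the code point the parser delivers.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

class CharRefCheck {
    // Parses <c>reference</c> and returns the first code point of the resulting text.
    static int codePointOf(String reference) {
        try {
            String xml = "<c>" + reference + "</c>";
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement()
                    .getTextContent()
                    .codePointAt(0);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] references = { "&#8220;", "&#8221;", "&#8482;", "&#174;", "&#169;" };
        for (String reference : references) {
            // Print each reference next to the Unicode code point it resolves to.
            System.out.printf("%-8s U+%04X%n", reference, codePointOf(reference));
        }
    }
}
```

Numbers in the range 128 to 159 deserve particular care. In Unicode they are control characters, even though some Windows code pages use those positions for curly quotes and the trademark sign.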
 

Referencing External Entities

You can also use the SYSTEM or PUBLIC identifier to name an entity that is defined in an external file. You'll do that now.


Note: The XML defined here is contained in slideSample07.xml and in copyright.xml. The output is shown in Echo09-07.txt. (The browsable versions are slideSample07-xml.html, copyright-xml.html and Echo09-07.html.)

To reference an external entity, add the text highlighted below to the DOCTYPE statement in your XML file:

<!DOCTYPE slideshow SYSTEM "slideshow.dtd" [
  <!ENTITY product  "WonderWidget">
  <!ENTITY products "WonderWidgets">
  <!ENTITY copyright SYSTEM "copyright.xml">
]>
 

This definition references a copyright message contained in a file named copyright.xml. Create that file and put some interesting text in it, perhaps something like this:

  <!--  A SAMPLE copyright  -->
 
This is the standard copyright message that our lawyers
make us put everywhere so we don't have to shell out a
million bucks every time someone spills hot coffee in their
lap...
 

Finally, add the text highlighted below to your slideSample.xml file to reference the external entity:

<!-- TITLE SLIDE -->
  ...
</slide>
 
<!-- COPYRIGHT SLIDE -->
<slide type="all">
  <item>&copyright;</item>
</slide>
 

You could also use an external entity declaration to access a servlet that produces the current date using a definition something like this:

<!ENTITY currentDate SYSTEM
  "http://www.example.com/servlet/CurrentDate?fmt=dd-MMM-yyyy"> 
 

You would then reference that entity the same as any other entity:

  Today's date is &currentDate;.
 

Echoing the External Entity

When you run the Echo program on your latest version of the slide presentation, here is what you see:

...
END_ELM: </slide>
ELEMENT: <slide
  ATTR: type        "all"
>
  ELEMENT: <item>
  CHARS: 
This is the standard copyright message that our lawyers
make us put everywhere so we don't have to shell out a
million bucks every time someone spills hot coffee in their
lap...
  END_ELM: </item>
END_ELM: </slide>
...
 

Note that the newline that follows the comment in the file is echoed as a character, but the comment itself is ignored. That is why the copyright message appears to start on the line after the CHARS: label instead of immediately after it: the first character echoed is actually the newline that follows the comment.
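
You can reproduce this behavior without creating any files by handing the parser the external entity's text yourself. The sketch below makes several assumptions for the sake of a self-contained example: the class name ExternalEntityEcho is invented, the copyright text is shortened, and a SAX entity resolver stands in for the copyright.xml file by returning the text from a string whenever the parser asks for a system identifier ending in copyright.xml.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class ExternalEntityEcho {
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE slideshow [\n"
        + "  <!ENTITY copyright SYSTEM \"copyright.xml\">\n"
        + "]>\n"
        + "<slideshow><slide type=\"all\"><item>&copyright;</item></slide></slideshow>";

    // Stand-in for the contents of copyright.xml (shortened for this sketch).
    static final String COPYRIGHT =
          "<!--  A SAMPLE copyright  -->\n"
        + "This is the standard copyright message.";

    // Returns all character data the parser reports for the document.
    static String itemText() {
        final StringBuilder text = new StringBuilder();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(XML)),
                new DefaultHandler() {
                    @Override
                    public InputSource resolveEntity(String publicId, String systemId) {
                        // Supply the entity text from memory instead of reading a file.
                        if (systemId != null && systemId.endsWith("copyright.xml")) {
                            return new InputSource(new StringReader(COPYRIGHT));
                        }
                        return null; // fall back to the parser's default resolution
                    }

                    @Override
                    public void characters(char[] ch, int start, int length) {
                        text.append(ch, start, length);
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Show the newline explicitly so it is visible in the output.
        System.out.println("CHARS: [" + itemText().replace("\n", "\\n") + "]");
    }
}
```

The echoed text begins with the newline that follows the comment, and the comment itself does not appear, which matches the Echo output above.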

Summarizing Entities

An entity that is referenced in the document content, whether internal or external, is termed a general entity. An entity that contains DTD specifications that are referenced from within the DTD is termed a parameter entity. (More on that later.)

An entity which contains XML (text and markup), and which is therefore parsed, is known as a parsed entity. An entity which contains binary data (like images) is known as an unparsed entity. (By its very nature, it must be external.) We'll be discussing references to unparsed entities in the next section of this tutorial.


This tutorial contains information on the 1.0 version of the Java Web Services Developer Pack.

All of the material in The Java Web Services Tutorial is copyright-protected and may not be published in other works without express written permission from Sun Microsystems.