XHTML Primer

Basic XHTML

HTML (HyperText Markup Language) has been the primary markup language of the World Wide Web since the web began. The original idea behind HTML was to provide minimal hints about how the page should be displayed, and instead to provide insight into the structure and meaning of the document. This was done so that each web browser would be able to determine the best way to display various elements.

As the web grew and users began to expect more of it, web developers demanded ever-more powerful extensions to the HTML standard. Although these features allowed new capabilities, they also brought new problems including lack of uniformity, cross-browser issues, and still limited design options. Browsers allowed sloppy programming, and web developers were happy to oblige.

Programmers came up with incredibly clever ways to force design concepts into the HTML framework, but these workarounds (table-based designs, frameset layouts, invisible spacer images and other tricks) made code bulkier and more difficult to maintain.

Web developers are beginning to return to the roots of HTML as a language to describe the meaning of page elements. They can do this because of two technologies that are now at the center of web design: XHTML and CSS. XHTML is a stricter interpretation of HTML. It actually supports fewer tags and attributes than old-school HTML, but this smaller, stricter design makes it easier to verify that code will work correctly. CSS (Cascading Style Sheets) is a technology designed to let the developer apply various stylistic elements (colors, positions, fonts) to any HTML or XHTML element.

Justification

The combination of XHTML and CSS can provide important advantages over more traditional forms of web development:

It is important to note that a decision to use XHTML and CSS as the foundation of your web design does not mean you are sacrificing aesthetic beauty for technical merit. Many of the most visually appealing sites on the web are written using this technology. Visit www.csszengarden.com/ for evidence of this trend.

It is true that web developers who are used to the old "freewheeling" days of web development will face some shocks when moving to the XHTML standard. Many formerly common practices such as table-based page layout, use of the "bgcolor" attribute, and skipping ending tags are now considered very poor form, and the last two will not validate under a strict doctype. Some of the most popular tags (<u>, <center>, and <font>, among others) are deprecated, meaning they can no longer be used in strictly validated pages. However, CSS provides generally better and more flexible replacements for all these tools. Learning the new approach is not difficult, but unlearning skills that took years to acquire can be. If you are a seasoned web programmer trying to adjust to the new reality, be patient and keep trying. It won't take long for you to catch on to the new way of thinking. Most who have made the transition agree they'll never go back once they've gotten over the initial pain.

If you're new to web design, you have no bad habits to overcome. The XHTML and CSS model really makes a lot of sense, and it's not any harder to learn than the older techniques.

validation and doctypes

One key to modern web development is the notion of validation. A validator is a special program that checks your code and makes sure it meets standards you have agreed to. A validator is a lot like a spell-checker. Although it can be painful to see all your spelling errors, it's better to know about them before a document is published for the world to see. Likewise, it's good to know if there are errors in your document before you publish literally on the world stage.

Throughout this essay, I recommend use of the W3C validator available at http://validator.w3.org/. This online program examines your web code and reports any errors it finds to you. Like a spell-checker, a validator can be painful and humiliating at first, but it trains you to write code with fewer errors. If you validate your code frequently, you will find fewer errors. The current (very long) document validates without problems using the strictest document type available at the time of its writing.

Validators can use many different rulesets. You specify which ruleset you want to use by including special code in your HTML documents. For this essay, I suggest using the XHTML 1.0 strict doctype. It is a very restrictive ruleset, but this strictness provides a strong predictable framework for the CSS code you will add later.

template

In the early days of HTML, before XHTML and validation, it was enough to simply open up a text editor and start typing. If you want to validate your code (and you should) you need a somewhat more complicated starting template. It is no longer easy to memorize everything you need to start up a basic web page. Fortunately, I have provided a template that contains everything you need.

To begin writing standards-compliant XHTML, simply copy and paste the code below into any plain text editor. Do NOT use a word processor: word processors do not save in plain text format by default, and the resulting file will not work correctly.

 
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns = "http://www.w3.org/1999/xhtml" lang = "EN" xml:lang = "EN" dir = "ltr">
<head>
<meta http-equiv="content-type" content="text/xml; charset=iso-8859-1" />

<title></title>
<style type = "text/css">

</style>

</head>
<body>


</body>
</html>

If it's easier for you, download template.html by right-clicking on it and saving it to your local file system.

Essential tags

In general, XHTML is much simpler than HTML. Each XHTML tag refers to the meaning or purpose of its enclosed text, rather than how that text should be displayed. There are a few key tags used in nearly every page:

<html> html

The html tag encloses the entire web page. Everything the web page displays is encased inside an <html></html> pair.

<head> head area

The head area is something like the engine compartment of a car. It contains machinery necessary for running the page efficiently, but not the main content of the page. For now, the head will contain a title and style information. More sophisticated pages will sometimes put programming code in the head as well.

<title>

The title tag belongs in the head. The title doesn't actually show up in the main body of the page, although it is often repeated in a heading tag (see below for information about those). The title is usually displayed in the title bar of the browser, although this is not guaranteed. Titles are also often used when referring to a page, and are given special weight in search engine algorithms.

<body> body area

The body is the main visible area of a web page. If the head is the engine compartment, the body is the passenger compartment. Most of the text on a web page is placed inside the body element.

<p> paragraph

The paragraph tag marks a block of text as a paragraph. Most long text passages in your page will be enclosed in paragraphs. Note that old-school HTML did not require text to be in any container. Strict XHTML does require body text to be enclosed in some kind of block-level container. The paragraph is very generic, so it is used very frequently.

<h1>..<h6> heading tags

The heading tags are used to indicate headings in your document. Although most browsers have specific default characteristics for each level of heading, the key is not how the heading looks, but the level of importance you wish each heading to have. If your page follows some kind of outline, you should have a few level one headings. Each of these may contain some level two headings, followed by lower-level headings as needed.

Basic markup

Once you've established the framework for your page, you can embellish your design with CSS. There is a lot to CSS, but you can begin with a basic subset that adds tremendously to the look of your page.

Adding a style

The easiest way to add CSS to your page is through the <style></style> tags in the document head. Use this element to list the page elements you want to style and the styles to apply to each. For example, consider simpleStyles.html.

<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns = "http://www.w3.org/1999/xhtml" lang = "EN" xml:lang = "EN" dir = "ltr">
<head>
<meta http-equiv="content-type" content="text/xml; charset=iso-8859-1" />

<title>Simple Styles</title>
<style type = "text/css">
h1 {
  color: blue;
  background-color: yellow;
}

h2 {
  color: yellow;
  background-color: blue;
  text-align: center;
}

p {
  font-family: "Comic Sans MS", cursive;
}

</style>

</head>
<body>
<h1>Simple Styles with Monty Python</h1>
<h2>The Lumberjack Song</h2>
<p>
Oh, I'm a lumberjack and I'm OK, 
</p>

<p>
I sleep all night, and I work all day.
</p>

<h2>The Cheese Shop</h2>
<p>
Do you have any Venezuelan Beaver Cheese?
</p>


</body>
</html>

The XHTML describes the structure of the page, and the CSS in the style element describes how each of the various elements will look. CSS styles have their own unique syntax. Each style begins with a selector (here, the name of the element to style) followed by a left brace ({). Any number of style rules can then be listed for that element. Style rules are usually listed one per line, and are usually indented two spaces. Each rule consists of a property name followed by a colon (:) and a value, and rules are separated by semicolons (;). The semicolon after the last rule is optional, but including it makes it easier to add rules later. The rule list for an element ends with a right brace (}).

color

The color property describes the foreground color of an element. If the element contains text, this will be the color of that text. Take care to ensure that the foreground and background colors contrast strongly so the text is easy to read. Please see the colors section of this document for specific color values you can use.

background-color

The background-color property determines the background color of the element. Set the background-color of the body tag to change the background of the entire page. Take care that there is high contrast between the background and foreground colors of a page or element. See the colors section of this document for specific color values you can use.
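As a minimal sketch of the two color settings working together, the rules below set a page-wide foreground and background on the body and then override them for level one headings. The specific colors are my own choices for illustration, not values the page requires.

```css
/* Page-wide defaults: dark text on a light background */
body {
  color: black;
  background-color: white;
}

/* Headings override the defaults but keep the contrast high */
h1 {
  color: navy;
  background-color: silver;
}
```

Elements that do not set their own color inherit the foreground color from the body, so a single body rule is often enough to color most of the text on the page.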

font-family

The font-family property describes which font should be used in an element. Unfortunately, font support is very challenging on the web, because fonts are installed on specific client computers. You might design a page that looks great on your machine using a font you have downloaded. However, if a user does not have that font installed, he or she will not see the page the way you designed it.

To use a font, simply include that font's name. You can use custom fonts if you think the user will have them installed, but it's risky to do so. It is better to include a list of fonts. For example, you may choose a font from those typically installed on Windows machines followed by one usually available on Macintoshes. If the browser can use the first font, it will. If it cannot find that font, it continues on the list until it finds a font it can use. If a browser cannot find a suitable font, it will use a default standard font.

Any standards-compliant browser will be able to display the following generic fonts:

serif
sans-serif
cursive
fantasy
monospace

Note that these fonts won't look the same on every system. The browser simply chooses a system font that most nearly matches the generic description.

Whenever you use the font-family property, put one of these generic font names at the end of the list. That way, if none of the fonts you specify is available, at least you will fall back to a font that has the general feel you were looking for.
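A font list of this kind might look like the sketch below. The particular fonts are assumptions chosen for illustration (one commonly found on Windows, one commonly found on Macintoshes), and each list ends with a generic name as the fallback.

```css
/* Try Arial first, then Helvetica, then any sans-serif font */
p {
  font-family: Arial, Helvetica, sans-serif;
}

/* Font names that contain spaces should be quoted */
h1 {
  font-family: Georgia, "Times New Roman", serif;
}
```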

font-size

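Font size describes the size of the text in an element. Font size is described by a number followed by a unit of measurement. The following units are usually used to measure font size:

pt (points)
px (pixels)
em (ems, relative to the font size of the parent element)
% (percentage of the font size of the parent element)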

It's generally considered best to avoid setting a specific font size in points or pixels, as certain versions of IE do not allow the user to resize such text. Percentages or ems are preferred units of measurement.
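Here is a small sketch of relative font sizes using ems and percentages. The exact numbers are only illustrative; the point is that every size is expressed relative to the surrounding text rather than as a fixed number of points or pixels.

```css
/* Start from whatever size the user has chosen in the browser */
body {
  font-size: 100%;
}

/* Twice the size of the surrounding text */
h1 {
  font-size: 2em;
}

/* One and a half times the size of the surrounding text */
h2 {
  font-size: 150%;
}
```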

text-align

The text-align property is used to specify how text is aligned inside an element. Legal values are the following:

left
right
center
justify

The justify option attempts to line up text evenly on the left and right as done in newspaper and magazine layouts. It can be effective with longer lines of text, but produces strange results if the text is short.
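A minimal sketch of text-align in use follows. Which elements get which alignment is my own choice for illustration.

```css
/* Center the short heading text */
h1 {
  text-align: center;
}

/* Justify long passages so both edges line up */
p {
  text-align: justify;
}
```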

Basic web colors

Two main techniques are used for describing colors on the web. You can simply type a color name, or you can use a numeric value to describe the color.

Color names are easy to remember, but they have problems. Only 16 color names are guaranteed to work:

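color name   color value
aqua         #00FFFF
black        #000000
blue         #0000FF
fuchsia      #FF00FF
gray         #808080
green        #008000
lime         #00FF00
maroon       #800000
navy         #000080
olive        #808000
purple       #800080
red          #FF0000
silver       #C0C0C0
teal         #008080
white        #FFFFFF
yellow       #FFFF00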

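If you want more control of your colors, use hexadecimal combinations. Each color is written as a pound sign (#) followed by three two-digit hex values from 00 to FF, which specify how much red, green, and blue you wish, in that order. For example, #FF0000 is pure red, #00FF00 is pure green, #0000FF is pure blue, and #FFFF00 (full red plus full green) is yellow.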

You can discover hex values in any painting program. You'll also find several great programs online for picking colors that work well together. My personal favorite is the color scheme generator. This tool lets you easily choose colors that go well together in a variety of schemes.

Note that since colors often involve doubled hex values, you can use a shortcut format: #FF0 is the same as #FFFF00.
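The sketch below shows hex colors in a style rule, including the three-digit shortcut. The colors themselves are arbitrary examples.

```css
h1 {
  color: #FF0000;            /* full red, no green, no blue */
  background-color: #FFFF00; /* full red plus full green makes yellow */
}

h2 {
  color: #FF0;               /* shortcut for #FFFF00 */
  background-color: #000080; /* the same color as the name navy */
}
```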

Borders

Borders let you draw a line around an element. Borders are used to visually separate elements, and also for debugging pages (especially when you are creating positionable elements). Each border has three distinct parameters: border-width, border-style, and border-color. These can be described individually, or combined into one border property.
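As a rough sketch, here are both ways of describing a border: the three parameters listed separately, and the same kind of information combined into a single rule. The elements and values are only examples.

```css
/* Each border parameter set individually */
p {
  border-width: 3px;
  border-style: solid;
  border-color: black;
}

/* Width, style, and color combined into one rule */
h2 {
  border: 1px dashed red;
}
```

Keep in mind that the default border-style is none, so setting only a width or a color will not make a border appear. You need to give a style as well.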

border-width

The border-width property describes how wide the border will be drawn. It uses the same measurement types as any length measurement in CSS, but most programmers use pixels (px) to describe border size. Here are some example paragraphs with the same border type and color, but various sizes:

1 pixel border

2 pixel border

3 pixel border

5 pixel border

10 pixel border

15 pixel border

border-style

none

hidden

dotted

dashed

solid

double

groove

ridge

inset

outset

border-color

combined borders

Data Formatting

HTML is about describing the meaning of text. The body of your document will usually comprise a number of different kinds of text. Most of the text will go into paragraphs, but occasionally you will find you need another way to format your information. Lists and tables are common ways to organize parts of your web pages.

lists

A list is simply an organization of elements. HTML has a default way of displaying list data, but you are not bound to that; you can use CSS styles to make lists appear exactly as you wish. Lists begin and end with list tags, and each element has its own tag.

Ordered vs. unordered

The most common list looks like this:

<ul>
  <li>alpha</li>
  <li>beta</li>
  <li>charlie</li>
  <li>delta</li>
</ul>

The code above displays like this:

  * alpha
  * beta
  * charlie
  * delta

The <ul></ul> pair designates an unordered list. (Get it? ul - for unordered list!) In the absence of any other information, the browser will format this list as a bulleted list. Of course, you can use CSS to change the bullets to other standard shapes or to custom graphics. See the list style type and custom bullets section of this document for information on how to do this.

Inside the list, each element is enclosed inside <li></li> tags. Programmers typically indent each list item.

You can also create an ordered (or numbered) list using the <ol></ol> tags instead of the <ul> tags. Such a list would look like this in the code:

<ol>
  <li>alpha</li>
  <li>beta</li>
  <li>charlie</li>
  <li>delta</li>
</ol>

The code above displays like this:

  1. alpha
  2. beta
  3. charlie
  4. delta

As with the unordered list, you can use CSS styles to dramatically change the type of numbers used.
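As a small sketch of how CSS can change list markers, the rules below use the list-style-type and list-style-image properties. The file name bullet.png is hypothetical; substitute an image of your own.

```css
/* Square bullets instead of the default round ones */
ul {
  list-style-type: square;
}

/* Upper-case Roman numerals (I, II, III) instead of 1, 2, 3 */
ol {
  list-style-type: upper-roman;
}

/* A custom graphic as the bullet inside nested unordered lists;
   bullet.png is a hypothetical file name */
ul ul {
  list-style-image: url("bullet.png");
}
```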

Nested Lists

If you find yourself writing outlines or other structured documents, you'll frequently encounter lists nested inside each other. It's possible to make nested lists that still validate, but you must do so carefully. Consider the following example:

  * English
      1. one
      2. two
      3. three
  * Japanese
      1. ichi
      2. nii
      3. san
  * Spanish
      1. uno
      2. dos
      3. tres

A list cannot simply happen inside another list. Instead, it must be placed inside one of the list items. If you think about this logically, it makes sense. The term "uno" is related to the term "Spanish". At one level, there is a list item containing the term "Spanish." That element contains that term and a new list showing several numbers written in Spanish. The code for producing the list above is reproduced here:

<ul>
  <li>English
    <ol>
      <li>one</li>
      <li>two</li>
      <li>three</li>
    </ol>
  </li>
  <li>Japanese
    <ol>
      <li>ichi</li>
      <li>nii</li>
      <li>san</li>
    </ol>
  </li>
  <li>Spanish
    <ol>
      <li>uno</li>
      <li>dos</li>
      <li>tres</li>
    </ol>
  </li>
</ul>

Notice how the list item for Spanish does not end directly after the word "Spanish." Instead, the new list of Spanish numbers is embedded into the list item. After this new list is finished with </ol>, the list item containing both the word "Spanish" and the inner list is completed. Proper indentation makes it much easier to avoid mistakes. Each ending element should line up exactly with the element that started it.

list-style type

custom bullets

tables

table borders

table headers

table rows

table data

CSS styles and tables

using tables for layout

links

Links are defined using the standard <a> tag. Normal links look like this:

<a href = "newPage.html">Go to another page</a>

The href value indicates the address of the new page. The text between the <a> and </a> is the text that will appear as a link on the page.

defining standard links

Like most XHTML elements, links have a standard appearance. For example, most browsers indicate a link by making the linked text blue and underlined. Users have gotten used to links being designated in these ways, so if you choose to change the appearance of links, you need to still be careful that the user understands the object is a link that the user can use to navigate off the page.

defining link styles

Since a link is simply an XHTML tag, you can apply a style to it like any other XHTML element. For example, to make all links black, use this style:

a { 
  color: black;
}

You can apply any style to an anchor that you wish. However, most usability experts recommend that you leave all links underlined unless they incorporate some other obvious navigation hints (perhaps they are formatted as buttons, for example.)

using the hover and visited pseudo-classes

Anchors are special because they can have more than one state. You may want to apply different styles to the anchor if it has already been visited, or if the mouse is currently over it. The following style rules illustrate how this can be done:

a { 
  color: black;
}

a:visited {
  color: purple;
}

a:hover {
  color: white;
  background-color: black;
}

Here is some code using these styles:

The anchor tag still has its own style definition, but the other states (visited and hover) can also have styles devoted to them. The visited state applies if the user has already been to the linked page. (Delete your browser history and cache to make all sites "unvisited" for testing.) The hover state allows you to specify a custom style that applies while the mouse is "hovering" over the link.
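One detail worth sketching is the order of the rules. When several of these rules match the same link, the one listed last wins, so the hover rule should come after the visited rule. The sketch below also uses a:link (unvisited links) and a:active (a link being clicked), which do not appear in the example above; I include them only to show the commonly recommended order of link, visited, hover, active.

```css
a:link {
  color: black;
}

a:visited {
  color: purple;
}

/* Listed after visited so it also applies to visited links */
a:hover {
  color: white;
  background-color: black;
}

/* Listed last so it applies while the link is being clicked */
a:active {
  color: red;
}
```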

links and usability

Links are an important usability feature of web pages. You must take care not to confuse the user when you apply custom styles to your links. It still must be very apparent to your users what elements can be clicked on for navigation. Consider the following tips when you build custom anchor styles:

Make it still look like a link
Be very careful not to make links unrecognizable as such. It's generally best to keep links underlined if possible.
Don't change sizes
It's tempting to make a hovered link larger or smaller. Avoid the temptation. Changing the size of an element dynamically can affect the rest of the page, making things jump around. Changing the font can have the same effect. Keep your hover states simple, changing only colors, backgrounds, or images.
Use color carefully
Color should be used to communicate. You want to stay within the color design of your document, but you also may use color to designate links.
Consider hover for highlighting
If you don't want to use underlining and color to make links apparent, consider using the hover state. Make the link text change in an obvious way when the user hovers over it. It's better if links are still obvious as such when the mouse is not over them. You don't want to force the user to fly over the page with the mouse looking for links.
Don't underline things that aren't links!
Users have been conditioned that underlined blue text on a web page is a link. If you want to make your users crazy, put underlined blue text on your page that is not a link. You'll get angry emails. Now that web designers have moved away from the default link colors, underlining text in any color has become a powerful cue that the text is a link.
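A link style that follows these tips might look something like the sketch below. The colors are assumptions chosen for illustration. Notice that the link stays underlined and that the hover state changes only colors, never the size or the font.

```css
/* Custom color, but still underlined so it reads as a link */
a {
  color: navy;
  text-decoration: underline;
}

/* Hover swaps colors only, so nothing on the page moves */
a:hover {
  color: white;
  background-color: navy;
}
```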

Advanced CSS

Using background images

repeated images

image borders and frames

Using Selectors

selecting by class

selecting by id

nested elements

Making link buttons

adjusting the list

adjusting the links

managing the hover state

Page Layout with CSS

Using float for layout

Indicating element width

Two-column layout using float

Adding Headers and footers