XHTML, HTML, Designers

XHTML is improved HTML, because the markup has to be well-formed: every tag is closed and properly nested, so the tags don't have mistakes in them.

Parsing HTML tags: there are inline tags, which can occur anywhere within running text, and compartment-style tags, which are containers that hold parts of the document. The only restrictions on inline tags, generally, are that some of them can't be put inside each other (a link inside a link, for example) and that some of them can't be put around a compartment tag (a link tag can't go around a form). I suggest that HTML be parsed not by parsing the original as written, but by first altering the tags (which may be missing, and are, in the majority of HTML) into a family of single and combined elements. A combined element is one such as code or em, which can be combined with another occurrence of itself and still be the same element. A single element is one which remains itself, and may contain other elements, which also remain themselves. It is extremely important to distinguish between these two types of elements in a family tree.
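As a rough illustration of that suggestion, here is a sketch of what the alteration step might do to one common mistake. It assumes only the rule quoted later on this page for the a tag ("A combined element is also ended upon a further occurrence of that same element"); the file names one.html and two.html are hypothetical.

```html
<!-- Input as it is commonly written: the first link is never closed.
     <a href="one.html">One <a href="two.html">Two</a>
-->

<!-- After alteration: the second a tag also ends the first one. -->
<a href="one.html">One </a><a href="two.html">Two</a>
```

The result is well-formed, so an ordinary XML parser could read it without any special error handling.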

3-21-04. It was a heroic thing to do to make any error in an XML document be fatal, because it meant that simply getting it to work (which is the practical thing for an employee to do) would produce proper documents, which did not introduce quirks and backward inconsistencies that "henceforth and forevermore" would have to be supported (because now they had once begun to mean something). An employee doesn't have to be idealistic, and still the idealism of the markup is saved.

There is the:

a abbr acronym address area big blockquote body button caption cite code dd dfn div dl dt em fieldset form h1 h2 h3 h4 h5 h6 head hr img input kbd label legend li link map meta object ol optgroup option p param pre q samp script select small span strong style sub sup table td textarea th title tr tt ul var

a
tag, which is for anchors, which are the links, like this: Yahoo. A common mistake is to omit the ending tag. Everything following it, up until the next link or button, will then also be part of the link. This means it is a compartment element in actual use, or a "combined element", as written in the new markup language: "A combined element is also ended upon a further occurrence of that same element."
<a href="../index.html">Code Library</a>
This link is a "relative" link. From the existing URL (http://www.codelib.net/html/xhtml_tags.html) it would go up one directory (../) and open the file index.html, which is the Code Library home page.
<a href="http://www.yahoo.com/">Yahoo!</a>
This is an absolute link, which specifies the protocol + host + path. In this case, the link is to the Yahoo! website.
<a href="http://www.yahoo.com/" target="_blank">Yahoo!</a>
This is just the same as before, but the link will open in a new window, regardless of what windows already exist.
<a href="#textarea">Description of textarea</a>
This link will move the user to content in a certain portion of the existing page. A unique and visible HTML element could have been identified by id="textarea" so that clicking on the link would show that element of the page.
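A minimal sketch of the link together with its target, assuming the target is a heading (any visible element with a unique id would do):

```html
<p><a href="#textarea">Description of textarea</a></p>

<!-- ... further down the same page ... -->

<h2 id="textarea">textarea</h2>
<p>Creates a writable, multi-line area of a form.</p>
```

If no element on the page carries id="textarea", clicking the link typically does nothing visible.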
abbr
which is for abbreviations, but a rather stupid thing, like WWW--because you can already see it's an abbreviation, it is stupid, kind of like saying, "Yes, I like the--this is an abbreviation--HTTP/1.1 specification." This can only be around text; you may not specify that the contents of an image are an abbreviation by using this tag, for example. Also, you may not nest the abbr tag. Nested tags result in no change.

It was bad practice for the Standards people to introduce elements like "abbr," "acronym," "samp," "kbd," "code," etc. Mechanistically, they do nothing but pre-define some simple stylesheets for the convenience of web authors. Theoretically, they add some meta-information so that the meaning of the words can be understood by search engines, or for other purposes, and not just to change the style of the words. But in fact, picking and choosing from a few elements makes the meta-information very unreliable. The elements should not have been introduced.

acronym
which is for acronyms, like radar, and it is almost as silly. This may only occur around text, and not be nested with itself.
address
is used for information on the author, not just an address. For example:
This is written by Joseph K. Myers
It is a way of finding this information, which could be marked with id="address" except that people would be more likely to use the tag itself instead of an id, since ids are supposed to be arbitrary. (I could have said id, indicated with the var tag.) I believe you can have as many addresses as you like, but they should be associated with "a major part of a document," such as a form, or simply the entire document, and often they are placed at the beginning or end--but that is not important. It is not a tag used for formatting or creating a paragraph, but often it is displayed in italic. This tag may not be nested. Display alterations may be indicated as with a formatting tag, with altered formatting up to the end of the tag. Searching for the addresses in a document, however, should proceed by looking for the beginning of each address tag.
area
is for a client-side image map area--an embedded image or object with "hot" links in certain regions. The area tag may not occur inside itself.
big
specifies a larger text style than surrounding text. This is a combined element; more and more big tags can be used for bigger and bigger text. When ending big tags are missing, the text stays larger. It would be wrong, but the parsing fixer would have to consider the entire document to be rendered in big text after an unended big tag at the beginning. Big and small text are not absolute, but relative. Inside of a big tag, inside a small tag, or inside any area of a document, a big tag will increase the text size following itself, until its ending tag, and a small tag will decrease the text size following itself. The result of a small tag inside a big tag is text size the same as the text size before the big tag. If you have A big tags around A small tags, then the total effect cancels, and the size of text is not changed. Nesting tags, however, is not good. Using them to change the document text size is also improper. Instead, do body { font-size: larger }. Also, you can format the entire font characteristics with the CSS font control:
font: [ [ <'font-style'> || <'font-variant'> || <'font-weight'> ]? <'font-size'> [ / <'line-height'> ]? <'font-family'> ] | caption | icon | menu | message-box | small-caption | status-bar | inherit
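The relative behavior of big and small described above can be tried with a fragment like the following. This is only a sketch; the exact sizes, and whether the sizes cancel exactly, depend on the browser.

```html
<p>
  normal
  <big>
    bigger
    <small>about the same as normal again</small>
    bigger
  </big>
  normal
</p>
```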
blockquote
is to contain an extended quotation, such as an excerpt from other material. This does not need to be long, but it may contain multiple paragraphs. Usually it results in indented text, like this:

They went in single file, running like hounds on a strong scent, and an eager light was in their eyes. Nearly due west the broad swath of the marching Orcs tramped its ugly slot; the sweet grass of Rohan had been bruised and blackened as they passed. (The Two Towers).

body
Everything displayed in the document is considered to be in the body. Following body tags are ignored, so the body is considered to be a combined element without nesting.
button
is specific to forms, and used to draw a push-button. This is only useful for scripted web pages--for anything besides looks--since by itself it does nothing. Oops, sorry, that was my opinion. It can actually function as a submit or reset button as well.
caption
is specific to tables, and comes immediately after the table tag, to contain a table caption. It usually is confusing, unless a table occurs all alone. This is something seemingly ignored, that the table is often discussed outside the table, and for a table to occur without being described first--outside of the table, of course--is more confusing than omitting a table description. It seems like a description inside of a table itself, and a caption inside of a table itself, only have reasons when you contemplate a table without thinking of any of the other tags outside the table which would have been used to describe and title the information in a table. Doesn't anyone remember at this point to think outside of the box?

This is yet another instance of the blindness of the Standards people. It is obvious that they were thinking of the benefits to their own implementations to have such things as captions, and pawning off on others a bunch of nonsense about abstract information concepts. But even Grade 1 of abstract thoughts would teach someone to think outside the box.

cite
is only to indicate who said something, or where it came from, such as The Most Important People Who Say Things Never Are Recognized.
code
is an indication of computer code occurring, usually to preformat the code: ls -l is my girlfriend (would I have said just my friend, would you not have heard the end).
dd
stands for a definition description. Each of the elements on this page is defined inside of a dd, following its dt definition term.
dfn
is an especially ridiculous tag, because it is supposed to isolate the "defining definition" or "instance definition" (because they knew the idea was stupid) of a word; it is ridiculous because there is no formatting clue that something like, "Paul was an apostle," is the definition and not anything else like em for emphasis. (Really, it is useful for people who are reading the HTML--they know what you mean--but that isn't the normal ability of web users.)
div
is a generic container, and often does nothing except create a section of the document--which then may be beautified with stylesheets/CSS. By itself, it does nothing except break off its contents from previous or following lines, a clue to its literal meaning as a "division."
dl
creates a definition list (with nothing in it by itself) for the tags dt and dd, which are usually shown as a nice list of terms and definitions, just as this list you are reading right now.
dt
is for the "definition term," or simply a term, that comes in a dl definition list of terms and definitions.
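Putting dl, dt and dd together, a small definition list might be written like this (the wording of the entries is only for illustration):

```html
<dl>
  <dt>dt</dt>
  <dd>The term being defined.</dd>

  <dt>dd</dt>
  <dd>The description that follows a term.</dd>
</dl>
```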
em
is for emphasis to words, and is usually shown in italic. Many other formatting types are indistinguishable, and so they are useless as far as documents are concerned.
I fight the semantic battle in favor of i as a better HTML tag, and as a tag that has the meaning of em. Think of it--em doesn't have a semantic true meaning. Only your real original language shows the emphasis, etc. Putting an "emphasis" into a sentence with a markup language is not a substitute for script writing a movie part with notes where the actress should emphasize. If only the evil men of power would realize that!
fieldset
is similar to a list of terms and definitions, but used for forms. The classic mistake is made, that it is perfectly clear when reading the HTML code--except that most people don't get around to doing that, and that most browsers make a puddle of the beautifully organized form. The supposed excuse for this is that everyone was supposed to have "styled" their own forms with CSS, a nice try, but false, since there really isn't any meaning in the HTML if the very excuse that HTML won't work is to say to use CSS. Excuses aren't reasons. It works with the legend tag to identify form fields directly.
form
is used to create an interactive form, with nothing in it by itself. The interaction itself is for the user to put data into the form. Things are put into it by input, button, select, and textarea elements.
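A minimal sketch of a form holding each kind of element mentioned above. The action URL /signup and the field names are hypothetical, chosen only for illustration.

```html
<form action="/signup" method="post">
  <p>
    <input type="text" name="fname" />
    <select name="color">
      <option value="red">Red</option>
      <option value="blue">Blue</option>
    </select>
    <textarea name="comments" rows="4" cols="40"></textarea>
    <button type="submit">Send</button>
  </p>
</form>
```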
h1
this is a heading, the first size. Headings really help explain things.
h2
the second size of the same thing.
h3
the third size
h4
the fourth size
h5
the fifth size
h6
the sixth size
head
is used for information like the title and the stylesheet of a page. The head isn't shown in the page, except for things which don't belong there, but you shouldn't put them there.
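For illustration, a head carrying a title and a stylesheet might look like this; the title text is made up, and stylesheet.css is the same placeholder file name used in the style entry of this list.

```html
<head>
  <title>XHTML Tags</title>
  <link rel="stylesheet" href="stylesheet.css" />
</head>
```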
hr
makes a separation mark across the page.
img
embeds an image--the most popular tag in the world.
input
for a form. There are several kinds: text, password, checkbox, radio, submit, reset, file, hidden, image, button.
kbd
shows text to be entered by the user, just like saying "type this."
label
shows the kind of thing to put into an input field in a form, like "First Name" for the "fname" field.
legend
structurally indicates the caption or description of a fieldset, which is used to group related elements of a form.
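A sketch of fieldset, legend and label working together, reusing the "First Name" and "fname" example from the label entry. The action URL and the legend text are hypothetical.

```html
<form action="/signup" method="post">
  <fieldset>
    <legend>Your name</legend>
    <!-- The for attribute ties the label to the input with the same id. -->
    <label for="fname">First Name</label>
    <input type="text" id="fname" name="fname" />
  </fieldset>
</form>
```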
li
for a "list item," a part of either an ordered list, ol, or an unordered list, ul.
link
is a very important tag, especially for including stylesheets.
map
creates a client-side image map with nothing in it, to be used to put area tags in. This tag is automatically terminated by the occurrence of other material.
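As a sketch of how map, area and img fit together, here is an image split into two rectangular "hot" regions. The image, its dimensions and the linked pages are all hypothetical.

```html
<p>
  <img src="floorplan.png" alt="Floor plan" width="200" height="100"
       usemap="#plan" />
</p>
<map id="plan" name="plan">
  <!-- coords for a rect are left, top, right, bottom in pixels -->
  <area shape="rect" coords="0,0,100,100" href="kitchen.html" alt="Kitchen" />
  <area shape="rect" coords="100,0,200,100" href="garden.html" alt="Garden" />
</map>
```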
meta
is a tag for your own convenience, and traditionally applied for specifying descriptions--but again, if your page does not describe itself, what are you going to do? Write a meaningless page with a good description? Of course, there are other uses for it, often for a certain site--don't look at someone's special uses and think they actually are something to copy for your own site. One time I actually did that, and it reminds me of trying to copy the HTML for a "Top" link, when I hadn't put an anchor at the top that was named top--don't worry, you can learn by experience. :-) This tag is used by certain voluntary rating systems, and surprisingly, is accurate once in a blue moon, probably because inappropriate websites are encouraged to mark themselves as "adult" because most people in the world are actually looking for that. As usual, any attempt to deal with those problems by ranking or rating or filtering results in supporting, even financing, the problem.
object
embeds an object, such as an application or an applet into the HTML. For particular programs, necessary arguments can be supplied with the param element for objects.
ol
creates ordered lists, such as steps listed in order, with nothing in them. The li list item tag indicates each part of the list.
optgroup
creates a group of options in a selectable form element, with nothing in it. The option is used to present selectable sub-choices.
option
writes a visible option in a selectable form element, with an associated data value.
p
formats a paragraph. This is extremely useful, as no paragraph in HTML is formatted unless it is written inside of this tag. The main benefit of HTML is obvious: text written any way becomes formatted and organized from merely this tag.
param
supplies named property values / parameters / arguments to a program embedded in the object tag.
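A sketch of an object with one param. Which parameter names mean anything depends entirely on the embedded program, so the file name, the type and the "autoplay" parameter here are assumptions made for illustration.

```html
<object data="clip.mpg" type="video/mpeg" width="320" height="240">
  <param name="autoplay" value="false" />
  <!-- Shown only when the object itself cannot be displayed. -->
  <p>A short video clip.</p>
</object>
```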
pre
specifies preformatted text. This also is extremely useful, allowing you to format the borders and other elements of poetry, for example, while keeping the placement of text.
q
puts quote marks around a quote, as well as specifying a quote is present. I heartily disrecommend this element, as it requires the asinine elimination of quote marks from quotes, really doing nothing but rendering quotes out of context with everything except for the quibbly preferences of temporary web browsers.
samp
gives a sample of output, from a program, script, etc. This tag is interesting because it creates an easy-to-apply category for styling if you like to, but on its own it is an ambiguously differentiated text formatter.
script
embeds a script, usually JavaScript, into the HTML. I advise you to separate the script itself from your HTML by saying <script type="text/javascript" src="scriptcode.js"></script>, for example. Scripts are not HTML, and there is no reason for them to conform to HTML, which they will have to do if they are embedded directly into the HTML. (This is just the reason I vehemently oppose CDATA, because it is not a type of markup, but a thing for people who want to put in things where they don't belong. Embedding other types of data should not be making those types of data fit into the markup language itself.) It also organizes your pages better, and does not deny access to the script code, a plus for open source--like HTML--and a minus for people who strangely don't wish anyone to see their legs, and so they wear a long coat (which for goodness' sake, I am glad they wear the coat)--and then don't wish anyone to see their coat.
select
creates a selectable form element, with no selections in it. The available selections are written with option and groupable with optgroup.
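Here is how select, optgroup and option might nest. The field name, group labels and values are hypothetical.

```html
<select name="drink">
  <optgroup label="Hot">
    <option value="tea">Tea</option>
    <option value="coffee">Coffee</option>
  </optgroup>
  <optgroup label="Cold">
    <option value="water">Water</option>
  </optgroup>
</select>
```

The text inside each option is what the user sees; the value attribute is the data sent with the form.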
small
specifies a smaller text style than surrounding text.
span
does absolutely nothing, except specify a portion of HTML to be controlled with CSS.
strong
is to strongly emphasize words. Text is usually changed to bold.
style
is used to create a stylesheet, with nothing in it. Generally this is bad, and stylesheets should be organized with the link element, e.g., <link rel="stylesheet" href="stylesheet.css" />.
sub
changes the text to be subscript. This is used for writing expressions like K_C (a K followed by a subscript C). Note that it usually increases the height of a line, sometimes leaving undesirable margins. This tag is a good example of a tag which is useful, but is wrong if used in the wrong way. Don't use subscript just to make a section of text look smaller, or something like that--it is a side effect--unless you want to be ridiculous. To change the size easily you may use small or big.
sup
changes the text to be superscript. This is like x² (an x followed by a superscript 2), but generally it is better not to do that, since text often needs to be copied from HTML and sent; in such text, you should say x^2.
table
creates an HTML table, with nothing inside. The table is a tremendous way of organizing data, but not for trying to mimic layouts. For that, use images or some CSS. Theoretically, and with a lot of trouble, you could make anything with a table, even with no text inside, simply by changing the color and size of each cell. The general meaning of a table is simply a structure of rows and columns, where each column lines up in each row, and if you draw borders along each "cell"--which you can--they would all line up. What you should generally do is to specify the meaning of each column, or row, with the th table header tag intended for this purpose, and list out the values contained in each row with the td table data tag.
Example:
cousin   pretty               stylish
Amanda   Modestly attractive  Somewhat
Amanda   Very beautiful       Good clothes
Since tables are used very much for invisible layout, you have to expressly state border="1" (or greater than 1 for a thicker border) in order to visibly separate the cells.
Example:
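+--------+---------------------+--------------+
| cousin | pretty              | stylish      |
+--------+---------------------+--------------+
| Amanda | Modestly attractive | Somewhat     |
+--------+---------------------+--------------+
| Amanda | Very beautiful      | Good clothes |
+--------+---------------------+--------------+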
If you want the thinnest border, you also need to control the cellpadding (distance separating the content of a cell from the cell's boundary) and cellspacing (distance between boundaries of cells). By setting these attributes to "0," you will get the thinnest border.
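+------+-------------------+------------+
|cousin|pretty             |stylish     |
+------+-------------------+------------+
|Amanda|Modestly attractive|Somewhat    |
+------+-------------------+------------+
|Amanda|Very beautiful     |Good clothes|
+------+-------------------+------------+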
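The markup for the thinnest-border version of the example table could look roughly like this. It is a reconstruction: it assumes the first row uses th cells and the other rows use td cells.

```html
<table border="1" cellpadding="0" cellspacing="0">
  <tr><th>cousin</th><th>pretty</th><th>stylish</th></tr>
  <tr><td>Amanda</td><td>Modestly attractive</td><td>Somewhat</td></tr>
  <tr><td>Amanda</td><td>Very beautiful</td><td>Good clothes</td></tr>
</table>
```

Leaving out cellpadding and cellspacing gives the plain bordered version, and leaving out border as well gives the first, invisible one.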
td
positions a cell of "table data."
textarea
creates a writable area of a form, with a default value of whatever you choose to write inside, starting with nothing. The textarea allows multi-line input.
th
positions a table header cell.
title
shall contain the title of the HTML document. The title belongs in the head.
tr
creates a table row, with nothing in it. The first table row may contain the series of th headings for each column. The following table rows may each begin with one th heading cell also, if desired. The table rows should all contain the same total number of cells in order for the table to be aligned. A single cell may be left blank, or cover more than one row (rowspan) or more than one column (colspan).
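A small sketch of rowspan and colspan, with made-up data. Each row still accounts for three columns once the spans are counted.

```html
<table border="1">
  <!-- 1 cell + 1 cell spanning 2 columns = 3 columns -->
  <tr><th>Name</th><th colspan="2">Scores</th></tr>
  <!-- the th covers this row and the next one -->
  <tr><th rowspan="2">Ann</th><td>1</td><td>2</td></tr>
  <!-- only 2 cells here; the first column is taken by the rowspan above -->
  <tr><td>3</td><td>4</td></tr>
</table>
```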
tt
formats text in teletype/monospace style.
ul
creates a very nice nestable, bulleted list of items.
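To nest one list in another, put the inner ul inside an li of the outer list, not directly inside the outer ul. A sketch with made-up items:

```html
<ul>
  <li>Lists
    <ul>
      <li>ol, for ordered lists</li>
      <li>ul, for unordered lists</li>
    </ul>
  </li>
  <li>Tables</li>
</ul>
```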
var
shows an instance of a variable or program argument. I would say that it formats it, except that it is indistinguishable from the many other tags with duplicate formatting. The usefulness is for particular authors, who may easily conform it to their needs for clarity.