A Brief Guide to HTML
Introduction
Other Virtual Workshops assume a knowledge of HTML (such as those on Interface Design and Replacing tables with CSS). While there are many other resources available on the 'Net, for reasons of completeness I have decided to include this brief guide. This Virtual Workshop is not about design, stylesheets or anything of that nature. It is purely about the syntax for marking up a page using HTML. In reality the knowledge learned here will have to be applied within a design context, but that context itself will not be discussed much here. I will use the W3C XHTML 1.0 specification (which is a fairly recent one) when discussing HTML throughout.
What is HTML?
HTML began as a way of describing documents so that they could be displayed with formatting on the World Wide Web. This logical approach (calling a paragraph a paragraph, a heading a heading, a list a list and so on) is still the best way to think about HTML. In the recent past HTML morphed slightly as browsers added new features, which encouraged the use of 'physical' HTML. Physical HTML describes the appearance of the document: colours, fonts and text size, for example. Designers have also used other 'tricks' to control layout, such as nested tables and single pixel graphics.
These approaches have had many drawbacks, including making pages bigger and longer to load, making pages difficult to read for people using text browsers and those with disabilities, and making the pages so unwieldy (with more layout code than actual content) that you pretty much needed a WYSIWYG editor to maintain your site.
Thankfully Cascading Style Sheets were developed to control the appearance of a page. Although it has taken a while, there are now enough browsers in circulation that support CSS properly that we can throw away the hacks and workarounds and return to a purely logical HTML. We then use CSS properties to control how the page physically appears. This is explained in the Brief Guide to CSS workshop, but for styles to work the HTML must be created first, and that's what we will learn to do next.
What HTML looks like
HTML is fairly straightforward. A simplistic and easily understood view is that a page is made up of the content you want the reader to see, 'marked-up' with tags that surround the text. The type of tag determines how the browser will display that text. A tag looks like so:
<element> - An opening angular bracket, the name of the element, and a closing angular bracket.
This is an opening tag. For every opening tag in XHTML there must be a closing tag which looks almost the same:
</element> - note the only difference is that there is a forward slash after the opening bracket.
So a piece of text in HTML that is marked-up as paragraph text would use the 'p' element in a tag and look like so:
<p>This is my Paragraph Text</p>
A heading would be marked up differently, using one of the heading elements (h1, h2, h3, h4, h5, h6) in the tag:
<h1>This is my Heading</h1>
A little bit about Elements
As we have just seen, the element in the tags describes the text to the browser. There are three categories of element, which are:
- Block Elements
- Inline Elements
- List Elements
Each of these has default characteristics, as explained below.
Block Elements: A block element keeps its text separate from the other page elements that surround it (i.e. text won't flow from one block to the next continuously). Generally block elements also force a margin at the top and the bottom of the block, thus separating it from the other text that surrounds it. To demonstrate this look at the following text.
First Paragraph Block Text
Second Paragraph Block Text
The actual area of the block has a background colour that is light blue, the border surrounding the block is dark blue, and the white space between the two pieces of text is the margin (a combination of the bottom margin of the first paragraph text and the top margin of the second paragraph text).
Inline Elements: Unlike the block element, an Inline element is designed to be used within a block element adding additional information to a piece of text. For example to create a hyperlink within a block of text:
This is a piece of paragraph text, paragraph being a block element, and it contains a link which is an inline element.
The link is signified by the underline and as you can see does not break the text block. If this was to be a block element as well, it would break the unobtrusive nature of the hyperlink.
List Elements: List elements are fairly self explanatory. They make sure that text is displayed as a list, e.g.
- This is a list item
- So is this
And as you can see these also have several default characteristics, such as bullets and forced line breaks.
All characteristics of these elements can be changed using CSS, but it is important that we understand what is meant by each of the Block, Inline and List element types.
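As a rough sketch, the three demonstrations above might be marked up as follows. The file name example.htm is only a placeholder and not a real page from this workshop, and the lines between <!-- and --> are comments that the browser ignores.

```html
<!-- Two block elements: each paragraph starts on its own line -->
<p>First Paragraph Block Text</p>
<p>Second Paragraph Block Text</p>

<!-- An inline element (a) sitting inside a block element (p) -->
<p>This is a piece of paragraph text, paragraph being a block element,
and it contains <a href="example.htm">a link</a> which is an inline element.</p>

<!-- A list: ul is the list itself, each li is one item in it -->
<ul>
  <li>This is a list item</li>
  <li>So is this</li>
</ul>
```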
Tag Attributes
As well as containing an element, tags can also contain attributes that carry additional information such as which CSS style to use, height and width, text alignment or target URL of a link. These attributes appear after the element in the opening tag and consist of property/value pairs. They do not appear in the closing tag. For example:
<element attribute1="value1" attribute2="value2" attribute3="value3"> Text here </element>
So if we were to specify a text link, we would use the <a> element, but also specify the target using the href attribute.
<a href="mypage.htm">This is link text</a>
Don't be too concerned about making links yet, though. This is just an example of how an attribute is used within HTML.
Self-Closing Tags
Some tags that are used within HTML don't require a closing tag. This is generally because rather than defining an area of the page as one structural element or another, these are in fact objects that should be displayed at that point on the page. Some examples of this may be an image element (img) or a line break element (br). However in order to be valid XHTML, a mechanism to close the tag IS still required. A forward slash is added to the tag before the closing angular bracket like so:
<br />
Thus the element is opened and closed within one tag.
Now we have learned about tags, elements and attributes it is time to learn a bit about the structure of the page.
The Page Structure
Any HTML page MUST follow a certain structure, which is simple enough to remember. On the very first line of the page the DOCTYPE is declared. The doctype tells the browser which HTML specification (and thus set of rules) we are using for our page. As stated above, this tutorial is concerned with the XHTML specification and thus will recommend the XHTML transitional DOCTYPE like so.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
This includes the URL of the Document Type Definition, which tells the browser which elements and attributes are valid. Although most browsers are aware of the tags and attributes, the display of them may differ based on the DOCTYPE you use. For example, some attributes that were part of the HTML 4.01 specification are no longer included in the XHTML 1.0 specification. Thus if you use the XHTML 1.0 Strict DOCTYPE a browser should ignore these deprecated attributes.
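For comparison, the Strict DOCTYPE mentioned above is declared as shown here. It is included only for reference; the rest of this workshop assumes the Transitional DOCTYPE.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
```

Like the Transitional version, it must be the very first thing in the file.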
The HTML Element
Next comes the HTML element tags (opening and closing). There are two important rules to remember about this particular element:
- It can appear only once in the page
- EVERYTHING must be within the opening and closing tags.
Whilst modern browsers are somewhat forgiving and will render a page if novice developers break these rules, there are other 'viewers' on the Internet such as spiders that index your page for search engines. Some of these will not be able to understand your page if the rules have been broken in this way.
Note: The HTML tag can also have attributes. One of the most common is the language (lang) attribute. Thus we would put the following below the DOCTYPE.
<html lang="en-GB"> </html>
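One detail that goes slightly beyond the example above: the XHTML 1.0 specification also expects the html element to declare the XHTML namespace with an xmlns attribute, and its compatibility guidelines suggest giving the language in both lang and xml:lang. A fuller opening tag might therefore look like this sketch:

```html
<html xmlns="http://www.w3.org/1999/xhtml" lang="en-GB" xml:lang="en-GB">
</html>
```

Browsers will generally render the page without these extra attributes, but a validator checking against the XHTML 1.0 DTD is likely to ask for the namespace.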
Within the HTML Tags
Next we need to add the two sets of tags that contain the page content. The first is the head element, containing information for the browser that is not displayed as part of the visible page. The second is the body element which contains the markup for the visible part of the page that appears in the viewport of the browser (that part of the browser program that displays the page). Thus within the HTML tags, opening and closing head and body tags are required.
<html lang="en-GB">
<head> </head>
<body> </body>
</html>
Within the Head Tags
Within the head tags you can put all the 'stuff' that you don't want to be displayed in the viewport, but is still crucial to the functioning of your page. This 'stuff' includes:
- a page title that will be displayed in the browser titlebar
<title>Keith's Example HTML Page</title>
- Any link to an external stylesheet (notice that this link element has attributes and is opened and closed within one tag).
<link rel="stylesheet" href="stylesheet.css" type="text/css" media="screen" />
- Any styles specific to that page
<style type="text/css"> <!-- styles here --> </style>
- Link to an external script file
<script src="myJavascript.js" type="text/javascript"></script>
- Any scripts specific to that page
<script type="text/javascript"> <!-- script here // --> </script>
- Metadata tags (or Meta Tags for short) - these tags contain data about the page rather than page content (more of these in a moment).
These are all elements that belong inside the head tags and they should follow the rules of HTML - i.e. have opening and closing tags. Once again if these rules are not followed it is likely that it won't make a difference to the appearance of the page as modern browsers are pretty robust, but your page will not be valid and may suffer in other ways.
A Bit About Meta Tags
As I've already stated, meta tags are information about the page. There are many meta tags based on the meta element that use different attributes to give information about the page. Although there are many different meta tags, from those identifying the program that generated code to specifying expiry dates for the page, I will only cover those that I use regularly and thus deem more important.
The general syntax for the meta tag is like so:
<meta name="meta type" content="meta value" />
Where meta is the element, name is the first attribute which establishes the type of meta data being recorded, and content is a second attribute that gives the actual meta data values. So let's look at the main ones in turn.
Author: You should always identify who the author is.
<meta name="author" content="Keith Brown" />
Description: This gives a brief description of the page, which search engines often use when displaying their results.
<meta name="description" content="Keith's Example HTML Page" />
Keywords: These are the words that you would like associated with the page. You should separate each keyword with a comma.
<meta name="keywords" content="HTML, tutorial, guide, virtual, workshop" />
Copyright: You can also specify your copyright statement for the document
<meta name="copyright" content="Keith Brown 2002-2003" />
Language: It is again good practice to announce the language in which you have written the page.
<meta http-equiv="content-language" content="en-GB" />
Note: The first attribute in this case is called http-equiv, but does the same job.
Content-Type: This explains the mime-type of the page and the character set used (important for validation).
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
There are also meta tags that are aimed specifically at search engine robots.
Robots: This tells any robot that visits the page what to do. The robot can either index the page (index) or not index the page (noindex), and either follow the links on your page to other pages (follow) or not follow the links (nofollow). For example, to allow a robot to follow links on your page, but not actually index the page:
<meta name="robots" content="noindex,follow" />
Revisit-after: This tells the search engine how often to send a robot to your site. This may or may not be obeyed as search engines have their own crawling cycles, but is worth having anyway. So to specify a visit every 2 weeks:
<meta name="revisit-after" content="14 days" />
There are many more meta tags that you may wish to use, including those from the Dublin Core.
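To see how these fit together, here is a sketch of a head section assembled from the fragments above. The order of the elements is my own choice for illustration (the Content-Type line is placed first so that the character set is known before any text is read), and the robots value simply repeats the example above, so change it if you do want the page indexed.

```html
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
  <meta http-equiv="content-language" content="en-GB" />
  <title>Keith's Example HTML Page</title>
  <meta name="author" content="Keith Brown" />
  <meta name="description" content="Keith's Example HTML Page" />
  <meta name="keywords" content="HTML, tutorial, guide, virtual, workshop" />
  <meta name="copyright" content="Keith Brown 2002-2003" />
  <meta name="robots" content="noindex,follow" />
  <meta name="revisit-after" content="14 days" />
  <link rel="stylesheet" href="stylesheet.css" type="text/css" media="screen" />
  <script src="myJavascript.js" type="text/javascript"></script>
</head>
```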
Within the Body Tags
Within the body tags is where the web author puts the meat of the page. This is the part of the page that you want to be displayed in the viewport and thus is the part that tends to be 'designed'. Remember this workshop is not about the actual design of a page, but rather the tools used to logically structure content for formatting and styling with CSS.
Hierarchical Structure of Text
When presenting your page it is a good idea to have a definite hierarchical structure for the text. In paper documents you may have a hierarchical structure such as headline, sub-head one, sub-head two etc. HTML is no different and uses a structure of:
h1
-h2
--h3
---h4
----h5
-----h6
------p
Before the ability to change the appearance of headings (or even have subtypes of headings) using CSS these were misused by designers, including myself. My most common 'error' was to use h4 and h5 headings for navigational links. This workaround is no longer necessary and thus it is better to stick to a hierarchy of presenting data based on these headings. For example, first use the h1 element to give the page a title:
<h1>Keith's Example HTML Page</h1>
Then add a short paragraph with authorship details.
<p>By Keith Brown</p>
Then a second level heading...
<h2>HTML Links</h2>
Another paragraph...
<p>These are some links to HTML pages</p>
Next you could create a further two sub-categories with third level headings
<h3>Links to W3C Sites</h3>
<p>These are some links to Official W3C sites</p>
<h3>Links to Tutorial Sites</h3>
<p>These are some links to useful tutorial sites</p>
And then another second level heading to indicate the start of a new section.
<h2>HTML Discussion Sites</h2>
To see what this would look like I have prepared an example page.
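The linked example page is not reproduced here, but assembling the fragments above inside the body tags would give something like the following sketch. The indentation and blank lines are only there for readability and make no difference to the browser.

```html
<body>
  <h1>Keith's Example HTML Page</h1>
  <p>By Keith Brown</p>

  <h2>HTML Links</h2>
  <p>These are some links to HTML pages</p>

  <h3>Links to W3C Sites</h3>
  <p>These are some links to Official W3C sites</p>

  <h3>Links to Tutorial Sites</h3>
  <p>These are some links to useful tutorial sites</p>

  <h2>HTML Discussion Sites</h2>
</body>
```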
Adding a list
Next we are going to look at making lists. There are two main types of list: an ordered list (ol), which uses numbers, and an unordered list (ul), which uses bullets. Once a list has been defined (using opening and closing tags) each list item is placed within the list using list item elements (li). To demonstrate, an unordered list could be added under the "Links to W3C Sites" heading and introductory paragraph like so:
<ul>
<li>HTML Home Page</li>
<li>HTML Tidy</li>
<li>HTML Validate</li>
</ul>
And we can add an ordered list under the heading "Links to Tutorial Sites", again after the paragraph:
<ol>
<li>Keith Brown's Virtual Workshop</li>
<li>George McMurdo's HTML for the lazy</li>
</ol>
Again we can check this in an example.
Creating Links
An obvious advantage of the WWW is that you can create your own interlinked pages, and link to other sites as well. Thus we will add links to the items in each list. A standard link is made up of the a element, with an href attribute that contains the target file or URL, surrounding a piece of anchor text. This anchor text is the part that appears as an underlined hyperlink on the page. So let's add a link to the "HTML Home Page" in the first list. To do this we use the full URL of the target site, which is known as an absolute URL:
<a href="http://www.w3.org/MarkUp/">HTML Home Page</a>
We can also link to local files, by using a relative URL which specifies the path and file name of the target in relation to the current file. So a link back to this page from the top of the second list would be:
<a href="../design_4.shtml">Keith Brown's Virtual Workshop</a>
If we then add all the other links to the list we can view the example page.
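As a sketch of what the two lists might look like once every item is a link: only the first href in each list comes from the text above. The file names tidy.htm, validate.htm and lazy.htm are placeholders I have made up for illustration, and you would replace them with the real addresses.

```html
<h3>Links to W3C Sites</h3>
<p>These are some links to Official W3C sites</p>
<ul>
  <li><a href="http://www.w3.org/MarkUp/">HTML Home Page</a></li>
  <!-- placeholder targets below: substitute the real URLs -->
  <li><a href="tidy.htm">HTML Tidy</a></li>
  <li><a href="validate.htm">HTML Validate</a></li>
</ul>

<h3>Links to Tutorial Sites</h3>
<p>These are some links to useful tutorial sites</p>
<ol>
  <li><a href="../design_4.shtml">Keith Brown's Virtual Workshop</a></li>
  <!-- placeholder target below: substitute the real URL -->
  <li><a href="lazy.htm">George McMurdo's HTML for the lazy</a></li>
</ol>
```

Note that the a element goes inside the li element, not around it, because a link is an inline element and a list item is not.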
Relative URLs perhaps need a touch more explanation and to do this we should look at the two related pages used in this workshop.
Page One is this page and it has a full URL of http://www.keithjbrown.co.uk/vworks/design/design_4.shtml
Page Two is the example that we have just looked at and has a URL of http://www.keithjbrown.co.uk/vworks/design/examples/html_3.shtml
We can see that both URLs share a similar base (http://www.keithjbrown.co.uk/vworks/design/) and if we remove that we can see how the files relate to one another.
design_4.shtml | examples/html_3.shtml
The page design_4.shtml and the directory 'examples' are on the same level, but as the html_3.shtml is within the examples directory it is a level below the design_4.shtml file (with the base URL being the 'top'). To link to page two from page one, we must specify both the directory AND the filename:
<a href="examples/html_3.shtml">Keith's Example HTML Page</a>
Whereas to link to the first page from the second page, we can specify the target file (design_4.shtml), but we must also specify that this file is a level above the second file. This is done by using a double dot (..) as the directory name, which means 'go up one directory' and is standard across all operating systems.
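The two directions can be sketched side by side as below. The third link is hypothetical: it assumes a file called index.shtml exists two levels above the examples directory, purely to show that the double dot can be repeated.

```html
<!-- In design_4.shtml (page one): go down into the examples directory -->
<a href="examples/html_3.shtml">Keith's Example HTML Page</a>

<!-- In examples/html_3.shtml (page two): go up one directory -->
<a href="../design_4.shtml">Keith Brown's Virtual Workshop</a>

<!-- Hypothetical: from page two, go up two directories to index.shtml -->
<a href="../../index.shtml">Virtual Workshops</a>
```

Because none of these links mention the domain name, they should keep working if the whole directory tree is moved to another server.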
Inserting Images
Images are important to all sites, no matter how plain, and therefore we must also know how to insert them properly. The element for specifying an image is img, and the tag relies on a src attribute to specify the image to embed. The src can (like the link above) be either a local file or a remote URL. Let's insert a logo at the top of the page (just after the opening body tag). This will be a local file stored in an images subdirectory.
<img src="images/htmllogo.gif" alt="HTML Workshop" height="60" width="500" />
Notice that as well as the src attribute there are three other attributes that should be used in all images. These are an alt attribute which specifies some text to display in place of the image in non-graphical browsers and the height and width attributes. Specifying both height and width allows graphical browsers to render the page more quickly than if all the images on the page had to be downloaded to determine their attributes, and thus the layout of the page on the screen.
As the code generated so far is valid, we can also use the W3C's 'Valid XHTML 1.0' button, which has a remote URL as the src. This also includes a link to the validator, using the image as an anchor. We shall insert this code at the bottom of the page just above the closing body tag.
<p>
<a href="http://validator.w3.org/check/referer"><img
src="http://www.w3.org/Icons/valid-xhtml10"
alt="Valid XHTML 1.0!" height="31" width="88" /></a>
</p>
It is worth noting that you can split tags over multiple lines and the white space is ignored. Also please note that it is generally not good practice to link to images on a server that you do not control. The W3C do however allow linking to their images.
Again a quick check of an example page will see this in action.
Tables
Although tables have been used extensively by designers to control page layout (as in my Visual Design Virtual Workshop), their true function is to display tabular data, and they are important to learn for that reason alone. Tables follow a simple structure and to demonstrate this we will add some tabular data under the second level heading "HTML Discussion Sites". First the table element tags are specified.
<table width="500"> </table>
Again we use a width attribute, this time in the opening table tag. For each row of the table we must specify opening and closing tr element tags within the table. This example will use 3 rows.
<table width="500">
<tr> </tr>
<tr> </tr>
<tr> </tr>
</table>
Within each row we must then specify the cells or Table Data (td) elements that represent the columns in each row. This example will use 3 columns with table headings of 'Site Name' (which will also be a hyperlink), 'Description' and 'Rating'.
<table width="500">
<tr>
<td>Site Name</td>
<td>Description</td>
<td>Rating</td>
</tr>
<tr>
<td><a href="http://www.sitepointforums.com/forumdisplay.php?forumid=16">
SitePoint Community Forums</a>
</td>
<td>Just building your first webpage? Ask beginner questions here.
Find out what you need.
</td>
<td>
Good
</td>
</tr>
<tr>
<td>
<a href="http://www.webmasterworld.com/forum21/">Webmaster World</a>
</td>
<td>
Q and A around the grand topic. HTML, the heart and soul of the net.
</td>
<td>
Excellent
</td>
</tr>
</table>
This will work fine (see this example), but the headings have not really been highlighted. This is because we have used 'ordinary' td elements in the first row, which contains the headings. Instead, the first row should use the table heading (th) element.
<table width="500">
<tr>
<th>Site Name</th>
<th>Description</th>
<th>Rating</th>
</tr>
This results in the headings being correctly emphasised as in this example.
Divs
As I've mentioned above, CSS is beginning to take over from tables for page layout. As part of this process the div element has taken on increased importance for defining logical divisions in a document. What this means is that you can define your page headers, footers, navigation areas and content areas with HTML, then apply styles with CSS to control layout. Using div elements follows the now familiar pattern of opening and closing tags around the relevant structural content. The different logical divisions are identified by the use of an id attribute within the tag.
The example page has 3 definite areas. A header (the logo), a footer (the W3C button) and the page content (everything else). These are marked-up like so:
<body>
<div id="header">
<img src="images/htmllogo.gif" alt="HTML Workshop" height="60" width="500" />
</div>
<div id="mainContent">
<h1>Keith's Example HTML Page</h1>
-- snip --
</table>
</div>
<div id="footer">
<p>
<a href="http://validator.w3.org/check/referer"><img
src="http://www.w3.org/Icons/valid-xhtml10"
alt="Valid XHTML 1.0!" height="31" width="88" /></a>
Designed by Keith Brown, 2003.
</p>
</div>
</body>
You can also add a little ownership notice to the footer. Adding the div elements doesn't actually make a difference to the appearance of the page, but it does pave the way for applying CSS later.
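Putting all of the pieces from this workshop together, a complete page skeleton might look like the sketch below. The content of each div is cut down to a comment or a single line, and the xmlns and xml:lang attributes on the html tag are additions that the XHTML 1.0 specification calls for rather than something taken from the example page.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en-GB" xml:lang="en-GB">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
  <title>Keith's Example HTML Page</title>
</head>
<body>
  <div id="header">
    <!-- the logo image goes here -->
  </div>
  <div id="mainContent">
    <h1>Keith's Example HTML Page</h1>
    <!-- headings, paragraphs, lists and the table go here -->
  </div>
  <div id="footer">
    <p>Designed by Keith Brown, 2003.</p>
  </div>
</body>
</html>
```

Each id value should be used only once per page, which is why the three divisions have three different names.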
Comments
As with all computer readable languages you can place comments into the source code that will be ignored by the browser. These comments take the following structure.
<!-- comment inserted in here -->
Or in plain language: an opening angular bracket, an exclamation mark, two dashes, the actual comment, another two dashes and the closing angular bracket.
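One common use, sketched below, is to label the end of a division so that it is easy to find in a long page. The wording of the comments is of course up to you, but avoid putting two dashes in a row inside the comment text, as that is not allowed in XHTML.

```html
<div id="footer">
  <!-- The ownership notice below appears on every page -->
  <p>Designed by Keith Brown, 2003.</p>
</div> <!-- end of footer -->
```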
Miscellaneous tags
There are also some miscellaneous tags, not discussed so far, that are worth mentioning briefly.
strong - This strongly emphasises text. Usage: <strong>text</strong>
em - This emphasises text. Usage: <em>text</em>
br - This inserts a line-break. Usage: text<br />
hr - This inserts a horizontal rule. Usage: <hr />
b - Similar to strong, but presentational rather than logical; makes text bold. Usage: <b>text</b>
i - Similar to em, but presentational rather than logical; italicises text. Usage: <i>text</i>
span - Defines an Inline section allowing styles to be applied. Usage: <span>text</span>
blockquote - Defines an area as a quotation and indents the text by default. Usage: <blockquote>text</blockquote>
pre - Defines an area that preserves whitespace. Usage: <pre>text</pre>
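A short sketch showing several of these tags in context follows. The text itself is invented for illustration.

```html
<p>This is <strong>very important</strong> and this is <em>emphasised</em>.<br />
This sentence starts on a new line and contains
<span>a span, which has no appearance of its own until styled</span>.</p>

<hr />

<blockquote>
  <p>This quotation is indented by default.</p>
</blockquote>

<pre>
Column A    Column B
1           2
</pre>
```

Inside the pre element the extra spaces and line breaks are kept exactly as typed, whereas in the paragraph above the line break in the source is treated as a single space.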
Conclusion
This Virtual Workshop was conceived to show how HTML should be used to logically mark up a page and should be separated from style. The finished result without any styles is pretty plain and not as pleasant to look at as many other sites, but WILL work in all browsers and provide a good browsing experience to everyone. And to those with a modern graphical browser the addition of a stylesheet can make the design visually appealing with the added bonus that changing the style sheet changes the layout of the page.