“Brief” Guide to JavaScript – Part 5
DOM Tree
This part covers the main reason for using JavaScript in the first place. Browsers give JavaScript access to the DOM, or Document Object Model, which is how the browser internally represents the HTML tree. This is an extremely large topic in its own right. There is a nice article in progress at Nettuts, and you can find plenty of other resources on the internet besides this article. I wish I could go in-depth about the DOM, but it would consume more than half this series (and JavaScript isn't just about the DOM tree). Experimentation and online resources will help you plenty when wrangling with the DOM.
What is it?
The document object model is a tree-shaped data structure which represents the html structure of a page. Browsers use the DOM as a way to draw the page. Given this example HTML page:
<html>
  <head>
    <title>My title</title>
  </head>
  <body>
    <h1 id="header">My title</h1>
    <p>My paragraph text</p>
    <p id="secondpara">My second <strong>amazing</strong> paragraph.</p>
  </body>
</html>
When the browser loads this page, it generates a DOM tree close to the one shown below:
Before we start, we should quickly review some terminology of typical computer-science trees.
- Node: In the graph, a node is a box. Nodes make up the tree itself.
- Leaf Node: A node that has no nodes underneath it.
- Internal Node: A node that has one or more nodes underneath it.
- Root: All nodes are "under" this node. For DOM trees, that's the document node which is the root node for all html elements.
- Parent: A parent node is a node that is "above" the particular node we're talking about. For example, the body node is the parent node to the h1 node.
- Child: A node that is directly underneath the node we're talking about. The h1 node is a child node to the body node but the strong node would not be a child node to the body node.
- Sibling: Nodes that share the same parent node with the node we're talking about. The h1 node has two sibling nodes, both of which are p nodes.
(As you can start to see, most of the naming schemes of trees are similar to family trees.)
Browsers start at the root node and visit every child, grandchild, great grandchild, etc. node to figure out how to draw the page. Any node that isn't a descendant of the root node isn't drawn.
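The visiting order described above can be sketched without a browser. This is only a rough illustration: makeNode and walk are made-up helpers, and the plain objects stand in for real DOM nodes (whitespace text nodes are left out to keep the tree small).

```javascript
// Hypothetical stand-in for DOM nodes: plain objects with a name and children.
function makeNode(name, children) {
  return { name: name, children: children || [] };
}

// Rough model of the sample page (whitespace text nodes omitted).
var tree = makeNode('document', [
  makeNode('html', [
    makeNode('head', [makeNode('title', [makeNode('#text')])]),
    makeNode('body', [
      makeNode('h1', [makeNode('#text')]),
      makeNode('p', [makeNode('#text')]),
      makeNode('p', [
        makeNode('#text'),
        makeNode('strong', [makeNode('#text')]),
        makeNode('#text')
      ])
    ])
  ])
]);

// Visit a node, then each of its children in order (depth-first).
function walk(node, visit, depth) {
  depth = depth || 0;
  visit(node, depth);
  for (var i = 0; i < node.children.length; i++) {
    walk(node.children[i], visit, depth + 1);
  }
}

var visited = [];
walk(tree, function (node, depth) {
  visited.push(node.name);
});
console.log(visited.join(' > '));
```

Every node reachable from the root gets visited exactly once. A node that was never attached under the root would simply never be reached, which is why it isn't drawn.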
Now let's talk about the content of the tree itself. Each element node represents a particular HTML element in the page, including all its css styling and attributes. Text nodes are simply a special kind of node the browser uses to represent text. Although text nodes are always created when you enter text, the creation of certain text nodes varies from browser to browser.
For example, if you added spacing when typing your html page (like the html I showed to display this tree), the browser would add text nodes because of the spaces and newlines. The only exception is for certain elements that don't accept text nodes directly (like tr or table elements). But pesky Internet Explorer is the only browser that adds text nodes to those elements.
All html nodes are descendants of the document node. Besides being the root for all html elements, the document node is directly accessible through javascript and provides access to its child elements. Since html documents always have head and body elements, we can access them easily with the document node:
document.body; // the body element
document.head; // the head element (missing in some older browsers)
head, body, html, and all other html elements provide many capabilities, but we'll only cover some of the most common ones:
- element.tagName: The html tag that represents this element, in all uppercase. document.body.tagName would be set to "BODY" and document.head.tagName would be set to "HEAD". The tagName property of non-element nodes, such as text/comment nodes, is undefined.
- element.nodeType: Like tagName, but gives the type of node in numeric form:
- 1 = Element Node: Your typical html element
- 2 = Attribute Node: An attribute of an element (see the last bullet point in this list)
- 3 = Text Node: A node that represents text
- 8 = Comment: A comment in your html page.
- 9 = Document: The root node for a DOM tree.
- element.innerHTML: The most popular feature that was originally not part of the DOM spec. This contains all the html inside this element. Using our html page from above, document.head.innerHTML would contain "<title>My title</title>" (plus any surrounding whitespace). You can also set this property to replace the interior html.
- element.innerText: Like innerHTML, but text only (no html tags). This is useful if you don't want to bother escaping your string to display text in an element. The standard DOM property textContent does a similar job.
- element.style: This provides access to the element's inline style. Changes only affect the given element, just like using the style attribute in your html tags (<p style="...">). Properties under style are named after their CSS counterparts, except that a hyphenated name is converted to camel case: background-color becomes backgroundColor. The one exception is the float property, which is styleFloat in IE and cssFloat in other browsers. All CSS property values are stored as strings.
- element.className: Since class is a reserved word in the JavaScript language, className is used instead. className contains a string of all the classes attached to this element.
- element.getElementsByTagName(tagNameString): A function that accepts a tag name as a string and searches the elements contained in this element. Returns all the descendant elements with the matching tag.
- element.attributes: An array-like collection of attribute nodes that holds all the attributes of the element. Usually you can access the attributes directly on the element (like document.body.id) or by using getAttribute, setAttribute, and removeAttribute.
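To see nodeType and tagName working together, here is a small sketch. describeNode and NODE_TYPE_NAMES are names I made up for this example (they are not browser APIs), and the helper only reads the two properties covered in the list above, so it also works on plain objects outside a browser.

```javascript
// Names for the nodeType codes listed above.
var NODE_TYPE_NAMES = {
  1: 'element',
  2: 'attribute',
  3: 'text',
  8: 'comment',
  9: 'document'
};

// describeNode is a hypothetical helper; it only reads nodeType and tagName.
function describeNode(node) {
  var kind = NODE_TYPE_NAMES[node.nodeType] || 'other';
  return node.tagName ? kind + ' <' + node.tagName.toLowerCase() + '>' : kind;
}

// In a browser, real nodes can be passed in directly.
if (typeof document !== 'undefined') {
  console.log(describeNode(document));      // document
  console.log(describeNode(document.body)); // element <body>
}
```

Checking nodeType before touching tagName is a handy habit, since text and comment nodes show up in places where you might expect only elements.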
Let's change the h1 text as soon as the page finishes loading. To do this, we need to get to our h1 element.
Luckily, we gave the element an id attribute. JavaScript provides functionality to quickly access id'ed elements: document.getElementById. To quickly access the h1 tag we do this:
<html>
  <head>
    <title>My Site</title>
  </head>
  <body>
    <h1 id="header">My title</h1>
    <p>My paragraph text</p>
    <p id="secondpara">My second <strong>amazing</strong> paragraph.</p>
    <script type="text/javascript">
    // <![CDATA[
    alert(document.getElementById('header').innerHTML); // displays "My title"
    // ]]>
    </script>
  </body>
</html>
Here 'header' is the id attribute value. But we want to change the text, and the easiest way is to modify innerHTML, so let's do that.
<script type="text/javascript">
// <![CDATA[
document.getElementById('header').innerHTML = 'JavaScript Rulez!';
// ]]>
</script>
Let's say we want to make the h1 element red with a light-light grey background. We can use the style properties:
var h1 = document.getElementById('header');
h1.style.color = 'red'; // as if we're setting css style, color
h1.style.backgroundColor = '#eee'; // rgb, hex, and keywords are accepted.
h1.style['background-color'] = '#eee'; // does same as above if you like your hyphens
Styling without those silly stylesheets. Although stylesheets are better, really.
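The hyphen-to-capital naming rule for style properties can be written down as a tiny function. cssToStyleName is a hypothetical helper, not something the browser provides, and it deliberately ignores the float special case.

```javascript
// cssToStyleName is a hypothetical helper, not a browser API.
// It turns each "-x" into "X", which is how style property names are derived.
function cssToStyleName(cssName) {
  return cssName.replace(/-([a-z])/g, function (match, letter) {
    return letter.toUpperCase();
  });
}

console.log(cssToStyleName('background-color')); // backgroundColor
console.log(cssToStyleName('border-top-width')); // borderTopWidth
console.log(cssToStyleName('color'));            // color
```

With a helper like this you could write h1.style[cssToStyleName('background-color')] = '#eee' and keep your CSS names as they appear in a stylesheet.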
Manually Navigating the Tree
There comes a point where you want to reach a part of the DOM tree that doesn't have an ID, either because you don't want to clutter your page with a bunch of id's or because you're creating your own JavaScript framework/embed script. That's where we can use the following properties for elements:
- element.parentNode: gets the parent node of the current element.
- element.previousSibling: gets the sibling node that comes just before the current element.
- element.nextSibling: like previousSibling, but gets the sibling node that comes just after the current element.
- element.childNodes: an array-like list of all of the child nodes of this element.
- element.firstChild: simply a short-hand to get the first child node. Equivalent to childNodes[0] when the element has children.
- element.lastChild: short-hand to get the last child node. Equivalent to childNodes[childNodes.length - 1] when the element has children.
null is returned when the requested node doesn't exist (like getting the previousSibling of a first child).
I find it easier to learn with examples than definitions. So let's say our h1 element didn't have an ID (where we can use getElementById), but we wanted to get to that element.
We'll start from the top of the DOM tree (document) and work our way to the h1 element. At first glance, the following line may seem to get what we want.
document.body.firstChild;
But there are text nodes, which I didn't show on my diagram since the diagram would be too large. There is a text node in between and around each element. So the code above retrieves the text node before the h1 node (which probably contains a couple newlines and spaces). We can get to the h1 node in several ways:
document.body.firstChild.nextSibling; // the node after the first (text) node
document.body.childNodes[1]; // second node
// gets all h1 tags on the page (we're looking for the first and only one)
document.body.getElementsByTagName('h1')[0];
From there we can do whatever manipulations we want.
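Since whitespace text nodes can appear or not depending on the markup and the browser, hard-coding childNodes[1] is fragile. One way around that, sketched here, is to walk the siblings and skip anything that isn't an element. firstElementIn is a made-up helper name; it relies only on firstChild, nextSibling and nodeType. (Current browsers also expose a built-in firstElementChild property that does the same job.)

```javascript
var ELEMENT_NODE = 1; // nodeType value for element nodes

// firstElementIn is a hypothetical helper: it returns the first child of
// `parent` that is an element, or null when there is none.
function firstElementIn(parent) {
  var node = parent.firstChild;
  while (node && node.nodeType !== ELEMENT_NODE) {
    node = node.nextSibling; // skip text and comment nodes
  }
  return node || null;
}

// In a browser, this finds the h1 on our sample page regardless of whitespace.
if (typeof document !== 'undefined') {
  var first = firstElementIn(document.body);
  console.log(first ? first.tagName : 'no element children');
}
```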
Manipulating the Tree
I said we can manipulate the element once we pick it out, but we haven't really talked about how to do that. Our handy element object provides some more functions to handle adding and removing nodes in the DOM:
- element.appendChild(newNode): Inserts a passed element, newNode, as the last child node for the current element.
- element.insertBefore(newNode, referenceNode): Like appendChild, but inserts the passed element before the given child referenceNode. If referenceNode is null, then this function behaves like appendChild.
- element.replaceChild(newNode, childNode): Takes a new node and existing node. Replaces the existing child node with the new node.
- element.removeChild(childNode): Removes the given child element.
- element.cloneNode(deep?): Returns a duplicate of the current node. Accepts a boolean (true/false) value that indicates whether to perform a deep clone. A deep clone also copies all descendant nodes. The duplicated node has no parent node and thus is not drawn.
Using the previously provided html page, we can move the h1 element to the end by simply re-appending it to the body element:
document.body.appendChild(document.getElementById('header'));
Or maybe you didn't like that header one iota.
document.body.removeChild(document.getElementById('header'));
And away it goes. Now you want the second paragraph to be repeated, for emphasis or your subliminal messaging scheme:
// copies the strong element too.
var cloned_paragraph = document.getElementById('secondpara').cloneNode(true);
document.body.appendChild(cloned_paragraph); // attach cloned node to the DOM tree.
cloned_paragraph.id = "thirdpara"; // change the id since IDs need to be unique.
We also have one more function, provided by document, to complete our element manipulation features. document.createElement creates a new, parent-less, element with a given tag name. Let's dynamically add an image to the page. I'm using a local copy of a photo on Flickr to load since hotlinking is evil.
var img = document.createElement('img');
img.src = 'cat.jpg'; // whatever you named it
document.body.appendChild(img); // attach to tree to be displayed
Result:
Kitten! (photo taken by Tambako)
Feel free to experiment on your own. (That's your homework!)
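As a starting point for that experimenting, the createElement and appendChild steps from above can be wrapped in a small function. appendImage is a hypothetical helper, and taking the document as a parameter is my own choice so the function can also be tried against a stand-in object outside a browser.

```javascript
// appendImage is a hypothetical helper. `doc` is expected to look like the
// browser's document: it needs createElement() and a body with appendChild().
function appendImage(doc, src, altText) {
  var img = doc.createElement('img'); // new element with no parent yet
  img.src = src;
  img.alt = altText || '';
  doc.body.appendChild(img); // not drawn until it is attached to the tree
  return img;
}

// In a browser, pass in the real document.
if (typeof document !== 'undefined') {
  appendImage(document, 'cat.jpg', 'Kitten');
}
```

Returning the new element lets the caller keep tweaking it (setting an id or a style, say) after it has been attached.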
To reiterate, there are good detailed resources about the DOM tree out there. Google and Mozilla are good starting points.
Window Object
As stated earlier, the window object is the browser's way of storing the global state of the page. Listed here are some browser-provided features exposed via the window object.
- alert(message): You probably know what this does already. Displays a modal dialog with the message as text.
- document: The browser's DOM tree exposed to JavaScript. We just talked about it above.
- location: Information about the current page's URL, provided by the browser as an object.
- hostname: The subdomain and domain name (eg - finance.google.com)
- host: like hostname, but also includes the port number when the URL specifies one.
- href: The full URL the browser used to reach this page (eg - http://finance.google.com/finance). Changing this property will make the browser go to another url. Use location.replace(newURL) to also replace this page in the browser back history with newURL.
- pathname: The server path used to access the page (eg - /finance)
- protocol: The protocol the browser used to access the page. Usually either "http:" or "https:"
- console.log(message): At the time of writing this isn't implemented in all browsers (just Firefox + Firebug, Opera, Webkit, and Chrome), but it prints out a message to the JavaScript debugger. This is more flexible than alert since javascript arrays and objects can usually be inspected (instead of "[object Object]" being printed).
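To get a feel for the location properties without navigating anywhere, you can take a URL apart with the URL constructor, which uses the same property names. This goes a little beyond what the article covers: the URL constructor is available in current browsers and Node.js, and the address below (including its port) is made up purely to show how host differs from hostname when a port is present.

```javascript
// The URL below is made up for illustration; the port is there to show
// the difference between host and hostname.
var url = new URL('http://finance.google.com:8080/finance?q=goog');

console.log(url.hostname); // finance.google.com
console.log(url.host);     // finance.google.com:8080
console.log(url.pathname); // /finance
console.log(url.protocol); // http:
console.log(url.href);     // http://finance.google.com:8080/finance?q=goog
```

On a real page, window.location.hostname, window.location.pathname and friends report the same pieces for the address currently in the browser.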
That wraps it up for this week. Next up: Events. Events are utilized for responding to user actions (such as mouse clicks, key presses, etc.). Read part 6.