This iteration of HTML is the first in a long time (HTML 4.01 – the current standard – was published as a W3C Recommendation way back in 1999), and I’d argue that it’s the first in the new age of branding, logos, and chiclets for everything that anyone thinks up. Because of this, there has been far more hype for HTML5 than we used to see with a new technical standard. In the old days, the techies working directly with a standard might get excited about it, but word spread more slowly and there were few podia from which to extol its virtues.
Nowadays, everything gets a brand and blog posts and press releases, all of which zing around the planet at light speed, and HTML5 is no exception. The HTML5 logo is showing up all over the place – at least in the places where I hang out. HTML5 has become a term which describes one thing – or many things. Because of this, there’s some confusion about what it even is. It may even be a floor wax and a dessert topping.
Depending on what you read or who you talk to, you may hear that HTML5 is one of the following:
- A new standard for the HTML markup language
- An entirely new way to think about and work with the Web
In my opinion, on the spectrum of options 1 to 3 above, HTML5 is definitely the first, is strongly connected to the second, and may well lead to the third.
When it comes to SharePoint, we have some challenges. Due to SharePoint’s generally three-year release cycle and the pace of change on the Web, SharePoint is always behind when it comes to the latest Web technologies. Some of us work to push past that limitation, but it can be a struggle with a platform that is as static as SharePoint is. The flip side of this, of course, is that SharePoint is an “enterprise class” platform that can be far more reliable and predictable than some of its more frequently updated cousins. (Cue laugh track here from many SharePoint developers, but trust me that it’s true.) There’s always the danger that today’s shiny new penny will be tomorrow’s recycling, so jumping on every hot new technical option right away can be a huge mistake. With SharePoint, Microsoft insulates us from that problem, yet at the same time our options can be somewhat limited.
But enough rhetoric and opinion. What is HTML5 really, anyway?
When I first start to learn about something in the technology space, I’ll often go and visit the Wikipedia page for it. I don’t always read all of the definition – oftentimes they go straight into the weeds – but just the first paragraph or two. That gives me a good feeling for what the thing is all about.
Here’s the first paragraph about HTML5 on Wikipedia (as of 26 March 2012, footnotes removed):
HTML5 is a language for structuring and presenting content for the World Wide Web, and is a core technology of the Internet originally proposed by Opera Software. It is the fifth revision of the HTML standard (created in 1990 and standardized as HTML4 as of 1997) and as of March 2012 is still under development. Its core aims have been to improve the language with support for the latest multimedia while keeping it easily readable by humans and consistently understood by computers and devices (web browsers, parsers, etc.). HTML5 is intended to subsume not only HTML 4, but XHTML 1 and DOM Level 2 HTML as well.
What this tells us is that HTML5 is simply a progression from earlier versions of HTML – no surprise there based on its name. It also tells us that it includes improvements for the “modern” things we see on the Web, such as the “latest multimedia”. Today that means videos and audio, but it may well mean new things we don’t even know about down the road. (Anyone remember Smell-o-Vision?) The new HTML5 standard adds new elements and attributes to the HTML standard and also deprecates some others (e.g., font, center).
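To make the “latest multimedia” point a little more concrete, here is a minimal sketch of the new video and audio elements. The file names are made up for illustration, and which formats a given browser can actually play will vary, so treat this as a starting point rather than a recipe.

```html
<!-- Hypothetical file names; substitute your own media files. -->
<video controls width="480">
  <!-- The browser uses the first source it is able to play. -->
  <source src="intro.mp4" type="video/mp4">
  <source src="intro.webm" type="video/webm">
  <!-- Fallback text for browsers that don't know the video element. -->
  Your browser does not support the video element.
</video>

<audio controls>
  <source src="podcast.mp3" type="audio/mpeg">
  Your browser does not support the audio element.
</audio>
```

Older browsers that don’t recognize these elements simply show the fallback text inside them, which is part of why they are reasonably safe to experiment with.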
One of the main goals for HTML5 is to bring us closer to what has been called the “Semantic Web”. This is another term I find confusing. Turning again to Wikipedia for a definition (again, as of 26 March 2012, footnotes removed):
The Semantic Web is a collaborative movement led by the World Wide Web Consortium (W3C) that promotes common formats for data on the World Wide Web. By encouraging the inclusion of semantic content in web pages, the Semantic Web aims at converting the current web of unstructured documents into a “web of data”. It builds on the W3C’s Resource Description Framework (RDF).
According to the W3C, “The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries.”
Gobbledygook, if you ask me. The point is to improve the way we mark up and delineate content in Web pages so that other applications can better understand that content and use it in other contexts. Hopefully that’s a little clearer, but let me give a very small example.
In a Web page today, we’re very likely to see something like this markup:
<div class="my-app-title-class">This is the Title of an Article</div>
<div class="my-app-body-class">Praesent porta massa vel lacus sodales placerat. Nam diam orci, pulvinar eu dapibus bibendum, rutrum in arcu. Etiam lorem mauris, vehicula dignissim commodo ut, tempus id sapien...</div>
This is all well and good. The title is displayed in the user’s browser with some CSS applied to it that makes it stand out somehow, the body text follows with some appropriate formatting, and they read it – grand. But other applications can’t look at that markup and make sense of it unless they understand the arbitrary CSS classes we’ve decided to use. What the Semantic Web principles say is that we should make it far clearer what that content is.
Here’s another way to publish that same content:
<article>
  <h1>This is the Title of an Article</h1>
  <p>Praesent porta massa vel lacus sodales placerat. Nam diam orci, pulvinar eu dapibus bibendum, rutrum in arcu. Etiam lorem mauris, vehicula dignissim commodo ut, tempus id sapien...</p>
</article>
It’s a tiny example, but by adding that new article element, we can indicate to other applications that this page contains, well, an article. That’s the basic idea behind the Semantic Web – making the content more understandable both internally to the browser and externally to other applications.
The HTML5 standard contains new elements that help to move us closer to the Semantic Web idea. Here is the list of elements, taken from the HTML5 standard, that help us to improve structure:
- section represents a generic document or application section. It can be used together with the h1, h2, h3, h4, h5, and h6 elements to indicate the document structure.
- article represents an independent piece of content of a document, such as a blog entry or newspaper article.
- aside represents a piece of content that is only slightly related to the rest of the page.
- hgroup represents the header of a section.
- header represents a group of introductory or navigational aids.
- footer represents a footer for a section and can contain information about the author, copyright information, etc.
- nav represents a section of the document intended for navigation.
- figure represents a piece of self-contained flow content, typically referenced as a single unit from the main flow of the document.
- figcaption can be used as a caption for a figure (it is optional).
As you can hopefully see from this list, we can more clearly indicate what the content in the page actually is, rather than just getting it on the page looking right.
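Here is a rough sketch of how several of the elements in the list above might fit together in one page. The site name, links, image path, and text are all placeholders I’ve made up for illustration, and I’ve left out hgroup for brevity. It is one plausible arrangement, not the only correct one.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example Page</title>
</head>
<body>
  <!-- Introductory content and site navigation -->
  <header>
    <h1>Example Site</h1>
    <nav>
      <ul>
        <li><a href="/">Home</a></li>
        <li><a href="/articles">Articles</a></li>
      </ul>
    </nav>
  </header>

  <!-- An independent piece of content, such as a blog entry -->
  <article>
    <h2>This is the Title of an Article</h2>

    <!-- A section within the article, with its own heading -->
    <section>
      <h3>Background</h3>
      <p>Praesent porta massa vel lacus sodales placerat.</p>

      <!-- Self-contained content with an optional caption -->
      <figure>
        <img src="chart.png" alt="Placeholder chart">
        <figcaption>A placeholder caption for the chart.</figcaption>
      </figure>
    </section>

    <!-- Content only slightly related to the rest of the article -->
    <aside>
      <p>A related note that is not part of the main flow.</p>
    </aside>

    <!-- Footer for the article: author, copyright, and so on -->
    <footer>
      <p>Written by A. N. Author.</p>
    </footer>
  </article>
</body>
</html>
```

Notice that there isn’t a single CSS class in there, yet another application reading this markup could still pick out the navigation, the article, and the caption just from the element names.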
There are many other new elements and attributes, changed elements and attributes, and “absent” (or no longer OK) elements and attributes listed in the HTML5 standard. Rather than copying them all into this post, I’d recommend that you peruse HTML5 differences from HTML4 if you are interested in the details.
Of course, everyone wants to know about the stuff that will make pages more zingy and fun in SharePoint, and in upcoming articles, I’ll go into much more detail on those changes. The improvements I touch on above, while not as flashy, may well have a bigger impact on the Web at large. You just may not see the differences right away.