Happenings

The X-Philes

Further to my post about Content Sent Correctly, I’ve been added to “The X-Philes”, Evan’s list of sites that pass all three tests. So far, only 11 sites are on the list.

What I didn’t mention beforehand is that there is actually a fourth test: Why do you use XHTML? So I thought it might be worth answering that one, even if it isn’t necessary.

  1. To learn – the best way to learn a new technology, for better or worse, is to use it in a practical application. For XHTML, that means building a website,
  2. Because XHTML is XML – I like XML. It’s well-formed and logical, and if something isn’t how you wanted it to be, it’s probably your fault (rather than a browser bug, as is usually the case with HTML). It’s also scrapable: if I want to extract any part of my markup without going to stupid lengths, I can do so with any XML parser (see the sketch after this list),
  3. It’s semantically richer – no more name attribute, only one language attribute, and lots of other redundancy is gone.
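
On that second point, here’s a minimal sketch of what scrapability buys you, using PHP’s SimpleXML extension (any XML parser would do; the file name and heading level are just examples):

    <?php
    // Load a well-formed XHTML document like any other XML.
    $doc = simplexml_load_file('article.xhtml');

    // XHTML elements live in a namespace, so register a prefix for XPath.
    $doc->registerXPathNamespace('x', 'http://www.w3.org/1999/xhtml');

    // Pull out every second-level heading, no regex heroics required.
    foreach ($doc->xpath('//x:h2') as $heading) {
        echo $heading, "\n";
    }
    ?>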

I’m sure there are other reasons that I’ve forgotten for now, but I think those are sufficient reasons.

Pub Crawls

The single greatest innovation of recent times: automated pub crawling. Just give it some location data (a postcode or city name) and it’ll generate a random pub crawl for you to go on. It changes every time, so the alcoholics (read: students) out there will never get bored. Isn’t technology wonderful?

Content Sent Correctly

A few days ago, I read about a test someone was running on various sites: of the 119 sites tested, only one fully conformed to XHTML. So, I wanted to see if this site passed the tests.

The first two were easy: every page on this site validates as XHTML 1.1 (or it damn well should). The last one was a known problem: sending the pages you read as application/xhtml+xml rather than text/html. So off I went to find a solution.

It wasn’t long before I found Mark Pilgrim’s The Road to XHTML 2.0: MIME Types, which included a trivial piece of PHP that would solve my problem. It was quickly added to my headers.

Now, this change to my PHP meant that browsers that could accept the correct MIME type got pages served that way, while others (like IE/Win) got plain text/html, which wouldn’t strain them.
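
From memory, the gist of that PHP is a simple sniff of the Accept header. A rough sketch (not Mark’s code verbatim):

    <?php
    // Serve the XML MIME type only to browsers that claim to accept it;
    // everything else gets plain old text/html.
    if (isset($_SERVER['HTTP_ACCEPT']) &&
        stristr($_SERVER['HTTP_ACCEPT'], 'application/xhtml+xml')) {
        header('Content-Type: application/xhtml+xml; charset=utf-8');
    } else {
        header('Content-Type: text/html; charset=utf-8');
    }
    ?>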

Everything was going fine until I had to edit a post (I never spell-check entries until well after they go online). The editing feature of my CMS balked at the new MIME type. Why? Because browsers refuse to display XHTML inside a textarea when you use the proper MIME type (the markup gets parsed as XML rather than shown as text), leaving me screwed since my CMS dumps XHTML straight into a textarea to be manipulated.

After a bit of fudging, it all works. I can hardly wait for a new bunch of problems when XHTML 2.0 hits.
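
For anyone hitting the same wall, the usual fix is to escape the markup before it lands in the textarea, so the XML parser treats it as text. Something like this (the helper name and $content variable are made up for illustration):

    <?php
    // Hypothetical helper: escape stored XHTML so it survives
    // inside a textarea under application/xhtml+xml.
    function textarea_safe($xhtml) {
        return htmlspecialchars($xhtml, ENT_QUOTES);
    }

    echo '<textarea name="content">', textarea_safe($content), '</textarea>';
    ?>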

Sorry for how boring this all turned out. I’m a little ill today, and it seemed interesting when I started writing. Oh, how wrong.

MetaData

I recently came across the PostCon format (an RDF-based format) in a document describing an article on monsters. Take a look at it: that’s a lot of metadata! It got me thinking: how much metadata should we store for a given article?

The Finetto XML format is very small, but it’s also in its early, unsettled days (Finetto is the content management system I use and am building). The elements are as follows (a sample document is sketched after the list):

  • ID – a unique ID for each article, derived from the time it was written,
  • Title – the title of the item, not necessarily unique,
  • Date – the date the article was created. This is a throwback to when I didn’t understand how to use event-driven parsers properly, and it has always annoyed me,
  • Description – a short description of the article, entered manually,
  • Author – the name of the person who wrote the article. This appears automatically (taken from a user’s log-in), but can be entered manually,
  • Content – the content itself, as a chunk of XHTML.
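
For illustration, a document in the format might look something like this (the element names and values are guesses sketched from the list above, not the actual Finetto markup):

    <?xml version="1.0" encoding="utf-8"?>
    <!-- Illustrative sketch only: names and values are invented. -->
    <article>
      <id>1052919245</id>
      <title>Happenings</title>
      <date>2003-05-14</date>
      <description>XHTML MIME types, pub crawls, metadata and X-Men 2.</description>
      <author>A. N. Author</author>
      <content>
        <div xmlns="http://www.w3.org/1999/xhtml">
          <p>The content itself, as a chunk of XHTML.</p>
        </div>
      </content>
    </article>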

Now, compared to PostCon, that is tiny. But there are times when I wish I had stored category information, or used an RSS-like format, or even scrubbed the date (it can be derived from the ID). The question is: should we attempt to store every piece of information that could possibly be useful down the line? I’m not convinced either way.
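
On the date point, if the ID really is a plain Unix timestamp (an assumption; any other time-derived scheme would need its own decoding), scrubbing the date costs nothing:

    <?php
    // Assumes the ID is a Unix timestamp; the value below is invented.
    $id = 1052919245;
    echo date('Y-m-d', $id); // e.g. 2003-05-14
    ?>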

On one side, you’ve got the benefit that if you ever need to know anything about the document, it’s right there: no need to infer it from other sources (the web page the article appears on, for instance). On the other side, you have a tremendous amount of bloat if the data is never used. If a post is small, the metadata outweighs the data, which strikes me as horribly wrong.

When I can see a clear path to backwards compatibility, I’ll seriously look at getting a lot more metadata into my format. For now, I’ll just muse over how much is enough.

X-Men 2

Note: this contains minor plot spoilers.

After many mixed reviews from friends, I thought I’d better see X-Men 2 for myself. So I did. Today.

Although it wasn’t as bad as the first film (which had a wafer-thin plot, a badly written script, wooden acting, etc.), it was hardly ground-breaking or fantastic. More mutants from the comics were paraded around (for future use), plot lines for future films were wedged in unsubtly (anyone who has seen the cartoon series or read the comics will know what happened to Jean Grey), the dialogue was poor, and Halle Berry gave another wooden performance (how does she get work?).

The good points? Nightcrawler’s attack on the White House at the start was a stunningly executed sequence. Thinking about it, the film went downhill after that and never recovered. A shame.

All in all, a passable film.