November 27, 2003

Fixed UGC RSS Feeds

After the UGC recently changed their pages to a more irregular format (in terms of code), the screen-scraped RSS feeds I made from them fell apart.

I finally got a chance to look at the code tonight and was terrified that the fault might lie in the main scraping code (it’s fragile, very fragile) rather than in the postprocessing.

Thankfully, the problem was that rather than leave the director and actor details blank (as they did previously), they now remove those lines entirely. This actually reduces the amount of code I need to parse out the details I don’t want.
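
To give a rough idea of the kind of change involved, here is a minimal sketch in Python. It is not the actual feed code. It assumes each listing's details arrive as "Label: value" lines, and the labels, the sample data and the parse_details name are all made up for illustration. Looking fields up by label rather than by position means a blank line and a missing line come out the same.

```python
def parse_details(lines):
    """Collect 'Label: value' lines into a dict, ignoring blank values.

    Hypothetical helper: the real page layout and field names may differ.
    """
    details = {}
    for line in lines:
        label, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'Label: value' line, skip it
        value = value.strip()
        if value:  # a blank value is treated the same as a missing line
            details[label.strip().lower()] = value
    return details


# Old-style listing: the lines are present but left blank (made-up data).
old_style = ["Title: Example Film", "Director:", "Starring:"]
# New-style listing: the lines are removed entirely.
new_style = ["Title: Example Film"]

print(parse_details(old_style))
print(parse_details(new_style))
```

Both listings produce the same dictionary, so nothing downstream needs to care which layout the page used.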

In short, there was a problem and I’ve now fixed it. Please resume enjoyment of the UGC RSS feeds as long as they last before breaking again (such is the perilous nature of screen scraping).