Happenings

Research Topics

I’m curious about something. Actually, I’m curious about a number of things and that, inadvertently, is the topic of this post: I’m curious as to whether or not other computing scientists/software engineers keep an active list of research topics that they want to delve into further.

For years now I’ve had a series of nagging questions on different topics that I want to answer. Sometimes they’ll be generally understood problems that I want to know more about (like recommendation systems, a topic I looked at a while back), sometimes they’ll be new problems that are just opening up and actually need people to move them along (like lifestreams, a topic I fully intend on coming back to). These, broadly, form “research topics”: things I want to know more about or explore in some way.

I tend to track these topics in two ways: 1) I simply let them drift in and out of my head. If I find that they keep coming back, then I give them some thinking time and see what falls out. 2) I keep a list of active topics and questions at hand at all times (actually stored on my mobile as a draft text message), which I look over periodically. You’ll note that I’m not necessarily actively pursuing anything, just thinking over the topics and seeing if any new information I’ve come across in the time since I last looked at the topic has shed any new light on the matter.

By using my research topics as sieves for the huge swathes of information I’m exposed to, I passively manage to form a greater understanding of things that matter to me without wasting time on dead ends. I’ve found that these methods help me find items that I’m interested in over the medium to long term, rather than picking up and following fads (I deal with current trends in a different way).

The problem is that I find this to be a little too passive at times. I’m painfully aware that my open source work (which I’ve mostly done anonymously, but you can find it if you really try) and privately posted projects have all but dried up in the last year or so. I don’t have a huge amount of free time, and the time I do have I use pursuing other interests. I think I should, however, get back into more active research and produce more product from it, whether that product is code or merely more concrete ideas.

I guess, in a shortened form, my questions are: do people have their own research topics and how do they pursue them? Relentlessly or passively? Soaking them in or furiously chasing them? I’m curious to know.

The New Console Experience

It’s been an interesting generation for gaming. For the first time in a few generations we’ve seen some big changes not just in the style of games we’re playing, but fundamental shifts in the platforms we’re gaming on. With their dashboards and updates, the modern console is a very different prospect from those gone before.

For no particular reason, I want to catalogue some of the big changes we’ve seen at a platform and hardware level, and mention at least one change I’d like to see in the future. (Some of these are rooted, in part, in the previous generation, but have really come into their own in this generation):

  • Achievements – I’ve talked about this before but I think achievements, when done well, are the most promising change in gaming for a long time. The best achievements extend the life of games by asking you to do something out of the ordinary, the worst are essentially free (think the Avatar game). Although I’m not a Gamerscore Whore, I often find myself pushing on a little longer and further to get something that’s just out of reach. A great idea.
  • Friend lists – The Wii/DS online experience shows us exactly how not to do online gaming. While it’s good to make sure that both parties know each other for safety reasons, do they really need to do it for every single game? The 360 model is effective, safe enough, and user friendly. You add each other, and that’s that. Knowing which of your friends is online and ready to play something is very handy. I’ve had some excellent Halo 3 and Burnout Paradise sessions that I wouldn’t have enjoyed as much with random strangers.
  • Avatars – Nintendo got this one very right: bring in the mass market by putting them in the games. While I’m sure Wii Sports would’ve been plenty of fun without them, the Miis of your friends add a lot of extra charm. Who hasn’t moaned when they get someone useless on their baseball team? A very handy excuse for poor performance. I’d like to see their integration go further though. Most of the games that use the Miis (or the new 360 avatars) are pretty lightweight, casual games. I think we can do better than that.
  • Control methods – Nintendo, again, have proven very handy here. The Wiimote and balance board have both brought in gamers way outside the traditional hardcore element. Long may it continue. I hope we see some better uses of other inputs, such as the cameras that can be bought for the 360 and PS3, and the plastic instruments from the various music games that are around (I’m a big fan of Guitar Hero).
  • Downloadable Content – While I think few games have yet delivered on the promise of DLC, we’re on the threshold of delivering substantial new content for games that would otherwise be shelved. Burnout Paradise has really led the way here, with the bike pack etc., but we’re seeing some big updates coming for most of the A-class games (I am, in fact, writing this while I await the download of Fable 2’s Knothole Island, released today).

I fully expect that, while not all consoles today have all of those features, all of the next generation consoles will have them all.

What would I like to see for the next generation? Richer integration with the web. Now, I know that sounds a little odd, but hear me out. Right now, you can go and find my Xbox Live gamercard online (I won’t link to it just now, but it’s not hard to find). It’ll tell you my current score and some recently played games. A few other sites that happen to be part of the Xbox Community Developers Program can also get at a handful of my other stats, like recent achievements. I want more.

I want a decent REST API for everything that happens to my profile (which anyone can opt out of, of course): games played, time played per day, new achievements, and any game specific stats like my levelling up in Fallout 3, or a new high score in Wario Ware. I want everything opened up.

Why? Two reasons: 1) I’m becoming very interested in the concept of lifestreams (more on which at a later date), and 2) because there is information there that I bet is illuminating and can be used in interesting ways that the developers and I cannot foresee just now.

Maybe I could tie my playing time stats into a fitness website, which would start hassling me about getting out and about. Maybe I’d like to see the distance my virtual characters have walked in Fallout 3 or Fable 2 (or both of them combined), and have it project onto a Google Map.

In short, I don’t know exactly why I want that data, but I know I want it to be accessible.

Is it likely to happen? Probably not. While I see the platforms opening up slowly, there is a cost in making that volume of data available and I honestly don’t know if any of the platform holders or developers would be willing to foot the bill for potentially nothing.

I hope they do though, because the more information in our lives we can mash together and accumulate into a context, the more interesting and rich that data can become.

Data Mappers with Java

In the Java world, there have been many ways to persist and retrieve domain objects, and most involve some form of ORM. Modern methods include Hibernate, iBatis, Spring JDBC Templates and other JPA providers; these are all fine methods and all should be used appropriately. Most of the time you should use one of the above approaches and not what I’m going to show you below.

One thing I’d like to say about persistence in Java is that you should almost never be using raw JDBC. There’s too much boiler-plate and repetition. You need to create a DataSource, and then get a Connection, then use that Connection to get a PreparedStatement. That can start to look pretty messy.

Worst of all is that you get a ResultSet back which, if used without any form of conversion or mapping to your own domain, exposes the rest of your application to the persistence layer. This is bad for several reasons, not least because it makes your code more tightly coupled and brittle in the face of change. Let’s be clear here: passing around a ResultSet, CachedRowSet or any other variant is BAD.

It’s easy to avoid this problem by converting between a ResultSet row and a domain-specific object. Spring popularised an idea called the RowMapper (which I’ll outline below). The idea is that you have an interface that, once implemented, will convert between a ResultSet and your domain object.

Why might you need this? If you’re dealing with legacy applications, it’s frequently too costly to go back and retrofit one of the more modern approaches to ORM; the database layer doesn’t fit well with Hibernate, or the rest of the codebase won’t sit nicely with JPA, or you can’t take an (additional) external library dependency for any number of reasons. Sometimes it’s easier to quickly roll your own. Here’s an example I’ll call a DataMapper (NOTE: I’ve omitted Exception handling for the sake of clarity, you will need to add this back in!):

public interface DataMapper<T> {

	public T mapRow(ResultSet resultSet);

}

That’s nice and simple. When you implement this interface, your implementation doesn’t bother with any sort of row iteration. All it does is take the current row of the ResultSet, call the standard getString(...) etc. methods and build up a single one of your beans, here represented by the generic placeholder T. Elsewhere you have this partner method (I like to keep this as a statically imported utility):

public static <T> Collection<T> map(ResultSet resultSet, DataMapper<T> mapper) {
    Collection<T> mappedObjects = new LinkedList<T>();
    while (resultSet.next()) {
        mappedObjects.add(mapper.mapRow(resultSet));
    }
    return mappedObjects;
}

This method takes care of the iteration. You pass it your resultSet, and a DataMapper implementation and it’ll give you back a collection of your domain objects. Still quite simple and neat.

How about an example? Let’s say you have a ResultSet that could be mapped to a simple Person bean:

public class Person {

    private String firstName;
    private String lastName;
    private int age;

    //getters and setters for the above fields are omitted
    //Imagine they were below. Go on, it'll make life easier.

}

To do this using the DataMapper code I’ve outlined above, you’d have something like this:

public class PersonMapper implements DataMapper<Person> {

    @Override
    public Person mapRow(ResultSet resultSet) {
        Person bean = new Person();
        bean.setFirstName(resultSet.getString("first_name"));
        bean.setLastName(resultSet.getString("last_name"));
        bean.setAge(resultSet.getInt("age"));
        return bean;
    }

}

Pass that to the map method along with a valid ResultSet and you’ll get out a Collection of Person beans to work with. We’ve gone from ResultSet to domain-specific in very little code.
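To see the pieces working together without a database, here is a minimal sketch. It makes a few assumptions that are mine rather than part of the pattern: the interface declares `throws SQLException` (the exception handling omitted above), the static method lives in a class I've called `Mappers`, and `FakeResultSets` is a hypothetical helper that fakes just enough of a ResultSet (`next()`, `getString(String)` and `getInt(String)`) using a dynamic proxy. In real code the ResultSet would come from your PreparedStatement.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

class DataMapperDemo {
    public static void main(String[] args) throws SQLException {
        // Two made-up rows standing in for a real query result.
        ResultSet rows = FakeResultSets.of(Arrays.asList(
                FakeResultSets.row("Jane", "Doe", 34),
                FakeResultSets.row("John", "Smith", 52)));
        for (Person person : Mappers.map(rows, new PersonMapper())) {
            System.out.println(person.getFirstName() + " " + person.getLastName()
                    + ", " + person.getAge());
        }
    }
}

// Same shape as the DataMapper above, plus the SQLException that JDBC getters throw.
interface DataMapper<T> {
    T mapRow(ResultSet resultSet) throws SQLException;
}

class Person {
    private String firstName;
    private String lastName;
    private int age;

    String getFirstName() { return firstName; }
    void setFirstName(String firstName) { this.firstName = firstName; }
    String getLastName() { return lastName; }
    void setLastName(String lastName) { this.lastName = lastName; }
    int getAge() { return age; }
    void setAge(int age) { this.age = age; }
}

class PersonMapper implements DataMapper<Person> {
    @Override
    public Person mapRow(ResultSet resultSet) throws SQLException {
        Person bean = new Person();
        bean.setFirstName(resultSet.getString("first_name"));
        bean.setLastName(resultSet.getString("last_name"));
        bean.setAge(resultSet.getInt("age"));
        return bean;
    }
}

final class Mappers {
    // Walks the rows and hands each one to the mapper.
    static <T> Collection<T> map(ResultSet resultSet, DataMapper<T> mapper) throws SQLException {
        Collection<T> mappedObjects = new LinkedList<T>();
        while (resultSet.next()) {
            mappedObjects.add(mapper.mapRow(resultSet));
        }
        return mappedObjects;
    }
}

// Hypothetical helper, not part of JDBC: an in-memory ResultSet that only
// understands next(), getString(String) and getInt(String).
final class FakeResultSets {
    static Map<String, Object> row(String firstName, String lastName, int age) {
        Map<String, Object> row = new HashMap<String, Object>();
        row.put("first_name", firstName);
        row.put("last_name", lastName);
        row.put("age", age);
        return row;
    }

    static ResultSet of(final List<Map<String, Object>> rows) {
        final int[] cursor = {-1}; // JDBC cursors start before the first row
        InvocationHandler handler = (proxy, method, args) -> {
            switch (method.getName()) {
                case "next":
                    return ++cursor[0] < rows.size();
                case "getString":
                case "getInt":
                    return rows.get(cursor[0]).get(args[0]);
                default:
                    throw new UnsupportedOperationException(method.getName());
            }
        };
        return (ResultSet) Proxy.newProxyInstance(
                FakeResultSets.class.getClassLoader(), new Class<?>[] {ResultSet.class}, handler);
    }
}
```

The caller never touches the ResultSet beyond handing it over, which is the whole point: swap the fake for a real query and nothing downstream changes.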

Now these DataMappers don’t have to be simple. They could use the ResultSet and only the ResultSet, or they could use that as a basis to do more queries to build the object. They could look up other services for information. They can do whatever you want for a relatively high power-to-lines-of-code ratio.

I’ve found myself using the DataMapper abstraction for the purposes of unit testing. You write a DataMapper that outright ignores the (perhaps empty) ResultSet you pass it, and instead uses some randomised test data you’ve got ready. You can now easily create test beans without relying on the DB layer.
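A test mapper along those lines might look something like the sketch below. The class name `RandomPersonMapper`, the name pools and the seeded `Random` are all my own illustrative choices. The seed is there so a failing test can be reproduced. The interface and bean are repeated only so the sketch compiles on its own.

```java
import java.sql.ResultSet;
import java.util.Random;

class StubMapperDemo {
    public static void main(String[] args) {
        DataMapper<Person> mapper = new RandomPersonMapper(42L);
        // No database and no ResultSet: the stub ignores whatever it is given.
        Person person = mapper.mapRow(null);
        System.out.println(person.getFirstName() + " " + person.getLastName()
                + ", " + person.getAge());
    }
}

interface DataMapper<T> {
    T mapRow(ResultSet resultSet);
}

class Person {
    private String firstName;
    private String lastName;
    private int age;

    String getFirstName() { return firstName; }
    void setFirstName(String firstName) { this.firstName = firstName; }
    String getLastName() { return lastName; }
    void setLastName(String lastName) { this.lastName = lastName; }
    int getAge() { return age; }
    void setAge(int age) { this.age = age; }
}

// Illustrative stub: builds beans from canned, randomised data instead of a row.
class RandomPersonMapper implements DataMapper<Person> {
    private static final String[] FIRST_NAMES = {"Jane", "John", "Sam"};
    private static final String[] LAST_NAMES = {"Doe", "Smith", "Jones"};

    private final Random random;

    RandomPersonMapper(long seed) {
        this.random = new Random(seed); // fixed seed keeps test runs repeatable
    }

    @Override
    public Person mapRow(ResultSet ignored) {
        Person bean = new Person();
        bean.setFirstName(FIRST_NAMES[random.nextInt(FIRST_NAMES.length)]);
        bean.setLastName(LAST_NAMES[random.nextInt(LAST_NAMES.length)]);
        bean.setAge(18 + random.nextInt(60)); // ages from 18 to 77 inclusive
        return bean;
    }
}
```

Because the code under test only depends on the DataMapper interface, it can't tell whether its beans came from a database row or from this stub.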

It’s not perfect, and there are often better ways of doing persistence, but I’ve found this pattern has been very useful on a number of occasions.

Six Years of Solitude

So, as of the publishing date of this piece, Solitude will have been around and active for six years. It seems like a long time. Going back and looking over the first few entries is, as one might expect, cringe-inducing.

There’s the very first post (which promised the now long-defunct main VKPS site would be back soon – HA!) and first real update about Geolocation (which hasn’t really done much of any use, though we’ll see if Geode changes that), and the first film review (for The Tuxedo), and even the first recipe (I remember how bad that omelette was, it was all shell).

Even though my post rate has dropped over the years, I hope that Solitude or a successor will still be here in another six years, to give me pleasant memories to wince over.

Neater Excel in Java

If you write applications or websites using Java, you’ve probably needed to export some data to Excel at some point or other. It’s pretty well accepted that the best library for this is the invaluable POI. It provides a neat wrapper over most Excel functionality. It lets you create or reference a spreadsheet (an HSSFWorkbook in POI terms), and then gives you object abstractions for just about any way you might want to manipulate that. Cell styles, formulas, drawing shapes… I won’t go into everything but the cook book is a pretty good starting point for any work you might be doing.

The one thing that’s always bothered me about POI is that it’s pretty low-level: you have to manage absolutely every aspect of what you’re doing. Now it’s great to be able to access that kind of power, but an abstraction over the top would be good.

That’s where jXLS comes in. It builds on top of POI to provide a templating approach to writing new Excel documents, a decent abstraction for most purposes.

Rather than managing everything from what can quickly become fairly complex Java code, you create a template Excel file that contains placeholders for where your data should appear. It’s smart enough to be able to intelligently expand a collection of beans and create a new row for each one. That can reduce your boilerplate code significantly.

Downsides? The documentation isn’t the best, and the expression language for some of the more advanced uses is JEXL, which is pretty horrible. Those, however, aren’t major concerns. You’ll rarely need either in depth and, if in doubt, you can always drop down to the POI APIs anyway.

Now, it’s not to be used in all circumstances, there are still situations that are a little beyond it, but it’s an extremely good 80/20 API (you get 80% of the use cases made easy, and the other 20% are still possible). Definitely worth a look.