Machine Tags in Drupal

When I started work on the OpenStreetMap module for Drupal towards the end of last year, I got as far as implementing the basics and then slowed down as I pondered how best to proceed. How to tag objects from the OpenStreetMap database in Drupal was perhaps the biggest issue to consider, and with Christmas in the way I didn't get much further with it.

Everything of interest in OpenStreetMap is either tagged (e.g., a point of interest or a whole road) or helps build up other structures that are themselves tagged (such as junctions, or vertices, in roads). The tags aren't simple 'tags' as are commonly used in Flickr, del.icio.us, or any other Web-2.0ey site you may be familiar with, but are instead key-value pairs consisting of, for example, key=amenity, value=cinema (often simply written as amenity=cinema for brevity).

These tags are similar in style to some advanced tags that a growing number of people have been using on Flickr and other sites for a while, notably those such as geo:lat=54.2, geo:lon=-4.4 to denote the location of a photo. Nothing had ever been formalised though, so those tags were listed amongst the other simple tags, looking a little out of place. That is, until recently: Flickr has launched machine tags, announcing that they will support this more advanced use case properly. You can read more about all this over on Dan Catt's blog.

Examples of these advanced tags, triple tags, or machine tags in terms of OpenStreetMap could be openstreetmap:amenity=cinema, openstreetmap:name="Palace Cinema".
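To make the namespace:key=value structure concrete, here is a minimal Python sketch that splits such a tag into its three parts. The syntax it accepts is only inferred from the examples above (a namespace, a colon, a key, an equals sign, then a value that may be wrapped in double quotes); it is not Flickr's official grammar, and the names MachineTag and parse_machine_tag are made up for illustration.

```python
import re
from typing import NamedTuple, Optional


class MachineTag(NamedTuple):
    """The three parts of a namespace:key=value tag."""
    namespace: str
    key: str
    value: str


# Assumed syntax, inferred from the examples in this post only:
# namespace and key start with a letter and contain letters, digits
# or underscores; the value is everything after the first '='.
_MACHINE_TAG = re.compile(
    r'^(?P<namespace>[A-Za-z][A-Za-z0-9_]*)'
    r':(?P<key>[A-Za-z][A-Za-z0-9_]*)'
    r'=(?P<value>.*)$'
)


def parse_machine_tag(tag: str) -> Optional[MachineTag]:
    """Return the parsed tag, or None if it is just a simple tag."""
    match = _MACHINE_TAG.match(tag.strip())
    if match is None:
        return None
    value = match.group("value")
    # Drop one pair of surrounding double quotes, as in name="Palace Cinema".
    if len(value) >= 2 and value[0] == value[-1] == '"':
        value = value[1:-1]
    return MachineTag(match.group("namespace"), match.group("key"), value)
```

A simple tag such as cinema does not match the pattern and comes back as None, so the same function could be used to tell simple tags and machine tags apart.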

But for the OpenStreetMap module, where I want to store this advanced tag information within the context of the Drupal taxonomy system, I'm left a little boggled as I consider the way ahead. Sure, I could save these tags 'as is' in the taxonomy system using the triple tags style noted above. But that means they're basically just treated as simple tags, even if they look a little different, and wouldn't necessarily be easily filterable (e.g., looking for all cinemas). Another option was to go off and implement a custom tagging system just for this module, but that didn't make much sense either.

I think what's really needed is a machine tags module that plugs into the existing taxonomy system in Drupal core. The question is: should the taxonomy system be used as normal, storing the triple tags in full, with another module that allows those tags to be extracted, displayed and used more flexibly? Or should the core taxonomy system offer the option of creating a taxonomy that stores machine tags in a better way, perhaps in a separate table with a column for each of namespace, key and value, or perhaps even in a different table for each? The extra module to display and reuse those tags would probably still be necessary, tying into the taxonomy data and allowing it to be used in different ways.
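To illustrate the second option, here is a rough sketch of a separate table with one column each for namespace, key and value, and of how that makes a query like "all cinemas" straightforward. It uses Python's built-in sqlite3 module purely for demonstration rather than Drupal's database layer, and the table name machine_tag, the column names and the helper functions are all hypothetical; nothing here reflects the actual taxonomy schema.

```python
import sqlite3


def build_store() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical machine_tag table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE machine_tag (
            item_id   INTEGER NOT NULL,  -- the tagged object (e.g. a node)
            namespace TEXT    NOT NULL,
            tag_key   TEXT    NOT NULL,
            tag_value TEXT    NOT NULL
        )
        """
    )
    # An index over the three parts keeps exact-match filtering cheap.
    conn.execute(
        "CREATE INDEX machine_tag_lookup "
        "ON machine_tag (namespace, tag_key, tag_value)"
    )
    return conn


def add_tag(conn, item_id, namespace, key, value):
    """Attach one namespace:key=value tag to an item."""
    conn.execute(
        "INSERT INTO machine_tag (item_id, namespace, tag_key, tag_value) "
        "VALUES (?, ?, ?, ?)",
        (item_id, namespace, key, value),
    )


def find_items(conn, namespace, key, value):
    """Return the ids of all items carrying the given tag, in id order."""
    rows = conn.execute(
        "SELECT DISTINCT item_id FROM machine_tag "
        "WHERE namespace = ? AND tag_key = ? AND tag_value = ? "
        "ORDER BY item_id",
        (namespace, key, value),
    )
    return [row[0] for row in rows]
```

With the parts stored separately, finding every cinema is a plain equality match on three columns. With the tags stored in full as single strings, the same query would need string matching on the whole tag, which is the filtering problem described earlier.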