- apache
- api
- beelzebub
- benchmarking
- captcha
- crosswords
- developers
- dom
- domdocument
- dtd
- fifteensquared
- fireeagle
- lamp
- m255
- maths
- mobile
- mobileweb
- mono
- ms221
- mst121
- openuniversity
- overtheair
- overtheair08
- php
- phparchitect
- playphone
- python
- recaptcha
- recruitment
- rest
- slideshare
- softwarearchitecture
- spam
- subversion
- sucks
- svn
- talks
- ted
- testing
- tutorial
- versioncontrol
- wurfl
- xml
- yammer
- zendframework
- zendplatform
PHPTuring
A few years ago, as an exercise in Test-Driven Development, I wrote a Turing machine simulator in PHP and imaginatively named it PHPTuring.
I had completely forgotten about it until today, when I dug it out for another look. Truth be told, I still haven't seen a Turing machine done any better in PHP, and apart from a few syntactical niceties (removing closing PHP tags as per the Zend way, neatening up the PHPDoc blocks) I'm actually pretty comfortable with the code.
Using it is a breeze. It reads pipe-separated tapes and newline plus pipe-separated instruction sets like so:
<?php
$prog = '0|1|1|R|0\n0||1|R|1\n1|1|1|R|1\n1|||L|2\n2|1|||stop';
$tape = '1|1|1|1|1|1||1|1|1|1|1|1|1|1';
$machine = new Machine();
$compiler = new SimpleCompiler();
$parser = new SimpleTapeParser();
$debugger = new SimpleDebugger();
$debugger->watch($machine);
header('Content-type: text/plain');
$machine->run($compiler->compile($prog),
$parser->parse($tape));
It should work with other formats, so long as someone writes parsers for them. Similarly, the debugger is just an Observer that dumps the state and tape to the screen at each step, but it could easily do something more subtle some day.
The code ships with full tests, and is available for download on PHPTuring's Sourceforge download page.
So why am I banging on about it here? I don't know. Maybe just because I like it, because it was the first afternoon's coding that really got me test-infected, and because I'd be interested in any feedback.
Read more »The Get Up Kids are The Greatest Band That Ever Existed
The Get Up Kids are the greatest band that ever existed. To prove this, I would like to draw your attention to some very early live footage.
Read more »MySQL versus PostgreSQL: Adding a 'Last Modified Time' Column to a Table
This is the second post here detailing my ongoing adventures with PostgreSQL. This time I had a requirement to add a "timestamp" column to a table. The point of this being to allow us to track the "last modified" time of a row, without requiring that the application code manage the timestamp itself.
There's a lot of reasons why you might wish to do this. In this case it was to simplify syncing the data into a data warehouse. More specifically, to allow the DBA to easily identify rows which have changed since the last import.
Having done this a couple of times in MySQL, I assumed that the process would be straightforward. I should know better by now!
Read more »An Introduction to Tera-WURFL
I recently added a post about Wurfl, a comprehensive open-source XML database of mobile device capabilities. I noted that actually querying Wurfl in a performant manner:
is going to be a non-trivial task, and is perhaps a topic for a further article.
Well, I guess this is that article. It's time to have a look at Tera-WURFL, which is perhaps the most popular tool for querying Wurfl programmatically - from PHP, at least.
Read more »Herding Cats
For as long as anyone can remember, the term "herding cats" has been used as an analogy for the challenges involved in managing developers. The implication being, of course, that developers tend to be smart, wilful, single-minded folks. Personal experience suggests this is often the case.
The analogy was reflected in the title of a book named "Herding Cats: A Primer for Programmers who Lead Programmers", written by the impressively named J. Hank Rainwater. I mention it because this is a decent read for anyone who develops, or who works with developers - whether in a management capacity or not. It's not in the same league as "The Pragmatic Programmer", but I'm getting off the point now.
So anyway, I came across this video via Yahoo MySQL guru Jeremy Zawodny's blog. It's so slickly made that you're not surprised when it turns out to be an ad for a big expensive professional services company. But I liked it.
Read more »MySQL versus PostgreSQL: Adding an Auto-Increment Column to a Table
The bulk of my database experience (almost eight years now) has been with the popular open-source MySQL database management system. MySQL has progressed significantly over the years, and has grown into a remarkable product. It finally has all the must-have features such as views, stored procs and referential integrity, coupled with the blistering performance for which MySQL has always been known. In short, it rocks.
But I digress. I've recently been having to get to grips with PostgreSQL (an old version of course - 7.1 or so - just to make life really interesting). It's largely intuitive, but there are quirks around most corners. This is my favourite so far: I recently needed to add an auto-incrementing integer "id" column to a table.
Read more »A Brief Glossary of Mobile Jargon
A few terms that seem to get bandied about in the industry. I'll probably add to this over time.
Read more »Managing Mobile and Non-mobile Versions of a Site Using Tera-WURFL and Zend Framework
This is a quick proof-of-concept I put together after a discussion on how to deal with running a mobile site and a 'full' web site on the same hostname in a sane way, and to transparently route user agents to the appropriate site.
Read more »Mobile Device Detection: WURFL and UAProf
If you do any kind of development for mobile devices, you'll soon stagger into the minefield of browser and device detection.
Now, this is quite a different sort of challenge to that faced on the desktop web. On the desktop we have maybe one browser worth using, plus a whole lot of people using Internet Explorer, along with a handful of computer programmers who, bewilderingly, persist in using Firefox.
Ok, light-hearted browser snobbery and a few CSS hacks aside, the point is that these days, you basically know where you stand with desktop browsers.
By contrast, when we turn to mobile development, we find we're up against quite literally thousands of subtly and not-so-subtly different devices, and countless combinations of device, firmware and operating system versions. How can we get around this staggering complexity in order to find out the user's screen size? What markup can they handle? What ringtone formats can they accept? How about Java applications and games?
This is where WURFL comes in. WURFL is a free, open-source database of mobile devices and their specifications and capabilities. It is as close to a "standard" solution to this problem as currently exists.
WURFL
The WURFL database is essentially a large (just over 6 Megabytes at the time of writing) XML file. The data is hierarchical and is ultimately keyed by user agent string (as typically supplied in an HTTP request's User-Agent header). The key is mapped to a specific device, and to one or more families of devices, via the "fall_back" mechanism. This allows us, with a bit of work, to build up a list of the device's "capabilities".
If it helps, you can think of this in OO terms as a big stack of classes extending other classes, overriding fields as they go.
Let's try a concrete example. Imagine we come across an agent which identifies itself as "SEC-SGHE950/1.0 NetFront/3.4 Profile/MIDP-2.0 Configuration/CLDC-1.1". Querying WURFL, we find:
<device user_agent="SEC-SGHE950/1.0
NetFront/3.4 Profile/MIDP-2.0
Configuration/CLDC-1.1"
fall_back="samsung_sgh_e950_ver1"
id="samsung_sgh_e950_ver1_sub1"/>
This tells us...well, nothing! Nothing, beyond the fact that this user agent string is known to WURFL, and thanks to the crucial "fall_back" attribute, we know that it's one of an unspecified number of user agent strings which identify a device or family of devices known internally to WURFL as "samsung_sgh_e950_ver1". So we need to query WURFL again to find the details of this item. Here it is:
<device user_agent="SEC-SGHE950"
actual_device_root="true"
fall_back="sec_e900_ver1"
id="samsung_sgh_e950_ver1">
<group id="product_info">
<capability name="model_name" value="E950"/>
</group>
<group id="display">
<capability name="resolution_width" value="240"/>
<capability name="resolution_height" value="320"/>
<capability name="max_image_width" value="233"/>
<capability name="max_image_height" value="280"/>
</group>
<group id="markup">
<capability name="preferred_markup" value="html_wi_oma_xhtmlmp_1_0"/>
</group>
</device>
This is more like it. We've identified the device (actual_device_root="true"), we have its model number (E950) and its screen size (240 x 320), and we've learned that, ideally, it would like to be served web pages as XHTML-MP. Cool. But there's more: the device falls back further, to "sec_e900_ver1":
<device user_agent="SEC-SGHE900"
actual_device_root="true"
id="sec_e900_ver1"
fall_back="netfront_ver3">
<group id="product_info">
<capability name="brand_name" value="Samsung"/>
<capability name="model_name" value="E900"/>
</group>
<group id="markup">
<capability name="preferred_markup" value="html_wi_oma_xhtmlmp_1_0"/>
<capability name="html_wi_oma_xhtmlmp_1_0" value="true"/>
<capability name="html_wi_w3_xhtmlbasic" value="true"/>
<capability name="wml_1_3" value="true"/>
</group>
<group id="display">
<capability name="resolution_width" value="240"/>
<capability name="resolution_height" value="320"/>
<capability name="max_image_height" value="300"/>
<capability name="max_image_width" value="232"/>
</group>
<group id="image_format">
<capability name="gif" value="true"/>
<capability name="jpg" value="true"/>
<capability name="png" value="true"/>
<capability name="colors" value="262144"/>
</group>
<group id="storage">
<capability name="max_deck_size" value="8000"/>
</group>
<group id="object_download">
<capability name="ringtone" value="true"/>
<capability name="ringtone_midi_monophonic" value="true"/>
<capability name="ringtone_midi_polyphonic" value="true"/>
<capability name="ringtone_aac" value="true"/>
<capability name="ringtone_mp3" value="true"/>
<capability name="wallpaper" value="true"/>
<capability name="wallpaper_gif" value="true"/>
<capability name="wallpaper_jpg" value="true"/>
<capability name="screensaver" value="true"/>
<capability name="screensaver_gif" value="true"/>
<capability name="video" value="true"/>
<capability name="video_qcif" value="true"/>
<capability name="video_sqcif" value="true"/>
<capability name="video_wmv" value="true"/>
<capability name="video_3gpp" value="true"/>
<capability name="video_mp4" value="true"/>
<capability name="video_acodec_aac" value="true"/>
<capability name="wallpaper_colors" value="18"/>
<capability name="wallpaper_png" value="true"/>
<capability name="wallpaper_preferred_height" value="320"/>
<capability name="wallpaper_preferred_width" value="240"/>
<capability name="ringtone_voices" value="64"/>
</group>
<group id="sound_format">
<capability name="aac" value="true"/>
<capability name="midi_monophonic" value="true"/>
<capability name="midi_polyphonic" value="true"/>
<capability name="mp3" value="true"/>
<capability name="voices" value="64"/>
</group>
<group id="j2me">
<capability name="j2me_midp_2_0" value="true"/>
<capability name="j2me_midp_1_0" value="true"/>
<capability name="j2me_cldc_1_1" value="true"/>
</group>
</device>
Jackpot! We've found out a lot about the device now, in particular the content types it supports. (Note that the model name here is specified as "E900". We can disregard that, as we know it to be overridden one step further down the hierarchy). We can keep going in this manner, successively falling back to "netfront_ver3", "generic_xhtml", "generic" and finally "root". At each stage we pick up a little more information about the device (and at each stage we'll need to disregard a whole bunch of default information too).
Now, I'm sure you're as excited by all of this as I am, but before we get carried away, let's bear in mind that WURFL itself currently weighs in at roughly 127,000 lines of XML. Evidently, recursively querying WURFL directly, in real time, is not a practical option. We'll most likely want to import the data into a relational database, and schedule regular updates. This is going to be a non-trivial task, and is perhaps a topic for a further article.
Alternatives to WURFL
So WURFL's pretty awesome, but it's going to involve some effort on our part to press it into service. Since we're professionals, we'll want to consider whether there are any alternatives which provide better value-for-effort. Well, it turns out that there are alternatives, but they're few and far between. Perhaps the most notable of these few is a mechanism known as "UAProf".
UAProf
UAProf is a W3C initiative, the idea behind which is as follows: the user agent supplies a specific HTTP request header (of which, more in a moment). This header contains the URL of an XML file which describes the capabilities and specifications of the device. The developer programmatically retrieves the XML file, parses it and responds appropriately, typically caching the derived data somehow.
This approach is considered to be attractive, for reasons including:
- We don't need to do a lot of work up front. When an unfamiliar handset hits our site we simply retrieve the UAProf reactively. Theoretically, maintenance is minimal.
- UAProf data is supposed to be supplied by the handset manufacturer, thus ensuring it to be correct and up-to-date.
The reality is a little different. I'll loosely quote from Wikipedia here:
- Not all devices have UAProfs (including many new Windows Mobile devices, iDen handsets, or legacy handsets)
- Not all advertised UAProfs are available (about 20% of links supplied by handsets are dead or unavailable, according to figures from UAProfile.com)
- UAProf can contain schema or data errors which can cause parsing to fail
- Retrieving and parsing UAProfs in real-time is slow and can add substantial overhead to a web request
- There is no industry-wide data quality standard for the data within each field in a UAProf
- UAProf headers can often be plain wrong (i.e. for a completely different device)
Add to this that even the name of the HTTP header itself is not clearly defined. Typically, it is "x-wap-profile", but there are several alternatives in the wild right now.
In any case, there's little motivation for manufacturers to provide accurate data. To quote Michael Kaye on the wmlprogramming Yahoo tech group:
We cannot rely on the manufacturers to provide correct information: no company I know of would provide an accurate list of their bugs and misfeatures for the entire world to see.
Quite. All in all, UAProf is considered neither mature nor stable enough to be relied upon in production systems. Moreover, since UAProf is merely one of the sources of data integrated into WURFL in the first place, there's little benefit in developers reinventing the wheel by working with the UAProf mechanism directly.
Other Alternatives
There's a certain amount of talk about how nice it would be if there were an official, centralised, open "Device Description Repository". To this end, the W3C's Mobile Web Initiative has spawned a Device Description Working Group to continue talking - albeit in a more focused manner - about just how nice it would be.
There are commercial offerings too, many of which either layer a Web Services API over WURFL, or attempt to maintain their own database of mobile devices. Inexplicably, dotMobi themselves have hired one of WURFL's original developers and taken the first steps towards setting up their own competing device database.
There's also some activity around automated adaptive rendering technologies. For the .NET developer there's ASP.NET Mobile Controls, but I don't hear a lot of happy noises coming from that direction. For Java and - allegedly - PHP, there's WALL, a tag library somewhat based on WURFL. These may approach some of the minor issues around fragmentation of markup standards.
I have little to no experience with these so I won't be commenting (and I'd be interested to hear from developers with experience of any of the above). Suffice to say, your mileage may vary, and for the time being, WURFL remains a powerful tool in many cases.
Read more »Can anyone lend me a beginner's C++ book and a house in California?
I'm mildly flattered to have been sort-of-headhunted by Google's US technical staffing team today. Apparently my...
experience at Pitch and strong educational background make [me] an excellent fit at Google.
Cool, thanks Google. But I'm not sure that you've researched me too well.
Never mind. Should I feel the urge to up sticks and become a C++ programmer in Mountain View, California (a mere 5,431.51 miles from home) I'll give you a ring :)
Read more »