5 Feb 2009, 6:27 p.m.

Syntax Highlighting with GeSHi

On this very site, I've recently started using GeSHi to implement the rather nifty code syntax highlighting you see in posts like this and this.

I had previously been using PHP's built-in highlight_string() function, but that function is only of use for highlighting PHP code! As I often seem to need to highlight other languages, it was time to turn to GeSHi.

Since I did, a couple of people have asked about ease of use, implementation and so forth, so this post is my attempt at answering those questions.

GeSHi

GeSHi - the "Generic Syntax Highlighter" - is a free, open-source library written in PHP, and released under the GNU General Public License.

GeSHi currently highlights well over 100 languages/dialects, from PHP, CSS and XML to more obscure languages such as GLSlang, VHDL and - mind-bogglingly - Whitespace. Furthermore, if you wish to highlight examples of a language which is not supported, it should be fairly easy to add that support yourself, thanks to GeSHi's relatively straightforward syntax-file format.

Installation

Installing and running GeSHi is so unbelievably straightforward that it barely merits a blog post! It's merely a case of grabbing the latest version (1.0.8.2 at the time of writing) from the SourceForge downloads page, and unzipping it somewhere convenient on your server. I'll be referring to the resulting installation directory as /path/to/geshi for the purposes of this post.

In the simplest possible case, pressing GeSHi into action is a matter of just three lines of code:


<?php

require_once '/path/to/geshi/geshi.php';

$string = "<?php\n\necho 'Hello World!';\n";

$geshi = new GeSHi($string, 'php');
echo $geshi->parse_code();

Admittedly, that's four lines, but only three are GeSHi-specific. The resulting output will be something like the following:


<?php

echo 'Hello World!';

I rather enjoy the fact that when highlighting PHP built-in functions (and some language constructs, such as the echo call in the previous example), GeSHi drops in a link to the manual. That could be especially useful in the case of code snippets which demonstrate calls to more esoteric functions.

There are a few more advanced options of course. You can have GeSHi display simple line numbers, or even "fancy" line numbers (where every nth line is emphasised). You can also modify the default colours and markup, and apply CSS rules. The options are well documented in the Advanced Features section of the documentation, so I won't labour the point here.
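
To give a flavour of those options, here's a minimal sketch that switches on "fancy" line numbers and CSS classes. The methods it calls (enable_classes(), enable_line_numbers(), get_stylesheet()) and the GESHI_FANCY_LINE_NUMBERS constant come from GeSHi 1.0.x; the highlightWithLineNumbers() wrapper and its $every parameter are invented purely for illustration.

```php
<?php

/**
 * Hypothetical helper (not part of GeSHi): configure an existing GeSHi
 * object and return a <style> block followed by the highlighted HTML.
 *
 * @param object $geshi A GeSHi instance that already holds source and language
 * @param int    $every Emphasise every nth line number
 */
function highlightWithLineNumbers($geshi, $every = 5)
{
    // Switch to class names instead of inline styles. The GeSHi docs
    // suggest calling this before any other configuration method.
    $geshi->enable_classes();

    // "Fancy" line numbers: every $every-th number gets the emphasised style.
    $geshi->enable_line_numbers(GESHI_FANCY_LINE_NUMBERS, $every);

    // With classes enabled, the matching CSS rules must be output separately.
    $css = $geshi->get_stylesheet();

    return "<style type=\"text/css\">\n" . $css . "\n</style>\n"
         . $geshi->parse_code();
}
```

With geshi.php loaded as in the earlier example, something like echo highlightWithLineNumbers(new GeSHi($string, 'php')); should print the stylesheet followed by the numbered listing. In a real page the stylesheet would more sensibly live in the document head or an external file.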

It's worth noting that the GeSHi site also has a nifty demo page which allows you to preview a few of the options.

Blog Integration

Finally, a quick note on how I've integrated GeSHi with this particular blog. Blog entries are marked up in a kind of basic HTML in the database. Obviously I wasn't keen to store the formatted, highlighted code in the database too, so I hacked together a little...well, hack.

Code snippets are stored inline in articles - right where they're used - and are marked up in <listing /> tags, with a kind of pseudo-attribute specifying the language. That is then simply regexed out in a View Helper, and fed to GeSHi via a callback method. Here's the code for that, highlighted using...well, you guessed it.


<?php

class Weblog_View_Helper_RenderBlogPost
{
    // ...more methods here...


    /**
     * @param string $string The source of the post item
     */
    private function _renderCode($string)
    {
        return preg_replace('/(.*?)<\/listing>/es',
                    '$this->highlightString(\'\\2\', \'\\1\')', 
                    $string);
    }


    /**
     * @param string $string The code to highlight
     * @param string $lang Which language it is in
     */
    public function highlightString($string, $lang='php')
    {
        $geshi = new Geshi(
                        stripslashes(trim($string)),
                        $lang);
        return @$geshi->parse_code();
    }
}
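
For anyone on a newer PHP, the same idea can be sketched with preg_replace_callback() rather than the /e modifier, which was deprecated in PHP 5.5 and removed in PHP 7.0. Everything specific here is an assumption made for the example: the <listing lang="..."> tag format, the renderListings() name, and the choice to pass the highlighter in as a callable.

```php
<?php

/**
 * Hypothetical sketch: replace each <listing lang="..."> block in $html
 * with whatever $highlight returns for its contents.
 *
 * @param string   $html      The source of the post item
 * @param callable $highlight Called as $highlight($code, $lang), returns HTML
 */
function renderListings($html, $highlight)
{
    return preg_replace_callback(
        // Assumed tag format: group 1 is the language, group 2 the code.
        // The "s" flag lets "." match newlines inside the listing.
        '/<listing lang="([a-z0-9_-]+)">(.*?)<\/listing>/is',
        function ($matches) use ($highlight) {
            $lang = strtolower($matches[1]);
            $code = trim($matches[2]);
            return call_user_func($highlight, $code, $lang);
        },
        $html
    );
}
```

Hooking GeSHi back in is then a matter of passing a closure such as function ($code, $lang) { $geshi = new GeSHi($code, $lang); return $geshi->parse_code(); } as the second argument. Because the callback receives the matched text untouched, the stripslashes() call shouldn't be needed in this version.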

That's about it really. There may be more elegant ways, but it works for me.

Posted by Simon in PHP and Programming
6 Feb 2009, 11:14 a.m.

Russell

I have been putting off looking into syntax highlighting for a while, as I thought it would be a pain to implement, but GeSHi looks easy enough to get working. There is also a proposed Zend Framework component called Zend_Syntax which, although yet to be released, is worth keeping an eye on. Looking at the proposal for Zend_Syntax, however, it looks as though there has been no progress in the last few months.