Syntax Highlighting with RedCloth

After I implemented LaTeX rendering in my views, I was quite satisfied with how it came out. LaTeX can now be effortlessly inserted into any textile-enabled field with very good results. But then I began to look at the code snippets and realized how ugly they looked compared to the rest of the site. I did a quick web search for ways to incorporate syntax highlighting in Rails applications, but I wasn’t really satisfied with what I found. Then the solution smacked me in the face. Just extend RedCloth!

As with the LaTeX problem, I wanted something that could integrate with what I was already using. Parsing the entire text twice was just not an option. RedCloth was already identifying the pre and code blocks because it doesn’t textilize those. I just had to figure out how to intercept and then stylize them. I came across Ryan Heath’s blog and his code snippets instantly caught my eye. I shot him an email and asked him what he was using to perform the syntax highlighting. It turned out he was using the Ruby Syntax gem. Apparently, he tried out UltraViolet, but couldn’t get it to run on his host’s server. I think the UltraViolet gem was overkill for what I wanted to accomplish and that the Syntax gem was just what I needed: simple and fast.

Now that I had found a tool for stylizing code, I needed to intercept the code blocks from RedCloth. After a little bit of fishing around in the RedCloth documentation, I found the function I needed to override: rip_offtags. Before we can begin, though, we have to install the Syntax gem. Run the following on the command line:

> sudo gem install syntax

The syntax gem contains a function called convert, which takes code and transforms it into HTML with each token of the code wrapped in a span. It is very simple to use. The following is taken directly from its docs:

require 'syntax/convertors/html'

convertor = Syntax::Convertors::HTML.for_syntax "ruby"
html = convertor.convert( File.read( "program.rb" ) )

puts html

From the above, you can see that once we are able to intercept the code blocks from RedCloth, we just pass them on to the convert function and we are done. We are now ready to edit the washcloth.rb file we created when making the LaTeX extension. Add the following to the very beginning of the file (before the class definition):

require 'syntax/convertors/html' # for the syntax gem

Next we need to copy the existing rip_offtags definition from redcloth.rb.

# File lib/redcloth.rb, line 1009

OFFTAGS = /(code|pre|kbd|notextile)/
OFFTAG_MATCH  = /(?:(<\/#{ OFFTAGS }>)|(<#{ OFFTAGS }[^>]*>))(.*?)(?=<\/?#{ OFFTAGS }|\Z)/mi

def rip_offtags( text )
  if text =~ /<.*>/
    # strip and encode pre content
    codepre, used_offtags = 0, {}
    text.gsub!( OFFTAG_MATCH ) do |line|
      if $3
        offtag, aftertag = $4, $5
        codepre += 1
        used_offtags[offtag] = true
        if codepre - used_offtags.length > 0
          htmlesc( line, :NoQuotes ) unless used_offtags['notextile']
          @pre_list.last << line
          line = ""
        else # this is the part that we're interested in
          htmlesc( aftertag, :NoQuotes ) if aftertag and not used_offtags['notextile']
          line = "<redpre##{ @pre_list.length }>"
          @pre_list << "#{ $3 }#{ aftertag }"
        end
      elsif $1 and codepre > 0
        if codepre - used_offtags.length > 0
          htmlesc( line, :NoQuotes ) unless used_offtags['notextile']
          @pre_list.last << line
          line = ""
        end
        codepre -= 1 unless codepre.zero?
        used_offtags = {} if codepre.zero?
      end
      line
    end
  end
  text
end
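
To make the capture groups easier to follow, here is a small sketch that runs the two patterns on their own, outside of RedCloth. The patterns are copied into local variables (offtags and offtag_match are just my names for this sketch), and the sample string is made up for illustration.

```ruby
# The two RedCloth patterns, copied into local variables so this runs standalone.
offtags = /(code|pre|kbd|notextile)/
offtag_match = /(?:(<\/#{ offtags }>)|(<#{ offtags }[^>]*>))(.*?)(?=<\/?#{ offtags }|\Z)/mi

# Made-up sample input with a single pre block.
sample = "Some text <pre>puts 1 < 2</pre> more text"

# The first match is the opening tag together with everything after it,
# up to the next opening or closing off-tag.
opening = offtag_match.match(sample)

opening_tag = opening[3]  # what rip_offtags reads as $3: the full opening tag
offtag      = opening[4]  # $4: the tag name
aftertag    = opening[5]  # $5: the raw content following the opening tag

puts opening_tag
puts offtag
puts aftertag
```

Here aftertag ends up holding the raw, unescaped code between the tags, which is the piece we want to hand to the highlighter.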

The code that we need to change is the following:

htmlesc( aftertag, :NoQuotes ) if aftertag and not used_offtags['notextile']
line = "<redpre##{ @pre_list.length }>"
@pre_list << "#{ $3 }#{ aftertag }"

The local variable aftertag contains the unformatted code that we are interested in. In that block, RedCloth passes it to another function, htmlesc, which escapes special characters. Since the syntax gem will do that for us, we can use its convert function as a drop-in replacement. Change the above code to the following:

if aftertag and not used_offtags['notextile']
  convertor = Syntax::Convertors::HTML.for_syntax "ruby"
  aftertag = convertor.convert(aftertag, false)
end

line = "<redpre##{ @pre_list.length }>"
@pre_list << "#{ $3 }#{ aftertag }"

Notice that my call to convert has a second parameter set to false. This is because we already have pre and code tags that RedCloth is going to insert back in for us. Setting the second parameter to false tells convert not to add pre tags.
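
If the placeholder mechanism is hard to picture, the following is a heavily simplified sketch of the round trip, not RedCloth's actual code. It assumes only code tags, uses a plain array in place of @pre_list, and swaps in a made-up highlight lambda for the syntax gem so that it runs without any gems installed.

```ruby
# Hypothetical stand-in for convertor.convert(code, false): it escapes HTML
# and wraps the result in a single span, without adding pre tags.
highlight = lambda do |code|
  escaped = code.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;")
  "<span class=\"code\">#{escaped}</span>"
end

pre_list = []  # plays the role of RedCloth's @pre_list
text = "Compare <code>1 < 2</code> here."

# Step 1: replace the opening tag and its content with a numbered placeholder,
# storing the opening tag plus the highlighted content for later.
ripped = text.gsub(/(<code>)(.*?)(?=<\/code>)/m) do
  open_tag, code = $1, $2
  pre_list << "#{open_tag}#{highlight.call(code)}"
  "<redpre##{pre_list.length - 1}>"
end

# Step 2: once the rest of the text has been processed, put the stored
# blocks back in place of their placeholders.
restored = ripped.gsub(/<redpre#(\d+)>/) { pre_list[$1.to_i] }

puts ripped
puts restored
```

The original opening tag is stored alongside the highlighted content and the closing tag never leaves the text, so the highlighter only has to return the inner markup.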

The only thing left to do is write some custom CSS to tell the browser how we want to style our tokens. Here is a good starting point. Just edit the colors to fit the scheme you have for your own site!

Update:

There is a bug in the current implementation. For instance, if you have a pre tag inside your code block, RedCloth will break it out and deal with it separately, which then breaks your highlighting. One solution would be to create [code] blocks like we did for LaTeX rendering, but I am not sure I like that approach. I have to think about this. If you have any good ideas, let me know and post them in the comments!
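
The splitting can be seen with RedCloth's two patterns alone. This is only a sketch on a made-up input, with the patterns copied into local variables (offtags and offtag_match are my names) so it runs by itself.

```ruby
# The two RedCloth patterns, copied into local variables so this runs standalone.
offtags = /(code|pre|kbd|notextile)/
offtag_match = /(?:(<\/#{ offtags }>)|(<#{ offtags }[^>]*>))(.*?)(?=<\/?#{ offtags }|\Z)/mi

# Made-up input: a code block whose content mentions a pre tag.
sample = "<code>puts '<pre>hi</pre>'</code>"

# scan returns one array of capture groups per match. Index 2 is the opening
# tag and index 4 is the content after it. Keep only the opening-tag matches.
fragments = sample.scan(offtag_match)
                  .select { |groups| groups[2] }
                  .map    { |groups| [groups[2], groups[4]] }

fragments.each { |tag, content| puts "#{tag} => #{content.inspect}" }
```

The content after the code tag stops at the inner pre tag, so the highlighter would receive two separate fragments rather than one complete line of Ruby.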