GitHub Flavored Markdown with the Commonmarker gem

By Exequiel Rozas

If our users are moderately technical, allowing them to write Markdown instead of using a WYSIWYG editor can make them happier and more efficient.

Ruby has several gems dedicated to parsing Markdown. Each takes a different approach to the problem and implements a different Markdown specification.

In this article, we will cover Commonmarker, a gem that implements the CommonMark specification and adds support for GitHub-flavored Markdown, which some users prefer because of its feature set.

Let's start by seeing what we will build:

What we will build

We will build a simple blog application with a single Post model and CRUD views.

To have a better experience, we will be using the Marksmith Markdown Editor to add the ability to preview what we're writing before saving the changes to the database.

We will showcase how the gem works and how to add common features like syntax highlighting. We will also add support for custom embeds, also known as shortcodes, which are very useful when Markdown doesn't completely fit our needs.

The result will look like this:

Application setup

We start by creating the application using Tailwind for CSS:

$ rails new commonmarker --css=tailwind --javascript=esbuild

Next, we navigate to the directory and add a Post using scaffolding:

$ bin/rails generate scaffold Post title excerpt body:text

We set up the database, install Active Storage and run the migrations:

$ bin/rails db:setup && bin/rails active_storage:install && bin/rails db:migrate

Now, we add and install the commonmarker gem:

$ bundle add commonmarker && bundle install

Finally, we add the ability to add a cover image to our Post model, and we add validations:

class Post < ApplicationRecord
  has_one_attached :cover

  validates :title, presence: true
  validates :body, presence: true
end

Marksmith installation

To have access to Marksmith we need to install the gem and the associated JavaScript.

As a note, Marksmith comes with a default renderer that uses the Commonmarker gem behind the scenes, so we don't need to do anything regarding that.

Let's start by installing the gem:

$ bundle add marksmith && bundle install

Now, to add the JavaScript package, we will use Yarn:

$ yarn add @avo-hq/marksmith

Then, we require the desired packages in our Stimulus controller index:

// app/javascript/controllers/index.js
import { MarksmithController, ListContinuationController } from '@avo-hq/marksmith'

application.register("marksmith", MarksmithController)
application.register("list-continuation", ListContinuationController)

Next, we add the stylesheet to our application.html.erb layout file:

<%= stylesheet_link_tag "marksmith" %>

Then we add the Marksmith editor to our form's body field:

<%= form.marksmith :body, rows: 12, class: "border-2 border-neutral-300 px-3 py-2 focus:ring-gray-500 focus:border-gray-500 block w-full border-gray-300 rounded-md", placeholder: "Write your post content here..." %>

Finally, and after styling the scaffolds, we should have something like this:

New post view using Marksmith with Commonmarker renderer

Rendering the content

Because we're using Marksmith, we could just use the default Marksmith renderer, but we will actually add a renderer of our own to better showcase the Commonmarker gem.

To achieve this, we have to override the Marksmith::Renderer class to add the desired behavior.

By using the overridden renderer, the content shown in Marksmith's preview will match what we end up showing to our users.

We could put this class in the lib folder but, for simplicity’s sake, we will add it to the models folder.

Before adding any customization, we will use the default Commonmarker renderer:

# app/models/marksmith/renderer.rb
module Marksmith
  class Renderer
    def initialize(body:)
      @body = body
    end

    def render
      Commonmarker.parse(@body).to_html
    end
  end
end

Now, we will create a post using Markdown to showcase how Commonmarker works out of the box, including GitHub-flavored Markdown features.

In this article, we will learn how to add Markdown rendering to a Rails application using the Commonmarker gem.

Markdown is very useful, especially if you have **technical users**.

We can add features that are not in standard Markdown, like ~~strike through text~~.

By default, Commonmarker also includes an autolinking feature: https://avohq.io

But that's not all, we can also use emojis :heart:

Syntax highlighting works out of the box:

```ruby
def greet(name)
  puts "Hello, #{name}"
end
```

Tables also do:

| Gem | Main Feature |
| --- | --- |
| CommonMarker | GitHub-flavored Markdown |
| Marksmith | Ruby on Rails Markdown Editor|
| Avo | Code based app builder for Rails |

Task lists are also available by default:

- [x] Install Commonmarker
- [ ] Customize renderer (optional)

HTML tag filtering is also enabled by default:

<div>Hello</div>

Now, we render the content in the post view using the marksmithed helper inside a <%== %> tag:

<div class="blog-content">
  <%== marksmithed @post.body %>
</div>

![[commonmarker-default-rendering-result.png]]

As you can see, all of these features come by default with the gem!

To handle the styling, use the blog-content class or whatever class you wish to use.

Now that we have the basics set up, let's learn how to configure Commonmarker to fit our needs:

Configuring the gem

Commonmarker has excellent documentation, but it can be moderately confusing if we're not paying attention.

Using the gem, there are two ways to produce HTML from Markdown:

body = Post.first.body
html = Commonmarker.to_html(body) ## Using .to_html directly
html_alt = Commonmarker.parse(body).to_html

By default, both produce the same result. The difference shows up when we want to customize the gem's behavior: parse only accepts the parse, render and extension configurations, so the plugins have to be passed later to the to_html instance method. The to_html class method, on the other hand, accepts the same configurations plus the plugins in a single call.

So, if we want to use parse, our render method might look like this:

def render
  Commonmarker.parse(@body, **options).to_html(**plugins)
end

Where **options and **plugins are method calls that return a hash with the appropriate root key:

def options
  {
    options: {
      parse: {},
      render: {},
      extension: {}
    }
  }
end

def plugins
  {
    plugins: {}
  }
end

Otherwise, if we use the to_html class method, we can pass the plugins within the options hash:

module Marksmith
  class Renderer
    def initialize(body:)
      @body = body
    end

    def render
      Commonmarker.to_html(@body, **options)
    end

    private

    def options
      {
        options: {
          parse: {},
          render: {},
          extension: {},
        },
        plugins: {}
      }
    end
  end
end

By default, Commonmarker renders every heading with a nested anchor link that points to an ID derived from the parameterized heading text. This is very useful if you want to add a dynamic table of contents to your application.

Parse options

The gem comes with 4 parsing options:

  • smart: If set to true, punctuation symbols: quotes, full-stops and hyphens are converted into 'smart' punctuation. For example, straight quotes would become curly quotes.
  • default_info_string: it sets the default string to be used for fenced code blocks. It defaults to "".
  • relaxed_tasklist_matching: it allows any non-space character for the checked state of task lists, instead of allowing x and X exclusively.
  • relaxed_auto_links: relaxes the autolink parsing. For example: links within brackets are recognized, and it also permits any URL scheme.

Render options

There are multiple config options associated with the way Commonmarker renders the content:

  • hardbreaks: set to true by default. It converts soft breaks into hard breaks. For example, if we don't use an explicit enter between lines and use something like shift + enter, soft breaks would mean the text won't skip to the next line. Using hard breaks, the text will do so because a <br/> tag will be introduced.
  • github_pre_lang: it adds the language inside fenced code blocks as a lang attribute on the pre tag.
  • full_info_string: gives info string data.
  • source_pos: includes the source position information in the outputted HTML or XML.
  • unsafe: allows rendering raw HTML and potentially dangerous links. It's set to false by default. We should proceed with caution when setting it to true: never trust user input.
  • escape: it escapes raw HTML. Set to false by default. If set to true the HTML will be shown as a string and not evaluated, and it will override the unsafe option because HTML won't be evaluated.
  • escaped_char_spans: it wraps escaped characters in span tags. It's set to true by default.
  • ignore_setext: it ignores setext-style headings, which are defined by underlining the heading text with equal signs or hyphens. Equal signs represent a level 1 heading, and hyphens a level 2. It's set to false by default.
  • ignore_empty_links: when set to true, empty links are ignored. It's set to false by default.
  • gfm_quirks: set to false by default. If set to true it outputs HTML with GitHub Flavored Markdown quirks like not nesting <strong> inlines.
  • prefer_fenced: if set to true it outputs fenced code blocks even where they are defined as indented. Set to false by default.

Extensions

The gem comes with 20 extensions that expand standard Markdown's features. Some of the most important that are enabled by default are:

  • strikethrough
  • tagfilter
  • table
  • autolink
  • tasklist
  • shortcodes: enables emoji shortcodes like :emoji:

Other extensions that are disabled by default are:

  • superscript
  • footnotes
  • description_lists
  • mathdollars and mathcode
  • underline
  • spoiler
  • alerts

Feel free to check the whole documentation for Commonmarker extensions to see which ones you can enable to get to your desired results.
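
As a rough sketch of how the three groups fit together, the following hash enables a few of the options described above. The constant name MARKDOWN_OPTIONS and the chosen values are illustrative assumptions for this example, not recommendations from the gem:

```ruby
# Illustrative configuration; the constant name and the chosen values are
# assumptions made for this example.
MARKDOWN_OPTIONS = {
  options: {
    parse: {
      smart: true,                 # curly quotes and other "smart" punctuation
      default_info_string: "ruby"  # language assumed for fences without one
    },
    render: {
      hardbreaks: true,  # soft breaks become <br> tags
      unsafe: false,     # keep raw HTML and dangerous links disabled
      escape: false
    },
    extension: {
      strikethrough: true,
      table: true,
      tasklist: true,
      alerts: true,      # disabled by default
      footnotes: true    # disabled by default
    }
  }
}.freeze

# Usage inside an application with the gem loaded:
#   Commonmarker.to_html(markdown, **MARKDOWN_OPTIONS)
```

Keeping the configuration in a single frozen hash makes it easy to reuse the same settings for the editor preview and the public view.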

Extending syntax highlighting

Commonmarker provides syntax highlighting by default.

It actually includes a set of themes that we can choose to highlight code: base16-ocean.dark, base16-eighties.dark, base16-mocha.dark, base16-ocean.light, InspiredGitHub, Solarized (dark) and Solarized (light).

To configure the theme, we need to pass the theme name to the syntax_highlighter key inside plugins:

body = Post.first.body
Commonmarker.to_html(body, plugins: { syntax_highlighter: { theme: "Solarized (dark)" } })

That would produce the following result:

![[code-block-with-solarized-dark-syntax.webp]]

However, we can customize syntax highlighting even further.

You might be thinking about using something like the Rouge gem or Highlight.js but, with Commonmarker, we can dispense with those tools and define our themes or copy them from the web using the .tmtheme format.

This format is an XML based format which was created by TextMate, the past editor of choice of DHH, Rails creator.

If you want to find specific themes for this format, you can use the following Google query: site:github.com inurl:tmtheme followed by your theme of preference.

For example, if we would like to find the tmTheme version of the cobalt theme, we can search for site:github.com inurl:tmtheme cobalt which will produce the following result:

![[tmtheme-search-with-google.webp]]

Navigating to any of those results will produce the desired XML file that we can use to define our custom themes.

After fetching the XML for the theme, we have to store it somewhere in our application to reference it in the configuration, so we copy the content of the Cobalt theme, and we store it in app/assets/themes/cobalt2.tmTheme.

Now, we reference the theme from the syntax_highlighter option using the path key:

# Inside app/models/marksmith/renderer.rb
def plugins
  {
    plugins: {
      syntax_highlighter: { 
        theme: "cobalt2", 
        path: Rails.root.join("app", "assets", "themes").to_s 
      },
    }
  }
end

And now, the result of our syntax highlighting should be using the Cobalt2 theme:

![[custom-syntax-highlighting-commonmarker.webp]]

Of course, we can make our own themes if we desire. With this configuration, you can store them in the themes folder and only define the themes you need instead of relying on external dependencies.

Alerts or callouts

If you've been paying attention, you might have noticed that we mentioned an alerts extension that is set to false by default by Commonmarker.

Alerts, also known as callouts, are a special type of component that is supposed to draw attention from the reader to communicate something that's important.

It may be a warning that prompts users to take security precautions, a caution about something even more significant, or just a note that complements the surrounding content.

In GitHub-flavored Markdown, alerts or callouts are defined using syntax similar to blockquotes, with the exception that they add a bracket enclosed word that starts with an exclamation:

> [!WARNING]
> Hello this is a Warning callout. Does it work?

If we haven't set the alerts extension to true, the output should be a blockquote like so:

Default alert rendering as blockquotes

But, if we set the alerts extension to true we get something like this:

Alerts without styling in Commonmarker

But that doesn't look like a warning alert, does it?

To make it look like one, we have to consider the HTML it generates, which is:

<div class="markdown-alert markdown-alert-warning">
  <p class="markdown-alert-title">Warning</p>
  <p>Hello this is a Warning callout. Does it work?</p>
</div>

Every alert will have the markdown-alert class, so we can use that to give global styles to the alerts and use the markdown-alert-#{type} class to specify how we want each type of alert to look.

Like GitHub, the Commonmarker gem defines 5 types of callouts: Note, Tip, Important, Warning and Caution so let's define the CSS for all of those using Tailwind.

Let's start with the markdown-alert class that's shared by every callout:

.markdown-alert {
  @apply py-3 px-4 rounded-md my-4;
}

Now, let's add every type of callout to the content:

> [!NOTE]
> Hello from the Note callout

> [!TIP]
> Hello from the Tip callout

> [!WARNING]
> Hello from the Warning callout

> [!IMPORTANT]
> Hello from the Important callout

> [!CAUTION]
> Hello from the Caution callout

And let's add the corresponding CSS:

.markdown-alert {
  @apply py-3 px-4 rounded-md my-4;

  & p {
    @apply !mb-0 !leading-tight;
  }
}

.markdown-alert-title {
  @apply !my-0 !mt-1 !leading-tight font-bold !text-sm;
}

.markdown-alert-warning {
  @apply !bg-yellow-100 text-yellow-900 !leading-tight;
}

.markdown-alert-note {
  @apply !bg-blue-100 text-blue-900 !leading-tight;
}

.markdown-alert-tip {
  @apply !bg-green-100 text-green-900 !leading-tight;
}

.markdown-alert-caution {
  @apply !bg-red-100 text-red-900 !leading-tight;
}

.markdown-alert-important {
  @apply !bg-purple-100 text-purple-900 !leading-tight;
}

After adding this, we should see something like this in the body:

Alerts or callouts in Markdown with Commonmarker

We can customize the callouts even more using CSS ::before pseudo-elements and SVG icons. After doing that for every callout, we get the following result:

Alert callouts with icons with Commonmarker

The way to achieve this is by adding a ::before pseudo-element and padding-left to the title:

.markdown-alert-title {
  @apply !my-0 !mt-1 !leading-tight font-bold !text-sm flex items-center;
  position: relative;
  padding-left: 28px;

  &::before {
    content: "";
    position: absolute;
    left: 0;
    top: 50%;
    transform: translateY(-50%);
    width: 20px;
    height: 20px;
    background-repeat: no-repeat;
    background-position: center;
    background-size: contain;
  }
}

Then, we add the SVG icon to the title of the specific callout:

.markdown-alert-warning .markdown-alert-title::before {
  background-image: url("data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 24 24' fill='%23B45309' width='24' height='24'%3E%3Cpath d='M12 22c1.1 0 2-.9 2-2h-4c0 1.1.9 2 2 2zm6-6v-5c0-3.07-1.63-5.64-4.5-6.32V4c0-.83-.67-1.5-1.5-1.5s-1.5.67-1.5 1.5v.68C7.64 5.36 6 7.92 6 11v5l-2 2v1h16v-1l-2-2zm-2 1H8v-6c0-2.48 1.51-4.5 4-4.5s4 2.02 4 4.5v6z'/%3E%3C/svg%3E");
}

We repeat the process for every type of callout, and we're set with our callout feature. Of course, you can try with other icons, positioning, sizing, etc. if you want your callouts to look less conventional.

Customization traversing the AST

We can customize the output of the parsed Markdown by traversing the Abstract Syntax Tree that represents the document.

To showcase how this works, we will do two things: make sure that links use the https protocol and have external links render as text.

Link manipulation with AST parsing

Commonmarker, or rather comrak, the Rust library it wraps, models Markdown documents as an AST (Abstract Syntax Tree).

The gem provides two methods to traverse the tree and manipulate the nodes: walk and each. The first allows us to access every child node recursively, while the latter iterates over the direct children of the node.
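
The difference between the two traversals can be sketched with a toy tree in plain Ruby. ToyNode below is a hypothetical stand-in written for this example; it is not the Commonmarker::Node class, and it only mimics the idea of a recursive walk versus an each over direct children:

```ruby
# Toy stand-in for a document tree; NOT Commonmarker::Node.
class ToyNode
  attr_reader :type, :children

  def initialize(type, children = [])
    @type = type
    @children = children
  end

  # Iterates over the direct children only.
  def each(&block)
    children.each(&block)
  end

  # Yields the node itself, then every descendant, depth first.
  def walk(&block)
    yield self
    each { |child| child.walk(&block) }
  end
end

document = ToyNode.new(:document, [
  ToyNode.new(:paragraph, [
    ToyNode.new(:text),
    ToyNode.new(:link, [ToyNode.new(:text)])
  ])
])

direct_types = []
document.each { |node| direct_types << node.type }
# direct_types => [:paragraph]

all_types = []
document.walk { |node| all_types << node.type }
# all_types => [:document, :paragraph, :text, :link, :text]
```

Because links are usually nested inside paragraphs, list items or other blocks, a recursive traversal is the one that reaches them.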

The Commonmarker::Node instances can be of multiple types, but they don't have many customization options. For example, a node of type link allows us to set the url and the title attribute.

Let's start by making sure our links use the https protocol by traversing the document with the walk method and substituting the protocol with https://:

module Marksmith
  class Renderer
    def initialize(body:)
      @body = body
    end

    def render
      doc = Commonmarker.parse(@body, **options)

      doc.walk do |node|
        if node.type == :link
          force_https(node)
        end
      end

      doc.to_html(**plugins)
    end

    private

    def force_https(node)
      node.url = node.url.gsub("http://", "https://")
    end

    ## Options and plugins methods
  end
end

Now, we will make sure that external links are rendered as plain text. To achieve this, we will need to use the insert_before and delete methods of the Commonmarker::Node instance:

module Marksmith
  class Renderer
    def initialize(body:)
      @body = body
    end

    def render
      doc = Commonmarker.parse(@body, **options)

      doc.walk do |node|
        if node.type == :link
          force_https(node)
          plain_text_if_external(node)
        end
      end

      doc.to_html(**plugins)
    end

    private

    def force_https(node)
      https_url = node.url.gsub("http://", "https://")
      node.url = https_url
      node.first_child.string_content = https_url
    end

    def plain_text_if_external(node)
      return node unless external_link?(node.url)

      node.insert_before(node.first_child)
      node.delete
    end

    def external_link?(url)
      !url.include?("avohq.io")
    end

    ## Options and plugins methods
  end
end

Now, after adding this, we get the following:

URL parsing using the AST in commonmarker

Even though we set the link to AvoHQ with the http protocol in the text, we get the link with the https protocol and the text that reflects that.

Moreover, we also get the link to Avo Demo rendered as plain-text.

We can use AST traversing to modify the content to our liking but, like we said before, the options are limited, so now we will explore how to introduce customizations using Nokogiri.

Please note that the external_link? method shown above doesn't really contemplate many edge cases, like the domain name appearing in the URL path or as a query parameter. A more robust implementation should be considered.
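
One way to make the check stricter is to compare the URL's host instead of searching the whole string. The following is a minimal sketch using Ruby's standard URI library; the INTERNAL_HOSTS constant is a hypothetical name, and treating relative URLs as internal and unparsable URLs as external are assumptions made for this example:

```ruby
require "uri"

# Hypothetical list of hosts we consider our own; adjust as needed.
INTERNAL_HOSTS = ["avohq.io"].freeze

def external_link?(url)
  host = URI.parse(url.to_s).host
  # Relative paths, anchors and mailto: links have no host: treat as internal.
  return false if host.nil?

  host = host.downcase
  INTERNAL_HOSTS.none? do |internal|
    host == internal || host.end_with?(".#{internal}")
  end
rescue URI::InvalidURIError
  # Assumption: a URL we cannot parse is treated as external.
  true
end

external_link?("https://avohq.io/blog")             # => false
external_link?("https://docs.avohq.io/guide")       # => false
external_link?("https://example.com/?ref=avohq.io") # => true
external_link?("/posts/1")                          # => false
```

Comparing hosts avoids false matches when the domain only appears in the path or in a query parameter.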

Customizations with Nokogiri

Because of the previously mentioned limitations of node manipulation in the Commonmarker gem, there are some common features that we can only achieve by parsing the generated HTML.

For this, we will use the Nokogiri gem to modify the HTML produced by the gem.

The first use case is to add a rel="nofollow" attribute to external links:

Making external links nofollow

Similarly to the AST parsing example where we made external links render as plain-text, we will use the Nokogiri gem to add the nofollow attribute.

The first thing we need to consider is that we need to call the to_html class method directly, passing the plugins along with the rest of the options, so the generated HTML respects the plugin configuration.

What we will do is generate the HTML and then define a post_process method where we will actually parse the HTML:

module Marksmith
  class Renderer
    def initialize(body:)
      @body = body
    end

    def render
      doc = Commonmarker.to_html(@body, **options)
      post_process(doc)
    end

    private

    def post_process(doc)
      nokogiri_doc = Nokogiri::HTML.fragment(doc)
      no_follow_if_external(nokogiri_doc)
    end

    def no_follow_if_external(doc)
      doc.css('a').each do |link|
        href = link['href']
        next unless href

        if external_link?(href)
          current_rel = link['rel']
          if current_rel
            link['rel'] = "#{current_rel} nofollow" unless current_rel.include?('nofollow')
          else
            link['rel'] = 'nofollow'
          end
        end
      end

      doc.to_html
    end

    def external_link?(href)
      !href.include?("avohq.io")
    end
  end
end

Now, links that don't point to our domain should have the rel="nofollow" attribute added to them.
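
The nested conditional that builds the rel value can also be expressed as a small pure-Ruby helper. The sketch below is one possible refactor, assuming a hypothetical add_rel_token method name; it compares whole space-separated tokens rather than substrings:

```ruby
# Hypothetical helper: appends a token to a rel attribute value unless the
# token is already present. Accepts nil when the attribute is missing.
def add_rel_token(current_rel, token)
  tokens = current_rel.to_s.split
  tokens << token unless tokens.include?(token)
  tokens.join(" ")
end

add_rel_token(nil, "nofollow")                 # => "nofollow"
add_rel_token("noopener", "nofollow")          # => "noopener nofollow"
add_rel_token("nofollow noopener", "nofollow") # => "nofollow noopener"
```

Inside no_follow_if_external, the branch could then be reduced to link['rel'] = add_rel_token(link['rel'], 'nofollow').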

Copy to clipboard button for code blocks

Another common feature we might also add is a button that copies a code block's content to the clipboard.

To make this work, we first have to identify what makes a code block. If we inspect the source for any fenced code block rendered by Commonmarker, we will find that they consist of a <pre> tag that wraps a <code> tag.

To add this feature, we will need to check for every <pre> tag in our document and replace it with a relatively positioned wrapping <div> that adds a button to copy the content.

First, let's add the partial:

<%# app/views/shared/_code_block_with_button.html.erb %>

<div class="relative" data-controller="copy-to-clipboard" data-copy-to-clipboard-source-value="<%= code_content %>">
  <%= code_block_html.html_safe %>
  <button class="copy-button absolute top-2 right-2 py-1 px-3 rounded bg-gray-600 text-white opacity-70 hover:opacity-100" 
          data-action="copy-to-clipboard#copy">
    <svg xmlns="http://www.w3.org/2000/svg" class="h-4 w-4" fill="none" viewBox="0 0 24 24" stroke="currentColor">
      <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M8 5H6a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2v-1M8 5a2 2 0 002 2h2a2 2 0 002-2M8 5a2 2 0 012-2h2a2 2 0 012 2m0 0h2a2 2 0 012 2v3m2 4H10m0 0l3-3m-3 3l3 3" />
    </svg>
  </button>
</div>

This partial receives the code_block_html and the code_content variables from the renderer:

module Marksmith
  class Renderer
    def initialize(body:)
      @body = body
    end

    def render
      doc = Commonmarker.to_html(@body, **options)
      post_process(doc)
    end

    private

    def post_process(doc)
      nokogiri_doc = Nokogiri::HTML.fragment(doc)
      no_follow_if_external(nokogiri_doc)
      add_copy_buttons_to_code_blocks(nokogiri_doc)
      nokogiri_doc.to_html
    end

    def no_follow_if_external(doc)
      doc.css('a').each do |link|
        href = link['href']
        next unless href

        if external_link?(href)
          current_rel = link['rel']
          if current_rel
            link['rel'] = "#{current_rel} nofollow" unless current_rel.include?('nofollow')
          else
            link['rel'] = 'nofollow'
          end
        end
      end

      doc.to_html
    end

    def add_copy_buttons_to_code_blocks(doc)
      doc.css('pre > code').each do |code_block|
        html = ApplicationController.render(
          partial: 'shared/code_block_with_button',
          locals: {
            code_block_html: code_block.parent.to_html,
            code_content: code_block.content
          }
        )

        # Replace the original code block with our enhanced version
        code_block.parent.replace(Nokogiri::HTML.fragment(html))
      end
    end

    def external_link?(href)
      !href.include?("avohq.io")
    end

    def options
      {
        options: {
          parse: {
            smart: true,
          },
          render: {
            escape: false,
            unsafe: true,
          },
          extension: {
            alerts: true,
          },
        },
        plugins: {
          syntax_highlighter: { 
            theme: "cobalt2", 
            path: Rails.root.join("app", "assets", "themes").to_s 
          },
        }
      }
    end
  end
end

Now, we add a Stimulus controller to handle the actual copying:

import { Controller } from "@hotwired/stimulus"

export default class extends Controller {
  static values = {
    source: String
  }

  connect() {
    this.isCopying = false
  }

  copy(event) {
    if (this.isCopying) return
    this.isCopying = true

    const button = event.currentTarget
    const originalButtonHTML = button.innerHTML
    const textToCopy = this.sourceValue;

    navigator.clipboard.writeText(textToCopy)
      .then(() => {
        button.innerHTML = this.successButtonHTML

        setTimeout(() => {
          button.innerHTML = originalButtonHTML
          this.isCopying = false
        }, 1500)
      })
      .catch(() => {
        this.isCopying = false
      })
  }


  get successButtonHTML() {
    return `
      <svg xmlns="http://www.w3.org/2000/svg" class="h-4 w-4" fill="none" viewBox="0 0 24 24" stroke="currentColor">
        <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M5 13l4 4L19 7" />
      </svg>
    `
  }
}

Now, our copy-to-clipboard feature should be working correctly:

We're using the browser's native Clipboard API. It is only available in secure contexts (HTTPS or localhost) and may be missing in older browsers, so you can try a library like clipboard.js in case it's not working for you.

Adding custom embed support

Markdown is supposed to be portable: moving Markdown content from one editor to another should produce the same results.

However, it's very unlikely that every application that needs to parse Markdown has the same requirements, and occasionally, we need to compromise somewhere in the middle.

That's why adding a feature like custom Markdown embeds implies that our Markdown is no longer portable, at least not entirely.

Let's learn how to add the feature and explore a solution to make it as portable as we can:

Self-closing embeds

To keep this section short, we will explore how to add self-closing shortcodes, which look like this: {{shortcode/}}.

They can also include parameters: {{shortcode key="value"/}} that we can use inside the partials.

If you want to see how to add other types of shortcodes like block shortcodes or VitePress styled shortcodes, feel free to check our Marksmith shortcodes article where we explore different alternatives for this feature.

The first thing we need to do is create a ShortcodeParser class that will be responsible for:

  • Receiving HTML and replacing the shortcode instances with HTML from an associated partial.
  • Parsing the params into local variables that are passed to the partials.
  • Handling possible errors.

Let's start by defining the class with an initializer and a parse method which will be responsible for transforming the shortcode string into the actual HTML:

module Marksmith
  class ShortcodeParser
    attr_reader :document

    def initialize(document)
      @document = document
    end

    def parse
      doc = Nokogiri::HTML.fragment(document).to_html

      doc.gsub!(/\{\{(\w+)(.*?)\/\}\}/) do |match|
        name = $1
        param_string = $2
        params = parse_params(param_string)
        render_shortcode(name, params)
      end

      doc
    end

    ## parse_params and render_shortcode implementation pending
  end
end

Here, the parse instance method takes an HTML document and, using a regular expression, extracts the shortcode name and its params from the shortcode string and produces the shortcode HTML.

The regular expression states that the string we're matching against has to start with two consecutive left curly braces. Then it captures one or more word characters, the shortcode name, and assigns them to the $1 variable available within the block it receives.

Then, we have another capture group (.*?) which lazily matches any character except for a newline happening zero or more times. This part captures the params string, which can be something like id="12345" name="best-video" and assigns it to the $2 variable.
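
To see both capture groups in isolation, here is a minimal sketch that runs the same regular expression outside Rails. The replace_shortcodes helper and its bracketed placeholder output are hypothetical and only stand in for the partial rendering:

```ruby
SHORTCODE_PATTERN = /\{\{(\w+)(.*?)\/\}\}/

# Hypothetical helper: swaps each shortcode for a readable placeholder
# instead of rendering a partial.
def replace_shortcodes(html)
  html.gsub(SHORTCODE_PATTERN) do
    name = Regexp.last_match(1)          # same value as $1
    param_string = Regexp.last_match(2)  # same value as $2
    "[#{name}|#{param_string.strip}]"
  end
end

replace_shortcodes('<p>{{youtube id="12345" name="best-video"/}}</p>')
# => "<p>[youtube|id=\"12345\" name=\"best-video\"]</p>"
replace_shortcodes("{{divider/}}")
# => "[divider|]"
```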

However, we need to define the parse_params method in order to transform that string into a hash that correctly represents the key-value pair relation:

def parse_params(param_string)
  params = {}
  normalized = param_string.to_s.gsub(/[“”]/, '"')
  normalized.strip.scan(/(\w+)\s*=\s*"([^"]*)"/).each do |key, value|
    params[key.to_sym] = value
  end

  params
end

This method introduces a new regular expression with two capture groups as well.

The first capture group expects one or more word characters (a-zA-Z0-9_). This is the key part of the hash.

Then, the regex declares another capture group surrounded by quotes, "([^"]*)": it states that the value has to be enclosed in double quotes, and the capture group matches anything that's not a double quote, because as soon as a double quote appears we can consider it to be the closing quote.

This is the value part of the hash.

Between these two capture groups we have \s*=\s* which matches zero or more whitespace characters before and after the equal sign.

Using this Regex, we will match the string id="4" name="Steven" to the hash { :id => "4", :name => "Steven"}.
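
As a quick check outside Rails, the same method can be run as a standalone function. This sketch copies the implementation shown above and feeds it a few sample strings; the sample inputs are made up for illustration:

```ruby
def parse_params(param_string)
  params = {}
  # Curly quotes are converted back to straight quotes before scanning.
  normalized = param_string.to_s.gsub(/[“”]/, '"')
  normalized.strip.scan(/(\w+)\s*=\s*"([^"]*)"/).each do |key, value|
    params[key.to_sym] = value
  end

  params
end

parse_params(' id="4" name="Steven"')
# => { id: "4", name: "Steven" }
parse_params(' id=“12345” name = “best-video”')
# => { id: "12345", name: "best-video" }
parse_params(" id=4")
# => {} because unquoted values are ignored
parse_params(nil)
# => {}
```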

But, to access those parameters from the view, we need to render it using the render_shortcode method:

def render_shortcode(name, params)
  ApplicationController.render(
    partial: "shortcodes/#{name}",
    locals: { params: params },
    layout: false,
  )
rescue StandardError => error
  Rails.logger.error("Error rendering shortcode #{name}: #{error.message}")
  "" # Return an empty string so a failed shortcode renders nothing
end

This method simply calls ApplicationController.render, requesting a partial that matches the name extracted from the shortcode, passing the params as local variables and using layout: false to avoid having a layout surrounding the partial.

We're rescuing and logging if any error occurs so we can better troubleshoot our feature in case of failure.

The complete ShortcodeParser class definition:

module Marksmith
  class ShortcodeParser
    attr_reader :document

    def initialize(document)
      @document = document
    end

    def parse      
      document.gsub!(/\{\{(\w+)(.*?)\/\}\}/) do |match|
        name = $1
        param_string = $2
        params = parse_params(param_string)
        render_shortcode(name, params)
      end

      Nokogiri::HTML.fragment(document)
    end

    private

    def parse_params(param_string)
      params = {}
      normalized = param_string.to_s.gsub(/[“”]/, '"')
      normalized.strip.scan(/(\w+)\s*=\s*"([^"]*)"/).each do |key, value|
        params[key.to_sym] = value
      end

      params
    end

    def render_shortcode(name, params)
      ApplicationController.render(
        partial: "shortcodes/#{name}",
        locals: { params: params },
        layout: false,
      )
    rescue StandardError => error
      Rails.logger.error("Error rendering shortcode #{name}: #{error.message}")
      "" # Return an empty string so a failed shortcode renders nothing
    end
  end
end

And the final Marksmith::Renderer version:

module Marksmith
  class Renderer
    def initialize(body:)
      @body = body
    end

    def render
      doc = Commonmarker.to_html(@body, **options)
      post_process(doc)
    end

    private

    def post_process(doc)
      nokogiri_doc = Nokogiri::HTML.fragment(doc)
      no_follow_if_external(nokogiri_doc)
      add_copy_buttons_to_code_blocks(nokogiri_doc)
      ShortcodeParser.new(nokogiri_doc.to_html).parse
    end

    def no_follow_if_external(doc)
      doc.css('a').each do |link|
        href = link['href']
        next unless href

        if external_link?(href)
          current_rel = link['rel']
          if current_rel
            link['rel'] = "#{current_rel} nofollow" unless current_rel.include?('nofollow')
          else
            link['rel'] = 'nofollow'
          end
        end
      end

      doc.to_html
    end

    def add_copy_buttons_to_code_blocks(doc)
      doc.css('pre > code').each do |code_block|
        html = ApplicationController.render(
          partial: 'shared/code_block_with_button',
          locals: {
            code_block_html: code_block.parent.to_html,
            code_content: code_block.content
          }
        )

        # Replace the original code block with our enhanced version
        code_block.parent.replace(Nokogiri::HTML.fragment(html))
      end
    end

    def external_link?(href)
      !href.include?("avohq.io")
    end

    def options
      {
        options: {
          parse: {
            smart: true,
          },
          render: {
            escape: false,
            unsafe: true,
          },
          extension: {
            alerts: true,
          },
        },
        plugins: {
          syntax_highlighter: { 
            theme: "cobalt2", 
            path: Rails.root.join("app", "assets", "themes").to_s 
          },
        }
      }
    end
  end
end
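One caveat about external_link?: since it relies on a substring match, relative links such as /posts/1 count as external, and an outside URL that merely contains "avohq.io" in its query string counts as internal. If that matters for our content, a stricter variant could compare hosts using Ruby's URI library. This is only a sketch: INTERNAL_HOST is a name introduced here for illustration, and treating subdomains as internal is an assumption:

```ruby
require "uri"

# Assumption: the site's own host, taken from the check used in the renderer.
INTERNAL_HOST = "avohq.io"

def external_link?(href)
  host = URI.parse(href).host
  return false if host.nil? # relative links and anchors have no host

  # Internal when the host is ours or one of our subdomains.
  !(host == INTERNAL_HOST || host.end_with?(".#{INTERNAL_HOST}"))
rescue URI::InvalidURIError
  true # treat URLs we can't parse as external to stay on the safe side
end

external_link?("https://example.com/?ref=avohq.io")
external_link?("/posts/1")
```

The first call is considered external because only the host is compared, while the second is considered internal because a relative path has no host at all.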

And now, we have custom embeds:

Markdown custom embed rendering

If you wonder why we're normalizing the params string with param_string.to_s.gsub(/[“”]/, '"'), it's because Commonmarker converts straight quotes to curly quotes when the smart parsing option is set to true.
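To see the effect of that normalization in isolation, here is the same parse_params logic extracted into a standalone method and fed a string that already contains curly quotes. The sample id and title values are made up for the example:

```ruby
PARAM_PATTERN = /(\w+)\s*=\s*"([^"]*)"/

# Same logic as ShortcodeParser#parse_params, extracted so it runs on its own.
def parse_params(param_string)
  params = {}
  normalized = param_string.to_s.gsub(/[“”]/, '"')
  normalized.strip.scan(PARAM_PATTERN).each do |key, value|
    params[key.to_sym] = value
  end

  params
end

# Roughly what the parser receives once straight quotes have become curly ones.
smart_quoted = ' id=“abc123” title=“Demo video” '

params = parse_params(smart_quoted)

# Without the normalization step, the pattern finds no straight quotes to match.
unnormalized = smart_quoted.scan(PARAM_PATTERN)
```

Here params holds the id and title keys, while unnormalized is an empty array, which means the shortcode would silently lose all of its parameters.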

Stripping shortcodes

Portability is an important characteristic of Markdown, and adding custom embeds or shortcodes makes our content less portable.

If we want to generate a shortcode-free version of the post's body, we can add a render_without_shortcodes method in the renderer, or even in the Post class itself:

def render_without_shortcodes
  body.gsub(/\{\{(\w+)(.*?)\/\}\}/, "")
end

This regex replaces self-closing shortcodes with an empty string.
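As a quick check, we can run the same substitution against a small sample body. The strip_shortcodes method below is a standalone variant that takes the body as an argument, and the sample content is made up for the example:

```ruby
SHORTCODE_PATTERN = /\{\{(\w+)(.*?)\/\}\}/

# Standalone variant of render_without_shortcodes that takes the body as input.
def strip_shortcodes(body)
  body.gsub(SHORTCODE_PATTERN, "")
end

body = <<~MARKDOWN
  # Title

  {{youtube id="abc123"/}}

  Some **bold** text.
MARKDOWN

stripped = strip_shortcodes(body)
```

The shortcode line is left behind as an extra blank line, which is harmless because Markdown ignores consecutive blank lines between blocks.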

If you add other types of shortcodes with more advanced features, you will need to extend the regex and test it against more complex scenarios.
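As an illustration of how the pattern might grow, suppose we later introduce paired shortcodes such as {{note}}...{{/note}}. That syntax is hypothetical and isn't supported by the parser in this article. A sketch of a pattern that strips both forms, using a backreference so the closing tag has to match the opening name:

```ruby
# Hypothetical pattern: matches self-closing shortcodes ({{name .../}}) and
# paired ones ({{name ...}}content{{/name}}). The paired form is an assumption
# made for this sketch. The m flag lets the content span several lines.
ANY_SHORTCODE = /\{\{(\w+)([^}]*?)(?:\/\}\}|\}\}(.*?)\{\{\/\1\}\})/m

def strip_all_shortcodes(body)
  body.gsub(ANY_SHORTCODE, "")
end

sample = 'A {{note type="tip"}}Remember this{{/note}} B {{youtube id="1"/}} C'
result = strip_all_shortcodes(sample)
```

Nested shortcodes of the same name, or params containing a closing brace, would still trip this pattern up, so at that point a small dedicated parser may be a better fit than a single regex.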

Summary

The Commonmarker gem is a suitable option if we wish to introduce GitHub-flavored Markdown into a Ruby or Rails application.

It wraps the comrak Rust library, which implements the CommonMark spec, and gives us a very flexible way to render Markdown.

It comes with several extensions that diverge from the standard Markdown syntax and, by default, provide an experience similar to how GitHub renders Markdown.

But the fun doesn't stop there: it adds extensions that help us control how Markdown is rendered, and we can enable as many or as few as we want.

The same goes for plugins like syntax_highlighter, which we can use to customize how code blocks are styled and which lets us define custom themes without requiring external dependencies.

Furthermore, we can traverse the AST that the gem generates to modify the resulting HTML. We used that to force the https protocol on links and to render external links as plain text.

When AST traversal fell short, we resorted to Nokogiri to parse the generated HTML and achieve two things: marking external links as nofollow and adding a copy-to-clipboard button to code blocks.

Then, we added a basic shortcode feature, also using Nokogiri, to produce custom embeds with parameters, and we added the ability to strip shortcodes from the post's body.

All in all, the Commonmarker gem is a great option for producing GitHub-flavored Markdown in our Ruby or Rails applications: it's customizable, fast, and produces a good experience without much customization.

I hope you enjoyed this article and that you can apply some of these tips to your projects.

Have a good one and happy coding!
