The significance of this is that you can build parsing expressions on the fly …

I found Meta II [1] to be an interesting approach to parsing, and the closest modern equivalent is parsing expression grammars [2] (PEGs). The easiest one to use that I've found is the Lua [3] implementation, LPeg [4].

What's interesting about LPeg is that a pattern isn't compiled into Lua, but into code for a specialized parsing VM (Virtual Machine), which makes it quite fast. Maybe not as fast as lex [5] and yacc [6], but certainly easier to understand and vastly easier to use.

Let me amend that: I find the re [7] module (which is built on LPeg) to be easier to use, as I find this:

>
```
local re = require "re"
parser = re.compile [[
  expr     <- term (termop term)*
  term     <- factor (factorop factor)*
  factor   <- number
            / open expr close
  number   <- space '-'? [0-9]+ space
  termop   <- space [+-] space
  factorop <- space [*/] space
  open     <- space '(' space
  close    <- space ')' space
  space    <- ' '*
]]
```

to be way easier to read and understand than

>
```
local lpeg = require "lpeg"
local space    = lpeg.P" "^0
local close    = space * lpeg.P")" * space
local open     = space * lpeg.P"(" * space
local factorop = space * lpeg.S"*/" * space
local termop   = space * lpeg.S"+-" * space
local number   = space * lpeg.P"-"^-1 * lpeg.R"09"^1 * space
local factor , term , expr = lpeg.V"factor" , lpeg.V"term" , lpeg.V"expr"
parser = lpeg.P {
  "expr",
  factor = number
         + open * expr * close,
  term   = factor * (factorop * factor)^0,
  expr   = term * (termop * term)^0
}
```

As such, I've been concentrating on using the re module to brush up on my parsing skills [8] to the point that I've been ignoring a key component of LPeg: expressions!

Sure, raw LPeg isn't pretty, but as you can see from the above example, it is built up out of expressions. And that's a powerful abstraction right there.
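To make that concrete, here is a minimal sketch of a function that returns a new pattern built from other patterns. It reuses the `space` and `number` definitions from the example above; the helper `listof` and the variable `numbers` are names made up for this illustration and are not part of LPeg.

```lua
local lpeg = require "lpeg"

-- Same definitions as in the raw LPeg example above.
local space  = lpeg.P" "^0
local number = space * lpeg.P"-"^-1 * lpeg.R"09"^1 * space

-- Hypothetical helper: one or more `item`s separated by the literal `sep`.
-- Patterns are ordinary Lua values, so a function can build and return one.
local function listof(item,sep)
  return item * (space * lpeg.P(sep) * space * item)^0
end

local numbers = listof(number,",")

-- With no captures, match() returns the position just past the match.
print(numbers:match("1, 2, -3"))               --> 9
print(lpeg.match(listof(number,";"),"4;5"))    --> 4
```

The same `number` pattern ends up in two different list patterns without being redefined, which is awkward to do with a grammar written as one fixed string.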

For instance, in mod_blog, I have code that will parse text, converting certain sequences of characters like --- (three dashes) into an HTML (HyperText Markup Language) entity like &mdash;. So, I type the following:

>
```
``The name of our act is---The Aristocrats! ... Um ... hello?''
```

which is turned into

>
```
&ldquo;The name of our act is&mdash;The Aristocrats! &hellip; Um &hellip;
hello?&rdquo;
```

to be rendered on your screen as:

“The name of our act is—The Aristocrats! … Um … hello?”

Now, I only support a few character sequences (six) and that takes 160 lines of C code. Adding support for more is a daunting task, and one that I've been reluctant to take on. But in LPeg, the code looks like:

>
```
local lpeg = require "lpeg"
local base =
{
  [ [[``]] ] = "&ldquo;"  ,
  [ [['']] ] = "&rdquo;"  ,
  [ "---"  ] = "&mdash;"  ,
  [ "--"   ] = "&ndash;"  ,
  [ "..."  ] = "&hellip;" ,
  [ ".."   ] = "&#8229;"  ,
}
function mktranslate(tab)
  local tab   = tab or {}
  local chars = lpeg.C(lpeg.P(1))
  for target,replacement in pairs(tab) do
    chars = lpeg.P(target) / replacement + chars
  end
  for target,replacement in pairs(base) do
    chars = lpeg.P(target) / replacement + chars
  end
  return lpeg.Ct(chars^0) / function(c) return table.concat(c) end
end
```
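One caveat: `pairs()` makes no promise about traversal order, and LPeg's `+` is an ordered choice, so `--` could end up being tried before `---`. The following is a sketch of one way around that, not the code mod_blog uses. The name `mktranslate_sorted` is invented here, and letting the caller's table override the base table on a duplicate key is an assumption of this sketch.

```lua
local lpeg = require "lpeg"

-- Same six sequences as the base table above.
local base =
{
  [ "``"  ] = "&ldquo;"  ,
  [ "''"  ] = "&rdquo;"  ,
  [ "---" ] = "&mdash;"  ,
  [ "--"  ] = "&ndash;"  ,
  [ "..." ] = "&hellip;" ,
  [ ".."  ] = "&#8229;"  ,
}

-- Hypothetical variant of mktranslate(): merge both tables, then add the
-- targets from shortest to longest.  Each new alternative is put in front
-- of the previous ones, so the longest target is tried first.
local function mktranslate_sorted(tab)
  local all = {}
  for target,replacement in pairs(base)      do all[target] = replacement end
  for target,replacement in pairs(tab or {}) do all[target] = replacement end

  local targets = {}
  for target in pairs(all) do
    targets[#targets + 1] = target
  end

  table.sort(targets,function(a,b)
    if #a ~= #b then return #a < #b end
    return a < b
  end)

  local chars = lpeg.C(lpeg.P(1))
  for _,target in ipairs(targets) do
    chars = lpeg.P(target) / all[target] + chars
  end

  return lpeg.Ct(chars^0) / table.concat
end

local translate = mktranslate_sorted()
print(translate:match("``The name of our act is---The Aristocrats! ... Um ... hello?''"))
--> &ldquo;The name of our act is&mdash;The Aristocrats! &hellip; Um &hellip; hello?&rdquo;
```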

Now, I could do this with the re module:

>
```
local re = require "re"
local R = { concat = table.concat }
local G = --[[ lpeg/re ]] [[
  text  <- chars* -> {} -> concat
  chars <- '``'  -> '&ldquo;'
         / "''"  -> '&rdquo;'
         / '---' -> '&mdash;'
         / '--'  -> '&ndash;'
         / '...' -> '&hellip;'
         / '..'  -> '&#8229;'
         / { . }
]]
filter = re.compile(G,R)
```

But the former lets me pass in an extra table of translations on top of the “standard set” programmed in, for example:

>
```
translate = mktranslate {
  ["RAM"]  = '<abbr title="Random Access Memory">RAM</abbr>',
  ["CPU"]  = '<abbr title="Central Processing Unit">CPU</abbr>',
  ["(tm)"] = '&trade;'
}
```
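As a usage sketch, the value `mktranslate` returns is an ordinary LPeg pattern, so it is applied with `match`. To stay short and self-contained, the block below uses a trimmed-down copy of `mktranslate` that leaves out the base table; the sample sentence and the variable `result` are made up for illustration.

```lua
local lpeg = require "lpeg"

-- Trimmed-down copy of mktranslate() from above (no base table), kept
-- only so this example runs on its own.
local function mktranslate(tab)
  local chars = lpeg.C(lpeg.P(1))
  for target,replacement in pairs(tab or {}) do
    chars = lpeg.P(target) / replacement + chars
  end
  return lpeg.Ct(chars^0) / function(c) return table.concat(c) end
end

local translate = mktranslate {
  ["RAM"]  = '<abbr title="Random Access Memory">RAM</abbr>',
  ["CPU"]  = '<abbr title="Central Processing Unit">CPU</abbr>',
  ["(tm)"] = '&trade;'
}

-- None of these targets is a prefix of another, so the order in which
-- pairs() visits them does not affect the result.
local result = translate:match("A CPU(tm) needs RAM.")
print(result)
--> A <abbr title="Central Processing Unit">CPU</abbr>&trade; needs <abbr title="Random Access Memory">RAM</abbr>.
```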

And I would want this why? Well, I have Lua embedded in mod_blog [9], so using Lua to do the translations is straightforward. But, now when I make an entry, I could include a table of custom translations for that entry. Doing it this way solves a problem [10] I saw nearly a decade ago.

[1] /boston/2011/08/11.1

[2] http://pdos.csail.mit.edu/~baford/packrat/

[3] http://www.lua.org/

[4] http://www.inf.puc-rio.br/~roberto/lpeg/

[5] http://en.wikipedia.org/wiki/Lex_(software)

[6] http://en.wikipedia.org/wiki/Yacc

[7] http://www.inf.puc-rio.br/~roberto/lpeg/re.html

[8] https://github.com/spc476/LPeg-Parsers

[9] /boston/2011/11/28.1

[10] /boston/2003/11/19.2
