Esrap

In addition to regular Packrat / Parsing Expression Grammar / TDPL features, Esrap supports dynamic redefinition of nonterminals, inline grammars, and semantic predicates, and includes introspective facilities for development.

Esrap is maintained courtesy of Steel Bank Studio Ltd by Nikodemus Siivola.

Esrap is maintained in Git:

     git clone git://github.com/nikodemus/esrap.git

will get you a local copy.

     http://github.com/nikodemus/esrap

is the GitHub project page.

Esrap is licenced under an MIT-style licence.

For more on packrat parsing, see http://pdos.csail.mit.edu/~baford/packrat/thesis/ for Bryan Ford's 2002 thesis: “Packrat Parsing: a Practical Linear Time Algorithm with Backtracking”.

1 Parsing Expressions

Parsing proceeds by matching text against parsing expressions. Matching has three components: success vs failure, consumption of input, and associated production.

Parsing expressions that fail never consume input. Parsing expressions that succeed may or may not consume input.

A parsing expression can be:

Terminal

A terminal is a character or a string of length one, which succeeds and consumes a single character if that character matches the terminal.

Additionally, Esrap supports some pseudoterminals.
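As a minimal sketch of terminal matching, the following assumes Esrap can be loaded through Quicklisp (an assumption about the local setup, not a requirement of Esrap) and uses the parse function from the Dictionary. The variable names are illustrative only.

```lisp
(ql:quickload "esrap" :silent t) ; assumes Quicklisp is available

;; The terminal "a" matches the single character a and consumes it.
(defparameter *terminal-hit* (esrap:parse "a" "a"))

;; Against "b" the terminal fails. With :junk-allowed the failure shows up
;; as a NIL production instead of an error.
(defparameter *terminal-miss* (esrap:parse "a" "b" :junk-allowed t))
```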

Nonterminal

Nonterminals are specified using symbols. A nonterminal symbol succeeds if the parsing expression associated with it succeeds, and consumes whatever input that expression consumes.

The production of a nonterminal depends on the associated expression and an optional transformation rule.

Nonterminals are defined using defrule.

Note: Currently all rules share the same namespace, so you should not use symbols in the COMMON-LISP package or other shared packages to name your rules unless you are certain there are no other Esrap-using components in your Lisp image. A future version of Esrap will introduce grammar objects to allow multiple definitions of nonterminals. Symbols in the COMMON-LISP package are specifically reserved for use by Esrap.
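A small sketch of defining and using nonterminals might look as follows. It assumes Esrap is loadable via Quicklisp; DEMO-DIGIT and DEMO-NUMBER are hypothetical rule names, and the :lambda transformation option is taken from Esrap's defrule documentation rather than from the text above.

```lisp
(ql:quickload "esrap" :silent t) ; assumes Quicklisp is available

;; DEMO-DIGIT is a hypothetical nonterminal matching one decimal digit.
(esrap:defrule demo-digit (or "0" "1" "2" "3" "4" "5" "6" "7" "8" "9"))

;; Without a transformation DEMO-NUMBER would produce a list of digit
;; strings; the :lambda option turns that list into an integer.
(esrap:defrule demo-number (+ demo-digit)
  (:lambda (digits)
    (parse-integer (esrap:text digits))))

(defparameter *number* (esrap:parse 'demo-number "2024")) ; 2024
```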

Sequence

     (and subexpression ...)

A sequence succeeds if all subexpressions succeed, and consumes all input consumed by the subexpressions. A sequence produces the productions of its subexpressions as a list.
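For illustration, a sequence of two terminals can be tried directly with parse. This sketch assumes Esrap is loadable via Quicklisp; the variable name is illustrative.

```lisp
(ql:quickload "esrap" :silent t) ; assumes Quicklisp is available

;; Both subexpressions must match in order; the production is the list
;; of their productions.
(defparameter *sequence* (esrap:parse '(and "a" "b") "ab")) ; ("a" "b")
```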

Ordered Choice

     (or subexpression ...)

An ordered choice succeeds if any of the subexpressions succeeds, and consumes all the input consumed by the successful subexpression. An ordered choice produces whatever the successful subexpression produces.

Subexpressions are checked strictly in the specified order, and once a subexpression succeeds no further ones will be tried.
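The effect of ordering can be seen by swapping two alternatives. The sketch below assumes Esrap is loadable via Quicklisp and relies on parse returning the stop position as a second value when junk-allowed is true; the variable names are illustrative.

```lisp
(ql:quickload "esrap" :silent t) ; assumes Quicklisp is available

;; The longer alternative comes first, so the whole input is consumed.
(defparameter *long-first*
  (esrap:parse '(or (and "a" "b") "a") "ab")) ; ("a" "b")

;; Here "a" is tried first and wins; the longer alternative is never
;; attempted, and parsing stops at position 1.
(defparameter *first-wins*
  (multiple-value-list
   (esrap:parse '(or "a" (and "a" "b")) "ab" :junk-allowed t)))
```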

Negation

     (not subexpression)

A negation succeeds if the subexpression fails, and consumes one character of input. A negation produces the character it consumes.
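A brief sketch of negation, assuming Esrap is loadable via Quicklisp (variable names are illustrative):

```lisp
(ql:quickload "esrap" :silent t) ; assumes Quicklisp is available

;; "b" is not "a", so the negation succeeds and consumes that character.
(defparameter *negated* (esrap:parse '(not "a") "b"))

;; When the subexpression would match, the negation fails.
(defparameter *negation-miss*
  (esrap:parse '(not "a") "a" :junk-allowed t)) ; NIL
```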

Greedy Repetition

     (* subexpression)

A greedy repetition always succeeds, consuming all input consumed by applying subexpression repeatedly as long as it succeeds.

A greedy repetition produces the productions of the subexpression as a list.

Greedy Positive Repetition

     (+ subexpression)

A greedy positive repetition succeeds if subexpression succeeds at least once, and consumes all input consumed by applying subexpression repeatedly as long as it succeeds. A greedy positive repetition produces the productions of the subexpression as a list.
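The difference between the two repetition forms shows up when there is nothing to repeat. This is a sketch assuming Esrap is loadable via Quicklisp; the variable names are illustrative.

```lisp
(ql:quickload "esrap" :silent t) ; assumes Quicklisp is available

;; (* "a") succeeds even with zero matches, producing NIL.
(defparameter *star-none* (esrap:parse '(and (* "a") "b") "b"))   ; (NIL "b")
(defparameter *star-many* (esrap:parse '(and (* "a") "b") "aab")) ; (("a" "a") "b")

;; (+ "a") needs at least one match.
(defparameter *plus-many* (esrap:parse '(+ "a") "aa"))                 ; ("a" "a")
(defparameter *plus-none* (esrap:parse '(+ "a") "b" :junk-allowed t))  ; NIL
```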

Optional

     (? subexpression)

Optionals always succeed, and consume whatever input the subexpression consumes. An optional produces whatever the subexpression produces, or nil if the subexpression does not succeed.
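A sketch of an optional inside a sequence, assuming Esrap is loadable via Quicklisp. Note that the ? symbol has to be the one exported from the ESRAP package, hence the package prefix; variable names are illustrative.

```lisp
(ql:quickload "esrap" :silent t) ; assumes Quicklisp is available

;; The optional "a" is present: its production appears in the list.
(defparameter *with-prefix*
  (esrap:parse '(and (esrap:? "a") "b") "ab")) ; ("a" "b")

;; The optional "a" is absent: the optional still succeeds, producing NIL.
(defparameter *without-prefix*
  (esrap:parse '(and (esrap:? "a") "b") "b"))  ; (NIL "b")
```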

Followed-By Predicate

     (& subexpression)

A followed-by predicate succeeds if the subexpression succeeds, and consumes no input. A followed-by predicate produces whatever the subexpression produces.

Not-Followed-By Predicate

     (! subexpression)

A not-followed-by predicate succeeds if the subexpression does not succeed, and consumes no input. A not-followed-by predicate produces nil.
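Both lookahead predicates can be sketched inside a sequence, where the lack of consumption becomes visible. This assumes Esrap is loadable via Quicklisp; the & and ! symbols are written with the ESRAP package prefix, and the variable names are illustrative.

```lisp
(ql:quickload "esrap" :silent t) ; assumes Quicklisp is available

;; (& "a") checks for "a" without consuming it, so the following "a"
;; still matches the same character.
(defparameter *followed*
  (esrap:parse '(and (esrap:& "a") "a") "a")) ; ("a" "a")

;; (! "a") succeeds only when "a" is not next, and produces NIL.
(defparameter *not-followed*
  (esrap:parse '(and (esrap:! "a") "b") "b")) ; (NIL "b")
```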

Semantic Predicates

     (predicate-name subexpression)

The predicate-name is a symbol naming a global function. A semantic predicate succeeds if subexpression succeeds and the named function returns true for the production of the subexpression. A semantic predicate produces whatever the subexpression produces.

Note: semantic predicates may change in the future to produce whatever the predicate function returns.
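A sketch of semantic predicates using the standard functions oddp and evenp, assuming Esrap is loadable via Quicklisp. DEMO-DECIMAL is a hypothetical rule name, and its :lambda transformation comes from Esrap's defrule documentation.

```lisp
(ql:quickload "esrap" :silent t) ; assumes Quicklisp is available

;; DEMO-DECIMAL (hypothetical) produces an integer from a run of digits.
(esrap:defrule demo-decimal
    (+ (or "0" "1" "2" "3" "4" "5" "6" "7" "8" "9"))
  (:lambda (digits)
    (parse-integer (esrap:text digits))))

;; ODDP accepts the production 123, so the predicate succeeds.
(defparameter *odd* (esrap:parse '(oddp demo-decimal) "123")) ; 123

;; EVENP rejects 123, so the predicate fails.
(defparameter *even*
  (esrap:parse '(evenp demo-decimal) "123" :junk-allowed t)) ; NIL
```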

2 Dictionary

2.1 Primary Interface

— Macro: defrule symbol expression &body options

Define symbol as a nonterminal, using expression as the associated parsing expression.

Following options can be specified:

— Function: parse expression text &key start end junk-allowed

Parses text using expression from start to end. Incomplete parses are allowed only if junk-allowed is true.
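The keyword arguments can be sketched as follows, assuming Esrap is loadable via Quicklisp. The second value returned under junk-allowed (the position where parsing stopped) is taken from Esrap's own documentation of parse; variable names are illustrative.

```lisp
(ql:quickload "esrap" :silent t) ; assumes Quicklisp is available

;; Only the characters between START and END take part in the parse.
(defparameter *window*
  (esrap:parse '(+ "a") "xaax" :start 1 :end 3)) ; ("a" "a")

;; With JUNK-ALLOWED the trailing "b" is tolerated; the second value is
;; the position at which parsing stopped.
(defparameter *partial*
  (multiple-value-list
   (esrap:parse '(+ "a") "aab" :junk-allowed t)))
```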

— Function: describe-grammar symbol &optional stream

Prints the grammar tree rooted at nonterminal symbol to stream for human inspection.
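To capture the description rather than print it, a string stream can be passed as the optional argument. This sketch assumes Esrap is loadable via Quicklisp; DEMO-ITEM and DEMO-ITEMS are hypothetical rule names, and the exact layout of the printed text is not specified here.

```lisp
(ql:quickload "esrap" :silent t) ; assumes Quicklisp is available

(esrap:defrule demo-item "a")
(esrap:defrule demo-items (+ demo-item))

;; Collect the human-readable grammar description into a string.
(defparameter *description*
  (with-output-to-string (stream)
    (esrap:describe-grammar 'demo-items stream)))
```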

2.2 Utilities

— Function: text &rest arguments

Arguments must be strings, or lists whose leaves are strings. Catenates all the strings in arguments into a single string.
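A quick sketch of text on nested arguments, assuming Esrap is loadable via Quicklisp:

```lisp
(ql:quickload "esrap" :silent t) ; assumes Quicklisp is available

;; Strings and nested lists of strings are flattened left to right.
(defparameter *joined*
  (esrap:text "foo" '("bar" ("baz" "quux")))) ; "foobarbazquux"
```

This is convenient in transformation rules, where repetitions and sequences produce nested lists of matched strings.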

2.3 Introspection and Intercession

— Function: add-rule symbol rule

Associates rule with the nonterminal symbol. Signals an error if the rule is already associated with a nonterminal. If the symbol is already associated with a rule, the old rule is removed first.

— Function: change-rule symbol expression

Modifies the nonterminal symbol to use expression instead. Temporarily removes the rule while it is being modified.

— Function: find-rule symbol

Returns rule designated by symbol, if any. Symbol must be a nonterminal symbol.

— Function: remove-rule symbol &key force

Makes the nonterminal symbol undefined. If the nonterminal is defined and already referred to by other rules, an error is signalled unless :force is true.
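The life cycle of a rule managed by hand might be sketched like this. It assumes Esrap is loadable via Quicklisp and that rule objects are created with make-instance on the exported rule class, as in Esrap's README; DEMO-RUN is a hypothetical rule name.

```lisp
(ql:quickload "esrap" :silent t) ; assumes Quicklisp is available

;; Attach a fresh rule object to the nonterminal DEMO-RUN.
(esrap:add-rule 'demo-run
                (make-instance 'esrap:rule :expression '(+ "a")))
(defparameter *before* (esrap:parse 'demo-run "aaa")) ; ("a" "a" "a")

;; Swap in a different expression for the same nonterminal.
(esrap:change-rule 'demo-run '(+ "b"))
(defparameter *after* (esrap:parse 'demo-run "bb"))   ; ("b" "b")

;; Nothing refers to DEMO-RUN, so it can be removed without :force.
(esrap:remove-rule 'demo-run)
(defparameter *gone* (esrap:find-rule 'demo-run))     ; NIL
```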

— Function: rule-dependencies rule

Returns the dependencies of the rule: primary value is a list of defined nonterminal symbols, and secondary value is a list of undefined nonterminal symbols.

— Function: rule-expression rule

Return the parsing expression associated with the rule.

— Function: (setf rule-expression) expression rule

Modify rule to use expression as the parsing expression. The rule must be detached beforehand.

— Function: rule-symbol rule

Returns the nonterminal associated with the rule, or nil if the rule is not attached to any nonterminal.
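The rule accessors can be sketched on a small grammar with one defined and one undefined dependency. This assumes Esrap is loadable via Quicklisp; DEMO-LEAF, DEMO-PAIR and DEMO-MISSING are hypothetical names, and DEMO-MISSING is deliberately left undefined.

```lisp
(ql:quickload "esrap" :silent t) ; assumes Quicklisp is available

(esrap:defrule demo-leaf "a")
;; DEMO-MISSING is intentionally never defined.
(esrap:defrule demo-pair (and demo-leaf demo-missing))

(defparameter *rule* (esrap:find-rule 'demo-pair))

(defparameter *expression* (esrap:rule-expression *rule*))
(defparameter *name* (esrap:rule-symbol *rule*))

;; Primary value: defined dependencies; secondary value: undefined ones.
(defparameter *dependencies*
  (multiple-value-list (esrap:rule-dependencies *rule*)))
```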

— Function: trace-rule symbol &key recursive break

Turn on tracing of nonterminal symbol. If recursive is true, turn on tracing for the whole grammar rooted at symbol. If break is true, break is entered when the rule is invoked.

— Function: untrace-rule symbol &key recursive break

Turn off tracing of nonterminal symbol. If recursive is true, untraces the whole grammar rooted at symbol. break is ignored, and is provided only for symmetry with trace-rule.

2.4 Error Conditions

— Condition: esrap-error

Class precedence list: esrap-error, parse-error, error, serious-condition, condition, t

Signaled when an Esrap parse fails. Use esrap-error-text to obtain the string that was being parsed, and esrap-error-position to obtain the position at which the error occurred.
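Handling a failed parse might be sketched as follows, assuming Esrap is loadable via Quicklisp. The variable name is illustrative.

```lisp
(ql:quickload "esrap" :silent t) ; assumes Quicklisp is available

;; Without :junk-allowed a failed parse signals ESRAP-ERROR; the handler
;; collects the text being parsed and the reported position.
(defparameter *failure*
  (handler-case (esrap:parse "a" "b")
    (esrap:esrap-error (condition)
      (list (esrap:esrap-error-text condition)
            (esrap:esrap-error-position condition)))))
```

Because esrap-error inherits from parse-error, a handler for the standard parse-error condition would catch it as well.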

— Condition: left-recursion

Class precedence list: left-recursion, esrap-error, parse-error, error, serious-condition, condition, t

Signaled when left recursion is detected during Esrap parsing. left-recursion-nonterminal names the symbol for which left recursion was detected, and left-recursion-path lists the nonterminals that make up the left recursion cycle.
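A directly left-recursive rule is enough to trigger this condition. The sketch assumes Esrap is loadable via Quicklisp and that left recursion is reported as an error (the behaviour described above); DEMO-SUM is a hypothetical rule name.

```lisp
(ql:quickload "esrap" :silent t) ; assumes Quicklisp is available

;; DEMO-SUM refers to itself in leftmost position.
(esrap:defrule demo-sum (or (and demo-sum "+" "1") "1"))

;; Parsing with it signals LEFT-RECURSION; the handler extracts the
;; offending nonterminal.
(defparameter *culprit*
  (handler-case (esrap:parse 'demo-sum "1+1")
    (esrap:left-recursion (condition)
      (esrap:left-recursion-nonterminal condition))))
```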
