#######
 nimly
#######
|github_workflow| |nimble|

Lexer Generator and Parser Generator as a Macro Library in Nim.

With nimly, you can make a lexer/parser by writing definitions
in a lex/yacc-like format.
``nimly`` generates the lexer and parser using macros at compile time,
so you can use ``nimly`` as a library rather than as an external tool for your program.

niml
====
``niml`` is a macro to generate a lexer.

macro niml
----------
The macro ``niml`` makes a lexer.
Almost all of the work of constructing the lexer is done at compile time.
An example follows.

.. code-block:: nim

  ## This makes a LexData object named myLexer.
  ## The lexer returns a value of type ``Token`` when a token is found.
  niml myLexer[Token]:
    r"if":
      ## this part is converted into the proc body.
      ## the argument is ``token: LToken``.
      return TokenIf()
    r"else":
      return TokenElse()
    r"true":
      return TokenTrue()
    r"false":
      return TokenFalse()
    ## you can use ``..`` instead of ``-`` in ``[]``.
    r"[a..zA..Z\-_][a..zA..Z0..9\-_]*":
      return TokenIdentifier(token)
    ## you can define ``setUp`` and ``tearDown`` functions.
    ## ``setUp`` is called from ``open``, ``newWithString`` and
    ## ``initWithString``.
    ## ``tearDown`` is called from ``close``.
    ## an example is ``tests/lexer_global_var.nim``.
    setUp:
      doSomething()
    tearDown:
      doSomething()
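
The generated ``LexData`` object can then be used, for example, like this
(a minimal sketch based on the usage shown in the Example section below;
the input string is arbitrary):

.. code-block:: nim

  ## drive the lexer generated above; see ``tests/test_readme_example.nim``
  ## for a complete, runnable version.
  var lexer = myLexer.newWithString("if true else false")
  var tokens: seq[Token] = @[]
  for token in lexer.lexIter:
    ## each ``token`` is a value returned by one of the rules above
    tokens.add(token)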

The meta characters are as follows:

- ``\``: escape character
- ``.``: matches any character
- ``[``: start of a character class
- ``|``: alternation (or)
- ``(``: start of a subpattern
- ``)``: end of a subpattern
- ``?``: 0 or 1 times quantifier
- ``*``: 0 or more times quantifier
- ``+``: 1 or more times quantifier
- ``{``: ``{n,m}`` is a quantifier for at least n and at most m times

Inside ``[]``, the meta characters are as follows:

- ``\``: escape character
- ``^``: negates the class (only in the first position)
- ``]``: end of the class
- ``-``: specifies a character range (``..`` can be used instead)

Each of the following is recognized as a character set (a usage sketch follows the list).

- ``\d``: ``[0..9]``
- ``\D``: ``[^0..9]``
- ``\s``: ``[ \t\n\r\f\v]``
- ``\S``: ``[^ \t\n\r\f\v]``
- ``\w``: ``[a..zA..Z0..9_]``
- ``\W``: ``[^a..zA..Z0..9_]``
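
For example, the following rules combine character sets and quantifiers
(a sketch; the variant constructors ``NUM``, ``IDENT`` and ``IGNORE`` are
hypothetical and assumed to be defined with ``patty``, and ``strutils`` is
assumed to be imported for ``parseInt``):

.. code-block:: nim

  niml sketchLexer[MyToken]:
    ## one or more digits (``\d`` is ``[0..9]``)
    r"\d+":
      return NUM(parseInt(token.token))
    ## a letter or underscore followed by zero or more word characters
    r"[a..zA..Z_]\w*":
      return IDENT(token.token)
    ## whitespace, to be ignored by the caller
    r"\s":
      return IGNORE()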

nimy
====
``nimy`` is a macro to generate a LALR(1) parser.

macro nimy
----------
The macro ``nimy`` makes a parser.
Almost all of the work of constructing the parser is done at compile time.
An example follows.

.. code-block:: nim

  ## This makes a parser named myParser.
  ## The first clause is the top level of the BNF.
  ## The parser receives tokens of type ``Token``, and the token must have a
  ## field ``kind`` of enum type ``[TokenTypeName]Kind``.
  ## This is naturally satisfied when you use ``patty`` to define the token.
  nimy myParser[Token]:
    ## the starting non-terminal
    ## the return type of the parser is ``Expr``
    top[Expr]:
      ## a pattern.
      expr:
        ## proc body used when the pattern (a single ``expr``) is parsed.
        ## ``$1`` refers to the first element of the pattern (``expr``)
        return $1

    ## the non-terminal named ``expr``
    ## with return type ``Expr``
    expr[Expr]:
      ## first pattern of expr.
      ## ``LPAR`` and ``RPAR`` are TokenKinds.
      LPAR expr RPAR:
        return $2

      ## second pattern of expr.
      ## ``PLUS`` is TokenKind.
      expr PLUS expr:
        return $2
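
A generated parser is driven together with a ``niml``-generated lexer,
for example like this (a sketch; the Example section below shows the
complete, runnable pattern, and ``src`` stands for the input string):

.. code-block:: nim

  ## use the parser generated above with the lexer from the ``niml`` section.
  var lexer = myLexer.newWithString(src)
  var parser = myParser.newParser()
  let parsed: Expr = parser.parse(lexer)

To parse another input with the same objects, re-initialize them with
``initWithString`` and ``parser.init()``, as the tests below do.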

You can use the following EBNF functions:

- ``XXX[]``: Option (0 or 1 ``XXX``).
  The type is ``seq[xxx]`` where ``xxx`` is the type of ``XXX``.
- ``XXX{}``: Repeat (0 or more ``XXX``).
  The type is ``seq[xxx]`` where ``xxx`` is the type of ``XXX``.

Examples of these are in the next section.

Example
=======
``tests/test_readme_example.nim`` is a simple example.

.. code-block:: nim

  import unittest
  import patty
  import strutils

  import nimly

  ## variant is defined in patty
  variant MyToken:
    PLUS
    MULTI
    NUM(val: int)
    DOT
    LPAREN
    RPAREN
    IGNORE

  niml testLex[MyToken]:
    r"\(":
      return LPAREN()
    r"\)":
      return RPAREN()
    r"\+":
      return PLUS()
    r"\*":
      return MULTI()
    r"\d":
      return NUM(parseInt(token.token))
    r"\.":
      return DOT()
    r"\s":
      return IGNORE()

  nimy testPar[MyToken]:
    top[string]:
      plus:
        return $1

    plus[string]:
      mult PLUS plus:
        return $1 & " + " & $3

      mult:
        return $1

    mult[string]:
      num MULTI mult:
        return "[" & $1 & " * " & $3 & "]"

      num:
        return $1

    num[string]:
      LPAREN plus RPAREN:
        return "(" & $2 & ")"

      ## float (integer part is 0-9) or integer
      NUM DOT[] NUM{}:
        result = ""
        # type of `($1).val` is `int`
        result &= $(($1).val)
        if ($2).len > 0:
          result &= "."
        # type of `$3` is `seq[MyToken]` and each element is a NUM
        for tkn in $3:
          # type of `tkn.val` is `int`
          result &= $(tkn.val)

  test "test Lexer":
    var testLexer = testLex.newWithString("1 + 42 * 101010")
    testLexer.ignoreIf = proc(r: MyToken): bool = r.kind == MyTokenKind.IGNORE

    var
      ret: seq[MyTokenKind] = @[]

    for token in testLexer.lexIter:
      ret.add(token.kind)

    check ret == @[MyTokenKind.NUM, MyTokenKind.PLUS, MyTokenKind.NUM,
                   MyTokenKind.NUM, MyTokenKind.MULTI,
                   MyTokenKind.NUM, MyTokenKind.NUM, MyTokenKind.NUM,
                   MyTokenKind.NUM, MyTokenKind.NUM, MyTokenKind.NUM]

  test "test Parser 1":
    var testLexer = testLex.newWithString("1 + 42 * 101010")
    testLexer.ignoreIf = proc(r: MyToken): bool = r.kind == MyTokenKind.IGNORE

    var parser = testPar.newParser()
    check parser.parse(testLexer) == "1 + [42 * 101010]"

    testLexer.initWithString("1 + 42 * 1010")

    parser.init()
    check parser.parse(testLexer) == "1 + [42 * 1010]"

  test "test Parser 2":
    var testLexer = testLex.newWithString("1 + 42 * 1.01010")
    testLexer.ignoreIf = proc(r: MyToken): bool = r.kind == MyTokenKind.IGNORE

    var parser = testPar.newParser()
    check parser.parse(testLexer) == "1 + [42 * 1.01010]"

    testLexer.initWithString("1. + 4.2 * 101010")

    parser.init()
    check parser.parse(testLexer) == "1. + [4.2 * 101010]"

  test "test Parser 3":
    var testLexer = testLex.newWithString("(1 + 42) * 1.01010")
    testLexer.ignoreIf = proc(r: MyToken): bool = r.kind == MyTokenKind.IGNORE

    var parser = testPar.newParser()
    check parser.parse(testLexer) == "[(1 + 42) * 1.01010]"

Install
=======
1. ``nimble install nimly``

Now you can use nimly with ``import nimly``.
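
If your project is itself a nimble package, you can also declare nimly as a
dependency in its ``.nimble`` file (a sketch; the file name is hypothetical):

.. code-block:: nim

  # yourpackage.nimble -- among the usual package fields
  requires "nimly"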

vmdef.MaxLoopIterations Problem
-------------------------------
While compiling a lexer/parser, you may encounter the error
``interpretation requires too many iterations``.
You can avoid this error by using the compiler option ``--maxLoopIterationsVM:N``,
which is available since Nim v1.0.6.
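
For example, on the command line this looks like
``nim c --maxLoopIterationsVM:100000000 -r yourfile.nim``, or you can set it
in a NimScript configuration file (a sketch; ``100000000`` is only an
illustrative value for ``N``):

.. code-block:: nim

  # config.nims
  switch("maxLoopIterationsVM", "100000000")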

See https://github.com/loloicci/nimly/issues/11 for details.

Contribute
==========
1. Fork this repository
2. Create a new branch
3. Commit your changes
4. Push the branch
5. Create a new pull request

Changelog
=========
See changelog.rst_.

Developing
==========
You can use ``nimldebug`` and ``nimydebug`` as conditional symbols
to print debug info.

example: ``nim c -d:nimldebug -d:nimydebug -r tests/test_readme_example.nim``


.. |github_workflow| image:: https://github.com/loloicci/nimly/workflows/test/badge.svg
    :target: https://github.com/loloicci/nimly/actions?query=workflow%3Atest
.. |nimble| image:: https://raw.githubusercontent.com/yglukhov/nimble-tag/master/nimble.png
    :target: https://github.com/yglukhov/nimble-tag
.. _changelog.rst: ./changelog.rst