parser and test runner
4 points by conanite 5981 days ago | 6 comments
There is some arc stuff in rainbow that isn't rainbow-specific, so I've committed it to anarki.

It includes an arc parser (spun off because welder needed a tokeniser for colourising) and a little test runner.

This parser, or something like it, could eventually replace 'sread. One advantage of having an arc parser written in arc is that there can be no ambiguity about what is officially arc syntax; it will also be easier to make certain kinds of syntax changes. A temporary disadvantage is that this parser doesn't yet support all the scheme syntax that arc supports. It's a very simple implementation (apart from the token generator function, which took some Wikipedia support to get working) and probably needs improvement.

  arc> (apply (eval (parse "[* _ _]")) '(27))
  729
The test runner runs simple tests, or test suites, of this form:

  (suite "parser tests"
    ("parse foo"    (parse "foo")      foo)
    ("parse a list" (parse "(a b c)")  (a b c)))
There are lots of test examples under lib/tests. "foundation-test.arc" covers the most basic, low-level arc functions, and might be useful to anyone building their own arc interpreter. "parser-test.arc" provides a bunch of tests for the parser, and for the source indexer (used by welder).
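
To make the shape of a test case concrete, here is a minimal sketch of how a single (name expression expected) triple might be checked. It is not the code in lib/unit-test.arc: the macro name run-case is hypothetical, and both the use of 'iso for comparison and the choice to leave the expected value unevaluated are assumptions inferred from the suite example above.

```arc
; Hypothetical helper, not part of lib/unit-test.arc.
; Assumption: the expected value is taken literally (unevaluated)
; and compared structurally with iso.
(mac run-case (name expr expected)
  `(if (iso ,expr ',expected)
       (list 'passed ,name)
       (list 'failed ,name)))

(run-case "list literal" (list 'a 'b 'c) (a b c))   ; => (passed "list literal")
(run-case "addition"     (+ 1 2)         4)         ; => (failed "addition")
```

Leaving the expected value unquoted keeps the test table short, at the cost of not being able to compute the expected value with an expression.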

This is one way to launch tests:

  arc> (load "lib/unit-test.arc")
  nil
  arc> (load "lib/parser.arc")
  nil
  arc> (load "lib/tests/parser-test.arc")
  nil
  arc> (run-all-tests)
  passed: 27
  failed: 0
I've been a fan of test-driven-development for a few years and this test runner, unsophisticated though it may be, has already been pretty helpful.


1 point by almkglor 5981 days ago | link

As an aside, there already exists a parser combinator library for Arc; my position is that it would be preferable to use that if possible.

Also, the tokenizer function could be redone as a scanner using (scanner 'car (car-expression) 'cdr (cdr-expression)). You can even write each state of the tokenizer as a separate sub-function and model state transitions as tail calls. You could do something like this:

  (def tokenize-scanner (s (o ind 0))
    (with ((reading-comment reading-unquote ...) nil
           cur-token nil
           next-ind ind)
      (= reading-comment
         (fn () ... (default-state)))
      ...
      (default-state)
      (scanner 'car cur-token
               'cdr (tokenize-scanner s next-ind))))
It might also be good to abstract away the generator portion and provide the generator as a library (arguably, though, scanners are already monadic generators).

It might also be useful to have the reader read from a list or scanner; in my opinion, that is how it will be done in SNAP and/or arc2c.

Still, I'm not above actually using your code ^^ Good job!

-----

1 point by conanite 5980 days ago | link

How do you use a scanner? Is it

  (car my-scanner)
or

  (my-scanner 'car)
? I've been browsing arki source for examples but I'm completely out of my depth there ...

One aspect of the token generator is that sometimes it recognises two tokens simultaneously. In other words, when it sees the right-paren in

  ... foo)
, it recognises "foo" and right-paren all at once. Perhaps this is the wrong way to do it, and I should be using 'peekc instead. But I suppose I can do this with a scanner:

  (scanner 'car "foo"
           'cdr (scanner 'car right-paren
                         'cdr (tokenize-scanner etc ...)))
I had previously tried modelling each state as a separate sub-function as you suggest, but couldn't get it to work. But that was before I noticed the

  (with ((a b c d) nil)
      (= a (fn () ...))
      (= b (fn () ...))
      (= c (fn () ...))
   ...)
idiom. Gotta try again ...

-----

4 points by almkglor 5980 days ago | link

> I noticed the

  > (with ((a b c d) nil)
  >     (= a (fn () ...))
  >     (= b (fn () ...))
  >     (= c (fn () ...))
  >  ...)
> idiom.

If it's an idiom, it probably needs a macro then ^^

  (mac with-r (vars . body)
    (let vars (pair vars)
      `(let ,(map car vars) nil
         ,@(map [let (var val) _ `(= ,var ,val)] vars)
         ,@body)))
edit: as an aside, peekc doesn't seem to work properly sometimes ^^
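
As a small usage sketch of the macro above: because every variable is bound to nil before any of the functions is assigned, the functions can refer to each other. The names ev and od are hypothetical, chosen only to avoid clashing with Arc's built-in 'even and 'odd; the macro is repeated so the example stands alone.

```arc
; Copy of the macro from this comment, so the example is self-contained.
(mac with-r (vars . body)
  (let vars (pair vars)
    `(let ,(map car vars) nil
       ,@(map [let (var val) _ `(= ,var ,val)] vars)
       ,@body)))

; ev and od are hypothetical names. Each function calls the other,
; which works because both names are already bound when the fns are created.
(with-r (ev (fn (n) (if (is n 0) t   (od (- n 1))))
         od (fn (n) (if (is n 0) nil (ev (- n 1)))))
  (list (ev 10) (ev 7)))
; => (t nil)
```

The same pattern should carry over to tokenizer states: each state is a function that ends by calling the next state.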

-----

1 point by absz 5934 days ago | link

Nice macro, but why the -? withr seems more consistent with things like withs.

-----

2 points by almkglor 5933 days ago | link

I agree. It should indeed be 'withr, and we should probably also define a 'givenr just for consistency.

-----

2 points by almkglor 5980 days ago | link

How to use a scanner:

  (car my-scanner)
How to construct a scanner:

  (scanner 'car (your-expression)
           'cdr (your-expression))
Note that the expressions in the 'scanner form are delayed, i.e. they are not evaluated until a 'car or 'cdr is performed on your scanner, and they are evaluated only once.
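
The "delayed, and evaluated only once" behaviour can be modelled in plain Arc with closures. This is only a sketch of the semantics, not how the scanner library is implemented: lazy-cons and count-from are hypothetical names, and this model is read by passing a message such as (s 'car), whereas a real scanner is read with (car s) as shown above.

```arc
; Hypothetical model of a delayed pair; not the scanner library itself.
; Each expression runs on first access and its value is cached afterwards.
(mac lazy-cons (a d)
  (w/uniq (a-done a-val d-done d-val msg)
    `(with (,a-done nil ,a-val nil ,d-done nil ,d-val nil)
       (fn (,msg)
         (case ,msg
           car (do (unless ,a-done (= ,a-val ,a ,a-done t)) ,a-val)
           cdr (do (unless ,d-done (= ,d-val ,d ,d-done t)) ,d-val))))))

; An infinite series is safe because the cdr is not computed until asked for.
(def count-from (n)
  (lazy-cons n (count-from (+ n 1))))

(= s (count-from 0))
(s 'car)           ; => 0
((s 'cdr) 'car)    ; => 1

; The car expression runs only once, however often it is read.
(= evals 0)
(= p (lazy-cons (do (++ evals) 'x) nil))
(p 'car)
(p 'car)
evals              ; => 1
```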

edit: an important note: scanners have the exact read semantics of lists. So simply zapping cdr at a scanner will not advance the scanner, it will only advance the place you are zapping.

There's no need to use 'peekc or similar: all you need is to use stuff like 'cadr, 'caddr.

Because scanners have the exact read semantics of lists, you can use such things as 'each, 'map, etc. Just don't write using scar, scdr, or sref.

If you wanted to emulate lists, you can do something like:

  (def my-cons (a d)
    (scanner 'car a
             'cdr d))
Of course, since a and d are just var references, there's little point in delaying their execution.

edit2: Here's how you might make a generator:

  (def generator (f v)
    (scanner 'car v
             'cdr (generator f (f v))))

  (= b (generator [+ _ 1] 0))
  (car b)
  => 0
  (cadr b)
  => 1
  (cadr:cdr b)
  => 2
'map, 'keep, and a few other functions become lazy when applied to scanners, so you can safely use them on an infinite-series generator.

-----