Arc Forumnew | comments | leaders | submit | conanite's commentslogin
4 points by conanite 2780 days ago | link | parent | on: Pursuing higher quasiquotation

Is a higher quasiquotation like a nested macro? A macro that generates macros? Like (made-up example) ...

    (mac metamac (opname op)
      `(mac ,opname (name n)
         `(mac ,name (x)
            `(',',,op ,,n ,x))))

    (metamac multiplier *) ;; defines a macro called multiplier

    (multiplier times-five 5) ;; defines a macro called times-five

    (times-five 3) ;; expands to (* 5 3)
Or something else?

-----

3 points by rocketnia 2778 days ago | link

No, just like you can nest parentheses without spontaneously having quasiquotation, you can nest quasiquotations like you're talking about without reaching a higher degree of quasiquotation.

Suppose we have notations like so:

  ( ) parentheses
  ` , quasiquotation
  ^ $ the next higher degree of quasiquotation
      (I would go on, but I'll run out of punctuation.)
S-expressions are bounded on both sides by single characters, and they nest within each other like this (labeling the layers as "a" and "b" and anything outside their lexical extent as "__"):

  (a (b) a) --
If we want to look at just the "b" s-expression, it's simple to write that down on its own:

  (b)
Quasiquotations are bounded on both sides by s-expressions, and they nest like this:

  `(a `(b ,(a ,(--) a) (b) b) a ,(--) a (a) a) --
The meaning of "--" in the inner expressions hasn't changed: Every "--" is outside the lexical boundary of the `(a ...) quasiquotation, just as every "a" is outside the lexical extent of the `(b ...) quasiquotation. Occurrences of "--" can appear when the lexical extent closes on an s-expression (")") or a quasiquotation (","), and both of these are shown in the example.

We can isolate the "b" part pretty easily again, but we need to allow for holes in our data structure:

  `(b ,(--) (b) b)
Quasiquotations of the next higher degree are bounded on both sides by quasiquotations, and they nest like this:

  ^`(a
      ^`(b
          $`(a
              $`(--)
              a ,(b (b) b) a ,(b) a `(a) a (a) a)
          b ,(a) b `(b) b (b) b)
      a ,(--) a `(a) a (a) a)
  --
Occurrences of "--" can appear when the lexical extent closes on an s-expression (")"), a quasiquotation (","), or a quasiquotation of the next higher degree ("$"), and all of these are shown in the example.

This time, writing down just the "b" part gets tricky using s-expression-shaped syntax because one of the holes has orphaned sections inside (holes in the hole):

  ^`(b
      $`(-- ,(b (b) b) ,(b))
      b ,(--) b `(b) b (b) b)
When we have more than one orphaned section in the same hole like we do here, we may need to use labels (or positions, or extra hole structure) to tell them apart so we can insert the "b" section into the "a" section deterministically. So far I haven't figured out how to represent this data in a way that works, let alone an elegant way.

For a concrete use case, look at it this way: When do we use ( and )? When our DSLs don't last all the way to the end of the file. When do we use ` and ,? When our DSLs have holes in them (although it might be unusual to hear this, because most Lisps couple these syntaxes together with a particular nested-list-generating DSL). When do we use ^ and $? When our DSLs have holes with holes in them. And so on, where higher degrees of quasiquotation have more and more intricate holes.

Let's say I'm writing a macro that implements a DSL where Common Lisp code and Arc code can be combined in the same function. (I'm going with Common Lisp so that we can't simply compile it to inline Racket code.) I may have Common Lisp s-expressions occurring under my Arc s-expressions and Arc s-expressions occurring under my Common Lisp s-expressions, but I want Arc variables to be visible in all the Arc parts and Common Lisp variables to be visible in all the Common Lisp parts. When we take a look at the lexical scope of any one local Common Lisp (or Arc) variable in that code, it's not simply shaped like an s-expression like traditional lexical scopes are; it has holes-with-holes wherever the Arc (respectively Common Lisp) expressions occur. So ^ and $ are a natural fit for the DSL:

  (def three ()
    (arc-in-common-lisp
      ^`(let ((common-lisp-var 1))
          $`(let arc-var 2
              (+ ,common-lisp-var arc-var)))))

-----

3 points by rocketnia 2778 days ago | link

Maybe I could actually use something like that ^ and $ syntax.

I'll need to generalize it to an infinite number of degrees:

  (     )
  ^.    $.
  ^-.   $-.
  ^--.  $--.
  ^---. $---.
Since Scheme's ,',',', technique doesn't generalize to other DSLs, I'll want to have at least one way to unquote from more than one level of nested quasiquotation at once:

  $---.
  $$---.
  $$$---.
Cene has more than one notation for doing that. (I can write a label on an outer layer, and then I can unquote all the way to a particular label.) I'd like to support various unquoting styles as macros, but maybe that macro system can expand to a notation like this, so getting this to work first seems best.

Finally, representing data structures with orphaned parts might be easy enough once I actually try using key-value tables to hold orphans like I've planned.

I think I like this approach even better than the one I reached in the blog post. It means that I actually can process an infinite number of degrees of quasiquotation "in the reader" rather than letting the top level of macroexpansion begin with an s-expression.

But I suppose this and the blog post tackle two different parts of the problem. The blog post's approach/challenges still apply to the process of parsing this syntax into an infinite-degree quasiquotation.

-----

2 points by rocketnia 2773 days ago | link

I've fooled myself before, but I may have settled on the right data structure to represent higher quasiquotations.

I define the data structure three times:

- Once as a sequence of types that represent progressively higher-degree quasiquotations, which use maps to hold orphaned nodes. (https://github.com/rocketnia/lathe/blob/5375d95cb7b748972a65...)

- Once as a single type that can represent arbitrarily high degrees of quasiquotation, but in a very weakly typed way. (https://github.com/rocketnia/lathe/blob/5375d95cb7b748972a65...)

- Once as a sequence of types that represent progressively higher-degree quasiquotations, which use type families to hold orphaned nodes. This is the most strongly typed of all these representations. (https://github.com/rocketnia/lathe/blob/5375d95cb7b748972a65...)

Here's the simplest one, the one in the middle:

  data HDExpr s m
    = HDExprMedia (m (HDExprNonMedia s m))
  data HDExprNonMedia s m
    = HDExprHole s [Map s (HDExpr s m)]
    | HDExprLayer (HDExpr s m) [Map s (HDExpr s m)]
It's not very self-documenting, so to start describing it, here's a Lispier pseudocode:

  <expr> ::= <s-expression where some leaves are <paren>>
  
  <paren> ::= (close <identifier> <list of <environment of <expr>>>)
  <paren> ::= (open-and-close <expr> <list of <environment of <expr>>>)
I say "s-expression" there, but we could have any monad there for our syntax. If we work in the s-expression monad, the usual quasiquote operations ` , are parens of degree 1. When the monad we're working in is Haskell's (Writer String), our syntax is effectively reader syntax, where the ( ) parens are degree 1 and ` , are degree 2.

(There are also parens of degree 0, but they're a bit weird: Once they open, they only close again by reaching the end of the syntax. So when we're in the s-expression monad, perhaps ' is a paren of degree 0. When we're in the (Writer String) monad, perhaps pressing enter at the end of a REPL command is a paren of degree 0.)

The degree of the paren is reflected in the length of the list of environments it contains. Degree-0 parens have no environments, degree-1 parens have exactly one, and so on. If we want to represent a degree-N expression, then the parens we use are limited in a specific way:

  <expr> ::= <s-expression where some leaves are <paren>>
  
  <paren> ::=
    (close <identifier>
      <for some (M < N), an M-element list of
        <environment of <expr>>)
  <paren> ::=
    (open-and-close <expr>
      <for some (M <= N), an M-element list of
        <environment of <expr>>)>
The (open-and-close ...) syntax begins a nested expression. The environments serve to fill the holes in the <expr>, resuming the previous level of nesting. Each element of the list fills holes of a different degree. Holes of degree higher than the number of environments in the list are not filled; they remain holes. For instance, in `(a b (c d ,e) f), the "(" before "c" has two holes in its expression (",e" and ") f"), but it only fills the hole for ") f". The hole for ",e" is a higher degree than the "(" paren, so it's not filled at such a local place. It's only filled by the "`" paren on the outside.

The (close ...) syntax begins a hole of degree equal to the number of elements of the list. The list of environments will be used when the hole is replaced with an expression. It'll fill in the (low-degree) holes of that expression.

Note that the list lengths correspond with the degrees of holes, but not with the degrees of expressions. In fact, a degree-N expression does not contain any expressions of degrees other than N. If we consider ourselves to be working with higher quasiquotations of an infinite degree N, but every (close ...) actually occurring in our data has finite degree, then we can represent our data using finite lists. We never need to select a finite value for N!

---

In Haskell, the strongly typed variations I wrote do require a finite value for N to be decided beforehand (and baked into the name of the type), because I'm pretty sure that vastly simplifies the type system features I would need to use to get it to work. :-p Roughly, the difficulty is that I need 0 to give me something of kind ( * ), 1 to give me something of kind ( * -> * ), 2 to give me something of kind (( * -> * ) -> ( * -> * )), and so on, so I would need kinds that depend on values. Agda or Idris might already be up for this task, but I doubt that's going to be the kind of effort I want to spend since I intended to build my macroexpander in (untyped) Racket to begin with.

There might just be a little more I want to tinker with in Haskell, because on my way to defining this data structure I started to develop some code for "higher monads," and it would be fun if this data structure turned out to be an instance of that concept. Still, at this point I'm probably ready to go back to Racket.

I'm even hopeful again that this macro system might play nicely with Racket's. Racket has some hygiene features[1] that walk recursively over the inputs and outputs of macros, expecting them to be s-expression-shaped with no holes. I wasn't sure that Racket's technique could be reconciled with higher quasiquotation. With this data structure, the exotic nesting of higher quasiquotations is converted to the traditional kind of nesting that Racket expects.

So, this might be a usable self-contained Racket library in the end -- not even a framework but a seamless library -- which is what I want, because that would make it easy to clarify the meaning of higher quasiquotation in terms of an existing ecosystem before I use it in Cene. :)

Since this thread and my code comments contain some walls of text, I'll see if I can convert them to a blog post soon.

---

[1] In particular, Racket uses a hygiene technique where it attaches a fresh scope label to a piece of syntax before macroexpanding and then flips the presence of that scope after macroexpanding so that the scope winds up attached to only the parts of the macro result that don't come from the input (making that region more local than the rest). Racket also has a "syntax taints" feature which lets macros attach "dye packs" to their results so that the identifiers occurring in those results can't be used by a client to access private module bindings.

-----

6 points by conanite 2794 days ago | link | parent | on: How many people still lurk here?

hi!

-----


1. Thank you! I figured arc could have gone a lot further with special-syntax, by making it configurable inside arc code.

2. this "open source" has had only two eyeballs on it so far ... attacks welcome :)

3. The execution stack is entirely plagiarised from rainbow so it should not be hard to re-implement continuations in the same manner ... the need hasn't arisen yet though ...

-----

2 points by conanite 4623 days ago | link | parent | on: Running Java in Arc

Rainbow has an implementation of tetris and a text editor, both are full of examples of using swing. What specifically do you want to do?

-----

4 points by conanite 4688 days ago | link | parent | on: Onions in the varnish

I agree, the vertical-bar symbol-syntax is a very smelly onion. And as far as I could tell the last time I checked, the only place arc really depends on it is in webserver code, to define a handler for the application root uri ( "/" ) - this path is defined by the empty symbol, which is represented as ||

You could get away with totally ignoring vertical-bar symbol-syntax, and add this somewhere before you define your webapp:

  (assign || (sym ""))
Hindsight is great. I wish I had thought of this before writing all those vertical-bar tests :)

-----

3 points by akkartik 4688 days ago | link

Alternatively, we could make handler names start with the leading '/', so you say "defop /" instead of "defop ||".

Do people consider srv.arc to be 'part of arc' for compatibility purposes?

---

Or we could define the root handler using "defop nil".

  wart> (sym "")
  nil

-----

2 points by Pauan 4688 days ago | link

"Do people consider srv.arc to be 'part of arc' for compatibility purposes?"

I would consider any Arc program to be "part of Arc" for compatibility purposes because any Arc program is supposed to use only the core and things defined in arc.arc (along with its own libraries, of course).

Obviously all bets are off if it uses the $ macro to drop into Racket, but then you're not really writing Arc code anymore, so I'm ignoring that case.

---

"(assign || (sym ""))"

That's a pretty good way of solving that one particular use-case, but it won't help in general because there might be some Arc program that actually does use the |...| notation more extensively.

However, I suspect such programs are quite rare, and with good reason. So I don't think it's a major deal if an Arc implementation chooses to not implement the |...| notation. However, such incompatibilities should be documented, so people using your implementation know what works and what doesn't (and why it doesn't work).

And in that case, it would probably also be a good idea to have your implementation throw an error if a program ever uses the |...| notation. That makes it easy to figure out which programs are broken, and makes it easy to pinpoint where the problem is so it can be easily fixed, rather than failing silently or at a later step in execution.

-----

1 point by akkartik 4688 days ago | link

"I would consider any Arc program to be "part of Arc" for compatibility purposes.."

I don't understand. If I write a hello world arc program, are you saying that's part of arc?

"..because any Arc program is supposed to use only the core and things defined in arc.arc."

I'm not sure what this means either. I think my question boils down to, "what do you consider to be the core of arc?"

-----

1 point by Pauan 4688 days ago | link

"I don't understand. If I write a hello world arc program, are you saying that's part of arc?"

For the sake of compatibility with Arc, yes. An implementation claiming compatibility with Arc should be capable of running programs designed for Arc 3.1.

---

"what do you consider to be the core of arc?"

We were discussing what constitutes "part of Arc" for compatibility purposes. "srv.arc" is not a part of the core of Arc, nor are any libraries written in Arc. But it is still "part of Arc" for the sake of compatibility, in the sense that a compatible implementation of Arc must be capable of running them, warts and all.

Unfortunately, that means that certain programs may end up relying on things inherited from Racket, such as the |...| syntax. I think it must be decided on a case-by-case basis whether compatibility with Arc 3.1 is worth it or not. In the particular case of |...| I don't think it's worth it. But I'm not the one implementing Arcueid: that's up to dido to decide.

-----

1 point by thaddeus 4687 days ago | link

That's my vote -> (defop nil ...)

-----

1 point by dido 4687 days ago | link

Strangely enough, Arcueid manages to run the web server code just fine while being unable to handle the vertical bar symbol syntax. shrugs

-----

1 point by akkartik 4687 days ago | link

I just tried it, and the '/' url isn't recognized. It's just not throwing an error on '||'.

-----

4 points by conanite 4702 days ago | link | parent | on: Unit testing Arc

Implementation-neutral arc tests from rainbow are here:

https://github.com/conanite/rainbow/tree/master/src/arc/lib/...

It's safe to use these as a specification.If you fire up an arc3 repl within the rainbow src directory you can run the same tests to verify you get the same behaviour.

Rainbow-specific tests are in another directory.

-----

2 points by dido 4702 days ago | link

Hmm, core-evaluation-test.arc seems to hang Anarki, as well as Arc 3.1.

Well, I've tried to run the tests that do work fine with 3.1 and Anarki under Arcueid and find a lot of issues. For starters, I had no idea that Arc treats symbols with |'s specially. Looks like more accidental behavior inherited from MzScheme/Racket. Scheme48 says that '|abc| contains illegal characters. Guile creates a symbol |abc|. I don't see anything in R^6RS that mandates any of this behavior. Heh, looks like I've got a lot of work to do!

Apparently the bars in symbols is a sort of convention when it comes to case sensitivity of symbols in Scheme. It seems that Arc, in its current implementation anyway, is unintentionally inheriting a lot of onions from MzScheme...

-----

1 point by conanite 5176 days ago | link | parent | on: Arc Conference

an arc conference would be très cool

1. Anywhere; Europe is more convenient

2. What should the core language be; macro design; libraries; implementations and interopability

3. I hope I can think of something to say about rainbow

4. no

5. as projectileboy mentions, we would need to find a conference space large enough to fit ... a dozen people?

-----


  arc> (tuples (range 0 19) 3)
  ((0 1 2) (3 4 5) (6 7 8) (9 10 11) (12 13 14) (15 16 17) (18 19))
tuples is defined in arc.arc:

  (def tuples (xs (o n 2))
    (if (no xs)
        nil
        (cons (firstn n xs)
              (tuples (nthcdr n xs) n))))

-----

1 point by hasenj 5186 days ago | link

Thanks!

Interestingly, it's smaller than 'pair' even though it does more.

-----

3 points by akkartik 5186 days ago | link

I first encountered this idea in a theorem-proving class - that if you have a hard time proving a theorem, often a really good approach is to find a more general 'lemma' to try to prove. Stronger theorems are often easier because they let you assume more in the induction step. Recursion has a similar 1-1 correspondence with inductive proofs.

http://www.cs.utexas.edu/users/moore/classes/cs389r/index.ht...

-----

1 point by hasenj 5186 days ago | link

Actually I just realized, 'tuple' uses 'firstn' and 'nthcdr', where as 'pair' sort of inlines these concept and has to deal with nils.

-----

3 points by conanite 5198 days ago | link | parent | on: A better syntax for optional args

Two problems with (o arg default) - you need to remember not to use 'o at the start of a destructuring list (and not get confused when you see (def handle-request ((i o ip)) ...) ), and as akkartik says it's paren-inefficient, a single keyword to delimit required/optional args would mean fewer tokens.

The first problem is easy to fix though - use a symbol that's less likely to be an arg name to identify optional args. How about '= ?

  (def myfun (a b (= c (something)) ...)
it has the advantage of similarity with ruby:

  def myfun a, b, c=something
disadvantage: looks silly when you don't supply a default value:

  (def myfun (a b (= c) ...)

-----

4 points by conanite 5198 days ago | link | parent | on: A better syntax for optional args

(a) might look funny when you want to evaluate an expression for the default value of an optional arg

  (def foo (a b '(c (defaults 'c x y z)) ...
To the untrained eye, (defaults 'c x y z) looks like it should not be evaluated because it's quoted

(c) makes parsing harder ... the assumption of only one element after the dot may be built into the parser

  arc> '(a b c . d e)
  Error: "UNKNOWN::8: read: illegal use of `.'"
It could be some privileged symbol instead of "." though ...

-----

1 point by akkartik 5198 days ago | link

Great points. I realized c was breaking the metaphor of '.'; I didn't realize it would actually refuse to parse.

It doesn't make sense to quote forms that may have expressions to evaluate, so b is better than a.

-----

More