Arc Forum
A 'defined op for Arc, or my quest to get the ideal 'each
2 points by palsecam 5625 days ago | 7 comments
This message can be seen as a follow-up to the "Macro expansion/evaluation problem" thread (http://www.arclanguage.org/item?id=10205). In short: I want to make 'each have an "anonymous form", where I could say:

   arc> (each '(1 2 3) (pr _))  ; use the _ symbol, like in anonymous 1-param fns
   123nil
But currently I can't do it (because I'm weak, but nothing is impossible ;-)). I thought the root of all my problems was the behaviour of 'if in Arc, but that behaviour is actually quite normal (i.e., the same as in Common Lisp, though I remain convinced it should be more "lazy" or something ;-P). So CatDancer said I'd better try to implement a 'defined operator that could tell, at macro-expansion time, whether a variable is defined or not.

So, I followed his wise advice, and I've implemented something which works, but well, it's a dirty hack and it is insufficient anyway.

   arc> (mac isdef (x) (defined x))
   #3(tagged mac #<procedure: isdef>)
   arc> (isdef a)
   #f
   arc> (let a 1 (isdef a))
   #t
   arc> (isdef (+ 2 1))
   #t
   arc> (isdef (+ 2 a))
   #f
   arc> (= a 42)
   42
   arc> (isdef (+ 2 a))
   #t
   arc> (defined a)   ; can only be used inside a macro
   #<procedure:ac-defined>
Big picture: when 'ac sees an sexpr beginning with 'defined, it returns the function 'ac-defined. Later, in ac-mac-call, if the result of applying the macro to its args is a procedure (i.e., 'ac-defined), that procedure is called with the macro's args and env as parameters.
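
A rough runtime model of that check, for experimenting outside the compiler. This is only a sketch: defined-in and its explicit env list are made-up names, not part of the patch. It walks the expression the same way and uses Arc's own bound for the global lookup:

   ; env is a plain list of symbols standing in for the compiler's lexical env
   (def defined-in (x (o env))
     (if (acons x)
         (and (defined-in (car x) env)
              (defined-in (cdr x) env))
         (or (no x)               ; end of a list
             (no (isa x 'sym))    ; numbers, strings, chars count as defined
             (mem x env)          ; "lexically" bound
             (bound x))))         ; globally bound

   arc> (defined-in '(+ 2 a))     ; in a session with no global a
   nil
   arc> (defined-in '(+ 2 a) '(a))
   t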

I'll post the patch in a comment.

However, as I said, it is imperfect. The big, big problem is that it doesn't work with 'if, since 'if quasiquotes the test:

   `(if (not (ar-false? ,(ac (car args) env)))
and in the 'defined case, ,(ac ...) returns #<procedure:ac-defined>, which is not false, so the test always succeeds.

So, I tried to also hack 'if :-D, but I'll go insane if I keep at it for now!

It's like with 'ac-mac-call: 'if would now have to be aware of this new (strange) kind of object. I tried to make 'if return a closure when the test is 'ac-defined (or any procedure, actually; that could be the start of a way to do some kind of lazy evaluation), but this leads to a lot of environment/argument problems, because those differ between "compile" time and runtime.

So here I am for the moment, taking a rest from looking at Arc's guts, and asking for any advice/ideas/suggestions that anyone interested in the subject might have :-)



4 points by fallintothis 5625 days ago | link

I can't help but feel this is the wrong way to approach the problem. Even if you got your current definition working, there are too many ways for it to break. e.g.,

  (isdef (table [for i 1 3 (= _.i (* i i))]))
reads in as

  (isdef (table (fn (_) (for i 1 3 (= _.i (* i i))))))
and fails. There's the special form fn (neither lexically nor globally bound), plus the tricky business of checking defined when compiling fn so that _ is actually lexically bound (otherwise, it appears to be undefined). Same goes for expanding the for macro and i, I imagine.

All this parsing / partial compilation smacks of over-engineering. Even if you avoid multiple-evaluation issues, it seems too easy to make a typo in the middle of a contextual each like

  (myeach (table [for i 1 3 (= _.i (* i u))])
    (prn _))
where u is undefined, meaning it tries to destructure the table expression against (prn _), leading to confusing errors:

  arc> (each (table [for i 1 3 (= _.i (* i u))]) (prn _))
  Error: "Can't understand fn arg list 1"
The syntax for this each you're trying to write is suspect. From a certain angle, it almost seems like trying to overload on arity even though both forms take an unlimited number of values (on account of the rest parameters).
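
To make the arity point concrete, here is a sketch of a macro that guesses the form from the argument count (each2 is a hypothetical name, not something from Arc or from this thread). Since the body is a rest parameter, the count cannot separate the two forms:

  (mac each2 args
    (if (is (len args) 2)
        `(each _ ,@args)    ; guess: (each2 seq body-form)
        `(each ,@args)))    ; guess: (each2 var seq . body)

  arc> (macex1 '(each2 xs (prn _)))
  (each _ xs (prn _))
  arc> (macex1 '(each2 xs (pr _) (prn)))  ; implicit form with two body forms
  (each xs (pr _) (prn))
In the last call, xs ends up as the variable and (pr _) as the sequence to traverse.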

For all I know about Perl (which isn't much), it looks like its foreach works with explicit and implicit variables because you can consistently parse it -- you always have parentheses delimiting the array.

  foreach $item (@items) { print $item; }
vs

  foreach (@items) { print $_; }
If that's the case, the equivalent doesn't map cleanly over to Arc, since the syntax for

  (each item items (prn item))
looks effectively the same as

  (each items (prn _))
to the parser.

I know it's probably not what you want, but perhaps you could consider something with a more regular syntax?

  (mac myeach ((var expr) . body)
    (when (no expr)
      (= expr var var '_))
    `(each ,var ,expr ,@body))

  arc> (myeach ('(1 2 3)) (prn _))
  1
  2
  3
  nil
  arc> (myeach (v '(1 2 3)) (prn v))
  1
  2
  3
  nil
  arc> (myeach ((k v) '((a 1) (b 2) (c 3))) (prn k ": " v))
  a: 1
  b: 2
  c: 3
  nil
  arc> (myeach ((k v) (table [for i 1 3 (= _.i (* i i))])) (prn k ": " v))
  2: 4
  1: 1
  3: 9
  #hash((3 . 9) (1 . 1) (2 . 4))
  arc> (myeach ((table [for i 1 3 (= _.i (* i i))])) (prn _))
  (2 4)
  (1 1)
  (3 9)
  #hash((3 . 9) (1 . 1) (2 . 4))
  arc> (macex1 '(myeach (a b) c))
  (each a b c)
  arc> (macex1 '(myeach (a) b c))
  (each _ a b c)
  arc> (let v 2 (myeach (v '(1 2 3)) (prn v)))
  1
  2
  3
  nil
  arc> (let lst '(1 2 3) (myeach (lst) (prn _)))
  1
  2
  3
  nil
The extra parentheses are kind of ugly when you have expressions (e.g., a call to table), but I find that I'm most often iterating over variables anyways. Or you could rely on the Scheme reader recognizing curly-braces as parens:

  arc> (myeach {x (range 1 5)} (prn x))
  1
  2
  3
  4
  5
  nil
Sorry I don't have any better suggestions.

-----

1 point by palsecam 5625 days ago | link

> too many ways for it to break

> The syntax for this each you're trying to write is suspect

> perhaps you could consider something with a more regular syntax?

You're right. The more I think about it, the less I like this 'each idea.

> All this parsing / partial compilation smacks of over-engineering.

True, but this is also part of the fun here ;-)

> For all I know about Perl (which isn't much), it looks like its foreach works with explicit and implicit variables because you can consistently parse it -- you always have parentheses delimiting the array.

You are absolutely right here, and you made me realize I am actually trying to get something more dirty/complicated than in Perl, which is... really not a good sign :-D!

> Or you could rely on the Scheme reader recognizing curly-braces as parens

This is also an interesting option. Good to know.

> Sorry I don't have any better suggestions.

Gosh, that's already a bunch of good ideas! Thanks!

-----

3 points by absz 5625 days ago | link

[This is about your "quest for the ideal each", not about defined.]

Not to be a heretic, but this is where a more complex syntax shines. Or even something à la CL's loop, where you have a fixed keyword argument:

  (every x in '(1 2 3)
    (prn x))

  ; equivalent to
  
  (every '(1 2 3)
    (prn _))
Lightly tested:

  (mac every args
    (if (and (>= (len args) 3) (is args.1 'in))
      `(each ,(car args) ,(car:cddr args)
         ,@(cdr:cddr args))
      `(each _ ,(car args)
         ,@(cdr args))))
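
Assuming the definition above is loaded, macex1 shows which branch each call takes (xs is just a placeholder variable):

  arc> (macex1 '(every x in xs (prn x)))
  (each x xs (prn x))
  arc> (macex1 '(every xs (prn _)))
  (each _ xs (prn _))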
Another option would be anaphoric each:

  (mac aeach (xs . args)
    `(each it ,xs ,@args))
This produces

  (aeach '(1 2 3)
    (prn it))
It's not quite what you wanted, but it's similar. (It may also have been suggested before, I don't recall.)
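
One thing to watch with the anaphoric version, assuming the aeach above is loaded: other anaphoric macros in the body rebind it, so the element is shadowed inside them.

  arc> (aeach '(1 2 3) (aif (odd it) (prn it)))
  t
  t
  nil
Inside the aif, it is the result of (odd it), not the list element.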

-----

2 points by palsecam 5625 days ago | link

Your 'every example is very very attractive. I think I'll adopt it at this point:

1. It does the job without problematic cases, as there would be in what I described:

   (with (lst '(1 2 3)
          v 3)
      [...]
      (myidealeach v lst (prn v)))  ; <-- v mistaken for the seq to traverse here, 
                                    ; because it is defined
2. OK, it adds some syntax and this is bad, but for the common case where I'd like an implicit '_ variable, it's as short as possible!

3. For the other cases, the extra "in" is not so heavy, and in a way it makes the expression more readable!

Thanks a lot absz!

----

For the rest:

> anaphoric each

Yes, a good idea, but I don't like having two loop constructs where one could suffice. Currently, I'm using something like that (I have a 'each_) but this sucks IMO.

However, it's interesting that you chose to use 'it' and not '_'. It makes me realize that maybe it would be better if there were only one of the two. They are a bit redundant, aren't they?

   (aif (car a) (prn _))  
   ; I like this, because "_" is more visible than "it", less likely 
   ; to be confused with a "normal" variable, and less English-centric 
Or:

   ([* it it] 2)  ; hmmm don't like that so much actually
This is pure personal taste however.

In the same way, and this is also personal taste, I mention this because American or British people may not think about this kind of thing: I prefer "list?" over "alist" because anyone used to the Latin alphabet will quickly understand what it does (and even that is not the majority of people...). True story: I didn't understand what "acons" and "alist" meant when I first looked at Arc (I'm French), while the Scheme way of naming predicates was obvious at first sight.

> CL's loop

Ouch, I really wouldn't copy 'loop too closely, though.

In a way, it is a good construct, I mean it is very powerful, but I can never remember the syntax for basic things, and this is why I prefer the Arc way here. 'loop is really a DSL in itself.

Thanks again!

-----

2 points by absz 5624 days ago | link

Glad I could be of assistance :)

As for the it/_ question: I hadn't noticed until writing my post that _ and it (and self in afn) were serving similar purposes; I don't really know which one I like more, though that's not something I care heavily about. (My feeling about the English-centricity problem is that if everything else in the language is already in English---if, each, and, etc.---then changing it to _ probably wouldn't make a huge difference.)

As for the function-naming issue: I also prefer the Scheme naming convention, though not for multiple-language reasons (I'm an American [mais je peux parler Français]); that's an aspect I hadn't thought of. (The a-predicate versus anaphoric thing is already annoying.) Internationalization of programming languages is a hard problem that I don't think anybody's tried to tackle, and I'm not sure how tractable it is---it would be interesting to see a language or framework which tried to explore that design space. The obvious approach is something like

  (mac si (c v f) `(if ,c ,v ,f))
  (= chaque each)
  ; etc.
The downside is that this only internationalizes the core; any library still has to support these synonyms or have a translator, and so I'm not sure how big the eventual gain is. (I think AppleScript had this once, but since applications didn't support this, that feature died.)

In my opinion (and I've barely used CL), loop is a brilliant piece of code because it's a great little DSL for iteration; at the same time, also having simpler alternatives, e.g. each, available is really handy because that way you don't need to know everything loop can do all the time.

-----

2 points by palsecam 5623 days ago | link

> then changing it to _ probably wouldn't make a huge difference

I also don't think it would make a huge difference for non-English-speaking programmers, although it may be a small win. But in my case, I don't prefer "_" just because of that. I'd say the main reason is that "_" is more easily recognized by my parser (aka brain) as a special variable than "it", because it is an uncommon glyph. "it" gets lost in the flow of the program.

> I'm an American [mais je peux parler Français]

Now, that is not common and that is cool. I mean, not that you know how to speak French, but that you know how to speak a foreign language.

Those who know no foreign language know nothing of their mother tongue. -- Goethe

It's not a coincidence that Larry Wall (Perl creator) studied linguistics...

> that's an aspect I hadn't thought of

Well in this case I can go to sleep happy, knowing I made you aware of something new ;-)

> a-predicate versus anaphoric thing is already annoying.

Yes! When I realized 'acons & the like were predicates, well I thought 'afn was, too...

> loop is a brilliant piece of code because it's a great little DSL for iteration; at the same time, also having simpler alternatives, e.g. each, available is really handy

Totally agreed. 'loop is something great to have as a library or something, but if it's the only way you have to do basic iteration, well, it sucks IMO, because on top of knowing CL you have to know the 'loop stuff, which is completely different. Gosh, writing this message I realize I don't even remember how to iterate over a list while printing its elements. But yes, 'loop is extremely powerful.

Anyway, thanks once again absz, your message was really interesting to read!

-----

1 point by palsecam 5625 days ago | link

The patch (diff -Nurp):

   --- ac.scm.orig	2009-08-04 21:04:24.000000000 +0200
   +++ ac.scm	2009-08-04 21:21:33.000000000 +0200
   @@ -27,6 +27,7 @@
            ((eq? (xcar s) 'if) (ac-if (cdr s) env))
            ((eq? (xcar s) 'fn) (ac-fn (cadr s) (cddr s) env))
            ((eq? (xcar s) 'assign) (ac-set (cdr s) env))
   +        ((eq? (xcar s) 'defined) ac-defined)  ; to be called later in 'ac-mac-call
            ; the next three clauses could be removed without changing semantics
            ; ... except that they work for macros (so prob should do this for
            ; every elt of s, not just the car)
   @@ -254,6 +255,14 @@
                     ,(ac (cadr args) env)
                     ,(ac-if (cddr args) env)))))
 
   +(define (ac-defined x env)  ; called only via macro
   +  (if (pair? x)
   +      (and (ac-defined (car x) env)
   +	       (ac-defined (cdr x) env))
   +      (or (literal? x)
   +	      (lex? x env)
   +	      (bound? x))))
   +
   (define (ac-dbname! name env)
     (if (symbol? name)
         (cons (list name) env)
   @@ -462,8 +471,9 @@
 
   (define (ac-mac-call m args env)
      (let ((x1 (apply m (map ac-niltree args))))
   -    (let ((x2 (ac (ac-denil x1) env)))
   -      x2)))
   +    (if (procedure? x1)  ; 'defined call
   +	  (x1 args env)
   +	  (ac (ac-denil x1) env))))
  
    ; returns #f or the macro function

-----