Representing JavaScript dot notation has been a challenge for me with the arc=>js compiler. Which way would you prefer to write the following JS expression?
That's an excellent idea. Instead of Arc ssyntax and JavaScript dot notation being at odds, they should work together.
I think this starts to get into the more interesting namespace issues raised in http://arclanguage.org/item?id=11920 that I've totally dodged up to this point.
So, if it's document!body and (document 'body), then document can be an Arc table. But if it's document.body and (document body), then will it need to be a function or macro instead? Hmm... I may need to play with this for a while.
I've got something that covers most of the cases now. I used to take a quoted Arc expression and start translating it immediately. Now I do a recursive ssexpand first:
By the time the compiler's primary function (js1) sees x.y, it has already been expanded to (x y). Then the corresponding JS function call can be printed:
arc> (js `foo.bar)
foo(bar);
arc> (js `(foo bar)) ; prints the same
foo(bar);
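To illustrate the recursive ssexpand pass described above, here is a minimal sketch in JavaScript. It is not the compiler's actual Arc code (which isn't shown in this thread). It assumes forms are modelled as nested arrays of strings, it handles only the . and ! ssyntax, and the names expandSymbol and ssexpandAll are made up for this example.

```javascript
// Sketch only: Arc forms are modelled as nested arrays; symbols are strings.
// Only "." and "!" are handled; ":" and "~" are left out for brevity.
function expandSymbol(sym) {
  if (typeof sym !== "string") return sym; // numbers etc. pass through
  // "a!b.c" splits into ["a", "!", "b", ".", "c"].
  const parts = sym.split(/([.!])/);
  // No ssyntax, or a shape this sketch skips (such as a leading ".b").
  if (parts.length === 1 || parts.includes("")) return sym;
  let form = parts[0];
  for (let i = 1; i < parts.length; i += 2) {
    // x!y becomes (x 'y); x.y becomes (x y). Both nest to the left.
    const arg = parts[i] === "!" ? ["quote", parts[i + 1]] : parts[i + 1];
    form = [form, arg];
  }
  return form;
}

function ssexpandAll(expr) {
  if (!Array.isArray(expr)) return expandSymbol(expr);
  // Assumption: quoted data is left alone rather than expanded.
  if (expr[0] === "quote") return expr;
  return expr.map(ssexpandAll);
}
```

With this, ssexpandAll("foo.bar") gives ["foo", "bar"], so a later compile step only ever sees plain (x y) forms.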
For ! ssyntax, document!body now expands to (document 'body) as it should. Then the correct JS is printed:
It turned out like the one fallintothis proposed. (Thank you, by the way.) Note that the function wrapping is deliberate. [1]
But how does the compiler differentiate a function call from object access? Well, the implementation is naive: if there is only 1 arg and it's quoted, consider it object access; else it's a function call. So it no longer allows function calls with a single quoted argument (hence, "something that covers most of the cases"). The only way I could think of to get around this is to check the type of the item in functional position, but that can't be done until runtime. Any ideas?
: ssyntax works now too! I defined compose and you can see that the equivalent expressions print the same:
arc> (js `(car:cdr (list 1 2 3)))
(function(){
var g13954=arraylist(arguments);
return apply(cdr,g13954).car;
})(list(1,2,3));
arc> (js `((compose car cdr) (list 1 2 3)))
(function(){
var g13956=arraylist(arguments);
return apply(cdr,g13956).car;
})(list(1,2,3));
You wouldn't be able to tell that they work from this, of course, since you're reading macroexpansions with gensyms. (Plus some of the functions are defined in JavaScript.) When executed in a browser, though, they both evaluate to 2.
So I guess the problem that inspired this poll is mostly solved now! That is, unless I've neglected something critical, made a terrible mistake and need to revert. ^_^
[1] (= x y) should compile to (function(){return x=y;})() rather than x=y for the same reason (if x y z) should compile to (x?y:z) rather than if(x)y;else z;. [2] (Thanks, rocketnia.)
but there's probably little need to use quotation in JavaScript anyway.
(= x y) should compile to (function(){return x=y;})()
I don't know why you suppose (x=y) wouldn't work for that case, but at least your approach will generalize consistently to the case (= x y z w). ^_^
On another note, you may want to change your function block format to "(function(){ ... }).call(this)". That way this has the same value inside and outside the block. This is especially relevant for something as basic as assignment; if I'm planning to use a function as a constructor and I say things like (= this!x 10), I'll be frustrated if I end up setting properties on the window object. :-p
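A small JavaScript sketch of the difference, for anyone who wants to see it run. The constructor names Plain and WithCall are invented for this example; the only thing being compared is what this refers to inside the wrapper function.

```javascript
// A plain immediately-invoked function gets its own `this`
// (the global object, or undefined in strict mode).
function Plain() {
  this.sameThis = (function () { return this; })() === this;
}

// Invoking the wrapper with .call(this) passes the outer `this` through.
function WithCall() {
  this.sameThis = (function () { return this; }).call(this) === this;
}

const plain = new Plain();       // plain.sameThis is false
const withCall = new WithCall(); // withCall.sameThis is true
```

So a compiled (= this!x 10) inside a constructor would only touch the new object if the wrapper is invoked the second way.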
> at least your approach will generalize consistently to the case (= x y z w).
Yes, that's the reason. (= x y) was a poor example since you don't actually need the wrapping function for single assignment. In fact, I've provided a way to do it without:
arc> (js `(assign x y))
x=y;
Feels like cheating to compile assign to = (if you did it in Arc, you'd have a circular definition! ^_^), but I don't think JavaScript provides a more primitive assignment operator.
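As a rough illustration of why the wrapping function generalizes to (= x y z w), here is a sketch in JavaScript. compileAssign is a hypothetical helper, not part of the compiler; it assumes the places and values have already been compiled to JS fragments and are passed as alternating strings.

```javascript
// Hypothetical helper: compileAssign("x", "y", "z", "w") models (= x y z w).
function compileAssign(...args) {
  if (args.length === 0 || args.length % 2 !== 0) {
    throw new Error("expected place/value pairs");
  }
  const assignments = [];
  for (let i = 0; i < args.length; i += 2) {
    assignments.push(args[i] + "=" + args[i + 1]);
  }
  // Every assignment but the last becomes a statement; the last one is
  // returned, so the whole thing is still a single JS expression.
  const last = assignments.pop();
  const body = assignments.map((a) => a + ";").join("");
  return "(function(){" + body + "return " + last + ";})()";
}
```

For a single pair this prints (function(){return x=y;})(), and for two pairs (function(){x=y;return z=w;})().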
> That way this has the same value inside and outside the block.
Very astute! ^_^ I had become aware of the problem of this changing values but didn't know how to fix it. I will try your 'call approach soon. Thanks a lot!
arc> (ssexpandall ''dont-expand-me.im-quoted)
Another good point. Something else was making me question my ssexpandall formulation so this is going on my TODO. I think you're right that quotation isn't critical in JavaScript, but I do want to compile it correctly. I hope to eventually support eval:
I doubt quasiquotation will be supported though. (How would you compile that anyway, string concatenation?) I don't think JavaScript has anything quite like quasiquotation, which is probably why it doesn't have macros (which is in large part why lisp=>js compilers are attractive in the first place [1]). Additionally, there's an elegance to reserving unquote for escaping Arc code, which would be difficult to do if you compiled quasiquotation.
Edit: Hmm... that last paragraph may not be very well reasoned. Maybe string concatenation - or rather concatenating strings with unquoted expressions - is in fact a good counterpart to quasiquotation and I should compile it, even though JavaScript doesn't have macros. What do you think?
I think the way you're going to go about it, by having (quote ...) forms be compiled, has a bit of a caveat. If you're already planning to have a!b expand to (a 'b) and compile to "a.b", then won't (js '(eval 'foo)) just result in "eval.foo"?
Maybe a!b should expand to (ref a "b") or something, where (ref a b) compiles to "a[b]" for most arguments but compiles to "a.contentsOfB" when the second argument is a literal string that counts as a JavaScript identifier. (The second case could be totally left off to make things easier; document['getElementById']('foo') is still a method call, and regular property gets and sets work too.)
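A minimal sketch of that (ref a b) rule in JavaScript, under some assumptions: compileRef is a made-up name, the object is passed as an already-compiled JS fragment, and the key is either { literal: "..." } for a literal string or { code: "..." } for any other compiled expression. The identifier check is ASCII-only and ignores reserved words.

```javascript
// Simplified test for "counts as a JavaScript identifier" (ASCII only).
const IDENTIFIER = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

function compileRef(objCode, key) {
  if (typeof key.literal === "string") {
    // Literal string keys can use dot notation when they are identifiers.
    return IDENTIFIER.test(key.literal)
      ? objCode + "." + key.literal
      : objCode + "[" + JSON.stringify(key.literal) + "]";
  }
  // Anything else is a computed property access.
  return objCode + "[" + key.code + "]";
}
```

Here compileRef("a", { literal: "b" }) gives a.b, while compileRef("a", { code: "b" }) gives a[b].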
All that being said, I bet you already support eval(), in a way:
what would be (js '(eval '(alert (eval '(+ 1 2)))))
is expressed as (js `(eval ,(js `(alert (eval ,(js '(+ 1 2)))))))
The difference here is just syntax sugar, IMO. (Saving parentheses is a fine goal of syntax sugar, though!)
Maybe string concatenation ... is in fact a good counterpart to quasiquotation and I should compile it...
That feature would be a bit more difficult to simulate if 'js didn't support it intrinsically. Here's a quick approach:
(mac jswith (bindings . body)
  `(fn-jswith (list ,@(map .1 bindings))
              (fn (,(map .0 bindings)) ,@body)))

(def fn-jswith (vals body)
  ; We're adding the suffix "v" to each name so that it isn't the
  ; prefix of any other name, as might happen with gs1234 and gs12345,
  ; for instance. Note that it still counts as a JavaScript identifier
  ; with this suffix.
  (withs (strnames (map [string (uniq) 'v] vals)
          names (map sym strnames))
    `( (fn ,names
         (eval ,(multisubst (map [list (+ "('+" _ "+')") _)] strnames)
                            (js do.body.names))))
       ,@vals)))

(mac jslet (var val . body)
  `(jswith (,var ,val) ,@body))
now what would be (js '(eval `(+ 1 ,foo)))
is expressed as (js `(eval ,(jslet f 'foo
(js `(+ 1 ,f)))))
where the final form sent to 'js is
(eval ((fn (gs1001v) (eval "'(1+('+gs1001v+'))'")) foo))
or expressed as (js:jslet f 'foo
(js `(eval ,(js `(+ 1 ,f)))))
where the final form sent to 'js is
((fn (gs1001v) (eval "'eval(\'(1+('+gs1001v+'))\')'")) foo)
I do feel that this difference is more than sugar, since the 'foo subexpression is moved out of context.
Also, more importantly, it has a security leak I'm not sure how to fix. The call to 'subst doesn't pay attention to the meaning of what it's replacing. If an attacker is able to get a string like "gs1001v" into a forum post or username or whatever in the server data, and then that string is embedded as a literal string in JavaScript code which is processed as the body of a 'jslet, something wacky might happen, and the attacker will be in a position to arrange things so that just the wrong wacky things happen.
If you just make a way to put identifiable "holes" in the compiled JavaScript, you'll remove the need to resort to blind string substitution here. The holes could be as simple as names surrounded by delimiters which you guarantee not to appear elsewhere in the result (even in string literals); that way a string substitution approach doesn't have to be blind. The holes could help you implement 'quasiquote, and conversely, if you implement 'quasiquote, there might not be much of a need for the holes.
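One way the "holes" idea might look, sketched in JavaScript. Everything here is an assumption for illustration: the delimiter is the control character U+0001, the helper names hole, stringLiteral and fillHoles are invented, and string literals are emitted with JSON.stringify, which escapes control characters so the raw delimiter can never appear inside a compiled literal.

```javascript
const DELIM = "\u0001"; // assumed never to occur raw in compiled output

// Marks a named hole in the compiled JavaScript.
function hole(name) {
  return DELIM + name + DELIM;
}

// Emits a JS string literal; control characters come out as \uXXXX escapes.
function stringLiteral(s) {
  return JSON.stringify(s);
}

// Replaces each hole with the given code fragment, wrapped in parentheses.
function fillHoles(template, values) {
  const pattern = new RegExp(DELIM + "([^" + DELIM + "]*)" + DELIM, "g");
  return template.replace(pattern, (match, name) => {
    if (!Object.prototype.hasOwnProperty.call(values, name)) {
      throw new Error("no value for hole " + name);
    }
    return "(" + values[name] + ")";
  });
}

// The literal "gs1001v" is untouched; only the real hole is filled.
const template = "alert(" + stringLiteral("gs1001v") + "+" + hole("gs1001v") + ")";
const filled = fillHoles(template, { gs1001v: "foo" });
```

Because substitution only looks for delimited names, user data that happens to contain the text gs1001v is left alone.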
Additionally, there's an elegance to reserving unquote for escaping Arc code, which would be difficult to do if you compiled quasiquotation.
Well, the Arc quasiquotes are processed before 'js even sees the input, right? Here's the only problem I see (and maybe it's exactly what you're talking about):
A typical way to escape from two levels of nested Arc quasiquotes is ",',", as in `(a `(b ,c ,',d)). That constructs something that includes a (unquote (quote ...)) form, so it only works when you're sending the result somewhere where unquote-quotes don't matter (like, to be evaluated as Arc). So ideally, the 'js meanings of 'quasiquote and 'quote should have this property. I don't think this would be especially hard to guarantee, but it might be easy to miss.
(Note that if Arc's quasiquotes didn't nest, the same example would be expressed as `(a `(b ,',c ,d)), and no unquote-quote would hang around to be a problem. I'm beginning to wonder if nesting quasiquotes are a wart of Arc.)
It appears I've reinvented a worse version of your wheel. ^_^
Your ssexpand-all is superior and I'm using it now. I did try to refactor it, thinking there must be a function f (like my ssexpandif but more sophisticated) that satisfies
(treewise cons f expr)
while producing the same functionality, but I haven't been able to determine what that would be.
arc> (ssexpand 'a:.b)
(compose a .b)
arc> (ssexpand '.b)
(get b)
So, we need to recurse in the f argument anyway. At a certain point, it seems like the anonymous & higher-order functions add layers of indirection on what should just be a straightforward recursive definition.
I get really annoyed at that, though, when working with trees in Arc. There always seems to be some underlying pattern that's just different enough that I can't abstract it into a higher-order function.
The table-like syntax is nice, but it has the following problem.
Let's say you have the expression:
a!b!c!d!e!f
If you now want to replace the a!b part with (a!b 4), you end up with:
(((((a!b 4) 'c) 'd) 'e) 'f)
Unless I'm missing something, there is no way to have ssyntax for the part after the first set of parentheses. If it was the f that gained parentheses, it would not affect the rest of the expression;
(Before you get too excited, I'm not the person you replied to. ^_^ )
First, for your particular example, you could just do this:
a!b.4!c!d!e!f
If you need (a!b 4 5), it does get more complicated, and I've gotten a bit annoyed about that myself. Nevertheless, there's still a way (albeit a way which requires a bunch of typing to refactor into):
(!f:!e:!d:!c:a!b 4 5)
You know, I bet this would create an awful lot of JavaScript. XD
In practice, I rarely have more than three chained property accesses in C-like languages, or more than four things connected by ssyntax in Arc, so I think I'd just add one or two sets of parentheses and live with it. I won't pretend my case is typical, though. :-p
No, no, I trust that your translation is correct. I was just disappointed that it would compile down to this much JS code since my example was designed to model a.b(4).c.d.e.f.
I don't have a running Arc to check it on at the moment because mzscheme 372 does not compile for me (probably my gcc version is too new).
Ah, I see. Yes, at the moment this compiler isn't very good at generating minimal JavaScript, since it's so faithful to arc.arc's macro definitions. A lot of the later work might involve optimizing it to produce smaller, more efficient JS.
Of course, you can still use (((((a!b 4) 'c) 'd) 'e) 'f) to generate a.b(4).c.d.e.f. [1]
> mzscheme 372 does not compile for me
Did you know Arc 3.1 works on the latest MzScheme? [2]
---
[1] Actually, you might be further disappointed to know (((((a!b 4) 'c) 'd) 'e) 'f) is currently compiling to:
get here is a JS function not unlike rocketnia's ref [3]. Its purpose is to disambiguate the Arc form (x y), which may compile to x(y), x[y] or (car (nthcdr y x)), depending on the type of x (function, array/object or cons, respectively).
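The compiler's real get isn't reproduced in this thread, so the following is only a guess at the shape of such a runtime dispatcher, written in JavaScript. It assumes cons cells are plain objects tagged with isCons and that nil is null; the cons helper exists only for this sketch.

```javascript
// Assumed representation of an Arc cons cell.
function cons(car, cdr) {
  return { car: car, cdr: cdr, isCons: true };
}

// Runtime meaning of the Arc form (x y), chosen by the type of x.
function get(x, y) {
  if (typeof x === "function") {
    return x(y); // function call
  }
  if (x !== null && typeof x === "object" && x.isCons) {
    // List indexing, i.e. (car (nthcdr y x)).
    let cell = x;
    for (let i = 0; i < y && cell !== null; i++) {
      cell = cell.cdr;
    }
    return cell === null ? null : cell.car;
  }
  return x[y]; // array or object access
}
```

One thing this sketch does not preserve is this: calling x(y) on a method pulled off an object loses the receiver.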
I wrestled with this disambiguation problem for some time and finally settled (for now ;) on a simple inference system based on the most common use cases. The algorithm is:
1. If the form has a single quoted arg, as in (x 'y), it's compiled to x['y']. This allows object access chains like document!body!innerHTML to be compiled correctly by default.
2. If the form has 0 or 2+ args, or 1 arg that isn't quoted, then it's considered a function call:
(x) => x()
(x y) => x(y)
(x y z) => x(y,z)
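The two rules above can be sketched as a tiny compile step. This is a JavaScript illustration rather than the compiler's Arc source; it assumes forms are nested arrays with symbols as strings and numbers as numbers, and the names isQuoted and compile are invented here.

```javascript
// True for a form shaped like (quote sym).
function isQuoted(form) {
  return Array.isArray(form) && form.length === 2 &&
    form[0] === "quote" && typeof form[1] === "string";
}

function compile(form) {
  if (isQuoted(form)) return "'" + form[1] + "'"; // quoted symbol as a string
  if (!Array.isArray(form)) return String(form);  // symbol or number
  const head = compile(form[0]);
  const args = form.slice(1);
  // Rule 1: exactly one quoted arg means object access.
  if (args.length === 1 && isQuoted(args[0])) {
    return head + "['" + args[0][1] + "']";
  }
  // Rule 2: everything else is a function call.
  return head + "(" + args.map(compile).join(",") + ")";
}

// (((((a!b 4) 'c) 'd) 'e) 'f) after ssyntax expansion:
const chain = [[[[[["a", ["quote", "b"]], 4], ["quote", "c"]],
  ["quote", "d"]], ["quote", "e"]], ["quote", "f"]];
const chainJs = compile(chain); // "a['b'](4)['c']['d']['e']['f']"
```

Under these rules a call like (x 'y) can never come out as x('y'), which is the gap the options below try to fill.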
I'm still looking into the least kludgy way to pass a single quoted arg to a function. Here are some options:
(x "y")
(x `y) ; quasiquote isn't currently used for anything else
(x 'y nil) ; the function can just ignore the nil arg
(fncall x 'y)
I don't know. If it comes up often enough, I think I'd rather have a special (fncall x 'y) ssyntax. Maybe x!y could expand to (fncall x 'y) and x.`y could expand to (x 'y).
I had assumed that since x.'y was read as two distinct symbols, x.`y would be too, but it's not the case:
arc> 'x.'y
x.
arc> y ; still evaluating previous expr
arc> 'x.`y
|x.`y|
Any idea why these are treated differently? Whatever the reason, it means I can use x.`y without hacking the reader. So, thanks for pointing this out to me! ^_^
I'm currently torn about whether to do
x!y => (x 'y) => (fncall x 'y) => x('y')
x.`y => (x `y) => (objref x 'y) => x['y']
as you suggested, or the reverse. Leaning toward your way so that functions are totally normal and objects special, rather than having functions with a single quoted arg be some exception.
This example works particularly well because the $("a") jQuery selector can be compiled from $!a. A challenge arises with more complex selectors, as in this snippet from the Find Me: Using Selectors and Events tutorial:
Since $("#orderedlist") has the special character #, we're unable to compile it from $!#orderedlist. Either most of the ssyntax has to be sacrificed for parens, as in
Not quite sure (I suspect it's a bug), but it seems like it has to do with the implementation of make-readtable (which brackets.scm uses).
$ mzscheme
Welcome to MzScheme v4.2.1 [3m], Copyright (c) 2004-2009 PLT Scheme Inc.
> (parameterize ((current-readtable #f)) (read))
x`y ; read in as two items
x
> y
> (parameterize ((current-readtable (make-readtable #f))) (read))
x`y ; read in as one symbol
|x`y|
In fact arc3.1 even works on Racket, the new PLT Scheme. Only thing is that the command-line "racket" prints a newline after the "arc>" prompts, for some reason. But you can open as.scm with the editor DrRacket (as you could with DrScheme), set the language to be "Pretty Big", and hit Run; it will work.
For some reason, now I don't notice any issues with the "arc>" prompt in "racket" either. And I don't think I'm doing anything differently than I was before. ...I am forced to conclude that, when entering things into the REPL, I held down the return key long enough that it accepted an extra (blank) line of input. This explains the behavior exactly. Strange that I should have done this several times in a row... and how embarrassing. Oh well. At least now I can give racket a clean bill of health.
That is a known issue with Windows. (I'm guessing it's the reason arc3 is still the "official" version on the install page.) Simple workaround[1]: Find the line that says:
Could you talk about your decision to use it for Readwarp then? If Arc's not really ready for production use, might it still be a good choice for a certain minority of developers?
Yeah, I'm not trying to say you shouldn't use it for production use :)
They're opposing perspectives. As a user of arc I'd throw it into production[1]. At the same time, from PG's perspective I'd want to be conservative about calling it production ready.
I suspect arc will never go out of 'alpha' no matter how mature it gets, just because PG and RTM will not enjoy having to provide support, or having to maintain compatibility.
[1] With some caveats: treat it as a white box, be prepared to hack on its innards, be prepared to dive into scheme and the FFI. And if you're saving state in flat files, be prepared for pain when going from 1 server to 2.
> The table-like syntax is nice, but it has the following problem. [...] Unless I'm missing something, there is no way to have ssyntax for the part after the first set of parentheses.
Yes, this is sometimes a problem for me too, or at least an annoyance. It's one of those things that's a bug or feature depending upon who you ask, though. [1] Whichever way you classify it, the root issue is with Arc, not the compiler, which just conforms to Arc's ssyntax rules.
Interesting formulation, but the inner parens' inclusion of value makes it look like value is another argument in the function call. It also might be too similar to dotted cons notation, e.g. '("foo" . value).
infix is my favorite. Unfortunately, it has been the hardest to implement as well. ^_^ (Maybe I'm just going about it the wrong way.) Implementing the other options, however, was quick and painless.
pedantic prefix is the more technically correct prefix option, but it's so cumbersome to read and write that I've pretty much ruled it out already.
prefix gets you fewer parens and dots than its pedantic counterpart, but its grouping misleads by suggesting that "foo" is passed to a function getElementById. ("foo" is really passed to a function document.getElementById.)
infix-prefix hybrid corrects the grouping problem prefix has, but it's kludgy to have the two different forms of dot notation mixed together.
("foo" is really passed to a function document.getElementById.)
Right... but I wonder if you understand what I understand by that. If you allow for (document.getElementById "foo"), then document.getElementById will need to have a different meaning in functional position than in non-functional position, just like "foo.x()" and "(true && foo.x)()" have different meanings in JavaScript. (The first one uses foo as this in x, and the second one uses the window or whatever as this. The "foo.x()" form amounts to one syntax, even though the "foo.x" part can be wrapped in grouping parentheses without changing the meaning.)
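The JavaScript behaviour being described can be checked directly. In this small sketch foo is just a throwaway object whose method returns its own this.

```javascript
const foo = { x: function () { return this; } };

// Method call syntax: foo is the receiver.
const direct = foo.x() === foo;             // true

// Grouping parentheses alone do not change the meaning.
const grouped = (foo.x)() === foo;          // true

// Any real expression in between detaches the function from foo,
// so `this` is the global object (or undefined in strict mode).
const detached = (true && foo.x)() === foo; // false
```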
The only example you give that doesn't require this kind of treatment, and in that way is the least misleading, is prefix, 'cause the \. form could be a simple macro.
On the other hand, the treatment isn't especially outlandish, 'cause Arc does the same thing; (a:b c) is different from ((and t a:b) c), thanks to the fact that a (compose ...) form is given special treatment in functional position (where it's the head of a metafn call). So with the right programmer-bewares in your documentation, you could use all-new ssyntax and metafn rules and mimic JavaScript's quirky invocation in a way that still works somewhat like Arc.
With that approach in mind, selected parts of the expansion might go like this (except probably depth-first, rather than breadth-first):
Note that Arc's 'get doesn't expand as a metafn that way; 'get only gets special treatment in setforms. Here, I've treated 'jsget so that "foo.a(1).b(2).c(3)" can be expressed as ((.c ((.b (foo.a 1)) 2)) 3), as opposed to something even more ridiculous. (Er, but (jscall (jscall (foo.a 1) "b" 2) "c" 3) is more readable IMO, and sooner or later you'll probably want to define a macro for (dots foo (a 1) (b 2) (c 3)) like the prefix option anyway, so maybe the whole 'jsget issue is pointless. :-p )
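For what the (dots foo (a 1) (b 2) (c 3)) idea might produce, here is a hypothetical JavaScript sketch. The dots function below is an assumption made for illustration: it builds the output string from a base fragment and a list of steps, where a step is either a bare property name or an array of a method name followed by already-compiled arguments.

```javascript
// dots("foo", ["a", 1], ["b", 2], ["c", 3]) models (dots foo (a 1) (b 2) (c 3)).
function dots(base, ...steps) {
  return steps.reduce((code, step) => {
    if (typeof step === "string") {
      return code + "." + step; // plain property access
    }
    const name = step[0];
    const args = step.slice(1);
    return code + "." + name + "(" + args.join(",") + ")"; // method call
  }, base);
}
```

So dots("foo", ["a", 1], ["b", 2], ["c", 3]) yields foo.a(1).b(2).c(3), and each call keeps its receiver because the output uses ordinary method call syntax.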
So yeah, I hope these complications help show you the way. If they lead you screaming in another direction, that's progress too, eh? ^_^
We can combine the following macro with the ":" shortcut:
(mac c ((obj (func . args)))
'(call ,obj ,func ,@args)))
(= document!body!innerHTML
((c:document:getElementById "foo") value))
It will translate to:
(= ((document 'body) 'innerHTML)
((call document getElementById "foo") value))
Also a nitpick: your c macro needs quasiquote and has too many parens:
(mac c (obj (func . args))
`(call ,obj ,func ,@args))
I'm not quite sure how the args in that macro are supposed to work. Is it somehow taking advantage of destructuring bind, or why is it (obj (func . args)) instead of just (obj func . args)?
Thanks for the suggestion. Looking forward to hearing you elaborate.