Arc Forum | I'm not really sold on the need for a CL-Arc.

2 points by wfarr 6658 days ago | link | parent

I'm not really sold on the need for a CL-Arc.

3 points by raymyers 6658 days ago | link

Besides access to all the CL libraries, I think it would be worth doing for speed alone. Not everyone has (yet) used Arc in a situation where the speed became an issue, but it does happen pretty easily (just ask almkglor).

In SBCL, I am consistently able to achieve results within 30% of C programs. Without type annotations the speed is still quite respectable.

If we ever want Arc to be usable in the wild for a diverse range of applications, we will need a compiler that can generate fast code. This is one way of getting there.

-----

4 points by almkglor 6658 days ago | link

Well, there are still ways to break the speed barrier. For instance the main reason I made cached-table was in order to cache the results of parsing the wikitext in Arki (but not keeping them for too long, since that would simply be a waste of memory). This is even arguably the "correct" thing to do and I might not have done it yet if Arc hadn't been so slow ^^.
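The caching idea described here (keep an expensive result around, but not for too long) can be sketched roughly as follows. This is in Python rather than Arc, purely for illustration, and it is not the actual cached-table code: `ExpiringCache`, `max_age`, and the injectable `clock` are hypothetical names.

```python
import time

class ExpiringCache:
    """Cache computed values, but forget them after max_age seconds."""

    def __init__(self, compute, max_age, clock=time.monotonic):
        self.compute = compute      # expensive function, e.g. a wikitext parser
        self.max_age = max_age      # seconds an entry stays valid
        self.clock = clock          # injectable so the sketch is easy to test
        self.entries = {}           # key -> (timestamp, value)

    def get(self, key):
        now = self.clock()
        hit = self.entries.get(key)
        if hit is not None and now - hit[0] < self.max_age:
            return hit[1]           # still fresh: skip the expensive call
        value = self.compute(key)
        self.entries[key] = (now, value)
        return value

    def purge(self):
        """Drop stale entries so they do not waste memory."""
        now = self.clock()
        stale = [k for k, (t, _) in self.entries.items() if now - t >= self.max_age]
        for k in stale:
            del self.entries[k]
```

The trade-off is the one mentioned above: a longer `max_age` saves more parsing work but holds on to more memory.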

"When I was a noob, I talked like a noob, I thought like a noob, I reasoned like a noob. When I became a hacker, I put noobish things behind me. Now we see but a poor implementation as in an alpha; then we shall see code to code. Now I code in part; then I shall code in full, even as I am full of code. Now these three remain: hubris, impatience, laziness. But the greatest of these is laziness."

-----

1 point by raymyers 6658 days ago | link

Larry Wall, I think.

Yes, you can always optimize your Arc code to mitigate the problem. My point still stands though. If we want to be able to use Arc in production situations, we will eventually crave an industrial-strength Arc compiler.

This becomes even more obvious if you take seriously the notion of "A Hundred-Year Language".

-----

1 point by almkglor 6658 days ago | link

Actually a paraphrase of the Christian Bible's 1 Corinthians 13:11-13, although yes, the three virtues are the Larry Wallian ones ^^

-----

1 point by eds 6658 days ago | link

Wow, did you come up with that yourself? I'd like to quote it if you don't mind.

-----

1 point by almkglor 6657 days ago | link

Err, yeah. I tend to pattern match on lots of things - I've got some weird takes on Buddhist philosophy somewhere on the boards too.

-----

1 point by wfarr 6658 days ago | link

We ought to focus on what gap Arc is really meant to fill: quick prototyping, and easy deployment of applications. It is not designed to crunch numbers at the speed of light, or cure cancer. Web applications simply don't need the kind of speed that C et al. do. If they did, we wouldn't be using Python or Ruby either.

We should focus on improving what Arc does, rather than trying to morph it into the behemoth Swiss Army knife that is Common Lisp.

Not to discourage you from your own pursuits regarding this, but I think as a whole, the Arc community would be better off putting its focus elsewhere.

-----

5 points by raymyers 6657 days ago | link

I can't speak for the whole of "the Arc community" any more than you can, but...

I'm not trying to turn Arc into Common Lisp, I'm advocating making Arc fast. There's a difference.

CL was a behemoth because it was a union by committee of all the cruft in every popular Lisp dialect circa 1980. One of Arc's goals is to design from a clean slate. There is no pressure to conform to bad or inconsistent decisions of the past. None of that has anything to do with performance.

The question becomes: Do we want to make a real general purpose language, or do we want Arc to become "yet another scripting language"?

This is a long term question. We may not need a compiler tomorrow, but I'm not ready to say for the long haul: "Arc is for exploratory programming and rapid prototyping. It will never scale. You shouldn't write ray tracers in it, or image processing, or games. It's fine for web apps -- unless they get a lot of traffic."

-----

3 points by sacado 6657 days ago | link

The funny thing is that, even in some of my webapps, I always needed some more speed, at one moment or another. The last example I have in mind : I wanted to generate a chart showing the use of a big system. As the chart's look depends on a few criteria given by the user, it has to be generated on demand. Well I had to dig into a DB containing millions of items, then mix these items together (that couldn't be done in SQL) and finally generate the chart.

Glad I had Psyco there. If I hadn't had it, or at least Pyrex, I would have probably dropped Python because writing C extensions for it is quite painful. And that's also the reason why I never really used Ruby, despite its cool features.

And I don't want to write prototypes and say : "Hmm... My code is working now, let's write it in a serious language for the production version".

Look at Scheme anyway. We really can't say it's a language focused on speed or designed to crunch numbers. Well, look at Ikarus. Oh, yes, for number crunching, you might prefer Stalin. Even C looks slow when compared to Stalin.

Of course, this shouldn't be the main focus of the community, and I don't even think the language should be designed with speed in mind (well, a little of course, or else we would have dead slow numbers implemented with lists of symbols) but that it should be seriously taken into consideration.

-----

3 points by Jesin 6656 days ago | link

Yes, exactly. I think a major cause of all this railing against optimizations is all the newbies who have just learned to write programs running around shouting about efficiency. I was one of those just a couple of years ago. The problem with these newbies is that they're naive. They decide that something is fast or slow based on how efficient its most naive implementation sounds. It seems that they grasp the vague idea that optimized code tends to be longer than code that is not optimized, but rather than responding to that by not trying to optimize until they know what parts of the code are slow, they respond by assuming that approaches that take more lines of code are more efficient and therefore better.

A misplaced focus on speed is bad, and you should get it working before you make it fast, but that doesn't mean speed is a non-issue. If a program is slow enough to cause annoyance, that is a problem, and it should be fixed. Languages have to pay extra attention to speed issues. If programs written in a language are slow because of the language, and not because the programs themselves are badly written, there's something wrong with the language.

Another thing that newbies don't get is that well-built languages are usually optimized so that the more obvious and more commonly-used approaches are actually faster than that tangled mass of "optimized" code you just wrote. Profiling profiling profiling. Don't just guess.

So, the points are: performance should be a secondary concern, but secondary is still pretty high up on the list, and optimization should be based on information gathered with a profiler, not what sounds efficient or inefficient. Sorry for rambling, I hope this post contributes something to something. I just have a tendency to spew everything I have to say about a topic all in one place every now and then, even if only some of it is relevant. I guess you could boil this post down to a "me too", but only if you boiled it a lot.
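As a concrete illustration of "profile, don't guess", here is a minimal sketch using Python's standard `cProfile` and `pstats` modules (Python only because it is easy to run; the same habit applies to Arc). `slow_sum` and `profile` are made-up names for the example.

```python
import cProfile
import io
import pstats

def slow_sum(n):
    # Deliberately naive: builds a whole list just to add it up.
    return sum([i * i for i in range(n)])

def profile(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    # Sort by cumulative time and keep only the top few entries.
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
    return result, out.getvalue()

result, report = profile(slow_sum, 10000)
```

Printing `report` shows where the time actually went, which is the information an optimization decision should rest on.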

-----

3 points by almkglor 6657 days ago | link

Yes, but if we want quick prototyping, we also want transformation of the prototype to something we'll use, which might need to be faster than our mockup. Take for example Arki: the wikitext parser was written easily using a quick prototype on raymyers' treeparse, but in actual use, the wikitext parser was too slow. Unfortunately, there's no easy way to transform the parser structure to faster techniques (such as state machines), without doing it by hand, which rather undermines the purpose of the quick prototype. By optimizing the underlying parser, however, the prototype was still usable, and by exposing a few more particular combinations (such as nil-returning sequences, for when we only care that a syntax exists or doesn't exist, not the individual characters of the syntax), we can refine the prototype into something faster.
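The "nil-returning sequence" idea can be sketched with toy parser combinators. This is Python, not Arc, and it is not treeparse's actual API: `lit`, `seq`, and `seq_nil` are hypothetical names, and each parser is assumed to return `(matched, new_position)` or `None` on failure.

```python
def lit(text):
    """Parser matching a literal string."""
    def parse(s, i):
        if s.startswith(text, i):
            return list(text), i + len(text)
        return None
    return parse

def seq(*parsers):
    """Run parsers in order, collecting everything they matched."""
    def parse(s, i):
        out = []
        for p in parsers:
            r = p(s, i)
            if r is None:
                return None
            out += r[0]             # accumulate the matched characters
            i = r[1]
        return out, i
    return parse

def seq_nil(*parsers):
    """Like seq, but only checks that the syntax is there; keeps no result."""
    def parse(s, i):
        for p in parsers:
            r = p(s, i)
            if r is None:
                return None
            i = r[1]                # advance, but throw the match away
        return [], i
    return parse
```

When only the presence of a delimiter matters, `seq_nil` avoids building a result list that would be discarded anyway.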

-----

2 points by stefano 6657 days ago | link

Ikarus is a compiled scheme implementation. Porting Arc to it would lead to a real speed gain, without rewriting ac.scm in Common Lisp.

-----

2 points by sacado 6657 days ago | link

I tried that. Not very easy as Ikarus' hash-tables don't work like mzscheme's (there is no "equal" hash-tables). You would have to rewrite the reader too. And I'm not sure we would get all the stuff with sockets and networking. It doesn't have an FFI yet, either.

Well, actually, I'm not sure if implementing a language that is still being designed in the beta release of a compiler is a long-term solution... :) Maybe in a few months / years this could be done ? But as for now, I think porting it to CL would be easier.

-----

4 points by stefano 6657 days ago | link

In the long run, the best (and most difficult) solution would be to write an Arc compiler that translates Arc code directly into machine code.

-----

1 point by almkglor 6657 days ago | link

I also suggest that. In fact, looking at Chicken's implementation - stack == heap - is rather inspiring, because it shows exactly how a garbage-collected memory manager should be done: just decrement a pointer, in this case the stack pointer. Brilliant IMO. Wish I'd thought of that.
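The "allocation is just a pointer decrement" point can be sketched like this. It is a toy model in Python, not Chicken's actual implementation: `BumpArena` is a hypothetical name, and addresses are plain integers into an imaginary region.

```python
class BumpArena:
    """Fixed-size region where allocating only moves one pointer."""

    def __init__(self, size):
        self.size = size
        self.top = size             # grows downward, like a stack pointer

    def alloc(self, nbytes):
        if nbytes > self.top:
            raise MemoryError("arena full: a real runtime would collect here")
        self.top -= nbytes          # the whole cost of an allocation
        return self.top             # "address" of the new object

    def reset(self):
        """Once live data has been copied elsewhere, reclaim everything at once."""
        self.top = self.size
```

The interesting work (finding live objects and moving them out before `reset`) is left out of the sketch entirely.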

-----

2 points by eds 6658 days ago | link

I was thinking about working on a meta-circular native code compiler for Arc, but that is way beyond my abilities at the moment, so I came up with something I think I can actually accomplish.

At the very least, if I write this CL-Arc compiler, I'll be able to write games in Arc using existing CL libraries for SDL.

-----

4 points by almkglor 6657 days ago | link

Why not? Every higher-level language is just a macro on the assembly language of a computer. Model your native code as a list like (__asm (mov 42 ax) (ret)). Consing is just a call to a predefined cons: (__asm (push d) (push a) (call cons) (ret)) - or by applying the lessons of Lambda The Ultimate X (__asm (push d) (push a) (jmp cons)). ^^

Okay, okay, I was actually planning to do something like that a long time ago, haha. But I haven't (yet) found any holes in the basic concept of modelling every function call (f x) as a macroexpansion on (__funcall f x), which expands to an (__asm) form with the function call. Then for functions themselves (fn (x) (f1 x) (f x)), transform the last function call to a tail call: (__asm_let x (esp 4) (__funcall f1 x) (__tail_funcall f x)). Stuff like that ^^.
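A rough sketch of that expansion, with pseudo-instructions as tuples. This is Python standing in for Arc macros; `compile_call` and `compile_body` are hypothetical names, and return values, stack cleanup, and real registers are all ignored.

```python
def compile_call(fname, args, tail=False):
    """Expand a call (fname args...) into a list of pseudo-instructions."""
    code = [("push", a) for a in reversed(args)]   # push the last argument first
    if tail:
        code.append(("jmp", fname))                # tail call: no new return address
    else:
        code.append(("call", fname))
    return code

def compile_body(calls):
    """Compile a function body; only the final call is in tail position."""
    code = []
    for i, (fname, args) in enumerate(calls):
        code += compile_call(fname, args, tail=(i == len(calls) - 1))
    return code
```

For the body of (fn (x) (f1 x) (f x)), `compile_body([("f1", ["x"]), ("f", ["x"])])` emits a `call` for f1 and a `jmp` for f.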

-----

1 point by eds 6657 days ago | link

If you want to start writing an Arc native code compiler (in Arc), it just brings up a lot of difficult issues that I'm not sure how to deal with. A reader, GC, continuations, tail recursion, etc. all have to be implemented in Arc. You stop getting those for free once you remove the Scheme runtime from the picture.

So quite frankly, I have a lot to learn before I'll be able to take on a project like that.

-----

1 point by almkglor 6657 days ago | link

Encapsulation my friend, encapsulation. Just make a general sketch and leave the details a bit. Then write the details. Besides, you've already done it! Just s/CL/Arc/ your posted article, then s/compile to Arc/compile to assembly/

Sure GC is nontrivial, but Boehm's GC is not bad at all. And if you really need continuations/tail recursion then make everything continuation passing style (you'll probably need to anyway). And I'm sure raymyers' treeparse can help in the reader department.
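To show what "make everything continuation passing style" buys you, here is a small sketch in Python (which has no tail-call elimination, so a trampoline is used to keep the stack flat). `fact_cps`, `trampoline`, and `factorial` are names invented for the example.

```python
def fact_cps(n, k):
    """Factorial in continuation-passing style: k receives the result."""
    if n == 0:
        return lambda: k(1)                        # a thunk, not a direct call
    return lambda: fact_cps(n - 1, lambda r: lambda: k(n * r))

def trampoline(thunk):
    """Keep running thunks until a non-callable value (the answer) appears."""
    while callable(thunk):
        thunk = thunk()
    return thunk

def factorial(n):
    return trampoline(fact_cps(n, lambda r: r))
```

Every call returns immediately with "what to do next", so the host stack never grows with `n`; a compiler emitting jumps instead of calls gets the same effect without the trampoline.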

(IMO the difficulty here in itself is probably the assembly code you'll emit for a given piece of code, not reader/GC/conts/tailrec)

Remember, CL is by itself not so similar to Scheme that you can directly use its reader, as well as its execution model, in your final product. You'll write your own Arc reader in CL anyway (in CL 'arc == 'ARC, in arc 'arc != 'ARC), so you might as well (tada!) write it in Arc and compile it down to assembly. You'll need conts and no, you can't trust CL enough to handle tailrecs.

(Hmm. Maybe I shouldn't be advising you, maybe I should be doing this myself to steal your thunder ^^)

-----

3 points by stefano 6656 days ago | link

The CL reader is highly extensible and you can tell it to be case sensitive: (setf (readtable-case *readtable*) :preserve), if I remember correctly. It's possible, I think, to use it to read Arc code.

-----

1 point by eds 6656 days ago | link

Is this portable?

But assuming it works in some form or other, it should remove most of the need to write a custom reader. Though there is still the issue of (for example) complex number syntax, etc.

-----

3 points by stefano 6656 days ago | link

It's in the standard (CLTL2). You can fix everything, because you can tell the reader to use your own functions on particular characters. As an example, this is a piece of code that lets you use Arc [... _ ...] syntax in Common Lisp:

  (defun read-square (stream c)
    (declare (ignore c))
    (let ((body (read-delimited-list #\] stream t)))
      `#'(lambda (_) ,body)))

  (set-macro-character #\] #'(lambda (x y) (declare (ignore x y)) (values)))
  (set-macro-character #\[ #'read-square)
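For comparison, the same bracket rewrite can be done in a hand-written reader. The following is a toy s-expression reader in Python, offered only as a sketch: `read`, `read_from`, and `read_list` are hypothetical names, atoms are left as strings, and strings, comments, and quoting are not handled.

```python
import re

# A token is a single bracket/paren, or a run of other non-space characters.
TOKEN = re.compile(r"[()\[\]]|[^\s()\[\]]+")

def read(text):
    """Parse one expression; lists become Python lists, atoms stay strings."""
    expr, _ = read_from(TOKEN.findall(text), 0)
    return expr

def read_from(tokens, i):
    tok = tokens[i]
    if tok == "(":
        return read_list(tokens, i + 1, ")")
    if tok == "[":
        # Same job as the reader macro on #\[ : wrap the body in a one-argument fn.
        body, i = read_list(tokens, i + 1, "]")
        return ["fn", ["_"], body], i
    if tok in (")", "]"):
        raise SyntaxError("unexpected " + tok)
    return tok, i + 1

def read_list(tokens, i, closer):
    items = []
    while tokens[i] != closer:
        item, i = read_from(tokens, i)
        items.append(item)
    return items, i + 1
```

With this, `read("(map [+ _ 1] xs)")` yields the list form of (map (fn (_) (+ _ 1)) xs).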

-----

1 point by eds 6657 days ago | link

So would you use the existing (C/C++) implementation of Boehm GC? If so then doesn't that make this not completely implemented in Arc? If not then that's one more piece to write (although I guess it isn't too difficult to translate code that's already been written in another language).

Yes, the assembly part of it looks difficult to me. When I look at Arc or Lisp code I don't see any way to translate that to native code. Obviously has been done, I'm just not educated on such matters.

You have a good point about the reader, I should probably add that to my proposal. And writing it in Arc would be an interesting exercise.

And can't I trust CL about tail recursion? Most decent implementations do tail recursion, right? And can't I tell people to stay away from those that don't?

But suppose I can't trust CL to do tail recursion. What am I supposed to do about it?

-----

1 point by almkglor 6657 days ago | link

> So would you use the existing (C/C++) implementation of Boehm GC? If so then doesn't that make this not completely implemented in Arc?

And neither is Linux completely implemented in C, and C compilers written in C are not completely implemented in C, because bits and pieces of the libraries they link their code to are written in assembly.

> Yes, the assembly part of it looks difficult to me. When I look at Arc or Lisp code I don't see any way to translate that to native code. Obviously has been done, I'm just not educated on such matters.

The Lambda the Lutimate papers are a good place to start if you're interested - they include some hand-written assembly code equivalents to Scheme/Lisp code, largely function calls and prefix/suffix. Given that the most basic axioms of Arc include (fn ...) and a function call syntax, this would be quite of interest.

> But suppose I can't trust CL to do tail recursion. What am I supposed to do about it?

Use 'prog and 'go? ^^ Lambda is the lutimate!!

-----

1 point by sacado 6656 days ago | link

"Yes, the assembly part of it looks difficult to me. When I look at Arc or Lisp code I don't see any way to translate that to native code. Obviously has been done, I'm just not educated on such matters."

I found a good link / tutorial about how to compile a subset of Scheme to C language. The compiler is about 800 lines of Gambit Scheme (blank lines included) and even deals with tail-recursion and continuations ! (well, that's based on the lambda papers...)

No GC, but you can use Boehm and have one for free.

-----

1 point by stefano 6656 days ago | link

You can trust CL to do tail recursion. I'm not sure if the standard requires it but every decent implementation must provide it.

-----

2 points by eds 6656 days ago | link

I think I'll just say that CL-Arc is only compatible with the subset of CL implementations that do tail recursion. (Which happens to be all the implementations I would consider using anyways.)

-----

1 point by kens2 6657 days ago | link

Garbage collection?

-----

3 points by almkglor 6657 days ago | link

The only time you need garbage collection is when you're allocating new memory. This is of course abstracted away into the 'cons procedure you end up calling at each (cons a d) - basically 'cons triggers gc if necessary. You may then very well just use the someone-Boehm garbage collector for C, which will (I think!) helpfully look at registers and stack for you.

The someone-Boehm GC (reportedly) works well with C - I'm reasonably sure that it will work well with assembly.
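The "cons triggers gc if necessary" arrangement can be sketched with a tiny mark-and-sweep heap. This is a Python toy, not how the Boehm collector works internally: `Heap` is a hypothetical name, and an explicit `roots` list stands in for the registers and stack that a conservative collector would scan.

```python
class Heap:
    """Fixed pool of cons cells; cons() collects only when the pool is empty."""

    def __init__(self, size):
        self.car = [None] * size
        self.cdr = [None] * size
        self.free = list(range(size))   # indices of unused cells
        self.roots = []                 # stand-in for registers and stack
        self.collections = 0

    def cons(self, a, d):
        if not self.free:
            self.roots += [a, d]        # keep the arguments alive during GC
            self.collect()
            del self.roots[-2:]
        if not self.free:
            raise MemoryError("heap exhausted even after collection")
        i = self.free.pop()
        self.car[i], self.cdr[i] = a, d
        return ("cell", i)              # tagged reference to the new cell

    def collect(self):
        self.collections += 1
        marked, todo = set(), list(self.roots)
        while todo:                     # mark everything reachable from the roots
            ref = todo.pop()
            if isinstance(ref, tuple) and ref[0] == "cell" and ref[1] not in marked:
                marked.add(ref[1])
                todo += [self.car[ref[1]], self.cdr[ref[1]]]
        # Sweep: every unmarked cell goes back on the free list.
        self.free = [i for i in range(len(self.car)) if i not in marked]
```

Code that calls `cons` never mentions the collector; it only notices it if the heap is truly full.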

-----

1 point by sacado 6657 days ago | link

only cons ? What about bignums ? And strings ?

-----

2 points by stefano 6657 days ago | link

If you use the Boehm GC, you can handle everything simply by calling GC_malloc every time you need memory.

-----

1 point by sacado 6657 days ago | link

Yes, you're right. Sorry about that.

Anyway, maybe the right way to do so is by destructuring a Scheme implementation ? Starting from a given implementation, you write your compiler from scratch, but use the facilities of the chosen implementation for the reader and the GC. Then, once it's working, you gradually remove the scaffolding by implementing these things by hand...

-----

1 point by almkglor 6657 days ago | link

Well, if you're going to end up implementing something like my unrolled-lists ideas, then everything can very well be a cons cell underneath. Including bignums and strings.

-----

4 points by stefano 6657 days ago | link

PicoLisp (http://www.software-lab.de/ref.html#cell) uses cons cells to implement everything, from bignums to strings.
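As a rough illustration of building a bignum out of nothing but cells, here is a Python sketch. It is not PicoLisp's actual representation: `cons`, `BASE`, `to_cells`, `add_cells`, and `from_cells` are names chosen for the example, and only non-negative numbers are handled.

```python
def cons(a, d):
    return (a, d)                       # a cell is just a pair

BASE = 10000                            # value range per cell; an arbitrary choice

def to_cells(n):
    """Non-negative int -> chain of base-10000 digits, least significant first."""
    if n < BASE:
        return cons(n, None)
    return cons(n % BASE, to_cells(n // BASE))

def add_cells(x, y, carry=0):
    """Add two cell-encoded numbers the way schoolbook addition does."""
    if x is None and y is None:
        return cons(carry, None) if carry else None
    a, xs = x if x is not None else (0, None)
    b, ys = y if y is not None else (0, None)
    total = a + b + carry
    return cons(total % BASE, add_cells(xs, ys, total // BASE))

def from_cells(c):
    """Cell-encoded number -> Python int, for checking the result."""
    return 0 if c is None else c[0] + BASE * from_cells(c[1])
```

Strings could be chained the same way, a few characters per cell, at the cost of pointer chasing on every access.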

-----

1 point by sacado 6657 days ago | link

Hmm, looks like an interesting beast... I'll have a look at it some day...

-----