Arc Forum

2 points by fallintothis 5930 days ago | link | parent

The concern here being subtle bugs like

  (for_ i 0 (- (len xs) 1) 
    (prn (xs i)))

This looks alright:

  arc> (= xs '(a b c))
  (a b c)
  arc> (for_ i 0 (- (len xs) 1) (prn (xs i)))
  a
  b
  c
  nil

but

  arc> (= xs nil)
  nil
  arc> (for_ i 0 (- (len xs) 1) (prn (xs i)))
  Error: "Function call on inappropriate object nil (0)"

since

  arc> (for_ i 0 (- (len xs) 1) (prn i))
  0
  -1
  nil

People often suggest a bidirectional for, then demonstrate its readability on literal values -- and it does look good. The issue is that when you use variables or expressions (as is usually the case), it's not immediately obvious what the bounds (and thus behavior) of the loop will be. Hence, the separation.
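
For comparison, here is a sketch of the same empty-list call using the ascending-only for (assuming the Arc 3.1 definition quoted later in this thread). The loop test fails on the first check, so the body never runs and nothing is indexed:

  arc> (= xs nil)
  nil
  arc> (for i 0 (- (len xs) 1) (prn (xs i)))
  nil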

1 point by palsecam 5929 days ago | link

Shouldn't 'forlen be used instead in your examples?

   arc> (= xs '(a b c))
   (a b c)
   arc> (forlen i xs (prn (xs i)))
   a 
   b 
   c
   nil

But hey hey, too many loop constructs. If I hadn't `grep'ed Arc (see below), I wouldn't know about it either.

I don't agree with 'for_ being "buggy" or with the 'for/'down being an obvious/mandatory division. It's just a tradeoff in my opinion. But you're right, don't get me wrong, you're right there is a rationale behind 'down.

The descending loop being a separate concept because there are cases where it is "necessary", well... I actually, personally, don't buy this. I call this "everyone repeats the lesson, and no one questions it".

It happens, it is rare but it happens, that I need a descending 'for. And schtroumpf, I can never remember the syntax for it (in any language), I always need to Google it. But God knows I can remember a tremendous amount of details (last example in Arc: the need for 'write/'disp, understood it while coding evsrv, hop, it's in my brain and it will not be forgotten. This kind of detail, OK.). And because I think my brain is awesome, and only forgets useless stuff (I remember "useful" things I saw when I was 5), well I don't buy the need for 'down.

For strange behaviours where bounds would be inverted, well, I always check the input where needed (general rule). If I'm doing "exploratory programming", well it's OK if this causes "bugs". It is far more OK than if I need to WTF and lose time, and my concentration, to start my browser, and ask Google how to 'down. And often, I use 'forlen/'each. 'for is good for C.

And yes, all this is terribly arrogant. But I'm not alone in the "need to Google it every time".

The funny thing is, I actually `grep'ed the Arc files yesterday, and I'm nearly sure 'down could be removed without problems and 'for_ adopted, provided that you modify some things (like the def of 'forlen).

And I'm nearly sure it will be good, because Worse Is Not Always Better. The programmer should have the easy life (i.e: not having to remember 100 loop constructs), and not the {system|language} designer (which should make sure the def of say 'forlen is correct even for empty lists, even if it means adding an 'if or anything). I don't care too much if Arc.arc is a little bloated if it means I have less stuff to remember.

Where "good" here means, my definition of "good", and the crazy definition of "Would make Arc and news.arc shorter". Yes, I claim it'd actually make it shorter.

Unfortunately, ars longa, vita brevis, and I don't want to waste time to prove this rather useless point. But I'm nearly sure 'down is useless in this current small-not-so-small version of Arc + libs. Oh and schtroumpf, I add it to my ARROGANT_TODO list. Will demonstrate my point of view is at least very acceptable one of these days, so that you don't take me for a moron too much :-)

But of course, if you like 'for to be like this, I see no problem with this.

And thanks for taking the time to remind me of all this (because sincerely, one more time, I couldn't see why 'for in Arc couldn't go in descending order).

And anyway, 'for is so 70s. 'each, 'repeat are far more used. How many times do we use 'for directly (hint: something like 5 times in x.arc, macros definitions excluded because this doesn't count IMO, "worse is not always better", and most of the times when you are sure the bounds are ok, e.g: (for 0 255 ...))? And how many times do we use 'down (hint: once in x.arc)?!

----

Ultimate arrogance, I'll quote Einstein here:

The important thing is not to stop questioning [the real need for 'down, even if everyone says so, when `grep' is far less convinced than people on this]. Curiosity has its own reason for existing.

-----

2 points by fallintothis 5929 days ago | link

Shouldn't 'forlen be used instead in your examples?

I'd say each should be used in the examples. The point is that they are easy instances of a more general problem, as I noted: when using for, you're typically using expressions; when you're using expressions, you aren't sure if the bounds will result in an ascending or descending loop. There are instances where this distinction is important. Take posmatch in strings.arc, defined as

  (def posmatch (pat seq (o start 0))
    (catch
      (if (isa pat 'fn)
          (for i start (- (len seq) 1)
            (when (pat (seq i)) (throw i)))
          (for i start (- (len seq) (len pat))
            (when (headmatch pat seq i) (throw i))))
      nil))

Here we see the else-clause for-loop isn't merely a place to substitute forlen or each: it only iterates up (by for's behavior) to the largest index at which the pattern could occur in the sequence:

  arc> (load "trace.arc") ; see http://arclanguage.org/item?id=10372
  nil
  arc> (trace posmatch headmatch)
  *** tracing posmatch
  *** tracing headmatch
  nil
  arc> (= trace-indent* 2)
  2
  arc> (posmatch "a" "abc")
  1. Trace: (posmatch "a" "abc")
    2. Trace: (headmatch "a" "abc" 0)
    2. Trace: headmatch ==> t
  1. Trace: posmatch ==> 0
  0
  arc> (posmatch "abc" "a")
  1. Trace: (posmatch "abc" "a")
  1. Trace: posmatch ==> nil
  nil
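
The upper bound handed to for in the else-clause can be checked by hand at the REPL. A small sketch using the same two strings as the trace above:

  arc> (- (len "abc") (len "a"))   ; "a" in "abc": last index where a match could start
  2
  arc> (- (len "a") (len "abc"))   ; "abc" in "a": the bound falls below start
  -2
  arc> (for i 0 -2 (prn i))        ; ascending-only for: body never runs
  nil

That is why headmatch is never called in the second trace.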

As a drop-in replacement, the bidirectional variant would fail:

  arc> (untrace posmatch)
  *** untracing posmatch
  nil
  arc> (mac for_ (v init end . body)
         (w/uniq (gv gi ge gt gf)
           `(do
              (if (> ,end ,init)
                (= ,gt < ,gf +)   ; classic, "ascendant", 'for
                (= ,gt > ,gf -))  ; 'down
              (with (,gv nil ,gi ,init ,ge (,gf ,end 1))
                (loop (assign ,gv ,gi) (,gt ,gv ,ge) (assign ,gv (,gf ,gv 1))
                  ((fn (,v) ,@body) ,gv))))))
  #(tagged mac #<procedure: for_>)
  arc> (def posmatch (pat seq (o start 0))
         (catch
           (if (isa pat 'fn)
               (for_ i start (- (len seq) 1)
                 (when (pat (seq i)) (throw i)))
               (for_ i start (- (len seq) (len pat))
                 (when (headmatch pat seq i) (throw i))))
           nil))
  *** redefining posmatch
  #<procedure:zz>
  arc> (trace posmatch)
  *** tracing posmatch
  nil
  arc> (posmatch "a" "abc")
  1. Trace: (posmatch "a" "abc")
    2. Trace: (headmatch "a" "abc" 0)
    2. Trace: headmatch ==> t
  1. Trace: posmatch ==> 0
  0
  arc> (posmatch "abc" "a")
  1. Trace: (posmatch "abc" "a")
    2. Trace: (headmatch "abc" "a" 0)
  Error: "string-ref: index 1 out of range for empty string"

For strange behaviours where bounds would be inverted, well, I always check the input where needed (general rule).

Your point, if I understand it, is that you'd rather catch this behavior in the logic of the code, e.g.,

  (if (and (>= (len seq) (len pat))
           (<= start (- (len seq) (len pat))))
    (for_ i start (- (len seq) (len pat))
      ...))

That's fine, of course. It works. It just seems gratuitous -- like something I'd want handled for me already. But everyone has their definitions of "good", and yours is certainly no less (or more) valid than mine.

Hell, someone might like having the bidirectional loop in general, then use a separate loop ("up"?) for this case.

The programmer should have the easy life (i.e: not having to remember 100 loop constructs)

Whereas I think remembering 100 loop constructs is easier than remembering that the handful of loop constructs are incredibly fragile.

But of course, if you like 'for to be like this, I see no problem with this.

Nor do I see a problem if you want a bidirectional for. This is one use for macros: rather than worry that Arc doesn't have some loop construct, you're allowed to make your own. No need for the language spec to get updated if you can easily write a bidirectional loop. And if for was changed to be bidirectional, I could similarly write macros for ascending and descending loops.
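
As a rough sketch of that last point: if for were bidirectional, an ascending-only loop could be restored by copying Arc 3.1's current for definition under a new name. The name up is hypothetical (borrowed from the suggestion above); the body is the stock for from arc.arc:

  ; hypothetical name; same body as Arc 3.1's for
  (mac up (v init max . body)
    (w/uniq (gi gm)
      `(with (,v nil ,gi ,init ,gm (+ ,max 1))
         (loop (assign ,v ,gi) (< ,v ,gm) (assign ,v (+ ,v 1))
           ,@body))))

With inverted bounds it simply does nothing:

  arc> (up i 0 -1 (prn i))
  nil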

As you say, this is just the rationale. But that's not saying much: by its very nature, language design is about rationale; the only "necessary" components of the language are basically the ones that make it Turing-complete.

-----

1 point by palsecam 5927 days ago | link

No-down patch at http://dabuttonfactory.com/res/arc-no-down.patch

Thanks to my arrogance/guts for pushing me to try to remove 'down, because it showed me that the Arc codebase confirms my own experience of programming:

- you never use 'for directly, but in cases where you are sure the bounds are OK.

Where "directly" means, not in a library {mac|fn} definition, because there you must validate your input anyway, if you agree w/ "Worse is not always better" (i.e: the {sys|lang|lib} writer does the hard work, not you, the user). If you don't agree, well, one problem is that it leads to inconsistencies/bugs. See below.

The "problematic" (few) occurrences of 'for only appear in arc.arc and strings.arc, which are typical library files. Not even "normal" libraries, but "core" ones. The kind where I'd strongly apply "worse is not better".

You'll not see 'for used with expressions in any other files, i.e: "application" (blog.arc, news.arc, etc.) or even other libs files. You'll not even see it at all in news.arc, srv.arc, code.arc, prompt.arc. You'll see it used directly twice, here:

  blog.arc:      (for i 0 4                ; no bounds pb
  html.arc:(for i 0 255 (= (hexreps i      ; no bounds pb

- you sometimes, rarely, also need to directly use a descending 'for ('down). Only once in all Arc (but once = it is needed):

   news.arc:      (down id maxid* 1

Where maxid* is a global, and the kind which is nearer IMO to a literal than to a (complex) expression, so no pb. See below.

So it's a pity that for this one time, you can't use 'for, and have to resort to using yet another loop construct that is here for... non-existing problems.

- for the vast, vast majority of looping, you use higher-level loop constructs (each/repeat/etc.), so there is no problem w/ incorrect bounds, assuming the lib writer is not a moron.

----

In arc.arc:

Is it coherent that 'posmatch will return nil when pat > seq, whereas 'headmatch will throw an error in the same case (even stranger knowing 'posmatch actually calls 'headmatch)?

  arc> (headmatch "abcd" "abc")
  Error: "string-ref: index 3 out of range [0, 2] for string: \"abc\""
  arc> (posmatch "abcd" "abc")
  nil

w/no-down-patch:

  arc> (headmatch "abcd" "abc")
  nil
  arc> (posmatch "abcd" "abc")
  nil

Coherent, and correct IMO. We ask if it matches. If pat > seq, the answer is just "no", it's not an error per-se.

Or: how 'headmatch is "incredibly fragile", and the so-called "solid" 'for hides this fact here. Thanks pseudo-solidity. Validate your input, and don't rely on the behaviour of something inherently fragile (using a raw construct), when writing a library fn.

In news.arc, I (obviously) changed:

      (down id maxid* 1

to:

      (for id maxid* 1

I feared it might not work when there are no items, so I tested this case (nsv), then accessed localhost:8080, and there was actually no problem. I don't use news.arc, so I can't test the rest, but it should be OK. (If there is a problem, maybe just changing to (for id maxid* 0 ...) would solve it.)

----

"You claimed it'd make the code shorter! Prove it!"

Clever, interesting test:

  arc> (let toto 0 
         (each (k v) (tokcount '("arc.arc" "strings.arc" "news.arc")) 
           (++ toto v)) 
         toto)
  14756

  arc-no-down> (let toto 0 
                 (each (k v) (tokcount '("arc.arc" "strings.arc" "news.arc")) 
                   (++ toto v)) 
                 toto)
  14749

Harder, dumber, raw `wc' test:

  $ wc -m 3.1orig/*.arc
  [...]
  198017 total

  $ wc -m 3.1nodown/*.arc
  [...]
  198017 total     # Argh, failed! It's ==, not strictly <...

----

No-down patch was coded quickly and with nearly no testing afterwards, so there might be bugs. I hope someone proves I've introduced lots of bugs; that way I could be sure all this crap at least makes someone take a look at reality (where reality is, here, some practical code, and not some books), and try to question things. One thing Arc got very right is "code.arc".

And no, telling me "it is buggy for me" doesn't count without showing some Arc code where you'd effectively be embarrassed by the new 'for behaviour. Otherwise it's like with hygienic macros: "incredibly less fragile" but no one cares 'cause unhygienic is good enough/more powerful, provided you live in the real world.

And anyway it doesn't count because everyone here more or less accepts the fact that the Arc codebase is a superb piece of software (so if you don't have the same coding practice, you suck), that brevity is power, and that it is a valid codebase to test the necessity of an operator. All of this IS questionable. But too many people here are... not qualified to do so, unless they are sure their comments history will not reveal some stupid blind adoration for Arc.

I trust {my|other people's} guts & feelings, but in the end I believe only in reality, in data (and you know as well as me that code is data :-D), and not in opinions and books.

-----

2 points by fallintothis 5927 days ago | link

- you never use 'for directly, but in cases where you are sure the bounds are OK.

The "problematic" (few) occurrences of 'for only appear in arc.arc and strings.arc, which are typical library files.

What makes arc.arc and strings.arc less valid examples of for usage? They're Arc programs, too. Should they not inherit the elegance they're attempting to define? (While still balancing efficiency, of course, cf. the tutorial: http://ycombinator.com/arc/tut.txt)

To the contrary, because arc.arc and strings.arc use for I think they make perfect examples -- which would make your first statement untrue, since you had to write extra bounds-checking.

- you sometimes, rarely, also need to directly use a descending 'for ('down). Only once in all Arc (but once = it is needed):

So it's a pity that for this one time, you can't use 'for, and have to resort to using yet another loop construct that is here for... non-existing problems.

You're ignoring that down has another purpose. As you say, the need for a descending loop is rare. But the need for for to only go in one direction is much less rare (more on that later).

for the vast, vast majority of looping, you use higher-level loop constructs (each/repeat/etc.), so there is no problem w/ incorrect bounds, assuming the lib writer is not a moron.

So you'd also want to foist the responsibility of not being a "moron" onto every user of for? If other loops are already used to avoid silly bugs, why not for?

I count at least 12 different loop constructs in arc.arc: while, loop, for, down, repeat, each, whilet, whiler, forlen, on, until, noisy-each, and arguably others like evtil and drain.

I find that adding these makes code simpler: they express (and implement) purposeful loops. That's why I can do

  (each x xs (prn x))

instead of

  (forlen i xs (prn (xs i)))

which can be done instead of

  (for i 0 (- (len xs) 1) (prn (xs i)))

which can be done instead of

  (loop (= i 0) (< i (len xs)) (++ i) (prn (xs i)))

etc. If I wanted the most general & least to remember, I'd use a goto.

When for tries to infer the direction I want to go, I need to fight it to stop from going in the opposite direction -- to me, this is inconvenient.

Is it coherent that 'posmatch will return nil when pat > seq, whereas 'headmatch will throw an error in the same case (even stranger knowing 'posmatch actually calls 'headmatch)?

I agree that headmatch has odd behavior here. But with the fixed behavior (i.e., your patch):

  arc> (load "../arc3.1/trace.arc")
  nil
  arc> (trace posmatch headmatch)
  *** tracing posmatch
  *** tracing headmatch
  nil
  arc> (posmatch "a" "abc")
  1. Trace: (posmatch "a" "abc")
  2. Trace: (headmatch "a" "abc" 0)
  2. Trace: headmatch ==> t
  1. Trace: posmatch ==> 0
  0
  arc> (posmatch "abc" "a")
  1. Trace: (posmatch "abc" "a")
  2. Trace: (headmatch "abc" "a" 0)
  2. Trace: headmatch ==> nil
  2. Trace: (headmatch "abc" "a" -1)
  2. Trace: headmatch ==> nil
  2. Trace: (headmatch "abc" "a" -2)
  2. Trace: headmatch ==> nil
  1. Trace: posmatch ==> nil
  nil

Just because the function to which you funnel input sanitizes data doesn't mean you should be supplying bad values. Further, if we add more error-checking to posmatch to avoid the redundant calls, we're adding even more complexity -- wrestling against for to get it to go just one direction.

"You claimed it'd make the code shorter! Prove it!"

I believe only in reality, in data

Then let's inspect your patch closer:

inspect-patch.arc

  (def default (file)
    (+ "../arc3.1/" file))

  (def patched (file)
    (+ "../arc-patch/" file))

  (def sexp-tokcount (sexp)
    (len (flat sexp)))

  (= for-def*
    '(mac for (v init max . body)
       (w/uniq (gi gm)
         `(with (,v nil ,gi ,init ,gm (+ ,max 1))
            (loop (assign ,v ,gi) (< ,v ,gm) (assign ,v (+ ,v 1))
              ,@body))))
     down-def*
     '(mac down (v init min . body)
        (w/uniq (gi gm)
          `(with (,v nil ,gi ,init ,gm (- ,min 1))
             (loop (assign ,v ,gi) (> ,v ,gm) (assign ,v (- ,v 1))
               ,@body))))
     new-for-def*
    '(mac for (v init end . body)
       (w/uniq (gi gm gt gf)
         `(do
            (if (> ,end ,init)
                (= ,gt < ,gf +)
                (= ,gt > ,gf -))
            (with (,v nil ,gi ,init ,gm (,gf ,end 1))
              (loop (assign ,v ,gi) (,gt ,v ,gm) (assign ,v (,gf ,v 1))
                ,@body))))))

  ; if this calculation is wrong, it should be revealed in logic-savings
  (= max-diff* (- (+ (sexp-tokcount for-def*) (sexp-tokcount down-def*))
                  (sexp-tokcount new-for-def*)))

  (def token-total (file)
    (sum cadr (tokcount (list file))))

  (def token-diff (file1 file2)
    (- (token-total file1) (token-total file2)))

  (def compare-tokcount (filename)
    (let diff (token-diff (default filename) (patched filename))
      (if (> diff 0)
            (prn "The patch saved " (plural diff "token") " in " filename)
          (< diff 0)
            (prn "The patch added " (plural (- diff) "token") " to " filename)
            (prn "The patch didn't change the token count in " filename))))

  (def maximum-savings ()
    (prn "The patch could have saved at most (caveat lector) "
         (plural max-diff* "token")
         " in arc.arc"))

  (def logic-savings ()
    (let diff (token-diff (default "arc.arc") (patched "arc.arc"))
      (if (<= diff max-diff*)
          (prn "So, by changing 'for in arc.arc, "
               (plural (- max-diff* diff) "token")
               " got added to code that used the previous version of 'for")
          (err "miscalculated the maximum number of tokens you could save"))))

  (map compare-tokcount '("arc.arc" "strings.arc" "news.arc"))
  (prn)
  (maximum-savings)
  (logic-savings)

At the REPL

  arc> (load "inspect-patch.arc")
  The patch saved 9 tokens in arc.arc
  The patch added 2 tokens to strings.arc
  The patch didn't change the token count in news.arc

  The patch could have saved at most (caveat lector) 17 tokens in arc.arc
  So, by changing 'for in arc.arc, 8 tokens got added to code that used the previous version of 'for
  nil

To explain the "caveat", I assume the most this new for could change is: (a) remove the single-direction for and down, (b) add the bidirectional for, and (c) leave any other piece of code that used for/down unchanged (save switching the word "down" to the word "for").

With these assumptions (and by inspecting the code), the assessment seems correct: arc.arc nets 8 additional tokens to stop for from going backwards. It's not that the token count is shorter from having for go both directions; it's that the code you've added to avoid for's new behavior isn't quite enough to outweigh the savings from removing down's definition.

In actuality, you'll wind up saving far less than 9 tokens because of multiple evaluation bugs:

   (mac repeat (n . body)
     `(if (> ,n 1) (for ,(uniq) 1 ,n ,@body)))

with

  arc> (sexp-tokcount '(mac repeat (n . body)
                         `(if (> ,n 1) (for ,(uniq) 1 ,n ,@body))))
  18

should be

  (mac repeat (n . body)
    (w/uniq gn
      `(let ,gn ,n (if (> ,gn 1) (for ,(uniq) 1 ,gn ,@body)))))

with

  arc> (sexp-tokcount '(mac repeat (n . body)
                         (w/uniq gn
                           `(let ,gn ,n
                              (if (> ,gn 1) (for ,(uniq) 1 ,gn ,@body))))))
  25

i.e., 7 more tokens, and

  (mac forlen (var s . body)
    `(unless (empty ,s)
       (for ,var 0 (- (len ,s) 1) ,@body)))

with

  arc> (sexp-tokcount '(mac forlen (var s . body)
                         `(unless (empty ,s)
                            (for ,var 0 (- (len ,s) 1) ,@body))))
  21

should be

  (mac forlen (var s . body)
    (w/uniq gs
      `(let ,gs ,s
         (unless (empty ,gs)
           (for ,var 0 (- (len ,gs) 1) ,@body)))))

with

  arc> (sexp-tokcount '(mac forlen (var s . body)
                         (w/uniq gs
                           `(let ,gs ,s
                              (unless (empty ,gs)
                                (for ,var 0 (- (len ,gs) 1) ,@body))))))
  28

i.e., 7 more tokens, totaling 14 more tokens, which outweighs the original figure. So, nothing is even really saved in arc.arc. Though, of course, the rewrites could be shorter with something like once-only (see towards the end of http://gigamonkeys.com/book/macros-defining-your-own.html).

Further, strings.arc and news.arc did not get shorter (strings.arc even got a little longer). The only way it seems that un-patched code could get shorter is if it had to go either up or down and the order didn't matter -- unlike code in the files inspected.

Therefore, this patch can either make new code longer or make you hope that for doesn't iterate in a direction you don't want it to (as in news.arc), unless you needed to do the Arc 3.1 equivalent of

  (if (< start end)
      (for i start end ...)
      (> start end)
      (for i end start ...))

which, with this patch, could be replaced with

  (for i start end ...)

which is shorter.

As infrequently as such code occurs (0 times in the standard Arc 3.1 distribution, so far as I can tell), this does not yield big space savings. If it does occur frequently enough, it shouldn't outweigh the need for single-direction iterations, but would probably instead be made into a separate macro:

  (mac between (var bound1 bound2 . body)
    ...)
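
One possible shape for such a macro in stock Arc 3.1, as a sketch only: the dispatch on <= and the gensym names g1/g2 are assumptions for illustration, not something taken from the patch. It evaluates each bound once, then hands off to the existing single-direction for or down:

  ; hypothetical body for between, built on Arc 3.1's for and down
  (mac between (var bound1 bound2 . body)
    (w/uniq (g1 g2)
      `(with (,g1 ,bound1 ,g2 ,bound2)   ; evaluate each bound exactly once
         (if (<= ,g1 ,g2)
             (for ,var ,g1 ,g2 ,@body)      ; ascending, or a single step when equal
             (down ,var ,g1 ,g2 ,@body))))) ; descending

  arc> (between i 3 1 (prn i))
  3
  2
  1
  nil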

Additionally, you assert that having an extra loop construct entails an unnecessary mental burden for the programmer. I disagree. It's not a burden if its purpose is specific: if you want to repeat a block of code, use

  (repeat n ...)

instead of

  (for temp 1 n ...)

If you want to iterate over the length of a sequence, use

  (forlen i xs ...)

instead of

  (for i 0 (- (len xs) 1) ...)

Moreover, if you want to iterate upwards through a range of integers, use

  (for i start (- (len seq) (len pat)) ...)

instead of

  (if (and (>= (len seq) (len pat))
           (<= start (- (len seq) (len pat))))
      (between i start (- (len seq) (len pat))
        ...))

-----