Arc Forumnew | comments | leaders | submitlogin
Racket lists properly mutated (queue bug fixed)
10 points by waterhouse 5077 days ago | 6 comments
Background: Arc is implemented in Racket/MzScheme, which recently introduced immutable pairs. However, Arc is supposed to allow for list mutation. This was hacked around (by Eli Barzilay of PLT) using some unchecked pointer operations in MzScheme to alter the supposedly immutable pairs.

That seemed to work well, but a year ago, akkartik noticed something going wrong with queues. Occasionally, adding or deleting an item from the queue would fail. This suggested something was going wrong, somehow, with the list mutation. Here is his original thread: http://arclanguage.org/item?id=11347

Recently, I conducted a more detailed investigation, and found that the problem was caused when garbage collection happened in the middle of the execution of the hack "set-car!" or "set-cdr!" functions. Here was my thread: http://arclanguage.org/item?id=13518

Today, I continued my investigation and found that, specifically, the Racket function "ptr-ref", which was used to return a C pointer to the pair to be mutated, caused memory allocation (and, therefore, possibly garbage collection).

This is the main part of Eli's code:

  (define (set-ca/dr! offset who p x)
    (if (pair? p)
      (let ([p* (get-ptr)])
        (ptr-set! p* _scheme p)
        (ptr-set! (ptr-ref p* _pointer 0) _scheme offset x))
      (raise-type-error who "pair" p)))

  (define (n-set-car! p x) (set-ca/dr! 1 'set-car! p x))
  (define (n-set-cdr! p x) (set-ca/dr! 2 'set-cdr! p x))
Here, Eli uses a global C pointer, (get-ptr), and sets that to point to the pair to be modified. The entire 'let expression could be replaced by this:

      (ptr-set! (cast p _scheme _pointer) _scheme offset x))
Here, 'cast returns a new pointer. It's slightly more efficient to reuse a global pointer, hence what Eli did. But either way, if you want to use ptr-set!, you'll need to have something return a C pointer to the pair-to-be-modified, which means malloc'ing. Apparently all C pointers returned as values are malloc'd objects.

[To be precise about what goes wrong, the set-ca/dr! function first makes a C pointer to the pair-to-be-modified, and then, if a garbage collection happens at this point, the function will follow the now-invalid pointer and modify the car or cdr of what used to be the pair. That location might be garbage, causing the operation to have no effect--or it might even be memory no longer owned by the Racket process, causing a segmentation fault. This completely explains my previous results.]

So I looked for another way to do it, one that didn't malloc. And I found one.

I have found a flawless way to implement set-car! and set-cdr! on Racket's immutable pairs. The solution is to treat them as vectors and use Racket's unsafe vector ops. [Edit: rntz points out below that one can also use unsafe-set-mcar! and unsafe-set-mcdr!, which work just as well, and are likely to be better supported in future versions of Racket. That seems much better and easier and simpler to me, and clearly I need to RTFM more thoroughly. For posterity, I've left some information on how the unsafe-vector-set! version works.]

  > (define xs '(1 2 3))
  > (define ys '(1 2 3))
  > (require racket/unsafe/ops)
  > xs
  '(1 2 3)
  > (unsafe-vector-set! xs 0 ys)
  > xs
  '(1 1 2 3)
  > (unsafe-vector-set! xs -1 ys) ;yeah, -1
  > xs
  '((1 2 3) 1 2 3)
  > (unsafe-set-mcar! xs 22) ;this works too!
  > xs
  '(22 1 2 3)
We see, using a Racket version of my 'mem-use macro, that this causes no malloc'ing.

  (define-syntax mem-use
    (syntax-rules ()
      ((_ body ...)
       (let ((u (current-memory-use)))
         body ...
         (- (current-memory-use) u)))))

  > (mem-use (unsafe-set-mcar! xs 15))
  0
  > xs
  '(15 1 2 3)
  > (mem-use (unsafe-vector-set! xs -1 300))
  0
  > xs
  '(300 1 2 3)
And now for the implementation in ac.scm. I will present the unsafe-set-mca/dr! version. Throw away (or comment out) the old definitions of ptr, get-ptr, set-ca/dr!, n-set-car!, n-set-cdr!, x-set-car!, and x-set-cdr!, and put this in:

  (require racket/unsafe/ops)

  (define (x-set-car! p x)
    (if (pair? p)
        (unsafe-set-mcar! p x)
        (raise-type-error 'set-car! "pair" p)))

  (define (x-set-cdr! p x)
    (if (pair? p)
        (unsafe-set-mcdr! p x)
        (raise-type-error 'set-cdr! "pair" p)))
The fact that it explicitly says (require racket...) implies that this must be Racket, so I have dropped the check for native set-ca/dr!.

This fixes the queue bug. I've run akkartik's test program and a few versions thereof, and there have been no problems. Additionally, list mutation is faster, to the point where sorting a large list is about 3x as fast as it was before (5.5 secs instead of 18.7 secs for 500k random numbers), and the destructive list reversal function 'nreverse (from Common Lisp) is actually faster than 'rev, not to mention it creates almost zero garbage:

  (def nrev (xs)
    (let u nil
      (whilet head xs
        (= xs cdr.xs)
        (scdr head u) ; faster than (= cdr.head u) for various reasons
        (= u head))
      u))

  arc> (mem-use:no:= xs nrev.xs) ;where xs is a 100k-element list
  168
It's possible that the implementation of Racket pairs or mutable pairs (or vectors if we prefer that version) will change in the future. And it would be nicer if what we're doing were officially supported by the language. But I see no reason to expect either of those changes anytime soon, and right now, this works, as far as I've tested it. So we can rejoice and be merry.


7 points by rntz 5077 days ago | link

I can't say I'm surprised that breaking the abstraction of cons cells that way ended up being a problem. This new implementation, if anything, breaks the abstraction even more (exploiting the fact that viewing a cons cell as a vector happens to result in mutating certain parts of the "vector" having the desired effect on the cons). Arc (anarki at least) really should switch over to using racket's mutable pairs.

Actually, speaking of that, why not just use unsafe-set-mcar! and unsafe-set-mcdr!? I find it far more likely that the representation of mutable pairs will continue to coincide with that of pairs than that the representation of vectors will, and a quick examination at the mzscheme prompt seems to confirm my suspicion. unsafe-set-mcar! and unsafe-set-mcdr! work just fine on cons cells. I'm using plt scheme 4.2.2, not racket, though.

-----

4 points by waterhouse 5076 days ago | link

Egad, you are thoroughly right and I have rewritten the post. mentally smacks myself on the head Thank you, that is a much better implementation. I'm a little bit disappointed that it's so easy...

Uh-oh, the edit window has timed out on my post, and I've noticed that my version doesn't do any type-checking. That is el baddo. So far, it hasn't caused things to crash yet...

  arc> (= x 'meh)
  meh
  arc> (= (car x) 2)
  2
  arc> x
  meh
...but that's just luck and I wouldn't rely on it. [It does destroy the world when you set the car of a table.] Better version is here:

  (require racket/unsafe/ops)
    
  (define x-set-car!
    (let ((fn (namespace-variable-value 'set-car! #t (lambda () #f))))
      (if (procedure? fn)
          fn
          (lambda (p x)
            (if (pair? p)
                (unsafe-set-mcar! p x)
                (raise-type-error 'set-car! "pair" p))))))
  (define x-set-cdr!
    (let ((fn (namespace-variable-value 'set-cdr! #t (lambda () #f))))
      (if (procedure? fn)
          fn
          (lambda (p x)
            (if (pair? p)
                (unsafe-set-mcdr! p x)
                (raise-type-error 'set-cdr! "pair" p))))))

-----

1 point by waterhouse 5076 days ago | link

Man, I really wish I could continue editing my original post, 'cause without the type-checking, it's kind of a nuclear bomb in case someone accidentally uses it on a table. Perhaps I should just have left the post unchanged, or just added "Edit: Look in the comments for improvements" at the bottom.

-----

4 points by akkartik 5076 days ago | link

I still have the edit link (because I can kill stories). Feel free to email me an updated version (address in profile) and I'll replace it for you.

-----

1 point by waterhouse 5076 days ago | link

Excellent. You've got mail.

-----

2 points by akkartik 5077 days ago | link

w00t!!!

Update: I've pushed it out to anarki, my keyword-args repo at https://github.com/akkartik/arc (unit tests all pass), and to readwarp.

-----