I have a Windows 8.1 machine now, and it's still the same process. :)
The biggest pain was when I was trying to download the Anarki GitHub repo in the first place. I've been cloning GitHub repos over SSH URLs using Cygwin's build of Git, and setting this up on Windows 8 was a bit surprising: http://stackoverflow.com/questions/9561759/why-cannot-chmod-...
Nowadays, there are probably easier ways to get Anarki on Windows, like using GitHub for Windows or just using the GitHub website's "Download ZIP" button.
I've recently been pondering the role of errors in language design, and this is where I'm at right now. My train of thought takes some rickety detours into the far future and ethics, so I took a week to clarify my thinking before I submitted this here.
I privately asked for akkartik and evanrmurphy's help along the way, and they had some good responses that helped me notice the most unclear spots. I hope they'll respond again now that it's here on Arc Forum. :)
I really liked this point: "In preference to having usage errors, the system tends to support ad hoc behaviors that seem useful enough for now." I think using this principle you can deal with design errors in useful ways while avoiding some of the controversy about cloning developers' minds and such. What I'm thinking of is the error mechanism that lets you get around whatever abstraction is causing problems at the moment.
For example, say you're designing a survey with multiple-choice questions. A glaring design hole with multiple-choice questions is, what if my answer isn't one of the choices! The tried-and-true error mechanism for this is to provide an "Other" choice that opens up a text box where the user can provide whatever answer they want. "Other" is a multiple choice answer that breaks down the whole abstraction of multiple choice answers. There's no answer you could have that it can't handle! "Other" is so awesome. IMO it's even superior to having a clone of the survey designer's mind to chat with when I have an answer that's not in the list.
Yeah, that's a pretty good example of how a design hole might play out. :) The developer wants to present a multiple-choice question, not a text box that happens to have default options off to the side. If they add an "other" choice, that's plan B, for when plan A isn't quite what the user wants to participate in.
If we say the developer has plan A, plan B, plan C, and so on, it's probably more honest to say these are a continuum. They're like layers of sediment, and a design hole in one layer just means we can step down into the next one. But the deeper levels of developer intentions are more fluid and fickle, and after some point (plan N, plan O, plan P...), the developer has no idea what they want. They'll just leave the problem for someone else to solve, whether it's their future self, a mind clone, a client, a friend, etc.
In this blog post, I imagined that cliff would be pretty drastic; one portion of the intentions is absolutely precise, and the other part is so open-ended that it must be handled by an intelligent representative. The dividing line is whether the program code contains explicit instructions, or whether it invokes an error mechanism. But it would be nice to acknowledge that this isn't the only line on the map.
Hmm, here's a possible two-dimensional map:
---
Rigor Dimension
Rigor 1: Code so rigorous and comprehensive, yet with such a simple interface, that you could call it a mathematical theory. Code like this almost never needs to change, but sometimes it isn't the right code for the task at hand.
Rigor 2: Code that is built using simple mechanical metaphors on a simple mechanical foundation (like most of today's programming language semantics), but is expected to change to some degree. This layer might be further divided into external API definitions, libraries, and application code, in order of decreasing stability.
- (In the blog post, my dividing line was somewhere around here.) -
Rigor 3: Code that relies on intricate and data-heavy techniques but can still be engineered in fuzzy ways, like AI.
Rigor 4: Code that relies on a true universal intelligence embedded in the program. Mind clones go here. So do human pilots.
---
Stubbornness Dimension
Stubbornness 1: Code that gets deployed with the program and always stays. Without this code, it just isn't the same program.
- (In the blog post, my dividing line was somewhere around here.) -
Stubbornness 2: Code that gets deployed with the program but actually isn't as precise as it looks. Perhaps it represents initialization data that may change over time due to external input, or perhaps it describes an ideal outcome that an AI system will try to approximate.
Stubbornness 3: Code which can be modified, and which is completely useless until that modification actually occurs, except perhaps for the useful behavior of begging an external system for help.
---
Actually, I hesitate to give the above lists without mentioning some other interesting kinds of code that don't quite seem to have a place in those two dimensions. I think it's actually possible to organize these outliers into two more dimensions:
---
Immersion Dimension
Immersion 1: Code which exists in such a transitory way that the developer hardly even considers it their code in the first place. For instance, a player can control Mario, but Mario is just a part of a much bigger program beyond the player's control, and (I think) players rarely consider themselves to be developers of Mario control programs.
Immersion 2: Code which communicates with the developer in a way that's extremely inexpensive for them both. A player can control Mario without bothering to open a text editor, so something like this might be possible for programming too.
Immersion 3: Code which, if modified, somehow exposes its modifications to the developer(s) for possible inclusion in future deployments. Copyleft licenses are a non-mechanical example of this. Another example is a personal shell script which never leaves the developer's machine to begin with.
Immersion 4: Code which is completely detached from the developer once it's deployed. (In the blog post, I only considered this kind of code.)
---
Openness Dimension
Openness 1: Code which an external system can modify using tools very similar to the ones that were used to write the original code.
Openness 2: Code which was created using certain tools but must be modified using others, like a special reflection API, scripting language, or bytecode generation.
Openness 3: Code which can't be modified.
(In the blog post, this dimension was irrelevant to me.)
---
(Stubbornness 1 and Openness 3 sound like the same thing, and I think Immersion 0 would be the same thing too. It might help to reverse the rankings on some of these axes; I just put them in an order that made them relatively easy to explain.)
So there, I guess I probably won't divide programs in quite the way I described in that blog post. I now have plenty of rope to confuse myself with. :-p
Hmm, what are some examples of languages with design holes, or language mechanisms that help programmers manage design holes? Lower down you suggest that an error when adding a string to a non-string is a design hole. The creators of Java would disagree. Any language designer would say that at least some of his errors are 'designed in'. So what's the subset of error messages you're getting at?
Like I said privately, I disagree that exceptions are always discouraged. They're encouraged in Python. It sounds like they're discouraged in JavaScript. But I'm not sure you can generalize to other languages, or to 'error mechanisms' in general. JavaScript doesn't seem to have any problem throwing syntax errors on the console.
"Hmm, what are some examples of languages with design holes, or language mechanisms that help programmers manage design holes?"
Design holes are one part of a venn diagram: They're not a part of the program's design, and yet it's possible to encounter them in the program's behavior.
If a specification document says a behavior is unspecified--I think the C spec is notorious for this--then that's where you'll find a concrete example of a design hole. The design is concrete, and any implementation is concrete, so their difference is concrete.
Personally, I see design holes whenever I want to run my program without finishing it. The unfinished parts are gaping holes.
---
"Lower down you suggest that an error when adding a string to a non-string is a design hole."
It's an example of a design hole if the designer doesn't care what happens.
---
"Any language designer would say that at least some of his errors are 'designed in'."
I'd say if the language designer really wants people to avoid a certain design hole, they can put a little fence around it. The fence is part of the design, but on the other side, there's still a hole.
---
"Like I said privately, I disagree that exceptions are always discouraged."
Inasmuch as they're not discouraged, they don't count as an error mechanism. This is a gray area.
I know this is a slippery response to give, but this is about the way I set up my terminology, rather than the purposes I have for talking this way.
I've never done Hindley-Milner type inference before, and I just took a look at Poly. It's nice to see the solve function there, looking nice and simple like I hoped. XD It seems Algorithm W just treats the program as a graph of type equality constraints, and it does substitutions and such until it runs out of equations to process, at which point it's collected a full map of type variable bindings. I hope that makes sense.
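To make the "graph of type equality constraints" reading concrete, here is a minimal sketch of a constraint solver in Python. It is not Poly's solve function and not a full Algorithm W; the type representation (strings for type variables, tuples for constructors) and every function name are assumptions made up for this illustration.

```python
# Hypothetical mini-solver illustrating unification of type equations.
# Type variables are strings like "'a"; constructors are tuples such as
# ("int",) or ("fn", argument_type, result_type).

def is_var(t):
    return isinstance(t, str)

def resolve(t, bindings):
    """Follow variable bindings until reaching an unbound variable or a constructor."""
    while is_var(t) and t in bindings:
        t = bindings[t]
    return t

def occurs(var, t, bindings):
    """True if var appears inside t, which would describe an infinite type."""
    t = resolve(t, bindings)
    if is_var(t):
        return t == var
    return any(occurs(var, arg, bindings) for arg in t[1:])

def solve(equations):
    """Process equality constraints until none are left; return the variable bindings."""
    bindings = {}
    pending = list(equations)
    while pending:
        left, right = pending.pop()
        left, right = resolve(left, bindings), resolve(right, bindings)
        if left == right:
            continue
        if is_var(left):
            if occurs(left, right, bindings):
                raise TypeError("infinite type")
            bindings[left] = right
        elif is_var(right):
            pending.append((right, left))  # flip so the variable is on the left
        elif left[0] == right[0] and len(left) == len(right):
            pending.extend(zip(left[1:], right[1:]))  # equate the parts pairwise
        else:
            raise TypeError(f"cannot unify {left} with {right}")
    return bindings

equations = [("'f", ("fn", "'a", "'b")), ("'a", ("int",)), ("'b", "'a")]
bindings = solve(equations)
result_type = resolve("'b", bindings)
```

When the loop runs out of equations, `bindings` is the collected map of type variable bindings, and `resolve` reads a variable's final type out of it.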
Well, this is a pretty sleek language syntax, and the combination of features seems like a good start for some cozy programming. I think I'd be happy to use this in place of JavaScript, if only I didn't care about Web browsers and standards compliance. :-p
To describe Pyret, its primary purpose is to be a teaching language, and I think its most remarkable feature on a technical basis seems to be the way it does type annotations and unit tests. I think these are just run time contracts for now, but they'll somehow pave the way for a static type system, limited by a policy that there must always be a way to understand the static type system in terms of run time behavior (https://news.ycombinator.com/item?id=6704276).
I don't see very much value (but I do see some) in a static type system if all it does is promote some run time errors to compile time. Module API enforcement and program inference are what I mainly care about as far as static types go, and Pyret doesn't seem to provide either of these. (Contracts don't quite reassure me that a client module will continue to obey the API throughout the lifetime of a multi-step interaction like a handshake or a higher-order loop. A contract violation error can occur partway through.)
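The parenthetical about contracts failing partway through can be sketched in Python. This is not Pyret code; `checked`, `apply_all`, and `flaky` are hypothetical names, and the contract here is just a predicate applied to each return value at run time.

```python
def checked(contract, fn):
    """Wrap fn so each return value is checked at run time (hypothetical helper)."""
    def wrapper(*args):
        result = fn(*args)
        if not contract(result):
            raise TypeError(f"contract violated by result {result!r}")
        return result
    return wrapper

def apply_all(callback, items, log):
    """A higher-order loop that performs one side effect per step."""
    for item in items:
        log.append(callback(item))

# A client callback that only sometimes obeys the "returns an int" contract.
flaky = checked(lambda r: isinstance(r, int),
                lambda x: x * 2 if x < 3 else "oops")

log = []
try:
    apply_all(flaky, [1, 2, 3, 4], log)
except TypeError:
    pass
# The first two steps already ran and left their effects in log
# before the violation was detected on the third.
```

The check does catch the bad value, but only after part of the interaction has already happened, which is the gap a static guarantee would close.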
If they ever weaken their policy about the type system mimicking run time checks, I think Pyret could support some special types and tests which require compile time processing if they're used. At the same time, it could still use all its expressive run time contracts. I'd really like to see this synthesis, but I don't really expect to; the implementors probably have enough to worry about just to build a static typechecker at all. :)
I wonder if Pyret's return value checks inhibit tail recursion optimization. Perhaps they can collapse into a single stack frame if they're repetitions of the same check.
The culprit is that 'err is defined to be Racket's 'error. It looks like every single use case of 'error is discouraged for one reason or another in the Racket reference:
- (error sym) creates a message string by concatenating "error: " with the string form of sym. Use this form sparingly.
- (error msg v ...) creates a message string by concatenating msg with string versions of the vs (as produced by the current error value conversion handler; see error-value->string-handler). A space is inserted before each v. Use this form sparingly, because it does not conform well to Racket’s error message conventions; consider raise-arguments-error, instead.
- (error src frmat v ...) creates a message string equivalent to the string created by
(format (string-append "~s: " frmat) src v ...)
When possible, use functions such as raise-argument-error, instead, which construct messages that follow Racket’s error message conventions.
Er, I knew it was weird for me to say "'err is defined to be Racket's 'error," but I just realized, that factoid was in the original post of this thread. :-p
In order to really define "backward compatible," you'd have to define Arc in a way that's implementation-independent. In Arc, the code is the spec, so as soon as the code changes, compatibility becomes subjective.
For instance, suppose Anarki defines a new utility and uses it to simplify the implementation of 10 other utilities. (It does this in a few places.) Now suppose my Arc 3.1 code has defined a utility with exactly the same name, and running this code on Anarki causes those other 10 utilities to misbehave, thus wrecking my program. This is a case where Anarki isn't compatible with Arc 3.1, but since it's so easy for me to choose a different name for my utility, it's hardly even worth mentioning. Pretty much any substantial update to Arc would break it in exactly the same way.
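Here is a toy model of that name collision in Python, with a dictionary standing in for Arc's single global namespace. The utility names `pairup` and `sum-pairs` are invented for the illustration and are not taken from Anarki.

```python
# A toy global namespace, standing in for Arc's top-level environment.
env = {}

# The newer platform adds a helper and builds another utility on top of it.
env["pairup"] = lambda xs: list(zip(xs[::2], xs[1::2]))  # hypothetical helper
env["sum-pairs"] = lambda xs: [a + b for a, b in env["pairup"](xs)]

before = env["sum-pairs"]([1, 2, 3, 4])

# Older user code happens to define its own utility under the same name.
env["pairup"] = lambda xs: xs

try:
    env["sum-pairs"]([1, 2, 3, 4])
    broke = False
except TypeError:
    broke = True  # the dependent utility now misbehaves
```

Renaming the user's `pairup` is enough to fix it, which is why this kind of incompatibility is hardly worth mentioning.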
There's only one difference between Arc 3.1 and Anarki that's ever gotten in my way, and that's the way Anarki has revamped the [...] syntax to support multi-argument functions. When I say [do t] or [do `(eval ',_)], Anarki treats these as 0-arity functions, and when I say [let (a . b) _ ...], Anarki chokes when trying to search the dotted list for any underscored variables. Once again, this is the kind of change that's pretty easy to work around, and I can't really say Anarki is worse for having this extra functionality.
I'd say Arc platforms are not really portable with each other, in the sense that not all code that works on one platform will work on another. However, I've found it pretty easy to develop my code so it'll work on multiple Arc platforms at the same time.
This is a horrible change. I didn't respond to the original post because of "Thumper's rule," and I couldn't rebut thaddeus in time to stop you. :(
-
== Problems with functionize in general ==
The functionize-based utilities are discontinuous about the way they detect underscores: The body can use _ three times, or two times, or one time, but as soon as it uses _ zero times, it means something completely different. Thanks to this, I can break several layers of code by making a single local edit. But will I? Yes:
The 'treemem function detects occurrences of _ without regard for quoting or local scopes. So if I use an _ to activate one functionize-based utility, then I'll accidentally activate all the other functionize-based utilities which surround that one. If I want to avoid refactoring several layers of code each time I edit, I pretty much have two options:
- I can avoid putting an _ anywhere in my code, in which case this functionize feature won't be very useful to me.
- I can make sure to activate each and every functionize utility as soon as I use it, in which case they would have been better off as 'let variants. For instance, I might settle on the idiom (zap (do '_ ...) foo), but it would be more convenient to say (zaplet orig foo ...).
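A small Python model may help show why the detection is fragile. This is not Anarki's implementation; `treemem` here is a simplified re-creation over nested lists, and `functionize` is a hypothetical stand-in that wraps a body only when it mentions `_`.

```python
def treemem(x, tree):
    """True if x appears anywhere in a nested list, ignoring quoting and scope."""
    if tree == x:
        return True
    if isinstance(tree, list):
        return any(treemem(x, sub) for sub in tree)
    return False

def functionize(body):
    """Hypothetical stand-in: wrap body in a one-argument function only if it mentions _."""
    return ["fn", ["_"], body] if treemem("_", body) else body

plain = functionize(["foo", 1])                        # left alone
direct = functionize(["+", "_", 1])                    # wrapped, as intended
quoted = functionize(["foo", ["quote", "_"]])          # wrapped, although _ is quoted
nested = functionize(["map", ["fn", ["_"], ["+", "_", 1]], "args"])  # wrapped because of an inner function's _
```

The last two cases are the accidental activations: an `_` that belongs to a quotation or to an inner function still flips the behavior of every enclosing utility.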
-
== Problems in Arc ==
Your Anarki commit is one of those things that is "guaranteed to break all your code." Personally, I like using the pattern (zap [map [...] _] args), which now breaks since the _ activates zap's automatic function wrapper. It seems you would want me to write (zap (map [...] _) args) instead, but for compatibility with Rainbow, Jarc, etc., I think I'll define a macro (itfn ...) and write (zap (itfn:map (itfn:...) it) args). Effectively, I'll be recreating the [...] functionality in the way I like.
Meanwhile, Arc already covers a lot of the functionality of Clojure's -> and ->> operators using (aand ...). If you still miss -> or ->>, I recommend just implementing an 'aand variant that doesn't short-circuit on nil. Call it 'ado or something.
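As a rough Python model of the difference, the sketch below threads a value through a series of steps, with each step receiving the previous result as `it`. `None` stands in for nil, and the function named `ado` is the hypothetical variant suggested above, not an existing Arc utility.

```python
def aand(value, *steps):
    """Thread value through steps, stopping at the first None (stand-in for nil)."""
    it = value
    for step in steps:
        if it is None:
            return None
        it = step(it)
    return it

def ado(value, *steps):
    """Hypothetical variant: same threading, but it never short-circuits."""
    it = value
    for step in steps:
        it = step(it)
    return it

found = aand({"a": {"b": 1}}, lambda it: it.get("a"), lambda it: it.get("b"))
missing = aand({}, lambda it: it.get("a"), lambda it: it.get("b"))
passed_on = ado(3, lambda it: None, lambda it: it is None)
```

With `aand`, the missing key stops the chain quietly; with `ado`, the nil is handed to the next step, which is what a plain threading operator like -> would do.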
Just spotted your comment while I'm working on something else. Haven't digested it all, but judging from the first five words -- feel free to revert! It was definitely intended as an experiment, and I'm not attached to it. I may well do so myself later today if you don't get to it first.
Ok, done reading now, and you're right, I'll revert it.
I can only defend myself against the wart section :) In wart the pipe operator can only take two args and is intended to be used in infix. I use a non-infix transform for more args, and for prefix mode in general: https://github.com/akkartik/wart/commit/ec0f9a38b8
My weak defense for the rest: functionize and the _ syntax was only intended for tiny expressions.
"I can only defend myself against the wart section :) In wart the pipe operator can only take two args and is intended to be used in infix. I use a non-infix transform for more args, and prefix mode in general"
Oh, so you're pursuing both options at once. I look forward to you figuring out what kind of indentation you prefer here. :) My "considerations about wart" section was only wishy-washy anyway.
---
"For the rest, my weak defense is that functionize and the _ syntax is only intended for tiny expressions."
In Penknife, when I used the a'b operator as sugar for (b a), I found I ended up with a few really long lines of a.b.c'd'e.f, so it kinda suffered from its own success. ^_^ My a'b is the same as your (a -> b._), and it exactly corresponds to your no-underscore special case, (a -> b), so I expect you to have the same issue.
I suspect these syntaxes actually have a special tendency to let sugar accumulate, driving them away from the ideal "tiny expressions" case. Specifically, they make it possible to inject new code without breaking apart the surrounding sugar first:
a.b.f.c.d # before refactoring
a.b."foo".c.d # illegal
a.b -> (_ "foo") -> _.c.d # legal? (not quite the example you gave)
a.b'[itfn:it s.foo].c.d # Penknife code of similar generality
Fortunately for wart, its infix operators allow whitespace in between, which possibly means you can write these long expressions on multiple lines. (That wasn't the case in Penknife.)
"In Arc (and similar), there happens to be the falsehood-creep into the empty list. I'm not sure I really like that, because it isn't maximally consistent: why aren't other empty sequences false, too? Just do away with the question by having a canonical false value all its own. Then you still get some of the code-golf benefits of having everything else be true."
This is exactly what my preference would be too. Thanks for saying it first. :)
Well, this ended up leading in different directions than I expected, so I'll be more specific about my opinions here.
I like the idea of the main (if ...) semantics being just another equality check or dynamic type check: "Is this nil?" If falsiness overlaps with multiple other dynamic types, then we end up having confusing crosshatching where one extension wants to do X with any falsy value and another extension wants to do Y with any list.
Secondarily, I also see some benefit in distinguishing between () and #f, because then it's possible to dispatch on whether something is a list or a boolean. But I'm also happy if we don't have booleans at all, because then "Is this nil?" can just be a special case of "Is this a list?"
An interesting turnaround happens with this philosophy, too: instead of treating "the" empty sequence as false, you can treat false as though it's an empty sequence. This is what Factor does: http://docs.factorcode.org/content/article-sequences-f.html
So maybe if Arc spelled the empty list like () and nil was the singleton false value (so that (is nil ()) was nil), then map/each/etc. could still work on nil just fine. It's just that (if () 'a 'b) would evaluate to 'a instead. Not saying it's the best way, but it's certainly an option.
Interesting. One quibble with this idea: it doesn't matter as much that map et al work on nil if nil isn't at the end of each list.
So perhaps the reason for the empty list to be special is that so many list algorithms are recursive in nature, and it's nice to be able to say "if x recurse" rather than "if !empty.x recurse". Hmm, the empty array or empty string isn't included in every array/string respectively, so perhaps it's worth distinguishing from nil in some situations..
I just ran into a case where I wished the empty list wasn't the same as the false value. When implementing infix in wart (http://arclanguage.org/item?id=16775) I said: "Range comparisons are convenient as long as they return the last arg on success (because of left-associativity) and pass nils through."
(a < b < c)
=> (< (< a b) c) ; watch out if b is nil
(< nil x) ; should always return nil
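The comparison scheme can be modeled in Python as follows. This is only a sketch of the stated rule (return the last argument on success, pass nils through), not wart's implementation; `NIL` and `lt` are names chosen for the illustration, with `None` standing in for nil.

```python
NIL = None  # stand-in for wart's nil, which doubles as false and the empty list

def lt(a, b):
    """Return b if a < b, otherwise NIL; a NIL on either side passes through."""
    if a is NIL or b is NIL:
        return NIL
    return b if a < b else NIL

in_order = lt(lt(1, 2), 3)       # (a < b < c) with a=1, b=2, c=3
out_of_order = lt(lt(2, 1), 3)   # the inner comparison fails
nil_in_middle = lt(lt(1, NIL), 3)
```

A failed comparison and a nil operand produce the same result, so once nil is also a legitimate value (the empty list), the caller cannot tell the two apart.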
Ok, I'm now experimenting with a new keyword in wart called false.
a) There's still no boolean type. The type of false is symbol. (The type of nil has always been nil; maybe I'll now make it list.)
b) if treats both nil and false as false-y values.
c) nil and false are not equal.
d) Comparison operators now short-circuit on false BUT NOT nil.
I can mostly use either in place of the other. But I'm trying to be disciplined about returning false from predicates and nil from functions returning lists.
Wart now has four hard-coded symbols: nil, object, caller_scope and false.[1]
Thoughts? It was surprisingly painless to make all my tests pass. Can anybody think of bugs with this kinda-unconventional framework? If you want to try it out:
$ git clone http://github.com/akkartik/wart
# Optionally "git checkout 0ff47b6bce" if I later revert this experiment.
$ cd wart
$ ./wart
ready! type in an expression, then hit enter twice. ctrl-d exits.
[1] fn is just a sym with a value:
let foo fn (foo () 34)
=> (object function {sig, body})
Technically, my first thought was that something was broken. Hitting C-d as soon as I got the prompt:
$ time ./wart
ready! type in an expression, then hit enter twice. ctrl-d exits.
=> nil
real 0m29.200s
user 0m27.602s
sys 0m0.000s
Anyway, I was going to test to see if you had Arc's t; but it doesn't look like it:
(if t 'hi 'bye)
020lookup.cc:28 no binding for t
=> bye
Note that it's trivial to add:
(<- t 't)
=> t
(if t 'hi 'bye)
=> hi
The reason I thought to try this was because I initially balked at maintaining false and nil at the same time with the same truth values. Then I thought of t, and suddenly the pieces clicked together: at least in part, it seems like you just want a Python-like system anyway.
Once I got the landscape laid out in my head, I started objecting to it less, because I could make sense of it. You're most of the way there:
- false is a separate, canonical false value.
- t (if you chose to have it) is a separate, canonical truth value.
- nil is an empty list, but empty lists are false.
Compare to Python's True, False, and []. The major differences being:
1. No first class boolean type. In wart, this produces more of a disconnect between t and false. t (i.e., 't) is just a normal symbol whose truth value is incidental. But false is a special, unassignable keyword.
(<- false 'hi)
=> hi
false
=> false
Python lacks symbols (you can't just say True = 'True), so this disconnect between symbolic value and keyword doesn't exist. There is still, however, a different sort of disconnect in Python because the "first class" boolean type gets contaminated by the int type:
2. You don't take Python's next logical leap. Since you already make the empty list false, other values become fair game, such as the thread's original idea (make 0 false), the empty string, other empty data structures, etc. But like I said before, I make do in such systems. Keeping nil falsy is really just your prerogative, if you want to avoid calls to empty? that much. ;)
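For reference, the overlap between Python's boolean and integer types mentioned in point 1 can be observed directly. The variable names below are only for illustration.

```python
is_subclass = issubclass(bool, int)   # bool is defined as a subclass of int
total = True + True                   # booleans take part in arithmetic
picked = ["no", "yes"][True]          # and work as list indexes
same_as_one = (True == 1) and (False == 0)
```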
Thanks for trying it out, and for the comments! Yeah it's gotten slow :(
I hadn't realized how close to python I've gotten. Seems right given how the whitespace and keyword args are inspired by it. On rosetta code I found a cheap way to get syntax highlighting was to tag my wart snippets with lang python :)
I've been using 1 as the default truth value, and it's not assignable either. I was trying to avoid an extra hard-coded symbol, but now that I've added false perhaps I should also add true.. I'm not averse to going whole-hog on a boolean type, I'd just like to see a concrete use case that would benefit from them. pos seems a reasonable case for keeping 0 truth-y, and the fact that lists include the empty list seems a reasonable case so far to keep nil false-y. But you're right, I might yet make empty strings and tables false-y.
(True, False = 0, 1 :( That's the ugliest thing I've ever seen python allow. At least throw a warning, python! Better no booleans than this monstrosity.)
"pos seems a reasonable case for keeping 0 truth-y"
While I personally like 0 being truthy, I don't see this as a convincing reason.
I'd treat 'pos exactly the same way as 'find. They're even conceptually similar, one finding the key and the other finding the value. For 'find, the value we find might be falsy, so truthiness isn't enough to distinguish success from failure. The same might as well be true for 'pos.
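The parallel between the two can be sketched in Python, where 0 happens to be falsy. These `find` and `pos` definitions are simplified stand-ins for the Arc utilities, with `None` playing the role of nil.

```python
def find(pred, xs):
    """Return the first element satisfying pred, or None if there is none."""
    for x in xs:
        if pred(x):
            return x
    return None

def pos(pred, xs):
    """Return the index of the first element satisfying pred, or None."""
    for i, x in enumerate(xs):
        if pred(x):
            return i
    return None

found_value = find(lambda x: x == 0, [3, 0, 5])   # succeeds, but the value is falsy
found_index = pos(lambda x: x == 3, [3, 0, 5])    # succeeds, but the index is falsy
not_found = find(lambda x: x == 9, [3, 0, 5])

# A plain truthiness test would misreport both successes as failures.
looks_like_failure = (not found_value) and (not found_index)
```

In both cases the caller needs some test other than truthiness to distinguish "found a falsy result" from "found nothing", so making 0 truthy for the sake of 'pos only addresses half of the pair.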
---
"But you're right, I might yet make empty strings and tables false-y."
What if the table is mutable? That's an interesting can of worms. :)
JavaScript has 7 falsy values, all of which are immutable. If we know something's always falsy, we also know it encodes a maximum of ~2.8 bits of information--and usually much less than that. It takes unusual effort to design a program that uses all 7 of those values as distinct cases of a single variable.
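The ~2.8 figure is just the base-2 logarithm of the number of distinguishable values, which can be checked in Python:

```python
import math

falsy_count = 7                 # the count of falsy values given above
bits = math.log2(falsy_count)   # information needed to pick one of 7 values
rounded = round(bits, 1)
```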
This means if we have a variant of Arc's (and ...) or (all ...) that short-circuits when it finds a truthy value, we don't usually have to worry about skipping over valuable information in the falsy values.
If every mutable table is falsy as long as it's empty, then a falsy value can encode some valuable information that a practical program would care about, namely the reference to a particular mutable table.
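Python happens to work this way, since an empty dict is falsy, so it offers a concrete case of the hazard. The `register` function below is a made-up example of a common idiom, not code from any of the languages discussed.

```python
def register(name, registry=None):
    """Record name in registry, creating a fresh table if none was given."""
    registry = registry or {}   # an empty dict is falsy, so the caller's table is dropped here
    registry[name] = True
    return registry

shared = {}
result = register("a", shared)
# The caller's table was skipped over precisely because it was empty.
lost_reference = result is not shared
```

The falsy value carried real information (which table to mutate), and a truthiness-based shortcut silently discarded it.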
---
"(True, False = 0, 1 :( That's the ugliest thing I've ever seen python allow. At least throw a warning, python! Better no booleans than this monstrosity.)"
The PEP describes the design and rationale of introducing booleans to Python this way. Version 2.3 implements this. Version 2.2.1 preemptively implements bool(), True, and False to simplify backporting from 2.3.
Notably, the variable names "True" and "False" were chosen to be similar to the variable name "None", and all three of these are just variables, not reserved words.
Later, version 2.4 made it an error to assign to None:
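As a point of comparison beyond the versions discussed in this thread, Python 3 goes further and treats None, True, and False as keywords, so assigning to any of them is rejected at compile time. The sketch below assumes a Python 3 interpreter; `assignment_is_rejected` is a helper name invented for the illustration.

```python
def assignment_is_rejected(source):
    """True if Python refuses to even compile the given source text."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return True
    return False

none_rejected = assignment_is_rejected("None = 1")
swap_rejected = assignment_is_rejected("True, False = 0, 1")
ordinary_rejected = assignment_is_rejected("t = 1")
```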
I've added some messages to at least set expectations on how slow it is:
$ wart
g++ -O3 -Wall -Wextra -fno-strict-aliasing boot.cc -o wart_bin # (takes ~15 seconds)
starting up... (takes ~15 seconds)
ready! type in an expression, then hit enter twice. ctrl-d exits.
Oh, there's also the issue with 'find. It's already cumbersome to search a list of booleans for false and a list of lists for an empty list, and now it would be difficult to search a list of numbers for zero.
Of course, we could just do the same thing as 'pos: