it also feels inelegant, dirty even, to have globally-accessible functions that are relevant only in a very specific context
Yes, but how do you know that your Arc parser functions are only going to be relevant in the code you've written? Perhaps someday I'll be writing my own parser, or something completely different, and I'll find it useful to use one of your functions in a way that you didn't think of!
I suggest trying to write your code in the simplest possible way. For example, in your original:
(def arc-tokeniser (char-stream)
  (withs (make-token (fn (kind tok start length)
                       (list kind tok start (+ start length)))
"make-token" does not use "char-stream", so we can make this simpler:
(def make-token (kind tok start length)
  (list kind tok start (+ start length)))
Now I can look at "make-token" in isolation. I can easily understand it. I know that all that other stuff in arc-tokeniser isn't affecting it in some way. And, if I'm writing my own parser and I want to use "make-token", I can do so easily.
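As a rough sketch of what that reuse could look like: "word-token" below is a name I made up for illustration, not something from your parser, and I'm assuming a token is just the four-element list that "make-token" builds.

```arc
; make-token pulled out to the top level, as suggested above
(def make-token (kind tok start length)
  (list kind tok start (+ start length)))

; hypothetical helper in some other parser: it builds a 'word token
; for a string found at position start, reusing make-token directly
(def word-token (word start)
  (make-token 'word word start (len word)))

(word-token "foo" 10)   ; returns (word "foo" 10 13)
```

Nothing here needs arc-tokeniser or a char-stream, which is the point: the helper can be called, and tried at the REPL, entirely on its own.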
And sure, down the road there may be some other library that also defines "make-token". At that point, it will be easy to make a change so that they work together. Perhaps by renaming one or the other, or by doing something more complicated. The advantage of waiting is that then we'll know which functions actually conflict, instead of going to a lot of work now to avoid any possibility of future conflict, the majority of which may never happen.
Now of course I'm not saying to pull every single function out of arc-tokeniser. You have some functions that depend on char-stream and token and states and so on, so it makes perfect sense to leave those inside arc-tokeniser. My claim is that today you should write the simplest possible parser.arc library, explicitly not worrying about future namespace clashes, and that it is better to deal with those clashes in the future, when they actually happen.