Arc Forumnew | comments | leaders | submitlogin
1 point by partdavid 6129 days ago | link | parent

Ah, you're right, it's a bit hard to see when folks have replied. Yes, I'm an Erlang programmer.

In response to your question, I don't accept your premise that replicating a particular regular expression is a real programming task. You say your email regular expression isn't perfect, but it's not clear to me why you chose those particular set of restrictions beyond what's defined in the RFC--so it's a little hard for me to replicate (for example, the local-part and domain of the address can have a more kinds of characters that what you have defined).

Instead, I'll offer this as a non-equivalent but interesting comparison. I've elided the module declarations (as have you), including the imports that allow some of these functions without their module qualifiers:

  email(S) ->
     [User, Domainp] = tokens(S, "@"),
     {User,
      case {address(Domainp), reverse(tokens(Domainp, "."))} of
         {{ok, Addr}, _} -> Addr;
         {_, RDomain = [Tld|_]} when length(Tld) >= 2,
                                     length(Tld) =< 4 ->
            join(reverse(RDomain), ".")
      end
     }.
I don't know how the terseness of this compares with your example, given that it includes some things that yours doesn't (a way to call it, a format for the return value rather than the capture variables). Terseness, of course, in the pg sense of code tree nodes, whatever they are. :)

The Erlang function above returns a tuple of the local-part and the domain part and throws an error if it can't match the address. If this were something I wanted to ship around to other functions or send to another machine or store in a database table or something, I would have email/1 return a fun (function object) instead.

If either one of us wanted something better than what we have (or even if we don't--it seems like coming up with The Right Thing To Do With Email Addresses is worth a bit of time to do only once) I would write a grammar. The applicable RFC 2822 more or less contains the correct one, which is only a few lines.

At the "low" end of text processing power, there are basic functional operations on strings and lists, and at the "high" end there are grammar specifiers and parser generators. In the band in between lives regular expressions, and I am not convinced that that band is very wide. I like regular expressions (and, indeed, I would like it very much if Erlang had better support for them) but for me they are a specialized tool, particularly useful (like wildcard globbing) for offering as an input method to users.

But they aren't a general solution to every kind of problem, and for that reason I don't think Arc or any other general-purpose language benefits from baking them into the basic syntax--they belong in a library.