In Haskell, there are a few different options for parsing text. I know of Alex & Happy, Parsec, and Attoparsec, and probably some others.
I’d like to put together a library where the user can input the pieces of a URL (scheme, e.g. HTTP; hostname; username; port; path; query; etc.), and I’d like to validate those pieces against the ABNF specified in RFC 3986.
In other words, I’d like to put together a set of functions such as:
validateScheme :: String -> Bool
validateUsername :: String -> Bool
validatePassword :: String -> Bool
validateAuthority :: String -> Bool
validatePath :: String -> Bool
validateQuery :: String -> Bool
What is the most appropriate tool to use to write these functions?
Alex’s regexps are very concise, but Alex is a tokenizer and doesn’t straightforwardly let you parse with specific grammar rules, so it’s not quite what I’m looking for, though perhaps it could be wrangled into doing this easily.
I’ve written Parsec code that does some of the above, but it looks very different from the original ABNF and unnecessarily long.
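For a sense of how close the two notations can get: RFC 3986 defines `scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )`, and a Parsec transcription of just that one rule can stay almost token-for-token with the ABNF (a sketch, not code from this thread; `schemeP` is my own name):

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
-- Note: Parsec's letter/alphaNum are Unicode-aware, so a strict
-- ASCII-only check would need oneOf over explicit ranges instead.
schemeP :: Parser String
schemeP = (:) <$> letter <*> many (alphaNum <|> oneOf "+-.")

validateScheme :: String -> Bool
validateScheme s = either (const False) (const True) (parse (schemeP <* eof) "" s)
```

The `<* eof` is what turns the parser into a whole-string validator rather than a prefix match.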
So, there must be an easier and/or more appropriate way. Recommendations?
I’m curious why your Parsec code was so long; I’d write it something like:
data URL = URL { scheme, hostname, username, port, path, query :: String}
parser :: String -> Either ParseError URL
parser = flip parse "Foo" $ URL <$> parseScheme <*> parseHostname <*> parseUsername ...
validators = [validateScheme . scheme, validateHostname . hostname, .....]
validate :: URL -> Bool
validate url = all ($ url) validators
This looks nice and concise to me. As for how to implement the parser, it’s straightforward to parse each section and then apply the URL constructor to the pieces.
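A runnable sketch of that shape, with a trimmed record and toy component parsers standing in for the real RFC 3986 rules (the field parsers and their names here are my assumptions, not the full grammar):

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Trimmed to three fields for the sketch.
data URL = URL { scheme, hostname, path :: String } deriving Show

-- Toy component parsers; real ones would follow the RFC 3986 ABNF.
schemeP, hostP, pathP :: Parser String
schemeP = many1 (alphaNum <|> oneOf "+-.")
hostP   = many1 (alphaNum <|> oneOf "-.")
pathP   = many (noneOf "?#")

-- Apply the URL constructor to each parsed piece.
urlP :: Parser URL
urlP = URL <$> schemeP <* string "://" <*> hostP <*> pathP <* eof

parser :: String -> Either ParseError URL
parser = parse urlP "url"

-- Per-field checks, composed as described above.
validateScheme, validateHostname :: String -> Bool
validateScheme s   = not (null s)
validateHostname h = not (null h)

validators :: [URL -> Bool]
validators = [validateScheme . scheme, validateHostname . hostname]

validate :: URL -> Bool
validate url = all ($ url) validators
```

Keeping parsing (structure) separate from validation (per-field predicates) also means each validator stays a plain `String -> Bool`, as the question asked for.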