Is there some code or library (in Java or any other JVM language; I’m actually working in Scala) that can distinguish between a message ID and an email address?
Since both have essentially the same syntax, the grammar described in the various RFCs cannot tell them apart. Still, when I see one, I can usually tell whether it is an email address or a message ID (message IDs look more “random”). So I would expect it to be possible, though maybe nontrivial, to distinguish them heuristically.
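To make the “more random” intuition concrete, here is a rough Java sketch of the kind of heuristic I have in mind. Everything in it is an assumption of mine rather than something taken from an RFC or an existing library: the class name `IdOrAddress`, the three features (length, digit ratio, and character entropy of the part before the `@`), and all of the thresholds, which would need tuning against real data.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: guesses whether a string of the form local@domain
// is a message ID rather than an email address. Thresholds are untuned guesses.
final class IdOrAddress {

    // Shannon entropy of the string, in bits per character.
    static double entropy(String s) {
        if (s.isEmpty()) return 0.0;
        Map<Character, Integer> counts = new HashMap<>();
        for (char c : s.toCharArray()) {
            counts.merge(c, 1, Integer::sum);
        }
        double h = 0.0;
        for (int n : counts.values()) {
            double p = (double) n / s.length();
            h -= p * Math.log(p) / Math.log(2);
        }
        return h;
    }

    // Fraction of characters that are digits.
    static double digitRatio(String s) {
        if (s.isEmpty()) return 0.0;
        long digits = s.chars().filter(Character::isDigit).count();
        return (double) digits / s.length();
    }

    // Votes on three features of the local part; two or more votes
    // means "probably a message ID".
    static boolean looksLikeMessageId(String candidate) {
        String s = candidate.trim();
        // Message IDs are often written in angle brackets; strip them.
        if (s.startsWith("<") && s.endsWith(">")) {
            s = s.substring(1, s.length() - 1);
        }
        int at = s.lastIndexOf('@');
        String local = at >= 0 ? s.substring(0, at) : s;

        int score = 0;
        if (local.length() > 20) score++;        // long local part
        if (digitRatio(local) > 0.3) score++;    // mostly digits
        if (entropy(local) > 3.5) score++;       // many distinct characters
        return score >= 2;
    }
}
```

With these guessed thresholds, `IdOrAddress.looksLikeMessageId("john.smith@example.com")` returns false and `IdOrAddress.looksLikeMessageId("<20240115093012.4A7F2C81B3@mail.example.org>")` returns true, but short message IDs and machine-generated addresses would easily fool it, which is why I am asking whether something more robust exists.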
Background: I am scraping links and addresses from text files and want to give the user the option to open them. For an email address, I want a new email window to open; for a message ID, I want the corresponding email to be displayed (after searching for it).
One solution I thought of: since I have a good training set (years of emails), it would probably be possible to train a machine learning algorithm to distinguish the two. The main problem is that I have little experience with these things and wouldn’t even know which kind of ML algorithm to use, which library to pick, and so on.