If I have two files, a.txt containing:
meow ✓ bar
and b.txt containing:
meow ⨯ bar
Why can’t the GNU Coreutils command comm tell the difference between the two when $LANG is set to a UTF-8 locale?
$ LANG=C comm a.txt b.txt
meow ✓ bar
meow ⨯ bar
$ LANG=en_US comm a.txt b.txt
meow ✓ bar
meow ⨯ bar
$ LANG=en_US.UTF-8 comm a.txt b.txt
meow ⨯ bar
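For anyone who wants to reproduce this, here is a minimal sketch, assuming a POSIX shell and GNU Coreutils. The scratch directory and the variable names `dir`, `same` and `out` are just placeholders I picked for illustration. It recreates the two files, confirms with cmp that they really differ byte for byte, and runs comm with LC_ALL=C, which overrides LANG and forces byte-wise comparison:

```shell
#!/bin/sh
# Work in a scratch directory so no existing a.txt / b.txt is overwritten.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Recreate the two one-line files from the question.
printf 'meow ✓ bar\n' > a.txt
printf 'meow ⨯ bar\n' > b.txt

# Byte-level check: cmp -s exits non-zero when the files differ.
if cmp -s a.txt b.txt; then same=yes; else same=no; fi
echo "identical bytes: $same"

# LC_ALL takes precedence over LANG and LC_COLLATE, so this run
# compares lines byte by byte regardless of the surrounding locale.
out=$(LC_ALL=C comm a.txt b.txt)
printf '%s\n' "$out"
```

With byte-wise comparison the a.txt line lands in column 1 and the b.txt line in column 2, as in the LANG=C run above. Replacing LC_ALL=C with LC_ALL=en_US.UTF-8 (if that locale is installed) is how I get the single-line output.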