I’m confused about whether grep search patterns should be quoted or unquoted, and about when a pattern is treated literally in bash.
My experience so far led me to believe that grep automatically treats the search pattern in quotes as a single string. E.g.:
If file.txt contains:
Welcome to stack*overflow !
The line above contains the special character *, which could be read as a literal character in the text file, but which could also be interpreted as a wildcard by the bash shell before grep processes it.
I understand that quotes can be used to treat the pattern as a single string. If I were to call the bash shell with quotes…
> grep "stack*overflow" file.txt
>
… shouldn’t this print the line containing the exact string “stack*overflow”? If not, why does it produce no results when run this way?
Similarly, why does it also produce no results when I call it without quotes:
> grep stack*overflow file.txt
>
What’s the difference between how grep behaves with quotes and without quotes in the bash shell?
This is basically a duplicate of When to wrap quotes around a shell variable? but let’s review specifically for regular expressions.
When the shell sees
grep .* file.txt
it will expand the wildcard if it can; so if you have files named .foo and .bar, you will be running
grep .bar .foo file.txt
i.e. looking for (any character followed by) bar in .foo and file.txt, which almost certainly isn’t what you want.
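To see the expansion for yourself, here is a minimal sketch. It assumes a throwaway directory created with mktemp, and the variable names (demo_dir, unquoted, quoted) are made up for illustration. printf stands in for grep so that each argument the shell actually passes becomes visible.

```shell
# Hypothetical scratch directory so the demo does not touch real files.
demo_dir=$(mktemp -d)
cd "$demo_dir"
touch .foo .bar file.txt

# Unquoted: the shell replaces .* with the matching file names
# before the command runs. printf shows each resulting argument in <>.
unquoted=$(printf '<%s> ' grep .* file.txt)
echo "$unquoted"

# Quoted: the pattern is passed through unchanged as a single argument.
quoted=$(printf '<%s> ' grep '.*' file.txt)
echo "$quoted"
```

Depending on the shell and its version, the unquoted expansion may also include the . and .. directory entries, so the exact first line of output can vary. The quoted form always passes the pattern through unchanged.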
The simplest and most straightforward guideline is to always, forever, without exceptions use quotes around your regular expressions. Single quotes are absolutely the best because they quote verbatim, though sometimes (such as when your actual regular expression needs to contain a single quote) you need to fall back to double quotes, and then understand that any backtick, dollar sign, or backslash needs to be backslash-escaped.
There are corner cases where omitting the quotes is safe, but probably stupid. There are also corner cases around how exactly the shell will expand wildcards (nullglob etc.), but if you make sure you always use quoted regular expressions, that’s irrelevant.
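As a rough illustration of the single-quote versus double-quote rule, consider the sketch below. The sample file and the variable names (sample, single, double) are assumptions made up for this demo.

```shell
# Hypothetical sample file for the demo.
sample=$(mktemp)
printf '%s\n' "it's \$5 today" 'plain line' > "$sample"

# Single quotes: every character reaches grep verbatim.
# grep receives \$5, where \$ is a literal dollar sign in the regex.
single=$(grep '\$5' "$sample")
echo "$single"

# Double quotes: used here because the pattern contains a single quote.
# Backslash and dollar sign are special to the shell inside double quotes,
# so both are backslash-escaped: the shell turns \\\$ into \$ for grep.
double=$(grep "it's \\\$5" "$sample")
echo "$double"
```

Both commands should select the same line. The double-quoted version simply needs an extra layer of escaping because the shell processes the string before grep sees it.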
Tangentially, you also seem to be confused about the differences between shell wildcard (glob) patterns and regular expressions. stack* in a regular expression looks for stac followed by zero or more occurrences of k, as many as possible. See also What are the differences between glob-style patterns and regular expressions? as well as perhaps the Stack Overflow regex tag info page which contains a brief beginner FAQ and links to learning resources.
To search for a literal asterisk, the regular expression is \* or [*]. To search for anything (any character, as many as possible) the regular expression is .* (any character, zero or more). In this case, perhaps you want [^A-Za-z0-9]* which looks for as many nonalphanumeric characters as possible.
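The patterns above can be checked against the line from the question. This is a small sketch, assuming a temporary file in place of file.txt, with variable names invented for the demo. grep -c prints the number of matching lines, and the last command uses grep's -F option (fixed strings, no regex), which is an alternative not mentioned above.

```shell
# Hypothetical stand-in for file.txt from the question.
file=$(mktemp)
printf '%s\n' 'Welcome to stack*overflow !' > "$file"

# stac, then zero or more k, then overflow: the literal * in the file gets in the way.
# (|| true because grep exits non-zero when nothing matches.)
n_plain=$(grep -c 'stack*overflow' "$file" || true)

# Escaped asterisk and bracket expression: both match a literal *.
n_escaped=$(grep -c 'stack\*overflow' "$file")
n_bracket=$(grep -c 'stack[*]overflow' "$file")

# Any characters, or any run of non-alphanumeric characters, between the words.
n_any=$(grep -c 'stack.*overflow' "$file")
n_nonalnum=$(grep -c 'stack[^A-Za-z0-9]*overflow' "$file")

# -F treats the whole pattern as a fixed string rather than a regex.
n_fixed=$(grep -cF 'stack*overflow' "$file")

echo "$n_plain $n_escaped $n_bracket $n_any $n_nonalnum $n_fixed"   # prints: 0 1 1 1 1 1
```

Only the first pattern fails to match, which is exactly the empty result described in the question.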