Do you know any reasons why the SQL-92 standard changed the glob pattern wildcard characters from * and ? (SQL-89) to % and _? Currently I need to do mask conversions to allow users to search data with * and ?, which they already know from the file system. Also, % and _ need escaping more often, because they tend to appear in data fields more frequently than * or ?. What was the benefit of breaking with the glob convention?
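For concreteness, the conversion I have in mind can be sketched in plain SQL with nested REPLACE calls: escape any literal % and _ (and the escape character itself) first, then map * and ? onto the SQL wildcards. This is only a sketch; mask stands for the user-supplied pattern, ! is an arbitrary choice of escape character, and the table and column names are placeholders.

SELECT *
FROM Customer
WHERE name LIKE REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(mask,
          '!', '!!'),   -- 1. escape the escape character itself
          '%', '!%'),   -- 2. escape literal percent signs
          '_', '!_'),   -- 3. escape literal underscores
          '*', '%'),    -- 4. glob "any sequence"  becomes  SQL %
          '?', '_')     -- 5. glob "any one char"  becomes  SQL _
      ESCAPE '!';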
SQL dates back to the early 1970s. It’s about the same age as UNIX. UNIX-style file name matching wildcards didn’t become ubiquitous until much later.
* was already a reserved character in SQL, as in SELECT * FROM Customer. Having it also be a wildcard would probably have been confusing.
And standardization, of wildcard characters or anything else, always takes “prior art” into account, i.e. what database vendors were already doing in the late 1980s and early 1990s. If no one was using * and ? as pattern-matching wildcards, there is no way those characters would have worked their way into the standard (the vendors are, after all, represented on the standards committee).
I’d also note that pattern matching (LIKE) expressions aren’t exactly analogous to file system globbing. The regular expression functionality found in sed, awk, and many other UNIX utilities is, IMO, a much better analogue. So a better question might be: why didn’t SQL adopt regular expressions for pattern matching?
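For comparison, a rough sketch of the same kind of match written both ways, using placeholder table and column names; the LIKE form is standard, while the regex form shown here uses PostgreSQL's ~ operator (regex support and syntax vary widely by vendor):

-- names starting with "Sm", then exactly one character, then "th"
SELECT name FROM Customer WHERE name LIKE 'Sm_th%';

-- roughly the same match as a regular expression (PostgreSQL syntax)
SELECT name FROM Customer WHERE name ~ '^Sm.th';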
Microsoft Access was based on a database engine called Jet, which offered * and ? as wildcards for LIKE. These were never part of the SQL standard. At some point, Microsoft incorporated a SQL-92 mode which offered the standard characters.
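To illustrate the difference, a sketch with placeholder names (exact behaviour depends on the query mode setting; the default Jet mode also accepts double quotes for string literals, while ANSI-92 mode expects single quotes):

-- Jet / default Access mode: * and ? are the wildcards
SELECT * FROM Customer WHERE LastName LIKE "Sm?th*";

-- ANSI-92 mode: the standard % and _ are used instead
SELECT * FROM Customer WHERE LastName LIKE 'Sm_th%';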
The use of % and _ goes back a very long way. My copy of An Introduction to Database Systems by C.J. Date (4th edition, 1986) describes it that way. IBM DB2 (which first appeared ca. 1983) had it, and I wouldn’t be surprised if its predecessors SQL/DS (1981) and System R (mid- to late 1970s) did, too. SQL in those products became the foundation for the standards.