Character Classes
Regular expressions can also include the following backslash escapes to refer to common classes of characters:
\w any word constituent character (same as [a-zA-Z0-9_])
\W any character but a word constituent
\d a digit (same as [0-9])
\D anything but a digit
\s a white space character
\S anything but a white space character
\n a line separator character (CR or LF)
\N anything but a line separator character (CR or LF)
These escapes are also allowed in character classes: '[\w+-]' means 'any character that is either a word constituent, or a plus, or a minus'.
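The escapes above can be tried out with a short sketch. It uses Python's re module purely as a stand-in, which is an assumption: it is not the engine described here, and details may differ. In particular, Python has no bare \N escape, so the sketch substitutes the negated class [^\r\n]. The variable names are illustrative only.

```python
import re

# Assumption: Python's re module stands in for the engine documented here.
# re.ASCII restricts \w, \d and \s to ASCII, so \w behaves like [a-zA-Z0-9_].
word_or_sign = re.compile(r"[\w+-]+", re.ASCII)
not_digit = re.compile(r"\D", re.ASCII)

# Python's re has no bare \N escape; [^\r\n] is used as a stand-in for it.
not_line_separator = re.compile(r"[^\r\n]+")

# Runs of word constituents, plus signs, or minus signs.
tokens = word_or_sign.findall("x_1 += -42; y")

# Everything up to the first CR or LF.
first_line = not_line_separator.match("first\r\nsecond").group(0)

# First character that is not a digit.
first_non_digit = not_digit.search("2024a").group(0)
```

Because '=' and ';' are not in the class '[\w+-]', they act as separators between the tokens found by findall.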
Character classes can also include the following grep(1)-compatible elements:
[:alnum:] any alphanumeric character, i.e., a word constituent
[:alpha:] any alphabetic character
[:cntrl:] any control character. In this version, it means any character whose code is <32.
[:digit:] any decimal digit.
[:graph:] any graphical character. In this version, this means any character whose code is >= 32.
[:lower:] any lowercase character
[:print:] any printable character. In this version, this is the same as [:graph:]
[:punct:] any punctuation character.
[:space:] any white space character.
[:upper:] any uppercase character.
[:xdigit:] any hexadecimal digit.
Note that these elements are components of the character classes, i.e. they have to be enclosed in an extra set of square brackets to form a valid regular expression. For example, a non-empty string of digits would be represented as '[[:digit:]]+'.
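The nesting rule can be illustrated with a minimal sketch, again in Python and again only as a stand-in. Python's re module does not understand elements such as [:digit:], so the hypothetical helper expand_posix_elements rewrites each element into an explicit ASCII range before compiling. The table POSIX_ELEMENTS is an assumption made for this example: it covers only some of the elements, and it treats [:alnum:] as letters and digits only.

```python
import re

# Hypothetical table: explicit ASCII equivalents for a few bracket elements.
# These ranges are assumptions for illustration, not the engine's definitions.
POSIX_ELEMENTS = {
    "alnum": "a-zA-Z0-9",
    "alpha": "a-zA-Z",
    "cntrl": r"\x00-\x1f",   # codes below 32, as described above
    "digit": "0-9",
    "lower": "a-z",
    "upper": "A-Z",
    "xdigit": "0-9A-Fa-f",
}

def expand_posix_elements(pattern: str) -> str:
    """Replace each known [:name:] element with its explicit ASCII range."""
    def substitute(match: re.Match) -> str:
        # Unknown names are left untouched rather than guessed at.
        return POSIX_ELEMENTS.get(match.group(1), match.group(0))
    return re.sub(r"\[:([a-z]+):\]", substitute, pattern)

# The element sits inside the outer brackets of the character class.
digits = expand_posix_elements("[[:digit:]]+")

# An element can be combined with other members of the same class.
hex_or_dash = expand_posix_elements("[[:xdigit:]-]+")

is_number = re.fullmatch(digits, "2024") is not None
is_hex_id = re.fullmatch(hex_or_dash, "dead-beef") is not None
```

Only the inner '[:digit:]' is replaced, so '[[:digit:]]+' becomes '[0-9]+'. The outer brackets remain and form the character class itself.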