Wildcards In UNIX
Linux offers the facility to perform an operation on a set of files without having to list the names of all the files on which the operation has to be performed. This is made possible by using certain special characters in the command in place of the actual filenames. The shell interprets these special characters as a pattern. It then compares all filenames under the directory specified in the command against that pattern, and the command is executed on the files whose names match.
How to use UNIX Wildcards
Many computer operating systems provide ways to select certain files without typing complete filenames. For example, we may wish to remove all files whose names end with old. Linux allows us to use wildcards (more formally known as meta-characters) to stand for one or more characters in a filename.
The two basic wildcard characters are ? and *. The wildcard ? matches any one character. The wildcard * matches any grouping of zero or more characters. Some examples may help to clarify this. (Remember that Linux is case-sensitive.)
The ‘*’ Wild-Card
This wildcard is interpreted as a string of zero or more characters. It can be used with other characters, and it can be repeated in the command line.
ls file*
#output: file1 file2 file3 file4 fileabc file
Here * (asterisk) is appended to the string file. The pattern file* expands to all filenames whose first four characters are file, followed by anything or nothing. That is, the command lists all files in the current directory whose names begin with file, as shown above.
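The expansion described above can be tried safely in a scratch directory. The following is a minimal sketch, assuming a POSIX shell with mktemp available; the directory, the extra file notes and the variable name star_matches are illustrative and not part of the original example.

```shell
# Work in a throwaway directory so the example cannot touch real files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
LC_ALL=C; export LC_ALL   # fixed locale so the sort order of matches is predictable

# Create the files from the example, plus one (notes) that should not match.
touch file file1 file2 file3 file4 fileabc notes

# The shell expands file* before echo runs; echo only prints the result.
star_matches=$(echo file*)
echo "$star_matches"
```

The shell returns the matches in sorted order, so file (the name with nothing after the four characters) comes first and notes is left out.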
The * wildcard can also be repeated, as shown below.
Example: Assume that the files file1.txt, file2.txt and file3.abc already exist and we give the command
ls file*.*
#output: file1.txt file2.txt file3.abc
This displays the files whose names start with file, followed by any sequence of characters (or none), then a . (dot), and then any sequence of characters (or none).
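A repeated * can be checked the same way. This sketch assumes a POSIX shell; the file file4 is added only to show that a name without a dot is not matched, and dot_matches is an illustrative variable name.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
LC_ALL=C; export LC_ALL

# Three files with a dot in the name and one without.
touch file1.txt file2.txt file3.abc file4

# file*.* requires a literal dot somewhere after the prefix "file".
dot_matches=$(echo file*.*)
echo "$dot_matches"
```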
The ‘?’ Wild-Card
The ? wildcard matches exactly one occurrence of any character. It can be used with other wildcards. It can also be repeated in the command line.
Assume that the files a, b, c, file1, file2, file1.txt, file2.c already exist in the current directory.
ls ?
#output: a b c
? matches exactly one character, whatever it is. Here, all files whose names consist of a single character are displayed.
ls file?
#output: file1 file2
The second example lists all files whose names consist of file followed by any single character, i.e. file1 and file2. The third example displays any file with a three-character extension after the . (dot). The fourth example displays files named file2 with a single-character extension after the dot.
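The four cases described above can be sketched as follows. The first two patterns follow directly from the text; the last two (*.??? and file2.?) are assumptions reconstructed from the description, and the variable names are illustrative.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
LC_ALL=C; export LC_ALL

# The files assumed to exist in the current directory.
touch a b c file1 file2 file1.txt file2.c

one_char=$(echo ?)            # names that are exactly one character long
file_plus_one=$(echo file?)   # "file" followed by exactly one character
three_ext=$(echo *.???)       # any name ending in a dot and three characters
one_ext=$(echo file2.?)       # "file2." followed by exactly one character

echo "$one_char"
echo "$file_plus_one"
echo "$three_ext"
echo "$one_ext"
```

Each ? stands for exactly one character, so file? does not match file1.txt and *.??? does not match file2.c.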
Standard File Descriptors
The Linux environment allows for each process to have access to three standard file descriptors by default. They are
0 standard input
1 standard output
2 standard error
It is the responsibility of the shell, when executing a command, to provide appropriate file descriptors to the process for each of these standard files. Most Linux tools are developed to take their input from the standard input file and write their output to the standard output file. Error messages that do not make up part of the expected output are usually written to the standard error file. Unless otherwise specified, the shell will usually pass its own standard file descriptors down to the process that it executes, allowing the output from any called tools to be included with the output of the script.
Using I/O redirection, the developer can modify how the shell handles the file descriptors, usually either replacing one of the standard interactive file descriptors with a file on disk, or creating a pipe to connect the output file descriptor of one process to the input file descriptor of another process. Redirection can also be applied to the file descriptors of a group of commands.
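Both ideas, a pipe between two processes and a single redirection applied to a group of commands, can be sketched in a few lines. This is an illustration under assumed names (group.txt, sorted, group_lines) and sample words, not something taken from the original text.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# A pipe connects the standard output of printf to the standard input of sort.
sorted=$(printf 'pear\napple\nfig\n' | sort)
echo "$sorted"

# Braces group commands so that one redirection applies to all of them.
{
  echo "first line"
  echo "second line"
} > group.txt

# Count the lines that the group wrote (tr strips padding some wc versions add).
group_lines=$(wc -l < group.txt | tr -d ' ')
echo "$group_lines"
```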
Basic File Redirection
Disk file redirection is done using the < and > characters.
> redirects the standard output of the command to write to a file
>> redirects the standard output of the command to append to a file
< redirects the standard input of the command to read from a file
ls -al > dirlist.txt
ls -al >> longlist.txt
cat a > f1
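The difference between the three operators is easiest to see on a small file. The sketch below assumes a POSIX shell; out.txt and the variable names are illustrative.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

echo "first" > out.txt     # > creates the file, or truncates it if it exists
echo "second" >> out.txt   # >> appends to the end of the file

# < makes the file the standard input of the command.
line_count=$(wc -l < out.txt | tr -d ' ')
first_line=$(head -n 1 < out.txt)

echo "$line_count"
echo "$first_line"
```

Running the first echo a second time with > would discard the appended line, because > always starts from an empty file.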
Advanced File Redirection
>& is used to redirect one file descriptor to another.
command > common.log 2>&1
This redirects standard output to common.log, and then redirects standard error to the same place as standard output. The order of these redirections is important: if 2>&1 is placed before > common.log, standard error is first redirected to wherever standard output currently points (the console), and only then is standard output redirected to common.log, so error messages still appear on the console.
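The effect of the ordering can be observed with a small function that writes to both streams. The function emit and the file names are hypothetical helpers introduced only for this sketch; a command substitution stands in for the console so that the leaked output can be captured.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Hypothetical helper that writes one line to each stream.
emit() {
  echo "to stdout"
  echo "to stderr" >&2
}

# Correct order: stdout goes to the file first, then stderr follows it there.
emit > both.log 2>&1

# Reversed order: stderr is pointed at the current stdout (here, the capture),
# and only afterwards is stdout sent to the file.
leaked=$(emit 2>&1 > stdout_only.log)

cat both.log
echo "$leaked"
```

In the reversed form only the standard output line reaches the file, while the error line ends up wherever standard output pointed before the redirection.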
<< redirects the standard input of the command to read from what is called a "here document". Here documents are a convenient way of placing several lines of text within the script itself and using them as input to the command. The << characters are followed by a single word that marks the end of the here document. Any word can be used, but there is a common convention of using EOF (unless we need to include that word within the here document itself).
sort << EOF
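A complete here document needs the input lines and a closing marker on a line of its own. The following is a minimal sketch; the three words being sorted and the file sorted.txt are assumptions chosen for illustration.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# The lines between the two EOF markers become the standard input of sort.
# The output is redirected to a file so it can be inspected afterwards.
sort << EOF > sorted.txt
cherry
apple
banana
EOF

sorted=$(cat sorted.txt)
echo "$sorted"
```

The closing EOF must start in the first column with nothing else on the line, otherwise the shell keeps reading input.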
Wild card matching
* | Matches zero or more characters
? | Matches exactly one character
[ ] | Matches exactly one of a specified set of characters
The ‘[ ]’ Wild-Card
The [ ] wildcard matches any one of a specified set of characters, which are given within the brackets. It can be used with other wildcards.
A single-character expression taking the value 1, 2 or 3 can be represented by the expression [123]. This can be combined with any string or other wildcard expression.
Ex: Assume that the files a, b, c, d, file1, file2, file3 exist in the current directory.
ls [abc]
# output: a b c
It displays the files whose names are the single character a, b or c.
ls file[13]
# output: file1 file3
All the above Wild-Cards can be combined with each other.
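As a sketch of how the wildcards combine, the following uses the same files as the example above. The combined patterns f*[23] and ????[12] are illustrations chosen for this sketch, not commands from the original text, and the variable names are hypothetical.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
LC_ALL=C; export LC_ALL

touch a b c d file1 file2 file3

set_matches=$(echo [abc])        # one character, which must be a, b or c
digit_matches=$(echo file[13])   # "file" followed by 1 or 3
star_set=$(echo f*[23])          # starts with f, ends in 2 or 3
question_set=$(echo ????[12])    # any four characters, then 1 or 2

echo "$set_matches"
echo "$digit_matches"
echo "$star_set"
echo "$question_set"
```

Here d is skipped by [abc] because it is not in the set, and the single-character names are skipped by ????[12] because that pattern requires exactly five characters.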