LPIC-1 Linux Professional Institute Certification Study Guide - Richard Blum

Part I
Exam 101-400
Chapter 1
Exploring Linux Command-Line Tools
Using Streams, Redirection, and Pipes


Streams, redirection, and pipes are some of the more powerful command-line tools in Linux. Linux treats the input to and output from programs as a stream, which is a data entity that can be manipulated. Ordinarily, input comes from the keyboard and output goes to the screen. You can redirect these input and output streams to come from or go to other sources, such as files. Similarly, you can pipe the output of one program as input into another program. These facilities can be great tools to tie together multiple programs.


Part of the Unix philosophy to which Linux adheres is, whenever possible, to do complex things by combining multiple simple tools. Redirection and pipes help in this task by enabling simple programs to be combined together in chains, each link feeding off the output of the preceding link.

Exploring File Descriptors

To begin understanding redirection and pipes, you must first understand the different file descriptors. Linux handles all objects as files. This includes a program's input and output streams. To identify a particular file object, Linux uses file descriptors:

Standard Input

Programs accept keyboard input via standard input, abbreviated STDIN. Standard input's file descriptor is 0 (zero). In most cases, this is the data that comes into the computer from a keyboard.

Standard Output

Text-mode programs send most data to their users via standard output, abbreviated STDOUT. Standard output is normally displayed on the screen, either in a full-screen text-mode session or in a GUI terminal emulator, such as an xterm. Standard output's file descriptor is 1 (one).

Standard Error

Linux provides a second type of output stream, known as standard error, abbreviated STDERR. Standard error's file descriptor is 2 (two). This output stream is intended to carry high-priority information such as error messages. Ordinarily, standard error is sent to the same output device as standard output, so you can't easily tell them apart. You can redirect one independently of the other, though, which can be handy. For instance, you can redirect standard error to a file while leaving standard output going to the screen. This allows you to view the error messages at a later time.

Internally, programs treat STDIN, STDOUT, and STDERR just like data files – they open them, read from or write to the files, and close them when they're done. This is why the file descriptors are necessary and why they can be used in redirection.
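As a minimal sketch of redirecting one stream independently of the other, the following assumes a Bash-like shell. The report function and the errors.txt filename are hypothetical stand-ins for a real program and a real log file; the 2> operator it relies on is the standard error redirector.

```shell
# Hypothetical stand-in for a program that writes to both output streams.
report() {
    echo "normal result"              # goes to STDOUT (file descriptor 1)
    echo "something went wrong" >&2   # goes to STDERR (file descriptor 2)
}

# Send only STDERR to a file; STDOUT still appears on the screen.
report 2> errors.txt

# Review the captured error messages at a later time.
cat errors.txt
```

Only the error line ends up in errors.txt, because file descriptor 2 was the only stream redirected.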

Redirecting Input and Output

To redirect input or output, you use operators following the command, including any options it takes. For instance, to redirect the STDOUT of the echo command, you would type something like this:
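A minimal sketch of such a command, assuming a Bash-like shell and write access to the current directory (the cat line is there only to display the result):

```shell
# Redirect STDOUT (file descriptor 1) of echo into path.txt.
echo $PATH 1> path.txt

# Display the file to confirm that it holds the value of $PATH.
cat path.txt
```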


The result is that the file path.txt contains the output of the command (in this case, the value of the $PATH environment variable). The operator used to perform this redirection was > and the file descriptor used to redirect STDOUT was 1 (one).


The cat command allows you to display a file's contents to STDOUT. It is described further in the section “Processing Text Using Filters” later in this chapter.

A nice feature of redirecting STDOUT is that you do not have to use its file descriptor, only the operator. Here's an example of leaving out the 1 (one) file descriptor, when redirecting STDOUT:
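A sketch of the shorter form, under the same assumptions (a Bash-like shell and a scratch file named path.txt in the current directory):

```shell
# Same redirection with the file descriptor omitted; > alone implies STDOUT.
echo $PATH > path.txt

# Display the file to confirm that the output was captured.
cat path.txt
```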


You can see that even without the STDOUT file descriptor, the output was redirected to a file. However, the redirection operator (>) was still needed.

You can also leave out the STDIN file descriptor when using the appropriate redirection operator. Redirection operators exist to achieve several effects, as summarized in Table 1.2.


Table 1.2 Common redirection operators


Most of these redirectors deal with output, both because there are two types of output (standard output and standard error) and because you must be concerned with what to do in case you specify a file that already exists. The most important input redirector is <, which takes the specified file's contents as standard input.
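The behavior of the most common of these operators can be sketched as follows. This is an illustration rather than a complete list: the noisy function, the filenames, and the scratch directory are all hypothetical, and a Bash-like shell is assumed.

```shell
# Work in a throwaway directory (hypothetical) so no real files are touched.
cd "$(mktemp -d)"

# Hypothetical stand-in that writes one line to each output stream.
noisy() { echo "data"; echo "warning" >&2; }

noisy >  out.txt     # > creates out.txt or overwrites it with STDOUT
noisy >> out.txt     # >> appends STDOUT, so out.txt now has two lines
noisy 2>  err.txt    # 2> creates err.txt or overwrites it with STDERR
noisy 2>> err.txt    # 2>> appends STDERR, so err.txt now has two lines
wc -l < out.txt      # < makes out.txt the STDIN of wc
```

Whichever stream is not redirected on a given line still appears on the screen.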


A common trick is to redirect standard output or standard error to /dev/null. This file is a device that's connected to nothing; it's used when you want to get rid of data. For instance, if the whine program is generating too many unimportant error messages, you can type whine 2> /dev/null to run it and discard its error messages.

One redirection operator that requires elaboration is the << operator. This operator implements something called a here document. A here document takes text from subsequent lines as standard input. Chances are you won't use this redirector on the command line, because lines typed there are already standard input and there's no need to redirect them. Rather, you might use this operator in a script to pass data to an interactive program. Unlike with most redirection operators, the text immediately following the << operator isn't a filename; instead, it's a word that's used to mark the end of input. For instance, typing someprog << EOF causes someprog to accept input until it sees a line that contains only the string EOF (without even a space following it).
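A minimal sketch of a here document as it might appear in a script. The sort command stands in for the hypothetical someprog, and sorted.txt is an assumed scratch filename:

```shell
# sort reads the lines between the two EOF markers as its standard input.
# The closing EOF must be alone on its line, with no trailing space.
sort << EOF > sorted.txt
pear
apple
fig
EOF

# Display the sorted result.
cat sorted.txt
```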


Some programs that take input from the keyboard expect you to terminate input by pressing Ctrl+D. This keystroke sends the American Standard Code for Information Interchange (ASCII) end-of-transmission character, which the terminal treats as an end-of-file signal.

Piping Data between Programs

Programs can frequently operate on other programs' outputs. For instance, you might use a text-filtering command (such as the ones described shortly in “Processing Text Using Filters”) to manipulate text output by another program. You can do this with the help of redirection operators: send the first program's standard output to a file, and then redirect the second program's standard input to read from that file. This method is awkward, though, and it involves the creation of a file that you might easily overlook, leading to unnecessary clutter on your system.

The solution is to use data pipes (aka pipelines). A pipe redirects the first program's standard output to the second program's standard input, and it is denoted by a vertical bar (|), as in first | second.


For instance, suppose that first generates some system statistics, such as system uptime, CPU use, number of users logged in, and so on. This output might be lengthy, so you want to trim it a bit. You might therefore use second, which could be a script or command that echoes from its standard input only the information in which you're interested. (The grep command, described in “Using grep,” is often used in this role.)
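This scenario can be sketched with two shell functions. Both first and second are hypothetical stand-ins here, and the statistics they handle are made-up sample text rather than real system output:

```shell
# Hypothetical stand-in for a program that prints system statistics.
first() {
    printf 'uptime: 3 days\nusers: 2\ncpu: 7%%\n'
}

# Hypothetical filter that keeps only the line of interest.
second() {
    grep 'users'
}

# STDOUT of first becomes STDIN of second.
first | second
```

No intermediate file is created; the data flows directly from one program to the other.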

Pipes can be used in sequences of arbitrary length, such as first | second | third.
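A minimal sketch of a three-stage pipeline, using made-up sample data and standard filters in place of the placeholder program names:

```shell
# Stage 1 generates lines, stage 2 keeps those containing "a",
# and stage 3 sorts whatever survives the filter.
printf 'cherry\nbanana\napple\nfig\n' | grep 'a' | sort
```

Each stage reads only the output of the stage immediately before it.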


Another redirection tool often used with pipes is the tee command. This command splits standard input so that it's displayed on standard output and in as many files as you specify. Typically, tee is used in conjunction with data pipes so that a program's output can be both stored and viewed immediately. For instance, to view and store the output of the echo $PATH command, you might type this:
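A minimal sketch, assuming a Bash-like shell and write access to the current directory; the final command goes one step further and illustrates appending with the -a option:

```shell
# Show the value of $PATH on the screen and save a copy in path.txt.
echo $PATH | tee path.txt

# Confirm that the file received the same text.
cat path.txt

# With -a, tee appends to path.txt instead of overwriting it.
echo $PATH | tee -a path.txt
```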


Notice that not only were the results of the command displayed to STDOUT, but they were also redirected to the path.txt file by the tee command. Ordinarily, tee overwrites any files whose names you specify. If you want to append data to these files, pass the -a option to tee.

Generating Command Lines

Sometimes you'll find yourself needing to conduct an unusual operation on your Linux server. For instance, suppose you want to remove every file in a directory tree that belongs to a certain user. With a large directory tree, this task can be daunting!

The usual file-deletion command, rm (described in more detail in Chapter 4), doesn't provide an option to search for and delete every file that matches a specific criterion. One command that can do the search portion is find (also described in more detail in Chapter 4). This command displays all of the files that match the criteria you provide. If you could combine the output of find to create a series of command lines using rm, the task would be solved. This is precisely the purpose of the xargs command.

The xargs command builds a command from its standard input. The basic syntax for this command is xargs [options] [command [initial-arguments]].


The command is the command you want to execute, and initial-arguments is a list of arguments you want to pass to the command. The options are xargs options; they aren't passed to command. When you run xargs, it reads words from standard input and appends them to the argument list for command, by default packing as many words as will fit into each invocation; the -n 1 option makes it run command once for every word. If you want to pass multiple options to the command, you can protect them by enclosing the group in quotation marks.
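A minimal sketch of how xargs assembles command lines, using echo as the command and the made-up word item: as its initial argument:

```shell
# The words read from standard input are appended after the initial
# argument, and echo runs once with all of them.
printf 'one two\nthree\n' | xargs echo item:

# -n 1 is an xargs option (not passed to echo): use at most one input
# word per invocation, so echo runs three times.
printf 'one two\nthree\n' | xargs -n 1 echo item:
```

The first command prints a single line and the second prints one line per word, which shows how xargs splits its input on both spaces and newlines by default.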

For instance, consider the task of deleting several files that belong to a particular user. You can do this by piping the output of find to xargs, which then calls rm, as in find / -user Christine | xargs -d "\n" rm.


The first part of this command (find / -user Christine) finds all of the files in the root directory tree (/) and its subdirectories that belong to user Christine. (Since you are looking through the entire directory tree, you need superuser privileges for this to work properly.) This list is then piped to xargs, which appends the filenames it reads to the rm command line. Problems can arise if filenames contain spaces because by default xargs uses both spaces and newlines as item delimiters. The -d "\n" option tells xargs to use only newlines as delimiters, thus avoiding this problem in this context. (The find command separates each found filename with a newline.)


It is important to exercise caution when using the rm command with superuser privileges. This is especially true when piping the files to delete into the rm command. You could easily delete the wrong files unintentionally.
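One cautious way to experiment with this technique is to confine it to a scratch directory. The sketch below does that under several assumptions: the sandbox directory and its files are hypothetical, the current user stands in for Christine, and the -d option requires GNU xargs.

```shell
# Hypothetical scratch directory; nothing outside it is touched.
sandbox=$(mktemp -d)
touch "$sandbox/plain.txt" "$sandbox/name with spaces.txt"

# Find regular files owned by the current user (standing in for Christine)
# and hand them to rm. -d "\n" (GNU xargs) splits input on newlines only,
# so the filename containing spaces reaches rm as a single argument.
find "$sandbox" -type f -user "$(id -un)" | xargs -d "\n" rm

# List what is left; the directory should now be empty.
ls -A "$sandbox"
```

Replacing rm with echo rm on a first run is a simple way to preview what would be deleted.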

A tool that's similar to xargs in many ways is the backtick (`), which is a character to the left of the 1 key on most keyboards. The backtick is not the same as the single quote character ('), which is located to the right of the semicolon (;) on most keyboards.

Text within backticks is treated as a separate command whose results are substituted on the command line. For instance, to delete those user files, you can type the following command:
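A minimal sketch of the backtick form, again confined to a hypothetical scratch directory and using the current user in place of the user whose files are to be deleted:

```shell
# Hypothetical scratch directory and owner name for the demonstration.
sandbox=`mktemp -d`
owner=`id -un`
touch "$sandbox/a.txt" "$sandbox/b.txt"

# The find command inside the backticks runs first; its output is
# substituted into the rm command line as arguments.
rm `find "$sandbox" -type f -user "$owner"`

# List what is left; the directory should now be empty.
ls -A "$sandbox"
```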


The backtick solution works fine in some cases, but it breaks down in more complex situations. The reason is that the output of the command inside the backticks is inserted into the surrounding command line as if it had been typed at the shell, so the shell splits it on whitespace, and a very long list can exceed the maximum command-line length. By contrast, xargs reads the items from standard input and runs the command you specify (rm in these examples) as many times as needed to process them all. What's more, you can't pass options such as -d "\n" to a backtick. Thus these two examples will work the same in many cases, but not in all of them.


Use of the backtick is falling out of favor because backticks are so often confused with single quotation marks. In several shells, you can use $() instead. For instance, the preceding backtick example would be changed to the $() form.
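A minimal sketch of the $() form, using the same hypothetical scratch directory setup so that it can be run safely:

```shell
# Hypothetical scratch directory and owner name for the demonstration.
sandbox=$(mktemp -d)
owner=$(id -un)
touch "$sandbox/a.txt" "$sandbox/b.txt"

# $(...) substitutes the output of find into the rm command line,
# exactly as the backticks did.
rm $(find "$sandbox" -type f -user "$owner")

# List what is left; the directory should now be empty.
ls -A "$sandbox"
```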

This command works just as well, and it is much easier to read and understand.
