Pipe Viewer

Pipe Viewer is an open source application. You can download its source code and build the application from scratch or, if available, pull an existing binary from your UNIX distribution’s repository.

To build from scratch, download the latest source tarball from the Pipe Viewer project page (see Resources). As of mid-September 2009, the latest version of the code is 1.1.4. Unpack the tarball, change to the newly created directory, and type ./configure followed by make and sudo make install. By default, the build process installs the executable named pv into /usr/local/bin. (For a list of configuration options, type ./configure --help.) Listing 1 shows the installation code.

Listing 1. Pipe Viewer installation code
$ wget http://pipeviewer.googlecode.com/files/pv-1.1.4.tar.bz2
$ tar xjf pv-1.1.4.tar.bz2
$ cd pv-1.1.4
$ ./configure
$ make
$ sudo make install
$ which pv
/usr/local/bin/pv

To pull the pv binary from a repository, use your distribution’s package manager and search for either pv or pipe viewer. For example, a search using Ubuntu version 9’s APT package manager yields this match:

$ apt-cache search pipe viewer
pv - Shell pipeline element to meter data passing through

To continue, use your package manager to download and install the package. For Ubuntu, the command is apt-get install:

$ sudo apt-get install pv

Once installed, give pv a try. The simplest use replaces the traditional cat utility with pv to feed bytes to another program and measure overall throughput. For instance, you can use pv to monitor a lengthy compression operation:

$ ls -lh listings.txt
-r--r--r--  1 supergiantrobot  staff   109M Sep  1 20:47 listings.txt
$ pv listings.txt | gzip > listings.gz
96.1MB 0:00:09 [11.3MB/s] [=====================>     ] 87% ETA 0:00:01

When the command launches, pv posts a progress bar and continually updates the gauge to show headway. From left to right, the typical pv display shows how much data has been processed so far, the time elapsed, throughput in megabytes/second, a visual and numeric representation of work complete, and an estimate of how much time remains. In the display above, 96.1MB of 109MB has been processed, leaving about 13 percent of the file to go after 9 seconds of work.

By default, pv renders all the status indicators for which it is able to calculate values. For instance, if the input to pv is not a file and no specific size is manually specified, the progress bar advances from left to right to show activity, but it cannot measure the percent complete without a baseline. Here’s an example:

$ ssh faraway tar cf - projectx | pv --wait > projectx.tar
Password:
4.34MB 0:00:07 [ 611kB/s] [      <=>                  ]

This example runs tar on a remote machine and sends the output of the remote command to the local system to create projectx.tar. Because pv cannot calculate the total number of bytes to expect in the transfer, it shows throughput so far, time elapsed, and a special indicator that reflects activity. The little “car” (<=>) travels left to right as long as data is streaming through.

The --wait option delays the rendering of the progress meter(s) until the first byte is actually received. Here, --wait is useful, because the ssh command may prompt for a password.

You can enable individual indicators at your discretion with eponymous flags:

$ ssh faraway tar cf - projectx | \
  pv --wait --bytes > projectx.tar
Password:
 268kB

This command enables only the running byte count with --bytes. The other display options are --progress, --timer, --eta, --rate, and --numeric. If you specify one or more display options, all remaining (unnamed) indicators are automatically disabled.

There is one other simple use of pv. The --rate-limit option can throttle throughput. The argument to this option is a number and a suffix, such as m to indicate megabytes/second:

$ ssh faraway tar cf - projectx | \
  pv --wait --quiet --rate-limit 1m > projectx.tar

The previous command hides all indicators (--quiet) and limits throughput to 1MB/s.
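
To watch the limiter at work without a remote host, a local experiment along these lines may help. This is only a sketch: the function name throttle_demo is hypothetical, and it assumes /dev/zero and dd are available. It pushes 2MB of zeros through pv capped at about 1MB/s, so it should take on the order of two seconds before it prints the number of bytes that arrived.

```shell
# Hypothetical demo: stream 2 MiB of zeros through pv, capped at about 1MB/s.
throttle_demo() {
    # dd writes 2048 blocks of 1024 bytes; its own statistics are discarded.
    dd if=/dev/zero bs=1024 count=2048 2>/dev/null |
        pv --quiet --rate-limit 1m |
        wc -c | tr -d ' '    # count the bytes received and strip wc's padding
}

# Usage: throttle_demo
```

Changing the argument to --rate-limit (for example, 512k) should change the running time proportionally.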

Advanced usage of Pipe Viewer

So far, the examples shown employ a single instance of Pipe Viewer as the producer or consumer in a pair of commands. However, more complex combinations are also possible. You can use pv multiple times in the same command line, with some provisos. Specifically, you must name each instance of pv using --name, and you must enable multiline mode with --cursor. Combined, the two options create a series of labeled indicators, one indicator per named instance.

For example, imagine you want to monitor the progress of a data transfer and its compression separately and simultaneously. You can assign one instance of pv to the former operation and another to the latter, like so:

$ ssh faraway tar cf - projectx | pv --wait --cursor --name ssh | \
  gzip | pv --wait --cursor --name gzip > projectx.tgz

After you type a password, the Pipe Viewer commands produce a two-line progress meter:

      ssh: 4.17MB 0:00:07 [ 648kB/s] [     <=>             ]
     gzip:  592kB 0:00:06 [62.1kB/s] [   <=>               ]

The first line is labeled ssh and shows the progress of the transfer; the second line, tagged gzip, shows the progress of the compression. Because neither instance of pv can determine the number of bytes in its respective operation, each line shows only the accumulated total, the elapsed time, the rate, and the activity bar.
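
The same pattern works on a single machine, which makes it easy to try. The following is a minimal sketch that wraps the idea in a shell function; the name tgz_with_meters and its two arguments are hypothetical, chosen only for illustration. It meters the raw tar stream and the compressed stream separately, using --cursor and --name on each instance.

```shell
# Hypothetical helper: archive DIR into a gzip-compressed tarball OUT,
# metering the raw tar stream and the compressed stream separately.
tgz_with_meters() {
    dir=$1 out=$2
    tar cf - "$dir" |
        pv --cursor --name tar |     # bytes produced by tar
        gzip |
        pv --cursor --name gzip > "$out"   # bytes left after compression
}

# Usage: tgz_with_meters projectx projectx.tgz
```

Comparing the two byte counts while the command runs gives a rough, live view of the compression ratio.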

If you know or are able to approximate or calculate the number of bytes in an operation, use the --size option. Adding this option provides some finer-grained detail in the progress bars.

For instance, if you want to monitor the progress of a significant archiving task, you can use other UNIX utilities to approximate the total size of the original files. The df utility can show statistics for an entire file system, while du can calculate the size of an arbitrarily deep hierarchy:

$ tar cf - work | pv --size `du -sh work | cut -f1` > work.tar

Here, the command substitution du -sh work | cut -f1 yields the total size of the work directory in a format compatible with pv. Namely, du -h produces a human-readable value such as 17M for 17 megabytes, which pv accepts as a size. (The ls and df commands also support -h for human-readable format.) Because pv now expects a specific number of bytes to transit through the pipe, it can render a true progress bar:

700kB 0:00:07 [ 100kB/s] [>                    ]  4% ETA 0:02:47
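
Because du -h rounds its result, the bar can reach 100 percent a little early or a little late. If that matters, one option is to hand pv a plain byte count instead. The following is a minimal sketch rather than a definitive recipe: the function name tar_with_progress is hypothetical, and the figure is still an estimate, because du reports disk usage rather than the exact length of the tar stream.

```shell
# Hypothetical helper: archive DIR to ARCHIVE with a percentage progress bar.
tar_with_progress() {
    dir=$1 archive=$2
    # du -sk is specified by POSIX and reports the total in 1024-byte units.
    kib=$(du -sk "$dir" | cut -f1)
    # Convert to bytes and pass the plain number to pv as the expected size.
    tar cf - "$dir" | pv --size "$((kib * 1024))" > "$archive"
}

# Usage: tar_with_progress work work.tar
```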

Finally, there is one additional technique you’re sure to find useful. Besides counting bytes, Pipe Viewer can visualize progress by counting lines. If you specify the modifier --line-mode, pv advances the progress meter each time a newline is encountered. You can also provide --size, and the number is interpreted as the expected number of lines.

Here’s an example. Oftentimes, find is helpful for locating a needle in a haystack, such as locating all the uses of a particular system call in a large body of application code. In such circumstances, you might run something like this:

$ find . -type f -name '*.c' -exec grep --files-with-matches fopen \{\} \; > results

This code finds all C source files and prints the name of each file in which the string fopen appears. Output is collected in a file named results. To reflect activity, add pv to the mix:

$ find . -type f -name '*.c' -exec grep --files-with-matches fopen \{\} \; | \
  pv --line-mode > results

Line mode is phenomenal, because many UNIX commands, like find, operate on a file’s metadata, not on the contents of the file. Line mode is ideal for systems administration scripts that copy or compress large collections of files.
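
In the find example, pv sees one line per matching file, so it cannot know how many lines to expect. To get a percentage and an ETA, the pipeline can instead emit one line per file scanned and tell pv the total with --size. The following is a sketch under stated assumptions: the function name scan_with_progress and its four arguments are hypothetical, file names are assumed not to contain newlines, and the output file is assumed to live outside the set of files being scanned.

```shell
# Hypothetical helper: record in OUT the files under DIR that match GLOB and
# contain PATTERN, while pv shows how many candidate files have been scanned.
scan_with_progress() {
    pattern=$1 dir=$2 glob=$3 out=$4
    # First pass: count the candidates so pv knows how many lines to expect.
    total=$(find "$dir" -type f -name "$glob" | wc -l | tr -d ' ')
    : > "$out"    # start with an empty results file
    # Second pass: for each file, append its name to OUT if it matches, then
    # print one line so pv can advance the meter.
    find "$dir" -type f -name "$glob" -exec sh -c '
        grep -l -e "$1" "$3" >> "$2"
        printf "%s\n" "$3"
    ' sh "$pattern" "$out" {} \; |
        pv --line-mode --size "$total" > /dev/null
}

# Usage: scan_with_progress fopen . '*.c' results
```

The extra pass costs one more walk of the directory tree, which is usually cheap compared with reading every file.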

In general, you can inject Pipe Viewer into command lines and scripts whenever rate is measurable. You may have to get creative, though. For example, to measure how quickly a directory is copied, switch from cp -pr to tar:

$ # an equivalent of cp -pr old/somedir new
$ (cd old && tar cf - somedir) | pv | (cd new && tar xf - )

You might also consider line mode for use with networking utilities such as wget, curl, and scp. For instance, you can use pv to measure the progress of a sizable upload. And because many of the networking tools can take input from a file, you can use the length of such a file as an argument to --size.
