colplot Help

First and foremost, the focus of this help is on usability. It is not intended to be a lesson in how to generate data in plot format, how to choose which data to display, how to display it (from a formatting perspective) or how to interpret the results.

The Methodology
Colplot is not a general purpose plotting package, but rather one intimately tied into collectl data file formats and naming conventions. Its purpose is to plot data files within a particular time frame which are all in the same directory, having no capability to look in multiple locations. This means if you want to plot data for 2 systems, you need to put all associated data files in the same directory. The general convention used is to have a separate directory that just contains plottable data files, but if there are non-plottable files in that directory colplot will ignore them. However, your life will be much easier if you limit that directory to plot files.

Since colplot uses gnuplot to generate the plots and since gnuplot cannot handle compressed files, you must make sure your plot files are not compressed. If they are, colplot will ignore them and either not generate any plots or simply generate plots for those files that aren't compressed.
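
One way to check a directory before plotting is to look for compressed files yourself. The following is a minimal sketch, not part of colplot: the function names are hypothetical, and it assumes "compressed" means gzip, which it detects from the two-byte gzip signature at the start of the file rather than from the file extension.

```python
import os

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzip_compressed(path):
    """Return True if the file starts with the gzip signature."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def find_compressed_files(directory):
    """List regular files in `directory` that look gzip-compressed."""
    compressed = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and is_gzip_compressed(path):
            compressed.append(name)
    return compressed
```

Any file this reports would need to be uncompressed before colplot can plot it.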

Colplot uses the metaphor of a spreadsheet to display its plots, the default being rows of a particular type of plot and columns of different systems (if only plotting data of a single type or for a single system this will collapse into a single column). If you prefer your spreadsheet to be ordered differently, you can change the arrangement with the Display pulldown menu. When generating plots, colplot will put data for multiple days into the same plot, which can result in a very dense (and possibly unreadable) plot. Your choices are then to either choose a shorter timeframe OR to choose a Display that includes the day as one of the sort criteria, which will then generate a separate plot for each day.

A number of check boxes are provided that allow you to choose individual plots OR a couple of macros that select all of a particular type. There are many more plots available to you than have check boxes and these must be chosen by name if you want to select one of them. To see the full list of available plots simply click on the Help button next to the Plots by Name(s): field.

The following section describes all the fields on the colplot form in the order they occur.

What is the purpose of Enable Refresh?
This is a mode in which a plot is redrawn at a user selectable frequency, specified in seconds, resulting in pseudo-real time plots. In conjunction with refresh one can either select a time scale or request the last n-minutes of data be displayed by selecting the corresponding radio button.

The key to making this all work is to select files for the current day that are continuously being updated by collectl in non-compressed, plottable format (this requires the collectl switches -P, -F0 and -oz in addition to any other switches you choose). If collectl is running on a remote machine it can either write its data to a directory being shared via nfs or one can periodically pull the changing contents of the file(s) via a tool such as rsync.
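
To make the "last n-minutes" option concrete, here is a rough sketch of how such a time window could be computed. It is an illustration only and not colplot's code: the function name is hypothetical, and clamping the window to the current day is an assumption based on the note above that the files selected are for the current day.

```python
from datetime import datetime, timedelta


def last_minutes_window(minutes, now=None):
    """Return (start, end) covering the last `minutes` minutes."""
    end = now if now is not None else datetime.now()
    start = end - timedelta(minutes=minutes)
    # Assumption: only the current day's files are being plotted,
    # so the window never starts before midnight of that day.
    midnight = end.replace(hour=0, minute=0, second=0, microsecond=0)
    return max(start, midnight), end
```

Each refresh would recompute the window, so the plot slides forward as collectl appends new data.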

How to select files to plot
Plot files are selected by using a combination of the From/Thru and Filename Containing fields. On startup, colplot always examines the currently selected directory, which you can change with the Change Dir button, to make sure there is at least one file containing plottable data. It also makes note of the oldest and newest files in that directory and sets the Date fields to these as default values. If you do nothing else from a file selection perspective, plots will be generated for all files in this directory. Naturally you can change the dates to narrow the scope of which files are selected.
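
The default date range can be pictured with the sketch below. It is hedged throughout: it assumes each plottable filename carries its date as an 8-digit YYYYMMDD string following a hyphen (for example a hypothetical xyz1-20110221.tab), and the function name and sample filenames are illustrative rather than taken from colplot.

```python
import re
from datetime import datetime

# Assumption: the date appears as -YYYYMMDD, not followed by more digits.
DATE_IN_NAME = re.compile(r"-(\d{8})(?!\d)")


def date_range(filenames):
    """Return (oldest, newest) dates found in the names, or None."""
    dates = []
    for name in filenames:
        match = DATE_IN_NAME.search(name)
        if not match:
            continue  # no date in the name: skip the file
        try:
            dates.append(datetime.strptime(match.group(1), "%Y%m%d").date())
        except ValueError:
            continue  # eight digits that do not form a real date
    if not dates:
        return None
    return min(dates), max(dates)
```

Files without a recognizable date are skipped, which mirrors the idea that non-plottable files in the directory are ignored.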

The second way to narrow the scope is to enter one or more strings, separated by spaces, into the Filenames Containing field. In addition you specify whether any of the strings must be included in a filename (the default) to select it or if all strings must occur to select it, though this is typically not necessary.

Further, if the string contains a [, it is assumed to be in pdsh format. This is a format for specifying a larger number of hostnames in a compressed format such as xyz[1-5,10], which is the equivalent of xyz1, xyz2, xyz3, xyz4, xyz5 and xyz10. In this case the any/all field is ignored and any files whose hostname portions match will be selected.
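
The expansion of a pdsh-style string can be sketched as follows. This is a simplified illustration, not colplot's or pdsh's implementation: the function name is hypothetical, it handles a single bracket group only, and preserving leading zeros in a range is an assumption about the desired behaviour.

```python
import re


def expand_pdsh(pattern):
    """Expand a simple pdsh-style host range such as 'xyz[1-5,10]'."""
    match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)", pattern)
    if not match:
        return [pattern]  # no single bracket group: return unchanged
    prefix, body, suffix = match.groups()
    hosts = []
    for part in body.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-", 1)
            # Assumption: keep zero padding when the range start has it.
            width = len(low) if low.startswith("0") else 0
            for number in range(int(low), int(high) + 1):
                hosts.append(prefix + str(number).zfill(width) + suffix)
        else:
            hosts.append(prefix + part + suffix)
    return hosts
```

Called with xyz[1-5,10], this returns the six hostnames listed above, in the same order.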

When files are selected for plotting, each name will be examined to see if it matches any of the selection strings and if not, will not be selected. You can also use an asterisk as a wild card character in these fields to indicate any characters can match that portion of the string. Another way to think of these is as what you might specify in an ls command.
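
The any/all selection and the asterisk wild card can be approximated with a few lines of Python. This is a sketch under stated assumptions rather than colplot's actual logic: the function names and sample filenames are hypothetical, and Python's fnmatch is used as a stand-in for the matching, so it also gives special meaning to ? and [...], which colplot may not.

```python
from fnmatch import fnmatchcase


def matches(filename, patterns, require_all=False):
    """Check a filename against the selection strings."""
    # Wrap each string in '*' so it may match anywhere in the name,
    # which gives the "containing" behaviour described above.
    hits = [fnmatchcase(filename, "*" + pattern + "*") for pattern in patterns]
    if not hits:
        return True  # no selection strings: every file qualifies
    return all(hits) if require_all else any(hits)


def select_files(filenames, selection, require_all=False):
    """Split the field on spaces and keep the matching filenames."""
    patterns = selection.split()
    return [name for name in filenames if matches(name, patterns, require_all)]
```

For example, with hypothetical files for hosts xyz1, xyz2 and abc1, the selection "xyz 20110221" with all strings required keeps only the xyz files dated 20110221.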

TIP - if you want to select a specific file or files, it is sometimes easier to just specify the unique portion of the hostname and the date in the Filenames Containing box rather than play with date ranges. If you want all the hosts for a given date just enter the date portion.

How to select the time period
No real surprise here. To change the plot time period to something other than a full day, simply change one or both of the time fields.

Plot Type
Normally you need do nothing and you will get the default plot type, which more often than not is a basic line plot. However you can choose a type to be applied to all plots from one of: line, point, stacked line or stacked point.

Display
It can often be confusing to look at a number of plots across multiple systems and dates and by default, one plot is generated across multiple dates if applicable. However, if the dates are far apart (say a week or more), one can get an x-axis for which all the dates are run together. It may also take a lot longer to generate multi-day plots so be careful.

As described earlier, whether the plots contain single or multiple dates, a tabular display is presented much like a spreadsheet in which one or more variables specified by Display is used as the column and data within the column is sorted by the remaining (if any) types. Use this field to change the default (columns organized by system and then grouped by plot). If one of the types includes the string Day the resultant plots will only be for a single day.

Summary Plots
As the name states, these plots provide reports on summary data. This is the type of data you would see when running collectl and selecting summary data by specifying lower case arguments with the -s switch. Keep in mind that if data wasn't collected for the plot chosen you will not get any output!

Detail Plots
If the summary plots are based on lower case subsystems, these plots represent the data collected with upper case switches. By default, you will see one plot for each device for which data was collected. When producing plots for multiple systems with different hardware configurations you may see a different number of detail plots.

Detail Filters
Sometimes you are only interested in the details of a particular device or perhaps a subset of them. If so, enter one or more strings into this field to select those devices you are interested in. For example, if you were to enter eth c1 and chose details for networks and disks, you might see reports for eth0 eth1 eth2 c1d0 c1d1.
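
The eth c1 example can be reproduced with the short sketch below. It assumes a device is kept when its name contains any one of the filter strings; that reading, the function name and the device list are illustrative and not taken from colplot's code.

```python
def filter_devices(devices, filter_text):
    """Keep devices whose name contains any of the filter strings."""
    filters = filter_text.split()
    if not filters:
        return list(devices)  # empty field: keep every device
    return [name for name in devices
            if any(text in name for text in filters)]


devices = ["eth0", "eth1", "eth2", "lo", "c0d0", "c1d0", "c1d1"]
selected = filter_devices(devices, "eth c1")
```

Here selected holds eth0, eth1, eth2, c1d0 and c1d1, while lo and c0d0 are dropped.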

Plots By Names
There are a number of additional plots that are available to you that are simply too numerous, and less common, to list on the main page. To see a complete list of these click on the associated Help button. You may enter one or more of these and they will be produced in addition to those that may have been selected elsewhere.

Changing the destination of the plots
In case you haven't figured it out yet, the plots are generated for display by gnuplot in png format, but you can also save them as a file or deliver them using email as a pdf attachment. The following options allow you to change the way you dispose of your plot(s) and in what format.

Email or Dir: If this field contains an @, it is assumed to be an email address, otherwise it is assumed to be a directory name. In any event, a value in this field directs colplot to generate its output as a file and either deliver it to the specified email address OR place it in the specified directory. It is recommended that, before using this field, you first generate a plot in the format you want and then fill in this field and generate the plot(s) again.
Subject: Enter a subject line you would like to see in your email or else a generic one will be assigned.
Type: By default, plots are generated in pdf format (or ghostscript on Windows). Using this field, you can request plots to be generated as png objects, making it very easy to later include individual plots (rather than a page of plots) in a document. This field also allows you to tell colplot to not generate any plots; see Include Ctl below.
Include Ctl: For those who wish to take customization of their plots to another level, checking this box will cause colplot to deliver the gnuplot control file to the requested destination. You can then edit that file to customize various gnuplot settings not available through the user interface, and manually rerun gnuplot, specifying the control file as an argument.
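
The rule used to interpret the Email or Dir field can be summarized in a few lines. This is only a sketch of the rule as described above, with a hypothetical function name; treating an empty field as "display in the browser" is an assumption.

```python
def classify_destination(value):
    """Decide how the Email or Dir field would be interpreted."""
    value = value.strip()
    if not value:
        return "display"  # assumption: an empty field means plot to the screen
    # Rule from the text above: an @ anywhere means an email address.
    return "email" if "@" in value else "directory"
```

Note that under this rule a directory name containing an @ would be treated as an email address.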

Changing format of the plots
While there are many different options that are simply hardcoded into colplot, it was decided to expose a handful of the more useful ones through the user interface. In most cases, you'll never need to change any of these, but it is at least worth mentioning what they are and what they do:

Width: This controls the width of the plots as they are displayed. By making this number larger, you can make the plot significantly wider than a screen width and use the horizontal scroll bar to browse through them. The benefit of doing so is to allow you to look at the data with a finer grain without having to shrink the time frame.
Height: This field pretty much works the same as the Width except it controls the height of the plots. It can also be very useful in getting a fine grained look at data.
Thick: Make lines, or points in scatter plots, bigger. This has only been found to work with gnuplot V4.2.
X-Increment: By default, gnuplot chooses its own values for the increments on the X-Axis, which most of the time are quite reasonable. However, there are times when they are not and this parameter allows you to change that and get an easier to read plot.
Legend: Legends take up a fair amount of plot real estate and can be removed by unchecking this box. This can be of value when one wants to line up multiple plots side by side (which will automatically happen if they are narrow enough - see Width).
AdjustHeight: Sometimes the combination of the number of lines that appear in a plot and the plot height can cause a second column of line definitions to appear in the legend. If you check this box the height of an individual plot will be increased if necessary to fit the entire legend as a single column of names. If you do choose this option not all individual plots may be the same height. In no circumstances will a plot be shorter than the height specified in the Height field.
PagBrk: This option only applies to pdf output (see Type above) and then only when generating plots for multiple dates or multiple systems. It will cause a page break every time the date or the system name portion of the file name changes, making the output easier to read.
YLog: The values for the Y-Axis are generated dynamically, based on the values of the data. Sometimes a single high value can make the rest of the data unreadable. Two ways to make more data readable are to plot it logarithmically and to change the maximum value of the Y-Axis. This option does both at the same time.
X-Axis: Like the legend, the X-Axis labels take up vertical real estate and can also be suppressed by unchecking this box. This can be of value (though admittedly limited) when you want to fit as many plots on the screen at the same time as possible and still maximize detail.

Error Messages
There are 2 major kinds of error messages and they will not be discussed in any detail other than to say some are operational, such as a user trying to select a directory with no plotting data in it, and some are fatal, such as not configuring the system with the proper path to gnuplot.

In any event, if something does go wrong, the user should be informed of what the problem is and hopefully it will be obvious enough to fix.

Command Line Interface
colplot also supports a CLI that exposes all the options available through the web interface. For more details type colplot -help in a terminal window, and if on a Linux system you can also type man colplot for additional information.

updated Feb 21, 2011