Difference between revisions of "Gnuplot"

From Noah.org

Latest revision as of 16:23, 7 January 2013


I use gnuplot and GNU plotutils from time to time. The two are unrelated projects. I prefer the traditional UNIX pipeline-filter, toolkit style of GNU plotutils. It is also easier to use, and more often than not it just works. The gnuplot tool produces nicer-looking graphs and has many more features (including interactive 3D graphs).

Most of the time I just want to plot some 2D data points from a file or stream so I use GNU plotutils.

graph -F HersheySans -T png < cdck-plot.dat | display -

On Ubuntu you can install either package using the following commands:

GNU plotutils
sudo aptitude install plotutils
gnuplot
sudo aptitude install gnuplot

GNU Plotutils documentation: http://www.gnu.org/software/plotutils/

gnuplot documentation: http://www.gnuplot.info/docs/gnuplot.html

gnuplot

One of the first things I do is alias "gnuplot" to "gnuplot -persist":

alias gnuplot='gnuplot -persist'

You can also put this in your gnuplot script when you set the terminal:

set terminal x11 persist

This tells gnuplot to keep the plot window on the screen. By default the window closes as soon as gnuplot finishes drawing and exits.

simple plot of date time vs. value data

If you have a simple data file that contains dates and values then you can use the following gnuplot script to plot it. Assume your data file is called report and looks something like this:

10/31/09 4.86 714.01
11/30/09 9.77 723.78
12/31/09 13.40 737.18
1/31/10 18.98 756.16
2/28/10 22.46 778.62
3/31/10 6.70 785.32

The following script plots the third column of the data against the date in the first column. Use 1:2 instead of 1:3 if you want to plot the second column.

set terminal x11 persist
set style data lines
set xdata time
set timefmt "%m/%d/%y"
set grid
plot "report" using 1:3
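
As a rough sketch of switching between the two value columns without editing the script by hand, the gnuplot commands can be printed by a small shell function and piped into gnuplot. The function name `report_script` is hypothetical, and the sketch uses the plain `set timefmt "..."` form.

```shell
#!/bin/sh
# Hypothetical helper: print a gnuplot script that plots one value
# column of the "report" file against the date in column 1.
# $1 is the data column to plot (2 or 3 in the sample file); default 3.
report_script() {
    col=${1:-3}
    cat <<EOF
set terminal x11 persist
set style data lines
set xdata time
set timefmt "%m/%d/%y"
set grid
plot "report" using 1:$col
EOF
}

# Print the script for the second column. To draw it, pipe the
# output into gnuplot instead:
#   report_script 2 | gnuplot
report_script 2
```

Because the function only prints text, it is easy to inspect the generated script before handing it to gnuplot.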

fancy plot of date time vs. value data

This adds a title and labels and plots lines with a thicker linewidth. The key (legend) is turned off. Output goes to a PNG file instead of the X11 terminal. This plots a Google AdSense report.csv. Note that gnuplot does not appear to handle data in Little-endian UTF-16 Unicode format, so you have to convert the file to ASCII first. I used Vim and ran :wq ++enc=ascii.

set terminal png size 1000,600
set output 'report.png'
set title "Google AdSense income over time"
set xlabel "time"
set ylabel "dollars"
set style data lines
set xdata time
set timefmt "%Y-%m-%d"
set xtics format "%Y-%m-%d"
set grid
set key off
plot "report.csv" using 1:7 with lines linewidth 2
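
The UTF-16 to ASCII conversion mentioned above can also be done from the command line rather than in Vim. This is a minimal sketch using `iconv`; it assumes the file starts with a byte-order mark (which is what "Little-endian UTF-16 Unicode" reported by `file` normally implies), and the function and file names are hypothetical.

```shell
#!/bin/sh
# Convert a UTF-16 file (with byte-order mark) to plain ASCII on stdout.
# iconv exits with an error if the file contains non-ASCII characters.
utf16_to_ascii() {
    iconv -f UTF-16 -t ASCII "$1"
}

# Example usage with hypothetical file names:
#   utf16_to_ascii report-utf16.csv > report.csv
```

Writing to a new file keeps the original download intact in case the conversion fails partway through.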

example of how to plot data from a log file

I have an embedded platform without an RTC backup (no hardware clock), so I don't run an NTP daemon (ntpd). Instead, I just sync the clock once a day using ntpdate. The ntpdate log file looks like this:

28 Jul 04:02:41 ntpdate[4781]: step time server 192.43.244.18 offset 13.828862 sec
29 Jul 04:02:45 ntpdate[19059]: step time server 192.43.244.18 offset 13.838561 sec
30 Jul 04:03:36 ntpdate[17510]: step time server 192.43.244.18 offset 13.844762 sec

I save the output from ntpdate to a log file. I can plot the daily clock drift using this gnuplot script:

set terminal x11 persist
set style data lines
set title "Daily clock offset\nnegative means clock runs fast"
set xlabel "Date"
set xdata time
set timefmt "%d %b %H:%M:%S"
set format x "%d/%m"
set ylabel "Required offset (seconds, negative means fast)"
set yrange [ -1.0 : 30.0]
set grid
plot "ntpdate.log" using 1:10

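Notice the using 1:10 clause. The timefmt consumes three whitespace-separated columns (day, month, and time), but gnuplot still counts columns as they appear in the file, so the time starts at column 1 and the y data (the offset) is in column 10.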
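
To double-check the column numbering, awk counts whitespace-separated fields the same way, so a quick sketch like the following shows which values end up on each axis. The function name `ntp_offsets` is hypothetical; the sample lines are taken from the log shown above.

```shell
#!/bin/sh
# Print the three date fields and the offset (whitespace-separated
# column 10) from ntpdate log lines read on stdin.
ntp_offsets() {
    awk '{ print $1, $2, $3, $10 }'
}

# Sample lines from the ntpdate log.
ntp_offsets <<'EOF'
28 Jul 04:02:41 ntpdate[4781]: step time server 192.43.244.18 offset 13.828862 sec
29 Jul 04:02:45 ntpdate[19059]: step time server 192.43.244.18 offset 13.838561 sec
EOF
```

Fields 1 to 3 are what the timefmt parses as the x value, and field 10 is the offset in seconds.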

GNU plotutils examples

Plot the output of QtDMM (a digital multimeter recording package). The other columns in this data file contain date information, which is usually wrong. This app still has a lot of bugs, but it does the job. Here I use awk to keep only the measured values (the third column). I then tell `graph` to use an automatic abscissa, since my data now has only a single column.

cat power_test_1.data | egrep -v "^\s*#" | awk '{print $3}' | graph --auto-abscissa 0.3 -r 0.1 -u 0.1 -h 0.8 -w 0.8 --bitmap-size 1024x768 -F HersheySans -T png | display -

The same plot as above but with PostScript output:

cat power_test_1.data | egrep -v "^\s*#" | awk '{print $3}' | graph --page-size ledger --x-label 'Samples 100ms' --y-label 'Current mA' -r 0.1 -u 0.1 -h 0.8 -w 0.8 --auto-abscissa -F HersheySans -T ps | display -background white -flatten -

The same plot as above but with the PostScript output converted to PNG (better quality than having `graph` write PNG directly):

cat power_test_1.data | egrep -v "^\s*#" | awk '{print $3}' | graph --page-size ledger --x-label 'Samples 100ms' --y-label 'Current mA' -r 0.1 -u 0.1 -h 0.8 -w 0.8 --auto-abscissa -F HersheySans -T ps | convert -background white -flatten - power_test_1.png

histogram

In this example the `histogram.py` script will read a binary file and generate a histogram report in the following format:

# filename: entropy.bin
# max count index 200: 88
# min count index 229: 44
#
# index count
000   57
001   53
002   71
003   58
...
253   64
254   66
255   68

That data can be plotted using the following command:

./histogram.py entropy.bin | egrep -v "^\s*#" | awk '{print $2}' | graph --auto-abscissa 1.0 -r 0.1 -u 0.1 -h 0.8 -w 0.8 --bitmap-size 1024x768 -F HersheySans -T png | display -
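
To see what `graph` actually receives in that pipeline, the filtering stage can be tried on its own. This is a small sketch, with a hypothetical function name `counts_only`, that uses the POSIX class `[[:space:]]` in place of `\s` and feeds in a few lines of the report format shown above.

```shell
#!/bin/sh
# Drop comment lines (optional leading whitespace, then '#') and keep
# only the count column, which is what `graph` reads on stdin.
counts_only() {
    grep -Ev '^[[:space:]]*#' | awk '{ print $2 }'
}

# A few lines in the histogram report format.
counts_only <<'EOF'
# filename: entropy.bin
# index count
000   57
001   53
002   71
EOF
```

With only one column left, `graph --auto-abscissa 1.0` supplies the x values by counting up in steps of 1.0 for each count.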

cdck

The `cdck` tool scans a CD-ROM or DVD for correctable read errors. There is no `fsck.iso9660`, so you can't easily scan a CD to see if there are filesystem errors or media errors. What `cdck` does is measure the time it takes to read each block. All CD-Rs and DVDs have error detection and correction. When the disc reader detects an error it attempts to correct it. If it cannot, it attempts to re-read the data. All of this takes time, so even if the data is read error-free you can still tell there was trouble in certain blocks based on how hard the CD/DVD reader had to work in order to read a block.

The following command runs `cdck` against the default CD-ROM device and generates the data file that I plot in the next step:

# cdck -t -p /dev/sr0

NB! For disks written with some burners cdck might
   report about unreadable sectors at the end of the disk.
   In such cases you can just ignore those warnings.

Reading sectors 1-281120
281120 ok

CD overall:
  Sectors total: 281120:
  Good sectors: 281120:
  Bad sectors (incl. with poor timing): 0
CD timings:
  Minimal = 1 usec (0.000001s)
  Maximal = 103820 usec (0.103820s)
  Average = 471 usec (0.000471s)

Conclusion:
  Satisfactory disc

This plots the output of `cdck` using GNU Plotutils. Note that one could just use -T X, which makes `graph` open its own X11 window instead of writing an image to standard output. In other words, you can skip the pipe into `display`. I use the pipe here because I wanted to demonstrate the pipeline abilities of `graph`.

graph -F HersheySans -T png --bitmap-size 800x600 --x-label 'Sectors (2Kb/sec)' --y-label 'Reading time (usec)' -C -w 0.8 -h 0.8 -r 0.15 -u 0.075 -f 0.025 < cdck-plot.dat | display -

The output can be improved quite a bit if you output PostScript instead of a bitmap. Mainly, this produces much better-looking fonts, because `display` (part of ImageMagick) does a better job of rendering fonts to a bitmap than `graph` does.

graph -T ps -F Helvetica --x-label 'Sectors (2Kb/sec)' --y-label 'Reading time (usec)' -C  -g 3 -w 0.8 -h 0.8 -r 0.15 -u 0.075 -f 0.025 < cdck-plot.dat | display -background white -flatten -

[Image: cdck.png]

The following is the equivalent plot of the `cdck` data using gnuplot. gnuplot does not act as a simple pipeline filter the way `graph` does, which I find very annoying. It can read inline data from standard input through the special filename '-' or from a command with "< command", but both are clumsier than a plain pipe, and passing a filename on the command line takes extra work. The main reason to use gnuplot is that the final output is nicer looking.

#!/usr/bin/env gnuplot
set terminal x11 persist
set style data lines
set grid
set xlabel 'Sectors (2Kb/sec)'
set ylabel 'Reading time (usec)'
set format y "%12.f"
set format x "%12.f"
#set logscale y
plot 'cdck-plot.dat' with lines
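
One workaround for gnuplot's awkwardness in pipelines, offered only as a sketch: gnuplot reads its commands from standard input when that is a pipe, and the special filename '-' tells it that the data follows inline, ended by a line containing just "e". The wrapper name `gnuplot_stream` is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: turn data on stdin into a gnuplot command stream.
# The special file name '-' means the data follows inline and is
# terminated by a line containing only "e".
gnuplot_stream() {
    echo "set grid"
    echo "plot '-' with lines"
    cat
    echo "e"
}

# Usage, assuming gnuplot is installed:
#   gnuplot_stream < cdck-plot.dat | gnuplot -persist
```

This only suits numeric data, because a data line beginning with "e" would end the inline block early.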

old style, deprecated:

set terminal png 
#gif small size 640,480
set title "Daily clock offset\nnegative means clock runs fast"
set data style fsteps
set xlabel "Date"
set timefmt "%d/%m/%y\t%H%M"
set yrange [ -4.6 : -4.8]
set xdata time
set xrange [ "25/06/00":"29/08/00" ]
set ylabel "Required offset (negative minutes fast)"
set format x "%d/%m"
#\n%H:%M"
set grid
set key left
plot "data.dat" using 1:10
reset