rugplot: Histograms
A histogram is commonly used to visualize an approximation of the
distribution of one-dimensional continuous data. The examples here
show how to create histograms using the rugplot container, based on
the well-known iris dataset (Fisher, 1936). The dataset can be
downloaded directly from DataHub by running one of the following
commands (the first uses wget, the second is for Windows PowerShell):
wget https://datahub.io/machine-learning/iris/r/iris.csv
Invoke-WebRequest https://datahub.io/machine-learning/iris/r/iris.csv -OutFile iris.csv
Alternatively, skip the download and put the link from the previous command directly in the "filename" field of the JSON template, as shown in Step 2 below.
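To make the idea of a histogram concrete, the sketch below counts values into equal-width bins, which is the summary a histogram draws as bars. It is a plain-Python illustration and not how rugplot chooses its bins; the function name `bin_counts`, the number of bins, and the sample values are all made up for this example.

```python
def bin_counts(values, bins):
    """Count how many values fall into each of `bins` equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value belongs to the last bin, not to a bin past the end.
        idx = min(int((v - lo) / width), bins - 1) if width > 0 else 0
        counts[idx] += 1
    return counts


# Illustrative values only (not taken from the iris file).
sample = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 5.0]
print(bin_counts(sample, bins=4))  # prints [2, 2, 2, 2]
```

Each count corresponds to the height of one bar; changing `bins` changes how coarse the approximation of the distribution is.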
Creating a histogram using the rugplot container
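For simplicity, it is better to create an alias for the Docker command; see the
Docker commands section. Each step below lists the command three ways: with the
alias, with Docker in a Unix shell, and with Docker in Windows PowerShell.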
Step 1, create a rugplot histogram template
rugplot template -p histogram
docker run --rm -v "$PWD":/app/data -u $(id -u):$(id -g) venustiano/rugplot:0.1.0 template -p histogram
docker run --rm -v ${PWD}:/app/data venustiano/rugplot:0.1.0 template -p histogram
A histogram_params.json file will be created, including some of the name/value pairs listed below:

    {
      "description": "Parameters to create a histogram(s) using the 'rugplot' R package",
      "filename": "<filename path>",
      "variables": null,
      "aesthetics": {
        "y_variable": null,
        "x_variable": "<X required column name>",
        "fill": null
      },
      "labels": {
        "title": null,
        "subtitle": null
      }
    }
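Before plotting, it can help to see which fields of the template still need a value. The sketch below walks a parsed template and reports every value that still looks like a `<...>` placeholder. The helper name `unfilled_fields` is hypothetical, and treating `<...>` strings as "required but unset" is an assumption drawn from the template excerpt above; rugplot may validate its input differently.

```python
import json


def unfilled_fields(params, prefix=""):
    """Return dotted paths of values that still look like '<...>' placeholders."""
    missing = []
    for key, value in params.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Descend into nested sections such as "aesthetics" or "labels".
            missing.extend(unfilled_fields(value, prefix=f"{path}."))
        elif isinstance(value, str) and value.startswith("<") and value.endswith(">"):
            missing.append(path)
    return missing


# Excerpt of the generated template, copied from the listing above.
template = json.loads("""
{
  "filename": "<filename path>",
  "variables": null,
  "aesthetics": {
    "y_variable": null,
    "x_variable": "<X required column name>",
    "fill": null
  },
  "labels": {"title": null, "subtitle": null}
}
""")
print(unfilled_fields(template))  # prints ['filename', 'aesthetics.x_variable']
```

For this excerpt, the two reported fields are the ones that Step 2 fills in.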
Step 2, add the 'data file', 'x variable' and 'title' values in the template:

    {
      "filename": "https://datahub.io/machine-learning/iris/r/iris.csv",
      "aesthetics": {
        "x_variable": "sepallength"
      },
      "labels": {
        "title": "Sepal length histogram"
      }
    }
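Step 2 can also be done programmatically instead of editing the file by hand. The following is a minimal sketch, assuming the generated histogram_params.json is strict JSON that Python's `json` module can parse; the helper names `deep_update` and `fill_template` are hypothetical and not part of rugplot.

```python
import json


def deep_update(base, updates):
    """Recursively copy `updates` into `base`, keeping keys that are not mentioned."""
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_update(base[key], value)
        else:
            base[key] = value
    return base


def fill_template(path, updates):
    """Load the JSON file at `path`, merge `updates` into it, and write it back."""
    with open(path, encoding="utf-8") as fh:
        params = json.load(fh)
    deep_update(params, updates)
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(params, fh, indent=2)
    return params


# The three values from Step 2.
step2 = {
    "filename": "https://datahub.io/machine-learning/iris/r/iris.csv",
    "aesthetics": {"x_variable": "sepallength"},
    "labels": {"title": "Sepal length histogram"},
}

# Usage (assumes the file from Step 1 exists in the current directory):
# fill_template("histogram_params.json", step2)
```

A recursive merge is used so that nested entries not mentioned in the update, such as "subtitle" or "fill", keep the values from the template.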
Step 3, create the histogram
rugplot plot -p histogram --file histogram_params.json
docker run --rm -v "$PWD":/app/data -u $(id -u):$(id -g) venustiano/rugplot:0.1.0 plot -p histogram --file histogram_params.json
docker run --rm -v ${PWD}:/app/data venustiano/rugplot:0.1.0 plot -p histogram --file histogram_params.json
The result will be stored in the Rplots.pdf file.
Customizing the histogram
Different attributes can be customized, such as other labels, colours
and the file format. For example, adding the values below to
histogram_params.json (to save space, only the updated name/value
pairs are listed)
    "colour": "class",
    "labels": {
      "x": "Sepal length"
    },
    "save": {
      "save": true,
      "outputfilename": "sepal-length_histogram.png",
      "device": "png"
    }
and running exactly the same command as in Step 3 will produce the
histogram, stored in a PNG file.
The PNG file has the default size of 10 x 15 cm (height x width) and a
resolution of 72 dots per inch. These properties can be changed in the
"save" attributes of the JSON file.
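To relate the physical size and resolution to the image's pixel dimensions, the arithmetic can be sketched as follows. One inch is 2.54 cm, so a length in centimetres is divided by 2.54 and multiplied by the dots per inch. The helper name `cm_to_pixels` is hypothetical, and the exact pixel count written by the graphics device may differ slightly depending on how it rounds.

```python
CM_PER_INCH = 2.54


def cm_to_pixels(length_cm, dpi):
    """Convert a length in centimetres to a whole number of pixels at `dpi`."""
    return round(length_cm / CM_PER_INCH * dpi)


# Default size from the text: 10 cm high, 15 cm wide, 72 dots per inch.
height_px = cm_to_pixels(10, 72)
width_px = cm_to_pixels(15, 72)
print(width_px, height_px)  # prints 425 283
```

Doubling the resolution to 144 dots per inch at the same physical size would roughly double both pixel dimensions.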
Other properties can also be added, such as facets, interactive plots and LaTeX tikzDevice plots.