rugplot: Scatter plots

A scatter plot is commonly used to visualize the relationship between two variables. The examples below show how to create scatter plots with the rugplot container, using the well-known iris dataset (Fisher, 1936). The dataset can be downloaded directly from DataHub by running the following command:

wget https://datahub.io/machine-learning/iris/r/iris.csv

or it can be read directly from the URL by putting the link from the previous command in the "filename" field of the JSON template, as shown in Step 2 below.
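If the file is downloaded locally, it can help to confirm its column names first, because the x, y, and colour entries in the template expect column names. The following is a minimal shell sketch and not part of rugplot: the helper name `csv_columns` is hypothetical, and it assumes a plain comma-separated header line without quoted commas.

```shell
# Hypothetical helper (not part of rugplot): print the column names of a
# CSV file, one per line.
# Assumes the first line is a plain comma-separated header with no quoted commas.
csv_columns() {
    # Take the header line, drop any Windows carriage return, split on commas
    head -n 1 "$1" | tr -d '\r' | tr ',' '\n'
}

# Example, after downloading iris.csv with the wget command above:
#   csv_columns iris.csv
```

The names printed this way are the values to use for "x_variable", "y_variable", and "colour" in the template.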

Creating a scatter plot using the rugplot container

For simplicity, create an alias for running the container first (see the Docker commands section).

  1. Step 1, create a rugplot scatter template

    rugplot template -p scatter
    

    A scatter_params.json file will be created. It includes, among others, the name/value pairs listed below:

    {
        "description": "Parameters to create a scatter plot using the 'rugplot' R package",
        "filename": "<filename path>",
        "variables": null,
        "aesthetics": {
            "y_variable": "<Y required column name>",
            "x_variable": "<X required column name>",
            "colour": null
        },
        "labels": {
            "title": null,
            "subtitle": null
        }
    }
    
  2. Step 2, fill in the data file ("filename"), the x and y variables, the colour variable, and the title in the template:

    {
        "filename": "https://datahub.io/machine-learning/iris/r/iris.csv",
        "aesthetics": {
            "y_variable": "sepallength",
            "x_variable": "sepalwidth",
            "colour": "class"
        },
        "labels": {
            "title": "width vs length"
        }
    }
    
  3. Step 3, create the scatter plot

    rugplot plot -p scatter --file scatter_params.json
    

    The result will be stored in the Rplots.pdf file.

    Figure: scatter plot
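The generated template marks required values with placeholders in angle brackets, such as "<filename path>". A small check before Step 3 can catch a template that was not fully edited. The following is a minimal shell sketch under stated assumptions: the helper name `params_ready` is hypothetical and not part of rugplot, and it only looks for string values that still start with "<"; it does not validate the JSON or the column names.

```shell
# Hypothetical helper (not part of rugplot): succeed only if the parameter
# file exists and no value still looks like a template placeholder,
# e.g. "filename": "<filename path>".
params_ready() {
    params_file="$1"
    if [ ! -f "$params_file" ]; then
        echo "missing parameter file: $params_file" >&2
        return 1
    fi
    # A placeholder is assumed to be a string value that starts with "<"
    if grep -q '": *"<' "$params_file"; then
        echo "unfilled placeholders in $params_file:" >&2
        grep -n '": *"<' "$params_file" >&2
        return 1
    fi
    return 0
}

# Usage in an interactive shell where the rugplot alias is defined:
#   rugplot template -p scatter
#   (edit scatter_params.json as in Step 2)
#   params_ready scatter_params.json && rugplot plot -p scatter --file scatter_params.json
```

Because bash does not expand aliases in non-interactive scripts by default, the usage lines are meant to be typed in the shell where the alias was created.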