sjp.scatter {sjPlot}

This document shows examples for using the sjp.scatter function of the sjPlot package.

Data initialization

Please refer to this document. The examples below load the efc sample data and apply the "scatter" theme.

library(sjPlot)
library(sjmisc)
data(efc)
sjp.setTheme(theme = "scatter", 
             geom.label.size = 3.5, 
             axis.textsize = .9, 
             axis.title.size = .9)

Simple scatter plot

The simplest call provides either one variable or two variables (one for the x-axis and one for the y-axis).

If you use just one variable (vector), the second axis will range from one to the length of the vector.

sjp.scatter(efc$c160age)
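
As a rough illustration of the single-vector case, the following base R sketch pairs each value with its position in the vector. The vector v and the data frame d are made-up examples, not part of the efc data, and the code only mirrors the idea described above rather than the internals of sjp.scatter.

```r
# Hypothetical example values, not taken from the efc data
v <- c(54, 61, 47, 70, 58)

# Pair each value with its position in the vector
d <- data.frame(x = v, y = seq_along(v))

# The second axis runs from 1 to length(v)
range(d$y)

# Base R scatter plot of the values against their positions
plot(d$x, d$y, xlab = "value", ylab = "index")
```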

To plot the variable along the other axis, name the axis argument explicitly. As before, the remaining axis then ranges from 1 to the length of the vector.

sjp.scatter(y = efc$c160age)

With two variables, the first is plotted on the x-axis and the second on the y-axis.

sjp.scatter(efc$c160age, efc$e17age)

Automatic titles

The title and axis labels are added automatically if the variables' label attributes have been set (see this documentation for further information). In the example below, title = NULL requests the automatic title.

sjp.scatter(efc$c160age, efc$e17age, title = NULL)

To remove the plot title or the axis titles, set the corresponding argument to "" (which is the default for title).

sjp.scatter(efc$c160age, efc$e17age, axis.titles = c("", ""))

Adding a rug plot and avoiding overplotting

Continuous variables with a wide range of values rarely suffer from overplotting, that is, dots hiding each other. The problem typically arises with variables that have only a few categories (factor levels). The function estimates the amount of overlap and automatically jitters the dots, as in the following example, which also passes a grouping variable as third argument and adds a marginal rug plot.

sjp.scatter(efc$e16sex, 
            efc$neg_c_7, 
            efc$c172code, 
            show.rug = TRUE, 
            title = NULL)
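
To make the effect of jittering concrete, here is a small base R sketch using jitter() and rug(). The data are simulated and the jitter amount of 0.2 is an arbitrary assumption; how sjp.scatter estimates overlap and chooses its jitter width is not shown here.

```r
set.seed(1)  # make the random jitter reproducible

# Hypothetical categorical data: few distinct values, many repeats
x <- rep(c(1, 2), each = 50)
y <- sample(7:10, 100, replace = TRUE)

# Without jitter, many points share identical coordinates
n_dup_raw <- sum(duplicated(data.frame(x, y)))

# jitter() adds a small amount of uniform noise to each value
xj <- jitter(x, amount = 0.2)
yj <- jitter(y, amount = 0.2)
n_dup_jittered <- sum(duplicated(data.frame(xj, yj)))

plot(xj, yj, xlab = "x (jittered)", ylab = "y (jittered)")
rug(xj, side = 1)  # marginal rug along the x-axis
rug(yj, side = 2)  # marginal rug along the y-axis
```

Comparing n_dup_raw with n_dup_jittered shows how many exactly overlapping points the noise separates.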

The same plot, when auto-jittering is turned off, would look like this:

sjp.setTheme(theme = "scatter", 
             geom.label.size = 3.5, 
             axis.textsize = .8, 
             axis.title.size = .85, 
             legend.size = .75, 
             legend.title.size = .85,
             legend.pos = c(.5, 1),  # legend inside plot, centered (.5) at top (1)
             legend.just = c(.5, 1)) # legend anchor point: centered (.5) at its top (1)
sjp.scatter(efc$e16sex, 
            efc$neg_c_7, 
            efc$c172code, 
            show.rug = TRUE, 
            auto.jitter = FALSE)

Use dot size to indicate overplotting

Instead of jittering dots to avoid overplotting, you can indicate the number of overlapping values by dot size, using the emph.dots argument. First, let’s examine overplotting by lowering the alpha level of the dot geoms: darker dots indicate more overlapping dots.

sjp.setTheme(theme = "scatter", 
             geom.label.size = 3.5, 
             axis.textsize = .85, 
             axis.title.size = .85,
             geom.alpha = .4)
sjp.scatter(efc$c160age, efc$e17age)

Now we reset the alpha level and use the overlap count as dot size, so bigger dots indicate a larger number of overlapping values.

sjp.setTheme(theme = "scatter", 
             geom.label.size = 3.5, 
             axis.textsize = .85, 
             axis.title.size = .85)
sjp.scatter(efc$c160age, efc$e17age, emph.dots = TRUE)
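
The idea of sizing dots by overlap count can be sketched in base R as follows. The x and y vectors are made-up values, and using the raw count directly as the cex size is an assumption for illustration; the scaling that emph.dots applies may differ.

```r
# Hypothetical data with repeated (x, y) pairs
x <- c(1, 1, 1, 2, 2, 3)
y <- c(5, 5, 5, 6, 6, 7)

# Count how often each (x, y) pair occurs
counts <- aggregate(n ~ x + y, data = data.frame(x, y, n = 1), FUN = sum)
counts

# Draw one dot per distinct pair, scaled by its overlap count
plot(counts$x, counts$y, cex = counts$n, pch = 19, xlab = "x", ylab = "y")
```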

Adding fitted lines

With fit.line.grps you can add fitted lines to the scatter plot: when a grouping variable is supplied, one fitted line is plotted for each group.

sjp.setTheme(theme = "scatter", 
             geom.label.size = 3.5, 
             axis.textsize = .85, 
             axis.title.size = .85, 
             legend.size = .8, 
             legend.title.size = .8,
             legend.pos = "right")
sjp.scatter(efc$c160age, 
            efc$e17age, 
            efc$e42dep, 
            fit.line.grps = TRUE)

You can specify the fit method with fitmethod. Furthermore, you can add a fitted line for the overall plot with fit.line.

sjp.scatter(efc$c160age, 
            efc$e17age, 
            efc$e42dep, 
            fit.line.grps = TRUE, 
            fitmethod = "loess", 
            fit.line = TRUE)
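
For readers who want to see what per-group and overall fits amount to, here is a minimal base R sketch. It uses simulated data and linear fits via lm() rather than loess to keep the example short; the data frame d and its columns are hypothetical.

```r
set.seed(42)

# Hypothetical data: two groups with different slopes
d <- data.frame(
  x = rep(1:20, times = 2),
  g = rep(c("a", "b"), each = 20)
)
d$y <- ifelse(d$g == "a", 1 + 2 * d$x, 5 + 0.5 * d$x) + rnorm(40, sd = 0.5)

# One linear fit per group
fits <- lapply(split(d, d$g), function(sub) lm(y ~ x, data = sub))
slopes <- sapply(fits, function(f) coef(f)[["x"]])

# One linear fit for the whole sample
overall <- lm(y ~ x, data = d)

plot(d$x, d$y, col = ifelse(d$g == "a", "red", "blue"), xlab = "x", ylab = "y")
abline(fits$a, col = "red")
abline(fits$b, col = "blue")
abline(overall, lty = 2)  # dashed line: overall fit
```

The group lines recover the two different slopes, while the dashed overall line averages across both groups.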

Faceting groups

Sometimes the plot is clearer when groups are faceted. The following figure shows a faceted scatter plot in which a standard error region is added to each group's fitted line.

sjp.scatter(efc$c160age, 
            efc$e17age, 
            efc$e42dep, 
            fit.line.grps = TRUE, 
            fitmethod = "loess", 
            show.ci = TRUE, 
            facet.grid = TRUE)
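
A comparable layout can be approximated in base R, shown here as a sketch under stated assumptions: the data are simulated, the panels are produced with par(mfrow = ...), and the shaded band is the loess fit plus and minus one standard error from predict(). The region drawn by sjp.scatter may be computed differently.

```r
set.seed(7)

# Hypothetical data: three groups, 30 observations each
d <- data.frame(
  x = rep(1:30, times = 3),
  g = rep(c("g1", "g2", "g3"), each = 30)
)
d$y <- sin(d$x / 5) + rnorm(nrow(d), sd = 0.3)

# One panel per group, side by side
op <- par(mfrow = c(1, 3))
bands <- lapply(split(d, d$g), function(sub) {
  fit <- loess(y ~ x, data = sub)
  pr <- predict(fit, se = TRUE)  # fitted values and their standard errors
  lower <- pr$fit - pr$se.fit
  upper <- pr$fit + pr$se.fit
  plot(sub$x, sub$y, main = as.character(sub$g[1]), xlab = "x", ylab = "y")
  # Shaded region: fit minus/plus one standard error
  polygon(c(sub$x, rev(sub$x)), c(lower, rev(upper)),
          col = "grey80", border = NA)
  points(sub$x, sub$y)  # redraw the points on top of the band
  lines(sub$x, pr$fit)  # fitted loess line
  data.frame(x = sub$x, fit = pr$fit, lower = lower, upper = upper)
})
par(op)  # restore the previous graphics settings
```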