sjp.scatter {sjPlot}

This document shows examples for using the sjp.scatter function of the sjPlot package.

Resources:

• Developer snapshot at GitHub
• Submission of bug reports and issues at GitHub


Data initialization

Please refer to this document.

library(sjPlot)
library(sjmisc)
data(efc)
sjp.setTheme(theme = "scatter",
             geom.label.size = 3.5,
             axis.textsize = .9,
             axis.title.size = .9)

Simple scatter plot

In the simplest function call, you provide either one variable or two variables (one for the x-axis and one for the y-axis).

If you use just one variable (vector), the second axis will range from one to the length of the vector.

sjp.scatter(efc$c160age)

If you want to plot the variable along the other axis, specify the axis.

sjp.scatter(y = efc$c160age)

With two variables, the first is used for the x-axis and the second for the y-axis.

sjp.scatter(efc$c160age, efc$e17age)

Automatic titles

The title and axis labels can be added automatically if the label attributes of the variables have been set accordingly (see this documentation for further information).

sjp.scatter(efc$c160age, efc$e17age, title = NULL)

If you want to remove (axis) titles, set the related argument to "" (which is the default for title).

sjp.scatter(efc$c160age, efc$e17age, axis.titles = c("", ""))

Adding a rug plot and avoiding overlap

Continuous variables with a wide range of values rarely cause problems with overplotting (overlaying dots). The problem usually occurs when variables have just a few categories (factor levels). The function estimates the amount of overlaying dots and jitters them automatically, as in the following example, which also includes a marginal rug plot.

sjp.scatter(efc$e16sex, efc$neg_c_7,
            efc$c172code, show.rug = TRUE, title = NULL)

The same plot, when auto-jittering is turned off, would look like this:

sjp.setTheme(theme = "scatter",
             geom.label.size = 3.5,
             axis.textsize = .8,
             axis.title.size = .85,
             legend.size = .75,
             legend.title.size = .85,
             legend.pos = c(.5, 1),   # legend inside plot, centered (.5) at top (1)
             legend.just = c(.5, 1))  # legend justification centered (.5) at top (1)
sjp.scatter(efc$e16sex,
            efc$neg_c_7, efc$c172code,
            show.rug = TRUE,
            auto.jitter = FALSE)
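
To illustrate what jittering does in general, the following base R sketch adds a small amount of random noise to two discrete variables so that identical points no longer sit on top of each other. This is not the algorithm sjp.scatter uses internally; the simulated data, the variable names and the jitter amount of 0.2 are assumptions made for illustration only.

```r
# Minimal sketch of jittering with base R (not sjPlot's internal method).
set.seed(123)

# Hypothetical data: two discrete variables with only a few distinct values
x <- sample(1:2, 200, replace = TRUE)
y <- sample(7:14, 200, replace = TRUE)

# Share of observations that fall on an already occupied (x, y) position
overlap_share <- mean(duplicated(data.frame(x, y)))

# jitter() adds uniform noise; 'amount' is the maximum shift in either direction
x_jit <- jitter(x, amount = 0.2)
y_jit <- jitter(y, amount = 0.2)

plot(x_jit, y_jit, xlab = "x (jittered)", ylab = "y (jittered)")
```

Because the noise is bounded by the chosen amount, each dot stays close to its original category while the density of the cloud becomes visible.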

Use dot size to indicate overplotting

Instead of jittering dots to avoid overplotting, you can indicate a higher count of overlapping values by dot size, using the emph.dots argument. First, let’s examine overplotting by changing the alpha level of the dot geoms. Darker dots indicate more overlapping dots.

sjp.setTheme(theme = "scatter",
             geom.label.size = 3.5,
             axis.textsize = .85,
             axis.title.size = .85,
             geom.alpha = .4)
sjp.scatter(efc$c160age, efc$e17age)

Now we reset the alpha level and use the count of overlapping values as the dot size, so bigger dots indicate a larger number of overlapping values.

sjp.setTheme(theme = "scatter",
             geom.label.size = 3.5,
             axis.textsize = .85,
             axis.title.size = .85)
sjp.scatter(efc$c160age, efc$e17age, emph.dots = TRUE)
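
The idea of mapping overlap counts to dot size can be sketched in base R as follows. This is only an illustration of the principle, not the code behind emph.dots; the simulated data and the square-root scaling of the symbol size are assumptions.

```r
# Minimal sketch: count identical (x, y) pairs and use the count as dot size.
set.seed(42)

# Hypothetical data with many repeated value pairs
x <- sample(20:25, 100, replace = TRUE)
y <- sample(65:70, 100, replace = TRUE)

# One row per distinct (x, y) pair, with the number of observations at that pair
counts <- aggregate(n ~ x + y, data = data.frame(x, y, n = 1), FUN = sum)

# Scale symbol size with the square root of the count,
# so that symbol area grows roughly in proportion to frequency
plot(counts$x, counts$y, cex = sqrt(counts$n), pch = 16,
     xlab = "x", ylab = "y")
```

Each position is drawn once, so nothing is hidden behind another dot, and the size carries the information that jittering would otherwise spread out.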

With fit.line.grps you can add a fitted line to the scatter plot. If you have groups, a fitted line for each group is plotted.

sjp.setTheme(theme = "scatter",
             geom.label.size = 3.5,
             axis.textsize = .85,
             axis.title.size = .85,
             legend.size = .8,
             legend.title.size = .8,
             legend.pos = "right")
sjp.scatter(efc$c160age, efc$e17age,
            efc$e42dep, fit.line.grps = TRUE)

You can specify the fit method with fitmethod. Furthermore, you can add a fitted line for the overall plot with fit.line.

sjp.scatter(efc$c160age,
            efc$e17age, efc$e42dep,
            fit.line.grps = TRUE,
            fitmethod = "loess",
            fit.line = TRUE)
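
What "one fitted line per group plus an overall line" means can be shown with a small base R sketch. It uses simulated data and linear models via lm(); the group names, slopes and noise level are hypothetical, and the sketch does not claim to reproduce how sjp.scatter computes its lines.

```r
# Minimal sketch: one linear fit per group plus an overall fit, base R only.
set.seed(1)

# Hypothetical data: three groups with different slopes
grp <- factor(rep(c("a", "b", "c"), each = 30))
x <- runif(90, 20, 80)
slope <- c(a = 0.2, b = 0.5, c = 0.8)[as.character(grp)]
y <- 50 + slope * x + rnorm(90, sd = 3)
dat <- data.frame(x, y, grp)

# Fit a separate linear model within each group
fits <- lapply(split(dat, dat$grp), function(d) lm(y ~ x, data = d))

# Overall fit that ignores the grouping
fit_all <- lm(y ~ x, data = dat)

# Dots and group lines share a colour; the overall line is dashed
plot(dat$x, dat$y, col = as.integer(dat$grp), pch = 16, xlab = "x", ylab = "y")
for (i in seq_along(fits)) abline(fits[[i]], col = i)
abline(fit_all, lty = 2)
```

Replacing lm() with loess() in the per-group step would give smoothed curves instead of straight lines, which is the kind of choice the fitmethod argument exposes.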

Faceting groups

Sometimes the plot is clearer when groups are faceted. The following figure shows a faceted scatter plot in which a standard error region is added to each group’s fitted line.

sjp.scatter(efc$c160age, efc$e17age,
            efc$e42dep,
            fit.line.grps = TRUE,
            fitmethod = "loess",
            show.ci = TRUE,
            facet.grid = TRUE)
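
As a rough illustration of a loess line with a standard error region, the base R sketch below fits a loess curve to simulated data and draws a band of plus or minus 1.96 standard errors around it. The data, the 1.96 multiplier (an approximate 95% band) and the plotting choices are assumptions; how sjp.scatter derives its region with show.ci is not shown here.

```r
# Minimal sketch: loess fit with an approximate standard error band, base R only.
set.seed(7)

# Hypothetical data with a non-linear trend
x <- sort(runif(80, 0, 10))
y <- sin(x) + rnorm(80, sd = 0.3)

fit <- loess(y ~ x)

# se = TRUE returns the fitted values together with their standard errors
pred <- predict(fit, se = TRUE)
upper <- pred$fit + 1.96 * pred$se.fit
lower <- pred$fit - 1.96 * pred$se.fit

plot(x, y, pch = 16, col = "grey50")
lines(x, pred$fit)
lines(x, upper, lty = 2)
lines(x, lower, lty = 2)
```

In a faceted plot, the same computation would simply be repeated on the subset of data belonging to each panel.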