sjp.grpfrq {sjPlot}

This document shows examples for using the sjp.grpfrq function of the sjPlot package.

Data initialization

Please refer to this document.

# load package
library(sjPlot)
library(sjmisc)
# load sample data set.
data(efc)
# set theme
sjp.setTheme(theme = "539",
             axis.title.size = .8,
             axis.textsize = .8,
             geom.label.size = 3,
             geom.label.color = "black",
             legend.size = .8, 
             legend.title.size = .8)

Customizing plot appearance

Please refer to this document.

Plotting simple grouped frequencies

First, the basic plotting type is described. You can change the plot type with the type argument. When bar charts are plotted, some statistics are plotted by default: N, Chi-square, df, Phi or Cramer’s V and p-value. You can show the summary with the show.summary argument.
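For orientation, the summary statistics listed above can be reproduced with base R alone. This is a minimal sketch, not sjPlot's internal code; the counts are invented for illustration and all object names are hypothetical.

```r
# Hypothetical 2 x 3 table of counts (rows: groups, columns: categories).
tab <- matrix(
  c(20, 30, 25, 25, 35, 15),
  nrow = 2,
  dimnames = list(group = c("a", "b"), category = c("x", "y", "z"))
)

n <- sum(tab)                             # N
test <- chisq.test(tab, correct = FALSE)  # Pearson's Chi-squared test
chi2 <- unname(test$statistic)            # Chi-square
df <- unname(test$parameter)              # degrees of freedom
p_value <- test$p.value                   # p-value

# Cramer's V; for a 2 x 2 table this equals the absolute value of Phi.
cramers_v <- sqrt(chi2 / (n * (min(dim(tab)) - 1)))

print(c(N = n, chi2 = chi2, df = df, V = cramers_v, p = p_value))
```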

Basic frequency plot

The simplest function call just uses two vectors / variables as arguments: first, the count variable that is plotted along the x-axis; second, the grouping variable. The intention is to plot distributions of categorical variables (factors) by groups. By default, count and percentage values are automatically plotted.

Note that value and variable labels have already been attached (see paragraph Data initialization). The value labels of the count variable are plotted on the x-axis, the value labels of the grouping variable are used as legend labels.

sjp.grpfrq(efc$e42dep, efc$e16sex)
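
The counts and percentages shown in such a plot correspond to a simple cross-tabulation. The following base R sketch uses invented data and hypothetical variable names; it assumes percentages are taken within each group, which may differ from what the plot actually displays.

```r
# Hypothetical data: a count variable and a grouping variable.
dependency <- factor(c("low", "low", "high", "high", "high", "low", "high", "low"))
sex <- factor(c("male", "female", "female", "male", "female", "female", "male", "female"))

# One row per group, one column per category of the count variable.
counts <- table(sex, dependency)

# Assumption: percentages are computed within each group (row-wise).
percentages <- round(100 * prop.table(counts, margin = 1), 1)

print(counts)
print(percentages)
```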

You can also stack the bars using bar.pos = "stack". In this case, the y-limit of the axis is adjusted. You may need to specify your own y-axis-limit using ylim.

sjp.grpfrq(efc$e42dep, 
           efc$e16sex, 
           ylim = c(0, 400), 
           bar.pos = "stack")
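
To see why stacking changes the required y-limit, compare the tallest single bar with the tallest stack. This is a rough sketch with invented counts; the rounding step is a hypothetical helper, not something sjPlot is known to do.

```r
# Hypothetical counts per category (columns) and group (rows).
counts <- rbind(
  male = c(10, 40, 90, 120),
  female = c(50, 180, 210, 200)
)

dodged_max <- max(counts)            # tallest bar when bars are side by side
stacked_max <- max(colSums(counts))  # tallest bar when groups are stacked

# Hypothetical rule: round the upper limit up to the next multiple of 50.
upper_limit <- ceiling(stacked_max / 50) * 50
ylim <- c(0, upper_limit)
print(ylim)
```

The resulting vector could then be passed as the ylim argument.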

Grouped frequency plot with automatic title

The plot title can also be extracted automatically if variable labels are attached to the variables (see paragraph Data initialization, function set_label). If the title argument is set to NULL, the variable labels will be used as plot title. The title then combines the variable labels of the count and the grouping variable.

sjp.grpfrq(efc$e42dep,
           efc$e16sex, 
           title = NULL,
           show.summary = TRUE,
           summary.pos = "l")
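
The idea of building a title from attached labels can be sketched in base R. The attribute name "label", the helper label_or and the separator " by " are all assumptions made for illustration; the exact title format used by the package is not specified here.

```r
# Hypothetical variables; storing labels in a "label" attribute is an assumption.
dependency <- c(1, 2, 3, 4)
attr(dependency, "label") <- "elder's dependency"
sex <- c(1, 2, 2, 1)
attr(sex, "label") <- "elder's gender"

# Hypothetical helper: return the label, or a default if none is attached.
label_or <- function(x, default) {
  lab <- attr(x, "label", exact = TRUE)
  if (is.null(lab)) default else lab
}

# The separator " by " is illustrative only.
plot_title <- paste(label_or(dependency, "x"), label_or(sex, "group"), sep = " by ")
print(plot_title)
```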

Legend in grouped frequency plot

In contrast to the sjp.frq function, sjp.grpfrq has a legend for the grouping variable. You have various options to change the legend style (see customize plot appearance for more details). Note that the following example also prints the group counts, using show.grpcnt = TRUE.

sjp.setTheme(legend.pos = "bottom", 
             legend.size = .8, 
             legend.backgroundcol = "grey80", 
             legend.bordercol = "grey30",
             axis.title.size = .8,
             axis.textsize = .8,
             geom.label.size = 3)
sjp.grpfrq(efc$e42dep, 
           efc$e16sex,
           legend.title = "Elders gender",
           show.grpcnt = TRUE,
           axis.titles = "")

Dot plots

Dot plots can be made using the type = "dot" argument. To emphasize group associations, dots for each group are highlighted by a shaded area. This highlighting area can be switched off using the emph.dots argument. Use geom.spacing to tweak the jitter of the dots.

# reset legend settings
sjp.setTheme(axis.title.size = .8,
             axis.textsize = .8,
             geom.label.size = 3,
             legend.size = .8, 
             legend.title.size = .8)
sjp.grpfrq(efc$e42dep,
           efc$e15relat, 
           type = "dot", 
           geom.colors = "PuRd", 
           geom.size = 2,
           geom.spacing = .5,
           show.values = FALSE,
           coord.flip = TRUE,
           expand.grid = TRUE)

Grouped frequency plot with count data

If you have count variables or numeric variables with many categories, you can still plot them as a normal frequency plot. However, you have to take into account that the bars and axis labels may be very narrow, so adjusting the figure width is recommended:

sjp.grpfrq(efc$e17age, 
           efc$e16sex, 
           show.values = FALSE)

Note that if there are groups with expected values smaller than 5, Fisher's exact test is computed instead of the Chi-squared test.
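
The rule about small expected values can be illustrated in base R. This is a sketch of the general decision rule under the stated threshold of 5, not the package's actual code; the table is invented.

```r
# Hypothetical sparse 2 x 2 table of counts.
tab <- matrix(c(3, 1, 2, 6), nrow = 2)

# Expected counts under independence: row total * column total / N.
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# Switch to Fisher's exact test if any expected count is below 5.
use_fisher <- any(expected < 5)
test <- if (use_fisher) fisher.test(tab) else chisq.test(tab)

print(expected)
print(test$method)
```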

Grouped frequency plot with grouped count data, automatic grouping

The sjp.grpfrq function can automatically group variables with many categories in order to produce clearer plots. Use the auto.group argument to specify the number of unique values at which a variable is automatically grouped.

sjp.grpfrq(efc$e17age, 
           efc$e16sex, 
           auto.group = 15)

Note that auto.group = 15 does not produce exactly 15 groups, but at most 15 groups. The group size is calculated by dividing the range of the variable by the auto.group value and rounding up to the next integer. In the above case: 38/15 = 2.53, which means a group size of 3. See sjmisc::group_var for more details.
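
The arithmetic above can be checked directly in R. The object names are hypothetical, and the snippet reproduces only the calculation described in the text, not the grouping function itself.

```r
value_range <- 38  # range of the variable, taken from the example above
auto_group <- 15   # value passed as auto.group

# Divide the range by auto.group and round up to the next integer.
group_size <- ceiling(value_range / auto_group)
print(group_size)

# With this group size, the range spans no more than auto_group intervals.
stopifnot(value_range / group_size <= auto_group)
```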

Line-styled histogram

Instead of plotting bars, you can also plot a line curve with type = "line":

sjp.grpfrq(efc$e17age, 
           efc$e16sex, 
           show.values = FALSE, 
           type = "line")

Box and violin plots

Count variables may also be plotted as box or violin plots. Use the argument type = "box" to plot a box plot. Besides the median, the mean value of a variable is plotted as a small circle inside the box plot. Furthermore, Mann-Whitney U tests between the groups can be plotted as annotation if show.summary = TRUE.

sjp.grpfrq(efc$barthtot,
           efc$e42dep, 
           type = "box", 
           geom.size = .3, 
           inner.box.width = 4, 
           title = NULL, 
           expand.grid = TRUE)
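
The quantities shown in such a box plot (median, mean, and a Mann-Whitney U test between groups) can be computed with base R. The data below are simulated and the object names are hypothetical; this is an illustration of the statistics, not of how the package computes them.

```r
set.seed(1)  # reproducible simulated data
score <- c(rnorm(30, mean = 80, sd = 10), rnorm(30, mean = 60, sd = 10))
group <- factor(rep(c("g1", "g2"), each = 30))

medians <- tapply(score, group, median)  # line inside each box
means <- tapply(score, group, mean)      # extra marker inside each box

# Mann-Whitney U test between the two groups (wilcox.test in base R).
mw <- wilcox.test(score ~ group)

print(rbind(median = medians, mean = means))
print(mw$p.value)
```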

With type = "violin" you can plot a violin plot. This plot shows a (mirrored) vertical density curve of the variable with a box plot inside of the violin plot.

sjp.grpfrq(efc$barthtot, 
           efc$e42dep, 
           type = "violin", 
           inner.box.width = .2)
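
The "mirrored density curve" can be sketched with a kernel density estimate from base R. The data are simulated and the object names are hypothetical; bandwidth and scaling in an actual violin plot may differ.

```r
set.seed(2)  # reproducible simulated data
x <- rnorm(200, mean = 50, sd = 10)

d <- density(x)  # kernel density estimate of the variable

# Violin outline: the density on one side, then its mirror image on the other.
outline_width <- c(d$y, -rev(d$y))
outline_value <- c(d$x, rev(d$x))

print(range(outline_width))
```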

Adding interaction terms

Box and violin plots can have additional interaction terms, i.e. further "sub grouping" or subsetting. Use intr.var to specify the interaction variable. Note that this argument is only used if type is "box" or "violin". In the following example, box plots for age by dependency are plotted, where each dependency category interacts with gender:

sjp.grpfrq(efc$e17age, 
           efc$e42dep, 
           intr.var = efc$e16sex, 
           type = "box")
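
The sub grouping created by an interaction term can be illustrated with base R's interaction function. The data and variable names are invented; the sketch only shows how two factors combine into sub groups, one box per combination.

```r
# Hypothetical data.
age <- c(70, 82, 75, 90, 68, 85, 79, 88)
dependency <- factor(c("low", "high", "low", "high", "low", "high", "low", "high"))
sex <- factor(c("m", "f", "f", "m", "m", "f", "f", "m"))

# Each combination of dependency and sex forms one sub group.
subgroup <- interaction(dependency, sex, sep = " / ")

# Median age per sub group, i.e. the centre line of each box.
medians <- tapply(age, subgroup, median)
print(medians)
```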

Faceting groups

With facet.grid = TRUE, the plot can be faceted according to the grouping variable. Each group then appears in its own facet. This works for bar charts and histograms as well as for box and violin plots.

sjp.setTheme(axis.angle.x = 90,
             axis.title.size = .8,
             axis.textsize = .8,
             geom.label.size = 3)
sjp.grpfrq(efc$e42dep, 
           efc$e15relat, 
           show.values = FALSE, 
           facet.grid = TRUE)

sjp.setTheme(axis.title.size = .8,
             axis.textsize = .8)
sjp.grpfrq(efc$e17age, 
           efc$e42dep, 
           show.values = FALSE, 
           facet.grid = TRUE, 
           ylim = c(0, 25))
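
Conceptually, faceting by the grouping variable amounts to splitting the count variable by group and tabulating each part separately. The following base R sketch uses invented data and hypothetical names; it also derives a common upper y-limit so that the panels stay comparable.

```r
# Hypothetical data: a categorical variable and a grouping variable.
x <- factor(c("a", "b", "a", "c", "b", "a"), levels = c("a", "b", "c"))
group <- factor(c("g1", "g1", "g2", "g2", "g2", "g1"))

# One frequency table per group, i.e. one table per facet.
panels <- lapply(split(x, group), table)

# A shared upper y-limit across all facets.
shared_max <- max(unlist(lapply(panels, max)))

print(panels)
print(shared_max)
```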