Data initialization {sjPlot/sjmisc}

This document shows basic usage of the sjmisc package and how to prepare data labels for use with the functions of the sjPlot package.

Resources:

• Developer snapshot at GitHub
• Submission of bug reports and issues at GitHub


When visualizing data - for instance, frequencies of factor variables with labels - plots automatically use axis labels depending on the factor levels.

x <- factor(sample(1:2, 200, replace = TRUE, prob = c(0.6, 0.4)), labels = c("female", "male"))
plot(x)
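
To see where those axis labels come from, it can help to inspect the factor directly. The following is a minimal sketch in base R; the seed and the variable name counts are illustrative choices and not part of the original example.

```r
# Fix the seed so the sketch is reproducible (illustrative choice)
set.seed(1)
x <- factor(sample(1:2, 200, replace = TRUE, prob = c(0.6, 0.4)),
            labels = c("female", "male"))

# The factor levels carry the labels ...
levels(x)

# ... and the frequency table that plot(x) draws as bars is named after them
counts <- table(x)
names(counts)
```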

The same applies to functions from the sjPlot package.

library(sjPlot)
sjp.frq(x)

However, when reading data files, especially SPSS data, variables have numeric values and are not labelled factors. Instead, the imported variables carry additional attributes for the value and variable labels. See the following example, taken from the sample data set in the sjmisc package, which contains data read from an SPSS file:

library(sjmisc)
data(efc)
efc$e16sex, axisLabels.x = c("independent", "slightly dependent", "moderately dependent", "severely dependent"), legendLabels = c("male", "female"))

The next two examples demonstrate how you can save time, because labels don’t have to be specified each time you want to plot a figure.

Function call with automatic label detection

Here is a function call that demonstrates the automatic label detection:

sjp.xtab(efc$e42dep, efc$e16sex)

Function call with manually defined axis and legend labels

In this example, the value and variable labels are passed as parameters to the function:

sjp.xtab(efc$e42dep,
         efc$e16sex,
         axis.labels = c("independent", "slightly dependent", "moderately dependent", "severely dependent"),
         axis.titles = "how dependent is the elder? - subjective perception of carer",
         legend.labels = c("male", "female"),
         legend.title = "elder's gender")

Converting data to sjPlot

Some packages add specific class attributes to vectors, for instance the haven or Hmisc package, which create objects of class labelled when creating new (labelled) variables or reading data. If you encounter problems with objects of class labelled, or want to avoid incompatibilities with haven-imported data, there’s a function to ‘convert’ labelled objects into an sjPlot-friendly format: unlabel. When using the sjmisc::read_spss function, this conversion is done automatically.

The original haven-structure of imported data:

str(mydf$e42dep)
## Class 'labelled'  atomic [1:908] 3 3 3 4 4 4 4 4 4 4 ...
##   ..- attr(*, "label")= chr "how dependent is the elder? - subjective perception of carer"
##   ..- attr(*, "labels")= Named num [1:4] 1 2 3 4
##   .. ..- attr(*, "names")= chr [1:4] "independent" "slightly dependent" "moderately dependent" "severely dependent"
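
The structure above is an ordinary numeric vector with two extra attributes: label (the variable label) and labels (a named vector of value labels). As a rough illustration of how such attributes can be used to build a labelled factor by hand, here is a sketch in base R. The vector dep and the helper as_labelled_factor are hypothetical names made up for this example; they are not functions of sjmisc or haven.

```r
# A small numeric vector mimicking the imported variable (made-up values)
dep <- c(3, 3, 4, 1, 2, 4)
attr(dep, "label") <- "how dependent is the elder? - subjective perception of carer"
attr(dep, "labels") <- c("independent" = 1, "slightly dependent" = 2,
                         "moderately dependent" = 3, "severely dependent" = 4)

# Hypothetical helper: build a factor whose levels are named after the value labels
as_labelled_factor <- function(x) {
  value_labels <- attr(x, "labels")
  f <- factor(as.vector(x),                    # as.vector() drops the attributes
              levels = unname(value_labels),
              labels = names(value_labels))
  attr(f, "label") <- attr(x, "label")         # keep the variable label
  f
}

dep_factor <- as_labelled_factor(dep)
levels(dep_factor)
table(dep_factor)
```

Keeping the labels as attributes, rather than converting to a factor on import, preserves the original numeric codes, which is presumably why imported data arrives in this form.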

The result after conversion:

str(unlabel(mydf$e42dep))
##  atomic [1:908] 3 3 3 4 4 4 4 4 4 4 ...
##  - attr(*, "label")= chr "how dependent is the elder? - subjective perception of carer"
##  - attr(*, "labels")= Named num [1:4] 1 2 3 4
##   ..- attr(*, "names")= chr [1:4] "independent" "slightly dependent" "moderately dependent" "severely dependent"

unlabel accepts either a single vector or a complete data frame as its argument, and simply removes the labelled class attribute from the vectors.
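
To make the effect concrete, the following base R sketch mimics this behaviour: it drops only the labelled class and leaves the label and labels attributes untouched. The helper strip_labelled and the example objects v and df are hypothetical and only illustrate the idea; this is not the actual implementation of unlabel.

```r
# A vector carrying the 'labelled' class plus label attributes (made-up values)
v <- structure(c(3, 3, 4),
               class = "labelled",
               label = "how dependent is the elder? - subjective perception of carer",
               labels = c("independent" = 1, "slightly dependent" = 2,
                          "moderately dependent" = 3, "severely dependent" = 4))

# Hypothetical helper: remove the 'labelled' class, keep all other attributes
strip_labelled <- function(x) {
  if (is.data.frame(x)) {
    x[] <- lapply(x, strip_labelled)   # handle data frames column by column
    return(x)
  }
  remaining <- setdiff(oldClass(x), "labelled")
  attr(x, "class") <- if (length(remaining) > 0) remaining else NULL
  x
}

# Single vector
plain <- strip_labelled(v)
inherits(plain, "labelled")
names(attributes(plain))

# Complete data frame
df <- data.frame(id = 1:3)
df$dep <- v
df_plain <- strip_labelled(df)
inherits(df_plain$dep, "labelled")
```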

Writing data to SPSS

The haven package can write R data frames to other formats; currently, SPSS and Stata are supported.

To make sure that value labels are written as well, variables need to be either of class labelled (see haven::labelled()) or labelled factors. However, neither vector type supports variable labels, so the data is saved without variable labels.

The write_spss function of the sjmisc package converts the data into a format that also exports variable labels. When writing data to SPSS or Stata, it is recommended to use the sjmisc write functions:

write_spss(my_data_frame, "path/to/spss-file.sav")