sjt.frq {sjPlot}

This document shows examples for using the sjt.frq function of the sjPlot package.

Basics of the sjt-functions

Please refer to this document

Data initialization

Please refer to this document.

# load package
library(sjPlot)
library(sjmisc)
# load sample data set.
data(efc)

Before using the functions of the sjPlot package, it can be useful to assign value and variable labels to the variables (vectors) or data frames you work with. If you import data with the sjmisc::read_* functions, value labels are attached to the data frame automatically.

Note that factor variables do not necessarily have to be converted to numeric vectors. Factor levels will automatically be used as value labels.

If the data you use with the sjPlot package has value and variable label attributes, you don’t need to specify this information within function calls. The following example shows how attached label attributes save work:

# Function call when vectors have label attributes
sjt.frq(efc$e42dep)
# Equivalent function call when vectors do not have
# label attributes and value labels should be 
# printed to table
sjt.frq(efc$e42dep, 
        value.labels = c("independent", "slightly dependent", 
                         "moderately dependent", "severely dependent"))
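As a rough illustration of what "label attributes" means here, the sketch below attaches a variable label and value labels to a plain numeric vector using only base R. It assumes the attribute names "label" and "labels" (the convention of haven-style labelled data); the vector x and its values are made up for this example, and whether your sjPlot version picks up these attributes should be checked against the package documentation.

```r
# Hypothetical example vector with codes 1 to 4 (not taken from the efc data)
x <- c(1, 2, 2, 4, 3, 1)

# Assumption: the variable label is stored in the "label" attribute ...
attr(x, "label") <- "how dependent is the elder?"

# ... and value labels in "labels", as a named vector (names = label text)
attr(x, "labels") <- c("independent" = 1, "slightly dependent" = 2,
                       "moderately dependent" = 3, "severely dependent" = 4)

# Inspect what has been attached
variable_label <- attr(x, "label")
value_labels <- names(attr(x, "labels"))
print(variable_label)
print(value_labels)
```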

Basics of the sjt-functions: Printing HTML tables

All sjt functions create an HTML page with the formatted table output. By default, this table is opened in the viewer pane of your IDE (if your IDE supports a viewer pane, see argument use.viewer for details). If a viewer pane is not available, the created HTML output is saved as a temporary file and opened in your default web browser. The temporary files are deleted when your R session ends.

You can also save the HTML page as a file for further use by specifying the file argument. The saved HTML file can be opened by word processors like LibreOffice or Microsoft Office.

Character encoding

In some cases, you may have to specify a character encoding in order to get proper labels in the HTML tables. If labels don’t display correctly, use the encoding argument to change the character encoding. The appropriate value depends on your region. The following example, which works for western European countries, is the default behaviour of all sjt-functions:

# You don't need to do this, because all sjt-functions
# use this logic to detect the default encoding.
# The braces keep if/else valid when run at top level.
if (.Platform$OS.type == "unix") {
  encoding <- "UTF-8"
} else {
  encoding <- "Windows-1252"
}
sjt.frq(efc$e15relat, encoding = encoding)

This example first detects your operating system and then chooses the associated character encoding, which is used in the HTML file. If this default does not produce correct labels, set the encoding argument explicitly.

Printing simple frequencies

The basic function call just needs a variable as argument. By default, the variable label (attached as an attribute, see section Data initialization) is used as table caption and the value labels are set automatically. If this information is not available, the standard string "data" is displayed as caption and the values are used as value labels.

sjt.frq(efc$e15relat)

relationship to elder
value                        N    raw %   valid %   cumulative %
spouse/partner             171    18.83     18.98          18.98
child                      473    52.09     52.50          71.48
sibling                     29     3.19      3.22          74.69
daughter or son -in-law     85     9.36      9.43          84.13
ancle/aunt                  23     2.53      2.55          86.68
nephew/niece                22     2.42      2.44          89.12
cousin                       6     0.66      0.67          89.79
other, specify              92    10.13     10.21         100.00
missings                     7     0.77
total N=908 · valid N=901 · x̄=2.85 · σ=2.08

A summary row with total and valid N, as well as mean and standard deviation, is shown by default, even if mean and standard deviation are not meaningful (as in the above example, which shows an unordered factor).
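To make the columns of the table above concrete, the following base R sketch recomputes them from the printed counts. It assumes that raw % is based on the total N (including missings), that valid % and cumulative % are based on the valid N, and that the categories are coded 1 to 8 in the order shown. These assumptions reproduce the printed values but are not taken from the sjPlot source code.

```r
# Counts copied from the table above (categories 1 to 8) and the missing count
n <- c(171, 473, 29, 85, 23, 22, 6, 92)
n_missing <- 7

valid_n <- sum(n)               # valid N
total_n <- valid_n + n_missing  # total N

raw_pct   <- round(100 * n / total_n, 2)          # share of all cases
valid_pct <- round(100 * n / valid_n, 2)          # share of non-missing cases
cum_pct   <- round(100 * cumsum(n) / valid_n, 2)  # running sum of valid shares

# Mean and standard deviation of the numeric codes 1..8 (one value per case)
codes <- rep(seq_along(n), times = n)
code_mean <- round(mean(codes), 2)
code_sd   <- round(sd(codes), 2)

print(data.frame(n, raw_pct, valid_pct, cum_pct))
print(c(total_n = total_n, valid_n = valid_n, mean = code_mean, sd = code_sd))
```

The mean and standard deviation here are computed on the category codes, which is why they carry little meaning for an unordered factor.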

Showing additional statistics

A few arguments allow computing additional statistics, such as the skewness and kurtosis of a variable. This information is added to the summary row.

sjt.frq(efc$e15relat, 
        show.skew = TRUE, 
        show.kurtosis = TRUE)

relationship to elder
value                        N    raw %   valid %   cumulative %
spouse/partner             171    18.83     18.98          18.98
child                      473    52.09     52.50          71.48
sibling                     29     3.19      3.22          74.69
daughter or son -in-law     85     9.36      9.43          84.13
ancle/aunt                  23     2.53      2.55          86.68
nephew/niece                22     2.42      2.44          89.12
cousin                       6     0.66      0.67          89.79
other, specify              92    10.13     10.21         100.00
missings                     7     0.77
total N=908 · valid N=901 · x̄=2.85 · σ=2.08 · γ=1.55 · ω=1.21

Better overview for continuous or count data

Printing continuous or count data in tables, for instance age variables, results in very long, almost unreadable tables. This is shown in the following example, where zero-categories are skipped (using the skip.zero argument):

sjt.frq(efc$e17age, skip.zero = TRUE)

elder’ age
value        N    raw %   valid %   cumulative %
65          32     3.52      3.59           3.59
66          24     2.64      2.69           6.29
67          29     3.19      3.25           9.54
68          24     2.64      2.69          12.23
69          29     3.19      3.25          15.49
70          32     3.52      3.59          19.08
71          20     2.20      2.24          21.32
72          22     2.42      2.47          23.79
73          34     3.74      3.82          27.61
74          28     3.08      3.14          30.75
75          37     4.07      4.15          34.90
76          37     4.07      4.15          39.06
77          31     3.41      3.48          42.54
78          30     3.30      3.37          45.90
79          46     5.07      5.16          51.07
80          34     3.74      3.82          54.88
81          33     3.63      3.70          58.59
82          46     5.07      5.16          63.75
83          43     4.74      4.83          68.57
84          43     4.74      4.83          73.40
85          24     2.64      2.69          76.09
86          34     3.74      3.82          79.91
87          28     3.08      3.14          83.05
88          19     2.09      2.13          85.19
89          32     3.52      3.59          88.78
90          24     2.64      2.69          91.47
91          20     2.20      2.24          93.71
92          13     1.43      1.46          95.17
93          15     1.65      1.68          96.86
94          12     1.32      1.35          98.20
95           7     0.77      0.79          98.99
96           1     0.11      0.11          99.10
97           5     0.55      0.56          99.66
98           1     0.11      0.11          99.78
99           1     0.11      0.11          99.89
103          1     0.11      0.11         100.00
missings    17     1.87
total N=908 · valid N=891 · x̄=79.12 · σ=8.09

Auto-grouping

The sjt.frq function can automatically group variables with many categories. Use the auto.group argument to specify the number of unique values at which a variable is grouped automatically. Furthermore, the altr.row.col argument, which shades alternating rows, can make long tables more readable. Note that the table now has fewer rows because categories are grouped, and that the summary row in the output shown here (x̄=5.37, σ=2.71) differs from the ungrouped table above (x̄=79.12, σ=8.09): the printed values correspond to the mean and standard deviation of the group indices 1 to 13, not of the original ages.

sjt.frq(efc$e17age, altr.row.col = TRUE, auto.group = 15)

elder’ age
value        N    raw %   valid %   cumulative %
65-67       85     9.36      9.54           9.54
68-70       85     9.36      9.54          19.08
71-73       76     8.37      8.53          27.61
74-76      102    11.23     11.45          39.06
77-79      107    11.78     12.01          51.07
80-82      113    12.44     12.68          63.75
83-85      110    12.11     12.35          76.09
86-88       81     8.92      9.09          85.19
89-91       76     8.37      8.53          93.71
92-94       40     4.41      4.49          98.20
95-97       13     1.43      1.46          99.66
98-100       2     0.22      0.22          99.89
101-103      1     0.11      0.11         100.00
missings    17     1.87
total N=908 · valid N=891 · x̄=5.37 · σ=2.71
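The summary row of the grouped table can be reproduced from the printed counts. The base R sketch below assumes the 13 groups are coded as consecutive indices 1 to 13; this coding is inferred from the printed values, not from the sjPlot source code.

```r
# Counts of the 13 groups in the table above (65-67 up to 101-103)
n <- c(85, 85, 76, 102, 107, 113, 110, 81, 76, 40, 13, 2, 1)

# Assumption: groups are coded 1..13; expand to one index per valid case
group_index <- rep(seq_along(n), times = n)

index_mean <- round(mean(group_index), 2)
index_sd   <- round(sd(group_index), 2)
print(c(valid_n = length(group_index), mean = index_mean, sd = index_sd))
```

This yields 5.37 and 2.71, the values in the summary row above, whereas the ungrouped ages have x̄=79.12 and σ=8.09.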

Note that auto.group = 15 does not produce exactly 15 groups, but at most 15 groups. The group size is calculated by dividing the range of the variable by the auto.group value and rounding up to the next integer. In the above case, the range is 103 - 65 = 38 and 38/15 = 2.53, which means a group size of 3, resulting in 13 groups. See sjmisc::group_var for more details.
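The arithmetic for the example above can be written out in a few lines of base R. This is only a sketch of the calculation described in the text; the way the number of groups is counted here is an assumption that matches the printed table, and the actual grouping logic lives in sjmisc::group_var.

```r
# Smallest and largest observed age in the tables above
x_min <- 65
x_max <- 103
auto_group <- 15

value_range <- x_max - x_min                      # 103 - 65 = 38
group_size  <- ceiling(value_range / auto_group)  # 38 / 15 = 2.53, rounded up

# Assumption: groups of `group_size` consecutive values cover 65..103
n_groups <- ceiling((value_range + 1) / group_size)
print(c(group_size = group_size, n_groups = n_groups))
```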

Show quartiles and median

Median and quartiles are not shown in the table summary. However, you can highlight these statistics directly inside the table. Quartiles are separated with a red line, and the row with the median value is highlighted in red italics.

sjt.frq(efc$e17age, emph.md = TRUE, emph.quart = TRUE)

elder’ age
value        N    raw %   valid %   cumulative %
65          32     3.52      3.59           3.59
66          24     2.64      2.69           6.29
67          29     3.19      3.25           9.54
68          24     2.64      2.69          12.23
69          29     3.19      3.25          15.49
70          32     3.52      3.59          19.08
71          20     2.20      2.24          21.32
72          22     2.42      2.47          23.79
73          34     3.74      3.82          27.61
74          28     3.08      3.14          30.75
75          37     4.07      4.15          34.90
76          37     4.07      4.15          39.06
77          31     3.41      3.48          42.54
78          30     3.30      3.37          45.90
79          46     5.07      5.16          51.07
80          34     3.74      3.82          54.88
81          33     3.63      3.70          58.59
82          46     5.07      5.16          63.75
83          43     4.74      4.83          68.57
84          43     4.74      4.83          73.40
85          24     2.64      2.69          76.09
86          34     3.74      3.82          79.91
87          28     3.08      3.14          83.05
88          19     2.09      2.13          85.19
89          32     3.52      3.59          88.78
90          24     2.64      2.69          91.47
91          20     2.20      2.24          93.71
92          13     1.43      1.46          95.17
93          15     1.65      1.68          96.86
94          12     1.32      1.35          98.20
95           7     0.77      0.79          98.99
96           1     0.11      0.11          99.10
97           5     0.55      0.56          99.66
98           1     0.11      0.11          99.78
99           1     0.11      0.11          99.89
103          1     0.11      0.11         100.00
missings    17     1.87
total N=908 · valid N=891 · x̄=79.12 · σ=8.09
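Because the highlighting is only visible in the rendered HTML table, it can help to check which rows hold the median and the quartiles. The base R sketch below rebuilds the age values from the printed counts and uses quantile() with its default settings; this is an illustration, and the exact rule sjt.frq uses to pick the highlighted rows is not taken from the package source.

```r
# Ages and counts copied from the table above (ages 100 to 102 do not occur)
ages <- c(65:99, 103)
n <- c(32, 24, 29, 24, 29, 32, 20, 22, 34, 28, 37, 37, 31, 30, 46, 34, 33, 46,
       43, 43, 24, 34, 28, 19, 32, 24, 20, 13, 15, 12, 7, 1, 5, 1, 1, 1)

# Expand to one age per valid case (891 values)
x <- rep(ages, times = n)

# Lower quartile, median and upper quartile (default quantile type)
q <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
print(q)  # 73 79 85
```

These values agree with the cumulative % column: the rows for 73, 79 and 85 are the first to reach at least 25, 50 and 75 percent.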

Customizing table appearance using CSS

The table output is in HTML format. The table style (visual appearance) is formatted using Cascading Style Sheets (CSS). If you are somewhat familiar with these topics, you can easily customize the appearance of the table output.

Many table elements (header, row, column, cell, summary row, first row or column…) have CSS-class attributes, which can be used to change the table style. Since each sjt function has different table elements and thus different class attributes, you first need to know which styles can be customized.

The example HTML and CSS code, which is described below, is based on this table output:

how dependent is the elder? - subjective perception of carer
value                        N    raw %   valid %   cumulative %
independent                 66     7.27      7.33           7.33
slightly dependent         225    24.78     24.97          32.30
moderately dependent       306    33.70     33.96          66.26
severely dependent         304    33.48     33.74         100.00
missings                     7     0.77
total N=908 · valid N=901 · x̄=2.94 · σ=0.94

Retrieving customizable styles

Each sjt function invisibly returns several values. The return value page.style contains the style information for the HTML table. You can print this style sheet to the console using the base R cat function:

cat(sjt.frq(efc$e42dep, no.output = TRUE)$page.style)
## <style>
## table { border-collapse:collapse; border:none; }
## .thead { border-top:double; text-align:center; font-style:italic; font-weight:normal; padding-left:0.2cm; padding-right:0.2cm; }
## .tdata { padding:0.2cm; }
## .summary { text-align:right; font-style:italic; font-size:0.9em; padding-top:0.1cm; padding-bottom:0.1cm; }
## .arc { background-color:#eaeaea; }
## .qrow { border-bottom: 1px solid #cc3333; }
## .mdrow { font-weight:bolder; font-style:italic; color:#993333; }
## .abstand { margin-bottom: 2em; }
## .lasttablerow { border-top:1px solid; border-bottom:double; }
## .firsttablerow { border-bottom:1px solid; }
## .leftalign { text-align:left; }
## .centeralign { text-align:center; }
## caption { font-weight: bold; text-align:left; }
## .firsttablecol {  }
## </style>

The HTML code is obtained by using the page.content return value. Since the sjt.frq function can print multiple tables at once, it returns a list of HTML tables as page.content.list (note that most sjt-functions only return a single value for page.content, not a list!). The following code prints the HTML code of the first table to the R console:

cat(sjt.frq(efc$e42dep, no.output = TRUE)$page.content.list[[1]])
## <table>
##    <caption>how dependent is the elder? - subjective perception of carer</caption>
##    <tr>
##      <th class="thead firsttablerow firsttablecol">value</th>
##      <th class="thead firsttablerow">N</th>
##      <th class="thead firsttablerow">raw %</th>
##      <th class="thead firsttablerow">valid %</th>
##      <th class="thead firsttablerow">cumulative %</th>
##    </tr>
## 
##    <tr>
##      <td class="tdata leftalign firsttablecol">independent</td>
##      <td class="tdata centeralign">66</td>
##      <td class="tdata centeralign">7.27</td>
##      <td class="tdata centeralign">7.33</td>
##      <td class="tdata centeralign">7.33</td>
##    </tr>
##  
##    <tr>
##      <td class="tdata leftalign firsttablecol">slightly dependent</td>
##      <td class="tdata centeralign">225</td>
##      <td class="tdata centeralign">24.78</td>
##      <td class="tdata centeralign">24.97</td>
##      <td class="tdata centeralign">32.30</td>
##    </tr>
##  
##    <tr>
##      <td class="tdata leftalign firsttablecol">moderately dependent</td>
##      <td class="tdata centeralign">306</td>
##      <td class="tdata centeralign">33.70</td>
##      <td class="tdata centeralign">33.96</td>
##      <td class="tdata centeralign">66.26</td>
##    </tr>
##  
##    <tr>
##      <td class="tdata leftalign firsttablecol">severely dependent</td>
##      <td class="tdata centeralign">304</td>
##      <td class="tdata centeralign">33.48</td>
##      <td class="tdata centeralign">33.74</td>
##      <td class="tdata centeralign">100.00</td>
##    </tr>
##  
##    <tr>
##      <td class="tdata leftalign lasttablerow firsttablecol">missings</td>
##      <td class="tdata centeralign lasttablerow">7</td>
##      <td class="tdata centeralign lasttablerow">0.77</td>
##      <td class="tdata lasttablerow"></td>
##      <td class="tdata lasttablerow"></td>
##    </tr>
##   <tr>
##     <td class="tdata summary" colspan="5">total N=908 &middot; valid N=901 &middot; x&#772;=2.94 &middot; &sigma;=0.94</td>
##    </tr>
##  </table>

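Now you can see which table elements are associated with which CSS class attributes. If you compare the page.style with the related page.content, you see that not all style classes are used: .arc, .qrow, .mdrow and .abstand are defined in the style sheet but do not appear in the table code above.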
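The comparison between style sheet and table code can also be done programmatically. The base R sketch below uses the class names from the style sheet printed above and an abbreviated selection of rows from the HTML output (the summary cell is shortened); in practice you would pass the full page.content.list[[1]] string instead.

```r
# Class selectors defined in the style sheet shown above
style_classes <- c("thead", "tdata", "summary", "arc", "qrow", "mdrow",
                   "abstand", "lasttablerow", "firsttablerow",
                   "leftalign", "centeralign", "firsttablecol")

# Abbreviated selection of cells from the table code shown above
html <- c('<th class="thead firsttablerow firsttablecol">value</th>',
          '<td class="tdata leftalign firsttablecol">independent</td>',
          '<td class="tdata centeralign">66</td>',
          '<td class="tdata centeralign lasttablerow">7</td>',
          '<td class="tdata summary" colspan="5">total N=908</td>')

# Extract every class="..." attribute and split its content on spaces
attrs <- unlist(regmatches(html, gregexpr('class="[^"]*"', html)))
used_classes <- unique(unlist(strsplit(gsub('class="|"', "", attrs), " ")))

# Classes that are styled but never referenced in the table code
unused <- setdiff(style_classes, used_classes)
print(unused)
```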

Customizing table output with the CSS argument

You can customize the table output with the CSS argument. This argument requires a list of attributes, which follow a certain pattern:

  1. each attribute needs a css. prefix
  2. followed by the class name (e.g. caption, thead, centeralign, arc etc.)
  3. an equal sign
  4. the CSS format (in (single) quotation marks)
  5. the CSS format must end with a semicolon (;)
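The pattern above can be checked before a list is passed to the CSS argument. The helper check_css below is hypothetical and not part of sjPlot; it is a minimal base R sketch that only tests the css. prefix and the trailing character of each entry.

```r
# Hypothetical helper (not part of sjPlot): tests each list entry against
# two of the rules above, the "css." prefix and the trailing semicolon
check_css <- function(css) {
  has_prefix     <- startsWith(names(css), "css.")
  ends_semicolon <- endsWith(trimws(unlist(css)), ";")
  ok <- has_prefix & ends_semicolon
  names(ok) <- names(css)
  ok
}

css <- list(css.caption = 'font-weight: normal;',
            css.summary = 'color: blue',          # no trailing semicolon
            thead       = 'font-style: normal;')  # no css. prefix
result <- check_css(css)
print(result)
```

Only the first entry passes both checks; the other two would need to be corrected before use.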

Example:

sjt.frq(efc$e42dep, 
        CSS = list(css.centeralign = 'text-align: left;', 
                   css.caption = 'font-weight: normal; font-style: italic;', 
                   css.firsttablecol = 'font-weight: bold;', 
                   css.lasttablerow = 'border-top: 1px solid; border-bottom: none;', 
                   css.summary = 'color: blue;'))

how dependent is the elder? - subjective perception of carer
value                        N    raw %   valid %   cumulative %
independent                 66     7.27      7.33           7.33
slightly dependent         225    24.78     24.97          32.30
moderately dependent       306    33.70     33.96          66.26
severely dependent         304    33.48     33.74         100.00
missings                     7     0.77
total N=908 · valid N=901 · x̄=2.94 · σ=0.94

In the above example, the summary-table row lost the original style and just became blue. If you want to keep the original style and just add additional style information, use the plus-sign (+) as initial character for the argument attributes. In the following example, the summary row keeps its original style and is additionally printed in blue:

sjt.frq(efc$e42dep, CSS = list(css.summary = '+color: blue;'))

how dependent is the elder? - subjective perception of carer
value                        N    raw %   valid %   cumulative %
independent                 66     7.27      7.33           7.33
slightly dependent         225    24.78     24.97          32.30
moderately dependent       306    33.70     33.96          66.26
severely dependent         304    33.48     33.74         100.00
missings                     7     0.77
total N=908 · valid N=901 · x̄=2.94 · σ=0.94
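The difference between replacing and extending a style can be sketched in base R. The function resolve_style below is hypothetical and only mimics the behaviour described above; it is not sjPlot's actual implementation. The default string is the .summary rule copied from the style sheet shown earlier.

```r
# Default style of the .summary class, copied from the style sheet above
default_summary <- paste("text-align:right; font-style:italic;",
                         "font-size:0.9em; padding-top:0.1cm;",
                         "padding-bottom:0.1cm;")

# Hypothetical helper (not sjPlot code): a leading "+" appends to the
# default style, anything else replaces it
resolve_style <- function(default, user) {
  if (startsWith(user, "+")) {
    paste(default, substring(user, 2))
  } else {
    user
  }
}

replaced <- resolve_style(default_summary, "color: blue;")
extended <- resolve_style(default_summary, "+color: blue;")
print(replaced)
print(extended)
```

Without the plus sign only the blue colour remains; with it, the right alignment, italics and padding of the default style are kept and the colour is added at the end.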