Data can be usefully organized into tables with “cases” and “variables.” In “tidy data,” every case is the same sort of thing, e.g. a person, a car, a year, a country in a year. We talked about data tables, cases, variables, etc. a few weeks ago.
Data graphics can be constructed easily when each case corresponds to a “glyph” (mark) on the graph, and each variable to a graphical attribute of that glyph such as x- or y-position, color, size, length, shape, etc. Such data are called “glyph-ready.” (The same is true for more technical presentations of data, e.g., models, predictions, etc.: once the data are set up with appropriate cases and variables, the presentation is straightforward.)
Next Week: When data are not yet in glyph-ready form, you can transform them into glyph-ready form using one or more of a small set of basic operations (called “data verbs”) on your data table(s).
In its original sense, in archeology, a glyph is a carved symbol.
Hieroglyph
Mayan glyph
The features of a data glyph encode the values of variables.
Aesthetics are visual properties of a glyph.
## Warning: Using size for a discrete variable is not advised.
sex -> color
color is black
A scale is the relationship between a variable's value and the value of the aesthetic that the variable is mapped to.
The conversion from SBP to position is a scale.
The conversion from Smoker to color is a scale.
Graphics are designed by the human expert (you!) in order to reveal information that’s latent in the data.
More details, …, e.g. setting of aesthetics to constants
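To make the difference between mapping an aesthetic and setting it to a constant concrete, here is a minimal sketch. It assumes the ggplot2 package is installed; the table name `bp` is hypothetical, and the four cases are typed in by hand from the blood-pressure example used in these notes.

```r
library(ggplot2)

# Four cases typed in by hand from the blood-pressure example.
# The table name `bp` is a hypothetical choice, not from the lecture.
bp <- data.frame(
  sbp = c(129, 105, 122, 128),
  dbp = c(75, 62, 72, 83),
  sex = c("male", "female", "male", "female")
)

# Mapping: color sits inside aes(), so it varies with the variable sex
# and ggplot2 draws a legend for it.
mapped <- ggplot(bp, aes(x = sbp, y = dbp)) +
  geom_point(aes(color = sex))

# Setting: color sits outside aes(), so every glyph is black
# and no legend is drawn.
constant <- ggplot(bp, aes(x = sbp, y = dbp)) +
  geom_point(color = "black")

mapped
constant
```

In the first plot, color carries information about each case. In the second, color is only decoration, so it tells the viewer nothing about the data.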
Remember …
Graphics are designed by the human expert (you!) in order to reveal information that’s latent in the data.
Your choices depend on what information you want to reveal and convey.
Learn by reading graphics and determining which ways of arranging things are better or worse.
A basic principle is that a graphic is about comparison. Good graphics make it easy for people to perceive things that are similar and things that are different. Good graphics put the things to be compared “side-by-side”, that is, in perceptual proximity to one another.
In roughly descending order of human ability to compare nearby objects:
Color is the most difficult…
Glyph-ready data has this form:
Glyph-ready data
##   sbp dbp    sex  smoker
## 1 129  75   male   never
## 2 105  62 female   never
## 3 122  72   male   never
## 4 128  83 female  former
## 5 123  90   male  former
## 6 122  77   male current
Mapping of data to aesthetics
sbp -> x position
dbp -> y position
smoker -> color
sex -> shape
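The mapping listed above can be written out directly. This is a sketch that assumes ggplot2 is installed; the table name `bp` is hypothetical, and the six cases are copied by hand from the glyph-ready table shown earlier. It is one possible way to draw the graph, not necessarily the code behind the lecture's figure.

```r
library(ggplot2)

# The six glyph-ready cases from the table, typed in by hand.
# The name `bp` is an assumption; the lecture's table name is not shown.
bp <- data.frame(
  sbp    = c(129, 105, 122, 128, 123, 122),
  dbp    = c(75, 62, 72, 83, 90, 77),
  sex    = c("male", "female", "male", "female", "male", "male"),
  smoker = c("never", "never", "never", "former", "former", "current")
)

# One point glyph per case; each variable is mapped to one aesthetic:
#   sbp -> x position, dbp -> y position, smoker -> color, sex -> shape
p <- ggplot(bp, aes(x = sbp, y = dbp, color = smoker, shape = sex)) +
  geom_point(size = 3)  # size is set to a constant, not mapped

p
```

Because the table is glyph-ready, no reshaping is needed: each row becomes exactly one point.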
Scales determine the details of the translation from
variable -> aesthetic
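As an illustration of what "details" means here, the sketch below overrides two default scales. It assumes ggplot2 is installed; the table name `bp`, the axis limits, and the particular colors are all hypothetical choices made for this example.

```r
library(ggplot2)

# Six cases typed in by hand from the blood-pressure example.
# `bp` is a hypothetical table name.
bp <- data.frame(
  sbp    = c(129, 105, 122, 128, 123, 122),
  dbp    = c(75, 62, 72, 83, 90, 77),
  smoker = c("never", "never", "never", "former", "former", "current")
)

p <- ggplot(bp, aes(x = sbp, y = dbp, color = smoker)) +
  geom_point(size = 3) +
  # Scale for x: which range of sbp values spans the horizontal axis.
  scale_x_continuous(limits = c(100, 140)) +
  # Scale for color: which color stands for each level of smoker.
  # These colors are illustrative, not the lecture's.
  scale_color_manual(
    values = c(never = "darkgreen", former = "orange", current = "red")
  )

p
```

The mapping (smoker -> color) stays the same; only the scale, i.e. which color each level gets, has changed.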
Each layer may have its own data, glyphs, aesthetic mapping, etc.
##   sbp dbp    sex smoker
## 1 129  75   male  never
## 2 105  62 female  never
## 3 122  72   male  never
## 4 128  83 female former
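A small sketch of a two-layer graphic, assuming ggplot2 is installed. The names `bp` and `centre` are hypothetical, and the second layer (a cross at the mean of the four cases) is invented here purely to show a layer with its own data and its own glyph.

```r
library(ggplot2)

# The four cases shown above, typed in by hand (`bp` is a hypothetical name).
bp <- data.frame(
  sbp    = c(129, 105, 122, 128),
  dbp    = c(75, 62, 72, 83),
  sex    = c("male", "female", "male", "female"),
  smoker = c("never", "never", "never", "former")
)

# A second, one-row table for the second layer: the mean of each variable.
centre <- data.frame(sbp = mean(bp$sbp), dbp = mean(bp$dbp))

p <- ggplot(bp, aes(x = sbp, y = dbp)) +
  # Layer 1: one point per case, with smoker mapped to color.
  geom_point(aes(color = smoker), size = 3) +
  # Layer 2: its own data (centre) and its own glyph (a cross).
  # It reuses the x and y mappings but has no color mapping.
  geom_point(data = centre, shape = 4, size = 5)

p
```

Both layers share the same x and y scales, which is what lets the viewer compare the cross with the individual points.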
As a group, you’re going to practice reproducing graphs using the interactive R functions introduced in the reading:
scatterGraphHelper()
barGraphHelper()
distributionGraphHelper()
WorldMap()
USMap()
Begin a new RMarkdown file using one of our class templates:
RStudio >> File >> New File >> R Markdown >> From Template >> DataComputing Simple
The graphs to reproduce for this exercise are:
As you work through each task, mirror your work to the front screen.
When finished, each group should upload an HTML document to Canvas that shows:
If you finish early, practice recognizing graphics features in some non-standard graphics (Handout).
All homework, activities, etc. from now on must be submitted to Canvas as HTML files with embedded .Rmd (i.e., use the class template) unless otherwise stated.