Small Group Discussion:

What were the muddiest points from these chapters?
- Ch 5: Introduction to Data Graphics
- Ch 6: Frames, Glyphs, and other Components of Graphics
What do you think about the interactive graphics commands?
- pros?
- cons?

Three Important Concepts for Today

Data can be usefully organized into tables with “cases” and “variables.” In “tidy data,” every case is the same sort of thing, e.g. a person, a car, a year, a country in a year. We talked about data tables, cases, variables, etc. a few weeks ago.
Data graphics can be constructed easily when each case corresponds to a “glyph” (mark) on the graph, and each variable to a graphical attribute of that glyph such as x- or y-position, color, size, length, shape, etc. Such data is called “glyph-ready.” (The same is true for more technical presentations of data, e.g., models, predictions, etc. — once the data are set up with appropriate cases and variables, the presentation is straightforward.)
Next Week: When data are not yet in glyph-ready form, you can transform them into glyph-ready form using one or more of a small set of basic operations (called “data verbs”) on your data table(s).

Today’s Agenda

Explore some basic graphical choices:
- The format of graphs: major categories of glyph: points, bars, shapes.
- The nomenclature for different parts of graphics: frame, scale, guide.
- Effective mappings from variables to graphical attributes.
Exercise: Practice recognizing some of these features using non-standard graphics
Activity: Practice reproducing graphs using interactive R functions

Glyphs and Data

In its original sense, in archeology, a glyph is a carved symbol.

Hieroglyph

Mayan glyph

Data Glyph

A data glyph is also a mark, e.g.

The features of a data glyph encode the values of variables.

Some are very simple, e.g. a dot:
Some combine different elements, e.g. a pointrange:
Some are complicated, e.g. a dotplot:

See: http://docs.ggplot2.org/current/

Data Glyph Properties: Aesthetics

Aesthetics are visual properties of a glyph.

Aesthetics for points: location (x and y), shape, color, size, transparency

## Warning: Using size for a discrete variable is not advised.

Each glyph has its own set of aesthetics.
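As a rough sketch of the point aesthetics listed above, the following uses ggplot2 with R's built-in `mtcars` data. The choice of dataset and of which variable goes to which aesthetic is an assumption made for illustration only; it is not taken from the slides.

```r
library(ggplot2)

# Illustrative only: mtcars ships with R; the variable choices are assumptions.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +   # location: x and y
  geom_point(
    aes(color = hp,                            # color mapped to a numeric variable
        shape = factor(cyl),                   # shape mapped to a categorical variable
        size  = qsec),                         # size mapped to a numeric variable
    alpha = 0.6                                # transparency set to a constant
  )

print(p)
```

Mapping size to a categorical variable instead (for example `size = factor(cyl)`) is what produces the warning shown above.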

Why “Aesthetic”?

Some Graphics Components

glyph: The basic graphical unit that represents one case. Other terms used include mark and symbol.
aesthetic: a visual property of a glyph such as position, size, shape, color, etc.

may be mapped based on data values: sex -> color
may be set to particular non-data related values: color is black

scale: A mapping that translates data values into aesthetics.

example: male -> blue; female -> pink

frame: The position scale describing how data are mapped to x and y
guide: An indication for the human viewer of the scale. This allows the viewer to translate aesthetics back into data values.

Examples: x- and y-axes, various sorts of legends
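The difference between mapping and setting an aesthetic can be sketched in ggplot2 as follows. The data frame name `bp` is hypothetical; its values are the six rows of the blood pressure table used later in these notes.

```r
library(ggplot2)

# Hypothetical name `bp`; values copied from the glyph-ready table in these notes.
bp <- data.frame(
  sbp    = c(129, 105, 122, 128, 123, 122),
  dbp    = c(75, 62, 72, 83, 90, 77),
  sex    = c("male", "female", "male", "female", "male", "male"),
  smoker = c("never", "never", "never", "former", "former", "current")
)

# Mapped: color depends on a variable (inside aes()), so a scale translates
# sex -> color and a legend (the guide) is drawn for the viewer.
mapped <- ggplot(bp, aes(x = sbp, y = dbp)) +
  geom_point(aes(color = sex)) +
  scale_color_manual(values = c(male = "blue", female = "pink"))

# Set: color is a constant (outside aes()), so there is no color scale or legend.
fixed <- ggplot(bp, aes(x = sbp, y = dbp)) +
  geom_point(color = "black")

print(mapped)
print(fixed)
```

In both plots the frame is the same (sbp on x, dbp on y), and the axes act as guides for the position scales.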

Scales

The relationship between the variable value and the value of the aesthetic the variable is mapped to.

Systolic Blood Pressure (SBP) has units of mmHg (millimeters of mercury)
Position on the x-axis measured in distance, e.g. inches.

The conversion from SBP to position is a scale.

Smoker is “never”, “former”, “current”
Color is red, green, blue, …

The conversion from Smoker to color is a scale.
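To make the idea concrete, here is a minimal base R sketch of the two kinds of scale. The helper `position_scale()`, the 100 to 140 mmHg limits, and the 4-inch axis width are all assumptions for illustration; plotting software normally builds these scales for you.

```r
# Continuous position scale: a linear map from a data range to a range on the page.
# `position_scale` is a hypothetical helper, not a function from any package.
position_scale <- function(value, data_range, plot_range) {
  frac <- (value - data_range[1]) / (data_range[2] - data_range[1])
  plot_range[1] + frac * (plot_range[2] - plot_range[1])
}

# SBP values (mmHg) placed along an axis assumed to be 4 inches wide.
x_inches <- position_scale(c(105, 122, 129),
                           data_range = c(100, 140),
                           plot_range = c(0, 4))

# Discrete color scale: a lookup table from category to color.
smoker_colors <- c(never = "red", former = "green", current = "blue")
point_colors  <- unname(smoker_colors[c("never", "former", "current", "never")])

print(x_inches)
print(point_colors)
```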

Guides

Guide: an indication to a human viewer of what the scale is.

Axis ticks and numbers

Legends


Labels on faceted graphics

Designing Graphics

Graphics are designed by the human expert (you!) in order to reveal information that’s latent in the data.

Design choices

What kind of glyph, e.g. scatter, density, bar, … many others
What variables constitute the frame. And some details:
- axis limits
- logarithmic axes, etc.
What variables should be mapped to other aesthetics of the glyph.
Whether to facet and with what variable.

More details, …, e.g. setting of aesthetics to constants
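Each of the design choices above corresponds to a piece of a ggplot2 specification. The sketch below uses the built-in `mtcars` data; the particular variables, axis limits, and faceting variable are assumptions chosen only to show where each choice goes.

```r
library(ggplot2)

# Illustrative only: mtcars ships with R; every variable choice here is an assumption.
p <- ggplot(mtcars, aes(x = hp, y = mpg)) +   # frame: hp on x, mpg on y
  geom_point(aes(color = wt), size = 2) +     # glyph: points; wt mapped to color; size set to a constant
  scale_x_log10() +                           # logarithmic x-axis
  coord_cartesian(ylim = c(10, 35)) +         # y-axis limits
  facet_wrap(~ cyl)                           # facet by number of cylinders

print(p)
```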

Good and Bad Graphics

Remember …

Graphics are designed by the human expert (you!) in order to reveal information that’s latent in the data.

Your choices depend on what information you want to reveal and convey.

Learn by reading graphics and determining which ways of arranging things are better or worse.

A basic principle is that a graphic is about comparison. Good graphics make it easy for people to perceive things that are similar and things that are different. Good graphics put the things to be compared “side-by-side”, that is, in perceptual proximity to one another.

Perception and Comparison

In roughly descending order of human ability to compare nearby objects:

Position
Length
Area
Angle
Shape (but only a very few different shapes)
Color

Color is the most difficult…

color gradients: we’re good at comparing these
discrete colors: must be carefully selected
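As one possible illustration of the two cases, ggplot2 lets you choose a gradient for a numeric variable and a palette for a categorical one. The data (`mtcars`), the gradient endpoints, and the "Dark2" ColorBrewer palette are assumptions; other choices may suit your data better.

```r
library(ggplot2)

# Numeric variable -> color gradient (endpoints chosen arbitrarily for illustration).
gradient <- ggplot(mtcars, aes(x = wt, y = mpg, color = hp)) +
  geom_point() +
  scale_color_gradient(low = "lightblue", high = "darkblue")

# Categorical variable -> a small, deliberately chosen set of distinct colors.
discrete <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  scale_color_brewer(palette = "Dark2")

print(gradient)
print(discrete)
```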

Count the ways this graphic is bad

## Warning: Using size for a discrete variable is not advised.

Glyph-Ready Data

Glyph-ready data has this form:

There is one row for each glyph to be drawn.
The variables in that row are mapped to aesthetics of the glyph (including position)

Glyph-ready data

##   sbp dbp    sex  smoker
## 1 129  75   male   never
## 2 105  62 female   never
## 3 122  72   male   never
## 4 128  83 female  former
## 5 123  90   male  former
## 6 122  77   male current

Mapping of data to aesthetics

   sbp -> x position      
   dbp -> y position    
smoker -> color
   sex -> shape

Scales determine details of translation from

variable -> aesthetic
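The table and mapping above translate almost line for line into ggplot2. This is a sketch rather than the code behind the original slide; the data frame name `bp` and the point size are assumptions.

```r
library(ggplot2)

# Hypothetical name `bp`; one row per glyph, values copied from the table above.
bp <- data.frame(
  sbp    = c(129, 105, 122, 128, 123, 122),
  dbp    = c(75, 62, 72, 83, 90, 77),
  sex    = c("male", "female", "male", "female", "male", "male"),
  smoker = c("never", "never", "never", "former", "former", "current")
)

p <- ggplot(bp, aes(x = sbp,          # sbp    -> x position
                    y = dbp,          # dbp    -> y position
                    color = smoker,   # smoker -> color
                    shape = sex)) +   # sex    -> shape
  geom_point(size = 3)                # size set to a constant (assumed value)

print(p)
```

Because the data are glyph-ready, the plot draws exactly six points, one per row, and default scales fill in the details of each translation.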

Layers – building up complex plots

Each layer may have its own data, glyphs, aesthetic mapping, etc.

one layer has points
another layer has the curves
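A minimal sketch of a two-layer plot in ggplot2, using the built-in `mtcars` data. The loess smoother stands in for "the curves"; the slide's actual plot and data are not reproduced here, so both are assumptions.

```r
library(ggplot2)

# Illustrative only: layers are added with `+`, and each can have its own
# glyph type and aesthetic mapping.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(color = factor(cyl))) +                     # layer 1: points, colored by cyl
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE) # layer 2: a smooth curve

print(p)
```

The color mapping is given only in the point layer, so the curve layer draws a single curve for all the data.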

Stats: Data Transformations

What are the glyphs, aesthetics, etc. for this plot?
How is the data for this plot related to the “raw” data?

##   sbp dbp    sex smoker
## 1 129  75   male  never
## 2 105  62 female  never
## 3 122  72   male  never
## 4 128  83 female former
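The plot for this slide is not reproduced here. As one assumed example of a stat, suppose it were a bar chart of the number of cases in each smoker category. The raw rows would then first be summarized into a table with one row per bar, which can be sketched in base R (the names `bp` and `counts` are hypothetical):

```r
# Hypothetical name `bp`; the four raw rows shown above.
bp <- data.frame(
  sbp    = c(129, 105, 122, 128),
  dbp    = c(75, 62, 72, 83),
  sex    = c("male", "female", "male", "female"),
  smoker = c("never", "never", "never", "former")
)

# The "stat": count the cases in each smoker category.
# The result is glyph-ready for bars: one row per bar, with Freq -> bar height.
counts <- as.data.frame(table(smoker = bp$smoker))

print(counts)
```

In ggplot2, `geom_bar()` does this counting internally, so the glyphs (bars) correspond to rows of the summarized table rather than rows of the raw data.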

Activity:

As a group, you’re going to practice reproducing graphs using the interactive R functions introduced in the reading:

scatterGraphHelper()
barGraphHelper()
distributionGraphHelper()
WorldMap()
USMap()

Begin a new RMarkdown file using one of our class templates:

RStudio >> File >> New File >> R Markdown >> From Template >> DataComputing Simple

The graphs to reproduce for this exercise are:

Problem 5.3
Problem 5.4

As you work through each task, mirror your work to the front screen.

When finished, each group should upload an HTML document to Canvas that shows:

the names of each person in the group
the plots reproduced from problems 5.3 & 5.4 AND the R code (i.e. code chunks in RMarkdown) to produce each plot
- Use the interactive R command in the Console to produce the plot
- click “show expression”
- copy the expression from the Console into a code chunk in your .Rmd file
- when you Knit the .Rmd file, the code and plot will appear together

Exercise:

If you finish early, practice recognizing some graphics features in non-standard graphics (Handout)

Homework

All homework, activities, etc. from now on must be submitted to Canvas as HTML files with embedded .Rmd (i.e., use a class template) unless otherwise stated.

Turn in Graph Replication Activity (HTML to Canvas)
DC Ch 5 & 6 Exercises (HTML to Canvas): 5.1, 5.2, 6.5, 6.6, 6.7, 6.8
DC chapter 7 reading quiz on Canvas


Week 4 Class Notes