Small Group Discussion:

Three Important Concepts for Today

Today’s Agenda

  1. Explore some basic graphical choices:
    • The format of graphs: major categories of glyph: points, bars, shapes.
    • The nomenclature for different parts of graphics: frame, scale, guide.
    • Effective mappings from variables to graphical attributes.
  2. Exercise: Practice recognizing some of these features using non-standard graphics
  3. Activity: Practice reproducing graphs using interactive R functions

Glyphs and Data

In its original sense, in archeology, a glyph is a carved symbol.

Hieroglyph

Mayan glyph

Data Glyph

A data glyph is also a mark, e.g.

The features of a data glyph encode the values of variables.

  • Some are very simple, e.g. a dot:
  • Some combine different elements, e.g. a pointrange:
  • Some are complicated, e.g. a dotplot:
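
The three glyph types listed above can be sketched in ggplot2 as follows. This is a minimal illustration, not the code behind the original slides: the data frame name `bp` and the summary table `by_sex` are hypothetical, and the values are the six example rows shown later in these notes.

```r
library(ggplot2)

# Hypothetical name `bp`; the six rows are the glyph-ready example data from these notes.
bp <- data.frame(
  sbp    = c(129, 105, 122, 128, 123, 122),
  dbp    = c(75, 62, 72, 83, 90, 77),
  sex    = c("male", "female", "male", "female", "male", "male"),
  smoker = c("never", "never", "never", "former", "former", "current")
)

# A simple glyph: one dot per case.
p_dot <- ggplot(bp, aes(x = sbp, y = dbp)) + geom_point()

# A combined glyph: a point plus a line segment (mean with min-to-max range of sbp per sex).
mid <- tapply(bp$sbp, bp$sex, mean)
by_sex <- data.frame(
  sex = names(mid),
  mid = as.vector(mid),
  lo  = as.vector(tapply(bp$sbp, bp$sex, min)),
  hi  = as.vector(tapply(bp$sbp, bp$sex, max))
)
p_range <- ggplot(by_sex, aes(x = sex, y = mid, ymin = lo, ymax = hi)) +
  geom_pointrange()

# A more complicated glyph: dots stacked into bins along one axis.
p_dotplot <- ggplot(bp, aes(x = sbp)) + geom_dotplot(binwidth = 5)
```

Printing any of the three objects (for example `p_dot`) draws the plot.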

See: http://docs.ggplot2.org/current/

Data Glyph Properties: Aesthetics

Aesthetics are visual properties of a glyph.

## Warning: Using size for a discrete variable is not advised.

Why “Aesthetic”?

Some Graphics Components

glyph
The basic graphical unit that represents one case. Other terms used include mark and symbol.
aesthetic
A visual property of a glyph such as position, size, shape, color, etc.
scale
A mapping that translates data values into aesthetics.
frame
The position scale describing how data are mapped to x and y.
guide
An indication for the human viewer of the scale. This allows the viewer to translate aesthetics back into data values.

Scales

A scale is the relationship between the value of a variable and the value of the aesthetic that the variable is mapped to.

The conversion from SBP to position is a scale.

The conversion from Smoker to color is a scale.
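
As a rough sketch of the two scales just described, the ggplot2 code below sets a position scale for systolic pressure and a color scale for smoking status explicitly. The data frame name `bp`, the axis limits, and the particular colors are assumptions chosen for illustration; the column names follow the lowercase `sbp` and `smoker` used in the example data in these notes.

```r
library(ggplot2)

# Hypothetical name `bp`; rows are the example data from these notes.
bp <- data.frame(
  sbp    = c(129, 105, 122, 128, 123, 122),
  dbp    = c(75, 62, 72, 83, 90, 77),
  sex    = c("male", "female", "male", "female", "male", "male"),
  smoker = c("never", "never", "never", "former", "former", "current")
)

# Illustrative color choices, one per level of smoker.
smoker_colors <- c(never = "steelblue", former = "orange", current = "firebrick")

p_scales <- ggplot(bp, aes(x = sbp, y = dbp, color = smoker)) +
  geom_point() +
  scale_x_continuous(limits = c(100, 140)) +   # sbp value -> horizontal position
  scale_color_manual(values = smoker_colors)   # smoker level -> color
```

If no scale is given, ggplot2 picks default scales; stating them explicitly only makes the translation visible.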

Guides

Guide: an indication to a human viewer of what the scale is.

  • Axis ticks and numbers
  • Legends

  • Labels on faceted graphics
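
The guides listed above can be adjusted in ggplot2 roughly as shown here. This is a sketch under assumptions: the data frame name `bp`, the tick positions, and the label wording are illustrative and not taken from the original slides.

```r
library(ggplot2)

# Hypothetical name `bp`; rows are the example data from these notes.
bp <- data.frame(
  sbp    = c(129, 105, 122, 128, 123, 122),
  dbp    = c(75, 62, 72, 83, 90, 77),
  sex    = c("male", "female", "male", "female", "male", "male"),
  smoker = c("never", "never", "never", "former", "former", "current")
)

p_guides <- ggplot(bp, aes(x = sbp, y = dbp, color = smoker)) +
  geom_point() +
  scale_x_continuous(breaks = seq(100, 130, by = 10)) +        # axis ticks and numbers
  labs(x = "sbp", y = "dbp") +                                 # axis titles
  guides(color = guide_legend(title = "Smoking status")) +     # legend for the color scale
  facet_wrap(~ sex)                                            # strip labels on facets
```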

Facets – using x and y twice
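
One way to read "using x and y twice" is that faceting lays out small panels along rows and columns, so position is used once for the panel and once within it. A minimal sketch with `facet_grid()`, assuming the hypothetical data frame `bp` built from the example rows in these notes:

```r
library(ggplot2)

# Hypothetical name `bp`; rows are the example data from these notes.
bp <- data.frame(
  sbp    = c(129, 105, 122, 128, 123, 122),
  dbp    = c(75, 62, 72, 83, 90, 77),
  sex    = c("male", "female", "male", "female", "male", "male"),
  smoker = c("never", "never", "never", "former", "former", "current")
)

# Rows of panels are split by sex, columns by smoker;
# inside each panel, x and y still show sbp and dbp.
p_facets <- ggplot(bp, aes(x = sbp, y = dbp)) +
  geom_point() +
  facet_grid(sex ~ smoker)
```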

Designing Graphics

Graphics are designed by the human expert (you!) in order to reveal information that’s latent in the data.

Design choices

  • What kind of glyph, e.g. scatter, density, bar, … many others
  • What variables constitute the frame, along with some details:
    • axis limits
    • logarithmic axes, etc.
  • What variables should be mapped to other aesthetics of the glyph.
  • Whether to facet and with what variable.

More details, …, e.g. setting of aesthetics to constants
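
Several of these design choices correspond to small pieces of ggplot2 code. The sketch below is illustrative only: the data frame name `bp`, the axis limits, and the constant color and size are assumptions, not settings from the original slides.

```r
library(ggplot2)

# Hypothetical name `bp`; rows are the example data from these notes.
bp <- data.frame(
  sbp    = c(129, 105, 122, 128, 123, 122),
  dbp    = c(75, 62, 72, 83, 90, 77),
  sex    = c("male", "female", "male", "female", "male", "male"),
  smoker = c("never", "never", "never", "former", "former", "current")
)

p_design <- ggplot(bp, aes(x = sbp, y = dbp)) +   # frame: sbp and dbp
  geom_point(color = "navy", size = 3) +          # aesthetics set to constants, outside aes()
  scale_y_log10() +                               # logarithmic axis
  coord_cartesian(xlim = c(100, 135))             # axis limits
```

An aesthetic placed inside `aes()` is mapped from a variable; the same aesthetic given as a plain argument to the geom is a constant applied to every glyph.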

Good and Bad Graphics

Remember …

Graphics are designed by the human expert (you!) in order to reveal information that’s latent in the data.

Your choices depend on what information you want to reveal and convey.

Learn by reading graphics and determining which ways of arranging things are better or worse.

A basic principle is that a graphic is about comparison. Good graphics make it easy for people to perceive things that are similar and things that are different. Good graphics put the things to be compared “side-by-side”, that is, in perceptual proximity to one another.

Perception and Comparison

In roughly descending order of human ability to compare nearby objects:

  1. Position
  2. Length
  3. Area
  4. Angle
  5. Shape (but only a very few different shapes)
  6. Color

Color is the most difficult…

Count the ways this graphic is bad

## Warning: Using size for a discrete variable is not advised.

Glyph-Ready Data

Glyph-ready data has this form:

Glyph-ready data

##   sbp dbp    sex  smoker
## 1 129  75   male   never
## 2 105  62 female   never
## 3 122  72   male   never
## 4 128  83 female  former
## 5 123  90   male  former
## 6 122  77   male current

Mapping of data to aesthetics

   sbp -> x position      
   dbp -> y position    
smoker -> color
   sex -> shape

Scales determine details of translation from

variable -> aesthetic
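
The mapping listed above translates almost directly into a ggplot2 `aes()` call. This is a minimal sketch: the data frame name `bp` is hypothetical, and the rows are the glyph-ready example data shown above.

```r
library(ggplot2)

# Hypothetical name `bp`; one row per case, one column per variable.
bp <- data.frame(
  sbp    = c(129, 105, 122, 128, 123, 122),
  dbp    = c(75, 62, 72, 83, 90, 77),
  sex    = c("male", "female", "male", "female", "male", "male"),
  smoker = c("never", "never", "never", "former", "former", "current")
)

# Each argument of aes() pairs one aesthetic with one variable.
p_map <- ggplot(bp, aes(x = sbp, y = dbp, color = smoker, shape = sex)) +
  geom_point()
```

Each row of `bp` becomes one point, and ggplot2 chooses default scales for position, color, and shape unless others are specified.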

Layers – building up complex plots

Each layer may have its own data, glyphs, aesthetic mapping, etc.
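
A small sketch of two layers with different data and glyphs follows. The names `bp` and `centers` are hypothetical; the second layer plots per-group means computed from the example rows, which is one possible use of layering and not necessarily the one shown in the original slides.

```r
library(ggplot2)

# Hypothetical name `bp`; rows are the example data from these notes.
bp <- data.frame(
  sbp    = c(129, 105, 122, 128, 123, 122),
  dbp    = c(75, 62, 72, 83, 90, 77),
  sex    = c("male", "female", "male", "female", "male", "male"),
  smoker = c("never", "never", "never", "former", "former", "current")
)

# Second data table: mean sbp and dbp for each smoker group.
centers <- aggregate(cbind(sbp, dbp) ~ smoker, data = bp, FUN = mean)

p_layers <- ggplot(bp, aes(x = sbp, y = dbp, color = smoker)) +
  geom_point() +                                   # layer 1: one dot per case
  geom_point(data = centers, shape = 4, size = 5)  # layer 2: own data, own glyph settings
```

The second layer inherits the aesthetic mapping from `ggplot()` but replaces the data, so `centers` needs the same column names.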

Stats: Data Transformations

##   sbp dbp    sex smoker
## 1 129  75   male  never
## 2 105  62 female  never
## 3 122  72   male  never
## 4 128  83 female former

Activity:

As a group, you’re going to practice reproducing graphs using the interactive R functions introduced in the reading:

Begin a new RMarkdown file using one of our class templates:

RStudio >> File >> New File >> R Markdown >> From Template >> DataComputing Simple

The graphs to reproduce for this exercise are:

As you work through each task, mirror your work to the front screen.

When finished, each group should upload an HTML document to Canvas that shows:

Exercise:

If you finish early, practice recognizing some graphics features in non-standard graphics (Handout)

Homework

All homework, activities, etc. from now on must be submitted to Canvas as HTML files with embedded .Rmd (i.e., use the class template) unless otherwise stated.

