Announcements

  • grading
    • Piazza posted (will update again)
    • GitHub posted (split 1st & 2nd halves of term)
  • midterm exam
    • planning to return them & discuss on Wed (3/13)
    • Extra Credit survey (to boost avg slightly)
  • MDSR Ch 6 (ethics): no programming assigned
  • MDSR Ch 6 exercises: no code, but use GitHub, make commits, & submit an R Notebook as usual
  • MDSR Ch 10 programming notebook

MDSR Ch 10 Errata / Tips

  • Some sections don’t require programming, but please still include the headers for navigation purposes
  • pp. 234-235: Rename the result for 20,000 simulations as sim_results20k
    • the authors reuse the object name sim_results for the example with 20,000 simulations, which overwrites the object that the plot at the top of p. 235 expects
    • renaming the object resulting from the 20,000 simulations lets the ggplot call still access the intended object
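
A minimal sketch of the renaming tip, under stated assumptions: run_sim() and the simulation inside it are hypothetical stand-ins, not the book's code. Only the object names sim_results and sim_results20k come from the tip above.

```r
# Hypothetical stand-in for the book's simulation (an assumption, not the
# MDSR example): returns one row holding the mean of n_sims uniform draws.
run_sim <- function(n_sims) {
  data.frame(num_sims = n_sims, estimate = mean(runif(n_sims)))
}

set.seed(184)  # arbitrary seed so the sketch is repeatable

# Object that the plot at the top of p. 235 expects
sim_results <- do.call(rbind, lapply(c(100, 400, 1600), run_sim))

# Store the 20,000-simulation result under a new name so that
# sim_results is not overwritten
sim_results20k <- run_sim(20000)

# sim_results still holds its original rows for the later ggplot call
nrow(sim_results)
```

Had the last assignment reused the name sim_results, the three-row data frame would have been replaced by the single 20,000-simulation row before the plot was drawn.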

In the news…

These Two Charts Prove A College Education Just Isn’t Worth The Money Anymore

  • Q: Any thoughts?

source: https://www.businessinsider.com/these-two-charts-prove-a-college-education-just-isnt-worth-the-money-anymore-2012-6

Education vs Income Observations

source: https://io9.gizmodo.com/11-most-useless-and-misleading-infographics-on-the-inte-1688239674

Stand your ground

Professional Ethics

General principles

American Statistical Association (2018 ethical guidelines)

Data Science ethics

(PSU) Guidelines for decision making

  1. Perceived problem or ethical dilemma?
  2. What are the facts?
  3. What stakeholders, values, and guidelines are involved?
  4. What are the options (good & bad)?
  5. Consider the options (outcomes, virtues, …)?
  6. Which is the BEST (or “least bad”) option?
  7. How might we prevent this issue in the future?

Some Examples

Applying our guidelines: Employment Discrimination

Applying our guidelines: Data Scraping

Applying our guidelines: Reproducible spreadsheet analysis

Data reidentification and disclosure avoidance

Data scraping and terms of use

Reproducibility

Multiple Testing

More examples of the ethical minefield we call Statistics & Data Science…

