filter() & grepl()
mutate() & gsub()
tidyr::extract()
Date column
library(rvest)  # read_html(), html_nodes(), html_table(), %>%
library(dplyr)  # mutate()
page <- "https://en.wikipedia.org/wiki/Mile_run_world_record_progression"
XPATH <- '//*[@id="mw-content-text"]/table'
table_list <- page %>%
  read_html() %>%
  html_nodes(xpath = XPATH) %>%
  html_table(fill = TRUE)
IAAFmen <- table_list[[4]]
head(IAAFmen, 3)
Time | Auto | Athlete | Nationality | Date | Venue |
---|---|---|---|---|---|
4:14.4 | | John Paul Jones | United States | 31 May 1913[5] | Allston, Mass. |
4:12.6 | | Norman Taber | United States | 16 July 1915[5] | Allston, Mass. |
4:10.4 | | Paavo Nurmi | Finland | 23 August 1923[5] | Stockholm |
Now we can use mutate() & gsub() to help us clean up the footnotes from Date:
IAAFmen %>%
  mutate(Date = gsub("\\[.\\]$", "", Date)) %>%
  head(3)
Time | Auto | Athlete | Nationality | Date | Venue |
---|---|---|---|---|---|
4:14.4 | | John Paul Jones | United States | 31 May 1913 | Allston, Mass. |
4:12.6 | | Norman Taber | United States | 16 July 1915 | Allston, Mass. |
4:10.4 | | Paavo Nurmi | Finland | 23 August 1923 | Stockholm |
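To see what the pattern in that gsub() call does, here is a small base-R sketch on a toy character vector. The first two strings mirror the table; the last two are made-up cases added only to show the edges of the pattern. The second pattern is an assumed variant, not something the activity uses.

```r
# Toy Date strings; the last two are made-up cases, not rows from the table.
dates <- c("31 May 1913[5]", "16 July 1915[5]", "7 August 1981[12]", "6 May 1954")

# Pattern from the text: "." matches exactly one character between the brackets.
one_char <- gsub("\\[.\\]$", "", dates)

# Assumed variant: "[0-9]+" matches one or more digits between the brackets.
any_digits <- gsub("\\[[0-9]+\\]$", "", dates)

print(one_char)
print(any_digits)
```

The single-character pattern leaves a two-digit footnote such as [12] in place, so the digit-based variant may be worth considering if the footnote numbers in a table go past 9.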
The assignment is worth a total of 10 points.
Use ggplot to construct a bar chart in descending order of popularity for the street name identifiers you found.
Two data sets are provided. One includes 15,000 street addresses of registered voters in Wake County, North Carolina. The other includes over 900,000 street addresses of Medicare Service Providers. You can use either data set (or both!) for the activity.
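One possible way to get bars in descending order is to reorder the factor levels before plotting. The sketch below assumes a summary table with one row per identifier; the data frame `counts`, its columns, and the numbers in it are all made up for illustration and do not come from either data set.

```r
library(ggplot2)

# Made-up counts for illustration only; real counts come from the address data.
counts <- data.frame(
  identifier = c("RD", "ST", "DR", "LN"),
  n = c(40, 25, 55, 10)
)

# reorder() sorts the factor levels by -n, so the largest count is drawn first.
counts$identifier <- reorder(counts$identifier, -counts$n)

ggplot(counts, aes(x = identifier, y = n)) +
  geom_col() +
  labs(x = "Street name identifier", y = "Number of addresses")
```

geom_col() is used here because the counts are already computed; with one row per address, geom_bar() would count the rows instead.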
Note: There’s nothing to do in the “For the professional…” section at the very end except to be impressed.