Additions and corrections for Practical R for Mass Communication and Journalism since publication in December 2018. To suggest an update or correction, please open an issue in this book’s GitHub repository.
3.4 Download and graph a city’s median income
At the bottom of page 19, “The fifth line creates the graph, using the dygraph() function from the dygraphs package. The first argument, sfdata, tells dygraph what data set to graph.” should read “The third line creates the graph, using the dygraph() function from the dygraphs package. The first argument, sfincome, tells dygraph what data set to graph.”
3.7 Comparing one city’s data to the US median
On page 21, this code:
usincome <- getSymbols("MHIUS00000A052NCEN", src="FRED")
should be:
usincome <- getSymbols("MHIUS00000A052NCEN", src="FRED", auto.assign=FALSE)
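Why auto.assign=FALSE matters: by default, getSymbols() stores the downloaded data in an object named after the symbol and returns only the symbol's name as a character string, so usincome would end up holding text rather than data. Here is a minimal sketch of the corrected call, assuming the quantmod package is installed and the FRED service is reachable from your machine:

```r
library(quantmod)

# With auto.assign = FALSE, getSymbols() returns the data itself,
# so it can be assigned to a variable name of your choosing.
usincome <- getSymbols("MHIUS00000A052NCEN", src = "FRED", auto.assign = FALSE)

# Inspect what came back: this should be a time series object,
# not a character string.
class(usincome)
head(usincome)
```

If class(usincome) reports "character", the auto.assign=FALSE argument was probably left out.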
4.6 Easy sample data
After I submitted the manuscript to my publisher, Wikipedia changed the format of its list of U.S. cities. I suggest not using code that tries to import a table from https://en.wikipedia.org/wiki/List_of_United_States_cities_by_population. Instead, import from a copy of the older file I posted at http://bit.ly/WikiCityList. Instructions on how to do this are also in this section of the book.
Chapter 5: Basic Data Exploration
Discovered after publication: The DataExplorer R package, which generates an HTML report about a data frame with a single line of code. https://boxuancui.github.io/DataExplorer/.
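As a rough illustration of that single line, here is a sketch using R's built-in mtcars data set. The output file name is my own choice for this example, and the call assumes the DataExplorer and rmarkdown packages (plus pandoc, which ships with RStudio) are installed:

```r
library(DataExplorer)

# Generate an HTML report profiling the mtcars data frame.
# By default the file is written to the current working directory.
create_report(mtcars, output_file = "mtcars_report.html")
```

Swap in your own data frame in place of mtcars to profile your data.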
After the book was published, I discovered the paletteer package. It pulls together palettes for ggplot2 from dozens of other packages, including dutchmasters, ggthemes, RColorBrewer, Redmonder, and viridis. Available on GitHub: https://github.com/EmilHvitfeldt/paletteer.
Also after the book was published, the BBC’s Visual and Data Journalism team posted an explainer on how they use ggplot2 to create charts for publication – plus a “cookbook for R graphics” with code on “How to create BBC style graphics”. Blog post: https://medium.com/bbc-visual-and-data-journalism/how-the-bbc-visual-and-data-journalism-team-works-with-graphics-in-r-ed0b35693535. Cookbook for BBC-style graphics using R: https://bbc.github.io/rcookbook/.
Another one that didn’t make the book: From Data to Viz, a site that offers advice on what visualizations to use based on the type of data you have – one numeric, two numeric, one numeric and one categorical, etc. In addition, it has sample code for dozens of visualizations and “common caveats you should avoid.” Extremely helpful for beginning and intermediate R users alike. https://www.data-to-viz.com/.
Chapter 10: Write Your Own R Functions
DataCamp is suggested as an additional resource. However, readers may want to know that DataCamp has been embroiled in a controversy over how it has handled an executive’s ‘uninvited physical contact’ with an employee. The latest: DataCamp CEO steps down indefinitely in wake of ‘inappropriate behavior’ and A Multimillion-Dollar Startup Hid A Sexual Harassment Incident By Its CEO — Then A Community of Outsiders Dragged It Into the Light.
Chapter 11: Maps in R
In section 11.11, ggmap::geocode() no longer works without a Google Maps API key. That’s due to a change in Google policy. Running ?register_google in the R console gives information on how to obtain and register a key. While Google allows some free usage of its geocoding service, you will need to register a credit card in a Google account even if you don’t exceed the free usage tier.
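Once you have a key, registering it and geocoding an address might look something like the sketch below. The environment variable name GOOGLE_MAPS_API_KEY and the sample address are assumptions for illustration only. Storing the key in an environment variable is one way to keep it out of your scripts, not a requirement of ggmap.

```r
library(ggmap)

# Assumption: the key is stored in an environment variable named
# GOOGLE_MAPS_API_KEY (a hypothetical name; use whatever you prefer).
api_key <- Sys.getenv("GOOGLE_MAPS_API_KEY")

if (nzchar(api_key)) {
  # Register the key for this R session, then geocode a sample address.
  register_google(key = api_key)
  location <- geocode("1600 Pennsylvania Ave NW, Washington, DC")
  print(location)
} else {
  message("No Google Maps API key found; see ?register_google for setup help.")
}
```

The check on api_key keeps the script from calling Google's service when no key has been set.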
Making thematic maps with R. Step-by-step guide to making choropleth maps in R with the sf, tmap, leaflet, and ggplot2 packages, from a February 2019 workshop by researcher Maarten Hermans. https://workshop.mhermans.net/thematic-maps-r/
In February 2019, Economist data journalist G. Elliott Morris released the politicaldata package for analyzing U.S. political data in R. It’s designed to make it “easier to explore polling, election results, demographic data and more,” according to an explainer Morris wrote, and it includes data on U.S. Congressional ideology ratings, Congressional Elections, polling results, and Gallup’s Most Important Problem questions. See it on GitHub: https://github.com/elliottmorris/politicaldata.
Resources of possible interest that are not included in the book, either to save space or because they weren’t available when I turned in my manuscript:
R Programming at the Urban Institute – This guide features useful explainers with examples and code for ggplot2 visualizations, maps, and code optimization as well as basics. https://ui-research.github.io/r-at-urban/index.html
Videos from sessions at the 2019 RStudio conference: https://resources.rstudio.com/rstudio-conf-2019
Text as Data - open-source version of a class offered by Chris Bail, professor of Sociology, Public Policy, and Data Science at Duke University. https://cbail.github.io/textasdata/
17.5 More stories done with R
“How the Suburbs Will Swing the Midterm Election” – analysis of Congressional District leanings based on population density, by David Montgomery and Richard Florida for CityLab. Includes interactive Shiny app. Story: https://www.citylab.com/equity/2018/10/midterm-election-data-suburban-voters/572137/. GitHub repo with R code and data: https://github.com/theatlantic/citylab-data/tree/master/citylab-congress
“How safe are Maryland’s bridges?” – front-page story in the Baltimore Sun finds hundreds are in ‘poor’ condition, many are structurally deficient. Story: http://www.baltimoresun.com/news/maryland/bs-md-bridge-collapse-maryland-20180815-story.html. GitHub repo with code and data: https://github.com/baltimore-sun-data/bridge-data
“What new Census data reveal about wealth, diversity, and connectivity in Maryland” - analysis of American Community Survey Census data. Story: https://www.baltimoresun.com/news/maryland/bs-md-acs-census-release-20181206-story.html. GitHub repo with R code using tidycensus and censusapi packages: https://github.com/baltimore-sun-data/census-data-analysis-2018
“Denied Justice” - Star Tribune’s series that highlighted major problems with how Minnesota investigates and prosecutes rape cases, named a Pulitzer Prize finalist in local reporting.
MaryJo Webster, an experienced data journalist and Excel super power user, said this investigative project was her first major effort using R for all the analysis.
“R was the perfect choice for this because we had data rolling in gradually over many, many months,” she told me on Slack. “I had to re-run the same analysis literally every single week for the better part of 8 months.”
No better statement on why it’s useful to learn a scripting language!
You can see the full series here: http://www.startribune.com/deniedjustice
And a results page she built using R Markdown here: http://strib-data-public.s3-us-west-1.amazonaws.com/projects/rape/highlights.html
Last updated May 14, 2019