Applied Economics Teaching Resources

an AAEA Journal

Agricultural and Applied Economics Association

Gold in Them Tha-R Hills: A Review of R Packages for Exploratory Data Analysis

Kota Minegishi(a) and Taro Mieno(b)
(a) University of Minnesota, Twin Cities, (b) University of Nebraska-Lincoln

JEL Codes: A2, Q1, Y1
Keywords: Exploratory data analysis, data science, data visualization, R programming

Publish Date: June 25, 2020
Volume 2, Issue 3


Abstract

With the accelerating pace of data accumulation in the economy, there is a growing need for data literacy and practical skills to make use of data in the workforce. Applied economics programs have an important role to play in training students in those areas. Teaching the tools of data exploration and visualization, also known as exploratory data analysis (EDA), would be a timely addition to existing curricula. It would also present a new opportunity to engage students through hands-on exercises using real-world data in ways that differ from exercises in statistics. In this article, we review recent developments in the EDA toolkit for the free statistical computing software R, focusing on the tidyverse package. Our contributions are threefold: we present this new generation of tools with a focus on its syntax structure; our examples show how one can use public data from the U.S. Census of Agriculture for data exploration; and we highlight the practical value of EDA in handling data, uncovering insights, and communicating key aspects of the data.

About the Authors: Kota Minegishi is an Assistant Professor at the University of Minnesota, Twin Cities (corresponding author: kota@umn.edu). Taro Mieno is an Assistant Professor at the University of Nebraska-Lincoln.

Copyright is governed under Creative Commons CC BY-NC-SA
