Good practices in data analysis: a short reference list

General good practices

  • Kass, Robert E., Brian S. Caffo, Marie Davidian, Xiao-Li Meng, Bin Yu, and Nancy Reid. “Ten Simple Rules for Effective Statistical Practice.” PLOS Computational Biology 12, no. 6 (June 9, 2016): e1004961. https://doi.org/10.1371/journal.pcbi.1004961
  • Zuur, Alain F., Elena N. Ieno, and Chris S. Elphick. “A Protocol for Data Exploration to Avoid Common Statistical Problems.” Methods in Ecology and Evolution 1, no. 1 (2010): 3–14. https://doi.org/10.1111/j.2041-210X.2009.00001.x
  • Steel, E. Ashley, Maureen C. Kennedy, Patrick G. Cunningham, and John S. Stanovick. “Applied Statistics in Ecology: Common Pitfalls and Simple Solutions.” Ecosphere 4, no. 9 (2013): art115. https://doi.org/10.1890/ES13-00160.1
  • Murtaugh, Paul A. “Simplicity and Complexity in Ecological Data Analysis.” Ecology 88, no. 1 (2007): 56–62. https://doi.org/10.1890/0012-9658(2007)88[56:SACIED]2.0.CO;2

Regression

  • Zuur, Alain F., and Elena N. Ieno. “A Protocol for Conducting and Presenting Results of Regression-Type Analyses.” Methods in Ecology and Evolution 7, no. 6 (2016): 636–45. https://doi.org/10.1111/2041-210X.12577
  • Graham, Michael H. “Confronting Multicollinearity in Ecological Multiple Regression.” Ecology 84, no. 11 (2003): 2809–15. https://doi.org/10.1890/02-3114
  • Schielzeth, Holger. “Simple Means to Improve the Interpretability of Regression Coefficients.” Methods in Ecology and Evolution 1, no. 2 (2010): 103–13. https://doi.org/10.1111/j.2041-210X.2010.00012.x

Non-normality

  • Warton, David I., Mitchell Lyons, Jakub Stoklosa, and Anthony R. Ives. “Three Points to Consider When Choosing a LM or GLM Test for Count Data.” Methods in Ecology and Evolution 7, no. 8 (2016): 882–90. https://doi.org/10.1111/2041-210X.12552
  • Richards, Shane A. “Dealing with Overdispersed Count Data in Applied Ecology.” Journal of Applied Ecology 45, no. 1 (2008): 218–27. https://doi.org/10.1111/j.1365-2664.2007.01377.x

To log or not to log

  • O’Hara, Robert B., and D. Johan Kotze. “Do Not Log-Transform Count Data.” Methods in Ecology and Evolution 1, no. 2 (2010): 118–22. https://doi.org/10.1111/j.2041-210X.2010.00021.x
  • Ives, Anthony R. “For Testing the Significance of Regression Coefficients, Go Ahead and Log-Transform Count Data.” Methods in Ecology and Evolution 6, no. 7 (2015): 828–35. https://doi.org/10.1111/2041-210X.12386

Mixed-effect models

  • Harrison, Xavier A., Lynda Donaldson, Maria Eugenia Correa-Cano, Julian Evans, David N. Fisher, Cecily E. D. Goodwin, Beth S. Robinson, David J. Hodgson, and Richard Inger. “A Brief Introduction to Mixed Effects Modelling and Multi-Model Inference in Ecology.” PeerJ 6 (May 23, 2018): e4794. https://doi.org/10.7717/peerj.4794
  • Barr, Dale J., Roger Levy, Christoph Scheepers, and Harry J. Tily. “Random Effects Structure for Confirmatory Hypothesis Testing: Keep It Maximal.” Journal of Memory and Language 68, no. 3 (April 1, 2013): 255–78. https://doi.org/10.1016/j.jml.2012.11.001
  • Houslay, Thomas M., and Alastair J. Wilson. “Avoiding the Misuse of BLUP in Behavioural Ecology.” Behavioral Ecology 28, no. 4 (August 1, 2017): 948–52. https://doi.org/10.1093/beheco/arx023
  • Schielzeth, Holger, and Shinichi Nakagawa. “Nested by Design: Model Fitting and Interpretation in a Mixed Model Era.” Methods in Ecology and Evolution 4, no. 1 (2013): 14–24. https://doi.org/10.1111/j.2041-210x.2012.00251.x

Missing data

  • Nakagawa, Shinichi, and Robert P. Freckleton. “Missing Inaction: The Dangers of Ignoring Missing Data.” Trends in Ecology & Evolution 23, no. 11 (November 1, 2008): 592–96. https://doi.org/10.1016/j.tree.2008.06.014
  • Noble, Daniel W. A., and Shinichi Nakagawa. “Planned Missing Data Design: Stronger Inferences, Increased Research Efficiency and Improved Animal Welfare in Ecology and Evolution.” bioRxiv preprint, May 2, 2018. https://doi.org/10.1101/247064

Model selection, model averaging, etc.

  • Forstmeier, Wolfgang, and Holger Schielzeth. “Cryptic Multiple Hypotheses Testing in Linear Models: Overestimated Effect Sizes and the Winner’s Curse.” Behavioral Ecology and Sociobiology 65, no. 1 (January 1, 2011): 47–55. https://doi.org/10.1007/s00265-010-1038-5
  • Galipaud, Matthias, Mark A. F. Gillingham, Morgan David, and François-Xavier Dechaume-Moncharmont. “Ecologists Overestimate the Importance of Predictor Variables in Model Averaging: A Plea for Cautious Interpretations.” Methods in Ecology and Evolution 5, no. 10 (2014): 983–91. https://doi.org/10.1111/2041-210X.12251
  • Johnson, Jerald B., and Kristian S. Omland. “Model Selection in Ecology and Evolution.” Trends in Ecology & Evolution 19, no. 2 (February 1, 2004): 101–8. https://doi.org/10.1016/j.tree.2003.10.013
  • Nickerson, Raymond S. “Null Hypothesis Significance Testing: A Review of an Old and Continuing Controversy.” Psychological Methods 5, no. 2 (2000): 241–301. https://doi.org/10.1037/1082-989X.5.2.241
  • Nakagawa, Shinichi, and Robert P. Freckleton. “Model Averaging, Missing Data and Multiple Imputation: A Case Study for Behavioural Ecology.” Behavioral Ecology and Sociobiology 65, no. 1 (January 1, 2011): 103–16. https://doi.org/10.1007/s00265-010-1044-7
  • Rykiel, Edward J. “Testing Ecological Models: The Meaning of Validation.” Ecological Modelling 90, no. 3 (November 1, 1996): 229–44. https://doi.org/10.1016/0304-3800(95)00152-2
  • Cade, Brian S. “Model Averaging and Muddled Multimodel Inferences.” Ecology 96, no. 9 (2015): 2370–82. https://doi.org/10.1890/14-1639.1
