Course Details

EPID639: R For Epidemiologic Research

  • Graduate level
  • Residential
  • Fall term for residential students
  • 2 credit hours for residential students
  • Instructor: Kelly Bakulski (Residential)
  • Prerequisites: Must be a current EPID graduate student
  • Advisory Prerequisites: Must be a current EPID graduate student
  • Description: This course will introduce the R statistical programming language, as implemented through Posit software, for epidemiologic data analysis. The overall goal of the course is to provide students with a set of new data analysis tools for Epidemiology using R through Posit.
  • Learning Objectives:

      1. Understand what R is and why we use Posit
      2. Become familiar with the Posit Cloud interface
      3. Identify file paths for locations of files within an R project

      1. Adapt Quarto markdown YAML header code for multiple report types (.pdf, .html, .docx)
      2. Render Quarto markdown files (.qmd) to produce reports that contain both code and output

      1. Classify R object types (vector types, data frames)
      2. Implement functions to perform actions on data objects
      3. Use R as a calculator

      1. Apply functions to import and export datasets
      2. Explore a newly imported data frame

      1. Implement best practices for tidy coding and file organization
      2. Use the help viewer to assess new functions and function default settings
      3. Practice parsing error/warning messages and troubleshooting solutions in code
      4. Identify online resources for solving coding issues
      5. Perform logic checks by comparing expected and observed output

      1. Select columns in a data frame
      2. Order and filter dataset rows based on participant criteria
      3. Join multiple data frames into one

      1. Create new variables from existing variables
      2. Understand how to code and wrangle missing data

      1. Understand the required and optional components of a scatterplot with ggplot2
      2. Prioritize plot types (bar chart, histogram, boxplot) based on data types (number and shape of covariates)

      1. Describe coding features (labels, limits, colors, legends, size, transparency) for common plot types
      2. Generate multipaneled plots to view data by groups
      3. Export plots from Posit for use in other programs

      1. Based on variable type (continuous, categorical), determine appropriate measures and functions for assessing central tendency and spread
      2. Describe univariate and bivariate distributions of variables using central tendency and spread

      1. Calculate univariate and bivariate statistics
      2. Create professional and reproducible descriptive statistics tables for export

      1. Review selecting statistical methods by variable characteristics
      2. Implement and interpret output from two-category tests: correlation tests, t-tests, Wilcoxon rank sum test
      3. Implement and interpret output from multiple-category tests: ANOVA, chi-square test, Fisher's exact test
      4. Generate and interpret odds ratios

      1. Construct and interpret simple and multivariable linear models (continuous and categorical predictor variables)
      2. Create professional and reproducible regression output tables for export
      3. Create plots for regression diagnostics

      1. Apply formats for date objects
      2. Describe when to use for loops and how they work
      3. Develop custom functions to perform repeated tasks

      1. Explore generalized regression function options, including splines, logistic regression, and Poisson regression
      2. Become familiar with code for matched case-control studies and survival analysis
      3. Explore coding mixed effects models for clustered data
      4. Try adding weights for complex survey samples
      5. Perform a meta-analysis in R

  • Syllabus for EPID639
Kelly Bakulski