About

This advanced statistics course in sociology, titled “Multiple Regression and Fundamentals of Causal Inference,” offers a comprehensive understanding of both the theory and application of multiple regression analysis. Throughout the course, you will learn how to effectively interpret regression coefficients and utilize regression analysis to test hypotheses and make accurate predictions.

In addition to mastering multiple regression analysis, this course will equip you with a solid grasp of the fundamental principles underlying causal inference. You will gain the ability to differentiate between correlation and causation, identify the influence of confounding variables, and employ techniques to establish causal relationships. Appreciating the pivotal role of multiple regression as a core statistical tool in causal inference will be emphasized.

A key aspect of this course is the exploration of various research designs, including randomized controlled trials, quasi-experimental designs, and observational studies. You will acquire the skills necessary to select the appropriate design based on specific research inquiries. Hands-on experience will be gained through the use of R to conduct multiple regression analysis and apply causal inference techniques.

Throughout the course, sociological research questions will be addressed, such as investigating the impact of news media consumption on the awareness of ethno-racial discrimination, or assessing the effects of office interventions on domestic repeat violence. By delving into these topics, you will understand how multiple regression analysis and causal inference can be effectively employed within sociological research.

Ultimately, this course aims to establish a strong foundation in statistical analysis and causal inference, which are indispensable skills for conducting rigorous sociological research.

Before the first session!

Before the first session, I would like you to please follow these steps in exactly this order, regardless of whether you have worked with R and RStudio before.

  1. Get a recent version of R (>= 4.4.0).

  2. Install RStudio (>= 2024.04), or update to the most recent version. Also check this homepage.

  3. Open RStudio and follow these instructions to setup a Project for this class.

  4. As a final step, run the lines of code below in the Console of your RStudio. Note that this can take a while. If you are asked whether you want to “Do you want to install from sources the package which needs compilation?” answer: “no”. Make sure to wait until the courser blinks behind a > again. Press enter again two times to produce two more >.

update.packages(ask = FALSE)

install.packages("pacman")

pacman::p_load(
  tidyverse, # Data manipulation,
  ggplot2, # beautiful figures,
  wbstats, # download data from Worldbank. Tremendous source of global socio-economic data.
  estimatr, # Regression for weighted data,
  modelr, # Turn results of lm() into a tibble,
  furniture, # For row-means,
  modelsummary, # for regression and balance tables,
  ivreg, # IV 2SLS,
  countrycode, # Easy recodings of country names,
  remotes) # Install beta version packages from GitHub.

remotes::install_github("jrnold/masteringmetrics", subdir = "masteringmetrics")
remotes::install_github("xmarquez/democracyData")
remotes::install_github("avehtari/ROS-Examples", subdir = "rpackage")
# P.S.: If you cannot install one or several of these remotes::install_github(...) packages
# Please try from home, or an internet connection that is not associated with KU.

pacman::p_load(
  rosdata, # Data from Gelman & Hill's "Regression and other Stories"
  masteringmetrics, # Data and examples from Mastering Metrics
  democracyData) # download democracy data used in the scholarly literature.
  1. Now type sessionInfo() into your console and check whether all of the following are mentioned under “other attached packages”: tidyverse, ggplot2, wbstats, estimatr, modelr, furniture, modelsummary, ivreg, countrycode, remotes, rosdata, masteringmetrics, and finally democracyData. If any package is missing. Contact a tutor to help you set everything up properly.
  • Please always make sure to bring your laptop to class.

Course Components and Expectations

This course consists of a combination of methodological and substantive lectures that introduce crucial concepts, statistical methods, and their broader implications. Additionally, exercise classes will provide students with practical experience in applying and discussing these concepts. To fully benefit from this course, four expectations are set:

  1. Thoroughly prepare for each class by completing the assigned readings.
  2. Actively engage in lectures by asking relevant questions, participating in discussions, and completing assigned tasks during exercise classes.
  3. Solve the weekly Online Quizzes on Absalon (at least 10 to qualify for the exam!).
  4. Complete and submit the integrated exam shared with Mads’ Jæger’s course “Velfærd, ulighed og mobilitet”.

Course structure

The course schedule includes weekly lecture every Wednesday at 10am. Each session has a duration of three hours and comprises two lecture iterations, small exercises, and a 15-minute break. Additionally, a tutoring session is scheduled every Friday to provide a smaller group setting for working on stats problems, asking questions, and gaining hands-on experience with R. Finally, there is an Absalon Online Quizz every week. Note that this is an individual quizz and that each student must submit at least 10 of the overall 14 quizzes in order to qualify for the exam!

Course Schedule and Team

Wednesdays, kl. 10-13: Lecture @ CSS 1-1-18 (Merlin)

Fridays, kl. 8-10, Exercise group 1 @ CSS 2-1-02 (Jonas)

Fridays, kl. 8-10, Exercise group 1 @ CSS 4-1-30 (Josefine)

Fridays, kl. 8-10, Exercise group 1 @ CSS 2-2-55 (Elias)

Fridays, kl. 8-10, Exercise group 1 @ CSS 2-1-55 (Adna)

Course Literature

This course uses two textbooks. Both textbooks are accessible at Academic Books on the City Campus and are also available at the university library. The first textbook carried over from last semester serves as a reference for regression and statistical inference:

  • Stats: Data and Models, Global Edition by Veaux, De, Velleman, and Bock (2021), New York: Pearson.

To deepen our understanding of how regression analysis can be applied to identify causal effects, we will heavily rely on the book:

  • Mastering ’Metrics: The Path from Cause to Effect by Angrist, Joshua D., and Jörn-Steffen Pischke (2014), Princeton: Princeton University Press.