Resources for learning about causal inference and AI/ML.

Causal inference

Here are materials that we find useful in studying causal inference. Our lab hosts a Causal AI Reading Group that meets (in-person) weekly to read about causality and AI. Email Dr. Zhang if you are interested in joining us.


  1. Pearl, Glymour, Jewell (2016). Causal Inference in Statistics: A Primer.
    • A short booklet that covers the basics of causal inference.
    • Very intuitive examples accompanied by clear math.
    • Illustrates the beauty of causal graphical models.
  2. Peng Ding (2023). A First Course in Causal Inference.
    • A very recent book (first version released May 2023) that includes the latest advances at the intersection of causal inference and machine learning.
    • R code and datasets used in the book.
    • Python code
  3. Hernan and Robins (2023). Causal Inference: What If.
    • More tailored to biostatistics and health care.
    • Includes a time series chapter.
  4. Pearl (2009). Causality.
    • Fantastic. Good for readers who have finished the shorter book (A Primer) and are ready to dive deeper into causality.
  5. Pearl and Mackenzie (2018). The Book of Why.
    • Good storytelling even for non-scientists. A great gift for your parents and friends.
  6. Morgan and Winship (2013). Counterfactuals and Causal Inference: Methods and Principles for Social Research.
    • Covers both causal graphs and potential outcomes, which is rare.
  7. Imbens and Rubin (2015). Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction.
    • Presents the potential outcomes framework.

Online tutorials and talks

  1. Brady Neal. Causal Inference
    • Succinct tutorials. Each video is several minutes long.
    • Fun to watch. Brady makes excellent diagrams that are easy to understand.
  2. Online Causal Inference Seminar.
    • Great speakers from all over the world working on causality.
    • Meets on Zoom almost every week.

EHR and informatics

Here are materials that we find useful in understanding electronic health record (EHR) data, data standards, and the field of biomedical informatics, especially clinical informatics.


  1. The Book of OHDSI.
    • This is a book about the Observational Health Data Sciences and Informatics (OHDSI) collaborative.
    • A great resource for learning about the OMOP Common Data Model, OHDSI open-source analytics, and how to run network studies.
  2. OHDSI official website.
    • Latest updates about the community as well as recordings of past community and workgroup calls.
  3. HADES
    • HADES (formerly known as the OHDSI Methods Library) is a set of open-source R packages for large-scale analytics, including population characterization, population-level causal effect estimation, and patient-level prediction.
    • Includes clear step-by-step instructions for setting up the R environment for HADES and OHDSI studies. That’s where the fun begins!