Description of modules offered:
Module 1: Introduction to R, Biostatistics
Instructors:
Ken Rice, Professor, University of Washington
David Reif, Associate Professor, North Carolina State University
This module introduces the R statistical environment, assuming no prior knowledge. It provides a foundation for the use of R for computation in later modules, including accessing the resources of Bioconductor and other open-source software. In addition to basic data management tasks in R, such as using R commands to read in data and produce summaries, we will introduce R’s graphics functions, its powerful package system, and simple methods of looping. Examples and exercises will use data drawn from genetic, biological and medical applications. Hands-on use of R is a major component of this module; participants need a laptop and will use it in all sessions.
Module 2: Advanced Quantitative Genetics
Instructors:
Matt Robinson, Assistant Professor, University of Lausanne
Zoltan Kutalik, Associate Professor, University of Lausanne
This module focuses on the genetics and analysis of quantitative traits in human populations, with emphasis on estimation and prediction analysis using genetic markers. Topics include: the resemblance between relatives; estimation of genetic variance associated with genome-wide identity by descent; GWAS for quantitative traits; the use of GWAS data to estimate and partition genetic variation; and the principles and pitfalls of prediction analyses using genetic markers. A series of computer exercises will provide hands-on experience in implementing a variety of approaches using R, PLINK and GCTA.
Module 3: Integrative Genomics: Gene Expression and Network Analysis
Instructors:
Greg Gibson, Professor, Georgia Institute of Technology
Alison Motsinger-Reif, Professor, North Carolina State University
The gene expression module will cover the theory and application of transcriptomics based on RNA-Seq methodologies, as well as network analysis of differentially expressed genes. Half of the module will focus on the statistical basis of hypothesis testing, including the fundamentals of ANOVA and significance thresholds, and on the central role of normalization strategies as they affect inference of differential expression. Single-cell RNA-Seq will also be discussed briefly, along with an introduction to eQTL analysis. The other half of the module will discuss how clustering approaches can provide a window into biological systems and complex diseases, and how they can be used to understand how biological functions are implemented and how homeostasis is maintained. We will demonstrate how to leverage biological knowledge from gene expression experiments, published data, and knowledge bases to identify the pathways and disease sets associated with a disease or other outcome of interest. The techniques discussed will be demonstrated in R and other computational tools. This module assumes a previous course in regression and familiarity with R or another command-line programming language. Participants need a laptop and may use it in all sessions.
Module 4: Computational Pipeline for WGS Data
Instructors:
Stephanie Gogarten, Research Scientist, University of Washington
Tim Thornton, Associate Professor, University of Washington
This module provides an introduction to analysis of whole-genome sequence data. Topics include sequencing data structures, population structure and relatedness, aggregating and filtering variants using annotation, and association testing using single- and multi-marker tests. Concepts will be illustrated with hands-on exercises in R. Computational pipelines to link multi-step analyses will be presented, along with considerations for deploying these pipelines on a local compute cluster or in the cloud.
Module 5: Social Science Genetics
Instructors:
Philipp Koellinger, Professor, VU University Amsterdam
Richard Karlsson Linnér, Research Scientist, VU University Amsterdam
After decades of scientific debate, a consensus has emerged that all human behavioural traits are partly heritable, i.e., they are affected to some degree by random genetic variation within families. Furthermore, it has become evident that genetic and environmental causes of individual differences are interrelated. Thanks to rapid technological and scientific progress in the last few years, it is now possible to study the genetic architecture of social-scientific outcomes directly and to integrate the resulting insights into the social and medical sciences. This module of the winter school will provide an overview of state-of-the-art methods that are used to study the genetic architecture of human traits (e.g. GWAS) and to apply those insights in the social sciences (e.g. to study gene-environment interactions and to obtain more precise estimates of the effects of environmental conditions). The theoretical lectures will leave plenty of room for discussion and will be accompanied by computer tutorials that focus on skills particularly useful for social scientists who want to integrate genetic data into their research (e.g. how to detect and control for population structure in the data and how to construct polygenic scores).