2nd CAPICE workshop on “Introduction to the statistical analysis of Genome wide association studies (GWAS)” at Imperial College London

Details: Written by Hema Sekhar Reddy Rajula; Category: Events; Published: 07 September 2018

The second workshop of the CAPICE project, “Introduction to the statistical analysis of Genome-wide association studies (GWAS)”, was held at Imperial College London from 2 to 6 July 2018. The lectures were good, and the practical sessions in particular were well organized and easy to follow.

The course lasted five days, and the lecturers covered the following concepts. The first day began with a lecture on “Introduction to statistics for geneticists” given by Dr. Inga Prokopenko, followed in the afternoon by a lecture on “Introduction to GWAS” given by Dr. Marika Kaakinen and a computer workshop on “Introduction to UNIX and R” (Practicals).
Genetic variants discovered so far explain only a small proportion of the estimated heritability of many traits; the unexplained remainder is known as missing heritability. Finding the missing heritability may become possible through larger sample sizes, gene x environment interaction studies, analysis of rare variants (minor allele frequency < 1%), and analysis strategies that increase power, e.g. multi-phenotype analysis.
On the second day, Dr. Reedik Magi talked about “Quality control for GWAS”, followed in the afternoon by a lecture on “Statistical models for genetic association analysis” given by Dr. Krista Fischer and a computer workshop (Practicals).

Quality control for GWAS
QC is one of the most time-consuming steps in GWAS. QC frequently detects important issues in data and sample collection and storage, laboratory protocols, data management, and phenotype distributions and definitions. Some post-analysis QC steps might require re-evaluating specific QC criteria and redoing the QC or the analysis.

Statistical models for genetic association analysis
In order to reach valid conclusions based on genetic data, knowledge of statistical methodology is essential. A study has to be planned with the final analysis in mind, including power and sample size issues.
On the third day, Dr. Inga Prokopenko gave a lecture on “Association analysis”, followed in the afternoon by a lecture on “Population structure” given by Prof. Andrew Morris and a computer workshop (Practicals).

Association analysis
The short-term goal is to identify genetic variants that explain differences in phenotype among individuals in a study population. Phenotypes can be qualitative (e.g. disease status, presence/absence of a congenital defect) or quantitative (e.g. blood glucose levels, % body fat). If an association is found, further study can follow to understand the mechanism of action and disease etiology in individuals and to characterize its relevance and/or impact in the more general population. The long-term goal is to inform the process of identifying and delivering better prevention and treatment strategies.
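To make the qualitative/quantitative distinction concrete, here is a minimal sketch in plain Python of two single-variant tests. It was not part of the workshop material: the function names, the choice of an allelic chi-square test for case/control data, and the simple least-squares slope for a quantitative trait are all assumptions made for illustration. Real analyses would normally use dedicated GWAS software and adjust for covariates.

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square (1 df) on a 2x2 table of allele counts.

    Returns the statistic and its p-value. Illustrative only.
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    total = case_alt + case_ref + control_alt + control_ref
    row_totals = [sum(row) for row in table]
    col_totals = [case_alt + control_alt, case_ref + control_ref]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def dosage_slope(genotypes, phenotypes):
    """Least-squares slope of a quantitative trait on allele dosage (0/1/2)."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    cov = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, phenotypes))
    var = sum((g - mean_g) ** 2 for g in genotypes)
    return cov / var


# Hypothetical counts: 60 alt / 40 ref alleles in cases, 40 / 60 in controls.
chi2, p_value = allelic_chi_square(60, 40, 40, 60)
# Hypothetical quantitative trait that rises by one unit per allele copy.
slope = dosage_slope([0, 1, 2, 0, 1, 2], [1.0, 2.0, 3.0, 1.0, 2.0, 3.0])
```

The first function addresses a qualitative phenotype (case versus control), the second a quantitative one; both test one variant at a time, which is the basic unit of a GWAS scan.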

Population Structure
Population structure can lead to spurious associations if disease prevalence and allele frequencies vary between subpopulations. We can use information from markers scattered throughout the genome to test for the presence of structure, to identify groups of individuals with similar ancestry, and to correct association tests for mismatching of cases and controls. The genomic control inflation factor can be used as an indicator of the presence of population structure. Principal components analysis (PCA) can be used to calculate axes of genetic variation that maximize the variability between individuals. Linear mixed models can be used to account for population structure and relatedness in a computationally efficient way.
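As an illustration of the genomic control inflation factor mentioned above, the sketch below uses one common definition: the median of the observed 1-df chi-square association statistics divided by the median of the chi-square distribution with 1 degree of freedom (about 0.455). The function names and example statistics are hypothetical, and the workshop may have presented a different variant of the calculation.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation_factor(chi2_stats):
    """Ratio of the observed median test statistic to its expected value under the null."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control_correct(chi2_stats):
    """Divide each statistic by lambda when lambda exceeds 1; otherwise leave unchanged."""
    lam = genomic_inflation_factor(chi2_stats)
    if lam <= 1.0:
        return list(chi2_stats)
    return [stat / lam for stat in chi2_stats]


# Hypothetical statistics whose median is twice the null median.
observed = [0.1, 2 * CHI2_1DF_MEDIAN, 5.0]
lam = genomic_inflation_factor(observed)
corrected = genomic_control_correct(observed)
```

A value close to 1 suggests little inflation, while values clearly above 1 can point to population structure, relatedness, or other sources of confounding.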

On the fourth day, Dr. Inga Prokopenko talked about “Imputation of GWAS”, followed in the afternoon by a lecture on “Meta-analysis of GWAS” given by Prof. Andrew Morris and a computer workshop (Practicals).

Imputation of GWAS
Imputation allows prediction of genotypes at variants that were not typed directly but are present in a high-density reference panel. Imputation enables meta-analysis across studies typed with different genotyping arrays, increases power, and improves fine-mapping resolution. The quality of imputation depends on the genotype scaffold, the choice of reference panel, and the minor allele frequency.

Meta-analysis
Meta-analysis of GWAS summary statistics allows an increase in power to detect association without a direct exchange of genotype (and other relevant phenotype) data. Fixed-effects meta-analysis assumes a homogeneous allelic effect across studies, so it is important to assess evidence for heterogeneity. Software designed for GWAS meta-analysis offers features that allow for strand alignment, genomic control correction, and SNP filtering. Trans-ethnic meta-analysis can improve fine-mapping resolution.
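The fixed-effects idea can be sketched for a single variant using inverse-variance weighting, together with Cochran's Q and the I² statistic as simple heterogeneity measures. This is an illustrative assumption about the method, not the specific software used in the practicals; the function names and the example effect sizes are hypothetical.

```python
import math


def fixed_effects_meta(betas, standard_errors):
    """Inverse-variance weighted fixed-effects estimate for one variant.

    Returns the pooled effect, its standard error, and Cochran's Q.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviations of study effects from the pooled effect.
    q_stat = sum(w * (b - pooled_beta) ** 2 for w, b in zip(weights, betas))
    return pooled_beta, pooled_se, q_stat


def i_squared(q_stat, n_studies):
    """Share of total variation attributed to heterogeneity, truncated at 0."""
    df = n_studies - 1
    if q_stat <= 0.0:
        return 0.0
    return max(0.0, (q_stat - df) / q_stat)


# Hypothetical per-study effect estimates for the same variant and effect allele.
betas = [0.1, 0.3]
standard_errors = [0.1, 0.1]
pooled_beta, pooled_se, q_stat = fixed_effects_meta(betas, standard_errors)
heterogeneity = i_squared(q_stat, len(betas))
```

The sketch assumes that all studies report the effect for the same allele on the same strand, which is exactly what the strand alignment step in dedicated software is there to guarantee.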
On the fifth day, Dr. Krista Fischer gave a lecture on “Genetic risk scores, Mendelian randomization”, followed in the afternoon by a lecture on “Analysis of rare variants” given by Prof. Andrew Morris and a computer workshop (Practicals).
Large prospective biobank cohorts make it possible to move towards personalized risk prediction; however, there are many statistical challenges on this road. Polygenic risk scores are increasingly popular, but there is no unified approach to calculating them efficiently. One should be aware of the choice of the polygenic predictor, sample selection, ancestry issues, and modeling assumptions. Studies based on Mendelian randomization are increasingly popular and become more feasible as sample sizes increase. Mendelian randomization is helpful, but it is not a miracle tool for obtaining unconfounded estimates; one should be aware of potential bias due to pleiotropy.
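Since there is no single agreed way to compute a polygenic risk score, the following is only one simple form: a weighted sum of allele dosages, with weights taken from per-variant effect estimates. The function name, the dosages, and the weights are hypothetical, and the sketch deliberately ignores variant selection, linkage disequilibrium, and ancestry, which are the issues the lecture warned about.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0 to 2) for one individual."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must cover the same variants")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical per-variant effect sizes and two individuals' dosages.
weights = [0.5, -0.2, 0.1]
score_a = polygenic_score([2, 1, 0], weights)
score_b = polygenic_score([0, 0, 0], weights)
```

In this toy example the first individual carries two copies of the strongest risk allele and therefore scores higher than the second, who carries none of the effect alleles.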
Rare genetic variants may account for a proportion of the “missing heritability” of complex human traits. Statistical methods focus on the accumulation of minor alleles at rare variants (mutational load) within the same “genomic unit”. The most powerful rare variant test will depend on the underlying genetic architecture of the trait. Rare variants can be captured through re-sequencing, exome array genotyping, and imputation into GWAS scaffolds. Novel discoveries on the genetic basis of complex human traits are emerging through the analysis of rare variants.
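The notion of mutational load can be sketched as a simple burden score: for each individual, count the minor alleles carried at rare variants within one genomic unit (for example a gene). The code below is a hedged illustration with hypothetical names and data. It assumes genotypes are coded as counts of the minor allele and uses the 1% frequency threshold mentioned earlier as the default; actual rare variant tests differ in how they weight and combine variants.

```python
def coded_allele_frequency(variant_genotypes):
    """Frequency of the coded allele, given 0/1/2 counts per individual."""
    return sum(variant_genotypes) / (2 * len(variant_genotypes))


def burden_scores(genotype_matrix, maf_threshold=0.01):
    """Per-individual count of minor alleles at variants rarer than the threshold.

    genotype_matrix[i][j] is the minor-allele count of individual i at variant j,
    where all variants belong to the same genomic unit.
    """
    n_variants = len(genotype_matrix[0])
    rare_columns = []
    for j in range(n_variants):
        column = [row[j] for row in genotype_matrix]
        if coded_allele_frequency(column) < maf_threshold:
            rare_columns.append(j)
    return [sum(row[j] for j in rare_columns) for row in genotype_matrix]


# Hypothetical toy data: 4 individuals, 3 variants in one gene.
# A loose threshold is used here only because the sample is tiny.
genotypes = [
    [1, 0, 1],
    [0, 0, 1],
    [0, 1, 1],
    [0, 0, 0],
]
scores = burden_scores(genotypes, maf_threshold=0.2)
```

The resulting per-individual scores can then be tested for association with the trait in place of single-variant genotypes, which is the basic idea behind burden-style tests.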
There were three invited lectures during the course. The first was on “Genetics of arterial blood pressure: from common to rare variants”, the second on “Systems genetics of major depression and other stress-related phenotypes”, and the last on “Mendelian randomization to reveal biomarkers, obesity subtypes and boost GWAS power”. These lectures gave us insight into practical applications of GWAS.
