Blog & News

The CAPICE blog hosts news and announcements, events, media and articles, mostly written by the Early Stage Researchers (ESRs).
They regularly publish articles about the research they carry out within this project, and this blog serves as a travelogue that disseminates the research results to a broad audience of scientists, clinicians, patients and their parents, and the general public.


The second workshop of the CAPICE project, “Introduction to the statistical analysis of Genome-wide association studies (GWAS)”, was held at Imperial College London from 2 to 6 July 2018. The lectures were good, and the practical sessions in particular were well organized and easy to follow.

The course ran for five days, and the lecturers covered the following concepts. On the first day, Dr. Inga Prokopenko gave a lecture on “Introduction to statistics for geneticists”. In the afternoon, Dr. Marika Kaakinen gave a lecture on “Introduction to GWAS”, followed by a computer workshop on “Introduction to UNIX and R” (Practicals).
Genetic variants discovered so far explain only a small proportion of the estimated heritability of many traits. The unexplained remainder is known as missing heritability. Part of it may be found through larger sample sizes, gene x environment interaction studies, analysis of rare variants (minor allele frequency < 1%) and analysis strategies that increase power, e.g. multi-phenotype analysis.
On the second day, Dr. Reedik Magi talked about “Quality control for GWAS”. In the afternoon, Dr. Krista Fischer gave a lecture on “Statistical models for genetic association analysis”, followed by a computer workshop (Practicals).

Quality control for GWAS
QC is one of the most time-consuming steps in GWAS. QC frequently detects important issues in data and sample collection and storage, laboratory protocols, data management, phenotype distributions, and definitions. Some post-analysis QC steps might require re-evaluating specific QC criteria and redoing the QC or the analysis.
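As a rough illustration of what a per-variant QC check can look like, here is a minimal sketch that filters variants by call rate and minor allele frequency. The function name `variant_qc`, the thresholds and the genotype coding (0/1/2 allele counts, NaN for a missing call) are assumptions made for this example, not criteria taken from the course.

```python
import numpy as np

def variant_qc(genotypes, min_call_rate=0.95, min_maf=0.01):
    """Return a boolean mask of variants (columns) that pass basic QC.

    genotypes: 2-D array, rows = samples, columns = variants,
    coded 0/1/2 with np.nan for missing calls (assumed coding).
    Thresholds are illustrative defaults only.
    """
    g = np.asarray(genotypes, dtype=float)
    called = ~np.isnan(g)
    n_called = called.sum(axis=0)
    call_rate = n_called / g.shape[0]
    # Allele frequency among called genotypes (two alleles per sample)
    allele_sum = np.nansum(g, axis=0)
    with np.errstate(divide="ignore", invalid="ignore"):
        freq = np.where(n_called > 0, allele_sum / (2 * n_called), 0.0)
    maf = np.minimum(freq, 1 - freq)
    return (call_rate >= min_call_rate) & (maf >= min_maf)
```

In practice such filters are usually applied with dedicated GWAS software, and sample-level checks are needed as well. The sketch only shows the logic of two common variant-level filters.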

Statistical models for genetic association analysis
In order to reach valid conclusions based on genetic data, knowledge of statistical methodology is essential. A study has to be planned with the final analysis in mind, including power and sample size issues.
On the third day, Dr. Inga Prokopenko gave a lecture on “Association analysis”. In the afternoon, Prof. Andrew Morris gave a lecture on “Population structure”, followed by a computer workshop (Practicals).

Association analysis
The short-term goal is to identify genetic variants that explain differences in phenotype among individuals in a study population. Phenotypes can be qualitative (e.g. disease status, presence/absence of a congenital defect) or quantitative (e.g. blood glucose levels, % body fat). If an association is found, further study can follow to understand the mechanism of action and disease etiology in individuals and to characterize the relevance and/or impact in the more general population. The long-term goal is to inform the process of identifying and delivering better prevention and treatment strategies.
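For a quantitative phenotype, the simplest form of such a test is a linear regression of the trait on the number of copies of an allele. The sketch below is a minimal example under that assumption. The function name `single_variant_test` is hypothetical, covariates are left out, and it is not the exact model used in the practicals.

```python
import numpy as np

def single_variant_test(dosage, phenotype):
    """Regress a quantitative phenotype on allele dosage (0/1/2).

    Returns the estimated per-allele effect (beta), its standard
    error and the t statistic. Covariates are omitted for brevity.
    """
    x = np.asarray(dosage, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    n = x.size
    xc = x - x.mean()
    sxx = np.sum(xc ** 2)
    # Ordinary least squares slope and intercept
    beta = np.sum(xc * (y - y.mean())) / sxx
    intercept = y.mean() - beta * x.mean()
    # Residual variance with n - 2 degrees of freedom
    resid = y - (intercept + beta * x)
    sigma2 = np.sum(resid ** 2) / (n - 2)
    se = np.sqrt(sigma2 / sxx)
    return beta, se, beta / se
```

A GWAS repeats a test of this kind for every variant, which is why multiple testing and quality control matter so much.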

Population Structure
Population structure can lead to spurious associations if disease prevalence and allele frequencies vary between subpopulations. We can use information from markers scattered throughout the genome to test for the presence of structure, to identify groups of individuals with similar ancestry, and to correct association tests for mismatching of cases and controls. The genomic control inflation factor can be used as an indicator of the presence of population structure. Principal components analysis (PCA) can be used to calculate axes of genetic variation that maximize the variability between individuals. Linear mixed models can be used to account for population structure and relatedness in a computationally efficient way.
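The genomic control inflation factor is commonly computed as the median of the observed 1-df chi-square association statistics divided by the median expected under the null hypothesis (about 0.455). A minimal sketch, assuming the test statistics are available as z-scores and using a hypothetical function name:

```python
import numpy as np

# Median of a chi-square distribution with 1 degree of freedom
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation_factor(z_scores):
    """Median observed chi-square divided by its expected median under the null."""
    chi2 = np.asarray(z_scores, dtype=float) ** 2
    return float(np.median(chi2) / CHI2_1DF_MEDIAN)
```

Values close to 1 suggest little inflation, while values clearly above 1 can point to population structure or other confounding.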

On the fourth day, Dr. Inga Prokopenko talked about “Imputation of GWAS”. In the afternoon, Prof. Andrew Morris gave a lecture on “Meta-analysis of GWAS”, followed by a computer workshop (Practicals).

Imputation of GWAS
Imputation allows prediction of genotypes at variants that were not typed but are present in a high-density reference panel. Imputation enables meta-analysis across studies typed with different genotyping arrays, increased power, and improved fine-mapping resolution. The quality of imputation depends on the genotype scaffold, the choice of reference panel, and the minor allele frequency.

Meta-analysis of GWAS summary statistics allows an increase in power to detect association without a direct exchange of genotype (and relevant phenotype) data. Fixed-effects meta-analysis assumes homogeneous allelic effects across studies, so it is important to assess the evidence for heterogeneity. Software designed for GWAS meta-analysis offers features such as strand alignment, genomic control correction, and SNP filtering. Trans-ethnic meta-analysis can improve fine-mapping resolution.
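For a single variant, a fixed-effects meta-analysis can be sketched as an inverse-variance weighted average of the per-study effect estimates, with Cochran's Q as a simple measure of heterogeneity. The function name and the input format (one effect estimate and one standard error per study, all aligned to the same allele) are assumptions for this example.

```python
import numpy as np

def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted fixed-effects meta-analysis for one variant.

    betas, ses: per-study effect estimates and standard errors,
    assumed to refer to the same effect allele in every study.
    Returns the pooled effect, its standard error and Cochran's Q
    (chi-square with k - 1 degrees of freedom under homogeneity).
    """
    betas = np.asarray(betas, dtype=float)
    weights = 1.0 / np.asarray(ses, dtype=float) ** 2
    pooled = np.sum(weights * betas) / np.sum(weights)
    pooled_se = np.sqrt(1.0 / np.sum(weights))
    q = np.sum(weights * (betas - pooled) ** 2)
    return float(pooled), float(pooled_se), float(q)
```

Only summary statistics enter the calculation, which is why no genotype data has to be exchanged between studies.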
On the fifth day, Dr. Krista Fischer gave a lecture on “Genetic risk scores, Mendelian randomization”. In the afternoon, Prof. Andrew Morris gave a lecture on “Analysis of rare variants”, followed by a computer workshop (Practicals).
Large prospective biobank cohorts make it possible to move towards personalized risk prediction, but there are many statistical challenges on this road. Polygenic risk scores are increasingly popular, but there is no unified approach to calculating them efficiently. One should be aware of the choice of polygenic predictor, sample selection, ancestry issues, and modeling assumptions. Studies based on Mendelian randomization are also increasingly popular and become more feasible as sample sizes increase. Mendelian randomization is helpful, but it is not a miracle tool for obtaining unconfounded estimates, and one should be aware of potential bias due to pleiotropy.
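In its most basic form, a polygenic risk score is a weighted sum of allele dosages, with weights typically taken from GWAS effect estimates. The sketch below shows only that basic form. The function name is hypothetical, and variant selection, weight shrinkage and ancestry adjustment, where much of the difficulty lies, are deliberately left out.

```python
import numpy as np

def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages: score_i = sum_j weights_j * dosages_ij.

    dosages: 2-D array, rows = individuals, columns = variants (0..2).
    weights: per-variant effect sizes for the counted allele (assumed given).
    """
    g = np.asarray(dosages, dtype=float)
    w = np.asarray(weights, dtype=float)
    if g.ndim != 2 or g.shape[1] != w.shape[0]:
        raise ValueError("need one weight per variant (column)")
    return g @ w
```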
Rare genetic variants may account for a proportion of the “missing heritability” of complex human traits. Statistical methods focus on the accumulation of minor alleles at rare variants (mutational load) within the same “genomic unit”. The most powerful rare variant test will depend on the underlying genetic architecture of the trait. Rare variants can be captured through re-sequencing, exome array genotyping, and imputation into GWAS scaffolds. Novel discoveries about the genetic basis of complex human traits are emerging through the analysis of rare variants.
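The idea of mutational load can be sketched as a per-individual count of minor alleles across the rare variants of one genomic unit, for example a gene. This is a minimal burden-style example: the function name, the 1% threshold and the coding of genotypes as minor allele counts are assumptions for illustration.

```python
import numpy as np

def burden_scores(genotypes, maf, maf_threshold=0.01):
    """Count minor alleles carried at rare variants within one genomic unit.

    genotypes: 2-D array, rows = individuals, columns = variants in the
    unit, coded as minor allele counts (0/1/2, assumed coding).
    maf: minor allele frequency of each variant.
    """
    g = np.asarray(genotypes, dtype=float)
    rare = np.asarray(maf, dtype=float) < maf_threshold
    # Sum minor alleles over the rare variants only
    return g[:, rare].sum(axis=1)
```

The resulting load can then be tested for association with the trait in place of each variant individually. Whether such a simple count is the most powerful choice depends on the genetic architecture of the trait.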
There were three invited lectures during the course: the first on “Genetics of arterial blood pressure: from common to rare variants”, the second on “Systems genetics of major depression and other stress-related phenotypes”, and the last on “Mendelian randomization to reveal biomarkers, obesity subtypes and boost GWAS power”. These lectures gave us insight into practical applications of GWAS.
