International collaborations are crucial for reliable identification and characterisation of genetic determinants of complex diseases and traits. Since 2017 CKB has contributed data from genetic analyses to more than 50 collaborations including some of the largest international consortia such as Genetic Investigation of ANthropometric Traits (GIANT), Diabetes Meta-analysis of Trans-ethnic Association Studies (DIAMANTE), and the Reproductive Genetics Consortium (ReproGen).
Most genome-wide association studies (GWAS) to date have been conducted mostly or exclusively in populations of European ancestry. Many additional discoveries about the underlying biological pathways that influence phenotypes or disease risk can be made through greater inclusion of other ancestries, including East Asians. In addition to GWAS conducted primarily in CKB (e.g. of lung function) (Eur Respir J. 2021), we have contributed data to several major collaborative studies focussed specifically on populations of East Asian ancestry. We made the largest single contribution to a GWAS of depression in East Asians (JAMA Psychiatry. 2021), which identified five novel associations and demonstrated substantial differences between East Asians and Europeans in the genetic architecture of depression. CKB also played a major role in a GWAS of type 2 diabetes in 433,540 individuals, the largest East Asian GWAS to date (Nature. 2020), which identified 61 new diabetes associations and suggested that the relative importance of the (shared) biological pathways leading to diabetes may differ between ancestries.
CKB has contributed to many high-profile trans-ancestry GWAS meta-analyses, including studies of:
- intracranial aneurysm, more than doubling the number of known associations (Nat Genet. 2020);
- recurrent miscarriage, identifying for the first time robust associations contributing to pregnancy loss, at loci with potential roles in placental biology (Nat Commun. 2020);
- fingerprints, identifying key roles for limb development genes in influencing fingerprint patterning (Cell. 2022);
- height, a GWAS of 5.4 million individuals (by far the largest sample size to date) identifying more than 12,000 independent associations that account for nearly all of the genetic basis for variation in height (bioRxiv. 2022);
- circulating blood lipids, identifying more than 1,700 distinct genetic associations (Nature. 2021).
Such studies demonstrate that increasing the ancestral diversity of GWAS is key to improving both the identification of underlying causal variants and the construction of ‘portable’ polygenic scores for risk prediction.
Replication of associations
Given the extremely large number of statistical tests involved and the modest effect size of the resulting genetic associations, GWAS results need to be replicated in independent studies to ensure that they are robust and reproducible. CKB has contributed replication analyses to multiple studies for this purpose.
For a GWAS of blood pressure (Nat Commun. 2018), CKB provided replication of 19 new genetic loci, several of which were ancestry-specific associations. Similarly, for a GWAS of age at menopause in approximately 200,000 women of European ancestry (Nature. 2021), CKB provided replication results for associations at 290 loci, 234 of which were novel; we demonstrated broad replication of these results in East Asian women, indicating that these same factors are important for female reproductive longevity in Chinese women. Conversely, for a GWAS of early-onset ischaemic stroke which identified three associations (one of which was novel) (Neurology. 2022; in press), none of these loci had association signals in CKB, despite the strength of the original associations; this is consistent with previous observations of substantial differences in the genetic architecture of stroke between ancestries.
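The replication logic described in this section can be illustrated with a minimal sketch. It assumes a simple criterion (same direction of effect, plus a Bonferroni-corrected p-value in the independent cohort); the studies cited above may have used different criteria. The class, function names, variants, and numbers are all hypothetical and are not CKB data.

```python
from dataclasses import dataclass


@dataclass
class Association:
    variant: str
    discovery_beta: float    # effect estimate in the discovery GWAS
    replication_beta: float  # effect estimate in the independent cohort
    replication_p: float     # two-sided p-value in the independent cohort


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def replicates(a: Association, threshold: float) -> bool:
    """Assumed criterion: same direction of effect and p below the threshold."""
    same_direction = a.discovery_beta * a.replication_beta > 0
    return same_direction and a.replication_p < threshold


# Roughly one million independent tests gives the conventional
# genome-wide significance threshold of about 5e-8.
genome_wide = bonferroni_threshold(0.05, 1_000_000)

# Hypothetical discovery hits taken forward for replication.
candidates = [
    Association("variant_A", 0.12, 0.10, 0.001),
    Association("variant_B", 0.08, -0.02, 0.60),   # opposite direction
    Association("variant_C", -0.15, -0.09, 0.04),  # misses corrected threshold
]

# Only the candidates are tested at this stage, so the correction is far
# less severe than in the discovery scan.
replication_threshold = bonferroni_threshold(0.05, len(candidates))
replicated = [a.variant for a in candidates if replicates(a, replication_threshold)]
```

Because the replication stage tests only a handful of variants, it can confirm associations of modest effect size that would not reach genome-wide significance in the replication cohort alone.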
Polygenic scores (PSs) are derived from a combination of many individual genetic variants associated with a phenotype or disease, and can improve risk predictions over and above conventional models. However, scores derived from European populations often do not perform well for individuals of other ancestries, so it is important to assess whether they can usefully be applied in different populations. In two separate studies, CKB tested the performance of PSs for lung function in predicting chronic obstructive pulmonary disease (COPD) (Nat Genet. 2017, Nat Genet. 2019); we found that, although they performed less well than in Europeans, these scores nevertheless strongly predicted COPD in East Asians. Similarly, CKB showed that a PS derived from GWAS of bone mineral density in Europeans successfully predicted incident major osteoporotic fracture in Chinese adults (Genome Med. 2021). CKB has also assessed PSs derived from GWAS of East Asian populations: we showed for the first time that such scores can be effectively used to discriminate sub-populations at high risk of lung cancer (Lancet Respir Med. 2019), and that PSs show good transferability across ancestral subgroups for predicting breast cancer (Genet Med. 2021).
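As a rough illustration of how a polygenic score is built and assessed, the sketch below computes a score as a weighted sum of effect-allele dosages and measures how well it separates cases from controls using the area under the ROC curve. The weights, genotypes, and function names are hypothetical, and the cited studies may have used other scoring and evaluation methods.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0 to 2) using per-variant weights."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count half)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical per-allele effect sizes, e.g. taken from a GWAS in another population.
weights = [0.20, -0.10, 0.05]

# Hypothetical genotypes: one list of dosages per individual.
cases = [[2, 0, 1], [1, 0, 2], [2, 1, 2]]
controls = [[0, 2, 1], [2, 1, 1], [1, 2, 0]]

case_scores = [polygenic_score(g, weights) for g in cases]
control_scores = [polygenic_score(g, weights) for g in controls]

# 0.5 would mean no discrimination; 1.0 would mean perfect separation.
discrimination = auc(case_scores, control_scores)
```

Computing the same discrimination measure with identical weights in two populations is one simple way to quantify how well a score transfers between ancestries.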
CKB has contributed to the development of new methods to support future genomics research. We assisted in the development of methods for genetic association analyses using very-low-coverage whole genome sequencing from non-invasive prenatal testing (Cell. 2018), and of trans-ancestry colocalisation methods to assess whether two populations share causal variants (Nat Commun. 2019). CKB has also played a major part in the development of the Multi-Ancestry Meta-Analysis (MAMA) method for combining GWAS summary statistics from multiple populations (bioRxiv. 2021). This method provides substantially increased power when analysing data from East Asian and European ancestry populations, and enables the identification of novel loci that are not discovered by existing methods.
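For context on what combining summary statistics involves, the sketch below shows standard fixed-effect inverse-variance weighting for a single variant. This is not the MAMA method and does not attempt to reproduce it: it assumes one shared effect size across populations, which is the kind of assumption that trans-ancestry methods are designed to relax. The function name and input values are hypothetical.

```python
import math


def fixed_effect_meta(estimates):
    """Combine (beta, standard_error) pairs by inverse-variance weighting."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


# Hypothetical per-ancestry estimates for one variant: (beta, standard error).
east_asian = (0.10, 0.04)
european = (0.06, 0.02)

beta, se = fixed_effect_meta([east_asian, european])

# The pooled standard error is smaller than either input, which is where
# the gain in statistical power comes from.
z = beta / se
```

The more precise estimate receives the larger weight, so the pooled effect sits closer to the estimate with the smaller standard error.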
Impact of research
CKB’s collaborations illustrate our commitment to sharing data with researchers from around the world, to address specific scientific questions and to enhance collaboration, capacity development, and knowledge transfer. Members of the CKB team have become established as important participants in international consortia, for instance as part of the Global Biobank Meta-analysis Initiative (medRxiv. 2021), and as convenors of the GIANT Trans-Ancestry Meta-Analysis working group.
As CKB genomics resources are developed and strengthened, for instance through whole genome sequencing and improved imputation, we will broaden and deepen these collaborative efforts, with the aim of improving prediction, prevention, and treatment of major chronic diseases worldwide. In addition to many well-established, ongoing collaborations, such as CKB-led projects on adiposity and blood pressure, we continue to seek new projects. Recent initiatives include joining the PGS Catalog Project (Nat Genet. 2021), a new trans-ancestry meta-analysis within the CARDIoGRAMplusC4D consortium, and collaborations with industry partners to identify, investigate, and validate potential drug targets.