The wealth of data available in the CKB on lifestyle, behavioural and environmental exposures, genetic and other blood-based factors, and the long-term follow-up for disease outcomes, provides a foundation for extensive research into the risk factors and causes of a wide range of chronic diseases. To maximise the scientific benefits of this resource, the CKB is developing collaborative research programmes with other research groups in China, the UK and internationally. For information on new collaborations or other forms of data access, please see the Data Access Policy.

See below for details of ongoing/planned collaborative research.

Genetic and metabolomic analysis of haemorrhagic stroke in adult Chinese

Key collaborators: Peter Donnelly (University of Oxford), Mika Ala-Korpela (Universities of Bristol & Oulu), Cathie Sudlow (University of Edinburgh)

Stroke is the second leading cause of death globally, with much less known about its genetic and non-genetic determinants than for other major diseases, especially for intracerebral haemorrhage (ICH). In the UK, although ICH accounts for only 10-15% of stroke cases, it causes more than half of all stroke deaths before age 70, and much severe disability. In China, about 30% of incident cases of stroke are due to ICH and almost all suspected cases of stroke have CT/MRI after admission to hospital, allowing reliable determination of stroke subtypes. Twin studies and other family studies indicate a substantial genetic component to stroke risk, but there are to date no reports of well-powered GWAS of ICH.

We will investigate the genetic contribution to ICH in a case-control study nested within the CKB study of 5,000 scan-confirmed incident cases of ICH and 10,000 controls. We will use an 800K SNP Affymetrix array, custom-designed to maximise whole-genome coverage in the Chinese population. The 10,000 control subjects will also form a set of “common controls” for genotyping studies of other diseases (e.g. ischaemic stroke). We will also conduct NMR measurements of metabolomics markers in plasma, as well as standard blood biomarker assays. Using systems biology approaches to combine these ‘-omics’ data with detailed environmental and lifestyle information, we aim ultimately to understand the molecular basis of haemorrhagic stroke.

Using genetic tools to assist with drug development and drug safety evaluation

Key collaborators: Lon Cardon, Dawn Waterworth & Stephanie Chissoe (GSK)

Levels of lipoprotein-associated phospholipase A2 (Lp-PLA2), an inflammatory enzyme expressed in atherosclerotic plaque, are associated with cardiovascular disease in Western populations. A drug (darapladib) has been developed to lower Lp-PLA2 activity, and is currently in phase III trials for secondary prevention of cardiovascular disease. To help assess the long-term safety of such drugs, the effects of genetic variants in PLA2G7, the gene encoding the Lp-PLA2 protein, can be studied, using the wide range of disease outcome data available from the CKB follow-up. The SNP PLA2G7 835G>T (rs76863441) encodes a null variant V279F, which results in inactive Lp-PLA2. This SNP is, however, found only in Chinese and other East Asian populations, not in Western populations.

Associations of PLA2G7 V279F with a range of disease outcomes among 100,000 CKB participants will be explored, using a phenome-wide approach. Confirmation of any significant associations will be sought in further CKB samples. The study findings should not only provide important evidence on both the efficacy and the safety of blocking this pathway but also help inform strategies for future drug development.

Helicobacter pylori and cancer of the stomach and oesophagus in adult Chinese

Key collaborators: Silvia Franceschi, Catherine de Martel, Martyn Plummer (Infections and Cancer Epidemiology Group, International Agency for Research on Cancer, France)

Gastric cancer is a leading cause of cancer mortality in China. Helicobacter pylori infection (Hp) is the strongest risk factor for non-cardia gastric cancer, whereas cardia gastric cancer, accounting for approximately 16% of gastric cancer in China, is not associated with Hp and may even be associated with its absence. Oesophageal cancer is also a leading cause of cancer mortality in China. Gastric Hp may be associated with oesophageal adenocarcinoma through its role in gastro-oesophageal reflux disease (GERD) and Barrett’s oesophagus.

We plan to study the relationship between Hp and different types of cancer of the stomach and oesophagus in the CKB using a nested case-control or case-cohort design. Hp antibodies will be assessed using a Western blot test (Helicoblot 2.1; Genelabs Diagnostics, Singapore) with higher sensitivity than previously used ELISA antibody tests. This test also distinguishes between strains according to their expression of the CagA protein, a marker of increased Hp virulence. By 1/1/2013, there were 2000 cases of gastric cancer (including ~1700 non-cardia cases) and 1500 cases of oesophageal cancer (including ~400 adenocarcinomas). This will be the largest prospective study of gastric cancer and the first of oesophageal cancer to use a highly sensitive Western blot assay for Hp detection. The combined effect of Hp and lifestyle risk factors (e.g. smoking, obesity, and diet) will also be explored.

Assessing the relevance of novel biomarkers for metabolic disorders using mass spectrometry

Key collaborators: Xu Lin, Huiyong Ying and Yan Chen (Institute of Life Science, Chinese Academy of Science [CAS], Shanghai, China); Jiarui Wu and Rong Zen (Key Laboratory of Systems Biology, CAS)

Several nutrients are known to play important roles in the development of many common diseases. Plant polyphenols and polyunsaturated fatty acids (including n-3 fatty acids) have been associated with lower risk of metabolic disorders, whereas trans fatty acids from industrial processed foods may have adverse effects. Exposure to heavy metals (e.g. arsenic, cadmium, mercury) may also have measurable health effects. Reliable assessment of nutrient intake based on dietary questionnaires is hampered by recall bias and limited food composition data, particularly in China. Mass spectrometry (MS) analysis of blood and urine samples collected in the CKB provides an opportunity for objective measurements of fatty acids and other known and novel biomarkers related to dietary intake and health.

We plan initially to measure fatty acids in red cell membranes using gas chromatography-MS (GC-MS) and plant polyphenols in urine using liquid chromatography-MS (LC-MS) in samples collected in the 2nd CKB resurvey. Fatty acid profiles will be correlated with dietary patterns (traditional Chinese versus Western-style diet) and used to document the proportions of saturated and polyunsaturated fats and their sub-fractions. The polyphenol content of urine may reflect food sources as well as other factors, including sun exposure, storage, food preparation, and industrial processing. Further discovery and targeted multi-omic strategies based on MS will be employed to identify metabolites and other biomarkers (e.g. proteins and peptides) relevant for health. We also plan to measure exposure to heavy metals in urine samples using inductively coupled plasma-MS (ICP-MS). The relationship between biomarker data generated by MS and dietary intake and health-related data will be explored.

Assessing delivery and utilisation of effective health services in urban and rural China

Key collaborators: Winnie Yip (Blavatnik School of Government, University of Oxford), Wen Chen (Fudan University, China)

In 2005-6 the Chinese government launched a national health care reform program, and by 2012, with substantial government subsidies, health insurance coverage had reached 95 percent of the population. However, there is little empirical evidence as to what health services people are getting, whether these services are effective, efficient and equitable, how far actual treatment deviates from evidence-based health care, and what the potential causes of the gaps are. As a result, policy makers cannot make evidence-based decisions on how to achieve their policy goal of providing affordable, equitable and effective health care for all. The health insurance database from CKB provides a unique opportunity to address this evidence gap and we plan to conduct the following analyses:

  • quantify effective health care gaps, defined as differences between actual treatment and evidence-based health care (both over-utilisation and under-utilisation), focusing on a selected number of diseases based on their health expenditure share and mortality/morbidity burden
  • measure how health care gaps vary by region, patient socioeconomic status, types of facilities at which care is delivered (e.g. tertiary vs. secondary hospital; teaching vs. non-teaching hospital) and insurance program characteristics
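As a rough illustration of the two analyses above, the sketch below computes under- and over-utilisation proportions within strata. The record layout, field names and values are hypothetical and are not CKB variables; a real analysis would define "recommended" care from disease-specific guidelines.

```python
from collections import defaultdict

# Hypothetical records: whether evidence-based guidelines recommend a
# treatment for the patient, and whether the patient actually received it.
# Field names and values are illustrative assumptions only.
records = [
    {"region": "urban", "recommended": True,  "received": True},
    {"region": "urban", "recommended": True,  "received": False},
    {"region": "urban", "recommended": False, "received": True},
    {"region": "urban", "recommended": False, "received": False},
    {"region": "rural", "recommended": True,  "received": False},
    {"region": "rural", "recommended": True,  "received": False},
    {"region": "rural", "recommended": True,  "received": True},
    {"region": "rural", "recommended": False, "received": False},
]

def care_gaps(rows, by):
    """Return under- and over-utilisation proportions for each stratum of `by`."""
    counts = defaultdict(lambda: {"rec": 0, "under": 0, "not_rec": 0, "over": 0})
    for row in rows:
        c = counts[row[by]]
        if row["recommended"]:
            c["rec"] += 1
            c["under"] += int(not row["received"])   # recommended but not given
        else:
            c["not_rec"] += 1
            c["over"] += int(row["received"])        # given but not recommended
    return {
        key: {
            "under_utilisation": c["under"] / c["rec"] if c["rec"] else None,
            "over_utilisation": c["over"] / c["not_rec"] if c["not_rec"] else None,
        }
        for key, c in counts.items()
    }

gaps = care_gaps(records, by="region")
```

The same function could be called with a different stratifying field (for example facility type or insurance scheme) if such a field were present in the records.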

Based on the findings, we hope to design policy interventions that improve effective and equitable service delivery. These will be designed as quasi-experiments to allow evaluation of their impacts. These studies of policy interventions will be conducted in collaboration with local governments and insurance programs, so that the findings can help inform their policy decisions and facilitate scaling up of any positive effects.

Epigenetic markers and risk of ischaemic heart disease in Chinese adults

Key collaborators: Jun Lv (Peking University, China) & Wei Chen (Tulane University, USA)

Ischaemic heart disease (IHD) is the largest single cause of death worldwide and the second most common cause of death in China. While IHD represents the joint effects of established cardiovascular risk factors and genetic factors, the role of epigenetic modification and the extent to which non-genetic factors (including lifestyle factors) influence DNA methylation and altered expression of susceptibility genes for IHD are largely unknown. As a proof-of-concept experiment, this project, partly funded by China Natural Science Foundation, aims to (1) examine the temporal relationship between cardiovascular risk factors and DNA methylation patterns; (2) assess the within-person variation in DNA methylation using serial samples collected on 3 separate occasions over an 8-year period; and (3) examine the associations of IHD with genome-wide and gene-specific DNA methylation patterns overall and, separately, by levels of established cardiovascular risk factors.

The proposed research will use a nested case-control study design. Incident IHD cases (n=250) will be selected from the CKB cohort and compared with a similar number of age-, sex- and region-matched controls, drawn from those attending the 1st or 2nd resurveys. Plasma levels of lipids, glucose and insulin, together with DNA methylation levels, will be measured in serial plasma and DNA samples collected at 3 separate time points (baseline, 1st and 2nd resurveys). DNA methylation levels (at 480,000 CpG sites covering the whole genome) will be determined using the Illumina HumanMethylation450 BeadChip. The proposed research will serve as a pilot study to investigate at a molecular level how environmental factors may influence gene expression through DNA methylation, and the role of epigenetic modification for risk of IHD.
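One common way to summarise within-person variation across repeated measurements is the intraclass correlation coefficient (ICC). The sketch below is an assumption about how such a summary might be computed for a single CpG site, using a one-way random-effects ICC; it is not the project's stated method, and the input values in the usage note are made up.

```python
def icc_oneway(measurements):
    """One-way random-effects ICC for equal numbers of repeats per person.

    measurements: list of per-person lists, e.g. methylation values at
    baseline, 1st resurvey and 2nd resurvey (illustrative layout).
    Values near 1 mean little within-person variation relative to
    between-person variation.
    """
    n = len(measurements)
    k = len(measurements[0]) if n else 0
    if n < 2 or k < 2 or any(len(m) != k for m in measurements):
        raise ValueError("need >=2 people, each with the same number (>=2) of repeats")

    grand_mean = sum(sum(m) for m in measurements) / (n * k)
    person_means = [sum(m) / k for m in measurements]

    # Mean squares from a one-way ANOVA with person as the grouping factor.
    ms_between = k * sum((pm - grand_mean) ** 2 for pm in person_means) / (n - 1)
    ms_within = sum(
        (x - pm) ** 2 for m, pm in zip(measurements, person_means) for x in m
    ) / (n * (k - 1))

    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("all measurements are identical; ICC is undefined")
    return (ms_between - ms_within) / denominator
```

For example, `icc_oneway([[0.2, 0.2, 0.2], [0.5, 0.5, 0.5], [0.8, 0.8, 0.8]])` is essentially 1, because each person's three values are identical while people differ from one another.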

Susceptibility loci for ischaemic heart disease and ischaemic stroke in Chinese adults

Key collaborators: Zhibin Hu (Nanjing Medical University, China)

Ischaemic heart disease (IHD) and ischaemic stroke (IS) are both caused by a combination of genetic and environmental factors. While genome-wide association studies (GWAS) have identified several susceptibility loci for IHD in Chinese populations, the genetic determinants of IS in the Chinese population remain to be discovered. The proposed research, partly funded by China Natural Science Foundation, will use a two-stage GWAS to identify IS-related loci using a nested case-control study design (screening phase involving 2000 cases and 2000 controls and replication phase involving 5000 cases and 5000 controls). Furthermore, the project will fine-map the identified regions associated with IS in the Chinese population through target capture and sequencing, identify the genetic markers and estimate their effect sizes for IS. In addition, loci identified for IS will be examined for their association with IHD in the Chinese population.

Genetic and lifestyle factors and risk prediction of ischaemic heart disease and ischaemic stroke in Chinese adults

Key collaborators: Tangchun Wu (Huazhong University of Science and Technology, China)

Both ischaemic heart disease (IHD) and ischaemic stroke (IS) are complex diseases caused by the interplay of lifestyle, environmental and genetic factors. However, little is known about the interactions of genetic and non-genetic factors for prediction of risk of IHD and IS, especially in China, where dietary habits, lifestyle and the genetic structure of the population differ importantly from those in Western populations. The proposed research, partly funded by a grant from China Natural Science Foundation, will explore the aetiology of IHD and IS (genetic risk factors, traditional and potential environmental risk factors), with particular focus on the effects of gene-environment interactions. The study will use Cox proportional hazards models to construct predictive models for cardiovascular disease that are suitable for the Chinese population. The research may provide the basis for identification of novel strategies for the prevention of IHD and IS in Chinese adults.
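For orientation, a Cox proportional hazards model with a gene-environment interaction term is commonly written as below. This is a generic textbook form and not the project's specified model; the symbols G (genetic factor), E (environmental or lifestyle factor) and Z (other covariates) are illustrative assumptions.

```latex
h(t \mid G, E, \mathbf{Z}) = h_0(t)\,
  \exp\!\left(\beta_G G + \beta_E E + \beta_{GE}\, G E
  + \boldsymbol{\gamma}^{\top}\mathbf{Z}\right)
```

Here h_0(t) is the unspecified baseline hazard, and exp(β_GE) measures how far the joint effect of G and E departs from the product of their separate hazard ratios, that is, interaction on the multiplicative scale.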

Early diagnosis of pancreatic cancer using novel blood biomarkers

Key collaborators: Eric O’Neill (Gray Institute, University of Oxford) & Carl Borrebaeck (Lund University)

Pancreatic cancer has a poor prognosis, with a 5-year survival rate of just 3-4%. Blood-based screening tests for early diagnosis before the onset of clinical symptoms would have significant benefit in improving clinical management and long-term survival of patients. The CKB has a collection of blood samples taken from people without any prior cancer, which can be used to validate specific blood tests for early diagnosis. By 1/1/2013, there were over 400 incident cases of pancreatic cancer. Plasma from individuals with their cancer diagnosed at 1, 2, 3, 4, and 5 years after the initial sample collection will be tested and compared with matched controls, using three novel blood-based biomarker techniques simultaneously, as a proof-of-concept experiment.

  1. Gene mutations and methylation in circulating cell-free plasma DNA (cpDNA): Apoptosis and necrosis associated with neoplastic growth lead to an elevation of cpDNA in tumour patients. cpDNA will be isolated and tested for specific oncogene and tumour suppressor gene mutation and methylation patterns, by sequencing and CpG island detection.
  2. Signatures of microRNA (miRNA): miRNAs are often deregulated in cancer and miRNA expression patterns may represent diagnostic and prognostic tools for pancreatic cancer, as well as potential therapeutic targets. miRNA signatures identified using microarrays will be validated using targeted RT-PCR.
  3. Protein signatures using a recombinant antibody proteomics microarray: A multiplexed antibody microarray has been developed which indicates that a signature of about 20 serum proteins can distinguish patients with pancreatic cancer from healthy individuals with over 95% accuracy. Specific serum protein patterns associated with asymptomatic pancreatic cancer will be identified and validated.

The techniques developed and validated for pancreatic cancer will also be used for subsequent investigation of lung and other cancers.

Assessing short- and long-term health effects of ambient air pollution in China

Key collaborators: Haidong Kang & Binghen Chen (Dept. of Environmental Health, Fudan University, China)

Exposure to ambient air pollution can lead to increased risk of a range of acute conditions (e.g. asthma) and chronic diseases (e.g. lung cancer, IHD, COPD). With rapid economic development and urbanisation, ambient air pollution has increased significantly in China over the last few decades. The Global Burden of Disease study estimated that ambient air pollution accounted for over 1 million deaths in China in 2010, but this was mainly based on extrapolation of data from ecological studies and modelling using risk estimates from the West. Through linkage with national air pollution monitoring data, the CKB can help address this evidence gap.

Three different models will be used to estimate levels of exposure to ambient air pollution among CKB participants. These include (i) spatial interpolation: models will be applied to estimate specific pollutant concentrations at participants’ home addresses based on monitoring results of the China National Environmental Monitoring Center; (ii) satellite-derived aerosol optical density measurements: NASA satellite remote sensing images will be used to extract spatial density measurements, and the concentration of PM2.5 and other pollutants at the addresses within each geographically-defined grid will be estimated after a correction using humidity and surface measurements; and (iii) land-use regression models: land-use information for each city (e.g. land use status, roads, population density, weather conditions, geography) will be used to construct a multiple linear regression equation and to predict individual-level air pollutant concentrations. These estimates of exposure to air pollution among CKB participants will then be linked with health outcomes to assess, both qualitatively and quantitatively, the short- and long-term effects of ambient air pollution.
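To make approach (i) concrete, the sketch below uses inverse-distance weighting (IDW), one simple form of spatial interpolation. The choice of IDW, the coordinate system (kilometres on a projected grid), the `power` parameter and the monitor values are all assumptions for illustration; the text does not say which interpolation model will be used.

```python
import math

def idw_estimate(target, monitors, power=2.0):
    """Inverse-distance-weighted pollutant concentration at a target point.

    target:   (x, y) coordinates of a home address (assumed km, projected grid).
    monitors: list of ((x, y), concentration) pairs from monitoring stations.
    power:    distance exponent; larger values give nearer monitors more weight.
    """
    if not monitors:
        raise ValueError("at least one monitor is required")

    weighted_sum = 0.0
    weight_total = 0.0
    for (mx, my), concentration in monitors:
        distance = math.hypot(target[0] - mx, target[1] - my)
        if distance == 0.0:
            # The address coincides with a monitor: use its reading directly.
            return concentration
        weight = 1.0 / distance ** power
        weighted_sum += weight * concentration
        weight_total += weight
    return weighted_sum / weight_total

# Two hypothetical monitors 10 km apart with made-up PM2.5 readings.
monitors = [((0.0, 0.0), 60.0), ((10.0, 0.0), 90.0)]
midpoint = idw_estimate((5.0, 0.0), monitors)
```

At a point equidistant from both monitors the two weights are equal, so the estimate is the simple average of the two readings.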