Twin

TwinsUK is the largest cohort of community dwelling adult twins in the UK. The registry comprises over 14,000 volunteer twins (14,838 including mixed, single and triplets); it is predominantly female (82%) and middle aged (mean age 59). In addition, over 1,800 parents and siblings of twins are registered volunteers. During the last 25 years, TwinsUK has collected numerous questionnaire responses, physical/cognitive measures and biological measures on over 8,500 subjects. Data were collected alongside four comprehensive phenotyping clinical visits to the Department of Twin Research and Genetic Epidemiology, King’s College London. Such collection methods have resulted in very detailed longitudinal clinical, biochemical, behavioural, dietary and socio-economic cohort characterisation, providing a multidisciplinary platform for the study of complex disease during the adult life-course, including the process of healthy ageing. The major strength of TwinsUK is the availability of several ‘omic’ technologies for a range of sample types from participants, including genome-wide scans of single nucleotide variants, next-generation sequencing, metabolomic profiles, microbiomics, exome sequencing, epigenetic markers, gene expression arrays, RNA sequencing and telomere length measures. TwinsUK facilitates and actively encourages sharing the ‘TwinsUK’ resource with the scientific community; interested researchers may request data via the TwinsUK website (http://twinsuk.ac.uk/resources-for-researchers/access-our-data/) for their own use or future collaboration with the study team. In addition, further cohort data collection is planned via the Wellcome Open Research gateway (https://wellcomeopenresearch.org/gateways). The current article presents an up-to-date report on the application of technological advances, new study procedures in the cohort and the future direction of TwinsUK.


Introduction
TwinsUK is the largest adult twin registry in the UK, and is one of the most deeply phenotyped and genotyped datasets in the world. It provides a multidisciplinary platform to research both health and social related questions, with the overarching aim of understanding the aetiology of complex disease and the ageing process. The registry was started in 1992, with the initial intention to investigate osteoporosis and osteoarthritis. Such conditions are highly prevalent in women and consequently several hundred middle-aged women were recruited and formed the core of the initial register. Success from these early studies led to a rapid expansion of TwinsUK and to date the cohort consists of 14,000 community dwelling twins, male and female, aged over 18. Current research areas of interest include the genetics of metabolic syndrome, cardiovascular disease, the musculoskeletal system, sensory impairment and ageing, as well as how the microbiome affects human health. Details of the registry's progression have been described previously (Moayyeri et al., 2013; Spector & Williams, 2006). To date, the TwinsUK registry has contributed to over 850 publications and 800 international collaborations. A more detailed description of research outputs may be accessed through the study website: http://www.twinsuk.ac.uk

The Collection
Over the last 27 years the TwinsUK registry has been enhanced through over 80 studies, some of which have been repeated over time. This has resulted in clinically rich, longitudinal phenotype information (Table 2), which may be categorised into 4 distinct time points (Verdi et al., 2019). Recruitment strategies have predominantly involved media campaigns. These have offered opportunities for adult twin pairs to join the registry and participate in non-specific research investigating various common diseases, without selecting for particular diseases or traits. At baseline (1992-2004), over 7,000 twins responded to annual questionnaires and approximately 5,500 twins attended a full comprehensive clinical visit, which included several project-led studies. Age-matched characteristics of these volunteer twins were found not to differ from a singleton population-based cohort of British women (Chingford study) (Andrew et al., 2001), apart from a life-long lower weight in MZ twins of approximately 1 kg. The follow-up visit occurred between April 2004 and May 2007, in which 3,725 twins in the registry attended a 1-day clinical visit and an additional 1,299 twins posted blood taken via their GPs for DNA sampling. Participants were aged between 18 and 82 years (mean 52.5 ± 13 years) and the majority were female (89%). Protocols for the baseline and initial follow-up visit have been described previously (Spector & Williams, 2006).
The second wave of follow-up visits (August 2007-April 2012), the Healthy Ageing Twin Study (HATS), aimed to investigate the ageing process. Inclusion criteria were women aged ≥40 years with at least one previous clinical visit (n=4,610). In total, 3,125 women (mean age 59.6 ± 9 years) attended the clinical visit. Follow-up time between first and last visits ranged between 6.1 and 17.4 years, with over 600 of the participants having 4 or more previous clinical visits. HATS outcomes have previously been described (Moayyeri et al., 2012), including details of data collection (Moayyeri et al., 2013).
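As an illustration of how a per-participant follow-up time such as the 6.1 to 17.4 year range above can be derived, the following is a minimal sketch rather than the study's own procedure. It assumes each participant has a list of visit dates and that a year is approximated as 365.25 days; the function name and the example dates are hypothetical.

```python
from datetime import date


def follow_up_years(visit_dates):
    """Return the time in years between the earliest and latest visit.

    Assumption for illustration: one year is taken as 365.25 days.
    A participant with fewer than two visits has no follow-up time.
    """
    if len(visit_dates) < 2:
        return 0.0
    span_days = (max(visit_dates) - min(visit_dates)).days
    return span_days / 365.25


# Hypothetical visit dates for one participant (not real cohort data).
example_visits = [date(2000, 1, 1), date(2002, 6, 15), date(2004, 1, 1)]
example_span = follow_up_years(example_visits)
```

Only the first and last dates matter here, so intermediate visits can be supplied in any order.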
The third wave of follow-up visits (May 2012 to May 2018) was performed to understand the interactions in disease processes between genes and the environment, as part of the Biomedical Research Centre (BRC) study. All participants of the TwinsUK registry were invited to attend a comprehensive clinical visit, which included collection of bone density/whole body scan, cognitive and lung function, hearing and eye tests, fitness assessment (gait speed, chair stands and grip strength) and collection of blood, urine, stool and salivary samples. In total 6,686 clinical visits were made, with 3,620 volunteers attending at least once and 1,531 volunteers attending the clinic on two occasions with an average of 4 years between visits. In addition to the clinical visit, 6,300 questionnaires were returned, complementing clinical data collected during the visit. Since February 2019, a further wave of follow-up visits has commenced, which aims to continue the longitudinal data collection and adds further dynamic phenotyping and blood measurement over a six-hour visit incorporating standardised meals.

year of birth and sex)
1900-1909: 2 FF
1910-1919: 14 FF
1920-1929: 48 MM; 4 FM
1930-1939: 196 MM; 36 FM
1940-1949: 312 MM; 64 FM
1950-1959: 486 MM; 98 FM
1960-1969: 502 MM; 70 FM
1970-1979: 506 MM; 56 FM
1980-1989: 186 MM; 58 FM
1990-1999: 58 FM

Longitudinal data
Detailed clinical and biochemical phenotypes have been collected using harmonised protocols at each visit stage. A summary of a selection of clinical phenotypes is outlined in Table 2. In addition, questionnaire data have been collected on an annual basis and during visits, some of which measure incident clinical endpoints such as cardiovascular accident, type 2 diabetes and chronic obstructive pulmonary disease, which have previously been described (Verdi et al., 2019). Three main comprehensive questionnaires ('TwinsUK Baseline Health', 'Baseline Core' and 'Longitudinal Core') were collected between 2004 and 2018 (detailed in Table 3). These were in paper format, completed at the respondent's address and returned to the research facility. Over 2,500 participants completed all three main questionnaires and 2,300 completed any two of the main questionnaires. Furthermore, the demographic of the cohort provides an excellent resource to study ageing, where longitudinal changes are important to consider. Table 4 provides summaries of the key cognitive and frailty phenotypes we have acquired to explore questions in this area.
Alongside regular visits and questionnaires, TwinsUK has data linkage to official cancer and mortality data for retrospective analysis and future follow-up. Additional links between our own database and national health, education and environmental records are being established at present.

Genome-Wide Association Studies
TwinsUK has contributed to many international consortia for genome-wide association analysis of various phenotypes (Mills & Rahal, 2018). Genome-wide scan data using two chips (Illumina HumanHap300 BeadChip and Illumina HumanHap610 QuadChip) are available for 5,654 (both monozygotic and dizygotic) twins. The data have been fully imputed using the '1000 Genomes' and 'Haplotype Reference Consortium (HRC)' reference panels. TwinsUK is a member of many ongoing international consortia for meta-analysis of various traits such as height, BMI, lipids, obesity, blood pressure and back pain phenotypes. Some of the main publications from these collaborations can be found on the TwinsUK website. Our genome-wide data are also being used to compile Polygenic Risk Scores to isolate loci for various traits (Mills, Barban, & Tropf, 2018).
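For readers unfamiliar with Polygenic Risk Scores, the sketch below shows the usual additive form of such a score: a sum of effect-allele dosages weighted by per-variant effect sizes. It is an illustrative example only, not the pipeline used by TwinsUK; the function name, variant identifiers and weights are hypothetical, and steps such as variant selection, clumping and allele harmonisation are deliberately omitted.

```python
from typing import Mapping


def polygenic_risk_score(
    dosages: Mapping[str, float],
    weights: Mapping[str, float],
) -> float:
    """Additive score: sum of (effect size x effect-allele dosage).

    Only variants present in both mappings contribute; variants with a
    weight but no genotype (or the reverse) are skipped in this sketch.
    """
    shared_variants = dosages.keys() & weights.keys()
    return sum(weights[v] * dosages[v] for v in shared_variants)


# Hypothetical variant IDs, effect sizes and dosages (illustration only).
example_weights = {"rs_a": 0.5, "rs_b": -0.25, "rs_c": 0.125}
example_dosages = {"rs_a": 2.0, "rs_b": 1.0, "rs_d": 1.0}
example_score = polygenic_risk_score(example_dosages, example_weights)
```

Dosages are kept as floats so that imputed genotypes, which are fractional, can be used in the same way as directly typed ones.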

-Epigenetic markers
The first large-scale genome-wide epigenetic assessment in TwinsUK was performed on DNA methylation patterns profiled on the Illumina HumanMethylation27 BeadChip in whole blood samples from 172 female twins. This array examines 27,578 promoter CpG-sites that map uniquely across the genome, and some of these sites were found to be associated with age and age-related phenotypes (Bell et al., 2012). Subsequently, the Illumina Infinium HumanMethylation450 BeadChip was applied to blood samples from up to 1,000 additional MZ and DZ twins (Buil et al., 2015).

-Whole genome sequencing
Whole genome sequencing (WGS) of 2,000 healthy, deeply phenotyped twins formed part of the UK10K project, which used state-of-the-art next-generation sequencing methods to uncover rare genetic variants associated with health and disease. The data have been used extensively to describe population structure and functional annotation of rare and low-frequency variants (The UK10K Consortium et al., 2015); further details can be accessed at: www.uk10k.org. In addition, approximately 1,000 exome sequences at 30-60× depth have been ascertained as part of projects with the GoT2D consortium and Pfizer Inc. More recently, WGS at >30× coverage was carried out through collaboration with Human Longevity, Inc (HLI) for 2,377 individuals from the TwinsUK cohort.
DNA samples were sequenced on an Illumina HiSeqX sequencer using a 150-base paired-end single-index-read format. The data have been used to disentangle the contribution of rare variants to the blood metabolome (Long et al., 2017), and are now under investigation to identify rare variants associated with complex diseases and traits, and for the inference of structural variants.
(Long et al., 2017; Shin et al., 2014), as well as many health traits (see TwinsUK publications list: http://twinsuk.ac.uk/our-research/publications/).

Nightingale Health Ltd. (Helsinki, Finland; previously known as Brainshake Ltd) is a targeted NMR spectroscopy platform that has been extensively applied by us and others for biomarker profiling in epidemiological studies (Barrios et al., 2018). More recently, metabolomics profiling (Metabolon, Inc) has been conducted on faecal (n=1,016) and salivary samples (Nag et al., 2019).
-Glycans
Glycosylation is the most common form of post-translational protein modification and it is a putative mechanism in the modulation of the inflammatory response. The technology to assess glycosylation has recently become high throughput, and glycosylation of immunoglobulin G (IgG) has been measured in 4,900 twins, while N-glycans on human serum glycoproteins have been measured in 1,800. Using this, we have found that glycans are highly heritable (Menni et al., 2013b) and we have been the first to observe a number of associations between glycans and important age-related traits (

Dietary Phenotypes
TwinsUK has detailed datasets on dietary habits, which have been collected since the inception of the registry. The data vary and include dietary indices on >5,000 participants (e.g. Mediterranean Diet Score, Healthy Eating Index-2010 and the Healthy Food Diversity Index).
Dietary patterns, which are measured by category of foodstuff, have also been assessed through a food frequency questionnaire (FFQ) previously used in the EPIC Study (Bingham et al., 2008). For details of collection see Table 5.

Socioeconomic data
The historical research focus of TwinsUK has shaped the main demographic of the twin cohort, which has the middle socio-economic status and level of education typical of a volunteer group (Moayyeri et al., 2013; Steves et al., 2013). Socioeconomic status of the twin volunteers has been collected since the registry's inception through self-reported questions (e.g. highest educational qualification status).
More recently the Index of Multiple Deprivation has been compiled for all volunteers having UK postal codes, and data are to be linked to national databases for retrospective and future collection.
-Index of multiple deprivation

Future Directions and Collaborations
Longitudinal and detailed clinical, biochemical, behavioural, socio-economic and deep omics characterisation (including multi-tissue characterisation) of participants over nearly 30 years has provided a unique resource to study complex diseases and domains of healthy ageing in the TwinsUK population. These, in conjunction with novel dynamic testing at study visits and lifestyle intervention studies, offer a unique opportunity to explore personalised medicine. High quality data collection, database management, biological sample storage and statistical quality control enhance the resource. In addition, a key strength of the resource lies in the highly engaged and loyal population, as is evident from the high retention levels of participation across studies. Blood, urine, DNA and multiple tissue samples are available for future measurements. Online questionnaires and active engagement with our twin participants using text, email and social networking enable responsive and agile data collection. Our 'Volunteer Advisory Panel' is key to developing new strategy and governance of participants, informing decisions about the ethics, practicalities and appropriateness of potential studies.
The TwinsUK registry has a history of numerous successful scientific collaborations, and we remain committed to providing the scientific community with access to the phenotype data from the 'TwinsUK Resource'. TwinsUK has an exemplary record for data sharing, with over 800 data access requests, 150,000 samples shared with over 100 collaborators, and over 600 publications in the past six years. Detailed descriptions of data and samples for researchers are on the data access pages of the website (http://www.twinsuk.ac.uk/data-access/cohortdata-description/), where over 10,000 phenotypes can be searched. Longitudinal Population Studies funding from the Wellcome Trust continues to support the core functions of TwinsUK and opens up the resource to successful cross-cohort collaborations.
Over the next five years TwinsUK will integrate electronic health records into an enhanced deep tissue 'omics resource and continue to incorporate dynamic phenotypic testing into clinical visits. In addition, we will extend the age range of the registry to include volunteer twins from birth to adulthood, thus opening up the resource to the study of unique twin gene-environment interactions across the life-course.
New efficient broad consent will ensure that the communication with participating twins is ethical and proportionate. New annual sociological questionnaires will harmonise with ELSA (English Longitudinal Study of Ageing) and other LPS (1946/1958). We will also standardise mental health phenotypes with the complementary Twins Early Development Study (TEDS) such that, together, the TwinsUK and TEDS cohorts will be an unparalleled twin resource across the life-course. These developments will ensure TwinsUK will be a unique global resource of longitudinal omics and twin research across the life-course, with immense potential for future scientific exploitation.