Alzheimer’s disease is the most common form of dementia and predominantly affects people over 65 years old. Alzheimer’s disease is driven by multiple genetic and environmental influences, including complex interplay between genes. The Transformational Bioinformatics Group recently published BitEpi, a software tool to find the hidden gene-gene interactions in the genome. We have applied this tool to Alzheimer’s disease to uncover these hidden genetic interactions, highlighting novel drivers of disease.
Complex diseases, like Alzheimer’s disease or diabetes, occur when changes in multiple genes amplify each other and lead to disease. These are referred to as genetic interactions, or epistasis. Detecting epistatic interactions is inherently difficult because the number of possible combinations explodes once you look beyond pairs to triplets and quadruplets. Until now, a fast, exhaustive search of interactions between four genes was thought to be impossible.
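To give a feel for how quickly the search space grows, here is a minimal Python sketch that counts the distinct sets of two, three and four variants that an exhaustive search would need to test. The function name and the variant counts are illustrative assumptions, not figures from our analysis, and the sketch counts variants (SNPs) rather than whole genes.

```python
from math import comb


def search_space(n_snps, max_order=4):
    """Return {order: number of distinct SNP sets of that size}.

    An exhaustive epistasis search has to test every one of these sets.
    """
    return {order: comb(n_snps, order) for order in range(2, max_order + 1)}


# Illustrative (hypothetical) panel sizes, not taken from any real dataset.
for n_snps in (1_000, 100_000):
    for order, n_sets in search_space(n_snps).items():
        print(f"{n_snps:>7} SNPs, {order}-way: {n_sets:,} combinations")
```

Even for a modest panel of 1,000 variants, moving from pairs to quadruplets takes the number of tests from roughly half a million to more than 41 billion.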
Addressing this gap, our publication in Scientific Reports introduces BitEpi, a fast and accurate method to test all combinations of up to four interacting genes. BitEpi makes use of low-level operations, called bit operations, which are directly supported by a computer’s processor, making the search for interactions highly efficient. As a result, BitEpi is up to 56 times faster than other published methods.
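The general idea behind bit operations can be sketched in a few lines of Python. This is a simplified illustration of the generic bitwise technique, not BitEpi's actual encoding or implementation, which may differ. Here we assume each variant is stored as three bitmasks (one per genotype 0, 1 or 2) with one bit per sample, so that counting the samples that carry a particular genotype combination becomes a bitwise AND followed by a count of the set bits. All function names and the toy data are hypothetical.

```python
from itertools import product


def encode_snp(genotypes):
    """Encode per-sample genotypes (0, 1 or 2) as three bitmasks.

    Bit i of masks[g] is set when sample i has genotype g.
    """
    masks = [0, 0, 0]
    for sample_index, genotype in enumerate(genotypes):
        masks[genotype] |= 1 << sample_index
    return masks


def encode_group(flags):
    """Encode a per-sample membership flag (e.g. case status) as one bitmask."""
    return sum(1 << i for i, flag in enumerate(flags) if flag)


def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def contingency_table(snp_masks, group_mask):
    """Count group members for every genotype combination of the given SNPs."""
    table = {}
    for combo in product(range(3), repeat=len(snp_masks)):
        carriers = group_mask
        for masks, genotype in zip(snp_masks, combo):
            carriers &= masks[genotype]  # keep samples matching this genotype
        table[combo] = popcount(carriers)
    return table


# Toy data for six samples (made up for illustration).
snp_a = encode_snp([0, 1, 2, 1, 0, 2])
snp_b = encode_snp([1, 1, 0, 2, 0, 2])
cases = encode_group([1, 1, 0, 1, 0, 0])

case_table = contingency_table([snp_a, snp_b], cases)
print({combo: n for combo, n in case_table.items() if n})
```

The same function works for three or four variants by passing a longer list of encoded SNPs. An association statistic would then be computed by comparing the case and control tables.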
Finding the missing link in Alzheimer’s disease
Being the first to find 4-way interactions efficiently could mean a breakthrough for genome research. It may help resolve what scientists call ‘the missing heritability’: we know from other evidence, such as twin studies, that a disease is inherited, but we cannot find the underlying genetic cause. Epistatic interactions between three or more genes have been flagged as a potential cause.
Using BitEpi, we have discovered novel higher-order genetic interactions involved in the development of late-onset Alzheimer’s disease (LOAD). Like many other complex genetic diseases, LOAD has demonstrated missing heritability, with only 40 risk loci identified to date (Andrew et al, 2020), but between 100 and 1000 causal variants predicted to be associated with the disease (Holland et al, 2020). Drawing on the world’s largest genomic dataset, UK Biobank, we analysed 3513 patient samples with 5 million recorded genomic mutations, and were able to find interactions between known LOAD-associated genes as well as higher-order interactions between novel LOAD-associated genes. We identified 108 interactions between the known AD gene APOC1 and other gene variants, 42 of which were higher order (3 SNPs) (Figure 1). Genetic variants in SERPINB9P1 and NECTIN2 (also known LOAD genes) demonstrate epistasis with APOC1. Furthermore, a genetic variant in the known LOAD gene TOMM40 demonstrates a pairwise interaction with a long non-coding RNA called LINC0239321 (Figure 1), which was not previously known to be involved in LOAD. Recent studies have implicated roles for lncRNAs in AD aetiology (Li et al, 2021), making this gene a candidate for further investigation. We validated our primary analysis with a replication cohort from the Alzheimer’s Disease Neuroimaging Initiative (ADNI).
Identifying these significant interactions between novel and known LOAD-associated variants uncovers possible biological networks that are driving disease progression. Understanding how novel disease genes work together could reveal new drug targets and ultimately enable personalised medicine approaches in diseases such as Alzheimer’s disease.
Supporting genomic research
If you are interested in finding out more about epistasis, or have a dataset where you suspect epistatic interactions are involved, please get in touch.