BitEpi, a software tool to find hidden epistatic interactions in the genome
The Transformational Bioinformatics Group has published BitEpi, a software tool to find hidden gene-gene interactions in the genome. Complex diseases are driven by multiple genes, and understanding this interplay could be the holy grail of genome research.
Complex diseases, like diabetes or cardiovascular disease, occur when changes in multiple genes amplify each other and lead to disease. These are referred to as genetic interactions, or epistasis. Detecting epistatic interactions is inherently difficult because the number of possible combinations to evaluate explodes once you look beyond pairs to triplets and even quadruplets. A fast, exhaustive search of interactions between four genes was previously thought to be impossible.
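To give a feel for how quickly the search space grows, the short sketch below counts the number of distinct combinations that an exhaustive search would have to test for pairs, triplets and quadruplets. The panel sizes and the function name `interaction_counts` are illustrative assumptions, not figures or code from BitEpi.

```python
from math import comb


def interaction_counts(num_variants, max_order=4):
    """Return {order: number of distinct variant combinations of that order}."""
    return {k: comb(num_variants, k) for k in range(2, max_order + 1)}


# Hypothetical panel sizes, chosen only to illustrate the growth.
for n in (1_000, 10_000, 100_000):
    counts = interaction_counts(n)
    print(f"{n:>7} variants: "
          f"pairs={counts[2]:.2e}, "
          f"triplets={counts[3]:.2e}, "
          f"quadruplets={counts[4]:.2e}")
```

Even for a modest panel of 1,000 variants there are about 41 billion quadruplets to test, which is why the cost of evaluating each combination matters so much.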
Addressing this gap, our publication in Scientific Reports [1] introduces BitEpi, a fast and accurate method to test all combinations of up to four interacting genes. BitEpi makes use of low-level operations, called bitwise operations, which are directly supported by a computer’s processor, making the search for interactions highly efficient. As a result, BitEpi is up to 56 times faster than other published methods.
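The general idea of counting genotype combinations with bitwise operations can be sketched as follows. This is a generic illustration and not BitEpi’s actual data layout or algorithm: it assumes genotypes coded as 0, 1 or 2, stores one bit per sample in a Python integer, and uses the hypothetical helper names `build_masks`, `popcount` and `contingency_counts`.

```python
from itertools import product


def build_masks(genotypes):
    """Return three bitmasks; bit i of masks[g] is set if sample i has genotype g."""
    masks = [0, 0, 0]
    for i, g in enumerate(genotypes):
        masks[g] |= 1 << i
    return masks


def popcount(x):
    """Count the set bits in a non-negative integer."""
    return bin(x).count("1")


def contingency_counts(snp_masks, sample_mask):
    """Count the samples in sample_mask for every genotype combination.

    snp_masks is a list with one build_masks() result per variant.
    """
    counts = {}
    for combo in product(range(3), repeat=len(snp_masks)):
        acc = sample_mask
        for masks, g in zip(snp_masks, combo):
            acc &= masks[g]  # keep only samples that carry genotype g at this variant
        counts[combo] = popcount(acc)
    return counts


# Toy data (assumed): six samples, two variants; samples 0-2 are cases.
snp_a = build_masks([0, 1, 2, 1, 0, 1])
snp_b = build_masks([1, 1, 0, 1, 2, 0])
cases = 0b000111
controls = 0b111000

case_table = contingency_counts([snp_a, snp_b], cases)
control_table = contingency_counts([snp_a, snp_b], controls)
print(case_table[(1, 1)], control_table[(1, 1)])
```

A single AND followed by a bit count handles many samples at once, which is what makes this style of counting attractive for exhaustive searches. The resulting case and control tables are the input that a statistical test for interaction would work from.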
Finding the missing link in disease genomics
Being the first to find 4-way interactions efficiently could mean a breakthrough for genome research [2]. It may help resolve what scientists call ‘the missing heritability’: cases where other evidence, such as twin studies, shows that a disease is inherited, but the underlying genetic cause cannot be found. Epistatic interactions between three or more genes have been flagged as a potential cause.
Using BitEpi, researchers might discover the genes that are involved in complex diseases like diabetes or cardiovascular disease. To demonstrate this ability, we applied BitEpi to one of the world’s largest genomic datasets, the Wellcome Trust Case Control Data. Sifting through data from 4,900 patients with 87,000 recorded genome-wide mutations, BitEpi found not only known interactions but also novel higher-order interactions.
Enabling personalized medicine
Identifying new disease genes and uncovering how they actually work together will reveal new drug targets and ultimately enable personalized medicine approaches. But such advanced insights require the whole genome to be analysed in high resolution, which so far has been impossible.
In conjunction with our other genomic research tool, VariantSpark, it is now possible to analyze datasets that contain whole-genome sequencing data with 100,000,000 mutations. The two tools work hand in glove: VariantSpark uses its Random Forest algorithm to narrow the data down to candidate interacting disease genes, and BitEpi then identifies the precise interactions among them.
We have used VariantSpark to identify genomic variants from 50,000 people in the UK Biobank that are significantly associated with cardiovascular disease (CVD). These significant variants were then fed into BitEpi to identify higher-order genetic interactions that play a role in CVD. We found significant interactions between novel CVD-associated variants and known CVD-associated variants, uncovering possible biological networks that drive disease progression in CVD.
Supporting genomic research
BitEpi is not only faster than previous methods but also 44% more accurate in identifying interactions. Researchers can therefore conduct their analyses with more confidence and present the results as publication-ready graphics using BitEpi’s visualization capabilities.
If you are interested in finding out more about epistasis, or have a dataset where you suspect epistatic interactions are involved, please get in touch.