Finding viral integrations with isling
CSIRO has published a software tool to detect if and where viral DNA has integrated into the genome of its host. This helps assess the consequences of a viral infection and the safety of gene therapies.
Viruses are extraordinarily common in our environment: it is estimated that there are 100 million times more viruses on Earth than there are stars in the universe. Even the most infamous virus of current times, SARS-CoV-2, is but a drop in this ocean of viruses, the vast majority of which are harmless to humans, even if this seemingly endless pandemic may make it appear otherwise.
Most of these viruses, too, are fleeting: they cannot survive for extended periods of time and need to find a new host cell to reproduce. Such a virus enters the host cell and makes so many copies of itself that the cell is disrupted and the virus leaks out to infect other cells.
However, not all viruses reproduce this way all the time, and many can integrate their genetic material into the host’s genome and lie dormant. Indeed, the human genome is littered with the remnants of such ‘fossil viruses’. Viral integration can occur in many situations and is of interest in many fields of research.
These fields include cancer research, since integration of certain viruses is known to play a role in the development of some cancers. Another field is gene therapy, which often uses a modified virus (called a ‘vector’) to deliver a therapeutic gene that substitutes for a faulty one. In gene therapy, integration of this gene can be an intended mechanism of the therapy, but it can also be an unintended side-effect.
To investigate any unintended consequences of viral integration, the first step is to detect the presence of such integrations, their location in the host genome, and which part of the virus was integrated. We recently published a tool, isling, that does just that.
Isling detects viral integration in sequencing data by identifying reads that appear to be junctions between the host genome and a viral genome, and is up to 170% more accurate and 1.6 times faster than other viral integration tools. Isling is also application-agnostic, and is capable of identifying integration of both wild-type viruses and gene therapy vectors.
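To make the idea of a "junction read" concrete, here is a minimal sketch in Python. It is not isling's actual algorithm or code: the `Alignment` and `Junction` types, the `find_junction` function and the `tolerance` parameter are all hypothetical names invented for illustration, and the sketch ignores strand, mapping quality and paired-end information. It assumes a read has already been aligned separately to the host and to the virus, and simply asks whether the two aligned parts sit next to each other and together cover the read.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Alignment:
    """The part of one read that aligned to one reference (simplified, hypothetical)."""
    ref_name: str    # chromosome or virus name
    ref_start: int   # 0-based start on the reference
    ref_end: int     # end on the reference (exclusive)
    read_start: int  # 0-based start of the aligned part within the read
    read_end: int    # end of the aligned part within the read (exclusive)


@dataclass(frozen=True)
class Junction:
    host_chrom: str
    host_pos: int
    virus_name: str
    virus_pos: int
    host_first: bool  # True if the host part comes first in the read


def find_junction(host: Alignment, virus: Alignment,
                  read_length: int, tolerance: int = 5) -> Optional[Junction]:
    """Return a Junction if the host and virus parts tile the read, else None.

    Assumption: both alignments are on the forward strand.
    """
    first, second = sorted((host, virus), key=lambda a: a.read_start)

    # Together, the two parts must cover the read from (nearly) start to end.
    if first.read_start > tolerance or read_length - second.read_end > tolerance:
        return None

    # Positive gap: unaligned bases between the parts; negative: overlap.
    gap = second.read_start - first.read_end
    if abs(gap) > tolerance:
        return None

    host_first = first is host
    # The junction lies where the first part ends and the second part begins.
    host_pos = host.ref_end if host_first else host.ref_start
    virus_pos = virus.ref_start if host_first else virus.ref_end
    return Junction(host.ref_name, host_pos, virus.ref_name, virus_pos, host_first)


# A 150 bp read: the first 60 bases match the host, the remaining 90 match the virus.
host_part = Alignment("chr1", 1000, 1060, 0, 60)
virus_part = Alignment("AAV2", 200, 290, 60, 150)
print(find_junction(host_part, virus_part, read_length=150))
```

A read that aligns to the host over its full length, with only a short spurious match to the virus, fails the coverage check and is not reported as a junction.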
Isling can cater for the complexities in each of these use-cases. For example, it can identify integrations even if they can't be uniquely localised in the genome, and output them separately from integrations with a clearly defined location. Such ambiguous integrations might be relevant when investigating gene therapy side-effects, where every potential integration is of interest. Isling can also identify likely false positive integrations caused by rearrangement of the vector genome, and remove them if desired, or ignore them for wild-type viruses if rearrangement is not expected to occur. It can also combine independent observations of the same integration event (for example, caused by clonal expansion in cancer tissue), and report the number of observations for each event.
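Combining independent observations of the same event can be pictured as grouping junctions that fall close together in the host genome. The following Python sketch illustrates that idea only; it is not how isling is implemented. The function name `merge_observations`, the `(chromosome, host position, virus)` tuple layout and the `window` parameter are assumptions made for this example.

```python
from collections import defaultdict


def merge_observations(junctions, window=10):
    """Group nearby junctions and count how often each event was observed.

    junctions: iterable of (chrom, host_pos, virus) tuples (hypothetical layout).
    Returns a list of (chrom, start, end, virus, count) tuples.
    """
    # Only junctions on the same chromosome and from the same virus can be merged.
    positions_by_key = defaultdict(list)
    for chrom, pos, virus in junctions:
        positions_by_key[(chrom, virus)].append(pos)

    merged = []
    for (chrom, virus), positions in sorted(positions_by_key.items()):
        positions.sort()
        start = end = positions[0]
        count = 1
        for pos in positions[1:]:
            if pos - end <= window:
                # Close enough to the previous junction: same event.
                end = pos
                count += 1
            else:
                # Too far away: close the current event and start a new one.
                merged.append((chrom, start, end, virus, count))
                start = end = pos
                count = 1
        merged.append((chrom, start, end, virus, count))
    return merged


observations = [
    ("chr1", 1060, "AAV2"),
    ("chr1", 1062, "AAV2"),
    ("chr1", 1061, "AAV2"),
    ("chr5", 500, "AAV2"),
    ("chr1", 9000, "AAV2"),
]
for event in merge_observations(observations):
    print(event)
```

In this toy input, the three junctions near position 1060 on chr1 collapse into a single event with a count of 3, while the other two remain separate events observed once each.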
These features make isling a great tool for identifying integrations of a gene therapy vector as well as wild-type viruses. Potential applications are:
- assessing the safety of gene therapy vectors by identifying any potentially dangerous integrations,
- investigating the role of viral integrations in cancer,
- comparing the integration properties of vectors and their wild-type virus counterparts.
If you're interested in isling, you can find the code on GitHub and pre-built Docker containers on Docker Hub, and you can read more in our paper.