An effective response to a disease outbreak requires rapid identification of the pathogen and its source.
Exploiting genomic information has become an important component of effective biothreat agent identification, characterization, and attribution. The necessary bioinformatic analyses require known genomic data against which to compare the agent’s genomic data. However, an increasing share of genomic data is privately held. To truly understand where an agent came from, or to determine important features of the agent (e.g., virulence, alternative hosts, and environmental stability), the biodefense community will likely need to leverage the genomic data that resides in these private databases. This may be especially important when a truly novel agent is discovered and near neighbors need to be identified. At the same time, the security requirements that apply to biothreat agent information and active investigations limit the direct sharing of genomic information with outside parties, and private entities are often unable to share access to their databases because of privacy and legal constraints. Fortunately, technologies exist that allow computations to be executed securely while satisfying these data privacy requirements.
We developed the Secure Interrogation of Genomic Databases (SIG-DB) algorithm to enable the interrogation of a privately held database with a sequence of interest to determine whether similar sequences are present, without compromising either the query or the database information. We confirmed that the method is functional and evaluated it using wild-type and in silico mutated versions of Escherichia coli and Staphylococcus aureus genomic sequences obtained from the NCBI RefSeq database.
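To illustrate the kind of evaluation described above, the following minimal sketch generates an in silico mutated sequence and scores its similarity to the original. It is not the SIG-DB algorithm and it performs no secure computation: the comparison is done in plaintext, and the k-mer Jaccard measure, the substitution-only mutation model, the function names, and the parameter values (k = 12, 5% mutation rate, random 2,000-base sequences in place of RefSeq genomes) are all assumptions made for illustration.

```python
import random


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def mutate(seq, rate, rng):
    """Substitute each base with a different base with probability `rate`.

    Hypothetical mutation model: point substitutions only, no indels.
    """
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


rng = random.Random(0)  # fixed seed so the run is repeatable

# Random stand-ins for real genomic sequences (assumption for illustration).
reference = "".join(rng.choice("ACGT") for _ in range(2000))
unrelated = "".join(rng.choice("ACGT") for _ in range(2000))
mutant = mutate(reference, 0.05, rng)

k = 12
sim_self = jaccard(kmers(reference, k), kmers(reference, k))
sim_mutant = jaccard(kmers(reference, k), kmers(mutant, k))
sim_unrelated = jaccard(kmers(reference, k), kmers(unrelated, k))

print(f"self: {sim_self:.3f}")
print(f"mutant: {sim_mutant:.3f}")
print(f"unrelated: {sim_unrelated:.3f}")
```

In this sketch the mutated sequence scores below the unmodified sequence but well above an unrelated one, which is the kind of graded signal a near-neighbor search relies on. A secure variant would need to compute a comparable score without either party revealing its sequences.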
This poster was presented at the 2018 annual Biothreats meeting, hosted by the American Society for Microbiology (ASM).