The Blueprint Initiative has released the tool, which draws on data curated by the Protein Data Bank (PDB) and genomic sequences in GenBank and is an extension of Blueprint's Small Molecule Interaction Database (SMID).
Blueprint's SMID-Genomes offers scientists a web-based interface to browse 9.4 million predicted small-molecule-protein binding interactions in an organism-specific manner. To construct this tool, Blueprint analysed NCBI's non-redundant sequence database using SMID-BLAST to identify potential binding sites based on homology to known interactions.
"Traditional chemical screening methods used by pharmaceutical companies have focused on protein targets, so scientists have had to test thousands of compounds against each protein, often with little or no forethought of likely candidates," said Dr Hogue.
"SMID-Genomes focuses on small molecules and scans fully or partially sequenced genomes for potential binding activity. The tool allows scientists to narrow down the possibilities and focus on the most likely candidates."
SMID-Genomes interprets nearly 400,000 proteins from more than 1500 completely sequenced bacterial, eukaryotic or viral organisms. The database is relevant to human health and disease, as it spans a long list of sequenced pathogenic bacteria and other infectious agents.
"But as the number of genomes sequenced and the catalogue of known protein-small molecule interactions grows in the PDB repository, so too will the power of the prediction tool," added Dr Hogue.
SMID-Genomes uniquely incorporates a feature that allows scientists to overlay the small-molecule binding behaviours of up to five organisms simultaneously, permitting the direct comparison of different genomes.
One of the biggest challenges of antibacterial drug development has been the problem of unintended side effects. The hope is that scientists will be able to use this comparison function to quickly narrow the list of candidate chemicals by identifying earlier in the process those that may affect the target organism.
One example is the list of chemicals that target proteins in Plasmodium falciparum, the parasite that causes malaria. Blueprint's database suggests that of the thousands of small molecules that bind to each organism, 56 compounds interact exclusively with the proteins of the malarial parasite.
In the last decade, the phosphonic acid fosmidomycin has been shown to be an effective treatment for malaria, and may help to address the widespread problem of drug-resistant P. falciparum. The SMID-Genomes record feature informs scientists of all of the other organisms that might be affected by fosmidomycin.
Other examples of drug-like molecules that bind the malarial parasite but not humans include a piperazine derivative that has been found to inhibit the enzyme topoisomerase II and shows potential as an anticancer therapy, and a pterin analogue that has potential as a broad-spectrum anti-infective agent. Neither of these compounds, however, has been tested as a treatment for malaria.
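The organism-exclusive lists described above amount to a set difference: the compounds predicted to bind one organism, minus those predicted to bind any other organism in the comparison. The Python sketch below illustrates the idea only. It is not Blueprint's implementation, and the function name, the data layout and the compound labels (cmpd_A and so on) are hypothetical placeholders rather than SMID-Genomes records.

```python
from typing import Dict, Set

# The article states that up to five organisms can be compared at once.
MAX_ORGANISMS = 5


def exclusive_compounds(binding: Dict[str, Set[str]], target: str) -> Set[str]:
    """Return compounds predicted to bind `target` and no other organism.

    `binding` maps an organism name to the set of compounds predicted
    to bind at least one of its proteins (hypothetical layout).
    """
    if target not in binding:
        raise KeyError(f"unknown organism: {target}")
    if len(binding) > MAX_ORGANISMS:
        raise ValueError(f"at most {MAX_ORGANISMS} organisms can be compared")
    # Pool the compounds of every organism except the target.
    others: Set[str] = set().union(
        *(compounds for organism, compounds in binding.items() if organism != target)
    )
    return binding[target] - others


# Toy data with placeholder compound labels, for illustration only.
predictions = {
    "Plasmodium falciparum": {"cmpd_A", "cmpd_B", "cmpd_C"},
    "Homo sapiens": {"cmpd_B", "cmpd_D"},
    "Escherichia coli": {"cmpd_C", "cmpd_D"},
}

print(sorted(exclusive_compounds(predictions, "Plasmodium falciparum")))
```

In this toy comparison only cmpd_A is exclusive to the parasite, because cmpd_B is also predicted to bind human proteins and cmpd_C bacterial ones.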
The next stage in the development of SMID-Genomes will involve correlating compound binding with knowledge of which target genes are essential for the organism to survive. This will increase the likelihood of a highlighted molecule being an effective therapeutic.
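One way to picture this planned step is as a filter: keep only the compounds whose predicted targets include at least one essential gene, and rank them by how many essential genes they hit. The Python sketch below works under that assumption. How Blueprint will actually perform the correlation is not described, and the function name, gene identifiers and compound labels are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def rank_by_essential_targets(
    compound_targets: Dict[str, Set[str]],
    essential_genes: Set[str],
) -> List[Tuple[str, List[str]]]:
    """Keep compounds that hit at least one essential gene.

    Returns (compound, essential genes hit) pairs, ordered by the number
    of essential genes hit (descending), then by compound name.
    """
    kept = []
    for compound, targets in compound_targets.items():
        hits = targets & essential_genes  # predicted targets that are essential
        if hits:
            kept.append((compound, sorted(hits)))
    return sorted(kept, key=lambda item: (-len(item[1]), item[0]))


# Toy data with placeholder identifiers, for illustration only.
compound_targets = {
    "cmpd_A": {"gene1", "gene2"},
    "cmpd_B": {"gene3"},
    "cmpd_C": {"gene2", "gene4"},
}
essential_genes = {"gene2", "gene4"}

for compound, genes in rank_by_essential_targets(compound_targets, essential_genes):
    print(compound, genes)
```

Here cmpd_B drops out because its only predicted target is non-essential, while cmpd_C ranks first because it hits two essential genes.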
"A lot of the chemicals that will interact with an organism won't necessarily affect it negatively," Hogue said.
"By focusing on the compounds that interact with essential genes, however, we will increase the chances of the chosen molecules perturbing metabolism, inhibiting growth, or outright killing the target organism."
By opening pathways to unexplored avenues of research, SMID-Genomes offers scientists an opportunity to attack unwanted microbes, pests, and plants from the flank, and hopefully offset the adaptive advantages these organisms have enjoyed.