A new industry standard? Unified data model reaches project milestone
The Pistoia Alliance is a global, not-for-profit alliance with the goal of lowering barriers to innovation in life sciences R&D. The organization recently updated its Unified Data Model (UDM) file format, first released in June 2018, to v5.0 Brooklyn.
According to the organization, the data exchange file format facilitates collaboration between researchers, organizations, and other stakeholders. The latest version now supports customization by individual electronic lab notebook (ELN) and software vendors.
Gabrielle Whittick, a consultant at The Pistoia Alliance, said the goal of the UDM project – funded by Biovia, Elsevier, GSK, Novartis, and Roche – is to remove barriers to innovation in the drug discovery process “by providing a well-defined format for exchange of chemical reaction information.”
“A lack of consistency in data formats makes it difficult to share information and collaborate, particularly across a range of systems, or where data comes from a range of sources,” added Whittick.
He said the organization would ultimately like to see the UDM format become an industry-wide data standard, to facilitate “seamless data integration” between various systems and stakeholders, such as life sciences companies and contract research organizations (CROs).
“The UDM offers an open and freely available data format for storage and exchange of chemical reaction data – which will help organizations share data far more easily, ensuring consistency and quality, regardless of the source,” Whittick told us.
The platform also will enable consistent representation of experiments and intellectual property (IP) capture, he said.
The Pistoia Alliance aims to help organizations focus on developing new therapies by eliminating duplicate work and reducing the time and money spent on data formatting.
Additionally, Whittick said the use of a standardized data format will help advance the development of the “digitally-driven Lab of the Future (LoTF) by standardizing data to unlock its value as interoperable, customizable, and analyzable information.”
It also has implications for artificial intelligence (AI) and machine learning, as a standardized format would make it easier to collate the data needed to train these algorithms.
Becoming an industry standard
According to The Pistoia Alliance, the most recent release is a milestone for the project because it allows vendors to customize aspects of the file format.
“This overcomes the existing barrier to data exchange that vendor-specific ELNs cause, by making data interoperable and shareable,” explained Whittick, who said the new format ultimately will accelerate R&D. “Combined with lower costs and eliminating the need to develop internal infrastructures, the time saved is significant,” he added.
The UDM also allows users to generate value from previously “unusable” data, or from data collected outside of an organization. Whittick described it as “a good starting point for new experiments,” because it provides a template for the representation of chemical reactions.
The Pistoia Alliance expects the next release to be announced in the second quarter (Q2) of 2019 and is actively looking for industry input.
“We also hope to see more organizations adopt the free file format and start seeing the benefits of standardized data,” said Whittick. “This will ultimately support our goal of the UDM becoming an industry standard.”