A new industry standard? Unified data model reaches project milestone

By Melissa Fassbender

(Image: Getty/spainter_vfx)
The Pistoia Alliance hopes to facilitate “seamless data integration” between various industry stakeholders, including life sciences companies and CROs, via an open and freely available data format.

The Pistoia Alliance is a global, not-for-profit alliance with the goal of lowering barriers to innovation in life sciences R&D. The organization recently updated its Unified Data Model (UDM) file format, first released in June 2018, to v5.0 Brooklyn.

According to the organization, the data exchange file format facilitates collaboration between researchers, organizations, and other stakeholders. The latest version now supports customization by individual electronic lab notebook (ELN) and software vendors.

Gabrielle Whittick, a consultant at The Pistoia Alliance, said the goal of the Unified Data Model (UDM) project – funded by Biovia, Elsevier, GSK, Novartis, and Roche – is to remove barriers to innovation in the drug discovery process “by providing a well-defined format for exchange of chemical reaction information.”

“A lack of consistency in data formats make it difficult to share information and collaborate, particularly across a range of systems, or where data comes from a range of sources,” added Whittick.

Ultimately, he said the organization would like to see the UDM format become an industry-wide data standard, to facilitate “seamless data integration” between various systems and stakeholders, such as life sciences companies and contract research organizations (CROs).

“The UDM offers an open and freely available data format for storage and exchange of chemical reaction data – which will help organizations share data far more easily, ensuring consistency and quality, regardless of the source,” Whittick told us.

The format will also enable consistent representation of experiments and intellectual property (IP) capture, he said.

The Pistoia Alliance aims to help organizations focus on developing new therapies by eliminating duplicated work and reducing the time and money spent on data formatting.

Additionally, Whittick said the use of a standardized data format will help advance the development of the “digitally-driven Lab of the Future (LoTF) by standardizing data to unlock its value as interoperable, customizable, and analyzable information.”

It also has implications for artificial intelligence (AI) and machine learning, as a standardized format would make it easier to collate the data needed to train these algorithms.

Becoming an industry standard

According to The Pistoia Alliance, the most recent release is a milestone for the project because it allows vendors to customize aspects of the file format.

“This overcomes the existing barrier to data exchange that vendor-specific ELNs cause, by making data interoperable and shareable,” explained Whittick, who said the new format ultimately will accelerate R&D. “Combined with lower costs and eliminating the need to develop internal infrastructures, the time saved is significant,” he added.

The UDM also allows users to generate value from previously “unusable” data, or from data collected outside of an organization. Whittick described it as “a good starting point for new experiments,” because it provides a template for the representation of chemical reactions.

The Pistoia Alliance expects the next release to be announced in the second quarter (Q2) of 2019 and is actively looking for industry input.

“We also hope to see more organizations adopt the free file format and start seeing the benefits of standardized data,” said Whittick. “This will ultimately support our goal of the UDM becoming an industry standard.”
