User-generated posts about drug effects and suspected reactions could be a valuable source of information for pharmaceutical companies at the post-marketing stage, according to a team from the Computer Science and Engineering Department at Universidad Carlos III de Madrid (UC3M).
Online forums are an extremely popular way for the public to receive and post information about medicines. Searches for health information are the third most common activity on Google, with 170,000 searches performed every 5 seconds.
The UC3M team used natural language processing (NLP) techniques to translate social media users’ colloquial descriptions of their experiences with medicines into structured, codified data that can be used in comparative studies to identify patterns and trends.
“This data may also be combined with data from other sources, such as patients' electronic medical records, which contain very useful information about diagnosis, treatments, etc.,” said Paloma Martínez, Professor at the university’s Advanced Databases Laboratory.
The prototype programme analyses chunks of big data and recognises mentions of pharmaceutical drugs, adverse effects and illnesses.
The system recognises drugs, such as anti-anxiety medicines, not only by their active ingredient or generic name (such as lorazepam or diazepam) but also by brand name (such as Orfidal).
Additionally, drugs can be analysed in relation to their therapeutic effects (such as an indication for anxiety) and their adverse effects (such as Orfidal possibly causing shaking and tremors).
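The kind of recognition described above can be illustrated with a minimal dictionary-lookup sketch. This is not the UC3M team's method, which relies on a full linguistic processor; it is a simplified stand-in. The lexicon contents, the mapping of Orfidal to lorazepam, and the function name `extract_mentions` are assumptions made for illustration only.

```python
import re

# Hypothetical mini-lexicon mapping surface forms (generic or brand names)
# to a normalised active ingredient. The Orfidal -> lorazepam entry is an
# illustrative assumption, not taken from the project's database.
DRUG_LEXICON = {
    "lorazepam": "lorazepam",
    "diazepam": "diazepam",
    "orfidal": "lorazepam",
}

# Hypothetical list of effect terms a post might mention.
EFFECT_TERMS = {"shaking", "tremors", "anxiety"}


def extract_mentions(post):
    """Return normalised drug names and effect terms found in a post."""
    # Lowercase and split into word tokens so matching is case-insensitive.
    tokens = re.findall(r"\w+", post.lower())
    drugs = sorted({DRUG_LEXICON[t] for t in tokens if t in DRUG_LEXICON})
    effects = sorted({t for t in tokens if t in EFFECT_TERMS})
    return {"drugs": drugs, "effects": effects}


if __name__ == "__main__":
    print(extract_mentions("I started Orfidal last week and now I have tremors"))
```

A real system would also need to handle misspellings, multi-word expressions and colloquial phrasing, and to decide whether a mentioned effect is an indication or an adverse reaction, none of which this sketch attempts.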
Industry uses
The technology could be used by pharmaceutical companies to "find out what people are saying about a drug, for example, or to gather information on suspicions of adverse effects of drugs to supplement notification received through traditional channels," said one of the Madrid researchers, José Luis Martínez Fernández, who also works as Consulting Director at Daedalus.
The technology is part of the European research project TrendMiner and uses a linguistic processor based on the Daedalus company's MeaningCloud.
Some medical reports and clinical histories "are difficult to process, and because of this they are not being worked on; this technique could help us to get the most out of this content," said Martínez Fernández.
"The challenge is to transform these texts, which are currently stored without being analysed, into structured information, which allows them to be used for clinical and epidemiological purposes to gain new knowledge or to analyse trends which aid decision-making."
As a by-product of the project, the first Spanish-language database to compile data on drugs and their adverse effects in one place has been created, said the team. A paper, ‘Exploring Spanish Health Social Media for detecting drug effects,’ is due to be published in the journal BMC Medical Informatics and Decision Making.