Does Clinical Data qualify as “Big Data”?

I was at an analyst conference last week where I met a couple of analysts (no pun intended :-)) focused on Life Sciences who felt that “Big Data” is a tough sell in Life Sciences, except for Genomic Data. That made me think. I had always associated “Big Data” with data sets whose size runs into petabytes and zettabytes. What I have learned since then is that the characteristics of Big Data do not start and end with size.

This article on the Mike 2.0 blog by Mr. Robert Hillard, a Deloitte Principal and an author, titled “It’s time for a new definition of big data”, explains why Big Data does not simply mean “datasets that grow so large that they become awkward to work with using on-hand database management tools”, as defined by Wikipedia. He goes on to illustrate three different ways that data could be considered “Big Data”. For more, please read the blog.

One quality he explains that is of interest to me is “the number of independent data sources, each with the potential to interact”. Why is it of interest to me? I think Clinical Data, in the larger context of Research & Development, Commercialization and Post Marketing Surveillance, definitely fits this definition. One of my previous posts, titled “Can Clinical Data Integration on the Cloud be a reality?“, describes the diversity of clinical data in the R&D context. Now imagine including other data sources such as longitudinal data (EMR/EHR, Claims etc.), Social Media and Pharmacovigilance: the complexity increases rapidly, because every new source can interact with all the existing ones. Initiatives like the Observational Medical Outcomes Partnership (OMOP) have already shown that there is value in looking at data other than the data collected through the controlled clinical trial process. The same applies to initiatives under way at various sponsors and other organizations to make meaningful use of data from social media and other sources. You might be interested in my other post titled “Social Media, Literature Search, Sponsor Websites – A Safety Source Data Integration Approach” to learn more about such approaches, which are being actively pursued by some sponsors.

All in all, I think that the complexities involved in making sense of disparate data sets from multiple sources, and in analyzing them meaningfully to ensure that the benefits of medicinal products outweigh the risks, will definitely qualify Clinical Data as “Big Data”. Having said that, do I think that organizations will be after this any time soon? My answer would be NO. Why? The industry is still in the process of warming up to the idea. Also, Life Sciences organizations are very conservative, especially when dealing with Clinical Data, which is considered Intellectual Property and comes with all the compliance and regulatory requirements that go with the domain, so it is going to be a long time before it is adopted. This article titled “How to Be Ready for Big Data” by Mr. Thor Olavsrud on the CIO.com website outlines the current readiness and roadmap for adoption by the industry in general.

The next couple of years will see the evolution of tools and technology surrounding “Big Data”, which will help organizations evolve their strategies and, in turn, result in an uptick in adoption.

As always, your feedback and comments are welcome.

Social Media, Literature Search, Sponsor Websites – A Safety Source Data Integration Approach

I have been part of my fair share of discussions on Drug Safety and Social Media. In fact, I have even written a blog post about how these two are being forced into an “arranged marriage”, which could be a good thing :-). While processing data from social media is very complex and often unreliable, there is an increasing push to process it anyway. Understandably, Marketing teams were the first to adopt social media channels in pharmaceutical organizations. Now the drug safety teams are being forced to act, because these channels could end up generating adverse events that they are obligated to register, review and report.

As mentioned, processing data from social media can be complex and may yield very few cases (0.2% according to a Nielsen’s Online survey of health-related social media content). Even so, the high-level process is very similar to Literature Scanning. The latter is something that organizations already handle. I think that Social Media content search and analysis can become an extension of this process. Now let’s look at both processes.

Literature Search:

Literature Search is used by BioPharmaceutical organizations to identify Adverse Events related to their medicinal products in medical and scientific journals published worldwide. This process was adopted as a result of multiple serious adverse events and the ensuing regulations and increased safety concerns. Many sponsor organizations have successfully built automated systems to speed up the overall process. These systems typically scan sources (Journals, Abstract Libraries and Reference Libraries) based on certain keywords, product names, Boolean expressions etc. and capture the mentions into a local database. These entries are then screened by trained professionals to either accept or reject them based on the required data elements to qualify as an adverse event. If additional details are required, the journals are purchased and reviewed to qualify the “hit” as an adverse event. Once identified, this becomes a case that will then be transferred manually or electronically (e.g. E2B) to an Adverse Event Management System and will follow the life cycle till it is reported as an expedited or periodic report to regulatory authorities.
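
To make the screening step concrete, here is a minimal Python sketch of the keyword and Boolean matching described above. It is only an illustration under my own assumptions: the product names, event terms, the `Hit` record and the `scan_abstract` function are all hypothetical, and a real system would rely on controlled vocabularies and far richer rules.

```python
import re
from dataclasses import dataclass

# Hypothetical product names and adverse-event terms (assumptions for
# illustration only; not taken from any real product or dictionary).
PRODUCTS = ["drugx", "examplomab"]
EVENT_TERMS = ["rash", "nausea", "liver injury", "anaphylaxis"]


@dataclass
class Hit:
    source_id: str
    product: str
    event_terms: list
    status: str = "pending_review"  # a trained reviewer accepts or rejects it later


def _contains(term, text):
    # Whole-word, case-insensitive match.
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def scan_abstract(source_id, text):
    """Boolean rule: a product name AND at least one event term must appear."""
    hits = []
    for product in PRODUCTS:
        if not _contains(product, text):
            continue
        events = [term for term in EVENT_TERMS if _contains(term, text)]
        if events:
            hits.append(Hit(source_id, product, events))
    return hits


if __name__ == "__main__":
    for hit in scan_abstract("journal-001", "A patient on DrugX developed a rash."):
        print(hit)
```

Each hit only lands in a review queue. The decision on whether it qualifies as an adverse event still rests with the trained professional.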

Social Media:

This process can be very similar to Literature Search, except that the sources of data are much more diverse and the data is far less structured. Depending on the source system, a manual or automated process can be adopted to monitor and record the “hits”. If the source system is a “blog”, “Twitter” or “Facebook”, a tool can be built to continuously poll certain blogs, tweets or Facebook pages and scan them for keywords, products/brands etc. The resulting “hits” can be processed to filter and aggregate the “trends”. These trends can then be reviewed by trained professionals to decide whether they qualify as “Safety Cases”, which will then be processed per the AE case management process.
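
The filter-and-aggregate step can be sketched as follows. This is a rough illustration, not a real integration: it assumes the posts have already been fetched by some polling tool and handed over as plain dictionaries, and the names `find_hits`, `aggregate_trends` and the `min_mentions` threshold are hypothetical.

```python
from collections import Counter

# Hypothetical keyword lists (assumptions for illustration only).
PRODUCTS = ["drugx"]
EVENT_TERMS = ["headache", "dizziness"]


def find_hits(posts):
    """Return (product, event term, source) for posts mentioning both."""
    hits = []
    for post in posts:
        text = post["text"].lower()
        for product in PRODUCTS:
            if product not in text:
                continue
            for term in EVENT_TERMS:
                if term in text:
                    hits.append((product, term, post["source"]))
    return hits


def aggregate_trends(hits, min_mentions=2):
    """Count mentions per (product, term) and keep those seen often enough."""
    counts = Counter((product, term) for product, term, _source in hits)
    return {pair: n for pair, n in counts.items() if n >= min_mentions}


if __name__ == "__main__":
    sample_posts = [
        {"source": "blog", "text": "Started DrugX, bad headache all day"},
        {"source": "twitter", "text": "drugx gives me a headache"},
        {"source": "facebook", "text": "Feeling some dizziness on DrugX"},
    ]
    print(aggregate_trends(find_hits(sample_posts)))
```

The threshold is a design choice: a single mention still matters for safety, so a real process might send every hit to review and use the aggregated counts only to prioritize.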

Enterprise Websites, Response Centers etc.:

The third variety of source systems for safety cases consists of Brand Websites and other portals set up to increase brand awareness, help patients receive medicine faster, or address questions and concerns. This may even include response centers set up for patients, pharmacists and physicians to reach the sponsors for information and advice. Depending on the nature of the inquiries, these could be potential sources of Adverse Events. This data, once screened and qualified, can also be fed into the AE Management System for subsequent review and reporting.

Source Data Integration:

From a technology standpoint, the architecture and design for aggregation and analysis of data may differ for each of the datasets. However, an integrated approach to collecting, aggregating, analyzing and reporting Adverse Event data needs to be adopted by the sponsor IT organizations. The diagram below depicts:

  1. Multiple Source Categories and Systems (Literature, Social Media, Enterprise Websites)
  2. Multiple Interfaces (Manual, XML, Text, API, RSS, Web Services etc.)
  3. Simple, High Level process to screen, record, review and report the case

[Diagram: Literature Scanning, Social Media & Enterprise Websites - Safety Source Data Integration]
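
One way to picture the integrated approach in code is a common record that every source category is mapped into, followed by the same screen, record, review and report stages. The sketch below is purely illustrative: `SafetyRecord`, `Stage`, the adapter functions and the input field names are my own assumptions, not a description of any actual AE Management System.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    SCREEN = 1
    RECORD = 2
    REVIEW = 3
    REPORT = 4


_ORDER = [Stage.SCREEN, Stage.RECORD, Stage.REVIEW, Stage.REPORT]


@dataclass
class SafetyRecord:
    category: str     # "literature", "social_media" or "enterprise_website"
    interface: str    # e.g. "manual", "xml", "rss", "api"
    product: str
    description: str
    stage: Stage = Stage.SCREEN


def from_literature(entry):
    # Hypothetical input shape: {"product": ..., "excerpt": ...}
    return SafetyRecord("literature", "xml", entry["product"], entry["excerpt"])


def from_social_media(post):
    # Hypothetical input shape: {"product": ..., "text": ...}
    return SafetyRecord("social_media", "api", post["product"], post["text"])


def from_enterprise_website(inquiry):
    # Hypothetical input shape: {"product": ..., "message": ...}
    return SafetyRecord("enterprise_website", "manual",
                        inquiry["product"], inquiry["message"])


def advance(record):
    """Move the record to the next stage; it stays at REPORT once there."""
    idx = _ORDER.index(record.stage)
    if idx < len(_ORDER) - 1:
        record.stage = _ORDER[idx + 1]
    return record


if __name__ == "__main__":
    rec = from_social_media({"product": "DrugX", "text": "headache after dose"})
    while rec.stage is not Stage.REPORT:
        advance(rec)
        print(rec.category, rec.stage.name)
```

Keeping the source-specific logic in small adapters means a new source or interface can be added without touching the downstream stages.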

(To Be Continued…)