What is Content Automation in Pharma and Why is it Important?
Content automation in pharma is the practice of using systems to perform repetitive tasks to create content. For example, tasks such as formatting, data retrieval, and document assembly can be automated. Content automation in the pharmaceutical industry is often implemented for the following reasons:
- Reduce the time it takes to develop and deliver content
- Mitigate risk by eliminating human error from tasks
Traditionally, pharma content automation was limited to what could be done with Microsoft Word. There are various plugins, macros, and other solutions that automate some level of content creation. Because these automations are limited by what word processing software can do, pharma has not been able to automate as much of its content creation as other industries.
More recently, pharma companies are looking toward structured content authoring (SCA) as a way to employ automation on a much larger scale. These companies are automating content creation by implementing content reuse and conditional processing, using smart templates, developing common metadata, sharing variable definitions, and maintaining libraries of reusable content components. They are fetching data from source rather than relying on humans to copy/paste or re-key data. GenAI is proving valuable in creating derivative content from knowledge captured in a source of truth, such as a clinical trial design tool or a quality management system.
Structuring pharma content is a crucial step on the journey toward enabling digital submissions and implementing emerging technologies such as deterministic AI, LLMs, GenAI, and other types of artificial intelligence.
What Are the Benefits of Content Automation?
Content automation provides several benefits of particular interest to pharma organizations. Each one of these benefits helps pharma companies reduce the time it takes to produce content throughout the drug development lifecycle and beyond.
- Reduce risk by eliminating manual operations such as copy/paste and table formatting
- Save time by automating content creation through content reuse, dynamic content assembly, and data integration
- Enable faster responses to new and changing regulatory requirements by speeding up the process of finding, updating, and delivering content
- Ensure consistency across a large volume of documents by verifying content against defined content standards
- Streamline the translation process and improve translation quality by reducing the number of unnecessary variations across a large body of content
- Improve accuracy by automating document creation through use of authoring templates and output formats
- Enable continuous inspection readiness through structured content and automated workflows
One of the main advantages of content automation for pharma is that it reduces the amount of time authors, reviewers, and subject-matter experts (SMEs) spend on tasks not related to content creation.
Content reuse enables authors and SMEs to focus on creating new content only where it is needed. Authors spend less time finding content to copy, paste, and reformat. Automated formatting frees up a significant portion of time from authors, editors, and publishers. Integrating data sources with structured content authoring tools enables data to be retrieved automatically rather than copied and pasted, formatted, and re-verified.
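As a minimal sketch of the data integration idea above, the following Python example formats records into a table programmatically instead of relying on copy and paste. The record fields, the values, and the `render_table` helper are all hypothetical placeholders; a real integration would query the source system through whatever interface it exposes.

```python
from typing import Dict, List

# Hypothetical records standing in for data fetched from a source system.
# Field names and values are illustrative only.
RECORDS: List[Dict[str, str]] = [
    {"timepoint_months": "0", "assay_percent": "99.8"},
    {"timepoint_months": "3", "assay_percent": "99.5"},
    {"timepoint_months": "6", "assay_percent": "99.1"},
]


def render_table(records: List[Dict[str, str]], columns: List[str]) -> str:
    """Render records as a pipe-delimited table with aligned columns."""
    # Each column is as wide as its longest value (or its header).
    widths = {c: max([len(c)] + [len(r[c]) for r in records]) for c in columns}

    def fmt(values: List[str]) -> str:
        cells = (v.ljust(widths[c]) for c, v in zip(columns, values))
        return "| " + " | ".join(cells) + " |"

    lines = [fmt(columns)]
    lines.append("|" + "|".join("-" * (widths[c] + 2) for c in columns) + "|")
    lines.extend(fmt([r[c] for c in columns]) for r in records)
    return "\n".join(lines)


table = render_table(RECORDS, ["timepoint_months", "assay_percent"])
print(table)
```

Because the table is generated from the records, a change in the source data flows into the document without anyone re-keying or reformatting values.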
Pharma organizations see these content automation advantages when developing content at scale. It takes some work to design, develop, and implement the various types of content automation. The initiative involves a cross-functional team that includes people from content, technical, and product teams.
Disadvantages of Content Automation
When content automation is done well, we don’t see many disadvantages.
However, a poorly designed (or non-existent) content automation strategy can result in one or more of the following drawbacks:
- It takes longer to maintain the automation than it did to develop content manually
- Too many systems in place to be cost-effective
- Scope is too narrow to show value or too broad to implement effectively
Some of the pitfalls that pharma organizations fall into on the journey to content automation include:
- Automation is applied to legacy content without first structuring or curating the content
- Automation is attempted on too large a scope all at once
- Tendency to re-create legacy processes which then reduces the automation capabilities of the new technology
- Tendency to automate legacy processes instead of adopting structured content authoring
- Tradition of requesting customizations to new tools rather than adopting new ways of working
- Authors are not trained in structured content authoring or to write with reuse in mind, so the content does not fit together well when automatically assembled or reused
Content automation cannot solve all content-related challenges on its own. You still need skilled writers to produce excellent content. You still need subject-matter experts to create, review, and sign off on content.
The promise of content automation is to take the repetitive, high-risk tasks out of the content development workflow so that all stakeholders can focus on the tasks that require human expertise. Content automation increases accuracy of content and data while decreasing the amount of time humans have to spend searching for, creating, revising, or reviewing content.
Examples of Automated Content Creation
Content automation manifests in many different ways throughout the content lifecycle. Here are some examples from our recent customer engagements:
- Assemble a study-specific protocol from a repository of structured components
- Generate regional ICF variants from a single “super” document
- Retrieve LIMS information from a data lake and format as a table in a structured component within a Stability Data report
- Populate master data throughout a document set from a single source of truth
- Apply formatting, layout, and design elements to documents
- Reuse content based on assembly rules and authoring templates
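The first example in the list, assembling a document from a repository of structured components, can be sketched roughly as follows. This is an illustration under stated assumptions, not how any particular system works: the component IDs, the placeholder text, and the `assemble` function are all hypothetical.

```python
from typing import Dict, List

# Hypothetical component repository: component ID -> approved text.
REPOSITORY: Dict[str, str] = {
    "title": "Study Protocol",
    "objectives": "Objectives text goes here.",
    "eligibility": "Eligibility criteria text goes here.",
    "assessments": "Schedule of assessments text goes here.",
}

# Hypothetical authoring template: an ordered list of component references.
PROTOCOL_TEMPLATE: List[str] = ["title", "objectives", "eligibility", "assessments"]


def assemble(template: List[str], repository: Dict[str, str]) -> str:
    """Build a document by resolving each component reference in order."""
    missing = [cid for cid in template if cid not in repository]
    if missing:
        # Fail loudly rather than silently producing an incomplete document.
        raise KeyError(f"Missing components: {missing}")
    return "\n\n".join(repository[cid] for cid in template)


document = assemble(PROTOCOL_TEMPLATE, REPOSITORY)
print(document)
```

The template holds references rather than copies, so updating a component in the repository changes every document assembled from it.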
There are many opportunities for content automation in the pharma industry. Content reuse is one of the most common, most impactful ways to automate content creation. With content reuse, pharma companies can create and deliver documents and other outputs extremely quickly.
Medical writers don’t have to waste time with copy and paste or with formatting (and reformatting) content across documents. Revision cycles become much more efficient, as it’s easier to find and update information when it is reused than when it has been pasted or created repetitively throughout a set of documents.
Structured content management systems can receive master data from RIM systems to autopopulate components and documents. This master data can also trigger automated assembly of content, workflow tasks, or other repetitive processes that can be entrusted to systems.
How AI Supports Pharma Content Automation
AI Readiness Comes Before AI-Generated Content
GenAI, where the AI engine creates new content, is a rapidly growing field. Advances are being made literally every day. If you are considering incorporating GenAI in your content ecosystem, it is important to keep up with the latest developments.
One thing that has not changed about including GenAI as a tool is the need for your content to be AI-ready. AI readiness is perhaps the most important step that should not be skipped in any application of AI technology, particularly with complex and critical information.
Why Structured Content Improves AI Training
AI engines must be trained and, in particular, they need to be trained using your specific content corpus. Without the right training, you cannot be sure that content or data generated from AI is accurate. In fact, unless you closely monitor how you train the AI, you might never be able to trust AI to write factual information.
We have all heard of situations where current AI solutions have “gone off on their own” and woven fantasy answers to basic questions. Because explicit accuracy is paramount in pharma content, great care must be taken in training the engine and monitoring the results.
The best way to train AI is using standardized, structured content. Curating the training set to ensure it is accurate and understandable is an important step in AI readiness. Companies that think they can skip this step will be in danger of having the AI create imaginative answers to very important questions.
What Can AI Do with Pharma Content?
The benefits of applying artificial intelligence in the pharmaceutical industry are based on two things:
- The speed at which AI can process information
- The ability of AI to analyze data and hypothesize results
AI can process enormous amounts of information at lightning speed. The quantity and speed of information processing allows AI to deliver more statistically significant analysis of massive amounts of data. In addition to the analysis, AI engines can sort through the data in different ways, providing insights that human analysis might miss and predictions that would be impossible to make otherwise.
The Three Content Sets Needed to Train AI
You need three sets of content to train an AI engine:
- One for the AI to ingest
- One to use for training the AI
- One to validate the training you have done
The most effective way to train an AI engine is to use standardized, structured content made up of componentized chunks of clean content. The cleaner and easier to understand your training corpus is, the better your results will be. If you train the AI with messy, long-form content, it is possible that you will introduce information that results in confused or incorrect responses.
By componentizing and curating your content, you can ensure that the learning done by your machine contains accurate results from your data. Skip this step at your peril.
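One possible reading of the three content sets is a partition of the curated component library into three groups that do not overlap. The Python sketch below shows such a split. The proportions, the seed, and the `split_corpus` function are assumptions made for illustration; the right sizes depend on your corpus and your AI tooling.

```python
import random
from typing import List, Tuple


def split_corpus(
    component_ids: List[str],
    ingest_fraction: float = 0.6,    # assumed proportion, not a recommendation
    training_fraction: float = 0.2,  # assumed proportion, not a recommendation
    seed: int = 0,
) -> Tuple[List[str], List[str], List[str]]:
    """Partition curated components into ingest, training, and validation sets."""
    ids = sorted(component_ids)
    # A seeded shuffle keeps the split reproducible between runs.
    random.Random(seed).shuffle(ids)
    n_ingest = round(len(ids) * ingest_fraction)
    n_training = round(len(ids) * training_fraction)
    ingest = ids[:n_ingest]
    training = ids[n_ingest:n_ingest + n_training]
    validation = ids[n_ingest + n_training:]  # whatever remains
    return ingest, training, validation


corpus = [f"component-{i:02d}" for i in range(10)]
ingest, training, validation = split_corpus(corpus)
print(len(ingest), len(training), len(validation))
```

Keeping the validation set separate from the other two means the check on the training is made against content the engine has not already seen.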
What is Content Reuse and Why Does It Matter?
Content reuse is the practice of creating one piece of content and then using that piece of content everywhere you need it. Teams create, finalize, translate, manage, store, and retire a reused piece of content one time, and only one time. Content reuse helps teams create the dossier faster and at a lower cost. It’s how you scale operations to provide the vast amount of interrelated data and content that must accompany every molecule, at each stage of that molecule’s lifecycle.
Content reuse is not something you can start on an ad-hoc basis. You need up-front planning to determine which content to reuse, how to reuse it, and how to manage it within and across organizations.
Plan for Reuse Before You Create Content
The most effective way to leverage content reuse is to plan for reuse.
Content reuse reduces the time and cost of content development. Content reuse also improves quality and reduces risk by ensuring consistency, accuracy, and adherence to standards. When you reuse content, you draft, revise and release the content one time. This means you say the same thing the same way everywhere you need to provide that information. This consistency enables automation, including AI and API-based automation. It also makes information easier for reviewers to find and understand.
If you need to make a derivative – say, a lay language version of a scientific concept – you can manage that derivative as a reusable component and use it everywhere that requires the same voice and tone. Modern systems maintain links between the source knowledge and its derivatives. These connections enable humans and machines to easily find and update information. This capability mitigates the risk that outdated or contradictory information makes it through to a submission and creates delays through repeated RFIs and review cycles.
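To illustrate how a link between a source and its derivatives can surface outdated content, here is a small Python sketch. The `Component` fields and the `stale_derivatives` function are hypothetical; actual systems track these relationships in their own ways.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Component:
    component_id: str
    version: int
    source_id: Optional[str] = None       # set only for derivatives
    source_version: Optional[int] = None  # source version the derivative was based on


def stale_derivatives(components: List[Component]) -> List[str]:
    """Return IDs of derivatives whose source has changed since they were written."""
    by_id = {c.component_id: c for c in components}
    return [
        c.component_id
        for c in components
        if c.source_id is not None
        and by_id[c.source_id].version > (c.source_version or 0)
    ]


library = [
    Component("objective", version=3),
    Component("objective-lay", version=1, source_id="objective", source_version=2),
    Component("objective-summary", version=1, source_id="objective", source_version=3),
]
print(stale_derivatives(library))
```

Here the lay-language derivative was written against an older version of its source, so it is flagged for review, while the summary is still current.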
Reduce Copy/Paste Risk and Rework
Content reuse, whether verbatim or derivative, eliminates the risk of copy/paste errors and the need for re-verification of data. Reuse also removes the temptation for authors to “tweak” the words after they paste.
Each copy, paste, and tweak operation also adds time. Did the author’s tweaks improve the chances of regulators accepting the information without query? If the regulator does query the information, there are now two places to find and update the information: the original content and the pasted, repurposed content. It can be difficult to find the repurposed information if the terminology or syntax changed.
With content reuse, teams only need to find and update the single source.
Why is Content Reuse Important in Pharma?
A drug dossier contains an enormous amount of interrelated content. Without content reuse, teams struggle to provide the right information in the right place every time. As a result, most organizations attempt one or more of the following tactics:
- Create entirely new content
- Create redundant content through copy, paste, and tweak
- Use excessive cross-references to point to source information
Each of these tactics has drawbacks, including:
- Copy/paste introduces errors that increase delay, risk, and cost
- Redundant content wastes time and money to create, review, manage, translate, and retire
- Redundant content interferes with findability
- Cross-references send reviewers away with no obvious path back to where they started
Content reuse becomes crucial as the submission process evolves. As regulators accept or require digital submissions, sponsors can submit content in more granular pieces instead of waiting to complete, combine, paginate, and link full documents. Sponsors can submit information when it is ready, and regulators can provide guidance sooner. This move toward continuous submission helps everyone pivot quickly and focus on developing safe, effective treatments rather than managing thousands of pages of paper, whether in print or PDF.
Types of Pharma Content Reuse
The most common types of content reuse are:
- Component – A single building block of information
- Element – A single paragraph, table, or figure
- Variable – A single data point, word, or phrase
- Component set – An assembly of components organized into a hierarchy
- Document – Reuse an entire document as part of a larger document
- Template – Reuse templated content into a working document
- Condition – Automate inclusion/exclusion of content based on conditional metadata
Component, Element, and Variable Reuse
Each of these types of content reuse can be implemented in different ways. Most mature component-based structured content management systems provide all these types of content reuse.
The different types of content reuse are suited to different types of content. Each organization must consider how the types of content reuse differ from each other. The best path forward depends upon the requirements of the content, the business processes for how people work, and the system capabilities. For example, component reuse is the simplest type of reuse to configure. Teams can automate content reuse by configuring templates that reference reused content. Authors can easily insert reused components wherever they need to provide the information.
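Variable reuse, the smallest unit described above, can be sketched with Python's standard `string.Template`. The variable names, the values, and the component text are hypothetical placeholders rather than real product data.

```python
from string import Template
from typing import Dict

# Hypothetical variable definitions maintained in one place.
VARIABLES: Dict[str, str] = {"product_name": "Product X", "strength": "50 mg"}

# Hypothetical component text that references variables instead of hard-coding values.
COMPONENT_TEXT = "$product_name is supplied as $strength tablets."


def render(text: str, variables: Dict[str, str]) -> str:
    """Resolve variable references; substitute() raises KeyError if one is undefined."""
    return Template(text).substitute(variables)


rendered = render(COMPONENT_TEXT, VARIABLES)
print(rendered)
```

Using `substitute` rather than `safe_substitute` is a deliberate choice in this sketch: an undefined variable stops the build instead of leaving a raw placeholder in the output.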
Conditional Reuse and Automated Assembly
Condition reuse is a more complex type of reuse to configure. Condition reuse provides powerful automation for creating many variations of the same document. For example, conditional reuse enables you to automatically produce compliant informed consent forms based on conditions such as therapeutic area, eligibility criteria, and country.
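A rough sketch of the conditional logic, assuming each component carries metadata that lists the condition values it applies to, might look like the following. The component IDs, the condition keys, and the `build_variant` function are hypothetical and are not drawn from any specific system.

```python
from typing import Dict, List, Set

# Hypothetical components. An empty "conditions" dict means "always include".
COMPONENTS: List[dict] = [
    {"id": "icf-intro", "conditions": {}},
    {"id": "icf-privacy-eu", "conditions": {"country": {"DE", "FR"}}},
    {"id": "icf-privacy-us", "conditions": {"country": {"US"}}},
    {"id": "icf-oncology-risks", "conditions": {"therapeutic_area": {"oncology"}}},
]


def applies(conditions: Dict[str, Set[str]], context: Dict[str, str]) -> bool:
    """A component applies when every one of its conditions matches the context."""
    return all(context.get(key) in allowed for key, allowed in conditions.items())


def build_variant(components: List[dict], context: Dict[str, str]) -> List[str]:
    """Return the ordered component IDs to include for one document variant."""
    return [c["id"] for c in components if applies(c["conditions"], context)]


us_oncology = build_variant(COMPONENTS, {"country": "US", "therapeutic_area": "oncology"})
de_cardiology = build_variant(COMPONENTS, {"country": "DE", "therapeutic_area": "cardiology"})
print(us_oncology)
print(de_cardiology)
```

Each variant comes from the same component set. Only the context changes, which is what makes it possible to produce many variations of a document from one source.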
Condition reuse takes significant planning and additional configuration efforts compared to component reuse. Some companies choose to incorporate condition reuse as a later phase of their structured content adoption while others do not use it at all. With so many ways to reuse content, it can be challenging to figure out which way to use for any given content. Content Rules has developed a set of best practices and evaluation criteria for helping our customers determine the best ways to reuse content.
What Pharma Content Can Be Reused?
Pharma content offers many opportunities for content reuse. With digital transformation, structured content, and component-based reuse, pharma companies can get much more granular and derive much greater benefit from their content libraries.
Here are just a few examples of content reuse in common pharma use cases:
- Quality/CMC
- Clinical
- Labeling
- Medical Information
- Commercial
Quality/CMC
The content reuse opportunities in CMC come mainly from the ability to reuse standardized, templated components and integrate data directly from quality systems and other sources of truth. Teams can generate narrative content from this data and reuse it in Module 3 reports and Module 2 summaries.
Clinical
Eligibility criteria, study rationale, study objectives and endpoints, and the schedule of assessments can be reused across several clinical document types. Most companies look at the protocol, statistical analysis plan, and clinical study report. Other closely related outputs include Module 2 summaries, the investigator’s brochure, and informed consent forms (with transformation to patient-appropriate language).
Labeling
Labeling content derives from content created “upstream.” Therapeutic indication, dosage form and strength, use in specific populations, adverse reactions, drug-drug interactions, and other content can typically be reused into several regional labels.
The company core data sheet is essentially a repository of reusable content. Companies traditionally maintain the core data sheet as a document that authors can copy from. With structured content management, the information in the core data sheet can be managed as a set of reusable components, eliminating the need to copy and paste the information manually.
Medical Information
The med info team often delivers the same information in multiple formats, including presentations, documents, and websites. The standard response document, or standard response letter, typically provides safety information and includes the entire package insert. Med info teams can reuse this information by appending the full document or by extracting relevant components and reusing them wherever needed.
As demand grows for sophisticated AI-based chatbots and mobile apps, med info teams must deliver more focused information, more quickly, and in more formats. Content reuse helps med info teams provide highly personalized information that relates to the patient at the point of care.
Commercial
Pharma marketing content must meet a complex set of regional and international regulatory requirements while still telling a compelling story for patients or health care professionals. Content reuse enables marketing teams to provide the right information at the right time on the right device – while meeting all regulatory requirements for that region, audience, and type of information.
What is a Pharma Content Reuse Strategy?
You get the most ROI from content reuse when you approach it from a strategic perspective. A reuse strategy is a plan for which content you will reuse, how you will reuse it, and where you will reuse it.
Think of content reuse the same way you think of electrical standards or plumbing standards. You can buy any one of a thousand different kitchen faucets on the market, from any one of the various manufacturers, and it’s going to work in your kitchen. It works because ANSI standards define the diameter of the pipe, the number of threads, the types of materials, and other allowable characteristics. You can “reuse” different faucets in any kitchen because underlying standards make sure everything fits together properly.
Define What Gets Reused and How
A content reuse strategy identifies things such as:
- Which content is reused automatically
- Which content is reused manually
- What level of content can be reused
- Which mechanisms supply the reuse, such as variables and conditional logic to support “if/then” content reuse
- How reused content is managed
- Where content can be reused verbatim, without change from the original source
- Where content can be reused as a derivative, with some changes from the original source
While certain reuse best practices have been proven over time, there is no “one size fits all” for content reuse strategies. One of our pharma customers defined a reuse strategy for clinical reports that allows for reuse at the component and section level, but not for an individual paragraph or sentence. The CMC team at that same company defined reuse at the paragraph level in order to automate conditional text. Another customer’s reuse strategy separates individual sentences in core areas of clinical trial design – eligibility criteria, objectives, endpoints, assessments – to enable conditional reuse based on metadata.
Create Standards for Reusable Content
Regardless of how granular your level of reuse is, without a plan, your best efforts to scale content reuse will fail. There’s more to content reuse than breaking long documents into pieces and then putting them back together. To successfully reuse content, you need to create content standards.
For reused content to flow seamlessly wherever it is used and to meet the requirements of the various documents, the content must be created with reuse in mind. The organization must adopt standards for how content is written, how much content to include in each component, and what order to include the information in. Authors must follow those standards so that content can be reused automatically, without additional curation by humans.
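Standards for how much content a component holds and in what order can, in principle, be checked automatically. The Python sketch below assumes a deliberately simple standard with a word limit and a required section order. The limit, the section names, and the `check_component` function are hypothetical examples, not recommended values.

```python
from typing import List, Tuple

# Hypothetical content standard for one component type.
STANDARD = {
    "max_words": 10,
    "section_order": ["purpose", "procedure", "results"],
}


def check_component(sections: List[Tuple[str, str]], standard: dict) -> List[str]:
    """Return a list of problems; an empty list means the component conforms."""
    problems = []
    word_count = sum(len(text.split()) for _, text in sections)
    if word_count > standard["max_words"]:
        problems.append(f"{word_count} words exceeds limit of {standard['max_words']}")
    names = [name for name, _ in sections]
    if names != standard["section_order"]:
        problems.append(f"sections {names} do not match the required order")
    return problems


good = [
    ("purpose", "Describe the test."),
    ("procedure", "Run the assay."),
    ("results", "Report the value."),
]
bad = [
    ("results", "Report the value and every intermediate calculation in full detail here."),
    ("purpose", "Describe the test."),
]
print(check_component(good, STANDARD))
print(check_component(bad, STANDARD))
```

A check like this does not replace editorial review, but it can catch components that will not fit together before they are reused.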
How Pharma Content Reuse Supports Content Automation
Content reuse is the basis of several types of content automation in pharma. In fact, content reuse and automation are so closely related, it’s almost impossible to do one without the other. Much of content automation depends on a well-defined content reuse strategy.
Here are some examples of how we help pharma customers automate content creation through content reuse.
- Create authoring templates that automatically provide reused content when authors create new documents
- Enable authors to create closely related documents simultaneously by building planned reuse into authoring templates
- Build a library of reusable components that can be inserted into documents automatically or manually wherever needed
- Design reuse maps, conditions, and metadata models to support dynamic assembly of working documents
- Develop variable definitions files to reuse small units of information across a wide array of documents from a single source of truth
- Develop output formats that automate formatting, navigation, and cross-reference links so that the same content can be reused in multiple outputs
A pharma content reuse strategy needs to be robust enough to support different types of automation while still meeting all the requirements of the content.