Data migrations, from an eTMF perspective – Part 1

November 28, 2016

Every company using an eTMF system has conducted (or soon will conduct) a study migration project of some scale. Whether it is moving an entire paper TMF into a new eTMF system, moving a CRO-conducted study into your eTMF, or transferring studies from a legacy eTMF system into the Wingspan eTMF, study migrations are an integral part of ensuring Completeness, Consistency and Compliance in your eTMF processes.

I sat down with Lou Pasquale, a Senior Director in Wingspan’s Professional Services organization, to learn more about his experiences leading various eTMF study migrations.

In the first part of this interview, Lou describes common migration scenarios and some of the obstacles you can expect to face.

What is a study migration?

For the purposes of this discussion, we will define a study migration as an automated process through which documents and related metadata are exported from one system and loaded into the Wingspan eTMF. Wingspan breaks these migrations into two categories: CRO system to sponsor eTMF, and legacy system to the Wingspan eTMF.
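To make the moving pieces concrete, here is a minimal sketch (in Python, purely illustrative) of what such an export often boils down to: a set of document files plus a metadata manifest describing each one. The file and field names (manifest.csv, doc_type, file_name) are hypothetical examples, not a Wingspan or CRO specification.

```python
# Illustrative only: a migration "package" as a folder of documents plus a
# metadata manifest with one row per exported document.
import csv
from pathlib import Path

def read_manifest(export_dir: str):
    """Yield one metadata record per exported document."""
    manifest = Path(export_dir) / "manifest.csv"   # hypothetical manifest name
    with manifest.open(newline="") as f:
        for row in csv.DictReader(f):
            # Resolve the document file the row describes.
            row["file_path"] = str(Path(export_dir) / row["file_name"])
            yield row

# Example usage: list the document types present in the export.
# for record in read_manifest("cro_export"):
#     print(record["doc_type"], record["file_name"])
```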

CRO to Sponsor

At the end of a CRO-outsourced study, the CRO provides the sponsor with all the study documents and any related metadata from the CRO's in-house systems. Historically, this handover process has focused solely on a complete and effective transfer of the study from the CRO to the sponsor; responsibility for the documents and data rests with the sponsor after the handover. Up until the handover, the CRO's processes and organization of the TMF have supported the conduct of the study on behalf of the sponsor; however, they have been focused on collection and management of the documents while they reside with the CRO. Little thought may have been given to how the data and file structures might support a downstream, automated migration.

Legacy eTMF into Wingspan eTMF

In this scenario, the client has decided to replace their aging eTMF (or other) system with the Wingspan eTMF. They may plan to migrate completed studies, in-process/ongoing studies, or simply all studies. The strength of the Wingspan eTMF's data model supports the rich features of our application but may present challenges in identifying certain data points in the legacy source system. Legacy data may be represented in many forms: tabular data in the system's underlying database, the location of a document within a folder structure, data elements incorporated into an "intelligent" file naming convention, or other means. The legacy system may not have been designed to support a TMF, and there may be little or no correlation with the TMF Reference Model.

Both categories of studies present their own challenges and Wingspan has developed tools and a methodology to simplify each type of migration.

Can you describe the typical process you follow for an eTMF migration?

On a basic level there are three phases:  1) Definition / Data Analysis; 2) Migration Dry-Runs; and 3) Production Migration and Remediation.

Definition / Analysis:  Wingspan begins each migration project by working with the client to define the goal of the migration. Is the study closed or ongoing? Will the client use the eTMF to measure completeness after migration? The answers to these and other questions inform decisions made during the data analysis. A detailed data analysis then takes the source system document types and data and maps them to the target document types in the Wingspan eTMF. For example, every time we see site-level IRB/IEC Meeting Materials in the source data, we know that it will be mapped to the corresponding document type in the client's eTMF Master List and that certain other attributes will be required by the eTMF (e.g., the Site ID and a Subject Line for the document). The closer the source system tracks to the TMF Reference Model, the more straightforward this process is. The client's Master Data Management approach is also taken into consideration, to ensure that data dictionary values in the migrated study are consistent and of high quality.

Wingspan shares the results of the mapping exercise with the client for verification and approval to proceed.
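As a rough illustration of what the output of this mapping exercise can look like, here is a hedged Python sketch of a document-type mapping table and a function that applies it to one source metadata record. The source and target type names, attribute names, and status values are invented for the example; they are not an actual client Master List or the Wingspan schema.

```python
# Illustrative document-type mapping built during the analysis phase.
SOURCE_TO_TARGET = {
    "IRB-IEC Meeting Materials (Site)": {
        "target_type": "IRB/IEC Meeting Materials",
        "required_attributes": ["site_id", "subject_line"],
    },
    "Sub-I CV": {
        "target_type": "Sub-Investigator CV",
        "required_attributes": ["site_id", "person_name"],
    },
}

def map_record(record: dict) -> dict:
    """Map one source metadata record to a target document type and flag gaps."""
    rule = SOURCE_TO_TARGET.get(record["doc_type"])
    if rule is None:
        return {**record, "status": "unmapped"}  # needs analyst review
    missing = [a for a in rule["required_attributes"] if not record.get(a)]
    return {
        **record,
        "target_type": rule["target_type"],
        "missing_attributes": missing,
        "status": "needs_data" if missing else "ready",
    }
```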

Migration Dry-Runs:  In this phase, Wingspan conducts a series of dry-runs on a representative subset of the study's data and documents. The results of the dry-runs are shared with the client, adjustments are made to document type mappings and attribute values, and dry-runs are repeated until the final mappings are determined. Dry-run activities can also uncover corrupt files and other scenarios that may require certain documents to be withheld from the automated migration and loaded later, during remediation.
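Continuing the illustrative sketch above (same hypothetical record fields), a dry-run pass might look something like this: confirm each file is actually present and non-empty, confirm its mapped record has everything it needs, and hold back anything questionable for later remediation.

```python
# Illustrative dry-run check over records produced by map_record() above.
from pathlib import Path

def dry_run(records: list[dict]) -> dict:
    """Partition mapped records into loadable vs. held-for-remediation."""
    results = {"loadable": [], "held": []}
    for rec in records:
        path = Path(rec["file_path"])
        if not path.exists() or path.stat().st_size == 0:
            rec["hold_reason"] = "missing or zero-byte file"  # possible corruption
            results["held"].append(rec)
        elif rec.get("status") != "ready":
            rec["hold_reason"] = "unmapped type or missing attributes"
            results["held"].append(rec)
        else:
            results["loadable"].append(rec)
    return results
```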

Migration and Remediation:  This final phase is the "easy" part of the process. The thorough, detailed work done in the previous phases can and often does result in a smooth migration into the client's Production system. The cleaner the source system data and the more thorough the analysis and dry-run processes, the better the chance that the Production migration is in fact smooth and easy.

How do you ensure a successful migration?

It all sounds simple with just three phases, right? It can be that simple, but there are always twists. Everything boils down to two things: how well the source system data model is documented and how closely the users of that system adhere to the model during data entry. Migrations are much more successful when precautions are taken at the beginning of a study to plan for an eventual automated migration into another system. Document creation and indexing processes throughout the conduct of the study need to ensure that quality data and documents are captured in the first place – the best time to "get it right". Though it requires quite a bit of planning and negotiation up front and may change the engagement model with your CROs, keep these points in mind when defining the requirements of the end-of-study handover process for a study you intend to migrate into your eTMF. Rather than accepting the CRO's data set as is, sponsors should specify a consistent naming convention and organizational structure (e.g., the TMF Reference Model) in their contract with the CRO and make provisions for the CRO to remediate data elements that are missing or of poor quality when it comes time to accept the study.

What do you do if you discover that you have bad source data?

The problem is that you usually find out you have weak or inconsistent source data at the time of a migration, during the Phase 1 mapping exercises. The target data model in the Wingspan eTMF matches your current business processes and data needs and must be adhered to for the purposes of the migration. As mentioned above, the source system may have been designed to support a different set of business processes and is therefore not likely to align directly with the Wingspan eTMF. The challenge is to get source data of questionable quality and consistency into a format that allows for a quality migration. Wingspan has a strict "no documents left behind" policy, in that it is our goal to migrate all documents from the source study through an automated process. This means that missing data points must be provided or, in their absence, set to a default value. While we prefer not to perpetuate the data quality problems of the past, we have to balance data cleansing and manual data generation efforts against the goals of the migration. Wingspan strives for the resulting, migrated study to be as inspection ready as a study that was started in the Wingspan eTMF. So in cases of bad source data we press the client or CRO for specific details. For example, the export may include 5,000 Training documents. But what site is each training document related to? What type of training material is it? Does it expire? We need to know these details about each training document to provide the best possible migration.
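To illustrate the "no documents left behind" idea under the same hypothetical record structure as the earlier sketches, a fallback step might fill missing attributes with clearly labeled defaults so every document can still be loaded, while recording exactly what needs remediation afterward. The default values shown are examples only, not a recommended set.

```python
# Illustrative defaults for attributes that cannot be recovered from the source.
DEFAULTS = {
    "site_id": "UNASSIGNED",
    "subject_line": "MIGRATED - SUBJECT LINE PENDING REVIEW",
}

def apply_defaults(rec: dict) -> dict:
    """Fill gaps with labeled defaults and keep a remediation trail."""
    remaining = []
    for attr in rec.get("missing_attributes", []):
        if attr in DEFAULTS:
            rec[attr] = DEFAULTS[attr]
            rec.setdefault("remediation_needed", []).append(attr)
        else:
            remaining.append(attr)  # no safe default; still blocks loading
    rec["missing_attributes"] = remaining
    if not remaining:
        rec["status"] = "ready"
    return rec
```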

Why all of this talk about data?  I thought we were migrating documents?

Yes, the documents tell the story of the conduct of the study and are the primary focus of the migration. However, keep in mind that the migration of a study with tens or hundreds of thousands of documents must be performed by an automated process, and in that process the actual document content is not the driving force. The data about the documents is what drives the automated migration. The content of a document may clearly identify it as a Sub-Investigator CV, but if the source system's data says the document is a Medical License, an automated migration will treat the document as a Medical License. Data is King for migrations.

In Part 1, we outlined the eTMF migration process and many of the obstacles faced during this process. In our next blog post, learn how Wingspan has refined its migration process to address these common obstacles.

To learn more about eTMF migrations, register for our upcoming webinar, “Lessons Learned from eTMF Migrations.” Our panel of experts will discuss their experiences with migrations and answer questions submitted by attendees. Register for the webinar here.