Project History

by Sheila Raynor

A clinical study for the National Survey of Health and Development

The Clinical Study Tracker (CST) was a software solution created in response to a need to log and track large amounts of clinical data being processed during a three-year clinical research study (2007-2010) conducted by the Lifelong Health and Ageing Unit (LHA) of the Medical Research Council (MRC). The LHA is home to the National Survey of Health and Development (NSHD), an ongoing British birth cohort study initially of 5362 men and women born in England, Scotland or Wales in March 1946. More information about the LHA and NSHD is available on the MRC LHA website.

During this NSHD clinical study, approximately 2000 of the original 5362 participants were expected to participate in a series of 5 different cardiovascular measurements performed during a single clinic appointment. Six partner clinical research facilities based around the United Kingdom (UK) were recruited to carry out the measurements, with the LHA coordinating, analysing, and storing all of the test results in a data library.

The raw medical data collected at the research facilities would be sent to the coordinating centre where I was based. My job was to manage the logistics of the incoming data, which required logging followed by either external scientific analysis off-site or post-processing in-house. This meant that the data needed to be logged not once but multiple times, as they moved in and out of the coordinating centre through multiple processing steps before eventually reaching the data library.

In addition to tracking large amounts of medical data during the study, my job was also to maintain quality control of the data as they arrived from the participating research facilities, to ensure good quality raw data during the study. This meant opening up every file to assess the overall data quality.

Motivation for tracking each test conducted on every participant

My job of tracking test data for 2000 study participants would have been simpler if the tests had all been treated in a similar manner and had arrived at the coordinating centre at the same time. However, a number of factors required me to separately track each of the 5 tests done on every participant:

Identifying steps in the tracking activities

Although we had to separately date-track the medical tests (ECHO, IMT, HRV, PWV and PWA), the steps of each task shared some similarities. Specifically, we focused on tracking the following data:

Early in the project I tried to calculate how many different dates I would need to track to complete my task. Approximately 2000 participants, each having 5 different medical tests with 5-6 processing steps per test, would mean about 50,000-60,000 date entries in total. Each date would mark a critical step in the processing of sensitive, confidential medical data.
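The back-of-the-envelope estimate above can be reproduced directly (the figures are the approximate ones quoted in the text):

```python
# Rough estimate of the number of date entries to track,
# using the approximate figures quoted in the text above.
participants = 2000          # of the original 5362 cohort members
tests_per_participant = 5    # ECHO, IMT, HRV, PWV, PWA
steps_per_test = (5, 6)      # each test had roughly 5-6 processing steps

low = participants * tests_per_participant * steps_per_test[0]
high = participants * tests_per_participant * steps_per_test[1]
print(f"{low}-{high} date entries")  # 50000-60000 date entries
```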

More work beyond making 50,000-60,000 date entries

When I was working at the LHA, I was the sole person responsible for tracking the progress of cardiovascular tests from origination at the clinical research facilities through to ensuring the end results could be uploaded into the data library. My initial calculation of the work needed to track subjects assumed that there would be no errors in the way data were collected, analysed or post-processed. The initial work estimate also didn't consider delays caused by the LHA or any of its partner sites. The following challenges of coordinating efforts with other partners complicated the tracking activity:

Thus my job was not simply to track the results, but also to perform quality assurance tests on the data we processed and to rectify any errors. I had to decide how much time I invested in tracking data. Although there were five cardiovascular tests for each volunteer, each test result was expressed as hundreds of data points which had to be vetted before the whole result could be accepted or rejected. It was becoming impractical for me both to perform quality assurance tests on the test data and to track them as well. Quality assurance activities represented what I will call the MICRO management of the cardiovascular data.

My job became even more complicated when I was asked to provide interim progress reports on our cardiovascular data collection and processing. At monthly meetings of the Cardiovascular Project Management board, I had to update board members with results which could include information such as:

I knew that this was only the beginning of the progress reports, since there were many stakeholders in these data who all wanted to know where we were with such a complex data collection. Given my workload and the fact that I was the only person charged with managing the cardiovascular data, it was becoming impractical for me to manually check, track and report on the test data for all 2000 participants. I began to consider ways that my work could be automated.

Quality assurance checks could not be automated because results had to be visually inspected by someone with specialist knowledge. MACRO management of the data, such as date tracking the movement of the files, did not require specialist knowledge, but was nonetheless a critical activity for the study. I believed that the process of date tracking could indeed be automated in such a way that relatively unskilled users could be quickly trained to log the dates with a minimal number of errors. If the software worked effectively, these users could even produce progress reports when needed.

The need for software to support tracking a clinical study

When I joined the study, there were no systems in place to track the cardiovascular tests which were already being received from the remote clinical research facilities. Using pen and paper would not have been scalable for tracking large numbers of participants through the various medical tests and subsequent data processing. Using an Excel spreadsheet to track the projected 50,000-60,000 entries would have been very error-prone. It would also not have retained an audit trail of updates made by multiple end-users.

I decided that a software application could MACRO manage the tracking of test data. I therefore began sketching out the idea for an in-house software tool which would let me track the participants and their data. I began by making a schematic drawing which included each of the medical tests (ECHO, IMT, HRV, PWA, PWV) I was responsible for and needed to track. For each medical test I identified the key individual processing steps, which began with the initial acquisition at the remote clinical research facility and ended with uploading data to the LHA's data libraries.

I designed the schematic to represent the processing steps in the simplest manner for each of the 5 medical tests. Each process step could be represented by a field name (the name of the medical test) and a completion date for that individual step. I created some sample output graphs which would show the progress of the work done. I based the sample graphs on what I was already being asked to provide to the stakeholders, but which was taking an enormous amount of time to generate.
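The data model the schematic implies is very small: one completion date per (medical test, processing step) pair. A minimal sketch, in which the step names are illustrative rather than the actual CST step names:

```python
from datetime import date

# Sketch of the tracking record implied by the schematic:
# one completion date per (medical test, processing step) pair.
# The step names here are illustrative, not the real CST ones.
STEPS = ["acquired_at_facility", "received_at_centre",
         "sent_for_analysis", "analysis_returned", "uploaded_to_library"]
TESTS = ["ECHO", "IMT", "HRV", "PWA", "PWV"]

def new_participant_record(participant_id: str) -> dict:
    """An empty tracking record: every step starts with no date logged."""
    return {"id": participant_id,
            "tests": {t: {s: None for s in STEPS} for t in TESTS}}

record = new_participant_record("NSHD-0001")
# Logging a date marks one step of one test as complete.
record["tests"]["ECHO"]["received_at_centre"] = date(2008, 3, 14)
```

Because every test shares the same shape, progress reports reduce to counting which dates are filled in.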

Below are the scribbles which led to the first idea for CST:

I took the schematics to Kevin Garwood, who is the main developer of the LHA’s Data Services Group (DSG). He noted that the simplicity of the features suggested that the application could be generated from a configuration file. Kevin asked me to use my past experience dealing with clinical trials to answer three important questions:

I tried to recall what my 10 years in clinical trial work had taught me. I had worked in various capacities on both large and small clinical trials. Whatever the trial, there were always disparate bits of information that needed to be followed up and completed. The trial always had a main entry form for completion, which was either paper or electronic. However, other test data were always involved in clinical trials, such as blood tests, drugs and physiological testing results, and these rarely had proper tracking within the main form.

The majority of these tests required processing of some kind, even if it was just sending the blood to the laboratory for results. The trial coordinator expected the data to be gathered, analysed and reported without subsequent tracking forms, either paper or electronic. This was always a source of delay because it was difficult to manage, especially if staff changed over during the trial period. Staff changes were frequent, since clinical trials can run over months or years. A new staff member often felt lost about what had been collected so far and what still needed to be collected. Occasionally the major funding body for the trial would ask for updates on the data, which caused multiple staff to spend hours putting together interim reports.

Kevin had been involved with past software projects which generated data entry applications. He raised the concern that generating software applications from a model had both benefits and costs. The main benefit was that the software could allow a non-programmer to add new features to the application, such as new activities and new steps. The cost was that the application would not work well if some activities had to be treated differently from others. For example, he asked whether the minimum data needed to track ECHO tests was the same as the minimum data needed to track HRV.
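The trade-off Kevin described can be sketched in a few lines. Assuming (hypothetically) that each activity is described in a configuration as just a list of steps, the entire form layout falls out of the configuration; the names below are illustrative only:

```python
# Sketch of the config-driven idea: if every activity (medical test)
# is tracked the same way, the form layout can be generated entirely
# from a small configuration. Names are illustrative, not CST's own.
CONFIG = {
    "ECHO": ["acquired", "received", "analysed", "uploaded"],
    "HRV":  ["acquired", "received", "analysed", "uploaded"],
}

def generate_form_fields(config: dict) -> list:
    """Produce one date field name per (activity, step) pair."""
    return [f"{activity}.{step}.date"
            for activity, steps in config.items()
            for step in steps]

fields = generate_form_fields(CONFIG)
# The scheme works only while activities share one shape; as soon as
# one activity needs extra, differently-structured data, the uniform
# generation breaks down -- exactly the cost Kevin identified.
```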

I thought about his concerns and made a number of conclusions:

Kevin produced an initial prototype of the forms using Project35, an open source tool he developed which uses XML schemas to generate simple data entry forms.

The tool was useful for validating which form fields should appear in a tracking application. However, he said that the application was not sufficient to use as the tracking tool. Instead, he decided to develop another tool which generated the kind of forms we needed for our use case.

Kevin managed to produce the first basic prototype within a couple of months. Before I left the LHA in June 2009, I was able to check that the features were sufficient to electronically support the tasks I had done. I greatly enjoyed making something that was simple, generic and useful, and I have remained involved with the project in various capacities since my departure from the LHA.

Additional Contributions

Susan Latham, who was the line manager for the data services group, suggested that the application needed to support loading progress data about subjects from spreadsheets. She became more interested in the potential of using the tool within the DSG and with Di Kuh’s support, development continued.

Tim Gaysin, who succeeded me in managing the data, later suggested that the tool should have the ability to simultaneously update a given activity step in a batch of subject member records. Imran Shah provided data sets which would test the application’s ability to import data.

Phil Curran, who became the next head of the DSG, worked with Kevin to develop validation routines to catch errors in imported spreadsheet data. Most recently, Andy Wong, Tim and Jane Abington suggested that the application needed to be able to produce tab-indented files which showed the progress of filtered subject members through specific activity steps.

The project continues.



(c)2010 Medical Research Council. Licensed under Apache 2.0.