Designing for Maintenance

CST was designed for maintenance by following these steps:

Attract stakeholders who could share maintenance costs by making an application which was useful, which protected sensitive information and which could be adapted independently of the LHA.
Open source the software using a license which could foster a developer community drawn from the ranks of academic and industry-based clinical study projects. In particular, let other projects adapt the code base to suit their needs without requiring them to publish their changes.
Adopt a philosophy which greatly limits the scope of development so that effort may be spent making the software a maintainable, testable and reusable piece of software.
For our work environment, favour planned development over agile development.
Design for testability and software reuse.
Design the software so that maintenance tasks can be shared by people who have different skill sets.
Limit skill requirements of future developers so that cheap labour can successfully maintain the application.

The following sections address these steps in greater detail.

Attract Stakeholders to Share Maintenance Costs
Develop a Design Philosophy for Maintenance
Favour Planned Design over Agile Development
Design for testability and software reuse
Support maintenance by project members with different skill sets

Attract Stakeholders to Share Maintenance Costs

If it is feasible, design for a family of use cases rather than just one

CST was developed in response to a need the LHA had to track case study members through a three-year clinical study. The initial release of the application could have been produced much more quickly had it been designed only to satisfy the needs of the LHA’s use case. For example, CST generates forms based on properties it reads from a configuration file.

It would have been much easier and quicker to create the forms by hard-coding specific fields and buttons for electrocardiograms, pulse wave analysis and IMT activities.

The application could also have been developed so that more data were captured besides the date on which a given case study member completed a step of some activity. At the LHA, people in the data services and data collection groups tend to keep track of information such as the number of backup CDs, document packs or measurements that are associated with data for some step. CST could have supported additional form fields which would allow it to manage additional data associated with the 2007-2010 clinical study.

The fastest way to develop CST would have been to write code that supported only the minimal needs of the LHA use case. This approach would have rapidly produced an initial prototype application that could be tested by staff members.

The drawback of applying this approach would have been that the LHA would find it difficult to attract other projects which would be willing to share software maintenance costs. Sharing development costs is feasible in two scenarios:

Projects partner with the LHA to share the same data
Other non-competing projects share the same code to support the same activities but produce different data.

In either scenario, potential stakeholders may be reluctant to invest in a software tool whose future development would continue to be steered by another project. In general, at least two conditions need to be met for another research group to invest in tools developed by other projects:

The tool has to be useful to them and must appear to have been designed with their needs in mind
The original software authors need to demonstrate that the software can be adapted or configured by another project independently of themselves.

Unless these conditions are met, it can become more appealing for prospective collaborators to apply for separate funding in order to safeguard the autonomy of their own research directions. In order to make the development of CST an appealing shared cost borne by multiple projects, we took the following steps:

Identify a minimal feature set that would appeal to multiple projects but would still be useful to the LHA
Create a generic tool for tracking case study members that could be adapted, configured, enhanced, maintained and reused by other projects independently of us
Maintain data about non-generic clinical study aspects through other software tools.

Isolate sensitive data from the rest of the code base

The LHA would not be able to make CST publicly available if the code base contained sensitive information about the study. For example, suppose the software had a combination field whose choices were NHS numbers. Perhaps because the list of study members was constant in their clinical study, project developers chose to define a static list of NHS numbers in the code. Sharing the source code would then reveal confidential information.

There are other types of information that clinical study projects may not want to appear in a software release. For example, the code used to make the forms could reveal the activities being studied, the methodologies being followed or whether study subjects were people or specific strains of mice.

Project members may feel that the software could indirectly reveal the nature of their work to competing research groups. It may also be important for a project to withhold metadata about its clinical study to safeguard its intellectual property rights.

An application could contain other sensitive information which did not relate to protecting the confidentiality of case study members or to protecting intellectual property rights. Suppose the application contains hard-coded values which show the name, port, user identity and password of a database connection. Sharing this information could help unauthorised users breach security systems.

Most of these concerns can be addressed if a project establishes confidentiality agreements with other groups who would use or adapt the software. However, the overhead of getting bilateral research agreements to protect study data ensures that only a limited number of projects would consider using the software application. By choosing to protect application data merely through law and not through good software design, project developers limit their ability to share maintenance costs with unrelated projects.

CST addresses these concerns by requiring project members to specify sensitive application data in a configuration file that is maintained separately from the code.

The configuration file supports many options, including:

The singular and plural forms of words used to describe case studies (e.g.: case study and case studies; mouse and mice)
The definitions of activities and activity steps
The name and port of the database used to hold data
Names of tables used to store information about users, changes made to activities and information about case study members.

These options allow projects to share knowledge about the code base without sharing knowledge about their study. CST’s ability to separate sensitive and non-sensitive application data makes it possible for the MRC to cultivate a shared software infrastructure with groups outside the organisation.
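The following Java sketch illustrates the general idea of keeping study-specific settings outside the code. It is not CST's actual configuration format: the `StudyConfig` class and the key names (`subject.singular`, `database.name` and so on) are hypothetical, chosen only to mirror the options listed above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical holder for study-specific settings kept outside the code base. */
final class StudyConfig {
    final String subjectSingular;
    final String subjectPlural;
    final String databaseName;
    final int databasePort;

    private StudyConfig(Properties p) {
        subjectSingular = require(p, "subject.singular");
        subjectPlural = require(p, "subject.plural");
        databaseName = require(p, "database.name");
        databasePort = Integer.parseInt(require(p, "database.port"));
    }

    /** Parses configuration text; a real application would read it from a file. */
    static StudyConfig parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new StudyConfig(p);
    }

    /** Fails early if a setting is missing, so no default is hard-coded. */
    private static String require(Properties p, String key) {
        String value = p.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing configuration key: " + key);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        // Illustrative values only; none of these come from a real study.
        String text = String.join("\n",
            "subject.singular=mouse",
            "subject.plural=mice",
            "database.name=example_study_db",
            "database.port=3306");
        StudyConfig config = StudyConfig.parse(text);
        System.out.println("Tracking " + config.subjectPlural
            + " in " + config.databaseName + ":" + config.databasePort);
    }
}
```

Because the code refers only to key names, the source can be published while each project keeps its own configuration file private.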

Open source the code

Open source non-sensitive parts of the application

CST was open sourced as part of an effort to rapidly stabilise a core code base that could survive beyond the funding period of any one clinical study project. We assume that if an application which tracks clinical study subjects is both useful and protects sensitive data, the software will have sufficient value to attract a community of developers.

Open sourcing the code base allows other projects to adapt the software without having to become permanently dependent on the MRC. We hope there is a coincidence of self-interests which would motivate developers to contribute suggestions and bug fixes back to the common code base.

Allow groups to adapt the code base without requiring them to publish their changes.

CST was designed to isolate sensitive application data in a configuration file so that the code base could be shared publicly. Some open source licensing agreements such as the GPL require developers who distribute modified versions to publish the changes they make to the core code base.

The spirit of this licensing condition is to ensure that expertise about an open source application does not become privatised. Known as a “copy-left” provision, this requirement of publishing changes is one way to foster continued community interest in an open source application. However, we anticipate that if the MRC licensed CST using this kind of license, many potential projects would not use the software.

Project members may feel that publishing changes would force them by law to reveal sensitive information about their clinical studies. The spectre of litigation brought by publishing or not publishing sensitive code would far outweigh whatever benefit the application could provide.

Develop a Design Philosophy for Maintenance

By the time Sheila Raynor proposed the development of the software in May 2009, she had already spent months tracking clinical study subjects using pen and paper. It was only when she found that the paper-based method was no longer scalable for her work load that she came to me with the idea for CST. She found it easy to identify work patterns which were amenable for automation. With only a year and a half left in the clinical study, we could predict that the LHA use case would require few other features. I asked her to think of a minimal subset of features which would appeal to a family of use cases instead of just the use case for the 2007-2010 Clinical Study.

During the development of the initial prototype, we had to consider what features the design could support and which features it should support. The scope of development had to consider the current and future work environment which would support further work on the software. We made a number of assumptions:

The MRC would view the development of the core code base as an overhead cost rather than as an activity which produced revenue
If CST accommodated too many features, there would not be enough time for me to invest in maintenance activities which would make the software a reusable asset for other groups
If CST failed to garner a developer community, the application would have to be limited in scope so that a single future developer could maintain it.
The application would be maintained by a succession of lone developers. There would be no face-to-face opportunities for successive generations of developers to do knowledge transfer with their successors.

We knew that the application would likely support the LHA use case for the duration of the LHA’s clinical study. However, we were less clear whether the software would meet the evolving needs of other use cases. Because of the circumstances of our use case, we adopted the following philosophy about how the tool should be developed: When you’re developing to suit a family of use cases, don’t make the development activity chase any one use case; let the application pass through them and plan for obsolescence from the outset.

"Chasing a use case" means that developers continually evolve the software to meet the changing needs of a use case. Most software projects chase a use case and in many, the maintenance and enhancement activities provide revenue for perpetuating a business activity. In our environment, developer resources are acquired for the duration of a research activity and software maintenance becomes a research overhead that needs to be minimised. If we are trying to design for a family of use cases, particular features in any one of them could lead to uncontrolled feature enhancements that would make the original design brittle.

It is a common sentiment in the software industry that if an application is designed to be all things to all people, it ends up being of little value to anybody. Therefore, we decided that CST would principally be designed to track only the dates on which case study subjects completed a given step of a given activity. If a project had different needs, then it would need to use a different application.

"Let the application pass through them" means that we accept that CST may be relevant in one phase of a clinical study and not relevant in another. For example, in the beginning of a study researchers may be content tracking the dates when study members complete an activity. However, as the study evolves, project members may want to track more complex kinds of information. In these cases, they can maintain other data in a spreadsheet or they can use a more elaborate application to support their needs.

"Plan for obsolescence from the outset" means that we have asked the question “How do users stop using CST?” since we began work on the first release. This way of thinking would seem backward to software companies. But in promoting the core code base, we are trying to recover a research overhead rather than trying to sell anything. No project wants to be vulnerable to vendor lock-in, especially if they believe that future funding for development could stop. CST has been designed for obsolescence, which means we have considered how the software will age, die and rot or be reused. Software re-use is discussed in greater detail in this section.

Favour Planned Design over Agile Development

By embracing the philosophy of passing through a use case, we can combine the certainty of past and future development activities to support a high degree of planned development. Emphasising planned development carries a risk which has motivated the agile development practices in industry. In particular, planned development:

Requires resources to produce an architecture which can support non-functional requirements. Often it is difficult for end users to see the value in abstract concepts such as maintenance, testability and reusability.
May introduce unnecessary cost overheads if anticipated features do not end up being used in practice.
Often requires investment in design documents which can become outdated with respect to the code they describe.
Can be difficult to maintain if future requirements are not predictable.

Agile development emphasises relying on people more than processes and has been shown to work well in small teams of collocated developers. The approach favours having developers produce rapid iterative releases that focus only on features that end users require now. Agile also assumes that a stable design will emerge as developers continually revise code in response to changing needs. Instead of relying on design documents, agile developers rely on a number of other activities such as:

frequent stand-up meetings
pair programming, where programmers sit side by side to co-author code
writing automated test cases which describe application behaviour

Techniques such as these often work well when developers have frequent dialogue with each other and with customers. However, agile development does not suit the LHA work environment for at least two reasons:

We assume that there will be little face-to-face interaction between successive generations of lone developers. We also assume the LHA will continue to operate with only one full time developer.
If CST garners a developer community, the group of developers is likely to be dispersed all over the world in different time zones.

If product knowledge cannot be conveyed in person, then it must be conveyed through some form of documentation. For our work environment, planned development carries a number of benefits. In particular, the design documents and the architecture of the code base can:

Allow future developers the option of adapting the software or developing something else from scratch.
Allow developers who work abroad on unrelated projects to determine whether the application meets their needs
Guide other developers in making minor changes, so that they do not need the skill to produce a large, complex application.

CST was designed so that it could be maintained by third-year computer science students.

Design for testability and software reuse

Software applications become more maintainable if they are also designed for testability and reusability. Once developers make changes to the code base, they will want to ensure that the application behaves correctly. When an application supports automated testing, developers can quickly perform regression testing to ensure that changes they make don't cause errors in existing features.

When software is designed for reusability, developers can save time creating new code by borrowing existing code.

Support maintenance by project members with different skill sets

Generate activity forms from a configuration file

Because Sheila had already been tracking subjects on paper for months, she had a very clear idea of what tasks she did over and over again. Based on her assurance that the content of the forms would not change much, I decided to try and generate data entry forms from a configuration file rather than create a hard-coded form for each activity. The configuration file properties describe a data model of clinical study activities which is used to create the rest of the forms and database code for the application.

Model-driven software works well when you can assert with certainty that all of one kind of feature can be created in the same way throughout an application. But there is great risk in using a model-driven approach to create software when patterns in the feature set have not been clearly established. Suppose Sheila had just started her tracking tasks and concluded that all she would need for each activity was a sequence of date fields. I could have assumed that each activity required the same features and used a configuration file to generate all the activity forms. Then suppose months later she realises that one or two activities had to be curated in a very different way than the others.

If this happened, the anomalous requirements of certain activities would break the assumptions that made it appealing to use form-generation techniques. Fortunately, Sheila had been doing the same tasks for a long time and felt confident that her future curation needs for each activity would not change.

The benefit of using form-generation techniques is that a non-developer can define the activities that are used to generate the rest of the software. A scientist can edit the configuration file in a text editor and generate most of the functionality without drawing on his team’s limited developer resources.
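To make the form-generation idea more concrete, here is a minimal Java sketch that turns an activity definition into a list of date fields, one per step. The key format (`activity.<name>.steps`), the `ActivityFormGenerator` class and the step names are all assumptions made for illustration; CST's real configuration format and generator are not shown in this document.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical generator: one date field per step of a configured activity. */
final class ActivityFormGenerator {

    /** Describes a single generated date field. */
    static final class DateField {
        final String activity;
        final String step;

        DateField(String activity, String step) {
            this.activity = activity;
            this.step = step;
        }

        /** Label a form could display next to the date input. */
        String label() {
            return activity + ": date completed '" + step + "'";
        }
    }

    /** Reads the comma-separated steps of an activity and builds its fields. */
    static List<DateField> fieldsFor(Properties config, String activity) {
        String steps = config.getProperty("activity." + activity + ".steps");
        if (steps == null) {
            throw new IllegalArgumentException("Unknown activity: " + activity);
        }
        List<DateField> fields = new ArrayList<>();
        for (String step : steps.split(",")) {
            String trimmed = step.trim();
            if (!trimmed.isEmpty()) {
                fields.add(new DateField(activity, trimmed));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        // Assumed example entry; a non-developer would edit this in a text file.
        config.setProperty("activity.ecg.steps", "recorded, backed up, analysed");
        for (DateField field : fieldsFor(config, "ecg")) {
            System.out.println(field.label());
        }
    }
}
```

The sketch only works because every step is treated identically as a date field, which is exactly the assumption the preceding paragraphs warn about: an activity that needed a different kind of field would not fit this generator.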

Isolate user interface messages in a properties file

All of the text messages that end-users see in the application are defined in a properties file called CSTProperties.properties. It is a text file containing lines resembling the following examples:

navigationBar.search=Search
provenance.userChanges.title=History of changes made by {0}.

The software uses a code such as "provenance.userChanges.title" to fetch the message it displays as the title for the dialog that shows the history of changes made by the current user. Many of the messages contain numbered placeholders such as {0}, which mark positions that the program fills in with text at run time. CST then performs the substitutions to produce a message like “History of changes made by kgarwood”.
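The lookup and substitution described above can be sketched with Java's standard `java.text.MessageFormat`, which uses the same `{0}` placeholder syntax. Whether CST itself uses this class is an assumption, and the `Messages` helper below is hypothetical.

```java
import java.text.MessageFormat;
import java.util.Properties;

/** Hypothetical helper that looks up a message by key and fills its placeholders. */
final class Messages {
    private final Properties messages;

    Messages(Properties messages) {
        this.messages = messages;
    }

    /** Returns the message for the key with {0}, {1}, ... replaced by args. */
    String get(String key, Object... args) {
        String pattern = messages.getProperty(key);
        if (pattern == null) {
            throw new IllegalArgumentException("No message defined for: " + key);
        }
        // Note: MessageFormat treats a single quote as an escape character,
        // so messages containing apostrophes need them doubled.
        return MessageFormat.format(pattern, args);
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("navigationBar.search", "Search");
        p.setProperty("provenance.userChanges.title",
            "History of changes made by {0}.");
        Messages messages = new Messages(p);
        System.out.println(messages.get("provenance.userChanges.title", "kgarwood"));
    }
}
```

Keeping the patterns in a file means the wording can change without recompiling, as long as the placeholder numbers still match the arguments the program supplies.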

The properties file is well commented and can be altered by a team member who isn’t a programmer. They can go through the messages and make changes to the text. For example, they may want to change the context sensitive help that appears when users pass over a specific button.

Support rapid prototyping by generating forms that interact with a fake data repository

CST uses activities defined in the configuration file to generate entire applications for administrators and data curators. It is able to generate forms which interact with two types of data repositories. The first is an in-memory database used when project members are trying to decide what case study subjects and activities they want to support. Non-developers can make iterative changes to the configuration file and use CST to instantly generate data entry forms for the activities they have specified. Instead of writing to persistent storage, these forms use a repository that simulates it by managing in-memory instances of the data container classes defined in CST’s business concept layer. The demonstration version can be run on a computer which does not have MySQL installed on it. CST’s use of a demonstration repository helps reduce the cost of developing prototype applications in order to elicit feedback from prospective users.
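One common way to support a demonstration repository is to have forms depend on an interface and supply an in-memory implementation for prototyping. The Java sketch below shows that pattern under stated assumptions: the `StepDateRepository` interface, its method names and the in-memory class are hypothetical and are not taken from CST's code base.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical storage abstraction; forms would depend on this, not on a database. */
interface StepDateRepository {
    void recordCompletion(String subjectId, String step, LocalDate date);

    Optional<LocalDate> findCompletion(String subjectId, String step);
}

/** Demonstration repository that simulates persistent storage in memory. */
final class InMemoryStepDateRepository implements StepDateRepository {
    // subject id -> (step name -> completion date)
    private final Map<String, Map<String, LocalDate>> datesBySubject = new HashMap<>();

    @Override
    public void recordCompletion(String subjectId, String step, LocalDate date) {
        datesBySubject.computeIfAbsent(subjectId, id -> new HashMap<>()).put(step, date);
    }

    @Override
    public Optional<LocalDate> findCompletion(String subjectId, String step) {
        Map<String, LocalDate> steps = datesBySubject.get(subjectId);
        return steps == null ? Optional.empty() : Optional.ofNullable(steps.get(step));
    }

    public static void main(String[] args) {
        StepDateRepository repository = new InMemoryStepDateRepository();
        // Made-up subject identifier and step name, for illustration only.
        repository.recordCompletion("subject-001", "ecg recorded", LocalDate.of(2009, 6, 1));
        System.out.println(repository.findCompletion("subject-001", "ecg recorded"));
    }
}
```

A database-backed class could implement the same interface later, so that forms written against the interface would not need to change when a project moves from prototyping to real data.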

Produce design documents

Great effort was made to document all the decision issues that have influenced the development of CST. Whereas comments in the source code describe what small fragments of code do, the documents provide a higher-level context for many of the individual classes. Future developers can use the documents to evaluate which parts of the code base suit the future enhancements they want to make.

Adopt coding conventions

The code base was developed using a number of coding conventions and best practices. Coding conventions help make the code easier to read by other developers. Best practices address a number of development concerns and encourage developers to extend patterns of development that could benefit future enhancements. Both coding conventions and best practices encourage patterned development that makes it easy for other developers to make enhancements.

Support a plugin framework

Most of the features for CST can be generated using labour from project members who are not developers. However, generated software applications often suffer from the weakness that they provide 90% of the features you need in general but lack 10% of the features you need to support your specific use case. Customising the software to support specific features usually requires work by developers.

Software generation techniques are best used to support generic features. Domain specific features can be supported through an extensible plugin framework.

CST allows developers to create plugins for visualising activity data, plugins to gather external information about case study subjects, and plugins to import and export data. Because developers only need to write the plugin, customisation takes less time and skill than creating an entire application from scratch.
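As a rough sketch of how an export plugin might look, the Java example below defines a small plugin interface and a registry that the generated application could consult. The names `ExportPlugin`, `PluginRegistry` and `CsvExportPlugin` are hypothetical; CST's actual plugin interfaces are not described in this document.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical contract that a domain-specific export plugin would implement. */
interface ExportPlugin {
    String name();

    /** Converts step names and their completion dates into an export format. */
    String export(Map<String, String> stepDates);
}

/** Example plugin: writes one "step,date" line per entry. */
final class CsvExportPlugin implements ExportPlugin {
    @Override
    public String name() {
        return "csv";
    }

    @Override
    public String export(Map<String, String> stepDates) {
        StringBuilder out = new StringBuilder("step,date\n");
        for (Map.Entry<String, String> entry : stepDates.entrySet()) {
            out.append(entry.getKey()).append(',').append(entry.getValue()).append('\n');
        }
        return out.toString();
    }
}

/** Keeps the generic application unaware of which plugins exist. */
final class PluginRegistry {
    private final List<ExportPlugin> plugins = new ArrayList<>();

    void register(ExportPlugin plugin) {
        plugins.add(plugin);
    }

    String run(String pluginName, Map<String, String> stepDates) {
        for (ExportPlugin plugin : plugins) {
            if (plugin.name().equals(pluginName)) {
                return plugin.export(stepDates);
            }
        }
        throw new IllegalArgumentException("No plugin named: " + pluginName);
    }

    public static void main(String[] args) {
        PluginRegistry registry = new PluginRegistry();
        registry.register(new CsvExportPlugin());
        Map<String, String> stepDates = new LinkedHashMap<>();
        stepDates.put("consent given", "2009-05-01"); // made-up example data
        System.out.print(registry.run("csv", stepDates));
    }
}
```

In this arrangement a developer writes only the class that implements the interface, while the generated forms and the registry stay unchanged. Java's standard `java.util.ServiceLoader` is one possible way to discover such classes at run time, though that is a design option rather than something the source describes.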

Author: Kevin Garwood