The data lifecycle

A typical dataset has a longer lifespan than the research project that creates it. Although research projects usually end when funding ceases, researchers often continue to work with the datasets they collected well after that point and, more often than not, at a new institution. A typical data lifecycle (adapted from the UK Data Archive) includes both privately and publicly managed data, and can be illustrated as follows:

The management of research data becomes more complex the longer it needs to be kept, especially when research involves national and international collaboration, when researchers move between institutions, when retention periods require research data to be archived or disposed of, and when research data is re-used.

Data management usually begins in the private domain, when data is created or collected by a researcher. The type of data varies markedly across projects and research disciplines. In addition to digital information, data can also include physical specimens (inorganic and biological), historical papers, and audio and visual files.

Managing the transition of data through the data lifecycle, from the private to the public domain, requires careful data management planning. Complex data lifecycles require data to be well organised and documented throughout each phase. Once research projects are completed, selected data is generally made public through the publication of journal articles and, where possible, deposited in data archives. Research data that is well described and properly archived becomes an invaluable resource for advancing scientific inquiry and increasing opportunities for learning and innovation.

The responsibility for data management lies primarily with researchers; however, Swinburne acknowledges that providing a framework of guidance, tools and infrastructure is essential to helping staff manage all aspects of the data lifecycle.

Additional resources