Rumours of Research: The Importance of having metadata

Thursday, 10 May 2012

The Importance of having metadata

There is a lot of interest at present in Research Data Management; it's one of those terms that you can hear being talked about with either excitement or dread. You can learn more from the Digital Curation Centre. There are buzzwords around like "Open Access", "research outputs", "Data Life" (I wish they would talk of data half-life, but that is me) and metadata. Now I am not going to give a detailed account of Research Data Management; what I am going to do is tell a story from this week which highlights the need for ongoing metadata keeping in a live research project.

I have had a long-term research relationship with Margo Barker, which started soon after I came about 19 years ago and has covered many research projects. This one started about four years ago, when Margo was able for a short while to employ someone to look at magazine messages for her. The person did a good job and produced some results in the short time available. However, things moved fast and when she left, Margo found this project was largely swamped by other things. It was not until last summer, when another person working on a short-term project had some spare time to look at it, that it got revitalised. He managed to get the paper ready for presentation, but it was autumn before a paper was ready to be submitted. During this process little new analysis was done.

The paper was sent back for editing and, because of the reviewers' comments, we decided to alter the analysis by removing some of the data. That was fairly easy for the stuff I had on my machine (I had the SPSS programs that did the original analysis all saved), but there was other data I had not seen. A good part of this week has therefore been spent searching computer storage and opening possible files to see if they held the original data. We found it eventually, but it took us a long time.

So what difference would metadata have made to this project? The metadata would ideally contain the creation date, the last edited date and what was actually in each file: not just what data was there, but what analysis had been carried out, any changes made and so on. If this were stored in a central file for that research project, then we could have opened the file and looked up which files contained the information on the number of magazines sampled. We would know what each variable actually was and could probably answer questions on what actually made an article about nutrition. (Fortunately the original researcher was good at documenting the data, so we found that she clearly stated that she had removed articles with only a small bit on nutrition, but this was hidden away in a file and we came across it by coincidence.) Also, we might not have spent quite so much time wondering whether supplements and weight loss products were one and the same. They weren't, but it took a lot of time to find that out, because we did not know we were still short one of the data sets!
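For anyone who wants to try this, here is a minimal sketch in Python of what such a central file might look like. It is not what we used on this project (we had no such file, which is the point of the story). The function names, the field names and the choice of a JSON file are all my own assumptions for illustration. The creation date is entered by hand because operating systems do not report it consistently; the last edited date is read from the file itself.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def describe_file(path, contents, variables=None, analysis="", changes=None, created=None):
    """Build one register entry for a data or analysis file (hypothetical layout)."""
    p = Path(path)
    # Last modification time as reported by the filesystem, in UTC.
    modified = datetime.fromtimestamp(p.stat().st_mtime, tz=timezone.utc)
    return {
        "file": str(p),
        "created": created,  # typed in by hand, e.g. "2008-06-12"
        "last_edited": modified.date().isoformat(),
        "contents": contents,          # what is actually in the file
        "variables": variables or {},  # variable name -> plain-language meaning
        "analysis": analysis,          # what analysis has been carried out
        "changes": changes or [],      # notes on any changes made
    }


def add_to_register(register_path, entry):
    """Add an entry to the project's central register, replacing any older entry for the same file."""
    reg = Path(register_path)
    entries = json.loads(reg.read_text(encoding="utf-8")) if reg.exists() else []
    entries = [e for e in entries if e["file"] != entry["file"]]
    entries.append(entry)
    reg.write_text(json.dumps(entries, indent=2), encoding="utf-8")
    return entries


def find_files(register_path, keyword):
    """Return the files whose description, analysis notes or variables mention the keyword."""
    entries = json.loads(Path(register_path).read_text(encoding="utf-8"))
    keyword = keyword.lower()
    hits = []
    for e in entries:
        text = " ".join(
            [e["contents"], e["analysis"], *e["variables"], *e["variables"].values()]
        )
        if keyword in text.lower():
            hits.append(e["file"])
    return hits
```

With something like this in place, each time a file is created or altered you would call describe_file and add_to_register, and a later search such as find_files("register.json", "magazines sampled") would point straight at the file instead of leaving you to open candidates one by one. A spreadsheet kept by hand would do the same job; what matters is that there is one place to look.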

Moral of this story: most researchers are human, and humans forget. Having your metadata in a usable form makes it easier to pick up an analysis after time has elapsed, and it saves you time because you have a record of what data is where.
