Thursday, 17 May 2012

On Chucking Things Out

This week I am moving offices. Not only that, the department has decided that we need to do a massive clear-out before we move, as they do not want to pay to move things that will never be used. I am having some problems with this edict.

Firstly, I am a statistician by training, and culturally statisticians are hoarders. That is, they tend to store data indefinitely on computers, keep copies of long-forgotten analyses and so on. When I first started working as a statistician, half the office the four of us shared (it wasn't big) was taken up with computer output. There were stores elsewhere as well, I believe. Even when we got half of a much bigger office, under all the waist-height shelves were boxes full of output and paper copies of data. You learn quickly about deleting something too soon, and I did it often enough in those early years.

I developed a method of working for when I was in charge of the analysis, and it involved big ring binders. I would do an analysis, print it out, put it into a ring binder, and then summarize the results for the front of the binder. If you knew my approach to most other things, you would realise how thoroughly I had taken on the culture. The ring binders were then kept forever, just in case there was a need to return to the analysis. To statisticians there is no such thing as a dead analysis whose output can be junked.

I have had to be radical with this and say that files where the analysis is several years old, has not been recently consulted and has been published really should be counted as dead. If I did not know whether an analysis had been published, I presumed it dead as well.

I am also semi-pathological about this with data, but these days data is computerised. However, until fairly recently I kept all the hard copies of surveys that were conducted about fifteen years ago. Well, if the worst came to the worst and we wanted to go back to them and the computer network had been wiped out, we could always re-enter the data. The oldest data set I have on the computer is one of those sets; it dates from August 1996. So the computer copy outlasted the paper ones in this case.

These data sets are on computer, so none are being lost, although I will back up my hard disk before we move. I must also look into creating a better backup system. The work is now fairly organised, at least as far as data goes: a folder for each senior researcher, then folders for topics, or, for a student, a place in queries under the student's name. The main problem is that a lot of the time the data is shared between my computer and the consultee's, and there can be problems with knowing where a file is.

Secondly, manuals are a tool of my trade. I have used them for answering users' queries since I first started this job, and my reading of them gave me a head start in my previous ones. Sometimes you don't even need a copy of the software if you have a good manual. Part of my job seems to be to RTFM for other people. Needless to say, I like having manuals around; they feel like a safety net for the times I do not know the answer.

However, paper copies of manuals are disappearing (so much so that I sometimes print off copies of PDFs, as I need a paper copy and I can't buy the manual). Also, I do not use manuals equally. The SPSS syntax manual is the one I consult most frequently (I wonder whether it is time I got a new copy, as mine is ten versions out of date, although syntax changes slower than most people think, and whether I can even get a new copy). At the other extreme are the SPlus extra modules, which after a decade aren't even out of their wrapper, and we don't even support the software anymore. Some of my manuals are old, very old; some had been in the department longer than me.

However, the move has been towards more and more online manuals. I have a similar attitude to master copies of software, especially as I have been caught out without them. Universities seem to be the place for users of historic software.

So I have had to have a cull, one where I got rid of everything for software that is no longer supported, or where more up-to-date alternatives were available online. This meant chucking out one huge lot of manuals that were over twenty years old, three versions out of date and that I had not consulted in the last two years.

The thing is, I am only doing this because these things are on paper. If they were computerised, there is no way I would be doing it; there would be pressure on me to preserve them even longer, well, maybe not the manuals but definitely the analyses. It seems to me the demand to archive data is an outcome of the present computer age; before then archiving was difficult and therefore not attempted. Only the highly methodical kept notes and data carefully enough that they were still around when their owner died, and then, if they were important, they were archived. Things weren't transportable, so keeping things was often not an option.

Yes, I understand fully why we want to keep things, but I am becoming aware that a sensible deletion policy might well be part of a good data management plan!
