Chasing shadows – acquiring and managing virtual collections
Poppy Wenham, National Museum of Australia, 14 May 2010
JOANNE BACH: Our next speaker is Poppy Wenham who came from a very different background and, I guess in keeping with our other two speakers we have heard from, didn’t start her working life thinking she would be dealing with collection issues in a museum context. Poppy has previously been the registrar at the Australian War Memorial. She is currently on secondment to the National Museum of Australia as the manager of registration.
POPPY WENHAM: Thank you very much, Jo. It’s been a very interesting discussion so far this morning. I am interested to see how many threads I am picking up in the talk I am about to give now. I would like to raise some issues and to start a dialogue about what I see as a challenge in adapting museum practice to at least if not encompass then consider the acquisition and management of born digital collection material. You might wonder how that fits into a session that is focused around access. But as was said earlier this morning, for me, access is a given as one of the goals of museums. If our collections are not accessible, I am not sure why we have collected them. But before we can provide that access, we have to acquire things and we have to manage them. I am interested in this conversation because I think born digital is a growing area for us that we must make a more conscious decision whether we engage with or ignore.
In their simplest forms, museums are places for the keeping, exhibition and study of objects – objects of scientific, artistic and historical interest. Distinct from libraries which have a long history in looking after the information, in museums we have traditionally focused on physical manifestations of culture. As Jo alluded to, I came to museums through the performing arts and I am passionate about objects because of the way that they can tell stories. Objects have an amazing capacity to help us to understand the experiences of people in other cultures, in other places and in other situations or to shed light on our own lives and our own experiences.
The other thing that I think we need to acknowledge is that, more than any other cultural institutions, museums have shown a remarkable capacity to adapt. Once the realm of the private collector, museums were curiosity cabinets of strange and interesting things, and we have discussed a slight nostalgic yearning for those days. Even in my childhood, museums were still places that were object rich but interpretation poor. Today, words like ‘context’, ‘provenance’, ‘interpretation’ and ‘access’ are essential concepts in a modern museum. The physical object is the catalyst for sparking the visitors’ imagination and leading them into the world of exhibition, the world that the curator wants to share.
Along the way to get there, museum practice has adapted to accommodate new media. Museums have seen an explosion of that new media and an explosion of new objects into our collections: glass plates, negatives, photographs, film, video, as well as audio recordings on wire, wax cylinders, vinyl, tape and now on those CD disc things. We have accommodated all of these new things, sometimes with a bit of soul searching about the validity of certain types of objects. Most institutions have come up with a range of collection types. We at the NMA have a core collection, which is the National Historical Collection, but we also have subsidiary collections for props, archival material, support material or education material - and many other museums have made the same choices. Provenanced examples sit above type examples, and facsimiles and replicas occupy a grey netherworld on the fringes of collections. To paraphrase George Orwell, ‘All objects are equal but some are more equal than others.’ In terms of actual management of these items though, we still always had a physical original artefact: an object that we could acquire, accession, research, display and make accessible; an object we could put a number on, register, stocktake, preserve and manage.
What is born digital? As Jennifer was saying earlier this morning, many of the things that we have traditionally collected are now being born digital. Good examples of that are maps, architectural drawings, CAD, diaries, cartoons - all these things were once commonly hand drawn but are now being created on computers. They are born digital. Sometimes we can still print out a physical version but often we don’t need to. In addition to those examples, there are whole new things coming into existence that exist purely in the digital sphere such as wikis, blogs, tweets and online games. All of these things are genuine expressions of our society and culture that can only exist digitally, sometimes uniquely but often shared in the clouds. There is no unique physical object for museums to collect or display; there is nothing tangible at all. Does this matter? Can we adapt our existing practice to accommodate this material or do we ignore it as ephemeral, knowing that the platforms will soon be obsolete and the world will have moved on, and tweets will feature in some future exhibition as an odd insight into the noughties just like yo-yos, hula hoops or pokemons and tamagotchis are passing fads of other generations?
I want to make clear that what I am not talking about here is digital preservation strategies. There has been a lot of work around digitisation, which is making a digital copy of an object to facilitate access and to support preservation, including the managing of format obsolescence. I agree with earlier speakers that I think digitisation for access is extremely important but it does not replace the real thing. It is a way of managing the preservation of the object. In most cases with digital preservation, it is clear what the source object is and what are the digital copies that are used for preservation and access. Almost all organisations have clear policies about digitising their collections, and many have resources and projects underpinning this work. The work is important, but it can blur our thinking about born digital material.
Far fewer organisations seem to have policy around the collection and management of born digital material. In my research, it seems that libraries, particularly in the UK and US, are dealing with this topic reasonably well, but I have found no explicit policies for museums. It’s a topic that is being canvassed in the blogosphere, but there seems to be little definitive benchmarking or decision making around the topic.
I want to ask: what do we need in a born digital collection management policy or sets of procedures? Before I go on, I will give you a couple of examples of the sorts of things I have come across in my registrar practice. The one that started me thinking on this was a bequest that was made to the War Memorial while I was the registrar there by Michael Southwell-Keely. He had developed a website which was a database of war memorials around Australia. The database could be searched geographically, so you could go to a place and see a picture of the war memorial, and it also listed the names of all people recorded on the war memorial so you could search by name to do genealogical research with the site. Michael created the site but he didn’t create the information on it. He had a community of interest who took photographs, recorded the information and formed the website. On his death, Michael bequeathed that website to the Australian War Memorial. It created a lot of questions which are good questions to consider: What actually was bequeathed? Was it the website as it stood on the date of his death, the website as it stood on the date probate was granted or the website at some other point in time because it was a changing thing? Should we just download it onto a CD, accession and keep the CD? At the moment those questions are still out there. The website is still hosted on a third party ISP, and the memorial is effectively a silent partner in keeping that website up there. I don’t think that matters. What matters is that we ask the questions about that collection.
Another example is in the field of diaries and this is another War Memorial example. The War Memorial holds thousands of diaries from the First World War. They are wonderful accounts and they also carry with them a physical culture that is richer than just the stories they embody. We have diaries with bullet holes in them, diaries with dog-eared corners, diaries that are stained, torn and ragged. In the Second World War there were far fewer diaries, but in later conflicts there are almost none. Today’s soldiers are keeping blogs and there are so many of these blogs that there is now a term for it - S-blogs or soldier blogs. A blog isn’t a diary, and I am not pretending that it is. It is deliberately produced for publication; it is written with the intention of a third party audience; but in the absence of diaries the blogs are all we’ve got. Should we be collecting them or not?
And finally cartoons. When the NMA first began its annual cartoon exhibition, 100 per cent of the cartoons were hand drawn, and the Museum had a physical object to acquire if it chose. By 2009, 72 per cent of the cartoons or 63 out of 87 were born digital. Currently, high quality images of the born digital originals are being generated for the exhibition and then these are being acquired into the Museum collection. Is this the best solution or should we be acquiring the digital originals into the NHC in the same way that we are acquiring their analog counterparts?
The point of these examples is just to show that we are in an area of developing policy, but born digital material is coming to us whether we like it or not. So I feel that we need some way to assess whether born digital material meets our collection development policy. I think this is reasonably straightforward, although we need to mesh that with our analog policy. We need to consider collection types. Should we treat born digital material as a class of material on its own? This is a common approach in libraries where born digital material goes into special collections, although even that approach is being discussed for appropriateness at the moment. Or should we treat born digital material like the analog material it is closest to - images like photographs, blogs and websites like other published material? Should we treat born digital material as a separate collection type but within the collecting sphere that it is closest to? This model sits well with current ideas of curatorial subject matter expertise.
What are the pros and cons of these approaches? What are the expectations of the public and external stakeholders? I am not sure. We need some brainstorming and debate around this to come up with a solution that is effective and appropriate. I think there is clear merit in establishing a collaborative approach in line with decisions being made at other institutions, but how will we facilitate this? Museums also need to work out what the born digital material is that they are acquiring, and how they might preserve, store and manage that digital asset.
So I have raised a few issues which I think are the things we consider when we manage analog collections in the digital world. The first thing is metadata. Metadata is the digital documentation. It provides information about the digital asset, particularly about its structure and organisation. It’s the glue that holds the digital asset together but it also holds critical information about the creation, format, modifications, authorship and many other contextual aspects of the digital item. Importantly, in respect of issues I want to touch on, metadata may be embedded or it may come in a wrapper. Why this matters is that we have to manage both bits. As we already know with physical objects, an object with no provenance is a sad object. So we need to manage both of those things.
But there are questions about that. Do we need to standardise the metadata? Probably. Will we need to enhance the metadata, for example, to support geo-tagging or other yet unimagined future initiatives? Again, probably. If we do this though, does this change the inherent nature of the original digital object we are acquiring? Clearly. Is that acceptable? It is not with a physical object. Should we keep the original item and then make an ‘access’ copy which is where we might standardise and flesh out the metadata so as to maintain the integrity of the original? Perhaps. That is what a lot of libraries are doing.
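One way to picture the ‘keep the original, enhance the access copy’ option is the minimal sketch below. It is an illustration only, not a description of any institution's system: the function name make_access_copy and the metadata field names (creator, format, geo_tag) are hypothetical and chosen just for this example.

```python
import copy


def make_access_copy(original_metadata, enhancements):
    """Return enhanced metadata for an access copy.

    The original metadata is deep-copied first, so the record for the
    source item is never modified by the enhancement.
    """
    access_metadata = copy.deepcopy(original_metadata)
    access_metadata.update(enhancements)
    return access_metadata


# Hypothetical metadata as it arrived with the born digital item.
original = {"creator": "unknown", "format": "image/tiff"}

# Hypothetical enhancement added later, for example a geo-tag.
access = make_access_copy(original, {"geo_tag": {"lat": -35.29, "lon": 149.12}})
```

In this sketch the enhanced fields live only on the access copy, so the metadata of the source item stays as it was at acquisition.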
Storage and management
As in physical object collection management practice, to ensure ongoing access a unique persistent identifier has to be assigned to every digital collection item and also to each deliverable part of a digitised collection. But we need to link the source items to their access and preservation copies and any other derivatives that we may create so that we don’t lose the original or, worse still and far more expensively, end up storing vast repositories of unnecessary digital clones.
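The linking of a source item to its copies could look something like the sketch below. It is a simplified illustration under stated assumptions: the class names DigitalItem and Registry, the role labels and the file names are all hypothetical, and a UUID stands in for whatever persistent identifier scheme an institution actually adopts.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DigitalItem:
    role: str                        # "source", "preservation" or "access"
    filename: str
    source_id: Optional[str] = None  # identifier of the source item, if a derivative
    identifier: str = field(default_factory=lambda: str(uuid.uuid4()))


class Registry:
    """Keeps every item under its identifier and links derivatives to a source."""

    def __init__(self):
        self.items = {}

    def register_source(self, filename):
        item = DigitalItem(role="source", filename=filename)
        self.items[item.identifier] = item
        return item

    def register_derivative(self, source_id, role, filename):
        source = self.items.get(source_id)
        if source is None or source.role != "source":
            raise KeyError("derivatives must point at a registered source item")
        item = DigitalItem(role=role, filename=filename, source_id=source_id)
        self.items[item.identifier] = item
        return item

    def derivatives_of(self, source_id):
        return [i for i in self.items.values() if i.source_id == source_id]


registry = Registry()
source = registry.register_source("cartoon_original.psd")
access = registry.register_derivative(source.identifier, "access", "cartoon_access.jpg")
```

Because every derivative carries the identifier of its source, a copy cannot be registered without an original, and the copies of any item can be listed before deciding whether another one is needed.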
The digital asset needs to be accessible within the same regime as other collection items. That is, in my view, it must have a presence on the organisation’s collection management system. It is likely, however, to be managed through some other form of digital asset management system, so care needs to be taken in linking those two systems and ensuring that those links are strong and ideally that the persistent identifier is portable between the two IT systems.
How might we replicate the sort of secure museum storage that a physical object would get? Should we have dedicated secure storage and offsite backup separate from other museum systems for the source work? I think so. Should we consider different types of backup such as tape rather than server replication? I imagine that storage for source born digital material would have restricted access and strong audit trails so that the digital assets are just as safe as the physical items in a repository store. There are important decisions to be made here although, as the actual cost of digital storage is getting cheaper and cheaper, the actual cost to manage that storage which is a human cost gets more and more expensive. So we can’t be lulled into thinking just because terabytes are cheap that we can store everything.
In digital terms, preservation is about retaining the digital file integrity using some regime of checksum and parity checking to ensure that the file acquired retains its original characteristics; that is, the strings of 1s and 0s stay in the same order. But what about platform obsolescence? It is not only possible but also now easy to use online format registries to check the life cycle of an incoming digital asset. In digital preservation terms it is completely reasonable to change the format of a created digital asset to ensure ongoing accessibility, management and format security. But if we do this with a born digital asset, are we changing the inherent nature of the original digital object? Yes, I think we are. As with the metadata discussion, should we retain the original item in its original and potentially unreadable format just for the sake of historical authenticity but migrate the preservation and access copies to more relevant formats over time? Does this make sense? This is the practice that seems to be most popular with other collecting institutions.
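The file integrity check described here can be sketched as follows. This is one possible approach rather than a recommended regime: SHA-256 is an assumption made for the example (the talk does not name an algorithm), and the function names sha256_of and is_unchanged are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_checksum):
    """Return True if the file still matches the checksum recorded at acquisition."""
    return sha256_of(path) == recorded_checksum
```

The checksum would be recorded once when the item is acquired and then recomputed on a schedule or after any transfer; a mismatch means the 1s and 0s are no longer in the same order. A check of this kind says nothing about whether the format is still readable, which is the separate question of obsolescence.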
I am not going to go into the matter of rights but, since rights management is effectively no different whether an object is virtual or tangible, we need to consider in born digital the rights inherent in the source platform owner. For example, the blogging platform that the soldier’s diary is created on.
Useful lives is another issue. No matter what preservation strategy we apply, useful lives is a very untested concept with digital assets and relies entirely on platform migration. It’s a given that digital assets will have a much shorter life than the analog items in collections. There has been a lot of work around digital life expectancy in the US and the UK for commercial applications. For example, you can go onto websites that give you methodologies that model the digital life cycle of any asset. But this is a consideration for museums, because the balance between the amount of resources you need to acquire, collect and maintain the object against a life cycle that may be only three to five years may make us say that this is simply not worthwhile. To some degree, the life cycle also impacts on value, certainly for the accountants if not for the collection managers. The degree to which we may be able to commit resources to the acquisition and management of born digital collections may ultimately come down to a cost benefit analysis.
Disaster management is also a consideration. Born digital collections need to be considered in the organisation’s business continuity and disaster planning regime. This is probably obvious, but it requires resources and policy because we need to ensure that the born digital repository doesn’t become an accidental sacrifice to an IT disaster where the emphasis is likely to be on rapid restoring of normal business functions and not on the born digital repository.
Finally, we need to be clearheaded and ruthless about the media that the digital asset arrives on. In my view, once the digital asset is safely transferred into its secure digital repository and we have parity checked that we have acquired the right thing, the CD, DVD, hard drive, thumb drive, floppy disc or punch card that it arrived on can go in the bin. It is no different from the cardboard box that a physical collection item might have been packed into. But at the moment I know that many of us are still storing and managing those physical media.
As I said at the outset, born digital collection management challenges our collection management paradigms and processes. I think this is a challenge that museums can rise to and overcome. I firmly believe that we need to be very mindful of the work occurring in the library field to inform our decision making, although we may not necessarily go down the same path. It’s a given for me that museums should be acquiring digital material and that we should be engaging with these questions, but I appreciate that this is not a given for everybody. I don’t mind so long as we have the discussion. Thank you very much.
Date published: 31 May 2010