Papers Past digital archive shows the power of adapting to change
By Nick Butler in Development on September 25, 2018
Find out how adapting to change has helped the National Library revolutionise historical research in New Zealand by creating an easy-to-search digital archive of historical records.
The Papers Past website is the National Library’s online archive of New Zealand newspapers, magazines, journals, diaries, letters and parliamentary papers. It lets you search through an ever-growing collection, spanning more than a hundred years of New Zealand history.
Previously you had to trawl through historical records like newspapers by skimming through them one by one on microfilm. Now you can search for the subject you’re interested in across the whole collection. What might have taken many years now takes milliseconds, kicking off a new era of research.
“We get a lot of commentary from people working in digital humanities and historians working in academia that historic research in New Zealand is now being divided into pre-Papers Past and post-Papers Past phases,” says Emerson Vandy, Digital Services Manager at the National Library.
One of the secrets to the team’s success is the way they adapt to what they learn. When Emerson took the reins at Papers Past it was a time of change and constraint. In this case study you’ll see how he learnt from the work already done, how he negotiated those constraints and what he plans for the future of this game-changing service.
Helping Kiwis access the story of their past
“You’ve got three high level areas that are important within libraries. You’ve got collecting of things, you’ve got the preserving and care of things, and you’ve got providing access to things. Papers Past is about providing access to things,” Emerson says.
And provide access it does. 1.1 million people, or one in four New Zealanders, have used the National Library’s Papers Past site.
Beyond the bare numbers, one of the things Emerson has learned is how much the digital archive means to people.
“We get a never-ending stream of emails from people thanking us, often profusely, often emotionally,” he says. “People sharing really private, personal things about their lives or their ancestors’ lives, or really intimate feelings they’ve had about a piece of material that they’ve discovered.
“I haven’t encountered that with any other digital service I’ve ever worked on before. It’s really quite powerful.”
Starting small: the Project 251 experiment
It all started with a small experiment called Project 251.
“It was an exploratory project to scan and digitise 251 newspapers. I don’t mean newspaper titles. I mean 251 issues of newspapers, 251 individual newspapers,” Emerson says with a laugh. “The idea was if we did that, maybe it would mean people wouldn’t be hammering the microfilms down in the reading rooms as much.”
“This was around about 2001,” he says. “I started here in mid-2006, but I’ve worked with a lot of those staff who were there at the beginning. It was quite a different period in library history. There was a lot more room for research-based projects.”
“At the end of Project 251, an email went around the library soliciting names for the 251 images. Someone suggested Papers Past.”
The new service now had a name.
Learning from success
Trying to resolve a business need (protect microfilm) had identified a user need (reading digitised newspapers online). So increasing the uptake of the digitised papers was a win-win. How could the Library make the digitised papers easier to work with? The next step would be a game-changer.
“Up to that point it had essentially just been browsable PDFs,” Emerson says. “It was just a digital surrogate for microfilm.
“A second-generation version of Papers Past was created round about the time I started at the library.”
This used OCR (Optical Character Recognition) to extract text from the digitised newspaper images and built a searchable index of these newspapers.
“People could type in a search query. It finds the match in the search index and takes a user to the location of that search term in every newspaper where it occurs.”
“This had the effect that we hoped for,” Emerson says. “It sped up the research process by several orders of magnitude.”
“It’s been a resounding success and we would like to do a heap more of it, which is good because there is a heap more to do. We’ve done probably around about six percent of all the newspapers that have been produced in New Zealand, so there is a huge amount more information.”
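For readers curious about the mechanics, the kind of search index Emerson describes (OCR text goes in, and a query leads to every place the term occurs) can be sketched as a simple inverted index. This is only an illustration under stated assumptions, not the Papers Past implementation: the page identifiers, the sample text and the function names build_index and search are all made up for the example.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercased word to the (page_id, word_position) pairs where it occurs."""
    index = defaultdict(list)
    for page_id, text in pages.items():
        # Split the OCR text into words; position counts words from zero.
        for position, word in enumerate(re.findall(r"\w+", text.lower())):
            index[word].append((page_id, position))
    return index


def search(index, term):
    """Return every recorded location of a single search term (case-insensitive)."""
    return index.get(term.lower(), [])


# Hypothetical OCR output for two newspaper pages (invented for illustration).
pages = {
    "evening-post-1890-p1": "Heavy rain fell in Wellington overnight",
    "otago-daily-times-1895-p3": "Rain delayed the harvest near Dunedin",
}

index = build_index(pages)
hits = search(index, "Rain")
for page_id, position in hits:
    print(f"{page_id}: word {position}")
```

Because the index is built once, ahead of time, a query is a single dictionary lookup rather than a scan of every page, which is the general reason indexed search is so much faster than reading issues one by one.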
Opening up a whole new world for researchers
For professional researchers, the searchable digital archive was a game-changer.
“I started off using the microfilm,” says historian Paul Meredith, currently Pou Hautū at Victoria University. “That was hard yakka. You know, you start with issue one, page one and just work on through.
“When Papers Past came along it opened up a whole new world.”
“I remember discussing when the term ‘mātauranga Māori’ was first being used. I went to Papers Past and there was the answer. It just brought up the information so fast.”
The demands of responding to demand
It was quickly apparent that there was more demand than the service had the infrastructure to supply. The new search feature had pushed usage up twentyfold and analysis of the site’s performance showed that response times were suffering. The team had to adapt to a change of their own making.
“When we realised how much uptake we’d had, we knew we needed to build a lot more scalability into the platform. We built a lot of performance into the backend and focused on that and nothing else but that for about a year and a half.”
The results show that sometimes the best way to serve your users is to refine existing services, rather than always focussing on new features.
Getting the papers people want online
Soon Emerson was closely involved with one of the programme’s thorniest issues: copyright.
“We’d had permission from Fairfax after many years of negotiation to release the Evening Post in one great monolithic chunk from about 1860 to 1945.”
The team were able to apply what they’d learnt in these negotiations when working with the rights holder for the next most requested title, the Otago Daily Times.
“I have to say, in general, even now, copyright is still the most pernicious issue around Papers Past and providing people with access to things. There isn’t a hard date for copyright cutoff,” Emerson says. “It’s different on an article-to-article level.
“For the Otago Daily Times, we put the up-to-1900 stuff online. People started going, ‘Wicked. Where’s the 1930 stuff?’ We have to keep bringing it forward and you need to keep explaining to people what the issues are around the later dates. You need to work within the frameworks of New Zealand law.”
Extending coverage while negotiating constraints
The next phase of work built on what they’d already learnt from giving people access to digitised newspapers and applied it to new types of material like magazines, journals, diaries, letters and parliamentary papers.
Some of this material was available on separate sites using outdated technology. Bringing them together would help users find a wider variety of digitised papers and would reduce maintenance costs.
Emerson took on this work after two frustrating false starts triggered by the Global Financial Crisis.
“We initially kicked it off in early 2011. After the GFC they decided, ‘Well, this is maybe an optional thing and maybe it’s stuff that we shouldn’t do now,’ so after kicking off and doing the project initiation phase, it was shelved.
“We picked it up again in 2013 and a similar thing happened. We started again in late 2014, early 2015, and this time it managed to keep going.”
The challenge was to deliver the best possible service with the available time and money.
“Internally, we all have these aspirations and intentions and goals for the service that we all really love and a lot of us feel very personally wedded to, because it’s quite a vocational thing. Some of us have been working on Papers Past for 16 years.
“We had to balance that with the level of investment that was actually available to us. There were some genuine, strong tensions around that.”
Keeping internal stakeholders happy
With other teams depending on Papers Past to achieve their targets, these constraints affected internal stakeholders as well as the end users.
“We’ve had to make sure their stuff could be delivered and met their expectations. That’s been a real juggling act.”
Emerson says that transparency and communication are the best way to resolve these kinds of tensions.
“You talk to people, and you communicate with people, and let them know what you can do and what you can’t. You be honest and you just try and do your honest best.”
The digital archive goes mobile: reacting to evolving user needs
While the project was delayed, the team identified changes in the way people were using the site.
“Over that time you saw the rise in use of smartphones and tablets. By the time the project was coming online, around 2016, about 40 to 50 percent of the traffic to the site was from people on mobile devices or tablets.
“Being able to provide a mobile experience was important, not just from that immediate mobile user interface aspect, but probably more importantly from Google’s perspective,” says Emerson. “Google started marking sites much, much lower in results if they didn’t have a mobile-friendly template.”
“Three-quarters of our users use Google to come in,” he says. “Suddenly they can’t find results in the way that they used to because we weren’t providing a mobile template. Developing the mobile view was really significant, not just for the usability, but for access and discovery.”
Responding to feedback and feature requests
One way the Papers Past team learn and adapt is by listening to user feedback such as feature requests.
“In the near future, we’re looking at delivering the most requested feature Papers Past has ever had from users. We’ve had thousands of requests over the years from people who would like to be able to correct OCR errors that they see in Papers Past.
“Newspaper typography back in the 19th century is not great. There’s ink blots everywhere, the fonts are damaged, and there’s a 99.9 percent chance that a computer hasn’t recognised every letter in an article. It’s almost guaranteed.”
“We’ve got a correctable back end prototype going live fairly soon. It doesn’t have a front end on it yet. We’re kicking off the investigation into what that front end needs to look like and how we explore and that kind of stuff.”
Emerson says this kind of feature can significantly change the relationship between the audience and the product.
“They have ownership of that material. There’s something of themselves and their identity surfaced through the content, and they’re fixing it. They form a relationship with that piece of content. Because they’re doing that, they form a relationship with the service.
“It’s that two-way relationship, the more valuable relationship.”
Increasing the impact of NZ Inc.
By speeding up research, Papers Past speeds up the generation of new knowledge.
It’s a benefit that extends beyond obvious user groups like historians and genealogists. Because the records of Papers Past describe many different aspects of the world at the time the writers were working, the information that can now be easily found is useful to researchers in many disciplines.
“You have climatologists, you have geologists, you have hydrologists,” says Emerson.
For these and many other researchers, understanding the past and tracking trends helps us plan for the future.
As aware as he is of the big-picture benefits, having wrestled with challenges like the GFC, mergers, complex copyrights and the evolving intricacies of Google’s algorithms, Emerson still loves hearing from individual users like this one:
“Thank you to everyone involved in this project – speaking as a professional software developer and amateur genealogist – this feels like great, high-quality software and it’s an absolute joy to use.”