“Open data science is not completely new to us in Brazil.”
The world of science may present itself as a merit-based republic open to all comers, yet it’s also a highly hierarchical discipline with its own aristocracy – and its own triple-locked treasure chests of knowledge.
There are no grander scientific names than the Royal Society, founded in 1662 by Great Britain’s King Charles II, and Nature, the publication founded in 1869, also in London.
So finding the editor-in-chief of Nature and a senior vice-president of the Royal Society on the same stage, talking about the controversial subject of open access to scientific data, promised a ringside seat at one of the hottest debates in science circles.
The question is this: if taxpayer-funded agencies are paying for scientists to develop their ideas, why should private enterprises act as gatekeepers, charging other scientists for the right to see the resulting data?
In late February 2013, the Royal Society’s Foreign Secretary Prof. Martyn Poliakoff and Philip Campbell, editor-in-chief of Nature, visited Brazil for a conference at the São Paulo Research Foundation, as part of their ongoing roadshow to raise awareness about Science as an open enterprise: Open data for open science.
This Royal Society report explores issues relating to the gathering, storage and dissemination of scientific data in the new global environment. And of course, it touches on the issue of who pays to make science, and who collects money in exchange for offering the curious the right to see what has been made.
“As science spreads across the globe, we believe open inquiry is becoming more and more important,” said Prof. Poliakoff. “The public and ‘citizen scientists’ have a right to look at data to see the science is being done properly, and today’s data could be very important in future,” he said. Poliakoff cited the sheer speed of data production as proof of the need to organise it and make it available. The first human genome mapping was completed a decade ago: already 100,000 sequences have been completed.
Nature’s Philip Campbell, who was also part of the Royal Society report’s working group, understandably took a more nuanced position, as the head of a global publishing group with 3 million online users and 72,000 subscribers to print publications.
Free universal access to data is a fine goal: but somebody has to pay for the costs of editing, organising and publishing it, Campbell explained. With a staff of 100 senior editors and 40 magazine editors winnowing through 11,000 submissions of scholarly articles to publish just 800 each year, Nature needs rewarding for its efforts, and for helping scientists to raise awareness of their work through subsequent citations.
But once the data has been through Nature’s hands and an exclusivity period has expired, where does it go? It’s estimated there have been over 19 million papers in the biomedical arena since 1975.
Enter the world of databases and meta-databases. Open data activity is good for many reasons, explained Campbell: it helps combat fraud, makes results easier to reproduce, helps citizen scientists, can address planetary challenges and helps improve trust in science.
Such databases should represent a public service, and they’re surprisingly inexpensive to operate. Campbell cited the example of the Worldwide Protein Data Bank, which curates, hosts and delivers information about complex molecules. It has just 69 employees and costs only US$12 million to run.
Nevertheless, governments have a quite different view of the way private publishing and database management enterprises act as profit-seeking gatekeepers for the data that has been generated thanks to state funding for researchers. Legislators in the US and UK have been pressing for much more open access to data.
They want to shorten the period during which databases keep their paywalls up, and they want to widen the range of data to include final published versions, and not just manuscripts and notes. Two separate concepts for liberating data are developing among policymakers and funding agencies, called “green” and “gold.”
The former covers final drafts and notes, which may take up to six months to become available, while “gold” implies completely free access from the moment of publication. A third, hybrid option involves the author paying the publisher for the right to have material universally and instantly available.
In Europe, the EU is imposing its open access rules in 2014. Britain is following the same timetable. In the US, the Office of Science and Technology Policy at the White House is gingerly surveying the scene to work out how best not to upset the commercial publishing industry, and is offering US$100 million to each science funding agency to improve public access.
Surprisingly perhaps, Brazil is doing rather well in the field of open data access. Because the country has little or no scientific publishing industry, it has no established commercial interests to upset. Strong state control of the scientific environment has put the government in a good position to gather research into publicly available databases.
In fact Brazil can claim to be a world leader in open access online libraries. Since 1997 Brazil has been running SCIELO, which is now ranked the number one 100% web-based scientific data repository in the world.
The system, set up in São Paulo in 1997 by FAPESP, has spawned a number of look-alikes, all of which hold prominent places in world rankings. Chile’s online system holds ninth place, Spain’s holds 10th, and Brazil’s own public health database holds 14th.
Amazingly, SCIELO gets 450 million hits a year, or over one million hits a day. This, said Prof. Carlos Henrique Brito Cruz, FAPESP’s scientific director, was partly due to the system’s linkage to the “Web of Science” under the management of Thomson Reuters.
So, whatever arguments are being pursued today between public policymakers and commercial publishers in Europe and the US: “Open data science is not completely new to us in Brazil,” said Prof. Brito Cruz.