Carl Malamud is on a crusade to liberate information locked up behind paywalls. He has spent decades publishing copyrighted legal documents, from building codes to court records, and then arguing that such texts represent public-domain law that ought to be available to any citizen online. Now, the 60-year-old American technologist is turning his sights on a new objective: freeing paywalled scientific literature.
Over the past year, Malamud has—without asking publishers—teamed up with Indian researchers to build a gigantic store of text and images extracted from 73 million journal articles dating from 1847 up to the present day. The cache, which is still being created, will be kept on a 576-terabyte storage facility at Jawaharlal Nehru University (JNU) in New Delhi.
No one will be allowed to read or download work from the repository, because that would breach publishers' copyright. Instead, Malamud envisages, researchers could crawl over its text and data with computer software, scanning through the world's scientific literature to pull out insights without actually reading the text.
The unprecedented project could, for the first time, open up vast swathes of the paywalled literature for easy computerized analysis.
But the depot's legal status isn't yet clear.