
BLOG@CACM

Research Data Sustainability and Access


The dramatic growth of research data, the collaborative nature of international research, expectations for economic returns, and disciplinary differences all make data sustainability and access a pressing and difficult problem.

User Comments (3)

Interestingly, this should be more of a problem for the life and physical sciences than for computer science. In the latter, the presence of public benchmarks usually allows us to experiment on common datasets; in the case of individual synthetic data, all we need to preserve are the scripts for data generation.

The key factor is necessarily going to be incentivizing researchers to make their data publicly available in an appropriately accessible format. When deciding to fund research proposals, why not take into account the investigator's past record of public data releases, compared with the number of research papers published? On the plus side, this should encourage people to put more effort into releasing their data, in exchange for future funding. On the negative side, it would take a long time to validate its effectiveness.

"before" webcitation.org ? Yes, all of those anecdotal examples were a real concern. Now? Meaning, now that webcitation.org exists? Now I am not so scared.

...Now it is the job of the (maintainers of the) content on the server, to worry about whether migration to a new underlying medium (or format) is "advisable", and to do the necessary copying, "if any" -- (if appropriate); so IT'S NOT MY WORRY! (any longer). (Thank goodness!)

But IMHO that also means that, the need for an item (blog post) like this, is less. Sure, it is still needed for old material, from more than a couple of years ago. But, not for new material; and even for OLD content, it only has to be put on the web ONE TIME, and then it can be "archived" at webcitation.org , and ("bingo") it can now be thought of as being OK -- "similar" to, recent content! Case closed.

Thanks for the comments to date.

Regarding computer science versus the life and physical sciences, I think it depends on the subdiscipline. People working in data-intensive areas (e.g., search, databases, information visualization) have different needs and concerns than those in other, less data-intensive areas.

Regarding media migration, it is somewhat less of an issue than it once was, but it still matters. The National Archives, for example, struggles with this issue daily, as do other organizations that archive historical or longitudinal data for long periods. A second form of transition still plagues us as well: migration across logical data formats and tools.

Dan Reed






Copyright © 2012 by the ACM. All rights reserved.