
Research Object. Definition and Properties

Research Objects (ROs) are semantically rich aggregations of resources [5] that bring together data, methods and people in scientific investigations. As described in [6], their goal is to create a class of artifacts that can encapsulate our digital knowledge and provide a mechanism for sharing and discovering assets of reusable research and scientific knowledge. In the context of Wf4Ever we focus on those Research Objects whose methods are implemented as scientific workflows. Hence a Workflow-Centric Research Object can be viewed as an aggregation of resources that bundles a workflow specification with additional auxiliary resources, including documents, input and output data, annotations, and provenance traces of past executions of the workflow.
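The aggregation described above can be pictured with a minimal Python sketch. It is an illustration only, not the Wf4Ever RO model: the class names, the `kind` labels and the `example.org` URIs are all hypothetical, chosen to mirror the kinds of resources listed in the paragraph.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical resource kinds, mirroring the resources named in the text.
KINDS = {"workflow", "document", "input_data", "output_data", "provenance_trace"}


@dataclass
class Resource:
    """One aggregated resource, identified by a URI (illustrative only)."""
    uri: str
    kind: str

    def __post_init__(self) -> None:
        if self.kind not in KINDS:
            raise ValueError(f"unknown resource kind: {self.kind}")


@dataclass
class ResearchObject:
    """A bundle of resources plus free-text annotations keyed by resource URI."""
    identifier: str
    resources: List[Resource] = field(default_factory=list)
    annotations: Dict[str, List[str]] = field(default_factory=dict)

    def aggregate(self, resource: Resource) -> Resource:
        """Add a resource to the aggregation and return it."""
        self.resources.append(resource)
        return resource

    def annotate(self, uri: str, note: str) -> None:
        """Attach a note to a resource that is already aggregated."""
        if all(r.uri != uri for r in self.resources):
            raise KeyError(f"resource not aggregated: {uri}")
        self.annotations.setdefault(uri, []).append(note)

    def is_workflow_centric(self) -> bool:
        """True when at least one aggregated resource is a workflow specification."""
        return any(r.kind == "workflow" for r in self.resources)


# Usage: bundle a workflow with its data and a provenance trace (made-up URIs).
ro = ResearchObject("http://example.org/ro/1")
wf = ro.aggregate(Resource("http://example.org/ro/1/workflow.t2flow", "workflow"))
ro.aggregate(Resource("http://example.org/ro/1/input.csv", "input_data"))
ro.aggregate(Resource("http://example.org/ro/1/run-1.prov", "provenance_trace"))
ro.annotate(wf.uri, "Main analysis workflow")
print(ro.is_workflow_centric(), len(ro.resources))
```

Keeping annotations keyed by URI rather than embedded in each resource is one possible design choice; it lets notes refer to any aggregated resource without changing the resource itself.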

As described in [7], and later extended in [8], Research Objects ideally possess the following properties:

  • Reusable. The Research Object as a whole can be used as part of new experiments. Hence, one experiment may call upon another, and by assembling methods in this way we can conduct research, and ask research questions, at a higher level.
  • Repeatable. The experiments contained in the Research Object can be repeated. The Research Object should therefore not only include enough information for the original researcher or others to repeat the experiment, but should also grant future access to the same versions of the services and data sources involved in the initial experiments. This property is useful for verifying the results or validating the experimental environment using exactly the initial set-up; it also enables scaling to the repeated processing demanded by data-intensive research.
  • Replayable. The Research Object contains a comprehensive record that enables us to go back and see what happened at a given time point or interval. The ability to replay (rather than repeat) the experiment, and to focus on crucial parts, is essential for human understanding of what happened in experiments that might occur in nanoseconds or years.
  • Repurposable. A Research Object's constituent parts can be used as pieces of a different Research Object. An experiment which is a black box is only reusable as a black box. By opening the lid we find parts, and combinations of parts, available for reuse. The way they are assembled is a clue to how they can be reassembled.
  • Reliable. A Research Object possesses methods and enough experiment information (especially provenance information) to verify and validate the experiment's outcomes, deriving a measure of trust in the results and methods that it contains. This property also implies that a Research Object is resilient "in the wild" and may be subject to regulatory review.
  • Referenceable. Research Objects are meant to replace (or complement/augment) traditional publications, and as such, they are referenceable and citable. This property also applies to individual parts of the Research Object and to its different versions as it evolves.
  • Re-interpretable. A Research Object (by analogy with Boundary Objects [9]) is both plastic enough to adapt to the local needs, terminologies, and constraints of the several parties employing it, yet robust enough to maintain a common identity and meaning across scientific domains. Therefore, unlike traditional papers that are aimed at one target audience, a given Research Object might be useful and meaningful in and across different research communities.
  • Respectful and Respectable. Research Objects contain enough information to give credit and attribution for the component parts and methods and their assembly, to trace the flow of intellectual property in the generation of results, and to respect data privacy, with an effective definition of the policies for reuse.
  • Retrievable. Research Objects provide discovery and recommendation mechanisms. Any user who might be interested in a Research Object should be able to become aware of its existence and, where appropriate, acquire it for later use.
  • Refreshable. Updates to the Research Object and to its associated resources propagate in an efficient and consistent fashion. The Research Object remains valid even when its parts (associated methods, data, contained metadata, etc.) evolve over time.
  • Recoverable and reparable. The methods and data associated with a Research Object take into account that some experiments might impose transactional requirements. When things go wrong, an automatic roll-back action might be needed to retrace the experiment's steps, which implies providing mechanisms for diagnosis and repair.

Research Objects can be seen as an alternative to the traditional paper-based dissemination of research results, intended to avoid:

  • Information overwhelm. As described in [1], the current approach to scientific dissemination encourages authors to write as many papers as possible to get more "tokens of credit"; whenever progress is made on a certain subject, a new paper is written, reviewed, and published. This modus operandi does not provide traceability of how work evolves, nor does it support or encourage the reuse or evolution of publications. As a consequence it invariably incurs an information overhead, first for reviewers but more importantly for the present and future target audience. The chasm between data production and data handling has become so wide that many data go unnoticed or at least run the risk of relative obscurity [2]. There is a high risk that valuable information goes unnoticed by researchers while they are forced to deal with a sheer volume of information.
  • Information burial. Electronic publications are still at an early stage of development; right now they can be considered just refurbished PDF versions of their paper counterparts (with the addition of some lexical tags in the best cases). To summarize [3], computers cannot deal with the information as produced in the electronic versions of classic articles, because it has the following undesirable properties:
    • Ambiguity. Homonyms are hard to distinguish without extensive and redundant contextual information. Furthermore, to keep journal articles readable, authors often use different words to denote the same meaning, and the majority of digital information is duplicated repeatedly, as there is no practical way of referencing factual statements in their originating publication.
    • Lack of structure. Textual data is not well structured beforehand; the connection between words is not immediately (and may not be at all) clear to a computer.
  • Data silos and information obfuscation. Current publications present scientific results, but the data used to obtain and back the depicted results are most of the time not available or accessible. As pointed out in [3], much data is either not published (especially in the case of negative studies), not freely available or not easy to find. Either way, the raw data is kept away from the research community in data silos; researchers are provided with just a partial human-readable version of it.

Dissemination Material

Presentation at: https://www.dropbox.com/home/Wf4Ever/Reviews/year%201/Slides?select=wf4ever_WP2.pptx
Paper at: SePublica2012 paper draft

References

[1] http://project.liquidpub.org/liquid-publications-scientific-publications-meet-the-web
[2] Mons B. and Velterop J., Nano-publication in the e-science era, in: Workshop on Semantic Web Applications in Scientific Discourse (SWASD 2009), Washington, DC, USA, 200
[3] http://laikaspoetnik.wordpress.com/2010/06/23/will-nano-publications-triplets-replace-the-classic-journal-articles/
[4] Mons, B. (2005) Which gene did you mean? BMC Bioinformatics, Vol 6, p 142, doi:10.1186/1471-2105-6-142, http://www.biomedcentral.com/1471-2105/6/142
[5] Bechhofer, S., De Roure, D., Gamble, M., Goble, C. and Buchan, I. (2010) Research Objects: Towards Exchange and Reuse of Digital Knowledge. In: The Future of the Web for Collaborative Science (FWCS 2010), April 2010, Raleigh, NC, USA.
[6] Bechhofer, S., Ainsworth, J., Bhagat, J., Buchan, I., Couch, P., Cruickshank, D., Delderfield, M., Dunlop, I., Gamble, M., Goble, C., Michaelides, D., Missier, P., Owen, S., Newman, D., De Roure, D. and Sufi, S. (2010) Why Linked Data is Not Enough for Scientists. In: Sixth IEEE e-Science conference (e-Science 2010), December 2010, Brisbane, Australia.
[7] http://blog.openwetware.org/eresearch/?p=56
[8] http://blogs.nature.com/eresearch/2010/11/27/replacing-the-paper-the-twelve-rs-of-the-e-research-record
[9] Star S. L., Griesemer J. R. (1989). "Institutional Ecology, 'Translations' and Boundary Objects: Amateurs and Professionals in Berkeley's Museum of Vertebrate Zoology, 1907-39". Social Studies of Science 19 (4): 387-420
[10] Gamma E., Helm R., Johnson R., and Vlissides J. (1995) Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, Reading, Massachusetts.
[11] Heath, T., Hepp, M., and Bizer, C. (eds.). Special Issue on Linked Data, International Journal on Semantic Web and Information Systems (IJSWIS). http://linkeddata.org/docs/ijswis-special-issue
[12] Giunchiglia F., Chenu-Abente R., Scientific Knowledge Objects V.1, January 2009, Technical Report # DISI-09-006
[13] Groth, P., Gibson, A., Velterop, J. (2010). The anatomy of a nanopublication. Information Services and Use, 30(1), 51-56.
[14] Consultative Committee for Space Data Systems, "Reference Model for an Open Archival Information System (OAIS)," Open Archives Initiative, Blue Book CCSDS 650.0-B-1, 2002.

Specialised descriptions

RO model for scientists

RO model for publishers

RO model for developers

Material for the European Commission

RO model for EC reports

Ongoing work

  • Current sprints related to the RO model:
  • Previous sprints:
  • Other material:
    • RO preservation services (work done by Gema on identifying how preservation is handled in Digital Libraries)
    • RO lifecycle and stereotypes (work done by Rafa and Oscar). Some names to be changed after the discussion on the 1st review 
    • RO evolution (work done by Raúl and Oscar). To be improved.

Related material

Related Vocabularies and Models
