Research Objects Digital Library
- 1 Introduction
- 2 Implementation
- 3 REST interface specifications
- 4 Wf4Ever models implementation considerations
- 4.1 RO model
- 4.2 RO evolution
- 5 Portal
- 6 Mapping between RO structure and internal dLibra data model
- 6.1 General structure
- 6.2 Attributes
- 7 Software libraries used
- 8 Deployment instructions
Introduction
This document contains a specification of the functionalities offered by the Research Object Digital Library. RODL realises the backbone services of a software architecture for the preservation of research objects.
The main interface to RODL is a set of RESTful APIs, the two primary ones being the RO API and the RO Evolution API. These APIs interoperate through the RO and related models, the data structures that encode the concepts and relationships of information. The RO API defines the formats and links used to create and maintain ROs in the digital library. It is aligned with the RO model and therefore recognizes concepts such as aggregations, annotations and folders. The RO model ontology is used to specify relations between different resources. RODL supports content negotiation for metadata, including formats such as RDF/XML, Turtle and TriG. The RO Evolution API defines the formats and links used to change the lifecycle stage of an RO, most importantly to create an immutable snapshot or archive from a mutable live RO, and to retrieve the evolution provenance of an RO. This API follows the RO evolution model. Additionally, RODL provides a SPARQL endpoint, a Notification API, a Solr REST API and a custom User Management API.
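As an illustration of the content negotiation mentioned above, the following is a minimal sketch of how a client might ask for metadata in a particular serialization by setting the HTTP Accept header. The class name, the helper method and the example URI are hypothetical and not part of RODL. The media type used for TriG is an assumption and may differ from what a given RODL release expects. The request is only built, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.Map;

/** Sketch: building a metadata request that asks for a specific RDF serialization. */
final class MetadataRequestSketch {

    // Media types for the formats named in the text. The TriG value is an
    // assumption; a given RODL release may expect a different one.
    static final Map<String, String> MEDIA_TYPES = Map.of(
            "RDF/XML", "application/rdf+xml",
            "Turtle", "text/turtle",
            "TriG", "application/trig");

    /** Builds (but does not send) a GET request with the matching Accept header. */
    static HttpRequest metadataRequest(String resourceUri, String format) {
        String mediaType = MEDIA_TYPES.get(format);
        if (mediaType == null) {
            throw new IllegalArgumentException("Unsupported format: " + format);
        }
        return HttpRequest.newBuilder(URI.create(resourceUri))
                .header("Accept", mediaType)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical research object URI, for illustration only.
        HttpRequest request = metadataRequest("http://example.org/rodl/ROs/my-ro/", "Turtle");
        System.out.println(request.method() + " " + request.uri());
        System.out.println("Accept: " + request.headers().firstValue("Accept").orElse(""));
    }
}
```

The server is then expected to pick the representation that matches the Accept header; the exact set of supported media types should be checked against the API specification.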
Implementation
RODL is built on top of dLibra. dLibra provides file storage and retrieval functionalities, including file versioning and consistency checking. It has a built-in text search engine, fed by its own flexible metadata system, and it manages users and controls their access rights. In addition, dLibra allows organising stored objects into hierarchical structures and associating metadata at the level of object aggregations.
A dedicated Semantic Metadata Service has been included in RODL to give support to the Research Object model. This service manages an RDF triplestore and allows storage and retrieval of any type of RO metadata, in particular structured semantic annotations, classes of resources and relations between them.
RODL offers preservation services for workflows. These services take into account the decay of workflows due to changes in the external resources on which they depend – data sources or web services can disappear, malfunction or change their interface. A number of services, such as research object completeness and stability evaluation services, have already been implemented.
RODL services expose their functionality by means of a REST API. The API is accessed by software clients, including applications that register users and manage their access rights, support browsing the contents of RODL, or connect it with other services. In particular, RODL is being used to extend the workflow preservation capabilities of myExperiment, where users can export their content to RODL as Research Objects. An interface for RODL in myExperiment is being developed so that users can navigate through their Research Objects preserved in RODL and take advantage of its functionalities.
REST interface specifications
http://www.wf4ever-project.org/wiki/display/docs/Wf4Ever+service+APIs
Wf4Ever models implementation considerations
RO model
Version 5 of RODL implements the Research Object Vocabulary Specification v0.1. It implements all the concepts related to resource aggregations and annotations. It offers no support for folders (ro:Folder).
RO evolution
Version 5 of RODL does not implement the RO evolution model.
Portal
Apart from RESTful services, the Research Object Digital Library includes a web application for exploring the content of the digital library and testing the most up-to-date implementations of the Wf4Ever models.
Mapping between RO structure and internal dLibra data model
General structure
Until RODL 5, research objects belonged to workspaces and had versions. Starting from RODL 5, shorter URIs are used. For backwards compatibility, workspaces and versions are still used internally: workspace "default" and version "v1" are assumed.
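The backwards-compatible mapping described above can be sketched as follows. This is only an illustration: the class, the `Coordinates` holder and the `resolve` method are hypothetical names, not taken from the RODL code base. The only values taken from the text are the workspace "default" and the version "v1".

```java
/** Sketch: mapping a short RODL 5 research object id to internal coordinates. */
final class InternalCoordinatesSketch {

    // Values stated in the text: assumed for every research object since RODL 5.
    static final String DEFAULT_WORKSPACE = "default";
    static final String DEFAULT_VERSION = "v1";

    /** Hypothetical holder for the three identifiers used internally. */
    static final class Coordinates {
        final String workspace;
        final String researchObject;
        final String version;

        Coordinates(String workspace, String researchObject, String version) {
            this.workspace = workspace;
            this.researchObject = researchObject;
            this.version = version;
        }

        @Override
        public String toString() {
            return workspace + "/" + researchObject + "/" + version;
        }
    }

    /** Fills in the implicit workspace and version for a short research object id. */
    static Coordinates resolve(String researchObjectId) {
        if (researchObjectId == null || researchObjectId.isEmpty()) {
            throw new IllegalArgumentException("Research object id must not be empty");
        }
        return new Coordinates(DEFAULT_WORKSPACE, researchObjectId, DEFAULT_VERSION);
    }

    public static void main(String[] args) {
        // "my-ro" is a made-up identifier.
        System.out.println(resolve("my-ro"));
    }
}
```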
Research Object | dLibra |
---|---|
Workspace | Group publication |
Research Object | Publication |
Version of Research Object | Publication with a single edition whose content is modified every |
Resource file | File |
Attributes
Whenever a manifest or annotation is created, it is scanned for statements about the research object itself. Such statements are added to dLibra as RO attributes, which are later used for indexing and search.
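The scanning step can be pictured with the sketch below. It does not use the actual RODL or Jena code: triples are modelled by a hypothetical `Triple` class holding plain strings, and `attributesOf` is an illustrative helper. It keeps only the statements whose subject is the research object itself and groups their values by predicate, which is roughly the shape needed to store them as attributes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: collecting statements about the research object itself as attributes. */
final class RoAttributeScanSketch {

    /** Hypothetical, simplified triple; a real implementation would use an RDF library. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Returns predicate -> values for all triples whose subject is the given RO URI. */
    static Map<String, List<String>> attributesOf(String roUri, List<Triple> triples) {
        Map<String, List<String>> attributes = new LinkedHashMap<>();
        for (Triple triple : triples) {
            if (triple.subject.equals(roUri)) {
                attributes.computeIfAbsent(triple.predicate, k -> new ArrayList<>())
                        .add(triple.object);
            }
        }
        return attributes;
    }

    public static void main(String[] args) {
        // Made-up URIs and values, for illustration only.
        String ro = "http://example.org/ROs/my-ro/";
        List<Triple> manifest = List.of(
                new Triple(ro, "http://purl.org/dc/terms/title", "My research object"),
                new Triple(ro + "workflow.t2flow", "http://purl.org/dc/terms/title", "A workflow"));
        System.out.println(attributesOf(ro, manifest));
    }
}
```

Statements about aggregated resources, such as the workflow file in the example, are skipped because their subject is not the research object.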
Software libraries used
The service is a servlet-based application built using the Jersey framework.
The Jena framework is used to handle RDF files. NG4J is used for handling named graphs.
The portal is built using Apache Wicket.
Deployment instructions
The source code is available at git://github.com/wf4ever/rosrs.git.
See instructions and additional information in the source code repository: https://github.com/wf4ever/rodl
The dLibra server location and the directory used for storing workspaces should be configured in the src\main\resources\connection.properties file. A dLibra instance used for demonstration purposes is available at host sandbox.wf4ever-project.org (port number 10051 and directory 3, as originally configured).
The project can then be built (mvn package) and deployed (rosrs5 servlet).
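To make the configuration step more concrete, here is a small sketch of reading such settings with java.util.Properties. The key names `host`, `port` and `workspacesDirectory` are hypothetical, since this page does not list the actual keys of connection.properties; check the file in the repository for the real ones. The values are those of the demonstration instance mentioned above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Sketch: reading dLibra connection settings from properties text. */
final class ConnectionPropertiesSketch {

    // Key names are hypothetical; the values are the demo instance from the text.
    static final String SAMPLE = String.join("\n",
            "host=sandbox.wf4ever-project.org",
            "port=10051",
            "workspacesDirectory=3");

    /** Parses properties text; in a deployment this would be read from the file instead. */
    static Properties parse(String text) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties;
    }

    public static void main(String[] args) {
        Properties properties = parse(SAMPLE);
        int port = Integer.parseInt(properties.getProperty("port"));
        System.out.println(properties.getProperty("host") + ":" + port
                + " directory " + properties.getProperty("workspacesDirectory"));
    }
}
```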