
RO Vocabulary at a Glance

Research object (ro)

A research object is a bundle of resources, e.g., workflow templates, workflow runs, artifacts, etc. The figure below illustrates that a research object is described using a manifest. It also specifies that there is a specific class of research object named workflow research object, which refers to research objects that contain at least one workflow template.

The research object is described in the ro ontology under the namespace
http://purl.org/wf4ever/ro#.

This ontology extends the OAI-ORE ontology.

ro:ResearchObject

An ro:ResearchObject aggregates a number of resources. A resource can be a workflow, web service, document, data item, data set, workflow run, software, annotation or a research object. A research object must be described by an ro:Manifest.

identifier: http://purl.org/wf4ever/ro#ResearchObject

subclass of: ore:Aggregation

ro:Manifest

The ro:Manifest is used to describe an ro:ResearchObject. The manifest is the resource that lists everything the research object aggregates; it is typically called .ro/manifest.rdf, relative to the research object that this manifest ore:describes.

identifier: http://purl.org/wf4ever/ro#Manifest

subclass of: ore:ResourceMap

ro:Resource

An ro:Resource is an ore:AggregatedResource which ore:isAggregatedBy an ro:ResearchObject.

This specialisation requires that there exists an ore:Proxy which is ore:proxyFor this resource, and which is ore:proxyIn the same ro:ResearchObject the resource ore:isAggregatedBy. Any annotations on such a proxy will describe the ro:Resource within that particular ro:ResearchObject; in particular, dct:creator and dct:created on the proxy will specify who added the resource to the aggregation and at what time.
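The proxy requirement can be illustrated with a small check. This is a minimal sketch in Python, not part of the RO ontology or of any Wf4Ever tool: triples are modelled as plain tuples, and the names :ro, :workflow, :proxy1 and :someUser are hypothetical.

```python
# Triples are modelled as plain (subject, predicate, object) tuples.
# :ro, :workflow, :proxy1 and :someUser are hypothetical names.
triples = {
    (":ro", "rdf:type", "ro:ResearchObject"),
    (":ro", "ore:aggregates", ":workflow"),
    (":workflow", "rdf:type", "ro:Resource"),
    (":workflow", "ore:isAggregatedBy", ":ro"),
    (":proxy1", "rdf:type", "ore:Proxy"),
    (":proxy1", "ore:proxyFor", ":workflow"),
    (":proxy1", "ore:proxyIn", ":ro"),
    (":proxy1", "dct:creator", ":someUser"),
}


def subjects(graph, predicate, obj):
    """All subjects s such that (s, predicate, obj) is in the graph."""
    return {s for s, p, o in graph if p == predicate and o == obj}


def objects(graph, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def resources_missing_proxy(graph):
    """Return (resource, research object) pairs that lack a matching ore:Proxy."""
    missing = set()
    for resource in subjects(graph, "rdf:type", "ro:Resource"):
        for ro in objects(graph, resource, "ore:isAggregatedBy"):
            # The proxy must be ore:proxyFor the resource AND ore:proxyIn the same RO.
            proxies = subjects(graph, "ore:proxyFor", resource) & subjects(graph, "ore:proxyIn", ro)
            if not proxies:
                missing.add((resource, ro))
    return missing


print(resources_missing_proxy(triples))
```

Removing the :proxy1 statements from the set would make the function report (":workflow", ":ro") as lacking a proxy.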

Note that annotations (ro:AggregatedAnnotation) can be added to both the ro:Resource and the ore:Proxy - depending on whether the annotation is seen to be globally true (such as the provenance of how the resource was created) or locally true within the Research Object (such as the resource playing the role of a wf4ever:Dataset).

Not all resources aggregated by an ro:ResearchObject are ro:Resource instances, in particular ro:AggregatedAnnotation instances will also be aggregated, but will not be "true" RO resources (and thus don't need their own ore:Proxy).

Aggregated resources MAY also be organised in (potentially nested) ro:Folder instances to reflect a file-system like structure. Note that any such resources SHOULD also be aggregated in the "mother" ro:ResearchObject.

identifier: http://purl.org/wf4ever/ro#Resource

subclass of: ore:AggregatedResource

ro:Folder

ro:Folder is a special kind of ore:Aggregation where every ore:AggregatedResource must have an ro:FolderEntry proxy with a unique name within that folder. Such folders can be nested and (optionally) used to organize the resources of the research object into a file-like structure. All such resources should also be aggregated by the ro:ResearchObject.

TODO: Need a functional property to specify the top level folder structure of an ro:ResearchObject?

identifier: http://purl.org/wf4ever/ro#Folder

subclass of: ore:Aggregation, ro:Resource

ro:FolderEntry

An ro:FolderEntry is an ore:Proxy instance that associates a resource aggregated within an ro:Folder with an ro:entryName. This name is unique within a given folder.

identifier: http://purl.org/wf4ever/ro#FolderEntry

subclass of: ore:Proxy

ro:entryName

This functional property specifies the name of an ro:FolderEntry within an ro:Folder.

This name must be case-sensitively unique within the ro:Folder, similar to a filename in a directory.

identifier: http://purl.org/wf4ever/ro#entryName

domain: ro:FolderEntry

range: xsd:string
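
The case-sensitive uniqueness of ro:entryName within a folder can be checked along these lines. This is only a sketch in Python: the entry, folder and file names are hypothetical, and folder entries are simplified to (entry, folder, name) tuples.

```python
from collections import Counter

# Hypothetical folder entries as (ro:FolderEntry, ro:Folder, ro:entryName) tuples.
entries = [
    (":entry1", ":folder", "README.txt"),
    (":entry2", ":folder", "readme.txt"),    # differs only by case: allowed
    (":entry3", ":folder", "data.csv"),
    (":entry4", ":subfolder", "data.csv"),   # same name, different folder: allowed
]


def duplicate_entry_names(entries):
    """Return the (folder, name) pairs used by more than one entry, compared case-sensitively."""
    counts = Counter((folder, name) for _, folder, name in entries)
    return sorted(key for key, count in counts.items() if count > 1)


print(duplicate_entry_names(entries))
```

Adding a second entry named "data.csv" to :folder would make the function report (":folder", "data.csv") as a clash.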

Annotations (AO)

Any aggregated resource of a research object can be annotated using the other Wf4Ever ontologies described on this page, standard ontologies like Dublin Core Terms, and third-party ontologies which may be domain-specific.

Implementation-wise, in the ro:Manifest one should only make assertions about the RO aggregation and minimal metadata such as when it was created, while all other assertions, such as typing an ro:Resource to be a wfdesc:Workflow or providing a dct:description, are made in annotations.

Wf4Ever research objects attach such annotations using the Annotation Ontology (AO) - recording who made which annotation when, on which resources.

In practice, each annotation context (for instance :resource1 a wf4ever:Dataset; dc:title "My dataset" .) is done as a separate RDF Graph (either a named graph or a separate resource). An ro:AggregatedAnnotation is then created, which specifies this graph as its ao:body, links to any aggregated resources using ro:annotatesAggregatedResource, and is itself aggregated by the ro:ResearchObject and contained within the ro:Manifest.

dct:created and dct:creator on the ro:AggregatedAnnotation will specify who made that assertion and added it to the research object.

If the annotation body has been created independently of the research object, for instance annotations of services made within the definition of a Taverna workflow, then dct:creator and dct:created can be specified also for the resource given by ao:body.

The following are Wf4Ever-specific extensions to the AO ontology:

ro:SemanticAnnotation

An ro:SemanticAnnotation is a specialisation of ao:Annotation which requires that ao:body points to an RDF Graph.

This might be a Named Graph or a resource which can be resolved separately from the URI given by ao:body.

This graph SHOULD mention the resources identified by ao:annotatesResource from this annotation, preferably by using their URIs as subject or object of statements.

Note that this use of ao:body is distinct from ao:hasTopic, which also allows the association of an RDF Graph with an ao:Annotation, but which also implies that this graph is the "topic" (being a subproperty of bookmark:hasTopic) of the annotated resource. This class does not require this interpretation; it is enough that the annotation body mentions the annotated resource, for instance to give it a dc:title or to relate two annotated resources. Also note that the next version of the AO ontology (v2) might change this definition of ao:hasTopic, removing the need for this class.

identifier: http://purl.org/wf4ever/ro#SemanticAnnotation

subclass of: ao:Annotation

ro:AggregatedAnnotation

An annotation aggregated within an ro:ResearchObject.

Instances of this class are used to annotate resources aggregated within the aggregating research object, proxies of these resources, or the research object itself. In other words, if :ro is the ro:ResearchObject that this annotation ore:isAggregatedBy, then the annotation should have at least one ao:annotatesResource which is an ore:AggregatedResource that ore:isAggregatedBy :ro, or an ore:Proxy that is ore:proxyIn :ro, or :ro itself.

It is possible for the annotation to also annotate non-aggregated resources, but as above, at least one of the annotated resources needs to be part of the RO or the RO itself.
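
The "at least one annotated resource belongs to the RO" rule can be sketched as a predicate over a set of triples. This Python example is an illustration under assumptions, not an official validator: the names :ro, :workflow, :externalPaper and :ann1 to :ann3 are hypothetical.

```python
# Hypothetical example graph as (subject, predicate, object) tuples.
graph = {
    (":ro", "ore:aggregates", ":workflow"),
    (":workflow", "ore:isAggregatedBy", ":ro"),
    # :ann1 annotates an aggregated resource and a non-aggregated one: allowed.
    (":ann1", "ore:isAggregatedBy", ":ro"),
    (":ann1", "ao:annotatesResource", ":workflow"),
    (":ann1", "ao:annotatesResource", ":externalPaper"),
    # :ann2 annotates only a non-aggregated resource: not allowed.
    (":ann2", "ore:isAggregatedBy", ":ro"),
    (":ann2", "ao:annotatesResource", ":externalPaper"),
    # :ann3 annotates the research object itself: allowed.
    (":ann3", "ore:isAggregatedBy", ":ro"),
    (":ann3", "ao:annotatesResource", ":ro"),
}

ANNOTATES = ("ao:annotatesResource", "ro:annotatesAggregatedResource")


def is_valid_aggregated_annotation(graph, annotation, ro):
    """True if at least one annotated resource is aggregated by ro, a proxy in ro, or ro itself."""
    annotated = {o for s, p, o in graph if s == annotation and p in ANNOTATES}
    return any(
        target == ro
        or (target, "ore:isAggregatedBy", ro) in graph
        or (target, "ore:proxyIn", ro) in graph
        for target in annotated
    )


for ann in (":ann1", ":ann2", ":ann3"):
    print(ann, is_valid_aggregated_annotation(graph, ann, ":ro"))
```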

TODO: What about freestanding annotations before there are any ROs? Will they be forced to annotate the RO?

As a subclass of ro:SemanticAnnotation the ao:body must point to an RDF Graph which contains the actual annotation.

identifier: http://purl.org/wf4ever/ro#AggregatedAnnotation

subclass of: ro:SemanticAnnotation, ore:AggregatedResource

ro:annotatesAggregatedResource

ro:annotatesAggregatedResource is a subproperty of ao:annotatesResource, which specifies that an ao:Annotation annotates an aggregated ro:Resource.

When used on an ro:AggregatedAnnotation, both the domain and range of this property must ore:isAggregatedBy the same ro:ResearchObject.

TODO: Should also ro:ResearchObject and ore:Proxy be in the range of this property (to match restriction of ro:AggregatedAnnotation), or is this subproperty even needed?

identifier: http://purl.org/wf4ever/ro#annotatesAggregatedResource

domain: ao:Annotation

range: ro:Resource

subproperty of: ao:annotatesResource

Workflow definition (wfdesc)

Workflow descriptions can be made using the wfdesc ontology under the namespace
http://purl.org/wf4ever/wfdesc#

The wfdesc ontology describes an abstract workflow description structure, which on the top level is defined as a wfdesc:Workflow.

A wfdesc:Workflow contains several wfdesc:Process instances, associated using the wfdesc:hasSubProcess property. Each of these (and the workflow itself) wfdesc:hasInput and wfdesc:hasOutput some wfdesc:Parameter (wfdesc:Input or wfdesc:Output). A wfdesc:Artifact is associated with a wfdesc:Parameter using wfdesc:hasArtifact. The wfdesc:Workflow also wfdesc:hasDataLink several wfdesc:DataLink instances, which form the connections between parameters. Thus this ontology allows the description of a directed acyclic graph, or a dataflow.

This ontology is meant as an upper ontology for more specific workflow definitions, and as a way to express abstract workflows, which could either be hand-crafted by users ("ideal workflow template (wfdesc:process)") or extracted from workflow definitions of existing workflow systems, like Taverna's .t2flow and Scufl2 formats.

The wfprov ontology shows how to link these workflow descriptions to a provenance trace of a workflow execution.

wfdesc:Artifact

wfdesc:Artifact is used to provide information about a class of artifacts. For example, it can be used to specify the datatype of a dataset or the structure of a document. A wfdesc:Artifact is associated with a wfdesc:Parameter using wfdesc:hasArtifact. The distinction between a parameter and an artifact is that the parameter can be customized to describe the particular role the artifact plays with regard to the process (and can be linked using wfdesc:DataLink) - while the wfdesc:Artifact can describe the syntactic and semantic datatype.

identifier: http://purl.org/wf4ever/wfdesc#Artifact

wfdesc:DataLink

wfdesc:DataLink is used to represent data dependencies between wfdesc:Process instances. It means that the artifact generated at a wfdesc:Output (identified using wfdesc:hasSource) will be used by a wfdesc:Input (identified using wfdesc:hasSink). The wfdesc:Process instances that own the wfdesc:Parameter instances which are the source and sink of a wfdesc:DataLink must be wfdesc:hasSubProcess of the same wfdesc:Workflow which wfdesc:hasDataLink the data link, or the parameters must be parameters of that same workflow. Thus links can only be made within a wfdesc:Workflow - although ports owned by the workflow itself appear both inside and outside the workflow (in opposite roles).

identifier: http://purl.org/wf4ever/wfdesc#DataLink
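
The containment rule for data links (both ends must belong to a sub-process of the workflow that owns the link, or to that workflow itself) can be sketched as follows. This is an illustrative Python check over tuples; :wf, :procA, :procB, :otherProc and the parameter and link names are hypothetical.

```python
# Hypothetical workflow description as (subject, predicate, object) tuples.
graph = {
    (":wf", "wfdesc:hasSubProcess", ":procA"),
    (":wf", "wfdesc:hasSubProcess", ":procB"),
    (":procA", "wfdesc:hasOutput", ":out"),
    (":procB", "wfdesc:hasInput", ":in"),
    (":otherProc", "wfdesc:hasInput", ":foreignIn"),  # :otherProc is not part of :wf
    (":wf", "wfdesc:hasDataLink", ":link1"),
    (":link1", "wfdesc:hasSource", ":out"),
    (":link1", "wfdesc:hasSink", ":in"),
    (":wf", "wfdesc:hasDataLink", ":link2"),
    (":link2", "wfdesc:hasSource", ":out"),
    (":link2", "wfdesc:hasSink", ":foreignIn"),
}


def owners(graph, parameter):
    """Processes that declare the parameter via wfdesc:hasInput or wfdesc:hasOutput."""
    return {s for s, p, o in graph
            if o == parameter and p in ("wfdesc:hasInput", "wfdesc:hasOutput")}


def invalid_data_links(graph, workflow):
    """Data links of the workflow with an end owned by neither the workflow nor its sub-processes."""
    allowed = {workflow} | {o for s, p, o in graph
                            if s == workflow and p == "wfdesc:hasSubProcess"}
    links = {o for s, p, o in graph if s == workflow and p == "wfdesc:hasDataLink"}
    bad = set()
    for link in links:
        ends = {o for s, p, o in graph
                if s == link and p in ("wfdesc:hasSource", "wfdesc:hasSink")}
        if any(not (owners(graph, end) & allowed) for end in ends):
            bad.add(link)
    return bad


print(invalid_data_links(graph, ":wf"))
```

Here :link2 is reported because its sink belongs to a process outside :wf.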

wfdesc:Input

wfdesc:Input represents an input parameter to a wfdesc:Process. This can be compared to a function parameter, command line argument, files read, or parameter set by a user interface. It is out of scope of wfdesc to define the nature or classification of the parameter, such as giving it a name, position or data type. This can be done with subclasses and/or subproperties.

identifier: http://purl.org/wf4ever/wfdesc#Input
subclass of: wfdesc:Parameter

wfdesc:Output

wfdesc:Output represents an output parameter from a wfdesc:Process. This can be compared to function return values, stdout/stderr, files written, or results shown in a user interface. It is out of scope of wfdesc to define the nature or classification of the parameter, such as giving it a name, position or data type. This can be done with subclasses and/or subproperties.

identifier: http://purl.org/wf4ever/wfdesc#Output
subclass of: wfdesc:Parameter

wfdesc:Parameter

This class represents a parameter of a wfdesc:Process. A wfdesc:Parameter must be a wfdesc:Input, a wfdesc:Output, or both. A parameter is both a wfdesc:Input and a wfdesc:Output when it is used on both sides of a subworkflow - see wfdesc:Workflow and wfdesc:DataLink for details.

identifier: http://purl.org/wf4ever/wfdesc#Parameter

wfdesc:Process

A wfdesc:Process is used to describe a class of actions that when enacted give rise to processes. A process can have 0 or more wfdesc:Parameter instances associated using wfdesc:hasInput and wfdesc:hasOutput, signifying what kind of parameters the process will require and return. It is out of scope for wfdesc to classify or specify the nature of the process; this should be done by subclassing and additional subproperties, for instance ex:perlScript or ex:restServiceURI.

identifier: http://purl.org/wf4ever/wfdesc#Process

wfdesc:Workflow

A wfdesc:Workflow is a directed graph in which the nodes are wfdesc:Process instances and the edges (wfdesc:DataLink instances) represent data dependencies between the constituent wfdesc:Process instances.

identifier: http://purl.org/wf4ever/wfdesc#Workflow

subclass of: wfdesc:Process

A wfdesc:Workflow defines its associated wfdesc:Process instances using wfdesc:hasSubProcess. A specialisation of this property is wfdesc:hasSubWorkflow, signifying that the process is a wfdesc:Workflow itself, which is further described in a similar fashion. As a subclass of wfdesc:Process, a wfdesc:Workflow can also define wfdesc:hasInput/wfdesc:hasOutput parameters - these would be inputs taken at workflow execution time, and final outputs of the workflow. (Note: Not all dataflow systems have this concept of workflow parameters.)

wfdesc:Parameter instances are linked using wfdesc:DataLink instances associated with the wfdesc:Workflow using wfdesc:hasDataLink. A wfdesc:Parameter defined with wfdesc:hasInput on a wfdesc:Workflow is considered a wfdesc:Input "outside" the wfdesc:Workflow (i.e. if it is a subworkflow), but a wfdesc:Output "inside" the wfdesc:Workflow (where it can be connected to a wfdesc:Input of a wfdesc:Process). Thus such parameters can be linked "through" the wfdesc:Workflow without having a "mirrored" port inside.

Example

In this example :param1 is the output of :procA. :param1 is the source in a datalink that goes to the input :param4 of the :innerWorkflow. :param4 is however also the source of an inner datalink, going to input :param6 of the nested :procB. Thus :param4 is both a wfdesc:Input and a wfdesc:Output (which is why these two classes are not disjoint).
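
The example can be written out as triples, with parameter types inferred from the declared ranges of wfdesc:hasInput, wfdesc:hasOutput, wfdesc:hasSource and wfdesc:hasSink. This Python sketch is an illustration only: :outerWorkflow, :link1 and :link2 are hypothetical names that do not appear in the description above, and the inference is a hand-written stand-in for what an RDFS/OWL reasoner would do.

```python
# The example as (subject, predicate, object) tuples.
# :outerWorkflow, :link1 and :link2 are hypothetical names added for this sketch.
graph = {
    (":outerWorkflow", "wfdesc:hasSubProcess", ":procA"),
    (":outerWorkflow", "wfdesc:hasSubWorkflow", ":innerWorkflow"),
    (":procA", "wfdesc:hasOutput", ":param1"),
    (":innerWorkflow", "wfdesc:hasInput", ":param4"),
    (":innerWorkflow", "wfdesc:hasSubProcess", ":procB"),
    (":procB", "wfdesc:hasInput", ":param6"),
    (":outerWorkflow", "wfdesc:hasDataLink", ":link1"),
    (":link1", "wfdesc:hasSource", ":param1"),
    (":link1", "wfdesc:hasSink", ":param4"),
    (":innerWorkflow", "wfdesc:hasDataLink", ":link2"),
    (":link2", "wfdesc:hasSource", ":param4"),
    (":link2", "wfdesc:hasSink", ":param6"),
}

# Ranges of the wfdesc properties as given on this page.
RANGES = {
    "wfdesc:hasInput": "wfdesc:Input",
    "wfdesc:hasSink": "wfdesc:Input",
    "wfdesc:hasOutput": "wfdesc:Output",
    "wfdesc:hasSource": "wfdesc:Output",
}


def inferred_types(graph):
    """Map each parameter to the set of classes implied by the property ranges."""
    types = {}
    for s, p, o in graph:
        if p in RANGES:
            types.setdefault(o, set()).add(RANGES[p])
    return types


types = inferred_types(graph)
for param in (":param1", ":param4", ":param6"):
    print(param, sorted(types[param]))
```

Only :param4 ends up with both classes, because it is the sink of the outer link and the source of the inner one.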

wfdesc:WorkflowInstance

A wfdesc:WorkflowInstance is a specialisation of a wfdesc:Workflow which defines all data/parameters/settings that are required to perform a wfprov:WorkflowRun without having executed it.

identifier: http://purl.org/wf4ever/wfdesc#WorkflowInstance

subclass of: wfdesc:Workflow

wfdesc:hasArtifact

This property associates a wfdesc:Parameter with a wfdesc:Artifact which can describe the artifact which would be used/generated on execution of the workflow.

identifier: http://purl.org/wf4ever/wfdesc#hasArtifact
domain: wfdesc:Parameter
range: wfdesc:Artifact

wfdesc:hasDataLink

This object property is used to specify the data links of a given wfdesc:Workflow.

identifier: http://purl.org/wf4ever/wfdesc#hasDataLink
domain: wfdesc:Workflow
range: wfdesc:DataLink

wfdesc:hasInput

This object property is used to specify the input parameter of a given wfdesc:Process.  

identifier: http://purl.org/wf4ever/wfdesc#hasInput
domain: wfdesc:Process
range: wfdesc:Input

wfdesc:hasOutput

This object property is used to specify the output parameter of a given wfdesc:Process. 

identifier: http://purl.org/wf4ever/wfdesc#hasOutput

domain: wfdesc:Process

range: wfdesc:Output

wfdesc:hasSubProcess

This object property is used to specify that the given workflow contains the given process as part of its definition. Although not a requirement, such sub processes should have wfdesc:DataLink instances within the containing workflow connecting their parameters with parameters of the containing workflow, or with parameters of other contained wfdesc:Process instances. A specialisation of sub process is wfdesc:hasSubWorkflow, where the sub process is a nested wfdesc:Workflow.

identifier: http://purl.org/wf4ever/wfdesc#hasSubProcess
domain: wfdesc:Workflow
range: wfdesc:Process

wfdesc:hasSink

This property is used to specify the wfdesc:Input parameter that acts as a sink from a given wfdesc:DataLink, consuming data from the link.

identifier: http://purl.org/wf4ever/wfdesc#hasSink
domain: wfdesc:DataLink
range: wfdesc:Input

wfdesc:hasSource

This property is used to specify the wfdesc:Output parameter that acts as a source to a given wfdesc:DataLink, providing data into the link.

identifier: http://purl.org/wf4ever/wfdesc#hasSource
domain: wfdesc:DataLink
range: wfdesc:Output

wfdesc:hasSubWorkflow

This object property is used to associate a wfdesc:Workflow to another wfdesc:Workflow, specifying that the given workflow has the given sub-workflow as a contained process.  

identifier: http://purl.org/wf4ever/wfdesc#hasSubWorkflow
domain: wfdesc:Workflow
range: wfdesc:Workflow
subproperty of: wfdesc:hasSubProcess

Workflow provenance (wfprov)

Provenance of workflow execution is described using the wfprov ontology.

wfprov:Artifact

An artifact is a data value or item which wfprov:wasOutputFrom a wfprov:ProcessRun, or which the process run used as input (wfprov:usedInput). Such an artifact might also be an ro:Resource if it has been aggregated in the ro:ResearchObject (typically if the artifact was used or generated by a wfprov:WorkflowRun) - but this might not always be the case for intermediate values from a wfprov:ProcessRun.

identifier: http://purl.org/wf4ever/wfprov#Artifact

wfprov:ProcessRun

A process run is a particular execution of a wfdesc:Process description (wfprov:describedByProcess), which can wfprov:usedInput some wfprov:Artifact instances, and produce new artifacts (wfprov:wasOutputFrom). A wfprov:WorkflowRun is a specialisation of this class.

identifier: http://purl.org/wf4ever/wfprov#ProcessRun

wfprov:WorkflowEngine

A workflow engine is a foaf:Agent that is responsible for enacting a workflow definition (which could be described in a wfdesc:Workflow). The result of workflow enactment gives rise to a wfprov:WorkflowRun.

identifier: http://purl.org/wf4ever/wfprov#WorkflowEngine
subclass of: foaf:Agent

wfprov:WorkflowRun

A workflow run is a wfprov:ProcessRun which has been enacted by a wfprov:WorkflowEngine, according to a workflow definition (which could be described by a wfdesc:Workflow, linked using wfprov:describedByWorkflow). Such a process run typically contains several subprocess runs (wfprov:wasPartOfWorkflowRun) corresponding to wfdesc:Process descriptions.

identifier: http://purl.org/wf4ever/wfprov#WorkflowRun

subclass of: wfprov:ProcessRun

wfprov:describedByParameter

This object property is used to associate a wfprov:Artifact with the wfdesc:Parameter it is an instance of.

identifier: http://purl.org/wf4ever/wfprov#describedByParameter
domain: wfprov:Artifact

range: wfdesc:Parameter

wfprov:describedByProcess

This object property associates a wfprov:ProcessRun with its wfdesc:Process description.

identifier: http://purl.org/wf4ever/wfprov#describedByProcess
domain: wfprov:ProcessRun

range: wfdesc:Process

wfprov:describedByWorkflow

This property associates a wfprov:WorkflowRun to its corresponding wfdesc:Workflow description.

identifier: http://purl.org/wf4ever/wfprov#describedByWorkflow
domain: wfprov:WorkflowRun

range: wfdesc:Workflow

subproperty of: wfprov:describedByProcess

wfprov:usedInput

This property specifies that a wfprov:ProcessRun used a wfprov:Artifact as an input.

identifier: http://purl.org/wf4ever/wfprov#usedInput
domain: wfprov:ProcessRun

range: wfprov:Artifact

wfprov:wasEnactedBy

wfprov:wasEnactedBy associates a wfprov:ProcessRun with a wfprov:WorkflowEngine, specifying that the execution of the process was enacted by the engine.

identifier: http://purl.org/wf4ever/wfprov#wasEnactedBy
domain: wfprov:ProcessRun

range: wfprov:WorkflowEngine

wfprov:wasOutputFrom

This property specifies that a wfprov:Artifact was generated as an output from a wfprov:ProcessRun.

identifier: http://purl.org/wf4ever/wfprov#wasOutputFrom
domain: wfprov:Artifact

range: wfprov:ProcessRun

wfprov:wasPartOfWorkflowRun

This property specifies that a wfprov:ProcessRun was executed as part of a wfprov:WorkflowRun. This typically corresponds to wfdesc:hasSubProcess in the workflow description.

identifier: http://purl.org/wf4ever/wfprov#wasPartOfWorkflowRun
domain: wfprov:ProcessRun

range: wfprov:WorkflowRun
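
The typical correspondence between wfprov:wasPartOfWorkflowRun in a trace and wfdesc:hasSubProcess in the description can be sketched as a consistency check. This Python example is a hedged illustration: the correspondence is only "typical", so a False result is a hint rather than an error, and all ":"-prefixed names are hypothetical.

```python
# Hypothetical description and provenance trace as (subject, predicate, object) tuples.
graph = {
    (":wf", "wfdesc:hasSubProcess", ":procA"),
    (":wfRun", "wfprov:describedByWorkflow", ":wf"),
    (":runA", "wfprov:describedByProcess", ":procA"),
    (":runA", "wfprov:wasPartOfWorkflowRun", ":wfRun"),
    # :procX is not declared as a sub-process of :wf.
    (":runX", "wfprov:describedByProcess", ":procX"),
    (":runX", "wfprov:wasPartOfWorkflowRun", ":wfRun"),
}


def first_object(graph, subject, predicate):
    """Return one object of (subject, predicate, ?), or None if there is none."""
    return next((o for s, p, o in graph if s == subject and p == predicate), None)


def runs_matching_description(graph):
    """For each process run, report whether its process is a sub-process of the run's workflow."""
    result = {}
    for run, predicate, workflow_run in graph:
        if predicate != "wfprov:wasPartOfWorkflowRun":
            continue
        process = first_object(graph, run, "wfprov:describedByProcess")
        workflow = first_object(graph, workflow_run, "wfprov:describedByWorkflow")
        result[run] = (workflow, "wfdesc:hasSubProcess", process) in graph
    return result


print(sorted(runs_matching_description(graph).items()))
```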

Wf4Ever-specific extensions (wf4ever)

The wf4ever ontology specifies extensions that we feel are quite specific to Wf4Ever, and probably less suitable for reuse than the previous ontologies. This ontology imports all the other ontologies described on this page, and so can also be used to get a complete picture of all the Wf4Ever ontologies. The namespace is
http://purl.org/wf4ever/wf4ever#

wf4ever:WebServiceProcessTemplate

A wf4ever:WebServiceProcessTemplate is a wfdesc:Process, the enactment of which gives rise to a web service call.

identifier: http://purl.org/wf4ever/wf4ever#WebServiceProcessTemplate
subclass of: wfdesc:Process

TODO: Give details of which webservice endpoint, port and operation. What about REST services, etc? Local tools?

wf4ever:WorkflowResearchObject

A workflow research object is a research object that contains at least one wfdesc:Workflow. Note that asserting that a resource is a wfdesc:Workflow or that an RO is a wf4ever:WorkflowResearchObject is an annotation which normally goes in an annotation body instead of in the manifest - thus we can see who or what claimed an RO was a WorkflowResearchObject.

identifier: http://purl.org/wf4ever/wf4ever#WorkflowResearchObject
subclass of: ro:ResearchObject

Errors in wf4ever.owl


Version 0.1 of wf4ever.owl wrongly claims that wf4ever:File and friends are subclasses of ro:Artifact. These should have been subclasses of wfprov:Artifact. This error has been fixed in the next version of the ontology.

wf4ever:File

A subclass of wfprov:Artifact to signify a file stored on a file system.

identifier: http://purl.org/wf4ever/wf4ever#File
subclass of: wfprov:Artifact

wf4ever:Image

A subclass of wfprov:Artifact to signify a visual image.

identifier: http://purl.org/wf4ever/wf4ever#Image
subclass of: wfprov:Artifact

wf4ever:Document

A subclass of wfprov:Artifact to signify a document.

identifier: http://purl.org/wf4ever/wf4ever#Document
subclass of: wfprov:Artifact

wf4ever:Dataset

A subclass of wfprov:Artifact to signify a dataset.

identifier: http://purl.org/wf4ever/wf4ever#Dataset
subclass of: wfprov:Artifact

TODO: Why these four? 

Example: Annotations and manifests

Here is an example research object containing a single workflow a_workflow.t2flow and an annotation which has been automatically added.

.ro/manifest

Here we can see that Stian made a research object at 15:01:10, and at 15:02:10 he added <a_workflow.t2flow> to the research object. Note that in this case we don't know when <a_workflow.t2flow> itself was created or by whom. The research object also aggregates an annotation :ann1; this annotation was created by "t2flow workflow annotation extractor" a few seconds after Stian added <a_workflow.t2flow> - and indeed this is the aggregated resource which has been annotated. It also annotates the research object itself. We have been told that the content of that annotation, <.ro/ann1>, was created by Marco more than a year earlier, which in this case makes sense because it is an annotation which the extractor has found inside the t2flow.

Resolving the annotation body gives:

.ro/ann1

Inside the annotation body, which we have been told was done by Marco, we find an abstract version of the .t2flow workflow (using the wfdesc ontology) which has been extracted by the agent. This reveals some workflow annotations done by Marco in 2010; title and description on the workflow, and descriptions for the input and output ports.

Example: myExperiment Pack as a Research Object

As a proof of concept, we show how a myExperiment pack can be specified using research object vocabulary. We used the myExperiment Graphing and Citing Time Series Data pack, as it contains a wfdesc:Workflow, and is, therefore, useful for illustrating what a workflow research object looks like. This pack includes a workflow and its results. The workflow is used to extract data from the Times Series Data Library and plot it using Google charts.

The RO ontology contains individuals that show which resources are aggregated within the pack, and how their relationships can be encoded using AO annotations. The figure below illustrates the structure of the research object, which is represented by the node ro-217. This node aggregates 7 resources representing the wfdesc:Workflow, workflow-2464, the workflow run obtained by enacting that wfdesc:Workflow, workflow-run1, the artifacts used as inputs to the workflow run, external-1, external-2 and external-3, and the artifacts generated as a result of the enactment, file-576 and file-577. Additionally, the research object ro-217 aggregates 6 annotations that are used to encode the relationships between aggregated resources. Specifically, annotation-1, annotation-2 and annotation-3 are used to specify that workflow-run1 used the artifacts external-1, external-2 and external-3, and annotation-4 and annotation-5 are used to specify that the artifacts file-576 and file-577 were generated by workflow-run1. Finally, annotation-6 is used to specify that workflow-run1 has workflow-2464 as its wfdesc:Workflow.
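
The structure just described can be sketched in a few lines of Python. This is an illustration under assumptions, not the actual content of the ontology individuals: each annotation body is reduced to a single triple, and the choice of wfprov:usedInput, wfprov:wasOutputFrom and wfprov:describedByWorkflow as the predicates is an assumption based on the property descriptions on this page.

```python
RO = "ro-217"

# The 7 aggregated resources named in the text.
resources = [
    "workflow-2464", "workflow-run1",
    "external-1", "external-2", "external-3",
    "file-576", "file-577",
]

# The 6 annotations, each reduced to one (subject, predicate, object) statement.
# The predicates are assumed mappings of "used", "generated by" and "has as workflow".
annotation_bodies = {
    "annotation-1": ("workflow-run1", "wfprov:usedInput", "external-1"),
    "annotation-2": ("workflow-run1", "wfprov:usedInput", "external-2"),
    "annotation-3": ("workflow-run1", "wfprov:usedInput", "external-3"),
    "annotation-4": ("file-576", "wfprov:wasOutputFrom", "workflow-run1"),
    "annotation-5": ("file-577", "wfprov:wasOutputFrom", "workflow-run1"),
    "annotation-6": ("workflow-run1", "wfprov:describedByWorkflow", "workflow-2464"),
}

# The manifest aggregates both the resources and the annotations.
manifest = {(RO, "ore:aggregates", item) for item in resources + list(annotation_bodies)}


def inputs_of(run):
    """Artifacts that the given run used as input, according to the annotation bodies."""
    return sorted(o for s, p, o in annotation_bodies.values()
                  if s == run and p == "wfprov:usedInput")


def outputs_of(run):
    """Artifacts that were output from the given run, according to the annotation bodies."""
    return sorted(s for s, p, o in annotation_bodies.values()
                  if o == run and p == "wfprov:wasOutputFrom")


print(len(manifest))
print(inputs_of("workflow-run1"))
print(outputs_of("workflow-run1"))
```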

OWL versions

The RO ontologies can be found on GitHub in OWL/Turtle format:

Other formats will be produced when these are published at their official namespace URIs.

If you are viewing these in Protege, it is recommended that you:

  • Download all OWL files to the same folder (simplest is using git clone git://github.com/wf4ever/ro.git - this will also include catalog-v001.xml which avoids the below)
  • If Protege says it can't find http://purl.org/wf4ever/wfdesc# - select *Yes* to choose the corresponding file locally, select wf.owl - and similar for the other ontologies.
  • In the menu, select View -> Show only the active ontology to avoid showing all of foaf, ao, etc.
  • In the menu, select View -> Custom rendering.. and select Render by qualified name. This will show the prefixes, like opm:Agent and wfdesc:Workflow.

Note on namespaces: