
Status: early work in progress

Kristina's comments in pink

Julian's comments in blue

This Research Object pattern catalogue is intended to provide concrete solutions, or recipes, for researchers to use the RO Model to capture information about their Research Objects and workflows. It is intended to provide a bridge between user-level requirements and the technical structures of the RO model.

The information provided here should be sufficient for researchers with a basic knowledge of RDF, but without a detailed understanding of the RO model, ORE, AO or other ontologies, to create RO structures for describing identified information requirements.

Pique would like to know how to annotate

(tick)  marks issues that are solved with ro-manager commands (see the annotation mapping)

(minus)  marks issues that are not yet solved

  • (tick)  author and co-authors of a workflow (folder representing all that is related to a Wf) Author field for the workflow in Taverna
  • (tick)  research institution, country, scientific domain Description field for the workflow in Taverna
  • (tick)  who has performed the execution of a workflow (prov_runs) leading to the results provided
  • (tick)  who (person or webpage, survey, data release) has provided required datasets needed for the proper execution of a Wf Description field for the workflow in Taverna
  • (tick)  the computing execution environment of the RO (OS, distribution, version, Taverna 2.4, AstroTaverna plugins 0.9, etc.) If related to the workflow: description field for the workflow in Taverna
  • (tick)  special access requirements to web services Description field for the web service in Taverna
  • (tick)  "similar/alternative" annotations to provide alternates/mirrors to ws in case of decay
  • (tick)  add web pages to the used/referenced bibliography
  • (tick)  how much time it takes to run a Wf with the full data and with the subsample. This annotation is related to both the Wf and the data sample (full data or subsample data), so I annotate the dataset folder inside the workflow folder. Description field for the workflow in Taverna
  • (tick)  the number of elements of the sample where one Wf and/or RO iterates
  • (tick)  the actual size of the RO and/or a folder
  • (minus)  who is doing the actual annotation - NOTE: current limitation of ro-manager
  • (tick)  previous workflows executed before this one and next one to be executed I describe this in a README file for the RO
  • (tick)  reference to re-used workflow and its author Description field for the workflow in Taverna
  • (tick)  version of a workflow Refer to workflow version on myExperiment
  • (warning)  relationships encoded in the tree-folder structure
    • (tick)  a document is a produced publication in bibliography
    • (tick)  a document is a referenced/used publication in bibliography
    • (tick)  all files in /config/files are configuration files
    • (tick)   all small files in /config/soft represent needed software dependencies
    • (tick)  all small files in /config/ws represent web services used
    • (tick)  all files in subfolders of /data/all/user_input/ are user-provided required files not produced in other Wfs
    • (tick)  all files in /data/all/user_input/wfname1 are user-provided required files for wfname1 not produced in other Wfs
    • (tick)  all files in subfolders of /data/all/user_input/common are user-provided required files shared by several Wfs, and not produced in other Wfs
    • (tick)   all files in subfolders of /data/all/outcome/ are files produced in Wfs
    • (tick)  all files in subfolders of /data/all/outcome/wfname1 are files produced by workflow wfname1
    • (tick)  all files in subfolders of /data/all/results are actual final results of the RO
    • (tick)  all files in /data/subset are sample data used to check reproducibility and repeatability of Wfs
    • (tick)  all files in /process/bin/ are binary files imported in Taverna Wfs
    • (tick)  all files in /process/scripts/ are scripts inserted into Taverna Wfs
    • (tick)  all t2flow files inside /workflows/main are the actual workflows used in the RO
    • (tick)  all t2flow files inside /workflows/nested are small workflows inserted into one or several main workflows
    • (tick)   all files and symlinks inside /workflows/main/wfname1 are related to wfname1
    • (tick)  all files in subfolders of /workflows/main/*/income/ are input files
    • (tick)   all files in subfolders of /workflows/main/*/income/wfname1 are output files of wfname0
    • (tick)  all files in subfolders of /workflows/main/*/prov_runs are provenance executions of a workflow
    • (tick)  some of the files are plots, others are tabular data
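
The folder conventions listed above could be applied mechanically when generating annotations. The following is only a minimal sketch in Python, assuming paths are given relative to the RO root. The role labels are paraphrases of the list entries, and the function name role_of and the example file names are hypothetical. Entries that need more than the path are left out (document roles in the bibliography, which workflow a file belongs to, plots versus tabular data).

```python
from fnmatch import fnmatchcase

# (glob pattern, role) pairs paraphrased from the folder conventions above.
# Order matters: the first matching pattern wins, so specific ones come first.
# Note that "*" in fnmatch also matches "/" and therefore spans subfolders.
FOLDER_ROLES = [
    ("config/files/*", "configuration file"),
    ("config/soft/*", "software dependency"),
    ("config/ws/*", "web service used"),
    ("data/all/user_input/common/*", "user-provided input shared by several workflows"),
    ("data/all/user_input/*", "user-provided input"),
    ("data/all/outcome/*", "file produced by a workflow"),
    ("data/all/results/*", "final result of the RO"),
    ("data/subset/*", "sample data for reproducibility checks"),
    ("process/bin/*", "binary imported into a Taverna workflow"),
    ("process/scripts/*", "script inserted into a Taverna workflow"),
    ("workflows/main/*/prov_runs/*", "provenance of a workflow execution"),
    ("workflows/main/*/income/*", "workflow input file"),
    ("workflows/nested/*.t2flow", "nested workflow"),
    ("workflows/main/*.t2flow", "main workflow"),
]


def role_of(path):
    """Return the role implied by the folder a file lives in, or None."""
    relative = path.lstrip("/")
    for pattern, role in FOLDER_ROLES:
        if fnmatchcase(relative, pattern):
            return role
    return None


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for example in ("/config/ws/service1.txt",
                    "/workflows/main/wfname1/wfname1.t2flow",
                    "/README"):
        print(example, "->", role_of(example))
```

Files outside the listed folders return None and would still need manual annotation.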

Additionally, 

  • (minus)  copy annotations made in the t2flow file to the Wf folder. Inside the RO, a Wf is not the t2flow file but the folder that contains everything related to it and needed to execute it. In my case, the workflow is the t2flow file, plus a HOWTO file

General comment to the tree folder structure-related annotations: you can solve most of these by looking here and by exporting the provenance after the run using a special component developed by Stian, see: Bio RO provenance export

Most of the files are not in the provenance in the astro-GE because they are intermediate files, although they are referenced from a VOTable (a data table that is in the provenance). I guess it has to be done manually for those files.

Raul: See this first draft for Annotation Mapping

Aggregating web services

A workflow RO may wish to include web services as aggregated resources, e.g. so that annotations can be added to indicate how or why the service is used.

A web service is identified by a URI. That URI can be aggregated into a Research Object directly as an external resource, or through an ORE proxy structure. @@check?
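
As a rough illustration of the two options, the sketch below builds the corresponding triples in N-Triples form using plain Python. The ORE terms (ore:aggregates, ore:Proxy, ore:proxyFor, ore:proxyIn) and ro:ResearchObject come from the published vocabularies. The helper names and every example.org URI are hypothetical placeholders, not the actual HyperLEDA details, which are still to be provided.

```python
ORE = "http://www.openarchives.org/ore/terms/"
RO = "http://purl.org/wf4ever/ro#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def ntriple(subject, predicate, obj):
    """Format one statement whose three terms are all URIs as N-Triples."""
    return "<%s> <%s> <%s> ." % (subject, predicate, obj)


def aggregate_service(ro_uri, service_uri, proxy_uri=None):
    """Return triples that aggregate a web service URI into an RO.

    Without proxy_uri the service is aggregated directly as an external
    resource. With proxy_uri an ORE proxy is described as well.
    """
    triples = [
        ntriple(ro_uri, RDF_TYPE, RO + "ResearchObject"),
        ntriple(ro_uri, ORE + "aggregates", service_uri),
    ]
    if proxy_uri is not None:
        triples += [
            ntriple(proxy_uri, RDF_TYPE, ORE + "Proxy"),
            ntriple(proxy_uri, ORE + "proxyFor", service_uri),
            ntriple(proxy_uri, ORE + "proxyIn", ro_uri),
        ]
    return triples


if __name__ == "__main__":
    # Hypothetical URIs, for illustration only.
    ro = "http://example.org/ro/my-workflow-ro/"
    service = "http://example.org/services/some-service"
    print("\n".join(aggregate_service(ro, service)))
    print()
    print("\n".join(aggregate_service(ro, service, ro + "proxies/service1")))
```

In the proxy variant, annotations about how the service is used in this particular RO could refer to the proxy rather than to the service URI itself.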

Scenario:

HyperLEDA workflow makes use of a web service (@@details)

RO Model structure:

Use the wf4ever workflow extensions to the RO model (@@provide details from HyperLEDA)

(@@ include variants with and without ORE proxies)

See also: http://wf4ever.github.com/ro/#wf4ever

Recording software dependencies

A workflow may depend on access to a particular software package, and that dependency should be recorded to help other researchers create the environment needed to re-run or re-use the workflow.

One way to record this information is to annotate the workflow and/or the RO itself with an annotation type that indicates a software dependency. For the purpose of this pattern, we assume the software has a web home page whose URI can be used in the annotation to guide users to information about the software.
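
One possible shape for such an annotation is sketched below in plain Python, again emitting N-Triples. It assumes dcterms:requires as the dependency property. That is a choice made for this sketch, not something the catalogue has settled on. The annotation wrapper uses ro:AggregatedAnnotation, ro:annotatesAggregatedResource and ao:body as we understand the RO model. The function name and all example.org URIs are hypothetical.

```python
DCTERMS = "http://purl.org/dc/terms/"
AO = "http://purl.org/ao/"
ORE = "http://www.openarchives.org/ore/terms/"
RO = "http://purl.org/wf4ever/ro#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def ntriple(subject, predicate, obj):
    """Format one statement whose three terms are all URIs as N-Triples."""
    return "<%s> <%s> <%s> ." % (subject, predicate, obj)


def software_dependency(ro_uri, target_uri, software_home,
                        annotation_uri, body_uri):
    """Describe that target_uri (a workflow or the RO) needs some software.

    Returns a dict with two lists of N-Triples lines:
      "body":       content of the annotation body document at body_uri
      "annotation": statements that attach the body to the target in the RO
    """
    body = [
        # Assumed property: dcterms:requires, pointing at the home page.
        ntriple(target_uri, DCTERMS + "requires", software_home),
    ]
    annotation = [
        ntriple(annotation_uri, RDF_TYPE, RO + "AggregatedAnnotation"),
        ntriple(annotation_uri, RO + "annotatesAggregatedResource", target_uri),
        ntriple(annotation_uri, AO + "body", body_uri),
        ntriple(ro_uri, ORE + "aggregates", annotation_uri),
    ]
    return {"body": body, "annotation": annotation}


if __name__ == "__main__":
    # Hypothetical URIs, for illustration only.
    ro = "http://example.org/ro/my-workflow-ro/"
    result = software_dependency(
        ro_uri=ro,
        target_uri=ro + "workflows/main/wfname1.t2flow",
        software_home="http://example.org/software/some-package",
        annotation_uri=ro + "annotations/dep1",
        body_uri=ro + "annotations/dep1-body.rdf",
    )
    print("\n".join(result["body"] + result["annotation"]))
```

Passing the RO's own URI as target_uri would record the dependency against the RO as a whole rather than a single workflow.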

Scenario

(@@example please)

RO Model structure

(@@details)
