This is intended to be a live document that will be updated as work progresses.

Roadmap

M31: Jun 2013

  • Deliverable D4.2v2 checklist material draft (Done)
  • Deliverable D4.2v2 provenance material draft (Done)
  • Material for other deliverables as requested (Done)
  • Update RO pack 3107 for in-use submission (Done)
  • Checklist creator UI design (Done)
  • Checklist creator tool requirements (Done)
  • Checklist creator tool review (Done)
  • ISWC in-use submission rebuttal? (Done)

M32: Jul 2013

Summary: probably slightly ahead on Checklist-LD, and some technical debt has been addressed in the process, but RO Manager technical debt assessment has not been started. Additional (unplanned) effort this month is being applied to the Timbus tutorial demo, which is helping to drive forward some other RO handling issues.

  • Deliverables (D4.2v2, etc.) submitted (Done)
  • Tooling to create checklists for RO evaluation - prototype implemented (Done)
  • Checklist-LD outline design (checklist evaluation for generic linked data) (Done)
  • Checklist-LD test suite (Done)
  • Prototype checklist-LD system implemented (Done)

M33: Aug 2013

  • Checklist-LD: implemented, documented and deployed (Done)
  • Checklist-LD: reimplement chembox evaluation using this (Done)
  • Outline plan for checklist repository, including any technical enhancements required (Done: see below)
  • Outline plan for checklist performance improvements (Deferred, may be OBE: see below)
  • RO manager technical debt breakdown and estimation (Done: see below; prioritization still to do, probably ongoing)

M34: Sep 2013

Note: short month

  • Draft evaluation plan
  • Checklist "landing page" in researchobjects.org - discuss with Susana; link to all materials
  • Checklist evaluation performance improvements (De-prioritized)
  • Checklist repository? (brought forward) (De-prioritized) (Started - https://github.com/wf4ever/ro-catalogue/tree/master/minim)
  • Plan for new checklist creator? (De-prioritized)

M35: Oct 2013

  • D4.3 draft
  • Required checklist documentation identified and mostly drafted
  • Prepare evaluation materials?
    • RO and RO API from a consumer's perspective?
    • Checklist applicability to range of targets/purposes?
    • Performance discussion
  • Finalize LISC slides
  • ISWC attendance; also trip to Melbourne U
  • Technical debt? (De-prioritized)
  • Checklist repository? (De-prioritized)

M36: Nov 2013

  • Evaluation deliverables submitted
  • Web page finalized
  • Checklist documentation finalized

Progress and notes

D4.2v2 (question)

  • Initial content drafted for provenance (Done)
    • awaiting some material from Esteban. (Carried forward)
    • awaiting sequence diagram for Taverna provenance export from Stian (Done)
    • awaiting paragraph about myExperiment provenance display from Don (Done)
  • Initial content drafted for checklist evaluation (Done)
  • Esteban to provide material for provenance
  • Stian to provide sequence diagram for Taverna provenance export (Done)
  • Esteban assembling final document (Started)
    • GK to assist as required
  • GK to review assembled document (Done)
  • Quality assurance reviews (Done)
  • Updates from QA review (n/a)
  • Deliverable submitted

RO pack 3107 (question)

This is one of the KEGG workflow packs that could not be analyzed successfully in the original decay detection evaluation, because its provenance structure was not quite as expected. This has been fixed, and it should now be possible to include this workflow in our results.

  • Update RO pack 3107 for in-use submission
    • Reorganize provenance information (Done)
    • Regenerate RO (blocked on https://jira.man.poznan.pl/jira/browse/WFE-1118) (Done)
    • Blocked on adding annotations: (a) the changed API for RODL breaks the conversion script, with no easy generic fix visible, and (b) the provenance file is zero length, so it is not valid RDF. (Fixed)
      • Piotr helped me get rid of the old ROP identifiers that were causing problems.
    • Add annotations as needed for result (Done)
    • Evaluate, add back into results corpus (Done)

See:

Tooling to create checklists for RO evaluation (question)

Create an initial prototype tool for creating RDF/Minim descriptions of checklists via an annotated spreadsheet. This activity is time-boxed, and aims to explore possible directions rather than to create a final, polished tool.

Checklist-LD (question)

(Service to create "overlay RO" on linked data, should work with existing checklist service)

  • Initial design document for the checklist tool extended to deal with Linked Data (3 days?) (Done: https://github.com/wf4ever/ro-manager/blob/develop/src/roverlay/roverlay.md)
  • First prototype of Checklist-LD (Checklist for Linked Data) (Done)
    • Allow treating some Linked Data resources as ROs/RO-light, do not implement big changes to the existing RO Manager
    • Set up of the basic prototype infrastructure
    • Have some test suites/data
    • Have the new component that enables the wrapping-up of a Linked Data resource as an RO/RO-lite
    • Apply the existing checklist web service to evaluate the Linked Data resources, using the above new component together
    • (This does not address the performance issue - see below)
  • Command line tool for invoking Checklist-LD service (Done)
  • User documentation (Done in first-draft form - https://github.com/wf4ever/ro-manager/blob/master/src/roverlay/roverlay.md)
  • Evaluation using checklist over chembox data: try to use DBPedia sources directly. (Done. Actually used data from Matt's server, but the principle here is the same.)

See also:

  • http://austese.net/lorestore/docs.html - this provides a similar kind of service for creating ORE resource maps and annotations, with separate service endpoints for each. It claims to be REST, but actually uses predefined URI formats for its various operations, so doesn't really follow HATEOAS principles.

Performance evaluation plan

  • Performance model: identify execution elements
  • For each element, identify benchmark techniques
  • For each element, identify RO test data requirements
  • Assemble (or create) RO test data
  • Perform benchmarks
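
The benchmarking steps above could be driven by a small timing harness along the following lines. This is only a sketch under stated assumptions: the two "execution elements" are hypothetical stand-ins (a synthetic data load and a synthetic query), not the real checklist service components, and the in-memory tuples stand in for RO test data that has yet to be assembled.

```python
import statistics
import time


def benchmark(element_name, func, repeats=5):
    """Time one execution element over several runs and summarize the results."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return {
        "element": element_name,
        "runs": repeats,
        "min_s": min(timings),
        "median_s": statistics.median(timings),
        "max_s": max(timings),
    }


# Hypothetical stand-ins for real execution elements (assumptions, not project code).
def load_ro_metadata():
    """Pretend to load RO metadata: build 1000 synthetic triples."""
    return [("s%d" % i, "p", "o") for i in range(1000)]


def evaluate_queries():
    """Pretend to evaluate a checklist query over the loaded metadata."""
    data = load_ro_metadata()
    return sum(1 for (s, _p, _o) in data if s.endswith("7"))


elements = {
    "load RO metadata": load_ro_metadata,
    "evaluate queries": evaluate_queries,
}
results = [benchmark(name, fn) for name, fn in elements.items()]
for row in results:
    print("%(element)s: median %(median_s).6fs over %(runs)d runs" % row)
```

Reporting the median alongside minimum and maximum is one way to keep a single slow run (for example, a cold network fetch) from dominating the figures.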

Checklist landing page at researchobjects.org

Discuss with Susana; link to all materials

Checklist documentation

  • Identify what we have
  • Create new documentation as required

Required:

  • Checklist performance evaluation description

Prepare D4.3 evaluation draft

  • Get template from Jose
  • Identify needed material
  • Review plan with Jun
  • Collect and/or write content

Wrap up final deliverables

  • D4.3 evaluation
  • Checklist/Minim evaluation
  • RO Manager evaluation
  • Tidy up software documentation
  • Help with journal pub on final Wf4Ever system? (requested by project reviewers.)

Additional lower priority activities

Checklist repository

Simple infrastructure to publish checklists (and checklist items?) for wider use.

Connect with Minim creation tooling and Checklist-LD? Per discussions with Susana, this might lead to a new form of "tick the boxes" checklist creator that assembles a custom checklist from selected items. This new checklist creation tool does not currently appear in the roadmap.

Outline plan:

  • Assemble some initial examples (Done - see https://github.com/wf4ever/ro-catalogue/tree/master/minim)
  • Identify useful metadata to associate with a checklist/Minim file (e.g. to aid discovery and selection; also consider that subsequent tooling may want to access individual checklist items within a checklist)
    • Title
    • Keywords
    • RO vocabularies assumed (other than core RO)
    • More?
  • Choose publication platform and technical format
    • Assume some kind of Google-indexable flat file text format rather than any kind of database
    • Jun suggested use of the ResearchObjects.org web site
  • Review metadata/design choices (e.g. with Susana, Jun)
  • Update examples and tooling accordingly
  • Publish updated examples using chosen mechanism

Propose for initial deployment:

  • metadata consists of title, keywords and vocabulary URIs. Can be applied to checklist and/or checklist requirement.
  • metadata is incorporated into Minim via spreadsheet and mkminim tool
  • simply publish the RDF files in a specified directory
  • create a utility to scan the Minim files and generate an HTML and/or RDF cross-referencing index
  • create a page on ResearchObjects.org with links to the repository pages in github, and manage the details in github for now

TODO:

  • define metadata structure in spreadsheet and minim
  • update mkminim to convert metadata from spreadsheet
  • write utility to read minims and generate index

THEN, maybe:

  • write a web service to read minims and generate a web form that can select by keyword and then allow picking of desired items to create new checklist.

RO Manager

Also, the TIMBUS demo work has raised another issue that may need to be addressed: should there be explicit support for nested ROs in the RO model, such that metadata from internal ROs can be exposed via the main RO (e.g. for checklist processing)? If agreed as desirable, this might call for some refactoring of the RO Manager and/or checklist evaluation code. My sense is that this may be noted as desirable, but not addressed within the current project, other than possibly in an RO model documentation update.

Checklist evaluation performance-enhancing cache

(Extending Checklist-LD service with graph caching)

Is this needed? Experiments with the Overlay RO service have provided an improvement of nearly two orders of magnitude in performance for the particular evaluation that gave rise to this requirement. While there could be other examples that would benefit from graph caching, it's not clear that this development is actually justified. Accordingly, I propose to de-prioritize it for the time being.

  • Try to prove that graph caching will help (e.g. use fixed ROSRS instance in ROWeb?)
  • Prototype graph-caching service
  • Refactor ROSRS to use cache if available/configured
  • Modify ROWebservice to use RO Graph cache
  • Harden graph caching service
  • Deploy updated ROWeb service to sandbox
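
For reference, the graph caching idea in the steps above amounts to something like the following minimal sketch. It is not the design of any existing service: the class name, the injected fetch function and the example URI are assumptions made for illustration, and a real cache would also need an expiry or invalidation policy tied to RO updates.

```python
class GraphCache:
    """Keep parsed graphs in memory, keyed by the URI they were read from."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable: uri -> parsed graph (assumed interface)
        self._graphs = {}
        self.hits = 0
        self.misses = 0

    def get(self, uri):
        """Return the cached graph for uri, fetching it on first use only."""
        if uri in self._graphs:
            self.hits += 1
        else:
            self.misses += 1
            self._graphs[uri] = self._fetch(uri)
        return self._graphs[uri]

    def invalidate(self, uri):
        """Drop a cached graph, e.g. after the underlying resource changes."""
        self._graphs.pop(uri, None)


# Hypothetical fetch function standing in for a real HTTP GET plus RDF parse.
fetch_log = []


def fake_fetch(uri):
    fetch_log.append(uri)
    return frozenset({(uri, "rdf:type", "ex:Resource")})


cache = GraphCache(fake_fetch)
uri = "http://example.org/ro/annotations.rdf"
first = cache.get(uri)
second = cache.get(uri)   # served from the cache, no second fetch
print(cache.hits, cache.misses, len(fetch_log))
```

Counting hits and misses gives a cheap way to test whether caching would help for a given evaluation before any hardening work is done.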