From teleconference:

Authentication.

Jits mentioned token-based authentication (like Google Code?). For the initial prototype we may use one account for all users, but this will have to be revised.

    • PSNC: To address this issue in the first prototype, we propose to introduce the concept of workspaces in the RO SRS. We have made the necessary changes to the REST interface description on the wiki. We assume that:
      • Space in RO SRS is divided into workspaces
      • Each workspace has a single set of credentials (name and password) that allows full access to its contents
      • A DropBox account, at the level of the DropBox connector, should be associated with a workspace account in RO SRS. The simplest scenario is a separate workspace for each DropBox account, but other options are also possible, e.g. several DropBox accounts may be associated with the same workspace.
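
The workspace assumptions above can be sketched in Python as follows. This is only an illustration: the names `Workspace`, `WorkspaceRegistry` and `link_dropbox_account`, as well as the account and workspace identifiers, are hypothetical and are not taken from the RO SRS REST interface description.

```python
from dataclasses import dataclass, field


@dataclass
class Workspace:
    """One RO SRS workspace with a single set of credentials (hypothetical model)."""
    name: str
    password: str  # kept in plain text only to keep the sketch short
    research_objects: set = field(default_factory=set)


class WorkspaceRegistry:
    """Associates DropBox accounts with workspaces; several accounts may share one."""

    def __init__(self):
        self._workspaces = {}
        self._dropbox_to_workspace = {}

    def add_workspace(self, name, password):
        self._workspaces[name] = Workspace(name, password)

    def link_dropbox_account(self, dropbox_account, workspace_name):
        # Refuse to link an account to a workspace that does not exist.
        if workspace_name not in self._workspaces:
            raise KeyError(f"unknown workspace: {workspace_name}")
        self._dropbox_to_workspace[dropbox_account] = workspace_name

    def workspace_for(self, dropbox_account):
        return self._workspaces[self._dropbox_to_workspace[dropbox_account]]


# Two hypothetical DropBox accounts sharing one workspace.
registry = WorkspaceRegistry()
registry.add_workspace("astro-lab", "secret")
registry.link_dropbox_account("alice@example.org", "astro-lab")
registry.link_dropbox_account("bob@example.org", "astro-lab")
```

In the simplest scenario each DropBox account would be linked to its own workspace; the shared case shown here is the "other option" mentioned above.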

Versions. Still to be discussed: use query parameters instead of making the version explicit in the URI?

  • PSNC: We have some doubts about skipping the version element and allowing access to the "latest" version. In some usage scenarios the definition of "latest" may be problematic. The idea at this point is to provide a generic service for our initial experiments that can be extended later, once we have clarified the version topic (e.g. RO versions, individual component versions).
    • We propose to postpone this decision until after the first prototype implementation, when it can be tested by the users.
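
To make the two options concrete, here is a small Python sketch that builds both URI forms. The URI layout (`base/workspace/ro[/version]`) and the `version` parameter name are assumptions for illustration only; the actual layout is defined in the REST interface description on the wiki.

```python
from urllib.parse import quote, urlencode


def version_in_path(base, workspace, ro, version):
    """Option 1: the version is an explicit element of the URI path."""
    return "/".join([base.rstrip("/"), quote(workspace), quote(ro), quote(version)])


def version_in_query(base, workspace, ro, version=None):
    """Option 2: the version is an optional query parameter (hypothetical name)."""
    uri = "/".join([base.rstrip("/"), quote(workspace), quote(ro)])
    if version is not None:
        uri += "?" + urlencode({"version": version})
    return uri
```

With the query form, a request without the parameter leaves the choice of version to the service, which is exactly where the definition of "latest" becomes an issue.
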
Pique's feedback:
  • In astronomy, many digital-experiment scenarios involve high volumes of data, usually tabular data coming from databases, but also images, spectra, or any kind of multidimensional files stored in a binary format. That is one of the reasons why astronomers prefer an automated way of doing their "experiment": it is the computer that deals with such huge amounts of data, not them. In this context, the way RO SRS handles remote resources is especially relevant, because we expect these data collections to be spread across the internet in data providers' storage spaces (e.g. http://cas.sdss.org/CasJobs/Guide.aspx)

PSNC: Our proposal for the first prototype is described in the "Handling Remote Resources" section of the "Research Objects Store and Retrieve Service" document on the wiki. We propose to leave the discussion of advanced handling of remote resources to a later stage.

  • It may also happen that small/medium-size datasets are handled locally by the astronomer. In this case versioning the data is not especially efficient, since we may face quick exponential growth in the size of the RO (all versions included). I think the astronomer will need to be able to choose whether to have full RO versioning or not. If he decides not to version the RO, this Dropbox API is still very useful, since it allows accessing the ROs and storing them from any desktop.

PSNC: The versioning in the proposed service is not obligatory. Users can have just one version of the RO and change it all the time.

Jun's feedback:
  • *Research object nesting.* You said in document [1] that an RO represents a directory in a file system. Can a sub-directory also be represented by an RO? Can an RO aggregate not only files but also other ROs? I don't have specific use cases to back up this concept, but I think considering nesting might be useful. Do you only have one manifest.rdf file associated with the top directory? When do you decide that it is necessary to create a manifest.rdf file for each sub-directory?

PSNC: With the dLibra data model we can handle RO nesting, even with several levels of nesting. Extending the REST API should also not be difficult. At this stage we assume one manifest.rdf file per RO, which the RO SRS creates automatically in the top directory of the RO. We propose to leave a more detailed implementation and discussion for the next iteration.

  • *Research object versioning.* I think we need not only properties like currentVersion, but maybe also something like ex:latestVersion, ex:previousVersion. The latter is different from dct:hasVersion, which is used to list all different versions of an RO. ex:previousVersion will point to a previous version of an RO, which is handy for traversing different versions of ROs. On this note, would you mind adding some examples showing how dct:hasVersion will be used?

PSNC: This looks like an advanced feature of the versioning mechanism. We can implement it using the information about the base version, which can be given when a new version is created, but this information is optional. We propose to leave this for discussion at a later stage.
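
As an illustration of the traversal Jun describes, the following Python sketch walks a chain of base versions. The dictionary stands in for ex:previousVersion links derived from the optional base-version information; the version identifiers are hypothetical.

```python
def version_history(previous_version, start):
    """Follow previous-version links from `start` back to the first version.

    `previous_version` maps a version identifier to its base version.
    Versions created without a base version simply have no entry.
    """
    history = []
    seen = set()
    current = start
    # The `seen` set guards against accidental cycles in the links.
    while current is not None and current not in seen:
        history.append(current)
        seen.add(current)
        current = previous_version.get(current)
    return history


# Hypothetical RO with three versions; v1 was created without a base version.
previous_version = {"v3": "v2", "v2": "v1"}
```

Because the base version is optional, a chain may stop early; in that case only a property such as dct:hasVersion would still list every version of the RO.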

  • My last comment is about the URIs used for identifying resources in your example. As I said in the telecon, it seems that you are using a URI to identify a physical entity. In Linked Data, when you dereference a URI, you expect to get some description of that thing back. I am not sure whether this will be the case for your URIs, such as http://example.org/admiral-test/datasets/DatasetsSubDir/DatasetsSubDir/s1.txt. It seems that I might get the actual text file back, rather than some description of that text file, such as its creation, physical location etc. I am not too fussy about it at this stage for a prototype, but we ought to get it right in a future proper release.

PSNC: Originally, we assumed that a request to a URI identifying a file which is part of the RO should return the content of this file. But of course there is some metadata associated with a file which could be interesting to the user and could be returned as Linked Data. The selection between the content and the metadata cannot be done by content negotiation, because the content itself can also be a Linked Data file. Therefore we propose to add a query parameter allowing such a selection. The default form will now be the metadata. We have updated the specification of the REST API.
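
A minimal Python sketch of such a selection is shown below. The parameter name `content` and its `true` value are assumptions made for this example; the real name is given in the updated REST API specification.

```python
from urllib.parse import parse_qs, urlsplit


def requested_representation(uri):
    """Decide whether a request asks for a file's metadata or its content.

    Metadata is the default; the hypothetical `content=true` query
    parameter switches to the raw file content.
    """
    query = parse_qs(urlsplit(uri).query)
    wants_content = query.get("content", ["false"])[0].lower() == "true"
    return "content" if wants_content else "metadata"
```

A query parameter keeps the choice independent of the Accept header, so an RDF file inside the RO can still be requested either as its own content or as metadata about it.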

Khalid's feedback:
  • Regarding the structure of the research object, the manifest.rdf file contains structural metadata which is basically generated on the basis of the RO content. This raises the question of whether the structure of a research object is assumed to be a tree, as opposed to a graph: I am assuming that the structure will be derived by traversing the directories and sub-directories that compose the RO. This may be fine, but if it turns out that we need a graph-like structure, e.g. if two resources r1 and r2 within the research object refer to the same sub-resource r3, then symbolic links can be used for that purpose. I don't know, however, whether symbolic links are supported by DropBox.

PSNC: As far as we know, DropBox does not support symbolic links. Therefore we propose to leave the issue of graph-structured ROs to a stage when we will have a more advanced end-user client than the DropBox client. At this stage we assume that the RO structure will be tree-like.
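
The tree assumption can be illustrated with a short Python sketch that derives the list of aggregated resources by traversing the RO directory, as Khalid describes. The function name and the sample file names are hypothetical, and the real manifest.rdf is an RDF document rather than a list of paths.

```python
import tempfile
from pathlib import Path


def list_aggregated_resources(ro_root):
    """Return RO-relative paths of all files under ro_root, excluding the top-level manifest."""
    root = Path(ro_root)
    paths = []
    for path in root.rglob("*"):
        relative = path.relative_to(root).as_posix()
        if path.is_file() and relative != "manifest.rdf":
            paths.append(relative)
    return sorted(paths)


# Build a small throwaway RO directory and list its contents.
with tempfile.TemporaryDirectory() as tmp:
    ro = Path(tmp)
    (ro / "datasets" / "sub").mkdir(parents=True)
    (ro / "manifest.rdf").write_text("")
    (ro / "datasets" / "d1.csv").write_text("a,b\n")
    (ro / "datasets" / "sub" / "s1.txt").write_text("hello\n")
    resources = list_aggregated_resources(ro)
```

Each file is reached through exactly one parent directory, so the derived structure is necessarily a tree; sharing r3 between r1 and r2 would need links that a plain directory walk cannot express.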

  • I wasn't sure if I understood how versioning is handled. If a new version of an RO is created, does this lead to the creation of a new directory in DropBox with a new manifest.rdf? If so, does the manifest.rdf of a given version of an RO contain information about all versions or only previous versions?

PSNC: This is probably something that should be decided during the DropBox connector development.

  • In the description of the main scenario, when files are removed from the RO (numbered 4 in the document), two actions are performed. The first is to mark the file as deleted (numbered 1 in the document). How is this done? I first thought that this is done within the manifest.rdf file, but the action that follows seems to do this (numbered 2 in the document).

PSNC: When the RO SRS service receives a request to delete a file, it deletes the file from its internal storage. This probably happens after the file is deleted from the end user's computer and from DropBox. The deletion triggers an automated modification of the manifest.rdf file.
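
The deletion behaviour described in this reply can be sketched in Python as follows. The class `ResearchObjectStore` is hypothetical, and a sorted list of paths stands in for the manifest.rdf file; this is not the RO SRS implementation.

```python
class ResearchObjectStore:
    """In-memory stand-in for the internal storage of one RO (hypothetical)."""

    def __init__(self, files):
        self._files = dict(files)  # RO-relative path -> file content
        self.manifest = self._build_manifest()

    def _build_manifest(self):
        # Stand-in for regenerating manifest.rdf from the stored files.
        return sorted(self._files)

    def delete(self, path):
        """Remove the file from storage, then regenerate the manifest."""
        del self._files[path]
        self.manifest = self._build_manifest()


store = ResearchObjectStore({"data/d1.csv": b"a,b\n", "notes.txt": b"hello\n"})
store.delete("notes.txt")
```

In this sketch there is no separate "mark as deleted" step: the manifest is regenerated from whatever remains in storage.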
