StoRM GridHTTPs Server’s WebDAV interface

version: 3.0.0

READ THIS

From StoRM GridHTTPs server version 3.0.0 the WebDAV service is provided at:
http(s)://[gridhttps.hostname]:[port]/webdav/[storage-area]/
instead of:
http(s)://[gridhttps.hostname]:[port]/[storage-area]/
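
Scripts or bookmarks written for older versions need the /webdav prefix inserted right after the host and port. The following is only a minimal sketch of that rewrite, assuming URLs of exactly the two forms shown above; the function name to_webdav_url is hypothetical and not part of StoRM. A pre-3.0.0 storage area that is itself named "webdav" would be mistaken for an already migrated URL.

```shell
# Hypothetical helper (not part of StoRM): rewrite a pre-3.0.0 URL
# to the 3.0.0 layout by inserting /webdav after host:port.
to_webdav_url() {
  if printf '%s\n' "$1" | grep -Eq '^https?://[^/]+/webdav(/|$)'; then
    # Already in the new layout: print it unchanged.
    printf '%s\n' "$1"
  else
    # Insert the prefix after the first path separator following host:port.
    printf '%s\n' "$1" | sed -E 's#^(https?://[^/]+)/#\1/webdav/#'
  fi
}
```

For example, to_webdav_url "http://ghttps.hostname:8085/free/file.txt" prints the same URL with /webdav/ placed before free. A URL without any path after the port is printed unchanged.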

Introduction  

Each Storage Area that supports HTTP or HTTPS transfer protocols can be accessed through the WebDAV interface provided by the StoRM GridHTTPs server component. This WebDAV interface conceals the details of the SRM protocol and allows users to mount remote Grid storage areas as a volume, directly on their own desktop.

To access a Storage Area’s data, users have to provide the right credentials. For example, if Storage Area A is owned by VO X, the user has to provide a valid VOMS proxy. If Storage Area B is owned by VO Y but also grants read-only access to anonymous users (see the examples section), the user has to provide a valid VOMS proxy only to write data. See the examples section for other Storage Area configurations.

What is WebDAV?  

The Web Distributed Authoring and Versioning (WebDAV) protocol consists of a set of methods, headers, and content-types ancillary to HTTP/1.1 for the management of resource properties, creation and management of resource collections, URL namespace manipulation, and resource locking. The purpose of this protocol is to present Web content as a writable medium in addition to being a readable one. The WebDAV article on Wikipedia and the WebDAV website provide more information on this protocol.

In short, the WebDAV protocol introduces abstractions such as resource properties, collections of resources, locks in general, and write locks specifically. These abstractions are manipulated through the WebDAV-specific HTTP methods and the extra HTTP headers used with them. The methods added by WebDAV include:

  • PROPFIND - used to retrieve properties, stored as XML, from a web resource. It is also overloaded to allow one to retrieve the collection structure (a.k.a. directory hierarchy) of a remote system.
  • PROPPATCH - used to change and delete multiple properties on a resource in a single atomic act.
  • MKCOL - used to create collections (a.k.a. a directory).
  • COPY - used to copy a resource from one URI to another.
  • MOVE - used to move a resource from one URI to another.
  • LOCK - used to put a lock on a resource. WebDAV supports both shared and exclusive locks.
  • UNLOCK - used to remove a lock from a resource.

While the status codes provided by HTTP/1.1 are sufficient to describe most error conditions encountered by WebDAV methods, there are some errors that do not fall neatly into the existing categories, so the WebDAV specification defines some extra status codes. Since some WebDAV methods may operate over many resources, the Multi-Status response has been introduced to return status information for multiple resources. WebDAV uses XML for property names and some values, and also uses XML to marshal complicated requests and responses.

SRM operations via WebDAV  

Starting from the EMI3 release, the StoRM GridHTTPs server exposes a WebDAV interface that lets users access Storage Area data from a browser or by mounting the storage area from a remote host.

The GridHTTPs WebDAV implementation is based on Milton, an open source Java library that acts as an API and HTTP protocol handler for adding WebDAV support to web applications. Milton is not a full server in itself; it can expose any existing data source (e.g. CMS, Hibernate POJOs, etc.) through a WebDAV interface.

As seen in the previous section, a WebDAV interface lets clients manipulate resources and collections of resources. In the StoRM GridHTTPs implementation a WebDAV resource is a file, while a WebDAV collection is a directory of the file system. Every WebDAV method has to be mapped to one or more SRM operations, and this mapping must be transparent to the end user. StoRM GridHTTPs maps the HTTP/WebDAV methods to SRM operations as follows:

Method: GET
Description: GET is defined as "retrieve whatever information (in the form of an entity) is identified by the Request-URI" (see RFC2616). Applied to a file, GET retrieves the file's content. Applied to a collection, GET returns an HTML resource that is a human-readable view of the contents of the collection.
SRM operation, GET directory: srmLs
SRM operation, GET file:
  1. srmPrepareToGet
  2. read the file from disk
  3. srmReleaseFile
Main exit codes:
  • 200 OK
  • 404 Not Found

Method: PUT
Description: The PUT method requests that the enclosed entity be stored under the supplied Request-URI. If the Request-URI refers to an already existing resource, the enclosed entity is considered a modified version of the one residing on the origin server. If the Request-URI does not point to an existing resource, the server creates the resource with that URI. The resource cannot be a collection.
SRM operation, PUT file:
  1. srmPrepareToPut
  2. write the file on disk
  3. srmPutDone
Main exit codes:
  • 201 Created: file created
  • 204 No Content: file overwritten
  • 409 Conflict: one or more intermediate collections do not exist
  • 405 Method Not Allowed: the resource exists but it is a collection

Method: HEAD
Description: Acts as in HTTP/1.1, so HEAD is a GET without a response message body.
SRM operation: none
Main exit codes:
  • 200 OK
  • 404 Not Found

Method: OPTIONS
Description: Returns the "DAV: 1" header.
SRM operation: none
Main exit codes:
  • 200 OK
  • 404 Not Found

Method: MKCOL
Description: MKCOL creates a new collection resource at the location specified by the Request-URI.
SRM operation: srmMkdir
Main exit codes:
  • 201 Created: directory created
  • 409 Conflict: one or more intermediate collections do not exist
  • 405 Method Not Allowed: the collection already exists

Method: DELETE
Description: Deletes the resource identified by the Request-URI. If the resource is a collection, every resource it contains is deleted recursively.
SRM operation, DELETE file: srmRm
SRM operation, DELETE directory: srmRmdir with the -r (recursive) option
Main exit codes:
  • 204 No Content: resource deleted
  • 404 Not Found: the resource does not exist

Method: COPY
Description: The COPY method creates a duplicate of the source resource identified by the Request-URI, in the destination resource identified by the URI in the Destination header. The Destination header MUST be present.
SRM operation: The StoRM srmCopy is currently deprecated, so the COPY of a file becomes a PUT of the file read from the Request-URI to the request's destination URI. The COPY of a directory is a recursive series of MKCOL/PUT.
Main exit codes:
  • 201 Created
  • 204 No Content: the destination resource already exists
  • 409 Conflict: one or more intermediate collections do not exist
  • 403 Forbidden: source and destination URI are the same

Method: MOVE
Description: The MOVE operation is the logical equivalent of a COPY followed by a delete of the source. All these actions have to be performed in a single operation. The Destination header MUST be present on all MOVE methods.
SRM operation: srmMv
Main exit codes:
  • 201 Created
  • 204 No Content: the destination resource already exists
  • 409 Conflict: one or more intermediate collections do not exist
  • 403 Forbidden: source and destination URI are the same

Method: PROPFIND
Description: The PROPFIND operation retrieves, in XML format, the properties defined on the resource identified by the Request-URI. Clients may submit a Depth header with a value of "0", "1", or "infinity" (default is "Depth: infinity"). Clients may submit a 'propfind' XML element in the body of the request describing what information is being requested: particular property values, by naming the desired properties within the 'prop' element; all property values, including additional ones (e.g. checksum type and value), by using the 'allprop' element; or the list of names of all the properties defined on the resource, by using the 'propname' element.
SRM operation: srmLs with the -l (detailed) option
Main exit codes:
  • 207 Multi-Status

Methods: POST, TRACE, CONNECT, LOCK, UNLOCK, PROPPATCH
Not allowed.

Every method can also return 401 Unauthorized if the user does not provide the necessary credentials.
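
When scripting uploads it can help to turn the status code into the meaning listed above. The sketch below does this for PUT only; the function name put_status_meaning is hypothetical, and the messages simply restate the PUT exit codes and the 401 case from this section.

```shell
# Hypothetical helper: translate the HTTP status code of a PUT request
# into the meaning documented for PUT in this section.
put_status_meaning() {
  case "$1" in
    201) echo "file created" ;;
    204) echo "file overwritten" ;;
    401) echo "necessary credentials not provided" ;;
    405) echo "resource exists but is a collection" ;;
    409) echo "one or more intermediate collections do not exist" ;;
    *)   echo "status $1 is not listed for PUT" ;;
  esac
}

# Usage (not run here): capture only the status code of an upload,
# where $url is the destination URL of the file.
#   code=$(curl -s -o /dev/null -w '%{http_code}' -T ./file.txt "$url")
#   put_status_meaning "$code"
```

The -w '%{http_code}' option makes cURL print just the numeric status, which keeps the check independent of the response body.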

Service installation and configuration  

The WebDAV interface is provided by the StoRM GridHTTPs component. Therefore, to set up a WebDAV access point to your data, install the StoRM GridHTTPs metapackage RPM (first make sure that all the prerequisites listed in the sys-admin guide are satisfied):

  [~]# yum install emi-storm-gridhttps-mp

To configure storm-gridhttps-server, fill in the required YAIM variables as described in the basic and advanced StoRM GridHTTPs sys-admin configuration guides. The required YAIM variables are explained in:

/opt/glite/yaim/examples/siteinfo/services/se_storm_gridhttps

The service uses (by default) ports 8443 and 8085, so open them on your firewall.

The service must be installed on a machine on which the StoRM file system is mounted. If needed, you can install StoRM GridHTTPs on different hosts that share the same data (e.g. hosts that are GPFS clients) and use them as a pool (see the StoRM BackEnd configuration in the sys-admin guide). To start the service:

  [~]# service storm-gridhttps-server start
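
Once the service is started, one quick check is an OPTIONS request, which is expected to return a DAV header. This is only a sketch: the helper name has_dav_header is hypothetical, and the usage line assumes the default HTTP port 8085 on the local host and that anonymous OPTIONS requests are accepted.

```shell
# Hypothetical helper: succeed if the HTTP response headers read from
# standard input contain a DAV header (case-insensitive match).
has_dav_header() {
  grep -qi '^DAV:'
}

# Usage (not run here): -D - prints the response headers on standard
# output and -o /dev/null discards the body.
#   curl -s -o /dev/null -D - -X OPTIONS http://localhost:8085/webdav/ \
#     | has_dav_header && echo "WebDAV endpoint is answering"
```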

Using WebDAV  

The StoRM GridHTTPs WebDAV server listens on two ports, one for the unencrypted HTTP connections and another for the SSL encrypted HTTP requests. Their default values are:

  • HTTP: 8085
  • HTTP over SSL: 8443
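
The default ports and the /webdav prefix can be combined into the root URL of a storage area. A minimal sketch, assuming the defaults listed above; the function name webdav_root and its argument order are hypothetical.

```shell
# Hypothetical helper: print the WebDAV root URL of a storage area.
# Arguments: scheme (http or https), host, storage area, optional port.
webdav_root() {
  scheme=$1
  host=$2
  area=$3
  port=${4:-}
  case "$scheme" in
    http)  port=${port:-8085} ;;   # default unencrypted port
    https) port=${port:-8443} ;;   # default SSL port
    *) echo "unsupported scheme: $scheme" >&2; return 1 ;;
  esac
  printf '%s://%s:%s/webdav/%s/\n' "$scheme" "$host" "$port" "$area"
}
```

Passing a fourth argument overrides the default port, which is useful when the service has been configured to listen elsewhere.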

To access storage areas’ data, users can use:

  • a browser (if the storage area can be accessed by anonymous or via a valid personal certificate)
  • cURLs (mandatory if you need to provide a valid x509 proxy credential)
  • a third-party WebDAV client (Cyberduck, Firefox RestClient plugin, …)

You can also develop a client on your own, for example by using the Apache Jackrabbit API.

Access data via browser  

Users can use a browser to read data from storage areas that can be accessed anonymously or with a valid personal certificate. With a browser, users can navigate through the storage areas’ directories and download or open files.

cURLs  

The best way to use the WebDAV service is with the cURL command. cURL is a command line tool for transferring data with URL syntax (see the cURL website). With cURL we can make anonymous requests or provide our x509 credentials: a personal certificate, a plain Grid proxy or a VOMS proxy. The following examples assume that the user’s personal certificate (usercert.pem) and key (userkey.pem) are in the $HOME/.globus directory, and that $X509_USER_PROXY contains the path to the user’s proxy.

Anonymous cURLs

Assuming that:

  • ghttps.hostname is the hostname where your WebDAV service is available
  • free is the name of a Storage Area that anonymous users can read and write

and knowing that unencrypted connections use 8085 as the default port, the cURL commands below cover each HTTP/WebDAV method.

Method: GET
cURL:
  curl -v -X GET http://ghttps.hostname:8085/webdav/free
  curl -v http://ghttps.hostname:8085/webdav/free
Notes: When getting a file, you can add a Range header to retrieve only a part of it. The -X GET option can be omitted because cURL performs an HTTP GET by default.

Method: PUT
cURL, to create the destination resource with its content given in the HTTP body:
  curl -v -X PUT http://ghttps.hostname:8085/webdav/free/filename.txt --data-ascii "file content"
cURL, to create the destination resource by uploading an existing local file:
  curl -v -T /local/path/to/filename.txt http://ghttps.hostname:8085/webdav/free/filename.txt
Notes: By default a PUT request overwrites the destination resource if it exists, so there is an implicit 'Overwrite: T' header. To make sure that the destination resource is not overwritten, add: --header 'Overwrite: F'

Method: MKCOL
cURL:
  curl -v -X MKCOL http://ghttps.hostname:8085/webdav/free/newdirectory
Notes: -

Method: DELETE
cURL:
  curl -v -X DELETE http://ghttps.hostname:8085/webdav/free/existent_resource
Notes: If existent_resource is a non-empty directory, DELETE works recursively and deletes all the resources it contains.

Method: OPTIONS
cURL:
  curl -v -X OPTIONS http://ghttps.hostname:8085/webdav/
Notes: There is no need to specify a storage area or resource with OPTIONS: the response is always the same.

Method: HEAD
cURL:
  curl -v --head http://ghttps.hostname:8085/webdav/
Notes: There is no need to specify a storage area or resource with HEAD: the response is always the same, that of a GET on the same URL without the body.

Method: PROPFIND
cURL:
  curl -v -X PROPFIND http://ghttps.hostname:8085/webdav/free/path/to/resource
Notes: The Depth header, with a value of "0", "1", or "infinity" (default is "Depth: infinity"), enables or disables recursion. The HTTP request body is used to retrieve specific or more detailed information. It must be in XML format, so it is necessary to add --header "Content-Type: text/xml". The body content is then passed through the --data-ascii option and starts with the XML header: --data-ascii "<?xml version='1.0' encoding='utf-8'?>..."
To obtain the list of names of all the resource properties, complete it with:
  <propfind xmlns='DAV:'><propname/></propfind>
To obtain the value of a single property, complete it with the following, where property-name is the name of the property element:
  <propfind xmlns='DAV:'><prop><property-name/></prop></propfind>
To obtain all the property values, complete it with:
  <propfind xmlns='DAV:'><allprop/></propfind>

Method: COPY
cURL:
  curl -X COPY http://ghttps.hostname:8085/webdav/free/existent_resource --header "Destination: http://ghttps.hostname:8085/webdav/free/unexistent_resource"
Notes: The Destination header must be present. A COPY on a collection without a Depth header acts as if a "Depth: infinity" header was included. The Depth header can be 0 or infinity. A COPY with --header "Depth: infinity" copies the collection and all its internal member resources, recursively through all levels of the collection hierarchy. A COPY with --header "Depth: 0" copies only the collection and its properties, not the resources identified by its internal member URIs. If the destination resource exists, the copy succeeds only if the user specifies --header "Overwrite: T".

Method: MOVE
cURL:
  curl -X MOVE http://ghttps.hostname:8085/webdav/free/existent_resource --header "Destination: http://ghttps.hostname:8085/webdav/free/unexistent_resource"
Notes: The Destination header must be present. The MOVE method acts like a COPY followed by a DELETE of the source. If the destination resource exists, the move succeeds only if the user specifies --header "Overwrite: T".

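
The PROPFIND pieces described above (Depth header, Content-Type header, XML body) can be assembled into one command. The sketch below is one possible way to do it, not the only one: the function names propfind_body and propfind are hypothetical, and the prop variant names the property as an empty XML element, which is the form defined by the WebDAV specification (RFC 4918).

```shell
# Hypothetical helper: print a PROPFIND request body.
#   propfind_body propname      names of all the properties
#   propfind_body allprop       values of all the properties
#   propfind_body prop NAME     value of the single property NAME
propfind_body() {
  case "$1" in
    propname|allprop) inner="<$1/>" ;;
    prop)             inner="<prop><$2/></prop>" ;;
    *) echo "usage: propfind_body propname|allprop|prop NAME" >&2; return 1 ;;
  esac
  printf "<?xml version='1.0' encoding='utf-8'?><propfind xmlns='DAV:'>%s</propfind>\n" "$inner"
}

# Hypothetical wrapper: send a PROPFIND to URL ($1) with Depth ($2);
# the remaining arguments are passed to propfind_body.
propfind() {
  url=$1
  depth=$2
  shift 2
  curl -s -X PROPFIND "$url" \
    --header "Depth: $depth" \
    --header "Content-Type: text/xml" \
    --data-ascii "$(propfind_body "$@")"
}
```

For example, propfind http://ghttps.hostname:8085/webdav/free/ 1 propname would list the property names of the storage area root and of its direct children. Whether a given property name is supported depends on the server.
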
Using x509 credentials

If you need to present an x509 certificate to be authorized, you can add to your cURL command the following options:

--cert path/to/usercert.pem --key path/to/userkey.pem --capath /path/to/the/trustdir

For example, assuming that:

  • A is a storage area that can be read and written only with an x509 certificate that has O="INFN"
  • $HOME/.globus/usercert.pem and $HOME/.globus/userkey.pem are the user’s certificate and private key
  • $HOME/.globus/usercert.pem has O="INFN"
  • the trust directory with the CA information is /etc/grid-security/certificates

and knowing that encrypted connections use 8443 as the default port, we can run a cURL command like this:

curl --verbose -X GET https://gridhttps.hostname:8443/webdav/A/ --cert $HOME/.globus/usercert.pem --key $HOME/.globus/userkey.pem --capath /etc/grid-security/certificates

and retrieve the list of files and directories in the root directory of storage area A.

Using proxy credentials

If you need to present an x509 proxy - plain or with VOMS extensions - to be authorized, you can add to your cURL command the following options:

--cert path/to/yourproxy --capath /path/to/the/trustdir

For example, assuming that:

  • B is a storage area that can be read and written only with an x509 VOMS proxy for the dteam VO
  • $X509_USER_PROXY contains the path to the user’s VOMS proxy
  • the trust directory with the CA information is /etc/grid-security/certificates

and knowing that encrypted connections use 8443 as the default port, we can run a cURL command like this:

curl --verbose -X GET https://gridhttps.hostname:8443/webdav/B/ --cert $X509_USER_PROXY --capath /etc/grid-security/certificates

and retrieve the list of files and directories in the root directory of storage area B.
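
If $X509_USER_PROXY is not set, the command above passes an empty --cert argument. A small wrapper can guard against that. This is a sketch under stated assumptions: the function names proxy_path and list_with_proxy are hypothetical, and the fallback /tmp/x509up_u<uid> is the conventional default location of a Grid proxy, which may differ on your system.

```shell
# Hypothetical helper: print the path of the proxy to present.
# Uses $X509_USER_PROXY when set, otherwise the conventional default
# location /tmp/x509up_u<uid>.
proxy_path() {
  printf '%s\n' "${X509_USER_PROXY:-/tmp/x509up_u$(id -u)}"
}

# Hypothetical wrapper: GET the URL given as $1 with the proxy above,
# refusing to run when the proxy file is not readable.
list_with_proxy() {
  proxy=$(proxy_path)
  [ -r "$proxy" ] || { echo "no readable proxy at $proxy" >&2; return 1; }
  curl --verbose "$1" --cert "$proxy" --capath /etc/grid-security/certificates
}
```

For example, list_with_proxy https://gridhttps.hostname:8443/webdav/B/ performs the same request as above once a proxy file is found.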

Firefox RESTClient plugin  

A useful Firefox plugin named RESTClient can be used as a debugger for RESTful web services. RESTClient supports all the HTTP methods defined in RFC2616 (HTTP/1.1) and RFC2518 (WebDAV). You can construct a custom HTTP request (a custom method with a resource URI and an HTTP request body) to test requests directly against a server.

Cyberduck  

Several clients can connect to a storage area that is readable over HTTP. One of these is Cyberduck, an open source FTP, SFTP, WebDAV, Cloud Files, Google Docs and Amazon S3 client for Mac OS X and Windows (as of version 4.0), licensed under the GPL. To configure it, add a new connection and insert:

  • server: the FQDN of the gridhttps host
  • port: the unencrypted HTTP port, default 8085
  • select anonymous login
  • specify /webdav/storage-area-name as remote path (from version 3.0.0 the WebDAV service is provided under /webdav/)

The same settings apply to many other WebDAV clients that can be used as alternatives to Cyberduck.
