version: 3.0.1
http(s)://[gridhttps.hostname]:[port]/webdav/[storage-area]/
instead of:
http(s)://[gridhttps.hostname]:[port]/[storage-area]/
Each Storage Area that supports the HTTP or HTTPS transfer protocols can be accessed through the WebDAV interface provided by the storm-gridhttps-server component. This WebDAV interface conceals the details of the SRM protocol and allows users to mount remote Grid storage areas as a volume directly on their own desktop.
To access a Storage Area's data, users have to provide the right credentials. For example, if Storage Area A is owned by VO X, the user has to provide a valid VOMS proxy. If Storage Area B is owned by VO Y but permits read-only access to anonymous users (see the examples section), the user has to provide a valid VOMS proxy only to write data.
See the examples section for other storage area configuration examples.
The Web Distributed Authoring and Versioning (WebDAV) protocol consists of a set of methods, headers, and content-types ancillary to HTTP/1.1 for the management of resource properties, creation and management of resource collections, URL namespace manipulation, and resource locking. The purpose of this protocol is to present Web content as a writable medium in addition to being a readable one. WebDAV on Wikipedia and the WebDAV website provide information on this protocol.
In short, the WebDAV protocol abstracts concepts such as resource properties, collections of resources, locks in general, and write locks specifically. These abstractions are manipulated by the WebDAV-specific HTTP methods and the extra HTTP headers used with them. The methods added by WebDAV include:
While the status codes provided by HTTP/1.1 are sufficient to describe most error conditions encountered by WebDAV methods, there are some errors that do not fall neatly into the existing categories, so the WebDAV specification defines some extra status codes. Since some WebDAV methods may operate over many resources, the Multi-Status response has been introduced to return status information for multiple resources. WebDAV uses XML for property names and some values, and also uses XML to marshal complicated requests and responses.
Starting from the EMI3 version, the storm-gridhttps-server component exposes a WebDAV interface that allows users to access Storage Area data via a browser or by mounting it from a remote host.
The GridHTTPs WebDAV implementation is based on Milton, an open source Java library that acts as an API and HTTP protocol handler for adding WebDAV support to web applications. Milton is not a full server in itself. It is able to expose any existing data source (e.g. CMS, Hibernate POJOs, etc.) through a WebDAV interface.
As described above, a WebDAV interface allows us to manipulate resources and collections of resources. In the StoRM GridHTTPs WebDAV implementation, a WebDAV resource is a file, while a WebDAV collection is a directory of a file system. Every WebDAV method needs to be mapped to one or more SRM operations, which have to be transparent to the final users.
StoRM GridHTTPs maps the HTTP/WebDAV methods to the SRM operations as shown in the following table:
Method | Description | SRM Operation | Main exit codes |
---|---|---|---|
GET | GET is defined as "retrieve whatever information (in the form of an entity) is identified by the Request-URI" (see RFC2616). GET applied to a file retrieves the file's content. GET applied to a collection returns an HTML resource that is a human-readable view of the contents of the collection. | GET directory: srmLs. GET file: 1. srmPrepareToGet, 2. read file from disk, 3. srmReleaseFile | 200 OK; 404 Not Found; 409 Conflict when the file is in a SRM_FILE_BUSY state |
PUT | The PUT method requests that the enclosed entity be stored under the supplied Request-URI. If the Request-URI refers to an already existing resource, the enclosed entity is considered a modified version of the one residing on the origin server. If the Request-URI does not point to an existing resource, the server creates the resource with that URI. | The resource can't be a collection. PUT file: 1. srmPrepareToPut, 2. write file to disk, 3. srmPutDone | 201 Created: file created; 204 No Content: file overwritten; 409 Conflict: one or more intermediate collections don't exist; 405 Method Not Allowed: resource exists but is a collection |
HEAD | Acts as in HTTP/1.1: HEAD is a GET without a response message body. | none | 200 OK; 404 Not Found |
OPTIONS | Returns the "DAV: 1" header. | none | 200 OK; 404 Not Found |
MKCOL | MKCOL creates a new collection resource at the location specified by the Request-URI. | srmMkdir | 201 Created: directory created; 409 Conflict: one or more intermediate collections don't exist; 405 Method Not Allowed: collection already exists |
DELETE | Deletes the resource identified by the Request-URI. If the resource is a collection, every resource it contains is deleted recursively. | DELETE file: srmRm. DELETE directory: srmRmdir with the -r (recursive) option | 204 No Content: resource deleted; 404 Not Found: resource doesn't exist |
COPY | The COPY method creates a duplicate of the source resource identified by the Request-URI in the destination resource identified by the URI in the Destination header. The Destination header MUST be present. | The StoRM srmCopy is currently deprecated, so the COPY of a file becomes a PUT of the file read from the Request-URI to the request's destination URI. The COPY of a directory is a recursive series of MKCOL/PUT. | 201 Created; 204 No Content: destination resource already exists; 409 Conflict: one or more intermediate collections don't exist; 403 Forbidden: source and destination URI are the same; 412 Precondition Failed: Destination URL is equal to source URL |
MOVE | The MOVE operation is the logical equivalent of a COPY followed by a delete of the source. All these actions have to be performed in a single operation. The Destination header MUST be present on all MOVE methods. | srmMv | 201 Created, or 204 No Content if the destination resource already exists; 409 Conflict: one or more intermediate collections don't exist; 403 Forbidden: source and destination URI are the same; 412 Precondition Failed: Destination URL is equal to source URL |
PROPFIND | The PROPFIND operation retrieves, in XML format, the properties defined on the resource identified by the Request-URI. Clients can submit a Depth header with a value of "0", "1", or "infinity" (the default is "Depth: infinity"). Clients may submit a 'propfind' XML element in the body of the request describing what information is being requested: particular property values, by naming the desired properties within the 'prop' element; all property values, including additional ones (e.g. checksum type and value), by using the 'allprop' element; or the list of names of all the properties defined on the resource, by using the 'propname' element. | srmLs with the -l (detailed) option | 207 Multi-Status |
POST | - | not allowed | - |
TRACE | - | not allowed | - |
CONNECT | - | not allowed | - |
LOCK | - | not allowed | - |
UNLOCK | - | not allowed | - |
PROPPATCH | - | not allowed | - |
For each method, a 403 Forbidden is returned if the user does not provide the necessary credentials.
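As a small illustration of the exit codes in the table above, the following sketch turns a method and a status code into a short message. The function name `explain_status` is hypothetical and not part of StoRM, and only a subset of the listed codes is covered.

```shell
# Hypothetical helper (not part of StoRM): explain the status code of a
# WebDAV response, following a subset of the table above.
# Usage: explain_status METHOD CODE
explain_status() {
  method=$1
  code=$2
  case "$method:$code" in
    *:403)             echo "forbidden: credentials missing or not sufficient" ;;
    GET:200)           echo "content retrieved" ;;
    GET:409)           echo "file is busy (SRM_FILE_BUSY)" ;;
    PUT:201)           echo "file created" ;;
    PUT:204)           echo "file overwritten" ;;
    PUT:405)           echo "target exists but is a collection" ;;
    PUT:409|MKCOL:409) echo "an intermediate collection does not exist" ;;
    MKCOL:201)         echo "directory created" ;;
    DELETE:204)        echo "resource deleted" ;;
    *:404)             echo "resource not found" ;;
    *)                 echo "unlisted status $code for $method" ;;
  esac
}

# Possible use with curl, which can print only the status code:
#   code=$(curl -s -o /dev/null -w '%{http_code}' -X MKCOL "$url")
#   explain_status MKCOL "$code"
```

The status code is captured with curl's `-w '%{http_code}'` option, so the helper itself never touches the network.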
The WebDAV interface is provided by the StoRM GridHTTPs component. Therefore, to install a WebDAV access point to your data you have to install the StoRM GridHTTPs metapackage RPM (first satisfy all the prerequisites listed in the sys-admin guide):
yum install emi-storm-gridhttps-mp
To configure storm-gridhttps-server you need to fill in the required YAIM variables as described in the basic and advanced StoRM GridHTTPs sys-admin configuration guides. A good explanation of the required YAIM variables is available in:
/opt/glite/yaim/examples/siteinfo/services/se_storm_gridhttps
The service uses ports 8443 and 8085 by default, so open them on your firewall.
The service needs to be installed on a machine on which the StoRM file system is mounted. If needed, you can install StoRM GridHTTPs on different hosts that share the same data (e.g. hosts that are GPFS clients) and use them as a pool (see the StoRM BackEnd configuration in the sys-admin guide). To start the service:
service storm-gridhttps-server start
The StoRM GridHTTPs WebDAV server listens on two ports, one for unencrypted HTTP connections and another for SSL-encrypted HTTP requests. Their default values are 8085 for HTTP and 8443 for HTTPS.
To access storage areas’ data, users can use:
You can also develop a client on your own, for example by using the Apache Jackrabbit API.
Users can use browsers to easily read data of storage areas that are:
Using a browser, users can navigate through the storage areas’ directories and download/open files.
The best way to use the WebDAV service is the curl command. curl is a command line tool for transferring data with URL syntax (see the curl website). With curl we can make anonymous requests or provide our x509 credentials: personal certificate, plain Grid proxy, or VOMS proxy. The following examples suppose that the user's personal certificate (usercert.pem) and key (userkey.pem) are in the $HOME/.globus directory, and that the path to the user's proxy is in $X509_USER_PROXY.
Assuming that:
and knowing that unencrypted connections have 8085 as the default port, the following examples show a curl command for each HTTP/WebDAV method.
curl -v -X GET http://ghttps.hostname:8085/webdav/free
or
curl -v http://ghttps.hostname:8085/webdav/free
When getting a file, you can specify a Range header to retrieve only part of it. The GET method can be omitted because curl issues an HTTP GET by default.
To create the destination resource specifying its content via HTTP body:
curl -v -X PUT http://ghttps.hostname:8085/webdav/free/filename.txt --data-ascii "file content"
To create the destination resource by uploading an existing local file:
curl -v -T /local/path/to/filename.txt http://ghttps.hostname:8085/webdav/free/filename.txt
By default, a PUT request overwrites the destination resource if it exists, so there is an implicit 'Overwrite: T' header. To make sure that the destination resource won't be overwritten, add: --header 'Overwrite: F'
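To make the overwrite choice explicit, the sketch below prints (without running it) the curl upload command with the Overwrite header set. The function name `put_command` is hypothetical, and the local path and URL in the usage comment are only examples.

```shell
# Hypothetical helper: print the curl command that uploads a local file,
# always stating the Overwrite header explicitly.
# Usage: put_command LOCAL_FILE URL [T|F]   (T is the default described above)
put_command() {
  local_file=$1
  url=$2
  overwrite=${3:-T}
  case "$overwrite" in
    T|F) ;;
    *) echo "overwrite must be T or F" >&2; return 1 ;;
  esac
  printf "curl -v -T '%s' --header 'Overwrite: %s' '%s'\n" \
    "$local_file" "$overwrite" "$url"
}

# Example: refuse to replace an existing file.
#   put_command /local/path/to/filename.txt \
#     http://ghttps.hostname:8085/webdav/free/filename.txt F
```

Printing the command first makes it easy to review the headers before sending a request that could replace remote data.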
curl -v -X MKCOL http://ghttps.hostname:8085/webdav/free/newdirectory
curl -v -X DELETE http://ghttps.hostname:8085/webdav/free/existent_resource
If existent_resource is a non-empty directory, DELETE removes all the resources it contains.
curl -v -X OPTIONS http://ghttps.hostname:8085/webdav/
With an OPTIONS request there is no need to specify a storage area or resource: the response is the same.
curl -v --head http://ghttps.hostname:8085/webdav/
With a HEAD request there is no need to specify a storage area or resource: the response is the same as a GET on the same URL, without the body.
curl -v -X PROPFIND http://ghttps.hostname:8085/webdav/free/path/to/resource
The Depth header, with a value of “0”, “1”, or “infinity” (the default is “Depth: infinity”), enables or limits recursion. The HTTP request body is used to retrieve specific or more detailed information. It must be in XML format, so it is necessary to add a --header "Content-Type: text/xml" option. The body content is then specified through the --data-ascii option and starts with the XML declaration:
--data-ascii "<?xml version='1.0' encoding='utf-8'?>..."
To obtain the list of names of all the resource properties complete it with:
<propfind xmlns='DAV:'><propname/></propfind>
To obtain the value of a single property complete it with:
<propfind xmlns='DAV:'><prop><property-name/></prop></propfind>
To obtain all the property values complete it with:
<propfind xmlns='DAV:'><allprop/></propfind>
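The three request bodies above can be assembled with a small helper, sketched here under the assumption that each requested property is written as an empty element inside 'prop', as the WebDAV specification defines. The function name `propfind_body` is hypothetical, and `getcontentlength` and `getlastmodified` are standard DAV: properties used only as examples.

```shell
# Hypothetical helper: build the XML body of a PROPFIND request.
# Usage: propfind_body propname
#        propfind_body allprop
#        propfind_body prop NAME [NAME...]
propfind_body() {
  kind=$1
  header="<?xml version='1.0' encoding='utf-8'?>"
  case "$kind" in
    propname) inner="<propname/>" ;;
    allprop)  inner="<allprop/>" ;;
    prop)
      # The remaining arguments are the names of the requested properties.
      shift
      inner="<prop>"
      for name in "$@"; do
        inner="$inner<$name/>"
      done
      inner="$inner</prop>"
      ;;
    *) echo "unknown request kind: $kind" >&2; return 1 ;;
  esac
  printf "%s<propfind xmlns='DAV:'>%s</propfind>\n" "$header" "$inner"
}

# Possible use, listing property names one level deep:
#   curl -v -X PROPFIND http://ghttps.hostname:8085/webdav/free/path/to/resource \
#     --header "Depth: 1" --header "Content-Type: text/xml" \
#     --data-ascii "$(propfind_body propname)"
```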
curl -X COPY http://ghttps.hostname:8085/webdav/free/existent_resource --header "Destination: http://ghttps.hostname:8085/webdav/free/unexistent_resource"
The Destination header must be present. A COPY on a collection without a Depth header acts as if a Depth: infinity header were included. The Depth header can be 0 or infinity. A COPY with --header "Depth: infinity" copies the collection and all its internal member resources, recursively through all levels of the collection hierarchy. A COPY with --header "Depth: 0" copies only the collection and its properties, not the resources identified by its internal member URIs. If the destination resource exists, the copy succeeds only if the user specifies --header "Overwrite: T".
curl -X MOVE http://ghttps.hostname:8085/webdav/free/existent_resource --header "Destination: http://ghttps.hostname:8085/webdav/free/unexistent_resource"
The Destination header must be present. The MOVE method acts like a COPY followed by a DELETE of the source. If the destination resource exists, the move succeeds only if the user specifies --header "Overwrite: T".
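Since COPY and MOVE share the same Destination and Overwrite headers, one helper can prepare both. The sketch below prints the command instead of running it; the function name `transfer_command` is hypothetical, and defaulting Overwrite to F is a choice of this example, made so that an existing destination is never replaced by accident.

```shell
# Hypothetical helper: print the curl command for a COPY or MOVE request.
# Usage: transfer_command COPY|MOVE SOURCE_URL DESTINATION_URL [T|F]
transfer_command() {
  method=$1
  source_url=$2
  dest_url=$3
  overwrite=${4:-F}   # example default: never replace an existing destination
  case "$method" in
    COPY|MOVE) ;;
    *) echo "method must be COPY or MOVE" >&2; return 1 ;;
  esac
  # The server rejects requests whose source and destination coincide,
  # so catch that case before sending anything.
  if [ "$source_url" = "$dest_url" ]; then
    echo "source and destination are the same" >&2
    return 1
  fi
  printf "curl -X %s '%s' --header 'Destination: %s' --header 'Overwrite: %s'\n" \
    "$method" "$source_url" "$dest_url" "$overwrite"
}

# Example:
#   transfer_command MOVE http://ghttps.hostname:8085/webdav/free/old \
#     http://ghttps.hostname:8085/webdav/free/new T
```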
If you need to present an x509 certificate to be authorized, you can add the following options to your curl command:
--cert path/to/usercert.pem --key path/to/userkey.pem --capath /path/to/the/trustdir
For example, assuming that $HOME/.globus/usercert.pem and $HOME/.globus/userkey.pem are the user's certificate and private key, that $HOME/.globus/usercert.pem has O="INFN", and knowing that encrypted connections use 8443 as the default port, we can run a curl command like this:
curl --verbose -X GET https://gridhttps.hostname:8443/webdav/A/ --cert $HOME/.globus/usercert.pem --key $HOME/.globus/userkey.pem --capath /etc/grid-security/certificates
and retrieve the list of files and directories in the root directory of storage area A.
If you need to present an x509 proxy, plain or with VOMS extensions, to be authorized, you can add the following options to your curl command:
--cert path/to/yourproxy --capath /path/to/the/trustdir
For example, assuming that $X509_USER_PROXY contains the path to the user's VOMS proxy, and knowing that encrypted connections use 8443 as the default port, we can run a curl command like this:
curl --verbose -X GET https://gridhttps.hostname:8443/webdav/B/ --cert $X509_USER_PROXY --capath /etc/grid-security/certificates
and retrieve the list of files and directories in the root directory of storage area B.
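The two credential styles above can be selected automatically. The following sketch prefers the proxy when $X509_USER_PROXY is set and falls back to the personal certificate and key otherwise. The function name `auth_options` and the `TRUST_DIR` variable are hypothetical names introduced for this example.

```shell
# Hypothetical helper: print the curl credential options described above.
# TRUST_DIR is an assumed variable name; it defaults to the trust directory
# used in the examples.
auth_options() {
  trust_dir=${TRUST_DIR:-/etc/grid-security/certificates}
  if [ -n "${X509_USER_PROXY:-}" ]; then
    # Proxy (plain or VOMS): the proxy file holds both certificate and key.
    printf '%s %s %s %s\n' --cert "$X509_USER_PROXY" --capath "$trust_dir"
  else
    # Personal certificate and private key from the .globus directory.
    printf '%s %s %s %s %s %s\n' \
      --cert "$HOME/.globus/usercert.pem" \
      --key "$HOME/.globus/userkey.pem" \
      --capath "$trust_dir"
  fi
}

# Possible use (left unquoted on purpose so the options are split into
# words; this assumes the paths contain no spaces):
#   curl --verbose $(auth_options) https://gridhttps.hostname:8443/webdav/B/
```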
A useful Firefox plugin named RESTClient can be used as a debugger for RESTful web services. RESTClient supports all the HTTP methods of RFC2616 (HTTP/1.1) and RFC2518 (WebDAV). You can construct custom HTTP requests (a custom method with a resource URI and an HTTP request body) to test them directly against a server.
To connect to an HTTP-readable storage area you can use several clients. One of these is Cyberduck, an open source FTP, SFTP, WebDAV, Cloud Files, Google Docs, and Amazon S3 client for Mac OS X and Windows (as of version 4.0), licensed under the GPL. To configure it, add a new connection and insert:
This configuration is the same for many other WebDAV clients that can be used as alternatives to Cyberduck.