3.2.2.1. ADA interface¶
Our ADA (Advanced dCache API) interface is based on the dCache API and the webdav protocol to access and process your data on dCache from any platform and with various authentication methods.
rclone is a webdav client that supports by default 4 parallel streams of data, and is installed on the spider platform.
macaroons are a token based authentication method supported by dCache. Macaroons can be used to give access to dCache data in a very granular way. This enables data managers autonomously share their data in dCache without having to reach out to SURFsara to request access.
A quick start up guide for ADA is captured in the video below:
3.2.2.1.1. Browser view¶
dCache storage can be viewed both through the Ada tools or through your browser using the web client, this is just one additional way you can explore the storage space.
As a Data manager you have direct credentials on dCache and it is possible to access the browser view using you Spider credentials [project-username] in the following link:
https://webdav-secure.grid.surfsara.nl/pnfs/grid.sara.nl/data/[PROJECT]/
Note
You may be asked for a browser certificate, just select cancel and you will be asked for your credentials
3.2.2.1.2. Using ADA¶
ADA is a wrapper of tools created by SURFsara to simplify your interactions with dCache. Rclone can support uploading and downloading data but other operations such as listing or deleting files and directories can be performed directly on the dCache API. ADA wraps all of this functionality into one clean package saving you the hassle of having to download and troubleshoot multiple packages and dependencies. ADA is installed on Spider.
This section provides examples and the steps to start using ADA to interact with your dCache storage.
3.2.2.1.2.1. Create a macaroon¶
- Requirements: credential to dCache
- username/pwd or
- x509 proxy
- Spider role: Data manager
- Action: Create a macaroon
- Output: rclone tokenfile [PROJECT_tokenfile].conf. You can share this file with any member in the project in next step.
- Description: the DM creates a macaroon for a shared directory (including the sub-directories & files). In the next step he will share the macaroon with the project team in a non-public space, either user’s home directories, or the ‘shared’ or ‘data’ project space directories.
- Example:
get-macaroon \
--url https://webdav.grid.surfsara.nl:2880/pnfs/grid.sara.nl/data/[PROJECT] \
--duration P7D \
--chroot \
--user [PROJECT]-[USER] \
--permissions DOWNLOAD,UPLOAD,DELETE,MANAGE,LIST,READ_METADATA,UPDATE_METADATA \
--ip [IP RANGE] \
--output rclone [PROJECT_tokenfile]
These permissions can be given comma separated upon creation of the macaroon:
Permission | Function |
---|---|
DOWNLOAD | Read a file |
UPLOAD | Write a file |
DELETE | Delete a file or directory |
MANAGE | Rename or move a file or directory |
LIST | List objects in a directory |
READ_METADATA | Read file status |
UPDATE_METADATA | Stage/unstage a file, change QoS |
3.2.2.1.2.3. Inspect the macaroon¶
- Requirements: the rclone tokenfile [PROJECT_tokenfile].conf
- Spider role: Normal user
- Actions: View macaroon
- Output: the list activities and directories that you can use on dCache
- Example:
# Your macaroon is the value of 'bearer_token'
$ cat [PROJECT_tokenfile].conf
[tokenfile]
type = webdav
bearer_token = MDAxY2xvY2F0aWXXXXXXXXXXXXXXXX
url = https://webdav.grid.surfsara.nl:2880/
vendor = other
user =
password =
#View the macaroon details
$ view-macaroon [PROJECT_tokenfile].conf
location Optional.empty
identifier NDFXzXXX
cid iid:03FXXX//
cid id:39147;35932,30013;[Data Manager Name]
cid before:2020-02-05T11:01:11.577Z
cid home:/[Project folder]
cid root:/[Project folder]
cid activity:DOWNLOAD,UPLOAD,MANAGE,LIST
signature fefef25a4973e59b10ad464054dXXXXXXX
3.2.2.1.2.4. Use the macaroon¶
This section describes how to work with your files.
- Requirements: the rclone tokenfile [PROJECT_tokenfile].conf
- Spider role: Normal user
Tip
If you want to use an environment variable to set the token file, rather than having to pass it on the command line every time then you can do: $export ada_tokenfile=/path-to-mytoken/[PROJECT_tokenfile].conf
and then you can omit the option ‘–tokenfile’ from all of the ada commands
Tip
You can get extra information about the submitted command and the rest call details by using the –debug option in your ada command.
3.2.2.1.2.4.1. Check your access to the system¶
–whoami
- Action: request authentication details
- Output: information about the token owner and permissions
- Example:
ada --tokenfile [PROJECT_tokenfile].conf --whoami
{
"status": "AUTHENTICATED",
"uid": 515XX,
"gids": [
511XX
],
"username": "[Data Manager name]",
"rootDirectory": "/pnfs/grid.sara.nl/data/[Project]/disk",
"homeDirectory": "/"
}
3.2.2.1.2.4.2. Listing files¶
–list <directory>
–longlist <file|directory>
–longlist –from-file <file-list>
- Action: List files or directories
- Output: List or long list of the files from the directory that the macaroon allows permission
- Example:
ada --tokenfile [PROJECT_tokenfile].conf --longlist /[DIRECTORY]
3.2.2.1.2.4.3. Get file or directory details¶
–stat <file|directory>
- Action: Show all details of a file or directory
- Output: metadata information
- Example:
ada --tokenfile [PROJECT_tokenfile].conf --stat /[FILE or DIRECTORY]
3.2.2.1.2.4.4. Create a directory on dCache¶
–mkdir <directory>
- Action: Create directories
- Output: New directory created
- Example:
ada --tokenfile [PROJECT_tokenfile].conf --mkdir /[DIRECTORY]
3.2.2.1.2.4.5. Moving or renaming files¶
–mv <file|directory> <destination>
- Action: Move file or directory. This can be used as an option also to rename a directory if the move is done in the same directory. Specify the full path and name to the source and target directory
- Output: File or Directory moved to a different dCache location or renamed
- Example:
ada --tokenfile [PROJECT_tokenfile].conf --mv /[SOURCE] /[DESTINATION]
3.2.2.1.2.4.6. Recursively remove folders¶
–delete <file|directory> [–recursive [–force]]
- Action: Delete files or directories
- Output: File or Directory is deleted
- Recursive deletion: To recursively delete a directory and ALL of its contents, add –recursive. You will need to confirm deletion of each subdir, unless you add –force.
- Alternative: rclone purge
- Example:
ada --tokenfile [PROJECT_tokenfile].conf --delete /[FILE or DIRECTORY]
ada --tokenfile [PROJECT_tokenfile].conf --delete /[FILE or DIRECTORY] --recursive
ada --tokenfile [PROJECT_tokenfile].conf --delete /[DIRECTORY] --recursive --force
# alternative
$ rclone --config=[PROJECT_tokenfile].conf purge PROJECT_tokenfile]:/disk/rec-delete/
3.2.2.1.2.4.7. Checksum¶
–checksum <file>
–checksum <directory>
–checksum –from-file <file-list>
- Action: Get the checksum of a files or files inside a directory or list of files
- Output: Show MD5/Adler32 checksums for files
- Example:
ada --tokenfile [PROJECT_tokenfile].conf --checksum /[FILE or DIRECTORY]
# create a filelist and get checksums for files in it
ada --tokenfile [PROJECT_tokenfile].conf --list /disk/mydir > files-to-checksum
sed -i -e 's/^/\/disk\/mydir\//' files-to-checksum
ada --tokenfile [PROJECT_tokenfile].conf --checksum --from-file files-to-checksum
#/disk/file1 ADLER32=80690001
#/disk/file2 ADLER32=80690001
#/disk/file3 ADLER32=80690001
3.2.2.1.2.4.8. View your usage¶
- Action: get your storage usage with Rclone
- Example:
rclone --config=[PROJECT_tokenfile].conf size [PROJECT_tokenfile]:/
3.2.2.1.2.4.9. Staging¶
The dCache storage at SURFsara consists of magnetic tape storage and hard disk storage. If your quota allocation includes tape storage, then the data stored on magnetic tape has to be copied to a hard drive before it can be used. This action is called Staging files or ‘bringing a file online’.
Your macaroon needs to be created with UPDATE_METADATA permissions to allow for staging operations.
–stage <file>
–stage <directory>
–stage –from-file <file-list>
- Action: Stage a file from tape or files in directory or a list of files (restore, bring it online)
- Output: the file or list of files comes online on disk
- Example:
#list files to get the status
ada --tokenfile [PROJECT_tokenfile].conf --longlist /[PROJECT_tape_dir]
#file1 1186443 2020-02-13 16:27 UTC tape NEARLINE
#file2 1635 2018-10-24 15:34 UTC tape NEARLINE
#stage a single file
ada --tokenfile [PROJECT_tokenfile].conf --stage /[PROJECT_tape_dir]/file1
#stage a list of files
ada --tokenfile [PROJECT_tokenfile].conf --stage --from-file files-to-unstage
3.2.2.1.2.4.10. Unstaging¶
Your macaroon needs to be created with UPDATE_METADATA permissions to allow for unstaging operations.
–unstage <file>
–unstage <directory>
–unstage –from-file <file-list>
- Action: Unstage/Release a file from tape or files in directory or a list of files
- Output: the file or list of files is unstaged and may be removed for the disk any time so dCache may purge its online replica.
#unstage a single file
ada --tokenfile [PROJECT_tokenfile].conf --unstage /[PROJECT_tape_dir]/file1
#unstage a list of files
ada --tokenfile [PROJECT_tokenfile].conf --list /tape > files-to-unstage
sed -i -e 's/^/\/tape\//' files-to-unstage
ada --tokenfile [PROJECT_tokenfile].conf --unstage --from-file files-to-unstage
3.2.2.1.2.5. Transfer Data¶
In order to transfer files from/to dCache we use the same [PROJECT_tokenfile].conf and the rclone client to trigger webdav transfers as shown below.
3.2.2.1.2.5.1. Copy data from dCache¶
rclone --config=[PROJECT_tokenfile].conf copy [PROJECT_tokenfile]:/[SOURCE] ./[DESTINATION] -P
Example, copy an existing test folder to Spider:
rclone --config=[PROJECT_tokenfile].conf copy [PROJECT_tokenfile]:/tests/ ./tests/ -P
3.2.2.1.2.5.2. Write data to dCache¶
rclone --config=[PROJECT_tokenfile].conf copy ./[SOURCE]/ [PROJECT_tokenfile]:[DESTINATION] -P
Notes on data transfers:
- The rclone ‘copy’ mode will just copy new/changed files. The rclone ‘sync’ (one way) mode will create a directory identical to the source so be careful because this can cause data loss and test first with the –dry-run flag to see exactly what would be copied and deleted
- You can increase the number of parallel transfers with the ‘–transfers [Number]’ option
- For very large files it is important to set the –timeout’ option high enough. As a rule of thumb, set it to 10 minutes for every GB of the biggest file in a collection. This may look ridiculously large, but it provides a safe margin to avoid problems with timeout issues
- Using –multi-thread-streams 1 increases the performance for large files
#example command to upload a big file
rclone --timeout=240m --multi-thread-streams 1 --config=[PROJECT_tokenfile].conf copy ./[SOURCE]/ [PROJECT_tokenfile]:[DESTINATION] -P
3.2.2.1.3. Event-driven processing¶
Events are useful when you want to know something you’re interested in happened in your dCache project space, such as when new data is available or when files are staged from tape, etc.
- Subscribe to changes in a given directory:
ada --tokenfile [PROJECT_tokenfile].conf --events changes-in-dir /[PROJECT_directory] --recursive
- Check the available channels listening to events:
ada --tokenfile [PROJECT_tokenfile].conf --channels
- Report staging events
When you start this channel, all files in the scope will be listed, including their locality and QoS. This allows your event handler to take actions, like starting jobs to process the files that are online. When all files have been listed, the command will keep listening and reporting all locality and QoS changes.
ada --tokenfile [PROJECT_tokenfile].conf --report-staged staging-in-tape-dir /[PROJECT_directory] --recursive
3.2.2.1.4. Authentication¶
In this page we gave an extended example on using ada with macaroons authentication. Ada can be used with multiple authentication options.
Authentication | ADA commands | When to use |
---|---|---|
Macaroon | ada --tokenfile <filename> |
You don’t have direct access on dCache but you have a token from the project data manager that allows you certain permissions on the data |
Username/password | ada --netrc [filename] |
You have direct usr/pwd access credentials on dCache |
X509 Certificate | ada --proxy [filename] |
You have direct VO membership access on dCache |
Here is an example of a .netrc file that you can create in your home to use username/password authentication:
$ cat ~/.netrc:
machine webdav.grid.surfsara.nl
login [your-ui-username]
password [your-ui-password]
machine dcacheview.grid.surfsara.nl
login [your-ui-username]
password [your-ui-password]
3.2.2.1.5. Run ADA anywhere¶
In this page we gave an extended example on using ada on the Spider platform. Ada is portable and can be used on any platform. On the SURFsara UIs ADA is already on board. If you want to interact with the dCache API and transfer files from your own machine then you need to install the following prerequisites:
jq
: the only dependency for executing ada commandsrclone
: the client to perform transfers (MacOS: brew install rclone)
As a Data manager if you wish to create macaroons from any platform, e.g. your local machine, then you need to install the following get-macaroon and view-macaroon scripts:
wget https://raw.githubusercontent.com/sara-nl/GridScripts/master/get-macaroon
wget https://raw.githubusercontent.com/sara-nl/GridScripts/master/view-macaroon
- And their dependencies:
pymacaroons, python3-html2text
3.2.2.1.6. Ada configuration files¶
The user specific configuration files are written in ~/.ada/
- The URL to query the API is stored in /etc/ada.conf (system default) or ~/.ada/ada.conf (user specific, optional)
- The bearer tokens information based on a tokenfile is stored in ~/.ada/headers/. The authorization_header is created for security to prevent from reading the token as argument and be displayed in ‘ps’ info. This way the token is read from a hidden file in the user home dir
- The Events information such as the last eventID is stored in ~/.ada/channels/