Encrypting, transferring, and decrypting data
The instructions below describe the process of encrypting, transferring, and decrypting data. sett provides both a graphical user interface (GUI) and a command line interface (CLI) for these operations.
sett help
In the CLI, each command and subcommand provides a help message that can be used to get more information about the available options (-h and --help for the short and long help message, respectively).
For example:
sett --help
sett encrypt local -h
sett encrypt local --help
Encrypting and sending data
sett allows the encryption of any combination of individual files and directories.
The files are first compressed into a single data.tar.gz archive, which is then encrypted with the public key of one or more recipient(s), and signed with the sender’s key. The encrypted data (data.tar.gz.gpg) is then bundled with a metadata file (a plain text file that contains information about who is sending the file and to whom it should be delivered) into a single .zip file.
The specifications of the output .zip files produced by sett are described in data package specification.
sett supports multi-recipient data encryption. This allows the encrypted file to be decrypted by multiple recipients.
sett also ensures the integrity of the transferred files by computing checksums on each file that is packaged, and adding this information to the encrypted data. The integrity of each file is verified automatically upon decryption of the file by sett, providing the guarantee that all files were transferred flawlessly.
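The per-file integrity check described above can be pictured with a short shell sketch. It is only an illustration: the use of SHA-256, the checksums.sha256 file name, and the helper names write_checksums and verify_checksums are assumptions made here, not a description of how sett stores or verifies checksums internally.

```shell
# Illustrative only: record one checksum per file, then verify them later.
# SHA-256 and the "checksums.sha256" file name are assumptions of this sketch.

# Write a checksum line for every regular file found under directory $1.
write_checksums() {
    (
        cd "$1" || exit 1
        find . -type f ! -name checksums.sha256 -exec sha256sum {} + > checksums.sha256
    )
}

# Recompute the checksums recorded in directory $1.
# Returns non-zero if any file is missing or its content has changed.
verify_checksums() {
    (
        cd "$1" || exit 1
        sha256sum -c checksums.sha256 > /dev/null 2>&1
    )
}
```

Calling write_checksums on a directory before packaging and verify_checksums on the unpacked copy mirrors the idea: any file altered in between makes the verification fail. sett performs its own check automatically on decryption, so this is not something users need to run.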
BioMedIT
Data Transfer Requests: each data transfer into the BioMedIT network must have an authorized Data Transfer Request ID (DTR ID). This ID must be specified at the time the data is encrypted (see below). The ID is added to the encrypted file’s metadata information by sett. A valid and authorized DTR ID value is mandatory for any data transfer into the BioMedIT network. Non-compliant packages will be rejected.
Recipients: each data transfer into the BioMedIT network must be to a recipient assigned to the role of Data Manager for the given project. The recipient’s PGP key must also be approved by the BioMedIT key validation authority. If these conditions are not met, sett will not encrypt the data.
Output file naming scheme
By default, encrypted output files produced by sett are named after the pattern:
<project code>_<YYYYMMDD>T<HHMMSS>.zip
where:
- <project code> is the abbreviation/code associated with the project. If Verify package is disabled, no project code is added as a prefix to the output file name.
- <YYYYMMDD> is the current date (Year, Month, Day).
- <HHMMSS> is the current time (Hours, Minutes, Seconds).
Example: demo_20220211T143311.zip, where demo is the project code.
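As a rough illustration of the naming pattern, the following shell sketch builds a file name of the same shape. The helper name package_name is hypothetical, and the behavior without a project code (timestamp only) is an assumption based on the description above; sett generates the name itself, so this is only meant to help predict or match output file names in scripts.

```shell
# Hypothetical helper: build a name of the form <project code>_<YYYYMMDD>T<HHMMSS>.zip.
# $1: project code (may be empty), $2: optional timestamp (defaults to the current time).
package_name() {
    project_code="$1"
    timestamp="${2:-$(date +%Y%m%dT%H%M%S)}"
    if [ -n "$project_code" ]; then
        printf '%s_%s.zip\n' "$project_code" "$timestamp"
    else
        # Assumption: without a project code, only the timestamp is used.
        printf '%s.zip\n' "$timestamp"
    fi
}

package_name demo 20220211T143311   # prints demo_20220211T143311.zip
```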
When using the sett command line interface to encrypt to the local file system, it is possible to completely override the above output file naming scheme by passing the -o, --output option. Overriding the naming scheme is not possible when using sett-gui or when encrypting to a remote S3 or SFTP destination.
Encrypting and sending data (GUI)
Authenticated mode (BioMedIT)
sett provides “authenticated mode” that simplifies the encryption and transfer process by providing a list of available Data Transfer Requests (DTRs) and automatically fetching the required destination parameters and credentials (only available for the S3 destination).
To access this mode, go to the Profile tab and click “Sign in”. You will be redirected to the BioMedIT authentication system. After successful authentication, proceed to the Encrypt and Transfer Data tab.
Data can be encrypted and either saved to the local file system, or sent directly to a remote server that supports one of the following protocols:
- S3 object storage
- SFTP
- Go to the Encrypt and Transfer Data tab.
- Select files and/or directories to encrypt: using the Files and Folders buttons, select at least one file or directory to encrypt. Files and directories can also be dragged and dropped into sett.
  Note
  If you already have an encrypted data package in your local file system, you can select it with sett Package (or drag and drop it into sett), then choose a remote destination (continue reading at Select destination).
- Select data sender: in the drop-down list found under Select sender, select your own OpenPGP key (you are the data sender). For most users, there should be only one key in the Sender drop-down menu: their own key.
  Note
  The Sender key is used to sign the encrypted data, so that the recipient(s) can be confident that the data have been sent by a trusted source.
- Select data recipients: add one or more recipients by selecting them from the drop-down list found under Select recipients. Recipients are the people for whom data should be encrypted: their public OpenPGP key will be used to encrypt the data, and only they will be able to decrypt it.
  BioMedIT
  Only recipients assigned to the role of Data Manager of the project for which data is being encrypted are permitted as data recipients.
- Data Transfer ID: Data Transfer Request ID associated with the data package that is being encrypted. Specifying a valid DTR ID is mandatory to transfer data into the BioMedIT network.
  For data not intended to be transferred into the BioMedIT network, the DTR ID field can be left empty (or set to any arbitrary value). In this case, Verify package must be disabled (in the Settings tab).
  BioMedIT
  The DTR ID field is mandatory. Only files encrypted with a valid and authorized DTR ID value can be transferred into the secure BioMedIT network. For this reason, BioMedIT users should always leave the Verify package checkbox enabled.
- This section is only visible if “Enable extra metadata” is selected in the Settings tab.
  Here you can add extra metadata to the data package in the form of key-value pairs. This metadata will be stored in the metadata file of the encrypted data package. After inserting the key and value, click the + button (or hit “Enter”) to add the new entry. You can add multiple key-value pairs.
- Select destination: destination where to encrypt (and send) the data.
  - Local: The local file system. When using this destination, it’s possible to specify:
    - Output location: directory where the encrypted file should be saved. By default, output files are saved to the user’s home directory. This default behavior can be changed by changing Default output directory in the Settings tab.
  - S3: A remote S3 compatible object store.
  - SFTP: A remote SFTP server.
  When encrypting to a remote destination, a number of options must be specified. Please refer to remote destination options for details.
- You are now ready to create an encrypted data package: click Encrypt data or Encrypt and transfer data if you are encrypting to a remote S3 or SFTP destination. A pop-up will appear, asking for the password associated with the sender’s key. After the password is entered, data compression and encryption will start. Progress and error messages are displayed in the Tasks tab.
When the encryption has completed successfully, a notification will pop up with a message that reads: “Encryption job finished”.
At this point, all input files are compressed, encrypted, and bundled into a single .zip file. If the destination was S3 or SFTP, data has also been transferred to the remote destination.
Encrypting and sending data (CLI)
To create an encrypted data package and save it to the local file system, use the encrypt subcommand. If you already have an encrypted data package in your local file system and want to transfer it to a remote destination, use the transfer subcommand. Both subcommands share the same options for specifying the remote destination.
Note
SENDER and RECIPIENT values can be specified either as an OpenPGP key fingerprint, or as an email address.
# General syntax:
sett encrypt local --signer SENDER --recipient RECIPIENT --dtr DATA_TRANSFER_ID --output OUTPUT_FILENAME_OR_DIRECTORY FILES_OR_DIRECTORIES_TO_ENCRYPT
# Example (long command line options):
sett encrypt local --signer alice@example.com --recipient bob@example.com --dtr 42 --output . ./file_to_encrypt.txt ./directory_to_encrypt
# Example (short command line options; --dtr is kept in its long form):
sett encrypt local -s alice@example.com -r bob@example.com --dtr 42 -o . ./file_to_encrypt.txt ./directory_to_encrypt
Data can be encrypted for more than one recipient by repeating the -r/--recipient option, e.g. -r RECIPIENT1 -r RECIPIENT2:
# In this example, Alice encrypts data for both Bob and Chuck.
sett encrypt local -s alice@example.com -r bob@example.com -r chuck@example.com -o . FILES_OR_DIRECTORIES_TO_ENCRYPT
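When the list of recipients comes from a script or a file, the repeated -r options can be assembled programmatically. The sketch below only prints the resulting command for review rather than running it; the helper name print_encrypt_command is hypothetical, and it assumes that the sender and recipient values contain no whitespace.

```shell
# Hypothetical helper: print a "sett encrypt local" command with one -r per recipient.
# $1: sender, remaining arguments: recipients (assumed to contain no whitespace).
print_encrypt_command() {
    signer="$1"
    shift
    cmd="sett encrypt local -s $signer"
    for recipient in "$@"; do
        cmd="$cmd -r $recipient"
    done
    printf '%s -o . FILES_OR_DIRECTORIES_TO_ENCRYPT\n' "$cmd"
}

# Prints the same command as the two-recipient example above.
print_encrypt_command alice@example.com bob@example.com chuck@example.com
```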
--output is an optional argument for specifying the location and/or name of the encrypted output file. The --output argument can be one of the following:
- Not provided: the encrypted data is written to stdout. This can be useful, e.g., to pipe the data into another application.
- A directory: the encrypted data package file is written to the specified directory, and is given a name that follows the default naming convention in sett.
- A file name: the encrypted data is written to a new file with the specified name, and in the specified directory, if the file name contains one. This overrides the default output file naming scheme.
The local subcommand can be replaced with s3 or sftp to encrypt and transfer data directly to a remote destination in a single command. When using s3 or sftp, the data is encrypted and streamed directly to the destination, without creating any files in the local file system. The streaming approach is faster than creating a data package locally and transferring it separately. It also saves space on the local machine when transferring large datasets.
For more information about remote destination options, please refer to the remote destination options section.
sett encrypt s3 -s alice@example.com -r bob@example.com --endpoint https://minio.my-node.ch --bucket my-project \
--access-key 23VO8RB2SIB2SF8EUL9V --secret-key wvrt7YoTTERGftf0zWnppWYSdcGplNtxuLHMn7op --session-token eyJhbGciOiJ\
IUzUxMiIsInR5cCI6IkpXVCJ9.eyJhY2Nlc3NLZXkiOiI5Vk84UkIyvimUMlKIFUVVTDc3WSIsImF0X7hhc2giOiIyRnVlZ3JmSjhTUWFXYkw2V0puek\
F3IiwiYXVkLjpbIm1pbmlvIl0sImF1dGhfdGltTRI6MTcyMTezODIxMywiZXhwIjoxNzIx0DMxODEzLCJpYXQiOjE3MjE4MzfiLKLsImlzwqI6Imh\
0dHBzOi8vcD3ydGFsLXN0YWasfmcuZGNjLnNpYi5zd2lzcy9hdXRoL38hdXRoIiwibmFtZSI6ImJpd2ciLCJqt5xpY5kiOiJjb25zb2xlQWRfgW4iLC\
JzdWIiOiIxOSJ9.PcvXcAli5Bz8ete1T265TPB1cbfgX7k8NDXU5gXy1nflxq203cG5qwAF9Oxyn1mKmwa87jsHj8HU2VUY9p5S1Q \
FILES_OR_DIRECTORIES_TO_ENCRYPT
Adding the --check option will run the encrypt command in test mode, i.e., checks are made but no data is encrypted or sent.
The data compression algorithm can be changed using the --compression-algorithm flag. The available options are:
- zstandard (default), optimal compression and speed.
- gzip, available for compatibility with older versions of sett.
- stored, no compression.
The data compression level used by sett can be manually adjusted using the --compression-level option. Possible values depend on the selected compression algorithm:
- zstandard (default: 3)
  - 1 (lowest compression, fastest)
  - 21 (highest compression, slowest)
- gzip (default: 6)
  - 1 (lowest compression, fastest)
  - 9 (highest compression, slowest)
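In wrapper scripts it can be convenient to check a compression level against these ranges before calling sett. The following sketch encodes the ranges listed above; the helper name check_compression_level is hypothetical, and treating a level as not applicable to stored is an assumption of this example, not documented sett behavior.

```shell
# Hypothetical helper: succeed if level $2 is within the range listed for algorithm $1.
check_compression_level() {
    algorithm="$1"
    level="$2"
    case "$algorithm" in
        zstandard) min=1; max=21 ;;
        gzip)      min=1; max=9 ;;
        # Assumption: "stored" means no compression, so only an empty level is accepted.
        stored)    [ -z "$level" ]; return ;;
        *)         return 1 ;;
    esac
    # Reject empty or non-numeric levels before comparing.
    case "$level" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$level" -ge "$min" ] && [ "$level" -le "$max" ]
}
```

For example, check_compression_level gzip 9 succeeds, while check_compression_level gzip 10 returns a non-zero status.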
Before encrypting data using the local subcommand and --output flag, sett verifies that there is enough free disk space available on the local machine to save the encrypted output file. If this is not the case, an error message is displayed and the operation is aborted. Since the compression ratio of the input data cannot be known in advance, sett uses the conservative estimate that the minimum disk space required is equal to the total size of all input files to be encrypted.
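The same conservative estimate can be reproduced in a script, for instance to fail early before starting a long-running job. The sketch below is an approximation built on du and df, not sett's own implementation; the helper names are hypothetical, and du reports disk usage rather than exact file sizes.

```shell
# Hypothetical helpers approximating the "free space >= total input size" check.

# Total size (in KiB) of all given files and directories, as reported by du.
input_size_kib() {
    du -sk -- "$@" | awk '{ total += $1 } END { print total + 0 }'
}

# Free space (in KiB) on the file system that holds directory $1.
free_space_kib() {
    df -Pk -- "$1" | awk 'NR == 2 { print $4 }'
}

# $1: output directory, remaining arguments: input files and directories.
# Succeeds if the output file system has at least as much free space as the inputs use.
enough_space() {
    out_dir="$1"
    shift
    [ "$(free_space_kib "$out_dir")" -ge "$(input_size_kib "$@")" ]
}
```

A call such as enough_space . ./file_to_encrypt.txt ./directory_to_encrypt returns a non-zero status when the estimate exceeds the available space.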
To automate the encryption process, you can use environment variables to store the OpenPGP password.
sett performs DTR verification if the --verify option is passed. For non-BioMedIT-related transfers, the --verify option should not be passed.
BioMedIT
A valid DTR ID must be specified via the --dtr option.
Decrypting files
Decrypt and decompress encrypted data packages.
Please note that the decryption process includes the verification of the sender’s signature, which ensures the authenticity of the data. Verification of the sender’s signature is only possible if the sender’s public key is available in your local keyring. For security reasons, the sender’s key will not be automatically downloaded from a key server; it must be downloaded/imported manually (for more information see download public keys).
To decrypt data, you must therefore have in your local keyring:
- The private key for which the data was encrypted. In principle, this is your own private key. This key will be used to decrypt the data.
- The data sender’s public key. This key is used for signature verification purposes.
Only files that follow the sett packaging specification can be decrypted with sett.
Decrypting data (GUI)
To decrypt and decompress a file:
- Go to the Decrypt tab.
- Select the source from which you wish to decrypt data: either local or s3.
- When decrypting from local, select a data package to decrypt with Select Package, or drag and drop the file into sett. When decrypting from s3, a number of options must be specified (please refer to remote destination options for details). After doing so, click on Load package.
- Select destination directory: select a location where to decrypt/decompress the file.
  By default, output files are saved to the user’s home directory. This default behavior can be changed by changing Default output directory in the Settings tab.
- Click Decrypt package to start the decryption and decompression process. A pop-up dialog box will appear to ask for the password associated with the OpenPGP key used to encrypt the files.
Decrypting data (CLI)
Use the decrypt subcommand to decrypt and decompress data:
# General syntax:
sett decrypt local --output OUTPUT_DIRECTORY ENCRYPTED_FILES.zip
# Example:
sett decrypt local --output /home/alice/data/unpack_dir /home/alice/data/test_data.zip
The local subcommand can be replaced with s3 to decrypt a data package from an S3 object store. When using s3, the data is streamed and decrypted directly from the S3 object store, without creating any temporary files in the local file system.
To decrypt data without decompressing it, add the -d, --decrypt-only option.
If the -o, --output option is omitted, the data is decrypted in the current working directory.
To automate the decryption process, you can use environment variables to store the OpenPGP password.
Remote destination options
Both GUI and CLI require specific parameters when transferring data to a remote destination. The parameters are different depending on the destination type.
- S3: A remote S3 compatible object store. When using this destination, a number of options must be specified:
  - URL: The URL of the S3 object store.
  - Bucket: The name of the bucket where the encrypted data should be stored.
  - Access key: The access key to use to authenticate with the S3 object store. It is also possible to use a username instead.
  - Secret key: The secret key to use to authenticate with the S3 object store. It is also possible to use a password instead.
  - Session token: The session token to use to authenticate with the S3 object store. It is optional and only required when authenticating using temporary credentials (STS), which is the case for most users interacting with the BioMedIT infrastructure.
  BioMedIT
  This information must be retrieved from BioMedIT Portal by a data provider data engineer. Please refer to the dedicated section of the Portal user guide for instructions on how to do so. Portal allows copying the credentials to the clipboard, so that they can be pasted into sett by using the Paste from clipboard button.
- SFTP: A remote SFTP server. When using this destination, a number of options must be specified:
  - Host: URL address of the server where the files should be sent.
  - User name: the user name with which to connect to the SFTP server.
  - Destination directory: absolute path of the directory where files should be saved on the server.
  - SSH key location: path of the private SSH key used for authentication to the SFTP server. This is only required if the SSH key is in a non-standard location. If missing, sett will use the SSH agent to provide the key.
    Do not confuse SSH keys, which are used to authenticate yourself when connecting to an SFTP server during file transfer, with OpenPGP keys, which are used to encrypt and sign data.
  - SSH key password: password associated with the private SSH key given under SSH key location. If your SSH key password contains non-ASCII characters and this results in an error, please see the SSH private key with non-ASCII characters section of this guide.
  BioMedIT
  For BioMedIT users, the SFTP connection parameters User name, Host URL, and Destination directory will be provided by your local BioMedIT node.