Encrypting, transferring and decrypting data with sett
Encrypting files
The sett application allows the encryption of any combination of individual files and directories.
The files are first compressed into a single data.tar.gz archive, which is then encrypted with the public key of one or more recipient(s), and signed with the sender’s key. The encrypted data (data.tar.gz.gpg) is then bundled with a metadata file - a plain text file that contains information about who is sending the file and to whom it should be delivered - into a single .zip file.
The specifications of the output .zip files produced by sett are described in the sett packaging specifications section.
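The container layout described above (compress, encrypt, bundle) can be sketched in Python as follows. This is only an illustration of the layout, not sett's implementation: the encryption and signing step is a caller-supplied hook (`encrypt` is hypothetical and no PGP operation is performed here), and the metadata file name `metadata.json` and its two fields are assumptions made for this example. Refer to the sett packaging specifications for the actual format.

```python
import io
import json
import tarfile
import zipfile
from pathlib import Path
from typing import Callable, Sequence


def build_package(
    inputs: Sequence[Path],
    sender: str,
    recipients: Sequence[str],
    encrypt: Callable[[bytes], bytes],
    out_path: Path,
) -> None:
    """Compress inputs, pass the archive to `encrypt`, and bundle the result."""
    # Step 1: compress all inputs into one in-memory tar.gz archive.
    archive_buffer = io.BytesIO()
    with tarfile.open(fileobj=archive_buffer, mode="w:gz") as archive:
        for item in inputs:
            archive.add(item, arcname=item.name)

    # Step 2: encrypt and sign. Hypothetical hook: the caller supplies the
    # actual PGP operation, none is implemented in this sketch.
    encrypted = encrypt(archive_buffer.getvalue())

    # Step 3: bundle the encrypted data and a metadata file into one zip file.
    # The metadata file name and fields are assumptions for illustration.
    metadata = {"sender": sender, "recipients": list(recipients)}
    with zipfile.ZipFile(out_path, "w") as package:
        package.writestr("metadata.json", json.dumps(metadata, indent=2))
        package.writestr("data.tar.gz.gpg", encrypted)
```

The sketch buffers the whole archive in memory to stay short; a real tool handling large data sets would stream the data instead.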
sett supports multi-recipient data encryption. This allows the encrypted file to be decrypted by multiple recipients.
sett also ensures the integrity of the transferred files by computing checksums on each file that is packaged, and adding this information to the encrypted data. The integrity of each file is verified automatically upon decryption of the file by sett, providing the guarantee that all files were transferred flawlessly.
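As an illustration of per-file integrity checking, the following Python sketch records a checksum for every file in a directory and later reports the files whose content no longer matches. It is a minimal sketch, not sett's code: the choice of SHA-256 and the function names `build_manifest` and `verify_manifest` are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file below `root` (relative POSIX path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths whose content is missing or has changed."""
    return [
        name
        for name, expected in manifest.items()
        if not (root / name).is_file() or sha256_of(root / name) != expected
    ]
```

An empty list returned by `verify_manifest` means that every file listed in the manifest is present and unchanged.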
BioMedIT
Data Transfer Requests: each data transfer into the BioMedIT network must have an authorized Data Transfer Request ID (DTR ID). This ID must be specified at the time the data is encrypted (see below). The ID is added to the encrypted file’s metadata information by sett. A valid and authorized DTR ID value is mandatory for any data transfer into the BioMedIT network. Non-compliant packages will be rejected.
Recipients: each data transfer into the BioMedIT network must be to a recipient assigned to the role of Data Manager for the given project. The recipient’s PGP key must also be approved by the BioMedIT key validation authority. If these conditions are not met, sett will not encrypt the data.
Output file naming scheme
By default, encrypted output files produced by sett are named after the pattern:
<project code>_<YYYYMMDD>T<HHMMSS>.zip
where:
- <project code> is the abbreviation/code associated with the project. If no DTR ID value was provided or if Verify DTR is disabled, no project code is added as a prefix to the output file name.
- <YYYYMMDD> is the current date (Year, Month, Day).
- <HHMMSS> is the current time (Hours, Minutes, Seconds).
Example: demo_20220211T143311.zip, where demo is the project code.
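The naming pattern can be reproduced with a few lines of Python. This is a sketch for illustration only: the function name `output_file_name` is hypothetical, and the assumption that a missing project code also removes the underscore separator is ours, since the text above only states that no prefix is added.

```python
from datetime import datetime
from typing import Optional


def output_file_name(project_code: Optional[str], now: datetime) -> str:
    """Build a file name following <project code>_<YYYYMMDD>T<HHMMSS>.zip."""
    timestamp = now.strftime("%Y%m%dT%H%M%S")
    if project_code:
        return f"{project_code}_{timestamp}.zip"
    # Assumption: without a project code, only the timestamp is used.
    return f"{timestamp}.zip"


print(output_file_name("demo", datetime(2022, 2, 11, 14, 33, 11)))
# prints demo_20220211T143311.zip
```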
When using the sett command line to encrypt to the local file system, it is possible to completely override the above output file naming scheme by passing the -o, --output option. Overriding the naming scheme is not possible when using sett-gui or when encrypting to a remote S3 or SFTP destination.
Encrypting data with sett-gui
To encrypt data:
-
Go to the Encrypt tab of the sett application.
-
Select files and/or directories to encrypt: using the Add file and Add directory buttons, select at least one file or directory to encrypt.
Once added, files and directories are listed in the top box of the tab (see figure above).
-
Select data sender: in the drop-down list found under Select sender, select your own PGP key (you are the data sender). For most users, there should in principle be only one key in the Sender drop-down menu: their own key.
Note
The Sender key is used to sign the encrypted data, so that the recipient(s) can be confident that the data they receive is genuine.
-
Select data recipients: add one or more recipients by selecting them from the drop-down list found under Select recipients. Recipients are the people for whom data should be encrypted: their public PGP key will be used to encrypt the data, and only they will be able to decrypt it.
BioMedIT
Only recipients assigned to the role of Data Manager of the project for which data is being encrypted are permitted as data recipients.
-
Transfer ID (DTR): Data Transfer Request ID associated with the data package that is being encrypted. Specifying a valid DTR ID is mandatory to transfer data into the BioMedIT network.
For data not intended to be transferred into the BioMedIT network, the DTR ID field can be left empty (or set to any arbitrary value). In this case, Verify package must be disabled (in the Settings tab).
BioMedIT
The DTR ID field is mandatory. Only files encrypted with a valid and authorized DTR ID value can be transferred into the secure BioMedIT network. For this reason, BioMedIT users should always leave the Verify DTR checkbox enabled.
-
Select destination: destination where to encrypt the data. Possible destinations are:
-
Local: The local file system. When using this destination, it’s possible to specify:
- Output location: directory where the encrypted file should be saved. By default, output files are saved to the user’s home directory. This default behavior can be changed by changing Default output directory in the Settings tab.
-
S3: A remote S3 compatible object store.
-
SFTP: A remote SFTP server.
When encrypting to a remote destination, a number of options must be specified. Please refer to the Transferring files section for details about these options.
-
You are now ready to compress and encrypt the data: click Encrypt package or Send package if you are encrypting to a remote S3 or SFTP destination. A pop-up will appear, asking for the password associated with the sender’s key. After the password is entered, data compression and encryption will start. Progress and error messages are displayed in the Tasks tab.
When the encryption completes successfully, a notification pops up with a message that reads: “Encryption job finished”.
At this point, all input files are compressed, encrypted and bundled into a single .zip file. If the destination was SFTP or S3, data has also been transferred to the remote destination.
Encrypting data on the command line
The sett command to encrypt data is the following. Note that the SENDER and RECIPIENT values can be specified either as a PGP key fingerprint, or as an email address.
# General syntax:
sett encrypt local --signer SENDER --recipient RECIPIENT --dtr DATA_TRANSFER_ID --output OUTPUT_FILENAME_OR_DIRECTORY FILES_OR_DIRECTORIES_TO_ENCRYPT
# Example:
# long command line options:
sett encrypt local --signer alice@example.com --recipient bob@example.com --dtr 42 --output test_output ./test_file.txt ./test_directory
# short command line options:
sett encrypt local -s alice@example.com -r bob@example.com --dtr 42 -o test_output ./test_file.txt ./test_directory
Data can be encrypted for more than one recipient by repeating the --recipient option, e.g. --recipient RECIPIENT1 --recipient RECIPIENT2:
# In this example, Alice encrypts a set of files for both Bob and Chuck.
sett encrypt local --signer alice@example.com --recipient bob@example.com --recipient chuck@example.com FILES_OR_DIRECTORIES_TO_ENCRYPT
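When the encrypt command is assembled from a script, the --recipient option has to be repeated once per recipient. The following Python sketch builds such an argument list using only the options shown above (--signer, --recipient, --dtr, --output). The helper name `encrypt_command` is hypothetical, and the resulting list is only constructed here, not executed.

```python
import shlex
from typing import Optional, Sequence


def encrypt_command(
    signer: str,
    recipients: Sequence[str],
    files: Sequence[str],
    dtr_id: Optional[int] = None,
    output: Optional[str] = None,
) -> list[str]:
    """Build the argument list for `sett encrypt local`."""
    if not recipients:
        raise ValueError("at least one recipient is required")
    if not files:
        raise ValueError("at least one file or directory is required")
    argv = ["sett", "encrypt", "local", "--signer", signer]
    # The --recipient option is repeated once for each recipient.
    for recipient in recipients:
        argv += ["--recipient", recipient]
    if dtr_id is not None:
        argv += ["--dtr", str(dtr_id)]
    if output is not None:
        argv += ["--output", output]
    return argv + list(files)


command = encrypt_command(
    "alice@example.com",
    ["bob@example.com", "chuck@example.com"],
    ["./test_file.txt", "./test_directory"],
)
print(shlex.join(command))
```

The list can then be passed to `subprocess.run(command, check=True)`, which avoids shell quoting issues with file names that contain spaces.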
In the commands above, local can be replaced with s3 or sftp to encrypt data to a remote destination.
Adding the --check option will run the encrypt command in test mode, i.e. checks are made but no data is encrypted.
The data compression level used by sett can be manually adjusted using the --compression-level option. Compression level values must be integers between 0 (no compression) and 9 (highest compression). Higher compression levels produce smaller output files but require more computing time, so you may choose a lower level to speed up compression (e.g. --compression-level=1), or a higher level (e.g. --compression-level=9) to produce smaller output files. The default level is 6.
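The effect of the compression level can be explored with Python's standard gzip module, which uses the same 0 to 9 scale. This is only an illustration: that sett's internal compressor behaves exactly like Python's gzip module is an assumption, and the helper name `compressed_sizes` is hypothetical.

```python
import gzip


def compressed_sizes(data: bytes, levels=(0, 1, 6, 9)) -> dict[int, int]:
    """Return the gzip-compressed size of `data` for each compression level."""
    return {
        level: len(gzip.compress(data, compresslevel=level)) for level in levels
    }


# Highly repetitive sample data compresses well at any level above 0.
sample = b"sett example payload\n" * 5000
for level, size in compressed_sizes(sample).items():
    print(f"level {level}: {size} bytes (input: {len(sample)} bytes)")
```

At level 0 the data is stored without compression, so the output is slightly larger than the input because of the gzip header and trailer.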
Before encrypting data, sett verifies that there is enough free disk space available on the local machine to save the encrypted output file. The location checked is the current working directory, or the target directory given by --output. If there is not enough free space, an error message is displayed and the operation is aborted.
Since the compression ratio of the input data cannot be known in advance, sett uses the conservative estimate that the minimum disk space required is equal to the total size of all input files to be encrypted.
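The conservative estimate described above can be sketched in Python as follows: the required space is taken to be the total size of all input files, and it is compared with the free space of the target directory. The function names `total_input_size` and `ensure_free_space` are hypothetical and this is not sett's code.

```python
import shutil
from pathlib import Path
from typing import Sequence


def total_input_size(paths: Sequence[Path]) -> int:
    """Sum the sizes in bytes of all files, descending into directories."""
    total = 0
    for path in paths:
        if path.is_dir():
            total += sum(p.stat().st_size for p in path.rglob("*") if p.is_file())
        else:
            total += path.stat().st_size
    return total


def ensure_free_space(paths: Sequence[Path], target_dir: Path) -> None:
    """Raise OSError if `target_dir` has less free space than the inputs need."""
    required = total_input_size(paths)
    available = shutil.disk_usage(target_dir).free
    if available < required:
        raise OSError(
            f"not enough disk space in {target_dir}: "
            f"{required} bytes required, {available} bytes available"
        )
```

Because compressed output is normally no larger than its input, this estimate errs on the side of refusing an operation that might have succeeded, rather than failing halfway through.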
To automate the encryption process, the --password option can be used to specify the password of the signer's PGP key.
sett performs DTR verification if the --verify option is passed. For non-BioMedIT-related transfers, the --verify option should not be passed.
BioMedIT
A valid DTR ID must be specified via the --dtr option.
To override the sett output file naming scheme, the --output option can be used to specify the path and name that the output file should have.
Transferring files
Data packages can be transferred to remote servers that support one of the following protocols:
- SFTP
- S3 object storage
Important
Only files encrypted with sett, or files that follow the sett packaging specifications, can be transferred using sett.
Transferring files with sett-gui
To transfer encrypted files:
-
Go to the Transfer tab of the sett application.
-
Select encrypted files to transfer: click Select file and select a .zip file that was generated using the sett application.
-
Select destination: destination where to transfer the data. Possible destinations are:
-
S3: A remote S3 compatible object store. When using this destination a number of options must be specified:
- URL: The URL of the S3 object store;
- Bucket: The name of the bucket where the encrypted data should be stored;
- Access key: The access key to use to authenticate with the S3 object store;
- Secret key: The secret key to use to authenticate with the S3 object store;
- Session token: The session token to use to authenticate with the S3 object store.
BioMedIT
This information must be retrieved from BioMedIT Portal by a data provider data engineer. Please refer to the dedicated section of the Portal user guide for instructions on how to do so. Portal allows copying the credentials to the clipboard, so that they can be pasted into sett by using the Paste from clipboard button.
-
SFTP: A remote SFTP server. When using this destination, a number of options must be specified:
-
Host: URL address of the server where the files should be sent.
-
User name: the user name with which to connect to the SFTP server.
-
Destination directory: absolute path of directory where files should be saved on the server.
-
SSH key location: name and full path of the private SSH key used for authentication to the SFTP server. This is only required if the SSH key is in a non-standard location. Only RSA keys are accepted.
Do not confuse SSH keys - which are used to authenticate yourself when connecting to an SFTP server during file transfer - with PGP keys - which are used to encrypt and sign data.
-
SSH key password: password associated with the private SSH key given under SSH key location. If your SSH key password contains non-ASCII characters and this results in an error, please see the SSH private key with non-ASCII characters section of this guide.
BioMedIT
For BioMedIT users, the SFTP connection parameters User name, Host URL, and Destination directory will be provided by your local BioMedIT node.
-
You are now ready to transfer the data. Click Send package and follow the progress of the transfer using the Tasks tab. By default, sett verifies data packages before initializing file transfers. These checks are required within the BioMedIT network, but can be skipped in other contexts by disabling the Verify package checkbox in the Settings tab.
Transferring data on the command line
sett command to transfer data:
# General syntax:
sett transfer sftp --host HOST --username USERNAME --base-path DESTINATION_DIRECTORY --key-path SSH_KEY_LOCATION --key-pwd SSH_KEY_PASSWORD FILES_TO_TRANSFER
sett transfer s3 --endpoint ENDPOINT --bucket BUCKET --access-key ACCESS_KEY --secret-key SECRET_KEY FILES_TO_TRANSFER
# Example:
sett transfer s3 --endpoint https://my-node.ch --bucket my-project --access-key pub --secret-key sec encrypted_data.zip
Adding the --check option will run the transfer command in test mode, i.e. checks are made but no data is transferred.
For SFTP transfers, an SSH key is required for authentication on the host server. The private SSH key can be provided via 2 mechanisms:
- Specify the location of the key via the --key-path option.
- Use an SSH agent to provide the key. The SSH agent is automatically detected by sett and no specific input from the user is needed. In this case, the --key-path argument should be omitted.
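The two authentication mechanisms translate into two shapes of the same command: with --key-path when a key file is given, and without it when an SSH agent provides the key. The Python sketch below builds the argument list for both cases using only options shown in this section. The helper name `sftp_transfer_command` is hypothetical, and placing --check before the file arguments is an assumption about option ordering.

```python
from typing import Optional, Sequence


def sftp_transfer_command(
    host: str,
    username: str,
    base_path: str,
    files: Sequence[str],
    key_path: Optional[str] = None,
    check: bool = False,
) -> list[str]:
    """Build the argument list for `sett transfer sftp`."""
    argv = [
        "sett", "transfer", "sftp",
        "--host", host,
        "--username", username,
        "--base-path", base_path,
    ]
    # With key_path=None the option is left out, so that the key is
    # expected to come from a running SSH agent.
    if key_path is not None:
        argv += ["--key-path", key_path]
    if check:
        # Test mode: checks are made but no data is transferred.
        argv.append("--check")
    return argv + list(files)
```

For example, `sftp_transfer_command("HOST", "USERNAME", "DESTINATION_DIRECTORY", ["encrypted_data.zip"])` yields a command that relies on the SSH agent, while passing `key_path` adds the --key-path option and its value.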
To display help for the transfer command: sett transfer --help. To display help for a specific transfer protocol - e.g. s3: sett transfer s3 --help.
Decrypting files
The sett application allows the decryption and decompression of files in a single step. However, only files encrypted with the sett application, or files that follow the sett packaging specifications can be decrypted with sett.
Decrypting data with sett-gui
To decrypt and decompress a file:
-
Go to the Decrypt tab of the sett application (see figure below).
-
Select a file to decrypt with Select file.
-
Select destination directory: select a location where to decrypt/decompress the file.
By default, output files are saved to the user’s home directory. This default behavior can be changed by changing Default output directory in the Settings tab.
-
Click Decrypt package to start the decryption/decompression process. A pop-up dialog box will appear to ask for the password of the recipient's PGP key, i.e. the private key matching the public key that was used to encrypt the files.
Decrypting data on the command line
sett command to decrypt data:
# General syntax:
sett decrypt --output OUTPUT_DIRECTORY ENCRYPTED_FILES.zip
# Example:
sett decrypt --output /home/alice/data/unpack_dir /home/alice/data/test_data.zip
To display help for the decrypt command: sett decrypt --help.
To decrypt data without decompressing it, add the -d, --decrypt-only option.
If the -o, --output option is omitted, the data is decrypted in the current working directory.
If you want to automate the decryption process, you can use the -p, --password option to provide the password of the recipient PGP key.
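The decrypt options described above combine as in the following Python sketch, which builds the argument list for a script. The helper name `decrypt_command` is hypothetical. Only the --output and --decrypt-only options are handled, and the password option is deliberately left out of the sketch.

```python
from typing import Optional, Sequence


def decrypt_command(
    packages: Sequence[str],
    output: Optional[str] = None,
    decrypt_only: bool = False,
) -> list[str]:
    """Build the argument list for `sett decrypt`."""
    if not packages:
        raise ValueError("at least one package is required")
    argv = ["sett", "decrypt"]
    # Without --output, data is decrypted in the current working directory.
    if output is not None:
        argv += ["--output", output]
    # With --decrypt-only, data is decrypted but not decompressed.
    if decrypt_only:
        argv.append("--decrypt-only")
    return argv + list(packages)


print(decrypt_command(["/home/alice/data/test_data.zip"],
                      output="/home/alice/data/unpack_dir"))
```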