Encrypting, transferring, and decrypting data
The following instructions outline the process of encrypting, transferring, and decrypting data.
sett offers 3 types of interfaces:
- sett GUI: a graphical user interface.
- sett CLI: a command-line interface.
- sett TUI: a terminal user interface, i.e. an interactive user interface displayed in the terminal. The TUI is accessed by running the CLI application without passing any arguments:
sett
sett help
In the CLI, each subcommand includes a help message with available options. Use -h for a short description or --help for detailed explanations.
sett --help
sett encrypt local -h
sett encrypt local --help
Encrypting and sending data
General principles
sett allows the encryption of any combination of individual files and
directories. The files are first compressed into a single data.tar.gz
archive, which is then encrypted with the public key of one or more
recipient(s), and signed with the sender’s key.
The encrypted data (data.tar.gz.gpg) is then bundled with a metadata file (a plain text file that contains information about who is sending the file and to whom it should be delivered) into a single .zip file, hereafter referred to as a data package.
sett ensures the integrity of the transferred files by computing checksums on each file that is packaged, and adding this information to the encrypted data. The integrity of each file is verified automatically upon decryption of the data package by sett, providing the guarantee that all files were transferred flawlessly.
Glossary
Some useful definitions to know when using sett:
- Data Package: .zip file produced by sett when encrypting and/or transferring data. A data package contains the encrypted data, a metadata file, as well as the data sender’s signature information. For more details, please refer to the sett data package specification.
- Data Sender: person who is encrypting and transferring data. sett uses the data sender’s private OpenPGP key to sign the encrypted data so that data recipients can be confident in who created and sent the data package. A data package can only have one data sender.
  For more details on how data signing works, please see the introduction to OpenPGP section of this guide.
- Data Recipient(s): person(s) for whom data is encrypted. sett uses the public OpenPGP key of the recipient(s) to encrypt data, and as a result only they can decrypt the data package. sett supports multi-recipient data encryption. The public keys of all data recipients must be available in the sender’s local keystore.
  For more details on how data encryption works, please see the introduction to OpenPGP section of this guide.
BioMedIT
BioMedIT users should be aware of these additional constraints that apply when transferring data into the BioMedIT network:
- Data Transfer Request ID (DTR ID): ID number (numeric value) that uniquely identifies a Data Transfer Request. A valid and authorized DTR ID must be specified each time that data is being encrypted and transferred into the BioMedIT network. The DTR ID is added to the data package’s metadata information by sett. Non-compliant packages will be rejected.
  Note that a Data Transfer Request can allow the transfer of multiple data packages, and therefore (if the DTR allows it), the same DTR ID can be used for multiple data transfers.
- Data Sender: to transfer data into the BioMedIT network, a data sender must be assigned the role of Data Provider Data Engineer for the given project. The sender’s OpenPGP key must also be approved by the BioMedIT key validation authority. If these conditions are not met, sett will not encrypt the data.
- Data Recipient(s): each data transfer into the BioMedIT network must be to a recipient assigned to the role of Data Manager for the given project. The recipient’s OpenPGP key must also be approved by the BioMedIT key validation authority. If these conditions are not met, sett will not encrypt the data.
Output file naming scheme
By default, encrypted output files produced by sett are named after the pattern:
<project code>_<YYYYMMDD>T<HHMMSS>.zip
where:
- <project code> is the abbreviation/code associated with the project. If Verify package is disabled, no project code is added as a prefix to the output file name.
- <YYYYMMDD> is the current date (Year, Month, Day).
- <HHMMSS> is the current time (Hours, Minutes, Seconds).
Example: demo_20220211T143311.zip, where demo is the project code.
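The naming pattern can be reproduced in a shell script, for instance to predict the name of a package before it is created. The following is only a sketch and not part of sett: the function name default_package_name is hypothetical, and it assumes that the timestamp is taken from the local clock (this guide does not state which time zone sett uses).

```shell
# Hypothetical helper (not part of sett): builds a file name that follows the
# documented pattern <project code>_<YYYYMMDD>T<HHMMSS>.zip.
# Assumption: the timestamp is taken from the local clock.
default_package_name() {
    project_code="$1"
    timestamp="$(date +%Y%m%dT%H%M%S)"
    if [ -n "$project_code" ]; then
        printf '%s_%s.zip\n' "$project_code" "$timestamp"
    else
        # No project code available: the name has no prefix.
        printf '%s.zip\n' "$timestamp"
    fi
}

# Prints a name of the form demo_<YYYYMMDD>T<HHMMSS>.zip
default_package_name demo
```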
When encrypting to the local file system with the sett command line interface, the default output file naming scheme can be overridden by passing the -o/--output option.
Overriding the naming scheme is not possible when using the GUI or TUI interfaces, or when encrypting to a remote S3 or SFTP destination.
Output and input locations
sett can write (encrypt and compress) data packages to 3 different destinations:
- Local disk: data is compressed and encrypted to a .zip data package on the local machine running sett. Such a data package can then be transferred (also using sett) in a second step.
  Before encrypting data to the local disk, sett verifies that there is enough free disk space available on the local machine to save the encrypted output. If this is not the case, an error is raised. Since the compression ratio of the input data cannot be known in advance, sett uses the conservative estimate that the minimum disk space required is equal to the total size of all input files to be encrypted.
- S3-compatible object store: an S3 object store is a remote server dedicated to data storage (i.e. a “cloud” storage service). When encrypting data to such a destination, the data is both packaged (compressed and encrypted) and transferred in a single step. This is both faster (encryption/compression and transfer happen in parallel), and avoids duplicating data on the local machine (the packaged data does not need to be temporarily stored, as it is directly streamed to its destination S3 object store).
- SFTP server: just like S3, an SFTP server is a “cloud” storage service (it simply uses a different protocol).
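The free-space estimate described for the local disk destination can be approximated in a script before sett is even started. The sketch below is not sett's own implementation: the function name has_enough_space is hypothetical, and du reports disk usage in KiB, which may differ slightly from the exact file sizes that sett sums up.

```shell
# Hypothetical pre-flight check (not sett's own code): succeeds if the file
# system holding DEST_DIR has at least as much free space as the inputs use.
# Usage: has_enough_space DEST_DIR INPUT...
has_enough_space() {
    dest_dir="$1"
    shift
    # Total size of all inputs in KiB; du prints one summary line per argument.
    required_kb="$(du -sk "$@" | awk '{ total += $1 } END { print total + 0 }')"
    # Free space in KiB: 4th column of the second line of POSIX-format df output.
    available_kb="$(df -Pk "$dest_dir" | awk 'NR == 2 { print $4 }')"
    [ "$available_kb" -ge "$required_kb" ]
}

# Example with a small temporary input file.
example_dir="$(mktemp -d)"
printf 'example data\n' > "$example_dir/input.txt"
if has_enough_space "$example_dir" "$example_dir/input.txt"; then
    echo "enough free space"
else
    echo "not enough free space" >&2
fi
rm -r "$example_dir"
```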
sett can read (decrypt and decompress) data packages from 2 different locations:
- Local disk: data is decrypted and decompressed from a .zip data package located on the machine running sett.
- S3-compatible object store: as for data writing, decrypting from an S3 object store offers the advantage that the .zip package is downloaded, decrypted, and decompressed in a single stream. No intermediate files are created.
Authenticated mode
Authenticated mode refers to a mode where sett users authenticate with the BioMedIT Portal service. This mode is therefore only available to BioMedIT users.
BioMedIT
This section is particularly relevant for BioMedIT users; please read it carefully if you are such a user.
For BioMedIT users, encrypting, transferring or decrypting data in authenticated mode is particularly convenient, as it allows sett to automatically retrieve information about the data transfers associated with a user and project (e.g. S3 connection information and credentials). Users no longer have to provide this information manually, which greatly streamlines their user experience.
BioMedIT users are therefore strongly encouraged to use authenticated mode for all their encryption, transfer or decryption operations.
More details on the authenticated mode can be found in its dedicated documentation section.
Automating the sett CLI
Most of the sett CLI usage does not require any interactive input from the user, except for the following two aspects:
- Input of OpenPGP key passwords.
- Authentication with the BioMedIT portal when running commands in authenticated mode, e.g. sett encrypt s3-portal.
However, the sett CLI can be fully automated (no interactive input needed at any point) with the following settings.
OpenPGP key passwords
One of the following environment variables can be used to automatically pass the password of a secret OpenPGP key used to decrypt or sign data:
- SETT_OPENPGP_KEY_PWD: password to unlock the secret key.
- SETT_OPENPGP_KEY_PWD_FILE: full path and name of a file containing the password to unlock the secret key.
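As an illustration of the file-based variant, the sketch below writes a placeholder password to a file that only the current user can read and exports its path. The password value is a placeholder, and the assumption that the file should contain only the password, without a trailing newline, is not confirmed by this guide. In practice the file would typically be provisioned by a secrets manager rather than written from a script.

```shell
# Sketch: store the key password in a file without group/other permissions,
# then point sett at it through SETT_OPENPGP_KEY_PWD_FILE.
# 'example-password' is a placeholder. Assumption: the file contains only the
# password, with no trailing newline.
umask 077
pwd_file="$(mktemp)"
printf '%s' 'example-password' > "$pwd_file"
export SETT_OPENPGP_KEY_PWD_FILE="$pwd_file"

# sett commands started from this shell can now read the password from the
# file instead of prompting for it, e.g.:
#   sett decrypt local --output OUTPUT_DIRECTORY ENCRYPTED_FILES.zip
# Remove the file once it is no longer needed:
#   rm -f "$pwd_file"
```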
Authentication with Portal using a Personal Access Token (PAT)
Automated authentication with the BioMedIT portal can be done by passing a Personal Access Token (PAT) via either the SETT_PORTAL_PAT environment variable, or the --pat command line option.
When this variable/option is set to a valid PAT, sett CLI automatically authenticates with Portal, skipping the need for interactive (browser-based) login.
BioMedIT users can generate a PAT in their profile on the BioMedIT portal.
Security warning
PATs are sensitive data and should be protected like passwords.
You should also be aware that using the --pat command line option in an interactive shell session will leak the PAT to your shell’s history (i.e. someone could retrieve the PAT simply by looking at your command line history).
Most shells have settings that allow turning off the logging of command history. Here are some suggestions of how to do so for the bash shell:
- Turn off history recording for the entire shell session:
  unset HISTFILE
- Temporarily turn off history recording:
  # Turn off shell history recording.
  set +o history
  # Run your commands. They will not be saved to your history file.
  sett encrypt --pat pat-vy0TY_lQNT3Yg-kw6rka5FGea2rjXx7RToHDZ_xnKsw ...
  # Turn history recording back on.
  set -o history
Example: passing a PAT via the SETT_PORTAL_PAT environment variable.
# Temporarily turn off shell history recording while setting the environment
# variable that stores the PAT.
set +o history
export SETT_PORTAL_PAT=pat-vy0TY_lQNT3Yg-kw6rka5FGea2rjXx7RToHDZ_xnKsw
set -o history
sett encrypt s3-portal \
--signer SIGNER_KEY --recipient RECIPIENT_KEY --dtr DATA_TRANSFER_ID \
FILES_OR_DIRECTORIES_TO_ENCRYPT
Example: passing a PAT via the --pat command line option.
# Turn off history recording to avoid leaking the PAT.
unset HISTFILE
sett encrypt s3-portal \
--signer SIGNER_KEY --recipient RECIPIENT_KEY \
--dtr DATA_TRANSFER_ID \
--pat pat-vy0TY_lQNT3Yg-kw6rka5FGea2rjXx7RToHDZ_xnKsw \
FILES_OR_DIRECTORIES_TO_ENCRYPT
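Another way to keep the PAT out of the shell history is to never type it as part of a command, and to read it from standard input instead. The following sketch is a generic shell technique and not a sett feature; the function name read_pat is hypothetical.

```shell
# Hypothetical helper (not part of sett): reads a PAT from standard input and
# exports it as SETT_PORTAL_PAT, so that the token is never part of a command
# line and therefore never saved to the shell history.
read_pat() {
    # Prompt and disable terminal echo only when stdin is an interactive terminal.
    if [ -t 0 ]; then
        printf 'Personal Access Token: ' >&2
        stty -echo
    fi
    IFS= read -r SETT_PORTAL_PAT
    if [ -t 0 ]; then
        stty echo
        printf '\n' >&2
    fi
    export SETT_PORTAL_PAT
}

# Usage (interactive session):
#   read_pat
#   sett encrypt s3-portal --signer SIGNER_KEY --recipient RECIPIENT_KEY \
#       --dtr DATA_TRANSFER_ID FILES_OR_DIRECTORIES_TO_ENCRYPT
```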
Encrypting and sending data in GUI
BioMedIT
Authenticated mode
For BioMedIT users, sett provides an authenticated mode that simplifies the encryption and transfer process by:
- Providing a pre-populated list of available Data Transfer Requests (DTRs) for which the authenticated user is an authorized data provider.
- Automatically fetching the required destination parameters and credentials (only available for S3 destinations).
To enable this mode, go to the Profile tab and click “Sign in”. You will be redirected to the BioMedIT authentication system. After successful authentication, proceed to the Encrypt and Transfer Data tab.
In the Encrypt and Transfer Data tab, complete the following:
- Data to encrypt and/or transfer: using the Files and Folders buttons, select at least one file or directory to encrypt. Alternatively, files and directories can also be dragged and dropped into the sett application.
  Note
  If you already have an encrypted data package in your local file system, you can select it with sett Package (or drag and drop it into sett), then choose a remote destination (continue reading at Destination).
- Sender: select your own OpenPGP key (you are the data sender). For most users, there should be only one key in the Sender drop-down menu: their own key.
  Note
  The Sender key is used to sign the encrypted data, so that the recipient(s) can be confident that the data have been sent by a trusted source. If you do not have an OpenPGP key yet, you can generate one by going to the OpenPGP Keys tab.
- Recipients: add one or more recipients by selecting them from the drop-down list. Recipients are people for whom data should be encrypted: their public OpenPGP key will be used to encrypt the data, and only they will be able to decrypt it.
  Note
  If your recipient’s key is not listed, you will first need to download it (go to the OpenPGP Keys tab).
  BioMedIT
  Only recipients assigned to the role of Data Manager of the project for which data is being encrypted are permitted as data recipients.
- Data Transfer ID: Data Transfer Request ID associated with the data package that is being encrypted.
  For data not intended to be transferred into the BioMedIT network, the Data Transfer ID field is optional and can be left empty or set to any arbitrary value. For non-BioMedIT transfers, the Verify package checkbox must also be disabled in the Settings tab.
  BioMedIT
  Specifying a valid Data Transfer ID is mandatory to transfer data into the BioMedIT network. For this reason, BioMedIT users should always leave the Verify package checkbox enabled in the Settings tab.
- Extra Metadata: this input is only visible when Enable extra metadata is enabled in the Settings tab.
  It allows additional metadata to be added to the data package in the form of key-value pairs. Extra metadata are stored in the metadata file of the data package.
  After inserting the key and value, click the + button (or hit “Enter”) to add the new entry. You can add multiple key-value pairs.
- Destination: where to encrypt (and send) the data.
  - Local: The local file system. When using this destination, an Output location must be specified. This corresponds to the directory where the encrypted file should be saved. By default, output files are saved to the user’s home directory. This default behavior can be changed via Default output directory in the Settings tab.
  - S3: A remote S3 compatible object store.
  - SFTP: A remote SFTP server.
  When encrypting to a remote destination, a number of options must be specified. Please refer to remote destination options for details.
- You are now ready to create an encrypted data package: click Encrypt data, or Encrypt and transfer data if you are encrypting to a remote S3 or SFTP destination.
A pop-up will appear, asking for the password associated with the sender’s key. After the password is entered, data compression and encryption will start.
Progress, success, and error messages for each submitted task are displayed in the Tasks tab.
Encrypting and sending data in CLI
Encrypting (or encrypting and transferring) data is done via the sett encrypt command. If you already have an encrypted data package on your local file system and want to transfer it to a remote destination, use the sett transfer command. Both commands share the same options for specifying the remote destination.
# General syntax:
sett encrypt local --signer SENDER --recipient RECIPIENT --dtr DATA_TRANSFER_ID --output OUTPUT_FILENAME_OR_DIRECTORY FILES_OR_DIRECTORIES_TO_ENCRYPT
# Example (long command line options):
sett encrypt local --signer alice@example.com --recipient bob@example.com --dtr 42 --output . ./file_to_encrypt.txt ./directory_to_encrypt
# Example (short command line options):
sett encrypt local -s alice@example.com -r bob@example.com --dtr 42 -o . ./file_to_encrypt.txt ./directory_to_encrypt
Description of the most used sett encrypt arguments:
- --signer and --recipient: data sender and data recipient(s). Values can be specified either as an OpenPGP key fingerprint, or an email address. Multiple recipients can be specified by repeating the -r/--recipient argument:
  # In this example, Alice encrypts data for both Bob and Chuck.
  sett encrypt local -s alice@example.com -r bob@example.com -r chuck@example.com -o . FILES_OR_DIRECTORIES_TO_ENCRYPT
- --dtr: Data Transfer Request ID value associated with the data being sent. For non-BioMedIT transfers, this argument is optional.
  BioMedIT
  For transfers into the BioMedIT network, a valid Data Transfer ID must always be specified.
- --output: an optional argument for specifying the location and/or name for the encrypted output file. It can be one of the following:
  - A directory: the encrypted data package file is written to the specified directory, and is given a name that follows the default naming convention.
  - A file name or full path: the encrypted data is written to a new file with the specified name or path.
  - Not provided: the encrypted data is written to stdout. This can be useful, for example, to pipe the data into another application.
- --check: adding the --check option runs the encrypt command in test mode, i.e. checks are made but no data is encrypted or sent.
- --verify: if the --verify option is passed, sett performs Data Transfer ID verification. For non-BioMedIT-related transfers, --verify should not be passed.
To automate the encryption process and avoid interactive prompts from the sett CLI - e.g. to ask for an OpenPGP key password - please refer to the section on automating the sett CLI.
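When a script has to encrypt for a variable number of recipients, the command line can be assembled step by step. The sketch below uses only the options described above; the email addresses and paths are the example values from this guide, and the command is printed rather than executed (a dry run), so it can be tried without sett being installed.

```shell
# Sketch: assemble a "sett encrypt local" command line for several recipients.
# The addresses and paths are example values.
signer='alice@example.com'
recipients='bob@example.com chuck@example.com'   # space-separated list

# Build the argument list in the positional parameters so that every value
# stays a separate argument.
set -- encrypt local --signer "$signer"
for recipient in $recipients; do
    # --recipient is repeated once per recipient.
    set -- "$@" --recipient "$recipient"
done
set -- "$@" --output . ./file_to_encrypt.txt ./directory_to_encrypt

# Dry run: print the command instead of executing it.
printf 'sett'
printf ' %s' "$@"
printf '\n'
# To run it for real, replace the three printf lines with: sett "$@"
```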
Direct encryption to a remote destination
In the sett encrypt command, the local subcommand can be replaced with s3-portal, s3 or sftp to encrypt and transfer data directly to a remote destination in a single command.
Encrypting data directly to a remote location is encouraged, as it has important benefits.
BioMedIT
BioMedIT users are encouraged to use the sett encrypt s3-portal subcommand, which enables the sett authenticated mode.
For more information about remote destination options in non-authenticated mode, please refer to the remote destination options section.
Example command for non-authenticated encryption to an S3 object store:
sett encrypt s3 \
-s alice@example.com -r bob@example.com \
--endpoint https://minio.my-node.ch --bucket my-project \
--access-key 23VO8RB2SIB2SF8EUL9V \
--secret-key wvrt7YoTTERGftf0zWnppWYSdcGplNtxuLHMn7op \
--session-token eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9eyJhY2Nlc3NLZXkiOiI5Vk84\
UkIyvimUMlKIFUVVTDc3WSIsImF0X7hhc2giOiIyRnVlZ3JmSjhTUWFXYkw2\
V0puekF3IiwiYXVkLjpbIm1pbmlvIl0sImF1dGhfdGltTRI6MTcyMTezODIx\
MywiZXhwIjoxNzIx0DMxODEzLCJpYXQiOjE3MjE4MzfiLKLsImlzwqI6Imh0\
dHBzOi8vcD3ydGFsLXN0YWasfmcuZGNjLnNpYi5zd2lzcy9hdXRoL38hdXRo\
IiwibmFtZSI6ImJpd2ciLCJqt5xpY5kiOiJjb25zb2xlQWRfgW4iLCJzdWIi\
OiIxOSJ9.PcvXcAli5Bz8ete1T265TPB1cbfgX7k8NDXU5gXy1nflxq203cG\
5qwAF9Oxyn1mKmwa87jsHj8HU2VUY9p5S1Q \
FILES_OR_DIRECTORIES_TO_ENCRYPT
Advanced CLI options
The data compression algorithm can be changed using the --compression-algorithm option. The available values are:
- zstandard (default): optimal compression and speed.
- gzip: available for compatibility with older versions of sett.
- stored: no compression.
The data compression level used by sett can be manually adjusted using the --compression-level option. Possible values depend on the selected compression algorithm:
- zstandard (default: 3)
  - 1 (lowest compression, fastest)
  - 21 (highest compression, slowest)
- gzip (default: 6)
  - 1 (lowest compression, fastest)
  - 9 (highest compression, slowest)
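A wrapper script can check a requested compression level against these ranges before calling sett. The helper below is a hypothetical sketch, not part of sett; it assumes that every integer between the documented bounds is accepted, and that no level applies to the stored algorithm.

```shell
# Hypothetical helper (not part of sett): checks a compression level against
# the documented ranges before it is passed to --compression-level.
# Usage: is_valid_compression_level ALGORITHM LEVEL
is_valid_compression_level() {
    algorithm="$1"
    level="$2"
    # Reject anything that is not a plain non-negative integer.
    case "$level" in
        ''|*[!0-9]*) return 1 ;;
    esac
    case "$algorithm" in
        zstandard) [ "$level" -ge 1 ] && [ "$level" -le 21 ] ;;
        gzip)      [ "$level" -ge 1 ] && [ "$level" -le 9 ] ;;
        # Assumption: "stored" applies no compression, so no level is valid.
        *)         return 1 ;;
    esac
}

is_valid_compression_level zstandard 3 && echo "zstandard 3: ok"
is_valid_compression_level gzip 12 || echo "gzip 12: out of range"
```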
Encrypting and sending data in TUI
You can access the interactive mode by running the sett command without any arguments.
Decrypting files
Decrypt and decompress encrypted data packages. Only .zip files that follow the sett packaging specification can be decrypted with sett.
Please note that:
- The decryption process includes the verification of the sender’s signature, which ensures the authenticity of the data.
- Verification of the sender’s signature is only possible if the sender’s public OpenPGP key is available in your local keyring.
- For security reasons, OpenPGP keys are not automatically downloaded from a key server by sett; they must be downloaded/imported manually.
To decrypt data, you must therefore have in your local keyring:
- The private key for which the data was encrypted. In principle, this is your own private key. This key will be used to decrypt the data.
- The data sender’s public key, used for signature verification purposes.
Decrypting data in GUI
To decrypt and decompress a file, go to the Decrypt tab and perform the following steps:
- Data package: the data package to decrypt. Start by selecting the source where the data package is located via the drop-down menu: either local or s3.
  - When decrypting from local, select a data package to decrypt with Select Package, or drag and drop the file into sett.
  - When decrypting from s3, a number of options must be specified - please refer to remote destination options for details. After doing so, click on Load package.
- Destination directory: select a location where to decrypt/decompress the file.
  By default, output files are saved to the user’s home directory. This default behavior can be modified under Default output directory in the Settings tab.
- Click Decrypt package to start the decryption and decompression process. A pop-up dialog will prompt you for the password associated with the OpenPGP key for which data was encrypted.
Decrypting data in CLI
In the CLI, data decryption is done via the sett decrypt command:
# General syntax:
sett decrypt local --output OUTPUT_DIRECTORY ENCRYPTED_FILES.zip
# Example:
sett decrypt local --output /home/alice/data/unpack_dir /home/alice/data/test_data.zip
Some useful sett decrypt arguments:
- The local subcommand can be replaced with s3 to decrypt data packages directly from an S3 object store.
- To decrypt data without decompressing it, add the --decrypt-only/-d option.
- If --output/-o is omitted, the data is decrypted to the current working directory.
To automate the decryption process and avoid interactive prompts from the sett CLI - e.g. to ask for an OpenPGP key password - please refer to the section on automating the sett CLI.
Decrypting data in TUI
The sett TUI interactive mode is accessed by running the sett command without any arguments.
Remote destination options
S3-compatible object store
BioMedIT
BioMedIT users are encouraged to use authenticated mode, where manual entry of the S3 parameters and credentials is not needed.
When using an S3-compatible object store as destination (in non-authenticated mode), the following options must be specified:
- URL: URL of the S3 object store.
- Bucket: name of the bucket where the encrypted data should be stored.
- Access key: access key to authenticate with the S3 object store. It is also possible to use a username instead.
- Secret key: secret key to authenticate with the S3 object store. It is also possible to use a password instead.
- Session token: session token to authenticate with the S3 object store. Session tokens are only required when authenticating using temporary credentials (known as STS credentials).
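In a script, these options can be taken from environment variables so that credentials are not hard-coded, with the session token added only when one is provided. The sketch below is an illustration under assumptions: the variable names S3_ENDPOINT, S3_BUCKET, S3_ACCESS_KEY, S3_SECRET_KEY and S3_SESSION_TOKEN are hypothetical and are not read by sett itself, while the option names are those used in the sett encrypt s3 example of this guide. The command is not executed; only the number of arguments is printed, so that no secret value is displayed.

```shell
# Sketch: build the options of "sett encrypt s3" from environment variables.
# The S3_* variable names are hypothetical; the defaults are placeholders.
: "${S3_ENDPOINT:=https://minio.my-node.ch}"
: "${S3_BUCKET:=my-project}"
: "${S3_ACCESS_KEY:=EXAMPLE_ACCESS_KEY}"
: "${S3_SECRET_KEY:=EXAMPLE_SECRET_KEY}"

set -- encrypt s3 -s alice@example.com -r bob@example.com \
    --endpoint "$S3_ENDPOINT" --bucket "$S3_BUCKET" \
    --access-key "$S3_ACCESS_KEY" --secret-key "$S3_SECRET_KEY"

# The session token is only needed with temporary (STS) credentials.
if [ -n "${S3_SESSION_TOKEN:-}" ]; then
    set -- "$@" --session-token "$S3_SESSION_TOKEN"
fi
set -- "$@" FILES_OR_DIRECTORIES_TO_ENCRYPT

# Dry run: report the argument count without printing secret values.
printf 'sett would be called with %s arguments\n' "$#"
# To run it for real: sett "$@"
```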
Save and reuse S3 configurations
When working with sett CLI, options for frequently used S3 connections can be stored in a configuration file. For detailed instructions, refer to S3 profiles.
SFTP
When using an SFTP server as destination, the following options must be specified:
- Host: URL address of the server where the files should be sent.
- User name: user name with which to connect to the SFTP server.
- Destination directory: absolute path of directory where files should be saved on the server.
- SSH key location: path of the private SSH key used for authentication to the SFTP server. This is only required if the SSH key is in a non-standard location. If missing, sett will use the SSH agent to provide the key.
  Do not confuse SSH keys (used to authenticate yourself when connecting to an SFTP server during file transfer) with OpenPGP keys (used to encrypt and sign data).
- SSH key password: password associated with the private SSH key given under SSH key location. If your SSH key password contains non-ASCII characters and this results in an error, please see the SSH private key with non-ASCII characters section of this guide.