Sett functionality overview

sett is designed as an “all-in-one” tool for transferring sensitive data to and from trusted research environments, that is, secure computing environments designed to host sensitive biomedical data.

At its core, sett lets users compress, encrypt, transfer, and decrypt data in a sett-specific file format. This comprises the following operations, which are performed as a streamed process:

  • Data encryption and signing in OpenPGP format when encrypting data.
  • Data decryption and signature verification when decrypting data.
  • Data compression and decompression in zstandard or gzip format.
  • Data checksum computation and verification: ensures that the decrypted data exactly matches the originally packaged data, so that no data corruption during the transfer goes undetected.
  • Verification of data transfer validity: sett verifies that data transfers are authorized and that the data sender and recipients are valid for the specified data transfer.
  • Data transfer with support for the following protocols:
    • S3: a RESTful API specification for object storage operating over the HTTPS protocol.
    • SFTP: secure file transfer over SSH.
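As an illustration of how compression and checksum computation can be combined in a single streamed pass, here is a minimal Python sketch. It is not sett's implementation and does not produce the sett file format: encryption, signing, and transfer are left out, gzip is used only because it ships with the Python standard library, and the function names, the chunk size, and the choice of SHA-256 are assumptions made for this example.

```python
import gzip
import hashlib
import io

CHUNK_SIZE = 64 * 1024  # arbitrary chunk size chosen for this sketch


def compress_with_checksum(src, dst):
    """Stream src into dst as gzip; return the SHA-256 hex digest of the input.

    Hypothetical helper: data is read chunk by chunk, so the whole input
    never has to fit in memory.
    """
    digest = hashlib.sha256()
    with gzip.GzipFile(fileobj=dst, mode="wb") as gz:
        while chunk := src.read(CHUNK_SIZE):
            digest.update(chunk)  # checksum of the original (uncompressed) data
            gz.write(chunk)
    return digest.hexdigest()


def decompress_and_verify(src, dst, expected_hex):
    """Stream-decompress src into dst; return True if the checksum matches."""
    digest = hashlib.sha256()
    with gzip.GzipFile(fileobj=src, mode="rb") as gz:
        while chunk := gz.read(CHUNK_SIZE):
            digest.update(chunk)  # checksum of the recovered data
            dst.write(chunk)
    return digest.hexdigest() == expected_hex


if __name__ == "__main__":
    original = b"example payload " * 1000
    packed = io.BytesIO()
    checksum = compress_with_checksum(io.BytesIO(original), packed)
    packed.seek(0)
    restored = io.BytesIO()
    print("checksum ok:", decompress_and_verify(packed, restored, checksum))
```

Because both functions work on file-like objects, the same pattern applies to files on disk or to network streams. In a real pipeline the encryption step would sit between compression and transfer.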

For BioMedIT users, sett also provides an authenticated mode, which integrates with the BioMedIT Portal in order to automatically:

  • Retrieve and refresh S3 access credentials.
  • Retrieve the data transfer (DTR) information associated with the user, as well as all S3 connection information associated with a data transfer. This reduces the amount of information that the end user must enter to perform a data transfer.

Finally, sett also implements an OpenPGP key store that provides its users with all the essential functionality for OpenPGP key management. This includes key generation, revocation, deletion, upload to and download from a keyserver, and key trust management.

Minimal requirements for transferring data into the BioMedIT network

We strongly recommend using sett to securely transfer data into the BioMedIT network. If for some reason this is absolutely not possible, any alternative data encryption and transfer tool must fully implement the minimum set of features listed below.

Requirements for data packaging and sending into the BioMedIT network:

  • Packaging of the files to be transferred into a data package that adheres to the sett data package specifications. This includes:
    • Data compression in zstandard, gzip, or stored (no compression) format.
    • Data encryption in OpenPGP format, as defined in RFC 4880.
    • Data signing in OpenPGP format.
    • Computation of checksums for the transferred files.
  • Transfer of data to an S3-compatible object store. Transfer into the BioMedIT network is done exclusively via S3, a RESTful API specification for object storage, operating over the HTTPS protocol. Transfer of data into the BioMedIT network therefore requires the ability to authenticate with an S3 object store and upload data to an S3 bucket.
  • Retrieval of temporary S3 access credentials (STS credentials) from the BioMedIT portal. Credentials to access BioMedIT S3 object store instances are delivered exclusively via the BioMedIT portal. Retrieval of S3 access credentials requires an authenticated call to the BioMedIT portal API STS endpoint.
  • Should a data transfer last longer than the lifetime of the STS credentials (currently set to 24 hours), mid-transfer renewal of the S3 credentials must also be implemented.
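The mid-transfer credential renewal requirement could be approached as in the following Python sketch. It only illustrates the renewal logic. The `StsCredentials` fields, the `RefreshingCredentials` class, the 5-minute safety margin, and the `fetch` callback are all hypothetical names and values introduced for this example, and the actual authenticated call to the BioMedIT portal API is deliberately left to the caller, since its details are not described here.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StsCredentials:
    """Temporary S3 credentials (field names are assumptions for this sketch)."""
    access_key: str
    secret_key: str
    session_token: str
    expires_at: float  # expiry time as a Unix timestamp, in seconds


class RefreshingCredentials:
    """Cache credentials and fetch new ones shortly before they expire."""

    def __init__(
        self,
        fetch: Callable[[], StsCredentials],  # caller-supplied; performs the real API call
        margin_seconds: float = 300.0,        # renew this long before expiry (assumed value)
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch
        self._margin = margin_seconds
        self._clock = clock
        self._current: Optional[StsCredentials] = None

    def get(self) -> StsCredentials:
        """Return valid credentials, renewing them if they are about to expire."""
        needs_refresh = (
            self._current is None
            or self._clock() >= self._current.expires_at - self._margin
        )
        if needs_refresh:
            self._current = self._fetch()
        return self._current
```

An upload loop would call `get()` before sending each chunk or part, so that a transfer running past the credential lifetime picks up fresh credentials without being restarted. Injecting the clock and the fetch function keeps the renewal logic testable without network access.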

Additionally, data packaging as per the above specifications requires access to OpenPGP keys for encrypting and signing the data. OpenPGP key generation and management can be done via existing utilities such as Sequoia-PGP or GnuPG.