Build
Building the containers
Before the application can be started, its containers have to be built. This is done with the following command:
docker-compose build
Time
Building these containers takes some time. The init and annotation containers in particular install many Perl modules and take a while before they are ready.
Refer to the next section for further details on individual docker containers and their purpose.
Docker Container
The application, as presented, utilizes multiple docker containers for the components.
- mariadb
- EVAdb (docker/snv-hg19p)
- EVAdb admin (docker/snvedit)
- init container (helperscripts)
- annotation (annotation)
The EVAdb application consists of two user-facing web interfaces served as CGI scripts by an Apache web server. To run the application, we also set up the MySQL database (mariadb), an init container and an annotation container. The primary use case of the init container is the creation of the database, the initial user setup and the import of external databases. It should be run on first startup and whenever external data needs changing. The annotation container is intended for importing VCF files from a standard GATK best practices pipeline.
The setup of all containers is done through the docker-compose script (docker-compose.yml). Below, we explain the basic setup of each container.
docker-compose run
To execute a tool in one of the containers of our setup, use either
docker-compose exec [db|evadb_user|evadb_admin|evadb_init|annotation] bash
if the container is currently running, or
docker-compose run [db|evadb_user|evadb_admin|evadb_init|annotation] bash
if the container is currently shut off. For example, to inspect the database in detail, one can use
docker-compose exec db mysql -u <USER> -p<PW>
to get a mysql shell in the container.
EVAdb and EVAdb Admin Containers
Both containers are extended from standard apache/httpd Dockerfiles. They support environment variables for supplying the database user and password. In each container, the application runs as an HTTPS web service on port 443, which can be published to a host port using standard Docker port mapping. By default, the user-facing filter application takes port 443 of the host and the admin application uses port 8443.
Firewall Setup
When running a production setup where your EVAdb instance must be accessible from the internet, it is recommended to run a firewall that exposes only port 443 to the outside. In such a setup, the admin backend can only be accessed from your local network or, for example, through SSH port forwarding.
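The SSH port forwarding mentioned above could look like the following sketch. The login alice and the server name evadb.example.org are hypothetical placeholders, and the port assumes the default admin port 8443 described earlier. The snippet only assembles and prints the command so it can be checked before use.

```shell
# Hypothetical login and server name; replace with your own.
REMOTE_USER="alice"
REMOTE_HOST="evadb.example.org"

# Host port of the admin application (8443 by default).
ADMIN_PORT=8443

# -L forwards the local port to the same port on the server itself;
# -N keeps the tunnel open without starting a remote shell.
TUNNEL_CMD="ssh -N -L ${ADMIN_PORT}:localhost:${ADMIN_PORT} ${REMOTE_USER}@${REMOTE_HOST}"

# Print the command for review; run it on your workstation to open the tunnel.
echo "$TUNNEL_CMD"
```

While the tunnel is open, the admin application should be reachable at https://localhost:8443 on the workstation. The browser will likely warn about the certificate, since it was not issued for localhost.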
SSL Setup
Both containers require a docker volume to be mounted at the /ssl location (can be read-only) containing a certificate and private key file for the SSL connection. The files must be named evadb.crt and evadb.key respectively.
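For a test installation, a self-signed certificate is often sufficient. The following sketch creates the two files with the required names using openssl. The host directory name ssl and the subject /CN=localhost are assumptions for illustration and not part of the EVAdb setup. A production instance would normally use a certificate issued by a certificate authority instead.

```shell
# Host directory to be mounted at /ssl; the name "ssl" is an arbitrary choice.
mkdir -p ssl

# Create a self-signed certificate valid for 365 days with an unencrypted key.
# The subject "/CN=localhost" is a placeholder; use your server's hostname.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -subj "/CN=localhost" \
  -keyout ssl/evadb.key \
  -out ssl/evadb.crt
```

How this directory is mounted into the containers depends on the volume entries in docker-compose.yml.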
Init Container
The init container initializes the database. It creates the first user for the web interface, who can then create further users. It is built from a standard Debian image and installs most of the software needed to run the Perl scripts that set up the database.
First Startup
On first start, EVAdb will typically launch against an uninitialized database. In order to function properly, at least INIT_DB and INIT_USER have to be set to 1 to create the database setup and add initial user credentials.
The init container has two Docker volumes which need to be populated with data for the container to work properly.
Volume | Purpose |
---|---|
/library | Hosting pre-downloaded external databases (gnomAD, dbNSFP etc.). See the Download Section for more details. |
/database | Database dumps and sql scripts for database creation. The container uses all scripts from this folder to initialize the database. |
The behaviour of the container can be influenced by setting the IMPORT_* and INIT_* environment variables. This is especially useful if only a single third-party dataset should be reimported. Any value other than 1 (e.g. 0) turns off the respective part.
Database Wipe
If the init container is run with INIT_DB=1 on an initialized database, all existing data in the installation is wiped.
Setting | Default | Description |
---|---|---|
INIT_DB | 1 | Initialize the database (wipes existing data) |
INIT_USER | 1 | Set up admin user and password |
IMPORT_DBNSFP | 1 | Toggle import of Polyphen2 and SIFT |
IMPORT_CADD | 1 | Toggle import of CADD scores |
IMPORT_GNOMAD | 1 | Toggle import of gnomAD |
IMPORT_DGV | 1 | Toggle import of DGV structural variation |
IMPORT_CLINVAR | 1 | Toggle import of ClinVar data |
IMPORT_UCSC | 1 | Toggle import of UCSC data |
IMPORT_CDSDB | 1 | Toggle import of coding sequence database |
IMPORT_LOF_METRICS | 1 | Toggle import of gnomAD scores by gene |
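As a sketch of how these toggles combine, the helper below builds the -e flags for docker-compose run so that only one dataset is reimported. The function name reimport_flags is hypothetical. The example also assumes that the init service is named evadb_init, as in the exec/run commands earlier, and that running it without an explicit command starts the initialization.

```shell
# Print "-e NAME=VALUE" flags that switch off database and user setup and
# every import except the single dataset given as $1 (e.g. CLINVAR).
reimport_flags() {
  flags="-e INIT_DB=0 -e INIT_USER=0"
  for name in DBNSFP CADD GNOMAD DGV CLINVAR UCSC CDSDB LOF_METRICS; do
    if [ "$name" = "$1" ]; then value=1; else value=0; fi
    flags="$flags -e IMPORT_${name}=${value}"
  done
  echo "$flags"
}

# Show the resulting command for review instead of running it directly.
echo "docker-compose run --rm $(reimport_flags CLINVAR) evadb_init"
```

The printed command can be checked and then run by hand. INIT_DB=0 is set explicitly so that an existing database is not wiped.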
Annotation Container
The annotation container is built from the ngs-pipeline repository. Its primary purpose is enabling the import of vcf files for samples.
For this, it uses the externalPipelineImport.pl tool as its main interface via entrypoint.sh. Data should be provided via the volume /data. The container expects library data (such as human reference genomes) at /library and databases for annotation at /anno_db.
How to use this container to import a VCF file will be described in another section.