Administration Guide - Quick Start
Quick-Start
From running instance to first filter query, the following steps need to be taken in-order:
- Create a cooperation
- Create a project
- Create necessary diseases
- Create a sample sheet
- Upload the sample sheet
- Annotate and upload your vcf files
- Filter
Autosomal (dominant|recessive)
orde-novo
Once you started the application and have confirmed that it is in working order, you can start to administer your system. This includes setting up users and uploading a first dataset to filter for disease causing variants.
All of the following steps are described in greater detail in the different parts of this section. For the impatient, to be able to filter your first case, these are the minimal steps you need to do.
Evaluation
As there are many moving parts within EVAdb, especially around the annotation side of the application it is good practice to first upload a known solved case to test variant filtration and annotation.
Access
To follow this guide you need SSH access to the host running EVAdb and admin access to both applications (user and admin).
Create a Cooperation and Project
Prior to uploading samples, we need to setup the support structure. Goto
New cooperation
and enter your cooperation details in the form. Typically,
this will be your name and details. Afterwards create a project and assign your
new cooperation to the project.
We assume your cooperation is named Mustermann
and the project is called
EXOME_CASES
.
Create diseases
For the sample upload and filter processes to work, we need to create some
diseases. These tables are not pre-populated. Assuming you start with a single
cohort of trio-samples it is advisable to set the parents to a different
disease group than the index cases. Create at least two diseases, one "real"
disease that can be attributed to one of the disease groups and one for the
control samples that you assign the disease group Controls
.
A simple, yet unspecific example could be two diseases:
Developmental disorder
,DEV
(group:Developmental disorder
) andControls
,CTRLS
(group:Controls
)
Setup a sample sheet
The sample sheet is the most convenient way to define multiple sample objects
at once. While it is possible to create samples manually or via the LIMS system,
we recommend using sample sheets as source of truth. Every row in the sample
sheet defines exactly one sample and all its relevant attributes. Most of the
columns are mandatory. Upload your sample sheet by using the
Import external samples
form.
Sample ID,Foreign ID,Pedigree,Comment,Sex,Affected,Organism,Tissue,Disease,Library Type,Read Type,Exome Assay
76436,76436,B20-0498,TranslateNamse,female,1,human,peripheral blood,Cases,exomic,paired-end,SureSelect60Mbv6
76443,76443,B20-0498,TranslateNamse,male,0,human,peripheral blood,Controls,exomic,paired-end,SureSelect60Mbv6
76442,76442,B20-0498,TranslateNamse,female,0,human,peripheral blood,Controls,exomic,paired-end,SureSelect60Mbv6
The three ID columns Sample ID
, Foreign ID
and Pedigree
are used to
identify samples. Additionally, since Pedigree
is used as family id, when
the same Pedigree
is set for multiple samples, we will be able to define the
family structure through the web user interface. To be able to use de-novo
filter and the like it is necessary to specify the family structure by
modifying the generated sample entries.
Disease
Make sure to set Disease
column to the values you used in the previous
steps.
Annotate and Upload your variants
Docker
The following part is specific to the docker version of EVAdb. When using a bare-metal or other installation please adjust the commands.
After defining samples and the family structure for your cohort, the next step is uploading the variants. The minimum prerequisite other than the previous step is that all data is local to the server that hosts the mariadb container.
Database
Make sure the database is running and able to handle connections using the
MYSQL_USER
username prior to starting the annotation container.
To make your data available inside the container, set DATA_DIR
in your .env
file to the directory containing all vcf files that you wish to upload. If you
wish to annotate and insert a specific .vcf
file, use the following
procedure. We assume that the vcf file is located at
/share/data/exomes/C1234.vcf.gz
and DATA_DIR=/share/data
. Then, we can use
the following command to upload the variants. This annotation procedure does
support uploading multi-sample vcf files.
chr Tags
EVAdb expects all contigs to have the chr
Prefix. Make sure your contigs
are named chr1-chrY
prior to executing this command.
A simple command to change contig names would be the following (although
there are some caveats when changing the reference like that). Set $f
to
your .vcf.gz
input and $VCF
to the desired output location.
zcat $f | awk '{ if ( $1 ~ "#" ) { print $0 } else { print "chr"$0 } }' > $VCF
docker-compose run annotation -vcf /data/exomes/C1234.vcf.gz \
-sample C1234 \
-se hg19_plus
Depending on the speed of your system the upload of one sample should be done in less than ten minutes. Afterwards you can try to filter your variant dataset using the tools in the EVAdb user application frontend.
Finally, fill the frequency tables with information, s.t. variant filters can work with the in-house data.
docker-compose run annotation bash
/pipeline/doAfterImport.pl -se hg19_plus -s 40