DeepFilter

rabbitvar filter

A deep-learning-based variant filter for VarDict

Pipeline

There are three main steps in DeepFilter:

First, DeepFilter applies a hard filter to the intermediate results produced by VarDict. Variants that match the filter conditions are discarded.
Then, the remaining variants are passed to the network for inference.
Finally, DeepFilter writes the filtered variants to a VCF file.
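
The three steps above can be sketched in a few lines of Python. This is only an illustration of the control flow, not DeepFilter's implementation: the `Candidate` fields, the thresholds in `hard_filter`, and the `score_fn` callback that stands in for the trained network are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Candidate:
    chrom: str
    pos: int
    ref: str
    alt: str
    depth: int   # hypothetical feature: read depth at the site
    af: float    # hypothetical feature: allele frequency


def hard_filter(cands: Iterable[Candidate],
                min_depth: int = 10, min_af: float = 0.01) -> List[Candidate]:
    """Step 1: discard candidates that fail fixed thresholds (illustrative values)."""
    return [c for c in cands if c.depth >= min_depth and c.af >= min_af]


def infer(cands: Iterable[Candidate],
          score_fn: Callable[[Candidate], float],
          threshold: float = 0.5) -> List[Candidate]:
    """Step 2: keep candidates whose score reaches the threshold.

    score_fn is a stand-in for the trained network.
    """
    return [c for c in cands if score_fn(c) >= threshold]


def to_vcf_lines(cands: Iterable[Candidate]) -> List[str]:
    """Step 3: format the surviving candidates as minimal VCF text lines."""
    lines = [
        "##fileformat=VCFv4.2",
        '##INFO=<ID=AF,Number=A,Type=Float,Description="Allele frequency">',
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    ]
    for c in cands:
        lines.append(f"{c.chrom}\t{c.pos}\t.\t{c.ref}\t{c.alt}\t.\tPASS\tAF={c.af}")
    return lines
```

Chaining the three functions as `to_vcf_lines(infer(hard_filter(cands), score_fn))` mirrors the order of the pipeline.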

An example using DeepFilter to filter INDEL or SNV variants

IN_DATA below refers to the intermediate result of VarDict. You can generate it by running VarDict as follows:

VarDict \
  -G /path/to/hg19.fa \
  -f $AF_THR -N sample_name \
  -b "/path/to/tumor.bam|/path/to/normal.bam" \
  -c 1 -S 2 -E 3 -g 4 \
  /path/to/my.bed  | VarDict/testsomatic.R > ${IN_DATA}

Filter INDEL variants and write the result to a VCF file

VARTYPE="INDEL"
python call_somatic.py \
    --workspace /home/haoz/deepfilter/workspace \
    --in_data ${IN_DATA} \
    --nthread ${THREAD} \
    --var_type ${VARTYPE} \
    --trained_model ./models/checkpoint_indel_w1_24.adam.pth \
    --out ${DEEPFILTER}/workspace/result/filtered_som_indel.vcf

Filter SNV variants and write the result to a VCF file

VARTYPE="SNV"
python call_somatic.py \
    --workspace /home/haoz/deepfilter/workspace \
    --in_data ${IN_DATA} \
    --nthread ${THREAD} \
    --var_type ${VARTYPE} \
    --trained_model ./models/checkpoint_snv_w1_24.adam.pth \
    --out ${DEEPFILTER}/workspace/result/filtered_som_snv.vcf
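
The two invocations differ only in the variant type, the model checkpoint, and the output file, so they can be generated from one table. The sketch below builds both command lines from the flags shown above. It assumes the SNV run takes `--var_type SNV`; `build_call_command`, `build_all`, and the placeholder paths are hypothetical helpers and are not part of DeepFilter.

```python
import shlex
from typing import Dict, List, Tuple

# var_type -> (model checkpoint, output file name), taken from the examples above.
JOBS: Dict[str, Tuple[str, str]] = {
    "INDEL": ("./models/checkpoint_indel_w1_24.adam.pth", "filtered_som_indel.vcf"),
    "SNV": ("./models/checkpoint_snv_w1_24.adam.pth", "filtered_som_snv.vcf"),
}


def build_call_command(var_type: str, in_data: str, workspace: str,
                       model: str, out: str, nthread: int = 8) -> List[str]:
    """Assemble one call_somatic.py command as an argument list."""
    return [
        "python", "call_somatic.py",
        "--workspace", workspace,
        "--in_data", in_data,
        "--nthread", str(nthread),
        "--var_type", var_type,
        "--trained_model", model,
        "--out", out,
    ]


def build_all(in_data: str, workspace: str, nthread: int = 8) -> List[List[str]]:
    """Build the INDEL and SNV commands for the same VarDict intermediate file."""
    cmds = []
    for var_type, (model, out_name) in JOBS.items():
        out = f"{workspace}/result/{out_name}"
        cmds.append(build_call_command(var_type, in_data, workspace, model, out, nthread))
    return cmds


if __name__ == "__main__":
    for cmd in build_all("/path/to/in_data.txt", "/path/to/workspace"):
        # Print the command; pass `cmd` to subprocess.run(cmd, check=True) to execute it.
        print(shlex.join(cmd))
```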

Train new models

step1: make the training data (.tsv)

python make_data.py inter.txt groundtruth.vcf $TYPE train_data.tsv
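
How `make_data.py` matches intermediate records against the ground truth is not described here. As a rough sketch of the idea, assuming labels come from an exact match on (chrom, pos, ref, alt), the truth VCF can be reduced to a set of keys and each candidate labelled 1 if its key is in that set. `load_truth_keys` and `label_candidates` are hypothetical names, and indel normalization is ignored.

```python
from typing import Iterable, List, Set, Tuple

Key = Tuple[str, int, str, str]  # (chrom, pos, ref, alt)


def load_truth_keys(vcf_lines: Iterable[str]) -> Set[Key]:
    """Collect one key per ALT allele from VCF text lines.

    Relies only on the fixed VCF columns: CHROM, POS, ID, REF, ALT.
    """
    keys: Set[Key] = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom, pos, _id, ref, alts = line.rstrip("\n").split("\t")[:5]
        for alt in alts.split(","):  # multi-allelic sites list ALTs comma-separated
            keys.add((chrom, int(pos), ref, alt))
    return keys


def label_candidates(candidates: Iterable[Key], truth: Set[Key]) -> List[Tuple[Key, int]]:
    """Label each candidate 1 if it appears in the truth set, else 0."""
    return [(key, 1 if key in truth else 0) for key in candidates]
```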

step2: modify the source code if you want to change the network structure or other strategies.
step3: re-train the model

python train_somatic.py \
  --workspace . \
  --train_data  ${data_path}/data_indel_all.tsv \
  --nthread 8 \
  --var_type "INDEL" \
  --weight ${weight} \
  --out_model_path ./models/checkpoint_indel_w${weight}.adam.pth
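
The meaning of `--weight` is not documented here. One common use of such an option, assumed for this sketch, is to scale the loss on the positive (true variant) class so that the rarer class counts more during training. The function below shows that idea as a weighted binary cross-entropy in plain Python; `weighted_bce` and `pos_weight` are hypothetical names and are not taken from `train_somatic.py`.

```python
import math
from typing import Sequence


def weighted_bce(probs: Sequence[float], labels: Sequence[int],
                 pos_weight: float = 1.0, eps: float = 1e-12) -> float:
    """Mean binary cross-entropy with the positive-class term scaled by pos_weight."""
    if len(probs) == 0 or len(probs) != len(labels):
        raise ValueError("probs and labels must be non-empty and of equal length")
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)
```

With `pos_weight` greater than 1, a missed true variant costs more than a false positive of the same confidence, which pushes the model toward higher recall.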

Usage

usage: call_somatic.py [-h] --workspace WORKSPACE
                      --in_data IN_DATA --truth_file TRUTH_FILE
                      [--model_out MODEL_OUT] --var_type VAR_TYPE
                      [--batch_size BATCH_SIZE] [--nthreads NTHREADS]
                      [--trained_model TRAINED_MODEL] --out OUT
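
For reference, an `argparse` parser that would print a usage message with the same options looks roughly like the sketch below. It mirrors the usage text as printed, including `--truth_file` as a required option. The argument types are assumptions, and the actual parser in `call_somatic.py` may differ. Because `argparse` accepts unambiguous prefixes of long options by default, `--nthread` in the examples above resolves to `--nthreads` with a parser like this one.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Sketch of a parser matching the usage message (types are assumed)."""
    p = argparse.ArgumentParser(prog="call_somatic.py")
    p.add_argument("--workspace", required=True)
    p.add_argument("--in_data", required=True)
    p.add_argument("--truth_file", required=True)
    p.add_argument("--model_out")
    p.add_argument("--var_type", required=True)
    p.add_argument("--batch_size", type=int)
    p.add_argument("--nthreads", type=int)
    p.add_argument("--trained_model")
    p.add_argument("--out", required=True)
    return p


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```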

