See installation instructions or docs online.
- Build de Bruijn graph from Fasta files, FastQ files, or KMC k-mer counters:
./metagraph build
- Annotate graph using the column compressed annotation:
./metagraph annotate
- Transform the built annotation to a different annotation scheme:
./metagraph transform_anno
- Query annotated graph:
./metagraph query
DATA="../tests/data/transcripts_1000.fa"
./metagraph build -k 12 -o transcripts_1000 $DATA
./metagraph annotate -i transcripts_1000.dbg --anno-filename -o transcripts_1000 $DATA
./metagraph query -i transcripts_1000.dbg -a transcripts_1000.column.annodbg $DATA
./metagraph stats -a transcripts_1000.column.annodbg transcripts_1000.dbg
./metagraph
./metagraph build -v --parallel 30 -k 20 --mem-cap-gb 10 \
-o <GRAPH_DIR>/graph <DATA_DIR>/*.fasta.gz \
2>&1 | tee <LOG_DIR>/log.txt
./metagraph build -v --parallel 30 -k 20 --mem-cap-gb 10 --disk-swap <GRAPH_DIR> \
-o <GRAPH_DIR>/graph <DATA_DIR>/*.fasta.gz \
2>&1 | tee <LOG_DIR>/log.txt
K=20
./KMC/kmc -ci5 -t4 -k$K -m5 -fm <FILE>.fasta.gz <FILE>.cutoff_5 ./KMC
./metagraph build -v -p 4 -k $K --mem-cap-gb 10 -o graph <FILE>.cutoff_5.kmc_pre
./metagraph annotate -v --anno-type row --fasta-anno \
-i primates.dbg \
-o primates \
~/fasta_zurich/refs_chimpanzee_primates.fa
- Cluster columns
./metagraph transform_anno -v --linkage --greedy \
-o linkage.txt \
--subsample R \
-p NCORES \
primates.column.annodbg
Requires N*R/8 + 6*N^2 bytes of RAM, where N is the number of columns and R is the number of rows subsampled.
- Construct Multi-BRWT
./metagraph transform_anno -v -p NCORES --anno-type brwt \
--linkage-file linkage.txt \
-o primates \
--parallel-nodes V \
primates.column.annodbg
Requires M*V/8 + Size(BRWT) bytes of RAM, where M is the number of rows in the annotation and V is the number of nodes merged concurrently.
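The two RAM formulas above can be evaluated before launching a job, to pick values of R and V that fit the machine. The following is a minimal sketch: the function names `linkage_ram_bytes` and `brwt_ram_bytes` are hypothetical helpers, not part of metagraph, and the BRWT size passed to the second one is an assumed estimate that you have to supply yourself. Shell arithmetic is integer-only, so the division by 8 rounds down.

```shell
# Hypothetical helpers (not part of metagraph) that evaluate the RAM formulas above.

# Column clustering: N*R/8 + 6*N^2 bytes
#   $1 = N, number of columns
#   $2 = R, number of rows subsampled
linkage_ram_bytes() {
    echo $(( $1 * $2 / 8 + 6 * $1 * $1 ))
}

# Multi-BRWT construction: M*V/8 + Size(BRWT) bytes
#   $1 = M, number of rows in the annotation
#   $2 = V, number of nodes merged concurrently
#   $3 = assumed size of the resulting BRWT in bytes (an estimate, not computed here)
brwt_ram_bytes() {
    echo $(( $1 * $2 / 8 + $3 ))
}

linkage_ram_bytes 100 1000000             # prints 12560000
brwt_ram_bytes 1000000000 10 2000000000   # prints 3250000000
```

With 100 columns and 1,000,000 subsampled rows, clustering needs about 12.6 MB. With 10^9 rows, 10 nodes merged concurrently and an assumed 2 GB BRWT, construction needs about 3.25 GB.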
./metagraph query -v -i <GRAPH_DIR>/graph.dbg \
-a <GRAPH_DIR>/annotation.column.annodbg \
--discovery-fraction 0.8 --labels-delimiter ", " \
query_seq.fa
./metagraph align -v -i <GRAPH_DIR>/graph.dbg query_seq.fa
./metagraph assemble -v <GRAPH_DIR>/graph.dbg \
-o assembled.fa \
--unitigs
./metagraph assemble -v <GRAPH_DIR>/graph.dbg \
--unitigs \
-a <GRAPH_DIR>/annotation.column.annodbg \
--label-mask-in LABEL_1 \
--label-mask-in LABEL_2 \
--label-mask-out LABEL_3 \
-o diff_assembled.fa
Stats for graph
./metagraph stats graph.dbg
Stats for annotation
./metagraph stats -a annotation.column.annodbg
Stats for both
./metagraph stats -a annotation.column.annodbg graph.dbg
The Makefile in the top level source directory can be used to build and test metagraph more conveniently. The following arguments are supported:
- env: environment in which to compile/run ("": on the host, docker: in a docker container)
- alphabet: compile metagraph for a certain alphabet (e.g. DNA or Protein, default DNA)
- additional_cmake_args: additional arguments to pass to cmake.
Examples:
# compiles metagraph in a docker container for the `DNA` alphabet
make build-metagraph env=docker alphabet=DNA
Metagraph is distributed under the GPLv3 License (see LICENSE). Please find further information in the AUTHORS and COPYRIGHTS files.