feat: add wrapper for MEGAHIT #4121
Merged
9 commits
c5f43a0 feat: add wrapper for MEGAHIT (alienzj)
38f0ff9 Log output (fgvieira)
f0b0dc0 Code tweak (fgvieira)
51c93f3 Use system tempdir (fgvieira)
b893f9a Use custom tempdir and update output (alienzj)
77a14a6 Use system tempdir (alienzj)
89f3db1 add environment.linux-64.pin.txt (alienzj)
521e109 Code format (fgvieira)
e7a9174 coderabbitai suggestion (fgvieira)
feat: add wrapper for MEGAHIT
commit c5f43a0ce73c1e7465341cb42abb8c547a69ddc9
@@ -0,0 +1,7 @@
channels:
  - conda-forge
  - bioconda
  - nodefaults
dependencies:
  - megahit =1.2.9
  - snakemake-wrapper-utils =0.7.2
@@ -0,0 +1,23 @@
name: "megahit"

url: https://github.com/voutcn/megahit

description: |
  MEGAHIT is an ultra-fast and memory-efficient NGS assembler. It is optimized for metagenomes, but also works well on generic single genome assembly (small or mammalian size) and single-cell assembly.
  Input options can be specified multiple times; plain-text, gz and bz2 files are supported.

input:
  - reads: list of reads in FASTQ format
  - r1: forward reads
  - r2: reverse reads
  - interleaved: interleaved reads
  - unpaired: unpaired reads

output:
  - contigs: output file with contigs
  - log: log file
  - json: options JSON file

authors:
  - Jie Zhu
  - Filipe G. Vieira
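
The input keys listed above correspond to MEGAHIT's read options: `-1`/`-2` for paired-end files, `--12` for interleaved files and `-r`/`--read` for single-end files. Below is a minimal Python sketch of that mapping. It assumes a plain dict and list stand in for `snakemake.input`; `build_input_args` and its parameters are hypothetical names used for illustration and are not part of the wrapper. It is also simplified: the wrapper in this PR additionally reads interleaved and unpaired files from positions 3 and 4 of `reads`.

```python
import shlex
from typing import Mapping, Sequence


def build_input_args(named: Mapping[str, str], reads: Sequence[str] = ()) -> str:
    """Translate wrapper input keys into MEGAHIT read options.

    `named` and `reads` are hypothetical stand-ins for `snakemake.input`.
    """
    args = []
    # Paired-end reads: explicit r1/r2 keys take precedence over positional reads.
    if "r1" in named and "r2" in named:
        args += ["-1", named["r1"], "-2", named["r2"]]
    elif len(reads) >= 2:
        args += ["-1", reads[0], "-2", reads[1]]
    # Interleaved paired-end reads.
    if "interleaved" in named:
        args += ["--12", named["interleaved"]]
    # Single-end (unpaired) reads.
    if "unpaired" in named:
        args += ["--read", named["unpaired"]]
    # shlex.join quotes any path that contains shell-special characters.
    return shlex.join(args)


print(build_input_args({}, ["sample1_R1.fastq.gz", "sample1_R2.fastq.gz"]))
```

Collecting the options in a list and joining them once avoids the doubled spaces that string concatenation produces, and quoting keeps paths with spaces intact.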
@@ -0,0 +1,29 @@
container: "docker://continuumio/miniconda3:4.4.10"


rule run_megahit:
    input:
        reads=["test_reads/sample1_R1.fastq.gz", "test_reads/sample1_R2.fastq.gz"],
    output:
        contigs="assembly/contigs.fasta",
    benchmark:
        "logs/benchmarks/assembly/megahit.txt"
    params:
        # all parameters are optional
        extra="--min-count 10 --k-list 21,29,39,59,79,99,119,141",
    log:
        "logs/megahit.log",
    threads: 8
    resources:
        mem_mb=250000,
    wrapper:
        "master/bio/megahit"


rule download_test_reads:
    output:
        ["test_reads/sample1_R1.fastq.gz", "test_reads/sample1_R2.fastq.gz"],
    log:
        "logs/download.log",
    shell:
        "(wget -O - https://zenodo.org/record/3992790/files/test_reads.tar.gz | tar -xzf -) > {log} 2>&1"
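
The test rule above requests `mem_mb=250000`, and the wrapper in this PR hands the memory limit to `megahit -m`, which MEGAHIT reads as a byte count (values between 0 and 1 are treated as a fraction of total memory). The sketch below works through the unit arithmetic. It assumes that `get_mem(snakemake, out_unit="KiB")` treats `mem_mb` as MiB and multiplies it by 1024; `mem_mb_to_bytes` is a hypothetical helper, not a function from `snakemake_wrapper_utils`.

```python
def mem_mb_to_bytes(mem_mb: int) -> int:
    """Convert a Snakemake `mem_mb` value to bytes, treating it as MiB."""
    # Assumed behaviour of get_mem(..., out_unit="KiB"): MiB -> KiB.
    kib = mem_mb * 1024
    # The wrapper then multiplies by 1024 again: KiB -> bytes.
    return kib * 1024


print(mem_mb_to_bytes(250000))  # 262144000000
```

Under that assumption, the test rule would result in `-m 262144000000`, roughly 244 GiB.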
@@ -0,0 +1,73 @@
"""Snakemake wrapper for megahit."""

__author__ = "Jie Zhu @alienzj"
__copyright__ = "Copyright 2025, Jie Zhu"
__email__ = "[email protected]"
__license__ = "MIT"

import os, tempfile, shutil
from snakemake.shell import shell
from snakemake_wrapper_utils.snakemake import get_mem

# get output_dir and files from output
output_dir = os.path.split(snakemake.output[0])[0]
contigs_file = snakemake.output.get("contigs", os.path.join(output_dir, "contigs.fa"))
contigs_file_original = os.path.join(output_dir, "final.contigs.fa")
options_file = snakemake.output.get("options", os.path.join(output_dir, "options.json"))
log_file = snakemake.output.get("log", os.path.join(output_dir, "log"))

# parse params
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)
memory_requirements = get_mem(snakemake, out_unit="KiB") * 1024  # bytes, for megahit -m

# parse short reads
if hasattr(snakemake.input, "reads"):
    reads = snakemake.input.reads
else:
    reads = snakemake.input

input_arg = ""

# handle named inputs if available
if hasattr(snakemake.input, "r1") and hasattr(snakemake.input, "r2"):
    input_arg += " -1 {} -2 {} ".format(snakemake.input.r1, snakemake.input.r2)
elif len(reads) >= 2:
    input_arg += " -1 {} -2 {} ".format(reads[0], reads[1])

# handle interleaved reads if specified
if hasattr(snakemake.input, "interleaved"):
    input_arg += " --12 {} ".format(snakemake.input.interleaved)
elif len(reads) >= 3 and not hasattr(snakemake.input, "r1"):
    input_arg += " --12 {} ".format(reads[2])

# handle additional reads if specified
if hasattr(snakemake.input, "unpaired"):
    input_arg += " --read {} ".format(snakemake.input.unpaired)
elif len(reads) >= 4 and not hasattr(snakemake.input, "r1"):
    input_arg += " --read {} ".format(reads[3])


with tempfile.TemporaryDirectory(dir=os.path.dirname(output_dir)) as temp_dir:
    output_temp_dir = os.path.join(temp_dir, "temp")

    shell(
        "megahit "
        " -t {snakemake.threads} "
        " -m {memory_requirements} "
        " -o {output_temp_dir} "
        " {input_arg} "
        " {extra} "
        " {log} "
    )

    if os.path.exists(os.path.join(output_temp_dir, "done")):
        shell("rm -rf {output_dir}")
        shutil.move(output_temp_dir, output_dir)

if (
    os.path.exists(contigs_file_original)
    and os.path.exists(options_file)
    and os.path.exists(log_file)
):
    shutil.move(contigs_file_original, contigs_file)
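
The wrapper runs MEGAHIT in a temporary directory and replaces the real output directory only once a `done` marker exists. As far as I know, this is needed because MEGAHIT refuses to start if its output directory already exists, while Snakemake creates output directories before a job runs. Below is a standalone sketch of that stage-then-publish pattern, not the wrapper's code. `run_staged` and `fake_assembler` are hypothetical names, and the callable stands in for the real `megahit` call.

```python
import os
import shutil
import tempfile


def run_staged(final_dir: str, run) -> bool:
    """Run `run(staging_dir)` and publish the result only if it completed.

    `run` is a hypothetical callable standing in for the MEGAHIT call; it
    must create `staging_dir` and, on success, a `done` marker inside it.
    """
    parent = os.path.dirname(os.path.abspath(final_dir))
    os.makedirs(parent, exist_ok=True)
    # Stage next to the final location so the move stays on one filesystem.
    with tempfile.TemporaryDirectory(dir=parent) as tmp:
        staging = os.path.join(tmp, "temp")
        run(staging)
        if not os.path.exists(os.path.join(staging, "done")):
            return False  # incomplete run: leave any previous final_dir untouched
        shutil.rmtree(final_dir, ignore_errors=True)
        shutil.move(staging, final_dir)
        return True


def fake_assembler(out_dir: str) -> None:
    # Stand-in for MEGAHIT: writes an empty contigs file and the completion marker.
    os.makedirs(out_dir)
    for name in ("final.contigs.fa", "done"):
        open(os.path.join(out_dir, name), "w").close()
```

Calling `run_staged("assembly", fake_assembler)` leaves `assembly/final.contigs.fa` in place. A run that never writes `done` returns `False` and keeps the earlier results.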