Jules pipeline hack

Generate data

$ make data

Calls make-data.py to (roughly) do the following:

Choose some images.
For each image, choose a protein.
Convert the image to black and white.
Extract the horizontal black lines from the image.
Scale the image to the amino acid (aa) length of the protein.
Break each horizontal line into some reads.
Calculate a bit score for the reads based on how far down we are in the image.
Add some noise (otherwise the image looks too good and you can't see the individual reads).
Write out fake DIAMOND results for those reads.
Write out fake FASTQ files for the reads.
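
The line-extraction, read-splitting, scoring and noise steps above can be sketched roughly as follows. This is an illustration only, not the code in make-data.py: the function names, the read length range, the linear score formula (rows nearer the top of the image get higher bit scores) and the uniform noise model are all assumptions.

```python
import random


def horizontal_runs(row):
    """Return half-open (start, end) spans of consecutive black (True) pixels."""
    runs, start = [], None
    for x, black in enumerate(row):
        if black and start is None:
            start = x
        elif not black and start is not None:
            runs.append((start, x))
            start = None
    if start is not None:
        runs.append((start, len(row)))
    return runs


def split_into_reads(start, end, rng, min_len=3, max_len=8):
    """Break one run into consecutive reads of random length (assumed range)."""
    reads, pos = [], start
    while pos < end:
        length = min(rng.randint(min_len, max_len), end - pos)
        reads.append((pos, pos + length))
        pos += length
    return reads


def bit_score(y, height, top=100.0, bottom=20.0):
    """Assumed linear mapping: row 0 (top of image) scores highest."""
    if height <= 1:
        return top
    return top - (top - bottom) * y / (height - 1)


def image_to_reads(image, seed=0, noise=2.0):
    """Turn a black-and-white image (list of rows of bools) into fake reads."""
    rng = random.Random(seed)
    reads = []
    for y, row in enumerate(image):
        for start, end in horizontal_runs(row):
            for r_start, r_end in split_into_reads(start, end, rng):
                # Jitter the score so individual reads stay distinguishable.
                score = bit_score(y, len(image)) + rng.uniform(-noise, noise)
                reads.append({"start": r_start, "end": r_end, "bitscore": score})
    return reads
```

Here each pixel column stands for one amino acid position, which is why the image is scaled to the protein length before the lines are extracted.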

The files to be injected into the pipeline appear in OUT/json and OUT/fastq.
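
For orientation, a FASTQ record is four lines: an @-prefixed read id, the sequence, a + separator, and one quality character per base. A minimal sketch of writing such records is below; the function names, the read ids and the constant quality character are assumptions, and make-data.py may build its records differently.

```python
def fastq_record(read_id, sequence, quality_char="I"):
    """Format one FASTQ record with a constant quality string.

    "I" is Phred quality 40 in the common Phred+33 encoding (an assumed choice).
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    return f"@{read_id}\n{sequence}\n+\n{quality_char * len(sequence)}\n"


def write_fastq(path, reads):
    """Write (read_id, sequence) pairs to a FASTQ file at `path`."""
    with open(path, "w") as fp:
        for read_id, sequence in reads:
            fp.write(fastq_record(read_id, sequence))
```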

Inject

When the pipeline for the target sample is finished with the 03-diamond-civ-rna and 025-dedup steps:

$ make add

Calls add-data.py to:

Add the compressed DIAMOND results to the pre-existing 03-diamond-civ-rna output.
Add the compressed FASTQ to the pre-existing 025-dedup output.

No original data is touched. Only intermediate pipeline outputs are appended to. The original intermediate files are saved.
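
The save-then-append behaviour can be sketched like this. It is not the code in add-data.py: it assumes the intermediate files are gzip-compressed (the actual compression format is not stated here), and the function name and the .orig backup suffix are hypothetical. It relies on the fact that a gzip file may consist of several concatenated members, so opening in append mode adds data without rewriting what is already there.

```python
import gzip
import shutil
from pathlib import Path


def append_compressed(target, new_data, backup_suffix=".orig"):
    """Back up `target` once, then append `new_data` (bytes) as a new gzip member."""
    target = Path(target)
    backup = target.with_name(target.name + backup_suffix)
    # Keep the first backup so repeated runs never overwrite the true original.
    if not backup.exists():
        shutil.copy2(target, backup)
    with gzip.open(target, "ab") as fp:
        fp.write(new_data)
    return backup
```

Restoring the pipeline output is then just a matter of copying the backup over the modified file.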

Re-run

$ make rerun

Calls rerun-pipeline.py to re-run those two pipeline steps.

Deployment

The easiest/calmest way to deploy is to edit the sample's 06-stop/stop.sh script so that it neither removes the slurm-pipeline.running file nor creates slurm-pipeline.done. Instead, have it touch some other file, and wait for that file to show up. Then run make add and make rerun. Finally, mv slurm-pipeline.running slurm-pipeline.done and the sample will be considered done by monitor-run.py.
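
If you would rather not watch for the file by hand, the waiting and the final rename could be scripted along these lines. This is a sketch, not part of the repository: the function names, the polling interval and whatever marker file name you make stop.sh touch are all assumptions.

```python
import os
import time


def wait_for_file(path, poll_seconds=30.0, timeout=None):
    """Poll until `path` exists. Return False if `timeout` seconds pass first."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while not os.path.exists(path):
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
    return True


def mark_done(sample_dir):
    """Rename slurm-pipeline.running to slurm-pipeline.done in `sample_dir`."""
    os.rename(
        os.path.join(sample_dir, "slurm-pipeline.running"),
        os.path.join(sample_dir, "slurm-pipeline.done"),
    )
```

The intended order is: wait_for_file on the marker file, then make add and make rerun, then mark_done.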

Profit

The pipeline results look like this:

[image]

with "blue plots" like this:

[image]

terrycojones/jules-pipeline-hack