Quoting of file paths #76

@mbhall88

Describe the bug
When a file path has "weird" characters like | in it, mashtree fails.

To Reproduce

Here is an example of my file of filenames

mycobacteria/Mycolicibacillus/kraken:taxid|1069220|NZ_AP022594.1.fa
mycobacteria/Mycolicibacillus/kraken:taxid|1069221|NZ_CP092365.1.fa
mycobacteria/Mycolicibacter/kraken:taxid|29314|NZ_AP022609.1.fa
mycobacteria/Mycolicibacter/kraken:taxid|2872309|NZ_CP084029.1.fa
mycobacteria/Mycolicibacter/kraken:taxid|2875777|NZ_CP084028.1.fa
mycobacteria/Mycolicibacter/kraken:taxid|1788|NZ_LT906469.1.fa

First up, I know, these are terrible filenames. Who in their right mind would name files this way? Well, there are a lot of them and I basically can't be bothered to change them. So you are well within your rights to tell me to buzz off :)

Here is an example of the command and output

$ mashtree --file-of-file all_myco.fofn --numcpus 8 --outtree tree.dnd
mashtree: main: Found mash version 2 - /home/mihall/sw/mambaforge/envs/classbench/bin/mash
mashtree: main: Temporary directory will be /tmp/MASHTREE.raIOT8
mashtree: main: mashtree on 1 files
mashtree: mashSketch(TID1): This thread will work on 137 sketches
mashtree: mashSketch(TID1): Working on file 1 out of 137
mashtree: mashSketch(TID2): This thread will work on 137 sketches
mashtree: mashSketch(TID2): Working on file 1 out of 137
sh: 1767: command not found
sh: 1767: command not found
mashtree: mashSketch(TID3): This thread will work on 136 sketches
mashtree: mashSketch(TID3): Working on file 1 out of 136
sh: NZ_AP024256.1.fa: command not found
sh: 1962118: command not found
sh: 1962118: command not found
sh: NZ_CP022235.1.fa: command not found
mashtree: mashSketch(TID1): ERROR running mash sketch -S 42 -k 21 -s 10000   -o /tmp/MASHTREE.raIOT8/kraken:taxid|1767|NZ_AP024256.1.fa mycobacteria/Mycobacterium/kraken:taxid|1767|NZ_AP024256.1.fa 2>&1!
  sh: NZ_AP024256.1.fa: command not found

Basically, I think word-splitting is causing Perl to think we're trying to pipe something? I don't know anything about Perl though, so I might be way off. Is there a concept of quoting file paths like in bash to avoid this behaviour?
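
To make the suspicion concrete, here is a minimal Python sketch of the same failure mode. It is not mashtree's code (mashtree is written in Perl) and `echo` merely stands in for `mash sketch`; both are assumptions made purely for illustration. It runs one of the paths above through a shell unquoted, then quoted, then with no shell at all.

```python
import shlex
import subprocess

# One of the paths from the file of filenames above.
path = "mycobacteria/Mycolicibacillus/kraken:taxid|1069220|NZ_AP022594.1.fa"

# Unquoted: the shell parses each | as a pipe, so it tries to run
# "1069220" and "NZ_AP022594.1.fa" as commands and the pipeline fails.
unquoted = subprocess.run(
    f"echo {path}", shell=True, capture_output=True, text=True
)

# Quoted with shlex.quote: the shell sees the whole path as one argument.
quoted = subprocess.run(
    f"echo {shlex.quote(path)}", shell=True, capture_output=True, text=True
)

# No shell involved: an argument list is passed to the program as-is,
# so there is nothing for a shell to misinterpret.
no_shell = subprocess.run(["echo", path], capture_output=True, text=True)

print("unquoted exit code:", unquoted.returncode)
print("unquoted stderr:", unquoted.stderr.strip())
print("quoted round-trips:", quoted.stdout.strip() == path)
print("no-shell round-trips:", no_shell.stdout.strip() == path)
```

The quoted and no-shell variants both hand the path over intact. Perl's list form of `system` is the analogous way to skip the shell, though whether that fits how mashtree captures the output of `mash` is not something this sketch checks.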

Expected behavior
Michael names his files like a sane person. Or mashtree is super kind and forgiving and knows Michael is an idiot but makes him feel better about himself by handling his crappy file paths.

Desktop (please complete the following information):

  • OS: Linux
  • Version: 1.2.0
  • Install method: conda
