Skip to content

(bug fixed)--overlapped_out option has a memory allocation issue #600

@genommm

Description

@genommm

First I want to say I enjoy reading your code. It's well-designed, well-organized and readable. Thank you for sharing this with the world!
I've been working on customizing the code and came across a memory allocation problem. I think I tracked down the cause and a fix will be pull-requested soon.

More specifically, after making the program on Github CodeSpace, I ran the following command for an output that contains an --overlapped_out option. I got the output of the stats and an error at the end, which will abort the program before the output on another computer.

$ ./fastp -i ./testdata/R1.fq -I ./testdata/R2.fq -m -o ./testdata/outm1.fq -O ./testdata/outm2.fq --merged_out ./testdata/outmerged2.fq --overlapped_out ./testdata/outstrictmerged2.fq
Read1 before filtering:
total reads: 9
total bases: 1208
Q20 bases: 1078(89.2384%)
Q30 bases: 1005(83.1954%)

Read2 before filtering:
total reads: 9
total bases: 1359
Q20 bases: 1100(80.9419%)
Q30 bases: 959(70.5666%)

Merged and filtered:
total reads: 8
total bases: 1552
Q20 bases: 1336(86.0825%)
Q30 bases: 1256(80.9278%)

Filtering result:
reads passed filter: 16
reads failed due to low quality: 0
reads failed due to too many N: 0
reads failed due to too short: 2
reads with adapter trimmed: 0
bases trimmed due to adapters: 0
reads corrected by overlap analysis: 7
bases corrected by overlap analysis: 14

Duplication rate: 55.5556%

Insert size peak (evaluated by paired-end reads): 187

Read pairs merged: 8
% of original read pairs: 88.8889%
% in reads after filtering: 100%

free(): invalid pointer
Aborted (core dumped)

This seems to result from a duplicate deletion of pointers generated in src/peprocessor.cpp

line 462          Read* overlappedRead = new Read(r1->mName, new string(r1->mSeq->substr(max(0,ov.offset)), ov.overlap_len), r1->mStrand, new string(r1->mQuality->substr(max(0,ov.offset)), ov.overlap_len));

Since Read() takes in the string name, overlappedRead takes the same r1->mName and r1->mStrand as the r1 and or1 object created a while above. Then overlappedRead is recycled to be deleted later.

line 464-
                recycleToPool1(tid, overlappedRead);

At the cleaning up phase of the function processPairEnd(), or1 (and r1) are also recycled.

line 580-
        if(or1) {
            recycleToPool1(tid, or1);
            or1 = NULL;
        }

These two objects or1 and overlappedOut will provide pointers for the memory release. However, since their or1->mName and overlappedOut->mName are pointing to the same address (same for ->mStrand), they will be deleted twice, which will be captured by the system and throw the memory error.

A way to fix it is to designate overlappedRead->mName and overlappedRead->mStrand to nullptr /*before recycling*/ and then delete this read item right after use. A revised pull request is out now.

line 463-
                overlappedRead->appendToString(overlappedOut);
                overlappedRead->mName = nullptr;
                overlappedRead->mStrand = nullptr;
		delete overlappedRead;
                //recycleToPool1(tid, overlappedRead);

Happy to hear any feedback!

Best,
genommm

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions