Issue parsing linebreaks in segments #134

Closed
@johanib

Hi,

I'm not an EDIFACT expert, and I'm currently struggling with an issue related to parsing linebreaks.

As far as I know, EDIFACT uses the ' character as the segment terminator, so whether or not the EDI file contains linebreaks should not affect how the file is parsed.

However, this change, specifically the linked line, is giving me trouble:

7cea68c6#diff-1e623c61c2badd421334ac19660878a0a400c822feebe3c2984769b8f0fda401R483

$string = (string) \preg_replace(
    '/(([^' . $this->symbRel . ']' . $this->symbRel . '{2})+|[^' . $this->symbRel . '])' . $this->symbEnd . '|[\r\n]+/',
    '$1' . $this->stringSafe,
    $string
);

With the added |[\r\n]+ alternative, a linebreak is treated like a segment end, so the second line below is no longer parsed as part of the FTX segment:

FTX+AAI+++THE SHIPPER SHALL NOT BE RESPONSIBLE FOR ANY COSTS/DELAYS OCCUR
:DUE TO INTERVENTION OF CUSTOMS.'

Reverting the change by changing:

$terminatorRegex = '/(([^'.$this->symbRel.']'.$this->symbRel.'{2})+|[^'.$this->symbRel.'])'.$this->symbEnd.'|[\r\n]+/';

to

$terminatorRegex = '/(([^'.$this->symbRel.']'.$this->symbRel.'{2})+|[^'.$this->symbRel.'])'.$this->symbEnd.'/';

fixes my testReadsLinebreakedSegments test, but of course breaks \EDITest\AnalyserTest::testProcess.


Input edi:

UNB+UNOB:2+CARRIER+RECEIVER-ID+999818:999+251'
UNH+0001+IFTMBC:D:00B:UN'
BGM+770+AAA99970929+9'
TSR+30+2:::2'
FTX+AAI+++THE SHIPPER SHALL NOT BE RESPONSIBLE FOR ANY COSTS/DELAYS OCCUR
:DUE TO INTERVENTION OF CUSTOMS.'
UNT+10+0001'
UNZ+1+251'

My test:

    public function testReadsLinebreakedSegments()
    {
        $p = new Parser();
        $p->load(__DIR__ . '/../files/example_linebreak.edi');
        $r = new Reader($p);

        $line = $r->readEdiDataValue(['FTX', [1 => 'AAI']], 4);

        self::assertSame(
            [
                'THE SHIPPER SHALL NOT BE RESPONSIBLE FOR ANY COSTS/DELAYS OCCUR',
                'DUE TO INTERVENTION OF CUSTOMS.',
            ],
            $line
        );
    }

Currently fails with:

4) EDITest\ReaderTest::testReadsLinebreakedSegments
Failed asserting that 'THE SHIPPER SHALL NOT BE RESPONSIBLE FOR ANY COSTS/DELAYS OCCUR' is identical to Array &0 (
    0 => 'THE SHIPPER SHALL NOT BE RESPONSIBLE FOR ANY COSTS/DELAYS OCCUR'
    1 => 'DUE TO INTERVENTION OF CUSTOMS.'
).

Can anyone help me with this?
Is my example input valid according to the spec, i.e. should it be parseable?
If so, what would be needed to handle this scenario without introducing regression?
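
To make the question concrete, here is a minimal standalone sketch of the behaviour I would expect. It assumes that linebreaks never carry meaning and can simply be dropped before splitting on the segment terminator. The function name splitSegments and its parameters are hypothetical and not part of the library; the defaults mirror the usual ' terminator and ? release character.

```php
<?php

/**
 * Hypothetical helper, not part of the library: split a raw EDIFACT
 * string into segments while ignoring linebreaks entirely.
 *
 * @return string[] segments without their terminator
 */
function splitSegments(string $edi, string $symbEnd = "'", string $symbRel = '?'): array
{
    // Assumption: CR/LF are not significant, so remove them before splitting.
    $edi = str_replace(["\r", "\n"], '', $edi);

    $end = preg_quote($symbEnd, '/');
    $rel = preg_quote($symbRel, '/');

    // A segment is a run of ordinary characters or "release + any character"
    // pairs, followed by a terminator that is not escaped.
    $pattern = '/((?:[^' . $rel . $end . ']|' . $rel . '.)*)' . $end . '/s';

    preg_match_all($pattern, $edi, $matches);

    // Anything after the last terminator is dropped in this sketch.
    return $matches[1];
}

$edi = "TSR+30+2:::2'\n"
     . "FTX+AAI+++THE SHIPPER SHALL NOT BE RESPONSIBLE FOR ANY COSTS/DELAYS OCCUR\n"
     . ":DUE TO INTERVENTION OF CUSTOMS.'\n"
     . "UNT+10+0001'\n";

print_r(splitSegments($edi));
```

With this input the FTX segment comes back as a single string containing both lines joined at the : separator. The sketch would not work for files that rely on a linebreak instead of ' to end a segment, and it has not been checked against \EDITest\AnalyserTest::testProcess.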

For reference: https://github.com/johanib/edifact/pull/2/files
