Daffodil / DFDL-1721

Parser removes more than one NL.


      There is a DFDL schema for a format called Praat TextGrid, hosted on GitHub. It was developed using Daffodil.

      IBM ported it to IBM DFDL, ran into a difficulty with the final terminating linefeed, and submitted a modification to the schema via a pull request:



      The DFDL schema for the Praat TextGrid format has two linefeeds at the end. IBM had to add terminator="%NL;" to the root element to absorb the final NL. This final terminator was not needed when Daffodil parsed the data from the command line via the Daffodil CLI, following their instructions here:


      This is either because the Daffodil CLI doesn't complain about left-over data, or because Daffodil is absorbing this extra line ending.

      Absorbing the extra line ending is a bug. Leaving it there (unconsumed) isn't necessarily a bug, but CLI users would likely expect an error message in that case, so that would be a separate bug.
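
One way to tell the two hypotheses apart is to compare the final parse position with the length of the input. The following is a minimal toy sketch in Python, not Daffodil code: `parse_line` and its `greedy_nl` flag are hypothetical names invented here to model "consume exactly one terminator" versus "absorb every trailing linefeed".

```python
from typing import Tuple


def parse_line(data: bytes, pos: int, greedy_nl: bool = False) -> Tuple[bytes, int]:
    """Toy parser (hypothetical): read one line ending in a single linefeed.

    With greedy_nl=True it models the suspected bug and also swallows every
    consecutive linefeed that follows the terminator.
    """
    end = data.index(b"\n", pos)      # content runs up to the first linefeed
    content = data[pos:end]
    new_pos = end + 1                 # consume exactly one terminator
    if greedy_nl:
        # Suspected buggy behavior: keep absorbing trailing linefeeds.
        while data[new_pos:new_pos + 1] == b"\n":
            new_pos += 1
    return content, new_pos


data = b"last line\n\n"               # input that ends in two linefeeds

_, strict_pos = parse_line(data, 0)
_, greedy_pos = parse_line(data, 0, greedy_nl=True)

strict_leftover = len(data) - strict_pos   # one linefeed left unconsumed
greedy_leftover = len(data) - greedy_pos   # nothing left: extra linefeed absorbed
```

If the real parser ends with one byte of unconsumed data, the issue is in how the CLI reports left-over data. If it ends exactly at the end of the input, the parser itself is absorbing the extra line ending.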

      It is important that API users be able to parse data, and leave the data stream positioned at the end of a parse without checking that all data was consumed. This allows a streaming-style behavior where repeated parse calls can advance through data.
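
The streaming-style behavior described above can be sketched as follows. This is a toy model in Python under stated assumptions, not the Daffodil API: `parse_record` and `parse_all` are hypothetical helpers, and a "record" here is simply one linefeed-terminated line. The point is that each call returns the position where it stopped, and the caller resumes from there without the parser ever insisting that the whole input was consumed.

```python
from typing import List, Tuple


def parse_record(data: bytes, pos: int) -> Tuple[str, int]:
    """Parse one linefeed-terminated record starting at pos (hypothetical helper).

    Returns the record text and the position just past its terminator.
    Data after that position is left untouched for the next call.
    """
    end = data.find(b"\n", pos)
    if end == -1:
        raise ValueError(f"no terminator found after position {pos}")
    return data[pos:end].decode("ascii"), end + 1


def parse_all(data: bytes) -> List[str]:
    """Advance through the data with repeated parse calls."""
    records = []
    pos = 0
    while pos < len(data):
        record, pos = parse_record(data, pos)
        records.append(record)
    return records


records = parse_all(b"alpha\nbeta\ngamma\n")
```

A single call such as `parse_record(b"alpha\nbeta\n", 0)` stops after the first record and leaves the rest in place, which is exactly the situation a CLI might want to flag but an API caller relies on.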

      But when using a CLI tool, as when using the TDML runner, extra data past the end of the parse should generally be an error, though one can argue the CLI should have an option to suppress this.
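
A CLI-side check of this kind could look roughly like the sketch below. It is written in Python purely for illustration and assumes nothing about the actual Daffodil CLI: `LeftoverDataError`, `check_fully_consumed`, and the `allow_leftover` option are hypothetical names standing in for "error by default, with an option to suppress".

```python
class LeftoverDataError(Exception):
    """Raised when a parse ends before the end of the input (hypothetical)."""


def check_fully_consumed(data_len: int, final_pos: int,
                         allow_leftover: bool = False) -> int:
    """Return the number of unconsumed bytes after a parse.

    By default any left-over data is an error, as a CLI or test runner
    would want. Passing allow_leftover=True suppresses the error.
    """
    leftover = data_len - final_pos
    if leftover > 0 and not allow_leftover:
        raise LeftoverDataError(
            f"{leftover} byte(s) of unconsumed data after position {final_pos}"
        )
    return leftover
```

Keeping this check outside the parser itself preserves the API behavior: the parse call only reports where it stopped, and each front end decides whether stopping early is an error.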

      We need to reproduce this issue, and clarify whether there is a CLI bug here, or a general Daffodil parse-behavior bug.

              efinnegan Elizabeth Finnegan
              mbeckerle.dfdl Mike Beckerle

