  1. Daffodil
  2. DFDL-455

Performance: Eliminate repeated scanning for delimiter that was already found once.


    • Improvement
    • Resolution: Fixed
    • Critical
    • s13
    • s6
    • Back End
    • None

      Parsers that scan for delimiters to determine the length of an element do so using a complex regular expression that embodies all the logic about matching the longest delimiter pattern.

      Right now, this information is discarded. A subsequent parser step, the StaticText primitive, picks up at the position just after the length determined above and matches just the delimiter, all over again.

      Besides being slow (which might or might not be noticeable in real formats), this is confusing for a user who is trying to figure out what is going on in the parser, for example in any sort of detailed debugger or trace: the parser appears to redo work it has already done.

      So, primitives that scan for delimiters should save both the element (into the infoset) and the text of the delimiter match (in a special slot of the PState). Primitives like StaticText should look for this saved delimiter. If it is present, their work is done and they should just advance past the delimiter. If it is not present, they should do what they do now.
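The proposal above can be illustrated with a minimal sketch. This is not Daffodil's actual code: `SketchState`, `foundDelimiter`, `parseDelimitedElement`, `parseStaticText`, and the `scans` counter are hypothetical names invented for illustration, and the delimiter regex is a deliberately simple stand-in for the real longest-match expression.

```scala
import java.util.regex.Pattern

// Hypothetical, simplified stand-in for the parser state (PState).
// Only the fields needed for this illustration are modeled.
final case class SketchState(
  input: String,
  pos: Int,
  infoset: Vector[String] = Vector.empty,
  foundDelimiter: Option[String] = None // the proposed "special slot"
)

object DelimiterReuseSketch {
  // Counts regex scans so that the saving is observable.
  var scans = 0

  // Simple longest-match alternation: longer delimiters are tried first.
  private def delimiterRegex(delims: Seq[String]): Pattern =
    Pattern.compile(
      delims.sortBy(d => -d.length).map(d => Pattern.quote(d)).mkString("|"))

  // Scans for a delimiter to determine the element's length, adds the
  // element to the infoset, and saves the matched delimiter text.
  def parseDelimitedElement(st: SketchState, delims: Seq[String]): Option[SketchState] = {
    scans += 1
    val m = delimiterRegex(delims).matcher(st.input)
    if (m.find(st.pos))
      Some(st.copy(
        pos = m.start(),
        infoset = st.infoset :+ st.input.substring(st.pos, m.start()),
        foundDelimiter = Some(m.group())))
    else None
  }

  // StaticText-like step: reuses the saved delimiter when present,
  // otherwise falls back to matching the delimiter at the current position.
  def parseStaticText(st: SketchState, delims: Seq[String]): Option[SketchState] =
    st.foundDelimiter match {
      case Some(d) =>
        // Work already done: just advance past the delimiter and clear the slot.
        Some(st.copy(pos = st.pos + d.length, foundDelimiter = None))
      case None =>
        scans += 1
        val m = delimiterRegex(delims).matcher(st.input)
        m.region(st.pos, st.input.length)
        if (m.lookingAt()) Some(st.copy(pos = m.end())) else None
    }
}
```

In this sketch, parsing `"ab,,cd"` with delimiters `","` and `",,"` runs the regex once: `parseDelimitedElement` records the element `ab` and saves `,,`, and `parseStaticText` then advances past it without scanning again. Clearing the slot after use keeps a stale delimiter from being consumed by a later primitive.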

              efinnegan Elizabeth Finnegan
              mbeckerle.dfdl Mike Beckerle
              Votes: 0
              Watchers: 4

                Created:
                Updated:
                Resolved:

                  Estimated: Not Specified
                  Remaining: Not Specified
                  Logged: 28m