  1. Daffodil
  2. DFDL-1045

TDML runner - hard to debug because line number information in diagnostics is incorrect


    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Normal
    • 1.0.0
    • s14
    • TDML Runner
    • None

      It is hard to debug TDML files because the file and line number information in diagnostic messages does not refer to the correct lines of the TDML file, but rather to a file and line in a generated temp file.

      The TDML runner uses a different loader than the one we use to read DFDL schemas. This is for whitespace preservation reasons (allows us to preserve whitespace by using CDATA bracketing).

      The DaffodilXMLLoader that we use to read DFDL schemas provides file and line number information, but will not supersede such information if it is already in the XML (I think). This allows one to read a TDML file, producing file and line number information, and then write out a temporary schema file from parts of the TDML file. The written schema will carry file and line number information pointing back at the originally loaded TDML file, which is what you want.

      However, the Constructing XML Parser that the TDML runner uses does not gather file and line number information (at least currently; we haven't put in the work to make it do this).

      The TDML runner generates a temp file containing the schema. Diagnostic messages then report file and line numbers in this temp file, when what you want is the TDML file and the line number within it.

      So we either:

      • make the constructing parser produce file and line number information, or
      • figure out how to get past the whitespace-preservation/CDATA issue differently, so that we can use the regular DaffodilXMLLoader instead of the constructing parser.
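
      To illustrate the first option, here is a minimal sketch in Python of recording a source line for each element while parsing. It is not Daffodil's loader or the constructing parser; it only shows the kind of information a parser would need to capture. The function name `element_lines` and the simplified, prefix-free TDML-like document are hypothetical.

```python
import xml.parsers.expat

def element_lines(xml_text):
    """Return (tag name, 1-based line number) for every start tag."""
    found = []
    parser = xml.parsers.expat.ParserCreate()

    def on_start(name, attrs):
        # expat exposes the position of the event currently being reported.
        found.append((name, parser.CurrentLineNumber))

    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return found

# Hypothetical, simplified stand-in for a TDML file with an embedded schema.
TDML_TEXT = """<testSuite>
  <defineSchema name="s1">
    <element name="e1"/>
  </defineSchema>
</testSuite>"""

for tag, line in element_lines(TDML_TEXT):
    print(f"{tag}: line {line}")
```

      With positions like these attached to the embedded schema elements, a diagnostic could name the TDML file and line rather than a line in the generated temp file.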

      One idea is to require significant whitespace to be inserted using Private-Use-Area (PUA) characters. So if you want a Carriage Return (CR) in your data, you would write the corresponding PUA character, which XML will not treat as whitespace but which we can remap to 0x0D in the data and/or infoset.
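
      A rough sketch of the PUA idea in Python follows. The mapping used here (add 0xE000 to the character's code point, so CR becomes U+E00D) is an assumption made for illustration, and the names `to_pua`, `from_pua`, and `read_data` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption for illustration: each significant whitespace character is moved
# into the Private Use Area by adding 0xE000 (CR, 0x0D, becomes U+E00D).
PUA_OFFSET = 0xE000
SIGNIFICANT = "\t\n\r"

TO_PUA = {ord(c): ord(c) + PUA_OFFSET for c in SIGNIFICANT}
FROM_PUA = {pua: plain for plain, pua in TO_PUA.items()}

def to_pua(text):
    """Replace TAB, LF, and CR with their PUA stand-ins."""
    return text.translate(TO_PUA)

def from_pua(text):
    """Map PUA stand-ins back to the real whitespace characters."""
    return text.translate(FROM_PUA)

def read_data(xml_text):
    """Parse a one-element document and restore whitespace in its text."""
    element = ET.fromstring(xml_text)
    return from_pua(element.text or "")

# The PUA character passes through the XML parser untouched, then is remapped.
print(repr(read_data("<document>a&#xE00D;b</document>")))
```

      Because the stand-in is an ordinary non-whitespace character as far as XML is concerned, no CDATA bracketing is needed to keep it, and each stand-in is still a single character, so lengths are unchanged.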

      We also already have some support for using DFDL character entities in TDML data, so "%CR;" could be used to represent a CR character in data. To represent it in the expected infoset, though, we would have to invert this projection, and the difficulty is that XML containing these DFDL character entities isn't "real XML": some individual characters would instead be represented by several characters, so lengths would all be off.
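
      The length problem can be seen in a small Python sketch. It expands only a handful of named DFDL character entities and ignores numeric forms and the "%%" escape; the table and the function name `expand_entities` are hypothetical, not the TDML runner's code.

```python
import re

# A small subset of DFDL named character entities, for illustration only.
DFDL_ENTITIES = {"NUL": "\x00", "HT": "\t", "LF": "\n", "CR": "\r", "SP": " "}
ENTITY_PATTERN = re.compile(r"%([A-Z]+);")

def expand_entities(text):
    """Replace %NAME; with the character it stands for."""
    def replace(match):
        name = match.group(1)
        if name not in DFDL_ENTITIES:
            raise ValueError(f"unknown DFDL entity: %{name};")
        return DFDL_ENTITIES[name]
    return ENTITY_PATTERN.sub(replace, text)

written = "a%CR;%LF;b"
actual = expand_entities(written)
print(len(written), len(actual))  # the written form is longer than the data
```

      Each "%CR;" occupies four characters in the written form but one in the data, so any length or offset computed over the XML text disagrees with the real data.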

      If we want others to use TDML to report bugs and such, we need to fix this line number issue.

              efinnegan Elizabeth Finnegan
              mbeckerle.dfdl Mike Beckerle