If you are having a problem with Daffodil and think you may have found a bug, we suggest the steps described in the sections below.

Confidential? FOUO?

Some users are working on DFDL schemas for company-confidential or military For-Official-Use-Only data formats (such as NATO STANAG 5516). For these users, we have a non-public support mailing address, daffodil-fouo-support@tresys.com, which goes to the support team at Tresys.

Check JIRA to See if your Issue is Already There

First, search our JIRA tickets to see if the problem is already recorded.

Here's a list of all tickets about bugs, new features, and improvements, in reverse chronological order (most recent first). You may want to change the issue type or status to narrow down the list, but most commonly you would just put some search keywords into the search box.

If you find a bug or a closely related issue that is still open, you can add your information to it as a comment rather than creating a new issue. Just knowing that another person has run into the issue helps us assign fix priorities.

If you do not easily find an existing issue, either create a new JIRA bug, email the issue to daffodil-users@oss.tresys.com, or ask about it in the Daffodil XMPP Chat Room. Additionally, creating a TDML file can greatly help us reproduce and resolve the issue.

Create a TDML File that Illustrates the Issue

A TDML file is often useful just to ask a question about how something in DFDL works, for example, to get a clarification. It allows for a level of precision that is often lacking, but also often required when discussing complex data format issues.

By far the best way to report a bug is to create a TDML test file that demonstrates the problem.

TDML stands for "Test Data Markup Language". It lets you specify a DFDL schema, the test data, and the expected result or expected error/diagnostic messages, all in a single self-contained XML file. IBM started TDML to capture tests for their own DFDL implementation. Daffodil adopted it and has since extended it a bit, though there is an effort to reconcile TDML dialects so that all implementations can run the same tests.

By convention, a TDML file uses the file extension ".tdml" or, more commonly now, ".tdml.xml", which enables the TDML "tutorial" features to work.

Below is an annotated TDML file for a very simple example:

Code Block
languagexml
titleBug Report Template
<?xml version="1.0" encoding="ASCII"?>

<tdml:testSuite 
  suiteName="Bug Report TDML Template" 
  description="Illustration of TDML for bug reporting."
  xmlns:tdml="http://www.ibm.com/xmlns/dfdl/testData" 
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
  xmlns:xml="http://www.w3.org/XML/1998/namespace"
  xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
  xmlns:xs="http://www.w3.org/2001/XMLSchema" 
  xmlns:ex="http://example.com" 
  xmlns:gpf="http://www.ibm.com/dfdl/GeneralPurposeFormat"
  xmlns:daf="urn:ogf:dfdl:2013:imp:opensource.ncsa.illinois.edu:2012:ext" 
  xmlns="http://www.w3.org/1999/xhtml" 
  xsi:schemaLocation="http://www.ibm.com/xmlns/dfdl/testData tdml.xsd"
  defaultRoundTrip="false">
  
  <!-- 
  This example TDML file is for a self-contained bug report. 
  
  It shows a parse test, and similar unparse test.
  -->

  <tdml:defineSchema name="bug01Schema" elementFormDefault="unqualified">

    <xs:import 
      namespace="http://www.ibm.com/dfdl/GeneralPurposeFormat" 
      schemaLocation="IBMdefined/GeneralPurposeFormat.xsd" /> 
     
    <dfdl:defineFormat name="myFormat">
      <dfdl:format ref="gpf:GeneralPurposeFormat"
        lengthKind="implicit" 
        representation="text" 
        encoding="ASCII" 
        initiator="" 
        terminator="" 
        separator="" />
    </dfdl:defineFormat>
 
    <dfdl:format ref="ex:myFormat" />

    <xs:element name="myTestRoot" type="xs:dateTime" 
      dfdl:calendarPattern="MM.dd.yyyy 'at' HH:mm:ssZZZZZ" 
      dfdl:calendarPatternKind="explicit"
      dfdl:lengthKind="delimited" 
      dfdl:terminator="%NL;" />
 
  </tdml:defineSchema>
  
  <tdml:parserTestCase name="dateTimeTest" root="myTestRoot" model="bug01Schema" 
    description="A hypothetical bug illustration about parsing a date time.">
   
    <tdml:document>
      <tdml:documentPart type="text" 
        replaceDFDLEntities="true"><![CDATA[04.02.2013 at 14:00:56 GMT-05:00%LF;]]></tdml:documentPart>
    </tdml:document>

    <tdml:infoset>
      <tdml:dfdlInfoset>
        <ex:myTestRoot>2013-04-02T14:00:56.000000-05:00</ex:myTestRoot>
      </tdml:dfdlInfoset>
    </tdml:infoset>
     
  </tdml:parserTestCase>

  <tdml:unparserTestCase name="unparseDateTimeTest" root="myTestRoot" model="bug01Schema" 
    description="Another bug illustration, this time unparsing direction.">

    <tdml:infoset>
      <tdml:dfdlInfoset>
        <ex:myTestRoot>2013-04-02T14:00:56.000000-05:00</ex:myTestRoot>
      </tdml:dfdlInfoset>
    </tdml:infoset>

    <tdml:document>
      <tdml:documentPart type="text" 
        replaceDFDLEntities="true"><![CDATA[04.02.2013 at 14:00:56-05:00%CR;%LF;]]></tdml:documentPart>
    </tdml:document>
       
  </tdml:unparserTestCase>
   
</tdml:testSuite>

Suppose you save the above as a file named "myDateTimeBug.tdml". You can then run it using the Daffodil Command Line Interface.

Code Block
daffodil test myDateTimeBug.tdml

The Infoset element that contains the expected result may need to contain characters that are not legal in XML documents. Daffodil remaps these characters into legal XML characters in the Unicode Private Use Area. See Daffodil and the DFDL Infoset for details.
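As a rough illustration of such a remapped character, the fragment below shows an expected infoset for a string that contained a NUL in the data. It assumes that the remapping adds 0xE000 to the code point of the illegal character, so that U+0000 appears as U+E000; that detail and the element name ex:myString are assumptions made for this sketch, so check the Daffodil and the DFDL Infoset page before relying on them.

```xml
<tdml:infoset>
  <tdml:dfdlInfoset>
    <!-- Hypothetical element. The data held "abc", a NUL (U+0000), then "def".
         Assumption: NUL is remapped to the Private Use Area character U+E000. -->
    <ex:myString>abc&#xE000;def</ex:myString>
  </tdml:dfdlInfoset>
</tdml:infoset>
```

A numeric character reference is used here so that the remapped character stays visible when the file is viewed in an editor.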

Specifying Data in Text, Hex, Bits, or External File

Test data does not have to be specified as text.

You can specify the test data in hexadecimal, in individual bits, or you can direct Daffodil to find the data in an external file.

These options are illustrated here. You change the way the tdml:document element is specified so that it contains tdml:documentPart child elements:

Code Block
languagehtml/xml
    <tdml:document>

        <!--
          A document part with type="text" is text. Use CDATA to avoid whitespace changes.

          So in the example below, the line endings after '250;' and after '967;' are intentional
          parts of the data, illustrating that whitespace is preserved when you use
          CDATA bracketing.

          If you care exactly which kind of line ending is used, then you 
          can use DFDL character entities to insert a %CR;, a %LF;, or both. In this example,
          because the whitespace is expressed as whitespace, it depends on the platform where
          you edit this file whether the line ending is an LF (Unix convention) or a 
          CRLF (MS Windows convention).

          If you want to use DFDL character entities, you must turn on the 
          replaceDFDLEntities="true" feature of the documentPart element. 
        -->

      <tdml:documentPart type="text"><![CDATA[quantity:250;
hardnessRating:967;
]]></tdml:documentPart>

      <!-- 
          In 'text', both XML character entities and DFDL's own character entities are interpreted.
          XML character entities are recognized only outside of CDATA sections, so this part
          does not use CDATA.

          So here is a NUL-terminated string that contains a date with some Japanese Kanji characters.
          The Japanese characters are expressed using XML numeric character entities. The NUL termination
          is expressed using a DFDL character entity.

          In this example one has no choice but to use a DFDL character entity. The NUL character (which has character
          code zero) is not allowed in XML documents, not even using an XML character entity. So you 
          have to write '%NUL;' or '%#x00;' to express it using DFDL character entities.
        -->

      <tdml:documentPart type="text" 
          replaceDFDLEntities="true">1987&#x5E74;10&#x6708;&#x65e5; BCE%NUL;</tdml:documentPart>

      <!--
          Type 'byte' means use hexadecimal to specify the data. Freeform whitespace is allowed. 
          Actually, any character that is not a-zA-Z0-9 is ignored. So you can use "." or "-" to separate
          groups of hex digits if you like.
       -->
 
      <tdml:documentPart type="byte">
            9Abf e4c3
            A5-E9-FF-00
      </tdml:documentPart>
      
       <!--
          Type 'bits' allows you to specify individual 0 and 1. Any character other than 0 or 1 is ignored.
           
          The number of bits does not have to be a multiple of 8. That is, whole bytes are not required.
         -->

       <tdml:documentPart type="bits">
            1.110 0.011 1 First 5 bit fields.
       </tdml:documentPart>

       <!--
          Type 'file' means the content is the name of a file from which to read the data.
         -->
  
       <tdml:documentPart type="file">/some/directory/testData.in.dat</tdml:documentPart>

    </tdml:document>

Further details on TDML will go in a more detailed guide/page about writing TDML.

If you use the external schema file or external data file capabilities, then of course you need to send those files along with your TDML.
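A minimal sketch of a test case that keeps everything in external files might look like the following. It assumes that the model attribute may name a schema file instead of an embedded tdml:defineSchema; the three file names are hypothetical, and how relative paths are resolved should be verified for your setup.

```xml
<!-- Hypothetical file names. Send all three files along with the TDML file. -->
<tdml:parserTestCase name="externalFilesTest" root="myTestRoot"
  model="myDateTimeBug.dfdl.xsd"
  description="Sketch of a test whose schema, data, and expected infoset live in separate files.">

  <tdml:document>
    <tdml:documentPart type="file">myDateTimeBug.in.dat</tdml:documentPart>
  </tdml:document>

  <tdml:infoset>
    <tdml:dfdlInfoset type="file">myDateTimeBug.expected.xml</tdml:dfdlInfoset>
  </tdml:infoset>

</tdml:parserTestCase>
```

Keeping the files next to the TDML file, rather than using absolute paths, makes the report easier for someone else to run.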

Specifying the Infoset in XML or an External File

The infoset can be provided either as inline XML or as a path to an external file, selected by the type attribute of the tdml:dfdlInfoset element. If the attribute is not provided, it defaults to an inline XML infoset.

Code Block
languagexml
   <tdml:infoset>
     <tdml:dfdlInfoset type="infoset">
       <ex:myTestRoot>2013-04-02T14:00:56.000000-05:00</ex:myTestRoot>
     </tdml:dfdlInfoset>
   </tdml:infoset>


   <tdml:infoset>
     <tdml:dfdlInfoset type="file">/some/directory/testData.in.xml</tdml:dfdlInfoset>
   </tdml:infoset>

...

A poor or missing diagnostic message is a bug just as much as a broken feature. So one can also create negative tests, i.e., tests that expect errors.

To do this, replace the tdml:infoset element with a tdml:errors element:

Code Block
languagehtml/xml
<tdml:errors>
   <tdml:error>Schema Definition Error</tdml:error>
   <tdml:error>testElementName</tdml:error>
</tdml:errors>

Each tdml:error child element contains a substring that must be found somewhere in the set of diagnostic messages produced by the test. The comparison is case-insensitive.
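Put together, a complete negative test against the bug01Schema from the template above could be sketched as follows. The test name and the data are hypothetical, and the sketch assumes that Daffodil's diagnostic for data that does not match the calendar pattern contains the text "Parse Error".

```xml
<tdml:parserTestCase name="dateTimeBadDataTest" root="myTestRoot" model="bug01Schema"
  description="Hypothetical negative test: the data does not match the calendar pattern.">

  <tdml:document>
    <tdml:documentPart type="text"
      replaceDFDLEntities="true"><![CDATA[not a date%LF;]]></tdml:documentPart>
  </tdml:document>

  <!-- Replaces tdml:infoset. Each substring must appear in the diagnostics. -->
  <tdml:errors>
    <tdml:error>Parse Error</tdml:error>
  </tdml:errors>

</tdml:parserTestCase>
```

Listing only short, stable substrings keeps the test from breaking when the wording of a message changes slightly.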

Final detail: In order for a positive test with an Infoset to pass, it must consume all the data. Otherwise the test will not pass, and will fail with a message about 'left over data'.
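To make the left-over-data rule concrete, here is a sketch of a test that is expected not to pass, again using the bug01Schema from the template. It assumes the first line parses successfully; the text after the line ending is hypothetical extra data that the myTestRoot element never consumes.

```xml
<tdml:parserTestCase name="leftOverDataTest" root="myTestRoot" model="bug01Schema"
  description="Hypothetical: data remains after the root element is parsed, so the test should not pass.">

  <tdml:document>
    <!-- Everything after %LF; is not consumed by myTestRoot. -->
    <tdml:documentPart type="text"
      replaceDFDLEntities="true"><![CDATA[04.02.2013 at 14:00:56-05:00%LF;unconsumed trailing text]]></tdml:documentPart>
  </tdml:document>

  <tdml:infoset>
    <tdml:dfdlInfoset>
      <ex:myTestRoot>2013-04-02T14:00:56.000000-05:00</ex:myTestRoot>
    </tdml:dfdlInfoset>
  </tdml:infoset>

</tdml:parserTestCase>
```

Trimming the document so that it contains exactly the bytes the root element should consume is usually the simplest way to get such a test to pass.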

 

...

This page has moved to https://daffodil.apache.org/community/