
How would I construct a DFDL element to read in a stream of binary data until a terminating sequence of bytes occurs? For example, a stream that is terminated by 0xFFDA, 0xFFC0 or 0xFFD9.

I was able to partially achieve my aim (i.e. terminate on a single terminator) with the following:

<xs:element name="foo" type="xs:hexBinary" dfdl:length="delimited" dfdl:outputNewLine="%CR;%LF;" dfdl:terminator="%#rff;%#rda;" />

 

I have no idea why I had to set outputNewLine, but Daffodil threw an error when it was not set.


    2 answers

    1.  

      I suspect you meant dfdl:lengthKind="delimited"? dfdl:length is only used when dfdl:lengthKind="explicit", in which case the value of dfdl:length must be either a number or a DFDL expression.

      The issue with dfdl:lengthKind="delimited" is that the 0xFFXX delimiters will be consumed along with the data. When parsing JPEG (which I assume is what you're discussing), you will often need those 0xFFXX markers to determine how to parse the data that follows. dfdl:lengthKind="delimited" will consume those delimiters, so you cannot do that.

      What you really want is something like dfdl:lengthKind="pattern", and you specify a regular expression that will consume everything up to those special 0xFFXX markers.

      When working with others who are working on creating a DFDL schema for JPEG, we recommended something like the following:

      <xs:element name="foo" type="xs:hexBinary" dfdl:lengthKind="pattern" dfdl:lengthPattern=".*?(?=\xFF[\x01-\xFE])" dfdl:encoding="ISO-8859-1" />

      This specifies a pattern to consume all data as xs:hexBinary up until one of the 0xFFXX markers, but will not consume the 0xFFXX marker itself. You can then follow this with an element to consume the 2-byte marker followed by a choice that discriminates on the value of that marker to determine how to continue processing.
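The behavior of that pattern can be tried outside of Daffodil. The following is a minimal sketch in Python, not DFDL: it assumes Python's `re` module treats the lazy quantifier and the lookahead the same way as the regex engine Daffodil uses, and the function name `split_at_marker`, the `MARKER_NAMES` table and the sample bytes are hypothetical, chosen only for illustration. It shows that the match stops just before the marker, so the marker is still available to decide what to parse next.

```python
import re

# Same pattern as in the answer: lazily match bytes up to, but not
# including, a 0xFF byte followed by a byte in the range 0x01-0xFE.
LENGTH_PATTERN = re.compile(rb".*?(?=\xFF[\x01-\xFE])")

# Hypothetical lookup table standing in for the choice on the marker value.
MARKER_NAMES = {
    b"\xFF\xC0": "SOF0",
    b"\xFF\xDA": "SOS",
    b"\xFF\xD9": "EOI",
}


def split_at_marker(data):
    """Return (payload, marker, rest), or None if no marker is found."""
    m = LENGTH_PATTERN.match(data)
    if m is None:
        return None
    payload = m.group(0)                 # bytes before the marker
    marker = data[m.end():m.end() + 2]   # the 2-byte marker, not consumed
    rest = data[m.end() + 2:]            # whatever follows the marker
    return payload, marker, rest


# Sample data without any newline bytes (0x0A), since "." would not match them.
sample = bytes([0x01, 0x02, 0x03, 0xFF, 0xDA, 0x10])
payload, marker, rest = split_at_marker(sample)
print(payload.hex(), marker.hex(), MARKER_NAMES.get(marker, "unknown"))
```

Because the character class excludes 0x00, a 0xFF byte followed by 0x00 inside the payload is not treated as a marker and stays part of the matched data.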

      Also, note that we literally just added support for dfdl:lengthKind="pattern" with xs:hexBinary types today (Nov 23, 2016), so you'll need the latest 2.0.0-SNAPSHOT of Daffodil for this to work.

      1. Steve Lawrence

        Minor update to the above. The dot (.) in the dfdl:lengthPattern property does not match newlines. So if the hexBinary data contains newlines, it will fail to match. The correct pattern should be:

        dfdl:lengthPattern="[\x00-\xFF]*?(?=\xFF[\x01-\xFE])"
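The difference between the two patterns can be checked with a short Python sketch. This uses Python's `re` module rather than Daffodil itself, on the assumption that the dot behaves comparably with respect to the newline byte 0x0A; the sample bytes are hypothetical.

```python
import re

# Original pattern: "." does not match a newline byte (0x0A).
DOT_PATTERN = re.compile(rb".*?(?=\xFF[\x01-\xFE])")

# Corrected pattern: an explicit class covering every byte value.
CLASS_PATTERN = re.compile(rb"[\x00-\xFF]*?(?=\xFF[\x01-\xFE])")

# Hypothetical payload containing a newline byte, followed by an EOI marker.
data = bytes([0x01, 0x0A, 0x02, 0xFF, 0xD9])

dot_match = DOT_PATTERN.match(data)
class_match = CLASS_PATTERN.match(data)

print(dot_match)                    # None: the dot cannot cross the 0x0A byte
print(class_match.group(0).hex())   # 010a02: everything before the marker
```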
    2.  

      I successfully used Daffodil-2.0.0-SNAPSHOT to extract multiple JPEG images from a NITF file with the caveat that I could only extract SOI, FRAME (as a blob) and EOI info. 

      A JPEG FRAME contains multiple fragments delimited by marker codes. The existence and location of the fragments is not fixed.

      I tried creating a FRAME element with a length calculated using the lengthPattern property, and then embedding multiple fragment elements. Each fragment element uses a lengthPattern property to detect the end of the fragment and lengthKind="endOfParent" to determine when to stop reading fragments for a frame.

      Daffodil reported an error that element FRAME and its children must have text representation in order for pattern-based length and scanability. I tried setting the encoding to ISO-8859-1, but I could not fix this issue. Maybe a rookie error?

       

      nitf-13.dfdl.xsd

      i62_3311a.ntf

      config.xml


      1. Steve Lawrence

        dfdl:lengthKind="pattern" is only supported on complex types when the text is "scannable". By scannable, we mean that dfdl:representation="text" and the dfdl:encoding of all children are the same and known at schema compile time (i.e. not an expression). Since your schema uses dfdl:representation="binary", dfdl:lengthKind="pattern" is not supported there, and you will get a compile time error.

        Also note that dfdl:lengthKind="endOfParent" is not yet implemented in Daffodil. The fact that Daffodil does not report this as an error is a bug. Currently, if you specify dfdl:lengthKind="endOfParent" on a complex type, it looks like Daffodil just treats it as if it were dfdl:lengthKind="implicit". DFDL-1664 has been created to track this issue.

        I suspect the solution to your problem is to move the dfdl:lengthKind="pattern" and dfdl:lengthPattern properties to a simple element of type xs:hexBinary.
