...
Mapping of DFDL Infoset to Daffodil JDOM Infoset and to Scala XML Nodes
DFDL Infoset | Daffodil's JDOM XML Infoset | Scala scala.xml.Node Infoset |
---|---|---|
Document Information Item | JDOM Document | The document is represented by the root element. There is no separate document item. |
root | getRootElement() | none |
dfdlVersion | attribute daf:dfdlVersion on the root element. (Not yet implemented) | none |
schema (reserved for future use) | daf:schema attribute (No implementation) | none |
unicodeByteOrderMark | attribute daf:unicodeByteOrderMark on the root element. (Not yet implemented) | same attribute scheme as JDOM |
Element Information Item | JDOM Element | scala.xml.Elem |
namespace | getNamespace(): org.jdom.Namespace | def namespace: String |
name | getName(): String | def name: String |
document | getDocument() | none (see parent) |
datatype | attribute xsi:type with value one of the set of XML Schema simple type QNames that are in the DFDL subset of XML Schema. For example: xsi:type='xs:string'. By convention, the prefixes 'xsi' and 'xs' denote here the usual standard namespace URIs. (Not yet implemented) | same attribute scheme as JDOM |
dataValue | For simple types other than xs:string, the canonical XML representation of the value, as returned by getText(). For type xs:string, the DFDL Infoset allows representation of characters that are illegal in XML. These are represented by replacing them with characters in the Unicode Private Use Area by a scheme described below. | def text: String to obtain canonical text. Values containing XML-illegal characters use the same scheme. |
nilled | xsi:nil='true' attribute on element. Absence of this attribute implies 'false' | Same attribute xsi:nil |
children | getChildren() | def child: Node* |
parent | getParent() | none. Scala XML nodes are immutable and do not have parent references. This allows nodes to be shared. |
schema | A special attribute daf:schemaComponentID has a value which can be used to retrieve the associated schema component. (Not yet implemented. Note: requires a means to create a standard Schema Component Designator or SCD) | Same attribute scheme |
valid | daf:valid='true' means the data has been tested and is valid; daf:valid='false' means the data has been tested and is invalid. The absence of the attribute means that no position is taken on the validity of the data. (Not yet implemented) | Same attribute scheme |
unionMemberSchema | (Not yet implemented) | (not yet implemented) |
"No Value" | A JDOM Element with no children (not even Text node children) is the representation of an element with "No Value". | A scala.xml.Elem with no children. |
Augmented Infoset | A JDOM Element with a special marker attribute: dafint:hidden='true' signifies that the element is part of the augmented infoset. This attribute is used to identify and filter out elements when the un-augmented infoset is needed. | Same attribute scheme, but on scala.xml.Elem element. |
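
As a rough illustration of the nilled and "No Value" rows above, the following sketch tests both conditions on a parsed element. It is an assumption-laden stand-in: it uses the JDK's org.w3c.dom API rather than JDOM or scala.xml, and the class and method names (InfosetChecks, parseRoot, isNilled, hasNoValue) are hypothetical, not part of Daffodil.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper; uses the JDK DOM API as a stand-in for JDOM.
final class InfosetChecks {
    // The standard XML Schema instance namespace, conventionally bound to 'xsi'.
    static final String XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";

    // Parses an XML string with namespace awareness and returns its root element.
    static Element parseRoot(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement();
    }

    // nilled: xsi:nil='true' means nilled; an absent attribute implies false.
    static boolean isNilled(Element e) {
        // getAttributeNS returns "" when the attribute is absent.
        return "true".equals(e.getAttributeNS(XSI_NS, "nil"));
    }

    // "No Value": the element has no children at all, not even text nodes.
    static boolean hasNoValue(Element e) {
        return !e.hasChildNodes();
    }
}
```

With JDOM, the same checks would presumably be written against getAttributeValue and getChildren/getText instead; the logic is the same.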
Implementation of DFDL Infoset Strings
...
Note that choosing the ASCII (US-ASCII) encoding produces output that is universal: it uses only 7-bit ASCII characters, yet it can accurately represent any character allowed in XML. This form, however, would be largely unreadable not only to users of East Asian language scripts, but even to users of commonplace accented forms from European language scripts.
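To make the readability cost concrete, here is a minimal sketch of ASCII-only XML text output. It relies on standard XML numeric character references, assumes the input contains only XML-legal characters, and uses a hypothetical class name (AsciiXmlText); it is not Daffodil's serializer.

```java
// Hypothetical helper; illustrates ASCII-only XML text, not Daffodil's actual output code.
final class AsciiXmlText {
    // Escapes markup characters and writes every non-ASCII code point
    // as a hexadecimal numeric character reference.
    static String toAscii(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp == '&') {
                out.append("&amp;");
            } else if (cp == '<') {
                out.append("&lt;");
            } else if (cp == '>') {
                out.append("&gt;");
            } else if (cp < 0x80) {
                out.append((char) cp); // 7-bit ASCII passes through unchanged
            } else {
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase())
                   .append(';');
            }
        });
        return out.toString();
    }
}
```

For example, the word "café" comes out as `caf&#xE9;`, and text in a non-Latin script becomes an unbroken run of such references.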
CDATA Escaping Option
...
Use of the DFDL character entities is preferred because it is portable to DFDL implementations other than Daffodil. The remapping of XML-illegal characters to the PUA is a Daffodil-specific behaviour.
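The general idea of a PUA remapping can be sketched as follows. This is only an illustration under stated assumptions: the offset 0xE000 and the restriction to C0 control characters are assumptions made for the example, and the class name PuaRemapSketch is hypothetical. The authoritative mapping is the scheme this page describes, not this code.

```java
// Hypothetical sketch; the offset and the covered range are assumptions.
final class PuaRemapSketch {
    // ASSUMPTION: illegal characters are shifted by this offset into the
    // Unicode Private Use Area (U+E000..U+F8FF).
    static final int ASSUMED_PUA_OFFSET = 0xE000;

    // XML 1.0 forbids the C0 controls other than tab, line feed, and carriage return.
    static boolean isXmlIllegalControl(int cp) {
        return cp < 0x20 && cp != 0x09 && cp != 0x0A && cp != 0x0D;
    }

    // Replaces each XML-illegal control character with its PUA counterpart.
    static String remap(String s) {
        StringBuilder out = new StringBuilder();
        s.codePoints().forEach(cp ->
            out.appendCodePoint(isXmlIllegalControl(cp) ? cp + ASSUMED_PUA_OFFSET : cp));
        return out.toString();
    }
}
```

Under these assumptions, a NUL character (U+0000) in a string value would be written as U+E000, while tab and line feed pass through unchanged.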
Other XML Output Options
The XSLT xsl:output element has a variety of options whose names or meanings may be worth copying. It has options for the encoding, for whether or not to add the XML declaration at the beginning of the output, and even a way to list the elements whose contents should be wrapped in CDATA sections.
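For reference, the JDK exposes the same xsl:output options through javax.xml.transform.OutputKeys, which gives a feel for how such options behave. The sketch below is not a proposal for Daffodil's API; the class name and the element names record and payload are made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; element names are illustrative only.
final class XmlOutputOptionsSketch {
    static String serialize() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(new InputSource(
                new StringReader("<record><payload>a &lt; b</payload></record>")));

        // An identity transformer: it only serializes, applying the output options.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");               // xsl:output encoding
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");     // omit-xml-declaration
        t.setOutputProperty(OutputKeys.CDATA_SECTION_ELEMENTS, "payload"); // cdata-section-elements

        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

Here the text of payload is emitted inside a CDATA section instead of being escaped, and no XML declaration is written.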