From 2006 to 2011, this work was supported as part of "Innovative Systems and Software: Applications to NARA Research Problems", a Cooperative Agreement between the US National Archives and Records Administration (NARA) and the National Center for Supercomputing Applications (NCSA) at the University of Illinois, Urbana-Champaign.
Intellectual Merit
NCSA's goals for the "Investigation of Data Representation" task focused on three areas: developing an open-source parser for DFDL with an emphasis on performance and scalability, integrating DFDL/Defuddle into the SHAMAN preservation architecture, and exploring practical applications of DFDL while developing a library of DFDL descriptions.
Preservation can be thought of as communication with the future. The records we preserve today need to be accessible and displayable by future technology. Beyond maintaining the accessibility of the raw bits of the digital data, preservation requires maintaining an ability to interpret the data as meaningful structures, relationships, and visual representations.
...
For more information about DFDL, see Related Links below.
Parser Development
In previous work, Talbott and others at Pacific Northwest Labs developed the Defuddle parser, which implemented an early version of the DFDL specification [6]. Subsequently, this project updated and extended the Defuddle parser [1-3].
When Version 1 of the DFDL Specification was released, we reviewed the Defuddle parser and determined that it needed to be completely revised [4].
The Daffodil parser is a completely new implementation, based on Version 1 of the DFDL specification and on lessons learned from Defuddle [5]. The Daffodil parser is only partly implemented; it was released "as is" in August 2011.
Semantic Extensions
While the XML Schema language is well suited for describing the layout of data (the "syntax"), interoperability and robust archiving require semantic markup as well. This project will extend the DFDL model to support mapping to semantic web languages: the Resource Description Framework (RDF) and the Web Ontology Language (OWL). We are exploring a two-step mechanism based on the Gleaning Resource Descriptions from Dialects of Languages (GRDDL) specification (http://www.w3.org/TR/grddl/), which is used to associate XML-to-RDF mapping instructions, written, for example, in XSLT, with the DFDL description file [2,3].
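The first step of such a mechanism, discovering which mapping instructions are associated with a description file, can be sketched in a few lines of Python. This is only an illustrative sketch, not the project's implementation: it assumes the DFDL description carries the standard GRDDL `grddl:transformation` attribute on its root element, and the schema content and the `example.org` stylesheet URLs are hypothetical. The second step, applying each XSLT stylesheet to the parsed XML output to obtain RDF, is left out because it needs an XSLT processor.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the GRDDL specification.
GRDDL_NS = "http://www.w3.org/2003/g/data-view#"

# Hypothetical DFDL description file. The element declaration and the
# stylesheet URLs are made up for illustration only.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:grddl="http://www.w3.org/2003/g/data-view#"
    grddl:transformation="http://example.org/to-rdf.xsl http://example.org/provenance.xsl">
  <xs:element name="record" type="xs:string"/>
</xs:schema>"""


def grddl_transformations(xml_text):
    """Return the transformation URIs listed on the document's root element."""
    root = ET.fromstring(xml_text)
    # ElementTree exposes namespaced attributes as "{namespace}localname".
    value = root.get("{%s}transformation" % GRDDL_NS, "")
    # GRDDL allows several whitespace-separated URIs in one attribute.
    return value.split()


transforms = grddl_transformations(SCHEMA)
for uri in transforms:
    print(uri)
```

Each URI returned here would identify one set of XML-to-RDF mapping instructions to run over the XML that the parser produces from the described data.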
Team Members and Alumni
- Robert E. McGrath, NCSA
- Joe Futrelle, Woods Hole Oceanographic Institution
- Jim Myers, RPI
- Alejandro Rodriguez, Amazon
- Jason Kastner, NCSA
...
1. McGrath, R.E., J. Kastner, A. Rodriguez, and J. Myers. "Defuddle: a Tool for Format Translation and Metadata Extraction (Poster)". Microsoft E-science Workshop (2009).
2. McGrath, R.E., J. Kastner, A. Rodriguez, and J. Myers. "Experiments in Data Format Interoperation Using Defuddle". National Center for Supercomputing Applications, June, 2009, http://www.archives.gov/applied-research/ncsa/11-experiments-in-data-format-interoperation-using-defuddle.pdf
3. McGrath, R.E., J. Kastner, A. Rodriguez, and J. Myers. "Towards a Semantic Preservation System". National Center for Supercomputing Applications, June, 2009, http://arxiv.org/abs/0910.3152.
4. Rodriguez, A. and R. E. McGrath. "Some Notes of comparison between DFDL and Defuddle". National Center for Supercomputing Applications, October, 2010.
5. Rodriguez, A. and R. E. McGrath. "Daffodil: A New DFDL Parser". National Center for Supercomputing Applications, October, 2010.
6. Talbott, T. D., K. L. Schuchardt, E. G. Stephan, and J. D. Myers. "Mapping Physical Formats to Logical Models to Extract Data and Metadata: The Defuddle Parsing Engine". International Provenance and Annotation Workshop. 2006, Springer: Heidelberg. p. 73-81.
7. Wikipedia, "Data Format Description Language". 2011, http://en.wikipedia.org/wiki/Data_Format_Description_Language.
8. Powell, Alan W., Michael J. Beckerle, and Stephen M. Hanson. "Data Format Description Language (DFDL) v1.0 Specification". GFD-P-R.174, Open Grid Forum, 2011. http://www.ogf.org/documents/GFD.174.pdf
...
Related Links
- OGF Standards: Data Format Description Language (DFDL), http://www.ogf.org/dfdl/.
- The Open Grid Forum Data Format Description Language WG (DFDL-WG), http://redmine.ogf.org/projects/dfdl-wg/.
- Wikipedia, "Data Format Description Language". 2011, http://en.wikipedia.org/wiki/Data_Format_Description_Language.
- Defuddle (Old) Examples and code: http://sourceforge.net/projects/defuddle.