From 2006 to 2011, this work was supported as part of “Innovative Systems and Software: Applications to NARA Research Problems”, a Cooperative Agreement between the US National Archives and Records Administration (NARA) and the National Center for Supercomputing Applications (NCSA) at the University of Illinois, Urbana-Champaign.

Intellectual Merit

NCSA’s goals for the “Investigation of Data Representation” task focused on three areas: developing an open-source parser for DFDL with an emphasis on enhancing performance and scalability; integrating DFDL/Defuddle into the SHAMAN preservation architecture; and exploring practical applications of DFDL while developing a library of DFDL descriptions.

Preservation can be thought of as communication with the future. The records we preserve today need to be accessible and displayable by future technology. Beyond maintaining the accessibility of the raw bits of the digital data, preservation requires maintaining an ability to interpret the data as meaningful structures, relationships, and visual representations.

...

For more information about DFDL, see Related Links below.

Parser Development

In previous work, Talbott and others at Pacific Northwest Labs developed the Defuddle parser, which implemented an early version of the DFDL specification [6]. Subsequently, this project updated and extended the Defuddle parser [1-3].

When Version 1 of the DFDL Specification was released, we reviewed the Defuddle parser and determined that it needed to be completely revised [4].

The Daffodil parser is a completely new implementation, based on Version 1 of the DFDL specification as well as lessons learned from Defuddle [5]. The Daffodil parser is only partly implemented. It was released “as is” in August 2011.

Semantic Extensions

While the XML Schema language is well suited for describing the layout of data (the “syntax”), interoperability and robust archiving require semantic markup as well. This project will extend the DFDL model to support mapping to semantic web languages: the Resource Description Framework (RDF) and the Web Ontology Language (OWL). We are exploring a two-step mechanism based on the Gleaning Resource Descriptions from Dialects of Languages (GRDDL) specification (http://www.w3.org/TR/grddl/), which is used to associate XML-to-RDF mapping instructions, written, for example, in XSLT, with the DFDL description file [2,3].
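The two-step idea can be illustrated with a minimal Python sketch. It is not the project's implementation. The schema fragment, the parsed record, the example.org URLs, and the function names are all hypothetical. The sketch assumes the mapping is attached to the description file through GRDDL's namespaceTransformation attribute, and a plain Python function stands in for the XSLT stylesheet, since the Python standard library has no XSLT processor.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the GRDDL specification.
GRDDL_NS = "http://www.w3.org/2003/g/data-view#"

# Hypothetical DFDL description file (an XML Schema) that names a mapping.
DESCRIPTION = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:grddl="http://www.w3.org/2003/g/data-view#"
    grddl:namespaceTransformation="http://example.org/record-to-rdf.xsl"/>"""

# Hypothetical XML of the kind a DFDL parser might emit for one record.
PARSED_XML = """<record>
  <station>KCMI</station>
  <temperature>21.5</temperature>
</record>"""


def find_transformation(description_xml):
    """Step 1: read the mapping reference attached to the description."""
    root = ET.fromstring(description_xml)
    return root.get("{%s}namespaceTransformation" % GRDDL_NS)


def to_triples(parsed_xml, subject, vocab):
    """Step 2 (stand-in for the XSLT): one triple per child element."""
    root = ET.fromstring(parsed_xml)
    return [(subject, vocab + child.tag, child.text.strip()) for child in root]


mapping = find_transformation(DESCRIPTION)
triples = to_triples(
    PARSED_XML,
    subject="http://example.org/record/1",
    vocab="http://example.org/vocab#",
)

# Print the triples in an N-Triples-like form.
for s, p, o in triples:
    print(f'<{s}> <{p}> "{o}" .')
```

In a real GRDDL pipeline, the URL found in step 1 would be dereferenced and the referenced transformation applied to the parsed XML to yield RDF.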

Team Members and Alumni

  • Robert E. McGrath, NCSA
  • Joe Futrelle, Woods Hole Oceanographic Institution
  • Jim Myers, RPI
  • Alejandro Rodriguez, Amazon
  • Jason Kastner, NCSA

...

  1. McGrath, R.E., J. Kastner, A. Rodriguez, and J. Myers. “Defuddle: a Tool for Format Translation and Metadata Extraction (Poster)”. Microsoft E-science Workshop (2009).
  2. McGrath, R.E., J. Kastner, A. Rodriguez, and J. Myers. “Experiments in Data Format Interoperation Using Defuddle”. National Center for Supercomputing Applications, June 2009. http://www.archives.gov/applied-research/ncsa/11-experiments-in-data-format-interoperation-using-defuddle.pdf
  3. McGrath, R.E., J. Kastner, A. Rodriguez, and J. Myers. “Towards a Semantic Preservation System”. National Center for Supercomputing Applications, June 2009. http://arxiv.org/abs/0910.3152
  4. Rodriguez, A. and R. E. McGrath. “Some Notes of comparison between DFDL and Defuddle”. National Center for Supercomputing Applications, October 2010.
  5. Rodriguez, A. and R. E. McGrath. “Daffodil: A New DFDL Parser”. National Center for Supercomputing Applications, October 2010.
  6. Talbott, T. D., K. L. Schuchardt, E. G. Stephan, and J. D. Myers. “Mapping Physical Formats to Logical Models to Extract Data and metadata: The Defuddle Parsing Engine”. International Provenance and Annotation Workshop. 2006, Springer: Heidelberg. p. 73-81.
  7. Wikipedia. “Data Format Description Language”. 2011. http://en.wikipedia.org/wiki/Data_Format_Description_Language
  8. Powell, A. W., M. J. Beckerle, and S. M. Hanson. “Data Format Description Language (DFDL) v1.0 Specification”. GFD-P-R.174, Open Grid Forum, 2011. http://www.ogf.org/documents/GFD.174.pdf

...