Introduction

A DFDL schema project can be organized using a standard layout that has proven helpful in practice. The layout uses directory naming conventions and a tree shape that prevent name conflicts, much as Java package names correspond to directory names.

This set of conventions provides a number of benefits:

  • There are no name conflicts or ambiguities on the classpath when multiple DFDL schemas are used together.
  • Any number of DFDL schemas can be copied into one directory tree without file name conflicts.
  • sbt can be used to
    • Package the schema into a jar - The jar can then be on a classpath, become part of a larger application, etc.
    • Auto-download all dependencies of the schema, including Daffodil itself.
    • Run a suite of tests via 'sbt test'
    • Publish a local version of the schema for use in other projects that also follow this layout.
  • The Eclipse IDE can be used to develop and test the schema. Multiple such schemas work together in the IDE without conflict.
  • Encourages organizing DFDL schemas into reusable libraries.
    • A DFDL schema project need not define a whole data format. It can define a library of pieces to be included/imported by other formats.

These conventions are not DFDL-specific. They work equally well for ordinary XML Schema projects, since they are general conventions for organizing projects so as to achieve the above benefits.

Conventions

We're using Tresys Technology (www.tresys.com) here as an example. Substitute your organization's details.

Let's assume the DFDL schema contains two files named main.dfdl.xsd, and format.dfdl.xsd, and that our format is named RFormat.

The standard file tree would be:

RFormat/
├── src/
│   ├── main/
│   │   └── resources/
│   │       └── com/
│   │           └── tresys/
│   │               └── RFormat/
│   │                   ├── xsd/
│   │                   │   ├── main.dfdl.xsd    - main DFDL schema file
│   │                   │   └── format.dfdl.xsd  - DFDL schema file imported/included from main
│   │                   └── xsl/
│   │                       └── xforms.xsl       - resources other than XSD go in other directories
│   └── test/
│       ├── resources/
│       │   └── com/
│       │       └── tresys/
│       │           └── RFormat/
│       │               └── tests1.tdml    - TDML test file (may be more than 1)
│       └── scala/
│           └── com/
│               └── tresys/
│                   └── RFormat/
│                       └── Tests1.scala   - Scala test driver file. Boilerplate, but makes running tests easy
│
├── build.sbt    - simple build tool (sbt) specification file. Edit to change version of Daffodil needed, or versions of other DFDL schemas needed
├── README.md    - Documentation about the DFDL schema in Markdown file format (https://en.wikipedia.org/wiki/Markdown)
├── .classpath   - Eclipse classpath file (optional)
├── .project     - Eclipse project file (optional)
└── .gitignore   - Git revision control system 'ignore' file (should contain 'target' and 'lib_managed' entries)
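
The mapping from organization and format name to a directory is mechanical, much like the mapping from a Java package name to a directory. A minimal Scala sketch of that mapping follows. The names SchemaLayout and resourcePath are hypothetical, used only for illustration; they are not part of Daffodil or sbt.

```scala
// Hypothetical helper illustrating the naming convention; not part of Daffodil or sbt.
object SchemaLayout {
  // Turns an organization such as "com.tresys" into the directory prefix "com/tresys",
  // then appends the format name and any further path segments.
  def resourcePath(organization: String, format: String, rest: String*): String =
    (organization.split('.').toSeq ++ Seq(format) ++ rest).mkString("/")
}
```

For example, SchemaLayout.resourcePath("com.tresys", "RFormat", "xsd", "main.dfdl.xsd") yields com/tresys/RFormat/xsd/main.dfdl.xsd. Two formats can each contain a file named main.dfdl.xsd, because the organization and format name keep their paths distinct.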


After you run 'sbt test', you'll notice that a lib_managed directory has been created. sbt creates this directory to hold all the dependencies of the project.

build.sbt

Use the following template for the build.sbt file:

name := "dfdl-RFormat"

organization := "com.tresys"

version := "0.0.1"

scalaVersion := "2.11.8"

crossPaths := false

testOptions in ThisBuild += Tests.Argument(TestFrameworks.JUnit, "-v")

resolvers in ThisBuild += "NCSA Sonatype Releases" at "https://opensource.ncsa.illinois.edu/nexus/content/repositories/releases"

libraryDependencies in ThisBuild := Seq(
  "junit" % "junit" % "4.11" % "test",
  "com.novocode" % "junit-interface" % "0.10" % "test",
  "edu.illinois.ncsa" %% "daffodil-tdml" % "2.0.0-SNAPSHOT" % "test"
)
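
As a sketch of how another schema project that follows this layout might depend on this one: the coordinates below are taken from the name, organization, and version settings in the template above. It assumes the RFormat project has first been published to the local repository with 'sbt publishLocal'. The single % is used because crossPaths is false, so no Scala version suffix is added to the artifact name.

```scala
// build.sbt fragment of a hypothetical project that uses the RFormat schema as a library.
// Assumes 'sbt publishLocal' was run in the RFormat project first.
libraryDependencies += "com.tresys" % "dfdl-RFormat" % "0.0.1"
```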

Eclipse IDE

If you organize your DFDL schema project using the above conventions and then run 'sbt compile', the lib_managed directory will be populated. If you then create a new Eclipse Scala project from the directory tree, Eclipse will see the lib_managed directory and construct a classpath containing all those jars.

XSD Conventions

DFDL schemas should have the ".dfdl.xsd" suffix to distinguish them from ordinary XML Schema files.

A DFDL schema should have a target namespace.

Stylistically, elementFormDefault="unqualified" is preferred for DFDL schemas.

Using a DFDL Schema

A DFDL schema can use xs:import or xs:include to pull in another DFDL schema that follows these conventions, like this:

<xs:import namespace="urn:tresys.com/RFormat" schemaLocation="com/tresys/RFormat/xsd/main.dfdl.xsd"/>

The above is for using a DFDL schema as a library from a different DFDL schema.

Within a DFDL schema, one DFDL schema file can reference another peer file that appears in the same directory (the src/main/resources/.../xsd directory) via:

<xs:include schemaLocation="format.dfdl.xsd"/>

That is, peer files need not carry the long "com/tresys/RFormat/xsd/" prefix that makes the reference globally unique.
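
One way to picture a peer reference is as a path taken relative to the directory of the including file. The Scala sketch below illustrates only that idea. PeerReference and resolvePeer are hypothetical names, and this is not Daffodil's actual resolver.

```scala
// Hypothetical illustration of peer-relative references; this is not Daffodil's resolver.
object PeerReference {
  // Given the classpath-relative path of the including file and a bare schemaLocation,
  // returns the classpath-relative path of the peer file in the same directory.
  def resolvePeer(includingFile: String, schemaLocation: String): String = {
    val lastSlash = includingFile.lastIndexOf('/')
    // Keep everything up to and including the final '/', i.e. the directory part.
    // If there is no '/', the directory part is empty and schemaLocation is returned as is.
    includingFile.substring(0, lastSlash + 1) + schemaLocation
  }
}
```

With this sketch, resolving format.dfdl.xsd against com/tresys/RFormat/xsd/main.dfdl.xsd gives com/tresys/RFormat/xsd/format.dfdl.xsd, the same globally unique path that a different schema would spell out in full.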

Git Revision Control

You don't have to use Git version control, but many people do, and GitHub is one of the reasons for its popularity.

Each DFDL schema should have its own Git repository if it is going to be revised independently. We encourage users to join the DFDLSchemas project on GitHub, and to create repositories and publish schemas there for any publicly available formats. For formats that are not publicly available, one may still want to put a placeholder for them on DFDLSchemas (as IBM has done for some formats, such as Swift-MT).
