These are projects that require using Daffodil from its API, along with knowledge of something else to embed it into. (Maybe you already know that, or want to learn it?)
- Integrate Daffodil into Apache Tika (DFDL-1710) to use full or partial DFDL-driven parsing to determine file type.
- Integrate Daffodil into Piperack, which is part of the Calabash implementation of XProc.
- (This is partly/mostly(?) done. We have https://opensource.ncsa.illinois.edu/fisheye/changelog/daffodil-calabash-extension, and there are GitHub forks too. It needs someone to figure out content-type: application/octet-stream so that binary data can go into or come back from Piperack.)
- Integrate Daffodil Into Scalable/Parallel Processing Frameworks
- Some possibilities:
Apache NiFi, Apache Spark, your favorite Hadoop..., Apache Storm, Apache Flink, Ohua, etc.
- Really, any data-processing framework/infrastructure should be able to accept data from Daffodil, and produce data into Daffodil and thereby gain the ability to handle many kinds of data that would be extraordinarily difficult otherwise.
- Some possibilities:
- NACHA - so close. This format, which IBM published to the GitHub DFDLSchemas site, will run with a small enhancement to our TDML Runner. (See DFDL-1006.)
- TDML Runner - enhance for cross validation with IBM - We want to be able to run TDML tests against both Daffodil and the IBM DFDL implementation to test compatibility. The job here is to execute TDML tests against IBM's DFDL (they have a developer edition that's free) using their API. (DFDL-723)
- In general, there's a backlog of TDML-related JIRA issues. The TDML runner drives Daffodil from the API, so it's relatively well isolated from the rest of the Daffodil code base. See other TDML tickets.
Internals Projects Suitable for Beginners
New to Scala? New to Daffodil? Here are some things we need to get to, where you could help without having to grok everything about Daffodil first.
- packed decimal/BCD - know this stuff? Add this feature to Daffodil. It's essential for Daffodil to run a number of the DFDL schemas on the GitHub DFDLSchemas site.
- zoned decimal - heard of this? Needed for the ISO8583 format, which has a DFDL schema on GitHub.
- binary calendar types - we postponed this, and... well it needs to get done.
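If packed decimal is new to you, here is a rough sketch of what decoding it involves. It assumes the common IBM-style layout: two BCD digits per byte, with the low nibble of the last byte holding the sign (C, A, E, F for positive; D, B for negative). The class and method names are hypothetical and purely for illustration. This is not Daffodil code, and a real implementation would also have to honor the relevant DFDL properties.

```java
import java.math.BigInteger;

// Hypothetical illustration only; not part of Daffodil.
class PackedDecimal {

    // Decodes IBM-style packed decimal: each byte holds two BCD digits,
    // except the last byte, whose low nibble is the sign.
    static BigInteger decode(byte[] bytes) {
        if (bytes.length == 0) {
            throw new IllegalArgumentException("empty input");
        }
        StringBuilder digits = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            int hi = (bytes[i] >> 4) & 0x0F;
            int lo = bytes[i] & 0x0F;
            appendDigit(digits, hi);
            // The low nibble of the final byte is the sign, not a digit.
            if (i < bytes.length - 1) {
                appendDigit(digits, lo);
            }
        }
        BigInteger value = new BigInteger(digits.toString());
        int sign = bytes[bytes.length - 1] & 0x0F;
        switch (sign) {
            case 0xB:
            case 0xD:
                return value.negate();
            case 0xA:
            case 0xC:
            case 0xE:
            case 0xF:
                return value;
            default:
                throw new IllegalArgumentException("invalid sign nibble: " + sign);
        }
    }

    // Appends one BCD digit, rejecting nibbles outside 0-9.
    private static void appendDigit(StringBuilder sb, int nibble) {
        if (nibble > 9) {
            throw new IllegalArgumentException("invalid digit nibble: " + nibble);
        }
        sb.append((char) ('0' + nibble));
    }

    public static void main(String[] args) {
        System.out.println(decode(new byte[] {0x12, 0x3C})); // prints 123
        System.out.println(decode(new byte[] {0x12, 0x3D})); // prints -123
    }
}
```

Zoned decimal is a close cousin: one digit per byte in the low nibble, with the sign typically carried in the high nibble of one of the bytes.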