
...

A functional-programming approach to errors/diagnostics that is also consistent with Daffodil's DSOM model and the OOLAG (object-oriented lazy attribute grammar) approach works like this: you call a function or evaluate an expression to create a value. The returned object is always of the correct type, but it represents either a value, a set of errors/diagnostics/warnings, or both. Attributes of the returned object include the set of errors/diagnostics/warnings/information and a status indicating whether the value is OK or there was an error severe enough to prevent creation of a value.

See the page OOLAG - Object-Oriented Lazy Attribute Grammars for more on this subject.

Example: an xpath expression is compiled into a CompiledExpression type object. This object type should contain a boolean member named isError which is false if compilation of the xpath succeeded. A member 'getDiagnostics' will be Nil if there are no errors/diagnostics/warnings/info, otherwise it will contain a Set of error/diagnostic/warning/info objects. If isError is false, then any error/diagnostic/warning objects are not ones that prevent the CompiledExpression from being used (i.e., they are probably all warnings/info objects).

In general, this is the idiom for any sort of compilation step. So instead of a compilation action returning an Option[T] where None indicates failure, we get type T, and isError will be true or false.
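
The idiom above can be sketched as follows. This is a minimal illustration, not Daffodil's actual code: the severity objects, the Diagnostic class, the toy compile function and its two rules are all hypothetical, and only the names CompiledExpression, isError and getDiagnostics come from the text.

```scala
object CompileSketch {
  // Hypothetical severity levels; the names are illustrative only.
  sealed trait Severity
  case object SevInfo extends Severity
  case object SevWarning extends Severity
  case object SevError extends Severity

  // Hypothetical diagnostic: a severity plus a message identifier.
  final case class Diagnostic(severity: Severity, messageId: String)

  // Always returned, whether or not compilation succeeded.
  final case class CompiledExpression(source: String, getDiagnostics: Set[Diagnostic]) {
    // True when at least one diagnostic is severe enough to prevent use.
    def isError: Boolean = getDiagnostics.exists(_.severity == SevError)
  }

  // Toy "compiler" (an assumption for the example): an empty expression is
  // an error, a trailing slash is only a warning.
  def compile(xpath: String): CompiledExpression = {
    val diags: Set[Diagnostic] =
      if (xpath.trim.isEmpty) Set(Diagnostic(SevError, "emptyExpression"))
      else if (xpath.endsWith("/")) Set(Diagnostic(SevWarning, "trailingSlash"))
      else Set.empty
    CompiledExpression(xpath, diags)
  }
}
```

A caller checks isError before using the object, and can still report warnings from getDiagnostics when isError is false, which an Option[T] result could not express.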

Runtime

The parser runtime has a return object that captures success/failure. Whether any given runtime failure is a real failure or just part of speculative parsing is not something we know in advance. This means that even when we are just backtracking as part of parsing, we construct diagnostic objects in case those errors propagate up to top level.

For example: suppose we have a choice with 3 alternatives, and suppose the data doesn't match any of the 3. The diagnostic we issue should not say only that the 3rd alternative didn't match. It should say that no alternative matched the data, and specifically may want to say that the first alternative didn't match because of reason(s) X, the second due to reason(s) Y, and the third due to reason(s) Z. That is, imagine we are parsing along, attempting the first alternative of this choice, and we encounter an error. This error might be suppressed because one of the subsequent alternatives might succeed, and in that case we can discard the failure reason for the first alternative. Or this error might still be needed, because if all 3 alternatives result in errors, it might be helpful as a diagnostic to see why each alternative failed.
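
A minimal sketch of this choice behavior, under stated assumptions: ParseResult, Success, Failure, literal and parseChoice are hypothetical names invented for the example, and plain strings stand in for real diagnostic objects.

```scala
object ChoiceSketch {
  // Hypothetical result type: a failure carries the reasons it failed.
  sealed trait ParseResult
  final case class Success(value: String) extends ParseResult
  final case class Failure(reasons: List[String]) extends ParseResult

  type Alternative = String => ParseResult

  // Toy alternative: succeeds when the data starts with the expected text.
  def literal(expected: String): Alternative = data =>
    if (data.startsWith(expected)) Success(expected)
    else Failure(List(s"expected '$expected'"))

  // Try each alternative in order. Failure reasons are held only while they
  // might still be needed; a later success discards them.
  def parseChoice(alternatives: List[Alternative], data: String): ParseResult = {
    @scala.annotation.tailrec
    def loop(remaining: List[Alternative], index: Int, collected: List[String]): ParseResult =
      remaining match {
        case Nil =>
          // Every alternative failed: report the choice-level failure
          // followed by the reason for each alternative, in order.
          Failure("no alternative matched the data" :: collected.reverse)
        case alt :: rest =>
          alt(data) match {
            case s: Success => s // earlier failure reasons are dropped here
            case Failure(reasons) =>
              val labelled = reasons.map(r => s"alternative $index: $r")
              loop(rest, index + 1, labelled.reverse ::: collected)
          }
      }
    loop(alternatives, 1, Nil)
  }
}
```

With alternatives literal("A"), literal("B") and literal("C"), data beginning with "B" yields Success("B") and the first alternative's failure is forgotten, while data beginning with "Z" yields one Failure listing all three reasons.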

We'll want to design this so that most of the work associated with diagnostics (for example, constructing message strings and substituting toString representations of various pieces of errant data into them) happens at the time the diagnostic is actually issued at top level, and not at the time the diagnostic object is created. Aka,

...

Conveniently, this goal lines up perfectly with internationalization requirements, where the software should not contain message strings at all, but should just construct objects of the right type. Message strings are created at the point where messages are displayed to a user or printed out. Log files may want to contain English, but they also need all the components that enable internationalized presentation. So rather than formatted English messages, they should contain a message identifier, the string representations of the things to be substituted into the message, and possibly the English text (for someone trying to make sense of the log outside an environment where the internationalization is available, in which case they must understand English).
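
One way to picture a diagnostic that defers message construction and carries only a message identifier plus arguments is sketched below. Everything here is an assumption made for illustration: the catalog, its two locales, the "noMatch" identifier, the pipe-separated log form, and the formatCount counter (which exists only to show when formatting happens).

```scala
object LazyDiagnosticSketch {
  // Counts how many times a message string is actually built (demo only).
  var formatCount = 0

  // Hypothetical message catalog: locale tag -> message identifier -> template.
  val catalog: Map[String, Map[String, String]] = Map(
    "en" -> Map("noMatch" -> "Expected %s but found %s"),
    "fr" -> Map("noMatch" -> "Attendu %s, obtenu %s")
  )

  // Creating a Diagnostic stores the identifier and arguments only.
  final case class Diagnostic(messageId: String, args: Seq[Any]) {
    // Formatting happens only when someone asks for the text.
    def message(locale: String): String = {
      formatCount += 1
      val templates = catalog.getOrElse(locale, catalog("en"))
      val template = templates.getOrElse(messageId, messageId)
      template.format(args: _*)
    }

    // Log form: identifier plus string forms of the arguments, with no
    // formatted prose, so the entry can be presented in any language later.
    def logRecord: String = (messageId +: args.map(_.toString)).mkString("|")
  }
}
```

A diagnostic discarded during backtracking never pays the formatting cost, and the same object can be rendered in whichever language the presentation layer chooses.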

In addition, since the JVM will throw some exceptions (like divide by zero), we will surround the runtime with a try-catch, and catch exceptions coming from the JVM to include in our returned list of errors/diagnostics/warnings.

In general, however, backtracking in the runtime isn't exceptional behavior that happens occasionally. It is the principal means by which the parser deals with variability of data formats. Hence, we try to avoid heavyweight constructs like try-catch logic in too many places in the runtime. This means that in the runtime, exceptions/errors aren't generally thrown; instead, a failed return status is returned.

This is not only about performance; it is also about an API guarantee that must be provided by the parse() routine of the API DataProcessor object: it returns a success or failure and has diagnostic objects associated with it.
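
The combination described above (status returns inside the runtime, one try-catch at the outer boundary) might look roughly like this. It is a sketch only: the ParseResult and Diagnostic shapes, parseDivisor, and the division by a parsed number are hypothetical stand-ins and do not reflect the real DataProcessor API. It assumes Scala 2.13 or later for toIntOption.

```scala
import scala.util.control.NonFatal

object RuntimeSketch {
  // Hypothetical diagnostic and result types.
  final case class Diagnostic(messageId: String, detail: String)

  // Always returned, never thrown: a value, or diagnostics explaining why not.
  final case class ParseResult(value: Option[Int], diagnostics: List[Diagnostic]) {
    def isError: Boolean = value.isEmpty
  }

  // Inner step: an expected failure is reported by return value, not by throwing.
  def parseDivisor(text: String): Either[Diagnostic, Int] =
    text.trim.toIntOption.toRight(Diagnostic("notAnInteger", text))

  // Top-level entry point: a single try-catch converts JVM exceptions
  // (such as divide by zero) into diagnostics on the returned result.
  def parse(text: String): ParseResult =
    try {
      parseDivisor(text) match {
        case Left(d)  => ParseResult(None, List(d))
        case Right(n) => ParseResult(Some(100 / n), Nil) // throws when n == 0
      }
    } catch {
      case NonFatal(e) =>
        ParseResult(None, List(Diagnostic("jvmException", String.valueOf(e.getMessage))))
    }
}
```

Ordinary failures such as non-numeric input flow back as a return status without any exception machinery, while the divide-by-zero case is caught once at the boundary, so callers of parse always receive a result object.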

Error Types


The DFDL spec has added, as an erratum to draft v1.0.3, a behavior that is effectively a warning mechanism, called a 'recoverable error'.

So we have these error/warning types: We will generally use the term "error" to mean "error or warning or information item".

...