
Put links here that are particularly useful for learning Scala.

Not intended to be a comprehensive sea of Scala-related links.

Not intended to replace searching the web for Scala information.

Just the gems please.


This section is for design notes about Scala usage.

There are separate pages about some 'big' patterns. E.g., see OOLAG - Object-Oriented Lazy Attribute Grammars which we use extensively.

There is discussion of some coding style issues that one might think of as small patterns in Coding Style & Guidelines for Contributors. In particular: the uniform return-type principle.

Traits as 'Mixins' Pattern

Scala provides powerful multiple inheritance. Consider:

trait M extends M2 with M3 { self : B => ... }

The above is a trait with a 'self type'. The English of what this means is: "M can be mixed into any class that extends abstract class B in order to produce a variant of B". Code within the methods of M can depend on the fact that this is of type B.

About the notation: Please ignore the notational awkwardness of the above idiom. Somebody in the Scala community is in love with the fact that the implementation of all this mechanism ultimately boils down to yet more clever use of higher-order functions, and that's why the => notation is used.  To me this notation is terribly over-used in Scala. Users of this pattern are not thinking about higher-order functions or functions at all really. They are thinking about object composition. But this is something we just have to get used to.
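A minimal sketch of the self-type idiom described above. The member names (name, greeting) and the class Concrete are hypothetical, chosen only for illustration; M2 and M3 are left out to keep the example self-contained.

```scala
// Hypothetical members; only the shape of the idiom comes from the text above.
abstract class B {
  def name: String            // abstract member that mixins may rely on
}

trait M { self: B =>
  // Because of the self-type, `name` resolves against B even though
  // M itself does not extend B.
  def greeting: String = "Hello, " + name
}

// M can be mixed into any class that extends B.
class Concrete extends B with M {
  def name: String = "world"
}

// class Broken extends M    // would not compile: Broken does not conform to B
```

The commented-out last line shows what the self-type buys: the compiler rejects any attempt to mix M into something that is not a B.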

We use this pattern extensively in Daffodil, for a few different reasons:

Traits for Sharing

It allows us to avoid redundant code, or equivalently increase sharing of code.

Suppose I have an abstract class B, and two "characteristics" which I'll call A-like and C-like, represented by traits A and C. I wish to create concrete classes AB, CB, and ABC. This is done via:

abstract class B {...}
trait A { self : B => ... }
trait C { self : B => ... }
class AB extends B with A
class CB extends B with C
class ABC extends B with A with C

Lest this appear unmotivated, this is exactly what we do to handle XSD's 3 schema components that are the concrete subtypes of element. These are GlobalElementDecl, LocalElementDecl, and ElementRef. Each of these extends ElementBase, but mixes in either ElementDeclMixin, LocalElementMixin, or both.
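To make the A/B/C skeleton above concrete, here is a filled-in sketch. The members label, describe, aLike, and cLike are hypothetical and are not taken from the Daffodil sources; they only show that each trait's code is written once and shared by every class that mixes it in.

```scala
// Hypothetical members throughout; the class/trait layout mirrors the skeleton above.
abstract class B(val label: String) {
  def describe: String = label
}

trait A { self: B =>
  // Written once, shared by AB and ABC.
  def aLike: String = label + " is A-like"
}

trait C { self: B =>
  // Written once, shared by CB and ABC.
  def cLike: String = label + " is C-like"
}

class AB extends B("AB") with A
class CB extends B("CB") with C
class ABC extends B("ABC") with A with C
```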

Traits for Separation of Concerns

Traits with self-types allow us to just separate the parts of a single class into different files to address different concerns.

Suppose I have a large class such as ElementBase. The Daffodil compiler front end code for ElementBase is in dsom/Element.scala.

abstract class ElementBase extends Term with ElementBaseGrammarMixin { ... front end code ... }

The middle tier of the compiler is based on the grammar rules system. The code for the parts of the ElementBase class that make up this middle tier is in a separate file, in the mixin:

trait ElementBaseGrammarMixin { self : ElementBase => ...middle tier/grammar code... }

This trait is single-purpose. It is used in exactly one place, which is in the declaration of ElementBase. It could be replaced by just taking the entire contents of the trait, and just adding it to the ElementBase class. The whole purpose of this trait is to just let us break up and label the code differently, identifying this aspect of the ElementBase functionality as part of the grammar/middle-tier of the compiler.
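The arrangement can be sketched in miniature as follows. Node, NodeGrammarMixin, isOptional, and grammarRule are hypothetical stand-ins, not the actual Daffodil classes or members; in practice the class and the trait would live in separate files.

```scala
// "File 1" (front end): hypothetical stand-in for a large class such as ElementBase.
abstract class Node(val name: String) extends NodeGrammarMixin {
  def isOptional: Boolean              // a front-end concern
}

// "File 2" (middle tier): single-use mixin holding one aspect of Node.
trait NodeGrammarMixin { self: Node =>
  // The self-type lets this code use Node's members directly,
  // exactly as if it had been written inside the class body.
  def grammarRule: String =
    if (isOptional) name + "?" else name
}

class RequiredNode(n: String) extends Node(n) {
  def isOptional: Boolean = false
}

class OptionalNode(n: String) extends Node(n) {
  def isOptional: Boolean = true
}
```

Note that the class and its mixin refer to each other: the class extends the trait, and the trait's self-type names the class. That mutual reference is what makes the trait usable in only one place.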

(TBD: perhaps a naming convention to not call these "Mixins", but to specifically call them out as intended-to-be-single-use. E.g., replace the suffix "Mixin" with "Extension" or "Part" or "Aspect")

Coding Scala for Speed

There are lots of things in Scala that make programming convenient and clear, but in the core speed paths of the software we want to use programming idioms that result in the fastest possible code, on par with the fastest Java code for the same functionality. This section is a collection of hints on how to do that.

  • Use while loops - the other looping constructs (foreach, map, flatMap, etc.) turn into calls to higher-order methods that take closures, which costs more than a plain loop.
  • Use if-then-else. Avoid match-case with pattern matching. An exception may be very simple match cases, i.e., matching on character or integer values. Think of it this way: you know what it is turning into when compiled if you use if-then-else, so there will be no performance surprises.
  • Hoist out common sub-expressions, even small ones.
  • Maybe, not Option - use the Maybe/One/Nope classes instead of Scala's Option type.
  • Inlining - The @inline annotation can be used, but sparingly, because it slows compilation down a lot. A final method of a final class has a chance of being inlined. The annotation should really be used only on classes like the Maybe[T] class and other 'value classes', e.g., units-of-measure classes, etc.
  • Avoid allocation of objects such as return tuples (except pairs - special case see below), return case-class objects, etc. There are several alternatives:
    • 2-Tuples are optimized out if they are immediately destructured in the caller. So given def foo(x: Int): (Int, Int), writing val (a, b) = foo(z) means no tuple allocation will be done.
      • So always destructure pairs immediately. This will be much more efficient than using the _1 and _2 data members to access the contents of the tuple. 
      • 3-or-more tuples are not so optimized, so avoid them.
    • Return a Maybe[T] - if you would have returned an Option[T]. Note: you cannot pattern match on Maybe[T] so this has impact on code simplicity/clarity.
    • Return one result, retrieve the others each with an additional method call that gets the additional result from a data member of the object.
      • This has thread-safety implications (it is not thread safe), because different threads accessing the same object could collide on these return values being saved in the object. However, this is usually ok, because few objects are accessed simultaneously by multiple threads. It must be considered though.
    • Return strings by passing in and filling in CharBuffers/StringBuilders (See additional note below on why one should prefer StringBuilder)
    • Return multiple values (more than 2) by accepting a "result" object as an argument and filling it in. This is clunky and C-like, but efficient.
      • The type of the object should be a trait having var slots that are filled in by the called method.
      • Making it a trait allows the caller to decide how to implement it. They can mix it into the class making the call sometimes, or allocate an object, but hoist that outside the inner loop, etc.
  • Avoid use of the synchronized (aka thread-safe) classes except when truly necessary. So use StringBuilder, not StringBuffer.
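
Several of the hints above can be combined in one sketch: while loops, if-then-else, a hoisted sub-expression, a "result" trait with var slots, and a caller-supplied StringBuilder. All names here (ScanResult, Scanner, firstToken, totalTokenLength) are hypothetical and are not part of Daffodil; java.lang.StringBuilder is used explicitly because its append(CharSequence, int, int) overload copies a sub-range without creating an intermediate String.

```scala
// Hypothetical result holder: a trait with var slots that the callee fills in.
trait ScanResult {
  var found: Boolean = false
  var start: Int = 0
  var length: Int = 0
}

object Scanner {
  // Finds the first run of non-delimiter characters in `input`, records its
  // position in `result`, and appends its text to `out`.
  // Nothing is allocated for the return value.
  def firstToken(input: String, delim: Char,
                 result: ScanResult, out: java.lang.StringBuilder): Unit = {
    val n = input.length                                // hoisted: computed once
    var i = 0
    while (i < n && input.charAt(i) == delim) i += 1    // skip leading delimiters
    val begin = i
    while (i < n && input.charAt(i) != delim) i += 1    // consume the token
    if (i > begin) {
      result.found = true
      result.start = begin
      result.length = i - begin
      out.append(input, begin, i)                       // copies input[begin, i)
    } else {
      result.found = false
      result.start = 0
      result.length = 0
    }
  }

  // Caller side: the result holder and the builder are allocated once,
  // outside the loop, and reused on every iteration.
  def totalTokenLength(inputs: Array[String]): Int = {
    val result = new ScanResult {}
    val out = new java.lang.StringBuilder
    var total = 0
    var k = 0
    while (k < inputs.length) {
      out.setLength(0)                                  // reset, do not reallocate
      firstToken(inputs(k), ',', result, out)
      if (result.found) total += result.length
      k += 1
    }
    total
  }
}
```

Because ScanResult is a trait, the caller could equally mix it into the calling class instead of allocating a separate object. As the list notes, a shared result holder like this is not thread safe.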


this page has moved to