Comment: Coding for speed section

...

(TBD: perhaps a naming convention to not call these "Mixins", but to specifically call them out as intended-to-be-single-use. E.g., replace the suffix "Mixin" with "Extension" or "Part" or "Aspect")

 

Coding Scala for Speed

There are lots of things in Scala that make programming convenient and clear, but in the core speed paths of the software we want to use programming idioms that result in the fastest possible code, on par with the fastest Java code for the same functionality. This section is a collection of hints on how to do that.

  • Use while loops - the other kinds of loops (foreach, map, flatMap, etc.) are higher-order method calls that take closures, so they compile to more complex code than a plain while loop.
  • Use if-then-else. Avoid match-case with pattern matching. An exception may be very simple match cases, i.e., matching on character or integer values. Think of it this way: you know what it is turning into when compiled if you use if-then-else, so there will be no performance surprises.
  • Hoist out common sub-expressions, even small ones.
  • Maybe not Option - use the Maybe/One/Nope classes instead of Scala's Option type.
  • Inlining - The @inline annotation can be used, but sparingly, because it slows compilation down a lot. A final method of a final class has a chance of being inlined if it has no overrides. @inline should really only be used on classes like the Maybe[T] class and other 'value classes', e.g., units-of-measure classes.
  • Avoid allocating objects just to return results, such as tuples (except pairs, a special case described below), case-class objects, etc. There are several alternatives:
    • 2-Tuples are optimized out if they are immediately destructured in the caller. So given def foo(z: Int): (Int, Int) and a call val (x, y) = foo(z), no tuple allocation will be done.
      • So always destructure pairs immediately. This will be much more efficient than using the _1 and _2 data members to access the contents of the tuple. 
      • 3-or-more tuples are not so optimized, so avoid them.
    • Return a Maybe[T] where you would otherwise have returned an Option[T]. Note: you cannot pattern match on Maybe[T], so this affects code simplicity/clarity.
    • Return one result, and retrieve each of the others with an additional small getter-like method call that reads it from a data member of the object.
      • This has thread-safety implications (it is not thread safe), because different threads accessing the same object could collide on these return values being saved in the object. However, this is usually ok, because few objects are accessed simultaneously by multiple threads. It must be considered though.
    • Return strings by passing in and filling in CharBuffers/StringBuilders. (See the additional note below on why one should prefer StringBuilder.)
    • Return multiple values (more than 2) by accepting a "result" object as an argument and filling it in. This is clunky and C-like, but efficient.
      • The type of the object should be a trait having var slots that are filled in by the called method.
      • Making it a trait allows the caller to decide how to implement it. They can mix it into the class making the call sometimes, or allocate an object, but hoist that outside the inner loop, etc.
  • Avoid use of the synchronized (aka thread-safe) classes except when truly necessary. So use StringBuilder, not StringBuffer.
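
As a rough illustration of the while-loop and hoisting hints above, here is a minimal sketch. The object name WhileLoopSketch and the method sumSquares are hypothetical, chosen only for this example; they are not part of any existing code base.

```scala
object WhileLoopSketch {
  // Sums the squares of the first `n` elements of `data`.
  // A plain while loop with local vars is used instead of
  // data.take(n).map(...).sum, which would allocate intermediate
  // collections and closures.
  def sumSquares(data: Array[Int], n: Int): Long = {
    // Hoist the loop bound so it is computed once, not on every iteration.
    val limit = math.min(n, data.length)
    var i = 0
    var total = 0L
    while (i < limit) {
      val v = data(i) // read the element once and reuse it below
      total += v.toLong * v
      i += 1
    }
    total
  }
}
```

The loop body uses only local variables and primitive arithmetic, so what it compiles to should hold no surprises.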
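
The "result object" idea from the list above might look something like the following sketch. The names RangeStats, ResultObjectSketch, rangeStats, and WindowScanner are all hypothetical and exist only to illustrate the pattern: a trait with var slots that the called method fills in, and a caller that mixes the trait in so no separate result object is allocated.

```scala
// Hypothetical result trait: var slots that the called method fills in.
trait RangeStats {
  var min: Int = 0
  var max: Int = 0
  var sum: Long = 0L
}

object ResultObjectSketch {
  // Fills `out` with the min, max and sum of data(from) up to data(until - 1),
  // instead of allocating and returning a 3-tuple or case-class object.
  // Assumes from < until and that both are within the bounds of `data`.
  def rangeStats(data: Array[Int], from: Int, until: Int, out: RangeStats): Unit = {
    var lo = data(from)
    var hi = lo
    var total = lo.toLong
    var i = from + 1
    while (i < until) {
      val v = data(i)
      if (v < lo) lo = v
      if (v > hi) hi = v
      total += v
      i += 1
    }
    out.min = lo
    out.max = hi
    out.sum = total
  }
}

// One way for a caller to implement the trait: mix it into the calling class.
// As with any result saved in an object, this is not thread safe.
final class WindowScanner(data: Array[Int]) extends RangeStats {
  def spread(from: Int, until: Int): Int = {
    ResultObjectSketch.rangeStats(data, from, until, this)
    max - min
  }
}
```

A caller that does not want to mix the trait in could instead allocate a single `new RangeStats {}` outside its inner loop and pass that same object on every call.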
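
Returning a string by filling in a caller-supplied StringBuilder could be sketched as follows. The object StringFillSketch and the method appendHex are hypothetical names used only for illustration; java.lang.StringBuilder is the unsynchronized JDK class.

```scala
object StringFillSketch {
  private val digits = "0123456789abcdef"

  // Appends the lowercase hex digits of `bytes` to `sb` rather than
  // building and returning a new String on every call.
  def appendHex(bytes: Array[Byte], sb: java.lang.StringBuilder): Unit = {
    var i = 0
    while (i < bytes.length) {
      val b = bytes(i) & 0xff // treat the byte as unsigned
      sb.append(digits.charAt(b >>> 4))
      sb.append(digits.charAt(b & 0x0f))
      i += 1
    }
  }
}
```

The caller can keep one builder around and call `sb.setLength(0)` before each use, so the underlying buffer is reused rather than reallocated.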