Performance Coding for Runtime

This page is for Scala performance coding hints that should be used for performance critical code, such as the repetitive places in the Daffodil runtime module.

These ideas basically amount to writing "java-like" Scala code.

Do not use this style except in performance critical areas, as it makes the code less readable, less compact, much harder to get correct, etc.

Hopefully in the future improvements in JVMs and the Scala compiler will make some of these techniques less necessary.

Here's a link to a page about Java performance coding - avoiding unnecessary allocation for Java code. The same principles apply to Scala code: http://blog.takipi.com/5-coding-hacks-to-reduce-gc-overhead/

Avoid Unnecessary Allocation

Many things in Scala cause allocation of objects on the heap. This involves quite a lot of overhead to allocate the object (which has extra locations in it beyond the members), initialize memory, call the constructor, etc.

Measurements have often shown allocation to be a large cost, so there is a bunch of techniques for avoiding excess allocation.

Avoid Passing Functions - While Loops for Iteration - or Macros

Scala's map, flatmap, fold, reduce, foreach, etc. All these things take a function argument. Due to JVM issues, even though these functions are only used downward, they still end up allocated on the heap.

In general this means writing plain-old while-loops instead of Scala's much more compact idioms.

In some cases Macros can be used to create something about as compact as a scala map/fold idiom but without expressing a function object at all. See LoggerMacros.scala for examples of this.

Avoid Passing Functions - By Name Arguments - Use Macros

Code like this

def foo (a: Int, b : => Int) = { .... }

Every time method foo is called, a little function closure is allocated for the 'b' argument passing.

Probably the worst offender in this is

val opt : Option[Foo] = ....

opt.getOrElse(Assert.invariantFailed("opt should be defined")) // allocates closure around assertion.

So don't call getOrElse in performance code!

In our own classes, a macro can often be used instead to avoid the need for the by-name argument. (See AssertMacros.scala and/or LoggerMacros.scala for examples of this)

Avoid Return Objects and Tuples

These are often used to pass information back to the caller of a more complex nature, but then are discarded.

The alternative is to pass in a mutable object that is filled in by the called method. (See OnStack/LocalStack below about where that mutable object might come from.)

For the very common case of wanting to return an optional result, e.g., where you would want to return Option[T], instead return a Maybe[T] for objects, and use MaybeInt, MaybeLong, etc. for numbers. See below about avoiding Option type.

Similar common return types are small tuples of values, and the Either[L, R] and Try[T] types.

See also the Cursor & Accessor Idiom below.

Avoid Option Type - Use Maybe Family of Types

Scala's Option type (Some, None) involves a heap-allocated object to represent the Some case. Furthermore, if you make a

val foo: Option[Int] = Some(5)

That's two objects. Because the 5 has to be boxed so that it can appear in the generic "collection" type Some.

We have a AnyVal-derived family of Maybe types. There are specialized variants for the unboxed types like Int

val foo: MaybeInt = MaybeInt(5)
val bar: MaybeInt = MaybeInt.Nope

For objects, the basic

val foo : Maybe[String] = One("foobar")

However, see below about MStack and generic collections.

Built-in Scala libraries often make heavy use of the Option type. See section about HashMap below.

For Error vs. OK situations

A common pattern where Either[L, R] types are used is where Right(()) means "OK", and Left("error message") indicates a problem. This is most commonly used as a return type. To avoid allocation we have the OKOrError class. This is much like the Maybe[T] class in that it is a value class (derived from AnyVal), so it doesn't require allocation when passed around as argument or returned as a value. If you store these in a collection; however, then they do get allocated. But normally that's not the case with these OK/Error situations.

OKOrError is like Maybe[T] except that Maybe often has method names that would be misleading. If you represented error or ok states with a Maybe[String] then isDefined would be true for errors, and isEmpty would be true for "ok". Because that gets instantly confusing we have OKOrError with isError and isOK predicates.

Avoid Boxing/Unboxing Numbers

Scala's Int, Long, Short, Byte are all value types, so are passed and returned without allocating. Specialized collections like Array[Byte] also avoid allocating a boxed byte. But generic collections such as ListBuffer[T] are going to allocate a box every time a number of primitive type is inserted.

There is no idea fix for this. But consider carrying around number objects instead, so that the box gets allocated once, and then the code just carries around the number in its boxed form. So instead of allocating and discarding boxes for numbers all the time the code just carries around an object reference to a single boxed object.

The way to do this is to explicitly use the Java boxed number types. The way we do this that makes the intention clear is

import java.lang.{Integer => JInt, Long => JLong, Short => JShort, Byte => JByte, Float => JFloat, Double => JDouble}
import java.math.{BigInteger => JBigInt, BigDecimal => JBigDecimal}

Then the prefix "J" on the name makes it clear that a boxed numeric type is being used.

Avoid Generic Collections of Unboxed Types

val boxedInts = new ArrayBuffer(len)  // adding an Int allocates a box every time. 
var unboxedInts = new Array[Int](len) // Non allocating - note var idea - in case you need to resize it.

Avoid match-case with Pattern Matching

Scala's very nice match-case, and case classes uses Option types underneath in the matching.

Instead of this:

foo match {
case Foo(a, b) => ....
...

Write this instead:

foo match {
case f: Foo => { val a = f.a ; val b = f.b ; ....
...

This doesn't allocate in execution.

Avoid scala.collections.Map. Use NonAllocatingMap (wraps Java Maps) Instead

This is a library instance of the "avoid Option type" problem.

Scala's maps, including mutable HashMap, allocate a Some[T] object for every successful get(key) call.

This is unacceptable overhead for something done so frequently.

It is better to use Java's java.util.Map classes instead, where get(key) returns a value or null, and never allocates anything.

To make this convenient, there is a wrapper NonAllocatingMap:

import edu.illinois.ncsa.daffodil.util.NonAllocatingMap

val myMap = new NonAllocatingMap[String, String](new java.util.HashMap[String, String])

This provides the map functionality of the java-provided map class, recast as Scala's types.

Use MStack, avoid mutable.Stack

We need lots of stacks, and since Scala's general stacks are generic collections, we created our own non-boxing flavors:

MStack.Of[T] - generic
MStack.OfInt - stack of Int - non-boxing
MStack.OfMaybe[T] - doesn't create box for the Maybe object. Uses null for Nope, and a regular object reference for One.
- However, MStackOfMaybe[Int] will box and unbox the Int -

Allocate on "the stack" Using OnStack and LocalStack

See the definitions of these. These make it convenient to reuse objects in recursive code, as is common in Daffodil. A small pool of reusable objects is maintained per thread. Accessing one causes it to either be created, or an existing one initialized. They get put back on exit of scope.

What these achieve is some of the efficiency one sees in programming languages like C or C++ where most data structures are stack allocated and never require heap allocation nor garbage collection. There is more overhead to OnStack because it uses a thread-local to insure thread safety. if you have a per-thread data structure (such as a per-thread state block) being passed around, then you can use a LocalStack in the state block, which has less overhead.

Use Reusable Pools of Stored Objects

When reusable objects do not follow a stack discipline, then you can still reuse them by pooling them.

See Pool.scala for a common pooling idiom

Iteration Patterns - Avoid Iterators - Use Cursors and Accessors

If you consider that we have to avoid scala's nice generic collection functional operations like foreach and map, one might be tempted to just use the Iterator[T] class.

However, if the generic type T here is a value type (e.g., Int, or Long or Boolean) then calling next() will return a boxed object. Whether that box is saved somewhere or is being created and discarded immediately depends on what you are iterating over.

But the real issue here is about return objects - when they're not just returned from a method call, but you want to iterate over them.

We want to iterate over something, but the items in whatever we're iterating are aggregates of things that we immediately want to just break apart and use the pieces.

val iter : Iterator[(String, String)] = ....
...
val (left, right) = iter.next()
...

Now each time we iterate, an object is created for return from iter.next() (that is unless that object already exists in some collection, but in many cases it doesn't exist.)

An alternative idiom is the Cursor & Accessor pattern:

case class R(var left: String, var right: String) extends Accessor[R] { // note mutable vars !
   def cpy(): R = R(left, right)
   def assignFrom(other: R) { 
      this.left = other.left
      this.right = other.right
   }
   
}

class RCursor[R] extends Cursor[R] {

    val advanceAccessor = R(null, null)

    override def advance : Boolean = { ... }
    ... 
}

The idea here is you call the advance method, and it returns a boolean telling you if it was able to advance the cursor to another object. The object is "returned" by side-effecting the accessor. Each call to advance clobbers the same object. This is a way to iterate over vast amounts of complex data without having to create any objects.

There is also an inspect method (which works like peek() - looks ahead at next thing, but doesn't 'advance' to it. It fills in a different accessor so that you don't have to copy to look at the current and next items simultaneously.

If you want to revert to using ordinary Scala idioms like collections and Iterators you can copy the accessor, or assign to them with methods on the Accessor class (cpy and assignFrom).

See Cursor.scala for the traits.

Space shortcuts

Child pages