Scala Comprehensions Don’t Confuse Me Anymore

When I first started using Scala, one of the super cool features every blog, tutorial, and coworker mentioned was the sequence comprehension, or for expression. The official Scala documentation covers the feature. Go ahead and read it. Done? Great! You might still be confused, like I was, because that documentation isn’t as helpful as it could be.

If you’ve ever used Python’s list or dict comprehensions you are probably wondering why Scala’s version is so clunky and confusing. Comprehensions are supposed to make things easier to understand—I mean, that’s literally in the name of the feature itself!

In Scala, comprehensions are really syntactic sugar for a series of calls to map, flatMap, and withFilter (plus foreach when there is no yield). This is a big deal in functional programming languages because it relates to monads. And yeah, I’m sorry for bringing up that word, but you can’t escape it: you’re writing functional programs now. If you don’t already know what a monad is, then all you really need to know for the purposes of this exercise is that it’s a wrapper with a particular set of features. Examples of monads are Option and List. For more details, I recommend Demystifying the Monad in Scala.
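To make the “syntactic sugar” claim concrete, here’s a minimal sketch of the rewrite the compiler performs, using Option and List. The object and value names are made up for illustration, and the desugared forms are approximately what the compiler generates, not its literal output.

```scala
object DesugarSketch {
  val a: Option[Int] = Some(2)
  val b: Option[Int] = Some(3)

  // The sugared form: two generators and a yield.
  val sugared: Option[Int] =
    for {
      x <- a
      y <- b
    } yield x * y

  // Roughly the same thing by hand: every generator except the last
  // becomes a flatMap, and the last one becomes a map.
  val desugared: Option[Int] =
    a.flatMap(x => b.map(y => x * y))

  // A guard (`if`) inside a comprehension becomes a call to withFilter.
  val evens: List[Int] =
    for { n <- List(1, 2, 3, 4) if n % 2 == 0 } yield n * 10

  val evensDesugared: List[Int] =
    List(1, 2, 3, 4).withFilter(n => n % 2 == 0).map(n => n * 10)
}
```

Both Option values come out as Some(6) and both lists as List(20, 40). If either a or b were None, the whole expression would be None, and that short-circuiting is what the rest of this post leans on.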

But I digress. What I want to show you now is how I learned what the Scala for expression does using a real-world example. I’ll start from the terrible procedural code I wrote to solve my problem, and then iteratively transform it into a more concise and easier-to-understand version. Eventually we’ll end up with a for comprehension that actually makes sense.

The Problem

I needed to pull some data out of my typesafe configuration and then provide a convenient interface for it. Here’s the data in question:

import our.internal.library.Config
import spray.json.DefaultJsonProtocol._
import spray.json._

import scala.util.Try

// Expected JSON: {
// "the-wotnot": {
//   "product-id": "something-something",
//   "duration": 30
// }, [...] }
lazy val products: Option[JsObject] =
  Try(config.getString("products").parseJson.asJsObject).toOption

case class ProductInfo(productId: String, duration: Int)

It’s when I implemented the convenient interface part that the trouble began.

A Procedural Solution

def getProductInfo(productName: String): Option[ProductInfo] = {
  val info = Try(products.get.fields(productName)).toOption
  if (info.isEmpty) {
    None
  } else {
    val fields = Try(info.get.asJsObject.fields).toOption
    if (fields.isEmpty) {
      None
    } else {
      val productId = fields.get.get("product-id")
      val duration = fields.get.get("duration")
      if (productId.isEmpty || duration.isEmpty) {
        None
      } else {
        val productIdString = Try(
          fields.get("product-id").convertTo[String]
        ).toOption
        val durationInt = Try(
          fields.get("duration").convertTo[Int]
        ).toOption
        if (productIdString.isEmpty || durationInt.isEmpty) {
          None
        } else {
          Some(ProductInfo(productIdString.get, durationInt.get))
        }
      }
    }
  }
}

Wow! That’s impossible to understand at a glance. And I don’t know about you but whenever I see code like this:

        }
      }
    }
  }
}

…let alone write it myself, a little part of my soul dies.

A Series of Maps

Knowing that sequence comprehensions are essentially syntactic sugar for a series of maps, my first stab at cleaning this up was to rewrite it using map or, in this case, flatMap. Remember, Options are monads!

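def getProductInfo(productName: String): Option[ProductInfo] = {
  // a: Option[Option[JsValue]]; the outer Option is None if anything inside Try throws
  val a = Try(products.get.fields.get(productName)).toOption
  // b: the two raw JSON values, each still wrapped in an Option
  val b = a.flatMap { i =>
    Try((
      i.get.asJsObject.fields.get("product-id"),
      i.get.asJsObject.fields.get("duration")
    )).toOption
  }
  // c: the same two values converted to the Scala types we want
  val c = b.flatMap { i =>
    Try((
      i._1.get.convertTo[String],
      i._2.get.convertTo[Int]
    )).toOption
  }
  c.flatMap { i =>
    Try(ProductInfo(i._1, i._2)).toOption
  }
}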

Definitely cleaner but whenever I see a bunch of maps my eyes kinda glaze over. Maybe I just need more exposure to them. But it’s the same when I see map in Python code. I always think “Ugh, how does map work again? And why isn’t this a list/dict comprehension?”

Also, just like in the procedural code, I am forced to give names to variables that I’d honestly prefer not to have to name (as evidenced by my picking a, b, c, and i). These values are all just intermediary state that I’d rather not store in named values at all.
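As an aside on why the version above reaches for flatMap rather than plain map: when the function you pass already returns an Option, map nests the result while flatMap flattens it. Here’s a small sketch of the difference. The settings map and the parseInt helper are hypothetical stand-ins for the config lookup and the JSON conversion, not part of my real code.

```scala
import scala.util.Try

object FlatMapSketch {
  // Hypothetical stand-in for values pulled out of a config.
  val settings: Map[String, String] =
    Map("duration" -> "30", "name" -> "wotnot")

  // A step that can fail, expressed as an Option.
  def parseInt(s: String): Option[Int] = Try(s.toInt).toOption

  // map wraps the Option returned by parseInt inside another Option.
  val nested: Option[Option[Int]] = settings.get("duration").map(parseInt)

  // flatMap collapses the two layers into one.
  val flat: Option[Int] = settings.get("duration").flatMap(parseInt)

  // A failure at either step yields a plain None.
  val notANumber: Option[Int] = settings.get("name").flatMap(parseInt)
  val missingKey: Option[Int] = settings.get("nope").flatMap(parseInt)
}
```

Here nested is Some(Some(30)) while flat is Some(30), and both failure cases are simply None. That flattening is what lets several fallible steps chain together without a pile of nested wrappers.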

A Sequence Comprehension

I’m assuming you’ve at least seen one of these things before now so I’m not going to talk about the syntax here. What I tried to do here was rewrite every map operation as a line in a for expression.

def getProductInfo(productName: String): Option[ProductInfo] = {
  for (
    jsdata <- Try(products.get.fields.get(productName)).toOption;
    fields <- Try(jsdata.get.asJsObject.fields).toOption;
    jsargs <- Try((fields.get("product-id"), fields.get("duration"))).toOption;
    args <- Try((
      jsargs._1.get.convertTo[String],
      jsargs._2.get.convertTo[Int]
    )).toOption
  ) yield ProductInfo(args._1, args._2)
}

Now that I’ve taken the time to go from procedural code to a comprehension, it’s much clearer to me what the steps of transformation are in this function.

  1. Pull out some JSON data from a config value.
  2. Get the fields from the JSON data.
  3. Pull out the arguments I’m looking for (initially as JsValue).
  4. Convert the JsValue arguments into the types I want.
  5. Yield the case class I’ve been working towards, using the arguments I’ve just built.

A Better Sequence Comprehension

Now if you’re a more experienced Scala programmer, you probably have a few suggestions, as did my coworker, Kent. I wanted to remove some duplicated code such as the multiple calls to toOption. I also believed there was a way to do this while avoiding tuples and those cryptic-looking _1 and _2 accessors. Kent found a way!

def getProductInfo(productName: String): Option[ProductInfo] = {
  val info = for {
    fields <- Try(products.get.fields(productName).asJsObject.fields)
    productId <- Try(fields("product-id").convertTo[String])
    duration <- Try(fields("duration").convertTo[Int])
  } yield ProductInfo(productId, duration)
  info.toOption
}

Improvements:

  • Defer conversion to Option until the very end to avoid repeated calls to toOption.
  • Avoid using a tuple by breaking up the fetch for productId and duration into two steps.
  • Use for { ... } instead of for ( ... ) to drop those annoying semicolon delimiters!
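If you want to play with this pattern without pulling in spray-json, here’s a self-contained sketch of the same shape. Note the assumptions: the nested Map is a hypothetical stand-in for the parsed JSON, "broken" is a made-up entry that exercises the failure path, and toInt stands in for convertTo[Int].

```scala
import scala.util.Try

object TrySketch {
  final case class ProductInfo(productId: String, duration: Int)

  // Hypothetical stand-in for the parsed JSON config.
  val products: Map[String, Map[String, String]] = Map(
    "the-wotnot" -> Map("product-id" -> "something-something", "duration" -> "30"),
    "broken"     -> Map("product-id" -> "oops", "duration" -> "thirty")
  )

  def getProductInfo(productName: String): Option[ProductInfo] = {
    // Each generator runs only if the previous one succeeded; the first
    // exception becomes a Failure and the remaining steps are skipped.
    val info: Try[ProductInfo] = for {
      fields    <- Try(products(productName))     // throws if the product is missing
      productId <- Try(fields("product-id"))      // throws if the key is missing
      duration  <- Try(fields("duration").toInt)  // throws if it is not a number
    } yield ProductInfo(productId, duration)

    // Convert to Option once, at the very end.
    info.toOption
  }
}
```

Calling TrySketch.getProductInfo("the-wotnot") gives Some(ProductInfo("something-something", 30)), while "broken" and any unknown product name both give None. One trade-off to be aware of: toOption throws away the exception, so if you care why a lookup failed you’d want to return the Try itself.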

Final Thoughts

Hopefully you had an “ah-ha!” moment, just like I did. The initial procedural solution is ugly, but if you’re familiar with any non-functional programming language it should be fairly easy to trace through and understand what it is doing. The problem is that you have to trace through it. It’s also a lot more code to write, which increases the chances of subtle bugs. When I initially wrote it there were a number of tricky edge-case bugs in the code that I only discovered as I was implementing the series of maps and sequence comprehension versions.

The sequence comprehension version is so concise that it’s a bit magical. If someone showed me that code without any context and then explained what it does, I’d feel a bit uneasy. “How does it work?” would be my first thought. Or maybe, “What is even going on in this code?” Somehow the Scala for expression hides just enough information that it confuses the uninitiated. Hopefully though, you can now consider yourself initiated.
