Scala Type Classes From Scratch

June 18, 2021

Introduction

When I picked up Scala in 2018, I found several new constructs immediately useful: case classes, pattern matching, and immutable collections are just a few that come to mind.

However, one construct left me thoroughly confused for over a year: type classes.

As I’ve watched others learn the language, type classes are repeatedly a stumbling block.

In this post, I’ll walk through a realistic example problem, presenting a few dead-ends based on conventional solutions. I’ll demonstrate why we need a type class to solve the problem and build the solution incrementally. Finally, I’ll close with some thoughts on a couple of related concepts.

This is by no means a new topic, likely not even a new perspective, but hopefully it saves someone else some time while learning.

What’s In a Name?

The name type class is not particularly descriptive. I know what a type is. I know what a class is. But what is a type class?

There are certainly more rigorous definitions, but here is mine:

A type class is a set of methods which can be invoked on a generic type, without any specific type implementing or even knowing about the methods.

Let’s build some intuition by diving into an example.

A Motivating Example: JSON Encoding

Here’s our problem statement:

Write a method that takes an instance of some type, encodes it as a JSON string, and writes it to a file. The method should work for types outside the immediate codebase.

Let’s start with a User case class:

final case class User(firstName: String, lastName: String)

Then we need a method which takes two arguments: an instance of some type, and a File:

JsonUtil.toFile(User("Martin", "Odersky"), new File("/tmp/out.json"))

This call should write {"firstName": "Martin", "lastName": "Odersky"} to /tmp/out.json.

This should also work for any other type that can be encoded to JSON. We don’t yet know how we’ll implement this method, so I’m leaving the exact type signature ambiguous.

Could we add a mixin trait and only encode instances which extend it?

trait EncodesToJson {
  def toJson: String
}

final case class User(firstName: String, lastName: String) extends EncodesToJson {
  override def toJson: String = 
    s"""{"firstName": "$firstName", "lastName": "$lastName"}"""
}

object JsonUtil {
  def toFile(o: EncodesToJson, f: File): Unit =
    Files.writeString(f.toPath, o.toJson)
}

This works for types defined in our own codebase, but we can’t make a type from an external library extend this trait.

Could we make the toFile method take an `Any`?

object JsonUtil {
  def toFile(o: Any, f: File): Unit = o match {
    case u: User => 
      Files.writeString(f.toPath, s"""{"firstName": "${u.firstName}", ... }""")
    case _       => throw new Exception("Didn't see that one coming!")
  }
}

Again, this works for our own types, but it’s tedious to handle every type. We also lose type-safety, as there’s no way for the compiler to verify that the type passed will be matched. Moreover, if we publish this code in a library, we cannot handle the custom types that a user will provide.

As an aside, the Jackson libraries in Java take a similar approach. You register encoders for specific types early in your application’s lifecycle. Then you can pass an untyped Object to an encoding method. The method uses reflection to get your instance’s type and look for a registered encoder. If it finds an encoder, it encodes the instance. If not, it throws an exception. This would work for our problem as well, but I find it best to avoid untyped APIs and reflection.
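To make that trade-off concrete, here is a rough sketch of the registry idea in Scala. To be clear, this is not Jackson's API: RuntimeRegistry, register, and encode are names I made up for illustration. It only shows the general shape of looking up an encoder by runtime class.

```scala
import scala.collection.mutable

// Hypothetical runtime registry. This is NOT Jackson's API,
// just an illustration of looking up encoders by runtime class.
object RuntimeRegistry {
  // Encoders are stored untyped, keyed by the runtime class.
  private val encoders = mutable.Map.empty[Class[_], Any => String]

  // Register an encoder, typically early in the application's lifecycle.
  def register[T](cls: Class[T])(encode: T => String): Unit =
    encoders(cls) = (a: Any) => encode(cls.cast(a))

  // Look up the encoder at runtime; fail if none was registered.
  def encode(o: Any): String =
    encoders.get(o.getClass) match {
      case Some(enc) => enc(o)
      case None =>
        throw new IllegalArgumentException(
          s"No encoder registered for ${o.getClass.getName}")
    }
}
```

Note that a missing encoder only shows up when encode actually runs, which is exactly the kind of failure I'd rather have the compiler catch.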

We need something else

Let’s take stock of the problem:

We need a certain piece of functionality to work for multiple types.
We can’t guarantee that the types themselves will implement this piece of functionality.
We could implement it on our own for many types, but not all types, because we can’t know all the types up-front.

So if we work backwards from the problem, we basically need some way to call a method that returns a JSON String for any arbitrary type, without any type necessarily implementing, or even knowing about, the method. That sounds a lot like my original definition of a type class.

We finally need a type class

The type class Encoder[T] describes an API for encoding an instance of T as a JSON String.

trait Encoder[T] {
  def encode(t: T): String
}

Our first-pass implementation of the toFile method takes a T, an encoder for T, and the file.

object JsonUtil {
  def toFile[T](t: T, enc: Encoder[T], f: File): Unit = {
    val jsonString: String = enc.encode(t)
    Files.writeString(f.toPath, jsonString)
  }
}

Now we need an Encoder for the User type:

val userEncoder: Encoder[User] = new Encoder[User] {
  override def encode(t: User): String =
    s"""{"firstName": "${t.firstName}", "lastName": "${t.lastName}"}"""
}

Finally, we can call the toFile method:

JsonUtil.toFile(User("Martin", "Odersky"), userEncoder, new File("/tmp/out.json"))

Is that any better?

It’s slightly better than the previous solutions in that we can provide our own implementation of an Encoder for any T. But we’ve added a third argument to the toFile method, and now we have to keep track of the Encoder[User]. We can definitely still improve it.
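Here is a small sketch of that first point, using a type we definitely don't own: java.util.UUID from the JDK. The uuidEncoder value and the ExternalTypeExample object are hypothetical names for illustration, and I'm assuming a UUID should be encoded as a JSON string.

```scala
import java.io.File
import java.nio.file.Files
import java.util.UUID

trait Encoder[T] {
  def encode(t: T): String
}

object JsonUtil {
  def toFile[T](t: T, enc: Encoder[T], f: File): Unit = {
    Files.writeString(f.toPath, enc.encode(t))
  }
}

object ExternalTypeExample {
  // UUID lives in the JDK, so we can't make it extend a trait of ours.
  // We can still describe how to encode it, entirely from the outside.
  val uuidEncoder: Encoder[UUID] = new Encoder[UUID] {
    override def encode(t: UUID): String = "\"" + t.toString + "\""
  }
}
```

A call then looks like JsonUtil.toFile(UUID.randomUUID(), ExternalTypeExample.uuidEncoder, new File("/tmp/out.json")). UUID never had to know that Encoder exists.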

We need some implicits!

We can leverage Scala’s implicits to make this type class solution far more elegant.

Implicits are another common source of confusion in Scala, so I’ll proceed slowly.

The first implicit we need is an implicit parameter to the JsonUtil.toFile method:

object JsonUtil {
  def toFile[T](t: T, f: File)(implicit enc: Encoder[T]): Unit = {
    val jsonString: String = enc.encode(t)
    Files.writeString(f.toPath, jsonString)
  }
}

This lets us call toFile with just two arguments, and the compiler finds the Encoder[T].

As an aside, we’ll often see these implicit parameters named ev, or ev1, ev2, and so on. The name ev stands for “evidence.” We’re basically saying, “To call this method, we need evidence that there exists an Encoder for T.”

So then we need an implicit Encoder for a User:

final case class User(firstName: String, lastName: String)

object User {
  implicit val userEncoder: Encoder[User] = new Encoder[User] {
    override def encode(t: User): String =
      s"""{"firstName": "${t.firstName}", "lastName": "${t.lastName}"}"""
  }
}

Why did we put the implicit in the User companion object? Because we know the compiler automatically looks for implicits related to a class in the companion object. You can learn more about implicit resolution from this excellent Stack Overflow answer.

Now we can call toFile without any explicit reference to an Encoder:

JsonUtil.toFile(User("Martin", "Odersky"), new File("/tmp/out.json"))
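If the implicit call feels like magic, it may help to see it next to the explicit version. The sketch below repeats the definitions from this section and adds a hypothetical ImplicitCallExample object with two methods that should behave identically: one relies on the compiler to find the Encoder, the other passes it by hand in the second parameter list.

```scala
import java.io.File
import java.nio.file.Files

trait Encoder[T] {
  def encode(t: T): String
}

object JsonUtil {
  def toFile[T](t: T, f: File)(implicit enc: Encoder[T]): Unit = {
    Files.writeString(f.toPath, enc.encode(t))
  }
}

final case class User(firstName: String, lastName: String)

object User {
  implicit val userEncoder: Encoder[User] = new Encoder[User] {
    override def encode(t: User): String =
      s"""{"firstName": "${t.firstName}", "lastName": "${t.lastName}"}"""
  }
}

// Hypothetical wrapper object, only here to compare the two call styles.
object ImplicitCallExample {
  val user: User = User("Martin", "Odersky")

  // What we write: the compiler finds User.userEncoder for us.
  def writeImplicitly(f: File): Unit =
    JsonUtil.toFile(user, f)

  // Roughly what the compiler fills in: the same call, spelled out.
  def writeExplicitly(f: File): Unit =
    JsonUtil.toFile(user, f)(User.userEncoder)
}
```

Being able to pass the instance explicitly is also a handy escape hatch when debugging implicit resolution.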

Another way to write toFile

We can slightly alter the toFile method to use a Context Bound to express that T should have an Encoder, and then access the Encoder using the implicitly method:

object JsonUtil {
  def toFile[T : Encoder](t: T, f: File): Unit = {
    val jsonString: String = implicitly[Encoder[T]].encode(t)
    Files.writeString(f.toPath, jsonString)
  }
}

When I see the context bound T : Encoder, I read “there exists an instance of Encoder for T.” The implicitly method just lets us grab that instance.

What’s the advantage of this? As far as I know, it’s just syntactic sugar. I like to use it because it lets me avoid thinking of a name for the implicit parameter.
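To see the sugar side by side, here is a minimal sketch with two methods that I'd expect to behave the same. The names ContextBoundExample, encodeWithParam, encodeWithBound, and intEncoder are all hypothetical, and the methods return a String instead of writing a file to keep the example short.

```scala
trait Encoder[T] {
  def encode(t: T): String
}

object ContextBoundExample {
  // Version 1: an explicitly named implicit parameter.
  def encodeWithParam[T](t: T)(implicit enc: Encoder[T]): String =
    enc.encode(t)

  // Version 2: a context bound. The compiler adds an unnamed implicit
  // Encoder[T] parameter, and implicitly retrieves it.
  def encodeWithBound[T: Encoder](t: T): String =
    implicitly[Encoder[T]].encode(t)

  // A trivial instance so we have something to encode.
  implicit val intEncoder: Encoder[Int] = new Encoder[Int] {
    override def encode(t: Int): String = t.toString
  }
}
```

After import ContextBoundExample._, both encodeWithParam(42) and encodeWithBound(42) go through the same intEncoder.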

Edge Cases

What happens if there’s no Encoder for a type?

A problem with the method that takes an Any was that it could be called on types which we hadn’t anticipated. If we call toFile on an instance of a type with no corresponding Encoder, we get a compiler error.

JsonUtil.toFile(("Scala", 3), new File("/tmp/out.json"))

could not find implicit value for parameter enc: Encoder[(String, Int)]

As soon as we provide an implicit instance of Encoder[(String, Int)], we can proceed.

What happens if there’s no companion object?

So far we’ve only provided an implicit Encoder[T] in the companion object of the type T.

In the previous section, I mentioned providing an implicit instance of Encoder[(String, Int)]. We don’t have access to the companion object for (String, Int). Likewise, if the type T comes from an external library, we can’t modify its companion object. So how do we proceed?

In either case, we can provide an implicit Encoder in the companion object for Encoder:

trait Encoder[T] {
  def encode(t: T): String
}
object Encoder {
  // Type User comes from an external library
  // so we can't modify the companion object.
  implicit val userEncoder: Encoder[User] = new Encoder[User] {
    override def encode(t: User): String =
      s"""{"firstName": "${t.firstName}", "lastName": "${t.lastName}"}"""
  }
  
  // (String, Int) comes from the standard library,
  // so we can't modify its companion object either.
  implicit val stringIntEncoder: Encoder[(String, Int)] = new Encoder[(String, Int)] {
    override def encode(t: (String, Int)): String =
      s"""["${t._1}", ${t._2}]"""
  }
}

The implicits are resolved just like the implicits in the User companion object.
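As a quick check of that claim, here is a sketch that calls toFile on a tuple without importing anything. I've kept only the tuple instance, and CompanionScopeExample is a hypothetical name for the calling code.

```scala
import java.io.File
import java.nio.file.Files

trait Encoder[T] {
  def encode(t: T): String
}

object Encoder {
  implicit val stringIntEncoder: Encoder[(String, Int)] =
    new Encoder[(String, Int)] {
      override def encode(t: (String, Int)): String =
        s"""["${t._1}", ${t._2}]"""
    }
}

object JsonUtil {
  def toFile[T: Encoder](t: T, f: File): Unit = {
    Files.writeString(f.toPath, implicitly[Encoder[T]].encode(t))
  }
}

object CompanionScopeExample {
  // No import of Encoder.stringIntEncoder is needed here: when looking
  // for an Encoder[(String, Int)], the compiler also searches the
  // companion object of Encoder itself.
  def write(f: File): Unit =
    JsonUtil.toFile(("Scala", 3), f)
}
```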

What happens if all the types are in an external library?

Let’s say that User and Encoder both come from an external library, so we can’t modify either companion object.

There are a few ways to solve this. My preferred approach is to use a mixin trait with implicit instances of a single type class.

trait EncoderInstances {
  implicit val userEncoder: Encoder[User] = new Encoder[User] {
    override def encode(t: User): String =
      s"""{"firstName": "${t.firstName}", "lastName": "${t.lastName}"}"""
  }
}
object EncoderInstances extends EncoderInstances

If we want all implicit instances for the Encoder type class, we just extend the trait:

object Example extends App with EncoderInstances {
  JsonUtil.toFile(User("Martin", "Odersky"), new File("/tmp/out.json"))
}

If we just want a specific instance, we can import it from the EncoderInstances object:

import EncoderInstances.userEncoder

object Example extends App {
  JsonUtil.toFile(User("Martin", "Odersky"), new File("/tmp/out.json"))
}

What happens if we have several instances of the same type class?

Sometimes there is no single canonical implementation of a type class. For example, there might be one Encoder for User that uses camel-case, and another that uses snake-case.

When this happens, I generally avoid implementing any instances in the companion objects. Instead, I suggest using separate mixin traits, e.g., CamelCaseInstances and SnakeCaseInstances. Then we can pick which one to use in specific areas of the application.
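Here is a minimal sketch of what those two traits might look like. The snake-case field names, and the PublicApi and InternalApi objects, are assumptions I made up to show two areas of an application picking different instances.

```scala
trait Encoder[T] {
  def encode(t: T): String
}

final case class User(firstName: String, lastName: String)

trait CamelCaseInstances {
  implicit val userEncoder: Encoder[User] = new Encoder[User] {
    override def encode(t: User): String =
      s"""{"firstName": "${t.firstName}", "lastName": "${t.lastName}"}"""
  }
}
object CamelCaseInstances extends CamelCaseInstances

trait SnakeCaseInstances {
  implicit val userEncoder: Encoder[User] = new Encoder[User] {
    override def encode(t: User): String =
      s"""{"first_name": "${t.firstName}", "last_name": "${t.lastName}"}"""
  }
}
object SnakeCaseInstances extends SnakeCaseInstances

// Hypothetical area of the application that wants snake-case.
object PublicApi extends SnakeCaseInstances {
  def render(u: User): String = implicitly[Encoder[User]].encode(u)
}

// Hypothetical area of the application that wants camel-case.
object InternalApi extends CamelCaseInstances {
  def render(u: User): String = implicitly[Encoder[User]].encode(u)
}
```

Each object sees exactly one Encoder[User], so there's no ambiguity. Mixing both traits into the same object would not compile, which is a useful guard against picking a style by accident.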

Problem Solved?

Let’s review what we accomplished using a type class:

We defined Encoder[T], a type class specifying an API for encoding a T as a JSON String.
We implemented an instance of Encoder for the User type.
We introduced implicits to minimize boilerplate in the API.
We verified we can only call JsonUtil.toFile[T] if T has an Encoder.
We discussed strategies for handling several edge cases.

The Complete Example

There are a lot of code snippets floating around. Here is the example again, end-to-end:

import java.io.File
import java.nio.file.Files

trait Encoder[T] {
  def encode(t: T): String
}

object JsonUtil {
  def toFile[T : Encoder](t: T, f: File): Unit = {
    val encoder: Encoder[T] = implicitly[Encoder[T]]
    val jsonString: String = encoder.encode(t)
    Files.writeString(f.toPath, jsonString)
  }
}

final case class User(firstName: String, lastName: String)

object User {
  implicit val userEncoder: Encoder[User] = new Encoder[User] {
    override def encode(t: User): String =
      s"""{"firstName": "${t.firstName}", "lastName": "${t.lastName}"}"""
  }
}

object TypeClassesFromScratch extends App {
  JsonUtil.toFile(User("Martin", "Odersky"), new File("/tmp/out.json"))
}

Running TypeClassesFromScratch produces a file /tmp/out.json, with contents as expected:

$ cat /tmp/out.json 
{"firstName": "Martin", "lastName": "Odersky"}

Ad-hoc Polymorphism and the Single Responsibility Principle

I’ll end this post with some discussion of a couple of related ideas: Ad-hoc Polymorphism and the Single Responsibility Principle (SRP).

Type classes are often referenced as a means to achieve ad-hoc polymorphism in Scala. To me, the term ad-hoc means that something works without up-front planning. I think of polymorphism as a pattern in which different types can be accessed through a common API. So ad-hoc polymorphism describes the existence of polymorphism without up-front planning.

A good example of polymorphism in Scala is the collections API. For example, you can call map on a List[Int] and a List[String]. However, this type of polymorphism is not ad-hoc. The Scala language team anticipated that users want to call map, flatMap, reduce, fold, etc. on a List. So they carefully designed an API with these methods.

The Scala language team did not, however, provide support for JSON encoding. If we want polymorphic JSON encoding of List, we need to implement it ourselves in an ad-hoc fashion.
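As a sketch of what that might look like, we can define an Encoder for List[T] that works whenever an Encoder for T exists. The listEncoder and intEncoder instances and the toJson helper are hypothetical names, and I'm assuming a List should be encoded as a JSON array.

```scala
trait Encoder[T] {
  def encode(t: T): String
}

object Encoder {
  implicit val intEncoder: Encoder[Int] = new Encoder[Int] {
    override def encode(t: Int): String = t.toString
  }

  // Given an Encoder for T, build an Encoder for List[T].
  // List itself knows nothing about JSON or about Encoder.
  implicit def listEncoder[T](implicit enc: Encoder[T]): Encoder[List[T]] =
    new Encoder[List[T]] {
      override def encode(ts: List[T]): String =
        ts.map(enc.encode).mkString("[", ", ", "]")
    }
}

object ListEncodingExample {
  // Small helper that returns the JSON instead of writing a file.
  def toJson[T: Encoder](t: T): String =
    implicitly[Encoder[T]].encode(t)
}
```

Because listEncoder only asks for an Encoder[T], it composes with itself: a List[List[Int]] can be encoded without writing any additional instance.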

For all we know, the Scala language team could have anticipated the need for JSON encoding and chosen not to provide it. I consider this an example of the Single Responsibility Principle.

I think about SRP like this: someone implementing the List should focus on providing an intuitive and performant API for dealing with an ordered, in-memory collection of things. Someone implementing a User case class should focus on a useful representation of a User.

In other words, each type should focus on doing one thing very well.

However, “one thing” can mean different things. Should a type with a map method be distinct from a type with a flatMap method? This distinction can be justified mathematically via Category Theory and is implemented (using type classes) in the popular Cats library.

Though perhaps less mathematically sound, I often find it useful to delineate responsibilities by considering the physical mediums involved in a given API.

A List[T], for any T, exists entirely in memory, so its API should deal with things in memory. Encoding a List can occur entirely in memory, but that’s rarely useful. Useful encoding generally involves some network or persistent storage medium, so it makes sense that this API is decoupled from the in-memory API. We can’t do anything with a List without going through memory, so it also makes sense that the polymorphic methods implemented directly on a List focus on in-memory operations, and any remaining polymorphism is ad-hoc.

Conclusion

I’ve done my best to guide us up to the need for type classes, including some dead-ends along the way, rather than just introducing and justifying them. If you’ve read this far, I hope this approach was useful, and thank you for entertaining my musings in the last section. I am hoping to follow up with a post containing type class examples from the Scala standard library and several popular open-source libraries.