Introduction

Maintaining quality documentation is one of the most important and also most difficult parts of software engineering.

I’ve found engineers, myself included, rarely budget enough time for writing and maintaining documentation. I’ve been on teams delivering high-value projects that started with great documentation but devolved into a mix of stale and inconsistent comments and markdown files, all propped up by tribal knowledge. I’ve accepted that great documentation requires dedicated effort independent of the actual implementation.

With this in mind, I’d like to present two documentation workflows that strike a good balance between effort and value.

Both workflows fit the category of recipes or how-to guides and leverage a compiler and continuous integration. They by no means replace the carefully curated docs you’ll find in major open-source projects and cloud service SDKs, but they’re certainly an improvement over improvised comments and markdown files, and they require only a bit of initial setup.

A Project of Executable Recipes

In this workflow, we maintain a project of common recipes alongside the actual library or service. Crucially, these recipes should be easy to clone and execute.

I’ve seen this in several open-source projects. Keras has a directory of examples and benchmarks. ScalaPB has example sbt projects, which is especially useful for a codegen tool where build config is the main interface. The Play Framework has an entire repo of example projects with CI infrastructure for testing them.

When building internal libraries, I find it’s useful to maintain these recipes as a project of test suites. They don’t have to apply to functionality provided by your specific project. For instance, if engineers frequently ask about a particular use-case of some open-source library, we can write a recipe to demonstrate it.

Here’s an example for using Circe with snake_case naming.

import org.scalatest.freespec.AnyFreeSpec
import org.scalatest.matchers.should.Matchers

class CirceRecipesSpec extends AnyFreeSpec with Matchers {

  "JSON with snake_case Member Names" - {
    // Important to import from io.circe.generic.extras.
    // The Configuration class lets you customize the codecs.
    import io.circe.generic.extras.Configuration
    import io.circe.generic.extras.semiauto._
    import io.circe.syntax._
    import io.circe.parser._

    // Standard case class with implicit codecs in companion.
    case class User(firstName: String, lastName: String)
    object User {
      // Config tells encoder and decoder to use snake_case.
      implicit val config = Configuration.default.withSnakeCaseMemberNames
      implicit val encoder = deriveConfiguredEncoder[User]
      implicit val decoder = deriveConfiguredDecoder[User]
    }

    "encode with snake_case" in {
      val u = User("Jane", "Doe")
      val s = """{"first_name":"Jane","last_name":"Doe"}"""
      u.asJson.noSpaces shouldBe s
    }

    "decode with snake_case" in {
      val s = """{"first_name":"Jane","last_name":"Doe"}"""
      decode[User](s) shouldBe Right(User("Jane", "Doe"))
    }
  }
}

A critical property of this workflow is the use of real code.

Unlike a code snippet in a markdown file, this recipe is something an engineer can easily clone, compile, run, debug, and improve. The comments and names should guide the reader through subtleties like imports and implicits. Moreover, these tests are generally cheap to run, so the CI pipeline can continuously exercise them to ensure they stay up to date.
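To make the recipes cheap for CI to exercise, one option is to give them their own subproject in the build. The following is a minimal sketch, not a prescribed layout: the `recipes` project name, its directory, and the ScalaTest version are assumptions for illustration, while the Circe coordinates mirror the ones used later in this post.

```scala
// build.sbt (sketch): a hypothetical `recipes` subproject that holds only test suites.
lazy val recipes = project.in(file("recipes"))
  .settings(
    // Recipes are documentation, so nothing from this project is published.
    publish / skip := true,
    libraryDependencies ++= Seq(
      "io.circe" %% "circe-generic-extras" % "0.13.0" % Test,
      "io.circe" %% "circe-parser" % "0.13.0" % Test,
      // Assumed version; any ScalaTest release that provides AnyFreeSpec should do.
      "org.scalatest" %% "scalatest" % "3.2.9" % Test
    )
  )
```

With something like this in place, the CI job only needs to run sbt recipes/test to confirm that every recipe still compiles and passes.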

I’ve found this especially useful when the recipe demonstrates some functionality at the intersection of two libraries (e.g., using Kantan and Circe to parse a CSV containing JSON columns).

With communication and discipline, a team can develop a virtuous cycle of continually using and improving these recipes.

Compiled Code Snippets

Probably the lowest-effort documentation strategy is to simply add some code snippets to a markdown file (README.md).

If carefully maintained, this can be entirely sufficient for a simple project. For example, David Moten has a host of Java libraries for geospatial computing. Most of them are delightfully minimalistic, solving a single interesting problem with documentation entirely in the README.

Of course, the downside to code snippets is their tendency to fall out of date. Inevitably, someone makes a typo, fails to update them after a breaking change, or references some unknown piece of code.

The code snippet story can have a happy ending. The trick is to use a tool to compile the actual markdown files.

I’ve successfully used mdoc in Scala projects for exactly this purpose. It’s a simple yet powerful workflow: add a subproject to the sbt build, configure it to use the mdoc sbt plugin, set the location of the markdown files, and run sbt mdoc in the CI pipeline. If a snippet no longer compiles, the build fails.

For example, to reproduce the Circe recipe from above, first create a docs project in build.sbt:

lazy val docs = project.in(file("docs"))
  .enablePlugins(MdocPlugin)
  .settings(
    mdocIn := file("README.md"),
    mdocOut := file("/dev/null"),
    libraryDependencies ++= Seq(
      "io.circe" %% "circe-generic-extras" % "0.13.0",
      "io.circe" %% "circe-parser" % "0.13.0"
    )
  )

The subproject has two important configurations. First, mdocIn and mdocOut are set to simply take the README, compile it, and discard the result. Mdoc has some features that would eventually warrant publishing the results, but for now we just check the code. Second, the project includes the Circe dependencies. Mdoc makes these available for use in the markdown.
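The build definition above presumes the mdoc plugin is already on the build classpath. A minimal sketch of that step follows, assuming the conventional project/plugins.sbt location. The version number is only an example and should be replaced with whatever release is current for your build.

```scala
// project/plugins.sbt
// Makes MdocPlugin and the mdocIn / mdocOut keys available to build.sbt.
// The version below is an assumed example, not one taken from this post.
addSbtPlugin("org.scalameta" % "sbt-mdoc" % "2.3.7")
```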

With this configured, we can add code snippets to the README:

## JSON with snake_case names
Note: import from `io.circe.generic.extras.semiauto`, 
not from `io.circe.generic.semiauto`.
```scala mdoc
import io.circe.generic.extras.Configuration
import io.circe.generic.extras.semiauto._
import io.circe.syntax._
import io.circe.parser._
```

Then define a standard case class with implicit codecs 
in a companion. The implicit configuration tells the 
codecs to use snake_case.
```scala mdoc
case class User(firstName: String, lastName: String)
object User {
  implicit val config = Configuration.default.withSnakeCaseMemberNames
  implicit val encoder = deriveConfiguredEncoder[User]
  implicit val decoder = deriveConfiguredDecoder[User]
}
```

Create, encode, and decode the case class.
```scala mdoc
val u = User("Jane", "Doe")
val s = u.asJson.noSpaces // {"first_name":"Jane","last_name":"Doe"}
val r = decode[User](s)   // Right(User("Jane", "Doe"))
```

To compile the snippets, simply run sbt docs/mdoc:

$ sbt docs/mdoc
[info] running mdoc.Main 
info: Compiling 1 file to /dev/null
info: Compiled in 3.82s (0 errors)

If we introduce an error, for example forgetting to import io.circe.parser._, we get immediate feedback:

$ sbt docs/mdoc
info: Compiling 1 file to /dev/null
error: README.md:37:9: not found: value decode
val r = decode[User](s) // Right(User("Jane", "Doe"))
        ^^^^^^
info: Compiled in 3.08s (1 error)

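Compared to using a test suite for recipes, this has the advantage that markdown files are more easily discoverable. The disadvantage is that nothing asserts on the results: mdoc compiles and evaluates the snippets, but the expected values written in the trailing comments are never checked.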
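One way to narrow that gap is to put plain assertions in a snippet. The sketch below is the body of a hypothetical `scala mdoc` fence. It assumes mdoc's default behavior of evaluating each fence and reporting an uncaught exception as an error, which is worth confirming against the mdoc version you use. The imports and the User definition are repeated only so the sketch reads on its own, and the explicit type annotations and the names `jane` and `json` are my additions. In the README above, only the last four lines would be new.

```scala
import io.circe.{Decoder, Encoder}
import io.circe.generic.extras.Configuration
import io.circe.generic.extras.semiauto._
import io.circe.parser._
import io.circe.syntax._

case class User(firstName: String, lastName: String)
object User {
  // Config tells the derived codecs to use snake_case member names.
  implicit val config: Configuration = Configuration.default.withSnakeCaseMemberNames
  implicit val encoder: Encoder[User] = deriveConfiguredEncoder[User]
  implicit val decoder: Decoder[User] = deriveConfiguredDecoder[User]
}

val jane = User("Jane", "Doe")
val json = jane.asJson.noSpaces
// If either check fails, evaluation throws instead of silently drifting from the prose.
assert(json == """{"first_name":"Jane","last_name":"Doe"}""")
assert(decode[User](json) == Right(User("Jane", "Doe")))
```

This keeps the discoverability of the README while recovering some of the checking that the test-suite workflow provides. For anything beyond a couple of assertions, the recipe project is probably the better home.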

Conclusion

Writing great docs is hard, but we don’t have to settle for scattered, stale documentation. If we invest in some simple up-front setup, we can leverage a compiler and CI to keep our docs up to date and useful with minimal ongoing effort.
