Telemetry with Scala, part 3: otel4s

Ivan Kurchenko
6 min readMay 14, 2023

--

Introduction

Welcome to the third part of telemetry with Scala series! In previous posts, we explored OpenTelemetry and Kamon, two popular telemetry solutions for Scala and Akka based applications. This time, the main focus shifter towards cats-effect ecosystem, and we will take a look at otel4s. We will walk through the key features of otel4s, how to configure it and how to integrate it into a Scala and pure functional application. So, let's get started!

System under monitoring recap

Before proceeding, first, let’s have a quick recap of the system under monitoring. More details you can find in Part 1. This is a simple task ticketing (like the Jira or Asana) service.

It uses the following external services:

  • PostgresSQL database to store tickets;
  • Kafka to publish events about tickets creation and update (for example, to send notifications to users);
  • ElasticSearch for tickets full-text search;
  • project-service to get information about a project to which the ticket belongs (mocked service);

This time it’s going to be cats-effect based application. In particular, the following libraries are used:

  • http4s — for both client (requesting projects_service) and server side (REST API);
  • doobie — to access PostgresSQL database;
  • fs2-kafka — to publish events to Kafka;
  • elastic4s — to access ElasticSearch;

For this service, we would like to monitor a couple of simple things:

  • Number of tickets in the system;
  • Traces for each request;

To simulate user traffic (create, read, update, and delete tickets), Gatling load testing tool was used. Testing scenario is pretty simple: 10 users simultaneously create 10 tickets, then search for them, update them and delete them.

Scala observability ecosystem short overview

Before diving into otel4s it is worth to mention that there are a couple of other libraries that can be used for telemetry:

  • trace4cats — tracing library with rich functionality, support for multiple exporters and integrations;
  • natchez — minimalistic tracing library for cats-effect ecosystem;

The main focus of mentioned frameworks is tracing and distributed tracing in particular. Described use case is a bit different, since it requires a metrics collection as well. Because of this otel4s was chosen for to be used in this example.

otel4s

As the main page says, otel4s is:

otel4s is an OpenTelemetry implementation for Scala. The design goal is to fully and faithfully implement the OpenTelemetry Specification atop Cats Effect.

But, please, note that this project is under active development:

The API is still highly experimental, but we are actively instrumenting various libraries and applications to check for fit.

How to plug

First of all, let’s add otel4s to the project dependencies:

val opentelemetryVersion = "1.24.0"
val opentelemetryAlphaVersion = s"$opentelemetryVersion-alpha"
val otel4sVersion = "0.2.1"

libraryDependencies ++= Seq(
"org.typelevel" %% "otel4s-java" % otel4sVersion,
"io.opentelemetry" % "opentelemetry-sdk-extension-autoconfigure" % opentelemetryAlphaVersion,
"io.opentelemetry" % "opentelemetry-exporter-prometheus" % opentelemetryAlphaVersion,
"io.opentelemetry" % "opentelemetry-exporter-zipkin" % opentelemetryVersion,

"io.opentelemetry.javaagent" % "opentelemetry-javaagent" % version % "runtime"
)

For the sake of example, two exporters were added: Prometheus and Zipkin. Adjust this list according to your needs. For instance, you can use opentelemetry-exporter-otlp to send telemetry data to OTEL collector.

Add Java agent plugin to the plugins.sbt:

addSbtPlugin("com.lightbend.sbt" % "sbt-javaagent" % "0.1.6")

And enable it in build.sbt:

javaAgents += "io.opentelemetry.javaagent" % "opentelemetry-javaagent" % "1.24.0",
javaOptions += "-Dotel.java.global-autoconfigure.enabled=true",

In order to start working with it, we need to create an instance of Otel4s:

import cats.implicits._
import cats.effect.{Async, Concurrent, LiftIO, Resource, Sync}
import io.opentelemetry.api.GlobalOpenTelemetry
import org.typelevel.otel4s.Otel4s
import org.typelevel.otel4s.java.OtelJava

def otelResource[F[_]: Sync: Async: LiftIO]: Resource[F, Otel4s[F]] = {
Resource
.eval(Sync[F].delay(GlobalOpenTelemetry.get))
.evalMap(OtelJava.forAsync[F])
}

Later we need to acquire two interfaces for tracing and metrics:

for {
traceProvider: Tracer[F] <- otel.tracerProvider.get("tickets-service")
metricsProvider: Meter[F] <- otel.meterProvider.get("tickets-service")
} yield (traceProvider, metricsProvider)

Note: Tracer needs to be added in implicit scope across the application, so make sure you have F[_]: Tracer.

Metrics

Firstm let’s start with metrics. In order to create a counter, we need to do the following:

ticketsCounter <- metricsProvider
.upDownCounter("tickets_count")
.withUnit("1")
.withDescription("Tickets count")
.create

API looks pretty similar to the native Java API, so we won’t dive into details here. More about the types of metrics you can find here. And then we can use it in scope of IO, like:

ticketsCounter.add(1, Attribute("project.id", ticket.id))

Metrics example: Prometheus

To visualize metrics, we can use Prometheus. You can find instructions on how to configure it for OpenTelemetry in Part 1. After running load tests, we can see the following metrics:

Since OTEL java agent is used, we can also find some JVM metrics, like memory usage:

Tracing

Now, let’s move to the most interesting part — tracing. But there are couple nuances of using otel4s and Opentelemetry Java native instrumentation's. The first one is that it stores SpanContext in IOLocal via Vault. Another one is that Java implementation of traces relies on ThreadLocal storage for context propagation by default. Because of having two runtimes (Java threads and Cats Effect fibers), it's pretty difficult to propagate context between them properly. Hence, we need to implement own tracing middleware layer to be consistent with IO runtime and have proper context propagation among fibers. Luckily, otel4s provides concise API for this.

There is ongoing work to implement set of instrumentation’s for popular libraries. For instance, Skunk is already to supports otel4s out of the box

For http4s server side it could look following (imports eliminated):

trait ServerMiddleware {
implicit class ServerMiddlewareOps[F[_] : Sync : Async : Tracer](service: HttpApp[F]) {
def traced: HttpApp[F] = {
Kleisli { (req: Request[F]) =>
Tracer[F]
.spanBuilder("handle-incoming-request")
.addAttribute(Attribute("http.method", req.method.name))
.addAttribute(Attribute("http.url", req.uri.renderString))
.withSpanKind(SpanKind.Server)
.build
.use { span =>
for {
response <- service(req)
_ <- span.addAttribute(Attribute("http.status-code", response.status.code.toLong))
_ <- {
if (response.status.isSuccess) span.setStatus(Status.Ok) else span.setStatus(Status.Error)
}
} yield {
val traceIdHeader = Header.Raw(CIString("traceId"), span.context.traceIdHex)
response.putHeaders(traceIdHeader)
}
}
}
}
}
}

And then used like:

def app: HttpApp[F] = {
HttpRoutes.of[F] {
case req@POST -> Root / "tickets" =>
for {
create <- req.as[CreateTicket]
ticket <- service.createTicket(create)
resp <- Ok(ticket.asJson)
} yield resp
}
.orNotFound
.traced
}

For the http4s client side:

trait ClientMiddleware {
implicit class ClientMiddlewareOps[F[_]: Tracer: Async, A](client: Client[F]) {
def runTraced(request: Request[F]): F[Response[F]] = {
Tracer[F].spanBuilder("client-request")
.addAttribute(Attribute("http.method", request.method.name))
.addAttribute(Attribute("http.url", request.uri.renderString))
.withSpanKind(SpanKind.Client)
.wrapResource(client.run(request))
.build
.use {
case span @ Span.Res(response) =>
for {
_ <- span.addAttribute(Attribute("http.status-code", response.status.code.toLong))
_ <- if (response.status.isSuccess) span.setStatus(Status.Ok) else span.setStatus(Status.Error)
} yield response
}
}
}
}

Usage example:

val request = Request[F](Method.GET, Uri.unsafeFromString(s"${configuration.url}/projects/$projectId"))
client.runTraced(request)

And for the elastic4s client:

trait ElasticMiddleware {
implicit class ElasticMiddlewareOps[F[_]: Tracer: Async, A](client: ElasticClient) {
def executeTraced[T, U](t: T)(implicit
handler: Handler[T, U],
javaTypeable: JavaTypeable[U],
options: CommonRequestOptions): F[Response[U]] = {
Tracer[F].span("elasticsearch").surround(
Async[F].fromFuture(Async[F].pure(client.execute[T, U, Future](t)))
)
}
}
}

Usage example:

def indexTicket(ticket: Ticket): F[Unit] = {
client.executeTraced {
indexInto("tickets").
id(ticket.id.toString).
source(ticket.toSearch).
refresh(RefreshPolicy.Immediate)
}.as(())
}

Minimalistic middleware for Kafka producer using fs2-kafka could be like following:

trait KafkaProducerMiddleware {
implicit class KafkaProducerMiddlewareOps[F[_] : Sync : Async : Tracer, K, V](producer: KafkaProducer[F, K, V]) {
def produceOneTraced(topic: String, key: K, value: V): F[ProducerResult[K, V]] = {
Tracer[F].span("kafka.produce", Attribute("kafka.topic", topic))
.surround(producer.produceOne(topic, key, value).flatten)
}
}
}

And using it like:

private def send(event: TicketEvent): F[ProducerResult[String, TicketEvent]] = {
KafkaProducer.resource(producerSettings)
.use(_.produceOneTraced(kafka.topic, event.ticket.id.toString, event))
}

Tracing example: Zipkin

After having all the tracing middleware in place, we can start tracing our application. Let’s setup Zipkin as a tracing backend and run the application. You can find instructions on how to configure it for OpenTelemetry in Part 1.

Below you can see a screenshot of Zipkin UI with traces from our application for ticket creation flow:

All the code you can find in this repository

Conclusion

In this article, we’ve seen how to use otel4s to instrument Scala applications with OpenTelemetry. We've also seen how to use otel4s with http4s, elastic4s, fs2-kafka and doobie. Despite the fact that otel4s is still in early development, it's already provides a lot of useful features and plug OpenTelemetry into your pure application.

--

--

No responses yet