3 minute read

Categories:
Tags:

In a system of async service calls, sometimes the most indispensable debugging tool is knowing what and when network traffic is occurring. Unfortunately for developers Finagle’s default protocol is binary, which while undecipherable, it can be transformed into something a lot more useful.

Thrift has two other built in protocols, the fully functional TJSONProtocol, and the write-only TSimpleJSONProtocol. Both are quite suitable for human consumption; and each has an advantage for different tasks.

The TJSONProtocol is the JSON alternative to using the binary TBinaryProtocol, both properly encode all necessary information for Thrift to operate. TBinaryProtocol is naturally better performing being more concise and easier for computers to parse, but there are times when human-readability is a valid reason to choose something different. One example is log input/output.

The TSimpleJSONProtocol is a lossy implementation of the Thrift protocol, it discards some of the necessary information required to reconstitute the transmitted data. However, in scenarios where the intended use is one-way output to the console it is much more developer friendly.

Since both TJSONProtocol and TSimpleJSONProtocol have their uses, it makes sense that flexible logging would support both. The intention is to design a Filter allowing us to view and/or log all transmitted data.

The hurdles in designing a Filter to convert one protocol into another is that all structure parsing code is part of the classes generated by Scrooge. And, unfortunately, the current implementation does not allow us to intercept any values inside the generated methods, and writing out own parsing classes is overkill. Thus, our approach is limited to a less optimal approach: parsing the protocol into class instances, and then re-encoding using another protocol. The overhead involved in negligible, but it does introduce a second deserialize-then-serialize cycle to all transmissions – luckily this has minimal performance impact.

Let’s start off by creating an object to contain our reserialize method.

import com.twitter.finagle.{ Service, SimpleFilter }
import com.twitter.util.Future
import org.apache.thrift.transport._
import org.apache.thrift.protocol._
 
object ProtocolHelpers {
  def reserialize(
    finagleServiceObject: AnyRef,
    input: Array[Byte],
    outputProtocolFactory: TProtocolFactory = new TSimpleJSONProtocol.Factory,
    inputProtocolFactory: TProtocolFactory = new TBinaryProtocol.Factory)
  : Array[Byte]
}

Since the generated Scrooge generated classes don’t inherit from a base class or trait, and Scala isn’t great at loading objects using reflection, we’ll just pass them as an AnyRef.

Most of the internals to the reserialize method are reflection hacks on the Scrooge generated code, the basic strategy is to parse out the method name from within the input and match it against a correspondingly named Object within the finagleServiceObject object.

def reserialize(
  finagleServiceObject: AnyRef,
  input: Array[Byte],
  outputProtocolFactory: TProtocolFactory = new TSimpleJSONProtocol.Factory,
  inputProtocolFactory: TProtocolFactory = new TBinaryProtocol.Factory
) : Array[Byte] = {
 
  import com.twitter.scrooge.ThriftStructCodec3
 
  val inputTransport = new TMemoryInputTransport(input)
  val inputProtocol = inputProtocolFactory.getProtocol(inputTransport)
  val msg = inputProtocol.readMessageBegin
  import org.apache.thrift.protocol.TMessageType
  val classPostfix = msg.`type` match {
    case TMessageType.CALL => "$args$"
    case TMessageType.REPLY => "$result$"
  } 
  val className = finagleServiceObject.getClass.getName + msg.name + classPostfix
 
  val outputTransport = new TMemoryBuffer(255)
  val outputProtocol = outputProtocolFactory.getProtocol(outputTransport)
  outputProtocol.writeMessageBegin(msg)
 
  val clazz = java.lang.Class.forName(className)
  val codec = clazz.getField("MODULE$").get(clazz).asInstanceOf[ThriftStructCodec3[_]]
  val args = codec.decode(inputProtocol)
  //can't call encode because of type erasure, call using reflection
  val encodeMethod = clazz.getMethods.filter(_.getName == "encode").head
  encodeMethod.invoke(codec, args.asInstanceOf[Object], outputProtocol)
  outputProtocol.writeMessageEnd
  outputTransport.getArray.slice(0, outputTransport.length)
}

The general naming pattern, given a method name of foo would be to look for an object named foo$args to parse a request, and foo$result to parse a response. Both of these objects extend a com.twitter.scrooge.ThriftStructCodec3, however java’s _ type erasure means we have to use reflection a second time to call ThriftStructCodec3.encode.

While the above reserialize method works converting between any two protocols, we’ll need to apply a restriction that we are dealing with JSON to convert Array[Byte] to a String – something that cannot be done reliably with arbitrary binary data.

class BinaryProtocolToJsonLoggingFilter(
  finagleServiceObject: AnyRef,
  write: String => Unit,
  toProtocol: TProtocolFactory = new TSimpleJSONProtocol.Factory
) extends SimpleFilter[Array[Byte], Array[Byte]] {
 
  import ProtocolHelpers.reserialize
 
  def apply(request: Array[Byte],
    service: Service[Array[Byte], Array[Byte]]): Future[Array[Byte]] = {
 
    val requestJson = reserialize(finagleServiceObject, request, toProtocol)
    write(new String(requestJson, "UTF-8"))
 
    service(request).onSuccess(response => {
      val responseJson = reserialize(finagleServiceObject, response, toProtocol)
      write(new String(responseJson, "UTF-8"))
    })
  }
}

General usage would be to specify write as any slf4j or java.util.logging, but most developers can dive right in by passing Console.println.

val consoleLogFilter = new BinaryProtocolToJsonLoggingFilter(TestApi, println)
val server = ServerBuilder()
  .codec(ThriftServerFramedCodec())
  .name("FooBar")
  .bindTo(socket)
  .build(consoleLogFilter andThen 
    new TestApi.FinagledService(new TestService, new  TBinaryProtocol.Factory))

When the client calls TestApi.getUserName(1), the console outputs:

["getUserName",1,735035982,{"userId":1}]
["getUserName",2,735035982,{"success":{"userId":1,"userName":"User1"}}]

Full Sources: Developer-Friendly-Thrift-Request-Logging.scala

Categories:
Tags:
Updated: