Twitter Finagle


Finagle is Twitter's core library for building distributed systems. The company has built almost all of its distributed systems infrastructure on top of this library. It also represents a major abstraction that one of its creators has called services; see this paper for more.
Finagle is built atop the Netty project, so we expected it to handle highly concurrent workloads, which mattered in many of our use cases. We had also used Netty directly to build some of our servers and found it stable, with a good community, and we saw a similar pattern with Finagle. For more on Finagle and Netty at Twitter, check out the recent Twitter blog posts.
Building HTTP clients with Finagle required less overall code than with the Apache library, but it is still somewhat involved. The following sets up and executes the same GET request as above:
//Set up the client. It's bound to one host.
val host: String = "" //a "host:port" string
val client = ClientBuilder()
  .codec(Http())
  .hosts(host) //there are more params you can set here
  .hostConnectionLimit(1)
  .build()

//Execute the request.
//Make sure the request is going to the same host
//as the client is bound to
val headers: Map[String, String] = ???
//this is an org.jboss.netty.handler.codec.http.HttpMethod
val method: HttpMethod = HttpMethod.GET
//this is an org.jboss.netty.buffer.ChannelBuffer
val channelBuf: ChannelBuffer = ???
val req = RequestBuilder()
  .url(new URL(host))
  //oops, sending a request body with a GET request doesn't make sense
  .build(method, Some(channelBuf))
val respFuture: Future[HttpResponse] = client.apply(req)

respFuture.ensure {
  client.close() //don't forget!
}

In Summary

In our search, we looked at other libraries as well, but found common patterns across all of them:
  1. HTTP libraries on the JVM tend to be very stable and well tested, or built atop very stable and well tested core libraries.
  2. You usually have to write setup and cleanup code.
  3. It usually takes at least 5 lines of code to execute a request.
  4. The plain Java libraries (obviously) require you to write non-idiomatic Scala.
Overall, the libraries we found required us to remember a lot of code, common patterns, and sometimes implementation details. With so much to remember, we decided to either commit to a single library or write a wrapper around each one we wanted to use.

In Comes Newman

Newman started as an informal wrapper around Apache HttpClient. As our overall codebase grew and evolved, we needed to use new clients and knew we needed to formalize our original wrapper into a stable interface to wrap all the messy details of each implementation.
We began with the core interface and two implementations: ApacheHttpClient and FinagleHttpClient. After we deployed code using our first Newman clients, we found more benefits to the core abstraction:
  1. Safety – We iterated on the interface and used Scala’s powerful type system to enforce various rules of HTTP and REST. We’re now at a point where our users can’t compile code that attempts to execute various types of invalid HTTP requests.
  2. Performance – Behind the interface, we added various levels of caching and experimented with connection pooling mechanisms, timeouts, and more to extract the best performance from Newman based on our workloads. We didn’t have to change any code on the other side of the abstraction.
  3. Concurrency – Regardless of the underlying implementation, executing a request returns standard Scala Futures that contain the response. This pattern helps ensure that code doesn’t block on downstream services. It also ensures we can interoperate with other Scala frameworks like Akka or Spray. The Scala community has a lot of great literature on Futures, so I’ll defer to those resources instead of repeating things. The Reactive Manifesto begins to explain some reasoning behind Futures (and more!) and the standard Scala documentation on Futures shows some usage patterns.
  4. Extensibility – Our environments and workloads change, so our clients must change with them. To make that change, we just switch clients with one line of code. We also made the core client interface in Newman very easy to extend, so we can implement a new client quickly and spend more time getting the performance right.
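The non-blocking behavior in item 3 comes from standard Scala Futures. As a minimal, self-contained sketch of that composition pattern (the fetch functions below are hypothetical stand-ins, not Newman or Finagle calls):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

//Two simulated downstream service calls; neither blocks the calling thread.
def fetchUser(id: Int): Future[String] = Future { s"user-$id" }
def fetchScore(user: String): Future[Int] = Future { user.length }

//Compose without blocking: the score lookup starts as soon as
//the user lookup completes.
val result: Future[Int] = fetchUser(42).flatMap(fetchScore)

//Block only at the edge of the program (e.g. in a test or a main method).
println(Await.result(result, 5.seconds)) //prints 7
```

Because every Newman client returns these standard Futures, the same `flatMap`-style composition works regardless of which underlying implementation executes the request.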

Higher Level Features

A few notes about this architecture:
  • HttpClient is heavy – it handles various caching tasks, complex concurrency tasks (running event loops and maintaining thread pools, for example), and talking to the network.
  • HttpClient creates HttpRequests – each HttpRequest is very small and light. It contains a pointer back to the client that created it, so it’s common to have many requests for one client.
  • HttpRequest creates Future[HttpResponse] – the Future[HttpResponse] is tied to the HttpClient that is executing the request. That Future will be completed when the response comes back into the client.
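The three relationships above can be sketched as trait signatures. This is a hypothetical illustration of the shapes described, not Newman's actual API (the names and the stub implementation are ours):

```scala
import scala.concurrent.Future

//Hypothetical sketch of the architecture described above; Newman's
//real interfaces differ in detail (names here are illustrative only).
trait HttpResponse { def code: Int }

trait HttpRequest {
  def client: HttpClient            //pointer back to the heavy client that created it
  def apply(): Future[HttpResponse] //completed when the response comes back into the client
}

trait HttpClient {
  //one heavy client creates many small, light requests
  def get(url: String, headers: Map[String, String]): HttpRequest
}

//A trivial stub, just to show the relationships between the pieces.
class StubClient extends HttpClient {
  def get(url: String, headers: Map[String, String]): HttpRequest = {
    val self = this
    new HttpRequest {
      val client: HttpClient = self
      def apply(): Future[HttpResponse] =
        Future.successful(new HttpResponse { val code = 200 })
    }
  }
}
```

The key design point is the direction of the arrows: clients are expensive and long-lived, while requests are cheap values that borrow the client's machinery when executed.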
With this architecture, we had proven to ourselves in production that we had a consistent, safe and performant HTTP client library. Our ongoing task now is to build features that make building and running systems easier for everyone who uses Newman. Here are a few higher level features that Newman has now:
  • Caching – Newman has an extensible caching mechanism that plugs into its clients. You define your caching strategy (when to cache) and backend (how and where to store cached data) by implementing interfaces, then plug them in to a caching HttpClient as necessary. This extensible system also makes it possible to build cache hierarchies. So far we've built two caching strategies (ETag-based and simple read-through) and an in-memory caching backend; all ship with Newman.
  • JSON – As I mentioned at the beginning of this post, we use JSON extensively as our data serialization format over the wire, so we built it into Newman as a first class feature. Newman enables full serialization and deserialization to/from any type. Since JSON operations are built into the request and response interfaces, all client implementations get JSON functionality “for free.”
  • DSL – We built a domain specific language into Newman that makes even complex requests possible to create and execute in one line of code. The same goes for reading, deserializing, and decoding responses. The DSL is standard Scala, provides more type safety on top of core Newman, and has become the canonical way to write Newman code.
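To make the strategy/backend split concrete, here is a hypothetical sketch of the two pluggable pieces. These interfaces and names are illustrative only; Newman's real caching traits differ:

```scala
import scala.collection.mutable

//Hypothetical sketch of the pluggable caching pieces described above;
//Newman's real interfaces differ (these names are illustrative only).
trait CacheBackend {
  def get(key: String): Option[String]  //how and where cached data is stored
  def put(key: String, value: String): Unit
}

//A minimal in-memory backend.
class InMemoryBackend extends CacheBackend {
  private val store = mutable.Map.empty[String, String]
  def get(key: String): Option[String] = store.get(key)
  def put(key: String, value: String): Unit = store(key) = value
}

//A simple read-through strategy: serve from the backend when possible,
//otherwise fetch, store, and return.
def readThrough(backend: CacheBackend)(fetch: String => String)(url: String): String =
  backend.get(url).getOrElse {
    val body = fetch(url)
    backend.put(url, body)
    body
  }

//Usage: the second request for the same URL never triggers a fetch.
val backend = new InMemoryBackend
var fetches = 0
def fakeFetch(url: String): String = { fetches += 1; s"body of $url" }

readThrough(backend)(fakeFetch)("http://example.com/a")
readThrough(backend)(fakeFetch)("http://example.com/a") //cache hit
println(fetches) //prints 1
```

Because strategy and backend are separate interfaces, either can be swapped independently, which is what makes cache hierarchies possible.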

The Result

Newman abstracts away the basics of RPC. For example, we were able to replace 10+ lines of code with the following (excluding imports and comments in both cases):
implicit val client = new ApacheHttpClient() //or swap out for another
GET(url(http, "")).addHeaders("hello" -> "readers").apply
This code has more safety features than what it replaced in most cases, and the setup and teardown complexities are written once and encapsulated inside the client. We have been pleased with Newman so far and anticipate that the next steps will make it more powerful and useful for everyone.
