Apache Thrift is a software framework for cross-language service development. It provides what is essentially a remote procedure call (RPC) interface, which lets a client application access services from a server that can be written in the same language or a different one. Thrift supports all of the major languages that you’d expect to use, including Python, C++, Java, JavaScript / Node.js, and Ruby.
Unfortunately, whilst there are quite a few tutorials on Thrift, some of them concentrate on explaining how Thrift works behind the scenes (which is important, of course) rather than on how to use it. There also aren’t many that concentrate on using C++. So, having spent some time working through some of the tutorials (and the book Learning Apache Thrift), I thought I’d have a go at writing something of a practical guide. I’m quite deliberately not going to focus on how Thrift works: if you want that, let me suggest the Apache Thrift – Quick Tutorial. I’m also going to assume that you have Thrift installed on your computer already (again, there are lots of sets of instructions on how to do this, and Google will be your friend), and that you’re using a Linux or macOS computer. You should be able to follow along with most of this if you’re using Windows, but there will be a few differences, especially when it comes to compiling C++ files.
The Service Description File
The first thing that you need to do when using Thrift is to write a description of the service that you want to create, using the Thrift Interface Definition Language (IDL).
The format of the file is pretty simple. At its simplest it can contain just four lines and still do something useful.
As you can see, this defines a service called logger, which has one method named timestamp. That method takes a single argument (of a string type) called filename, and doesn’t return anything. The keyword oneway in the definition means that the client code generated by Thrift will send the request and carry on, without waiting for a response from the server.
Thrift has a small set of base types, which map onto the corresponding native types in each of the supported languages.
Type | Detail |
---|---|
bool | Boolean |
byte | An 8-bit signed integer |
i16 | A 16-bit signed integer |
i32 | A 32-bit signed integer |
i64 | A 64-bit signed integer |
double | A 64-bit floating-point value |
string | A string |
Note that there are no unsigned data types in Thrift…
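For the C++ side, it helps to know what those base types turn into. As far as I’m aware, the C++ generator uses the fixed-width integer types from <cstdint>, plus bool, double and std::string. Here’s a small hand-written sketch of that mapping. It is not generated code: the ThriftByte-style alias names and the pack_u32 / unpack_u32 helpers are my own inventions for illustration. It also shows one possible workaround for the lack of unsigned types, which is to carry a 32-bit unsigned value in an i64.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>
#include <type_traits>

// Hand-written aliases (hypothetical names) mirroring the C++ types that
// Thrift's base types are assumed to map to in generated code.
using ThriftBool   = bool;
using ThriftByte   = std::int8_t;
using ThriftI16    = std::int16_t;
using ThriftI32    = std::int32_t;
using ThriftI64    = std::int64_t;
using ThriftDouble = double;
using ThriftString = std::string;

// Every Thrift integer type is signed.
static_assert(std::is_signed<ThriftByte>::value, "byte is signed");
static_assert(std::is_signed<ThriftI16>::value, "i16 is signed");
static_assert(std::is_signed<ThriftI32>::value, "i32 is signed");
static_assert(std::is_signed<ThriftI64>::value, "i64 is signed");

// Hypothetical helper: widen a 32-bit unsigned value into an i64 field,
// since an i32 cannot hold anything above 2147483647.
inline ThriftI64 pack_u32(std::uint32_t value) {
    return static_cast<ThriftI64>(value);
}

// Hypothetical helper: recover the unsigned value on the receiving side.
inline std::uint32_t unpack_u32(ThriftI64 value) {
    assert(value >= 0 && value <= std::numeric_limits<std::uint32_t>::max());
    return static_cast<std::uint32_t>(value);
}
```

The cost of widening is a few extra bytes on the wire; the alternative of squeezing the value into an i32 and reinterpreting the sign bit works too, but both ends then have to agree on the trick.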
Thrift also supports the definition of structs, as well as three container types: lists, sets, and maps (dictionaries).
Let’s look at a more complete example .thrift file.
This file also introduces another concept: Thrift namespaces. These are optional, but when included they let you specify the namespace to be used for the Thrift-generated code. They are language-specific (in the example, py relates to Python and cpp relates to C++).
As in the previous example, you can have multiple services in one file, but that makes the generated code even more complex (as each service is implemented separately). So, for simplicity, I’d suggest only defining a single service per file.
There’s one last thing we can add to the file before we run the Thrift generator. In this example, given that we’re dealing with file I/O, we might have situations where that I/O causes errors. Typically we’d handle such errors by the use of exceptions. Thrift lets us define exceptions to be used in conjunction with Thrift code.
So here’s the full version of the LoggerService.thrift file. (Note that the name of the file will be used in the names of some of the resulting files; so whilst you can name the file however you like, I’d recommend naming it something sensibly related to the service that it defines).
This is pretty much the same as the previous example, but you can now see more clearly why we’d want to use oneway for some void methods but not for others. Since, by definition, the calling client won’t wait for a method declared as oneway, we can’t throw an exception back to the client from it. So in this example we’ll have to handle the error silently within the function that implements that method.
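To make that trade-off concrete, here’s a minimal, Thrift-free sketch of the two styles of method body. Everything in it is a stand-in: LoggerError plays the part of an exception declared in the .thrift file (the real type would be generated by Thrift), and append_line, write_line and timestamp_oneway are hypothetical names, not necessarily the ones used in LoggerService.thrift.

```cpp
#include <cassert>
#include <ctime>
#include <exception>
#include <fstream>
#include <iostream>
#include <string>
#include <utility>

// Stand-in for an exception declared in the .thrift file. In a real project
// Thrift generates this type; the name LoggerError is hypothetical.
struct LoggerError : std::exception {
    std::string message;
    explicit LoggerError(std::string msg) : message(std::move(msg)) {}
    const char* what() const noexcept override { return message.c_str(); }
};

// Append one line to the named file; returns false if anything went wrong.
bool append_line(const std::string& filename, const std::string& line) {
    std::ofstream out(filename, std::ios::app);
    if (!out) {
        return false;
    }
    out << line << '\n';
    out.flush();
    return out.good();
}

// Body of a normal (two-way) method: the client is waiting for a reply,
// so a declared exception can travel back to it.
void write_line(const std::string& filename, const std::string& message) {
    if (!append_line(filename, message)) {
        throw LoggerError("could not write to " + filename);
    }
}

// Body of a oneway method: the client has already moved on, so the failure
// is caught here and only reported on the server side.
void timestamp_oneway(const std::string& filename) {
    try {
        const long long now = static_cast<long long>(std::time(nullptr));
        write_line(filename, "timestamp " + std::to_string(now));
    } catch (const std::exception& e) {
        std::cerr << "timestamp failed: " << e.what() << '\n';
    }
}
```

Calling write_line with a path that can’t be opened throws LoggerError, whereas timestamp_oneway with the same path just writes a complaint to the server’s stderr and returns.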
Generating Code
Now that we have the IDL specification complete for our service, we can invoke Thrift and tell it to build some code in the languages that we specify. In this example, I’m going to start by defining the service in C++, and have that service called by a Python client.
To run Thrift, simply run: thrift --gen py --gen cpp LoggerService.thrift
Each language generator specified will result in a directory named gen-xxx (where xxx is the Thrift shorthand for the language, e.g. cpp, py, etc.) in the folder that Thrift was run from.
C++
For C++, Thrift actually helps us out quite a lot by building a skeleton version of the server, which will be named (in this case) Logger_server.skeleton.cpp.
It also generates a pair of .cpp and .h files named for the service (here Logger.cpp and Logger.h, which highlights a good reason not to include the word service in the name of your service), and two pairs of files named LoggerService_types and LoggerService_constants (taking their name from the name of the .thrift file used for the generation).
These files (as with pretty much all machine generated code) are pretty heavy-going to try to read; but you don’t really need to do anything other than include them in the compilation stage…
The one exception to that is the server. Recommended practice is to make a copy of the …skeleton.cpp file, and to build out from there.
The skeleton is a pretty short file — and you should be able to see very easily which bits you need to change to actually make your service do something useful. If we don’t make any changes to the server it should still run, but obviously won’t do anything especially useful (apart from echoing the name of the service method to the server process’s stdout).
For example, here’s the code that the server will use to write a message line to the log.
A slightly more complex example is the method to return the last line from the log file.
Note that when a method returns a string, the generated C++ function doesn’t use a normal C++ return value. Instead, it returns void and has an extra parameter named _return. This is a pass-by-reference parameter, added by Thrift during code generation, that we set to the “return” value before we leave the method.
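Stripped of the Thrift plumbing, the pattern looks something like this. It’s a sketch rather than the handler’s actual code: get_last_line and its filename parameter are assumed names, and I’ve chosen to treat an unreadable or empty file as an empty result.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Same shape as a generated method that "returns" a string: the function
// itself returns void, and the result is written into _return.
// The method name and the filename parameter are hypothetical.
void get_last_line(std::string& _return, const std::string& filename) {
    _return.clear();  // an empty result means "no line found"
    std::ifstream in(filename);
    std::string line;
    // If the file could not be opened, getline fails at once and the
    // result simply stays empty.
    while (std::getline(in, line)) {
        if (!line.empty()) {
            _return = line;  // overwrite each time, so the last non-empty line wins
        }
    }
}
```

A caller declares a std::string, passes it as the first argument, and reads the answer out of it afterwards. Clearing _return first matters, because the caller may hand in a string that still holds an earlier value.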
The service is defined as a C++ class, so we can write a properly object-oriented solution if we want to. For this simple example I’m not going to do that, so turning this into a version that follows good OO practice, where we persistently identify the log file name (for example), is left as an exercise for the reader… 🙂
Anyway, the complete code for the server is as follows.
Having added our (example) code to the server file — all we need to do now is to build it.
Given that you need to compile in quite a few support files, I thoroughly recommend using CMake to write the build scripts for you.
Here’s my CMakeLists.txt file for this project.
It’s hopefully self-explanatory, though you will need to change (if necessary) the paths to the installs of Thrift and Boost (a dependency of Thrift), and substitute your filenames for the code files.
Python
Having built the server in C++, we can now turn our attention to Python, where we’ll make the client application.
Unfortunately Thrift doesn’t give us quite the same skeleton to work from; but the file itself is pretty simple, so we can easily write it from scratch.
As you can see, there’s not a lot to do there. The first 14 lines are essentially boilerplate code, and after that all we need to do is create our client and call the methods we wish to invoke on the server, handling any exceptions raised by the server (except Thrift.TException…), and the case where the transport fails to open (which is usually caused by the server not running).
In part two, we’ll turn this the other way around, and see how to call a Python server from C++.