Does Apache Thrift allow foreign function calls between any two languages? - language-agnostic

I'm currently trying to develop (an API in multiple programming languages) that can be accessed from (various other programming languages). I've taken a look at Apache Thrift, and it appears that it might be possible to allow seamless foreign function calls between any two languages using Thrift. Is this correct?

Thrift was created to facilitate communication between different processes over the network, not in-process FFI. It is probably possible to take some parts of Thrift (such as the IDL) and adapt them for FFI, but that could be a nontrivial undertaking and yield suboptimal results.

I have actually been thinking of something similar myself.
There are core concepts to the Thrift specification.
The Transport: This portion is responsible for facilitating data transfer between a client and server.
The Protocol: This portion is responsible for formatting the data in different ways. It can be JSON, compressed binary, or even raw uncompressed binary.
The Server: This is responsible for putting these things together and managing them.
Thrift allows you to mix these different parts in unique ways to create something suitable to your purpose. Thrift is still very much server-client oriented though.
Developing an API in Thrift would mean that you could theoretically have plugins in any language. The main software component would launch each plugin as a sub-process and use stdin/stdout as the Transport. This would allow it to make RPC calls regardless of language.
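To make the sub-process idea concrete, here is a minimal sketch of a host talking to a plugin over stdin/stdout. It does not use Thrift's own transport or protocol classes: newline-delimited JSON stands in for the Protocol, and the names `PLUGIN_SOURCE`, `StdioClient` and the `add` method are all made up for illustration. The plugin happens to be Python here, but it could be any program that reads and writes lines.

```python
import json
import subprocess
import sys

# Hypothetical plugin: reads one JSON request per line on stdin and writes
# one JSON response per line on stdout.
PLUGIN_SOURCE = r'''
import json, sys
for line in sys.stdin:
    req = json.loads(line)
    if req["method"] == "add":
        result = sum(req["params"])
    else:
        result = None
    sys.stdout.write(json.dumps({"id": req["id"], "result": result}) + "\n")
    sys.stdout.flush()
'''


class StdioClient:
    """Host side: launches the plugin and talks to it over its stdin/stdout."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(
            argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
        )
        self.next_id = 0

    def call(self, method, *params):
        # One request line out, one response line back (blocking).
        self.next_id += 1
        request = {"id": self.next_id, "method": method, "params": list(params)}
        self.proc.stdin.write(json.dumps(request) + "\n")
        self.proc.stdin.flush()
        response = json.loads(self.proc.stdout.readline())
        return response["result"]

    def close(self):
        # Closing stdin signals end-of-input, so the plugin loop exits.
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()


if __name__ == "__main__":
    client = StdioClient([sys.executable, "-c", PLUGIN_SOURCE])
    print(client.call("add", 2, 3))
    client.close()
```

The host only depends on the line format, not on the language the plugin is written in, which is the property the answer is after.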

Related

Protocol Buffer vs Json - when to choose one over another

Can anyone explain when to use protocol buffer instead of JSON for micro-services architecture? And vice-versa? Both on synchronous and asynchronous communication.
When to use JSON
You need or want data to be human readable
Data from the service is directly consumed by a web browser
Your server side application is written in JavaScript
You aren’t prepared to tie the data model to a schema
You don’t have the bandwidth to add another tool to your arsenal
The operational burden of running a different kind of network service is too great
Pros of ProtoBuf
Relatively smaller size
Guarantees type-safety
Prevents schema-violations
Gives you simple accessors
Fast serialization/deserialization
Backward compatibility
While we are at it, have you looked at flatbuffers?
Some of these aspects are covered in the related question "google protocol buffers vs json vs XML".
Reference:
https://codeclimate.com/blog/choose-protocol-buffers/
https://codeburst.io/json-vs-protocol-buffers-vs-flatbuffers-a4247f8bda6f
I'd use JSON when the consumer is or could possibly be written in a language with built-in native support for JSON (JavaScript is an example), when it is a web browser, or where human readability is wanted. Speaking of which, at least for asynchronous calls, many developers enjoy the convenience of examining the contents of the queue directly for debugging and even during the normal course of development. Depending on the tech stack used, it may or may not be worth the trade-off to use protobuf just to reduce network load, since any performance increase won't buy you much in the async world. And it's not like we need to write a bunch of boilerplate code anymore, as we used to for JSON marshalling and unmarshalling in most languages.
I'd use protobuf for everything else... if there are any other use cases left for it with the considerations above. There are advantages you might see, such as performance, network load, the backwards compatibility offered by its versioning scheme, the lovely documentation that magically comes with proto files, and some validation! If for some reason you have a lot of REST or other synchronous calls between microservices, protobuf can be sent over the wire instead of JSON without many trade-offs, if any at all, while offering a heap of advantages.
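To illustrate the size argument, here is a rough sketch comparing a self-describing JSON message with a schema-based binary one. It uses Python's `struct` module as a stand-in, not protobuf itself, so the actual numbers will differ from what protobuf produces. The `reading` record and the `READING_FORMAT` layout are assumptions made up for the example.

```python
import json
import struct

# Hypothetical record; field names and types are illustrative only.
reading = {"sensor_id": 1234, "temperature": 21.5, "ok": True}

# Schemaless text encoding: field names travel with every message.
as_json = json.dumps(reading).encode("utf-8")

# Schema-based binary encoding: both sides agree in advance that a reading is
# an unsigned 32-bit int, a 64-bit float and a bool, in that order.
READING_FORMAT = "<Id?"
as_binary = struct.pack(
    READING_FORMAT, reading["sensor_id"], reading["temperature"], reading["ok"]
)


def decode_reading(data):
    """Rebuild the record; the field names come from the shared schema."""
    sensor_id, temperature, ok = struct.unpack(READING_FORMAT, data)
    return {"sensor_id": sensor_id, "temperature": temperature, "ok": ok}


if __name__ == "__main__":
    print(len(as_json), "bytes as JSON")
    print(len(as_binary), "bytes as packed binary")
```

The binary form is smaller because the field names live in the shared schema rather than in each message, and the price is that the bytes are unreadable without that schema.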

Messaging library for jeroMQ

I have chosen jeroMQ to build an asynchronous message channel for publishing content from multiple clients. On the other end, server-side worker processes handle each request and notify the client only if the server decides to, based on the message received.
While digging deeper, I looked for a messaging library to marshal/unmarshal messages and found the kvpmsg class, which does the job for simple key-value pairs.
I don't want to reinvent the wheel if a standard library exists that can be applied to bigger objects.
It seems like you are asking for data serialization libraries. Check Wikipedia for a list and a comparison of data serialization formats.
Also there is a relevant entry in ZeroMQ FAQ explaining why ZeroMQ doesn't include any serialization format:
Does ØMQ include APIs for serializing data to/from the wire representation?
No. This design decision adheres to the UNIX philosophy of "do one thing and do it well". In the case of ØMQ, that one thing is moving messages, not marshaling data to/from binary representations.
Some middleware products do provide their own serialization API. We believe that doing so leads to bloated wire-level specifications like CORBA (1055 pages). Instead, we've opted to use the simplest wire formats possible which ensure easy interoperability, efficiency and reduce the code (and bug) bloat.
If you wish to use a serialization library, there are plenty of them out there. See for example
Google Protocol Buffers
MessagePack
JSON-GLib
C++ BSON Library
Note that serialization implementations might not be as performant as you might expect. You may need to benchmark your workloads with several serialization formats and libraries in order to understand performance and which format/implementation is best for your use case (ease of development must also be considered).
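A benchmark along those lines can be quite small. The sketch below is only an assumption about how one might structure it: `json` and `pickle` from the Python standard library stand in for whichever formats are actually under consideration, and `message`, `candidates` and `benchmark` are hypothetical names.

```python
import json
import pickle
import timeit

# Hypothetical workload; substitute messages that resemble real traffic.
message = {"id": 42, "tags": ["a", "b", "c"], "values": list(range(100))}

# Each candidate is an (encode, decode) pair. json and pickle are stand-ins
# from the standard library; third-party formats plug in the same way.
candidates = {
    "json": (lambda m: json.dumps(m).encode("utf-8"), lambda b: json.loads(b)),
    "pickle": (pickle.dumps, pickle.loads),
}


def benchmark(candidates, message, repeat=1000):
    """Return encoded size and round-trip time for each candidate."""
    results = {}
    for name, (encode, decode) in candidates.items():
        encoded = encode(message)
        assert decode(encoded) == message  # sanity check before timing
        seconds = timeit.timeit(lambda: decode(encode(message)), number=repeat)
        results[name] = {"bytes": len(encoded), "seconds": seconds}
    return results


if __name__ == "__main__":
    for name, stats in benchmark(candidates, message).items():
        print(name, stats)
```

Results depend heavily on message shape and library version, so it is worth running this against representative payloads rather than trusting published figures.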

Why Thrift, Why not HTTP RPC(JSON+gzip)

Thrift's primary goal is to enable efficient and reliable communication across programming languages, but I think HTTP-RPC can do that too. Almost every web developer knows how to work with HTTP, and it is easier to implement HTTP-RPC (JSON) than Thrift.
Maybe Thrift RPC is faster; if so, can anyone tell me the difference in performance between them?
A few reasons other than speed:
Thrift generates the client and server code completely, including the data structures you are passing, so you don't have to deal with anything other than writing the handlers and invoking the client. Everything, including parameters and return values, is automatically validated and parsed, so you get sanity checks on your data for free.
Thrift is more compact than HTTP, and can easily be extended to support things like encryption, compression, non blocking IO, etc.
Thrift can be set up to use HTTP and JSON pretty easily if you want it (say if your client is somewhere on the internet and needs to pass firewalls)
Thrift supports persistent connections and avoids the continuous TCP and HTTP handshakes that HTTP incurs.
Personally, I use thrift for internal LAN RPC and HTTP when I need connections from outside.
I hope all this makes sense to you. You can read a presentation I gave about thrift here:
http://www.slideshare.net/dvirsky/introduction-to-thrift
It has links to a few other alternatives to thrift.
Here is a good resource on performance comparison of different serializers: https://github.com/eishay/jvm-serializers/wiki/
Speaking specifically of Thrift vs JSON: Thrift performance is comparable to the best JSON libraries (jackson, protostuff), and serialized size is somewhat lower.
IMO, strongest thrift advantages are convenient interoperable RPC invocations and convenient handling of binary data.

Serialization format common to node.js and ActionScript?

Some of my friends are designing a game, and I am helping them out by implementing the game's backend server. The game is written in Flash, and I plan to develop the server in node.js because (a) it would be a cool project for learning node.js, and (b) it's fast, which is important for games.
The server's architecture is based on messages sent between the server and client (sort of like Minecraft's server protocol). The message format I have so far is a byte (the packet type), two bytes (the message length) and that many bytes (the message data, which is a mapping of key-value pairs). Problem is, I really don't want to develop my own serialization format (because while I probably could, implementing it would be a pain compared to using an existing solution).
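For reference, the framing described above (one type byte, a two-byte length, then the payload) can be sketched as follows. The big-endian byte order and the names `HEADER`, `encode_packet` and `decode_packets` are assumptions for illustration; the payload is treated as opaque bytes, so any serialization format can go inside it.

```python
import struct

# Packet type (1 byte) followed by payload length (2 bytes, unsigned).
# Big-endian is an assumption; the question does not specify byte order.
HEADER = struct.Struct(">BH")


def encode_packet(packet_type, payload):
    if len(payload) > 0xFFFF:
        raise ValueError("payload does not fit in a two-byte length field")
    return HEADER.pack(packet_type, len(payload)) + payload


def decode_packets(buffer):
    """Split a receive buffer into complete packets plus leftover bytes."""
    packets = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        packet_type, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # payload not fully received yet
        packets.append((packet_type, buffer[offset + HEADER.size:end]))
        offset = end
    return packets, buffer[offset:]
```

Returning the leftover bytes matters on a stream socket, where a single read can end in the middle of a packet; the caller prepends them to the next chunk it receives.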
Unfortunately, I am having problems finding a good candidate for the message data serialization format.
ActionScript's own remoting format might work, but I don't like it much.
JSON has support in node.js (obviously) and in ActionScript, but it's also textual and I would prefer binary for enhanced speed.
MessagePack looked like a good candidate, but I can't find an ActionScript implementation. (There's one called as3-msgpack on Google Code, but I get weird errors and can't access it.)
BSON has an ActionScript implementation, but no node.js support besides their MongoDB library (and I'm planning on using Redis).
So, can anyone offer any other serialization formats that I might have missed? Or should I just stick with one of these (or roll my own)?
Isn't that why HTTP supports gzipped content? Just use JSON and gzip the content when you send it. The time spent gzipping is more than recovered by the reduced latency of the transmission.
Check this article for more on gzip with Actionscript. On node.js I think that gzip-compress is fairly popular.
Actually, if I were in your shoes I would implement two methods and time them. Use JSON because it is common and easy to do. But then implement AMQP instead and compare them. If you want to massively scale this then you might find that AMQP makes it easier. Also, message queuing is just such a nice fit into the node.js world view.
AMQP on Actionscript, and someone doing similar on node.js.
Leverage JSAMF in Node.js for AMF communications with Flash.
http://www.jamesward.com/2010/07/07/amf-js-a-pure-javascript-amf-implementation/
If you wanted to, you could create your entire API in client side JavaScript, and use JSON as the data exchange format, then call ExternalInterface by AS to communicate with the client JavaScript API, which would make for an elegant server side solution.
It is worth noting that Flash Player has built in support for decompressing gzip compressed data. It may be worth compressing some of your JSON objects, things like localised string tables, game configuration data, etc which can grow to be a few hundred kb but are only loaded once on game load.
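As a rough illustration of how well that kind of data compresses, here is a small Python sketch of the gzip round trip. It is not Flash or node.js code, and the `string_table` contents are invented; real tables will compress differently.

```python
import gzip
import json

# Hypothetical string table; repetitive keys and values compress well.
string_table = {
    f"menu.item.{i}.label": f"Menu item number {i}" for i in range(500)
}

raw = json.dumps(string_table).encode("utf-8")
compressed = gzip.compress(raw)


def load_table(blob):
    """Reverse the steps above: gunzip, then parse the JSON."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


if __name__ == "__main__":
    print(len(raw), "bytes of JSON,", len(compressed), "bytes gzipped")
```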
I'm working on a version of MessagePack for AS3.
At the current version it does the basic (encoding/decoding). Planning streams for the future.
Check the project page: https://github.com/loteixeira/as3-msgpack

RPC for java/python with rest support, HTML monitoring and goodies

Here's my set of requirements: I'm looking for an RPC framework such as thrift, avro, protobuf (when adding services to it) which supports:
Easy and intuitive IDL. No serial numbers, no manual versioning, simple... avro is a good example for this.
Works with Java and Python
Supports both a fast binary protocol and an HTTP-based RESTful style. I'd like to be able to use it for both backend-to-backend communication (java-java or python-java) as well as frontend-to-backend communication (javascript to java).
The rest support needs to include &param=value input as get/post requests (configurable per request) and output in three possible formats: json, jsonp, XML.
Compact, fast, backward compatible, easy to upgrade etc...
Provides some nice monitoring interfaces such as: JMX, web page status reports (e.g. packets in, packets out, error rate etc)
Ops friendly... no need to take the whole site down to release new versions
Both sync and async communication
... other goodies are welcome...
Is there something out there?
So far I've looked at thrift and avro and they are both nice in some ways, but don't check all my list.
Thanks
That's a pretty tall order - some of the requirements are met by:
Avro, Thrift, Protobuf and Ice from ZeroC.
ICE is probably the most performant.