Thrift Sanjoy Singh Scalable Cross Language Services Implementation

Thrift
Scalable Cross Language Services Implementation
Sanjoy Singh
Senior Team Lead
Talentica S/W (I) Pvt Ltd
1
Scalability ??

A design or program is said to scale if it is suitably efficient and practical when applied to large situations.
Measures
- Load
- Functional
2
Agenda
- Key Components/Challenges for Cross Language Interactions
- Various Systems for Cross Language Interactions
- Dive Into Apache Thrift
  - Principle Of Operation
  - Example
  - Thrift Stack
  - Versioning
- Why to use Thrift. Limitations?
- Quick Code Walkthrough

3
LAMP
+
Services
High-Level Goal: Enable transparent interaction between these.
…and some others too.
4
High Level Goals !
- Transparent interaction between multiple programming languages.
- Maintain the right balance between:
  - Performance
  - Ease and speed of development
  - Availability of existing libraries, etc.
5
Simple Distributed Architecture
Server: waiting for requests (known location, known port)
Client: sending requests, getting results
Between them: communication protocol, data format
Basic questions are:
- What kind of protocol to use, and what data to transmit?
- What to do with requests on the server side?
6
Key Components/Challenges !

Type system

Transport system

Protocol system

Versioning

Processor

Performance
No problem can stand the assault of sustained thinking.
7
Hasn’t this been done before?

SOAP

CORBA

COM

Pillar

Protocol Buffers etc
(yes.)
8
Should we pick up one of those?
(not sure)
- SOAP: XML, XML, and more XML
- CORBA: Over-designed and heavyweight
- COM: Embraced mainly in Windows client software
- Pillar: Slick! But no versioning/abstraction.
- Protocol Buffers etc: Closed source Google deliciousness
9
Decision Time !
As a developer, what are you looking
for?
Be Patient, I have something for you in
the subsequent slides !!
10
Solution
Apache Thrift
Software framework for scalable
cross-language services development.
11
Apache Thrift - Introduction

- Originally developed at Facebook
- Open sourced in April 2007
- Easy exchange of data
- Cross-language serialization with minimal overhead
- Thrift tools can generate code for C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, Smalltalk and OCaml

12
Let's Dive In..
13
Principle Of Operation
1. Define data types and service interfaces: create a Thrift file, e.g. demo.thrift
2. Run the Thrift code generator tool (written in C++) to build the Thrift platform files: Demo.php, Demo.cpp, Demo.py, Demo.java
3. Create the server/client app: the server implements the services and the client calls them
4. Run the server
14
Thrift Cares About
- Type Definitions
- Service Definitions
Thrift Doesn’t Care About
- Wire Protocol (internal XML...)
- Transport (HTTP? Sockets? Whatevz!)
- Programming Languages
15
Enough Banter. Show Us the Goodz.
// Include other thrift files
include "shared.thrift"
namespace java calculator
enum Operation { // define enums
  ADD = 1,
  SUBTRACT = 2,
  MULTIPLY = 3,
  DIVIDE = 4
}
struct Work { // complex data structures
  1: i32 num1 = 0,
  2: i32 num2,
  3: Operation op,
  4: optional string comment,
}
16
Enough Banter. Show Us the Goodz.
// Exception
exception InvalidOperation {
  1: i32 what,
  2: string why
}
// Service
service Calculator extends shared.SharedService {
  void ping(),
  i32 add(1:i32 num1, 2:i32 num2),
  i32 calculate(1:i32 logid, 2:Work w) throws (1:InvalidOperation ouch),
  oneway void zip()
}
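To make the service definition concrete, here is a minimal Python sketch of what a server-side implementation of Calculator might look like. It is not generated Thrift code: the Operation, Work and InvalidOperation classes are hand-written stand-ins for the generated types, and CalculatorHandler is an assumed name.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Operation(IntEnum):
    """Stand-in for the Operation enum from the IDL."""
    ADD = 1
    SUBTRACT = 2
    MULTIPLY = 3
    DIVIDE = 4


@dataclass
class Work:
    """Stand-in for the Work struct; attributes mirror the IDL fields."""
    num1: int = 0
    num2: int = 0
    op: Optional[Operation] = None
    comment: Optional[str] = None


class InvalidOperation(Exception):
    """Stand-in for the InvalidOperation exception (fields: what, why)."""

    def __init__(self, what: int, why: str):
        super().__init__(why)
        self.what = what
        self.why = why


class CalculatorHandler:
    """Hypothetical implementation of the Calculator service methods."""

    def ping(self) -> None:
        pass

    def add(self, num1: int, num2: int) -> int:
        return num1 + num2

    def calculate(self, logid: int, w: Work) -> int:
        if w.op == Operation.ADD:
            return w.num1 + w.num2
        if w.op == Operation.SUBTRACT:
            return w.num1 - w.num2
        if w.op == Operation.MULTIPLY:
            return w.num1 * w.num2
        if w.op == Operation.DIVIDE:
            if w.num2 == 0:
                raise InvalidOperation(int(w.op), "Cannot divide by 0")
            return w.num1 // w.num2  # integer (floor) division
        raise InvalidOperation(0, "Unknown operation")

    def zip(self) -> None:
        pass  # oneway in the IDL: the caller does not wait for a reply
```

The handler holds only application logic. Nothing in it touches sockets or serialization, which is the split the following slides describe.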
17
What DOES that do?
- Generates definitions for all the types in each language
- Generates client and server interfaces for each language
What DOESN'T that do?
- Anything to do with sockets
- Anything to do with serialization
19
Magically Generated Files
gen-java/calculatordemo
  Calculator.java
  InvalidOperation.java
  Operation.java
  Work.java
gen-php/
  Calculator.php
  calculator_types
gen-py/
  ttypes.py
  Calculator.py
  Calculator-remote
20
Thrift Philosophy
Create a system that is abstracted in a
systematic way, such that developers can
easily extend it to suit their needs and
function in custom environments.
21
Structs don’t have any code to do with
serialization or sockets, etc.
But they know how to read and write
themselves… How does that work?
22
The Thrift Stack

The Thrift stack is a common class hierarchy implemented in each language that abstracts out the tricky details of protocol encoding and network communication. It provides a simple interface for generated code to use.
There are two key interfaces:
- TTransport
  - Decouples the transport layer from the code generation layer.
  - Provides read() and write(), with a set of other helpers like open(), close(), etc.
  - Implementations: TSocket, TFileTransport, TBufferedTransport, TFramedTransport, TMemoryBuffer.
- TProtocol
  - Separates data structures from their transport representation.
  - Provides the ability to read and write various types of data, e.g. readI32(), writeString(), etc.
  - Supports bi-directional sequenced messaging and encoding of base types, containers and structs.
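The split between the two interfaces can be sketched in a few lines of Python. This is an illustration of the layering only, not the Thrift library: MemoryTransport and SimpleBinaryProtocol are hypothetical names, and the encoding used (big-endian 32-bit integers, length-prefixed UTF-8 strings) is an assumption made for the sketch.

```python
import struct


class MemoryTransport:
    """Hypothetical transport: moves raw bytes through an in-memory buffer."""

    def __init__(self, data: bytes = b""):
        self._buf = bytearray(data)
        self._pos = 0

    def write(self, data: bytes) -> None:
        self._buf.extend(data)

    def read(self, n: int) -> bytes:
        chunk = bytes(self._buf[self._pos:self._pos + n])
        if len(chunk) < n:
            raise EOFError("not enough bytes in transport")
        self._pos += n
        return chunk

    def getvalue(self) -> bytes:
        return bytes(self._buf)


class SimpleBinaryProtocol:
    """Hypothetical protocol: turns typed values into bytes on a transport."""

    def __init__(self, transport):
        self._trans = transport

    def write_i32(self, value: int) -> None:
        self._trans.write(struct.pack("!i", value))

    def read_i32(self) -> int:
        return struct.unpack("!i", self._trans.read(4))[0]

    def write_string(self, value: str) -> None:
        raw = value.encode("utf-8")
        self.write_i32(len(raw))  # length prefix, then the bytes
        self._trans.write(raw)

    def read_string(self) -> str:
        length = self.read_i32()
        return self._trans.read(length).decode("utf-8")


# Writer side: typed values go into the protocol, bytes come out of the transport.
out = MemoryTransport()
proto = SimpleBinaryProtocol(out)
proto.write_i32(42)
proto.write_string("ping")

# Reader side: the same bytes are fed through a fresh transport and protocol.
incoming = SimpleBinaryProtocol(MemoryTransport(out.getvalue()))
value = incoming.read_i32()
text = incoming.read_string()
```

The protocol class never learns where the bytes go, so the in-memory buffer could be swapped for a socket-backed transport without changing it.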
23
The Thrift Stack
Information Flow!
Object write() -> TProtocol -> TTransport -> TTransport -> TProtocol -> Object read()
24
Versioning (applications change a lot, not protocols!)

What happens when definitions change?
- Struct needs a new member
- Function needs a new argument
No Problem! We’ve got Field Identifiers!
Example:
struct Work {
  1: i32 num1 = 0,
  2: i32 num2,
  3: Operation op,
  4: optional string comment,
}
25
Versioning - Case Analysis
- Add a Field
  - New Client, Old Server: Server sees a field id that it doesn’t recognize, and safely ignores it.
  - Old Client, New Server: Server doesn’t see the field id it expects. It leaves the field unset in the object, and the server implementation can handle that properly.
- Remove a Field
  - New Client, Old Server: Server doesn’t see the field it expects. Analogous to above.
  - Old Client, New Server: Old client sends the deprecated field. Server politely ignores it. Analogous to the top case.
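The two "add a field" cases can be reproduced with a small Python sketch. It assumes a simplified wire layout (a type tag, a 16-bit field id, then the value, ended by a stop tag) that is only loosely modeled on a binary protocol. The encode and decode helpers and the schema dictionaries are hypothetical, not Thrift APIs.

```python
import struct

T_STOP, T_I32, T_STRING = 0, 8, 11  # type tags used by this sketch


def encode(fields):
    """Encode {field_id: value}; values are int (i32) or str."""
    out = bytearray()
    for field_id, value in fields.items():
        if isinstance(value, int):
            out += struct.pack("!bhi", T_I32, field_id, value)
        else:
            raw = value.encode("utf-8")
            out += struct.pack("!bhi", T_STRING, field_id, len(raw)) + raw
    out += struct.pack("!b", T_STOP)
    return bytes(out)


def decode(data, schema):
    """Decode into {name: value}; schema maps known field ids to names."""
    result = {name: None for name in schema.values()}  # None means unset
    pos = 0
    while True:
        (ftype,) = struct.unpack_from("!b", data, pos)
        pos += 1
        if ftype == T_STOP:
            break
        (field_id,) = struct.unpack_from("!h", data, pos)
        pos += 2
        if ftype == T_I32:
            (value,) = struct.unpack_from("!i", data, pos)
            pos += 4
        elif ftype == T_STRING:
            (length,) = struct.unpack_from("!i", data, pos)
            pos += 4
            value = data[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError(f"unsupported type tag {ftype}")
        if field_id in schema:
            result[schema[field_id]] = value
        # Unknown field ids are read past and dropped.
    return result


OLD_SCHEMA = {1: "num1", 2: "num2", 3: "op"}
NEW_SCHEMA = {1: "num1", 2: "num2", 3: "op", 4: "comment"}

# New client, old server: field 4 is unknown to the server and ignored.
new_msg = encode({1: 15, 2: 10, 3: 2, 4: "subtract please"})
seen_by_old = decode(new_msg, OLD_SCHEMA)

# Old client, new server: field 4 never arrives and stays unset.
old_msg = encode({1: 15, 2: 10, 3: 2})
seen_by_new = decode(old_msg, NEW_SCHEMA)
```

The type tag is what lets the old reader skip a field it does not know: it can tell how many bytes to step over without understanding what the field means.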
26
Why to use Thrift …
- Less time wasted by individual developers
  - No duplicated networking and protocol code
  - Less time dealing with boilerplate stuff
  - Write your client and server in about 5 minutes
- Less maintenance
  - One networking code base that needs maintenance
  - Fix bugs once, rather than repeatedly in every server
- Division of labour
  - Work on high-performance servers separate from applications
- Common toolkit
  - Code reuse and shared tools
27
Why to use Thrift …

- Cross-language serialization with lower overhead than alternatives such as SOAP, due to use of a binary format.
- A lean and clean library. No framework to code to. No XML configuration files.
- The language bindings feel natural. For example, Java uses ArrayList<String>. C++ uses std::vector<std::string>.
- The application-level wire format and the serialization-level wire format are cleanly separated. They can be modified independently.
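A rough feel for the overhead claim in the first point: the Python sketch below packs three integers as raw binary and as a small XML document and compares the sizes. Both encodings are assumptions made for illustration. The XML is not a real SOAP envelope (which would add more wrapping), and the binary layout omits the per-field headers a real Thrift protocol would write.

```python
import struct
import xml.etree.ElementTree as ET

num1, num2, op = 15, 10, 2

# Binary: three big-endian 32-bit integers, 12 bytes in total.
binary_payload = struct.pack("!iii", num1, num2, op)

# XML: the same three values wrapped in (hypothetical) element names.
root = ET.Element("Work")
for name, value in (("num1", num1), ("num2", num2), ("op", op)):
    ET.SubElement(root, name).text = str(value)
xml_payload = ET.tostring(root, encoding="utf-8")

print(len(binary_payload), "bytes as binary")
print(len(xml_payload), "bytes as XML")
```

The XML version spends most of its bytes on tag names and markup rather than on the three values.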
28
Why to use Thrift …
- The predefined serialization styles include: binary, HTTP-friendly and compact binary.
- Soft versioning of the protocol.
- No build dependencies or non-standard software. No mix of incompatible software licenses.
29
Limitations / Non-Features
- Is struct inheritance/polymorphism supported?
  - No, it isn’t.
- Can I overload service methods?
  - Nope. Method names must be unique.
- Heterogeneous containers: not supported.
- Is there enough documentation on Thrift development?
  - I think this is one weak area.
30
Steps/Code Walkthrough
(Let's build the example described earlier)
31
Some Real-World Examples
- Facebook Search Service: PHP-based web app -> Thrift PHP lib -> Search Service (implemented in C++)
- AdServer, Blogfeeds, CSSParser, Memcached, Network Selector, News Feed, Scribe, etc.

32
Why Should I not try this?
Guess the answer?
Answer: Please do let me know at sanjoys@talentica.com
Skype_id/Gtalk_id: sanjoy_17 / sanjoy17
33
References

http://incubator.apache.org/thrift/
http://incubator.apache.org/thrift/static/thrift-20070401.pdf
34
Thanks !!!
35