Thrift using different serialization protocols #19
Comments
For map output this is easy: the protocol can be specified in the Configuration and read by ThriftSerialization. Sequence files that contain Thrift objects as key or value are harder, because SequenceFileInputFormat/SequenceFileOutputFormat cannot handle them directly. New input and output formats would have to be created, and the expected protocol would be specified via the Configuration or via the SequenceFile header. In that case ThriftSerialization could not be used: without object wrappers à la Avro (AvroKey, AvroValue), it cannot tell whether it is dealing with an input, a map output or an output.
It sounds reasonable.
Iván de Prado (in reply to Eric Palacios, 2013/1/8)
That would be solved properly by implementing a custom field serializer for Thrift (http://pangool.net/userguide/custom_serialization.html). The field metadata would store the protocol used to serialize that field, and this information would also be carried in the header of the TupleFile.
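To make the header idea concrete, here is a minimal sketch of a metadata round trip. It does not use Pangool's real TupleFile header layout or its custom serialization API; the class `FieldMetadataHeader`, the key `thrift.protocol` and the byte layout are all hypothetical, chosen only to show how a reader could recover the protocol name a writer recorded.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a file header that carries per-field metadata. */
final class FieldMetadataHeader {

    // Hypothetical metadata key; not an existing Pangool property.
    static final String PROTOCOL_KEY = "thrift.protocol";

    /** Writes the entry count followed by each key/value pair as UTF strings. */
    static byte[] write(Map<String, String> metadata) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(metadata.size());
            for (Map.Entry<String, String> entry : metadata.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeUTF(entry.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads back the metadata written by {@link #write(Map)}. */
    static Map<String, String> read(byte[] header) {
        Map<String, String> metadata = new LinkedHashMap<>();
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(header))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String key = in.readUTF();
                metadata.put(key, in.readUTF());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return metadata;
    }
}
```

With something along these lines, the writer would record the protocol once per file and the reader would pick the matching Thrift protocol before deserializing any field, so no per-job configuration would be needed on the read side.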
Right now Pangool serializes Thrift objects using TBinaryProtocol. It could be interesting to use TCompactProtocol, which uses less space. The idea is to make the protocol selection configurable.
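One possible shape for the configurable selection is sketched below. It is not Pangool code: the enum `ThriftProtocolChoice` and the key `pangool.thrift.protocol` are hypothetical, and a plain `Map` stands in for the Hadoop Configuration so the snippet compiles without Hadoop or Thrift on the classpath. The only Thrift names used are the factory classes `TBinaryProtocol.Factory` and `TCompactProtocol.Factory`, referenced by their binary class names.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical selector that maps a configuration value to a Thrift protocol factory. */
enum ThriftProtocolChoice {
    BINARY("org.apache.thrift.protocol.TBinaryProtocol$Factory"),
    COMPACT("org.apache.thrift.protocol.TCompactProtocol$Factory");

    // Hypothetical configuration key; not an existing Pangool or Hadoop property.
    static final String CONF_KEY = "pangool.thrift.protocol";

    private final String factoryClassName;

    ThriftProtocolChoice(String factoryClassName) {
        this.factoryClassName = factoryClassName;
    }

    /** Binary name of the TProtocolFactory class to load reflectively. */
    String factoryClassName() {
        return factoryClassName;
    }

    /**
     * Resolves the protocol from the configuration. Falls back to BINARY,
     * the current behaviour, when the key is absent or blank. An unknown
     * value makes valueOf throw IllegalArgumentException.
     */
    static ThriftProtocolChoice fromConf(Map<String, String> conf) {
        String raw = conf.get(CONF_KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return BINARY;
        }
        return valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }
}
```

A serializer could then load the chosen factory with `Class.forName(choice.factoryClassName())` and use it wherever TBinaryProtocol is hard-coded today. Defaulting to BINARY would keep existing jobs and files readable without any configuration change.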