Binary JSON with bson4jackson

JSON is an excellent alternative to XML, but most JSON parsers written in Java are still rather slow. While searching for faster libraries, I found two things: BSON and Jackson.

BSON is binary-encoded JSON. The format has been designed with fast machine readability in mind. BSON has gained prominence as the main data exchange format of the document-oriented database management system MongoDB. Jackson, according to the JVM serializers benchmark, is one of the fastest JSON processors available. It also allows writing custom extensions, a feature that can be used to add further data exchange formats.

bson4jackson

This is where bson4jackson steps in. The library adds the capability to read and write BSON documents to Jackson. Since bson4jackson is fully integrated, you can use Jackson’s very nice API to serialize simple POJOs. Consider the following class:

public class Person {
  private String _name;
 
  public void setName(String name) {
    _name = name;
  }
 
  public String getName() {
    return _name;
  }
}

You may use the ObjectMapper to quickly serialize objects:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import com.fasterxml.jackson.databind.ObjectMapper;
import de.undercouch.bson4jackson.BsonFactory;
 
public class ObjectMapperSample {
  public static void main(String[] args) throws Exception {
    // create dummy POJO
    Person bob = new Person();
    bob.setName("Bob");
 
    // serialize data
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    ObjectMapper mapper = new ObjectMapper(new BsonFactory());
    mapper.writeValue(baos, bob);
 
    // deserialize data
    ByteArrayInputStream bais = new ByteArrayInputStream(
      baos.toByteArray());
    Person clone_of_bob = mapper.readValue(bais, Person.class);
 
    assert bob.getName().equals(clone_of_bob.getName());
  }
}

Or you may use Jackson’s streaming API and serialize the object manually:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import com.fasterxml.jackson.core.JsonGenerator;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import de.undercouch.bson4jackson.BsonFactory;
 
public class ManualSample {
  public static void main(String[] args) throws Exception {
    // create dummy POJO
    Person bob = new Person();
    bob.setName("Bob");
 
    // create factory
    BsonFactory factory = new BsonFactory();
 
    // serialize data
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    JsonGenerator gen = factory.createGenerator(baos);
    gen.writeStartObject();
    gen.writeFieldName("name");
    gen.writeString(bob.getName());
    // explicitly end the object before closing the generator
    gen.writeEndObject();
    gen.close();
 
    // deserialize data
    ByteArrayInputStream bais = new ByteArrayInputStream(
      baos.toByteArray());
    JsonParser parser = factory.createParser(bais);
    Person clone_of_bob = new Person();
    parser.nextToken();
    while (parser.nextToken() != JsonToken.END_OBJECT) {
      String fieldname = parser.getCurrentName();
      parser.nextToken();
      if ("name".equals(fieldname)) {
        clone_of_bob.setName(parser.getText());
      }
    }
 
    assert bob.getName().equals(clone_of_bob.getName());
  }
}

Optimized streaming

One drawback of BSON is the fact that each document has to begin with a 32-bit integer denoting the document’s length in bytes. This length has to be known before the first byte is written, so bson4jackson is forced to buffer the whole document before it can write it to the OutputStream. bson4jackson’s parser ignores this length field, so you may also leave it empty. For this, you have to create the BsonFactory as follows:

BsonFactory fac = new BsonFactory();
fac.enable(BsonGenerator.Feature.ENABLE_STREAMING);

This trick can improve the serialization performance for large documents and reduce the memory footprint a lot. The official MongoDB Java driver also ignores the length field. So, you may also use this optimization if your bson4jackson-created documents will be read by the MongoDB driver.
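To see why the length field forces buffering, here is a minimal sketch that encodes a one-field document by hand using only the JDK. It does not use bson4jackson; the class and method names (BsonLengthPrefixSketch, encodeSingleStringDocument, readLengthPrefix) are made up for illustration, and the byte layout follows the BSON specification for a document with a single string element.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Illustrative only: hand-encodes a BSON document with one string field
final class BsonLengthPrefixSketch {
  static byte[] encodeSingleStringDocument(String key, String value) {
    byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
    byte[] valueBytes = value.getBytes(StandardCharsets.UTF_8);

    // the body has to be complete before the total length is known
    ByteArrayOutputStream body = new ByteArrayOutputStream();
    body.write(0x02);                              // element type: UTF-8 string
    body.write(keyBytes, 0, keyBytes.length);      // field name ...
    body.write(0x00);                              // ... as null-terminated string
    byte[] stringLength = ByteBuffer.allocate(4)
        .order(ByteOrder.LITTLE_ENDIAN)
        .putInt(valueBytes.length + 1)             // value bytes plus terminator
        .array();
    body.write(stringLength, 0, stringLength.length);
    body.write(valueBytes, 0, valueBytes.length);
    body.write(0x00);                              // string terminator
    body.write(0x00);                              // document terminator

    // the length field counts its own 4 bytes as well
    int total = 4 + body.size();
    ByteBuffer document = ByteBuffer.allocate(total)
        .order(ByteOrder.LITTLE_ENDIAN);
    document.putInt(total);
    document.put(body.toByteArray());
    return document.array();
  }

  // reads the little-endian length field at the start of a document
  static int readLengthPrefix(byte[] document) {
    return ByteBuffer.wrap(document).order(ByteOrder.LITTLE_ENDIAN).getInt();
  }

  public static void main(String[] args) {
    byte[] document = encodeSingleStringDocument("name", "Bob");
    System.out.println("length field: " + readLengthPrefix(document));
    System.out.println("actual size:  " + document.length);
  }
}
```

For the document {"name": "Bob"} both values are 19. The sketch cannot emit the first four bytes until the whole body has been collected, which is exactly the buffering that ENABLE_STREAMING avoids.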

Performance

Version 1.1.0 of bson4jackson introduced support for Jackson 1.7 as well as many performance improvements. As of January 2011, bson4jackson is much faster than the official MongoDB driver for Java. For serialization, this is only true when using the streaming API, since Jackson’s ObjectMapper adds a little bit of overhead (the MongoDB driver itself also uses some kind of streaming API). Deserialization is always faster. The latest benchmark results can be reviewed on the following website:

https://github.com/eishay/jvm-serializers/wiki

Compatibility with MongoDB

In version 1.2.0, bson4jackson’s compatibility with MongoDB was improved considerably. Thanks to a contribution by James Roper, the BsonParser class now supports the new HONOR_DOCUMENT_LENGTH feature. It makes the parser honour the first 4 bytes of a document, which usually contain the document’s size. Of course, this only works if BsonGenerator.Feature.ENABLE_STREAMING was not enabled during document generation.

This feature can be useful for reading consecutive documents from an input stream produced by MongoDB. You can enable it as follows:

BsonFactory fac = new BsonFactory();
fac.enable(BsonParser.Feature.HONOR_DOCUMENT_LENGTH);
BsonParser parser = (BsonParser)fac.createParser(...);
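The idea of honouring the length field can also be illustrated without bson4jackson. The following sketch uses only the JDK and assumes that every document in the stream starts with a correct length field. It cuts a stream of consecutive documents into separate byte arrays. The names BsonDocumentSplitterSketch and splitDocuments are hypothetical, and this is not how BsonParser is implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: splits consecutive BSON documents by their length fields
final class BsonDocumentSplitterSketch {
  static List<byte[]> splitDocuments(InputStream in) throws IOException {
    DataInputStream data = new DataInputStream(in);
    List<byte[]> documents = new ArrayList<>();
    int first;
    // a clean end of stream is only allowed between two documents
    while ((first = data.read()) != -1) {
      byte[] prefix = new byte[4];
      prefix[0] = (byte) first;
      data.readFully(prefix, 1, 3);
      int length = ByteBuffer.wrap(prefix)
          .order(ByteOrder.LITTLE_ENDIAN).getInt();
      // the smallest valid document is the length field plus a terminator
      if (length < 5) {
        throw new IOException("Invalid document length: " + length);
      }
      byte[] document = new byte[length];
      System.arraycopy(prefix, 0, document, 0, 4);
      // throws EOFException if the stream ends inside a document
      data.readFully(document, 4, length - 4);
      documents.add(document);
    }
    return documents;
  }

  public static void main(String[] args) throws IOException {
    // two empty documents written back to back
    byte[] stream = {5, 0, 0, 0, 0, 5, 0, 0, 0, 0};
    List<byte[]> documents = splitDocuments(new ByteArrayInputStream(stream));
    System.out.println("documents found: " + documents.size());
  }
}
```

Each returned byte array holds exactly one document and could then be handed to a parser on its own. If the length fields had been left empty by a streaming generator, this approach would fail, which is why the two features cannot be combined.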

Compatibility with Jackson

bson4jackson 2.x is compatible with Jackson 2.x and higher. Due to some compatibility issues, bson4jackson’s major and minor version number has to be at least as high as Jackson’s. This means you have to use at least bson4jackson 2.1 if you use Jackson 2.1, at least bson4jackson 2.2 if you use Jackson 2.2, etc. I will try to keep bson4jackson up to date. If there is a compatibility issue, I will update bson4jackson, usually within a couple of days after the new Jackson version has been released.

Here’s the compatibility matrix for the current library versions:

                    Jackson 2.7.x   Jackson 2.6.x   Jackson 2.5.x
bson4jackson 2.7.x  Yes             Yes             Yes
bson4jackson 2.6.x  No              Yes             Yes
bson4jackson 2.5.x  No              No              Yes

If you’re looking for a version compatible with Jackson 1.x, please use bson4jackson 1.3. It’s the last version of the 1.x branch. bson4jackson 1.3 is compatible with Jackson 1.7 up to 1.9.

Download

Pre-compiled binaries

Pre-compiled binary files of bson4jackson can be downloaded from Maven Central. Additionally, you will need a copy of Jackson to start right away.

Maven/Gradle/buildr/sbt

Alternatively, you may use Maven to download bson4jackson:

<dependencies>
  <dependency>
    <groupId>de.undercouch</groupId>
    <artifactId>bson4jackson</artifactId>
    <version>2.9.2</version>
  </dependency>
</dependencies>

For Gradle, you may use the following snippet:

implementation 'de.undercouch:bson4jackson:2.9.2'

For buildr, use the following snippet:

compile.with 'de.undercouch:bson4jackson:jar:2.9.2'

If you’re using sbt, you may add the following line to your project:

val bson4jackson = "de.undercouch" % "bson4jackson" % "2.9.2"

License

bson4jackson is licensed under the Apache License, Version 2.0.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.



Posted by Michel Krämer
on 30 January 2011

