Although the previous version was released only two months ago, the all-new GeoRocket 1.3.0 comes with a large number of new features, fixes, and updates. We particularly focused on performance and scalability to make this the fastest version available. We also put a lot of work into usability and performed extensive tests on Amazon Web Services (AWS) with huge datasets to further improve GeoRocket’s stability and suitability for real-world applications.
There are two features in this release that we want to highlight. A complete list of changes can be found below.
Performance: Low-latency optimistic merging
Besides many performance improvements (e.g. better scalability during import and indexing, faster queries, HTTP compression), we implemented a new feature called low-latency optimistic merging, which tremendously improves query performance.
In GeoRocket, the result of a query is a file consisting of the merged chunks matching the query. In order to create valid output files, GeoRocket has to look at all chunks before it can actually render them. The process of exporting a file is as follows:
1. The client sends a query to GeoRocket.
2. GeoRocket forwards the query to the index to identify matching chunks.
3. It then loads the chunk metadata from the index and creates a valid header for the output file.
4. GeoRocket loads the chunks from the storage back-end.
5. Finally, it sends the rendered output file to the client.
If your data store contains heterogeneous XML data, GeoRocket has to load the XML namespaces of all matching chunks in step 3 to create an XML header containing all these namespaces. The time it takes to render the first chunk back to the client therefore directly depends on how many chunks match the query. While this is usually very fast, it does not scale arbitrarily and can take a few seconds for a multi-gigabyte dataset.
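The following minimal sketch illustrates why the time to the first rendered chunk depends on the number of matches. It is not GeoRocket's actual implementation: the `Chunk` class, the `merge_all_namespaces` function, and the `root` element name are hypothetical and only model the export process described above.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for a stored chunk and its index metadata."""
    namespaces: dict  # prefix -> namespace URI, as recorded in the index
    body: str         # XML fragment without its original root element


def merge_all_namespaces(chunks):
    """Model of the default merge: build the header from ALL chunks first."""
    header_ns = {}
    # Metadata pass: visit every matching chunk before writing anything.
    # This loop is why the time to the first rendered chunk grows with
    # the number of matches. (Simplification: prefix clashes are ignored.)
    for chunk in chunks:
        header_ns.update(chunk.namespaces)

    attrs = "".join(
        f' xmlns:{prefix}="{uri}"' for prefix, uri in sorted(header_ns.items())
    )
    # Only now can the chunks themselves be rendered.
    parts = [f"<root{attrs}>"]
    parts.extend(chunk.body for chunk in chunks)
    parts.append("</root>")
    return "".join(parts)


chunks = [
    Chunk({"gml": "http://www.opengis.net/gml"}, "<gml:Point/>"),
    Chunk({"bldg": "http://www.opengis.net/citygml/building/2.0"}, "<bldg:Building/>"),
]
merged = merge_all_namespaces(chunks)
print(merged)
```

The header of the merged document declares both namespaces, which is only possible because the metadata of both chunks was read before the first chunk was written.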
However, if you know that your data store contains homogeneous data or if your query matches homogeneous chunks—which is the case for most datasets and queries we’ve seen in production—you can enable low-latency optimistic merging to skip step 3 completely. GeoRocket will then render the first chunk to the client as soon as it has been loaded from the storage back-end, which usually happens within a few milliseconds.
Low-latency optimistic merging always works for GeoJSON data. So, if you only deal with GeoJSON, you can leave this feature enabled all the time. If you work with XML, you should make sure your data is homogeneous and all chunks use the same XML namespaces. If they don’t, the chunks that cannot be merged will be skipped.
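The idea can be sketched in a few lines. Again, this is an illustration under assumptions, not GeoRocket's code: `Chunk`, `merge_optimistically`, and the `root` element are hypothetical, and the rule used here (a chunk fits if the header already declares every namespace it uses) is our simplified reading of "cannot be merged".

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for a stored chunk."""
    namespaces: dict  # prefix -> namespace URI
    body: str         # XML fragment without its original root element


def merge_optimistically(chunks, write):
    """Write chunks as they arrive; return the number of skipped chunks."""
    header_ns = None
    skipped = 0
    for chunk in chunks:
        if header_ns is None:
            # No metadata pass: the header is derived from the first chunk
            # alone, so output can start immediately.
            header_ns = dict(chunk.namespaces)
            attrs = "".join(
                f' xmlns:{p}="{u}"' for p, u in sorted(header_ns.items())
            )
            write(f"<root{attrs}>")
        # The header has already been sent and cannot be changed, so a
        # chunk that needs an undeclared namespace has to be skipped.
        if all(header_ns.get(p) == u for p, u in chunk.namespaces.items()):
            write(chunk.body)
        else:
            skipped += 1
    if header_ns is not None:
        write("</root>")
    return skipped


GML = {"gml": "http://www.opengis.net/gml"}
BLDG = {"bldg": "http://www.opengis.net/citygml/building/2.0"}

out = []
skipped = merge_optimistically(
    [Chunk(GML, "<gml:Point/>"), Chunk(BLDG, "<bldg:Building/>"), Chunk(GML, "<gml:Curve/>")],
    out.append,
)
result = "".join(out)
```

In this example the second chunk uses a namespace that the first one did not declare, so it is counted as skipped while the two homogeneous chunks are merged without any prior look at the metadata.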
Low-latency optimistic merging is available through the HTTP interface and as a command-line parameter. The number of skipped chunks (if there are any) will be returned through an HTTP trailer or written to the standard error stream, respectively.
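HTTP trailers are sent after the last chunk of a chunked response, and many HTTP clients do not expose them by default. The sketch below shows how a client could decode a raw chunked body and pick up such a trailer. The function `parse_chunked` and the trailer name `GeoRocket-Unmerged-Chunks` used in the sample data are assumptions made for illustration; please check the GeoRocket documentation for the actual name.

```python
def parse_chunked(raw: bytes):
    """Decode an HTTP/1.1 chunked message body; return (body, trailers)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = raw.index(b"\r\n", pos)
        # The chunk size is hexadecimal; optional chunk extensions are ignored.
        size = int(raw[pos:line_end].split(b";", 1)[0].strip(), 16)
        pos = line_end + 2
        if size == 0:
            break  # the zero-sized chunk marks the end of the data
        body += raw[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF

    trailers = {}
    # Trailer fields follow the last chunk and end with an empty line.
    while True:
        line_end = raw.index(b"\r\n", pos)
        line = raw[pos:line_end]
        pos = line_end + 2
        if not line:
            break
        name, _, value = line.partition(b":")
        trailers[name.decode("ascii").strip().lower()] = value.decode("ascii").strip()
    return bytes(body), trailers


# Sample response body; the trailer name is an assumption (see above).
raw = b"5\r\nhello\r\n6\r\n world\r\n0\r\nGeoRocket-Unmerged-Chunks: 2\r\n\r\n"
body, trailers = parse_chunked(raw)
unmerged = int(trailers.get("georocket-unmerged-chunks", "0"))
```

Field names are lower-cased because HTTP field names are case-insensitive. A missing trailer is treated as zero skipped chunks here, which is a design choice of this sketch.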
Usability: Import progress
The command-line interface (CLI) now shows beautiful and detailed information on the progress while importing files into GeoRocket. Watch the following screencast to see the new feature in action:
Detailed list of changes
- Add low-latency optimistic merging
- Add indexer for xAL 2.0 addresses
- Improve query performance
- Improve indexer performance
- Improve scalability during import/indexing
- Add cache for indexable chunks to reduce network load
- Add possibility to POST compressed files (GZIP)
- Add support for multiple Elasticsearch hosts (i.e. Elasticsearch cluster)
- Add possibility to automatically update the list of Elasticsearch hosts
- Return number of unmerged chunks in HTTP trailer
- Compress communication between GeoRocket and Elasticsearch (configurable)
- Enable snappy compression for MongoDB connection
- Support YAML syntax in environment variables
- Log memory info on startup
- Display progress while importing
- Print metrics at the end of import process
- Compress communication with the GeoRocket server
- Add options to enable low-latency optimistic merging
- Print number of unmerged chunks
- Improve usability of server API
- Add possibility to enable low-latency optimistic merging
- Add API to get number of unmerged chunks
- Import a file only after it has been written/closed completely
- NOT queries with embedded …
- Add missing documentation for scrolling
- Update Vert.x to 3.5.3
- Upgrade Elasticsearch to 6.3.2
- Upgrade Gradle Wrapper to 4.10
- Reduce log output
The new version is recommended for all users. Try GeoRocket 1.3.0 while it’s still hot! 🔥
If you have questions, ideas, or comments regarding GeoRocket or any of our other services, feel free to contact me.
Posted by Michel Krämer
on 17 September 2018