GeoRocket 1.3.0

Although the previous version was released only two months ago, the all-new GeoRocket 1.3.0 comes with a large number of new features, fixes, and updates. We particularly focused on performance and scalability to make this the fastest version available. We also put a lot of work into usability and ran extensive tests on Amazon Web Services (AWS) with huge datasets to further improve GeoRocket’s stability and its suitability for real-world applications.

Highlights

There are two features in this release that we want to highlight. A complete list of changes can be found below.

Performance: Low-latency optimistic merging

Besides many performance improvements (e.g. better scalability during import and indexing, faster queries, HTTP compression), we implemented a new feature called low-latency optimistic merging, which greatly reduces the time it takes until the first chunk of a query result is sent to the client.

In GeoRocket, the result of a query is a file consisting of the merged chunks matching the query. In order to create valid output files, GeoRocket has to look at the metadata of all matching chunks before it can actually render them. The process of exporting a file is as follows:

  1. The client sends a query to GeoRocket.
  2. GeoRocket forwards the query to the index to identify matching chunks.
  3. It then loads the chunk metadata from the index and creates a valid header for the output file.
  4. GeoRocket loads the chunks from the storage back-end.
  5. Finally, it sends the rendered output file to the client.

If your data store contains heterogeneous XML data, GeoRocket has to load the XML namespaces of all matching chunks in step 3 to create an XML header containing all these namespaces. The time it takes to render the first chunk back to the client therefore directly depends on how many chunks match the query. While this is usually very fast, it does not scale arbitrarily and can take a few seconds for a multi-gigabyte dataset.
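To illustrate why step 3 depends on the number of matching chunks, here is a minimal sketch in Python. It is not GeoRocket's actual code: the function name, the root element name, and the representation of chunk metadata as plain prefix-to-URI dictionaries are assumptions made for this example, and so is the choice to reject conflicting prefixes.

```python
from typing import Dict, Iterable


def build_xml_header(chunk_namespaces: Iterable[Dict[str, str]],
                     root: str = "root") -> str:
    """Build an opening root tag that declares the namespaces of ALL chunks.

    Hypothetical helper: every metadata record must be visited before the
    header can be written, so the time to the first byte grows with the
    number of matching chunks.
    """
    merged: Dict[str, str] = {}
    for namespaces in chunk_namespaces:
        for prefix, uri in namespaces.items():
            # Assumption of this sketch: a prefix bound to two different
            # URIs cannot be expressed in one header and is rejected.
            if merged.setdefault(prefix, uri) != uri:
                raise ValueError(f"conflicting namespace for prefix {prefix!r}")
    attrs = "".join(
        f' xmlns:{prefix}="{uri}"' if prefix else f' xmlns="{uri}"'
        for prefix, uri in sorted(merged.items())
    )
    return f"<{root}{attrs}>"


# Two chunks with different namespaces: both must be known up front.
header = build_xml_header([
    {"gml": "http://www.opengis.net/gml"},
    {"bldg": "http://www.opengis.net/citygml/building/2.0"},
])
```

The loop over all metadata records is the part that low-latency optimistic merging avoids.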

However, if you know that your data store contains homogeneous data or if your query matches homogeneous chunks (which is the case for most datasets and queries we’ve seen in production), you can enable low-latency optimistic merging to skip step 3 completely. GeoRocket will then render the first chunk to the client as soon as it has been loaded from the storage back-end, which usually happens within a few milliseconds.

Low-latency optimistic merging always works for GeoJSON data. So, if you only deal with GeoJSON, you can leave this feature enabled all the time. If you work with XML, you should make sure your data is homogeneous and all chunks use the same XML namespaces. If they don’t, the chunks that cannot be merged will be skipped.
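The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not GeoRocket's implementation: the class name, the (namespaces, body) chunk shape, and the rule that a chunk fits if all its namespaces are declared identically in the header are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical chunk shape: (namespace declarations, XML body)
Chunk = Tuple[Dict[str, str], str]


class OptimisticMerger:
    """Streams chunks immediately and skips those that do not fit the header."""

    def __init__(self, root: str = "root") -> None:
        self.root = root
        self.unmerged = 0  # number of chunks that had to be skipped
        self._declared: Optional[Dict[str, str]] = None

    def merge(self, chunks: Iterable[Chunk]) -> Iterator[str]:
        for namespaces, body in chunks:
            if self._declared is None:
                # The header is derived from the first chunk only, so it
                # can be emitted without looking at any other chunk.
                self._declared = dict(namespaces)
                attrs = "".join(
                    f' xmlns:{p}="{u}"' if p else f' xmlns="{u}"'
                    for p, u in sorted(self._declared.items())
                )
                yield f"<{self.root}{attrs}>"
            # Assumed compatibility rule: every namespace the chunk needs
            # is already declared with the same URI.
            if all(self._declared.get(p) == u for p, u in namespaces.items()):
                yield body
            else:
                self.unmerged += 1
        if self._declared is not None:
            yield f"</{self.root}>"


gml = {"gml": "http://www.opengis.net/gml"}
merger = OptimisticMerger()
output = list(merger.merge([
    (gml, "<gml:a/>"),
    ({"x": "urn:example:x"}, "<x:b/>"),  # does not fit the header, skipped
    (gml, "<gml:c/>"),
]))
```

Because the output is a generator, the header and the first chunk are available before the remaining chunks have been loaded. The price is that incompatible chunks can only be counted, not merged.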

Low-latency optimistic merging is available through the HTTP interface and as a command-line parameter. The number of skipped chunks (if there are any) will be returned through an HTTP trailer or written to the standard error stream, respectively.
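HTTP trailers are sent after the last chunk of a chunked response, which is why the number of skipped chunks can be reported there even though the header has long been sent. Some HTTP clients do not expose trailers, so the following Python sketch decodes a raw chunked message body by hand. The trailer name GeoRocket-Unmerged-Chunks used in the example is an assumption; please check the user documentation for the exact name.

```python
from typing import Dict, Tuple


def parse_chunked(raw: bytes) -> Tuple[bytes, Dict[str, str]]:
    """Decode a chunked HTTP/1.1 message body and return (payload, trailers)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = raw.index(b"\r\n", pos)
        # The size line may carry chunk extensions after ';'; ignore them.
        size = int(raw[pos:line_end].split(b";", 1)[0], 16)
        pos = line_end + 2
        if size == 0:
            break  # the zero-sized chunk ends the payload
        body += raw[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    trailers: Dict[str, str] = {}
    while True:
        line_end = raw.index(b"\r\n", pos)
        line = raw[pos:line_end]
        pos = line_end + 2
        if not line:
            break  # an empty line ends the trailer section
        name, _, value = line.partition(b":")
        trailers[name.decode("ascii").strip().lower()] = value.decode("ascii").strip()
    return bytes(body), trailers


# Assumed trailer name, for illustration only.
raw = b"5\r\nhello\r\n0\r\nGeoRocket-Unmerged-Chunks: 3\r\n\r\n"
payload, trailers = parse_chunked(raw)
unmerged = int(trailers.get("georocket-unmerged-chunks", "0"))
```

Header names are case-insensitive, so the sketch stores them in lower case before looking them up.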

Usability: Import progress

The command-line interface (CLI) now shows beautiful and detailed information on the progress while importing files into GeoRocket. Watch the following screencast to see the new feature in action:

Detailed list of changes

New features

Command-line application

  • Display progress while importing
  • Print metrics at the end of import process
  • Compress communication with the GeoRocket server
  • Add options to enable low-latency optimistic merging
  • Print number of unmerged chunks

Server API

  • Improve usability of server API

Client API

  • Add possibility to enable low-latency optimistic merging
  • Add API to get number of unmerged chunks

Bug fixes

  • Import a file only after it has been written/closed completely
  • Fix NOT queries with embedded EQ clause
  • Add missing documentation for scrolling

Internal changes

  • Update Vert.x to 3.5.3
  • Upgrade Elasticsearch to 6.3.2
  • Upgrade Gradle Wrapper to 4.10
  • Reduce log output

More information

The new version is recommended for all users. Try GeoRocket 1.3.0 while it’s still hot! 🔥

🚀 https://georocket.io/try

For a complete list of features, visit our website. There you will also find the user documentation and other information.

If you have questions, ideas, or comments regarding GeoRocket or any of our other services, feel free to contact us.


Posted by Michel Krämer
on September 17th, 2018.