Performance considerations for a database system for large three dimensional geomodels

Georg Semmler, Paul Gabriel and Helmut Schaeben (2017)
in: 2017 Ring Meeting, pages 1--8, ASGA

Abstract

In the last few years an increasing number of three-dimensional geomodels has been produced by various organizations, such as geological surveys, geological exploration agencies, and oil, gas and mining companies. Not only is the number of produced geomodels growing, but, more importantly, the level of detail and the size of the models are growing, too. For instance, single vector-based geometries contained in regular geomodels can reach a size of more than 20 million vertices and more than 45 million triangles (resulting gOcad ASCII file size: approx. 3 GB). Grid-based geometries easily grow larger than 1 billion cells (resulting gOcad file size: approx. 4 GB). Last year we presented a prototype of GST 3, a new kind of database that is able to handle those large geomodels. In this paper we evaluate the performance of the new system in comparison to existing systems such as the array database Rasdaman, our old system GST 2, and PostGIS, the spatial extension to PostgreSQL. We compare the runtime performance of similar actions, like inserting a geometry into or loading it from the database, and the hard disk space required for storing different kinds of geometries. Another important part of building a performant database system for geomodels is an optimized transport layer. As a second part of our performance comparison we compare the transport strategies used by these systems. On the one hand, PostGIS and GST 2 use a simple system based on passing whole geometries as encoded binary or even text to the called functionality. On the other hand, Rasdaman and GST 3 use more complex mechanisms that pass geometries using a stream-based approach. This allows both systems to handle geometries that are larger than the available main memory. Furthermore, the processing of those geometries can start before the whole geometry is transferred.

BibTeX Reference

@INPROCEEDINGS{Semmler2017,
    author = { Semmler, Georg and Gabriel, Paul and Schaeben, Helmut },
     title = { Performance considerations for a database system for large three dimensional geomodels },
 booktitle = { 2017 Ring Meeting },
      year = { 2017 },
     pages = { 1--8 },
 publisher = { ASGA },
  abstract = { In the last few years an increasing number of three-dimensional geomodels has been produced by various organizations, such as geological surveys, geological exploration agencies, and oil, gas and mining companies. Not only is the number of produced geomodels growing, but, more importantly, the level of detail and the size of the models are growing, too. For instance, single vector-based geometries contained in regular geomodels can reach a size of more than 20 million vertices and more than 45 million triangles (resulting gOcad ASCII file size: approx. 3 GB). Grid-based geometries easily grow larger than 1 billion cells (resulting gOcad file size: approx. 4 GB). Last year we presented a prototype of GST 3, a new kind of database that is able to handle those large geomodels. In this paper we evaluate the performance of the new system in comparison to existing systems such as the array database Rasdaman, our old system GST 2, and PostGIS, the spatial extension to PostgreSQL. We compare the runtime performance of similar actions, like inserting a geometry into or loading it from the database, and the hard disk space required for storing different kinds of geometries. Another important part of building a performant database system for geomodels is an optimized transport layer. As a second part of our performance comparison we compare the transport strategies used by these systems. On the one hand, PostGIS and GST 2 use a simple system based on passing whole geometries as encoded binary or even text to the called functionality. On the other hand, Rasdaman and GST 3 use more complex mechanisms that pass geometries using a stream-based approach. This allows both systems to handle geometries that are larger than the available main memory. Furthermore, the processing of those geometries can start before the whole geometry is transferred. }
}