How to retrieve raster data in GeoMesa with a single query given spatiotemporal search criteria

According to How are GeoTIFFs persisted in GeoMesa?,
GeoMesa's raster data is indexed by spatial extent only.
Can I also save time info with the raster data? Otherwise, for each raster I will have to persist another record holding its time info. Then, to retrieve my rasters with a spatiotemporal query (is WMS capable of this? According to [1] it seems to be), I will have to retrieve both records; this means x rasters ==> x+1 GeoMesa hits (retrievals).
[1] http://docs.geoserver.org/stable/en/user/services/wms/time.html

Currently, no, you cannot save time info with the raster data.
WMS does support time and time-range queries, but that capability isn't wired up completely in the AccumuloRasterStore.
As an alternative, GeoMesa does allow storing blobs and creating pointers to them along the way. The GeoMesa BlobStore doesn't allow WMS access, but it is a nifty, extensible capability.
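For reference, the time-filtered WMS request described in [1] is just a GetMap call with a TIME parameter. Below is a minimal sketch using Python's requests library; the GeoServer endpoint, layer name, bounding box, and time interval are placeholders, not values from the question.

```python
# Hypothetical sketch: a WMS 1.3.0 GetMap request with a TIME filter,
# per the GeoServer time-dimension docs [1]. Endpoint and layer are placeholders.
import requests

params = {
    "service": "WMS",
    "version": "1.3.0",
    "request": "GetMap",
    "layers": "myworkspace:myraster",   # placeholder layer
    "styles": "",
    "crs": "EPSG:4326",
    "bbox": "-90,-180,90,180",          # lat/lon axis order for EPSG:4326 in WMS 1.3.0
    "width": 512,
    "height": 256,
    "format": "image/png",
    "time": "2016-01-01T00:00:00Z/2016-01-31T23:59:59Z",  # ISO 8601 interval
}

response = requests.get("http://localhost:8080/geoserver/wms", params=params)
with open("raster.png", "wb") as f:
    f.write(response.content)
```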

Related

How to store JSON in DB without schema

I have a requirement to design an app that stores JSON via a REST API. I don't want to put a limitation on JSON size (number of keys, etc.). I see that MySQL supports storing JSON, but we have to create a table/schema first and then store the records.
Is there any way to store JSON in some type of DB and query the data by keys?
EDIT: I don't want to use any in-memory DB like Redis.
Use ElasticSearch. In addition to schema-less JSON, it supports fast search.
The tagline of ElasticSearch is "You know, for search".
It is built on top of the text-indexing library Apache Lucene.
The advantages of using ElasticSearch are:
Scales to clusters holding petabytes of data.
Fully open source. No cost to pay.
Enterprise support available with a Platinum license.
Comes with additional benefits such as analytics using Kibana.
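As a minimal sketch of storing and querying arbitrary JSON with the official Python client (syntax below follows elasticsearch-py 8.x); the index name and document are just placeholders:

```python
# Minimal sketch with the official elasticsearch-py client (8.x style API);
# index name and document contents are placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Index an arbitrary JSON document -- no schema has to be declared up front.
doc = {"user": "alice", "payload": {"a": 1, "b": [1, 2, 3]}}
es.index(index="documents", id="doc-1", document=doc)

# Query by a key.
hits = es.search(index="documents", query={"match": {"user": "alice"}})
for hit in hits["hits"]["hits"]:
    print(hit["_source"])
```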
I believe NoSQL is the best solution here, e.g. MongoDB. I have tested MongoDB; it looks good and has a Python module that makes it easy to interact with. For a quick overview of the pros see https://www.studytonight.com/mongodb/advantages-of-mongodb
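For comparison, a minimal MongoDB sketch with pymongo, assuming a local server; the database and collection names are placeholders:

```python
# Minimal pymongo sketch against a local MongoDB; names are placeholders.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
collection = client["appdb"]["documents"]

# Insert an arbitrary JSON-like document -- no schema required.
collection.insert_one({"user": "alice", "payload": {"a": 1, "b": [1, 2, 3]}})

# Query by key, including nested keys via dot notation.
for doc in collection.find({"payload.a": 1}):
    print(doc)
```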
I've had great results with Elasticsearch, so I second this approach. One question to ask yourself is how you plan to access the JSON data once it is in a repository like Elasticsearch: will you simply store the JSON doc, or will you flatten out the properties so that they can be individually aggregated? But yes, it is indeed fully scalable by increasing your compute capacity via instance size, expanding your disk space, or implementing index sharding if you have billions of records in a single index.

Storing GeoTIFFs in GeoMesa

I want to store GeoTIFFs in GeoMesa and retrieve them with WMS. The idea is to save them in the BlobStore according to http://www.geomesa.org/documentation/user/blobstore.html, parsing their spatial info with GDAL (http://www.gdal.org/formats_list.html).
But it seems that you cannot query data persisted in the BlobStore with WMS (How to retrieve raster data in GeoMesa with a single query given spatiotemporal search criteria).
Moreover, what if I want to keep temporal info for my GeoTIFFs? Where should I store it?
The blobstore does not support WMS. The blobstore has a pluggable indexing module which allows temporal information to be handled in an application-specific manner.
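As one hedged illustration of such an application-specific index record, the spatial extent of a GeoTIFF can be read with the GDAL Python bindings and stored alongside whatever timestamp your application attaches to the scene. The record layout and the timestamp source below are assumptions, not part of the blobstore API:

```python
# Hypothetical sketch: derive a GeoTIFF's bounding box with GDAL and pair it
# with an application-supplied timestamp. Record layout is an assumption.
from datetime import datetime, timezone
from osgeo import gdal

def geotiff_index_record(path, acquired_at):
    ds = gdal.Open(path)
    gt = ds.GetGeoTransform()                 # (origin_x, px_w, 0, origin_y, 0, px_h)
    min_x = gt[0]
    max_y = gt[3]
    max_x = min_x + gt[1] * ds.RasterXSize
    min_y = max_y + gt[5] * ds.RasterYSize    # gt[5] is negative for north-up rasters
    return {
        "path": path,
        "bbox": (min_x, min_y, max_x, max_y),
        "time": acquired_at.isoformat(),
    }

record = geotiff_index_record("scene.tif", datetime(2016, 1, 1, tzinfo=timezone.utc))
```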

Storing/versioning large amount of JSON objects

Our cloud service deals with chunks of JSON data (items) that are being manipulated all the time. An item can change as often as every second.
At the moment an item is a JSON object that is modified all the time. Now we need to implement versioning of these items as well.
Basically, every time a request to modify the object arrives, it is modified and saved to the DB, and then we also need to store that version somewhere, so that later on you can say "give me version 345 of this item".
My question is: what would be the ideal way to store this history? Mind you, we do not need to query or alter the data once saved; all we need is to load it when necessary (0.01% of the time) - the data is basically a meaningless blob.
We are researching multiple approaches:
Simple text files (file system)
Cloud storage (e.g. S3)
Version control (e.g. Git)
Database (any)
Vault (for example HashiCorp Vault)
The main problem is that since items are updated every second, we end up with a lot of blobs. Consider 100 items, updated every second - that's 8,640,000 records in a single day. Not to mention 100 rps for the DB.
Do you have any recommendations as to what would be the optimal approach? We need it to be scalable, fast, and reliable; encryption out of the box would be a great plus.
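As one possible sketch of the cloud-storage option listed above (the bucket name and key layout are assumptions), each saved version can become an immutable object under a per-item prefix:

```python
# Hypothetical sketch: each version is written as an immutable S3 object.
# Bucket name and key layout are assumptions.
import json
import boto3

s3 = boto3.client("s3")
BUCKET = "item-versions"  # placeholder bucket

def save_version(item_id, version, item):
    key = f"items/{item_id}/v{version:010d}.json"
    s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(item).encode("utf-8"))

def load_version(item_id, version):
    key = f"items/{item_id}/v{version:010d}.json"
    obj = s3.get_object(Bucket=BUCKET, Key=key)
    return json.loads(obj["Body"].read())
```

If 100 requests per second is too chatty, several versions of an item could be batched into a single object before uploading, trading a little read-side complexity for a much lower write rate.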

Determining size of a JSON document stored in DocumentDB

I'm developing a partitioning strategy for a multi-tenant application running on DocumentDB.
Since each collection only allows 10 GB of storage, I am attempting to calculate how many documents each of my tenants can store, so I can come up with the number of tenants I can place in a collection.
I have a sample JSON document that represents a common document a tenant may store. Using Document Explorer in the Azure Portal does not tell me what the size of one of these documents is on disk (just a general graph of usage percentage).
I'm also using DocumentDB Studio and am unable to determine document sizes there. I can use Notepad locally, but depending on my encoding settings (ANSI, etc.) I get varying results.
My questions are:
Is there an accurate way to determine the size a JSON document will be stored as within DocumentDB, so that I can properly calculate the resource usage of my application?
Is there also a way to get back the size of a document or group of documents via a query against a collection?
Yes - calculate the size of the document as it comes back in a query response, so that all the system properties (e.g. _rid, _ts) are included. You will want to use UTF-8 encoding to get the correct size.
You will also want to factor in an additional ~10% for indexing storage costs.
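In other words, serialize the document exactly as a query returns it (system properties included) and measure its UTF-8 byte length. A minimal sketch, with a made-up document and the ~10% indexing allowance mentioned above:

```python
# Minimal sketch: UTF-8 size of a queried document plus a rough 10% indexing allowance.
import json

def document_size_bytes(doc):
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))

doc = {"id": "1", "name": "tenant-doc", "_rid": "abc==", "_ts": 1458000000}  # made-up document
size = document_size_bytes(doc)
print(size, size * 1.10)  # raw size and size with ~10% indexing overhead
```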

how to store spatial files in MySQL

What is the better way to store spatial data (say, tracks) in MySQL:
internally, or as references to external flat files?
MySQL has spatial extensions for storing geographic objects (objects with geometric attributes); more detail is available there.
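As a rough illustration of the internal option (the connection settings and table are placeholders, and the ST_* functions assume MySQL 5.7+), a track can be stored in a geometry column and read back as WKT:

```python
# Hypothetical sketch with PyMySQL: store a track as a LINESTRING geometry column.
import pymysql

conn = pymysql.connect(host="localhost", user="app", password="secret", database="gis")
with conn.cursor() as cur:
    cur.execute(
        "CREATE TABLE IF NOT EXISTS tracks ("
        "  id INT PRIMARY KEY AUTO_INCREMENT,"
        "  name VARCHAR(100),"
        "  geom LINESTRING NOT NULL"
        ")"
    )
    cur.execute(
        "INSERT INTO tracks (name, geom) VALUES (%s, ST_GeomFromText(%s))",
        ("morning run", "LINESTRING(30 10, 30.1 10.2, 30.2 10.4)"),
    )
    cur.execute("SELECT name, ST_AsText(geom) FROM tracks")
    print(cur.fetchall())
conn.commit()
```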
I would recommend against MySQL if you want to store it as explicitly spatial information. Instead I would recommend PostgreSQL/PostGIS if you want to stay with an open-source DB. MySQL barely implements any of the spatial functionality; if you read the docs closely, most spatial functions are yet to be implemented.
If you don't care about explicitly spatial information, then go ahead and store it directly in the DB.
If you give some more background on what you want to do, we might be able to help more.
The "better way" to store data depends on several factors which you, yourself need to consider:
Are the files rediculously large? +50MB? MySql can time out on long transactions.
Are you working on a closed network environment where the file system is secure and controlled?
Do you plan only to serve the raw files? There's no point in processing them into MySql format only to re-process them on the way out.
Is it expected that 'non technical' people are going to want to access this data? 'non technical' people generally don't like obfuscated information.
Do you have the capability in your applciation (if you have an applicaiton) to read the spatial data in the format that MySql stores it in? There's no point in processing and storing a .gpx or .shp file into MySql format if you can't read it back from there.
Do you have a system / service that will control the addition / removal / modification of the file structure and corresponding database records? Keeping a database and file system in sync is not an easy task. Especially when you consider the involvement of 'non technical' people.