Querying unpredictable JSON objects from Elasticsearch using Spring Boot

I am creating a Spring Boot application which will interact with Elasticsearch using spring-data. The problem is that my data in Elasticsearch is unpredictable: there can be slight changes in the fields, such as additional fields, or entirely new fields can appear in the JSON. Please guide me toward a solution that addresses this. Using a normal repository does not seem to work because I don't have a defined JSON format. Your guidance will be highly appreciated.

You need to provide a bit more data on your case.
Normally, when you use @Field annotations, introducing or dropping a simple or object field should not be a problem at all, since spring-data-elasticsearch updates the mapping when you save through an ElasticsearchRepository. In some cases, e.g. introducing a parent-child relationship, you would need to drop and recreate the index, but this can also be done programmatically if needed.
If you need an advanced mapping that also changes dynamically, then you need to build and execute a mapping update request from your code on save (in a custom repository).
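As a minimal sketch of the simple case, assuming a recent spring-data-elasticsearch (the Event document, index name, and field names here are invented for illustration): known fields get @Field annotations, while a Map absorbs any unpredictable keys, which Elasticsearch maps dynamically on save.

    import java.util.Map;

    import org.springframework.data.annotation.Id;
    import org.springframework.data.elasticsearch.annotations.Document;
    import org.springframework.data.elasticsearch.annotations.Field;
    import org.springframework.data.elasticsearch.annotations.FieldType;
    import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

    // Known, stable fields are mapped explicitly; anything unpredictable
    // lands in `extras`, which Elasticsearch maps dynamically on save.
    @Document(indexName = "events")
    public class Event {

        @Id
        private String id;

        @Field(type = FieldType.Text)
        private String name;

        // catch-all for fields we cannot predict up front
        private Map<String, Object> extras;

        // getters and setters omitted for brevity
    }

    interface EventRepository extends ElasticsearchRepository<Event, String> {
    }

Saving an Event through EventRepository then persists whatever keys extras happens to contain, and the index mapping picks them up dynamically.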

Related

Best way to validate json response in python 3

I am using Python 3 for functional testing of a bunch of REST endpoints,
but I cannot figure out the best way to validate the JSON response (verifying the type, plus required, missing, and additional fields).
I thought of the options below:
1. Write custom code and validate the response while converting the data into Python class objects.
2. Validate using a JSON Schema.
Option 1 would be difficult to maintain and would need separate validation functions for each of the data models.
Option 2 I like, but I don't want to write a schema for each endpoint in a separate file/object. Is there a way to put them all in a single object, like a Swagger YAML file? That would be easier to maintain.
I would like to know which option is best, and whether there are other, better options or libraries available.
I've been through the same process, but validating REST requests and responses with Java. In the end I went with JSON Schema (there's an equivalent Python implementation at https://pypi.python.org/pypi/jsonschema) because it was simple and powerful, and hand-crafting the validation for anything but a trivial payload soon became a nightmare. Also, reading a JSON Schema file is easier than reasoning about a long list of validation statements.
It's true you need to define the schema in a separate file, but this proved to be no big deal. And, if your endpoints share some common features you can modularise your schemas and reuse common parts. There's a good tutorial at Understanding JSON Schema.
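For the Java side of that experience, here is a minimal sketch assuming the everit-org/json-schema library (one of several Java implementations; the answer does not name a specific one). It checks types, required fields, and unexpected extra fields in one call:

    import org.everit.json.schema.Schema;
    import org.everit.json.schema.ValidationException;
    import org.everit.json.schema.loader.SchemaLoader;
    import org.json.JSONObject;

    public class ResponseValidator {
        public static void main(String[] args) {
            // schema for one endpoint; in practice you would load this from a file
            JSONObject rawSchema = new JSONObject(
                "{\"type\": \"object\","
              + " \"properties\": {\"id\": {\"type\": \"integer\"},"
              + "                  \"name\": {\"type\": \"string\"}},"
              + " \"required\": [\"id\", \"name\"],"
              + " \"additionalProperties\": false}");
            Schema schema = SchemaLoader.load(rawSchema);

            try {
                // throws on type mismatches, missing fields, and extra fields
                schema.validate(new JSONObject("{\"id\": 1, \"name\": \"foo\"}"));
                System.out.println("response is valid");
            } catch (ValidationException e) {
                e.getAllMessages().forEach(System.out::println);
            }
        }
    }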

How to put json object in database automatically?

I have a very large JSON object that I want to put in a NoSQL database.
I would like to know:
First, how do I generate the database schema based on that JSON object?
Second, is there a way to put this object into the database automatically, without manually specifying which value (in the JSON object) goes in which column (in the database)?
I hope I was clear enough. Thanks!
Since you haven't specified which NoSQL database you're using in particular, for convenience, I'll assume you're using MongoDB when I talk about things that are implementation specific.
First off, you should know that NoSQL databases are by nature "schema-less". You could still implement your own schema (in your app, not the DB), but that's optional, and mostly done just for validation purposes or to let future developers better understand the planned structure of your data. Read the Dynamic Schemas section in this article to learn more. Here is an SO answer explaining how you would do that in Mongoose, and here is the official guide/doc for it.
Second, NoSQL databases don't work in terms of columns or rows. Rather, you need to think in terms of collections and documents. So to answer your question: yes, when you have a JSON object, you shove it in directly (after applying any formatting required by a schema, if you've implemented one as above). You don't enter the data value by value (unless you've intentionally set it up to do so).
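As an illustrative sketch with MongoDB's Java sync driver (the connection string, database, and collection names are assumptions), inserting an arbitrary JSON object really is this direct:

    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import org.bson.Document;

    public class InsertJson {
        public static void main(String[] args) {
            String json = "{\"name\": \"example\", \"tags\": [\"a\", \"b\"], \"meta\": {\"size\": 42}}";

            try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
                MongoCollection<Document> items =
                    client.getDatabase("mydb").getCollection("items");
                // no schema, no column mapping: the whole document is stored as-is
                items.insertOne(Document.parse(json));
            }
        }
    }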
It sounds to me like you need to strengthen your fundamental understanding of how NoSQL works, as you seem to be mixing in concepts that belong to other DBMSs. Here is a neat slideshow to get you started, and the article I linked above also gives you a decent introduction.
After you're done, consider installing MongoDB or something similar and just playing around with the command line interface to get a good hang of it.

Dynamic JSON file vs API

I am designing a system with 30,000 or so objects and can't decide between two approaches: either have a JSON file precomputed for each one and fetch the data by pointing to the URL of the file (I think Twitter does something similar), or have a PHP/Perl/whatever script that produces the JSON object on the fly when requested, say from a database, and sends it back. Is one approach better suited than the other? I guess if it takes a long time to generate the JSON data, it is better to have the JSON files pre-generated. But what if generating is as quick as accessing a database? Although I suppose one could have a dedicated table in the database specifically for that. The data doesn't change very often, so updating is not a constant concern; in that respect the data is static for all intents and purposes.
Anyway, any thoughts would be much appreciated!
Alex
You might want to try MongoDB, which retrieves the objects as JSON and is highly scalable and easy to set up.
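A rough sketch of what that could look like with the MongoDB Java driver (the ids, names, and connection string are invented): each of the 30,000 objects lives as one document and comes back out as JSON directly, so you get on-the-fly serving without hand-rolling serialization.

    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import com.mongodb.client.model.Filters;
    import org.bson.Document;

    public class FetchJson {
        public static void main(String[] args) {
            try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
                MongoCollection<Document> objects =
                    client.getDatabase("mydb").getCollection("objects");
                // look up one precomputed object by its id
                Document doc = objects.find(Filters.eq("_id", "object-123")).first();
                if (doc != null) {
                    // serve this string straight to the client
                    System.out.println(doc.toJson());
                }
            }
        }
    }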

Mix of MySQL and Mongodb in an application

I'm writing a web application using PHP/Symfony2/Doctrine2 and am just finishing up the design of the database. We have to import objects (e.g. Projects, Vendors) that come from different customers with a wide variety of fields: some customers have 2 fields in the Project object and some have 20. So I was thinking about storing them in MongoDB, since that seems like a good use for it.
Symfony2 supports both ORM and ODM, so that shouldn't be a problem. My question is how to ensure the integrity of the data across both databases, because objects in my MySQL DB need to somehow be linked to the objects in MongoDB.
Are there any better solutions out there? Any help/thoughts would be appreciated.
Bulat implemented a Doctrine extension while we were at OpenSky for handling references between MongoDB documents and MySQL records, which is currently sitting in their (admittedly outdated) fork of the DoctrineExtensions project. You'll want to look at either the orm2odm_references or openskyfork branches. For this to be usable in your project, you'll probably want to port it over to a fresh fork of DoctrineExtensions, or simply incorporate the code into your application. Unfortunately, there is no documentation apart from the code itself.
Thankfully, there is also a cookbook article on the Doctrine website that describes how to implement this from scratch. Basically, you rely on an event listener to replace your property with a reference (i.e. an uninitialized Proxy object) from the other object manager, and the natural behavior of Proxy objects, lazily loading themselves on access, takes care of the rest. Provided the event listener is a service, you can easily inject both the ORM and ODM object managers into it.
The only integrity guaranteed by this model is that you'll receive exceptions when trying to hydrate a bad reference, which is probably more than you'd get by simply storing an ID of the other database and querying manually.
So the way we solved this problem was by moving to Postgres, which has a data type called hstore that acts like a NoSQL column. It works pretty well.
UPDATE
Now that I'm looking back, go with jsonb instead of json or hstore, as it allows you to have more of a data structure than a plain key-value store.
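As a hedged sketch of the jsonb approach over plain JDBC (the table, columns, and credentials are made up; it assumes the Postgres JDBC driver is on the classpath), the customer-specific fields all live in one jsonb column yet remain queryable:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class JsonbExample {
        public static void main(String[] args) throws Exception {
            String url = "jdbc:postgresql://localhost:5432/mydb";
            try (Connection conn = DriverManager.getConnection(url, "user", "password")) {
                // customer-specific fields go into a single jsonb column
                try (PreparedStatement insert = conn.prepareStatement(
                        "INSERT INTO projects (name, attrs) VALUES (?, ?::jsonb)")) {
                    insert.setString(1, "Acme rollout");
                    insert.setString(2, "{\"budget\": 100000, \"region\": \"EMEA\"}");
                    insert.executeUpdate();
                }
                // query inside the document with Postgres's ->> operator
                try (PreparedStatement query = conn.prepareStatement(
                        "SELECT name FROM projects WHERE attrs->>'region' = ?")) {
                    query.setString(1, "EMEA");
                    try (ResultSet rs = query.executeQuery()) {
                        while (rs.next()) {
                            System.out.println(rs.getString("name"));
                        }
                    }
                }
            }
        }
    }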

Processing a MySQL DB and XML Hybrid into a Solr Index

Problem:
I have a table in MySQL that has a few normal fields and one text field that holds XML. I need to use the Solr Data Import Handler (DIH) to process this table into a Solr index; however, the XML field needs to be parsed into several separate Solr fields.
Questions:
1. Is it possible to do this without writing a custom Transformer? If so, how? Can I use XPathEntityProcessor with my MySQL database as the data source?
2. If I write a custom Transformer, how exactly do I configure it in data-config.xml?
3. I am using an older version of Solr (1.4.1), so can I just drop a jar containing the new class into my Solr web app?
The thing I am quite unsure about is how I need to configure the data-config.xml to do this. If anyone has any examples, please share! Thanks.
My suggestion is to write a small program that selects the data from the database, parses the XML field, and then inserts the complete document into the Solr index.
The SolrJ Java APIs are really easy to use. The hardest part is parsing the XML, but that is a far easier challenge to tackle in your own code than inside DIH, and it is easier to test.
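A minimal sketch of that program (table, column, and XML element names are invented; the client class shown is the modern HttpSolrClient, whereas SolrJ for Solr 1.4.1 shipped CommonsHttpSolrServer instead):

    import java.io.StringReader;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;
    import org.w3c.dom.Document;
    import org.w3c.dom.NodeList;
    import org.xml.sax.InputSource;

    public class DbToSolr {
        public static void main(String[] args) throws Exception {
            DocumentBuilder xml = DocumentBuilderFactory.newInstance().newDocumentBuilder();

            try (SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build();
                 Connection conn = DriverManager.getConnection("jdbc:mysql://localhost/mydb", "user", "pass");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT id, title, payload_xml FROM mytable")) {

                while (rs.next()) {
                    SolrInputDocument doc = new SolrInputDocument();
                    // plain columns map straight to Solr fields
                    doc.addField("id", rs.getString("id"));
                    doc.addField("title", rs.getString("title"));

                    // parse the XML column and flatten chosen elements into extra fields
                    Document parsed = xml.parse(
                        new InputSource(new StringReader(rs.getString("payload_xml"))));
                    doc.addField("author", textOf(parsed, "author"));
                    doc.addField("summary", textOf(parsed, "summary"));

                    solr.add(doc);
                }
                solr.commit();
            }
        }

        // text of the first matching element, or null if absent
        private static String textOf(Document d, String tag) {
            NodeList nodes = d.getElementsByTagName(tag);
            return nodes.getLength() > 0 ? nodes.item(0).getTextContent() : null;
        }
    }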