How to create an Elasticsearch index that emulates my MySQL database

I am new to Elasticsearch and I am having a tough time switching from MySQL to Elasticsearch.
My MySQL tables look like this:
table: test_request
+---------+-------------+--------------+-----------+------------+-----------+
| test_id | device_name | ip_address | user_name | time_stamp | show_flag |
+---------+-------------+--------------+-----------+------------+-----------+
| 1 | d1 | 0.0.0.0 | admin | | Y |
+---------+-------------+--------------+-----------+------------+-----------+
table: test_results
+----+---------+-----+-----------------------+-------------------------+----------------------------------+-----------+
| id | test_id | cli | xml | json | another json | show_flag |
+----+---------+-----+-----------------------+-------------------------+----------------------------------+-----------+
| 1 | 1 | c1 | some xml format data | {"some":"json here"} | {"some":" another json here"} | Y |
+----+---------+-----+-----------------------+-------------------------+----------------------------------+-----------+
| 2 | 1 | c2 | some xml format data | {"some":"json here"} | {"some":" another json here"} | Y |
+----+---------+-----+-----------------------+-------------------------+----------------------------------+-----------+
| 3 | 1 | c2 | some xml format data | {"some":"json here"} | {"some":" another json here"} | Y |
+----+---------+-----+-----------------------+-------------------------+----------------------------------+-----------+
The test_id field in the test_request table and the id field in the test_results table are auto-increment. The json and another json fields are of data type JSON.
I am trying to use elasticsearch_dsl to create the index and its mappings. I have been going through the docs, but I couldn't figure out three things:
how to get the test_id to auto-increment
how to make a field of JSON data type
the best way to set up a relationship between the two tables (I partially understood that nested could help here, but I am looking for the correct way to do this)

The auto-increment id columns play the following roles in the SQL tables:
they are unique identifiers of the row
they allow rows to be linked between tables
To achieve this in Elasticsearch you don't need an auto-increment field. You can add a document to an Elasticsearch index without specifying an id, and Elasticsearch will assign a unique id to it.
For JSON fields, simply use the object datatype.
There are a few options for setting up a relation like an SQL join:
You can put the test_results as nested objects within the test_request document.
You can use a join datatype field to link test_results documents to a test_request document within the same index.
You can denormalize and store every test_result in a single document together with its test_request. It is OK that the test_request will be stored many times; Elasticsearch is primarily for searching anyway.
Which option you choose is up to you. It depends on how you are going to use your data and what kind of queries you are going to run. Can you collect all the test_results together with the test_request and store them in a single call, or do you need to store the test_request first and then successively add test_results?
Successively updating a nested field would mean reindexing the whole document every time. The join datatype is expensive to query.
Denormalization uses more space, but if the number of test_results per request is not large, it may be the best option.
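As a rough illustration of the denormalized option, the sketch below turns the sample rows from the question into one document per test_result, each carrying a copy of its test_request. It uses plain Python dictionaries rather than elasticsearch_dsl; the mapping body, the renamed another_json field, and the denormalize helper are assumptions made for this example, not part of the original setup.

```python
import json

# Sample rows copied from the question; "another json" is renamed to
# another_json here (an assumption, to avoid a space in the field name).
test_request = {"test_id": 1, "device_name": "d1", "ip_address": "0.0.0.0",
                "user_name": "admin", "time_stamp": None, "show_flag": "Y"}
test_results = [
    {"id": i, "test_id": 1, "cli": cli, "xml": "some xml format data",
     "json": '{"some":"json here"}',
     "another_json": '{"some":" another json here"}', "show_flag": "Y"}
    for i, cli in [(1, "c1"), (2, "c2"), (3, "c2")]
]

# Hypothetical mapping body: the JSON columns become "object" fields and the
# copied test_request becomes an inner object with its own properties.
MAPPING = {
    "properties": {
        "cli": {"type": "keyword"},
        "xml": {"type": "text"},
        "json": {"type": "object"},
        "another_json": {"type": "object"},
        "show_flag": {"type": "keyword"},
        "test_request": {
            "properties": {
                "test_id": {"type": "long"},
                "device_name": {"type": "keyword"},
                "ip_address": {"type": "ip"},
                "user_name": {"type": "keyword"},
                "time_stamp": {"type": "date"},
                "show_flag": {"type": "keyword"},
            }
        },
    }
}


def denormalize(request, results):
    """Return one document per test_result, each with a copy of its request."""
    docs = []
    for row in results:
        if row["test_id"] != request["test_id"]:
            continue  # result belongs to a different request
        docs.append({
            "cli": row["cli"],
            "xml": row["xml"],
            # JSON columns are parsed so they are indexed as objects, not strings.
            "json": json.loads(row["json"]),
            "another_json": json.loads(row["another_json"]),
            "show_flag": row["show_flag"],
            "test_request": dict(request),
        })
    return docs


# No id is set on the documents themselves; Elasticsearch generates one
# when a document is indexed without an explicit id.
docs = denormalize(test_request, test_results)
```

Each entry in docs could then be sent to the index as its own document. The copied test_id is kept only so that results belonging to the same request can still be grouped in a query.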

Related

How to fetch a column from another table along with my table's data in Spring Boot

I am working on a Spring project and I have created two tables.
----------------------
| Table 1 |
----------------------
| UserID |
| InstrumenName |
| Qty |
| Price |
| Date |
----------------------
----------------------
| Table 2 |
----------------------
| InstrumenName |
| LTP |
| Sector |
----------------------
When saving, I am saving the table data.
Here is the code for my controller class:
@PostMapping("/employee")
public Employee createEmployee(@RequestBody Employee employee) {
    return employeeRepository.save(employee);
}
I am following the usual structure: JPA repository, model, and controller.
Now, when I fetch this data, I want to add LTP from Table 2 for the matching instrument name.
I am very new to this, so what am I supposed to do? I have the GET code in my controller as well, but how do I add LTP to it?
Do I need to write an SQL procedure, or some business logic in Java code?
This would be my SQL query:
SELECT table1.UserID, table1.Qty,table1.InstrumenName, table1.Price,table2.LTP
FROM table1
INNER JOIN table2
ON table1.InstrumenName=table2.InstrumenName;
Any help would be appreciated.
You need a class or interface to hold the result.
If it is read-only, you can use an interface; if you also want to receive data, use a DTO (if you are on Java 16 or later, a record would be a great fit).
Please read the documentation about projections:
https://docs.spring.io/spring-data/jpa/docs/current/reference/html/#projections
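The idea of a dedicated result holder is independent of Spring. The sketch below illustrates it in Python with an in-memory SQLite database rather than Spring Data JPA: the join from the question is run as-is, and each row is mapped onto a small read-only holder type. The HoldingWithLtp name and the sample row values are hypothetical, chosen only for this example.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class HoldingWithLtp:
    """Read-only holder for one row of the join (hypothetical name)."""
    user_id: int
    qty: int
    instrument_name: str
    price: float
    ltp: float


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (UserID INTEGER, InstrumenName TEXT, Qty INTEGER,
                         Price REAL, Date TEXT);
    CREATE TABLE table2 (InstrumenName TEXT, LTP REAL, Sector TEXT);
    -- Made-up sample rows, only so the join has something to return.
    INSERT INTO table1 VALUES (1, 'ABC', 10, 99.5, '2021-01-01');
    INSERT INTO table2 VALUES ('ABC', 101.25, 'Tech');
""")

# The query from the question; the column order matches the holder's fields.
rows = conn.execute("""
    SELECT table1.UserID, table1.Qty, table1.InstrumenName,
           table1.Price, table2.LTP
    FROM table1
    INNER JOIN table2
    ON table1.InstrumenName = table2.InstrumenName
""").fetchall()

holdings = [HoldingWithLtp(*row) for row in rows]
```

In Spring Data JPA the same role is played by the projection interface or DTO described in the linked documentation; the holder only needs the five selected columns, not the full entity.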

Defining queries around a designed database

I have a database which contains a lot of data and although I was not involved in setting it up it is what I have to work with.
Within this database is somewhat of a lookup table. However, this table has no link to any other tables. It essentially takes the following form
ID | input | table_name |
-------------------------------------
1 | Movie | movie_tbl |
2 | Cartoon | cartoon_tbl |
3 | Animation | cartoon_tbl |
4 | Audio | audio_tbl |
5 | Picture | picture_tbl |
The table is a lot larger than the above, but the structure is as above. So what happens is someone visits my site. Here, they have an input field. Say they enter Movie then the above table is called to find the input with Movie. It then gets what table it needs to look in. I would imagine that the query would be something like
SELECT table_name FROM lookup_table WHERE input LIKE 'Movie';
Now that should return movie_tbl. I now know that I need to search for Movie within movie_tbl and return all the data for its row. So movie_tbl might be like this (data would be some type of data and the column names different)
ID | input | col_1 | col_2 | col_3 |
----------------------------------------------------
1 | Movie | data | data | data |
2 | Cartoon | data | data | data |
3 | Animation | data | data | data |
4 | Audio | data | data | data |
5 | Picture | data | data | data |
So now my query will be something like this
SELECT * FROM movie_tbl WHERE input LIKE 'Movie';
The tables have tens of thousands of rows. My real question is whether the above will be efficient. With the database I was given, however, I do not see any other way to do this (I can't touch the database). Is there anything I can do to make this more efficient?
Any advice appreciated
Thanks
Why are you checking for input in the second table? You have already filtered on the input in the first table:
SELECT table_name FROM lookup_table WHERE input LIKE 'Movie';
In this case you don't have to make two queries; the second one alone should suffice. Alternatively, keep only Movie data in movie_tbl, with separate tables for Cartoon, Animation, etc. Then you won't need the WHERE clause at all, just:
SELECT * FROM movie_tbl;
Second suggestion: use = instead of LIKE. There is no need for pattern matching if you know the exact input string.
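A minimal sketch of the second suggestion, using Python's sqlite3 module with an in-memory database in place of the real MySQL one. The user input is compared with = and passed as a bound parameter. A table name cannot be bound as a parameter, so the value read from lookup_table is checked against a fixed set before it is placed in the SQL text; the ALLOWED_TABLES set and the find_rows helper are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lookup_table (ID INTEGER PRIMARY KEY, input TEXT, table_name TEXT);
    CREATE TABLE movie_tbl (ID INTEGER PRIMARY KEY, input TEXT,
                            col_1 TEXT, col_2 TEXT, col_3 TEXT);
    INSERT INTO lookup_table (input, table_name) VALUES ('Movie', 'movie_tbl');
    INSERT INTO movie_tbl (input, col_1, col_2, col_3)
        VALUES ('Movie', 'data', 'data', 'data');
""")

# Table names that are allowed to appear in generated SQL (from the question).
ALLOWED_TABLES = {"movie_tbl", "cartoon_tbl", "audio_tbl", "picture_tbl"}


def find_rows(conn, user_input):
    """Look up the target table for user_input, then fetch the matching rows."""
    row = conn.execute(
        "SELECT table_name FROM lookup_table WHERE input = ?", (user_input,)
    ).fetchone()
    if row is None or row[0] not in ALLOWED_TABLES:
        return []  # unknown input or unexpected table name
    table_name = row[0]
    # Safe to interpolate: table_name was checked against ALLOWED_TABLES.
    return conn.execute(
        f"SELECT * FROM {table_name} WHERE input = ?", (user_input,)
    ).fetchall()


movie_rows = find_rows(conn, "Movie")
```

Binding the value also removes the need to quote the input by hand, which is what the unquoted LIKE Movie in the original queries was missing.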

Dynamic value to display numbers of entries in second table

I've got multiple entries in table A and would like to display the number of entries in a column of table B. Is there a way to create dynamic cell content that displays the number of entries in a table?
I'm a beginner in MySQL and did not find a way to do it so far.
Example table A:
+----+------+------------+
| id | name | birthday |
+----+------+------------+
| 1 | john | 1976-11-18 |
| 2 | bill | 1983-12-21 |
| 3 | abby | 1991-03-11 |
| 4 | lynn | 1969-08-02 |
| 5 | jake | 1989-07-29 |
+----+------+------------+
What I'd like in table B:
+----+------+----------+
| id | name | numusers |
| 1 | tblA | 5 |
+----+------+----------+
In my actual database there is no incrementing ID so just taking the last value would not work - if this would've been a solution.
If MySQL can't handle this the option would be to create some kind of cronjob on my server reading the number of rows and writing them into that cell. I know how to do this - just checking if there's another way.
I'm not looking for a command to run on the mysql-console. What I'm trying to figure out is if there's some option which dynamically changes the cell's value to what I've described above.
You can create a view that will give you this information. The SQL for this view is inspired by an answer to a similar question:
CREATE VIEW table_counts AS
SELECT table_name, table_rows
FROM information_schema.tables
WHERE table_schema = '{your_db}';
The view will have the cells you describe. As you can see, it is just a filter on an already existing table, so you might consider the information_schema.tables table itself to be the answer to your question. Note that for InnoDB tables the table_rows value is an estimate, not an exact count.
You can do that directly with COUNT(), for example SELECT COUNT(*) FROM TblA. That returns the number of rows in the table. If your indexes are OK, it is very fast. If you write the count to another table, you still have to run a query to read the result from that second table, so I think you can just do it directly.
If you run into performance problems, there are other possibilities, such as triggers or stored procedures that calculate the result and save it in a memory table for better performance.
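The two approaches can be compared side by side. The sketch below uses Python's sqlite3 module with an in-memory database, so the trigger syntax is SQLite's and would need adapting for MySQL; the trigger names are hypothetical. It keeps numusers in tblB up to date with triggers and also reads the count directly with COUNT(*).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblA (id INTEGER PRIMARY KEY, name TEXT, birthday TEXT);
    CREATE TABLE tblB (id INTEGER PRIMARY KEY, name TEXT, numusers INTEGER);
    INSERT INTO tblB (name, numusers) VALUES ('tblA', 0);

    -- Keep the stored counter in step with inserts and deletes on tblA.
    CREATE TRIGGER tblA_after_insert AFTER INSERT ON tblA
    BEGIN
        UPDATE tblB SET numusers = numusers + 1 WHERE name = 'tblA';
    END;
    CREATE TRIGGER tblA_after_delete AFTER DELETE ON tblA
    BEGIN
        UPDATE tblB SET numusers = numusers - 1 WHERE name = 'tblA';
    END;
""")

# Sample rows from the question.
people = [
    ("john", "1976-11-18"),
    ("bill", "1983-12-21"),
    ("abby", "1991-03-11"),
    ("lynn", "1969-08-02"),
    ("jake", "1989-07-29"),
]
conn.executemany("INSERT INTO tblA (name, birthday) VALUES (?, ?)", people)


def direct_count(conn):
    """Count the rows on demand."""
    return conn.execute("SELECT COUNT(*) FROM tblA").fetchone()[0]


def stored_count(conn):
    """Read the counter maintained by the triggers."""
    return conn.execute(
        "SELECT numusers FROM tblB WHERE name = 'tblA'"
    ).fetchone()[0]
```

Both functions need a query to get the number, which is the point made above: the stored counter only pays off if counting on demand is measurably too slow.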

Compare specific field from two different database table

I'm developing a synchronization tool in VB.NET. I have two databases, and every table in each has a GUID field on its records; this field gives a record the same key in both databases. Each record also has a field called lastUpdated, which holds a timestamp down to fractions of a second, to prevent two users from updating the record at the same time. My question is: how can I compare the records of the same table across the two databases? For example:
ONLINE_DATABASE
TABLE_1
| ID | GUID | NAME | LASTUPDATED |
| 5 | 054ba092-b476-47ed-810b-32868cc95fb| John | 06-01-2016 17:01:12.472438 |
CLIENT_DATABASE
TABLE_1
| ID | GUID | NAME | LASTUPDATED |
| 9 | 054ba092-b476-47ed-810b-32868cc95fb| Jack | 06-01-2016 18:01:12.472438 |
As you can see, I've updated the record from the client application, so I need to apply the same change to the online database. I have about a thousand records to check across about ten tables. So my question is: how can I build a system that does this? I thought of reading each row with a MySqlCommand reader, but I suspect that approach is slow. Any suggestions?
Note: the tables have the same name in both databases.
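One possible shape for the comparison described above, sketched in Python rather than VB.NET: rows from both databases are keyed by GUID, and the side with the newer LASTUPDATED value wins. The plan_sync helper is hypothetical, rows are assumed to be already loaded as dictionaries, and the timestamp format is assumed to be day-month-year, which the sample data does not settle.

```python
from datetime import datetime

# Assumption: the sample value "06-01-2016 17:01:12.472438" is day-month-year.
TS_FORMAT = "%d-%m-%Y %H:%M:%S.%f"


def parse_ts(value):
    """Parse a LASTUPDATED string into a datetime."""
    return datetime.strptime(value, TS_FORMAT)


def plan_sync(online_rows, client_rows):
    """Return (push, pull): rows to copy to the online DB and to the client DB."""
    online = {row["GUID"]: row for row in online_rows}
    client = {row["GUID"]: row for row in client_rows}
    push, pull = [], []
    for guid in online.keys() | client.keys():
        o, c = online.get(guid), client.get(guid)
        if o is None:
            push.append(c)  # exists only on the client
        elif c is None:
            pull.append(o)  # exists only online
        else:
            o_ts, c_ts = parse_ts(o["LASTUPDATED"]), parse_ts(c["LASTUPDATED"])
            if c_ts > o_ts:
                push.append(c)  # client copy is newer
            elif o_ts > c_ts:
                pull.append(o)  # online copy is newer
    return push, pull


# The two sample records from the question (ID differs, GUID matches).
online_rows = [{"ID": 5, "GUID": "054ba092-b476-47ed-810b-32868cc95fb",
                "NAME": "John", "LASTUPDATED": "06-01-2016 17:01:12.472438"}]
client_rows = [{"ID": 9, "GUID": "054ba092-b476-47ed-810b-32868cc95fb",
                "NAME": "Jack", "LASTUPDATED": "06-01-2016 18:01:12.472438"}]

push, pull = plan_sync(online_rows, client_rows)
```

Loading only the GUID and LASTUPDATED columns first, and fetching full rows just for the GUIDs that differ, would keep the amount of data read per table small.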

Data structure for a set of changes similar to SVN?

So far we have been storing information about changes as follows.
Imagine a changeset table for something that gets changed, which we call object. The object is connected to, say, a foreign element by a foreign key. The object gets created like this:
changesetId (Timestamp) | objectId | foreignKey | name (String) | description (String)
2015-04-29 23:28:52 | 2 | 123 | none | none
Now we change the name; after the name change the table looks like this:
changesetId (Timestamp) | objectId | foreignKey | name (String) | description (String)
2015-04-29 23:28:52 | 2 | 123 | none | none
2015-04-29 23:30:01 | 2 | null | foo | null
This structure is the bare minimum: it contains exactly the change we made. But to build the current version of the object, we have to add up the changes to get the final version. E.g.
changesetId (Timestamp) | objectId | foreignKey | name (String) | description (String)
2015-04-29 23:28:52 | 2 | 123 | none | none
2015-04-29 23:30:01 | 2 | null | foo | null
*2015-04-29 23:30:01 | 2 | 123 | foo | none
the * marking the final version, which does not exist in the DB.
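The "adding up" step can be sketched as follows, assuming the rows are available as Python dictionaries and that null (None) always means "unchanged". The fold_changes helper is hypothetical and only illustrates the extra work this storage scheme requires.

```python
def fold_changes(changes):
    """Fold sparse change rows, oldest first, into the current version."""
    current = {}
    # The timestamp strings sort chronologically in this format.
    for change in sorted(changes, key=lambda c: c["changesetId"]):
        for field, value in change.items():
            if value is not None:  # None means the field was not touched
                current[field] = value
    return current


# The two stored rows from the example above.
changes = [
    {"changesetId": "2015-04-29 23:28:52", "objectId": 2,
     "foreignKey": 123, "name": "none", "description": "none"},
    {"changesetId": "2015-04-29 23:30:01", "objectId": 2,
     "foreignKey": None, "name": "foo", "description": None},
]

final_version = fold_changes(changes)
```

One limitation of this scheme shows up directly in the code: because null marks an untouched field, a change that deliberately sets a field to null cannot be represented.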
So if we store only the changes, we have more work to do, especially when starting from a foreign object f. If I have a number of objects f and I want to get all changes to the objects in our table that belong to them, I have to write some rather ugly SQL. This obviously gets worse the more foreign objects you have.
Basically I have to do:
Select all F that I want and
Select all objects WHERE foreignKey = foreignId
OR Select all objects that have objectId in (Select all objects that have foreignKey = foreignId)
e.g. I have to select the objects that have foreignKey 123 or elements that have foreignKey null but there exists an entry with same objectId with foreignKey 123.
The more dependencies, the uglier this SQL gets obviously.
Did I make myself clear?
Wouldn't it be much easier to always keep all fields in all versions?
E.g. a simple name change becomes:
changesetId (Timestamp) | objectId | foreignKey | name (String) | description (String)
2015-04-29 23:28:52 | 2 | 123 | none | none
2015-04-29 23:30:01 | 2 | 123 | foo | none
Now, to create a diff, I have to compare both versions, but I don't have to do the extra work of selecting the right elements or of calculating the final version as of a given timestamp.
What do you consider the proven best solution?
How does SVN do it?
For your use case, the method you suggest seems better. Key-value stores built on LSM trees do the same thing: they write a newer version of the object without deleting the older version. If at any point you need the change that was made, you can just diff two adjacent versions.
The second method might use more space if you have a lot of variable-length text fields, but that is the trade-off you make for speed and maintainability.
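With full rows stored for every version, recovering a change is a field-by-field comparison of two adjacent versions. A minimal sketch, assuming versions are held as Python dictionaries ordered oldest first; the diff_versions helper is hypothetical.

```python
def diff_versions(older, newer, ignore=("changesetId",)):
    """Return {field: (old_value, new_value)} for fields that differ."""
    return {
        field: (older.get(field), newer.get(field))
        for field in newer
        if field not in ignore and older.get(field) != newer.get(field)
    }


# The two full versions from the question, oldest first.
history = [
    {"changesetId": "2015-04-29 23:28:52", "objectId": 2,
     "foreignKey": 123, "name": "none", "description": "none"},
    {"changesetId": "2015-04-29 23:30:01", "objectId": 2,
     "foreignKey": 123, "name": "foo", "description": "none"},
]

current = history[-1]  # the latest row already is the final version
name_change = diff_versions(history[0], history[1])
```

Reading the current state is a single row lookup, and the diff is computed only when someone actually asks for it.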