Left join table ON row with JSON values? - mysql

This one is tough. I have two tables that I need to join on a specific column, and the problem is that the column in the first table holds a JSON value.
This is the JSON value from the items table:
[{"id":"15","value":"News Title"},{"id":"47","value":"image1.jpg"},{"id":"33","value":"$30"}]
This is the attributes table, which I need to join on the JSON id to get the actual attribute name, such as Title, Image, or Price:
id Name
15 Title
47 Image
33 Price
so the start is
SELECT item_values FROM ujm_items
LEFT JOIN?????
WHERE category = 8 AND published = 1 ORDER BY created DESC
But I have no clue how to LEFT JOIN on JSON.
Any help is appreciated.

... and this is why you don't store structured data in a single SQL field. It negates the whole purpose of a relational database.
Unless you've got a DB that includes a JSON parser, you've got two options:
a) unreliable string operations to find/extract a particular key/value pair
b) slurp the json into a client which CAN parse back to native, extract the key/values you want, then use some other ID field for the actual joins.
SELECT ...
LEFT JOIN ON SUBSTR(jsonfield, POSITION('"id:"', jsonfield)) etc...
Note that this won't work as is. It's just to demonstrate how utterly ugly this gets.
Either way, it utterly torpedoes performance, since you can't use indexes on these calculated/derived values.
Fix your tables: normalize the design and don't store JSON data that you need to extract data from. It's one thing to put in a JSON string that you'll only ever fetch/update in its entirety. It's a completely different thing to have one where you need to join on sub-values.
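Option (b) can be sketched in Python. This is only a minimal illustration under assumptions: the attributes dict stands in for the rows of the attributes table, item_values is the JSON string from the question, and the helper name label_item_values is made up for this example.

```python
import json

# Stand-in for the rows of the attributes table (id -> Name).
attributes = {15: "Title", 47: "Image", 33: "Price"}

# One item_values string as stored in ujm_items (copied from the question).
item_values = (
    '[{"id":"15","value":"News Title"},'
    '{"id":"47","value":"image1.jpg"},'
    '{"id":"33","value":"$30"}]'
)


def label_item_values(raw_json, names_by_id):
    """Replace each JSON id with its attribute name (hypothetical helper)."""
    labelled = {}
    for pair in json.loads(raw_json):
        attr_id = int(pair["id"])  # ids are stored as strings in the JSON
        # Like a LEFT JOIN, keep the value even when no attribute matches.
        name = names_by_id.get(attr_id, f"unknown:{attr_id}")
        labelled[name] = pair["value"]
    return labelled


labelled = label_item_values(item_values, attributes)
print(labelled)
```

The lookup happens in the client, so the database never has to parse the JSON, but it also means the join cannot be filtered or sorted on the SQL side.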

Related

How to search inside a JSON string with a SQL query when not every row that is being searched in contains JSON?

I've got a table that contains a column (rated_by) with some IDs in JSON format. Example: ["59"]
I would like to search for a specific number in the rated_by column across the entire table.
I've done this before in the same table for another column that also contains IDs in the same format, with the following query (for the column producten):
SELECT * FROM review WHERE JSON_SEARCH(producten,"one", "26") IS NOT NULL ORDER BY useful DESC
This works fine and this is because every row of producten is filled with json, not a single one is empty, but for rated_by some rows can be empty.
Using the exact same query like this:
SELECT * FROM review WHERE JSON_SEARCH(rated_by,"one", "59") IS NOT NULL
I get: Invalid JSON text in argument 1 to function json_search: "The document is empty." at position 0.
I tested it by emptying one row of producten and trying the working query again, it stopped working.
So the problem is my query stops working when one row of rated_by does not contain json or is empty.
Why is that? I thought using IS NOT NULL would tackle this.
I am using MYSQL.
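JSON_SEARCH raises an error as soon as its first argument is not valid JSON, and an empty string is not valid JSON, so the query fails before IS NOT NULL is ever evaluated. I fixed it using JSON_VALID, as suggested in the comments: first check whether the row contains valid JSON, and only then search it.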
Working query:
SELECT * FROM review WHERE JSON_VALID(rated_by) AND review_id = "10" AND JSON_SEARCH(rated_by,"one", "59") IS NOT NULL
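The same "validate first, then search" idea can be sketched on the client side in Python. The rows list and the helper name contains_id are hypothetical, and the sketch assumes rated_by holds a flat array like ["59"], which is narrower than what JSON_SEARCH handles.

```python
import json


def contains_id(raw, wanted):
    """Return True only if raw is valid JSON holding an array that contains wanted."""
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):
        # None, empty strings and other non-JSON text land here,
        # which plays the role of JSON_VALID(...) being false.
        return False
    return isinstance(parsed, list) and wanted in parsed


# Hypothetical rated_by values: filled, empty and NULL.
rows = ['["59"]', "", None, '["12","59"]', '["5"]']
matches = [raw for raw in rows if contains_id(raw, "59")]
print(matches)
```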

How to query an array field (AWS Glue)?

I have a table in AWS Glue, and the crawler has defined one field as array.
The content is in S3 files that have a json format.
The table is TableA, and the field is members.
There are a lot of other fields such as strings, booleans, doubles, and even structs.
I am able to query them all using a simple query such as:
SELECT
content.my_boolean,
content.my_string,
content.my_struct.value
FROM schema.tableA;
The issue is when I add content.members into the query.
The error I get is: [Amazon](500310) Invalid operation: schema "content" does not exist.
Content exists, because I am able to select other fields from the main key in the JSON (content).
It is probably related to how to query an array field in Spectrum.
Any idea?
You have to alias the table to extract the fields from the external schema:
SELECT
a.content.my_boolean,
a.content.my_string,
a.content.my_struct.value
FROM schema.tableA a;
I had the same issue with my data. I really don't know why it needs the alias, but it works. If you need to access the elements of an array, you have to unnest it like this:
SELECT member.<your-field>
FROM schema.tableA a, a.content.members AS member;
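To show what unnesting the array produces, here is a small Python sketch. The records and the field names my_string and name are made up, and the assumption is that the comma-style join behaves like an inner join, so records with an empty members array produce no rows.

```python
# Hypothetical records shaped like the crawled JSON:
# a content struct that contains a members array.
records = [
    {"content": {"my_string": "a", "members": [{"name": "x"}, {"name": "y"}]}},
    {"content": {"my_string": "b", "members": []}},
    {"content": {"my_string": "c", "members": [{"name": "z"}]}},
]

# One output row per array element, similar in spirit to
# FROM schema.tableA a, a.content.members AS member
rows = [
    (record["content"]["my_string"], member["name"])
    for record in records
    for member in record["content"]["members"]
]
print(rows)
```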
You need to create a Glue Classifier.
Select JSON as Classifier type
and for the JSON Path input the following:
$[*]
then run your crawler. It will infer your schema and populate your table with the correct fields instead of just one big array. Not sure if this was what you were looking for but figured I'd drop this here just in case others had the same problem I had.

Incorporating JSON objects in a table into JSON output

I have two tables with some data:
Table 1: A single row of data acting as header information for the data contained in Table 2.
Table 2: A table with multiple rows including various data columns, but also including two columns with JSON formatted text (varchar(max)).
The sample data is available here:
https://docs.google.com/spreadsheets/d/1Y1Zb2a2G-NZ71wNLQxTTeiF8gXdLuIRfUrqBVmAdTFU/edit#gid=0
The requirement is to wrap the data in Table 2 within Table 1 producing a single valid JSON output including the two columns with JSON formatted text.
I expect the output to be produced by something looking like this:
SELECT *
,(
SELECT *
FROM Table2
FOR JSON path
) elements
FROM Table1
FOR JSON path
I am not sure what the best way is to incorporate the JSON formatted text into this output. I suspect that one will have to use JSON_modify() to achieve this.
Do you have any suggestions?
The answer was actually quite simple. Wrapping the two columns that hold JSON-formatted text in JSON_Query() gives the required output, because FOR JSON then embeds them as JSON instead of escaping them as strings.
SELECT *,
(
SELECT
SequenceID, ItemID, Description, arrow, arrowColor,
Value, valueUnit, volatility, BulletChartID,
JSON_Query(BulletChart) BulletChart,
JSON_Query(TrendLineChart) TrendLineChart
FROM Table2
FOR JSON path
) elements
FROM Table1
FOR JSON path
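The difference between embedding a JSON-formatted string as text and embedding it as JSON can be illustrated in Python. This is an analogy rather than SQL Server behaviour itself, and the row contents are made up.

```python
import json

# Hypothetical Table2 row: BulletChart holds JSON text in a plain string column.
row = {"ItemID": 1, "BulletChart": '{"min": 0, "max": 10}'}

# Serializing the string as-is escapes it, like FOR JSON without JSON_Query().
escaped = json.dumps(row)

# Parsing it first embeds it as a nested object,
# like wrapping the column in JSON_Query().
nested = json.dumps({**row, "BulletChart": json.loads(row["BulletChart"])})

print(escaped)
print(nested)
```

In the first output the chart is a quoted string with escaped inner quotes. In the second it is a real nested object that a consumer can read without a second parse.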

Database design - using JSON to store info

I have a database listing campsites
There is a main table, tblSites, that contains the unique data (e.g. name, coordinates, address) and also includes a column for each facility (e.g. Toilet, Water, Shower, Electric), where 1 = yes and NULL = no.
This would be searched by something like
SELECT id FROM tblSites WHERE Water = 1 AND Toilets = 1
There is another related table, tblLocations, which contains the location type (i.e. near the sea, rural, mountains, by a river, etc.).
Having a column per facility and per location type means the tables have a lot of columns, and adding a new category is not easy.
This would be included in a search like this
SELECT M.id, L.* FROM tblSites AS M
LEFT JOIN tblLocation AS L ON M.ID = L.ID WHERE M.water=1 AND L.river=1
What I am considering is adding a column, e.g. facilities, that would contain a JSON string of facility keys, e.g. [1,3,4,12], where each number represents an available facility, and another column for locations in the same format, e.g. [1,3,5].
This does allow me to reduce the table size and add facilities or locations without adding extra columns, but is it a good idea performance-wise?
i.e. a search would now be something like
SELECT id FROM tblSites WHERE (facilities LIKE '%1,%' AND facilities LIKE '%4,%' AND locations LIKE '%1,%')
Is there a better query that could be used to see if the field contains a key number in the array string?
Your WHERE clause does not work correctly with LIKE '%1,%'.
If facilities is a string (TEXT, VARCHAR, ...) holding a JSON array such as [2,3,12,21,300], then facilities LIKE '%1,%' is true because it matches '21,', even though 1 is not in the array. And facilities LIKE '%300,%' never matches that array, because the last element has no trailing comma.
So searching a JSON array stored as a string is not reliable.
If your MySQL version is 5.7.8 or later, it supports a native JSON column type.
When you store your data in a JSON column, you can search it with WHERE JSON_CONTAINS(facilities, '1'). The second argument has to be a JSON document, hence the quotes.
But the best solution is to redesign your table structures and relations, as @Robby commented below your question.
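The false match and the missed match are easy to reproduce outside the database. The Python sketch below uses substring checks as a stand-in for the LIKE patterns and compares them with a real membership test on the parsed array.

```python
import json

facilities = "[2,3,12,21,300]"  # the array from the answer, stored as text

# Substring checks, equivalent to LIKE '%1,%' and LIKE '%300,%'.
like_1 = "1," in facilities      # matches inside "21,"
like_300 = "300," in facilities  # last element has no trailing comma

# Parsing the array and testing membership instead.
parsed = json.loads(facilities)
has_1 = 1 in parsed
has_300 = 300 in parsed

print(like_1, like_300, has_1, has_300)  # True False False True
```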

How to write a code to convert text into a number?

The database I'm working on has a field in one table as a text whereas the other table has the field in a number format. I cannot change the field format at all in the database. Therefore I need to know how to convert the field from text to number before linking (or join) the tables to pull the data.
SELECT DISTINCT tblCoachingDB.ID, tblCoachingDB.SourceId, tblCoachingDBSource.ID
FROM tblCoachingDB, tblCoachingDBSource
WHERE (((tblCoachingDB.SourceId)="12"));
The tblCoachingDB.SourceID is a TEXT whereas the tblCoachingDBSource.ID is a NUMBER
You can use CStr() to cast a number as text and JOIN that to another text field.
SELECT DISTINCT
tblCoachingDB.ID,
tblCoachingDB.SourceId,
tblCoachingDBSource.ID
FROM
tblCoachingDB INNER JOIN tblCoachingDBSource
ON tblCoachingDB.SourceId = CStr(tblCoachingDBSource.ID)
WHERE tblCoachingDB.SourceId='12';
Actually I would leave out the WHERE clause until after you confirm the JOIN works properly.
You originally asked to JOIN by converting the text field to number. I first suggested text instead because I recall Access was less likely to object. But my memory about that is shaky, and if you want numeric for both sides of the JOIN, see which of these (if any) works best for you:
ON Int(tblCoachingDB.SourceId) = tblCoachingDBSource.ID
ON CLng(tblCoachingDB.SourceId) = tblCoachingDBSource.ID
ON Val(tblCoachingDB.SourceId) = tblCoachingDBSource.ID
Note I offered this suggestion only because you told us you are not permitted to alter your tblCoachingDB table's design to make SourceId numeric instead of text datatype. Since you can't make that change, you will have to live with the run-time performance impact of converting the datatype of a JOIN field. That is not a good thing, but I don't know how bad it will be. Good luck.
Assuming that all values in tblCoachingDB.SourceID are numbers, you could create a query that selects all fields from tblCoachingDB except SourceID, then add a new field to the query: SourceID: CLng(tblCoachingDB.SourceID)
You would then use that query instead of tblCoachingDB anywhere you need to make the join.
A second alternative would be to create a query for tblCoachingDBSource with a text version of its key, for example SourceID: CStr(tblCoachingDBSource.ID)
A third alternative would be:
SELECT * FROM tblCoachingDB, tblCoachingDBSource
WHERE (clng(tblCoachingDB.SourceId)=tblCoachingDBSource.ID
AND ((tblCoachingDB.SourceId)="12"));
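The idea common to all of these, converting one side of the key before matching, can be sketched in Python. The rows below are made up, and the digit check is an assumption about how to treat text values that are not clean numbers.

```python
# Hypothetical rows: SourceId is text in tblCoachingDB,
# ID is numeric in tblCoachingDBSource.
coaching = [
    {"ID": 1, "SourceId": "12"},
    {"ID": 2, "SourceId": "7"},
    {"ID": 3, "SourceId": "n/a"},
]
sources = [{"ID": 12}, {"ID": 7}]

# Index the numeric side once, then convert the text key for each lookup.
sources_by_id = {s["ID"]: s for s in sources}

joined = [
    (c["ID"], c["SourceId"], sources_by_id[int(c["SourceId"])]["ID"])
    for c in coaching
    # Skip text that is not a plain number instead of failing on it.
    if c["SourceId"].strip().isdigit() and int(c["SourceId"]) in sources_by_id
]
print(joined)
```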