Redshift transformation: JSON key/value to relation

In a Redshift table I have a tag column, a varchar that stores JSON with typical key/value pairs. E.g. for the record with id = 1 the tag value looks like: {"env":"test","app-name":"ap123","product-type":"web-app"}.
I would like to transform the key/value pairs into a typical relational table like the one below:
+-----+--------------+---------+
| id  | key          | value   |
+-----+--------------+---------+
| 1   | env          | test    |
| 1   | app-name     | ap123   |
| 1   | product-type | web-app |
| 2   | env          | dev     |
| ... | ...          | ...     |
+-----+--------------+---------+
I did some quick research but didn't find any solution. I tried the Redshift JSON functions, but without achieving the desired result (https://docs.aws.amazon.com/redshift/latest/dg/json-functions.html).
Any ideas are very welcome.

The idea in this situation was to turn the JSON object into a JSON array: add enclosing brackets ('[' and ']') around the JSON and replace each ',' with '}, {' so that every key/value pair becomes a separate array element. After that, each element of the array can be accessed with the json_extract_array_element_text function.
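The string trick described above can be sketched outside Redshift to see what it produces. This is only a rough Python illustration, not Redshift SQL; the function name explode_tags is made up for the example, and it assumes that no key or value contains a comma or a nested object (the same limitation the SQL version would have).

```python
import json


def explode_tags(record_id, tag):
    """Return (id, key, value) rows for a flat JSON object stored as text."""
    # Same string trick as in the answer: split the object at each comma and
    # wrap the result in brackets so that it parses as a JSON array.
    # Assumption: no key or value contains a comma or a nested structure.
    as_array = "[" + tag.replace(",", "},{") + "]"
    rows = []
    for element in json.loads(as_array):
        # Each element is now a one-pair object such as {"env": "test"}.
        for key, value in element.items():
            rows.append((record_id, key, value))
    return rows


rows = explode_tags(1, '{"env":"test","app-name":"ap123","product-type":"web-app"}')
for row in rows:
    print(row)
```

In Redshift itself the per-element step would be done with json_extract_array_element_text, one call per array position.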

Related

How to create a JSON object for each column in a row

I have a table itemmaster in PostgreSQL.
id| attribute1 | attribute2 | attribute3
1 | Good | Average | Best
I want the output as JSON like
[{"attribute1":"Good"},{"attribute2":"Average"},{"attribute3":"Best"}]
I want to use this JSON as nested JSON inside another object. I have tried row_to_json and the JSON object builder functions, but I am not getting the exact result.
select json_build_array(json_build_object('attribute1', itemmaster.attribute1),
json_build_object('attribute2', itemmaster.attribute2),
json_build_object('attribute3', itemmaster.attribute3))
from itemmaster;
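To make the target shape concrete, here is a small Python sketch of what the query above produces for the sample row, nested inside another object. The wrapper key "attributes" and the function name are assumptions made for the illustration; they are not part of the question.

```python
import json


def row_to_single_key_objects(row):
    """Turn {"a": 1, "b": 2} into [{"a": 1}, {"b": 2}], one object per column."""
    return [{column: value} for column, value in row.items()]


# Sample row from the itemmaster table (id kept separately).
row = {"attribute1": "Good", "attribute2": "Average", "attribute3": "Best"}

# "attributes" is a hypothetical key for the enclosing object.
nested = {"id": 1, "attributes": row_to_single_key_objects(row)}
print(json.dumps(nested))
```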

Create hive external table with complex data type and load from CSV or TSV having few columns with serialized JSON object

I have a CSV (or TSV) file with a column ('nw_day' in the example below) holding a serialized array object and another column ('res_m' in the example below) holding a serialized JSON object. It also has columns with STRING, TIMESTAMP, and FLOAT data types.
The TSV looks somewhat like this (showing the first row):
+----------+---------------------+-------+-----------------------------------------------+------------------------------------------------------------------------+
| com_id | w_start_time | cap | nw_day | res_m |
+----------+---------------------+-------+-----------------------------------------------+------------------------------------------------------------------------+
| dtf_id | 2019-04-24 06:00:03 | 444.3 | {'Fri','Mon','Sat','Sun','Thurs','Tue','Wed'} | {"some_str":"str_one","some_n":1,"some_t":2019-04-24 06:00:03.700+0000}|
+----------+---------------------+-------+-----------------------------------------------+------------------------------------------------------------------------+
I have tried the following statement, but it is not giving me the results I expect.
CREATE EXTERNAL TABLE IF NOT EXISTS table_name(
com_id STRING,
w_start_time TIMESTAMP,
cap FLOAT,
nw_day array <STRING>,
res_m STRUCT <
some_str: STRING,
some_n: BIGINT,
some_t: TIMESTAMP
>)
COMMENT 's_e_s'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/location/to/folder/containing/csv'
TBLPROPERTIES ("skip.header.line.count"="1");
My expectation was that this would deserialize those objects into Hive complex data types (ARRAY and STRUCT). But that is not what I get when I run
select * from table_name limit 1;
which gives me
+----------+---------------------+-------+----------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------+
| com_id | w_start_time | cap | nw_day | res_m |
+----------+---------------------+-------+----------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------+
| dtf_id | 2019-04-24 06:00:03 | 444.3 | ["{'Fri'"," 'Mon'"," 'Sat'"," 'Sun'"," 'Thurs'"," 'Tue'"," 'Wed'}"] | {"some_str":"{\"some_str\":\"str_one\",\"some_n\":1,\"some_t\":2019-04-24 06:00:03.700+0000}\","some_n":null,"some_t":null}|
+----------+---------------------+-------+----------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------+
So Hive is treating the whole object as a string and splitting that string on the delimiter.
I need some help understanding how to load data from CSV/TSV to complex data types in Hive.
I found a similar question, but the requirement there is a little different and no complex data type is involved.
Any help would be much appreciated. If this cannot be done and a preprocessing step has to be included prior to loading, an example of input data that loads into complex data types in Hive would help me. Thanks in advance!
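As a possible starting point for the preprocessing step mentioned above, here is a minimal Python sketch that rewrites the nw_day column. It assumes that Hive's delimited text format wants array items as bare values separated by the collection delimiter (',' in the table definition above), with no braces or quotes. The function names and the column position are assumptions for the example. The res_m value is left untouched, because the sample shown is not valid JSON (the timestamp is unquoted) and would need its own handling.

```python
def to_hive_array_field(raw):
    """Turn "{'Fri','Mon'}" into "Fri,Mon" for a delimited-text ARRAY column."""
    inner = raw.strip()
    # Drop the enclosing braces if present.
    if inner.startswith("{") and inner.endswith("}"):
        inner = inner[1:-1]
    # Strip whitespace and surrounding quotes from every item.
    items = [item.strip().strip("'\"") for item in inner.split(",")]
    return ",".join(items)


def preprocess_line(line, array_column=3):
    """Rewrite one TSV line; array_column is the 0-based index of nw_day."""
    fields = line.rstrip("\n").split("\t")
    fields[array_column] = to_hive_array_field(fields[array_column])
    return "\t".join(fields)


sample = "\t".join([
    "dtf_id",
    "2019-04-24 06:00:03",
    "444.3",
    "{'Fri','Mon','Sat','Sun','Thurs','Tue','Wed'}",
    '{"some_str":"str_one","some_n":1,"some_t":2019-04-24 06:00:03.700+0000}',
])
print(preprocess_line(sample))
```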

MySQL Return JSON array index based on property value

I have a table with JSON data like this:
{"a": [{"color": "blue", "value": 15}, {"color": "red", "value": 30}]}
I need to get the "value" that is in the same object as "blue".
I thought to use the code below:
SELECT JSON_EXTRACT(my_data, '$.a[0].value');
The problem is that the "blue" object can be in any index of the array.
So, is there a way to retrieve the index first, so that I can then query using the right index?
UPDATE
Barmar's answer works, but the path it builds needs to be wrapped in JSON_UNQUOTE().
Use JSON_SEARCH() to find the path to blue.
SELECT JSON_EXTRACT(my_data, JSON_UNQUOTE(REPLACE(JSON_SEARCH(my_data, 'one', 'blue'), '.color', '.value')))
JSON_SEARCH will return a string like $.a[0].color. REPLACE changes that to $.a[0].value, then you extract that element.
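The search-then-rewrite steps above can be mirrored in a few lines of Python to check the logic. This is only an illustration of the idea, not MySQL code; find_value is a made-up name, and the path strings are built by hand to imitate what JSON_SEARCH and REPLACE would yield.

```python
import json


def find_value(raw, color):
    """Return (value_path, value) for the first object with the given color."""
    doc = json.loads(raw)
    for index, item in enumerate(doc["a"]):
        if item.get("color") == color:
            # Path in the style that JSON_SEARCH reports for the match.
            color_path = f"$.a[{index}].color"
            # The REPLACE step: point at the sibling "value" key instead.
            value_path = color_path.replace(".color", ".value")
            return value_path, item["value"]
    return None  # no object with that color


raw = '{"a": [{"color": "blue", "value": 15}, {"color": "red", "value": 30}]}'
print(find_value(raw, "blue"))
print(find_value(raw, "red"))
```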
Here's an example of using JSON_TABLE():
select j.* from d, json_table(d.data, '$.a[*]' columns (
color varchar(20) path '$.color',
value int path '$.value')
) as j;
+-------+-------+
| color | value |
+-------+-------+
| blue | 15 |
| red | 30 |
+-------+-------+
You can then apply conditions in the WHERE clause, as if you had stored the data in a normal table.
select j.* from d, json_table(d.data, '$.a[*]' columns (
color varchar(20) path '$.color',
value int path '$.value')
) as j
where j.color = 'blue';
+-------+-------+
| color | value |
+-------+-------+
| blue | 15 |
+-------+-------+
This requires you to write a complex query like this EVERY TIME you query the JSON data.
One wonders if it would have been easier to store the JSON in a normal table from the start.
I often warn MySQL users that storing data as JSON makes more work for them if they need SQL expressions that reference individual fields within the JSON. I wouldn't use JSON in these cases; I'd explode the JSON array into rows, and the JSON fields into columns, of a set of normal tables. Then you can write simpler queries, you can optimize with indexes, and you can use constraints and data types properly.
JSON is the most easily misused feature of the recent MySQL releases.
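As a rough sketch of the "normal table" alternative recommended above, the example below explodes the sample document into rows once and then queries it with a plain WHERE clause. It uses Python's built-in sqlite3 module instead of MySQL so that it runs anywhere; the table name item, the doc_id column and the index name are assumptions made for the illustration.

```python
import json
import sqlite3

raw = '{"a": [{"color": "blue", "value": 15}, {"color": "red", "value": 30}]}'

conn = sqlite3.connect(":memory:")
# One row per array element, one typed column per JSON field.
conn.execute(
    "CREATE TABLE item ("
    "doc_id INTEGER NOT NULL, color TEXT NOT NULL, value INTEGER NOT NULL)"
)
# An ordinary index can now support the lookup by color.
conn.execute("CREATE INDEX item_color ON item (color)")

# Explode the JSON array into rows once, at load time.
rows = [(1, entry["color"], entry["value"]) for entry in json.loads(raw)["a"]]
conn.executemany("INSERT INTO item (doc_id, color, value) VALUES (?, ?, ?)", rows)

# The query itself no longer needs any JSON functions.
cursor = conn.execute("SELECT value FROM item WHERE color = ?", ("blue",))
blue_values = [value for (value,) in cursor]
print(blue_values)
```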

Using MySQL JSON field to join on a table with custom fields

So I made this system to store custom objects with custom fields for an app that I'm developing. First I have object_def where I save the object definitions:
id | name | fields
------------------------------------------------------------
101 | Group 1 | [{"name": "Title", "id": "AbCdE123"}, ...]
102 | Group 2 | [{"name": "Name", "id": "FgHiJ456"}, ...]
So we have ID (INT), name (VARCHAR) and fields (LONGTEXT). The fields column holds the object fields like this: {id: string, type: string, name: string}[].
Now in the object table, I have this:
id | object_def_id | object_values
------------------------------------------------------------
235 | 101 | {"AbCdE123": "The Object", ... }
236 | 102 | {"FgHiJ456": "John Perez", ... }
Here object_values is also a LONGTEXT. With that system, I'm able to show the objects in a table in my app using JSON.parse().
Now I've learned that there is a JSON type in MySQL and I want to use it to do queries and stuff (I'm really new to this).
I've changed the LONGTEXT to JSON and now I want to do a SELECT that shows the results like this:
#Select objects in group 1:
id | group | Title | ... | other_custom_field
-------------------------------------------------------
235 | Group 1 | The Object | ... | other_custom_value
#Select objects in group 2:
id | group | Name | ... | other_custom_field
-------------------------------------------------------
236 | Group 2 | John Perez | ... | other_custom_value
Id, then group name (I can do this with INNER JOIN) and then all the custom fields with the respective values.
Is this possible? How can I achieve this (hopefully without changing my database structure)? I'm learning MySQL, SQL and databases as I go so I really appreciate your help. Thanks!
Problems I see with your design:
Incorrect JSON format.
[{name: 'Title', id: 'AbCdE123'}, ...]
Should be:
[{"name": "Title", "id": "AbCdE123"}, ...]
You should use the JSON data type instead of LONGTEXT, because JSON will at least reject invalid JSON syntax.
Setting column headings based on data. You can't do this in SQL. Columns and headings must be fixed at the time you prepare the query. You can't do an SQL query that changes its own column headings.
Your object def has an array of attributes, but there's no way in MySQL 5.7 to loop over the "rows" of a JSON array. You'll need to use JSON_TABLE() in MySQL 8.0.
That will get you closer to being able to look up object values, but then you'll still have to pivot the data into the result set you describe, with one attribute in each column, as if the data had been stored in a traditional way. But SQL doesn't allow you to do dynamic pivoting in a single query. You can't make an SQL query that dynamically grows its own select-list based on the data it finds.
This all makes me wonder...
Why don't you just store the data in the traditional way?
Create a table per object type. Add one column to that table per attribute. That way you get column names. You get column types. You get column constraints — for example, how would you simulate NOT NULL or UNIQUE in your current system?
If you don't want to use SQL, then don't. There are alternatives, like document databases or key/value databases. But don't torture poor SQL by using it to implement an Inner-Platform.
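If the schema stays as it is, the pivot that SQL cannot do dynamically has to happen in application code after a plain join. Below is a minimal Python sketch of that step, under the assumption that the query returns the fields JSON from object_def and the object_values JSON for each object. The function name pivot_object is made up, and only the one field shown in the question is used because the rest is elided there.

```python
import json


def pivot_object(fields_json, values_json):
    """Map field ids in object_values to the field names from object_def."""
    fields = json.loads(fields_json)
    values = json.loads(values_json)
    # A field without a stored value becomes None instead of raising.
    return {field["name"]: values.get(field["id"]) for field in fields}


# Only the entries visible in the question; the elided ones are left out.
fields_json = '[{"name": "Title", "id": "AbCdE123"}]'
values_json = '{"AbCdE123": "The Object"}'

row = {"id": 235, "group": "Group 1", **pivot_object(fields_json, values_json)}
print(row)
```

The column headings then come from the keys of each row, which is exactly the part that a fixed SQL select-list cannot provide.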

Select as JSON object {key: {}}

My table:
ID | something1 | something2 | ...
1 | meow | 5 |
2 | 4 | KITTIES |
Is there any way to select data as JSON in format {"1":{"something1":"meow","something2":5},"2":{...}}?
If you don't mind repeating the ID field in the JSON representation of a row, you can do:
SELECT
format('{%s}',
string_agg(
format(
'%s:%s',
to_json(ID::text),
row_to_json(my_table)
), ','
), ''
)::json as json_object
FROM my_table;
This gives you a JSON object containing a sub-object for each row in the table, keyed by the value in the ID field.
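For comparison, the same keyed shape can be built on the client side from ordinary query results. The Python sketch below assumes the rows arrive as dictionaries. The function name is hypothetical, and the sample values only loosely follow the table in the question.

```python
import json


def rows_to_keyed_object(rows, key_column="ID"):
    """Build {"<id>": {row...}, ...}, repeating the id inside each row."""
    # JSON object keys must be strings, so the id is converted explicitly.
    return {str(row[key_column]): row for row in rows}


rows = [
    {"ID": 1, "something1": "meow", "something2": "5"},
    {"ID": 2, "something1": "4", "something2": "KITTIES"},
]
keyed = rows_to_keyed_object(rows)
print(json.dumps(keyed))
```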
See this question for more details.
You can use this library to get an API of the database. Then, consume it! This is the fastest and clearest thing I can imagine.