Check the value type of a JSON value in Postgres - json

Let's say I have a json column fields, like so:
{"phone": 5555555555, "address": "55 awesome street", "hair_color": "green"}
What I would like to do is update all entries where the JSON key phone is present and its value is of type number, converting that value to a string.
What I have is:
SELECT *
FROM parent_object
WHERE (fields->'phone') IS NOT NULL;
Unfortunately this still returns values where phone:null. I'm guessing that a JSON null is not equivalent to a SQL NULL.
How do I:
1) Rule out JSON nulls?
Adding AND (fields->'phone') <> 'null' produces:
LINE 4: ...phone') IS NOT NULL AND (fields->'phone') <> 'null';
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
2) Check the type of the value at that key? In pseudocode: type_of(fields->'phone') == Number, but I need working PostgreSQL.
3) Modify this to update the column
UPDATE parent_object
SET fields.phone = to_char(fields.phone)
WHERE query defined above

As other folks have said, there is no reason to convert the value to an integer just to then cast it back to a string. Also, phone numbers are not numbers. :-)
You need to be using the ->> operator instead of ->. That alongside IS NOT NULL gets your SELECT query working.
Note the difference between the two tuple values after running this query:
SELECT fields->'phone', fields->>'phone'
FROM parent_object;
Your working query:
SELECT *
FROM parent_object
WHERE (fields->>'phone') IS NOT NULL;
Postgres does not currently natively support atomically updating individual keys within a JSON column. You can write wrapper UDFs to provide this capability to you: How do I modify fields inside the new PostgreSQL JSON datatype?
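To see why the original query matched rows with phone: null, it helps to separate three cases: the key is absent, the key holds JSON null, and the key holds a real value. A minimal Python sketch of the same distinction (a hypothetical helper, with plain dicts standing in for the json column; both a missing key and a JSON null surface as SQL NULL through ->>):

```python
import json

def classify_phone(raw):
    """Return (key_present, holds_json_null) for the 'phone' key."""
    fields = json.loads(raw)
    present = "phone" in fields
    # JSON null parses to Python None, just as both a missing key and a
    # JSON null come back as SQL NULL from fields->>'phone'
    return present, present and fields["phone"] is None

print(classify_phone('{"phone": 5555555555}'))             # (True, False)
print(classify_phone('{"phone": null}'))                   # (True, True)
print(classify_phone('{"address": "55 awesome street"}'))  # (False, False)
```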

For checking the type of the value at a key, Postgres has the following in the documentation:
json_typeof ( json ) → text
jsonb_typeof ( jsonb ) → text
Returns the type of the top-level JSON value as a text string. Possible types are object, array, string, number, boolean, and null. (The null result should not be confused with a SQL NULL; see the examples.)
json_typeof('-123.4') → number
json_typeof('null'::json) → null
json_typeof(NULL::json) IS NULL → t
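Putting the pieces together, the intended update — find 'phone' keys whose value is a JSON number and rewrite them as strings — can be sketched in Python (an illustrative analogue of the json_typeof(fields->'phone') = 'number' check plus the UPDATE, not Postgres API):

```python
import json

def stringify_number_phone(raw):
    """If 'phone' holds a JSON number, rewrite it as a string -- the
    analogue of filtering on json_typeof = 'number' and updating."""
    fields = json.loads(raw)
    value = fields.get("phone")
    # bool is a subclass of int in Python, so rule it out explicitly;
    # JSON true/false is not a number
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        fields["phone"] = str(value)
    return json.dumps(fields)

print(stringify_number_phone('{"phone": 5555555555}'))  # {"phone": "5555555555"}
print(stringify_number_phone('{"phone": null}'))        # {"phone": null}
```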

Related

How to use JSON_EXTRACT to get values of a string index

I have the following data rows:
I'm trying to use JSON_EXTRACT to get rows only if the jot_locale_vars array contains an element with the key "2".
SELECT
jot.*,
(JSON_EXTRACT(`jot_locale_vars`, '$[2]')) as localeVar
FROM job_type jot
WHERE jot_excluded = ''
HAVING localeVar IS NOT NULL
As you can see, I've used $[2], but array indexes start at zero, so the string key "1" corresponds to [0], "2" to [1], and so on; I can't rely on positions this way.
How can I extract values conditioned on a string key?
If you are looking to see if a key exists inside of an object, then you'll want to use JSON_CONTAINS_PATH.
SELECT `jot`.*,
`jot_locale_vars`->'$[*]."2"' AS `localeVar`
FROM `job_type` AS `jot`
WHERE `jot_excluded` = 0
AND JSON_CONTAINS_PATH(`jot_locale_vars`, 'one', '$[*]."2"')
Note: This requires MySQL 5.7+
Note 2: The -> operator is just shorthand for JSON_EXTRACT().
The syntax for the path can be found at: https://dev.mysql.com/doc/refman/5.7/en/json.html#json-path-syntax
I'm using $[*]."2", which means "any array value" ([*]) that contains a key named "2" (."2").
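In Python terms, the path $[*]."2" walks every array element and looks for an object key named "2". A rough sketch of what JSON_CONTAINS_PATH(..., 'one', ...) and the extraction do (the helper names are illustrative, not MySQL API):

```python
import json

def contains_path_any_element_key(doc, key):
    """True if any element of the top-level array is an object holding
    `key` -- mirrors JSON_CONTAINS_PATH(doc, 'one', '$[*]."<key>"')."""
    arr = json.loads(doc)
    return any(isinstance(el, dict) and key in el for el in arr)

def extract_key(doc, key):
    """Values at '$[*]."<key>"': the key's value from each matching element."""
    arr = json.loads(doc)
    return [el[key] for el in arr if isinstance(el, dict) and key in el]

doc = '[{"1": "en"}, {"2": "pt"}]'
print(contains_path_any_element_key(doc, "2"))  # True
print(extract_key(doc, "2"))                    # ['pt']
```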

Querying dynamic JSON fields for first non-null value in AWS Athena

I am storing event data in S3 and want to use Athena to query the data. One of the fields is a dynamic JSON field that I do not know the field names for. Therefore, I need to query the keys in the JSON and then use those keys to query for the first non-null value for each field. Below is an example of the data stored in S3.
{
timestamp: 1558475434,
request_id: "83e21b28-7c12-11e9-8f9e-2a86e4085a59",
user_id: "example_user_id_1",
traits: {
this: "is",
dynamic: "json",
as: ["defined","by","the", "client"]
}
}
So, I need a query to extract the keys from the traits column (which is stored as JSON), and use those keys to get the first non-null value for each field.
The closest I could come was sampling a value using min_by, but this does not allow me to add a WHERE clause without returning null values. I will need to use Presto's first_value function, but I cannot get this to work with the keys extracted from the dynamic JSON field.
SELECT DISTINCT trait, min_by(json_extract(traits, concat('$.', cast(trait AS varchar))), received_at) AS value
FROM TABLE
CROSS JOIN UNNEST(regexp_extract_all(traits,'"([^"]+)"\s*:\s*("[^"]+"|[^,{}]+)', 1)) AS t(trait)
WHERE json_extract(traits, concat('$.', cast(trait AS varchar))) IS NOT NULL OR json_size(traits, concat('$.', cast(trait AS varchar))) <> 0
GROUP BY trait
It's not clear to me what you expect as a result, or what you mean by "first non-null value". In your example you have both string and array values, and none of them is null. It would be helpful if you provided more examples and also the expected output.
As a first step towards a solution, here's a way to filter out the null values from traits:
If you set the type of the traits column to map<string,string> you should be able to do something like this:
SELECT
request_id,
MAP_AGG(trait_key, trait_value) AS trait
FROM (
SELECT
request_id,
trait_key,
trait_value
FROM some_table CROSS JOIN UNNEST (trait) AS t (trait_key, trait_value)
WHERE trait_value IS NOT NULL
)
GROUP BY request_id
However, if you want to also filter values that are arrays and pick out the first non-null value, that becomes more complex. It could probably be done with a combination of casts to JSON, the filter function, and COALESCE.
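The UNNEST-then-reaggregate approach amounts to exploding each traits map into key/value pairs, dropping null values, and keeping the first surviving value per key. A small Python sketch of that logic (a hypothetical helper operating on already-parsed dicts, one per row in arrival order):

```python
def first_non_null_traits(rows):
    """For each trait key, keep the first non-null value seen across rows
    (in row order) -- the behaviour the min_by/first_value sampling is
    after, with nulls filtered out first."""
    result = {}
    for traits in rows:
        for key, value in traits.items():
            if value is not None and key not in result:
                result[key] = value
    return result

rows = [
    {"this": None, "dynamic": "json"},
    {"this": "is", "as": ["defined", "by", "the", "client"]},
]
print(first_non_null_traits(rows))
```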

How to get a json object with specific key values, from a json array column?

In my MySQL 8.0 table, I have a JSON ARRAY column. It is an array of JSON objects. I want to pick one object out of each row's array, based on the key value pairs in the objects.
Example row:
[{"bool": false, "number": 0, "value": "hello"},
 {"bool": true, "number": 1, "value": "world"},
 {"bool": true, "number": 2, "value": "foo"},
 {"bool": false, "number": 1, "value": "bar"}]
What I am trying to do is get the 'value' WHERE bool=true, AND number=1. So I want a query that in this example returns 'world'.
What would also work is if I could get the index of the object where bool=true and number=1, in this example it would return '$[1]'.
I am trying to run a query across the whole column, setting a new column to the value returned from the query. Is this possible with MySQL JSON functions? I've looked at the references but none have objects inside arrays like my example.
EDIT: If I do
SELECT JSON_SEARCH(column->"$[*]", 'all', '1');
SELECT JSON_SEARCH(names->"$[*]", 'all', 'true');
I get the paths/indexes of objects where number=1, and where bool=true, respectively. I would like the overlap of these two results.
You can use JSON_TABLE to convert the JSON into a derived table which you can then extract values from:
SELECT j.value
FROM test t
JOIN JSON_TABLE(t.jsonStr,
'$[*]'
COLUMNS(bool BOOLEAN PATH '$.bool',
number INT PATH '$.number',
value VARCHAR(20) PATH '$.value')) j
WHERE j.bool = true AND j.number = 1
Output:
value
world
If you also want to get the index within each JSON value of the value which matched, you can add a FOR ORDINALITY clause to your JSON_TABLE e.g.:
SELECT j.idx, j.value
FROM test t
JOIN JSON_TABLE(t.jsonStr,
'$[*]'
COLUMNS(idx FOR ORDINALITY,
bool BOOLEAN PATH '$.bool',
number INT PATH '$.number',
value VARCHAR(20) PATH '$.value')) j
WHERE j.bool = true AND j.number = 1
Output:
idx value
2 world
Demo on dbfiddle
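What JSON_TABLE does here — flatten the array into rows (with a 1-based ordinal) and filter — can be sketched in Python (a hypothetical helper, not the MySQL API):

```python
import json

def find_value(json_str, want_bool, want_number):
    """Flatten the array into (index, element) pairs and filter, like
    JSON_TABLE with FOR ORDINALITY plus the WHERE clause."""
    # FOR ORDINALITY counts from 1, hence start=1
    for idx, row in enumerate(json.loads(json_str), start=1):
        if row["bool"] is want_bool and row["number"] == want_number:
            return idx, row["value"]
    return None

doc = json.dumps([
    {"bool": False, "number": 0, "value": "hello"},
    {"bool": True,  "number": 1, "value": "world"},
    {"bool": True,  "number": 2, "value": "foo"},
    {"bool": False, "number": 1, "value": "bar"},
])
print(find_value(doc, True, 1))  # (2, 'world')
```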

Math calculating with two JSONB keys in Postgres?

I have JSONB column with data like:
{"plays": {"win": 90, "draw": 8, "lose": 2}}
How can I calculate the sum of the win and draw keys?
Something like:
SELECT
data::json#>>'{plays,draw}' + data::json#>>'{plays,win}' as "total_plays",
FROM
plays_data;
Let's assume your table and data are the following (note that I have avoided calling any column json, to avoid confusion between column names and types; this is recommended practice):
CREATE TABLE data
(
some_data json
) ;
INSERT INTO data
(some_data)
VALUES
('{"plays": {"win": 90, "draw": 8, "lose": 2}}') ;
You need to use the following query:
SELECT
CAST(some_data->'plays'->>'win' AS INTEGER) + CAST(some_data->'plays'->>'draw' AS INTEGER) AS total_plays
FROM
data ;
| total_plays |
| ----------: |
| 98 |
Explanation:
The -> operator, applied to a JSON column (a JSON object) on the left and a string on the right, gives back the corresponding field as a JSON value (which might itself be an object, an array, or a scalar).
The ->> operator gives back the field value as text. PostgreSQL does not try to guess whether that text represents a string, number, or boolean; you always get text.
CAST(expression AS type) converts the expression to the specified type. JavaScript will happily use a number or a string and cast from one to the other as needed; PostgreSQL must usually be told explicitly how an expression should be interpreted. And where JavaScript doesn't distinguish floats from integers, PostgreSQL needs you to be that specific.
You can check everything at dbfiddle here
Reference:
PostgreSQL JSON functions and operators
CAST
SELECT
*, j.draw + j.win as total_plays
FROM
plays_data,
json_to_record(data->'plays') as j(win int, draw int, lose int);
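The extract-cast-add step is easy to mirror outside the database; a small Python sketch of what total_plays computes (the function name is illustrative):

```python
import json

def total_plays(some_data):
    """Sum the win and draw counts, mirroring
    CAST(...->>'win' AS INTEGER) + CAST(...->>'draw' AS INTEGER)."""
    plays = json.loads(some_data)["plays"]
    # ->> yields text, so the SQL needs explicit casts; json.loads
    # already hands us ints, so int() is a no-op safety net here
    return int(plays["win"]) + int(plays["draw"])

print(total_plays('{"plays": {"win": 90, "draw": 8, "lose": 2}}'))  # 98
```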

Trying to query a JSON array of non-objects in Postgresql 9.3

I'm trying to query a table with a JSON column which will always hold an array of "primitive" values (i.e. integers, strings, booleans -- not objects or arrays).
My query should be similar to [ref2], but I can't do ->>'id' because I'm not trying to access a JSON object but the value itself.
In the [ref1] fiddle (a blatant fork of the above), there's an incomplete query... I'd like to query all things which contain 3 among their values.
Moreover, I'd like some rows to have arrays of strings, other rows arrays of integers, and others arrays of booleans, so casting is undesirable.
I believe ->> would give me the underlying value, but I'm not accessing a key of an object; I need the elements themselves... That is, when my JSON value is [1,2,3,4], json_array_elements yields e.g. 2, but that is of type json according to my tests.
Upgrading to 9.4 is planned in the near future, but I haven't read anything yet that gave me a clue jsonb would help me.
UPDATE: at the moment, I'm (1) making sure all values are integers (mapping non-integer values to integers), which is suboptimal; (2) querying like this:
SELECT *
FROM things, json_array_elements(things.values) AS vals
WHERE vals.value::text::integer IN (1,2,3);
I need the double casting (otherwise it complains that cannot cast type json to integer).
ref1: http://sqlfiddle.com/#!15/5febb/1
ref2: How to query an array of JSON in PostgreSQL 9.3?
Rather than using json_array_elements you can unpack the array with generate_series, using the ->> operator to extract a text representation.
SELECT things.*
FROM things
CROSS JOIN generate_series(0, json_array_length(values) - 1) AS idx
WHERE values ->> idx = '1'
GROUP BY things.id;
This is a workaround for the lack of json_array_elements_text in 9.3.
You need an operator(=) for json to do this without either messing with casting or relying on the specific textual representations of integers, booleans, etc. operator(=) is only available for jsonb. So in 9.3 you're stuck with using the text representation (so 1.00 won't = 1) or casting to a PostgreSQL type based on the element type.
In 9.4 you could use to_json and the jsonb operator(=), e.g.:
SELECT things.*
FROM things
CROSS JOIN generate_series(0, json_array_length(values) - 1) AS idx
WHERE (values -> idx)::jsonb = to_json(1)::jsonb
GROUP BY things.id;
id | date | values
----+-------------------------------+---------
1 | 2015-08-09 04:54:38.541989+08 | [1,2,3]
(1 row)
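The point of the jsonb operator(=) trick is to compare elements as parsed JSON values rather than as text or as a cast Postgres type. A rough Python analogue (a hypothetical helper; comparing parsed values sidesteps textual representations, so 1.00 does match 1 — with the caveat that Python, unlike JSON, treats True == 1, so bools need separate handling):

```python
import json

def contains_value(array_json, needle):
    """True if the JSON array contains `needle`, comparing parsed values
    (roughly (values -> idx)::jsonb = to_json(needle)::jsonb)."""
    return any(
        # match on value, but keep JSON's bool/number distinction,
        # since in Python True == 1
        el == needle and isinstance(el, bool) == isinstance(needle, bool)
        for el in json.loads(array_json)
    )

print(contains_value('[1, 2, 3]', 1))   # True
print(contains_value('["a", "b"]', 1))  # False
print(contains_value('[1.00, 2]', 1))   # True: 1.00 == 1 after parsing
print(contains_value('[true]', 1))      # False: JSON true is not 1
```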