JSON split in Postgres

How can we split this in a single query? Any ideas or suggestions?
{"_xxx8430yyy59_xx":{"label":"Campaign Codes"},"_zzz984ggg4110_zzz":{"label":"NL Discount Codes"},"_ttt9843hhh160_ttt":{"label":"OOP Discount Codes"},"_ddd984393lll3_ddd":{"label":"Influencer Codes (Offer)"} }
I have tried to do this in multiple statements, but was not successful.
WITH CTE AS (
    SELECT *, unnest(string_to_array(value, ',')) AS parts
    FROM config_data
)
SELECT parts FROM CTE
Expected output:
ID                    label
_xxx8430yyy59_xx      Campaign Codes
_zzz984ggg4110_zzz    NL Discount Codes
_ttt9843hhh160_ttt    OOP Discount Codes
_ddd984393lll3_ddd    Influencer Codes (Offer)

You don't string-split anything; you use the JSON processing functions that are built into Postgres:
SELECT key AS "ID", obj.value ->> 'label' AS label
FROM config_data,
     jsonb_each(config_data.value::jsonb) AS obj(key, value)
You should also change the data type of the value column in your table to be jsonb/json, if you haven't already.
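For reference, here is a self-contained version of that query which can be run without creating the table, with the JSON from the question inlined (the CTE simply stands in for config_data):
WITH config_data (value) AS (
    VALUES ('{"_xxx8430yyy59_xx":{"label":"Campaign Codes"},"_zzz984ggg4110_zzz":{"label":"NL Discount Codes"},"_ttt9843hhh160_ttt":{"label":"OOP Discount Codes"},"_ddd984393lll3_ddd":{"label":"Influencer Codes (Offer)"}}')
)
SELECT key AS "ID", obj.value ->> 'label' AS label
FROM config_data,
     jsonb_each(config_data.value::jsonb) AS obj(key, value)
This returns exactly the four ID/label rows from the expected output.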

Related

Snowflake lateral flatten data types

I have a table containing an id column and a json column (variant data type). I want to flatten the data, keep the value column as a variant, assign each value in the value column a data type when a condition is met, and then eventually pivot the data so that each column has the correct data type.
Example code that doesn't work:
with cte as (
    select
        1 as id,
        parse_json('{
            "field1": "TRUE",
            "field2": "some string",
            "field3": "1.035",
            "field4": "097334"
        }') as my_output
)
select
    id,
    key,
    to_variant(
        case
            when value in ('true', 'false') then value::boolean
            when value like ('1.0') then value::decimal
            else value::string
        end) as value
from cte, lateral flatten(my_output)
Ultimately, I'd like to pivot the data and have a wide table with columns id, field1, field2, etc. where field1 is boolean, field2 is string, field3 is a decimal etc.
This is just a simple example, instead of 4 fields, I'm dealing with hundreds.
Is this possible?
For the pivot, I'm using dbt_utils.get_column_values to get the column names dynamically. I'd really prefer a solution that doesn't involve listing out the column names, especially since there are hundreds.
Since you'd have to define each column in your PIVOT statement, anyway, it'd probably be much easier to simply select each attribute directly and cast to the correct data type, rather than using a lateral flatten.
select
    my_output.field1::boolean,
    my_output.field2::string,
    my_output.field3::decimal(5,3),
    my_output.field4::string
from cte;
Alternatively, if you want this to be dynamically created, you could create a stored procedure that dynamically uses your json to create a view over your table that has this select in it.
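A rough sketch of what such a procedure could look like in Snowflake Scripting (my_table, my_output, my_view, and the blanket ::string cast are all assumptions; real per-key type mapping would go into the listagg expression):
create or replace procedure build_view_from_json()
returns varchar
language sql
as
$$
declare
  cols varchar;
  stmt varchar;
begin
  -- collect every distinct key across the table and build one casted
  -- select-list entry per key, e.g. my_output:field1::string as "field1"
  select listagg('my_output:' || key || '::string as "' || key || '"', ', ')
    into :cols
  from (
    select distinct f.key
    from my_table, lateral flatten(input => my_output) f
  );
  -- create a view exposing one column per json key
  stmt := 'create or replace view my_view as select id, ' || cols || ' from my_table';
  execute immediate :stmt;
  return stmt;
end;
$$;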
The solution ended up being:
select
    id,
    key,
    ifnull(try_parse_json(value), value) as value_mod,
    typeof(value_mod)
from cte, lateral flatten(my_output)
Leading zeros are removed, so things like zip codes have to be accounted for.
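One way to account for them (a sketch; treating field4 as the zip-like field is an assumption, you'd list whichever keys must stay text):
select
    id,
    key,
    case
        when key in ('field4') then value  -- assumed zip-like keys: keep as text
        else ifnull(try_parse_json(value), value)
    end as value_mod,
    typeof(value_mod)
from cte, lateral flatten(my_output)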

How to extract a value from JSON that repeats multiple times?

I have the following table:
I need to create a select that returns me something like this:
I have tried this code:
SELECT Code, json_extract_path(Registers::json,'sales', 'name')
FROM tbl_registers
The previous code returns NULL from json_extract_path. I have also tried the operator chain ::json->'sales'->>'name', but that doesn't work either.
You need to unnest the array, and then aggregate the names back. This can be done using json_array_elements with a scalar sub-query:
select code,
(select string_agg(e ->> 'name', ',')
from json_array_elements(t.products) as x(e)) as products
from tbl_registers t;
I would also strongly recommend changing your column's type to jsonb
step-by-step demo: db<>fiddle
SELECT
    code,
    string_agg(             -- 3
        elems ->> 'name',   -- 2
        ','
    ) as products
FROM tbl_registers,
     json_array_elements(products::json) as elems -- 1
GROUP BY code
1. If you have type text (strictly not recommended; please use the appropriate data type json or jsonb), you need to cast it into type json first (I guess you have type text because you already do the cast in your example code). Then extract the array elements into one row per element.
2. Fetch the name value.
3. Reaggregate by grouping and use string_agg() to create the string list.
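For anyone who wants to experiment without the table, here is a self-contained version with assumed sample rows (the ::json cast is dropped because the sample column is already json):
with tbl_registers (code, products) as (
    values (1, '[{"name":"Product1"},{"name":"Product2"}]'::json),
           (2, '[{"name":"Product3"}]'::json)
)
select
    code,
    string_agg(elems ->> 'name', ',') as products
from tbl_registers,
     json_array_elements(products) as elems
group by code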

PostgreSQL avg on JSON query

I have this query that gives me a list of values from a JSON type column. Now I have two questions:
Is this approach correct to access nested JSON elements?
How would I now get the average of my values?
select json_extract_path_text(json_extract_path(report_data, 'outer_key'), 'inner_key') as values
from report
where serial_number like '%123456%';
Given that inner_key is a number, you can simply cast it to a numeric type:
select avg((report_data->'outer_key'->>'inner_key')::float8)
from report
where serial_number like '%123456%';
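A quick self-contained check of the cast and the aggregate (sample data assumed):
with report (serial_number, report_data) as (
    values ('SN-123456', '{"outer_key": {"inner_key": "1.5"}}'::json),
           ('SN-123456', '{"outer_key": {"inner_key": "2.5"}}'::json)
)
select avg((report_data -> 'outer_key' ->> 'inner_key')::float8) as avg_value
from report
where serial_number like '%123456%';
-- avg_value = 2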

'SUM' is not a recognized built-in function name when converting VARCHAR to DECIMAL

I am new to the community so please bear with me. I am working on a sum function that takes the values of 3 columns (ExchangeFee, Commission, OtherFees) and gives me the total per row. The datatypes for these 3 fields are VARCHAR. I started by using a CONVERT function and then addressed any NULLs. Please see the query below:
SELECT SUM(
    (SELECT (SELECT
        CONVERT(decimal(18,4), isnull(ExchangeFee, 0)) AS decimal
     FROM T_TABLE) AS EXCHANGE_VALUE) +
    (SELECT (SELECT
        CONVERT(decimal(18,4), isnull(Commission, 0)) AS decimal
     FROM T_TABLE) AS COMMISSION_VALUE) +
    (SELECT (SELECT
        CONVERT(decimal(18,4), isnull(OtherFees, 0)) AS decimal
     FROM T_TABLE) AS OTHERFEES_VALUE) AS decimal) AS SUMMED_VALUE
When running this query, I get the message
'SUM' is not a recognized built-in function name.
Please let me know your thoughts.
You could start by using the correct data types for your fields.
ExchangeFee, Commission and OtherFees are all numeric, so why store them in a varchar?
If the values should never be NULL, and here it looks like they probably shouldn't be, set them as NOT NULL and default them to 0.
That said, MySQL will convert strings to numbers in a numerical context, so you only need to worry about NULL values, which COALESCE or IFNULL will deal with.
As for the query in which you want to sum the rows: since all of the data is coming from T_TABLE, the general structure of the query should be:
SELECT COALESCE(ExchangeFee,0) + COALESCE(Commission,0) + COALESCE(OtherFees,0) AS SUMMED_VALUE
FROM T_TABLE;
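And if the goal is one grand total over the whole table rather than a sum per row, wrap the same expression in SUM (a sketch, relying on the same implicit string-to-number conversion):
SELECT SUM(COALESCE(ExchangeFee,0) + COALESCE(Commission,0) + COALESCE(OtherFees,0)) AS TOTAL_VALUE
FROM T_TABLE;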

SQL search with REGEX instead with BETWEEN operator

I have a MySQL database with a table of ads. In one field of that table, data is saved in JSON format. Within that JSON data there is a key whose value contains a price (with decimals).
That field (named, for example, ad_data) contains JSON data like this:
{"single_input_51":"Ad 44 test.","price":"20.00","single_input_4":"ad test title, ad tes title, .","single_input_11":"8.8.2015.","single_input_5":"video test","single_input_6":"https://www.youtube.com/watch?v=nlTPeCs2puw"}
I would like to search within that field for a given price range. If, for example, a user sets in an HTML form that he wants to search from 100.00 to 755.00, the SQL should return only the rows where that JSON field contains a price between 100.00 and 755.00.
So basically, I want to write something like this with a REGEX in SQL against the JSON-formatted contents of that field (the numbers here are just examples; I must be able to do this for any starting and closing decimal number, and I will pass the numbers programmatically):
SELECT id, price FROM ads WHERE price BETWEEN 100.00 AND 755.00
What would be the SQL command for that search via REGEX?
Don't use REGEX for doing the match; that will be painful. If you had a particular range of prices you were looking for, it might be doable, but dynamically generating a regular expression that "works" for any specified range of prices, when the price could be two, three or more characters, is going to be hard. (The REGEXP function in MySQL only returns a boolean indicating whether a match was found or not; it won't return the portion of the string that was matched.)
If I had to do a comparison on "price", I would parse the value for price out of the string, cast that to a numeric value, and then do a comparison on that.
For example:
SELECT t.col
FROM mytable t
WHERE SUBSTRING_INDEX(SUBSTRING_INDEX(t.col,'"price":"',-1),'"',1) + 0
BETWEEN 100.00 AND 755.00
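To see what the nested SUBSTRING_INDEX calls are doing, here is the extraction run in isolation against the sample string from the question:
SELECT SUBSTRING_INDEX(
           SUBSTRING_INDEX(
               '{"single_input_51":"Ad 44 test.","price":"20.00","single_input_4":"ad test title"}',
               '"price":"', -1),  -- keep everything after the '"price":"' marker
           '"', 1)                -- keep everything up to the next double quote
       + 0 AS price_value;       -- + 0 coerces the text '20.00' to the number 20.00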
To answer the question you asked: what expression would you use to perform this match using a REGEX...
For "price between 100.00 and 755.00", using MySQL REGEXP, the regular expression you would need would be something like the second expression in the SELECT list of this query:
SELECT t.col
     , t.col REGEXP '"price":"([1-6][0-9][0-9]\.[0-9][0-9]|7[0-4][0-9]\.[0-9][0-9]|75[0-4]\.[0-9][0-9]|755\.00)"' AS _match
  FROM ( SELECT 'no' AS col
         UNION ALL SELECT 'no "price":"14.00" def'
         UNION ALL SELECT 'no "price":"99.99" def'
         UNION ALL SELECT 'ok "price":"100.00" def'
         UNION ALL SELECT 'ok "price":"699.99" def'
         UNION ALL SELECT 'ok "price":"703.33" def'
         UNION ALL SELECT 'ok "price":"743.15" def'
         UNION ALL SELECT 'ok "price":"754.99" def'
         UNION ALL SELECT 'no "price":"755.01" def'
       ) t
The regular expression in this example is almost a trivial example, because the price values we're matching all have three digits before the decimal point.
The string used for a regular expression would need to be crafted for each possible range of values. The crafting would need to take into account prices with different number of digits before the decimal point, and handle each of those separately.
For a range check of price between 95.55 and 1044.44, the regular expression would need to cover each of these sub-ranges separately:
95.55 thru 95.59        95\.5[5-9]
95.60 thru 95.99        95\.[6-9][0-9]
96.00 thru 99.99        9[6-9]\.[0-9][0-9]
100.00 thru 999.99      [1-9][0-9][0-9]\.[0-9][0-9]
1000.00 thru 1039.99    10[0-3][0-9]\.[0-9][0-9]
1040.00 thru 1043.99    104[0-3]\.[0-9][0-9]
1044.00 thru 1044.39    1044\.[0-3][0-9]
1044.40 thru 1044.44    1044\.4[0-4]
It could be done, but the code to generate the regular expression string won't be pretty. (And getting it fully tested won't be pretty either.)
(@spencer7593 has a good point; here's another one.)
Performance... If you have an index on that field (and the optimizer decides to use the index), then BETWEEN can be much faster than a REGEXP.
BETWEEN can use an index, thereby minimizing the number of rows to look at.
REGEXP always has to check all rows.
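If the MySQL version is 5.7 or later and the ad_data field holds valid JSON, another option along these lines is to materialize the price into a generated column and index it, so that BETWEEN gets a real index (the column and index names here are assumptions):
ALTER TABLE ads
    ADD COLUMN price_num DECIMAL(10,2)
        GENERATED ALWAYS AS (CAST(JSON_UNQUOTE(JSON_EXTRACT(ad_data, '$.price')) AS DECIMAL(10,2))) STORED,
    ADD INDEX idx_ads_price_num (price_num);

SELECT id, price_num
FROM ads
WHERE price_num BETWEEN 100.00 AND 755.00;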