Redshift nested json extraction - json

I have a table with two columns, one column named user, one json column named js that looks like this:
{"1":{"partner_id":54,"provider_id":13},
"2":{"partner_id":56,"provider_id":8},
"3":{"partner_id":2719,"provider_id":274}}
I want to select all 'provider_id' values into one column, one row per user, so the result should look like this:
user| provider_ids
0001| 13,8,274
0002| 21,36,57,12
How can I do this? Thanks in advance!

The JSON format you provided is not easy to work with.
I created a table for test purposes:
create table json_test as
select '0001' as usr, '{"1":{"partner_id":54,"provider_id":13},
"2":{"partner_id":56,"provider_id":8},
"3":{"partner_id":2719,"provider_id":274}}'
as json_text
union all
select '0002' as usr, '{"1":{"partner_id":54,"provider_id":21},
"2":{"partner_id":56,"provider_id":36},
"2":{"partner_id":56,"provider_id":57},
"3":{"partner_id":2719,"provider_id":12}}'
as json_text;
Query to return results:
with NS AS (
select 1 as n union all
select 2 union all
select 3 union all
select 4 union all
select 5 union all
select 6 union all
select 7 union all
select 8 union all
select 9 union all
select 10
)
select usr,
listagg(trim(trim(split_part(split_part(js.json_text, '},', NS.n), '"provider_id":', 2)), '}'), ',') within group (order by NS.n) as provider_ids
from NS
join json_test js on NS.n <= regexp_count(js.json_text, '\\},') + 1
group by usr;
Notes:
1) Do not name a column "user", as it is a reserved keyword.
2) Add as many rows to the NS subquery as the maximum number of provider records found in any single JSON value.
3) Yes, I know, this isn't very readable SQL :D
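For note 2, one way to decide how many rows NS needs is to count the entries in the widest JSON value first. This is only a small sketch: it reuses the REGEXP_COUNT expression from the query above and assumes the json_test table created in this answer.

```sql
-- Largest number of provider entries found in any single row.
-- Every entry except the last is followed by '},', hence the + 1.
select max(regexp_count(json_text, '\\},') + 1) as max_entries
from json_test;
```

For the sample data this should report 4 (user 0002), so the ten rows in NS are more than enough.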

Related

mysql find numbers in query that are NOT in table

Is there a simple way to compare a list of numbers in my query to a column in a table to return the ones that are NOT in the db?
I have a comma separated list of numbers (1,57, 888, 99, 76, 490, etc etc) that I need to compare to the number column in a table in my DB. SOME of those numbers are in the table, some are not. I need the query to return those that are in my comma separated list, but are NOT in the DB...
I would put the list of numbers to be checked in a table of their own, then use WHERE NOT EXISTS to check whether they exist in the table to be queried. See this SQLFiddle demo for an example of how this might be accomplished:
If you're comfortable with this syntax, you can even avoid putting into a temp table:
SELECT * FROM (
SELECT 1 AS mycolumn
UNION
SELECT 2
UNION
SELECT 3
UNION
SELECT 4
UNION
SELECT 5
UNION
SELECT 6
UNION
SELECT 7
) a
WHERE NOT EXISTS ( SELECT 1 FROM mytable b
WHERE b.mycolumn = a.mycolumn )
UPDATE per comments from OP
If you can insert your very long list of numbers into a table, then query as follows to get the numbers that are not found in the other table:
SELECT mynumber
FROM mytableof37000numbers a
WHERE NOT EXISTS ( SELECT 1 FROM myothertable b
WHERE b.othernumber = a.mynumber)
Alternatively:
SELECT mynumber
FROM mytableof37000numbers a
WHERE a.mynumber NOT IN ( SELECT b.othernumber FROM myothertable b )
Hope this helps.
Maybe this is what you are looking for.
Convert your CSV to rows using SUBSTRING_INDEX. Use the NOT IN operator to find the values that are not present in the DB.
Then convert the result back to CSV using GROUP_CONCAT.
select group_concat(value) from(
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(t.a, ',', n.n), ',', -1) value
FROM csv t CROSS JOIN
(
SELECT a.N + b.N * 10 + 1 n
FROM
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
ORDER BY n
) n
WHERE n.n <= 1 + (LENGTH(t.a) - LENGTH(REPLACE(t.a, ',', '')))) ou
where value not in (select a from db)
SQLFIDDLE DEMO
CSV TO ROWS referred from this ANSWER
You could use the IN clause of MySQL. Maybe check out this IN clause tutorial.

Search for the existence of many objects in a database

Say I have a database with 5 million users, with the columns
id (unsigned int, auto-increment), facebook_id (unsigned int), and name (varchar)
In a program, I have a list of a variable amount of users from a person's facebook friend list (generally ranging from 500-1200 different facebook ids).
What's the most efficient way to send a query to my database that returns the facebook_id's of all of the users where that same facebook_id exists in the database?
Pseudo-code:
$friends = array(12345, 22345, 32345, 42345, 52345, ... ~1000 more);
$q = mysql_query("SELECT * FROM users ...");
$friendsAlreadyUsingApp = parseQuery($q);
This is the topic of an almost endless number of articles, blogs, Q&As, etc., and the essence of the problem is that it looks really simple but isn't.
The heart of the problem is that the parameter looks like it should work with WHERE field IN(), but it does not, because the parameter is a single string that just happens to have lots of commas in it.
So, when that parameter is passed to SQL, the single string has to be split into multiple parts so that the field can be compared to each part. This is where it gets a little complex, as not all database types have the same features to handle this. MySQL, for example, does not have the table variables that MS SQL Server provides.
So, a simple method for MySQL is this:
SET @param := '105,110,125,135,145,155,165,175,185,195,205';
SELECT
*
FROM Users
WHERE FIND_IN_SET(facebook_id, @param) > 0
;
FIND_IN_SET returns the index position of the first argument within the second argument.
Just how well this scales in your database I cannot tell, it might not be acceptable for parameters containing 1000+ id's.
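The difference between IN() and FIND_IN_SET on a single comma-separated string can be seen with literal values. This is a minimal sketch that assumes MySQL's usual string-to-number comparison rules; the id 110 is just an illustrative value taken from the list above.

```sql
-- A real list of values: 110 is compared with each element.
SELECT 110 IN (105, 110, 125) AS in_list;                  -- 1

-- One string that happens to contain commas: it is compared as a single
-- value (MySQL converts it to the number 105), so 110 is not found.
SELECT 110 IN ('105,110,125') AS in_single_string;         -- 0

-- FIND_IN_SET splits the string on commas and returns the 1-based position.
SELECT FIND_IN_SET(110, '105,110,125') AS position_in_set; -- 2
```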
So if text processing like FIND_IN_SET is too slow, then each id needs to be broken out from the parameter and inserted into a table. That way the resulting table can be used through an INNER JOIN to filter the users; but this requires a table and inserts which take time, and there may be concurrency issues if more than one user is attempting to use that table at the same time.
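A minimal sketch of that table-based approach, assuming a session-scoped temporary table named friend_ids (a hypothetical name, not from the question). In MySQL a TEMPORARY table is visible only to the connection that created it, which may sidestep the concurrency concern for this use.

```sql
-- Hypothetical helper table; it exists only for the current connection.
CREATE TEMPORARY TABLE friend_ids (
  facebook_id INT UNSIGNED NOT NULL PRIMARY KEY
);

-- The application inserts the friend list, ideally in multi-row batches.
INSERT INTO friend_ids (facebook_id)
VALUES (12345), (22345), (32345), (42345), (52345);

-- Friends who already exist in the users table.
SELECT u.facebook_id
FROM users u
INNER JOIN friend_ids f ON f.facebook_id = u.facebook_id;

DROP TEMPORARY TABLE friend_ids;
```

An index on users.facebook_id (not mentioned in the question, so an assumption) would let the join avoid scanning all 5 million rows.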
The following sets up a table of 10,000 integers (1 to 10,000):
/* Create a table called Numbers */
CREATE TABLE `Numbers`
(
`Number` int PRIMARY KEY
);
/* use cross joins to create 10,000 integers from 1 & store into table */
INSERT INTO Numbers (Number)
select 1 + (a.a + (10 * b.a) + (100 * c.a) + (1000 * d.a)) as N
from (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as a
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as b
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as c
cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as d
;
This "utility table" can then be used to split a comma-separated parameter into a derived table of individual integers; joining that derived table to your users table with an INNER JOIN gives the wanted result.
SET @param := '105,110,125,135,145,155,165,175,185,195,205';
SET @delimit := ',';
SELECT
users.id
, users.facebook_id
, users.name
FROM users
INNER JOIN (
SELECT
CAST(SUBSTRING(iq.param, n.number + 1, LOCATE(@delimit, iq.param, n.number + 1) - n.number - 1) AS UNSIGNED INTEGER) AS itemID
FROM (
SELECT
concat(@delimit, @param, @delimit) AS param
) AS iq
INNER JOIN Numbers n
ON n.Number < LENGTH(iq.param)
WHERE SUBSTRING(iq.param, n.number, 1) = @delimit
) AS derived
ON users.facebook_id = derived.itemID
;
This query can be used as the basis for a stored procedure which might be easier for you to call from PHP.
See this SQLFiddle demo
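As a rough illustration of the stored-procedure idea, the split-and-join query could be wrapped as below. The procedure name find_existing_friends and the parameter p_ids are hypothetical, and the sketch assumes the Numbers table created earlier and a comma as the delimiter.

```sql
-- DELIMITER is a mysql client command, needed so the ';' inside the body
-- does not end the CREATE PROCEDURE statement early.
DELIMITER //

CREATE PROCEDURE find_existing_friends(IN p_ids TEXT)
BEGIN
  SELECT u.id, u.facebook_id, u.name
  FROM users u
  INNER JOIN (
    -- Each delimiter position marks the start of one id in the string.
    SELECT CAST(SUBSTRING(iq.param, n.Number + 1,
                LOCATE(',', iq.param, n.Number + 1) - n.Number - 1)
                AS UNSIGNED INTEGER) AS itemID
    FROM (SELECT CONCAT(',', p_ids, ',') AS param) AS iq
    INNER JOIN Numbers n ON n.Number < LENGTH(iq.param)
    WHERE SUBSTRING(iq.param, n.Number, 1) = ','
  ) AS derived ON u.facebook_id = derived.itemID;
END //

DELIMITER ;

CALL find_existing_friends('105,110,125,135');
```

Note that Numbers must contain at least as many rows as the delimited string has characters. With 10,000 rows, a list of 1,000 long ids could exceed that, so the table may need one more cross join.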

MySQL select from custom set and compare with table data

Hi, I'm trying to find out which elements don't exist in my database. To do so, I want to compare a list of integers (output from an external script) with the data in a table. How can I do something like this:
SELECT * FROM (1,1,2,3,5,8,13...) l WHERE l NOT IN (select id from table1);
This is probably best done with a left outer join. But, your problem is creating the table of constants:
SELECT *
FROM (select 1 as id union all select 2 union all select 3 union all select 5 union all
select 8 union all select 13 union all select 21 . . .
) ids
where ids.id NOT IN (select id from table1);
This can have odd behavior, if table1.id is ever NULL. The following works more generally:
SELECT *
FROM (select 1 as id union all select 2 union all select 3 union all select 5 union all
select 8 union all select 13 union all select 21 . . .
) ids left outer join
table1 t1
on ids.id = t1.id
where t1.id is null;
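The NULL caveat with NOT IN can be reproduced with inline data. In this sketch the derived table t is a hypothetical stand-in for table1 that contains one NULL id.

```sql
-- 1 is not in the list, yet NOT IN returns no rows, because
-- 1 <> NULL evaluates to NULL rather than TRUE.
SELECT ids.id
FROM (SELECT 1 AS id UNION ALL SELECT 2) ids
WHERE ids.id NOT IN (SELECT t.id
                     FROM (SELECT 2 AS id UNION ALL SELECT NULL) t);
-- returns an empty result

-- The LEFT OUTER JOIN form still reports 1 as missing.
SELECT ids.id
FROM (SELECT 1 AS id UNION ALL SELECT 2) ids
LEFT OUTER JOIN (SELECT 2 AS id UNION ALL SELECT NULL) t ON ids.id = t.id
WHERE t.id IS NULL;
-- returns 1
```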
EDIT:
The size of a MySQL query is dictated by the parameter max_allowed_packet (see here). The most recent version has a limit of 1 Gbyte. You should be able to fit 18,000 rows of:
select <n> union all
into that limit, quite easily. Gosh, I don't even think it would be 1 megabyte. I would say, though, that passing a list of 18,000 ids through the application seems inefficient. It would be nice if one database could just pull the data from the other database, without going through the application.
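To check the limit on a particular server, the packet-size variable (named max_allowed_packet in MySQL) can be inspected directly. A short sketch:

```sql
-- Current limit in bytes for a single statement/packet on this server.
SHOW VARIABLES LIKE 'max_allowed_packet';

-- Same value, expressed in megabytes.
SELECT @@max_allowed_packet / 1024 / 1024 AS max_allowed_packet_mb;
```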
If the set to compare is huge, I'd recommend creating a temporary table myids with a single column id, putting all 18K values there, and running a query like this:
select id from myids where myids.id not in (select id from table1);
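A sketch of how the temporary table might be created and filled. The inserted values are placeholders for the real list, and the primary key assumes the ids have been de-duplicated first.

```sql
CREATE TEMPORARY TABLE myids (id INT NOT NULL PRIMARY KEY);

-- Multi-row inserts keep the number of statements small;
-- the values here are placeholders for the real 18K ids.
INSERT INTO myids (id) VALUES (1), (2), (3), (5), (8), (13);

-- NOT EXISTS gives the same answer as NOT IN, but is not affected
-- by NULL values in table1.id.
SELECT m.id
FROM myids m
WHERE NOT EXISTS (SELECT 1 FROM table1 t WHERE t.id = m.id);
```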

SQL query return what's NOT in table [duplicate]

Just wondering, is it possible to have a query that somehow tells you the values it did not find in a table?
So if I had a query SELECT * FROM mytable WHERE id IN (1,2,3,4,5,6,7,8,9) and only 2,3,6,7,9 were returned, I would like to know that 1,4,5,8 were not found.
It would be hard to do a manual comparison, because this is going to be run against approximately 2,000+ rows in a table (the ids are going to be provided via a CSV file which can be copied into the query).
Thanks in advance
This is probably silly, but what about creating a temporary table containing all your IDs, from which you subtract the result of your SELECT query?
Untested, but in theory:
Table 1:
+----+-----+
| id | num |
+----+-----+
Table 2:
+----+
| id |
+----+
Table 1 contains the data you're looking for (and num is any field containing any data)
Table 2 contains the IDs from the CSV
SQL:
-- IDs from the CSV (Table2) that have no matching row in Table1
SELECT `Table2`.`id`
FROM `Table2`
LEFT JOIN `Table1` ON `Table1`.`id` = `Table2`.`id`
WHERE `Table1`.`id` IS NULL
Quick solution: open your CSV file, replace all commas with " union select ", put "select " in front of that line, and use it as the first line of the query at the bottom.
So 1,2,3 becomes
Select 1 union select 2 union select 3
Use this in the query below
Select 1 union select 2 union select x -- replace this line with the line generated from your csv
Except
(
Select id from mytable
)
What about:
SELECT *
FROM (select 1 as f
UNION
SELECT 2 as f
UNION
SELECT 3 as f
UNION
SELECT 4 as f
UNION
SELECT 5 as f
UNION
SELECT 6 as f
UNION
SELECT 7 as f
UNION
SELECT 8 as f
UNION
SELECT 9 ) as s1
WHERE f NOT IN (SELECT id FROM mytable);

Counting word occurrences in a table column

I have a table with a varchar(255) field. I want to get (via a query, function, or SP) the number of occurrences of each word in a group of rows from this table.
If there are 2 rows with these fields:
"I like to eat bananas"
"I don't like to eat like a monkey"
I want to get
word | count()
---------------
like 3
eat 2
to 2
i 2
a 1
Any idea? I am using MySQL 5.2.
@Elad Meidar, I like your question and I found a solution:
SELECT SUM(total_count) as total, value
FROM (
SELECT count(*) AS total_count, REPLACE(REPLACE(REPLACE(x.value,'?',''),'.',''),'!','') as value
FROM (
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(t.sentence, ' ', n.n), ' ', -1) value
FROM table_name t CROSS JOIN
(
SELECT a.N + b.N * 10 + 1 n
FROM
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
ORDER BY n
) n
WHERE n.n <= 1 + (LENGTH(t.sentence) - LENGTH(REPLACE(t.sentence, ' ', '')))
ORDER BY value
) AS x
GROUP BY x.value
) AS y
GROUP BY value
Here is the full working fiddle: http://sqlfiddle.com/#!2/17481a/1
First we run a query to extract all words, as explained here by @peterm (follow his instructions if you want to customize the total number of words processed). Then we turn that into a subquery and COUNT and GROUP BY the value of each word. Finally, another query on top of that groups words that differ only by accompanying punctuation, using REPLACE; i.e. hello = hello!
I would recommend not to do this in SQL at all. You're loading DB with something that it isn't best at. Selecting a group of rows and doing frequency calculation on the application side will be easier to implement, will work faster and will be maintained with less issues/headaches.
You can try this slightly perverse way:
SELECT
(LENGTH(field) - LENGTH(REPLACE(field, 'word', ''))) / LENGTH('word') AS `count`
FROM table_name -- placeholder: the table that holds `field`
ORDER BY `count` DESC
This query can be very slow. Also, it looks pretty ugly.
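The length-difference trick can be checked on a literal string. In this sketch the sentence comes from the question and 'like' is the word being counted.

```sql
-- 'like' occurs twice in this sentence: removing it shortens the
-- string by 8 characters, and 8 / LENGTH('like') = 2.
SELECT (LENGTH('I don''t like to eat like a monkey')
        - LENGTH(REPLACE('I don''t like to eat like a monkey', 'like', '')))
       / LENGTH('like') AS `count`;
```

Two caveats: REPLACE matches substrings case-sensitively, so "likes" or "dislike" would also be counted while "Like" would not. LENGTH counts bytes, so CHAR_LENGTH may be the safer choice for multibyte text.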
I think you should handle this like indexing, with an additional table.
Whenever you create, update, or delete a row in your original table, you should update your indexing table. That indexing table should have two columns: the word and its number of occurrences.
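One possible shape for such an indexing table, as a sketch only. The table name word_counts and its column names are hypothetical, and splitting each sentence into words is assumed to happen in the application or in a trigger.

```sql
-- Hypothetical index table: one row per distinct word.
CREATE TABLE word_counts (
  word        VARCHAR(100) NOT NULL PRIMARY KEY,
  occurrences INT UNSIGNED NOT NULL DEFAULT 0
);

-- When a row containing the word 'like' is added: insert or increment.
INSERT INTO word_counts (word, occurrences)
VALUES ('like', 1)
ON DUPLICATE KEY UPDATE occurrences = occurrences + 1;

-- When such a row is deleted: decrement, then drop words that reach zero.
UPDATE word_counts SET occurrences = occurrences - 1
WHERE word = 'like' AND occurrences > 0;
DELETE FROM word_counts WHERE occurrences = 0;

-- The report becomes a plain read.
SELECT word, occurrences FROM word_counts ORDER BY occurrences DESC;
```

An update to a row in the original table can be handled as a delete of the old words followed by an insert of the new ones.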
I think you are trying to do too much with SQL if all the words are in one field of each row. I recommend to do any text processing/counting with your application after you grab the text fields from the db.