mysql find numbers in query that are NOT in table - mysql

Is there a simple way to compare a list of numbers in my query to a column in a table to return the ones that are NOT in the db?
I have a comma separated list of numbers (1,57, 888, 99, 76, 490, etc etc) that I need to compare to the number column in a table in my DB. SOME of those numbers are in the table, some are not. I need the query to return those that are in my comma separated list, but are NOT in the DB...

I would put the list of numbers to be checked in a table of their own, then use WHERE NOT EXISTS to check whether they exist in the table to be queried. See this SQLFiddle demo for an example of how this might be accomplished:
If you're comfortable with this syntax, you can even avoid putting into a temp table:
SELECT * FROM (
SELECT 1 AS mycolumn
UNION
SELECT 2
UNION
SELECT 3
UNION
SELECT 4
UNION
SELECT 5
UNION
SELECT 6
UNION
SELECT 7
) a
WHERE NOT EXISTS ( SELECT 1 FROM mytable b
WHERE b.mycolumn = a.mycolumn )
UPDATE per comments from OP
If you can insert your very long list of numbers into a table, then query as follows to get the numbers that are not found in the other table:
SELECT mynumber
FROM mytableof37000numbers a
WHERE NOT EXISTS ( SELECT 1 FROM myothertable b
WHERE b.othernumber = a.mynumber)
Alternately
SELECT mynumber
FROM mytableof37000numbers a
WHERE a.mynumber NOT IN ( SELECT b.othernumber FROM myothertable b )
Hope this helps.

May be this is what you are looking for.
Convert your CSV to rows using SUBSTRING_INDEX. Use NOT IN operator to find the values which is not present in DB
Then Convert the result back to CSV using Group_Concat.
select group_concat(value) from(
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(t.a, ',', n.n), ',', -1) value
FROM csv t CROSS JOIN
(
SELECT a.N + b.N * 10 + 1 n
FROM
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
ORDER BY n
) n
WHERE n.n <= 1 + (LENGTH(t.a) - LENGTH(REPLACE(t.a, ',', '')))) ou
where value not in (select a from db)
SQLFIDDLE DEMO
CSV TO ROWS referred from this ANSWER

You could use the 'IN' clause of MySQL. Maybe check this out IN clause tutorial

Related

Separate field into different rows [duplicate]

This question already has answers here:
SQL split values to multiple rows
(12 answers)
Closed 3 years ago.
I have a table called tableA, and the id field may stored values with comma at one record.
When I use sql: "select id from tableA", I will get the below result.
I do not know how to separate a row record into several rows like below:
27
19
7
18
...
Is it any hints for the sql script? Thank you.
I'm not confident there isn't a nicer way, but maybe something like this is a good starting point:
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(tablea.id, ',', numbers.row), ',', -1)
FROM
tablea INNER JOIN
-- This numbers subquery is taken from #Unreason's answer in https://stackoverflow.com/questions/304461/generate-an-integer-sequence-in-mysql
(
SELECT #row := #row + 1 AS row FROM
(select 0 union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) t,
(select 0 union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) t2,
(SELECT #row:=0) n) numbers
on numbers.row <= LENGTH(tablea.id)-LENGTH(REPLACE(tablea.id, ',', ''))+1
Given tablea.id='27,19,8', SUBSTRING_INDEX(SUBSTRING_INDEX(tablea.id, ',', 1), ',', -1) will return the first element (27) and SUBSTRING_INDEX(SUBSTRING_INDEX(tablea.id, ',', 2), ',', -1) will return the second element (19), and so forth.
Therefore I join that with a list of numbers (in this case my list goes up to 100, so I am assuming a single tablea.id field never has more than 100 comma-separated values, although the details could be changed). The join condition uses LENGTH(tablea.id)-LENGTH(REPLACE(tablea.id, ',', ''))+1 which is a count of how many comma-separated values are in the field.
Here's a db fiddle of it working to play around with.

SQL Subtraction one by one

I was thinking of subtracting digits one by one but didn't find a way to implement it after a big effort.
Row 1: 100211210
Row 2: 100010220
Result: 000201010
And the result has to be non-negative.
Select SUBSTR(t1.row,1,1)-(t2.row,1, 1)|| SUBSTR(t1.row,2,2)-SUBSTR(t2.row,2, 2)||... so on from table t1 where t1.row NOT IN (Select row in table t2);
This will check in the same table if the row exists then it will skip if not subtract digit by digit or you can use loop in pl/sql by declaring values for substr as i,j for both and then subtracting.
select GROUP_CONCAT(CAST(ABS(substring('123456782',c.count,1)-substring('323456789',c.count,1)) AS CHAR) separator '')
from (select c1.1*10+c2.1 count from (select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9 union all select 0) c1,
(select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9 union all select 0) c2 order by count) c
where c.count>0 and c.count<=length('123456782')
2 string is the same length, and the last parameter is the lenght of the strings

how to split a string to 4 characters sub strings then concat them by hyphen?

I need to write a mysql query that does the following:
I have list of serial numbers in a database table, the query should only update serial number rows where the length of a serial number is more than 15, then each serial number is divided to 4-digits sub-strings separated by a hyphen. example :
'10028641232912356' should become '1002-8641-1232-9123-56'
I started with this, which only inserts a hyphen after first 4 :
SELECT serial_no, CHAR_LENGTH(TRIM(serial_no)) as 'length',
CONCAT_WS('-',SUBSTRING(TRIM(serial_no),1,4),
SUBSTRING(TRIM(serial_no),5,4)) as result
FROM pos where
serial_no is not null and CHAR_LENGTH(TRIM(serial_no))>=15 ;
this is only the select statement, at first I just want to get (as in select) the new format of the serial no then i'll update it, but since I dont know the exact length of each serial number I need to figure out this part:
CONCAT_WS('-',SUBSTRING(TRIM(serial_no),1,4)
This is must only be done using mysql functions
Any help is appreciated
Introduction
Since we've figured out that your strings can have any length; your issue boils down to how to split strings with any length in to it's chunks of some length in MySQL. Unfortunately there is a serious problem in MySQL that prevent us from doing this in "native" way. And this problem is - MySQL does not support sequences. That means you can not use some sort of internal iterator like construct to loop over your string.
But there is always a way
Building sequence
We can use a trick to do this. First part of trick: use CROSS JOIN to produce the desired row sets. If you are not aware how it works then I'll remind you. It will produce a Cartesian product of the two row sets. The second part of the trick: use a well known formula.
N = d1x101 + d2x102 + ...
Actually you can do that with any base, not just 10. For this demonstration I will use 10. Usage:
SELECT
n1.i+10*n2.i AS num
FROM
(SELECT 0 as i UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) as n1
CROSS JOIN
(SELECT 0 as i UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) as n2
So we're using UNION ALL for the numbers 0..9 to produce a multiplication part (our di). With the query above you'll get consecutive 0..99 numbers. This is because we're using base powers till 2 only, and 102=100. You can check fiddle to make sure that it will work properly.
Iterating through string
Now with these "pseudo-generator" we can emulate iteration through the string. To do that, there are MySQL variables that will be our iterator. Of course separation of string piece is a work for SUBSTR(). So basic skel is:
SELECT
SUBSTR(#str, #i:=#i+#len, #len) as chunk
FROM
(SELECT 0 as i UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) as n1
CROSS JOIN
(SELECT 0 as i UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) as n2
CROSS JOIN
(SELECT #str:='ABCD1234EFGH5678IJKL90', #len:=4, #i:=1-#len) as init
LIMIT 6;
(fiddle for sample above is here). We are just iterating through sequence and then using the iterator to create the correct offset. All that is left to do now is gather our string and insert the hyphens. Hopefully there is GROUP_CONCAT() in MySQL for that:
SELECT
GROUP_CONCAT(chunk SEPARATOR '-') AS string
FROM
(SELECT
SUBSTR(#str, #i:=#i+#len, #len) as chunk
FROM
(SELECT 0 as i UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) as n1
CROSS JOIN
(SELECT 0 as i UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) as n2
CROSS JOIN
(SELECT #str:='ABCD1234EFGH5678IJKL90', #len:=4, #i:=1-#len) as init
) AS data
WHERE chunk!='';
(again, fiddle is here).
With whole table
Now it is a sample and you want to select that from a table. It will be more complicated:
SELECT
serial_no,
GROUP_CONCAT(chunk SEPARATOR '-') AS serial
FROM
(SELECT
SUBSTR(
IF(#str=serial_no, #str, serial_no),
#i:=IF(#str=serial_no, #i+#len, 1),
#len
) AS chunk,
#str:=serial_no AS serial_no
FROM
(SELECT
serial_no
FROM
pos
CROSS JOIN
(SELECT 0 as i UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) as n1
CROSS JOIN
(SELECT 0 as i UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) as n2
ORDER BY
serial_no) AS data
CROSS JOIN
(SELECT #len:=4, #i:=1-#len) AS init) AS seq
WHERE
chunk!=''
GROUP BY
serial_no;
Fiddle is available here. There is another trick to produce the row sets. We'll use CROSS JOIN now instead of initializing string. So we should pass the current pos value of the serial_no. In order to make sure all the rows will be iterated properly we have to order them (that is done with inner ORDER BY).
Limitations
Well as you already know, sequence is limited with 99; thus, with our #len defined as 4 we'll be able to split only by a 4000 length string maximum. Also this will use the whole sequence in any case. Even if your string is much shorter (chances are it is). Thus performance impact may have place.
Is it worth the effort?
My point is: mostly, no. It may be ok to use it once, maybe. But it won't be re-usable and it won't be readable. Thus there is little sense in doing such string operations with DBMS functions because we have applications for such things. It should be used for that and using it you actually can create re-usable/scalable/or/whatever code.
Another way may be to create stored procedure in which we do the desired thing (so, split the string by given length & concatenate it with given delimiter). But honestly, it's just an attempt to "hide the problem". Because even if it would be re-usable, it still will have the same weakness; performance impact. Even more so if we're going to create code for the DBMS. Then again, why don't we place that code in the application? In 99% of cases DBMS is the place for data storage and the application is the place for code (e.g. logic). Mixing these two things almost always ends in a bad result.
Making an assumption that the serial number is unique then the following should do it:-
SELECT serial_no, GROUP_CONCAT(SUBSTRING(serial_no, anInt, 4) ORDER BY anInt SEPARATOR '-')
FROM
(
SELECT serial_no, anInt
FROM pos
CROSS JOIN
(
SELECT (4 * (units.i + 10 * tens.i + 100 * hundreds.i)) + 1 AS anInt
FROM (SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) units
CROSS JOIN (SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) tens
CROSS JOIN (SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) hundreds
) sub1
WHERE LENGTH(serial_no) >= anInt
AND LENGTH(serial_no) > 15
) sub2
GROUP BY serial_no
This will cope with a serial number up to ~4000 characters long.
This works by using a series of unioned fixed queries to get the numbers 0 to 9. This is cross joined against itself 3 times with the first being units, the next being tens and the last being hundreds (you can carry on adding more if you want). From these the numbers between 0 and 999 are generated, then multiplied by 4 and with 1 added (so giving 1 to 3997 in steps of 4), which is the starting position of each group of 4. The WHERE clause checks that this generated number is less than the length of serial_no (if it is you land up with duplicates), and that serial_no is longer than 15.
This will generate a list of all the numbers, each one repeated as many times as their are groups of 4 numbers (or partial groups), along with the start position of a group.
The outer SELECT then takes this list and uses substring to extract each group, and uses GROUP_CONCAT to join he results together again with '-' as the separator between each group. If also specifies the start position of each group as the order to join them again (would probably be fine without this, but I wouldn't guarantee it).
SQL fiddle here:-
http://www.sqlfiddle.com/#!2/eb2d0/2

calculating the word occurrence in mysql table

I am getting the reviews from different sites and storing into table. For each review I am getting adjective and noun list in separate column.
So for each review there are main 3 values here.
review, adjective_list, rate
Now I want to count number of times adjectives repeats. After that recommending only those review which has adjectives which repeats maximum time and having review 4-5.
Which is correct way to do this?
My thought about this:
Creating trigger which perform action when ever there is insert review operation.
This trigger will read column having adjectives, calculate the occurrence(don't know how?) and storing top adjectives with their occurrence.
While recommendation selecting adjective with maximum occurrence, and looking into 4-5 rated review.
I am not sure what is correct way. Any help is appreciable
Main table looks like this:
Not tested, but if I understand you requirement correctly you should be able to base a query on something like this to do the job:-
SELECT id, SUBSTRING_INDEX(SUBSTRING_INDEX(adj_noun, ',', aCnt + 1), ',', -1), COUNT(*)
FROM Main_Table
INNER JOIN
(
SELECT Units.i + Tends.i * 10 + Hundreds.i * 100 AS aCnt
(SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) Units
(SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) Tens
(SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) Hundreds
) Integers_Query
ON aCnt <= (LENGTH(adj_noun) - LENGTH(REPLACE(adj_noun, ',', '')))
GROUP BY id, SUBSTRING_INDEX(SUBSTRING_INDEX(adj_noun, ',', aCnt + 1), ',', -1)
This uses a subquery to get a range of numbers (0 to 999), and does a join of this against your table where the number is less than or equal to the number of time a comma appears in the adj_noun column (ie, subtract the length of adj_noun with all the commas removed from the full length of adj_noun). Then use SUBSTRING_INDEX to get the string up to the aCnt comma, and again use SUBSTRING_INDEX to get the string from that comma back to the previous comma (excludes the commas from the result).
The COUNT / GROUP BY should get you the number of times each word appears in the resulting list for each item.
Probably fairly inefficient. Only copes with 1000 comma separated words (easily extended, but will be slower).

Mysql explode function

I have a table witch has a field with the following record:
1,2,3,4,5,6
I would like to ask the following two things:
1) How can i make a foreign key in another table? The rule would be:
For any value seperated by comma in field `field_name` must be record of other_table.field_id
2) How can i do something like: SELECT explode(field) AS ex FROM table_name ?
the name's of row maybe can retrieve as ex[0]-->1, ex[1]-->2
While it is possible to do a join on a comma separated field (using FIND_IN_SET for example), I don't think there is a way to do this for a foreign key.
MySQL doesn't have an explode function, and your idea would seem to suggest a varying number of columns on each row.
You can split them onto different rows if necessary but it is ugly. And more a good reason to NOT use comma separated fields
SELECT DISTINCT SUBSTRING_INDEX(SUBSTRING_INDEX(field, ',', 1 + units.i + tens.i * 10 + hundreds.i * 100), ',', -1)
FROM table_name
CROSS JOIN (SELECT 0 AS i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) units
CROSS JOIN (SELECT 0 AS i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) tens
CROSS JOIN (SELECT 0 AS i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) hundreds