I have two tables, qs and local.
qs has two columns (actually built from several other columns) that are part of the comparison I need to do:
f1 | t1
abcdaa | abcdbb
local just has one column that's part of the comparison:
rangeA
abcd
I am trying to find the entries in qs that do not have a matching substring in local.
I've tried this in about a dozen different ways, and I must be missing something, since it's taking an unusually long time.
Here is the fastest method I've found so far:
CREATE TEMPORARY TABLE `tempB` SELECT f1, t1,
LEFT(f1,2) AS l2,LEFT(f1,3) AS l3,LEFT(f1,4) AS l4,LEFT(f1,5) AS l5,LEFT(f1,6) AS l6,LEFT(f1,7) AS l7,LEFT(f1,8) AS l8,
LEFT(f1,9) AS l9,LEFT(f1,10) AS l10,LEFT(f1,11) AS l11,LEFT(f1,12) AS l12,LEFT(f1,13) AS l13,
LEFT(t1,2) AS lt2,LEFT(t1,3) AS lt3,LEFT(t1,4) AS lt4,LEFT(t1,5) AS lt5,LEFT(t1,6) AS lt6,LEFT(t1,7) AS lt7,LEFT(t1,8) AS lt8,
LEFT(t1,9) AS lt9,LEFT(t1,10) AS lt10,LEFT(t1,11) AS lt11,LEFT(t1,12) AS lt12,LEFT(t1,13) AS lt13 FROM
(SELECT CONCAT(c1,n1,s1) AS f1, CONCAT(c1,n1,s2) AS t1 FROM qs WHERE c1 ='a')tab0 ORDER BY f1 ASC;
CREATE TEMPORARY TABLE `tempB2` SELECT rangeA FROM local WHERE rangeA LIKE 'a%' ORDER BY rangeA ASC;
CREATE TEMPORARY TABLE `tempB3` SELECT rangeA AS rangeAA FROM local WHERE rangeA LIKE 'a%' ORDER BY rangeA ASC;
SELECT f1,t1, rangeA, rangeAA FROM tempB
LEFT JOIN tempB2 ON rangeA IN(l2,l3,l4,l5,l6,l7,l8,l9,l10,l11,l12,l13)
LEFT JOIN tempB3 ON rangeAA IN(lt2,lt3,lt4,lt5,lt6,lt7,lt8,lt9,lt10,lt11,lt12,lt13)
WHERE rangeA IS NULL OR rangeAA IS NULL
Creating the temp tables is fast, and working on one leading character at a time (in this case 'a') significantly reduces the size of the datasets, but this is still very slow even with only a few hundred thousand rows in each temp table.
I've tried using just f1 and t1 with a
ON f1 LIKE CONCAT (rangeA,'%')
but that seemed to be even slower.
Any other ideas?
Note that rangeA is at least 2 characters long and at most 13 characters long, hence the LEFTs.
example data:
qs :
c1 | n1 | s1 | s2
ab | cd | aa | bb
bb | bbb | bb | bc
cbc | cc | cdd | ddd
ddd | e | ddf | def
local :
rangeA
abcd
bdddd
cbcccdd
dddedd
expected result:
f1 | t1 | f1match | t1match
bbbbbbb | bbbbbbc | NULL | NULL
cbccccdd | cbcccddd | NULL | cbcccdd
dddeddf | dddedef | dddedd | NULL
Thank you Paul Spiegel for making this work.
Let's set up some test data.
mysql> select * from qs;
+----+---------------+-------------------+
| id | f1 | t1 |
+----+---------------+-------------------+
| 6 | match1 | no match |
| 7 | match1 | match2 |
| 8 | foo match1 | match1 bar |
| 9 | no match | abc match2 123 |
| 10 | no match | no match |
| 11 | also no match | again not a match |
+----+---------------+-------------------+
mysql> select * from local;
+--------+
| rangeA |
+--------+
| match1 |
| match2 |
+--------+
And we expect only those rows in which neither f1 nor t1 matches any row in local.
+----+---------------+-------------------+
| id | f1 | t1 |
+----+---------------+-------------------+
| 10 | no match | no match |
| 11 | also no match | again not a match |
+----+---------------+-------------------+
UPDATE: Indexing qs(f1,t1) and local(rangeA) will help performance.
create index index_qs_fields on qs(f1,t1);
create index index_local_rangeA on local(rangeA);
instr finds a substring in a string, which simplifies many things.
We can do this with a left excluding join, that is, a join that returns only the rows on the left side (qs) which have no match on the right (local).
First we do a normal left join to check for matches.
select qs.*, rangeA
from qs
left join local on
instr(f1,rangeA) or
instr(t1,rangeA)
+----+---------------+-------------------+--------+
| id | f1 | t1 | rangeA |
+----+---------------+-------------------+--------+
| 6 | match1 | no match | match1 |
| 7 | match1 | match2 | match1 |
| 8 | foo match1 | match1 bar | match1 |
| 7 | match1 | match2 | match2 |
| 9 | no match | abc match2 123 | match2 |
| 10 | no match | no match | NULL |
| 11 | also no match | again not a match | NULL |
+----+---------------+-------------------+--------+
And turn it into an excluding join by filtering for only those which don't match at all.
select qs.*, rangeA
from qs
left join local on
instr(f1,rangeA) or
instr(t1,rangeA)
where rangeA is null
+----+---------------+-------------------+
| id | f1 | t1 |
+----+---------------+-------------------+
| 10 | no match | no match |
| 11 | also no match | again not a match |
+----+---------------+-------------------+
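For anyone who wants to try the excluding join without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. SQLite's instr() takes its arguments in the same order as MySQL's INSTR(), but collation and case-sensitivity rules differ between the two engines, so treat this as an illustration of the join shape rather than a benchmark.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE qs (id INTEGER PRIMARY KEY, f1 TEXT, t1 TEXT);
    CREATE TABLE local (rangeA TEXT);
    INSERT INTO qs VALUES
        (6, 'match1', 'no match'),
        (7, 'match1', 'match2'),
        (8, 'foo match1', 'match1 bar'),
        (9, 'no match', 'abc match2 123'),
        (10, 'no match', 'no match'),
        (11, 'also no match', 'again not a match');
    INSERT INTO local VALUES ('match1'), ('match2');
""")

# Left excluding join: keep only qs rows for which no local row matched.
unmatched = conn.execute("""
    SELECT qs.id, qs.f1, qs.t1
    FROM qs
    LEFT JOIN local
           ON instr(qs.f1, local.rangeA) OR instr(qs.t1, local.rangeA)
    WHERE local.rangeA IS NULL
    ORDER BY qs.id
""").fetchall()

print(unmatched)
```

On the test data this should leave only the rows with ids 10 and 11.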
UPDATE: Lots of entries in local can make this slow. We can try optimizing it by joining all the entries into one regular expression. This might be faster.
We can build the regex with group_concat, joining every rangeA with '|' so that they form a single alternation.
select group_concat(rangeA separator '|')
into @range_re
from local;
select qs.*
from qs
where not f1 regexp(@range_re) and not t1 regexp(@range_re);
Note that you'll need to be careful to escape regex characters in your matches.
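To show why the escaping matters, here is a small client-side sketch in Python. build_range_regex is a hypothetical helper name, and re.escape follows Python's regex syntax, which is not guaranteed to be the escaping MySQL's REGEXP expects, so this only illustrates the idea.

```python
import re

def build_range_regex(ranges):
    """Join literal strings into one alternation, escaping regex metacharacters."""
    return re.compile("|".join(re.escape(r) for r in ranges))

# 'a.c' and '1+1' contain characters that are special in a regex.
ranges = ["match1", "a.c", "1+1"]
range_re = build_range_regex(ranges)

rows = [("a.c here", "x"), ("abc here", "x"), ("x", "1+1=2"), ("x", "11")]

# Keep rows where neither column contains any of the literal strings.
unmatched = [(f1, t1) for f1, t1 in rows
             if not range_re.search(f1) and not range_re.search(t1)]

print(unmatched)
```

Without the escaping, the pattern a.c would also match 'abc here' and 1+1 would match '11', so both of those rows would be wrongly treated as matches.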
The original, way too complicated answer follows.
This query lists every pairing of a qs row with a local row in which neither f1 nor t1 contains rangeA.
select qs.id, f1, t1, rangeA
from qs
left join local on 1=1
where instr(f1,rangeA) = 0 and instr(t1,rangeA) = 0;
+----+---------------+-------------------+--------+
| id | f1 | t1 | rangeA |
+----+---------------+-------------------+--------+
| 6 | match1 | no match | match2 |
| 8 | foo match1 | match1 bar | match2 |
| 9 | no match | abc match2 123 | match1 |
| 10 | no match | no match | match1 |
| 10 | no match | no match | match2 |
| 11 | also no match | again not a match | match1 |
| 11 | also no match | again not a match | match2 |
+----+---------------+-------------------+--------+
But we want those which don't match all of local. We can do that by counting up how many times a row appears in our list of not matches.
select qs.id, f1, t1, count(id)
from qs
left join local on 1=1
where instr(f1,rangeA) = 0
and instr(t1,rangeA) = 0
group by qs.id;
+----+---------------+-------------------+-----------+
| id | f1 | t1 | count(id) |
+----+---------------+-------------------+-----------+
| 6 | match1 | no match | 1 |
| 8 | foo match1 | match1 bar | 1 |
| 9 | no match | abc match2 123 | 1 |
| 10 | no match | no match | 2 |
| 11 | also no match | again not a match | 2 |
+----+---------------+-------------------+-----------+
And then select only those whose count is the same as the number of rows in local.
mysql> select qs.id, f1, t1
from qs
left join local on 1=1
where instr(f1,rangeA) = 0
and instr(t1,rangeA) = 0
group by qs.id
having count(id) = (select count(*) from local);
+----+---------------+-------------------+
| id | f1 | t1 |
+----+---------------+-------------------+
| 10 | no match | no match |
| 11 | also no match | again not a match |
+----+---------------+-------------------+
Here's what I have so far, which works pretty well for <50k rows. Thank you to Schwern for the helpful discussion about INSTR().
CREATE TEMPORARY TABLE `tempB` SELECT f1, t1 FROM
(SELECT LEFT(CONCAT(c1,n1,s1),17) AS f1, LEFT(CONCAT(c1,n1,s2),17) AS t1 FROM qs WHERE c1 ='a')tab0 ORDER BY f1 ASC;
CREATE TEMPORARY TABLE `tempB2` SELECT rangeA FROM local WHERE rangeA LIKE 'a%' ORDER BY rangeA ASC;
CREATE TEMPORARY TABLE `tempB3` SELECT rangeA AS rangeAA FROM local WHERE rangeA LIKE 'a%' ORDER BY rangeA ASC;
SELECT f1,t1, rangeA, rangeAA FROM tempB
LEFT JOIN tempB2 ON INSTR(f1,rangeA) =1
LEFT JOIN tempB3 ON INSTR(t1,rangeAA) =1
WHERE rangeA IS NULL OR rangeAA IS NULL
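As a sanity check, the prefix test INSTR(...) = 1 can be run against the example data from the question. This is a minimal sketch using Python's sqlite3 module in place of MySQL; it skips the temp tables and the c1 = 'a' filter, uses SQLite's || operator where MySQL uses CONCAT(), and says nothing about performance.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE qs (c1 TEXT, n1 TEXT, s1 TEXT, s2 TEXT);
    CREATE TABLE local (rangeA TEXT);
    INSERT INTO qs VALUES
        ('ab', 'cd', 'aa', 'bb'),
        ('bb', 'bbb', 'bb', 'bc'),
        ('cbc', 'cc', 'cdd', 'ddd'),
        ('ddd', 'e', 'ddf', 'def');
    INSERT INTO local VALUES ('abcd'), ('bdddd'), ('cbcccdd'), ('dddedd');
""")

# instr(x, y) = 1 means y occurs at the very start of x, i.e. y is a prefix of x.
rows = conn.execute("""
    SELECT q.f1, q.t1, a.rangeA AS f1match, b.rangeA AS t1match
    FROM (SELECT c1 || n1 || s1 AS f1, c1 || n1 || s2 AS t1 FROM qs) AS q
    LEFT JOIN local AS a ON instr(q.f1, a.rangeA) = 1
    LEFT JOIN local AS b ON instr(q.t1, b.rangeA) = 1
    WHERE a.rangeA IS NULL OR b.rangeA IS NULL
    ORDER BY q.f1
""").fetchall()

for row in rows:
    print(row)
```

The three rows it returns line up with the expected result listed in the question; the abcdaa | abcdbb row is dropped because both columns start with abcd.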
If I understand your question correctly, I think you should look into using LOCATE() or POSITION(). I don't really see the need for all those LEFT() calls.
An overly simplified version of what I think you want is this:
CREATE TEMPORARY TABLE `tempB`
SELECT CONCAT(c1,n1,s1) AS f1, CONCAT(c1,n1,s2) AS t1 FROM qs ORDER BY f1 ASC;
CREATE TEMPORARY TABLE `tempB2` SELECT rangeA FROM local ;
SELECT tempB.f1, tempB.t1
from tempB
WHERE (SELECT COUNT(*) from tempB2
WHERE POSITION(rangeA IN tempB.f1) != 0 AND POSITION(rangeA IN tempB.t1) != 0) = 0;
I need to perform a SELECT SUM(formula) where the formula is stored in a field returned by another query.
Example:
table_A (the "formula" field contains, in each cell, an arithmetic expression involving columns from table B):
+------------+--------------+------------+
| Product_id | related_prod | formula |
+------------+--------------+------------+
| U1 | C2 | col2-col1 |
| U2 | C3 | col3-col2 |
| U3 | C4 | col3-col1 |
+------------+--------------+------------+
table_B:
+------------+---------+------------+----------+------+------+------+
| Product_id | year_id | company_id | month_id | col1 | col2 | col3 |
+------------+---------+------------+----------+------+------+------+
| C2 | 2017 | 1 | 2 | 100 | 200 | 300 |
| C3 | 2017 | 1 | 2 | 400 | 500 | 600 |
| C4 | 2017 | 1 | 2 | 700 | 800 | 900 |
+------------+---------+------------+----------+------+------+------+
I do, then, the following query:
SELECT
SUM(totals.relaz) as final_sum,
totals.relaz as 'col',
totals.prod as 'prod',
totals.cons as 'cons',
m.company_id, m.month_id, m.year_id FROM `table_B` m,
( SELECT formula as relaz,
related_prod as prod,
p.product_id as cons FROM table_A p )
AS totals
WHERE m.product_id=totals.prod
GROUP BY m.company_id, m.year_id, m.month_id, m.product_id, totals.cons
After the select I'd expect that, considering for example only product 'U1', the corresponding row would be
+-----------+-----------+------+------+------------+----------+---------+
| final_sum | col | prod | cons | company_id | month_id | year_id |
+-----------+-----------+------+------+------------+----------+---------+
| 100 | col2-col1 | C2 | U1 | 1 | 2 | 2017 |
+-----------+-----------+------+------+------------+----------+---------+
Instead, what I get is
+-----------+-----------+------+------+------------+----------+---------+
| final_sum | col | prod | cons | company_id | month_id | year_id |
+-----------+-----------+------+------+------------+----------+---------+
| 0 | col2-col1 | C2 | U1 | 1 | 2 | 2017 |
+-----------+-----------+------+------+------------+----------+---------+
i.e. the final_sum field is always set to 0, even though the 'col' field contains the correct expression.
What am I doing wrong?
Thank you in advance
Alex
You are trying to get a sum from a string column (table_A.formula). This will result in 0. MySQL/MariaDB will not try to convert the strings to column references and evaluate the formula in the string.
Another thing: you should list in GROUP BY all selected columns that are not inside an aggregate function.
To get the result you want, use:
SELECT
SUM(CASE
WHEN a.formula = 'col2-col1' THEN b.col2-b.col1
WHEN a.formula = 'col3-col1' THEN b.col3-b.col1
WHEN a.formula = 'col3-col2' THEN b.col3-b.col2
END
) AS final_sum,
a.formula as 'col',
a.related_prod as 'prod',
a.Product_id as 'cons',
b.company_id,
b.month_id,
b.year_id
FROM table_B b
JOIN table_A a on a.related_prod=b.Product_id
GROUP BY a.formula, a.related_prod, a.Product_id, b.company_id, b.month_id, b.year_id
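Here is a minimal sketch of the same CASE approach run against the question's sample data, using Python's sqlite3 module as a stand-in for MySQL/MariaDB. The query is trimmed to one total per table_A product so the output is easy to check by hand.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL/MariaDB (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_A (Product_id TEXT, related_prod TEXT, formula TEXT);
    CREATE TABLE table_B (Product_id TEXT, year_id INT, company_id INT,
                          month_id INT, col1 INT, col2 INT, col3 INT);
    INSERT INTO table_A VALUES
        ('U1', 'C2', 'col2-col1'),
        ('U2', 'C3', 'col3-col2'),
        ('U3', 'C4', 'col3-col1');
    INSERT INTO table_B VALUES
        ('C2', 2017, 1, 2, 100, 200, 300),
        ('C3', 2017, 1, 2, 400, 500, 600),
        ('C4', 2017, 1, 2, 700, 800, 900);
""")

# Each known formula string is mapped to a real column expression.
totals = conn.execute("""
    SELECT a.Product_id,
           SUM(CASE a.formula
                   WHEN 'col2-col1' THEN b.col2 - b.col1
                   WHEN 'col3-col1' THEN b.col3 - b.col1
                   WHEN 'col3-col2' THEN b.col3 - b.col2
               END) AS final_sum
    FROM table_B b
    JOIN table_A a ON a.related_prod = b.Product_id
    GROUP BY a.Product_id
    ORDER BY a.Product_id
""").fetchall()

print(totals)
```

U1 comes out as 100, which is the value the question expected. The obvious limitation is that every formula has to be listed in the CASE by hand.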
It may be possible to build a stored routine that fetches the string col2-col1 and inserts it (using CONCAT) into a string, then PREPAREs and EXECUTEs the SQL string.
That is, dynamically build the SQL, perhaps like in @slaakso's answer.
It would be messy.
I have needed something like this; I chose to do eval() in PHP, which was the client language. I use it for evaluating VARIABLES and GLOBAL STATUS. Example: Table_open_cache_misses / Uptime gives the "misses per second", which, if high, indicates the need for increasing the setting table_open_cache.
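A client-side variant of the "build the SQL dynamically" idea could look roughly like this. It is a sketch only: Python and sqlite3 stand in for the real client and server, ALLOWED_COLUMNS, formula_to_sql and final_sum are hypothetical names, and the formula is checked against a whitelist before being pasted into SQL, since splicing stored strings into a query is otherwise an injection risk.

```python
import re
import sqlite3

ALLOWED_COLUMNS = {"col1", "col2", "col3"}  # hypothetical whitelist of column names
FORMULA_RE = re.compile(r"^\s*(\w+)\s*([-+*/])\s*(\w+)\s*$")

def formula_to_sql(formula):
    """Turn a 'colX<op>colY' formula into a SQL expression, or raise ValueError."""
    m = FORMULA_RE.match(formula)
    if not m or m.group(1) not in ALLOWED_COLUMNS or m.group(3) not in ALLOWED_COLUMNS:
        raise ValueError(f"unsupported formula: {formula!r}")
    left, op, right = m.groups()
    return f"b.{left} {op} b.{right}"

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_A (Product_id TEXT, related_prod TEXT, formula TEXT);
    CREATE TABLE table_B (Product_id TEXT, col1 INT, col2 INT, col3 INT);
    INSERT INTO table_A VALUES ('U1', 'C2', 'col2-col1'), ('U3', 'C4', 'col3-col1');
    INSERT INTO table_B VALUES ('C2', 100, 200, 300), ('C4', 700, 800, 900);
""")

def final_sum(product_id):
    """Fetch the stored formula, validate it, then run it as part of a new query."""
    formula, related = conn.execute(
        "SELECT formula, related_prod FROM table_A WHERE Product_id = ?",
        (product_id,),
    ).fetchone()
    sql = f"SELECT SUM({formula_to_sql(formula)}) FROM table_B b WHERE b.Product_id = ?"
    return conn.execute(sql, (related,)).fetchone()[0]

print(final_sum("U1"), final_sum("U3"))
```

Anything that does not look like two whitelisted columns joined by one operator is rejected instead of being executed.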
I am working on a product sample inventory system where I track the movement of the products. Each product can have a status of "IN", "OUT" or "REMOVED". Each row of the table represents a new entry, where ID, status and date are unique. Each product also has a serial number.
I need help with a SQL query that will return all products that are currently "OUT". If I simply run SELECT * FROM table WHERE status = "OUT", it will return all products that ever had status OUT.
Every time a product comes in or goes out, I duplicate the last row of that specific product, change the status and update the date; the new row gets a new ID automatically.
Here is the table that I have:
id | serial_number | product | color | date | status
------------------------------------------------------------
1 | K0T4N | XYZ | silver | 2016-07-01 | IN
2 | X56Z7 | ABC | silver | 2016-07-01 | IN
3 | 96T4F | PQR | silver | 2016-07-01 | IN
4 | K0T4N | XYZ | silver | 2016-07-02 | OUT
5 | 96T4F | PQR | silver | 2016-07-03 | OUT
6 | F0P22 | DEF | silver | 2016-07-04 | OUT
7 | X56Z7 | ABC | silver | 2016-07-05 | OUT
8 | F0P22 | DEF | silver | 2016-07-06 | IN
9 | K0T4N | XYZ | silver | 2016-07-07 | IN
10 | X56Z7 | ABC | silver | 2016-07-08 | IN
11 | X56Z7 | ABC | silver | 2016-07-09 | REMOVED
12 | K0T4N | XYZ | silver | 2016-07-10 | OUT
13 | 96T4F | PQR | silver | 2016-07-11 | IN
14 | F0P22 | DEF | silver | 2016-07-12 | OUT
This query will give you all the latest records for each serial_number
SELECT a.* FROM your_table a
LEFT JOIN your_table b ON a.serial_number = b.serial_number AND a.id < b.id
WHERE b.serial_number IS NULL
Below query will give your expected result
SELECT a.* FROM your_table a
LEFT JOIN your_table b ON a.serial_number = b.serial_number AND a.id < b.id
WHERE b.serial_number IS NULL AND a.status LIKE 'OUT'
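To check the self-join against the question's table, here is a minimal sketch using Python's sqlite3 module instead of MySQL. The table name inventory is made up for the example, and the product, color and date columns are left out because the query does not use them.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (id INTEGER PRIMARY KEY, serial_number TEXT, status TEXT)")
conn.executemany("INSERT INTO inventory VALUES (?, ?, ?)", [
    (1, "K0T4N", "IN"), (2, "X56Z7", "IN"), (3, "96T4F", "IN"),
    (4, "K0T4N", "OUT"), (5, "96T4F", "OUT"), (6, "F0P22", "OUT"),
    (7, "X56Z7", "OUT"), (8, "F0P22", "IN"), (9, "K0T4N", "IN"),
    (10, "X56Z7", "IN"), (11, "X56Z7", "REMOVED"), (12, "K0T4N", "OUT"),
    (13, "96T4F", "IN"), (14, "F0P22", "OUT"),
])

# A row is the latest for its serial number when no row with a higher id exists.
latest = conn.execute("""
    SELECT a.id, a.serial_number, a.status
    FROM inventory a
    LEFT JOIN inventory b ON a.serial_number = b.serial_number AND a.id < b.id
    WHERE b.serial_number IS NULL
    ORDER BY a.id
""").fetchall()

# Same join, restricted to serial numbers whose latest status is OUT.
currently_out = conn.execute("""
    SELECT a.id, a.serial_number, a.status
    FROM inventory a
    LEFT JOIN inventory b ON a.serial_number = b.serial_number AND a.id < b.id
    WHERE b.serial_number IS NULL AND a.status = 'OUT'
    ORDER BY a.id
""").fetchall()

print(latest)
print(currently_out)
```

On this data only K0T4N (id 12) and F0P22 (id 14) are currently OUT; X56Z7 ends as REMOVED and 96T4F ends as IN.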
There are two good ways to do this. Which way is best in terms of performance can depend on various factors, so try both.
SELECT
t.*
FROM table t
LEFT OUTER JOIN table later_t
ON later_t.serial_number = t.serial_number
AND later_t.date > t.date
WHERE later_t.id IS NULL
AND t.status = "OUT"
Which column you check from later_t for IS NULL does not matter, so long as that column is declared NOT NULL in the table definition.
The other logically equivalent method is:
SELECT
t.*
FROM table t
INNER JOIN (
SELECT
serial_number,
MAX(date) AS date
FROM table
GROUP BY serial_number
) latest_t
ON latest_t.serial_number = t.serial_number
AND latest_t.date = t.date
WHERE t.status = "OUT"
For each of these queries, I strongly suggest the following index:
ALTER TABLE table
ADD INDEX `LatestSerialStatus` (serial_number,date)
I use this type of query a lot in my own work, and have the above index as the primary key on such tables. With it, this type of query is extremely fast.
See also the documentation on this query type.
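The two date-based query shapes from this answer can be compared side by side on the question's data. This is a sketch under a few assumptions: Python's sqlite3 module replaces MySQL, the table is called inventory because table is a reserved word, and each serial number has at most one row per date (which holds for the sample data and is what makes the two queries agree).

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE inventory (
        id INTEGER PRIMARY KEY,
        serial_number TEXT NOT NULL,
        date TEXT NOT NULL,
        status TEXT NOT NULL
    )
""")
conn.executemany("INSERT INTO inventory VALUES (?, ?, ?, ?)", [
    (1, "K0T4N", "2016-07-01", "IN"), (2, "X56Z7", "2016-07-01", "IN"),
    (3, "96T4F", "2016-07-01", "IN"), (4, "K0T4N", "2016-07-02", "OUT"),
    (5, "96T4F", "2016-07-03", "OUT"), (6, "F0P22", "2016-07-04", "OUT"),
    (7, "X56Z7", "2016-07-05", "OUT"), (8, "F0P22", "2016-07-06", "IN"),
    (9, "K0T4N", "2016-07-07", "IN"), (10, "X56Z7", "2016-07-08", "IN"),
    (11, "X56Z7", "2016-07-09", "REMOVED"), (12, "K0T4N", "2016-07-10", "OUT"),
    (13, "96T4F", "2016-07-11", "IN"), (14, "F0P22", "2016-07-12", "OUT"),
])
# The suggested index, in SQLite syntax.
conn.execute("CREATE INDEX LatestSerialStatus ON inventory (serial_number, date)")

# Variant 1: anti-join, keep rows that have no later row for the same serial number.
anti_join = conn.execute("""
    SELECT t.id, t.serial_number
    FROM inventory t
    LEFT OUTER JOIN inventory later_t
           ON later_t.serial_number = t.serial_number
          AND later_t.date > t.date
    WHERE later_t.id IS NULL
      AND t.status = 'OUT'
    ORDER BY t.id
""").fetchall()

# Variant 2: join against the latest date per serial number.
max_join = conn.execute("""
    SELECT t.id, t.serial_number
    FROM inventory t
    INNER JOIN (
        SELECT serial_number, MAX(date) AS date
        FROM inventory
        GROUP BY serial_number
    ) latest_t
           ON latest_t.serial_number = t.serial_number
          AND latest_t.date = t.date
    WHERE t.status = 'OUT'
    ORDER BY t.id
""").fetchall()

print(anti_join)
print(max_join)
```

Both variants return the same two rows here, so the choice between them comes down to how each performs on the real table.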
I'd like to query a table that stores its entries as a mix of rows and columns.
Here is the table:
| id | class | field | value |
|-----|-------|-------|-------|
| 1 | 1 | a | AA |
| 2 | 1 | b | BB |
| 3 | 1 | c | CC |
| 4 | 2 | a | DD |
| 5 | 2 | b | EE |
| 6 | 2 | c | FF |
What should be the query to get a result like:
a)
| class | new_a | new_c |
|-------|-------|-------|
| 1 | AA | CC |
| 2 | DD | FF |
My pseudo query I imagine it would be something like:
select class, value(where field=a) as new_a, value(where field=c) as new_c, from table;
b)
| class | new_a | new_c |
|-------|-------|-------|
| 2 | DD | FF |
For this one I guess it should be like:
select class, value(where field=a) as new_a, value(where field=c) as new_c, from table where class = '2';
Unfortunately I rarely use MySQL and I'm not sure how to build this query. All constructive suggestions are appreciated.
Try this query
For a) The query is
SELECT t1.class, t1.value as new_a, t2.value as new_c
FROM table t1
JOIN table t2 ON(t2.class=t1.class )
WHERE t1.field='a' AND t2.field='c'
For b) The query is
SELECT t1.class, t1.value as new_a, t2.value as new_c
FROM table t1
JOIN table t2 ON(t2.class=t1.class )
WHERE t1.field='a' AND t2.field='c' AND t1.class='2'
1) You are trying to convert rows into columns, so I joined the table to itself on the condition that both copies have the same 'class' value.
2) Then I added conditions for what to fetch from each copy: t1.field='a' and t2.field='c'.
3) In the second query you need only class '2', so I added the condition t1.class='2'.
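Here is a minimal sketch of the self-join on the question's data, using Python's sqlite3 module in place of MySQL and a made-up table name, entries. For comparison it also runs a conditional-aggregation query (MAX(CASE ...) with GROUP BY); that is a different technique from the one in this answer, shown only as an alternative that avoids the self-join.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, class INTEGER, field TEXT, value TEXT)")
conn.executemany("INSERT INTO entries VALUES (?, ?, ?, ?)", [
    (1, 1, "a", "AA"), (2, 1, "b", "BB"), (3, 1, "c", "CC"),
    (4, 2, "a", "DD"), (5, 2, "b", "EE"), (6, 2, "c", "FF"),
])

# Self-join: t1 supplies the 'a' row and t2 the 'c' row of the same class.
self_join = conn.execute("""
    SELECT t1.class, t1.value AS new_a, t2.value AS new_c
    FROM entries t1
    JOIN entries t2 ON t2.class = t1.class
    WHERE t1.field = 'a' AND t2.field = 'c'
    ORDER BY t1.class
""").fetchall()

# Alternative: one pass over the table, picking values with CASE inside MAX.
pivot = conn.execute("""
    SELECT class,
           MAX(CASE WHEN field = 'a' THEN value END) AS new_a,
           MAX(CASE WHEN field = 'c' THEN value END) AS new_c
    FROM entries
    GROUP BY class
    ORDER BY class
""").fetchall()

print(self_join)
print(pivot)
```

Adding AND t1.class = 2 to the first query (or WHERE class = 2 to the second) gives case b).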
I am sure this would be easy to google if I knew the right words to use, but I've tried and not come up with anything: apologies if this is a common question on SO.
I have one table which lists a set of records which can be one of 4 types.
table_1:
+-------+------------+------+
| id | value | type |
+-------+------------+------+
| 1 | x | 1 |
| 2 | y | 1 |
| 3 | z | 2 |
| 4 | a | 3 |
+-------+------------+------+
I have another table which references the id of this table and stores data
table_2:
+-------+------------+------+
| id | table_1_id |value |
+-------+------------+------+
| 1 | 4 | A |
| 2 | 2 | B |
| 3 | 3 | C |
| 4 | 2 | D |
+-------+------------+------+
I want to write a query that does the following:
"Find all the records from table 1 which are of type 1, take the ids of those records, and find all the records in table 2 whose 'table_1_id' matches one of those ids."
In the above very oversimplified table example that would result in the query returning records with ids 2 and 4 in table 2
Sounds like you're looking for IN:
select *
from table_2
where table_1_id in (select id from table_1 where type = 1)
Or perhaps you could JOIN the tables:
select t2.*
from table_2 t2
join table_1 t1 on t2.table_1_id = t1.id
where t1.type = 1
Joining the tables could result in duplicate records if the ids in the first table are not unique. Depends on your needs.
SELECT t1.value,t1.type,t2.value FROM table_1 t1,table_2 t2 WHERE t1.id = t2.table_1_id AND t1.type = 1;
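A quick way to convince yourself that the IN and JOIN forms agree on the sample data is to run both. This is a minimal sketch with Python's sqlite3 module standing in for MySQL, using the table names from the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (id INTEGER PRIMARY KEY, value TEXT, type INTEGER);
    CREATE TABLE table_2 (id INTEGER PRIMARY KEY, table_1_id INTEGER, value TEXT);
    INSERT INTO table_1 VALUES (1, 'x', 1), (2, 'y', 1), (3, 'z', 2), (4, 'a', 3);
    INSERT INTO table_2 VALUES (1, 4, 'A'), (2, 2, 'B'), (3, 3, 'C'), (4, 2, 'D');
""")

# Subquery form: table_2 rows whose table_1_id belongs to a type-1 record.
with_in = conn.execute("""
    SELECT id, table_1_id, value
    FROM table_2
    WHERE table_1_id IN (SELECT id FROM table_1 WHERE type = 1)
    ORDER BY id
""").fetchall()

# Join form: same rows, reached by joining on the referenced id.
with_join = conn.execute("""
    SELECT t2.id, t2.table_1_id, t2.value
    FROM table_2 t2
    JOIN table_1 t1 ON t2.table_1_id = t1.id
    WHERE t1.type = 1
    ORDER BY t2.id
""").fetchall()

print(with_in)
print(with_join)
```

Both return the table_2 rows with ids 2 and 4, as the question expects.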