SQL update multiple tables using inner join - MySQL

I have three tables "batch", "batchyield", "batchsop"
BATCH
|----------|--------------|----------------|-------|
| batch_id | batch_status | actual_produce | stage |
|----------|--------------|----------------|-------|
BATCHYIELD
|----------|----------------|
| batch_id | actual_harvest |
|----------|----------------|
BATCHSOP
|----------|----------------|
| batch_id | current_status |
|----------|----------------|
I am trying to update two of these tables at once; they are all connected by a foreign key.
I have written this SQL query for it:
UPDATE b SET
b.batch_status = 'completed', b.stage = 'flowering',
b.actual_produce = SUM(byl.actual_harvest),
bsop.current_status='3'
from igrow.farm_management_batch b
INNER JOIN igrow.farm_management_batchyield byl ON b.id = byl.batch_id
INNER JOIN igrow.sop_management_batchsopmanagement bsop ON b.id = bsop.batch_id
WHERE end_date < "2022-07-10 00:00:00.000000" and end_date is not null and (batch_status = "running" or batch_status = "to_start")
But it says the query is wrong.

UPDATE igrow.farm_management_batch b
INNER JOIN ( SELECT batch_id, SUM(actual_harvest) actual_harvest
FROM igrow.farm_management_batchyield
GROUP BY batch_id ) byl ON b.id = byl.batch_id
INNER JOIN igrow.sop_management_batchsopmanagement bsop ON b.id = bsop.batch_id
SET b.batch_status = 'completed',
b.stage = 'flowering',
b.actual_produce = byl.actual_harvest,
bsop.current_status='3'
WHERE end_date < "2022-07-10 00:00:00.000000"
-- and end_date is not null
AND b.batch_status IN ("running", "to_start")
The condition end_date IS NOT NULL is redundant (if the end_date < ... comparison is true, end_date cannot be NULL), so it is commented out.
PS. There is no end_date column in the tables shown. Which table does it come from?
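The step that makes the rewritten query work is aggregating batchyield down to one row per batch_id before joining, since an aggregate such as SUM() cannot be used directly in SET. Below is a minimal sketch of that step, using Python's sqlite3 module as a stand-in for MySQL. The table and column names are simplified from the question and the sample rows are invented. SQLite has no multi-table UPDATE, so the write is done here with a correlated subquery rather than the JOIN ... SET form shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE batch (id INTEGER PRIMARY KEY, batch_status TEXT, actual_produce REAL);
CREATE TABLE batchyield (batch_id INTEGER, actual_harvest REAL);
INSERT INTO batch VALUES (1, 'running', NULL), (2, 'to_start', NULL);
INSERT INTO batchyield VALUES (1, 10), (1, 15), (2, 7);
""")

# Pre-aggregate: one row per batch_id, so a join cannot multiply batch rows.
per_batch = conn.execute("""
    SELECT batch_id, SUM(actual_harvest) AS actual_harvest
    FROM batchyield
    GROUP BY batch_id
    ORDER BY batch_id
""").fetchall()

# Apply the totals (SQLite-friendly form of the same idea).
conn.execute("""
    UPDATE batch
    SET batch_status = 'completed',
        actual_produce = (SELECT SUM(y.actual_harvest)
                          FROM batchyield y
                          WHERE y.batch_id = batch.id)
    WHERE batch_status IN ('running', 'to_start')
""")
updated = conn.execute(
    "SELECT id, batch_status, actual_produce FROM batch ORDER BY id"
).fetchall()
print(per_batch)
print(updated)
```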

Query to group without lost IF function

I created a query to search for all my stock products that are in orders placed, and I created an alias "total_vendido" that adds the products when they are kits or units, so far this is ok. But now I need to group the sizes and add this "total_vendido" alias by size.
Query:
SELECT `gp`.`id`, `gp`.`data`, `gp`.`status`, `gp`.`situacao`, `gp`.`nome`,
`gp`.`razao_social`, `gp`.`email`, `gp`.`telefone`,
`itens`.*,
IF(itens.tipo = 'K',
SUM(itens.qtde_prod) * itens.qtde_lote,
SUM(itens.qtde_prod)
) AS total_vendido,
`estoq`.`titulo`
FROM `ga845_pedidos_view` `gp`
JOIN `ga845_pedido_itens` `itens` ON `itens`.`pedido_id` = `gp`.`id`
JOIN `ga845_produtos` `prod` ON `prod`.`id` = `itens`.`produtos_id`
JOIN `ga845_produtos_estoque` `estoq` ON `estoq`.`id` = `prod`.`estoques_id`
WHERE `gp`.`situacao` IN('Pedido Realizado', 'Pagamento Aprovado',
'Pedido em Separação', 'Pedido Separado')
AND date(gp.data) >= '2020-07-25'
AND date(gp.data) <= '2020-07-25'
AND `estoq`.`id` IN('24')
GROUP BY `itens`.`tamanho_prod`, `estoq`.`id`
ORDER BY `estoq`.`id` ASC, `itens`.`tamanho_prod` ASC
Current result (only important columns)
tamanho_prod | tipo | total_vendido
G | K | 5
G | U | 1
M | K | 1
P | U | 8
Expected result (only important columns)
tamanho_prod | total_vendido
G | 6
M | 1
P | 8
Code related to Expected result (only important columns)
SELECT
`itens`.`tamanho_prod`
, SUM( IF(itens.tipo = 'K',
itens.qtde_prod * itens.qtde_lote,
itens.qtde_prod)
) AS total_vendido
FROM `ga845_pedidos_view` `gp`
JOIN `ga845_pedido_itens` `itens` ON `itens`.`pedido_id` = `gp`.`id`
JOIN `ga845_produtos` `prod` ON `prod`.`id` = `itens`.`produtos_id`
JOIN `ga845_produtos_estoque` `estoq` ON `estoq`.`id` = `prod`.`estoques_id`
WHERE `gp`.`situacao` IN('Pedido Realizado', 'Pagamento Aprovado',
'Pedido em Separação', 'Pedido Separado')
AND date(gp.data) >= '2020-07-25'
AND date(gp.data) <= '2020-07-25'
AND `estoq`.`id` IN('24')
GROUP BY `itens`.`tamanho_prod`
ORDER BY `itens`.`tamanho_prod` ASC
If you want a result aggregated only by itens.tamanho_prod, then group by that column alone and move the SUM() outside the IF() so that it wraps the whole condition.
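A small runnable sketch of the same conditional aggregation, using Python's sqlite3 module as a stand-in for MySQL. The single itens table and its four rows are invented so that the totals match the expected result in the question, and CASE WHEN replaces MySQL's IF().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE itens (tamanho_prod TEXT, tipo TEXT, qtde_prod INTEGER, qtde_lote INTEGER);
INSERT INTO itens VALUES
  ('G', 'K', 1, 5),
  ('G', 'U', 1, 1),
  ('M', 'K', 1, 1),
  ('P', 'U', 8, 1);
""")

# The per-row quantity is computed first, then summed per size.
rows = conn.execute("""
    SELECT tamanho_prod,
           SUM(CASE WHEN tipo = 'K' THEN qtde_prod * qtde_lote
                    ELSE qtde_prod END) AS total_vendido
    FROM itens
    GROUP BY tamanho_prod
    ORDER BY tamanho_prod
""").fetchall()
print(rows)
```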

Writing more better SQL

I've got a query here that's painfully slow. Part of the problem may be that tableA in the sub-query has a quite substantial size in comparison to the other tables.
TABLES STRUCTURE
*-------------------*------------------*-------------------*
| ID_TABLE | DATA_TABLE | DATA_TABLE_EXT |
*-------------------*------------------*-------------------*
| id n<|>1 id 1<|>n owner_id |
| foreign_id | owner_id | information |
| foreign_id_source | date_field | ... |
| ... | ... | |
*-------------------*------------------*-------------------*
QUERY
SELECT ID_TABLE.foreign_id_source, count(ID_TABLE.id) as count
FROM DATA_TABLE
LEFT JOIN ID_TABLE ON DATA_TABLE.id = ID_TABLE.id
WHERE DATA_TABLE.owner_id = 'some_id'
AND DATA_TABLE.date_field > 'some_date'
AND DATA_TABLE.id IN (
SELECT DATA_TABLE_EXT.owner_id FROM DATA_TABLE_EXT
JOIN DATA_TABLE ON DATA_TABLE_EXT.owner_id = DATA_TABLE.id
WHERE DATA_TABLE.owner_id = 'some_id'
GROUP BY DATA_TABLE.id
HAVING SUM(ABS(DATA_TABLE_EXT.information)) <> 0
)
GROUP BY ID_TABLE.foreign_id_source
ORDER BY count ASC
REQUIRED RESULT
*-------------------*-------------*
| foreign_id_source | count |
*-------------------*-------------*
| source1 | 45 |
| source2 | 10 |
| ... | |
*-------------------*-------------*
Each id in DATA_TABLE may have multiple records in ID_TABLE.
Many records in DATA_TABLE may have the same owner_id.
I'm looking for the number of records in DATA_TABLE with a foreign_id_source, grouped by that foreign_id_source, where the record is after 'some_date' and its DATA_TABLE_EXT records do not all have a value of 0 in the information field.
Short of creating indexes or other database manipulation is there a way to improve this query in terms of performance?
Any other suggestions are also welcome.
The point is: SUM(ABS(DATA_TABLE_EXT.information)) <> 0 can only be true if at least one DATA_TABLE_EXT.information is non-zero. So we don't have to SUM() them; we only need to check whether a non-zero one exists.
[I don't know if MySQL is smart enough to handle the EXISTS() well, but in theory it is cheaper and can be faster.]
SELECT it.foreign_id_source, count(it.id) as count
FROM DATA_TABLE dt
LEFT JOIN ID_TABLE it ON dt.id = it.id
WHERE dt.owner_id = 'some_id'
AND dt.date_field > 'some_date'
AND EXISTS (
SELECT *
FROM DATA_TABLE_EXT x
JOIN DATA_TABLE dt2 ON x.owner_id = dt2.id
WHERE x.owner_id = dt.id
AND dt2.owner_id = 'some_id'
AND x.information <> 0
)
GROUP BY it.foreign_id_source
ORDER BY count ASC
;
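To check the claim that the HAVING SUM(ABS(...)) <> 0 filter and the EXISTS test select the same owners, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. Only DATA_TABLE_EXT is modelled and its rows are invented; it says nothing about which form is faster on a real server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data_table_ext (owner_id INTEGER, information INTEGER);
INSERT INTO data_table_ext VALUES
  (1, 0), (1, 0),      -- all zero: should be excluded
  (2, 0), (2, -3),     -- one non-zero value: should be kept
  (3, 4), (3, -4);     -- would cancel under SUM(), but not under SUM(ABS())
""")

via_having = conn.execute("""
    SELECT owner_id FROM data_table_ext
    GROUP BY owner_id
    HAVING SUM(ABS(information)) <> 0
    ORDER BY owner_id
""").fetchall()

via_exists = conn.execute("""
    SELECT DISTINCT o.owner_id
    FROM data_table_ext o
    WHERE EXISTS (SELECT 1 FROM data_table_ext x
                  WHERE x.owner_id = o.owner_id AND x.information <> 0)
    ORDER BY o.owner_id
""").fetchall()
print(via_having, via_exists)
```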
Often moving the subquery to the FROM will help:
SELECT ID_TABLE.foreign_id_source, count(DATA_TABLE.id) as count
FROM ID_TABLE LEFT JOIN
DATA_TABLE
ON DATA_TABLE.id = ID_TABLE.id JOIN
(SELECT DATA_TABLE.id
FROM DATA_TABLE_EXT JOIN
DATA_TABLE
ON DATA_TABLE_EXT.owner_id = DATA_TABLE.id
WHERE DATA_TABLE.owner_id = 'some_value'
GROUP BY DATA_TABLE.id
HAVING SUM(ABS(DATA_TABLE_EXT.information)) <> 0
) xx
ON DATA_TABLE.id = xx.id
WHERE DATA_TABLE.owner_id = 'some_value' AND
DATA_TABLE.date_field > 'some_date'
GROUP BY ID_TABLE.foreign_id_source
ORDER BY count ASC;
Then, you can think about indexes. These would be tableX(field2, fieldZ, field1, fieldX), tableI(field1), tableX(field2, field1, fieldB), and tableA(field1).

How to query and group every continuous number series in MySQL?

I have a freight.or_nos table which contains a series of receipt numbers. I want to list all the ORs that were issued, excluding those with status='Cancelled', so that the series is broken into groups.
For example, I have the receipt stub 125001-125050, and 125020 is cancelled, so the listing result would be:
+-------------------------------------------------------+
| OR Start | OR End | Quantity | Amount |
+-------------------------------------------------------+
| 125001 | 125019 | 19 | |
+-------------------------------------------------------+
| 125021 | 125050 | 30 | |
+-------------------------------------------------------+
This seems to be a tough query.
Thanks for reading, but I have just solved it myself! :)
Here's my query (disregard the other characters, they are from our CGI):
{.while SELECT `start`,`end`,or_prefix,or_suffix,SUM(a.amount) AS g_total,COUNT(*) AS qcount FROM (SELECT l.id AS `start`,( SELECT MIN(a.id) AS id FROM ( SELECT a.or_no AS id FROM freight.`or_nos` a WHERE a.status!='Cancelled' AND a.log_user = 0#user_teller AND DATE(a.or_date)='#user_date`DATE' AND IF(a.status='Default' AND a.amount=0,0,1) ) AS a LEFT OUTER JOIN ( SELECT a.or_no AS id FROM freight.`or_nos` a WHERE a.status!='Cancelled' AND a.log_user = 0#user_teller AND DATE(a.or_date)='#user_date`DATE' AND IF(a.status='Default' AND a.amount=0,0,1) ) AS b ON a.id = b.id - 1 WHERE b.id IS NULL AND a.id >= l.id ) AS `end` FROM ( SELECT a.or_no AS id FROM freight.`or_nos` a WHERE a.status!='Cancelled' AND a.log_user = 0#user_teller AND DATE(a.or_date)='#user_date`DATE' AND IF(a.status='Default' AND a.amount=0,0,1) ) AS l LEFT OUTER JOIN ( SELECT a.or_no AS id FROM freight.`or_nos` a WHERE a.log_user = 0#user_teller AND DATE(a.or_date)='#user_date`DATE' AND IF(a.status='Default' AND a.amount=0,0,1) ) AS r ON r.id = l.id - 1 WHERE r.id IS NULL) AS k LEFT JOIN freight.`or_nos` a ON a.`or_no` BETWEEN k.start AND k.end AND DATE(a.`or_date`)='#user_date`DATE' AND a.log_user =0#user_teller AND IF(a.status='Default' AND a.amount=0,0,1) AND a.status!='Cancelled' GROUP BY `start`}
{.start}{.x.24.12:end}{.x`p0.40.-5:qcount}{.x`p2.57.-15:g_total}{.asc 255}
{.wend}{.asc 255}
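For comparison, the same grouping can be sketched more compactly: within a run of consecutive numbers, the number minus its rank among the non-cancelled rows is constant, so that difference can serve as a group key. This is not the query above, only an illustration using Python's sqlite3 module as a stand-in for MySQL. The table is reduced to two columns, the 'Issued' status value is made up, and the amount column is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE or_nos (or_no INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO or_nos VALUES (?, ?)",
    [(n, "Cancelled" if n == 125020 else "Issued") for n in range(125001, 125051)],
)

# grp = or_no minus its rank among non-cancelled rows; constant per run.
rows = conn.execute("""
    SELECT MIN(or_no) AS or_start, MAX(or_no) AS or_end, COUNT(*) AS quantity
    FROM (
        SELECT a.or_no,
               a.or_no - (SELECT COUNT(*) FROM or_nos b
                          WHERE b.status <> 'Cancelled'
                            AND b.or_no <= a.or_no) AS grp
        FROM or_nos a
        WHERE a.status <> 'Cancelled'
    ) AS numbered
    GROUP BY grp
    ORDER BY or_start
""").fetchall()
print(rows)
```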

Why is this MySQL query slow?

I have the following query, all relevant columns are indexed correctly. MySQL version 5.0.8. The query takes forever:
SELECT COUNT(*) FROM `members` `t` WHERE t.member_type NOT IN (1,2)
AND ( SELECT end_date FROM subscriptions s
WHERE s.sub_auth_id = t.member_auth_id AND s.sub_status = 'Completed'
AND s.sub_pkg_id > 0 ORDER BY s.id DESC LIMIT 1 ) < curdate( )
EXPLAIN output:
----+--------------------+-------+-------+-----------------------+---------+---------+------+------+-------------
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
----+--------------------+-------+-------+-----------------------+---------+---------+------+------+-------------
1 | PRIMARY | t | ALL | membership_type | NULL | NULL | NULL | 9610 | Using where
----+--------------------+-------+-------+-----------------------+---------+---------+------+------+-------------
2 | DEPENDENT SUBQUERY | s | index | subscription_auth_id, | PRIMARY | 4 | NULL | 1 | Using where
| | | | subscription_pkg_id, | | | | |
| | | | subscription_status | | | | |
----+--------------------+-------+-------+-----------------------+---------+---------+------+------+-------------
Why?
Your subselect refers to values in the parent query. This is known as a correlated (dependent) subquery, and such a query has to be executed once for every row in the parent query, which often leads to poor performance. It is often faster to rewrite the query as a JOIN, for example like this
(Note: without a sample schema to test with, it is impossible to say in advance if this will be faster and still correct, you might need to adjust it a little):
SELECT COUNT(*) FROM members t
LEFT JOIN (
SELECT latest.member_id, sdate.end_date
FROM (
SELECT sub_auth_id as member_id, max(id) as sid FROM subscriptions
WHERE sub_status = 'Completed'
AND sub_pkg_id > 0
GROUP BY sub_auth_id
) latest
LEFT JOIN (
SELECT id AS subid, end_date FROM subscriptions
WHERE sub_status = 'Completed'
AND sub_pkg_id > 0
) sdate ON latest.sid = sdate.subid
) sub ON sub.member_id = t.member_auth_id
WHERE t.member_type NOT IN (1,2)
AND sub.end_date < curdate( )
The logic here is:
For each member, find his latest subscription.
For each latest subscription, find its end date.
Join these member-latest_sub_date pairs to the members list.
Filter the list.
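The four steps above can be tried on a toy data set. The following is a sketch only, using Python's sqlite3 module as a stand-in for MySQL; the rows are invented, and a fixed cutoff date is passed as a parameter in place of curdate() so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE members (member_auth_id INTEGER, member_type INTEGER);
CREATE TABLE subscriptions (id INTEGER PRIMARY KEY, sub_auth_id INTEGER,
                            sub_status TEXT, sub_pkg_id INTEGER, end_date TEXT);
INSERT INTO members VALUES (10, 3), (20, 3), (30, 1);
INSERT INTO subscriptions VALUES
  (1, 10, 'Completed', 1, '2020-01-01'),  -- older subscription, expired
  (2, 10, 'Completed', 1, '2999-01-01'),  -- latest for member 10: still active
  (3, 20, 'Completed', 1, '2020-06-01'),  -- latest for member 20: expired
  (4, 30, 'Completed', 1, '2020-06-01');  -- member_type 1 is filtered out
""")

# Derived table picks the latest subscription id per member (computed once),
# then its end date is looked up and the member filter is applied.
expired = conn.execute("""
    SELECT COUNT(*)
    FROM members t
    JOIN (SELECT sub_auth_id, MAX(id) AS sid
          FROM subscriptions
          WHERE sub_status = 'Completed' AND sub_pkg_id > 0
          GROUP BY sub_auth_id) latest ON latest.sub_auth_id = t.member_auth_id
    JOIN subscriptions s ON s.id = latest.sid
    WHERE t.member_type NOT IN (1, 2)
      AND s.end_date < ?
""", ("2024-01-01",)).fetchone()[0]
print(expired)
```

Only member 20 is counted: member 10 has an expired older subscription, but the latest one is still active.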
Your query is slow because as written you are considering 9,610 rows and therefore performing 9,610 SELECT subqueries in your WHERE clause. You really should rewrite your query to JOIN the members and subscriptions tables first, to which your WHERE conditions could still apply.
EDIT: Try this.
SELECT COUNT(*)
FROM `members` `t`
JOIN subscriptions s ON (s.sub_auth_id = t.member_auth_id)
WHERE t.member_type NOT IN (1,2)
AND s.sub_status = 'Completed'
AND s.sub_pkg_id > 0
AND end_date < curdate()
ORDER BY s.id DESC LIMIT 1
Caveat: I'm not a MySQL expert (I'm more at home in a different SQL flavour, VFP), but I believe you will save some time if:
You count just one field, let's say memberid, instead of *.
Your comparison NOT IN (1,2) is replaced with > 2 (provided that is valid).
The ORDER BY in your subselect is unnecessary, I think. You're trying to get the last completed subscription?
The < curdate() should be inside your subselect's WHERE.
(SELECT end_date FROM subscriptions s
WHERE s.end_date < curdate() and s.sub_auth_id = t.member_auth_id AND
s.sub_status = 'Completed' AND s.sub_pkg_id > 0 ORDER BY s.id DESC LIMIT 1 )
Tune your subselect so as to trim down the set as quickly as possible. The first conditional should be the one least likely to occur.
I ended up doing it like this:
select count(*) from members t
JOIN subscriptions s ON s.sub_auth_id = t.member_auth_id
WHERE t.membership_type > 2 AND s.sub_status = 'Completed' AND s.sub_pkg_id > 0
AND s.sub_end_date < curdate( )
AND s.id = (SELECT MAX(ss.id) FROM subscriptions ss WHERE ss.sub_auth_id = t.member_auth_id)
I believe that the problem is due to a bug that won't be fixed until MySQL 6.

How to delete duplicates in SQL table based on multiple fields

I have a table of games, which is described as follows:
+---------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| date | date | NO | | NULL | |
| time | time | NO | | NULL | |
| hometeam_id | int(11) | NO | MUL | NULL | |
| awayteam_id | int(11) | NO | MUL | NULL | |
| locationcity | varchar(30) | NO | | NULL | |
| locationstate | varchar(20) | NO | | NULL | |
+---------------+-------------+------+-----+---------+----------------+
But each game has a duplicate entry in the table somewhere, because each game was in the schedules for two teams. Is there a sql statement I can use to look through and delete all the duplicates based on identical date, time, hometeam_id, awayteam_id, locationcity, and locationstate fields?
You should be able to use a correlated subquery to delete the data: find all rows that are duplicates and delete all but the one with the smallest id. For MySQL, an inner join (the functional equivalent of EXISTS) needs to be used, like so:
delete games from games inner join
(select min(id) minid, date, time,
hometeam_id, awayteam_id, locationcity, locationstate
from games
group by date, time, hometeam_id,
awayteam_id, locationcity, locationstate
having count(1) > 1) as duplicates
on (duplicates.date = games.date
and duplicates.time = games.time
and duplicates.hometeam_id = games.hometeam_id
and duplicates.awayteam_id = games.awayteam_id
and duplicates.locationcity = games.locationcity
and duplicates.locationstate = games.locationstate
and duplicates.minid <> games.id)
To test, replace delete games from games with select * from games. Don't just run a delete on your DB :-)
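The "preview first, then delete" advice can be sketched as follows, using Python's sqlite3 module as a stand-in for MySQL. The games table is cut down to four matching columns (the location columns are omitted for brevity) and the rows are invented. SQLite accepts a subquery on the table being deleted from, which is why the delete is written with IN here instead of MySQL's DELETE ... JOIN form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE games (id INTEGER PRIMARY KEY, date TEXT, time TEXT,
                    hometeam_id INTEGER, awayteam_id INTEGER);
INSERT INTO games (date, time, hometeam_id, awayteam_id) VALUES
  ('2012-09-01', '19:00', 1, 2),
  ('2012-09-01', '19:00', 1, 2),   -- duplicate of id 1
  ('2012-09-02', '13:00', 3, 4),
  ('2012-09-02', '13:00', 3, 4),   -- duplicate of id 3
  ('2012-09-03', '15:00', 1, 4);   -- no duplicate
""")

# Rows that share all matching columns with a group but are not its smallest id.
doomed_sql = """
    SELECT g.id
    FROM games g
    JOIN (SELECT MIN(id) AS minid, date, time, hometeam_id, awayteam_id
          FROM games
          GROUP BY date, time, hometeam_id, awayteam_id
          HAVING COUNT(*) > 1) d
      ON d.date = g.date AND d.time = g.time
     AND d.hometeam_id = g.hometeam_id AND d.awayteam_id = g.awayteam_id
     AND d.minid <> g.id
"""
# Dry run: list the ids that would be removed.
doomed = sorted(r[0] for r in conn.execute(doomed_sql))
# Real run: delete exactly those ids.
conn.execute(f"DELETE FROM games WHERE id IN ({doomed_sql})")
remaining = [r[0] for r in conn.execute("SELECT id FROM games ORDER BY id")]
print(doomed, remaining)
```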
You can try a query like this:
DELETE FROM table_name AS t1
WHERE EXISTS (
SELECT 1 FROM table_name AS t2
WHERE t2.date = t1.date
AND t2.time = t1.time
AND t2.hometeam_id = t1.hometeam_id
AND t2.awayteam_id = t1.awayteam_id
AND t2.locationcity = t1.locationcity
AND t2.locationstate = t1.locationstate
AND t2.id < t1.id )
This leaves only one row for each game in the database: the one with the smallest id.
The best thing that worked for me was to recreate the table.
CREATE TABLE newtable SELECT * FROM oldtable GROUP BY field1,field2;
You can then rename.
To get a list of duplicate entries matching on two fields:
select t.ID, t.field1, t.field2
from (
select field1, field2
from table_name
group by field1, field2
having count(*) > 1) x, table_name t
where x.field1 = t.field1 and x.field2 = t.field2
order by t.field1, t.field2
And to delete only the duplicates:
DELETE x
FROM table_name x
JOIN table_name y
ON y.field1= x.field1
AND y.field2 = x.field2
AND y.id < x.id;
select orig.id,
dupl.id
from games orig,
games dupl
where orig.date = dupl.date
and orig.time = dupl.time
and orig.hometeam_id = dupl.hometeam_id
and orig.awayteam_id = dupl.awayteam_id
and orig.locationcity = dupl.locationcity
and orig.locationstate = dupl.locationstate
and orig.id < dupl.id
This should give you the duplicates; you can use it as a subquery to specify the IDs to delete.
As long as you are not selecting the id (primary key) of the table in your query and the other data is exactly the same, you can use SELECT DISTINCT to avoid getting duplicate results.
delete from games
where id not in
(select max(id) from games
group by date, time, hometeam_id, awayteam_id, locationcity, locationstate
);
Workaround (if MySQL rejects the subquery on the table being deleted from):
create temporary table temp_table as
select max(id) id from games
group by date, time, hometeam_id, awayteam_id, locationcity, locationstate;
delete from games where id not in (select id from temp_table);
DELETE FROM table
WHERE id IN
(SELECT t.id
FROM table as t
JOIN table as tj ON (t.date = tj.date
AND t.hometeam_id = tj.hometeam_id
AND t.awayteam_id = tj.awayteam_id
...))
DELETE FROM tbl
USING tbl, tbl t2
WHERE tbl.id > t2.id
AND t2.field = tbl.field;
In your case:
DELETE FROM tbl
USING games tbl, games t2
WHERE tbl.id > t2.id
AND t2.date = tbl.date
AND t2.time = tbl.time
AND t2.hometeam_id = tbl.hometeam_id
AND t2.awayteam_id = tbl.awayteam_id
AND t2.locationcity = tbl.locationcity
AND t2.locationstate = tbl.locationstate;
reference: https://dev.mysql.com/doc/refman/5.7/en/delete.html