Why can't I join this query with MAX date? - mysql

I have an issue with the following MySQL query: it fails once the MAX date subquery is introduced, as shown below.
I get the following error:
Error Code: 1054. Unknown column 'order_items.ORDER_ITEM_ID' in 'where clause'
SET @UserID = 160;
SET @OrderDateTime = '2018-11-13 09:23:45';
SELECT
order_items.ORDER_ID,
listing_region.LIST_REGION_REGION_ID,
listings.LISTING_ID,
order_items.ORDER_REQUIRED_DATE_TIME,
listings.LISTING_NICK_NAME,
order_items.ORDER_QUANTITY,
order_price.ORDER_PRICE_ID,
order_items.ORDER_PORTION_SIZE,
t.LATEST_DATE,
t.ORDER_STATUS
FROM order_status_change, order_items
INNER JOIN listings ON listings.LISTING_ID = order_items.ORDER_LISTING_ID
INNER JOIN listing_region ON listing_region.LIST_REGION_LISTING_ID = listings.LISTING_ID
INNER JOIN order_price ON order_price.ORDERP_ITEM_ID = order_items.ORDER_ITEM_ID
INNER JOIN
(
SELECT MAX(order_status_change.ORDER_STATUS_CHANGE_DATETIME) AS LATEST_DATE, order_status_change.ORDER_ITEM_ID, order_status_change.ORDER_STATUS
FROM order_status_change
WHERE order_status_change.ORDER_ITEM_ID = order_items.ORDER_ITEM_ID
) AS t ON order_status_change.ORDER_ITEM_ID = t.ORDER_ITEM_ID AND order_status_change.ORDER_STATUS_CHANGE_DATETIME = t.LATEST_DATE
WHERE ((order_items.ORDER_USER_ID = @UserID) AND DATE(order_items.ORDER_REQUIRED_DATE_TIME) = DATE(@OrderDateTime))
Any help?

I have assumed you can join order_status_change on order_items.ID = order_status_change.ORDER_ITEM_ID.
If that is valid, then I think this will achieve what you are after:
SET @UserID = 160;
SET @OrderDateTime = '2018-11-13 09:23:45';
SELECT
order_items.ORDER_ID
, listing_region.LIST_REGION_REGION_ID
, listings.LISTING_ID
, order_items.ORDER_REQUIRED_DATE_TIME
, listings.LISTING_NICK_NAME
, order_items.ORDER_QUANTITY
, order_price.ORDER_PRICE_ID
, order_items.ORDER_PORTION_SIZE
, t.LATEST_DATE
, order_status_change.ORDER_STATUS
FROM order_items
INNER JOIN listings ON listings.LISTING_ID = order_items.ORDER_LISTING_ID
INNER JOIN listing_region ON listing_region.LIST_REGION_LISTING_ID = listings.LISTING_ID
INNER JOIN order_price ON order_price.ORDERP_ITEM_ID = order_items.ORDER_ITEM_ID
INNER JOIN order_status_change ON order_items.ID = order_status_change.ORDER_ITEM_ID
INNER JOIN (
SELECT
MAX( mc.ORDER_STATUS_CHANGE_DATETIME ) AS LATEST_DATE
, mc.ORDER_ITEM_ID
FROM order_status_change AS mc
GROUP BY
mc.ORDER_ITEM_ID
) AS t
ON order_status_change.ORDER_ITEM_ID = t.ORDER_ITEM_ID
AND order_status_change.ORDER_STATUS_CHANGE_DATETIME = t.LATEST_DATE
WHERE order_items.ORDER_USER_ID = @UserID
AND DATE( order_items.ORDER_REQUIRED_DATE_TIME ) = DATE( @OrderDateTime )
You need to avoid this in future:
FROM order_status_change , order_items
That comma between the two table names IS a join, but it comes from an older syntax and has LOWER precedence than the explicit joins in your query. Also, with no join condition this comma-based join behaves like a cross join, which MULTIPLIES the number of rows. In brief, please do NOT use commas between table names.
The other issue is that your subquery was missing a GROUP BY clause and, being a derived table, cannot reference order_items from the outer query (hence error 1054). I believe you just want the "latest" date per item from this aggregation; once that is determined, link back to order_status_change to get the status for that date. (You can't group by status in the subquery, otherwise you get several latest dates, one for each status.)
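The "aggregate first, then link back for the status" idea can be tried on a tiny data set. This is only a sketch under assumptions: it uses Python's built-in sqlite3 in place of MySQL, a trimmed-down order_status_change table, and made-up sample rows.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the table is a reduced, hypothetical
# version of order_status_change with invented sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_status_change (
    ORDER_ITEM_ID INTEGER,
    ORDER_STATUS TEXT,
    ORDER_STATUS_CHANGE_DATETIME TEXT
);
INSERT INTO order_status_change VALUES
    (1, 'NEW',     '2018-11-13 09:00:00'),
    (1, 'SHIPPED', '2018-11-13 12:00:00'),
    (2, 'NEW',     '2018-11-13 10:00:00');
""")

# Step 1 (derived table t): latest change date per item, grouped by item only.
# Step 2: join back to the same table to pick up the status on that date.
latest_status = conn.execute("""
SELECT osc.ORDER_ITEM_ID, osc.ORDER_STATUS, t.LATEST_DATE
FROM order_status_change AS osc
INNER JOIN (
    SELECT mc.ORDER_ITEM_ID,
           MAX(mc.ORDER_STATUS_CHANGE_DATETIME) AS LATEST_DATE
    FROM order_status_change AS mc
    GROUP BY mc.ORDER_ITEM_ID
) AS t
    ON osc.ORDER_ITEM_ID = t.ORDER_ITEM_ID
   AND osc.ORDER_STATUS_CHANGE_DATETIME = t.LATEST_DATE
ORDER BY osc.ORDER_ITEM_ID
""").fetchall()

for row in latest_status:
    print(row)
```

Each item comes back once, with the status that belongs to its most recent change.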

Here's a simplified version to illustrate the problem.
DROP TABLE IF exists t,t1;
create table t (id int);
create table t1(id int,dt date);
insert into t values (1),(2);
insert into t1 values (1,'2018-01-01'),(1,'2018-02-01'),(2,'2018-01-01');
select t.*,t2.maxdt
from t
join (select max(dt) maxdt,t1.id from t1 where t1.id = t.id) t2
on t2.id = t.id;
ERROR 1054 (42S22): Unknown column 't.id' in 'where clause'
You could GROUP BY in the subquery, and then the ON clause will come into play:
select t.*,t2.maxdt
from t
join (select max(dt) maxdt,t1.id from t1 group by t1.id) t2
on t2.id = t.id;
+------+------------+
| id | maxdt |
+------+------------+
| 1 | 2018-02-01 |
| 2 | 2018-01-01 |
+------+------------+
2 rows in set (0.00 sec)
If you want an answer closer to your problem, please add sample data and expected output to the question as text or on sqlfiddle.

Related

DISTINCT on one value from a group selects

I have following sql query
select devices_device.id , devices_device.code, sss.id as "site_id", sss.name as "site_name"
from devices_device
inner join st_site_site sss on devices_device.site_id = sss.id
where devices_device.deleted = false
order by devices_device.id, devices_device.start_date
I now get a list of device IDs, some of which are repeated. I want to do a DISTINCT so that I only keep the first record for every device (and, because of the ORDER BY on start_date, that would be the most recent record for that device).
How do I do this? If I do
select distinct devices_device.id , devices_device.code, sss.id as "site_id", sss.name as "site_name"
from devices_device
inner join st_site_site sss on devices_device.site_id = sss.id
where devices_device.deleted = false
order by devices_device.id, devices_device.start_date
nothing happens
You can use the ROW_NUMBER() window function to identify the row you want. Then filtering out the other ones is easy.
For example:
select *
from (
select
d.id, d.start_date, d.code,
s.id as "site_id", s.name as "site_name",
row_number() over(partition by d.id order by start_date desc) as rn
from devices_device d
inner join st_site_site s on d.site_id = s.id
where d.deleted = false
) x
where rn = 1
order by id, start_date
In this query the ROW_NUMBER() value will be 1 for the latest row in each device group. That's how the filter at the end removes all rows numbered greater than 1.
NOTE: In case of ties (two rows sharing the most recent start_date), this query still returns a single row, chosen arbitrarily between them.
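If the arbitrary choice on ties is a problem, one option is to add a tie-breaker to the window's ORDER BY. The sketch below is an illustration only: it runs on Python's built-in sqlite3 (window functions need SQLite 3.25 or newer) rather than the asker's database, the sample rows are invented, and using code as the tie-breaker is an assumption.

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for the real database; rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE devices_device (id INTEGER, code INTEGER, start_date TEXT);
INSERT INTO devices_device VALUES
    (1, 10, '2020-10-01'),
    (1, 20, '2020-10-01'),  -- same start_date as the row above: a tie
    (1, 30, '2020-09-01'),
    (2, 40, '2020-08-01');
""")

# The second ORDER BY key (code DESC) is a hypothetical tie-breaker that makes
# the choice between tied rows repeatable.
rows = conn.execute("""
SELECT id, code, start_date
FROM (
    SELECT d.id, d.code, d.start_date,
           ROW_NUMBER() OVER (
               PARTITION BY d.id
               ORDER BY d.start_date DESC, d.code DESC
           ) AS rn
    FROM devices_device AS d
) AS x
WHERE rn = 1
ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```

Any column, or combination of columns, that is unique within a device would serve as the tie-breaker.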
You should probably use a GROUP BY. Something like:
select distinct devices_device.id , devices_device.code, sss.id as "site_id",
sss.name as "site_name"
from devices_device
inner join st_site_site sss on devices_device.site_id = sss.id
where devices_device.deleted = false
group by devices_device.id
order by devices_device.start_date
You could test for the min start date
drop table if exists devices_device,st_site_site;
create table devices_device(id int,code int,site_id int,start_date date,deleted int);
create table st_site_site(id int,name varchar(10));
insert into devices_device values(1,10,1,'2020-10-01',0),(1,20,1,'2020-09-01',0);
insert into st_site_site values(1,'aaa');
select devices_device.id , devices_device.code, sss.id as "site_id", sss.name as "site_name"
from devices_device
inner join st_site_site sss on devices_device.site_id = sss.id
where devices_device.deleted = false and
devices_device.start_date = (select min(d1.start_date) from devices_device d1 where d1.id = devices_device.id)
order by devices_device.id;
+------+------+---------+-----------+
| id | code | site_id | site_name |
+------+------+---------+-----------+
| 1 | 20 | 1 | aaa |
+------+------+---------+-----------+
1 row in set (0.001 sec)

Optimize SQL query with 2 selects

I am trying to update a single campaign.id with minimum used_time (datetime) based on user.id, but the following code needs about 5 seconds to execute. The backlinks table contains 1 million rows.
UPDATE `backlinks`
SET
`backlinks`.`crawler_id` = 'test',
`backlinks`.`used_time`=NOW()
WHERE
`backlinks`.`campaign_id`=(
SELECT `id` FROM `campaigns`
WHERE `campaigns`.`completed`=false
AND `campaigns`.`status`=true
GROUP BY `campaigns`.`user_id`
ORDER BY `campaigns`.`used_time` ASC
limit 1
)
AND `backlinks`.`googlebot_id` IS NULL
AND `backlinks`.`used_time` IS NULL
LIMIT 1;
You can try to UPDATE with JOIN by a subquery.
UPDATE `backlinks` b
JOIN (
SELECT c.id
FROM campaigns c
WHERE exists (
SELECT 1
FROM campaigns cc
WHERE c.user_id = cc.user_id
GROUP BY cc.user_id
HAVING min(cc.used_time) = c.used_time
)
) t1 on b.`campaign_id` = t1.id
SET
b.`crawler_id` = 'test',
b.`used_time`=NOW()
WHERE
b.`googlebot_id` IS NULL
AND
b.`used_time` IS NULL

I have a table like the one below and I want the latest entry to be displayed

The query which I am using now is below:
select ur.uid
, ua.user_activity_min_budget
, ua.user_activity_max_budget
, ua.user_activity_bedroom
, ptm.property_type_description
, cm.city_name
, lm.locality_name
, ua.user_activity_datetime
from user_registration ur
join ksl_user_activity ua
on ua.registered_user_uid = ur.uid
and ua.user_activity_uid = ( select max(ua0.user_activity_uid) from ksl_user_activity ua0)
join ksl_locality_master lm
on lm.locality_uid = ua.user_activity_area
join ksl_city_master cm
on cm.city_uid = lm.city_uid
join ksl_property_type_master ptm
on ptm.property_type_uid = ua.user_activity_property_type
where date(ua.user_activity_datetime) >= '2017-07-24'
and (lm.city_uid = 1 or lm.city_uid=2)
order
by ur.uid
The raw output is as this image shows:
This is the data I get now, but I want only the latest entry for uid 3, 15 and 33.
That is why I added the condition and ua.user_activity_uid=(select max(ua0.user_activity_uid) from user_activity ua0).
The ksl_user_activity table has a primary key user_activity_id whose value is highest for the latest entry, but I get no data when I include this condition in my query.
I also tried and ua.user_activity_uid=(select ua0.user_activity_uid from user_activity ua0 order by ua0.user_activity_uid desc limit 1)
This does not work either.
Use the MAX() function and a subquery:
select t1.uid from user_activity t1
inner join
(select uid,max(user_activity_datetime) as user_activity_datetime from user_activity group by uid
) as t2 on
t1.user_activity_datetime=t2.user_activity_datetime
and t1.uid=t2.uid
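To see why the asker's condition returns little or no data, it helps to compare a table-wide MAX with a per-user MAX. The following is a rough sketch, assuming Python's built-in sqlite3 in place of MySQL, a reduced ksl_user_activity table with only three columns, and invented rows.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; columns and rows are a made-up subset.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ksl_user_activity (
    user_activity_uid INTEGER PRIMARY KEY,
    registered_user_uid INTEGER,
    user_activity_datetime TEXT
);
INSERT INTO ksl_user_activity VALUES
    (1, 3,  '2017-07-24 10:00:00'),
    (2, 15, '2017-07-25 10:00:00'),
    (3, 3,  '2017-07-26 10:00:00'),
    (4, 33, '2017-07-27 10:00:00');
""")

# Uncorrelated subquery: one MAX for the whole table, so only the user who
# owns that single row can ever match.
global_max = conn.execute("""
SELECT ua.registered_user_uid, ua.user_activity_uid
FROM ksl_user_activity AS ua
WHERE ua.user_activity_uid = (
    SELECT MAX(ua0.user_activity_uid) FROM ksl_user_activity AS ua0
)
""").fetchall()

# Correlated subquery: the MAX is computed per user, giving one row per user.
per_user_max = conn.execute("""
SELECT ua.registered_user_uid, ua.user_activity_uid
FROM ksl_user_activity AS ua
WHERE ua.user_activity_uid = (
    SELECT MAX(ua0.user_activity_uid)
    FROM ksl_user_activity AS ua0
    WHERE ua0.registered_user_uid = ua.registered_user_uid
)
ORDER BY ua.registered_user_uid
""").fetchall()

print(global_max)
print(per_user_max)
```

In the original query the remaining joins and filters then apply to that single global row, which would explain an empty result whenever that row does not pass them.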

MySQL select with group and one to many relations condition

For example, take this structure:
CREATE TABLE clicks
(`date` varchar(50), `sum` int, `id` int)
;
CREATE TABLE marks
(`click_id` int, `name` varchar(50), `value` varchar(50))
;
where a click can have many marks.
Example data:
INSERT INTO clicks
(`sum`, `id`, `date`)
VALUES
(100, 1, '2017-01-01'),
(200, 2, '2017-01-01')
;
INSERT INTO marks
(`click_id`, `name`, `value`)
VALUES
(1, 'utm_source', 'test_source1'),
(1, 'utm_medium', 'test_medium1'),
(1, 'utm_term', 'test_term1'),
(2, 'utm_source', 'test_source1'),
(2, 'utm_medium', 'test_medium1')
;
I need to get aggregated values of clicks, grouped by date, counting only the clicks that have all of the selected marks.
I run this query:
select
c.date,
sum(c.sum)
from clicks as c
left join marks as m ON m.click_id = c.id
where
(m.name = 'utm_source' AND m.value='test_source1') OR
(m.name = 'utm_medium' AND m.value='test_medium1') OR
(m.name = 'utm_term' AND m.value='test_term1')
group by date
and get 2017-01-01 = 700, but I want to get 100, since only click 1 has all of the marks.
Or, if the condition is
(m.name = 'utm_source' AND m.value='test_source1') OR
(m.name = 'utm_medium' AND m.value='test_medium1')
I need to get 300 instead of 600.
I found a solution by first getting the distinct click_id values with one query and then summing and grouping by date with a whereIn condition, but on the real database, which is very large and uses UUIDs as ids, this request executes extremely slowly. Any advice on how to get it to work properly?
You can achieve it using the queries below.
When there are three conditions, you have to use HAVING count(*) >= 3:
SELECT cc.DATE
,sum(cc.sum)
FROM clicks AS cc
INNER JOIN (
SELECT id
FROM clicks AS c
LEFT JOIN marks AS m ON m.click_id = c.id
WHERE (
m.NAME = 'utm_source'
AND m.value = 'test_source1'
)
OR (
m.NAME = 'utm_medium'
AND m.value = 'test_medium1'
)
OR (
m.NAME = 'utm_term'
AND m.value = 'test_term1'
)
GROUP BY id
HAVING count(*) >= 3
) AS t ON cc.id = t.id
GROUP BY cc.DATE
When there are two conditions, you have to use HAVING count(*) >= 2:
SELECT cc.DATE
,sum(cc.sum)
FROM clicks AS cc
INNER JOIN (
SELECT id
FROM clicks AS c
LEFT JOIN marks AS m ON m.click_id = c.id
WHERE (
m.NAME = 'utm_source'
AND m.value = 'test_source1'
)
OR (
m.NAME = 'utm_medium'
AND m.value = 'test_medium1'
)
GROUP BY id
HAVING count(*) >= 2
) AS t ON cc.id = t.id
GROUP BY cc.DATE
Demo: http://sqlfiddle.com/#!9/fe571a/35
Hope this works for you...
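As a rough check of the HAVING idea against the question's sample data, here is a sketch that runs it through Python's built-in sqlite3 (an assumption; the question is about MySQL). The helper name sum_by_date is hypothetical, and it counts DISTINCT mark names rather than count(*), a small variation so that a mark stored twice for the same click is not counted twice.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; data is the sample from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clicks ("date" TEXT, "sum" INTEGER, id INTEGER);
CREATE TABLE marks (click_id INTEGER, name TEXT, value TEXT);
INSERT INTO clicks ("sum", id, "date") VALUES
    (100, 1, '2017-01-01'),
    (200, 2, '2017-01-01');
INSERT INTO marks (click_id, name, value) VALUES
    (1, 'utm_source', 'test_source1'),
    (1, 'utm_medium', 'test_medium1'),
    (1, 'utm_term',   'test_term1'),
    (2, 'utm_source', 'test_source1'),
    (2, 'utm_medium', 'test_medium1');
""")

def sum_by_date(conditions):
    """Sum clicks per date, keeping only clicks that match every (name, value) pair."""
    any_condition = " OR ".join("(m.name = ? AND m.value = ?)" for _ in conditions)
    params = [v for pair in conditions for v in pair] + [len(conditions)]
    sql = f"""
    SELECT c."date", SUM(c."sum")
    FROM clicks AS c
    INNER JOIN (
        SELECT m.click_id
        FROM marks AS m
        WHERE {any_condition}
        GROUP BY m.click_id
        HAVING COUNT(DISTINCT m.name) = ?  -- must match all conditions
    ) AS t ON t.click_id = c.id
    GROUP BY c."date"
    """
    return conn.execute(sql, params).fetchall()

three = sum_by_date([("utm_source", "test_source1"),
                     ("utm_medium", "test_medium1"),
                     ("utm_term", "test_term1")])
two = sum_by_date([("utm_source", "test_source1"),
                   ("utm_medium", "test_medium1")])
print(three)  # [('2017-01-01', 100)]
print(two)    # [('2017-01-01', 300)]
```

The number passed to HAVING always equals the number of conditions, which is why it is derived from the list instead of being typed by hand.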
You're getting 700 because the join generates multiple rows per click. There are 3 rows in the marks table with click_id=1 (sum=100) and 2 rows with click_id=2 (sum=200). After the join we have 3 rows with sum=100 and 2 rows with sum=200, so adding these sums gives 700. To fix this you have to aggregate on click_id too, as illustrated below:
select
c.date,
sum(c.sum)
from clicks as c
inner join (select * from marks where (name = 'utm_source' AND
value='test_source1') OR (name = 'utm_medium' AND value='test_medium1')
OR (name = 'utm_term' AND value='test_term1')
group by click_id) as m
ON m.click_id = c.id
group by c.date;
DEMO SQL FIDDLE
I found the right way myself, which works on large amounts of data.
The main goal is to have the request build a single joined table whose conditions do not depend on the amount of data in the results, so the best way is:
select
c.date,
sum(c.sum)
from clicks as c
join marks as m1 ON m1.click_id = c.id
join marks as m2 ON m2.click_id = c.id
join marks as m3 ON m3.click_id = c.id
where
(m1.name = 'utm_source' AND m1.value='test_source1') AND
(m2.name = 'utm_medium' AND m2.value='test_medium1') AND
(m3.name = 'utm_term' AND m3.value='test_term1')
group by date
So we need as many joins as we have conditions.
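Since the number of joins follows the number of conditions, the query text can be generated from a list. The sketch below is one possible way to do that, not the poster's actual code: it assumes Python's built-in sqlite3 instead of MySQL, reuses the question's sample data, and the function name sum_with_one_join_per_condition is made up.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; data is the sample from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clicks ("date" TEXT, "sum" INTEGER, id INTEGER);
CREATE TABLE marks (click_id INTEGER, name TEXT, value TEXT);
INSERT INTO clicks ("sum", id, "date") VALUES
    (100, 1, '2017-01-01'),
    (200, 2, '2017-01-01');
INSERT INTO marks (click_id, name, value) VALUES
    (1, 'utm_source', 'test_source1'),
    (1, 'utm_medium', 'test_medium1'),
    (1, 'utm_term',   'test_term1'),
    (2, 'utm_source', 'test_source1'),
    (2, 'utm_medium', 'test_medium1');
""")

def sum_with_one_join_per_condition(conditions):
    """Build one JOIN on marks per (name, value) pair; expects a non-empty list."""
    joins, filters, params = [], [], []
    for i, (name, value) in enumerate(conditions, start=1):
        alias = f"m{i}"  # generated here, never taken from user input
        joins.append(f"JOIN marks AS {alias} ON {alias}.click_id = c.id")
        filters.append(f"({alias}.name = ? AND {alias}.value = ?)")
        params.extend([name, value])
    sql = (
        'SELECT c."date", SUM(c."sum") FROM clicks AS c '
        + " ".join(joins)
        + " WHERE " + " AND ".join(filters)
        + ' GROUP BY c."date"'
    )
    return conn.execute(sql, params).fetchall()

three = sum_with_one_join_per_condition([("utm_source", "test_source1"),
                                         ("utm_medium", "test_medium1"),
                                         ("utm_term", "test_term1")])
two = sum_with_one_join_per_condition([("utm_source", "test_source1"),
                                       ("utm_medium", "test_medium1")])
print(three)  # [('2017-01-01', 100)]
print(two)    # [('2017-01-01', 300)]
```

One caveat with this shape: if the same mark were stored twice for a click, the joins would produce that click twice and inflate the sum, so it relies on each (click_id, name, value) combination being unique.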

MySQL error: duplicate column

I'm having a bit of a problem with the following MySQL query and I can't find the source of it.
MySQL tells me that:
SQLSTATE[42S21]: Column already exists: 1060 Duplicate column name 'annonce_dispo_id'
SELECT MAX(max_price) AS `max_price`,
COUNT(*) AS `nb_annonces`,
SUM(nb_dispo) AS `nb_dispo`
FROM
(SELECT `annonce`.`id`,
CEIL(MAX(price)*1.16) AS `max_price`,
COUNT(DISTINCT annonce.id) AS `nb_annonces`,
COUNT(annonce_dispoo.annonce_dispo_id) AS `nb_dispo`,
`annonce_dispo1`.*,
`annonce_dispo2`.*
FROM `annonce`
LEFT JOIN `annonce_dispo` AS `annonce_dispoo` ON (annonce_dispoo.annonceId = annonce.id
AND STR_TO_DATE(annonce_dispoo.dispo_date, '%d/%m/%Y') >= CURDATE())
INNER JOIN `annonce_dispo` AS `annonce_dispo1` ON annonce.id = annonce_dispo1.annonceId
INNER JOIN `annonce_dispo` AS `annonce_dispo2` ON annonce.id = annonce_dispo2.annonceId
WHERE ((annonce.city IN
(SELECT `cities`.`id`
FROM `cities`
WHERE (cities.label LIKE 'lyon%'))
OR annonce.zipcode = 'lyon')
OR (annonce.city LIKE '28674'
OR annonce.zipcode = '28674'))
AND (annonce_dispo1.dispo_date = '27/05/2014')
AND (annonce_dispo1.disponibility = 'available')
AND (annonce_dispo2.dispo_date = '31/05/2014')
AND (annonce_dispo2.disponibility = 'available')
AND (annonce.visible = 1)
AND (annonce.completed = 1)
GROUP BY `annonce`.`id` HAVING (nb_dispo >= 1)) AS `t`
I thought I gave a different alias to the table in each JOIN where I use it, and I can't really put my finger on what else could produce such an error.
Don't select annonce_dispo1.* and annonce_dispo2.* in your subquery: duplicated column names are being returned. Instead, select the fields you need and alias them accordingly.
SELECT MAX(max_price) AS `max_price`,
COUNT(*) AS `nb_annonces`,
SUM(nb_dispo) AS `nb_dispo`
FROM
(SELECT `annonce`.`id`,
CEIL(MAX(price)*1.16) AS `max_price`,
COUNT(DISTINCT annonce.id) AS `nb_annonces`,
COUNT(annonce_dispoo.annonce_dispo_id) AS `nb_dispo`,
`annonce_dispo1`.field, `annonce_dispo1`.otherfield,
`annonce_dispo2`.field as field2, `annonce_dispo2`.otherfield as otherfield2
FROM `annonce`
LEFT JOIN `annonce_dispo` AS `annonce_dispoo` ON (annonce_dispoo.annonceId = annonce.id
AND STR_TO_DATE(annonce_dispoo.dispo_date, '%d/%m/%Y') >= CURDATE())
INNER JOIN `annonce_dispo` AS `annonce_dispo1` ON annonce.id = annonce_dispo1.annonceId
INNER JOIN `annonce_dispo` AS `annonce_dispo2` ON annonce.id = annonce_dispo2.annonceId
WHERE ((annonce.city IN
(SELECT `cities`.`id`
FROM `cities`
WHERE (cities.label LIKE 'lyon%'))
OR annonce.zipcode = 'lyon')
OR (annonce.city LIKE '28674'
OR annonce.zipcode = '28674'))
AND (annonce_dispo1.dispo_date = '27/05/2014')
AND (annonce_dispo1.disponibility = 'available')
AND (annonce_dispo2.dispo_date = '31/05/2014')
AND (annonce_dispo2.disponibility = 'available')
AND (annonce.visible = 1)
AND (annonce.completed = 1)
GROUP BY `annonce`.`id` HAVING (nb_dispo >= 1)) AS `t`
See here for an example that doesn't work:
http://sqlfiddle.com/#!2/9bb13/1
The problem is that you are selecting all columns in the tables annonce_dispo1 and annonce_dispo2.
The fact that you have given the tables different aliases doesn't mean that there aren't duplicate column names.
I mean, you should select [table name].[column name] explicitly and give each column its own alias.
Example:
(SELECT `annonce`.`id`,
CEIL(MAX(price)*1.16) AS `max_price`,
COUNT(DISTINCT annonce.id) AS `nb_annonces`,
COUNT(annonce_dispoo.annonce_dispo_id) AS `nb_dispo`,
`annonce_dispo1`.annonce_dispo_id AS `column1`,
`annonce_dispo2`.annonce_dispo_id AS `column2`
I hope I've helped