I have a few MySQL tables from which I need to JOIN and return data. The return data must show only one row for one of the JOINed tables, but MySQL mixes the rows.
I have tried different methods using subqueries and normal JOIN with GROUP but the results remain pretty much the same.
Example table structure
suppliers
id name ...
1 ACME ...
2 EMCA ...
3 ORG ...
ratings
id supplier_id rating expiry_date report_file
1 1 5.0 2017-01-31 a.pdf
3 1 7.9 2019-06-30 c.pdf
4 2 5.0 2016-01-31 d.pdf
5 2 2.0 2018-11-30 g.pdf
6 245 9.5 2009-03-31 p.pdf
spends
id report_id supplier_id amount
1 1 1 150.00
2 1 2 100.00
3 1 245 200.00
Here are example queries I have tried to resolve this and return the correct dataset with no luck.
SELECT
reports.id,
suppliers.id AS supplier_id,
suppliers.name,
...
spends.amount,
...
ratings.rating,
ratings.report_file,
ratings.expiry_date
FROM reports
INNER JOIN spends ON reports.id=spends.report_id
INNER JOIN suppliers ON spends.supplier_id=suppliers.id
LEFT JOIN (
SELECT id,
rating,
report_file,
supplier_id,
MAX(expiry_date) AS expiry_date
FROM ratings
GROUP BY supplier_id
) ratings ON (ratings.supplier_id=suppliers.id
AND ratings.expiry_date >= reports.period_start)
...
WHERE reports.id = 1
GROUP BY spends.id
ORDER BY spends.amount DESC
Another query
SELECT
reports.id,
suppliers.id AS supplier_id,
suppliers.name,
...
spends.amount,
...
ratings.rating,
ratings.report_file,
MAX(ratings.expiry_date) AS expiry_date
FROM reports
INNER JOIN spends ON reports.id=spends.report_id
INNER JOIN suppliers ON spends.supplier_id=suppliers.id
LEFT JOIN ratings ON (ratings.supplier_id=suppliers.id
AND ratings.expiry_date >= reports.period_start)
...
WHERE reports.id = 1
GROUP BY spends.id
ORDER BY spends.amount DESC
I expect the results to be
id supplier_id name amount rating report_file expiry_date
1 1 ACME 150.00 7.9 c.pdf 2019-06-30
1 2 EMCA 100.00 2.0 g.pdf 2018-11-30
1 245 MACE 200.00 null null null
However, the actual output is
id supplier_id name amount rating report_file expiry_date
1 1 ACME 150.00 5.0 a.pdf 2019-06-30
1 2 EMCA 100.00 5.0 d.pdf 2018-11-30
1 245 MACE 200.00 null null null
Could anyone please advise how I can fix this?
Try a query like this:
SELECT
reports.id,
suppliers.id AS supplier_id,
suppliers.name,
...
spends.amount,
...
r.rating,
r.report_file,
r.expiry_date
FROM reports
INNER JOIN spends ON reports.id=spends.report_id
INNER JOIN suppliers ON spends.supplier_id=suppliers.id
LEFT JOIN ratings r ON r.supplier_id=suppliers.id
AND r.expiry_date = (
SELECT MAX(expiry_date) FROM ratings WHERE supplier_id = suppliers.id
)
...
WHERE reports.id = 1
GROUP BY spends.id
ORDER BY spends.amount DESC
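For anyone who wants to try the idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. It loads the sample rows from the question and picks the latest rating per supplier with a correlated subquery in the join condition. Some parts are my assumptions and not from the original post: the `PERIOD_START` cutoff stands in for `reports.period_start`, the supplier row `245 MACE` is taken from the expected output, and the `reports` table is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE suppliers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE ratings (id INTEGER PRIMARY KEY, supplier_id INTEGER,
                      rating REAL, expiry_date TEXT, report_file TEXT);
CREATE TABLE spends (id INTEGER PRIMARY KEY, report_id INTEGER,
                     supplier_id INTEGER, amount REAL);
INSERT INTO suppliers VALUES (1, 'ACME'), (2, 'EMCA'), (245, 'MACE');
INSERT INTO ratings VALUES
  (1, 1, 5.0, '2017-01-31', 'a.pdf'),
  (3, 1, 7.9, '2019-06-30', 'c.pdf'),
  (4, 2, 5.0, '2016-01-31', 'd.pdf'),
  (5, 2, 2.0, '2018-11-30', 'g.pdf'),
  (6, 245, 9.5, '2009-03-31', 'p.pdf');
INSERT INTO spends VALUES
  (1, 1, 1, 150.0), (2, 1, 2, 100.0), (3, 1, 245, 200.0);
""")

# Hypothetical cutoff standing in for reports.period_start (assumption).
PERIOD_START = "2015-01-01"

rows = conn.execute("""
SELECT s.id, s.name, sp.amount, r.rating, r.report_file, r.expiry_date
FROM spends sp
JOIN suppliers s ON s.id = sp.supplier_id
LEFT JOIN ratings r
  ON r.supplier_id = s.id
 AND r.expiry_date >= ?
 -- keep only the row holding the supplier's latest expiry date
 AND r.expiry_date = (SELECT MAX(r2.expiry_date)
                      FROM ratings r2
                      WHERE r2.supplier_id = s.id)
WHERE sp.report_id = 1
ORDER BY s.id
""", (PERIOD_START,)).fetchall()

for row in rows:
    print(row)
```

Because the whole rating row is chosen by its expiry date, the rating and report_file columns come from the same row as the maximum date, which is what the GROUP BY attempts in the question could not guarantee.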
I have a simple table:
Order_ID Client_ID Date Order_Status
1 1 01/01/2015 3
2 2 05/01/2015 3
3 1 06/01/2015 3
4 2 10/01/2015 3
5 1 12/01/2015 4
6 1 05/02/2015 3
I want to identify orders from new customers, that is, orders with Order_Status = 3 placed in the same month as that customer's first order with Order_Status = 3.
So the output table should look like this:
Order_ID Client_ID Date Order_Status Order_from_new_customer
1 1 01/01/2015 3 yes
2 2 05/01/2015 3 yes
3 1 06/01/2015 3 yes
4 2 10/01/2015 3 yes
5 1 12/01/2015 4 NULL
6 1 05/02/2015 3 no
I wasn't able to successfully figure out the query. Thanks a lot for any help.
Join with a subquery that gets the date of the first order by each customer.
SELECT o.*, IF(MONTH(o.date) = MONTH(f.date) AND YEAR(o.date) = YEAR(f.date),
'yes', 'no') AS order_from_new_customer
FROM orders AS o
JOIN (SELECT Client_ID, MIN(date) AS date
FROM orders
WHERE Order_Status = 3
GROUP BY Client_ID) AS f
ON o.Client_ID = f.Client_ID
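Here is a small runnable sketch of the same join, using Python's built-in sqlite3 module instead of MySQL. The dates from the question are rewritten as ISO strings (read as day/month/year, which is my assumption), `strftime('%Y-%m', ...)` replaces the MONTH/YEAR comparison, and a CASE branch is added so that orders with another status come out as NULL, as in the desired output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (Order_ID INTEGER PRIMARY KEY, Client_ID INTEGER,
                     Date TEXT, Order_Status INTEGER);
INSERT INTO orders VALUES
  (1, 1, '2015-01-01', 3),
  (2, 2, '2015-01-05', 3),
  (3, 1, '2015-01-06', 3),
  (4, 2, '2015-01-10', 3),
  (5, 1, '2015-01-12', 4),
  (6, 1, '2015-02-05', 3);
""")

rows = conn.execute("""
SELECT o.Order_ID,
       CASE
         WHEN o.Order_Status <> 3 THEN NULL
         WHEN strftime('%Y-%m', o.Date) = strftime('%Y-%m', f.first_date)
           THEN 'yes'
         ELSE 'no'
       END AS order_from_new_customer
FROM orders AS o
JOIN (SELECT Client_ID, MIN(Date) AS first_date   -- first status-3 order
      FROM orders
      WHERE Order_Status = 3
      GROUP BY Client_ID) AS f
  ON o.Client_ID = f.Client_ID
ORDER BY o.Order_ID
""").fetchall()

for row in rows:
    print(row)
```

Comparing year and month together keeps an order from January of a later year from being counted as a new-customer order.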
Use a CASE expression along with a self-join, like below:
select t1.*,
case when t1.Order_Status = 3 and MONTH(t1.`date`) = 1 then 'yes'
when t1.Order_Status = 3 and MONTH(t1.`date`) <> 1 then 'no'
else null end as Order_from_new_customer
from order_table t1 join order_table t2
on t1.Order_ID < t2.Order_ID
and t1.Client_ID = t2.Client_ID;
If your order table gets big, the solutions from Rahul and Barmar will tend to get slow.
I hope your shop gets enough orders for performance to become a problem ;-). So I would suggest marking each order with a tinyint column, and since a tinyint leaves room for several values, you could encode it like this:
0 : unknown
1 : very first order
2 : order in first month
3 : order in "grown-up" mode.
The very first order is probably easy to mark: everyone loves a bright new customer enough to store that event somehow when the first order is placed. The other orders can be identified in a background job or cron job by their "0" for unknown, or you can mark your old customers and store "3" on their orders.
The result-set can be achieved without any table-join or subquery:
select
if(Order_Status<>3,null,if((@first_date:=if(@prev_client_id!=Client_ID,month(date),@first_date))=month(date),"yes","no")) as Order_from_new_customer
,Order_ID,Client_ID,date,Order_Status,@prev_client_id:=client_id
from
t1,
(select @prev_client_id:="",@first_date:="")t
order by Client_ID ,date
One extra column is added for the computation, and the query relies on the ORDER BY clause.
Verify result at http://sqlfiddle.com/#!9/83c29f/24
I have three tables: attendance, cv_target, and candidate. I need to find the candidate count for a specific user.
I am not an expert in MySQL. I have tried the query below, but I'm unable to find the exact value.
SELECT
attendance_date,
cv_target_date_for,
cv_requirement,
job_id,
cv_target,
achi,
recruiter_comment,
recruiter_rating
FROM
attendance f
RIGHT JOIN
(
SELECT
cv_requirement,
cv_target,
cv_target_date_for,
achi,
recruiter_comment,
recruiter_rating
FROM
cv_target a
LEFT JOIN
(
SELECT
COUNT(candidate_id) AS achi,
cv_target_date,
fk_job_id
FROM
candidate
GROUP BY
fk_job_id,
cv_target_date
) b
ON a.cv_requirement = b.fk_job_id
AND a.cv_target_date_for = b.cv_target_date
WHERE
cv_target_date_for BETWEEN '2014-02-01' AND '2014-03-01'
AND cv_recruiter = '36'
) c
ON f.attendance_date=c.cv_target_date_for
GROUP BY
cv_requirement,
cv_target_date_for
ORDER BY
c.cv_target_date_for ASC
attendance
id fk_user_id attendance_date
1 44 2014-02-24
2 44 2014-02-25
3 44 2014-02-26
4 44 2014-02-27
5 36 2014-02-24
6 44 2014-02-28
cv_target
id cv_recruiter cv_requirement cv_target cv_target_date_for
1 44 1 3 2014-02-24
2 44 2 2 2014-02-24
3 44 3 2 2014-02-25
4 44 4 3 2014-02-25
4 44 4 3 2014-02-26
candidate
candidate_id fk_posted_user_id fk_job_id cv_target_date
1 44 1 2014-02-24
2 44 3 2014-02-25
3 44 3 2014-02-25
3 44 4 2014-02-25
4 44 4 2014-02-26
5 44 5 2014-02-28
5 44 5 2014-02-28
Desired result
attendance_date cv_target_date_for job_id cv_target achi(count)
2014-02-24 2014-02-24 1 3 1
2014-02-24 2014-02-24 2 2 null
2014-02-25 2014-02-25 3 2 2
2014-02-25 2014-02-25 4 3 1
2014-02-26 2014-02-26 4 3 1
2014-02-27 2014-02-27 null null null
2014-02-28 null 5 null 2
Output that I am getting
attendance_date cv_target_date_for job_id cv_target achi(count)
2014-02-24 2014-02-24 1 3 1
2014-02-24 2014-02-24 2 2 null
2014-02-25 2014-02-25 3 2 2
2014-02-25 2014-02-25 4 3 1
2014-02-26 2014-02-26 4 3 1
The rows for dates 27 and 28 are not showing. I want those rows as well.
Original Answer
I think I understand what you want. The following assumes you want all attendance dates within a specific range for a specific user. And for each of those attendance dates, you want all cv_target records, if any. And for each of those, you want a count of the candidates.
Use a subquery to get the count. That's the only part that needs to go in the subquery. Only use a GROUP BY expression in the subquery, not the outer query. Only select the fields you need.
Use LEFT JOIN to get all the records from the table on the left side of the expression and only matching records from the table on the right side. So all records from attendance (that match the WHERE expression), and matching records from cv_target (regardless of whether they have a match in the candidate subquery), and then matching records from the candidate subquery.
Try this:
SELECT
DATE_FORMAT(a.attendance_date, '%Y-%m-%d') AS attendance_date,
DATE_FORMAT(t.cv_target_date_for, '%Y-%m-%d') AS cv_target_date_for,
t.cv_requirement AS job_id,
t.cv_target,
c.achi AS `achi(count)`
FROM
attendance AS a
LEFT JOIN
cv_target AS t
ON a.fk_user_id = t.cv_recruiter
AND a.attendance_date = t.cv_target_date_for
LEFT JOIN
(
SELECT
COUNT(candidate_id) AS achi,
fk_job_id,
cv_target_date
FROM
candidate
WHERE
fk_posted_user_id = 44
AND cv_target_date BETWEEN '2014-02-01' AND '2014-03-01'
GROUP BY
fk_job_id,
cv_target_date
) AS c
ON t.cv_requirement = c.fk_job_id
AND t.cv_target_date_for = c.cv_target_date
WHERE
a.fk_user_id = 44
AND a.attendance_date BETWEEN '2014-02-01' AND '2014-03-01'
ORDER BY
ISNULL(t.cv_target_date_for), t.cv_target_date_for, t.cv_requirement
Note that the following line is not necessary for the correct result. However, depending on the database structure and amount of data, it may improve performance.
AND cv_target_date BETWEEN '2014-02-01' AND '2014-03-01'
The ISNULL function is being used to sort NULL to the bottom.
I've created an SQL Fiddle showing the output you request, except for cv_target_date_for. It's not possible to output values that do not exist in the data.
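If you would rather check this locally, here is a rough sketch of the same LEFT JOIN chain using Python's built-in sqlite3 module and the sample rows from the question. It is trimmed to the essentials: the date range filters and DATE_FORMAT calls are dropped, and the ordering is by attendance date, which is my simplification and not part of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attendance (id INTEGER, fk_user_id INTEGER, attendance_date TEXT);
CREATE TABLE cv_target (id INTEGER, cv_recruiter INTEGER, cv_requirement INTEGER,
                        cv_target INTEGER, cv_target_date_for TEXT);
CREATE TABLE candidate (candidate_id INTEGER, fk_posted_user_id INTEGER,
                        fk_job_id INTEGER, cv_target_date TEXT);
INSERT INTO attendance VALUES
  (1, 44, '2014-02-24'), (2, 44, '2014-02-25'), (3, 44, '2014-02-26'),
  (4, 44, '2014-02-27'), (5, 36, '2014-02-24'), (6, 44, '2014-02-28');
INSERT INTO cv_target VALUES
  (1, 44, 1, 3, '2014-02-24'), (2, 44, 2, 2, '2014-02-24'),
  (3, 44, 3, 2, '2014-02-25'), (4, 44, 4, 3, '2014-02-25'),
  (4, 44, 4, 3, '2014-02-26');
INSERT INTO candidate VALUES
  (1, 44, 1, '2014-02-24'), (2, 44, 3, '2014-02-25'), (3, 44, 3, '2014-02-25'),
  (3, 44, 4, '2014-02-25'), (4, 44, 4, '2014-02-26'),
  (5, 44, 5, '2014-02-28'), (5, 44, 5, '2014-02-28');
""")

rows = conn.execute("""
SELECT a.attendance_date, t.cv_requirement AS job_id, t.cv_target, c.achi
FROM attendance AS a
LEFT JOIN cv_target AS t              -- keeps dates without a target
  ON a.fk_user_id = t.cv_recruiter
 AND a.attendance_date = t.cv_target_date_for
LEFT JOIN (SELECT COUNT(candidate_id) AS achi, fk_job_id, cv_target_date
           FROM candidate
           WHERE fk_posted_user_id = 44
           GROUP BY fk_job_id, cv_target_date) AS c
  ON t.cv_requirement = c.fk_job_id
 AND t.cv_target_date_for = c.cv_target_date
WHERE a.fk_user_id = 44
ORDER BY a.attendance_date, t.cv_requirement
""").fetchall()

for row in rows:
    print(row)
```

Because attendance is the leftmost table, the rows for 2014-02-27 and 2014-02-28 survive the joins with NULLs in the target and count columns.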
UPDATE
With the new data and new requirement of retrieving data where either cv_target or candidate has data for a particular attendance date, you need to add another table to get the job IDs. In your original question you had a table with ID numbers and job titles, but it had no dates.
You might want to rethink your database design. I'm not sure I understand how your tables relate to one another, but those two new records for the candidate table appear to be orphaned. All your joins are based on date, but you don't appear to have a table that links job ID numbers to dates.
You could create a derived table by doing a UNION of cv_target and candidate. Then use the derived table as the left side of the join.
Updated query:
SELECT
DATE_FORMAT(a.attendance_date, '%Y-%m-%d') AS attendance_date,
DATE_FORMAT(t.cv_target_date_for, '%Y-%m-%d') AS cv_target_date_for,
j.job_id,
t.cv_target,
c.achi AS `achi(count)`
FROM
attendance AS a
LEFT JOIN
(
SELECT
cv_requirement AS job_id,
cv_target_date_for AS job_date
FROM
cv_target
WHERE
cv_recruiter = 44
AND cv_target_date_for BETWEEN '2014-02-01' AND '2014-03-01'
UNION
SELECT
fk_job_id AS job_id,
cv_target_date AS job_date
FROM
candidate
WHERE
fk_posted_user_id = 44
AND cv_target_date BETWEEN '2014-02-01' AND '2014-03-01'
) AS j
ON a.attendance_date = j.job_date
LEFT JOIN
cv_target AS t
ON a.fk_user_id = t.cv_recruiter
AND j.job_id = t.cv_requirement
AND j.job_date = t.cv_target_date_for
LEFT JOIN
(
SELECT
COUNT(candidate_id) AS achi,
fk_job_id,
cv_target_date
FROM
candidate
WHERE
fk_posted_user_id = 44
AND cv_target_date BETWEEN '2014-02-01' AND '2014-03-01'
GROUP BY
fk_job_id,
cv_target_date
) AS c
ON j.job_id = c.fk_job_id
AND j.job_date = c.cv_target_date
WHERE
a.fk_user_id = 44
AND a.attendance_date BETWEEN '2014-02-01' AND '2014-03-01'
ORDER BY
ISNULL(t.cv_target_date_for), t.cv_target_date_for, j.job_id
I've created an updated SQL Fiddle showing the output you request, except for cv_target_date_for. It's not possible to output values that do not exist in the data (i.e. 2014-02-27).
If that's a typo and you meant 2014-02-28, then you'll need to select the date from the derived table instead of the cv_target table. And you should probably change the column heading in the result because it's no longer the cv_target_date_for date.
To get the date from either cv_target or candidate, change this line:
DATE_FORMAT(t.cv_target_date_for, '%Y-%m-%d') AS cv_target_date_for,
to this:
DATE_FORMAT(j.job_date, '%Y-%m-%d') AS job_date,
And you may need to tweak the order by expression to suit your needs.
I have a table with columns similar to those below, but with about 30 date columns and 500+ records
id | forcast_date | actual_date
1 10/01/2013 12/01/2013
2 03/01/2013 06/01/2013
3 05/01/2013 05/01/2013
4 10/01/2013 09/01/2013
and what I need to do is get a query with output similar to
week_no | count_forcast | count_actual
1 4 6
2 5 7
3 2 1
etc
My query is
SELECT weekofyear(forcast_date) as week_num,
COUNT(forcast_date) AS count_forcast ,
COUNT(actual_date) AS count_actual
FROM
table
GROUP BY
week_num
but what I am getting is the forcast_date counts repeated in each column, i.e.
week_no | count_forcast | count_actual
1 4 4
2 5 5
3 2 2
Can anyone please tell me the best way to formulate the query to get what I need?
Thanks
Try:
SELECT weekofyear(forcast_date) AS week_forcast,
COUNT(forcast_date) AS count_forcast, t2.count_actual
FROM
t t1 LEFT JOIN (
SELECT weekofyear(actual_date) AS week_actual,
COUNT(actual_date) AS count_actual
FROM t
GROUP BY weekOfYear(actual_date)
) AS t2 ON weekofyear(forcast_date)=week_actual
GROUP BY
weekofyear(forcast_date), t2.count_actual
sqlFiddle
You would have to write about 30 LEFT JOINs (one per date column), and the first date column must not have any empty week (a week with a count of 0), or the joins will miss that week.
Try:
SELECT WeekInYear, ForecastCount, ActualCount
FROM ( SELECT A.WeekInYear, A.ForecastCount, B.ActualCount FROM (
SELECT weekofyear(forecast_date) as WeekInYear,
COUNT(forecast_date) as ForecastCount, 0 as ActualCount
FROM TableWeeks
GROUP BY weekofyear(forecast_date)
) A
INNER JOIN
( SELECT * FROM
(
SELECT weekofyear(actual_date) as WeekInYear,
0 as ForecastCount, COUNT(actual_date) as ActualCount
FROM TableWeeks
GROUP BY weekofyear(actual_date)
) ActualTable ) B
ON A.WeekInYear = B.WeekInYear)
AllTable
GROUP BY WeekInYear;
Here's my Fiddle Demo
Just in case someone else comes along with the same question:
Instead of trying to use some amazing query, I ended up creating an array of date column names and a loop in the program that was calling this query, and performing the same query for each date column name. It is a bit slower, but it does work.
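A rough sketch of that loop, in case it helps someone. It uses Python's built-in sqlite3 module, not MySQL, so `strftime('%W', ...)` stands in for `weekofyear()` and numbers weeks differently (days before the first Monday of the year fall in week 0). The table name `t`, the sample rows and the `DATE_COLUMNS` list are made up for illustration.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INTEGER PRIMARY KEY, forcast_date TEXT, actual_date TEXT);
INSERT INTO t VALUES
  (1, '2013-01-10', '2013-01-12'),
  (2, '2013-01-03', '2013-01-08'),
  (3, '2013-01-05', '2013-01-05'),
  (4, '2013-01-10', '2013-01-16');
""")

# Fixed list of column names (hypothetical); never build this from user input,
# because the names are interpolated straight into the SQL text.
DATE_COLUMNS = ["forcast_date", "actual_date"]

counts = defaultdict(dict)  # week number -> {column name: count}
for col in DATE_COLUMNS:
    query = (
        f"SELECT CAST(strftime('%W', {col}) AS INTEGER) AS week_no, COUNT(*) "
        f"FROM t WHERE {col} IS NOT NULL GROUP BY week_no"
    )
    for week_no, n in conn.execute(query):
        counts[week_no][col] = n

# One row per week, with 0 where a column has no dates in that week.
result = [
    (week, counts[week].get("forcast_date", 0), counts[week].get("actual_date", 0))
    for week in sorted(counts)
]

for row in result:
    print(row)
```

Each column is grouped by its own week number, so the counts are independent, and a week that appears in only one column still shows up in the merged result.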
I have tables:
orders:
id_order id_customer
1 1
2 2
3 1
orders_history
id_history id_order id_order_state date_add
1 1 1 2010-01-01 00:00:00
2 1 2 2010-01-02 00:00:00
3 1 3 2010-01-03 00:00:00
4 2 2 2010-05-01 00:00:00
5 2 3 2011-05-02 00:00:00
6 3 1 2011-05-03 00:00:00
7 3 2 2011-06-01 00:00:00
order_state
id_order_state name
1 New
2 Sent
3 Rejected
4 ...
How can I get all id_order values where the last id_order_state of that order (by last I mean the one with MAX(id_history) or MAX(date_add)) is not equal to 1 or 3?
select oh.id_history, oh.id_order, oh.id_order_state, oh.date_add
from (
select id_order, max(date_add) as MaxDate
from orders_history
where id_order_state not in (1, 3)
group by id_order
) ohm
inner join orders_history oh on ohm.id_order = oh.id_order
and ohm.MaxDate = oh.date_add
I think what he's after is which orders are complete, i.e. judged by their final status, not the latest status that happens to be something other than 1 or 3. So the first pre-query should get the max ID per order regardless of the status code.
select
orders.*
from
( select oh.id_order,
max( oh.id_history ) LastID_HistoryPerOrder
from
orders_history oh
group by
oh.id_order ) PreQuery
join orders_history oh2
on PreQuery.ID_Order = oh2.id_order
AND PreQuery.LastID_HistoryPerOrder = oh2.id_history
AND NOT OH2.id_order_state IN (1, 3) -- this eliminates 1's and 3's from the result set
join Orders -- now anything left after the above is joined to orders
on PreQuery.ID_Order = Orders.ID_Order
Just to re-show YOUR data... I've marked the last SEQUENCE (ID_History) per ORDER... This is what the PREQUERY is going to return...
id_history id_order id_order_state date_add
1 1 1 2010-01-01 00:00:00
2 1 2 2010-01-02 00:00:00
**3 1 3 2010-01-03 00:00:00
4 2 2 2010-05-01 00:00:00
**5 2 3 2011-05-02 00:00:00
6 3 1 2011-05-03 00:00:00
**7 3 2 2011-06-01 00:00:00
The "PreQuery" will result with the following subset
ID_Order LastID_HistoryPerOrder (ID_History)
1 3 (state=3) THIS ONE WILL BE SKIPPED IN FINAL RESULT
2 5 (state=3) THIS ONE WILL BE SKIPPED IN FINAL RESULT
3 7 (state=2)
Now, the result of this is then re-joined back to order history on just these two elements... yet adds the criteria to EXCLUDE the 1,3 entries for "order state".
In this case,
1 would be rejected as its state = 3 (sequence #3),
2 would be rejected since its last history is state = 3 (sequence #5).
3 would be INCLUDED since its state = 2 (sequence #7)
Finally, all that joined to the orders will result with ONE ID, and nicely match up with the orders table on the Order_ID alone and get the desired results.
Another possible solution:
SELECT DISTINCT
OH1.id_order
FROM
Orders_History OH1
LEFT OUTER JOIN Orders_History OH2 ON
OH2.id_order = OH1.id_order AND
OH2.id_order_state IN (1, 3) AND
OH2.date_add >= OH1.date_add
WHERE
OH2.id_order IS NULL
I'm posting this as an answer to my own question because I need to show the results of your queries.
Unfortunately, not all of your answers work. Let's prepare a test environment:
CREATE TABLE `order_history` (
`id_order_history` int(11) NOT NULL AUTO_INCREMENT,
`id_order` int(11) NOT NULL,
`id_order_state` int(11) NOT NULL,
`date_add` datetime NOT NULL,
PRIMARY KEY (`id_order_history`)
) ENGINE=MyISAM AUTO_INCREMENT=11 DEFAULT CHARSET=latin2;
CREATE TABLE `orders` (
`id_order` int(11) NOT NULL AUTO_INCREMENT,
`id_customer` int(11) DEFAULT NULL,
PRIMARY KEY (`id_order`)
) ENGINE=MyISAM AUTO_INCREMENT=8 DEFAULT CHARSET=latin2;
INSERT INTO `order_history`
(`id_order_history`, `id_order`, `id_order_state`, `date_add`) VALUES
(1,1,1,'2011-01-01 00:00:00'),
(2,1,2,'2011-01-01 00:10:00'),
(3,1,3,'2011-01-01 00:20:00'),
(4,2,1,'2011-02-01 00:00:00'),
(5,2,2,'2011-02-01 00:25:01'),
(6,2,3,'2011-02-01 00:25:59'),
(7,3,1,'2011-03-01 00:00:01'),
(8,3,2,'2011-03-01 00:00:02'),
(9,3,3,'2011-03-01 00:01:00'),
(10,3,2,'2011-03-02 00:00:01');
COMMIT;
INSERT INTO `orders` (`id_order`, `id_customer`) VALUES
(1,1),
(2,2),
(3,3),
(4,4),
(5,5),
(6,6),
(7,7);
COMMIT;
Now, let's select the last (max) state date for each order with a simple query:
select id_order, max(date_add) as MaxDate
from `order_history`
group by id_order
This gives us the proper results, no rocket science so far:
id_order MaxDate
---------+-------------------
1 2011-01-01 00:20:00 //last order_state=3
2 2011-02-01 00:25:59 //last order_state=3
3 2011-03-02 00:00:01 //last order_state=2
Now for simplicity, let's change our queries to get orders where the last state is not equal to 3.
We expect a one-row result with id_order = 3.
So let's test our queries:
QUERY 1 made by RedFilter:
select oh.id_order, oh.id_order_state, oh.date_add
from (
select id_order, max(date_add) as MaxDate
from `order_history`
where id_order_state not in (3)
group by id_order
) ohm
inner join `order_history` oh on ohm.id_order = oh.id_order
and ohm.MaxDate = oh.date_add
Result:
id_order id_order_state date_add
-------------------------------------------------
1 2 2011-01-01 00:10:00
2 2 2011-02-01 00:25:01
3 2 2011-03-02 00:00:01
So this result is wrong.
QUERY 2 made by Tom H.:
SELECT DISTINCT OH1.id_order
FROM order_history OH1
LEFT OUTER JOIN order_history OH2 ON
OH2.id_order = OH1.id_order AND
OH2.id_order_state NOT IN (3) AND
OH2.`id_order_history` >= OH1.`id_order_history`
WHERE
OH2.id_order IS NULL
Result:
id_order
--------
1
2
So this result is wrong.
Any suggestions appreciated.
EDIT
Thanks to Andriy M.'s comment we have a proper solution. It's a modification of Tom H.'s query and should look as follows:
SELECT DISTINCT
OH1.id_order
FROM
order_history OH1
LEFT OUTER JOIN order_history OH2 ON
OH2.id_order = OH1.id_order
AND OH2.date_add > OH1.date_add
WHERE OH1.id_order_state NOT IN (3) AND OH2.id_order IS NULL
EDIT 2:
QUERY 3 made by DRapp:
select
distinct orders.`id_order`
from
( select oh.id_order,
max( oh.id_order_history ) LastID_HistoryPerOrder
from
order_history oh
group by
oh.id_order ) PreQuery
join order_history oh2
on PreQuery.id_order = oh2.id_order
AND PreQuery.LastID_HistoryPerOrder = oh2.id_order_history
AND NOT oh2.id_order_state IN (1,3)
join orders
on PreQuery.id_order = orders.id_order
Result:
id_order
--------
3
So this one is finally correct.
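To make the difference between the approaches easy to reproduce, here is a minimal sketch using Python's built-in sqlite3 module and the order_history rows from the test environment above. It runs two queries: one that filters the states before taking the maximum (the pattern that gave wrong results) and one that takes the last history row per order first and filters afterwards. The alias names and the use of id_order_history as the "last" marker are my choices for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_history (id_order_history INTEGER PRIMARY KEY,
                            id_order INTEGER, id_order_state INTEGER,
                            date_add TEXT);
INSERT INTO order_history VALUES
  (1, 1, 1, '2011-01-01 00:00:00'), (2, 1, 2, '2011-01-01 00:10:00'),
  (3, 1, 3, '2011-01-01 00:20:00'), (4, 2, 1, '2011-02-01 00:00:00'),
  (5, 2, 2, '2011-02-01 00:25:01'), (6, 2, 3, '2011-02-01 00:25:59'),
  (7, 3, 1, '2011-03-01 00:00:01'), (8, 3, 2, '2011-03-01 00:00:02'),
  (9, 3, 3, '2011-03-01 00:01:00'), (10, 3, 2, '2011-03-02 00:00:01');
""")

# Filtering first: finds the latest row that is not 1 or 3, for every order.
filter_first = conn.execute("""
SELECT oh.id_order
FROM order_history AS oh
JOIN (SELECT id_order, MAX(id_order_history) AS last_id
      FROM order_history
      WHERE id_order_state NOT IN (1, 3)
      GROUP BY id_order) AS latest
  ON latest.last_id = oh.id_order_history
ORDER BY oh.id_order
""").fetchall()

# Last row first, filter afterwards: keeps only orders whose final state
# is not 1 or 3.
last_then_filter = conn.execute("""
SELECT oh.id_order
FROM order_history AS oh
JOIN (SELECT id_order, MAX(id_order_history) AS last_id
      FROM order_history
      GROUP BY id_order) AS latest
  ON latest.last_id = oh.id_order_history
WHERE oh.id_order_state NOT IN (1, 3)
ORDER BY oh.id_order
""").fetchall()

print(filter_first)
print(last_then_filter)
```

The only difference between the two queries is where the state condition sits, inside the grouped subquery or after the join back to the history row.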