MySQL grouping when using a sub query - mysql

I am trying to create a summary output that shows totals based on values in subqueries, and then group the output by a label.
The query looks like:
select c.name,
(select sum(duration) from dates d
inner join time t1 on d.time_id=t1.id
where d.employee=t.employee
and d.date >= now() - INTERVAL 12 MONTH) as ad,
(select sum(cost) from dates d
inner join time t1 on d.time_id=t1.id
where d.employee=t.employee
and d.date >= now() - INTERVAL 12 MONTH) as ac
FROM time t
inner join employees ee on t.employee=ee.employee
inner join centres c on ee.centre=c.id
where
ee.centre in (4792,4804,4834) group by c.centre
I want this to show me the ad and ac for each centre, but instead it only shows values for ac for the last centre in the list, and the rest show as zero.
If I remove the GROUP BY, I get a list of all the entries, but then it is not summarised in any way, and I need that rollup view.

A SQL statement that is returning a result that is unexpected is not a whole lot to go on.
Without a specification, helpfully illustrated with sample data and expected output, we're just guessing at the result that the query is supposed to achieve.
I think the crux of the problem is the value of t.employee returned for the GROUP BY. Multiple detail rows with a variety of values for t.employee are collapsed into a single row for each value of c.centre, and the value of t.employee is taken from "some row" in the set. (A MySQL-specific non-standard extension allows the query to run, where other RDBMSs would throw an error. We can bring MySQL behavior more in line with the standard by including ONLY_FULL_GROUP_BY in sql_mode, but that would just cause the SQL in the question to throw an error.)
The suggested fix (just a guess) is to derive ac and ad for each employee before doing the GROUP BY, and then aggregate.
(I'm still suspicious of the joins to centres not being included in the subqueries. Is employee the primary key or a unique key in employees? Is centre functionally dependent on employee? So many questions, too many assumptions.)
My guess is that we are after the result returned by a query something like this:
SELECT c.name
     , SUM(v.ad) AS `ad`
     , SUM(v.ac) AS `ac`
  FROM ( -- derive `ad` and `ac` in an inline view before we collapse rows
         SELECT ee.employee
              , ee.centre
              , ( -- derived for each employee
                  SELECT SUM(d1.duration)
                    FROM time t1
                    JOIN dates d1
                      ON d1.time_id = t1.id
                     AND d1.date >= NOW() + INTERVAL -12 MONTH
                   WHERE d1.employee = t.employee
                ) AS `ad`
              , ( -- derived for each employee
                  SELECT SUM(d2.cost)
                    FROM time t2
                    JOIN dates d2
                      ON d2.time_id = t2.id
                     AND d2.date >= NOW() + INTERVAL -12 MONTH
                   WHERE d2.employee = t.employee
                ) AS `ac`
           FROM time t
           JOIN employees ee
             ON ee.employee = t.employee
          WHERE ee.centre in (4792,4804,4834)
          GROUP
             BY ee.employee
              , ee.centre
       ) v
  LEFT
  JOIN centres c
    ON c.id = v.centre
 GROUP
    BY v.centre
     , c.name
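The two-step idea above (totals per employee first, then a rollup per centre) can be tried out in a small sketch. This uses Python's built-in sqlite3 module as a stand-in for MySQL, and a simplified, hypothetical schema: the `time` join and the 12-month date filter from the question are left out, and the centre names and figures are invented sample data.

```python
import sqlite3

# Hypothetical, simplified schema: no `time` table and no date filter.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE centres   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (employee INTEGER PRIMARY KEY, centre INTEGER);
    CREATE TABLE dates     (employee INTEGER, duration INTEGER, cost INTEGER);

    INSERT INTO centres   VALUES (4792, 'North'), (4804, 'South');
    INSERT INTO employees VALUES (1, 4792), (2, 4792), (3, 4804);
    INSERT INTO dates     VALUES (1, 10, 100), (1, 5, 50), (2, 20, 200), (3, 7, 70);
""")

rollup_sql = """
    SELECT c.name, SUM(v.ad) AS ad, SUM(v.ac) AS ac
    FROM (
        -- step 1: one row per employee, with that employee's totals
        SELECT ee.employee, ee.centre,
               SUM(d.duration) AS ad,
               SUM(d.cost)     AS ac
        FROM employees ee
        LEFT JOIN dates d ON d.employee = ee.employee
        GROUP BY ee.employee, ee.centre
    ) v
    -- step 2: collapse the per-employee rows into one row per centre
    JOIN centres c ON c.id = v.centre
    GROUP BY c.id, c.name
    ORDER BY c.id
"""
centre_totals = conn.execute(rollup_sql).fetchall()
print(centre_totals)  # [('North', 35, 350), ('South', 7, 70)]
```

Every column selected in the inner query is either grouped or aggregated, so no value has to be picked from "some row" of a group.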

Related

Subquery issue when trying to find number of things happening in a given month occur more than the previous month

My query is as follows:
select challenges_unique_acronym from presents where
(select count(month(MonthA.present_date)) from presents as MonthA group by month(MonthA.present_date))
>
(select count(month(MonthA.present_date) - 1) from presents as MonthA group by month(MonthA.present_date));
select count(month(MonthA.present_date)) from presents as MonthA group by month(MonthA.present_date); returns the number of times each month appears, with each distinct month on a separate row.
But since the counts come back in multiple rows, I'm getting the "subquery returns more than 1 row" error, which prevents me from comparing them.
Is there a way that I can tell MySQL "For each row check whether:
(select count(month(MonthA.present_date)) from presents as MonthA group by month(MonthA.present_date))
is greater than
(select count(month(MonthA.present_date) - 1) from presents as MonthA group by month(MonthA.present_date));"
You could try using a join with the subqueries:
select presents.challenges_unique_acronym
from presents
INNER JOIN (
select month(MonthA.present_date) month, count(month(MonthA.present_date)) count
from presents as MonthA
group by month(MonthA.present_date)
) t1 ON t1.month = month(presents.present_date)
INNER JOIN (
select month(MonthA.present_date) month, count(month(MonthA.present_date) - 1) count
from presents as MonthA
group by month(MonthA.present_date)
) t2 ON t1.month = t2.month
AND t1.count > t2.count;
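One thing to watch when comparing against the previous month: COUNT(month(...) - 1) counts exactly the same rows as COUNT(month(...)), so the two numbers are always equal. A possible alternative is to compute the per-month counts once and join that result to itself on month - 1. The sketch below uses Python's sqlite3 as a stand-in for MySQL (strftime replaces MONTH()), with invented sample rows. It compares month numbers only, so it ignores the year, and January has no previous month to compare with.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE presents (challenges_unique_acronym TEXT, present_date TEXT);
    -- invented sample data: 1 row in January, 2 in February, 1 in March
    INSERT INTO presents VALUES
        ('A', '2020-01-10'),
        ('B', '2020-02-03'), ('C', '2020-02-17'),
        ('D', '2020-03-05');
""")

growth_sql = """
    WITH per_month AS (
        SELECT CAST(strftime('%m', present_date) AS INTEGER) AS mon,
               COUNT(*) AS cnt
        FROM presents
        GROUP BY mon
    )
    SELECT cur.mon, cur.cnt, prev.cnt
    FROM per_month cur
    JOIN per_month prev ON prev.mon = cur.mon - 1  -- pair each month with the one before it
    WHERE cur.cnt > prev.cnt
    ORDER BY cur.mon
"""
growing_months = conn.execute(growth_sql).fetchall()
print(growing_months)  # [(2, 2, 1)]: February had 2 rows, January had 1
```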

select latest duplicate record from table got long process MySQL

I have a query that displays only one row for each set of duplicates, retrieving the one with the latest date, but it takes a very long time to run. What kind of query should I write to make it more efficient?
Here is my query:
SELECT d.id_qr, d.code_qr, d.date, d3.code id_code
FROM survey d
INNER JOIN place d3 on d.id_code = d3.id_code
WHERE d.date IN (SELECT max(d2.date) FROM survey d2 WHERE d2.code_qr=d.code_qr)
Instead of aggregating, you could try ORDER BY and LIMIT. Also, you probably want an equality instead of IN.
SELECT d.id_qr, d.code_qr, d.date, d3.code id_code
FROM survey d
INNER JOIN place d3 on d.id_code = d3.id_code
WHERE d.date = (
SELECT d2.date
FROM survey d2
WHERE d2.code_qr = d.code_qr
ORDER BY d2.date DESC
LIMIT 1
)
For performance, consider an index on survey(code_qr, date).
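As a rough illustration of the ORDER BY ... LIMIT 1 form together with the suggested index, here is a sketch using Python's sqlite3 as a stand-in for MySQL. The join to `place` is omitted, and the index name and sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE survey (id_qr INTEGER PRIMARY KEY, code_qr TEXT, date TEXT);
    -- composite index: lets the subquery find the latest date per code_qr directly
    CREATE INDEX idx_survey_code_date ON survey (code_qr, date);
    -- invented sample data: QR1 is duplicated, QR2 appears once
    INSERT INTO survey (code_qr, date) VALUES
        ('QR1', '2021-01-01'), ('QR1', '2021-03-01'),
        ('QR2', '2021-02-01');
""")

latest_sql = """
    SELECT d.id_qr, d.code_qr, d.date
    FROM survey d
    WHERE d.date = (
        SELECT d2.date
        FROM survey d2
        WHERE d2.code_qr = d.code_qr
        ORDER BY d2.date DESC
        LIMIT 1
    )
    ORDER BY d.code_qr
"""
latest_rows = conn.execute(latest_sql).fetchall()
print(latest_rows)  # [(2, 'QR1', '2021-03-01'), (3, 'QR2', '2021-02-01')]
```

If two rows for the same code_qr share the latest date, both are returned, just as with the original MAX() version.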

Getting all value from every month, put zero if no data of that month

I'm trying to get data for each month; if no data is found for a particular month, I want to show zero. I already created a calendar table so I can left join it, but I still can't get zero.
Here's my query:
SELECT calendar.month, IFNULL(SUM(transaction_payment.total),0) AS total
FROM `transaction`
JOIN `transaction_payment` ON `transaction_payment`.`trans_id` =
`transaction`.`trans_id`
LEFT JOIN `calendar` ON MONTH(transaction.date_created) = calendar.month
WHERE `date_created` LIKE '2017%' ESCAPE '!'
GROUP BY calendar.month
ORDER BY `date_created` ASC
The values in my calendar table are 1-12 (Jan-Dec), stored as int.
The result should be something like this:
month total
1 0
2 20
3 0
4 2
..
11 0
12 10
UPDATE
The problem seems to be the SUM function
SELECT c.month, COALESCE(t.trans_id, 0) AS total
FROM calendar c
LEFT JOIN transaction t ON month(t.date_created) = c.month AND year(t.date_created) = '2018'
LEFT JOIN transaction_payment tp ON tp.trans_id = t.trans_id
ORDER BY c.month ASC
I tried displaying only the ID and that works well, but when I add back this function, I only get months with values:
COALESCE(SUM(tp.total), 0);
This fixes the issues with your query:
SELECT c.month, COALESCE(SUM(tp.total), 0) AS total
FROM calendar c LEFT JOIN
     transaction t
     ON month(t.date_created) = c.month AND  -- c.month already holds the month number (1-12)
        year(t.date_created) = 2017 LEFT JOIN
     transaction_payment tp
     ON tp.trans_id = t.trans_id
GROUP BY c.month
ORDER BY c.month ASC;
This will only work if the "calendar" table has one row per month -- that seems odd, but that might be your data structure.
Note the changes:
Start with the calendar table, because those are the rows you want to keep.
Do not use LIKE with dates. MySQL has proper date functions. Use them.
The filtering conditions on all but the first table should be in the ON clause rather than the WHERE clause.
I prefer COALESCE() to IFNULL() because COALESCE() is ANSI standard.
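The difference between filtering in the ON clause and filtering in the WHERE clause can be seen in a small sketch. It uses Python's sqlite3 as a stand-in for MySQL, so strftime replaces MONTH() and YEAR(), and the transaction table is named `txn` here (a hypothetical name, chosen to avoid quoting a keyword). The calendar is cut down to four months and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE calendar (month INTEGER PRIMARY KEY);
    CREATE TABLE txn (trans_id INTEGER PRIMARY KEY, date_created TEXT);
    CREATE TABLE transaction_payment (trans_id INTEGER, total INTEGER);

    INSERT INTO calendar VALUES (1), (2), (3), (4);
    -- invented data: two transactions in 2017, one in 2016
    INSERT INTO txn VALUES (1, '2017-02-10'), (2, '2017-04-05'), (3, '2016-02-01');
    INSERT INTO transaction_payment VALUES (1, 20), (2, 2), (3, 99);
""")

# Year filter in the ON clause: calendar rows without a match survive with NULLs.
filter_in_on = conn.execute("""
    SELECT c.month, COALESCE(SUM(tp.total), 0) AS total
    FROM calendar c
    LEFT JOIN txn t
           ON CAST(strftime('%m', t.date_created) AS INTEGER) = c.month
          AND strftime('%Y', t.date_created) = '2017'
    LEFT JOIN transaction_payment tp ON tp.trans_id = t.trans_id
    GROUP BY c.month
    ORDER BY c.month
""").fetchall()

# Year filter in the WHERE clause: rows where t is NULL fail the test and vanish.
filter_in_where = conn.execute("""
    SELECT c.month, COALESCE(SUM(tp.total), 0) AS total
    FROM calendar c
    LEFT JOIN txn t
           ON CAST(strftime('%m', t.date_created) AS INTEGER) = c.month
    LEFT JOIN transaction_payment tp ON tp.trans_id = t.trans_id
    WHERE strftime('%Y', t.date_created) = '2017'
    GROUP BY c.month
    ORDER BY c.month
""").fetchall()

print(filter_in_on)     # [(1, 0), (2, 20), (3, 0), (4, 2)]
print(filter_in_where)  # [(2, 20), (4, 2)]
```

The WHERE version turns the outer join back into an inner join, which is why the empty months disappear.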
With your query as written, you need a RIGHT JOIN, because your calendar table is on the right side. The year filter also has to move into the ON clause; in the WHERE clause it would discard the months that have no transactions:
SELECT calendar.month, IFNULL(SUM(transaction_payment.total),0) AS total
FROM `transaction`
JOIN `transaction_payment` ON `transaction_payment`.`trans_id` =
`transaction`.`trans_id`
RIGHT JOIN `calendar` ON MONTH(transaction.date_created) = calendar.month
AND YEAR(transaction.date_created) = 2017
GROUP BY calendar.month
ORDER BY calendar.month ASC

Select most recent record grouped by 3 columns

I am trying to return the price of the most recent record grouped by ItemNum and FeeSched; Customer can be eliminated. I am having trouble understanding how I can do that reasonably.
The issue is that I am joining about 5 tables containing hundreds of thousands of rows to end up with this result set. The initial query takes about a minute to run, and there has been some trouble with timeout errors in the past. Since this will run on a client's workstation, it may run even slower, and I have no access to modify server settings to increase memory / timeouts.
Here is my data:
Customer Price ItemNum FeeSched Date
5 70.75 01202 12 12-06-2017
5 70.80 01202 12 06-07-2016
5 70.80 01202 12 07-21-2017
5 70.80 01202 12 10-26-2016
5 82.63 02144 61 12-06-2017
5 84.46 02144 61 06-07-2016
5 84.46 02144 61 07-21-2017
5 84.46 02144 61 10-26-2016
I don't have access to create temporary tables or views, and there is no such thing as a #variable in C-tree, but in most ways it acts like MySQL. I wanted to use something like GROUP BY ItemNum, FeeSched and select MAX(Date). The issue is that unless I put Price into the GROUP BY, I get an error.
I could run the query again only selecting ItemNum, FeeSched, Date and then doing an INNER JOIN, but with the query taking a minute to run each time, it seems there is a better way that maybe I don't know.
Here is my query I am running, it isn't really that complicated of a query other than the amount of data it is processing. Final results are about 50,000 rows. I can't share much about the database structure as it is covered under an NDA.
SELECT DISTINCT
CustomerNum,
paid as Price,
ItemNum,
n.pdate as newest
from admin.fullproclog as f
INNER JOIN (
SELECT
id,
itemId,
MAX(TO_CHAR(pdate, 'MM-DD-YYYY')) as pdate
from admin.fullproclog
WHERE pdate > timestampadd(sql_tsi_year, -3, NOW())
group by id, itemId
) as n ON n.id = f.id AND n.itemId = f.itemId AND n.pdate = f.pdate
LEFT join (SELECT itemId AS linkid, ItemNum FROM admin.itemlist) AS codes ON codes.linkid = f.itemId AND ItemNum >0
INNER join (SELECT DISTINCT parent_id,
MAX(ins1.feesched) as CustomerNum
FROM admin.customers AS p
left join admin.feeschedule AS ins1
ON ins1.feescheduleid = p.primfeescheduleid
left join admin.group AS c1
ON c1.insid = ins1.feesched
WHERE status =1
GROUP BY parent_id)
AS ip ON ip.parent_id = f.parent_id
WHERE CustomerNum >0 AND ItemNum >0
UNION ALL
SELECT DISTINCT
CustomerNum,
secpaid as Price,
ItemNum,
n.pdate as newest
from admin.fullproclog as f
INNER JOIN (
SELECT
id,
itemId,
MAX(TO_CHAR(pdate, 'MM-DD-YYYY')) as pdate
from admin.fullproclog
WHERE pdate > timestampadd(sql_tsi_year, -3, NOW())
group by id, itemId
) as n ON n.id = f.id AND n.itemId = f.itemId AND n.pdate = f.pdate
LEFT join (SELECT itemId AS linkid, ItemNum FROM admin.itemlist) AS codes ON codes.linkid = f.itemId AND ItemNum >0
INNER join (SELECT DISTINCT parent_id,
MAX(ins1.feesched) as CustomerNum
FROM admin.customers AS p
left join admin.feeschedule AS ins1
ON ins1.feescheduleid = p.secfeescheduleid
left join admin.group AS c1
ON c1.insid = ins1.feesched
WHERE status =1
GROUP BY parent_id)
AS ip ON ip.parent_id = f.parent_id
WHERE CustomerNum >0 AND ItemNum >0
This seemed quite simple after I read the first three paragraphs, but I got a little confused after reading the whole question.
Whatever you did to get the data posted above, once you have the data in that shape it's easy to retrieve "the most recent record grouped by ItemNum and FeeSched".
How to:
First, sort the whole result set by Date DESC.
Second, select the fields you need from the sorted result set and group by ItemNum, FeeSched without any aggregate functions.
So, the query might be something like this:
SELECT t.Price, t.ItemNum, t.FeeSched, t.Date
FROM (SELECT * FROM table ORDER BY Date DESC) AS t
GROUP BY t.ItemNum, t.FeeSched;
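How it works:
When your data is grouped and you select columns without aggregate functions, MySQL (with ONLY_FULL_GROUP_BY disabled) returns values from a single row of each group, in practice often the first one. Because all rows were sorted before grouping, that first row would be "the most recent record". This behavior is not guaranteed: the MySQL documentation says the server is free to choose any value from each group, and the optimizer may discard the ORDER BY in the derived table.
Contact me if you run into any problems or errors with this approach.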
You can also try this:
SELECT Price, ItemNum, FeeSched, Date FROM table WHERE (ItemNum, FeeSched, Date) IN (SELECT ItemNum, FeeSched, MAX(Date) FROM table GROUP BY ItemNum, FeeSched);
The inner query returns the maximum date for each ItemNum and FeeSched, and the IN condition fetches only the records that match their own group's maximum date.
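A portable way to get the most recent row per group, without relying on how ungrouped columns are picked, is to join the table to a subquery that holds each group's maximum date. The sketch below uses Python's sqlite3 as a stand-in, with the sample rows from the question loaded into a hypothetical `prices` table. The dates are rewritten as YYYY-MM-DD text, because MM-DD-YYYY strings (as produced by TO_CHAR(pdate, 'MM-DD-YYYY') in the question's query) do not sort chronologically under MAX().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prices (Customer INTEGER, Price REAL, ItemNum TEXT, FeeSched INTEGER, Date TEXT)"
)
# Sample rows from the question, with dates converted to ISO format.
rows = [
    (5, 70.75, "01202", 12, "2017-12-06"),
    (5, 70.80, "01202", 12, "2016-06-07"),
    (5, 70.80, "01202", 12, "2017-07-21"),
    (5, 70.80, "01202", 12, "2016-10-26"),
    (5, 82.63, "02144", 61, "2017-12-06"),
    (5, 84.46, "02144", 61, "2016-06-07"),
    (5, 84.46, "02144", 61, "2017-07-21"),
    (5, 84.46, "02144", 61, "2016-10-26"),
]
conn.executemany("INSERT INTO prices VALUES (?, ?, ?, ?, ?)", rows)

latest_sql = """
    SELECT p.Price, p.ItemNum, p.FeeSched, p.Date
    FROM prices p
    JOIN (
        -- one row per group with that group's most recent date
        SELECT ItemNum, FeeSched, MAX(Date) AS max_date
        FROM prices
        GROUP BY ItemNum, FeeSched
    ) latest
      ON latest.ItemNum  = p.ItemNum
     AND latest.FeeSched = p.FeeSched
     AND latest.max_date = p.Date
    ORDER BY p.ItemNum
"""
latest_prices = conn.execute(latest_sql).fetchall()
print(latest_prices)
# [(70.75, '01202', 12, '2017-12-06'), (82.63, '02144', 61, '2017-12-06')]
```

If a group has several rows on its latest date, all of them are returned; an extra tie-breaker would be needed to force a single row.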

Returning data from 2 tables if condition matches

I'm not really sure how to do this. I have a website that tracks when a server goes down. I have one table named servers with the server names and IDs, and another called errors where the error messages are held. I want to return a calendar-like view for the past 7 days that shows whether an error occurred on any of our servers.
So far, I have a query that will find error messages for that day for any one server, but I don't know how to return the servers that are good and had 0 errors.
SELECT errors.error_id, servers.server_id, errors.start_time, servers.name
FROM errors
INNER JOIN servers ON errors.server_id=servers.server_id
WHERE errors.start_time BETWEEN '2014-02-25 00:00:00' AND '2014-02-25 23:59:59'
I have it loop through the 7 days and that all works. But I'm stuck on how to get the IDs and names of the servers that DID NOT go down on that day. I've been thinking about implementing an IF or CASE in the query, but I've never used them before and I'm not quite sure how that would work.
Do I need to run multiple queries for this or is it possible with one?
Instead of looping through the days, do them all at once. Assuming you have at least one error per day, you can get this information from the errors table. Otherwise, you might need a calendar table for this:
SELECT dates.thedate, e.error_id, s.server_id, e.start_time, s.name
FROM (select distinct date(start_time) as thedate
      from errors
      where start_time BETWEEN '2014-02-25 00:00:00' AND '2014-03-03 23:59:59'
     ) dates cross join
     servers s LEFT OUTER JOIN
     errors e
     ON e.server_id = s.server_id and date(e.start_time) = dates.thedate;
This will generate a row for each error for each server per day. If there is no error, there will be a row for each server with NULL in the error fields. If you want to aggregate this:
SELECT dates.thedate, s.server_id, s.name, count(e.error_id) as numErrors,
       group_concat(e.error_id order by e.start_time) as errorIds,
       group_concat(e.start_time order by e.start_time) as startTimes
FROM (select distinct date(start_time) as thedate
      from errors
      where start_time BETWEEN '2014-02-25 00:00:00' AND '2014-03-03 23:59:59'
     ) dates cross join
     servers s LEFT OUTER JOIN
     errors e
     ON e.server_id = s.server_id and date(e.start_time) = dates.thedate
GROUP BY dates.thedate, s.server_id, s.name;
EDIT:
Without a calendar table, you can insert each day into the query like this:
SELECT dates.thedate, s.server_id, s.name, count(e.error_id) as numErrors,
       group_concat(e.error_id order by e.start_time) as errorIds,
       group_concat(e.start_time order by e.start_time) as startTimes
FROM (select date('2014-02-25') as thedate union all
      select date('2014-02-26') union all
      select date('2014-02-27') union all
      select date('2014-02-28') union all
      select date('2014-03-01') union all
      select date('2014-03-02') union all
      select date('2014-03-03')
     ) dates cross join
     servers s LEFT OUTER JOIN
     errors e
     ON e.server_id = s.server_id and date(e.start_time) = dates.thedate
GROUP BY dates.thedate, s.server_id, s.name;
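To see the shape of the result, here is a reduced sketch of the same dates-cross-join-servers pattern, using Python's sqlite3 as a stand-in for MySQL. It covers only two days, leaves out the group_concat columns, and uses invented servers and errors. It counts e.error_id rather than rows, so that a server with no errors on a day reports 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE servers (server_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE errors  (error_id INTEGER PRIMARY KEY, server_id INTEGER, start_time TEXT);

    -- invented sample data
    INSERT INTO servers VALUES (1, 'web'), (2, 'db');
    INSERT INTO errors VALUES
        (10, 1, '2014-02-25 08:00:00'),
        (11, 1, '2014-02-25 09:30:00'),
        (12, 2, '2014-02-26 12:00:00');
""")

calendar_sql = """
    SELECT dates.thedate, s.server_id, s.name,
           COUNT(e.error_id) AS numErrors   -- NULL error_ids are not counted
    FROM (SELECT '2014-02-25' AS thedate UNION ALL
          SELECT '2014-02-26') dates
    CROSS JOIN servers s                    -- every server on every day
    LEFT JOIN errors e
           ON e.server_id = s.server_id
          AND date(e.start_time) = dates.thedate
    GROUP BY dates.thedate, s.server_id, s.name
    ORDER BY dates.thedate, s.server_id
"""
error_calendar = conn.execute(calendar_sql).fetchall()
for row in error_calendar:
    print(row)
# ('2014-02-25', 1, 'web', 2)
# ('2014-02-25', 2, 'db', 0)
# ('2014-02-26', 1, 'web', 0)
# ('2014-02-26', 2, 'db', 1)
```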