I have an Access table with multiple date entries for each unique identifier:
Year ID TotalSpent
2003-2004 001 1000
2002-2003 001 900
2001-2002 001 100
2009-2010 002 8000
2008-2009 002 4000
2000-2001 003 100
1999-2000 003 0
I want to keep the latest (top) entry for each unique ID to produce:
Year ID TotalSpent
2003-2004 001 1000
2009-2010 002 8000
2000-2001 003 100
I have looked at the top() function but cannot get it to produce more than 1 result (as opposed to 1 result for each unique ID). Any help would be appreciated.
Remou makes a valid point that a unique ID would be beneficial, as it would let you refer to the top row in the future, but this could be a constraint outside of your control.
The data source is a bit awkward: the hyphenated years prevent a simple grouping query. The second issue is that you cannot simply take the max of the TotalSpent field for each ID, as the largest value may not belong to the latest year (a large refund, for instance, may affect a year's total).
My solution finds the latest year for each ID (Query A) and then rebuilds the hyphenated year tag so that it can be joined back onto the table. I didn't want to perform a join on a calculated field, so I have wrapped it in another subquery (Query B). This is then joined onto the original table/query to extract the key rows and values.
SELECT YourTable.[YourYearField],
YourTable.ID,
YourTable.TotalSpent
FROM (SELECT A.ID,
[StartYear] & "-" & [EndYear] AS Grouping
FROM (SELECT YourTable.ID,
Max(Val(Right$([YourYearField], 4))) AS EndYear,
Max(Val(Right$([YourYearField], 4)) - 1) AS StartYear
FROM YourTable
GROUP BY YourTable.ID) AS A
GROUP BY A.ID,
[StartYear] & "-" & [EndYear]) AS B
INNER JOIN YourTable
ON ( B.Grouping = YourTable.[YourYearField] )
AND ( B.ID = YourTable.ID )
GROUP BY YourTable.[YourYearField],
YourTable.ID,
YourTable.TotalSpent;
You can get the Year and ID values you want with this query:
SELECT ID, Max([Year]) AS MaxOfYear
FROM YourTable
GROUP BY ID;
Then to get the corresponding TotalSpent values, use that SQL for a subquery which you join to YourTable.
SELECT y.Year, y.ID, y.TotalSpent
FROM
YourTable AS y
INNER JOIN
(
SELECT ID, Max([Year]) AS MaxOfYear
FROM YourTable
GROUP BY ID
) AS sub
ON
(y.Year = sub.MaxOfYear)
AND (y.ID = sub.ID);
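As a rough check of the join-to-subquery approach, here is a minimal sketch that runs the same query against the sample rows from the question. It uses Python's built-in sqlite3 module purely as a stand-in for Access (an assumption made so the example is runnable anywhere), and it relies on the "YYYY-YYYY" labels sorting correctly as plain text because every year has four digits.

```python
import sqlite3

# SQLite is only a stand-in for Access here; table and column names follow the answer above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (Year TEXT, ID TEXT, TotalSpent INTEGER)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [
        ("2003-2004", "001", 1000),
        ("2002-2003", "001", 900),
        ("2001-2002", "001", 100),
        ("2009-2010", "002", 8000),
        ("2008-2009", "002", 4000),
        ("2000-2001", "003", 100),
        ("1999-2000", "003", 0),
    ],
)

# Find the latest Year per ID, then join back to pick up TotalSpent for that row.
latest_rows = conn.execute(
    """
    SELECT y.Year, y.ID, y.TotalSpent
    FROM YourTable AS y
    INNER JOIN (
        SELECT ID, MAX(Year) AS MaxOfYear
        FROM YourTable
        GROUP BY ID
    ) AS sub
        ON y.Year = sub.MaxOfYear AND y.ID = sub.ID
    ORDER BY y.ID
    """
).fetchall()

for row in latest_rows:
    print(row)
```

The ORDER BY is added only to make the output order predictable; it is not part of the original answer.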
Related
I have two tables, issue and receipt where I am issuing and receiving quantities :
IssueTable:
Order Type Qty
OD12 A 48
OD19 A 33
OD12 B 14
ReceiptTable:
Order Type Qty
OD12 A 20
OD19 A 15
OD12 B 11
The desired result that I want:
Balance:
Order Type Qty
OD12 A 28
OD19 A 18
OD12 B 03
IssueTable contains details of orders which have been issued; a single order can have multiple "Type" values of products. Similarly, ReceiptTable contains details of orders which have been completed and received. I want a Balance table which subtracts the receipt qty from the issue qty, matched on Order and Type.
SELECT `Order`,
`Type`,
COALESCE(IssueTable.Qty, 0) - COALESCE(ReceiptTable.Qty, 0) Qty
FROM ( SELECT `Order`, `Type` FROM IssueTable
UNION
SELECT `Order`, `Type` FROM ReceiptTable ) TotalTable
LEFT JOIN IssueTable USING (`Order`, `Type`)
LEFT JOIN ReceiptTable USING (`Order`, `Type`);
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=cafd416abcbf7ab31f54bf6efbd6566f
The query assumes that (Order, Type) is unique in each separate table. If not, use aggregating subqueries instead of the tables themselves.
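A minimal sketch of the aggregating-subquery variant, for the case where (Order, Type) can repeat within a table. The duplicate OD12/A issue rows are hypothetical data invented for illustration, the aliases k, i and r are my own, and SQLite (through Python's sqlite3) stands in for MySQL, so identifiers are quoted with double quotes rather than backticks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE IssueTable   ("Order" TEXT, "Type" TEXT, Qty INTEGER);
    CREATE TABLE ReceiptTable ("Order" TEXT, "Type" TEXT, Qty INTEGER);
    -- Hypothetical data: OD12/A is issued in two lots, so (Order, Type) repeats.
    INSERT INTO IssueTable VALUES
        ('OD12', 'A', 40), ('OD12', 'A', 8), ('OD19', 'A', 33), ('OD12', 'B', 14);
    INSERT INTO ReceiptTable VALUES
        ('OD12', 'A', 20), ('OD19', 'A', 15), ('OD12', 'B', 11);
    """
)

# Each side is summed per (Order, Type) first, so repeated keys cannot multiply rows in the join.
balance = conn.execute(
    """
    SELECT k."Order", k."Type",
           COALESCE(i.Qty, 0) - COALESCE(r.Qty, 0) AS Qty
    FROM (SELECT "Order", "Type" FROM IssueTable
          UNION
          SELECT "Order", "Type" FROM ReceiptTable) AS k
    LEFT JOIN (SELECT "Order", "Type", SUM(Qty) AS Qty
               FROM IssueTable GROUP BY "Order", "Type") AS i
           ON i."Order" = k."Order" AND i."Type" = k."Type"
    LEFT JOIN (SELECT "Order", "Type", SUM(Qty) AS Qty
               FROM ReceiptTable GROUP BY "Order", "Type") AS r
           ON r."Order" = k."Order" AND r."Type" = k."Type"
    ORDER BY k."Type", k."Order"
    """
).fetchall()

for row in balance:
    print(row)
```

With this data the two OD12/A issue lots add up to 48, so the balances match the desired result in the question.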
You may try using a join approach:
SELECT it.`Order`, it.Type, it.Qty - rt.Qty AS Qty
FROM IssueTable it
INNER JOIN ReceiptTable rt
ON rt.`Order` = it.`Order` AND rt.Type = it.Type;
This answer assumes that every order would have a matching receipt. If not, the approach might have to change slightly based on your expectations. As a side note, ORDER is a reserved keyword in MySQL, and you should avoid naming your columns and tables using it.
I am trying to get all students grouped by class_id, student_id, teacher_id.
So what I mean is this:
Select id,class_id, student_id,teacher_id, max(active)
FROM student_classes
GROUP BY class_id, student_id, teacher_id
But this is what I get
Actually what I want as a result is:
114 137 1 47 1
108 138 2 49 0
113 197 3 47 1
So basically the problem is at the third row. Instead of having id = 113 I get ID=111.
What should I do in this case? Can you please help me with the query?
As mentioned in the comments, MySQL allows something against the SQL standard, letting you include a non-aggregated column (in this case id) in the select list of a query that includes a group by. As far as I know, it will arbitrarily pick one row in each grouping and display the id value from that row.
If you have a specific rule about which id value you want to see, you need to express that in your query.
By the way, your desired output appears to have multiple typos (e.g. 197, which doesn't appear in your data at all).
From your comment (which you should edit into your original question), and your desired output, I think the rule you want for the id column is:
If there are any rows with active=1 in the group, choose the maximum id value from those rows
If all rows in the group have active=0, choose the minimum id value. (You didn't say this specifically; I'm assuming it based on the presence of 108 on the second row of your desired output.)
I think that this query will produce those results. (And also eliminate the non-standard MySQL behavior.)
SELECT
COALESCE(
MAX(CASE WHEN active=1 THEN id ELSE NULL END),
MIN(id)
) AS some_id,
class_id, student_id, teacher_id, max(active)
FROM student_classes
GROUP BY class_id, student_id, teacher_id
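To see how the COALESCE rule behaves, here is a small sketch run against made-up rows. The question's actual data is not shown, so every row below is hypothetical, chosen only so that one group has several active rows and another has none. SQLite via Python's sqlite3 is used as a stand-in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student_classes "
    "(id INTEGER, class_id INTEGER, student_id INTEGER, teacher_id INTEGER, active INTEGER)"
)
# Hypothetical rows, not the asker's data.
conn.executemany(
    "INSERT INTO student_classes VALUES (?, ?, ?, ?, ?)",
    [
        (108, 138, 2, 49, 0),
        (109, 138, 2, 49, 0),
        (111, 139, 3, 47, 1),
        (112, 139, 3, 47, 0),
        (113, 139, 3, 47, 1),
        (114, 137, 1, 47, 1),
    ],
)

# Highest id among active rows if any exist, otherwise the lowest id in the group.
picked = conn.execute(
    """
    SELECT COALESCE(MAX(CASE WHEN active = 1 THEN id END), MIN(id)) AS some_id,
           class_id, student_id, teacher_id, MAX(active) AS active
    FROM student_classes
    GROUP BY class_id, student_id, teacher_id
    ORDER BY class_id
    """
).fetchall()

for row in picked:
    print(row)
```

The group with no active rows falls back to MIN(id) (108 here), while the group with two active rows reports the larger active id (113 rather than 111).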
MySQL versions 5.5 and 5.6 run the query as you coded it, but the result is not actually correct. Version 5.7 and higher, where ONLY_FULL_GROUP_BY is enabled by default, will throw an error instead. The error will be like "SELECT list is not in GROUP BY clause and contains nonaggregated column 'student_classes.id'..."
It therefore seems your DB version is old, and maybe this code will work as you wanted:
select
-- lowest id among the rows that carry the group's highest active value
min(x.id) as id,
x.class_id,
x.student_id,
x.teacher_id,
x.active
from student_classes x
inner join (select
class_id,
student_id,
teacher_id,
-- highest active value per group
max(active) max_active
from student_classes
group by class_id, student_id, teacher_id
) y
on x.class_id = y.class_id and
x.student_id = y.student_id and
x.teacher_id = y.teacher_id and
x.active = y.max_active
group by x.class_id, x.student_id, x.teacher_id, x.active
order by id, class_id, student_id
;
You don't want an aggregation actually, but rather pick particular rows. The rule for picking a row is: Per class_id, student_id, teacher_id get the one with the maximum active and in case of a tie the lowest id. This is a ranking of rows.
As of MySQL 8 you can use a window function like ROW_NUMBER to rank rows:
select *
from
(
select
sc.*,
row_number() over (partition by class_id, student_id, teacher_id
order by active desc, id) as rn
from student_classes sc
) with_wanted_id
where rn = 1;
In older versions you could use NOT EXISTS to exclude rows for which a better row exists:
select *
from student_classes sc1
where not exists
(
select null
from student_classes sc2
where sc2.class_id = sc1.class_id
and sc2.student_id = sc1.student_id
and sc2.teacher_id = sc1.teacher_id
and
(
sc2.active > sc1.active
or
(sc2.active = sc1.active and sc2.id < sc1.id)
)
);
I've got a table locations:
user | timestamp | state | other_data
-------------------------------------
1 100 1 some_data
1 200 1 some_data
1 300 0 some_data
1 400 0 some_data
2 100 0 some_data
2 200 0 some_data
This is for a location tracking app. A location has two states (1 for "user is within range" and 0 for "user is out of range").
Now I want to retrieve the last time a user's location state has changed.
Example for user = 1
user | timestamp | state | other_data
-------------------------------------
1 300 0 some_data
Because this was the first location update that has the same state value as the "current" (timestamp 400) record.
Higher-level description: I want to display the user something like "You have been in / out of range since [timestamp]"
The faster the solution, the better of course
I would use ranks to order the rows and then pick the min timestamp of the first ranked rows.
select user,min(timestamp) as timestamp,min(state) as state
from
(select l.*,@rn:=case when @user=user and @state=state then @rn
when @user<>user then 1
else @rn+1 end as rnk
,@user:=user,@state:=state
from locations l
cross join (select @rn:=0,@user:='',@state:='') r
order by user,timestamp desc
) t
where rnk=1
group by user
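The same idea (take the earliest row of the most recent run of equal states) can also be sketched without user variables. This is an alternative formulation of my own, not the query from the answer above: it keeps only rows that have no later row with a different state, then takes the minimum timestamp per user. SQLite via Python's sqlite3 stands in for MySQL, using the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE locations (user INTEGER, timestamp INTEGER, state INTEGER, other_data TEXT)"
)
conn.executemany(
    "INSERT INTO locations VALUES (?, ?, ?, ?)",
    [
        (1, 100, 1, "some_data"),
        (1, 200, 1, "some_data"),
        (1, 300, 0, "some_data"),
        (1, 400, 0, "some_data"),
        (2, 100, 0, "some_data"),
        (2, 200, 0, "some_data"),
    ],
)

# A row belongs to the current run if no later row for the same user has a different state.
since = conn.execute(
    """
    SELECT l.user, MIN(l.timestamp) AS since_ts, l.state
    FROM locations AS l
    WHERE NOT EXISTS (
        SELECT 1
        FROM locations AS l2
        WHERE l2.user = l.user
          AND l2.timestamp > l.timestamp
          AND l2.state <> l.state
    )
    GROUP BY l.user, l.state
    ORDER BY l.user
    """
).fetchall()

for row in since:
    print(row)
```

All surviving rows for a user share the state of that user's newest row, so grouping by user and state still yields one row per user. How this performs on a large table is untested here; an index on (user, timestamp) would presumably help.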
You can do this with a correlated subquery:
select l.*
from locations l
where l.timestamp = (select max(l2.timestamp)
from locations l2
where l2.user = l.user
);
For this to work well, you want an index on locations(user, timestamp).
This can be faster than the join and group by approach. In particular, the correlated subquery can make use of an index, but a group by on the whole table often does not (in MySQL).
As far as I am aware the only way to achieve this is a self join. Something a bit like:
Select table.* From table
Inner Join
(Select id, max(timestamp) as tm from table group by id) as m
On m.tm = table.timestamp and m.id = table.id
Syntax is for MsSQL, it should transfer to MySQL though. Might have to specify column names instead of table.*
I have a MySQL table where there are many rows for each person, and I want to write a query which aggregates rows with special constraint. (one per person)
For example, let's say the table consists of the following data.
name date reason
---------------------------------------
John 2013-04-01 14:00:00 Vacation
John 2013-03-31 18:00:00 Sick
Ted 2012-05-06 20:00:00 Sick
Ted 2012-02-20 01:00:00 Vacation
John 2011-12-21 00:00:00 Sick
Bob 2011-04-02 20:00:00 Sick
I want to see the distribution of 'reason' column. If I just write a query like below
select reason, count(*) as count from table group by reason
then I will be able to see number of reasons for this table overall.
reason count
------------------
Sick 4
Vacation 2
However, I am only interested in single reason from each person. The reason that should be counted should be from a row with latest date from the person's records. For example, John's latest reason would be Vacation while Ted's latest reason would be Sick. And Bob's latest reason (and the only reason) is Sick.
The expected result for that query should be like below. (Sum of count will be 3 because there are only 3 people)
reason count
-----------------
Sick 2
Vacation 1
Is it possible to write a query such that single latest reason will be counted when I want to see distribution(count) of reasons?
Here are some facts about the table.
The table has tens of millions of rows
Most of the time, each person has one reason.
Some people have multiple reasons, but 99.99% of people have fewer than 5 reasons.
There are about 30 different reasons while there are millions of distinct names.
The table is partitioned based on date range.
SELECT T.REASON, COUNT(*)
FROM
(
SELECT PERSON, MAX(DATE) AS MAX_DATE
FROM TABLE-NAME
GROUP BY PERSON
) A, TABLE-NAME T
WHERE T.PERSON = A.PERSON AND T.DATE = A.MAX_DATE
GROUP BY T.REASON
Try this
select reason, count(*) from
(select reason from table where (name, date) in
(select name, max(date) from table group by name)) t
group by reason
In MySQL (before version 8.0), it's not very efficient to do this kind of query, since you don't have access to tools like the partitioning (window) functions in SQL Server or Oracle.
You can still emulate it by doing a subquery and retrieving the rows based on the condition you need, here the maximum date:
SELECT t.reason, COUNT(1)
FROM
(
SELECT name, MAX(adate) AS maxDate
FROM #aTable
GROUP BY name
) maxDateRows
INNER JOIN #aTable t ON maxDateRows.name = t.name
AND maxDateRows.maxDate = t.adate
GROUP BY t.reason
You can see a sample here.
Test this query on your samples, but I'm afraid that it will be slow as hell.
For your information, you can do the same thing in a more elegant and much much faster way in SQL Server :
SELECT reason, COUNT(1)
FROM
(
SELECT name
, reason
, RANK() OVER(PARTITION BY name ORDER BY adate DESC) as Rank
FROM #aTable
) AS rankTable
WHERE Rank = 1
GROUP BY reason
The sample is here
If you are really stuck with MySQL and the first query is too slow, then you can split the problem.
Do a first query creating a table:
CREATE TABLE maxDateRows AS
SELECT name, MAX(adate) AS maxDate
FROM #aTable
GROUP BY name
Then create index on both name and maxDate.
Finally, get the results :
SELECT t.reason, COUNT(1)
FROM maxDateRows m
INNER JOIN #aTable t ON m.name = t.name
AND m.maxDate = t.adate
GROUP BY t.reason
The solution you are looking for seems to be solved by this query :
select
reason,
count(*)
from (select * from tablename group by name) abc
group by
reason
It is quite fast and simple. You can view the SQL Fiddle
Apologies if this answer duplicates an existing one. Maybe I'm suffering from some form of aphasia but I cannot see it...
SELECT x.reason
, COUNT(*)
FROM absentism x
JOIN
( SELECT name,MAX(date) max_date FROM absentism GROUP BY name) y
ON y.name = x.name
AND y.max_date = x.date
GROUP
BY reason;
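A quick sketch that runs this join against the six sample rows from the question, to confirm it gives the expected distribution. The table name absences is a placeholder of my own, and SQLite via Python's sqlite3 stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE absences (name TEXT, date TEXT, reason TEXT)")
conn.executemany(
    "INSERT INTO absences VALUES (?, ?, ?)",
    [
        ("John", "2013-04-01 14:00:00", "Vacation"),
        ("John", "2013-03-31 18:00:00", "Sick"),
        ("Ted", "2012-05-06 20:00:00", "Sick"),
        ("Ted", "2012-02-20 01:00:00", "Vacation"),
        ("John", "2011-12-21 00:00:00", "Sick"),
        ("Bob", "2011-04-02 20:00:00", "Sick"),
    ],
)

# Latest date per person, joined back to fetch that row's reason, then counted per reason.
distribution = conn.execute(
    """
    SELECT x.reason, COUNT(*) AS cnt
    FROM absences AS x
    JOIN (SELECT name, MAX(date) AS max_date
          FROM absences
          GROUP BY name) AS y
      ON y.name = x.name AND y.max_date = x.date
    GROUP BY x.reason
    ORDER BY x.reason
    """
).fetchall()

for row in distribution:
    print(row)
```

One caveat worth stating: if a person has two rows sharing the same latest date, both rows match the join and that person is counted twice.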
Here is a simplified version of my table:
group price spec
a 1 .
a 2 ..
b 1 ...
b 2
c .
. .
. .
I'd like to produce a result like this: (I'll refer to this as result_table)
price_a |spec_a |price_b |spec_b |price_c ...|total_cost
1 |. |1 |.. |... |
(min) (min) =1+1+...
Basically I want to:
select the rows containing the min price within each group
combine columns into a single row
I know this can be done using several queries and/or combined with some non-sql processing on the results, but I suspect that there maybe better solutions.
The reason that I want to do task 2 (combine columns into a single row)
is because I want to do something like the following with the result_table:
select *,
(result_table.total_cost + table1.price + table2.price) as total_combined_cost
from result_table
right join table1
right join table2
This may be too much to ask for, so here is some other thoughts on the problem:
Instead of trying to combine multiple rows(task 2), store them in a temporary table
(which would be easier to calculate the total_cost using sum)
Feel free to drop any thoughts, don't have to be complete answer, I feel it's brilliant enough if you have an elegant way to do task 1 !
==Edited/Added 6 Feb 2012==
The goal of my program is to identify the best combinations of items with minimal cost (preferably ones that also have higher utilitarian value).
Considering @ypercube's comment about a large number of groups, a temporary table seems to be the only feasible solution. It was also pointed out that there is no pivoting function in MySQL (although it can be implemented, it's not necessary to perform such an operation).
Okay, after studying @Johan's answer, I'm thinking about something like this for task 1:
select * from
(
select * from
result_table
order by price asc
) as ordered_table
group by group
;
Although looks dodgy, it seems to work.
==Edited/Added 7 Feb 2012==
Since more than one combination may produce the same min value, I have modified my answer:
select result_table.* from
(
select * from
(
select * from
result_table
order by price asc
) as ordered_table
group by group
) as single_min_table
inner join result_table
on result_table.group = single_min_table.group
and result_table.price = single_min_table.price
;
However, I have just realised that there is another problem I need to deal with:
I cannot ignore the specs entirely: there is a provider property, and items from different providers may or may not be able to be assembled together. To be safe (and to simplify my problem) I have decided to combine items from the same provider only, so the problem becomes:
For example if I have an initial table like this(with only 2 groups and 2 providers):
id group price spec provider
1 a 1 . x
2 a 2 .. y
3 a 3 ... y
4 b 1 ... y
5 b 2 x
6 b 3 z
I need to combine
id group price spec provider
1 a 1 . x
5 b 2 x
and
2 a 2 .. y
4 b 1 ... y
The record with id 6 can be eliminated from the choices, since its provider does not have all the groups available.
So it's not necessarily to select only the min of each group, rather it's to select one from each group so that for each provider I have a minimal combined cost.
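One possible way to express the per-provider version, offered as an untested-at-scale sketch rather than a definitive answer: take the cheapest item per (provider, group), sum those per provider, and keep only providers that cover every group. The table name items and the column name grp (used instead of the reserved word group) are placeholders of my own, and SQLite via Python's sqlite3 stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER, grp TEXT, price INTEGER, spec TEXT, provider TEXT)"
)
# Rows from the example table above; blank specs are stored as NULL.
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?, ?)",
    [
        (1, "a", 1, ".", "x"),
        (2, "a", 2, "..", "y"),
        (3, "a", 3, "...", "y"),
        (4, "b", 1, "...", "y"),
        (5, "b", 2, None, "x"),
        (6, "b", 3, None, "z"),
    ],
)

# Inner query: cheapest price per (provider, group).
# Outer query: total per provider, keeping only providers that appear in every group.
provider_totals = conn.execute(
    """
    SELECT provider, SUM(min_price) AS total_cost
    FROM (SELECT provider, grp, MIN(price) AS min_price
          FROM items
          GROUP BY provider, grp) AS per_group
    GROUP BY provider
    HAVING COUNT(*) = (SELECT COUNT(DISTINCT grp) FROM items)
    ORDER BY provider
    """
).fetchall()

for row in provider_totals:
    print(row)
```

Provider z drops out because it only offers group b. This returns the totals only; fetching the matching item rows would need a further join back to the table.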
You cannot pivot in MySQL, but you can group results together.
The GROUP_CONCAT function will give you a result like this:
column A column B column c column d
groups specs prices sum(price)
a,b,c some,list,xyz 1,5,7 13
Here's a sample query:
(The query assumes you have a primary (or unique) key called id defined on the target table).
SELECT
GROUP_CONCAT(a.`group`) as `groups`
,GROUP_CONCAT(a.spec) as specs
,GROUP_CONCAT(a.price) as prices
,SUM(a.price) as total_of_min_prices
FROM
( SELECT price, spec, `group` FROM table1
WHERE id IN
(SELECT MIN(t.id) FROM table1 t
INNER JOIN (SELECT `group`, MIN(price) AS min_price FROM table1 GROUP BY `group`) m
ON m.`group` = t.`group` AND m.min_price = t.price
GROUP BY t.`group`)
) AS a
See: http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html
Producing the total_cost only:
SELECT SUM(min_price) AS total_cost
FROM
( SELECT MIN(price) AS min_price
FROM TableX
GROUP BY `group`
) AS grp
If a result set with the minimum prices returned in rows (not in columns) per group is fine, then your problem is of the greatest-n-per-group type. There are various methods to solve it. Here's one:
SELECT tg.grp,
tm.price AS min_price,
tm.spec
FROM
( SELECT DISTINCT `group` AS grp
FROM TableX
) AS tg
JOIN
TableX AS tm
ON
tm.PK = -- PK stands for the Primary Key of the table
( SELECT tmin.PK
FROM TableX AS tmin
WHERE tmin.`group` = tg.grp
ORDER BY tmin.price ASC
LIMIT 1
)