MySQL Query for finding a "LAST" row, based on two fields - mysql

I have the following MySQL table to log the registration status changes of pupils:
CREATE TABLE `pupil_registration_statuses` (
`status_id` INT(11) NOT NULL AUTO_INCREMENT,
`status_pupil_id` INT(10) UNSIGNED NOT NULL,
`status_status_id` INT(10) UNSIGNED NOT NULL,
`status_effectivedate` DATE NOT NULL,
PRIMARY KEY (`status_id`),
INDEX `status_pupil_id` (`status_pupil_id`)
)
COLLATE='utf8_general_ci'
ENGINE=MyISAM;
Example data:
INSERT INTO `pupil_registration_statuses` (`status_id`, `status_pupil_id`, `status_status_id`, `status_effectivedate`) VALUES
(1, 123, 1, '2013-05-06'),
(2, 123, 2, '2014-03-15'),
(3, 123, 5, '2013-03-15'),
(4, 123, 6, '2013-05-06'),
(5, 234, 2, '2013-02-02'),
(6, 234, 4, '2013-04-17'),
(7, 345, 2, '2014-02-01'),
(8, 345, 3, '2013-06-01');
Statuses can be inserted after the fact, so the sequence of dates does not necessarily follow the sequence of IDs.
For example: status_id 1 might have a date of 2013-05-06, but status_id 3 might have a date of 2013-03-15.
status_id values are, however, sequential within any particular date. Thus if a pupil's registration status changes multiple times on one day, the last row will reflect their status for that date.
It is necessary to find out a particular student's registration status on a particular date. The following query works for an individual pupil:
SELECT *
FROM pupil_registration_statuses
WHERE status_pupil_id = 123
AND status_effectivedate <= '2013-05-06'
ORDER BY status_effectivedate DESC, status_id DESC
LIMIT 1;
This returns the expected row of status_id = 4
However, I now need to issue a (single) query to return the status for all pupils on a particular date.
The following query is proposed, but doesn't obey the "last status_id in a day" requirement:
SELECT *
FROM pupil_registration_statuses prs
INNER JOIN (SELECT status_pupil_id, MAX(status_effectivedate) last_date
FROM pupil_registration_statuses
WHERE status_effectivedate <= '2013-05-06'
GROUP BY status_pupil_id) qprs ON prs.status_pupil_id = qprs.status_pupil_id AND prs.status_effectivedate = qprs.last_date;
This query, however, returns 2 rows for pupil 123.
EDIT
To clarify, if the input is the date '2013-05-06', I expect to get the rows 4 and 6 from the query.
http://sqlfiddle.com/#!2/68ee6/2

Is this what you're after?
SELECT a.*
FROM pupil_registration_statuses a
JOIN
( SELECT prs.status_pupil_id
, MAX(prs.status_id) max_status_id
FROM pupil_registration_statuses prs
JOIN
( SELECT status_pupil_id
, MAX(status_effectivedate) last_date
FROM pupil_registration_statuses
WHERE status_effectivedate <= '2013-05-06'
GROUP
BY status_pupil_id
) qprs
ON prs.status_pupil_id = qprs.status_pupil_id
AND prs.status_effectivedate = qprs.last_date
GROUP
BY prs.status_pupil_id
) b
ON b.max_status_id = a.status_id;
http://sqlfiddle.com/#!2/68ee6/7
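(Incidentally, there's an ugly and undocumented hack for this kind of problem which goes something like this:
SELECT x.* FROM (SELECT * FROM pupil_registration_statuses WHERE status_effectivedate <= '2013-05-06' ORDER BY status_pupil_id, status_effectivedate DESC, status_id DESC) x GROUP BY status_pupil_id;
...but I didn't tell you that! It relies on MySQL returning the first row of each group, which is not guaranteed, and it is rejected when ONLY_FULL_GROUP_BY is enabled. ;) )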
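The requirement from the question (latest date on or before the cutoff, then the highest status_id within that date) can be checked against the sample data with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, stores the dates as ISO text, and the helper names make_db and status_ids_on are made up for the illustration.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (1, 123, 1, "2013-05-06"),
    (2, 123, 2, "2014-03-15"),
    (3, 123, 5, "2013-03-15"),
    (4, 123, 6, "2013-05-06"),
    (5, 234, 2, "2013-02-02"),
    (6, 234, 4, "2013-04-17"),
    (7, 345, 2, "2014-02-01"),
    (8, 345, 3, "2013-06-01"),
]

# Step 1 (qprs): latest date per pupil on or before the cutoff.
# Step 2 (b): highest status_id per pupil within that date.
# Step 3: fetch the full row for each of those ids.
QUERY = """
SELECT a.status_id
FROM pupil_registration_statuses a
JOIN (
    SELECT prs.status_pupil_id, MAX(prs.status_id) AS max_status_id
    FROM pupil_registration_statuses prs
    JOIN (
        SELECT status_pupil_id, MAX(status_effectivedate) AS last_date
        FROM pupil_registration_statuses
        WHERE status_effectivedate <= ?
        GROUP BY status_pupil_id
    ) qprs
      ON prs.status_pupil_id = qprs.status_pupil_id
     AND prs.status_effectivedate = qprs.last_date
    GROUP BY prs.status_pupil_id
) b ON b.max_status_id = a.status_id
ORDER BY a.status_id
"""


def make_db():
    """Build an in-memory copy of the question's table (SQLite, not MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pupil_registration_statuses ("
        " status_id INTEGER PRIMARY KEY,"
        " status_pupil_id INTEGER NOT NULL,"
        " status_status_id INTEGER NOT NULL,"
        " status_effectivedate TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO pupil_registration_statuses VALUES (?, ?, ?, ?)", ROWS
    )
    return conn


def status_ids_on(conn, cutoff):
    """Return the status_id in force for each pupil on the cutoff date."""
    return [row[0] for row in conn.execute(QUERY, (cutoff,))]


if __name__ == "__main__":
    print(status_ids_on(make_db(), "2013-05-06"))
```

For the cutoff '2013-05-06' this yields status_ids 4 and 6, the rows the question asks for; pupil 345 has no status on or before that date and is absent.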

If I understood right, you want to...
1) Get 1 row per person.
2) Get the status changes from the specific day you manually input.
3) Get the last status changes from within the specific day.
If that's right, you need the query you already have, ordered by date and then by id, but with a DISTINCT ON per pupil instead of the filter on a single pupil. Note that DISTINCT ON is PostgreSQL syntax; MySQL does not support it.
SELECT DISTINCT ON (status_pupil_id) *
FROM pupil_registration_statuses
WHERE status_effectivedate <= '2013-05-06'
ORDER BY status_pupil_id, status_effectivedate DESC, status_id DESC;

I have changed the WHERE clause; please try it.
SELECT *
FROM pupil_registration_statuses prs
INNER JOIN (SELECT status_pupil_id, MAX(status_effectivedate) last_date
FROM pupil_registration_statuses
WHERE Datediff(status_effectivedate, '2013-05-06') <= 0
GROUP BY status_pupil_id) qprs ON prs.status_pupil_id = qprs.status_pupil_id AND prs.status_effectivedate = qprs.last_date;
EDIT
Try this
SELECT *
FROM
(
select status_pupil_id, max(status_id) as status_id from pupil_registration_statuses innr
-- where Datediff(status_effectivedate, '2013-05-06') <= 0
group by status_pupil_id
) as ca
inner join pupil_registration_statuses prs on prs.status_id = ca.status_id
where Datediff(prs.status_effectivedate, '2013-05-06') <= 0

Related

Get Data According to Group by date field

Here is my table.
It has a field, type, where 1 means income and 2 means expense.
For example, two transactions were made on 2-10-2018, so I want the data as follows:
Expected Output
id created_date total_amount
1 1-10-18 10
2 2-10-18 20 (the total of all income transactions made on the 2nd)
3 3-10-18 10
and so on...
It should return a new field that contains only the income transactions made on a particular day.
What I have tried is:
SELECT * FROM `transaction`WHERE type = 1 ORDER BY created_date ASC
UNION
SELECT()
But it won't work.
SELECT created_date,amount,status FROM
(
SELECT COUNT(amount) AS totalTrans FROM transaction WHERE created_date = created_date
) x
transaction
You can also see the schema here: http://sqlfiddle.com/#!9/6983b9
You can COUNT() the expense transactions for each created_date by using the conditional function IF() inside the aggregate.
Similarly, you can SUM() the expense amounts for each created_date using IF().
Try the following:
SELECT
`created_date`,
SUM(IF (`type` = 2, `amount`, 0)) AS total_expense_amount,
COUNT(IF (`type` = 2, `id`, NULL)) AS expense_count
FROM
`transaction`
GROUP BY `created_date`
ORDER BY `created_date` ASC
Do you just want a WHERE clause?
SELECT t.created_date, SUM(amount) as total_amount
FROM transaction t
WHERE t.type = 1
GROUP BY t.created_date
ORDER BY created_date ASC ;
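To make the conditional-aggregation idea concrete, here is a small sketch using Python's built-in sqlite3 module in place of MySQL. The schema is not shown in the question, so the columns (id, type, amount, created_date) and the sample rows are assumptions chosen to match the expected output; CASE is used instead of IF() because it works in both SQLite and MySQL.

```python
import sqlite3

# Hypothetical rows: (id, type, amount, created_date); type 1 = income, 2 = expense.
ROWS = [
    (1, 1, 10, "2018-10-01"),
    (2, 1, 10, "2018-10-02"),
    (3, 2, 5, "2018-10-02"),
    (4, 1, 10, "2018-10-02"),
    (5, 1, 10, "2018-10-03"),
]

# One output row per day; each SUM only adds the amounts of one type.
QUERY = """
SELECT created_date,
       SUM(CASE WHEN type = 1 THEN amount ELSE 0 END) AS total_income,
       SUM(CASE WHEN type = 2 THEN amount ELSE 0 END) AS total_expense
FROM `transaction`
GROUP BY created_date
ORDER BY created_date
"""


def make_db():
    """Create the assumed transaction table in memory and load the rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE `transaction` ("
        " id INTEGER PRIMARY KEY, type INTEGER NOT NULL,"
        " amount INTEGER NOT NULL, created_date TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO `transaction` VALUES (?, ?, ?, ?)", ROWS)
    return conn


def daily_totals(conn):
    """Return (created_date, total_income, total_expense) for each day."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in daily_totals(make_db()):
        print(row)
```

With a plain WHERE type = 1 filter, days that have only expenses would disappear from the result; the CASE form keeps them with a total of 0, which may or may not be what you want.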

mysql: running an insert based two other row's value

I have a table with columns: debit, credit, debit_balance, credit_balance, and amount. Debit and credit each pertain to specific accounts.
Each time I add a new row, I want the debit_balance and credit_balance to be assigned based on the account's previous balance.
INSERT INTO `ledger` (`debit`, `credit`, `debit_balance`, `credit_balance`, `amount`)
VALUES ('1', '3',
(SELECT debit_balance FROM `ledger` WHERE `debit` = '1' ORDER BY `id` DESC LIMIT 0,1) + 5,
(SELECT credit_balance FROM `ledger` WHERE `credit` = '3' ORDER BY `id` DESC LIMIT 0,1) + 5,
'5')
Here the debit account is 1, the credit account is 3, and the amount I want to apply is 5.
When I run the query, MySQL gives me an "Every derived table must have its own alias" error.
You can use a single SELECT query to provide the values to be inserted.
INSERT INTO ledger (debit, credit, debit_balance, credit_balance, amount)
SELECT 1, 3, l1.debit_balance + 5, l2.credit_balance + 5, 5
FROM (SELECT MAX(id) AS debit_id FROM ledger WHERE debit = 1) AS maxd
JOIN ledger AS l1 ON l1.id = maxd.debit_id
CROSS JOIN (SELECT MAX(id) AS credit_id FROM ledger WHERE credit = 3) AS maxc
JOIN ledger AS l2 ON l2.id = maxc.credit_id
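The INSERT ... SELECT pattern can be tried out with a short script. This is a sketch under assumptions: it runs on Python's built-in sqlite3 module rather than MySQL, the ledger table is assumed to have an auto-increment id column, and the seed rows and the helper names make_ledger, post_entry and last_row are invented for the example.

```python
import sqlite3

# Same shape as the answer's statement, with named parameters.
INSERT_SQL = """
INSERT INTO ledger (debit, credit, debit_balance, credit_balance, amount)
SELECT :debit, :credit,
       l1.debit_balance + :amount,
       l2.credit_balance + :amount,
       :amount
FROM (SELECT MAX(id) AS debit_id FROM ledger WHERE debit = :debit) AS maxd
JOIN ledger AS l1 ON l1.id = maxd.debit_id
CROSS JOIN (SELECT MAX(id) AS credit_id FROM ledger WHERE credit = :credit) AS maxc
JOIN ledger AS l2 ON l2.id = maxc.credit_id
"""


def make_ledger():
    """Create an in-memory ledger with two hypothetical opening rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ledger ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " debit INTEGER, credit INTEGER,"
        " debit_balance INTEGER, credit_balance INTEGER, amount INTEGER)"
    )
    conn.executemany(
        "INSERT INTO ledger (debit, credit, debit_balance, credit_balance, amount)"
        " VALUES (?, ?, ?, ?, ?)",
        [(1, 2, 10, 20, 10), (4, 3, 7, 30, 7)],
    )
    return conn


def post_entry(conn, debit, credit, amount):
    """Insert a row whose balances build on each account's latest row."""
    conn.execute(INSERT_SQL, {"debit": debit, "credit": credit, "amount": amount})


def last_row(conn):
    """Return the most recently inserted ledger row, without its id."""
    return conn.execute(
        "SELECT debit, credit, debit_balance, credit_balance, amount"
        " FROM ledger ORDER BY id DESC LIMIT 1"
    ).fetchone()


if __name__ == "__main__":
    conn = make_ledger()
    post_entry(conn, 1, 3, 5)
    print(last_row(conn))
```

One consequence of the join-based form is worth knowing: if either account has no earlier row, the SELECT produces no rows and nothing is inserted, so opening balances need to be seeded separately.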

MySQL COUNT(*) not counting result rows

Simplified schema of m:n relation implementing a subscription model:
CREATE TABLE c (
id INT(11) PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(32)
) ENGINE=MyISAM CHARACTER SET=UTF8;
CREATE TABLE t (
id INT(11) PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(32)
) ENGINE=MyISAM CHARACTER SET=UTF8;
CREATE TABLE c2t (
id INT(11) PRIMARY KEY AUTO_INCREMENT,
cid INT(11) NOT NULL,
tid INT(11) NOT NULL,
dateStart DATE NULL,
dateEnd DATE NULL
) ENGINE=MyISAM CHARACTER SET=UTF8;
INSERT INTO c (name) VALUES ('mike'),('carl'),('suzy');
INSERT INTO t (name) VALUES ('plan1'),('plan2'),('plan3'),('plan4');
INSERT INTO c2t (cid, tid, dateStart, dateEnd) VALUES
(1, 1, '2014-01-01', '2014-07-31'),
(1, 2, '2014-08-01', '2015-07-31'),
(1, 1, '2015-08-01', null),
(1, 3, '2015-09-01', null),
(2, 1, '2014-01-01', '2015-07-31'),
(2, 2, '2015-08-01', '2015-09-30'),
(2, 3, '2015-09-30', null),
(3, 1, '2014-01-01', '2014-12-31'),
(3, 2, '2014-01-01', '2014-12-31'),
(3, 3, '2015-01-01', '2015-10-31'),
(3, 4, '2015-01-01', '2015-10-31');
I've developed a query to find the c's who have active subscriptions of t's:
SELECT c.*
FROM c
LEFT JOIN c2t ON c.id = c2t.cid
AND NOW() BETWEEN COALESCE(dateStart, '0000-00-00')
AND COALESCE(dateEnd, DATE_ADD(NOW(), INTERVAL 1 DAY))
GROUP BY c2t.cid
HAVING COUNT(c2t.id) > 0;
Result as expected:
id name
1 mike
2 carl
The problem arises when I try to count the result rows. The query is almost identical, I've just dropped in a COUNT(*):
SELECT COUNT(*)
FROM c
LEFT JOIN c2t ON c.id = c2t.cid
AND NOW() BETWEEN COALESCE(dateStart, '0000-00-00')
AND COALESCE(dateEnd, DATE_ADD(NOW(), INTERVAL 1 DAY))
GROUP BY c2t.cid
HAVING COUNT(c2t.id) > 0;
Result:
`COUNT(*)`
2
1
The expected result would be a single row containing the number of rows found (2). I can only assume that the GROUP BY is interfering, but I have no idea how to work around it. Explanations are most welcome.
Wrap everything in a subquery and use COUNT in the outer query:
SELECT COUNT(*)
FROM (
SELECT c.*
FROM c
LEFT JOIN c2t ON c.id = c2t.cid
AND NOW() BETWEEN COALESCE(dateStart, '0000-00-00')
AND COALESCE(dateEnd, DATE_ADD(NOW(), INTERVAL 1 DAY))
GROUP BY c2t.cid
HAVING COUNT(c2t.id) > 0
) AS sub
If the only thing you want returned is the number of c's who have active subscriptions, then you can simplify your query like this:
SELECT COUNT(DISTINCT c.id) AS cnt
FROM c
INNER JOIN c2t ON c.id = c2t.cid
AND NOW() BETWEEN COALESCE(dateStart, '0000-00-00')
AND COALESCE(dateEnd, DATE_ADD(NOW(), INTERVAL 1 DAY))
So, INNER JOIN is used in place of LEFT JOIN: there is no need to return c's with no matches in c2t, since these are not going to have any active subscriptions.
Also, there is no need to GROUP BY: the query returns just one row with the number of c's.
Finally, DISTINCT must be used in COUNT so as to avoid counting duplicate c.id values more than once.
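The difference between the three forms can be seen side by side in a small script. It is a sketch with some assumptions: it runs on Python's built-in sqlite3 module instead of MySQL, a fixed reference date is passed in place of NOW() so the result is repeatable, open-ended dates are replaced by far-past and far-future sentinel strings, and table t is left out because none of the queries touch it.

```python
import sqlite3

# A subscription is active if the reference date falls inside its date range.
ACTIVE = (
    ":today BETWEEN COALESCE(dateStart, '0000-01-01') "
    "AND COALESCE(dateEnd, '9999-12-31')"
)

# COUNT(*) is evaluated once per group, so this returns one row per customer.
PER_GROUP = f"""
SELECT COUNT(*) FROM c
LEFT JOIN c2t ON c.id = c2t.cid AND {ACTIVE}
GROUP BY c2t.cid HAVING COUNT(c2t.id) > 0
ORDER BY c2t.cid
"""

# Wrapping the grouped query counts its result rows instead.
WRAPPED = f"""
SELECT COUNT(*) FROM (
    SELECT c.id FROM c
    LEFT JOIN c2t ON c.id = c2t.cid AND {ACTIVE}
    GROUP BY c2t.cid HAVING COUNT(c2t.id) > 0
) AS sub
"""

# No grouping at all: count each customer with an active row once.
DISTINCT = f"""
SELECT COUNT(DISTINCT c.id) FROM c
INNER JOIN c2t ON c.id = c2t.cid AND {ACTIVE}
"""


def make_db():
    """Load the question's c and c2t sample data into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE c (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE c2t (id INTEGER PRIMARY KEY, cid INTEGER NOT NULL,
                          tid INTEGER NOT NULL, dateStart TEXT, dateEnd TEXT);
        INSERT INTO c (name) VALUES ('mike'), ('carl'), ('suzy');
        INSERT INTO c2t (cid, tid, dateStart, dateEnd) VALUES
          (1, 1, '2014-01-01', '2014-07-31'), (1, 2, '2014-08-01', '2015-07-31'),
          (1, 1, '2015-08-01', NULL),         (1, 3, '2015-09-01', NULL),
          (2, 1, '2014-01-01', '2015-07-31'), (2, 2, '2015-08-01', '2015-09-30'),
          (2, 3, '2015-09-30', NULL),
          (3, 1, '2014-01-01', '2014-12-31'), (3, 2, '2014-01-01', '2014-12-31'),
          (3, 3, '2015-01-01', '2015-10-31'), (3, 4, '2015-01-01', '2015-10-31');
    """)
    return conn


def counts(conn, today):
    """Run all three queries for the given reference date."""
    params = {"today": today}
    return {
        "per_group": [r[0] for r in conn.execute(PER_GROUP, params)],
        "wrapped": conn.execute(WRAPPED, params).fetchone()[0],
        "distinct": conn.execute(DISTINCT, params).fetchone()[0],
    }


if __name__ == "__main__":
    print(counts(make_db(), "2015-11-15"))
```

The grouped query returns 2 and 1 because those are the numbers of active subscriptions for mike and carl respectively; the other two forms both return the single value 2.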

How to get users that purchased items ONLY in a specific time period (MySQL Database)

I have a table that contains all purchased items.
I need to check which users purchased items in a specific period of time (say between 2013-03-21 and 2013-04-21) and never purchased anything after that.
I can select users that purchased items in that period of time, but I don't know how to filter out the users who purchased anything after that...
SELECT `userId`, `email` FROM my_table
WHERE `date` BETWEEN '2013-03-21' AND '2013-04-21' GROUP BY `userId`
Give this a try
SELECT
user_id
FROM
user_purchases
WHERE
purchase_date >= '2012-05-01' -- your_start_date
GROUP BY
user_id
HAVING
MAX(purchase_date) <= '2012-06-01'; -- your_end_date
It works by taking all the records on or after the start date, grouping the result set by user_id, and then finding the max purchase date for every user, which must be on or before the end date. Since this query does not use a join or an inner query, it could be faster.
Test data
CREATE table user_purchases(user_id int, purchase_date date);
insert into user_purchases values (1, '2012-05-01');
insert into user_purchases values (2, '2012-05-06');
insert into user_purchases values (3, '2012-05-20');
insert into user_purchases values (4, '2012-06-01');
insert into user_purchases values (4, '2012-09-06');
insert into user_purchases values (1, '2012-09-06');
Output
| USER_ID |
-----------
| 2 |
| 3 |
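The same test data can be used to compare the HAVING MAX(...) form with a NOT EXISTS form. The script below is a sketch that uses Python's built-in sqlite3 module rather than MySQL, with dates stored as ISO text; the function names are made up for the example.

```python
import sqlite3

# Test data copied from the answer above.
ROWS = [
    (1, "2012-05-01"),
    (2, "2012-05-06"),
    (3, "2012-05-20"),
    (4, "2012-06-01"),
    (4, "2012-09-06"),
    (1, "2012-09-06"),
]

# Keep users whose latest purchase on or after the start is not past the end.
HAVING_SQL = """
SELECT user_id FROM user_purchases
WHERE purchase_date >= :start
GROUP BY user_id
HAVING MAX(purchase_date) <= :end
ORDER BY user_id
"""

# Keep users with a purchase in the period and no purchase after it.
NOT_EXISTS_SQL = """
SELECT DISTINCT p.user_id FROM user_purchases p
WHERE p.purchase_date BETWEEN :start AND :end
  AND NOT EXISTS (
      SELECT 1 FROM user_purchases later
      WHERE later.user_id = p.user_id AND later.purchase_date > :end
  )
ORDER BY p.user_id
"""


def make_db():
    """Create the user_purchases table in memory and load the test data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE user_purchases (user_id INTEGER, purchase_date TEXT)")
    conn.executemany("INSERT INTO user_purchases VALUES (?, ?)", ROWS)
    return conn


def buyers_only_in_period(conn, sql, start, end):
    """Run one of the two queries and return the matching user ids."""
    return [r[0] for r in conn.execute(sql, {"start": start, "end": end})]


if __name__ == "__main__":
    conn = make_db()
    print(buyers_only_in_period(conn, HAVING_SQL, "2012-05-01", "2012-06-01"))
    print(buyers_only_in_period(conn, NOT_EXISTS_SQL, "2012-05-01", "2012-06-01"))
```

Both queries return users 2 and 3 here. Neither excludes users who also purchased before the start date; if that matters, an extra condition is needed.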
This is probably a standard way to accomplish that:
SELECT `userId`, `email` FROM my_table mt
WHERE `date` BETWEEN '2013-03-21' AND '2013-04-21'
AND NOT EXISTS (
SELECT * FROM my_table mt2 WHERE
mt2.`userId` = mt.`userId`
and mt2.`date` > '2013-04-21'
)
GROUP BY `userId`
SELECT `userId`, `email` FROM my_table WHERE (`date` BETWEEN '2013-03-21' AND '2013-04-21') and `date` >= '2013-04-21' GROUP BY `userId`
Because both conditions apply to the same row, this selects only rows dated exactly 2013-04-21, not users who purchased during that timeframe and also after it.
Hope this helps.
Try the following
SELECT `userId`, `email`
FROM my_table WHERE `date` BETWEEN '2013-03-21' AND '2013-04-21'
and `userId` not in
(select `userId` from my_table
where `date` < '2013-03-21' or `date` > '2013-04-21' )
GROUP BY `userId`
You'll have to do it in two stages: one subquery to get the list of users who did buy within the time period, then a join that takes that list of users and checks whether they bought anything afterwards, e.g.
SELECT during.userId, during.email, COUNT(later.userId) AS purchases
FROM (
SELECT DISTINCT userId, email
FROM my_table
WHERE `date` BETWEEN '2013-03-21' AND '2013-04-21'
) AS during
LEFT JOIN my_table AS later
ON later.userId = during.userId
AND later.`date` > '2013-04-21'
GROUP BY during.userId, during.email
HAVING purchases = 0;
The inner query gets the list of userIds who purchased at least one thing during that period. That list is then left-joined back against the same table, restricted to purchases AFTER the period; the query counts how many such purchases each user made and keeps only those users with 0 "after" purchases.
SELECT
a.userId,
a.email
FROM
my_table AS a
WHERE a.date BETWEEN '2013-03-21'
AND '2013-04-21'
AND a.userId NOT IN
(SELECT
b.userId
FROM
my_table AS b
WHERE b.date BETWEEN '2013-04-22'
AND CURDATE()
GROUP BY b.userId)
GROUP BY a.userId
This filters out anyone who has purchased anything between the end date and the present.

MySQL query, MAX() + GROUP BY

Daft SQL question. I have a table like so ('pid' is auto-increment primary col)
CREATE TABLE theTable (
`pid` INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
`timestamp` TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
`cost` INT UNSIGNED NOT NULL,
`rid` INT NOT NULL
) Engine=InnoDB;
Actual table data:
INSERT INTO theTable (`pid`, `timestamp`, `cost`, `rid`)
VALUES
(1, '2011-04-14 01:05:07', 1122, 1),
(2, '2011-04-14 00:05:07', 2233, 1),
(3, '2011-04-14 01:05:41', 4455, 2),
(4, '2011-04-14 01:01:11', 5566, 2),
(5, '2011-04-14 01:06:06', 345, 1),
(6, '2011-04-13 22:06:06', 543, 2),
(7, '2011-04-14 01:14:14', 5435, 3),
(8, '2011-04-14 01:10:13', 6767, 3)
;
I want to get the PID of the latest row for each rid (1 result per unique RID). For the sample data, I'd like:
pid | MAX(timestamp) | rid
-----------------------------------
5 | 2011-04-14 01:06:06 | 1
3 | 2011-04-14 01:05:41 | 2
7 | 2011-04-14 01:14:14 | 3
I've tried running the following query:
SELECT MAX(timestamp),rid,pid FROM theTable GROUP BY rid
and I get:
max(timestamp) ; rid; pid
----------------------------
2011-04-14 01:06:06; 1 ; 1
2011-04-14 01:05:41; 2 ; 3
2011-04-14 01:14:14; 3 ; 7
The PID returned is always the first occurrence of PID for an RID (row / pid 1 is the first time rid 1 is used, row / pid 3 the first time RID 2 is used, row / pid 7 the first time rid 3 is used). Although the query returns the max timestamp for each rid, the pids are not the pids that belong to those timestamps in the original table. What query would give me the results I'm looking for?
(Tested in PostgreSQL 9.something)
Identify the rid and timestamp.
select rid, max(timestamp) as ts
from test
group by rid;
1 2011-04-14 18:46:00
2 2011-04-14 14:59:00
Join to it.
select test.pid, test.cost, test.timestamp, test.rid
from test
inner join
(select rid, max(timestamp) as ts
from test
group by rid) maxt
on (test.rid = maxt.rid and test.timestamp = maxt.ts)
select *
from (
select `pid`, `timestamp`, `cost`, `rid`
from theTable
order by `timestamp` desc
) as mynewtable
group by mynewtable.`rid`
order by mynewtable.`timestamp`
Hope I helped !
SELECT t.pid, t.cost, t.timestamp, t.rid
FROM test AS t
JOIN (
SELECT rid, MAX(timestamp) AS maxtimestamp
FROM test GROUP BY rid
) AS tmax
ON t.rid = tmax.rid AND t.timestamp = tmax.maxtimestamp
I created an index on rid and timestamp.
SELECT test.pid, test.cost, test.timestamp, test.rid
FROM theTable AS test
LEFT JOIN theTable maxt
ON maxt.rid = test.rid
AND maxt.timestamp > test.timestamp
WHERE maxt.rid IS NULL
Showing rows 0 - 2 (3 total, Query took 0.0104 sec)
This method selects the desired rows from theTable (aliased test) by left joining the table to itself (maxt) on rows that have the same rid and a higher timestamp. When a row in test already has the highest timestamp for its rid, nothing in maxt matches, so the maxt columns are NULL, and that is exactly the row we are looking for. The WHERE clause maxt.rid IS NULL (or the same test on any other maxt column) keeps only those rows.
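The self-join can be checked against the sample data from the question with a short script. It is a sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL, with timestamps stored as ISO text so that string comparison orders them correctly; the helper names are invented for the example.

```python
import sqlite3

# Sample rows copied from the question: (pid, timestamp, cost, rid).
ROWS = [
    (1, "2011-04-14 01:05:07", 1122, 1),
    (2, "2011-04-14 00:05:07", 2233, 1),
    (3, "2011-04-14 01:05:41", 4455, 2),
    (4, "2011-04-14 01:01:11", 5566, 2),
    (5, "2011-04-14 01:06:06", 345, 1),
    (6, "2011-04-13 22:06:06", 543, 2),
    (7, "2011-04-14 01:14:14", 5435, 3),
    (8, "2011-04-14 01:10:13", 6767, 3),
]

# Anti-join: keep a row only if no row with the same rid is newer.
ANTI_JOIN = """
SELECT t.pid, t.timestamp, t.rid
FROM theTable AS t
LEFT JOIN theTable AS newer
       ON newer.rid = t.rid AND newer.timestamp > t.timestamp
WHERE newer.rid IS NULL
ORDER BY t.rid
"""


def make_db():
    """Create theTable in memory and load the question's rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE theTable ("
        " pid INTEGER PRIMARY KEY, timestamp TEXT NOT NULL,"
        " cost INTEGER NOT NULL, rid INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO theTable VALUES (?, ?, ?, ?)", ROWS)
    return conn


def latest_per_rid(conn):
    """Return (pid, timestamp, rid) of the newest row for each rid."""
    return conn.execute(ANTI_JOIN).fetchall()


if __name__ == "__main__":
    for row in latest_per_rid(make_db()):
        print(row)
```

This returns pids 5, 3 and 7 for rids 1, 2 and 3, matching the output the question asks for. If two rows share the same rid and the same maximum timestamp, both are returned.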
You could also have subqueries like that:
SELECT ( SELECT MIN(t2.pid)
FROM test t2
WHERE t2.rid = t.rid
AND t2.timestamp = maxtimestamp
) AS pid
, MAX(t.timestamp) AS maxtimestamp
, t.rid
FROM test t
GROUP BY t.rid
But this way, you'll need one more subquery for each additional column, such as cost, that you want in the output.
So the GROUP BY and JOIN approach is the better solution.
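If you want to avoid a JOIN, and pid always increases with timestamp (which is not the case in the sample data, where pid 6 is older than pid 3), you can use: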
SELECT pid, rid FROM theTable t1 WHERE t1.pid IN ( SELECT MAX(t2.pid) FROM theTable t2 GROUP BY t2.rid);
Try:
select pid,cost, timestamp, rid from theTable order by timestamp DESC limit 2;