How to find median given frequency of numbers? - mysql

The Numbers table keeps the value of number and its frequency.
+----------+-------------+
| Number | Frequency |
+----------+-------------|
| 0 | 7 |
| 1 | 1 |
| 2 | 3 |
| 3 | 1 |
+----------+-------------+
In this table, the numbers are 0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 2, 3, so the median is (0 + 0) / 2 = 0. How to find median (output shown) given frequency of numbers?
+--------+
| median |
+--------|
| 0.0000 |
+--------+
I found the following solution here. However, I am unable to understand it. Can someone please explain the solution and/or post a different solution with explanation?
SELECT AVG(n.Number) AS median
FROM Numbers n LEFT JOIN
(
SELECT Number, @prev := @count AS prevNumber, (@count := @count + Frequency) AS countNumber
FROM Numbers,
(SELECT @count := 0, @prev := 0, @total := (SELECT SUM(Frequency) FROM Numbers)) temp ORDER BY Number
) n2
ON n.Number = n2.Number
WHERE
(prevNumber < floor((@total+1)/2) AND countNumber >= floor((@total+1)/2))
OR
(prevNumber < floor((@total+2)/2) AND countNumber >= floor((@total+2)/2))
Here's the SQL script for reproducibility:
CREATE TABLE `Numbers` (
`Number` INT NULL,
`Frequency` INT NULL);
INSERT INTO `Numbers` (`Number`, `Frequency`) VALUES ('0', '7');
INSERT INTO `Numbers` (`Number`, `Frequency`) VALUES ('1', '1');
INSERT INTO `Numbers` (`Number`, `Frequency`) VALUES ('2', '3');
INSERT INTO `Numbers` (`Number`, `Frequency`) VALUES ('3', '1');
Thanks!

You can use a cumulative sum and then take the midway point. I think the logic looks like this:
select avg(number)
from (select t.*, (@rf := @rf + frequency) as running_frequency
from (select n.* from Numbers n order by number) t cross join
(select @rf := 0) params
) t
where (running_frequency - frequency < floor((@rf + 1) / 2) and running_frequency >= floor((@rf + 1) / 2))
or (running_frequency - frequency < floor((@rf + 2) / 2) and running_frequency >= floor((@rf + 2) / 2));
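To see why the cumulative-sum idea works, it may help to step through it outside SQL. Below is a small Python sketch (the function name `median_from_frequencies` is made up for illustration, it is not part of either query). With N values in total, the median sits at the 1-based positions floor((N+1)/2) and floor((N+2)/2), which are the same position when N is odd. A row takes part in the average when one of those positions falls inside the range of positions it covers, which is what the prevNumber/countNumber comparison in the question's query checks.

```python
def median_from_frequencies(rows):
    """rows: iterable of (number, frequency) pairs; returns the median as a float."""
    rows = sorted(rows)                       # same role as ORDER BY Number
    total = sum(freq for _, freq in rows)     # same role as SUM(Frequency)
    if total == 0:
        raise ValueError("no values")
    lo, hi = (total + 1) // 2, (total + 2) // 2   # 1-based median positions
    picked = []
    running = 0
    for number, freq in rows:
        prev, running = running, running + freq   # prevNumber / countNumber
        # This row covers positions prev+1 .. running.
        if prev < lo <= running or prev < hi <= running:
            picked.append(number)
    return sum(picked) / len(picked)              # AVG over the one or two rows


print(median_from_frequencies([(0, 7), (1, 1), (2, 3), (3, 1)]))  # 0.0
```

For the sample table N is 12, so the positions are 6 and 7, and both fall inside the first row (positions 1 to 7), hence the median 0.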

Related

Need some help to clean duplicates in MySQL table which didn't have constraint

I've inherited a MySQL table that was designed without the correct constraint, so it has filled up with duplicate rows that I need to remove. The problem is that the data across duplicate rows usually isn't consistent; see the example below:
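id | request_id | guid_id | details | flag
:- | :--------- | :------ | :------------------ | :---
1 | 10 | fh82EN | help me | 1
2 | 11 | fh82EN | |
3 | 12 | fh82EN | assistance required | 1
4 | 12 | fh82EN | assistance required | 1
5 | 13 | fh82EN | |
6 | 13 | fh82EN | assist me. | 1
7 | 13 | fh82EN | |
8 | 14 | fh82EN | |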
Records with id 1, 2 and 8 are perfectly fine. For the duplicate records with id 3 and 4, I designed the query below, which works fine and removes all such duplicates without an issue:
DELETE IR.*
FROM platform.temp IR
WHERE id IN (
SELECT maxId AS id FROM (
SELECT MAX(id) as maxId, request_id, guid_id
FROM platform.temp
GROUP BY request_id, guid_id
HAVING COUNT(*) > 1
) AS T
);
The problem is the records with id 5, 6 and 7: rows that share the same (guid_id, request_id) are not consistent with each other. Because of MAX(id), my previous query would therefore also delete records that have content. I have designed a query that fixes these records, but we are talking about a huge database and this query is painfully slow:
UPDATE platform.temp AS DEST_T
INNER JOIN (
SELECT request_id, guid_id, details, flag FROM platform.temp WHERE details IS NOT NULL AND details != ''
) AS SOURCE_T
SET DEST_T.details = SOURCE_T.details, DEST_T.flag = SOURCE_T.flag
WHERE DEST_T.guid_id = SOURCE_T.guid_id AND DEST_T.request_id = SOURCE_T.request_id;
How can I change my delete query so that it orders each subgroup by the details field and selects the first id instead of MAX(id), so that I can be sure the one row left in each subgroup is always the populated one?
MySQL version: 5.6.40-log
UPDATE1:
The desired outcome after cleaning the table should be as follows:
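id | request_id | guid_id | details | flag
:- | :--------- | :------ | :------------------ | :---
1 | 10 | fh82EN | help me | 1
2 | 11 | fh82EN | |
3 | 12 | fh82EN | assistance required | 1
6 | 13 | fh82EN | assist me. | 1
8 | 14 | fh82EN | |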
Use a self join of the table:
DELETE t1
FROM tablename t1 INNER JOIN tablename t2
ON t2.request_id = t1.request_id AND t2.guid_id = t1.guid_id
WHERE (t2.id < t1.id AND COALESCE(t1.details, '') = '')
OR
(t2.id > t1.id AND COALESCE(t2.details, '') <> '');
This will keep 1 row for each request_id and guid_id combination, not necessarily the one with the min id.
See the demo.
Another way to do it, with conditional aggregation:
DELETE t1
FROM tablename t1 INNER JOIN (
SELECT request_id, guid_id,
MIN(id) min_id,
MIN(CASE WHEN COALESCE(details, '') <> '' THEN id END) min_id_not_null
FROM tablename
GROUP BY request_id, guid_id
) t2 ON t2.request_id = t1.request_id AND t2.guid_id = t1.guid_id
WHERE t1.id <> COALESCE(t2.min_id_not_null, t2.min_id);
This will keep the row with the min id under your conditions, but maybe its performance would not be that good compared to the 1st query.
See the demo.
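If the keep rule in the conditional-aggregation query is hard to read, here is the same decision written as a small Python sketch. It is only an illustration of the logic, not a replacement for the DELETE, and `ids_to_delete` is a hypothetical helper name. For each (request_id, guid_id) group it keeps the smallest id that has non-empty details, falls back to the smallest id overall, and reports everything else as a row to delete.

```python
from collections import defaultdict


def ids_to_delete(rows):
    """rows: list of (id, request_id, guid_id, details) tuples."""
    groups = defaultdict(list)
    for row_id, request_id, guid_id, details in rows:
        groups[(request_id, guid_id)].append((row_id, details))
    doomed = []
    for members in groups.values():
        ids = [row_id for row_id, _ in members]
        # "if details" is false for both None and '' (like COALESCE(details, '') <> '')
        filled = [row_id for row_id, details in members if details]
        keep = min(filled) if filled else min(ids)  # COALESCE(min_id_not_null, min_id)
        doomed.extend(row_id for row_id in ids if row_id != keep)
    return sorted(doomed)


sample = [
    (1, 10, "fh82EN", "help me"),
    (2, 11, "fh82EN", None),
    (3, 12, "fh82EN", "assistance required"),
    (4, 12, "fh82EN", "assistance required"),
    (5, 13, "fh82EN", None),
    (6, 13, "fh82EN", "assist me."),
    (7, 13, "fh82EN", None),
    (8, 14, "fh82EN", None),
]
print(ids_to_delete(sample))  # [4, 5, 7]
```

On the question's sample this leaves ids 1, 2, 3, 6 and 8, which is the desired outcome listed in the question.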
Another way is to emulate ROW_NUMBER and then perform the delete operation.
DELETE FROM test
WHERE id NOT IN (select id
from (SELECT id,
@row_number := CASE WHEN @last_request_id <> CONCAT_WS('|', x.request_id, x.guid_id)
THEN 1 ELSE @row_number + 1 END AS row_num,
@last_request_id := CONCAT_WS('|', x.request_id, x.guid_id)
FROM test x
CROSS JOIN (SELECT @row_number := 0, @last_request_id := null, @last_guid_id := null) y
ORDER BY request_id, guid_id, details DESC) temp
where row_num = 1);
Demo.
As I said in the comments, I would do this with row numbers, which would look much nicer in MySQL 8.
CREATE TABLE temp
(`id` varchar(4), `request_id` varchar(12), `guid_id` varchar(9), `details` varchar(21), `flag` varchar(6))
;
INSERT INTO temp
(`id`, `request_id`, `guid_id`, `details`, `flag`)
VALUES
('1', '10', 'fh82EN', 'help me', '1'),
('2', '11', 'fh82EN', NULL, NULL),
('3', '12', 'fh82EN', 'assistance required', '1'),
('4', '12', 'fh82EN', 'assistance required', '1'),
('5', '13', 'fh82EN', NULL, NULL),
('6', '13', 'fh82EN', 'assistance required', '1'),
('7', '13', 'fh82EN', NULL, NULL),
('8', '14', 'fh82EN', NULL, NULL)
;
DELETE t1
FROM temp t1 INNER JOIN
(SELECT `id`
, IF(@request = `request_id` AND @guid = guid_id, @rn:= @rn+1,@rn := 1) rn
,@request := `request_id` as request_id
,@guid := guid_id as guid_id
FROM temp,(SELECT @request := 0, @guid := '',@rn := 0) t1
ORDER BY `guid_id`,`request_id`,`details` DESC, id) t2 ON
t1.`id` = t2.`id` AND rn > 1;
SELECT * FROM temp;
id | request_id | guid_id | details | flag
:- | :--------- | :------ | :------------------ | :---
1 | 10 | fh82EN | help me | 1
2 | 11 | fh82EN | null | null
3 | 12 | fh82EN | assistance required | 1
6 | 13 | fh82EN | assistance required | 1
8 | 14 | fh82EN | null | null
db<>fiddle here

Getting highest calculated score of GROUP BY in mysql

I'm trying to retrieve the best suited price for a product in each quantity depending on the customer and/or his customer group. To do so, I use a weight based system: the matching customer group is more important than the matching customer, so if two rows collide, we should get the row corresponding to the customer group id.
Here's an example:
Customer n°1 is part of Customer group n°2
Product prices:
A - 90€ for customer n°1 (when buying at least 2 of the same product)
B - 80€ for customer group n°2 (when buying at least 2 of the same product)
So the price shown to the customer n°1 should be 80€
Here's my query:
SELECT
MAX(IF(t.customer_id = 1, 10, 0) + IF(t.customer_group_id = 1, 100, 0)) as score,
t.*
FROM tierprice t
WHERE t.product_variant_id = 110
AND (t.customer_id = 1 OR t.customer_id IS NULL)
AND (t.customer_group_id = 1 OR t.customer_group_id IS NULL)
GROUP BY t.product_variant_id, t.qty
The problem I'm having is that the correct score is shown in the result row (here: 100), but the row for the given score is not correct. I'm guessing it has something to do with the MAX in the SELECT and the GROUP BY, but I don't know how to assign the score to the row, and then take the highest.
Here's a fiddle :
CREATE TABLE `tierprice` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`product_variant_id` int(11) DEFAULT NULL,
`customer_group_id` int(11) DEFAULT NULL,
`price` int(11) NOT NULL,
`qty` int(11) NOT NULL,
`customer_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `no_duplicate_prices` (`qty`,`product_variant_id`,`customer_group_id`),
KEY `IDX_BA5254F8A80EF684` (`product_variant_id`),
KEY `IDX_BA5254F8D2919A68` (`customer_group_id`),
KEY `IDX_BA5254F89395C3F3` (`customer_id`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `tierprice` (`id`, `product_variant_id`, `customer_group_id`, `price`, `qty`, `customer_id`)
VALUES
(1, 110, NULL, 8000, 2, 1),
(2, 110, 1, 7000, 2, NULL),
(3, 110, 1, 6000, 5, NULL),
(4, 110, NULL, 5000, 5, 1),
(5, 111, 1, 8000, 2, NULL),
(6, 111, NULL, 6000, 2, 1),
(7, 111, 1, 7000, 6, NULL),
(8, 111, NULL, 5000, 6, 1);
http://sqlfiddle.com/#!9/7bc0d9/2
The price ids that should come out in the result should be ID 2 & ID 3.
Thank you for your help.
The provided query is not valid from the SQL standard's perspective:
SELECT
MAX(IF(t.customer_id = 1, 10, 0) + IF(t.customer_group_id = 1, 100, 0)) as score,
t.*
FROM tierprice t
WHERE t.product_variant_id = 110
AND (t.customer_id = 1 OR t.customer_id IS NULL)
AND (t.customer_group_id = 1 OR t.customer_group_id IS NULL)
GROUP BY t.product_variant_id, t.qty;
Expression #2 of SELECT list is not in GROUP BY clause and contains nonaggregated column 't.id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
Related: Group by clause in mySQL and postgreSQL, why the error in postgreSQL?
It could be rewritten using window functions (MySQL 8.0 and above):
WITH cte AS (
SELECT t.*, ROW_NUMBER() OVER(PARTITION BY product_variant_id, qty
ORDER BY IF(t.customer_id=1,10,0)+IF(t.customer_group_id=1,100,0) DESC) AS rn
FROM tierprice t
WHERE t.product_variant_id = 110
AND (t.customer_id = 1 OR t.customer_id IS NULL)
AND (t.customer_group_id = 1 OR t.customer_group_id IS NULL)
)
SELECT *
FROM cte
WHERE rn = 1;
db<>fiddle demo
The only columns that your query can validly return are product_variant_id and qty, which you use in the GROUP BY clause, and the aggregated column score.
Because of t.* you get all the columns of the table, but for the other columns the values chosen are nondeterministic, as explained in MySQL Handling of GROUP BY.
What you can do is join your query to the table like this:
SELECT t.*
FROM tierprice t
INNER JOIN (
SELECT product_variant_id, qty,
MAX(IF(customer_id = 1, 10, 0) + IF(customer_group_id = 1, 100, 0)) as score
FROM tierprice
WHERE product_variant_id = 110
AND (customer_id = 1 OR customer_id IS NULL)
AND (customer_group_id = 1 OR customer_group_id IS NULL)
GROUP BY product_variant_id, qty
) g ON g.product_variant_id = t.product_variant_id
AND g.qty = t.qty
AND g.score = IF(t.customer_id = 1, 10, 0) + IF(t.customer_group_id = 1, 100, 0)
WHERE (t.customer_id = 1 OR t.customer_id IS NULL)
AND (t.customer_group_id = 1 OR t.customer_group_id IS NULL)
See the demo.
Results:
> id | product_variant_id | customer_group_id | price | qty | customer_id
> -: | -----------------: | ----------------: | ----: | --: | ----------:
> 2 | 110 | 1 | 7000 | 2 | null
> 3 | 110 | 1 | 6000 | 5 | null
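The idea behind both answers is "score every candidate row, then keep the best-scoring row per (product_variant_id, qty)". As a rough Python sketch of that logic on the question's sample data (the names `best_rows`, `COLUMNS` and `TIERPRICE` are made up here, and ties simply keep the first row seen, which is an assumption the question does not cover):

```python
COLUMNS = ("id", "product_variant_id", "customer_group_id", "price", "qty", "customer_id")
TIERPRICE = [
    (1, 110, None, 8000, 2, 1),
    (2, 110, 1, 7000, 2, None),
    (3, 110, 1, 6000, 5, None),
    (4, 110, None, 5000, 5, 1),
    (5, 111, 1, 8000, 2, None),
    (6, 111, None, 6000, 2, 1),
    (7, 111, 1, 7000, 6, None),
    (8, 111, None, 5000, 6, 1),
]


def best_rows(rows, variant_id, customer_id, group_id):
    """Return the highest-scoring row per qty for one product variant."""
    best = {}  # qty -> (score, row)
    for values in rows:
        row = dict(zip(COLUMNS, values))
        # Same filter as the WHERE clause: matching id or NULL.
        if row["product_variant_id"] != variant_id:
            continue
        if row["customer_id"] not in (customer_id, None):
            continue
        if row["customer_group_id"] not in (group_id, None):
            continue
        # Same weights as the IF(...) + IF(...) expression.
        score = (10 if row["customer_id"] == customer_id else 0) \
            + (100 if row["customer_group_id"] == group_id else 0)
        if row["qty"] not in best or score > best[row["qty"]][0]:
            best[row["qty"]] = (score, row)
    return [best[qty][1] for qty in sorted(best)]


print([row["id"] for row in best_rows(TIERPRICE, 110, 1, 1)])  # [2, 3]
```

The important part is that the score is attached to a whole row before the maximum is taken, which is what the original GROUP BY query did not do.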

How can I calculate portfolio return in MySQL?

I am stuck on a MySQL problem. I am trying to calculate the return series of a portfolio using:
for(i = startdate+1; i <= enddate; i++) {
return[i]=0;
for(n = 0; n < count(instruments); n++) {
return[i] += price[i,n] / price[i-1, n] * weight[n];
}
}
So, the return of portfolio today is calculated as a sum of price_today/price_yesterday*weight over the instruments in the portfolio.
I created a scribble at http://rextester.com/FUC35243.
If it doesn't work, the code is:
DROP TABLE IF EXISTS x_ports;
DROP TABLE IF EXISTS x_weights;
DROP TABLE IF EXISTS x_prices;
CREATE TABLE IF NOT EXISTS x_ports (id INT NOT NULL AUTO_INCREMENT, name VARCHAR(20), PRIMARY KEY (id));
CREATE TABLE IF NOT EXISTS x_weights (id INT NOT NULL AUTO_INCREMENT, port_id INT, inst_id INT, weight DOUBLE, PRIMARY KEY (id));
CREATE TABLE IF NOT EXISTS x_prices (id INT NOT NULL AUTO_INCREMENT, inst_id INT, trade_date DATE, price DOUBLE, PRIMARY KEY (id));
INSERT INTO x_ports (name) VALUES ('PORT A');
INSERT INTO x_ports (name) VALUES ('PORT B');
INSERT INTO x_weights (port_id, inst_id, weight) VALUES (1, 1, 20.0);
INSERT INTO x_weights (port_id, inst_id, weight) VALUES (1, 2, 80.0);
INSERT INTO x_weights (port_id, inst_id, weight) VALUES (2, 1, 100.0);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (1, '2018-01-01', 1.12);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (1, '2018-01-02', 1.13);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (1, '2018-01-03', 1.12);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (1, '2018-01-04', 1.12);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (1, '2018-01-05', 1.13);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (1, '2018-01-06', 1.14);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (2, '2018-01-01', 50.23);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (2, '2018-01-02', 50.45);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (2, '2018-01-03', 50.30);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (2, '2018-01-04', 50.29);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (2, '2018-01-05', 50.40);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (2, '2018-01-06', 50.66);
# GETTING THE DATES
SET @DtShort='2018-01-01';
SET @DtLong=@DtShort;
SELECT
@DtShort:=@DtLong as date_prev,
@DtLong:=dt.trade_date as date_curent
FROM
(SELECT DISTINCT trade_date FROM x_prices ORDER BY trade_date) dt;
# GETTING RETURN FOR SINGLE DAY
SET @DtToday='2018-01-03';
SET @DtYesterday='2018-01-02';
SELECT
x2.trade_date,
x2.portfolio,
sum(x2.val*x2.weight)/sum(x2.weight) as ret
FROM
(SELECT
x1.trade_date,
x1.portfolio,
sum(x1.weight)/2.0 as weight,
sum(x1.val_end)/sum(x1.val_start) as val,
sum(x1.val_start) as val_start,
sum(x1.val_end) as val_end
FROM
(SELECT
@DtToday as trade_date,
prt.name as portfolio,
wts.inst_id as iid,
wts.weight,
if(prc.trade_date=@DtToday,prc.price*wts.weight,0) as val_start,
if(prc.trade_date=@DtYesterday,prc.price*wts.weight,0) as val_end
FROM
x_ports prt,
x_weights wts,
x_prices prc
WHERE
wts.port_id=prt.id and
prc.inst_id=wts.inst_id and
(prc.trade_date=@DtToday or prc.trade_date=@DtYesterday)) x1
GROUP BY x1.portfolio) x2
GROUP BY x2.portfolio;
I hope to be able to produce a result looking like this:
Date Port A Port B
--------------------------------------------
01/01/2010
02/01/2010 1.005289596 1.004379853
03/01/2010 0.995851496 0.997026759
04/01/2010 0.999840954 0.999801193
05/01/2010 1.003535565 1.002187314
06/01/2010 1.005896896 1.00515873
The return for Port A on the 2/1/2018 should be calculated as 1.13/1.12*20/(20+80) + 50.45/50.23*80/(20+80).
The return for Port B on the 2/1/2018 should be calculated as 50.45/50.23*100/100, or possibly 1.13/1.12*0/(0+100) + 50.45/50.23*100/(0+100).
FYI, in the loop above I only calculate the numerator (using the unscaled weights), so Port A would be calculated as 1.13/1.12*20 + 50.45/50.23*80, which I see as the crucial step when calculating the return. The return is then found by dividing this by the sum of the weights.
Though it certainly can be done better, I can get the dates and I can calculate the return of a single day, but I just can't put the two together.
Simulating analytics is no fun! Demo
The math on this doesn't seem right to me, as I'm nowhere close to your expected results.
I'd like to be able to reuse CurDay, but because this MySQL version doesn't support common table expressions I couldn't.
What this does:
X1 generates the join of the tables.
X2 gives us a count of the instruments in a portfolio, used later in the math.
r initializes the user variables @RN and @RN2 that we use to number the rows.
CurDay generates a row number, ordered correctly, so we can join.
NextDay generates a copy of CurDay so we can join CurDay to the next day on RN+1.
Z allows us to do the math, group by current day, and prepare for the pivot on the portfolio name.
The outermost select pivots the data so we have the date plus 2 columns.
SELECT Z.Trade_Date
, sum(case when name = 'Port A' then P_RETURN end) as PortA
, sum(case when name = 'Port B' then P_RETURN end) as PortB
FROM (
## Raw data
SELECT CurDay.*, NextDay.Price/CurDay.Price*CurDay.Weight/CurDay.Inst_Total_Weight as P_Return
FROM (SELECT x1.*, @RN:=@RN+1 rn,x2.inst_cnt, x2.Inst_Total_Weight
FROM (SELECT prt.name, W.port_ID, W.inst_ID, W.weight, prc.trade_Date, Prc.Price
FROM x_ports Prt
INNER JOIN x_weights W
on W.Port_ID = prt.ID
INNER JOIN x_prices Prc
on Prc.INST_ID = W.INST_ID
ORDER BY W.port_id, W.inst_id,trade_Date) x1
CROSS join (SELECT @RN:=0) r
INNER join (SELECT count(*) inst_Cnt, port_ID, sum(Weight) as Inst_Total_Weight
FROM x_weights
GROUP BY Port_ID) x2
on X1.Port_ID = X2.Port_ID) CurDay
LEFT JOIN (SELECT x1.*, @RN2:=@RN2+1 rn2
FROM (SELECT prt.name, W.port_ID, W.inst_ID, W.weight, prc.trade_Date, Prc.Price
FROM x_ports Prt
INNER JOIN x_weights W
on W.Port_ID = prt.ID
INNER JOIN x_prices Prc
on Prc.INST_ID = W.INST_ID
ORDER BY W.port_id, W.inst_id,trade_Date) x1
CROSS join (SELECT @RN2:=0) r
) NextDay
on NextDay.Port_ID = CurDay.Port_ID
and NextDay.Inst_ID = curday.Inst_ID
and NextDay.RN2 = CurDay.RN+1
GROUP BY CurDay.Port_ID, CurDay.Inst_ID, CurDay.Trade_Date) Z
##END RAW DATA
GROUP BY Trade_Date;
+----+---------------------+-------------------+-------------------+
| | Trade_Date | PortA | PortB |
+----+---------------------+-------------------+-------------------+
| 1 | 01.01.2018 00:00:00 | 1,00528959642786 | 1,00892857142857 |
| 2 | 02.01.2018 00:00:00 | 0,995851495829569 | 0,991150442477876 |
| 3 | 03.01.2018 00:00:00 | 0,999840954274354 | 1 |
| 4 | 04.01.2018 00:00:00 | 1,0035355651507 | 1,00892857142857 |
| 5 | 05.01.2018 00:00:00 | 1,00589689563141 | 1,00884955752212 |
| 6 | 06.01.2018 00:00:00 | NULL | NULL |
+----+---------------------+-------------------+-------------------+
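As a cross-check of the numbers, here is the question's loop written as a short Python sketch over the sample data. The names `return_series`, `weights` and `prices` are made up for illustration, and the sketch assumes every instrument has a price on every date. Unlike the query above, which attaches each return to the earlier of the two dates, it labels each return with the later date, as the question's expected output does.

```python
weights = {"PORT A": {1: 20.0, 2: 80.0}, "PORT B": {1: 100.0}}
prices = {
    1: {"2018-01-01": 1.12, "2018-01-02": 1.13, "2018-01-03": 1.12,
        "2018-01-04": 1.12, "2018-01-05": 1.13, "2018-01-06": 1.14},
    2: {"2018-01-01": 50.23, "2018-01-02": 50.45, "2018-01-03": 50.30,
        "2018-01-04": 50.29, "2018-01-05": 50.40, "2018-01-06": 50.66},
}


def return_series(weights, prices):
    """Return {date: {portfolio: weighted price relative}} for each date after the first."""
    dates = sorted({d for series in prices.values() for d in series})  # ISO dates sort as text
    out = {}
    for prev, cur in zip(dates, dates[1:]):
        out[cur] = {}
        for port, port_weights in weights.items():
            total_weight = sum(port_weights.values())
            # sum over instruments of price_today / price_yesterday * weight
            numerator = sum(
                prices[inst][cur] / prices[inst][prev] * weight
                for inst, weight in port_weights.items()
            )
            out[cur][port] = numerator / total_weight
    return out


series = return_series(weights, prices)
for date, by_port in series.items():
    print(date, {port: round(value, 9) for port, value in by_port.items()})
```

With the weights as inserted in the script (PORT B holds only instrument 1), PORT A on 2018-01-02 agrees with the 1.005289596 the question expects, while PORT B comes out as 1.13/1.12, matching the query result above rather than the question's expected column.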

Finding in between time in a list of times

I have a table that looks like this
userid | eventid | description | date | starttime | endtime
1 1 Event 1 2016-02-02 09:30:00 11:00:00
1 2 Event 2 2016-02-02 13:30:00 15:00:00
1 3 Event 3 2016-02-02 17:30:00 21:00:00
2 4 Event 4 2016-02-03 13:00:00 14:00:00
2 5 Event 5 2016-02-03 15:00:00 16:00:00
I need to find the sum of the time between events on the same day for each user.
Like this:
userid | timeBetween
1 05:00:00
2 01:00:00
I should also assume that there may be overlapping times for example event1 starts at 11:00 ends 13:00 and event2 starts 12:00 and ends 14:00 by the same user on the same day. These cases are rare and I believe returning 00:00 here is the appropriate answer.
I solved a similar problem, finding the sum of the length of all events per day.
SELECT *,
SEC_TO_TIME( SUM( TIME_TO_SEC(TIMEDIFF(`endtime`,`starttime`)))) as sumtime
FROM `events`
group by userid, date
order by sumtime desc
Given this sample data:
CREATE TABLE t
(`userid` int, `eventid` int, `description` varchar(7), `date` date, `starttime` time, `endtime` time)
;
INSERT INTO t
(`userid`, `eventid`, `description`, `date`, `starttime`, `endtime`)
VALUES
(1, 1, 'Event 1', '2016-02-02', '09:30:00', '11:00:00'),
(1, 2, 'Event 2', '2016-02-02', '13:30:00', '15:00:00'),
(1, 3, 'Event 3', '2016-02-02', '17:30:00', '21:00:00'),
(2, 4, 'Event 4', '2016-02-03', '13:00:00', '14:00:00'),
(2, 5, 'Event 5', '2016-02-03', '15:00:00', '16:00:00')
;
this query
SELECT userid, SEC_TO_TIME(SUM(TIME_TO_SEC(diff))) AS time_between
FROM (
SELECT
TIMEDIFF(starttime, COALESCE(IF(userid != @prev_userid, NULL, @prev_endtime), starttime)) AS diff,
@prev_endtime := endtime,
@prev_userid := userid AS userid
FROM
t
, (SELECT @prev_endtime := NULL, @prev_userid := NULL) var_init_subquery
ORDER BY userid
) sq
GROUP BY userid;
will return
+--------+--------------+
| userid | time_between |
+--------+--------------+
| 1 | 05:00:00 |
| 2 | 01:00:00 |
+--------+--------------+
Explanation:
In this part
, (SELECT @prev_endtime := NULL, @prev_userid := NULL) var_init_subquery
ORDER BY userid
we initialize our variables. The ORDER BY is very important, since there's no order in a relational database unless you specify it. It is so important, because the SELECT clause processes the rows in this order.
In the SELECT clause the order is also very important. Here
@prev_endtime := endtime,
@prev_userid := userid AS userid
we assign the values of the current row to the variables. Since this happens after this line
TIMEDIFF(starttime, COALESCE(IF(userid != @prev_userid, NULL, @prev_endtime), starttime)) AS diff,
the variables still hold the values of the previous row in the timediff() function. Therefore we also have to use COALESCE(), because in the very first row and when the userid changes, there is no value to calculate the diff from. To get a diff of 0 there, COALESCE() exchanges the NULL value with the starttime.
The last part is obviously to simply sum the seconds of the "between times".
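The "previous row" idea can also be sketched in Python, which makes the role of the ordering explicit. This is only an illustration with a made-up function name (`time_between`); it groups by user and date, sorts each group by start time, and, as an assumption based on the question's remark about overlaps, counts a negative gap as zero.

```python
from datetime import datetime, timedelta
from collections import defaultdict


def time_between(events):
    """events: (userid, date, starttime, endtime) tuples with ISO-style strings."""
    fmt = "%Y-%m-%d %H:%M:%S"
    by_day = defaultdict(list)
    for userid, date, start, end in events:
        by_day[(userid, date)].append((
            datetime.strptime(f"{date} {start}", fmt),
            datetime.strptime(f"{date} {end}", fmt),
        ))
    totals = {}
    for (userid, _), spans in by_day.items():
        spans.sort()  # by start time; this is the job ORDER BY does in the SQL
        gap = timedelta(0)
        for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
            # Overlapping events would give a negative gap; count them as zero.
            gap += max(next_start - prev_end, timedelta(0))
        totals[userid] = totals.get(userid, timedelta(0)) + gap
    return totals


events = [
    (1, "2016-02-02", "09:30:00", "11:00:00"),
    (1, "2016-02-02", "13:30:00", "15:00:00"),
    (1, "2016-02-02", "17:30:00", "21:00:00"),
    (2, "2016-02-03", "13:00:00", "14:00:00"),
    (2, "2016-02-03", "15:00:00", "16:00:00"),
]
for userid, gap in time_between(events).items():
    print(userid, gap)
```

Note that the sketch sorts by start time inside each user and day; if the rows could arrive in any order, the SQL would need the same ordering for the previous-row variables to make sense.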
Here's one way you can get the timeBetween value in SECONDS
SELECT
firsttable.userid,
SEC_TO_TIME(SUM(TIME_TO_SEC(secondtable.starttime) - TIME_TO_SEC(firsttable.endtime))) timeBetween
FROM
(
SELECT
*,
IF(@prev = userid, @rn1 := @rn1 + 1, @rn1 := 1) rank,
@prev := userid
FROM eventtable,(SELECT @prev := 0,@rn1 := 1) var
ORDER BY userid,starttime DESC
) firsttable
INNER JOIN
(
SELECT
*,
IF(@prev2 = userid, @rn2 := @rn2 + 1, @rn2 := 1) rank,
@prev2 := userid
FROM eventtable,(SELECT @prev2 := 0,@rn2 := 1) var
ORDER BY userid,endtime DESC
) secondtable
ON firsttable.userid = secondtable.userid AND firsttable.rank = secondtable.rank + 1 AND
firsttable.date = secondtable.date
GROUP BY firsttable.userid;
TEST:
Unable to add a fiddle.
So here's test data with schema:
DROP TABLE IF EXISTS `eventtable`;
CREATE TABLE `eventtable` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`userid` int(11) NOT NULL,
`eventid` int(11) NOT NULL,
`description` varchar(100) CHARACTER SET utf8 NOT NULL,
`date` date NOT NULL,
`starttime` time NOT NULL,
`endtime` time NOT NULL,
PRIMARY KEY (`id`)
) ;
INSERT INTO `eventtable` VALUES ('1', '1', '1', 'Event 1', '2016-02-02', '09:30:00', '11:00:00');
INSERT INTO `eventtable` VALUES ('2', '1', '2', 'Event 2', '2016-02-02', '13:30:00', '15:00:00');
INSERT INTO `eventtable` VALUES ('3', '1', '3', 'Event 3', '2016-02-02', '17:30:00', '21:00:00');
INSERT INTO `eventtable` VALUES ('4', '2', '4', 'Event 4', '2016-02-03', '13:00:00', '14:00:00');
INSERT INTO `eventtable` VALUES ('5', '2', '5', 'Event 5', '2016-02-03', '15:00:00', '16:00:00');
Result:
Executing the above query on the given test data you will get output like below:
userid timeBetween
1 05:00:00
2 01:00:00
Note:
For overlapping events the above query will give you negative timeBetween value.
You can replace the SEC_TO_TIME... line with the following:
SEC_TO_TIME(IF(SUM(TIME_TO_SEC(secondtable.starttime) - TIME_TO_SEC(firsttable.endtime)) < 0, 0,SUM(TIME_TO_SEC(secondtable.starttime) - TIME_TO_SEC(firsttable.endtime)))) timeBetween
If you take the TIMEDIFF of the MIN(starttime) and MAX(endtime) for each user/day and then subtract the sum of events as calculated earlier, this will give you the times in between.
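A quick way to convince yourself of this is to do the arithmetic for one user and one day. The Python sketch below uses a hypothetical helper, `gap_from_span`, and only holds when the events do not overlap, because overlapping events would be counted twice in the sum of event lengths.

```python
from datetime import datetime, timedelta


def gap_from_span(spans):
    """spans: ('HH:MM:SS', 'HH:MM:SS') start/end pairs for one user on one day."""
    fmt = "%H:%M:%S"
    parsed = [(datetime.strptime(s, fmt), datetime.strptime(e, fmt)) for s, e in spans]
    # TIMEDIFF(MAX(endtime), MIN(starttime))
    whole_span = max(e for _, e in parsed) - min(s for s, _ in parsed)
    # Sum of the individual event lengths
    busy = sum((e - s for s, e in parsed), timedelta(0))
    return whole_span - busy


user1 = [("09:30:00", "11:00:00"), ("13:30:00", "15:00:00"), ("17:30:00", "21:00:00")]
user2 = [("13:00:00", "14:00:00"), ("15:00:00", "16:00:00")]
print(gap_from_span(user1))  # 5:00:00
print(gap_from_span(user2))  # 1:00:00
```

For user 1 the span from 09:30 to 21:00 is 11:30 and the events add up to 6:30, which leaves the expected 5:00.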
Try this:
select TIMEDIFF(endtime, starttime) from your_table;
Hope this helps.

Mysql increment column value by group

I would like to increment a column within groups of rows that share the same parent id. See the problem below:
ID Name Parent Pos
================================
1 Alex 1 0
2 Mary 1 0
3 John 1 0
4 Doe 2 0
5 Bob 2 0
6 Kate 2 0
EXPECTED RESULT
ID Name Parent Pos
================================
1 Alex 1 1
2 Mary 1 2
3 John 1 3
4 Doe 2 1
5 Bob 2 2
6 Kate 2 3
I would do this using two queries to select distinct values of the parent, then do a loop and update in sets but I feel there is a more efficient way!!
These problems can be solved easily with a ranking function. Because MySQL versions before 8.0 don't support ranking functions, we have to go with an alternative.
Check this query
-- row number within each Parent group
SELECT
Id,
NAME,
Parent,
Pos
, @runningGroup := IF(@previousParent = rankTab.Parent, @runningGroup + 1, 1) AS posInGroup
, @previousParent := rankTab.Parent AS previousParent
FROM
inc_col_val_by_group AS rankTab,
(SELECT @runningGroup := 0) AS b
, (select @previousParent := 0 ) as prev
ORDER BY rankTab.Parent, rankTab.Id; -- order by Parent, then Id
--
-- -- below are the create table & insert the given records script
-- create the table
CREATE TABLE inc_col_val_by_group
(Id INT
, NAME CHAR(10)
, Parent INT
, Pos INT
);
-- insert some records
INSERT INTO inc_col_val_by_group(Id, NAME, Parent, Pos)
VALUES
(1, 'Alex', 1, 0)
, (2, 'Mary', 1, 0)
, (3, 'John', 1, 0)
, (4, 'Doe', 2, 0)
, (5, 'Bob', 2, 0)
, (6, 'Kate', 2, 0);
The most efficient way is probably to use variables:
select t.*,
(@rn := if(@p = parent, @rn + 1,
if(@p := parent, 1, 1)
)
) as pos
from table t cross join
(select @p := 0, @rn := 0) init
order by parent, id;
SET @posn:=0;
SET @pid:=0;
SELECT IF(@pid=k.parentid,@posn:=@posn+1,@posn:=1) pos,@pid:=k.parentid pid, k.*
FROM kids k
ORDER BY parentid
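All of the variable-based queries in this thread do the same thing: walk the rows in parent order, keep a counter, and reset it whenever the parent changes. A minimal Python sketch of that loop, with a made-up function name `number_within_parent` and the sample rows from the question, may make the reset logic easier to follow:

```python
def number_within_parent(rows):
    """rows: list of (id, name, parent); returns (id, name, parent, pos) tuples."""
    result = []
    prev_parent = object()  # sentinel that never equals a real parent id
    pos = 0
    # ORDER BY parent, id
    for row_id, name, parent in sorted(rows, key=lambda r: (r[2], r[0])):
        pos = pos + 1 if parent == prev_parent else 1  # reset on a new parent
        prev_parent = parent
        result.append((row_id, name, parent, pos))
    return result


rows = [
    (1, "Alex", 1), (2, "Mary", 1), (3, "John", 1),
    (4, "Doe", 2), (5, "Bob", 2), (6, "Kate", 2),
]
for row in number_within_parent(rows):
    print(row)
```

The counter only gives stable positions because the rows are sorted by parent (and by id inside each parent) before the loop runs, which is why the SQL versions depend on their ORDER BY.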