I have these data in a table:
numb m value
8070 1 7.63
NULL 1 7.64
NULL 1 7.65
8070 2 7.939
8070 2 7.935
8070 2 7.941
NULL 3 7.62
8070 4 7.92
8070 4 7.935
I need MIN(value) and MAX(value) for each m, and if there is a value without numb (NULL), then the ones with a numb should be ignored.
So I should be getting the following results:
numb m value
NULL 1 7.64
NULL 1 7.65
8070 2 7.935
8070 2 7.941
NULL 3 7.62
8070 4 7.92
8070 4 7.935
I've tried quite a lot of different things, but nothing seems to work, and I have no more ideas how to find relevant info. Can you please point me to the right direction?
UPDATE:
to get the number of values it looks like this:
COALESCE(
IF(
COUNT(
CASE
WHEN m IN (2, 4)
THEN value
ELSE
CASE
WHEN m IN (1, 3) AND numb IS NULL
THEN value
END
END
) = 0,
NULL,
COUNT(
CASE
WHEN m IN (2, 4)
THEN value
ELSE
CASE
WHEN m IN (1, 3) AND numb IS NULL
THEN value
END
END
)
),
COUNT(
CASE
WHEN m IN (1, 3)
AND numb IS NOT NULL
THEN value
END
)
) AS cnt
This query should give you the results you want. It has two levels of nested derived tables. The first:
SELECT m,
MIN(CASE WHEN numb IS NULL THEN value END) AS min_null,
MAX(CASE WHEN numb IS NULL THEN value END) AS max_null,
MIN(CASE WHEN numb IS NOT NULL THEN value END) AS min_normal,
MAX(CASE WHEN numb IS NOT NULL THEN value END) AS max_normal
FROM numbers
GROUP BY m;
computes the minimum and maximum values for each value of m, dependent on whether numb was a number or NULL. In the next level,
SELECT m,
COALESCE(min_null, min_normal) AS min_value,
COALESCE(max_null, max_normal) AS max_value
FROM (... query 1...)
we compute the appropriate minimum and maximum values to use (if there was a NULL value, we use the values from the NULL rows; otherwise we use the ones associated with numeric values of numb). Finally we JOIN the numbers table to the result of query 2 to find the appropriate values of numb for each value of m:
SELECT n.numb, n.m, n.value
FROM numbers n
JOIN (... query 2 ...) num ON num.m = n.m AND (num.min_value = n.value OR num.max_value = n.value)
ORDER BY n.m, n.value
Output:
numb m value
null 1 7.64
null 1 7.65
8070 2 7.935
8070 2 7.941
null 3 7.62
8070 4 7.92
8070 4 7.935
The full query:
SELECT n.numb, n.m, n.value
FROM numbers n
JOIN (SELECT m,
COALESCE(min_null, min_normal) AS min_value,
COALESCE(max_null, max_normal) AS max_value
FROM (SELECT m,
MIN(CASE WHEN numb IS NULL THEN value END) AS min_null,
MAX(CASE WHEN numb IS NULL THEN value END) AS max_null,
MIN(CASE WHEN numb IS NOT NULL THEN value END) AS min_normal,
MAX(CASE WHEN numb IS NOT NULL THEN value END) AS max_normal
FROM numbers
GROUP BY m) n) num ON num.m = n.m AND (num.min_value = n.value OR num.max_value = n.value)
ORDER BY n.m, n.value
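For comparison, here is a minimal sketch of the same idea using window functions. It assumes MySQL 8.0 or later (window functions and CTEs are not available in older versions) and the same `numbers` table; the CTE names `flagged` and `kept` and the column `has_null` are made up for illustration. Unlike the join above, it drops the non-NULL rows of a group before computing the minimum and maximum, so a non-NULL row that happens to share a value with a NULL row is not returned.

```sql
WITH flagged AS (
    -- has_null is 1 for every row of a group that contains at least one NULL numb
    SELECT numb, m, value,
           MAX(numb IS NULL) OVER (PARTITION BY m) AS has_null
    FROM numbers
),
kept AS (
    -- keep only the NULL rows when the group has any, otherwise keep all rows;
    -- the window aggregates are computed after this WHERE filter
    SELECT numb, m, value,
           MIN(value) OVER (PARTITION BY m) AS min_value,
           MAX(value) OVER (PARTITION BY m) AS max_value
    FROM flagged
    WHERE has_null = 0 OR numb IS NULL
)
SELECT numb, m, value
FROM kept
WHERE value IN (min_value, max_value)
ORDER BY m, value;
```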
I have a table like this:
CREATE TABLE `PQ_batch` (
`id` int(6) unsigned NOT NULL AUTO_INCREMENT COMMENT 'Batch Id number',
`date` datetime DEFAULT NULL,
`qty` int(11) DEFAULT NULL COMMENT 'Number of units in a batch',
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1000 DEFAULT CHARSET=utf8;
Id | date | qty
--------------------------------
1 2017-01-06 5
2 2017-01-02 5
3 2017-01-03 100
Given a qty value of: @qtyToTake := 100
Select the rows that will be needed to fulfill @qtyToTake, and ONLY these rows, along with the quantity that is to be taken from each row and the new quantity that remains for that row. The oldest batches should be used up first.
It should look something like this:
Id | date | qty | newQty | qtyTakenPerRecord
-------------------------------------------------------
1 2017-01-02 5 0 5
2 2017-01-03 100 5 95
3 2017-01-01 5 5 0
newQty = (qty - @qtyToTake) where @qtyToTake = (@qtyToTake - the previous row's qty until @qtyToTake reaches 0)
@qtyToTake should be dynamically assigned to be the difference of the previous row's qty and its current value until it reaches 0.
Here's what I came up with:
SELECT p.bid, p.Orig as origQty, p.NewQty, (p.Orig - p.NewQty) AS NumToTake
FROM(
SELECT b.bid, (@runtot := b.bqty - @runtot) AS remain, ( @runtot := (b.bqty - @runtot) ) leftToGet, b.bqty AS Orig,
(SELECT
(sum(bqty) - @runtot) AS tot FROM PQ_batch
WHERE bid <= b.bid ) AS RunningTotal,
(SELECT
CASE
WHEN (sum(bqty) - @runtot) > 1 THEN (sum(bqty) - @runtot)
ELSE 0
END
FROM PQ_batch
WHERE bid <= b.bid ) AS NewQty
FROM PQ_batch b,(SELECT @runtot:= 100) c
ORDER BY bdate
) AS p
Using @McNets' example, here's what I came up with:
SELECT bid, bdate,bpid,bcost, bqty, newQty, qtyTakenPerRecord
FROM (
SELECT y.*,
IF(@qtyToTake > 0, 1, 0) AS used,
IF(@qtyToTake > bqty, 0, bqty - @qtyToTake) AS newQty,
IF(@qtyToTake > bqty, bqty, bqty - (bqty - @qtyToTake)) AS qtyTakenPerRecord,
@qtyToTake := @qtyToTake - bqty AS qtyToTake
FROM
(SELECT @qtyToTake := 100) x,
(SELECT * from PQ_batch WHERE bpid =1002 AND bqty > 0 ORDER BY bdate) y
) z
WHERE used = 1
ORDER BY bdate
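As an alternative that does not rely on the evaluation order of user variables, here is a minimal sketch using a running total. It assumes MySQL 8.0 or later and the column names from the CREATE TABLE statement (`id`, `date`, `qty`) rather than the `bid`/`bdate`/`bqty` names used in the queries; the alias `taken` is made up for illustration. For each batch, the quantity already covered by older batches is the running sum minus the batch's own qty, and the amount taken is whatever is still needed, capped at the batch's qty.

```sql
SET @qtyToTake := 100;

SELECT id, `date`, qty,
       qty - taken AS newQty,
       taken       AS qtyTakenPerRecord
FROM (
    SELECT id, `date`, qty,
           -- (running sum - qty) = quantity supplied by older batches;
           -- id breaks ties so that the running sum is well defined
           LEAST(qty,
                 GREATEST(@qtyToTake - (SUM(qty) OVER (ORDER BY `date`, id) - qty), 0)
           ) AS taken
    FROM PQ_batch
) t
WHERE taken > 0          -- only the rows actually needed
ORDER BY `date`, id;
```

With the three sample rows, the batch dated 2017-01-02 gives up all 5 units, the batch dated 2017-01-03 gives up 95 and keeps 5, and the batch dated 2017-01-06 is not returned.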
Here is my table ws_sold I'm doing testing with
Along with my ws_inventory table
My query is as follows:
SELECT
inventory.id,
inventory.sku AS inventory_sku,
inventory.quantity AS inventory_quantity,
mastersku.sku2,
mastersku.sku1,
mastersku.sku3,
mastersku.multsku,
mastersku.qtysku,
mastersku.altsku,
mastersku.sku,
sold.quantity AS sold_quantity,
sold.sku AS sold_sku
FROM sold
LEFT OUTER JOIN mastersku
ON sold.sku = mastersku.sku
LEFT OUTER JOIN inventory
ON mastersku.sku1 = inventory.sku
OR mastersku.altsku = inventory.sku
Which has an output of:
Everything is great, besides the inventory_quantity column results.
My query is not taking into consideration previous equations in earlier rows for same SKU entries, and is assuming each query is starting fresh from the ws_inventory Quantity of 99.
The logic in this is (I've done so with both PHP and MySQL in testing as I'm open to both):
inventory_quantity - (sold_quantity * ws_mastersku.QtySKU)
Therefore the first result for WS16 is 99 - (2 * 4) = 91.
This is correct.
But, the second instance of WS16 is 99 - (4 * 4) = 83.
And is therefore over-writing the first result.
I'm looking for a query that will keep the running total on inventory_quantity if (such as in this test case), there are more than one of the same SKU being processed.
Something such as this:
1 WS16 91 (null) (null) (null) (null) 0 4 WS16 WS16X4-2 WS16X4-2 2
2 WS3 97 (null) (null) (null) (null) 0 2 WS3 WS3X2-4 WS3X2-4 1
3 WS6 95 (null) (null) (null) (null) 0 4 WS6 WS6X4-16 WS6X4-16 1
4 WS16 75 (null) (null) (null) (null) 0 4 WS16 WS16X4-2 WS16X4-2 4
I realize this issue is arising because inventory_quantity is taken at the start of the query as its initial number, and is not updated based on rows processed later in the query.
Any suggestions/help please? It has taken me a while just to get to this point in the project, being rather new to MySQL it has been a big learning experience all the way through, but this issue is causing a huge barrier for me.
Thank you!
You should reconsider your database design and the questions you need to answer. You are using the ws_inventory.quantity field to store the initial quantity when in fact it should be showing the current available quantity. This will instantly show the answer to the question: "how many widgets do I have left?"
You should be decrementing the inventory in ws_inventory as you sell each unit. You can do this with a trigger in MySQL: when you add a row to the ws_sold table, the trigger updates the ws_inventory.quantity field. Alternatively, you may want to do this as a separate query in your application (PHP?) code. The quantity should show the quantity on hand: that way, if you have 99 and sell 2 and then sell 3 more, your ws_inventory.quantity field should be 94 (94 = 99 - 2 - 3). You can also add to this field when you replenish inventory.
This is how the trigger should work:
(from Update another table after insert using a trigger?)
-- run this SQL code on your database
-- note that NEW.quantity and NEW.sku are what you inserted into ws_sold
CREATE TRIGGER update_quantity
AFTER INSERT ON ws_sold
FOR EACH ROW
UPDATE ws_inventory
SET ws_inventory.quantity = ws_inventory.quantity - NEW.quantity
WHERE ws_inventory.sku = NEW.sku;
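The trigger above matches ws_sold.sku directly against ws_inventory.sku and subtracts the sold quantity as-is. If, as in the question's query, the sold SKU has to be mapped through ws_mastersku and multiplied by qtysku, a variant along the following lines might be closer to what is needed. This is only a sketch: the trigger name is made up, and the join condition is copied from the question's query on the assumption that it reflects the real mapping.

```sql
-- hypothetical variant: map the sold SKU to its inventory SKU and
-- subtract (sold quantity * units per SKU) instead of the raw quantity
CREATE TRIGGER update_quantity_via_mastersku
AFTER INSERT ON ws_sold
FOR EACH ROW
UPDATE ws_inventory i
JOIN ws_mastersku m
  ON m.sku1 = i.sku
  OR m.altsku = i.sku
SET i.quantity = i.quantity - NEW.quantity * m.qtysku
WHERE m.sku = NEW.sku;
```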
If you need to maintain a history of inventory for reports like the one above, then you may want to consider an ws_inventory_history table that can take snapshots.
The following answer works by joining the sold items with the last sold items after grouping them on inventory.id and by resetting the variables when the inventory.id changes. Note how the join is on the inventory.id and group_row - 1 = group_row
This SQL example works
select sold.inventory_id
, sold.sold_id
, case sold.inventory_id when @inventoryId then
@starting := @starting
else
@starting := sold.inventory_quantity_before_sale
end as starting_quantity
, case sold.inventory_id when @inventoryId then
@runningSold := @runningSold + coalesce(last.quantity_sold,0)
else
@runningSold := 0
end as running_sold
, case sold.inventory_id when @inventoryId then
@runningInventoryQuantity := @starting - @runningSold
else
@runningInventoryQuantity := sold.inventory_quantity_before_sale
end as before_sale
, sold.quantity_sold
, sold.inventory_quantity_before_sale - sold.quantity_sold - @runningSold after_sale
, @inventoryId := sold.inventory_id group_inventory_id -- clocks over the group counter
from (
select inventorySold.*
, case inventory_id
when @inventoryId then
@groupRow := @groupRow + 1
else
@groupRow := 1
end as group_row
, @inventoryId := inventory_id as group_inventory_id
from (
SELECT
inventory.id inventory_id
, inventory.quantity AS inventory_quantity_before_sale
, mastersku.qtysku * sold.quantity quantity_sold
, sold.id as sold_id -- for order ... you'd probably use created timestamp or finalised timestamp
FROM ws_sold sold
LEFT OUTER JOIN ws_mastersku mastersku
ON sold.sku = mastersku.sku
LEFT OUTER JOIN ws_inventory inventory
ON mastersku.sku1 = inventory.sku
OR mastersku.altsku = inventory.sku
) inventorySold
join ( select @groupRow := 0, @inventoryId := 0 ) variables
order by inventory_id, sold_id
) sold
left join (
select inventorySold.*
, case inventory_id
when @inventoryId then
@groupRow := @groupRow + 1
else
@groupRow := 1
end as group_row
, @inventoryId := inventory_id as group_inventory_id
from (
SELECT
inventory.id inventory_id
, inventory.quantity AS inventory_quantity
, mastersku.qtysku * sold.quantity quantity_sold
, sold.id as sold_id -- for order ... you'd probably use created timestamp or finalised timestamp
FROM ws_sold sold
LEFT OUTER JOIN ws_mastersku mastersku
ON sold.sku = mastersku.sku
LEFT OUTER JOIN ws_inventory inventory
ON mastersku.sku1 = inventory.sku
OR mastersku.altsku = inventory.sku
) inventorySold
join ( select @groupRow := 0, @inventoryId := 0 ) variables
order by inventory_id, sold_id
) `last`
on sold.inventory_id = `last`.inventory_id
and sold.group_row - 1 = `last`.group_row
join ( select @runningInventoryQuantity := 0, @runningSold := 0, @inventoryId := 0, @afterSold := 0, @starting :=0 ) variables
order by sold.inventory_id, sold.group_row
-- example results
inventory_id sold_id starting_quantity running_sold before_sale quantity_sold after_sale group_inventory_id
1 1 93 0 93 4 89 1
1 4 93 4 89 16 73 1
1 5 93 20 73 20 53 1
1 12 93 40 53 48 5 1
2 2 97 0 97 4 93 2
2 6 97 4 93 12 81 2
2 7 97 16 81 14 67 2
2 8 97 30 67 16 51 2
2 11 97 46 51 22 29 2
3 3 95 0 95 12 83 3
3 9 95 12 83 36 47 3
3 10 95 48 47 40 7 3
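On MySQL 8.0 or later, the same running totals can be sketched with a window function instead of user variables and a self-join. This assumes the supporting tables listed in this answer (ws_sold, ws_mastersku, ws_inventory) and, like the query above, uses sold.id as the ordering of sales; the window name `w` is arbitrary.

```sql
SELECT i.id                    AS inventory_id,
       s.id                    AS sold_id,
       i.quantity              AS starting_quantity,
       m.qtysku * s.quantity   AS quantity_sold,
       -- starting quantity minus everything sold so far, this sale included
       i.quantity - SUM(m.qtysku * s.quantity) OVER w AS after_sale
FROM ws_sold s
JOIN ws_mastersku m
  ON s.sku = m.sku
JOIN ws_inventory i
  ON m.sku1 = i.sku
  OR m.altsku = i.sku
WINDOW w AS (PARTITION BY i.id ORDER BY s.id)
ORDER BY i.id, s.id;
```

For inventory_id 1 in the sample data (starting quantity 93, sales of 4, 16, 20 and 48 units) this yields after_sale values of 89, 73, 53 and 5, in line with the example results above.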
You could do the same thing in PHP. Have a starting quantity, a running amount sold and a running total that gets reset each time the inventory id changes and then use the quantity sold for each transaction to adjust those variables.
If you choose the PHP route example sql is
select sold.* from (
select inventorySold.*
, case inventory_id
when @inventoryId then
@groupRow := @groupRow + 1
else
@groupRow := 1
end as group_row
, @inventoryId := inventory_id as group_inventory_id
from (
SELECT
inventory.id inventory_id
, inventory.quantity AS inventory_quantity_before_sale
, mastersku.qtysku * sold.quantity quantity_sold
, sold.id as sold_id -- for order ... you'd probably use created timestamp or finalised timestamp
FROM ws_sold sold
LEFT OUTER JOIN ws_mastersku mastersku
ON sold.sku = mastersku.sku
LEFT OUTER JOIN ws_inventory inventory
ON mastersku.sku1 = inventory.sku
OR mastersku.altsku = inventory.sku
) inventorySold
join ( select @groupRow := 0, @inventoryId := 0 ) variables
order by inventory_id, sold_id
) sold
-- example results
inventory_id inventory_quantity_before_sale quantity_sold sold_id group_row group_inventory_id
1 93 4 1 1 1
1 93 16 4 2 1
1 93 20 5 3 1
1 93 48 12 4 1
2 97 4 2 1 2
2 97 12 6 2 2
2 97 14 7 3 2
2 97 16 8 4 2
2 97 22 11 5 2
3 95 12 3 1 3
3 95 36 9 2 3
3 95 40 10 3 3
You can get the label information by joining other tables to the result using inventory.id and sold.id.
I agree with https://stackoverflow.com/users/932820/chris-adams. If you're looking to keep a track of stocktakes over time then you'll need a transaction table to record the starting and ending inventory quantities and starting and ending timestamps ... and probably starting and ending sold ids.
-- supporting tables - run this in an empty database unless you want to destroy your current tables
DROP TABLE IF EXISTS `ws_inventory`;
CREATE TABLE `ws_inventory` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`sku` varchar(20) DEFAULT NULL,
`quantity` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
LOCK TABLES `ws_inventory` WRITE;
/*!40000 ALTER TABLE `ws_inventory` DISABLE KEYS */;
INSERT INTO `ws_inventory` (`id`, `sku`, `quantity`)
VALUES
(1,'WS16',93),
(2,'WS3',97),
(3,'WS6',95);
/*!40000 ALTER TABLE `ws_inventory` ENABLE KEYS */;
UNLOCK TABLES;
# Dump of table ws_mastersku
# ------------------------------------------------------------
DROP TABLE IF EXISTS `ws_mastersku`;
CREATE TABLE `ws_mastersku` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`sku` varchar(20) DEFAULT NULL,
`sku1` varchar(20) DEFAULT NULL,
`sku2` varchar(20) DEFAULT NULL,
`sku3` varchar(20) DEFAULT NULL,
`multsku` tinyint(2) DEFAULT NULL,
`qtysku` int(11) DEFAULT NULL,
`altsku` varchar(20) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
LOCK TABLES `ws_mastersku` WRITE;
/*!40000 ALTER TABLE `ws_mastersku` DISABLE KEYS */;
INSERT INTO `ws_mastersku` (`id`, `sku`, `sku1`, `sku2`, `sku3`, `multsku`, `qtysku`, `altsku`)
VALUES
(1,'WS16X4-2',NULL,NULL,NULL,NULL,4,'WS16'),
(2,'WS3X2-4',NULL,NULL,NULL,NULL,2,'WS3'),
(3,'WS6X4-16',NULL,NULL,NULL,NULL,4,'WS6');
/*!40000 ALTER TABLE `ws_mastersku` ENABLE KEYS */;
UNLOCK TABLES;
# Dump of table ws_sold
# ------------------------------------------------------------
DROP TABLE IF EXISTS `ws_sold`;
CREATE TABLE `ws_sold` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`sku` varchar(20) DEFAULT NULL,
`quantity` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
LOCK TABLES `ws_sold` WRITE;
/*!40000 ALTER TABLE `ws_sold` DISABLE KEYS */;
INSERT INTO `ws_sold` (`id`, `sku`, `quantity`)
VALUES
(1,'WS16X4-2',1),
(2,'WS3X2-4',2),
(3,'WS6X4-16',3),
(4,'WS16X4-2',4),
(5,'WS16X4-2',5),
(6,'WS3X2-4',6),
(7,'WS3X2-4',7),
(8,'WS3X2-4',8),
(9,'WS6X4-16',9),
(10,'WS6X4-16',10),
(11,'WS3X2-4',11),
(12,'WS16X4-2',12);
/*!40000 ALTER TABLE `ws_sold` ENABLE KEYS */;
UNLOCK TABLES;
Using the schema you supplied.
The query for SQL.
select sold.inventory_id
, sold.sold_id
, case sold.inventory_id when @inventoryId then
@starting := @starting
else
@starting := sold.inventory_quantity_before_sale
end as starting_quantity
, case sold.inventory_id when @inventoryId then
@runningSold := @runningSold + coalesce(last.quantity_sold,0)
else
@runningSold := 0
end as running_sold
, case sold.inventory_id when @inventoryId then
@runningInventoryQuantity := @starting - @runningSold
else
@runningInventoryQuantity := sold.inventory_quantity_before_sale
end as before_sale
, sold.quantity_sold
, sold.inventory_quantity_before_sale - sold.quantity_sold - @runningSold after_sale
, @inventoryId := sold.inventory_id group_inventory_id -- clocks over the group counter
from (
select inventorySold.*
, case inventory_id
when @inventoryId then
@groupRow := @groupRow + 1
else
@groupRow := 1
end as group_row
, @inventoryId := inventory_id as group_inventory_id
from (
SELECT
inventory.id inventory_id
, inventory.quantity AS inventory_quantity_before_sale
, mastersku.qtysku * sold.quantity quantity_sold
, sold.id as sold_id -- for order ... you'd probably use created timestamp or finalised timestamp
FROM ws_sold sold
LEFT OUTER JOIN ws_mastersku mastersku
ON sold.sku = mastersku.sku
LEFT OUTER JOIN ws_inventory inventory
ON mastersku.sku_1 = inventory.sku
OR mastersku.altsku = inventory.sku
) inventorySold
join ( select @groupRow := 0, @inventoryId := 0 ) variables
order by inventory_id, sold_id
) sold
left join (
select inventorySold.*
, case inventory_id
when @inventoryId then
@groupRow := @groupRow + 1
else
@groupRow := 1
end as group_row
, @inventoryId := inventory_id as group_inventory_id
from (
SELECT
inventory.id inventory_id
, inventory.quantity AS inventory_quantity
, mastersku.qtysku * sold.quantity quantity_sold
, sold.id as sold_id -- for order ... you'd probably use created timestamp or finalised timestamp
FROM ws_sold sold
LEFT OUTER JOIN ws_mastersku mastersku
ON sold.sku = mastersku.sku
LEFT OUTER JOIN ws_inventory inventory
ON mastersku.sku_1 = inventory.sku
OR mastersku.altsku = inventory.sku
) inventorySold
join ( select @groupRow := 0, @inventoryId := 0 ) variables
order by inventory_id, sold_id
) `last`
on sold.inventory_id = `last`.inventory_id
and sold.group_row - 1 = `last`.group_row
join ( select @runningInventoryQuantity := 0, @runningSold := 0, @inventoryId := 0, @afterSold := 0, @starting :=0 ) variables
order by sold.inventory_id, sold.group_row
-- example results
inventory_id sold_id starting_quantity running_sold before_sale quantity_sold after_sale group_inventory_id
1 1 99 0 99 8 91 1
1 4 99 8 91 16 75 1
2 2 99 0 99 2 97 2
3 3 99 0 99 4 95 3
The query for PHP.
select sold.* from (
select inventorySold.*
, case inventory_id
when @inventoryId then
@groupRow := @groupRow + 1
else
@groupRow := 1
end as group_row
, @inventoryId := inventory_id as group_inventory_id
from (
SELECT
inventory.id inventory_id
, inventory.quantity AS inventory_quantity_before_sale
, mastersku.qtysku * sold.quantity quantity_sold
, sold.id as sold_id -- for order ... you'd probably use created timestamp or finalised timestamp
FROM ws_sold sold
LEFT OUTER JOIN ws_mastersku mastersku
ON sold.sku = mastersku.sku
LEFT OUTER JOIN ws_inventory inventory
ON mastersku.sku_1 = inventory.sku
OR mastersku.altsku = inventory.sku
) inventorySold
join ( select @groupRow := 0, @inventoryId := 0 ) variables
order by inventory_id, sold_id
) sold
-- example results
inventory_id inventory_quantity_before_sale quantity_sold sold_id group_row group_inventory_id
1 99 8 1 1 1
1 99 16 4 2 1
2 99 2 2 1 2
3 99 4 3 1 3
I need to group together the entries whose timestamps are X seconds or less apart, and then average the values in each group, per device. In the following example I have a table with this data, and I need to group, by device, the entries that are within 60 seconds of each other.
Device Timestamp Value
0 30:8c:fb:a4:b9:8b 10/26/2015 22:50:15 34
1 30:8c:fb:a4:b9:8b 10/26/2015 22:50:46 34
2 c0:ee:fb:35:ec:cd 10/26/2015 22:50:50 33
3 c0:ee:fb:35:ec:cd 10/26/2015 22:50:51 32
4 30:8c:fb:a4:b9:8b 10/26/2015 22:51:15 34
5 30:8c:fb:a4:b9:8b 10/26/2015 22:51:47 32
6 c0:ee:fb:35:ec:cd 10/26/2015 22:52:38 38
7 30:8c:fb:a4:b9:8b 10/26/2015 22:54:46 34
This should be the resulting Table
Device First_seen Last_seen Average_value
0 30:8c:fb:a4:b9:8b 10/26/2015 22:50:15 10/26/2015 22:51:47 33,5
1 c0:ee:fb:35:ec:cd 10/26/2015 22:50:50 10/26/2015 22:50:51 32,5
2 c0:ee:fb:35:ec:cd 10/26/2015 22:52:38 10/26/2015 22:52:38 38
3 30:8c:fb:a4:b9:8b 10/26/2015 22:54:46 10/26/2015 22:54:46 34
Thank you very much for your help.
There is an old trick for this!
Mostly based on power of Window functions
Perfectly works for BigQuery!
So, first you "mark" all entries which exceed 60 seconds after previous entry!
Those which exceed getting value 1 and rest getting value 0!
Secondly you define groups by summing all previous marks (of course steps above are done while partitioning by device)
And finally, you just do simple grouping by above defined groups
Three simple steps implemented in one query with few simple sub-selects!
Hope this helps
SELECT device, MIN(ts) AS first_seen, MAX(ts) AS last_seen, AVG(value) AS average_value
FROM (
SELECT device, ts, value, SUM(grp_start) OVER (PARTITION BY device ORDER BY ts) AS grp
FROM (
SELECT device, ts, value,
IF(TIMESTAMP_TO_SEC(TIMESTAMP(ts))-TIMESTAMP_TO_SEC(TIMESTAMP(ts0))>60,1,0) AS grp_start
FROM (
SELECT device, ts, value, LAG(ts, 1) OVER(PARTITION BY device ORDER BY ts) AS ts0
FROM yourTable
)
)
)
GROUP BY device, grp
Here's one way...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(device CHAR(1) NOT NULL
,timestamp DATETIME NOT NULL
,value INT NOT NULL
,PRIMARY KEY(device,timestamp)
);
INSERT INTO my_table VALUES
('a','2015/10/26 22:50:15',34),
('a','2015/10/26 22:50:46',34),
('b','2015/10/26 22:50:50',33),
('b','2015/10/26 22:50:51',32),
('a','2015/10/26 22:51:15',34),
('a','2015/10/26 22:51:47',32),
('b','2015/10/26 22:52:38',38),
('a','2015/10/26 22:54:46',34);
SELECT m.*
, AVG(n.value) avg
FROM
( SELECT a.device
, a.timestamp start
, MIN(c.timestamp) end
FROM
( SELECT x.*
, CASE WHEN x.device = @prev THEN @i:=@i+1 ELSE @i:=1 END i
, @prev:=device
FROM my_table x
, (SELECT @i:=1,@prev:=null) vars
ORDER
BY device
, timestamp
) a
LEFT
JOIN
( SELECT x.*
, CASE WHEN x.device = @prev THEN @i:=@i+1 ELSE @i:=1 END i
, @prev:=device
FROM my_table x
, (SELECT @i:=1,@prev:=null) vars
ORDER
BY device
, timestamp
) b
ON b.device = a.device
AND b.timestamp > a.timestamp - INTERVAL 60 SECOND
AND b.i = a.i - 1
LEFT
JOIN
( SELECT x.*
, CASE WHEN x.device = @prev THEN @i:=@i+1 ELSE @i:=1 END i
, @prev:=device
FROM my_table x
, (SELECT @i:=1,@prev:=null) vars
ORDER
BY device
, timestamp
) c
ON c.device = a.device
AND c.i >= a.i
LEFT
JOIN
( SELECT x.*
, CASE WHEN x.device = @prev THEN @i:=@i+1 ELSE @i:=1 END i
, @prev:=device
FROM my_table x
, (SELECT @i:=1,@prev:=null) vars
ORDER
BY device
, timestamp
) d
ON d.device = c.device
AND d.i = c.i + 1
AND d.timestamp < c.timestamp + INTERVAL 60 SECOND
WHERE b.i IS NULL
AND c.i IS NOT NULL
AND d.i IS NULL
GROUP
BY a.device
, a.i
) m
JOIN my_table n
ON n.device = m.device
AND n.timestamp BETWEEN start AND end
GROUP
BY m.device
, m.start;
+--------+---------------------+---------------------+---------+
| device | start | end | avg |
+--------+---------------------+---------------------+---------+
| a | 2015-10-26 22:50:15 | 2015-10-26 22:51:47 | 33.5000 |
| a | 2015-10-26 22:54:46 | 2015-10-26 22:54:46 | 34.0000 |
| b | 2015-10-26 22:50:50 | 2015-10-26 22:50:51 | 32.5000 |
| b | 2015-10-26 22:52:38 | 2015-10-26 22:52:38 | 38.0000 |
+--------+---------------------+---------------------+---------+
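If MySQL 8.0 or later is available, the same grouping can be sketched with window functions, following the three steps of the BigQuery answer: mark each row that starts a new group, turn the marks into group numbers with a running sum, then aggregate. The sketch below assumes the `my_table` definition above; the aliases `is_start`, `grp`, `marked` and `grouped` are made up for illustration.

```sql
SELECT device,
       MIN(`timestamp`) AS first_seen,
       MAX(`timestamp`) AS last_seen,
       AVG(value)       AS average_value
FROM (
    -- step 2: a running sum of the start marks numbers the groups per device
    SELECT device, `timestamp`, value,
           SUM(is_start) OVER (PARTITION BY device ORDER BY `timestamp`) AS grp
    FROM (
        -- step 1: a row starts a new group unless the previous row of the
        -- same device is at most 60 seconds older (the first row has no
        -- previous row, so the comparison is NULL and it is marked as a start)
        SELECT device, `timestamp`, value,
               CASE WHEN TIMESTAMPDIFF(SECOND,
                             LAG(`timestamp`) OVER (PARTITION BY device ORDER BY `timestamp`),
                             `timestamp`) <= 60
                    THEN 0 ELSE 1 END AS is_start
        FROM my_table
    ) marked
) grouped
-- step 3: one output row per device and group
GROUP BY device, grp
ORDER BY device, first_seen;
```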
Given a table named RECORD in mysql with following structure:
rid(pk & AI) patientid(fk) recordTYPE(varchar) recordValue(varchar) recordTimestamp(timestamp)
1 1 temperature(℃) 37.2 2015-08-11 18:10:04
2 1 weight(kg) 65.0 2015-08-11 18:20:08
3 1 heartbeat(bpm) 66 2015-08-11 18:30:08
4 1 temperature(℃) 36.8 2015-08-11 18:32:08
You can see that for the same date, there can be multiple records for one particular type of record. e.g. temperature in the sample data :
rid patientid recordTYPE value recordtimestamp
1 1 temperature(℃) 37.2 2015-08-11 18:10:04
4 1 temperature(℃) 36.8 2015-08-11 18:32:08
In this case, we should choose the latest record. i.e. the record with rid = 4 and value = 36.8 .
Now given an input date e.g. '2015-8-11', I want to do a query to obtain something like:
date patientid temperature(℃) weight(kg) heartbeat(bpm)
2015-08-11 1 36.8 65.0 66
2015-08-11 2 36.5 80.3 70
2015-08-11 3 35.5 90.5 80
..........................................................
..........................................................
2015-08-11 4 35.5 null null
Fig. 2
In addition, you can see that for a particular date, there may not be any records of some types. In this case, the value in that column is null.
I tried the following query:
SELECT max(recordTimestamp), patientid, recordTYPE, recordValue
FROM RECORD
WHERE date(recordTimestamp) = '2015-08-11'
GROUP BY patientid, recordTYPE
The result is something like:
date patientid recordTYPE recordValue
2015-08-11 1 temperature(℃) 36.8
2015-08-11 1 weight(kg) 65.0
2015-08-11 1 heartbeat(bpm) 66
2015-08-11 2 temperature(℃) 36.5
2015-08-11 2 weight(kg) 80.3
2015-08-11 2 heartbeat(bpm) 70
2015-08-11 4 temperature(℃) 35.5
Fig. 4
The questions are:
Given this table RECORD, what is the proper mysql statement (in terms
of performance such as retrieval speed) to produce the desired result set (i.e. Fig.2)?
Will it be better (in terms of facilitating query and scalability such as adding new types of record) if the db design is changed?
e.g. Create one table for each type of record instead of putting all types of record in one table.
Any suggestion is appreciated as I'm a db novice...... Thank you.
You can try this:-
SELECT MAX(rid), patientid, recordTYPE, MAX(recordValue), recordTimestamp
FROM YOUR_TABLE
WHERE recordTimestamp = '2015/08/11'
GROUP BY patientid, recordTYPE, recordTimestamp;
Here's one way to do it. SQL Fiddle Demo
Sadly, MySQL (before version 8.0) doesn't support the row_number() over (partition by ...) syntax, which would have simplified this a lot.
Instead I've made excessive use of a trick discussed here: https://stackoverflow.com/a/3470355/361842
select `date`
, `patientId`
, max(case when `tRank`=1 then `temperature(℃)` else null end) `temperature(℃)`
, max(case when `wRank`=1 then `weight(kg)` else null end) `weight(kg)`
, max(case when `hRank`=1 then `heartbeat(bpm)` else null end) `heartbeat(bpm)`
from
(
select case when @p = `patientId` and @d = cast(`recordTimestamp` as date) then @x := 1 else @x := 0 end
, case when @x = 0 then @t := 0 end
, case when @x = 0 then @w := 0 end
, case when @x = 0 then @h := 0 end
, case `recordType` when 'temperature(℃)' then case @x when 1 then @t := @t + 1 else @t := 1 end else null end as `tRank`
, case `recordType` when 'weight(kg)' then case @x when 1 then @w := @w + 1 else @w := 1 end else null end as `wRank`
, case `recordType` when 'heartbeat(bpm)' then case @x when 1 then @h := @h + 1 else @h := 1 end else null end as `hRank`
, case `recordType` when 'temperature(℃)' then `recordValue` else null end as `temperature(℃)`
, case `recordType` when 'weight(kg)' then `recordValue` else null end as `weight(kg)`
, case `recordType` when 'heartbeat(bpm)' then `recordValue` else null end as `heartbeat(bpm)`
, @d := cast(`recordTimestamp` as date) as `date`
, @p := `patientId` as `patientId`
from `Record`
cross join
(
SELECT @t := 0
, @w := 0
, @h := 0
, @p := 0
, @x := 0
, @d := cast(null as date)
) x
order by `patientId`, `recordTimestamp` desc
) y
group by `date`, `patientId`
order by `date`, `patientId`
Breakdown
This says that if this is the last temperature of the day for the current grouping's patientId/date combo then return it; otherwise return null. It then takes the max of the matching values (which, given that all but one are null, gives us the one we're after).
, max(case when `tRank`=1 then `temperature(℃)` else null end)
How tRank = 1 means the last temperature of the day for a patientId/date combo is explained later.
This line says that if this record has the same patientId and date as the previous record then set x to 1; if it's a new combo set it to 0.
select case when @p = `patientId` and @d = cast(`recordTimestamp` as date) then @x := 1 else @x := 0 end
The next lines say that if we have a new patientId/date combo, reset the t, w and h markers to say "the next value you receive will be the one we're after".
, case when @x = 0 then @t := 0 end
The next lines split the data by recordType, returning null if this record isn't their record type, or returning a number saying how many of this type of record we've now seen for the patientId/date combo.
, case `recordType` when 'temperature(℃)' then case @x when 1 then @t := @t + 1 else @t := 1 end else null end as `tRank`
This is similar to the above; except instead of returning a combo-counter it returns the value of the current record (or null if this is a different record type).
, case `recordType` when 'temperature(℃)' then `recordValue` else null end as `temperature(℃)`
We then record the current record's date and patientId values, so we can compare them with the next record on the next iteration.
, @d := cast(`recordTimestamp` as date) as `date`
, @p := `patientId` as `patientId`
The cross join and following subquery is just used to initialise our variables.
The (first) order by is used to ensure that comparing current and previous records is enough to tell if we're looking at a different combo (i.e. if all combos are grouped then any change is easy to spot; if the combos keep alternating we'd need to keep track of every combo we'd seen before).
recordTimestamp is sorted descending so that the first record we see on a new combo will be the last record that day; the one we're after.
The group by is used to ensure we get 1 result per combo; and the last order by just to make our output ordered.
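For readers on MySQL 8.0 or later, where row_number() is available, the same result can be sketched without user variables. The query below assumes the RECORD table and column names from the question; the aliases `record_date`, `rn` and `latest` are made up for illustration. The inner query ranks each patient's records of one type within a day, newest first, and the outer query keeps only rank 1 and pivots the three record types into columns.

```sql
SELECT record_date AS `date`,
       patientid,
       MAX(CASE WHEN recordTYPE = 'temperature(℃)' THEN recordValue END) AS `temperature(℃)`,
       MAX(CASE WHEN recordTYPE = 'weight(kg)'     THEN recordValue END) AS `weight(kg)`,
       MAX(CASE WHEN recordTYPE = 'heartbeat(bpm)' THEN recordValue END) AS `heartbeat(bpm)`
FROM (
    SELECT patientid, recordTYPE, recordValue,
           DATE(recordTimestamp) AS record_date,
           -- rn = 1 is the latest record of this type for this patient and day;
           -- rid breaks ties between identical timestamps
           ROW_NUMBER() OVER (PARTITION BY patientid, recordTYPE, DATE(recordTimestamp)
                              ORDER BY recordTimestamp DESC, rid DESC) AS rn
    FROM `RECORD`
    -- a half-open range on the raw column selects the input date
    WHERE recordTimestamp >= '2015-08-11'
      AND recordTimestamp <  '2015-08-12'
) latest
WHERE rn = 1
GROUP BY record_date, patientid
ORDER BY record_date, patientid;
```

A patient with no record of some type on that day gets null in that column, because the corresponding CASE expression never matches.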