mysql query to identify groups of data based on timestmp

mysql query to identify groups of data based on timestmp - mysql

I have records of smartmeter in an mysql database.
Records in timestamp order looking in generall as follow:
key
timestamp
watt now
000001
2022-10-04-01-01-01
10
000002
2022-10-04-01-02-01
10
000003
2022-10-04-01-03-01
101
000004
2022-10-04-01-04-01
101
000005
2022-10-04-01-05-01
102
000006
2022-10-04-01-06-01
101
000007
2022-10-04-01-07-01
102
000008
2022-10-04-01-08-01
10
000009
2022-10-04-01-09-01
10
000010
2022-10-04-01-09-01
10
000011
2022-10-04-01-09-01
107
000012
2022-10-04-01-09-01
101
000013
2022-10-04-01-09-01
109
000014
2022-10-04-01-09-01
10
000015
2022-10-04-01-09-01
10
I want to identify the groups with bigger number (lets say > 100)
and give them an incresing id. Also I want to get per group the first and last key id
Result of query should look like this:
month
day
numbers of group
first id
last id
average watt
10
04
0
000003
000007
102
10
04
1
000011
0000013
105
Any help apreciated

You'll need something to identify them as a group. My first thought was using RANK() or DENSE_RANK() but after multiple tries, I couldn't find a way. Then I thought about using LAG() but still I'm stuck at how to re-identify the rows as new group. After testing many times, I come up with this suggestion:
WITH cte AS (
SELECT s1.*,
#n := COALESCE(IF(s1.skey=1,1,s2.skey), #n) As newGroup
FROM smartmeter s1
LEFT JOIN (
SELECT skey,
stimestamp,
watt,
LENGTH(watt) AS lenwatt,
LAG(LENGTH(watt)) OVER (ORDER BY skey) llwatt
FROM smartmeter) s2 ON s1.skey=s2.skey
AND lenwatt != llwatt)
SELECT MONTH(stimestamp) AS Month,
DAY(stimestamp) AS Day,
ROW_NUMBER() OVER (ORDER BY MIN(skey)) AS 'numbers of group',
MIN(skey) AS 'first id',
MAX(skey) AS 'last id',
AVG(watt) AS 'Average watt',
CEIL(AVG(watt)) AS 'Average watt rounded',
newGroup
FROM cte
WHERE watt >= 100
GROUP BY newGroup, MONTH(stimestamp), DAY(stimestamp)
By the way, I've changed some of your column names because key is actually a reserve word. Although you can use it as column name as long as you wrap it in backticks, I personally find it's a hassle to do it every time.
Ok, so my idea was to use LENGTH(watt) and ORDER BY skey in the LAG() function. Then I'll separate those rows where the length doesn't match and use that as a starting point for each new group. After that, I left join the result of that with smartmeter table. The next challenge is to assign each of the rows that doesn't match with previous skey value then I've found this answer and applied it into the cte.
Once those are done, I just write another query to fulfil your expected result. Although, some part of it is not exactly as what you expected.
Here's a demo fiddle

Related

Delete the duplicate values in the SUM with MySQL or SQL

Hi I am doing a sum of a table, but the problem is that the table has duplicate rows, so I wonder how can I do the sum without duplicated rows:
The main table is this one:
folio
cashier_id
amount
date
0001
1
2500
2022-06-01 00:00:00
0002
2
10000
2022-06-01 00:00:00
0001
1
2500
2022-06-01 00:00:00
0003
1
1000
2022-06-01 00:00:00
If I sum that you can see that the first and the third row are duplicated, so when I do the sum it makes it wrong because, the result will be:
cashier_id
cash_amount
1
6000
2
10000
but it should be:
cashier_id
cash_amount
1
3500
2
10000
The query that I use to make the sum is this one:
SELECT `jysparki_jis`.`api_transactions`.`cashier_id` AS `cashier_id`,
SUM(`jysparki_jis`.`api_transactions`.`cash_amount`) AS `cash_amount`,,
COUNT(0) AS `ticket_number`,
DATE(`jysparki_jis`.`api_transactions`.`created_at`) AS `date`
FROM `jysparki_jis`.`api_transactions`
WHERE DATE(`jysparki_jis`.`api_transactions`.`created_at`) >= '2022-01-01'
AND (`jysparki_jis`.`api_transactions`.`dte_type_id` = 39
OR `jysparki_jis`.`api_transactions`.`dte_type_id` = 61)
AND `jysparki_jis`.`api_transactions`.`cashier_id` <> 0
GROUP BY `jysparki_jis`.`api_transactions`.`cashier_id`,
DATE(`jysparki_jis`.`api_transactions`.`created_at`)
How you can see the sum is this:
SUM(`jysparki_jis`.`api_transactions`.`cash_amount`).
I wonder how can I do the sum avoiding to duplicate the folio with same cashier_id?
I know that if I filter for the cashier_id and folio I can avoid the duplicate rows but I do not know how to do that, can you help me?
Thanks

Given your provided input tables, you can use the DISTINCT clause inside the SUM aggregation function to solve your problem:
SELECT cashier_id, SUM(DISTINCT amount)
FROM tab
GROUP BY cashier_id,
folio,
date
Check the demo here.
Then you can add up your conditions inside your WHERE clause to this query, and your aggregation on the "created_at" field (that should correspond to the "date" field of your sample table - I guess). This solution may give your the general idea.

How to get the lowest price from a particular group of users MYSQL

I have been at this for a few days without much luck and I am looking for some guidance on how to get the lowest estimate from a particular group of sullpiers and then place it into another table.
I have 4 supplier estimate on every piece of work and all new estimates go into a single table, i am trying to find the lowest 'mid' price from the 4 newsest entries in the 'RECENT QUOTE TABLE' with a group id of '1' and then place that into the 'LOWEST QUOTE TABLE' as seen below.
RECENT QUOTE TABLE:
suppid group min mid high
1 1 200 400 600
2 1 300 500 700
3 1 100 300 500
[4] [1] 50 [150] 300
5 2 1000 3000 5000
6 2 3000 5000 8000
7 2 2000 4000 6000
8 2 1250 3125 5578
LOWEST QUOTE TABLE:
suppid group min mid high
4 1 50 150 300
Any help on how to structure this would be great as i have been loking for a few days and have not been able to find anything to get me moving again, im using MYSQL and the app is made in Python im open to all suggestions.
Thanks in advance.

If you really want to select only row with group 1, you can do something like
INSERT INTO lowest_quote_table
SELECT * FROM recent_quote_table
WHERE `group` = 1
ORDER BY `mid` ASC
LIMIT 1.
If you want a row with the lowest mid from every group, you can do something like
INSERT INTO lowest_quote_table
SELECT rq.* FROM recent_quote_table AS rq
JOIN (
SELECT `group`, MIN(`mid`) AS min_mid FROM recent_quote_table
GROUP BY `group`
) MQ ON rq.`group` = MQ.`group` AND rq.`mid` = MQ.min_mid

MYSQL query - cross tab? Union? Join? Select? What should I be looking for?

Not sure what exactly it is I should be looking for, so I'm reaching out for help.
I have two tables that through queries I need to spit out one. the two tables are as follows:
Transactions:
TransactionID SiteID EmployeeName
520 2 Michael
521 3 Gene
TransactionResponse:
TransactionID PromptMessage Response PromptID
520 Enter Odometer 4500 14
520 Enter Vehicle ID 345 13
521 Enter Odometer 5427 14
521 Enter Vehicle ID 346 13
But what I need is the following, let's call it TransactionSummary:
TransactionID SiteID EmployeeName 'Odometer' 'VehicleID'
520 2 Michael 4500 345
521 3 Gene 5427 346
The "PromptID" column is the number version of "PromptMessage" so I could query off that if it's easier.
A good direction for what this query would be called is the least I'm hoping for. True extra credit for working examples or even using this provided example would be awesome!

For a predefined number of possible PromptID values you can use something like the following query:
SELECT t.TransactionID, t.SiteID, t.EmployeeName,
MAX(CASE WHEN PromptID = 13 THEN Response END) AS 'VehicleID',
MAX(CASE WHEN PromptID = 14 THEN Response END) AS 'Odometer'
FROM Transactions AS t
LEFT JOIN TransactionResponse AS tr
ON t.TransactionID = tr.TransactionID AND t.SiteID = tr.SiteID
GROUP BY t.TransactionID, t.SiteID, t.EmployeeName
The above query uses what is called conditional aggregation: a CASE expression is used within an aggregate function, so as to conditionally account for a subset of records within a group.

Need help with MySQL query getting results to average for year y and y+1

I have a MySQL query:
SELECT px.player, px.pos, px.year, px.age, px.gp, px.goals, px.assists
, 1000 - ABS(p1.gp - px.gp) - ABS(p1.goals - px.goals) - ABS(p1.assists - px.assists) sim
FROM hockey p1
JOIN hockey px
ON px.player <> p1.player
WHERE p1.player = 'John Smith'
AND p1.year = 2010
HAVING sim >= 900
ORDER BY sim DESC
This gets me a table of results, something like this:
player pos year age gp goals assists sim
Player1 LW 2002 25 75 29 32 961
Player2 LW 2000 27 82 29 27 956
Player3 RW 2000 27 78 29 33 955
Player4 LW 2009 26 82 30 30 940
Player5 RW 2001 25 79 33 24 938
Player6 LW 2008 25 82 23 24 936
Player7 LW 2006 27 79 26 33 932
Instead, I would like it to do two things. Average the data and add a player count, so I get something like:
players age gp goals assists sim
7 26 79 28 29 945
I tried avg(px.age), avg(px.gp), avg(px.goals)...etc but I am running into errors with my "sim" formula.
Second issue is that underneath that, I would like to have the average of the data for the FOLLOWING year. In other words data from Player1 in 2003, data from Player2 in 2001, etc.
I am stuck as to HOW to get the data to average AND to get it for the following year.
Can anyone help me with either or both of these issues?

To get a single subtotal of counts and averages, just wrap your original query AS the inner select... something like... (pq = "PreQuery" select result)
Select
max( "Tot Players" ) Players,
max( "->" ) position,
count(*) Year,
avg( pq.age ) AvgAge,
avg( pq.gp ) AvgGP,
avg( pq.goals ) AvgGoals,
avg( pq.assists ) AvgAssists,
avg( pq.sim ) AvgSim
from
( SELECT
px.player,
px.pos,
px.year,
px.age,
px.gp,
px.goals,
px.assists,
1000 - ABS(p1.gp - px.gp)
- ABS(p1.goals - px.goals)
- ABS(p1.assists - px.assists) sim
FROM
hockey p1
JOIN hockey px ON px.player <> p1.player
WHERE
p1.player = 'John Smith'
AND p1.year = 2010
HAVING
sim >= 900
ORDER BY
sim DESC ) pq
If your original query worked, this should get you your overall averages. However, with the INNER query with a having and order, might cause a problem. You might need to kill the order by since it really makes no difference in the outer most query. As for the HAVING clause in the INNER query, might need to be moved to a WHERE pq.sim >= 900 in the OUTER SQL-Select.
Additionally, if you wanted the results of all players first, THEN the total, take your original query and merge it with this one... As you'll see, to keep the columns in synch with BOTH queries, I've put a bogus for player and position so it won't crash on mismatched unions... Notice my COUNT column actually would correspond with the YEAR column of the ORIGINAL query.
For the prior year... As Rob mentioned, you would just do a UNION of the two queries just showing the respective year you were qualifying for in each UNION...
EDIT --- CLARIFICATION for 2nd YEAR....
Per your subsequent comment clarification, you would have to get the basis as the basis of the year +1... if you then want the overall averages again, those would be wrapped to an outer max / avg, etc... But I think THIS is what you want for the subsequent year per player
SELECT
PrimaryQry.PrimaryPlayer,
PrimaryQry.PrimaryPos,
PrimaryQry.PrimaryYear,
PrimaryQry.PrimaryAge,
PrimaryQry.PrimaryGP,
PrimaryQry.PrimaryGoals,
PrimaryQry.PrimaryAssists,
PrimaryQry.player,
PrimaryQry.pos,
PrimaryQry.year,
PrimaryQry.age,
PrimaryQry.gp,
PrimaryQry.goals,
PrimaryQry.assists,
PrimaryQry.sim,
p2.pos PrimaryPos2,
p2.year PrimaryYear2,
p2.age PrimaryAge2,
p2.gp PrimaryGP2,
p2.goals PrimaryGoals2,
p2.assists PrimaryAssists2,
px2.player player2,
px2.pos pos2,
px2.year year2,
px2.age age2,
px2.gp gp2,
px2.goals goals2,
px2.assists assists2,
1000 - ABS(p2.gp - px2.gp)
- ABS(p2.goals - px2.goals)
- ABS(p2.assists - px2.assists) sim2
FROM
( SELECT
p1.player PrimaryPlayer,
p1.pos PrimaryPos,
p1.year PrimaryYear,
p1.age PrimaryAge,
p1.gp PrimaryGP,
p1.goals PrimaryGoals,
p1.assists PrimaryAssists,
px.player,
px.pos,
px.year,
px.age,
px.gp,
px.goals,
px.assists,
1000 - ABS(p1.gp - px.gp)
- ABS(p1.goals - px.goals)
- ABS(p1.assists - px.assists) sim
FROM
hockey p1
JOIN hockey px
ON p1.player <> px.player
WHERE
p1.player = 'John Smith'
AND p1.year = 2010
HAVING
sim >= 900 ) PrimaryQry
JOIN hockey p2
ON PrimaryQry.PrimaryPlayer = p2.player
AND PrimaryQry.PrimaryYear +1 = p2.year
JOIN hockey px2
ON PrimaryQry.Player = px2.Player
AND PrimaryQry.Year +1 = px2.year
If you follow the logic here, you already know the inner query is returning about 10 other players. So, I am keeping the stats of the first person basis IN that query too. THEN, I am joining that result set back to the hockey table TWICE... The join is primary player joined to the first for his/her year +1, the SECOND join works specifically to the one person that qualified against the primary player. The final column results get the entire first year qualifier with the second qualifier, such as
So, it will all be on one row consecutively of
John Smith 2010 Compare Person 1 YearA John Smith 2011 Compare Person 1 YearA+1
John Smith 2010 Compare Person 2 YearB John Smith 2011 Compare Person 2 YearB+1
John Smith 2010 Compare Person 3 YearC John Smith 2011 Compare Person 3 YearC+1

What query are you using to get the averages?
Just applying "AVG" to your expression for 'sim' should work in mysql. e.g.
AVG(1000 - ABS(p1.gp - px.gp) - ABS(p1.goals - px.goals) - ABS(p1.assists - px.assists)) sim
To aggregate over different years, I think there is no alternative to using a subselect or union.
Reference:
http://dev.mysql.com/doc/refman/5.0/en/subqueries.html
http://dev.mysql.com/doc/refman/5.0/en/union.html
Something like:
(ORIGINAL AVG QUERY)
UNION ALL
(ORIGINAL AVG QUERY WITH NEW YEAR)
should do the trick.
(Note that your original query selects data from every year to compare it to the data for John Smith in 2010, which may not be what you want.)

Getting count of rows with multiple combinations

I need help with a MySQL query. We have a database (~10K rows) which I have simplified down to this problem.
We have 7 truck drivers who visit 3 out of a possible 9 locations, daily. Each day they visit exactly 3 different locations and each day they can visit different locations than the previous day. Here are representative tables:
Table: Drivers
id name
10 Abe
11 Bob
12 Cal
13 Deb
14 Eve
15 Fab
16 Guy
Table: Locations
id day address driver.id
1 1 Oak 10
2 1 Elm 10
3 1 4th 10
4 1 Oak 16
5 1 4th 16
6 1 Toy 16
7 1 Toy 11
8 1 5th 11
9 1 Law 11
10 2 Oak 11
11 2 4th 11
12 2 Toy 11
.........
We have data for a full year and we need to find out how many times each "route" is visited over a year, sorted from most to least.
From my high school math, I believe there are 9!/(6!3!) route combinations, or 84 in total. I want do something like:
Get count of routes where route addresses = 'Oak' and 'Elm' and '4th'
then run again
where route addresses = 'Oak' and 'Elm' and '5th'
then again and again, etc.Then sort the route counts, descending. But I don't want to do it 84 times. Is there a way to do this?

I'd be looking at GROUP_CONCAT
SELECT t.day
, t.driver
, GROUP_CONCAT(t.address ORDER BY t.address)
FROM mytable t
GROUP
BY t.day
, t.driver
What's not clear here, if there's an order to the stops on the route. Does the sequence make a difference, and how to we tell what the sequence is? To ask that a different way, consider these two routes:
('Oak','Elm','4th') and ('Elm','4th','Oak')
Are these equivalent (because it's the same set of stops) or are they different (because they are in a different sequence)?
If sequence of stops on the route distinguishes it from other routes with the same stops (in a different order), then replace the ORDER BY t.address with ORDER BY t.id or whatever expression gives the sequence of the stops.
Some caveats with GROUP_CONCAT: the maximum length is limited by the setting of group_concat_max_len and max_allowed_packet variables. Also, the comma used as the separator... if we combine strings that contain commas, then in our result, we can't reliably distinguish between 'a,b'+'c' and 'a'+'b,c'
We can use that query as an inline view, and get a count of the the number of rows with identical routes:
SELECT c.route
, COUNT(*) AS cnt
FROM ( SELECT t.day
, t.driver
, GROUP_CONCAT(t.address ORDER BY t.address) AS route
FROM mytable t
GROUP
BY t.day
, t.driver
) c
GROUP
BY c.route
ORDER
BY cnt DESC

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

mysql query to identify groups of data based on timestmp - mysql

Related

Delete the duplicate values in the SUM with MySQL or SQL

How to get the lowest price from a particular group of users MYSQL

MYSQL query - cross tab? Union? Join? Select? What should I be looking for?

Need help with MySQL query getting results to average for year y and y+1

Getting count of rows with multiple combinations

Categories

Resources