MySQL extract average data from multiple group criteria

Apologies for the wall of text, the example in the end explains my question. Any help is appreciated, thank you!
I have a table which contains several columns of data from among other values voltages and currents.
These instances are logged every second when there is current flowing. I want to calculate an approximated kJ and kW from these values.
Basically I have one table, instances, that contains:
instanceID,
location,
current,
voltage,
time.
And another one, sets, that contains:
instanceID,
setID.
The instanceID is the same in both tables: the instanceID in instances is a FK pointing to sets. For every location in instances there are approximately 23 rows (it varies), and there are 30 locations. So I have 23 rows where the instance has location 1, another 23 rows for the same instance with location 2, and so on. Time is the logged time at which the measured data was taken (so if the difference is one second between all 23 instances, the difference between the first and last time is 23 seconds).
I need to calculate the average kW and the total kJ (approximated).
What I've done is the following:
SELECT instances.instanceID, location, current,
voltage, current * voltage AS kW,
COUNT(IF(current > 0 AND voltage > 0,
instances.instanceID,
0)) AS InstancedTime
FROM instances
INNER JOIN sets ON instances.instanceID = sets.instanceID
WHERE sets.setID = arbitrary_number;
The problem arises that I get the following table:
instanceID, location, current, voltage, kW, InstancedTime
The kW is a random number from one of the 23 sets, which is fine since it's an approximation anyway, but the COUNT(IF()) is counting ALL the instances in the instances table, when I only want the query to count the instances for every location.
I tried the MAX(CAST(time AS SIGNED)) - MIN(CAST(time AS SIGNED)), but that takes the max time from the last location minus the min time of the first location, I want to isolate it to one location at a time.
What I want to do is get the total amount of kJ which would be the time it had power multiplied by the kW of that time. Since I know the time is always 1 second between the instances it should be enough to count the number of instances for individual locations and multiply that by the kW, however I want to do that for all the instances within one set. It is possible to replace the set by using a single query for all the individual instances but that would take eons.
I'm trying to take a table that looks like
instanceID, location, voltage, current, kW, InstancedTime
1, 1, 500V, 2A, 1kW, 1s
1, 1, 500V, 2A, 1kW, 1s
1, 2, 400V, 3A, 1.2kW, 1s
1, 2, 400V, 3A, 1.2kW, 1s
2, 1, 700V, 2A, 1.4kW, 1s
2, 1, 700V, 2A, 1.4kW, 1s
2, 2, 300V, 3A, 0.9kW, 1s
2, 2, 300V, 3A, 0.9kW, 1s
And add the kJ which would be summarising the number of instances that ID 1 has been in location 1 and location 2, doing the same for ID 2 and presenting this all in one table that would look like:
instanceID, location, voltage, current, kW, SumInstancedTime, kJ
1, 1, 500V, 2A, 1kW, 2s, 2kJ
1, 2, 400V, 3A, 1.2kW, 2s, 2.4kJ
2, 1, 700V, 2A, 1.4kW, 2s, 2.8kJ
2, 2, 300V, 3A, 0.9kW, 2s, 1.8kJ
Thank you for your time, any provided help is appreciated!

I cannot test my answer right now, but it sounds like what you need is a GROUP BY.
The following query averages current and voltage for every instanceID/location pair and then calculates the derived values:
SELECT instances.instanceID, location, AVG(current) AS avg_current,
AVG(voltage) AS avg_voltage, AVG(current) * AVG(voltage) AS kW,
COUNT(IF(current > 0 AND voltage > 0,
instances.instanceID,
NULL)) AS InstancedTime
FROM instances
INNER JOIN sets ON instances.instanceID = sets.instanceID
WHERE sets.setID = arbitrary_number
GROUP BY instances.instanceID, location;
(Note that COUNT() counts every non-NULL value, so the ELSE branch of the IF must be NULL rather than 0, otherwise every row is counted.)
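As a sanity check, the GROUP BY shape can be sketched against SQLite through Python's sqlite3 module (a stand-in for MySQL here; the table layout and sample figures follow the question, and the /1000 factor assumes current in amperes and voltage in volts):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE instances (instanceID INT, location INT, "current" REAL, voltage REAL);
INSERT INTO instances VALUES
  (1, 1, 2, 500), (1, 1, 2, 500),
  (1, 2, 3, 400), (1, 2, 3, 400);
""")
# COUNT() counts only non-NULL values, so the ELSE branch is (implicitly) NULL.
rows = conn.execute("""
SELECT instanceID, location,
       AVG("current") * AVG(voltage) / 1000.0 AS kW,
       COUNT(CASE WHEN "current" > 0 AND voltage > 0
                  THEN instanceID END)        AS InstancedTime
FROM instances
GROUP BY instanceID, location
ORDER BY instanceID, location
""").fetchall()
print(rows)
```

Each result row carries the averaged kW plus a count of the one-second samples, which is what the kJ approximation needs.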

This is an example where you are trying to group consecutive rows in a table. In your example, they are not interleaved, but I'm assuming they could be. You need to assign everything with the same instanceID and location to the same group.
My approach is to find the next higher instance/location pair, and to assign that as a group identifier. I do this using a subquery. Once I have this identifier, I just summarize each group:
select i.instanceId, i.location, i.current, i.voltage, i.kw, COUNT(*) as SumTime,
SUM(i.kw) as KJ
from (select i.*,
(select concat(i2.instanceId, ',', i2.location)
from instances i2
where i2.instanceId > i.instanceId or (i2.instanceId = i.instanceId and i2.location > i.location)
order by i2.instanceId asc, i2.location asc
limit 1
) as grp
from instances i
) i
group by grp
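The grouping idea itself, independent of the SQL, can be sketched in plain Python: consecutive rows sharing the same (instanceID, location) key collapse into one summary row whose row count is the number of seconds (the kW figures below are illustrative, taken from the question's example):

```python
from itertools import groupby

rows = [  # (instanceID, location, kW) -- one logged row per second
    (1, 1, 1.0), (1, 1, 1.0),
    (1, 2, 1.2), (1, 2, 1.2),
    (2, 1, 1.4), (2, 1, 1.4),
]
summary = []
for (iid, loc), grp in groupby(rows, key=lambda r: (r[0], r[1])):
    grp = list(grp)
    seconds = len(grp)   # one second per logged instance
    kw = grp[0][2]
    summary.append((iid, loc, kw, seconds, kw * seconds))  # kJ = kW * s
print(summary)
```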

Related

How to find out next available number

In my MySQL table I have a field called sequence with values like
1, 2, 3, 5, 6, 7, 8, 10. Some of the sequence numbers are skipped due to deleted records. How do I find the next available number after a given number? Let's say I need the next number after 3: how do I get 5 as the next number in the sequence, not 4?
To find out the next ID after 3 that appears in your table, you should do
SELECT id FROM thetable WHERE id>3 ORDER BY id ASC LIMIT 1
This considers only IDs greater than 3, in ascending order, and takes the first one on that list. If it returns a result, that is the next ID used in the table; if it returns no result at all, then the ID you gave it was already the highest one in the table (or, strictly speaking, at least as high as the highest one in the table).
If you want a general expression that works to get the next available number, then you can use an aggregation query:
select coalesce(min(id), maxid + 1) as NextAvailableId
from thetable t cross join
(select max(id) as maxid from thetable t2) x
where id > 3;
Or, if you don't like the cross join, you can use conditional aggregation:
select coalesce(min(case when id > 3 then id end), max(id) + 1) as NextAvailableId
from thetable t;
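Both forms reduce to "the smallest id greater than the given one". A quick check against SQLite via Python's sqlite3 (table name thetable as in the first query, data from the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE thetable (id INT)")
conn.executemany("INSERT INTO thetable VALUES (?)",
                 [(1,), (2,), (3,), (5,), (6,), (7,), (8,), (10,)])

def next_id(after):
    # Smallest id strictly greater than `after`; None if there is none.
    return conn.execute(
        "SELECT MIN(id) FROM thetable WHERE id > ?", (after,)).fetchone()[0]

print(next_id(3))   # 5
print(next_id(10))  # None
```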

Preserve from splitting results

Is there any way to SELECT from a MySQL database while preventing related results from being split apart? I'd like to get all the data from the previous day, but there will be too much to fetch at once, yet I also cannot break up related results:
Select everything with a certain limit, but do not split rows sharing one value (i.e. user_id) into separate result sets.
EXAMPLE
SELECT
ti.id, ti.date, ti.duedate, ti.datepaid,
tii.invoiceid, tii.userid,
tc.postcode, tc.country,
(SELECT GROUP_CONCAT(value) FROM custom WHERE relid=tc.id) AS vatid
FROM invoices ti
LEFT JOIN invoiceitems tii
ON tii.invoiceid=ti.id
LEFT JOIN clients tc
ON tc.id=tii.userid
WHERE ti.status='Paid'
AND ti.nullmo_no IS NULL
ORDER BY tii.userid, tii.id
Now I get all the results, but I need to split them without breaking userid. For example one SELECT returns 20 results, because there were 15 invoices for user 1, and 5 invoices for user 2, then the next call returns the rest, also with a limit, but not breaking user related group of results:
SELECT
part 1 (all from user 1, all from user 2)
part 2 (all from user 3, all from user 4)
Can this be done in one select statement?
id = 1,2,3,4,5,6,7,8,9,10
name = n1, n2, n3, n4, n5, n6, n7, n8, n9, n10
user_id = 1, 1, 1, 2, 2, 2, 3, 3, 4, 5 // split but not divide
content = c1,c2,c3,c4,c5,c6,c7,c8,c9,c10
date = yesterday, yesterday, yesterday, yesterday, yesterday, yesterday, yesterday, yesterday, yesterday, yesterday
The idea is to select all of them with a limit, but never split one user_id across calls: 1. all rows from yesterday; 2. a limit that is allowed to stretch when one or more user_ids have more rows than the limit. So the effective limit would be determined by the number of results.
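One way to picture the requirement is as client-side batching. The sketch below, in Python, cuts each batch only at a user_id boundary once a soft limit is reached (the data mirrors the example above; this illustrates the rule rather than being a single-SELECT solution):

```python
# Rows ordered by user_id: (id, name, user_id), mirroring the example data.
rows = [(i, f"n{i}", uid) for i, uid in
        enumerate([1, 1, 1, 2, 2, 2, 3, 3, 4, 5], start=1)]

def batches(rows, soft_limit):
    batch = []
    for row in rows:
        # Cut only when the limit is reached AND the user_id changes.
        if len(batch) >= soft_limit and row[2] != batch[-1][2]:
            yield batch
            batch = []
        batch.append(row)
    if batch:
        yield batch

parts = list(batches(rows, 4))
print([[r[2] for r in p] for p in parts])
```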

MySQL query to assign values to a field based in an iterative manner

I am using a MySql table with 500,000 records. The table contains a field (abbrevName) which stores a two-character representation of the first two letters on another field, name.
For example AA AB AC and so on.
What I want to achieve is the set the value of another field (pgNo) which stores a value for page number, based on the value of that records abbrevName.
So a record with an abbrevName of 'AA' might get a page number of 1, 'AB' might get a page number of 2, and so on.
The catch is that although multiple records may share a page number (after all, multiple entities might have a name beginning with 'AA'), once the number of records with the same page number reaches 250, the page number must increment by one. So after 250 'AA' records with page number 1, we must assign further 'AA' records page number 2, and so on.
My Pseudocode looks something like this:
-Count distinct abbrevNames
-Count distinct abbrevNames with more than 250 records
-For the above abbrevNames count the sum of each divided by 250
-Output a temporary table sorted by abbrevName
-Use the total number of distinct page numbers with 250 or less records to assign page numbers incrementally
I am really struggling to put anything together in a query that comes close to this, can anyone help with my logic or some code ?
Please have a try with this one:
SELECT abbrevNames, CAST(pagenumber AS SIGNED) AS pagenumber FROM (
SELECT
abbrevNames
, IF(@prev = abbrevNames, @rows_per_abbrev:=@rows_per_abbrev + 1, @pagenr:=@pagenr + 1)
, @prev:=abbrevNames
, IF(@rows_per_abbrev % 250 = 0, @pagenr:=@pagenr + 1, @pagenr) AS pagenumber
, IF(@rows_per_abbrev % 250 = 0, @rows_per_abbrev := 1, @rows_per_abbrev)
FROM
yourTable
, (SELECT @pagenr:=0, @prev:=NULL, @rows_per_abbrev:=0) variables_initialization
ORDER BY abbrevNames
) subquery_alias
UPDATE: I had misunderstood the question a bit. Now it should work
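The paging rule the query implements can be sketched in Python for clarity: a new page starts whenever the abbrevName changes or the current page reaches 250 rows (the threshold is lowered to 3 below so the effect is visible on a tiny sample):

```python
def assign_pages(names, per_page=250):
    # Walk the names in sorted order; bump the page on a name change
    # or when the current page is full.
    pages, prev, count, page = [], None, 0, 0
    for name in sorted(names):
        if name != prev or count >= per_page:
            page += 1
            count = 0
        pages.append((name, page))
        prev = name
        count += 1
    return pages

print(assign_pages(["AA"] * 4 + ["AB"] * 2, per_page=3))
```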

Partitioning SQL query by arbitrary number of rows

I have a SQL table with periodic measurements. I'd like to be able to return some summary method (say SUM) over the value column, for an arbitrary number of rows at a time. So if I had
id | reading
1 10
5 14
7 10
11 12
13 18
14 16
I could sum over 2 rows at a time, getting (24, 22, 34), or I could sum 3 rows at a time and get (34, 46), if that makes sense. Note that the ID might not be contiguous -- I just want to operate by row count, in sort order.
In the real world, the identifier is a timestamp, but I figure that (maybe after applying a unix_timestamp() call) anything that works for the simple case above should be applicable. If it matters, I'm trying to gracefully scale the number of results returned for a plot query -- maybe there's a smarter way to do this? I'd like the solution to be general, and not impose a particular storage mechanism/schema on the data.
You can resequence the query result and then group it:
SET @seq := 0;
SELECT SUM(data), MIN(ts) AS ts FROM (
SELECT @seq := @seq + 1 AS seq, data, ts FROM yourTable ORDER BY ts LIMIT 50
) AS tmp GROUP BY FLOOR((tmp.seq - 1) / 3);
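The same resequence-then-group idea, checked against SQLite through Python's sqlite3 (SQLite's ROW_NUMBER() window function stands in for the MySQL user-variable counter; data from the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INT, reading INT)")
conn.executemany("INSERT INTO readings VALUES (?, ?)",
                 [(1, 10), (5, 14), (7, 10), (11, 12), (13, 18), (14, 16)])
# Number the rows in sort order (0-based), then group in threes.
sums = [r[0] for r in conn.execute("""
SELECT SUM(reading) FROM (
    SELECT reading,
           ROW_NUMBER() OVER (ORDER BY id) - 1 AS seq
    FROM readings
) GROUP BY seq / 3
ORDER BY seq / 3
""")]
print(sums)   # groups of 3: [34, 46]
```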

Use mysql to work out in and out times of vehicle at customer, multiple stops and entries for each customer

I have a mysql table that contains data as per the screenshot below.
My requirement is to generate a mysql query that will show me the in and out time for each customer.
The issue I have is that I cannot use MIN or MAX, as the vehicle might have visited the same customer two or three times within the period.
So the output I am looking for is:
Vehicle: RB10
Customer: Hulamin
In: 10:19
out: 10:35
Time Taken: 16 min
In: 11:14
out: 11:29
Time Taken: 15 min
ave time taken: 15.5 min
and the same for each of the other sites and vehicles as required.
How do I tell mysql to take the smallest in time before the corresponding out time and report?
Many thanks for the assistance.
You could use SQL variables to detect when the address changes, even if the same address occurs multiple times. Without MySQL readily available to test, I would approach it as below. Start with an inner query that stamps a "GroupSeq" based on a change in either vehicle and/or address, keeping the rows ordered by date/time. After each row is tested against @lastGroup (which is either left alone or incremented by 1), update @lastAddress and @lastVehicle so they serve as the basis of comparison for the NEXT record selected into the result set.
Per your example, the results of each customer would be (all these same vehicle, so not duplicating display of that column)
Address GroupSeq
Hulamin 1
SACD 2
UL 3
NP 4
Hulamin 5
SACD 6
After that, you can then properly do your MIN/MAX based on the GroupSeq assigned.
select
PreQuery.Vehicle,
PreQuery.Address,
PreQuery.GroupSeq,
MIN( PreQuery.`DateTime` ) as InTime,
MAX( PreQuery.`DateTime` ) as OutTime
from
( select
YT.Vehicle,
YT.Address,
YT.`DateTime`,
YT.Direction,
@lastGroup := @lastGroup + if( @lastAddress = YT.Address
AND @lastVehicle = YT.Vehicle, 0, 1 ) as GroupSeq,
@lastVehicle := YT.Vehicle as justVarVehicleChange,
@lastAddress := YT.Address as justVarAddressChange
from
YourTable YT,
( select @lastVehicle := '',
@lastAddress := '',
@lastGroup := 0 ) SQLVars
order by
YT.`DateTime` ) PreQuery
Group By
PreQuery.Vehicle,
PreQuery.Address,
PreQuery.GroupSeq
The above SHOULD result in something like
Vehicle Address GroupSeq InTime OutTime
RB10 Hulamin 1 10:19 10:35
RB10 SACD 2 10:37 10:40
RB10 UL 3 10:41 11:06
RB10 NP 4 11:07 11:14
RB10 Hulamin 5 11:14 11:28
RB10 SACD 6 11:29 12:21
Now, the above sample does not actually compute the total time taken per in-out, nor the average per vehicle/customer average time for what appears to be processing, but you can add those computations after you understand and get this part.
Please note, this is based on natural order as appears by date/time. It looks like one transaction from beginning to end can have many "IN"s, but ALWAYS ends with an "OUT" before proceeding to the next customer address. If this is an incorrect assumption, modifications would obviously need to be made.
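The group-stamping technique itself can be sketched in plain Python: bump a sequence counter whenever the (vehicle, address) pair changes, then take the min/max time per group as the in/out times (times follow the example in the question):

```python
rows = [  # (vehicle, address, time), already in date/time order
    ("RB10", "Hulamin", "10:19"), ("RB10", "Hulamin", "10:35"),
    ("RB10", "SACD", "10:37"), ("RB10", "SACD", "10:40"),
    ("RB10", "Hulamin", "11:14"), ("RB10", "Hulamin", "11:29"),
]
# Stamp a group sequence that increments on every vehicle/address change.
group, prev, stamped = 0, None, []
for vehicle, address, t in rows:
    if (vehicle, address) != prev:
        group += 1
        prev = (vehicle, address)
    stamped.append((group, vehicle, address, t))

# Min/max time per group gives the in/out times for each visit.
visits = {}
for g, vehicle, address, t in stamped:
    key = (g, vehicle, address)
    lo, hi = visits.get(key, (t, t))
    visits[key] = (min(lo, t), max(hi, t))
print(visits)
```

Note that a repeat visit to Hulamin lands in its own group (3) rather than being merged into group 1, which is exactly why plain MIN/MAX per customer fails.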