I have four queries that run on one web page. I use them for statistics and they are taking too long to load.
Here are my current configurations:
Use the text wrapping button on Pastebin to make it easier to read.
I have a lot of RAM dedicated to MySQL, but it still takes a long time. I have also indexed most of the columns.
I'm just trying to see what other options I have.
I put "show create table" and total count(*) in here. I'm going to rename everything and paste in SO. I agree that someone in the future may use it.
QUERY ONE
SELECT SQL_NO_CACHE
DATE_FORMAT(DateActioned,'%M-%Y') as val1,
COUNT(*) AS total_count
FROM
db.statisticsresults
WHERE
DID = 28
AND ActionTypeID = 1
AND DateActioned IS NOT NULL
GROUP BY
DATE_FORMAT(DateActioned, '%m-%y')
ORDER BY
YEAR( DateActioned ) DESC,
MONTH( DateActioned ) DESC
For this, I would have a covering index based on your key elements so the engine does not have to go back to the raw data... Based on this and your following queries, I would have THAT column in the primary index position, such as:
StatisticsResults -- index ( DID, ActionTypeID, DateActioned )
The ORDER BY on the respective YEAR() descending and MONTH() descending will do the same thing as your hard-coded references to find the field in the list.
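As actual DDL, that suggested index might be created along these lines (the index name is made up; the table and column names are taken from your query):
ALTER TABLE db.statisticsresults
    ADD INDEX idx_did_actiontype_dateactioned (DID, ActionTypeID, DateActioned);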
QUERY TWO
-- 381.812
SELECT SQL_NO_CACHE
DATE_FORMAT(DateActioned,'%M-%Y') as val1,
COUNT(*) AS total_count
FROM
db.statisticsdivision
WHERE
DID = 28
AND ActionTypeID = 9
AND DateActioned IS NOT NULL
GROUP BY
DATE_FORMAT(DateActioned, '%m-%y')
ORDER BY
YEAR( DateActioned ) DESC,
MONTH( DateActioned ) DESC
On this one, I changed DID = '28' to DID = 28. If the column is numeric, don't confuse the engine by making it convert one type to the other. The same indexes from query one would apply here too.
QUERY THREE
-- 33.899
SELECT SQL_NO_CACHE DISTINCT
AID,
COUNT(*) AS acount
FROM
db.statisticsresults
JOIN db.division_id USING(AID)
WHERE
DID = 28
GROUP BY
AID
ORDER BY
count(*) DESC
LIMIT
19
This one looks like a bit of a waste... you are joining to the division table based on an "AID" column in the stats table. Why do the join at all, unless you actually expect some invalid "AID" values that are not in the division table? Again, change your "DID" column to 28 instead of '28'. Ensure your division table has an index on "AID" for the join. The second index from query 1 appears to be your better option here.
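A quick sketch of that index on the division table, assuming db.division_id is its name as written in the query (the index name is made up):
ALTER TABLE db.division_id
    ADD INDEX idx_aid (AID);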
QUERY FOUR
-- 21.403
SELECT SQL_NO_CACHE DISTINCT
TID,
tax,
agent,
COUNT(*) AS t_count
FROM
db.statisticsresults sr
JOIN db.tax_id USING(TID)
JOIN db.agent_id ai ON(ai.AID = sr.AID)
WHERE
DID = 28
GROUP BY
TID,
sr.AID
ORDER BY
COUNT(*) DESC
LIMIT 19
Again, "DID" column from '28' to 28
FOR your TAX_ID table, have a covering index on that too so it can handle the join
TO the agent table without going TO the raw page data
Tax_ID -- index ( tid, aid )
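As DDL, that could look something like this (the index name is made up; the table name is taken from the query):
ALTER TABLE db.tax_id
    ADD INDEX idx_tid_aid (TID, AID);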
Finally, if your original list only needs things from Jan 2012 to Dec 2013, you can avoid querying the ENTIRE table of stats by adding this to your WHERE clause:
AND DateActioned >= '2012-01-01'
so you completely skip over anything prior to 2012 (old data, I presume?).
I have read through quite a few posts with greatest-n-per-group but still don't seem to find a good solution in terms of performance. I'm running 10.1.43-MariaDB.
I'm trying to get the change in data values in a given time frame, so I need to get the earliest and latest row from this period. The largest number of rows in a time frame that needs to be calculated right now is around 700k, and it's only going to grow. For now I have just resorted to doing two queries, one for the latest and one for the earliest date, but even this currently performs slowly. The table looks like this:
user_id data date
4567 109 28/06/2019 11:04:45
4252 309 18/06/2019 11:04:45
4567 77 18/02/2019 11:04:45
7893 1123 22/06/2019 11:04:45
4252 303 11/06/2019 11:04:45
4252 317 19/06/2019 11:04:45
The date and user_id columns are indexed. Without ordering, the rows aren't in any particular order in the database, if that makes a difference.
The furthest I have gotten with this issue is a query like this for a year-long period (700k datapoints):
SELECT user_id,
MIN(date) as date, data
FROM datapoint_table
WHERE date >= '2019-01-14'
GROUP BY user_id
This gives me the right date and user_id very fast, in around ~0.05 s. But as is the common issue with greatest-n-per-group, the rest of the row (data in this case) is not from the same row as the date. I have read other similar questions and tried a subquery like this:
SELECT a.user_id, a.date, a.data
FROM datapoint_table a
INNER JOIN (
SELECT datapoint_table.user_id,
MIN(date) as date, data
FROM datapoint_table
WHERE date >= '2019-01-01'
GROUP BY user_id
) b ON a.user_id = b.user_id AND a.date = b.date
This query takes around 15 s to complete and gets the correct data value. That 15 s, though, is just way too long, and I must be doing something wrong when the first query is so fast. I also tried computing MAX(data) - MIN(data) with a GROUP BY on user_id, but it also performed slowly.
What would be a more efficient way of getting the data value from the same row as the date, or even the difference between the latest and earliest data for each user?
Assuming you are using a fairly recent version of either MariaDB or MySQL, then ROW_NUMBER would probably be the most efficient way to find the earliest record for each user:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY date) rn
FROM datapoint_table
WHERE date > '2019-01-14'
)
SELECT user_id, data, date
FROM cte
WHERE rn = 1;
To the above you could also consider adding the following index:
CREATE INDEX idx_user_date ON datapoint_table (user_id, date);
You could also try the following variant index with the columns reversed:
CREATE INDEX idx_date_user ON datapoint_table (date, user_id);
It is not clear which version of the index would perform the best, which would depend on your data and the execution plan. Ideally one of the above two indices would help the database execute ROW_NUMBER, along with the WHERE clause.
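One rough way to decide, once both indexes exist, is simply to ask the optimizer which plan it chooses with EXPLAIN (a sketch, assuming a MariaDB/MySQL version with window function support, as this answer presumes):
EXPLAIN
SELECT user_id, data, date
FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY date) rn
    FROM datapoint_table
    WHERE date > '2019-01-14'
) numbered
WHERE rn = 1;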
If your database version does not support ROW_NUMBER, then you may continue with your current approach:
SELECT d1.user_id, d1.data, d1.date
FROM datapoint_table d1
INNER JOIN
(
SELECT user_id, MIN(date) AS min_date
FROM datapoint_table
WHERE date > '2019-01-14'
GROUP BY user_id
) d2
ON d1.user_id = d2.user_id AND d1.date = d2.min_date
WHERE
d1.date > '2019-01-14';
Again, the indices suggested should at least speed up the execution of the GROUP BY subquery.
I have a subquery that aggregates some UNION ALL selects. Over that, I prepare the SELECT to create a cross-tab and limit it to, let's say, 20. I would like to be able to retrieve the total COUNT of the subquery results before limiting them in the main query. This is for the purpose of building pagination that receives the total number of records and then the specific page's record grid.
Sample query:
SELECT
name,
sumIf(metric_value, metric_name = 'data') AS data,
sumif(....
FROM
(SELECT
name, metric_name, SUM(metric_value) as metric_value
FROM
(SELECT
name, 'data' AS metric_name, SUM(data) AS metric_value
FROM
table
WHERE
date > '2017-01-01 00:00:00'
GROUP BY
name
UNION ALL
SELECT
name, 'data' AS metric_name, SUM(data) AS metric_value
FROM
table2
WHERE
date > '2017-01-01 00:00:00'
GROUP BY
name
UNION ALL
SELECT
name, 'data' AS metric_name, SUM(data) AS metric_value
FROM
table3
WHERE
date > '2017-01-01 00:00:00'
GROUP BY
name
UNION ALL
.
.
.)
GROUP BY
name, metric_name)
GROUP BY
name
ORDER BY
name ASC
LIMIT 0,20;
The first subselect returns tons of data, so I thought I could count it and return it as one column value, or as a row, and it would propagate to the main select that limits to 20 results. I need to know the size of the entire result set, but I don't want to call the same query twice, once without the limit and once with it, just to get the COUNT. There are at least 12 UNION ALL third-level subselects, so why waste resources? I am looking for generic SQL solutions, not necessarily related to ClickHouse.
I was thinking of using COUNT(*) OVER (), but that is not supported, so if that's the only option, I know I need to run the query twice.
The first thing to mention is that nobody is usually interested in the exact number of pages for a query. It can easily be estimated, and almost no one will care how exact the estimate is. However, if you have a link to the last page in your GUI, people will often click the link just to see whether it works.
Nevertheless, there are cases when an analyst has to visit all the pages, and then the GUI should display the exact amount of work. The good news is that in that latter case a better strategy is to cache a snapshot of the whole result set, and counting the rows in that table is no longer a problem.
In other words, it makes sense to discuss with the customers whether they really need the exact count, because unneeded full scans many times per day can affect database load and billing.
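In generic SQL terms (not ClickHouse-specific), the snapshot idea could look roughly like this, with placeholder names and a deliberately simplified inner query:
-- Materialise the expensive result once per refresh...
CREATE TABLE report_snapshot AS
SELECT name, SUM(data) AS data
FROM table1
WHERE date > '2017-01-01 00:00:00'
GROUP BY name;

-- ...then both the exact total and each page come cheaply from the snapshot.
SELECT COUNT(*) FROM report_snapshot;
SELECT * FROM report_snapshot ORDER BY name ASC LIMIT 0, 20;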
Anyway, if you still need to estimate the number of rows, you can simplify the query to just count them. As I understand it, this is something like:
SELECT SUM(cnt) as row_count
FROM (
SELECT COUNT(DISTINCT name) as cnt FROM table1 WHERE date > ...
UNION ALL
SELECT COUNT(DISTINCT name) as cnt FROM table2 WHERE date > ...
...
) as counts;
Or, if data is a constant metric name:
SELECT COUNT(DISTINCT name) as row_count
FROM (
SELECT DISTINCT name FROM table1 WHERE date > ...
UNION ALL
SELECT DISTINCT name FROM table2 WHERE date > ...
...
) as names;
I have arrived at a query that gives me what I want, but it is not efficient and takes over 45 seconds to execute. How can I modify it to make it quicker?
SELECT *
FROM (SELECT DISTINCT email,
title,
first_name,
last_name,
'chauntry' AS source,
post_code AS postcode
FROM chauntry
WHERE mailing_indicator = 1) AS x
LEFT JOIN (SELECT email,
Avg(amount_paid) AS avg_paid,
Count(*) AS no_times_booked,
Count(DISTINCT( Month(added) )) AS unique_months
FROM chauntry
WHERE added >= Now() - INTERVAL 1 year
GROUP BY email) AS y
ON x.email = y.email
here are the data fields
here are the column headings I am after
DRapp, appreciate the detailed feedback - spot on about the date being an oversight.
Are we talking about something like the below to speed up my query? I can't find much info about creating a covering index, and the source of the one below was questionable.
ALTER TABLE `chauntry`
ADD INDEX(`mailing_indicator`, `email`);
ALTER TABLE `chauntry`
ADD INDEX covering_index (`added`, `email`, `amount_paid`);
You use SELECT DISTINCT in the first subquery and GROUP BY in the second subquery, which have the same effect.
Subqueries in a FROM clause are often redundant; they produce derived tables, which aren't indexed. When you run EXPLAIN on the query you'll see 4 tables in the execution plan. This can be rewritten as a query without subqueries:
SELECT x.email,
x.title,
x.first_name,
x.last_name,
'chauntry' AS source,
post_code AS postcode,
Avg(y.amount_paid) AS avg_paid,
Count(y.email) AS no_times_booked,
Count(DISTINCT( Month(y.added) )) AS unique_months
FROM
chauntry x
LEFT JOIN
chauntry y
ON x.email = y.email AND y.added >= CURRENT_DATE - INTERVAL 1 YEAR
GROUP BY x.email
However, your model isn't properly normalized; you should have two tables, one with the account data and one with the payments.
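A rough sketch of that split (the table names, column types, and keys here are hypothetical, based on the columns used in the query above):
CREATE TABLE account (
    email             VARCHAR(255) PRIMARY KEY,
    title             VARCHAR(50),
    first_name        VARCHAR(100),
    last_name         VARCHAR(100),
    post_code         VARCHAR(20),
    mailing_indicator TINYINT(1) NOT NULL DEFAULT 0
);

CREATE TABLE payment (
    payment_id  INT AUTO_INCREMENT PRIMARY KEY,
    email       VARCHAR(255) NOT NULL,
    amount_paid DECIMAL(10,2),
    added       DATETIME,
    KEY idx_email_added (email, added),
    CONSTRAINT fk_payment_account FOREIGN KEY (email) REFERENCES account (email)
);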
To help your performance, you really need indexes. Since you are in essence running two different queries, I would have the following indexes on your CHAUNTRY table
First... by having the mailing_indicator first, you jump directly to those rows, then get email too, which is the basis for the join afterwards. You could actually extend the index to include title, first_name, last_name, and post_code to make it a covering index, but that might be overkill.
( mailing_indicator, email )
For your LEFT JOIN query, it appears you want the count, average, etc. regardless of the mailing_indicator status. To help optimize this, I would have an index on
( added, email, amount_paid )
This WOULD be a covering index so the engine does not need to go to the raw data pages to query the data, but gets them directly from the index.
One additional note about your count of distinct months: you MIGHT be missing a count entry. Consider the middle of the month, such as today, Jan 28.
If you have entries for Jan 29, 2014 and Jan 27, 2015, they will fall into the same distinct-month bucket and count as 1, not 2, even though they represent two different months, because they span month AND year. You might want to change that to
DATE_FORMAT(added, '%M %Y') as unique_months_yrs
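In context, that change applied to your second subquery would look something like this (an untested sketch based on the query in the question):
SELECT email,
       AVG(amount_paid) AS avg_paid,
       COUNT(*) AS no_times_booked,
       COUNT(DISTINCT DATE_FORMAT(added, '%M %Y')) AS unique_months_yrs
FROM chauntry
WHERE added >= NOW() - INTERVAL 1 YEAR
GROUP BY email;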
Syntax to create an index:
CREATE [UNIQUE|FULLTEXT|SPATIAL] INDEX index_name
[index_type]
ON tbl_name (index_col_name,...)
[index_type]
Create index Chauntry_MailInd_EMail on Chauntry ( mailing_indicator, email );
Create index Chauntry_Add_Email_Paid on Chauntry ( added, email, amount_paid );
I have the following query:
SELECT t.ID, t.caseID, time
FROM tbl_test t
INNER JOIN (
SELECT ID, MAX( TIME )
FROM tbl_test
WHERE TIME <=1353143351
GROUP BY caseID
ORDER BY caseID DESC -- ERROR HERE!
) s
USING (ID)
It seems that I only get the correct result if I use the ORDER BY in the inner join. Why is that? I am using the ID for the join, so the order should have no effect.
If I remove the ORDER BY, I get entries from the database that are too old.
ID is the primary key; caseID identifies a kind of object that has multiple entries with different timestamps.
This query is ambiguous:
SELECT ID, MAX( TIME )
FROM tbl_test
WHERE TIME <=1353143351
GROUP BY caseID
It's ambiguous because it does not guarantee that it returns the ID of the row where the MAX(TIME) occurs. It returns the MAX(TIME) for each distinct value of caseID, but the value of other columns (like ID) is chosen arbitrarily from members of the group.
In practice, MySQL chooses the row that it finds first in the group as it scans rows in storage order.
Example:
caseID ID time
1 10 15:00
1 12 18:00
1 14 13:00
The max time is 18:00, which is the row with ID 12. But the query will return ID 10, simply because it's the first one in the group. If you were to reverse the order with ORDER BY, it would return ID 14. Still not the row where the max time is found, but it's from the other end of the group of rows.
Your query works with ORDER BY caseID DESC because, by coincidence, your Time values increase with the increasing ID.
This sort of query is actually an error in standard SQL and most other brands of SQL database. MySQL permits it, trusting that you know how to form an unambiguous query.
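If you would rather have MySQL reject such ambiguous queries outright, you can enable the ONLY_FULL_GROUP_BY SQL mode (it is on by default in MySQL 5.7 and later); a minimal sketch:
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';  -- note: this replaces the session's other modes

-- The ambiguous query now fails with an error instead of silently picking a row:
SELECT ID, MAX( TIME )
FROM tbl_test
WHERE TIME <= 1353143351
GROUP BY caseID;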
The fix is to use columns in the select-list only if they are unambiguous, that is, only if they are in the GROUP BY clause, so that each group is guaranteed to have only one distinct value:
SELECT caseID, MAX( TIME )
FROM tbl_test
WHERE TIME <=1353143351
GROUP BY caseID
Then, if you need the other columns, join that back to the table:
SELECT t.ID, t.caseID, time
FROM tbl_test t
INNER JOIN (
SELECT caseID, MAX( TIME ) maxtime
FROM tbl_test
WHERE TIME <=1353143351
GROUP BY caseID
) s
ON t.caseID = s.caseID and t.time = s.maxtime
You are seeing that issue because you are getting the MAX(TIME) per caseID, but since you are grouping by caseID and NOT ID, you are getting an arbitrary ID. That happens because when you use an aggregate function like MAX, you must specify, for every non-grouped field in the SELECT, how you want it aggregated. That means if a column is in the SELECT and NOT in the GROUP BY, you have to tell MySQL how to aggregate it. If you don't, you get a RANDOM row (well, not random per se, but not one in an order you necessarily expect).
The reason ORDER BY is working for you is that it kind of tricks the query optimizer into sorting the results before grouping, which just so happens to produce the result you want. Be warned, though: that will not always be the case.
What you want is the ID that has the MAX(TIME) for a given caseID, which means your INNER JOIN needs to connect by caseID (not ID) and time (which will give you one row for each row in the outer table).
Barmar beat me to the actual query, but that's the way you want to go.
Suppose I have a table with 3 columns:
id (PK, int)
timestamp (datetime)
title (text)
I have the following records:
1, 2010-01-01 15:00:00, Some Title
2, 2010-01-01 15:00:02, Some Title
3, 2010-01-02 15:00:00, Some Title
I need to GROUP BY records that are within 3 seconds of each other. For this table, rows 1 and 2 would be grouped together.
There is a similar question here: Mysql DateTime group by 15 mins
I also found this: http://www.artfulsoftware.com/infotree/queries.php#106
I don't know how to convert these methods into something that will work for seconds. The trouble with the method on the SO question is that it seems to me that it would only work for records falling within a bin of time that starts at a known point. For instance, if I were to get FLOOR() to work with seconds, at an interval of 5 seconds, a time of 15:00:04 would be grouped with 15:00:01, but not grouped with 15:00:06.
Does this make sense? Please let me know if further clarification is needed.
EDIT: For the set of numbers {1, 2, 3, 4, 5, 6, 7, 50, 51, 60}, it seems it might be best to group them {1, 2, 3, 4, 5, 6, 7}, {50, 51}, {60}, so that each grouping depends on whether the row is within 3 seconds of the previous one. I know this changes things a bit; I'm sorry for being wishy-washy on this.
I am trying to fuzzy-match logs from different servers. Server #1 may log an item, "Item #1", and Server #2 will log that same item, "Item #1", within a few seconds of server #1. I need to do some aggregate functions on both log lines. Unfortunately, I only have title to go on, due to the nature of the server software.
I'm using Tom H.'s excellent idea but doing it a little differently here:
Instead of finding all the rows that are the beginnings of chains, we can find all times that are the beginnings of chains, then go back and find the rows that match the times.
Query #1 here should tell you which times are the beginnings of chains by finding which times do not have any times below them but within 3 seconds:
SELECT DISTINCT a.Timestamp
FROM Table a
LEFT JOIN Table b
ON (b.Timestamp >= a.Timestamp - INTERVAL 3 SECOND
AND b.Timestamp < a.Timestamp)
WHERE b.Timestamp IS NULL
And then for each row, we can find the largest chain-starting timestamp that is less than our timestamp with Query #2:
SELECT Table.id, MAX(StartOfChains.TimeStamp) AS ChainStartTime
FROM Table
JOIN ([query #1]) StartOfChains
ON Table.Timestamp >= StartOfChains.TimeStamp
GROUP BY Table.id
Once we have that, we can GROUP BY it as you wanted.
SELECT COUNT(*) -- or whatever
FROM Table
JOIN ([query #2]) GroupingQuery
ON Table.id = GroupingQuery.id
GROUP BY GroupingQuery.ChainStartTime
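Spliced together (with the placeholder table name backquoted, since TABLE is a reserved word), the whole thing would look roughly like this:
SELECT GroupingQuery.ChainStartTime, COUNT(*) AS rows_in_chain
FROM `Table`
JOIN (
    -- query #2: map every row to the chain-starting timestamp it belongs to
    SELECT `Table`.id, MAX(StartOfChains.Timestamp) AS ChainStartTime
    FROM `Table`
    JOIN (
        -- query #1: timestamps with no other timestamp in the 3 seconds before them
        SELECT DISTINCT a.Timestamp
        FROM `Table` a
        LEFT JOIN `Table` b
          ON b.Timestamp >= a.Timestamp - INTERVAL 3 SECOND
         AND b.Timestamp < a.Timestamp
        WHERE b.Timestamp IS NULL
    ) StartOfChains ON `Table`.Timestamp >= StartOfChains.Timestamp
    GROUP BY `Table`.id
) GroupingQuery ON `Table`.id = GroupingQuery.id
GROUP BY GroupingQuery.ChainStartTime;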
I'm not entirely sure this is distinct enough from Tom H's answer to be posted separately, but it sounded like you were having trouble with implementation, and I was thinking about it, so I thought I'd post again. Good luck!
Now that I think that I understand your problem, based on your comment response to OMG Ponies, I think that I have a set-based solution. The idea is to first find the start of any chains based on the title. The start of a chain is going to be defined as any row where there is no match within three seconds prior to that row:
SELECT
MT1.my_id,
MT1.title,
MT1.my_time
FROM
My_Table MT1
LEFT OUTER JOIN My_Table MT2 ON
MT2.title = MT1.title AND
(
MT2.my_time < MT1.my_time OR
(MT2.my_time = MT1.my_time AND MT2.my_id < MT1.my_id)
) AND
MT2.my_time >= MT1.my_time - INTERVAL 3 SECOND
WHERE
MT2.my_id IS NULL
Now we can assume that any non-chain starters belong to the chain starter that appeared before them. Since MySQL doesn't support CTEs, you might want to throw the above results into a temporary table, as that would save you the multiple joins to the same subquery below.
SELECT
SQ1.my_id,
COUNT(*) -- You didn't say what you were trying to calculate, just that you needed to group them
FROM
(
SELECT
MT1.my_id,
MT1.title,
MT1.my_time
FROM
My_Table MT1
LEFT OUTER JOIN My_Table MT2 ON
MT2.title = MT1.title AND
(
MT2.my_time < MT1.my_time OR
(MT2.my_time = MT1.my_time AND MT2.my_id < MT1.my_id)
) AND
MT2.my_time >= MT1.my_time - INTERVAL 3 SECOND
WHERE
MT2.my_id IS NULL
) SQ1
INNER JOIN My_Table MT3 ON
MT3.title = SQ1.title AND
MT3.my_time >= SQ1.my_time
LEFT OUTER JOIN
(
SELECT
MT1.my_id,
MT1.title,
MT1.my_time
FROM
My_Table MT1
LEFT OUTER JOIN My_Table MT2 ON
MT2.title = MT1.title AND
(
MT2.my_time < MT1.my_time OR
(MT2.my_time = MT1.my_time AND MT2.my_id < MT1.my_id)
) AND
MT2.my_time >= MT1.my_time - INTERVAL 3 SECOND
WHERE
MT2.my_id IS NULL
) SQ2 ON
SQ2.title = SQ1.title AND
SQ2.my_time > SQ1.my_time AND
SQ2.my_time <= MT3.my_time
WHERE
SQ2.my_id IS NULL
This would look much simpler if you could use CTEs or if you used a temporary table. Using the temporary table might also help performance.
Also, there will be issues with this if you can have timestamps that match exactly. If that's the case then you will need to tweak the query slightly to use a combination of the id and the timestamp to distinguish rows with matching timestamp values.
EDIT: Changed the queries to handle exact matches by timestamp.
Warning: Long answer. This should work, and is fairly neat, except for one step in the middle where you have to be willing to run an INSERT statement over and over until it doesn't do anything, since we can't do recursive CTE things in MySQL.
I'm going to use this data as the example instead of yours:
id Timestamp
1 1:00:00
2 1:00:03
3 1:00:06
4 1:00:10
Here is the first query to write:
SELECT a.id as aid, b.id as bid
FROM Table a
JOIN Table b
ON (ABS(TIMESTAMPDIFF(SECOND, a.Timestamp, b.Timestamp)) <= 3) -- i.e. a.Timestamp is within 3 seconds of b.Timestamp
It returns:
aid bid
1 1
1 2
2 1
2 2
2 3
3 2
3 3
4 4
Let's create a nice table to hold those things that won't allow duplicates:
CREATE TABLE
Adjacency
( aid INT(11)
, bid INT(11)
, PRIMARY KEY (aid, bid) -- important for later
)
Now the challenge is to find something like the transitive closure of that relation.
To do so, let's find the next level of links. By that I mean: since we have (1, 2) and (2, 3) in the Adjacency table, we should add (1, 3):
INSERT IGNORE INTO Adjacency(aid,bid)
SELECT adj1.aid, adj2.bid
FROM Adjacency adj1
JOIN Adjacency adj2
ON (adj1.bid = adj2.aid)
This is the non-elegant part: You'll need to run the above INSERT statement over and over until it doesn't add any rows to the table. I don't know if there is a neat way to do that.
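If you want to automate that repetition, one possible approach (a sketch, untested against a real dataset) is to wrap the INSERT in a small stored procedure that loops until ROW_COUNT() reports that nothing new was inserted:
DELIMITER //
CREATE PROCEDURE close_adjacency()
BEGIN
  DECLARE added INT DEFAULT 1;
  WHILE added > 0 DO
    -- Same INSERT as above; INSERT IGNORE skips pairs that already exist.
    INSERT IGNORE INTO Adjacency (aid, bid)
    SELECT adj1.aid, adj2.bid
    FROM Adjacency adj1
    JOIN Adjacency adj2 ON adj1.bid = adj2.aid;
    -- ROW_COUNT() is 0 once no new pairs were added, i.e. the closure is complete.
    SET added = ROW_COUNT();
  END WHILE;
END //
DELIMITER ;

CALL close_adjacency();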
Once this is over, you will have a transitively-closed relation like this:
aid bid
1 1
1 2
1 3 --added
2 1
2 2
2 3
3 1 --added
3 2
3 3
4 4
And now for the punchline:
SELECT aid, GROUP_CONCAT( bid ) AS Neighbors
FROM Adjacency
GROUP BY aid
returns:
aid Neighbors
1 1,2,3
2 1,2,3
3 1,2,3
4 4
So
SELECT DISTINCT Neighbors
FROM (
SELECT aid, GROUP_CONCAT( bid ) AS Neighbors
FROM Adjacency
GROUP BY aid
) Groupings
returns
Neighbors
1,2,3
4
Whew!
I like Chris Cunningham's answer, but here's another take on it.
First, my understanding of your problem statement (correct me if I'm wrong):
You want to look at your event log as a sequence, ordered by the time of the event,
and partition it into groups, defining the boundary as an interval of
more than 3 seconds between two adjacent rows in the sequence.
I work mostly in SQL Server, so I'm using SQL Server syntax. It shouldn't be too difficult to translate into MySQL SQL.
So, first our event log table:
--
-- our event log table
--
create table dbo.eventLog
(
id int not null ,
dtLogged datetime not null ,
title varchar(200) not null ,
primary key nonclustered ( id ) ,
unique clustered ( dtLogged , id ) ,
)
Given the above understanding of the problem statement, the following query should give you the upper and lower bounds of your groups. It's a simple nested select statement with two GROUP BYs to collapse things:
The innermost select defines the upper bound of each group. That upper boundary defines a group.
The outer select defines the lower bound of each group.
Every row in the table should fall into one of the groups so defined, and any given group may well consist of a single date/time value.
[edited: the upper bound is the lowest date/time value where the interval is more than 3 seconds]
select dtFrom = min( t.dtFrom ) ,
dtThru = t.dtThru
from ( select dtFrom = t1.dtLogged ,
dtThru = min( t2.dtLogged )
from dbo.EventLog t1
left join dbo.EventLog t2 on t2.dtLogged >= t1.dtLogged
and datediff(second,t1.dtLogged,t2.dtLogged) > 3
group by t1.dtLogged
) t
group by t.dtThru
You could then pull rows from the event log and tag them with the group to which they belong thus:
select *
from ( select dtFrom = min( t.dtFrom ) ,
dtThru = t.dtThru
from ( select dtFrom = t1.dtLogged ,
dtThru = min( t2.dtLogged )
from dbo.EventLog t1
left join dbo.EventLog t2 on t2.dtLogged >= t1.dtLogged
and datediff(second,t1.dtLogged,t2.dtLogged) > 3
group by t1.dtLogged
) t
group by t.dtThru
) period
join dbo.EventLog t on t.dtLogged >= period.dtFrom
and t.dtLogged <= coalesce( period.dtThru , t.dtLogged )
order by period.dtFrom , period.dtThru , t.dtLogged
Each row is tagged with its group via the dtFrom and dtThru columns returned. You could get fancy and assign an integral row number to each group if you want.
Simple query:
SELECT * FROM time_history GROUP BY ROUND(UNIX_TIMESTAMP(time_stamp)/3);