I have one table, trip_data. Every second I receive a packet and insert its data into the database. The trip_data table contains four fields: trip_paramid, fuel_content, creation_time and vehicle_id. I want to select all rows in which the difference between creation times is two minutes (not exactly 2, approximately 2). The trip_data table contains 40 lakh (4 million) rows, so I need an optimized select query for this. Can anyone help with this? Here is the table schema and sample data for the trip_data table:
SQLFiddle demo
SELECT
    tp.*
FROM
    trip_parameters tp
GROUP BY
    CONVERT(UNIX_TIMESTAMP(tp.creation_time)/(2*60), UNSIGNED)
ORDER BY
    tp.creation_time ASC
Note that using UNIX_TIMESTAMP does not allow you to handle dates beyond January 2038 (the end of the 32-bit Unix epoch range). Using the following instead fixes the problem:
CONVERT(TIMESTAMPDIFF(SECOND,'1970-01-01 00:00:00',tp.creation_time)/(2*60), unsigned)
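As a quick sanity check with made-up timestamps: two creation times two minutes apart land in adjacent buckets, so they survive the GROUP BY as separate rows:
SELECT CONVERT(UNIX_TIMESTAMP('2013-01-01 00:00:30')/(2*60), UNSIGNED) AS bucket_a,
       CONVERT(UNIX_TIMESTAMP('2013-01-01 00:02:30')/(2*60), UNSIGNED) AS bucket_b;
-- bucket_b comes out exactly bucket_a + 1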
You can do it in one table scan using MySQL user-defined variables. Unfortunately, UDVs have a limited set of data types (integer, decimal, floating-point, binary or nonbinary string). So in this query I use a char variable @ti to store the previous datetime, using CAST to compare it with the creation_time field. I also set the initial value for this variable to (now()-10000000); you can use any date you wish that is less than MIN(creation_time).
Here is the SQLFiddle demo
select * from
(
    select trip_parameters.*,
           if(ABS(TIMESTAMPDIFF(MINUTE, creation_time, cast(@ti as datetime))) >= 2, 1, 0) t,
           @ti := if(ABS(TIMESTAMPDIFF(MINUTE, creation_time, cast(@ti as datetime))) >= 2,
                     cast(creation_time as char(100)), @ti)
    from trip_parameters,
         (select @ti := cast(now()-10000000 as char(100))) a
    order by creation_time
) t2
where t = 1
order by creation_time
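With 4 million rows, the inner ORDER BY creation_time dominates the cost; if creation_time is not already indexed, adding an index (the name here is just an example) lets MySQL read the rows in order instead of sorting them:
ALTER TABLE trip_parameters ADD INDEX idx_creation_time (creation_time);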
Try this
SELECT trip_paramid, fuel_content, creation_time, vehicle_id
FROM trip_parameters
GROUP BY FLOOR(UNIX_TIMESTAMP(creation_time)/120)
This takes one row from every 2-minute block.
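Note that under ONLY_FULL_GROUP_BY (in the default sql_mode since MySQL 5.7.5) selecting bare columns alongside this GROUP BY is rejected; a variant wrapped in ANY_VALUE() keeps the same one-row-per-bucket behaviour:
SELECT ANY_VALUE(trip_paramid) AS trip_paramid,
       ANY_VALUE(fuel_content) AS fuel_content,
       MIN(creation_time) AS creation_time,
       ANY_VALUE(vehicle_id) AS vehicle_id
FROM trip_parameters
GROUP BY FLOOR(UNIX_TIMESTAMP(creation_time)/120);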
Related
I have a field called MER_DATA in a Snowflake table having a value as shown below:
[43,44.25,44.5,42.75,44,44.25,42.75,43,42.5,42.5,36.75,42.25,42.75,43.25,43.25,43.25,42.75,43.5,42,43,43.75,43.75,43.25,41.75,43.25,42.5,43.25,42.75,43.25,43.5,43.25,43.25,43.75,...]
Each row has approximately 4k numbers (this varies from row to row), and the data type of the field is varchar(30000). The table has around 700k rows.
Now I want to calculate the standard deviation of each row using the numbers present in the list shown above.
I have tried doing this in MySQL using the following query:
select mac, `timestamp`, std(res), min(res), max(res)
from
    (select mac, `timestamp`, r.res
     from cmr,
          json_table(mer_data, '$[*]' columns (res float path '$')) r) T
group by mac, `timestamp`;
which gives me the right result but takes a lot of time for 700k rows.
I want to do the same in Snowflake. Is there an optimal way to do this?
Also, the query needs to run within 10 minutes in Snowflake; the MySQL query can take up to an hour.
Without the table definition and example source data it's difficult to produce a complete solution for your problem, but here is an example of how to do this using the STRTOK_SPLIT_TO_TABLE table function, which first splits your varchar numbers to rows, so we can then re-aggregate the VALUEs to get the standard deviation per row.
First generate some test data at the right scale:
Create or Replace Table cmr (mer_data varchar) as
With gen as (
select
uniform(1,700000, random()) row_num,
normal(50, 1, random(0))::decimal(4,2) num
from table(generator(rowcount => 2800000000)) v
)
Select listagg(num, ',') listNums from gen group by row_num
;
Check that we have 700k rows and a varying count of numbers per row.
Select
count(*) row_count,
min(REGEXP_COUNT( mer_data , '[,]' ))+1 min_num_count,
max(REGEXP_COUNT( mer_data , '[,]' ))+1 max_num_count
from cmr limit 10;
Split the varchar number lists to rows with STRTOK_SPLIT_TO_TABLE and group by the generated SEQ column to calculate the stddev of the VALUE.
Select
seq row_num,
stddev(value) stdListNums,
min(value) minNum, max(value) maxNum,
count(value) countListNums
from cmr, table(STRTOK_SPLIT_TO_TABLE(mer_data ,','))
Group By 1
;
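STRTOK_SPLIT_TO_TABLE returns VALUE as a VARCHAR, so the aggregates above rely on implicit casting; if stray tokens might appear in the strings, an explicit TRY_TO_DOUBLE (which yields NULL instead of an error on bad input) is a safer variant:
Select
    seq row_num,
    stddev(try_to_double(value)) stdListNums,
    count(try_to_double(value)) countListNums
from cmr, table(STRTOK_SPLIT_TO_TABLE(mer_data, ','))
Group By 1;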
For my data the query takes just over 3 minutes on an XSMALL Virtual Warehouse, and a little over 30 seconds on a LARGE Virtual Warehouse.
You can read about the STRTOK_SPLIT_TO_TABLE function in the Snowflake documentation.
I have more than 10 million rows in my table and need to pull them in order to display a report. The data was originally extracted from CSV, and all of it is stored in text format; the time column holds strings in MM/DD/YYYY_HH:MM:SS.ffffff format.
Querying with a LIMIT of 1000 displays quickly; however, if I apply a date filter, e.g. to get one day's data, it takes around 25-30 secs:
SELECT STR_TO_DATE(SUBSTRING_INDEX(time, '_', 1), '%m/%d/%Y') FROM myTable
WHERE STR_TO_DATE(SUBSTRING_INDEX(time, '_', 1), '%m/%d/%Y') BETWEEN DATE('2019-9-3') AND DATE('2019-9-3');
I already tried to create an index on the time column, which I am using for the filter, but I still got the same result.
Is there any suggestion/comment on how I can improve the speed of pulling the data? TIA
When you apply functions to a column as part of your search, it can't use an index, even if you define an index for that column.
You should also use a proper DATE or DATETIME data type for the column, which will require dates be stored in YYYY-MM-DD format, not a string column in MM/DD/YYYY format.
If you store the dates properly, you can do this:
SELECT DATE(time) FROM myTable
WHERE time >= '2019-09-03' AND time < '2019-09-04';
That will make use of the index.
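As a quick check, EXPLAIN on the rewritten query should report a range scan on the time index rather than a full table scan:
EXPLAIN SELECT DATE(time) FROM myTable
WHERE time >= '2019-09-03' AND time < '2019-09-04';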
You are storing your dates/timestamps as text, which is going to force you to do suboptimal things like calling STR_TO_DATE all over the place. I suggest adding a new bona fide datetime column, and then indexing that column:
ALTER TABLE myTable ADD COLUMN time_dt DATETIME;
Then, populate it using STR_TO_DATE:
UPDATE myTable
SET time_dt = STR_TO_DATE(time, '%m/%d/%Y_%H:%i:%s.%f');
Then, add an index on time_dt:
CREATE INDEX idx ON myTable (time_dt);
And finally, rewrite your query so that the WHERE clause is sargable (i.e. so that it may use the above index):
SELECT DATE(time_dt)
FROM myTable
WHERE time_dt >= '2019-09-03' AND time_dt < '2019-09-04';
Side note: You need to use %H in the format mask with STR_TO_DATE, because your hours are in 24-hour clock mode.
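An alternative sketch, assuming MySQL 5.7+ and that every row matches the format mask: a stored generated column replaces the manual ALTER/UPDATE steps above, stays in sync as new rows arrive, and can be indexed directly (don't combine this with the ALTER above, since both define time_dt):
ALTER TABLE myTable
  ADD COLUMN time_dt DATETIME
    GENERATED ALWAYS AS (STR_TO_DATE(time, '%m/%d/%Y_%H:%i:%s.%f')) STORED,
  ADD INDEX idx_time_dt (time_dt);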
I have an events table having a start date and an end date. I am trying to retrieve all the records by giving a date that is between the start and end dates.
eg :
SELECT *
FROM `events`
WHERE '2017-01-29' BETWEEN start_date='2017-01-28'
AND end_date='2017-01-31'
but the response is a syntax error. Can anyone help me finish the query?
Just list the columns.
WHERE '2017-01-29' BETWEEN start_date AND end_date
The values come from the table, you don't put them into the query.
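Applied to the query from the question, that gives:
SELECT *
FROM `events`
WHERE '2017-01-29' BETWEEN start_date AND end_date;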
According to the MySQL documentation (https://dev.mysql.com/doc/refman/5.7/en/comparison-operators.html#operator_between) the syntax for BETWEEN is
expr BETWEEN min AND max
it is not
expr BETWEEN blabla=min AND stuff=max
Also, it is rather pointless to be using constants in all three expressions, because in this case the result will be known in advance (either always TRUE or always FALSE) without having to consult the values in your table.
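For example, this predicate touches no table data and always evaluates to 1 (TRUE):
SELECT '2017-01-29' BETWEEN '2017-01-28' AND '2017-01-31' AS always_true;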
It is kind of hard to give you an example without knowing the structure of your table, but what you probably want is something like
WHERE '2017-01-29' BETWEEN start_date
AND end_date
(assuming start_date and end_date are columns in your table)
or something like
WHERE some_column BETWEEN '2017-01-28'
AND '2017-01-31'
(assuming some_column is a column in your table.)
I believe you're trying to find all the rows where a date is 2017-01-29, and so, your query could be:
SELECT *
FROM `events`
WHERE
date = '2017-01-29';
If, however, you want all rows with date between 2017-01-28 and 2017-01-31, then you could do:
SELECT *
FROM `events`
WHERE
date BETWEEN '2017-01-28' AND '2017-01-31';
Instead of using the literal '2017-01-29' as the expression after WHERE, put the name of the field you want to filter by date, such as EventDate (or whatever your field is named).
I'm trying to select all fields for a number of rows from my MySQL table. One of my fields is called publication_date and it stores a string that represents a day that specific row is to be published on our website. It's stored in mm/dd/yyyy format.
I know I can cast that field to a DATE data type using CAST, but I'm not sure how to also grab the other fields' data.
Just add that column to your SELECT clause in addition to the *. Make sure to give it an alias so you can differentiate it from the regular datetime field.
SELECT *
, CAST(datefield AS date) AS aliasname
FROM tablename
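One caveat for this particular question: MySQL's CAST only parses ISO-style (YYYY-MM-DD) strings and returns NULL for the mm/dd/yyyy values described above, so STR_TO_DATE with an explicit format mask is the safer choice here:
SELECT *,
       STR_TO_DATE(publication_date, '%m/%d/%Y') AS publication_date_parsed
FROM tablename;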
You can do:
Select *, STR_TO_DATE(publication_date, '%m/%d/%Y') as newPublicationdate from tableName
Or, if your table does not have many columns, it is better to list them all explicitly:
Select column1, column2, STR_TO_DATE(publication_date, '%m/%d/%Y') as publication_date from tableName
Regards
I have 2 PDO database connections. I am doing a search within an MS SQL table for the row that most closely matches a date (MySQL datetime) value.
I pass mysql.table1.date to the MS SQL side and look for the closest date according to mssql.table.date, which is also defined as a datetime field. I only need 1 row returned, the one closest to that time, so in essence:
SELECT * FROM table ORDER BY CLOSEST(mysqldate = mssql.table.date) LIMIT 1;
I know the syntax above is incorrect, but it basically expresses what I need. I really do not know how to do this with MS SQL.
Any help?
Basically you can compute the difference between the MySQL date and each of the dates in the mssql.Table.date column, then select the row with the least difference. Hopefully the query below helps:
;with CTE as
(
    select mssql.Table.date,
           row_number() over (order by abs(datediff(day, mysqlDate, mssql.Table.date))) rowNumber
    from mssql.Table
)
select date from CTE where rowNumber = 1
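A simpler variant skips the CTE; in SQL Server, TOP 1 with the same ordering returns just the closest row (mssql_table, [date] and @mysqlDate are placeholders for your table, column and the MySQL datetime you pass in; SECOND granularity matches more closely than DAY):
SELECT TOP 1 *
FROM mssql_table
ORDER BY ABS(DATEDIFF(second, @mysqlDate, [date]));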
A simple solution which worked for me was to do the following:
SELECT * FROM `table` WHERE `date` < `startDate` ORDER BY `date` DESC LIMIT 1;
This returns the single row whose time is closest to (just before) the time I am passing :)