I've got a table with 41,000,000 rows with the following structure:
What is the best index on date to enable quick filtering like this?
select count(*) from market_db_candle where date >= 2018-09-05 and date <= 2018-09-09;
For the query you show, you can't do better than the index on date which you already have. The query will examine all rows that match the date range, but at least it won't examine any rows outside the date range. If that makes the query examine thousands or millions of rows, that can't be helped. It has to examine the rows to count them.
If you need a query that counts the rows in less time, one strategy is to create a second table that stores one row per date, along with the count associated with that date.
CREATE TABLE market_db_candle_count_by_day (
date DATE PRIMARY KEY,
count INT UNSIGNED NOT NULL
);
Populate it with the counts by day:
INSERT INTO market_db_candle_count_by_day
SELECT date, COUNT(*)
FROM market_db_candle
GROUP BY date;
Then you can query the SUM of counts:
select sum(count) as count from market_db_candle_count_by_day
where date >= '2018-09-05' and date <= '2018-09-09';
It's up to you to keep the latter table in sync, and update it when necessary.
PS: Put date literals inside single-quotes.
Related
I have a MySQL database. I have a table in it which has around 200000 rows.
I am querying this table to fetch the latest data. Query:
select *
from `db`.`Data`
where
floor = "floor_value" and
date = "date_value" and
timestamp > "time_value"
order by
timestamp DESC
limit 1
It takes about 9 seconds to fetch the data. When the table had fewer rows, it did not take this long. Can anyone help me reduce the time taken by the query?
Try adding the following compound index:
CREATE INDEX idx ON Data (floor, date, timestamp);
This index covers every column in the WHERE clause and ideally should also be usable for the ORDER BY clause. The equality columns floor and date come first and timestamp comes last, so that MySQL can seek to the index entries for one floor and date and then read the matching timestamp values in order by scanning the index. Had we put timestamp first, the range condition on it would prevent MySQL from using the remaining index columns to narrow the search, and it would have to examine many more entries.
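To see the compound index at work on a small scale, here is a rough sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows and the value column are invented for illustration, and SQLite's planner is not MySQL's, so treat the plan output only as an indication. In MySQL you would run EXPLAIN on the query instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Data (id INTEGER PRIMARY KEY, floor TEXT, date TEXT, "
    "timestamp INTEGER, value REAL)"
)
# Equality columns first, range column last.
conn.execute("CREATE INDEX idx ON Data (floor, date, timestamp)")

rows = [
    ("f1", "2020-01-01", 100, 1.0),
    ("f1", "2020-01-01", 200, 2.0),
    ("f1", "2020-01-01", 300, 3.0),
    ("f2", "2020-01-01", 400, 4.0),
    ("f1", "2020-01-02", 500, 5.0),
]
conn.executemany(
    "INSERT INTO Data (floor, date, timestamp, value) VALUES (?, ?, ?, ?)", rows
)

query = (
    "SELECT * FROM Data "
    "WHERE floor = ? AND date = ? AND timestamp > ? "
    "ORDER BY timestamp DESC LIMIT 1"
)
params = ("f1", "2020-01-01", 150)

# The newest row for floor f1 on 2020-01-01 after timestamp 150.
latest = conn.execute(query, params).fetchone()

# The last column of each plan row is a human-readable description.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query, params)]
print(latest)
print(plan)
```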
I have an innodb table with 100M records like this:
id name pid cid createdAt
int char int int timestamp
id is PK, and pid is indexed: key
The most frequent query is select count(*) from table1 where pid='pid'
My question is: does this query do a full table scan?
count(*) counts every row that the query matches, whatever the rows contain.
The count function with a column argument counts the rows where that column is not null, so count(name) counts records where the name field is not null, for example. If the field being counted is not indexed, this can result in a full table scan.
In the case of count(*), the database counts all matching records, including records where every field is null. Use this form when you want to count all of the records regardless of their content.
count(1) returns the same result, because the value 1 is not null for every record. In MySQL, count(*) and count(1) are handled the same way, so neither is faster than the other.
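The difference between the count forms is easy to check on a tiny table. The following is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL (the three sample rows are invented); the null-handling shown here is standard SQL behavior, so MySQL gives the same counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (name TEXT, pid INTEGER)")
conn.executemany(
    "INSERT INTO table1 (name, pid) VALUES (?, ?)",
    [
        (None, None),  # every field is null
        ("a", 1),
        (None, 2),     # name is null, pid is not
    ],
)

star, one, names = conn.execute(
    "SELECT COUNT(*), COUNT(1), COUNT(name) FROM table1"
).fetchone()

# COUNT(*) and COUNT(1) count all three rows, including the all-null one.
# COUNT(name) counts only the row whose name is not null.
print(star, one, names)
```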
If you want to know what the query does, then look at the "explain" plan.
If you want to speed up the query in question, make sure there is an index on table1(pid).
The query should scan the index rather than the table.
I got an existing Mysql table with one of the columns as time int(10)
The field has many records like
1455307434
1455307760
Is it an encrypted date and time?
What select query should I use so that it displays an actual date?
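Those values are Unix timestamps (seconds since 1970-01-01 00:00:00 UTC), not encrypted data. Use FROM_UNIXTIME() to convert them: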
SELECT FROM_UNIXTIME(mycolumn)
FROM mytable
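As a quick sanity check outside the database, the same conversion can be sketched in Python. The helper name from_unixtime_utc is made up for this example, and it always formats in UTC, whereas MySQL's FROM_UNIXTIME() formats the result in the session time zone, so the hours may differ on your server.

```python
from datetime import datetime, timezone

def from_unixtime_utc(seconds):
    # Interpret the integer as seconds since the Unix epoch, in UTC.
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")

converted = [from_unixtime_utc(s) for s in (1455307434, 1455307760)]
print(converted)  # ['2016-02-12 20:03:54', '2016-02-12 20:09:20']
```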
I want to:
select
max_date = max( dates)
from some_table t
where dates is datetime in form of
2014-10-29 23:34:11
and is primary key, so is indexed.
What is the retrieval complexity for big databases?
Since your date column is the primary key, it is unique and indexed, so it should be fine.
Per the MySQL documentation, MIN() and MAX() on an indexed column are optimized into a single key lookup, so the query does not have to scan the table.
In your case, since you are just trying to get the maximum date, you can also use ORDER BY with LIMIT as below, which takes advantage of the index on the dates column:
select `dates`
from some_table
order by `dates` desc
limit 1;
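The two forms return the same value, which can be checked with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, stores the datetimes as text, and uses invented sample rows; it only demonstrates that the results match, not how MySQL executes either query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The primary key gives the column a unique index.
conn.execute("CREATE TABLE some_table (dates TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO some_table (dates) VALUES (?)",
    [
        ("2014-10-27 08:00:00",),
        ("2014-10-29 23:34:11",),
        ("2014-10-28 12:15:30",),
    ],
)

via_max = conn.execute("SELECT MAX(dates) FROM some_table").fetchone()[0]
via_order = conn.execute(
    "SELECT dates FROM some_table ORDER BY dates DESC LIMIT 1"
).fetchone()[0]
print(via_max, via_order)
```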
I have a table with 5 million rows, and I want to get only rows that have the field date between two dates (date1 and date2). I tried to do
select column from table where date > date1 and date < date2
but the processing time is really big. Is there a smarter way to do this? Maybe access directly a row and make the query only after that row? My point is, is there a way to discard a large part of my table that does not match to the date period? Or I have to read row by row and compare the dates?
Usually you apply some kind of condition before retrieving the results. If you don't have anything to filter on you might want to use LIMIT and OFFSET:
SELECT * FROM table_name WHERE date BETWEEN ? AND ? LIMIT 1000 OFFSET 1000
Generally you will LIMIT to whatever amount of records you'd like to show on a particular page.
You can try/do a couple of things:
1.) If you don't already have one, index your date column
2.) Range partition your table on the date field
When you partition a table, the query optimizer can eliminate partitions that are not able to satisfy the query without actually processing any data.
For example, let's say you partitioned your table by the date field monthly and that you had 6 months of data in the table. If you query for a date range of one week in October 2012, the query optimizer can throw out 5 of the 6 partitions and scan only the partition that holds the records for October 2012.
For more details, check the MySQL Partitioning page. It gives you all the necessary information and a more thorough example of what I described above in the "Partition Pruning" section.
Note: I would recommend cloning your table into a new partitioned table and running the query against it, to test the results and see whether it satisfies your requirements. If you haven't already indexed the date column, that should be your first step. Test that, and if need be, look into partitioning.
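The idea behind partition pruning can be modelled in a few lines of Python. This is only a conceptual sketch, not how MySQL implements it: the partitions are plain lists keyed by (year, month), the data is invented (28 rows per month, May through October 2012), and the function names are made up for this example.

```python
from collections import defaultdict
from datetime import date

def month_key(d):
    return (d.year, d.month)

def build_partitions(days):
    # One "partition" per calendar month.
    partitions = defaultdict(list)
    for d in days:
        partitions[month_key(d)].append(d)
    return partitions

def query_range(partitions, start, end):
    # Pruning step: keep only partitions whose month can overlap the range.
    lo, hi = month_key(start), month_key(end)
    scanned = sorted(key for key in partitions if lo <= key <= hi)
    # Scan step: check rows only inside the surviving partitions.
    hits = [d for key in scanned for d in partitions[key] if start <= d <= end]
    return scanned, hits

days = [date(2012, m, day) for m in range(5, 11) for day in range(1, 29)]
partitions = build_partitions(days)
scanned, hits = query_range(partitions, date(2012, 10, 8), date(2012, 10, 14))
print(len(partitions), scanned, len(hits))
```

Here only one of the six monthly partitions is scanned for the one-week query, which mirrors the example described above.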