MySQL: How to index my table to optimize this query?

My table contains a history of values relating to objects. It looks like this:
create table History (
object_id bigint NOT NULL,
value int NOT NULL,
date bigint NOT NULL
);
How can I index it to optimize the following query:
select * from History
where object_id = ? and date < ?
order by date desc
limit ?

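Create a composite index on object_id and date, in that order. The equality column comes first so that all rows for one object sit together in the index, already sorted by date; MySQL can then read the matching range backwards and stop after the limit, without a separate sort: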
CREATE INDEX object_id_date ON History(object_id, `date`);
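One way to confirm that the index is actually used is to run EXPLAIN on the query. This is only a sketch: the values 42, 1700000000 and 10 are made-up stand-ins for the ? placeholders, and the exact plan text depends on the MySQL version and the data.

```sql
-- Hypothetical parameter values standing in for the ? placeholders.
-- If the index is used as intended, the `key` column of the output
-- would name object_id_date and `Extra` would not list "Using filesort".
EXPLAIN
SELECT *
FROM History
WHERE object_id = 42
  AND `date` < 1700000000
ORDER BY `date` DESC
LIMIT 10;
```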

Related

How can I optimize this further?

My table has approx. 121,246,211 rows. The records are simple page impression information.
Here is the schema:
create table stat_page
(
id int auto_increment primary key,
pageId int not null,
timestamp int not null
)
collate = utf8_unicode_ci;
create index pageIdIndex
on stat_page (pageId);
create index timestampIndex
on stat_page (timestamp);
This query takes 15 seconds:
select count(*)
from stat_page
where `timestamp` > 1543622400;
This query takes nearly 7 minutes:
select count(*)
from stat_page
where `timestamp` > 1543622400
and pageId = 87;
I thought I indexed the right things; is the table just too large? Does anyone have a suggestion as to how to get information from this table faster?
The following index will improve the performance of that query:
create index ix1 on stat_page (pageId, timestamp);
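The query benefits from this composite index: MySQL can seek directly to the entries for pageId = 87 and scan only those with a timestamp above the bound, instead of picking one of the single-column indexes and checking the other condition row by row. Because count(*) needs no other columns, it can be answered from the index alone.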
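As a follow-up sketch, the plan can be checked for an index-only read, and the older single-column index on pageId can be reconsidered. Dropping pageIdIndex is a suggestion based on the general leftmost-prefix rule, not something measured on this table, so treat it as optional.

```sql
-- With ix1 in place, `key` would typically show ix1 and `Extra`
-- would typically include "Using index" (index-only read).
EXPLAIN
SELECT COUNT(*)
FROM stat_page
WHERE pageId = 87
  AND `timestamp` > 1543622400;

-- Optional: an index on (pageId, timestamp) also serves lookups on
-- pageId alone, so the single-column index is usually redundant.
DROP INDEX pageIdIndex ON stat_page;
```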

MYSQL Select filtered on one column, use index of other column

I have a table looking something like:
CREATE TABLE myTable (
id INT NOT NULL AUTO_INCREMENT,
atTime TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
stuff varchar(100) NULL,
CONSTRAINT myTable_PK PRIMARY KEY (id)
)
ENGINE=InnoDB
DEFAULT CHARSET=latin1
COLLATE=latin1_swedish_ci ;
The atTime column is not indexed.
The id column is always in chronological order.
I'd like to select all rows where id > x and atTime < y.
Is there any way of doing this without doing a full table scan for each select and without adding an index to atTime?
Essentially, what I want is to tell MySQL to rely on id being chronological to optimize the query.
Edit:
I figured out one way of doing it using a subquery, but it only made the query about 30% faster:
SELECT * from myTable
WHERE id<( SELECT id FROM myTable
WHERE atTime>'y'
ORDER BY id LIMIT 1
)
AND id>x
By themselves these 2 queries are near instantaneous, but together they take quite some time. Why could that be?
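Since each part is fast on its own, one thing to try is running them as two separate statements and passing the boundary id through a user variable. This is a minimal sketch, not a tested fix: the variable name @upper_id and the literal values are placeholders for x and y, and it relies on the stated assumption that id is strictly chronological.

```sql
-- Step 1: find the first id whose atTime is past the cutoff.
-- '2020-01-01 00:00:00' is a placeholder for y.
SET @upper_id = (
    SELECT id
    FROM myTable
    WHERE atTime > '2020-01-01 00:00:00'
    ORDER BY id
    LIMIT 1
);

-- Step 2: a plain primary-key range scan. 1000 is a placeholder for x.
-- If step 1 found no row, @upper_id is NULL and this returns nothing,
-- so the case where every row is older than y needs separate handling.
SELECT *
FROM myTable
WHERE id > 1000
  AND id < @upper_id;
```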

RANK conversion of MS SQL to MYSQL

I am converting our project database from SQL Server to MySQL; the DB conversion has already been done.
We have the code below to identify duplicate records based on hashcode and mark them as duplicates.
The approach described in "Rank function in MySQL" produces a rank based on age that starts at 1 and increments by 1 for each record. In my case, the rank should start from 1 for each hashcode and increment by 1 for records with the same hashcode; when a new hashcode appears, the rank should start from 1 again.
update table set Duplicate=1
WHERE id IN
( SELECT id FROM (
select RANK() OVER (PARTITION BY Hashcode ORDER BY Date asc) R,*
from table )A where R!=1 )
Below is the table structure:
CREATE TABLE TBL (
id int(11) NOT NULL AUTO_INCREMENT,
FileName varchar(100) DEFAULT NULL,
date datetime DEFAULT NULL,
hashcode varchar(255) DEFAULT NULL,
FileSize varchar(25) DEFAULT NULL,
IsDuplicate bit(1) DEFAULT NULL,
IsActive bit(1) DEFAULT NULL,
PRIMARY KEY (`id`)
)
Please help me to migrate this code to MYSQL.
You don't need to use enumeration for this logic. You just want to set the duplicate flag on everything that is not the minimum date for the hashcode:
update table t join
(select hashcode, min(date) as mindate
from table t
group by hashcode
) tt
on t.hashcode = tt.hashcode and t.date > tt.mindate
set t.Duplicate = 1;
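On MySQL 8.0 and later, window functions are available, so the original ranking logic can also be carried over more directly. The following is a minimal sketch, assuming MySQL 8.0+ and the TBL definition from the question (id, hashcode, date, IsDuplicate). It deliberately uses ROW_NUMBER() with id as a tie-breaker instead of RANK(), so that rows sharing the earliest date are not all kept as originals.

```sql
-- Requires MySQL 8.0+ (window functions).
-- The derived table numbers the rows within each hashcode, oldest first;
-- every row after the first one is flagged as a duplicate.
UPDATE TBL t
JOIN (
    SELECT id,
           ROW_NUMBER() OVER (
               PARTITION BY hashcode
               ORDER BY `date` ASC, id ASC
           ) AS rn
    FROM TBL
) r ON r.id = t.id
SET t.IsDuplicate = 1
WHERE r.rn > 1;
```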
MySQL features a rather unique way to delete duplicates:
alter ignore table YourTable
add unique index ux_yourtable_hashcode (hashcode);
The trick here is the IGNORE option, which is only available in older versions (ALTER IGNORE TABLE was removed in MySQL 5.7):
If IGNORE is specified, only one row is used of rows with duplicates
on a unique key. The other conflicting rows are deleted.
But there are also other ways. Based on your comment, there is an auto_increment column called id. Since this column is unique and not null, you can use it to distinguish duplicates. You'd need a temporary table to work around the "You can't specify target table 'TBL' for update in FROM clause" error:
create temporary table tmp_originals (id int);
insert tmp_originals
(id)
select min(id)
from YourTable
group by
hashcode;
update YourTable
set Duplicate = 1
where id not in (select id from tmp_originals);
The group by query selects the lowest id per group of rows with the same hashcode.

MySQL indexes to SQL Server

I have the following script that I want to convert to SQL Server, but how?
MySQL:
CREATE TABLE mytable (
id INTEGER UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
uri VARBINARY(1000),
INDEX(uri(100))
);
What does this index look like in MSSQL?
CREATE TABLE mytable (
id INTEGER NOT NULL PRIMARY KEY IDENTITY(1,1),
uri VARBINARY(1000),
--INDEX ???
);
SQL Server versions before 2014 don't have an inline index creation clause in their CREATE TABLE syntax. However, you can create the index afterwards:
CREATE TABLE mytable (
id INTEGER NOT NULL PRIMARY KEY IDENTITY(1,1),
uri VARBINARY(1000),
);
CREATE INDEX my_table_uri_ind ON mytable(uri);
EDIT:
To address the comment below, you can use computed columns to gain the effect of indexing only part of your uri:
CREATE TABLE mytable (
id INTEGER NOT NULL PRIMARY KEY IDENTITY(1,1),
uri VARBINARY(1000),
uri100 AS SUBSTRING (uri, 1, 100)
);
CREATE INDEX my_table_uri_ind ON mytable(uri100);

Database search with multi joins

I have a MySQL database and I want to perform a somewhat larger search.
I have about 10k records in one of the tables, and it's expected to grow, but slowly.
The biggest problem is that to perform the search I have to make a query with 4 JOINs, which I think causes the search to be slow.
Here is an example structure:
[table records]
id INT unsigned PRIMARY KEY auto_increment
description text
label INT unsigned
type INT unsigned
price DECIMAL
[table records_labels]
id INT unsigned PRIMARY KEY auto_increment
label varchar
[table records_types]
id INT unsigned PRIMARY KEY auto_increment
type varchar
[table records_serial]
id INT unsigned PRIMARY KEY auto_increment
serial varchar
record INT unsigned
[table records_barcode]
id INT unsigned PRIMARY KEY auto_increment
barcode varchar
record INT unsigned
Here is how things run:
I run a query which selects records.id, records.description, records.price, records_labels.label, records_types.type, records_serial.serial, records_barcode.barcode.
The full query looks like this:
SELECT records.id, records.description, records.price, records_labels.label, records_types.type, records_serial.serial, records_barcode.barcode
FROM records
JOIN records_labels ON records_labels.id = records.label
JOIN records_types ON records_types.id = records.type
LEFT JOIN records_serial ON records_serial.record = records.id
LEFT JOIN records_barcode ON records_barcode.record = records.id
WHERE records_serial.serial LIKE '%SEARCH_TERM%'
OR records_barcode.barcode LIKE '%SEARCH_TERM%'
I think the solution here is indexing, but I'm not very familiar with it.
In short, how can I speed up and optimize a query of this kind?
Indexing records (optional, but recommended):
CREATE INDEX ilabel ON records (`label`);
CREATE INDEX itype ON records (`type`);
Indexing the join columns of records_serial and records_barcode (records_labels and records_types are joined on their primary keys, which are already indexed):
CREATE INDEX irecord ON records_serial (`record`);
CREATE INDEX irecord ON records_barcode (`record`);
the search
SELECT r.id, r.description, r.price, rl.label,
rt.`type`, rs.`serial`, rb.barcode
FROM records r
INNER JOIN records_labels rl ON rl.id = r.label
INNER JOIN records_types rt ON rt.id = r.`type`
-- the serial and barcode tables still have to be joined so their columns can be selected
LEFT JOIN records_serial rs ON rs.record = r.id
LEFT JOIN records_barcode rb ON rb.record = r.id
WHERE
r.id IN (
SELECT s.record
FROM records_serial s
WHERE s.`serial` LIKE '%SEARCH_TERM%'
)
OR
r.id IN (
SELECT b.record
FROM records_barcode b
WHERE b.barcode LIKE '%SEARCH_TERM%'
);
There is not much I can do for your WHERE clause: a pattern with a leading wildcard, such as LIKE '%SEARCH_TERM%', cannot use an index, so every row has to be checked. If you are willing to change it to a prefix search such as LIKE 'SEARCH_TERM%', you could create the indexes below:
CREATE INDEX iserial ON records_serial (`serial`(10));
CREATE INDEX ibarcode ON records_barcode (`barcode`(10));
It could be improved even more, but with these changes I believe you will achieve what you are looking for. ;-)
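One possible further step, if the prefix form of the search is acceptable, is to replace the OR across two tables with a UNION so that each branch can use its own index. This is a sketch under assumptions: 'ABC' is a placeholder search term, the alias m is made up for illustration, and the iserial and ibarcode indexes above are assumed to exist. It returns one row per matching record rather than one row per serial/barcode combination.

```sql
-- Each branch is a prefix search, so it can use iserial / ibarcode.
-- UNION removes duplicate record ids found through both tables.
SELECT r.id, r.description, r.price, rl.label, rt.`type`
FROM (
    SELECT rs.record
    FROM records_serial rs
    WHERE rs.`serial` LIKE 'ABC%'
    UNION
    SELECT rb.record
    FROM records_barcode rb
    WHERE rb.barcode LIKE 'ABC%'
) m
INNER JOIN records r ON r.id = m.record
INNER JOIN records_labels rl ON rl.id = r.label
INNER JOIN records_types rt ON rt.id = r.`type`;
```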