Database search with multi joins - mysql

I have a MySQL database and I want to run a fairly involved search.
One of the tables has about 10k records and is expected to grow, but slowly.
The biggest problem is that the search needs a query with 4 JOINs, which I think is what makes it slow.
Here is an example of the structure:
[table records]
id INT unsigned PRIMARY KEY auto_increment
description text
label INT unsigned
type INT unsigned
price DECIMAL
[table records_labels]
id INT unsigned PRIMARY KEY auto_increment
label varchar
[table records_types]
id INT unsigned PRIMARY KEY auto_increment
type varchar
[table records_serial]
id INT unsigned PRIMARY KEY auto_increment
serial varchar
record INT unsigned
[table records_barcode]
id INT unsigned PRIMARY KEY auto_increment
barcode varchar
record INT unsigned
Here is how it works:
I run a query that selects records.id, records.description, records.price, records_labels.label, records_types.type, records_serial.serial and records_barcode.barcode.
The full query looks like this:
SELECT records.id, records.description, records.price, records_labels.label, records_types.type, records_serial.serial, records_barcode.barcode
FROM records
JOIN records_labels ON records_labels.id = records.label
JOIN records_types ON records_types.id = records.type
LEFT JOIN records_serial ON records_serial.record = records.id
LEFT JOIN records_barcode ON records_barcode.record = records.id
WHERE records_serial.serial LIKE '%SEARCH_TERM%' OR records_barcode.barcode LIKE '%SEARCH_TERM%'
I think the solution here is indexing, but I'm not very familiar with it.
In short, how can I speed up and optimize a query of this kind?

Indexing records (optional, but recommended)
CREATE INDEX ilabel ON records (`label`);
CREATE INDEX itype ON records (`type`);
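Fixing records_labels
-- Caution: in the question's schema records_labels.label is a varchar holding the label text.
-- Only convert it if the column really stores numeric ids; otherwise skip the ALTER.
ALTER TABLE records_labels MODIFY label INT(10) UNSIGNED NULL;
CREATE INDEX ilabel ON records_labels (`label`);
Fixing records_types
-- Same caution applies to records_types.`type`.
ALTER TABLE records_types MODIFY `type` INT(10) UNSIGNED NULL;
CREATE INDEX itype ON records_types (`type`);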
The search
SELECT r.id, r.description, r.price, rl.label,
rt.`type`, rs.`serial`, rb.barcode
FROM records r
INNER JOIN records_labels rl ON rl.id = r.label
INNER JOIN records_types rt ON rt.id = r.`type`
-- the serial and barcode tables must be joined for their columns to be selectable
LEFT JOIN records_serial rs ON rs.record = r.id
LEFT JOIN records_barcode rb ON rb.record = r.id
WHERE
r.id IN (
SELECT s.record
FROM records_serial s
WHERE s.`serial` LIKE '%SEARCH_TERM%'
)
OR
r.id IN (
SELECT b.record
FROM records_barcode b
WHERE b.barcode LIKE '%SEARCH_TERM%'
);
There isn't much I can do for your WHERE clause. LIKE '%...%' kills any chance of using an index, because of the leading wildcard. If you are willing to change it to something like LIKE 'SEARCH_TERM%', then you could create the indexes below:
CREATE INDEX iserial ON records_serial (`serial`(10));
CREATE INDEX ibarcode ON records_barcode (`barcode`(10));
It could be improved even more, but with these changes I believe you will achieve what you are looking for. ;-)
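To illustrate why a prefix pattern can use an index while a pattern with a leading wildcard cannot, here is a minimal conceptual sketch in Python. It is not how MySQL is implemented: a sorted list stands in for the ordered index, and the function names and sample serials are made up for the example.

```python
import bisect

def prefix_search(sorted_keys, prefix):
    """Range lookup: keys sharing a prefix sit next to each other in sorted order."""
    lo = bisect.bisect_left(sorted_keys, prefix)
    # "\uffff" is used as an upper bound for any character following the prefix
    hi = bisect.bisect_left(sorted_keys, prefix + "\uffff")
    return sorted_keys[lo:hi]

def substring_search(keys, term):
    """Matches can be anywhere, so ordering does not help: every key is inspected."""
    return [k for k in keys if term in k]

# Hypothetical serial numbers, kept sorted like an index would be
serials = sorted(["AB100", "AB200", "XAB10", "ZZ999", "AB150"])

print(prefix_search(serials, "AB"))     # like LIKE 'AB%'
print(substring_search(serials, "AB"))  # like LIKE '%AB%'
```

The prefix lookup only touches the slice of adjacent keys, while the substring lookup has to walk the whole list, which is roughly the difference between a range scan and a full scan.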

Related

Mysql limit joined table results by previous table column

I have three tables: test_composite, tests, questions. TestComposite belongs to Test. Test has many Questions.
create table test_composite
(
id bigint unsigned auto_increment
primary key,
id_include_test bigint unsigned not null,
questions_quantity smallint unsigned not null
);
create table tests
(
id bigint unsigned auto_increment
primary key,
name varchar(128) not null
);
create table questions
(
id bigint unsigned auto_increment
primary key,
question text not null,
test_id bigint unsigned not null
);
The question is how to make this query work:
select questions.*
from test_composite
inner join tests on test_composite.id_include_test = tests.id
inner join questions on questions.id in (
select id
from questions
where tests.id = questions.test_id limit test_composite.questions_quantity
);
This query is meant to join test_composite and tests, and then join questions with a per-row limit. So it all comes down to using a value from a previously joined table inside the subquery of the current join.
The problem is that limit test_composite.questions_quantity isn't allowed.
Is there any way to do it?
PS. MySQL version 5.6.
I know a little about lateral joins, but they are not supported in version 5.6.
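One possible workaround that avoids both LIMIT with a column value and lateral joins is to rank each question within its test using a correlated count and keep only the rows whose rank is within questions_quantity. The sketch below is only an illustration: it runs on Python's built-in sqlite3 as a stand-in for MySQL 5.6, the sample rows are invented, and it assumes "the first N questions by id" is an acceptable choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table test_composite (id integer primary key,
                             id_include_test integer not null,
                             questions_quantity integer not null);
create table tests (id integer primary key, name text not null);
create table questions (id integer primary key,
                        question text not null,
                        test_id integer not null);
-- invented sample data: test 1 should yield 2 questions, test 2 only 1
insert into tests values (1, 'math'), (2, 'history');
insert into test_composite values (1, 1, 2), (2, 2, 1);
insert into questions values (1, 'm1', 1), (2, 'm2', 1), (3, 'm3', 1),
                             (4, 'h1', 2), (5, 'h2', 2);
""")

LIMITED = """
select q.id, q.question, q.test_id
from test_composite tc
inner join tests t on tc.id_include_test = t.id
inner join questions q on q.test_id = t.id
where (
    -- rank of this question among the questions of the same test, ordered by id
    select count(*)
    from questions q2
    where q2.test_id = q.test_id and q2.id <= q.id
) <= tc.questions_quantity
order by q.test_id, q.id
"""

rows = conn.execute(LIMITED).fetchall()
print(rows)
```

The correlated count runs once per candidate row, so on larger tables an index on questions (test_id, id) would likely matter.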

Strange behaviour on MYSQL querying huge table

I'm trying to understand a strange performance behaviour happening in a MySQL data structure that I'm working on:
CREATE TABLE metric_values
(
dmm_id INT NOT NULL,
dtt_id BIGINT NOT NULL,
cus_id INT NOT NULL,
nod_id INT NOT NULL,
dca_id INT NULL,
value DOUBLE NOT NULL
)
ENGINE = InnoDB;
CREATE INDEX metric_values_dmm_id_index
ON metric_values (dmm_id);
CREATE INDEX metric_values_dtt_index
ON metric_values (dtt_id);
CREATE INDEX metric_values_cus_id_index
ON metric_values (cus_id);
CREATE INDEX metric_values_nod_id_index
ON metric_values (nod_id);
CREATE INDEX metric_values_dca_id_index
ON metric_values (dca_id);
CREATE TABLE dim_metric
(
dmm_id INT AUTO_INCREMENT
PRIMARY KEY,
met_id INT NOT NULL,
name VARCHAR(45) NOT NULL,
instance VARCHAR(45) NULL,
active BIT DEFAULT b'0' NOT NULL
)
ENGINE = InnoDB;
CREATE INDEX dim_metric_dmm_id_met_id_index
ON dim_metric (dmm_id, met_id);
CREATE INDEX dim_metric_met_id_index
ON dim_metric (met_id);
CONTEXT:
metric_values has close to 100 million rows and dim_metric has 1024 rows.
I'm doing a simple JOIN between these 2 tables and I'm having huge performance issues. While trying to figure out what the problem is, I stumbled on this strange behaviour.
I can't execute the JOIN using the column met_id as a filter. I left it running for 10 minutes and lost the connection to the database due to a timeout before I got any results back.
Running an EXPLAIN on the query, I can see that the indexes are being used correctly (I assume) and only 1052 rows are being scanned from metric_values.
EXPLAIN
SELECT
count(0)
FROM metric_values v
INNER JOIN dim_metric m ON m.dmm_id = v.dmm_id
WHERE 1=1
AND m.met_id = 1;
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
1 | SIMPLE | m | ref | PRIMARY,dim_metric_met_id_index,dim_metric_dmm_id_met_id_index | dim_metric_met_id_index | 4 | const | 1 | Using index
1 | SIMPLE | v | ref | metric_values_dmm_id_index | metric_values_dmm_id_index | 4 | oi_fact.m.dmm_id | 1052 | Using index
Making a simple change to the query to use a subselect instead of a JOIN, I get the results after ~45 seconds.
Running an EXPLAIN on the modified query, I can see that the index is not the primary resource being used to fetch the data and that almost 20 million rows were scanned to produce the result.
EXPLAIN
SELECT
count(0)
FROM metric_values v
WHERE 1=1
AND v.dmm_id = (SELECT m.dmm_id FROM dim_metric m WHERE m.met_id = 1);
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
1 | PRIMARY | v | ref | metrics_values_dmm_id_index | metrics_values_dmm_id_index | 4 | const | 19589800 | Using where; Using index
2 | SUBQUERY | m | ref | dim_metric_met_id_index | dim_metric_met_id_index | 4 | const | 1 | Using index
Can someone explain to me what is happening? Did I misunderstand what the EXPLAIN is telling me? Can I make changes to the data model to improve the query performance?

How do I get a row of the same type from one table or another table along with the information about from which table it was

Let's say I have tables:
create table people (
human_id bigint auto_increment primary key,
birthday datetime );
create table students (
id bigint auto_increment primary key,
human_id bigint unique key not null,
group_id bigint not null );
create table teachers (
id bigint auto_increment primary key,
human_id bigint unique key not null,
academic_degree varchar(20) );
create table library_access (
access_id bigint auto_increment primary key,
human_id bigint not null,
accessed_on datetime );
Now I want to display information about a library access, along with whether it was made by a student or a teacher (and the id from the corresponding table), in an idiomatic way. Let's say I want something like SELECT access_id,id,true_if_student_false_if_teacher FROM library_access.
How do I form the query (in case such a database is already deployed), and what are better and more idiomatic ways to solve this problem (in case it has not been deployed yet)?
MariaDB 5.5, database accessed by Go and nothing else.
Thanks in advance.
You said you need to know which table the data comes from. You can use union all for this:
select la.access_id, s.id, 'Students' as source_table
from library_access la
join students s on la.human_id = s.human_id
union all
select la.access_id, t.id, 'Teachers' as source_table
from library_access la
join teachers t on la.human_id = t.human_id
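As a quick check of the UNION ALL approach, here is a small sketch that returns a boolean-style flag instead of a table name, closer to the true_if_student_false_if_teacher column asked for. It uses Python's built-in sqlite3 as a stand-in for MariaDB, and the sample rows and the is_student column name are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table students (id integer primary key,
                       human_id integer unique not null,
                       group_id integer not null);
create table teachers (id integer primary key,
                       human_id integer unique not null,
                       academic_degree text);
create table library_access (access_id integer primary key,
                             human_id integer not null,
                             accessed_on text);
-- invented sample data: human 1 is a student, human 2 is a teacher
insert into students values (10, 1, 7);
insert into teachers values (20, 2, 'PhD');
insert into library_access values (100, 1, '2020-01-01'),
                                  (101, 2, '2020-01-02'),
                                  (102, 1, '2020-01-03');
""")

QUERY = """
select la.access_id, s.id as id, 1 as is_student
from library_access la
join students s on la.human_id = s.human_id
union all
select la.access_id, t.id as id, 0 as is_student
from library_access la
join teachers t on la.human_id = t.human_id
order by 1
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

Note that an access by someone who is neither a student nor a teacher does not appear in the result at all, and someone who is both would appear twice.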
Without looking at your tables or any idea as to what you want returned in the select statement:
SELECT *
FROM people a,
students b,
teachers c,
library_access d
WHERE a.human_id = b.human_id
AND a.human_id = c.human_id
AND a.human_id = d.human_id

It is too slow to execute a query that gets data from another table which has a foreign key and not exists in

My select query is too slow.
blog_data table
About 2,000,000 rows.
Field Type
no bigint(20) primary key, auto_increment
title text
body text
tags text
url varchar(200)
date datetime
ngram_relation table
Field Type
no int primary key, auto_increment
blogId bigint(20)
term varchar(200)
frequency bigint(20)
TF float
IDF float
weight float
Ns int(11)
primary key(no),
unique key(blogid, term),
foreign key(blogId) references blog_data(no)
I want to get the blog_data.no values that are not in the ngram_relation table. So I execute the query below.
select no, title, body, tags, url
from blog_data where not exists (
select blogid as gg
from ngram_relation
group by blogid
having blog_data.no=gg
) limit 0, 10000
The first execution went well. After it, the ngram_relation table had about 260,000 rows.
But the second execution did not work. It just locks up.
How should I modify my query?
Use a left outer join to keep the records that exist in blog_data but not in ngram_relation. This should be faster.
Something like this (may not be the exact code):
select b.no, b.title, b.body, b.tags, b.url
from blog_data b
left outer join ngram_relation n
on n.blogId = b.no
where n.blogId is null
limit 10000
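The left-join anti-join pattern can be checked on a tiny data set. The sketch below runs it next to an equivalent NOT EXISTS without the GROUP BY / HAVING, using Python's built-in sqlite3 as a stand-in for MySQL. The tables are trimmed to the columns that matter, the sample rows are invented, and the join is on ngram_relation.blogId because that is the column the question declares as the foreign key to blog_data.no.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table blog_data (no integer primary key, title text);
create table ngram_relation (no integer primary key,
                             blogId integer references blog_data(no),
                             term text,
                             unique (blogId, term));
-- invented sample data: blog 2 has no ngram rows yet
insert into blog_data values (1, 'a'), (2, 'b'), (3, 'c');
insert into ngram_relation (blogId, term) values (1, 'x'), (1, 'y'), (3, 'z');
""")

LEFT_JOIN = """
select b.no
from blog_data b
left outer join ngram_relation n on n.blogId = b.no
where n.blogId is null
order by b.no
limit 10000
"""

NOT_EXISTS = """
select b.no
from blog_data b
where not exists (select 1 from ngram_relation n where n.blogId = b.no)
order by b.no
limit 10000
"""

left_rows = conn.execute(LEFT_JOIN).fetchall()
exists_rows = conn.execute(NOT_EXISTS).fetchall()
print(left_rows, exists_rows)
```

Both forms rely on the lookup by blogId being cheap; in the question's schema the unique key (blogid, term) already starts with that column.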

Running SQL queries with JOINs on large datasets

I'm new to using MySQL.
I'm trying to run an inner join query between a table of 80,000 records (this is table B) and a 40GB data set with approx 600 million records (this is table A).
Is MySQL suitable for running this sort of query?
What sort of time should I expect it to take?
The code I tried is below. However, it failed because my database connection dropped at 60000 secs.
set net_read_timeout = 36000;
INSERT
INTO C
SELECT A.id, A.link_id, link_ref, network,
date_1, time_per,
veh_cls, data_source, N, av_jt
from A
inner join B
on A.link_id = B.link_id;
I'm starting to look into ways of cutting the 40GB table down into a temp table, to try to make the query more manageable. But I keep getting
Error Code: 1206. The total number of locks exceeds the lock table size 646.953 sec
Am I on the right track?
cheers!
My code for splitting the table is:
LOCK TABLES D WRITE, A READ;
INSERT
INTO D
SELECT A.id, A.link_id, A.time_per, A.av_jt
from A
where A.time_per = 34 and A.veh_cls = 1;
UNLOCK TABLES;
Perhaps my table indices are incorrect; all I have is a simple primary key:
CREATE Table A
(
id int unsigned Not Null auto_increment,
link_id varchar(255) not Null,
link_ref int not Null,
network int not Null,
date_1 varchar(255) not Null,
#date_2 time default Null,
time_per int not null,
veh_cls int not null,
data_source int not null,
N int not null,
av_jt int not null,
sum_squ_jt int not null,
Primary Key (id)
);
Drop table if exists B;
CREATE Table B
(
id int unsigned Not Null auto_increment,
TOID varchar(255) not Null,
link_id varchar(255) not Null,
ABnode varchar(255) not Null,
#date_2 time not Null,
Primary Key (id)
);
In terms of the schema, it is just these two tables (A and B) loaded in one database.
I believe the answer has already been given in this post: The total number of locks exceeds the lock table size
i.e. use a table lock to avoid InnoDB's default row-by-row locking.
Thanks for your help.
Indexing seems to have solved the problem. I managed to reduce the query time from 700 secs to approx 0.2 secs per record by indexing on:
A.link_id
i.e. from
from A
inner join B
on A.link_id = B.link_id;
I found this really useful post, very helpful for a newbie like myself:
http://hackmysql.com/case4
The code used to create the index was:
CREATE INDEX linkid_index ON A(link_id);
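For anyone wanting to try the effect on a small scale, here is a minimal sketch that builds cut-down versions of A and B, creates the same index, runs the join and prints the query plan. It uses Python's built-in sqlite3 as a stand-in, so the plan output is SQLite's rather than MySQL's EXPLAIN, and the sample rows and the reduced column list are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table A (id integer primary key,
                link_id text not null,
                av_jt integer not null);
create table B (id integer primary key,
                link_id text not null);
-- invented sample data
insert into A (link_id, av_jt) values ('L1', 10), ('L2', 20), ('L3', 30), ('L1', 40);
insert into B (link_id) values ('L1'), ('L3');
-- same index as in the post, on the join column of the big table
create index linkid_index on A(link_id);
""")

JOIN = """
select A.id, A.link_id, A.av_jt
from A
inner join B on A.link_id = B.link_id
order by A.id
"""

rows = conn.execute(JOIN).fetchall()
# the last column of each EXPLAIN QUERY PLAN row is a human-readable description
plan = [r[-1] for r in conn.execute("explain query plan " + JOIN)]
index_names = [r[0] for r in conn.execute(
    "select name from sqlite_master where type = 'index' and tbl_name = 'A'")]

print(rows)
print(plan)
```

On MySQL the equivalent check would be to run EXPLAIN on the join before and after creating the index and compare the key and rows columns.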