Using an OR clause in my INNER JOIN - mysql

I have the following two tables,
CREATE TABLE logins (
id MEDIUMINT NOT NULL AUTO_INCREMENT,
user_id_1 INT NOT NULL,
user_id_2 INT DEFAULT 0,
user_id_3 INT DEFAULT 0,
PRIMARY KEY (id)
) ENGINE=MyISAM;
CREATE TABLE user_data (
user_id INT NOT NULL,
day DATE NOT NULL,
PRIMARY KEY (`user_id`, `day`)
) ENGINE=MyISAM;
This schema could use a refactor, but I've inherited it and have to write a query now that does a JOIN with both logins and user_data. I need to select all the rows in user_data that have a > 0 value for one of the three user_id_? keys.
I'm not entirely sure how to compile this query, was thinking something along the lines of:
SELECT logins.user_id_1, logins.user_id_2, logins.user_id_3, user_data.day
FROM logins
INNER JOIN user_data
ON (logins.user_id_1 = user_data.user_id OR ??)
What's the best way to query for this where I will retrieve up to 3 rows, one for each user_id_?

OR is allowed. Another option is to left join to the user_data table three times, and check to see if any of them came back. You might want to try both and see which performs better. My guess is they will be about the same, but I'm not deeply familiar with the MySQL plan generator.
SELECT
l.user_id_1
,l.user_id_2
,l.user_id_3
-- consider: what if there is a match in more
-- than one table? what do you want to happen?
,case when ud1.day is not null then ud1.day
when ud2.day is not null then ud2.day
when ud3.day is not null then ud3.day
else null
end as day
FROM
logins l
left JOIN user_data ud1 on ud1.user_id = l.user_id_1
left join user_data ud2 on ud2.user_id = l.user_id_2
left join user_data ud3 on ud3.user_id = l.user_id_3
where ud1.user_id is not null
or ud2.user_id is not null
or ud3.user_id is not null

You can use OR.
SELECT logins.user_id_1, logins.user_id_2, logins.user_id_3, user_data.day
FROM logins
INNER JOIN user_data
ON logins.user_id_1 = user_data.user_id OR
logins.user_id_2 = user_data.user_id OR
logins.user_id_3 = user_data.user_id
This syntax returns every row that matches at least one of the conditions.

You could try this:
SELECT logins.user_id_1, logins.user_id_2, logins.user_id_3, user_data.day
FROM logins,user_data
WHERE user_data.user_id in (logins.user_id_1,logins.user_id_2,logins.user_id_3)
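As a rough check that the IN form and the chain of OR conditions select the same rows, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, a simplified copy of the two tables, and invented sample data; the aliases l and d are not from the question.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE logins (
    id INTEGER PRIMARY KEY,
    user_id_1 INT NOT NULL,
    user_id_2 INT DEFAULT 0,
    user_id_3 INT DEFAULT 0
);
CREATE TABLE user_data (
    user_id INT NOT NULL,
    day DATE NOT NULL,
    PRIMARY KEY (user_id, day)
);
-- invented sample rows
INSERT INTO logins (user_id_1, user_id_2, user_id_3) VALUES (10, 20, 0), (30, 0, 0);
INSERT INTO user_data VALUES
    (10, '2020-01-01'), (20, '2020-01-02'), (30, '2020-01-03'), (40, '2020-01-04');
""")

# Join condition written as a chain of ORs.
or_join = """
SELECT l.id, d.user_id, d.day
FROM logins l
INNER JOIN user_data d
   ON l.user_id_1 = d.user_id
   OR l.user_id_2 = d.user_id
   OR l.user_id_3 = d.user_id
ORDER BY l.id, d.user_id
"""

# The same condition written with IN.
in_join = """
SELECT l.id, d.user_id, d.day
FROM logins l
INNER JOIN user_data d
   ON d.user_id IN (l.user_id_1, l.user_id_2, l.user_id_3)
ORDER BY l.id, d.user_id
"""

or_rows = conn.execute(or_join).fetchall()
in_rows = conn.execute(in_join).fetchall()
print(or_rows == in_rows)
```

Because the primary key of user_data is (user_id, day), a user with several days yields several rows, so "up to 3 rows" per login only holds when each user has a single day.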

An ON clause accepts any combination of conditions. OR, AND, and the other boolean operators work exactly as they do in a WHERE clause:
SELECT logins.user_id_1, logins.user_id_2, logins.user_id_3, user_data.day
FROM logins
INNER JOIN user_data
ON logins.user_id_1 = user_data.user_id OR
logins.user_id_2 = user_data.user_id OR
logins.user_id_3 = user_data.user_id
Or, if you don't need logins.user_id_1, logins.user_id_2, and logins.user_id_3 but only user_data.user_id:
SELECT user_data.user_id, user_data.day
FROM logins
INNER JOIN user_data
ON logins.user_id_1 = user_data.user_id OR
logins.user_id_2 = user_data.user_id OR
logins.user_id_3 = user_data.user_id

You could use a UNION. This returns multiple rows per logins row; I don't know whether you need that:
SELECT l.user_id, u.*
FROM user_data as u
INNER JOIN
(
select user_id_1 as user_id -- could add other columns...
from logins
union
select user_id_2 -- could add other columns...
from logins
where user_id_2 > 0
union
select user_id_3 -- could add other columns...
from logins
where user_id_3 > 0
) as l
ON l.user_id = u.user_id

Related

Inner query or multiple queries: which would result in better performance in MySQL?

Inner query:
select up.user_id, up.id as utility_pro_id from utility_pro as up
join utility_pro_zip_code as upz ON upz.utility_pro_id = up.id and upz.zip_code_id=1
where up.available_for_survey=1 and up.user_id not in (select bjr.user_id from book_job_request as bjr where
((1583821800000 between bjr.start_time and bjr.end_time) and (1583825400000 between bjr.start_time and bjr.end_time)))
Divided in two queries:
select up.user_id, up.id as utility_pro_id from utility_pro as up
join utility_pro_zip_code as upz ON upz.utility_pro_id = up.id and upz.zip_code_id=1
Select bjr.user_id as userId from book_job_request as bjr where bjr.user_id in :userIds and (:startTime between bjr.start_time and bjr.end_time) and (:endTime between bjr.start_time and bjr.end_time)
Note:
As I understand it, a single query using the inner query scans all the data in book_job_request, whereas with multiple queries only the rows for the specified user ids are checked.
Any other better option for the same operation other than these two is also appreciated.
I expect that the query is supposed to be more like this:
SELECT up.user_id
, up.id utility_pro_id
FROM utility_pro up
JOIN utility_pro_zip_code upz
ON upz.utility_pro_id = up.id
LEFT
JOIN book_job_request bjr
ON bjr.user_id = up.user_id
AND bjr.end_time >= 1583821800000
AND bjr.start_time <= 1583825400000
WHERE up.available_for_survey = 1
AND upz.zip_code_id = 1
AND bjr.user_id IS NULL
For further help with optimisation (i.e. which indexes to provide) we'd need SHOW CREATE TABLE statements for all relevant tables as well as the EXPLAIN for the above
Another possibility:
SELECT up.user_id , up.id utility_pro_id
FROM utility_pro up
JOIN utility_pro_zip_code upz ON upz.utility_pro_id = up.id
WHERE up.available_for_survey = 1
AND upz.zip_code_id = 1
AND NOT EXISTS( SELECT 1 FROM book_job_request
WHERE user_id = up.user_id
AND end_time >= 1583821800000
AND start_time <= 1583825400000 )
Recommended indexes (for my NOT EXISTS and for Strawberry's LEFT JOIN):
book_job_request: (user_id, start_time, end_time)
upz: (zip_code_id, utility_pro_id)
up: (available_for_survey, user_id, id)
The column order given is important. And, no, the single-column indexes you currently have are not as good.
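To see that the LEFT JOIN ... IS NULL form and the NOT EXISTS form pick the same rows, here is a small sketch. It assumes SQLite (via Python's sqlite3) in place of MySQL, stripped-down tables, invented rows, and a made-up time window of 1000 to 2000; the parameter names win_start and win_end are hypothetical. Both queries use the overlap test from the answers (booking ends at or after the window start and starts at or before the window end).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.executescript("""
CREATE TABLE utility_pro (id INTEGER PRIMARY KEY, user_id INT, available_for_survey INT);
CREATE TABLE utility_pro_zip_code (utility_pro_id INT, zip_code_id INT);
CREATE TABLE book_job_request (user_id INT, start_time INT, end_time INT);
INSERT INTO utility_pro VALUES (1, 101, 1), (2, 102, 1), (3, 103, 0);
INSERT INTO utility_pro_zip_code VALUES (1, 1), (2, 1), (3, 1);
-- user 101 is booked during the window; user 102's booking ends before it starts
INSERT INTO book_job_request VALUES (101, 1500, 2500), (102, 100, 900);
""")

window = {"win_start": 1000, "win_end": 2000}  # invented window

# Anti-join written as LEFT JOIN ... IS NULL.
left_join_sql = """
SELECT up.user_id, up.id AS utility_pro_id
FROM utility_pro up
JOIN utility_pro_zip_code upz ON upz.utility_pro_id = up.id
LEFT JOIN book_job_request bjr
       ON bjr.user_id = up.user_id
      AND bjr.end_time >= :win_start
      AND bjr.start_time <= :win_end
WHERE up.available_for_survey = 1
  AND upz.zip_code_id = 1
  AND bjr.user_id IS NULL
ORDER BY up.id
"""

# The same anti-join written with NOT EXISTS.
not_exists_sql = """
SELECT up.user_id, up.id AS utility_pro_id
FROM utility_pro up
JOIN utility_pro_zip_code upz ON upz.utility_pro_id = up.id
WHERE up.available_for_survey = 1
  AND upz.zip_code_id = 1
  AND NOT EXISTS (SELECT 1 FROM book_job_request
                  WHERE user_id = up.user_id
                    AND end_time >= :win_start
                    AND start_time <= :win_end)
ORDER BY up.id
"""

left_rows = conn.execute(left_join_sql, window).fetchall()
exists_rows = conn.execute(not_exists_sql, window).fetchall()
print(left_rows, exists_rows)
```

Only the unbooked, available pro (user 102) should come back from either query. Which form runs faster on a real MySQL server is something EXPLAIN on the actual tables would have to show.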

JOIN query taking a long time and causing the issue "converting HEAP to MyISAM"

My query is below. I use a JOIN to fetch the data. Can you please suggest how I can solve the "converting HEAP to MyISAM" issue?
Can I use a subquery here instead? If so, please suggest how.
I join the users table to check whether the user exists. Can I rewrite the query without that join so that the "converting HEAP to MyISAM" issue goes away?
One more thing: sometimes I will not filter on a specific user_id, as I do here with user_id = 16082.
SELECT `user_point_logs`.`id`,
`user_point_logs`.`user_id`,
`user_point_logs`.`point_get_id`,
`user_point_logs`.`point`,
`user_point_logs`.`expire_date`,
`user_point_logs`.`point` AS `sum_point`,
IF(sum(`user_point_used_logs`.`point`) IS NULL, 0, sum(`user_point_used_logs`.`point`)) AS `minus`
FROM `user_point_logs`
JOIN `users` ON ( `users`.`id` = `user_point_logs`.`user_id` )
LEFT JOIN (SELECT *
FROM user_point_used_logs
WHERE user_point_log_id NOT IN (
SELECT DISTINCT return_id
FROM user_point_logs
WHERE return_id IS NOT NULL
AND user_id = 16082
)
)
AS user_point_used_logs
ON ( `user_point_logs`.`id` = `user_point_used_logs`.`user_point_log_used_id` )
WHERE expire_date >= 1563980400
AND `user_point_logs`.`point` >= 0
AND `users`.`id` IS NOT NULL
AND ( `user_point_logs`.`return_id` = 0
OR `user_point_logs`.`return_id` IS NULL )
AND `user_point_logs`.`user_id` = '16082'
GROUP BY `user_point_logs`.`id`
ORDER BY `user_point_logs`.`expire_date` ASC
DB FIDDLE HERE WITH STRUCTURE
Please try this. If it works, we can optimize further by adding a composite index.
SELECT
upl.id,
upl.user_id,
upl.point_get_id,
upl.point,
upl.expire_date,
upl.point AS sum_point,
coalesce(SUM(upul.point),0) AS minus -- changed from complex to readable
FROM user_point_logs upl
JOIN users u ON upl.user_id = u.id
LEFT JOIN (select supul.user_point_log_used_id, supul.point from user_point_used_logs supul
left join user_point_logs supl on supul.user_point_log_id=supl.return_id and supl.user_id = 16082
where supl.return_id is null) AS upul
ON upl.id=upul.user_point_log_used_id
WHERE
upl.user_id = 16082 and coalesce(upl.return_id,0)= 0
and upl.expire_date >= 1563980400 -- tip: if its unix timestamp change the datatype and if possible use range between dates
#AND upl.point >= 0 -- since its NN by default removing this condition
#AND u.id IS NOT NULL -- removed since the inner join matches not null
GROUP BY upl.id
ORDER BY upl.expire_date ASC;
Edit:
Try adding an index on the return_id column of the user_point_logs table, since this column is used in the join inside the derived table.
Alternatively, use a composite index on user_id and return_id.
Indexes:
user_point_logs: (user_id, expire_date)
user_point_logs: (user_id, return_id)
OR is hard to optimize. Decide on only one way to say whatever is being said here, then get rid of the OR:
AND ( `user_point_logs`.`return_id` = 0
OR `user_point_logs`.`return_id` IS NULL )
DISTINCT is redundant:
NOT IN ( SELECT DISTINCT ... )
Change
IF(sum(`user_point_used_logs`.`point`) IS NULL, 0,
sum(`user_point_used_logs`.`point`)) AS `minus`
to
COALESCE( ( SELECT SUM(point) FROM user_point_used_logs ... ), 0) AS minus
and toss LEFT JOIN (SELECT * FROM user_point_used_logs ... )
Since a PRIMARY KEY is a key, the second of these is redundant and can be DROPped:
ADD PRIMARY KEY (`id`),
ADD KEY `id` (`id`) USING BTREE;
After all that, we may need another pass to further simplify and optimize it.
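The COALESCE-with-subquery suggestion above can be sketched as follows. This is only an illustration under assumptions: SQLite (via Python's sqlite3) stands in for MySQL, the tables are cut down to the columns needed, the rows are invented, and the NOT IN filter on return_id is left out for brevity. The alias used is not from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.executescript("""
CREATE TABLE user_point_logs (id INTEGER PRIMARY KEY, user_id INT, point INT);
CREATE TABLE user_point_used_logs (
    id INTEGER PRIMARY KEY,
    user_point_log_used_id INT,
    point INT
);
-- invented rows: log 1 has two usages, log 2 has none
INSERT INTO user_point_logs VALUES (1, 16082, 100), (2, 16082, 50);
INSERT INTO user_point_used_logs VALUES (1, 1, 30), (2, 1, 20);
""")

# The correlated subquery replaces the LEFT JOIN on a derived table and the
# GROUP BY; COALESCE turns "no usage rows" (NULL) into 0.
rows = conn.execute("""
SELECT upl.id,
       upl.point AS sum_point,
       COALESCE((SELECT SUM(used.point)
                 FROM user_point_used_logs used
                 WHERE used.user_point_log_used_id = upl.id), 0) AS minus
FROM user_point_logs upl
WHERE upl.user_id = 16082
ORDER BY upl.id
""").fetchall()
print(rows)
```

With no derived table and no GROUP BY in the outer query, there is less reason for the server to build a large temporary table, though that would need to be confirmed with EXPLAIN on the real data.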

Performance problems with an ordered view in MySQL

I am new to performance tuning in MySQL & need your help concerning a view that will replace a table later-on in our design.
The table to be replaced is called users and has the following attributes:
The users2 view has the following attributes:
When I execute a plain SELECT on both objects, they respond in about the same time:
SELECT *
FROM `users`
SELECT *
FROM `users2`
But the ordered versions of these queries perform differently: the table is a little slower (it takes less than two seconds), while the view needs about ten times as long:
SELECT *
FROM `users`
ORDER BY `lastName`, `firstName`
SELECT *
FROM `users2`
ORDER BY `lastName`, `firstName`
To find the reason, I ran EXPLAIN on the two commands:
Obviously, the ALL access on table 'a' (addresses) for the attribute Countries_ID is causing trouble, so I ran the following:
ALTER TABLE addresses ADD INDEX (Countries_ID);
This index didn't change anything at all. So, I ask you for your opinion what can be done better.
Note 1: Is there a way to create an index on the temporary column Countries_ID_2?
Note 2: The users2 view was created with the following SQL query:
CREATE OR REPLACE VIEW users2 AS
(SELECT p.username
, p.password
, p.firstName
, p.lastName
, p.eMail AS email
, a.settlement AS city
, s.name AS country
, pl.languages
, p.description
, p.ID AS ID
, p.phone1
, p.phone2
, CONCAT_WS(' ', a.street, a.addition) AS address
, p.status
, p.publicMail
, ad.name AS Betreuer
FROM addresses a
INNER JOIN addresses_have_persons ap ON a.ID = ap.Addresses_ID
INNER JOIN countries c ON a.Countries_ID = c.ID
INNER JOIN persons p ON a.ID = p.addressID
AND ap.Persons_ID = p.ID
INNER JOIN states s ON a.States_ID = s.ID
INNER JOIN persons_language pl ON p.ID = pl.ID
LEFT JOIN advisors ad ON p.advisorID = ad.ID
-- LEFT JOIN titles t ON t.ID = ad.titleID
);
Note 3: Although many fields in the persons table are NULL, there is not a single row in which all of these fields are NULL.
EDIT:
CREATE OR REPLACE VIEW persons_language AS
(SELECT lp.Persons_ID AS ID
, GROUP_CONCAT(DISTINCT l.name ORDER BY l.name SEPARATOR ', ') AS languages
FROM languages l
, languages_have_persons lp
WHERE l.ID = lp.Languages_ID
GROUP BY lp.Persons_ID);
Without the ORDER BY, the language names are not alphabetically ordered, which I currently want. Perhaps, we could decide to get them in any order, but we'll see.
Currently, I made the following modifications without any performance improvement:
ALTER TABLE addresses ADD INDEX (Countries_ID);
ALTER TABLE addresses ADD INDEX (States_ID);
ALTER TABLE addresses_have_persons ADD INDEX (Addresses_ID);
ALTER TABLE languages ADD INDEX (name);
ALTER TABLE persons ADD INDEX (addressID);
ALTER TABLE persons ADD INDEX (address2ID);
ALTER TABLE persons ADD INDEX (address3ID);
ALTER TABLE persons ADD INDEX (advisorID);
EDIT 2:
I am also discussing this issue on another site. The discussion there led me to make the following changes to get closer to third normal form:
CREATE OR REPLACE TABLE accounts
(ID INT NOT NULL AUTO_INCREMENT PRIMARY KEY
, username VARCHAR(50) NOT NULL UNIQUE
, password VARCHAR(255) NOT NULL
, eMail VARCHAR(100) NOT NULL
, Persons_ID INT NOT NULL
);
INSERT INTO accounts (username, password, eMail, Persons_ID)
SELECT username, password, eMail, ID
FROM persons;
The persons table now contains only the essential columns and has the following structure:
The new table persons_information carries all additional information:
I recreated the users2 with the following command:
CREATE OR REPLACE VIEW users2 AS
(SELECT ac.username
, ac.password
, p.firstName
, p.lastName
, ac.eMail AS email
, adr.settlement AS city
, s.name AS country
, pl.languages
, pi.description
, ac.Persons_ID AS ID
, pi.phone1
, pi.phone2
, CONCAT_WS(' ', adr.street, adr.addition) AS address
, p.status
, pi.publicMail
, adv.name AS Betreuer
FROM accounts ac
INNER JOIN persons p ON ac.Persons_ID = p.ID
INNER JOIN persons_information pi ON p.ID = pi.ID
INNER JOIN addresses adr ON adr.ID = pi.addressID
INNER JOIN addresses_have_persons ap ON adr.ID = ap.Addresses_ID
AND ap.Persons_ID = p.ID
INNER JOIN countries c ON adr.Countries_ID = c.ID
INNER JOIN states s ON adr.States_ID = s.ID
INNER JOIN persons_language pl ON p.ID = pl.ID
LEFT JOIN advisors adv ON pi.advisorID = adv.ID
-- LEFT JOIN titles t ON t.ID = adv.titleID
);
SELECT * FROM users2 is fast, but if I add ORDER BY lastName, firstName, it takes about 25 seconds to get the response.
Here are the results of the EXPLAIN SELECT * FROM users2 command:
And here for the other command:
I also (re)created the following indexes:
ALTER TABLE addresses ADD INDEX (Countries_ID);
ALTER TABLE addresses ADD INDEX (States_ID);
ALTER TABLE addresses_have_persons ADD INDEX (Persons_ID);
ALTER TABLE languages ADD INDEX (name);
ALTER TABLE persons_information ADD INDEX (addressID);
ALTER TABLE persons_information ADD INDEX (address2ID);
ALTER TABLE persons_information ADD INDEX (address3ID);
ALTER TABLE persons_information ADD INDEX (advisorID);
I think one reason for the problem is the persons_language view that is created as follows:
CREATE OR REPLACE VIEW persons_language AS
(SELECT lp.Persons_ID AS ID
, GROUP_CONCAT(DISTINCT l.name ORDER BY l.name SEPARATOR ', ') AS languages
FROM languages l
INNER JOIN languages_have_persons lp ON l.ID = lp.Languages_ID
GROUP BY lp.Persons_ID);
EDIT 3:
For those interested, I add the EXPLAIN for the persons_language view:
EDIT 4:
After the database meeting today, we decided to delete all objects related to the address information & recreated the view with
CREATE OR REPLACE VIEW `users2` AS
(SELECT ac.username
, ac.password
, p.firstName
, p.lastName
, ac.eMail AS email
, pl.languages
, pi.description
, ac.Persons_ID AS ID
, pi.phone1
, pi.phone2
, p.status
, pi.publicMail
, adv.name AS Betreuer
FROM accounts ac
INNER JOIN persons p ON ac.Persons_ID = p.ID
INNER JOIN persons_information pi ON p.ID = pi.ID
INNER JOIN persons_language pl ON p.ID = pl.ID
INNER JOIN advisors adv ON pi.advisorID = adv.ID
WHERE ac.password IS NOT NULL
);
I also created an index with
CREATE INDEX LanguagesPersonsIndex ON `languages_have_persons` (`Languages_ID`, `Persons_ID`);
EXPLAIN shows that the new indexes are in use, and that a SELECT with an ORDER BY clause on the new, smaller view takes about 18 s. Here is the new result:
My question is: what more can I do to improve the performance?
Missing keys (indexes) are most likely the problem.
Depending on the data volume in the joined tables, though, the view will be slower in any case.
Try the following:
Create indexes on ALL attributes used to establish relationships (ap.Addresses_ID, a.Countries_ID, p.addressID, ap.Persons_ID, a.States_ID, p.advisorID).
Declare a PRIMARY KEY on all 'ID' columns.
Don't use ORDER BY or GROUP BY in the view definitions.
Create indexes on the attributes most often used for searching, ordering, or grouping.
Tip: the 'INNER' in INNER JOIN isn't necessary; it is the same as plain JOIN.
Your VIEW "persons_language" would be better like this:
SELECT lp.Persons_ID AS ID, GROUP_CONCAT(DISTINCT l.name ORDER BY l.name SEPARATOR ', ') AS languages
FROM languages_have_persons lp
JOIN languages l ON l.ID = lp.Languages_ID
GROUP BY lp.Persons_ID;
This is more appropriate because the FROM and JOIN clauses are processed before the WHERE clause.
You could also increase your MySQL memory and cache settings.
Here is my MySQL server's configuration (it runs an ERP with heavy tables and views):
join_buffer_size= 256M
key_buffer = 312M
key_buffer_size = 768M
max_allowed_packet = 160M
thread_stack = 192K
thread_cache_size = 8
query_cache_limit = 64M
innodb_buffer_pool_size = 1512M
table_cache = 1024M
read_buffer_size = 4M
query_cache_size = 768M
query_cache_limit = 128M
SELECT *
FROM `users`
ORDER BY `lastName`, `firstName`
needs
INDEX(lastName, firstName) -- in that order
Beware of VIEWs; some VIEWs optimize well, some do not.
Please provide SHOW CREATE TABLE for both addresses and addresses_have_persons.
In persons_language, why do you need DISTINCT? Doesn't it have PRIMARY KEY(person, language) (or in the opposite order)? Let's see SHOW CREATE TABLE.
Please provide the EXPLAIN for any query you want to discuss.
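The effect of a composite index on the two ORDER BY columns of the users table can be illustrated with a small sketch. It assumes SQLite (via Python's sqlite3) rather than MySQL, so the plan text differs from MySQL's EXPLAIN output; the three-column table, the sample names, and the index name idx_users_name are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.execute(
    "CREATE TABLE users (ID INTEGER PRIMARY KEY, firstName TEXT, lastName TEXT)"
)
conn.executemany(
    "INSERT INTO users (firstName, lastName) VALUES (?, ?)",
    [("Ann", "Zed"), ("Bob", "Abel"), ("Cy", "Abel")],  # invented rows
)


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


query = "SELECT * FROM users ORDER BY lastName, firstName"

# Without an index the rows have to be sorted in a temporary structure.
plan_before = plan(query)

# Composite index in the same column order as the ORDER BY.
conn.execute("CREATE INDEX idx_users_name ON users (lastName, firstName)")
plan_after = plan(query)

print(plan_before)
print(plan_after)
```

In this sketch the second plan reads the rows in index order instead of sorting them. Whether MySQL can do the same through a view with several joins is a separate question, which is why the EXPLAIN output for the real query matters.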

How to speed up left join queries by indexing?

At the moment I am experiencing some slower MySQL queries in my application which I want to speed up. Unfortunately I’m not quite sure which is the correct way to do it.
I have the following (fictitious) tables: Book, Page and Word.
Word is a child of Page via word_page_id.
Page is a child of Book via page_book_id.
I already have individual indexes on page_book_id, word_page_id, book_user_id and book_flag_delete.
SELECT `book`.*, COUNT(word_id) AS `word_amount` FROM `book`
LEFT JOIN `page` ON page_book_id = book_id
LEFT JOIN `word` ON word_page_id = paragraph_id
WHERE (book_user_id = 1) AND (book_flag_delete IS NULL)
GROUP BY `book_id`
ORDER BY `book_id` ASC LIMIT 100
SELECT COUNT(DISTINCT `book_id`) AS `book_row_count` FROM `book`
LEFT JOIN `page` ON page_book_id = book_id
LEFT JOIN `word` ON word_page_id = page_id
WHERE (book_user_id = 59) AND (book_flag_delete IS NULL)
Any ideas how to speed up such queries?
Is there extra indexing involved?
Set indexes on the fields you use for joining.
Also make sure that both sides of each join have the same data type, encoding, and collation; otherwise the index will not be used.
mysql> EXPLAIN <query> will show you the index actually used (the key column in the output) and the candidate indexes (the possible_keys column).
For this query:
SELECT b.*, COUNT(w.word_id) AS `word_amount`
FROM `book` b LEFT JOIN
`page` p
ON p.page_book_id = b.book_id LEFT JOIN
`word` w
ON w.word_page_id = p.paragraph_id
WHERE (b.book_user_id = 1) AND (b.book_flag_delete IS NULL)
GROUP BY b.`book_id`
ORDER BY b.`book_id` ASC
LIMIT 100;
The best indexes are: book(book_user_id, book_flag_delete, book_id), page(page_book_id, paragraph_id), and word(word_page_id, word_id).
However, the overall group by might be expensive. You might try writing the query as:
SELECT b.*,
(SELECT COUNT(w.word_id)
FROM `page` p JOIN
`word` w
ON w.word_page_id = p.paragraph_id
WHERE p.page_book_id = b.book_id
) AS `word_amount`
FROM `book` b
WHERE (b.book_user_id = 1) AND (b.book_flag_delete IS NULL)
ORDER BY b.`book_id` ASC
LIMIT 100;
The same indexes work here. But this query should avoid a GROUP BY over all the data at once (instead, it uses the indexes for the aggregation).
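A quick way to convince yourself that the correlated-subquery form returns the same counts as the LEFT JOIN plus GROUP BY form is a toy run like the one below. It assumes SQLite (via Python's sqlite3) instead of MySQL, invented rows, and page_id as the name of the page key (the question uses both page_id and paragraph_id).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.executescript("""
CREATE TABLE book (book_id INTEGER PRIMARY KEY, book_user_id INT, book_flag_delete INT);
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_book_id INT);
CREATE TABLE word (word_id INTEGER PRIMARY KEY, word_page_id INT);
-- invented rows: book 1 has two pages and three words, book 2 has no pages
INSERT INTO book VALUES (1, 1, NULL), (2, 1, NULL), (3, 2, NULL);
INSERT INTO page VALUES (1, 1), (2, 1), (3, 3);
INSERT INTO word VALUES (1, 1), (2, 1), (3, 2), (4, 3);
""")

# Aggregation over the joined rows.
grouped = conn.execute("""
SELECT b.book_id, COUNT(w.word_id) AS word_amount
FROM book b
LEFT JOIN page p ON p.page_book_id = b.book_id
LEFT JOIN word w ON w.word_page_id = p.page_id
WHERE b.book_user_id = 1 AND b.book_flag_delete IS NULL
GROUP BY b.book_id
ORDER BY b.book_id
LIMIT 100
""").fetchall()

# One small count per selected book, no GROUP BY in the outer query.
correlated = conn.execute("""
SELECT b.book_id,
       (SELECT COUNT(w.word_id)
        FROM page p
        JOIN word w ON w.word_page_id = p.page_id
        WHERE p.page_book_id = b.book_id) AS word_amount
FROM book b
WHERE b.book_user_id = 1 AND b.book_flag_delete IS NULL
ORDER BY b.book_id
LIMIT 100
""").fetchall()

print(grouped, correlated)
```

In the second form the LIMIT can cut the outer rows before the counts are computed, which is the intuition behind expecting it to be cheaper; actual timings depend on the data and indexes.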
The optimal schema for a many-to-many mapping table is
CREATE TABLE XtoY (
# No surrogate id for this table
x_id MEDIUMINT UNSIGNED NOT NULL, -- For JOINing to one table
y_id MEDIUMINT UNSIGNED NOT NULL, -- For JOINing to the other table
# Include other fields specific to the 'relation'
PRIMARY KEY(x_id, y_id), -- When starting with X
INDEX (y_id, x_id) -- When starting with Y
) ENGINE=InnoDB;
The details on 'why' are in my index cookbook
In your select you're gonna want to refrain from using the wildcard "*" to grab columns. Plus utilize aliases ALWAYS!! This will keep your db from having to create a "virtual" alias.
select book1.column1, book1.column2, page1.column1
from book book1
left join page page1
on page1.page_book_id = book1.book_id
..... blah

MySQL Update query with left join and group by

I am trying to create an update query and making little progress in getting the right syntax.
The following query is working:
SELECT t.Index1, t.Index2, COUNT( m.EventType )
FROM Table t
LEFT JOIN MEvents m ON
(m.Index1 = t.Index1 AND
m.Index2 = t.Index2 AND
(m.EventType = 'A' OR m.EventType = 'B')
)
WHERE (t.SpecialEventCount IS NULL)
GROUP BY t.Index1, t.Index2
It creates a list of triplets Index1,Index2,EventCounts.
It only does this for the cases where t.SpecialEventCount is NULL. The update query I am trying to write should set SpecialEventCount to that count, i.e. COUNT(m.EventType) in the query above. This number could be 0 or any positive number (hence the left join). Index1 and Index2 together are unique in Table t, and they are used to identify events in MEvents.
How do I have to modify the select query to become an update query? I.e. something like
UPDATE Table SET SpecialEventCount=COUNT(m.EventType).....
but I am confused what to put where and have failed with numerous different guesses.
I take it that (Index1, Index2) is a unique key on Table, otherwise I would expect the reference to t.SpecialEventCount to result in an error.
Edited query to use subquery as it didn't work using GROUP BY
UPDATE
Table AS t
LEFT JOIN (
SELECT
Index1,
Index2,
COUNT(EventType) AS NumEvents
FROM
MEvents
WHERE
EventType = 'A' OR EventType = 'B'
GROUP BY
Index1,
Index2
) AS m ON
m.Index1 = t.Index1 AND
m.Index2 = t.Index2
SET
t.SpecialEventCount = COALESCE(m.NumEvents, 0)
WHERE
t.SpecialEventCount IS NULL
Doing a left join with a subquery will generate a giant
temporary table in-memory that will have no indexes.
For updates, try avoiding joins and using correlated
subqueries instead:
UPDATE
Table AS t
SET
t.SpecialEventCount = (
SELECT COUNT(m.EventType)
FROM MEvents m
WHERE m.EventType in ('A','B')
AND m.Index1 = t.Index1
AND m.Index2 = t.Index2
)
WHERE
t.SpecialEventCount IS NULL
Do some profiling, but this can be significantly faster in some cases.
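The correlated-subquery UPDATE can be tried on toy data as sketched here. Assumptions: SQLite (via Python's sqlite3) stands in for MySQL, the table is named t because Table is a placeholder (and a reserved word), and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.executescript("""
CREATE TABLE t (
    Index1 INT,
    Index2 INT,
    SpecialEventCount INT,
    PRIMARY KEY (Index1, Index2)
);
CREATE TABLE MEvents (Index1 INT, Index2 INT, EventType TEXT);
-- invented rows: (1,1) has events A, B and C; (1,2) has none; (2,1) is already counted
INSERT INTO t VALUES (1, 1, NULL), (1, 2, NULL), (2, 1, 7);
INSERT INTO MEvents VALUES (1, 1, 'A'), (1, 1, 'B'), (1, 1, 'C'), (2, 1, 'A');
""")

# COUNT over an empty set is 0, so rows without events get 0 rather than NULL.
conn.execute("""
UPDATE t
SET SpecialEventCount = (
    SELECT COUNT(m.EventType)
    FROM MEvents m
    WHERE m.EventType IN ('A', 'B')
      AND m.Index1 = t.Index1
      AND m.Index2 = t.Index2
)
WHERE SpecialEventCount IS NULL
""")

rows = conn.execute(
    "SELECT Index1, Index2, SpecialEventCount FROM t ORDER BY Index1, Index2"
).fetchall()
print(rows)
```

Only the rows whose SpecialEventCount was NULL are touched; the row that already held 7 keeps its value.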
My example:
update card_crowd as cardCrowd
LEFT JOIN
(
select cc.id , count(ccr.crowd_id) as num -- count(1) would return 1 for crowds with no matching rows
from card_crowd cc LEFT JOIN
card_crowd_r ccr on cc.id = ccr.crowd_id
group by cc.id
) as tt
on cardCrowd.id = tt.id
set cardCrowd.join_num = tt.num;