Optimizing rand query with join

I have an ORDER BY RAND() query that runs very slowly, like almost every such query. I searched all over Stack Overflow but could not find a good solution for my query:
SELECT u.id
, u.is_instagram_connected
, u.tokens
, u.username
, u.name
, u.photo
, u.bio
, u.voice
, u.mobile_update
, 1584450999 - l.time idleTime
FROM mobile_login_list l
JOIN users u
ON l.username = u.username
JOIN mobile_token_list t
ON t.username = l.username
WHERE l.time > 1584393399
AND l.username NOT IN ('enesdoo')
AND u.username NOT IN (
SELECT blocked_username
FROM hided_mobile_users_from_shuffle
WHERE username = 'enesdoo'
)
AND u.ban_status = 0
AND u.perma_ban = 0
AND u.mobile_online_status = 1
AND u.lock_status = 0
GROUP
BY l.username
ORDER
BY RAND( )
LIMIT 27
If I remove the ORDER BY RAND() line, the query runs very quickly, roughly 100 times faster.
How can I speed up this query?
mobile_login_list has > 50k rows
users has > 1m rows
Edit:
Explain:
My table:
CREATE TABLE IF NOT EXISTS `mobile_login_list` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`username` varchar(30) COLLATE utf8_bin NOT NULL,
`key` varchar(32) COLLATE utf8_bin NOT NULL,
`time` int(11) NOT NULL,
`ip` int(11) NOT NULL,
`version` smallint(4) NOT NULL,
`messaged` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `kontrol` (`username`,`key`),
KEY `username` (`username`),
KEY `time` (`time`),
KEY `username_2` (`username`,`time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin AUTO_INCREMENT=3351637 ;

In the lingo of random retrieval, this is called a deal operation: deal 27 different cards from a shuffled deck of 4k or so. (The other random operation is called roll: it allows duplicates.)
You're using SELECT mess-of-columns FROM mess-of-joins WHERE mess-of-criteria ORDER BY RAND() LIMIT small-number to do a shuffle-and-deal operation. That is a notorious performance antipattern. It causes extra work for the server because it must sort a fairly large result set, carrying every selected column, and then discard almost all of it (with the LIMIT).
A way to save some of the trouble is to defer the joins that fetch the details. Shuffle only the ids. Then take the small number of results and fetch the details you need. Something like this.
SELECT u.id /* just the id values */
FROM mobile_login_list l
JOIN users u
ON l.username = u.username
JOIN mobile_token_list t
ON t.username = l.username
WHERE l.time > 1584393399
AND l.username NOT IN ('enesdoo')
AND u.username NOT IN (
SELECT blocked_username
FROM hided_mobile_users_from_shuffle
WHERE username = 'enesdoo'
)
AND u.ban_status = 0
AND u.perma_ban = 0
AND u.mobile_online_status = 1
AND u.lock_status = 0
ORDER
BY RAND( )
LIMIT 27
You can debug, run EXPLAIN and optimize this subquery by changing indexes and maybe tightening up your selection criteria. It's the one doing all the hard work of shuffling and dealing.
Then join that resultset to your detail tables to choose the data you need. This outer query only needs to process your 27 rows. Be sure to shuffle again.
SELECT u.id
, u.is_instagram_connected
, u.tokens
, u.username
, u.name
, u.photo
, u.bio
, u.voice
, u.mobile_update
, 1584450999 - l.time idleTime
FROM mobile_login_list l
JOIN users u
ON l.username = u.username
JOIN (
/* the subquery from above */
) selected ON u.id = selected.id
ORDER BY RAND()
Putting it all together, you get this big repetitive mess of a query. But it should be a little faster.
SELECT u.id
, u.is_instagram_connected
, u.tokens
, u.username
, u.name
, u.photo
, u.bio
, u.voice
, u.mobile_update
, 1584450999 - l.time idleTime
FROM mobile_login_list l
JOIN users u
ON l.username = u.username
JOIN (
SELECT u.id
FROM mobile_login_list l
JOIN users u
ON l.username = u.username
JOIN mobile_token_list t
ON t.username = l.username
WHERE l.time > 1584393399
AND l.username NOT IN ('enesdoo')
AND u.username NOT IN (
SELECT blocked_username
FROM hided_mobile_users_from_shuffle
WHERE username = 'enesdoo'
)
AND u.ban_status = 0
AND u.perma_ban = 0
AND u.mobile_online_status = 1
AND u.lock_status = 0
ORDER
BY RAND( )
LIMIT 27
) selected ON u.id = selected.id
ORDER BY RAND()
A more performant way to deal records is this, if you do the dealing a lot.
Add a FLOAT column to the table you're dealing from, let's call it deal. Put an index on it.
Every few hours, or maybe overnight or even once a week, shuffle the table by running UPDATE users SET deal = RAND(); it will take a while, because it needs to change the deal value in every row.
When you need to deal, do ...WHERE deal >= RAND() * 0.9 ... ORDER BY deal LIMIT n. The multiplication by 0.9 helps ensure you don't hit the end of the table by choosing a random number too close to 1.
This is equivalent, in cardshark terms, to shuffling the deck every few hours and then just cutting it for every deal. It's the way Wikipedia implements their "show a random article" feature.
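A minimal sketch of the shuffle-and-cut steps above, assuming the column is named deal as suggested; the index name idx_users_deal and the user variable @cut are made up for illustration. One caveat: MySQL evaluates RAND() in a WHERE clause once per row, so the sketch draws the cut point once into a variable instead of calling RAND() inline.

```sql
-- One-time setup: the shuffle column and its index (names are illustrative).
ALTER TABLE users ADD COLUMN deal FLOAT NOT NULL DEFAULT 0;
CREATE INDEX idx_users_deal ON users (deal);

-- Periodic reshuffle (every few hours, nightly, or weekly). Touches every row.
UPDATE users SET deal = RAND();

-- Deal: pick the cut point once, then read the next 27 rows in index order.
SET @cut := RAND() * 0.9;
SELECT id
FROM users
WHERE deal >= @cut
ORDER BY deal
LIMIT 27;
```

Any extra filters (ban_status, lock_status and so on) would go into the same WHERE clause; the more rows they reject, the further the index scan has to read before it collects 27.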

Can we see the EXPLAIN for this instead...?
SELECT DISTINCT u.id
, u.is_instagram_connected
, u.tokens
, u.username
, u.name
, u.photo
, u.bio
, u.voice
, u.mobile_update
, 1584450999 - l.time idleTime
FROM mobile_login_list l
JOIN users u
ON l.username = u.username
JOIN mobile_token_list t
ON t.username = l.username
LEFT
JOIN hided_mobile_users_from_shuffle x
ON x.blocked_username = u.username
AND x.username = 'enesdoo'
WHERE l.time > 1584393399
AND l.username NOT IN ('enesdoo')
AND x.blocked_username IS NULL
AND u.ban_status = 0
AND u.perma_ban = 0
AND u.mobile_online_status = 1
AND u.lock_status = 0
ORDER
BY RAND( )
LIMIT 27
Given my limited knowledge of query optimisation, I would simply define the table as follows, but maybe someone else can suggest further improvements:
CREATE TABLE IF NOT EXISTS mobile_login_list
(id SERIAL PRIMARY KEY
,username varchar(30) COLLATE utf8_bin NOT NULL
,`key` varchar(32) COLLATE utf8_bin NOT NULL
,time int NOT NULL
,ip int NOT NULL
,version smallint NOT NULL
,messaged int NOT NULL DEFAULT 0
,KEY username_2 (username,time) -- or (time,username)
);
Note that key is a reserved word (and time is a 'keyword'), which makes them poor choices for table/column identifiers.

Related

Index probably not used correctly on simple SQL query

Size:
Campaigns: 3k rows (200 with campaigns.is_active = 1)
Links: 20k rows (4k with links.status = 1 // 500 with links.status = 1 AND campaigns.is_active = 1)
Clicks: 10mln rows (50k with created > '2020-10-25 00:00:00')
This query takes 2 seconds:
SELECT links.id, COUNT(clicks.id)
FROM links
INNER JOIN campaigns ON campaigns.id = links.campaign_id
AND campaigns.is_active = 1
LEFT JOIN clicks ON clicks.link_id = links.id
WHERE links.status = 1
AND clicks.created > '2020-10-25 00:00:00'
GROUP BY links.id
When I remove the following line, it takes just 0.13 seconds (15 times faster):
AND campaigns.is_active = 1
There is an INDEX on campaigns.is_active.
Also tried to set an index on 2 columns (campaigns.id + campaigns.is_active) but didn't help.
"campaigns.is_active" contains simply 0 or 1. The campaigns table is small, the campaigns.is_active condition actually reduces the amount of rows. So it should speed up the query instead.
Why does it take so much longer because of this condition and how to fix it?
If I removed the JOIN to campaigns, added links.campaign_id to the SELECT fields, and then looked up each returned campaign_id in an additional query like "SELECT is_active FROM campaigns WHERE id = ?", it would still be faster, because such a query takes 0.000x seconds. In my experience, when something is faster as two queries, it usually means the first query isn't optimized to its full extent.
Explain-Select
Structure
CREATE TABLE `campaigns` (
`id` int(11) UNSIGNED NOT NULL,
`is_active` tinyint(4) NOT NULL DEFAULT 0
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `clicks` (
`id` int(11) UNSIGNED NOT NULL,
`link_id` int(11) UNSIGNED NOT NULL,
`created` datetime NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `links` (
`id` int(11) UNSIGNED NOT NULL,
`campaign_id` int(8) UNSIGNED NOT NULL,
`status` tinyint(4) NOT NULL DEFAULT 0
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
ALTER TABLE `campaigns`
ADD PRIMARY KEY (`id`),
ADD UNIQUE KEY `id_isactive` (`id`,`is_active`),
ADD KEY `is_active` (`is_active`);
ALTER TABLE `clicks`
ADD PRIMARY KEY (`id`),
ADD KEY `link_id` (`link_id`),
ADD KEY `created` (`created`);
ALTER TABLE `links`
ADD PRIMARY KEY (`id`),
ADD KEY `campaign_id` (`campaign_id`),

How long does this take?
SELECT l.id,
(SELECT COUNT(*)
FROM clicks cl
WHERE cl.link_id = l.id AND
cl.created > '2020-10-25'
)
FROM links l JOIN
campaigns ca
ON ca.id = l.campaign_id
WHERE l.status = 1 AND ca.is_active = 1;
EDIT:
Hmmm, with an order by, you can try:
SELECT l.id,
(SELECT COUNT(*)
FROM clicks cl
WHERE cl.link_id = l.id AND
cl.created > '2020-10-25'
)
FROM links l
WHERE EXISTS (SELECT 1
FROM campaigns ca
WHERE ca.id = l.campaign_id AND ca.is_active = 1
)
AND l.status = 1
ORDER BY l.id;
For this, you want an index on links(status, id) and campaigns(id, is_active).
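Spelled out as DDL against the table definitions in the question, the suggestion might look like the sketch below. The index names are invented. The posted schema already has a unique key on campaigns (id, is_active), so only links needs a new index for the outer query. The clicks index is not part of the suggestion above; it is an assumption that the correlated COUNT(*) subquery would benefit from it.

```sql
-- Supports WHERE l.status = 1 ... ORDER BY l.id without a separate sort.
CREATE INDEX idx_links_status_id ON links (status, id);

-- Assumption, not from the answer: lets the per-link COUNT(*) read a single
-- index range (one link_id, created after the cutoff) instead of table rows.
CREATE INDEX idx_clicks_link_created ON clicks (link_id, created);
```

Running EXPLAIN before and after adding each index is the simplest way to check whether the optimizer actually picks it up.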

Question... If a campaign is not currently active, you don't want any output for it, correct? Furthermore, there won't be any clicks for inactive campaigns, correct? Then why bother checking is_active?
Even if my analysis is wrong, it may be faster to ignore is_active until after the counts have been tallied.
Please don't use LEFT when it is not functional. You have a simple JOIN.
Use COUNT(*); COUNT(x) tests x for being not null.
SELECT links.id, COUNT(*)
FROM links
JOIN clicks ON clicks.link_id = links.id
WHERE links.status = 1
AND clicks.created > '2020-10-25 00:00:00'
GROUP BY links.id
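To illustrate the COUNT(*) versus COUNT(x) point, here is a throwaway query over a derived table (the alias t and the column x are made up for the example):

```sql
SELECT COUNT(*) AS all_rows,     -- 3: counts every row
       COUNT(x) AS non_null_x    -- 2: skips the row where x IS NULL
FROM (
    SELECT 1 AS x
    UNION ALL SELECT NULL
    UNION ALL SELECT 3
) AS t;
```

With a plain JOIN every joined row has a non-NULL clicks.id, so the two counts agree and COUNT(*) simply avoids the per-row NULL check.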
This is redundant:
ADD UNIQUE KEY `id_isactive` (`id`,`is_active`),
since PRIMARY KEY(id) declares id to be an index and unique.

I prefer not to fight the database engine optimizer.
SELECT links.id, campaigns.is_active, COUNT(clicks.id)
FROM links
INNER JOIN campaigns ON campaigns.id = links.campaign_id
LEFT JOIN clicks ON clicks.link_id = links.id
WHERE links.status = 1
AND clicks.created > '2020-10-25 00:00:00'
GROUP BY links.id, campaigns.is_active
HAVING campaigns.is_active = 1;
Second variant!
-- Second Variant
EXPLAIN
SELECT links.id AS LinksId
, COUNT(clicks.id) AS ClickCount
FROM links
LEFT JOIN clicks
ON links.id = clicks.link_id
WHERE links.status = 1
AND clicks.created > '2020-10-25 00:00:00'
AND links.campaign_id IN (SELECT id
FROM campaigns
WHERE is_active = 1)
GROUP BY links.id;
Third time is the charm! Using CTEs (available from MySQL 8.0) due to the published cardinalities.
-- Third time is the charm
WITH ActiveCampaigns
AS
(SELECT *
FROM campaigns
WHERE is_active = 1)
SELECT links.id, COUNT(clicks.id)
FROM links
INNER JOIN ActiveCampaigns
ON ActiveCampaigns.id = links.campaign_id
LEFT JOIN clicks
ON clicks.link_id = links.id
WHERE links.status = 1
AND clicks.created > '2020-10-25 00:00:00'
GROUP BY links.id;

Optimising MySQL Query, Select within Select, Multiple of same

I need help optimising this MySQL statement that I whipped up. It does exactly what I want, but I have a strong feeling that it'll be quite slow, since I do multiple selects within the statement and also query achievements_new multiple times. This is the first time I've written a major statement like this; I'm used to the simple SELECT FROM WHERE style crap.
Some explanation: this is for a leaderboard style thing for my website.
--First variable output is a rank that is calculated according to the formula shown, (Log + Log + # of achievements).
--Wepvalue is the sum of the values of the weapons which that id has. playerweapons contains all the weapons, and weaponprices convert the type to the price, and then the SUM calculates the value.
--Achcount is simply the amount of achievements that's unlocked. Maybe this can be optimised somehow with the rank output?
--id in achievements_new and playerweapons are Foreign Keys to the id in playerdata
SELECT
(
IFNULL(LOG(1.5, cashearned),0) +
IFNULL(LOG(1.3, roundswon), 0) +
(
SELECT COUNT(*)
FROM achievements_new
WHERE `value` = -1 AND achievements_new.id = playerdata.id
)
) as rank,
nationality,
nick,
steamid64,
cash,
playtime,
damage,
destroyed,
(
SELECT SUM(price)
FROM weaponprices
WHERE weapon IN
(
SELECT class
FROM playerweapons
WHERE playerweapons.id = playerdata.id
)
) as wepvalue,
(
SELECT COUNT(*)
FROM achievements_new
WHERE `value` = -1 AND achievements_new.id = playerdata.id
) as achcount,
lastplayed
FROM playerdata
ORDER BY rank DESC
Table structures:
playerdata:
CREATE TABLE IF NOT EXISTS `playerdata` (
`id` int(11) unsigned NOT NULL,
`steamid64` char(17) CHARACTER SET ascii COLLATE ascii_bin NOT NULL,
`nick` varchar(32) NOT NULL DEFAULT '',
`cash` int(32) unsigned NOT NULL DEFAULT '0',
`playtime` int(32) unsigned NOT NULL DEFAULT '0',
`nationality` char(2) CHARACTER SET ascii COLLATE ascii_bin NOT NULL,
`damage` int(32) unsigned NOT NULL DEFAULT '0',
`destroyed` int(32) unsigned NOT NULL DEFAULT '0',
`cashearned` int(10) unsigned NOT NULL,
`roundswon` smallint(5) unsigned NOT NULL,
`lastplayed` datetime NOT NULL,
) ENGINE=InnoDB
achievements_new:
CREATE TABLE IF NOT EXISTS `achievements_new` (
`id` int(10) unsigned NOT NULL,
`achkey` enum(<snip - lots of values here>) NOT NULL,
`value` mediumint(8) NOT NULL DEFAULT '0'
) ENGINE=InnoDB
playerweapons:
CREATE TABLE IF NOT EXISTS `playerweapons` (
`id` int(10) unsigned NOT NULL,
`class` varchar(30) CHARACTER SET ascii NOT NULL
) ENGINE=InnoDB
weaponprices:
CREATE TABLE IF NOT EXISTS `weaponprices` (
`weapon` varchar(30) NOT NULL,
`price` int(10) unsigned NOT NULL
) ENGINE=InnoDB
Thanks in advance!

Try something like the query below.
I used LEFT JOIN for achievements because there may be players without any; the LEFT JOIN variants for weapons are left commented out. If you do not need such players, you can use a plain JOIN.
SELECT
IFNULL(LOG(1.5, p.cashearned),0) +
IFNULL(LOG(1.3, p.roundswon), 0) +
COUNT(DISTINCT ac.achkey) as rank,
p.nationality,
p.nick,
p.steamid64,
p.cash,
p.playtime,
p.damage,
p.destroyed,
-- SUM(CASE WHEN pw.id IS NOT NULL THEN pw.price ELSE 0 END) as wepvalue,
-- wpn.price as wepvalue,
-- Joining weapons and achievements in one query repeats each weapon row once per
-- achievement, so the price sum is divided back down by the achievement count.
-- Assumes one row per (id, achkey) and one price per weapon.
SUM(wp.price) / GREATEST(COUNT(DISTINCT ac.achkey), 1) as wepvalue,
COUNT(DISTINCT ac.achkey) as achcount,
p.lastplayed
FROM playerdata as p
JOIN playerweapons as pw ON pw.id = p.id
JOIN weaponprices as wp ON pw.class = wp.weapon
LEFT JOIN achievements_new as ac ON ac.id = p.id AND ac.value = -1
-- LEFT JOIN playerweapons as pw ON pw.id = p.id
-- LEFT JOIN weaponprices as wp ON pw.class = wp.weapon
-- LEFT JOIN ( SELECT
-- pw.id as player,
-- SUM(wp.price) as price
-- FROM weaponprices as wp
-- JOIN playerweapons as pw ON pw.class = wp.weapon
-- GROUP BY pw.id
-- ) as wpn ON wpn.player = p.id
GROUP BY
p.id,
p.cashearned,
p.roundswon,
p.nationality,
p.nick,
p.steamid64,
p.cash,
p.playtime,
p.damage,
p.destroyed,
p.lastplayed

Your query is fairly reasonable, although I would rewrite the subqueries to use explicit joins rather than IN and factor out the achievements subquery:
SELECT (IFNULL(LOG(1.5, cashearned),0) + IFNULL(LOG(1.3, roundswon), 0) +
coalesce(an.cnt, 0)
) as rank,
nationality, nick, steamid64, cash, playtime, damage, destroyed,
(SELECT SUM(wp.price)
FROM weaponprices wp JOIN
playerweapons pw
on pw.class = wp.weapon
WHERE pw.id = pd.id
) as wepvalue,
coalesce(an.cnt, 0) as achcount,
lastplayed
FROM playerdata pd left outer join
(SELECT id, count(*) as cnt
FROM achievements_new an
WHERE an.`value` = -1
GROUP BY an.id
) an
on an.id = pd.id
ORDER BY rank DESC;
For this query, create the following indexes:
playerweapons(id, class);
weaponprices(weapon, price);
achievements_new(value, id);
This does the following things:
It eliminates two redundant subqueries on achievements_new.
It should optimize the prices subquery to only use indexes.
It replaces the in with an explicit join, which is sometimes optimized better.
It does not require an outer group by.
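As a sketch, the three indexes could be created like this. The column names follow the table definitions in the question (playerweapons has class, weaponprices has weapon); the index names are invented.

```sql
-- Covers the lookup of a player's weapon classes.
CREATE INDEX idx_playerweapons_id_class ON playerweapons (id, class);

-- Covers the join on weapon and the SUM(price), so the table rows need not be read.
CREATE INDEX idx_weaponprices_weapon_price ON weaponprices (weapon, price);

-- Lets the derived table filter on value = -1 and group by id from the index alone.
CREATE INDEX idx_achievements_value_id ON achievements_new (`value`, id);
```

One thing worth verifying with EXPLAIN: playerweapons.class is declared with the ascii character set while weaponprices.weapon uses the table default, and a character set mismatch between join columns can keep MySQL from using an index for the join.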

I would try to remove all correlated subqueries
SELECT
( COALESCE(LOG(1.5, pd.cashearned), 0)
+ COALESCE(LOG(1.3, pd.roundswon), 0)
+ COALESCE(an.cnt, 0)) AS rank
, pd.nationality
, pd.nick
, pd.steamid64
, pd.cash
, pd.playtime
, pd.damage
, pd.destroyed
, COALESCE(pw.wepvalue, 0) AS wepvalue
, COALESCE(an.cnt, 0) AS achcount
, pd.lastplayed
FROM playerdata pd
LEFT JOIN (
SELECT
id
, COUNT(*) AS cnt
FROM achievements_new
WHERE value = -1
GROUP BY
id
) an
ON pd.id = an.id
LEFT JOIN (
SELECT
playerweapons.id
, SUM(price) AS wepvalue
FROM weaponprices
INNER JOIN playerweapons
ON weaponprices.weapon = playerweapons.class
GROUP BY
playerweapons.id
) pw
ON pd.id = pw.id
ORDER BY
rank DESC;

MySQL LEFT JOIN returns empty resultset

Maybe I'm missing something stupid, but...
I have three tables in an m-to-m relation:
CREATE TABLE tbl_users (
usr_id INT NOT NULL AUTO_INCREMENT ,
usr_name VARCHAR( 64 ) NOT NULL DEFAULT '' ,
usr_surname VARCHAR( 64 ) NOT NULL DEFAULT '' ,
usr_pwd VARCHAR( 64 ) NOT NULL ,
usr_level INT( 1 ) NOT NULL DEFAULT 0,
PRIMARY KEY ( usr_id )
) ENGINE = InnoDB;
CREATE TABLE tbl_houses (
house_id INT NOT NULL AUTO_INCREMENT ,
city VARCHAR( 100 ) DEFAULT '' ,
address VARCHAR( 100 ) DEFAULT '' ,
PRIMARY KEY ( house_id )
) ENGINE = InnoDB;
CREATE TABLE tbl_users_houses (
user_id INT NOT NULL ,
house_id INT NOT NULL ,
INDEX user_key (user_id),
FOREIGN KEY (user_id) REFERENCES tbl_users(usr_id)
ON DELETE CASCADE
ON UPDATE CASCADE,
INDEX house_key (house_id) ,
FOREIGN KEY (house_id) REFERENCES tbl_houses(house_id)
ON DELETE CASCADE
ON UPDATE CASCADE
) ENGINE = InnoDB;
In the link table I have two records:
user_id house_id
1 1
1 2
Now, trying to select all houses with:
select * from tbl_houses AS H
left join tbl_users_houses AS UH on H.house_id = UH.house_id
where UH.user_id = 2;
Why I get no data instead of all houses?

Because of this line:
where UH.user_id = 2;
This is only true if UH.user_id is non-null, so it effectively excludes any case where you have a house without a matching row in UH, which is the point of using a LEFT JOIN.
If you want all houses, and UH data where there is a match, use this:
select * from tbl_houses AS H
left join tbl_users_houses AS UH on H.house_id = UH.house_id and UH.user_id = 2;
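The difference can be reproduced with a small self-contained script. The demo_ tables below are simplified stand-ins for the tables in the question (no foreign keys), and the two house rows are assumed sample data; only the two link rows come from the question.

```sql
CREATE TABLE demo_houses (house_id INT PRIMARY KEY, city VARCHAR(100));
CREATE TABLE demo_users_houses (user_id INT NOT NULL, house_id INT NOT NULL);

INSERT INTO demo_houses VALUES (1, 'Rome'), (2, 'Milan');  -- assumed sample rows
INSERT INTO demo_users_houses VALUES (1, 1), (1, 2);       -- link rows from the question

-- Filter in WHERE: the join pairs both houses with user 1,
-- then WHERE UH.user_id = 2 rejects every joined row.
SELECT H.house_id, UH.user_id
FROM demo_houses AS H
LEFT JOIN demo_users_houses AS UH ON H.house_id = UH.house_id
WHERE UH.user_id = 2;
-- returns no rows

-- Filter in ON: no link row satisfies the join condition,
-- so each house is kept once with NULL in the UH columns.
SELECT H.house_id, UH.user_id
FROM demo_houses AS H
LEFT JOIN demo_users_houses AS UH
       ON H.house_id = UH.house_id AND UH.user_id = 2;
-- returns (1, NULL) and (2, NULL)
```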

Your WHERE clause is specifying that
UH.user_id = 2
What happens if you change it to H.user_id = 2 ?
To give this (all houses for user_id = 2):
select * from tbl_houses AS H
left join tbl_users_houses AS UH on H.house_id = UH.house_id
where H.user_id = 2;
Or if you want all houses regardless, plus data for user_id = 2 where it exists in tbl_users_houses, try this:
select * from tbl_houses AS H
left join tbl_users_houses AS UH on H.house_id = UH.house_id and UH.user_id = 2;

Because you have no user with id 2.

Assuming your question "Why I get no data instead of all houses?" means you are wondering why you are not getting all the houses when the houses table is on the inner side of an outer join: this happens because you placed the predicate after the join (in the WHERE clause) instead of in the join condition. This effectively converts the join to an inner join. Change it to:
select * from tbl_houses H
left join tbl_users_houses UH
on UH.house_id = H.house_id
and UH.user_id = 2;
Conditions in WHERE clauses are applied after all joins have been processed. At that point, unmatched rows carry NULLs in every column that comes from the table on the outer side of the join, so any predicate on such a column eliminates those rows.

MySQL "OR MATCH" hangs (very slow) on multiple tables

After learning how to do MySQL Full-Text search, the recommended solution for multiple tables was OR MATCH and then do the other database call. You can see that in my query below.
When I do this, it just gets stuck in a "busy" state, and I can't access the MySQL database.
SELECT
a.`product_id`, a.`name`, a.`slug`, a.`description`, b.`list_price`, b.`price`, c.`image`, c.`swatch`, e.`name` AS industry,
MATCH( a.`name`, a.`sku`, a.`description` ) AGAINST ( '%s' IN BOOLEAN MODE ) AS relevance
FROM
`products` AS a LEFT JOIN `website_products` AS b
ON (a.`product_id` = b.`product_id`)
LEFT JOIN ( SELECT `product_id`, `image`, `swatch` FROM `product_images` WHERE `sequence` = 0) AS c
ON (a.`product_id` = c.`product_id`)
LEFT JOIN `brands` AS d
ON (a.`brand_id` = d.`brand_id`)
INNER JOIN `industries` AS e ON (a.`industry_id` = e.`industry_id`)
WHERE
b.`website_id` = %d
AND b.`status` = %d
AND b.`active` = %d
AND MATCH( a.`name`, a.`sku`, a.`description` ) AGAINST ( '%s' IN BOOLEAN MODE )
OR MATCH ( d.`name` ) AGAINST ( '%s' IN BOOLEAN MODE )
GROUP BY a.`product_id`
ORDER BY relevance DESC
LIMIT 0, 9
Any help would be greatly appreciated.
EDIT
All the tables involved are MyISAM, utf8_general_ci.
Here's the EXPLAIN SELECT statement:
id  select_type  table           type    possible_keys  key         key_len  ref                     rows   Extra
1   PRIMARY      a               ALL     NULL           NULL        NULL     NULL                    16076  Using temporary; Using filesort
1   PRIMARY      b               ref     product_id     product_id  4        database.a.product_id   2
1   PRIMARY      e               eq_ref  PRIMARY        PRIMARY     4        database.a.industry_id  1
1   PRIMARY      <derived2>      ALL     NULL           NULL        NULL     NULL                    23261
1   PRIMARY      d               eq_ref  PRIMARY        PRIMARY     4        database.a.brand_id     1      Using where
2   DERIVED      product_images  ALL     NULL           NULL        NULL     NULL                    25933  Using where
I don't know how to make that look neater -- sorry about that
UPDATE
The query does return, after 196 seconds (correctly, I think). The query without multiple tables takes about 0.56 seconds (which I know is really slow; we plan on changing to Solr or Sphinx soon), but 196 seconds??
If we could add a number to the relevance when the term is in the brand name ( d.name ), that would also work.

I found 2 things slowing down my query drastically and fixed them.
To answer the first problem, it needed parentheses around the entire "MATCH AGAINST OR MATCH AGAINST":
WHERE
b.`website_id` = %d
AND b.`status` = %d
AND b.`active` = %d
AND (
MATCH( a.`name`, a.`sku`, a.`description` ) AGAINST ( '%s' IN BOOLEAN MODE )
OR MATCH ( d.`name` ) AGAINST ( '%s' IN BOOLEAN MODE )
)
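The parentheses matter because AND binds more tightly than OR. Without them, the WHERE clause is read as (the website, status and active filters AND the first MATCH) OR the second MATCH, so any row whose brand name matches gets through regardless of the other filters. A tiny illustration with constants (the column aliases are made up):

```sql
SELECT 0 AND 0 OR 1   AS without_parens,  -- 1: parsed as (0 AND 0) OR 1
       0 AND (0 OR 1) AS with_parens;     -- 0: the AND filter applies to both branches
```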
I didn't understand how to use EXPLAIN SELECT at first, but it helped quite a bit, so thank you! This reduced that first number from 16076 rows to 143. I then noticed the other two with over 23 and 25 thousand rows. That was caused by this line:
LEFT JOIN ( SELECT `product_id`, `image`, `swatch` FROM `product_images` WHERE `sequence` = 0) AS c
ON (a.`product_id` = c.`product_id`)
There was a reason I was doing this in the first place, which then changed. When I changed it, I didn't realize I could do a normal LEFT JOIN:
LEFT JOIN `product_images` AS c
ON (a.`product_id` = c.`product_id`)
This makes my final query look like this (and MUCH faster: it went from 196 seconds to 0.0084 or so):
SELECT
a.`product_id`, a.`name`, a.`slug`, a.`description`, b.`list_price`, b.`price`,
c.`image`, c.`swatch`, e.`name` AS industry,
MATCH( a.`name`, a.`sku`, a.`description` ) AGAINST ( '%s' IN BOOLEAN MODE ) AS relevance
FROM
`products` AS a LEFT JOIN `website_products` AS b
ON (a.`product_id` = b.`product_id`)
LEFT JOIN `product_images` AS c
ON (a.`product_id` = c.`product_id`)
LEFT JOIN `brands` AS d
ON (a.`brand_id` = d.`brand_id`)
INNER JOIN `industries` AS e
ON (a.`industry_id` = e.`industry_id`)
WHERE
b.`website_id` = %d
AND b.`status` = %d
AND b.`active` = %d
AND c.`sequence` = %d
AND (
MATCH( a.`name`, a.`sku`, a.`description` ) AGAINST ( '%s' IN BOOLEAN MODE )
OR MATCH( d.`name` ) AGAINST( '%s' IN BOOLEAN MODE )
)
GROUP BY a.`product_id`
ORDER BY relevance DESC
LIMIT 0, 9
Oh, and even before I was doing a full text search with multiple tables, it was taking about 1/2 a second. This is much improved.

Getting the last record inserted into a select query

I am creating a small message board and I am stuck.
I can select the subject, the original author and the number of replies, but what I can't do is get the username, topic or date of the last post.
There are 3 tables: boards, topics and messages.
I want to get the author, date and topic of the last message in the messages table. The author and date fields are already on the messages table, but I would need to join the messages and topics tables on the topicid field.
This is my query that selects the subject, author, and number of replies:
SELECT t.topicname, t.author, count( message ) AS message
FROM topics t
INNER JOIN messages m
ON m.topicid = t.topicid
INNER JOIN boards b
ON b.boardid = t.boardid
WHERE b.boardid = 1
GROUP BY t.topicname
Can anyone please help me get this finished?
This is what my tables look like
CREATE TABLE `boards` (
`boardid` int(2) NOT NULL auto_increment,
`boardname` varchar(255) NOT NULL default '',
PRIMARY KEY (`boardid`)
);
CREATE TABLE `messages` (
`messageid` int(6) NOT NULL auto_increment,
`topicid` int(4) NOT NULL default '0',
`message` text NOT NULL,
`author` varchar(255) NOT NULL default '',
`date` timestamp(14) NOT NULL,
PRIMARY KEY (`messageid`)
);
CREATE TABLE `topics` (
`topicid` int(4) NOT NULL auto_increment,
`boardid` int(2) NOT NULL default '0',
`topicname` varchar(255) NOT NULL default '',
`author` varchar(255) NOT NULL default '',
PRIMARY KEY (`topicid`)
);

if your SQL supports the LIMIT clause,
SELECT m.author, m.date, t.topicname FROM messages m
JOIN topics t ON m.topicid = t.topicid
ORDER BY date desc LIMIT 1
otherwise:
SELECT m.author, m.date, t.topicname FROM messages m
JOIN topics t ON m.topicid = t.topicid
WHERE m.date = (SELECT max(m2.date) from messages m2)
EDIT: if you want to combine this with the original query, it has to be rewritten using subqueries to extract the message count and the date of last message:
SELECT t.topicname, t.author,
(select count(message) from messages m where m.topicid = t.topicid) AS messagecount,
lm.author, lm.date
FROM topics t
INNER JOIN messages lm
ON lm.topicid = t.topicid AND lm.date = (SELECT max(m2.date) from messages m2 WHERE m2.topicid = t.topicid)
INNER JOIN boards b
ON b.boardid = t.boardid
WHERE b.boardid = 1
GROUP BY t.topicname
also notice that if you don't pick any field from table boards, you don't need the last join:
SELECT t.topicname, t.author,
(select count(message) from messages m where m.topicid = t.topicid) AS messagecount,
lm.author, lm.date
FROM topics t
INNER JOIN messages lm
ON lm.topicid = t.topicid AND lm.date = (SELECT max(m2.date) from messages m2 WHERE m2.topicid = t.topicid)
WHERE t.boardid = 1
GROUP BY t.topicname
EDIT: if mysql doesn't support subqueries in the field list, you can try this:
SELECT t.topicname, t.author, mc.messagecount, lm.author, lm.date
FROM topics t
JOIN (select m.topicid, count(*) as messagecount from messages m group by m.topicid) as mc
ON mc.topicid = t.topicid
JOIN messages lm
ON lm.topicid = t.topicid AND lm.date = (SELECT max(m2.date) from messages m2 WHERE m2.topicid = t.topicid)
WHERE t.boardid = 1
GROUP BY t.topicname

If you want to get the latest entry in a table, you should have a DateTime field that shows when the entry was created (or updated). You can then sort on this column and select the latest one.
If your id field is numeric, you could instead look for the highest id, but I would recommend against this because it makes many assumptions and ties you to numerical ids in the future.

You can use a subselect. E.g.:
select * from messages where messageid = (select max(messageid) from messages)
Edit: And if you identify the newest record by a timestamp, you'd use:
select * from messages where messageid = (
select messageid
from messages
order by date desc
limit 1)

With MySQL this should work:
SELECT author, date, topicname as topic FROM messages LEFT JOIN topics ON messages.topicid = topics.topicid ORDER BY date DESC LIMIT 0, 1;
