Timeout with inner join on two MySQL tables

Here are two tables, with only 50K rows in each:
CREATE TABLE `ps_product_access` (
`id_order` int(10) UNSIGNED NOT NULL DEFAULT '0',
`id_product_access` int(10) UNSIGNED NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
ALTER TABLE `ps_product_access`
ADD KEY `id_order` (`id_order`);
CREATE TABLE `ps_orders` (
`id_order` int(10) UNSIGNED NOT NULL,
`id_order_renew` int(10) UNSIGNED NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
ALTER TABLE `ps_orders`
ADD PRIMARY KEY (`id_order`),
ADD KEY `ps_orders__id_order_renew__index` (`id_order_renew`);
The tables are simplified to show only the relevant fields. There is no foreign key, and I can't add one right now (the data in this database is inconsistent).
This query never finishes (it just keeps loading):
SELECT pa.`id_product_access`
FROM `ps_product_access` pa
INNER JOIN `ps_orders` o ON pa.id_order = o.id_order_renew;
I can't understand why. It looks simple, just an inner join. I know I can optimize the query with WHERE EXISTS, but that is not the main question. This query should not load forever, since there is almost no data (50K rows). Did I miss something?
Side note: I ran this query on a fresh install of MySQL 8 (installed via brew on macOS). I saw the same problem with the same data on another computer with a totally different setup (Ubuntu VM on Windows, MySQL 5).

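The column id_order in ps_product_access defaults to 0, so check how many rows you have with id_order = 0. If many rows in ps_orders also have id_order_renew = 0, every one of them matches every such row in ps_product_access, and the join result grows with the product of the two counts.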
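To test that hypothesis, a minimal sketch along these lines counts the rows carrying the default value on each side and then computes how many rows the join would return without actually producing them. The aliases and the column names pa_zero, o_zero, cnt and joined_rows are illustrative only.

```sql
-- Rows carrying the default value on each side of the join.
SELECT COUNT(*) AS pa_zero FROM ps_product_access WHERE id_order = 0;
SELECT COUNT(*) AS o_zero  FROM ps_orders WHERE id_order_renew = 0;

-- Size of the join result, computed from per-key counts instead of
-- materializing every matching pair of rows.
SELECT SUM(pa.cnt * o.cnt) AS joined_rows
FROM (SELECT id_order, COUNT(*) AS cnt
      FROM ps_product_access
      GROUP BY id_order) AS pa
INNER JOIN (SELECT id_order_renew, COUNT(*) AS cnt
            FROM ps_orders
            GROUP BY id_order_renew) AS o
        ON pa.id_order = o.id_order_renew;
```

If, say, 40,000 rows on each side share the value 0, the join has to return 40,000 × 40,000 = 1.6 billion rows, which from the client looks like an endless load.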

MariaDB server dependent problem in group_concat when combined with group by

More information added 2022-09-05.
I have a problem with an SQL query in a MariaDB environment. The query runs a number of tests against the database to identify unprocessed or unverified records, or records with contradictory data. The identifiers of the records that need checking are collected in one field (named fam_list) using group_concat. Without a group by clause, group_concat collects all record identifiers properly. The problem appears when group by is added to the query: I want to group the record identifiers by responsible person and project code, but after adding the group by clause I get several rows in which the field that should hold the record identifiers is empty.
Even more surprising, the column that counts the family identifiers (named fam_count), which is also an aggregate function, works correctly in all attempts.
The query works on my development machine but fails on my production machine.
Extra information, 2022-09-05:
Removing parts of the query suggests that the problem originates in the combination of the table fam_proj_link and the view Team_curr_members.
Building the query up from left to right produces the expected result until Team_curr_members is added to the join. From that point on, the group_concat column is empty. The table fam_proj_link can be joined to Team_curr_members directly, and a query on fam_proj_link joined only to the view Team_curr_members still produces the empty group_concat column.
The simplified query that still fails is:
select 'uninsp' as checkname, l.proj_id, count(l.pb_fam_id), group_concat(l.pb_fam_id)
from PLS.fam_proj_link l left join Projects_2012.Team_curr_members t on(l.proj_id=t.proj_id and t.func_id=1)
where rank_id=-1
group by l.proj_id
I have tried changing the join to a straight join, but the group_concat field is still empty. I have tried redefining the join with using(proj_id), but the group_concat field is still empty.
In none of these cases does the MariaDB server report any error, and all the queries run as expected on the development server.
I checked the site for other uses of the group_concat function and found irregular behaviour on more pages, including pages that worked correctly until shortly before my holidays.
I have run optimize table on the entire database and restarted the MariaDB server, without improvement.
Could some database-related files be damaged? If so, where should I look, and what should I look for?
End of extra information.
Does anybody have an idea what the reason could be, or what I can try in order to find out what goes wrong?
Query (for one of the tests):
This works in development and production:
select 'uninsp' as checkname, coalesce(user_id,0) as user_id, coalesce(firstsur,' none') as firstsur, l.proj_id, label, stat_name, stat_class, count(distinct pb_fam_id) as fam_count, group_concat(distinct pb_fam_id) as fam_list
from PLS.fam_proj_link l join Projects_2012.Proj_list_2018 p using(proj_id) join Projects_2012.Proj_status_reflist using(stat_id)
left join Projects_2012.Team_curr_members t on(p.proj_id=t.proj_id and t.func_id=1) left join UserData.people_view using(user_id)
where ref_code and stat_name!='cancelled' and rank_id=-1;
This works in development only, not in production:
select 'uninsp' as checkname, coalesce(user_id,0) as user_id, coalesce(firstsur,' none') as firstsur, l.proj_id, label, stat_name, stat_class, count(distinct pb_fam_id) as fam_count, group_concat(distinct pb_fam_id) as fam_list
from PLS.fam_proj_link l join Projects_2012.Proj_list_2018 p using(proj_id) join Projects_2012.Proj_status_reflist using(stat_id)
left join Projects_2012.Team_curr_members t on(p.proj_id=t.proj_id and t.func_id=1) left join UserData.people_view using(user_id)
where ref_code and stat_name!='cancelled' and rank_id=-1
group by user_id, proj_id;
Production machine:
Debian linux running mariadb 10.1.41
Development machine:
Manjaro linux running mariadb 10.8.3
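Since the two machines run very different MariaDB versions, one way to narrow the difference down is to compare the relevant server settings and to let the server check the two objects involved. This is only a diagnostic sketch, not a known fix: it assumes nothing beyond the object names used in the queries above, and it may well report nothing unusual.

```sql
-- Run on both servers and compare the output.
SELECT @@version, @@sql_mode, @@group_concat_max_len;

-- Ask the server to verify the table and the view used in the failing join.
CHECK TABLE PLS.fam_proj_link;
CHECK TABLE Projects_2012.Team_curr_members;
```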
Of course, I can work around the issue by abandoning group_concat in MariaDB, storing the record identifiers in PHP arrays, and imploding the arrays before generating the output, but I assume that concatenation in MariaDB is faster than concatenation via arrays in PHP.
Relevant table definitions:
fam_proj_link
--
-- Table structure for table `fam_proj_link`
--
CREATE TABLE `fam_proj_link` (
`proj_id` smallint(5) UNSIGNED NOT NULL,
`pb_fam_id` int(10) UNSIGNED NOT NULL,
`originator` tinyint(1) NOT NULL DEFAULT '0',
`rank_id` tinyint(3) NOT NULL DEFAULT '-1',
`factsheet` tinyint(1) NOT NULL DEFAULT '0',
`cust_title` varchar(255) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
--
-- RELATIONS FOR TABLE `fam_proj_link`:
-- `rank_id`
-- `def_ranks` -> `rank_id`
-- `pb_fam_id`
-- `pb_fams` -> `pb_fam_id`
-- `proj_id`
-- `Projects` -> `proj_id`
--
--
-- Indexes for dumped tables
--
--
-- Indexes for table `fam_proj_link`
--
ALTER TABLE `fam_proj_link`
ADD PRIMARY KEY (`proj_id`,`pb_fam_id`) USING BTREE,
ADD KEY `originator` (`originator`),
ADD KEY `rank_id` (`rank_id`),
ADD KEY `pb_fam_id` (`pb_fam_id`),
ADD KEY `factsheet` (`factsheet`);
--
-- Constraints for dumped tables
--
--
-- Constraints for table `fam_proj_link`
--
ALTER TABLE `fam_proj_link`
ADD CONSTRAINT `fam_proj_link_ibfk_3` FOREIGN KEY (`rank_id`) REFERENCES `def_ranks` (`rank_id`) ON DELETE CASCADE ON UPDATE CASCADE,
ADD CONSTRAINT `fam_proj_link_ibfk_4` FOREIGN KEY (`pb_fam_id`) REFERENCES `pb_fams` (`pb_fam_id`) ON DELETE CASCADE ON UPDATE CASCADE,
ADD CONSTRAINT `fam_proj_link_ibfk_5` FOREIGN KEY (`proj_id`) REFERENCES `Projects_2012`.`Projects` (`proj_id`) ON DELETE CASCADE ON UPDATE CASCADE;
Proj_list_2018 is a view with the below stand-in structure:
All fields are indexed in the parent tables, except for label, which is a combination of a group_concat and a normal concat over fields in three different tables.
--
-- Stand-in structure for view `Proj_list_2018`
-- (See below for the actual view)
--
CREATE TABLE `Proj_list_2018` (
`proj_id` smallint(6) unsigned
,`priority` tinyint(4)
,`stat_id` tinyint(4) unsigned
,`ref_code` bigint(20)
,`label` mediumtext
);
Proj_status_reflist:
--
-- Table structure for table `Proj_status_reflist`
--
CREATE TABLE `Proj_status_reflist` (
`stat_id` tinyint(3) UNSIGNED NOT NULL,
`stat_group_id` tinyint(3) NOT NULL DEFAULT '1',
`stat_name` varchar(20) NOT NULL DEFAULT '-',
`stat_class` varchar(25) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
--
-- RELATIONS FOR TABLE `Proj_status_reflist`:
-- `stat_group_id`
-- `Proj_status_groups` -> `stat_group_id`
--
--
-- Indexes for dumped tables
--
--
-- Indexes for table `Proj_status_reflist`
--
ALTER TABLE `Proj_status_reflist`
ADD PRIMARY KEY (`stat_id`),
ADD UNIQUE KEY `stat_name_UNIQUE` (`stat_name`),
ADD KEY `stat_group_id` (`stat_group_id`);
--
-- AUTO_INCREMENT for dumped tables
--
--
-- AUTO_INCREMENT for table `Proj_status_reflist`
--
ALTER TABLE `Proj_status_reflist`
MODIFY `stat_id` tinyint(3) UNSIGNED NOT NULL AUTO_INCREMENT;
--
-- Constraints for dumped tables
--
--
-- Constraints for table `Proj_status_reflist`
--
ALTER TABLE `Proj_status_reflist`
ADD CONSTRAINT `Proj_status_reflist_ibfk_1` FOREIGN KEY (`stat_group_id`) REFERENCES `Proj_status_groups` (`stat_group_id`) ON UPDATE CASCADE;
Team_curr_members is a view with the below stand-in structure:
All fields are indexed in the parent table.
--
-- Stand-in structure for view `Team_curr_members`
-- (See below for the actual view)
--
CREATE TABLE `Team_curr_members` (
`proj_id` smallint(5) unsigned
,`func_id` tinyint(3) unsigned
,`user_id` smallint(5) unsigned
);
people_view is a view with the below stand-in structure:
--
-- Stand-in structure for view `people_view`
-- (See below for the actual view)
--
CREATE TABLE `people_view` (
`user_id` smallint(5) unsigned
,`synth_empl` int(1)
,`firstsur` varchar(77)
,`surfirst` varchar(78)
);
After trying many things, I noticed that the problem appeared in more group_concat queries, many of which had been running without problems for several years. Rebuilding the indexes of the database tables did not solve the issue. Since the software had not been altered during the last three years, I concluded that the database engine might be damaged.
I asked IT to create a new virtual server. After building a completely new server environment and migrating the databases and the web interface to it, all problems were solved.
Apparently the problem was not in the queries but in the server environment. I have not discovered which part of the old server environment caused the issue, but possibly one or more MariaDB- or PHP-related files were damaged.
Thanks to those who took the effort to come up with suggestions.

Why does MySQL query return zero rows, but after defrag it works?

I have an InnoDB table in MySQL, and this query returns zero rows:
SELECT id, display FROM ra_table WHERE parent_id=7266 AND display=1;
However, there are actually 17 rows that should match:
SELECT id, display FROM ra_table WHERE parent_id=7266;
ID display
------------------
1748 1
5645 1
...
There is an index on display (int 1), and ID is the primary key. The table also has several other fields which I'm not pulling in this query.
After noticing this query wasn't working, I defragmented the table and then the first query started working correctly, but only for a time. It seems after a few days, the query stops working again and I have to defragment to fix it.
My question is, why does the fragmented table break this query?
Additional info: MySQL 5.6.27 on Amazon RDS.
CREATE TABLE `ra_table` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`parent_id` int(6) NOT NULL,
`display` int(1) NOT NULL,
PRIMARY KEY (`id`),
KEY `parent_id` (`parent_id`),
KEY `display` (`display`)
) ENGINE=InnoDB AUTO_INCREMENT=13302 DEFAULT CHARSET=latin1
ROW_FORMAT=DYNAMIC;
There may be a bug in the version you are running.
Meanwhile, change
INDEX(parent_id),
INDEX(display)
to
INDEX(parent_id, display)
By combining them, the query will run faster (and hopefully correctly). An index on a flag (display) is likely to never be used.
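As a sketch, the index change suggested above could be applied as follows. The index name parent_display is an arbitrary choice, and the EXPLAIN is only there to see which index the optimizer picks afterwards.

```sql
-- Replace the two single-column indexes with one composite index.
-- The index name parent_display is hypothetical.
ALTER TABLE ra_table
  DROP INDEX parent_id,
  DROP INDEX display,
  ADD INDEX parent_display (parent_id, display);

-- Check which index the optimizer chooses for the failing query.
EXPLAIN SELECT id, display FROM ra_table WHERE parent_id = 7266 AND display = 1;
```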

MySQL join with federated view not working

I want to join a table with a view, where one table L is local, whereas the view F is a FEDERATED view residing on another server:
SELECT * FROM L LEFT JOIN F ON L.id = F.id;
Now the JOIN results in no hits despite the fact that there actually are many matches between the table and view. The ID field is bigint.
Frustrated, I created a TEMPORARY table T and dumped everything from F into it, thus making a local copy of F. Using T instead of F, the JOIN works as expected. But the process of creating T consumes memory and time.
What could be possible reasons for this odd MySQL behaviour?
Table definitions:
CREATE TABLE `L` (
`id` bigint(20) NOT NULL,
`id2` bigint(20) NOT NULL,
PRIMARY KEY (`id`,`id2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
and (this table is in fact a view on the remote server):
CREATE TABLE `F` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`field1` bigint(20) NOT NULL,
...
`field5` tinyint(1) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=FEDERATED DEFAULT CHARSET=latin1 CONNECTION='mysql://userName:pword...';
By the definition of the FEDERATED storage engine, the table structure definition (for example, the .frm files for MyISAM) must exist on both servers. That follows from how the FEDERATED engine works.
Therefore, you cannot use a VIEW, since it has a completely different meaning and structure. Instead, mirror your table, and then you will be able to use it in your queries.

Optimize MySQL count query with JOIN

I have a query that takes about 20 seconds, and I would like to understand whether there is a way to optimize it.
Table 1:
CREATE TABLE IF NOT EXISTS `sessions` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(10) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=9845765 ;
And table 2:
CREATE TABLE IF NOT EXISTS `access` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`session_id` int(10) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `session_id` (`session_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=9467799 ;
What I am trying to do is count all the access rows connected to all the sessions of one user, so my query is:
SELECT COUNT(*)
FROM access
INNER JOIN sessions ON access.session_id=sessions.id
WHERE sessions.user_id='6';
It takes almost 20 seconds, and for user_id 6 there are about 3 million sessions stored.
Is there anything I can do to optimize that query?
Change this line in the sessions table:
KEY `user_id` (`user_id`)
To this:
KEY `user_id` (`user_id`, `id`)
This lets the query be answered from the index alone, without going back to the raw table. As it stands, MySQL has to scan the index on the sessions table for your user_id and, for each entry, go back to the table to find the id needed for the join to the access table. With id included in the index, it can skip that trip.
Sadly, this will make inserts into that table slower, which may be a big deal given that a single user has 3 million sessions. SQL Server and Oracle would address this by letting you include the id column in the index without actually indexing on it, which saves a little work at insert time, and by letting you specify a lower fill factor for the index, which reduces the need to rebuild or reorder the index on insert. MySQL doesn't support either.
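A minimal sketch of that change, assuming the table is named sessions as in the CREATE TABLE statement above, might look like this. The EXPLAIN at the end is just a way to check the effect: if the Extra column for sessions shows "Using index", the query is being served from the index alone.

```sql
-- Widen the existing index so that it covers both columns the query needs.
ALTER TABLE sessions
  DROP INDEX user_id,
  ADD INDEX user_id (user_id, id);

-- Inspect the plan for the count query.
EXPLAIN
SELECT COUNT(*)
FROM access
INNER JOIN sessions ON access.session_id = sessions.id
WHERE sessions.user_id = 6;
```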

Mysql Join Query optimization

I have two tables in mysql:
Results Table : 1046928 rows.
Nodes Table : 50 rows.
I am joining these two tables with the following query, and its execution is very slow.
select res.TIndex, res.PNumber, res.Sender, res.Receiver,
sta.Nickname, rta.Nickname from ((Results res join
Nodes sta) join Nodes rta) where ((res.sender_h=sta.name) and
(res.receiver_h=rta.name));
Please help me optimize this query. Right now, even if I want to pull just the top 5 rows, it takes about 5-6 minutes. Thank you.
CREATE TABLE `nodes1` (
`NodeID` int(11) NOT NULL,
`Name` varchar(254) NOT NULL,
`Nickname` varchar(254) NOT NULL,
PRIMARY KEY (`NodeID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `Results1` (
`TIndex` int(11) NOT NULL,
`PNumber` int(11) NOT NULL,
`Sender` varchar(254) NOT NULL,
`Receiver` varchar(254) NOT NULL,
`PTime` datetime NOT NULL,
PRIMARY KEY (`TIndex`,`PNumber`),
KEY `PERIOD_TIME_IDX` (`PTime`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
SELECT res.TIndex ,
res.PNumber ,
res.Sender ,
res.Receiver ,
sta.Nickname ,
rta.Nickname
FROM Results AS res
INNER JOIN Nodes AS sta ON res.sender_h = sta.name
INNER JOIN Nodes AS rta ON res.receiver_h = rta.name;
Create an index on Results (sender_h)
Create an index on Results (receiver_h)
Create an index on Nodes (name)
Joining on the node's name rather than on NodeID (the primary key) doesn't look good at all.
Perhaps you should store NodeID as a foreign key for sender and receiver in the Results table instead of the name. Adding foreign key constraints is a good idea too. Among other things, this may create the indexes automatically, depending on your configuration.
If this change is difficult, at the very least you should enforce uniqueness on the node's name field.
If you change the table definitions in this manner, change your query to John's recommendation, and add the indexes, it should run a lot better and be a lot more readable.
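For the minimal variant (keeping the join on the name columns), the indexes mentioned above could be sketched like this. The index names are arbitrary, and the column names sender_h, receiver_h and name are taken from the join conditions of the query rather than from the posted table definitions, which do not list them.

```sql
-- Index names are hypothetical; the columns are the ones used in the
-- join conditions of the query.
CREATE INDEX idx_results_sender_h   ON Results (sender_h);
CREATE INDEX idx_results_receiver_h ON Results (receiver_h);

-- A unique index speeds up the lookup and also enforces unique node names.
CREATE UNIQUE INDEX idx_nodes_name ON Nodes (name);
```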