Is this SQl Considered a JOIN? - mysql

I am using MySQl to category name from 1 table based on the Category ID in a "Module" table.
I have the below SQL working fine for my needs but I am wanting to know if this is considered a JOIN or not?
Since it does not call a JOIN ?
SELECT `mo_category_fk` , `mo_name_vc` , `mc_name_vc`
FROM x_modcats mc, x_modules m
WHERE mc.mc_id_pk = m.mo_category_fk
AND m.mo_folder_vc = :module

Yes - In MySQL implicit and explicit joins have identical execution plans. You can verify this with EXPLAIN. But here is a sample from another thread:
mysql> explain select * from table1 a inner join table2 b on a.pid = b.pid;
+----+-------------+-------+------+---------------+------+---------+--------------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+--------------+------+-------+
| 1 | SIMPLE | b | ALL | PRIMARY | NULL | NULL | NULL | 986 | |
| 1 | SIMPLE | a | ref | pid | pid | 4 | schema.b.pid | 70 | |
+----+-------------+-------+------+---------------+------+---------+--------------+------+-------+
2 rows in set (0.02 sec)
mysql> explain select * from table1 a, table2 b where a.pid = b.pid;
+----+-------------+-------+------+---------------+------+---------+--------------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+--------------+------+-------+
| 1 | SIMPLE | b | ALL | PRIMARY | NULL | NULL | NULL | 986 | |
| 1 | SIMPLE | a | ref | pid | pid | 4 | schema.b.pid | 70 | |
+----+-------------+-------+------+---------------+------+---------+--------------+------+-------+
2 rows in set (0.00 sec)

Yes you are joining. Per the documentation, , can be used as a substitute for the JOIN keyword .. except you can't use the very helpful ON clause. However, you have a condition that connects the tables in the WHERE clause. In my opinion, it makes more sense to do it as part of the FROM clause:
SELECT mo_category_fk, mo_name_vc, mc_name_vc
FROM x_modcats mc
JOIN x_modules m ON (mc.mc_id_pk = m.mo_category_fk)
WHERE m.mo_folder_vc = :module

Related

How to make an efficient UPDATE like my SELECT in MariaDB

Background
I made a small table of 10 rows from a previous SELECT already ran (SavedAnimals).
I have a massive table (animals) which I would like to UPDATE using the rows with the same id as each row in my new table.
What I have tried so far
I can quickly SELECT the desired rows from the big table like this:
mysql> EXPLAIN SELECT * FROM animals WHERE ignored=0 and id IN (SELECT animal_id FROM SavedAnimals);
+------+--------------+-------------------------------+--------+---------------+---------+---------+----------------------------------------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------+-------------------------------+--------+---------------+---------+---------+----------------------------------------------------------+------+-------------+
| 1 | PRIMARY | <subquery2> | ALL | distinct_key | NULL | NULL | NULL | 10 | |
| 1 | PRIMARY | animals | eq_ref | PRIMARY | PRIMARY | 8 | db_staging.SavedAnimals.animal_id | 1 | Using where |
| 2 | MATERIALIZED | SavedAnimals | ALL | NULL | NULL | NULL | NULL | 10 | |
+------+--------------+-------------------------------+--------+---------------+---------+---------+----------------------------------------------------------+------+-------------+
But the "same" command on the UPDATE is not quick:
mysql> EXPLAIN UPDATE animals SET ignored=1, ignored_when=CURRENT_TIMESTAMP WHERE ignored=0 and id IN (SELECT animal_id FROM SavedAnimals);
+------+--------------------+-------------------------------+-------+---------------+---------+---------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------------+-------------------------------+-------+---------------+---------+---------+------+----------+-------------+
| 1 | PRIMARY | animals | index | NULL | PRIMARY | 8 | NULL | 34269464 | Using where |
| 2 | DEPENDENT SUBQUERY | SavedAnimals | ALL | NULL | NULL | NULL | NULL | 10 | Using where |
+------+--------------------+-------------------------------+-------+---------------+---------+---------+------+----------+-------------+
2 rows in set (0.00 sec)
The UPDATE command never finishes if I run it.
QUESTION
How do I make mariaDB run with the Materialized select_type on the UPDATE like it does on the SELECT?
OR
Is there a totally separate way that I should approach this which would be quick?
Notes
Version: 10.3.23-MariaDB-log
Use JOIN rather than WHERE...IN. MySQL tends to optimize them better.
UPDATE animals AS a
JOIN SavedAnimals AS sa ON a.id = sa.animal_id
SET a.ignored=1, a.ignored_when=CURRENT_TIMESTAMP
WHERE a.ignored = 0
You should find an EXISTS clause more efficient than an IN clause. For example:
UPDATE animals a
SET a.ignored = 1,
a.ignored_when = CURRENT_TIMESTAMP
WHERE a.ignored = 0
AND EXISTS (SELECT * FROM SavedAnimals sa WHERE sa.animal_id = a.id)

MySQL : Indexes for View based on an aggregated query

I have a working, nice, indexed SQL query aggregating notes (sum of ints) for all my users and others stuffs. This is "query A".
I want to use this aggregated notes in others queries, say "query B".
If I create a View based on "query A", will the indexes of the original query will be used when needed if I join it in "query B" ?
Is that true for MySQL ? For others flavors of SQL ?
Thanks.
In MySQL, you cannot create an index on a view. MySQL uses indexes of the underlying tables when you query data against the views that use the merge algorithm. For the views that use the temptable algorithm, indexes are not utilized when you query data against the views.
https://www.percona.com/blog/2007/08/12/mysql-view-as-performance-troublemaker/
Here's a demo table. It has a userid attribute column and a note column.
mysql> create table t (id serial primary key, userid int not null, note int, key(userid,note));
If you do an aggregation to get the sum of note per userid, it does an index-scan on (userid, note).
mysql> explain select userid, sum(note) from t group by userid;
+----+-------------+-------+-------+---------------+--------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+--------+---------+------+------+-------------+
| 1 | SIMPLE | t | index | userid | userid | 9 | NULL | 1 | Using index |
+----+-------------+-------+-------+---------------+--------+---------+------+------+-------------+
1 row in set (0.00 sec)
If we create a view for the same query, then we can see that querying the view uses the same index on the underlying table. Views in MySQL are pretty much like macros — they just query the underlying table.
mysql> create view v as select userid, sum(note) from t group by userid;
Query OK, 0 rows affected (0.03 sec)
mysql> explain select * from v;
+----+-------------+------------+-------+---------------+--------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+--------+---------+------+------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 2 | NULL |
| 2 | DERIVED | t | index | userid | userid | 9 | NULL | 1 | Using index |
+----+-------------+------------+-------+---------------+--------+---------+------+------+-------------+
2 rows in set (0.00 sec)
So far so good.
Now let's create a table to join with the view, and join to it.
mysql> create table u (userid int primary key, name text);
Query OK, 0 rows affected (0.09 sec)
mysql> explain select * from v join u using (userid);
+----+-------------+------------+-------+---------------+-------------+---------+---------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+-------------+---------+---------------+------+-------------+
| 1 | PRIMARY | u | ALL | PRIMARY | NULL | NULL | NULL | 1 | NULL |
| 1 | PRIMARY | <derived2> | ref | <auto_key0> | <auto_key0> | 4 | test.u.userid | 2 | NULL |
| 2 | DERIVED | t | index | userid | userid | 9 | NULL | 1 | Using index |
+----+-------------+------------+-------+---------------+-------------+---------+---------------+------+-------------+
3 rows in set (0.01 sec)
I tried to use hints like straight_join to force it to read v then join to u.
mysql> explain select * from v straight_join u on (v.userid=u.userid);
+----+-------------+------------+-------+---------------+--------+---------+------+------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+--------+---------+------+------+----------------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 7 | NULL |
| 1 | PRIMARY | u | ALL | PRIMARY | NULL | NULL | NULL | 1 | Using where; Using join buffer (Block Nested Loop) |
| 2 | DERIVED | t | index | userid | userid | 9 | NULL | 7 | Using index |
+----+-------------+------------+-------+---------------+--------+---------+------+------+----------------------------------------------------+
"Using join buffer (Block Nested Loop)" is MySQL's terminology for "no index used for the join." It's just looping over the table the hard way -- by reading batches of rows from start to finish of the table.
I tried to use force index to tell MySQL that type=ALL is to be avoided.
mysql> explain select * from v straight_join u force index(PRIMARY) on (v.userid=u.userid);
+----+-------------+------------+--------+---------------+---------+---------+----------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+---------------+---------+---------+----------+------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 7 | NULL |
| 1 | PRIMARY | u | eq_ref | PRIMARY | PRIMARY | 4 | v.userid | 1 | NULL |
| 2 | DERIVED | t | index | userid | userid | 9 | NULL | 7 | Using index |
+----+-------------+------------+--------+---------------+---------+---------+----------+------+-------------+
Maybe this is using an index for the join? But it's weird that table u is before table t in the EXPLAIN. I'm frankly not sure how to understand what it's doing, given the order of rows in this EXPLAIN report. I would expect the joined table should come after the primary table of the query.
I only put a few rows of data into each table. One might get some different EXPLAIN results with a larger representative sample of test data. I'll leave that to you to try.

Why does the output of EXPLAIN change after each SHOW index?

I was trying to improve performance on some queries through indexes using EXPLAIN and I noticed each time I used SHOW index FROM TableB; the output of the rows colums in the EXPLAIN of a query changed
Ex:
mysql> EXPLAIN Select A.id
From TableA A
Inner join TableB B
On A.address = B.address And A.code = B.code
Group by A.id
Having count(distinct B.id) = 1;
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
| 1 | SIMPLE | B | index | test_index | PRIMARY | 518 | NULL | 10561 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | A | eq_ref | PRIMARY | PRIMARY | 514 | db.B.address,db.B.code | 1 | |
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
2 rows in set (0.00 sec)
mysql> show index from TableB;
+-----------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-----------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| TableB | 0 | PRIMARY | 1 | id | A | 7 | NULL | NULL | | BTREE | |
| TableB | 0 | PRIMARY | 2 | address | A | 21 | NULL | NULL | | BTREE | |
| TableB | 0 | PRIMARY | 3 | code | A | 10402 | NULL | NULL | | BTREE | |
| TableB | 1 | test_index | 1 | address | A | 1 | NULL | NULL | | BTREE | |
| TableB | 1 | test_index | 2 | code | A | 10402 | NULL | NULL | | BTREE | |
| TableB | 1 | test_index | 3 | id | A | 10402 | NULL | NULL | | BTREE | |
+-----------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
6 rows in set (0.03 sec)
and...
mysql> EXPLAIN Select A.id
From TableA A
Inner join TableB B
On A.address = B.address And A.code = B.code Group by A.id
Having count(distinct B.id) = 1;
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
| 1 | SIMPLE | B | index | test_index | PRIMARY | 518 | NULL | 9800 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | A | eq_ref | PRIMARY | PRIMARY | 514 | db.B.address,db.B.code | 1 | |
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
2 rows in set (0.00 sec)
Why does this happen?
The rows column should be taken as a rough estimate only. It's not a precise number.
It's based on statistical estimates of how many rows will be examined during a query. The actual number of rows cannot be known until you actually execute the query.
The statistics are based on samples read from the table periodically. These samples are re-read occasionally, for example after you run ANALYZE TABLE or certain INFORMATION_SCHEMA queries, or certain SHOW statements.
I don't find 20% variation in statistics to be a big deal. In many situations, think of the graph being like an upturned parabola, and you need to know which side of the minimum point you are on. In complex queries, where the Optimizer is likely to goof, it need a lot more than simple stats, such as Histograms of MariaDB 10.0 / 10.1. (I don't have enough experience with such to say whether that makes much headway.)
Your particular query is probably going to be performed in only one way, regardless of the statistics. An example of a complicated query would be a JOIN with WHERE clauses filtering each table. The optimizer has to decide which table to start with. Another case is a single table with a WHERE and ORDER BY and they cannot both be handled by a single index -- should it use an index to filter, but then have to sort? or should it use an index for ORDER BY, but then have to filter on the fly?

How do I create an efficient DQL statement, to match my efficient SQL when doing a simple LEFT JOIN?

I can craft a simple SQL that returns 1 when there is an item with requested id and corresponding model that matches FS-%, and 0 otherwise.
But when I try to write it as DQL, I fail in all spectacular ways. See EXPLAIN results below.
Question: How do I write an efficient DQL?
SQL (efficient)
select count(*)
from item
left join product on item.product_id = product.id
where item.id=2222 and product.model like "FS-%";
Using Explain:
+----+-------------+---------+-------+--------------------+---------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+--------------------+---------+---------+-------+------+-------+
| 1 | SIMPLE | item | const | PRIMARY,product_id | PRIMARY | 4 | const | 1 | |
| 1 | SIMPLE | product | const | PRIMARY | PRIMARY | 4 | const | 1 | |
+----+-------------+---------+-------+--------------------+---------+---------+-------+------+-------+
DQL (NOT efficient)
$this->getEntityManager()
->createQueryBuilder()
->select('count(i)')
->from(Item::class, 'i')
->leftJoin(Product::class, 'p')
->where('i.id = :id')
->andWhere('p.model like :model')
->setParameter('id', 2222)
->setParameter('model', 'FS-%')
->getQuery()->getSingleResult();
Resulting SQL:
SELECT * FROM item i0_ LEFT JOIN product p1_
ON (i0_.id = 409264 AND p1_.model LIKE 'FS-%');
Using Explain:
+----+-------------+-------+------+---------------+------+---------+------+--------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------+
| 1 | SIMPLE | i0_ | ALL | NULL | NULL | NULL | NULL | 276000 | |
| 1 | SIMPLE | p1_ | ALL | NULL | NULL | NULL | NULL | 564 | |
+----+-------------+-------+------+---------------+------+---------+------+--------+-------+
2 rows in set (0.00 sec)
Note: I used https://stackoverflow.com/a/25887143/2883328 to help me write the DQL.
in your case I would have tried this query:
$this->getEntityManager()
->createQueryBuilder()
->select('count(i)')
->from(Item::class, 'i')
->leftJoin(Product::class, 'p', 'WITH', 'i.product = p.id')
->where('i.id = :id')
->andWhere('p.model like :model')
->setParameter('id', 2222)
->setParameter('model', 'FS-%')
->getQuery()->getSingleScalarResult();
change product in i.product to your property name

slow mysql count because of subselect

how to make this select statement more faster?
the first left join with the subselect is making it slower...
mysql> SELECT COUNT(DISTINCT w1.id) AS AMOUNT FROM tblWerbemittel w1
JOIN tblVorgang v1 ON w1.object_group = v1.werbemittel_id
INNER JOIN ( SELECT wmax.object_group, MAX( wmax.object_revision ) wmaxobjrev FROM tblWerbemittel wmax GROUP BY wmax.object_group ) AS wmaxselect ON w1.object_group = wmaxselect.object_group AND w1.object_revision = wmaxselect.wmaxobjrev
LEFT JOIN ( SELECT vmax.object_group, MAX( vmax.object_revision ) vmaxobjrev FROM tblVorgang vmax GROUP BY vmax.object_group ) AS vmaxselect ON v1.object_group = vmaxselect.object_group AND v1.object_revision = vmaxselect.vmaxobjrev
LEFT JOIN tblWerbemittel_has_tblAngebot wha ON wha.werbemittel_id = w1.object_group
LEFT JOIN tblAngebot ta ON ta.id = wha.angebot_id
LEFT JOIN tblLieferanten tl ON tl.id = ta.lieferant_id AND wha.zuschlag = (SELECT MAX(zuschlag) FROM tblWerbemittel_has_tblAngebot WHERE werbemittel_id = w1.object_group)
WHERE w1.flags =0 AND v1.flags=0;
+--------+
| AMOUNT |
+--------+
| 1982 |
+--------+
1 row in set (1.30 sec)
Some indexes has been already set and as EXPLAIN shows they were used.
+----+--------------------+-------------------------------+--------+----------------------------------------+----------------------+---------+-----------------------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------------------------------+--------+----------------------------------------+----------------------+---------+-----------------------------------------------+------+----------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 2072 | |
| 1 | PRIMARY | v1 | ref | werbemittel_group,werbemittel_id_index | werbemittel_group | 4 | wmaxselect.object_group | 2 | Using where |
| 1 | PRIMARY | <derived3> | ALL | NULL | NULL | NULL | NULL | 3376 | |
| 1 | PRIMARY | w1 | eq_ref | object_revision,or_og_index | object_revision | 8 | wmaxselect.wmaxobjrev,wmaxselect.object_group | 1 | Using where |
| 1 | PRIMARY | wha | ref | PRIMARY,werbemittel_id_index | werbemittel_id_index | 4 | dpd.w1.object_group | 1 | |
| 1 | PRIMARY | ta | eq_ref | PRIMARY | PRIMARY | 4 | dpd.wha.angebot_id | 1 | |
| 1 | PRIMARY | tl | eq_ref | PRIMARY | PRIMARY | 4 | dpd.ta.lieferant_id | 1 | Using index |
| 4 | DEPENDENT SUBQUERY | tblWerbemittel_has_tblAngebot | ref | PRIMARY,werbemittel_id_index | werbemittel_id_index | 4 | dpd.w1.object_group | 1 | |
| 3 | DERIVED | vmax | index | NULL | object_revision_uq | 8 | NULL | 4668 | Using index; Using temporary; Using filesort |
| 2 | DERIVED | wmax | range | NULL | or_og_index | 4 | NULL | 2168 | Using index for group-by |
+----+--------------------+-------------------------------+--------+----------------------------------------+----------------------+---------+-----------------------------------------------+------+----------------------------------------------+
10 rows in set (0.01 sec)
The main problem while the statement above takes about 2 seconds seems to be the subselect where no index can be used.
How to write the statement even more faster?
Thanks for help. MT
Do you have the following indexes?
for tblWerbemittel - object_group, object_revision
for tblVorgang - object_group, object_revision
for tblWerbemittel_has_tblAngebot - werbemittel_id, zuschlag
Let me know if that helps, there are a few more that I can see might help but try those first.
EDIT
Can you try these two queries and see if they run fast?
SELECT w1.id AS AMOUNT
FROM tblWerbemittel w1 INNER JOIN
(SELECT wmax.object_group,
MAX( wmax.object_revision ) AS wmaxobjrev
FROM tblWerbemittel AS wmax
GROUP BY wmax.object_group ) AS wmaxselect ON w1.object_group = wmaxselect.object_group AND
w1.object_revision = wmaxselect.wmaxobjrev
WHERE w1.flags = 0
SELECT v1.werbemittel_id
FROM tblVorgang v1 LEFT JOIN
(SELECT vmax.object_group,
MAX( vmax.object_revision ) AS vmaxobjrev
FROM tblVorgang AS vmax
GROUP BY vmax.object_group ) AS vmaxselect ON v1.object_group = vmaxselect.object_group AND
v1.object_revision = vmaxselect.vmaxobjrev LEFT JOIN
WHERE v1.flags = 0
While I consider I don't have sufficient data to provide a 100% correct answer, but I can throw in a handful of tips.
Forst of all, MYSQL is stupid. Bear that in mind and always rearrange your queries so that the most data is excluded at the beginning. For instance, if the last join reduced the number of results from 10k to 2k while the others don't, try swapping their positions so that each subsequent join operates on the smallest subset of data possible.
Same applies to the WHERE clause.
Also, joins tend to be slower than subqueries. I don't know if that's a rule or just something that I'm observing in my case, but you can always try to substitute a join or two with a subquery.
While I suppose this doesn't really answer your question, I hope it at least gives you an idea about where to start looking for optimisations.