INNER join taking way too long - mysql

I asked the original question here on Stack Overflow before. Apologies if this is not the best way to go about this.
The problem is that I have a query that, even with an INNER JOIN, takes at least 5 seconds to complete, and I'm wondering if there is a faster way to do this. Here is the answer I was given:
` q = "SELECT DISTINCT e2.eventId FROM event_tags e1 INNER JOIN event_tags e2 " \
"ON BINARY e2.tagName=e1.tagName AND e2.eventId != e1.eventId " \
"WHERE e1.eventId = {} ORDER BY RAND() LIMIT {}".format(eventId, '10')`
My tags table looks like this:
mysql> describe event_tags;
+---------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+------------------+------+-----+---------+----------------+
| tagId | int(10) unsigned | NO | PRI | NULL | auto_increment |
| tagName | text | NO | | NULL | |
| eventId | int(10) unsigned | NO | PRI | NULL | |
+---------+------------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)
I have a lot of tags in it, and the table will only continue to grow. A count on the tags table gives 504,402 for tagId, and the same for tagName. How can I make the lookup faster?
Here is some sample data from the event_tags table:
mysql> select * from event_tags limit 40;
+-------+-------------------------------------------+---------+
| tagId | tagName | eventId |
+-------+-------------------------------------------+---------+
| 261 | Justin Timberlake (Rescheduled from 11/9) | 38 |
| 264 | Rogers Arena | 38 |
| 267 | Pop | 38 |
| 271 | Rock | 38 |
| 285 | Justin Timberlake (Rescheduled from 11/8) | 41 |
| 288 | Rogers Arena | 41 |
| 291 | Pop | 41 |
| 294 | Rock | 41 |
| 595 | Yogesh Soman | 84 |
| 599 | Geetanjali Kulkarni | 84 |
| 602 | Bhagyashree Shankpal | 84 |
| 606 | Lalit Prabhakar | 84 |
| 611 | Sameer Sanjay Vidwans | 84 |
| 617 | Drama | 84 |
| 647 | Shrihari Abhyankar | 89 |
| 651 | Deepali Borkar | 89 |
| 654 | Akash Kamble | 89 |
| 657 | Sharavi Kulkarni | 89 |
| 660 | Sharav Wadhawekar | 89 |
| 667 | Nipun Dharmadhikari | 89 |
| 670 | Drama | 89 |
| 689 | Frank Grillo | 94 |
| 692 | Jamie Bell | 94 |
| 695 | Margaret Qualley | 94 |
| 700 | James Badge Dale | 94 |
| 704 | Tim Sutton | 94 |
| 710 | Drama | 94 |
| 734 | Bruce Dern | 101 |
| 739 | Anthony Michael Hall | 101 |
| 745 | Sean Astin | 101 |
| 749 | Aly Michalka | 101 |
| 754 | Victoria Smurfit | 101 |
| 759 | Carl Bessai | 101 |
| 762 | Drama | 101 |
| 783 | Sarah Clarke | 106 |
| 785 | Xander Berkeley | 106 |
| 787 | Kristen Gutoskie | 106 |
| 790 | Mackenzie Astin | 106 |
| 794 | Bobby Campo | 106 |
| 798 | Adam Cushman | 106 |
+-------+-------------------------------------------+---------+
40 rows in set (0.00 sec)
and here is the CREATE statement for the table:
CREATE TABLE IF NOT EXISTS event_tags(
tagId INT UNSIGNED NOT NULL AUTO_INCREMENT,
tagName VARCHAR(40) NOT NULL,
eventId INT UNSIGNED NOT NULL,
PRIMARY KEY(tagId, eventId)
);
Here is the EXPLAIN for the query:
mysql> EXPLAIN SELECT DISTINCT e2.eventId FROM event_tags e1 INNER JOIN event_tags e2 ON BINARY e2.tagName=e1.tagName AND e2.eventId != e1.eventId WHERE e1.eventId = 487 ORDER BY RAND() LIMIT 10
-> ;
+----+-------------+-------+------+---------------+------+---------+------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+-------+----------------------------------------------+
| 1 | SIMPLE | e1 | ALL | NULL | NULL | NULL | NULL | 34275 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | e2 | ALL | NULL | NULL | NULL | NULL | 34275 | Using where; Using join buffer |
+----+-------------+-------+------+---------------+------+---------+------+-------+----------------------------------------------+
2 rows in set (0.03 sec)
UPDATE: I created an index on the table with:
CREATE INDEX tagsNdx ON event_tags (eventId, tagName(255));
The indexes now look like this:
mysql> show index from event_tags;
+------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| event_tags | 0 | PRIMARY | 1 | tagId | A | 455408 | NULL | NULL | | BTREE | | |
| event_tags | 0 | PRIMARY | 2 | eventId | A | 455408 | NULL | NULL | | BTREE | | |
| event_tags | 1 | tagsNdx | 1 | eventId | A | 186 | NULL | NULL | | BTREE | | |
| event_tags | 1 | tagsNdx | 2 | tagName | A | 186 | 255 | NULL | | BTREE | | |
+------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.00 sec)
But it's still slow.

The following optimizations are possible:
Remove the column eventId from the primary key (this step is optional and can be elaborated on further if you like).
Create an index on the columns (eventId, tagName).
Run the command: ANALYZE TABLE event_tags
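
To make the index suggestion concrete, here is a minimal sketch in Python. It assumes an in-memory SQLite database as a stand-in for MySQL, so RANDOM() and ? placeholders replace RAND() and the MySQL driver's own placeholder style. The index name tagNameNdx and the helper similar_events are made up for illustration. Since the join matches rows on tagName, the sketch adds a second index that leads with tagName. It also leaves BINARY out of the join condition, on the assumption that wrapping the column in BINARY may keep MySQL from using an index on it, and it passes values as query parameters rather than through format().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event_tags (
        tagId   INTEGER PRIMARY KEY,
        tagName TEXT NOT NULL,
        eventId INTEGER NOT NULL
    );
    -- Serves the WHERE e1.eventId = ? lookup (mirrors the asker's tagsNdx).
    CREATE INDEX tagsNdx ON event_tags (eventId, tagName);
    -- Hypothetical second index that leads with tagName, for the join to e2.
    CREATE INDEX tagNameNdx ON event_tags (tagName, eventId);
""")
conn.executemany(
    "INSERT INTO event_tags (tagId, tagName, eventId) VALUES (?, ?, ?)",
    [
        (264, "Rogers Arena", 38), (267, "Pop", 38), (271, "Rock", 38),
        (288, "Rogers Arena", 41), (291, "Pop", 41), (294, "Rock", 41),
        (617, "Drama", 84), (670, "Drama", 89), (710, "Drama", 94),
    ],
)


def similar_events(conn, event_id, limit=10):
    """Return up to `limit` other events sharing at least one tag with event_id."""
    sql = (
        "SELECT eventId FROM ("
        "  SELECT DISTINCT e2.eventId AS eventId"
        "  FROM event_tags e1"
        "  JOIN event_tags e2"
        "    ON e2.tagName = e1.tagName AND e2.eventId != e1.eventId"
        "  WHERE e1.eventId = ?"
        ") ORDER BY RANDOM() LIMIT ?"
    )
    # Values are bound as parameters, never interpolated into the SQL text.
    return [row[0] for row in conn.execute(sql, (event_id, limit))]


print(similar_events(conn, 38))
print(sorted(similar_events(conn, 84)))
```

On MySQL it is worth re-running EXPLAIN after adding such an index to check whether the e2 row still shows type ALL.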

How to find a specific row in an index

I am new to MySQL and am not very familiar with it. I am supposed to find a name that starts with b in the index, and I do not have the slightest clue how to do that.
| id | name | DoB | class | marks | dept_id |
+----+--------------+------------+-------+-------+---------+
| 1 | Data Science | 2006-07-15 | 11 | 100 | 3 |
| 2 | Garry | 2006-08-20 | 11 | 92 | 4 |
| 3 | Jane | 2006-03-22 | 10 | 95 | 2 |
| 4 | Benny | 2005-10-10 | 12 | 74 | 4 |
| 5 | Karen | 2005-01-15 | 12 | 88 | 3 |
| 6 | Camy | 2006-04-18 | 12 | 91 | 2 |
| 7 | Farhan | 2006-09-21 | 11 | 80 | NULL |
| 8 | Shamil | 2005-10-19 | 11 | 90 | 3 |
+----+--------------+------------+-------+-------+---------+
The table above is students.
The one below is the index listing for students:
+----------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+----------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| students | 0 | PRIMARY | 1 | id | A | 6 | NULL | NULL | | BTREE | | | YES | NULL |
| students | 1 | dept_id | 1 | dept_id | A | 3 | NULL | NULL | YES | BTREE | | | YES | NULL |
| students | 1 | name_index | 1 | name | A | 6 | NULL | NULL | | BTREE | | | YES | NULL |
| students | 1 | index_name | 1 | name | A | 6 | NULL | NULL | | BTREE | | | YES | NULL |
+----------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
You have to use a SELECT query with the LIKE keyword.
The percent sign (%) wildcard matches any string of zero or more characters. For example, b% matches any string that starts with the character b, such as Benny and Ben.
SELECT * FROM students WHERE name LIKE 'b%';
If some names start with an uppercase letter and others with a lowercase letter, and you want a case-sensitive match, put the BINARY keyword after the LIKE keyword:
SELECT * FROM students WHERE name LIKE BINARY 'b%';
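
The difference between the two queries can be tried out with a small sketch in Python. It assumes an in-memory SQLite database as a stand-in for MySQL and only a subset of the students rows. SQLite has no LIKE BINARY, so the sketch uses GLOB (which is case-sensitive in SQLite) to play that role. That substitution is an assumption of this example, not MySQL syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("CREATE INDEX name_index ON students (name)")
conn.executemany(
    "INSERT INTO students (id, name) VALUES (?, ?)",
    [(1, "Data Science"), (2, "Garry"), (3, "Jane"),
     (4, "Benny"), (5, "Karen"), (6, "Camy")],
)


def names_where(condition, pattern):
    """Run a prefix search with the given condition and return matching names."""
    sql = f"SELECT name FROM students WHERE name {condition} ? ORDER BY name"
    return [row[0] for row in conn.execute(sql, (pattern,))]


# LIKE ignores ASCII case by default here, so 'b%' finds "Benny".
any_case = names_where("LIKE", "b%")
# GLOB is case-sensitive: a lowercase 'b*' finds nothing, 'B*' finds "Benny".
lower_only = names_where("GLOB", "b*")
upper_only = names_where("GLOB", "B*")

print(any_case, lower_only, upper_only)
```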

Complex INSERT or UPDATE MariaDB tables with data from other MariaDB tables using JOIN or UNION

I need to INSERT or UPDATE data in a table using data from other tables. I understand the basic form:
insert into table (a,b,c)
select h, i, j
from otherTable
where........
My challenge comes from the fact that the data is spread across multiple tables, and in one of the tables the data is metadata stored in rows, not columns. Therefore I need to use JOIN and possibly UNION to get what is needed.
Unfortunately, after trying everything I read in the MariaDB manual, on the MariaDB forum, and on Stack Overflow, I cannot get it to work.
Here is what I am attempting to do:
insert data into dbc_jot_groupmembers in the following fields using source data as shown:
jot_grpid = dbc_bp_groups_members.group_id
jot_bbmemid = dbc_bp_groups_members.user_id
jot_grpmemname = dbc_bp_xprofile_data.value where field_id=3
jot_grpmemnum = dbc_bp_xprofile_data.value where field_id=4
I need the final result to look like this:
select * from dbc_jot_groupmembers;
+--------------+-----------+----------------+---------------+---------------------+-------------+
| jot_grpmemid | jot_grpid | jot_grpmemname | jot_grpmemnum | jot_grpmemts | jot_bbmemid |
+--------------+-----------+----------------+---------------+---------------------+-------------+
| 1 | 17 | hutchdad | +17047047045 | 2021-06-15 14:56:19 | 14 |
| 2 | 24 | hutchdad | +17047047045 | 2021-06-15 19:49:58 | 14 |
| 3 | 25 | hutchdad | +17047047045 | 2021-06-15 19:49:58 | 14 |
| 4 | 17 | hutchmom | +17773274355 | 2021-06-15 19:49:58 | 15 |
| 5 | 24 | hutchmom | +17773274355 | 2021-06-15 19:49:58 | 15 |
| 6 | 16 | ledwards | +14567655645 | 2021-06-15 19:49:58 | 11 |
| 7 | 16 | medwards | +12223334545 | 2021-06-15 19:49:58 | 10 |
| 7 | 20 | medwards | +12223334545 | 2021-06-15 19:49:58 | 10 |
SAMPLE DATA FROM SOURCE TABLES AND TABLE DEFINITIONS:
MariaDB [devDisciplePlaceCom]> describe dbc_bp_groups_members;
+---------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+---------+----------------+
| id | bigint(20) | NO | PRI | NULL | auto_increment |
| group_id | bigint(20) | NO | MUL | NULL | |
| user_id | bigint(20) | NO | MUL | NULL | |
| inviter_id | bigint(20) | NO | MUL | NULL | |
| is_admin | tinyint(1) | NO | MUL | 0 | |
| is_mod | tinyint(1) | NO | MUL | 0 | |
| user_title | varchar(100) | NO | | NULL | |
| date_modified | datetime | NO | | NULL | |
| comments | longtext | NO | | NULL | |
| is_confirmed | tinyint(1) | NO | MUL | 0 | |
| is_banned | tinyint(1) | NO | | 0 | |
| invite_sent | tinyint(1) | NO | | 0 | |
+---------------+--------------+------+-----+---------+----------------+
12 rows in set (0.002 sec)
describe dbc_bp_xprofile_data;
+--------------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+---------------------+------+-----+---------+----------------+
| id | bigint(20) unsigned | NO | PRI | NULL | auto_increment |
| field_id | bigint(20) unsigned | NO | MUL | NULL | |
| user_id | bigint(20) unsigned | NO | MUL | NULL | |
| value | longtext | NO | | NULL | |
| last_updated | datetime | NO | | NULL | |
+--------------+---------------------+------+-----+---------+----------------+
5 rows in set (0.001 sec)
THIS IS THE LIST OF GROUPS AND WHICH USERS ARE IN THEM.
select group_id,user_id from dbc_bp_groups_members ;
+----------+---------+
| group_id | user_id |
+----------+---------+
| 16 | 13 |
| 16 | 12 |
| 16 | 11 |
| 16 | 10 |
| 17 | 14 |
| 17 | 15 |
| 17 | 16 |
| 17 | 17 |
| 17 | 18 |
| 17 | 19 |
| 20 | 10 |
| 24 | 14 |
| 24 | 16 |
| 24 | 15 |
| 24 | 17 |
| 24 | 19 |
| 25 | 19 |
| 25 | 14 |
| 1 | 14 |
| 11 | 14 |
+----------+---------+
20 rows in set (0.000 sec)
THIS IS THE TABLE CONTAINING THE USERS' METADATA. IN MY CASE I NEED THE PHONE NUMBER AND NAME, WHICH ARE IN THE value FIELD WITH A field_id OF 3 AND 4.
select * from dbc_bp_xprofile_data where user_id > 9 and field_id > 2 AND field_id < 5;
+-----+----------+---------+---------------+---------------------+
| id | field_id | user_id | value | last_updated |
+-----+----------+---------+---------------+---------------------+
| 31 | 3 | 10 | medwards | 2021-06-24 03:11:59 |
| 34 | 3 | 11 | ledwards | 2021-06-24 03:11:24 |
| 37 | 3 | 12 | nedwards | 2021-04-24 14:47:18 |
| 40 | 3 | 13 | iedwards | 2021-04-24 14:47:52 |
| 43 | 3 | 14 | hutchdad | 2021-06-21 14:53:08 |
| 46 | 3 | 15 | hutchmom | 2021-06-24 03:10:58 |
| 49 | 3 | 16 | hutchdaughter | 2021-04-24 16:54:48 |
| 52 | 3 | 17 | hutchson1 | 2021-04-24 16:55:43 |
| 55 | 3 | 18 | hutchson2 | 2021-04-24 16:57:42 |
| 58 | 3 | 19 | hutchson3 | 2021-04-24 16:58:44 |
| 78 | 3 | 25 | demoadmin | 2021-06-08 02:01:39 |
| 158 | 4 | 14 | 7047047045 | 2021-06-21 14:53:08 |
| 190 | 3 | 58 | dupdup | 2021-06-23 19:46:19 |
| 191 | 4 | 15 | 7773274355 | 2021-06-24 03:10:58 |
| 193 | 4 | 11 | 4567655645 | 2021-06-24 03:11:24 |
| 195 | 4 | 10 | 2223334545 | 2021-06-24 03:11:59 |
+-----+----------+---------+---------------+---------------------+
16 rows in set (0.000 sec)
If this cannot be done in a single INSERT, then I can use an INSERT with subsequent UPDATE statements. I also understand that this is not best practice and violates 3NF and probably several other best-practice principles. Unfortunately, I am at the mercy of the application and cannot change the code, so the only way to get this to work is to put duplicate data in the database as described above.
It can be done with a single INSERT. However, some information still needs to be clarified, as I noted in my comment. In the meantime, here is an example query that you can use to do what you want:
SELECT ROW_NUMBER() OVER (ORDER BY A.group_id, A.user_id) AS 'jot_grpmemid',
A.group_id AS 'jot_grpid',
MAX(CASE WHEN B.field_id=3 THEN B.value ELSE '' END) AS 'jot_grpmemname',
MAX(CASE WHEN B.field_id=4 THEN CONCAT('+',B.value) ELSE '' END) AS 'jot_grpmemnum',
A.user_id AS 'jot_bbmemid'
FROM
dbc_bp_groups_members A JOIN dbc_bp_xprofile_data B
ON A.user_id=B.user_id
GROUP BY A.group_id, A.user_id;
As I said in the comment, I'm not sure how you get or generate jot_grpmemid, because you have two 7s in the expected result, so I assume it's a typo. At this point it's up to you to modify the query accordingly.
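
The SELECT above can be wrapped in an INSERT ... SELECT. Here is a minimal sketch in Python, assuming an in-memory SQLite database as a stand-in for MariaDB (so || replaces CONCAT) and a subset of the sample rows. The layout of dbc_jot_groupmembers is not shown in the question, so the definition below is an assumption: jot_grpmemid is left to auto-increment and jot_grpmemts to default to the current time. The '+1' prefix is also an assumption, read off the expected output (7047047045 becomes +17047047045).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dbc_bp_groups_members (group_id INTEGER, user_id INTEGER);
    CREATE TABLE dbc_bp_xprofile_data (field_id INTEGER, user_id INTEGER, value TEXT);
    -- Assumed layout: the question does not show this table's definition.
    CREATE TABLE dbc_jot_groupmembers (
        jot_grpmemid   INTEGER PRIMARY KEY AUTOINCREMENT,
        jot_grpid      INTEGER,
        jot_grpmemname TEXT,
        jot_grpmemnum  TEXT,
        jot_grpmemts   TEXT DEFAULT CURRENT_TIMESTAMP,
        jot_bbmemid    INTEGER
    );
""")
conn.executemany(
    "INSERT INTO dbc_bp_groups_members (group_id, user_id) VALUES (?, ?)",
    [(16, 12), (16, 11), (16, 10), (17, 14), (20, 10)],
)
conn.executemany(
    "INSERT INTO dbc_bp_xprofile_data (field_id, user_id, value) VALUES (?, ?, ?)",
    [(3, 10, "medwards"), (3, 11, "ledwards"), (3, 12, "nedwards"),
     (3, 14, "hutchdad"), (4, 14, "7047047045"), (4, 11, "4567655645"),
     (4, 10, "2223334545")],
)

# Pivot the row-based metadata (field_id 3 = name, 4 = phone) into columns
# and insert one row per (group, user) pair in a single statement.
conn.execute("""
    INSERT INTO dbc_jot_groupmembers
        (jot_grpid, jot_grpmemname, jot_grpmemnum, jot_bbmemid)
    SELECT m.group_id,
           MAX(CASE WHEN p.field_id = 3 THEN p.value END),
           MAX(CASE WHEN p.field_id = 4 THEN '+1' || p.value END),
           m.user_id
    FROM dbc_bp_groups_members m
    JOIN dbc_bp_xprofile_data p ON p.user_id = m.user_id
    WHERE p.field_id IN (3, 4)
    GROUP BY m.group_id, m.user_id
    ORDER BY m.group_id, m.user_id
""")

rows = conn.execute("""
    SELECT jot_grpid, jot_bbmemid, jot_grpmemname, jot_grpmemnum
    FROM dbc_jot_groupmembers
    ORDER BY jot_grpid, jot_bbmemid
""").fetchall()
for row in rows:
    print(row)
```

In this sketch a member with a name but no phone number (user 12) gets a NULL number. Whether such members should be inserted at all is one of the open points from the comment.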

Update a date column with the result of the next closest date column

I have a MySQL table that looks like this:
+-------------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------+------+-----+---------+-------+
| person_id | int(11) | NO | MUL | NULL | |
| location_id | int(11) | NO | MUL | NULL | |
| date_signed | date | NO | | NULL | |
| date_ended | date | YES | | NULL | |
+-------------+---------+------+-----+---------+-------+
Where all of the records are like this:
+-----------+-------------+-------------+------------+
| person_id | location_id | date_signed | date_ended |
+-----------+-------------+-------------+------------+
| 1 | 49 | 2007-09-29 | NULL |
| 1 | 41 | 2010-10-09 | NULL |
| 2 | 45 | 2007-09-29 | NULL |
| 2 | 58 | 2007-12-16 | NULL |
| 3 | 49 | 2007-09-29 | NULL |
| 4 | 45 | 2007-09-29 | NULL |
| 4 | 35 | 2013-10-04 | NULL |
| 5 | 45 | 2007-09-29 | NULL |
| 5 | 37 | 2009-01-09 | NULL |
| 5 | 32 | 2009-10-08 | NULL |
+-----------+-------------+-------------+------------+
I'm trying to update each person's date_ended to be one day less than the date_signed in the next chronological row for that person:
+-----------+-------------+-------------+------------+
| person_id | location_id | date_signed | date_ended |
+-----------+-------------+-------------+------------+
| 1 | 49 | 2007-09-29 | 2010-10-08 |
| 1 | 41 | 2010-10-09 | NULL |
| 2 | 45 | 2007-09-29 | 2007-12-15 |
| 2 | 58 | 2007-12-16 | NULL |
.
.
.
But I can't figure out how to select the next chronological record. I tried a few suggestions from similar questions:
UPDATE a column based on the value of another column in the same table
Mysql - update table column from another column based on order
ROW_NUMBER() in MySQL
But I couldn't get them to work. Is there a way to do this in MySQL?
SQL Fiddle: http://sqlfiddle.com/#!9/8b5219/1
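
One possible approach to the update described above is a correlated subquery that looks up the earliest later date_signed for the same person and subtracts one day. The sketch below is in Python and assumes an in-memory SQLite database as a stand-in for MySQL, a hypothetical table name signings (the question does not give one), and dates stored as ISO text. On MySQL the same idea would likely need to be written as a join against a derived table, since MySQL rejects a subquery that reads from the table being updated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE signings (          -- hypothetical table name
        person_id   INTEGER NOT NULL,
        location_id INTEGER NOT NULL,
        date_signed TEXT NOT NULL,   -- ISO dates stored as text in SQLite
        date_ended  TEXT
    )
""")
conn.executemany(
    "INSERT INTO signings (person_id, location_id, date_signed) VALUES (?, ?, ?)",
    [
        (1, 49, "2007-09-29"), (1, 41, "2010-10-09"),
        (2, 45, "2007-09-29"), (2, 58, "2007-12-16"),
        (3, 49, "2007-09-29"),
        (5, 45, "2007-09-29"), (5, 37, "2009-01-09"), (5, 32, "2009-10-08"),
    ],
)

# For each row, find the same person's next date_signed and step back one day.
# When there is no later row, MIN() yields NULL and date_ended stays NULL.
conn.execute("""
    UPDATE signings
    SET date_ended = (
        SELECT date(MIN(nxt.date_signed), '-1 day')
        FROM signings AS nxt
        WHERE nxt.person_id = signings.person_id
          AND nxt.date_signed > signings.date_signed
    )
""")

rows = conn.execute("""
    SELECT person_id, location_id, date_signed, date_ended
    FROM signings
    ORDER BY person_id, date_signed
""").fetchall()
for row in rows:
    print(row)
```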

Fetch cumulative sum from MySQL table

I have a table containing donations, and I am now creating a page to view statistics. I would like to fetch monthly data from the database with gross and cumulative gross.
mysql> describe donations;
+------------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| transaction_id | varchar(64) | NO | UNI | | |
| donor_email | varchar(255) | NO | | | |
| net | double | NO | | 0 | |
| gross | double | NO | | NULL | |
| original_request | text | NO | | NULL | |
| time | datetime | NO | | NULL | |
| claimed | tinyint(4) | NO | | NULL | |
+------------------+------------------+------+-----+---------+----------------+
Here's what I've tried:
SET @cgross = 0;
SELECT YEAR(`time`), MONTH(`time`), SUM(`gross`), (@cgross := @cgross + SUM(`gross`)) AS `cumulative_gross` FROM `donations` GROUP BY YEAR(`time`), MONTH(`time`);
The result is:
+--------------+---------------+--------------+------------------+
| YEAR(`time`) | MONTH(`time`) | SUM(`gross`) | cumulative_gross |
+--------------+---------------+--------------+------------------+
| 2013 | 1 | 257 | 257 |
| 2013 | 2 | 140 | 140 |
| 2013 | 3 | 311 | 311 |
| 2013 | 4 | 279 | 279 |
+--------------+---------------+--------------+------------------+
Which is wrong. The desired result would be:
+--------------+---------------+--------------+------------------+
| YEAR(`time`) | MONTH(`time`) | SUM(`gross`) | cumulative_gross |
+--------------+---------------+--------------+------------------+
| 2013 | 1 | 257 | 257 |
| 2013 | 2 | 140 | 397 |
| 2013 | 3 | 311 | 708 |
| 2013 | 4 | 279 | 987 |
+--------------+---------------+--------------+------------------+
I tried this without SUM, and it did work as expected.
SET @cgross = 0;
SELECT YEAR(`time`), MONTH(`time`), SUM(`gross`), (@cgross := @cgross + 10) AS `cumulative_gross` FROM `donations` GROUP BY YEAR(`time`), MONTH(`time`);
+--------------+---------------+--------------+------------------+
| YEAR(`time`) | MONTH(`time`) | SUM(`gross`) | cumulative_gross |
+--------------+---------------+--------------+------------------+
| 2013 | 1 | 257 | 10 |
| 2013 | 2 | 140 | 20 |
| 2013 | 3 | 311 | 30 |
| 2013 | 4 | 279 | 40 |
+--------------+---------------+--------------+------------------+
Why doesn't it work with SUM? Any ideas how I could fix it?
Thanks,
Lassi
A subquery without variables will do it just as easily, and quite a bit more portably:
SELECT YEAR(`time`),
MONTH(`time`),
SUM(gross),
(SELECT SUM(gross)
FROM donations
WHERE `time`<=MAX(a.`time`)) cumulative_gross
FROM donations a GROUP BY YEAR(`time`), MONTH(`time`);
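
As a different route to the same running total, here is a sketch in Python that swaps the correlated subquery for a window function over the monthly totals. It assumes an in-memory SQLite database as a stand-in for MySQL (so strftime replaces YEAR() and MONTH()), an engine version with window-function support, and made-up donation rows chosen to add up to the monthly figures in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE donations (
        id     INTEGER PRIMARY KEY,
        gross  REAL NOT NULL,
        "time" TEXT NOT NULL
    )
""")
# Illustrative rows only; they sum to 257, 140, 311 and 279 per month.
conn.executemany(
    'INSERT INTO donations (gross, "time") VALUES (?, ?)',
    [
        (200.0, "2013-01-05 10:00:00"), (57.0, "2013-01-20 12:30:00"),
        (140.0, "2013-02-11 09:15:00"),
        (300.0, "2013-03-02 18:00:00"), (11.0, "2013-03-28 08:45:00"),
        (279.0, "2013-04-14 14:00:00"),
    ],
)

# Step 1 (CTE): one row per month with its gross total.
# Step 2: a running SUM over those rows, ordered by year and month.
rows = conn.execute("""
    WITH monthly AS (
        SELECT strftime('%Y', "time") AS yr,
               strftime('%m', "time") AS mon,
               SUM(gross)             AS gross
        FROM donations
        GROUP BY yr, mon
    )
    SELECT yr, mon, gross,
           SUM(gross) OVER (ORDER BY yr, mon) AS cumulative_gross
    FROM monthly
    ORDER BY yr, mon
""").fetchall()
for row in rows:
    print(row)
```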

MySQL query assistance

Here is a description of the table I am using:
describe mjla_db.StudentRecordTable2;
+-----------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------+-------+
| classId | varchar(20) | NO | MUL | NULL | |
| studentId | varchar(20) | NO | MUL | NULL | |
| quizGrade | tinyint(4) | YES | | NULL | |
| quizId | int(11) | NO | MUL | NULL | |
+-----------+-------------+------+-----+---------+-------+
4 rows in set (0.00 sec)
Here is example data in the database:
+------------+-----------+------------+---------+------------+
| Student ID | Last Name | First Name | Quiz ID | Quiz Grade |
+------------+-----------+------------+---------+------------+
| A1 | Cat | Tom | 19 | 75 |
| A2 | pancake | Harry | 19 | 65 |
| A5 | Worthy | Dick | 19 | NULL |
| A1 | Cat | Tom | 20 | 55 |
| A2 | pancake | Harry | 21 | NULL |
| A2 | pancake | Harry | 20 | 47 |
| A5 | Worthy | Dick | 20 | 95 |
| A1 | Cat | Tom | 21 | 55 |
| A5 | Worthy | Dick | 21 | 95 |
+------------+-----------+------------+---------+------------+
3 rows in set (0.00 sec)
The result I am trying to get is one that will look similar to the following:
+------------+-----------+------------+---------+------------+------------+
| Student ID | Last Name | First Name | Quiz 19 | Quiz 20 | Quiz 21 |
+------------+-----------+------------+---------+------------+------------+
| A1 | Cat | Tom | 75 | 55 | 55 |
| A2 | pancake | Harry | 65 | 47 | NULL |
| A5 | Worthy | Dick | NULL| 95 | 95 |
+------------+-----------+------------+---------+------------+------------+
The Student ID column is unique, and the quiz columns continue depending on how many quizzes are in the original table. Each quiz column contains the grade of the respective student.
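Try this. Each CASE picks out one quiz's grade, and MAX with GROUP BY collapses the rows to one per student (this assumes the names live in a separate Students table):
select s.StudentId, s.FirstName, s.LastName,
max(case when sr.quizId = 19 then sr.quizGrade end) as 'Quiz 19',
max(case when sr.quizId = 20 then sr.quizGrade end) as 'Quiz 20',
max(case when sr.quizId = 21 then sr.quizGrade end) as 'Quiz 21'
from StudentRecordTable2 sr
inner join Students s on sr.studentId = s.StudentId
group by s.StudentId, s.FirstName, s.LastName;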
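
Because the number of quiz columns depends on the data, the column list has to be built at run time. Here is a minimal sketch of that in Python, assuming an in-memory SQLite database as a stand-in for MySQL and leaving out the names join to keep it short. It reads the distinct quiz IDs first, then generates one MAX(CASE ...) column per quiz.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE StudentRecordTable2 (
        classId   TEXT NOT NULL,
        studentId TEXT NOT NULL,
        quizGrade INTEGER,
        quizId    INTEGER NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO StudentRecordTable2 (classId, studentId, quizGrade, quizId) "
    "VALUES ('C1', ?, ?, ?)",   # 'C1' is a placeholder class ID
    [
        ("A1", 75, 19), ("A2", 65, 19), ("A5", None, 19),
        ("A1", 55, 20), ("A2", None, 21), ("A2", 47, 20),
        ("A5", 95, 20), ("A1", 55, 21), ("A5", 95, 21),
    ],
)

# Discover which quizzes exist, so the pivot grows with the data.
quiz_ids = [row[0] for row in conn.execute(
    "SELECT DISTINCT quizId FROM StudentRecordTable2 ORDER BY quizId")]

# One pivot column per quiz; the ID is bound as a parameter and only the
# integer-converted value is used in the column label.
columns = ", ".join(
    f'MAX(CASE WHEN quizId = ? THEN quizGrade END) AS "Quiz {int(q)}"'
    for q in quiz_ids
)
sql = (f"SELECT studentId, {columns} "
       "FROM StudentRecordTable2 GROUP BY studentId ORDER BY studentId")

rows = conn.execute(sql, quiz_ids).fetchall()
for row in rows:
    print(row)
```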