MySQL: 3 tables, multiple counts with GROUP BY, but the query gets stuck

I'm having trouble with a query. I'm trying to retrieve data from a big database where 3 tables are involved.
These tables contain data about ads: in a backend website, the administrator can manage which local ads he wants displayed, their position, etc. The data is organized in 3 tables. One of them contains all the data relevant to the ad itself (name, date of availability, date of expiration, etc.). The other 2 tables contain extra info, but only about views or clicks.
I have only 15 ads, each with multiple clicks and multiple views.
The click table and the view table each register a new row for every event. So when a view is registered, a new row is added where addid_views is the id of that view and addid is the addid from adds_table. For instance, ad 1 will have 2 views and 2 clicks, while ad 2 will have 1 view and 1 click.
My goal is to get, for each ad, how many clicks and views it had in total.
I have 3 tables like these:
adds_table             adds_clicks_table        adds_views_table
+-------+-----------+  +-------------+-------+  +-------------+-------+
| addid | name      |  | addid_click | addid |  | addid_views | addid |
+-------+-----------+  +-------------+-------+  +-------------+-------+
|     1 | add_name1 |  |           1 |     1 |  |           1 |     1 |
|     2 | add_name2 |  |           2 |     2 |  |           2 |     1 |
|     3 | add_name3 |  |           3 |     1 |  |           3 |     2 |
+-------+-----------+  +-------------+-------+  +-------------+-------+
CREATE TABLE `bwm_adds` (
`addid` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(100) NOT NULL,
...
PRIMARY KEY (`addid`)
) ENGINE=InnoDB AUTO_INCREMENT=16 DEFAULT CHARSET=utf8
CREATE TABLE `bwm_adds_clicks` (
`add_clickid` int(19) NOT NULL AUTO_INCREMENT,
`addid` int(11) NOT NULL,
...
PRIMARY KEY (`add_clickid`)
) ENGINE=InnoDB AUTO_INCREMENT=3374 DEFAULT CHARSET=utf8
CREATE TABLE `bwm_adds_views` (
`add_viewsid` int(19) NOT NULL AUTO_INCREMENT,
`addid` int(11) NOT NULL,
...
PRIMARY KEY (`add_viewsid`)
) ENGINE=InnoDB AUTO_INCREMENT=2078738 DEFAULT CHARSET=utf8
The result would be a single table showing, for each ad (addid), how many clicks and how many views it had.
I need a query that returns something like this:
+-------+---------+-----------+
| addid | clicks | views |
+-------+---------+-----------+
| 1 | 123123 | 235457568 |
+-------+---------+-----------+
| 2 | 5124123 | 435345234 |
+-------+---------+-----------+
| 3 | 123541 | 453563623 |
+-------+---------+-----------+
I tried to execute a query, but it gets stuck and keeps loading indefinitely. I'm pretty sure my query is at fault, because if I remove one of the counts, it displays data very fast.
SELECT a.addid, COUNT(ac.addid_clicks) as 'clicks', COUNT(av.addid_views) as 'views'
FROM `adds_table` a
LEFT JOIN `adds_clicks_table` ac ON a.addid = ac.addid_click
LEFT JOIN `adds_views_table` av ON ac.addid_click = av.addid_views
GROUP BY a.addid
MySQL just keeps loading. Any idea what I'm missing?
By the way, I found this post, which deals with almost the same problem. As you can see, my query is very similar to the first answer there, but I get the Loading message all the time. No errors, just Loading.
Edit: I misplaced the numbers and got confused. The tables are now fixed and I added some explanation about them.
Edit2: I updated the post with the SHOW CREATE TABLE definitions.
Edit3: Is there any way to optimise this query? It seems to retrieve the result I want, but the MySQL server cancels the query because it takes more than 30 seconds to execute.
SELECT a.addid,
(SELECT COUNT(addid) FROM add_clicks where addid = a.addid) as clicks,
(SELECT COUNT(addid) FROM add_views where addid = a.addid) as views
FROM adds a ORDER BY a.addid;

If those are really your tables (one column, plus an auto_inc), then there is no meaningful information justifying having 3 tables instead of 1:
CREATE TABLE `bwm_adds` (
`addid` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(100) NOT NULL,
clicks INT UNSIGNED NOT NULL,
views INT UNSIGNED NOT NULL,
PRIMARY KEY (`addid`)
) ENGINE=InnoDB AUTO_INCREMENT=16 DEFAULT CHARSET=utf8
and then UPDATE ... SET views = views + 1 (etc) rather than inserting into the other tables.
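A minimal sketch of that counter-column approach, assuming SQLite (Python's built-in sqlite3) as a stand-in for MySQL so it runs without a server. The DEFAULT 0 clauses and the helper names record_view and record_click are assumptions added for illustration; they are not part of the schema above.

```python
import sqlite3

# SQLite stands in for MySQL here (assumption) so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE bwm_adds (
        addid  INTEGER PRIMARY KEY,
        name   TEXT    NOT NULL,
        clicks INTEGER NOT NULL DEFAULT 0,  -- DEFAULT 0 is an assumption
        views  INTEGER NOT NULL DEFAULT 0
    )
""")
conn.executemany("INSERT INTO bwm_adds (addid, name) VALUES (?, ?)",
                 [(1, "add_name1"), (2, "add_name2")])

def record_view(addid):
    # One in-place increment instead of one inserted row per view.
    conn.execute("UPDATE bwm_adds SET views = views + 1 WHERE addid = ?", (addid,))

def record_click(addid):
    conn.execute("UPDATE bwm_adds SET clicks = clicks + 1 WHERE addid = ?", (addid,))

for addid in (1, 1, 2):
    record_view(addid)
record_click(1)

# Reading the totals is now a plain scan of a 15-row table, no counting needed.
totals = conn.execute(
    "SELECT addid, clicks, views FROM bwm_adds ORDER BY addid").fetchall()
print(totals)
```

The trade-off is that per-event details (when each click happened, for example) are no longer stored, so this only fits if the totals are all that is needed.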

If you have an old version,
SELECT a.addid,
( SELECT COUNT(addid_clicks)
FROM `adds_clicks_table`
WHERE addid = a.addid
) AS 'clicks',
( SELECT COUNT(addid_views)
FROM `adds_views_table`
WHERE addid = a.addid
) AS 'views'
FROM adds_table AS a
For 5.6 and later, this might be faster:
SELECT a.addid, c.clicks, v.views
FROM `adds_table` a
LEFT JOIN ( SELECT addid, COUNT(*) AS clicks FROM `adds_clicks_table` GROUP BY addid ) AS c USING(addid)
LEFT JOIN ( SELECT addid, COUNT(*) AS views FROM `adds_views_table` GROUP BY addid ) AS v USING(addid)
If you get NULLs but prefer 0s, then wrap the value in IFNULL(..., 0).
If you need to discuss further, please provide SHOW CREATE TABLE and EXPLAIN SELECT ...
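To see why counting in derived tables differs from joining both child tables directly, here is a small sketch on the sample rows from the question. It assumes SQLite (Python's sqlite3) as a stand-in for MySQL. Joining clicks and views to the same ad pairs every click with every view of that ad, so the intermediate result grows as clicks × views per ad and both counts come out inflated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE adds_table (addid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE adds_clicks_table (addid_click INTEGER PRIMARY KEY, addid INTEGER);
    CREATE TABLE adds_views_table (addid_views INTEGER PRIMARY KEY, addid INTEGER);
    INSERT INTO adds_table VALUES (1,'add_name1'),(2,'add_name2'),(3,'add_name3');
    INSERT INTO adds_clicks_table VALUES (1,1),(2,2),(3,1);
    INSERT INTO adds_views_table VALUES (1,1),(2,1),(3,2);
""")

# Direct double join: each click row is paired with each view row of the same ad.
naive = conn.execute("""
    SELECT a.addid, COUNT(c.addid_click), COUNT(v.addid_views)
    FROM adds_table a
    LEFT JOIN adds_clicks_table c ON c.addid = a.addid
    LEFT JOIN adds_views_table v ON v.addid = a.addid
    GROUP BY a.addid
    ORDER BY a.addid
""").fetchall()

# Pre-aggregated: each child table is reduced to one row per ad before joining.
aggregated = conn.execute("""
    SELECT a.addid, IFNULL(c.clicks, 0), IFNULL(v.views, 0)
    FROM adds_table a
    LEFT JOIN (SELECT addid, COUNT(*) AS clicks
               FROM adds_clicks_table GROUP BY addid) c ON c.addid = a.addid
    LEFT JOIN (SELECT addid, COUNT(*) AS views
               FROM adds_views_table GROUP BY addid) v ON v.addid = a.addid
    ORDER BY a.addid
""").fetchall()

print(naive)       # ad 1 shows 4 and 4: 2 clicks x 2 views
print(aggregated)  # ad 1 shows 2 and 2
```

With a few thousand clicks and millions of views per table, that multiplication alone could plausibly account for a query that never finishes.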

I ended up with a solution to my problem. The table I was querying was too big because of the badly engineered database: in adds_views_table, a new row is added for each view. It ended up with almost 3 million rows, and the table accounts for almost 35% of the entire database (326MB).
When phpMyAdmin tried to execute the query, it loaded forever and never showed a result because of a timeout limit applied to MySQL. Raising this limit would help, but it isn't viable for retrieving that data and displaying it on a website (the website or its data wouldn't load until the query finished executing).
The problem was fixed by creating an index on addid in adds_table. Also, for some reason, the query is faster when subqueries are used. The query ended up like this:
SELECT a.addid,
(SELECT COUNT(addid) FROM adds_clicks_table WHERE addid = a.addid) AS 'clicks',
(SELECT COUNT(addid) FROM adds_views_table WHERE addid = a.addid) AS 'views'
FROM adds_table a
ORDER BY a.addid;
Thanks to @Rick James, who posted a similar query, which I ended up modifying to get the data I needed.
Forgive my horrible English.
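The correlated subqueries look up rows by addid in the click and view tables, and the table definitions shown in the question (with parts elided) list only the auto-increment primary keys for those tables. A sketch of indexing that lookup column, assuming SQLite (Python's sqlite3) as a stand-in for MySQL; the index names idx_clicks_addid and idx_views_addid are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE adds_table (addid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE adds_clicks_table (addid_click INTEGER PRIMARY KEY, addid INTEGER NOT NULL);
    CREATE TABLE adds_views_table (addid_views INTEGER PRIMARY KEY, addid INTEGER NOT NULL);
    -- Hypothetical index names; the indexed column is the one the subqueries filter on.
    CREATE INDEX idx_clicks_addid ON adds_clicks_table (addid);
    CREATE INDEX idx_views_addid  ON adds_views_table (addid);
    INSERT INTO adds_table VALUES (1,'add_name1'),(2,'add_name2'),(3,'add_name3');
    INSERT INTO adds_clicks_table (addid) VALUES (1),(2),(1);
    INSERT INTO adds_views_table (addid) VALUES (1),(1),(2);
""")

query = """
    SELECT a.addid,
           (SELECT COUNT(addid) FROM adds_clicks_table WHERE addid = a.addid) AS clicks,
           (SELECT COUNT(addid) FROM adds_views_table  WHERE addid = a.addid) AS views
    FROM adds_table a
    ORDER BY a.addid
"""
counts = conn.execute(query).fetchall()
print(counts)

# The plan text shows whether each subquery is answered from an index or a full scan.
for row in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(row[-1])
```

In MySQL the equivalent check would be EXPLAIN SELECT ..., as requested in the earlier answer.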

Related

Best practice in MySQL for selecting two interchangeable columns and counting them, returning the most recent result

I have a MySQL table that looks like:
CREATE TABLE `messages` (
`id` int NOT NULL AUTO_INCREMENT,
`from` varchar(12) NOT NULL,
`to` varchar(12) NOT NULL,
`message` text,
`timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=66 DEFAULT CHARSET=latin1;
So each time a message is sent or received, it is stored as:
# id from, to, message, timestamp
'65', '+1231303****', '+1833935****', 'Showtimes', '2022-01-26 09:26:10'
'64', '+1833935****', '+1231303****', 'Showtimes are: 12:30 someresponse', '2022-01-26 09:26:10'
I want to create an index of these conversation threads, and I need a query that selects a conversation based on it being addressed either from or to a specific number, returns the number of rows that match either, and at the same time returns the last message that was sent. So basically I want it to return:
recipient (the other phone number, not the one I'm using to look up), count(messages), lastmessage
Individually, I can query all of this separately, since most of my experience here revolves around using PHP to untangle the data I'm after. What I'm curious about is a single query that lets MySQL handle this, rather than submitting multiple queries to the database server. I figure this is a good time to learn, since several projects I've coded have run out of memory before, with so many queries inside so many loops.
Apologies in advance if this has been answered somewhere else already. I searched extensively for an answer, but the few results I found used a completely different table structure than mine, and the MySQL query I was able to fumble together didn't work. I stand by my work as a PHP programmer, but my MySQL needs some work. Hence I'm here!
If a conversation thread can be defined by a unique combination of from and to, then you can build a compound key in which the lower of the two values always comes first, so that every message in the thread shares the same key. However, selecting on from OR to means many conversation threads may be selected. For example:
DROP TABLE IF EXISTS T;
CREATE TABLE T(ID INT AUTO_INCREMENT PRIMARY KEY, FROMNO INT, TONO INT);
INSERT INTO T(FROMNO,TONO) VALUES
(1,2),(2,1),
(1,3),(4,1),(1,2);
WITH CTE AS
(SELECT * ,
CASE WHEN FROMNO < TONO THEN CONCAT(FROMNO,TONO)
ELSE CONCAT(TONO,FROMNO)
END AS CVAL
FROM T
WHERE FROMNO = 1 OR TONO = 1
),
CTE1 AS
(SELECT *,
DENSE_RANK() OVER (ORDER BY CVAL) DR
FROM CTE
),
CTE2 AS
(SELECT CVAL,COUNT(*) conversations,MAX(ID) MAXID
FROM CTE1
GROUP BY CVAL
)
SELECT CTE2.CVAL,CTE2.conversations,CTE2.MAXID,T.ID
FROM CTE2
JOIN T ON T.ID = CTE2.MAXID;
Yields
+------+---------------+-------+----+
| CVAL | conversations | MAXID | ID |
+------+---------------+-------+----+
| 13 | 1 | 3 | 3 |
| 14 | 1 | 4 | 4 |
| 12 | 3 | 5 | 5 |
+------+---------------+-------+----+
3 rows in set (0.002 sec)
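The same idea can be pointed at the messages table from the question to return recipient, count(messages) and lastmessage in one query. This is only a sketch: it assumes SQLite (Python's sqlite3) as a stand-in for MySQL, uses short placeholder numbers 'A' to 'D' instead of real phone numbers, and treats the row with the highest id as the last message.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE messages (
        id      INTEGER PRIMARY KEY AUTOINCREMENT,
        "from"  TEXT NOT NULL,   -- in MySQL these would be quoted with backticks
        "to"    TEXT NOT NULL,
        message TEXT
    );
    INSERT INTO messages ("from", "to", message) VALUES
        ('A', 'B', 'Showtimes'),
        ('B', 'A', 'Showtimes are: 12:30'),
        ('A', 'C', 'Hello'),
        ('D', 'A', 'Hi there');
""")

me = "A"  # the number being looked up (placeholder value)
threads = conn.execute("""
    WITH mine AS (
        -- Keep only my messages and name the other party in each one.
        SELECT id,
               CASE WHEN "from" = :me THEN "to" ELSE "from" END AS recipient
        FROM messages
        WHERE "from" = :me OR "to" = :me
    ),
    per_thread AS (
        SELECT recipient, COUNT(*) AS messages, MAX(id) AS last_id
        FROM mine
        GROUP BY recipient
    )
    SELECT p.recipient, p.messages, m.message AS lastmessage
    FROM per_thread p
    JOIN messages m ON m.id = p.last_id
    ORDER BY p.recipient
""", {"me": me}).fetchall()

for row in threads:
    print(row)
```

Because the lookup number is fixed, the other party alone identifies the thread, so no concatenated key is needed here.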

Reduce number of joins in mysql

I have 12 fixed tables (group, local, element, sub_element, service, ...), each table with different numbers of rows.
The 'id_' column in each table is the primary key (int). The other columns are of datatype varchar(20). The largest of these tables has 300 rows.
Each table was created in this way:
CREATE TABLE group
(
id_G int NOT NULL,
name_group varchar(20) NOT NULL,
PRIMARY KEY (id_G)
);
|........GROUP......| |.......LOCAL.......| |.......SERVICE.......|
| id_G | name_group | | id_L | name_local | | id_S | name_service |
+------+------------+ +------+------------+ +------+--------------+
| 1 | group1 | | 1 | local1 | | 1 | service1 |
| 2 | group2 | | 2 | local2 | | 2 | service2 |
And I have one table that combines all these tables depending on what the user selects.
The 'id_' values from the fixed tables selected by the user are recorded in this table.
This table was created in this way:
CREATE TABLE event
(
id_E int NOT NULL,
event_name varchar(20) NOT NULL,
id_G int NOT NULL,
id_L int NOT NULL,
...
PRIMARY KEY (id_E)
);
The table (event) looks like this:
|....................EVENT.....................|
| id_E | event_name | id_G | id_L | ... |id_S |
+------+-------------+------+------+-----+-----+
| 1 | mater1 | 1 | 1 | ... | 3 |
| 2 | master2 | 2 | 2 | ... | 6 |
This table grows each day, and now it has thousands of rows.
Column id_E is the primary key (int), event_name is varchar(20).
In addition to the id_E and event_name columns, this table has 12 other columns that come from the fixed tables.
Every time I need to retrieve information from the event table in a readable form, I need to do about 12 joins.
My query looks like this when I need to retrieve all columns from the event table:
SELECT event_name, name_group, name_local ..., name_service
FROM event
INNER JOIN group on event.id_G = group.id_G
INNER JOIN local on event.id_L = local.id_L
...
INNER JOIN service on event.id_S = service.id_S
WHERE event.id_S = 7 -- for example
This slows down my system. Is there a way to reduce the number of joins? I've heard about using natural keys, but I don't think that's a good idea in my case, thinking about future maintenance.
My queries are taking about 7 seconds and I need to reduce this time.
I changed the WHERE clause and it had no effect. So I am sure the problem is that the query has so many joins.
Could someone help? Thanks a lot.
MySQL has a keyword, STRAIGHT_JOIN, that might be what you are looking for. First, I have to assume each of your lookup tables (id/description) already has an index on the ID column, since that is the primary key.
Your event table is the one you are querying as the primary basis of the details, joining to the lookups via their respective IDs. As long as the WHERE clause that applies to the event table is optimized, such as an index on the ID you are filtering by, it SHOULD be virtually instantaneous.
If it is not, it might be that MySQL is trying to think for you and has picked one of the secondary lookup tables as the primary basis of the query for whatever reason, such as a much lower record count. In that case, add the keyword and try it:
SELECT STRAIGHT_JOIN ... rest of your query
This tells MySQL to run the query in the order you gave it, thus the event table first with its WHERE clause on the ID. It should find the matching rows, then grab all the corresponding lookup descriptions from the other tables.
Create indexes, specifically compound indexes. For instance, start by creating a compound index for event and group:
on the event table, create one for (event id, group id).
then, on the group table, create another one for the next relation (group id, local id).
on local, do the same with service, and so on...

How to use MySQL REGEXP in the WHERE of a JOIN statement

I have two tables A and B
Table A has columns: ID and POST
Table B has columns: ID, POST_ID and UPPERS
I want to select all records where a.POST matches the regex
'\\[cd(i|b)?(=[a-z0-9]+)?\\].+\\[/cd(i|b)?\\]'
and JOIN table B on a.ID = b.POST_ID where b.UPPERS matches the regex
'(\\|[0-9]+\\![0-9]{4}[-]+[0-9]{2}[-]+[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2},){1,}'
I came up with the following statement, but it is not returning any rows even when the columns contain content matching the regex:
SELECT a.*,b.*
FROM a JOIN
b
ON b.POST_ID=a.ID
WHERE a.POST RLIKE '\\[cd(i|b)?(=[a-z0-9]+)?\\].+\\[/cd(i|b)?\\]' AND
b.UPPERS REGEXP '(\\|[0-9]+\\![0-9]{4}[-]+[0-9]{2}[-]+[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2},){1,}'
Summary:
I want to select records where a user has sent content that matches this regex
'\\[cd(i|b)?(=[a-z0-9]+)?\\].+\\[/cd(i|b)?\\]'
and then check if that very post has received at least two ups(or likes) using the regex
'(\\|[0-9]+\\![0-9]{4}[-]+[0-9]{2}[-]+[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2},){2,}'
which can be broken down as simply:
a prefix pipe: |
a user id: [0-9]+
an exclamation mark: !
a datetime: [0-9]{4}[-]+[0-9]{2}[-]+[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}
and a suffix: ,
NOTE: {2,} simply checks how many times the match occurs
Please can someone point me in the right direction as to what I am doing wrong?
Sample table data:
Table A
ID | POST
23 match found [cd=plain]6h+#gtyr[/cd]
24 match found [cd]65#%gte2!iu[/cd]
25 match found [cdi]*tre&y^g82u[/cdi]
26 no match found *tre&y^g82u
27 no match found rtyure99
28 match found [cdb]aha87ulchr[/cdb]
Table B
ID | POST_ID | UPPERS
4 24 |98!2018-02-10 22:43:03,
|35!2018-02-08 20:42:09,
|3!2018-02-05 02:05:07,
5 26 |2!2018-02-10 22:43:03,
|30!2018-02-08 20:42:09,
6 25 |21!2018-02-10 22:43:03,
7 27 |23!2018-02-10 22:43:03,
|11!2018-02-08 20:42:09,
NOTE: POST_ID in table B is a foreign key referencing ID of table A
If you don't mind, I'm actually going to answer the question that lies beneath your actual question. I'm sure we could work through why the regular expression is not working as you expect, but it begs the question: why use regular expressions for such a simple task?
It happens a lot that people first just use a database to stash stuff that is the same format that appears in the code. But if you take a little time to break down your data in a meaningful way, you can unlock a lot of power from humble MySQL.
Think about the question you want this query to answer:
Which posts that match certain criteria have been upped?
As you already realized, that suggests two tables - one to store information about the posts, and another to store information about who upped them. To make your queries fast and easy, think about which attributes of the information are going to show up in your where clause.
You want posts that are enclosed by certain markup. To make your search more efficient, put the markup tag in its own column:
CREATE TABLE `posts` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`tag` enum('cd','cdi','cdb') DEFAULT NULL,
`tag_value` varchar(11) DEFAULT NULL,
`content` text NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
For the data you list above, the table might look something like:
+-----+------+-----------+-------------+
| id | tag | tag_value | content |
+-----+------+-----------+-------------+
| 23 | cd | plain | 6h+#gtyr |
| 24 | cd | NULL | 65#%gte2!iu |
| 25 | cdi | NULL | *tre&y^g82u |
| 26 | NULL | NULL | *tre&y^g82u |
| 27 | NULL | NULL | rtyure99 |
| 28 | cdb | NULL | aha87ulchr |
+-----+------+-----------+-------------+
It takes a little more work to put your data IN (this is where your regex powers are better applied, as you create the INSERT), but now you can do all sorts of things with it quite easily. I used an ENUM for the tag column, because that is extra-fast to search on. If you have a large number of tags or don't know what they will all be, you can just use a VARCHAR instead.
So how do you track UPPERS? That part gets very easy. All you need is a table with a row for each time someone ups something:
CREATE TABLE `uppers` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(11) DEFAULT NULL,
`post_id` int(11) DEFAULT NULL,
`time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Currently when someone ups something, you have to go find the relevant record, append new data to it, then save it back. Now you can just slap a record into the table. The time will be set automatically; all you need to insert is the user_id and post_id. Some of your data might look like:
+----+---------+---------+---------------------+
| id | user_id | post_id | time |
+----+---------+---------+---------------------+
| 2 | 98 | 24 | 2018-02-10 15:23:03 |
| 3 | 35 | 24 | 2018-02-10 15:23:23 |
| 4 | 27 | 24 | 2018-02-10 15:23:43 |
| 5 | 2 | 26 | 2018-02-10 15:24:16 |
| 6 | 30 | 26 | 2018-02-10 15:24:28 |
+----+---------+---------+---------------------+
Now you can harness the power of the MySQL engine to capture all the information you need:
All posts with the desired tags:
SELECT * FROM posts where tag IN ('cd', 'cdi', 'cdb')
All posts with the desired tags and at least one up:
SELECT posts.*, uppers.user_id, uppers.time
FROM posts
INNER JOIN uppers ON posts.id = uppers.post_id
WHERE tag IN ('cd', 'cdi', 'cdb')
That will return a row for each post-upper combination. The INNER JOIN means it will not return any posts that don't have a match in the uppers table. This may be what you are looking for, but if you want to group the ups together by post ID, you can ask MySQL to group them for you:
SELECT posts.*, COUNT(uppers.user_id)
FROM posts
INNER JOIN uppers ON posts.id = uppers.post_id
WHERE tag IN ('cd', 'cdi', 'cdb')
GROUP BY posts.id
If you want to rule out duplicate ups by the same user, you can easily only count unique user id's for each post:
SELECT posts.*, COUNT(DISTINCT uppers.user_id)
FROM posts
INNER JOIN uppers ON posts.id = uppers.post_id
WHERE tag IN ('cd', 'cdi', 'cdb')
GROUP BY posts.id
There are many functions like COUNT() you can use to work with the data that gets grouped together. You could MAX(uppers.time) to get the time of the most recent up for that post, or you can use functions like GROUP_CONCAT() to put the values together in a long string.
The bottom line is that by breaking down your data into its fundamental pieces, you allow MySQL (or any other relational database) to work much more efficiently, and life gets much, much easier.
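The original question asked for tagged posts with at least two ups, which in this layout becomes a HAVING clause on the grouped count. A sketch, assuming SQLite (Python's sqlite3) as a stand-in for MySQL, with simplified column types and a few of the sample rows from above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY, tag TEXT, tag_value TEXT, content TEXT NOT NULL
    );
    CREATE TABLE uppers (
        id INTEGER PRIMARY KEY, user_id INTEGER, post_id INTEGER
    );
    INSERT INTO posts VALUES
        (23, 'cd',  'plain', '6h+#gtyr'),
        (24, 'cd',  NULL,    '65#%gte2!iu'),
        (25, 'cdi', NULL,    '*tre&y^g82u'),
        (26, NULL,  NULL,    '*tre&y^g82u');
    INSERT INTO uppers (user_id, post_id) VALUES
        (98, 24), (35, 24), (27, 24),   -- three ups on a tagged post
        (21, 25),                       -- one up on a tagged post
        (2, 26), (30, 26);              -- two ups on an untagged post
""")

popular = conn.execute("""
    SELECT posts.id, COUNT(DISTINCT uppers.user_id) AS ups
    FROM posts
    INNER JOIN uppers ON uppers.post_id = posts.id
    WHERE posts.tag IN ('cd', 'cdi', 'cdb')
    GROUP BY posts.id
    HAVING COUNT(DISTINCT uppers.user_id) >= 2   -- "at least two ups"
    ORDER BY posts.id
""").fetchall()
print(popular)
```

Post 25 drops out because it has only one up, and post 26 drops out because it carries none of the tags.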

MYSQL, PHP, order by not working, primary key

I am generating a MySQL query from PHP.
Part of the query re-orders a table based on some variables (which do not include the primary key).
The code doesn't produce errors; however, the table is not sorted.
I echoed the SQL code and it looks correct. I tried running it directly in phpMyAdmin, and it also runs without error, but the table is still not sorted as requested.
alter table anavar order by dset_name, var_id;
I am pretty sure this has to do with the fact that I have a primary key variable (UID) which is not present in the sort.
Both before and after running the query, the table remains ordered by UID. Deleting UID and re-running the query results in a correctly sorted table, but this seems like an overkill solution.
Any suggestions?
create table t2
( id int auto_increment primary key,
someInt int not null,
thing varchar(100) not null,
theWhen datetime not null,
key(theWhen) -- creates an index on theWhen
);
-- my table now has 2 indexes on it
-- see it by running `show indexes from t2`
-- truncate table t2;
insert t2(someInt,thing,theWhen) values
(17,'chopstick','2016-05-08 13:00:00'),
(14,'alligator','2016-05-01'),
(11,'snail','2016-07-08 19:00:00');
select * from t2; -- returns in physical order (the primary key `id`)
select * from t2 order by thing; -- returns via thing, which has no index anyway
select * from t2 order by theWhen,thing; -- partial index use
Note that indexes aren't even used until you have a significant number of rows in a table anyway.
Edit (new data comes in)
insert t2 (someInt,thing,theWhen) values (777,'apple',now());
select t2.id,t2.thing,t2.theWhen,@rnk:=@rnk+1 as rank
from t2
cross join (select @rnk:=0) xParams
order by thing;
+----+-----------+---------------------+------+
| id | thing | theWhen | rank |
+----+-----------+---------------------+------+
| 2 | alligator | 2016-05-01 00:00:00 | 1 |
| 4 | apple | 2016-09-04 15:04:50 | 2 |
| 1 | chopstick | 2016-05-08 13:00:00 | 3 |
| 3 | snail | 2016-07-08 19:00:00 | 4 |
+----+-----------+---------------------+------+
Focus on the fact that you can maintain your secondary indices and generate a rank on the fly whenever you want.
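InnoDB stores rows clustered on the primary key, which would explain why ALTER TABLE ... ORDER BY has no visible effect while UID is the primary key; the order has to be requested at query time instead. On servers with window functions (MySQL 8.0 and later), the same on-the-fly rank can be produced without user variables. A sketch assuming SQLite 3.25+ (via Python's sqlite3) as a stand-in for MySQL, reusing the rows from above with a fixed timestamp in place of now():

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE t2 (
        id      INTEGER PRIMARY KEY AUTOINCREMENT,
        someInt INTEGER NOT NULL,
        thing   TEXT    NOT NULL,
        theWhen TEXT    NOT NULL
    );
    INSERT INTO t2 (someInt, thing, theWhen) VALUES
        (17,  'chopstick', '2016-05-08 13:00:00'),
        (14,  'alligator', '2016-05-01 00:00:00'),
        (11,  'snail',     '2016-07-08 19:00:00'),
        (777, 'apple',     '2016-09-04 15:04:50');
""")

# The stored order stays by id; the sort and the rank are computed per query.
ranked = conn.execute("""
    SELECT id, thing, ROW_NUMBER() OVER (ORDER BY thing) AS rnk
    FROM t2
    ORDER BY thing
""").fetchall()

for row in ranked:
    print(row)
```

The alias is rnk rather than rank because RANK is a reserved word in recent MySQL versions.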

MySQL - how to optimize large set of conditions

My head is already spinning from this and I need your help.
MY DATABASE
imported CSV file: 22 columns and 11k rows
2 tables with the same data (both created from the CSV)
Added ID as PRIMARY KEY to both
All columns are VARCHAR(60). Some columns are empty strings ' '
DB:
PID | CODE 1 | CODE 2 | CODE 3 | CODE 4 | CODE 5 | CODE X (up to 9) | ID
-------------------------------------------------------------------------
1 | a | b | c | | | | 1
2 | a | | b | d | | | 2
3 | x | | | | | y | 3
DB has 22 columns but I'm only including CODE columns (up to 9)
in which I might be interested in terms of SQL statement.
It'll be a read-only table - MyISAM engine then?
WHAT I'D LIKE TO DO
select PID = 1 from the first table
and retrieve all PIDs from the second table
IF
the selected PID's column CODE 1 (which is a)
or
the selected PID's column CODE 2 (which is b) etc. (up to 9)
= any of that PID's CODE X columns
So I should get only PID 2.
edit: PID is not an ID, it's just an example code; it could be a string such as '002451'. I'm looking for other PIDs with the same CODES (e.g. PID 1 has code a, so it should find PID 2 because one of its CODE columns contains a).
MY ATTEMPT
SELECT a.* FROM `TABLE1` a WHERE
(
SELECT * FROM `TABLE2` b WHERE b.`PID` = 1
AND
(
( b.`CODE 1` NOT IN ('') AND IN (a.`CODE 1`,a.`CODE 2`, A.`CODE 3`...) ) OR
( b.`CODE 2` NOT IN ('') AND (a.`CODE 1`,a.`CODE 2`, A.`CODE 3`...) ) OR...
I'd end up with a large query - over 81 conditions. In terms of performance... well, it doesn't work.
I intuitively know that I should:
use INDEXES (on CODE 1 / CODE 2 / CODE 3 etc.?)
use JOIN ON (but I'm too stupid) - that's why I created 2 tables (let's assume I don't want TEMP. TABLES)
How do I write the SQL / design the DB efficiently?
The correct data structure is one row per pid and code. The simplest way is:
create table PCodes (
pid int not null,
code varchar(255),
constraint fk_PCodes_pid foreign key (pid) references p(pid)
);
Then you have the values in a single column and it is much simpler to check for matching codes.
In practice, you should have three tables:
create table Codes (
CodeId int not null auto_increment primary key,
Code varchar(255)
);
create table PCodes (
pid int not null,
codeid int not null,
constraint fk_PCodes_pid foreign key (pid) references p(pid),
constraint fk_PCodes_codeid foreign key (codeid) references codes(codeid)
);
If the ordering of the codes is important for each "p", then include a priority or ordering column in the PCodes table.
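With one row per pid and code, "find other PIDs sharing any code with PID 1" becomes a self-join instead of 81 conditions. A sketch of the simpler two-column layout, assuming SQLite (Python's sqlite3) as a stand-in for MySQL. Here pid is stored as text because the question says a PID can be a string like '002451', and the primary key and the index name idx_pcodes_code are assumptions added for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE PCodes (
        pid  TEXT NOT NULL,
        code TEXT NOT NULL,
        PRIMARY KEY (pid, code)            -- assumption: a code appears once per pid
    );
    CREATE INDEX idx_pcodes_code ON PCodes (code);  -- hypothetical index name
    -- The three sample rows from the question, one row per non-empty code.
    INSERT INTO PCodes VALUES
        ('1', 'a'), ('1', 'b'), ('1', 'c'),
        ('2', 'a'), ('2', 'b'), ('2', 'd'),
        ('3', 'x'), ('3', 'y');
""")

matches = conn.execute("""
    SELECT DISTINCT other.pid
    FROM PCodes AS mine
    JOIN PCodes AS other
      ON other.code = mine.code     -- same code...
     AND other.pid <> mine.pid      -- ...on a different PID
    WHERE mine.pid = ?
    ORDER BY other.pid
""", ("1",)).fetchall()
print(matches)
```

Empty code columns simply produce no row, so the NOT IN ('') checks from the original attempt disappear.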