I have a somewhat complicated query:
SELECT SQL_CALC_FOUND_ROWS DISTINCT l1.item_id, l1.uid, l2.id, l2.uid, u.prename, l1.item_id, l2.item_id,
(SELECT SUM(cnt) FROM
(
SELECT DISTINCT
p1.item_id,
COUNT(*) AS cnt
FROM pages_likes AS p1
JOIN pages_likes AS p2 ON p1.item_id = p2.item_id AND p1.status = p2.status
WHERE p1.uid = 391 AND p2.uid = 1091
GROUP BY p1.id
ORDER BY p1.date DESC
) AS t) AS total
FROM pages_likes l1
JOIN users u on u.id = l1.uid
JOIN pages_likes l2 on l1.item_id = l2.item_id
JOIN users_likes ul on l1.uid = ul.uid
WHERE ul.date >= DATE_SUB(NOW(),INTERVAL 1 WEEK)
AND l1.uid != 1091 AND l2.uid = 1091
AND (l1.status = 1 AND l2.status = 1)
AND u.gender = 2
GROUP BY l1.uid
ORDER BY
total DESC,
l1.uid DESC,
l1.date DESC
What I expect: it should display all users, sorted by the total page likes we have in common, restricted to users who were liked this week.
The thing is that I inserted fixed values (391 and 1091) as user IDs to test the query. Since it should be dynamic, I need to use the outer query's row value l1.uid in the subquery, so it should be WHERE p1.uid = l1.uid AND p2.uid = 1091, but MySQL can't resolve that reference.
status = 1 means the user liked the page, status = 0 means the user disliked it.
Table structure here:
CREATE TABLE pages_likes
(
id BIGINT PRIMARY KEY NOT NULL AUTO_INCREMENT,
uid INT NOT NULL,
date DATETIME NOT NULL,
item_id INT,
status TINYINT
);
CREATE INDEX item_index ON pages_likes (item_id);
CREATE INDEX uid_index ON pages_likes (uid);
CREATE TABLE users
(
id BIGINT PRIMARY KEY NOT NULL AUTO_INCREMENT,
fb_uid VARCHAR(255),
email VARCHAR(100) NOT NULL,
pass VARCHAR(50) NOT NULL,
gender TINYINT NOT NULL,
birthdate DATE,
signup DATETIME NOT NULL,
lang VARCHAR(10) NOT NULL,
username VARCHAR(255),
prename VARCHAR(255) NOT NULL,
surname VARCHAR(255) NOT NULL,
projects VARCHAR(255) NOT NULL,
views INT DEFAULT 0,
verified DATETIME
);
CREATE UNIQUE INDEX id_index ON users (id);
CREATE INDEX uid_index ON users (id);
CREATE TABLE users_likes
(
id INT PRIMARY KEY NOT NULL AUTO_INCREMENT,
uid INT NOT NULL,
date DATETIME NOT NULL,
item_id INT,
status TINYINT
);
CREATE INDEX item_index ON users_likes (item_id);
CREATE INDEX uid_index ON users_likes (uid);
Have you tried different alias names for your subquery and then using the alias from the outer query? It works for me in this simple example: http://rextester.com/MJOL87502
Sadly I cannot test in your sqlfiddle, since that site often doesn't respond or throws errors (like it does now).
You could also use window functions and replace your subselect with something as simple as COUNT(*) OVER (PARTITION BY p1.id, p1.item_id), but MySQL (before 8.0) does not support window functions.
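As for the original unknown-column problem: MySQL cannot resolve an outer column such as l1.uid inside a FROM-clause subquery (a derived table), which is likely why the dynamic version fails. A rough, untested sketch that computes the common-like totals for all candidate users in one derived table and joins them in by uid instead:
SELECT l1.uid, u.prename, c.total
FROM pages_likes l1
JOIN pages_likes l2 ON l1.item_id = l2.item_id
JOIN users u ON u.id = l1.uid
JOIN users_likes ul ON l1.uid = ul.uid
JOIN (
    -- common likes/dislikes with user 1091, computed once per uid
    SELECT p1.uid, COUNT(*) AS total
    FROM pages_likes p1
    JOIN pages_likes p2 ON p1.item_id = p2.item_id AND p1.status = p2.status
    WHERE p2.uid = 1091
    GROUP BY p1.uid
) c ON c.uid = l1.uid
WHERE ul.date >= DATE_SUB(NOW(), INTERVAL 1 WEEK)
AND l1.uid != 1091 AND l2.uid = 1091
AND l1.status = 1 AND l2.status = 1
AND u.gender = 2
GROUP BY l1.uid
ORDER BY c.total DESC, l1.uid DESC;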
So I have three tables:
CREATE TABLE Personnel(
IdPersonnel INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
Name VARCHAR(45) NOT NULL,
Surename VARCHAR(45) NOT NULL,
Department VARCHAR(45) NOT NULL,
Salary INT NOT NULL,
Birthday DATE NOT NULL
);
CREATE TABLE Doctor(
IdDoctor INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
fk_Personnel INT NOT NULL
);
CREATE TABLE Visit(
IdVisit INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
Date DATE NOT NULL,
ControlDate DATE,
fk_Patient INT NOT NULL,
fk_Doctor INT NOT NULL
);
From these three tables, I need to get the doctor who had the most visits.
SELECT p.Name, p.Surename, COUNT(*) visits
FROM Visit v
JOIN Doctor d
ON v.fk_Doctor = d.IdDoctor
JOIN Personnel p
ON d.fk_Personnel = p.IdPersonnel
GROUP BY d.IdDoctor
ORDER BY COUNT(*) DESC
LIMIT 1;
I used this query, and the result is correct, but I have to use the MAX() function. I am using MySQL Community Server 8.0.26.
select P.Name, P.Surename, P.IdPersonnel ,count(*) from Visit V
inner join Doctor D on V.fk_Doctor = D.IdDoctor
inner join Personnel P on D.fk_Personnel = P.IdPersonnel
group by P.Name, P.Surename, P.IdPersonnel
having count(*) = (
select max(visits) from
(select fk_Doctor, count(*) visits from Visit group by fk_Doctor) a
)
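Since you are on MySQL 8.0.26, a window function is another option if the MAX() requirement is ever relaxed; unlike LIMIT 1, RANK() also keeps ties. A sketch (untested):
SELECT Name, Surename, visits
FROM (
    SELECT p.Name, p.Surename, COUNT(*) AS visits,
           RANK() OVER (ORDER BY COUNT(*) DESC) AS rnk
    FROM Visit v
    JOIN Doctor d ON v.fk_Doctor = d.IdDoctor
    JOIN Personnel p ON d.fk_Personnel = p.IdPersonnel
    GROUP BY d.IdDoctor, p.Name, p.Surename
) ranked
WHERE rnk = 1;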
I am implementing a simple follow/followers system in MySQL. So far I have three tables that look like:
CREATE TABLE IF NOT EXISTS `User` (
`user_id` INT AUTO_INCREMENT PRIMARY KEY,
`username` varchar(40) NOT NULL ,
`pswd` varchar(255) NOT NULL,
`email` varchar(255) NOT NULL ,
`first_name` varchar(40) NOT NULL ,
`last_name` varchar(40) NOT NULL,
CONSTRAINT uc_username_email UNIQUE (username , email)
);
-- Using a middle table for users to follow others on a many-to-many basis
CREATE TABLE Following (
follower_id INT(6) NOT NULL,
following_id INT(6) NOT NULL,
KEY (`follower_id`),
KEY (`following_id`)
);
CREATE TABLE IF NOT EXISTS `Tweet` (
`tweet_id` INT AUTO_INCREMENT PRIMARY KEY,
`text` varchar(280) NOT NULL ,
-- I chose VARCHAR over TEXT as the latter is not stored in the database server's memory.
-- When querying TEXT data, MySQL has to read it from disk, which is much slower in comparison with VARCHAR.
`publication_date` DATETIME NOT NULL,
`username` varchar(40),
FOREIGN KEY (`username`) REFERENCES `user`(`username`)
ON DELETE CASCADE
);
Let's say I want to write a query that returns the 10 latest tweets by users followed by the user with username "Tom". What is the best way to write that query and return results with username, first name, last name, text and publication date?
Also, if one minute later I want to query the 10 latest tweets again, and someone Tom follows has tweeted during that minute, how do I query the database so it does not select tweets that were already shown in the first query?
To answer your first question:
SELECT u1.username, u1.first_name, u1.last_name, t.text, t.publication_date
FROM Tweet t
JOIN User u1 ON t.username = u1.username
JOIN Following f ON f.following_id = u1.user_id
JOIN User u2 ON u2.user_id = f.follower_id
WHERE u2.username = 'Tom'
ORDER BY t.publication_date DESC
LIMIT 10
For the second part, simply take the tweet_id from the first row of the first query (so the latest tweet_id value) and use it in the WHERE clause for the next query i.e.
WHERE u2.username = 'Tom'
AND t.tweet_id > <value from previous query>
To get the latest 10 tweets for Tom:
select flg.username, flg.first_name, flg.last_name, t.tweet_id, t.text, t.publication_date
from user flr
inner join following f on f.follower_id = flr.user_id
inner join user flg on flg.user_id = f.following_id
inner join tweet t on t.username = flg.username
where flr.username = 'Tom'
order by tweet_id desc
limit 10
To get the next 10 tweets, pass in the max tweet_id, and apply an additional condition in the where clause:
where flr.username = 'Tom'
and t.tweet_id > <previous_max_tweet_id>
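Putting that together, the complete follow-up query would be (12345 is just a placeholder for the max tweet_id captured from the first result set):
select flg.username, flg.first_name, flg.last_name, t.tweet_id, t.text, t.publication_date
from user flr
inner join following f on f.follower_id = flr.user_id
inner join user flg on flg.user_id = f.following_id
inner join tweet t on t.username = flg.username
where flr.username = 'Tom'
and t.tweet_id > 12345
order by tweet_id desc
limit 10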
I'm trying to do a link exchange script and ran into a bit of trouble.
Each link can be visited by an IP address up to x times (frequency in the links table). Each visit costs a number of credits (the spend limit is given in limit in the links table).
I've got the following tables:
CREATE TABLE IF NOT EXISTS `contor` (
`key` varchar(25) NOT NULL,
`uniqueHandler` varchar(30) DEFAULT NULL,
`uniqueLink` varchar(30) DEFAULT NULL,
`uniqueUser` varchar(30) DEFAULT NULL,
`owner` varchar(50) NOT NULL,
`ip` varchar(15) DEFAULT NULL,
`credits` float NOT NULL,
`tstamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`key`),
KEY `uniqueLink` (`uniqueLink`),
KEY `uniqueHandler` (`uniqueHandler`),
KEY `uniqueUser` (`uniqueUser`),
KEY `owner` (`owner`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `links` (
`unique` varchar(30) NOT NULL DEFAULT '',
`url` varchar(1000) DEFAULT NULL,
`frequency` varchar(5) DEFAULT NULL,
`limit` float NOT NULL DEFAULT '0',
PRIMARY KEY (`unique`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
I've got the following query:
$link = MYSQL_QUERY("
SELECT *
FROM `links`
WHERE (SELECT count(`key`) FROM contor WHERE ip = '$ip' AND contor.uniqueLink = links.unique) <= `frequency`
AND (SELECT sum(credits) as cost FROM contor WHERE contor.uniqueLink = links.unique) <= `limit`")
There are 20 rows in the table links.
The problem is that whenever there are about 200k rows in the table contor, the CPU load is huge.
After applying the solution provided by @Barmar (adding a composite index on (uniqueLink, ip) and dropping all other indexes except PRIMARY), EXPLAIN gives me this:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY l ALL NULL NULL NULL NULL 18
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 15
2 DERIVED pop_contor index NULL contor_IX1 141 NULL 206122
Try using a join rather than a correlated subquery.
SELECT l.*
FROM links AS l
LEFT JOIN (
    SELECT uniqueLink, SUM(ip = '$ip') AS ip_visits, SUM(credits) AS total_credits
    FROM contor
    GROUP BY uniqueLink
) AS c ON c.uniqueLink = l.unique
WHERE COALESCE(c.ip_visits, 0) <= l.frequency
AND COALESCE(c.total_credits, 0) <= l.limit
If this doesn't help, try adding an index on contor.ip.
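In DDL form, that last suggestion would be something like this (the index name is arbitrary):
ALTER TABLE contor ADD INDEX contor_ip (ip);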
The current query is of the form:
SELECT l.*
FROM `links` l
WHERE l.frequency >= ( SELECT COUNT(ck.key)
FROM contor ck
WHERE ck.uniqueLink = l.unique
AND ck.ip = '$ip'
)
AND l.limit >= ( SELECT SUM(sc.credits)
FROM contor sc
WHERE sc.uniqueLink = l.unique
)
Those correlated subqueries are going to eat your lunch. And your lunchbox too.
I'd suggest testing an inline view that performs both of the aggregations from contor in one pass, and then join the result from that to the links table.
Something like this:
SELECT l.*
FROM ( SELECT c.uniqueLink
, SUM(c.ip = '$ip' AND c.key IS NOT NULL) AS count_key
, SUM(c.credits) AS sum_credits
FROM `contor` c
GROUP
BY c.uniqueLink
) d
JOIN `links` l
ON l.unique = d.uniqueLink
AND l.frequency >= d.count_key
AND l.limit >= d.sum_credits
For optimal performance of the aggregation inline view query, provide a covering index that MySQL can use to optimize the GROUP BY (avoiding a Using filesort operation)
CREATE INDEX `contor_IX1` ON `contor` (`uniqueLink`, `credits`, `ip`) ;
Adding that index renders the uniqueLink index redundant, so also...
DROP INDEX `uniqueLink` ON `contor` ;
EDIT
Since we have a guarantee that the contor.key column is non-NULL (i.e. the NOT NULL constraint), the AND c.key IS NOT NULL part of the query above is unneeded and can be removed. (I also removed the key column from the covering index definition above.)
SELECT l.*
FROM ( SELECT c.uniqueLink
, SUM(c.ip = '$ip') AS count_key
, SUM(c.credits) AS sum_credits
FROM `contor` c
GROUP
BY c.uniqueLink
) d
JOIN `links` l
ON l.unique = d.uniqueLink
AND l.frequency >= d.count_key
AND l.limit >= d.sum_credits
I'd like some help with a LEFT JOIN statement that's not doing what I, probably incorrectly, think it should do.
There are two tables:
cd:
CREATE TABLE `cd` (
`itemID` int(11) NOT NULL AUTO_INCREMENT,
`title` text NOT NULL,
`artist` text NOT NULL,
`genre` text NOT NULL,
`tracks` int(11) NOT NULL,
PRIMARY KEY (`itemID`)
)
loans:
CREATE TABLE `loans` (
`itemID` int(11) NOT NULL,
`itemType` varchar(20) NOT NULL,
`userID` int(11) NOT NULL,
`dueDate` date NOT NULL,
PRIMARY KEY (`itemID`,`itemType`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
and I want to select all CDs that are not in loans, using a LEFT JOIN and then a WHERE dueDate IS NULL:
select
t.itemID,
t.artist as first,
t.title as second,
(select AVG(rating) from ac9039.ratings where itemType = 'cd' and itemId = t.itemID) as `rating average`,
(select COUNT(rating) from ac9039.ratings where itemType = 'cd' and itemId = t.itemID) as `number of ratings`
from
cd t left join loans l
on t.itemID = l.itemID
where l.itemType = 'cd' and l.dueDate is null;
This however returns an empty table, even though there are plenty of rows in cd with itemIDs that are not in loans.
I was under the impression that the LEFT JOIN should preserve the left-hand side and fill the columns from the right-hand side with NULL values,
but this does not seem to be the case. Can anyone enlighten me?
Your WHERE condition causes the problem. The l.itemType = 'cd' condition can never be true when l.dueDate IS NULL is true. (All of the loans fields are NOT NULL, so dueDate can only be NULL if there is no matching record, but in that case the itemType field will be NULL too.)
Another point is that your query is semantically incorrect: you are trying to get the records from the cd table for which the loans table does not contain any rows.
The second table acts as a condition, so it belongs in the WHERE clause.
Consider using NOT EXISTS to achieve your goal:
SELECT
t.itemID,
t.artist as first,
t.title as second,
(select AVG(rating) from ac9039.ratings where itemType = 'cd' and itemId = t.itemID) as `rating average`,
(select COUNT(rating) from ac9039.ratings where itemType = 'cd' and itemId = t.itemID) as `number of ratings`
FROM
cd t
WHERE
NOT EXISTS (SELECT 1 FROM loans l WHERE t.itemID = l.itemID AND l.itemType = 'cd')
Based on your data model, you may have to add another condition to the subquery to filter out records that are out of date (dueDate earlier than the current time).
This is the case when you do not delete outdated loan records.
NOT EXISTS (SELECT 1 FROM loans l WHERE t.itemID = l.itemID AND l.itemType = 'cd' AND l.dueDate > NOW())
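For completeness, the LEFT JOIN anti-join the asker was attempting also works once the itemType test moves into the ON clause, so unmatched cd rows survive the join; a sketch without the rating subselects:
SELECT t.itemID, t.artist AS first, t.title AS second
FROM cd t
LEFT JOIN loans l ON t.itemID = l.itemID AND l.itemType = 'cd'
WHERE l.itemID IS NULL; -- no 'cd' loan row exists for this item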
I have this payments table, with about 2M entries:
CREATE TABLE IF NOT EXISTS `payments` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(11) unsigned NOT NULL,
`date` datetime NOT NULL,
`valid_until` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY `date_id` (`date`,`id`),
KEY `user_id` (`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=2113820 ;
and this users table from the ion_auth plugin/library for CodeIgniter, with about 320k entries:
CREATE TABLE IF NOT EXISTS `users` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`ip_address` varbinary(16) NOT NULL,
`username` varchar(100) NOT NULL,
`password` varchar(80) NOT NULL,
`salt` varchar(40) DEFAULT NULL,
`email` varchar(100) NOT NULL,
`activation_code` varchar(40) DEFAULT NULL,
`forgotten_password_code` varchar(40) DEFAULT NULL,
`forgotten_password_time` int(11) unsigned DEFAULT NULL,
`remember_code` varchar(40) DEFAULT NULL,
`created_on` int(11) unsigned NOT NULL,
`last_login` int(11) unsigned DEFAULT NULL,
`active` tinyint(1) unsigned DEFAULT NULL,
`first_name` varchar(50) DEFAULT NULL,
`last_name` varchar(50) DEFAULT NULL,
`company` varchar(100) DEFAULT NULL,
`phone` varchar(20) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `name` (`first_name`,`last_name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=322435 ;
I'm trying to get both the user information and their last payment, ordering (ASC or DESC) by ID, first and last name, payment date, or payment expiration date, to build a table showing users with expired payments and users with valid ones.
I've managed to get the data correctly, but most of the time my queries take 1+ second for a single user and 40+ seconds for 30 users. To be honest, I have no idea if it's possible to get the information in under 1 second. My application will probably never reach this number of entries anyway; more likely a maximum of 10k payments and 300 users.
My query works pretty well with few entries, and it's easy to change the ordering:
SELECT users.id, users.first_name, users.last_name, users.email, final.id AS payment_id, payment_date, final.valid_until AS payment_valid_until
FROM users
LEFT JOIN (
SELECT * FROM (
SELECT payments.id, payments.user_id, payments.date AS payment_date, payments.valid_until
FROM payments
ORDER BY payments.valid_until DESC
) AS p GROUP BY p.user_id
) AS final ON final.user_id = users.id
ORDER BY id ASC
LIMIT 0, 30
Explain:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY users ALL NULL NULL NULL NULL 322269 Using where; Using temporary; Using filesort
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 50
4 DEPENDENT SUBQUERY users_deactivated unique_subquery user_id user_id 4 func 1 Using index
2 DERIVED <derived3> ALL NULL NULL NULL NULL 2072327 Using temporary; Using filesort
3 DERIVED payments ALL NULL NULL NULL NULL 2072566 Using filesort
I'm open to any suggestions and tips, since I'm new to PHP, MySQL and stuff, and don't really know if I'm doing it the right way.
I would first suggest removing the ORDER BY clause from your subquery -- I don't see how it's helping as you're reordering by id in your outer query.
You should also be able to move your GROUP BY statement into your subquery:
SELECT users.id, users.first_name, users.last_name, users.email, final.id AS payment_id, payment_date, final.valid_until AS payment_valid_until
FROM users
LEFT JOIN (
SELECT payments.id, payments.user_id, payments.date AS payment_date, payments.valid_until
FROM payments
GROUP BY payments.user_id
) AS final ON final.user_id = users.id
ORDER BY users.id ASC
LIMIT 0, 30
Given your comments, how about this -- not sure it would be better than your current query, but ORDER BY can be expensive:
SELECT users.id, users.first_name, users.last_name, users.email, p.id AS payment_id, p.payment_date, p.valid_until AS payment_valid_until
FROM users
LEFT JOIN payments p ON p.user_id = users.id
LEFT JOIN (
SELECT user_id, MAX(valid_until) Max_Valid_Until
FROM payments
GROUP BY user_id
) AS maxp ON p.user_id = maxp.user_id and p.valid_until = maxp.max_valid_until
ORDER BY users.id ASC
LIMIT 0, 30
Use an index on the payments table for user_id, and do the GROUP BY on the payments table:
alter table payments add index (user_id);
then run your query (with its ORDER BY users.id ASC), and afterwards drop the index:
alter table payments drop index user_id;
And why don't you use the payments id instead of valid_until? Is there a reason not to trust that the ids are sequential? If you don't trust the id, add an index to the valid_until field:
alter table payments add index (valid_until desc);
and don't forget to drop it later
alter table payments drop index valid_until;
If the query is still slow, you will need to cache the results... this means you need to improve your schema; here is a suggestion:
create table last_payment
(user_id int unsigned not null,
payment_id int unsigned not null,
constraint pk_last_payment primary key (user_id),
constraint fk_last_payment_user foreign key (user_id) references users (id),
constraint fk_last_payment foreign key (payment_id) references payments (id)
);
alter table payments add index (user_id);
insert into last_payment (user_id, payment_id)
(select user_id, max(id) from payments group by user_id);
# here you would probably use your own query, if max(id) does not refer to the last payment...
alter table payments drop index user_id;
and now comes the magic:
delimiter |
CREATE TRIGGER payments_trigger AFTER INSERT ON payments
FOR EACH ROW BEGIN
DELETE FROM last_payment WHERE user_id = NEW.user_id;
INSERT INTO last_payment (user_id, payment_id) values (NEW.user_id, NEW.id);
END;
|
delimiter ;
and now, every time you want to know the last payment made, you just query the last_payment table:
select u.*, p.*
from users u inner join last_payment lp on (u.id = lp.user_id)
inner join payments p on (lp.payment_id = p.id)
order by u.id asc;
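As a variation on that trigger, the DELETE + INSERT pair could be collapsed into a single upsert, assuming the primary key on last_payment.user_id from the DDL above; a sketch:
CREATE TRIGGER payments_upsert_trigger AFTER INSERT ON payments
FOR EACH ROW
    -- relies on the primary key on last_payment.user_id
    INSERT INTO last_payment (user_id, payment_id)
    VALUES (NEW.user_id, NEW.id)
    ON DUPLICATE KEY UPDATE payment_id = NEW.id;
A single-statement trigger body like this also needs no DELIMITER change.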
Maybe something like this...
SELECT u.id
, u.first_name
, u.last_name
, u.email
, p.id payment_id
, p.date payment_date
, p.valid_until payment_valid_until
FROM users u
JOIN payments p
ON p.user_id = u.id
JOIN
( SELECT user_id, MAX(valid_until) max_valid_until FROM payments GROUP BY user_id ) x
ON x.user_id = p.user_id
AND x.max_valid_until = p.valid_until;
The problem with joining to a subquery is that MySQL internally materializes the result of the subquery before performing the join. This is expensive in resources and is probably what's taking the time. The best solution is to change the query to avoid subqueries.
SELECT users.id, users.first_name, users.last_name, users.email, max(payments.id) AS payment_id, max(payments.date) as payment_date, max(payments.valid_until) AS payment_valid_until
FROM users
LEFT JOIN payments use index (user_id) on payments.user_id=users.id
group by users.id
ORDER BY id ASC
LIMIT 0, 30
This query is only correct, however, if the largest values for id, date and valid_until are always in the same record.
SELECT final.user_id, users.first_name, users.last_name,
users.email, final.id, MAX(final.date), MAX(final.valid_until)
FROM payments final
JOIN users ON final.user_id = users.id
GROUP BY final.user_id
ORDER BY final.user_id ASC
LIMIT 0, 30
The idea is to flatten the payments first.
The MAX values, of course, may come from different payment records.
Speed up
Above I did a MySQL-specific thing: selecting final.id without MAX. Better not to use that field at all.
If you could leave out payments.id, the query would be faster (with the appropriate indexes):
KEY `user_date` (`user_id`, `date` DESC ),
KEY `user_valid` (`user_id`, `valid_until` DESC ),
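To illustrate, a sketch (untested) of the reduced query that note implies, with payments.id left out so an index like (user_id, valid_until) can cover the aggregation:
SELECT u.id, u.first_name, u.last_name, u.email,
       MAX(p.date) AS payment_date,
       MAX(p.valid_until) AS payment_valid_until
FROM users u
LEFT JOIN payments p ON p.user_id = u.id
GROUP BY u.id
ORDER BY u.id ASC
LIMIT 0, 30;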