Tweets implementation MySQL and queries - mysql

I am implementing a simple follow/followers system in MySQL. So far I have three tables that look like:
CREATE TABLE IF NOT EXISTS `User` (
`user_id` INT AUTO_INCREMENT PRIMARY KEY,
`username` varchar(40) NOT NULL ,
`pswd` varchar(255) NOT NULL,
`email` varchar(255) NOT NULL ,
`first_name` varchar(40) NOT NULL ,
`last_name` varchar(40) NOT NULL,
CONSTRAINT uc_username_email UNIQUE (username , email)
);
-- Using a middle table for users to follow others on a many-to-many base
CREATE TABLE Following (
follower_id INT(6) NOT NULL,
following_id INT(6) NOT NULL,
KEY (`follower_id`),
KEY (`following_id`)
);
CREATE TABLE IF NOT EXISTS `Tweet` (
`tweet_id` INT AUTO_INCREMENT PRIMARY KEY,
`text` varchar(280) NOT NULL ,
-- I chose varchar vs TEXT as the latter is not stored in the database server’s memory.
-- When querying TEXT data, MySQL has to read it from disk, which is much slower than VARCHAR.
`publication_date` DATETIME NOT NULL,
`username` varchar(40),
FOREIGN KEY (`username`) REFERENCES `User`(`username`)
ON DELETE CASCADE
);
Let's say I want to write a query that returns the 10 latest tweets by users followed by the user with username "Tom". What is the best way to write that query and return results with username, first name, last name, text and publication date?
Also, if one minute later I want to query the 10 latest tweets again, and assuming someone Tom follows tweets during that minute, how do I query the database so that it does not select tweets that were already shown by the first query?

To answer your first question:
SELECT u1.username, u1.first_name, u1.last_name, t.text, t.publication_date
FROM Tweet t
JOIN User u1 ON t.username = u1.username
JOIN Following f ON f.following_id = u1.user_id
JOIN User u2 ON u2.user_id = f.follower_id
WHERE u2.username = 'Tom'
ORDER BY t.publication_date DESC
LIMIT 10
For the second part, simply take the tweet_id from the first row of the first query (the latest tweet_id value, assuming ids grow with publication_date) and use it in the WHERE clause for the next query, i.e.:
WHERE u2.username = 'Tom'
AND t.tweet_id > <value from previous query>
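The "remember the last tweet_id and ask only for newer rows" idea can be tried end to end with a small sketch. It is only an illustration under assumptions: SQLite (via Python's sqlite3) stands in for MySQL, the tables are cut down to the columns the query needs, the sample rows and the fetch_feed helper are made up, and the result is ordered by tweet_id on the assumption that ids grow with publication_date.

```python
import sqlite3

# SQLite is used here only as a stand-in for MySQL; names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE `User` (user_id INTEGER PRIMARY KEY, username TEXT UNIQUE,
                     first_name TEXT, last_name TEXT);
CREATE TABLE `Following` (follower_id INTEGER, following_id INTEGER);
CREATE TABLE `Tweet` (tweet_id INTEGER PRIMARY KEY, text TEXT,
                      publication_date TEXT, username TEXT);
INSERT INTO `User` VALUES (1, 'Tom', 'Tom', 'A'), (2, 'Ann', 'Ann', 'B');
INSERT INTO `Following` VALUES (1, 2);  -- Tom follows Ann
INSERT INTO `Tweet` VALUES (1, 'first', '2020-01-01 10:00:00', 'Ann'),
                           (2, 'second', '2020-01-01 10:01:00', 'Ann');
""")

FEED_SQL = """
SELECT t.tweet_id, u1.username, t.text, t.publication_date
FROM `Tweet` t
JOIN `User` u1 ON t.username = u1.username
JOIN `Following` f ON f.following_id = u1.user_id
JOIN `User` u2 ON u2.user_id = f.follower_id
WHERE u2.username = ? AND t.tweet_id > ?
ORDER BY t.tweet_id DESC
LIMIT 10
"""

def fetch_feed(follower, last_seen_id=0):
    """Hypothetical helper: up to 10 tweets newer than last_seen_id, newest first."""
    return conn.execute(FEED_SQL, (follower, last_seen_id)).fetchall()

first_page = fetch_feed("Tom")
# The first row carries the highest tweet_id shown so far.
last_seen = first_page[0][0] if first_page else 0

# A followed user tweets between the two polls.
conn.execute("INSERT INTO `Tweet` VALUES (3, 'third', '2020-01-01 10:02:00', 'Ann')")
new_tweets = fetch_feed("Tom", last_seen)
```

The second call returns only the tweet inserted after the first poll, because the two tweets already shown have ids at or below last_seen.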

To get latest 10 tweets for Tom:
select flg.username, flg.first_name, flg.last_name, t.tweet_id, t.text, t.publication_date
from user flr
inner join following f on f.follower_id = flr.user_id
inner join user flg on flg.user_id = f.following_id
inner join tweet t on t.username = flg.username
where flr.username = 'Tom'
order by tweet_id desc
limit 10
To get the next 10 tweets, pass in the max tweet_id, and apply an additional condition in the where clause:
where flr.username = 'Tom'
and t.tweet_id > <previous_max_tweet_id>

Related

Minimizing repeated joins in MySQL to horizontally constructed table

I have a table which contains contacts (simplified here):
create table contacts
(
id_c char(36) not null primary key,
first_name varchar(255) not null,
last_name varchar(255) not null,
user_id1_c char(36) not null,
user_id2_c char(36) not null,
user_id3_c char(36),
user_id4_c char(36),
user_id5_c char(36),
user_id6_c char(36),
user_id7_c char(36) null,
user_id8_c char(36) null,
user_id9_c char(36) null,
user_id10_c char(36) null,
user_id11_c char(36) null,
user_id12_c char(36) null,
user_id13_c char(36) null,
user_id14_c char(36),
user_id15_c char(36) null,
user_id16_c char(36) null
)
engine = MyISAM
charset = utf8;
and a related table containing user information:
create table users (
id char(36) not null primary key,
first_name varchar(255),
last_name varchar(255)
)
For a report, I would want to be able to get the names of the associated users, and in MySQL this is a JOIN statement.
SELECT contacts.*,
CONCAT_WS(' ', u1.first_name, u1.last_name) AS `some_user`,
CONCAT_WS(' ', u2.first_name, u2.last_name) AS `some_other_user`
FROM contacts
LEFT JOIN users u1 ON u1.id = user_id1_c
LEFT JOIN users u2 ON u2.id = user_id2_c
...
In the query, I would be able to get the names no problem (and before you go rushing to the comments with criticism for the design, this is the structure I got, so building a more abstract middleman user table isn't really within my pay grade). I recognize that appending 16 joins onto a query is slow even if it is well indexed. So my question becomes the following:
Is there a more direct way to address this problem? I have considered doing a LEFT JOIN users WHERE users.id IN (user_id1_c,...), but I cannot organize a GROUP BY or a distinct such that I retrieve the names matched to the column, just that a user maps to some column in a contact record.
Am I doomed to repeated LEFT JOINs, or is there a better way? If it matters, I am currently on MySQL 5.7, though I think we might be upgrading our company to version 8 sometime soon.
SELECT
c.*,
MAX(CASE WHEN u.id = c.user_id1_c THEN u.whole_name END) AS user_name_1,
MAX(CASE WHEN u.id = c.user_id2_c THEN u.whole_name END) AS user_name_2,
MAX(CASE WHEN u.id = c.user_id3_c THEN u.whole_name END) AS user_name_3,
...
FROM
contacts AS c
LEFT JOIN
(
SELECT
id,
CONCAT_WS(' ', first_name, last_name) AS whole_name
FROM
users
)
AS u
ON u.id IN (c.user_id1_c, c.user_id2_c, c.user_id3_c, ...)
GROUP BY
c.id_c
Really you should group by everything that you select from the contacts table, but MySQL 5.7 is lax and allows you to group by only the unique identifier.
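To see the single-join, conditional-aggregation approach produce one name per user column, here is a small sketch. It makes several assumptions: SQLite (via Python's sqlite3) stands in for MySQL, the contacts table is reduced to three user columns, the sample rows are invented, and `||` replaces CONCAT_WS because older SQLite versions do not have that function.

```python
import sqlite3

# SQLite stands in for MySQL; contacts is reduced to three user columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id TEXT PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE contacts (id_c TEXT PRIMARY KEY,
                       user_id1_c TEXT, user_id2_c TEXT, user_id3_c TEXT);
INSERT INTO users VALUES ('u1', 'Ada', 'Lovelace'), ('u2', 'Alan', 'Turing');
INSERT INTO contacts VALUES ('c1', 'u1', 'u2', NULL),
                            ('c2', 'u2', NULL, 'u1');
""")

rows = conn.execute("""
SELECT c.id_c,
       -- each MAX(CASE ...) keeps the name only from the joined user row
       -- whose id matches that particular column
       MAX(CASE WHEN u.id = c.user_id1_c THEN u.whole_name END) AS user_name_1,
       MAX(CASE WHEN u.id = c.user_id2_c THEN u.whole_name END) AS user_name_2,
       MAX(CASE WHEN u.id = c.user_id3_c THEN u.whole_name END) AS user_name_3
FROM contacts c
LEFT JOIN (SELECT id, first_name || ' ' || last_name AS whole_name
           FROM users) AS u
       ON u.id IN (c.user_id1_c, c.user_id2_c, c.user_id3_c)
GROUP BY c.id_c
ORDER BY c.id_c
""").fetchall()
```

Each contact joins to every user it references, and the GROUP BY collapses those rows back into one row per contact. Columns with no user assigned come back as NULL (None in Python).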

SQL JOIN with conditions

I have a conversation table which contains two users ids as foreign keys, and the user table which contains the users details. I want to write a query which returns the conversation table joined to the user table but displaying the name and surname of the user whose id wasn't sent as the parameter.
CREATE TABLE `conversation` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`user_one_id` int(11) NOT NULL,
`user_two_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
)
CREATE TABLE `user` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
`surname` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
)
For example I have
Conversation:
id user_one_id user_two_id
1 1 2
User:
id name surname
1 userone_name userone_surname
2 usertwo_name usertwo_surname
I want a query that will return user_two's name and surname in the join, not user one.
My current query:
SELECT c.id, c.user_one_id, c.user_two_id, u.name, u.surname * FROM conversation c
JOIN user u
WHERE c.user_one_id = 1
OR c.user_two_id = 1
AND IF (c.user_one_id = u.id, c.user_two_id = u.id, c.user_one_id = u.id)
GROUP BY c.id
ORDER BY c.date DESC;
[INNER] JOIN should have an ON clause. (I consider it a flaw that MySQL allows you to omit it.)
The join criteria would have to be: Give me the user of the conversation that is not user 1.
SELECT c.id, c.user_one_id, c.user_two_id, u.name, u.surname
FROM conversation c
JOIN user u ON u.id IN (c.user_one_id, c.user_two_id) AND u.id <> 1
WHERE c.user_one_id = 1 OR c.user_two_id = 1
ORDER BY c.date DESC;
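A quick way to check the join condition is to run it against a few sample conversations. The sketch below is illustrative only: SQLite (via Python's sqlite3) stands in for MySQL, the third user and the extra conversations are invented, the hypothetical other_participants helper takes the "known" user id as a parameter instead of the literal 1, and rows are ordered by c.id because the table definition shown has no date column.

```python
import sqlite3

# SQLite stands in for MySQL; the rows beyond the question's example are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, surname TEXT);
CREATE TABLE conversation (id INTEGER PRIMARY KEY,
                           user_one_id INTEGER, user_two_id INTEGER);
INSERT INTO user VALUES (1, 'userone_name', 'userone_surname'),
                        (2, 'usertwo_name', 'usertwo_surname'),
                        (3, 'userthree_name', 'userthree_surname');
INSERT INTO conversation VALUES (1, 1, 2), (2, 3, 1), (3, 2, 3);
""")

def other_participants(me):
    """Hypothetical helper: for each conversation of `me`, return the other user."""
    return conn.execute("""
        SELECT c.id, u.name, u.surname
        FROM conversation c
        JOIN user u ON u.id IN (c.user_one_id, c.user_two_id) AND u.id <> :me
        WHERE c.user_one_id = :me OR c.user_two_id = :me
        ORDER BY c.id
    """, {"me": me}).fetchall()

result = other_participants(1)
```

Conversation 3 does not involve user 1 and is filtered out by the WHERE clause. In the other two, the ON clause picks whichever participant is not user 1, regardless of which column holds it.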

complicated sql query returns a result with empty tables

I have three empty tables
--
-- Table structure for table `projects`
--
CREATE TABLE IF NOT EXISTS `projects` (
`id_project` int(11) NOT NULL AUTO_INCREMENT,
`id_plan` int(11) DEFAULT NULL,
`name` varchar(255) NOT NULL,
`description` longtext NOT NULL,
PRIMARY KEY (`id_project`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=2 ;
-- --------------------------------------------------------
--
-- Table structure for table `project_plans`
--
CREATE TABLE IF NOT EXISTS `project_plans` (
`id_plan` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
`description` longtext NOT NULL,
`max_projects` int(11) DEFAULT NULL,
`max_member` int(11) DEFAULT NULL,
`max_filestorage` bigint(20) NOT NULL DEFAULT '3221225472' COMMENT '3GB storage space',
PRIMARY KEY (`id_plan`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=2 ;
-- --------------------------------------------------------
--
-- Table structure for table `project_users`
--
CREATE TABLE IF NOT EXISTS `project_users` (
`id_user` int(11) NOT NULL,
`id_project` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
All these tables are empty, but I still get a result from my query. Why?
My query:
SELECT
A.id_plan,
A.name AS plan_name,
A.description AS plan_description,
A.max_projects,
A.max_member,
A.max_filestorage,
B.id_plan,
B.name AS project_name,
B.description AS project_description,
C.id_user,
C.id_project,
COUNT(*) AS max_project_member
FROM
".$this->config_vars["projects_plans_table"]." AS A
LEFT JOIN
".$this->config_vars["projects_table"]." AS B
ON
B.id_plan = A.id_plan
LEFT JOIN
".$this->config_vars["projects_user_table"]." AS C
ON
C.id_project = B.id_project
WHERE
C.id_project = '".$id."'
&& B.deleted = '0'
I think the problem is the COUNT(*) AS ...
How can I solve the problem?
For one, you are getting a record because of the COUNT(*). An aggregate without a GROUP BY always returns exactly one row, even when no rows match; in that case the count is simply zero. Aggregates like COUNT() are normally paired with a GROUP BY, but even without one you are still asking for a count.
So the engine is basically saying: there are no records, but I have to send you one row so that you can read the COUNT(*) column and do with it what you will. It is doing what you asked.
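The one-row-from-nothing behaviour is easy to reproduce on an empty table. The sketch below assumes SQLite (via Python's sqlite3) as a stand-in for MySQL and keeps only the project_users table. It runs the same aggregate query twice, once without and once with a GROUP BY.

```python
import sqlite3

# SQLite stands in for MySQL; the table is deliberately left empty.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE project_users (id_user INTEGER, id_project INTEGER)")

# No GROUP BY: the aggregate still has to report a count, so one row comes back.
no_group = conn.execute(
    "SELECT id_project, COUNT(*) FROM project_users WHERE id_project = 1"
).fetchall()

# With GROUP BY: there are no groups at all, so there is nothing to return.
with_group = conn.execute(
    "SELECT id_project, COUNT(*) FROM project_users "
    "WHERE id_project = 1 GROUP BY id_project"
).fetchall()
```

Here no_group holds a single row with a NULL id_project and a count of 0, while with_group is empty.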
Now, for the comment on the other answer where you asked...
Yes, but I want to count the members of a project. How can I count the users in project_users where all users have id_project 1?
Since you only care about a count, and not specifically WHO is involved, you can get this result directly from the project_users table (which should have one index on id_user and another on id_project):
select count(*)
from project_users
where id_project = 1
To expand from basis of your original question to get the extra details, I would do...
select
p.id_project,
p.id_plan,
p.name as projectName,
p.description as projectDescription,
pp.name as planName,
pp.description as planDescription,
pp.max_projects,
pp.max_member,
pp.max_filestorage,
PJCnt.ProjectMemberCount
from
( select id_project,
count(*) as ProjectMemberCount
from
project_users
where
id_project = 1
group by
id_project ) PJCnt
JOIN projects p
on PJCnt.id_project = p.id_project
JOIN project_plans pp
on p.id_plan = pp.id_plan
Now, based on this layout of tables, a plan can have a max member count, but there is nothing indicating max members for the plan based on all projects, or max per SINGLE project. So, if a plan allows for 20 people, can there be 20 people for 10 different projects under the same plan? That's something only you would know the impact of... just something to consider what you are asking for.
Your cleaned-up query should look like this (see the SQL Fiddle demo as well: http://sqlfiddle.com/#!2/e693f5/9):
SELECT
A.id_plan,
A.name AS plan_name,
A.description AS plan_description,
A.max_projects,
A.max_member,
A.max_filestorage,
B.id_plan,
B.name AS project_name,
B.description AS project_description,
C.id_user,
C.id_project,
COUNT(*) AS max_project_member
FROM
project_plans AS A
LEFT JOIN
projects AS B
ON
B.id_plan = A.id_plan
LEFT JOIN
project_users AS C
ON
C.id_project = B.id_project
WHERE
C.id_project = '".$id."';
This returns NULL for every selected column because the result set contains exactly one row, the one produced by COUNT(*), and its count is 0.
To fix this, add a GROUP BY at the end (see the GROUP BY example: http://sqlfiddle.com/#!2/14d46/2), or
remove the COUNT(*); the row of NULL values disappears along with the 0 count.
See a simple SQL example here: http://sqlfiddle.com/#!2/ab7dd/5
Just comment out the COUNT(*) and the NULL problem is fixed!

Can't get my query to run any faster on MySQL database with 2M entries

I have this payments table, with about 2M entries
CREATE TABLE IF NOT EXISTS `payments` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(11) unsigned NOT NULL,
`date` datetime NOT NULL,
`valid_until` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY `date_id` (`date`,`id`),
KEY `user_id` (`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=2113820 ;
and this users table from ion_auth plugin/library for CodeIgniter, with about 320k entries
CREATE TABLE IF NOT EXISTS `users` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`ip_address` varbinary(16) NOT NULL,
`username` varchar(100) NOT NULL,
`password` varchar(80) NOT NULL,
`salt` varchar(40) DEFAULT NULL,
`email` varchar(100) NOT NULL,
`activation_code` varchar(40) DEFAULT NULL,
`forgotten_password_code` varchar(40) DEFAULT NULL,
`forgotten_password_time` int(11) unsigned DEFAULT NULL,
`remember_code` varchar(40) DEFAULT NULL,
`created_on` int(11) unsigned NOT NULL,
`last_login` int(11) unsigned DEFAULT NULL,
`active` tinyint(1) unsigned DEFAULT NULL,
`first_name` varchar(50) DEFAULT NULL,
`last_name` varchar(50) DEFAULT NULL,
`company` varchar(100) DEFAULT NULL,
`phone` varchar(20) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `name` (`first_name`,`last_name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=322435 ;
I'm trying to get both the user information and each user's last payment, ordered (ASC or DESC) by ID, first and last name, payment date, or payment expiration date, in order to build a table showing users with expired payments and users with valid ones.
I've managed to get the correct data, but most of the time my queries take 1+ second for a single user and 40+ seconds for 30 users. To be honest, I have no idea whether it's possible to get the information in under 1 second. Also, my application will probably never reach this number of entries; a maximum of 10k payments and 300 users is more likely.
My query works pretty well with few entries, and it's easy to change the ordering:
SELECT users.id, users.first_name, users.last_name, users.email, final.id AS payment_id, payment_date, final.valid_until AS payment_valid_until
FROM users
LEFT JOIN (
SELECT * FROM (
SELECT payments.id, payments.user_id, payments.date AS payment_date, payments.valid_until
FROM payments
ORDER BY payments.valid_until DESC
) AS p GROUP BY p.user_id
) AS final ON final.user_id = users.id
ORDER BY id ASC
LIMIT 0, 30
Explain:
id  select_type         table              type             possible_keys  key      key_len  ref   rows     Extra
1   PRIMARY             users              ALL              NULL           NULL     NULL     NULL  322269   Using where; Using temporary; Using filesort
1   PRIMARY             <derived2>         ALL              NULL           NULL     NULL     NULL  50
4   DEPENDENT SUBQUERY  users_deactivated  unique_subquery  user_id        user_id  4        func  1        Using index
2   DERIVED             <derived3>         ALL              NULL           NULL     NULL     NULL  2072327  Using temporary; Using filesort
3   DERIVED             payments           ALL              NULL           NULL     NULL     NULL  2072566  Using filesort
I'm open to any suggestions and tips, since I'm new to PHP, MySQL and so on, and don't really know if I'm doing this the correct way.
I would first suggest removing the ORDER BY clause from your subquery -- I don't see how it's helping as you're reordering by id in your outer query.
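You should also be able to move your GROUP BY into your subquery (note that without an aggregate, MySQL takes the non-grouped column values from an arbitrary row of each group, so this does not by itself guarantee the latest payment):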
SELECT users.id, users.first_name, users.last_name, users.email, final.id AS payment_id, payment_date, final.valid_until AS payment_valid_until
FROM users
LEFT JOIN (
SELECT payments.id, payments.user_id, payments.date AS payment_date, payments.valid_until
FROM payments
GROUP BY payments.user_id
) AS final ON final.user_id = users.id
ORDER BY users.id ASC
LIMIT 0, 30
Given your comments, how about this -- not sure it would be better than your current query, but ORDER BY can be expensive:
SELECT users.id, users.first_name, users.last_name, users.email, p.id AS payment_id, p.date AS payment_date, p.valid_until AS payment_valid_until
FROM users
LEFT JOIN (
SELECT user_id, MAX(valid_until) AS max_valid_until
FROM payments
GROUP BY user_id
) AS maxp ON maxp.user_id = users.id
-- join back to payments to fetch the row that holds the latest valid_until
LEFT JOIN payments p ON p.user_id = maxp.user_id AND p.valid_until = maxp.max_valid_until
ORDER BY users.id ASC
LIMIT 0, 30
Use an index on user_id in the payments table, and do the GROUP BY on the payments table:
alter table payments add index (user_id);
your query
ORDER BY users.id ASC
alter table payments drop index user_id;
Also, why not use the payments "id" instead of "valid_until"? Is there a reason not to trust that the ids are sequential? If you don't trust the id, add an index on the valid_until field:
alter table payments add index valid_until (valid_until);
and don't forget to drop it later:
alter table payments drop index valid_until;
If the query is still slow, you will need to cache the results. That means extending your schema; here is a suggestion:
create table last_payment (
user_id int unsigned not null,
payment_id int unsigned not null,
primary key (user_id),
constraint fk_last_payment_user foreign key (user_id) references users(id),
constraint fk_last_payment foreign key (payment_id) references payments(id)
);
alter table payments add index (user_id);
insert into last_payment (user_id, payment_id)
(select user_id, max(id) from payments group by user_id);
#here you probably use your own query if the max (id) does not refer to the last payment...
alter table payments drop index user_id;
and now comes the magic:
delimiter |
CREATE TRIGGER payments_trigger AFTER INSERT ON payments
FOR EACH ROW BEGIN
DELETE FROM last_payment WHERE user_id = NEW.user_id;
INSERT INTO last_payment (user_id, payment_id) values (NEW.user_id, NEW.id);
END;
|
delimiter ;
Now, every time you want to know the last payment made, you join through the last_payment table:
select u.*, p.*
from users u inner join last_payment lp on (u.id = lp.user_id)
inner join payments p on (lp.payment_id = p.id)
order by u.id asc;
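The trigger-maintained cache can be exercised with a small sketch. It rests on assumptions: SQLite (via Python's sqlite3) stands in for MySQL, so no DELIMITER command is needed, the tables are reduced to the columns the trigger touches, and the payment rows are invented.

```python
import sqlite3

# SQLite stands in for MySQL; tables are cut down to what the trigger needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE payments (id INTEGER PRIMARY KEY, user_id INTEGER, valid_until TEXT);
CREATE TABLE last_payment (user_id INTEGER PRIMARY KEY, payment_id INTEGER);

-- After every insert, replace the cached "last payment" row for that user.
CREATE TRIGGER payments_trigger AFTER INSERT ON payments
FOR EACH ROW BEGIN
    DELETE FROM last_payment WHERE user_id = NEW.user_id;
    INSERT INTO last_payment (user_id, payment_id) VALUES (NEW.user_id, NEW.id);
END;

INSERT INTO payments VALUES (10, 1, '2020-02-01');
INSERT INTO payments VALUES (11, 1, '2020-03-01');
INSERT INTO payments VALUES (12, 2, '2020-02-15');
""")

cache = conn.execute(
    "SELECT user_id, payment_id FROM last_payment ORDER BY user_id"
).fetchall()
```

After the three inserts the cache holds one row per user, pointing at that user's most recently inserted payment. Like the answer's version, this treats "last inserted" as "last payment". If the latest valid_until can belong to an older row, the trigger body would need a comparison.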
Maybe something like this...
SELECT u.id
, u.first_name
, u.last_name
, u.email
, p.id payment_id
, p.date payment_date
, p.valid_until payment_valid_until
FROM users u
JOIN payments p
ON p.user_id = u.id
JOIN
( SELECT user_id, MAX(valid_until) max_valid_until FROM payments GROUP BY user_id ) x
ON x.user_id = p.user_id
AND x.max_valid_until = p.valid_until;
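The "join back to the per-user maximum" pattern can be checked on a handful of rows. The sketch below is only an illustration: SQLite (via Python's sqlite3) stands in for MySQL, the tables are trimmed, the sample data is invented, and LEFT JOINs are used as a variation so that users without any payment still appear.

```python
import sqlite3

# SQLite stands in for MySQL; tables are trimmed and the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE payments (id INTEGER PRIMARY KEY, user_id INTEGER,
                       date TEXT, valid_until TEXT);
-- composite index so the per-user MAX can be read from the index
CREATE INDEX user_valid ON payments (user_id, valid_until);
INSERT INTO users VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO payments VALUES
    (10, 1, '2020-01-01', '2020-02-01'),
    (11, 1, '2020-02-01', '2020-03-01'),
    (12, 2, '2020-01-15', '2020-02-15');
""")

rows = conn.execute("""
SELECT u.id, u.first_name, p.id AS payment_id, p.valid_until
FROM users u
LEFT JOIN (SELECT user_id, MAX(valid_until) AS max_valid_until
           FROM payments
           GROUP BY user_id) x ON x.user_id = u.id
LEFT JOIN payments p ON p.user_id = x.user_id
                    AND p.valid_until = x.max_valid_until
ORDER BY u.id
LIMIT 30
""").fetchall()
```

Ann gets payment 11 rather than 10, and Cy, who has no payments, still shows up with NULLs. If a user had two payments sharing the same maximum valid_until, both would be returned, so a tie-breaker on payments.id would be needed in that case.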
The problem with joining to a subquery is that MySQL internally generates the result of the subquery before performing the join. This is expensive in resources and is probably what takes the time. The best solution is to change the query to avoid subqueries.
SELECT users.id, users.first_name, users.last_name, users.email, max(payments.id) AS payment_id, max(payments.date) as payment_date, max(payments.valid_until) AS payment_valid_until
FROM users
LEFT JOIN payments use index (user_id) on payments.user_id=users.id
group by users.id
ORDER BY id ASC
LIMIT 0, 30
This query is only correct, however, if the largest values for id, date and valid_until are always in the same record.
SELECT final.user_id, users.first_name, users.last_name,
users.email, (final.id), MAX(final.date), MAX(final.valid_until)
FROM payments final
JOIN users ON final.user_id = users.id
GROUP BY final.user_id
ORDER BY final.user_id ASC
LIMIT 0, 30
The idea is to flatten the payments first.
The MAX fields can of course come from different payment records.
Speed up
Above I did a MySQL-specific thing: final.id without MAX. Better not to use the field at all.
If you could leave out the payments.id, it would be faster (with the appropriate index).
KEY `user_date` (`user_id`, `date` DESC ),
KEY `user_valid` (`user_id`, `valid_until` DESC ),

Select only the last inserted item related to another registry

I modeled a small database for easier explanation:
CREATE TABLE bands (
id INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
name VARCHAR(120) NULL,
PRIMARY KEY(id)
)
ENGINE=InnoDB;
CREATE TABLE albums (
id INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
band_id INTEGER UNSIGNED NOT NULL,
album_name VARCHAR(120) NULL,
rating INTEGER UNSIGNED NULL,
insertion_date TIMESTAMP NULL,
PRIMARY KEY(id),
INDEX albums_FKIndex1(band_id),
FOREIGN KEY(band_id)
REFERENCES bands(id)
ON DELETE NO ACTION
ON UPDATE NO ACTION
)
ENGINE=InnoDB;
Now, pretending that we already have some bands and many albums registered in their respective tables, I want to select ONLY the last inserted album from each registered band.
PS: I have to use the "albums.insertion_date" field to determine which album is the last inserted.
Try joining the two tables and filtering by insertion_date and band:
SELECT al.*
FROM albums al
INNER JOIN bands b ON al.band_id=b.id
WHERE al.insertion_date=(
SELECT max(insertion_date)
FROM albums
WHERE band_id=b.id
)
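The correlated-subquery filter can be run on a few sample rows to confirm it keeps exactly one album per band. This is a sketch under assumptions: SQLite (via Python's sqlite3) stands in for MySQL, the tables are reduced to the relevant columns, and the band and album rows are invented.

```python
import sqlite3

# SQLite stands in for MySQL; the bands and albums below are made-up samples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bands (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE albums (id INTEGER PRIMARY KEY, band_id INTEGER,
                     album_name TEXT, insertion_date TEXT);
INSERT INTO bands VALUES (1, 'Band A'), (2, 'Band B');
INSERT INTO albums VALUES (1, 1, 'A-old',  '2010-01-01 00:00:00'),
                          (2, 1, 'A-new',  '2012-01-01 00:00:00'),
                          (3, 2, 'B-only', '2011-06-01 00:00:00');
""")

latest = conn.execute("""
SELECT b.name, al.album_name
FROM albums al
JOIN bands b ON al.band_id = b.id
-- keep only the album whose date equals the newest date of its own band
WHERE al.insertion_date = (SELECT MAX(insertion_date)
                           FROM albums
                           WHERE band_id = b.id)
ORDER BY b.name
""").fetchall()
```

Albums with a NULL insertion_date never match the equality, and two albums of one band sharing the same newest date would both be returned.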
Try this one:
select b.name, a.album_name, a.insertion_date
from bands b, albums a
where a.band_id = b.id
and a.insertion_date = (select max(a1.insertion_date) from albums a1 where a1.band_id = b.id)
Considering that you have the albums' ids to be AUTO_INCREMENT and the possibility for the insertion_date to be NULL(as it is the default value), using insertion_date to determine the results is not the smartest thing to do but ... there you go:
SELECT DISTINCT band, last_album, insertion_date
FROM (
SELECT bands.name AS band, albums.album_name AS last_album, albums.insertion_date
FROM bands
JOIN albums ON bands.id=albums.band_id
ORDER BY albums.insertion_date DESC
) t1
GROUP BY band;