Voting - Stored procedures - MySQL

I'm trying to come up with an elegant solution for a voting system like SO's. If there's a way to do it elegantly with triggers, I couldn't figure it out, so I'm trying stored procedures. This is what I've come up with. It's not pretty, so I'm asking for ideas. I'll probably end up with one query rather than a query plus a stored procedure, but I'd really like to know a clean way to update a user's points and insert/update votes. Points are in a separate table to be updated by the procedure.
Upvote
INSERT INTO votes
ON DUPLICATE KEY
UPDATE votes
SET v.weight = v.weight + 1
WHERE v.weight = 0 OR v.weight = -1
AND v.userid = {$uid}
AND v.itemid = {$itemid}
//call procedure to +1 user points
Downvote
INSERT INTO votes
ON DUPLICATE KEY
UPDATE votes
SET v.weight = v.weight - 1
WHERE v.weight = 1 OR v.weight = 0
AND v.userid = {$uid}
AND v.itemid = {$itemid}
//call procedure to -1
Flipdown (when user changes vote from up to down)
INSERT INTO votes
ON DUPLICATE KEY
UPDATE votes
SET v.weight = -1
WHERE v.weight = 1
AND v.userid = {$uid}
AND v.itemid = {$itemid}
//call procedure to -2
Flipup
INSERT INTO votes
ON DUPLICATE KEY
UPDATE votes
SET v.weight = 0
WHERE v.weight = -1
AND v.userid = {$uid}
AND v.itemid = {$itemid}
//call procedure to +2

I assume that the votes table has 3 columns (post_id, user_id, weight). You can use the following query:
insert into votes (post_id, user_id, weight)
values (post_id_in, user_id_in, weight_in)
on duplicate key update
    weight = weight_in;
You should have a unique index on (post_id, user_id).
If you denormalize the data and the posts table has a column for total votes, you will need to recalculate it.
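For completeness, here is a minimal sketch of how the upsert above fits together. The table and column names come from the answer; the schema itself is an assumption for illustration:

```sql
-- Assumed schema: one row per (post_id, user_id) pair.
CREATE TABLE votes (
    post_id INT NOT NULL,
    user_id INT NOT NULL,
    weight  TINYINT NOT NULL DEFAULT 0,
    PRIMARY KEY (post_id, user_id)  -- the key that ON DUPLICATE KEY fires on
);

-- A first vote inserts a row; voting again just overwrites the weight.
INSERT INTO votes (post_id, user_id, weight)
VALUES (42, 7, 1)
ON DUPLICATE KEY UPDATE weight = VALUES(weight);
```

Because the duplicate-key path runs on the same row identified by the unique key, no WHERE clause is needed (and none is allowed) in the UPDATE part.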

I personally do not see a need for a stored procedure in your case. If you are tracking user votes, the user id is already available to you. I would suggest opening a MySQL transaction, performing your insert into the votes table, and then performing an update to keep track of the user's score. If both calls succeed, commit the transaction; this will ensure data integrity.
Maybe you could share your specific reasons for wanting to use procedures?
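The transactional approach described above can be sketched like this; the points table and its columns are assumptions for illustration:

```sql
START TRANSACTION;

-- record (or flip) the vote
INSERT INTO votes (post_id, user_id, weight)
VALUES (42, 7, 1)
ON DUPLICATE KEY UPDATE weight = 1;

-- keep the user's score in step with the vote
UPDATE points SET score = score + 1 WHERE user_id = 7;

-- commit only if both statements succeeded; otherwise ROLLBACK
COMMIT;
```

If either statement fails, the application issues ROLLBACK instead of COMMIT, so the vote and the score can never drift apart.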


Is there a way to get the number of comments for each user and update it in the number_of_comments column automatically?

I have two tables in MySQL like this
Users -> user_id , user_name , number_of_comments
Comments -> comment_id , comment , user_id
Not recommended, but it solves the problem nevertheless. For learning purposes only.
CREATE TRIGGER tr_ai_update_n_of_comments
AFTER INSERT ON comments
FOR EACH ROW
UPDATE users
SET number_of_comments = ( SELECT COUNT(*)
FROM comments
WHERE comments.user_id = NEW.user_id )
WHERE user_id = NEW.user_id;
If the rows in comments may be updated (with the user_id value changing) and/or deleted, then create similar AFTER DELETE and AFTER UPDATE triggers.
PS. I strongly recommend removing the users.number_of_comments column entirely and computing the actual comment count with a query when needed.
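The on-demand query the PS suggests could look like this, using the tables from the question:

```sql
-- Comment count per user, computed when needed instead of stored.
SELECT u.user_id, u.user_name, COUNT(c.comment_id) AS number_of_comments
FROM users u
LEFT JOIN comments c ON c.user_id = u.user_id
GROUP BY u.user_id, u.user_name;
```

The LEFT JOIN keeps users with zero comments in the result (COUNT of a NULL column yields 0 for them).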
If you can accept that the value may be approximate (slightly different from the exact one), then you can use an incremental trigger.
CREATE TRIGGER tr_ai_update_n_of_comments
AFTER INSERT ON comments
FOR EACH ROW
UPDATE users
SET number_of_comments = number_of_comments + 1
WHERE user_id = NEW.user_id;
But just in case, also provide a service stored procedure (or scheduled event) that periodically recalculates the accumulated value.
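Such a periodic recalculation could be sketched as a MySQL scheduled event. The event name and interval here are illustrative, and the event scheduler must be enabled on the server:

```sql
-- Requires: SET GLOBAL event_scheduler = ON;
CREATE EVENT ev_recount_comments
ON SCHEDULE EVERY 1 HOUR
DO
  UPDATE users u
  SET u.number_of_comments = ( SELECT COUNT(*)
                               FROM comments c
                               WHERE c.user_id = u.user_id );
```

This periodically corrects any drift the incremental trigger accumulates, while the trigger keeps the value roughly current between runs.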

MYSQL Multi-Table Update Is Extremely Slow

I am trying to run a multi-table update in MYSQL (Amazon RDS) and it is extremely slow.
What am I trying to do?
Remove all duplicate rows based on a 1 hour time frame.
Below I created a temp table to identify the duplicate rows in the table. This query runs in 2 seconds.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED ;
CREATE TEMPORARY TABLE tmpIds (id int primary key);
INSERT into tmpIds
SELECT distinct
d.id
FROM api d INNER JOIN api orig
on d.domain_id = orig.domain_id and d.user_id = orig.user_id
WHERE
orig.created_at < d.created_at
AND d.created_at <= DATE_ADD(orig.created_at, Interval 1 hour)
AND d.type = 'api/check-end'
AND d.created_at >= '2016-08-01';
SET TRANSACTION ISOLATION LEVEL READ COMMITTED ;
The problem is the UPDATE query: it takes way too long to run on the production server. It also locks the api table.
SET @TRIGGER_DISABLED = 1;
UPDATE
api
SET
deleted_at = now()
WHERE type = 'api/check-end' AND created_at >= '2016-08-01'
AND id IN (SELECT id FROM tmpIds);
SET @TRIGGER_DISABLED = 0;
I also tried this version:
SET @TRIGGER_DISABLED = 1;
UPDATE
api a,
tmpIds ti
SET
a.deleted_at = now()
WHERE
type = 'api/check-end' AND created_at >= '2016-08-01' AND a.domain_id < 10 AND a.id = ti.id;
SET @TRIGGER_DISABLED = 0;
STATS
Temp table: 32,000 rows
api table: 250,000 rows total; 200,000 rows after the WHERE clause (type, created_at)
The api table has costly triggers, which is why I turned them off.
A sample run of 1,000 updates took 6 minutes.
There is an index on the api table primary key.
The problem was that the following statement:
SET @TRIGGER_DISABLED = 1;
was not disabling the triggers. I had to delete the UPDATE trigger on the api table, and then the UPDATE ran in 1.3 seconds.
Any help on the best way to disable triggers while running a query?
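Setting a session variable does nothing on its own; the common pattern (and presumably what @TRIGGER_DISABLED was meant for) is to make the trigger body itself check the variable. A sketch, with an illustrative trigger name and placeholder body:

```sql
DELIMITER //
CREATE TRIGGER tr_au_api
AFTER UPDATE ON api
FOR EACH ROW
BEGIN
  -- Skip the expensive work when this session has disabled triggers.
  IF @TRIGGER_DISABLED IS NULL OR @TRIGGER_DISABLED = 0 THEN
    SET @dummy = 0;  -- placeholder for the real, costly trigger logic
  END IF;
END //
DELIMITER ;

-- Then, in the session running the bulk UPDATE:
SET @TRIGGER_DISABLED = 1;
-- ... run the UPDATE ...
SET @TRIGGER_DISABLED = 0;
```

The trigger still fires for every row, but the guard makes it return almost immediately, which avoids having to drop and recreate it around bulk operations.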

How to increase query speed when using the LIKE operator

When I try to run the update query below, it takes about 40 hours to complete. So I added a time limitation (the "Update query with time limitation" below), but it still takes nearly the same time to complete. Is there any way to speed up this update?
EDIT: What I really want to do is only get logs between some specific dates and run this update query on this records.
create table user
(userid varchar(30));
create table logs
( log_time timestamp,
log_detail varchar(100),
userid varchar(30));
insert into user values ('user1');
insert into user values ('user2');
insert into user values ('user3');
insert into user values ('');
insert into logs (log_detail, userid) values ('no user mentioned', 'user3');
insert into logs (log_detail, userid) values ('inserted by user2', 'user2');
insert into logs (log_detail, userid) values ('inserted by user3', null);
Table before Update
log_time | log_detail        | userid
---------|-------------------|-------
..       | no user mentioned | user3
..       | inserted by user2 | user2
..       | inserted by user3 | (null)
Update query
update logs join user
set logs.userid=user.userid
where logs.log_detail LIKE concat("%",user.userID,"%") and user.userID != "";
Update query with time limitation
update logs join user
set logs.userid = IF (logs.log_time between '2015-08-11 00:39:41' AND '2015-08-01 17:39:44', user.userID, null)
where logs.log_detail LIKE concat("%",user.userID,"%") and user.userID != "";
Table after update
log_time | log_detail | userid |
.. |-------------------|--------|
.. | no user mentione | user3 |
.. | inserted by user2 | user2 |
.. | inserted by user3 | user3 |
EDIT: original question: Sql update statement with variable.
Log tables can easily fill up with tons of rows each month, and even the best indexing won't help, especially in the case of a leading-wildcard LIKE. Your log_detail column is 100 characters long and your search pattern is CONCAT("%",user.userID,"%"). Wrapping the search term in a function adds extra computation, and what you're actually searching for, if your userID is John, is %John%. Because of the leading %, the query has to scan every row in the table; the index is semi-useless. Without the leading %, the query could use the index efficiently. In effect, your query does an INDEX SCAN as opposed to an INDEX SEEK.
For more information on these concepts, see:
Index Seek VS Index Scan
Query tuning a LIKE operator
Alright, what can you do about this? Two strategies.
Option 1 is to limit the number of rows that you're searching
through. You had the right idea using time limitations to reduce the
number of rows to search through. Put the time limitation in your
WHERE clause alongside the LIKE: a range predicate on log_time can
use an index on that column (if one exists), so the expensive LIKE
only has to be evaluated against the rows that fall inside the time
window.
update logs join user
set logs.userid=user.userid
where logs.log_time between '2015-08-01' and '2015-08-11'
and logs.log_detail LIKE concat('%',user.userID,'%')
Option 2 depends on your control of the database. If you have total
control (and the time and money), MySQL has a feature called
Auto-Sharding, available in MySQL Cluster and MySQL Fabric. I won't
go over those products in much detail, as the links below explain
them far better than I could summarize, but the idea behind sharding
is to split the rows into horizontal tables, so to speak: instead of
searching through one long table, you search across several sister
tables at the same time. Searching through 10 tables of 10 million
rows is faster than searching through 1 table of 100 million rows.
Database Sharding - Wikipedia
MySQL Cluster
MySQL Fabric
First, the right place to put the time limitation is in the where clause, not an if:
update logs l left join
user u
on l.log_detail LIKE concat("%", u.userID)
set l.userid = u.userID
where l.log_time between '2015-08-11 00:39:41' AND '2015-08-01 17:39:44';
If you want to set the others to NULL do this before:
update logs l
set l.userid = NULL
where l.log_time not between '2015-08-11 00:39:41' AND '2015-08-01 17:39:44';
But, if you really want this to be fast, you need to use an index for the join. It is possible that this will use an index on users(userid):
update logs l left join
user u
on cast(substring_index(l.log_detail, ' ', -1) as signed) = u.userID
set l.userid = u.userID
where l.log_time between '2015-08-11 00:39:41' AND '2015-08-01 17:39:44';
Look at the explain on the equivalent select. It is really important that the cast() be to the same type as the UserId.
You could add a new column called log_detail_reverse where a trigger can be set so that when you insert a new row, you also insert the log_detail column in reverse character order using the MySQL function reverse. When you're doing your update query, you also reverse the userID search. The net effect is that you then transform your INDEX SCAN to an INDEX SEEK, which will be much faster.
update logs join user
set logs.userid=user.userid
where logs.log_time between '2015-08-01' and '2015-08-11'
and logs.log_detail_reverse LIKE concat(reverse(user.userID), '%')
MySQL Trigger
The Trigger could be something like:
DELIMITER //
CREATE TRIGGER log_details_in_reverse
BEFORE INSERT
ON logs FOR EACH ROW
BEGIN
-- MySQL does not allow an AFTER INSERT trigger to UPDATE the table it
-- fires on, so instead set the reversed value on NEW in a BEFORE INSERT
-- trigger, before the row is written.
SET NEW.log_detail_reverse = REVERSE(NEW.log_detail);
END; //
DELIMITER ;
One thing about speeding up updates is to avoid updating records that need no update. You only want to update records in a certain time range where the userid doesn't already match the user mentioned in the log text, so limit the records to be updated in your WHERE clause.
update logs
set userid = substring_index(log_detail, ' ', -1)
where log_time between '2015-08-11 00:39:41' AND '2015-08-01 17:39:44'
and not userid <=> substring_index(log_detail, ' ', -1);

InnoDB and count: Are helper tables the way to go?

Assume I've got a users table with 1M users on MySQL/InnoDB:
users
userId (Primary Key, Int)
status (Int)
more data
If I wanted an exact count of the number of users with status = 1 (denoting an activated account), what would be the way to go for big tables? I was thinking along the lines of:
usercounts
status
count
And then run a TRIGGER AFTER INSERT on users that updates the appropriate columns in usercounts.
Would this be the best way to go?
ps. An extra small question: since you also need a TRIGGER AFTER UPDATE on users for when status changes, is there a syntax available that:
Covers both the TRIGGER AFTER INSERT and TRIGGER AFTER UPDATE on status?
Increments the count by one if a count already is present, else inserts a new (status, count = 0) pair?
Would this be the best way to go?
Best (that's opinion-based) or not, it's definitely a possible way to go.
is there a syntax available that: covers both the TRIGGER AFTER INSERT and TRIGGER AFTER UPDATE on status?
No. There isn't a compound trigger syntax in MySQL. You'll have to create separate triggers.
is there a syntax available that: increments the count by one if a count already is present, else inserts a new (status, count = 0) pair?
Yes. You can use the ON DUPLICATE KEY UPDATE clause of the INSERT statement. Make sure that status is the primary key of the usercounts table.
Now, if users can be deleted (even if only for maintenance purposes), you also need to cover that with an AFTER DELETE trigger.
That being said, your triggers might look something like:
CREATE TRIGGER tg_ai_users
AFTER INSERT ON users
FOR EACH ROW
INSERT INTO usercounts (status, cnt)
VALUES (NEW.status, 1)
ON DUPLICATE KEY UPDATE cnt = cnt + 1;
CREATE TRIGGER tg_ad_users
AFTER DELETE ON users
FOR EACH ROW
UPDATE usercounts
SET cnt = cnt - 1
WHERE status = OLD.status;
DELIMITER $$
CREATE TRIGGER tg_au_users
AFTER UPDATE ON users
FOR EACH ROW
BEGIN
IF NOT NEW.status <=> OLD.status THEN -- proceed ONLY if status has been changed
UPDATE usercounts
SET cnt = cnt - 1
WHERE status = OLD.status;
INSERT INTO usercounts (status, cnt) VALUES (NEW.status, 1)
ON DUPLICATE KEY UPDATE cnt = cnt + 1;
END IF;
END$$
DELIMITER ;
To initially populate the usercounts table, use:
INSERT INTO usercounts (status, cnt)
SELECT status, COUNT(*)
FROM users
GROUP BY status
Here is an SQLFiddle demo.
I think there are simpler options available to you.
Just add an index to the field you'd like to count on.
ALTER TABLE users ADD KEY (status);
Now a select should be very fast.
SELECT COUNT(*) FROM users WHERE status = 1

Insert into a table, if there is no such rows

I have a sessions table with 'userid' and 'expired' columns. I want to check whether any non-expired row with the given user ID exists, and insert a new row if none is found.
I've read about INSERT IGNORE, but (userid + expired) cannot be a key, since it's possible to have multiple rows for the same user (all expired); I just cannot have more than one non-expired row per user.
I tried this, but to no avail ('You have an error...'):
IF (SELECT 1 FROM sessions WHERE user = :user AND expired = 0) <> 1
INSERT INTO sessions (user) VALUES(:user)
('expired' is 0 by default). MySQL version is 5.
UPDATE.
I've tried this as well:
IF NOT EXISTS(SELECT 1 FROM sessions WHERE user = 0 AND expired = 0)
INSERT INTO sessions (user) VALUES(0)
using HeidiSQL 7. It doesn't work either. MySQL version is 5.5.
UPDATE2. Logical errors in my statement fixed.
Use an IF EXISTS clause (in this case, NOT EXISTS):
IF NOT EXISTS(SELECT 1 FROM sessions WHERE user = 0 AND expired = 0)
INSERT INTO sessions (`user`) VALUES(:user)
edit (selecting from dual rather than from sessions, so exactly one candidate row is produced even when the table is empty, and no duplicates when it isn't):
INSERT INTO sessions (`user`)
SELECT user FROM
(SELECT :user AS user FROM dual
 WHERE NOT EXISTS (SELECT 1 FROM sessions WHERE user = :user AND expired = 0)) T1
then put 0 or the value you want hardcoded in :user
INSERT INTO sessions (`user`)
SELECT user FROM
(SELECT 0 AS user FROM dual
 WHERE NOT EXISTS (SELECT 1 FROM sessions WHERE user = 0 AND expired = 0)) T1
Elvieejo's solution seems good, except it looks like you'd have to use NOT EXISTS, going by the problem stated.
IF NOT EXISTS(SELECT 1 FROM sessions WHERE user = 0 AND expired = 0)
INSERT INTO sessions (user) VALUES(:user)
INSERT IGNORE INTO sessions (user) VALUES(:user)
The problem with my statement is that IF EXISTS is only allowed inside a stored procedure. Nevertheless, the function IF() works:
SELECT IF(EXISTS(SELECT 1 FROM sessions WHERE user = 0 AND expired = 0), 0, 1)
Now I'm looking for a way to combine a plain value like 0 with an INSERT INTO statement. This doesn't work:
SELECT IF(EXISTS(SELECT 1 FROM sessions WHERE user = 0 AND expired = 0), 0,
INSERT INTO sessions (user) VALUES (0))
If it were C++, I'd say 'type mismatch between ?: operands' and use the comma operator. In SQL it doesn't work.
UPDATE
Well, I've just created a stored routine, simply because I cannot use IF otherwise. What a stupid RDBMS this MySQL is!
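For reference, the stored-routine version could be sketched like this; the procedure and parameter names are illustrative, not the poster's actual routine:

```sql
DELIMITER //
CREATE PROCEDURE ensure_session(IN p_user INT)
BEGIN
  -- Insert a new session only when the user has no active (non-expired) one.
  IF NOT EXISTS (SELECT 1 FROM sessions WHERE user = p_user AND expired = 0) THEN
    INSERT INTO sessions (user) VALUES (p_user);
  END IF;
END //
DELIMITER ;

CALL ensure_session(0);
```

Note that this check-then-insert is not atomic under concurrency; two simultaneous calls could both pass the EXISTS check, so a transaction or locking would be needed for strict guarantees.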