Recursive MySQL trigger which calls the same table and the same trigger

I'm writing a simple forum for a PHP site. I'm trying to calculate the post counts for each category. A category can belong to another category, with root categories defined by a NULL parent_category_id. This architecture lets a category have an unlimited number of sub-categories while keeping the table structure fairly simple.
To keep things simple, let's say the categories table has three fields: category_id, parent_category_id, and post_count. I don't think the rest of the database structure is relevant, so I'll leave it out for now.
Another trigger updates the categories table, causing this trigger to run. What I want is for it to update the post count and then recursively walk up through each parent category, increasing its post count as well.
DELIMITER $$
CREATE TRIGGER trg_update_category_category_post_count BEFORE UPDATE ON categories FOR EACH ROW
BEGIN
    IF OLD.post_count != NEW.post_count THEN
        IF OLD.post_count < NEW.post_count THEN
            UPDATE categories SET post_count = post_count + 1 WHERE categories.category_id = NEW.parent_category_id;
        ELSEIF OLD.post_count > NEW.post_count THEN
            UPDATE categories SET post_count = post_count - 1 WHERE categories.category_id = NEW.parent_category_id;
        END IF;
    END IF;
END $$
DELIMITER ;
The error I'm getting is:
#1442 - Can't update table 'categories' in stored function/trigger because it is already used by statement which invoked this stored function/trigger.
I figure you could do a COUNT() on each page load to calculate the total posts, but on large forums this will slow things down, as discussed many times on here (e.g. Count posts with php or store in database). Therefore, for future-proofing, I'm storing the post count in the table. To go one step further, I thought I'd use triggers to update these counts rather than PHP.
I understand MySQL restricts a trigger from updating the same table the triggering statement is already using, which is what causes this error (i.e. to stop an infinite loop), but in this case surely the loop would stop once it reaches a category with a NULL parent_category_id? There must be some kind of solution, whether it's adjusting this trigger or something different entirely. Thanks.
EDIT: I appreciate this might not be the best way of doing things, but it's the best I can think of. I suppose if you moved a category to a different parent it would throw the counts off, but that could be fixed by another trigger which re-syncs everything. I'm open to other suggestions on how to solve this problem.

I usually recommend against using triggers unless you really, really need to; recursive triggers are a great way of introducing bugs that are really hard to reproduce, and require developers to understand the side effects of an apparently simple action - "all I did was insert a record into the categories table, and now the whole database has locked up". I've seen this happen several times - nobody did anything wrong or stupid, it's just a risk you run with side effects.
So, I would only resort to triggers once you can prove you need to; rather than relying on the opinion of strangers based on generalities, I'd rig up a test environment, drop in a few million test records, and try to optimize the "calculate posts on page load" solution so it works.
A database design that might help with that is Joe Celko's "nested set" schema - this takes a while to get your head round, but can be very fast for querying.
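To give a flavour of why it can be fast - a sketch, assuming nested-set lft/rgt columns on categories and a category_id column on posts (these names are illustrative, not from the question):
-- Count posts in a category and all of its descendants in one query.
-- In a nested set, a descendant's lft/rgt interval lies inside its
-- ancestor's interval, so no recursion is needed; ? is the category
-- you're asking about.
SELECT COUNT(p.post_id) AS post_count
FROM categories AS parent
JOIN categories AS child
    ON child.lft BETWEEN parent.lft AND parent.rgt
JOIN posts AS p
    ON p.category_id = child.category_id
WHERE parent.category_id = ?;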
Only once you know you have a problem that you really can't solve other than by pre-computing the post count would I consider a trigger-based approach. I'd separate out the "post counts" into a separate table; that keeps your design a little cleaner, and should get round the recursive trigger issue.
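As a rough sketch of that separation (the table and trigger names here are made up, and it assumes a posts table with a category_id column) - MySQL's 1442 error only applies to the table the triggering statement modifies, so a trigger on posts is free to update a different counts table:
CREATE TABLE category_post_counts (
    category_id INT PRIMARY KEY,
    post_count INT NOT NULL DEFAULT 0
);
DELIMITER $$
CREATE TRIGGER trg_count_post_insert AFTER INSERT ON posts FOR EACH ROW
BEGIN
    -- Safe: the INSERT that fired this trigger only touches posts
    INSERT INTO category_post_counts (category_id, post_count)
    VALUES (NEW.category_id, 1)
    ON DUPLICATE KEY UPDATE post_count = post_count + 1;
END $$
DELIMITER ;
Note that rolling counts up to ancestor categories would still need a stored-procedure loop or a periodic recalculation; a trigger on category_post_counts that updated other rows of category_post_counts would hit the same 1442 restriction.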

The easiest solution is to fetch the post count for every category and afterwards link them together in a script/programming language.
For instance, in PHP:
<?php
// category: id, parent, name
// posts: id, title, message
// The COUNT() column must be aliased so each row exposes a 'count' key.
$sql = "SELECT category.*, COUNT(posts.id) AS count
        FROM category
        LEFT JOIN posts ON posts.cat = category.id
        GROUP BY category.id";
$query = mysql_query($sql); // legacy mysql_* API as in the original; use PDO/mysqli on modern PHP
$result = array();
while($row = mysql_fetch_assoc($query)){
    // Group each category under its parent id; NULL parents go under root (0)
    $parent = $row['parent'] == null ? 0 : $row['parent'];
    $result[$parent][] = $row;
}
recur_count(0);
var_dump($result);

// Roll every child's post count up into its parent; returns the subtree total
function recur_count($parent){
    global $result;
    $total = 0;
    foreach($result[$parent] as $id => $o){
        if(isset($result[$o['id']])){
            $result[$parent][$id]['count'] += recur_count($o['id']);
        }
        $total += $result[$parent][$id]['count'];
    }
    return $total;
}

OK, so for anyone wondering how I solved this: I used a mixture of both triggers and PHP.
Instead of getting each category to update its parent, I've settled on the following structure: a post updates its thread, and then the thread updates its category with the post count.
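In trigger form that chain might look something like this (a sketch with simplified table and column names, not my exact schema). Since each trigger updates a different table than the one that fired it, the #1442 error never comes up:
DELIMITER $$
CREATE TRIGGER trg_post_updates_thread AFTER INSERT ON posts FOR EACH ROW
BEGIN
    UPDATE threads SET post_count = post_count + 1
    WHERE thread_id = NEW.thread_id;
END $$
CREATE TRIGGER trg_thread_updates_category AFTER UPDATE ON threads FOR EACH ROW
BEGIN
    -- Push the post count delta up to the owning category
    IF NEW.post_count != OLD.post_count THEN
        UPDATE categories
        SET post_count = post_count + (NEW.post_count - OLD.post_count)
        WHERE category_id = NEW.category_id;
    END IF;
END $$
DELIMITER ;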
I've then used PHP to pull all categories from the database and loop through adding up each post count value using something like this:
function recursiveCategoryCount($categories)
{
    $count = $categories['category']->post_count;
    if(!is_null($categories['children']))
        foreach($categories['children'] as $child)
            $count += recursiveCategoryCount($child);
    return $count;
}
At worst, instead of PHP adding up every post on every page load, it only adds up the per-category post counts (depending on which node of the tree you're in). This should be very efficient, as you're reducing the total calculations from thousands to tens or hundreds, depending on your number of categories. I would also recommend running a script every week to recalculate the post counts in case they become out of sync, much like phpBB does. If I run into issues using triggers then I'll move that functionality into the code. Thanks for everyone's suggestions.
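The weekly resync itself can be a couple of multi-table UPDATEs rather than a PHP loop (again a sketch, using the same simplified posts/threads names as the triggers above):
-- Recompute every thread's post_count from its posts,
-- then every category's post_count from its threads.
UPDATE threads t
JOIN (SELECT thread_id, COUNT(*) AS c FROM posts GROUP BY thread_id) p
    ON p.thread_id = t.thread_id
SET t.post_count = p.c;
UPDATE categories c
JOIN (SELECT category_id, SUM(post_count) AS s FROM threads GROUP BY category_id) t
    ON t.category_id = c.category_id
SET c.post_count = t.s;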

Related

How to update the balance with a random value (MySQL) when a user registers on the website?

For one of my courses, I'm trying to create a banking-system website using mysqldb, and to write code that makes it possible to update a user's balance with a random value during registration, so the balance will not depend on the information the user enters when registering. I want the value to be inserted into the right spot in the table, only if that spot is NULL.
I used the code below:
$cursor = $MySQLdb->prepare("UPDATE users SET Balance=(Select FLOOR(0+ RAND() * 10000)) WHERE Balance=null AND userID=<userID>;")
I hope I was understood.
Thanks in advance
First of all, I would never put a calculation in a query string ;)
Also, don't overcomplicate the rand() function. Take a look at the docs: rand() function docs
And lastly, maybe think about whether it's a good idea to rely on NULL at all. Maybe you could use a sentinel value like 1 instead (only if it's not possible for someone to actually have a balance of 1!).
Do something like:
$balance_variable = rand(5000, 10000);
$userID_variable = /*specify it somehow*/;
$cursor = $MySQLdb->prepare("UPDATE users SET Balance=? WHERE Balance IS NULL AND userID=?");
$cursor->bind_param('ii', $balance_variable, $userID_variable); // both parameters are integers
$cursor->execute();

Duplicate row and everything related to that ID in different tables

I'm working on a software for managing the Rally company of my boss, where he can manage all the volunteers, their affectations, and many other things.
But the volunteers and these others things vary depending on the event, so they all have a column representing the event that they are linked/related to.
My boss requested that I add a "duplicate" button, that would duplicate an event (from my events table) and also duplicate all the volunteers, and values from any other table that is linked to that event, so the new duplications are linked to the new event.
The reason for this, is that he is constantly organizing rallies, and often it happens that the data from a Rally (event) to another is almost the same, so instead of adding it all manually, he'd rather make an entire duplication of the event and all the data related to it, then manually add and remove the errors in it.
I would like to know: is there any way in MySQL that I could duplicate an event, and everything that is linked to its ID, even though they are in different tables, and make the duplications carry the ID of the new event?
Sadly I don't have very much time, but until I get an answer I can work on other tasks my boss gave me.
Thank you so much to anybody who helps me or gives me any hint!!
EDIT:
Here's the schema of my database (I know it's kinda dirty and there are issues with it; my boss gave me indications on how to create the database, since he used to work in the domain before, but he didn't tell me how to make the links because he wants to make them himself).
And I apologize for the French language and the weird names...
Basically I wish to duplicate an entry in the "event" table, all the "affect" and "lieu" entries that are linked to it, and all the "tache" entries related to the duplicated "lieu" entries.
EDIT 2:
Thank you MrMadsen for the Query!
I had to fix it a bit, but here's what it looks like after.
SET @NEWEVEN = (SELECT MAX(NO_EVE) FROM db_rallye.event) + 1;
INSERT INTO db_rallye.event
SELECT @NEWEVEN,
       NM_EVE,
       AN_EVE,
       DT_EVE,
       NM_REG_EVE
FROM event
WHERE NO_EVE = event_to_duplicate;

SET @NO_AFFECT_LIEU = (SELECT MAX(NO_AFF) FROM db_rallye.affect);
INSERT INTO db_rallye.affect
SELECT @NO_AFFECT_LIEU := @NO_AFFECT_LIEU + 1,
       CO_AFF,
       DT_AVI_AFF,
       DS_STA_AFF,
       CO_STA_AFF,
       NO_BRA_EVN,
       NO_PERS,
       @NEWEVEN,
       NO_EQU,
       NO_LIE,
       NO_PERS_RES,
       IN_LUN,
       IN_BAN,
       DS_HEB,
       IN_HEB_JEU,
       IN_HEB_VEN,
       IN_HEB_SAM,
       NO_BOR,
       NO_TUL_CAH,
       NB_SPEC
FROM affect
WHERE NO_EVEN = event_to_duplicate;

INSERT INTO db_rallye.lieu
SELECT NO_LIE,
       @NEWEVEN,
       CO_LAT,
       CO_LON,
       FI_IMA,
       FI_CRO,
       FI_TUL,
       DS_LIE,
       DS_COU,
       DS_LON,
       NB_BLK,
       VL_KM,
       IN_FUS,
       VL_DIS_FUS,
       NO_LIE_FUS_SUI
FROM lieu
WHERE NO_EVEN = event_to_duplicate;

SET @NO_AFFECT_TACHE = (SELECT MAX(NO_TAC) FROM db_rallye.tache);
INSERT INTO db_rallye.tache
SELECT @NO_AFFECT_TACHE := @NO_AFFECT_TACHE + 1,
       NO_LIE,
       @NEWEVEN,
       NO_AFF,
       DS_REP,
       DS_TAC
FROM tache
WHERE NO_LIE IN
      (SELECT NO_LIE FROM lieu WHERE NO_EVEN = event_to_duplicate);
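One caveat with the MAX(...)+1 pattern: two duplications running at the same moment could grab the same ID. If NO_EVE is AUTO_INCREMENT, a safer variant (sketched here with a shortened column list) is to let MySQL assign the ID and read it back:
INSERT INTO db_rallye.event (NM_EVE, AN_EVE, DT_EVE, NM_REG_EVE)
SELECT NM_EVE, AN_EVE, DT_EVE, NM_REG_EVE
FROM event
WHERE NO_EVE = event_to_duplicate;
-- LAST_INSERT_ID() is per-connection, so concurrent inserts can't collide
SET @NEWEVEN = LAST_INSERT_ID();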
If a lot of the events contain similar information, then I would create a template (or templates, and the ability to create/modify templates) containing all of the information that you would otherwise duplicate.
Then when a new event is created he can just choose a starting template and then only add whatever data is unique to that event. In my opinion this would be much better than constantly duplicating the data.
As far as how to duplicate a row and all the associated rows, this is completely dependent on your database schema and how the tables relate to one another. If you post the relevant part of that we can help you more.
Edit
Here are the queries I came up with; test them in a dev database first, but I think they will work. Let me know. Good luck!
INSERT INTO `event`
SELECT NULL NO_EVE,
NM_EVE,
AN_EVE,
DT_EVE,
NM_REG_EVE
FROM `event`
WHERE NO_EVE = id_of_event_to_duplicate
INSERT INTO `affect`
SELECT NULL NO_AFF,
CO_AFF,
DT_AVI_AFF,
DS_STA_AFF,
CO_STA_AFF,
NO_BRA_EVN,
NO_PERS,
NO_EVEN,
NO_EQU,
NO_LIE,
NO_PERS_RES,
IN_LUN,
IN_BAN,
DS_HEB_JEU,
IN_HEB_VEN,
IN_HEB_SAM,
NO_BOR,
NO_TUL_CAH,
NB_SPEC
FROM `affect`
WHERE NO_EVE = id_of_event_to_duplicate
INSERT INTO `lieu`
SELECT NULL NO_LIE,
NO_EVEN,
CO_LAT,
CO_LON,
FI_IMA,
FI_CRO,
FI_TUL,
DS_LIE,
DS_COU,
DS_LON,
DB_BLK,
VL_KM,
IN_FUS,
VL_DIS_FUS,
NO_LIE_FUS_SUI
FROM `lieu`
WHERE NO_EVEN = id_of_event_to_duplicate
INSERT INTO `tache`
SELECT NULL NO_TAC,
NO_LIE,
NO_EVEN,
NO_AFF,
DS_REP,
DS_TAC
FROM `tache`
WHERE NO_LIE IN
(SELECT NO_LIE FROM `lieu` WHERE NO_EVEN = id_of_event_to_duplicate)

Query ActiveRecord for records and relation calculations at once

TL;DR? See Edit 2
I've got a little Rails application that has a few different sorts of games people can play: it's based around sports, so they can pick the winners of each game every week (model PickEm, attribute correct, a boolean that is nil for unfinished games), and predict the outcome of a specific team's game (model Guess, attribute score, an integer that is nil for unfinished games). Every User has_many PickEms and Guesses. And I'm trying to display standings (correct/total, where total counts all non-nil records, and score/total possible).
What I'm finding is that I can gather the users and their associated records, but in trying to display standings I'm discovering that every single User triggers another query - slow and not sustainable as the user base increases. That's because @user.pick_em_score is pick_ems.where(correct: true).size and @user.guess_score is guesses.where.not(score: nil).sum(:score). So I call user.pick_em_score and it runs that query. I feel like there should be a way to get every User, as well as these specific counts, at once, rather than buffering a whole bunch of needless extra stuff.
What I need:
User record
User.pick_em_score (calculated by counting correct records)
User.pick_ems count where NOT NULL
User.guesses_score (calculated by guesses.sum(:score))
User.guesses count where NOT NULL
Most of the stuff I find on Rails's ActiveRecord helpers, especially related to calculations, is for retrieving only the calculation. It looks like I'll probably need to delve directly into select() etc. But I can't get it working. Can someone point me in the right direction?
Edit
For clarification: I'm aware that I can write this information to the User model, but this is overly restrictive: next season, I'll need to add a new column to the User for that year's results, etc. In addition, this is a third degree of callback updating related models – the Match model already updates related PickEms and Guesses on save. I'm looking for the simplest ActiveRecord query or queries to be able to work with this information, as indicated by the title. Ideally one query that returns the above information, but if it needs to a few, that's OK.
I used to work directly in MySQL with PHP, but those skills have rusted (in raw MySQL, I imagine, I'd have several sub-select statements to help pull these counts) and I'd also like to be able to use Rails's ActiveRecord helpers and such, and avoid constructing raw SQL as much as possible.
Second Edit:
I seem to have it down to one call that starts to work, but I'm writing a lot of SQL. It's also brittle, IMO, and trying to run with it has failed. It also looks like I'm just pushing the million singular SELECT queries from Rails right into SQL, but that may still be a step up.
User.unscoped.select('users.*',
'(SELECT COUNT(*) FROM pick_ems WHERE pick_ems.user_id = users.id AND pick_ems.correct) AS correct_pick_ems',
'(SELECT COUNT(*) FROM pick_ems WHERE pick_ems.user_id = users.id AND pick_ems.correct IS NOT NULL) AS total_pick_ems',
'(SELECT SUM(guesses.score) FROM guesses WHERE guesses.user_id = users.id AND guesses.score IS NOT NULL) AS guesses_score',
'(SELECT COUNT(*) FROM guesses WHERE guesses.user_id = users.id AND guesses.score IS NOT NULL) AS guesses_count' )
The issue seems to be: is there a way to use Rails, and not raw SQL, to link up users.id that we see there with these subqueries? Or just … a better way to construct this, in general?
In addition, I'm running another set of SELECTs for the WHERE, which would hinge on total_pick_ems and guesses_count being > 0; but since I can't use those aliased columns there, I have to call the SELECT one more time.
Welcome to AR. It's really only good for simple CRUD-like queries. Once you actually want to query your data in anger, it just doesn't have the capabilities to do the queries you want without resorting to wholesale SQL strings, often abandoning the ability to chain as a result.
That's precisely why I moved to Sequel: it has the features to compose queries using a much fuller SQL feature set, including join conditions, window functions, recursive common table expressions, and advanced eager loading. The author is incredibly responsive, and the documentation is excellent compared to AR and Arel.
I don't expect you will like this answer, but a time will come when you start to look outside the opinionated components that come with Rails, which I have to say are hardly best of breed. Sequel also sped my application up many times over what I was able to get with AR; it's not just developer happiness, it means fewer servers to run. Yes, it will be a learning curve, but IMO it's better to learn tools that have your back covered.
Joins might work. Something like the below:
User.unscoped.joins(:guesses).joins(:pick_ems).
  where("guesses.score IS NOT NULL").
  select("users.*,
          SUM(guesses.score) AS guesses_score,
          COUNT(guesses.id) AS guesses_count,
          COUNT(CASE WHEN pick_ems.correct = TRUE THEN 1 ELSE NULL END)
            AS correct_pick_ems,
          COUNT(CASE WHEN pick_ems.correct IS NOT NULL THEN 1 ELSE NULL END)
            AS total_pick_ems").
  group("users.id")
If you need this information for a limited number of users at a time, then the above query, or eager loading (User.includes(:guesses, :pick_ems)) with instance methods like
def correct_pick_ems
  pick_ems.count(&:correct)
end
would work.
However If you need this information for all the users most of the time, cached counters within the users table would be more optimal.
What you need is some sort of custom (smart) counter cache that only counts under certain conditions (e.g. correct is true).
You can achieve this by using conditional after_save and after_destroy callbacks to build your own custom counter cache, which looks like this:
class PickEm < ActiveRecord::Base
  belongs_to :user

  after_save    :increment_finished_counter_cache, if: Proc.new { |pick_em| pick_em.correct }
  after_destroy :decrement_finished_counter_cache, if: Proc.new { |pick_em| pick_em.correct }

  private

  def increment_finished_counter_cache
    # update_column does not trigger any validations or callbacks
    user.update_column(:finished_games_counter, user.finished_games_counter + 1)
  end

  def decrement_finished_counter_cache
    # update_column does not trigger any validations or callbacks
    user.update_column(:finished_games_counter, user.finished_games_counter - 1)
  end
end
Notes:
Code not tested (only to show the idea)
Some people say it's better to avoid naming custom counters the way Rails names its own (foo_counter_cache)
You should benchmark it, but my hunch is that adding all of that data into a single SELECT isn't going to be much faster than breaking it up into separate SELECTs (I've actually had cases where the latter was faster). By breaking it up, you can also stick to more ActiveRecord and less raw SQL, e.g.:
user_ids_to_pick_em_score = User.joins(:pick_ems).where(pick_ems: {correct: true}).group(:user_id).count
user_ids_to_pick_ems_count = User.joins(:pick_ems).where.not(pick_ems: {correct: nil}).group(:user_id).count
user_ids_to_guesses_score = Hash[User.select("users.id, SUM(guesses.score) AS total_score").joins(:guesses).group(:user_id).map{|u| [u.id, u.total_score]}]
user_ids_to_guesses_count = User.joins(:guesses).where.not(guesses: {score: nil}).group(:user_id).count
Edit: To display them, you could do it like so:
<%- User.select(:id, :name).find_each do |u| -%>
Name: <%= u.name %>
Picks Correct: <%= user_ids_to_pick_em_score[u.id] %>/<%= user_ids_to_pick_ems_count[u.id] %>
Total Score: <%= user_ids_to_guesses_score[u.id] %>/<%= user_ids_to_guesses_count[u.id] %>
<%- end -%>

MySQL: How to eliminate the delay between SELECT and UPDATE

I have a simple set up for assigning opponents in a game.
Basically, if the matchID is zero (the value comes from elsewhere), a new match needs to be created, and it will perform a MySQL SELECT on the last matchID record to ascertain whether there is someone waiting for a match or not.
To see if a player is waiting, we can check whether the teamB space is zero (not taken). If however teamB has a value, then no one is waiting and a new match must be created with this player as 'team A'.
The code is as follows:
if ($matchID == 0)
{
    $teamBquery = $conn->query("SELECT matchID, teamBID, teamAID FROM challengeMatches ORDER BY matchID DESC LIMIT 1");
    $teamBarray = $teamBquery->fetch(PDO::FETCH_ASSOC);
    $teamBID = $teamBarray['teamBID']; // array keys must be quoted strings
    $matchID = $teamBarray['matchID'];
    $teamAID = $teamBarray['teamAID'];
    if ($teamBID == 0){
        $newChallenge = $conn->query("UPDATE challengeMatches SET managerBID='$managerID', teamBID='$teamID', matchStatus=1 WHERE matchID='$matchID'");
    }else{
        $filler = 0;
        $matchID = $matchID + 1;
        $newChallenge = $conn->query("INSERT INTO challengeMatches (matchID,managerAID,managerBID,matchStatus,teamAID,teamBID) VALUES ('','$managerID','$filler','$filler','$teamID','$filler')");
    }
}
My concern, as someone pretty inexperienced, is that there will be a delay between selecting the info and updating it, so technically two concurrent SELECTs might return the same matchID. And then even using matchID+1 as a variable is risky, because it could fall out of sync with the auto-increment matchID created in the database.
Are my fears founded, or is the code so fast that the probability is not worth worrying about?
If I should be worried, what can I do?
You firstly need to identify whether you really need a solution to this at all; the time gap should be so small that, unless you run a massive site, the likelihood of two requests selecting the same matchID is remote.
However, there are really three areas you can look at to improve this:
Locking - look at SELECT ... FOR UPDATE or SELECT ... LOCK IN SHARE MODE (see the sketch after this list)
Transactions
Refactor the database to actually update with a limit first and then select the updated range
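A minimal sketch of the locking option, assuming InnoDB (table and column names taken from your code):
-- FOR UPDATE locks the selected row until COMMIT, so a concurrent
-- request blocks instead of reading the same match in the gap
-- between your SELECT and your UPDATE/INSERT.
START TRANSACTION;
SELECT matchID, teamBID, teamAID
FROM challengeMatches
ORDER BY matchID DESC
LIMIT 1
FOR UPDATE;
-- in application code: UPDATE the row if teamBID = 0,
-- otherwise INSERT the new match
COMMIT;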

Codeigniter database call efficiency

This is an efficiency/best practice question. Hoping to receive some feedback on performance. Any advice is greatly appreciated.
So here is a little background on what I have set up. I'm using CodeIgniter; the basic setup is pretty similar to any other product-relationship scheme. The basic tables are: brands, products, categories. On top of these tables there is a need for install sheets, marketing materials, and colors.
I created some relationship tables:
Brands_Products
Products_Colors
Products_Images
Products_Sheets
I also have a Categories_Relationships table that holds all of the relationships to categories. Install sheets etc. can have their own categories, but I didn't want to define a different category relationship table for each type because I didn't think that would be very expandable.
On the front end I am sorting by brands, and categories.
I think that covers the background; now to the efficiency part. I guess my question pertains mostly to whether it would be better to use joins or to make separate calls to return the individual parts of each item (colors, images, etc.).
What I currently have coded is working and sorting fine, but I think I can improve the performance, as it takes some time to return the query. Right now it's returning about 45 items. Here is my first function; it grabs all the products and their info.
It works by first selecting all the products and joining their brand information. Then, looping through the result, I set up the basic information, but for the categories, images and installs I am using functions that return each of the respective items.
public function all()
{
    $q = $this->db
        ->select('*')
        ->from('Products')
        ->join('Brands_Products', 'Brands_Products.product_id = Products.id')
        ->join('Brands', 'Brands.id = Brands_Products.brand_id')
        ->get();

    foreach($q->result() as $row)
    {
        // Set Regular Data
        $data['Id'] = $row->product_id;
        $data['Name'] = $row->product_name;
        $data['Description'] = $row->description;
        $data['Brand'] = $row->brand_name;
        $data['Category'] = $this->categories($row->product_id);
        $data['Product_Images'] = $this->product_images($row->product_id);
        $data['Product_Installs'] = $this->product_installs($row->product_id);
        $data['Slug'] = $row->slug;

        // Set new item in return object with created data
        $r[] = (object)$data;
    }
    return $r;
}
Here is an example of one of the functions used to get the individual parts.
private function product_installs($id)
{
    // Select Install Images
    $install_images = $this->db
        ->select('*')
        ->where('product_id', $id)
        ->from('Products_Installs')
        ->join('Files', 'Files.id = Products_Installs.file_id')
        ->get();

    // Add categories to category object
    foreach($install_images->result() as $pImage)
    {
        $data[] = array(
            'id'    => $pImage->file_id,
            'src'   => $pImage->src,
            'title' => $pImage->title,
            'alt'   => $pImage->alt
        );
    }

    // Make sure data exists
    if(!isset($data))
    {
        $data = array();
    }
    return $data;
}
So again, I'm really just looking for advice on the most efficient, best-practice way of doing this. I really appreciate any advice or information.
I think your approach is correct. There are only a couple of options: 1) load your product list first, then loop and load the required data for each product row; or 2) create a big join on all the tables first, then loop through the (possibly massive) Cartesian product. The second might get rather ugly to parse. For example, if you have Product A and Product B, and Product A has Install 1, Install 2, and Install 3, while Product B has Install 1 and Install 2, then your result is:
Product A Install 1
Product A Install 2
Product A Install 3
Product B Install 1
Product B Install 2
Now, add your images and categories to the join and it might become huge.
I am not sure what the sizes of your tables are, but returning 45 rows shouldn't take long. The obvious thing to ensure (and you probably did that already) is that product_id is indexed in all tables, including your Brands_Products table and the others. Otherwise, you'll do a table scan.
The next question is how you're displaying your data on the screen. So you're getting all products. Do you need to load categories, images, and installs when you're getting the list of products? If you're simply listing products on the screen, you might want to wait to load that data until the user picks a product they are viewing.
On a side note, is there any reason you're converting your array to an object?
$r[] = (object)$data;
Also, in the second function, you can simply add
$data = array();
before the foreach, instead of
// Make sure data exists
if(!isset($data))
{
$data = array();
}
You can try this:
Query all of the products
Get all of the product IDs from step 1
Query all of the install images that have a product ID from step 2, sorted by product ID
Iterate through the products from step 1, and add the results from step 3
That takes you from 46 queries (for 45 products) to 2 queries, without any additional joins.
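In SQL terms, step 3 turns 45 one-off lookups into a single IN query - a sketch, with 1, 2, 3 standing in for whatever product IDs step 2 collected:
-- All install sheets for every product in one round trip,
-- ordered so they're easy to regroup by product in PHP.
SELECT Products_Installs.product_id, Files.*
FROM Products_Installs
JOIN Files ON Files.id = Products_Installs.file_id
WHERE Products_Installs.product_id IN (1, 2, 3)
ORDER BY Products_Installs.product_id;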
You can also use CodeIgniter's Query Caching to increase performance even further, if it's worth the time to write the code to reset the cache when data is updated.
Doing more work in PHP is generally better than doing the work in MySQL in terms of scalability. You can scale PHP easily with load balancers and web servers. Scaling MySQL isn't as easy due to concurrency issues.