Hello guys.
I have an issue with a fairly simple query.
Here is the code:
UPDATE user_resources AS ures
LEFT JOIN user_buildings AS ub
    ON ub.city_id = ures.city_id
INNER JOIN building_consumption AS bcons
    ON bcons.resource_id = ures.resource_id
SET ures.quantity = ures.quantity - ABS(FORMULA HERE that requires
    building level and consumption at lvl 1 [default])
WHERE
    (SELECT COUNT(id) FROM building_consumption AS bc2
     WHERE bc2.building_id = ub.building_id) =
    (SELECT COUNT(bc3.id) FROM building_consumption AS bc3
     LEFT JOIN tmp_user_resources AS tures
         ON tures.resource_id = bc3.resource_id
     WHERE tures.city_id = ub.city_id
       AND bc3.building_id = ub.building_id
       AND bc3.quantity > 0
       AND IFNULL(tures.quantity, 0) - ABS(FORMULA AGAIN) >= 0);
I'll try to explain a bit.
As you can imagine, this is for a game.
Users (players) can have different buildings in different cities.
tab user_buildings
|id, city_id, building_id, level, usage|
A building can produce different resources
tab building_production
|id, building_id, resource_id, quantity_h|
but it can consume some resources too:
tab building_consumption
|id, building_id, resource_id, quantity_h|
Obviously a building cannot produce if there are not enough resources to consume for its job.
That's why, in the WHERE clause, I'm comparing two SELECT COUNTs: how many resources the building has to consume versus how many resources it can actually consume.
MySQL does not allow a subquery that reads the table being updated inside an UPDATE statement.
Using a cursor + loop is far too slow; I would prefer a different solution.
A temp table could be a solution, but my problem now is how to update the temp table without firing triggers. (UPDATE + SELECT fires the triggers, and to avoid endless loops MySQL blocks the query; I can't pause/resume the triggers because
IF ((@TRIGGER_CHECKS = FALSE)
    OR (@TRIGGER_BEFORE_INSERT_CHECKS = FALSE))
    AND (USER() = 'root@localhost')
THEN
    LEAVE thisTrigger;
END IF;
is inside the trigger itself).
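For reference, the usual workaround I've seen for the "can't specify target table for update" restriction is to wrap the offending subquery in a derived table, which MySQL materializes before the outer UPDATE touches the base table. A minimal sketch of the pattern, with placeholder conditions instead of the real formula and assuming user_resources has an id primary key:

UPDATE user_resources AS ures
JOIN (
    -- reading user_resources here is allowed: this derived table is
    -- materialized before the outer UPDATE runs
    SELECT ur.id
    FROM user_resources AS ur
    JOIN building_consumption AS bc ON bc.resource_id = ur.resource_id
    WHERE ur.quantity > 0               -- placeholder eligibility check
) AS eligible ON eligible.id = ures.id
SET ures.quantity = ures.quantity - 1;  -- placeholder for the real formula

The same trick would apply to the COUNT comparison: compute both counts inside the derived table and join the result back to user_resources.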
I am open to all your suggestions!
Thanks
P.S. The code must be inside a scheduled event.
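For reference, a scheduled-event wrapper would look roughly like this (event name, interval, and procedure name are placeholders, not from the original code):

CREATE EVENT IF NOT EXISTS ev_consume_resources  -- placeholder name
ON SCHEDULE EVERY 1 HOUR                         -- placeholder interval
DO
    CALL consume_resources();                    -- placeholder procedure wrapping the UPDATE above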
I'm trying to update each geography's total revenue based on the offices located in it. The geographies are defined by circles and polygons, both stored in the shapes.shape column.
When I run the query below, MySQL throws "ER_INVALID_GROUP_FUNC_USE: Invalid use of group function".
I tried to adapt this answer, but I can't figure out the logic with the conditional join and geospatial data -- it's not as simple as adding a subquery with a WHERE clause. (Or is it?)
For context, I have about 350 geographies and 150,000 offices.
UPDATE
shapes s
LEFT JOIN offices o ON (
CASE
WHEN s.type = 'circle' THEN ST_Distance_Sphere(o.coords, s.shape) < s.radius
ELSE ST_CONTAINS(s.shape, o.coords)
END
)
SET
s.totalRevenue = SUM(o.revenue);
UPDATE:
This works, but it's slow and confusing. Is there a faster/more concise way?
UPDATE
shapes s
LEFT JOIN (
SELECT
t.shape_id,
SUM(o.revenue) revenue
FROM
shapes t
LEFT JOIN offices o ON (
CASE
WHEN t.type = 'circle' THEN ST_Distance_Sphere(o.coords, t.shape) < t.radius
ELSE ST_CONTAINS(t.shape, o.coords)
END
)
GROUP BY
t.shape_id
) b ON s.shape_id = b.shape_id
SET
s.totalRevenue = b.revenue;
I think that speed can be helped by splitting into two UPDATEs:
... WHERE t.type = 'circle'
AND ST_Distance_Sphere ...
and
... WHERE t.type != 'circle'
AND ST_CONTAINS ...
And then see if the resulting SQLs can be simplified.
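A rough sketch of that split, reusing the tables and columns from the question (untested, just to show the shape of the two statements):

UPDATE shapes s
LEFT JOIN (
    SELECT t.shape_id, SUM(o.revenue) AS revenue
    FROM shapes t
    LEFT JOIN offices o ON ST_Distance_Sphere(o.coords, t.shape) < t.radius
    WHERE t.type = 'circle'
    GROUP BY t.shape_id
) b ON s.shape_id = b.shape_id
SET s.totalRevenue = b.revenue
WHERE s.type = 'circle';

UPDATE shapes s
LEFT JOIN (
    SELECT t.shape_id, SUM(o.revenue) AS revenue
    FROM shapes t
    LEFT JOIN offices o ON ST_Contains(t.shape, o.coords)
    WHERE t.type != 'circle'
    GROUP BY t.shape_id
) b ON s.shape_id = b.shape_id
SET s.totalRevenue = b.revenue
WHERE s.type != 'circle';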
To further investigate the query, please isolate the subquery b and see if the bulk of the time is in doing that SELECT (as opposed to the time doing the UPDATE).
Please provide SHOW CREATE TABLE for each table and EXPLAIN for both the UPDATE(s) and the isolated SELECT(s). A number of clues might come from such.
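For instance, the derived table b can be timed on its own like this (the same SELECT from the question, minus the UPDATE wrapper):

SELECT t.shape_id, SUM(o.revenue) AS revenue
FROM shapes t
LEFT JOIN offices o ON (
    CASE
        WHEN t.type = 'circle' THEN ST_Distance_Sphere(o.coords, t.shape) < t.radius
        ELSE ST_Contains(t.shape, o.coords)
    END
)
GROUP BY t.shape_id;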
I can find records that form sequences (the same number appearing in consecutive weeks) using the following query.
SELECT * FROM pointed_numbers A WHERE EXISTS (
SELECT * FROM pointed_numbers B WHERE A.number = B.number AND (A.week = B.week + 1 XOR A.week = B.week - 1)
) ORDER BY A.number, A.week;
How can I identify each gap without a stored procedure? I have tried with user-defined variables but had no success.
Take a look at http://www.artfulsoftware.com/infotree/queries.php and look at the stuff under the "sequences" section. This is a super super helpful site with recipes for how to do complicated things in mysql!
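As a rough sketch of the user-variable technique those recipes describe, applied to the pointed_numbers table from the question (MySQL does not guarantee the evaluation order of user variables in a SELECT, so treat this as illustrative rather than definitive):

SELECT number, week,
       @grp := IF(number = @prev_number AND week = @prev_week + 1, @grp, @grp + 1) AS group_id,
       @prev_number := number,
       @prev_week := week
FROM (SELECT number, week FROM pointed_numbers ORDER BY number, week) AS ordered
CROSS JOIN (SELECT @grp := 0, @prev_number := NULL, @prev_week := NULL) AS init;

Rows that belong to the same consecutive run share a group_id, so each gap shows up as a jump to a new group_id.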
I have to execute a complex query, selecting several columns from 7-8 tables.
We don't want to write that query in the programming language (PHP - Symfony 1.4/Propel 1.4 in our case) but rather create a view or stored procedure so developers only need a very simple SELECT. I'm not sure which is the better approach.
We need the query in the following format:
SET @PlayerId = 1;
SELECT CASE WHEN mat.player1id = @PlayerId THEN mat.player2id ELSE mat.player1id END AS opponent
/*plus many other columns*/
FROM `wzo_matches` AS mat /*plus few other tables*/
WHERE (mat.player1id = @PlayerId OR mat.player2id = @PlayerId)
/*plus other join conditions*/
The problem with a view is the SET @PlayerId = xx statement: we don't know the player id in advance, it will be passed in through PHP. I hope that isn't reason enough to rule out views; is there any workaround for that?
The other option would be a stored procedure. The only issue with that is that it would create a new view for every query, so the operation would be very heavy for the DB.
Can someone suggest the best approach so that developers can get the required data without writing the complex query above in PHP (obviously through an SP or a view plus a simple SELECT)?
Based on the reply to Can I create view with parameter in MySQL?, my issue is fixed with the following queries:
create function getPlayer() returns INTEGER DETERMINISTIC NO SQL return @getPlayer;
create view getPlay as
SELECT
CASE WHEN play.hiderid = getPlayer() THEN play.seekerid ELSE play.hiderid END AS opponent, play.*
FROM odd_play play, odd_match mat
WHERE (seekerid = getPlayer() OR hiderid = getPlayer())
AND play.id = mat.latestplay;
select play.*
from (select @getPlayer := 1 p) ply, getPlay play;
CREATE PROCEDURE SELECT_PLAYER(p INT)
BEGIN
  SET @PlayerId = p;
  SELECT CASE WHEN mat.player1id = @PlayerId THEN mat.player2id ELSE mat.player1id END AS opponent
  /*plus many other columns*/
  FROM `wzo_matches` AS mat /*plus few other tables*/
  WHERE (mat.player1id = @PlayerId OR mat.player2id = @PlayerId);
  /*plus other join conditions*/
END
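Calling it then reduces to a single statement from PHP:

CALL SELECT_PLAYER(1);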
I'm trying to reference a field from the 1st SELECT's table in the 3rd SELECT (subquery).
However, that field isn't recognized when the query goes down to that sub-level.
The PHP code I'm working on uses SQL to return part of the SQL command (a string) that will be used in other places.
I've come up with this example, which shows the kind of nested queries I want to solve.
Here I'm trying to get the names and emails of users that work at night and have a job rank matching an available job:
tables -----------> fields
table_users -> [user_id, name, email, rank, ...]
table_users_jobs -> [user_id, job_id, period, ....]
table_jobs -> [job_id, status, rank, ...]
-- sql calling code -> $rank = "t1.rank"; get_users_info_by_rank($rank);
-- maybe using: SET @rank = NULL; SELECT @rank := $rank, t1.name, ...
SELECT t1.name, t1.email
FROM table_users as t1
WHERE t1.user_id IN (
SELECT t2.user_id
FROM table_users_jobs as t2
WHERE t2.period = 'night' AND
t2.job_id IN (
-- available jobs for that rank -> get_job_ranks_sql($rank);
SELECT t3.job_id
FROM table_jobs as t3
-- maybe using: t3.rank = @rank
WHERE t3.rank = t1.rank AND
t3.status = 'available_position')
)
With a bit of rework I guess I could avoid the 3rd-level SELECT problem. Nevertheless, the point is that I'm trying to reuse SQL code, like the function that gives me the job_ids for the rank that I chose:
function get_job_ranks_sql($rank){
//probably 't3' will be renamed for something more unique
return 'SELECT t3.job_id
FROM table_jobs as t3
WHERE t3.rank = '.$rank.' AND
t3.status = "available_position")';
}
Even though I'm using PHP, I'm trying to keep this generic so it could be used from another language if possible.
The SQL version in use is MySQL 5.1.41.
Actually I think it's possible to do it the way I want by using SQL variables like @rank, but I'm not sure whether that is slower or whether there are better ways to do it.
Thanks in advance for any help :)
So, as one commenter pointed out, I think you would be much better off using JOINs than sub-selects. For example, if I am reading your query/problem correctly, you could do a join query like this:
SELECT t1.name, t1.email, t3.job_id
FROM table_users t1
LEFT JOIN table_users_jobs t2
ON t1.user_id = t2.user_id
LEFT JOIN table_jobs t3
ON t3.job_id = t2.job_id
WHERE t2.period = 'night'
AND t3.status = 'available_position'
Which is a lot more concise, easier to read, and easier on your database. But doing this would prevent you from modularizing your SQL. If that is really important, you might consider storing such queries in a stored procedure. That way you can have the SP return a list of results. Take a look at this tutorial:
http://www.wellho.net/resources/ex.php4?item=s163/stp4
Of course, that doesn't really solve your problem of being able to access variables at the lower levels of a sub select, but it would make your SQL easier to manage, and make it available to other language implementations, as you mentioned might be a need for you.
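If you do want to experiment with the user-variable idea you mentioned, a minimal sketch might look like the following. Note the @rank assignment replaces the t1.rank correlation, so this only works when the rank is a single known value supplied from PHP, which is an assumption on my part:

SET @rank = 3;  -- example value; in practice supplied from PHP

SELECT t1.name, t1.email
FROM table_users AS t1
WHERE t1.rank = @rank
  AND t1.user_id IN (
      SELECT t2.user_id
      FROM table_users_jobs AS t2
      WHERE t2.period = 'night'
        AND t2.job_id IN (
            -- the reusable fragment, now referencing @rank instead of t1.rank
            SELECT t3.job_id
            FROM table_jobs AS t3
            WHERE t3.rank = @rank
              AND t3.status = 'available_position'
        )
  );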
Something else to consider, in the bigger picture, would be migrating to a PHP framework that provides an ORM layer, where you could make those tables into objects, and then be able to access your data with much greater ease and flexibility (usually). But that is very 'big picture' and might not be suitable for your project requirements. One such framework that I could recommend, however, is CakePHP.
I'm trying to build a user network based on the call detail records in my CDR table.
To keep things simple, let's say I've got a CDR table:
CDRid
UserAId
UserBId
There are more than 100 million records, so the table is quite big.
I created a user2user table:
UserAId
UserBId
NumberOfConnections
Then, using a cursor, I iterate through each row in the table and run a SELECT:
if the user2user table already has a record where UserAId = UserAId from the CDR record and UserBId = UserBId from the CDR record, then increase NumberOfConnections;
otherwise insert such a row with NumberOfConnections = 1.
It's quite a simple task and it works, as I said, using a cursor, but the performance is very bad (estimated time on my computer ~60 h).
I've heard that SQL Server Integration Services has better performance when dealing with such big tables.
The problem is that I have no idea how to build an SSIS package for this task.
If anyone has any idea how to help me, any good resources, etc., I would be really thankful.
Maybe there is some other good solution to make it faster. I used indexes and table variables and so on, and performance is still poor.
Thanks for the help.
P.S.
This is the script I wrote; executing it takes something like 40-50 h.
DECLARE CDR_cursor CURSOR FOR
SELECT CDRId, SubscriberAId, BNumber
FROM dbo.CDR

OPEN CDR_cursor;
FETCH NEXT FROM CDR_cursor
INTO @CdrId, @SubscriberAId, @BNumber;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Here I check if there is a user with this number (in CDR I only have SubscriberAId
    -- and BNumber, so I need to work out which user this is; I only keep users from my
    -- network, so whenever I can't find the user I add one marked as outside the network).
    SELECT @UserBId = (SELECT UserID FROM dbo.Number WHERE Number = @BNumber)

    IF (@UserBId IS NULL)
    BEGIN
        INSERT INTO dbo.[User] (ID, Marked, InNetwork)
        VALUES (@OutUserId, 0, 0);
        INSERT INTO dbo.[Number] (Number, UserId) VALUES (@BNumber, @OutUserId);
        INSERT INTO dbo.User2User
        VALUES (@SubscriberAId, @OutUserId, 1)
        SET @OutUserId = @OutUserId - 1;
    END
    ELSE
    BEGIN
        UPDATE dbo.User2User
        SET NumberOfConnections = NumberOfConnections + 1
        WHERE User1ID = @SubscriberAId AND User2ID = @UserBId
        -- Insert the row if the UPDATE statement failed.
        IF (@@ROWCOUNT = 0)
        BEGIN
            INSERT INTO dbo.User2User
            VALUES (@SubscriberAId, @UserBId, 1)
        END
    END

    SET @Counter = @Counter + 1;
    IF ((@Counter % 100000) = 0)
    BEGIN
        PRINT CAST(@Counter AS NVARCHAR(12));
    END

    FETCH NEXT FROM CDR_cursor
    INTO @CdrId, @SubscriberAId, @BNumber;
END
CLOSE CDR_cursor;
DEALLOCATE CDR_cursor;
The thing about SSIS is that it probably won't be much faster than a cursor. It's pretty much doing the same thing: reading the table record by record, processing the record and then moving to the next one. There are some advanced techniques in SSIS like sharding the data input that will help if you have heavy duty hardware, but without that it's going to be pretty slow.
A better solution would be to write an INSERT and an UPDATE statement that will give you what you want. With that you'll be better able to take advantage of indices on the database. They would look something like:
WITH SummaryCDR (UserAId, UserBId, Conns) AS
(
SELECT UserAId, UserBId, COUNT(1) FROM CDR
GROUP BY UserAId, UserBId)
UPDATE user2user
SET NumberOfConnections = NumberOfConnections + SummaryCDR.Conns
FROM SummaryCDR
WHERE SummaryCDR.UserAId = user2user.UserAId
AND SummaryCDR.UserBId = user2user.UserBId;
INSERT INTO user2user (UserAId, UserBId, NumberOfConnections)
SELECT CDR.UserAId, CDR.UserBId, Count(1)
FROM CDR
LEFT OUTER JOIN user2user
ON user2user.UserAId = CDR.UserAId
AND user2user.UserBId = CDR.UserBId
WHERE user2user.UserAId IS NULL
GROUP BY CDR.UserAId, CDR.UserBId
(NB: I don't have time to test this code, you'll have to debug it yourself)
Is this what you need?
select
UserAId, UserBId, count(CDRid) as count_connections
from cdr
group by UserAId, UserBId
Could you break the conditional update/insert into two separate statements and get rid of the cursor?
Do the INSERT for all the NULL rows and the UPDATE for all the NOT NULL rows.
Why are you even considering row-by-row processing on a table that size? You know you can use the MERGE statement to insert or update, and it will be faster. Or you could write one set-based UPDATE for all rows that need updating and one set-based INSERT for all rows that don't yet exist.
Stop using the VALUES clause and use an INSERT with joins instead. Same thing with updates. If you need extra complexity, a CASE statement will probably give you all you need.
In general, stop thinking in terms of row-by-row processing. If you can write a SELECT for the cursor, you can write a set-based statement to do the work 99.9% of the time.
You may still want a cursor with a table this large, but one that processes batches of data (for instance 1000 records at a time), not one that runs row by row.
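A minimal sketch of the MERGE idea, using the simplified CDR(UserAId, UserBId) columns from the top of the question and the User1ID/User2ID columns from the cursor script, and ignoring the BNumber-to-user lookup for brevity:

MERGE dbo.User2User AS target
USING (
    -- aggregate the CDR rows once, then upsert the totals
    SELECT UserAId, UserBId, COUNT(*) AS Conns
    FROM dbo.CDR
    GROUP BY UserAId, UserBId
) AS src
    ON target.User1ID = src.UserAId
   AND target.User2ID = src.UserBId
WHEN MATCHED THEN
    UPDATE SET NumberOfConnections = target.NumberOfConnections + src.Conns
WHEN NOT MATCHED THEN
    INSERT (User1ID, User2ID, NumberOfConnections)
    VALUES (src.UserAId, src.UserBId, src.Conns);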