I have two tables, Contract and Contractuser.
My job was to fetch the latest invoice date for each contract number from the Contractuser table and display the results, which came out correctly.
Now I wanted an auto-increment column displayed as the first column in my result set.
I used the following query for it:
SELECT @i := @i + 1 AS Sno, a.ContractNo, a.SoftwareName, a.CompanyName, b.InvoiceNo, b.InvoiceDate,
       b.InvAmount, b.InvoicePF, MAX(b.InvoicePT) AS InvoicePeriodTo, b.InvoiceRD, b.ISD
FROM contract AS a, contractuser AS b, (SELECT @i := 0) AS i
WHERE a.ContractNo = b.ContractNo
GROUP BY b.ContractNo
ORDER BY a.SoftwareName ASC;
But it seems that the variable is incremented before the GROUP BY is applied, because of which the serial numbers come out non-contiguous.
GROUP BY and variables don't necessarily work as expected. Just use a subquery:
SELECT (@i := @i + 1) AS Sno, c.*
FROM (SELECT c.ContractNo, c.SoftwareName, c.CompanyName, cu.InvoiceNo, cu.InvoiceDate,
             cu.InvAmount, cu.InvoicePF, MAX(cu.InvoicePT) AS InvoicePeriodTo, cu.InvoiceRD, cu.ISD
      FROM contract c JOIN
           contractuser cu
           ON c.ContractNo = cu.ContractNo
      GROUP BY cu.ContractNo
      ORDER BY c.SoftwareName ASC
     ) c CROSS JOIN
     (SELECT @i := 0) params;
Notes:
I also fixed the JOIN syntax. Never use commas in the FROM clause.
I also added reasonable table aliases -- abbreviations for the tables. a and b don't mean anything, so they make the query harder to follow.
I left the GROUP BY with only one key. It should really have all the unaggregated keys but this is allowed under some circumstances.
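For completeness, under strict ONLY_FULL_GROUP_BY mode the loose GROUP BY above is rejected; the usual rewrite computes the per-contract maximum in a derived table and joins back. A sketch, assuming the same column names as the question:

```sql
-- One row per contract, carrying the row whose InvoicePT is the maximum.
SELECT c.ContractNo, c.SoftwareName, c.CompanyName, cu.InvoiceNo, cu.InvoiceDate,
       cu.InvAmount, cu.InvoicePF, cu.InvoicePT AS InvoicePeriodTo, cu.InvoiceRD, cu.ISD
FROM contract c
JOIN contractuser cu ON c.ContractNo = cu.ContractNo
JOIN (SELECT ContractNo, MAX(InvoicePT) AS MaxPT
      FROM contractuser
      GROUP BY ContractNo) m
  ON cu.ContractNo = m.ContractNo AND cu.InvoicePT = m.MaxPT
ORDER BY c.SoftwareName ASC;
```

This version is deterministic about which row accompanies the maximum, which the loose GROUP BY is not.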
SELECT @row_no := IF(@prev_val = lpsm.lit_no, @row_no + 1, 1) AS row_num,
       @prev_val := get_pad_value(lpsm.lit_no, 26) AS LAWSUIT_NO,
       lpsm.cust_rm_no
FROM lit_person_sue_map lpsm,
     (SELECT @row_no := 0) x,
     (SELECT @prev_val := '') y
ORDER BY lpsm.lit_no ASC;
This will return a sequence number that restarts for each lit_no group.
Related
I have a MySQL table containing player points for several categories (p1, p2 etc.) and a player id (pid).
I have a query that counts SUM of points for each category, puts them as aliases and groups them by player id (pid).
SELECT *,
SUM(p1) as p1,
SUM(p2) as p2,
SUM(p3) as p3,
SUM(p4) as p4,
SUM(p6) as p6,
SUM(p13) as p13,
SUM(p14) as p14,
SUM(p15) as p15,
SUM(p16) as p16,
SUM(p17) as p17,
SUM(p18) as p18,
SUM(p19) as p19,
SUM(p20) as p20,
SUM(p21) as p21
FROM results GROUP BY pid
Further, I do a while loop and update another table with these alias values.
Now I need to count only the top 5 or 12 (depending on the category) values for each group. I don't know where to start. I found similar questions, but none of them addresses putting the value in an alias, so that I don't have to change further code.
Can someone help me and write an example query for at least two categories, so I can understand the principle of doing this right?
Thank you in advance!
As we need to do sum of top n records, we need to use something like this:
SELECT pid, SUM(p1)
FROM (SELECT r.*,
             (@pn := IF(@p = pid, @pn + 1,
                        IF(@p := pid, 1, 1)
                       )
             ) AS seqnum
      FROM results r CROSS JOIN
           (SELECT @p := 0, @pn := 0) AS vars
      ORDER BY pid, p1 DESC
     ) r
WHERE seqnum <= 1
GROUP BY pid;
Here, we can adjust the seqnum <= 1 condition to the number of records needed; e.g. for the top 5 records, write seqnum <= 5.
Please note that this only calculates the top-n sum for a single field. If we want multiple fields, we need to repeat the pattern for each.
Here is the SQL Fiddle example to play around with.
Building on the answer by @DarshanMehta, you can repeat subqueries like that. Note that the variable names in each subquery need to be different.
Something like this, assuming you have a table of players:-
SELECT players.pid,
       suba1.p1sum,
       suba2.p2sum
FROM players
LEFT OUTER JOIN
(
    SELECT pid, SUM(p1) AS p1sum
    FROM (SELECT r.pid,
                 r.p1,
                 @p1n := IF(@p1 = pid, @p1n + 1, 1) AS seqnum,
                 @p1 := pid
          FROM results r
          CROSS JOIN (SELECT @p1 := 0, @p1n := 0) AS v1
          ORDER BY r.pid, r.p1 DESC
         ) sub1
    WHERE seqnum <= 5
    GROUP BY pid
) suba1
ON players.pid = suba1.pid
LEFT OUTER JOIN
(
    SELECT pid, SUM(p2) AS p2sum
    FROM (SELECT r.pid,
                 r.p2,
                 @p2n := IF(@p2 = pid, @p2n + 1, 1) AS seqnum,
                 @p2 := pid
          FROM results r
          CROSS JOIN (SELECT @p2 := 0, @p2n := 0) AS v2
          ORDER BY r.pid, r.p2 DESC
         ) sub2
    WHERE seqnum <= 5
    GROUP BY pid
) suba2
ON players.pid = suba2.pid
You can build a table with all that SUM information, and then use this one:
SELECT * FROM newTable ORDER BY p1 DESC LIMIT 5;
You can pull whatever info you want by changing the field p1 and the LIMIT 5.
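A minimal sketch of that approach, assuming the results table from the question and a hypothetical summary-table name player_sums:

```sql
-- Materialize the per-player sums once...
CREATE TABLE player_sums AS
SELECT pid,
       SUM(p1) AS p1,
       SUM(p2) AS p2
FROM results
GROUP BY pid;

-- ...then query the summary table as needed.
SELECT * FROM player_sums ORDER BY p1 DESC LIMIT 5;
```

Note this ranks players by their total sums; it does not by itself restrict each sum to a player's top-n rows, so it complements rather than replaces the variable-based queries above.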
At the end of this process I need to have a maximum of 15 records for each type in a table.
My (hypothetical) table "stickorder" has 3 columns: StickColor, OrderNumber, PrimeryKey (OrderNumber and PrimeryKey are unique).
I can only handle 15 orders for each stick color, so I need to delete all the extra orders. (They will be processed another day and are in a master table, so I don't need them in this table.)
I have tried some similar solutions from this site, but nothing seems to work; this is the closest:
INSERT INTO stickorder2
(select posts_ordered.*
 from (
    select
        stickorder.*,
        @row := if(@last_order = stickorder.OrderNumber, @row+1, 1) as row,
        @last_orders := stickorder.OrderNumber
    from
        stickorder inner join
        (select OrderNumber from
            (select distinct OrderNumber
             from stickorder
             order by OrderNumber) limit_orders
        ) limit_orders
        on stickorder.OrderNumber = limit_orders.OrderNumber,
        (select @last_order := 0, @row := 0) r
 ) posts_ordered
 where row <= 15);
When using INSERT, you should always list the columns. Alternatively, you might really want CREATE TABLE ... AS.
Then, there are lots of other issues with your query. For instance, you say you want a limit on the number for each color, and yet you have no reference to StickColor in your query. I think you want something more along these lines:
INSERT INTO stickorder2(col1, . . . col2)
    select so.*
    from (select so.*,
                 @row := if(@lastcolor = so.StickColor, @row+1,
                            if(@lastcolor := so.StickColor, 1, 1)
                           ) as row
          from stickorders so cross join
               (select @lastcolor := '', @row := 0) vars
          order by so.StickColor
         ) so
    where row <= 15;
I have a simple process I'm trying to do in a single SQL statement.
I've got a table of players (called tplayers) with columns indicating what their userid and tourneyid are, as well as a "playerpoints" column. I've also got a table called "tscores" which contains scores, a userid and column called "rankpoints" - I want to take the top 3 rows per player with the highest rankpoints and put that value in the corresponding user record in tplayers -- all for a specific tourneyid.
Here's the query:
update tplayers p
    set playerpoints =
    ( select sum(b.mypoints) y
      from ( select scorerankpoints as mypoints
             from tscores t
             where t.tourneyid = p.tourneyid and p.userid = t.userid and t.scorerankpoints > 0
             order by scorerankpoints desc
             limit 3
           ) as b
    )
    where p.tourneyid = '12'
This generates this error: Unknown column 'p.tourneyid' in 'where clause'
I'm basically looking to take the top 3 values of "scorerankpoints" from table tscores and put the summed value into a column in table tplayers called playerpoints,
and I want to do this for all players and scores who have the same tourneyid in their tables.
It appears that the inner reference to p.tourneyid is undefined... Is there a way to do this in a single statement or do I have to break it up?
MySQL has a problem resolving correlated references that are more than one layer deep. This is a hard one to fix.
The following uses variables to enumerate the rows and then choosing the right rows for aggregation in an update/join:
update tplayers p join
       (select ts.userid, sum(ts.scorerankpoints) as mypoints
        from (select ts.*,
                     @rn := if(@userid = ts.userid, @rn + 1, 1) as rn,
                     @userid := ts.userid
              from tscores ts cross join
                   (select @rn := 0, @userid := '') const
              where ts.tourneyid = '12'
              order by ts.userid, ts.scorerankpoints desc
             ) ts
        where rn <= 3
        group by ts.userid
       ) ts
       on p.userid = ts.userid
    set playerpoints = ts.mypoints
    where p.tourneyid = '12';
What I'm expecting is for tmp.rank to increment from 1 to 10 for each userid before moving on to the next userid; however, it stays at 1 for every record, so it doesn't limit the output to 10 items per userid.
Any ideas what I'm doing wrong? Most probably something simple and obvious, or more likely I'm using the SQL in a way it's not intended.
SELECT DISTINCT
tmp.title,
tmp.content,
tmp.postid,
tmp.userid,
tmp.screenname,
tmp.email
FROM
(
SELECT
qp.title,
qp.content,
qp.postid,
ut.userid,
ut.screenname,
ut.email,
qp.created,
@rownum := IF( @prev = ut.userid, @rownum+1, 1 ) AS rank,
@prev := ut.userid
FROM
user_table AS ut JOIN (SELECT @rownum := NULL, @prev := 0) AS r ,
qa_posts AS qp,
qa_categories AS qc,
expatsblog_country AS cc
WHERE
LOWER(ut.country_of_expat) = LOWER(qc.title)
AND ut.setting_notifications IN (3)
AND ut.valid=1
AND ut.confirm_email = 1
AND qc.categoryid = qp.categoryid
AND qp.type='Q'
AND DATE(qp.created)>=DATE_SUB(NOW(), INTERVAL 24 HOUR)
ORDER BY ut.userid,qp.created ASC
) AS tmp
WHERE tmp.rank < 10
ORDER BY tmp.userid, tmp.created ASC
I've done many queries in the past, especially responding to posts out here regarding MySQL and @variables. One of the problems I've encountered is the order in which values are assigned to the rows.
Your original query DID have an ORDER BY in the inner-most query, but I've encountered times when the @variable assignment does not truly respect it: the counter resets because the engine moves to some other user, then later encounters more records for the first user and resets the counter back to one even though those rows occurred later on.
Next, you are applying DISTINCT at the end. I would pull the DISTINCT into the inner-most query so you are not numbering 10 records for some key and then, after collapsing duplicates for the same user, ending up with only 2 records returned.
That said, I would adjust the query to what I have below. The inner-most just grabs DISTINCT on the columns you want, then apply the #variable assignments.
SELECT DISTINCT
tmp.title,
tmp.content,
tmp.postid,
tmp.userid,
tmp.screenname,
tmp.email,
@rownum := IF( @prev = tmp.userid, @rownum+1, 1 ) AS rank,
@prev := tmp.userid
FROM
( SELECT DISTINCT
qp.title,
qp.content,
qp.postid,
ut.userid,
ut.screenname,
ut.email
FROM
user_table AS ut
JOIN qa_categories AS qc
ON LOWER( ut.country_of_expat ) = LOWER( qc.title )
JOIN qa_posts AS qp
ON qc.categoryid = qp.categoryid
AND qp.type='Q'
AND DATE(qp.created)>=DATE_SUB(NOW(), INTERVAL 24 HOUR)
WHERE
ut.setting_notifications = 3
AND ut.valid = 1
AND ut.confirm_email = 1
ORDER BY
ut.userid,
qp.created ASC ) AS tmp,
( SELECT @rownum := NULL,
         @prev := 0) AS r
HAVING
    rank < 10
ORDER BY
tmp.userid
I did not see any references to "cc" being joined anywhere, which would have caused a Cartesian result giving a record for EACH entry in expatsblog_country, so I removed it (removed expatsblog_country AS cc). If it IS needed, put it back where applicable and add a JOIN condition too.
Also, instead of a WHERE clause, I changed to a HAVING clause, so that all returned records are CONSIDERED for the final result set. This ensures the @rownum keeps incrementing whenever it encounters a POSSIBLE entry, while HAVING throws out all those with rank 10 or greater.
Finally, since the inner table was already pre-ordered by user and created date, you should not need the explicit re-ordering AGAIN in the outer... maybe just the UserID as I have it.
I am going to take a guess here at what the problem is.
You are combining two types of join syntax and that could be causing your issue.
You are using both a JOIN and then commas between your tables. You are using a JOIN between the user_table and the user variables and then a comma between the remaining tables.
FROM user_table AS ut JOIN (SELECT @rownum := NULL, @prev := 0) AS r ,
     qa_posts AS qp,
     qa_categories AS qc,
     expatsblog_country AS cc
While your WHERE clause includes the join columns for most tables, it looks like you have no join condition for the table expatsblog_country. When you are joining tables, you should use one type of syntax and not mix them.
I would suggest something similar to this. I did not see any join condition for the expatsblog_country table, so I used a CROSS JOIN to join it to a subquery of the others. If you have a column to join expatsblog_country to any of the others, then move that join into the subquery:
SELECT DISTINCT tmp.title,
tmp.content,
tmp.postid,
tmp.userid,
tmp.screenname,
tmp.email
FROM
(
SELECT src.title,
src.content,
src.postid,
src.userid,
src.screenname,
src.email,
src.created,
@rownum := IF( @prev = src.userid, @rownum+1, 1 ) AS rank,
@prev := src.userid
FROM
(
select ut.userid,
ut.screenname,
ut.email,
qp.title,
qp.content,
qp.postid,
qp.created
from user_table AS ut
join qa_categories AS qc
on LOWER(ut.country_of_expat) = LOWER(qc.title)
join qa_posts AS qp
on qc.categoryid = qp.categoryid
where ut.setting_notifications IN (3)
and ut.valid=1
and ut.confirm_email = 1
and qp.type='Q'
and DATE(qp.created)>=DATE_SUB(NOW(), INTERVAL 24 HOUR)
) src
CROSS JOIN
(
SELECT @rownum := 0, @prev := 0
) AS r
CROSS JOIN expatsblog_country AS cc
ORDER BY src.userid, src.created ASC
) AS tmp
WHERE tmp.rank < 10
ORDER BY tmp.userid, tmp.created ASC
Try changing
( SELECT @rownum := NULL,
         @prev := 0) AS r
so that you initialize @rownum to 0, and @prev to NULL instead.
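In other words, the initializer becomes:

```sql
-- @rownum must start numeric: NULL + 1 is NULL, which keeps rank stuck.
-- @prev starts as NULL so the first row of the first user compares unequal.
( SELECT @rownum := 0,
         @prev := NULL ) AS r
```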
Figured out what this was:
expatsblog_country was not supposed to be part of the query, and more importantly:
the DATE() around qp.created was messing this up, as was the actual "last 24 hour" part of the query. Removed this and it was fine.
When I execute this query it takes a long time, because the user_fans table contains 10,000 user entries. How can I optimize it?
Query
SELECT uf.`user_name`, uf.`user_id`,
       @post := (SELECT COUNT(*) FROM post WHERE user_id = uf.`user_id`) AS post,
       @post_comment_likes := (SELECT COUNT(*) FROM post_comment_likes WHERE user_id = uf.`user_id`) AS post_comment_likes,
       @post_comments := (SELECT COUNT(*) FROM post_comments WHERE user_id = uf.`user_id`) AS post_comments,
       @post_likes := (SELECT COUNT(*) FROM post_likes WHERE user_id = uf.`user_id`) AS post_likes,
       (@post + @post_comments) AS `sum_post`,
       (@post_likes + @post_comment_likes) AS `sum_like`,
       ((@post + @post_comments) * 10) AS `post_cal`,
       ((@post_likes + @post_comment_likes) * 5) AS `like_cal`,
       ((@post * 10) + (@post_comments * 10) + (@post_likes * 5) + (@post_comment_likes * 5)) AS `total`
FROM `user_fans` uf ORDER BY `total` DESC LIMIT 20
I would try to simplify this COMPLETELY by putting triggers on your other tables, and just adding a few columns to your user_fans table: one for each respective COUNT() you are trying to get from post, post_likes, post_comments and post_comment_likes.
When a record is added to whichever table, just update your user_fans table to add 1 to the count; it will be virtually instantaneous, keyed on the user's ID. For the "likes", similarly, add 1 only when something is flagged as a like. Then your query becomes direct math on a single record and does not rely on ANY joins to compute a "weighted" total value. As your tables get even larger, the current query will only get slower, since it has more data to pour through and aggregate: you are going through EVERY user_fans record, which in essence queries every record in all the other tables.
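A minimal sketch of one such trigger, assuming a post_count column is added to user_fans and that post has a user_id column (names are illustrative, taken from the question's tables):

```sql
-- Counter column, maintained by the trigger below.
ALTER TABLE user_fans ADD COLUMN post_count INT NOT NULL DEFAULT 0;

DELIMITER //
CREATE TRIGGER trg_post_after_insert
AFTER INSERT ON post
FOR EACH ROW
BEGIN
    -- Keyed lookup on user_id: effectively instantaneous per insert.
    UPDATE user_fans
       SET post_count = post_count + 1
     WHERE user_id = NEW.user_id;
END//
DELIMITER ;
```

The same pattern repeats for post_likes, post_comments and post_comment_likes; a corresponding AFTER DELETE trigger would decrement the counter if rows can be removed.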
All that being said, keeping the tables as you have them, I would restructure as follows...
SELECT
      uf.user_name,
      uf.user_id,
      @pc := COALESCE( PostSummary.PostCount, 000000 ) AS PostCount,
      @pl := COALESCE( PostLikes.LikesCount, 000000 ) AS PostLikes,
      @cc := COALESCE( CommentSummary.CommentsCount, 000000 ) AS PostComments,
      @cl := COALESCE( CommentLikes.LikesCount, 000000 ) AS CommentLikes,
      @pc + @cc AS sum_post,
      @pl + @cl AS sum_like,
      @pCalc := (@pc + @cc) * 10 AS post_cal,
      @lCalc := (@pl + @cl) * 5 AS like_cal,
      @pCalc + @lCalc AS `total`
   FROM
      ( select @pc := 0,
               @pl := 0,
               @cc := 0,
               @cl := 0,
               @pCalc := 0,
               @lCalc := 0 ) sqlvars,
      user_fans uf
         LEFT JOIN ( select user_id, COUNT(*) as PostCount
                     from post
                     group by user_id ) as PostSummary
            ON uf.user_id = PostSummary.User_ID
         LEFT JOIN ( select user_id, COUNT(*) as LikesCount
                     from post_likes
                     group by user_id ) as PostLikes
            ON uf.user_id = PostLikes.User_ID
         LEFT JOIN ( select user_id, COUNT(*) as CommentsCount
                     from post_comments
                     group by user_id ) as CommentSummary
            ON uf.user_id = CommentSummary.User_ID
         LEFT JOIN ( select user_id, COUNT(*) as LikesCount
                     from post_comment_likes
                     group by user_id ) as CommentLikes
            ON uf.user_id = CommentLikes.User_ID
   ORDER BY
      `total` DESC
   LIMIT 20
My variables are abbreviated as:
"@pc" = PostCount
"@pl" = PostLikes
"@cc" = CommentCount
"@cl" = CommentLikes
"@pCalc" = weighted calc of post and comment counts * 10 weighted value
"@lCalc" = weighted calc of post and comment likes * 5 weighted value
The LEFT JOIN to prequeries runs those queries ONCE through, then the entire thing is joined instead of being hit as a sub-query for every record. By using the COALESCE(), if there are no such entries in the LEFT JOINed table results, you won't get hit with NULL values messing up the calcs, so I've defaulted them to 000000.
CLARIFICATION OF YOUR QUESTIONS
You can have any QUERY as an "AS AliasResult". The "AS" can also be used to shorten long table names for readability. Aliases can also reference the same table twice under different names, to get similar content for different purposes.
select
MyAlias.SomeField
from
MySuperLongTableNameInDatabase MyAlias ...
select
c.LastName,
o.OrderAmount
from
customers c
join orders o
on c.customerID = o.customerID ...
select
PQ.SomeKey
from
( select ST.SomeKey
from SomeTable ST
where ST.SomeDate between X and Y ) as PQ
JOIN SomeOtherTable SOT
on PQ.SomeKey = SOT.SomeKey ...
Now, the third query above is not overly practical as written, but it shows a full query wrapped up as the alias "PQ" (representing "PreQuery"). This can be done when you want to pre-limit a certain set of records under complex conditions and get a smaller set BEFORE doing extra joins to many other tables for the final results.
Since a "FROM" does not HAVE to be an actual table, but can be a query in itself, anywhere else it is used in the query, it has to know how to reference this prequery result set.
Also, when querying fields, they too can be "As FinalColumnName" to simplify results to where ever they will be used too.
select
CONCAT( User.Salutation, User.LastName ) as CourtesyName
from ...
select
Order.NonTaxable
+ Order.Taxable
+ ( Order.Taxable * Order.SalesTaxRate ) as OrderTotalWithTax
from ...
The "AS" columnName is NOT required to be an aggregate, but that is where it is most commonly seen.
Now, with respect to the MySQL variables: if you were writing a stored procedure, many people would pre-declare them, setting their default values before the rest of the procedure. You can do them in-line in a query by just setting them and giving that result an "alias" reference. When doing these variables, the SELECT simulates always returning a SINGLE RECORD worth of values. It's almost like an updateable single record used within the query. You don't need to apply any specific JOIN conditions, as it has no bearing on the rest of the tables in the query. In essence it creates a Cartesian result, but one record against any other table never creates duplicates, so no damage downstream.
select
      ...
   from
      ( select @SomeVar := 0,
               @SomeDate := curdate(),
               @SomeString := "hello" ) as SQLVars
Now, how the SQLVars work. Think of a linear program: each assignment is executed in the exact sequence the query runs. That value is then re-stored back in the "SQLVars" record, ready for the next time through. However, you don't reference it as SQLVars.SomeVar or SQLVars.SomeDate, just @SomeVar := someNewValue. Now, when the @var is used in a query, it is also stored as an "AS ColumnName" in the result set. Sometimes this can be just a place-holder computed value in preparation for the next record. Each value is then directly available for the next row. So, given the following sample...
select
      @SomeVar := @SomeVar * 2 as FirstVal,
      @SomeVar := @SomeVar * 2 as SecondVal,
      @SomeVar := @SomeVar * 2 as ThirdVal
   from
      ( select @SomeVar := 1 ) sqlvars,
      AnotherTable
   limit 3
Will result in 3 records with the values of
FirstVal SecondVal ThirdVal
2 4 8
16 32 64
128 256 512
Notice how the value of @SomeVar is used by each column: even on the same record, the updated value is immediately available for the next column. That said, now look at building a simulated record count / ranking per customer...
select
      o.CustomerID,
      o.OrderID,
      @SeqNo := if( @LastID = o.CustomerID, @SeqNo + 1, 1 ) as CustomerSequence,
      @LastID := o.CustomerID as PlaceHolderToSaveForNextRecordCompare
   from
      orders o,
      ( select @SeqNo := 0, @LastID := 0 ) sqlvars
   order by
      o.CustomerID
The ORDER BY clause forces the results to be returned in sequence first, so the records come back grouped per customer. First time through, @LastID is 0 and the customer ID is, say, 5. Since they differ, it returns 1 as the @SeqNo, THEN it preserves that customer ID in @LastID for the next record. On the next record for the same customer, @LastID matches, so it takes @SeqNo (now = 1), adds 1, and it becomes 2 for the same customer... and so on down the line.
As for getting better at writing queries, take a look at the MySQL tag and look at some of the heavy contributors. Look into the questions and some of the complex answers and how problems solving works. Not to say there are not others with lower reputation scores just starting out and completely competent, but you'll find who gives good answers and why's. Look at their history of answers posted too. The more you read and follow, the more you'll get a better handle on writing more complex queries.
You can convert this query to use a GROUP BY clause instead of a subquery for each column.
You can also create indexes on the relationship columns (this will be the most effective way of optimizing your query's response time).
1000 user records isn't much data at all.
There may be work you can do on the database itself:
1) Have you got the relevant indexes set on the foreign keys (indexes set on user_id in each of the tables)? Try running EXPLAIN before the query http://www.slideshare.net/phpcodemonkey/mysql-explain-explained
2) Are your data types correct?
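For example, the indexes in point 1 could be created like this (table and column names taken from the question; a sketch, not a definitive schema change):

```sql
CREATE INDEX idx_post_user               ON post (user_id);
CREATE INDEX idx_post_likes_user         ON post_likes (user_id);
CREATE INDEX idx_post_comments_user      ON post_comments (user_id);
CREATE INDEX idx_post_comment_likes_user ON post_comment_likes (user_id);

-- Then verify the counting subqueries use the indexes:
EXPLAIN SELECT COUNT(*) FROM post WHERE user_id = 42;
```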
See the difference between my query (image 1) and @DRapp's (image 2) in execution time and EXPLAIN output. When I read @DRapp's answer I realized what I was doing wrong in this query and why it took so much time. Basically the answer is simple: my query depended on correlated subqueries, while @DRapp used derived tables (temporary/filesort) with the help of session variables, aliases and joins...
image 1 exe time (00:02:56:321)
image 2 exe time (00:00:32:860)