MySQL increment column value by group - mysql

I would like to number the rows within each group of rows that share the same parent ID. See the problem below:
ID  Name  Parent  Pos
================================
1   Alex  1       0
2   Mary  1       0
3   John  1       0
4   Doe   2       0
5   Bob   2       0
6   Kate  2       0
EXPECTED RESULT
ID  Name  Parent  Pos
================================
1   Alex  1       1
2   Mary  1       2
3   John  1       3
4   Doe   2       1
5   Bob   2       2
6   Kate  2       3
I could do this with two queries: select the distinct values of the parent, then loop over them and update each set. But I feel there must be a more efficient way!

Problems like this are easily solved with a ranking function. MySQL versions before 8.0 don't support ranking (window) functions, so there we have to use an alternative.
Check this query:
-- for dense rank
SELECT
    Id,
    NAME,
    Parent,
    Pos,
    CASE
        WHEN @previousParent = rankTab.Parent THEN @runningGroup := @runningGroup + 1
        -- new group: restart at 1 and remember the parent (the AND trick assumes Parent is non-zero)
        ELSE @runningGroup := 1 AND @previousParent := rankTab.Parent
    END AS denseRank
FROM
    inc_col_val_by_group AS rankTab,
    (SELECT @runningGroup := 0) AS b,
    (SELECT @previousParent := 0) AS prev
ORDER BY rankTab.Parent; -- order by Parent
--
-- Below are the scripts to create the table and insert the given records
-- create the table
CREATE TABLE inc_col_val_by_group
(Id INT
, NAME CHAR(10)
, Parent INT
, Pos INT
);
-- insert some records
INSERT INTO inc_col_val_by_group(Id, NAME, Parent, Pos)
VALUES
(1, 'Alex', 1, 0)
, (2, 'Mary', 1, 0)
, (3, 'John', 1, 0)
, (4, 'Doe', 2, 0)
, (5, 'Bob', 2, 0)
, (6, 'Kate', 2, 0);

The most efficient way is probably to use variables:
select t.*,
       (@rn := if(@p = parent, @rn + 1,
                  if(@p := parent, 1, 1)
                 )
       ) as pos
from table t cross join   -- replace "table" with your table name
     (select @p := 0, @rn := 0) init
order by parent, id;

SET @posn:=0;
SET @pid:=0;
SELECT IF(@pid=k.parentid,@posn:=@posn+1,@posn:=1) pos,@pid:=k.parentid pid, k.*
FROM kids k
ORDER BY parentid;
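To make the variable technique in the answers above easier to follow, here is a minimal Python sketch of the same logic. It is only an illustration, not something from the original answers: the function name `number_within_groups` is made up, and the rows are the question's sample data. The variables `prev_parent` and `rn` play the roles of `@p` and `@rn`.

```python
# Rows mirror the question's sample data: (id, name, parent, pos).
rows = [
    (1, "Alex", 1, 0),
    (2, "Mary", 1, 0),
    (3, "John", 1, 0),
    (4, "Doe", 2, 0),
    (5, "Bob", 2, 0),
    (6, "Kate", 2, 0),
]


def number_within_groups(rows):
    """Return rows with pos replaced by a 1-based counter that restarts per parent."""
    ordered = sorted(rows, key=lambda r: (r[2], r[0]))  # ORDER BY parent, id
    prev_parent = None  # plays the role of @p
    rn = 0              # plays the role of @rn
    result = []
    for row_id, name, parent, _old_pos in ordered:
        # Same parent as the previous row: keep counting. New parent: restart at 1.
        rn = rn + 1 if parent == prev_parent else 1
        prev_parent = parent
        result.append((row_id, name, parent, rn))
    return result


numbered = number_within_groups(rows)
```

The sort matters: the counter only restarts correctly when all rows of one parent are adjacent, which is why the SQL versions need their ORDER BY.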

Related

How to find median given frequency of numbers?

The Numbers table keeps the value of number and its frequency.
+----------+-------------+
| Number | Frequency |
+----------+-------------+
| 0 | 7 |
| 1 | 1 |
| 2 | 3 |
| 3 | 1 |
+----------+-------------+
In this table, the numbers are 0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 2, 3, so the median is (0 + 0) / 2 = 0. How to find median (output shown) given frequency of numbers?
+--------+
| median |
+--------+
| 0.0000 |
+--------+
I found the following solution here. However, I am unable to understand it. Can someone please explain the solution and/or post a different solution with explanation?
SELECT AVG(n.Number) AS median
FROM Numbers n LEFT JOIN
(
    SELECT Number, @prev := @count AS prevNumber, (@count := @count + Frequency) AS countNumber
    FROM Numbers,
    (SELECT @count := 0, @prev := 0, @total := (SELECT SUM(Frequency) FROM Numbers)) temp ORDER BY Number
) n2
ON n.Number = n2.Number
WHERE
    (prevNumber < floor((@total+1)/2) AND countNumber >= floor((@total+1)/2))
    OR
    (prevNumber < floor((@total+2)/2) AND countNumber >= floor((@total+2)/2))
Here's the SQL script for reproducibility:
CREATE TABLE `Numbers` (
`Number` INT NULL,
`Frequency` INT NULL);
INSERT INTO `Numbers` (`Number`, `Frequency`) VALUES ('0', '7');
INSERT INTO `Numbers` (`Number`, `Frequency`) VALUES ('1', '1');
INSERT INTO `Numbers` (`Number`, `Frequency`) VALUES ('2', '3');
INSERT INTO `Numbers` (`Number`, `Frequency`) VALUES ('3', '1');
Thanks!
You can use a cumulative sum and then take the midway point. I think the logic looks like this:
select avg(number)
from (select t.*, (@rf := @rf + frequency) as running_frequency
      from (select n.* from Numbers n order by number) t cross join
           (select @rf := 0) params
     ) t
where running_frequency - frequency <= floor(@rf / 2) and
      running_frequency >= ceil(@rf / 2);
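Since the question asks for an explanation, here is a rough Python sketch of the cumulative-sum idea. It is an illustration only, and the function name `median_from_frequencies` is made up. Each number occupies the 1-based positions `prev + 1` to `running` in the expanded, sorted list, and the median is the average of the number(s) whose range covers the one or two middle positions.

```python
def median_from_frequencies(pairs):
    """pairs: iterable of (number, frequency). Returns the median of the expanded list."""
    ordered = sorted(pairs)
    total = sum(freq for _, freq in ordered)
    if total == 0:
        raise ValueError("no values")
    low_pos = (total + 1) // 2   # 1-based position of the lower middle value
    high_pos = (total + 2) // 2  # 1-based position of the upper middle value
    running = 0                  # cumulative frequency so far
    picked = []
    for number, freq in ordered:
        prev = running
        running += freq
        # This number occupies positions prev+1 .. running of the expanded list.
        if prev < low_pos <= running or prev < high_pos <= running:
            picked.append(number)
    return sum(picked) / len(picked)


numbers = [(0, 7), (1, 1), (2, 3), (3, 1)]
median = median_from_frequencies(numbers)
```

For the sample table the total is 12, so the middle positions are 6 and 7. Both fall inside the range of the number 0 (positions 1 to 7), so only 0 is picked and the median is 0.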

MySQL self join with limited hierarchy

Hi, I have a self-joining MySQL table that I am using for comments and replies.
CREATE TABLE comments (id INT, parent_id INT, comment VARCHAR(50));
INSERT INTO comments VALUES
(1, 0, 'comment 1' ),
(2, 0, 'comment 2' ),
(3, 0, 'comment 3' ),
(4, 1, 'comment 1 - reply 1' ),
(5, 0, 'comment 4' ),
(6, 3, 'comment 3 - reply 1' ),
(7, 1, 'comment 1 - reply 2' ),
(8, 0, 'comment 5' );
There is only ever one level of replies. That is, a reply can only ever be associated with a top level comment (where parent_id = 0).
I am using the following query to show each top-level comment (where parent_id = 0) followed by its associated replies.
SELECT *
FROM comments
ORDER BY IF(parent_id = 0, id, parent_id) desc , parent_id != 0, id desc
Output:
id  parent_id  comment
-------------------------
8   0          comment 5
5   0          comment 4
3   0          comment 3
6   3          comment 3 - reply 1
2   0          comment 2
1   0          comment 1
7   1          comment 1 - reply 2
4   1          comment 1 - reply 1
The current query is working well for what I need.
My question is: how can I limit the number of replies for each comment? E.g. show the latest 50 top-level comments with a maximum of 2 replies for each comment.
Here is a SqlFiddle if it helps
Try this:
EDIT:
SELECT pc.id,
       pc.parent_id,
       pc.comment
FROM (
    SELECT id,
           parent_id,
           comment,
           @parentRank := @parentRank + 1 AS rank
    FROM comments,
         (SELECT @parentRank := 0) pcr
    WHERE parent_id = 0
    ORDER BY id DESC
) pc
WHERE pc.rank <= 5
UNION
SELECT cc.id,
       cc.parent_id,
       cc.comment
FROM (
    SELECT id,
           parent_id,
           comment,
           @childRank := if(@current_parent_id = parent_id, @childRank + 1, 1) AS rank,
           @current_parent_id := parent_id
    FROM comments,
         (SELECT @childRank := 0) cr
    WHERE parent_id in (
        SELECT id
        FROM (
            SELECT id,
                   @parentRank := @parentRank + 1 AS rank
            FROM comments,
                 (SELECT @parentRank := 0) pcr
            WHERE parent_id = 0
            ORDER BY id DESC
        ) pc
        WHERE pc.rank <= 5
    )
    ORDER BY parent_id DESC,
             id DESC
) cc
WHERE cc.rank <= 1
ORDER BY IF(parent_id = 0, id, parent_id) desc , parent_id != 0, id desc
I made a demo on SQL Fiddle.
The first argument of LIMIT (the offset) controls the number of replies:
SELECT *,
(SELECT COUNT(id) FROM comments r where r.parent_id = c.id) AS number_of_replies
FROM comments c
WHERE IFNULL((SELECT e.id FROM comments e WHERE e.parent_id != 0 AND
e.parent_id = c.parent_id ORDER BY e.id DESC LIMIT 2, 1), 0) < c.id
ORDER BY IF(parent_id = 0, id, parent_id) desc , parent_id != 0, id desc
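The goal of both answers (the newest N top-level comments, each followed by at most M of its newest replies) can be sketched in Python as follows. This is only an illustration of the intended result using the question's sample rows; the function name `limited_thread` and its parameters are made up for the example.

```python
from collections import defaultdict

# (id, parent_id, comment), taken from the question's sample data.
comments = [
    (1, 0, "comment 1"),
    (2, 0, "comment 2"),
    (3, 0, "comment 3"),
    (4, 1, "comment 1 - reply 1"),
    (5, 0, "comment 4"),
    (6, 3, "comment 3 - reply 1"),
    (7, 1, "comment 1 - reply 2"),
    (8, 0, "comment 5"),
]


def limited_thread(comments, max_parents, max_replies):
    """Newest top-level comments first, each followed by its newest replies."""
    parents = sorted(
        (c for c in comments if c[1] == 0), key=lambda c: c[0], reverse=True
    )[:max_parents]
    # Bucket the replies by the id of the comment they answer.
    replies = defaultdict(list)
    for c in comments:
        if c[1] != 0:
            replies[c[1]].append(c)
    page = []
    for parent in parents:
        page.append(parent)
        newest_first = sorted(replies[parent[0]], key=lambda c: c[0], reverse=True)
        page.extend(newest_first[:max_replies])
    return page


page = limited_thread(comments, max_parents=50, max_replies=1)
```

With `max_replies=1`, comment 1 keeps only its newest reply (id 7), so the ids come out as 8, 5, 3, 6, 2, 1, 7.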

Count consecutive rows with a particular status

I need to check whether a user has three consecutive failed login attempts within the last hour.
For example
id  userid  status  logindate
1   1       0       2014-08-28 10:00:00
2   1       1       2014-08-28 10:10:35
3   1       0       2014-08-28 10:30:00
4   1       0       2014-08-28 10:40:00
In the above example, status 0 means a failed attempt and 1 means a successful attempt.
I need a query that counts three consecutive records with status 0 for a user within the last hour.
I tried the query below:
SELECT COUNT( * ) AS total, Temp.status
FROM (
SELECT a.status, MAX( a.id ) AS idlimit
FROM loginAttempts a
GROUP BY a.status
ORDER BY MAX( a.id ) DESC
) AS Temp
JOIN loginAttempts t ON Temp.idlimit < t.id
HAVING total >1
Result:
total status
2 1
I don't know why it displays the status as 1. I also need to add a WHERE condition on the logindate and status fields, but I don't know how that would work.
To count consecutive rows you can use user-defined variables to track the runs. The query below uses two variables, @g and @r. In the inner query, @g holds the status of the previous row (0 or 1), and the CASE expression compares it with the status of the current row. If they are equal, @r is left unchanged; if they differ (@g <> a.status), @r is incremented by 1. As a result, every row in a run of identical statuses gets the same @r value: for example, if @r is 3 for the first row with status 1, it stays 3 until the status changes to 0, and the same holds for runs of 0. Note that the rows are ordered by the id column, which I assume is set to auto_increment.
SELECT t.userid,t.consecutive,t.status,COUNT(1) consecutive_count
FROM (
    SELECT a.* ,
           @r:= CASE WHEN @g = a.status THEN @r ELSE @r + 1 END consecutive,
           @g:= a.status g
    FROM attempts a
    CROSS JOIN (SELECT @g:=2, @r:=0) t1
    WHERE a.`logindate` BETWEEN '2014-08-28 10:00:00' AND '2014-08-28 11:00:00'
    ORDER BY id
) t
GROUP BY t.userid,t.consecutive,t.status
HAVING consecutive_count >= 3 AND t.status = 0
In the outer query I group the results by userid, by the value of the CASE expression (which I have named consecutive), and by status, to get the count of each run of identical statuses per user.
One thing to note about the query above: it is necessary to provide the hour range, as I have done with BETWEEN. Without it, finding exactly 3 consecutive statuses within an hour is more difficult.
Sample data
INSERT INTO attempts
(`id`, `userid`, `status`, `logindate`)
VALUES
(1, 1, 0, '2014-08-28 10:00:00'),
(2, 1, 1, '2014-08-28 10:10:35'),
(3, 1, 0, '2014-08-28 10:30:00'),
(4, 1, 0, '2014-08-28 10:40:00'),
(5, 1, 0, '2014-08-28 10:50:00'),
(6, 2, 0, '2014-08-28 10:00:00'),
(7, 2, 0, '2014-08-28 10:10:35'),
(8, 2, 0, '2014-08-28 10:30:00'),
(9, 2, 1, '2014-08-28 10:40:00'),
(10, 2, 1, '2014-08-28 10:50:00')
;
As you can see, ids 3 to 5 are consecutive 0s for userid 1, and ids 6 to 8 are consecutive 0s for userid 2, all within the one-hour range. The query above returns the following:
userid  consecutive  status  consecutive_count
------  -----------  ------  -----------------
1       2            0       3
2       2            0       3
Fiddle Demo
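The run-tracking idea can also be sketched in Python, which may make the role of the two variables clearer. This is an illustration under assumptions, not part of the answer: the function name `failed_streaks` is made up, and unlike the query (which orders by id only) it sorts by user first, so that interleaved attempts from different users would not break a streak.

```python
from datetime import datetime
from itertools import groupby

# (id, userid, status, logindate), same rows as the answer's sample data.
attempts = [
    (1, 1, 0, "2014-08-28 10:00:00"),
    (2, 1, 1, "2014-08-28 10:10:35"),
    (3, 1, 0, "2014-08-28 10:30:00"),
    (4, 1, 0, "2014-08-28 10:40:00"),
    (5, 1, 0, "2014-08-28 10:50:00"),
    (6, 2, 0, "2014-08-28 10:00:00"),
    (7, 2, 0, "2014-08-28 10:10:35"),
    (8, 2, 0, "2014-08-28 10:30:00"),
    (9, 2, 1, "2014-08-28 10:40:00"),
    (10, 2, 1, "2014-08-28 10:50:00"),
]


def failed_streaks(attempts, start, end, min_streak=3):
    """Map userid -> longest run of failed attempts (status 0) of at least min_streak."""
    lo, hi = datetime.fromisoformat(start), datetime.fromisoformat(end)
    in_window = [a for a in attempts if lo <= datetime.fromisoformat(a[3]) <= hi]
    in_window.sort(key=lambda a: (a[1], a[0]))  # by user, then by id
    longest = {}
    # groupby starts a new run whenever the (userid, status) pair changes,
    # which is what incrementing @r on a status change achieves in the query.
    for (userid, status), run in groupby(in_window, key=lambda a: (a[1], a[2])):
        length = sum(1 for _ in run)
        if status == 0 and length >= min_streak:
            longest[userid] = max(longest.get(userid, 0), length)
    return longest


streaks = failed_streaks(attempts, "2014-08-28 10:00:00", "2014-08-28 11:00:00")
```

For the sample data both users have one run of three failures inside the window, matching the two rows of the result above.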
M Khalid Junaid's answer is great, but his Fiddle Demo didn't work for me when I clicked it.
Here is a Fiddle Demo which works as of this writing.
In case it doesn't work later, I used the following in the schema:
CREATE TABLE attempts
(`id` int, `userid` int, `status` int, `logindate` datetime);
INSERT INTO attempts
(`id`, `userid`, `status`, `logindate`)
VALUES
(1, 1, 0, '2014-08-28 10:00:00'),
(2, 1, 1, '2014-08-28 10:10:35'),
(3, 1, 0, '2014-08-28 10:30:00'),
(4, 1, 0, '2014-08-28 10:40:00'),
(5, 1, 0, '2014-08-28 10:50:00'),
(6, 2, 0, '2014-08-28 10:00:00'),
(7, 2, 0, '2014-08-28 10:10:35'),
(8, 2, 0, '2014-08-28 10:30:00'),
(9, 2, 1, '2014-08-28 10:40:00'),
(10, 2, 1, '2014-08-28 10:50:00')
;
And this as the query:
SELECT t.userid,t.consecutive,t.status,COUNT(1) consecutive_count
FROM (
    SELECT a.* ,
           @r:= CASE WHEN @g = a.status THEN @r ELSE @r + 1 END consecutive,
           @g:= a.status g
    FROM attempts a
    CROSS JOIN (SELECT @g:=2, @r:=0) t1
    WHERE a.`logindate` BETWEEN '2014-08-28 10:00:00' AND '2014-08-28 11:00:00'
    ORDER BY id
) t
GROUP BY t.userid,t.consecutive,t.status
HAVING consecutive_count >= 3 AND t.status = 0;

Count occurrences that differ within a column

I want to select the number of times the data in columns Somedata_A and Somedata_B has changed from the previous row within its column. I've tried using DISTINCT and it works to some degree: {1,2,3,2,1,1} shows 3 when I want it to show 4, because the sequence changes value 4 times (1, 2, 3, 2, 1).
Example:
A,B,C,D,E,F
{1,2,3,2,1,1}
Comparing A to B gives a difference, comparing B to C gives a difference, and so on; comparing E to F gives no difference. In all, that is 4 differences within a set of 6 values.
I have gotten DISTINCT to work, but it does not really do the trick for me. In addition, I'm not really interested in the whole range, only in, say, the last 2 days/entries per Title.
Second, I'm concerned about performance. I tried the query below on a real set of data and it was interrupted, probably due to a timeout.
SQL Fiddle
MySQL 5.5.32 Schema Setup:
CREATE TABLE testdata(
Title varchar(10),
Date varchar(10),
Somedata_A int(5),
Somedata_B int(5));
INSERT INTO testdata (Title, Date, Somedata_A, Somedata_B) VALUES
("Alpha", '123', 1, 2),
("Alpha", '234', 2, 2),
("Alpha", '345', 1, 2),
("Alpha", '349', 1, 2),
("Alpha", '456', 1, 2),
("Omega", '123', 1, 1),
("Omega", '234', 2, 2),
("Omega", '345', 3, 3),
("Omega", '349', 4, 3),
("Omega", '456', 5, 4),
("Delta", '123', 1, 1),
("Delta", '234', 2, 2),
("Delta", '345', 1, 3),
("Delta", '349', 2, 3),
("Delta", '456', 1, 4);
Query 1:
SELECT t.Title, (SELECT COUNT(DISTINCT Somedata_A) FROM testdata AS tt WHERE t.Title = tt.Title) AS A,
(SELECT COUNT(DISTINCT Somedata_B) FROM testdata AS tt WHERE t.Title = tt.Title) AS B
FROM testdata AS t
GROUP BY t.Title
Results:
| TITLE | A | B |
|-------|---|---|
| Alpha | 2 | 1 |
| Delta | 2 | 4 |
| Omega | 5 | 4 |
Something like this may work: it uses a variable for row number, joins on an offset of 1 and then counts differences for A and B.
http://sqlfiddle.com/#!2/3bbc8/9/2
set @i = 0;
set @j = 0;
Select
    A.Title aTitle,
    sum(Case when A.SomeData_A <> B.SomeData_A then 1 else 0 end) AVar,
    sum(Case when A.SomeData_B <> B.SomeData_B then 1 else 0 end) BVar
from
    (SELECT Title, @i:=@i+1 as ROWID, SomeData_A, SomeData_B
     FROM testdata
     ORDER BY Title, date desc) as A
INNER JOIN
    (SELECT Title, @j:=@j+1 as ROWID, SomeData_A, SomeData_B
     FROM testdata
     ORDER BY Title, date desc) as B
    ON A.RowID= B.RowID + 1
    AND A.Title=B.Title
Group by A.Title
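The "join each row to its neighbour and count the differences" idea can be sketched in Python like this. It is only an illustration on the question's test data; the helper names `count_changes` and `changes_per_title` are made up, and the rows are ordered by Date ascending within each Title (the direction does not affect the number of changes).

```python
from collections import defaultdict

# (Title, Date, Somedata_A, Somedata_B), from the question's schema setup.
testdata = [
    ("Alpha", "123", 1, 2), ("Alpha", "234", 2, 2), ("Alpha", "345", 1, 2),
    ("Alpha", "349", 1, 2), ("Alpha", "456", 1, 2),
    ("Omega", "123", 1, 1), ("Omega", "234", 2, 2), ("Omega", "345", 3, 3),
    ("Omega", "349", 4, 3), ("Omega", "456", 5, 4),
    ("Delta", "123", 1, 1), ("Delta", "234", 2, 2), ("Delta", "345", 1, 3),
    ("Delta", "349", 2, 3), ("Delta", "456", 1, 4),
]


def count_changes(values):
    """Number of positions where a value differs from the one before it."""
    return sum(1 for prev, cur in zip(values, values[1:]) if prev != cur)


def changes_per_title(rows):
    """Map Title -> (changes in Somedata_A, changes in Somedata_B)."""
    by_title = defaultdict(list)
    for title, _date, a, b in sorted(rows, key=lambda r: (r[0], r[1])):
        by_title[title].append((a, b))
    return {
        title: (
            count_changes([a for a, _ in vals]),
            count_changes([b for _, b in vals]),
        )
        for title, vals in by_title.items()
    }


changes = changes_per_title(testdata)
```

`count_changes([1, 2, 3, 2, 1, 1])` gives 4, the figure the question asks for, whereas counting distinct values would give 3.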
This works (see here) (FYI: Your results in the question do not match your data - for instance, for Alpha, ColumnA: it never changes from 1. The answer should be 0)
Hopefully you can adapt this Statement to your actual data model
SELECT t1.title, SUM(t1.Somedata_A<>t2.Somedata_a) as SomeData_A
,SUM(t1.Somedata_b<>t2.Somedata_b) as SomeData_B
FROM testdata AS t1
JOIN testdata AS t2
ON t1.title = t2.title
AND t2.date = DATE_ADD(t1.date, INTERVAL 1 DAY)
GROUP BY t1.title
ORDER BY t1.title;

MySQL Count frequency of records

Table:
laterecords
-----------
studentid - varchar
latetime - datetime
reason - varchar
students
--------
studentid - varchar -- Primary
class - varchar
I would like to do a query to show the following:
Sample Report
Class    No of Students late  1 times  2 times  3 times  4 times  5 & more
Class A  3                    1        0        2        0        0
Class B  1                    0        1        0        0        0
My query below can show the first column results:
SELECT count(DISTINCT students.studentid), class FROM laterecords, students
WHERE students.studentid=laterecords.studentid
GROUP BY class
I can only think of getting the results for each column, storing them in PHP arrays, and then echoing them into an HTML table.
Is there a better way to do this in SQL? How should I write the MySQL query?
Try this:
SELECT
a.class,
COUNT(b.studentid) AS 'No of Students late',
SUM(b.onetime) AS '1 times',
SUM(b.twotime) AS '2 times',
SUM(b.threetime) AS '3 times',
SUM(b.fourtime) AS '4 times',
SUM(b.fiveormore) AS '5 & more'
FROM
students a
LEFT JOIN
(
SELECT
aa.studentid,
IF(COUNT(*) = 1, 1, 0) AS onetime,
IF(COUNT(*) = 2, 1, 0) AS twotime,
IF(COUNT(*) = 3, 1, 0) AS threetime,
IF(COUNT(*) = 4, 1, 0) AS fourtime,
IF(COUNT(*) >= 5, 1, 0) AS fiveormore
FROM
students aa
INNER JOIN
laterecords bb ON aa.studentid = bb.studentid
GROUP BY
aa.studentid
) b ON a.studentid = b.studentid
GROUP BY
a.class
How about :
SELECT numlates, `class`, count(numlates)
FROM
(SELECT count(laterecords.studentid) AS numlates, `class`, laterecords.studentid
FROM laterecords,
students
WHERE students.studentid=laterecords.studentid
GROUP BY laterecords.studentid, `class`) aliastbl
GROUP BY `class`, numlates
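Both answers work in two steps: first count the late records per student, then count students per class and per late count. A small Python sketch of that two-step aggregation is shown below. The student IDs and the data are hypothetical, chosen to reproduce the sample report in the question, and the function name `late_report` is made up.

```python
from collections import Counter, defaultdict

# Hypothetical data: studentid -> class, and one studentid per late record.
students = {"s1": "Class A", "s2": "Class A", "s3": "Class A",
            "s4": "Class B", "s5": "Class B"}
late_student_ids = ["s1", "s2", "s2", "s2", "s3", "s3", "s3", "s4", "s4"]


def late_report(students, late_student_ids):
    """Map class -> number of late students, bucketed by how often they were late."""
    lates_per_student = Counter(late_student_ids)  # step 1: count per student
    report = defaultdict(
        lambda: {"students_late": 0, "1": 0, "2": 0, "3": 0, "4": 0, "5+": 0}
    )
    for studentid, count in lates_per_student.items():  # step 2: bucket per class
        row = report[students[studentid]]
        row["students_late"] += 1
        row[str(count) if count < 5 else "5+"] += 1
    return dict(report)


report = late_report(students, late_student_ids)
```

Here s5 was never late, so it does not appear in any bucket, and Class B ends up with one late student in the "2" bucket.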