I have a table like this:
| id | col1 | col2 | col3 |
| --- | ---- | ---- | ---- |
| 10 | 1 | | 3 |
| 9 | 1 | 2 | 3 |
| 8 | | 2 | 3 |
| 7 | | 2 | 3 |
| 6 | 1 | 2 | |
| 5 | | | 3 |
Each column holds a single fixed value or NULL, e.g. col1 has 1 or empty and col2 has 2 or empty.
For each row, I'd like to count the values that repeat in the next row, comparing only two successive rows at a time.
The result would look like this:
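| id | col1 | col2 | col3 | Count |
| --- | ---- | ---- | ---- | ----- |
| 10 | 1 | | 3 | 2 (shows the repeating values between the id 10 and id 9 rows) |
| 9 | 1 | 2 | 3 | 2 (shows the repeating values between the id 9 and id 8 rows) |
| 8 | | 2 | 3 | 1 |
| 7 | | 2 | | 1 |
| 6 | 1 | 2 | | 0 |
| 5 | | | 3 | |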
I googled and tried some queries I found on the web but couldn't get the right result. Thanks in advance for your help.
To further clarify with an example:
the id 10 row has (1, , 3) and the id 9 row has (1, 2, 3), so two values repeat and the count is 2.
If the ids are consecutive and there are no gaps, you can do it with a self join:
select
t.*,
coalesce((t.col1 = tt.col1), 0) +
coalesce((t.col2 = tt.col2), 0) +
coalesce((t.col3 = tt.col3), 0) count
from tablename t left join tablename tt
on tt.id = t.id - 1
Results:
| id | col1 | col2 | col3 | count |
| --- | ---- | ---- | ---- | ----- |
| 10 | 1 | | 3 | 2 |
| 9 | 1 | 2 | 3 | 2 |
| 8 | | 2 | 3 | 1 |
| 7 | | 2 | | 1 |
| 6 | 1 | 2 | | 0 |
| 5 | | | 3 | 0 |
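As a rough check of the self-join idea, here is a minimal Python sketch that runs the same kind of query against an in-memory SQLite database. SQLite is only a stand-in for MySQL here (an assumption), the alias `cnt` is illustrative, and the rows are taken from the results table above.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename (id INTEGER PRIMARY KEY, col1 INT, col2 INT, col3 INT)"
)
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?, ?)",
    [
        (10, 1, None, 3),
        (9, 1, 2, 3),
        (8, None, 2, 3),
        (7, None, 2, None),
        (6, 1, 2, None),
        (5, None, None, 3),
    ],
)

# Each comparison yields 1, 0 or NULL; COALESCE turns NULL into 0
# so that empty columns never count as a match.
query = """
SELECT t.id,
       COALESCE(t.col1 = tt.col1, 0)
     + COALESCE(t.col2 = tt.col2, 0)
     + COALESCE(t.col3 = tt.col3, 0) AS cnt
FROM tablename t
LEFT JOIN tablename tt ON tt.id = t.id - 1
ORDER BY t.id DESC
"""
counts = conn.execute(query).fetchall()
print(counts)
```

The LEFT JOIN keeps the row with the lowest id, which has no partner row and therefore gets a count of 0.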
And if there are gaps...
SELECT a.id
, a.col1
, a.col2
, a.col3
, COALESCE(a.col1 = b.col1,0) + COALESCE(a.col2 = b.col2,0) + COALESCE(a.col3 = b.col3,0) n
FROM
( SELECT x.*
, MIN(y.id) y_id
FROM my_table x
JOIN my_table y
ON y.id > x.id
GROUP
BY x.id
) a
LEFT
JOIN my_table b
ON b.id = a.y_id;
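A minimal sketch of the gap-tolerant idea, again using Python with an in-memory SQLite database as a stand-in for MySQL (an assumption). The ids 10, 7 and 4 are hypothetical sample data. This version pairs each row with the nearest lower id, the direction used in the question; swapping `MAX` and `<` for `MIN` and `>` pairs each row with the nearest higher id instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, col1 INT, col2 INT, col3 INT)"
)
# Hypothetical rows with gaps between the ids.
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [(10, 1, None, 3), (7, 1, 2, 3), (4, None, 2, 3)],
)

# The correlated subquery finds the closest id below the current one,
# so the join no longer depends on ids being consecutive.
query = """
SELECT a.id,
       COALESCE(a.col1 = b.col1, 0)
     + COALESCE(a.col2 = b.col2, 0)
     + COALESCE(a.col3 = b.col3, 0) AS n
FROM my_table a
LEFT JOIN my_table b
       ON b.id = (SELECT MAX(y.id) FROM my_table y WHERE y.id < a.id)
ORDER BY a.id DESC
"""
gap_counts = conn.execute(query).fetchall()
print(gap_counts)
```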
Were you to restructure your schema, then you could do something like this instead...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(id INT NOT NULL
,val INT NOT NULL
,PRIMARY KEY(id,val)
);
INSERT INTO my_table VALUES
(10,1),
(10,3),
( 9,1),
( 9,2),
( 9,3),
( 8,2),
( 8,3),
( 7,2),
( 7,3),
( 6,1),
( 6,2),
( 5,3);
SELECT a.id
, COUNT(b.id) total
FROM
( SELECT x.*
, MIN(y.id) next
FROM my_table x
JOIN my_table y
ON y.id > x.id
GROUP
BY x.id
, x.val
) a
LEFT
JOIN my_table b
ON b.id = a.next
AND b.val = a.val
GROUP
BY a.id;
+----+-------+
| id | total |
+----+-------+
| 5 | 0 |
| 6 | 1 |
| 7 | 2 |
| 8 | 2 |
| 9 | 2 |
+----+-------+
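If the data currently lives in the wide col1/col2/col3 layout, one possible way to populate the (id, val) layout is a UNION ALL over the three columns. The sketch below uses Python with in-memory SQLite as a stand-in for MySQL; the name `wide_table` and the three sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical wide table in the original layout.
    CREATE TABLE wide_table (id INTEGER PRIMARY KEY, col1 INT, col2 INT, col3 INT);
    INSERT INTO wide_table VALUES (10, 1, NULL, 3), (9, 1, 2, 3), (8, NULL, 2, 3);

    -- Normalized layout: one row per (id, value) pair.
    CREATE TABLE my_table (id INT NOT NULL, val INT NOT NULL, PRIMARY KEY (id, val));

    -- Unpivot each column, skipping the empty (NULL) cells.
    INSERT INTO my_table (id, val)
    SELECT id, col1 FROM wide_table WHERE col1 IS NOT NULL
    UNION ALL
    SELECT id, col2 FROM wide_table WHERE col2 IS NOT NULL
    UNION ALL
    SELECT id, col3 FROM wide_table WHERE col3 IS NOT NULL;
    """
)
normalized = conn.execute(
    "SELECT id, val FROM my_table ORDER BY id DESC, val"
).fetchall()
print(normalized)
```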
You can use:
select t1_ID, t1_col1,t1_col2,t1_col3, count
from
(
select t1.id as t1_ID, t1.col1 as t1_col1,t1.col2 as t1_col2,t1.col3 as t1_col3, t2.*,
case when t1.col1 = t2.col1 then 1 else 0 end +
case when t1.col2 = t2.col2 then 1 else 0 end +
case when t1.col3 = t2.col3 then 1 else 0 end as count
from tab t1
left join tab t2
on t1.id = t2.id + 1
order by t1.id
) t3
order by t1_ID desc;
If there are gaps between id values, you could use user-defined variables to assign explicit row numbers to the rows in the order the table returns them. The rest of the logic is the same as in the earlier answers: join each row number to the next row number to get that row's col1, col2 and col3 values, and use COALESCE when computing the count.
select derived_1.*,
coalesce((derived_1.col1 = derived_2.col1), 0) +
coalesce((derived_1.col2 = derived_2.col2), 0) +
coalesce((derived_1.col3 = derived_2.col3), 0) count
from (
select @row := @row + 1 as row_number,t1.*
from tablename t1,(select @row := 0) d1
) derived_1
left join (
select *
from (
select @row2 := @row2 + 1 as row_number,t2.*
from tablename t2,(select @row2 := 0) d2
) d3
) derived_2
on derived_1.row_number + 1 = derived_2.row_number;
Demo: https://www.db-fiddle.com/f/wAzb67zSEfbZKg5RywQvC8/1
Related
I have a table T that looks like this, where id is the unique auto-increment primary key and the Difference column defaults to 0. In each id_str group, I want to UPDATE only the row with the largest id, setting Difference to its Value minus the Value of the row with the second-largest id, while the rest remain unchanged.
id_str id Value Difference
2380 1 21.01 0
2380 3 22.04 0
2380 5 22.65 0
2380 8 23.11 0
2380 10 35.21 0
20100 2 37.07 0
20100 4 38.17 0
20100 6 38.97 0
20103 7 57.98 0
20103 9 60.83 0
The result I want is:
id_str id Value Difference
2380 1 21.01 0
2380 3 22.04 0
2380 5 22.65 0
2380 8 23.11 0
2380 10 35.21 12.1
20100 2 37.07 0
20100 4 38.17 0
20100 6 38.97 0.8
20103 7 57.98 0
20103 9 60.83 2.85
How can I write the query?
This should do the trick in MySQL.
CREATE TABLE SomeTable
( id_str VARCHAR(10),
id INTEGER,
value_ DECIMAL(7,5),
difference DECIMAL(7,5)
);
INSERT INTO SomeTable VALUES(2380,1,21.01,0);
INSERT INTO SomeTable VALUES(2380,3,22.04,0);
INSERT INTO SomeTable VALUES(2380,5,22.65,0);
INSERT INTO SomeTable VALUES(2380,8,23.11,0);
INSERT INTO SomeTable VALUES(2380,10,35.21,0);
INSERT INTO SomeTable VALUES(20100,2,37.07,0);
INSERT INTO SomeTable VALUES(20100,4,38.17,0);
INSERT INTO SomeTable VALUES(20100,6,38.97,0);
INSERT INTO SomeTable VALUES(20103,7,57.98,0);
INSERT INTO SomeTable VALUES(20103,9,60.83,0);
UPDATE SomeTable,
(SELECT T1.id AS id_updt,
T1.value_ - T2.value_ AS diff_updt
FROM (SELECT id_str,
id,
value_,
(
CASE id_str
WHEN @curStr THEN @curRow := @curRow + 1
ELSE @curRow := 1
AND @curStr := id_str
END
) AS rnk
FROM SomeTable,
(SELECT @curRow := 0, @curStr := '') r
ORDER
BY id_str DESC,
id DESC
) AS T1
INNER
JOIN (SELECT id_str,
id,
value_,
(
CASE id_str
WHEN @curStr THEN @curRow := @curRow + 1
ELSE @curRow := 1
AND @curStr := id_str
END
) AS rnk
FROM SomeTable,
(SELECT @curRow := 0, @curStr := '') r
ORDER
BY id_str DESC,
id DESC
) AS T2
ON T1.id_str = T2.id_str
AND T1.rnk = 1
AND T2.rnk = 2
) AS UPDT
SET SomeTable.difference = UPDT.diff_updt
WHERE SomeTable.id = UPDT.id_updt;
Deprecated solution - This will work for a DBMS that supports the rank function.
UPDATE SomeTable
FROM ( SELECT RNK1.id AS id_updt,
RNK1.value_ - RNK2.value_ AS diff_updt
FROM (SELECT id_str,
id,
value_,
RANK() OVER
( PARTITION BY id_str
ORDER BY id DESC
) AS id_rnk
FROM SomeTable
) AS RNK1
INNER
JOIN (SELECT id_str,
value_,
RANK() OVER
( PARTITION BY id_str
ORDER BY id DESC
) - 1 AS id_rnk_decrement
FROM SomeTable
) AS RNK2
ON RNK1.id_str = RNK2.id_str
AND RNK1.id_rnk = RNK2.id_rnk_decrement
WHERE RNK1.id_rnk = 1
) AS UPDT
SET SomeTable.difference = UPDT.diff_updt
WHERE SomeTable.id = UPDT.id_updt;
You can find the two greatest ids per group with the following query:
select t1.id_str, max(t1.id) as id1, (
select max(t2.id)
from mytable t2
where t2.id_str = t1.id_str
and t2.id < max(t1.id)
) as id2
from mytable t1
group by t1.id_str;
Result:
| id_str | id1 | id2 |
|--------|-----|-----|
| 2380 | 10 | 8 |
| 20100 | 6 | 4 |
| 20103 | 9 | 7 |
Use it as subquery in your update statement:
update mytable u
join (
select t1.id_str, max(t1.id) as id1, (
select max(t2.id)
from mytable t2
where t2.id_str = t1.id_str
and t2.id < max(t1.id)
) as id2
from mytable t1
group by t1.id_str
) t on t.id1 = u.id
join mytable t1 on t1.id = t.id1
join mytable t2 on t2.id = t.id2
set u.Difference = t1.Value - t2.Value;
The table will now contain:
| id_str | id | Value | Difference |
|--------|----|-------|------------|
| 2380 | 1 | 21.01 | 0 |
| 2380 | 3 | 22.04 | 0 |
| 2380 | 5 | 22.65 | 0 |
| 2380 | 8 | 23.11 | 0 |
| 2380 | 10 | 35.21 | 12.1 |
| 20100 | 2 | 37.07 | 0 |
| 20100 | 4 | 38.17 | 0 |
| 20100 | 6 | 38.97 | 0.8 |
| 20103 | 7 | 57.98 | 0 |
| 20103 | 9 | 60.83 | 2.85 |
http://rextester.com/CCO40873
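For comparison, the same "largest id minus second-largest id per group" update can be sketched on the application side. This is a minimal Python example against an in-memory SQLite database (a stand-in for MySQL, which is an assumption); it reads the rows, picks the two largest ids per id_str in Python, and writes one UPDATE per group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id_str TEXT, id INTEGER PRIMARY KEY, "
    "Value REAL, Difference REAL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO mytable (id_str, id, Value) VALUES (?, ?, ?)",
    [
        ("2380", 1, 21.01), ("2380", 3, 22.04), ("2380", 5, 22.65),
        ("2380", 8, 23.11), ("2380", 10, 35.21),
        ("20100", 2, 37.07), ("20100", 4, 38.17), ("20100", 6, 38.97),
        ("20103", 7, 57.98), ("20103", 9, 60.83),
    ],
)

# Walk the rows newest-first within each group and keep the first two.
top_two = {}
for id_str, row_id, value in conn.execute(
    "SELECT id_str, id, Value FROM mytable ORDER BY id_str, id DESC"
):
    pair = top_two.setdefault(id_str, [])
    if len(pair) < 2:
        pair.append((row_id, value))

# Groups with a single row are skipped; rounding hides float noise.
updates = [
    (round(pair[0][1] - pair[1][1], 2), pair[0][0])
    for pair in top_two.values()
    if len(pair) == 2
]
conn.executemany("UPDATE mytable SET Difference = ? WHERE id = ?", updates)

differences = dict(conn.execute("SELECT id, Difference FROM mytable"))
print(differences)
```

Doing the arithmetic in SQL, as in the answers above, avoids the round trip; this version is mainly useful for checking the expected numbers.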
How can I achieve this with joins and GROUP BY, or any other alternative?
Tab 1:
id | data
1 | aaa
2 | bbb
3 | ccc
tab 2:
id | tab1ID | status
101 | 1 | Y
102 | 2 | Y
103 | 1 | X
104 | 2 | X
105 | 3 | X
106 | 1 | Z
107 | 2 | Z
required output:
id | data | status
1 | aaa | Z
2 | bbb | Z
3 | ccc | X
The record with the highest-priority status has to appear in the result, where Z > Y > X.
I want to avoid creating a separate table to store the priority order.
Edit 1: change in sample data
First, assign a row number to each row of the second table, restarting for each tab1ID and ordering by the priority of status. Then join it with the first table to get the columns id and data, and select only the rows whose row number is 1.
Query
select t1.`id`, t1.`data`, t2.`status`
from `tab1` t1
left join(
select `id`, `tab1ID`, `status`,
(
case `tab1ID` when @curA
then @curRow := @curRow + 1
else @curRow := 1 and @curA := `tab1ID` end
) as rn
from `tab2`,
(select @curRow := 0, @curA := '') r
order by `tab1ID`, case `status` when 'Z' then 1
when 'Y' then 2 when 'X' then 3 else 4 end
)t2
on t1.`id` = t2.`tab1ID`
where t2.rn = 1;
If you want the most recent status, then one method is a correlated subquery:
select t1.*,
(select t2.status
from tab2 t2
where t2.tab1id = t1.id
order by t2.id desc
limit 1
) as status
from tab1 t1;
EDIT:
If you just want the highest status, use JOIN and GROUP BY:
select t1.*, max(t2.status)
from tab1 t1 left join
tab2 t2
on t2.tab1id = t1.id
group by t1.id;
Note: The use of select t1.* is permitted and even supported by the ANSI standard, assuming that t1.id is unique (a reasonable assumption).
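The `max(t2.status)` version works because Z, Y and X happen to sort alphabetically in priority order. When that is not the case, one option is to map each status to a number with CASE, take the maximum, and map it back. The sketch below assumes in-memory SQLite as a stand-in for MySQL and uses the sample data from the question; the numeric priorities 3, 2, 1 are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab1 (id INTEGER PRIMARY KEY, data TEXT)")
conn.execute("CREATE TABLE tab2 (id INTEGER PRIMARY KEY, tab1ID INT, status TEXT)")
conn.executemany("INSERT INTO tab1 VALUES (?, ?)", [(1, "aaa"), (2, "bbb"), (3, "ccc")])
conn.executemany(
    "INSERT INTO tab2 VALUES (?, ?, ?)",
    [
        (101, 1, "Y"), (102, 2, "Y"), (103, 1, "X"), (104, 2, "X"),
        (105, 3, "X"), (106, 1, "Z"), (107, 2, "Z"),
    ],
)

# Inner CASE: status -> numeric priority. Outer CASE: best priority -> status.
# No separate priority table is needed.
query = """
SELECT t1.id,
       t1.data,
       CASE MAX(CASE t2.status WHEN 'Z' THEN 3 WHEN 'Y' THEN 2 WHEN 'X' THEN 1 END)
            WHEN 3 THEN 'Z' WHEN 2 THEN 'Y' WHEN 1 THEN 'X'
       END AS status
FROM tab1 t1
LEFT JOIN tab2 t2 ON t2.tab1ID = t1.id
GROUP BY t1.id, t1.data
ORDER BY t1.id
"""
best_status = conn.execute(query).fetchall()
print(best_status)
```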
Example:
JOHN | 1 | 6 | 2
PETER | 1 | 7 | 6
MARK | 2 | 1 | 6
DIANNA | 3 | 2 | 1
SPIDERMAN | 4 | 1 | 6
JAMIE FOXX | 5 | 1 | 6
How can I write a SELECT that counts how many times each number is repeated across the 3 columns?
Example:
The number 1 is repeated 6 times.
The number 6 is repeated 5 times.
Assuming your number columns are c1, c2 and c3 and the table is t:
select c,count(*)
from ( select c1 as c from t
union all select c2 from t
union all select c3 from t
) t
group by c
;
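To see the UNION ALL approach on the sample rows, here is a small Python sketch with in-memory SQLite standing in for MySQL (an assumption). The column name `name` is hypothetical, since the question does not name the first column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, c1 INT, c2 INT, c3 INT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        ("JOHN", 1, 6, 2),
        ("PETER", 1, 7, 6),
        ("MARK", 2, 1, 6),
        ("DIANNA", 3, 2, 1),
        ("SPIDERMAN", 4, 1, 6),
        ("JAMIE FOXX", 5, 1, 6),
    ],
)

# Stack the three columns into one, then count each distinct number.
query = """
SELECT c, COUNT(*) AS times
FROM (SELECT c1 AS c FROM t
      UNION ALL SELECT c2 FROM t
      UNION ALL SELECT c3 FROM t) AS stacked
GROUP BY c
ORDER BY c
"""
number_counts = conn.execute(query).fetchall()
print(number_counts)
```

UNION ALL (rather than UNION) matters here, because UNION would remove the duplicates that are being counted.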
Assuming you are looking for the number 1,
one way is to use UNION ALL and SUM:
select sum(num) from
(
select count(*) as num
from my_table
where col1 = 1
union all
select count(*)
from my_table
where col2 = 1
union all
select count(*)
from my_table
where col3 = 1
) t
SELECT COUNT(CASE WHEN col1 = @number THEN 1 END) +
COUNT(CASE WHEN col2 = @number THEN 1 END) +
COUNT(CASE WHEN col3 = @number THEN 1 END) as repeat_count
FROM YourTable, (SELECT @number := 1) as parameter
I have a table
id value
1 a
2 a
3 b
4 b
5 b
6 c
id is the primary key.
In total there are 2 a, 3 b and 1 c. For each id, I want the total number of rows that share its value.
I want this format:
id value_count
1 2
2 2
3 3
4 3
5 3
6 1
Try this query:
SELECT a.id, b.valueCnt
FROM tableA a
INNER JOIN (SELECT a.value, COUNT(a.value) valueCnt
FROM tableA a GROUP BY a.value) AS b ON a.value = b.value;
OUTPUT
| ID | VALUECNT |
|----|----------|
| 1 | 2 |
| 2 | 2 |
| 3 | 3 |
| 4 | 3 |
| 5 | 3 |
| 6 | 1 |
Try This
select id, value_count from tablename as a1
join (select count(*) as value_count, value from tablename group by value) as a2
on a1.value= a2.value
I suggest you use a subselect without any joins:
SELECT
a.id,(SELECT COUNT(*) FROM tableA WHERE value = a.value) as valueCnt
FROM tableA a
You need to use subquery.
SELECT t.id , x.value_count
FROM `table` t
INNER JOIN
(SELECT t1.value, count(t1.id) as value_count
FROM `table` t1
Group by t1.value
) x on x.value = t.value
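The join-against-a-grouped-subquery pattern from these answers can be checked on the sample data with a short Python sketch. In-memory SQLite is used as a stand-in for MySQL (an assumption), and the alias `c` is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO tableA VALUES (?, ?)",
    [(1, "a"), (2, "a"), (3, "b"), (4, "b"), (5, "b"), (6, "c")],
)

# The subquery computes one count per distinct value;
# the join copies that count onto every row holding the value.
query = """
SELECT a.id, c.value_count
FROM tableA a
JOIN (SELECT value, COUNT(*) AS value_count
      FROM tableA
      GROUP BY value) c
  ON c.value = a.value
ORDER BY a.id
"""
value_counts = conn.execute(query).fetchall()
print(value_counts)
```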
I am using MySQL to build a rating system for my database. I want to rate each attribute with a percentage-style score calculated from its rank. Here is the example data:
| ID | VALUE1 | VALUE2 |
| --- | ------ | ------ |
| 2 | 5 | 20 |
| 4 | 5 | 30 |
| 1 | 3 | 5 |
| 3 | 2 | 8 |
Here is the ideal output I need:
| ID | VALUE1 | RANK1 | Score1 | VALUE2 | RANK2 | Score2 |
| --- | ------ | ----- | ------ | ------ | ----- | ------ |
| 2 | 5 | 1 | 10 | 20 | 2 | 8.3 |
| 4 | 5 | 1 | 10 | 30 | 1 | 10 |
| 1 | 3 | 2 | 7.5 | 5 | 4 | 5 |
| 3 | 2 | 3 | 5 | 8 | 3 | 6.6 |
The formula for score calculation is
5+5*(MaxRank-rank)/(MaxRank-MinRank)
How can I generate multiple rankings like in the table? I have tried:
SELECT
@min_rank := 1 AS min_rank
, @max_rank1 := (SELECT COUNT(DISTINCT value1) FROM table) AS max_rank1
, @max_rank2 := (SELECT COUNT(DISTINCT value2) FROM table) AS max_rank2
;
SELECT
ID
, R1
, TRUNCATE(5.0+5.0 * (@max_rank1 - R1) / (@max_rank1 - @min_rank), 2) AS Score1
, R2
, TRUNCATE(5.0+5.0 * (@max_rank2 - R2) / (@max_rank2 - @min_rank), 2) AS Score2
FROM (
SELECT
ID
, value1
, FIND_IN_SET( `value1`, (SELECT GROUP_CONCAT(DISTINCT `value1` ORDER BY `value1` DESC) FROM table)) AS R1
, value2
, FIND_IN_SET( `value2`, (SELECT GROUP_CONCAT(DISTINCT `value2` ORDER BY `value2` DESC) FROM table)) AS R2
FROM table
) ranked_table;
It works fine for ranks below 170. My database has approximately 200+ ranks for some values, and any rank larger than 170 comes back as 0. In that case, the scores for ranks > 170 are miscalculated. Thank you.
That looks nasty to calculate.
Something like this might do it
SELECT a.ID, a.VALUE1, Sub1.Rank1, (5.0+5.0 * (Sub3.MaxRank1 - Sub1.Rank1) / (Sub3.MaxRank1 - 1)) AS Score1, a.VALUE2, Sub2.Rank2, (5.0+5.0 * (Sub4.MaxRank2 - Sub2.Rank2) / (Sub4.MaxRank2 - 1)) AS Score2
FROM TestTable a
INNER JOIN (SELECT DISTINCT z.VALUE1, (SELECT ((COUNT(DISTINCT VALUE1) + 1)) FROM TestTable y WHERE z.VALUE1 < y.VALUE1) AS RANK1
FROM TestTable z
) Sub1 ON a.VALUE1 = Sub1.VALUE1
INNER JOIN (SELECT DISTINCT z.VALUE2, (SELECT ((COUNT(DISTINCT VALUE2) + 1)) FROM TestTable y WHERE z.VALUE2 < y.VALUE2) AS RANK2
FROM TestTable z
) Sub2 ON a.VALUE2 = Sub2.VALUE2
CROSS JOIN (SELECT COUNT(DISTINCT VALUE1) AS MaxRank1 FROM TestTable) Sub3
CROSS JOIN (SELECT COUNT(DISTINCT VALUE2) AS MaxRank2 FROM TestTable) Sub4
Note I am not sure on your score calculation. The equation you give doesn't appear to me to give the results in your example. But I might just be misreading it.
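For what it's worth, the formula from the question can be checked against the sample data with a short Python sketch. It assumes in-memory SQLite as a stand-in for MySQL, computes dense ranks with a correlated COUNT(DISTINCT ...) subquery (which avoids building a GROUP_CONCAT string), and applies the formula in Python. The helper `score` is hypothetical, and returning 10.0 when every row shares one rank is an assumption, since the formula would otherwise divide by zero.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TestTable (ID INTEGER PRIMARY KEY, VALUE1 INT, VALUE2 INT)")
conn.executemany(
    "INSERT INTO TestTable VALUES (?, ?, ?)",
    [(2, 5, 20), (4, 5, 30), (1, 3, 5), (3, 2, 8)],
)

# Dense rank = number of distinct larger values + 1.
rank_query = """
SELECT a.ID,
       (SELECT COUNT(DISTINCT y.VALUE1) + 1
        FROM TestTable y WHERE y.VALUE1 > a.VALUE1) AS rank1,
       (SELECT COUNT(DISTINCT y.VALUE2) + 1
        FROM TestTable y WHERE y.VALUE2 > a.VALUE2) AS rank2
FROM TestTable a
ORDER BY a.ID
"""
ranks = conn.execute(rank_query).fetchall()


def score(rank, max_rank, min_rank=1):
    """5 + 5 * (MaxRank - rank) / (MaxRank - MinRank), as in the question."""
    if max_rank == min_rank:
        return 10.0  # assumption: a single shared rank gets the top score
    return 5.0 + 5.0 * (max_rank - rank) / (max_rank - min_rank)


max_rank1 = max(r1 for _, r1, _ in ranks)
max_rank2 = max(r2 for _, _, r2 in ranks)
scores = {
    row_id: (round(score(r1, max_rank1), 2), round(score(r2, max_rank2), 2))
    for row_id, r1, r2 in ranks
}
print(ranks)
print(scores)
```

Note that this sketch rounds to two decimals, whereas the question's query uses TRUNCATE, so 8.33 and 6.67 here correspond to the 8.3 and 6.6 shown in the question's table.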