Cumulative product over a big MySQL table

I have a big MySQL table on which I'd like to calculate a cumulative product. The product has to be calculated for each group, where a group is defined by the value of the first column.
For example :
name | number | cumul | order
-----------------------------
a | 1 | 1 | 1
a | 2 | 2 | 2
a | 1 | 2 | 3
a | 4 | 8 | 4
b | 1 | 1 | 1
b | 1 | 1 | 2
b | 2 | 2 | 3
b | 1 | 2 | 4
I've seen this solution but don't think it would be efficient to join or subselect in my case.
I've seen this solution which is what I want except it does not partition by name.

This is similar to a cumulative sum:
select t.*,
       (@p := if(@n = name, @p * number,
                 if(@n := name, number, number)
                )
       ) as cumul
from t cross join
     (select @n := '', @p := 1) params
order by name, `order`;
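For readers who want to check the logic outside the database, here is a minimal Python sketch of the same single-pass idea: sort by group and order, then carry a running product that resets whenever the group changes. The function name and tuple layout are assumptions made for illustration, not part of the original answer.

```python
from itertools import groupby
from operator import itemgetter


def cumulative_product_by_group(rows):
    """rows: iterable of (name, number, order) tuples.

    Returns (name, number, cumul, order) tuples, sorted by name and order.
    """
    out = []
    # Same ordering as "order by name, `order`" in the query.
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))
    for name, group in groupby(ordered, key=itemgetter(0)):
        running = 1  # the product restarts for every group
        for _, number, order in group:
            running *= number
            out.append((name, number, running, order))
    return out


rows = [
    ("a", 1, 1), ("a", 2, 2), ("a", 1, 3), ("a", 4, 4),
    ("b", 1, 1), ("b", 1, 2), ("b", 2, 3), ("b", 1, 4),
]
for row in cumulative_product_by_group(rows):
    print(row)
```

With the sample data from the question, the cumul values come out as 1, 2, 2, 8 for group a and 1, 1, 2, 2 for group b.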

Related

Mysql update all rows value with count of same table column

I have following table with data:
| predp_id | strp_ID | predp_nas |
| -------- | ------- | --------- |
| 1 | 1 | null |
| 2 | 1 | null |
| 3 | 1 | null |
| 4 | 2 | null |
| 5 | 2 | null |
| 6 | 3 | null |
The predp_nas column should hold, for each row, the number of existing rows with the same strp_ID plus 1.
I am currently using the following query to achieve this on every new insert:
INSERT INTO PREDMETIP
(`strp_ID`, `predp_nas`)
VALUES(
1,
(SELECT counter + 1 FROM (SELECT COUNT(strp_ID) counter FROM PREDMETIP WHERE strp_ID = '1') t)
);
This gives me:
| predp_id | strp_ID | predp_nas |
| -------- | ------- | --------- |
| 1 | 1 | null |
| 2 | 1 | null |
| 3 | 1 | null |
| 4 | 2 | null |
| 5 | 2 | null |
| 6 | 3 | null |
| 7 | 1 | 4 |
But now I have imported a large amount of data, and I need to update all predp_nas fields at once to get this result:
| predp_id | strp_ID | predp_nas |
| -------- | ------- | --------- |
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 1 | 3 |
| 4 | 2 | 1 |
| 5 | 2 | 2 |
| 6 | 3 | 1 |
| 7 | 1 | 4 |
I have a DB Fiddle with the insert query. I am having trouble understanding how to write a query that does the same thing but updates all rows at once. Any help is appreciated.
What you're looking for is ROW_NUMBER() (if you're using MySQL 8+), but since your fiddle is on MySQL 5.7 I'm assuming that's your version and so you can emulate it by counting the number of rows for a given strp_ID that have a lower predp_id and using that to update the table:
UPDATE PREDMETIP p1
JOIN (
SELECT p1.predp_id,
COUNT(p2.predp_id) + 1 AS rn
FROM PREDMETIP p1
LEFT JOIN PREDMETIP p2 ON p2.strp_ID = p1.strp_ID AND p2.predp_id < p1.predp_id
GROUP BY p1.predp_id
) p2 ON p1.predp_id = p2.predp_id
SET p1.predp_nas = p2.rn
;
SELECT *
FROM PREDMETIP
Output after update:
predp_id strp_ID predp_nas
1 1 1
2 1 2
3 1 3
4 2 1
5 2 2
6 3 1
7 1 4
You seem to be looking for an update query. If you are running MySQL 8.0, you can do this with row_number():
update predmetip p
inner join (
    select p.*, row_number() over(partition by strp_id order by predp_id) rn
    from predmetip p
) p1 on p1.predp_id = p.predp_id and p1.strp_id = p.strp_id
set p.predp_nas = p1.rn
On the other hand, if you are running a MySQL 5.x version, then one option is to use correlated subqueries, as demonstrated in Nick's answer. This works fine - and I upvoted Nick's answer - but the performance tends to quickly degrade when the volume of data gets larger, because you need to scan the table for each and every row in the resultset.
You can do this with user variables, but it is tricky: since, as explained in the documentation, the order of evaluation of expressions in the select clause is undefined, we need to evaluate and assign in the same expression; case comes in handy for this. Another important point is that we need to order the rows in a subquery before the variables come into play.
You would write the select statement as follows:
set @rn := 0, @strp_id = '';
select
    predp_id,
    strp_id,
    @rn := case
        when @strp_id = strp_id then @rn + 1 -- read
        when @strp_id := strp_id then 1 -- assign
    end as predp_nas
from (
    select *
    from predmetip
    order by strp_id, predp_id
) t
You can then turn it to an update:
set @rn := 0, @strp_id = '';
update predmetip p
inner join (
    select
        predp_id,
        strp_id,
        @rn := case
            when @strp_id = strp_id then @rn + 1
            when @strp_id := strp_id then 1
        end as predp_nas
    from (
        select *
        from predmetip
        order by strp_id, predp_id
    ) t
) p1 on p1.predp_id = p.predp_id and p1.strp_id = p.strp_id
set p.predp_nas = p1.predp_nas;
Demo on DB Fiddle (with credits to Nick for creating it in the first place).
To read more about user variables and their tricks, I recommend this excellent answer by Madhur Bhaiya, which also contains another interesting blog link.
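As a rough illustration of why the single-pass approach scales better than re-scanning the table for every row, the numbering logic can be sketched in Python. This is only a model of the idea (sort once, then walk the rows while tracking the current group), not of how MySQL executes the statement. The function name is hypothetical.

```python
def number_rows_per_group(rows):
    """rows: iterable of (predp_id, strp_id) pairs.

    Returns {predp_id: predp_nas}, numbering rows within each strp_id
    in ascending predp_id order.
    """
    result = {}
    current_group = None
    rn = 0
    # Equivalent of "order by strp_id, predp_id" in the subquery.
    for predp_id, strp_id in sorted(rows, key=lambda r: (r[1], r[0])):
        if strp_id == current_group:
            rn += 1              # same group: keep counting
        else:
            current_group = strp_id
            rn = 1               # new group: restart at 1
        result[predp_id] = rn
    return result


rows = [(1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, 3), (7, 1)]
print(number_rows_per_group(rows))
```

On the sample table this assigns 1, 2, 3 to ids 1 to 3, restarts at 1 for ids 4 and 6, and gives id 7 the value 4, which matches the expected result in the question.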

Mysql sort data alternatively

I have a big table where data are structured like this
My table car
id_car | Site_car | descr_car
-----------------------------------
1 | onesite | onedesc
2 | twosite | twodesc
3 | twosite | onedesc
4 | onesite | onedesc
5 | twosite | twodesc
6 | onesite | onedesc
7 | treesite | onedesc
8 | treesite | onedesc
I want to display the rows so that site_car alternates: onesite first, twosite second, and threesite third, with this cycle repeating 15 times or more.
This is what I want to display:
id_car | Site_car | descr_car
-----------------------------------
4 | onesite | onedesc
3 | twosite | twodesc
7 | treesite | onedesc
1 | onesite | onedesc
2 | twosite | twodesc
6 | treesite | onedesc
Do you guys have idea?
Thx
This is tricky in MySQL. The idea is to enumerate the rows and then order by that enumeration:
select c.*
from (select c.*,
             (@rn := if(@sc = site_car, @rn + 1,
                        if(@sc := site_car, 1, 1)
                       )
             ) as rn
      from (select c.*
            from car c
            order by site_car, id_car
           ) c cross join
           (select @sc := -1, @rn := 0) params
     ) c
order by rn, field(site_car, 'onesite', 'twosite', 'threesite');
By the way this is much simpler in MySQL 8+:
select c.*
from car c
order by row_number() over (partition by site_car order by id_car),
field(site_car, 'onesite', 'twosite', 'threesite');
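The ordering trick in both queries is the same: number the rows within each site, then sort by that number first and by site priority second. A small Python sketch of this, using the sample rows from the question, is given below. Note one assumption: the priority map uses the spelling 'treesite' because that is what the sample data contains, and unknown sites get priority 0, mirroring what field() returns for a value that is not in its list.

```python
from collections import defaultdict

# Assumed priorities; 'treesite' is spelled as in the question's sample data.
SITE_PRIORITY = {"onesite": 1, "twosite": 2, "treesite": 3}


def interleave_by_site(cars):
    """cars: iterable of (id_car, site_car, descr_car) tuples."""
    counters = defaultdict(int)
    keyed = []
    for car in sorted(cars, key=lambda c: c[0]):  # number rows by id_car
        site = car[1]
        counters[site] += 1                        # row number within the site
        keyed.append((counters[site], SITE_PRIORITY.get(site, 0), car))
    # Sort by row number first, then by site priority.
    keyed.sort(key=lambda k: (k[0], k[1]))
    return [car for _, _, car in keyed]


cars = [
    (1, "onesite", "onedesc"), (2, "twosite", "twodesc"),
    (3, "twosite", "onedesc"), (4, "onesite", "onedesc"),
    (5, "twosite", "twodesc"), (6, "onesite", "onedesc"),
    (7, "treesite", "onedesc"), (8, "treesite", "onedesc"),
]
for car in interleave_by_site(cars):
    print(car)
```

This version is deterministic (it numbers rows by id_car), so it does not reproduce the random element the question mentions. Sites with fewer rows simply drop out of later cycles.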
You can try this:
SET @Fno := 999;
SET @Sno := 9999;
SET @Tno := 99999;
SELECT id_car, Site_car, descr_car
FROM (
    SELECT
        @row_number := CASE
            WHEN Site_car = 'onesite' THEN @Fno + 1
            WHEN Site_car = 'twosite' THEN @Sno + 1
            ELSE @Tno + 1
        END AS num,
        car.*
    FROM car
) t
ORDER BY num;

How to write a MySQL SELECT Query to achieve this result?

I have an abstract problem which can be simplified as the following problem: Assume that we have two tables persons and names that look as follows:
SELECT * FROM persons;
+----+-------+--------+
| id | name | fan_of |
+----+-------+--------+
| 1 | alice | 2 |
| 2 | bob | 4 |
| 3 | carol | 1 |
| 4 | dave | 3 |
| 5 | bob | 2 |
+----+-------+--------+
and
SELECT * FROM names;
+----+-------+--------+
| id | name | active |
+----+-------+--------+
| 1 | alice | 1 |
| 2 | bob | 1 |
| 3 | carol | 0 |
| 4 | dave | 1 |
+----+-------+--------+
Every person (a row in the persons) table is a fan of itself or another person (represented by that other persons id in the fan_of column). The names table contains names that can be active or inactive.
For a given offset k, I want to SELECT the persons (rows of persons) that have the k+1-th active name as their name or that have one of these people as their fans. For example, if the offset is 1, the second active name is bob and hence I want to select all people with the name bob plus the people that have one of these bobs as their fans, which is in this example the row of persons with id=4. This means that I want to have the result:
+----+------+--------+
| id | name | fan_of |
+----+------+--------+
| 2 | bob | 4 |
| 4 | dave | 3 |
| 5 | bob | 2 |
+----+------+--------+
What I have so far is the following query:
1 SELECT * FROM persons WHERE
2 EXISTS (
3 SELECT * FROM (
4 SELECT * FROM names WHERE active=true LIMIT 1 OFFSET 1
5 ) AS selectedname WHERE (selectedname.name=persons.name)
6 )
7 OR
8 EXISTS (
9 SELECT * FROM(
10 SELECT * FROM persons WHERE EXISTS (
11 SELECT * FROM (
12 SELECT * FROM names WHERE active=true LIMIT 1 OFFSET 1
13 ) AS selectedname WHERE (selectedname.name=persons.name)
14 )
15 ) AS personswiththatname WHERE persons.id=personswiththatname.fan_of
16 );
It gives me the desired result from above but please note that it is inefficient because the lines 3-5 and 11-13 are the same.
I have the following two questions:
What can be done to avoid this inefficiency?
I actually need to distinguish between those rows that came from the
name condition (here the rows with name=bob) and those that came
from the fan_of condition (here the row with name=dave). This
could be done in the application code but then I would need another
database query before to find out the k+1-th active name and this might
be slow (please correct me if this is the better solution). I would
rather prefer an additional column z that helps me to distinguish
like
+----+------+--------+---+
| id | name | fan_of | z |
+----+------+--------+---+
| 2 | bob | 4 | 1 |
| 4 | dave | 3 | 0 |
| 5 | bob | 2 | 1 |
+----+------+--------+---+
How can such an output be achieved?
It looks like I can get the minimum you want to achieve using parameters (should this be an option).
It's not pretty, but I can't see a simple way of achieving what you're asking for, so this is what I have so far....(set @offset to suit 'k')
SET @offset = 1;
SET @name = (SELECT name FROM (select name, @rank := @rank +1 as Rank from names n, (SELECT @rank := 0) r where active !=0) as activeRanked where activeRanked.rank = (1 + @offset));
select
a.*
From persons a
where (a.name = @name) OR (a.id IN (SELECT fan_of from persons where name = @name));
If you still don't have an answer by the time I've had food, I'll look at part 2.
(hopefully I've read your brief correctly)
P.S. I've kept the @name SQL in a single line as it seems to read better in this context.
Edit: Here's a pretty messy but functional indicator of source, using your example. Z = 1 is where the row is from the name, '0' is from fan_of
SET @offset = 1;
SET @name = (SELECT name FROM (select name, @rank := @rank +1 as Rank from names n, (SELECT @rank := 0) r where active !=0) as activeRanked where activeRanked.rank = (1 + @offset));
select
a.*,'1' as z
From persons a
where (a.name = @name)
union
select
a.*,'0' as z
From persons a
where (a.id IN (SELECT fan_of from persons where name = @name));
Distinct ID Query:
SET @offset = 1;
SET @name = (SELECT name FROM (select name, @rank := @rank +1 as Rank from names n, (SELECT @rank := 0) r where active !=0) as activeRanked where activeRanked.rank = (1 + @offset));
SELECT id, name, fan_of, z FROM
(select
distinct a.id,
a.name,
a.fan_of,
1 as z
From persons a
where (a.name = @name)
union
select
distinct a.id,
a.name,
a.fan_of,
0 as z
From persons a
where (a.id IN (SELECT fan_of from persons where name = @name))
ORDER BY z desc) qry
GROUP BY id;
This produces:
+----+------+--------+---+
| id | name | fan_of | z |
+----+------+--------+---+
| 2 | bob | 4 | 1 |
| 5 | bob | 2 | 1 |
| 4 | dave | 3 | 0 |
+----+------+--------+---+

Sum or count only few/limited row from a table

I have a student marks table where I store student marks by subject. I want the sum of at most 3 subjects for each student. I also want to see the subject count, that is, how many subject mark entries exist in this table for each student. This is my table structure.
students
----------------------
id | name | roll
----------------------
1 | Rahim | 201
2 | Kalas | 203
----------------------
student_marks
--------------------------------
id | student_id | subject | mark
--------------------------------
1 | 1 | A | 10
2 | 1 | B | 5
3 | 1 | C | 10
4 | 1 | D | 5
5 | 2 | A | 10
6 | 2 | B | 10
--------------------------------
my_expected_table
----------------------------------
student_id | student_name | sum
----------------------------------
1 | Rahim | 25
2 | Kalas | 20
----------------------------------
I am trying, but I can't work out how to put a limit on the joined table. My sample query is here:
SELECT students.id as student_id,
students.name as student_name,
sum(student_marks.mark) as sum
From students
inner join student_marks on student_marks.student_id = students.id
Group by student_marks.student_id
As expected, this query returns the sum of all rows, shown below, but I want the result in "my_expected_table" above.
----------------------------------
student_id | student_name | sum
----------------------------------
1 | Rahim | 30
2 | Kalas | 20
----------------------------------
This is painful to do in MySQL. The best way uses variables:
select sm.student_id, count(*) as num_marks,
       sum(case when rn <= 3 then mark end) as random_3_sum
from (select sm.*,
             (@rn := if(@s = student_id, @rn + 1,
                        if(@s := student_id, 1, 1)
                       )
             ) as rn
      from student_marks sm cross join
           (select @rn := 0, @s := -1) params
      order by student_id, id
     ) sm
group by sm.student_id;
Notes:
I didn't bother with the join to the students table; that is straightforward to add.
This defines 3 marks as "the 3 marks on the rows with the lowest id". This is determined by the second order by key.
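A short Python sketch of the same computation may help verify the expected numbers: for each student, count all marks and sum only the first three when ordered by id. The function name and the limit parameter are illustrative assumptions.

```python
from collections import defaultdict


def first_three_sum(marks, limit=3):
    """marks: (id, student_id, subject, mark) tuples.

    Returns {student_id: (num_marks, sum_of_first_marks)}, where "first"
    means lowest id, as in the query's "order by student_id, id".
    """
    per_student = defaultdict(list)
    for _id, student_id, _subject, mark in sorted(marks):  # sorted by id
        per_student[student_id].append(mark)
    return {
        sid: (len(ms), sum(ms[:limit]))
        for sid, ms in per_student.items()
    }


marks = [
    (1, 1, "A", 10), (2, 1, "B", 5), (3, 1, "C", 10), (4, 1, "D", 5),
    (5, 2, "A", 10), (6, 2, "B", 10),
]
print(first_three_sum(marks))
```

For the sample data this gives 4 marks with a sum of 25 for student 1 and 2 marks with a sum of 20 for student 2, matching my_expected_table.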

Limiting the output of specific column sql hana

I have a table structure as given below and what I'd like to be able to do is retrieve the top three records with the highest value for each Company code.
I've googled and I couldn't find a better way so hopefully you guys can help me out.
By the way, I'm attempting this in MySQL and SAP HANA, but I am hoping that I can grab the "structure" of the query for HANA if I can get help for MySQL only.
Thanks much!
Here's the table:
http://pastebin.com/xgzCgpKL
In MySQL you can do the following.
To get exactly three records per group (company) regardless of ties, emulating the ROW_NUMBER() analytic function:
SELECT company, plant, value
FROM
(
    SELECT company, plant, value, @n := IF(@g = company, @n + 1, 1) rnum, @g := company
    FROM table1 CROSS JOIN (SELECT @n := 0, @g := NULL) i
    ORDER BY company, value DESC, plant
) q
WHERE rnum <= 3;
Output:
| COMPANY | PLANT | VALUE |
|---------|-------|-------|
| 1 | C | 5 |
| 1 | B | 4 |
| 1 | A | 3 |
| 2 | G | 6 |
| 2 | C | 5 |
| 2 | D | 3 |
| 3 | E | 8 |
| 3 | A | 7 |
| 3 | B | 3 |
To get all records per group that have a rank from 1 to 3, emulating the DENSE_RANK() analytic function. Records with the same value get the same rank.
SELECT company, plant, value
FROM
(
    SELECT company, plant, value, @n := IF(@g = company, IF(@v = value, @n, @n + 1), 1) rnum, @g := company, @v := value
    FROM table1 CROSS JOIN (SELECT @n := 0, @g := NULL, @v := NULL) i
    ORDER BY company, value DESC, plant
) q
WHERE rnum <= 3;
Output:
| COMPANY | PLANT | VALUE |
|---------|-------|-------|
| 1 | C | 5 |
| 1 | B | 4 |
| 1 | A | 3 |
| 1 | E | 3 |
| 1 | G | 3 |
| 2 | G | 6 |
| 2 | C | 5 |
| 2 | D | 3 |
| 3 | E | 8 |
| 3 | A | 7 |
| 3 | B | 3 |
| 3 | G | 3 |
Here is SQLFiddle demo
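The difference between the two emulations is easiest to see side by side. The Python sketch below applies both numbering rules to a few rows for one company. The input rows are hypothetical, chosen to be consistent with the company 1 output shown above, since the original table is only available through the pastebin link. The function name and the dense flag are assumptions made for illustration.

```python
def top_n(rows, n=3, dense=False):
    """rows: (company, plant, value) tuples.

    dense=False numbers every row (ROW_NUMBER-like);
    dense=True gives tied values the same rank (DENSE_RANK-like).
    """
    out = []
    prev_company = prev_value = None
    rank = 0
    # Same ordering as "ORDER BY company, value DESC, plant".
    for company, plant, value in sorted(rows, key=lambda r: (r[0], -r[2], r[1])):
        if company != prev_company:
            rank = 1                       # new company: restart
        elif not dense or value != prev_value:
            rank += 1                      # advance unless it is a dense tie
        prev_company, prev_value = company, value
        if rank <= n:
            out.append((company, plant, value))
    return out


# Hypothetical rows for a single company.
rows = [(1, "A", 3), (1, "B", 4), (1, "C", 5), (1, "E", 3), (1, "G", 3), (1, "F", 1)]
print(top_n(rows))              # three rows, ties broken by plant
print(top_n(rows, dense=True))  # every row whose value is among the top three values
```

With these rows the first call keeps plants C, B, and A, while the dense variant also keeps E and G because they tie with A on value.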
UPDATE: Now it looks like HANA supports analytic functions so the queries will look like
SELECT company, plant, value
FROM
(
SELECT company, plant, value,
ROW_NUMBER() OVER (PARTITION BY company ORDER BY value DESC) rnum
FROM table1
)
WHERE rnum <= 3;
SELECT company, plant, value
FROM
(
SELECT company, plant, value,
DENSE_RANK() OVER (PARTITION BY company ORDER BY value DESC) rank
FROM table1
)
WHERE rank <= 3;
Here is SQLFiddle demo It's for Oracle but I believe it will work for HANA too