Get frequency distribution of a decimal range in MySQL - mysql

I'm looking for an elegant way (in terms of syntax, not necessarily efficiency) to get the frequency distribution of a decimal range.
For example, I have a table with a rating column whose values can be negative or positive. I want to get the number of rows whose rating falls within each range.
- ...
- [-140.00 to -130.00): 5
- [-130.00 to -120.00): 2
- [-120.00 to -110.00): 1
- ...
- [120.00 to 130.00): 17
- and so on.
[i to j) means i inclusive to j exclusive.
Thanks in advance.

You could get pretty close using 'select floor(rating / 10), count(*) from (table) group by 1'
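A minimal sketch of that FLOOR-based bucketing follows. It uses Python's built-in sqlite3 module in place of MySQL so that it runs standalone. The table name `ratings`, the sample values, and the registered `FLOOR` helper are assumptions made for illustration; in MySQL the same SELECT should work with the native FLOOR().

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
# Not every SQLite build ships FLOOR, so register one (assumption for this sketch).
conn.create_function("FLOOR", 1, math.floor)
conn.execute("CREATE TABLE ratings (rating REAL NOT NULL)")  # hypothetical table
conn.executemany(
    "INSERT INTO ratings (rating) VALUES (?)",
    [(-135.5,), (-130.0,), (-125.25,), (120.0,), (129.99,), (130.0,)],
)

# Each bucket is [range_start, range_start + 10): lower bound inclusive, upper exclusive.
rows = conn.execute(
    """
    SELECT FLOOR(rating / 10) * 10      AS range_start,
           FLOOR(rating / 10) * 10 + 10 AS range_end,
           COUNT(*)                     AS frequency
    FROM ratings
    GROUP BY range_start
    ORDER BY range_start
    """
).fetchall()

for range_start, range_end, frequency in rows:
    print(f"[{range_start} to {range_end}): {frequency}")
```

FLOOR rounds toward negative infinity, so -135.5 lands in [-140 to -130) and -130.0 lands in [-130 to -120), which matches the inclusive/exclusive convention in the question. Buckets with no rows simply do not appear in the result.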

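I was thinking of something that could do many levels, like:
DELIMITER $$
CREATE PROCEDURE populate_stats()
BEGIN
  -- my_table and rating are placeholder names
  DECLARE range_loop INT DEFAULT 500;
  DECLARE the_next INT;
  simple_loop: LOOP
    SET the_next = range_loop - 10;
    -- count the rows with the_next <= rating < range_loop
    SELECT the_next AS range_start, range_loop AS range_end,
           SUM(CASE WHEN rating >= the_next AND rating < range_loop THEN 1 ELSE 0 END) AS frequency
    FROM my_table;
    IF the_next = -500 THEN
      LEAVE simple_loop;
    END IF;
    -- move down to the next range, otherwise the loop never ends
    SET range_loop = the_next;
  END LOOP simple_loop;
END $$
DELIMITER ;
Usage: CALL populate_stats();
This would handle 100 ranges: 500 to 490, 490 to 480, ..., -480 to -490, -490 to -500.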

Assuming a finite number of ranges:
SELECT
SUM(CASE WHEN val >= -140 AND val < -130 THEN 1 ELSE 0 END) AS `-140_to_-130`,
SUM(CASE WHEN val >= -130 AND val < -120 THEN 1 ELSE 0 END) AS `-130_to_-120`,
...
FROM my_table
If not, you could use dynamic SQL to generate the SELECT for any number of ranges; however, you may run into the limit on the number of columns.

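Just put your desired ranges into a table, and use that to discriminate the values. (The setup below uses PostgreSQL's generate_series() and random() to create test data; the final join query is plain SQL.)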
-- SET search_path='tmp';
DROP TABLE measurements;
CREATE TABLE measurements
( zval INTEGER NOT NULL PRIMARY KEY
);
INSERT INTO measurements (zval)
SELECT generate_series(1,1000);
DELETE FROM measurements WHERE random() < 0.20 ;
DROP TABLE ranges;
CREATE TABLE ranges
( zmin INTEGER NOT NULL PRIMARY KEY
, zmax INTEGER NOT NULL
);
INSERT INTO ranges(zmin,zmax) VALUES
(0, 100), (100, 200), (200, 300), (300, 400), (400, 500),
(500, 600), (600, 700), (700, 800), (800, 900), (900, 1000)
;
SELECT ra.zmin,ra.zmax
, COUNT(*) AS zcount
FROM ranges ra
JOIN measurements me
ON me.zval >= ra.zmin AND me.zval < ra.zmax
GROUP BY ra.zmin,ra.zmax
ORDER BY ra.zmin
;
Results:
zmin | zmax | zcount
------+------+--------
0 | 100 | 89
100 | 200 | 76
200 | 300 | 76
300 | 400 | 74
400 | 500 | 86
500 | 600 | 78
600 | 700 | 75
700 | 800 | 75
800 | 900 | 80
900 | 1000 | 82
(10 rows)

Related

Creating a column being the multiple of others

I need some help. I have 2 columns in a MySQL query result: one with text and another with decimal values, like this:
select desc, value from table a
|5,50 % | 2984.59 |
|Subs | 10951.70 |
|Isent | 3973.17 |
|13,30 % | 560.26 |
For the rows that contain a %, I want to multiply the value by the percentage and put the result in a third column, rounded up to two decimal places. See below:
2984,59 * 0,055 = 164,15245
560,26 * 0,133 = 74,51458
I need an SQL query that shows something like this:
+-------+-----------+-----------+
|5,50 % | 2984,59 | 164,16 |
|Subs | 10951,70 | 0 or NULL |
|Isent | 3973,17 | 0 or NULL |
|13,30% | 560,26 | 74,52 |
+-------+-----------+-----------+
How can I do it?
Thanks so much for the help.
It would be better to store floating-point numbers in the first place, since converting costs time.
You have commas in your percentages, but MySQL needs dots there.
If the column isn't always a number, you can use the MySQL trick of adding 0 to it, which converts the leading numeric part and ignores the trailing non-numeric characters (a string with no leading number becomes 0).
SELECT `desc`, `value`, (REPLACE(`desc`,',','.') + 0) * `value` / 100 FROM val
desc | value | (REPLACE(`desc`,',','.') + 0) * `value` / 100
:------ | ----: | --------------------------------------------:
5,50 % | 2985 | 164.175
Subs | 10952 | 0
Isent | 3973 | 0
13,30 % | 560 | 74.48
SELECT `desc`, `value`, CEIL((REPLACE(`desc`,',','.') + 0) * `value`) / 100 FROM val
desc | value | CEIL((REPLACE(`desc`,',','.') + 0) * `value`) / 100
:------ | ----: | --------------------------------------------------:
5,50 % | 2985 | 164.18
Subs | 10952 | 0
Isent | 3973 | 0
13,30 % | 560 | 74.48
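To cross-check the round-up arithmetic outside the database, here is a small sketch in Python using exact decimals. The helper name `percent_share` is hypothetical, and the input rows are the ones from the question (with dots as decimal separators in the value column); it is only meant to confirm the expected third column, not to replace the query.

```python
from decimal import Decimal, ROUND_CEILING


def percent_share(desc, value):
    """Return value * percentage rounded up to 2 decimals, or None when desc has no '%'."""
    if "%" not in desc:
        return None
    # '5,50 %' -> Decimal('5.50'): drop the sign, trim spaces, use a dot as the separator.
    pct = Decimal(desc.replace("%", "").strip().replace(",", "."))
    share = Decimal(value) * pct / Decimal(100)
    # ROUND_CEILING always rounds toward positive infinity, i.e. "rounding up".
    return share.quantize(Decimal("0.01"), rounding=ROUND_CEILING)


rows = [
    ("5,50 %", "2984.59"),
    ("Subs", "10951.70"),
    ("Isent", "3973.17"),
    ("13,30 %", "560.26"),
]
results = [(desc, value, percent_share(desc, value)) for desc, value in rows]
for row in results:
    print(row)
```

Rows without a percentage yield None here, which corresponds to the "0 or NULL" cells in the desired output.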

Counting product pairs in a store whose difference in expenses is less than a certain amount in SQL

I have a table with the serial number of each product, whether it is in stock (1 = in stock, 0 = not in stock), the revenue from the product, and the expenses from the product in the store. I would like to write a query that counts all product pairs (without counting the same pair twice) whose expense difference is less than NIS 1,000 and where both products are in stock or both are out of stock. It should show the average income gap (in absolute value) over all such pairs, how many of those pairs are in stock, and how many are not in stock.
Sample table:
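serial | Is_in_stock | Revenu_from_the_product | Expenses_from_the_product
-----: | ----------: | ----------------------: | ------------------------:
1 | 1 | 27627 | 57661
2 | 0 | 48330 | 20686
3 | 0 | 26010 | 861
4 | 1 | 22798 | 37771
5 | 0 | 24606 | 8905
6 | 1 | 48311 | 6433
7 | 0 | 29929 | 6278
8 | 0 | 24254 | 8590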
Unfortunately I am lost and unable to find a solution to my problem.
I was thinking of creating subqueries but could not find a suitable solution.
The result should look something like this (the numbers are only for illustration and are not derived from the sample data):
Average income gap (in absolute value) of all pairs | Quantity of pairs in stock | Quantity of pairs not in stock
--------------------------------------------------: | -------------------------: | -----------------------------:
13 | 10 | 5
In addition, it is very important that the count be done without duplicates of the same pair.
We can do this with two queries, without a procedure or user defined function
CREATE TABLE products(serial INT, Instock INT, Revenu INT, Expenses INT);
INSERT INTO products VALUES
(1,1,27627,57661),
(2,0,48330,20686),
(3,0,26010,861 ),
(4,1,22798,37771),
(5,0,24606,8905 ),
(6,1,48311,6433 ),
(7,0,29929,6278 ),
(8,0,24254,8590 );
SELECT a.serial,b.serial from
products a
join products b
on abs(a.expenses-b.expenses)<1000
where a.serial<b.serial
and a.instock=b.instock
serial | serial
-----: | -----:
5 | 8
select count(a.expenses) 'number of pairs',
avg(abs(a.expenses-b.expenses)) 'average difference',
sum(case when a.instock=1 and b.instock=1 then 1 else 0 end) pairsInstock,
sum(case when a.instock=0 and b.instock=0 then 1 else 0 end) pairsneitherStock,
sum(case when (a.instock+b.instock)=1 then 1 else 0 end ) oneInStock
from products a
cross join products b
where a.serial < b.serial;
number of pairs | average difference | pairsInstock | pairsneitherStock | oneInStock
--------------: | -----------------: | -----------: | ----------------: | ---------:
28 | 21362.1071 | 3 | 10 | 15
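As a rough sketch, the pair filter from the first query and the aggregates from the second can also be folded into a single statement. The example below runs against SQLite through Python's sqlite3 module rather than MySQL, and the output column names (pairs, avg_income_gap, and so on) are assumptions; the join conditions themselves are the ones already shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (serial INT, instock INT, revenu INT, expenses INT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?)",
    [
        (1, 1, 27627, 57661), (2, 0, 48330, 20686), (3, 0, 26010, 861),
        (4, 1, 22798, 37771), (5, 0, 24606, 8905), (6, 1, 48311, 6433),
        (7, 0, 29929, 6278), (8, 0, 24254, 8590),
    ],
)

# a.serial < b.serial keeps each unordered pair exactly once.
summary = conn.execute(
    """
    SELECT COUNT(*)                                         AS pairs,
           AVG(ABS(a.revenu - b.revenu))                    AS avg_income_gap,
           SUM(CASE WHEN a.instock = 1 THEN 1 ELSE 0 END)   AS pairs_in_stock,
           SUM(CASE WHEN a.instock = 0 THEN 1 ELSE 0 END)   AS pairs_out_of_stock
    FROM products a
    JOIN products b
      ON a.serial < b.serial
     AND a.instock = b.instock
     AND ABS(a.expenses - b.expenses) < 1000
    """
).fetchone()
print(summary)
```

With the sample data only the pair (5, 8) qualifies, so the summary reports one pair, an income gap of 352, and that pair counted as out of stock. Note that the SUM columns would be NULL if no pair qualified at all.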
I have solved it with a stored procedure.
It starts with the variable definitions.
A cursor iterates over the rows sorted by expenses and checks whether the following condition is TRUE, according to your definition of a pair:
prev_exp - curr_Expenses_from_the_product < 1000 AND prev_in_stock - curr_Is_in_stock = 0
If it is TRUE, the counter is increased by 1.
At the end, I close the cursor and return the counter value.
* You can add more logic to the procedure and return more columns.
** To use the procedure, just call it by its name.
Table creation:
CREATE TABLE A(serial INT(11), Is_in_stock INT(11), Revenu_from_the_product INT(11), Expenses_from_the_product INT(11));
Data insertion:
INSERT INTO A (serial,Is_in_stock,Revenu_from_the_product,Expenses_from_the_product) VALUES
(1,1,27627,57661),
(2,0,48330,20686),
(3,0,26010,861 ),
(4,1,22798,37771),
(5,0,24606,8905 ),
(6,1,48311,6433 ),
(7,0,29929,6278 ),
(8,0,24254,8590 );
Query:
BEGIN
DECLARE finished INTEGER DEFAULT 0;
DECLARE prev_exp int(11) DEFAULT 0;
DECLARE prev_in_stock int(11) DEFAULT 0;
DECLARE curr_Is_in_stock int(11) DEFAULT 0;
DECLARE curr_Expenses_from_the_product int(11) DEFAULT 0;
DECLARE duplications_counter int(11) DEFAULT 0;
-- declare cursor for relevant fields
DECLARE curs
CURSOR FOR
SELECT A.Is_in_stock,A.Expenses_from_the_product FROM A ORDER BY A.Expenses_from_the_product DESC;
-- declare NOT FOUND handler
DECLARE CONTINUE HANDLER
FOR NOT FOUND SET finished = 1;
OPEN curs;
getRow: LOOP
FETCH curs INTO curr_Is_in_stock,curr_Expenses_from_the_product;
IF finished = 1 THEN
LEAVE getRow;
END IF;
IF prev_exp - curr_Expenses_from_the_product < 1000 AND prev_in_stock - curr_Is_in_stock = 0 THEN
SET duplications_counter = duplications_counter+1;
END IF;
END LOOP getRow;
CLOSE curs;
-- return the counter
SELECT duplications_counter;
END
Result:
Counter: 5

Use a single trigger to insert into multiple tables based on a condition

I have a table named three_current. Another application inserts 3 new rows into this table every minute, so the table keeps growing. The three new rows always have the channel numbers 350, 351, and 352. I want a trigger to insert each of these three rows into one of three separate tables, so that each table contains only data with the same channel number.
The three_current table looks like this:
three_current table
datetime | channel_number | Value | Status
------------------- | -------------: | ----: | -----:
01/06/2021 22:45:00 | 350 | 100 | 1
01/06/2021 22:45:00 | 351 | 120 | 1
01/06/2021 22:45:00 | 352 | 110 | 1
01/06/2021 22:46:00 | 350 | 95 | 1
01/06/2021 22:46:00 | 351 | 105 | 1
01/06/2021 22:46:00 | 352 | 150 | 1
01/06/2021 22:47:00 | 350 | 195 | 1
01/06/2021 22:47:00 | 351 | 205 | 1
01/06/2021 22:47:00 | 352 | 250 | 1
I also have three other tables named red_current, yellow_current, and blue_current. I am trying, without success, to create a trigger that updates these three tables based on the channel_number of the three_current table, such that:
red_current table will be
datetime | channel_number | Value | Status
------------------- | -------------: | ----: | -----:
01/06/2021 22:45:00 | 350 | 100 | 1
01/06/2021 22:46:00 | 350 | 95 | 1
01/06/2021 22:47:00 | 350 | 195 | 1
yellow_current table will be
datetime | channel_number | Value | Status
------------------- | -------------: | ----: | -----:
01/06/2021 22:45:00 | 351 | 120 | 1
01/06/2021 22:46:00 | 351 | 105 | 1
01/06/2021 22:47:00 | 351 | 205 | 1
blue_current table will be
datetime | channel_number | Value | Status
------------------- | -------------: | ----: | -----:
01/06/2021 22:45:00 | 352 | 110 | 1
01/06/2021 22:46:00 | 352 | 150 | 1
01/06/2021 22:47:00 | 352 | 250 | 1
But what I get after executing my code is that red_current, yellow_current, and blue_current are all inserted with rows where the channel number is 350. This means that only the red_current table is correct, while the other two tables are duplicates of it. (I suspect my code only executes for the first row of each batch received by three_current, which is the row with channel number 350.)
My code is as follows:
DELIMITER //
CREATE TRIGGER `add` AFTER INSERT ON `three_current`
FOR EACH ROW
BEGIN
DECLARE new_datetime datetime ; -- choose the datatypes
DECLARE new_channel_number int; --
DECLARE new_value double; --
DECLARE new_status smallint; --
SET new_datetime = new.datetime ;
SET new_channel_number = new.channel_number ;
SET new_value = new.value ;
SET new_status = new.status;
INSERT INTO red_current (datetime, channel_number, value, status)
SELECT new.datetime, new.channel_number , new.value, new.status
FROM three_current WHERE channel_number = '350'
ON DUPLICATE KEY UPDATE status = new.status;
INSERT INTO yellow_current (datetime, channel_number, value, status)
SELECT new.datetime, new.channel_number , new.value, new.status
FROM three_current WHERE channel_number = '351'
ON DUPLICATE KEY UPDATE status = new.status;
INSERT INTO blue_current (datetime, channel_number, value, status)
SELECT new.datetime, new.channel_number , new.value, new.status
FROM three_current WHERE channel_number = '352'
ON DUPLICATE KEY UPDATE status = new.status ;
END
//
DELIMITER ;
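One thing that stands out in the trigger above is that each WHERE clause filters the existing rows of three_current, not the row being inserted, so the NEW values are copied whenever any matching row already exists. A possible direction, sketched here against SQLite through Python's sqlite3 module (an assumption, so that the example runs standalone), is to test NEW.channel_number itself. The trigger name `route_channel` is hypothetical. In MySQL the equivalent would presumably be an IF NEW.channel_number = 350 THEN ... END IF block around each INSERT ... VALUES.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE three_current  (datetime TEXT, channel_number INT, value REAL, status INT);
    CREATE TABLE red_current    (datetime TEXT, channel_number INT, value REAL, status INT);
    CREATE TABLE yellow_current (datetime TEXT, channel_number INT, value REAL, status INT);
    CREATE TABLE blue_current   (datetime TEXT, channel_number INT, value REAL, status INT);

    -- One trigger: each INSERT copies only the NEW row, and only when its channel matches.
    CREATE TRIGGER route_channel AFTER INSERT ON three_current
    FOR EACH ROW
    BEGIN
        INSERT INTO red_current
            SELECT NEW.datetime, NEW.channel_number, NEW.value, NEW.status
            WHERE NEW.channel_number = 350;
        INSERT INTO yellow_current
            SELECT NEW.datetime, NEW.channel_number, NEW.value, NEW.status
            WHERE NEW.channel_number = 351;
        INSERT INTO blue_current
            SELECT NEW.datetime, NEW.channel_number, NEW.value, NEW.status
            WHERE NEW.channel_number = 352;
    END;
    """
)

conn.executemany(
    "INSERT INTO three_current VALUES (?, ?, ?, ?)",
    [
        ("01/06/2021 22:45:00", 350, 100, 1),
        ("01/06/2021 22:45:00", 351, 120, 1),
        ("01/06/2021 22:45:00", 352, 110, 1),
        ("01/06/2021 22:46:00", 350, 95, 1),
        ("01/06/2021 22:46:00", 351, 105, 1),
        ("01/06/2021 22:46:00", 352, 150, 1),
    ],
)

query = "SELECT channel_number, value FROM {} ORDER BY datetime"
red = conn.execute(query.format("red_current")).fetchall()
yellow = conn.execute(query.format("yellow_current")).fetchall()
blue = conn.execute(query.format("blue_current")).fetchall()
print(red, yellow, blue)
```

Because the condition only looks at the NEW row, each inserted row ends up in exactly one of the three tables, regardless of what three_current already contains.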

MySQL calculating query

I have this table with only two columns; each record stores an interest rate for a given month:
id rate
===========
199502 3.63
199503 2.60
199504 4.26
199505 4.25
... ...
201704 0.79
201705 0.93
201706 0.81
201707 0.80
201708 0.14
Based on these rates, I need to create another table of accumulated rates with a similar structure, whose data is calculated as a function of a YYYYMM (year/month) parameter, as follows (this formula is legally mandatory):
- The month given as the parameter always has a rate of 0 (zero).
- The month immediately before it always has a rate of 1 (one).
- Each earlier month's rate is 1 (one) plus the sum of the rates of the months strictly between that month and the month given as the parameter.
I'll clarify these rules with this example, given the parameter 201708:
SOURCE CALCULATED
id rate id rate
=========== =============
199502 3.63 199502 360.97 (1 + sum(rate(199503) to rate(201707)))
199503 2.60 199503 358.37 (1 + sum(rate(199504) to rate(201707)))
199504 4.26 199504 354.11 (1 + sum(rate(199505) to rate(201707)))
199505 4.25 199505 349.86 (1 + sum(rate(199506) to rate(201707)))
... ... ... ...
201704 0.79 201704 3.54 (1 + rate(201705) + rate(201706) + rate(201707))
201705 0.93 201705 2.61 (1 + rate(201706) + rate(201707))
201706 0.81 201706 1.80 (1 + rate(201707))
201707 0.80 201707 1.00 (per definition)
201708 0.14 201708 0.00 (per definition)
Now I've already implemented a VB.NET function that reads the source table and generates the calculated table, but this is done at runtime on each client machine:
Public Function AccumRates(targetDate As Date) As DataTable
Dim dtTarget = Rates.Clone
Dim targetId = targetDate.ToString("yyyyMM")
Dim targetIdAnt = targetDate.AddMonths(-1).ToString("yyyyMM")
For Each dr In Rates.Select("id<=" & targetId & " and id>199412")
If dr("id") = targetId Then
dtTarget.Rows.Add(dr("id"), 0)
ElseIf dr("id") = targetIdAnt Then
dtTarget.Rows.Add(dr("id"), 1)
Else
Dim intermediates =
Rates.Select("id>" & dr("id") & " and id<" & targetId).Select(
Function(ldr) New With {
.id = ldr.Field(Of Integer)("id"),
.rate = ldr.Field(Of Decimal)("rate")}
).ToArray
dtTarget.Rows.Add(
dr("id"),
1 + intermediates.Sum(
Function(i) i.rate))
End If
Next
Return dtTarget
End Function
My question is how can I put this as a query in my database so it can be used dynamically by other queries which would use these accumulated rates to update debts to any given date.
Thank you very much!
EDIT
I managed to make a query that returns the data I want, now I just don't know how to encapsulate it so that it can be called from another query passing any id as argument (here I did it using a SET ... statement):
SET @targetId=201708;
SELECT
id AS id_acum,
COALESCE(1 + (SELECT
SUM(taxa)
FROM
tableSelic AS ts
WHERE
id > id_acum AND id < @targetId
LIMIT 1),
IF(id >= @targetId, 0, 1)) AS acum
FROM
tableSelic
WHERE id>199412;
That's because I'm pretty new to MySQL; I'm used to MS Access, where parameterized queries are very straightforward to create.
For example:
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(id INT NOT NULL PRIMARY KEY
,rate DECIMAL(5,2) NOT NULL
);
INSERT INTO my_table VALUES
(201704,0.79),
(201705,0.93),
(201706,0.81),
(201707,0.80),
(201708,0.14);
SELECT *
, CASE WHEN @flag IS NULL THEN @i:=1 ELSE @i:=@i+rate END i
, @flag:=1 flag
FROM my_table
, (SELECT @flag:=null,@i:=0) vars
ORDER
BY id DESC;
+--------+------+-------------+-------+------+------+
| id | rate | @flag:=null | @i:=0 | i | flag |
+--------+------+-------------+-------+------+------+
| 201708 | 0.14 | NULL | 0 | 1 | 1 |
| 201707 | 0.80 | NULL | 0 | 1.80 | 1 |
| 201706 | 0.81 | NULL | 0 | 2.61 | 1 |
| 201705 | 0.93 | NULL | 0 | 3.54 | 1 |
| 201704 | 0.79 | NULL | 0 | 4.33 | 1 |
+--------+------+-------------+-------+------+------+
5 rows in set (0.00 sec)
Ok, I made it with a function:
CREATE FUNCTION `AccumulatedRates`(start_id integer, target_id integer) RETURNS decimal(6,2)
BEGIN
DECLARE select_var decimal(6,2);
SET select_var = (
SELECT COALESCE(1 + (
SELECT SUM(rate)
FROM tableRates
WHERE id > start_id AND id < target_id LIMIT 1
), IF(id >= target_id, 0, 1)) AS acum
FROM tableRates
WHERE id=start_id);
RETURN select_var;
END
And then a simple query:
SELECT *, AccumulatedRates(id,@present_id) as acum FROM tableRates;
where @present_id is passed as a parameter.
Thanks to all, anyway!
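For comparison, the three rules from the question can also be written as one parameterized query with a correlated subquery. This is only a sketch under assumptions: it runs on SQLite through Python's sqlite3 module instead of MySQL, the wrapper `accumulated_rates` is a hypothetical helper, and it uses just the last five months of the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableRates (id INTEGER PRIMARY KEY, rate REAL NOT NULL)")
conn.executemany(
    "INSERT INTO tableRates VALUES (?, ?)",
    [(201704, 0.79), (201705, 0.93), (201706, 0.81), (201707, 0.80), (201708, 0.14)],
)


def accumulated_rates(conn, target_id):
    """Return (id, accumulated rate) for every month up to and including target_id."""
    return conn.execute(
        """
        SELECT r.id,
               CASE WHEN r.id >= :target THEN 0
                    -- 1 plus the rates of the months strictly between r.id and the target;
                    -- COALESCE covers the month right before the target (empty sum).
                    ELSE 1 + COALESCE((SELECT SUM(i.rate)
                                       FROM tableRates AS i
                                       WHERE i.id > r.id AND i.id < :target), 0)
               END AS acum
        FROM tableRates AS r
        WHERE r.id <= :target
        ORDER BY r.id
        """,
        {"target": target_id},
    ).fetchall()


result = [(month, round(acum, 2)) for month, acum in accumulated_rates(conn, 201708)]
for month, acum in result:
    print(month, acum)
```

For the target 201708 this gives 3.54, 2.61, 1.80, 1.00 and 0.00 for 201704 through 201708, which agrees with the CALCULATED column in the question.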

Get count of columns having same value in comma separated format Sql

Hi, I need a complex query.
My table structure is:
attribute_id value entity_id
188 48,51,94 1
188 43,22 2
188 43,22 3
188 43,22 6
190 33,11 10
190 90,61 12
190 90,61 15
I need the count of each value, like:
attribute_id value count
188 48 2
188 43 3
188 51 1
188 94 1
188 22 2
190 33 1
190 11 1
190 90 2
190 61 2
I have searched a lot on Google for something like this, but unfortunately without any success. Please suggest how I can achieve this.
I use a UDF for things like this. If that could work for you:
CREATE FUNCTION [dbo].[UDF_StringDelimiter]
/*********************************************************
** Takes Parameter "LIST" and transforms it for use **
** to select individual values or ranges of values. **
** **
** EX: 'This,is,a,test' = 'This' 'Is' 'A' 'Test' **
*********************************************************/
(
@LIST VARCHAR(8000)
,@DELIMITER VARCHAR(255)
)
RETURNS @TABLE TABLE
(
[RowID] INT IDENTITY
,[Value] VARCHAR(255)
)
WITH SCHEMABINDING
AS
BEGIN
DECLARE
@LISTLENGTH AS SMALLINT
,@LISTCURSOR AS SMALLINT
,@VALUE AS VARCHAR(255)
;
SELECT
@LISTLENGTH = LEN(@LIST) - LEN(REPLACE(@LIST,@DELIMITER,'')) + 1
,@LISTCURSOR = 1
,@VALUE = ''
;
WHILE @LISTCURSOR <= @LISTLENGTH
BEGIN
INSERT INTO @TABLE (Value)
SELECT
CASE
WHEN @LISTCURSOR < @LISTLENGTH
THEN SUBSTRING(@LIST,1,PATINDEX('%' + @DELIMITER + '%',@LIST) - 1)
ELSE SUBSTRING(@LIST,1,LEN(@LIST))
END
;
SET @LIST = STUFF(@LIST,1,PATINDEX('%' + @DELIMITER + '%',@LIST),'')
;
SET @LISTCURSOR = @LISTCURSOR + 1
;
END
;
RETURN
;
END
;
The UDF takes two parameters: A string to be split, and the delimiter to split by. I've been using it for all sorts of different things over the years, because sometimes you need to split by a comma, sometimes by a space, sometimes by a whole string.
Once you have that UDF, you can just do this:
DECLARE @TABLE TABLE
(
Attribute_ID INT
,Value VARCHAR(55)
,Entity_ID INT
);
INSERT INTO @TABLE VALUES (188, '48,51,94', 1);
INSERT INTO @TABLE VALUES (188, '43,22', 2);
INSERT INTO @TABLE VALUES (188, '43,22', 3);
INSERT INTO @TABLE VALUES (188, '43,22', 6);
INSERT INTO @TABLE VALUES (190, '33,11', 10);
INSERT INTO @TABLE VALUES (190, '90,61', 12);
INSERT INTO @TABLE VALUES (190, '90,61', 15);
SELECT
T1.Attribute_ID
,T2.Value
,COUNT(T2.Value) AS Counter
FROM @TABLE T1
CROSS APPLY dbo.UDF_StringDelimiter(T1.Value,',') T2
GROUP BY T1.Attribute_ID,T2.Value
ORDER BY T1.Attribute_ID ASC, Counter DESC
;
I did an ORDER BY Attribute_ID ascending and then the Counter descending so that you get each Attribute_ID with the most common repeating values first. You could change that, of course.
Returns this:
Attribute_ID Value Counter
-----------------------------------
188 43 3
188 22 3
188 94 1
188 48 1
188 51 1
190 61 2
190 90 2
190 11 1
190 33 1
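If the rows can be pulled into application code instead, the same split-and-count idea can be sketched in a few lines of Python. This is only an illustration of the logic and not part of the UDF answer above; the `rows` list is the sample data from the question, hard-coded here as an assumption.

```python
from collections import Counter

# (attribute_id, comma-separated value, entity_id), copied from the question.
rows = [
    (188, "48,51,94", 1),
    (188, "43,22", 2),
    (188, "43,22", 3),
    (188, "43,22", 6),
    (190, "33,11", 10),
    (190, "90,61", 12),
    (190, "90,61", 15),
]

# Split every list on the delimiter and count each (attribute_id, item) combination.
counts = Counter(
    (attribute_id, item)
    for attribute_id, value, _entity_id in rows
    for item in value.split(",")
)

# Same ordering as the query: attribute_id ascending, then count descending.
ordered = sorted(counts.items(), key=lambda kv: (kv[0][0], -kv[1]))
for (attribute_id, item), count in ordered:
    print(attribute_id, item, count)
```

The counts agree with the result set above, for example 3 occurrences each of 43 and 22 under attribute 188.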