How to create a histogram using MySQL - mysql

I am trying to create histogram data using the following query:
SELECT FLOOR(Max_Irrad/10) AS bucket, COUNT(*) AS COUNT
FROM marctest.test_summarynimish
where Lcu_name='Allegro'
and Lcu_Mode='Standard'
GROUP BY bucket;
This is the result I am getting:
bucket count
0 3
4 3
5 12
7 6
8 3
10 3
Here the bucket field is the range (bin) used in the histogram. I want the bucket values to have a consistent range, e.g. starting from 0, 4, 8, 12, and so on. Is there any way to achieve this in MySQL?
This is the result I expect:
bucket count
0 3
4 21
8 6

I think we can use the following general form to create a general histogram:
select (x div 4) * 4 as NewX, count(*) as NewY from histogram
group by NewX
Here x is the real x value on the x axis and count(*) is the real y value. The number 4 is the width of each group of x values: all x values are grouped in fours (e.g. group 1 is 0, 1, 2, 3; group 2 is 4, 5, 6, 7; and so on). The count of items in each group becomes the NewY value.
Applying this logic to your query this would be:
select (floor(Max_Irrad/10) div 4) * 4 as NewX, count(*) as NewY
from marctest.test_summarynimish
where Lcu_name='Allegro' and Lcu_Mode='Standard'
group by NewX
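To see what the (x div 4) * 4 arithmetic does to the numbers in the question, here is a minimal Python sketch, not MySQL. The function name rebucket is made up for illustration, and the counts are copied from the question's first result set.

```python
from collections import Counter

def rebucket(counts, width):
    """Merge per-bucket counts into wider buckets labelled by their lower edge."""
    merged = Counter()
    for bucket, n in counts.items():
        # Same arithmetic as (bucket DIV width) * width in the SQL above.
        merged[(bucket // width) * width] += n
    return dict(sorted(merged.items()))

# Bucket counts copied from the question's first result set.
original = {0: 3, 4: 3, 5: 12, 7: 6, 8: 3, 10: 3}
print(rebucket(original, 4))  # {0: 3, 4: 21, 8: 6}
```

The output matches the result the question asks for: buckets 4, 5 and 7 collapse into 4, and buckets 8 and 10 collapse into 8.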
Let me know if you have any trouble or doubt about this.

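Just make your buckets bigger by dividing Max_Irrad by 40 instead of 10. That numbers the buckets 0, 1, 2, and so on; multiply the result by 4, as in FLOOR(Max_Irrad/40) * 4, if you want them labelled 0, 4, 8 as in your expected output.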
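As a quick sanity check that dividing by 40 lands each value in the same bucket as flooring by 10 and then grouping in fours, here is a small Python sketch. It assumes non-negative integer readings (the real Max_Irrad column may differ), and both function names are hypothetical.

```python
def bucket_nested(x, width=4):
    # Mirrors (FLOOR(x / 10) DIV 4) * 4 from the other answer.
    return (x // 10) // width * width

def bucket_direct(x, width=4):
    # Mirrors FLOOR(x / 40) * 4: one division by the combined bucket size.
    return x // (10 * width) * width

# Assumed sample range of non-negative integer readings.
mismatches = [x for x in range(0, 1000) if bucket_nested(x) != bucket_direct(x)]
print(mismatches)  # []
```

Without the final multiplication by the width, the direct form returns 0, 1, 2, ... instead of 0, 4, 8, ..., but the rows that fall in each bucket are the same.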

Related

SQL query to count values after a particular condition is met

I have a table,
Name Seconds Status_measure
a 0 10
a 10 13
a 20 -1
a 30 15
a 40 20
a 50 12
a 60 -1
For a particular name, I want a new column holding the number of times the value is greater than -1, counted only after the first -1 has been met. For this data, the new column for the name "a" should have the value 3, because once -1 is reached in Status_measure, there are 3 values greater than -1 (15, 20, and 12).
Required data frame:
Id Name Seconds Status_measure Value
1 a 0 10 3
2 a 10 13 3
3 a 20 -1 3
4 a 30 15 3
5 a 40 20 3
6 a 50 12 3
7 a 60 -1 3
I tried doing
count(status_measure>-1) over (partition by name order by seconds)
But this does not give the desired result.
You can do it in 2 steps: group the data, then count the entries where grp = 1.
select *, sum(Status_measure > -1 and grp = 1) over(partition by name) n
from (
select *
, row_number() over(partition by name order by Seconds) - sum(Status_measure > -1 ) over(partition by name order by Seconds) grp
from tbl
) t
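The grouping trick is easier to follow when traced by hand. The Python sketch below walks the question's Status_measure values in Seconds order and computes the same grp value (row number minus the running count of values above -1); the function name is an assumption made for illustration.

```python
def count_in_first_island(values):
    """Count values > -1 whose grp is 1, i.e. those after the first -1 and before the second."""
    positives_so_far = 0
    count = 0
    for row_number, v in enumerate(values, start=1):
        if v > -1:
            positives_so_far += 1
        # row_number() - running sum(Status_measure > -1):
        # this equals the number of -1 rows seen so far.
        grp = row_number - positives_so_far
        if v > -1 and grp == 1:
            count += 1
    return count

# Status_measure for name 'a', ordered by Seconds (from the question).
status = [10, 13, -1, 15, 20, 12, -1]
print(count_in_first_island(status))  # 3
```

The grp values for this data are 0, 0, 1, 1, 1, 1, 2, so only 15, 20 and 12 are counted.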
Another option is to update a user variable that:
starts at 0
increases by 1 when it reaches the first -1
decreases by 1 when it reaches the second -1
Once you have this column, you can sum over its values.
SET @change = 0;
SELECT *, SUM(CASE WHEN Status_measure = -1
THEN IF(@change=0, @change := @change + 1, @change := @change - 1)
ELSE @change END) OVER() -1 AS Value_
FROM tab
Limitations: this solution assumes you have only one range of interesting values between -1s.
Note: there's a -1 decrement from your sum because the first update of the variable will leave 1 in the same row of -1, which you don't want. For better understanding, comment out the application of SUM() OVER and see intermediate output.
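The toggle idea can be sketched outside SQL as well. This Python version is only an illustration of the logic, with a made-up function name; it flips a flag on every -1 and counts the other values seen while the flag is on, so it avoids the extra -1 correction that the SQL needs.

```python
def count_between_markers(values, marker=-1):
    """Count values seen while the toggle is on; each marker flips the toggle."""
    inside = False
    count = 0
    for v in values:
        if v == marker:
            inside = not inside  # first -1 switches on, second switches off
        elif inside:
            count += 1
    return count

# Status_measure for name 'a', in Seconds order (from the question).
print(count_between_markers([10, 13, -1, 15, 20, 12, -1]))  # 3
```

As with the SQL, a third -1 would switch the toggle back on, so the sketch shares the limitation noted above.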
First, a clarification of your question. I want to expand your original data with another row so that 2 entries can be told apart from 3. Also, is there an auto-increment ID in your data that defines the sequence, such as
Id Name Seconds Status_measure Value
1 a 0 10 3
2 a 10 13 3
3 a 20 -1 3
4 a 30 15 3
5 a 40 20 3
6 a 50 12 3
7 a 60 -1 3
If the rows are sequential, IDs 1 and 2 precede the -1 at ID 3, which would indicate two entries. IDs 4-6 are above -1 and give a count of three entries before ID 7.
So, what "VALUE" do you want in your result: the maximum count of 3 for all rows, or a value of 2 for IDs 1, 2, and 3 and a value of 3 for IDs 4-7? Or do you want all entries to show the greatest count before a -1 measure, that is, 3 for every row?
Please edit your question (you can copy and paste this into it if need be) and provide the clarification requested, including whether there is an auto-increment column, since that affects the final output and how the breaks are determined.

Selecting part of a matrix in Octave to form another matrix

Suppose I have the matrix
a = [1 2 3;
4 5 6;
7 8 9;]
I want to select the first two columns to form a matrix
b = [1 2;
4 5;
7 8;]
How can I achieve this in Octave?
I know how to select a single column, but how do I select many columns (say, the first 8 columns of a 16-column matrix) and form a matrix from them?
Also, how do I select rows in a similar manner to form a matrix?
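You can use the following code:
b = a(:,1:2)
where : means taking all rows, and 1:2 means taking columns 1 to 2. Rows work the same way with the indices swapped: a(1:2,:) takes rows 1 to 2 with all columns, and a(:,1:8) takes the first 8 columns of a wider matrix.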

MYSQL NTILE function start with highest percentile

I'm using the MySQL NTILE function, and for the most part it does what I need. However, there is one case in which I need different behaviour and I can't figure out how to get it: when I have more buckets than records.
Let's say my data, in a table called data, looks like this:
ID val
1 15
2 20
3 10
My issue arises when I have more buckets than records, so let's say I run
select *, NTILE(4) over (order by val) from data
This will result in
ID val NTILE
3 10 1
1 15 2
2 20 3
I'm having some trouble wording my question, which is probably why I am struggling to find solutions on Google, but basically: when I have more buckets than records (in this example, 4 buckets but only 3 records), is there any way to treat the highest value as the highest percentile and work backwards, rather than treating the lowest value as the lowest percentile as it currently does? Essentially, resulting in this:
ID val NTILE
2 20 4
1 15 3
3 10 2
I think you might be able to reverse the ordering in the NTILE() and numerically flip the result like so:
select *, 5-NTILE(4) over (order by val desc) from data
I would expect the following to happen (I have not run this though!):
ID val NTILE
2 20 4
1 15 3
3 10 2
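Since the answer was not run, here is a small Python sketch that simulates it. It assumes standard NTILE semantics (rows split into the requested number of groups, with any leftover rows going to the earliest groups) and distinct values; the function names are hypothetical, and 5 in the SQL is generalised to buckets + 1.

```python
def ntile(n_rows, buckets):
    """Bucket number for each row position, in order, under standard NTILE rules."""
    base, extra = divmod(n_rows, buckets)
    tiles = []
    for b in range(1, buckets + 1):
        size = base + (1 if b <= extra else 0)  # earliest buckets take the leftovers
        tiles.extend([b] * size)
    return tiles

def flipped_ntile(values, buckets):
    """(buckets + 1) - NTILE(buckets) OVER (ORDER BY val DESC), keyed by value."""
    ordered = sorted(values, reverse=True)
    return {v: buckets + 1 - t for v, t in zip(ordered, ntile(len(ordered), buckets))}

print(flipped_ntile([15, 20, 10], 4))  # {20: 4, 15: 3, 10: 2}
```

Under those assumptions the simulated result agrees with the table expected in the answer.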

Determine number of factors of a given number without overlap in MySQL

I have a data set with orders of tickets. Tickets can be bought in packs of 5, or 3, as well as individually. I need to group the data using the quantity of tickets sold per order, to determine if it was a 5 pack (divisible by five), then 3 pack, or else/then individually (1 or 2 qty). So if I have a quantity of 27, I know that order consisted of five "5 packs", and 2 individual tickets.
SUM(CASE WHEN (id % 5) = 0 THEN 1 ELSE 0 END) fivepack
I have this in my query, but stringing these together for fivepack, and threepack, doesn't eliminate the starting number from the total quantity on the next operation. So a quantity of 27, would yield a result of 5 "five packs" and 9 "three packs", and then 27 "individuals".
So given a quantity, how would you first divide by a large factor, get the remainder and divide by the smaller, then finally handle the remainder?
Edit:
The sample packs provide a discount on the purchase price (not relevant to the technical issue), so the division by the largest pack size needs to occur first. As Gordon Linoff asked below, in the case of a quantity of 27 tickets, you would take the maximum number of 5 divisions first, then pass the remainder on to be divided by 3, and then return the final remainder as individuals.
The issue is passing the value of one operation in SQL to the next operation, and so on: do Math1, pass Answer1 to Math2, and then pass Answer2 to Math3.
I don't fully understand why 27 would be 5 five packs and 2 individuals rather than any of the following:
27 individuals
9 3-packs
4 5-packs, 2 3-packs, 1-individual
8 3-packs and 3 individuals
and so on.
But, if you want a greedy approach, you can use the following arithmetic:
select floor(num / 5) as five_packs,
floor( (num - 5 * floor(num / 5)) / 3) as three_packs,
num - 5 * floor(num / 5) - 3 * floor( (num - 5 * floor(num / 5)) / 3) as singles
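The same greedy arithmetic reads more directly as two successive divisions with remainder. The Python sketch below is only an illustration of that chain (the function name is made up), not a replacement for the SQL.

```python
def split_packs(qty):
    """Greedy split: as many 5-packs as possible, then 3-packs, then singles."""
    five_packs, rest = divmod(qty, 5)       # Math1, remainder passed on
    three_packs, singles = divmod(rest, 3)  # Math2, final remainder is the singles
    return five_packs, three_packs, singles

print(split_packs(27))  # (5, 0, 2)
print(split_packs(14))  # (2, 1, 1)
```

For 27 tickets the remainder after the 5-packs is 2, which is too small for a 3-pack, so it ends up as 2 individual tickets, as described in the question.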

SQL to select top 10 records

I have the following table (points):
recno  uid  uname  points
=========================
1      a    abc    10
2      b    bac    8
3      c    cvb    12
4      d    aty    13
5      f    cyu    9
-------------------------
What I need is to show only the top ten records by points (descending), with five records on each page. I have the following SQL statement:
select * from points where uid in(a,c) order by uid LIMIT 1, 5
Thanks
For the first page:
SELECT * FROM points p ORDER BY points DESC LIMIT 0, 5
For the second page:
SELECT * FROM points p ORDER BY points DESC LIMIT 5, 5
You can't execute a single SQL query that returns a set number of pages; you'll have to implement some kind of pagination module (or whatever the equivalent is in your scenario) and fetch LIMIT 0, 5 for one page, then LIMIT 5, 5 for the next.
With so few records it wouldn't be an issue, but in a production-scale environment, selecting all records and then breaking the results down into pages adds a lot of unnecessary overhead. It's good practice to select only the data you need.
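The application-side part of that pagination can be as small as computing the offset from a page number. The Python sketch below is a minimal illustration under the assumption of 1-based page numbers; the function name and the way the query string is assembled are hypothetical.

```python
def limit_clause(page, page_size=5):
    """Build the MySQL LIMIT clause for a 1-based page number."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    offset = (page - 1) * page_size
    return f"LIMIT {offset}, {page_size}"

base_query = "SELECT * FROM points p ORDER BY points DESC"
for page in (1, 2):
    print(f"{base_query} {limit_clause(page)}")
# SELECT * FROM points p ORDER BY points DESC LIMIT 0, 5
# SELECT * FROM points p ORDER BY points DESC LIMIT 5, 5
```

Only integers computed by the code go into the clause here; values that come from user input should be validated or bound as parameters rather than concatenated.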