MySQL: return only every nth row of a result set - mysql

I know there are similar questions, but none seem to apply to what I want to do.
Given the following query, which returns, let's say, 85 results:
SELECT * FROM tbl_name WHERE person = 'Tom';
And I have another query that's similar but returns 168 results.
SELECT * FROM tbl_name WHERE person = 'Bob';
I'm trying to get only the results at multiples of 50.
Just by changing the person value in the WHERE clause, I want Tom's 1st and 50th results, which means 2 rows in total.
Likewise, Bob's query would return the 1st, 50th, 100th and 150th results, which is 4 rows in total.
Is it possible to do this with just MySQL?

Nailed it. Change the 50 for different increments. This assumes you meant that you wanted 1, 51, 101 (every 50th).
SELECT
    returnTable.*
FROM (
    SELECT
        @rownum := @rownum + 1 AS rowNumber,
        tbl_name.*
    FROM tbl_name, (SELECT @rownum := 0) variableInit
    WHERE tbl_name.person = 'Tom'
    -- add an ORDER BY here if the numbering must be deterministic
) AS returnTable
WHERE returnTable.rowNumber = 1
   OR (returnTable.rowNumber - 1) MOD 50 = 0;
If you actually want 1, 50, 100, 150 then the following does that (removed -1 from the WHERE)
SELECT
    returnTable.*
FROM (
    SELECT
        @rownum := @rownum + 1 AS rowNumber,
        tbl_name.*
    FROM tbl_name, (SELECT @rownum := 0) variableInit
    WHERE tbl_name.person = 'Tom'
    -- add an ORDER BY here if the numbering must be deterministic
) AS returnTable
WHERE returnTable.rowNumber = 1
   OR returnTable.rowNumber MOD 50 = 0;
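For anyone who wants to try the "every nth row" filter without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. It swaps the user-variable counter for ROW_NUMBER() (available in SQLite 3.25+ and MySQL 8.0+), and it assumes an id column to order by, which the question does not show. The helper names every_nth and build_demo are made up for the demo.

```python
import sqlite3


def build_demo():
    """Create an in-memory table with 85 rows for Tom and 168 for Bob."""
    conn = sqlite3.connect(":memory:")
    # The id column is an assumption; the question does not show the schema.
    conn.execute("CREATE TABLE tbl_name (id INTEGER PRIMARY KEY, person TEXT)")
    rows = [("Tom",)] * 85 + [("Bob",)] * 168
    conn.executemany("INSERT INTO tbl_name (person) VALUES (?)", rows)
    return conn


def every_nth(conn, person, step=50, one_based_multiples=False):
    """Return the row numbers that survive the filter for one person.

    one_based_multiples=False keeps rows 1, 51, 101, ...
    one_based_multiples=True  keeps rows 1, 50, 100, ...
    """
    if one_based_multiples:
        condition = "rn = 1 OR rn % :step = 0"
    else:
        condition = "(rn - 1) % :step = 0"
    sql = f"""
        SELECT rn
        FROM (
            SELECT id,
                   ROW_NUMBER() OVER (ORDER BY id) AS rn
            FROM tbl_name
            WHERE person = :person
        ) AS numbered
        WHERE {condition}
        ORDER BY rn
    """
    params = {"person": person, "step": step}
    return [row[0] for row in conn.execute(sql, params)]


if __name__ == "__main__":
    demo = build_demo()
    print(every_nth(demo, "Tom"))
    print(every_nth(demo, "Bob", one_based_multiples=True))
```

The explicit ORDER BY inside the window is what makes "the 50th row" well defined; without some ordering, the numbering can change between runs.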

Related

How do I COUNT rows of a GROUP BY query where a condition matches?

This is my persons table:
neighborhood birthyear
a 1958
a 1959
b 1970
c 1980
I'd like to get the COUNT of people in an age group within every neighborhood. For example, if I wanted to get everyone under the age of 18, I would get:
neighborhood count
a 0
b 0
c 0
If I wanted to get everyone over 50, I'd get
neighborhood count
a 2
b 0
c 0
I tried
SELECT neighborhood, COUNT(*)
FROM persons
WHERE YEAR(NOW()) - persons.birthyear < 18
GROUP BY neighborhood;
but this gives me 0 rows, when instead I want 3 rows with distinct neighborhoods and 0 count for each. How would I accomplish this?
You can use conditional aggregation:
SELECT neighborhood,
       SUM(YEAR(NOW()) - p.birthyear < 18) AS under_18,
       SUM(YEAR(NOW()) - p.birthyear BETWEEN 34 AND 42) AS age_34_42
FROM persons p
GROUP BY neighborhood;
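To see why conditional aggregation keeps the zero-count neighborhoods, here is a small sketch with Python's sqlite3 module and the sample rows from the question. It uses the portable CASE form instead of summing a boolean, and it takes the current year as a parameter (2020 in the demo is an arbitrary assumption) so the result does not depend on when it is run. The function names are hypothetical.

```python
import sqlite3


def build_persons():
    """Load the four sample rows from the question into an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE persons (neighborhood TEXT, birthyear INTEGER)")
    conn.executemany(
        "INSERT INTO persons VALUES (?, ?)",
        [("a", 1958), ("a", 1959), ("b", 1970), ("c", 1980)],
    )
    return conn


def count_by_age(conn, current_year, min_age, max_age):
    """Count people per neighborhood whose age is within [min_age, max_age].

    No WHERE clause removes rows before grouping, so every neighborhood
    appears in the output even when its count is 0.
    """
    sql = """
        SELECT neighborhood,
               SUM(CASE WHEN :year - birthyear BETWEEN :lo AND :hi
                        THEN 1 ELSE 0 END) AS cnt
        FROM persons
        GROUP BY neighborhood
        ORDER BY neighborhood
    """
    params = {"year": current_year, "lo": min_age, "hi": max_age}
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    demo = build_persons()
    print(count_by_age(demo, 2020, 0, 17))    # under 18
    print(count_by_age(demo, 2020, 51, 200))  # over 50
```

The difference from the original attempt is only where the condition lives: in the WHERE clause it discards rows (and with them whole groups), while inside the aggregate it just decides whether a row counts.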
I think that if the count is 0, the row doesn't appear.
Your code seems correct to me. If you try it on the example with age 50, it should give you one row with the expected values (neighborhood: a, count: 2).
I would recommend using a sub query:
SELECT
    count(*) AS group_by_count_greater_than_ten
FROM
(
    SELECT
        columnFoo,
        count(*) cnt
    FROM barTable
    WHERE columnBaz = "barbaz"
    GROUP BY columnFoo
)
AS subQuery
WHERE cnt > 10
In the above, the main query uses the subquery's result set like any other table.
To the main query, cnt is an ordinary column rather than a computed field, so it can be filtered in a WHERE clause without repeating the count() function.
Inside the subquery itself, however, a WHERE clause cannot refer to the alias cnt (that would throw an error); a filter there would have to repeat the count() expression in a HAVING clause.
In your case using a subquery would look something like this.
SELECT
    neighborhood,
    age,
    count(*) as cnt
FROM
(
    SELECT
        *,
        (YEAR(NOW()) - birthyear) as age
    FROM persons
) as WithAge
WHERE age < 18
GROUP BY neighborhood, age

Selecting highest number in sql with the same id

I am trying to select the max transaction_num from my table tbl_loan, grouped by c_id, so that no c_id appears more than once.
Here is my query:
SELECT * FROM `tbl_loan` WHERE transaction_num IN (SELECT max(transaction_num) max_trans FROM tbl_loan GROUP BY c_id)
but my output still has duplicate c_id values.
MySQL MAX with GROUP BY clause
To find the maximum value for every group, you use the MAX function with the GROUP BY clause in a SELECT statement.
You use the following query:
SELECT
*, MAX(transaction_num)
FROM
tbl_loan
GROUP BY c_id
ORDER BY MAX(transaction_num);
From the looks of it (and correct me if I'm wrong), the transaction number is sequential per C_ID and goes up whenever a new transaction happens. There is also the "I_ID" column, which appears to be an auto-incrementing column that does not duplicate. So the transaction number counts 1, 2, 3, etc. per C_ID: everyone starts with a 1, and those with more transactions have a 2nd and possibly a 3rd and more...
So, if this is accurate and you want the most recent row per C_ID, you really want the max "I_ID" per C_ID, because multiple records will exist with a transaction number of 2, 3, etc...
Try this:
SELECT
TL.*
FROM
tbl_loan TL
JOIN ( SELECT C_ID, max(I_ID) maxI_ID
FROM tbl_loan
GROUP BY c_id) MaxPer
on TL.I_ID = MaxPer.MaxI_ID
So, from your data for C_ID = 55, you have I_ID = 61 (trans num = 1) and 62 (trans num = 2). So for ID = 55, you want the transaction I_ID = 62 which represents the second transaction.
C_ID = 70 has I_IDs of 77 & 78, of which the query will grab I_ID = 78.
The rest only have a single trans num and will get their only single entry id.
HTH
Think about it like this
Your query:
SELECT * FROM `tbl_loan` WHERE transaction_num IN (SELECT max(transaction_num) max_trans FROM tbl_loan GROUP BY c_id)
Let's say your subquery returns one transaction_num of 20. This 20 can be the same for multiple c_id's, and the IN list no longer knows which c_id each value came from.
So your outer query is then running
SELECT * FROM `tbl_loan` WHERE transaction_num IN (20)
and returning all those results.
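One way around that, sketched here with Python's sqlite3 module, is to join on both c_id and the per-c_id maximum, so a maximum only matches rows of the c_id it was computed for. The rows for c_id 55 and 70 follow the I_ID values mentioned above; the row for c_id 81 is made up to stand in for the customers with a single transaction, and the function names are hypothetical.

```python
import sqlite3


def build_loans():
    """Small stand-in for tbl_loan with only the three relevant columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl_loan "
        "(i_id INTEGER PRIMARY KEY, c_id INTEGER, transaction_num INTEGER)"
    )
    conn.executemany(
        "INSERT INTO tbl_loan VALUES (?, ?, ?)",
        [
            (61, 55, 1), (62, 55, 2),   # c_id 55: two transactions
            (77, 70, 1), (78, 70, 2),   # c_id 70: two transactions
            (80, 81, 1),                # hypothetical single-transaction row
        ],
    )
    return conn


def rows_from_in_query(conn):
    """The original IN query: matches any row whose number is *some* maximum."""
    sql = """
        SELECT i_id FROM tbl_loan
        WHERE transaction_num IN (
            SELECT MAX(transaction_num) FROM tbl_loan GROUP BY c_id
        )
        ORDER BY i_id
    """
    return [row[0] for row in conn.execute(sql)]


def latest_per_customer(conn):
    """Join on c_id AND the maximum, giving one row per c_id."""
    sql = """
        SELECT tl.i_id, tl.c_id, tl.transaction_num
        FROM tbl_loan AS tl
        JOIN (
            SELECT c_id, MAX(transaction_num) AS max_trans
            FROM tbl_loan
            GROUP BY c_id
        ) AS m
          ON m.c_id = tl.c_id
         AND m.max_trans = tl.transaction_num
        ORDER BY tl.c_id
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    demo = build_loans()
    print(rows_from_in_query(demo))
    print(latest_per_customer(demo))
```

In this sample the IN version returns all five rows, because the set of maxima is {1, 2} and every row has one of those numbers.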

SQL query for finding the median of a set of data belonging to a category

I have a table with binned data for different categories, for instance:
category, bin, frequency
a, 0, 10
a, 1, 20
a, 2, 30
a, 3, 15
b, 0, 18
b, 1, 54
b, 2, 33
b, 3, 24
I need to find the approximate median for each category. To do this, I'd like to compute a cumulative percentage histogram for each category and take the first value above 50%. I know how to do this for one category:
SELECT category, bin as approx_median
FROM (
SELECT category, bin, frequency,
(SELECT SUM(frequency) FROM table sub WHERE sub.bin <= base.bin)
/ (SELECT SUM(frequency) FROM table)
* 100 as running_percent
FROM table base
WHERE category = a
ORDER BY bin ) p
WHERE p.running_percent >= 50.0
LIMIT 1
The question is, how do I do this for all the categories to obtain the result
category, approx_median
a, 2
b, 1
Thanks for any suggestion.
What you might want to do is something like this:
SELECT category, Min(bin) As approx_median
FROM(
SELECT base.category,
base.bin,
(SELECT SUM(sub.frequency) AS SummeBin FROM [table] sub WHERE sub.bin <= base.bin and sub.category = base.category)
/ (SELECT SUM(sub.frequency) FROM [table] sub WHERE sub.category = base.category GROUP BY sub.category) * 100 as running_percent
FROM [table] base
) p
WHERE running_percent >= 50.0
GROUP BY category
You need to group the category and also reference it in the aggregations.
If you use SQL Server 2012 and above you can use Window functions. An example for ABC-Analysis with Window Function.
You can use the IN operator. I don't know if it works, but give it a try.
SELECT category, bin as approx_median
FROM (
SELECT category, bin, frequency,
(SELECT SUM(frequency) FROM table sub WHERE sub.bin <= base.bin)
/ (SELECT SUM(frequency) FROM table)
* 100 as running_percent
FROM table base
WHERE category in (select distinct category from table)
ORDER BY bin ) p
WHERE p.running_percent >= 50.0
LIMIT 1
If the query you posted is doing what you actually want, then just remove the condition WHERE category = a and give it a try. Your running_percent calculation is based on bin column anyway. You may order your outer query further by category to make it look nice maybe.
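A quick way to check the per-category version against the expected result (a, 2 and b, 1) is to run it on the sample data. This sketch uses Python's sqlite3 module and correlates both subqueries on category; the table is called binned here only because table is a reserved word, and the function names are made up.

```python
import sqlite3


def build_bins():
    """Load the binned sample data from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE binned (category TEXT, bin INTEGER, frequency INTEGER)"
    )
    conn.executemany(
        "INSERT INTO binned VALUES (?, ?, ?)",
        [
            ("a", 0, 10), ("a", 1, 20), ("a", 2, 30), ("a", 3, 15),
            ("b", 0, 18), ("b", 1, 54), ("b", 2, 33), ("b", 3, 24),
        ],
    )
    return conn


def approx_medians(conn):
    """First bin per category whose cumulative share reaches 50%."""
    sql = """
        SELECT category, MIN(bin) AS approx_median
        FROM (
            SELECT base.category,
                   base.bin,
                   -- 100.0 forces floating-point division
                   100.0
                   * (SELECT SUM(sub.frequency)
                      FROM binned AS sub
                      WHERE sub.category = base.category
                        AND sub.bin <= base.bin)
                   / (SELECT SUM(tot.frequency)
                      FROM binned AS tot
                      WHERE tot.category = base.category) AS running_percent
            FROM binned AS base
        ) AS p
        WHERE running_percent >= 50.0
        GROUP BY category
        ORDER BY category
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    print(approx_medians(build_bins()))
```

Taking MIN(bin) per category replaces the ORDER BY ... LIMIT 1 of the single-category query, which cannot pick one row per group.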

SELECT rows with minimum count(*)

Let's say I have a simple table voting with columns
id (primary key), token (int), candidate (int), rank (int).
I want to extract all rows with a specific rank, grouped by candidate and, most importantly, only those with the minimum count(*).
So far I have reached
SELECT candidate, count( * ) AS count
FROM voting
WHERE rank =1
AND candidate <200
GROUP BY candidate
HAVING count = min( count )
But it returns an empty set. If I replace min(count) with the actual minimum value, it works properly.
I have also tried
SELECT candidate,min(count)
FROM (SELECT candidate,count(*) AS count
FROM voting
where rank = 1
AND candidate < 200
group by candidate
order by count(*)
) AS temp
But this resulted in only 1 row. I have 3 rows with the same minimum count but different candidates, and I want all 3 rows.
Can anyone help me? The number of rows with the same minimum count(*) value would also help.
The sample is quite big, so I am showing some dummy values:
1 $sampleToken1 101 1
2 $sampleToken2 102 1
3 $sampleToken3 103 1
4 $sampleToken4 102 1
Here, when grouped by candidate, the rows combine into these count(*) results:
candidate count( * )
101 1
103 1
102 2
I want the top 2 rows to be shown, i.e. those with count(*) = 1 or whatever the minimum is.
Try using this script as a pattern:
-- find minimum count
SELECT MIN(cnt) INTO @min FROM (SELECT COUNT(*) cnt FROM voting GROUP BY candidate) t;
-- show records with minimum count
SELECT * FROM voting t1
JOIN (SELECT candidate FROM voting GROUP BY candidate HAVING COUNT(*) = @min) t2
ON t1.candidate = t2.candidate;
Remove your HAVING clause completely; it is not written correctly.
Then add a sub-select to the WHERE clause to match that criterion.
(ie. select cand, count(*) as count from voting where rank = 1 and count = (select ..... )
The HAVING keyword can not use the MIN function in the way you are trying. Replace the MIN function with an absolute value such as HAVING count > 10
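A single-statement variant is also possible: compare each candidate's count against the minimum computed by a scalar subquery in HAVING. This is a sketch using Python's sqlite3 module with the four dummy rows from the question (the token values are placeholders and the function names are made up). The column is written as "rank" in quotes; on MySQL 8.0+, where RANK is a reserved word, backticks would be needed instead.

```python
import sqlite3


def build_votes():
    """Load the dummy voting rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE voting '
        '(id INTEGER PRIMARY KEY, token TEXT, candidate INTEGER, "rank" INTEGER)'
    )
    conn.executemany(
        "INSERT INTO voting VALUES (?, ?, ?, ?)",
        [
            (1, "token1", 101, 1),
            (2, "token2", 102, 1),
            (3, "token3", 103, 1),
            (4, "token4", 102, 1),
        ],
    )
    return conn


def least_voted(conn):
    """Candidates whose vote count equals the smallest per-candidate count."""
    sql = """
        SELECT candidate, COUNT(*) AS cnt
        FROM voting
        WHERE "rank" = 1 AND candidate < 200
        GROUP BY candidate
        HAVING COUNT(*) = (
            SELECT MIN(n)
            FROM (
                SELECT COUNT(*) AS n
                FROM voting
                WHERE "rank" = 1 AND candidate < 200
                GROUP BY candidate
            ) AS per_candidate
        )
        ORDER BY candidate
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    print(least_voted(build_votes()))
```

The inner query repeats the same WHERE filter on purpose, so the minimum is taken over exactly the groups the outer query sees. The number of tied candidates is then just the number of rows returned.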

Mysql- Aggregate function query grouping problem

Consider the following table.
I'm trying to write a query that displays the max value for each part per category, along with the date when the value was at its max.
So I tried this:
select Part_id, Category, max(Value), Time_Captured
from data_table
where category = 'Temperature'
group by Part_id, Category
First of all, MySQL didn't throw an error for not having Time_Captured in the GROUP BY.
Not sure if that's a problem with MySQL in general or with my installation.
So I assume it should return -
1 Temperature 50 11-08-2011 08:00
2 Temperature 70 11-08-2011 09:00
But it's returning the time captured from the first record of the data set, i.e. 11-08-2011 07:00
Not sure where I'm going wrong. Any thoughts?
(Note: I'm running this inside a VM. Just in case if it changes anything)
You need to join to the results of a query that finds the max(value), like this:
select dt.Part_id, dt.Category, dt.Value, dt.Time_Captured
from data_table dt
join (select Part_id, Category, max(Value) as Value
      from data_table group by 1, 2) x
  on x.Part_id = dt.Part_id and x.Category = dt.Category
 and x.Value = dt.Value
where dt.category = 'Temperature';
Note that this will return multiple rows if there are multiple rows with the same max value.
If you want to limit this to one row even though there are multiple matches for max(value), select the max(Time_Captured) (or min(Time_Captured) if you prefer), like this:
select dt.Part_id, dt.Category, dt.Value, max(dt.Time_Captured) as Time_Captured
from data_table dt
join (select Part_id, Category, max(Value) as Value
      from data_table group by 1, 2) x
  on x.Part_id = dt.Part_id and x.Category = dt.Category
 and x.Value = dt.Value
where dt.category = 'Temperature'
group by 1, 2, 3;
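Since the table from the question is not shown, here is a sketch of the join approach on made-up rows, using Python's sqlite3 module. The rows are chosen only so that they reproduce the expected output quoted in the question (50 at 08:00 for part 1, 70 at 09:00 for part 2, with earlier 07:00 readings); the Humidity row and the function names are likewise hypothetical.

```python
import sqlite3


def build_readings():
    """Hypothetical data_table contents; the real table is not shown."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data_table "
        "(Part_id INTEGER, Category TEXT, Value INTEGER, Time_Captured TEXT)"
    )
    conn.executemany(
        "INSERT INTO data_table VALUES (?, ?, ?, ?)",
        [
            (1, "Temperature", 40, "11-08-2011 07:00"),
            (1, "Temperature", 50, "11-08-2011 08:00"),
            (2, "Temperature", 60, "11-08-2011 07:00"),
            (2, "Temperature", 70, "11-08-2011 09:00"),
            (1, "Humidity", 90, "11-08-2011 07:00"),
        ],
    )
    return conn


def max_with_time(conn, category):
    """Row holding the maximum Value for each part within one category."""
    sql = """
        SELECT dt.Part_id, dt.Category, dt.Value, dt.Time_Captured
        FROM data_table AS dt
        JOIN (
            SELECT Part_id, Category, MAX(Value) AS Value
            FROM data_table
            GROUP BY Part_id, Category
        ) AS x
          ON x.Part_id = dt.Part_id
         AND x.Category = dt.Category
         AND x.Value = dt.Value      -- ties the row to its group's maximum
        WHERE dt.Category = ?
        ORDER BY dt.Part_id
    """
    return conn.execute(sql, (category,)).fetchall()


if __name__ == "__main__":
    print(max_with_time(build_readings(), "Temperature"))
```

The Time_Captured column comes from the joined row itself rather than from the GROUP BY, which is why it lines up with the maximum instead of with whichever row the engine happens to pick.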