count records grouped into ranges - mysql

I have a table in my database called students with an age column, among other columns. I need to count the students in each age range, e.g. the number of students aged 0-5, 6-10, 11-15 and onward. Can I get this with a single query, or do I need to use a BETWEEN clause in a loop?
Thanks
EDIT:
This can also be framed as an employee-salary relation: count the employees with salaries in different ranges, e.g. 1000k to 1500k.

I don't think there is a built-in function that groups the ages into your ranges (buckets), so you'll have to define the buckets manually and then group the data.
select
case when age <= 5 then '0-5'
when age <= 10 then '6-10'
when age <= 15 then '11-15'
...
...
end as agerange,
count(studentID)
from students
group by agerange

You can do it with a CASE expression that uses BETWEEN for the given ages (my ranges and table name are just an example), and then group by the result.
select case
when age between 0 and 20 then ' 0-20'
when age between 21 and 41 then ' 21-41'
when age between 42 and 62 then ' 42-62'
when age between 63 and 83 then ' 63-83'
when age between 84 and 104 then ' 84-104'
end as `range`, count(*) as `users`
from users group by `range`
I haven't been able to test it, but it should work :). If it doesn't, post the error and we can try to fix it from there.
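For reference, here is a minimal, self-contained sketch of the CASE plus GROUP BY approach described in the answers above. It runs the SQL through Python's built-in sqlite3 module rather than MySQL (an assumption, made only so the example can be run anywhere). The sample ages, the ELSE '16+' bucket and the ORDER BY MIN(age) are illustrative additions, not part of the original question.

```python
import sqlite3

# In-memory database with a hypothetical students table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (studentID INTEGER PRIMARY KEY, age INTEGER)")
sample_ages = [3, 5, 6, 9, 10, 11, 14, 15, 20]  # made-up sample data
conn.executemany("INSERT INTO students (age) VALUES (?)", [(a,) for a in sample_ages])

# Label each row with its bucket, then count the rows per bucket.
query = """
SELECT CASE
         WHEN age BETWEEN 0  AND 5  THEN '0-5'
         WHEN age BETWEEN 6  AND 10 THEN '6-10'
         WHEN age BETWEEN 11 AND 15 THEN '11-15'
         ELSE '16+'
       END AS agerange,
       COUNT(*) AS student_count
FROM students
GROUP BY agerange
ORDER BY MIN(age)
"""
age_counts = conn.execute(query).fetchall()
for agerange, student_count in age_counts:
    print(agerange, student_count)
```

Ordering by MIN(age) keeps the buckets in numeric order; ordering by the label itself would sort them as text.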

Related

SQL: how to produce table of all the individual ID's that had more than 2 occurrences within 24 hours of each other

I have a table as shown below
User ID  Payment ID  DateTime             Amount
UID1     12dw4r434t  19/08/2020 13:40:12  10
UID2     k2dw86774z  19/08/2020 14:30:52  5
UID3     5hjs982835  17/08/2020 09:56:08  7
UID1     hg19283jdg  20/08/2020 07:59:33  12
UID1     2563ghmn77  20/08/2020 08:10:22  54
UID2     999gegh77d  19/08/2020 17:11:37  67
UID2     mnnnhsgdje  20/08/2020 19:18:55  67
UID1     qccc356njd  20/08/2020 16:10:11  87
UID3     mmk0999111  18/08/2020 05:16:29  4
UID3     yyy63hgd72  18/08/2020 05:25:44  4
I want to be able to produce a table/list of all the User IDs that have at least 3 (more than 2) occurrences within 24 hours of each other.
The result for the above data would be:
User ID
UID1
UID3
To be clear, I want the 24 hour period to be rolling. I am not looking for the results over a fixed date range.
ALTERNATIVE SOLUTION OUTPUT
Another output that would satisfy my need is the table above with an extra column (Count) added. For every row, the Count column counts how many transactions in the whole table are within 24 hours of the current row's transaction and have the same User ID. I easily managed to do this using COUNTIF in Excel, but cannot get it working in SQL. Below is how I tried to do this:
SELECT
a."User Id",
a."Payment Id",
a."created_at",
a."Amount",
b.m_count
FROM "Payment Summary Table" AS a
INNER JOIN( SELECT
"User Id",
"created_at",
COUNT(*) AS m_count
FROM "Payment Summary Table"
GROUP BY "Merchant",
"amount"
) AS b ON a."User Id" = b."User Id"
AND b."created_at" <= dateadd (hour, 24, a."created_at")
You can get the first of the three using lead() and date arithmetic:
select pst.*
from (select pst.*,
lead(created_at, 2) over (partition by user_id order by created_at) as created_at_2
from payment_summary_table pst
) pst
where created_at_2 < dateadd(hour, 24, created_at);
This looks at the payment 2 rows ahead for the user. If it is within 24 hours, then the three payments are within 24 hours.
If you just want the users, replace the select with:
select distinct user_id
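As a rough check of the lead() approach, the sketch below loads the sample payments from the question into an in-memory SQLite database via Python's sqlite3 module. Several things here are assumptions made for the example: SQLite stands in for the original database, so dateadd(hour, 24, created_at) becomes datetime(created_at, '+24 hours'); the timestamps are rewritten in ISO format so they compare correctly as text; and the snake_case column names follow the answer rather than the question. Window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payment_summary_table "
    "(user_id TEXT, payment_id TEXT, created_at TEXT, amount INTEGER)"
)
# Sample rows from the question, with dates converted to ISO format.
payments = [
    ("UID1", "12dw4r434t", "2020-08-19 13:40:12", 10),
    ("UID2", "k2dw86774z", "2020-08-19 14:30:52", 5),
    ("UID3", "5hjs982835", "2020-08-17 09:56:08", 7),
    ("UID1", "hg19283jdg", "2020-08-20 07:59:33", 12),
    ("UID1", "2563ghmn77", "2020-08-20 08:10:22", 54),
    ("UID2", "999gegh77d", "2020-08-19 17:11:37", 67),
    ("UID2", "mnnnhsgdje", "2020-08-20 19:18:55", 67),
    ("UID1", "qccc356njd", "2020-08-20 16:10:11", 87),
    ("UID3", "mmk0999111", "2020-08-18 05:16:29", 4),
    ("UID3", "yyy63hgd72", "2020-08-18 05:25:44", 4),
]
conn.executemany("INSERT INTO payment_summary_table VALUES (?, ?, ?, ?)", payments)

# For each payment, fetch the timestamp of the payment two rows later for the
# same user; if that is less than 24 hours away, three payments share a window.
query = """
SELECT DISTINCT user_id
FROM (
    SELECT user_id,
           created_at,
           LEAD(created_at, 2) OVER (
               PARTITION BY user_id ORDER BY created_at
           ) AS created_at_2
    FROM payment_summary_table
) AS pst
WHERE created_at_2 < datetime(created_at, '+24 hours')
ORDER BY user_id
"""
flagged_users = [row[0] for row in conn.execute(query)]
print(flagged_users)
```

Rows with fewer than two later payments get a NULL created_at_2, so the comparison is not true and they drop out of the result without any extra filter.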

two decimal places in MySQL but it shows error at the word SELECT

Hi, can anyone help me output the "NEW-BONUS" column with two decimal places? I tried CAST, ROUND, TRUNCATE and DECIMAL, but each attempt shows an error at the word SELECT.
SELECT LASTNAME, EDLEVEL,
SALARY+1200 AS "NEW-SALARY",
BONUS*0.5 AS "NEW-BONUS"
FROM EMPLOYEE
WHERE EDLEVEL = 18 OR EDLEVEL = 20
ORDER BY EDLEVEL DESC, 3
You can try using TRUNCATE.
Please update your query like this:
SELECT LASTNAME, EDLEVEL,
(123+1200) AS "NEW-SALARY",
TRUNCATE((111*0.5),2) AS "NEW-BONUS"
FROM EMPLOYEE
WHERE EDLEVEL = 18 OR EDLEVEL = 20
ORDER BY EDLEVEL DESC, 3
123 and 111 are demo values standing in for SALARY and BONUS; in your own query, use SALARY+1200 and TRUNCATE(BONUS*0.5, 2).
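The sketch below shows the same query shape running against the real columns instead of demo constants. It is only an approximation of the answer: it uses SQLite through Python's sqlite3 module, and because SQLite has no TRUNCATE function it swaps in ROUND(BONUS * 0.5, 2), which rounds rather than truncates. The employee rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (LASTNAME TEXT, EDLEVEL INTEGER, SALARY REAL, BONUS REAL)"
)
# Hypothetical sample rows; only EDLEVEL 18 and 20 pass the filter.
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
    [
        ("SMITH", 18, 52750.0, 1234.567),
        ("JONES", 20, 46500.0, 900.0),
        ("BROWN", 16, 40175.0, 800.0),
    ],
)

# ROUND(..., 2) is used here in place of MySQL's TRUNCATE(..., 2).
query = """
SELECT LASTNAME, EDLEVEL,
       SALARY + 1200         AS "NEW-SALARY",
       ROUND(BONUS * 0.5, 2) AS "NEW-BONUS"
FROM EMPLOYEE
WHERE EDLEVEL = 18 OR EDLEVEL = 20
ORDER BY EDLEVEL DESC, 3
"""
bonus_rows = conn.execute(query).fetchall()
for row in bonus_rows:
    print(row)
```

Note that rounding or truncating changes the stored numeric value only; whether trailing zeros are displayed depends on the client.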

Select Statements: Select referring previous select query

I have two queries which work independently; I need to combine their results.
Fetch all fields (including wDate and Empid) from FromDate to ToDate.
Calculate a value (the efficiency, Efc) for a specific wDate and Empid from the first query.
1st Query
SELECT *
FROM tblProductionEffcyDetails
WHERE wDate BETWEEN '06/26/2019' AND '07/25/2019'
AND worker = 'Techn'
ORDER BY Empid, wDate
2nd Query
SELECT Cast(ROUND(SUM(Tstdmin) / NULLIF(SUM(TAvlblmin), 0) * 100,0) as int) AS [Efc]
FROM tblProductionEffcyDetails
WHERE wDate='07/11/2019'
AND Empid='00021'
GROUP BY wdate, Empid
That is, in this 2nd Query, the values for wDate and Empid should come from the results of the 1st Query.
Notes on the data/ table:
Any particular date (wDate) or person (Empid) can have any number of entries.
Efficiency (Efc) should be given just once per day (wDate), i.e. it should not have multiple values for a particular wDate.
Table structure is as below
SL      wDate      Avlbl_Mins  NP_Mins  Empid  Name      Process              Model     Efc
117571  7/13/2019  0           0        21     MARRY     Block removing       900-2930  80
117572  7/13/2019  0           0        21     MARRY     Microscope checking  900-2929  Null
116872  6/26/2019  430         75       52     SUGANTHI  Slab removing        900-2929  75
116873  6/26/2019  0           0        52     SUGANTHI  Slab Removing        900-2528  Null
Try this.
Your first query identifies the Empid values for dates between 06/26/2019 and 07/25/2019; using it as a subquery of your second query lets you compute the efficiency for each of them. wDate is added to the select list so that each result row shows which day it belongs to.
SELECT t.wDate, t.Empid, Cast(ROUND(SUM(Tstdmin) / NULLIF(SUM(TAvlblmin), 0) * 100,0) as int) AS [Efc]
FROM tblProductionEffcyDetails t
WHERE t.Empid in (
SELECT Empid
FROM tblProductionEffcyDetails tb
WHERE tb.wDate BETWEEN '06/26/2019' AND '07/25/2019'
AND tb.worker = 'Techn') and t.wDate BETWEEN '06/26/2019' AND '07/25/2019'
GROUP BY t.wdate, t.Empid
I wrote the query based on my understanding of your question. Let me know if it doesn't give the output you want, so I can change the answer.
Hope this helps.
SELECT t.wDate, t.Empid, Cast(ROUND(SUM(Tstdmin) / NULLIF(SUM(TAvlblmin), 0) * 100,0) as int) AS [Efc]
FROM (
SELECT *
FROM tblProductionEffcyDetails tb1
WHERE tb1.wDate BETWEEN '06/26/2019' AND '07/25/2019'
AND tb1.worker = 'Techn') as t
GROUP BY t.wdate, t.Empid
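To illustrate the derived-table version, here is a small runnable sketch using Python's sqlite3 module. It is not the original SQL Server setup: the dates are stored in ISO format so that BETWEEN compares them correctly as text, CAST(... AS INTEGER) replaces CAST(... AS int), and all rows are invented for the example. The columns Tstdmin and TAvlblmin are taken from the queries in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblProductionEffcyDetails "
    "(wDate TEXT, Empid TEXT, worker TEXT, Tstdmin REAL, TAvlblmin REAL)"
)
# Hypothetical rows: several entries per day and person are allowed.
conn.executemany(
    "INSERT INTO tblProductionEffcyDetails VALUES (?, ?, ?, ?, ?)",
    [
        ("2019-07-11", "00021", "Techn", 200.0, 430.0),
        ("2019-07-11", "00021", "Techn", 144.0, 0.0),
        ("2019-07-13", "00021", "Techn", 50.0, 0.0),    # no available minutes
        ("2019-06-26", "00052", "Techn", 322.5, 430.0),
        ("2019-08-01", "00021", "Techn", 100.0, 100.0),  # outside the date range
        ("2019-07-11", "00099", "Admin", 100.0, 100.0),  # different worker type
    ],
)

# The inner query is the first query; the outer query aggregates its rows
# once per (wDate, Empid). NULLIF avoids a division by zero.
query = """
SELECT t.Empid, t.wDate,
       CAST(ROUND(SUM(t.Tstdmin) / NULLIF(SUM(t.TAvlblmin), 0) * 100, 0)
            AS INTEGER) AS Efc
FROM (
    SELECT *
    FROM tblProductionEffcyDetails
    WHERE wDate BETWEEN '2019-06-26' AND '2019-07-25'
      AND worker = 'Techn'
) AS t
GROUP BY t.wDate, t.Empid
ORDER BY t.Empid, t.wDate
"""
efficiency_rows = conn.execute(query).fetchall()
for row in efficiency_rows:
    print(row)
```

Grouping by both wDate and Empid is what guarantees a single Efc value per person per day, whatever the number of detail rows.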

Passing values to a query that may be null

I'm trying to make a filter section for a website that will return objects from the DB.
SELECT *
FROM users
WHERE type = "1"
AND age < 50
AND age > 20
When running this query, some users may not want to set a type and only want to filter by age. Is there any sort of wildcard I can pass into the query so that it returns all types if the user has not specified one?
The type can be 1, 2 or 3. If no type is passed in, I want the query to return all users between the ages of 20 and 50, of every type.
There are two possible solutions. The first is simple and returns all records with age between 20 and 50. The second filters on type only when a type has been supplied; @type is a placeholder for the value passed in, which is NULL when the user did not choose a type. Note that BETWEEN is inclusive, unlike the > 20 and < 50 in the question.
SELECT *
FROM users
WHERE age BETWEEN 20 AND 50
OR
SELECT * FROM users
WHERE (@type IS NULL OR type = @type)
AND age BETWEEN 20 AND 50
SELECT * FROM USERS WHERE AGE BETWEEN 20 AND 50 AND COALESCE(TYPE,'1')=1
Try this query :
select * from table_name where age between 20 and 50 AND type is null
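One common way to handle an optional filter is to pass the value as a query parameter and let the WHERE clause ignore it when it is NULL. The sketch below shows that pattern with Python's sqlite3 module; the find_users helper, the :type parameter name and the sample rows are all hypothetical, and the age bounds are the exclusive ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, type INTEGER, age INTEGER)")
# Made-up sample users: (id, type, age).
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, 1, 25), (2, 2, 30), (3, 3, 45), (4, 1, 60), (5, 2, 20)],
)

def find_users(conn, user_type=None):
    """Return ids of users aged 21-49, optionally restricted to one type.

    When user_type is None the parameter is bound as NULL, so the
    ':type IS NULL' branch is true and the type filter is skipped.
    """
    query = """
    SELECT id
    FROM users
    WHERE (:type IS NULL OR type = :type)
      AND age > 20
      AND age < 50
    ORDER BY id
    """
    return [row[0] for row in conn.execute(query, {"type": user_type})]

print(find_users(conn))      # no type chosen: every type in the age range
print(find_users(conn, 1))   # only type 1 in the age range
```

Binding the value as a parameter, rather than building the SQL string by hand, also avoids quoting problems with user input.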

Optimizing similar MySQL subqueries

This is a subquery I have in a larger SQL script. It's performing the same action within multiple different CASE statements, so I was hoping I could somehow combine the action so it doesn't have to do the same thing over and over. However, I can't get the right results if I move the ORDER BY command outside of the CASE statements.
I'm joining 2 tables, met_data and flexgridlayers_table, on JDAY. Flexgridlayers_table has fields for JDAY and Segment, and met_data has fields JDAY, TAIR, and TDEW (in this simple example, but actually more fields). I'm running this through Matlab, so variable1 and variable2 are values set by a nested loop. I need to use CASE statements to account for the situation where variable1 is not equal to 1, then I want to output 0. Otherwise, I want to find values corresponding to a JDAY join, but the values may not be exact matches in F.JDAY and M.JDAY. I want to match on the closest <= value, so I use the ORDER BY M.JDAY DESC LIMIT 1 statement in each subquery.
The output is a table with fields JDAY (from F.JDAY), TAIR, and TDEW. Whenever I try moving the ORDER BY part outside of the CASE statements to get rid of the repeating subqueries, I get only a single row of results representing the largest JDAY. This query gives me the correct result - is there a way to optimize this?
SELECT F.JDAY,
CASE
WHEN *variable1*<>1 THEN 0
ELSE
(SELECT M.TAIR
FROM met_data AS M
WHERE M.Year=2000 AND M.JDAY<=F.JDAY
ORDER BY M.JDAY DESC LIMIT 1)
END AS TAIR,
CASE
WHEN *variable1*<>1 THEN 0
ELSE
(SELECT M.TDEW
FROM met_data AS M
WHERE M.Year=2000 AND M.JDAY<=F.JDAY
ORDER BY M.JDAY DESC LIMIT 1)
END AS TDEW
FROM FlexGridLayers_table AS F
WHERE F.SEGMENT=*variable2*
Further explanation:
This query pulls all JDAY values from flexgridlayers_table, and then searches within the table met_data to find values corresponding to the closest <= JDAY values in that table. For example, consider the following flexgridlayers_table and met_data tables:
flexgridlayers_table:
Segment JDAY
2 1.5
2 2.5
2 3.5
3 1.5
3 2.5
3 3.5
met_data:
JDAY Year TAIR TDEW
1.0 2000 7 8
1.1 2000 9 10
1.6 2000 11 12
2.5 2000 13 14
2.6 2000 15 16
3.4 2000 17 18
4.0 2000 19 20
What I want (and what the query above returns) would be this, for variable1=1 and variable2=2:
JDAY TAIR TDEW
1.5 9 10
2.5 13 14
3.5 17 18
I'm just wondering if there is a more efficient way of writing this query, so I'm not performing the ORDER BY command on the same list of JDAY values over and over for each TAIR, TDEW, etc. field.
Then I would write it as follows. It looks like you are looking for one TAIR and one TDEW per JDAY. If that is the case, join to your met_data table once, on the year condition and on the F vs. M JDAY values. Normally this would return multiple rows per JDAY, so the inner query keeps only the MAX() of M.JDAY for each F.JDAY.
SELECT
PQ.JDay,
PQ.MaxJDayPerFDay,
CASE WHEN *var1* <> 1 THEN 0 ELSE M2.TAIR END TAIR,
CASE WHEN *var1* <> 1 THEN 0 ELSE M2.TDEW END TDEW
from
( SELECT
F.JDay,
MAX( M.JDAY ) as MaxJDayPerFDay
from
FlexGridLayers_Table F
JOIN met_Data M
ON M.Year = 2000
AND F.JDay >= M.JDay
where
F.Segment = *var2*
group by
F.JDay ) PQ
JOIN Met_Data M2
on M2.Year = 2000
AND PQ.MaxJDayPerFDay = M2.JDay
Now this does a pre-query by applying a MAX() JDay in the met_data ONCE and group by JDay so it will always return one record per F.JDay. So, now you have one query pre-qualified for your F.Segment = variable 2. If you had other columns you wanted from the "F" table, put them into this "PreQuery" (PQ alias) as needed.
Then, this result can immediately be joined back to the met_data table since the one day value is now explicitly known from the prequery. So, you can now get both TAIR and TDEW values at once rather than in two separate queries being applied for every record.
Hope this makes sense. If not, let me know.
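The pre-query approach can be checked against the sample tables from the question. The sketch below does so with Python's sqlite3 module, which is an assumption (the original runs on MySQL from Matlab); the :var1 and :var2 named parameters stand in for the Matlab variables, and the closest_met_values helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE FlexGridLayers_table (SEGMENT INTEGER, JDAY REAL)")
conn.execute("CREATE TABLE met_data (JDAY REAL, Year INTEGER, TAIR INTEGER, TDEW INTEGER)")
# Sample data copied from the question.
conn.executemany(
    "INSERT INTO FlexGridLayers_table VALUES (?, ?)",
    [(2, 1.5), (2, 2.5), (2, 3.5), (3, 1.5), (3, 2.5), (3, 3.5)],
)
conn.executemany(
    "INSERT INTO met_data VALUES (?, ?, ?, ?)",
    [
        (1.0, 2000, 7, 8), (1.1, 2000, 9, 10), (1.6, 2000, 11, 12),
        (2.5, 2000, 13, 14), (2.6, 2000, 15, 16), (3.4, 2000, 17, 18),
        (4.0, 2000, 19, 20),
    ],
)

# PQ finds, once per F.JDAY, the closest met_data JDAY that is <= it;
# joining back to met_data then yields TAIR and TDEW together.
query = """
SELECT PQ.JDay,
       CASE WHEN :var1 <> 1 THEN 0 ELSE M2.TAIR END AS TAIR,
       CASE WHEN :var1 <> 1 THEN 0 ELSE M2.TDEW END AS TDEW
FROM (
    SELECT F.JDAY AS JDay, MAX(M.JDAY) AS MaxJDayPerFDay
    FROM FlexGridLayers_table AS F
    JOIN met_data AS M
      ON M.Year = 2000 AND F.JDAY >= M.JDAY
    WHERE F.SEGMENT = :var2
    GROUP BY F.JDAY
) AS PQ
JOIN met_data AS M2
  ON M2.Year = 2000 AND PQ.MaxJDayPerFDay = M2.JDAY
ORDER BY PQ.JDay
"""

def closest_met_values(var1, var2):
    """Run the combined query for the given variable1 and variable2."""
    return conn.execute(query, {"var1": var1, "var2": var2}).fetchall()

print(closest_met_values(1, 2))
```

With variable1=1 and variable2=2 this reproduces the three rows the question lists as the desired output, while sorting and matching met_data only once per F.JDAY instead of once per output column.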