SQL merge results of different rows, combined by a case - mysql

I have the following table:
type | amount
pine | 10
cypress | 40
gold | 30
silver | 25
I would like to classify, merge and sum within a case:
SELECT CASE WHEN (type == 'pine' OR type == 'cypress') then 'wood' end, from materials;
I would like to get:
wood | 50
gold | 30
silver | 25
I thought a CASE would merge the results, but apparently it doesn't. I'm trying with a SUM, but without success.

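You can use the CASE expression like that, but you'll also have to group afterwards to get the totals. Give the result an alias that differs from the column name: MySQL resolves names in GROUP BY against the table's columns before the select-list aliases, so grouping by an alias called type would group by the original column and return pine and cypress as two separate 'wood' rows. Try this:
SELECT
(CASE WHEN type = 'pine' OR type = 'cypress' THEN 'wood' ELSE type END) AS category,
SUM(amount) AS total
FROM materials
GROUP BY category;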
Your case statement doesn't have any ELSE to it. You should include this, because it appears that if it's not pine or cypress, then you want to select whatever type already exists for that material.
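As a quick check of the CASE-plus-GROUP BY idea, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for MySQL here, and the alias category is an illustrative name, not something from the question.

```python
import sqlite3

# In-memory stand-in for the question's table (SQLite, not MySQL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE materials (type TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO materials VALUES (?, ?)",
    [("pine", 10), ("cypress", 40), ("gold", 30), ("silver", 25)],
)

# Map pine/cypress to 'wood', keep every other type, then sum per category.
rows = conn.execute(
    """
    SELECT CASE WHEN type IN ('pine', 'cypress') THEN 'wood' ELSE type END AS category,
           SUM(amount) AS total
    FROM materials
    GROUP BY category
    ORDER BY total DESC
    """
).fetchall()

for category, total in rows:
    print(category, total)
```

Without the ELSE branch, gold and silver would both map to NULL and collapse into a single group.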

Try this:
SELECT CASE WHEN (type = 'pine' OR type = 'cypress') THEN 'wood' ELSE type END AS MacroType, SUM(amount) FROM materials GROUP BY MacroType;

Related

Ignore empty values on average math, but show it as zero on result

I need to calculate the average value of fields, but two things need to happen:
1- The empty values should NOT be counted for the average math.
2- If the field is empty it still must be shown in the result (with avg === 0)
Imagine that I have this dataset:
-----------------------
Code | valField | Date
-----------------------
A | | 2020-09-08
B | 12 | 2020-09-09
A | 10 | 2020-09-08
B | 15 | 2020-09-09
B | | 2020-09-09
C | | 2020-09-09
So I need the average of the day. As you can see, we have:
A = { empty, 10 }
B = { 12, 15, empty }
C = { empty }
I need to make the average like this:
Average of A = 10
Average of B = (12+15)/2 (because we have 2 non-empty values)
Average of C = 0 (It has not a single value, but I need it to show on result as 0)
So far I could accomplish each of the requirements, but not both at the same time.
This query shows the codes with empty values, BUT it also counts the empty fields in the average:
SELECT AVG(valField) FROM myTable;
So Average of B would be = (12+15+0)/3 - wrong!
Now this will ignore empty values, the AVG math will be correct, but C would NOT be shown.
SELECT AVG(valField) FROM myTable WHERE valField <> ''
How may I accomplish both requirements?
From your comment I understand that valField is defined as varchar, so you can use the following trick:
select
Code,
coalesce(avg(nullif(valField, '')), 0) as avg_value
from tbl
group by Code;
Test the query on SQLize.online
Here I used the NULLIF function to convert empty values to NULL before calculating the average.
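To see the NULLIF/COALESCE combination on the sample data, here is a small sketch run against in-memory SQLite via Python's sqlite3. It assumes valField is stored as text with '' for empty, as in the varchar case above; SQLite stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (Code TEXT, valField TEXT, Date TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        ("A", "", "2020-09-08"),
        ("B", "12", "2020-09-09"),
        ("A", "10", "2020-09-08"),
        ("B", "15", "2020-09-09"),
        ("B", "", "2020-09-09"),
        ("C", "", "2020-09-09"),
    ],
)

# NULLIF turns '' into NULL so AVG skips it; COALESCE maps an all-NULL group to 0.
averages = conn.execute(
    """
    SELECT Code, COALESCE(AVG(NULLIF(valField, '')), 0) AS avg_value
    FROM tbl
    GROUP BY Code
    ORDER BY Code
    """
).fetchall()

for code, avg_value in averages:
    print(code, avg_value)
```

Code C still appears because there is no WHERE clause filtering its rows out; only the aggregate ignores the empty values.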
I think you want:
SELECT code, COALESCE(AVG(valField), 0) FROM myTable GROUP BY code
This assumes valField is of a numeric datatype, and that by empty you mean null.
Here is what happens under the hood:
avg(), as most other aggregate functions, ignores null values
if all values are null, then avg() does return null; you can replace that with 0 using coalesce()
That should be easy: just create two queries, one that calculates the average using the non-null values and another that finds the codes having no value in the data.
select round(avg(valField)) as avg, code from new where valField is not null group by Code
union all
select 0 as avg, code from new group by Code having avg(valField) is null;

If more than 10% of results are over X in mysql

I have a database table with lists of temperature readings from many locations in a number of buildings. I need a query that will give me a true or false if more than 10% of the readings in a building, taken on a date, are greater than X
I am not looking for an average. If there are 100 measurements taken in a building on a date, and more than 10 of them are over X (say 80 degrees), then create a flag.
The table is laid out as
| Building # | location # | date | temperature |
| 123 | 555 | 2016-04-08 | 68.5 |
| 123 | 556 | 2016-04-08 | 70.2 |
| 123 | 557 | 2016-04-08 | 65.4 |
| 888 | 999 | 2013-03-22 | 80.4 |
Typically a building would have over 100 readings. There are many hundreds of building/date entries in the table
Can this be done with a single mysql query and can you share that query with me?
I obviously haven't made my question clear.
The result I am looking for is a single True or False.
If more than 10% of the results for a building/date combination were over X (say 80) then show true, or some flag equal to true.
The known fields will be building and date. The location is not relevant, and can be ignored. So given the input of building (123) and date (2016-04-08) are more than 10% of the entries in the table that have that building number and date greater than X (e.g. 80). The only data to be tested are those for that building and date. So the query would end in:
where building_id = '123' AND date = '2016-04-08'
I am NOT looking for an average or a median. I am NOT looking to see a list of the data for that 10%. I am just looking for true or false.
You can use conditional aggregation, something like this:
select building, date,
(case when avg(temperature > x) > 0.1 then 'Y' else 'N' end) as flag
from t
group by building, date;
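The trick is that a comparison such as temperature > x evaluates to 1 or 0, so its average is the fraction of readings over the threshold. A minimal sketch of that, using in-memory SQLite through Python's sqlite3 (SQLite also evaluates comparisons to 1/0); the sample readings are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (building INTEGER, location INTEGER, date TEXT, temperature REAL)"
)

readings = []
# Building 123: 10 readings, 2 of them above 80 (20% of the readings).
for i in range(10):
    readings.append((123, 500 + i, "2016-04-08", 85.0 if i < 2 else 70.0))
# Building 888: 20 readings, 1 of them above 80 (5% of the readings).
for i in range(20):
    readings.append((888, 900 + i, "2016-04-08", 85.0 if i < 1 else 70.0))
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", readings)

x = 80  # threshold, passed in as a bound parameter
flags = conn.execute(
    """
    SELECT building, date,
           CASE WHEN AVG(temperature > ?) > 0.1 THEN 'Y' ELSE 'N' END AS flag
    FROM t
    GROUP BY building, date
    ORDER BY building
    """,
    (x,),
).fetchall()

for building, date, flag in flags:
    print(building, date, flag)
```

Adding a WHERE clause on building and date and dropping the GROUP BY would reduce this to the single flag the question asks for.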
To return building and date, and "create a flag" for rows where more than 10% of the readings for that building on that date are over a given value X ...
SELECT r.building
, DATE(r.date)
, ( SUM(r.reading > X ) > SUM(.10) ) AS _flag
FROM myreadings r
GROUP BY r.building, DATE(r.date)
Absent a more precise specification of the result set you want returned, we're just guessing.
FOLLOWUP
Based on the update to the question... to return a row for a single building and a single date, add the WHERE clause as shown in the question. And remove expressions from the SELECT list.
SELECT ( SUM(r.reading > X ) > SUM(.10) ) AS _flag
FROM myreadings r
WHERE r.building = '123'
AND r.date >= '2016-04-08'
AND r.date < '2016-04-08' + INTERVAL 1 DAY
If there are no rows for the given building and given date, the query will return zero rows. If there is at least one row, and the number of rows that have a reading greater than X is more than 10% of the total number of rows, the query will return a single row, with _flag having a value of 1 (TRUE). Otherwise, the query will return a single row with _flag having a value of 0 (FALSE).
If you want the query to return a row even when there are no matching rows in the table, that can be accomplished with a more complex SQL statement.
If you want the query to return string values 'TRUE' or 'FALSE', that can be accomplished as well.
Again, absent an example of the result set you expect (an actual specification we could compare a result against), we're just guessing.

Oracle SQL when querying a range of data

I have a table that for an ID, will have data in several bucket fields. I want a function to pull out a sum of buckets, but the function parameters will include the start and end bucket field.
So, if I had a table like this:
ID Bucket0 Bucket30 Bucket60 Bucket90 Bucket120
10 5.00 12.00 10.00 0.0 8.00
If I send in the ID and the parameters Bucket0, Bucket0, it would return only the value in the Bucket0 field: 5.00
If I send in the ID and the parameters Bucket30, Bucket120, it would return the sum of the buckets from 30 to 120, or (12+10+0+8) 30.00.
Is there a nicer way to write this other than a huge ugly
if parameter1=bucket0 and parameter2=bucket0
then select bucket0
else if parameter1=bucket0 and parameter2=bucket1
then select bucket0 + bucket1
else if parameter1=bucket0 and parameter2=bucket2
then select bucket0 + bucket1 + bucket2
and so on?
The table already exists, so I don't have a lot of control over that. I can make my parameters for the function however I want. I can safely say that if a set of buckets are wanted, none in the middle will be skipped, so specifying start and end buckets would work. I could have a single comma delimited string of all buckets wanted.
It would have been better if your table had been normalised, like this:
id | bucket | value
---+-----------+------
10 | bucket000 | 5
10 | bucket030 | 12
10 | bucket060 | 10
10 | bucket090 | 0
10 | bucket120 | 8
Also, the buckets should have names that are easy to compare in ranges, so that bucket030 comes between bucket000 and bucket120 in normal alphabetical order, which is not the case if you leave out the padded zeroes.
If the above normalisation is not possible, then use an unpivot clause to turn your current table into the structure depicted above:
select id, sum(value)
from (
select *
from mytable
unpivot (value for bucket_id in (bucket0 as 'bucket000',
bucket30 as 'bucket030',
bucket60 as 'bucket060',
bucket90 as 'bucket090',
bucket120 as 'bucket120'))
) normalised
where bucket_id between 'bucket000' and 'bucket060'
group by id
When you do this with parameter variables, make sure those parameters have the padded zeroes as well.
You could for instance ensure that as follows for parameter1:
if parameter1 like 'bucket%' then
parameter1 := 'bucket' || lpad(+substr(parameter1, 7), 3, '0');
end if;
...etc.
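The padding and range comparison can be tried out with the following sketch in Python with in-memory SQLite. SQLite has no UNPIVOT clause, so a UNION ALL subquery is swapped in to produce the same normalised shape; the helper names pad_bucket and bucket_sum are hypothetical, and this is not the Oracle function itself.

```python
import re
import sqlite3


def pad_bucket(name):
    """Turn 'Bucket30' or 'bucket30' into 'bucket030' so names sort by range."""
    match = re.fullmatch(r"bucket(\d+)", name.strip().lower())
    if match is None:
        raise ValueError(f"not a bucket name: {name!r}")
    return "bucket" + match.group(1).zfill(3)


# UNION ALL stand-in for Oracle's UNPIVOT: one row per (id, bucket_id, value).
UNPIVOTED = """
    SELECT id, 'bucket000' AS bucket_id, bucket0 AS value FROM mytable
    UNION ALL SELECT id, 'bucket030', bucket30 FROM mytable
    UNION ALL SELECT id, 'bucket060', bucket60 FROM mytable
    UNION ALL SELECT id, 'bucket090', bucket90 FROM mytable
    UNION ALL SELECT id, 'bucket120', bucket120 FROM mytable
"""


def bucket_sum(conn, row_id, first, last):
    """Sum the buckets from first to last (inclusive) for one id."""
    sql = (
        f"SELECT SUM(value) FROM ({UNPIVOTED}) "
        "WHERE id = ? AND bucket_id BETWEEN ? AND ?"
    )
    params = (row_id, pad_bucket(first), pad_bucket(last))
    return conn.execute(sql, params).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER, bucket0 REAL, bucket30 REAL, "
    "bucket60 REAL, bucket90 REAL, bucket120 REAL)"
)
conn.execute("INSERT INTO mytable VALUES (10, 5.0, 12.0, 10.0, 0.0, 8.0)")

print(bucket_sum(conn, 10, "Bucket0", "Bucket0"))
print(bucket_sum(conn, 10, "Bucket30", "Bucket120"))
```

The two calls correspond to the two examples in the question: a single bucket, and the range from Bucket30 to Bucket120.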

How to make a select that returns 4 totals from same table but with different filters

I'm trying to make a report in SSRS where I show some totals from the same table. I know I can use nested selects, but I've heard that could affect performance and make it slow. That is why I decided to use stored procedures, but I'm not very familiar with them (I have only written some basic SPs), so some help would be appreciated:
This is what I need to get:
|--------------|------------------------- TOTALS AND PERCENTAGES ----------------------|
|COMPANY | PACKAGES | WEIGHT | PACKAGE_DELIVERED |% DELIVERED | ONTIME |% ONTIME |
These are the queries I wrote in a previous version of the report (using ASP):
SELECT COMPANY_NAME, COUNT(ID) AS PACKAGES, SUM(WEIGHT) AS WEIGHT
FROM PACKAGE
WHERE ACTUAL_DELIVERY_DATE BETWEEN 'X' AND 'Y'
GROUP BY COMPANY_CODE, COMPANY_NAME
Then I put the results in arrays and run a new select to get the rest of the information, adding the COMPANY as a filter:
SELECT COMPANY_CODE, ESTIMATED_DELIVERY_DATE, ACTUAL_DELIVERY_DATE
FROM PACKAGE
WHERE ACTUAL_DELIVERY_DATE BETWEEN 'X' AND 'Y'
AND STATUS = 'DELIVERED'
AND COMPANY_CODE = 'DHL'
ORDER BY STATUS
For every row
PACKAGES_DELIVERED = + 1
IF ACTUAL_DELIVERY_DATE < ESTIMATED_DELIVERY_DATE THEN ONTIME = + 1
Next
Then I calculate the percentages and show everything together in a table.
Can somebody help me put all this in a stored procedure, or maybe suggest another idea?
Thanks in advance.
I would add the following columns to the original SELECT, using SUM on a CASE statement:
, SUM ( CASE WHEN STATUS = 'DELIVERED' THEN 1 ELSE 0 END ) AS PACKAGES_DELIVERED
, SUM ( CASE WHEN STATUS = 'DELIVERED' AND ACTUAL_DELIVERY_DATE < ESTIMATED_DELIVERY_DATE THEN 1 ELSE 0 END ) AS ONTIME
This doesn't seem complex enough to bother with a stored procedure.
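Put together, the single-query version might look like the sketch below, run against in-memory SQLite from Python rather than SQL Server. The sample rows are invented, and two things are assumptions rather than anything stated in the question: percentages are computed in Python, and % ONTIME is taken relative to delivered packages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE PACKAGE (
        ID INTEGER PRIMARY KEY, COMPANY_CODE TEXT, COMPANY_NAME TEXT,
        WEIGHT REAL, STATUS TEXT,
        ESTIMATED_DELIVERY_DATE TEXT, ACTUAL_DELIVERY_DATE TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO PACKAGE VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "DHL", "DHL", 2.0, "DELIVERED", "2020-01-05", "2020-01-04"),  # on time
        (2, "DHL", "DHL", 3.0, "DELIVERED", "2020-01-05", "2020-01-06"),  # late
        (3, "DHL", "DHL", 1.0, "RETURNED", "2020-01-05", "2020-01-07"),
        (4, "UPS", "UPS", 4.0, "DELIVERED", "2020-01-10", "2020-01-09"),  # on time
    ],
)

# One pass over the table: plain totals plus two conditional counts.
totals = conn.execute(
    """
    SELECT COMPANY_NAME,
           COUNT(ID) AS PACKAGES,
           SUM(WEIGHT) AS WEIGHT,
           SUM(CASE WHEN STATUS = 'DELIVERED' THEN 1 ELSE 0 END) AS PACKAGES_DELIVERED,
           SUM(CASE WHEN STATUS = 'DELIVERED'
                     AND ACTUAL_DELIVERY_DATE < ESTIMATED_DELIVERY_DATE
                    THEN 1 ELSE 0 END) AS ONTIME
    FROM PACKAGE
    WHERE ACTUAL_DELIVERY_DATE BETWEEN ? AND ?
    GROUP BY COMPANY_CODE, COMPANY_NAME
    ORDER BY COMPANY_CODE
    """,
    ("2020-01-01", "2020-01-31"),
).fetchall()

report = []
for name, packages, weight, delivered, ontime in totals:
    pct_delivered = 100.0 * delivered / packages if packages else 0.0
    # Assumption: on-time percentage is relative to delivered packages.
    pct_ontime = 100.0 * ontime / delivered if delivered else 0.0
    report.append((name, packages, weight, delivered, pct_delivered, ontime, pct_ontime))

for line in report:
    print(line)
```

The percentages could equally be computed in the SQL itself or in SSRS expressions; the point is that one scan of the table yields all four totals.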

Trying to write a query to look at business and address information

I have one input field for a user to type in either a business name, city and state, or zip code. I want to be able to pull back the correct data, but right now I'm getting incorrect results.
For example, if someone searches for "ERICS", it returns fine, but if someone searches for "ERICS ROCHESTER MN", it returns all of the results from ROCHESTER, MN. I want it to return no results in that case.
By the way, I am parsing out the data, so I'm passing up to 4 values into my query: business name, city, state, zip.
How can I modify my query so that it will give me the correct results? Can I somehow check in the query which variables are not null?
Schema
Business
business_id | name | city | state | zip
1 TOMS ROCHESTER MN 55906
2 BILLYs MINNEAPOLIS MN 55555
3 ERICS LAX WI 11111
Rating
rating_id | rating
1 GOOD
2 BAD
business_rating
br_id | business_id | rating_id
1 1 1
2 1 2
select b.business_id,
b.name,
b.city,
b.state,
b.zip,
count(br.business_id) num_ratings,
round(avg(br.quality_id),2) quality_rating,
round(avg(br.friendly_id),2) friendly_rating,
round(avg(br.professional_id),2) professional_rating
from business b
Left Join business_rating br
On br.business_id = b.business_id
Left Join rating r
On r.rating_id = br.quality_id
And r.rating_id = br.friendly_id
where (upper(b.business_name) like '%ERICS%'
and upper(b.city) like '%ROCHESTER%'
and upper(b.state) like '%MN%'
and upper(b.zip) like '')
or (upper(b.city) like '%ROCHESTER%'
and upper(b.state) like '%MN%'
and upper(b.zip) like '')
or (upper(b.city) like '%ROCHESTER%'
and upper(b.state) like '%MN%')
or (upper(b.zip) like '')
group by id
Notice your WHERE clause - you're asking for a four-part match OR a three-part match, etc. That's why a three-part match (city, state, zip) is being returned.
The best way to handle this is to ship it out to a real search engine, e.g. solr.
Since you didn't ask for the best way, here's my answer to your question:
Write 4 queries.
One query for one-part requests, one query for two-part requests, etc. Have your code branch (or overload) based on how many parameters were provided.
This is assuming that, if there were no "ERICS", you would want to return no results, rather than showing some other bar in Rochester.
Oh, and don't apply functions (e.g. UPPER) to your columns - the engine will not be able to use your indexes to satisfy the query. Of course, using unanchored LIKE patterns will also preclude the use of indexes. Really, seriously, try solr.
Or, try this...
This query creates a *max_score* based on the number of fields that you've provided as non-null.
It calculates the number of columns that match the input, which is the score.
Then it only shows those rows whose score is the max.
set @NAME = "erics";
set @CITY = "rochester";
set @STATE = "MN";
set @ZIP = NULL;
select business_id,
case when @NAME is not null then 1 else 0 end +
case when @CITY is not null then 1 else 0 end +
case when @STATE is not null then 1 else 0 end +
case when @ZIP is not null then 1 else 0 end as max_score,
case when @NAME is not null and name like concat("%", @NAME, "%") then 1 else 0 end +
case when @CITY is not null and city like concat("%", @CITY, "%") then 1 else 0 end +
case when @STATE is not null and state like concat("%", @STATE, "%") then 1 else 0 end +
case when @ZIP is not null and zip like concat("%", @ZIP, "%") then 1 else 0 end as score
from business
having score = max_score;
Now, please go install solr.
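For anyone who wants to try the score idea before reaching for a search engine, here is a rough sketch in Python with in-memory SQLite. It is an adaptation, not the MySQL query itself: bound parameters replace the user variables, a subquery with WHERE replaces the HAVING on aliases, and the search function name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE business (business_id INTEGER, name TEXT, city TEXT, state TEXT, zip TEXT)"
)
conn.executemany(
    "INSERT INTO business VALUES (?, ?, ?, ?, ?)",
    [
        (1, "TOMS", "ROCHESTER", "MN", "55906"),
        (2, "BILLYs", "MINNEAPOLIS", "MN", "55555"),
        (3, "ERICS", "LAX", "WI", "11111"),
    ],
)

SEARCH_SQL = """
    SELECT business_id FROM (
        SELECT business_id,
               -- how many search fields were supplied
               CASE WHEN :name IS NOT NULL THEN 1 ELSE 0 END +
               CASE WHEN :city IS NOT NULL THEN 1 ELSE 0 END +
               CASE WHEN :state IS NOT NULL THEN 1 ELSE 0 END +
               CASE WHEN :zip IS NOT NULL THEN 1 ELSE 0 END AS max_score,
               -- how many supplied fields this row matches
               CASE WHEN :name IS NOT NULL AND name LIKE '%' || :name || '%' THEN 1 ELSE 0 END +
               CASE WHEN :city IS NOT NULL AND city LIKE '%' || :city || '%' THEN 1 ELSE 0 END +
               CASE WHEN :state IS NOT NULL AND state LIKE '%' || :state || '%' THEN 1 ELSE 0 END +
               CASE WHEN :zip IS NOT NULL AND zip LIKE '%' || :zip || '%' THEN 1 ELSE 0 END AS score
        FROM business
    )
    WHERE score = max_score
    ORDER BY business_id
"""


def search(conn, name=None, city=None, state=None, zip_code=None):
    """Return ids of businesses matching every field that was supplied."""
    params = {"name": name, "city": city, "state": state, "zip": zip_code}
    return [row[0] for row in conn.execute(SEARCH_SQL, params)]


print(search(conn, name="erics"))
print(search(conn, name="erics", city="rochester", state="MN"))
print(search(conn, city="rochester", state="MN"))
```

SQLite's LIKE is case-insensitive for ASCII by default, which is why the lowercase inputs match the uppercase data here; other engines depend on the collation in use.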