For readability, I would like to modify the below statement. Is there a way to extract the CASE statement, so I can use it multiple times without having to write it out every time?
select
mturk_worker.notes,
worker_id,
count(worker_id) answers,
count(episode_has_accepted_imdb_url) scored,
sum( case when isnull(imdb_url) and isnull(accepted_imdb_url) then 1
when imdb_url = accepted_imdb_url then 1
else 0 end ) correct,
100 * ( sum( case when isnull(imdb_url) and isnull(accepted_imdb_url) then 1
when imdb_url = accepted_imdb_url then 1
else 0 end)
/ count(episode_has_accepted_imdb_url) ) percentage
from
mturk_completion
inner join mturk_worker using (worker_id)
where
timestamp > '2015-02-01'
group by
worker_id
order by
percentage desc,
correct desc
You can actually eliminate the case statements. MySQL will interpret boolean expressions as integers in a numeric context (with 1 being true and 0 being false):
select mturk_worker.notes, worker_id, count(worker_id) answers,
count(episode_has_accepted_imdb_url) scored,
sum(imdb_url = accepted_imdb_url or imdb_url is null and accepted_imdb_url is null) as correct,
(100 * sum(imdb_url = accepted_imdb_url or imdb_url is null and accepted_imdb_url is null) / count(episode_has_accepted_imdb_url)
) as percentage
from mturk_completion inner join
mturk_worker
using (worker_id)
where timestamp > '2015-02-01'
group by worker_id
order by percentage desc, correct desc;
If you like, you can simplify it further by using the null-safe equals operator:
select mturk_worker.notes, worker_id, count(worker_id) answers,
count(episode_has_accepted_imdb_url) scored,
sum(imdb_url <=> accepted_imdb_url) as correct,
(100 * sum(imdb_url <=> accepted_imdb_url) / count(episode_has_accepted_imdb_url)
) as percentage
from mturk_completion inner join
mturk_worker
using (worker_id)
where timestamp > '2015-02-01'
group by worker_id
order by percentage desc, correct desc;
This isn't standard SQL, but it is perfectly fine in MySQL.
Otherwise, you would need to use a subquery, and there is additional overhead in MySQL associated with subqueries.
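To check that the CASE form and the boolean forms count the same rows, here is a small sketch using Python's built-in sqlite3 module. SQLite is only a stand-in (an assumption, so the example runs without a MySQL server): it also evaluates comparisons to 1, 0 or NULL, but it spells MySQL's <=> as IS. The table name answers and its four rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("create table answers (imdb_url text, accepted_imdb_url text)")
conn.executemany(
    "insert into answers values (?, ?)",
    [
        ("tt1", "tt1"),  # match
        ("tt1", "tt2"),  # mismatch
        (None, None),    # both missing: counts as correct
        (None, "tt3"),   # only one missing: not correct
    ],
)

row = conn.execute(
    """
    select
        sum(case when imdb_url is null and accepted_imdb_url is null then 1
                 when imdb_url = accepted_imdb_url then 1
                 else 0 end) as with_case,
        sum(imdb_url = accepted_imdb_url
            or imdb_url is null and accepted_imdb_url is null) as with_boolean,
        -- SQLite's IS is null-safe equality, like <=> in MySQL
        sum(imdb_url is accepted_imdb_url) as with_null_safe
    from answers
    """
).fetchone()
print(row)  # (2, 2, 2)
```

In the fourth row the boolean expression evaluates to NULL rather than 0, and SUM() simply skips it, which is why the totals still agree.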
I'm facing the following problem...
Given this data:
table : votes
=========
value
=========
10
25
38
90
92
93
98
100
120
I would like to return a value only if the difference between it and the previously accepted value is bigger than 10% of the previously accepted one:
if abs(int(a)-int(b))*100/int(a) < 10:
return True
So the end list should be (I have added the % difference in parentheses):
==========
result
==========
10 ()
25 (150%)
38 (52%)
90 (136%)
100 (11%)
120 (20%)
The query should also sort those values first.
I'm able to do it with code (as shown above), but I haven't come even close with a direct query.
MySQL v.8.0.19
You don't mention what version of MySQL you are using, so I'll assume it's a modern one (8.x). You can use LAG(). For example:
select
concat(value,
case when prev_value is null then ' ()'
else concat(' (', round(100 * (value - prev_value) / prev_value), '%)')
end
) as result
from (
select
value,
lag(value) over (order by value) as prev_value
from t
) x
where prev_value is null or value > prev_value * 1.1
order by value
In MySQL 8.0, you can do this with lag(). Assuming that you want to sort rows by value, that would be:
select value
from (
select
value,
lag(value, 1, 0) over(order by value) lag_value
from mytable t
) t
where value > lag_value * 1.10
If you want to use a different ordering column, then you can change the order by clause to use the relevant column.
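A quick way to try the lag() query on the sample data is Python's built-in sqlite3 module. This is a sketch under two assumptions: SQLite (3.25 or newer, for window functions) stands in for MySQL 8.0, and the table is called mytable as in the query above.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL 8.0 (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (value integer)")
# Insert the sample values out of order; the window's ORDER BY sorts them.
conn.executemany(
    "insert into mytable values (?)",
    [(v,) for v in (100, 10, 93, 25, 120, 38, 98, 90, 92)],
)

kept = [
    r[0]
    for r in conn.execute(
        """
        select value
        from (
            select value,
                   lag(value, 1, 0) over (order by value) as lag_value
            from mytable
        ) t
        where value > lag_value * 1.10
        order by value
        """
    )
]
print(kept)  # [10, 25, 38, 90, 120]
```

Each value is compared with its immediate predecessor in sort order, so 100 is dropped here: it is measured against 98, not against 90.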
In earlier versions, one option is a correlated subquery:
select value
from mytable t
where value > 1.10 * coalesce(
(
select t1.value
from mytable t1
where t1.value < t.value
order by t1.value desc
limit 1
),
0
)
To use another ordering column here, you need to change the where clause and the order by clause of the subquery.
On the other hand, if you want to select the next row according to the ratio against the previously selected row, then that's a different question. You need some kind of iterative process: in SQL, one approach is a recursive query:
with recursive
data as (
select value, row_number() over(order by value) rn
from mytable t
),
cte as (
select 1 is_valid, value, rn from data where rn = 1
union all
select
(d.value > 1.1 * c.value),
case when d.value > 1.1 * c.value then d.value else c.value end,
d.rn
from cte c
inner join data d on d.rn = c.rn + 1
)
select value from cte where is_valid order by value
The query enumerates the values, then walks the dataset sequentially while keeping track of the last selected value, and setting flags on records that should appear in the final resultset.
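The same walk can be sketched in plain Python, which may help when checking the recursive query's output. The function name and the ratio parameter are made up for this illustration.

```python
def filter_by_last_accepted(values, ratio=1.1):
    """Keep a value only if it exceeds the last *kept* value times `ratio`."""
    kept = []
    for value in sorted(values):
        # The first value is always kept; later ones are compared with the
        # most recently kept value, not with their immediate predecessor.
        if not kept or value > ratio * kept[-1]:
            kept.append(value)
    return kept


votes = [10, 25, 38, 90, 92, 93, 98, 100, 120]
result = filter_by_last_accepted(votes)
print(result)  # [10, 25, 38, 90, 100, 120]
```

Here 100 survives because it is compared with 90, the last value that was kept.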
I have a "hospital_payment_data" table.
I want to get the number of rows, the sum of cash_amount_received, and the sum of total_medical_bills from it, and also the sum of amount from the cash_receipt_row table. What should I do?
hospital_payment_data
cash_receipt_row
The result I want
However, running the following query gives the wrong result:
SELECT
COUNT(*) as total,
SUM(cash_amount_received) AS sum_cash_amount_received,
COUNT(
IF(total_medical_bills >= 100000 AND
cash_amount_received , total_medical_bills, NULL)
) as obligatory_issue,
SUM(
IF(total_medical_bills >= 100000 AND
cash_amount_received , cash_amount_received, NULL)
) as sum_obligatory_issue,
SUM(amount) AS sum_amount
FROM (
SELECT total_medical_bills, cash_amount_received, amount
FROM hospital_payment_data, cash_receipt_row
) AS a
wrong result
Try this.
SELECT
COUNT(*) as total,
SUM(cash_amount_received) AS sum_cash_amount_received,
COUNT(
IF(total_medical_bills >= 100000 AND
cash_amount_received , total_medical_bills, NULL)
) as obligatory_issue,
SUM(
IF(total_medical_bills >= 100000 AND
cash_amount_received , cash_amount_received, NULL)
) as sum_obligatory_issue,
SUM(amount) AS sum_amount
FROM (
SELECT total_medical_bills, cash_amount_received, amount
FROM hospital_payment_data, cash_receipt_row
WHERE hospital_payment_data.id = cash_receipt_row.id
) AS a
Never use commas in the FROM clause. Always use proper, explicit, standard, readable JOIN syntax.
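To see what the comma does without a join condition, here is a small sketch using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; the behaviour shown is standard SQL). The rows are invented, and the shared id column is taken from the WHERE clause in the answer above.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table hospital_payment_data (id integer, cash_amount_received integer);
    create table cash_receipt_row (id integer, amount integer);
    insert into hospital_payment_data values (1, 100), (2, 200), (3, 300);
    insert into cash_receipt_row values (1, 10), (2, 20), (3, 30);
    """
)

# Comma with no condition: every row is paired with every other row.
comma = conn.execute(
    """
    select count(*), sum(cash_amount_received), sum(amount)
    from hospital_payment_data, cash_receipt_row
    """
).fetchone()

# Explicit join on id: each payment is paired only with its own receipt.
joined = conn.execute(
    """
    select count(*), sum(cash_amount_received), sum(amount)
    from hospital_payment_data hpd
    join cash_receipt_row crr on hpd.id = crr.id
    """
).fetchone()

print(comma)   # (9, 1800, 180)
print(joined)  # (3, 600, 60)
```

With three rows in each table, the comma version produces 3 x 3 = 9 rows, so every sum is tripled.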
You can also simplify your counting logic in MySQL. There is no need for IF() or a subquery:
SELECT COUNT(*) as total,
SUM(cash_amount_received) AS sum_cash_amount_received,
SUM( total_medical_bills >= 100000 AND
cash_amount_received <> 0
) as obligatory_issue,
SUM(CASE WHEN total_medical_bills >= 100000
THEN cash_amount_received
END) as sum_obligatory_issue,
SUM(amount) AS sum_amount
FROM hospital_payment_data hpd JOIN
cash_receipt_row crr
ON hpd.id = crr.id;
You'll notice that where conditional logic is needed, this uses the standard SQL construct, CASE, rather than IF.
Hi, I want to sum two columns (type double) from two different tables. My SQL query works until I add the "where" clauses. If every "where" clause matches some rows, everything is fine and the correct result is returned. If even one subquery returns null, the whole result is null. How should I change my code so that a single subquery returns 0 when no matching record exists?
select (select sum(amount) from change_graphic where month(change_date)=4 and year(change_date)=2019)+(select SUM(provision) from contracts where accepted=0 and month(date)=4 and year(date)=2019);
Use coalesce():
select (select coalesce(sum(amount), 0)
from change_graphic
where month(change_date) = 4 and year(change_date) = 2019) +
(select coalesce(sum(provision), 0)
from contracts
where accepted = 0 and month(date) = 4 and year(date) = 2019
);
The subqueries are guaranteed to return one row, because they are aggregation queries with no GROUP BY. Hence, you can convert NULL generated by the SUM() into 0 for the addition.
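The effect is easy to reproduce with Python's built-in sqlite3 module. SQLite is used here only as a stand-in for MySQL (an assumption); SUM() over zero rows is NULL in both. The single sample row is invented, and contracts is deliberately left empty.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table change_graphic (amount real, change_date text);
    create table contracts (provision real, accepted integer, date text);
    insert into change_graphic values (50.0, '2019-04-10');
    -- contracts stays empty, so its subquery has nothing to sum
    """
)

# Without coalesce(): 50.0 + NULL is NULL.
plain = conn.execute(
    """
    select (select sum(amount) from change_graphic)
         + (select sum(provision) from contracts where accepted = 0)
    """
).fetchone()[0]

# With coalesce(): the empty sum becomes 0 before the addition.
fixed = conn.execute(
    """
    select (select coalesce(sum(amount), 0) from change_graphic)
         + (select coalesce(sum(provision), 0) from contracts where accepted = 0)
    """
).fetchone()[0]

print(plain, fixed)  # None 50.0
```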
I would recommend that you approach the date comparisons as:
select (select coalesce(sum(amount), 0)
from change_graphic
where change_date >= '2019-04-01' and
change_date < '2019-05-01'
) +
(select coalesce(sum(provision), 0)
from contracts
where accepted = 0 and
date >= '2019-04-01' and
date < '2019-05-01'
);
This enables MySQL to use an index on the date column, if an appropriate index is available.
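If the month is chosen at run time, the two boundary dates can be computed in application code and passed to the query as parameters. A minimal sketch in Python; the helper name month_bounds is made up for this example.

```python
from datetime import date


def month_bounds(year, month):
    """Return (first day of the month, first day of the next month) as ISO strings."""
    start = date(year, month, 1)
    # After December, roll over to January of the following year.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start.isoformat(), end.isoformat()


start, end = month_bounds(2019, 4)
print(start, end)  # 2019-04-01 2019-05-01
```

The upper bound is exclusive (change_date < end), so the range works the same way for date and datetime columns.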
How do you apply a Where or Having clause to a query? I am having problems with the Having clause.
DECLARE @dtDate DATE
SET @dtDate = GETDATE();
with EMS as
(
select * from ReportingView.WTA where FiscalMonth = DATENAME(MONTH, @dtDate) + ', ' + DATENAME(YEAR, @dtDate) and ProductGroup = 'AAD'
)
select
[ID]
,(CASE
WHEN Entitlements <= 0 THEN '0'
ELSE CAST([Activations] as float) / [Entitlements]
END) as Utilization
from EMS
HAVING Utilization >= .25
The HAVING keyword is only used if you are using a GROUP BY too. What you want is a WHERE but you will not be able to reference Utilization unless you wrap it in a sub select.
Both a where and a having clause go at the end of your query. If you have both, then the where comes before the having.
In your case, your having is not working, because having is only to be used with group by. having is essentially a where clause for aggregate values (such as sum, count, etc.).
Examples:
WHERE
SELECT
*
FROM
EMS
WHERE
Utilization >= 0.25
HAVING
SELECT
col1, count(*)
FROM
EMS
GROUP BY
col1
HAVING
count(*) > 10
HAVING and WHERE
SELECT
col1, count(*)
FROM
EMS
WHERE
Utilization >= 0.25
GROUP BY
col1
HAVING
count(*) > 10
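The order of evaluation can be checked on a toy table with Python's built-in sqlite3 module. SQLite stands in for the real database here (an assumption), the rows are invented, and the HAVING threshold is lowered to 1 so the small sample shows a difference.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table EMS (col1 text, Utilization real);
    insert into EMS values
        ('a', 0.10), ('a', 0.30), ('a', 0.50),
        ('b', 0.40),
        ('c', 0.05);
    """
)

# HAVING only: groups are built from all rows, then filtered by their count.
having_only = conn.execute(
    """
    select col1, count(*)
    from EMS
    group by col1
    having count(*) > 1
    order by col1
    """
).fetchall()

# WHERE and HAVING: rows are filtered first, then grouped, then groups filtered.
where_and_having = conn.execute(
    """
    select col1, count(*)
    from EMS
    where Utilization >= 0.25
    group by col1
    having count(*) > 1
    order by col1
    """
).fetchall()

print(having_only)       # [('a', 3)]
print(where_and_having)  # [('a', 2)]
```

The count for group a drops from 3 to 2 because WHERE removed the 0.10 row before the groups were formed.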
Edit: This modified query should work for you. I'm not sure why your original query was using a CTE, but I've moved the case logic to the CTE.
with EMS as
(
select
[ID],
(
CASE
WHEN Entitlements <= 0 THEN '0'
ELSE CAST([Activations] as float) / [Entitlements]
END
) as Utilization
from
ReportingView.WTA
where
FiscalMonth = DATENAME(MONTH, @dtDate) + ', ' + DATENAME(YEAR, @dtDate)
and ProductGroup = 'AAD'
)
select
*
from
EMS
where
Utilization >= .25
I have the following query I'm trying to use to spit out each day in a date range and show the # of leads, assignments, & returns:
select
date_format(from_unixtime(date_created), '%m/%d/%Y') as date_format,
(select count(distinct(id_lead)) from lead_history where (date_format(from_unixtime(date_created), '%m/%d/%Y') = date_format) and (id_vertical in (2)) and (id_website in (3,8))) as leads,
(select count(id) from assignments where deleted=0 and (date_format(from_unixtime(date_assigned), '%m/%d/%Y') = date_format) and (id_vertical in (2)) and (id_website in (3,8))) as assignments,
(select count(id) from assignments where deleted=1 and (date_format(from_unixtime(date_deleted), '%m/%d/%Y') = date_format) and (id_vertical in (2)) and (id_website in (3,8))) as returns
from lead_history
where date_created between 1509494400 and 1512086399
group by date_format
The date_created, date_assigned, and date_deleted fields are integers representing timestamps. id, id_lead, id_vertical and id_website are already indexed.
Would adding indexes to date_created, date_assigned, date_deleted, and deleted help make this faster? The issue I'm having is that it is very slow, and I'm not sure an index will help when using date_format(from_unixtime(...
Here is the EXPLAIN:
Looking at your code, you could rewrite the query as:
select
date_format(from_unixtime(h.date_created), '%m/%d/%Y') as date_format
, count(distinct h.id_lead) as leads
, sum(case when a.deleted = 0 then 1 else 0 end) as assignments
, sum(case when b.deleted = 1 then 1 else 0 end) as returns
from lead_history h
inner join assignments a on a.date_assigned = h.date_created
and a.id_vertical = 2
and a.id_website in (3,8)
inner join assignments b on b.date_deleted = h.date_created
and b.id_vertical = 2
and b.id_website in (3,8)
where h.date_created between 1509494400 and 1512086399
group by date_format
In any case, you should avoid unnecessary and nested parentheses, avoid needless conversions between integers and dates, and use joins instead of subselects, or at least collapse similar subselects using CASE.
PS: Regarding the index, remember that applying a conversion function to a column value prevents the use of an index on that column.
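One way to keep the indexed integer columns free of conversions is to compute the epoch boundaries in application code and compare the raw column against them. A minimal sketch in Python, assuming the timestamps are stored in UTC; the helper name is made up for this example.

```python
from datetime import datetime, timezone


def epoch_range_for_month(year, month):
    """Return (first second, last second) of a month as Unix timestamps in UTC."""
    start = datetime(year, month, 1, tzinfo=timezone.utc)
    # After December, roll over to January of the following year.
    if month == 12:
        next_start = datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    else:
        next_start = datetime(year, month + 1, 1, tzinfo=timezone.utc)
    return int(start.timestamp()), int(next_start.timestamp()) - 1


first, last = epoch_range_for_month(2017, 11)
print(first, last)  # 1509494400 1512086399
```

These are the same two numbers used in the question's between clause, so a filter such as date_created between first and last compares the raw integer column and can use an index on it.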