Hi I want sum two columns (type double) with two diffrent tables. My query sql works until i add clauses "where". If every clausule "where" are met then is okej, return correct result. If even one clause return null then result is null. What change my code to single clause return 0 if doesnt exist record.
select (select sum(amount) from change_graphic where month(change_date)=4 and year(change_date)=2019)+(select SUM(provision) from contracts where accepted=0 and month(date)=4 and year(date)=2019);
Use coalesce():
select (select coalesce(sum(amount), 0)
from change_graphic
where month(change_date) = 4 and year(change_date) = 2019) +
(select coalesce(sum(provision), 0)
from contracts
where accepted = 0 and month(date) = 4 and year(date) = 2019
);
The subqueries are guaranteed to return one row, because they are aggregation queries with no GROUP BY. Hence, you can convert NULL generated by the SUM() into 0 for the addition.
I would recommend that you approach the date comparisons as:
select (select coalesce(sum(amount), 0)
from change_graphic
where change_date >= '2019-04-01' and
change_date < '2019-05-01'
) +
(select coalesce(sum(provision), 0)
from contracts
where accepted = 0 and
date >= '2019-04-01' and
date < '2019-05-01'
);
This enables MySQL to use an index on the date column, if an appropriate index is available.
Related
Here is my query
SELECT
SUM(o.order_disc + o.order_disc_vat) AS manualsale
FROM
orders o
WHERE
o.order_flag IN (0 , 2, 3)
AND o.order_status = '1'
AND (o.assign_sale_id IN (SELECT GROUP_CONCAT(CAST(id AS SIGNED)) AS ids FROM users WHERE team_id = 92))
AND DATE(o.payment_on) = DATE(NOW())
above query return null when i run this query in terminal
When i use subquery below it returns data
SELECT GROUP_CONCAT(CAST(id AS SIGNED)) AS ids FROM users WHERE team_id = 92)
above query returns
'106,124,142,179'
and when i run my first query like below
SELECT
SUM(o.order_disc + o.order_disc_vat) AS manualsale
FROM
orders o
WHERE
o.order_flag IN (0 , 2, 3)
AND o.order_status = '1'
AND (o.assign_sale_id IN (106,124,142,179))
AND DATE(o.payment_on) = DATE(NOW())
it return me value.
Why it is not working with subquery please help
This does not do what you want:
AND (o.assign_sale_id IN (SELECT GROUP_CONCAT(CAST(id AS SIGNED)) AS ids FROM users WHERE team_id = 92))
This compares a single value against a comma-separated list of values, so it never matches (unless there is just one row in users for the given team).
You could phrase this as:
AND assign_sale_id IN (SELECT id FROM users WHERE team_id = 92)
But this would probably be more efficently expressed with exists:
AND EXISTS(SELECT 1 FROM users u WHERE u.team_id = 92 AND u.id = o.assign_sale_id)
Side note: I would also recommend rewriting this condition:
AND DATE(o.payment_on) = DATE(NOW())
To the following, which can take advantage of an index:
AND o.payment_on >= current_date AND o.payment_on < current_date + interval 1 day
I have the following query I'm trying to use to spit out each day in a date range and show the # of leads, assignments, & returns:
select
date_format(from_unixtime(date_created), '%m/%d/%Y') as date_format,
(select count(distinct(id_lead)) from lead_history where (date_format(from_unixtime(date_created), '%m/%d/%Y') = date_format) and (id_vertical in (2)) and (id_website in (3,8))) as leads,
(select count(id) from assignments where deleted=0 and (date_format(from_unixtime(date_assigned), '%m/%d/%Y') = date_format) and (id_vertical in (2)) and (id_website in (3,8))) as assignments,
(select count(id) from assignments where deleted=1 and (date_format(from_unixtime(date_deleted), '%m/%d/%Y') = date_format) and (id_vertical in (2)) and (id_website in (3,8))) as returns
from lead_history
where date_created between 1509494400 and 1512086399
group by date_format
The date_created, date_assigned, and date_deleted fields are integers representing timestamps. id, id_lead, id_vertical and id_website are already indexed.
Would adding indexes to date_created, date_assigned, date_deleted, and deleted help make this faster? The issue I'm having is that it is very slow, and I'm not sure an index will help when using date_format(from_unixtime(...
Here is the EXPLAIN:
Looking to your code you could rewrite the query as ..
select
date_format(from_unixtime(date_created), '%m/%d/%Y') as date_format
, count(distinct(h.id_lead) as leads
, sum(case a.deleted = 1 then 1 else 0 end) assignments
, sum(case b.deleted = 0 then 1 else 0 end) returns
from lead_history h
inner join assignments on a a.date_assigned = h.date_created
and a.id_vertical = 2
and id_website in (3,8))
inner join assignments on b b.deleted = h.date_created
and a.id_vertical = 2
and id_website in (3,8))
where date_created between 1509494400 and 1512086399
group by date_format
anyway you shold avoid unuseful () and nested (), avoid unuseful conversion between date and use join instead of subselect .. or at least reduce similar sabuselect using case
PS for what concern the index remember that the use of conversion on a column value invalid the use of related the index ..
For readability, I would like to modify the below statement. Is there a way to extract the CASE statement, so I can use it multiple times without having to write it out every time?
select
mturk_worker.notes,
worker_id,
count(worker_id) answers,
count(episode_has_accepted_imdb_url) scored,
sum( case when isnull(imdb_url) and isnull(accepted_imdb_url) then 1
when imdb_url = accepted_imdb_url then 1
else 0 end ) correct,
100 * ( sum( case when isnull(imdb_url) and isnull(accepted_imdb_url) then 1
when imdb_url = accepted_imdb_url then 1
else 0 end)
/ count(episode_has_accepted_imdb_url) ) percentage
from
mturk_completion
inner join mturk_worker using (worker_id)
where
timestamp > '2015-02-01'
group by
worker_id
order by
percentage desc,
correct desc
You can actually eliminate the case statements. MySQL will interpret boolean expressions as integers in a numeric context (with 1 being true and 0 being false):
select mturk_worker.notes, worker_id, count(worker_id) answers,
count(episode_has_accepted_imdb_url) scored,
sum(imdb_url = accepted_imdb_url or imdb_url is null and accepted_idb_url is null) as correct,
(100 * sum(imdb_url = accepted_imdb_url or imdb_url is null and accepted_idb_url is null) / count(episode_has_accepted_imdb_url)
) as percentage
from mturk_completion inner join
mturk_worker
using (worker_id)
where timestamp > '2015-02-01'
group by worker_id
order by percentage desc, correct desc;
If you like, you can simplify it further by using the null-safe equals operator:
select mturk_worker.notes, worker_id, count(worker_id) answers,
count(episode_has_accepted_imdb_url) scored,
sum(imdb_url <=> accepted_imdb_url) as correct,
(100 * sum(imdb_url <=> accepted_imdb_url) / count(episode_has_accepted_imdb_url)
) as percentage
from mturk_completion inner join
mturk_worker
using (worker_id)
where timestamp > '2015-02-01'
group by worker_id
order by percentage desc, correct desc;
This isn't standard SQL, but it is perfectly fine in MySQL.
Otherwise, you would need to use a subquery, and there is additional overhead in MySQL associated with subqueries.
I have a database including certain strings, such as '{TICKER|IBM}' to which I will refer as ticker-strings. My target is to count the amount of ticker-strings per day for multiple strings.
My database table 'tweets' includes the rows 'tweet_id', 'created at' (dd/mm/yyyy hh/mm/ss) and 'processed text'. The ticker-strings, such as '{TICKER|IBM}', are within the 'processed text' row.
At this moment, I have a working SQL query for counting one ticker-string (thanks to the help of other Stackoverflow-ers). What I would like to have is a SQL query in which I can count multiple strings (next to '{TICKER|IBM}' also '{TICKER|GOOG}' and '{TICKER|BAC}' for instance).
The working SQL query for counting one ticker-string is as follows:
SELECT d.date, IFNULL(t.count, 0) AS tweet_count
FROM all_dates AS d
LEFT JOIN (
SELECT COUNT(DISTINCT tweet_id) AS count, DATE(created_at) AS date
FROM tweets
WHERE processed_text LIKE '%{TICKER|IBM}%'
GROUP BY date) AS t
ON d.date = t.date
The eventual output should thus give a column with the date, a column with {TICKER|IBM}, a column with {TICKER|GOOG} and one with {TICKER|BAC}.
I was wondering whether this is possible and whether you have a solution for this? I have more than 100 different ticker-strings. Of course, doing them one-by-one is an option, but it is a very time-consuming one.
If I understand correctly, you can do this with conditional aggregation:
SELECT d.date, coalesce(IBM, 0) as IBM, coalesce(GOOG, 0) as GOOG, coalesce(BAC, 0) AS BAC
FROM all_dates d LEFT JOIN
(SELECT DATE(created_at) AS date,
COUNT(DISTINCT CASE WHEN processed_text LIKE '%{TICKER|IBM}%' then tweet_id
END) as IBM,
COUNT(DISTINCT CASE WHEN processed_text LIKE '%{TICKER|GOOG}%' then tweet_id
END) as GOOG,
COUNT(DISTINCT CASE WHEN processed_text LIKE '%{TICKER|BAC}%' then tweet_id
END) as BAC
FROM tweets
GROUP BY date
) t
ON d.date = t.date;
I'd return the specified resultset like this, adding expressions to the SELECT list for each "ticker" I want returned as a separate column:
SELECT d.date
, IFNULL(SUM(t.processed_text LIKE '%{TICKER|IBM}%' ),0) AS `cnt_ibm`
, IFNULL(SUM(t.processed_text LIKE '%{TICKER|GOOG}%'),0) AS `cnt_goog`
, IFNULL(SUM(t.processed_text LIKE '%{TICKER|BAC}%' ),0) AS `cnt_goog`
, IFNULL(SUM(t.processed_text LIKE '%{TICKER|...}%' ),0) AS `cnt_...`
FROM all_dates d
LEFT
JOIN tweets t
ON t.created_at >= d.date
AND t.created_at < d.date + INTERVAL 1 DAY
GROUP BY d.date
NOTES: The expressions within the SUM aggregates above are evaluated as booleans, so they return 1 (if true), 0 (if false), or NULL. I'd avoid wrapping the created_at column in a DATE() function, and use a range scan instead, especially if a predicate is added (WHERE clause) that restricts the values ofdatebeing returned fromall_dates`.
As an alternative, expressions like this will return an equivalent result:
, SUM(IF(t.process_text LIKE '%{TICKER|IBM}%' ,1,0)) AS `cnt_ibm`
I have a requirement where I need o group data into equal number ob rows. As mysql doesn't have rownum() I'm simulating this behaviour:
SET #row:=6;
SELECT MAX(agg.timestamp) AS timestamp, MAX(agg.value) AS value, COUNT(agg.value) AS count
FROM
(
SELECT timestamp, value, #row:=#row+1 AS row
FROM data
WHERE channel_id=52 AND timestamp >= 0 ORDER BY timestamp
) AS agg
GROUP BY row div 8
ORDER BY timestamp ASC;
Note: according to Can grouped expressions be used with variable assignments? this query may not be 100% correct, but it does work.
An additional requirement is to calculate the row difference between the grouped sets. I've looked for a solution joining the same table with a subquery:
SET #row:=6;
SELECT MAX(agg.timestamp) AS timestamp, MAX(agg.value) AS value, COUNT(agg.value) AS count
FROM
(
SELECT timestamp, value, #row:=#row+1 AS row
FROM data
WHERE channel_id=52 AND timestamp >= 0 ORDER BY timestamp
) AS agg
LEFT JOIN data AS prev
ON prev.channel_id = agg.channel_id
AND prev.timestamp = (
SELECT MAX(timestamp)
FROM data
WHERE data.channel_id = agg.channel_id
AND data.timestamp < MIN(agg.timestamp)
)
GROUP BY row div 8
ORDER BY timestamp ASC;
Unfortunately that errors:
Error Code: 1054. Unknown column 'agg.channel_id' in 'on clause'
Any idea how this query could be written?
You never selected channel_id from your sbuquery, so it's not returned to the parent query, and is therefore invisible. Try
SELECT MAX(agg.timestamp) AS timestamp, MAX(agg.value) AS value, COUNT(agg.value) AS count
FROM
(
SELECT timestamp, value, #row:=#row+1 AS row, channel_id
^^^^^^^^^^^^-- need this
FROM data
Since MySQL only sees and uses the fields you explicitly return from that subquery, and will NOT "dig deeper" into the table underlying the query, you need to select/return all of the fields you'll be using the parent queries.
How about this version:
SELECT MAX(agg.timestamp) AS timestamp, MAX(agg.value) AS value, COUNT(agg.value) AS count, COALESCE(prev.timestamp, 0) AS prev_timestamp
FROM (SELECT d.*, #row:=#row+1 AS row
FROM data d CROSS JOIN
(select #row := 6) vars
WHERE channel_id = 52 AND timestamp >= 0 ORDER BY timestamp
) agg LEFT JOIN
data prev
ON prev.channel_id = agg.channel_id AND
prev.timestamp = (SELECT MAX(timestamp)
FROM data
WHERE data.channel_id = agg.channel_id AND
data.timestamp < agg.timestamp
)
GROUP BY row div 8
ORDER BY timestamp ASC;
This includes all the columns in the subquery. And it puts the variable initialization in the same query.