MySQL: Adding fields which are aliases and have operations

Good day! I want to add the fields "duration" and "dials".
Here is my MySQL query:
select
u.dialer_display_name,
u.dialer_ext,
sum(dl.duration/60) as duration,
count(dl.dial_id) as dials
from leads.dial_log as dl
left join leads.users as u on u.user_id=dl.user_id
where date(dl.dial_date) = date(now())
and u.dialer_display_name != ''
group by dl.user_id order by dials DESC
So what I want to achieve is to add duration and dials and alias them as mets.
Something like:
select sum(duration+dials) as mets
How do I build the query? Thanks!

Just add the two expressions that make up the respective columns:
SELECT u.dialer_display_name,
u.dialer_ext,
SUM(dl.duration/60) AS duration,
COUNT(dl.dial_id) AS dials,
SUM(dl.duration/60) + COUNT(dl.dial_id) AS mets -- here is your column
FROM leads.dial_log AS dl
LEFT JOIN leads.users AS u
ON u.user_id = dl.user_id
WHERE DATE(dl.dial_date) = DATE(NOW()) AND
u.dialer_display_name != ''
GROUP BY dl.user_id
ORDER BY dials DESC

You need to repeat the expressions, because you cannot reuse column aliases within the same select list:
select u.dialer_display_name, u.dialer_ext,
sum(dl.duration/60) as duration,
count(dl.dial_id) as dials,
(sum(dl.duration/60) + count(dl.dial_id) ) as mets
from leads.dial_log dl join
leads.users u
on u.user_id = dl.user_id
where dl.dial_date >= curdate() and
dl.dial_date < date_add(curdate(), interval 1 day) and
u.dialer_display_name <> ''
group by u.dialer_display_name, u.dialer_ext
order by dials DESC;
Notes:
The comparison in the where using date() has been replaced with two comparisons. This allows the use of an index (if available).
date(now()) has been replaced with curdate().
The group by has been modified to match the columns in the select.
The left join has been replaced with an inner join. The where clause undoes the left join anyway, so the query should not be misleading about it.
The new column has been added.
I will also say that count(dl.dial_id) seems strange. I suspect you want one of the following:
count(distinct dl.dial_id)
count(*)
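If repeating the aggregate expressions feels error-prone, one alternative is to compute the aliases in a derived table and add them up in an outer query. The following is only a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the sample rows are invented, and the alias duration_min is a hypothetical name chosen so that it does not collide with the duration column. The table and column names otherwise follow the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption made so
# the sketch is runnable; the sample rows are invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER, dialer_display_name TEXT, dialer_ext TEXT);
    CREATE TABLE dial_log (dial_id INTEGER, user_id INTEGER, duration INTEGER);
    INSERT INTO users VALUES (1, 'Ann', '101'), (2, 'Bob', '102');
    INSERT INTO dial_log VALUES (1, 1, 120), (2, 1, 60), (3, 2, 300);
""")

# A column alias cannot be referenced by another expression in the same
# select list, so this statement is rejected.
try:
    conn.execute("""
        SELECT SUM(duration / 60.0) AS duration_min,
               COUNT(*) AS dials,
               duration_min + dials AS mets
        FROM dial_log
    """)
    alias_reuse_failed = False
except sqlite3.OperationalError:
    alias_reuse_failed = True

# Computing the aliases in a derived table makes them ordinary columns
# of the outer query, where they can be added.
# (60.0 forces non-integer division in SQLite; MySQL's "/" already does that.)
rows = conn.execute("""
    SELECT x.dialer_display_name, x.duration_min, x.dials,
           x.duration_min + x.dials AS mets
    FROM (SELECT u.dialer_display_name,
                 SUM(dl.duration / 60.0) AS duration_min,
                 COUNT(*) AS dials
          FROM dial_log dl
          JOIN users u ON u.user_id = dl.user_id
          GROUP BY u.dialer_display_name) x
    ORDER BY x.dials DESC
""").fetchall()
print(alias_reuse_failed, rows)
```

Whether the derived table is preferable to simply repeating the two expressions is a matter of taste; for a query this small, repeating them as in the answers above is arguably easier to read.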

Related

Passing argument in LEFT JOIN

I am currently trying to get data from 2 tables with a LEFT JOIN having an unknown value.
I tried using LEFT JOIN but it didn't work.
Here is my code example :
SELECT
cc.shid,
cc.user,
ts.type,
sum(cc.qty1) + sum(cc.qty2) as qty_tot,
COUNT(cc.id) as nb
FROM
content_c cc
LEFT JOIN
(SELECT
s.shid,
s.type
FROM
tab_s s
LIMIT 1
) as ts ON ts.shid = cc.shid
WHERE
cc.time_i like '2019-01%'
GROUP BY
cc.user,
ts.type
That query will never work: ts will contain the first occurrence of tab_s regardless of cc.shid. I wonder if there is a way to do this:
LEFT JOIN
(SELECT
s.shid,
s.type
FROM
tab_s s
WHERE
s.shid = cc.shid
LIMIT 1
) as ts ON ts.shid = cc.shid
Any idea? Is there something like a pointer notion in SQL, so that I could use &cc.shid or #cc.shid?
Note that doing the following:
LEFT JOIN tab_s ts ON ts.shid = cc.shid
makes my request take more than 1 minute to display results. And I cannot set an index on tab_s.shid or on cc.shid, as they have multiple occurrences.
Please keep in mind that content_c can have multiple occurrences of cc.shid; that is why I need to take only the first result (LIMIT 1). It's important.
Use a correlated subquery:
SELECT cc.shid, cc.user, cc.type,
SUM(cc.qty1) + SUM(cc.qty2) as qty_tot,
COUNT(cc.id) as nb
FROM (SELECT cc.*,
(SELECT s.type
FROM tab_s s
WHERE s.shid = cc.shid
LIMIT 1
) as type
FROM content_c cc
) cc
WHERE cc.time_i >= '2019-01-01' AND
cc.time_i < '2019-02-01'
GROUP BY cc.shid, cc.user, cc.type;
Notes:
The use of LIMIT with no ORDER BY is suspicious. Why would there be duplicates in the underlying table?
Your date comparisons are bad. Use date/time functions when working with date/time values. Don't use string functions.
The GROUP BY should include all non-aggregated columns in the SELECT.
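To see why the correlated subquery helps when tab_s holds several rows per shid, here is a small sketch. It is not the answerer's code: it runs on Python's built-in sqlite3 module as a stand-in for MySQL, the sample rows are invented, and an ORDER BY is added inside the subquery (an assumption) so that LIMIT 1 picks a deterministic row.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE content_c (id INTEGER, shid INTEGER, user TEXT, qty1 INTEGER, qty2 INTEGER);
    CREATE TABLE tab_s (shid INTEGER, type TEXT);
    INSERT INTO content_c VALUES (1, 10, 'amy', 1, 2), (2, 10, 'amy', 3, 4), (3, 20, 'raj', 5, 6);
    -- shid 10 appears twice in tab_s, so a plain join doubles its rows.
    INSERT INTO tab_s VALUES (10, 'A'), (10, 'A'), (20, 'B');
""")

# Correlated scalar subquery: evaluated once per content_c row, it returns
# at most one type for that row's shid, so no rows are multiplied.
correlated = conn.execute("""
    SELECT cc.shid, cc.user, cc.type,
           SUM(cc.qty1) + SUM(cc.qty2) AS qty_tot,
           COUNT(cc.id) AS nb
    FROM (SELECT c.*,
                 (SELECT s.type
                  FROM tab_s s
                  WHERE s.shid = c.shid   -- refers to the outer row
                  ORDER BY s.type         -- assumption: makes LIMIT 1 deterministic
                  LIMIT 1) AS type
          FROM content_c c) cc
    GROUP BY cc.shid, cc.user, cc.type
    ORDER BY cc.shid
""").fetchall()

# Plain LEFT JOIN for comparison: the duplicate tab_s row inflates the totals.
plain = conn.execute("""
    SELECT cc.shid, cc.user, ts.type,
           SUM(cc.qty1) + SUM(cc.qty2) AS qty_tot,
           COUNT(cc.id) AS nb
    FROM content_c cc
    LEFT JOIN tab_s ts ON ts.shid = cc.shid
    GROUP BY cc.shid, cc.user, ts.type
    ORDER BY cc.shid
""").fetchall()
print(correlated)
print(plain)
```

With this sample data the correlated version reports 2 rows and a total of 10 for shid 10, while the plain join reports 4 rows and a total of 20.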
As discussed in the question comments, can you please try this script and see if it meets your requirements? It joins each content_c row to the distinct (shid, type) pairs of tab_s before the GROUP BY is applied.
SELECT
cc.shid,
cc.user,
ts.type,
sum(cc.qty1) + sum(cc.qty2) as qty_tot,
COUNT(cc.id) as nb
FROM content_c cc
LEFT JOIN
(
SELECT DISTINCT s.shid, s.type FROM tab_s s
) AS ts ON ts.shid = cc.shid
WHERE cc.time_i like '2019-01%'
GROUP BY cc.shid,cc.user,ts.type

Getting all value from every month, put zero if no data of that month

I'm trying to get data for each month; if no data is found for a particular month, it should show zero. I already created a calendar table so I can left join it, but I still can't get zero.
Here's my query:
SELECT calendar.month, IFNULL(SUM(transaction_payment.total),0) AS total
FROM `transaction`
JOIN `transaction_payment` ON `transaction_payment`.`trans_id` =
`transaction`.`trans_id`
LEFT JOIN `calendar` ON MONTH(transaction.date_created) = calendar.month
WHERE `date_created` LIKE '2017%' ESCAPE '!'
GROUP BY calendar.month
ORDER BY `date_created` ASC
The values in my calendar table are the integers 1-12 (Jan-Dec).
Result should be something like this
month total
1 0
2 20
3 0
4 2
..
11 0
12 10
UPDATE
The problem seems to be the SUM function
SELECT c.month, COALESCE(t.trans_id, 0) AS total
FROM calendar c
LEFT JOIN transaction t ON month(t.date_created) = c.month AND year(t.date_created) = '2018'
LEFT JOIN transaction_payment tp ON tp.trans_id = t.trans_id
ORDER BY c.month ASC
I tried displaying the ID only and it runs well, but when I add back this function, I only get months with values:
COALESCE(SUM(tp.total), 0);
This fixes the issues with your query:
SELECT c.month, COALESCE(SUM(tp.total), 0) AS total
FROM calendar c LEFT JOIN
transaction t
ON month(t.date_created) = c.month AND
year(t.date_created) = '2017' LEFT JOIN
transaction_payment tp
ON tp.trans_id = t.trans_id
GROUP BY c.month
ORDER BY c.month ASC;
This will only work if the "calendar" table has one row per month -- that seems odd, but that might be your data structure.
Note the changes:
Start with the calendar table, because those are the rows you want to keep.
Do not use LIKE with dates. MySQL has proper date functions. Use them.
The filtering conditions on all but the first table should be in the ON clause rather than the WHERE clause.
I prefer COALESCE() to IFNULL() because COALESCE() is ANSI standard.
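The difference between filtering in the ON clause and in the WHERE clause can be checked on a tiny data set. This is a sketch only, with assumptions: Python's built-in sqlite3 module stands in for MySQL, strftime() stands in for MONTH() and YEAR(), the calendar is cut down to four months, and the rows are invented.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE calendar (month INTEGER);
    CREATE TABLE "transaction" (trans_id INTEGER, date_created TEXT);
    CREATE TABLE transaction_payment (trans_id INTEGER, total INTEGER);
    INSERT INTO calendar VALUES (1), (2), (3), (4);
    INSERT INTO "transaction" VALUES (1, '2017-02-10'), (2, '2017-04-05'), (3, '2016-03-01');
    INSERT INTO transaction_payment VALUES (1, 20), (2, 2), (3, 99);
""")

# Year filter in the ON clause: calendar rows without a match survive
# with NULLs, and COALESCE turns the NULL sum into 0.
filter_in_on = conn.execute("""
    SELECT c.month, COALESCE(SUM(tp.total), 0) AS total
    FROM calendar c
    LEFT JOIN "transaction" t
           ON CAST(strftime('%m', t.date_created) AS INTEGER) = c.month
          AND strftime('%Y', t.date_created) = '2017'
    LEFT JOIN transaction_payment tp ON tp.trans_id = t.trans_id
    GROUP BY c.month
    ORDER BY c.month
""").fetchall()

# Same filter in the WHERE clause: rows where t is NULL fail the
# comparison and are dropped, so the empty months disappear.
filter_in_where = conn.execute("""
    SELECT c.month, COALESCE(SUM(tp.total), 0) AS total
    FROM calendar c
    LEFT JOIN "transaction" t
           ON CAST(strftime('%m', t.date_created) AS INTEGER) = c.month
    LEFT JOIN transaction_payment tp ON tp.trans_id = t.trans_id
    WHERE strftime('%Y', t.date_created) = '2017'
    GROUP BY c.month
    ORDER BY c.month
""").fetchall()
print(filter_in_on)
print(filter_in_where)
```

Only the first query keeps months 1 and 3 with a total of 0, which is the shape of the result the question asks for.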
You need to use a RIGHT JOIN in your query, because your calendar table is on the right side of the join:
SELECT calendar.month, IFNULL(SUM(transaction_payment.total),0) AS total
FROM `transaction`
JOIN `transaction_payment` ON `transaction_payment`.`trans_id` =
`transaction`.`trans_id`
RIGHT JOIN `calendar` ON MONTH(transaction.date_created) = calendar.month
WHERE `date_created` LIKE '2017%' ESCAPE '!'
GROUP BY calendar.month
ORDER BY `date_created` ASC

Speed up MySql query time with multiple conditional joins

There are 3 tables: persontbl1, persontbl2 (7500 rows each) and schedule (~3000 active schedules, i.e. schedule.status = 0). The person tables contain data for the same persons in a one-to-one relationship, and an INNER JOIN between the two takes less than a second. The schedule table contains data about persons to be interviewed, and not all persons have schedules in it. With the LEFT JOIN added, the query takes around 45 seconds, which is causing all sorts of issues.
SELECT persontbl1._CREATION_DATE, persontbl2._TOP_LEVEL_AURI,
persontbl2.RESP_CNIC, persontbl2.RESP_CNIC_NAME,
persontbl1.MOB_NUMBER1, persontbl1.MOB_NUMBER2,
schedule.id, schedule.call_datetime, schedule.enum_id,
schedule.enum_change, schedule.status
FROM persontbl1
INNER JOIN persontbl2 ON (persontbl2._TOP_LEVEL_AURI = persontbl1._URI)
AND (AGR_CONTACT=1)
LEFT JOIN SCHEDULE ON (schedule.survey_id = persontbl1._URI)
AND (SCHEDULE.status=0)
AND (DATE(SCHEDULE.call_datetime) <= CURDATE())
ORDER BY schedule.call_datetime IS NULL DESC, persontbl1._CREATION_DATE ASC
[Not reproduced here: the EXPLAIN output for the query, the schedule table structure, and the schedule table indexes.]
Please let me know if any further information is required.
Thanks.
Edit: Added fully qualified table names and their columns.
You should just replace this line:
AND (DATE(SCHEDULE.call_datetime) <= CURDATE())
with this one:
AND SCHEDULE.call_datetime <= '2015-04-18 00:00:00'
so that MySQL does not call two functions for every record but compares against the constant '2015-04-18 00:00:00'.
So you can check whether performance improves with this query:
SELECT persontbl1._CREATION_DATE, persontbl2._TOP_LEVEL_AURI,
persontbl2.RESP_CNIC, persontbl2.RESP_CNIC_NAME,
persontbl1.MOB_NUMBER1, persontbl1.MOB_NUMBER2,
schedule.id, schedule.call_datetime, schedule.enum_id,
schedule.enum_change, schedule.status
FROM persontbl1
INNER JOIN persontbl2 ON (persontbl2._TOP_LEVEL_AURI = persontbl1._URI)
AND (AGR_CONTACT=1)
LEFT JOIN SCHEDULE ON (schedule.survey_id = persontbl1._URI)
AND (SCHEDULE.status=0)
AND (SCHEDULE.call_datetime <= '2015-02-01 00:00:00')
ORDER BY schedule.call_datetime IS NULL DESC, persontbl1._CREATION_DATE ASC
EDIT 1: You said the query was fast enough without the LEFT JOIN part, so you can then try:
SELECT persontbl1._CREATION_DATE, persontbl2._TOP_LEVEL_AURI,
persontbl2.RESP_CNIC, persontbl2.RESP_CNIC_NAME,
persontbl1.MOB_NUMBER1, persontbl1.MOB_NUMBER2,
s.id, s.call_datetime, s.enum_id,
s.enum_change, s.status
FROM persontbl1
INNER JOIN persontbl2 ON (persontbl2._TOP_LEVEL_AURI = persontbl1._URI)
AND (AGR_CONTACT=1)
LEFT JOIN
(SELECT *
FROM SCHEDULE
WHERE status=0
AND call_datetime <= '2015-02-01 00:00:00'
) s
ON s.survey_id = persontbl1._URI
ORDER BY s.call_datetime IS NULL DESC, persontbl1._CREATION_DATE ASC
I'm guessing that AGR_CONTACT comes from p1. This is the query you want to optimize:
SELECT p1._CREATION_DATE, _TOP_LEVEL_AURI, RESP_CNIC, RESP_CNIC_NAME,
MOB_NUMBER1, MOB_NUMBER2,
s.id, s.call_datetime, s.enum_id, s.enum_change, s.status
FROM persontbl1 p1 INNER JOIN
persontbl2 p2
ON (p2._TOP_LEVEL_AURI = p1._URI) AND (p1.AGR_CONTACT = 1) LEFT JOIN
SCHEDULE s
ON (s.survey_id = p1._URI) AND
(s.status = 0) AND
(DATE(s.call_datetime) <= CURDATE())
ORDER BY s.call_datetime IS NULL DESC, p1._CREATION_DATE ASC;
The best indexes for this query are: persontbl1(AGR_CONTACT, _URI), persontbl2(_TOP_LEVEL_AURI), and schedule(survey_id, status, call_datetime).
The use of date() around the date time is not recommended. In general, that precludes the use of indexes. However, in this case, you have a left join, so it doesn't make a difference. That column is not being used for filtering anyway. The index on schedule is only for covering the on clause.
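When DATE() is removed from around the column, the replacement condition should still select the same rows. A half-open range ending at the start of the next day does that, whereas comparing against midnight of the current day does not. The sketch below checks this on invented rows. It assumes Python's built-in sqlite3 module as a stand-in for MySQL and a fixed "current date" so that the result is reproducible; in MySQL the half-open form would be written call_datetime < CURDATE() + INTERVAL 1 DAY.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE schedule (id INTEGER, call_datetime TEXT);
    INSERT INTO schedule VALUES
        (1, '2015-04-17 09:30:00'),
        (2, '2015-04-18 00:00:00'),
        (3, '2015-04-18 16:45:00'),
        (4, '2015-04-19 08:00:00');
""")
today = "2015-04-18"  # fixed stand-in for CURDATE()

# Original form: a function is applied to the column on every row.
wrapped = conn.execute(
    "SELECT id FROM schedule WHERE date(call_datetime) <= ? ORDER BY id",
    (today,),
).fetchall()

# Equivalent form on the bare column: strictly before the start of tomorrow.
half_open = conn.execute(
    "SELECT id FROM schedule WHERE call_datetime < date(?, '+1 day') ORDER BY id",
    (today,),
).fetchall()

# Comparing against midnight of the current day is NOT equivalent:
# it drops calls scheduled later on the same day.
midnight = conn.execute(
    "SELECT id FROM schedule WHERE call_datetime <= ? ORDER BY id",
    (today + " 00:00:00",),
).fetchall()
print(wrapped, half_open, midnight)
```

Row 3 (16:45 on the current day) is returned by the first two queries but not by the third.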

MySQL Syntax Issue combining two working queries

I'm just starting to learn SQL, and managed to cobble together a couple of working queries, but then when I combine them I am getting a syntax error. The query throwing the error:
SELECT sca_ticket_status.name As Status, AVG(QueueTime)
FROM (SELECT DateDiff (created, now()) as 'QueueTime'
FROM sca_ticket as SubQuery
LEFT JOIN sca_ticket_status
ON sca_ticket.status_id = sca_ticket_status.id
GROUP BY name
ORDER BY sort
For reference, the two working queries that I am attempting to leverage are as follows:
SELECT sca_ticket_status.name As Status, COUNT(sca_ticket.ticket_id) AS Count
FROM sca_ticket
LEFT JOIN sca_ticket_status
ON sca_ticket.status_id = sca_ticket_status.id
WHERE sca_ticket.created between date_sub(now(),INTERVAL 1 WEEK) and now()
GROUP BY name
ORDER BY sort
SELECT AVG(QueueTime)
FROM (SELECT DateDiff (created, now()) as 'QueueTime'
FROM `sca_ticket`
WHERE `status_id` = 1) as SubQuery
Try closing the parentheses of your second select statement and giving it an alias:
SELECT sca_ticket_status.name As Status, AVG(QueueTime)
FROM (SELECT status_id, DateDiff (created, now()) as 'QueueTime'
FROM sca_ticket) q1
LEFT JOIN sca_ticket_status
ON q1.status_id = sca_ticket_status.id
GROUP BY name
ORDER BY sort
You will also need to expose the status_id column in your inner select list if you want to join on it later.
You do not need a subquery at all. It can slow down the processing in MySQL: the optimizer is not very smart about subqueries in the FROM clause, and older versions in particular materialize them, losing index information.
SELECT ts.name As Status, AVG(DateDiff(t.created, now()))
FROM sca_ticket t LEFT JOIN
sca_ticket_status ts
ON t.status_id = ts.id
GROUP BY ts.name
ORDER BY sort
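Both answers should return the same averages; the subquery-free form just avoids the derived table. A small sketch to check this, with assumptions: Python's built-in sqlite3 module stands in for MySQL, julianday() arithmetic stands in for DATEDIFF(), the current time is a fixed parameter, the rows are invented, and the ordering uses MIN(ts.sort) so that it does not depend on a non-grouped column.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sca_ticket_status (id INTEGER, name TEXT, sort INTEGER);
    CREATE TABLE sca_ticket (ticket_id INTEGER, status_id INTEGER, created TEXT);
    INSERT INTO sca_ticket_status VALUES (1, 'open', 1), (2, 'closed', 2);
    INSERT INTO sca_ticket VALUES
        (1, 1, '2024-01-01'), (2, 1, '2024-01-05'), (3, 2, '2024-01-09');
""")
params = {"now": "2024-01-11"}  # fixed stand-in for now()

# Derived-table form: the inner select must expose status_id,
# otherwise the outer join has nothing to join on.
derived = conn.execute("""
    SELECT ts.name AS status, AVG(q1.queue_time) AS avg_queue
    FROM (SELECT status_id,
                 ROUND(julianday(:now) - julianday(created)) AS queue_time
          FROM sca_ticket) q1
    LEFT JOIN sca_ticket_status ts ON q1.status_id = ts.id
    GROUP BY ts.name
    ORDER BY MIN(ts.sort)
""", params).fetchall()

# Subquery-free form: the day difference is aggregated directly.
direct = conn.execute("""
    SELECT ts.name AS status,
           AVG(ROUND(julianday(:now) - julianday(t.created))) AS avg_queue
    FROM sca_ticket t
    LEFT JOIN sca_ticket_status ts ON t.status_id = ts.id
    GROUP BY ts.name
    ORDER BY MIN(ts.sort)
""", params).fetchall()
print(derived, direct)
```

Note that the sketch subtracts the creation date from the current date, so the queue times come out positive.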

Count tweets between dates (mysql)

I have an assignment to create a Twitter-like database, and in this assignment I have to filter out the trending topics. My idea was to count the tweets with a specific tag between the date the tweet was made and 7 days later, and order them by the count.
I have the following 2 tables I am using for this query:
Table Tweet: id, message, users_id, date
Table Tweet_tags: id, tag, tweet_id
Since MySQL isn't my strong point at all, I'm having trouble getting any results from the query.
The query I tried is:
Select
Count(twitter.tweet_tags.id) As NumberofTweets,
twitter.tweet_tags.tag
From twitter.tweet
Inner Join twitter.tweet_tags On twitter.tweet_tags.tweet_id = twitter.tweet.id
WHERE twitter.tweet_tags.tag between twitter.tweet.date and ADDDATE(twitter.tweet.date, INTERVAL 7 day)
ORDER BY NumberofTweets
The query runs, but gives no results. I just can't get it to work. Could you please help me out with this, or if you have a better way to get the trending topics, please let me know!
Thanks a lot!
This is equivalent to your query, with table aliases to make it easier to read, with BETWEEN replaced by two inequality predicates, and the ADDDATE function replaced with equivalent operation...
SELECT COUNT(s.id) As NumberofTweets
, s.tag
FROM twitter.tweet t
JOIN twitter.tweet_tags s
ON s.tweet_id = t.id
WHERE s.tag >= t.date
AND s.tag <= t.date + INTERVAL 7 DAY
ORDER
BY NumberofTweets
Two things pop out at me here...
First, there is no GROUP BY. To get a count by "tag", you want a GROUP BY tag.
Second, you are comparing "tag" to "date". I don't know your tables, but that just doesn't look right. (I expect "date" is a DATETIME or TIMESTAMP, and "tag" is a character string, maybe what my daughter calls a "hash tag". Or is that tumblr she's talking about?)
If I understand your requirement:
For each tweet, and for each tag associated with that tweet, you want to get a count of the number of other tweets, that have a matching tag, that are made within 7 days after the datetime of the tweet.
One way to get this result would be to use a correlated subquery. (This is probably the easiest approach to understand, but is probably not the best approach from a performance standpoint).
SELECT t.id
, s.tag
, ( SELECT COUNT(1)
FROM twitter.tweet_tags r
JOIN twitter.tweet q
ON q.id = r.tweet_id
WHERE r.tag = s.tag
AND q.date >= t.date
AND q.date <= t.date + INTERVAL 7 DAY
) AS cnt
FROM twitter.tweet t
JOIN twitter.tweet_tags s
ON s.tweet_id = t.id
ORDER
BY cnt DESC
Another approach would be to use a join operation:
SELECT t.id
, s.tag
, COUNT(q.id) AS cnt
FROM twitter.tweet t
JOIN twitter.tweet_tags s
ON s.tweet_id = t.id
LEFT
JOIN twitter.tweet_tags r
ON r.tag = s.tag
LEFT
JOIN twitter.tweet q
ON q.id = r.tweet_id
AND q.date >= t.date
AND q.date <= t.date + INTERVAL 7 DAY
GROUP
BY t.id
, s.tag
ORDER
BY cnt DESC
The counts from both of these queries assume that tweet_tags (tweet_id, tag) is unique. If there are any "duplicates", then including the DISTINCT keyword, i.e. COUNT(DISTINCT q.id) (in place of COUNT(1) and COUNT(q.id) respectively) would get you the count of "related" tweets.
NOTE: the counts returned will include the original tweet itself.
NOTE: removing the LEFT keywords from the query above should return an equivalent result, since the tweet/tag (from t/s) is guaranteed to match itself (from r/q), as long as the tag is not null and the tweet date is not null.
Those queries are going to have problematic performance on large sets. Appropriate covering indexes are going to be needed for acceptable performance:
... ON twitter.tweet_tags (tag, tweet_id)
... ON twitter.tweet (date)
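The two queries above can be compared on a handful of rows to confirm that they agree and that each count includes the original tweet. This is a sketch under assumptions: Python's built-in sqlite3 module stands in for MySQL, date(x, '+7 days') stands in for x + INTERVAL 7 DAY, the rows are invented, and extra ORDER BY keys are added only to make the output order deterministic.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tweet (id INTEGER, message TEXT, users_id INTEGER, date TEXT);
    CREATE TABLE tweet_tags (id INTEGER, tag TEXT, tweet_id INTEGER);
    INSERT INTO tweet VALUES
        (1, 'a', 1, '2024-01-01'), (2, 'b', 1, '2024-01-03'), (3, 'c', 2, '2024-01-20');
    INSERT INTO tweet_tags VALUES (1, 'sql', 1), (2, 'sql', 2), (3, 'sql', 3), (4, 'py', 2);
""")

# Correlated subquery: for each tweet/tag pair, count the tweets carrying
# the same tag within 7 days after that tweet.
by_subquery = conn.execute("""
    SELECT t.id, s.tag,
           (SELECT COUNT(1)
            FROM tweet_tags r
            JOIN tweet q ON q.id = r.tweet_id
            WHERE r.tag = s.tag
              AND q.date >= t.date
              AND q.date <= date(t.date, '+7 days')) AS cnt
    FROM tweet t
    JOIN tweet_tags s ON s.tweet_id = t.id
    ORDER BY cnt DESC, t.id, s.tag
""").fetchall()

# Join form of the same count.
by_join = conn.execute("""
    SELECT t.id, s.tag, COUNT(q.id) AS cnt
    FROM tweet t
    JOIN tweet_tags s ON s.tweet_id = t.id
    LEFT JOIN tweet_tags r ON r.tag = s.tag
    LEFT JOIN tweet q ON q.id = r.tweet_id
                     AND q.date >= t.date
                     AND q.date <= date(t.date, '+7 days')
    GROUP BY t.id, s.tag
    ORDER BY cnt DESC, t.id, s.tag
""").fetchall()
print(by_subquery)
print(by_join)
```

Every pair has a count of at least 1 because the tweet matches itself; only tweet 1 with tag 'sql' reaches 2, since tweet 2 carries the same tag two days later.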