Display zero in group by sql for a particular period - mysql

I am trying to run the following query to obtain the sales for each type of job for a particular period. However, for months in which no jobs of a particular job type were performed, no row with 0 sales is displayed.
How can I display the zeros in such a case?
Here is the SQL query:
select Year(postedOn), month(postedOn), jobType, sum(price)
from tbl_jobs
group by jobType, year(postedOn), month(postedOn)
order by jobType, year(postedOn), month(postedOn)

Typically, this is where your all-purpose calendar or numbers table comes in to anchor the query with a consistent sequential set:
SELECT Calendar.year, Calendar.month, job_types.jobType,
COALESCE(job_summary.total_price, 0) AS total_price -- 0 when nothing matched
FROM Calendar
CROSS JOIN (
-- you may not have thought about this part of the problem, though:
-- what about years/months with missing job types?
SELECT DISTINCT jobType FROM tbl_jobs
) AS job_types
LEFT JOIN (
SELECT YEAR(postedOn) AS year, MONTH(postedOn) AS month, jobType, SUM(price) AS total_price
FROM tbl_jobs
GROUP BY jobType, YEAR(postedOn), MONTH(postedOn)
) AS job_summary
ON job_summary.jobType = job_types.jobType
AND job_summary.year = Calendar.year
AND job_summary.month = Calendar.month
WHERE Calendar.day = 1 -- assuming your calendar has one row per day
AND Calendar.date BETWEEN some_range_goes_here -- you don't want all time, right?
ORDER BY job_types.jobType, Calendar.year, Calendar.month
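To try the idea without a MySQL server, here is a minimal sketch that runs against SQLite through Python's sqlite3 module. The months table, its rows, and the sample jobs are assumptions made up for illustration, and strftime stands in for MySQL's YEAR() and MONTH(); treat it as an outline of the technique rather than a drop-in query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_jobs (jobType TEXT, postedOn TEXT, price REAL);
    INSERT INTO tbl_jobs VALUES
        ('paint',  '2020-01-15', 100),
        ('paint',  '2020-03-02', 50),
        ('repair', '2020-02-10', 70);
    -- Hypothetical stand-in for a calendar table: one row per month.
    CREATE TABLE months (ym TEXT);
    INSERT INTO months VALUES ('2020-01'), ('2020-02'), ('2020-03');
""")

# Every (month, jobType) pair is generated first; the LEFT JOIN then attaches
# whatever jobs exist, and COALESCE turns "no jobs" into 0.
rows = conn.execute("""
    SELECT m.ym, jt.jobType, COALESCE(SUM(j.price), 0) AS sales
    FROM months AS m
    CROSS JOIN (SELECT DISTINCT jobType FROM tbl_jobs) AS jt
    LEFT JOIN tbl_jobs AS j
           ON j.jobType = jt.jobType
          AND strftime('%Y-%m', j.postedOn) = m.ym
    GROUP BY jt.jobType, m.ym
    ORDER BY jt.jobType, m.ym
""").fetchall()

for row in rows:
    print(row)
```

Each job type gets one row per month, including the months where it had no sales.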

Related

Conditionally counting while also grouping by

I am trying to join two tables
ad_data_grouped
adID, adDate (date), totalViews
This is data that has already been grouped by both adID and adDate.
The second table is
leads
leadID, DateOfBirth, adID, state, createdAt(dateTime)
What I'm struggling with is joining these two tables so I can have a column that counts the number of leads when it shares the same adID and where the adDate = createdAt
The problem I'm running into is that the counts are all the same for all groupings of adID. I have a few other things I'm trying to do, but they are based on similar conditional counting.
Query:(I know the temp table is probably overkill, but I'm trying to break this up into small pieces where I can understand what each piece does)
CREATE TEMPORARY TABLE ad_stats_grouped
SELECT * FROM `ad_stats`
LIMIT 0;
INSERT INTO ad_stats_grouped(AdID, adDate, DailyViews)
SELECT
AdID,
adDate,
sum(DailyViews)
FROM `ad_stats`
GROUP BY adID, adDate;
SELECT
ad_stats_grouped.adID,
ad_stats_grouped.adDate,
COUNT(case when ad_stats_grouped.adDate = Date(Leads.CreatedAt) THEN 1 ELSE 0 END)
FROM `ad_stats_grouped` INNER JOIN `LEADS` ON
ad_stats_grouped.adID = Leads.AdID
GROUP BY adID, adDate;
The problem with your original query is the logic in the COUNT(). This aggregate function takes into account all non-null values, so it counts the 0s as well as the 1s. One solution would be to change COUNT() to SUM().
But I think the query can be further improved by moving the date condition to the ON part of a LEFT JOIN:
select
g.adid,
g.addate,
count(l.adid)
from `ad_stats_grouped` g
left join `leads` l
on g.adid = l.adid
and l.createdat >= g.addate
and l.createdat < g.addate + interval 1 day
group by g.adid, g.addate;
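The difference between the two aggregates is easy to see on a tiny data set. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for MySQL; the sample rows are invented, and date(x, '+1 day') plays the role of MySQL's x + interval 1 day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ad_stats_grouped (adID INTEGER, adDate TEXT, totalViews INTEGER);
    CREATE TABLE leads (leadID INTEGER, adID INTEGER, createdAt TEXT);
    -- Made-up sample data: ad 2 has no leads at all.
    INSERT INTO ad_stats_grouped VALUES
        (1, '2020-01-01', 10), (1, '2020-01-02', 20), (2, '2020-01-01', 5);
    INSERT INTO leads VALUES
        (1, 1, '2020-01-01 09:00:00'),
        (2, 1, '2020-01-01 17:30:00'),
        (3, 1, '2020-01-02 08:00:00');
""")

# COUNT counts every non-NULL value, including the 0 from ELSE; SUM adds only the 1s.
count_vs_sum = conn.execute("""
    SELECT g.adID, g.adDate,
           COUNT(CASE WHEN g.adDate = date(l.createdAt) THEN 1 ELSE 0 END),
           SUM(CASE WHEN g.adDate = date(l.createdAt) THEN 1 ELSE 0 END)
    FROM ad_stats_grouped AS g
    INNER JOIN leads AS l ON g.adID = l.adID
    GROUP BY g.adID, g.adDate
    ORDER BY g.adID, g.adDate
""").fetchall()

# Date condition moved into the ON clause of a LEFT JOIN; ads without leads get 0.
left_join_counts = conn.execute("""
    SELECT g.adID, g.adDate, COUNT(l.adID)
    FROM ad_stats_grouped AS g
    LEFT JOIN leads AS l
           ON g.adID = l.adID
          AND l.createdAt >= g.adDate
          AND l.createdAt < date(g.adDate, '+1 day')
    GROUP BY g.adID, g.adDate
    ORDER BY g.adID, g.adDate
""").fetchall()

print(count_vs_sum)
print(left_join_counts)
```

With the inner join, COUNT reports 3 for both dates of ad 1 while SUM reports 2 and 1, and ad 2 disappears entirely. The left join version keeps ad 2 with a count of 0.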

MYSQL query to get project details and last MAX() action details from log

How can I write a MYSQL query to get project details and the entire last row of the activity log? I want a list of all the projects, with the data from each project's most recent row from the action log, all of it ordered by the most recent action log date DESC. Sorry, I know that this is a common query and the answer must be very easy. But I can't find the solution. I searched with every possible word combination. I found examples that need only one field such as MAX(id) from the joined table. I found solutions with COALESCE but can't seem to make them work. My problem is that I need many fields from the 'parent' table row PL_PROJECTS as well as many fields from the joined table PL_LOG row, not to mention people's names from the same table joined twice.
Everything I try either gives me all the rows of the PL_LOG, repeating rows from PL_PROJECTS. Or, I get just one row from PL_LOG for just one project if I put a LIMIT in the sub query. Here's my query that doesn't work:
SELECT
PJ.pj_id, PJ.pj_title, PJ.pj_location, PJ.pj_desc, PJ.pj_request, PJ.pj_date_start, PP1.pp_name AS supervisor_name, PP2.pp_name AS customer_name, ST.st_desc, logDate, logDesc
FROM PL_PROJECTS PJ
INNER JOIN PL_PEOPLE PP1 ON PJ.pj_spst_member = PP1.pp_id
INNER JOIN PL_PEOPLE PP2 ON PJ.pj_pp_id = PP2.pp_id
INNER JOIN PL_STATUS ST ON PJ.pj_status = ST.st_id
LEFT OUTER JOIN (
SELECT MAX(lg_pj_id) MaxLogID, lg_date AS logDate, lg_desc AS logDesc, lg_pj_id
FROM PL_LOG PL
ORDER BY lg_id DESC
)
LR ON LR.lg_pj_id = PJ.pj_id
GROUP BY PJ.pj_id
ORDER BY logDate DESC
LIMIT 9999999
I think your problem is that your subselect generates only one row, because you are using MAX() without grouping, while you need one row per project (lg_pj_id, I think).
You only need to rewrite the subselect to generate one row per project with the information from the most recent activity. Do you have an activity ID in your action log? It looks like lg_pj_id is the project ID. The meaning of lg_desc is also unclear (or is that the action log ID?). Try grouping by the project ID in your subselect and, depending on your needs, either select the max values from the corresponding rows or select the row with the maximum value per group (project ID).
Thanks for the suggestion of GROUP BY to get one row per project. I tried changing the sub-query like so:
SELECT MAX(lg_id) AS MaxLogID, lg_desc, lg_pj_id
FROM PL_LOG PL
GROUP BY lg_pj_id
Now, I get one row from the log, but it gives me the max id, but not the lg_desc from the same row! If I try the sub-query by itself:
SELECT lg_id, lg_pj_id, lg_date, lg_desc
FROM `PL_LOG`
WHERE lg_pj_id = 33
ORDER BY lg_date DESC
I get these rows. You can see the max row, 68 has a description "30 minute skype call."
68,33,"2018-06-10 00:00:00","30 minute skype call."
61,33,"2018-06-02 00:00:00","Sent email to try to elicit a response."
52,33,"2018-05-10 00:00:00","sent follow up email"
47,33,"2018-03-26 00:00:00","sent initial email"
46,33,"2018-03-26 00:00:00","sent initial email"
But when I try to get just that row, using GROUP BY, it gives me the max lg_id, but the first lg_desc. I need the data all from the max(lg_id) row:
SELECT MAX(lg_id) AS MaxLogID, lg_pj_id, lg_date, lg_desc
FROM PL_LOG
WHERE lg_pj_id = 33
GROUP BY lg_pj_id
ORDER BY MaxLogID DESC
Returns:
68, 33, "2018-03-26 00:00:00", "sent initial email"
Try this as mentioned in my comment:
SELECT
PJ.pj_id, PJ.pj_title, PJ.pj_location, PJ.pj_desc, PJ.pj_request,
PJ.pj_date_start, PP1.pp_name AS supervisor_name, PP2.pp_name AS
customer_name, ST.st_desc, logDate, logDesc
FROM PL_PROJECTS PJ
INNER JOIN PL_PEOPLE PP1 ON PJ.pj_spst_member = PP1.pp_id
INNER JOIN PL_PEOPLE PP2 ON PJ.pj_pp_id = PP2.pp_id
INNER JOIN PL_STATUS ST ON PJ.pj_status = ST.st_id
LEFT JOIN (SELECT lg_id, lg_date AS logDate, lg_desc AS logDesc, lg_pj_id
FROM PL_LOG AS PL
-- keep only each project's newest log row
WHERE PL.lg_id = (SELECT MAX(lg_id) FROM PL_LOG AS PL_2
WHERE PL.lg_pj_id = PL_2.lg_pj_id)
) LR ON LR.lg_pj_id = PJ.pj_id
GROUP BY PJ.pj_id
ORDER BY logDate DESC
LIMIT 9999999
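Here is a small sketch of the "row with the max id per project" pattern, run against SQLite through Python's sqlite3 module instead of MySQL. The log rows for project 33 are the ones from the question; the project titles and project 34 (which has no log rows) are made up for illustration, and the people and status joins are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PL_PROJECTS (pj_id INTEGER PRIMARY KEY, pj_title TEXT);
    CREATE TABLE PL_LOG (lg_id INTEGER PRIMARY KEY, lg_pj_id INTEGER,
                         lg_date TEXT, lg_desc TEXT);
    -- Titles and project 34 are hypothetical; the log rows come from the question.
    INSERT INTO PL_PROJECTS VALUES (33, 'Project A'), (34, 'Project B');
    INSERT INTO PL_LOG VALUES
        (46, 33, '2018-03-26 00:00:00', 'sent initial email'),
        (47, 33, '2018-03-26 00:00:00', 'sent initial email'),
        (52, 33, '2018-05-10 00:00:00', 'sent follow up email'),
        (61, 33, '2018-06-02 00:00:00', 'Sent email to try to elicit a response.'),
        (68, 33, '2018-06-10 00:00:00', '30 minute skype call.');
""")

# The correlated subquery keeps only the log row whose lg_id is the maximum
# for its project, so every column comes from that same row.
latest = conn.execute("""
    SELECT PJ.pj_id, PJ.pj_title, LR.lg_id, LR.logDate, LR.logDesc
    FROM PL_PROJECTS AS PJ
    LEFT JOIN (
        SELECT PL.lg_id, PL.lg_date AS logDate, PL.lg_desc AS logDesc, PL.lg_pj_id
        FROM PL_LOG AS PL
        WHERE PL.lg_id = (SELECT MAX(PL_2.lg_id)
                          FROM PL_LOG AS PL_2
                          WHERE PL_2.lg_pj_id = PL.lg_pj_id)
    ) AS LR ON LR.lg_pj_id = PJ.pj_id
    ORDER BY LR.logDate DESC
""").fetchall()

for row in latest:
    print(row)
```

Project 33 comes back with log row 68 and its own description, and project 34 is still listed with NULLs in the log columns because of the left join.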

Is there a way to create an SQL query faster than this one?

I have a MySQL table which stores the data of a hotel's reservations.
I need a query to see the amount of guests who stayed in the hotel for each date.
I was able to create a query (using a subquery) but it performs very slowly. Is there a better way to get the requested data? (For example join the table to itself, or whatever.)
My query is:
SELECT CheckOutDate AS Date,
(SELECT SUM(NrOfGuests) FROM tblGuests tG
WHERE tG.CheckInDate <= tblGuests.CheckOutDate
AND tG.CheckOutDate > tblGuests.CheckOutDate
AND tG.IsCancelled = False AND tG.NoShow = False)
AS NrOfGestsStaying
FROM tblGuests
GROUP BY CheckOutDate
What is the best way to make it perform faster?
In the original query, the SELECT returns a SUM on every row of the table using a subquery. The duplicates are removed afterwards using a group by CheckOutDate. So, in other words, this is the SUM(NrOfGuests) for distinct CheckOutDate.
You can remove duplicate CheckOutDate in advance by subquerying distinct CheckOutDate. So in the receiving query the SUM is applied just one time for distinct CheckOutDate:
SELECT dT.CheckOutDate
,(SELECT SUM(NrOfGuests)
FROM tblGuests tG
WHERE tG.CheckInDate <= dT.CheckOutDate
AND tG.CheckOutDate > dT.CheckOutDate
AND tG.IsCancelled = 0
AND tG.NoShow = 0
) AS NrOfGuests
FROM (
SELECT DISTINCT CheckOutDate
FROM tblGuests
) AS dT
ORDER BY dT.CheckOutDate
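As a sanity check that deduplicating the dates first does not change the answer, the sketch below runs both shapes of the query on a few invented reservations. It uses SQLite through Python's sqlite3 module rather than MySQL, keeps the question's strict > comparison on CheckOutDate, and says nothing about speed; it only compares results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblGuests (CheckInDate TEXT, CheckOutDate TEXT,
                            NrOfGuests INTEGER, IsCancelled INTEGER, NoShow INTEGER);
    -- Made-up reservations; the last one is cancelled.
    INSERT INTO tblGuests VALUES
        ('2020-01-01', '2020-01-03', 2, 0, 0),
        ('2020-01-02', '2020-01-05', 3, 0, 0),
        ('2020-01-02', '2020-01-03', 1, 0, 0),
        ('2020-01-01', '2020-01-05', 4, 1, 0);
""")

# Shape of the question's query: subquery in the select list, then GROUP BY.
original = conn.execute("""
    SELECT g.CheckOutDate,
           (SELECT SUM(tG.NrOfGuests) FROM tblGuests AS tG
            WHERE tG.CheckInDate <= g.CheckOutDate
              AND tG.CheckOutDate > g.CheckOutDate
              AND tG.IsCancelled = 0 AND tG.NoShow = 0)
    FROM tblGuests AS g
    GROUP BY g.CheckOutDate
    ORDER BY g.CheckOutDate
""").fetchall()

# Shape of the answer's query: distinct dates first, then one subquery per date.
rewritten = conn.execute("""
    SELECT dT.CheckOutDate,
           (SELECT SUM(tG.NrOfGuests) FROM tblGuests AS tG
            WHERE tG.CheckInDate <= dT.CheckOutDate
              AND tG.CheckOutDate > dT.CheckOutDate
              AND tG.IsCancelled = 0 AND tG.NoShow = 0)
    FROM (SELECT DISTINCT CheckOutDate FROM tblGuests) AS dT
    ORDER BY dT.CheckOutDate
""").fetchall()

print(original)
print(rewritten)
```

Both queries return the same rows here. A date on which nobody is staying yields NULL rather than 0, since SUM over no rows is NULL.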

Find average amount of time from Requisition Submitted to Order Created

I have two tables, requisition_headers and order_headers. I am interested in finding the average time between the moment the requisition is submitted (requisition_headers.submitted_at) and the moment the order is created (order_headers.created_at), where requisition_headers.status <> 'draft'.
I would like the result to look like:
Avg_Req_To_PO_Cycle_Time = 3.2 Days
I have the following script but it's not working:
SELECT Database() as Customer,
AVG(timestampdiff(requisition_headers.submitted_at,order_headers.created_at)) AS REQ_PO_Cycle_Time
FROM order_headers
LEFT JOIN requisition_headers ON order_headers.requisition_header_id = requisition_headers.id
WHERE requisition_headers.status <> 'draft'
Any Ideas?
--UPDATE--
I changed the query to the following and now get a result of 229491.71. My question is: is that days, hours, minutes, or seconds?
SELECT DATABASE() AS CUSTOMER,
AVG(TIME_TO_SEC(TIMEDIFF(order_headers.created_at,requisition_headers.submitted_at))) as Cycle_time
FROM order_headers LEFT JOIN requisition_headers ON order_headers.requisition_header_id = requisition_headers.id
where requisition_headers.status <> 'draft'
Make sure you know what a system function returns when you use it. The TimestampDiff function returns the difference between two datetime values in the unit you specify in the first argument. You don't specify that unit, so I don't know what you would get back; I get a compile error.
In your second attempt, you are using TimeDiff, which returns a time value; Time_To_Sec converts each difference to seconds before Avg is applied, so the result is in seconds. If you want the result in fractional days, just divide by the number of seconds in a day (86400).
You also use a left join when getting the dates. At first I thought you wanted to get all the requisitions whether the orders had been created or not, but you are joining the tables in the wrong order for that. Assuming that is your intention, if the order has not yet been created you will be passing NULL as one of the parameters. The result for that row is NULL, and Avg ignores NULLs, so those requisitions contribute nothing. If you want to use a left join, then you should specify a substitute date for any missing created_at values, after getting the tables in the right order, that is.
Here are two options. One ignores orders that have not yet been created by using a regular inner join. The other includes those but substitutes the current date and time.
By asking for the number of minutes between the dates, the final answer in days is found by dividing by the number of minutes in a day.
SELECT Customer,
AVG( timestampdiff( minute, r.submitted_at,
o.created_at)) / (24 * 60 )AS REQ_PO_Cycle_Time
FROM requisition_headers r
JOIN order_headers o
ON o.requisition_header_id = r.id
WHERE r.status <> 'draft'
group by Customer;
SELECT Customer,
AVG( timestampdiff( minute, r.submitted_at,
IfNull( o.created_at, Now()))) / (24 * 60) AS REQ_PO_Cycle_Time
FROM requisition_headers r
LEFT JOIN order_headers o
ON o.requisition_header_id = r.id
WHERE r.status <> 'draft'
group by Customer;
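For the figure in the question's update, the unit conversion is plain arithmetic. TIME_TO_SEC yields seconds, so the 229491.71 reported there is an average in seconds; the short Python sketch below converts it to days. The output label is copied from the format the question asked for.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400

avg_seconds = 229491.71  # value reported in the question's update
avg_days = avg_seconds / SECONDS_PER_DAY

print(f"Avg_Req_To_PO_Cycle_Time = {avg_days:.1f} Days")
```

This works out to roughly 2.7 days. The same division applies to the minute-based queries, using 24 * 60 as the divisor instead.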

Getting Incorrect SUM for left joins and GROUP BY

I am getting wrong results in the sum of total deposits.
I want to output a report of total deposits per campaign_name
and eventually inside a date range.
SELECT IFNULL(campaign_name,'DIRECT'),
IFNULL(TotalDeposit,0)
FROM trackings
LEFT JOIN
(SELECT deposit_amount,
sum(deposit_amount) AS TotalDeposit,
uuid
FROM conversions
LEFT JOIN transactions ON conversions.trader_id = transactions.trader_id
WHERE aff_id =3
AND TYPE='deposit'
GROUP BY transactions.trader_id) AS conversions ON trackings.uuid = conversions.uuid
WHERE aff_id=3
GROUP BY campaign_name
results: missing 200 from trynow campaign??
campaign_name,TotalDeposit
DIRECT,0.00
new_campaign_name,0.00
test march,500.00
testing,0.00
trynow,800.00
expected results:
campaign_name,TotalDeposit
DIRECT,0.00
new_campaign_name,0.00
test march,500.00
testing,0.00
trynow,1000.00
I think your data isn't quite right - using the data that you've supplied, the deposit of 500 for test march is never going to be returned, as it is linked to trader_id 7506, who has no records in the conversions table.
However, the following query is simpler and easier to understand, and correctly returns 1000 for trynow:
SELECT
IFNULL(SUM(t.deposit_amount),0) AS total_deposits
, IFNULL(tr.campaign_name,'DIRECT') AS campaign
FROM
trackings tr LEFT JOIN
conversions c ON
tr.uuid = c.uuid LEFT JOIN
transactions t ON
c.trader_id = t.trader_id AND
tr.`aff_id` = t.aff_id AND
t.type = 'Deposit'
WHERE
tr.aff_id = 3 AND
tr.updated_at >= '2015-03-01' AND tr.updated_at < '2015-04-01'
GROUP BY
IFNULL(tr.campaign_name,'DIRECT')
If you can check the test data supplied or otherwise point me in the right direction, I might be able to improve the query to return exactly what you want.
For date filtering, see the addition to the WHERE clause above. Note that if you need to filter on a date in the transactions table, the date condition must be part of the ON clause instead: this table is left-joined, so filtering it in the main WHERE clause would discard the campaigns that have no matching transactions.
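The ON versus WHERE point is easiest to see side by side. The sketch below is a simplified, hypothetical version of the schema (transactions linked directly to trackings by uuid, with an invented created_at column and invented rows), run against SQLite through Python's sqlite3 module rather than MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, made-up schema: the conversions table is skipped.
    CREATE TABLE trackings (uuid TEXT, campaign_name TEXT);
    CREATE TABLE transactions (uuid TEXT, deposit_amount INTEGER, created_at TEXT);
    INSERT INTO trackings VALUES ('u1', 'trynow'), ('u2', 'testing');
    INSERT INTO transactions VALUES
        ('u1', 800, '2015-03-10'),
        ('u1', 200, '2015-03-20'),
        ('u2', 50,  '2015-02-01');
""")

# Date filter in the ON clause: unmatched campaigns survive with a total of 0.
filter_in_on = conn.execute("""
    SELECT tr.campaign_name, COALESCE(SUM(t.deposit_amount), 0)
    FROM trackings AS tr
    LEFT JOIN transactions AS t
           ON t.uuid = tr.uuid
          AND t.created_at >= '2015-03-01' AND t.created_at < '2015-04-01'
    GROUP BY tr.campaign_name
    ORDER BY tr.campaign_name
""").fetchall()

# Same filter in the WHERE clause: rows without a March transaction are removed,
# which makes the LEFT JOIN behave like an inner join.
filter_in_where = conn.execute("""
    SELECT tr.campaign_name, COALESCE(SUM(t.deposit_amount), 0)
    FROM trackings AS tr
    LEFT JOIN transactions AS t ON t.uuid = tr.uuid
    WHERE t.created_at >= '2015-03-01' AND t.created_at < '2015-04-01'
    GROUP BY tr.campaign_name
    ORDER BY tr.campaign_name
""").fetchall()

print(filter_in_on)
print(filter_in_where)
```

With the filter in ON, testing is reported with 0 and trynow with 1000; with the filter in WHERE, testing drops out of the report entirely.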