How can I write this SQL query correctly? - mysql

I'm trying to figure out how to write a query to get the right results.
I'll keep it simple. First, the foundation:
I have two tables: DEALS and TASKS
A Deal can have 1 or more Task inside, so TASKS has a deal_id field.
Also, every Task has a time_start field (unix timestamp) and a completed field (1 or 0).
OK. Now, what do I need? In my view I need to show all deals with a "Next Task" column rendered.
So for every Deal, if it has one or more Tasks, the view must show only the closest one. If there is no Task, I'll render an alert.
Deal Title | Value | Next step
deal 1 | 1.000 | tomorrow at 11:00
deal 2 | 1.000 | NO TASK IN THIS DEAL
deal 3 | 1.500 | 12/03/2017 at 9:00
In this example, deal 1 has 3 tasks inside, but the nearest one starts tomorrow. I don't want "deal 1" repeated 3 times. <-- GROUP BY deals.id??
To get this right, currently, I run the deals query without JOIN and I'm using a custom PDO class to run a new query for tasks for every row.
But this is BAD! I have a new query for every row of the DEALS table.
I'm pretty sure that there is a way to write one single query to get this result.
PS: Don't worry about the rendering of the text. I used "tomorrow" only to write the example; next_step is the unix timestamp from the db, and I can easily use moment.js to format it correctly.
EDIT:
I'll provide the data inside the 2 tables, just to complete the example.
DEALS
ID | TITLE | VALUE
1 | Deal 1 | 1000
2 | Deal 2 | 1000
3 | Deal 3 | 1000
TASKS
ID | DEAL_ID | TITLE | TIME_START | COMPLETED
1 | 1 | Send Proposal | 1483678800 | 0
2 | 1 | Follow up | 1483441200 | 0
3 | 1 | Ask for referrals | 1484441200 | 0
4 | 2 | Send email | 1483678900 | 0
5 | 3 | Sort out meeting | 1483678900 | 0
NOTE: timestamps don't match with the results that I have written in the first table. They were just an example, I take the timestamp of the time_start field and format it in a human readable mode, but this isn't my question.

If you want the time of the next task, you can use a correlated subquery (time_col stands for whatever reference time you want to compare against):
select d.*,
       (select t.time_start
        from tasks t
        where t.deal_id = d.id and t.time_start > d.time_col
        order by t.time_start
        limit 1
       ) as task_time_start
from deals d;
EDIT:
If you want the future, then just change the time comparison. Since time_start is a unix timestamp, compare it against unix_timestamp() rather than now():
select d.*,
       (select t.time_start
        from tasks t
        where t.deal_id = d.id and t.completed = 0 and
              t.time_start > unix_timestamp()
        order by t.time_start
        limit 1
       ) as task_time_start
from deals d;
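To see the correlated subquery at work, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for MySQL here, and now_ts is a hypothetical fixed "current time" used in place of unix_timestamp() so the result is reproducible.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE deals (id INTEGER PRIMARY KEY, title TEXT, value INTEGER);
CREATE TABLE tasks (id INTEGER PRIMARY KEY, deal_id INTEGER, title TEXT,
                    time_start INTEGER, completed INTEGER);
INSERT INTO deals VALUES (1, 'Deal 1', 1000), (2, 'Deal 2', 1000), (3, 'Deal 3', 1000);
INSERT INTO tasks VALUES
  (1, 1, 'Send Proposal',     1483678800, 0),
  (2, 1, 'Follow up',         1483441200, 0),
  (3, 1, 'Ask for referrals', 1484441200, 0),
  (4, 2, 'Send email',        1483678900, 0),
  (5, 3, 'Sort out meeting',  1483678900, 0);
""")

# Hypothetical fixed "now"; in MySQL this would be UNIX_TIMESTAMP().
now_ts = 1483700000

# One row per deal; the subquery picks the earliest open task after now_ts.
next_tasks = conn.execute("""
SELECT d.id, d.title,
       (SELECT t.time_start
          FROM tasks t
         WHERE t.deal_id = d.id
           AND t.completed = 0
           AND t.time_start > :now
         ORDER BY t.time_start
         LIMIT 1) AS next_task_start
  FROM deals d
 ORDER BY d.id
""", {"now": now_ts}).fetchall()

for row in next_tasks:
    print(row)
```

With this cutoff only Deal 1 still has a future task, so the other two deals come back with NULL (None in Python), which is the case where the view would render the alert.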

Something like this should do it (SQL Server syntax; time_start holds a unix timestamp, so it is compared against the current epoch seconds):
select
  d.title as 'Deal Title',
  d.value as 'Value',
  IsNull(CONVERT(VARCHAR(50), min(t.time_start)), 'NO TASK IN THIS DEAL') as 'Next step'
from Deals d
left join Tasks t
  on t.deal_id = d.id
  and t.completed = 0
  and t.time_start > DATEDIFF(second, '19700101', GETUTCDATE())
group by d.id, d.title, d.value
Here's the MySQL equivalent:
select
  d.title as 'Deal Title',
  d.value as 'Value',
  IfNull(min(t.time_start), 'NO TASK IN THIS DEAL') as 'Next step'
from Deals d
left join Tasks t
  on t.deal_id = d.id
  and t.completed = 0
  and t.time_start > unix_timestamp() -- remove this line to include past Tasks
group by d.id, d.title, d.value
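The LEFT JOIN plus MIN() approach can be tried out the same way. This is a rough sketch, assuming SQLite (via Python's sqlite3) as a stand-in for MySQL and a hypothetical fixed now_ts instead of the real current time; the rows are the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE deals (id INTEGER PRIMARY KEY, title TEXT, value INTEGER);
CREATE TABLE tasks (id INTEGER PRIMARY KEY, deal_id INTEGER, title TEXT,
                    time_start INTEGER, completed INTEGER);
INSERT INTO deals VALUES (1, 'Deal 1', 1000), (2, 'Deal 2', 1000), (3, 'Deal 3', 1000);
INSERT INTO tasks VALUES
  (1, 1, 'Send Proposal',     1483678800, 0),
  (2, 1, 'Follow up',         1483441200, 0),
  (3, 1, 'Ask for referrals', 1484441200, 0),
  (4, 2, 'Send email',        1483678900, 0),
  (5, 3, 'Sort out meeting',  1483678900, 0);
""")

# Hypothetical fixed "now" so the output does not depend on the clock.
now_ts = 1483700000

# The task filters live in the ON clause, so deals without a matching
# task are kept and MIN() yields NULL for them.
next_steps = conn.execute("""
SELECT d.title, d.value,
       IFNULL(MIN(t.time_start), 'NO TASK IN THIS DEAL') AS next_step
  FROM deals d
  LEFT JOIN tasks t
    ON t.deal_id = d.id
   AND t.completed = 0
   AND t.time_start > ?
 GROUP BY d.id, d.title, d.value
 ORDER BY d.id
""", (now_ts,)).fetchall()

for row in next_steps:
    print(row)
```

Keeping the conditions in the ON clause rather than in WHERE is what lets deals with no upcoming task survive the join.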

Related

MySQL limitations to simplify Query

Please note that I'm an absolute n00b in MySQL but somehow I managed to build some (for me) complex queries that work as they should. My main problem now is that for many of the queries we're working on:
The query is becoming too big and very hard to see through.
The same subqueries get repeated many times and that is adding to the complexity (and probably to the time needed to process the query).
We want to further expand this query but we are reaching a point where we can no longer oversee what we are doing. I've added one of these subqueries at the end of this post, just as an example.
!! You can fast forward to the Problem section if you want to skip the details below. I think the question can also be answered without the additional info.
What we want to do
Create a MySQL query that calculates purchase orders and forecasts for a given supplier based on:
Sales history in a given period (past [x] months = interval)
Current stock
Items already in backorder (from supplier)
Reserved items (for customers)
Supplier ID
I've added an example of a subquery at the bottom of this message. We're showing just this part to keep things simple for now. The output of the subquery is:
Part number
Units sold
Units sold (outliers removed)
Units sold per month (outliers removed)
Number of invoices with the part number in the period (interval)
It works quite OK for us, although I'm sure it can be optimised. It removes outliers from the sales history (e.g. one customer that orders 50 pcs of one product in one order). Unfortunately it can only remove outliers when there is substantial data, so if the first order happens to be 50 pcs then it is not considered an outlier. For that reason we take the number of invoices into account in the main query. The number of invoices has to exceed a certain threshold, otherwise the system will revert to a fixed value of "maximum stock" for that product.
As mentioned this is only a small part of the complete query and we want to expand it even further (so that it takes into account the "sales history" of parts that were used in assembled products).
For example if we were to build and sell cars, and we want to place an order with our tyre supplier, the query calculates the number of tyres we need to order based on the sales history of the various car models (while also taking into account the stock of the cars, reserved cars and stock of the tyres).
Problem
The query is becoming massive and incomprehensible. We are repeating the same subqueries many times, which to us seems highly inefficient, and it is the main reason why the query is becoming so bulky.
What we have tried
(Please note that we are on MySQL 5.5.33. We will update our server soon but for now we are limited to this version.)
Create a VIEW from the subqueries.
The main issue here is that we can't execute the view with parameters like supplier_id and interval period. Our subquery calculates the sum of the sold items for a given supplier within the given period. So even if we would build the VIEW so that it calculates this for ALL products from ALL suppliers we would still have the issue that we can't define the interval period after the VIEW has been executed.
A stored procedure.
Correct me if I'm wrong but as far as I know, MySQL only allows us to perform a Call on a stored procedure so we still can't run it against the parameters (period, supplier id...)
Even this workaround won't help us because we still can't run the SP against the parameters.
Using WITH at the beginning of the query
A common table expression in MySQL is a temporary result whose scope is confined to a single statement. You can refer to this expression multiple times within the statement.
The WITH clause in MySQL is used to specify a Common Table Expression; a WITH clause can have one or more comma-separated subclauses.
Not sure if this would be the solution because we can't test it. WITH is not supported until MySQL version 8.0.
What now?
My last resort would be to put the mentioned subqueries in a temp table before starting the main query. This might not completely eliminate our problems but at least the main query will be more comprehensible and with less repetition of fetching the same data. Would this be our best option or have I overlooked a more efficient way to tackle this?
Many thanks for your kind replies.
SELECT
GREATEST((verkocht_sd/6*((100 + 0)/100)),0) as 'units sold p/month ',
GREATEST(ROUND((((verkocht_sd/6)*3)-voorraad+reserved-backorder),0),0) as 'Order based on units sold',
SUM(b.aantal) as 'Units sold in period',
t4.verkocht_sd as 'Units sold in period, outliers removed',
COUNT(*) as 'Number of invoices in period',
b.art_code as 'Part number'
FROM bongegs b -- Table that has all the sales records for all products
RIGHT JOIN totvrd ON (totvrd.art_code = b.art_code) -- Right Join stock data to also include items that are not in table bongegs (no sales history).
LEFT JOIN artcred ON (artcred.art_code = b.art_code) -- add supplier ID to the part numbers.
LEFT JOIN
(
SELECT
SUM(b.aantal) as verkocht_sd,
b.art_code
FROM bongegs b
RIGHT JOIN totvrd ON (totvrd.art_code = b.art_code)
LEFT JOIN artcred ON (artcred.art_code = b.art_code)
WHERE
b.bon_datum > DATE_SUB(CURDATE(), INTERVAL 6 MONTH)
and b.bon_soort = "f" -- Selects only invoices
and artcred.vln = 1 -- 1 = Prefered supplier
and artcred.cred_nr = 9117 -- Supplier ID
and b.aantal < (select * from (SELECT AVG(b.aantal)+3*STDDEV(aantal)
FROM bongegs b
WHERE
b.bon_soort = 'f' and
b.bon_datum > DATE_SUB(CURDATE(), INTERVAL 6 MONTH)) x)
GROUP BY b.art_code
) AS t4
ON (b.art_code = t4.art_code)
WHERE
b.bon_datum > DATE_SUB(CURDATE(), INTERVAL 6 MONTH)
and b.bon_soort = "f"
and artcred.vln = 1
and artcred.cred_nr = 9117
GROUP BY b.art_code
Bongegs | all rows from sales forms (invoices F, offers O, delivery notes V)
| art_code | bon_datum | bon_soort | aantal |
|:---------|:---------: |:---------:|:------:|
| item_1 | 2021-08-21 | f | 6 |
| item_2 | 2021-08-29 | v | 3 |
| item_6 | 2021-09-03 | o | 2 |
| item_4 | 2021-10-21 | f | 6 |
| item_1 | 2021-11-21 | o | 6 |
| item_3 | 2022-01-17 | v | 6 |
| item_1 | 2022-01-21 | o | 6 |
| item_4 | 2022-01-26 | f | 6 |
Artcred | supplier ID's
| art_code | vln | cred_nr |
|:---------|:----:|:-------:|
| item_1 | 1 | 1001 |
| item_2 | 1 | 1002 |
| item_3 | 1 | 1001 |
| item_4 | 1 | 1007 |
| item_5 | 1 | 1004 |
| item_5 | 2 | 1008 |
| item_6 | 1 | 1016 |
| item_7 | 1 | 1567 |
totvrd | stock
| art_code | voorraad | reserved | backorder |
|:---------|:---------: |:--------:|:---------:|
| item_1 | 1 | 0 | 5 |
| item_2 | 0 | 0 | 0 |
| item_3 | 88 | 0 | 0 |
| item_4 | 9 | 0 | 0 |
| item_5 | 67 | 2 | 20 |
| item_6 | 112 | 9 | 0 |
| item_7 | 65 | 0 | 0 |
| item_8 | 7 | 1 | 0 |
Now, on to the query. You have LEFT JOINs to the artcred table, but then include artcred in the WHERE clause, making it an INNER JOIN (a match is required in both the left and right tables) in the result. Was this intended, or are you expecting records in the bongegs table that do NOT exist in artcred?
Well to be honest I was not fully aware that this would essentially form an INNER JOIN but in this case it doesn't really matter. A record that exists in bongegs always exists in artcred as well (every sold product must have a supplier). That doesn't work both ways since a product can be in artcred without ever being sold.
You also have RIGHT JOIN on totvrd which implies you want every record in the TotVRD table regardless of a record in the bongegs table. Is this correct?
Yes it is intended. Otherwise only products with actual sales in the period would end up in the result and we also wanted to include products with zero sales.
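The point about a WHERE condition turning a LEFT JOIN into an effective INNER JOIN is easy to check on a few rows. The sketch below is a simplified illustration only (two tables instead of the full query), assuming SQLite via Python's sqlite3 as a stand-in for MySQL; the rows are a subset of the sample data and supplier 1001 is picked from that sample.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE totvrd  (art_code TEXT, voorraad INTEGER);
CREATE TABLE artcred (art_code TEXT, vln INTEGER, cred_nr INTEGER);
INSERT INTO totvrd  VALUES ('item_1', 1), ('item_2', 0), ('item_3', 88), ('item_8', 7);
INSERT INTO artcred VALUES ('item_1', 1, 1001), ('item_2', 1, 1002), ('item_3', 1, 1001);
""")

# Filter in WHERE: rows where artcred did not match have cred_nr = NULL,
# fail the comparison, and disappear (same effect as an INNER JOIN).
where_filtered = conn.execute("""
SELECT v.art_code, a.cred_nr
  FROM totvrd v
  LEFT JOIN artcred a ON a.art_code = v.art_code
 WHERE a.cred_nr = 1001
 ORDER BY v.art_code
""").fetchall()

# Filter in ON: every stock row is kept; non-matching rows get NULL.
on_filtered = conn.execute("""
SELECT v.art_code, a.cred_nr
  FROM totvrd v
  LEFT JOIN artcred a ON a.art_code = v.art_code AND a.cred_nr = 1001
 ORDER BY v.art_code
""").fetchall()

print(where_filtered)
print(on_filtered)
```

Which form is right depends on whether products without a matching supplier row should appear in the result at all.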
One simplification:
and b.aantal < ( SELECT * from ( SELECT AVG ...
-->
and b.aantal < ( SELECT AVG ...
A personal problem: my brain hurts when I see RIGHT JOIN; please rewrite as LEFT JOIN.
Check your RIGHTs and LEFTs -- an outer join keeps the other table's rows even if there is no match; are you expecting such NULLs? That is, it looks like they can all be plain JOINs (aka INNER JOINs).
These might help performance:
b: INDEX(bon_soort, bon_datum, aantal, art_code)
totvrd: INDEX(art_code)
artcred: INDEX(vln, cred_nr, art_code)
Is the filtered subset of b what you keep needing? Build a temp table:
CREATE TEMPORARY TABLE tmp_b
SELECT ...
FROM b
WHERE ...;
But if you need to use tmp_b multiple times in the same query, (and since you are not yet on MySQL 8.0), you may need to make it a non-TEMPORARY table for long enough to run the query. (If you have multiple connections building the same permanent table, there will be trouble.)
Yes, 5.5.33 is rather antique; upgrade soon.
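The temp-table idea above can be sketched roughly as follows. This is only an illustration under assumptions: it runs on SQLite through Python's sqlite3 rather than MySQL 5.5, tmp_sales is a made-up name, the supplier filter is left out, and a fixed cutoff date replaces DATE_SUB(CURDATE(), INTERVAL 6 MONTH) so the numbers are reproducible with the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bongegs (art_code TEXT, bon_datum TEXT, bon_soort TEXT, aantal INTEGER);
CREATE TABLE totvrd  (art_code TEXT, voorraad INTEGER, reserved INTEGER, backorder INTEGER);
INSERT INTO bongegs VALUES
  ('item_1', '2021-08-21', 'f', 6), ('item_2', '2021-08-29', 'v', 3),
  ('item_6', '2021-09-03', 'o', 2), ('item_4', '2021-10-21', 'f', 6),
  ('item_1', '2021-11-21', 'o', 6), ('item_3', '2022-01-17', 'v', 6),
  ('item_1', '2022-01-21', 'o', 6), ('item_4', '2022-01-26', 'f', 6);
INSERT INTO totvrd VALUES ('item_1', 1, 0, 5), ('item_4', 9, 0, 0), ('item_8', 7, 1, 0);

-- Aggregate the invoice rows once, up front.
CREATE TEMPORARY TABLE tmp_sales AS
SELECT art_code, SUM(aantal) AS units_sold, COUNT(*) AS num_invoices
  FROM bongegs
 WHERE bon_soort = 'f' AND bon_datum > '2021-08-01'
 GROUP BY art_code;
""")

# Main query: stock table LEFT JOINed to the pre-aggregated sales.
per_part = conn.execute("""
SELECT v.art_code,
       COALESCE(s.units_sold, 0)   AS units_sold,
       COALESCE(s.num_invoices, 0) AS num_invoices
  FROM totvrd v
  LEFT JOIN tmp_sales s ON s.art_code = v.art_code
 ORDER BY v.art_code
""").fetchall()

# A later statement can reuse the same temp table without recomputing it.
total_sold = conn.execute("SELECT SUM(units_sold) FROM tmp_sales").fetchone()[0]

print(per_part, total_sold)
```

Each statement here refers to tmp_sales only once, which sidesteps the restriction mentioned above about using a TEMPORARY table several times inside a single MySQL query.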
Having gathered what I believe are all the pieces you had, I think the query below is a significant simplification. Let's start with the fact that you were trying to eliminate the outliers by using the average plus standard deviation as the exclusion threshold. Then you had the original summation of all sales, also from the bongegs table.
To simplify this, I have the sub-query ONCE internal that does the summation, counts, avg, stddev of all orders (f) within the last 6 months. I also computed the divide by 6 for per-month you wanted in the top.
Since the bongegs is now all pre-aggregated ONCE, and grouped per art_code, it does not need to be done one after the other. You can use the totals directly at the top (at least I THINK is similar output without all actual data and understanding of your context).
So the primary table is the stock table (totvrd), LEFT-JOINED to the pre-query of bongegs. This allows you to get all products, regardless of whether they have been sold.
Since the one aggregation prequery has the avg and stddev in it, you can simply apply an additional AND clause when joining based on the total sold being less than the avg/stddev context.
The resulting query below.
SELECT
-- appears you are looking for the highest percentage?
-- typically NOT a good idea to name columns starting with numbers,
-- but ok. Typically let interface/output name the columns to end-users
GREATEST((b.verkocht_sdperMonth * ((100 + 0)/100)),0) as 'units sold p/month',
-- appears to be the total sold divided by 6 to get monthly average over 6 months query of data
GREATEST( ROUND(
( (b.verkocht_sdperMonth * 3) - v.voorraad + v.reserved - v.backorder), 0), 0)
as 'Order based on units sold',
b.verkocht_sd as 'Units sold in period',
b.AvgStdDev as 'AvgStdDeviation',
b.NumInvoices as 'Number of invoices in period',
v.art_code as 'Part number'
FROM
-- stock, master inventory, regardless of supplier
-- get all products, even though not all may be sold
totvrd v
-- LEFT join to pre-query of Bongegs pre-grouped by the art_code which appears
-- to be basis of all other joins, std deviation and average while at it
LEFT JOIN
(select
b.art_code,
count(*) NumInvoices,
sum( b.aantal ) verkocht_sd,
sum( b.aantal ) / 6.0 verkocht_sdperMonth,
avg( b.aantal ) AvgSale,
AVG(b.aantal) + 3 * STDDEV( b.aantal) AvgStdDev
from
bongegs b
JOIN artcred ac
on b.art_code = ac.art_code
AND ac.vln = 1
and ac.cred_nr = 9117
where
-- only for invoices ('f') and within last 6 months
b.bon_soort = 'f'
AND b.bon_datum > DATE_SUB(CURDATE(), INTERVAL 6 MONTH)
group by
b.art_code ) b
-- result is one entry per art_code, thus preventing any Cartesian product
ON v.art_code = b.art_code
GROUP BY
v.art_code

Using a Select query to change the Value for two tables depending on a condition

Goal: I want to create a Select query where the result contains all records from both tables, except the time slot. In addition to this, I want the condition that if the minutes of parking are 0 or Null, the value of the field should be set to -1.
Progress: At the current state I merged the two tables and could set the 0 value to -1. Because I am quite new to SQL, I couldn't find a way to keep the original values for minutes and also integrate the 'When Null Then -1' clause. Many solutions suggest an Update query, but the operation needs to be in a Select result. MYSQL 2017. This is my code so far:
Select c.ID, c.status, c.Date, Case When c.Minutes = 0 Then -1 End as Minutes
From Customer_1 as c
Union
Select c1.ID, c1.status, c1.Date, Case When c1.Minutes = 0 Then -1 End as Minutes
From Customer_2 as c1
Original Dataset: I Have two tables with the exact same column names, representing user IDs
Customer_1:
ID| Date| Minutes| Time| status
1 | 2019| 3 | 2019| A
2 | 2019| 0 | 2019| A
Customer_2:
ID| Date| Minutes| Time| status
3 | 2019| Null | 2019| A
4 | 2019| 0 | 2019| A
What the final query should look like:
ID| Date| Minutes| status
1 | 2019| 3 | A
2 | 2019| -1 | A
3 | 2019| -1 | A
4 | 2019| -1 | A
Any suggestion how build a working query that fulfills the criteria would be much appreciated!
You just need the Coalesce() function and an Else part in your Case .. When expression:
Select ID, status, Date, Case When Coalesce(Minutes,0) = 0 Then -1 Else Minutes End as Minutes
From Customer_1
Union
Select ID, status, Date, Case When Coalesce(Minutes,0) = 0 Then -1 Else Minutes End
From Customer_2
Using aliases for the tables is redundant in this case, since the queries are independent apart from the Union. The alias (Minutes) on the second query's column is also redundant.
Another alternative is the IF() function together with COALESCE():
Select ID, status, Date, IF(Coalesce(Minutes,0) , Minutes, -1) as Minutes
From Customer_1
Union
Select ID, status, Date, IF(Coalesce(Minutes,0) , Minutes, -1)
From Customer_2
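As a quick check of the CASE plus COALESCE version, here is a minimal sketch that runs it against the sample rows from the question. It assumes SQLite via Python's sqlite3 as a stand-in for MySQL (so only the CASE form is shown, since IF() is MySQL-specific), and it adds an ORDER BY so the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer_1 (ID INTEGER, Date INTEGER, Minutes INTEGER, Time INTEGER, status TEXT);
CREATE TABLE Customer_2 (ID INTEGER, Date INTEGER, Minutes INTEGER, Time INTEGER, status TEXT);
INSERT INTO Customer_1 VALUES (1, 2019, 3, 2019, 'A'), (2, 2019, 0, 2019, 'A');
INSERT INTO Customer_2 VALUES (3, 2019, NULL, 2019, 'A'), (4, 2019, 0, 2019, 'A');
""")

# COALESCE maps NULL to 0, so a single comparison covers both cases;
# the ELSE branch keeps every other value unchanged.
merged = conn.execute("""
SELECT ID, status, Date,
       CASE WHEN COALESCE(Minutes, 0) = 0 THEN -1 ELSE Minutes END AS Minutes
  FROM Customer_1
UNION
SELECT ID, status, Date,
       CASE WHEN COALESCE(Minutes, 0) = 0 THEN -1 ELSE Minutes END
  FROM Customer_2
 ORDER BY 1
""").fetchall()

for row in merged:
    print(row)
```

Without the ELSE branch, rows with a non-zero value (ID 1 here) would come back as NULL, which is what the original attempt ran into.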

How do I add a where clause to a sum aggregate?

I am trying to figure out best way to get the aggregate of a person's hours spent on a project name that follows a certain pattern
Current Tables
+--------------------+----------------+-----------------+
| Tbl_Employee | Tbl Projects | tbl_timesheet |
+--------------------+----------------+-----------------+
| employee_id | project_id | timesheet_id |
| employee_full_name | cws_project_id | employee_id |
| | | project_id |
| | | timesheet_hours |
+--------------------+----------------+-----------------+
Here is the query I have so far
select
te.employee_id,
te.employee_last_name,
te.employee_first_name,
te.employee_department,
te.employee_type_id,
te.timesheet_routing,
sum(tt.timesheet_hours) as total_hours,
month(tt.timesheet_date) as "month",
year(tt.timesheet_date) as "year"
from tbl_employee te
left join tbl_timesheet tt
on te.employee_id = tt.employee_id
join tbl_projects tp
on tp.project_id = tt.project_id
where te.employee_active = 1
and te.employee_id > 0
and employee_department IN ("Project Management","Engineering","Deployment Srvs.")
and year(tt.timesheet_date) = 2015
group by te.employee_last_name, year(tt.timesheet_date), month(tt.timesheet_date)
order by employee_last_name
What I need to add to my select statement is something to the effect of
sum(tt.timesheet_hours) as where cws_project_id like '%Training%' as training
In short I need to know the sum of hours an employee has contributed to a project where the cws_project_id contains the word Training. I know you can't add a where clause to a Sum but I can't seem to find another way to do it.
If this makes a difference I need to do this several times - ie where the project_name contains a different word.
Thank you so much for any help that can be provided. I hope that is not clear as mud.
Here is the general form of what you are looking for:
SELECT SUM(IF(x LIKE '%y%', z, 0)) AS ySum
even more general
SELECT SUM(IF([condition on row], [value or calculation from row], 0)) AS [partialSum]
Edit: For more RDBMS portability (earlier versions of MS SQL do not support this form of IF):
SELECT SUM(CASE WHEN [condition on row] THEN [value or calculation from row] ELSE 0 END) AS [partialSum]
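Applied to the timesheet tables, the portable CASE form might look like the sketch below. The rows are invented purely for illustration, the tables are cut down to the columns that matter, and SQLite (via Python's sqlite3) stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_projects  (project_id INTEGER, cws_project_id TEXT);
CREATE TABLE tbl_timesheet (employee_id INTEGER, project_id INTEGER, timesheet_hours INTEGER);
-- Hypothetical sample rows, not taken from the question.
INSERT INTO tbl_projects  VALUES (1, 'Safety Training 2015'), (2, 'Network Upgrade');
INSERT INTO tbl_timesheet VALUES (10, 1, 4), (10, 2, 6), (10, 1, 2), (11, 2, 8);
""")

# The condition moves inside the aggregate: rows that do not match
# contribute 0 to the partial sum but still count toward total_hours.
hours = conn.execute("""
SELECT tt.employee_id,
       SUM(tt.timesheet_hours) AS total_hours,
       SUM(CASE WHEN tp.cws_project_id LIKE '%Training%'
                THEN tt.timesheet_hours ELSE 0 END) AS training
  FROM tbl_timesheet tt
  JOIN tbl_projects tp ON tp.project_id = tt.project_id
 GROUP BY tt.employee_id
 ORDER BY tt.employee_id
""").fetchall()

for row in hours:
    print(row)
```

For several keywords, the same SUM(CASE ...) column can simply be repeated with a different LIKE pattern and alias.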

How can I get the difference between the individual maximum values of different days?

I am new to MySQL, and I am trying to find:
The difference between a given day's maximum value and the previous day's maximum value.
I was able to get the maximum values for dates via:
select max(`bundle_count`), `Production_date`
from `table`
group by `Production_date`
But I don't know how to use SQL to calculate the differences between maximums for two given dates.
am expecting output like this
Please help me.
Update 1: Here is a fiddle, http://sqlfiddle.com/#!2/818ad/2, that I used for testing.
Update 2: Here is a fiddle, http://sqlfiddle.com/#!2/3f78d/10 that I used for further refining/fixing, based on Sandy's comments.
Update 3: For some reason the case where there is no previous day was not being dealt with correctly. I thought it was. However, I've updated to make sure that works (a bit cumbersome, but it appears to be right). Last fiddle: http://sqlfiddle.com/#!2/3f78d/45
I think @Grijesh conceptually got you the main thing you needed via the self-join of the input data (so make sure you vote up his answer!). I've cleaned up his query a bit on syntax (building off of his query!):
SELECT
DATE(t1.`Production_date`) as theDate,
MAX( t1.`bundle_count` ) AS 'max(bundle_count)',
MAX( t1.`bundle_count` ) -
IF(
EXISTS
(
SELECT date(t2.production_date)
FROM input_example t2
WHERE t2.machine_no = 1 AND
date_sub(date(t1.production_date), interval 1 day) = date(t2.production_date)
),
(
SELECT MAX(t3.bundle_count)
FROM input_example t3
WHERE t3.machine_no = 1 AND
date_sub(date(t1.production_date), interval 1 day) = date(t3.production_date)
GROUP BY DATE(t3.production_date)
), 0
)
AS Total_Bundles_Used
FROM `input_example` t1
WHERE t1.machine_no = 1
GROUP BY DATE( t1.`production_date` )
Note 1: I think @Grijesh and I were cleaning up the query syntax issues at the same time. It's encouraging that we ended up with very similar versions after we were both doing cleanup. My version differs in using IFNULL() for when there is no preceding data. I also ended up with a DATE_SUB, and I made sure to reduce various dates to mere dates without time component, via DATE()
Note 2: I originally had not fully understood your source tables, so I thought I needed to implement a running count in the query. But upon better inspection, it's clear that your source data already has a running count, so I took that stuff back out.
I am not sure, but you need something like this. I hope it will be helpful to you to some extent:
Try this:
SELECT t1.`Production_date` ,
MAX(t1.`bundle_count`) - MAX(t2.`bundle_count`) ,
COUNT(t1.`bundle_count`)
FROM `table_name` AS t1
INNER JOIN `table_name` AS t2
ON ABS(DATEDIFF(t1.`Production_date` , t2.`Production_date`)) = 1
GROUP BY t1.`Production_date`
EDIT
I create a table name = 'table_name', as below,
mysql> SELECT * FROM `table_name`;
+---------------------+--------------+
| Production_date | bundle_count |
+---------------------+--------------+
| 2004-12-01 20:37:22 | 1 |
| 2004-12-01 20:37:22 | 2 |
| 2004-12-01 20:37:22 | 3 |
| 2004-12-02 20:37:22 | 2 |
| 2004-12-02 20:37:22 | 5 |
| 2004-12-02 20:37:22 | 7 |
| 2004-12-03 20:37:22 | 6 |
| 2004-12-03 20:37:22 | 7 |
| 2004-12-03 20:37:22 | 2 |
| 2004-12-04 20:37:22 | 1 |
| 2004-12-04 20:37:22 | 9 |
+---------------------+--------------+
11 rows in set (0.00 sec)
My query: to find difference in bundle_count between two consecutive dates:
SELECT t1.`Production_date` ,
MAX(t2.`bundle_count`) - MAX(t1.`bundle_count`) ,
COUNT(t1.`bundle_count`)
FROM `table_name` AS t1
INNER JOIN `table_name` AS t2
ON ABS(DATEDIFF(t1.`Production_date` , t2.`Production_date`)) = 1
GROUP BY t1.Production_date;
its output:
+---------------------+-------------------------------------------------+--------------------------+
| Production_date | MAX(t2.`bundle_count`) - MAX(t1.`bundle_count`) | COUNT(t1.`bundle_count`) |
+---------------------+-------------------------------------------------+--------------------------+
| 2004-12-01 20:37:22 | 4 | 9 |
| 2004-12-02 20:37:22 | 0 | 18 |
| 2004-12-03 20:37:22 | 2 | 15 |
| 2004-12-04 20:37:22 | -2 | 6 |
+---------------------+-------------------------------------------------+--------------------------+
4 rows in set (0.00 sec)
This is PostgreSQL syntax (sorry; it's what I'm familiar with) but should fundamentally work in either database. Note this doesn't exactly run in PostgreSQL either because group is not a valid table name (it's a reserved keyword). The approach is a self-join as others have mentioned but I've used a view to handle the max-by-day and the difference as separate steps.
create view max_by_day as
select
date_trunc('day', production_date) as production_date,
max(bundle_count) as bundle_count
from
group
group by
date_trunc('day', production_date);
select
today.production_date as production_date,
today.bundle_count,
today.bundle_count - coalesce(yesterday.bundle_count, 0)
from
max_by_day as today
left join max_by_day yesterday on (yesterday.production_date = today.production_date - '1 day'::interval)
order by
production_date;
PostgreSQL also has a construct called window functions which is useful for this and a bit easier to understand. Just had to stick in a bit of advocacy for a superior database. :-P
select
date_trunc('day', production_date),
max(bundle_count),
max(bundle_count) - lag(max(bundle_count), 1, 0)
over
(order by date_trunc('day', production_date))
from
group
group by
date_trunc('day', production_date);
These two approaches differ in how they handle missing days in the data - the first will treat it as a 0, the second will use the previous day which is present. There wasn't a case like this in your sample so I don't know if this is something you care about.
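The view-based self-join can be tried on the table_name sample rows shown earlier. This is a rough sketch, assuming SQLite via Python's sqlite3 in place of MySQL or PostgreSQL, so date(x) and date(x, '-1 day') play the roles of date_trunc and the interval arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_name (Production_date TEXT, bundle_count INTEGER);
INSERT INTO table_name VALUES
  ('2004-12-01 20:37:22', 1), ('2004-12-01 20:37:22', 2), ('2004-12-01 20:37:22', 3),
  ('2004-12-02 20:37:22', 2), ('2004-12-02 20:37:22', 5), ('2004-12-02 20:37:22', 7),
  ('2004-12-03 20:37:22', 6), ('2004-12-03 20:37:22', 7), ('2004-12-03 20:37:22', 2),
  ('2004-12-04 20:37:22', 1), ('2004-12-04 20:37:22', 9);

-- Step 1: one row per calendar day with that day's maximum.
CREATE VIEW max_by_day AS
SELECT date(Production_date) AS day, MAX(bundle_count) AS max_count
  FROM table_name
 GROUP BY date(Production_date);
""")

# Step 2: join each day to the day before it; a missing previous day counts as 0.
diffs = conn.execute("""
SELECT today.day,
       today.max_count,
       today.max_count - COALESCE(yesterday.max_count, 0) AS diff
  FROM max_by_day AS today
  LEFT JOIN max_by_day AS yesterday
    ON yesterday.day = date(today.day, '-1 day')
 ORDER BY today.day
""").fetchall()

for row in diffs:
    print(row)
```

Splitting the work into "maximum per day" and "difference to the previous day" keeps each step small, and the LEFT JOIN keeps the first day in the output even though it has no predecessor.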

MySQL: Return 0 if row doesn't exist

I've been bashing my head on this for a while, so now I'm here :) I'm a SQL beginner, so maybe this will be easy for you guys...
I have this query:
SELECT COUNT(*) AS counter, recur,subscribe_date
FROM paypal_subscriptions
WHERE recur='monthly' and subscribe_date > "2010-07-16" and subscribe_date < "2010-07-23"
GROUP BY subscribe_date
ORDER BY subscribe_date
Now the dates I've shown above are hard coded, my application will supply a variable date range.
Right now I'm getting a result table where there is a value for that date.
counter |recur | subscribe_date
2 | Monthly | 2010-07-18
3 | Monthly | 2010-07-19
4 | Monthly | 2010-07-20
6 | Monthly | 2010-07-22
I'd like to return 0 in the counter column if the date doesn't exist.
counter |recur | subscribe_date
0 | Monthly | 2010-07-16
0 | Monthly | 2010-07-17
2 | Monthly | 2010-07-18
3 | Monthly | 2010-07-19
4 | Monthly | 2010-07-20
0 | Monthly | 2010-07-21
6 | Monthly | 2010-07-22
0 | Monthly | 2010-07-23
Is this possible?
You will need a table of dates (new table added), and then you will have to do an outer join between that table and your query.
This question is also similar to another question. Answers can be quite similar.
Insert Dates in the return from a query where there is none
You will need a table of dates to group against. This is quite easy in MSSQL using CTE's like this - I'm not sure if MySQL has something similar?
Otherwise you will need to create a hard table as a one off exercise
EDIT : Give this a try:
SELECT COUNT(pp.subscribe_date) AS counter, dl.date, MIN(pp.recur)
FROM date_lookup dl
LEFT OUTER JOIN paypal_subscriptions pp
on (pp.subscribe_date = dl.date AND pp.recur ='monthly')
WHERE dl.date >= '2010-07-16' and dl.date <= '2010-07-23'
GROUP BY dl.date
ORDER BY dl.date
The subject of the query needs to be changed to the date_lookup table
(the order of the Left Outer Join becomes important)
Count(*) isn't going to work since the 'date' record always exists - need to count something in the PayPal table
pp.recur ='monthly' is now a join condition, not a filter because of the LOJ
Finally, showing pp.recur in the select list isn't going to work.
I've used an aggregate, but MIN(pp.recur) will return null if there are no PayPal records
What you could do when you parameterize your query is to just repeat the Recur Type Filter?
Again, please excuse the MSSQL syntax
SELECT COUNT(pp.subscribe_date) AS counter, dl.date, @ppRecur
FROM date_lookup dl
LEFT OUTER JOIN paypal_subscriptions pp
on (pp.subscribe_date = dl.date AND pp.recur = @ppRecur)
WHERE dl.date >= @DateFrom and dl.date <= @DateTo
GROUP BY dl.date
ORDER BY dl.date
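The date-lookup approach can be sketched end to end as follows. Assumptions: SQLite via Python's sqlite3 stands in for MySQL, date_lookup is filled from a small Python loop, and the subscription rows are invented so that the per-day counts match the example in the question (plus one 'yearly' row to show the filter at work).

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE date_lookup (date TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE paypal_subscriptions (recur TEXT, subscribe_date TEXT)")

# One lookup row per calendar day in the requested range.
start, end = date(2010, 7, 16), date(2010, 7, 23)
days = [(start + timedelta(days=i)).isoformat() for i in range((end - start).days + 1)]
conn.executemany("INSERT INTO date_lookup VALUES (?)", [(d,) for d in days])

# Hypothetical subscriptions reproducing the counts 2, 3, 4 and 6.
sample = {"2010-07-18": 2, "2010-07-19": 3, "2010-07-20": 4, "2010-07-22": 6}
for day, n in sample.items():
    conn.executemany("INSERT INTO paypal_subscriptions VALUES ('monthly', ?)", [(day,)] * n)
conn.execute("INSERT INTO paypal_subscriptions VALUES ('yearly', '2010-07-17')")

# COUNT(pp.subscribe_date) ignores the NULLs produced by unmatched days,
# so those days report 0 instead of vanishing from the result.
counts = conn.execute("""
SELECT COUNT(pp.subscribe_date) AS counter, dl.date
  FROM date_lookup dl
  LEFT JOIN paypal_subscriptions pp
    ON pp.subscribe_date = dl.date AND pp.recur = ?
 WHERE dl.date BETWEEN ? AND ?
 GROUP BY dl.date
 ORDER BY dl.date
""", ("monthly", days[0], days[-1])).fetchall()

for row in counts:
    print(row)
```

Because the recur filter sits in the join condition, the 'yearly' row on 2010-07-17 is ignored while that date still shows up with a count of 0.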
Since there was no easy way to do this, I had to have the application fill in the blanks for me rather than have the database return the data I wanted. I do get a performance hit for this, but it was necessary for the completion of the report.
I will definitely look into making this return what I want from the DB in the near future. I'll give nonnb's solution a try.
thanks everyone!