Postgres LEFT JOIN with WHERE condition - mysql

I need to left join two tables with a where condition:
Table time_table
id rid start_date end_date
1 2 2017-07-01 00:00:00 2018-11-01 00:00:00
2 5 2017-01-01 00:00:00 2017-06-01 00:00:00
3 2 2018-07-01 00:00:00 2020-11-01 00:00:00
Table record_table
id name date
1 record1 2017-10-01 00:00:00
2 record2 2017-02-01 00:00:00
3 record3 2017-10-01 00:00:00
I need to get all the records that fall within a given date range. In the above example, I need the records that fall within the ranges for rid = 2 only. Hence the expected output is:
1 record1 2017-10-01 00:00:00
3 record3 2017-10-01 00:00:00

left join two tables with a where condition
It's typically wrong to use a LEFT [OUTER] JOIN and then filter with a WHERE condition, thereby voiding the special feature of a LEFT JOIN to include all rows from the left table unconditionally. Detailed explanation:
Explain JOIN vs. LEFT JOIN and WHERE condition performance suggestion in more detail
Put conditions supposed to filter all rows into the WHERE clause (rid = 2), but move conditions on record_table to the join clause:
SELECT t.start_date, t.end_date -- adding those
, r.id, r.name, r.date
FROM time_table t
LEFT JOIN record_table r ON r.date >= t.start_date
AND r.date < t.end_date
WHERE t.rid = 2;
As commented, it makes sense to include columns from time_table in the result, but that's my optional addition.
You also need to be clear about lower and upper bounds. The general convention is to include the lower and exclude the upper bound in time (timestamp) ranges. Hence my use of >= and < above.
Related:
SQL query on a time series to calculate the average
Selecting an average of records grouped by 5 minute periods
Performance should be no problem at all with the right indexes.
You need an index (or PK) on time_table(rid) and another on record_table(date).
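To see the difference between filtering in ON and filtering in WHERE, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database. SQLite standing in for Postgres is an assumption made only so the example runs anywhere, and the index names are hypothetical.

```python
import sqlite3

# SQLite stands in for Postgres here (an assumption, so the sketch is runnable).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_table (id INTEGER PRIMARY KEY, rid INTEGER, start_date TEXT, end_date TEXT);
CREATE TABLE record_table (id INTEGER PRIMARY KEY, name TEXT, date TEXT);
INSERT INTO time_table VALUES
  (1, 2, '2017-07-01 00:00:00', '2018-11-01 00:00:00'),
  (2, 5, '2017-01-01 00:00:00', '2017-06-01 00:00:00'),
  (3, 2, '2018-07-01 00:00:00', '2020-11-01 00:00:00');
INSERT INTO record_table VALUES
  (1, 'record1', '2017-10-01 00:00:00'),
  (2, 'record2', '2017-02-01 00:00:00'),
  (3, 'record3', '2017-10-01 00:00:00');
-- Hypothetical index names on the two columns the query filters and joins on.
CREATE INDEX time_table_rid_idx ON time_table (rid);
CREATE INDEX record_table_date_idx ON record_table (date);
""")

# Range conditions in ON: every rid = 2 row of time_table survives.
on_rows = conn.execute("""
    SELECT t.id, r.id, r.name
    FROM time_table t
    LEFT JOIN record_table r ON r.date >= t.start_date
                            AND r.date <  t.end_date
    WHERE t.rid = 2
    ORDER BY t.id, r.id
""").fetchall()

# Upper bound moved to WHERE: the NULL-extended row is filtered out again.
where_rows = conn.execute("""
    SELECT t.id, r.id, r.name
    FROM time_table t
    LEFT JOIN record_table r ON r.date >= t.start_date
    WHERE t.rid = 2
      AND r.date < t.end_date
    ORDER BY t.id, r.id
""").fetchall()

print(on_rows)     # keeps a row for time_table id 3, with NULLs for the record columns
print(where_rows)  # only the matched rows remain
```

With the range test in WHERE, the query behaves like a plain inner join, which is the effect the answer warns about.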

I'm not exactly sure if this is what you want, but if you are saying you want the dates where the record_table date is between the dates in the time_table, then this would do the job:
select
rt.id, rt.name, rt.date
from
time_table tt
join record_table rt on
rt.date between tt.start_date and tt.end_date
where
tt.rid = 2
That said, this will be horribly inefficient for large datasets. If your data is relatively small (< 10k records in each table, post-filters), then it probably won't matter much, but if you need to scale this concept, it would help to know more about your data. For example, do the dates always round to the first of each month?
Again, from your example, I wasn't sure if this is what you meant by "get all those records which are present under given date range."
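One thing worth checking with BETWEEN is that it includes both bounds. A small sketch, again using in-memory SQLite as a stand-in and adding one hypothetical record that sits exactly on end_date, shows how that differs from a half-open range (>= and <):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_table (id INTEGER PRIMARY KEY, rid INTEGER, start_date TEXT, end_date TEXT);
CREATE TABLE record_table (id INTEGER PRIMARY KEY, name TEXT, date TEXT);
INSERT INTO time_table VALUES (1, 2, '2017-07-01 00:00:00', '2018-11-01 00:00:00');
INSERT INTO record_table VALUES
  (1, 'record1', '2017-10-01 00:00:00'),
  (4, 'on_boundary', '2018-11-01 00:00:00');  -- hypothetical row exactly at end_date
""")

# BETWEEN is inclusive on both ends, so the boundary row matches.
inclusive = [name for (name,) in conn.execute("""
    SELECT rt.name
    FROM time_table tt
    JOIN record_table rt ON rt.date BETWEEN tt.start_date AND tt.end_date
    WHERE tt.rid = 2
    ORDER BY rt.id
""")]

# Half-open range: lower bound included, upper bound excluded.
half_open = [name for (name,) in conn.execute("""
    SELECT rt.name
    FROM time_table tt
    JOIN record_table rt ON rt.date >= tt.start_date
                        AND rt.date <  tt.end_date
    WHERE tt.rid = 2
    ORDER BY rt.id
""")]

print(inclusive)
print(half_open)
```

If consecutive ranges share a boundary timestamp, the inclusive form can assign the same record to two ranges, so it is worth deciding which convention the data follows.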

SELECT record_tbl.name, record_tbl.date
FROM time_table AS time_tbl
INNER JOIN record_table AS record_tbl
ON time_tbl.id = record_tbl.id
WHERE time_tbl.rid = 2

Related

SQL right join between tables

I have three tables where I want to perform a join between them.
1st table looks like (named users)
id column1
1 1
2 2
3 3
2nd table looks like (named transactions) here we have some transactions of users
user_id transaction_date transaction_expire
1 2017-03-31 2017-05-16
1 2017-02-28 2017-04-16
3rd table looks like (named user_logs) we have logs of the users based on days
user_id date some_log_data
1 2017-03-07 1505
1 2017-03-03 1201
1 2017-03-22 942
1 2017-03-31 1490
1 2017-04-05 1490
I want to know the sum of every user based on transactions something like:
user_id transaction_date transaction_expire log
1 2017-03-31 2017-05-16 2980
1 2017-02-28 2017-04-16 6628
So this is the result I want to achieve: for every user, get the SUM of their logs within each transaction.
With this query, which matches logs between transaction_date and transaction_expire, I get some results, but when I try to do the summation I get a single sum over all of them:
SELECT t.transaction_date, t.transaction_expire, ul.log
FROM user_logs as ul
RIGHT JOIN transactions as t ON ul.user_id= t.user_id
WHERE ul.date BETWEEN t.transaction_date AND t.transaction_expire
This query gives me 7 rows which is correct but now I want to find only the sum of the logs in these two different transactions.
Your query is basically correct, but you need aggregation:
SELECT t.transaction_date, t.transaction_expire, SUM(ul.log)
FROM transactions t LEFT JOIN
user_logs ul
ON ul.user_id = t.user_id AND
ul.date BETWEEN t.transaction_date AND t.transaction_expire
GROUP BY t.transaction_date, t.transaction_expire;
Also note that the condition in the WHERE clause is moved to the ON clause. I switched the JOIN to a LEFT JOIN. I find LEFT JOIN much more intuitive than RIGHT JOIN, because it keeps all rows in the first table.
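As a quick check, here is a sketch that runs the aggregated query against the sample rows from the question in an in-memory SQLite database. SQLite is an assumption made for runnability, and the column name log follows the queries above rather than the some_log_data header shown in the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (user_id INTEGER, transaction_date TEXT, transaction_expire TEXT);
CREATE TABLE user_logs (user_id INTEGER, date TEXT, log INTEGER);
INSERT INTO transactions VALUES
  (1, '2017-03-31', '2017-05-16'),
  (1, '2017-02-28', '2017-04-16');
INSERT INTO user_logs VALUES
  (1, '2017-03-07', 1505),
  (1, '2017-03-03', 1201),
  (1, '2017-03-22', 942),
  (1, '2017-03-31', 1490),
  (1, '2017-04-05', 1490);
""")

# One output row per transaction; logs are matched inside the ON clause.
sums = conn.execute("""
    SELECT t.transaction_date, t.transaction_expire, SUM(ul.log)
    FROM transactions t
    LEFT JOIN user_logs ul
           ON ul.user_id = t.user_id
          AND ul.date BETWEEN t.transaction_date AND t.transaction_expire
    GROUP BY t.transaction_date, t.transaction_expire
    ORDER BY t.transaction_date
""").fetchall()

for row in sums:
    print(row)
```

The two sums match the expected output in the question (6628 and 2980).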
How about this
SELECT t.user_id, t.transaction_date, t.transaction_expire, SUM(IFNULL(ul.log, 0))
FROM transactions AS t
LEFT JOIN user_logs AS ul ON (ul.user_id = t.user_id AND
ul.date BETWEEN t.transaction_date AND t.transaction_expire)
GROUP BY t.user_id, t.transaction_date, t.transaction_expire;
Your expected result contains user_id, so I included it. With transactions on the left side of the join, a transaction that has no logs in its date range still appears, and IFNULL turns its sum into 0. I assume this is what you wanted.
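Which table sits on the left of the LEFT JOIN decides which rows survive without a match. The sketch below contrasts both orders on in-memory SQLite (an assumption for runnability). The transaction for user 2 is a hypothetical row added only to have a transaction without any logs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (user_id INTEGER, transaction_date TEXT, transaction_expire TEXT);
CREATE TABLE user_logs (user_id INTEGER, date TEXT, log INTEGER);
INSERT INTO transactions VALUES
  (1, '2017-03-31', '2017-05-16'),
  (1, '2017-02-28', '2017-04-16'),
  (2, '2017-06-01', '2017-07-01');  -- hypothetical: a transaction with no logs
INSERT INTO user_logs VALUES
  (1, '2017-03-07', 1505), (1, '2017-03-03', 1201), (1, '2017-03-22', 942),
  (1, '2017-03-31', 1490), (1, '2017-04-05', 1490);
""")

# transactions on the left: the log-less transaction is kept and sums to 0.
keep_transactions = conn.execute("""
    SELECT t.user_id, t.transaction_date, t.transaction_expire, SUM(IFNULL(ul.log, 0))
    FROM transactions t
    LEFT JOIN user_logs ul
           ON ul.user_id = t.user_id
          AND ul.date BETWEEN t.transaction_date AND t.transaction_expire
    GROUP BY t.user_id, t.transaction_date, t.transaction_expire
    ORDER BY t.user_id, t.transaction_date
""").fetchall()

# user_logs on the left: only transactions that matched some log show up.
keep_logs = conn.execute("""
    SELECT t.user_id, t.transaction_date, t.transaction_expire, SUM(IFNULL(ul.log, 0))
    FROM user_logs ul
    LEFT JOIN transactions t
           ON ul.user_id = t.user_id
          AND ul.date BETWEEN t.transaction_date AND t.transaction_expire
    GROUP BY t.user_id, t.transaction_date, t.transaction_expire
    ORDER BY t.user_id, t.transaction_date
""").fetchall()

print(keep_transactions)
print(keep_logs)
```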

merging two queries

I have a table of recipes, and I want to show a weekly value for each of them. The values are votes cast for them. My problem is that I want to make an Excel-like table with all available Fridays in my db, add a column for each recipe, and put its value for that Friday in that column, if any value exists.
Now apparently the easiest join doesn't work, so I wrote two queries: one to get all ids for my recipes and one for the values to show. The first (MySQL) query is just a select id from recipes; the second is like this:
select d.date,perc from
(SELECT date FROM weekly where YEAR(date)=2014 group by date) as d
left join weekly on d.date = weekly.date and weekly.id_rec= :idrec
Any idea how to merge those two queries? Running two queries makes everything slow down, but when I tried to merge them I didn't get the correct results.
Data:
sql fiddle
The result should be something like:
Dates | Recipe A | Recipe B | ...
Date 1 | 0.005 | 0.11 |
Date 2 | 0 | 0 |
Date 3 | 0 | 0.1 |
Note that Date 2 doesn't exist for Recipe A and B, but it does for some other recipes.
You should be able to merge the two queries like this:
SELECT recipes.id, votes.date, votes.perc FROM recipes
RIGHT JOIN
(select weekly.id_rec, d.date, perc from
(SELECT weekly.id_rec, date FROM weekly where YEAR(date) = 2014 group by date) as d left join weekly on d.date = weekly.date) as votes
ON votes.id_rec = recipes.id
SQL Fiddle
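A different way to get one value per date and recipe in a single query, sketched here as an alternative and not as the query above, is to cross join the distinct dates with the recipes and then left join the votes. The sample rows and recipe names are hypothetical, SQLite stands in for MySQL, and strftime('%Y', date) replaces MySQL's YEAR(date) for that reason. The final pivot into columns is done in Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE recipes (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE weekly (id_rec INTEGER, date TEXT, perc REAL);
-- Hypothetical sample data shaped like the desired result in the question.
INSERT INTO recipes VALUES (1, 'Recipe A'), (2, 'Recipe B'), (3, 'Recipe C');
INSERT INTO weekly VALUES
  (1, '2014-01-03', 0.005),
  (2, '2014-01-03', 0.11),
  (3, '2014-01-10', 0.2),
  (2, '2014-01-17', 0.1);
""")

# Every (date, recipe) pair appears once; missing votes become 0.
rows = conn.execute("""
    SELECT d.date, r.name, COALESCE(w.perc, 0) AS perc
    FROM (SELECT DISTINCT date FROM weekly WHERE strftime('%Y', date) = '2014') AS d
    CROSS JOIN recipes r
    LEFT JOIN weekly w ON w.date = d.date AND w.id_rec = r.id
    ORDER BY d.date, r.id
""").fetchall()

# Pivot the long result into one dict per date, keyed by recipe name.
grid = {}
for date, name, perc in rows:
    grid.setdefault(date, {})[name] = perc

for date, values in grid.items():
    print(date, values)
```

The cross join grows as dates times recipes, so on a large table it may be worth restricting the date list first.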

MYSQL fill group by "gaps"

I'm trying to fill the gaps left after a GROUP BY by using an aux table. Can you help?
aux table to deal with days with no orders
date quantity
2014-01-01 0
2014-01-02 0
2014-01-03 0
2014-01-04 0
2014-01-05 0
2014-01-06 0
2014-01-07 0
group by result from "orders" table
date quantity
2014-01-01 7
2014-01-02 1
2014-01-04 2
2014-01-05 3
desired result joining "orders" table with "aux table"
date quantity
2014-01-01 7
2014-01-02 1
2014-01-03 0
2014-01-04 2
2014-01-05 3
2014-01-06 0
2014-01-07 0
Without knowing how you create your group by result table, what you're looking for is an outer join, perhaps with coalesce. Something like this:
select distinct a.date, coalesce(b.quantity,0) quantity
from aux a
left join yourgroupbyresults b on a.date = b.date
Please note, you may or may not need distinct -- depends on your data.
Edit, given your comments, this should work:
select a.date, count(b.date_sent)
from aux a
left join orders b on a.date = date_format(b.date_sent, '%Y-%m-%d')
group by a.date
SQL Fiddle Demo
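Here is a minimal sketch of that last query on in-memory SQLite, with orders rows generated to reproduce the counts in the question. SQLite is an assumption for runnability, so date(b.date_sent) takes the place of MySQL's date_format(b.date_sent, '%Y-%m-%d'), and the order timestamps are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE aux (date TEXT, quantity INTEGER);
CREATE TABLE orders (date_sent TEXT);
""")

# aux holds one row per calendar day, including days without orders.
conn.executemany(
    "INSERT INTO aux VALUES (?, 0)",
    [(f"2014-01-{day:02d}",) for day in range(1, 8)],
)

# Hypothetical order timestamps, chosen to give the per-day counts from the question.
per_day = {"2014-01-01": 7, "2014-01-02": 1, "2014-01-04": 2, "2014-01-05": 3}
for day, n in per_day.items():
    conn.executemany(
        "INSERT INTO orders VALUES (?)",
        [(f"{day} {9 + i:02d}:00:00",) for i in range(n)],
    )

# COUNT(b.date_sent) ignores the NULLs produced for days with no orders.
filled = conn.execute("""
    SELECT a.date, COUNT(b.date_sent) AS quantity
    FROM aux a
    LEFT JOIN orders b ON a.date = date(b.date_sent)
    GROUP BY a.date
    ORDER BY a.date
""").fetchall()

for row in filled:
    print(row)
```

Counting b.date_sent instead of COUNT(*) matters here: COUNT(*) would report 1 for the empty days, because the left join still produces one row for each of them.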
Using your results it would be something like:
SELECT a.date
,COALESCE(b.quantity,0) as quantity
FROM auxtable a
LEFT JOIN groupbyresult b
ON a.date = b.date
You can also do your grouping in the same query as the left join:
SELECT a.date
,COALESCE(COUNT(b.somefield),0) as quantity
FROM auxtable a
LEFT JOIN table1 b
ON a.date = b.date
GROUP BY a.date
One familiar approach to solving a problem like this is to use a row source that has the distinct list of dates you want to return, and then do an outer join to the table that has gaps. That way, you get all the dates back, and you can substitute a zero for the "missing" quantity values.
For example:
SELECT d.date
, IFNULL(SUM(s.quantity),0) AS quantity
FROM distinct_list_of_dates d
LEFT
JOIN information_source s
ON s.date = d.date
GROUP BY d.date
It's not clear why a GROUP BY would be eliminating some date values. We might conjecture that you are using a MySQL extension to ANSI-standard GROUP BY semantics, and that is eliminating rows. Or, you may have a WHERE clause that is excluding rows. But we're just guessing.
FOLLOW UP based on further information revealed by OP in comments...
In the query above, replace distinct_list_of_dates with aux, replace information_source with orders, and adjust the join predicate to account for comparing a datetime to a date:
SELECT d.date
, IFNULL(SUM(s.quantity),0) AS quantity
FROM aux d
LEFT
JOIN orders s
ON s.date >= d.date
AND s.date < d.date + INTERVAL 1 DAY
GROUP BY d.date
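The range predicate can be tried out with a short sketch. It assumes in-memory SQLite in place of MySQL, so date(d.date, '+1 day') stands in for d.date + INTERVAL 1 DAY, and the orders rows, including the quantity column, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE aux (date TEXT);
CREATE TABLE orders (date TEXT, quantity INTEGER);
INSERT INTO aux VALUES ('2014-01-01'), ('2014-01-02'), ('2014-01-03'), ('2014-01-04');
-- Hypothetical orders, including rows at midnight and just before midnight.
INSERT INTO orders VALUES
  ('2014-01-01 08:15:00', 4),
  ('2014-01-01 17:40:00', 3),
  ('2014-01-02 00:00:00', 1),
  ('2014-01-04 23:59:59', 2);
""")

# Each order falls into exactly one day: [d.date, d.date + 1 day).
daily = conn.execute("""
    SELECT d.date, IFNULL(SUM(s.quantity), 0) AS quantity
    FROM aux d
    LEFT JOIN orders s
           ON s.date >= d.date
          AND s.date <  date(d.date, '+1 day')
    GROUP BY d.date
    ORDER BY d.date
""").fetchall()

for row in daily:
    print(row)
```

Comparing the raw column against a range, instead of wrapping it in a formatting function, leaves the database free to use an index on orders.date.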

MySQL - Count Yearly Totals when some Years have nulls

I have 1 table with similar data:
CustomerID | ProjectID | DateListed | DateCompleted
123456 | 045 | 07-29-2010 | 04-03-2011
123456 | 123 | 10-12-2011 | 11-30-2011
123456 | 157 | 12-12-2011 | 02-10-2012
123456 | 258 | 06-07-2011 | NULL
Basically, a customer contacts us, we get a project on our list, and we mark it completed when we're done with it.
What I'm after is a simple (you'd think, at least) count of all projects, with expected output like below:
YEAR | TotalListed | TotalCompleted
2010 | 1 | 0
2011 | 3 | 2
2012 | 0 | 1
However, my query below - because of the join - isn't showing 2012's count, because there's been no listed project for 2012. However, I can't really reverse the query, as then 2010's count wouldn't show up (since nothing was completed in 2010).
I'm open to any suggestions, or tips like how to do this. I've pondered a temp table, is that the best way to go? I'm open to anything that gets me what I need!
(If the code looks familiar, y'all helped me get the subquery made! MySQL Subquery with main query data variable)
SELECT YEAR(p1.DateListed) AS YearListed, COUNT(p1.ProjectID) As Listed, PreQuery.Completed
FROM(
SELECT YEAR(DateCompleted) AS YearCompleted, COUNT(ProjectID) AS Completed
FROM projects
WHERE CustomerID = 123456 AND DateListed >= DATE_SUB(Now(), INTERVAL 5 YEAR)
GROUP BY YEAR(DateCompleted)
) PreQuery
RIGHT OUTER JOIN projects p1 ON PreQuery.YearCompleted = YEAR(p1.DateListed)
WHERE CustomerID = 123456 AND DateListed >= DATE_SUB(Now(), INTERVAL 5 YEAR)
GROUP BY YearListed
ORDER BY p1.DateListed
After reviewing your table, query, and expected results, I believe I have found a revised query that suits your needs. It is a fairly complete rewrite of your existing query, but I've tested it with your given data and got the results you expect:
SELECT
years.`year`,
SUM(IF(YEAR(DateListed) = years.`year`, 1, 0)) AS TotalListed,
SUM(IF(YEAR(DateCompleted) = years.`year`, 1, 0)) AS TotalCompleted
FROM
projects
LEFT JOIN (
SELECT DISTINCT `year` FROM (
SELECT YEAR(DateListed) AS `year` FROM projects
UNION SELECT YEAR(DateCompleted) AS `year` FROM projects WHERE DateCompleted IS NOT NULL
) as year_inner
) AS years
ON YEAR(DateListed) = `year`
OR YEAR(DateCompleted) = `year`
WHERE
CustomerID = 123456 AND DateListed >= DATE_SUB(Now(), INTERVAL 5 YEAR)
GROUP BY
years.`year`
ORDER BY
years.`year`
To explain, we should start with the inner query (aliased as year_inner). It selects a full list of years in the DateListed and DateCompleted columns and then selects a DISTINCT list of those to create the years alias sub-query. This sub-query is used to get a full list of "years" that we want data for. Doing it this way, as opposed to a sub-query with counts and groupings, means you only have to define the WHERE clause on the outermost query (though, if efficiency becomes an issue with thousands and thousands of records, you could always add a WHERE clause to the inner query too, or an index on the date columns).
After we've built our inner queries, we join the projects table on the results with a LEFT JOIN for the DateListed or DateCompleted's YEAR() value - which will allow us to bring back null columns too!
For the field selections, we use the year column from our inner query to assure that we get a full list of years to display. Then, we compare the current row's DateListed & DateCompleted YEAR() value to the current year; if they're equal, add 1 - else add 0. When we GROUP BY year, our SUM() will count all of the 1's for that year for each column and give you the output you want (hopefully, of course =P).
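The same idea can be sketched on in-memory SQLite with the four sample projects. This is a variation under stated assumptions and not the exact query above: the list of years sits on the left of the join, strftime('%Y', ...) and CASE replace MySQL's YEAR() and IF(), the dates are stored in ISO format, and the five-year DateListed filter is left out because the sample dates are older than that.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (CustomerID INTEGER, ProjectID TEXT, DateListed TEXT, DateCompleted TEXT);
INSERT INTO projects VALUES
  (123456, '045', '2010-07-29', '2011-04-03'),
  (123456, '123', '2011-10-12', '2011-11-30'),
  (123456, '157', '2011-12-12', '2012-02-10'),
  (123456, '258', '2011-06-07', NULL);
""")

yearly = conn.execute("""
    WITH p AS (SELECT * FROM projects WHERE CustomerID = 123456),
    years AS (
        -- Every year that appears in either date column.
        SELECT strftime('%Y', DateListed) AS year FROM p
        UNION
        SELECT strftime('%Y', DateCompleted) FROM p WHERE DateCompleted IS NOT NULL
    )
    SELECT years.year,
           SUM(CASE WHEN strftime('%Y', p.DateListed)    = years.year THEN 1 ELSE 0 END) AS TotalListed,
           SUM(CASE WHEN strftime('%Y', p.DateCompleted) = years.year THEN 1 ELSE 0 END) AS TotalCompleted
    FROM years
    LEFT JOIN p ON strftime('%Y', p.DateListed)    = years.year
                OR strftime('%Y', p.DateCompleted) = years.year
    GROUP BY years.year
    ORDER BY years.year
""").fetchall()

for row in yearly:
    print(row)
```

A project whose DateCompleted is NULL is matched only through its DateListed year, and the CASE on DateCompleted falls through to 0 for it.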

Retrieving data from joined MySQL tables using an AVG in the WHERE clause?

I am trying to select data from multiple tables which uses an AVG in the WHERE clause.
SELECT company_metrics.*, companies.company_name, companies.permalink
FROM company_metrics LEFT JOIN companies
ON companies.company_id = company_metrics.company_id
WHERE MONTH(date) = '04' AND YEAR(date) = '2011'
HAVING (SELECT avg(company_unique_visitors)
FROM (SELECT company_metrics.company_unique_visitors
FROM company_metrics
ORDER BY company_metrics.date DESC LIMIT 3)
average ) >'2000'
ORDER BY date DESC
Example Data:
Company Metrics table
company_id company_unique_visitors date
----------- ----------------------- ----
604 2054 2011-04-01
604 3444 2011-03-01
604 2122 2011-02-01
604 2144 2011-01-01
604 2001 2010-12-01
602 2011 2011-04-01
602 11 2011-03-01
602 411 2011-02-01
602 611 2011-01-01
602 111 2010-12-01
EDIT
I would like only the 3 latest numbers from company_unique_visitors AVG'ed
/EDIT
So the query would select company_id 604 but it wouldn't select company_id 602 because 602 doesn't have an AVG greater than 2000.
I need help writing the correct query to do as I have described. I can clarify if needed.
Thanks for your help!
There are several problems with your query as written. I'm not completely clear as to the structure of all the tables, but I believe I understand the gist based on the query you posted. Your first problem with the posted query is that you're not grouping by or using any aggregates in the query where you're using the HAVING clause. You use aggregates in one of the subqueries, but the HAVING where it is right now doesn't make much sense.
I believe you wanted to group by the company_id before you did an aggregate of the averages, so I made that the primary group by on the outer query. You were also using too many nested queries to accomplish what was a seemingly simple task of only selecting the 3 most recent measurements. I moved that subquery into the primary join so that the data was only selected once and in a logical way.
And, without further ceremony, here's the fixed query:
SELECT limited_metrics.*, companies.company_name, companies.permalink,
avg(limited_metrics.company_unique_visitors) AS avg_visitors
FROM
(SELECT *
FROM company_metrics
ORDER BY company_metrics.date DESC LIMIT 3) AS limited_metrics
LEFT JOIN companies
ON companies.company_id = limited_metrics.company_id
WHERE MONTH(limited_metrics.date) = '04' AND YEAR(limited_metrics.date) = '2011'
GROUP BY companies.company_id
HAVING avg_visitors > 2000
Ok based off of Jared Harding's answer and this post: Moving average - MySQL
I was able to figure out the query.
SELECT metrics.*,companies.company_name,companies.permalink
FROM (SELECT company_id,AVG(company_unique_visitors) AS met_avg
FROM company_metrics
WHERE `date` BETWEEN DATE_SUB(NOW(), INTERVAL 4 MONTH) AND NOW()
GROUP BY company_id HAVING met_avg>2000) AS metrics
LEFT JOIN companies ON companies.company_id=metrics.company_id
Thanks Jared for all your help!
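For the "3 latest numbers per company" reading of the question, one possible sketch uses a window function to rank each company's rows by date. This is a different technique from the queries above; it assumes a database with window functions (MySQL 8.0 or later, or SQLite 3.25 or later, which is what runs here in memory).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE company_metrics (company_id INTEGER, company_unique_visitors INTEGER, date TEXT);
INSERT INTO company_metrics VALUES
  (604, 2054, '2011-04-01'), (604, 3444, '2011-03-01'), (604, 2122, '2011-02-01'),
  (604, 2144, '2011-01-01'), (604, 2001, '2010-12-01'),
  (602, 2011, '2011-04-01'), (602, 11, '2011-03-01'), (602, 411, '2011-02-01'),
  (602, 611, '2011-01-01'), (602, 111, '2010-12-01');
""")

# Rank rows per company from newest to oldest, keep the top 3, then average.
top_companies = conn.execute("""
    SELECT company_id, AVG(company_unique_visitors) AS avg_visitors
    FROM (
        SELECT company_id, company_unique_visitors,
               ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY date DESC) AS rn
        FROM company_metrics
    ) AS ranked
    WHERE rn <= 3
    GROUP BY company_id
    HAVING AVG(company_unique_visitors) > 2000
    ORDER BY company_id
""").fetchall()

print(top_companies)
```

The result can then be joined to companies on company_id to pick up company_name and permalink, as in the queries above.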