MySql SELECT Subqueries JOIN - mysql

I am trying to do all of these SELECT statements in one query so that I can extend it and group it later. I believe I have to tell it to JOIN on TABLE1, and I can tell you that it should be JOINing on the field called ITEM. I have tried dozens of JOIN statements, none of which does the trick, because I have two WHERE clauses in my subqueries.
SELECT ITEM, DSI, LEADTIME,
(SELECT COUNT(ORDER_NUMBER) FROM SUBTABLE1 TR1 WHERE TRANS_DATE BETWEEN DATE_SUB(curdate(), INTERVAL 730 DAY) AND DATE_SUB(curdate(), INTERVAL 365 DAY))
as OLDORDERS,
(SELECT COUNT(ORDER_NUMBER) FROM SUBTABLE2 TR2 WHERE TRANS_DATE BETWEEN DATE_SUB(curdate(), INTERVAL 364 DAY) AND curdate())
as NEWORDERS
FROM TABLE1
Displays:
ITEM | DSI | LEADTIME | OLDORDERS | NEWORDERS
PROD-1 0 1 16036 38399
PROD-2 1 0 16036 38399
PROD-3 1 1 16036 38399
Again...I believe I need it to JOIN the field ITEM on the subqueries, but I do not know how to do this, any ideas?

You don't actually need a JOIN, per se; rather, you need to "correlate" your subqueries, so that they refer to data in their containing query.
You haven't given your exact table definitions, so I can't say for sure, but here's my guess at what you need:
SELECT item, dsi, leadtime,
( SELECT COUNT(order_number)
FROM subtable1
WHERE trans_date BETWEEN DATE_SUB(CURDATE(), INTERVAL 730 DAY)
AND DATE_SUB(CURDATE(), INTERVAL 365 DAY)
-- restrict to "current" record from TABLE1:
AND subtable1.item = table1.item
) as OLDORDERS,
( SELECT COUNT(order_number)
FROM subtable1
WHERE trans_date BETWEEN DATE_SUB(CURDATE(), INTERVAL 364 DAY)
AND CURDATE()
-- restrict to "current" record from table1:
AND subtable1.item = table1.item
) as NEWORDERS
FROM table1
;
That's assuming that table1.item is the primary key, and that subtable1.item is a foreign-key referring to it. Naturally you'll have to adjust the query if that's not the case.
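If you later want to group further, the same per-item counts can also be written as a LEFT JOIN with conditional aggregation instead of two correlated subqueries. This is only a sketch: the demo_table1 and demo_subtable1 tables and their columns are made up for illustration, and it assumes both date windows are counted from the same table.

```sql
-- Hypothetical demo tables; the real definitions were not given in the question.
CREATE TABLE demo_table1 (
    item     VARCHAR(20) PRIMARY KEY,
    dsi      INT,
    leadtime INT
);
CREATE TABLE demo_subtable1 (
    order_number INT PRIMARY KEY,
    item         VARCHAR(20),
    trans_date   DATE
);

INSERT INTO demo_table1 VALUES ('PROD-1', 0, 1), ('PROD-2', 1, 0);
INSERT INTO demo_subtable1 VALUES
    (1, 'PROD-1', CURDATE() - INTERVAL 400 DAY),  -- falls in the older window
    (2, 'PROD-1', CURDATE() - INTERVAL 10 DAY),   -- falls in the newer window
    (3, 'PROD-1', CURDATE() - INTERVAL 20 DAY);   -- falls in the newer window

-- One pass over the joined rows; each CASE yields NULL outside its window,
-- and COUNT skips NULLs.
SELECT t.item, t.dsi, t.leadtime,
       COUNT(CASE WHEN s.trans_date BETWEEN CURDATE() - INTERVAL 730 DAY
                                        AND CURDATE() - INTERVAL 365 DAY
                  THEN s.order_number END) AS oldorders,
       COUNT(CASE WHEN s.trans_date BETWEEN CURDATE() - INTERVAL 364 DAY
                                        AND CURDATE()
                  THEN s.order_number END) AS neworders
FROM demo_table1 t
LEFT JOIN demo_subtable1 s ON s.item = t.item
GROUP BY t.item, t.dsi, t.leadtime;
```

With the sample rows, PROD-1 should come back with 1 old order and 2 new orders, and PROD-2 with 0 for both, because the LEFT JOIN keeps items that have no orders at all.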

Related

How to return 0 instead of null on mysql query?

The following query returns the visitors and pageviews of last 7 days. However, if there are no results (let's say it is a fresh account), nothing is returned.
How can I edit this so that it returns 0 for days that have no entries?
SELECT Date(timestamp) AS day,
Count(DISTINCT hash) AS visitors,
Count(*) AS pageviews
FROM behaviour
WHERE company_id = 1
AND timestamp >= Subdate(Curdate(), 7)
GROUP BY day
Assuming that you always have at least one record in the table for each of the last 7 days (regardless of the company_id), then you can use conditional aggregation as follows:
select
date(timestamp) as day,
count(distinct case when company_id = 1 then hash end) as visitors,
sum(company_id = 1) as pageviews
from behaviour
where timestamp >= curdate() - interval 7 day
group by day
Note that I changed your query to use standard date arithmetic, which I find easier to understand than date functions.
Otherwise, you would need to move the condition on the date from the where clause to the aggregate functions:
select
date(timestamp) as day,
count(distinct case when timestamp >= curdate() - interval 7 day and company_id = 1 then hash end) as visitors,
sum(timestamp >= curdate() - interval 7 day and company_id = 1) as pageviews
from behaviour
group by day
If your table is big, this can be expensive so I would not recommend that.
Alternatively, you can generate a derived table of dates and left join it with your original query:
select
curdate() - interval x.n day day,
count(distinct b.hash) visitors,
count(b.hash) page_views
from (
select 1 n union all select 2 union all select 3 union all select 4
union all select 5 union all select 6 union all select 7
) x
left join behaviour b
on b.company_id = 1
and b.timestamp >= curdate() - interval x.n day
and b.timestamp < curdate() - interval (x.n - 1) day
group by x.n
Use a query that returns all the dates from today minus 7 days to today and left join the table behaviour:
SELECT t.timestamp AS day,
Count(DISTINCT b.hash) AS visitors,
Count(b.timestamp) AS pageviews
FROM (
SELECT Subdate(Curdate(), 7) timestamp UNION ALL SELECT Subdate(Curdate(), 6) UNION ALL
SELECT Subdate(Curdate(), 5) UNION ALL SELECT Subdate(Curdate(), 4) UNION ALL SELECT Subdate(Curdate(), 3) UNION ALL
SELECT Subdate(Curdate(), 2) UNION ALL SELECT Subdate(Curdate(), 1) UNION ALL SELECT Curdate()
) t LEFT JOIN behaviour b
ON Date(b.timestamp) = t.timestamp AND b.company_id = 1
GROUP BY day
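On MySQL 8.0 or later, the hand-written UNION ALL list of days can be replaced by a recursive common table expression. The following is a minimal sketch under that version assumption; the demo_behaviour table and its sample rows are invented for illustration and only mirror the columns used in the question.

```sql
-- Hypothetical stand-in for the behaviour table from the question.
CREATE TABLE demo_behaviour (
    company_id  INT,
    hash        CHAR(8),
    `timestamp` DATETIME
);
INSERT INTO demo_behaviour VALUES
    (1, 'aaaa', NOW()),
    (1, 'aaaa', NOW()),
    (1, 'bbbb', NOW());

-- Generate the last 7 calendar days (today included), one row per day.
WITH RECURSIVE days (d) AS (
    SELECT CURDATE() - INTERVAL 6 DAY
    UNION ALL
    SELECT d + INTERVAL 1 DAY FROM days WHERE d < CURDATE()
)
SELECT days.d AS day,
       COUNT(DISTINCT b.hash) AS visitors,
       COUNT(b.hash)          AS pageviews
FROM days
LEFT JOIN demo_behaviour b
       ON b.company_id = 1
      -- half-open range instead of DATE(b.timestamp), so an index
      -- on the timestamp column stays usable
      AND b.`timestamp` >= days.d
      AND b.`timestamp` <  days.d + INTERVAL 1 DAY
GROUP BY days.d
ORDER BY days.d;
```

With the sample rows this should return seven rows: today with 2 visitors and 3 pageviews, and the six earlier days with 0 for both.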
Use IFNULL:
IFNULL(expr1, 0)
From the documentation:
If expr1 is not NULL, IFNULL() returns expr1; otherwise it returns expr2. IFNULL() returns a numeric or string value, depending on the context in which it is used.
You can use the following trick:
First, write a query that returns one dummy row: SELECT 1;
Next, use a LEFT JOIN without a real condition to attach the summary row(s). The join returns the values if data exists and NULLs otherwise.
Last, select only what we need from the joined queries and convert the NULLs to zeros using the IFNULL function.
SELECT
IFNULL(b.day,0) AS DAY,
IFNULL(b.visitors,0) AS visitors,
IFNULL(b.pageviews,0) AS pageviews
FROM (
SELECT 1
) a
LEFT JOIN (
SELECT DATE(TIMESTAMP) AS DAY,
COUNT(DISTINCT HASH) AS visitors,
COUNT(*) AS pageviews
FROM behaviour
WHERE company_id = 1
AND TIMESTAMP >= SUBDATE(CURDATE(), 7)
GROUP BY DAY
) b ON 1 = 1;

how to join tables and get sum of first table if second has multiple occurrences

I have two tables, "temp_user_batches" and "user_activities". I am trying to find the sum of user_activities for the users present in the temp_user_batches table.
The problem is that the sum gets multiplied by the number of times the user occurs in the temp_user_batches table.
Below is temp_user_batches table
This is user_activities table
It is supposed to give the sum of the time_spent column, 649 + 364 = 1013, but instead it gives 2016.
my query is:
SELECT temp_user_batches.user_id as user_id,
temp_user_batches.activity_goal as goal,
DATE_SUB(CURDATE(), INTERVAL 7 day) as min_activity_date,
CURDATE() as max_activity_date,
(sum(user_activities.time_spent)/60) as total_time_spent
FROM temp_user_batches
INNER JOIN user_activities
ON temp_user_batches.user_id = user_activities.user_id
WHERE activity_date BETWEEN DATE_SUB(CURDATE(), INTERVAL 7 day) AND CURDATE()
group by user_id, goal, max_activity_date, min_activity_date
You can use a derived table that contains the DISTINCT pairs of user_id, activity_goal from table temp_user_batches:
SELECT t1.user_id as user_id,
t1.activity_goal as goal,
DATE_SUB(CURDATE(), INTERVAL 7 day) as min_activity_date,
CURDATE() as max_activity_date,
(sum(t2.time_spent)/60) as total_time_spent
FROM (
SELECT DISTINCT user_id, activity_goal
FROM temp_user_batches) AS t1
INNER JOIN user_activities AS t2 ON t1.user_id = t2.user_id
WHERE activity_date BETWEEN DATE_SUB(CURDATE(), INTERVAL 7 day) AND CURDATE()
group by user_id, goal, max_activity_date, min_activity_date
From my understanding, you should GROUP BY temp_user_batches on user_id, last_activity before joining it with user_activities. Right now you join user_activities against 2 rows instead of the 1 row you want (from what I understand).
Something like:
SELECT
temp_user_batches.user_id AS user_id,
temp_user_batches.activity_goal AS goal,
DATE_SUB(CURDATE(), INTERVAL 7 DAY) AS min_activity_date,
CURDATE() AS max_activity_date,
(SUM(user_activities.time_spent) / 60) AS total_time_spent
FROM
(SELECT
*
FROM
temp_user_batches
GROUP BY user_id , last_activity) AS temp_user_batches
INNER JOIN
user_activities ON temp_user_batches.user_id = user_activities.user_id
WHERE
activity_date BETWEEN DATE_SUB(CURDATE(), INTERVAL 7 DAY) AND CURDATE()
GROUP BY user_id , goal , max_activity_date , min_activity_date
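Another way to avoid the multiplication is to sum user_activities per user first and only then join, so that duplicate batch rows can no longer inflate the total. This is a rough sketch: the demo_ tables and their rows are made up to mirror the numbers in the question, and the column types are assumptions.

```sql
-- Hypothetical demo tables mirroring the columns named in the question.
CREATE TABLE demo_user_batches (user_id INT, activity_goal INT);
CREATE TABLE demo_user_activities (
    user_id       INT,
    activity_date DATE,
    time_spent    INT
);

INSERT INTO demo_user_batches VALUES (7, 30), (7, 30);  -- same user listed twice
INSERT INTO demo_user_activities VALUES
    (7, CURDATE(), 649),
    (7, CURDATE() - INTERVAL 1 DAY, 364);

SELECT b.user_id,
       b.activity_goal AS goal,
       CURDATE() - INTERVAL 7 DAY AS min_activity_date,
       CURDATE() AS max_activity_date,
       a.total_time / 60 AS total_time_spent
FROM (
    -- collapse repeated batch rows to one row per user and goal
    SELECT DISTINCT user_id, activity_goal
    FROM demo_user_batches
) AS b
INNER JOIN (
    -- aggregate before joining: exactly one row per user
    SELECT user_id, SUM(time_spent) AS total_time
    FROM demo_user_activities
    WHERE activity_date BETWEEN CURDATE() - INTERVAL 7 DAY AND CURDATE()
    GROUP BY user_id
) AS a ON a.user_id = b.user_id;
```

For the sample rows the inner sum is 649 + 364 = 1013, and user 7 appears once in the result. If a user had two different goals, each goal row would still carry the correct, unmultiplied total.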

mysql union all with aliases, syntax error

Why do I get
Error in query (1064): Syntax error near 'as q2)' at line 7
with
SELECT SQL_NO_CACHE q1.d1, q1.a, q2.b, (q1.a-q2.b)/q1.a*100 as Percentage
FROM
(SELECT Date(date) d1, count(id_update) a
FROM vas_updates
WHERE date > date_sub(now(), interval 2 hour)
GROUP BY DATE(date)) as q1
UNION ALL
(SELECT date(date) as d2, count(id_update) as b
FROM vas_updates
WHERE date BETWEEN
date_sub(date_sub(now(), interval 1 day), interval 2 hour)
AND
date_sub(now(), interval 1 day) group by DATE(d2) ) as q2
Can't I use aliases with UNION?
UPDATE:
This query might have leftovers from another query; I was trying to understand the syntax error first.
What I'm trying to calculate is the percentage increase or decrease of two sums which are the hits from the last 2 hours of today compared to same timeframe from yesterday.
the table has just id and datetime
I suspect you actually want a JOIN
Something like this:-
SELECT SQL_NO_CACHE q1.d1, q1.a, q2.b, (q1.a-q2.b)/q1.a*100 as Percentage
FROM
(
SELECT Date(date) d1, count(id_update) a
FROM vas_updates
WHERE date > date_sub(now(), interval 2 hour)
GROUP BY DATE(date)
) as q1
INNER JOIN
(
SELECT date(date) as d2, count(id_update) as b
FROM vas_updates
WHERE date BETWEEN date_sub(date_sub(now(), interval 1 day), interval 2 hour) AND date_sub(now(), interval 1 day)
group by DATE(d2)
) as q2
ON q1.d1 = q2.d2
EDIT
Checked your updated query and it IS a JOIN you need.
You can use a CROSS JOIN. You are returning 1 value from each sub query, and doing a calculation on those values:-
SELECT SQL_NO_CACHE q1.d1, q1.a, q2.b, (q1.a-q2.b)/q1.a*100 as Percentage
FROM
(
SELECT MIN(Date(date)) d1, count(id_update) a
FROM vas_updates
WHERE date > date_sub(now(), interval 2 hour)
) as q1
CROSS JOIN
(
SELECT MIN(Date(date)) d2, count(id_update) as b
FROM vas_updates
WHERE date BETWEEN
date_sub(date_sub(now(), interval 1 day), interval 2 hour)
AND
date_sub(now(), interval 1 day)
) as q2
CROSS JOIN gives you every combination of the rows. In this case you have 1 resulting record. I have just returned the MIN date to get a single date to display.
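One thing to watch with this calculation: if there were no hits in the last 2 hours, q1.a is 0 and the division has no meaningful result. A possible guard is NULLIF, sketched below against an invented demo_vas_updates table (the table name and sample rows are assumptions; only the id and datetime columns come from the question).

```sql
-- Hypothetical stand-in for vas_updates: just an id and a datetime.
CREATE TABLE demo_vas_updates (
    id_update INT AUTO_INCREMENT PRIMARY KEY,
    `date`    DATETIME
);
INSERT INTO demo_vas_updates (`date`) VALUES
    (NOW() - INTERVAL 10 MINUTE),                    -- today, inside the window
    (NOW() - INTERVAL 30 MINUTE),
    (NOW() - INTERVAL 90 MINUTE),
    (NOW() - INTERVAL 1 DAY - INTERVAL 20 MINUTE),   -- yesterday, same window
    (NOW() - INTERVAL 1 DAY - INTERVAL 60 MINUTE);

-- Each subquery is an aggregate without GROUP BY, so it returns exactly
-- one row and the CROSS JOIN yields exactly one combined row.
SELECT q1.a, q2.b,
       (q1.a - q2.b) / NULLIF(q1.a, 0) * 100 AS Percentage
FROM (
    SELECT COUNT(id_update) AS a
    FROM demo_vas_updates
    WHERE `date` > NOW() - INTERVAL 2 HOUR
) AS q1
CROSS JOIN (
    SELECT COUNT(id_update) AS b
    FROM demo_vas_updates
    WHERE `date` BETWEEN NOW() - INTERVAL 1 DAY - INTERVAL 2 HOUR
                     AND NOW() - INTERVAL 1 DAY
) AS q2;
```

With the sample rows a is 3 and b is 2. NULLIF(q1.a, 0) turns a zero count into NULL, so Percentage is explicitly NULL for an empty window rather than depending on how the server's SQL mode treats division by zero.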
You can't. A UNION does not let you put an alias on each subquery, because it is an operation that produces a single table.
Like this:
select 1 a, 2 b
union all
select 3 blah, 4 bleh
This will result in
a b
1 2
3 4
See it here: http://sqlfiddle.com/#!2/68b32/444
In this query the result has only two fields, taken from the first SELECT, no matter what the second SELECT calls its columns. The other queries only need to return the same number of fields as the first, with compatible types. Naming each UNIONed query with its own alias is invalid.
So I think what you need is probably a JOIN, or a UNION in which every query lists all the fields.
So, your query would be something like:
SELECT SQL_NO_CACHE tbl.d1,
tbl.a,
tbl.b,
(tbl.a-tbl.b)/tbl.a*100 as Percentage
FROM (SELECT Date(date) d1,
count(id_update) a,
null d2,
null b
FROM vas_updates
WHERE date > date_sub(now(), interval 2 hour)
GROUP BY DATE(date)
UNION ALL
SELECT null d1,
null a,
date(date) as d2,
count(id_update) as b
FROM vas_updates
WHERE date
BETWEEN date_sub(date_sub(now(), interval 1 day), interval 2 hour)
AND date_sub(now(), interval 1 day) group by DATE(d2)
) tbl
But this most likely will not make the calculations right. You can use the version that #Kickstart has provided you.
Query of the answer from #Kickstart
SELECT SQL_NO_CACHE q1.d1, q1.a, q2.b, (q1.a-q2.b)/q1.a*100 as Percentage
FROM
(
SELECT Date(date) d1, count(id_update) a
FROM vas_updates
WHERE date > date_sub(now(), interval 2 hour)
GROUP BY DATE(date)
) as q1
INNER JOIN
(
SELECT date(date) as d2, count(id_update) as b
FROM vas_updates
WHERE date BETWEEN date_sub(date_sub(now(), interval 1 day), interval 2 hour) AND date_sub(now(), interval 1 day)
group by DATE(d2)
) as q2
ON q1.d1 = q2.d2
I decided to post this answer to explain why you are using the UNION operation the wrong way.
I think that #Kickstart is right, and you can try this.
SELECT SQL_NO_CACHE d, a
FROM
(SELECT Date(date) d, count(id_update) as a
FROM vas_updates
WHERE date > date_sub(now(), interval 2 hour)
GROUP BY DATE(date))
UNION ALL
(SELECT date(date) as d, count(id_update) as a
FROM vas_updates
WHERE date BETWEEN
date_sub(date_sub(now(), interval 1 day), interval 2 hour)
AND
date_sub(now(), interval 1 day) group by DATE(d2) )
I was wrong. UPDATE: you can try it like this:
SELECT SQL_NO_CACHE q1.d1, q1.a, q2.b, (q1.a-q2.b)/q1.a*100 as Percentage
FROM
(
SELECT DATE_FORMAT(Date(date),'%H') as d1, count(id_update) a
FROM vas_updates
WHERE date > date_sub(now(), interval 2 hour)
GROUP BY DATE(date)
) as q1
INNER JOIN
(
SELECT DATE_FORMAT(Date(date),'%H') as d2, count(id_update) as b
FROM vas_updates
WHERE date BETWEEN date_sub(date_sub(now(), interval 1 day), interval 2 hour) AND date_sub(now(), interval 1 day)
group by DATE(d2)
) as q2
ON q1.d1 = q2.d2

Return a zero for a day with no results

I have a query which returns the total of users who registered for each day. Problem is if a day had no one register it doesn't return any value, it just skips it. I would rather it returned zero
this is my query so far
SELECT count(*) total FROM users WHERE created_at < NOW() AND created_at >
DATE_SUB(NOW(), INTERVAL 7 DAY) AND owner_id = ? GROUP BY DAY(created_at)
ORDER BY created_at DESC
Edit
I grouped the data so I would get a count for each day. As for the date range, I wanted the total users registered for the previous seven days.
A variation on the theme "build your own 7 day calendar inline":
SELECT D, count(created_at) AS total FROM
(SELECT DATE_SUB(NOW(), INTERVAL D DAY) AS D
FROM
(SELECT 0 as D
UNION SELECT 1
UNION SELECT 2
UNION SELECT 3
UNION SELECT 4
UNION SELECT 5
UNION SELECT 6
) AS D
) AS D
LEFT JOIN users ON date(created_at) = date(D)
WHERE owner_id = ? or owner_id is null
GROUP BY D
ORDER BY D DESC
I don't have your table structure at hand, so that would probably need adjustment. Along the same lines, you will see I use NOW() as a reference date, but that's easily adjustable. Anyway, that's the spirit...
See http://sqlfiddle.com/#!2/ab5cf/11 for a live demo.
If you had a table that held all of your days you could do a left join from there to your users table.
SELECT SUM(CASE WHEN U.Id IS NOT NULL THEN 1 ELSE 0 END)
FROM DimDate D
LEFT JOIN Users U ON CONVERT(DATE,U.Created_at) = D.DateValue
WHERE YourCriteria
GROUP BY YourGroupBy
The tricky bit is that you group by the date field in your data, which might have 'holes' in it, and thus miss records for that date.
A way to solve it is by filling a table with all dates for the past 10 and next 100 years or so, and to (outer)join that to your data. Then you will have one record for each day (or week or whatever) for sure.
I had to do this only for MS SqlServer, so how to fill a date table (or perhaps you can do it dynamically) is for someone else to answer.
A bit long winded, but I think this will work...
SELECT count(users.created_at) total FROM
(SELECT DATE_SUB(CURDATE(),INTERVAL 6 DAY) as cdate UNION ALL
SELECT DATE_SUB(CURDATE(),INTERVAL 5 DAY) UNION ALL
SELECT DATE_SUB(CURDATE(),INTERVAL 4 DAY) UNION ALL
SELECT DATE_SUB(CURDATE(),INTERVAL 3 DAY) UNION ALL
SELECT DATE_SUB(CURDATE(),INTERVAL 2 DAY) UNION ALL
SELECT DATE_SUB(CURDATE(),INTERVAL 1 DAY) UNION ALL
SELECT CURDATE()) t1 left join users
ON date(created_at)=t1.cdate
WHERE owner_id = ? or owner_id is null
GROUP BY t1.cdate
ORDER BY t1.cdate DESC
It differs from your query slightly in that it works on dates rather than date times which your query is doing. From your description I have assumed you mean to use whole days and therefore have used dates.
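One caveat with filtering a LEFT JOIN through WHERE owner_id = ? OR owner_id IS NULL: a day on which only other owners' users registered still finds a match in the join, so the WHERE removes that row and the day disappears again. Moving the owner filter into the ON clause avoids that. Below is a small sketch; the demo_users table and its rows are hypothetical, and the literal 1 stands in for the ? placeholder.

```sql
-- Hypothetical stand-in for the users table.
CREATE TABLE demo_users (
    id         INT PRIMARY KEY,
    owner_id   INT,
    created_at DATETIME
);
INSERT INTO demo_users VALUES
    (1, 1, NOW()),                   -- owner 1, registered today
    (2, 2, NOW() - INTERVAL 1 DAY);  -- a different owner, registered yesterday

SELECT t1.cdate, COUNT(u.id) AS total
FROM (
    SELECT CURDATE() AS cdate UNION ALL
    SELECT CURDATE() - INTERVAL 1 DAY UNION ALL
    SELECT CURDATE() - INTERVAL 2 DAY UNION ALL
    SELECT CURDATE() - INTERVAL 3 DAY UNION ALL
    SELECT CURDATE() - INTERVAL 4 DAY UNION ALL
    SELECT CURDATE() - INTERVAL 5 DAY UNION ALL
    SELECT CURDATE() - INTERVAL 6 DAY
) t1
LEFT JOIN demo_users u
       ON u.created_at >= t1.cdate
      AND u.created_at <  t1.cdate + INTERVAL 1 DAY
      AND u.owner_id = 1            -- filter here, not in WHERE
GROUP BY t1.cdate
ORDER BY t1.cdate DESC;
```

With the sample rows this should return all seven days: a total of 1 for today and 0 for every other day, including yesterday, where only the other owner's user registered.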

Selecting first records of a type in a given period

I have a database table that stores user comments:
comments(id, user_id, created_at)
From that table, I want to get the number of users that have commented for the first time in the past 7 days.
Here's what I have so far:
SELECT COUNT(DISTINCT `user_id`)
FROM `comments`
WHERE `created_at` BETWEEN DATE_SUB(NOW(), INTERVAL 7 DAY) AND NOW()
This would give the number of users that have commented, but it would not take into consideration whether these comments are first for their users.
SELECT COUNT(DISTINCT user_id)
FROM comments AS c1
WHERE c1.created_at BETWEEN DATE_SUB(NOW(), INTERVAL 7 DAY) AND NOW()
AND NOT EXISTS (SELECT 1 FROM comments AS c2
WHERE c2.user_id = c1.user_id AND c2.created_at < c1.created_at)
The NOT EXISTS clause checks whether the same user_id has a record with an earlier created_at time. If so, it means this is not the first time they are commenting, and thus we should discount this record.
I have kept DISTINCT user_id because it is possible two comments are created at the same time. You could also try the following instead, which only gets the very first record for each user, so you can do away with the DISTINCT, but I don't know which would be more optimal:
SELECT COUNT(*)
FROM comments AS c1
WHERE c1.created_at BETWEEN DATE_SUB(NOW(), INTERVAL 7 DAY) AND NOW()
AND NOT EXISTS (SELECT 1 FROM comments AS c2
WHERE c2.user_id = c1.user_id
AND (c2.created_at < c1.created_at
OR (c2.created_at = c1.created_at AND c2.id < c1.id)))
SELECT COUNT(DISTINCT `user_id`)
FROM comments c1
WHERE created_at BETWEEN DATE_SUB(NOW(), INTERVAL 7 DAY) AND NOW()
AND NOT EXISTS
(SELECT NULL
FROM comments c2
where c1.user_id = c2.user_id
AND c2.created_at < DATE_SUB(NOW(), INTERVAL 7 DAY));
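The same count can also be expressed without NOT EXISTS, by taking each user's earliest comment and keeping the users whose earliest comment falls inside the window. This is just a sketch for comparison, not a claim about which form is faster; the demo_comments table and its rows are made up to match the comments(id, user_id, created_at) layout from the question.

```sql
-- Hypothetical stand-in for the comments table.
CREATE TABLE demo_comments (
    id         INT PRIMARY KEY,
    user_id    INT,
    created_at DATETIME
);
INSERT INTO demo_comments VALUES
    (1, 1, NOW() - INTERVAL 30 DAY),  -- user 1 first commented a month ago
    (2, 1, NOW() - INTERVAL 2 DAY),
    (3, 2, NOW() - INTERVAL 3 DAY),   -- user 2 first commented this week
    (4, 2, NOW() - INTERVAL 1 DAY),
    (5, 3, NOW() - INTERVAL 10 DAY);  -- user 3 has no recent comment

SELECT COUNT(*) AS first_time_commenters
FROM (
    SELECT user_id
    FROM demo_comments
    GROUP BY user_id
    -- the user's earliest comment must be inside the last 7 days
    HAVING MIN(created_at) >= NOW() - INTERVAL 7 DAY
) AS first_timers;
```

With the sample rows only user 2 qualifies, so the query should return 1. Because the inner query yields one row per user, no DISTINCT is needed.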