MySQL Union Operator not behaving as expected - mysql

I usually find the answer to my questions here. If not, I know that is probably because I haven't looked deep enough. However this time none of the answers I looked through fit my needs. So here it goes:
I want to use MySQL server to build a report from two tables:
users                      expenses
ID      Email              ID      Date      Spent
-----------------------    ---------------------------
A123    a#test.com         A123    3.3.14    2.50
B102    b#test.com         A123    7.3.14    3.50
yum     yum#a.com          B102    4.4.14    7.00
(null)  (null)
I want to make a report in which, for a given date range, e.g. from 3.1.2014 to 3.31.14, I get a list of all the IDs in the system with the corresponding amount that they have spent, even if they didn't spend anything.
Therefore from the above tables, let's say we want to retrieve the month of March 2014 (from 3.1.2014 to 3.31.14), I would like to get:
ID Spent
------------------
A123 6.00
B102 7.00
yum NULL
I don't mind having NULL or 0 for those users that didn't spend any money. So far I have gotten this:
SELECT ID,
sum(Spent) AS Spent
FROM expenses
WHERE expenses.date >= '2014-03-01 00:00:00'
AND expenses.date < '2014-03-31 00:00:00'
GROUP BY expenses.ID
UNION
SELECT users.ID,
NULL AS Spent
FROM users
WHERE users.ID NOT IN
(SELECT expenses.ID
FROM expenses
WHERE expenses.date >= '2014-03-01 00:00:00'
AND expenses.date < '2014-03-31 00:00:00'
GROUP BY expenses.ID)
ORDER BY ID;
You can check the above code in here.
It works as expected, but with my real database the number of rows in the result is greater than the number of unique rows in users.
I have checked that all the IDs in the expenses table are in the users table, which is the case. I have tested with:
SELECT ID FROM expenses
WHERE expenses.date >= '2014-03-01 00:00:00'
AND expenses.date < '2014-03-31 00:00:00'
AND ID NOT IN (SELECT users.ID FROM users);
and it returns an empty set, which is as expected.
I must be missing something, but I have no clue of what. Could someone please give some insights? I am pretty new to MySQL, and maybe there is a better way of doing this.

I think you want a left outer join rather than a union:
SELECT u.ID, sum(e.Spent) AS Spent
FROM users u left outer join
expenses e
on u.id = e.id and
e.date >= '2014-03-01 00:00:00' and
e.date < '2014-03-31 00:00:00'
GROUP BY u.ID;
Note in the data you have in the SQL Fiddle, this will return four rows, because of the NULL valued row in users.
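To see why the date condition belongs in the ON clause rather than in WHERE, here is a minimal sketch. It runs the join against an in-memory SQLite database through Python's sqlite3 module instead of MySQL, which is an assumption made only so the example is self-contained. The sample rows and ISO dates are illustrative, and the upper bound '2014-04-01' is also an assumption, chosen so that March 31 is included.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption, for runnability only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (ID TEXT, Email TEXT);
    CREATE TABLE expenses (ID TEXT, date TEXT, Spent REAL);
    -- Illustrative rows, not the asker's real data.
    INSERT INTO users VALUES
        ('A123', 'a#test.com'), ('B102', 'b#test.com'), ('yum', 'yum#a.com');
    INSERT INTO expenses VALUES
        ('A123', '2014-03-03 00:00:00', 2.50),
        ('A123', '2014-03-07 00:00:00', 3.50),
        ('B102', '2014-04-04 00:00:00', 7.00);
""")

# Date filter inside ON: users without matching expenses are kept, with NULL.
on_rows = conn.execute("""
    SELECT u.ID, SUM(e.Spent) AS Spent
    FROM users u
    LEFT OUTER JOIN expenses e
      ON u.ID = e.ID
     AND e.date >= '2014-03-01 00:00:00'
     AND e.date <  '2014-04-01 00:00:00'
    GROUP BY u.ID
    ORDER BY u.ID
""").fetchall()

# Same filter moved to WHERE: rows with a NULL e.date are discarded,
# so the outer join effectively turns into an inner join.
where_rows = conn.execute("""
    SELECT u.ID, SUM(e.Spent) AS Spent
    FROM users u
    LEFT OUTER JOIN expenses e ON u.ID = e.ID
    WHERE e.date >= '2014-03-01 00:00:00'
      AND e.date <  '2014-04-01 00:00:00'
    GROUP BY u.ID
    ORDER BY u.ID
""").fetchall()

print(on_rows)
print(where_rows)
```

With this sample data the first query returns one row per user, while the second returns only the user who actually spent something in March.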

You can join the two tables (users and expenses) in a single query (http://sqlfiddle.com/#!2/738ea6/5). This should be faster and easier to read, but note that an inner join like this returns only the users who have expenses in the date range:
SELECT expenses.ID,
sum(Spent) AS Spent
FROM expenses, users
WHERE expenses.date >= '2014-03-01 00:00:00'
AND expenses.date < '2014-03-31 00:00:00'
AND users.ID = expenses.ID
GROUP BY expenses.ID
ORDER BY ID;

Group by in second query?
SELECT ID,
sum(Spent) AS Spent
FROM expenses
WHERE expenses.date >= '2014-03-01 00:00:00'
AND expenses.date < '2014-03-31 00:00:00'
GROUP BY expenses.ID
UNION
SELECT users.ID,
NULL AS Spent
FROM users
WHERE users.ID NOT IN
(SELECT expenses.ID
FROM expenses
WHERE expenses.date >= '2014-03-01 00:00:00'
AND expenses.date < '2014-03-31 00:00:00'
GROUP BY expenses.ID)
GROUP BY ID
ORDER BY ID;

Related

Grab all users who have last logged in between two dates in another table

I want to grab all users that have last logged in between two specific days. How can this be done?
Note: the reason why I did sub.updated_at is because:
the login session may last for more than a day
same users may log in on different pc and start a secondary session
I need to get the very last session started by the user and then check when it was last updated.
Example build: http://sqlfiddle.com/#!9/39849/11
So from the example on sqlfiddle I would expect the query to select Bill, because updated_at, for his last record, is between 2019-09-15 00:00:00 and 2019-09-15 23:59:59. One of John's entries was also updated between those dates but his latest record was updated on 2019-09-18 12:00:00 hence why John should not be selected.
This is a possible solution, but I have concerns about performance. Indexing updated_at should have a positive effect, but what if we have 1 million+ users and an even greater number of user logins?
SELECT *
FROM
(
SELECT *
FROM (
SELECT *
FROM `user_logins`
WHERE `updated_at` >= '2019-09-15 00:00:00'
ORDER BY `user_logins`.`id` DESC
) as sub
GROUP BY `sub`.user_id
) as sub
WHERE `sub`.updated_at BETWEEN '2019-09-15 00:00:00' and '2019-09-15 23:59:59'
SELECT *
FROM (
SELECT *
FROM `user_logins`
WHERE `updated_at` BETWEEN '2019-09-15 00:00:00' and '2019-09-15 23:59:59'
ORDER BY `user_logins`.`id` DESC
) as sub
WHERE NOT EXISTS (
SELECT 1
FROM user_logins
WHERE user_id = sub.user_id
and updated_at > '2019-09-15 23:59:59'
);
Your query may be simplified to:
SELECT U.*, UL.*
FROM users U
JOIN user_logins UL ON U.id = UL.user_id
JOIN (SELECT user_id, MAX(updated_at) updated_at
FROM user_logins
GROUP BY user_id) UL2 ON UL2.updated_at = UL.updated_at
AND UL2.user_id = UL.user_id
WHERE DATE(UL2.updated_at) BETWEEN DATE('2019-09-15 00:00:00') AND DATE('2019-09-15 23:59:59')
ORDER BY UL.id DESC
Here is the fiddle.
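As a rough illustration of the "latest row per user, then filter" idea, the sketch below takes MAX(updated_at) per user and applies the date window in HAVING. It uses Python's sqlite3 with an in-memory database in place of MySQL (an assumption for runnability), and the rows are made up to mirror the Bill and John example: user 1 plays Bill, user 2 plays John.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption, for runnability only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_logins (id INTEGER PRIMARY KEY, user_id INTEGER, updated_at TEXT);
    -- Hypothetical rows: user 1 was last active on the 15th, user 2 on the 18th.
    INSERT INTO user_logins (user_id, updated_at) VALUES
        (1, '2019-09-10 08:00:00'),
        (1, '2019-09-15 10:00:00'),
        (2, '2019-09-15 09:00:00'),
        (2, '2019-09-18 12:00:00');
""")

# One row per user with the latest updated_at; HAVING keeps only users
# whose latest update falls inside the half-open window for 2019-09-15.
rows = conn.execute("""
    SELECT ul.user_id, MAX(ul.updated_at) AS last_update
    FROM user_logins ul
    GROUP BY ul.user_id
    HAVING MAX(ul.updated_at) >= '2019-09-15 00:00:00'
       AND MAX(ul.updated_at) <  '2019-09-16 00:00:00'
    ORDER BY ul.user_id
""").fetchall()

print(rows)
```

Comparing the raw column against a half-open range, rather than wrapping it in DATE(), leaves the database free to use an index on updated_at if one exists.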

How to get latest results by date when selecting from two table?

I have two tables and I would like to join them with a query.
result saves the actual result entries.
user_tracking tracks the acceptance and completion of work; users can cancel and accept work again at a later time.
SELECT *
from
svr1.result r,
svr1.user_tracking u
where
r.uid = u.user_id and r.tid = u.post1
and u.function_name = '7' #7 == accept work
and r.insert_time > '2015-09-23 00:00:00' and r.insert_time < '2015-10-03 00:00:00'
and u.track_time > '2015-09-23 00:00:00' and u.track_time < '2015-10-03 00:00:00'
My result table had 1785 records within the period I wanted to track, but the above query returns 1990 records. I would like to know how I can filter to get only the latest date accepted by each user.
Columns in the result table: uid INT, tid INT, result VARCHAR, insert_time TIMESTAMP.
Columns in the user_tracking table: user_id INT, post1 VARCHAR, function_name VARCHAR, result VARCHAR, track_time TIMESTAMP.
In the user_tracking sample records for this query, the track time changes and the rest stays the same.
Use GROUP BY with MAX() on the required date; this selects the latest date among the matching rows (assuming all the other columns are equal). Code as follows (unfortunately, all columns need to be listed explicitly because of the MAX):
SELECT r.uid,
r.tid,
r.result,
r.insert_time,
u.user_id,
u.post1,
u.function_name,
u.result,
MAX(track_time)
FROM
svr1.result r,
svr1.user_tracking u
WHERE
r.uid = u.user_id AND r.tid = u.post1
AND u.function_name = '7' #7 == accept work
AND r.insert_time > '2015-09-23 00:00:00' AND r.insert_time < '2015-10-03 00:00:00'
AND u.track_time > '2015-09-23 00:00:00' AND u.track_time < '2015-10-03 00:00:00'
GROUP BY
r.uid,
r.tid
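A small sketch of the same idea, collapsing repeated "accept" events to the latest track_time per result row. It runs on SQLite through Python's sqlite3 rather than MySQL, with a simplified schema (post1 is declared INTEGER here) and invented rows; all of that is assumed for illustration. Every non-aggregated column is listed in GROUP BY so the query does not depend on MySQL's relaxed grouping mode.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; schema and rows are simplified assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE result (uid INTEGER, tid INTEGER, result TEXT, insert_time TEXT);
    CREATE TABLE user_tracking (user_id INTEGER, post1 INTEGER,
                                function_name TEXT, track_time TEXT);
    INSERT INTO result VALUES
        (1, 10, 'ok', '2015-09-25 10:00:00'),
        (2, 20, 'ok', '2015-09-26 10:00:00');
    -- User 1 accepted the same work twice; only the later accept should survive.
    INSERT INTO user_tracking VALUES
        (1, 10, '7', '2015-09-24 09:00:00'),
        (1, 10, '7', '2015-09-25 08:00:00'),
        (2, 20, '7', '2015-09-26 09:00:00');
""")

rows = conn.execute("""
    SELECT r.uid, r.tid, r.result, r.insert_time,
           MAX(u.track_time) AS latest_accept
    FROM result r
    JOIN user_tracking u
      ON r.uid = u.user_id AND r.tid = u.post1
    WHERE u.function_name = '7'   -- 7 == accept work
    GROUP BY r.uid, r.tid, r.result, r.insert_time
    ORDER BY r.uid
""").fetchall()

for row in rows:
    print(row)
```

The query returns one row per entry in result, even though user 1 has two matching tracking rows.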

Count number of entries in time interval 1 that appear in time interval 2 - SQL

I am new here and tried to look up the answer to my question but couldn't find anything on it. I am currently learning how to work with SQL queries and am wondering how I can count the amount of unique values that appear in two time intervals?
I have two columns; one is the timestamp while the other is a customer id. What I want to do is to check, for example, the number of customers that appear in time interval A, let's say January 2014 - February 2014. I then want to see how many of these also appear in another time interval that I specify, for example February 2014 - April 2014. If the total sample were 2 people who both bought something in January while only one of them bought something else before the end of April, the count would be 1.
I am a total beginner and tried the query below, but it obviously won't return what I want: each entry has only one timestamp, so it cannot be in two intervals at once.
SELECT
count(customer_id)
FROM db.table
WHERE time >= date('2014-01-01 00:00:00')
AND time < date('2014-02-01 00:00:00')
AND time >= date('2014-02-01 00:00:00')
AND time < date('2014-05-01 00:00:00')
;
Try this.
select count(distinct t.customer_id) from Table t
INNER JOIN Table t1 on t1.customer_id = t.customer_id
and t1.time >= '2014-01-01 00:00:00' and t1.time<'2014-02-01 00:00:00'
where t.time >='2014-02-01 00:00:00' and t.time<'2014-05-01 00:00:00'
Here's one method of doing this with conditional grouping in an inner-select.
Select Case
When GroupBy = 1 Then 'January - February 2014'
When GroupBy = 2 Then 'February - April 2014'
End As Period,
Count(Customer_Id) As Total
From
(
SELECT Customer_Id,
Case
When Time Between '2014-01-01' And '2014-02-01' Then 1
When Time Between '2014-02-01' And '2014-04-01' Then 2
Else -1
End As GroupBy
From db.Table
) D
Where GroupBy <> -1
Group By GroupBy
Edit: Sorry, misread the question. This will show you those that overlap those two time ranges:
Select Count(Distinct t1.Customer_Id)
From db.Table t1
Where Exists
(
Select Customer_Id
From db.Table t2
Where t1.customer_id = t2.customer_id
And t2.Time Between '2014-02-01' And '2014-04-01'
)
And t1.Time Between '2014-01-01' And '2014-02-01'
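For comparison, here is a minimal sketch of counting customers that appear in both intervals, using half-open ranges and COUNT(DISTINCT ...). It uses Python's sqlite3 with an in-memory database instead of the asker's system, and the table name purchases and its rows are hypothetical.

```python
import sqlite3

# In-memory SQLite; table name and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchases (customer_id INTEGER, time TEXT);
    INSERT INTO purchases VALUES
        (1, '2014-01-10 09:00:00'),  -- customer 1: twice in January ...
        (1, '2014-01-15 09:00:00'),
        (1, '2014-03-05 09:00:00'),  -- ... and again in March
        (2, '2014-01-20 09:00:00'),  -- customer 2: January only
        (3, '2014-03-01 09:00:00');  -- customer 3: March only
""")

# Customers seen in interval A that also have at least one row in interval B.
(both,) = conn.execute("""
    SELECT COUNT(DISTINCT a.customer_id)
    FROM purchases a
    WHERE a.time >= '2014-01-01' AND a.time < '2014-02-01'
      AND EXISTS (
          SELECT 1
          FROM purchases b
          WHERE b.customer_id = a.customer_id
            AND b.time >= '2014-02-01' AND b.time < '2014-05-01'
      )
""").fetchone()

print(both)
```

DISTINCT matters here because customer 1 has two January rows; without it that customer would be counted twice.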

Select sum of payments in period where first payment was made in that period

I have a table of payments with the columns 'user_id - created - price'.
I need to calculate the sum of all payments made in the period 14-16 March where the first payment was made in the same period.
The best solution I came up with is
SELECT user_id, SUM(price/100) FROM payment WHERE
(DATE(created) BETWEEN '2014-03-14' AND '2014-03-16')
AND (MIN(created) BETWEEN '2014-03-14 00:00:01' AND '2014-03-16 23:59:59')
GROUP BY user_id
but I got "Invalid use of group function"
Update:
This request solved my problem
SELECT p.user_id, SUM(p.price/100) FROM payment p WHERE
(DATE(created) BETWEEN '2014-03-14' AND '2014-03-16')
AND (SELECT MIN(DATE(created)) FROM payment WHERE user_id = p.user_id) BETWEEN '2014-03-14' AND '2014-03-16'
GROUP BY p.user_id
You can't use the MIN() function in the WHERE clause; you need to use HAVING to filter on aggregate functions. You can rewrite your query as below:
SELECT user_id, SUM(price/100)
FROM payment
WHERE
created BETWEEN '2014-03-14 00:00:00' AND '2014-03-16 23:59:59'
GROUP BY user_id
HAVING MIN(created) BETWEEN '2014-03-14 00:00:00' AND '2014-03-16 23:59:59'
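One subtlety worth checking: WHERE is applied before grouping, so when the period filter sits in WHERE, MIN(created) in HAVING only sees rows already inside the period. The sketch below contrasts that with a variant that computes the minimum over all of a user's payments and restricts only the sum to the period. It uses Python's sqlite3 in place of MySQL, and the rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payment (user_id INTEGER, created TEXT, price INTEGER);
    INSERT INTO payment VALUES
        (1, '2014-03-14 10:00:00',  500),  -- user 1: first payment inside the period
        (1, '2014-03-16 12:00:00',  300),
        (1, '2014-03-20 09:00:00', 1000),  -- outside the period, must not be summed
        (2, '2014-03-01 09:00:00',  200),  -- user 2: first payment BEFORE the period
        (2, '2014-03-15 09:00:00',  400);
""")

# Period filter in WHERE: MIN(created) is computed on the filtered rows only,
# so user 2 slips through even though the first payment was on March 1.
filtered_first = conn.execute("""
    SELECT user_id, SUM(price / 100.0)
    FROM payment
    WHERE created >= '2014-03-14' AND created < '2014-03-17'
    GROUP BY user_id
    HAVING MIN(created) >= '2014-03-14' AND MIN(created) < '2014-03-17'
    ORDER BY user_id
""").fetchall()

# No WHERE filter: MIN(created) covers every payment of the user,
# and the CASE expression limits the sum to the period.
true_first = conn.execute("""
    SELECT user_id,
           SUM(CASE WHEN created >= '2014-03-14' AND created < '2014-03-17'
                    THEN price / 100.0 ELSE 0 END) AS total
    FROM payment
    GROUP BY user_id
    HAVING MIN(created) >= '2014-03-14' AND MIN(created) < '2014-03-17'
    ORDER BY user_id
""").fetchall()

print(filtered_first)
print(true_first)
```

Which variant is right depends on whether "first payment" means the user's first payment ever or the first one inside the period; the correlated subquery in the asker's update corresponds to the second query.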

MySQL - select data from database between two dates

I have saved the dates of a user's registration as a datetime, for instance 2011-12-06 10:45:36. I ran this query and expected this item (2011-12-06 10:45:36) to be selected:
SELECT `users`.* FROM `users` WHERE created_at >= '2011-12-01' AND
created_at <= '2011-12-06'
But it is not. Is there an elegant way to select this item? My first idea was something like 2011-12-06 + 1, but that doesn't look very nice.
Your problem is that the short version of dates uses midnight as the default. So your query is actually:
SELECT users.* FROM users
WHERE created_at >= '2011-12-01 00:00:00'
AND created_at <= '2011-12-06 00:00:00'
This is why you aren't seeing the record for 10:45.
Change it to:
SELECT users.* FROM users
WHERE created_at >= '2011-12-01'
AND created_at < '2011-12-07'
You can also use:
SELECT users.* from users
WHERE created_at >= '2011-12-01'
AND created_at < date_add('2011-12-01', INTERVAL 6 DAY)
Which will select all users in the same interval you are looking for.
You might also find the BETWEEN operator more readable (note that BETWEEN includes both endpoints):
SELECT users.* from users
WHERE created_at BETWEEN '2011-12-01' AND date_add('2011-12-01', INTERVAL 6 DAY);
SELECT users.* FROM users WHERE created_at BETWEEN '2011-12-01' AND '2011-12-07';
You need to use '2011-12-07' as the end point, as a date without a time defaults to time 00:00:00.
So what you have actually written is interpreted as:
SELECT users.*
FROM users
WHERE created_at >= '2011-12-01 00:00:00'
AND created_at <= '2011-12-06 00:00:00'
And your time stamp is: 2011-12-06 10:45:36 which is not between those points.
Change this to:
SELECT users.*
FROM users
WHERE created_at >= '2011-12-01' -- Implied 00:00:00
AND created_at < '2011-12-07' -- Implied 00:00:00 and smaller than
-- thus any time on 06
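The difference between the closed range ending on '2011-12-06' and the half-open range ending before '2011-12-07' can be checked with a tiny sketch. It uses Python's sqlite3 with an in-memory database as a stand-in for MySQL. SQLite compares these values as text rather than converting them to datetimes, so the mechanism differs, but for values in this format the outcome is the same; treat that equivalence as an assumption of the sketch. The rows are made up.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, created_at TEXT);
    INSERT INTO users VALUES
        (1, '2011-12-06 10:45:36'),  -- the row the asker expected to see
        (2, '2011-12-07 00:00:00'),  -- exactly midnight of the following day
        (3, '2011-11-30 23:59:59');  -- just before the range starts
""")

# Closed range ending at '2011-12-06' (midnight): misses everything later that day.
closed = [r[0] for r in conn.execute("""
    SELECT id FROM users
    WHERE created_at >= '2011-12-01' AND created_at <= '2011-12-06'
    ORDER BY id
""")]

# Half-open range: everything on the 6th, nothing from the 7th.
half_open = [r[0] for r in conn.execute("""
    SELECT id FROM users
    WHERE created_at >= '2011-12-01' AND created_at < '2011-12-07'
    ORDER BY id
""")]

print(closed)
print(half_open)
```

The half-open form also avoids picking up a row stamped exactly at midnight on the 7th, which a `<=` comparison against '2011-12-07' would include.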
Another alternative is to use the DATE() function on the left-hand operand, as shown below:
SELECT users.* FROM users WHERE DATE(created_at) BETWEEN '2011-12-01' AND '2011-12-06'
Have you tried before and after rather than >= and <=? Also, is this a date or a timestamp?
Searching for created_at <= '2011-12-06' will find any records that were created at or before midnight on 2011-12-06. You want to search for created_at < '2011-12-07'.
Maybe using BETWEEN is better. It worked for me to get the range first and then filter it.
You can use the MySQL DATE function like below.
For instance, if you want results from 2017-09-05 to 2017-09-09:
SELECT DATE(timestamp_field) as date FROM stocks_annc WHERE DATE(timestamp_field) >= '2017-09-05' AND DATE(timestamp_field) <= '2017-09-09'
Make sure to wrap the dates in single quotes ('').
Edit:
A better solution would be this. It makes sure the index is used, if one exists.
select date(timestamp_field) as date from stocks_annc where timestamp_field >= '2022-01-01 00:00:00' and timestamp_field <= '2022-01-10 00:00:00'
Hope this helps.