I have two queries that retrieve records from 2 different tables that are almost alike and I need to merge them together.
Both tables have a created_date column of type datetime. I cast it to date because I want to group and order by date only; I don't need the time.
First query:
select cast(created_date as date) the_date, count(*)
from question
where user_id = 2
group by the_date
order by the_date;
+------------+----------+
| the_date | count(*) |
+------------+----------+
| 2021-01-02 | 1 |
| 2021-02-10 | 1 |
| 2021-02-14 | 5 | -- this line contains a mutual date
| 2021-03-16 | 1 |
| 2021-03-26 | 3 |
| 2021-03-27 | 23 |
| 2021-03-28 | 5 |
| 2021-03-29 | 1 |
+------------+----------+
Second query:
select cast(created_date as date) the_date, count(*)
from answer
where user_id = 2
group by the_date
order by the_date;
+------------+----------+
| the_date | count(*) |
+------------+----------+
| 2021-02-08 | 2 |
| 2021-02-14 | 1 | -- this line contains a mutual date
| 2021-04-05 | 5 |
| 2021-04-06 | 2 |
+------------+----------+
What I need is to merge them like this:
+------------+---------------+---------------+
| the_date | count(query1) | count(query2) |
+------------+---------------+---------------+
| 2021-01-02 | 1 | 0 | -- count(query2) is 0 bc. it's not in the second query
| 2021-02-08 | 0 | 2 | -- count(query1) is 0 bc. it's not in the first query
| 2021-02-10 | 1 | 0 |
| 2021-02-14 | 5 | 1 | -- mutual date
| 2021-03-16 | 1 | 0 |
| 2021-03-26 | 3 | 0 |
| 2021-03-27 | 23 | 0 |
| 2021-03-28 | 5 | 0 |
| 2021-03-29 | 1 | 0 |
| 2021-04-05 | 0 | 5 |
| 2021-04-06 | 0 | 2 |
+------------+---------------+---------------+
Basically what I need is to have all dates together and for each date to have the corresponding values from those two queries.
Try something like this:
SELECT the_date, max(cnt1) AS question_count, max(cnt2) AS answer_count
FROM (
select cast(created_date as date) the_date, count(*) AS cnt1, 0 AS cnt2
from question
where user_id = 2
group by the_date
UNION ALL
select cast(created_date as date) the_date, 0, count(*)
from answer
where user_id = 2
group by the_date
-- no ORDER BY inside the UNION branches; the result is sorted once, below
) as t1
GROUP BY the_date
ORDER BY the_date;
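To see the UNION ALL pattern run end to end, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows are made up for illustration, and DATE(...) replaces the cast because that is how SQLite truncates a datetime string; the shape of the query is otherwise the same.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (sample rows are hypothetical).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE question (user_id INTEGER, created_date TEXT);
    CREATE TABLE answer   (user_id INTEGER, created_date TEXT);
    INSERT INTO question VALUES
        (2, '2021-01-02 09:00:00'),
        (2, '2021-02-14 10:00:00'),
        (2, '2021-02-14 18:30:00');
    INSERT INTO answer VALUES
        (2, '2021-02-08 11:00:00'),
        (2, '2021-02-14 12:00:00');
""")

# Each branch fills its own counter and pads the other with 0;
# the outer GROUP BY then collapses the two rows of a shared date into one.
merged = conn.execute("""
    SELECT the_date, MAX(cnt1) AS question_count, MAX(cnt2) AS answer_count
    FROM (
        SELECT DATE(created_date) AS the_date, COUNT(*) AS cnt1, 0 AS cnt2
        FROM question WHERE user_id = 2
        GROUP BY DATE(created_date)
        UNION ALL
        SELECT DATE(created_date), 0, COUNT(*)
        FROM answer WHERE user_id = 2
        GROUP BY DATE(created_date)
    ) AS t1
    GROUP BY the_date
    ORDER BY the_date
""").fetchall()

for row in merged:
    print(row)
# ('2021-01-02', 1, 0)
# ('2021-02-08', 0, 1)
# ('2021-02-14', 2, 1)
```

MAX works here only because the padded value is 0 and counts are never negative; SUM would give the same result.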
I'm trying to get the total overdraft amount across accounts as of a past date; the goal is to get the total as it stood on the 31st of January.
I have the following tables Users and Transactions.
USERS (currently)
| user_id | name | account_balance |
|---------|---------|------------------|
| 1 | Wells | 1.00 |
| 2 | John | -10.00 |
| 3 | Sahar | -5.00 |
| 4 | Peter | 1.00 |
TRANSACTIONS (daily transactions; can go back in time)
| trans_id | user_id | amount_tendered | trans_datetime |
|------------|---------|-------------------|---------------------|
| 1 | 1 | 2 | 2021-02-16 |
| 2 | 2 | 3 | 2021-02-16 |
| 3 | 3 | 5 | 2021-02-16 |
| 4 | 4 | 2 | 2021-02-16 |
| 5 | 1 | 10 | 2021-02-15 |
So the current total overdraft amount is:
SELECT sum(account_balance) AS O_D_Amount
FROM users
WHERE account_balance < 0;
| O_D_Amount |
|------------|
| -15 |
I need help rolling this amount back to a date in the past.
Assuming overdrafts are based on the sum of transactions up to a point, you can use a subquery:
select sum(total) as total_overdraft
from (select user_id, sum(amount_tendered) as total
from transactions t
where t.trans_datetime <= ?
group by user_id
) t
where total < 0;
The ? is a parameter placeholder for the date/time you care about.
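As a rough illustration of binding that placeholder, here is a sketch with Python's sqlite3 module standing in for MySQL. The transaction rows are hypothetical (they include negative amounts so that some users actually end up overdrawn), and overdraft_as_of is a helper name invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (
        trans_id        INTEGER PRIMARY KEY,
        user_id         INTEGER,
        amount_tendered REAL,
        trans_datetime  TEXT
    );
    -- hypothetical rows, not the data from the question
    INSERT INTO transactions VALUES
        (1, 1,  10.0, '2021-01-10'),
        (2, 2, -10.0, '2021-01-20'),
        (3, 3,  -5.0, '2021-01-25'),
        (4, 2,  25.0, '2021-02-16'),
        (5, 3,  -2.0, '2021-02-16');
""")

def overdraft_as_of(conn, cutoff):
    """Sum the negative per-user balances built from transactions up to cutoff."""
    row = conn.execute("""
        SELECT SUM(total)
        FROM (
            SELECT user_id, SUM(amount_tendered) AS total
            FROM transactions
            WHERE trans_datetime <= ?
            GROUP BY user_id
        ) AS per_user
        WHERE total < 0
    """, (cutoff,)).fetchone()
    return row[0] or 0.0  # SUM over no rows is NULL, so fall back to 0

jan = overdraft_as_of(conn, '2021-01-31')  # users 2 and 3 are overdrawn: -15.0
feb = overdraft_as_of(conn, '2021-02-28')  # only user 3 is still overdrawn: -7.0
print(jan, feb)
```

If trans_datetime carries a time part, the cutoff should be the end of the day (or a strict comparison against the next day) so that transactions made on the cutoff date itself are included.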
Given a statuses table that holds information about products availability, how do I select the date that corresponds to the 1st day in the latest 20 days that the product has been active?
Yes I know the question is hard to follow. I think another way to put it would be: I want to know how many times each product has been sold in the last 20 days that it was active, meaning the product could have been active for years, but I'd only want the sales count from the latest 20 days that it had a status of "active".
It's something easily doable in the server-side (i.e. getting any collection of products from the DB, iterating them, performing n+1 queries on the statuses table, etc), but I have hundreds of thousands of items so it's imperative to do it in SQL for performance reasons.
table : products
+-------+-----------+
| id | name |
+-------+-----------+
| 1 | Apple |
| 2 | Banana |
| 3 | Grape |
+-------+-----------+
table : statuses
+-------+-------------+---------------+---------------+
| id | name | product_id | created_at |
+-------+-------------+---------------+---------------+
| 1 | active | 1 | 2018-01-01 |
| 2 | inactive | 1 | 2018-02-01 |
| 3 | active | 1 | 2018-03-01 |
| 4 | inactive | 1 | 2018-03-15 |
| 6 | active | 1 | 2018-04-25 |
| 7 | active | 2 | 2018-03-01 |
| 8 | active | 3 | 2018-03-10 |
| 9 | inactive | 3 | 2018-03-15 |
+-------+-------------+---------------+---------------+
table : items (ordered products)
+-------+---------------+-------------+
| id | product_id | order_id |
+-------+---------------+-------------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 1 | 3 |
| 4 | 1 | 4 |
| 5 | 1 | 5 |
| 6 | 2 | 3 |
| 7 | 2 | 4 |
| 8 | 2 | 5 |
| 9 | 3 | 5 |
+-------+---------------+-------------+
table : orders
+-------+---------------+
| id | created_at |
+-------+---------------+
| 1 | 2018-01-02 |
| 2 | 2018-01-15 |
| 3 | 2018-03-02 |
| 4 | 2018-03-10 |
| 5 | 2018-03-13 |
+-------+---------------+
I want my final results to look like this:
+-------+-----------+----------------------+--------------------------------+
| id | name | recent_sales_count | date_to_start_counting_sales |
+-------+-----------+----------------------+--------------------------------+
| 1 | Apple | 3 | 2018-01-30 |
| 2 | Banana | 0 | 2018-04-09 |
| 3 | Grape | 1 | 2018-03-10 |
+-------+-----------+----------------------+--------------------------------+
So this is what I mean by latest 20 active days for e.g. Apple:
It was last activated at '2018-04-25'. That's 4 days ago.
Before that, it was inactive since '2018-03-15', so all these days until '2018-04-25' don't count.
Before that, it was active since '2018-03-01'. That's 14 more days, until '2018-03-15'.
Before that, inactive since '2018-02-01'.
Finally, it was active since '2018-01-01', so it should only count the missing 2 days (4 + 14 + 2 = 20) backwards from '2018-02-01', resulting in date_to_start_counting_sales = '2018-01-30'.
With the '2018-01-30' date in hand, I'm then able to count Apple orders in the last 20 active days: 3.
Hope that makes sense.
Here is a fiddle with the data provided above.
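The backward walk described for Apple can be written out as a short Python sketch, just to pin down the arithmetic. It assumes "today" is 2018-04-29 (implied by "That's 4 days ago"), that the active periods are non-empty, sorted oldest first and non-overlapping, and the function name is made up for this example.

```python
from datetime import date, timedelta

def start_of_last_active_days(periods, cap_days):
    """Walk the active periods from newest to oldest until cap_days are used up.

    periods: non-empty list of (active_from, active_until), oldest first.
    Returns the date from which sales should be counted.
    """
    remaining = cap_days
    for active_from, active_until in reversed(periods):
        length = (active_until - active_from).days
        if length >= remaining:
            # The cap is reached inside this period.
            return active_until - timedelta(days=remaining)
        remaining -= length
    # Fewer than cap_days active days in total: start at the first activation.
    return periods[0][0]

today = date(2018, 4, 29)  # assumed current date
apple = [
    (date(2018, 1, 1), date(2018, 2, 1)),    # 31 days
    (date(2018, 3, 1), date(2018, 3, 15)),   # 14 days
    (date(2018, 4, 25), today),              # 4 days, still active
]
banana = [(date(2018, 3, 1), today)]
grape = [(date(2018, 3, 10), date(2018, 3, 15))]

print(start_of_last_active_days(apple, 20))   # 2018-01-30  (4 + 14 + 2 = 20)
print(start_of_last_active_days(banana, 20))  # 2018-04-09
print(start_of_last_active_days(grape, 20))   # 2018-03-10
```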
I've got a standard SQL solution that does not use any window functions, since you are on MySQL 5.
My solution requires 3 stacked views.
A CTE would have been better, but your version doesn't support it. The same goes for the stacked views: I don't like stacking views and always try to avoid it, but sometimes there is no other choice, because this MySQL version doesn't accept subqueries in the FROM clause of a view.
CREATE VIEW VIEW_product_dates AS
(
SELECT product_id, created_at AS active_date,
(
SELECT MIN(ti.created_at) -- earliest 'inactive' status after this activation
FROM statuses ti
WHERE ti.name = 'inactive' AND ta.created_at < ti.created_at AND ti.product_id = ta.product_id
) AS inactive_date
FROM statuses ta
WHERE name = 'active'
);
CREATE VIEW VIEW_product_dates_days AS
(
SELECT product_id, active_date, inactive_date, datediff(IFNULL(inactive_date, SYSDATE()),active_date) AS nb_days
FROM VIEW_product_dates
);
CREATE VIEW VIEW_product_dates_days_cumul AS
(
SELECT product_id, active_date, ifnull(inactive_date,sysdate()) AS inactive_date, nb_days,
IFNULL((SELECT SUM(V2.nb_days) + V1.nb_days
FROM VIEW_product_dates_days V2
WHERE V2.active_date >= IFNULL(V1.inactive_date, SYSDATE()) AND V1.product_id=V2.product_id
),V1.nb_days) AS cumul_days
FROM VIEW_product_dates_days V1
);
The final view produces this:
| product_id | active_date | inactive_date | nb_days | cumul_days |
|------------|----------------------|----------------------|---------|------------|
| 1 | 2018-01-01T00:00:00Z | 2018-02-01T00:00:00Z | 31 | 49 |
| 1 | 2018-03-01T00:00:00Z | 2018-03-15T00:00:00Z | 14 | 18 |
| 1 | 2018-04-25T00:00:00Z | 2018-04-29T11:28:39Z | 4 | 4 |
| 2 | 2018-03-01T00:00:00Z | 2018-04-29T11:28:39Z | 59 | 59 |
| 3 | 2018-03-10T00:00:00Z | 2018-03-15T00:00:00Z | 5 | 5 |
So it gathers every active period of every product, counts the number of days in each period, and accumulates the active days counting back from the current date (each row's cumul_days is its own nb_days plus the nb_days of all later periods).
Then we can query this final view to get the desired date for each product. I set a variable for your 20 days, so you can change that number easily if you want.
SET @cap_days = 20;
SELECT PD.id, PD.name,
SUM(CASE WHEN o.created_at > PD.date_to_start_counting_sales THEN 1 ELSE 0 END) AS recent_sales_count,
PD.date_to_start_counting_sales
FROM
(
SELECT P.*,
(CASE WHEN LowerCap.max_cumul_days IS NULL
THEN ADDDATE(IFNULL(HigherCap.min_inactive_date, SYSDATE()), (-@cap_days))
ELSE
CASE WHEN LowerCap.max_cumul_days < @cap_days AND HigherCap.min_inactive_date IS NULL
THEN ADDDATE(IFNULL(LowerCap.max_inactive_date, SYSDATE()), (-LowerCap.max_cumul_days))
ELSE ADDDATE(IFNULL(HigherCap.min_inactive_date, SYSDATE()), (LowerCap.max_cumul_days - @cap_days))
END
END) AS date_to_start_counting_sales
FROM products P
LEFT JOIN
(
SELECT product_id, MAX(cumul_days) AS max_cumul_days, MAX(inactive_date) AS max_inactive_date
FROM VIEW_product_dates_days_cumul
WHERE cumul_days <= @cap_days
GROUP BY product_id
) LowerCap ON P.id = LowerCap.product_id
LEFT JOIN
(
SELECT product_id, MIN(cumul_days) AS min_cumul_days, MIN(inactive_date) AS min_inactive_date
FROM VIEW_product_dates_days_cumul
WHERE cumul_days > @cap_days
GROUP BY product_id
) HigherCap ON P.id = HigherCap.product_id
) PD
LEFT JOIN items i ON PD.id = i.product_id
LEFT JOIN orders o ON o.id = i.order_id
GROUP BY PD.id, PD.name, PD.date_to_start_counting_sales;
Returns
| id | name | recent_sales_count | date_to_start_counting_sales |
|----|--------|--------------------|------------------------------|
| 1 | Apple | 3 | 2018-01-30T00:00:00Z |
| 2 | Banana | 0 | 2018-04-09T20:43:23Z |
| 3 | Grape | 1 | 2018-03-10T00:00:00Z |
FIDDLE : http://sqlfiddle.com/#!9/804f52/24
Not sure which version of MySQL you're working with, but if you can use 8.0, that version added a lot of functionality that makes this kind of thing more doable (CTEs, row_number(), partition, etc.).
My recommendation would be to create a view like the one in this DB-Fiddle example, call the view on the server side, and iterate programmatically. There are ways of doing it entirely in SQL, but it would be a bear to write and test, and would likely be less efficient.
Assumptions:
Products cannot be sold during inactive date ranges
Statuses table will always alternate status active/inactive/active for each product. I.e. no date ranges where a certain product is both active and inactive.
View Results:
+------------+-------------+------------+-------------+
| product_id | active_date | end_date | days_active |
+------------+-------------+------------+-------------+
| 1 | 2018-01-01 | 2018-02-01 | 31 |
+------------+-------------+------------+-------------+
| 1 | 2018-03-01 | 2018-03-15 | 14 |
+------------+-------------+------------+-------------+
| 1 | 2018-04-25 | 2018-04-29 | 4 |
+------------+-------------+------------+-------------+
| 2 | 2018-03-01 | 2018-04-29 | 59 |
+------------+-------------+------------+-------------+
| 3 | 2018-03-10 | 2018-03-15 | 5 |
+------------+-------------+------------+-------------+
View:
CREATE OR REPLACE VIEW days_active AS (
WITH active_rn
AS (SELECT *, Row_number()
OVER ( partition BY NAME, product_id
ORDER BY created_at) AS rownum
FROM statuses
WHERE name = 'active'),
inactive_rn
AS (SELECT *, Row_number()
OVER ( partition BY NAME, product_id
ORDER BY created_at) AS rownum
FROM statuses
WHERE name = 'inactive')
SELECT x1.product_id,
x1.created_at AS active_date,
CASE WHEN x2.created_at IS NULL
THEN Curdate()
ELSE x2.created_at
END AS end_date,
CASE WHEN x2.created_at IS NULL
THEN Datediff(Curdate(), x1.created_at)
ELSE Datediff(x2.created_at,x1.created_at)
END AS days_active
FROM active_rn x1
LEFT OUTER JOIN inactive_rn x2
ON x1.rownum = x2.rownum
AND x1.product_id = x2.product_id ORDER BY
x1.product_id);
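For readers without MySQL 8.0 at hand, the pairing that the view performs (the n-th 'active' row of a product matched with its n-th 'inactive' row, open periods closed with the current date) can be sketched in plain Python. This is only an illustration under the two assumptions listed above; the function name is invented, and 2018-04-29 is used in place of Curdate() so the output matches the view results.

```python
from collections import defaultdict
from datetime import date

# (name, product_id, created_at) rows from the statuses table in the question
statuses = [
    ("active",   1, date(2018, 1, 1)),
    ("inactive", 1, date(2018, 2, 1)),
    ("active",   1, date(2018, 3, 1)),
    ("inactive", 1, date(2018, 3, 15)),
    ("active",   1, date(2018, 4, 25)),
    ("active",   2, date(2018, 3, 1)),
    ("active",   3, date(2018, 3, 10)),
    ("inactive", 3, date(2018, 3, 15)),
]

def active_periods(statuses, today):
    """Pair the n-th activation of each product with its n-th deactivation."""
    actives, inactives = defaultdict(list), defaultdict(list)
    # Sorting by created_at plays the role of ROW_NUMBER() ... ORDER BY created_at.
    for name, product_id, created_at in sorted(statuses, key=lambda s: s[2]):
        target = actives if name == "active" else inactives
        target[product_id].append(created_at)

    rows = []
    for product_id in sorted(actives):
        ends = inactives[product_id]
        for n, start in enumerate(actives[product_id]):
            # No matching 'inactive' row means the product is still active.
            end = ends[n] if n < len(ends) else today
            rows.append((product_id, start, end, (end - start).days))
    return rows

rows = active_periods(statuses, today=date(2018, 4, 29))
for row in rows:
    print(row)
```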
+-----------+------------+------------+
| ACCOUNT | PAID_DATE | DUE_DATE |
+-----------+------------+------------+
| 103240005 | 2010-07-22 | 2009-11-30 |
| 103240005 | 2010-07-22 | 2007-09-30 |
| 103240005 | 2010-07-22 | 2008-09-30 |
| 103240006 | 2010-07-22 | 2009-09-30 |
| 103240006 | 2010-07-22 | 2007-07-22 |
| 103240007 | 2010-07-22 | 2008-07-22 |
| 103240008 | 2010-07-22 | 2009-08-31 |
| 103240009 | 2010-07-22 | 2007-12-31 |
| 103240009 | 2010-07-22 | 2008-12-31 |
| 103240005 | 2010-07-22 | 2009-12-31 |
+-----------+------------+------------+
The above sample dataset is from a banking application I am building.
I would like to get, per account, the number of records where the payment was made on time, i.e. DATEDIFF(DUE_DATE, PAID_DATE) = 0. Please note that there are multiple entries per account.
Here is my problematic query:
select ACCOUNT_NUMBER, count(DATEDIFF(PAID_DATE, DUE_DATE) as diff) as diff_count
from TRANSACTIONS
where diff=0 group by ACCOUNT_NUMBER;
Assuming the paid_date and due_date are of date type, you can use:
select account_number,
sum(paid_date = due_date) as diff_count
from transactions
group by account_number;
If the two dates are equal, the comparison evaluates to true, which MySQL treats as 1 (and false as 0), so the sum counts the matching rows.
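A quick way to convince yourself of the boolean-sum trick is to run it against a throwaway database. The sketch below uses Python's sqlite3 module, where comparisons also evaluate to 0 or 1; the rows are hypothetical, since the sample in the question contains no on-time payments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (account_number INTEGER, paid_date TEXT, due_date TEXT);
    -- hypothetical rows: some paid exactly on the due date, some not
    INSERT INTO transactions VALUES
        (103240005, '2010-07-22', '2010-07-22'),
        (103240005, '2010-07-22', '2009-11-30'),
        (103240006, '2010-07-22', '2010-07-22'),
        (103240006, '2010-07-22', '2010-07-22'),
        (103240007, '2010-07-22', '2008-07-22');
""")

# paid_date = due_date yields 1 or 0 per row, so SUM counts the on-time rows.
rows = conn.execute("""
    SELECT account_number, SUM(paid_date = due_date) AS diff_count
    FROM transactions
    GROUP BY account_number
    ORDER BY account_number
""").fetchall()

print(rows)  # [(103240005, 1), (103240006, 2), (103240007, 0)]
```

Unlike a WHERE filter, this keeps accounts with no on-time payments in the result, with a count of 0.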
EDIT:
You can add further aggregates as needed. For example, to also count the payments that are overdue by 10 or more days, you can use:
select account_number,
sum(paid_date = due_date) as diff_count,
sum(datediff(paid_date, due_date) >= 10) as overdue
from transactions
group by account_number;
I have a MySQL table that looks like this:
+--------+------------+------------------+
| id | account_id | posted_at |
+--------+------------+------------------+
| 1 | 1 | 2013-10-05 23:09 |
| 2 | 1 | 2013-10-07 14:24 |
| 3 | 1 | 2013-10-07 01:17 |
| 4 | 1 | 2013-10-09 06:58 |
+--------+------------+------------------+
For a particular account_id (in this case 1), I want to return this (for dates in the current month):
+--------+------------+
| count | date |
+--------+------------+
| 0 | 2013-10-01 |
| 0 | 2013-10-02 |
| 0 | 2013-10-03 |
| 0 | 2013-10-04 |
| 1 | 2013-10-05 |
| 0 | 2013-10-06 |
| 2 | 2013-10-07 |
| 0 | 2013-10-08 |
| 1 | 2013-10-09 |
+--------+------------+
I have a SQL query that returns the COUNTS for each date within this month.
SELECT
DATE(posted_at) AS formatted_date,
COUNT(id) AS count
FROM entries
WHERE account_id = 1
AND MONTH(DATE(posted_at)) = MONTH(NOW())
GROUP BY formatted_date
ORDER BY formatted_date ASC
It's just returning this:
+--------+------------+
| count | date |
+--------+------------+
| 1 | 2013-10-05 |
| 2 | 2013-10-07 |
| 1 | 2013-10-09 |
+--------+------------+
Of course, COUNT doesn't return anything for dates that have no data. I want the result to have a zero for dates with no data.
I've read that you should create a join table of all possible dates. Is this the only way?
You can try something like this:
Declare @INT DATETIME = null
SELECT COUNT( CASE WHEN @INT IS NOT NULL THEN @INT ELSE NULL END)
I'm not too sure about MySQL, but in SQL Server it would be something like this:
SELECT
DATE(posted_at) AS formatted_date,
COUNT( CASE WHEN posted_at IS NOT NULL THEN posted_at ELSE NULL END ) AS [count]
FROM entries
WHERE account_id = 1
AND MONTH(DATE(posted_at)) = MONTH(NOW())
GROUP BY formatted_date
ORDER BY formatted_date ASC
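For comparison, the table-of-dates approach mentioned in the question can be sketched as follows. This uses Python's sqlite3 module as a stand-in for MySQL; the calendar table name and the fixed 2013-10-01 to 2013-10-09 range are assumptions made for the example.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entries (id INTEGER PRIMARY KEY, account_id INTEGER, posted_at TEXT);
    INSERT INTO entries VALUES
        (1, 1, '2013-10-05 23:09'),
        (2, 1, '2013-10-07 14:24'),
        (3, 1, '2013-10-07 01:17'),
        (4, 1, '2013-10-09 06:58');
    CREATE TABLE calendar (day TEXT PRIMARY KEY);  -- hypothetical helper table
""")

# Fill the helper table with one row per day in the range of interest.
first, last = date(2013, 10, 1), date(2013, 10, 9)
days = [(first + timedelta(days=n)).isoformat() for n in range((last - first).days + 1)]
conn.executemany("INSERT INTO calendar VALUES (?)", [(d,) for d in days])

# LEFT JOIN keeps every calendar day; COUNT(e.id) ignores the NULLs of unmatched days.
# The account filter lives in the ON clause so that empty days are not filtered out.
rows = conn.execute("""
    SELECT COUNT(e.id) AS count, c.day AS date
    FROM calendar c
    LEFT JOIN entries e
           ON DATE(e.posted_at) = c.day
          AND e.account_id = 1
    GROUP BY c.day
    ORDER BY c.day
""").fetchall()

for row in rows:
    print(row)
```

Moving the account_id condition into a WHERE clause would turn the LEFT JOIN back into an inner join in effect and drop the zero-count days again.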