SQL - narrowing terms by a differencing method - mysql

I have a SQL database and need the average of a column per group, but only for groups where the difference between the max and min of that group is above a certain threshold.
So, for interest's sake, say we have the following:
+------------+------+---------+
| Date | Name | Number |
+------------+------+---------+
| 2017-01-03 | Dude | 1000000 |
| 2017-01-03 | Dude | 2000000 |
| 2017-01-04 | Dude | 7000000 |
| 2017-01-04 | Dude | 8750000 |
+------------+------+---------+
I now want to take the averages for date 2017-01-03, but only if the difference between the max and min number for that day is above/below X. Of course my actual table is much larger, so removing the data and looping in VBA, for instance, is not helpful.
My ideal output would be:
+------------+------+---------+
| Date | Name | AVGnum |
+------------+------+---------+
| 2017-01-03 | Dude | X |
| 2017-01-04 | Dude | Y |
+------------+------+---------+
Where X and Y are the averages of the numbers for each day, returned if and only if the difference between the max and min on that day is above a specified threshold.
Thanks a lot!!!!

Something like this:
select date, name,
(case when max(num) - min(num) > @X then avg(num) end) as avgnum
from t
group by date, name;
This puts NULLs in the places where the difference does not meet the condition.
If you want to filter out the rows, use having instead:
select date, name, avg(num) as avgnum
from t
group by date, name
having max(num) - min(num) > @X
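To make the HAVING version concrete, here is a minimal sketch run against the sample rows from the question. The table name `t_demo`, the column name `num`, and the threshold value of 1500000 are assumptions chosen for illustration; the threshold is held in a user variable `@X`.

```sql
-- Hypothetical demo table mirroring the question's sample data.
CREATE TABLE t_demo (
  date DATE        NOT NULL,
  name VARCHAR(50) NOT NULL,
  num  INT         NOT NULL
);

INSERT INTO t_demo VALUES
  ('2017-01-03', 'Dude', 1000000),
  ('2017-01-03', 'Dude', 2000000),
  ('2017-01-04', 'Dude', 7000000),
  ('2017-01-04', 'Dude', 8750000);

-- Assumed threshold: keep only days whose max-min spread exceeds 1,500,000.
SET @X := 1500000;

SELECT date, name, AVG(num) AS avgnum
FROM t_demo
GROUP BY date, name
HAVING MAX(num) - MIN(num) > @X;
-- 2017-01-03 has a spread of 1,000,000 and is dropped.
-- 2017-01-04 has a spread of 1,750,000 and is returned with avgnum = 7875000.0000
```

Swapping `>` for `<` in the HAVING clause keeps the days whose spread is below the threshold instead.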

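Something like:
SELECT t.Date,
t.Name,
AVG(t.Number) AS AVGnum
FROM `table` t
WHERE t.Date = '2017-01-03'
GROUP BY t.Date,
t.Name
HAVING MAX(t.Number) - MIN(t.Number) > X -- use < X to keep days whose spread is below the threshold
Note that `table` is a reserved word in MySQL, so it has to be quoted with backticks if that really is the table name.
Be aware that HAVING filters whole groups: a day is returned only if its max-min spread meets the condition, and AVG is then computed over all of that day's rows.
HAVING is where you can filter on aggregate functions after the GROUP BY; WHERE cannot reference them.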

Related

How to group by year from a concatenated column

I have a MySQL table like this, where the id is the date in Ymd format (of the moment the row was inserted) concatenated with an incremental id.
| id | weight |
| 20200128001 | 100 |
| 20200601002 | 250 |
| 20201208003 | 300 |
| 20210128001 | 150 |
| 20210601002 | 200 |
| 20211208003 | 350 |
To sum 'weight' for a single year I'm running:
SELECT sum(weight) as weight FROM `table` WHERE id LIKE '2020%';
which in this case results in:
650
How can I get a table of weights by year, instead of running a query for every possible year? In this case the result would be:
| date | weight |
| 2020 | 650 |
| 2021 | 700 |
Use one of MySQL's string functions, such as LEFT():
SELECT LEFT(id,4) as Year, SUM(weight) as Weight
FROM `table`
GROUP BY LEFT(id,4)
ORDER BY LEFT(id,4)
And if you want to limit the results to just those 2 years:
SELECT LEFT(id,4) as Year, SUM(weight) as Weight
FROM `table`
WHERE LEFT(id,4) IN ('2020', '2021')
GROUP BY LEFT(id,4)
ORDER BY LEFT(id,4)
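If you later need to group by month or quarter as well, one possible variation is to parse the first eight characters of the id into a real date and apply date functions to it. This is only a sketch: the table name `weights_demo` and the alias `yr` are made up for the example, and it assumes every id really does start with a valid Ymd date.

```sql
-- Hypothetical demo table holding the question's sample rows.
CREATE TABLE weights_demo (
  id     BIGINT PRIMARY KEY,
  weight INT NOT NULL
);

INSERT INTO weights_demo VALUES
  (20200128001, 100),
  (20200601002, 250),
  (20201208003, 300),
  (20210128001, 150),
  (20210601002, 200),
  (20211208003, 350);

-- LEFT(id, 8) yields e.g. '20200128', which STR_TO_DATE parses as 2020-01-28.
SELECT YEAR(STR_TO_DATE(LEFT(id, 8), '%Y%m%d')) AS yr,
       SUM(weight) AS weight
FROM weights_demo
GROUP BY yr
ORDER BY yr;
-- 2020 | 650
-- 2021 | 700
```

Replacing YEAR(...) with DATE_FORMAT(..., '%Y-%m') would give one row per month instead.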

Calculate average, minimum, maximum interval between date

I am trying to do this with SQL. I have a transaction table which contains transaction_date. After grouping by date, I got this list:
| transaction_date |
| 2019-03-01 |
| 2019-03-04 |
| 2019-03-05 |
| ... |
From these 3 transaction dates, I want to achieve:
Average = ((4-1) + (5-4)) / 2 = 2 days (calculate DATEDIFF every single date)
Minimum = 1 day
Maximum = 3 days
Is there a good way to do this in a single query, before I resort to iterating over all of them with WHILE?
Thanks in advance
If your MySQL version doesn't support the LAG or LEAD functions, you can use a correlated subquery to fetch the next date for each row, then use DATEDIFF to get the gap between the two dates.
Query 1:
SELECT avg(diffDt),min(diffDt),MAX(diffDt)
FROM (
SELECT DATEDIFF((SELECT transaction_date
FROM T tt
WHERE tt.transaction_date > t1.transaction_date
ORDER BY tt.transaction_date
LIMIT 1
),transaction_date) diffDt
FROM T t1
) t1
Results:
| avg(diffDt) | min(diffDt) | MAX(diffDt) |
|-------------|-------------|-------------|
| 2 | 1 | 3 |
If your MySQL version is 8.0 or higher, you can use the LEAD window function instead of the subquery.
Query #1
SELECT avg(diffDt),min(diffDt),MAX(diffDt)
FROM (
SELECT DATEDIFF(LEAD(transaction_date) OVER(ORDER BY transaction_date),transaction_date) diffDt
FROM T t1
) t1;
| avg(diffDt) | min(diffDt) | MAX(diffDt) |
| ----------- | ----------- | ----------- |
| 2 | 1 | 3 |
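Since the question mentions that the date list comes from grouping a transaction table, here is a sketch of the LEAD approach applied directly to a table with repeated dates (MySQL 8.0+). The table name `tx_dates_demo` and the aliases are assumptions for illustration. The dates are de-duplicated first so that repeated dates do not produce gaps of 0, and the last date has no successor, so its gap is NULL and is ignored by the aggregates.

```sql
-- Hypothetical demo table; 2019-03-01 appears twice on purpose.
CREATE TABLE tx_dates_demo (
  transaction_date DATE NOT NULL
);

INSERT INTO tx_dates_demo VALUES
  ('2019-03-01'), ('2019-03-01'), ('2019-03-04'), ('2019-03-05');

SELECT AVG(gap) AS avg_gap, MIN(gap) AS min_gap, MAX(gap) AS max_gap
FROM (
  -- One row per distinct date, each paired with the next distinct date.
  SELECT DATEDIFF(LEAD(transaction_date) OVER (ORDER BY transaction_date),
                  transaction_date) AS gap
  FROM (SELECT DISTINCT transaction_date FROM tx_dates_demo) d
) g;
-- gaps are 3, 1 and NULL, giving avg_gap = 2.0000, min_gap = 1, max_gap = 3
```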

How to get the average time between multiple dates

What I'm trying to do is bucket my customers based on their transaction frequency. I have the date recorded for every time they transact, but I can't work out how to get the average delta between each date. What I effectively want is a table showing me:
| User | Average Frequency
| 1 | 15
| 2 | 15
| 3 | 35
...
The data I currently have is formatted like this:
| User | Transaction Date
| 1 | 2018-01-01
| 1 | 2018-01-15
| 1 | 2018-02-01
| 2 | 2018-06-01
| 2 | 2018-06-18
| 2 | 2018-07-01
| 3 | 2019-01-01
| 3 | 2019-02-05
...
So basically, each customer will have multiple transactions and I want to understand how to get the delta between each date and then average of the deltas.
I know the DATEDIFF function and how it works, but I can't work out how to split the transactions up. I also know that an offset function is available in tools like Looker, but I don't know the syntax behind it.
Thanks
In MySQL 8+ you can use LAG to get a delayed Transaction Date and then use DATEDIFF to get the difference between two consecutive dates. You can then take the average of those values:
SELECT User, AVG(delta) AS `Average Frequency`
FROM (SELECT User,
DATEDIFF(`Transaction Date`, LAG(`Transaction Date`) OVER (PARTITION BY User ORDER BY `Transaction Date`)) AS delta
FROM transactions) t
GROUP BY User
Output:
User Average Frequency
1 15.5
2 15
3 35
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(user INT NOT NULL
,transaction_date DATE
,PRIMARY KEY(user,transaction_date)
);
INSERT INTO my_table VALUES
(1,'2018-01-01'),
(1,'2018-01-15'),
(1,'2018-02-01'),
(2,'2018-06-01'),
(2,'2018-06-18'),
(2,'2018-07-01'),
(3,'2019-01-01'),
(3,'2019-02-05');
SELECT user
, AVG(delta) avg_delta
FROM
( SELECT x.*
, DATEDIFF(x.transaction_date,MAX(y.transaction_date)) delta
FROM my_table x
JOIN my_table y
ON y.user = x.user
AND y.transaction_date < x.transaction_date
GROUP
BY x.user
, x.transaction_date
) a
GROUP
BY user;
+------+-----------+
| user | avg_delta |
+------+-----------+
| 1 | 15.5000 |
| 2 | 15.0000 |
| 3 | 35.0000 |
+------+-----------+
I don't know what to say other than use a GROUP BY.
SELECT User, AVG(DATEDIFF(...))
FROM ...
GROUP BY User
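One way a plain GROUP BY can be enough: the gaps between consecutive dates add up to the span from the first date to the last, so the average gap is that span divided by the number of gaps (row count minus one). The sketch below assumes one row per user per date and uses a hypothetical table `tx_demo`; it needs no window functions, so it should also work on MySQL versions before 8.0.

```sql
-- Hypothetical demo table with the question's sample rows.
CREATE TABLE tx_demo (
  user             INT  NOT NULL,
  transaction_date DATE NOT NULL,
  PRIMARY KEY (user, transaction_date)
);

INSERT INTO tx_demo VALUES
  (1, '2018-01-01'), (1, '2018-01-15'), (1, '2018-02-01'),
  (2, '2018-06-01'), (2, '2018-06-18'), (2, '2018-07-01'),
  (3, '2019-01-01'), (3, '2019-02-05');

-- Average gap = (last date - first date) / (number of gaps).
-- NULLIF avoids dividing by zero for users with a single transaction.
SELECT user,
       DATEDIFF(MAX(transaction_date), MIN(transaction_date))
         / NULLIF(COUNT(*) - 1, 0) AS avg_delta
FROM tx_demo
GROUP BY user;
-- user 1: 31 / 2 = 15.5000
-- user 2: 30 / 2 = 15.0000
-- user 3: 35 / 1 = 35.0000
```

This shortcut only gives the average; the minimum or maximum gap still needs the per-row deltas from LAG or a self-join.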

How to select till the sum reach some value

Given this table:
#table stock
+-------+----------+
| id | stock |
+-------+----------+
| 1 | 20 |
+-------+----------+
| 2 | 25 |
+-------+----------+
| 3 | 10 |
+-------+----------+
| 4 | 20 |
+-------+----------+
#note: this is an arbitrary random data
How can I keep selecting rows from the table, ordered by id ASC, until the sum() of the stock column reaches some value or goes a little higher?
For example, I want to select rows from the table until the sum of stock reaches 50, so the result will be:
#result 3 rows
+-------+----------+
| id | stock |
+-------+----------+
| 1 | 20 |
+-------+----------+
| 2 | 25 |
+-------+----------+
| 3 | 10 |
+-------+----------+
The sum of stock is now 55, which is the first running total at or above 50. If we also take the next row (id 4), the sum goes further above 50, and if we remove row id 3, the sum is 45, which is less than the 50 I want.
I can achieve this in PHP by selecting all the rows and looping through them, but I guess that would be wasteful. Is there a way to do this at a lower level and let MySQL do it for me with a SQL query?
Thank you, and forgive me if I messed something up; I'm new to programming.
You need a cumulative sum for this to work. One method uses variables:
select t.*
from (select t.*, (@sum := @sum + stock) as cume_stock
from t cross join
(select @sum := 0) params
order by id
) t
where cume_stock < 50 or (cume_stock >= 50 and cume_stock - stock < 50);
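On MySQL 8.0 or later, the same cumulative sum can be written with a window function, which avoids assigning user variables inside a SELECT (deprecated in recent 8.0 releases). The following is a sketch under that assumption; the table name `stock_demo` is made up for the example, and the target of 50 is taken from the question.

```sql
-- Hypothetical demo table with the question's sample rows.
CREATE TABLE stock_demo (
  id    INT PRIMARY KEY,
  stock INT NOT NULL
);

INSERT INTO stock_demo VALUES (1, 20), (2, 25), (3, 10), (4, 20);

SELECT id, stock
FROM (
  -- Running total of stock in id order: 20, 45, 55, 75.
  SELECT id, stock, SUM(stock) OVER (ORDER BY id) AS cume_stock
  FROM stock_demo
) s
-- Keep a row while the total *before* it is still under the target,
-- so the row that crosses 50 is included and later rows are not.
WHERE cume_stock - stock < 50
ORDER BY id;
-- returns ids 1, 2 and 3 (running total 55)
```

The single condition `cume_stock - stock < 50` selects the same rows as the two-part WHERE clause in the variable-based query, because stock is never negative here.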

SQL foreach table and get number for duplicate data using reference date

I have a table that contains two fields name and date
+-------+------------+
| name | date |
+-------+------------+
| B | 28-09-2015 |
| A | 28-09-2015 |
| B | 29-09-2015 |
| A | 29-09-2015 |
| B | 30-09-2015 |
| A | 30-09-2015 |
| B | 01-10-2015 |
| C | 01-10-2015 |
| B | 02-10-2015 |
| B | 03-10-2015 |
| C | 03-10-2015 |
| B | 04-10-2015 |
+-------+------------+
I want to compare the current date with the latest date for each name in my data and get this table:
+-------+------------+
| name | Number |
+-------+------------+
| A | -4 day |
| C | -1 day |
| B | 0 day |
+-------+------------+
Thank you
You should group by name, get the max date for each, and use DATEDIFF with CURDATE() to get the difference in days (subtracting two dates with a minus sign does not give a day count in MySQL). Use DATE() to convert from datetime to date for the calculation.
select name, datediff(max(date(datecolumn)), curdate()) as days
from tablename
group by name
order by days
Step one is to get a list of the newest dates. You can use this with MAX(date) but that alone will just get you the newest date in the table. You can tell the database you want the newest date per name with a GROUP BY clause. In this case, GROUP BY name.
SELECT name, MAX(date)
FROM names
GROUP BY name
Now you can do some date math on MAX(date) to determine how old it is. MySQL has DATEDIFF to get the difference between two dates in days. CURRENT_DATE() gives the current date. So DATEDIFF(MAX(date), CURRENT_DATE()).
SELECT name, DATEDIFF(MAX(date), CURRENT_DATE()) as Days
FROM names
GROUP BY name
Finally, to append the "days" part, use CONCAT.
SELECT name, CONCAT(DATEDIFF(MAX(date), CURRENT_DATE()), " days") as Days
FROM names
GROUP BY name
You can play around with it in SQLFiddle.
I would recommend not doing that last part in SQL. You won't get the formatting quite right ("1 days" is bad grammar) and the data is more useful as a number. Instead, do the formatting at the point you want to display the data.
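One caveat about the steps above: the sample data shows dates as DD-MM-YYYY. If the column is actually stored as text in that format (an assumption, since the question does not give the column type), MAX() would compare the strings alphabetically, so the values need to be parsed with STR_TO_DATE first. The sketch below uses a hypothetical table `names_demo` and a fixed reference date in `@today` instead of CURRENT_DATE() so that the result is reproducible.

```sql
-- Hypothetical demo table storing the dates as DD-MM-YYYY text.
CREATE TABLE names_demo (
  name VARCHAR(10) NOT NULL,
  date VARCHAR(10) NOT NULL
);

INSERT INTO names_demo VALUES
  ('B', '28-09-2015'), ('A', '28-09-2015'),
  ('B', '29-09-2015'), ('A', '29-09-2015'),
  ('B', '30-09-2015'), ('A', '30-09-2015'),
  ('B', '01-10-2015'), ('C', '01-10-2015'),
  ('B', '02-10-2015'),
  ('B', '03-10-2015'), ('C', '03-10-2015'),
  ('B', '04-10-2015');

-- Fixed stand-in for CURRENT_DATE(), chosen to match the question's expected output.
SET @today := '2015-10-04';

SELECT name,
       DATEDIFF(MAX(STR_TO_DATE(date, '%d-%m-%Y')), @today) AS days
FROM names_demo
GROUP BY name
ORDER BY days;
-- A | -4
-- C | -1
-- B |  0
```

If the column is already a DATE or DATETIME, the STR_TO_DATE call is unnecessary and the queries above apply as written.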