I have a table of items similar to this:
id | desc | created | user
-----------------------------
1 | a... | 2015-05-23 | 1
2 | b... | 2015-05-23 | 1
3 | c... | 2015-06-23 | 1
4 | d... | 2015-07-23 | 2
5 | e... | 2015-07-23 | 1
I want to count the number of days on which each user submitted to the db. My desired result from the above example would be:
User 1: 3
User 2: 1
Please try below query, I hope it helps.
select user, count(distinct date(created)) from table_name group by user;
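SELECT COUNT(DISTINCT created) FROM items WHERE user = :user_id
This assumes your 'created' column stores only a date (no time part), so that each day appears as one distinct value. If it stores a full timestamp, you would have to truncate it to the day first, for example with COUNT(DISTINCT DATE(created)).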
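To illustrate why the date truncation matters, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The table name `items`, the column name `user_id` (used instead of `user`), and the time-of-day values are assumptions made for the example; SQLite's `date()` plays the role of MySQL's `DATE()`.

```python
import sqlite3


def count_submission_days():
    """Return {user_id: (distinct_timestamps, distinct_days)} for sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, created TEXT, user_id INTEGER)"
    )
    # Same days as the question's sample, with made-up times added (assumption).
    conn.executemany(
        "INSERT INTO items (id, created, user_id) VALUES (?, ?, ?)",
        [
            (1, "2015-05-23 09:00:00", 1),
            (2, "2015-05-23 17:30:00", 1),
            (3, "2015-06-23 08:00:00", 1),
            (4, "2015-07-23 10:00:00", 2),
            (5, "2015-07-23 12:00:00", 1),
        ],
    )
    rows = conn.execute(
        """
        SELECT user_id,
               COUNT(DISTINCT created)       AS distinct_timestamps,
               COUNT(DISTINCT date(created)) AS distinct_days
        FROM items
        GROUP BY user_id
        ORDER BY user_id
        """
    ).fetchall()
    conn.close()
    return {user_id: (stamps, days) for user_id, stamps, days in rows}


if __name__ == "__main__":
    for user_id, (stamps, days) in count_submission_days().items():
        print(f"User {user_id}: {days} day(s), {stamps} distinct timestamp(s)")
```

With timestamps in the column, counting distinct raw values overcounts user 1 (two submissions on 2015-05-23 count twice), while counting distinct dates gives the per-day figure the question asks for.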
select user,count(DISTINCT created) from tableName group by user;
Here is an explanation how COUNT(DISTINCT ) works.
What I'm trying to do is bucket my customers based on their transaction frequency. I have the date recorded for every time they transact, but I can't work out how to get the average delta between consecutive dates. What I effectively want is a table showing me:
| User | Average Frequency
| 1 | 15
| 2 | 15
| 3 | 35
...
The data I currently have is formatted like this:
| User | Transaction Date
| 1 | 2018-01-01
| 1 | 2018-01-15
| 1 | 2018-02-01
| 2 | 2018-06-01
| 2 | 2018-06-18
| 2 | 2018-07-01
| 3 | 2019-01-01
| 3 | 2019-02-05
...
So basically, each customer will have multiple transactions, and I want to understand how to get the delta between each pair of consecutive dates and then the average of those deltas.
I know the datediff function and how it works, but I can't work out how to split the transactions up. I also know that an offset function is available in tools like Looker, but I don't know the syntax behind it.
Thanks
In MySQL 8+ you can use LAG to get the previous Transaction Date for each row (per user, in date order) and then use DATEDIFF to get the difference between two consecutive dates. You can then take the average of those values:
SELECT User, AVG(delta) AS `Average Frequency`
FROM (SELECT User,
DATEDIFF(`Transaction Date`, LAG(`Transaction Date`) OVER (PARTITION BY User ORDER BY `Transaction Date`)) AS delta
FROM transactions) t
GROUP BY User
Output:
User Average Frequency
1 15.5
2 15
3 35
Demo on dbfiddle.com
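For readers who want to check the arithmetic behind the LAG/DATEDIFF/AVG query, the same steps can be sketched in plain Python. This is only an illustration of the logic, not a replacement for the SQL; the function name `average_gap_days` is made up for the example.

```python
from collections import defaultdict
from datetime import date


def average_gap_days(transactions):
    """Average number of days between consecutive transactions per user.

    `transactions` is an iterable of (user, date) pairs. Users with a single
    transaction have no gaps and are left out of the result (the SQL version
    would return NULL for them instead).
    """
    by_user = defaultdict(list)
    for user, day in transactions:
        by_user[user].append(day)

    averages = {}
    for user, days in by_user.items():
        days.sort()  # same role as ORDER BY `Transaction Date`
        # Pair each date with the previous one, like LAG() does.
        gaps = [(later - earlier).days for earlier, later in zip(days, days[1:])]
        if gaps:
            averages[user] = sum(gaps) / len(gaps)
    return averages


sample = [
    (1, date(2018, 1, 1)), (1, date(2018, 1, 15)), (1, date(2018, 2, 1)),
    (2, date(2018, 6, 1)), (2, date(2018, 6, 18)), (2, date(2018, 7, 1)),
    (3, date(2019, 1, 1)), (3, date(2019, 2, 5)),
]

if __name__ == "__main__":
    print(average_gap_days(sample))
```

For user 1 the gaps are 14 and 17 days, which is where the 15.5 in the output above comes from.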
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(user INT NOT NULL
,transaction_date DATE
,PRIMARY KEY(user,transaction_date)
);
INSERT INTO my_table VALUES
(1,'2018-01-01'),
(1,'2018-01-15'),
(1,'2018-02-01'),
(2,'2018-06-01'),
(2,'2018-06-18'),
(2,'2018-07-01'),
(3,'2019-01-01'),
(3,'2019-02-05');
SELECT user
, AVG(delta) avg_delta
FROM
( SELECT x.*
, DATEDIFF(x.transaction_date,MAX(y.transaction_date)) delta
FROM my_table x
JOIN my_table y
ON y.user = x.user
AND y.transaction_date < x.transaction_date
GROUP
BY x.user
, x.transaction_date
) a
GROUP
BY user;
+------+-----------+
| user | avg_delta |
+------+-----------+
| 1 | 15.5000 |
| 2 | 15.0000 |
| 3 | 35.0000 |
+------+-----------+
I don't know what to say other than use a GROUP BY.
SELECT User, AVG(DATEDIFF(...))
FROM ...
GROUP BY User
I realize this question has been asked quite a few times; however, I haven't managed to find a working solution for my case.
Essentially, my problem arises because MySQL doesn't allow subqueries in views.
I found a few workarounds, but they don't seem to work.
In more detail...
My first table (competitions) stores the users' competitions:
id_tournament | id_competition | id_user | result
-------------------------------------------------
1 | 1 | 1 | 10
1 | 1 | 2 | 30
1 | 2 | 1 | 20
1 | 2 | 3 | 50
1 | 3 | 2 | 90
1 | 3 | 3 | 100
1 | 3 | 4 | 85
In this example there are three competitions:
(
user1 vs. user2,
user1 vs. user3,
user2 vs. user3 vs. user4
)
My problem is that I need to define a view that gives me the winner of each competition.
Expected Result:
id_tournament | id_competition | id_winner
------------------------------------------
1 | 1 | 2
1 | 2 | 3
1 | 3 | 3
This can be solved with the query:
SELECT
id_tournament,
id_competition,
id_user as id_winner
FROM (
SELECT * FROM competitions ORDER BY result DESC
) x GROUP BY id_tournament, id_competition
This query, however, uses a subquery (not allowed in views), so my first solution was to define a 'helper view' as:
CREATE VIEW competitions_helper AS (
SELECT * FROM competitions ORDER BY result DESC
);
CREATE VIEW competition_winners AS (
SELECT
id_tournament,
id_competition,
id_user as id_winner
FROM competitions_helper GROUP BY id_tournament, id_competition
);
However, this does not seem to give the correct result.
Its result will then be:
id_tournament | id_competition | id_winner
------------------------------------------
1 | 1 | 1
1 | 2 | 1
1 | 3 | 1
What I don't understand is why it works when I use subqueries and why it gives a different result with the exact same statement in a view.
Any help is appreciated, thanks a lot.
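This is due to how MySQL handles GROUP BY. Your query selects id_user, which is neither aggregated nor listed in the GROUP BY clause.
In this case, the server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate, which is probably not what you want. The ORDER BY in the subquery or helper view does not guarantee which row's value is picked.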
I would solve the problem in this way:
CREATE VIEW competitions_helper AS (
SELECT id_tournament,
id_competition,
MAX(result) as winning_result
FROM competitions
GROUP BY id_tournament,
id_competition
);
CREATE VIEW competition_winners AS (
SELECT c.id_tournament,
c.id_competition,
c.id_user
FROM competitions c
INNER JOIN competitions_helper ch
ON ch.id_tournament = c.id_tournament
AND ch.id_competition = c.id_competition
AND ch.winning_result = c.result
);
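The two-view approach above can be tried out with the sample data from the question. The sketch below uses Python's sqlite3 module as a stand-in for MySQL (an assumption made for runnability; the view definitions are otherwise the ones shown), and aliases the final column as `id_winner` to match the expected result.

```python
import sqlite3


def competition_winners():
    """Build the two views on the sample data and return the winners."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE competitions (
            id_tournament  INTEGER,
            id_competition INTEGER,
            id_user        INTEGER,
            result         INTEGER
        );
        INSERT INTO competitions VALUES
            (1, 1, 1, 10), (1, 1, 2, 30),
            (1, 2, 1, 20), (1, 2, 3, 50),
            (1, 3, 2, 90), (1, 3, 3, 100), (1, 3, 4, 85);

        -- Helper view: the best result per competition.
        CREATE VIEW competitions_helper AS
            SELECT id_tournament, id_competition, MAX(result) AS winning_result
            FROM competitions
            GROUP BY id_tournament, id_competition;

        -- Join back to find which user achieved that result.
        CREATE VIEW competition_winners AS
            SELECT c.id_tournament, c.id_competition, c.id_user AS id_winner
            FROM competitions c
            INNER JOIN competitions_helper ch
                ON ch.id_tournament = c.id_tournament
               AND ch.id_competition = c.id_competition
               AND ch.winning_result = c.result;
        """
    )
    rows = conn.execute(
        "SELECT id_tournament, id_competition, id_winner "
        "FROM competition_winners ORDER BY id_tournament, id_competition"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in competition_winners():
        print(row)
```

Note that if two users tie for the best result in a competition, this join returns both of them.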
I'm trying to do a query to get first and last timestamp of each unique user.
Database looks like this:
| ID | EventID | Timestamp | Person | Number |
--------------------------------------------------------
| 1 | 2 | 2015-01-08 17:31:40 | 7 | 5 |
| 2 | 2 | 2015-01-08 17:35:40 | 7 | 4 |
| 3 | 2 | 2015-01-08 17:38:40 | 7 | 7 |
--------------------------------------------------------
I'm trying to put together a MySQL query that will do the following:
SUM of number field for each unique user.
Time difference (in hours) between first and last row for each unique user.
I would imagine that if I could get the first and last timestamp for each user, I should be able to use timediff to get the time difference in hours.
What I've got so far:
SELECT
person,
SUM(number) AS 'numbers_all_sum'
FROM database
WHERE eventid = 2
GROUP BY person
ORDER BY numbers_all_sum DESC
Any help would be greatly appreciated.
Something like this:
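SELECT
Person,
MIN(Timestamp) AS first_timestamp,
MAX(Timestamp) AS last_timestamp,
TIMESTAMPDIFF(HOUR, MIN(Timestamp), MAX(Timestamp)) AS hours_between,
SUM(number) AS 'numbers_all_sum'
FROM database
WHERE eventid = 2
GROUP BY person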
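As a rough, runnable illustration of the same idea, here is a sketch using Python's sqlite3 module in place of MySQL. The table name `events` and the column names `event_id` and `ts` are assumptions for the example, and since SQLite has no TIMESTAMPDIFF, the gap is computed in seconds with `strftime('%s', ...)` and converted to hours in Python.

```python
import sqlite3


def first_last_per_person():
    """Return {person: summary dict} with first/last timestamp, sum and gap."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, event_id INTEGER, "
        "ts TEXT, person INTEGER, number INTEGER)"
    )
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
        [
            (1, 2, "2015-01-08 17:31:40", 7, 5),
            (2, 2, "2015-01-08 17:35:40", 7, 4),
            (3, 2, "2015-01-08 17:38:40", 7, 7),
        ],
    )
    rows = conn.execute(
        """
        SELECT person,
               MIN(ts) AS first_ts,
               MAX(ts) AS last_ts,
               SUM(number) AS numbers_all_sum,
               CAST(strftime('%s', MAX(ts)) AS INTEGER)
                 - CAST(strftime('%s', MIN(ts)) AS INTEGER) AS seconds_between
        FROM events
        WHERE event_id = 2
        GROUP BY person
        ORDER BY numbers_all_sum DESC
        """
    ).fetchall()
    conn.close()
    return {
        person: {
            "first": first_ts,
            "last": last_ts,
            "sum": total,
            "seconds_between": seconds,
            "hours_between": seconds / 3600,
        }
        for person, first_ts, last_ts, total, seconds in rows
    }


if __name__ == "__main__":
    print(first_last_per_person())
```

Because MIN and MAX are ordinary aggregates, they can sit next to SUM in the same GROUP BY query, so no separate lookup of the first and last rows is needed.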
Having a structure like so:
+----+------------+--------+-----------+
| id | date | userid | status |
+----+------------+--------+-----------+
| 1 | 2013-06-05 | 1 | validated |
| 2 | 2013-06-05 | 2 | validated |
| 3 | 2013-06-06 | 2 | pending |
| 4 | 2013-06-07 | 1 | validated |
| 5 | 2013-06-08 | 1 | validated |
| 6 | 2013-06-08 | 1 | validated |
| 7 | 2013-06-09 | 1 | validated |
+----+------------+--------+-----------+
If I want to select users with 5 validated statuses, I can do:
SELECT userid, COUNT(status) as valid
FROM table1
WHERE status="validated"
GROUP BY userid
HAVING valid=5
Now I want to up the complexity of this query: I want to select users that have 5 validated rows starting from a given date:
SELECT userid, COUNT(status) as valid
FROM ladder_offers_actions
WHERE status="validated"
AND date > "2013-06-06"
GROUP BY userid
HAVING valid=5
This will of course return 0 users with the example given above, because it only looks at validated entries after 2013-06-06 (there are 4 of them).
I want to select users that only have 5 validated entries after a given date... example:
User 1 has 4 validated rows before 2013-06-06 and 1 validated row after 2013-06-06 - this user should be included in the select
User 2 has 3 validated rows before 2013-06-06 and 2 validated rows after 2013-06-06 - this user should be included in the select
User 3 has 5 validated rows before 2013-06-06 - this user should not be included in the select.
User 4 has 5 validated rows after 2013-06-06 - this user should be included in the select.
Hopefully this is clear enough. Essentially, I only want users that have 5 validated rows after a certain date, but the rows before that date should be included if the user didn't yet have 5 validated rows.
If I understand correctly what you are asking, how about using a subquery? Something like:
SELECT * from (
SELECT userid, COUNT(status) as valid, MAX(date) as lastdate
FROM ladder_offers_actions
WHERE status="validated"
GROUP BY userid
HAVING valid=5
) x WHERE x.lastdate > '2013-06-06'
To simplify your problem: you want to select a user if they have 5 validated rows in total and have any validated row after a certain date.
Then you can use:
SELECT userid, COUNT(status) as valid
FROM ladder_offers_actions
WHERE status="validated"
and userid in
(SELECT t2.userid
FROM ladder_offers_actions t2
WHERE t2.date > "2013-06-06")
GROUP BY userid
HAVING valid=5
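The condition both answers are aiming at (exactly 5 validated rows, with the latest one after the cut-off date) can also be written as a single grouped query with two HAVING conditions. The sketch below is one way to try it, using Python's sqlite3 module as a stand-in for MySQL; the table name `actions` and the extra user 3 (five validated rows, all before the date) are assumptions added to exercise the exclusion case.

```python
import sqlite3


def users_with_five_validated(after_date):
    """Users with exactly 5 validated rows, the latest one after `after_date`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE actions (id INTEGER PRIMARY KEY, date TEXT, "
        "userid INTEGER, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO actions VALUES (?, ?, ?, ?)",
        [
            # Rows from the question's sample table.
            (1, "2013-06-05", 1, "validated"),
            (2, "2013-06-05", 2, "validated"),
            (3, "2013-06-06", 2, "pending"),
            (4, "2013-06-07", 1, "validated"),
            (5, "2013-06-08", 1, "validated"),
            (6, "2013-06-08", 1, "validated"),
            (7, "2013-06-09", 1, "validated"),
            # Hypothetical user 3: five validated rows, all before the date.
            (8, "2013-06-01", 3, "validated"),
            (9, "2013-06-02", 3, "validated"),
            (10, "2013-06-03", 3, "validated"),
            (11, "2013-06-04", 3, "validated"),
            (12, "2013-06-05", 3, "validated"),
        ],
    )
    rows = conn.execute(
        """
        SELECT userid
        FROM actions
        WHERE status = 'validated'
        GROUP BY userid
        HAVING COUNT(*) = 5      -- five validated rows in total
           AND MAX(date) > ?     -- at least one of them after the date
        ORDER BY userid
        """,
        (after_date,),
    ).fetchall()
    conn.close()
    return [userid for (userid,) in rows]


if __name__ == "__main__":
    print(users_with_five_validated("2013-06-06"))
```

User 1 reaches five validated rows with some of them after 2013-06-06 and is returned; user 2 has too few validated rows, and the hypothetical user 3 has no validated row after the date.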
If I have a table with the following structure and data:
id | user_id | created_at
-------------------------
1 | 7 | 0091942
2 | 3 | 0000014
3 | 6 | 0000890
4 | 6 | 0029249
5 | 7 | 0000049
6 | 3 | 0005440
7 | 9 | 0010108
What query would I use to get the following results (explanation to follow):
id | user_id | created_at
-------------------------
1 | 7 | 0091942
6 | 3 | 0005440
4 | 6 | 0029249
7 | 9 | 0010108
As you can see:
Only one row per user_id is returned.
The row with the highest created_at is the one returned.
Is there a way to accomplish this without using subqueries? Is there a name in relational algebra parlance that this procedure goes by?
The query is known as a groupwise maximum, which (in MySQL, at least) can be implemented with a subquery. For example:
SELECT my_table.* FROM my_table NATURAL JOIN (
SELECT user_id, MAX(created_at) created_at
FROM my_table
GROUP BY user_id
) t
See it on sqlfiddle.
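A quick way to check the groupwise-maximum query against the sample data is sketched below with Python's sqlite3 module standing in for MySQL. The join is written with an explicit ON clause instead of NATURAL JOIN, and `created_at` is stored as an integer; both are choices made for the example rather than part of the original answer.

```python
import sqlite3


def latest_row_per_user():
    """Return the full row with the highest created_at for each user_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_table (id INTEGER PRIMARY KEY, user_id INTEGER, "
        "created_at INTEGER)"
    )
    conn.executemany(
        "INSERT INTO my_table VALUES (?, ?, ?)",
        [
            (1, 7, 91942),
            (2, 3, 14),
            (3, 6, 890),
            (4, 6, 29249),
            (5, 7, 49),
            (6, 3, 5440),
            (7, 9, 10108),
        ],
    )
    rows = conn.execute(
        """
        SELECT m.id, m.user_id, m.created_at
        FROM my_table m
        JOIN (
            -- One row per user: the maximum created_at.
            SELECT user_id, MAX(created_at) AS created_at
            FROM my_table
            GROUP BY user_id
        ) t
          ON t.user_id = m.user_id
         AND t.created_at = m.created_at
        ORDER BY m.id
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in latest_row_per_user():
        print(row)
```

Joining back on both `user_id` and the maximum `created_at` is what keeps `id` consistent with the chosen row, which a plain `GROUP BY user_id` with `MAX(created_at)` does not guarantee.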
You can just get the max and group by the user_id:
select id,user_id,max(created_at)
from supportContacts
group by user_id
order by id;
Here is what it outputs:
ID USER_ID MAX(CREATED_AT)
1 7 91942
2 3 5440
3 6 29249
7 9 10108
See the working demo here
Note that the example on the fiddle stores the created_at field as an int; using your own format should make no difference.
EDIT: I will leave this answer as a reference, but note that this query will produce undesired results, as Gordon stated: the id returned is not guaranteed to belong to the row with the maximum created_at. Please do not use this in production.