I am trying to get the count of users who ordered at least one product on more than one day.
Transactions Table
user_id | transt_id | product_id | spend | transaction_date
4 8 32 40 2020-05-08 17:54:59
4 7 31 20 2020-05-01 17:54:59
4 7 31 40 2020-05-01 17:54:59
4 6 20 30 2020-05-02 17:54:59
4 6 19 20 2020-05-02 17:54:59
4 6 18 10 2020-05-02 17:54:59
3 5 17 20 2020-05-04 17:54:59
3 5 16 10 2020-05-04 17:54:59
2 3 14 30 2020-05-04 18:54:59
2 3 13 50 2020-05-04 18:54:59
1 2 12 30 2020-05-05 20:54:59
1 2 12 40 2020-05-05 20:54:59
1 2 12 40 2020-05-04 20:54:59
1 1 11 20 2020-05-05 21:54:59
1 1 10 40 2020-05-05 21:54:59
3 4 10 60 2020-05-06 17:54:59
With my code so far, I have reached the following query and output:
select user_id, count(*)
from (
select user_id, date(transaction_date)
from transactions
group by user_id, date(transaction_date)) as abc
group by user_id
having count(user_id)>1;
user_id | count
1 2
3 2
4 3
I want to get the number of users having count(*) > 1 without wrapping this query in another subquery.
The output should be: 3.
In other words, I don't want the following code; I want to use one less subquery, or a completely different query:
select count(*)
from (
select user_id, count(*)
from (
select user_id, date(transaction_date)
from transactions
group by user_id, date(transaction_date)) as abc
group by user_id
having count(user_id)>1) as bcd;
The query that you already have could be written without a subquery:
select user_id, count(distinct date(transaction_date)) count
from transactions
group by user_id
having count(distinct date(transaction_date))>1;
So what you need now can be written with only 1 subquery:
select count(*) count
from (
select user_id
from transactions
group by user_id
having count(distinct date(transaction_date))>1
) t
You can get the same result with EXISTS:
select count(distinct t.user_id) count
from transactions t
where exists (
select 1
from transactions
where user_id = t.user_id and date(transaction_date) <> date(t.transaction_date)
)
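To check both versions against the sample rows from the question, here is a minimal sketch that loads them into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for whatever database the question uses (an assumption on my part); date() behaves the same way for these timestamps.

```python
import sqlite3

# Sample rows from the question:
# (user_id, transt_id, product_id, spend, transaction_date)
rows = [
    (4, 8, 32, 40, "2020-05-08 17:54:59"),
    (4, 7, 31, 20, "2020-05-01 17:54:59"),
    (4, 7, 31, 40, "2020-05-01 17:54:59"),
    (4, 6, 20, 30, "2020-05-02 17:54:59"),
    (4, 6, 19, 20, "2020-05-02 17:54:59"),
    (4, 6, 18, 10, "2020-05-02 17:54:59"),
    (3, 5, 17, 20, "2020-05-04 17:54:59"),
    (3, 5, 16, 10, "2020-05-04 17:54:59"),
    (2, 3, 14, 30, "2020-05-04 18:54:59"),
    (2, 3, 13, 50, "2020-05-04 18:54:59"),
    (1, 2, 12, 30, "2020-05-05 20:54:59"),
    (1, 2, 12, 40, "2020-05-05 20:54:59"),
    (1, 2, 12, 40, "2020-05-04 20:54:59"),
    (1, 1, 11, 20, "2020-05-05 21:54:59"),
    (1, 1, 10, 40, "2020-05-05 21:54:59"),
    (3, 4, 10, 60, "2020-05-06 17:54:59"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table transactions ("
    "user_id int, transt_id int, product_id int, spend int, transaction_date text)"
)
conn.executemany("insert into transactions values (?, ?, ?, ?, ?)", rows)

# Version 1: one subquery, filtering users with HAVING on distinct days.
grouped = conn.execute("""
    select count(*)
    from (
        select user_id
        from transactions
        group by user_id
        having count(distinct date(transaction_date)) > 1
    ) t
""").fetchone()[0]

# Version 2: EXISTS looks for another row of the same user on a different day.
via_exists = conn.execute("""
    select count(distinct t.user_id)
    from transactions t
    where exists (
        select 1
        from transactions
        where user_id = t.user_id
          and date(transaction_date) <> date(t.transaction_date)
    )
""").fetchone()[0]

print(grouped, via_exists)
```

Both queries count users 1, 3 and 4; user 2 only has transactions on a single day.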
I have running total
SELECT
id,
DepositValue,
action_date,
SUM(DepositValue) OVER(ORDER by action_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Running_total
The above select returns me the following:
id action_date DepositValue Running_total
1 2020-04-01 20 20
2 2020-04-02 2 22
3 2020-04-03 8 30
4 2020-04-04 10 38
5 2020-04-05 14 48
6 2020-04-06 15 62
7 2020-04-07 22 77
8 2020-04-08 12 99
9 2020-04-09 4 103
What I want is to select only part of the result, filtered by action_date, while keeping the Running_total values as already calculated over all rows, like this:
id action_date DepositValue Running_total
3 2020-04-03 8 30
4 2020-04-04 10 38
5 2020-04-05 14 48
You can turn your query into a subquery and filter in the outer query:
SELECT *
FROM (
SELECT
id,
DepositValue ,
action_date,
SUM(DepositValue) OVER(ORDER by action_date) AS Running_total
FROM mytable
) t
WHERE action_date BETWEEN '2020-04-03' AND '2020-04-05'
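Note that when a window has an ORDER BY and no explicit frame, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. That gives the same result as ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW as long as the action_date values are unique, hence you can just remove it here; keep the explicit ROWS frame if dates can repeat.
Also, your original query was missing a FROM clause, so I added it.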
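As a rough check of why the filter has to go in the outer query, here is a small sketch using Python's sqlite3 module (SQLite 3.25 or later is assumed for window functions; the table name mytable is taken from the answer). The totals are computed from the DepositValue column of the question, so they need not match figures typed by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (id int, action_date text, DepositValue int)")

# DepositValue column from the question, one row per day starting 2020-04-01.
deposits = [20, 2, 8, 10, 14, 15, 22, 12, 4]
conn.executemany(
    "insert into mytable values (?, ?, ?)",
    [(i, "2020-04-%02d" % i, v) for i, v in enumerate(deposits, start=1)],
)

# Filter in the outer query: the running total is computed over all rows first.
filtered = conn.execute("""
    select id, action_date, DepositValue, Running_total
    from (
        select id, action_date, DepositValue,
               sum(DepositValue) over (order by action_date) as Running_total
        from mytable
    ) t
    where action_date between '2020-04-03' and '2020-04-05'
    order by action_date
""").fetchall()

# Filter in the same query level: WHERE runs before the window function,
# so the running total restarts at the first selected row.
restarted = conn.execute("""
    select id, sum(DepositValue) over (order by action_date) as Running_total
    from mytable
    where action_date between '2020-04-03' and '2020-04-05'
    order by action_date
""").fetchall()

print(filtered)
print(restarted)
```

The second query is the one to avoid: because WHERE is applied before the window function, the earlier deposits never reach the sum.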
I have the following table:
id post_id user_id to_user_id date time
---- ---------- -------- ------------ ------
1 100 1 2 10:00
2 100 1 2 10:30
3 100 2 2 11:00
4 100 5 2 11:30
5 100 8 2 11:45
6 105 10 50 09:00
7 105 2 50 09:30
8 105 11 50 11:00
9 105 30 50 11:30
10 105 32 50 11:45
In the table above, user_id 2 has commented on posts 100 and 105.
For each post_id, I need only the records that come after the first comment that user wrote.
For this example the result is records 4 and 5 for post 100 and records 8, 9 and 10 for post 105, because 4 and 5 come after 3 (the first record for user_id 2)
and 8, 9 and 10 come after 7 (the first comment by user_id 2).
Expected result:
id post_id user_id to_user_id date time
4 100 5 2 11:30
5 100 8 2 11:45
8 105 11 50 11:00
9 105 30 50 11:30
10 105 32 50 11:45
This can be done with a correlated subselect and an aggregate function:
select * from my_table m
where m.date_time > (select min(date_time)
from my_table
where user_id = 2 and post_id = m.post_id);
or, if you prefer a join, try
select m.* from my_table as m
inner join (select post_id, min(date_time) as first_time
from my_table where user_id = 2
group by post_id) t on m.post_id = t.post_id
where m.date_time > t.first_time
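To try the idea on the sample rows, here is a minimal sketch with Python's sqlite3 module. It assumes the column is named date_time (as in the queries), stores the times as plain 'HH:MM' text, and takes the user's first comment per post as min(date_time); SQLite is only a stand-in for the real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table my_table ("
    "id int, post_id int, user_id int, to_user_id int, date_time text)"
)
conn.executemany(
    "insert into my_table values (?, ?, ?, ?, ?)",
    [
        (1, 100, 1, 2, "10:00"),
        (2, 100, 1, 2, "10:30"),
        (3, 100, 2, 2, "11:00"),
        (4, 100, 5, 2, "11:30"),
        (5, 100, 8, 2, "11:45"),
        (6, 105, 10, 50, "09:00"),
        (7, 105, 2, 50, "09:30"),
        (8, 105, 11, 50, "11:00"),
        (9, 105, 30, 50, "11:30"),
        (10, 105, 32, 50, "11:45"),
    ],
)

# For each row, compare its time with the first comment of user 2 on the
# same post. Posts where user 2 never commented yield NULL and drop out.
later_ids = [
    row[0]
    for row in conn.execute("""
        select m.id
        from my_table m
        where m.date_time > (
            select min(f.date_time)
            from my_table f
            where f.user_id = 2 and f.post_id = m.post_id
        )
        order by m.id
    """)
]

print(later_ids)
```

The zero-padded 'HH:MM' strings compare correctly as text; with a real datetime column the comparison works the same way.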
I have a table called play_progress.
play_progress
id user_id coins timecreated
1 1 20 2016-01-23 06:55:09
2 1 24 2016-01-23 06:59:22
3 1 28 2016-01-23 07:05:34
4 2 4 2016-01-23 07:10:58
5 2 10 2016-01-23 07:12:08
6 1 32 2016-01-24 00:07:48
7 2 14 2016-01-24 00:12:08
8 1 35 2016-01-24 00:44:48
9 2 18 2016-01-24 00:55:08
I would like to get the latest row (based on timecreated) for each day for each user.
I have tried the following query:
SELECT user_id, coins, MAX(timecreated)
FROM play_progress
GROUP BY user_id, DATE(timecreated);
It gives a result for each day, but it gives the wrong timecreated value.
Where am I going wrong?
Expected result:
id user_id coins timecreated
3 1 28 2016-01-23 07:05:34
5 2 10 2016-01-23 07:12:08
8 1 35 2016-01-24 00:44:48
9 2 18 2016-01-24 00:55:08
I have searched through SO, but couldn't find a solution to my problem.
If you need to fetch more columns than the ones used for grouping (user_id and the date of timecreated), you need a more complex query:
SELECT p.user_id, p.id, coins, p.timecreated
FROM play_progress p INNER JOIN
(SELECT user_id, MAX(timecreated) as max_time
FROM play_progress
GROUP BY user_id, DATE(timecreated)) pp
ON p.user_id = pp.user_id
AND p.timecreated = pp.max_time;
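Here is a small sketch of the join-back approach run on the rows from the question, using Python's sqlite3 module as a stand-in database (an assumption; the SQL is the same apart from the added ORDER BY).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table play_progress (id int, user_id int, coins int, timecreated text)"
)
conn.executemany(
    "insert into play_progress values (?, ?, ?, ?)",
    [
        (1, 1, 20, "2016-01-23 06:55:09"),
        (2, 1, 24, "2016-01-23 06:59:22"),
        (3, 1, 28, "2016-01-23 07:05:34"),
        (4, 2, 4, "2016-01-23 07:10:58"),
        (5, 2, 10, "2016-01-23 07:12:08"),
        (6, 1, 32, "2016-01-24 00:07:48"),
        (7, 2, 14, "2016-01-24 00:12:08"),
        (8, 1, 35, "2016-01-24 00:44:48"),
        (9, 2, 18, "2016-01-24 00:55:08"),
    ],
)

# The inner query finds the latest timestamp per user per day; joining back
# on (user_id, timecreated) recovers the full row, including id and coins.
latest = conn.execute("""
    select p.id, p.user_id, p.coins, p.timecreated
    from play_progress p
    inner join (
        select user_id, max(timecreated) as max_time
        from play_progress
        group by user_id, date(timecreated)
    ) pp
      on p.user_id = pp.user_id
     and p.timecreated = pp.max_time
    order by p.timecreated
""").fetchall()

for row in latest:
    print(row)
```

If a user could have two rows with exactly the same timecreated, the join would return both, so a tie-breaker (for example the highest id) would be needed in that case.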
I'm learning Hive these days and have run into a problem.
I have a table called SAMPLE:
USER_ID PRODUCT_ID NUMBER
1 3 20
1 4 30
1 2 25
1 6 50
1 5 40
2 1 10
2 3 15
2 2 40
2 5 30
2 3 35
How can I use Hive to group the table by user_id, order the records in each group by NUMBER in descending order, and keep at most 3 records per group?
The result I want looks like:
USER_ID PRODUCT_ID NUMBER(optional column)
1 6 50
1 5 40
1 4 30
2 2 40
2 3 35
2 5 30
or
USER_ID PRODUCT_IDs
1 [6,5,4]
2 [2,3,5]
Could someone help me?
Thanks very much!
Try this:
select user_id, product_id, number
from (
select user_id, product_id, number,
ROW_NUMBER() over (partition by user_id order by number desc) as RNUM
from SAMPLE
) t
where RNUM <= 3
output
1 6 50
1 5 40
1 4 30
2 2 40
2 3 35
2 5 30
Windowing functions such as ROW_NUMBER() require Hive 0.11 or later. May I know if your version is lower than that?
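The top-3-per-group pattern can be tried outside Hive as well. Below is a minimal sketch using Python's sqlite3 module as a stand-in (SQLite 3.25 or later is assumed for window functions); the ordering is placed inside the OVER clause so the row numbers are well defined, and the same ROW_NUMBER() syntax is accepted by Hive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table sample (user_id int, product_id int, number int)")
conn.executemany(
    "insert into sample values (?, ?, ?)",
    [
        (1, 3, 20), (1, 4, 30), (1, 2, 25), (1, 6, 50), (1, 5, 40),
        (2, 1, 10), (2, 3, 15), (2, 2, 40), (2, 5, 30), (2, 3, 35),
    ],
)

# Number the rows of each user from the largest NUMBER downwards,
# then keep only the first three rows of every user.
top3 = conn.execute("""
    select user_id, product_id, number
    from (
        select user_id, product_id, number,
               row_number() over (
                   partition by user_id order by number desc
               ) as rnum
        from sample
    ) t
    where rnum <= 3
    order by user_id, rnum
""").fetchall()

for row in top3:
    print(row)
```

If two rows of the same user share the same NUMBER, ROW_NUMBER() picks an arbitrary order between them; add a second sort key to the OVER clause if that matters.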
I am (very) new to MySQL. Forgive my lack of knowledge....
I am working on a hockey stat database where I need to add all the assists and all the goals together to get "total points". I have two queries already figured out, but I am not able to figure out how to sum the two.
Here are the queries:
select player_id, count(*)
from(select * from 1st_assists
union
select * from 2nd_assists) as tem
join players on tem.fk_player_id=players.player_id
group by fk_player_id
order by count(*) desc;
select player_id, count(*)
from goals_for
join shots_for on goals_for.fk_shot_for_id=shots_for.shot_for_id
join players on shots_for.fk_player_id=players.player_id
group by player_id
order by count(*) desc;
How do I combine these two queries into one and get the total of both counts?
Here are the results of each of the queries:
Total Assists
player_id count(*)
79 24
55 22
45 17
90 16
40 15
65 15
37 13
1 13
20 11
84 11
64 10
27 9
93 7
8 5
24 3
57 1
Goals
player_id count(*)
90 38
37 28
40 19
55 13
45 11
1 8
24 8
20 8
84 8
27 6
8 5
79 4
65 4
93 1
64 1
It is untested, but could you please try this:
select p.player_first_name, p.player_last_name, (count1 + coalesce(count2, 0)) as total_count
from
(select player_id, count(*) count1
from(select * from 1st_assists
union
select * from 2nd_assists) as tem
join players on tem.fk_player_id=players.player_id
group by fk_player_id) q1
left join
(select player_id, count(*) count2
from goals_for
join shots_for on goals_for.fk_shot_for_id=shots_for.shot_for_id
join players on shots_for.fk_player_id=players.player_id
group by player_id) q2
ON q1.player_id=q2.player_id
left join players p ON q1.player_id=p.player_id
order by total_count desc;
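The shape of the combined query can be sketched on a toy schema. The tables assists(player_id) and goals(player_id) below are hypothetical simplifications (the real schema with 1st_assists, 2nd_assists, goals_for and shots_for is not shown in full), and the rows are made up for illustration. The point is the LEFT JOIN of two per-player counts, with COALESCE covering players who have assists but no goals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical simplified tables: one row per assist and one row per goal.
conn.execute("create table assists (player_id int)")
conn.execute("create table goals (player_id int)")
conn.executemany("insert into assists values (?)", [(79,), (79,), (57,), (90,)])
conn.executemany("insert into goals values (?)", [(79,), (90,), (90,)])

# Count each table per player, join the two counts, and add them up.
# Player 57 has no goals, so goal_count is NULL there and COALESCE turns
# it into 0 instead of making the whole sum NULL.
totals = conn.execute("""
    select a.player_id,
           a.assist_count + coalesce(g.goal_count, 0) as total_points
    from (
        select player_id, count(*) as assist_count
        from assists
        group by player_id
    ) a
    left join (
        select player_id, count(*) as goal_count
        from goals
        group by player_id
    ) g on a.player_id = g.player_id
    order by total_points desc, a.player_id
""").fetchall()

print(totals)
```

A LEFT JOIN starting from assists still misses players who have goals but no assists; if that can happen, starting from the players table and left joining both counts is one way around it.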