I have the query below:
WITH search_agg as(
SELECT user_id, count(search_id) as count_search
FROM search
WHERE date>current_date -interval '7 days'
GROUP BY user_id
HAVING count_search>10)
SELECT count (distinct user_id)
FROM search_agg
If I am correct, I don't need DISTINCT in my outer query, since my GROUP BY already takes care of that. Is that right, or is it better practice to keep DISTINCT anyway? Thanks
It depends on your needs, but in this case there is no need for DISTINCT, because the CTE groups by user_id and therefore already returns one row per user_id.
WITH search_agg as(
SELECT user_id, count(search_id) as count_search
FROM search
WHERE date>current_date -interval '7 days'
GROUP BY user_id
HAVING count(search_id) > 10) -- PostgreSQL does not accept the alias count_search in HAVING
SELECT count (user_id)
FROM search_agg;
If you group by two or more columns, the same user_id can appear in several groups, so you then need DISTINCT to count each user only once.
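To make the difference concrete, here is a minimal sketch that runs both cases against an in-memory SQLite database from Python. The `device` column and the sample rows are hypothetical (they are not in the original post), and the date filter is left out to keep the example small.

```python
import sqlite3

# Hypothetical sample data: one row per search, with an invented "device" column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE search (search_id INTEGER PRIMARY KEY, user_id INTEGER, device TEXT)"
)
rows = [(1, "web"), (1, "web"), (1, "app"), (2, "web"), (2, "web")]
conn.executemany("INSERT INTO search (user_id, device) VALUES (?, ?)", rows)

# Grouping by user_id alone yields one row per user, so DISTINCT changes nothing.
one_col = conn.execute("""
    SELECT COUNT(user_id), COUNT(DISTINCT user_id)
    FROM (SELECT user_id FROM search GROUP BY user_id) AS g
""").fetchone()

# Grouping by (user_id, device) puts user 1 in two groups, so the counts differ.
two_col = conn.execute("""
    SELECT COUNT(user_id), COUNT(DISTINCT user_id)
    FROM (SELECT user_id, device FROM search GROUP BY user_id, device) AS g
""").fetchone()

print(one_col)  # (2, 2)
print(two_col)  # (3, 2)
```

With a single grouping column both counts agree; with two grouping columns only COUNT(DISTINCT user_id) gives the number of users.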
Related
I'm trying to retrieve the number of unique users who have made a purchase on a monthly basis. This sounds simple, but we have three types of products and their purchases are stored in different tables whose only common key is user_id. To find the unique users I have to query the three tables separately, union the results, and execute a count distinct.
Here’s an example of what I’m doing right now:
SELECT
month,
count(distinct user_id) as users
FROM
(
SELECT
DATE_FORMAT(purchase_date,'%Y-%m') as month,
user_id
FROM purchases_a
UNION
SELECT
DATE_FORMAT(purchase_date,'%Y-%m') as month,
user_id
FROM purchases_b
UNION
SELECT
DATE_FORMAT(purchase_date,'%Y-%m') as month,
user_id
FROM purchases_c
) AS p
GROUP BY 1
Is this the only way to go? This query takes forever. Thanks!
One method is to use union all in a subquery and then aggregate:
select DATE_FORMAT(purchase_date, '%Y-%m') as month,
count(distinct user_id)
from ((select user_id, purchase_date from purchases_a) union all
(select user_id, purchase_date from purchases_b) union all
(select user_id, purchase_date from purchases_c)
) p
group by month
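As a rough check of the UNION ALL approach, the sketch below runs it on an in-memory SQLite database from Python. The sample rows are made up, and SQLite's `strftime('%Y-%m', ...)` stands in for MySQL's `DATE_FORMAT(..., '%Y-%m')`, so treat it as an illustration rather than the exact MySQL query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for table in ("purchases_a", "purchases_b", "purchases_c"):
    conn.execute(f"CREATE TABLE {table} (user_id INTEGER, purchase_date TEXT)")

# Hypothetical purchases: user 1 buys twice in January, from two different tables.
conn.executemany("INSERT INTO purchases_a VALUES (?, ?)",
                 [(1, "2024-01-05"), (2, "2024-01-09")])
conn.executemany("INSERT INTO purchases_b VALUES (?, ?)",
                 [(1, "2024-01-20"), (3, "2024-02-02")])
conn.executemany("INSERT INTO purchases_c VALUES (?, ?)",
                 [(2, "2024-02-11"), (3, "2024-02-15")])

# UNION ALL skips the de-duplication pass; COUNT(DISTINCT) de-duplicates once at the end.
monthly = conn.execute("""
    SELECT strftime('%Y-%m', purchase_date) AS month,
           COUNT(DISTINCT user_id) AS users
    FROM (SELECT user_id, purchase_date FROM purchases_a
          UNION ALL
          SELECT user_id, purchase_date FROM purchases_b
          UNION ALL
          SELECT user_id, purchase_date FROM purchases_c) AS p
    GROUP BY month
    ORDER BY month
""").fetchall()

print(monthly)  # [('2024-01', 2), ('2024-02', 2)]
```

Whether this is actually faster than the UNION version depends on the data and indexes, so it is worth comparing the two plans on the real tables.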
I need to find the percentage of logs in my table that are duplicated. Therefore I did a query with a "having" that checks if the key was duplicated. The problem is that after doing this "having" I lost all the logs that were not duplicated.
Here is the table:
Here is my query:
(SELECT count(params_advertiserId) AS duplicates
FROM android_clicks
GROUP BY params_advertiserId ,app_id ,date --my key is a triplet
HAVING COUNT(params_advertiserId) > 1)
Help would be appreciated.
Is this what you want?
select (count(*) - count(distinct params_advertiserId, app_id, date)) / count(*) as duplicate_ratio
from android_clicks ac;
Your query is incorrect because AND is a boolean operator: GROUP BY a AND b AND c groups by a single expression whose value is true, false, or NULL, not by the three columns.
If you want the count, then wrap it as a subquery:
SELECT COUNT(*) as num_duplicates
FROM (SELECT params_advertiserId, app_id, date
FROM android_clicks ac
GROUP BY params_advertiserId, app_id, date
HAVING COUNT(*) > 1
) d; -- MySQL requires an alias on the derived table
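The two numbers discussed above (the share of duplicated rows and the number of duplicated keys) can be sketched on an in-memory SQLite database from Python. The sample rows are invented, and because SQLite does not accept a multi-column COUNT(DISTINCT a, b, c), the distinct keys are counted through a SELECT DISTINCT subquery instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE android_clicks (params_advertiserId TEXT, app_id TEXT, date TEXT)"
)
# Hypothetical logs: key a1 appears twice, a2 once, a3 three times.
rows = (
    [("a1", "app", "2017-01-01")] * 2
    + [("a2", "app", "2017-01-01")]
    + [("a3", "app", "2017-01-02")] * 3
)
conn.executemany("INSERT INTO android_clicks VALUES (?, ?, ?)", rows)

total = conn.execute("SELECT COUNT(*) FROM android_clicks").fetchone()[0]

# Number of distinct (params_advertiserId, app_id, date) triplets.
distinct_keys = conn.execute("""
    SELECT COUNT(*)
    FROM (SELECT DISTINCT params_advertiserId, app_id, date
          FROM android_clicks) AS k
""").fetchone()[0]

# Number of triplets that occur more than once.
duplicated_keys = conn.execute("""
    SELECT COUNT(*)
    FROM (SELECT params_advertiserId, app_id, date
          FROM android_clicks
          GROUP BY params_advertiserId, app_id, date
          HAVING COUNT(*) > 1) AS d
""").fetchone()[0]

# Share of rows that are extra copies of an already seen key.
duplicate_ratio = (total - distinct_keys) / total
print(total, distinct_keys, duplicated_keys, duplicate_ratio)  # 6 3 2 0.5
```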
In GROUP BY, separate the columns with commas instead of AND:
SELECT count(params_advertiserId) AS duplicates
FROM android_clicks
GROUP BY params_advertiserId , app_id , date
HAVING COUNT(params_advertiserId) > 1
Hi I have this VISITS table
What I want to achieve:
**affiliate_id** **unique visits count**
167 4
121 1
137 1
The special condition is that one IP can only be counted once per day for a single affiliate_id.
So visit_id 553 and 554 can only be counted as one visit, because both have the same IP, the same date, and the same affiliate_id.
From what I understand I need to group by ip, date and affiliate_id and count it, but not sure how to write the query.
Can you guys point me to some reference or insight to solve this problem?
Thanks in advance!
--
Update with link sample SQL:
https://dl.dropboxusercontent.com/u/3765168/tb_visits.sql
Based on your requirement, I think you need the distinct IPs per date and affiliate_id:
select DATE(date), affiliate_id, count(distinct( ip))
from your_table
group by DATE(date), affiliate_id
If I understood correctly,
SELECT affiliate_id, count(*)
FROM (SELECT DISTINCT affiliate_id, ip, DATE(date) AS visit_day -- DATE(), not DAY(): DAY() returns only the day of the month
FROM visits) AS q
GROUP BY affiliate_id;
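Here is a small sketch of the "one IP per day per affiliate" rule, run on an in-memory SQLite database from Python. The rows are invented (the original table was not included in the post), and SQLite's `date()` plays the role of MySQL's `DATE()`. The sketch de-duplicates on the full calendar date, so the same IP on the 8th of two different months is counted twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (visit_id INTEGER, affiliate_id INTEGER, ip TEXT, date TEXT)"
)
# Hypothetical visits; rows 1 and 2 share IP, day and affiliate and count once.
conn.executemany("INSERT INTO visits VALUES (?, ?, ?, ?)", [
    (1, 167, "1.1.1.1", "2017-02-08 10:00:00"),
    (2, 167, "1.1.1.1", "2017-02-08 11:00:00"),
    (3, 167, "1.1.1.1", "2017-03-08 09:00:00"),  # same day of month, other month
    (4, 167, "2.2.2.2", "2017-02-08 12:00:00"),
    (5, 121, "1.1.1.1", "2017-02-08 10:00:00"),
])

# De-duplicate on (affiliate_id, ip, calendar date), then count per affiliate.
unique_visits = conn.execute("""
    SELECT affiliate_id, COUNT(*) AS unique_visits
    FROM (SELECT DISTINCT affiliate_id, ip, date(date) AS visit_day
          FROM visits) AS q
    GROUP BY affiliate_id
    ORDER BY affiliate_id
""").fetchall()

print(unique_visits)  # [(121, 1), (167, 3)]
```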
What you are trying to do is count the unique (distinct) IPs for a given affiliate_id, so the only GROUP BY column you need is affiliate_id. The unique hits are calculated with COUNT, and to make them unique you add the DISTINCT keyword:
SELECT
affiliate_id, COUNT(DISTINCT ip) AS unique_visit_counts
FROM tablename
GROUP BY affiliate_id
However since you want it by the day as well you might want to include a date clause such as:
DATE_FORMAT(date, "%y-%m-%d") AS `date`
Which will turn your date and time stamp into a day in the YY-MM-DD format.
If you group by that you can get a full list by day by affiliate_id using something like
SELECT
affiliate_id,
COUNT(DISTINCT ip) AS unique_visit_counts,
DATE_FORMAT(date, "%y-%m-%d") AS visit_date
FROM tablename
GROUP BY visit_date, affiliate_id
Or pick a specific date using something like
SELECT
affiliate_id,
COUNT(DISTINCT ip) AS unique_visit_counts
FROM tablename
WHERE DATE_FORMAT(date, "%y-%m-%d") = '17-02-08'
GROUP BY affiliate_id
I have a table that has a unique key each time a user creates a case:
id|doctor_id|created_dt
--|---------|-----------
1|23 |datetimestamp
2|23 |datetimestamp
3|17 |datetimestamp
How can I select and return the average amount of entries a user has per month?
I have tried this:
SELECT avg (id)
FROM `cases`
WHERE created_dt BETWEEN DATE_SUB(CURDATE(),INTERVAL 90 DAY) AND CURDATE()
and doctor_id = 17
But this returns a ridiculously large value that cannot be true.
To clarify: I am trying to get something like doctor id 17 has an average of 2 entries per month into this table.
I think you were thrown off by the idea of "averaging". You don't want the average id, or average user_id. You want the average number of entries into the table, so you would use COUNT():
SELECT doctor_id, count(id)/3 AS AverageMonthlyCases
FROM `cases`
WHERE created_dt BETWEEN DATE_SUB(CURDATE(),INTERVAL 90 DAY) AND CURDATE()
group by doctor_id
Since you have a 90 day interval, you want to count the number of rows per 30 days, or the count/3.
SELECT AVG(cnt), doctor_id
FROM (
SELECT COUNT(id) cnt, doctor_id
FROM cases
WHERE created_dt BETWEEN <yourDateInterval>
GROUP BY doctor_id, year(created_dt), month(created_dt)
) t
GROUP BY doctor_id
Since you need average number of entries, AVG function is not really applicable, because it is SUM()/COUNT() and obviously you do not need that (why would you need SUM of ids).
You need something like this
SELECT
doctor_id,
DATE_FORMAT(created_dt,'%m-%Y') AS month,
COUNT(id) AS visits
FROM `cases`
GROUP BY
`doctor_id`,
DATE_FORMAT(created_dt,'%m-%Y')
ORDER BY
`doctor_id` ASC,
DATE_FORMAT(created_dt,'%m-%Y') ASC
This gives the visits per month per doctor. If you want to average them, you can then use something like
SELECT
doctor_id,
SUM(visits)/COUNT(month) AS `average`
FROM (
SELECT
doctor_id,
DATE_FORMAT(created_dt,'%m-%Y') AS month,
COUNT(id) AS visits
FROM `cases`
GROUP BY
`doctor_id`,
DATE_FORMAT(created_dt,'%m-%Y')
ORDER BY
`doctor_id` ASC,
DATE_FORMAT(created_dt,'%m-%Y') ASC
) t1
GROUP BY
doctor_id
Obviously you can add your own WHERE clauses. This query works across multiple years (i.e. it will not count January 2013 and January 2014 as one month).
It also handles "blank" months in which a doctor had no patients: those months produce no rows, so they are not counted (a 0 can distort an average).
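The two-step idea (count per doctor and month, then average those counts) can be sketched on an in-memory SQLite database from Python. The rows are hypothetical, and SQLite's `strftime('%Y-%m', ...)` is used where the MySQL version would use `DATE_FORMAT`; AVG over the monthly counts is equivalent to SUM(visits)/COUNT(month) here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cases (id INTEGER PRIMARY KEY, doctor_id INTEGER, created_dt TEXT)"
)
# Hypothetical cases: doctor 17 has 2 in Jan 2014 and 4 in Feb 2014;
# doctor 23 has 1 in Jan 2013 and 1 in Jan 2014 (two separate months).
rows = (
    [(17, "2014-01-10 09:00:00")] * 2
    + [(17, "2014-02-03 14:00:00")] * 4
    + [(23, "2013-01-15 08:00:00"), (23, "2014-01-20 08:00:00")]
)
conn.executemany("INSERT INTO cases (doctor_id, created_dt) VALUES (?, ?)", rows)

# Inner query: cases per doctor per calendar month. Outer query: average of those counts.
averages = conn.execute("""
    SELECT doctor_id, AVG(visits) AS average
    FROM (SELECT doctor_id,
                 strftime('%Y-%m', created_dt) AS month,
                 COUNT(id) AS visits
          FROM cases
          GROUP BY doctor_id, month) AS t1
    GROUP BY doctor_id
    ORDER BY doctor_id
""").fetchall()

print(averages)  # [(17, 3.0), (23, 1.0)]
```

Months without any cases never show up in the inner query, so the average is taken over active months only; if empty months should count as zero, the divisor has to come from the date range instead.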
Use this to get each doctor's total number of entries, grouped by month. Note that MONTHNAME() merges the same month of different years:
Select monthname(created_dt), doctor_id, count(id) as total from cases group by 1,2 order by 1
You can also use GROUP_CONCAT() in a nested query to build a pivot-like table in which each column corresponds to one doctor_id.
All I want is to count entries based on date (i.e. entries with the same date).
My table is
You can see 5th and 6th entry have same date.
I think the real problem is that entries with the same date have different times, so I am not getting what I want.
I am using this SQL:
SELECT COUNT( created_at ) AS entries, created_at
FROM wp_frm_items
WHERE user_id =1
GROUP BY created_at
LIMIT 0 , 30
What I am getting is this.
I want entries as 2 for date 2012-02-22
The reason you get what you get is because you also compare the time, down to a second apart. So any entries created the same second will be grouped together.
To achieve what you actually want, you need to apply a date function to the created_at column:
SELECT COUNT(1) AS entries, DATE(created_at) as date
FROM wp_frm_items
WHERE user_id =1
GROUP BY DATE(created_at)
LIMIT 0 , 30
This would remove the time part from the column field, and so group together any entries created on the same day. You could take this further by removing the day part to group entries created on the same month of the same year etc.
To restrict the query to entries created in the current month, you add a WHERE-clause to the query to only select entries that satisfy that condition. Here's an example:
SELECT COUNT(1) AS entries, DATE(created_at) as date
FROM wp_frm_items
WHERE user_id = 1
AND created_at >= DATE_FORMAT(CURDATE(),'%Y-%m-01')
GROUP BY DATE(created_at)
Note: The COUNT(1) part of the query simply means "count each row". You could just as well have written COUNT(*), or COUNT(id) on any column that is never NULL (COUNT(column) skips rows where that column is NULL). Historically, the most efficient approach was to count the primary key, since that is always available in whatever index the query engine could utilize. COUNT(*) used to have to leave the index and retrieve the corresponding row in the table, which was sometimes inefficient. In more modern query planners this is probably no longer the case. COUNT(1) is another variant of this that didn't force the query planner to retrieve the rows from the table.
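One caveat about the COUNT variants is easy to check: COUNT(*) and COUNT(1) count every row, while COUNT(column) ignores rows where that column is NULL. The sketch below shows this together with the per-day grouping, on an in-memory SQLite database from Python. The `name` column and the rows are hypothetical, and SQLite's `date()` stands in for MySQL's `DATE()`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE wp_frm_items (
        id INTEGER PRIMARY KEY, user_id INTEGER, name TEXT, created_at TEXT
    )
""")
# Hypothetical entries: two on 2012-02-22 (one with a NULL name), one on 2012-02-23.
conn.executemany(
    "INSERT INTO wp_frm_items (user_id, name, created_at) VALUES (?, ?, ?)",
    [
        (1, "a", "2012-02-22 10:00:01"),
        (1, None, "2012-02-22 15:30:00"),
        (1, "c", "2012-02-23 08:00:00"),
    ],
)

# Group on the date part only, and compare the three COUNT forms side by side.
per_day = conn.execute("""
    SELECT date(created_at) AS day, COUNT(*), COUNT(1), COUNT(name)
    FROM wp_frm_items
    WHERE user_id = 1
    GROUP BY date(created_at)
    ORDER BY day
""").fetchall()

print(per_day)  # [('2012-02-22', 2, 2, 1), ('2012-02-23', 1, 1, 1)]
```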
Edit: The query to group by month can be created in a number of different ways. Here is an example:
SELECT COUNT(1) AS entries, DATE_FORMAT(created_at,'%Y-%c') as month
FROM wp_frm_items
WHERE user_id =1
GROUP BY DATE_FORMAT(created_at,'%Y-%c')
You must eliminate the time with GROUP BY
SELECT COUNT(*) AS entries, DATE(created_at) AS created_date
FROM wp_frm_items
WHERE user_id =1
GROUP BY DATE(created_at)
LIMIT 0 , 30
Oops, misread it.
Use GROUP BY DATE(created_at)
Try:
SELECT COUNT( created_at ) AS entries, DATE(created_at) AS created_date
FROM wp_frm_items
WHERE user_id =1
GROUP BY DATE(created_at)
LIMIT 0 , 30