Better solution for finding unique visitors - mysql

I am using the following query to find the number of unique visitors per day from one of my tables, but it performs poorly. Can anyone suggest a better solution? My current query is:
SELECT t.date,COUNT(DISTINCT t.uID) as unique_clicks FROM table_name t
WHERE
NOT EXISTS(
SELECT 1
FROM table_name t2
WHERE
t2.uID = t.uID
AND t2.date < (t.date)
)
GROUP BY t.date

You could try this:
SELECT
t.date, COUNT(DISTINCT t.uID) as unique_clicks
FROM
table_name t LEFT JOIN table_name t2
ON t.uID=t2.uID AND t2.date < t.date
WHERE
t2.uID is NULL
GROUP BY t.date
I think that a join should be faster than an EXISTS clause in this particular situation. Or, if I understand the logic correctly, you could also count each user only on the date of their first visit:
SELECT min_date, COUNT(*) as unique_clicks
FROM (
SELECT
t.uID, min(t.date) min_date
FROM
table_name t
GROUP BY
t.uID
) s
GROUP BY min_date
Please see fiddle here.
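To check that the first-visit query counts the same thing as the original NOT EXISTS version, here is a minimal sketch. It assumes Python's built-in sqlite3 module in place of MySQL, purely so the example is self-contained, and the sample rows are invented. The table and column names follow the question.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (uID INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO table_name (uID, date) VALUES (?, ?)",
    [
        (1, "2013-01-01"),
        (2, "2013-01-01"),
        (1, "2013-01-02"),  # returning visitor, must not be counted again
        (3, "2013-01-02"),
        (2, "2013-01-03"),  # returning visitor
    ],
)

# Original approach: keep a row only if the same user has no earlier row.
not_exists_sql = """
    SELECT t.date, COUNT(DISTINCT t.uID) AS unique_clicks
    FROM table_name t
    WHERE NOT EXISTS (
        SELECT 1 FROM table_name t2
        WHERE t2.uID = t.uID AND t2.date < t.date
    )
    GROUP BY t.date
    ORDER BY t.date
"""

# Suggested approach: find each user's first date, then count users per date.
first_visit_sql = """
    SELECT min_date, COUNT(*) AS unique_clicks
    FROM (
        SELECT t.uID, MIN(t.date) AS min_date
        FROM table_name t
        GROUP BY t.uID
    ) s
    GROUP BY min_date
    ORDER BY min_date
"""

not_exists_rows = conn.execute(not_exists_sql).fetchall()
first_visit_rows = conn.execute(first_visit_sql).fetchall()
print(first_visit_rows)
```

Both queries leave out days on which only returning users appear, so 2013-01-03 is absent from either result.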

Related

Get id of the record having Min() value

I have a complex MySQL query where one of the selected fields is MIN(value). Since all the values are unique, is there a way to also get the id of the row that holds the minimum value?
In other words if we simplify the query to this question, it is like this:
SELECT t1.name, MIN(t2.value) AS minval
FROM table1 t1
LEFT JOIN table2 t2
ON t2.id_user = t1.id
GROUP BY id_user
How can I find out which t2.id was chosen for the lowest t2.value for a particular user? Thank you!
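Use ROW_NUMBER() (available in MySQL 8.0 and later) to rank each user's rows by value and keep the first one.
List the fields you need explicitly: both tables have an id column, so SELECT * inside the derived table would produce duplicate column names.
SELECT *
FROM (
SELECT t1.name, t2.id AS value_id, t2.value AS minval, ROW_NUMBER() OVER (PARTITION BY t1.id ORDER BY t2.value) as rnk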
FROM table1 t1
LEFT JOIN table2 t2
ON t2.id_user = t1.id
) as X
WHERE X.rnk = 1
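The ROW_NUMBER() approach can be tried out with the sketch below. It assumes Python's built-in sqlite3 module (built against SQLite 3.25 or later, which is needed for window functions) as a stand-in for MySQL 8.0. The sample rows and the aliases value_id and minval are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE table2 (id INTEGER PRIMARY KEY, id_user INTEGER, value INTEGER)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)", [(1, "ann"), (2, "bob"), (3, "cy")]
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?, ?)",
    [(10, 1, 5), (11, 1, 2), (12, 2, 7), (13, 2, 9)],  # "cy" has no values
)

# Rank each user's values from lowest to highest, then keep rank 1 only.
sql = """
    SELECT name, value_id, minval
    FROM (
        SELECT t1.name  AS name,
               t2.id    AS value_id,
               t2.value AS minval,
               ROW_NUMBER() OVER (PARTITION BY t1.id ORDER BY t2.value) AS rnk
        FROM table1 t1
        LEFT JOIN table2 t2 ON t2.id_user = t1.id
    ) AS x
    WHERE x.rnk = 1
    ORDER BY name
"""
min_rows = conn.execute(sql).fetchall()
print(min_rows)
```

Partitioning by t1.id keeps each user without any table2 rows in a partition of their own, so the LEFT JOIN still returns them with NULL values.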
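Maybe it is this simple; I don't know how complex your statement is. Note that this relies on MySQL's non-standard GROUP BY handling (ONLY_FULL_GROUP_BY disabled), so the row returned for each name is not guaranteed: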
SELECT name,value,id
FROM(
SELECT t1.name,t2.value,t2.id
FROM table1 t1
LEFT JOIN table2 t2
ON t2.id_user = t1.id
GROUP BY t2.id,id_user
ORDER BY t1.name,t2.value asc) as test
GROUP BY name;

SQL count number of common matches using several WHERE clauses

I have a table having columns like: membership_id | user_id | group_id
I'm looking for a SQL query to get the number of common groups between 2 different users. I could do that in several queries and using some PHP but I'd like to know if there is a way to use only SQL for that.
Like with the user ids 1 and 3, there are 3 common groups (1, 5 and 6) so the result returned would be 3.
I've tried several things, but so far without result. Thank you.
You don't need "multiple WHERE clauses" or even a self JOIN:
SELECT group_id
FROM theTable AS t
WHERE t.user_id IN (1, 3)
GROUP BY group_id
HAVING COUNT(DISTINCT user_id) = 2;
More generically:
SELECT group_id
FROM theTable AS t
WHERE t.user_id IN ([user id list])
GROUP BY group_id
HAVING COUNT(DISTINCT user_id) = [# of user ids in list];
Edit: Oh, you wanted the number of groups....
SELECT COUNT(1) FROM (
SELECT group_id
FROM theTable AS t
WHERE t.user_id IN (1, 3)
GROUP BY group_id
HAVING COUNT(DISTINCT user_id) = 2
) AS common_groups;
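The counting variant generalizes to any list of users. The sketch below assumes Python's built-in sqlite3 module in place of MySQL. The helper name common_group_count and the sample memberships are hypothetical, chosen so that users 1 and 3 share groups 1, 5 and 6 as in the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE theTable ("
    "membership_id INTEGER PRIMARY KEY, user_id INTEGER, group_id INTEGER)"
)
conn.executemany(
    "INSERT INTO theTable (user_id, group_id) VALUES (?, ?)",
    [
        (1, 1), (1, 2), (1, 5), (1, 6),
        (2, 2), (2, 5),
        (3, 1), (3, 5), (3, 6), (3, 7),
    ],
)


def common_group_count(conn, user_ids):
    """Count the groups that every user in user_ids belongs to."""
    ids = list(set(user_ids))  # duplicates would break the HAVING comparison
    placeholders = ", ".join("?" for _ in ids)
    sql = f"""
        SELECT COUNT(*) FROM (
            SELECT group_id
            FROM theTable
            WHERE user_id IN ({placeholders})
            GROUP BY group_id
            HAVING COUNT(DISTINCT user_id) = ?
        ) AS common_groups
    """
    return conn.execute(sql, ids + [len(ids)]).fetchone()[0]


print(common_group_count(conn, [1, 3]))
```

Only the placeholder list is interpolated into the SQL text. The user ids themselves are passed as bound parameters.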
You can achieve this with join.
Try this:
select t1.user_id, t2.user_id, group_concat(distinct t1.group_id)
from your_table t1
join your_table t2
on t1.user_id < t2.user_id
and t1.group_id = t2.group_id
group by t1.user_id, t2.user_id;
If you don't want a concatenated output:
select distinct t1.user_id, t2.user_id, t1.group_id
from your_table t1
join your_table t2
on t1.user_id < t2.user_id
and t1.group_id = t2.group_id;
Try to join two instances of the same table (for each of them you select only the records relative to one of the users) using group_id as join attribute, and count the result:
SELECT COUNT(*)
FROM your_table AS t1
JOIN your_table AS t2 ON t1.group_id=t2.group_id
WHERE t1.user_id=1 AND t2.user_id=3;
SELECT COUNT(*)
FROM TABLE_NAME USER_ONE_INFO,
TABLE_NAME USER_TWO_INFO
WHERE USER_ONE_INFO.USER_ID = USER_ONE_ID
AND USER_TWO_INFO.USER_ID = USER_TWO_ID
AND USER_ONE_INFO.GROUP_ID = USER_TWO_INFO.GROUP_ID;

Subselect bad performance

This is my query, it takes a long time to execute. Can I use an inner join? I am working on only one table.
SELECT imei,csv_data_table.time,phone_model,test_unique_id
FROM verveba_mos.csv_data_table
WHERE time = (SELECT MAX(time) FROM csv_data_table
T1 WHERE csv_data_table.imei=T1.imei)
You can use a JOIN or NOT EXISTS() to do this, although that doesn't necessarily mean it will be faster:
NOT EXISTS():
SELECT t.imei,t.time,t.phone_model,t.test_unique_id
FROM verveba_mos.csv_data_table t
WHERE NOT EXISTS(SELECT 1
FROM csv_data_table s
WHERE t.imei= s.imei
AND s.time > t.time)
JOIN:
SELECT t.imei,t.time,t.phone_model,t.test_unique_id
FROM verveba_mos.csv_data_table t
JOIN(SELECT s.imei,MAX(time) as max_t
FROM csv_data_table s
GROUP BY s.imei) p
ON(t.imei= p.imei and t.time = p.max_t)
SELECT t1.imei, t1.time, t1.phone_model, t1.test_unique_id
FROM csv_data_table t1
JOIN (select imei, max(time) time from csv_data_table group by imei) t2
ON (t1.imei = t2.imei and t1.time = t2.time)
You should also consider putting an index on csv_data_table(imei, time) if you don't already have one.
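As a rough way to try the index together with the two rewrites, here is a minimal sketch. It assumes Python's built-in sqlite3 module rather than MySQL, an invented index name (idx_imei_time), and made-up sample rows. It only checks that both rewrites return the same latest row per imei and says nothing about relative speed on real data.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE csv_data_table ("
    "imei TEXT, time TEXT, phone_model TEXT, test_unique_id TEXT)"
)
# Composite index suggested in the answer; the index name is hypothetical.
conn.execute("CREATE INDEX idx_imei_time ON csv_data_table (imei, time)")
conn.executemany(
    "INSERT INTO csv_data_table VALUES (?, ?, ?, ?)",
    [
        ("A", "2016-01-01 10:00", "X1", "t1"),
        ("A", "2016-01-02 09:00", "X1", "t2"),
        ("B", "2016-01-01 08:00", "Y2", "t3"),
    ],
)

# JOIN against the per-imei maximum time.
join_sql = """
    SELECT t.imei, t.time, t.phone_model, t.test_unique_id
    FROM csv_data_table t
    JOIN (SELECT s.imei, MAX(s.time) AS max_t
          FROM csv_data_table s
          GROUP BY s.imei) p
      ON t.imei = p.imei AND t.time = p.max_t
    ORDER BY t.imei
"""

# NOT EXISTS: keep a row only if no later row exists for the same imei.
not_exists_sql = """
    SELECT t.imei, t.time, t.phone_model, t.test_unique_id
    FROM csv_data_table t
    WHERE NOT EXISTS (SELECT 1
                      FROM csv_data_table s
                      WHERE s.imei = t.imei AND s.time > t.time)
    ORDER BY t.imei
"""

join_rows = conn.execute(join_sql).fetchall()
not_exists_rows = conn.execute(not_exists_sql).fetchall()
print(join_rows)
```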

group by while getting the Ids of the columns that meet the condition

I wanted to write something like this
SELECT MIN(date),
id
FROM test
WHERE t.SITE_ID=SITE_id
GROUP BY SITE_ID
but it's not possible to get the ids after a GROUP BY.
Then I came up with this:
SELECT t.id
FROM test t
WHERE t.date IN(SELECT MIN(date)
FROM test
WHERE t.SITE_ID=SITE_id
GROUP BY SITE_ID)
This select is supposed to get me the ids of the test rows whose date equals the minimum date for their site. Is there any way to make it simpler?
Try
SELECT t1.id
FROM test t1
inner join
(
select site_id, min(date) as mdate
from test
GROUP BY site_id
) t2 on t1.site_id = t2.site_id and t1.date = t2.mdate
SELECT t1.id
FROM test t1
inner join
(
select site_id, min(date) as mdate
from test
GROUP BY site_id
) t2 on t1.site_id = t2.site_id and t1.date = t2.mdate
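The join against the per-site minimum can be tried with the following sketch. It assumes Python's built-in sqlite3 module in place of MySQL, and the sample rows are invented. Row 6 is included deliberately: its date matches the minimum of another site, so it must not be returned.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test (id INTEGER PRIMARY KEY, site_id INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        (1, 10, "2020-01-05"),
        (2, 10, "2020-01-02"),  # earliest row for site 10
        (3, 20, "2020-01-03"),  # earliest row for site 20
        (4, 20, "2020-01-09"),
        (5, 20, "2020-01-03"),  # tie with id 3, also returned
        (6, 10, "2020-01-03"),  # equals site 20's minimum, not site 10's
    ],
)

# Compute the minimum date per site once, then join back to fetch the ids.
sql = """
    SELECT t1.id
    FROM test t1
    INNER JOIN (
        SELECT site_id, MIN(date) AS mdate
        FROM test
        GROUP BY site_id
    ) t2 ON t1.site_id = t2.site_id AND t1.date = t2.mdate
    ORDER BY t1.id
"""
ids = [row[0] for row in conn.execute(sql)]
print(ids)
```

Joining on both site_id and date is what keeps row 6 out. Matching on the date alone would let one site's minimum select rows from another site.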

sql script works on MySQL, but not on google bigquery

I have SQL scripts that work fine in MySQL, but that I cannot get to work in Google BigQuery. After reading through the BigQuery documentation, I made a number of adjustments (e.g. no more than one join per SELECT statement), but the script still fails. Any help is appreciated. If you know of any good resources comparing BigQuery SQL with other SQL dialects, that would also be greatly appreciated. Thanks.
SELECT
T1.action_date AS action_date,
T1.ad_campaign_category AS ad_campaign_category,
T1.campaign_id AS campaign_id,
T2.total_sends AS total_sends,
count(*) AS clicks_per_category
FROM (
SELECT action_date, campaign_id, ad_campaign_category
FROM projectX.email_action
WHERE action_date > '2009-04-01' AND action_date < '2011-05-01') T1,
(
SELECT action_date, campaign_id, ad_campaign_category, count(*) AS total_sends
FROM projectX.email_action
WHERE action_type = 'send' AND action_date > '2009-04-01' AND action_date < '2011-05-01'
GROUP BY action_date, campaign_id) T2
WHERE T1.action_date = T2.action_date
AND T1.campaign_id = T2.campaign_id
GROUP BY action_date, campaign_id, ad_campaign_category
The JOIN must be explicit -- that is, rather than using SELECT ... FROM (...) t1, (...) t2 WHERE t1.x = t2.y you should use the form SELECT ... FROM (...) t1 JOIN (...) t2 ON t1.x = t2.y
For your example, this would look like:
SELECT
T1.action_date AS action_date,
T1.ad_campaign_category AS ad_campaign_category,
T1.campaign_id AS campaign_id,
T2.total_sends AS total_sends,
count(*) AS clicks_per_category
FROM (
SELECT action_date, campaign_id, ad_campaign_category
FROM projectX.email_action
WHERE action_date > '2009-04-01' AND action_date < '2011-05-01') T1
JOIN (
SELECT action_date, campaign_id, count(*) AS total_sends
FROM projectX.email_action
WHERE action_type = 'send' AND action_date > '2009-04-01' AND action_date < '2011-05-01'
GROUP BY action_date, campaign_id) T2
ON T1.action_date = T2.action_date
AND T1.campaign_id = T2.campaign_id
GROUP BY action_date, campaign_id, ad_campaign_category, total_sends
Note: if you get an error that one of the tables is too large, try using JOIN EACH (BigQuery legacy SQL syntax) instead of JOIN.