MySQL COUNT DISTINCT

I'm trying to collect the number of distinct visits in my cp yesterday, then count them.
SELECT
DISTINCT `user_id` as user,
`site_id` as site,
`ts` as time
FROM
`cp_visits`
WHERE
ts >= DATE_SUB(NOW(), INTERVAL 1 DAY)
For some reason this returns multiple rows with the same site id. How do I pull and count only the distinct site_id cp logins?

Select
Count(Distinct user_id) As countUsers
, Count(site_id) As countVisits
, site_id As site
From cp_visits
Where ts >= DATE_SUB(NOW(), INTERVAL 1 DAY)
Group By site_id

Overall
SELECT
COUNT(DISTINCT `site_id`) as distinct_sites
FROM `cp_visits`
WHERE ts >= DATE_SUB(NOW(), INTERVAL 1 DAY)
Or per site
SELECT
`site_id` as site,
COUNT(DISTINCT `user_id`) as distinct_users_per_site
FROM `cp_visits`
WHERE ts >= DATE_SUB(NOW(), INTERVAL 1 DAY)
GROUP BY `site_id`
Having the time column in the result doesn't make sense - since you are aggregating the rows, showing one particular time is irrelevant, unless it is the min or max you are after.
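To see why the original query returned repeated site ids, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, and the sample rows are made up for illustration. DISTINCT applies to the whole selected row, so two visits that differ only in ts both survive, while COUNT(DISTINCT ...) with GROUP BY gives one row per site.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cp_visits (user_id INTEGER, site_id INTEGER, ts TEXT)")
conn.executemany(
    "INSERT INTO cp_visits VALUES (?, ?, ?)",
    [
        (1, 10, "2024-01-02 08:00:00"),
        (1, 10, "2024-01-02 09:30:00"),  # same user and site, later time
        (2, 10, "2024-01-02 10:00:00"),
        (3, 20, "2024-01-02 11:00:00"),
    ],
)

# DISTINCT compares entire rows, so the differing ts values keep every row.
distinct_rows = conn.execute(
    "SELECT DISTINCT user_id, site_id, ts FROM cp_visits"
).fetchall()

# COUNT(DISTINCT ...) with GROUP BY collapses the result to one row per site.
per_site = conn.execute(
    """
    SELECT site_id,
           COUNT(DISTINCT user_id) AS distinct_users,
           COUNT(*) AS visits
    FROM cp_visits
    GROUP BY site_id
    ORDER BY site_id
    """
).fetchall()

print(len(distinct_rows))
print(per_site)
```

The date filter from the question is left out so the result does not depend on the current time.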

You need to use a group by clause.
SELECT site_id, MAX(ts) as time, count(*) as visits from cp_visits group by site_id

Related

How to return 0 instead of null on mysql query?

The following query returns the visitors and pageviews of last 7 days. However, if there are no results (let's say it is a fresh account), nothing is returned.
How can I edit this to return 0 for days that have no entries?
SELECT Date(timestamp) AS day,
Count(DISTINCT hash) AS visitors,
Count(*) AS pageviews
FROM behaviour
WHERE company_id = 1
AND timestamp >= Subdate(Curdate(), 7)
GROUP BY day
Assuming that you always have at least one record in the table for each of the last 7 days (regardless of the company_id), then you can use conditional aggregation as follows:
select
date(timestamp) as day,
count(distinct case when company_id = 1 then hash end) as visitors,
sum(company_id = 1) as pageviews
from behaviour
where timestamp >= curdate() - interval 7 day
group by day
Note that I changed your query to use standard date arithmetic, which I find easier to understand than date functions.
Otherwise, you would need to move the condition on the date from the where clause to the aggregate functions:
select
date(timestamp) as day,
count(distinct case when timestamp >= curdate() - interval 7 day and company_id = 1 then hash end) as visitors,
sum(timestamp >= curdate() - interval 7 day and company_id = 1) as pageviews
from behaviour
group by day
If your table is big, this can be expensive so I would not recommend that.
Alternatively, you can generate a derived table of dates and left join it with your original query:
select
curdate() - interval x.n day as day,
count(distinct b.hash) visitors,
count(b.hash) page_views
from (
select 1 n union all select 2 union all select 3 union all select 4
union all select 5 union all select 6 union all select 7
) x
left join behaviour b
on b.company_id = 1
and b.timestamp >= curdate() - interval x.n day
and b.timestamp < curdate() - interval (x.n - 1) day
group by x.n
Use a query that returns all the dates from today minus 7 days to today and left join the table behaviour:
SELECT t.timestamp AS day,
Count(DISTINCT b.hash) AS visitors,
Count(b.timestamp) AS pageviews
FROM (
SELECT Subdate(Curdate(), 7) timestamp UNION ALL SELECT Subdate(Curdate(), 6) UNION ALL
SELECT Subdate(Curdate(), 5) UNION ALL SELECT Subdate(Curdate(), 4) UNION ALL SELECT Subdate(Curdate(), 3) UNION ALL
SELECT Subdate(Curdate(), 2) UNION ALL SELECT Subdate(Curdate(), 1) UNION ALL SELECT Curdate()
) t LEFT JOIN behaviour b
ON Date(b.timestamp) = t.timestamp AND b.company_id = 1
GROUP BY day
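The left-join idea can be checked with a small sketch. It assumes SQLite (via Python's sqlite3) in place of MySQL, a fixed three-day list instead of one built from Curdate(), and invented sample rows. The point is only that days without matching rows still appear, with both counts at 0.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE behaviour (company_id INTEGER, hash TEXT, timestamp TEXT)")
conn.executemany(
    "INSERT INTO behaviour VALUES (?, ?, ?)",
    [
        (1, "a", "2024-01-01 09:00:00"),
        (1, "a", "2024-01-01 10:00:00"),
        (1, "b", "2024-01-01 11:00:00"),
        (1, "a", "2024-01-03 12:00:00"),
        (2, "z", "2024-01-02 12:00:00"),  # other company, must not be counted
    ],
)

# The fixed list of days plays the role of the derived table of dates.
# The company filter sits in the ON clause so unmatched days are kept.
rows = conn.execute(
    """
    SELECT d.day,
           COUNT(DISTINCT b.hash) AS visitors,
           COUNT(b.hash) AS pageviews
    FROM (
        SELECT '2024-01-01' AS day
        UNION ALL SELECT '2024-01-02'
        UNION ALL SELECT '2024-01-03'
    ) d
    LEFT JOIN behaviour b
           ON date(b.timestamp) = d.day
          AND b.company_id = 1
    GROUP BY d.day
    ORDER BY d.day
    """
).fetchall()

for row in rows:
    print(row)
```

Counting a column from the joined table (b.hash) rather than COUNT(*) matters here: COUNT(*) would count the NULL-extended row and report 1 for an empty day.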
Use IFNULL:
IFNULL(expr1, 0)
From the documentation:
If expr1 is not NULL, IFNULL() returns expr1; otherwise it returns expr2. IFNULL() returns a numeric or string value, depending on the context in which it is used.
You can use the following trick:
First, write a query that returns one dummy row: SELECT 1;
Next, use a LEFT JOIN with an always-true condition to attach the summary row(s). The join returns the summary values when data exists and NULLs otherwise.
Last, select only the columns you need from the joined queries and convert the NULLs to zeros using the IFNULL function.
SELECT
IFNULL(b.day,0) AS DAY,
IFNULL(b.visitors,0) AS visitors,
IFNULL(b.pageviews,0) AS pageviews
FROM (
SELECT 1
) a
LEFT JOIN (
SELECT DATE(TIMESTAMP) AS DAY,
COUNT(DISTINCT HASH) AS visitors,
COUNT(*) AS pageviews
FROM behaviour
WHERE company_id = 1
AND TIMESTAMP >= SUBDATE(CURDATE(), 7)
GROUP BY DAY
) b ON 1 = 1;
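A quick way to see what the dummy-row join does is to run it against an empty table. This is a sketch only, assuming SQLite (via Python's sqlite3) instead of MySQL and dropping the date filter so the result does not depend on the current time.

```python
import sqlite3

# SQLite stands in for MySQL; the table is deliberately left empty.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE behaviour (company_id INTEGER, hash TEXT, timestamp TEXT)")

summary = """
    SELECT date(timestamp) AS day,
           COUNT(DISTINCT hash) AS visitors,
           COUNT(*) AS pageviews
    FROM behaviour
    WHERE company_id = 1
    GROUP BY day
"""

# On its own, the grouped query returns no rows at all for an empty table.
plain = conn.execute(summary).fetchall()

# Left joining it to a one-row dummy query always yields at least one row;
# IFNULL then turns the NULLs of the unmatched row into zeros.
padded = conn.execute(
    f"""
    SELECT IFNULL(b.day, 0) AS day,
           IFNULL(b.visitors, 0) AS visitors,
           IFNULL(b.pageviews, 0) AS pageviews
    FROM (SELECT 1) a
    LEFT JOIN ({summary}) b ON 1 = 1
    """
).fetchall()

print(plain)
print(padded)
```

Note that this pads the result with a single all-zero row when there is no data at all; it does not add a row for each missing day.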

Query Help Between Date Range, but Not Recent

I know this can be written as a single SQL statement, but I just don't know how to do it. I have two separate queries. One that pulls all orders from a specific period of last year
SELECT * FROM `order` WHERE date_added BETWEEN '2014-10-01' AND '2014-11-01';
and one that pulls from the last month
SELECT * FROM `order` WHERE date_added BETWEEN DATE_SUB( now(), INTERVAL 1 MONTH) AND Now() ORDER BY date_added ASC
What I want to do now is join the two so that I only get the customer_id of customers who placed an order inside the date range last year (query 1), but haven't placed an order in the last month (query 2).
I know there is a way to set this up as a join, but my knowledge of SQL joins is very limited. Thanks for any help.
http://sqlfiddle.com/#!9/35ed0/1
SELECT * FROM `order`
WHERE date_added BETWEEN '2014-10-01' AND '2014-11-01'
AND customer_id NOT IN (
SELECT DISTINCT customer_id FROM `order`
WHERE date_added BETWEEN DATE_SUB( now(), INTERVAL 1 MONTH) AND Now())
UPDATE: If you need only one record per customer_id, here is an example. It is not the best from a performance perspective, but it returns only the last order (according to the date_added column) per customer.
SELECT t.*,
if(@fltr=customer_id, 0, 1) fltr,
@fltr:=customer_id
FROM (SELECT *
FROM `order`
WHERE date_added BETWEEN '2014-10-01' AND '2014-11-01'
AND customer_id NOT IN (
SELECT DISTINCT customer_id FROM `order`
WHERE date_added BETWEEN DATE_SUB( now(), INTERVAL 1 MONTH) AND Now())
ORDER BY customer_id, date_added DESC
) t
HAVING (fltr=1);
I usually use a correlated not exists predicate for this as I feel that it corresponds well with the intent of the question:
SELECT *
FROM `order` o1
WHERE date_added BETWEEN '2014-10-01' AND '2014-11-01'
AND NOT EXISTS (
SELECT 1
FROM `order` o2
WHERE date_added BETWEEN DATE_SUB(NOW(), INTERVAL 1 MONTH) AND NOW()
AND o1.customer_id = o2.customer_id
);
I like to approach these questions using group by and having. You are looking for customer ids, so:
select o.customer_id
from `order` o
group by o.customer_id
having sum(date_added BETWEEN '2014-10-01' AND '2014-11-01') > 0 and
sum(date_added BETWEEN DATE_SUB( now(), INTERVAL 1 MONTH) AND Now()) = 0;
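As a sanity check that the NOT EXISTS and GROUP BY/HAVING approaches agree, here is a small sketch. It assumes SQLite (via Python's sqlite3) rather than MySQL, a fixed "recent" window in place of NOW(), and invented order rows.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "order" (order_id INTEGER PRIMARY KEY, customer_id INTEGER, date_added TEXT)'
)
conn.executemany(
    'INSERT INTO "order" VALUES (?, ?, ?)',
    [
        (1, 100, "2014-10-05"),  # ordered last year only
        (2, 200, "2014-10-10"),  # ordered last year ...
        (3, 200, "2015-09-20"),  # ... and again recently
        (4, 300, "2015-09-25"),  # ordered recently only
    ],
)

# Fixed window standing in for "the last month" (assumption for repeatability).
recent = ("2015-09-01", "2015-10-01")

not_exists = conn.execute(
    """
    SELECT DISTINCT o1.customer_id
    FROM "order" o1
    WHERE o1.date_added BETWEEN '2014-10-01' AND '2014-11-01'
      AND NOT EXISTS (
          SELECT 1
          FROM "order" o2
          WHERE o2.date_added BETWEEN ? AND ?
            AND o2.customer_id = o1.customer_id
      )
    ORDER BY o1.customer_id
    """,
    recent,
).fetchall()

# Each BETWEEN evaluates to 1 or 0 per row, so SUM counts the matching orders.
having = conn.execute(
    """
    SELECT customer_id
    FROM "order"
    GROUP BY customer_id
    HAVING SUM(date_added BETWEEN '2014-10-01' AND '2014-11-01') > 0
       AND SUM(date_added BETWEEN ? AND ?) = 0
    ORDER BY customer_id
    """,
    recent,
).fetchall()

print(not_exists, having)
```

Only customer 100 qualifies in both variants: customer 200 ordered again recently, and customer 300 never ordered inside last year's range.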

MySql SELECT COUNT with two criteria

I'm trying to do a query that fetches data per the last hour and the two last hours. It's just a hits counter.
So I would like to get the resultset as follows:
id, page_url, last_hour_hits, two_last_hours_hits
The table is very simple:
id (autonumber)
page_url
time_stamp
I tried the query below:
SELECT
page_url,
COUNT(page_url) AS last_hour_hits
FROM stats
WHERE time_stamp > '2015-08-01 00:00:00'
GROUP BY page_url
('2015-08-01 00:00:00' is calculated for the last hour)
It works fine, but I have no idea how to add the 'two_last_hours' counter.
Thank you in advance.
With MySQL you can use the fact that a boolean expression evaluates to 1 when true and 0 when false, and simplify the query to this:
SELECT
page_url
,sum(time_stamp > (now() - interval 1 hour)) as last_hour_hits
,sum(time_stamp > (now() - interval 2 hour)) as two_last_hours_hits
FROM stats
GROUP BY page_url;
This uses now() to get the current time and subtracts the interval as needed.
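The boolean-sum idea can be tried out in isolation. The sketch below assumes SQLite (via Python's sqlite3), where comparisons also evaluate to 1 or 0, and passes two fixed cutoffs as parameters instead of calling now(); the hits are made up.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stats (id INTEGER PRIMARY KEY, page_url TEXT, time_stamp TEXT)"
)
conn.executemany(
    "INSERT INTO stats (page_url, time_stamp) VALUES (?, ?)",
    [
        ("/home", "2015-08-01 00:30:00"),   # within the last hour
        ("/home", "2015-07-31 23:30:00"),   # within the last two hours
        ("/home", "2015-07-31 20:00:00"),   # older than both cutoffs
        ("/about", "2015-07-31 23:10:00"),  # within the last two hours only
    ],
)

# Pretend "now" is 2015-08-01 01:00:00, so these are the two cutoffs.
one_hour_ago = "2015-08-01 00:00:00"
two_hours_ago = "2015-07-31 23:00:00"

# Each comparison yields 1 or 0 per row; SUM adds them up per page.
hits = conn.execute(
    """
    SELECT page_url,
           SUM(time_stamp > ?) AS last_hour_hits,
           SUM(time_stamp > ?) AS two_last_hours_hits
    FROM stats
    GROUP BY page_url
    ORDER BY page_url
    """,
    (one_hour_ago, two_hours_ago),
).fetchall()

print(hits)
```

Because there is no WHERE clause, pages with no recent hits still show up with 0 in both columns. On a large table, adding a WHERE on the two-hour cutoff would limit the rows scanned, at the cost of dropping those pages.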
Just put your WHERE clause with two hours before ('2015-07-31 23:00:00') and a CASE WHEN to count for the last hour ('2015-08-01 00:00:00') :
SELECT
page_url,
COUNT(DISTINCT CASE WHEN time_stamp > '2015-08-01 00:00:00' THEN id ELSE NULL END) as last_hour_hits,
COUNT(*) AS two_last_hours_hits
FROM stats
WHERE time_stamp > '2015-07-31 23:00:00'
GROUP BY page_url;
You can join two result sets, one for the last hour and one for the last two hours. Starting from the two-hour set and left joining the one-hour set avoids one output row per hit and keeps pages with no hits in the last hour:
SELECT last_two_hours.page_url,
COALESCE(last_hour_hits, 0) AS last_hour_hits,
last_two_hours_hits
FROM (
SELECT page_url, COUNT(page_url) AS last_two_hours_hits
FROM stats WHERE time_stamp > '2015-07-31 23:00:00'
GROUP BY page_url ) as last_two_hours
left join (
SELECT page_url, COUNT(page_url) AS last_hour_hits
FROM stats WHERE time_stamp > '2015-08-01 00:00:00'
GROUP BY page_url ) as last_hour
on last_two_hours.page_url = last_hour.page_url

match timestamp with date in MYSQL using PHP

I have a table
id user Visitor timestamp
13 username abc 2014-01-16 15:01:44
I have to count the total visitors for a user over the last seven days, grouped by date (not timestamp).
SELECT count(*) from tableA WHERE user=username GROUPBY __How to do it__ LIMIT for last seven day from today.
If no visitor came on a given day there is no row for it, so that day should show 0.
What would be the correct query?
There is no need to GROUP BY if you only need the total number of visits for the week (for any user). Try this:
SELECT
COUNT(*)
FROM
`table`
WHERE
`timestamp` >= (NOW() - INTERVAL 7 DAY);
If you need to track visits for a specified user, then try this:
SELECT
DATE(`timestamp`) as `date`,
COUNT(*) as `count`
FROM
`table`
WHERE
(`timestamp` >= (NOW() - INTERVAL 7 DAY))
AND
(`user` = 'username')
GROUP BY
`date`;
MySQL DATE() function reference.
Try this:
SELECT DATE(a.timestamp), COUNT(*)
FROM tableA a
WHERE a.user='username' AND DATEDIFF(NOW(), DATE(a.timestamp)) <= 7
GROUP BY DATE(a.timestamp);
I think this works :)
SELECT Count(*)
from tableA
WHERE user = 'username' AND DATEDIFF(NOW(),timestamp)<=7

First and last record for a user in a given time period in one query

I am looking to get the first and last record for a given user_id in a time period, for example, 24 hours.
I am aware this could be done using two queries, doing something like this and then switching the ORDER BY ASC/DESC.
SELECT id, user_id, date, other_columns
FROM table
WHERE user_id = 1 AND date > DATE_SUB(CURDATE(), INTERVAL 24 HOUR)
ORDER BY date DESC
LIMIT 1
However, I am wondering if it would be possible to do this using one query.
This is something that you could consider:
SELECT t.id, t.user_id, t.date, t.other_columns
FROM table t
WHERE user_id = 1
AND date = (
SELECT MIN(date)
FROM table
WHERE user_id = t.user_id
AND date > DATE_SUB(CURDATE(), INTERVAL 24 HOUR))
UNION ALL
SELECT t.id, t.user_id, t.date, t.other_columns
FROM table t
WHERE user_id = 1
AND date = (
SELECT MAX(date)
FROM table
WHERE user_id = t.user_id
AND date > DATE_SUB(CURDATE(), INTERVAL 24 HOUR))
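Here is a minimal sketch of the UNION ALL approach that can be run as-is. It assumes SQLite (via Python's sqlite3) instead of MySQL, a hypothetical table name events (since table is a reserved word), a fixed cutoff in place of DATE_SUB(CURDATE(), INTERVAL 24 HOUR), and invented rows.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (1, 1, "2024-01-01 08:00:00"),
        (2, 1, "2024-01-01 12:00:00"),
        (3, 1, "2024-01-01 18:00:00"),
        (4, 2, "2024-01-01 09:00:00"),  # different user, ignored
    ],
)

# Fixed start of the time period (assumption for repeatability).
cutoff = "2024-01-01 00:00:00"

# Each branch needs its own alias t so the correlated subquery can refer to it.
query = """
    SELECT t.id, t.user_id, t.date
    FROM events t
    WHERE t.user_id = 1
      AND t.date = (SELECT MIN(date) FROM events
                    WHERE user_id = t.user_id AND date > :cutoff)
    UNION ALL
    SELECT t.id, t.user_id, t.date
    FROM events t
    WHERE t.user_id = 1
      AND t.date = (SELECT MAX(date) FROM events
                    WHERE user_id = t.user_id AND date > :cutoff)
"""
first_last = conn.execute(query, {"cutoff": cutoff}).fetchall()

print(sorted(first_last))
```

If the user has only one record in the period, UNION ALL returns that record twice; plain UNION would collapse it to a single row.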