I have a table of user actions. One of them is recorded every time a user opens a certain page.
Table structure:
id, user_id, action_type, created_at...
I need to select all actions from this table per day/week..., but without repeating identical actions within one day. For example: a user has visited 10 pages, but 5 of them were the same page. The result of the selection should contain only unique pages per day.
Is it possible to do this with MySQL logic alone? Or would it be better to update the repeated action when it occurs on the same day?
One approach uses select distinct:
select distinct user_id, action_type, date(created_at) created_date
from mytable
If needed, you can also use aggregation to count how many times each action_type occurred per user_id and day:
select user_id, action_type, date(created_at) created_date, count(*) cnt
from mytable
group by user_id, action_type, date(created_at)
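The two queries above can be tried out with a minimal sketch like the following. It runs them through Python's built-in sqlite3 module as a stand-in for MySQL (an assumption: SQLite's date() truncates ISO-formatted timestamps the same way MySQL's DATE() does). The table name mytable comes from the answer; the sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "action_type TEXT, created_at TEXT)"
)
# Hypothetical sample data: one action repeated on the same day.
rows = [
    (1, "page_a", "2020-05-01 09:00:00"),
    (1, "page_a", "2020-05-01 17:30:00"),  # repeat on the same day
    (1, "page_b", "2020-05-01 18:00:00"),
    (1, "page_a", "2020-05-02 08:15:00"),  # same action, next day
]
conn.executemany(
    "INSERT INTO mytable (user_id, action_type, created_at) VALUES (?, ?, ?)",
    rows,
)

# One row per user, action and day: same-day repeats collapse.
unique_per_day = conn.execute(
    "SELECT DISTINCT user_id, action_type, date(created_at) AS created_date "
    "FROM mytable ORDER BY created_date, action_type"
).fetchall()

# Same grouping, plus how often each action occurred that day.
counts_per_day = conn.execute(
    "SELECT user_id, action_type, date(created_at) AS created_date, "
    "COUNT(*) AS cnt "
    "FROM mytable GROUP BY user_id, action_type, date(created_at) "
    "ORDER BY created_date, action_type"
).fetchall()

print(unique_per_day)
print(counts_per_day)
```

With this data the repeated page_a visit on the first day shows up once in the DISTINCT result and with cnt = 2 in the aggregated one.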
I suggest the following SQL code:
SELECT DISTINCT date, URL
FROM table_name;
I assume that your table is named table_name, that the URLs (pages) are in a column named URL, and that you track the date in a column named date. If that column also stores a time of day, select DATE(date) instead so that rows are compared per day.
I have a subquery that aggregates some UNION ALL selects. On top of that I build the SELECT that creates a cross-tab and limit it to, let's say, 20 rows. I would like to retrieve the total COUNT of the subquery results before limiting them in the main query. The purpose is to build pagination that receives the total number of records and then the record grid for a specific page.
Sample query:
SELECT
name,
sumIf(metric_value, metric_name = 'data') AS data,
sumif(....
FROM
(SELECT
name, metric_name, SUM(metric_value) as metric_value
FROM
(SELECT
name, 'data' AS metric_name, SUM(data) AS metric_value
FROM
table
WHERE
date > '2017-01-01 00:00:00'
GROUP BY
name
UNION ALL
SELECT
name, 'data' AS metric_name, SUM(data) AS metric_value
FROM
table2
WHERE
date > '2017-01-01 00:00:00'
GROUP BY
name
UNION ALL
SELECT
name, 'data' AS metric_name, SUM(data) AS metric_value
FROM
table3
WHERE
date > '2017-01-01 00:00:00'
GROUP BY
name
UNION ALL
.
.
.)
GROUP BY
name, metric_name)
GROUP BY
name
ORDER BY
name ASC
LIMIT 0,20;
The first subselect returns tons of data, so I thought I could count it and return the count as one column value or row, which would propagate to the main select that limits the output to 20 results. I need to know the size of the entire result set, but I don't want to run the same query twice, once without LIMIT and once with it, just to get the COUNT. There are at least 12 UNION ALL third-level subselects, so why waste resources? I am looking for generic SQL solutions, not necessarily ones specific to ClickHouse.
I was thinking of using count(*) OVER (), but that is not supported, so if that's the only option, I know I need to run the query twice.
The first thing to mention is that usually nobody is interested in the exact number of pages for a query. It can easily be estimated, and almost no one will care how exact the estimate is. However, if you have a link to the last page in your GUI, people will often click it just to see whether it works.
Nevertheless, there are cases when an analyst has to visit all the pages, and then the GUI should display the exact amount of work. The good news is that in this latter case a better strategy is to cache a snapshot of the whole results table, and then counting the rows in that table is no longer a problem.
I mean, it makes sense to discuss with the customers whether they really need it, because unneeded full scans many times per day may affect the database load and the bill.
Anyway, if you still need the number of rows, you can simplify the query so that it only counts rows. As I understand it, this is something like:
SELECT SUM(cnt) as row_count
FROM (
SELECT COUNT(DISTINCT name) as cnt FROM table1 WHERE date > ...
UNION ALL
SELECT COUNT(DISTINCT name) as cnt FROM table2 WHERE date > ...
...
) as counts;
or, if data is a constant metric name:
SELECT COUNT(DISTINCT name) as row_count
FROM (
SELECT DISTINCT name FROM table1 WHERE date > ...
UNION ALL
SELECT DISTINCT name FROM table2 WHERE date > ...
...
) as names;
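The difference between the two counting queries above can be seen in a small sketch. It uses Python's sqlite3 module as a stand-in database (an assumption, since the question targets ClickHouse), two hypothetical tables table1 and table2 with invented rows, and a fixed cutoff in place of the elided date condition. On my reading of the answer, the summed variant counts a name once per table it appears in, while the second variant counts each name once overall, which is what the final GROUP BY name produces when the metric name is constant.

```python
import sqlite3

# In-memory SQLite database as a stand-in (assumption; not ClickHouse).
conn = sqlite3.connect(":memory:")
for table in ("table1", "table2"):
    conn.execute(f"CREATE TABLE {table} (name TEXT, date TEXT, data INTEGER)")

# Hypothetical rows: name "b" appears in both tables.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        ("a", "2017-02-01 00:00:00", 1),
        ("b", "2017-02-02 00:00:00", 2),
        ("a", "2016-12-31 00:00:00", 5),  # before the cutoff, filtered out
    ],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?, ?)",
    [
        ("b", "2017-03-01 00:00:00", 3),
        ("c", "2017-03-02 00:00:00", 4),
    ],
)

cutoff = "2017-01-01 00:00:00"

# Variant 1: sum of per-table distinct counts (a shared name counts twice).
summed_count = conn.execute(
    """
    SELECT SUM(cnt) AS row_count
    FROM (
        SELECT COUNT(DISTINCT name) AS cnt FROM table1 WHERE date > ?
        UNION ALL
        SELECT COUNT(DISTINCT name) AS cnt FROM table2 WHERE date > ?
    ) AS counts
    """,
    (cutoff, cutoff),
).fetchone()[0]

# Variant 2: distinct names across all tables (a shared name counts once).
distinct_count = conn.execute(
    """
    SELECT COUNT(DISTINCT name) AS row_count
    FROM (
        SELECT DISTINCT name FROM table1 WHERE date > ?
        UNION ALL
        SELECT DISTINCT name FROM table2 WHERE date > ?
    ) AS names
    """,
    (cutoff, cutoff),
).fetchone()[0]

print(summed_count, distinct_count)
```

Either count comes from a much lighter query than the full cross-tab, so it can be run separately from the LIMIT query that fetches one page.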
Suppose I have a SQL table that looks like this
Now I am supposed to apply this 'logic' so that I know that on 23/6/2017 the word 'accessories' appeared 2 times and 'tools' appeared 1 time.
I think there is some way to do this in MySQL, something along the lines of COUNT() and GROUP BY, but I cannot get the result I want.
Appreciate any guidance. Thanks!
You can do this by using GROUP BY:
SELECT date, category, count(*) as count FROM table_name GROUP BY date, category
You have to list every selected non-aggregated column after GROUP BY; otherwise the query may fail with an error (MySQL rejects it when ONLY_FULL_GROUP_BY is enabled).
SELECT date, category, count(*)
FROM yourtable
GROUP BY date, category
It's as simple as counting the categories corresponding to each date:
SELECT date,
category,
COUNT(*)
FROM table_name
GROUP BY date, category
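As a quick check of the GROUP BY date, category approach, here is a minimal sketch using Python's sqlite3 module in place of MySQL (an assumption). The table from the question is not shown, so the column names date and category follow the answers, and the rows are invented to match the 23/6/2017 example.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (date TEXT, category TEXT)")

# Hypothetical rows modelled on the example in the question.
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [
        ("2017-06-23", "accessories"),
        ("2017-06-23", "tools"),
        ("2017-06-23", "accessories"),
        ("2017-06-24", "tools"),
    ],
)

# One output row per (date, category) pair with its number of occurrences.
category_counts = conn.execute(
    "SELECT date, category, COUNT(*) AS cnt "
    "FROM table_name GROUP BY date, category "
    "ORDER BY date, category"
).fetchall()

for row in category_counts:
    print(row)
```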
I have a very basic table, consisting of an integer column and a timestamp column.
What's the query to count how many entries there are for each day?
When I use SELECT COUNT(*) FROM taps GROUP BY(DATE(time_stamp)), I get the total number of rows in the table, rather than the number of rows for each DISTINCT date.
How do I need to modify the query?
Pretty straightforward.
SELECT
DATE(time_stamp),
COUNT(1)
FROM taps
GROUP BY DATE(time_stamp)
I have the following table:
table 1
columns are as following:
date , time , user_id , channel
I wish to find, for a list of users watching on 2 different dates, all the relevant entries for a channel (let's say CNN, NBC...).
That means the channel on date 1 and date 2 is the same, and so is the user_id.
I have already tried the following:
select distinct monthname(date),max(date), min(date) count(distinct user_id)
from iptv_clean
group by monthname(date)
having min(date)!= max(date)
But it does not seem to work well.
Any ideas?
The following gives a list of users and channels that users watched on more than one date:
select user_id, channel, count(distinct date)
from iptv_clean
group by user_id, channel
having count(distinct date) > 1;
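The HAVING COUNT(DISTINCT date) > 1 filter above can be illustrated with a small sketch. It uses Python's sqlite3 module instead of MySQL (an assumption); the table and column names come from the question, and the viewing records are invented.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE iptv_clean (date TEXT, time TEXT, user_id INTEGER, channel TEXT)"
)

# Hypothetical viewing records.
conn.executemany(
    "INSERT INTO iptv_clean VALUES (?, ?, ?, ?)",
    [
        ("2020-01-01", "20:00", 1, "CNN"),
        ("2020-01-02", "21:00", 1, "CNN"),  # user 1, CNN, second date
        ("2020-01-01", "22:00", 1, "NBC"),  # user 1, NBC, one date only
        ("2020-01-01", "19:00", 2, "CNN"),
        ("2020-01-01", "23:00", 2, "CNN"),  # user 2, CNN twice, same date
    ],
)

# Keep only (user, channel) pairs seen on more than one distinct date.
multi_date_viewers = conn.execute(
    "SELECT user_id, channel, COUNT(DISTINCT date) AS n_dates "
    "FROM iptv_clean "
    "GROUP BY user_id, channel "
    "HAVING COUNT(DISTINCT date) > 1 "
    "ORDER BY user_id, channel"
).fetchall()

print(multi_date_viewers)
```

User 2 watched CNN twice, but on a single date, so only the pair (user 1, CNN) passes the filter.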
Is this what you want?
I have a table with columns user_id, time_stamp and activity, which I use for recording user actions for an audit trail.
Now, I would just like to get the most recent timestamp for each unique user_id. How do I do that?
SELECT MAX(time_stamp), user_id FROM table GROUP BY user_id;
The following query should be what you want...
select user_id, max(time_stamp) from yourtable group by user_id order by user_id
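A minimal sketch of the MAX(time_stamp) with GROUP BY user_id pattern, again run through Python's sqlite3 module as a stand-in for MySQL (an assumption). The table name yourtable and the columns come from the thread; the audit rows are invented.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable (user_id INTEGER, time_stamp TEXT, activity TEXT)"
)

# Hypothetical audit-trail rows.
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?)",
    [
        (1, "2021-03-01 09:00:00", "login"),
        (1, "2021-03-02 10:00:00", "edit"),
        (2, "2021-03-01 12:00:00", "login"),
    ],
)

# Most recent timestamp per user; ISO-formatted text sorts chronologically.
latest_per_user = conn.execute(
    "SELECT user_id, MAX(time_stamp) AS last_seen "
    "FROM yourtable GROUP BY user_id ORDER BY user_id"
).fetchall()

print(latest_per_user)
```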