Let's say I have a schools table (cols = "ids (int)") and a users table (cols = "id (int), school_id (int), created_at (datetime)").
I have a list of school ids saved in <school_ids>. I want to group those schools by the yearweek(users.created_at) value for the user at that school with the earliest created_at value, and for each group list the value of yearweek(users.created_at) and the number of schools.
In other words, I want to find the earliest-created user for each school, and then group the schools by the yearweek() result for that created_at date, so that I effectively have the number of schools that signed up their first user in each week.
So, I want results like
| 201301 | 22 | #meaning there are 22 schools where the earliest created_at user
#has yearweek(created_at) = "201301"
| 201302 | 5 | #meaning there are 5 schools where the earliest created_at user
#has yearweek(created_at) = "201302"
etc
As a sanity check, the total of all rows in the second column should equal the size of <school_ids>, ie the number of ids in school_ids.
Does that make sense? I can't quite figure out how to get this without doing several queries and storing values in between. I'm sure there's a one-liner. Thanks! max
You could use a subquery that returns the minimum created_at field for every school_id, and then you can group by yearweek and do the count:
SELECT
    yearweek(u.min_created_at) AS yearweek_first_user,
    COUNT(*) AS schools
FROM
    (
        SELECT school_id, MIN(created_at) AS min_created_at
        FROM users
        WHERE school_id IN (<school_ids>)  -- restrict to the saved list of ids
        GROUP BY school_id
    ) u
GROUP BY
    yearweek(u.min_created_at)
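To check the subquery approach on a small data set, here is a minimal sketch in Python using an in-memory SQLite database as a stand-in for MySQL. The helper name first_user_weeks and the sample rows are made up for illustration. SQLite has no yearweek(), so strftime('%Y%W', ...) is used instead; its week numbering is not guaranteed to match MySQL's yearweek() modes.

```python
import sqlite3

def first_user_weeks(conn, school_ids):
    """Count schools per year-week of their earliest user (hypothetical helper)."""
    placeholders = ",".join("?" for _ in school_ids)
    sql = f"""
        SELECT strftime('%Y%W', u.min_created_at) AS yearweek_first_user,
               COUNT(*) AS schools
        FROM (
            -- earliest user per school, restricted to the requested ids
            SELECT school_id, MIN(created_at) AS min_created_at
            FROM users
            WHERE school_id IN ({placeholders})
            GROUP BY school_id
        ) u
        GROUP BY yearweek_first_user
        ORDER BY yearweek_first_user
    """
    return conn.execute(sql, list(school_ids)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, school_id INTEGER, created_at TEXT)"
)
# Invented sample data: school 4 exists but is not in the requested list.
conn.executemany(
    "INSERT INTO users (school_id, created_at) VALUES (?, ?)",
    [
        (1, "2013-01-07 09:00:00"),
        (1, "2013-02-01 10:00:00"),
        (2, "2013-01-08 11:00:00"),
        (3, "2013-01-15 12:00:00"),
        (3, "2013-01-20 13:00:00"),
        (4, "2013-01-02 14:00:00"),
    ],
)

week_counts = first_user_weeks(conn, [1, 2, 3])
print(week_counts)
```

The counts only add up to the size of the id list if every listed school has at least one user; a school with no users produces no row in the subquery.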
I'm practicing MySQL and I'm trying to solve an exercise. I have data that contains reviews to a hotel. The data contains reviews by different users: one user can have many reviews if they have visited more than once. Each review has its own id and then review values from 1 to 5.
The reviews also have dates, and now I would like to calculate the average ratings of the first visits (earliest date). My problem is that the ways I have tried to retrieve the earliest date don't actually work: I get the same results with and without the HAVING and WHERE methods. Could someone help me with this? Thanks!
Here is my query (I have tried with the HAVING and WHERE methods)
SELECT AVG(overall_rating), AVG(rooms_rating), AVG(service_rating), AVG(location_rating), AVG(value_rating)
FROM reviews
HAVING MIN(review_date)
#WHERE review_date IN (SELECT MIN(review_date) FROM repatrons_reviews GROUP BY id)
Here is an example of the data
user_id | id | rooms_rating | service_rating | location_rating | value_rating | date
---------------------------------------------------------------------------------------
matt21 | 123 | 4 | 5 | 2 | 4 | 2007-08-20
This
SELECT ...
FROM reviews
HAVING MIN(review_date)
cannot work. Let's say the minimum date in the table is DATE '2020-01-01', then what is HAVING DATE '2020-01-01' supposed to mean?
This
SELECT ...
FROM reviews
WHERE review_date IN (SELECT MIN(review_date) FROM reviews GROUP BY id);
is close, but it takes the minimum date per review ID, whereas you want the minimum date per user ID. And if you replace id by user_id, there is still a problem, because what is the first date for one user can be the third date for another.
Here is this query corrected:
SELECT
AVG(overall_rating), AVG(rooms_rating),
AVG(service_rating), AVG(location_rating), AVG(value_rating)
FROM reviews
WHERE (user_id, review_date) IN
(SELECT user_id, MIN(review_date) FROM reviews GROUP BY user_id);
You can do it with NOT EXISTS:
SELECT AVG(r.overall_rating), AVG(r.rooms_rating), AVG(r.service_rating), AVG(r.location_rating), AVG(r.value_rating)
FROM reviews r
WHERE NOT EXISTS (SELECT 1 FROM reviews WHERE user_id = r.user_id AND date < r.date)
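The difference between matching on the date alone and matching per user can be seen on a tiny data set. This is a sketch only: it uses Python with an in-memory SQLite database in place of MySQL, a cut-down reviews table with a single rating column, and invented rows. User a's second visit falls on the same day as user b's first visit, which is exactly the case the date-only filter gets wrong.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reviews (id INTEGER PRIMARY KEY, user_id TEXT, "
    "review_date TEXT, overall_rating INTEGER)"
)
# Invented sample data for illustration.
conn.executemany(
    "INSERT INTO reviews (user_id, review_date, overall_rating) VALUES (?, ?, ?)",
    [
        ("a", "2007-01-01", 2),  # a's first visit
        ("a", "2007-03-01", 5),  # a's second visit, same day as b's first
        ("b", "2007-03-01", 4),  # b's first visit
        ("b", "2007-05-01", 1),  # b's second visit
    ],
)

# Date-only filter: also picks up a's second visit on 2007-03-01.
naive_avg = conn.execute(
    """
    SELECT AVG(overall_rating)
    FROM reviews
    WHERE review_date IN (SELECT MIN(review_date) FROM reviews GROUP BY user_id)
    """
).fetchone()[0]

# Per-user filter: keep a review only if the same user has no earlier one.
first_visit_avg = conn.execute(
    """
    SELECT AVG(r.overall_rating)
    FROM reviews r
    WHERE NOT EXISTS (
        SELECT 1 FROM reviews e
        WHERE e.user_id = r.user_id AND e.review_date < r.review_date
    )
    """
).fetchone()[0]

print(naive_avg, first_visit_avg)
```

The date-only version averages three rows (2, 5 and 4), while the per-user version averages only the two genuine first visits (2 and 4).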
The table below contains an id, a Year and Groups
GroupingTable
id | Year | Groups
1 | 2000 | A
2 | 2001 | B
3 | 2001 | A
Now I want to select the greatest year after grouping the rows by the Groups column
SELECT
id,
Year,
Groups
FROM
GroupingTable
GROUP BY
`Groups`
ORDER BY Year DESC
Below is what I am expecting, even though the query above doesn't work as expected
id | Year | Groups
2 | 2001 | B
3 | 2001 | A
You need to learn how to use aggregate functions.
SELECT
MAX(Year) AS Year,
Groups
FROM
GroupingTable
GROUP BY
`Groups`
ORDER BY Year DESC
When using GROUP BY, only the column(s) you group by are unambiguous, because they have the same value on every row of the group.
Other columns return a value arbitrarily from one of the rows in the group. Actually, this is the behavior of MySQL (and SQLite), but because of the ambiguity, it's an illegal query in standard SQL and all other brands of SQL implementations.
For more on this, see my answer to Reason for Column is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause
Your query misuses the heinously confusing nonstandard extension to GROUP BY that's built in to MySQL. Read this and weep. https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html
If all you want is the year it's a snap.
SELECT MAX(Year) Year, Groups
FROM GroupingTable
GROUP BY Groups
If you want the id of the row in question, you have to do a bunch of monkey business to retrieve the column id from the above query.
SELECT a.*
FROM GroupingTable a
JOIN (
    SELECT MAX(Year) Year, Groups
    FROM GroupingTable
    GROUP BY Groups
) b ON a.Groups = b.Groups AND a.Year = b.Year
You have to do this because the GROUP BY query yields a summary result set, and you have to join that back to the detail result set to retrieve the ID.
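Here is a small runnable sketch of the join-back, using the three rows from the question. It runs on SQLite through Python's sqlite3 module rather than on MySQL, and the Groups column is double-quoted because GROUPS is a keyword in recent SQLite versions (the backticks in the question serve the same purpose in MySQL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE GroupingTable (id INTEGER, Year INTEGER, "Groups" TEXT)')
# The rows from the question.
conn.executemany(
    "INSERT INTO GroupingTable VALUES (?, ?, ?)",
    [(1, 2000, "A"), (2, 2001, "B"), (3, 2001, "A")],
)

latest_rows = conn.execute(
    """
    SELECT a.id, a.Year, a."Groups"
    FROM GroupingTable a
    JOIN (
        -- summary result set: one row per group with its greatest year
        SELECT MAX(Year) AS Year, "Groups"
        FROM GroupingTable
        GROUP BY "Groups"
    ) b ON a."Groups" = b."Groups" AND a.Year = b.Year
    ORDER BY a.id
    """
).fetchall()

print(latest_rows)
```

If two rows in the same group share the greatest year, the join returns both of them.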
I'm having trouble writing this query. I have 2 tables, vote_table and click_table. In the vote_table I have two fields, id and date. The format of the date is "12/30/11 : 14:28:36". In the click_table I have two fields, id and date. The format of the date is "12.30.11".
The ids occur multiple times in both tables. What I want to do is produce a result that contains 3 fields: id, votes, clicks. The id column should have distinct id values, the votes column should have the total number of times that ID has the date 12/30/11% in the vote_table, and the clicks column should have the total number of times that ID has the date 12.30.11 in the click_table, so something like this:
ID | VOTES | CLICKS
001 | 24 | 50
002 | 30 | 45
Assuming that the types of the 'date' columns are actually either DATE or DATETIME (rather than, say, VARCHAR), then the required operation is fairly straight-forward:
SELECT v.id, v.votes, c.clicks
  FROM (SELECT id, COUNT(*) AS votes
          FROM vote_table AS v1
         WHERE DATE(v1.`date`) = TIMESTAMP('2011-12-30')
         GROUP BY v1.id) AS v
  JOIN (SELECT id, COUNT(*) AS clicks
          FROM click_table AS c1
         WHERE DATE(c1.`date`) = TIMESTAMP('2011-12-30')
         GROUP BY c1.id) AS c
    ON v.id = c.id
 ORDER BY v.id;
Note that this only shows ID values for which there is at least one vote and at least one click on the given day. If you need to see all the ID values which either voted or clicked or both, then you have to do more work.
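One possible shape for that extra work is to build the list of ids from both tables with UNION and then LEFT JOIN each count onto it, filling missing counts with 0. The following is only a sketch: it uses Python with an in-memory SQLite database instead of MySQL, assumes the date columns hold ISO-formatted date or datetime text, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE vote_table (id TEXT, "date" TEXT)')
conn.execute('CREATE TABLE click_table (id TEXT, "date" TEXT)')
# Invented sample data: 002 only voted, 003 only clicked on 2011-12-30.
conn.executemany(
    "INSERT INTO vote_table VALUES (?, ?)",
    [
        ("001", "2011-12-30 14:28:36"),
        ("001", "2011-12-30 15:00:00"),
        ("002", "2011-12-30 09:00:00"),
        ("002", "2011-12-29 10:00:00"),
    ],
)
conn.executemany(
    "INSERT INTO click_table VALUES (?, ?)",
    [
        ("001", "2011-12-30"),
        ("003", "2011-12-30"),
        ("003", "2011-12-30"),
        ("003", "2011-12-31"),
    ],
)

sql = """
    SELECT ids.id,
           COALESCE(v.votes, 0)  AS votes,
           COALESCE(c.clicks, 0) AS clicks
    FROM (
        -- every id that voted or clicked on the given day
        SELECT id FROM vote_table  WHERE DATE("date") = :day
        UNION
        SELECT id FROM click_table WHERE DATE("date") = :day
    ) ids
    LEFT JOIN (SELECT id, COUNT(*) AS votes
               FROM vote_table WHERE DATE("date") = :day GROUP BY id) v
           ON v.id = ids.id
    LEFT JOIN (SELECT id, COUNT(*) AS clicks
               FROM click_table WHERE DATE("date") = :day GROUP BY id) c
           ON c.id = ids.id
    ORDER BY ids.id
"""
totals = conn.execute(sql, {"day": "2011-12-30"}).fetchall()
print(totals)
```

UNION (rather than UNION ALL) removes duplicate ids, so each id appears once in the result.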
If you have to normalize the dates because they are VARCHAR columns, the WHERE clauses become correspondingly more complex.
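To illustrate what normalizing means for the two formats in the question, here is a short Python sketch that parses both string layouts into the same date value. It does the parsing outside the database; inside MySQL a function such as STR_TO_DATE would be the likely tool, but the exact format strings for that are not shown here. The constant and function names are hypothetical.

```python
from datetime import date, datetime

# Format strings assumed from the two examples in the question.
VOTE_FORMAT = "%m/%d/%y : %H:%M:%S"   # e.g. "12/30/11 : 14:28:36"
CLICK_FORMAT = "%m.%d.%y"             # e.g. "12.30.11"

def to_day(text, fmt):
    """Parse a stored string and keep only the calendar date."""
    return datetime.strptime(text, fmt).date()

vote_day = to_day("12/30/11 : 14:28:36", VOTE_FORMAT)
click_day = to_day("12.30.11", CLICK_FORMAT)

# Once both values are real dates, they compare equal for the same day.
print(vote_day, click_day, vote_day == click_day)
```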
I have one sales table with 5 columns:
id | item_id | department_id | bought_at | sold_at
All columns are INT(11).
Items can be bought all the time, but I would like to know the average of the quickest sales for one department. Per item, the bought_at values are all the same.
Any suggestion how to do this in one query?
I assume by quickest sale you mean that bought_at and sold_at are nearly equal. If you want the average number of quick sales per item for a department, then use the following query (assuming the department_id we want the average for is 1, and assuming a threshold of 10 for the difference between sold_at and bought_at):
SELECT AVG(sales_count) average_sales FROM
( SELECT COUNT(*) AS sales_count
  FROM sales
  WHERE department_id = 1 AND (sold_at - bought_at) < 10
  GROUP BY item_id
) counts
Maybe this works:
select department_id, avg(sold_at-bought_at) as avg_time
from sales
group by department_id
order by avg_time;
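Another reading of the question, which is an assumption on my part, is that the "quickest sale" of an item is its smallest sold_at - bought_at difference, and the goal is the average of those per-item minimums for each department. A sketch of that interpretation, run on SQLite through Python with invented rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, item_id INTEGER, "
    "department_id INTEGER, bought_at INTEGER, sold_at INTEGER)"
)
# Invented sample data: bought_at is the same for every row of an item.
conn.executemany(
    "INSERT INTO sales (item_id, department_id, bought_at, sold_at) VALUES (?, ?, ?, ?)",
    [
        (10, 1, 100, 105),  # item 10: quickest sale took 5
        (10, 1, 100, 130),
        (11, 1, 200, 215),  # item 11: quickest sale took 15
        (11, 1, 200, 260),
        (20, 2, 50, 90),    # item 20: quickest sale took 40
    ],
)

avg_quickest = conn.execute(
    """
    SELECT department_id, AVG(quickest) AS avg_quickest
    FROM (
        -- quickest sale per item
        SELECT department_id, item_id, MIN(sold_at - bought_at) AS quickest
        FROM sales
        GROUP BY department_id, item_id
    ) per_item
    GROUP BY department_id
    ORDER BY department_id
    """
).fetchall()

print(avg_quickest)
```

Adding WHERE department_id = 1 to the inner query would restrict the result to a single department.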
I am storing all visits to my site in a table, I store the date, the page visited and a session id.
In theory, I can group somebody by their session id and this counts as 1 visit.
What I'd like to do however is go through the table and get the total of visits for each date. So it would group by the session id, and then group by the date.
ie:
SELECT DATE(added) as date, COUNT(*) FROM visits GROUP BY sessionID, date
This doesn't work, as it then retrieves the total of visits for each session id and date.
My table structure looks a bit like this:
----------------------------------
| id | added | page | sessionid
----------------------------------
Any ideas?
My query gives me results that look like this:
2010-11-24 | 2
2010-11-24 | 14
2010-11-24 | 17
2010-11-24 | 1
While I'd be hoping for something more like a total of all those under the 1 date, ie:
2010-11-24 | 34
Each date contains the time which will be different for each request. If you use DATE in the GROUP BY clause just like you did in the SELECT clause, that will solve your problem.
By grouping by sessionID, it's going to create a row for every session. If, instead of grouping by sessionID, you use COUNT(DISTINCT sessionID), that will count the distinct number of session IDs for that date.
SELECT DATE(added) as date, COUNT(DISTINCT sessionID) as sessions FROM visits GROUP BY DATE(added)
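A quick way to see the difference between counting rows and counting distinct sessions is to run both aggregates side by side. This sketch uses Python with an in-memory SQLite database standing in for MySQL; the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (id INTEGER PRIMARY KEY, added TEXT, page TEXT, sessionid TEXT)"
)
# Invented sample data: two sessions on the 24th, one on the 25th.
conn.executemany(
    "INSERT INTO visits (added, page, sessionid) VALUES (?, ?, ?)",
    [
        ("2010-11-24 09:00:00", "/", "s1"),
        ("2010-11-24 09:01:00", "/about", "s1"),
        ("2010-11-24 10:00:00", "/", "s2"),
        ("2010-11-24 10:02:00", "/contact", "s2"),
        ("2010-11-24 10:05:00", "/about", "s2"),
        ("2010-11-25 08:00:00", "/", "s1"),
    ],
)

per_day = conn.execute(
    """
    SELECT DATE(added) AS day,
           COUNT(DISTINCT sessionid) AS sessions,   -- one per visitor session
           COUNT(*) AS page_views                   -- one per stored request
    FROM visits
    GROUP BY DATE(added)
    ORDER BY day
    """
).fetchall()

print(per_day)
```

Note that a session id seen on two different dates is counted once for each date.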