sql selecting a lot of counts - mysql

I have a problem driving me nuts for the last 2 days. I basically have 4 tables with inheritance in the following order:
             users
               |
categories   blogs
    |        |   |
    ------ pages visits
So a user has many blogs, each of which has many pages and visits. Each page also belongs to a category.
All I want is to extract all users with the following counts associated:
total number of blogs each user has
total number of pages each user has
total number of categories each user has blogs in
total number of visits each user has
total number of visitors each user has (visits but we count by distinct ip_address)
My query is as follows:
SELECT
u.id,
u.username,
COUNT(b.id) as blogs_count,
COUNT(p.id) as pages_count,
COUNT(v.id) as visits_count,
COUNT(distinct ip_address) as visitors_count,
COUNT(c.id) as categories_count
FROM
users u
LEFT JOIN
blogs b ON(b.user_id=u.id)
LEFT JOIN
pages p ON(p.blog_id=b.id)
LEFT JOIN
visits v ON(v.blog_id=b.id)
LEFT JOIN
categories c ON(v.category_id=c.id)
GROUP BY u.id, blogs_count, pages_count, visits_count,
visitors_count, categories_count
I should get 24 users with their counts but, since I have almost 300,000 visits, the query hangs indefinitely, probably because the joins produce millions of intermediate rows.
I'm not a database guru, as is obvious. Can someone point me in the right direction so I can write a query that performs well even on millions of records (with the right hardware, of course)?

Try this:
SELECT u.id,
u.username,
COUNT(b.id) AS blogs_count,
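COALESCE(SUM(p.pagecnt), 0) AS pages_count, -- SUM, not MAX: add up the per-blog counts
COALESCE(SUM(v.visitscnt), 0) AS visits_count,
COALESCE(SUM(v.visitorscnt), 0) AS visitors_count, -- an IP that visits two of a user's blogs is counted once per blog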
COALESCE(MAX(c.catcnt), 0) AS categories_count
FROM users u
LEFT JOIN blogs b ON u.id = b.user_id
LEFT JOIN (
SELECT blog_id,
COUNT(*) AS pagecnt
FROM pages
GROUP BY blog_id
) p ON b.id = p.blog_id
LEFT JOIN (
SELECT blog_id,
COUNT(*) AS visitscnt,
COUNT(DISTINCT ip_address) AS visitorscnt
FROM visits
GROUP BY blog_id
) v ON b.id = v.blog_id
LEFT JOIN (
SELECT aa.id,
COUNT(DISTINCT dd.id) AS catcnt
FROM users aa
JOIN blogs bb ON aa.id = bb.user_id
JOIN pages cc ON bb.id = cc.blog_id
JOIN categories dd ON cc.category_id = dd.id
GROUP BY aa.id
) c ON u.id = c.id
GROUP BY u.id,
u.username
Breakdown
This should also work on other DBMSs such as PostgreSQL and SQL Server.
The challenge is that you have a hierarchy of 1:M relationships, and joining them all together can easily throw off the different types of counts (you want distinct counts in some places, but total counts in others).
What I've decided to do is first subselect the page count and the visit / distinct-visitor counts, grouping by blog_id. This ensures that we get only one row per blog_id, even after joining the subselects to the blogs table.
For the category count, you want a count of distinct categories per user, but categories is linked deep within the relationship hierarchy (to the pages table), so you need a separate subselect that is grouped and joined on the user id instead of the blog_id.
Even with as many subselects as this query contains, it should still be quite fast, because no two subselects are joined against each other. As long as there is an indexed table on one side of each join (the subselects themselves are unindexed temporary tables), you should be fine.
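To make the fan-out concrete, here is a minimal sketch using Python's built-in sqlite3 module and a tiny made-up data set (the rows, the usernames, and the use of pages.category_id in place of a categories table are assumptions for illustration). It compares the plain multi-join counts with a variant of the pre-aggregation idea in which every subselect is grouped per user rather than per blog, so distinct visitors and categories are counted once per user.

```python
import sqlite3

# Hypothetical miniature data set; table and column names follow the thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users  (id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE blogs  (id INTEGER PRIMARY KEY, user_id INTEGER);
CREATE TABLE pages  (id INTEGER PRIMARY KEY, blog_id INTEGER, category_id INTEGER);
CREATE TABLE visits (id INTEGER PRIMARY KEY, blog_id INTEGER, ip_address TEXT);

INSERT INTO users  VALUES (1, 'alice'), (2, 'bob');   -- bob has no blogs
INSERT INTO blogs  VALUES (1, 1), (2, 1);             -- alice has two blogs
INSERT INTO pages  VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1);
INSERT INTO visits VALUES (1, 1, '10.0.0.1'), (2, 1, '10.0.0.2'),
                          (3, 2, '10.0.0.1'), (4, 2, '10.0.0.1');
""")

# Joining pages and visits to the same blog row multiplies them together.
naive = conn.execute("""
    SELECT u.id, COUNT(b.id), COUNT(p.id), COUNT(v.id)
    FROM users u
    LEFT JOIN blogs  b ON b.user_id = u.id
    LEFT JOIN pages  p ON p.blog_id = b.id
    LEFT JOIN visits v ON v.blog_id = b.id
    GROUP BY u.id
    ORDER BY u.id
""").fetchall()

# Each subselect returns at most one row per user, so nothing fans out.
fixed = conn.execute("""
    SELECT u.id, u.username,
           COALESCE(b.blogs_count, 0),
           COALESCE(p.pages_count, 0),
           COALESCE(p.categories_count, 0),
           COALESCE(v.visits_count, 0),
           COALESCE(v.visitors_count, 0)
    FROM users u
    LEFT JOIN (SELECT user_id, COUNT(*) AS blogs_count
               FROM blogs GROUP BY user_id) b ON b.user_id = u.id
    LEFT JOIN (SELECT bl.user_id,
                      COUNT(*) AS pages_count,
                      COUNT(DISTINCT pg.category_id) AS categories_count
               FROM blogs bl JOIN pages pg ON pg.blog_id = bl.id
               GROUP BY bl.user_id) p ON p.user_id = u.id
    LEFT JOIN (SELECT bl.user_id,
                      COUNT(*) AS visits_count,
                      COUNT(DISTINCT vs.ip_address) AS visitors_count
               FROM blogs bl JOIN visits vs ON vs.blog_id = bl.id
               GROUP BY bl.user_id) v ON v.user_id = u.id
    ORDER BY u.id
""").fetchall()

print(naive)  # inflated: every count for alice equals the joined row count
print(fixed)  # blogs, pages, categories, visits, visitors per user
```

On this data alice really has 2 blogs, 3 pages, and 4 visits, yet the plain join reports 6 for each, because her first blog contributes 2 x 2 rows and her second 1 x 2.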

SELECT
u.id,
u.username,
COUNT(b.id) as blogs_count,
COUNT(p.id) as pages_count,
COUNT(v.id) as visits_count,
COUNT(distinct ip_address) as visitors_count,
COUNT(c.id) as categories_count
FROM
users u
LEFT JOIN
blogs b ON(b.user_id=u.id)
LEFT JOIN
pages p ON(p.blog_id=b.id)
LEFT JOIN
visits v ON(v.blog_id=b.id)
LEFT JOIN
categories c ON(v.category_id=c.id)
GROUP BY u.id
Try removing blogs_count, pages_count, visits_count, visitors_count, and categories_count from your GROUP BY clause.

Related

Get total count and comments with posts

I want to get the total likes and total count of the every post in a single query with the help of joins.
I am using this query. but the result is wrong
SELECT blog.id, count(blog_comments.id) as likes , count(blog_likes.id) as comments
FROM blog LEFT JOIN
blog_comments
ON blog.id = blog_comments.blog_id LEFT JOIN
blog_likes
ON blog.id = blog_likes.blog_id
GROUP BY blog.id
Please check the image for table structure:
Your problem is that you are aggregating along two dimensions at the same time. This produces a Cartesian product per post: each like row pairs with each comment row, for a total of l * c rows.
The simplest way to fix this is to use the DISTINCT keyword:
SELECT b.id, count(DISTINCT bl.id) as likes , count(DISTINCT bc.id) as comments
FROM blog b LEFT JOIN
blog_comments bc
ON b.id = bc.blog_id LEFT JOIN
blog_likes bl
ON b.id = bl.blog_id
GROUP BY b.id;
If you have posts that have lots of likes and lots of comments, this is not recommended, because it creates a Cartesian product of the two.
There are several solutions for this, but I would recommend correlated subqueries:
select b.id,
(select count(*) from blog_likes bl where bl.blog_id = b.id) as likes,
(select count(*) from blog_comments bc where bc.blog_id = b.id) as comments
from blog b;
This can take advantage of indexes on blog_likes(blog_id) and blog_comments(blog_id).
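As a small check of the l * c claim, the following sketch (Python's sqlite3 module, with a made-up post that has 3 likes and 2 comments) counts the joined rows and then compares plain COUNT, COUNT(DISTINCT ...), and the correlated subqueries. The data is an assumption; the table names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE blog          (id INTEGER PRIMARY KEY);
CREATE TABLE blog_likes    (id INTEGER PRIMARY KEY, blog_id INTEGER);
CREATE TABLE blog_comments (id INTEGER PRIMARY KEY, blog_id INTEGER);

INSERT INTO blog VALUES (1);
INSERT INTO blog_likes    (blog_id) VALUES (1), (1), (1);  -- l = 3 likes
INSERT INTO blog_comments (blog_id) VALUES (1), (1);       -- c = 2 comments
""")

JOINS = """
    FROM blog b
    LEFT JOIN blog_comments bc ON b.id = bc.blog_id
    LEFT JOIN blog_likes    bl ON b.id = bl.blog_id
"""

# The join pairs every like with every comment: 3 * 2 rows for this post.
joined_rows = conn.execute("SELECT COUNT(*)" + JOINS).fetchone()[0]

# Plain COUNT sees all six rows for both columns.
plain = conn.execute(
    "SELECT COUNT(bl.id), COUNT(bc.id)" + JOINS + "GROUP BY b.id"
).fetchone()

# DISTINCT collapses the duplicates, but only after the rows were built.
distinct = conn.execute(
    "SELECT COUNT(DISTINCT bl.id), COUNT(DISTINCT bc.id)" + JOINS + "GROUP BY b.id"
).fetchone()

# Correlated subqueries never build the product at all.
correlated = conn.execute("""
    SELECT b.id,
           (SELECT COUNT(*) FROM blog_likes bl    WHERE bl.blog_id = b.id),
           (SELECT COUNT(*) FROM blog_comments bc WHERE bc.blog_id = b.id)
    FROM blog b
""").fetchone()

print(joined_rows, plain, distinct, correlated)
```

Both fixes return 3 likes and 2 comments here; the difference is only in how many intermediate rows the database has to produce, which is what matters for posts with many likes and many comments.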
This is based on my own tables, but it should help you:
SELECT people.pe_name,
COUNT(DISTINCT orders.ord_id) AS num_orders,
COUNT(items.item_id) AS num_items
FROM people
INNER JOIN orders ON orders.pe_id = people.pe_id
INNER JOIN items ON items.ord_id = orders.ord_id
GROUP BY people.pe_id;

strange MySQL join query results with aggregate functions

I wrote the following join query to get a report using aggregate functions:
SELECT users.id, SUM(orders.totalCost) AS bought, COUNT(comment.id) AS commentsCount, COUNT(topics.id) AS topicsCount, COUNT(users_login.id) AS loginCount, COUNT(users_download.id) AS downloadsCount
FROM users
LEFT JOIN orders ON users.id=orders.userID AND orders.payStatus=1
LEFT JOIN comment ON users.id=comment.userID
LEFT JOIN topics ON users.id=topics.userID
LEFT JOIN users_login ON users.id=users_login.userID
LEFT JOIN users_download ON users.id=users_download.userID
GROUP BY users.id
ORDER BY bought DESC
But I don't know why I get the following output.
The results of the aggregate functions are multiplied with each other!
For example, for the last row I expected the following result:
821 | 48000 | 63 | 0 | 10 | 10
The result of running EXPLAIN on the query is shown below:
One reason for this type of result is that you LEFT JOIN several tables to users, so the joined result set contains duplicate rows for each user and every count comes out higher than expected. To fix the counts, use DISTINCT inside COUNT so that only the unique associations per user are counted. For the sum of totalCost, use a subselect that computes the sum per user, so that order values are not repeated:
SELECT
u.id,
COALESCE(o.bought,0) bought,
COUNT(DISTINCT c.id) AS commentsCount,
COUNT(DISTINCT t.id) AS topicsCount,
COUNT(DISTINCT ul.id) AS loginCount,
COUNT(DISTINCT ud.id) AS downloadsCount
FROM users u
LEFT JOIN (SELECT
userID,
SUM(totalCost) bought
FROM orders
WHERE payStatus=1
GROUP BY userID) o
ON u.id=o.userID
LEFT JOIN `comment` c ON u.id=c.userID
LEFT JOIN topics t ON u.id=t.userID
LEFT JOIN users_login ul ON u.id=ul.userID
LEFT JOIN users_download ud ON u.id=ud.userID
GROUP BY u.id
ORDER BY bought DESC
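The reason SUM needs a subselect while the counts can get by with DISTINCT is easiest to see on a toy example. The sketch below uses Python's sqlite3 module with invented rows and only two of the joined tables (orders and comment); it is an illustration under those assumptions, not the asker's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users   (id INTEGER PRIMARY KEY);
CREATE TABLE orders  (id INTEGER PRIMARY KEY, userID INTEGER,
                      totalCost INTEGER, payStatus INTEGER);
CREATE TABLE comment (id INTEGER PRIMARY KEY, userID INTEGER);

INSERT INTO users VALUES (1), (2);
-- user 1: two paid orders (100 + 50) and one unpaid order; user 2: no orders
INSERT INTO orders (userID, totalCost, payStatus)
VALUES (1, 100, 1), (1, 50, 1), (1, 30, 0);
-- user 1: three comments; user 2: one comment
INSERT INTO comment (userID) VALUES (1), (1), (1), (2);
""")

# Every paid order is repeated once per comment, so SUM is multiplied by 3.
naive = conn.execute("""
    SELECT u.id, SUM(o.totalCost), COUNT(c.id)
    FROM users u
    LEFT JOIN orders  o ON u.id = o.userID AND o.payStatus = 1
    LEFT JOIN comment c ON u.id = c.userID
    GROUP BY u.id
    ORDER BY u.id
""").fetchall()

# Summing inside a subselect yields one row per user before the join.
fixed = conn.execute("""
    SELECT u.id,
           COALESCE(o.bought, 0) AS bought,
           COUNT(DISTINCT c.id)  AS commentsCount
    FROM users u
    LEFT JOIN (SELECT userID, SUM(totalCost) AS bought
               FROM orders
               WHERE payStatus = 1
               GROUP BY userID) o ON u.id = o.userID
    LEFT JOIN comment c ON u.id = c.userID
    GROUP BY u.id, o.bought
    ORDER BY bought DESC, u.id
""").fetchall()

print(naive)
print(fixed)
```

SUM(DISTINCT totalCost) would not be a safe shortcut, since two different orders with the same cost would be merged into one.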

MySql query to get count of days spent in each country for each purpose? (Get count of all record in second table present in first table)

I have three tables tl_log, tl_geo_countries,tl_purpose. I am trying to get the count of number of days spent in each country in table 'tl_log' for each purpose in table 'tl_purpose'.
I tried below mysql query
SELECT t.country_id AS countryID,t.reason_id AS reasonID,count(t.reason_id) AS
days,c.name AS country, p.purpose AS purpose
FROM `tl_log` AS t
LEFT JOIN tl_geo_countries AS c ON t.country_id=c.id
LEFT JOIN tl_purpose AS p ON t.reason_id=p.id
GROUP BY t.reason_id,t.country_id ORDER BY days DESC
But I ended up with the following result.
I am not able to get a count for each purpose in each country when that combination is not present in table 'tl_log'. Any help is greatly appreciated. Also, please let me know if the question is difficult to understand.
Expected Output:
Below is the structure of these three tables
tl_log
tl_geo_countries
tl_purpose
If you want all possible combination of countries and purposes, even those that do not appear on the log table (these will be shown with a count of 0), you can do first a cartesian product of the two tables (a CROSS join) and then LEFT join to the log table:
SELECT
c.id AS countryID,
p.id AS reasonID,
COUNT(t.reason_id) AS days,
c.name AS country,
p.purpose AS purpose
FROM
tl_geo_countries AS c
CROSS JOIN
tl_purpose AS p
LEFT JOIN
tl_log AS t
ON t.country_id = c.id
AND t.reason_id = p.id
GROUP BY
p.id,
c.id
ORDER BY
days DESC ;
If you want the records for only the countries that are present in the log table (but still all possible reason/purposes), a slight modification is needed:
SELECT
c.id AS countryID,
p.id AS reasonID,
COUNT(t.reason_id) AS days,
c.name AS country,
p.purpose AS purpose
FROM
( SELECT DISTINCT
country_id
FROM
tl_log
) AS dc
JOIN
tl_geo_countries AS c
ON c.id = dc.country_id
CROSS JOIN
tl_purpose AS p
LEFT JOIN
tl_log AS t
ON t.country_id = c.id
AND t.reason_id = p.id
GROUP BY
p.id,
c.id
ORDER BY
days DESC ;
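A quick way to see what the CROSS JOIN adds is to run both shapes on a tiny data set. The sketch below uses Python's sqlite3 module; the two countries, two purposes, and two log rows are invented for illustration, and tie-breaking columns were added to ORDER BY only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tl_geo_countries (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tl_purpose       (id INTEGER PRIMARY KEY, purpose TEXT);
CREATE TABLE tl_log (id INTEGER PRIMARY KEY, country_id INTEGER, reason_id INTEGER);

INSERT INTO tl_geo_countries VALUES (1, 'India'), (2, 'France');
INSERT INTO tl_purpose       VALUES (1, 'Work'), (2, 'Holiday');
INSERT INTO tl_log (country_id, reason_id) VALUES (1, 1), (1, 1);  -- two work days in India
""")

# Starting from the log: only combinations that actually occur are returned.
from_log = conn.execute("""
    SELECT c.name, p.purpose, COUNT(t.reason_id) AS days
    FROM tl_log t
    LEFT JOIN tl_geo_countries c ON t.country_id = c.id
    LEFT JOIN tl_purpose       p ON t.reason_id  = p.id
    GROUP BY t.reason_id, t.country_id
    ORDER BY days DESC
""").fetchall()

# Starting from every (country, purpose) pair: missing combinations count as 0.
all_pairs = conn.execute("""
    SELECT c.name, p.purpose, COUNT(t.reason_id) AS days
    FROM tl_geo_countries c
    CROSS JOIN tl_purpose p
    LEFT JOIN tl_log t
           ON t.country_id = c.id
          AND t.reason_id  = p.id
    GROUP BY p.id, c.id
    ORDER BY days DESC, c.id, p.id
""").fetchall()

print(from_log)
print(all_pairs)
```

COUNT(t.reason_id) rather than COUNT(*) is what produces the zeros: unmatched pairs still yield one row from the LEFT JOIN, but its tl_log columns are NULL and are not counted.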
LEFT JOIN should be replaced by RIGHT JOIN

Counting results from multiple tables with same column

I have a system where, essentially, users are able to put in 3 different pieces of information: a tip, a comment, and a vote. These pieces of information are saved to 3 different tables. The linking column of each table is the user ID. I want to do a query to determine if the user has any pieces of information at all, of any of the three types. I'm trying to do it in a single query, but it's coming out totally wrong. Here's what I'm working with now:
SELECT DISTINCT
*
FROM tips T
LEFT JOIN comments C ON T.user_id = C.user_id
LEFT JOIN votes V ON T.user_id = V.user_id
WHERE T.user_id = 1
This seems to only be getting the tips, duplicated for as many votes or comments there are, even if the votes or comments weren't made by the specified user_id.
I only need a single number in return, not individual counts of each type. I basically want a sum of the number of tips, comments, and votes saved under that user_id, but I don't want to do three queries.
Anyone have any ideas?
Edit: Actually, I don't even technically need an actual count, I just need to know if there are any rows in any of those three tables with that user_id.
Edit 2: I almost have it with this:
SELECT
COUNT(DISTINCT T.tip_id),
COUNT(DISTINCT C.tip_id),
COUNT(DISTINCT V.tip_id)
FROM tips T
LEFT JOIN comments C ON T.user_id = C.user_id
LEFT JOIN votes V ON T.user_id = V.user_id
WHERE T.user_id = 1
I'm testing with user_id 1 (me). I've made 11 tips, voted 4 times, and made no comments. My return is a row with 3 columns: 11, 0, 4. That's the proper count. However, I tested it with a user that hasn't made any tips or comments, but has voted 3 times, that returned 0 for all counts, it should have returned: 0, 0, 3.
The problem that I'm having seems to be that if the table that I'm using for the WHERE clause doesn't have any rows from that user_id, then I get 0 across the board, even if the other tables DO have rows with that user_id. I could use this query:
SELECT
(SELECT COUNT(*) FROM tips WHERE user_id = 2) +
(SELECT COUNT(*) FROM comments WHERE user_id = 2) +
(SELECT COUNT(*) FROM votes WHERE user_id = 2) AS total
But I really wanted to avoid running multiple queries, even if they're subqueries like this.
UPDATE
Thanks to ace, I figured this out:
SELECT
(COUNT(DISTINCT T.tip_id) + COUNT(DISTINCT C.tip_id) + COUNT(DISTINCT V.tip_id)) AS total
FROM users U
LEFT JOIN tips T ON U.user_id = T.user_id
LEFT JOIN votes V ON U.user_id = V.user_id
LEFT JOIN comments C ON U.user_id = C.user_id
WHERE U.user_id = 4
The users table contains the actual information about the user including, obviously, the user id. I used the users table as the parent, since I could be 100% sure that the user would be present in that table, even if they weren't in the other tables. I got the proper count that I wanted with this query!
As I understand your question, you want to count the total comments + tips + votes for each user. It is not entirely clear to me, but take a look at the query below. I added columns for details; this is a cross-tab query, as someone taught me.
EDITED QUERY:
SELECT
COALESCE(t2.tips,0) + COALESCE(c2.comments,0) + COALESCE(v2.votes,0) AS `Totals`
FROM parent p
LEFT JOIN (SELECT t.user_id, COUNT(t.tip_id) AS tips FROM tips t GROUP BY t.user_id) t2
ON p.user_id = t2.user_id
LEFT JOIN (SELECT c.user_id, COUNT(c.tip_id) AS comments FROM comments c GROUP BY c.user_id) c2
ON p.user_id = c2.user_id
LEFT JOIN (SELECT v.user_id, COUNT(v.tip_id) AS votes FROM votes v GROUP BY v.user_id) v2
ON p.user_id = v2.user_id
WHERE p.user_id = 1;
Note: This uses a parent table so that a user who has no rows in one of the other tables still produces a result.
I use a sub-query in each JOIN to create a virtual table that holds the count of rows per user for that table. I also had problems with DISTINCT when using your query, so I ended up with this one.
I know you prefer not to use sub-queries, but I could not make it work without them. For now this is all I can offer.
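The behaviour described in the question (all zeros when the table in the WHERE clause has no rows for the user) can be reproduced in a few lines. The sketch below uses Python's sqlite3 module with invented rows and a simplified schema in which each table has its own id column, which is an assumption. It also shows an EXISTS-based check, which is not from the thread but matches the asker's remark that only a yes/no answer is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tips     (id INTEGER PRIMARY KEY, user_id INTEGER);
CREATE TABLE comments (id INTEGER PRIMARY KEY, user_id INTEGER);
CREATE TABLE votes    (id INTEGER PRIMARY KEY, user_id INTEGER);

INSERT INTO tips  (user_id) VALUES (1), (1);            -- user 1: two tips
INSERT INTO votes (user_id) VALUES (1), (2), (2), (2);  -- user 2: votes only
""")

# Anchored on tips: a user without tips has no rows to join from.
anchored_on_tips = conn.execute("""
    SELECT COUNT(DISTINCT T.id), COUNT(DISTINCT C.id), COUNT(DISTINCT V.id)
    FROM tips T
    LEFT JOIN comments C ON T.user_id = C.user_id
    LEFT JOIN votes    V ON T.user_id = V.user_id
    WHERE T.user_id = 2
""").fetchone()

def total(user_id):
    """Sum of three independent counts; still a single statement."""
    return conn.execute("""
        SELECT (SELECT COUNT(*) FROM tips     WHERE user_id = :uid)
             + (SELECT COUNT(*) FROM comments WHERE user_id = :uid)
             + (SELECT COUNT(*) FROM votes    WHERE user_id = :uid)
    """, {"uid": user_id}).fetchone()[0]

def has_any(user_id):
    """Yes/no variant: each EXISTS can stop at the first matching row."""
    row = conn.execute("""
        SELECT EXISTS (SELECT 1 FROM tips     WHERE user_id = :uid)
            OR EXISTS (SELECT 1 FROM comments WHERE user_id = :uid)
            OR EXISTS (SELECT 1 FROM votes    WHERE user_id = :uid)
    """, {"uid": user_id}).fetchone()
    return bool(row[0])

print(anchored_on_tips, total(2), has_any(2), has_any(99))
```

The scalar subqueries are sent to the server as one statement, so they cost one round trip, and none of them can multiply the rows of another.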

MySQL group_concat with 2 joins returns unwanted results

When executing this query I expect to get 2 mobile numbers and 1 category; instead I get 2 categories. What am I doing wrong?
I guess it has to do with the way I am joining things.
A user can have multiple IMEIs,
and categoryjoin links a user to multiple categories.
SELECT
u.*,
group_concat(i.mobilenumber) as mobilenumbers,
group_concat(c.name) as categories
FROM
users AS u
INNER JOIN
categoryjoin AS cj
ON
u.uid = cj.user_id
INNER JOIN
categories AS c
ON
cj.category_id = c.uid
INNER JOIN
imei AS i
ON
u.uid = i.user_id
GROUP BY
u.uid
Thanks in advance for your help!
If a user matches one category, but matches 2 rows in imei, then the category will be duplicated in the result set. You can get rid of redundant values from group_concat using DISTINCT:
SELECT
u.*,
group_concat(distinct i.mobilenumber) as mobilenumbers,
group_concat(distinct c.name) as categories
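
The effect of DISTINCT inside group_concat can be checked with a small sketch. It uses Python's sqlite3 module, whose group_concat also accepts DISTINCT in its one-argument form; the single user, one category, and two IMEI rows are made up, and the FROM / JOIN part is assumed to match the question's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users        (uid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE categories   (uid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE categoryjoin (user_id INTEGER, category_id INTEGER);
CREATE TABLE imei         (user_id INTEGER, mobilenumber TEXT);

INSERT INTO users        VALUES (1, 'ann');
INSERT INTO categories   VALUES (1, 'news');
INSERT INTO categoryjoin VALUES (1, 1);                   -- one category
INSERT INTO imei         VALUES (1, '0611'), (1, '0622'); -- two numbers
""")

# {d} is replaced with either nothing or the DISTINCT keyword.
QUERY = """
    SELECT u.uid,
           group_concat({d} i.mobilenumber) AS mobilenumbers,
           group_concat({d} c.name)         AS categories
    FROM users AS u
    INNER JOIN categoryjoin AS cj ON u.uid = cj.user_id
    INNER JOIN categories   AS c  ON cj.category_id = c.uid
    INNER JOIN imei         AS i  ON u.uid = i.user_id
    GROUP BY u.uid
"""

plain = conn.execute(QUERY.format(d="")).fetchone()
dedup = conn.execute(QUERY.format(d="DISTINCT")).fetchone()

print(plain)  # the category appears once per matching imei row
print(dedup)  # each value appears once
```

Note that DISTINCT also removes genuinely repeated values, so two IMEI rows sharing the same mobile number would be shown as one.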