I need some help with MySQL.
Let me show you. I have two tables: nodes (250 000 rows):
id | name
And table groups(~400 000 rows):
id | group_id | node_id - index(node_id) and maybe index(group_id)
Problem query:
select count(*) from nodes n inner join groups g on g.node_id = n.id \
where g.group_id in (1,20, 30...);
Execution time: 0.50 sec, which is a problem for me.
How can I optimize the query to count the rows and then run the select?
Where can I put an index or a new field to speed this up?
Please provide `SHOW CREATE TABLE` for each table.
Don't bother with an id for a many-to-many table. Do be sure to have indexes going both ways in such a table.
Why does your Problem Query need to touch nodes at all? This will count the number of nodes in each group, won't it?
SELECT group_id,
COUNT(*)
FROM groups
WHERE group_id IN (...)
GROUP BY group_id;
groups: PRIMARY KEY(group_id, node_id), INDEX(node_id, group_id)
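The index suggestion above could be written as DDL roughly as follows. This is only a sketch: the column types are assumptions (the question does not give them), the surrogate id is dropped as the answer recommends, and the IN list uses example values.

```sql
-- Sketch only: column types are assumed, not taken from the question.
-- `groups` is backtick-quoted because GROUPS is a reserved word in MySQL 8.0.
CREATE TABLE `groups` (
    group_id INT UNSIGNED NOT NULL,
    node_id  INT UNSIGNED NOT NULL,
    PRIMARY KEY (group_id, node_id),   -- serves WHERE group_id IN (...)
    INDEX (node_id, group_id)          -- serves lookups that start from a node
) ENGINE=InnoDB;

-- Per-group counts read only the primary key; nodes is not touched.
SELECT group_id, COUNT(*)
FROM `groups`
WHERE group_id IN (1, 20, 30)
GROUP BY group_id;
```

With the pair (group_id, node_id) as the primary key, each node can appear in a group at most once, so the count per group is also the number of distinct nodes in it.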
If that is not what you are looking for, please describe the query.
I have a table 'A' with a status column that can have 4 values. Table A holds table 'B's id, and table B holds table 'C's id. I want to get the status count from table 'A' by joining all these tables. The status column in table A is a foreign key to table 'D'. Table 'D' holds statuses such as 1-agreed, 2-not agreed, etc.
The question is missing some information that might be helpful, in particular what exactly you want to count (i.e. are you just trying to count ALL rows, or are you trying to count the number of rows in table A that have each status?). I'll put together an answer that assumes the latter.
I'll also assume that "id" is the primary key of each table, and that a column such as Bid in table A holds the id of the referenced row in table B.
select A.statusField, count(*)
from A
join B on (A.Bid = B.id)
join C on (B.Cid = C.id)
group by A.statusField
Hope that helps.
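If the status labels from table D are wanted in the result as well, one possible variation is to start from D and LEFT JOIN to A, so that statuses with no rows still appear with a count of 0. The schema below is a hypothetical minimal one; every name other than A, D and statusField is an assumption.

```sql
-- Hypothetical minimal schema for illustration only.
CREATE TABLE D (
    id   INT PRIMARY KEY,
    name VARCHAR(50) NOT NULL
);
CREATE TABLE A (
    id          INT PRIMARY KEY,
    statusField INT NOT NULL,
    FOREIGN KEY (statusField) REFERENCES D (id)
);

INSERT INTO D VALUES (1, 'agreed'), (2, 'not agreed');
INSERT INTO A VALUES (10, 1), (11, 1);

-- COUNT(a.id) skips the NULLs that the LEFT JOIN produces for unused statuses,
-- so 'not agreed' is reported with 0 rather than 1.
SELECT d.id, d.name, COUNT(a.id) AS status_count
FROM D AS d
LEFT JOIN A AS a ON a.statusField = d.id
GROUP BY d.id, d.name
ORDER BY d.id;
```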
We want to select customers based on the following parameters, i.e. the customer should:
be in a specific city, i.e. cityId=1,2,3...
not be one of the excluded customers, i.e. customerId=33,2323,34534...
have a specific age, i.e. 5 years, 7 years, 72 years...
These inclusion and exclusion lists can be arbitrarily long.
How should we design database for this:
Create a separate table 'customerInclusionCities' for the included cities and query like this:
select * from customers where cityId in (select cityId from customerInclusionCities)
We do the same for age: create a table 'customerEligibleAge' with one entry per eligible age:
i.e. select * from customers where age in (select age from customerEligibleAge)
and Create separate table 'customerIdToBeExcluded' for excluding customers:
i.e. select * from customers where customerId not in (select customerId from customerIdToBeExcluded)
OR
Create One table with Category and Ids.
i.e. Category1 for cities, Category2 for CustomerIds to be excluded.
Which approach is better, creating one table for these parameters OR creating separate tables for each list i.e. age, customerId, city?
IN ( SELECT ... ) can be very slow. Do your query as a single SELECT without subqueries. I assume all 3 columns are in the same table? (If not, that adds complexity.) The WHERE clause will probably have 3 IN ( constants ) clauses:
SELECT ...
FROM tbl
WHERE cityId IN (1,2,3...)
AND customerId NOT IN (33,2323,34534...)
AND age IN (5, 7, 72)
Have (at least):
INDEX(cityId),
INDEX(age)
(Negated things are unlikely to be able to use an index.)
The query will use one of the indexes; having both will give the Optimizer a choice of which it thinks is better.
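To see which of the two indexes the Optimizer actually picks, EXPLAIN can be run on the query. A minimal sketch, assuming a table named tbl (the placeholder used above) with assumed column types and hypothetical index names:

```sql
-- Assumed schema; only the column names come from the question.
CREATE TABLE tbl (
    customerId INT NOT NULL PRIMARY KEY,
    cityId     INT NOT NULL,
    age        TINYINT UNSIGNED NOT NULL,
    INDEX idx_city (cityId),   -- hypothetical index name
    INDEX idx_age  (age)       -- hypothetical index name
);

-- The `key` column of the EXPLAIN output names the index that was chosen;
-- `possible_keys` lists the candidates that were considered.
EXPLAIN
SELECT *
FROM tbl
WHERE cityId IN (1, 2, 3)
  AND customerId NOT IN (33, 2323, 34534)
  AND age IN (5, 7, 72);
```

Which index wins depends on the data distribution, so it is worth checking against real data rather than an empty table.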
Or...
SELECT c.*
FROM customers AS c
JOIN cityEligible AS b ON b.city = c.city
JOIN customerEligibleAge AS ce ON c.age = ce.age
LEFT JOIN customerIdToBeExcluded AS ex ON c.customerId = ex.customerId
WHERE ex.customerId IS NULL
Suggested indexes (probably as PRIMARY KEY):
customers: (city)
customerEligibleAge: (age)
customerIdToBeExcluded: (customerId)
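The suggested indexes could be declared roughly like this. Column types are assumptions, and the customers table is assumed to exist already with a city column, as in the JOIN query above.

```sql
-- Each list table holds one value per row, so the value itself can be the PRIMARY KEY.
CREATE TABLE cityEligible (
    city INT NOT NULL PRIMARY KEY
);
CREATE TABLE customerEligibleAge (
    age TINYINT UNSIGNED NOT NULL PRIMARY KEY
);
CREATE TABLE customerIdToBeExcluded (
    customerId INT NOT NULL PRIMARY KEY
);

-- Secondary index on the existing customers table (assumed to have a city column).
ALTER TABLE customers ADD INDEX (city);
```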
In order to discuss further, please provide SHOW CREATE TABLE for each table and EXPLAIN SELECT ... for any of the queries that actually work.
If you use the database only for that operation, I recommend the first solution. The first solution is also very simple to deploy.
The second solution fills the DB with junk.
I am running a query on three tables messages, message_recipients and users.
Table structure of messages table:
id int pk
message_id int
message text
user_id int
...
Index for this table is on user_id, message_id and id.
Table structure of message_recipients table:
id int pk
message_id int
read_date datetime
user_id int
...
Index is on id, message_id and user_id.
Table structure of users table:
id int pk
display_name varchar
...
Index is on id.
I am running the following query against these tables:
SELECT
m.*,
if(m.user_id = 0, 'Campus Manager', u.display_name) AS name,
mr.read_date,
IF(m1.message_id > 0 and m1.user_id=1, true, false) as replied
FROM
messages m
JOIN
message_recipients mr
ON
mr.message_id = m.id
LEFT JOIN
users u
ON
u.UID = m.user_id
LEFT JOIN
messages m1
ON
m1.message_id = m.id
WHERE
mr.user_id = 1
AND
m.published = 1
GROUP BY
mr.message_id
ORDER BY
m.created DESC
EXPLAIN returns the following data for this query:
UPDATE
As suggested by @e4c5, I added a new composite index on (published, user_id, created), and now the EXPLAIN output shows this:
How can this query be optimized by adding the required indexes (if any)? It is taking a lot of time.
GROUP BY needs to list all the non-aggregated columns. I suspect that would be a mess. Why do you need GROUP BY at all?
Why are you linking messages.id to messages.message_id? Is this a hierarchical table, but the column names aren't like 'parent_id'?
"Index is on id, message_id and user_id" -- is that one composite index or 3 single-column indexes? (It makes a big difference.) It would be better to show us SHOW CREATE TABLE instead of ambiguously paraphrasing.
Is user_id=1 prolific? That is, are you expecting thousands of rows? Is this query only a problem for him?
Using LEFT JOIN implies that m1.message_id could be NULL, yet the reference to it seems to ignore that possibility.
If this is a single table that contains a message thread, with both the main info about the thread and the individual responses, then I suggest it is a bad design. (I made this mistake once upon a time.) I think it is better to have a table with one row per thread and another table with one row per comment. 1 thread : many comments. So there would be a thread_id in the comment table.
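The thread/comment split described above might look something like this. All table and column names here are hypothetical; the original schema is only partially shown in the question.

```sql
-- One row per thread (hypothetical names and types).
CREATE TABLE threads (
    id        INT UNSIGNED NOT NULL AUTO_INCREMENT,
    user_id   INT NOT NULL,                 -- author of the thread
    published TINYINT(1) NOT NULL DEFAULT 0,
    created   DATETIME NOT NULL,
    PRIMARY KEY (id)
);

-- One row per comment; thread_id points back at the owning thread.
CREATE TABLE comments (
    id        INT UNSIGNED NOT NULL AUTO_INCREMENT,
    thread_id INT UNSIGNED NOT NULL,
    user_id   INT NOT NULL,
    message   TEXT NOT NULL,
    created   DATETIME NOT NULL,
    PRIMARY KEY (id),
    INDEX (thread_id, user_id)              -- supports "has user X replied to thread Y?"
);

-- "Has user 1 replied to thread 42?" without a self-join or GROUP BY.
SELECT EXISTS (
    SELECT 1 FROM comments WHERE thread_id = 42 AND user_id = 1
) AS replied;
```

With this layout the "replied" flag becomes an EXISTS check against comments, which returns 0 or 1 and never NULL.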
I was able to bring down the query time from 3 seconds to 0.1 second by adding a new index to messages and message_recipients table and changing the database engine of messages table to MyISAM from InnoDB.
Composite index composite added on the messages table, on these columns in this order - published, user_id, created
Composite index message_id_2 added on two columns on message_recipients table - message_id, user_id
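Expressed as statements, the changes described here would be roughly the following. This assumes the first index is literally named composite, which is how the description reads, and that the tables already exist as described in the question.

```sql
-- Column order matters: published and user_id are filtered on, created is sorted on.
ALTER TABLE messages
    ADD INDEX composite (published, user_id, created);

ALTER TABLE message_recipients
    ADD INDEX message_id_2 (message_id, user_id);

-- Engine switch mentioned above. MyISAM has no transactions or row-level locking,
-- so it is worth measuring the index changes alone before also changing the engine.
ALTER TABLE messages ENGINE = MyISAM;
```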
EXPLAIN Query now shows
I have two tables. One is a category table, the other is a product table. I would like to count how many products each category ID has; the query is below.
SELECT hkgg_emall_goods_class.gc_id, COUNT(*) as productcount
FROM hkgg_emall_goods_class
LEFT JOIN hkgg_emall_goods
ON hkgg_emall_goods.gc_id=hkgg_emall_goods_class.gc_id GROUP BY hkgg_emall_goods_class.gc_id ;
It shows what I want, except that some rows have a count of 1 even though they have no products associated, while other rows show a count of 1 when they really do have one product associated.
I want your advice on
1) how to solve this problem
2) I have added the gc_productcount column in the category table. How can I insert the count query into the gc_productcount column for every row?
INSERT INTO `hkgg_emall_goods_class.gc_productcount`
This does not work when I put it in front of the select count query.
P.S. I have browsed other threads on Stack Overflow, but had no luck finding a similar solution.
Thank you in advance.
Assuming hkgg_emall_goods table has a primary or at least a unique key, that's what you want to count. i.e. you don't want to COUNT(*), you want to COUNT(hkgg_emall_goods.id).
So assuming that primary key is hkgg_emall_goods.id then your query will look like this:
SELECT
hgc.gc_id,
COUNT(hg.id) AS productcount
FROM hkgg_emall_goods_class hgc
LEFT JOIN hkgg_emall_goods hg ON hg.gc_id = hgc.gc_id
GROUP BY
hgc.gc_id
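The reason COUNT(*) reports 1 for an empty category is that the LEFT JOIN still produces one row for it, with NULLs in the product columns; COUNT(*) counts rows, while COUNT(hg.id) skips NULLs. The sketch below shows the difference on a tiny assumed schema and then one possible way to fill gc_productcount, using UPDATE rather than INSERT since the category rows already exist. Only gc_id and gc_productcount come from the question; the column types and the id key are assumptions.

```sql
-- Assumed minimal schema for illustration.
CREATE TABLE hkgg_emall_goods_class (
    gc_id           INT PRIMARY KEY,
    gc_productcount INT NOT NULL DEFAULT 0
);
CREATE TABLE hkgg_emall_goods (
    id    INT PRIMARY KEY,
    gc_id INT NOT NULL,
    INDEX (gc_id)
);

INSERT INTO hkgg_emall_goods_class (gc_id) VALUES (1), (2);
INSERT INTO hkgg_emall_goods (id, gc_id) VALUES (100, 1);

-- Category 2 has no products: count_star is 1 for it, count_id is 0.
SELECT hgc.gc_id,
       COUNT(*)     AS count_star,
       COUNT(hg.id) AS count_id
FROM hkgg_emall_goods_class hgc
LEFT JOIN hkgg_emall_goods hg ON hg.gc_id = hgc.gc_id
GROUP BY hgc.gc_id;

-- Store the per-category count in the existing rows.
-- A correlated subquery yields 0 for categories without products.
UPDATE hkgg_emall_goods_class AS hgc
SET hgc.gc_productcount = (
    SELECT COUNT(*)
    FROM hkgg_emall_goods AS hg
    WHERE hg.gc_id = hgc.gc_id
);
```

A stored count like this goes stale as products are added or removed, so the UPDATE would need to be rerun (or the column maintained by the application) to stay accurate.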
So I have the following:
A lookup table that has two columns and looks like this for example:
userid moduleid
4 4
I also have a users table with a primary key userid, which the lookup table references. The users table has a few users, let's say, and looks like this:
userid
1
2
3
4
In this example, the user with ID 4 is matched with module ID 4. The others are not matched to any moduleid.
I need a query that gets me data from the users table WHERE the moduleid is not 4. In my application, I know the module but I don't know the user. So the query should return the other userids apart from 4, because 4 is already matched with module ID 4.
Is this possible to do?
I think I understand your question correctly. You can use a sub-query to cross-check the data between both tables using the NOT IN operator.
The following will select all userid records from the user_tbl table that are not matched with moduleid 4 in the lookup_tbl table:
SELECT `userid`
FROM `user_tbl`
WHERE `userid` NOT IN (
SELECT DISTINCT(`userid`) FROM `lookup_tbl` WHERE moduleid = 4
)
There are several ways to do this. One pretty intuitive way (in my opinion) is to use a NOT IN predicate to exclude the users with moduleid 4 in the lookup table:
SELECT * FROM Users WHERE UserID NOT IN (SELECT UserID FROM Lookup WHERE ModuleID = 4)
There are other ways, with possibly better performance (using a correlated not exists query or a join for instance).
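As a sketch of the correlated NOT EXISTS variant mentioned above, using the same Users and Lookup names and the sample data from the question (column types are assumed):

```sql
-- Sample data from the question; types are assumptions.
CREATE TABLE Users (
    UserID INT PRIMARY KEY
);
CREATE TABLE Lookup (
    UserID   INT NOT NULL,
    ModuleID INT NOT NULL,
    PRIMARY KEY (UserID, ModuleID)
);

INSERT INTO Users VALUES (1), (2), (3), (4);
INSERT INTO Lookup VALUES (4, 4);

-- Keep each user for whom no Lookup row with ModuleID = 4 exists.
-- Returns users 1, 2 and 3.
SELECT u.UserID
FROM Users AS u
WHERE NOT EXISTS (
    SELECT 1
    FROM Lookup AS l
    WHERE l.UserID = u.UserID
      AND l.ModuleID = 4
)
ORDER BY u.UserID;
```

One practical difference from NOT IN: if the subquery of a NOT IN can return a NULL, the predicate is never true and no rows come back, whereas NOT EXISTS is unaffected by NULLs.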
One other option is to use a LEFT JOIN so that you can get the values from both tables, even when there is not a match. Then, pick the rows where there is no userid value from the lookup table.
SELECT u.userid
FROM usersTable u
LEFT JOIN lookupTable lt ON u.userid = lt.userid
WHERE lt.userid IS NULL
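Note that the LEFT JOIN above returns users that have no lookup row at all. If a user may be matched to a different module, the module filter can go into the ON clause so that only matches with module 4 are excluded. A sketch using the same table names, with assumed column types and one extra sample row added for illustration:

```sql
-- Same names as the query above; types are assumptions.
CREATE TABLE usersTable (
    userid INT PRIMARY KEY
);
CREATE TABLE lookupTable (
    userid   INT NOT NULL,
    moduleid INT NOT NULL,
    PRIMARY KEY (userid, moduleid)
);

INSERT INTO usersTable VALUES (1), (2), (3), (4);
-- (2, 7) is an extra illustrative row: user 2 is matched to some other module.
INSERT INTO lookupTable VALUES (4, 4), (2, 7);

-- The moduleid condition belongs in ON, not WHERE: in WHERE it would discard
-- the NULL-extended rows that the IS NULL test relies on.
-- Returns users 1, 2 and 3; user 2 is kept despite its match with module 7.
SELECT u.userid
FROM usersTable u
LEFT JOIN lookupTable lt
       ON lt.userid = u.userid
      AND lt.moduleid = 4
WHERE lt.userid IS NULL
ORDER BY u.userid;
```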
Are you looking for a query like this?
select userid from yourtablename where moduleid<>4