Could this MySQL query be optimized? - mysql

Hello, I have a comments table on which I run a fulltext search.
c1 and c2 are aliases of the same table.
With the criterion c1.parent_id=0 I get only the questions (not the answers attached to them),
and with c2.parent_id<>0 I keep only the questions that already have answers.
SELECT DISTINCT c1.comment, c1.comment_id, MATCH(c1.comment) AGAINST ('keyword1 keyword2 keyword3') AS score
FROM comments AS c1
JOIN comments AS c2
ON c1.comment_id = c2.parent_id
WHERE c1.parent_id=0
and c2.parent_id <> 0
ORDER BY score DESC LIMIT 9
The problem is that when I run EXPLAIN SELECT..., the search scans every row of the table instead of only the rows with parent_id=0, so the bigger the table gets, the slower this operation will be.
I would like to ask: is it possible to optimize this kind of query any further?

Add an index on each of the id columns:
ALTER TABLE your_table ADD INDEX (parent_id);
Do the same for comment_id.

Related

select * from table1, table2 where table1.id = 1 showing values with other id

I just can't see the problem with how I'm making my foreign keys and I'm just really confused why I keep getting the wrong result. Here are screenshots from my workbench
Here are my tables:
And here's my diagram
I've also tried to normalize my tables and I was kinda expecting my query to return a similar result like in the sample table (Questions table) where it will only show 2 results since I want to query where idsurvey = 1 I made in this image:
My question is: how do I fix my foreign key so that if I query
select * from survey.survey, survey.questions where idsurvey = 1
it will only return 2 rows? (based on sample data in the workbench screenshot)
Any comments and suggestions on my diagram would also be greatly appreciated.
When you have two tables in the FROM clause, every row from the first table is matched with every row from the second table. This is known as a Cartesian product. Usually this isn't the behavior you want (as in this case), so you use a condition to tell the database how to match the two tables:
SELECT *
FROM survey.survey s, survey.questions q
WHERE s.idsurvey = q.survey_id AND s.idsurvey = 1
While this should work, listing multiple tables in the same FROM clause is an outdated style. You should probably use an explicit JOIN clause instead:
SELECT *
FROM survey.survey s
JOIN survey.questions q ON s.idsurvey = q.survey_id
WHERE s.idsurvey = 1
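The difference between the two forms can be checked on a toy dataset. The following is a minimal sketch, assuming an in-memory SQLite database as a stand-in for MySQL; the idquestion column and the sample rows are invented for illustration, and only idsurvey and survey_id come from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE survey (idsurvey INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE questions (
        idquestion INTEGER PRIMARY KEY,
        survey_id  INTEGER REFERENCES survey(idsurvey),
        question   TEXT
    );
    INSERT INTO survey VALUES (1, 'Survey A'), (2, 'Survey B');
    INSERT INTO questions VALUES
        (1, 1, 'Q1'), (2, 1, 'Q2'), (3, 2, 'Q3'), (4, 2, 'Q4');
""")

# No join condition: survey 1 is paired with every question (Cartesian product).
cross_rows = conn.execute(
    "SELECT * FROM survey s, questions q WHERE s.idsurvey = 1"
).fetchall()

# Explicit join: only questions whose survey_id matches are returned.
joined_rows = conn.execute(
    "SELECT * FROM survey s "
    "JOIN questions q ON s.idsurvey = q.survey_id "
    "WHERE s.idsurvey = 1"
).fetchall()

print(len(cross_rows), len(joined_rows))
```

With this sample data the first query returns one row per question in the whole table, while the joined query returns only the two questions that belong to survey 1.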

Getting quiz data, questions and answers in 1 query?

I need to get the quiz title, the quiz description, the quiz questions and the answers for each question. My table structure is:
quizes
quize_id | title | user_id | ...
questions
questions_id | quize_id | question | ...
question_answers
answer_id | question_id | user_id | answer | ...
I can use join
SELECT * FROM quizes JOIN questions q ON q.quize_id=quizes.quize_id JOIN question_answers a ON a.question_id=q.question_id
But the problem with this is that the results will contain many rows with redundant data. For example, each row will carry the fields title, user_id, ... Another way is to run an extra query for each question to get its answers. Is there any better way? Should I use only 1 query or more?
Your tables hold 3 types of data. If you use the query you've got, you'll get all the data as a big table. You've said that this involves a lot of duplication.
If you use multiple queries, you will get multiple result sets, which effectively will leave you with multiple tables, and thus this is unlikely to help.
You could cut the query down to just the columns you want to get the data for:
SELECT qq.Question, qa.Answer
FROM quizes qz
join questions qq on qz.quize_id = qq.quize_id
join question_answers qa on qq.question_id = qa.question_id
WHERE qz.quize_id = @quize_id
ORDER BY 1, 2 -- or other ordering
However, where there are multiple answers for the same question, the question will be repeated on every row. There isn't much you can do about that; it is the price of combining multiple tables' data into one table ("denormalising").
If you need to format your output table so that it looks like this (but with more columns):
Quize_id | Question | Answer
1        | Q1       | A1
         |          | A2
         | Q2       | A3
2        | Q3       | A4
This is a whole different matter. You would need to use the query you've got to populate a temporary table, ordering the data by the sort order you want displayed. To this table you'd need to add a primary key (integer) column, then run a set of update statements to replace the repeated values with nulls, then output the table in the order of the primary key column. (There are other ways to do this, but this is the easiest to explain)
Does this help?
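One of the "other ways" is to blank the repeated values in application code after fetching the ordered rows, rather than in a temporary table. The following is a minimal sketch in Python, assuming the rows arrive as (quize_id, question, answer) tuples already sorted by the query's ORDER BY; the function name blank_repeats and the sample rows are hypothetical.

```python
def blank_repeats(rows):
    """Replace values repeated from the previous row with None.

    rows must already be sorted by (quize_id, question). A question is
    blanked only when the quiz is unchanged as well.
    """
    out = []
    prev_quiz = prev_question = object()  # sentinel that equals nothing
    for quiz_id, question, answer in rows:
        same_quiz = quiz_id == prev_quiz
        same_question = same_quiz and question == prev_question
        out.append((
            None if same_quiz else quiz_id,
            None if same_question else question,
            answer,
        ))
        prev_quiz, prev_question = quiz_id, question
    return out


# Hypothetical rows in the shape the joined query would return.
rows = [(1, "Q1", "A1"), (1, "Q1", "A2"), (1, "Q2", "A3"), (2, "Q3", "A4")]
for row in blank_repeats(rows):
    print(row)
```

This keeps the SQL simple and leaves presentation to the layer that renders the output, at the cost of still transferring the repeated values from the database.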
I also found another way which returns all the data I need, including user details for each question:
SELECT
question,
group_concat(qa.answer SEPARATOR ',') as answers,
group_concat(qa.user_id SEPARATOR ',') as userIds,
group_concat(up.nickname SEPARATOR ',') as nickname
FROM quize_questions qq
INNER JOIN question_answers qa ON qa.question_id=qq.question_id
INNER JOIN user_profile up ON up.user_id = qa.user_Id
GROUP BY qq.question_id
I am just not sure if this is the right way. I am worried about speed.

MySQL JOIN tables with WHERE clause

I need to gather posts from two mysql tables that have different columns and provide a WHERE clause to each set of tables. I appreciate the help, thanks in advance.
This is what I have tried...
SELECT
blabbing.id,
blabbing.mem_id,
blabbing.the_blab,
blabbing.blab_date,
blabbing.blab_type,
blabbing.device,
blabbing.fromid,
team_blabbing.team_id
FROM
blabbing
LEFT OUTER JOIN
team_blabbing
ON team_blabbing.id = blabbing.id
WHERE
team_id IN ($team_array) ||
mem_id='$id' ||
fromid='$logOptions_id'
ORDER BY
blab_date DESC
LIMIT 20
I know that this is messy, but I'll admit, I am no MySQL veteran. I'm a beginner at best... Any suggestions?
You could put the where-clauses in subqueries:
select
*
from
(select * from ... where ...) as alias1 -- this is a subquery
left outer join
(select * from ... where ...) as alias2 -- this is also a subquery
on
....
order by
....
Note that older MySQL versions (before 5.7.7) don't allow subqueries like this in the FROM clause of a view definition.
You could also combine the where-clauses, as in your example. Use table aliases to distinguish between columns of different tables (it's a good idea to use aliases even when you don't have to, just because it makes things easier to read). Example:
select
*
from
<table> as alias1
left outer join
<othertable> as alias2
on
....
where
alias1.id = ... and alias2.id = ... -- aliases distinguish between ids!!
order by
....
Two suggestions, since you are relatively new to SQL. Use aliases for your tables to cut down on SuperLongTableNameReferencesForColumns, and always qualify the column names in a query. That makes life easier for you and for anyone after you, who can see which column comes from which table, especially when the same column name exists in different tables, and it prevents ambiguity in the query. Your left join, judging from the sample, may be wrong: can you confirm that B.ID should really be joined to TB.ID? Typically a "Team_ID" would appear once in a teams table, and each blabbing entry would carry the "Team_ID" the posting came from, in addition to its own "ID" as the blabbing table's unique key.
SELECT
B.id,
B.mem_id,
B.the_blab,
B.blab_date,
B.blab_type,
B.device,
B.fromid,
TB.team_id
FROM
blabbing B
LEFT JOIN team_blabbing TB
ON B.ID = TB.ID
WHERE
TB.Team_ID IN ( you can't do a direct $team_array here )
OR B.mem_id = SomeParameter
OR B.fromid = AnotherParameter
ORDER BY
B.blab_date DESC
LIMIT 20
Where you were trying the $team_array, you would have to build out the full list as expected, such as
TB.Team_ID IN ( 1, 4, 18, 23, 58 )
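Also, use the SQL keyword OR rather than "||". MySQL accepts || as OR by default, but it is non-standard and means string concatenation when the PIPES_AS_CONCAT SQL mode is enabled.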
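Building out the list can be done with one placeholder per value, so the IDs are bound as parameters instead of being pasted into the SQL string. The following is a minimal sketch in Python (the question uses PHP, where the same idea applies); the helper name build_in_clause is hypothetical, and "%s" is assumed as the placeholder style because common Python MySQL drivers use it.

```python
def build_in_clause(column, values, placeholder="%s"):
    """Return (sql_fragment, params) for `column IN (...)`.

    placeholder is "%s" for common Python MySQL drivers and "?" for sqlite3.
    An empty list yields an always-false condition, because `IN ()` is a
    syntax error.
    """
    values = list(values)
    if not values:
        return "1 = 0", []
    marks = ", ".join([placeholder] * len(values))
    return f"{column} IN ({marks})", values


team_ids = [1, 4, 18, 23, 58]
fragment, params = build_in_clause("TB.team_id", team_ids)
print(fragment)
print(params)
```

The fragment goes into the WHERE clause and the params list is passed to the driver's execute call alongside the other bound values.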
EDIT -- per your comment
This could be done in a variety of ways: building and executing dynamic SQL, calling the query once per ID and merging the results, or joining to yet another temp table that gets cleaned out, say, daily.
If you have another table such as "TeamJoins" with, say, 3 columns (a date, a sessionid and a team_id), you could purge anything more than a day old once a day, and/or clear the rows for a session ID each time that session (as it appears coming from PHP) runs a new query. Have two indexes: one on the date (to simplify the daily purging) and a second on (sessionID, team_id) for the join.
Then loop through the IDs and insert them into the "TeamJoins" table.
Then, instead of a hard-coded IN list, you could change that part to
...
FROM
blabbing B
LEFT JOIN team_blabbing TB
ON B.ID = TB.ID
LEFT JOIN TeamJoins TJ
on TB.Team_ID = TJ.Team_ID
WHERE
TJ.Team_ID IS NOT NULL
OR B.mem_id ... rest of query
What I ended up doing:
I added an extra column to my blabbing table called team_id and set it to null as well as another field in my team_blabbing table called mem_id
Then I changed the insert script to also insert a value to the mem_id in team_blabbing.
After doing this I did a simple UNION ALL in the query:
SELECT
*
FROM
blabbing
WHERE
mem_id='$id' OR
fromid='$logOptions_id'
UNION ALL
SELECT
*
FROM
team_blabbing
WHERE
team_id
IN
($team_array)
ORDER BY
blab_date DESC
LIMIT 20
I am open to any thoughts on what I did. Try not to be too harsh though :) Thanks again for all the info.

mySQL Query optimisation LEFT JOIN

I need to acquire some data from a questions table and then LEFT JOIN product answers. I need the list of all questions in a particular category (a total of 16 categories out of approximately 200 categories) and then list the product answers next to the questions for a certain product id.
SELECT `questions`.`id`, `questions`.`text`, `questions`.`catalogue_id`, `productanswers`.`answer`
FROM `questions`
LEFT JOIN `productanswers` ON `productanswers`.`question_id` = `questions`.`id`
AND product_id = '2001682'
WHERE `catalogue_id` IN (1234912,1234913,1234914)
ORDER BY `catalogue_id`
which returns as I would expect approximately 17 results. Questions without an answer for this product are filled with Null, great!
The problem is that the query takes approximately 23 seconds to execute :-o making a full query with all catalogue questions impossible.
How can I optimise the query, or do you have any other ideas?
Thanks,
Taff
Add a composite (multiple column) index on the question_id and product_id together:
ALTER TABLE productanswers ADD KEY(question_id, product_id)
Note: You might want to switch the column order in the index, depending on the selectivity of question_id vs product_id
For more on composite indexes see the Docs
In addition, you should have indexes on all your id columns, but I guess you have these. Another source of information is the EXPLAIN statement: http://dev.mysql.com/doc/refman/5.0/en/using-explain.html
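Whether a composite index is actually picked up can be checked from the query plan. The following is a minimal sketch, assuming an in-memory SQLite database and its EXPLAIN QUERY PLAN as a stand-in for MySQL's EXPLAIN; the column types and the index name idx_pa_question_product are invented for illustration, and MySQL's optimizer and output format differ.

```python
import sqlite3

# SQLite stands in for MySQL here; column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE questions (
        id INTEGER PRIMARY KEY, text TEXT, catalogue_id INTEGER
    );
    CREATE TABLE productanswers (
        id INTEGER PRIMARY KEY, question_id INTEGER,
        product_id INTEGER, answer TEXT
    );
    -- Composite index covering both columns used in the join condition.
    CREATE INDEX idx_pa_question_product
        ON productanswers (question_id, product_id);
""")

query = """
    SELECT q.id, q.text, q.catalogue_id, pa.answer
    FROM questions q
    LEFT JOIN productanswers pa
           ON pa.question_id = q.id AND pa.product_id = ?
    WHERE q.catalogue_id IN (1234912, 1234913, 1234914)
    ORDER BY q.catalogue_id
"""

# The last column of each plan row holds the human-readable step.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, (2001682,)).fetchall()
plan_text = "\n".join(row[-1] for row in plan)
print(plan_text)
```

If the index name shows up in the plan for the productanswers step, the join is doing an index lookup per question rather than scanning the whole answers table; in MySQL the equivalent check is the key column of the EXPLAIN output.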

Any way to improve this slow query?

This particular query pops up in my slow query log all the time. Is there any way to improve its efficiency?
SELECT
mov_id,
mov_title,
GROUP_CONCAT(DISTINCT genres.genre_name) as all_genres,
mov_desc,
mov_added,
mov_thumb,
mov_hits,
mov_numvotes,
mov_totalvote,
mov_imdb,
mov_release,
mov_type
FROM movies
LEFT JOIN _genres
ON movies.mov_id = _genres.gen_movieid
LEFT JOIN genres
ON _genres.gen_catid = genres.grenre_id
WHERE mov_status = 1 AND mov_incomplete = 0 AND mov_type = 1
GROUP BY mov_id
ORDER BY mov_added DESC
LIMIT 0, 20;
My main concern is in regard to the group_concat function, which outputs a comma separated list of genres associated with the particular film, which I put through a for loop and make click-able links.
Do you need the genre names? If you can do with just the genre_id, you can eliminate the second join. (You can fill in the genre name later, in the UI, using a cache).
What indexes do you have?
You probably want
create index idx_movies on movies
(mov_added, mov_type, mov_status, mov_incomplete)
and most certainly the join index
create index ind_genres_movies on _genres
(gen_movieid, gen_catid)
Can you post the output of EXPLAIN? i.e. put EXPLAIN in front of the SELECT and post the results.
I've had quite a few wins with using SELECT STRAIGHT_JOIN and ordering the tables according to their size.
STRAIGHT_JOIN stops MySQL from guessing which order to join the tables in and joins them in the order specified, so if you list your smallest table first you can reduce the number of rows being joined.
I'm assuming you have indexes on mov_id, gen_movieid, gen_catid and grenre_id?
The 'Using temporary; Using filesort' comes from the GROUP_CONCAT(DISTINCT genres.genre_name).
Trying to get a DISTINCT on a column without an index will cause a temp table to be used.
Try adding an index on the genre_name column.