mysql view super slow - mysql

This is a query against the Unified Medical Language System (UMLS) to find the words related to a normalized word. Run directly, the query takes 165 ms, but selecting from a VIEW built on the same query takes 70 sec. I'm new to MySQL. Please help me.
Query:
SELECT a.nwd as Normalized_Word,
b.str as String,
c.def as Defination,
d.sty as Semantic_type
FROM mrxnw_eng a, mrconso b, mrdef c, mrsty d
WHERE a.nwd = 'cold'
AND b.sab = 'Msh'
AND a.cui = b.cui
AND a.cui = c.cui
AND a.cui = d.cui
AND a.lui = b.lui
AND b.sui = a.sui
group by a.cui
View definition:
create view nString_Sementic as
SELECT a.nwd as Normalized_Word,
b.str as String,
c.def as Defination,
d.sty as Semantic_type
FROM mrxnw_eng a, mrconso b, mrdef c, mrsty d
WHERE b.sab = 'Msh'
AND a.cui = b.cui
AND a.cui = c.cui
AND a.cui = d.cui
AND a.lui = b.lui
AND b.sui = a.sui
group by a.cui
Selection from view:
select * from nString_Sementic
where Normalized_Word = 'phobia'

You may be able to get better performance by specifying the VIEW ALGORITHM as MERGE. With MERGE, MySQL combines the view definition with the WHERE clause of your outer SELECT and then comes up with an optimized execution plan for the combined query.
To do this, however, you would have to remove the GROUP BY clause from your VIEW. As it is, a temporary table holding the entire view is created first, and only then filtered by your WHERE clause.
If the MERGE algorithm cannot be used, a temporary table must be used
instead. MERGE cannot be used if the view contains any of the
following constructs:
Aggregate functions (SUM(), MIN(), MAX(), COUNT(), and so forth)
DISTINCT
GROUP BY
HAVING
LIMIT
UNION or UNION ALL
Subquery in the select list
Refers only to literal values (in this case, there is no underlying
table)
Here is the link with more info. http://dev.mysql.com/doc/refman/8.0/en/view-algorithms.html
If you can change your view to not include the GROUP BY statement, to specify the view's algorithm the syntax is:
CREATE ALGORITHM = MERGE VIEW...
Edit: This answer was originally based on MySQL 5.0. I've updated the links to point to the current documentation, but I have not otherwise confirmed whether the answer is correct for versions >5.0.

Assuming that mrxnw_eng.nwd is functionally dependent on mrxnw_eng.cui, try changing the group by clause of the view to include a.nwd - like so:
group by a.cui, a.nwd

Related

View created from valid SQL statement returns error "#1242 - Subquery returns more than 1 row"

When I run the below SQL statement, it correctly shows me what I expect:
SELECT Users.idQuiz as ThisIDQuiz,Rounds.RoundNr as ThisRoundNr, Questions.QuestionNr as ThisQuestionNr, Answer as ThisAnswer, Questions.QuestionScore AS QuestionScoreMax,AnswerCorrect,
(SELECT COUNT(*) as "Aantal Ploegen met dit antwoord"
FROM `Answers`
JOIN Questions on Answers.idQuestion = Questions.idQuestion
JOIN Rounds on Questions.idRound = Rounds.idRound
JOIN Users on Users.idUser = Answers.idUser
where (Users.idQuiz = ThisIDQuiz AND Rounds.RoundNr = ThisRoundNr AND Questions.QuestionNr=ThisQuestionNr AND Answers.Answer = ThisAnswer )
GROUP BY Users.idQuiz,Rounds.RoundNr, Questions.QuestionNr,Answer
) as NrOfTeamsWithThisAnswer,
(SELECT COUNT(*)
FROM Users
WHERE ((Users.idQuiz = ThisIDQuiz) AND (Users.UserType = 0))
) As TotalNrOfTeams,
AnswerCorrect *((Select TotalNrOfTeams)- (SELECT NrOfTeamsWithThisAnswer))as ScoreForThisAnswer
FROM `Answers`
JOIN Questions on Answers.idQuestion = Questions.idQuestion
JOIN Rounds on Questions.idRound = Rounds.idRound
JOIN Users on Users.idUser = Answers.idUser
WHERE Questions.QuestionType = 5
GROUP BY ThisAnswer
ORDER BY ThisIDQuiz, ThisRoundNr, ThisQuestionNr, ThisAnswer;
See Results of the query for what the result looks like.
I then create a VIEW from this statement. The view is created fine, but when I open it, I get the error "#1242 - Subquery returns more than 1 row".
I tried dropping and recreating the view, same result.
When I use the exact same SQL statement but without the penultimate line (GROUP BY ThisAnswer), it works fine (i.e. I can create the view and it opens without an error). This second view suits my purposes fine, so I can continue, but just out of curiosity: can someone explain this behaviour?
I use phpMyAdmin version 5.1.3 to do my SQL manipulations.
As @NickW asks, the statement GROUP BY ThisAnswer expects you to have an aggregate function (e.g. COUNT, AVG, MAX, MIN) somewhere in that main SELECT. Having a GROUP BY without an aggregate function will create errors. Either add the aggregate function, or remove the GROUP BY clause from the main (outermost) SELECT.

Optimize derived table in select

I have sql query:
SELECT tsc.Id
FROM TEST.Services tsc,
(
select * from DICT.Change sp
) spc
where tsc.serviceId = spc.service_id
and tsc.PlanId = if(spc.plan_id = -1, tsc.PlanId, spc.plan_id)
and tsc.startDate > GREATEST(spc.StartTime, spc.startDate)
group by tsc.Id;
This query is very, very slow.
Explain:
Can this be optimized? How can I rewrite this subquery as something else?
What is the point of this query? Why the CROSS JOIN operation? Why do we need to return multiple copies of id column from Services table? And what are we doing with the millions of rows being returned?
Absent a specification, an actual set of requirements for the resultset, we're just guessing at it.
To answer your questions:
Yes, the query could be "optimized" by rewriting it to the resultset that is actually required, and do it much more efficiently than the monstrously hideous SQL in the question.
Some suggestions: ditch the old-school comma syntax for the join operation, and use the JOIN keyword instead.
With no join predicates, it's a "cross" join: every row from the left side is matched to every row from the right side. I recommend including the CROSS keyword as an indication to future readers that the absence of an ON clause (or of join predicates in the WHERE clause) is intentional, and not an oversight.
I'd also avoid an inline view, unless there is a specific reason for one.
UPDATE
The query in the question is updated to include some predicates. Based on the updated query, I would write it like this:
SELECT tsc.id
FROM TEST.Services tsc
JOIN DICT.Change spc
ON tsc.serviceid = spc.service_id
AND tsc.startdate > spc.starttime
AND tsc.startdate > spc.startdate
AND ( tsc.planid = spc.plan_id
OR ( tsc.planid IS NOT NULL AND spc.plan_id = -1 )
)
Ensure that the query is making use of suitable index by looking at the output of EXPLAIN to see the execution plan, in particular, which indexes are being used.
Some notes:
If there are multiple rows from spc that "match" a row from tsc, the query will return duplicate values of tsc.id. (It's not clear why or if we need to return duplicate values. If we need to count the number of copies of each tsc.id, we could do that in the query, returning distinct values of tsc.id along with a count. If we don't need duplicates, we could return just a distinct list.)
GREATEST function will return NULL if any of the arguments are null. If the condition we need is "a > GREATEST(b,c)", we can specify "a > b AND a > c".
Also, this condition:
tsc.PlanId = if(spc.plan_id = -1, tsc.PlanId, spc.plan_id)
can be re-written to return an equivalent result (I'm suspicious about the actual specification, and whether this original condition actually satisfies that adequately. Without example data and sample of expected output, we have to rely on the SQL as the specification, so we honor that in the rewrite.)
If we don't need to return duplicate values of tsc.id, assuming id is unique in TEST.Services, we could also write
SELECT tsc.id
FROM TEST.Services tsc
WHERE EXISTS
( SELECT 1
FROM DICT.Change spc
WHERE spc.service_id = tsc.serviceid
AND spc.starttime < tsc.startdate
AND spc.startdate < tsc.startdate
AND ( ( spc.plan_id = tsc.planid )
OR ( spc.plan_id = -1 AND tsc.planid IS NOT NULL )
)
)
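To make the difference between the JOIN form and the EXISTS form concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table names (services, dict_change) and sample rows are made up for illustration; only the predicates are taken from the two rewrites above.

```python
import sqlite3

# SQLite stands in for MySQL here; table names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE services (id INTEGER PRIMARY KEY, serviceid INTEGER,
                       planid INTEGER, startdate TEXT);
CREATE TABLE dict_change (service_id INTEGER, plan_id INTEGER,
                          starttime TEXT, startdate TEXT);
INSERT INTO services VALUES
  (1, 10, 5,    '2024-06-01'),
  (2, 20, 7,    '2024-06-01'),
  (3, 30, NULL, '2024-06-01');
INSERT INTO dict_change VALUES
  (10,  5, '2024-01-01', '2024-02-01'),  -- matches service 1 (same plan)
  (10, -1, '2024-01-01', '2024-02-01'),  -- matches service 1 again (wildcard)
  (20,  9, '2024-01-01', '2024-02-01'),  -- different plan: no match
  (30, -1, '2024-01-01', '2024-02-01');  -- wildcard, but planid is NULL: no match
""")

# JOIN form: one output row per matching (tsc, spc) pair.
join_ids = [r[0] for r in conn.execute("""
    SELECT tsc.id
    FROM services tsc
    JOIN dict_change spc
      ON tsc.serviceid = spc.service_id
     AND tsc.startdate > spc.starttime
     AND tsc.startdate > spc.startdate
     AND ( tsc.planid = spc.plan_id
        OR ( tsc.planid IS NOT NULL AND spc.plan_id = -1 ) )
    ORDER BY tsc.id
""")]

# EXISTS form: each tsc row appears at most once.
exists_ids = [r[0] for r in conn.execute("""
    SELECT tsc.id
    FROM services tsc
    WHERE EXISTS
      ( SELECT 1
        FROM dict_change spc
        WHERE spc.service_id = tsc.serviceid
          AND spc.starttime < tsc.startdate
          AND spc.startdate < tsc.startdate
          AND ( spc.plan_id = tsc.planid
             OR ( spc.plan_id = -1 AND tsc.planid IS NOT NULL ) ) )
    ORDER BY tsc.id
""")]

print(join_ids)    # service 1 is listed once per matching change row
print(exists_ids)  # service 1 is listed once
```

With this sample data, service 1 matches two change rows, so the JOIN form lists it twice while the EXISTS form lists it once.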

Join Tables with sum in sql query

It is showing Sum(with table name) is not valid. Kindly help:
Modws.DisplayDataGrid(dgvClosingBalance,
"Select
Invoice.Customer, Invoice.Sum(Total),
RptTempTable.Sum(INVOICETOTAL), RptTempTable.Sum(CNTOTAL),
RptTempTable.Sum(DEBITTOTAL), RptTempTable.Sum(RECEIPTTOTAL)
From Invoice
inner join RptTempTable on Invoice.Customer = RptTempTable.Customer")
RptTempTable.Sum(INVOICETOTAL) should be Sum(RptTempTable.INVOICETOTAL)
The same goes for the other calls to sum()
The table prefix belongs to the column name not the function call.
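A small runnable sketch of the corrected form follows. It uses Python's sqlite3 module rather than the asker's VB/MySQL setup, the sample rows are invented, and each customer has exactly one row per table so the join does not multiply the sums. The GROUP BY clause is my addition so that Customer is well defined next to the aggregates.

```python
import sqlite3

# SQLite stands in for the real database; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Invoice (Customer TEXT, Total REAL);
CREATE TABLE RptTempTable (Customer TEXT, INVOICETOTAL REAL, CNTOTAL REAL);
INSERT INTO Invoice VALUES ('Acme', 100.0), ('Zenith', 40.0);
INSERT INTO RptTempTable VALUES ('Acme', 100.0, 10.0), ('Zenith', 40.0, 0.0);
""")

# Correct: the table prefix qualifies the column inside SUM().
rows = conn.execute("""
    SELECT Invoice.Customer,
           SUM(Invoice.Total),
           SUM(RptTempTable.INVOICETOTAL),
           SUM(RptTempTable.CNTOTAL)
    FROM Invoice
    INNER JOIN RptTempTable ON Invoice.Customer = RptTempTable.Customer
    GROUP BY Invoice.Customer
    ORDER BY Invoice.Customer
""").fetchall()
print(rows)

# Incorrect: prefixing the function call itself is rejected as a syntax error.
try:
    conn.execute("SELECT Invoice.SUM(Total) FROM Invoice")
    prefixed_call_rejected = False
except sqlite3.OperationalError:
    prefixed_call_rejected = True
print(prefixed_call_rejected)
```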
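Also note that the query selects Invoice.Customer next to aggregates without a GROUP BY clause. With ONLY_FULL_GROUP_BY disabled, MySQL will accept this invalid SQL and will return "indeterminate" (aka "random") values instead.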
To understand the implications of MySQL's "loose" (aka "sloppy") group by implementation you might want to read these articles:
http://www.mysqlperformanceblog.com/2006/09/06/wrong-group-by-makes-your-queries-fragile/
http://rpbouman.blogspot.de/2007/05/debunking-group-by-myths.html

SQL statement hanging up in MySQL database

I need some SQL help. I have a SELECT statement that references several tables and hangs in the MySQL database. Is there a better way to write this statement so that it runs efficiently and does not hang the DB? Any help/direction would be appreciated. Thanks.
Here is the code:
Select Max(b.BurID) As BurID
From My.AppTable a,
My.AddressTable c,
My.BurTable b
Where a.AppID = c.AppID
And c.AppID = b.AppID
And (a.Forename = 'Bugs'
And a.Surname = 'Bunny'
And a.DOB = '1936-01-16'
And c.PostcodeAnywhereBuildingNumber = '999'
And c.PostcodeAnywherePostcode = 'SK99 9Q9'
And c.isPrimary = 1
And b.ErrorInd <> 1
And DateDiff(CurDate(), a.ApplicationDate) <= 30)
There is NO mysql error in the log. Sorry.
Pro tip: use explicit JOINs rather than a comma-separated list of tables. It's easier to see the logic you're using to JOIN that way. Rewriting your query to do that gives us this.
select Max(b.BurID) As BurID
From My.AppTable AS a
JOIN My.AddressTable AS c ON a.AppID = c.AppID
JOIN My.BurTable AS b ON c.AppID = b.AppID
WHERE (a.Forename = 'Bugs'
And a.Surname = 'Bunny'
And a.DOB = '1936-01-16'
And c.PostcodeAnywhereBuildingNumber = '999'
And c.PostcodeAnywherePostcode = 'SK99 9Q9'
And c.isPrimary = 1
And b.ErrorInd <> 1
And DateDiff(CurDate(), a.ApplicationDate) <= 30)
Next pro tip: Don't wrap columns in functions (like DateDiff()) in WHERE clauses, because that defeats using indexes to search on those columns. That means you should change the last line of your query to
AND a.ApplicationDate >= CurDate() - INTERVAL 30 DAY
This has the same logic as in your query, but it leaves a naked (and therefore index-searchable) column name in the search expression.
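The effect of leaving the column "naked" can be sketched with Python's sqlite3 module. SQLite is only a stand-in here: its planner and date functions differ from MySQL's, the index name idx_app_date and the sample rows are hypothetical, and a fixed reference date replaces CurDate() so the result is reproducible. The point is only that both predicates select the same rows while just one of them can seek into the index.

```python
import sqlite3

# SQLite stands in for MySQL; index name and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE AppTable (AppID INTEGER PRIMARY KEY, ApplicationDate TEXT);
CREATE INDEX idx_app_date ON AppTable (ApplicationDate);
INSERT INTO AppTable VALUES
  (1, '2024-03-01'),   -- 14 days before the reference date
  (2, '2024-02-14'),   -- exactly 30 days before
  (3, '2024-02-13');   -- 31 days before
""")

TODAY = "2024-03-15"  # fixed stand-in for CurDate()

# Column wrapped in a function: every row has to be evaluated.
wrapped = ("SELECT AppID FROM AppTable "
           "WHERE julianday(?) - julianday(ApplicationDate) <= 30")
# Bare column compared with a computed constant: an index range seek is possible.
bare = ("SELECT AppID FROM AppTable "
        "WHERE ApplicationDate >= date(?, '-30 day')")

def ids(sql):
    return sorted(r[0] for r in conn.execute(sql, (TODAY,)))

def plan(sql):
    # The last column of EXPLAIN QUERY PLAN holds the human-readable detail.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (TODAY,))
    return " ".join(r[3] for r in rows)

wrapped_ids, bare_ids = ids(wrapped), ids(bare)
wrapped_plan, bare_plan = plan(wrapped), plan(bare)

print(wrapped_ids, "|", wrapped_plan)
print(bare_ids, "|", bare_plan)
```

In MySQL the equivalent check is to run EXPLAIN on both versions of the query and compare the access type and the key column.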
Next, we need to look at your columns to see how you are searching, and cook up appropriate indexes.
Let's start with AppTable. You're screening by specific values of Forename, Surname, and DOB. You're screening by a range of ApplicationDate values. Finally you need AppID to manage your join. So, this compound index should help. Its columns are in the correct order to use a range scan to satisfy your query, and contains the needed results.
CREATE INDEX search1 USING BTREE
ON AppTable
(Forename, Surname, DOB, ApplicationDate, AppID)
Next, we can look at your AddressTable. Similar logic applies. You'll enter this table via the JOINed AppID, and then screen by specific values of three columns. So, try this index
CREATE INDEX search2 USING BTREE
ON AddressTable
(AppID, PostcodeAnywherePostcode, PostcodeAnywhereBuildingNumber, isPrimary)
Finally, we're on to your BurTable. Use similar logic as the other two, and try this index.
CREATE INDEX search3 USING BTREE
ON BurTable
(AppID, ErrorInd, BurID)
This kind of index is called a compound covering index, and can vastly speed up the sort of summary query you have asked about.
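As a rough illustration of what "covering" means, the sketch below builds the BurTable index in SQLite through Python's sqlite3 module and asks the planner how it would run the lookup. SQLite is an assumption made for runnability, the rows are invented, and the plan text is SQLite's, not MySQL's (where EXPLAIN reports "Using index" in the Extra column for a covered query).

```python
import sqlite3

# SQLite stands in for MySQL; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BurTable (BurID INTEGER, AppID INTEGER, ErrorInd INTEGER,
                       Payload TEXT);
CREATE INDEX search3 ON BurTable (AppID, ErrorInd, BurID);
INSERT INTO BurTable VALUES
  (10, 1, 0, 'a'),
  (11, 1, 1, 'b'),   -- excluded by ErrorInd <> 1
  (12, 1, 0, 'c'),
  (13, 2, 0, 'd');   -- different AppID
""")

query = "SELECT MAX(BurID) FROM BurTable WHERE AppID = ? AND ErrorInd <> 1"

max_bur_id = conn.execute(query, (1,)).fetchone()[0]

# Every column the query touches is in the index, so the planner
# can answer it without reading the table rows themselves.
plan_detail = " ".join(
    r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + query, (1,))
)

print(max_bur_id)
print(plan_detail)
```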

MySQL: Subquery returns more than 1 row

I know this has been asked plenty of times before, but I can't find an answer that is close to my case.
I have the following query:
SELECT c.cases_ID, c.cases_status, c.cases_title, ci.custinfo_FName, ci.custinfo_LName, c.cases_timestamp, o.organisation_name
FROM db_cases c, db_custinfo ci, db_organisation o
WHERE c.userInfo_ID = ci.userinfo_ID AND c.cases_status = '2'
AND organisation_name = (
SELECT organisation_name
FROM db_sites s, db_cases c
WHERE organisation_ID = '111'
)
AND s.sites_site_ID = c.sites_site_ID)
What I am trying to do is get the cases whose sites_site_ID also appears in the db_sites table alongside the organisation_ID I want to filter by (as defined by "organisation_ID = '111'"), but I am getting the response from MySQL stated in the question title.
I hope this makes sense, and I would appreciate any help on this one.
Thanks.
As the error states, your subquery returns more than one row, which it cannot do in this situation. If this is not the expected result, you really should investigate why it occurs. But if you know this will happen and want only the first result, use LIMIT 1 to limit the results to one row.
SELECT organisation_name
FROM db_sites s, db_cases c
WHERE organisation_ID = '111'
LIMIT 1
Well the problem is, obviously, that your subquery returns more than one row which is invalid when using it as a scalar subquery such as with the = operator in the WHERE clause.
Instead you could do an inner join on the subquery which would filter your results to only rows that matched the ON clause. This will get you all rows that match, even if there is more than one returned in the subquery.
UPDATE:
You're likely getting more than one row from your subquery because you're doing a cross join on the db_sites and db_cases table. You're using the old-style join syntax and then not qualifying any predicate to join the tables on in the WHERE clause. Using this old style of joining tables is not recommended for this very reason. It would be better if you explicitly stated what kind of join it was and how the tables should be joined.
Good pages on joins:
http://dev.mysql.com/doc/refman/5.0/en/join.html (for the right syntax)
http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html (for the differences between the types of joins)
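The row multiplication described in the update can be reproduced with a minimal sketch. It uses Python's sqlite3 module instead of MySQL, and the two-column table layouts and the sample rows are assumptions made for illustration; only the table and column names come from the question.

```python
import sqlite3

# SQLite stands in for MySQL; table layouts and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE db_sites (sites_site_ID INTEGER, organisation_ID TEXT);
CREATE TABLE db_cases (cases_ID INTEGER, sites_site_ID INTEGER);
INSERT INTO db_sites VALUES (1, '111'), (2, '222');
INSERT INTO db_cases VALUES (100, 1), (101, 1), (102, 2);
""")

# Old-style comma join with no join predicate: a cross join.
# The single matching site is paired with every row of db_cases.
cross_rows = conn.execute("""
    SELECT s.sites_site_ID, c.cases_ID
    FROM db_sites s, db_cases c
    WHERE s.organisation_ID = '111'
""").fetchall()

# Explicit join with the predicate stated in the ON clause.
joined_rows = conn.execute("""
    SELECT s.sites_site_ID, c.cases_ID
    FROM db_sites s
    JOIN db_cases c ON s.sites_site_ID = c.sites_site_ID
    WHERE s.organisation_ID = '111'
    ORDER BY c.cases_ID
""").fetchall()

print(len(cross_rows))  # one site times three cases
print(joined_rows)      # only the cases that belong to that site
```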
I was battling this for an hour, and overcomplicated it completely. Sometimes a quick break and writing it out on an online forum can solve it for you ;)
Here is the query as it should be.
SELECT c.cases_ID, c.cases_status, c.cases_title, ci.custinfo_FName, ci.custinfo_LName, c.cases_timestamp, c.sites_site_ID
FROM db_cases c, db_custinfo ci, db_sites s
WHERE c.userInfo_ID = ci.userinfo_ID AND c.cases_status = '2' AND (s.organisation_ID = '111' AND s.sites_site_ID = c.sites_site_ID)
Let me rewrite what you have posted:
SELECT
c.cases_ID, c.cases_status, c.cases_title, ci.custinfo_FName, ci.custinfo_LName,
c.cases_timestamp, c.sites_site_ID
FROM
db_cases c
JOIN
db_custinfo ci ON c.userInfo_ID = ci.userinfo_ID and c.cases_status = '2'
JOIN
db_sites s ON s.sites_site_ID = c.sites_site_ID and s.organisation_ID = 111