How to optimize this MySQL query (MAX, REPLACE, LEFT JOIN)? - mysql

I know everyone hates overly specific questions, but I need your help.
So, this is the code. It has to fetch info for a user:
select
c.id,
c.firstName,
c.lastName,
...
dt.code as docCode,
dt.name as docName,
replace(replace(replace(replace(replace(cc.contact, ' ',''), '(',''), ')', ''), '-', ''), '+', '') as homePhone,
replace(replace(replace(replace(replace(cc1.contact, ' ',''), '(',''), ')', ''), '-', ''), '+', '') as cellPhone
from Client c
left join ClientPolicy p on p.id=(select max(pp.id)
from ClientPolicy pp
where
pp.client_id = c.id
and pp.deleted = 0)
left join rbPolicyType pt on pt.id = p.policyType_id
left join ClientDocument d on d.id =(SELECT MAX(dd.id)
FROM ClientDocument dd
WHERE
dd.client_id = c.id
and dd.deleted = 0)
left join rbDocumentType dt on dt.id = d.documentType_id and dt.code IN ('1')
left join ClientContact cc ON cc.id = (select MAX(ccc.id)
FROM ClientContact ccc
where
ccc.client_id = c.id
and ccc.deleted = 0
and ccc.contactType_id = 1)
left join ClientContact cc1 ON cc1.id = (SELECT MAX(ccc1.id)
FROM ClientContact ccc1
WHERE
ccc1.client_id = c.id
and ccc1.deleted = 0
and ccc1.contactType_id = 3)
where
c.deleted = 0
and c.firstName like '%'
and c.patrName like '%'
and c.lastName like '%'
and replace(replace(replace(replace(replace(cc.contact, ' ',''), '(',''), ')', ''), '-', ''), '+', '') like '%521%'
and replace(replace(replace(replace(replace(cc1.contact, ' ',''), '(',''), ')', ''), '-', ''), '+', '') like '%8905%'
Every % means that the user will insert some data there, for example like '%8905%'.
About indexes
I already added indexes like the ones below, so I'm sure that alone will not help:
INDEX client_insurer (client_id, insurer_id),
INDEX policyType_id (policyType_id),
INDEX Serial_Num (serial, number),
About replace(replace...
I am sure that REGEXP would only save me about one second, and adding the solution from
How do you extract a numerical value from a string in a MySQL query?
doesn't reduce the time (it actually added 5 seconds more).
I have no idea how to make it faster (maybe move conditions from the WHERE into the joins?). Please help me.

Well, you didn't add an explain, no information about your data, no table-structure, no information about (useful) indexes. Without this, optimization is just an educated guess.
But I'll try anyway.
I'd try a series of subqueries, since you have to go through all of your data anyway because of the likes.
select
c.id,
c.firstName,
c.lastName,
...
dt.code as docCode,
dt.name as docName,
cphone1 as homePhone,
cphone3 as cellPhone
from
( select *,
replace(replace(replace(replace(replace(cc.contact, ' ',''), '(',''), ')', ''), '-', ''), '+', '') as cphone1,
replace(replace(replace(replace(replace(cc1.contact, ' ',''), '(',''), ')', ''), '-', ''), '+', '') as cphone3
from
(select c.id,
(select max(pp.id) from ClientPolicy pp
where pp.client_id = c.id and pp.deleted = 0) as pmax,
(select max(dd.id) FROM ClientDocument dd
WHERE dd.client_id = c.id and dd.deleted = 0) as dmax,
(select MAX(ccc.id) FROM ClientContact ccc
where ccc.client_id = c.id and ccc.deleted = 0
and ccc.contactType_id = 1) as cmax1,
(SELECT MAX(ccc1.id) FROM ClientContact ccc1
WHERE ccc1.client_id = c.id and ccc1.deleted = 0
and ccc1.contactType_id = 3) as cmax3
from Client c
where c.deleted = 0
and c.firstName like '%'
and c.patrName like '%'
and c.lastName like '%'
) as clientbase
join ClientContact cc on cc.id = clientbase.cmax1
join ClientContact cc1 on cc1.id = clientbase.cmax3
) as clientnamed
join Client c on c.id = clientnamed.id
left join ClientPolicy p on p.id=clientnamed.pmax
left join rbPolicyType pt on pt.id = p.policyType_id
left join ClientDocument d on d.id = clientnamed.dmax
left join rbDocumentType dt on dt.id = d.documentType_id and dt.code = 1
where cphone1 like '%521%' and cphone3 like '%8905%';
If your search parameters '%521%' or '%8905%' are optional (i.e. not always given), you have to use left join ClientContact cc (same for cc1), but then do not put cphone1 like '%521%' in your where (same for cphone3 like '%8905%'), as it would act like an inner join again. (And your like '%' conditions should actually not be in there either.)
You might get an improvement if you can keep the phone numbers in a clean form in a separate column (e.g. maintained by a trigger).
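A minimal sketch of that idea, assuming you are free to add a column (the column name contact_clean and the trigger names are made up for illustration):
ALTER TABLE ClientContact ADD COLUMN contact_clean VARCHAR(64);
-- backfill existing rows with the stripped-down number
UPDATE ClientContact
SET contact_clean = REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(contact, ' ', ''), '(', ''), ')', ''), '-', ''), '+', '');
-- keep the column in sync for new and changed rows
CREATE TRIGGER clientcontact_clean_bi BEFORE INSERT ON ClientContact FOR EACH ROW
SET NEW.contact_clean = REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(NEW.contact, ' ', ''), '(', ''), ')', ''), '-', ''), '+', '');
CREATE TRIGGER clientcontact_clean_bu BEFORE UPDATE ON ClientContact FOR EACH ROW
SET NEW.contact_clean = REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(NEW.contact, ' ', ''), '(', ''), ')', ''), '-', ''), '+', '');
The search could then filter on contact_clean like '%521%' directly instead of rebuilding the cleaned value per row.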
Now to indexes:
You absolutely have to have id as the primary key.
Create an index on Client (id, deleted), and on ClientPolicy and ClientDocument an index on (client_id, deleted), since those correlated subqueries filter on client_id.
You should try an index (client_id, deleted, contactType_id) or (client_id, contactType_id, deleted) on ClientContact - depending on your data: if you have a lot of deleted entries, the first one should work better, otherwise the second one.
Whether these indexes have a measurable effect depends on your data, and since you didn't tell us anything about it, you will have to try them out yourself. (Having a lot of indexes slows your inserts/updates down, so don't spam indexes.)
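As a sketch, using the column names from your query (the index names are just placeholders), those indexes could be created like this:
ALTER TABLE ClientPolicy ADD INDEX idx_cp_client_deleted (client_id, deleted);
ALTER TABLE ClientDocument ADD INDEX idx_cd_client_deleted (client_id, deleted);
ALTER TABLE ClientContact ADD INDEX idx_cc_client_deleted_type (client_id, deleted, contactType_id);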
For any follow-up, you have to add at least the following:
the EXPLAIN output, to see what MySQL is actually doing
the EXPLAIN for just the subquery select c.id, (select max(pp.id) from ClientPolicy pp ... c.lastName like '%' without the surrounding code
the time it takes to get the results for the whole query and for just that subquery
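For instance, the EXPLAIN for the inner client query on its own (select list shortened here to one of the correlated subqueries) could be collected like this:
EXPLAIN
SELECT c.id,
       (SELECT MAX(pp.id)
          FROM ClientPolicy pp
         WHERE pp.client_id = c.id AND pp.deleted = 0) AS pmax
  FROM Client c
 WHERE c.deleted = 0
   AND c.lastName LIKE '%';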

Related

Table valued function xml reader is causing performance issues in SQL Server

Could you please help me change the script to use substring or any other alternative approach to improve the query?
I am not a developer; it would be great if you could provide a sample script based on the script below. The table valued function costs 49% in the actual execution plan of the query and runs for 7 minutes.
Thank you in advance.
LEFT OUTER JOIN
(SELECT
pp2.patient_id,
Foodallergy= STUFF((SELECT ',' + cd.description
FROM [OHCP_OHProblemList].[problemlist].[problems_v] pp
INNER JOIN [OHCP_OHCLINICAL].[CodeSet].[CodeSet] CS ON cs.identifier = pp.problem_name_code_set
INNER JOIN ohcp_ohclinical.codeset.CodeDefinition cd ON (cd.code = pp.problem_name_code AND cd.codesetid = cs.id)
WHERE patient_id = pp2.patient_id
AND pp.last_update_action NOT IN ('DELETE', 'CLOSE')
AND pp.adr_class_code IN (3,4)
AND cd.description != 'Other'
ORDER BY [description]
FOR XML PATH ('')), 1, 1, ''),
description = STUFF((SELECT N',' + ' Other:' + pp.problem_name_freetext_desc
FROM [OHCP_OHProblemList].[problemlist].[problems_v] pp
INNER JOIN [OHCP_OHCLINICAL].[CodeSet].[CodeSet] CS on cs.identifier = pp.problem_name_code_set
INNER JOIN ohcp_ohclinical.codeset.CodeDefinition cd ON (cd.code = pp.problem_name_code AND cd.codesetid = cs.id)
WHERE patient_id = pp2.patient_id
AND pp.last_update_action not in ('DELETE', 'CLOSE')
AND pp.adr_class_code in (3, 4)
FOR XML PATH('')), 1, 1, '')
FROM
[OHCP_OHProblemList].[problemlist].[problems_v] pp2
GROUP BY
pp2.patient_id) AS problem ON problem.patient_id = externalpatientid.externalpatientid
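On SQL Server 2017 and later, the STUFF(... FOR XML PATH('')) pattern above is often replaced with STRING_AGG, which tends to show up more cheaply in the execution plan. A sketch of the Foodallergy part only, reusing the names from the question (illustrative, not a verified rewrite of the whole join):
SELECT pp.patient_id,
       STRING_AGG(cd.description, ',') WITHIN GROUP (ORDER BY cd.description) AS Foodallergy
FROM [OHCP_OHProblemList].[problemlist].[problems_v] AS pp
INNER JOIN [OHCP_OHCLINICAL].[CodeSet].[CodeSet] AS cs ON cs.identifier = pp.problem_name_code_set
INNER JOIN ohcp_ohclinical.codeset.CodeDefinition AS cd ON cd.code = pp.problem_name_code AND cd.codesetid = cs.id
WHERE pp.last_update_action NOT IN ('DELETE', 'CLOSE')
  AND pp.adr_class_code IN (3, 4)
  AND cd.description <> 'Other'
GROUP BY pp.patient_id;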

How to get all records from one table irrespective of where clause

Below is my query. When I put in the where clause, my recordset gets reduced. I can understand why it is reducing,
but I am looking for all records from the teacher_profiles table (31 records) and then the corresponding details if there are any, else blank.
Can anyone suggest? Using MySQL.
SELECT
CONCAT(a.teacherFirstName , ' ', COALESCE(a.teacherMiddleName, ''), ' ', a.teacherLastName ) as teacherName ,
COALESCE(GROUP_CONCAT(DISTINCT c.subjectLongName SEPARATOR ', ') , '') AS subjects ,
COALESCE(GROUP_CONCAT(DISTINCT f.classStd SEPARATOR ', ') , '') AS classes
FROM
teacher_profiles a -- <-- this table has 31 records; I need all 31 of them in the recordset
LEFT JOIN subjectteacherallocation b ON a.teacherId = b.teacherId
LEFT JOIN subject_master c ON b.subjectId = c.subjectId
LEFT JOIN timetabledistribution d ON a.teacherId = d.teacherId
LEFT JOIN TimeTableClassSection e ON e.TimeTableClassSectionId = d.TimeTableClassSectionId
LEFT JOIN class_master f ON f.classId = e.classId
WHERE b.academicYear='2015' -- <-- this condition is reducing the records
GROUP BY a.teacherId
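A commonly suggested change for this situation (sketched here, not verified against this exact schema) is to move the filter out of the WHERE and into the LEFT JOIN condition, so teachers without a 2015 allocation are kept with blank details instead of being filtered out:
SELECT
CONCAT(a.teacherFirstName, ' ', COALESCE(a.teacherMiddleName, ''), ' ', a.teacherLastName) AS teacherName,
COALESCE(GROUP_CONCAT(DISTINCT c.subjectLongName SEPARATOR ', '), '') AS subjects,
COALESCE(GROUP_CONCAT(DISTINCT f.classStd SEPARATOR ', '), '') AS classes
FROM teacher_profiles a
LEFT JOIN subjectteacherallocation b ON a.teacherId = b.teacherId AND b.academicYear = '2015' -- filter moved into the join
LEFT JOIN subject_master c ON b.subjectId = c.subjectId
LEFT JOIN timetabledistribution d ON a.teacherId = d.teacherId
LEFT JOIN TimeTableClassSection e ON e.TimeTableClassSectionId = d.TimeTableClassSectionId
LEFT JOIN class_master f ON f.classId = e.classId
GROUP BY a.teacherId;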

MySQL - Way to set a default return value when NULL is found?

Is there a way in SQL to set a default return value when NULL is returned for part of the results?
Here is my SQL:
SELECT p.id, p.title, concat( u1.meta_value, ' ', u2.meta_value ) as fullname, concat( r.name, ', ', c.name ) as location
FROM modules_profiles p
LEFT JOIN moonlight_usermeta u1 ON p.user_id = u1.user_id AND u1.meta_key = 'first_name'
LEFT JOIN moonlight_usermeta u2 ON p.user_id = u2.user_id AND u2.meta_key = 'last_name'
LEFT JOIN modules_regions r ON r.id = p.region_id
LEFT JOIN modules_countries c ON c.id = p.country_id
WHERE p.certification IN ( 'certified' ) AND p.country_id IN ( 2 )
ORDER BY p.user_id ASC
There are times when there is no region_id set for a given profile; therefore, NULL is returned for location for that respective user_id, even though we do have a country's name (c.name).
Is there a way in this case to just return the c.name only?
Use the COALESCE() function like below; it returns the first non-NULL value in the provided list:
COALESCE(col_name, 'default_value')
For your case, do
COALESCE(r.name, c.name)
I think you are specifically talking about the part
concat( r.name, ', ', c.name ) as location
You can modify this using a CASE expression as well:
case when r.name is not null and c.name is not null
then concat( r.name, ', ', c.name ) else c.name end as location
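Applied back to the original query, both suggestions amount to the same thing: MySQL's CONCAT() returns NULL as soon as any argument is NULL, so the whole location expression can be wrapped in COALESCE() with c.name as the fallback. A sketch (only the location expression changes):
SELECT p.id, p.title,
concat( u1.meta_value, ' ', u2.meta_value ) as fullname,
COALESCE(concat( r.name, ', ', c.name ), c.name) as location
FROM modules_profiles p
LEFT JOIN moonlight_usermeta u1 ON p.user_id = u1.user_id AND u1.meta_key = 'first_name'
LEFT JOIN moonlight_usermeta u2 ON p.user_id = u2.user_id AND u2.meta_key = 'last_name'
LEFT JOIN modules_regions r ON r.id = p.region_id
LEFT JOIN modules_countries c ON c.id = p.country_id
WHERE p.certification IN ( 'certified' ) AND p.country_id IN ( 2 )
ORDER BY p.user_id ASC;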
You want to use MySQL's IFNULL(value, default) function.
Coalesce could help you
COALESCE(region_id, 'default value')

mysql - rewrite query with subqueries

Consider a table Users with columns Id, Name, Surname and a table Actions with columns Ip and Actor. I need to retrieve, for every Ip, the set of users who performed an action using that Ip.
What I have now looks like:
SELECT a.ip, (
SELECT GROUP_CONCAT(t.id, '-', t.name, ' ', t.surname) FROM(
SELECT ud.id, ud.name, ud.surname
FROM users_data AS ud
JOIN actions AS a2 ON a2.actor = ud.id
WHERE a2.ip = a.ip
GROUP BY ud.id) AS t
)
FROM actions AS a
WHERE a.ip != '' AND a.ip != '0.0.0.0'
GROUP BY a.ip
It doesn't work because a.ip is unknown in the where clause of the inner subquery.
Due to performance issues, I need to avoid using DISTINCT.
Any suggestions?
You can rewrite your query as
SELECT n.ip, GROUP_CONCAT( DISTINCT n.your_user SEPARATOR ' -- ') `users` FROM
(
SELECT a.ip AS ip, CONCAT(ud.id, '-', ud.name, ' ', ud.surname) `your_user`
FROM users_data AS ud
JOIN actions AS a ON a.actor = ud.id
) AS n
WHERE n.ip != '' AND n.ip != '0.0.0.0'
GROUP BY n.ip
Note: be aware that the result is truncated to the maximum length given by the group_concat_max_len system variable, which has a default value of 1024.
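If longer lists are needed, that limit can be checked and raised for the current session before running the query, for example:
SHOW VARIABLES LIKE 'group_concat_max_len';
SET SESSION group_concat_max_len = 1000000;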
Have you tried writing the condition a2.ip = a.ip outside the subquery,
i.e. in the where clause of the outer query?
I solved it using this query (still quite slow, so there's still room for improvement...):
SELECT SQL_NO_CACHE t.ip, COUNT(t.id) AS c, GROUP_CONCAT(t.id, '-', t.name, ' ', t.surname, '-', t.designerAt > 0) FROM (
SELECT a.ip, ud.id, ud.name, ud.surname, u.designerAt
FROM actions AS a
JOIN users_data AS ud ON ud.id = a.actor
JOIN users AS u ON u.id = a.actor
WHERE a.ip != ''
AND a.ip != '0.0.0.0'
AND a.actor !=0
GROUP BY a.ip, a.actor
) AS t
GROUP BY t.ip

how to find duplicate indexes in sql server [duplicate]

Is anyone aware of a T-SQL script that can detect redundant indexes across an entire database? An example of a redundant index in a table would be as follows:
Index 1: 'ColumnA', 'ColumnB', 'ColumnC'
Index 2: 'ColumnA', 'ColumnB'
Ignoring other considerations, such as the width of columns and covering indexes, Index 2 would be redundant.
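(For concreteness, the two example indexes would look something like this; the table and index names are invented:)
CREATE INDEX IX_Example_A_B_C ON dbo.ExampleTable (ColumnA, ColumnB, ColumnC); -- Index 1
CREATE INDEX IX_Example_A_B ON dbo.ExampleTable (ColumnA, ColumnB); -- Index 2, a left prefix of Index 1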
Thanks.
There are situations where the redundancy doesn't hold. For example, say ColumnC was a huge field, but you'd sometimes have to retrieve it quickly. Your index 1 would not require a key lookup for:
select ColumnC from YourTable where ColumnA = 12
On the other hand, index 2 is much smaller, so it can be read in memory for queries that require an index scan:
select * from YourTable where ColumnA like '%hello%'
So they're not really redundant.
If you're not convinced by my above argument, you can find "redundant" indexes like:
;with ind as (
select a.object_id
, a.index_id
, cast(col_list.list as varchar(max)) as list
from (
select distinct object_id
, index_id
from sys.index_columns
) a
cross apply
(
select cast(column_id as varchar(16)) + ',' as [text()]
from sys.index_columns b
where a.object_id = b.object_id
and a.index_id = b.index_id
for xml path(''), type
) col_list (list)
)
select object_name(a.object_id) as TableName
, asi.name as FatherIndex
, bsi.name as RedundantIndex
from ind a
join sys.sysindexes asi
on asi.id = a.object_id
and asi.indid = a.index_id
join ind b
on a.object_id = b.object_id
and len(a.list) > len(b.list)
and left(a.list, LEN(b.list)) = b.list
join sys.sysindexes bsi
on bsi.id = b.object_id
and bsi.indid = b.index_id
Bring cake for your users in case performance decreases "unexpectedly" :-)
Inspired by Paul Nielsen, I wrote this query to find/distinguish:
Duplicates (ignoring include order)
Redundant (different include columns)
Overlapping (different index columns)
And also record their usage
(One might also want to use is_descending_key, but I don't need it.)
WITH IndexColumns AS
(
SELECT I.object_id AS TableObjectId, OBJECT_SCHEMA_NAME(I.object_id) + '.' + OBJECT_NAME(I.object_id) AS TableName, I.index_id AS IndexId, I.name AS IndexName
, (IndexUsage.user_seeks + IndexUsage.user_scans + IndexUsage.user_lookups) AS IndexUsage
, IndexUsage.user_updates AS IndexUpdates
, (SELECT CASE is_included_column WHEN 1 THEN NULL ELSE column_id END AS [data()]
FROM sys.index_columns AS IndexColumns
WHERE IndexColumns.object_id = I.object_id
AND IndexColumns.index_id = I.index_id
ORDER BY index_column_id, column_id
FOR XML PATH('')
) AS ConcIndexColumnNrs
,(SELECT CASE is_included_column WHEN 1 THEN NULL ELSE COL_NAME(I.object_id, column_id) END AS [data()]
FROM sys.index_columns AS IndexColumns
WHERE IndexColumns.object_id = I.object_id
AND IndexColumns.index_id = I.index_id
ORDER BY index_column_id, column_id
FOR XML PATH('')
) AS ConcIndexColumnNames
,(SELECT CASE is_included_column WHEN 1 THEN column_id ELSE NULL END AS [data()]
FROM sys.index_columns AS IndexColumns
WHERE IndexColumns.object_id = I.object_id
AND IndexColumns.index_id = I.index_id
ORDER BY column_id
FOR XML PATH('')
) AS ConcIncludeColumnNrs
,(SELECT CASE is_included_column WHEN 1 THEN COL_NAME(I.object_id, column_id) ELSE NULL END AS [data()]
FROM sys.index_columns AS IndexColumns
WHERE IndexColumns.object_id = I.object_id
AND IndexColumns.index_id = I.index_id
ORDER BY column_id
FOR XML PATH('')
) AS ConcIncludeColumnNames
FROM sys.indexes AS I
LEFT OUTER JOIN sys.dm_db_index_usage_stats AS IndexUsage
ON IndexUsage.object_id = I.object_id
AND IndexUsage.index_id = I.index_id
AND IndexUsage.Database_id = db_id()
)
SELECT
C1.TableName
, C1.IndexName AS 'Index1'
, C2.IndexName AS 'Index2'
, CASE WHEN (C1.ConcIndexColumnNrs = C2.ConcIndexColumnNrs) AND (C1.ConcIncludeColumnNrs = C2.ConcIncludeColumnNrs) THEN 'Exact duplicate'
WHEN (C1.ConcIndexColumnNrs = C2.ConcIndexColumnNrs) THEN 'Different includes'
ELSE 'Overlapping columns' END
-- , C1.ConcIndexColumnNrs
-- , C2.ConcIndexColumnNrs
, C1.ConcIndexColumnNames
, C2.ConcIndexColumnNames
-- , C1.ConcIncludeColumnNrs
-- , C2.ConcIncludeColumnNrs
, C1.ConcIncludeColumnNames
, C2.ConcIncludeColumnNames
, C1.IndexUsage
, C2.IndexUsage
, C1.IndexUpdates
, C2.IndexUpdates
, 'DROP INDEX ' + C2.IndexName + ' ON ' + C2.TableName AS Drop2
, 'DROP INDEX ' + C1.IndexName + ' ON ' + C1.TableName AS Drop1
FROM IndexColumns AS C1
INNER JOIN IndexColumns AS C2
ON (C1.TableObjectId = C2.TableObjectId)
AND (
-- exact: show lower IndexId as 1
(C1.IndexId < C2.IndexId
AND C1.ConcIndexColumnNrs = C2.ConcIndexColumnNrs
AND C1.ConcIncludeColumnNrs = C2.ConcIncludeColumnNrs)
-- different includes: show longer include as 1
OR (C1.ConcIndexColumnNrs = C2.ConcIndexColumnNrs
AND LEN(C1.ConcIncludeColumnNrs) > LEN(C2.ConcIncludeColumnNrs))
-- overlapping: show longer index as 1
OR (C1.IndexId <> C2.IndexId
AND C1.ConcIndexColumnNrs <> C2.ConcIndexColumnNrs
AND C1.ConcIndexColumnNrs like C2.ConcIndexColumnNrs + ' %')
)
ORDER BY C1.TableName, C1.ConcIndexColumnNrs
I created the following query that gives me a lot of good information to identify duplicate and near-duplicate indexes. It also includes other information like how many pages of memory an index takes, which allows me to give a higher priority to larger indexes. It shows what columns are indexed and what columns are included, so I can see if there are two indexes that are almost identical with only slight variations in the included columns.
WITH IndexSummary AS
(
SELECT DISTINCT sys.objects.name AS [Table Name],
sys.indexes.name AS [Index Name],
SUBSTRING((SELECT ', ' + sys.columns.Name as [text()]
FROM sys.columns
INNER JOIN sys.index_columns
ON sys.index_columns.column_id = sys.columns.column_id
AND sys.index_columns.object_id = sys.columns.object_id
WHERE sys.index_columns.index_id = sys.indexes.index_id
AND sys.index_columns.object_id = sys.indexes.object_id
AND sys.index_columns.is_included_column = 0
ORDER BY sys.columns.name
FOR XML Path('')), 2, 10000) AS [Indexed Column Names],
ISNULL(SUBSTRING((SELECT ', ' + sys.columns.Name as [text()]
FROM sys.columns
INNER JOIN sys.index_columns
ON sys.index_columns.column_id = sys.columns.column_id
AND sys.index_columns.object_id = sys.columns.object_id
WHERE sys.index_columns.index_id = sys.indexes.index_id
AND sys.index_columns.object_id = sys.indexes.object_id
AND sys.index_columns.is_included_column = 1
ORDER BY sys.columns.name
FOR XML Path('')), 2, 10000), '') AS [Included Column Names],
sys.indexes.index_id, sys.indexes.object_id
FROM sys.indexes
INNER JOIN sys.index_columns
ON sys.indexes.index_id = sys.index_columns.index_id
AND sys.indexes.object_id = sys.index_columns.object_id
INNER JOIN sys.objects
ON sys.objects.object_id = sys.indexes.object_id
WHERE sys.objects.type = 'U'
)
SELECT IndexSummary.[Table Name],
IndexSummary.[Index Name],
IndexSummary.[Indexed Column Names],
IndexSummary.[Included Column Names],
PhysicalStats.page_count as [Page Count],
CONVERT(decimal(18,2), PhysicalStats.page_count * 8 / 1024.0) AS [Size (MB)],
CONVERT(decimal(18,2), PhysicalStats.avg_fragmentation_in_percent) AS [Fragment %]
FROM IndexSummary
INNER JOIN sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, NULL)
AS PhysicalStats
ON PhysicalStats.index_id = IndexSummary.index_id
AND PhysicalStats.object_id = IndexSummary.object_id
WHERE (SELECT COUNT(*) as Computed
FROM IndexSummary Summary2
WHERE Summary2.[Table Name] = IndexSummary.[Table Name]
AND Summary2.[Indexed Column Names] = IndexSummary.[Indexed Column Names]) > 1
ORDER BY [Table Name], [Index Name], [Indexed Column Names], [Included Column Names]
Results of the query look like this:
Table Name Index Indexed Cols Included Cols Pages Size (MB) Frag %
My_Table Indx_1 Col1 Col2, Col3 123 0.96 8.94
My_Table Indx_2 Col1 Col2, Col3 123 0.96 8.94
Complete Description
For the complete explanation see Identifying Duplicate or Redundant Indexes in SQL Server.
Try the script below to show unused indexes; hope it helps.
/****************************************************************
Description: Script to show Unused Indexes using DMVs
****************************************************************/
SELECT TOP 100
o.name AS ObjectName
, i.name AS IndexName
, i.index_id AS IndexID
, dm_ius.user_seeks AS UserSeek
, dm_ius.user_scans AS UserScans
, dm_ius.user_lookups AS UserLookups
, dm_ius.user_updates AS UserUpdates
, p.TableRows
, 'DROP INDEX ' + QUOTENAME(i.name)
+ ' ON ' + QUOTENAME(s.name) + '.' + QUOTENAME(OBJECT_NAME(dm_ius.object_id)) as 'drop statement'
FROM sys.dm_db_index_usage_stats dm_ius
INNER JOIN sys.indexes i ON i.index_id = dm_ius.index_id AND dm_ius.object_id = i.object_id
INNER JOIN sys.objects o on dm_ius.object_id = o.object_id
INNER JOIN sys.schemas s on o.schema_id = s.schema_id
INNER JOIN (SELECT SUM(p.rows) TableRows, p.index_id, p.object_id
FROM sys.partitions p GROUP BY p.index_id, p.object_id) p
ON p.index_id = dm_ius.index_id AND dm_ius.object_id = p.object_id
WHERE OBJECTPROPERTY(dm_ius.object_id,'IsUserTable') = 1
AND dm_ius.database_id = DB_ID()
AND i.type_desc = 'nonclustered'
AND i.is_primary_key = 0
AND i.is_unique_constraint = 0
ORDER BY (dm_ius.user_seeks + dm_ius.user_scans + dm_ius.user_lookups) ASC
GO
I was just reading some MSDN blogs, noticed a script to do this and remembered this question.
I haven't bothered testing it side by side with Andomar's to see if one has any particular benefit over the other.
One amendment I would likely make to both, though, would be to take the size of both indexes into account when assessing redundancy; a sketch of that follows.
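For example, per-index size could be pulled from sys.dm_db_partition_stats and joined into either of the queries above (this is an illustrative addition, not part of the original scripts):
SELECT i.object_id,
       i.index_id,
       SUM(ps.used_page_count) AS used_pages,
       SUM(ps.used_page_count) * 8 / 1024.0 AS used_mb
FROM sys.indexes AS i
INNER JOIN sys.dm_db_partition_stats AS ps
    ON ps.object_id = i.object_id
   AND ps.index_id = i.index_id
GROUP BY i.object_id, i.index_id;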
Edit:
Also see Kimberley Tripp's post on Removing duplicate indexes