Why is Linq2SQL generating a nested query instead of using a JOIN? - sql-server-2008

I'm trying to understand why Linq is generating the SQL that it is for the statement below:
var dlo = new DataLoadOptions();
dlo.LoadWith<TemplateNode>(x => x.TemplateElement);
db.LoadOptions = dlo;
var data = from node in db.TemplateNodes
where node.TemplateId == someValue
orderby node.Left
select node;
Which generates the following SQL:
SELECT [t2].[Id],
[t2].[ParentId],
[t2].[TemplateId],
[t2].[ElementId],
[t2].[Left] AS [Left],
[t2].[Right] AS [Right],
[t2].[Id2],
[t2].[Content]
FROM (SELECT ROW_NUMBER() OVER (ORDER BY [t0].[Left]) AS [ROW_NUMBER],
[t0].[Id],
[t0].[ParentId],
[t0].[TemplateId],
[t0].[ElementId],
[t0].[Left],
[t0].[Right],
[t1].[Id] AS [Id2],
[t1].[Content]
FROM [dbo].[TemplateNode] AS [t0]
INNER JOIN [dbo].[TemplateElement] AS [t1]
ON [t1].[Id] = [t0].[ElementId]
WHERE [t0].[TemplateId] = 16 /* @p0 */) AS [t2]
WHERE [t2].[ROW_NUMBER] > 1 /* @p1 */
ORDER BY [t2].[ROW_NUMBER]
There is a Foreign Key from TemplateNode.ElementId to TemplateElement.Id.
I would have expected the query to produce a JOIN, like so:
SELECT * FROM TemplateNode
INNER JOIN TemplateElement ON TemplateNode.ElementId = TemplateElement.Id
WHERE TemplateNode.TemplateId = @TemplateId
As per the suggestions in the answers to this question I have profiled both queries and the JOIN is 3 times faster than the nested query.
I'm using a .NET 4.0 Windows Forms app to test with SQL Server 2008 SP2 64bit developer edition.

The only reason LINQ to SQL would generate the ROW_NUMBER query is the Skip method. As bizarre as the SQL above looks, T-SQL in SQL Server 2008 has no simple paging construct like MySQL's LIMIT 10,25, so you get this shape of SQL when using Skip and Take.
I would assume that a Skip is being applied somewhere for paging purposes and LINQ to SQL is rewriting the query accordingly. With a tool like LINQPad you can run different LINQ queries and inspect the SQL they generate.

Your example of a join is not equivalent. You cannot get the ROW_NUMBER and subsequently select only rows WHERE ROW_NUMBER > 1 with a simple join. You would have to do a sub-select or similar to get this result.
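To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server (an assumption made purely so the example is runnable; it needs SQLite 3.25 or later for ROW_NUMBER). The table contents are invented. It runs the plain join and the ROW_NUMBER form side by side, and the latter drops the first row, which is what a Skip(1) would do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TemplateElement (Id INTEGER PRIMARY KEY, Content TEXT);
    CREATE TABLE TemplateNode (
        Id INTEGER PRIMARY KEY,
        TemplateId INTEGER,
        ElementId INTEGER REFERENCES TemplateElement(Id),
        "Left" INTEGER
    );
    -- Invented sample data: three nodes in template 16.
    INSERT INTO TemplateElement VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO TemplateNode VALUES (10, 16, 1, 1), (11, 16, 2, 2), (12, 16, 3, 3);
""")

# The join the question expected: returns every matching node.
plain_join = conn.execute("""
    SELECT n.Id
    FROM TemplateNode AS n
    INNER JOIN TemplateElement AS e ON e.Id = n.ElementId
    WHERE n.TemplateId = ?
    ORDER BY n."Left"
""", (16,)).fetchall()

# The shape LINQ to SQL emitted: number the rows, then keep rows after the first.
skip_one = conn.execute("""
    SELECT t.Id
    FROM (SELECT ROW_NUMBER() OVER (ORDER BY n."Left") AS rn, n.Id
          FROM TemplateNode AS n
          INNER JOIN TemplateElement AS e ON e.Id = n.ElementId
          WHERE n.TemplateId = ?) AS t
    WHERE t.rn > ?
    ORDER BY t.rn
""", (16, 1)).fetchall()

print(plain_join)  # [(10,), (11,), (12,)]
print(skip_one)    # [(11,), (12,)]
```

Note that the inner query still contains the INNER JOIN; the nesting only exists to filter on the row number.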

Related

Perform a SQL Query

I have this query in PHP MySQL PDO:
SELECT p.las_plano_id, p.mensalidade_diferenciada, v.las_tipos_planos_id, t.valor_mensalidade
FROM isw_planos AS p
INNER JOIN isw_planos_vinculos AS v
ON p.las_plano_id =
(SELECT v.las_plano_id
FROM isw_planos_vinculos
WHERE v.data_encerramento IS NULL
ORDER BY v.data_adesao
DESC LIMIT 1)
INNER JOIN isw_planos_tipos AS t
ON v.las_tipos_planos_id = t.id
WHERE p.ativo = 1
But the query takes a long time to return. Is it possible to rewrite it so that it executes faster?
Thanks.
I suspect the error is with v.:
This looks wrong: SELECT v.las_plano_id ... since v is outside the subquery. Please check the aliases used.
If removing v. does not help, please provide SHOW CREATE TABLE so we can see the indexes, etc.
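A small sketch of why the alias matters, using Python's sqlite3 module and a hypothetical one-table schema (vinculos, plano_id and data_adesao are stand-ins, not the real tables). When the subquery selects a column through the outer alias, it simply echoes the current outer row, so the comparison is always true; with its own alias it really looks up the latest row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for isw_planos_vinculos.
    CREATE TABLE vinculos (plano_id INTEGER, data_adesao TEXT);
    INSERT INTO vinculos VALUES
        (1, '2020-01-01'), (2, '2021-01-01'), (3, '2022-01-01');
""")

# Subquery refers to the OUTER alias v: it returns the outer row's own
# plano_id, so every outer row matches itself.
outer_alias = conn.execute("""
    SELECT v.plano_id
    FROM vinculos AS v
    WHERE v.plano_id = (SELECT v.plano_id FROM vinculos LIMIT 1)
    ORDER BY v.plano_id
""").fetchall()

# Subquery uses its own alias i: it returns the most recent plano_id,
# so only that one row matches.
inner_alias = conn.execute("""
    SELECT v.plano_id
    FROM vinculos AS v
    WHERE v.plano_id = (SELECT i.plano_id FROM vinculos AS i
                        ORDER BY i.data_adesao DESC LIMIT 1)
    ORDER BY v.plano_id
""").fetchall()

print(outer_alias)  # [(1,), (2,), (3,)]
print(inner_alias)  # [(3,)]
```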

Checking if table exists if not use another for a 'SELECT' statement

I am still very new to SQL. I am working on a system which uses Derby database in development and Oracle in production. I want to have an SQL Statement which works in both. Here is my code:
SELECT rma.crspdt AS bic_crspndt,
rma.issr AS bic_issr
FROM rma
WHERE (rma.tp = 'Issued' OR rma.tp = 'Received')
AND rma.rmasts = 'Enabled'
AND rma.svcnm = 'swift.fin') r
INNER JOIN (SELECT 1 ID FROM SYSIBM.SYSDUMMY1 UNION ALL
SELECT 2 ID FROM SYSIBM.SYSDUMMY1) dummy ON (dummy.id = 1 AND r.bic_crspndt IS NOT NULL)
OR (dummy.id = 2 AND r.bic_issr IS NOT NULL)
I am using the 'SYSIBM.SYSDUMMY1' table here. Oracle has an exact equivalent of 'SYSIBM.SYSDUMMY1' named 'DUAL'. The problem is that when I run my code in development (Derby) it works fine, but in production (Oracle) I get an error saying something like unknown table.
What I want is an IF-ELSE/CASE-WHEN or something similar in my code that checks at runtime whether the 'SYSIBM.SYSDUMMY1' table exists and, if it does not (as in Oracle), uses the 'DUAL' table instead. I am very new to SQL and would like some help in this matter.
You don't say which Oracle version you are using. In Oracle 12c there is the SQL Translation Framework.
With this example you could set up a translation such that SYSIBM.SYSDUMMY1 is translated to DUAL.
I've seen demonstrations but haven't used it personally. I suggest the Oracle docs (as usual) for information - https://docs.oracle.com/database/121/DRDAA/sql_transl_arch.htm#DRDAA131.
Can you not just create a DUAL table?
There are some problems in your code from an Oracle point of view that I can think of.
From the comments, I gather that you are not able to use DUAL. DUAL exists in Oracle, so try running select 1 from dual; if that doesn't work, your query is certainly not running in Oracle. Apart from that, there are a couple more problems with your query:
Using WHERE before INNER JOIN.
An extra closing parenthesis before r.
Based on the above, this query might work for you if you are running it in Oracle. Replace dual with sysibm.sysdummy1 if you are not using Oracle.
Note: You should use proper join syntax (INNER JOIN). I wasn't able to figure out the join condition, hence I am using a comma join.
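-- The table is aliased as r, so every column is referenced through r.
-- Column aliases (bic_crspndt, bic_issr) are not visible in WHERE,
-- so the filter uses the underlying columns instead.
SELECT r.crspdt AS bic_crspndt,
r.issr AS bic_issr
FROM rma r,
(SELECT 1 ID FROM dual UNION ALL
SELECT 2 ID FROM dual) dummy
WHERE ( (dummy.id = 1
AND r.crspdt IS NOT NULL)
OR (dummy.id = 2
AND r.issr IS NOT NULL)
)
AND (r.tp = 'Issued'
OR r.tp = 'Received')
AND r.rmasts = 'Enabled'
AND r.svcnm = 'swift.fin'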
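For anyone unsure what the two-row dummy table is doing, here is a rough sketch with Python's sqlite3 module. SQLite is an assumption made only for runnability; it allows SELECT without a FROM clause, so neither DUAL nor SYSIBM.SYSDUMMY1 is needed, and the rows in rma are invented. Each qualifying rma row is paired with id 1 and id 2, and a pairing survives only if the corresponding column is not null.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rma (crspdt TEXT, issr TEXT, tp TEXT, rmasts TEXT, svcnm TEXT);
    -- Invented sample rows.
    INSERT INTO rma VALUES
        ('AAA', 'BBB', 'Issued',   'Enabled',  'swift.fin'),
        ('CCC', NULL,  'Received', 'Enabled',  'swift.fin'),
        ('DDD', 'EEE', 'Issued',   'Disabled', 'swift.fin');
""")

rows = conn.execute("""
    SELECT r.crspdt, r.issr, dummy.id
    FROM rma AS r,
         (SELECT 1 AS id UNION ALL SELECT 2 AS id) AS dummy
    WHERE ((dummy.id = 1 AND r.crspdt IS NOT NULL)
        OR (dummy.id = 2 AND r.issr IS NOT NULL))
      AND (r.tp = 'Issued' OR r.tp = 'Received')
      AND r.rmasts = 'Enabled'
      AND r.svcnm = 'swift.fin'
    ORDER BY r.crspdt, dummy.id
""").fetchall()

# 'AAA' appears twice (both columns set), 'CCC' once (issr is NULL),
# and the disabled 'DDD' row is filtered out.
print(rows)  # [('AAA', 'BBB', 1), ('AAA', 'BBB', 2), ('CCC', None, 1)]
```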

MariaDB performance issue with "Where IN" clause

I have an issue with my SQL code. We developed an application which runs on MySQL, and there it runs fine. So I decided to give MariaDB a try and installed it on a dev machine. With one particular query, I have a performance issue I do not understand. The query is the following:
SELECT SAMPLES.*, UNIX_TIMESTAMP(SAMPLES.SAMPLE_DATE) as TIMESTAMP,RAWS.VALUE, DATAKEYS.RAW_ID, DATAKEYS.DATA_KEY_VALUE, DATAKEYS.DATA_KEY_ID, KEYDEF.KEY_NAME, KEYDEF.LDD_ID
FROM
PDS.TABLE_SAMPLES SAMPLES
RIGHT OUTER JOIN PDS.TABLE_RAW_VALUES RAWS ON SAMPLES.SAMPLE_ID = RAWS.SAMPLE_ID
RIGHT OUTER JOIN PDS.TABLE_SAMPLE_DATA_KEYS DATAKEYS ON(DATAKEYS.RAW_ID = RAWS.RAW_ID AND DATAKEYS.SAMPLE_ID = SAMPLES.SAMPLE_ID) OR
(DATAKEYS.RAW_ID = 0 AND DATAKEYS.SAMPLE_ID = SAMPLES.SAMPLE_ID)
RIGHT OUTER JOIN PDS.TABLE_DATA_KEY_DEFINITION KEYDEF ON(DATAKEYS.DATA_KEY_ID = KEYDEF.DATA_KEY_ID)
WHERE
SAMPLES.SAMPLE_ID IN(1991331,1991637,1991941,2046105,2046411,2046717,2047023,2047635,2047941,2048247)
AND (SAMPLES.PARAMETER_ID = 9)
GROUP BY DATAKEYS.DATA_KEY_ID, RAWS.RAW_ID, DATAKEYS.DATA_KEY_ID
ORDER BY SAMPLES.SAMPLE_ID, DATAKEYS.RAW_ID;
As long as I have only ONE value in the "WHERE IN" condition, the query takes ~10 ms to execute. That's about the same as MySQL 5.6 took.
As soon as I add another value there, the query time rises to several minutes. In MySQL it rises very slowly: the query shown above takes ~150 ms on MySQL and about 140 seconds on the new MariaDB installation, using exactly the same datasets.
I'm no SQL expert, can you give me some clues how to optimize the query to run as expected?
The right outer joins are being converted to inner joins by the where clause. So, just use the proper join type (I'm not sure if this affects the optimization of the query, but it could):
SELECT SAMPLES.*, UNIX_TIMESTAMP(SAMPLES.SAMPLE_DATE) as TIMESTAMP,RAWS.VALUE, DATAKEYS.RAW_ID, DATAKEYS.DATA_KEY_VALUE, DATAKEYS.DATA_KEY_ID, KEYDEF.KEY_NAME, KEYDEF.LDD_ID
FROM PDS.TABLE_SAMPLES SAMPLES JOIN
PDS.TABLE_RAW_VALUES RAWS
ON SAMPLES.SAMPLE_ID = RAWS.SAMPLE_ID JOIN
PDS.TABLE_SAMPLE_DATA_KEYS DATAKEYS
ON (DATAKEYS.RAW_ID = RAWS.RAW_ID AND DATAKEYS.SAMPLE_ID = SAMPLES.SAMPLE_ID) OR
(DATAKEYS.RAW_ID = 0 AND DATAKEYS.SAMPLE_ID = SAMPLES.SAMPLE_ID) JOIN
PDS.TABLE_DATA_KEY_DEFINITION KEYDEF
ON DATAKEYS.DATA_KEY_ID = KEYDEF.DATA_KEY_ID
WHERE SAMPLES.SAMPLE_ID IN (1991331, 1991637, 1991941, 2046105, 2046411, 2046717, 2047023, 2047635, 2047941, 2048247) AND
(SAMPLES.PARAMETER_ID = 9)
GROUP BY DATAKEYS.DATA_KEY_ID, RAWS.RAW_ID, DATAKEYS.DATA_KEY_ID
ORDER BY SAMPLES.SAMPLE_ID, DATAKEYS.RAW_ID;
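The point about the WHERE clause turning outer joins into inner joins can be checked with a small sketch in Python's sqlite3 module. The two tables and their rows are invented, and the join is written as a LEFT OUTER JOIN from raws (the mirror image of SAMPLES RIGHT OUTER JOIN RAWS) only because older SQLite versions lack RIGHT JOIN. A filter on a column of the optional side discards the NULL-extended rows, so both join types return the same result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Invented miniature versions of the RAWS and SAMPLES tables.
    CREATE TABLE raws (raw_id INTEGER, sample_id INTEGER);
    CREATE TABLE samples (sample_id INTEGER, parameter_id INTEGER);
    INSERT INTO raws VALUES (1, 100), (2, 200), (3, 999);
    INSERT INTO samples VALUES (100, 9), (200, 7);
""")

TEMPLATE = """
    SELECT r.raw_id
    FROM raws AS r
    {join} samples AS s ON s.sample_id = r.sample_id
    {where}
    ORDER BY r.raw_id
"""
FILTER = "WHERE s.parameter_id = 9"

def run(join, where):
    return conn.execute(TEMPLATE.format(join=join, where=where)).fetchall()

outer_unfiltered = run("LEFT OUTER JOIN", "")   # keeps raw 3 with NULL sample
outer_filtered = run("LEFT OUTER JOIN", FILTER) # NULL-extended rows fail the filter
inner_filtered = run("INNER JOIN", FILTER)

print(outer_unfiltered)  # [(1,), (2,), (3,)]
print(outer_filtered)    # [(1,)]
print(inner_filtered)    # [(1,)]
```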
Next, the best index for this query, regardless of the number of values in the IN list, is the composite index PDS.TABLE_SAMPLES(PARAMETER_ID, SAMPLE_ID). This handles the WHERE clause.
Because your query runs quickly under some circumstances, I assume the other tables have the appropriate indexes for the joins.
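As a rough illustration of the composite index, the sketch below creates it on a cut-down table and asks the engine for its plan. It uses Python's sqlite3 module, so the planner is SQLite's and not MariaDB's; the index name idx_samples_param_sample and the reduced column list are assumptions. On MariaDB the equivalent check would be running EXPLAIN on the real query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Cut-down, hypothetical version of PDS.TABLE_SAMPLES.
    CREATE TABLE table_samples (
        sample_id INTEGER,
        parameter_id INTEGER,
        sample_date TEXT
    );
    -- Equality column first, then the column used in the IN list.
    CREATE INDEX idx_samples_param_sample
        ON table_samples (parameter_id, sample_id);
""")

plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT *
    FROM table_samples
    WHERE sample_id IN (1991331, 1991637, 1991941)
      AND parameter_id = 9
""").fetchall()

# The last column of each plan row is the human-readable detail text.
plan_text = " ".join(str(row[-1]) for row in plan)
print(plan_text)
```

If the index is picked up, the printed plan mentions idx_samples_param_sample instead of a full table scan.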
Instead of the 'IN' operator, try using 'EXISTS' with a subquery in place of the literal sample_id values.

Optimizing MySQL query with nested select statements?

I've got read-only access to a MySQL database, and I need to loop through the following query about 9000 times, each time with a different $content_path_id. I'm calling this from within a Perl script that's pulling the $content_path_id values from a file.
SELECT an.uuid FROM alf_node an WHERE an.id IN
(SELECT anp.node_id FROM alf_node_properties anp WHERE anp.long_value IN
(SELECT acd.id FROM alf_content_data acd WHERE acd.content_url_id = $content_path_id));
Written this way, it's taking forever to do each query (approximately 1 minute each). I'd really rather not wait 9000+ minutes for this to complete if I don't have to. Is there some way to speed up this query? Maybe via a join? My current SQL skills are embarrassingly rusty...
This is an equivalent query using joins. How it performs depends on which indexes are defined on the tables.
If your Perl interface has the notion of prepared statements, you may be able to save some time by preparing once and executing with 9000 different binds.
You could also possibly save time by building one query with a big acd.content_url_id In ($content_path_id1, $content_path_id2, ...) clause
Select
an.uuid
From
alf_node an
Inner Join
alf_node_properties anp
On an.id = anp.node_id
Inner Join
alf_content_data acd
On anp.long_value = acd.id
Where
acd.content_url_id = $content_path_id
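Both suggestions (one parameterized statement reused with different binds, and one query with a big IN list) can be sketched as follows. This uses Python's sqlite3 module instead of Perl DBI and MySQL, and the rows and ids are invented, so treat it as an outline of the idea rather than a drop-in script.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE alf_node (id INTEGER PRIMARY KEY, uuid TEXT);
    CREATE TABLE alf_node_properties (node_id INTEGER, long_value INTEGER);
    CREATE TABLE alf_content_data (id INTEGER PRIMARY KEY, content_url_id INTEGER);
    -- Invented sample rows.
    INSERT INTO alf_node VALUES (1, 'uuid-1'), (2, 'uuid-2');
    INSERT INTO alf_node_properties VALUES (1, 10), (2, 20);
    INSERT INTO alf_content_data VALUES (10, 500), (20, 600);
""")

JOIN_SQL = """
    SELECT an.uuid
    FROM alf_node an
    INNER JOIN alf_node_properties anp ON an.id = anp.node_id
    INNER JOIN alf_content_data acd ON anp.long_value = acd.id
    WHERE acd.content_url_id {predicate}
    ORDER BY an.uuid
"""
content_path_ids = [500, 600, 700]  # would come from the file

# Option 1: same SQL text every time, only the bound value changes.
single_sql = JOIN_SQL.format(predicate="= ?")
one_at_a_time = []
for content_path_id in content_path_ids:
    one_at_a_time.extend(row[0] for row in conn.execute(single_sql, (content_path_id,)))

# Option 2: one round trip with an IN list of placeholders.
placeholders = ", ".join("?" for _ in content_path_ids)
batched_sql = JOIN_SQL.format(predicate=f"IN ({placeholders})")
batched = [row[0] for row in conn.execute(batched_sql, content_path_ids)]

print(one_at_a_time)  # ['uuid-1', 'uuid-2']
print(batched)        # ['uuid-1', 'uuid-2']
```

With 9000 ids it may be necessary to send the IN list in chunks, since drivers and servers typically cap the statement size or the number of placeholders.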
Try this extension to Laurence's solution which replaces the long list of OR's with an additional JOIN:
Select
an.uuid
From alf_node an
Join alf_node_properties anp
On an.id = anp.node_id
Join alf_content_data acd
On anp.long_value = acd.id
Join (
select "id1" as content_path_id union all
select "id2" as content_path_id union all
/* you get the idea */
select "idN" as content_path_id
) criteria
On acd.content_url_id = criteria.content_path_id
I have used SQL Server syntax above but you should be able to translate it readily.
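A runnable sketch of the derived-table variant, again with Python's sqlite3 module and invented data. The UNION ALL list is generated from the id list with bound parameters rather than typed out by hand; that generation step is my own addition for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE alf_node (id INTEGER PRIMARY KEY, uuid TEXT);
    CREATE TABLE alf_node_properties (node_id INTEGER, long_value INTEGER);
    CREATE TABLE alf_content_data (id INTEGER PRIMARY KEY, content_url_id INTEGER);
    -- Invented sample rows.
    INSERT INTO alf_node VALUES (1, 'uuid-1'), (2, 'uuid-2');
    INSERT INTO alf_node_properties VALUES (1, 10), (2, 20);
    INSERT INTO alf_content_data VALUES (10, 500), (20, 600);
""")

content_path_ids = [500, 600, 700]

# One "SELECT ? AS content_path_id" per id, glued together with UNION ALL.
criteria_sql = " UNION ALL ".join(
    "SELECT ? AS content_path_id" for _ in content_path_ids
)

sql = f"""
    SELECT an.uuid
    FROM alf_node an
    JOIN alf_node_properties anp ON an.id = anp.node_id
    JOIN alf_content_data acd ON anp.long_value = acd.id
    JOIN ({criteria_sql}) criteria
      ON acd.content_url_id = criteria.content_path_id
    ORDER BY an.uuid
"""
uuids = [row[0] for row in conn.execute(sql, content_path_ids)]

print(uuids)  # ['uuid-1', 'uuid-2']
```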

Count Query with join and union

Hi, I want to get the count of this LINQ query. I'm using Entity Framework with the repository pattern.
It is possible to get the result with queryUserWalls.ToList().Count(),
which I think is inefficient.
Can anybody help?
var queryUserWalls = (from participation in _eventParticipationRepository.GetAll()
join eve in _eventRepository.GetAll() on participation.EventId equals eve.Id
join userWall in _userWallRepository.GetAll() on participation.EventId equals userWall.EventId
where participation.UserId == userId
select userWall.Id)
.Union(from userWall in _userWallRepository.GetAll()
select userWall.Id);
Leave out the ToList because it forces query execution. You want to use Queryable.Count, not Enumerable.Count. Then, it will execute on the server.
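The same distinction can be shown outside Entity Framework. The sketch below uses Python's sqlite3 module with invented tables that loosely mirror the query (the events join is left out for brevity). Fetching every row and counting in the client corresponds to ToList().Count(), while wrapping the union in SELECT COUNT(*) is roughly what the provider sends when Count is applied to the IQueryable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Invented tables standing in for the repositories.
    CREATE TABLE event_participation (user_id INTEGER, event_id INTEGER);
    CREATE TABLE user_wall (id INTEGER PRIMARY KEY, event_id INTEGER);
    INSERT INTO event_participation VALUES (7, 1), (7, 2);
    INSERT INTO user_wall VALUES (100, 1), (101, 2), (102, 3);
""")

UNION_SQL = """
    SELECT w.id
    FROM event_participation p
    JOIN user_wall w ON w.event_id = p.event_id
    WHERE p.user_id = ?
    UNION
    SELECT w.id FROM user_wall w
"""

# Client-side count: every id travels to the application first.
client_side = len(conn.execute(UNION_SQL, (7,)).fetchall())

# Server-side count: only a single number comes back.
server_side = conn.execute(
    f"SELECT COUNT(*) FROM ({UNION_SQL}) AS q", (7,)
).fetchone()[0]

print(client_side, server_side)  # 3 3
```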