How can I make an intentionally slow LINQ to SQL query?

I need to make a slow LINQ to SQL query for testing purposes. Just want to see how our system reacts when there's a long-running query.
I have a few tables with up to 50,000 rows but my queries are taking less than a second. How can I make a query that runs for a long time?

Silly me, I forgot that the actual database access only happens when ToList() is called. I had been thinking that my query was still super-fast because I was only timing the line that sets up the query.
I was able to sufficiently slow down my query by simply duplicating the joins and where clauses, like this except five times longer:
var query =
    from aa in db.MyTable
    from aa1 in db.MyTable
    from aa2 in db.MyTable
    from aa3 in db.MyTable
    join bb in db.OtherTable on aa.PartNumber equals bb.PartNumber
    join bb1 in db.OtherTable on aa1.PartNumber equals bb1.PartNumber
    join bb2 in db.OtherTable on aa2.PartNumber equals bb2.PartNumber
    join bb3 in db.OtherTable on aa3.PartNumber equals bb3.PartNumber
    where
        aa.Column == "something" &&
        aa1.Column == "something" &&
        aa2.Column == "something" &&
        aa3.Column == "something"
    select new
    {
        PropertyOne = aa.ColumnOne,
        PropertyTwo = bb.ColumnTwo,
    };
var list = query.ToList(); // deferred execution: the database is only hit here

You can use joins and ORDER BY in your query. Joins and ORDER BY operations are expensive for SQL Server, so the query will take longer to return.
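For example, cross joining a table with itself multiplies the rows the server has to process, and an ORDER BY forces it to sort the whole intermediate result. A minimal sketch, reusing the hypothetical db.MyTable context from the question:
// Deliberately slow: a self cross join produces 50,000 x 50,000 candidate rows,
// and the orderby forces SQL Server to sort whatever survives the filter.
var slowQuery =
    from a in db.MyTable
    from b in db.MyTable
    where a.PartNumber != b.PartNumber
    orderby a.Column, b.Column descending
    select new { a.PartNumber, OtherColumn = b.Column };
var results = slowQuery.ToList(); // the long wait happens here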

Related

MariaDB performance issue with "Where IN" clause

I have an issue with my SQL code. We developed an application which runs on MySQL, and there it runs fine. So I decided to give MariaDB a try and installed it on a dev machine. On a certain query statement, I have a performance issue I do not understand. The query is the following:
SELECT SAMPLES.*, UNIX_TIMESTAMP(SAMPLES.SAMPLE_DATE) AS TIMESTAMP, RAWS.VALUE, DATAKEYS.RAW_ID, DATAKEYS.DATA_KEY_VALUE, DATAKEYS.DATA_KEY_ID, KEYDEF.KEY_NAME, KEYDEF.LDD_ID
FROM PDS.TABLE_SAMPLES SAMPLES
RIGHT OUTER JOIN PDS.TABLE_RAW_VALUES RAWS ON SAMPLES.SAMPLE_ID = RAWS.SAMPLE_ID
RIGHT OUTER JOIN PDS.TABLE_SAMPLE_DATA_KEYS DATAKEYS ON (DATAKEYS.RAW_ID = RAWS.RAW_ID AND DATAKEYS.SAMPLE_ID = SAMPLES.SAMPLE_ID)
    OR (DATAKEYS.RAW_ID = 0 AND DATAKEYS.SAMPLE_ID = SAMPLES.SAMPLE_ID)
RIGHT OUTER JOIN PDS.TABLE_DATA_KEY_DEFINITION KEYDEF ON (DATAKEYS.DATA_KEY_ID = KEYDEF.DATA_KEY_ID)
WHERE SAMPLES.SAMPLE_ID IN (1991331, 1991637, 1991941, 2046105, 2046411, 2046717, 2047023, 2047635, 2047941, 2048247)
    AND (SAMPLES.PARAMETER_ID = 9)
GROUP BY DATAKEYS.DATA_KEY_ID, RAWS.RAW_ID, DATAKEYS.DATA_KEY_ID
ORDER BY SAMPLES.SAMPLE_ID, DATAKEYS.RAW_ID;
As long as I have only ONE value in the WHERE ... IN condition, the query takes ~10ms to execute. That's about the same as MySQL 5.6 took.
As soon as I add another value, the query time rises to several minutes. In MySQL it rises only slightly: the query shown above takes ~150ms on MySQL and about 140 seconds on the new MariaDB installation using exactly the same datasets.
I'm no SQL expert; can you give me some clues on how to optimize the query to run as expected?
The right outer joins are being converted to inner joins by the WHERE clause anyway, so just use the proper join type (I'm not sure whether this affects the optimization of the query, but it could):
SELECT SAMPLES.*, UNIX_TIMESTAMP(SAMPLES.SAMPLE_DATE) AS TIMESTAMP, RAWS.VALUE, DATAKEYS.RAW_ID, DATAKEYS.DATA_KEY_VALUE, DATAKEYS.DATA_KEY_ID, KEYDEF.KEY_NAME, KEYDEF.LDD_ID
FROM PDS.TABLE_SAMPLES SAMPLES
JOIN PDS.TABLE_RAW_VALUES RAWS
    ON SAMPLES.SAMPLE_ID = RAWS.SAMPLE_ID
JOIN PDS.TABLE_SAMPLE_DATA_KEYS DATAKEYS
    ON (DATAKEYS.RAW_ID = RAWS.RAW_ID AND DATAKEYS.SAMPLE_ID = SAMPLES.SAMPLE_ID)
    OR (DATAKEYS.RAW_ID = 0 AND DATAKEYS.SAMPLE_ID = SAMPLES.SAMPLE_ID)
JOIN PDS.TABLE_DATA_KEY_DEFINITION KEYDEF
    ON DATAKEYS.DATA_KEY_ID = KEYDEF.DATA_KEY_ID
WHERE SAMPLES.SAMPLE_ID IN (1991331, 1991637, 1991941, 2046105, 2046411, 2046717, 2047023, 2047635, 2047941, 2048247)
    AND SAMPLES.PARAMETER_ID = 9
GROUP BY DATAKEYS.DATA_KEY_ID, RAWS.RAW_ID
ORDER BY SAMPLES.SAMPLE_ID, DATAKEYS.RAW_ID;
Next, the best index for this query, regardless of the number of values in the IN list, is the composite index PDS.TABLE_SAMPLES(PARAMETER_ID, SAMPLE_ID). This handles the WHERE clause.
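A sketch of creating that index (the index name is illustrative):
-- Index name is hypothetical; column order matters: the equality
-- condition (PARAMETER_ID) comes first, then the IN list (SAMPLE_ID).
CREATE INDEX IX_SAMPLES_PARAM_ID_SAMPLE_ID
    ON PDS.TABLE_SAMPLES (PARAMETER_ID, SAMPLE_ID);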
Because your query runs quickly under some circumstances, I assume the other tables have the appropriate indexes for the joins.
Instead of the IN operator, try using EXISTS with a subquery instead of the list of sample IDs.
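A minimal sketch of that rewrite, assuming the target IDs can be selected from some table (SELECTED_SAMPLES is a hypothetical name; the original literal list has no table to select from):
-- Hypothetical: SELECTED_SAMPLES stands in for wherever the IDs come from
SELECT SAMPLES.*
FROM PDS.TABLE_SAMPLES SAMPLES
WHERE SAMPLES.PARAMETER_ID = 9
    AND EXISTS (SELECT 1
                FROM SELECTED_SAMPLES SS
                WHERE SS.SAMPLE_ID = SAMPLES.SAMPLE_ID);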

Count Query with join and union

Hi, I want to get the count of this LINQ query. I'm using Entity Framework with the repository pattern.
It is possible to get the result with queryUserWalls.ToList().Count(), which I think is inefficient.
Can anybody help?
var queryUserWalls =
    (from participation in _eventParticipationRepository.GetAll()
     join eve in _eventRepository.GetAll() on participation.EventId equals eve.Id
     join userWall in _userWallRepository.GetAll() on participation.EventId equals userWall.EventId
     where participation.UserId == userId
     select userWall.Id)
    .Union(from userWall in _userWallRepository.GetAll()
           select userWall.Id);
Leave out the ToList(), because it forces query execution. You want Queryable.Count, not Enumerable.Count; then the count will execute on the server.
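A minimal sketch of the difference:
// Stays an IQueryable, so Queryable.Count translates to SELECT COUNT(*) on the server
var serverCount = queryUserWalls.Count();

// ToList() materializes every row in memory first; Enumerable.Count then counts client-side
var clientCount = queryUserWalls.ToList().Count();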

MySQL query stuck in "preparing" for too long

The following SQL has a preparing time of 30+ seconds. Is it the SQL that is wrong, or the fact that I have close to one million rows in the database? Can this SQL be optimized so that it doesn't sit in preparing for that long?
UPDATE url_source_wp SET hash = 'ASDF2'
WHERE url_source_wp.id NOT IN (
    SELECT url_done_wp.url_source_wp FROM url_done_wp WHERE url_done_wp.url_group = 4
)
AND hash IS NULL LIMIT 50
If preparation is your issue, you can precompile it into a stored procedure.
See: http://dev.mysql.com/doc/refman/5.0/en/stored-routines.html
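A minimal sketch of wrapping the statement in a procedure (the procedure name is hypothetical; the body is the original statement verbatim):
DELIMITER //
CREATE PROCEDURE mark_pending_urls()
BEGIN
    -- same statement as above, now compiled once per connection
    UPDATE url_source_wp SET hash = 'ASDF2'
    WHERE url_source_wp.id NOT IN (
        SELECT url_done_wp.url_source_wp FROM url_done_wp WHERE url_done_wp.url_group = 4
    )
    AND hash IS NULL LIMIT 50;
END //
DELIMITER ;

CALL mark_pending_urls();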
It seems like you could do this update more efficiently across a JOIN, avoiding the sub-select. Note that to preserve the NOT IN semantics it has to be an anti-join (a LEFT JOIN keeping only the non-matching rows), not an inner join:
UPDATE
    url_source_wp AS s
    LEFT JOIN url_done_wp AS d
        ON s.id = d.url_source_wp
        AND d.url_group = 4
SET
    s.hash = 'ASDF2'
WHERE
    s.hash IS NULL
    AND d.url_source_wp IS NULL
You need to make sure you have indexes on s.id, d.url_source_wp, s.hash, and d.url_group. Also, note that you can't use LIMIT with the multi-table UPDATE syntax, so if the LIMIT is important, this suggestion will likely not work for you.
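A sketch of those indexes (the names are illustrative, and id is presumably already the primary key):
-- Hypothetical index names; the composite index covers the join and the filter together
CREATE INDEX idx_done_source_group ON url_done_wp (url_source_wp, url_group);
CREATE INDEX idx_source_hash ON url_source_wp (hash);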

Long execution time when updating a table with a join in SQL Server 2008

I'm facing a big problem when trying to update a table containing stock data joined with a table containing product classification. The operation takes a long time to execute.
Table dw_giacenze (filtered on flag_nomatch equal to 'T', aliased a) is inner joined with dw_key_prod z on the ecat_key field.
a contains up to 3 million records, z about 150k records.
It takes more than 2 hours to execute.
Below is the update query I'm using.
update dw_giacenze
set cate_ecat_key = z.cate_ecat_key,
sottocat_ecat_key = z.sottocat_ecat_key,
marchio_key = z.marchio_key,
sottocat_bi_key = z.sottocat_bi_key,
gruppo_bi_key = z.gruppo_bi_key,
famiglia_bi_key = z.famiglia_bi_key,
flag_nomatch = NULL
from dw_giacenze a
inner join dw_key_prod z on
z.ecat_key = a.ecat_key
where
a.flag_nomatch = 'T';
Can anyone help me optimize it?
Thanks in advance!
Enrico
I would suggest focusing on a.flag_nomatch = 'T'.
A great way to get a really clear picture of what's going on is to use SQL Server Profiler. If it shows that your reads equal the number of rows in the table, then that's definitely an issue, and adding an index on flag_nomatch should help.
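A sketch of that index (the name is illustrative):
-- Hypothetical index name; lets SQL Server seek the flag_nomatch = 'T' rows
-- instead of scanning all 3 million
CREATE INDEX IX_dw_giacenze_flag_nomatch ON dw_giacenze (flag_nomatch);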
Alternatively, you could separate this out and update the columns individually (to start with):
UPDATE dw_giacenze
SET sottocat_ecat_key = (SELECT sottocat_ecat_key
                         FROM dw_key_prod
                         WHERE dw_key_prod.ecat_key = dw_giacenze.ecat_key)
WHERE dw_giacenze.flag_nomatch = 'T';
I did notice that the first column in your SET statement looks like the column you're joining on; if it is, you're setting it to the exact same value it already has, so you should be able to remove it anyway.

Which of these MySQL queries is more efficient, using LEFT JOIN or not

I have the following SQL query:
$select_query_1 = mysql_query("SELECT * FROM user_module_comments WHERE useid = '$hash' ORDER BY id DESC LIMIT 0, 25");
while ($table = mysql_fetch_array($select_query_1)) {
    $user_moid = $table['canvas'];
    $user_xtract_canvas = mysql_query("SELECT mcanvas FROM user_module WHERE uid = '$user_moid' LIMIT 1");
    $selected = mysql_fetch_array($user_xtract_canvas);
    $user_canvas_extract = $selected['mcanvas']; // this is what I need
}
Or this SQL query:
$select_query = "SELECT user_module_comments.useid, user_module.mcanvas FROM user_module_comments LEFT JOIN user_module ON user_module.uid = user_module_comments.useid WHERE useid = '$hash' ORDER BY user_module_comments.id DESC LIMIT 0, 25";
Which of these queries is more efficient?
Thanks.
The JOIN is likely to be far, far faster than doing related queries in a loop. In general it is almost always faster to do one query than to do n queries. I only say "almost always" because I'm sure someone can come up with a use case where the opposite may be true.
There is a lot of overhead involved with MySQL compiling the SQL statement over and over in the loop, executing it, and fetching a rowset. Using the single statement eliminates all of that overhead.
You should install Xdebug and actually profile these statements in PHP to find out how long they take to execute.
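If a full Xdebug profile is more than you need, a quick-and-dirty wall-clock comparison (a sketch, not a substitute for real profiling) can already show the difference:
// Rough timing sketch: wrap each approach and compare elapsed wall-clock time
$start = microtime(true);
// ... run the loop version (or the single JOIN version) here ...
$elapsed = microtime(true) - $start;
echo "Elapsed: " . round($elapsed * 1000, 2) . " ms\n";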