L2E: GroupBy always gets translated into Distincts, never Group By

How can I get the Linq-to-Entities provider to truly perform a GROUP BY? No matter what I do, it always generates SQL that is far slower than an actual GROUP BY. For example:
var foo = (from x in Context.AccountQuantities
           where x.AccountID == 27777
           group x by x.StartDate
           into groups
           select new
           {
               groups.FirstOrDefault().StartDate,
               Quantity = groups.Max(y => y.Quantity)
           }).ToList();
translates to:
SELECT
    1 AS [C1],
    [Project4].[C1] AS [C2],
    [Project4].[C2] AS [C3]
FROM ( SELECT
        [Project3].[C1] AS [C1],
        (SELECT
            MAX([Extent3].[Quantity]) AS [A1]
            FROM [dbo].[AccountQuantities] AS [Extent3]
            WHERE (27777 = [Extent3].[AccountID]) AND ([Project3].[StartDate] = [Extent3].[StartDate])) AS [C2]
    FROM ( SELECT
            [Distinct1].[StartDate] AS [StartDate],
            (SELECT TOP (1)
                [Extent2].[StartDate] AS [StartDate]
                FROM [dbo].[AccountQuantities] AS [Extent2]
                WHERE (27777 = [Extent2].[AccountID]) AND ([Distinct1].[StartDate] = [Extent2].[StartDate])) AS [C1]
        FROM ( SELECT DISTINCT
                [Extent1].[StartDate] AS [StartDate]
            FROM [dbo].[AccountQuantities] AS [Extent1]
            WHERE 27777 = [Extent1].[AccountID]
        ) AS [Distinct1]
    ) AS [Project3]
) AS [Project4]
How can I get it to execute this instead?
SELECT
AccountQuantities.StartDate,
MAX(AccountQuantities.Quantity)
FROM AccountQuantities
WHERE AccountID=27777
GROUP BY StartDate
Again, note the lack of ANY GROUP BY in what's executed. I'm fine with EF optimizing things non-ideally, but this is orders of magnitude slower, I cannot find any way to convince it to really do a GROUP BY, and it is a major problem for us!

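Use groups.Key instead of groups.FirstOrDefault().StartDate. The key already holds the grouped StartDate value, whereas FirstOrDefault() asks for a whole row from each group, which appears to be what pushes EF into the DISTINCT-plus-subqueries shape:
var foo = (from x in Context.AccountQuantities
           where x.AccountID == 27777
           group x by x.StartDate
           into groups
           select new
           {
               groups.Key, // the StartDate value each group was built on
               Quantity = groups.Max(y => y.Quantity)
           }).ToList();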

Related

Problems with query speed when using a nested query for item count

When I add the nested query for invCount, my query time goes from .03 sec to 14 sec. The query works and I get correct values, but it is very, very slow in comparison. Is that just because I have too many conditions in that query? When I take it out and still have the second nested query, the time is still .03 secs. There is clearly something about the first nested query the database doesn't like, but I am not seeing what it is. I have a foreign key set for all the inner join lines too. Any help or ideas would be appreciated.
SELECT a.*,
f.name,
f.partNumber,
f.showInAdminStore,
f.showInPublicStore,
f.productImage,
r.mastCatID,
(SELECT COUNT(b.inventoryID)
FROM storeInventory b
INNER JOIN events c ON c.eventID = b.eventID
WHERE b.pluID = a.pluID
AND b.listPrice = a.listPrice
AND b.unlimitedQty = a.unlimitedQty
AND (b.packageID = a.packageID OR (b.packageID IS NULL AND a.packageID IS NULL))
AND b.orderID IS NULL
AND c.isOpen = '1'
AND b.paymentTypeID <= '2'
AND (b.inCart < '$cartTime' OR b.inCart IS NULL) ) AS invCount,
(SELECT COUNT(x.inventoryID)
FROM storeInventory x
WHERE x.packageID = a.inventoryID) AS packageCount
FROM storeInventory a
INNER JOIN storePLUs f ON f.pluID = a.pluID
INNER JOIN storeCategories r ON r.catID = f.catID
INNER JOIN events d ON d.eventID = a.eventID
WHERE a.storeFrontID = '1'
AND a.orderID IS NULL
AND a.paymentTypeID <= '2'
AND d.isOpen = '1'
GROUP BY a.packageID, a.unlimitedQty, a.listPrice, a.pluID
Table from query output
UPDATE: 12/12/2022
I changed the line checking the packageID to "AND (b.packageID <=> a.packageID)" as suggested and that cut my query time down to 7.8 seconds from 14 seconds. Thanks for the pointer. I will definitely use that in the future for NULL comparisons.
Using "COUNT(*)" took about half a second off. When I take the first nested query out, it drops down to .05 seconds even with the other nested query in there, so I feel like there is still something causing issues. I tried running it without the "AND (b.inCart < '$cartTime' OR b.inCart IS NULL)" line and that did take about a second off, but nowhere near what I was hoping for. Is there an operator that includes NULL on a less-than comparison? I also tried running it without the inner join in the nested query and that didn't change much at all. Of course, removing any of that throws the values off and they become incorrect, so I can't run it that way.
Here is my current query setup that still pulls correct values.
SELECT a.*,
f.name,
f.partNumber,
f.showInAdminStore,
f.showInPublicStore,
f.productImage,
r.mastCatID,
(SELECT COUNT(*)
FROM storeInventory b
INNER JOIN events c ON c.eventID = b.eventID
WHERE b.pluID = a.pluID
AND b.listPrice = a.listPrice
AND b.unlimitedQty = a.unlimitedQty
AND (b.packageID <=> a.packageID)
AND b.orderID IS NULL
AND c.isOpen = '1'
AND b.paymentTypeID <= '2'
AND (b.inCart < '$cartTime' OR b.inCart IS NULL) ) AS invCount,
(SELECT COUNT(x.inventoryID)
FROM storeInventory x
WHERE x.packageID = a.inventoryID) AS packageCount
FROM storeInventory a
INNER JOIN storePLUs f ON f.pluID = a.pluID
INNER JOIN storeCategories r ON r.catID = f.catID
INNER JOIN events d ON d.eventID = a.eventID
WHERE a.storeFrontID = '1'
AND a.orderID IS NULL
AND a.paymentTypeID <= '2'
AND d.isOpen = '1'
GROUP BY a.packageID, a.unlimitedQty, a.listPrice, a.pluID
I am not familiar with the term 'composite indexes'. Is that something different from these?
Screenshot of ForeignKeys on Table a

I think
AND (b.packageID = a.packageID
OR (b.packageID IS NULL
AND a.packageID IS NULL)
)
can be simplified to ( https://dev.mysql.com/doc/refman/8.0/en/comparison-operators.html#operator_equal-to ):
AND ( b.packageID <=> a.packageID )
Use COUNT(*) instead of COUNT(x.inventoryID) unless you check for not-NULL.
The subquery to compute packageCount seems strange; you seem to count inventories but join on packages.
The need to reach into another table to check isOpen is part of the performance problem. If eventID is not the PRIMARY KEY for events, then add INDEX(eventID, isOpen).
Some other indexes that may help:
a: INDEX(storeFrontID, orderID, paymentTypeID)
a: INDEX(packageID, unlimitedQty, listPrice, pluID)
b: INDEX(pluID, listPrice, unlimitedQty, orderID)
f: INDEX(pluID, catID)
r: INDEX(catID, mastCatID)
x: INDEX(packageID, inventoryID)
After OP's Update
There is no way to optimize (x < y OR x IS NULL) except by switching to a UNION or, as here, to two separate counts. In your case, it is pretty easy to do the conversion. Replace
( SELECT COUNT(*) ... AND ( b.inCart < '$cartTime'
OR b.inCart IS NULL ) ) AS invCount,
with
( SELECT COUNT(*) ... AND b.inCart < '$cartTime' ) +
( SELECT COUNT(*) ... AND b.inCart IS NULL ) AS invCount,
Revised indexes:
storePLUs:
INDEX(pluID, catID)
storeCategories:
INDEX(catID, mastCatID)
events:
INDEX(isOpen, eventID)
storeInventory:
INDEX(pluID, listPrice, unlimitedQty, orderID, packageID)
INDEX(pluID, listPrice, unlimitedQty, orderID, inCart)
INDEX(packageID, inventoryID)
INDEX(storeFrontID, orderID, paymentTypeID)
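Here is a minimal sketch of the two rewrites above, in case it helps to convince yourself they return the same numbers. It runs against Python's built-in sqlite3 module rather than MySQL, purely so that it is self-contained; the table is a made-up, cut-down storeInventory with invented rows, and SQLite spells the NULL-safe comparison IS where MySQL uses <=>. It only checks correctness, not speed, which depends on MySQL's optimizer and the indexes listed above.

```python
import sqlite3

# Hypothetical, cut-down storeInventory; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE storeInventory ("
    "inventoryID INTEGER PRIMARY KEY, packageID INTEGER, inCart TEXT)"
)
conn.executemany(
    "INSERT INTO storeInventory VALUES (?, ?, ?)",
    [
        (1, None, None),
        (2, None, "2022-12-01 10:00:00"),
        (3, 5, "2022-12-20 10:00:00"),
        (4, 5, None),
        (5, 7, "2022-12-05 09:00:00"),
    ],
)
cart_time = "2022-12-12 00:00:00"  # stands in for '$cartTime'

# Original form: one count with an OR on the same column.
combined = conn.execute(
    "SELECT COUNT(*) FROM storeInventory WHERE inCart < ? OR inCart IS NULL",
    (cart_time,),
).fetchone()[0]

# Rewritten form: two counts added together. A NULL inCart never satisfies
# the < test, so no row is counted twice.
split = conn.execute(
    "SELECT (SELECT COUNT(*) FROM storeInventory WHERE inCart < ?)"
    " + (SELECT COUNT(*) FROM storeInventory WHERE inCart IS NULL)",
    (cart_time,),
).fetchone()[0]


def matching_pairs(condition):
    """Count (a, b) row pairs of the self-join that satisfy `condition`."""
    sql = (
        "SELECT COUNT(*) FROM storeInventory a "
        "JOIN storeInventory b ON " + condition
    )
    return conn.execute(sql).fetchone()[0]


# NULL-safe equality (SQLite: IS, MySQL: <=>) versus the longer OR form.
null_safe = matching_pairs("b.packageID IS a.packageID")
verbose = matching_pairs(
    "b.packageID = a.packageID "
    "OR (b.packageID IS NULL AND a.packageID IS NULL)"
)
# Plain = silently drops the pairs where both sides are NULL.
plain = matching_pairs("b.packageID = a.packageID")

print(combined, split, null_safe, verbose, plain)
```

The split is only safe because the two conditions cannot both be true for the same row; if they could overlap, adding the counts would double count.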

SQL Subquery return more than 1 row

I need to set "dph" in the table "STRObjednavka", but I don't know what's wrong. Please help :).
Here is my SQL script:
UPDATE STRObjednavka as o SET dph = (
SELECT dph FROM STRCena WHERE
menuKodCode =
(SELECT menuKodCode FROM STRMenu WHERE
id = o.menuId
)
AND
skupinaId =
(SELECT stravGroupId FROM grups1 WHERE
PKey =
(SELECT SGroup FROM users1 WHERE
PKey = o.userId
)))
WHERE o.price > 0 AND `date` > '2015-01-28 13:52:36' AND dph = 0;
SQL says: SQL error 1242: Subquery returns more than 1 row

You can update with the script below, but you need to check whether the update is correct. If you give some sample data, it will be easier to track down the problem.
UPDATE STRObjednavka as o SET dph = (
SELECT max(dph) FROM STRCena WHERE
menuKodCode =
(SELECT max(menuKodCode) FROM STRMenu WHERE
id = o.menuId
)
AND
skupinaId =
(SELECT max(stravGroupId) FROM grups1 WHERE
PKey =
(SELECT max(SGroup) FROM users1 WHERE
PKey = o.userId
)))
WHERE o.price > 0 AND `date` > '2015-01-28 13:52:36' AND dph = 0;

MySQL does not allow LIMIT inside IN/ALL/ANY/SOME subqueries, although LIMIT 1 is accepted in a scalar subquery like the ones here. Depending on your use case you can instead add MIN or MAX to your subqueries. Here it is with MINs in all the subqueries:
UPDATE STRObjednavka as o SET dph = (
SELECT MIN(dph) FROM STRCena WHERE
menuKodCode =
(SELECT MIN(menuKodCode) FROM STRMenu WHERE
id = o.menuId
)
AND
skupinaId =
(SELECT MIN(stravGroupId) FROM grups1 WHERE
PKey =
(SELECT MIN(SGroup) FROM users1 WHERE
PKey = o.userId
)))
WHERE o.price > 0 AND `date` > '2015-01-28 13:52:36' AND dph = 0;
Although you really only need to add it to the subquery that's returning more than one row.

Your first problem is that you're writing '.... = (SELECT .... )'. Since you're using the equality operator, you're asking SQL to compare a single value with an entire column of values. Change the equality operators before the subqueries in your WHERE clauses to IN operators. (The subquery in SET dph = (...) must still return a single row.)

You probably should use a different query pattern.
You have this sort of thing in your query, in several places.
WHERE menuKodCode = /* !! might generate error 1242 */
(SELECT menuKodCode FROM STRMenu WHERE id = o.menuId)
There's no guarantee that your inner query won't return more than one row, and when it does, MySQL throws error 1242.
SQL works with sets of values. If you used IN instead of =, your query would work.
WHERE ... menuKodCode IN
(SELECT menuKodCode FROM STRMenu WHERE id = o.menuId)
But you should figure out whether that logic is correct. If I were you I'd do a whole bunch of SELECT operations to test it before doing UPDATE.
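As one example of the kind of SELECT worth running first, here is a rough sketch that looks for lookup keys matching more than one row, which is what makes a scalar subquery fail with error 1242. It uses Python's built-in sqlite3 module with a made-up STRCena table and invented rows, so the column types and data are assumptions. Note that SQLite does not raise 1242 itself (it quietly takes the first row), so the sketch only demonstrates the diagnostic query, not the error.

```python
import sqlite3

# Hypothetical STRCena contents; column types and rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STRCena (menuKodCode TEXT, skupinaId INTEGER, dph REAL)"
)
conn.executemany(
    "INSERT INTO STRCena VALUES (?, ?, ?)",
    [
        ("A", 1, 15.0),
        ("A", 1, 21.0),  # second row for the same (menuKodCode, skupinaId)
        ("B", 1, 15.0),
        ("B", 2, 21.0),
    ],
)

# Every (menuKodCode, skupinaId) pair listed here would make
# "SET dph = (SELECT dph FROM STRCena WHERE ...)" return more than one row.
ambiguous = conn.execute(
    "SELECT menuKodCode, skupinaId, COUNT(*) AS n, "
    "       COUNT(DISTINCT dph) AS distinct_dph "
    "FROM STRCena "
    "GROUP BY menuKodCode, skupinaId "
    "HAVING COUNT(*) > 1"
).fetchall()

print(ambiguous)
```

If distinct_dph is 1 for every ambiguous pair, MIN or MAX gives the same answer either way; if it is larger, the data itself has to decide which dph is the right one. The same GROUP BY ... HAVING COUNT(*) > 1 check can be pointed at the other lookup tables in the query.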

SQL Query behavior

I'm bogged down trying to figure out why query a is returning different records than query b. Both queries seemingly have the same purpose, yet a returns 500 records and b returns 3500.
This is query a:
SELECT DISTINCT ODE.OrderBillToID
FROM APTIFY.dbo.vwVwOrderDetailsKGExtended ODE
WHERE ProductID IN (2022, 1393)
AND LTRIM(RTRIM(ODE.OrderStatus)) <> 'Cancelled'
AND LTRIM(RTRIM(ODE.OrderType)) <> 'Cancellation'
AND LTRIM(RTRIM(ODE.cancellationStatus)) <> 'FULLY CANCELLED'
UNION
SELECT DISTINCT ID
FROM APTIFY.dbo.vwPersons WHERE City = 'A'
UNION
SELECT DISTINCT RecordID
FROM APTIFY.dbo.vwTopicCodeLinks WHERE TopicCodeID = 16 AND Value = 'Yes, Please'
query b:
SELECT
APTIFY..vwPersons.ID
FROM
APTIFY..vwPersons
WHERE
( APTIFY..vwPersons.ID IN (
SELECT
vwMeetingRegistrants.ID
FROM
APTIFY.dbo.vwMeetings vwMeetings
INNER JOIN APTIFY.dbo.vwMeetingRegistrants vwMeetingRegistrants
ON vwMeetings.ID=vwMeetingRegistrants.ActualMeetingID WHERE
vwMeetings.ProductID = 2022
)
OR
APTIFY..vwPersons.ID IN (
SELECT
vwMeetingRegistrants.ID
FROM
APTIFY.dbo.vwMeetings vwMeetings
INNER JOIN APTIFY.dbo.vwMeetingRegistrants vwMeetingRegistrants
ON vwMeetings.ID=vwMeetingRegistrants.ActualMeetingID WHERE
vwMeetings.ProductID = 1393
)
OR
APTIFY..vwPersons.City = N'Albany' )
OR
((
APTIFY..vwPersons.ID IN (
SELECT
RecordID
FROM
APTIFY.dbo.vwTopicCodeLinks vwTopicCodeLinks
WHERE
vwTopicCodeLinks.TopicCodeID = 16
)
AND
APTIFY..vwPersons.ID IN (
SELECT
RecordID
FROM
APTIFY.dbo.vwTopicCodeLinks vwTopicCodeLinks
WHERE
vwTopicCodeLinks.Value = N'Yes, Please'
) )
)
vwMeetingRegistrants in query b produces the same records as vwVwOrderDetailsKGExtended in query a. I cannot see ANY difference between those queries, which perhaps shows my lack of understanding of the query behaviour.
Big thanks for any pointers, guys! :)

As it turned out, the incorrectly structured query was the result of a badly configured application, Aptify.

MySQL: how can I count number of articles by a join table

I have a table with news items and another table with media_types. I want to make one simple query that reads the media_types table and counts, for each record, how many news_items exist.
The result will be turned into a JSON response that I will use for a chart. This is my SQL statement:
SELECT
gc.country AS "country"
, COUNT(*) AS "online"
FROM default_news_items AS ni
JOIN default_news_item_country AS nic ON (nic.id = ni.country)
JOIN default_country AS c ON (nic.country = c.id)
JOIN default_geo_country AS gc ON (gc.id = c.geo_country)
LEFT JOIN default_medias AS m ON (m.id = ni.media)
WHERE TRUE
AND ni.deleted = 0
AND ni.date_item > '2013-10-23'
AND ni.date_item < '2013-10-29'
AND gc.country <> 'unknown'
AND m.media_type = '14'
GROUP BY gc.country
ORDER BY `online` desc LIMIT 10
This is the JSON response I create from the MySQL result:
[
{"country":"New Zealand","online":"7"},
{"country":"Switzerland","online":"1"}
]
How do I add print and social data to my output? I would like the JSON response to look like this:
[
{"country":"New Zealand","online":"7", "social":"17", "print":"2"},
{"country":"Switzerland","online":"1", "social":"7", "print":"1"}
]
Can I use COUNT(*) in the select statement to do something like this?
COUNT( * ) as online, COUNT( * ) as social, COUNT( * ) as print
Is it possible, or do I have to run several SQL statements to get the data I'm looking for?

This is the general structure:
SELECT default_geo_country.country as country,
SUM(default_medias.media_type = 14) as online,
SUM(default_medias.media_type = XX) as social,
SUM(default_medias.media_type = YY) as print
FROM ...
JOIN ...
WHERE ...
GROUP BY country

I think you want conditional aggregation. Your question, however, only shows the online media type.
Your query would be more readable by using table aliases and removing the back quotes. Also, if media_type is an integer, then you should not enclose the constant for comparison in single quotes -- I, for one, find it misleading to compare a string constant to an integer column.
I suspect this is the way you want to go. Where the . . . is, you want to fill in with the counts for the other media types.
SELECT gc.country AS country,
       SUM(dm.media_type = 14) AS online,
       SUM(dm.media_type = XX) AS social,
       SUM(dm.media_type = YY) AS print
       . . .
FROM default_news_items ni JOIN
     default_news_item_country nic
     ON nic.id = ni.country JOIN
     default_country dc
     ON nic.country = dc.id JOIN
     default_geo_country gc
     ON gc.id = dc.geo_country LEFT JOIN
     default_medias dm
     ON dm.id = ni.media
WHERE ni.deleted = 0
  AND ni.date_item > '2013-10-23'
  AND ni.date_item < '2013-10-29'
  AND gc.country <> 'unknown'
GROUP BY gc.country
ORDER BY online DESC
LIMIT 10
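For what it's worth, here is a small self-contained sketch of the conditional aggregation idea, run through Python's built-in sqlite3 module instead of MySQL. Everything apart from the SUM(condition) pattern is an assumption: the single flattened news table stands in for the five joined tables, the rows are invented, and the ids 15 and 16 for social and print are placeholders, since the question only gives 14 for online.

```python
import json
import sqlite3

# Hypothetical media type ids: only 14 (online) comes from the question.
ONLINE, SOCIAL, PRINT = 14, 15, 16

# One flattened table standing in for the joined tables; rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE news (country TEXT, media_type INTEGER)")
conn.executemany(
    "INSERT INTO news VALUES (?, ?)",
    [
        ("New Zealand", 14), ("New Zealand", 14),
        ("New Zealand", 15), ("New Zealand", 16),
        ("Switzerland", 14), ("Switzerland", 15), ("Switzerland", 15),
    ],
)

# Each comparison evaluates to 1 or 0 per row, so SUM counts the matches
# for that media type within each country group.
rows = conn.execute(
    "SELECT country, "
    "       SUM(media_type = ?) AS online, "
    "       SUM(media_type = ?) AS social, "
    "       SUM(media_type = ?) AS print "
    "FROM news "
    "GROUP BY country "
    "ORDER BY online DESC, country",
    (ONLINE, SOCIAL, PRINT),
).fetchall()

# Shape the rows for the chart's JSON response.
payload = [
    {"country": c, "online": o, "social": s, "print": p}
    for c, o, s, p in rows
]
print(json.dumps(payload))
```

One query returns all three counts per country, so there is no need for several statements. Keep the media type test out of the WHERE clause, otherwise the rows for the other types are filtered away before they can be counted.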

Linq to SQL: how to aggregate without a group by?

I am searching for the Linq-to-SQL equivalent to this query:
SELECT
[cnt]=COUNT(*),
[colB]=SUM(colB),
[colC]=SUM(colC),
[colD]=SUM(colD)
FROM myTable
This is an aggregate without a group by. I can't seem to find any way to do this, short of issuing four separate queries (one Count and three Sum). Any ideas?

This is what I found. It seems you still have to do a group by, but you can just group on a constant:
var orderTotals =
from ord in dc.Orders
group ord by 1 into og
select new
{
prop1 = og.Sum(item=> item.Col1),
prop2 = og.Sum(item => item.Col2),
prop3 = og.Count(item => item.Col3)
};
This produces the following SQL, which is not optimal, but works:
SELECT SUM([Col1]) as [prop1], SUM([Col2]) as [prop2], COUNT(*) AS [prop3]
FROM (
SELECT 1 AS [value], [t0].[Col1], [t0].[Col2], [t0].[Col3]
FROM [table] AS [t0]
) AS [t1]
GROUP BY [t1].[value]

You can write the same query with a lambda expression (method syntax) as follows:
var orderTotals = db.Orders
.GroupBy( i => 1)
.Select( g => new
{
cnt = g.Count(),
ScolB = g.Sum(item => item.ColB),
ScolC = g.Sum(item => item.ColC)
});
