Possible multiple enumeration of IEnumerable when counting and skipping - linq-to-sql

I'm preparing data for a datatable in Linq2Sql
This code highlights as a 'Possible multiple enumeration of IEnumerable' (in Resharper)
// filtered is an IEnumerable or an IQueryable
var total = filtered.Count();
var displayed = filtered
.Skip(param.iDisplayStart)
.Take(param.iDisplayLength).ToList();
And I am 100% sure Resharper is right.
How do I rewrite this to avoid the warning
To clarify, I get that I can put a ToList on the end of filtered to only do one query to the Database eg.
var filteredAndRun = filtered.ToList();
var total = filteredAndRun.Count();
var displayed = filteredAndRun
.Skip(param.iDisplayStart)
.Take(param.iDisplayLength).ToList();
but this brings back a ton more data than I want to transport over the network.
I'm expecting that I can't have my cake and eat it too. :(

It sounds like you're more concerned with multiple enumeration of IQueryable<T> rather than IEnumerable<T>.
However, in your case, it doesn't matter.
The Count call should translate to a simple and very fast SQL count query. It's only the second query that actually brings back any records.
If it is an IEnumerable<T> then the data is in memory and it'll be super fast in any case.
I'd keep your code exactly the same as it is and only worry about performance tuning when you discover you have a significant performance issue. :-)

You could also do something like
count = 0;
displayed = new List();
iDisplayStop = param.iDisplayStart + param.iDisplayLength;
foreach (element in filteredAndRun) {
++count;
if ((count < param.iDisplayStart) || (count > iDisplayStop))
continue;
displayed.Add(element);
}
That's pseudocode, obviously, and I might be off-by-one in the edge conditions, but that algorithm gets you the count with only a single iteration and you have the list of displayed items only at the end.

Related

Django: ORM/SQL query speed significantly decreased after adding additional BooleanField or (SQL tinyint) to Django Filter

Using MySQL Latest Django:
I have a vaguely complex Django query that works quite quickly--until I add an additional "AND" with a Boolean Field--
See Below:
queriedForms = queryFormtype.form_set.filter(is_public=True)
newQuery = queriedForms.filter(formrecordattributevalue__record_value__icontains=term['TVAL'], formrecordattributevalue__record_attribute_type__pk=rtypePK)
newQuery = newQuery.filter(flagged_for_deletion=False)
logger.info(newQuery.query)
term['count'] = newQuery.count()
If I either remove the initial "is_public=True" or the final "flagged_for_deletion=False)--it works incredibly fast. If I use both as filters, it increases the time for the count() function by something like 2000%
The different QuerySet.query outputs are below:
SELECT `maqluengine_form`.`id`, `maqluengine_form`.`form_name`, `maqluengine_form`.`form_number`, `maqluengine_form`.`form_geojson_string`, `maqluengine_form`.`hierarchy_parent_id`, `maqluengine_form`.`is_public`, `maqluengine_form`.`project_id`, `maqluengine_form`.`date_created`, `maqluengine_form`.`created_by_id`, `maqluengine_form`.`date_last_modified`, `maqluengine_form`.`modified_by_id`, `maqluengine_form`.`sort_index`, `maqluengine_form`.`form_type_id`, `maqluengine_form`.`flagged_for_deletion` FROM `maqluengine_form` INNER JOIN `maqluengine_formrecordattributevalue` ON (`maqluengine_form`.`id` = `maqluengine_formrecordattributevalue`.`form_parent_id`) WHERE (`maqluengine_form`.`form_type_id` = 319 AND `maqluengine_form`.`is_public` = True AND `maqluengine_formrecordattributevalue`.`record_value` LIKE %seal% AND `maqluengine_formrecordattributevalue`.`record_attribute_type_id` = 18510 AND `maqluengine_form`.`flagged_for_deletion` = False)
SELECT `maqluengine_form`.`id`, `maqluengine_form`.`form_name`, `maqluengine_form`.`form_number`, `maqluengine_form`.`form_geojson_string`, `maqluengine_form`.`hierarchy_parent_id`, `maqluengine_form`.`is_public`, `maqluengine_form`.`project_id`, `maqluengine_form`.`date_created`, `maqluengine_form`.`created_by_id`, `maqluengine_form`.`date_last_modified`, `maqluengine_form`.`modified_by_id`, `maqluengine_form`.`sort_index`, `maqluengine_form`.`form_type_id`, `maqluengine_form`.`flagged_for_deletion` FROM `maqluengine_form` INNER JOIN `maqluengine_formrecordattributevalue` ON (`maqluengine_form`.`id` = `maqluengine_formrecordattributevalue`.`form_parent_id`) WHERE (`maqluengine_form`.`form_type_id` = 319 AND `maqluengine_form`.`is_public` = True AND `maqluengine_formrecordattributevalue`.`record_value` LIKE %seal% AND `maqluengine_formrecordattributevalue`.`record_attribute_type_id` = 18510)
The first takes about 20/30 seconds to perform the count(), while the second with only 1 of the two BooleanField's takes less than a second to perform the count()
=======================================
EDIT=======================
Apologies: since the question isn't obvious enough--why is adding an additional AND with a BooleanField increasing the query time by +2000%? Is anyone able to assist in figuring out WHY that's occurring. Thanks.
EDIT=========================
Also discovered that using a exclude(is_public=False) rather than filter(is_public=True) has the same effect as the solution below. Does anyone happen to know why an exclude() works fine--whereas the filter() does not?
==============================
Solution I came up with after a night's rest:
--I keep the query as is(I need it for later because it continues getting chain filtered)
--I need the count() from this stage--which is taking substantially longer than it should with the additional BooleanField AND
--I take a temporary values list to perform a len() on instead:
queriedForms = queryFormtype.form_set.all()
newQuery = queriedForms.filter(formrecordattributevalue__record_value__icontains=term['TVAL'], formrecordattributevalue__record_attribute_type__pk=rtypePK)
newQuery = newQuery.filter(flagged_for_deletion=False)
tempQuery = newQuery.values_list('is_public',flat=True)
finalQuery = [entry for entry in tempQuery if entry != 'False'] #Remove any indices that contain "False"
term['count'] = len(finalQuery)
The following counts that use chained filters after use the same technique--it's significantly faster--if not as fast as removing one of the Booleans from the filters.

How to count the number of rows after filtering an advanceddatagrid?

I am trying to count the number of rows in an advanceddatagrid
I need a function which could count all item with or without filterFunction.
I tried some solution but none works. The best that I found, is to expand all items and use a cursor for looping.
But, when we have a lot of data, expanding all is not a good solution.
Do you have an idea on how to do that ?
Thank you
The only way I came up was investigating dataProvider
// current not expanded data row lenght
grid.dataProvider.lenght;
// expanded length
// I assume you use xml as your data provider
// then you can count it like this
xmlListTotalSize(new XMLList(grid.dataProvider.source.source));
// or with casts
xmlListTotalSize(new XMLList((IHierarchicalCollectionView(view.grid.dataProvider).source as HierarchicalData).source));
and xmllist traversal function might look something like this:
private static function xmlListTotalSize(x:XMLList):int
{
var i:int = x.length();
for each(var xChild:XML in x.children())
i += xmlListTotalSize(xChild.children());
return i;
}

Custom sorting function bottleneck

I am trying to sort big array using actionscript 3.
The problem is that i have to use custom sorting function which is painfully slow and leads to flash plugin crash.
Below is a sample code for custom function used to sort array by length of its members:
private function sortByLength():int {
var x:int = arguments[0].length;
var y:int = arguments[1].length;
if (x > y){
return 1;
}else if (x < y){
return -1;
}else{
return 0;
}
}
Which is called like this:
var txt:Array = ["abcde","ab","abc","a"];
txt.sort(sortByLength);
Please advise me how can this be done faster ?
How to change application logic to avoid Flash plugin crashes during sorting ?
try to use strong typing whenever possible, here tell your function that you are waiting two strings.
you could rewrite your function in two way one fastest than the other if you know that all your element are not null:
function sortByLength(a:String, b:String):int {
return a.length-b.length // fastest way not comparison
}
and if you can have null check for it (this one will put null in front of all element):
function sortByLengthWithNull(a:String, b:String):int {
if (a==null) return -1
if (b==null) return 1
return a.length-b.length
}
If you need super-fast sorting, then it might be worthwhile not using an array at all and instead using a linked-list. There are different advantages to each. Primarily, with a linked-list, index-access is slow, while iterating through the list is fast, and linked-lists are not native to AS3 so you'll have to roll your own.
On the upside, you may well be able to use some of Polygonal Labs' code: http://lab.polygonal.de/as3ds/.
Sorting is very, very fast for nearly-sorted data with a linked list, as this article discusses: http://lab.polygonal.de/2007/11/26/data-structures-more-on-linked-lists/.
This solution gives you lots more work, but will eventually give you lots more sort-speed too.
Hope this helps.
-- additional --
I noticed your question in the comments of another answer about "One question however is unanswered - how to perform greedy computations in Flash without hanging it?"
For this, essentially the answer is to break your computation over multiple frames, something like this:
public function sort():void
{
addEventListener(Event.ENTER_FRAME, iterateSort);
}
private function iterateSort():void
{
var time:int = getTimer() + TARGET_MILLISECONDS_PER_FRAME;
var isFinished:Boolean = false;
while (!isFinished && getTimer() < time)
isFinished = continueSort();
if (isFinished)
removeEventListener(Event.ENTER_FRAME, iterateSort);
}
function continueSort():Boolean
{
... implement an 'atom of sort' here, whatever that means ...
}
sortByLength should have two parameters, shouldn't it? I guess that's what you mean by the arguments array...
This looks fine to me, unless arguments is not a local variable, but instead a member variable, and you're just looking at its [0] and [1] elements on each function call. That would at least produce undesired results.

How to sort var length ids (composite string + numeric)?

I have a MySQL database whose keys are of this type:
A_10
A_10A
A_10B
A_101
QAb801
QAc5
QAc25
QAd2993
I would like them to sort first by the alpha portion, then by the numeric portion, just like above. I would like this to be the default sorting of this column.
1) how can I sort as specified above, i.e. write a MySQL function?
2) how can I set this column to use the sorting routine by default?
some constraints that might be helpful: the numeric portion of my ID's never exceeds 100,000. I use this fact in some javascript code to convert my ID's to strings concatenating the non-numeric portion with the (number + 1,000,000). (At the time I had not noticed the variations/subparts as above such as A_10A, A_10B, so I'll have to revamp that part of my code.)
The best way to achieve what you want is to store each part in its own column, and I would strongly recommend to change table structure. If it's impossible, you can try the following:
Create 3 UDFs which returns prefix, numeric part, and postfix of your string. For a better performance they should be native (Mysql, as any other RDMS, is not really good in complex string parsing). Then you can call these functions in ORDER BY clause or in trigger body which validates your column. In any case, it will work slower than if you create 3 columns.
No simple answer that I know of. I had something similar a while back but had to use jQuery to sort it. So what I did was first get the output into an javascript array. Then you may want to insert a zero padding to your numbers. Separate the Alpha from Nummerics using a regex, then reassemble the array:
var zarr = new Array();
for(var i=0; i<val.length; i++){
var chunk = val[i].match(/(\d+|[^\d]+)/g).join(',');
var chunks = chunk.split(",");
for(var s=0; s<chunks.length; s++){
if(isNaN(chunks[s]) == true)
zarr.push(chunks[s]);
else
zarr.push(zeroPad(chunks[s], 5));
}
}
function zeroPad(num,count){
var numZeropad = num + '';
while(numZeropad.length < count) {
numZeropad = "0" + numZeropad;
}
return numZeropad;
}
You'll end up with an array like this:
A_00100
QAb00801
QAc00005
QAc00025
QAd02993
Then you can do a natural sort. I know you may want to do it through straight MySQL but I am not to sure if it does natural sorting.
Good luck!

What's the most efficient way to do these two foreach-loops?

i'm not sure this is the best way to do the following code. I'm not sold on a foreach inside another foreach. Could this be done *better** with Linq?
*I understand that better could be either
a) more performant
b) easier to read / more elegant
c) all of the above
NOTE: .NET 3.5 solutions accepted :)
NOTE2: the two IList's were the results of a multi-recordset stored procedure, via Linq2Sql.
here's the make believe code:
// These two lists are the results from a IMultipleResults Linq2Sql stored procedure.
IList<Car> carList = results.GetResult<Car>().ToList();
IList<Person> people = results.GetResult<Person>().ToList();
// Associate which people own which cars.
foreach(var person in people)
{
var cars = (from c in cars
where c.CarId == person.CarId
select c).ToList();
foreach (var car in cars)
{
car.Person = person;
}
}
Cheers :)
I don't think performance would be any different but if you're looking for terseness:
var q = from person in people
join car in cars on person.CarId equals car.CarId
select new { car, person };
foreach(var o in q)
{
o.car.Person = o.person;
}
Edit: After Jon's hint on this version being faster I got curious and profiled both functions. This version seems twice as fast which is amazing. I checked the disassembly. The overhead of the original implementation seems to come from new enumerators getting created for both outer and inner loop, causing P times new/dispose overhead.
For this code only one Enumerator is created which I think is the magic of "Join" function. I didn't examine how it works though.