I am using LINQ to SQL as my DAL, and during unit testing I found out that my objects are not being returned from the database but from the DataContext cache.
The strange thing is this: if the objects are returned from the cache, why does it still require a separate call to the database to fetch all the fields?
Anyway, I implemented a ClearCache method that will clear out the cache. But I am only clearing the cache in the unit test and not in the API code.
The reason is that once an object has been inserted, it is better to load it from the cache than to fetch it again from the database.
What do you think?
UPDATE:
public static void ClearCache(this EStudyModelDataContext context)
{
    const BindingFlags Flags = BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;
    var method = context.GetType().GetMethod("ClearCache", Flags);
    method.Invoke(context, null);
}
In my case it worked only with these four binding flags:
DATACONTEXT.GetType().InvokeMember(
    "ClearCache",
    BindingFlags.Instance |
    BindingFlags.Public |
    BindingFlags.NonPublic |
    BindingFlags.InvokeMethod,
    null, DATACONTEXT, null);
You are correct that L2S will return items from the cache when it can; I learned about this the hard way. A clean way to handle this is to call context.Refresh on the entity after each insert and update operation. This refreshes the cached copy and guarantees the cache is current.
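For illustration, a minimal sketch of that refresh pattern (the Order entity and its properties are made up; Refresh and RefreshMode come from System.Data.Linq):

var order = new Order { Status = 1 };
dataContext.Orders.InsertOnSubmit(order);
dataContext.SubmitChanges();

// Re-read the row so the cached copy picks up anything the server changed
// (defaults, triggers, computed columns).
dataContext.Refresh(RefreshMode.OverwriteCurrentValues, order);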
There is one big caveat to manually calling 'ClearCache': you are doing exactly that. If you had any pending database operations, you have just flushed them, so you have to be very careful that you haven't trampled on other pending "on submit" changes. I learned this the hard way and had to add a few more SubmitChanges calls to my code as a direct result of calling ClearCache after some stored procedure executions (bulk deletes).
I just started to work with OData, and I was under the impression that OData querying is quite flexible.
In some cases, though, I want to retrieve updated or newly calculated data on the fly; in my case this is the SalaryData values. At some point I want them slightly tweaked by an additional calculation function, and the critical point is that this must happen when the data is retrieved, as part of the general request query.
But I don't know whether using a function is appropriate in this case.
Ideally, I want a request similar to this:
/odata/Employee(1111)?$expand=SalaryData/CalculculationFunction(40)
Here I want to apply CalculculationFunction with parameters on SalaryData.
Is it possible to do this in OData? Or should I create an entity set for the salary data and retrieve the calculated data directly with a query like
/odata/SalaryData(1111)/CalculculationFunction(40)
But this approach is the least preferable for me, because I don't want to use the id of SalaryData in the request.
Current example of the function I created:
[EnableQuery(MaxExpansionDepth = 10, MaxAnyAllExpressionDepth = 10)]
[HttpGet]
[ODataRoute("({key})/FloatingWindow(days={days})")]
public SingleResult<Models.SalaryData> MovingWindow([FromODataUri] Guid key, [FromODataUri] int days)
{
    if (days <= 0)
        return new SingleResult<Models.SalaryData>(Array.Empty<Models.SalaryData>().AsQueryable());

    var cachedSalaryData = GetAllowedSalaryData().FirstOrDefault(x => x.Id.Equals(key));
    var mappedSalaryData = mapper.Map<Models.SalaryData>(cachedSalaryData);
    mappedSalaryData = Models.SalaryData.FloatingWindowAggregation(days, mappedSalaryData);
    var salaryDataResult = new[] { mappedSalaryData };
    return new SingleResult<Models.SalaryData>(salaryDataResult.AsQueryable());
}
There is always an overlap between what is OData-compliant routing and what you can do with routes in Web API. It is not always necessary to conform to the OData (v4) specification, but a non-conforming route will need custom logic on the client as well.
The common workaround for this type of request is to create a Function endpoint bound to the Employee entity that accepts the parameter input used to materialize the data. The URL might look like this instead:
/odata/Employee(1111)/WithCalculatedSalary(40)?$expand=SalaryData
This method could then internally call the existing MovingWindow function from the SalaryDataController to build the results; you could also engineer both functions to call a common set-based routine.
The reason you should bind this function to the EmployeeController is that the Employee is the primary identifying resource that correlates the resulting data.
In this way, OData v4-compliant clients would still be able to execute this function and, importantly, to discover it without any need for customisations.
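A rough sketch of what that bound function could look like (the WithCalculatedSalary name follows the URL above; GetEmployee and CalculateSalaryData are assumed helpers, and the routing and model-builder details would need adapting to your setup):

// EmployeeController - hypothetical bound function returning the Employee
// whose SalaryData has been run through the calculation.
[EnableQuery(MaxExpansionDepth = 10)]
[HttpGet]
[ODataRoute("({key})/WithCalculatedSalary(days={days})")]
public SingleResult<Models.Employee> WithCalculatedSalary([FromODataUri] int key, [FromODataUri] int days)
{
    var employee = GetEmployee(key);                              // assumed data-access helper
    employee.SalaryData = CalculateSalaryData(employee, days);    // assumed shared calculation routine
    return new SingleResult<Models.Employee>(new[] { employee }.AsQueryable());
}

// Model builder registration for the bound function.
builder.EntitySet<Employee>("Employee")
       .EntityType
       .Function("WithCalculatedSalary")
       .ReturnsFromEntitySet<Employee>("Employee")
       .Parameter<int>("days");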
If you didn't need to return the Employee resource as part of the response then you could still serve a collection of SalaryData from the EmployeeController:
/odata/Employee(1111)/CalculatedSalary(days=40)
[EnableQuery(MaxExpansionDepth = 10, MaxAnyAllExpressionDepth = 10)]
[HttpGet]
[ODataRoute("({key})/FloatingWindow(days={days})")]
public IQueryable<Models.SalaryData> CalculatedSalary([FromODataUri] int key, [FromODataUri] int days)
{
...
}
builder.EntitySet<Employee>("Employee")
       .EntityType
       .Function("CalculatedSalary")
       .ReturnsCollectionFromEntitySet<SalaryData>("SalaryData")
       .Parameter<int>("days");
$compute and $search in ASP.NET Core OData 8
The OData v4.01 specification does support the $compute system query option, which was designed to let clients append computed values to the response structure. You could hijack this pipeline and define your own function to be executed from a $compute clause, but the expectation is that the system's canonical functions are used with a combination of literal values and field references.
The ASP.NET implementation has only introduced support for this in the OData Lib v8 runtime; I have not yet found a good example of how to implement custom functions, but syntactically it is feasible.
The same concept could be used to augment the $apply execution: if this calculation operates over a collection and effectively performs an aggregate evaluation, then $apply may be the more appropriate option.
It might be that your current CalculculationFunction can be translated directly into a $compute statement. Otherwise, if you promote some of the calculation steps (metadata) to columns in the schema (you might use SQL computed columns for this...), then $compute could be a viable option.
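To make that concrete, a $compute request could look something like this (BaseAmount, AdjustedAmount, and PeriodStart are invented property names, and the service must actually support $compute):

/odata/Employee(1111)/SalaryData?$compute=BaseAmount mul 1.4 as AdjustedAmount&$select=PeriodStart,AdjustedAmount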
I use Couchbase Java SDK 2.2.6 with Couchbase server 4.1.
I query my view with the following code
public <T> List<T> findDocuments(ViewQuery query, String bucketAlias, Class<T> clazz) {
    // We specifically set reduce false and include docs to retrieve docs
    query.reduce(false).includeDocs();
    log.debug("Find all documents, query = {}", decode(query));
    return getBucket(bucketAlias)
            .query(query)
            .allRows()
            .stream()
            .map(row -> fromJsonDocument(row.document(), clazz))
            .collect(Collectors.toList());
}

private static <A> A fromJsonDocument(JsonDocument saved, Class<A> clazz) {
    log.debug("Retrieved json document -> {}", saved);
    A object = fromJson(saved.content(), clazz);
    return object;
}
In the logs from the fromJsonDocument method I see that rows are not always sorted by the row key. Usually they are, but sometimes they are not.
If I just run this query in the Couchbase browser GUI, I always receive results in the expected order. Is it a bug, or is it expected that view query results are not sorted when queried with the async client?
What is the behaviour in other clients (not Java)?
This is due to the asynchronous nature of your call in the Java client, plus the fact that you used includeDocs.
What includeDocs does is weave in a call to get for each document id received from the view. So when you look at the asynchronous sequence of AsyncViewRow with includeDocs, you're actually looking at a composition of a row returned by the view and an asynchronous retrieval of the whole document.
If a document retrieval has a little bit of latency compared to the one for the previous row, it could reorder the (row+doc) emission.
But good news everyone! There is an includeDocsOrdered alternative in the ViewQuery that takes exactly the same parameters as includeDocs but will ensure that the AsyncViewRow come in the same order returned by the view.
This is done by eagerly triggering the get retrievals but then buffering those that arrive out of order, so as to maintain the original order without sacrificing too much performance.
That is quite specific to the Java client, with its usage of RxJava. I'm not even sure other clients have the notion of includeDocs...
I have a Windows Store application which manages a collection of objects and stores them in the application's local folder. The objects are serialized on the file system using JSON. As I need to be able to edit and persist those items individually, I opted for individual files for each object instead of one large file. Objects are stored following this pattern:
Local Folder
|
--- db
|
--- AB283376-7057-46B4-8B91-C32E663EC964
| |
| --- AB283376-7057-46B4-8B91-C32E663EC964.json
| --- AB283376-7057-46B4-8B91-C32E663EC964.jpg
|
--- B506EFC5-E853-45E6-BA32-64193BB49ACD
| |
| --- B506EFC5-E853-45E6-BA32-64193BB49ACD.json
| --- B506EFC5-E853-45E6-BA32-64193BB49ACD.jpg
|
...
Each object has its own folder node, which contains the JSON-serialized object and any other associated resources.
Everything was fine when I ran some write, read, and delete tests. Where it got complicated is when I tried to load large collections of objects at application startup. I estimated the largest number of items one would store at 10,000, so I wrote 10,000 entries and then tried to load them... it took the application more than 3 minutes to complete the operation, which of course is unacceptable.
So my questions are: what could be optimized in the code I wrote for reading and deserializing objects (below)? Is there a way to implement a paging system so loading would be dynamic in my WinRT application? Is my storage method (the pattern above) too heavy in terms of IO/CPU? Am I missing something in WinRT?
public async Task<IEnumerable<Release>> GetReleases()
{
    List<Release> items = new List<Release>();
    var dbFolder = await ApplicationData.Current.LocalFolder.CreateFolderAsync(dbName, CreationCollisionOption.OpenIfExists);

    foreach (var releaseFolder in await dbFolder.GetFoldersAsync())
    {
        var releaseFile = await releaseFolder.GetFileAsync(releaseFolder.DisplayName + ".json");
        var stream = await releaseFile.OpenAsync(FileAccessMode.Read);
        using (var inStream = stream.GetInputStreamAt(0))
        {
            DataContractJsonSerializer serializer = new DataContractJsonSerializer(typeof(Release));
            Release release = (Release)serializer.ReadObject(inStream.AsStreamForRead());
            items.Add(release);
        }
        stream.Dispose();
    }

    return items;
}
Thanks for your help.
NB: I already had a look at SQLite, and I don't need such a sophisticated system.
Supposedly JSON.NET is better than the built-in serializers. If you are not sending the data over the wire, the quickest approach is binary serialization rather than JSON or XML. Finally, think about whether you really need to load all the data when your application starts. Serialize your data as a list of binary records and create an index that lets you quickly jump to the range of records you actually need.
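A minimal sketch of that record-plus-index idea, purely for illustration (Release is assumed to have Id/Title/Year properties, plain System.IO is used for brevity, and in a Windows Store app you would open the streams through StorageFile instead):

// Write every record to one data file and record each record's start offset in an index file.
public static void WriteAll(string dataPath, string indexPath, IEnumerable<Release> items)
{
    using (var data = new BinaryWriter(File.Create(dataPath)))
    using (var index = new BinaryWriter(File.Create(indexPath)))
    {
        foreach (var item in items)
        {
            index.Write(data.BaseStream.Position);   // 8-byte offset per record
            data.Write(item.Id.ToString());
            data.Write(item.Title ?? "");
            data.Write(item.Year);
        }
    }
}

// Read only one page of records by seeking to the offsets stored in the index.
public static List<Release> ReadPage(string dataPath, string indexPath, int skip, int take)
{
    var results = new List<Release>();
    using (var index = new BinaryReader(File.OpenRead(indexPath)))
    using (var data = new BinaryReader(File.OpenRead(dataPath)))
    {
        index.BaseStream.Seek((long)skip * sizeof(long), SeekOrigin.Begin);
        for (int i = 0; i < take && index.BaseStream.Position < index.BaseStream.Length; i++)
        {
            data.BaseStream.Seek(index.ReadInt64(), SeekOrigin.Begin);
            results.Add(new Release
            {
                Id = Guid.Parse(data.ReadString()),
                Title = data.ReadString(),
                Year = data.ReadInt32()
            });
        }
    }
    return results;
}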
As Filip already mentioned, you probably don't need to load all the data at startup. Even if you really want to show all the items on the first page (showing 10,000 items at once to a user doesn't sound like a good idea to me), you don't need all of their properties available: usually only a couple of them are shown in the list, and you need the rest only when the user navigates to an individual item's details. You could keep a separate "index" file containing only the data you need for the list. This does mean duplication, but it will help you with performance.
Although you've mentioned that you don't need SQLite because it is too sophisticated for your needs, you really should take a closer look at it. It is designed to handle structured data such as yours efficiently. I'm pretty sure that if you switch to it the performance will be much better, and your code might even end up simpler. Try it out.
I know this could be opinion, but I'm looking for best practices.
As I understand it, IQueryable<T> implements IEnumerable<T>, so in my DAL I currently have method signatures like the following:
IEnumerable<Product> GetProducts();
IEnumerable<Product> GetProductsByCategory(int categoryId);
Product GetProduct(int productId);
Should I be using IQueryable<T> here?
What are the pros and cons of either approach?
Note that I am planning on using the Repository pattern so I will have a class like so:
public class ProductRepository {
    DBDataContext db = new DBDataContext(/* connection string */);

    public IEnumerable<Product> GetProductsNew(int daysOld) {
        return db.GetProducts()
                 .Where(p => p.AddedDateTime > DateTime.Now.AddDays(-daysOld));
    }
}
Should I change my IEnumerable<T> to IQueryable<T>? What advantages/disadvantages are there to one or the other?
It depends on what behavior you want.
Returning an IList<T> tells the caller that they've received all of the data they requested.
Returning an IEnumerable<T> tells the caller that they'll need to iterate over the result and that it might be lazily loaded.
Returning an IQueryable<T> tells the caller that the result is backed by a LINQ provider that can handle certain classes of queries, putting the burden on the caller to form a performant query.
While the latter gives the caller a lot of flexibility (assuming your repository fully supports it), it's the hardest to test and, arguably, the least deterministic.
One more thing to think about: where does your paging/sorting support live? If you provide paging inside your repository, returning IEnumerable<T> is fine. If you page outside of your repository (in the controller or service layer, say), then you really want IQueryable<T>, because you don't want to load the entire dataset into memory before it is paged.
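For example, paging in a controller only stays efficient if the repository hands back an IQueryable<T> (a sketch; pageNumber, pageSize, and the Name column are assumptions, and GetProducts() is assumed to return IQueryable<Product> here):

// Composes into a single paged SQL query because the source is IQueryable<Product>.
var page = productRepository.GetProducts()
                            .OrderBy(p => p.Name)
                            .Skip((pageNumber - 1) * pageSize)
                            .Take(pageSize)
                            .ToList();

// If GetProducts() returned IEnumerable<Product> instead, the same code would first
// pull every product into memory and then page the list on the client.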
HUUUUGGGE difference. I see this quite a bit.
You build up an IQueryable before it hits the database. The IQueryable only hits the DB once an eager function is called (.ToList() for example) or you actually try to pull values out. IQueryable = lazy.
An IEnumerable, by contrast, is resolved against the DB as written: any further lambdas you apply run in memory over the returned rows instead of being translated into SQL. IEnumerable = eager.
As for which to use with the Repository pattern, I believe it's the eager option. I usually see ILists being passed, but someone else will need to iron that out for you. EDIT: You usually see IEnumerable instead of IQueryable because you don't want layers past your repository (a) determining when the database hit will happen, or (b) adding any logic to the joins outside the repository.
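A short sketch of that lazy/eager difference (db stands for the data context; Price is an invented column):

// IQueryable: the Where is composed into the SQL that is sent to the database.
IQueryable<Product> query = db.Products;
var cheap = query.Where(p => p.Price < 10).ToList();        // SELECT ... WHERE Price < 10

// IEnumerable: the original query runs as-is and Where filters the rows in memory.
IEnumerable<Product> all = db.Products.AsEnumerable();
var cheapInMemory = all.Where(p => p.Price < 10).ToList();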
There is a very good LINQ video that I enjoy a lot; it covers more than just IEnumerable vs IQueryable and has some fantastic insight.
http://channel9.msdn.com/posts/matthijs/LINQ-Tips-Tricks-and-Optimizations-by-Scott-Allen/
You can use IQueryable and accept that someone could create a scenario where a SELECT N+1 happens. This is a disadvantage, along with the fact that you may end up with code specific to your repository implementation in the layers above it. The advantage is that common operations like paging and sorting can be expressed outside of your repository, relieving it of those concerns. It is also more flexible if you need to join the data with other database tables, because the query remains an expression, so it can be added to before it is resolved into a query and hits the database.
The alternative is to lock down your repository so that it returns materialised lists by calling ToList(). In the paging and sorting example, you would pass skip, take, and a sort expression as parameters to your repository methods and use them to return only a window of results. This means the repository takes on the responsibility of paging and sorting, and of all the projection of your data.
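In that locked-down style a repository method might look something like this (a sketch only; db.Products and the Id column are assumptions, and a sort selector could be passed in the same way as skip/take):

// The repository owns paging and sorting and returns a fully materialised page.
public List<Product> GetProductsPage(int skip, int take)
{
    return db.Products
             .OrderBy(p => p.Id)     // assumed key column; could be a caller-supplied expression
             .Skip(skip)
             .Take(take)
             .ToList();
}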
This is a bit of a judgement call: do you give your application the power of LINQ and keep the repository simple, or do you control your data access? For me it depends on the number of queries associated with each entity and with combinations of entities, and on where I want to manage that complexity.
When I make the same query twice, the second time it does not return new rows from the database (I guess it just uses the cache).
This is a Windows Form application, where I create the dataContext when the application starts.
How can I force Linq to SQL not to use the cache?
Here is a sample function where I have the problem:
public IEnumerable<Orders> NewOrders()
{
    return from order in dataContext.Orders
           where order.Status == 1
           select order;
}
The simplest way would be to use a new DataContext - given that most of what the context gives you is caching and identity management, it really sounds like you just want a new context. Why did you want to create just the one and then hold onto it?
By the way, for simple queries like yours it's more readable (IMO) to use "normal" C# with extension methods rather than query expressions:
public IEnumerable<Orders> NewOrders()
{
    return dataContext.Orders.Where(order => order.Status == 1);
}
EDIT: If you never want it to track changes, then set ObjectTrackingEnabled to false before you do anything. However, this will severely limit its usefulness. You can't just flip the switch back and forth (having made queries in between). Changing your design to avoid the singleton context would be much better, IMO.
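A sketch of that per-operation context (MyDataContext and the connection-string handling are placeholders):

public IEnumerable<Orders> NewOrders()
{
    // A fresh context per operation: nothing is served from a stale cache.
    using (var context = new MyDataContext(connectionString))
    {
        // Optional, for read-only work: context.ObjectTrackingEnabled = false;
        return context.Orders
                      .Where(order => order.Status == 1)
                      .ToList();   // materialise before the context is disposed
    }
}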
It can matter HOW you add an object to the DataContext as to whether or not it will be included in future queries.
Will NOT add the new InventoryTransaction to future in-memory queries
In this example I'm adding an object with an ID and then adding it to the context.
var transaction = new InventoryTransaction()
{
    AdjustmentDate = currentTime,
    QtyAdjustment = 5,
    InventoryProductId = inventoryProductId
};
dbContext.InventoryTransactions.InsertOnSubmit(transaction);
dbContext.SubmitChanges();
Linq-to-SQL isn't clever enough to see this as needing to be added to the previously cached list of in-memory items in InventoryTransactions.
WILL add the new InventoryTransaction to future in-memory queries
var transaction = new InventoryTransaction()
{
    AdjustmentDate = currentTime,
    QtyAdjustment = 5
};
inventoryProduct.InventoryTransactions.Add(transaction);
dbContext.SubmitChanges();
Wherever possible use the collections in Linq-to-SQL when creating relationships and not the IDs.
In addition as Jon says, try to minimize the scope of a DataContext as much as possible.