DbContext.ChangeTracker, DbContext.Entry() inconsistencies - entity-framework-4.1

Under the debugger I have a case where DbContext.ChangeTracker.Entry(e) returns an entry with a State of Detached. When I enumerate the results of DbContext.ChangeTracker.Entries() and the entries of the underlying ObjectContext when looking for e, I find an entry with a State of Unchanged (expected).
What is going on?
Here are some additional details:
Using POCO entities.
Change tracking is on.
Proxy creation is off.
Lazy loading is off.
The problem does not occur when saving an entity for the first time (e.g. adding it to the context); it occurs when loading an existing entity into the context and then trying to make changes to it. This is an aggregate root with many "reference" entities that aren't supposed to change.
Equals is overridden on the entities and IEquatable<T> is implemented. That code is generated by T4.
I am using a generic repository implementation that is declaratively configured to generate rules for saving (e.g. whether entities should be added, attached/modified, or attached/unchanged). It seems to be doing this in the right order. For example, the aggregate root is added/attached last, because attaching it first brings in other entities in a modified state (adding those first as unchanged prevents this; a sketch of that ordering follows below).
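
For readers unfamiliar with that attach ordering, here is a minimal sketch of the idea against the EF 4.1 DbContext API (the AggregateRoot/References names are invented for illustration):

using System.Data.Entity;

// 'context' is the DbContext; 'root' is an aggregate root whose
// "reference" entities must not be saved as Modified.
static void AttachForUpdate(DbContext context, AggregateRoot root)
{
    foreach (var reference in root.References)
    {
        // Setting State attaches a detached entity as Unchanged.
        context.Entry(reference).State = EntityState.Unchanged;
    }

    // Attach the root last, marked Modified so its own changes are saved;
    // the references above stay Unchanged instead of being swept in as Modified.
    context.Entry(root).State = EntityState.Modified;
}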

(Answered in a question edit. Converted to a community wiki answer. See Question with no answers, but issue solved in the comments (or extended in chat).)
The OP wrote:
I have "solved" the problem, but I still want to know what's going on, because my solution doesn't do anything to address the root cause. My "solution" looks for an entity in the change tracker (I have also looked via the context.Entry() and context.Set().Local -- when I do it with this code (I did it as a loop instead of LINQ so I could set breakpoints), it works:
private DbEntityEntry GetChangeTrackedEntry(IEntity mine, Type type)
{
    foreach (var en in context.ChangeTracker.Entries())
    {
        // Match on runtime type first...
        if (en.Entity.GetType() != type)
            continue;
        // ...then on the entity's identifier, rather than relying on
        // Equals or reference identity.
        if (((IEntity)en.Entity).Id != mine.Id)
            continue;
        return en;
    }
    return null;
}
When I attempt to look up an entity (via the change tracker, the set, etc.) using mine directly, that's when I end up with the detached case.
I thought perhaps there were cases of EF using ReferenceEquals, but @Ladislav's comment may indicate something wrong with the Equals implementation.
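As a purely illustrative sketch (not the OP's generated code), an identifier-based Equals like the following can mislead lookups that rely on it: any two unsaved entities with the default Id compare equal, and an entity no longer compares equal to its former self once its Id is assigned.

using System;

public class Entity : IEquatable<Entity>
{
    public int Id { get; set; }

    public bool Equals(Entity other)
    {
        if (object.ReferenceEquals(other, null)) return false;
        // Id-based equality: any two unsaved entities (Id == 0) compare "equal",
        // so code that relies on Equals can pick the wrong tracked instance.
        return Id == other.Id;
    }

    public override bool Equals(object obj) { return Equals(obj as Entity); }
    public override int GetHashCode() { return Id.GetHashCode(); }
}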
If anyone has a further explanation they can edit that into this community wiki answer.


Redacted comments in MS's source code for .NET [duplicate]

The Reference Source page for stringbuilder.cs has this comment in the ToString method:
if (chunk.m_ChunkLength > 0)
{
    // Copy these into local variables so that they
    // are stable even in the presence of ----s (hackers might do this)
    char[] sourceArray = chunk.m_ChunkChars;
    int chunkOffset = chunk.m_ChunkOffset;
    int chunkLength = chunk.m_ChunkLength;
What does this mean? Is ----s something a malicious user might insert into a string to be formatted?
The source code for the published Reference Source is pushed through a filter that removes objectionable content from the source. Verboten words are one example; Microsoft programmers use profanity in their comments. The names of devs are another; Microsoft wants to hide their identity. Such a word or name is substituted with dashes.
In this case you can tell what used to be there from the CoreCLR, the open-sourced version of the .NET Framework. It is a verboten word:
// Copy these into local variables so that they are stable even in the presence of race conditions
That line was hand-edited from the original you looked at before it was submitted to GitHub (Microsoft also doesn't want to accuse its customers of being hackers). The original comment said races, which the filter turned into ----s :)
In the CoreCLR repository you have a fuller quote:
Copy these into local variables so that they are stable even in the presence of race conditions
(GitHub)
Basically: it's a threading consideration.
In addition to the great answer by @Jeroen, this is more than just a threading consideration. It's to prevent someone from intentionally creating a race condition and causing a buffer overflow that way. Later in the code, the length of that local variable is checked. If the code checked the length of the accessible field instead, it could have changed on another thread between the time the length was checked and wstrcpy was called:
// Check that we will not overrun our boundaries.
if ((uint)(chunkLength + chunkOffset) <= ret.Length && (uint)chunkLength <= (uint)sourceArray.Length)
{
    // imagine that another thread has changed the chunk.m_ChunkChars array here!
    // we're now in big trouble, our attempt to prevent a buffer overflow has been thwarted!
    // oh wait, we're OK, because we're using a local variable that the other thread can't access anyway.
    fixed (char* sourcePtr = sourceArray)
        string.wstrcpy(destinationPtr + chunkOffset, sourcePtr, chunkLength);
}
else
{
    throw new ArgumentOutOfRangeException("chunkLength", Environment.GetResourceString("ArgumentOutOfRange_Index"));
}
}
chunk = chunk.m_ChunkPrevious;
} while (chunk != null);
Really interesting question though.
I don't think that this is the case: the code in question copies to local variables to prevent bad things happening if the StringBuilder instance is mutated on another thread.
I think the ---- may relate to a four-letter swear word...
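
To make the defensive pattern concrete outside the BCL, here is a minimal standalone C# sketch (invented types, not the StringBuilder code) of copying mutable fields to locals before validating and using them:

using System;

class SharedBuffer
{
    // Fields that another thread might swap or resize at any time.
    private char[] _chars = new char[16];
    private int _length;

    public char[] Snapshot(char[] destination)
    {
        // Copy the fields into locals: these cannot change under us,
        // so the bounds check below stays valid for the copy that follows.
        char[] source = _chars;
        int length = _length;

        if ((uint)length > (uint)source.Length || length > destination.Length)
            throw new ArgumentOutOfRangeException("destination");

        // Safe: 'source' and 'length' are stable even if _chars/_length
        // are mutated concurrently by another thread.
        Array.Copy(source, 0, destination, 0, length);
        return destination;
    }
}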

What's happening when I use for(i in object) in AS3?

To iterate over the properties of an Object in AS3 you can use for(var i:String in object) like this:
Object:
var object:Object = {
    thing: 1,
    stuff: "hats",
    another: new Sprite()
};
Loop:
for(var i:String in object)
{
    trace(i + ": " + object[i]);
}
Result:
stuff: hats
thing: 1
another: [object Sprite]
The order in which the properties are selected, however, seems to vary and never matches anything I can think of, such as alphabetical property name or the order in which they were created. In fact, if I try it a few different times in different places, the order is completely different.
Is it possible to access the properties in a given order? What is happening here?
I'm posting this as an answer just to complement BoltClock's answer with some extra insight, by looking directly at the Flash Player source code. The AVM code that specifically provides this functionality is written in C++. Inside ArrayObject.cpp we can see the following:
// Iterator support - for in, for each
Atom ArrayObject::nextName(int index)
{
    AvmAssert(index > 0);
    int denseLength = (int)getDenseLength();
    if (index <= denseLength)
    {
        AvmCore *core = this->core();
        return core->intToAtom(index-1);
    }
    else
    {
        return ScriptObject::nextName(index - denseLength);
    }
}
As you can see, when there is a legitimate property (object) to return, it is looked up from the ScriptObject class, specifically the nextName() method. If we look at that method within ScriptObject.cpp:
Atom ScriptObject::nextName(int index)
{
    AvmAssert(traits()->needsHashtable());
    AvmAssert(index > 0);
    InlineHashtable *ht = getTable();
    if (uint32_t(index)-1 >= ht->getCapacity()/2)
        return nullStringAtom;
    const Atom* atoms = ht->getAtoms();
    Atom m = ht->removeDontEnumMask(atoms[(index-1)<<1]);
    if (AvmCore::isNullOrUndefined(m))
        return nullStringAtom;
    return m;
}
We can see that indeed, as people have pointed out here, the VM is using a hash table. However, in these functions a specific index is supplied, which would suggest, at first glance, that there must be a specific ordering.
If you dig deeper (I won't post all the code here), there is a whole slew of methods from different classes involved in the for in/for each functionality. One of them is ScriptObject::nextNameIndex(), which basically walks the whole hash table, handing out indices of valid objects within the table and incrementing the index supplied in the argument as long as the next slot points to a valid object. If I'm right in my interpretation, this is the cause of your seemingly random ordering, and I don't believe there is any way to force a standardized/ordered traversal in these operations.
Sources
For those of you who might want to get the source code for the open source portion of the Flash Player, you can grab it from the following Mercurial repositories (you can download a snapshot as a zip, like GitHub, so you don't have to install Mercurial):
http://hg.mozilla.org/tamarin-central - This is the "stable" or "release" repository
http://hg.mozilla.org/tamarin-redux - This is the development branch. The most recent changes to the AVM are found here, including support for Android. Adobe is still updating and open-sourcing these parts of the Flash Player, so this is current and official material.
While I'm at it, this might be of interest as well: http://code.google.com/p/redtamarin/. It's a branched-off (and rather mature) version of the AVM that can be used to write server-side ActionScript. Neat stuff, with a ton of information that gives insight into the workings of the AVM, so I thought I'd include it too.
This behavior is documented (emphasis mine):
The for..in loop iterates through the properties of an object, or the elements of an array. For example, you can use a for..in loop to iterate through the properties of a generic object (object properties are not kept in any particular order, so properties may appear in a seemingly random order)
How the properties are stored and retrieved appears to be an implementation detail, which isn't covered in the documentation. As ToddBFisher mentions in a comment, though, a data structure commonly used to implement associative arrays is the hash table. It's even mentioned in this page about associative arrays in AS3, and if you inspect the AVM code as shown by Ascension Systems, you'll find exactly such an implementation. As described, there is no notion of order or sorting in typical hash tables.
I don't believe there is a way to access the properties in a specific order unless you store that order somehow.
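
The same point holds for hash-based maps in other languages. As a C# analogue (purely illustrative; .NET's Dictionary is likewise hash-based with unspecified ordering), imposing an order means sorting the keys yourself before iterating:

using System;
using System.Collections.Generic;
using System.Linq;

class OrderedIteration
{
    static void Main()
    {
        var map = new Dictionary<string, object>
        {
            { "thing", 1 },
            { "stuff", "hats" }
        };

        // Like AS3's Object, Dictionary iteration order is an implementation
        // detail; sorting the keys explicitly is how you get a fixed order.
        foreach (var key in map.Keys.OrderBy(k => k, StringComparer.Ordinal))
        {
            Console.WriteLine(key + ": " + map[key]);
        }
    }
}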

Naming conventions for methods which must be called in a specific order?

I have a class that requires some of its methods to be called in a specific order. If these methods are called out of order then the object will stop working correctly. There are a few asserts in the methods to ensure that the object is in a valid state. What naming conventions could I use to communicate to the next person to read the code that these methods need to be called in a specific order?
It would be possible to turn this into one huge method, but huge methods are a great way to create problems. (There are two methods that can trigger this sequence, so one huge method would also result in duplication.)
It would be possible to write comments explaining that the methods need to be called in order, but comments are less useful than clearly named methods.
Any suggestions?
Is it possible to refactor so that (at least some of) the state from the first function is passed as a parameter to the second function? Then calling them out of order becomes impossible.
Otherwise, if you have comments and asserts, you're doing quite well.
However, "It would be possible to turn this into one huge method" makes it sound like the outside code doesn't need to access the intermediate state in any way. If so, why not just make one public method, which calls several private methods successively? Something like:
FroblicateWeazel() {
    // Need to be in this order:
    FroblicateWeazel_Init();
    FroblicateWeazel_PerformCalcs();
    FroblicateWeazel_OutputCalcs();
    FroblicateWeazel_Cleanup();
}
That's not perfect, but if the ordering is centralised in that one function, it's fairly easy to see what order the steps should come in.
Message digest and encryption/decryption routines often have an _init() method to set things up, an _update() to add new data, and a _final() to return final results and tear things back down again.
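
To make that concrete, here is a minimal C# sketch (types and names invented for illustration) of enforcing the _init/_update/_final ordering at runtime rather than only through naming:

using System;

public class Digest
{
    private enum Stage { Created, Initialized, Finalized }
    private Stage _stage = Stage.Created;

    public void Init()
    {
        if (_stage != Stage.Created)
            throw new InvalidOperationException("Init must be called first, and only once.");
        _stage = Stage.Initialized;
    }

    public void Update(byte[] data)
    {
        if (_stage != Stage.Initialized)
            throw new InvalidOperationException("Call Init before Update.");
        // ... accumulate data ...
    }

    public byte[] Final()
    {
        if (_stage != Stage.Initialized)
            throw new InvalidOperationException("Call Init (and optionally Update) before Final.");
        _stage = Stage.Finalized;
        return new byte[0]; // placeholder result
    }
}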

Transforming an object implicitly

The following code illustrates a pattern I sometimes see, whereby an object is transformed implicitly as it is passed as a parameter across a number of method calls.
var o = new MyReferenceType();
DoSomeWorkAndPossiblyModifyO(o);
DoYetMoreWorkAndPossiblyFurtherModifyO(o);
//now use o...
This feels wrong to me (it hardly feels object oriented). Is it acceptable?
Based on your method names, I would argue that there is nothing implicit in the transformation. This pattern would be acceptable. If, on the other hand, your methods had names like printO(o) or compareTo(o) but actually modified the object o, the design would be bad.
It is acceptable but usually bad style.
The usual "good" approach is:
DoSomeWorkAndModify(&o); // explicit reference means we will be accepting changes
o = DoSomeWorkAndReturnModified(o); // much more elastic because you often want to keep original.
The approach you presented makes sense when o is huge, and making a copy of it in memory is out of question, or if it's a function you (and nobody else = private) use very frequently and don't want to bother with the & syntax. Otherwise it's laziness that results in some really difficult to detect bugs.
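
In C# terms (a hypothetical sketch; the & syntax above is C++-flavored), the two explicit styles look roughly like this:

using System;

public struct Order
{
    public decimal Total;
}

public static class OrderOps
{
    // Style 1: 'ref' in the signature advertises that the argument is modified.
    public static void Normalize(ref Order order)
    {
        order.Total = Math.Round(order.Total, 2);
    }

    // Style 2: the caller's value is left alone (structs are copied when
    // passed by value) and a modified copy is returned, so the original is kept.
    public static Order WithRoundedTotal(Order order)
    {
        order.Total = Math.Round(order.Total, 2);
        return order;
    }
}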
It depends entirely on what the methods actually do, besides modifying that object.
For instance, an object primarily related to keeping some state in memory might for instance not have anything related to persisting that state anywhere.
The methods could for instance load data from a database, and update the object with that information.
However! Since I program mostly in C# and thus .NET, which is a wholly object-oriented language, I would actually write your code like this:
var o = new MyReferenceType();
SomeOtherClass.DoSomeWorkAndPossiblyModifyO(o);
SomeOtherClass.DoYetMoreWorkAndPossiblyFurtherModifyO(o);
//now use o...
In which case the actual name of that other class (or those other classes, if there are two involved) would give me a big clue as to what is actually happening and/or the context.
Example:
Person p = new Person();
DatabaseContext.FetchAllLazilyLoadedProperties(p);
DatabaseContext.Save(p); // updates primary key property with new ID

api documentation and "value limits": do they match?

Do you often see in API documentation (as in 'javadoc of public functions', for example) the description of "value limits" as well as the classic documentation?
Note: I am not talking about comments within the code
By "value limits", I mean:
Can a parameter support a null value (or an empty String, or...)?
Can a 'return value' be null, or is it guaranteed to never be null (or can it be "empty", or...)?
Sample:
What I often see (without having access to source code) is:
/**
 * Get all readers name for this current Report. <br />
 * <b>Warning</b>: The Report must have been published first.
 * @param aReaderNameRegexp filter in order to return only reader matching the regexp
 * @return array of reader names
 */
String[] getReaderNames(final String aReaderNameRegexp);
What I like to see would be:
/**
 * Get all readers name for this current Report. <br />
 * <b>Warning</b>: The Report must have been published first.
 * @param aReaderNameRegexp filter in order to return only reader matching the regexp
 *        (can be null or empty)
 * @return array of reader names
 *         (null if Report has not yet been published,
 *          empty array if no reader match criteria,
 *          reader names array matching regexp, or all readers if regexp is null or empty)
 */
String[] getReaderNames(final String aReaderNameRegexp);
My point is:
When I use a library with a getReaderNames() function in it, I often do not even need to read the API documentation to guess what it does. But I need to be sure how to use it.
My only concern when I want to use this function is: what should I expect in terms of parameters and return values? That is all I need to know to safely set up my parameters and safely test the return value, yet I almost never see that kind of information in API documentation...
Edit:
This can influence the choice between checked and unchecked exceptions.
What do you think? Value limits and API documentation: do they belong together or not?
I think they can belong together but don't necessarily have to belong together. In your scenario, it seems like it makes sense that the limits are documented in such a way that they appear in the generated API documentation and intellisense (if the language/IDE support it).
I think it does depend on the language as well. For example, Ada has a native data type that is a "restricted integer", where you define an integer variable and explicitly indicate that it will only (and always) be within a certain numeric range. In that case, the datatype itself indicates the restriction. It should still be visible and discoverable through the API documentation and intellisense, but wouldn't be something that a developer has to specify in the comments.
However, languages like Java and C# don't have this type of restricted integer, so the developer would have to specify it in the comments if it were information that should become part of the public documentation.
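
For illustration only (this is a hypothetical type, not a standard library one), the usual C# workaround is a small value type that enforces the range in its constructor, so the limit lives in the API itself rather than only in a comment:

using System;

// An integer constrained to 0..100, enforced at construction time
// instead of being described only in documentation.
public readonly struct Percentage
{
    public int Value { get; }

    public Percentage(int value)
    {
        if (value < 0 || value > 100)
            throw new ArgumentOutOfRangeException("value", "Must be between 0 and 100.");
        Value = value;
    }
}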
I think those kinds of boundary conditions most definitely belong in the API. However, I would (and often do) go a step further and indicate WHAT those null values mean. Either I indicate it will throw an exception, or I explain what the expected results are when the boundary value is passed in.
It's hard to remember to always do this, but it's a good thing for users of your class. It's also difficult to maintain if the contract the method presents changes (like null values changing to not be allowed)... you have to be diligent also to update the docs when you change the semantics of the method.
Question 1
Do you often see in API documentation (as in 'javadoc of public functions' for example) the description of "value limits" as well as the classic documentation?
Almost never.
Question 2
My only concern when I want to use this function is: what should I expect in terms of parameters and return values? That is all I need to know to safely set up my parameters and safely test the return value, yet I almost never see that kind of information in API documentation...
If I use a function improperly, I would expect a RuntimeException thrown by the method, or a RuntimeException in another (sometimes very distant) part of the program.
Comments like @param aReaderNameRegexp filter in order to ... (can be null or empty) seem to me a way to implement Design by Contract in human language inside Javadoc.
Using Javadoc to enforce Design by Contract was the approach of iContract, now resurrected as JContractS, which lets you specify invariants, preconditions, and postconditions in a more formalized way than human language.
Question 3
This can influence the choice between checked and unchecked exceptions.
What do you think? Value limits and API documentation: do they belong together or not?
The Java language doesn't have a Design by Contract feature, so you might be tempted to use exceptions, but I agree with you that you have to be aware of when to choose checked and unchecked exceptions. You might use unchecked IllegalArgumentException or IllegalStateException, or you might use unit testing, but the major problem is how to communicate to other programmers that such code is about Design by Contract and should be considered a contract before changing it too lightly.
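
As a C#-flavored sketch of that unchecked-exception style (the class and messages are invented, loosely following the getReaderNames example above), the contract can at least fail fast and loudly:

using System;

public class Report
{
    private bool _published;

    // Contract: the report must be published first; the regexp may be
    // null or empty, which means "all readers".
    public string[] GetReaderNames(string readerNameRegexp)
    {
        // Analogue of Java's unchecked IllegalStateException.
        if (!_published)
            throw new InvalidOperationException("Report must be published before readers can be listed.");

        // null/empty is explicitly allowed and means "no filtering".
        // ... filtering logic elided ...
        return new string[0];
    }
}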
I think they do, and I have always placed comments in the header files (C++) accordingly.
In addition to valid input/output/return comments, I also note which exceptions are likely to be thrown by the function (since I often want to use the return value for... well, returning a value, I prefer exceptions over error codes):
//File:
//  Should be a path to the texture file to load; if it is not a full path
//  (e.g. "c:\example.png") it will attempt to find the file using the paths
//  provided by the DataSearchPath list
//Return: The pointer to a Texture instance is returned. In the event of an
//  error, an exception is thrown. When you are finished with the texture you
//  should call the Free() method.
//Exceptions:
//  except::FileNotFound
//  except::InvalidFile
//  except::InvalidParams
//  except::CreationFailed
Texture *GetTexture(const std::string &File);
@Fire Lancer: Right! I forgot about exceptions, but I would like to see them mentioned, especially the unchecked 'runtime' exceptions that this public method could throw.
@Mike Stone:
you have to be diligent also to update the docs when you change the semantics of the method.
Mmmm, I sure hope that the public API documentation is at the very least updated whenever a change that affects the contract of the function takes place. If not, those API documentations could be dropped altogether.
To add food to your thoughts (and go with @Scott Dorman), I just stumbled upon the future of Java 7 annotations.
What does that mean? That certain 'boundary conditions', rather than being in the documentation, would be better off in the API itself, and automatically used, at compilation time, with appropriate 'assert' generated code.
That way, if a '@CheckForNull' is in the API, the writer of the function might get away with not even documenting it! And if the semantics change, the API will reflect that change (like 'no more @CheckForNull', for instance).
That kind of approach suggests that documentation, for 'boundary conditions', is an extra bonus rather than a mandatory practice.
However, that does not cover the special values of the return object of a function. For that, complete documentation is still needed.
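
As a side note, later versions of C# realized much the same idea with compiler-checked nullable reference type annotations. A small illustrative sketch (hypothetical types, mirroring the getReaderNames example):

#nullable enable

public static class ReportReaders
{
    // The '?' annotations put the null contract in the signature itself:
    // the parameter may be null, and so may the return value. Callers that
    // dereference the result without checking get a compiler warning.
    public static string[]? GetReaderNames(string? readerNameRegexp)
    {
        if (string.IsNullOrEmpty(readerNameRegexp))
            return new string[0]; // "all readers"; real lookup elided
        return null; // e.g. report not yet published
    }
}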