Is it still thread safe if I do first() then pop_front()? - stl

Consider the following code in a multithread program:
QString target = remaining.first(); // remaining is a QVector<QString> class
remaining.pop_front();
Would it be safe? It looks like multiple threads may use the same "target" simultaneously. Or what is the safe way to retrieve and erase the first value?

Without a mutex protecting that code, no, it's not at all safe.
I don't know QVector in detail but I believe it's OK for two threads to both do:
QString target = remaining.first();
This simply copies an element of the vector, so each thread has its own QString object called target and they are independent objects (behind the scenes they use implicit sharing, so they are not fully independent, but you should be able to treat them as independent).
But this line modifies the QVector:
remaining.pop_front();
This means two threads modify the same object without any synchronisation. If the first thread is still accessing the vector by calling remaining.first() when the second thread calls pop_front() then there is a data race, with undefined behaviour.
Similarly, if both threads call pop_front() concurrently, they will both try to remove the first element, and what happens then is completely unpredictable. You might erase one element, or two, or none, or crash the entire program immediately. As another possibility, consider what happens if the vector only has one element. Both threads check that it's not empty, copy the first() element, then call pop_front(), which tries to remove two elements when there's only one. Your program is broken.
The safe way to do it is protect the code with a mutex, where mutex is some global or otherwise shared variable that is visible to both threads:
QString target;
{
    QMutexLocker locker(&mutex);
    if (!remaining.empty())
    {
        target = remaining.first();
        remaining.pop_front();
    }
}
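For readers who want to run the pattern, here is a minimal sketch of the same idea in Python rather than Qt. The names take_first, worker, remaining and lock are made up for this illustration; collections.deque stands in for the QVector and threading.Lock for the QMutex. The point it tries to show is that the emptiness check, the read, and the removal all happen while the lock is held, so every item is handed to exactly one thread.

```python
import threading
from collections import deque

remaining = deque(f"job-{i}" for i in range(1000))  # shared work list
lock = threading.Lock()                              # guards `remaining`


def take_first():
    """Atomically remove and return the first item, or None if the list is empty."""
    with lock:  # check, read and removal form one critical section
        if not remaining:
            return None
        return remaining.popleft()


def worker(taken):
    # Keep taking items until the shared list is exhausted.
    while (item := take_first()) is not None:
        taken.append(item)


results = [[] for _ in range(4)]  # one private result list per thread
threads = [threading.Thread(target=worker, args=(r,)) for r in results]
for t in threads:
    t.start()
for t in threads:
    t.join()

all_taken = [item for r in results for item in r]
```

Each thread only ever appends to its own list, so the only shared mutable object is remaining, and it is touched exclusively inside take_first().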

Related

how to make Ghidra use a function's complete/original stackframe for decompiled code

I have a case where some function allocates/uses a 404 bytes temporary structure on the stack for its internal calculations (the function is self-contained and shuffles data around within that data structure). Conceptually the respective structure seems to consist of some 32-bit counters followed by an int[15] and a byte[80] array, and then an area that might or might not actually be used. Some of the generated data in the tables seems to represent offsets that are again used by the function to navigate within the temporary structure.
Unfortunately Ghidra's decompiler makes a total mess while trying to make sense of the function: In particular it creates separate "local_.." int-vars (and then uses a pointer to that var) for what should correctly be a pointer into the function's original data-structure (e.g. pointing into one of the arrays).
undefined4 local_17f;
...
dest = &local_17f;
for (i = 0xf; i != 0; i = i + -1) {
    *dest = 0;
    dest = dest + 1;
}
Ghidra does not seem to understand that an array-based data access is actually being used at that point. Ghidra's decompiler then also generates a local auStack316[316] variable, which unfortunately seems to cover only a part of the respective local data structure used by the original ASM code (at least Ghidra did notice that a temporary memory buffer is used). As a result, the decompiled code basically uses two overlapping (and broken) shadow data structures that should correctly just be the same block of memory.
Is there some way to make Ghidra's decompiler use the complete 404 bytes block allocated by the function as an auStack404 thus bypassing Ghidra's flawed interpretation logic and actually preserve the original functionality of the ASM code?
I think I found something. In the "Listing" view, the local-variable layout in use is shown as a comment under the function's header. It seems that by right-clicking on a local-variable line in that comment, "set data type" can be applied to the respective local variable. Ah, and then there is what I've been looking for under "Function" / "Edit stack frame" :-)

Understanding heisenbug example: "Debuggers cause additional source code to be executed stealthily"

I read the Wikipedia page about heisenbugs, but I don't understand this example. Can anyone explain it in detail?
Debuggers also commonly provide watches or other user interfaces that cause additional source code (such as property accessors) to be executed stealthily, which can, in turn, change the state of the program.
I think what it's getting at is that the debugger itself may call code (such as getters) to retrieve the value of a property you might have placed a watch on.
Consider the getter:
def getter fahrenheit:
    return celsius * 9 / 5 + 32
and what would happen if you put a watch on the fahrenheit property.
That code would normally only be called if your code itself tried to access the fahrenheit property but, if a debugger is calling it to maintain the watch, it may be called outside of the control of your program.
A simple example, let's say the getter has a (pretty obvious) bug which means that it returns the wrong result the first time it's called:
class temperature:
    variable state
    def init:
        state = 1
    def getter fahrenheit:
        if state == 1:
            state = 0
            return -42
        return celsius * 9 / 5 + 32
So running your code without a debugger exhibits a problem in that it will return a weird value the first time your code calls it.
But, if your debugger is actually calling the getter to extract a value that it's watching (and it's probably doing this after every single-step operation you perform), that means the getter will be well and truly returning the correct value by the time your code calls it for what it thinks is the first time.
Hence the problem will disappear when you try to look closer at it, and that's the very definition of a Heisenbug, despite the fact that Heisenberg's actual uncertainty principle has little to do with the observer effect.
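The effect can be reproduced without a real debugger. The following is a rough Python sketch in which the Temperature class, its _first_call flag and the run helper are invented for this illustration; the extra property read inside run only stands in for a debugger evaluating a watch expression.

```python
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius
        self._first_call = True  # hidden state that the buggy getter mutates

    @property
    def fahrenheit(self):
        if self._first_call:     # the bug: the very first read returns garbage
            self._first_call = False
            return -42
        return self.celsius * 9 / 5 + 32


def run(watched):
    t = Temperature(100)
    if watched:
        _ = t.fahrenheit  # stand-in for a debugger refreshing a watch
    return t.fahrenheit   # what the program sees on "its" first read


without_debugger = run(watched=False)  # the bug shows up
with_debugger = run(watched=True)      # the bug is hidden by the watch
```

Under these assumptions the unwatched run returns the bogus -42, while the watched run returns the correct conversion, because the "first call" was consumed by the observer.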

Using retain and release for Objects

Are there any general guidelines for using retain and release for objects in cocos2d-x? When creating objects in a function, is it true that the function's memory is cleaned up the second the function returns? When an object is created, will calling the retain function of the object retain the object beyond the function return?
Kind Regards
Generally in C++ you have this behaviour:
void foo() {
    Object a;
    Object *pA = new Object();
    (…)
}
This would result in a being destroyed automatically at function end, as it was allocated on the stack. The object pointed to by pA would not get destroyed, as it was allocated on the heap (thus, you only lose the reference to it, but the object itself still lives).
Cocos implements a reference-counting scheme with an autorelease pool: each CCObject has a reference counter and two methods, retain() and release(). The way this works is that every time you create an object, it gets registered in cocos structures (CCPoolManager). Then with every frame (between frames being drawn) there is a maintenance loop which checks the reference counter of all objects: if it is 0, this means (to cocos) that no other objects reference it, so it is safe to delete it. The retain count of an object is automatically increased when you use this object as an argument for an addChild function.
Example :
void cocosFoo() {
    CCSprite *a = CCSprite::create(…);
    CCSprite *b = CCSprite::create(…);
    this->addChild(b);
}
What happens here is this :
Two CCSprites are created, cocos knows about them.
The b sprite is added to this object (say a CCLayer)
The function ends, no objects are destroyed (both of them being on heap).
Somewhere between this and the next frame, the maintenance gets run. Cocos checks both sprites and sees that a has reference count == 0, so it deletes it.
This system is quite good, as you don't need to worry about memory management. If you want to create a CCSprite (for example), but not add it as a child yet, you can call retain() on it, which will raise its reference counter, saving it from automatic deletion. But then you'd have to remember about calling release() on it (for example, when adding it as a child).
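To make the bookkeeping concrete, here is a toy model in Python. It is not cocos2d-x code: Pool, Node, create, add_child and the deleted flag are all hypothetical names for this sketch. In this model the counter starts at 1 and the pool drops that one reference when it drains at the end of a frame, which produces the same outcome as the description above: anything nobody else has retained goes away.

```python
class Pool:
    """Toy autorelease pool: releases each registered object once per drain."""

    def __init__(self):
        self.pending = []

    def drain(self):
        pending, self.pending = self.pending, []
        for obj in pending:
            obj.release()


class Node:
    """Toy reference-counted object (hypothetical, not the cocos2d-x class)."""

    def __init__(self, name):
        self.name = name
        self.ref_count = 1       # the creator holds one reference
        self.deleted = False
        self.children = []

    @classmethod
    def create(cls, name, pool):
        obj = cls(name)
        pool.pending.append(obj)  # the creator's reference now belongs to the pool
        return obj

    def retain(self):
        self.ref_count += 1

    def release(self):
        self.ref_count -= 1
        if self.ref_count == 0:
            self.deleted = True   # stand-in for actually freeing the object

    def add_child(self, child):
        child.retain()            # the parent keeps its child alive
        self.children.append(child)


pool = Pool()
layer = Node("layer")             # constructed directly, so we own this reference
a = Node.create("a", pool)        # never retained by anyone
b = Node.create("b", pool)
layer.add_child(b)                # retained by its parent
c = Node.create("c", pool)
c.retain()                        # kept alive manually, not a child yet

pool.drain()                      # "end of frame" maintenance
```

After the drain, a is gone while b and c survive; c stays alive until the matching release() is called, which is why every manual retain() needs its release().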
The general things you have to remember are:
Each call to retain() by you needs to be paired with release().
You generally shouldn't delete CCObjects yourself. If you feel that you need to, there is a convenience macro: CC_SAFE_DELETE(object)
So to answer your questions in short :
Are there any general guidelines for using retain and release for objects in cocos2d-x?
Yes, you should generally not need to do it.
When creating objects in a function, is it true that the function's memory is cleaned up the second the function returns?
The answer to this is the whole text above.
When an object is created, will calling the retain function of the object retain the object beyond the function return?
Yes, as will adding it as a child to another (retained in any way) object.
Here is the thing:
cocos2d-x has an autorelease pool which drains objects whose retain count is 0. The retain count is a variable that keeps track of the lifetime of a cocos2d-x object.
When you create a new object using the create method, it is already added to the autorelease pool and you don't need to release or delete it anywhere. It's like the garbage collector in Java, which takes care of garbage objects behind your back.
But when you create a new object using 'new', you definitely need to release it yourself, in a destructor or once its use is over.
Second thing:
When your object has been added to the autorelease pool but you need it somewhere else, you can just retain it. This increments its retain count by one, and then you have to release it manually once its use is over.
Third thing:
Whenever you add your object as a child, it is retained automatically, but you don't need to release it; rather, you remove it from the parent.

Is STL empty() threadsafe?

I have multiple threads modifying an STL vector and an STL list.
I want to avoid having to take a lock if the container is empty.
Would the following code be thread-safe? What if items were a list or a map?
class A
{
    vector<int> items;
    void DoStuff()
    {
        if(!items.empty())
        {
            AquireLock();
            DoStuffWithItems();
            ReleaseLock();
        }
    }
};
It depends what you expect. The other answers are right that in general, standard C++ containers are not thread-safe, and furthermore, that in particular your code doesn’t ward against another thread modifying the container between your call to empty and the acquisition of the lock (but this matter is unrelated to the thread safety of vector::empty).
So, to ward off any misunderstandings: Your code does not guarantee items will be non-empty inside the block.
But your code can still be useful, since all you want to do is avoid redundant lock acquisitions. Your code doesn't give guarantees, but it may prevent an unnecessary lock acquisition. It won't work in all cases (other threads can still empty the container between your check and the lock), but it will in some. And if all you're after is an optimization by omitting a redundant lock, then your code accomplishes that goal.
Just make sure that any actual access to the container is protected by locks.
By the way, the above is strictly speaking undefined behaviour: an STL implementation is theoretically allowed to modify mutable members inside the call to empty. This would mean that the apparently harmless (because read-only) call to empty can actually cause a conflict. Unfortunately, you cannot rely on the assumption that read-only calls are safe with STL containers.
In practice, though, I am pretty sure that vector::empty will not modify any members. But already for list::empty I am less sure. If you really want guarantees, then either lock every access or don’t use the STL containers.
There is no thread-safety guarantee on anything in the containers and algorithms of the STL.
So, no.
Regardless of whether or not empty is thread safe, your code will not, as written, accomplish your goal.
class A
{
    vector<int> items;
    void DoStuff()
    {
        if(!items.empty())
        {
            // Another thread deletes items here.
            AquireLock();
            DoStuffWithItems();
            ReleaseLock();
        }
    }
};
A better solution is to lock every time you work with items (when iterating, getting items, adding items, checking count/emptiness, etc.), thus providing your own thread safety. So, acquire the lock first, then check if the vector is empty.
As already answered, the above code is not thread-safe, and locking is mandatory before actually doing anything with the container.
But the following should have better performance than always locking, and I can't think of a reason why it would be unsafe.
The idea here is that locking can be expensive and we are avoiding it, whenever not really needed.
class A
{
    vector<int> items;
    void DoStuff()
    {
        if(!items.empty())
        {
            AquireLock();
            if(!items.empty())  // re-check now that we hold the lock
            {
                DoStuffWithItems();
            }
            ReleaseLock();
        }
    }
};
The STL is not thread-safe, and neither is empty(). If you want to make a container safe, you must guard all of its methods with a mutex or another synchronization primitive.

What is an idempotent operation?

What is an idempotent operation?
In computing, an idempotent operation is one that has no additional effect if it is called more than once with the same input parameters. For example, removing an item from a set can be considered an idempotent operation on the set.
In mathematics, an idempotent operation is one where f(f(x)) = f(x). For example, the abs() function is idempotent because abs(abs(x)) = abs(x) for all x.
These slightly different definitions can be reconciled by considering that x in the mathematical definition represents the state of an object, and f is an operation that may mutate that object. For example, consider the Python set and its discard method. The discard method removes an element from a set, and does nothing if the element does not exist. So:
my_set.discard(x)
has exactly the same effect as doing the same operation twice:
my_set.discard(x)
my_set.discard(x)
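This can be checked directly. The sketch below uses a made-up helper, apply_n, that applies an operation to a copy of a set a given number of times; the "grow" operation is an invented counterexample that is not idempotent.

```python
def apply_n(operation, state, n):
    """Apply `operation` to a copy of `state` n times and return the copy."""
    result = set(state)
    for _ in range(n):
        operation(result)
    return result


def discard_two(s):
    s.discard(2)        # removes 2 if present, otherwise does nothing


def grow(s):
    s.add(max(s) + 1)   # adds a new, larger element on every call


start = {1, 2, 3}
discard_once = apply_n(discard_two, start, 1)
discard_thrice = apply_n(discard_two, start, 3)
grow_once = apply_n(grow, start, 1)
grow_thrice = apply_n(grow, start, 3)
```

Discarding once or three times leaves the same set, whereas the growing operation yields a different set each additional time it runs.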
Idempotent operations are often used in the design of network protocols, where a request to perform an operation is guaranteed to happen at least once, but might also happen more than once. If the operation is idempotent, then there is no harm in performing the operation two or more times.
See the Wikipedia article on idempotence for more information.
An idempotent operation can be repeated an arbitrary number of times and the result will be the same as if it had been done only once. In arithmetic, adding zero to a number is idempotent.
Idempotence is talked about a lot in the context of "RESTful" web services. REST seeks to maximally leverage HTTP to give programs access to web content, and is usually set in contrast to SOAP-based web services, which just tunnel remote procedure call style services inside HTTP requests and responses.
REST organizes a web application into "resources" (like a Twitter user, or a Flickr image) and then uses the HTTP verbs of POST, PUT, GET, and DELETE to create, update, read, and delete those resources.
Idempotence plays an important role in REST. If you GET a representation of a REST resource (e.g., GET a JPEG image from Flickr), and the operation fails, you can just repeat the GET again and again until the operation succeeds. To the web service, it doesn't matter how many times the image is gotten. Likewise, if you use a RESTful web service to update your Twitter account information, you can PUT the new information as many times as it takes in order to get confirmation from the web service. PUT-ing it a thousand times is the same as PUT-ing it once. Similarly, DELETE-ing a REST resource a thousand times is the same as deleting it once. Idempotence thus makes it a lot easier to construct a web service that's resilient to communication errors.
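As an illustration only, here is a tiny in-memory stand-in for such a service, written in Python. ToyResourceStore and its put, delete and post methods are hypothetical and make no HTTP calls; they only mimic the state changes that the corresponding verbs are expected to cause.

```python
import itertools


class ToyResourceStore:
    """In-memory stand-in for a REST service (hypothetical, no real HTTP)."""

    def __init__(self):
        self.resources = {}
        self._ids = itertools.count(1)

    def put(self, resource_id, body):
        # PUT replaces the resource at a known location.
        self.resources[resource_id] = body

    def delete(self, resource_id):
        # DELETE of a missing resource leaves the state unchanged.
        self.resources.pop(resource_id, None)

    def post(self, body):
        # POST creates a new resource under a fresh id on every call.
        new_id = next(self._ids)
        self.resources[new_id] = body
        return new_id


store = ToyResourceStore()
for _ in range(1000):                    # retrying a PUT a thousand times
    store.put("profile", {"name": "Ada"})
after_puts = dict(store.resources)

for _ in range(3):                       # retrying a POST three times
    store.post({"text": "hello"})
after_posts = dict(store.resources)

store.delete("profile")
store.delete("profile")                  # the second DELETE changes nothing
after_deletes = dict(store.resources)
```

A thousand PUTs leave one resource, two DELETEs remove it once, but three POSTs leave three new resources, which is why blind retries are safe for the former and not for the latter.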
Further reading: RESTful Web Services, by Richardson and Ruby (idempotence is discussed on pages 103-104), and Roy Fielding's PhD dissertation on REST. Fielding was one of the authors of HTTP 1.1, RFC-2616, which talks about idempotence in section 9.1.2.
No matter how many times you call the operation, the result will be the same.
Idempotence means that applying an operation once or applying it multiple times has the same effect.
Examples:
Multiplication by zero. No matter how many times you do it, the result is still zero.
Setting a boolean flag. No matter how many times you do it, the flag stays set.
Deleting a row from a database with a given ID. If you try it again, the row is still gone.
For pure functions (functions with no side effects), idempotency implies that f(x) = f(f(x)) = f(f(f(x))) = f(f(f(f(x)))) = ... for all values of x.
For functions with side effects, idempotency furthermore implies that no additional side effects will be caused after the first application. You can consider the state of the world to be an additional "hidden" parameter to the function if you like.
Note that in a world where you have concurrent actions going on, you may find that operations you thought were idempotent cease to be so (for example, another thread could unset the value of the boolean flag in the example above). Basically whenever you have concurrency and mutable state, you need to think much more carefully about idempotency.
Idempotency is often a useful property in building robust systems. For example, if there is a risk that you may receive a duplicate message from a third party, it is helpful to have the message handler act as an idempotent operation so that the message effect only happens once.
A good example for understanding an idempotent operation might be locking a car with a remote key.
log(Car.state); // unlocked
Remote.lock();
log(Car.state); // locked
Remote.lock();
Remote.lock();
Remote.lock();
log(Car.state); // locked
lock is an idempotent operation. Even if there are some side effects each time you run lock, like blinking, the car is still in the same locked state no matter how many times you run the lock operation.
An idempotent operation leaves the system in the same state even if you call it more than once, provided you pass in the same parameters.
An idempotent operation is an operation, action, or request that can be applied multiple times without changing the result, i.e. the state of the system, beyond the initial application.
EXAMPLES (WEB APP CONTEXT):
IDEMPOTENT:
Making multiple identical requests has the same effect as making a single request. A message in an email messaging system is opened and marked as "opened" in the database. One can open the message many times but this repeated action will only ever result in that message being in the "opened" state. This is an idempotent operation. The first time one PUTs an update to a resource using information that does not match the resource (the state of the system), the state of the system will change as the resource is updated. If one PUTs the same update to a resource repeatedly then the information in the update will match the information already in the system upon every PUT, and no change to the state of the system will occur. Repeated PUTs with the same information are idempotent: the first PUT may change the state of the system, subsequent PUTs should not.
NON-IDEMPOTENT:
If an operation always causes a change in state, like POSTing the same message to a user over and over, resulting in a new message sent and stored in the database every time, we say that the operation is NON-IDEMPOTENT.
NULLIPOTENT:
If an operation has no side effects, like purely displaying information on a web page without any change in a database (in other words you are only reading the database), we say the operation is NULLIPOTENT. All GETs should be nullipotent.
When talking about the state of the system we are obviously ignoring hopefully harmless and inevitable effects like logging and diagnostics.
Just wanted to throw out a real use case that demonstrates idempotence. In JavaScript, say you are defining a bunch of model classes (as in MVC model). The way this is often implemented is functionally equivalent to something like this (basic example):
function model(name) {
    function Model() {
        this.name = name;
    }
    return Model;
}
You could then define new classes like this:
var User = model('user');
var Article = model('article');
But if you were to try to get the User class via model('user'), from somewhere else in the code, it would fail:
var User = model('user');
// ... then somewhere else in the code (in a different scope)
var User = model('user');
Those two User constructors would be different. That is,
model('user') !== model('user');
To make it idempotent, you would just add some sort of caching mechanism, like this:
var collection = {};
function model(name) {
    if (collection[name])
        return collection[name];
    function Model() {
        this.name = name;
    }
    collection[name] = Model;
    return Model;
}
By adding caching, every time you call model('user') it will return the same object, and so it's idempotent. So:
model('user') === model('user');
These are quite detailed and technical answers. Just adding a simple definition.
Idempotent = Re-runnable
For example,
A Create operation by itself is not guaranteed to run without error if executed more than once.
But an operation such as CreateOrUpdate is re-runnable (idempotent).
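A small Python sketch of that difference, using a plain dict as a stand-in for a table. The names create, create_or_update and AlreadyExists are invented for this example and do not refer to any particular database API.

```python
class AlreadyExists(Exception):
    """Raised by the toy create() when the key is already present."""


def create(table, key, value):
    # Not re-runnable: the second call for the same key fails.
    if key in table:
        raise AlreadyExists(key)
    table[key] = value


def create_or_update(table, key, value):
    # Re-runnable: running it again leaves the table in the same state.
    table[key] = value


users = {}
create_or_update(users, "alice", {"role": "admin"})
create_or_update(users, "alice", {"role": "admin"})  # safe to repeat

try:
    create(users, "alice", {"role": "admin"})
    create_failed_on_rerun = False
except AlreadyExists:
    create_failed_on_rerun = True
```

Here the repeated create_or_update calls leave a single unchanged entry, while create refuses to run again once the entry exists.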
Idempotent Operations: Operations that have no additional side effects if executed multiple times.
Example: An operation that retrieves values from a data resource and, say, prints them
Non-Idempotent Operations: Operations that would cause some harm if executed multiple times (as they change some values or states each time).
Example: An operation that withdraws from a bank account
It is any operation where the result of every nth application matches the result of the first application. For instance, the absolute value of -1 is 1. The absolute value of the absolute value of -1 is 1. The absolute value of the absolute value of the absolute value of -1 is 1. And so on. See also: When would be a really silly time to use recursion?
An idempotent operation over a set leaves its members unchanged when applied one or more times.
It can be a unary operation like absolute(x), where x belongs to the set of positive integers. Here absolute(absolute(x)) = absolute(x) = x.
It can be a binary operation, like the union of a set with itself, which always returns the same set.
cheers
In short, an idempotent operation is one that does not produce different results no matter how many times you perform it.
For example, according to the HTTP specification, GET, HEAD, PUT, and DELETE are idempotent operations; however, POST and PATCH are not. That's why POST is sometimes replaced by PUT.
An operation is said to be idempotent if executing it multiple times is equivalent to executing it once.
For example: setting the volume to 20.
No matter how many times the volume of the TV is set to 20, the end result will be that the volume is 20. Even if a process executes the operation 50, 100, or more times, at the end of the process the volume will be 20.
Counterexample: increasing the volume by 1. If a process executes this operation 50 times, the volume at the end will be the initial volume + 50, and if a process executes the operation 100 times, the volume at the end will be the initial volume + 100. As you can clearly see, the end result varies based on how many times the operation was executed. Hence, we can conclude that this operation is NOT idempotent.
If you think in terms of programming, let's say that I have an operation in which a function f takes foo as the input and the output of f is assigned back to foo. If at the end of the process (which executes this operation 50, 100, or more times) my foo variable holds the value that it held when the operation was executed only ONCE, then the operation is idempotent; otherwise it is NOT.
foo = <some random value here, let's say -2>
{ foo = f( foo ) }   curly brackets outline the operation
If f returns the square of the input, then the operation is NOT idempotent, because foo at the end will be (-2) raised to the power 2^n, where n is the number of times the operation is executed.
If f returns the absolute value of the input, then the operation is idempotent, because no matter how many times the operation is executed, foo will be abs(-2).
Here, end result is defined as the final value of variable foo.
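The two cases can be tried out with a short Python sketch. run_operation and square are names made up here; run_operation simply repeats the "foo = f(foo)" step a given number of times and returns the final value of foo.

```python
def run_operation(f, foo, times):
    """Repeat the operation `foo = f(foo)` and return the final foo."""
    for _ in range(times):
        foo = f(foo)
    return foo


def square(x):
    return x * x


abs_once = run_operation(abs, -2, 1)
abs_many = run_operation(abs, -2, 50)          # same end result as running once

square_once = run_operation(square, -2, 1)     # (-2) ** 2
square_thrice = run_operation(square, -2, 3)   # ((-2) ** 2) ** 2) ** 2 = (-2) ** 8
```

With abs the end result is 2 whether the operation runs once or fifty times; with square it is 4 after one run and 256 after three, so the end result depends on the number of executions.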
In the mathematical sense, idempotence has a slightly different meaning:
f(f(....f(x))) = f(x)
Here the output of f(x) is passed as input to f again, which need not always be the case in programming.
my 5c:
In integration and networking, idempotency is very important.
Several examples from real life:
Imagine we deliver data to a target system, and the data is delivered as a sequence of messages.
1. What would happen if the sequence gets mixed up in the channel (as network packets always do :) )? If the target system is idempotent, the result will not be different. If the target system depends on the right order of the sequence, we have to implement a resequencer on the target side, which restores the right order.
2. What would happen if there are duplicate messages? If the channel of the target system does not acknowledge in time, the source system (or the channel itself) usually sends another copy of the message. As a result, we can have duplicate messages on the target system side.
If the target system is idempotent, it takes care of this and the result will not be different.
If the target system is not idempotent, we have to implement a deduplicator on the target system side of the channel.
For a workflow manager (such as Apache Airflow), if an idempotent operation fails in your pipeline, the system can retry the task automatically without affecting the system. Even if the logs change, that is good because you can see the incident.
The most important point in this case is that your system can retry the task that failed without messing up the pipeline (e.g. by appending the same data to a table on each retry).
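The difference between a retry-safe and a retry-unsafe load step can be sketched in plain Python. Nothing here uses the Airflow API; load_append, load_overwrite and the dict-of-partitions "table" are assumptions made purely for illustration.

```python
def load_append(table, day, rows):
    """Not idempotent: every retry appends the same rows again."""
    table.setdefault(day, []).extend(rows)


def load_overwrite(table, day, rows):
    """Idempotent: a retry replaces the day's partition with the same rows."""
    table[day] = list(rows)


rows = [("order-1", 10), ("order-2", 25)]
appended, overwritten = {}, {}

for _attempt in range(3):  # simulate a task that ends up running three times
    load_append(appended, "2024-01-01", rows)
    load_overwrite(overwritten, "2024-01-01", rows)
```

After three attempts the appended table holds every row three times, while the overwritten partition holds exactly the original rows, so only the second variant is safe to retry automatically.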
Let's say the client makes a request to the "InstanceA" service, which processes the request, passes it to the DB, and shuts down before sending the response. Since the client does not see that the request was processed, it will retry the same request. The load balancer will forward the request to another service instance, "InstanceB", which will make the same change to the same DB item.
We should use idempotency tokens. When a client sends a request to a service, the request should carry some kind of request ID that can be saved in the DB to show that we have already executed the request. If the client retries the request, "InstanceB" will check the request ID. Since that particular request has already been executed, it will not make any change to the DB item. Those kinds of requests are called idempotent requests: we can send the same request multiple times, but only the first one changes anything.
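A minimal Python sketch of that flow follows. AccountService, deposit and the dict used as the shared "DB" are hypothetical; a real service would also need the lookup and the update to happen in one atomic transaction, which this single-threaded toy does not model.

```python
class AccountService:
    """One service instance; all instances share the same `db` dict."""

    def __init__(self, db):
        self.db = db

    def deposit(self, request_id, amount):
        processed = self.db["processed"]
        if request_id in processed:
            # Already executed: replay the stored result, change nothing.
            return processed[request_id]
        self.db["balance"] += amount
        processed[request_id] = self.db["balance"]  # remember the outcome
        return processed[request_id]


db = {"balance": 100, "processed": {}}
instance_a = AccountService(db)
instance_b = AccountService(db)

first = instance_a.deposit("req-42", 50)  # response is lost, client retries
retry = instance_b.deposit("req-42", 50)  # same request ID, other instance
other = instance_b.deposit("req-43", 50)  # a genuinely new request
```

The retry with the same request ID returns the stored result without touching the balance again, while a request with a new ID is applied normally.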