Modeling ERGM with two edge attributes in R - igraph

I have a scenario, where one edge attribute affects another edge attribute. Is there a way to model this in ERGM? For instance, consider outsourcing contracts where complexity of the contract affects the value of contract. I would like to model valued ERGM in a way that I can predict a network with contract value using complexity.
I know we could use node attributes in modeling the ERGM, is it possible to model two edge attributes? Please also let me know if we could model this with methods other than ERGMs.

The first question is whether ergm is necessarily the best tool to use here. If you are interested in modelling the value of the contract given that it exists, you might not need the full ERGM machinery. Instead, you might be able to get away with an linear or generalised linear model with appropriate covariates. It's only if you need to model the existence of the contract and its value that you need the valued ERGM machinery.
If you do require ERGMs, you should be able to use the edgecov term in a valued ERGM. You need to be careful in constructing the network, though, since an edge has to exist in order to have a value. The most reliable approach is probably to construct a matrix with the predictor edge's values, attach it to the network as a network (%n%) attribute, and reference it in the term.

Related

Is it possible to use attention mechanism for dealing with non-sequential data in neural networks?

I'm still trying to understand the attention mechanism. It is not clear to me what query, key, and value mean yet, for example.
However, my main issue is regarding how to apply attention in my use cases. Most of the examples that I found using attention were for dealing with sequential data, like tokens in sentences.
For example, I have a use case in which I need to classify the relationship between pairs of objects, represented by 100 features each. Thus, I was thinking about building a NN architecture with two inputs (each input will represent a description of some object) and one output (the relationship between those objects). Is it possible to use attention mechanisms in this case?

Q-learning, how about picking the action that actually gives most reward?

So in Q learning, you update the Q function by Qnew(s,a) = Q(s,a) + alpha(r + gamma*MaxQ(s',a) - Q(s,a).
Now, if I were to use the same principle but change Q to V function, instead of performing the action based on the current V function, you actually perform all actions (assuming you can reset the simulated environment), and select the best action out of those, and update the V function for that state. Would this yield a better result?
Of course, the training time would probably increase because you actually do all the actions once for each update, but since you are guaranteed to select the best action each time (except when exploring), it would give you a global optimum policy in the end?
This is a bit similar to value iteration, except I'm don't have and not building a model for the problem.
Now, if I were to use the same principle but change Q to V function, instead of performing the action based on the current V function, you actually perform all actions (assuming you can reset the simulated environment), and select the best action out of those, and update the V function for that state. Would this yield a better result?
It is typically assumed in Reinforcement Learning that we do not have the ability to reset the (simulated) environment. Sure, when we're working on simulations it often may technically be possible, but generally we hope that work in RL can also extend to "real-world" problems outside of simulations afterwards, where that would no longer be possible.
If you do have that possibility, it would generally be recommended to look into search algorithms like Monte-Carlo Tree Search, rather than Reinforcement Learning like Sarsa, Q-learning, etc. I suspect your suggestion might work slightly better than Q-learning indeed in this case, but things like MCTS would be even better.
Now, if I were to use the same principle but change Q to V function, instead of performing the action based on the current V function, you actually perform all actions (assuming you can reset the simulated environment), and select the best action out of those, and update the V function for that state. Would this yield a better result?
Given that you don't have access to the model, you have to resort to model free methods. What you are suggesting is basically a Dynamics programming backup. See the slides 28 - 31 in David Silver's lecture notes for various backup strategies to iterate on the value function.
However, note that this is just for prediction (i.e. estimating the value function for a given policy) and not for control (figuring out the best policy). There won't be a Max involved in prediction. To do control, you can use the above policy evaluation + greedy policy improvement to arrive at a "policy iteration based on Dynamic prog backup policy evaluation" method.
The other options for model-free control are SARSA [+ greedy policy improvement] (on policy) and Q-learning (off-policy). These are Q-function based methods, though.
If you are just trying to win the game, and not necessarily interested in RL techniques discussed above, then you also have the choice of using purely planning based methods (like Monte Carlo Tree Search). Finally, you can combine planning and learning with methods such as Dyna.

Conceptual confusion about numerical methods

I have a conceptual question about numerical methods. What is the difference among finite element, continuous finite element, discontinuous finite element, continuous galerkin and discontinuous galerkin methods? Are some of them just the same thing?
Thanks in advance
Finite element methods are a subset of numerical methods (which also include finite volume, finite difference, Monte Carlo and a lot more).
Simply put, in finite element methods, one tries to approximate the solution to a problem with a linear combination of pre-defined basis functions. These basis functions can be chosen to be either continuous or discontinuous. The resulting numerical methods are called CG/DG (continuous/discontinuous Galerkin) methods. In a DG method, the basis functions are only piecewise continuous: each basis function is zero everywhere in the domain, except in one element. See also this excellent Wikipedia article, which features some very nice figures.
Discontinuous Garlerkin methods were originally popularised in the field of particle transport long ago, but they have recently gained ground in other fields as well. (This is mostly because it wasn't clear at first how discontinuous basis functions would work well in equations that involve diffusion, but that problem has been solved now.)
Just a small pedantic correction to #A. Hennink's answer - In DG methods, the state variable is not piece-wise continuous between basis functions. There is a non-physical jump in the state variable between basis functions, hence the discontinuous part in the name. This can be visualized in the following figure displaying (dis)continuous basis functions of CG and DG:
In CG methods, the basis functions are piece-wise continuous, meaning continuous in the state variable itself, but the derivative may not be continuous (i.e. there may be a discontinuity in the derivative of the state variable between basis functions). This means that the solution to what ever problem you're solving can have non-physical kinks in the solution. Note the 'kinks' in the following solution
Some basis functions are not always discontinuous in the derivative though (see Hermite basis functions). See how the following Hermite basis function has a derivative of zero at both the end points. If all basis functions are composed using Hermite polynomials, then the derivative is continuous between basis functions because it's zero at the boundaries. Below, Psi_1 is the derivate of Psi_0, and Psi_3 is the derivative of Psi_2:

DDD: The conondrum of Side-Effect-Free functions

I apologize for so many questions, but I felt that they make the most sense only when treated as a unit
Note - all quotes are from DDD: Tackling Complexity in the Heart of Software ( pages 250 and 251 )
1)
Operations can be broadly divided into two categories, commands and
queries.
...
Operations that return results without producing side effects are
called functions. A function can be called multiple times and return
the same value each time.
...
Obviously, you can't avoid commands in most software systems, but the
problem can be mitigated in two ways. First, you can keep the commands
and queries strictly segregated in different operations. Ensure that
the methods that cause changes do not return domain data and are kept
as simple as possible. Perform all queries and calculations in methods
that cause no observable side effects
a) Author implies that a query is a function since it doesn't produce side effects. He also notes that function will always return same value, by which I assume he means that for the same input we will always get the same output?
b) Assume we have a method QandC(int entityId) which queries for specific domain entity, from which it extracts certain values, which in turn are used to initialize a new Value Object and this VO is then returned to the caller. Isn't according to above quote QandC a function, since it doesn't change any state?
c) But author also argues that for same input a function will always produce same output, which isn't the case with QandC, since if we place several calls to QandC, it will produce different results, assuming that in the time between the two calls this entity was modified or even deleted. As such, how can we claim QandC is a function?
d)
Ensure that the methods that cause changes do not return domain data
...
Reason being that the state of returned non-VO may be changed in some future operations and as such the side effects of such methods are unpredictable?
e)
Ensure that the methods that cause changes do not return domain data
...
Is a query method that returns an entity still considered a function, even if it doesn't change any state?
2)
VALUE OBJECTS are immutable, which implies that, apart from
initializers called only during creation, all their operations are
functions.
...
An operation that mixes logic or calculations with state change
should be refactored into two separate operations. But by definition,
this segregation of side effects into simple command methods only
applies to ENTITIES. After completing the refactoring to separate
modification from querying, consider a second refactoring to move the
responsibility for the complex calculations into a VALUE OBJECT. The
side effect often can be completely eliminated by deriving a VALUE
OBJECT instead of changing existing state, or by moving the entire
responsibility into a VALUE OBJECT.
a)
VALUE OBJECTS are immutable, which implies that, apart from
initializers called only during creation, all their operations are
functions ... But by definition, this segregation of side effects into
simple command methods only applies to ENTITIES.
I think author is saying all methods defined on VOs are functions, which doesn't make sense, since even though a method defined on a VO can't change its own state, it still can change the state of other, non-VO objects?!
b) Assuming method defined on an entity doesn't change any state, do we consider such a method as being a function, even though it is defined on an entity?
c)
... consider a second refactoring to move the responsibility for the
complex calculations into a VALUE OBJECT.
Why is author suggesting we should only refactor from entities those function that perform complex calculations? Why instead shouldn't we also refactor simpler functions?
d)
... consider a second refactoring to move the responsibility for the
complex calculations into a VALUE OBJECT.
In any case, why is author suggesting we should refactor functions out of entities and place them inside VOs? Just because it makes it more apparent to the client that this operation MAY be a function?
e)
The side effect often can be completely eliminated by deriving a VALUE
OBJECT instead of changing existing state, or by moving the entire
responsibility into a VALUE OBJECT.
This doesn't make sense, since it appears author is arguing if we move a command ( ie operation which changes the state ) into a VO, then we will in essence eliminate any side-effects, even if command is changing the state. So any ideas, what was author actually trying to say?
UPDATE:
1b)
It depends on the perspective. A database query does not change state
and thus has no side effects, however it isn't deterministic by
nature, since as you point out the data can change. In the book, the
author is referring to functions associated with value object and
entities, which don't themselves make external calls. Therefore, the
rules don't apply to QandC.
So author was describing only functions that don't make external calls and as such QandC isn't a type of function that author was describing?
1c)
QandC does not itself change state - there are no side effects. The
underlying state may be changed out of band however. Due to this, it
is not a pure function.
But it also isn't the Side-Effect-Free function in the sense author defined them?
1d)
Again, this is based on CQS.
I know I'm repeating myself, but I assume discussion in the book is based on CQS and CQS doesn't consider QandC as Side Effect Free function due to a chance of entity returned by QandC having its state modified ( by some other operation ) sometime in the future?
1e)
It is considered a query from the CQRS perspective, but it cannot be
called a function in the sense that a pure function on a VO is a
function due to lack of determinism.
I don't quite understand what you were trying to say ( the confusing part is in bold ). Perhaps that while QandC is considered a query, it is not considered a function due to returning an entity and such the side-effects are unpredictable, which makes QandC a non-deterministic by nature
So author is only making those statements ( see quote in 1e ) under the implicit assumption that no operation defined in VO will ever try to change the state of non-VO objects?
2d)
Given that VOs are immutable, they are a fitting place to house pure
functions. This is another step towards freeing domain knowledge from
technical constraints.
I don't understand why moving function from entity to VO would help free domain knowledge from technical constraints ( I'm also not really sure what you mean by technical – technical as in technology-related or... )?
I assume other reason for putting function in VO is because it is that much more obvious ( to client ) that this is a function?
2e)
I view this as a hint towards event-sourcing. Instead of changing
existing state, you add a new event which represents the change. There
is still a net side effect, however existing state remains stable.
I must confess I know nothing about even-source programming, since I'd like to first wrap my head around DDD. Anyway, so author didn't imply that just moving a command to VO would automatically eliminate side-effects, but instead some additional actions would have to be taken ( such as implementing event-sourcing ), only he "forgot" to mention that part?
SECOND UPDATE:
2d)
One of the defining characteristics of an entity is its identity ....
By placing business logic into VOs you can consider it outside of the
context of an entity's identity. This makes it easier to test this
logic, among other things.
I somehwat understand the point you're making ( when thinking about the concept from distance ), but on the other hand I really don't. Why would function within an entity be influenced by an identity of this entity ( assuming this function is pure function, in other word it doesn't change state and is deterministic )?
2e)
Yes that is my understanding of it - there is still a net "side
effect". However, there are different ways to attain a side effect.
One way is to mutate existing state. Another way is to make the state
change explicit with an object representing that change.
I - Just to be sure ... From your answer I gather that author didn't imply that side-effects would be eliminated simply by moving a command into VO?
II - Ok,if I understand you correctly, we can move a command into VOs ( even though VOs shouldn't change the state of anything and as such shouldn't cause any side-effects ) and this command inside VO is still allowed to produce some sort of side effects, but this side effect is somehow more acceptable ( OR MORE CONTROLLABLE ) by making state change explicit ( which I interpret as the thing that changed is returned to the caller as VO )?
3) I must say that I still don't quite understand why state-changing method SC shouldn't return domain objects. Perhaps because non-VO may be changed in some future operations and as such the side effects of SC are very unpredictable?
THIRD UPDATE:
Delegating the management of state to the entity and the
implementation of behavior to VOs creates certain advantages. One is
basic partitioning of responsibilities.
a) You're saying that even though a method describes a behavior of an entity ( and thus entity containing this method adheres to SRP ) and as such belongs in the entity, it may still be a good idea to move it into VO? Thus in essence, we would partition a responsibility of an entity into two even smaller responsibilities?
b) But won't moving behavior into VO basically turn this entity into a mere data container ( I understand that entity will still manage its state, but still ... )?
thank you
1a) Yes. The discourse on separating queries from commands is based on the Command-query separation principle.
1b) It depends on the perspective. A database query does not change state and thus has no side effects, however it isn't deterministic by nature, since as you point out the data can change. In the book, the author is referring to functions associated with value object and entities, which don't themselves make external calls. Therefore, the rules don't apply to QandC. Determinism could be fabricated however, offering degrees of "pureness". For instance, a serializable transaction could be created which can ensure that data doesn't change for its duration.
1c) QandC does not itself change state - there are no side effects. The underlying state may be changed out of band however. Due to this, it is not a pure function. However, the restriction that QandC doesn't change state is still valuable. The value is fittingly demonstrated by CQRS which is the application of CQS in distributed scenarios.
1d) Again, this is based on CQS. Another take on this is the Tell-Don't-Ask principle. Given an understanding of these principles however, the rule can be bent IMO. A side-effecting method could return a VO representing the result for instance. However, in certain scenarios such as CQRS + Event Sourcing it could be desirable for commands to return void.
1e) It is considered a query from the CQRS perspective, but it cannot be called a function in the sense that a pure function on a VO is a function due to lack of determinism.
2a) No, a VO function shouldn't change state of anything, it should instead return a new object.
2b) Yes.
2c) Because functional purity tends to become more important in more complex scenarios. However, as you point out, isn't a clear and definitive rule. It shouldn't be based on complexity as much as it is based on the domain at hand.
2d) Given that VOs are immutable, they are a fitting place to house pure functions. This is another step towards freeing domain knowledge from technical constraints.
2e) I view this as a hint towards event-sourcing. Instead of changing existing state, you add a new event which represents the change. There is still a net side effect, however existing state remains stable.
UPDATE
1b) Yes.
1c) It is a side-effect free function, however it is not a deterministic function because it cannot be thought to always return the same value given the same input. For example, the function that returns the current time is a side-effect free function, but it certainly does not return the same value in subsequent calls.
1d) QandC can be thought of as side-effect free, but not pure. Another way to look at functional purity is as referential transparency - the ability to replace a function call by its value without changing program behavior. In other words, asking the question does not change the answer. QandC can guarantee that, but only within a context such as a transaction. So QandC can be thought of as a function, but only in a specific context.
1e) I think the confusing part is that the author is talking specifically about functions on VOs and entities - not database queries, where as we are talking about both. My statement extends the discussion to database queries and CQRS given certain restrictions, ie an ambient transaction.
2d) I can see how what I said was a bit vague, I was getting lazy. One of the defining characteristics of an entity is its identity. It maintains its identity throughout its life-cycle while its state may change. By placing business logic into VOs you can consider it outside of the context of an entity's identity. This makes it easier to test this logic, among other things.
2e) Yes that is my understanding of it - there is still a net "side effect". However, there are different ways to attain a side effect. One way is to mutate existing state. Another way is to make the state change explicit with an object representing that change.
UPDATE 2
2d) This particular point can be argued or can be a matter of preference. One perspective is the idea is based on the single-responsibility principle (SRP). The responsibility of an entity is the association of an identity with behavior and state. Behavior combines input with existing state to produce state transitions. Delegating the management of state to the entity and the implementation of behavior to VOs creates certain advantages. One is basic partitioning of responsibilities. Another is more subtle and perhaps more arguable. It is the idea that logic can be considered in a stateless manner. This allows thinking about such logic easier and more like thinking about a mathematical equation where all changes are explicit - no hidden state.
2e.1) Yes, eliminating a net side effect would alter behavior, which is not the goal.
2e.2) Yes.
3) Commands returning void have several advantages. One is that they become naturally more adept in async scenarios - no need to wait for a result. Another is that it allows you to represent the operation as a single command object - again, because there is no return value. This applies in CQRS and also event sourcing. In these cases, any command output is dispatched as an event instead of a result. But again, if these requirements don't apply returning a result object can be appropriate.
UPDATE 3
a) Yes, and this is a specific type of partitioning.
b) The responsibility of the entity is to coordinate behavior by delegating to VOs and applying the resulting state changes.

What's the best way to model an unordered list (i.e., a set)?

What's the most natural way to model a group of objects that form a set? For example, you might have a bunch of user objects who are all subscribers to a mailing list.
Obviously you could model this as an array, but then you have to order the elements and whoever is using your interface might be confused as to why you're encoding arbitrary ordering data.
You can use a hash where the members are keys that map to "1" or "true", but in most languages there are restrictions on what data types a hash key can be.
What's the standard way to do this in modern languages (PHP, Perl, Ruby, Python, etc)?
In Python, you would use the set datatype. A set supports containing any hashable object, so if you have a custom class you need to store in a set and the default hashable behaviour is not appropriate, you can implement __hash__ to implement the behaviour you want.
C# has the HashSet<T> generic collection.
public class EmailAddress // probably needs to override GetHashCode()
{
...
}
var addresses = new HashSet<EmailAddress>();
Most modern languages are going to have some form of Set data structure. Java has HashSet, which implements the Set interface.
In PHP you can use an array to store your data. Either search the array before you add a new element, or use array_unique to remove duplicates after inserting all elements.
In c as a stand-in for understanding the machine directly:
For small, discrete and well defined ranges: use a bitwise array to indicate the presence of each possible item (set for present, unset for absent).
Use a hash-table for all other cases.
Write functions to implement adding and removing items, testing for presence or absence, testing for sub-sets, etc as needed.
As the other answers note, however, if you just want the functionality, use a language feature or third-party library that is already well debugged.
A lot of the time hash-based sets are the correct thing to use, but if you don't need to do key-based lookups and don't worry about enforcing unique values, a vector or list is fine. There is overhead to a hash table, after all.
You seem to be concerned that people will think that the order in the vector is important, but I think that it is a common enough usage that, with documentation, you shouldn't confuse people.
It really depends on how you want to access and use the data.
and Array is usually the simplest way to store data, without any other requirements. Usually other data types are used for different reasons (you want to append data, you want to search data in constant time, you need quick set union/intersection, etc) If your only concern is the abstraction you could wrap it in some kind of unordered facade.
In Perl I would use a hash, definitely. In other languages I would lament the lack of a hash.