Index of the community from iGraph community algorithms - igraph

Does the index of the community from any of the iGraph community algorithms have a meaning? e.g., if I use fc <- fastgreedy.community(g) and I get communities 1,2,3: does community 1 mean it was the strongest community because it merged first in the algorithm or are they just labels?

There is no ordering; the community IDs that igraph uses are only arbitrary numbers.

Related

Counting the number of multiply-add operations (MAC) in Caffe CNN's architecture

Lately I've been benchmarking some CNNs regarding time, # of multiply-add operations (MAC), # of parameters and model size. I have seen some similar SO questions (here and here) and in the latter, they suggest using Netscope CNN Analyzer. This tool allows me to calculate most of the things I need just by inputing my Caffe network definition.
However, the number of multiply-add operations I've seen in papers and online for some architectures doesn't match what Netscope outputs, whereas other architectures match. I'm always comparing either FLOPs or MAC with the MACC column in Netscope, but there is a ~10x factor that I'm missing somewhere (see the table below for more detail).
Architecture   MAC (paper/internet)   MACC column in Netscope
VGG 16         ~15.5G                 ~157G
GoogLeNet      ~1.55G                 ~16G
Reference about GoogLeNet macc number and VGG16 macc number in Netscope.
Could anybody who has used this tool point out what mistake I'm making when reading the Netscope output?
I've found what was causing the discrepancy between Netscope and the information I'd found in papers. Most preset architectures in Netscope were using a batch size of 10 (this is the case for VGG and GoogLeNet, for example), hence the 10x factor in the number of multiply-add operations.
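The batch-size correction described above is simple arithmetic, sketched here in Python. The helper name `per_image_macc` is hypothetical, and the totals are the approximate figures quoted in the question, not values read from Netscope itself.

```python
def per_image_macc(total_macc: float, batch_size: int) -> float:
    """Divide a batch-level multiply-accumulate count by the batch size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return total_macc / batch_size


# Approximate totals from the question's table (operations, not G-operations).
netscope_totals = {"VGG 16": 157e9, "GoogLeNet": 16e9}

# Batch size of 10, as reported in the answer for these presets.
for name, total in netscope_totals.items():
    per_image = per_image_macc(total, 10)
    print(f"{name}: ~{per_image / 1e9:.1f}G MACC per image")
```

Dividing by the batch size brings both figures back in line with the per-image numbers usually quoted in papers.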

Database choice for a dictionary

I want to create a semantic text analyzer. To do that I need to store a lot of word roots in a database: the basic vocabulary of the language, which is about one hundred thousand words.
Is there a pattern or common architecture for this, and what kind of database should I use: relational or NoSQL (probably MongoDB)?
There are 26 letters, and many thousands of words can start with each. If I use a relational database, should I create 26 tables, one per letter? If I use NoSQL, should I store them all together?
Oracle SPARQL loaded with WORDNET is a good start.

Searching for effective way to store graph with 3 million vertices in MySQL

The goal is to make many cyclic chains in a graph with 3 million vertices.
The question is how to store the edges in a MySQL database while keeping the search for cyclic chains fast, perhaps using Dijkstra's algorithm?
This is really a job for a graph database. Neo4j is an excellent choice.

What are the differences between genetic algorithms and genetic programming?

I would like to have a simple explanation of the differences between genetic algorithms and genetic programming (without too much programming jargon). Examples would also be appreciated.
Apparently, in genetic programming, solutions are computer programs. On the other hand, genetic algorithms represent a solution as a string of numbers. Any other differences?
Genetic algorithms (GA) are search algorithms that mimic the process of natural evolution, where each individual is a candidate solution: individuals are generally "raw data" (in whatever encoding format has been defined).
Genetic programming (GP) is considered a special case of GA, where each individual is a computer program (not just "raw data"). GP explores the algorithmic search space and evolves computer programs to perform a defined task.
Genetic programming and genetic algorithms are very similar. They are both used to evolve the answer to a problem, by comparing the fitness of each candidate in a population of potential candidates over many generations.
Each generation, new candidates are found by randomly changing (mutation) or swapping parts (crossover) of other candidates. The least 'fit' candidates are removed from the population.
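The generational loop described above (mutation, crossover, removing the least fit) can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the bit-string encoding, the "count the 1 bits" fitness function, and all parameter values are assumptions chosen to keep the example small.

```python
import random


def fitness(candidate):
    # Toy fitness (an assumption for illustration): number of 1 bits.
    return sum(candidate)


def mutate(candidate, rng):
    # Mutation: flip one randomly chosen bit.
    i = rng.randrange(len(candidate))
    child = list(candidate)
    child[i] = 1 - child[i]
    return child


def crossover(a, b, rng):
    # Single-point crossover: head of one parent, tail of the other.
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def evolve(pop_size=20, length=16, generations=40, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            a, b = rng.sample(population, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        # Keep the fittest pop_size candidates; the least fit are removed.
        ranked = sorted(population + children, key=fitness, reverse=True)
        population = ranked[:pop_size]
    return population


best = evolve()[0]
print(fitness(best), best)
```

Because parents compete with their children for survival, the best fitness in the population never decreases from one generation to the next in this sketch.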
Structural differences
The main difference between them is the representation of the algorithm/program.
A genetic algorithm is represented as a list of actions and values, often a string. For example:
1*x*3-5*6
A parser has to be written for this encoding, to understand how to turn this into a function. The resulting function might look like this:
function(x) { return 1 * x * 3 - 5 * 6; }
The parser also needs to know how to deal with invalid states, because mutation and crossover operations don't care about the semantics of the algorithm. For example, the following string could be produced: 1+/3-2*. An approach needs to be decided to deal with these invalid states.
A genetic program is represented as a tree structure of actions and values, usually a nested data structure. Here's the same example, illustrated as a tree:
        -
      /   \
     *     *
    / \   / \
   1   *  5   6
      / \
     x   3
A parser also has to be written for this encoding, but genetic programming does not (usually) produce invalid states because mutation and crossover operations work within the structure of the tree.
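The contrast between the two encodings can be sketched in Python. Everything here is an illustrative assumption rather than part of the answer: the nested-tuple tree format, the helper names, the leaf values, and the use of Python's own parser as the validity check for the string encoding.

```python
import ast
import random

# The example expression as a tree: (operator, left, right) or a leaf.
TREE = ("-", ("*", 1, ("*", "x", 3)), ("*", 5, 6))


def is_valid_expression(text):
    # Validity check for the string encoding (assumption: Python syntax).
    try:
        ast.parse(text, mode="eval")
        return True
    except SyntaxError:
        return False


def mutate_string(text, rng):
    # Character-level mutation knows nothing about syntax.
    i = rng.randrange(len(text))
    return text[:i] + rng.choice("0123456789x+-*/") + text[i + 1:]


def mutate_tree(node, rng):
    # Leaves are replaced by leaves and operators by operators,
    # so the result is always a well-formed tree.
    if not isinstance(node, tuple):
        return rng.choice(["x", 1, 2, 3, 5, 6])
    op, left, right = node
    choice = rng.randrange(3)
    if choice == 0:
        return (rng.choice("+-*"), left, right)
    if choice == 1:
        return (op, mutate_tree(left, rng), right)
    return (op, left, mutate_tree(right, rng))


def evaluate(node, x):
    # Recursively evaluate a tree for a given value of x.
    if node == "x":
        return x
    if isinstance(node, (int, float)):
        return node
    op, left, right = node
    a, b = evaluate(left, x), evaluate(right, x)
    return {"+": a + b, "-": a - b, "*": a * b}[op]


rng = random.Random(0)
strings = [mutate_string("1*x*3-5*6", rng) for _ in range(1000)]
invalid = sum(not is_valid_expression(s) for s in strings)
print(f"invalid string mutants: {invalid} of {len(strings)}")
print("tree mutant evaluates to:", evaluate(mutate_tree(TREE, rng), 2))
```

Some string mutants fail to parse and need special handling, while every tree mutant can be evaluated directly because the mutation operator respects the tree structure.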
Practical differences
Genetic algorithms
Inherently have a fixed length, meaning the resulting function has bounded complexity
Often produce invalid states, so these need to be handled non-destructively
Often rely on operator precedence (e.g. in our example multiplication happens before subtraction) which could be seen as a limitation
Genetic programs
Inherently have a variable length, meaning they are more flexible, but often grow in complexity
Rarely produce invalid states, and these can usually be discarded
Use an explicit structure to avoid operator precedence entirely
To make it simple (the way I see it), Genetic Programming is an application of Genetic Algorithms. The Genetic Algorithm is used to create another solution in the form of a computer program.
Practical answer:
GA is when you take a population and evolve it over generations toward a better state.
(For example, how humans have evolved from animals to people, by breeding and getting better genes.)
GP is when, given a known definition of the problem, you generate code that solves the problem better.
(GP will usually give a lot of if/else statements that explain the solution.)
Lots of good partial answers above. As Koza put it in his seminal texts on the subject, "[if a GA was the best solution for a problem then a GP would evolve a GA to solve it]." Simply put, a GP is a type of GA that evolves programs that are evaluated by a cost function. The fact that the genome is a program rather than a collection of inputs for the cost function IMHO is the material difference.
https://en.wikipedia.org/wiki/Genetic_programming
Genetic programming is much more powerful than genetic algorithms. The output of a genetic algorithm is a quantity, while the output of genetic programming is another computer program.

What datatype is suitable for storing latitudes and longitudes?

What should the data type of latitudes and longitudes be in general?
In MySQL, it should be Point, so that you can run efficient SPATIAL queries.
For SQL Server 2000 and 2005, I've seen Numeric(15,10) for each coordinate used most often although I cannot speak to the correctness of that.
SQL Server 2008 has new spatial data types. See here for more on those (it's a little more than halfway down the page).
I would say double if you want to store them in separate columns.
I have used varchar2(40) under Oracle.
Depends on the language/DB you are using.
In general, it's preferable to store it as a float in a programming language,
and as NUMBER(x,y) in sqlplus.
In general, an int (on Intel x86 in 32-bit mode) is sufficient (in 100 (") representation).
The sign is used to indicate N/S and E/W.
brgds/gabriel
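The integer suggestion in the last answer can be checked with a short Python sketch. It assumes that the "100 (")" notation means hundredths of an arc second, and that negative values stand for south and west; both are interpretations, not something the answer states explicitly. The function names are hypothetical.

```python
UNITS_PER_DEGREE = 3600 * 100  # hundredths of an arc second per degree (assumed unit)
INT32_MAX = 2**31 - 1


def encode(degrees: float) -> int:
    """Convert signed decimal degrees to a scaled integer."""
    value = round(degrees * UNITS_PER_DEGREE)
    if abs(value) > INT32_MAX:
        raise OverflowError("value does not fit in a signed 32-bit integer")
    return value


def decode(value: int) -> float:
    """Convert the scaled integer back to signed decimal degrees."""
    return value / UNITS_PER_DEGREE


# Assumed sign convention: negative means south (latitude) or west (longitude).
for degrees in (90.0, -90.0, 180.0, -180.0, 48.85837):
    stored = encode(degrees)
    print(degrees, "->", stored, "->", decode(stored))
```

Under these assumptions even the extreme longitude of 180 degrees needs only 64,800,000 units, which is well inside the signed 32-bit range.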