Sometimes I would like to overlay a conceptual graph on top of a map to provide additional context about where elements belong. For example, if I wanted to show social relationships between people in different countries, I might want to have the people located in their appropriate country, but with the layout within those countries being automated.
I've drawn (poorly) a picture to help illustrate what I'm hoping to do.
I found this example, but this appears to be a fake geography with clustering. What I would like is a real map where entities are contained inside their correct region, but where the entities themselves are automagically arranged.
I don't think it's possible, either, at least not in a strict sense.
However, you might try adding an invisible node in the middle of each country (use "pos" to pin its position and "style=invis" with no label so nothing is drawn), and then link the people who live there to it with a short edge (use "len" to constrain the length and "weight" to make the edge more important in the layout). You might add more than one such node to a country if it has a funky shape.
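A minimal sketch of that idea, assuming the graph is rendered with neato (which honours pinned positions via pos="x,y!" and the len/weight edge attributes). The countries, coordinates, people and friendships below are made-up placeholders, and the DOT file is emitted from a small script:

```python
# Emit a DOT graph with an invisible, pinned anchor node per country and
# invisible "gravity" edges pulling each person toward their country's anchor.
anchors = {"France": (200, 300), "Spain": (150, 150)}   # rough map coordinates (placeholders)
people = {"alice": "France", "bob": "France", "carol": "Spain"}
friends = [("alice", "carol"), ("bob", "alice")]

lines = ["graph world {"]
for country, (x, y) in anchors.items():
    # Pinned, invisible anchor roughly in the middle of each country.
    lines.append(f'  {country} [pos="{x},{y}!", style=invis, label=""];')
for person, country in people.items():
    lines.append(f"  {person};")
    # Short, heavy, invisible edge keeps the person close to its country's anchor.
    lines.append(f"  {person} -- {country} [len=0.5, weight=10, style=invis];")
for a, b in friends:
    lines.append(f"  {a} -- {b};")
lines.append("}")

with open("world.dot", "w") as f:
    f.write("\n".join(lines))
```

Rendering with something like `neato -Tpng world.dot -o world.png` should pull each person toward their country's pinned anchor while the visible relationship edges arrange themselves automatically.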
I have read several of the posts concerning polygonal search, but they are all about fixing or updating the programs. I am just wondering how it works. Is there a way I can get something like pseudocode for it, or an explanation of how a shape captures the data points?
To further specify my goal: I am trying to make a fixed square that is held over a map (such as Google Maps). The map can move around behind the square, but the square should continue to report whatever cities lie within its bounds. [I will eventually proceed to building it, I just need some guidance]
Thank you.
There is an open-source library, Turf.js, which has a function to check whether a point lies inside a polygon. You can check the source code:
http://turfjs.org/static/docs/module-turf_inside.html
If you are looking for the theory behind it, check the hyperplane separation theorem.
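For a rough feel of how such a check works, here is a sketch of the standard ray-casting point-in-polygon test (not Turf's actual source): a point is inside a polygon if a horizontal ray drawn from it crosses the polygon's edges an odd number of times. The square and cities below are placeholder data.

```python
def point_in_polygon(point, polygon):
    """point: (x, y); polygon: list of (x, y) vertices, not repeated at the end."""
    x, y = point
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # Does the horizontal ray from (x, y) cross the edge between vertex i and j?
        if (yi > y) != (yj > y):
            x_cross = (xj - xi) * (y - yi) / (yj - yi) + xi
            if x < x_cross:
                inside = not inside
        j = i
    return inside

# For the fixed square over a moving map, recompute the square's lat/lng
# corners whenever the map pans, then test every city's coordinates against it.
square = [(-1.0, 50.0), (1.0, 50.0), (1.0, 52.0), (-1.0, 52.0)]   # (lng, lat) corners
cities = {"London": (-0.13, 51.51), "Paris": (2.35, 48.86)}
visible = [name for name, p in cities.items() if point_in_polygon(p, square)]
print(visible)   # ['London']
```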
I have various product items and need to decide whether they are the same. A quick example:
Microsoft RS400 mouse with middle button should match Microsoft Red Style 400 three buttoned mouse but not Microsoft Red Style 500 mouse
There isn't anything else useful that I can match on apart from the name, and just going by the ratio of matching words isn't good enough (the error rate is far too high).
I do know about the domain and so I can (for example) hand write the fact that a three buttoned mouse is probably the same as a mouse with a middle button. I also know the manufacturers (or can take a very good guess at them).
The only thought I have had so far is to reduce the size of the string with hand-written rules and then check the matching words, but I wondered if anyone had ideas about a better way to do this matching with higher accuracy and precision (or where to start looking), and whether anyone knew of work that has been done in this area (papers, examples, etc.).
"I do know about the domain..."
How much exactly do you know about the domain? If you know everything about the domain, then you might be better off building an index of all your manufacturers' products (basically the description of each product from the manufacturer's webpage). Then, instead of trying to match your descriptions to each other, match them against your index of products.
Advantages to this approach:
presumably all words used in the description of the product have been used somewhere in the promotional literature
if when building the index you were able to weight some of the information (such as product codes) then you may have more success
Disadvantages:
may take a long time to create the index (especially if done by hand)
If you don't know everything about your domain, then you might consider down-ranking words that are very common (you can get lists of common words off the internet), and up-ranking numbers and words that aren't in a dictionary (you can also get word lists off the internet; most Linux/Unix distributions come with them for spell-checking purposes).
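As a rough illustration of that weighting idea (the word lists and weights below are arbitrary placeholders, not a recommended configuration):

```python
# Down-rank very common words, up-rank numbers and tokens that are not in a
# dictionary (often brand or model names), then score the weighted overlap.
COMMON_WORDS = {"with", "and", "the", "mouse", "three", "buttoned", "button"}
DICTIONARY = COMMON_WORDS | {"red", "style", "middle"}

def token_weight(token):
    t = token.lower()
    if any(ch.isdigit() for ch in t):     # product codes / model numbers
        return 3.0
    if t not in DICTIONARY:               # unknown words are often brand names
        return 2.0
    if t in COMMON_WORDS:                 # near-stopwords carry little signal
        return 0.2
    return 1.0

def weighted_overlap(a, b):
    ta, tb = set(a.lower().split()), set(b.lower().split())
    shared = sum(token_weight(t) for t in ta & tb)
    total = sum(token_weight(t) for t in ta | tb)
    return shared / total if total else 0.0

print(weighted_overlap("Microsoft RS400 mouse with middle button",
                       "Microsoft Red Style 400 three buttoned mouse"))
```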
I don't know how much you know about search, but in the past I've found the book "Search Engines: Information Retrieval in Practice" by W. Bruce Croft, Donald Metzler, and Trevor Strohman to be useful. There are some sample chapters on the publisher's website which will tell you whether the book is for you or not: pearsonhighered.com
Hope that helps.
In addition to hand-written rules, you may try to use supervised learning with feature extraction.
Let the features be the words in the description, then treat the descriptions as feature vectors.
When training the algorithm, have it show you two vectors that look similar by the ratio, and if they describe the same item, let the algorithm increase the weights for those words.
For example, each pair of words may get a bigger weight than the simple ratio you have used so far:
[3-button] [middle]
[wheel] [button]
[mouse] [mouse]
By your algorithm, this gives a similarity ratio of 1/3. When you mark it as "same item", the algorithm should add more value to those word pairs the next time it encounters them.
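A toy sketch of that update rule, assuming a simple additive bump whenever a human marks two descriptions as the same item (the tokenizer, bump size and scoring below are all placeholders):

```python
from collections import defaultdict
from itertools import product

pair_weights = defaultdict(float)          # e.g. ("3-button", "middle") -> learned bonus

def tokens(text):
    return set(text.lower().split())

def similarity(a, b):
    ta, tb = tokens(a), tokens(b)
    ratio = len(ta & tb) / len(ta | tb)    # the plain word-overlap ratio
    # Add the learned bonus for every cross-description pair of unmatched words.
    learned = sum(pair_weights[p] for p in product(ta - tb, tb - ta))
    return ratio + learned

def mark_same(a, b, bump=0.05):
    # Reward the word pairs that the plain ratio could not match.
    for p in product(tokens(a) - tokens(b), tokens(b) - tokens(a)):
        pair_weights[p] += bump

a = "3-button wheel mouse"
b = "middle button mouse"
print(similarity(a, b))   # low: only "mouse" matches
mark_same(a, b)
print(similarity(a, b))   # higher: pairs like ("3-button", "middle") now carry weight
```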
Just tokenize (you should separate numbers from letters in that step as well, so not just a whitespace tokenizer), stem, and filter stopwords and uninteresting words like "mouse". Perhaps you should also keep a list of producer names, and shorten every word that is not a producer name or a number to its first letter (if you do that, you also have to split on capital letters in the tokenizer):
Microsoft RS400 mouse with middle button -> Microsoft R S 400
Microsoft Red Style 400 three buttoned mouse -> Microsoft R S 400
Microsoft Red Style 500 mouse -> Microsoft R S 500
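A rough sketch of that normalisation, with an assumed producer list and an assumed set of "uninteresting" words chosen so that it reproduces the three examples above:

```python
import re

PRODUCERS = {"microsoft"}                  # assumed list of known producers
STOPWORDS = {"with", "mouse", "buttoned", "button", "middle", "three"}  # assumed

def normalise(name):
    # Split on whitespace, letter/digit boundaries and single capital letters.
    raw = re.findall(r"[A-Z][a-z]+|[A-Z]|[a-z]+|\d+", name)
    out = []
    for tok in raw:
        low = tok.lower()
        if low in STOPWORDS:
            continue                       # drop uninteresting words entirely
        if low in PRODUCERS or tok.isdigit():
            out.append(tok)                # keep producers and numbers as-is
        else:
            out.append(tok[0].upper())     # everything else -> first letter
    return " ".join(out)

for s in ("Microsoft RS400 mouse with middle button",
          "Microsoft Red Style 400 three buttoned mouse",
          "Microsoft Red Style 500 mouse"):
    print(normalise(s))
# Microsoft R S 400
# Microsoft R S 400
# Microsoft R S 500
```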
If you want a better solution, a VSM (vector space model) as used in plagiarism detection would be nice: every word gets a weight according to its discriminative value, the weighted words are projected into a multidimensional space, and you then measure the angle between two texts.
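A toy version of that comparison, using an IDF-style weighting over three tiny placeholder documents and cosine similarity as the angle measure:

```python
import math
from collections import Counter

docs = ["Microsoft R S 400", "Microsoft R S 400", "Microsoft R S 500"]

def idf(term):
    n = sum(1 for d in docs if term in d.split())
    return math.log(len(docs) / n) + 1.0   # +1 so terms in every doc still count a little

def vector(doc):
    counts = Counter(doc.split())
    return {t: c * idf(t) for t, c in counts.items()}

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

print(cosine(vector(docs[0]), vector(docs[1])))   # 1.0 -- identical names
print(cosine(vector(docs[0]), vector(docs[2])))   # lower -- 400 vs 500 differ
```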
I would suggest something a lot more generally applicable. As I understand it, you want some NLP processing that will deal with things you recognize as synonyms. I think that's a pretty simple implementation right there.
If I were you, I would make a keyword object that has a list of synonyms as a parameter, then write a script that scrapes whatever text you have for words that only appear occasionally (with some frequency cap above which a word is no longer considered a keyword), and then give each keyword a list of the keywords that are its synonyms. If you were willing to go a step further, I would set weights on the synonym list showing how similar they are.
With this kind of NLP problem, the chance that you will get to 100% accuracy is zero, but you could well get above 90%. I would also suggest adding an element by which you can adjust the weights in an automated way. I have to be fairly vague here, but in my last job I was tasked with a similar problem and was able to get accuracy in the high 90s. My implementation was probably more complicated than what you need, but even a simple implementation should give you a pretty good return; that said, if you aren't dealing with a fairly large data set (hundreds of items or more) it's probably not worth scripting.
Quick example: in your case the difference can be distilled pretty accurately to just saying that "middle" and "three" are synonyms. You can get more complex if you need to, but that alone would match a lot.
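A minimal sketch of the keyword-object idea, with hand-written synonym weights as placeholders:

```python
class Keyword:
    def __init__(self, word, synonyms=None):
        self.word = word
        # synonym -> similarity weight (1.0 = fully interchangeable)
        self.synonyms = synonyms or {}

    def match(self, other):
        if other == self.word:
            return 1.0
        return self.synonyms.get(other, 0.0)

# Placeholder keywords and weights; in practice these come from your domain knowledge.
keywords = [Keyword("middle", {"three": 0.9, "3": 0.9}),
            Keyword("wheel", {"scroll": 1.0})]

def score(desc_a, desc_b):
    ta, tb = desc_a.lower().split(), desc_b.lower().split()
    exact = len(set(ta) & set(tb))                 # plain word matches
    hits = 0.0
    for kw in keywords:
        if kw.word in ta:
            hits += max((kw.match(t) for t in tb), default=0.0)
    return exact + hits

print(score("microsoft mouse with middle button",
            "microsoft three buttoned mouse"))
```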
I am trying to randomly generate a directed graph for the purpose of making a puzzle game similar to the ice-sliding puzzles from Pokémon.
This is essentially what I want to be able to randomly generate: http://bulbanews.bulbagarden.net/wiki/Crunching_the_numbers:_Graph_theory
I need to be able to limit the size of the graph in an x and y dimension. In the example in the link, it would be restricted to an 8x4 grid.
The problem I am running into is not randomly generating the graph, but randomly generating a graph that I can properly map onto a 2D space. I need something (like a rock) on the opposite side of a node to make it visually make sense when you stop sliding, but sometimes the rock ends up in the path between two other nodes, or possibly on another node itself, which breaks the entire graph.
After discussing the problem with a few people I know, we came to a couple of conclusions that may lead to a solution. One is to include the obstacles in the grid as part of the graph when constructing it. Another is to start with a fully filled grid, draw a random path, and delete the blocks needed to make that path work; the problem then becomes figuring out which ones to delete so that you don't accidentally introduce an additional, shorter path. We were also thinking a dynamic programming algorithm might help, though none of us are skilled at creating dynamic programming algorithms from scratch. Any ideas or references about what this problem is officially called (if it is a known graph problem) would be most helpful.
I wouldn't look at it as a graph problem, since as you say the representation is incomplete. To generate a puzzle I would work directly on a grid, and work backwards: first fix the destination spot, then place rocks so that it can be reached from one or more spots, and iteratively add rocks to make those other spots reachable, with the constraint that you never add a rock which breaks all the paths to the destination.
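A hedged sketch of the grid-first approach: it does not build the puzzle backwards, but it does enforce the "never add a rock that breaks every path" constraint by running a BFS over slide moves after each candidate rock. Grid size, start/exit cells and rock count are placeholders.

```python
import random
from collections import deque

W, H = 8, 4                      # grid size from the question's example
START, EXIT = (0, 0), (7, 3)     # placeholder start and destination cells

def slide(rocks, pos, d):
    """Slide from pos in direction d until the next cell is a rock or off-grid."""
    x, y = pos
    dx, dy = d
    while 0 <= x + dx < W and 0 <= y + dy < H and (x + dx, y + dy) not in rocks:
        x, y = x + dx, y + dy
    return (x, y)

def solvable(rocks):
    """BFS over slide moves: can the player stop exactly on EXIT from START?"""
    seen, queue = {START}, deque([START])
    while queue:
        pos = queue.popleft()
        if pos == EXIT:
            return True
        for d in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = slide(rocks, pos, d)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

rocks, attempts = set(), 0
while len(rocks) < 8 and attempts < 1000:
    attempts += 1
    candidate = (random.randrange(W), random.randrange(H))
    if candidate in rocks or candidate in (START, EXIT):
        continue
    rocks.add(candidate)
    if not solvable(rocks):      # this rock would break every path, so undo it
        rocks.remove(candidate)

print(sorted(rocks))
```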
You might want to generate a planar graph, which means that the edges of the graph can be drawn without crossing each other in two-dimensional space. An equivalent characterization (Kuratowski's theorem) is that a planar graph contains no subdivision of K_3,3 (the complete bipartite graph on six nodes) or K_5 (the complete graph on five nodes).
There's a paper on the fast generation of planar graphs.
I have an app that finds other users within a 20 mile radius on a google map and associates an icon with each of them. However, I do not want their exact points to be given but rather an approximation. I've wrestled with a few ideas on how to do this:
Only Geocode the Zip Code, make graphic icons for 1-99, use the icon to represent how many results are within the zip code, and use the info window to show hyperlinks to the individual results. The only problem is, I'd like each individual icon to be shown because it just looks a lot better.
Add/Subtract a random number to the lat/lng values stored with each user and add a translucent circle around the icon.
What do you guys suggest?
It depends on the level of privacy you want (the 1st option protects privacy better), but I'd be tempted to go with randomly moving the indicators because it's a more natural representation (people on a map, not groups of people on a map) without too much of a compromise in terms of usefulness.
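If you go with the random-offset option, here is a tiny sketch of the jitter; the roughly 0.01-degree radius is an arbitrary placeholder, and this ignores the fact that a degree of longitude shrinks with latitude:

```python
import math, random

def fuzz(lat, lng, radius_deg=0.01):
    # Pick a uniformly random point inside a disc around the true position.
    angle = random.uniform(0, 2 * math.pi)
    r = radius_deg * math.sqrt(random.random())   # sqrt -> uniform over the disc, not the rim
    return lat + r * math.sin(angle), lng + r * math.cos(angle)

print(fuzz(40.7128, -74.0060))
```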
That depends on how hard you think someone will try to defeat your system.
If you plan to track these positions over time, you give away more information over time than you do in a snapshot. For instance, if you choose a fixed-offset from the center of the circle, it may be possible to find this offset by mapping the path over time to the street map. On the other hand if you continually change the offset, the position may be discoverable by averaging.
Here's one possible scheme based on hysteresis. Leave the visible circle in place until the user exits an invisible bounding circle with a random radius. Then compute a new visible circle with a different random offset, and also set up a new invisible circle with a different random radius. This should generate a visible-circle movement that is almost impossible to reverse engineer, but also avoids lots of jittery movement.
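A rough sketch of that hysteresis scheme; all radii and offsets are placeholder values in degrees:

```python
import math, random

class FuzzedPosition:
    def __init__(self, lat, lng):
        self._reset(lat, lng)

    def _reset(self, lat, lng):
        self.anchor = (lat, lng)                            # centre of the invisible circle
        self.trigger_radius = random.uniform(0.002, 0.01)   # invisible radius (random each time)
        angle = random.uniform(0, 2 * math.pi)
        r = random.uniform(0, 0.01)                         # random offset of the visible circle
        self.visible = (lat + r * math.sin(angle), lng + r * math.cos(angle))

    def update(self, lat, lng):
        d = math.hypot(lat - self.anchor[0], lng - self.anchor[1])
        if d > self.trigger_radius:      # user left the invisible bounding circle
            self._reset(lat, lng)        # new visible offset and new invisible radius
        return self.visible              # what the map actually shows

p = FuzzedPosition(40.7128, -74.0060)
for step in range(5):
    print(p.update(40.7128 + step * 0.003, -74.0060))
```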
I'm creating a space game in ActionScript/Flex 3 (Flash). The world is infinitely big, because there are no maps. For this to work I need to dynamically (programmatically) render the background, which has to look like open space.
To make the world feel real and to make certain places look different than others, I must be able to add filters such as colour differences and maybe even a misty kind of transformation - these would then be randomly added and changed.
The player is able to "scroll" the "map" by flying to the sides of the screen, so that only a certain part of the world is visible at once but the player is able to go anywhere. The scrolling works by moving all objects except the player in the opposite direction, making it look like it was the player that moved in that direction. The background also needs to be moved, but has to be different on newly discovered terrain (dynamically created).
Now my question is how I would do something like this, what kind of things do I need to use and how do I implement them? Performance also needs to be taken into account, as many more objects will be in the game.
You should only have views for objects that are within the visible area. You might want to use a quad tree for that.
The background should maybe be composed of a set of tiles that you can repeat more or less randomly (do you really need a background, actually? wouldn't some particles be enough?). Use the same technique here that you use for the objects.
So in the end you wind up with a model for the objects and for the tiles or particles (which you would generate at the beginning). This way you are only adding a few floats. (You can gain additional performance by not calculating positions of objects that are far away; the quad tree should help you with that, but I don't think it should be necessary.) If an object that has a view leaves the stage, free the view, and use the quad tree to check whether new objects appear.
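Since the quad tree is the key piece here, a language-agnostic sketch of it (in Python rather than ActionScript) showing insertion and a viewport query; world size, positions and payloads are placeholders:

```python
class QuadTree:
    def __init__(self, x, y, w, h, capacity=4):
        self.bounds = (x, y, w, h)
        self.capacity = capacity
        self.points = []          # (x, y, payload)
        self.children = None

    def insert(self, px, py, payload=None):
        x, y, w, h = self.bounds
        if not (x <= px < x + w and y <= py < y + h):
            return False
        if self.children is None:
            if len(self.points) < self.capacity:
                self.points.append((px, py, payload))
                return True
            self._split()
        return any(c.insert(px, py, payload) for c in self.children)

    def _split(self):
        x, y, w, h = self.bounds
        hw, hh = w / 2, h / 2
        self.children = [QuadTree(x, y, hw, hh), QuadTree(x + hw, y, hw, hh),
                         QuadTree(x, y + hh, hw, hh), QuadTree(x + hw, y + hh, hw, hh)]
        for px, py, payload in self.points:    # push existing points down a level
            any(c.insert(px, py, payload) for c in self.children)
        self.points = []

    def query(self, qx, qy, qw, qh, found=None):
        found = [] if found is None else found
        x, y, w, h = self.bounds
        if qx > x + w or qx + qw < x or qy > y + h or qy + qh < y:
            return found                       # no overlap with this node, prune it
        for px, py, payload in self.points:
            if qx <= px <= qx + qw and qy <= py <= qy + qh:
                found.append(payload)
        if self.children:
            for c in self.children:
                c.query(qx, qy, qw, qh, found)
        return found

# World coordinates; query only the current viewport each frame.
world = QuadTree(0, 0, 10000, 10000)
world.insert(120, 340, "asteroid")
world.insert(5000, 5000, "station")
print(world.query(0, 0, 800, 600))   # -> ['asteroid']
```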
If you use a lot of objects/particles, consider using an object pool. If objects only move, and are not rotated/scaled, consider using DisplayObject::cacheAsBitmap.