I'm trying to create a geo viz using this data, which maps out all of the zip codes in NYC: https://data.cityofnewyork.us/Health/Modified-Zip-Code-Tabulation-Areas-MODZCTA-Map/5fzm-kpwv. I've used geopandas to read the data as a GeoDataFrame, used shapely.wkt.loads on the multipolygon column, and set the appropriate column as the geometry.
I then loaded this tree census data (https://data.cityofnewyork.us/Environment/2015-Street-Tree-Census-Tree-Data/uvpi-gqnh) into a df and merged the two on the zipcode column (I renamed the columns myself, as the datasets used different names).
My goal is to create an interactive geo viz using Bokeh, which I understand uses GeoJSON. I've used the following code to achieve this (originally posted as a Jupyter notebook screenshot):
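Roughly, in sketch form (reconstructed from the description above; file and column names are illustrative):

import json  # for the GeoJSON conversion below
import geopandas as gpd
import pandas as pd
from shapely import wkt

# Read the MODZCTA export and parse the WKT multipolygons.
modzcta = pd.read_csv("modzcta.csv")
modzcta["the_geom"] = modzcta["the_geom"].apply(wkt.loads)
nyc = gpd.GeoDataFrame(modzcta, geometry="the_geom")

# Load the tree census and merge the two on the renamed zipcode column.
trees = pd.read_csv("2015_street_tree_census.csv")
nyc_trees = nyc.merge(trees, on="zipcode")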
However, it is taking an exceedingly long time to run the code:
nyc_trees = json.loads(nyc_trees.to_json())
I've tried ujson as well, but it hasn't helped with the timing. Is there a way to speed this up, or another workaround?
I'm new to Neo4j and looking for some guidance :-)
Basically, I want to create the graph below from the CSV below. The NEXT relationship is created between Points based on the order of their sequence property. I would like this to work even if the sequence numbers are not strictly consecutive. Any ideas?
(s1:Shape)-[:POINTS]->(p1:Point)
(s1:Shape)-[:POINTS]->(p2:Point)
(s1:Shape)-[:POINTS]->(p3:Point)
(p1)-[:NEXT]->(p2)
(p2)-[:NEXT]->(p3)
and so on
shape_id,shape_pt_lat,shape_pt_lon,shape_pt_sequence,shape_dist_traveled
"1-700-y11-1.1.I","53.42646060879","-6.23930113514121","1","0"
"1-700-y11-1.1.I","53.4268571616632","-6.24059395687542","2","96.6074531286277"
"1-700-y11-1.1.I","53.4269700485041","-6.24093540883784","3","122.549696670773"
"1-700-y11-1.1.I","53.4270439028769","-6.24106779537932","4","134.591291249566"
"1-700-y11-1.1.I","53.4268623569266","-6.24155684094256","5","172.866609667575"
"1-700-y11-1.1.I","53.4268380666968","-6.2417384245122","6","185.235926544428"
"1-700-y11-1.1.I","53.4268874080753","-6.24203735638874","7","205.851454672516"
"1-700-y11-1.1.I","53.427394066848","-6.24287421729846","8","285.060040065768"
"1-700-y11-1.1.I","53.4275257974236","-6.24327509689195","9","315.473852717259"
"1-700-y11-1.2.O","53.277024711771","-6.20739084216546","1","0"
"1-700-y11-1.2.O","53.2777605784999","-6.20671521402849","2","93.4772699644143"
"1-700-y11-1.2.O","53.2780318605927","-6.2068238246152","3","124.525619356934"
"1-700-y11-1.2.O","53.2786209984572","-6.20894363498438","4","280.387737910482"
"1-700-y11-1.2.O","53.2791038678913","-6.21057305710353","5","401.635418300665"
"1-700-y11-1.2.O","53.2790975844245","-6.21075327761739","6","413.677012879457"
"1-700-y11-1.2.O","53.2792296384738","-6.21116766400758","7","444.981964564454"
"1-700-y11-1.2.O","53.2799500357098","-6.21065767664905","8","532.073870043666"
"1-700-y11-1.2.O","53.2800290799386","-6.2105343995296","9","544.115464622458"
"1-700-y11-1.2.O","53.2815594673093","-6.20949562301196","10","727.987702875002"
It is the 3rd part that I can't finish: creating the NEXT relationship!
//1. Create Shape
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM
'file:///D:\\shapes.txt' AS csv
MERGE (s:Shape {id: csv.shape_id});
//2. Create Point, and Shape to Point relationship
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM
'file:///D:\\shapes.txt' AS csv
MATCH (s:Shape {id: csv.shape_id})
WITH s, csv
MERGE (s)-[:POINTS]->(p:Point {id: csv.shape_id,
lat: csv.shape_pt_lat, lon: csv.shape_pt_lon,
sequence: toInt(csv.shape_pt_sequence), dist_travelled: csv.shape_dist_traveled});
//3. Create Point to Point relationship
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM
'file:///D:\\shapes.txt' AS csv
???
You'll want APOC Procedures installed for this one. It has both a means of batch processing, and a quick way to link all nodes in a collection together.
Since you already have all shapes and the points of each shape in the db, you don't need to do another LOAD CSV, just use the data you've got.
We'll use apoc.periodic.iterate() to batch process each shape, and apoc.nodes.link() to link all ordered points in the shape by relationships.
CALL apoc.periodic.iterate(
"MATCH (s:Shape) RETURN s",
"WITH {s} AS shape
MATCH (shape)-[:POINTS]->(point:Point)
WITH shape, point
ORDER BY point.sequence ASC
WITH shape, COLLECT(point) AS points
CALL apoc.nodes.link(points,'NEXT')",
{batchSize:1000, parallel:true}) YIELD batches, total
RETURN batches, total
EDIT
Looks like there may be a bug when using procedure calls within apoc.periodic.iterate() where no mutating operations occur (I attempted this after including a SET operation in the second part of the query to set a property on some nodes; the property was not added).
Unsure if this is a general case of procedure calls being executed within procedure calls, or if this is specific to apoc.periodic.iterate(), or if this only occurs with both iterate() and link().
I'll file a bug if I can learn more about the cause. In the meantime, if you don't need batching, you can forgo apoc.periodic.iterate():
MATCH (shape:Shape)-[:POINTS]->(point:Point)
WITH shape, point
ORDER BY point.sequence ASC
WITH shape, COLLECT(point) AS points
CALL apoc.nodes.link(points,'NEXT')
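And if APOC isn't an option at all, a plain-Cypher sketch of the same linking step (no batching; it assumes only that sequence values are unique within each shape, and it doesn't require them to be consecutive):

MATCH (shape:Shape)-[:POINTS]->(point:Point)
WITH shape, point
ORDER BY point.sequence ASC
WITH shape, COLLECT(point) AS points
// pair each point with its successor in sequence order
UNWIND range(0, size(points) - 2) AS i
WITH points[i] AS p1, points[i + 1] AS p2
MERGE (p1)-[:NEXT]->(p2)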
I want to use chart.js to plot, in real time, the data received by an HTML web app over websockets, with the received value as the Y-axis and the reception time (in secs) as the X-axis.
Well, I assume you know HTML/JavaScript/CSS, websockets, etc. well enough to use chart.js in your apps.
Here is a related post that might be of help to get you started:
Chart.js number Y-AXIS label format for many decimal places
You just have to get started somewhere, sometime!!!
Good luck.
Use the addData method.
From the documentation at http://www.chartjs.org/docs/#advanced-usage-prototype-methods
.addData( valuesArray, label )
Calling addData(valuesArray, label) on your Chart instance passing an
array of values for each dataset, along with a label for those points.
// The values array passed into addData should be one for each dataset in the chart
myLineChart.addData([40, 60], "August");
// This new data will now animate at the end of the chart.
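For example, a minimal sketch that feeds websocket data into the chart (it targets the Chart.js v1 API quoted above; the endpoint, message format, and chart variable are assumptions):

// Assumes a Chart.js v1 line chart with a single dataset, and a server
// that sends one numeric value per websocket message.
var socket = new WebSocket("ws://localhost:8080/data"); // hypothetical endpoint
var start = Date.now();
socket.onmessage = function (event) {
  var value = parseFloat(event.data);                    // Y-axis: received value
  var seconds = Math.round((Date.now() - start) / 1000); // X-axis: reception time in secs
  myLineChart.addData([value], String(seconds));
};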
I am trying to implement a query and graph visualisation framework that allows a user to enter a Gremlin query, returning a D3 graph of the results. The D3 graph is built from a JSON document, which is created from the separate vertex and edge outputs of the Gremlin query. For simple queries such as:
g.V.filter{it.attr_a == "foo"}
this works fine. However, when I try to perform a more complicated query such as the following:
g.E.filter{it.attr_a == 'foo'}.groupBy{it.attr_b}{it.outV.value}.cap.next().findAll{k,e->e.size()<=3}
which should:
- Find all instances of *value*
- Grouped by unique *attr_b*
- Where *attr_a* = foo
- And *attr_b* is paired with no more than 2 other instances of *value*
Instead of a list of vertices and edges, the output is of the following form:
attr_b1: {value1, value2, value3}
attr_b2: {value4}
attr_b3: {value6, value7}
I would like to know if there is a way for Gremlin to output the results as a list of nodes and edges so I can display the results as a graph. I am aware that I could edit my D3 code to take in this new output, but there are currently no restrictions on the type/complexity of the query, so the key/value pairs will not necessarily be the same every time.
Thanks.
You've hit what I consider one of the key problems with visualizing Gremlin results. They can be anything. Gremlin results might not just be a list of vertices and edges. There is no way to really control this that I can think of. At the end of the day, you can really only visualize results that match a pattern that D3 expects. I'd start by trying to detect that pattern and visualize only in those cases (simply display non-recognized patterns as JSON perhaps).
Thinking of your specific example that returns results like this:
attr_b1: {value1, value2, value3}
attr_b2: {value4}
attr_b3: {value6, value7}
What would you want D3 to visualize there? The vertices/edges that were traversed over to get that result? If so, you might be stuck. Gremlin doesn't give you a way to introspect the pipeline to see what's passing through it. In other words, unless the user explicitly gathers vertices and edges within the pipeline that were touched you won't have access to them. It would be nice to be able to "spy" on a pipeline in that way, but at the moment it doesn't do that. There's been internal discussion within TinkerPop to create a new kind of pipeline implementation that would help with that, but at the moment, it doesn't exist.
So, without the "spying" capability, I think your only workarounds would be to:
- Detect a vertex/edge list on the client side and only render those with D3. This would force users to always write Gremlin that returns data in such a format if they want visualization; put it in the users' hands.
- Perhaps supply server-side bindings for a list of vertices/edges that a user could explicitly side-effect their vertices/edges into if their results did not conform to those expected by your visualization engine (a rough sketch follows). Again, this would force users to write their Gremlin appropriately for your needs if they want visualization.
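For the second option, something like this might work (Gremlin 2.x Groovy; the binding names are illustrative, and the traversal is just the one from the question):

vizEdges = [] as Set
vizVertices = [] as Set

// aggregate the matched edges and side-effect their endpoints into the
// bound collections as the traversal runs, regardless of the return shape
result = g.E.filter{it.attr_a == 'foo'}.
             aggregate(vizEdges).
             sideEffect{vizVertices << it.outV.next(); vizVertices << it.inV.next()}.
             groupBy{it.attr_b}{it.outV.value}.cap.next().
             findAll{k, e -> e.size() <= 3}

// vizVertices and vizEdges now describe a renderable subgraph for D3,
// while result still holds the grouped answer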
I am trying to extract the latitudes and longitudes for the places listed on the right side of this page. I want to create a table like the following:
Place Latitude Longitude
Agarda 23.12604 87.19869
Ahanda 23.13099 87.18501
.....
.....
West-Sanabandh 23.24876 86.99941
Is it possible to do this in R without calling up the individual hyperlinks for "Agarda", "Ahanda", etc. one at a time?
The data appears on different pages. You can't get that data without requesting each page.
If R supports threads then you can call them up in parallel rather than one at a time.
It's possible to use RCurl to scrape each page in some type of loop or sapply. If you combine it with some regex and/or readHTMLTable (to identify the hyperlinks), then it's a relatively straightforward function.
Within RCurl, it's possible to create a multicurl which will do this in parallel, although given the number of queries involved, it might be just as easy to serialise it and put a small system sleep between queries.
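For example, a rough sketch of the serial approach (the URL and the coordinate regex are illustrative and would need adjusting to the actual page markup):

library(RCurl)
library(XML)

index_url <- "http://example.com/places"   # hypothetical index page
index <- getURL(index_url)
links <- getHTMLLinks(index)               # hyperlinks for the individual places

coords <- t(sapply(links, function(u) {
  page <- getURL(u)
  Sys.sleep(0.5)                           # small pause between queries
  # hypothetical pattern: pull the first two decimal numbers off the page
  m <- regmatches(page, gregexpr("[0-9]+\\.[0-9]+", page))[[1]]
  c(latitude = m[1], longitude = m[2])
}))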