I am trying to import a raster file that contains land-cover codes. Once the raster sets the patch variables to these land-cover codes, I want to link those codes to a separate .csv that has vegetation-specific parameters for each land-cover code. Thus each patch will be assigned the .csv variables based on its land-cover code. I'm completely stumped as to how to do this. More generally, how can I use a .csv as a cross-reference file? I don't have any code examples but here is an example of the kind of .csv I want to use:
Table example
So this .csv would assign the GR1 variables to multiple patches with land-cover code GR1.
I agree with JenB for sure, especially if your values table is relatively short. However, if you have a lot of values, it might work to use the csv and table extensions together to make a dictionary where the 'land-cover code' acts as the key to retrieve the other data for your patch. So one path would be:
Read the csv
Take one of the columns as the key
Keep the remaining columns as a list of values
Combine these two lists into a list of lists
Make a dictionary out of those two lists
Have each patch query the dictionary for the values of interest
So with this example csv table:
lcover,fuel,type
GR1,15,a
GR2,65,b
GR3,105,a
And these extensions and variables:
extensions [ csv table ]
globals [ csvRaw keyValList dataList patchDataDict ]
patches-own [ land-cover fuel patchType ]
We can run a code block to do all these steps (more explanation in comments):
to setup
  ca
  ; Load the csv
  set csvRaw but-first csv:from-file "landCoverMeta.csv"
  print csvRaw
  ; Pull first value (land cover)
  set keyValList map first csvRaw
  print keyValList
  ; Pull data values
  set dataList map but-first csvRaw
  print dataList
  ; Combine these two lists into a list of lists
  let tempList ( map [ [ a b ] -> list a b ] keyValList dataList )
  ; Make a dictionary with the land cover as the key
  ; and the other columns as the value (in a list)
  set patchDataDict table:from-list tempList
  ask patches [
    ; Randomly set patch 'land cover' for this example
    set land-cover one-of [ "GR1" "GR2" "GR3" ]
    ; Query the dictionary for the fuel column (item 0 since
    ; we've used landcover as the key) and for the type (item 1)
    set fuel item 0 table:get patchDataDict land-cover
    set patchType item 1 table:get patchDataDict land-cover
  ]
  ; Do some stuff based on the retrieved values
  ask patches [
    set pcolor fuel
    if patchType = "a" [
      sprout 1
    ]
  ]
  reset-ticks
end
This generates a toy landscape where each patch is assigned a fuel and patchType value according to a query based on the first column of that csv.
Hopefully that gets you started
I've got a couple hundred JSONs in a structure like the following example:
{
  "JsonExport": [
    {
      "entities": [
        {
          "identity": "ENTITY_001",
          "surname": "SMIT",
          "entityLocationRelation": [
            {
              "parentIdentification": "PARENT_ENTITY_001",
              "typeRelation": "SEEN_AT",
              "locationIdentity": "LOCATION_001"
            },
            {
              "parentIdentification": "PARENT_ENTITY_001",
              "typeRelation": "SEEN_AT",
              "locationIdentity": "LOCATION_002"
            }
          ],
          "entityEntityRelation": [
            {
              "parentIdentification": "PARENT_ENTITY_001",
              "typeRelation": "FRIENDS_WITH",
              "childIdentification": "ENTITY_002"
            }
          ]
        },
        {
          "identity": "ENTITY_002",
          "surname": "JACKSON",
          "entityLocationRelation": [
            {
              "parentIdentification": "PARENT_ENTITY_002",
              "typeRelation": "SEEN_AT",
              "locationIdentity": "LOCATION_001"
            }
          ]
        },
        {
          "identity": "ENTITY_003",
          "surname": "JOHNSON"
        }
      ],
      "identification": "REGISTRATION_001",
      "locations": [
        {
          "city": "LONDON",
          "identity": "LOCATION_001"
        },
        {
          "city": "PARIS",
          "identity": "LOCATION_002"
        }
      ]
    }
  ]
}
With these JSON files, I want to make a graph consisting of the following nodes: Registration, Entity and Location. That part I've figured out, and I came up with the following:
WITH "file:///example.json" AS json_file
CALL apoc.load.json(json_file,"$.JsonExport.*" ) YIELD value AS data
MERGE(r:Registration {id:data.identification})
WITH json_file
CALL apoc.load.json(json_file,"$.JsonExport..locations.*" ) YIELD value AS locations
MERGE(l:Locations{identity:locations.identity, name:locations.city})
WITH json_file
CALL apoc.load.json(json_file,"$.JsonExport..entities.*" ) YIELD value AS entities
MERGE(e:Entities {name:entities.surname, identity:entities.identity})
All the entities and locations should have a relation with the registration. I thought I could do this by using the following code:
MERGE (e)-[:REGISTERED_ON]->(r)
MERGE (l)-[:REGISTERED_ON]->(r)
However, this code doesn't give the desired output. It creates extra "empty" nodes and doesn't connect them to the registration node. So the first question is: how do I connect the location and entity nodes to the registration node? And, in light of the other JSON files, the entities and locations should only be linked to their specific registration.
Furthermore, I would like to create the entity -> location relation and the entity -> entity relation, and use the given type of relation (SEEN_AT or FRIENDS_WITH) as the label for that relation. How can this be done? I'm kind of lost at this point and don't see how to solve this. If someone could guide me in the right direction, I would be much obliged.
Variable names (like e and r) are not stored in the DB, and are bound to values only within individual queries. MERGE on a pattern with an unbound variable will just create the entire pattern (including creating an empty node for unbound node variables).
When you MERGE a node, you should only specify the unique identifying property for that node, to avoid duplicates. Any other properties you want to set at the time of creation should be set using ON CREATE SET.
It is inefficient to parse through the JSON data 3 times to get different parts of the data. And it is especially inefficient the way your query was doing it, since each subsequent CALL/MERGE group of clauses would be executed multiple times (every previous CALL produces multiple rows, so the number of rows grows multiplicatively). You can use aggregation to get around that, but it is unnecessary in your case, since you can do the entire query in a single pass through the JSON data.
This may work for you:
WITH "file:///example.json" AS json_file
CALL apoc.load.json(json_file,"$.JsonExport.*" ) YIELD value AS data
MERGE (r:Registration {id:data.identification})
FOREACH (ent IN data.entities |
  MERGE (e:Entities {identity: ent.identity})
  ON CREATE SET e.name = ent.surname
  MERGE (e)-[:REGISTERED_ON]->(r)
  FOREACH (loc1 IN ent.entityLocationRelation |
    MERGE (l1:Locations {identity: loc1.locationIdentity})
    MERGE (e)-[:SEEN_AT]->(l1))
  FOREACH (ent2 IN ent.entityEntityRelation |
    MERGE (e2:Entities {identity: ent2.childIdentification})
    MERGE (e)-[:FRIENDS_WITH]->(e2))
)
FOREACH (loc IN data.locations |
  MERGE (l:Locations {identity: loc.identity})
  ON CREATE SET l.name = loc.city
  MERGE (l)-[:REGISTERED_ON]->(r)
)
For simplicity, it hard-codes the SEEN_AT, FRIENDS_WITH, and REGISTERED_ON relationship types, as MERGE only supports hard-coded relationship types.
So, playing with Neo4j/Cypher, I've learned some new stuff and came up with another solution to the problem. Based on the given example data, the following creates the nodes and edges dynamically.
WITH "file:///example.json" AS json_file
CALL apoc.load.json(json_file,"$.JsonExport.*" ) YIELD value AS data
CALL apoc.merge.node(['Registration'], {id:data.identification}, {},{}) YIELD node AS vReg
UNWIND data.entities AS ent
CALL apoc.merge.node(['Person'], {id:ent.identity}, {}, {id:ent.identity, surname:ent.surname}) YIELD node AS vPer1
UNWIND ent.entityEntityRelation AS entRel
CALL apoc.merge.node(['Person'],{id:entRel.childIdentification},{id:entRel.childIdentification},{}) YIELD node AS vPer2
CALL apoc.merge.relationship(vPer1, entRel.typeRelation, {},{},vPer2) YIELD rel AS ePer
UNWIND data.locations AS loc
CALL apoc.merge.node(['Location'], {id:loc.identity}, {name:loc.city}) YIELD node AS vLoc
UNWIND ent.entityLocationRelation AS locRel
CALL apoc.merge.relationship(vPer1, locRel.typeRelation, {},{},vLoc) YIELD rel AS eLoc
CALL apoc.merge.relationship(vLoc, "REGISTERED_ON", {},{},vReg) YIELD rel AS eReg1
CALL apoc.merge.relationship(vPer1, "REGISTERED_ON", {},{},vReg) YIELD rel AS eReg2
CALL apoc.merge.relationship(vPer2, "REGISTERED_ON", {},{},vReg) YIELD rel AS eReg3
RETURN vPer1,vPer2, vReg, vLoc, eLoc, eReg1, eReg2, eReg3
I want to substitute the string "#N/A" for a calculated value of 0, but in NetLogo 6.0.3 the quotes ("") are not written to the csv file, so the field appears as #N/A rather than "#N/A". I want to calculate an average in Excel over a column that mixes "#N/A" with numeric data, but with a bare #N/A in the file, Excel returns #N/A as the result of the calculation. If the value were written to the csv file as "#N/A", Excel could compute the average. This was possible in NetLogo 6.0.1. What should I do in NetLogo 6.0.3?
The "correct" way to do this is to handle it in excel by ignoring N/As in your average. That way, you preserve those values as N/As and so have to be conscious about how you deal with them. You can do this by calculating the average with something like =AVERAGE(IF(ISNUMBER(A2:A5), A2:A5)) and then entering with ctrl+shift+enter instead of just enter. That, of course, is kind of annoying.
To solve it on the NetLogo side, report the value "\"#N/A\"" instead of "#N/A". That will preserve the quotes when you import into Excel. Alternatively, you could output pretty much any string other than "#N/A". For instance, reporting "not-a-number" would make it a string, or you could even just use an empty string. The quotes you see in Excel are actually part of the string, not just indicators that the field is a string. In general, fields in CSV don't have a type; Excel just interprets what it can as a number. It treats the exact field #N/A as special, so modifying it in any way (not just adding quotes around it) will prevent Excel from interpreting it in that special way.
It's also worth noting that the old behavior was a bug in previous versions of NetLogo (I'm assuming you're using BehaviorSpace here; the csv extension has always worked this way). There was no way to output a string without having a quote at the beginning and end of it; that is, the string value itself would contain quotes. The behavior you're seeing now is a consequence of fixing that bug: you can output true #N/A values if you want to, which there was no way of doing before.
Maybe this will work for you. Assuming you have the csv extension enabled:
extensions [ csv ]
You can use a reporter that replaces 0 values in a list (or list of lists) with the string value "#NA" (or "N/A" if you want, but for me #NA is what works with Excel).
to-report replace-zeroes [ list_ ]
  if list_ = [] [ report [] ]
  let out map [ i ->
    ifelse-value is-list? i
    [ replace-zeroes i ]
    [ ifelse-value ( i != 0 ) [ i ] [ "#NA" ] ]
  ] list_
  report out
end
As a quick check:
to test
  ca
  ; make fake list of lists for csv output
  let fake n-values 3 [ i -> n-values 5 [ random 4 ] ]
  ; replace the 0 values with the NA values
  let replaced replace-zeroes fake
  ; print both the base and 0-replaced lists
  print fake
  print replaced
  ; export to csv
  csv:to-file "replaced_out.csv" replaced
  reset-ticks
end
Observer output (random):
[[0 0 2 2 0] [3 0 0 3 0] [2 3 2 3 1]]
[[#NA #NA 2 2 #NA] [3 #NA #NA 3 #NA] [2 3 2 3 1]]
Excel output:
I'm trying to write a Python script that will take any CSV file, run it through a geocoder, and then write the resulting geocoding attributes (+ all the data from the original file) to a new csv file.
My code so far is the following, and I should note everything is working as expected except for combining the geocoding attributes with the data from the raw csv file. Currently, all of the original csv file's field values for a particular row are written as just one value in the output csv file (although the geocoding attributes are written correctly). The problem with the script is located towards the end. I left out the code for the different classes for brevity.
I should also note I'm using hasattr because, although I don't know exactly which fields are in the original in_file, I do know that somewhere in the input csv these fields will be present, and they are the fields needed for the geocoding.
Originally I tried changing "new_file.writerow([])" to "new_file.writerow()"; at that point the row input r did write correctly to the csv file, but the geocoding attributes could no longer be written to the csv as they were treated as additional arguments.
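To make the symptom concrete, here is a minimal sketch with made-up values (not my real data): csv.writer writes one cell per element of the sequence it is given, so passing the whole row as a single element puts its string form into one field.
import csv, io

out = io.StringIO()
writer = csv.writer(out)
row = ('100001', 'Playstation Theater', 'New York')  # stand-in for a row from csv.reader
# Passing the whole row as one element -> its str() lands in a single cell
writer.writerow([row, 45.1234, -110.4567])
# Unpacking the row first -> one cell per field, plus the geocoding values
writer.writerow(list(row) + [45.1234, -110.4567])
print(out.getvalue())
# "('100001', 'Playstation Theater', 'New York')",45.1234,-110.4567
# 100001,Playstation Theater,New York,45.1234,-110.4567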
def locate(file=None):
    """ locate by geocoding func"""
    start_time = time.time()
    count = 0
    if file != None:
        with open(file) as in_file:
            f_csv = csv.reader(in_file)
            # regex headers and lowercase to standardize for hasattr func.
            headers = [re.sub(r'["\s+]', '_', h).lower() for h in next(f_csv)]
            # Used namedtuple for headers
            Row = namedtuple('Row', headers)
            # for row in file
            for r in f_csv:
                count += 1
                # set row values to named tuple values
                row = Row(*r)
                # Try hasattr to find field names address, city, state, zipcode
                if hasattr(row, 'address'):
                    address = row.address
                elif hasattr(row, 'address1'):
                    address = row.address1
                if hasattr(row, 'city'):
                    city = row.city
                if hasattr(row, 'state'):
                    state = row.state
                elif hasattr(row, 'st'):
                    state = row.st
                if hasattr(row, 'zipcode'):
                    zipCode = row.zipcode
                elif hasattr(row, 'zip'):
                    zipCode = row.zip
                # Create new address object
                addressObject = Address(address, city, state, zipCode)
                # Get response from api
                data = requests.get(addressObject.__str__()).json()
                try:
                    data['geocodeStatusCode'] = int(data['geocodeStatusCode'])
                except:
                    data['geocodeStatusCode'] = None
                if data['geocodeStatusCode'] == 'SomeNumber':
                    # geocoded address ideally uses parent class attributes
                    geocodedAddressObject = GeocodedAddress(addressObject.address, addressObject.city, addressObject.state, addressObject.zipCode, data['addressGeo']['latitude'], data['addressGeo']['longitude'], data['addressGeo']['score'])
                else:
                    geocodedAddressObject = GeocodedAddress(addressObject.address, addressObject.city, addressObject.state, addressObject.zipCode)
                # Problem Area
                geocoded_file = file.replace('.csv', '_geocoded2') + '.csv'
                with open(geocoded_file, 'a', newline='') as geocoded:
                    # Problem area -- the r -row- attribute writes all within the same cell even though they are comma separated. The geocoding attributes do write correctly to the csv file
                    new_file = csv.writer(geocoded)
                    new_file.writerow([r, geocodedAddressObject.latitude, geocodedAddressObject.longitude, geocodedAddressObject.geocodeScore])
    print('The time to geocode {} records: {}'.format(count, (time.time() - start_time)))
CSV Input Data Example:
"UID", "Occupant", "Address", "City", "State", "ZipCode"
"100001", "Playstation Theater", "New York", "NY", "10036"
"100002", "Ed Sullivan Theater", "New York, "NY", "10019"
CSV Output Example (the additional fields are parsed during geocoding)
"UID", "Occupant", "Address", "City", "State", "ZipCode", "GeoCodingLatitude", "GeoCodingLongitude", "GeoCodingScore"
"100001", "Playstation Theater", "New York", "NY", "10036", "45.1234", "-110.4567", "100"
"100002", "Ed Sullivan Theater", "New York, "NY", "10019", "44.1234", "-111.4567", "100"
I figured out a solution, although it is likely not the most elegant. I transformed the namedtuple into a dictionary using namedtuple._asdict() and then looped through the values of the row, adding them to a new list. At that point I add in the geocoded variables and then write the entire list to the row. Here is a sample of the code I changed! If you can think of a better solution, please let me know.
if data['geocodeStatusCode'] == 'SomeNumber':
    # geocoded address ideally should use parent class address values and not have to be restated
    geocodedAddressObject = GeocodedAddress(addressObject.address, addressObject.city, addressObject.state, addressObject.zipCode,
                                            data['addressGeo']['latitude'], data['addressGeo']['longitude'], data['addressGeo']['score'])
else:
    geocodedAddressObject = GeocodedAddress(addressObject.address, addressObject.city, addressObject.state, addressObject.zipCode)
# This is where I made the change - set up a new list
list_values = []
# Use _asdict for the named tuple
row_content = row._asdict()
# Loop through and strip whitespace
for key, value in row_content.items():
    # print(key, value.strip())
    list_values.append(value.strip())
# Extend the list rather than append due to multiple values
list_values.extend((geocodedAddressObject.latitude, geocodedAddressObject.longitude, geocodedAddressObject.geocodeScore))
# Finally write the new list to the csv file - which includes both the row and the geocoded objects
# - and is agnostic as to what data it's passed as long as it's utf-8 compliant
new_file.writerow(list_values)
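A slightly shorter variant of the same idea would be to extend the raw row list from csv.reader directly instead of going through the namedtuple (just a sketch, reusing the same r, new_file, and geocodedAddressObject as above):
# r is already a list of strings from csv.reader, so it can be cleaned and extended directly
cleaned = [value.strip() for value in r]
cleaned.extend([geocodedAddressObject.latitude,
                geocodedAddressObject.longitude,
                geocodedAddressObject.geocodeScore])
new_file.writerow(cleaned)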
I'm trying to traverse an atom tree to get to a specific item value:
[
{"#":{"rel":"mentioned",
"ostatus:object-type":"http://activitystrea.ms/schema/1.0/collection",
"href":"http://activityschema.org/collection/public"}},
{"#":{"rel":"enclosure",
"type":"image/jpeg",
"length":"225009",
"href":"https://framapiaf.org/system/media_attachments/files/000/701/279/original/19ed3ec381293bd8.jpg"}},
{"#":{"rel":"enclosure",
"type":"image/jpeg",
"length":"180205",
"href":"https://framapiaf.org/system/media_attachments/files/000/701/280/original/101ec4084b0b1920.jpeg"}},
{"#":{"rel":"enclosure",
"type":"image/jpeg",
"length":"257325",
"href":"https://framapiaf.org/system/media_attachments/files/000/701/281/original/dff3d3dc2e8e1b89.jpg"}},
{"#":{"rel":"enclosure",
"type":"image/jpeg",
"length":"224565",
"href":"https://framapiaf.org/system/media_attachments/files/000/701/282/original/983f3ad336ebc721.jpg"}},
{"#":{"rel":"alternate",
"type":"text/html",
"href":"https://mastodon.art/#luka/99809176628976105"}}
]
(This is the output of JSON.stringify(item['activity:object'].link))
But I can't seem to get past those "#" keys; how can I access the first #.type and #.href?
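Each element of that array is an object whose only key is "#", so you index into the array first and then into "#". Here is a minimal sketch in Python, assuming the array has been parsed (e.g. with json.loads) into a variable named links; the same indexing applies on the JavaScript side, e.g. item['activity:object'].link[0]['#'].href:
import json

links = json.loads(raw_json)                 # raw_json holds the array shown above
for link in links:
    inner = link["#"]                        # every entry wraps its fields under the "#" key
    print(inner.get("type"), inner["href"])  # .get() because the first link has no "type"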
I'm using wget to fetch several dozen JSON files on a daily basis that go like this:
{
  "results": [
    {
      "id": "ABC789",
      "title": "Apple"
    },
    {
      "id": "XYZ123",
      "title": "Orange"
    }
  ]
}
My goal is to find a row's position in each JSON file given a value or set of values (i.e. "In which row is XYZ123 located?"). In the previous example, ABC789 is in row 1, XYZ123 in row 2, and so on.
As for now I use Google Refine to "quickly" visualize (using the Text Filter option) where XYZ123 sits (row 2).
But since it takes a while to do this manually for each file, I was wondering if there is a quick and efficient way to do it in one go.
What can I do and how can I fetch and do the request? Thanks in advance! FoF0
In Python:
import json

# assume json_string holds your loaded data
data = json.loads(json_string)

mapped_vals = []
# walk the entries under the 'results' key, in order
for ent in data['results']:
    mapped_vals.append(ent['id'])
The order of items in the list matches their order in the JSON data, since the list is an ordered collection.
In PHP:
$data = json_decode($json_string);
$output = array();
// walk the entries under the 'results' key, in order
foreach($data->results as $values){
    $output[] = $values->id;
}
Again, the ordered nature of PHP arrays ensures that the output will be ordered according to the original indexes.
Either example could be modified to use a mapped dictionary (Python) or an associative array (PHP) if needed.
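For instance, a dictionary-based variant in Python (a sketch, reusing the data variable from the snippet above) maps each id to its 1-based row position in a single pass:
# Build an id -> row-position lookup (1-based, matching "row 1", "row 2" above)
positions = {entry['id']: idx for idx, entry in enumerate(data['results'], start=1)}
print(positions.get('XYZ123'))  # -> 2 for the example data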
You could adapt either of these into a function that takes the id value as an argument, tracks how far it is into the array, and, when the value is found, breaks out and returns the current index.
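A minimal sketch of that approach in Python (the function name find_row is just illustrative, and the 'results' structure is taken from the example JSON above):
import json

def find_row(json_string, target_id):
    """Return the 1-based row position of target_id, or None if it is absent."""
    data = json.loads(json_string)
    for position, entry in enumerate(data['results'], start=1):
        if entry['id'] == target_id:
            return position
    return None

# e.g. find_row(json_string, 'XYZ123') -> 2 for the example data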
Wow. I posted the original question 10 months ago, when I knew nothing about Python or computer programming whatsoever!
Answer
But I learned basic Python last December and came up with a solution that not only gets the rank order but also inserts the results into a MySQL database:
import urllib.request
import json

# Make connection and get the content
response = urllib.request.urlopen("http://whatever.com/search?=ids=1212,125,54,454")
content = response.read()

# Decode Json search results to type dict
json_search = json.loads(content.decode("utf8"))

# Get 'results' key-value pairs to a list
search_data_all = []
for i in json_search['results']:
    search_data_all.append(i)

# Prepare MySQL list with ranking order for each id item
ranks_list_to_mysql = []
for i in range(len(search_data_all)):
    d = {}
    d['id'] = search_data_all[i]['id']
    d['rank'] = i + 1
    ranks_list_to_mysql.append(d)
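The actual INSERT step isn't shown above; as a rough sketch of how ranks_list_to_mysql could then be written to MySQL with a DB-API driver (mysql.connector is just one option, and the connection settings and table/column names here are made up):
import mysql.connector

# Hypothetical connection settings and schema -- adjust to your own database
conn = mysql.connector.connect(host="localhost", user="user", password="secret", database="mydb")
cursor = conn.cursor()
cursor.executemany(
    "INSERT INTO id_ranks (id, rank_position) VALUES (%s, %s)",
    [(d['id'], d['rank']) for d in ranks_list_to_mysql],
)
conn.commit()
conn.close()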