How to calculate the number and area of habitat patches in Arcview 10 - gis

I'm currently working on my master's thesis and having real trouble with GIS. I've downloaded the ArcGIS grid data set from http://www.kew.org/gis/projects/mad_veg/datasets_gis.html
I've successfully plotted it in ArcMap 10. The map consists of various habitats. I want to know how I could take one of those habitat types, say "humid forest", and calculate how many patches of that habitat there are and how big each patch is.
I've been at this for weeks and haven't made much headway. Someone suggested I look at Zonal Geometry as Table http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#//009z000000w5000000.htm which looks promising, but I gave the code a try and couldn't get it to work. I've posted some of my attempts below.
>>> import arcpy
>>> from arcpy import env
>>> from arcpy.sa import *
>>> env.workspace = "Q:/MADGIS"
>>> outZonalGeometryAsTable = ZonalGeometryAsTable("zones.shp", "Classes", "zonalgeomout", 0.2)
Runtime error <class 'arcgisscripting.ExecuteError'>: ERROR 000626: Tool ZonalGeometryAsTable is not licensed.
>>> arcpy.CheckOutExtension("Spatial")
u'CheckedOut'
>>> outZonalGeometryAsTable = ZonalGeometryAsTable(inZoneData, zoneField, "AREA", cellSize)
Runtime error <type 'exceptions.NameError'>: name 'inZoneData' is not defined
I think the problem is that some of the things I've copied are specific to the example, but I'm not sure. If someone could even point me in the right direction, it would be a big help.

It seems that you didn't set some of the parameters.
According to the link above, you must set these parameters:
# Set local variables
inZoneData = "YourShapefileName.shp"
zoneField = "Classes"
outTable = "zonalgeomout02.dbf"
processingCellSize = 0.2
# Check out the ArcGIS Spatial Analyst extension license
arcpy.CheckOutExtension("Spatial")
Update:
You must use this code for your raster data:
import arcpy
from arcpy import env
from arcpy.sa import *
env.workspace = "C:/Users/Puya/Downloads/Documents/StackOverflow/veg_grid"
inZoneData = "vegetation"
zoneField = "Value"
outTable = "zonalgeomout02.dbf"
processingCellSize = 29
arcpy.CheckOutExtension("Spatial")
outZonalGeometryAsTable = ZonalGeometryAsTable(inZoneData, zoneField, outTable, processingCellSize)
Alternatively, in ArcMap you can open ArcToolbox -> Spatial Analyst -> Zonal -> ZonalGeometryAsTable, enter the parameters above, and run the tool from there.
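To make concrete what "how many patches and how big each one is" means on a raster, here is a minimal sketch in plain Python that does not use ArcGIS at all. It labels 4-connected groups of cells with the same class value by flood fill and reports one area per group. The toy grid, the class code 1 for "humid forest", and the reuse of the cell size 29 are assumptions for illustration only. Zonal tools summarize all cells sharing a value as one zone, so it may be necessary to split a class into contiguous patches first (for example with the Region Group tool); that workflow step is my assumption and is not part of the answer above.

```python
from collections import deque


def patch_areas(grid, target, cell_size):
    """Return the area of each 4-connected patch of `target` cells."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != target or seen[r][c]:
                continue
            # Breadth-first flood fill over one patch, counting its cells
            queue = deque([(r, c)])
            seen[r][c] = True
            cells = 0
            while queue:
                y, x = queue.popleft()
                cells += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and grid[ny][nx] == target:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(cells * cell_size * cell_size)
    return areas


# Hypothetical 4x5 raster: 1 = humid forest, 0 = anything else
grid = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 1, 0],
]
areas = patch_areas(grid, target=1, cell_size=29)
print("patches:", len(areas))
print("areas:", areas)
```

The length of the returned list is the patch count, and each entry is cell count times cell area, in the squared units of the cell size.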

Related

How to predict Sentiments after training and testing the model by using NLTK NaiveBayesClassifier in Python?

I am doing sentiment classification using the NLTK NaiveBayesClassifier. I trained and tested the model with labeled data. Now I want to predict the sentiment of data that is not labeled, but I run into an error.
The line that gives the error is:
score_1 = analyzer.evaluate(list(zip(new_data['Articles'])))
The error is:
ValueError: not enough values to unpack (expected 2, got 1)
Below is the code:
import random
import pandas as pd
data = pd.read_csv("label data for testing .csv", header=0)
sentiment_data = list(zip(data['Articles'], data['Sentiment']))
random.shuffle(sentiment_data)
new_data = pd.read_csv("Japan Data.csv", header=0)
train_x, train_y = zip(*sentiment_data[:350])
test_x, test_y = zip(*sentiment_data[350:])
from unidecode import unidecode
from nltk import word_tokenize
from nltk.classify import NaiveBayesClassifier
from nltk.sentiment import SentimentAnalyzer
from nltk.sentiment.util import extract_unigram_feats
TRAINING_COUNT = 350
def clean_text(text):
    text = text.replace("<br />", " ")
    return text
analyzer = SentimentAnalyzer()
vocabulary = analyzer.all_words([(word_tokenize(unidecode(clean_text(instance))))
for instance in train_x[:TRAINING_COUNT]])
print("Vocabulary: ", len(vocabulary))
print("Computing Unigram Features ...")
unigram_features = analyzer.unigram_word_feats(vocabulary, min_freq=10)
print("Unigram Features: ", len(unigram_features))
analyzer.add_feat_extractor(extract_unigram_feats, unigrams=unigram_features)
# Build the training set
_train_X = analyzer.apply_features([(word_tokenize(unidecode(clean_text(instance))))
for instance in train_x[:TRAINING_COUNT]], labeled=False)
# Build the test set
_test_X = analyzer.apply_features([(word_tokenize(unidecode(clean_text(instance))))
for instance in test_x], labeled=False)
trainer = NaiveBayesClassifier.train
classifier = analyzer.train(trainer, zip(_train_X, train_y[:TRAINING_COUNT]))
score = analyzer.evaluate(list(zip(_test_X, test_y)))
print("Accuracy: ", score['Accuracy'])
score_1 = analyzer.evaluate(list(zip(new_data['Articles'])))
print(score_1)
I understand that the problem arises because I have to pass two parameters in the line that gives the error, but I don't know how to do this.
Thanks in Advance.
Documentation and example
The line that gives you the error calls the method SentimentAnalyzer.evaluate(...) .
This method does the following.
Evaluate and print classifier performance on the test set.
See SentimentAnalyzer.evaluate.
The method has one mandatory parameter: test_set .
test_set – A list of (tokens, label) tuples to use as gold set.
In the example at http://www.nltk.org/howto/sentiment.html test_set has the following structure:
[({'contains(,)': False, 'contains(.)': True, 'contains(and)': False, 'contains(the)': True}, 'subj'), ({'contains(,)': True, 'contains(.)': True, 'contains(and)': False, 'contains(the)': True}, 'subj'), ...]
Here is a symbolic representation of the structure.
[(dictionary,label), ... , (dictionary,label)]
Error in your code
You are passing
list(zip(new_data['Articles']))
to SentimentAnalyzer.evaluate. I assume you're getting the error because
list(zip(new_data['Articles']))
does not create a list of (tokens, label) tuples. You can check that by creating a variable which contains the list and printing it or looking at the value of the variable while debugging.
For example:
test_set = list(zip(new_data['Articles']))
print("begin test_set")
print(test_set)
print("end test_set")
You are calling evaluate correctly just above the line that is giving the error:
score = analyzer.evaluate(list(zip(_test_X, test_y)))
I guess you want to call SentimentAnalyzer.classify(instance) to predict unlabeled data. See SentimentAnalyzer.classify.
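The shape problem can be reproduced without NLTK. The sketch below uses only built-in Python and made-up stand-in data (the `articles` and `labels` lists are hypothetical, not taken from the CSV files in the question). It shows that zipping a single sequence yields 1-tuples, and that unpacking such an element into two names raises the same kind of ValueError, which is presumably what happens inside evaluate.

```python
# Hypothetical stand-ins for new_data['Articles'] and a label column
articles = ["good movie", "bad plot"]
labels = ["pos", "neg"]

unlabeled = list(zip(articles))        # one input sequence -> 1-tuples
labeled = list(zip(articles, labels))  # two input sequences -> (text, label) pairs

print(unlabeled)
print(labeled)

# Unpacking each element into two names works only for the pairs
try:
    for tokens, label in unlabeled:
        pass
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Since unlabeled data has no gold labels to compare against, there is nothing for evaluate to score, which is why a classify-style call is the better fit for prediction.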

How to train own model and test it with spacy

I am using the code below to train an already existing spaCy NER model. However, I don't get correct results when I test it.
What am I missing?
import spacy
import random
from spacy.gold import GoldParse
from spacy.language import EntityRecognizer
train_data = [
('Who is Rocky babu?', [(7, 16, 'PERSON')]),
('I like London and Berlin.', [(7, 13, 'LOC'), (18, 24, 'LOC')])
]
nlp = spacy.load('en', entity=False, parser=False)
ner = EntityRecognizer(nlp.vocab, entity_types=['PERSON', 'LOC'])
for itn in range(5):
    random.shuffle(train_data)
    for raw_text, entity_offsets in train_data:
        doc = nlp.make_doc(raw_text)
        gold = GoldParse(doc, entities=entity_offsets)
        nlp.tagger(doc)
        nlp.entity.update([doc], [gold])
Now, when I try to test the above model using the code below, I don't get the expected output.
text = ['Who is Rocky babu?']
for a in text:
    doc = nlp(a)
    print("Entities", [(ent.text, ent.label_) for ent in doc.ents])
My output is as follows:
Entities []
whereas my expected output is as follows:
Entities [('Rocky babu', 'PERSON')]
Can someone please tell me what I'm missing?
Could you retry with
nlp = spacy.load('en_core_web_sm', entity=False, parser=False)
If that gives an error because you don't have that model installed, you can run
python -m spacy download en_core_web_sm
on the command line first.
And of course, keep in mind that to train the model properly you'll need many more examples, so that the model can generalize!
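Independently of which model is loaded, it may be worth checking that the character offsets in train_data select the text they are meant to. The sketch below needs no spaCy; it slices each training string with its (start, end) offsets, assuming the end offset is exclusive as in Python slicing. With the data from the question, the first span comes out one character short.

```python
train_data = [
    ('Who is Rocky babu?', [(7, 16, 'PERSON')]),
    ('I like London and Berlin.', [(7, 13, 'LOC'), (18, 24, 'LOC')]),
]

# Slice out the text each annotation actually covers
spans = [
    (text[start:end], label)
    for text, entities in train_data
    for start, end, label in entities
]

for covered_text, label in spans:
    print(repr(covered_text), label)

# The span the first annotation was presumably meant to cover
intended = 'Who is Rocky babu?'[7:17]
print(repr(intended))
```

Here (7, 16) covers 'Rocky bab', while (7, 17) covers 'Rocky babu'. Whether this alone explains the empty result is not certain, but misaligned offsets are a common source of training examples that do not match token boundaries.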

The built-in VGG16 network in MxNet is not working

I would like to test the trained built-in VGG16 network in MxNet. The experiment is to feed the network with an image from ImageNet. Then, I would like to see whether the result is correct.
However, the results are always wrong! The network cannot really be that bad, so I must be doing something wrong.
from mxnet.gluon.model_zoo.vision import vgg16
from mxnet.image import color_normalize
import mxnet as mx
import numpy as np
import cv2
path='http://data.mxnet.io/models/imagenet-11k/'
data_dir = 'F:/Temps/Models_tmp/'
k = 'synset.txt'
#gluon.utils.download(path+k, data_dir+k)
img_dir = 'F:/Temps/DataSets/ImageNet/'
img = cv2.imread(img_dir + 'cat.jpg')
img = mx.nd.array(img)
img,_ = mx.image.center_crop(img,(224,224))
img = img/255
img = color_normalize(img,mean=mx.nd.array([0.485, 0.456, 0.406]),std=mx.nd.array([0.229, 0.224, 0.225]))
img = mx.nd.transpose(img, axes=(2, 0, 1))
img = img.expand_dims(axis=0)
with open(data_dir + 'synset.txt', 'r') as f:
    labels = [l.rstrip() for l in f]
aVGG = vgg16(pretrained=True,root='F:/Temps/Models_tmp/')
features = aVGG.forward(img)
features = mx.ndarray.softmax(features)
features = features.asnumpy()
features = np.squeeze(features)
a = np.argsort(features)[::-1]
for i in a[0:5]:
    print('probability=%f, class=%s' %(features[i], labels[i]))
The output of color_normalize does not look right, because the absolute values of some numbers are greater than one.
This is my picture of a cat, downloaded from ImageNet.
These are my outputs.
probability=0.218258, class=n01519563 cassowary
probability=0.172373, class=n01519873 emu, Dromaius novaehollandiae, Emu novaehollandiae
probability=0.128973, class=n01521399 rhea, Rhea americana
probability=0.105253, class=n01518878 ostrich, Struthio camelus
probability=0.051424, class=n01517565 ratite, ratite bird, flightless bird
Reading your code:
path='http://data.mxnet.io/models/imagenet-11k/'
I think you might be using the synset file for ImageNet 11k (about 11,000 classes) rather than the one for ImageNet 1k (1,000 classes). That would explain the mismatch.
The correct synset is here: http://data.mxnet.io/models/imagenet/synset.txt
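A cheap way to catch this kind of mismatch is to compare the number of label lines with the length of the network's output vector before printing predictions. The sketch below is plain Python with made-up label lists; the helper name `labels_match_outputs` and the list sizes are assumptions for illustration, not values read from the actual synset files.

```python
def labels_match_outputs(labels, num_outputs):
    """Return True when there is exactly one label per network output."""
    return len(labels) == num_outputs


# Hypothetical stand-ins for the contents of two different synset files
labels_1k = ["class_%d" % i for i in range(1000)]
labels_11k = ["class_%d" % i for i in range(11000)]

num_outputs = 1000  # length of the softmax vector in the 1k-class case

print(labels_match_outputs(labels_1k, num_outputs))
print(labels_match_outputs(labels_11k, num_outputs))
```

In the code from the question, the same check would compare len(labels) with len(features) after the squeeze. Indexing a longer label list with indices below 1000 raises no error, so the mismatch is otherwise silent.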

FiPy Simple Convection

I am trying to understand how FiPy works by working through an example. In particular, I would like to solve the following simple convection equation with periodic boundary conditions:
$$\partial_t u + \partial_x u = 0$$
If initial data is given by $u(x, 0) = F(x)$, then the analytical solution is $u(x, t) = F(x - t)$. I do get a solution, but it is not correct.
What am I missing? Is there a better resource for understanding FiPy than the documentation? It is very sparse...
Here is my attempt
from fipy import *
import numpy as np
# Generate mesh
nx = 20
dx = 2*np.pi/nx
mesh = PeriodicGrid1D(nx=nx, dx=dx)
# Generate solution object with initial discontinuity
phi = CellVariable(name="solution variable", mesh=mesh)
phiAnalytical = CellVariable(name="analytical value", mesh=mesh)
phi.setValue(1.)
phi.setValue(0., where=mesh.x > 1.)
# Define the pde
D = [[-1.]]
eq = TransientTerm() == ConvectionTerm(coeff=D)
# Set discretization so analytical solution is exactly one cell translation
dt = 0.01*dx
steps = 2*int(dx/dt)
# Set the analytical value at the end of simulation
phiAnalytical.setValue(np.roll(phi.value, 1))
for step in range(steps):
    eq.solve(var=phi, dt=dt)
print(phi.allclose(phiAnalytical, atol=1e-1))
As addressed on the FiPy mailing list, FiPy is not great at handling convection-only PDEs (purely hyperbolic, with no diffusion) because it lacks higher-order convection schemes. It is better to use CLAWPACK for this class of problem.
FiPy does have one second order scheme that might help with this problem, the VanLeerConvectionTerm, see an example.
If the VanLeerConvectionTerm is used in the above problem, it does do a better job of preserving the shock.
import numpy as np
import fipy
# Generate mesh
nx = 20
dx = 2*np.pi/nx
mesh = fipy.PeriodicGrid1D(nx=nx, dx=dx)
# Generate solution object with initial discontinuity
phi = fipy.CellVariable(name="solution variable", mesh=mesh)
phiAnalytical = fipy.CellVariable(name="analytical value", mesh=mesh)
phi.setValue(1.)
phi.setValue(0., where=mesh.x > 1.)
# Define the pde
D = [[-1.]]
eq = fipy.TransientTerm() == fipy.VanLeerConvectionTerm(coeff=D)
# Set discretization so analytical solution is exactly one cell translation
dt = 0.01*dx
steps = 2*int(dx/dt)
# Set the analytical value at the end of simulation
phiAnalytical.setValue(np.roll(phi.value, 1))
viewer = fipy.Viewer(phi)
for step in range(steps):
    eq.solve(var=phi, dt=dt)
    viewer.plot()
raw_input('stopped')
print(phi.allclose(phiAnalytical, atol=1e-1))
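To see why a low-order scheme struggles to preserve the shock, here is a minimal sketch of a generic first-order upwind update in plain Python. It is not FiPy's implementation, only an illustration under stated assumptions: unit velocity, the same 20-cell periodic grid, the same ratio dt/dx = 0.01, and a step count fixed at 200 so that the exact solution moves by two cell widths in this sketch.

```python
import math

nx = 20
dx = 2 * math.pi / nx
cfl = 0.01    # dt / dx for unit velocity
steps = 200   # total travel = steps * cfl = 2 cell widths

# Initial step profile: 1 where the cell center is <= 1, else 0
u0 = [1.0 if (i + 0.5) * dx <= 1.0 else 0.0 for i in range(nx)]

u = list(u0)
for _ in range(steps):
    # First-order upwind update; u[i - 1] wraps around at i = 0 (periodic)
    u = [u[i] - cfl * (u[i] - u[i - 1]) for i in range(nx)]

# Exact solution: the initial profile shifted by a whole number of cells
shift = int(round(steps * cfl))
exact = [u0[(i - shift) % nx] for i in range(nx)]

max_error = max(abs(a - b) for a, b in zip(u, exact))
print("max error vs exact shift: %.3f" % max_error)
```

The scheme conserves the total and keeps values between 0 and 1, but it spreads the discontinuity over several cells, so the pointwise error near the front is far larger than the atol=1e-1 used in the allclose check. Higher-order, limited schemes such as a van Leer type scheme are designed to reduce this smearing.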

Calculate inverse log-weighted similarity in a bimodal network, igraph in Python

I'm trying to calculate Adamic-Adar similarity for a network that has two types of nodes. I'm only interested in the similarity between nodes that have outgoing connections. Nodes with incoming connections act as connectors, and I'm not interested in them.
Data size and characteristic:
> summary(g)
IGRAPH DNW- 3852 24478 --
+ attr: name (v/c), weight (e/n)
Prototype code in Python 2.7:
import glob
import os
import pandas as pd
from igraph import *
os.chdir("data/")
for file in glob.glob("*.graphml"):
    print(file)
    g = Graph.Read_GraphML(file)
    indegree = Graph.degree(g, mode="in")
    g['indegree'] = indegree
    dev = g.vs.select(indegree == 0)
    m = Graph.similarity_inverse_log_weighted(dev.subgraph())
    df = pd.melt(m)
    df.to_csv(file.split("_only.graphml")[0] + "_similarity.csv", sep=',')
There is something wrong with this code, because dev is of length 1, and m is 0.0, so it doesn't work as expected.
Hint
I have working code in R, but it seems I'm unable to rewrite it in Python (which I'm doing for the sake of performance, as the networks are huge). Here it is:
# make sure g is your network
indegree <- degree(g, mode="in")
V(g)$indegree <- indegree
dev <- V(g)[indegree==0]
m <- similarity.invlogweighted(g, dev)
x.m <- melt(m)
colnames(x.m) <- c("dev1", "dev2", "value")
x.m <- x.m[x.m$value > 0, ]
write.csv(x.m, file = sub(".csv",
"_similarity.csv", filename))
You are assigning the in-degrees as a graph attribute, not as a vertex attribute, so you cannot reasonably call g.vs.select() later on. You need this instead:
indegree = g.degree(mode="in")
g.vs["indegree"] = indegree
dev = g.vs.select(indegree=0)
But actually, you could simply write this:
dev = g.vs.select(_indegree=0)
This works because of how the select method works:
Attribute names inferred from keyword arguments are treated specially
if they start with an underscore (_). These are not real attributes
but refer to specific properties of the vertices, e.g., its degree.
The rule is as follows: if an attribute name starts with an underscore,
the rest of the name is interpreted as a method of the Graph object.
This method is called with the vertex sequence as its first argument
(all others left at default values) and vertices are filtered
according to the value returned by the method.
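For reference, the quantity being computed can be sketched in plain Python on a toy network. This is not igraph's implementation, and igraph's exact weighting depends on the mode argument; the node names and edges below are made up. Each pair of source nodes is scored by summing 1 / log(in-degree) over the connector nodes they both point to.

```python
import math
from itertools import combinations

# Hypothetical bimodal network: source node -> set of connector nodes it points to
out_edges = {
    "A": {"p", "q"},
    "B": {"p", "q"},
    "C": {"q", "r"},
}

# In-degree of every connector node
indegree = {}
for targets in out_edges.values():
    for target in targets:
        indegree[target] = indegree.get(target, 0) + 1


def inverse_log_weighted(u, v):
    """Sum 1 / log(in-degree) over connectors shared by u and v."""
    common = out_edges[u] & out_edges[v]
    # A shared connector has in-degree >= 2, so the logarithm is positive
    return sum(1.0 / math.log(indegree[w]) for w in common)


similarity = {
    (u, v): inverse_log_weighted(u, v)
    for u, v in combinations(sorted(out_edges), 2)
}

for pair, value in sorted(similarity.items()):
    print(pair, round(value, 4))
```

Only the nodes with outgoing edges appear as keys of the result, which mirrors restricting the similarity computation to the vertices with zero in-degree.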