How do I look inside the NLTK classifier train method - nltk

I've been looking at Laurent Luce's' blog on sentiment analysis. Unfortunately I have not been able to follow his blog so I cannot ask him a question directly. Here is the link to the blog: http://www.laurentluce.com/posts/twitter-sentiment-analysis-using-python-and-nltk/
Pretty much everything works nicely. However I cannot figure out how to get this part to work - text below is pasted from the link. My question is how do I look inside the classifier train method? It will help my understanding if I could do that.
"Let’s take a look inside the classifier train method in the source code of the NLTK library. ‘label_probdist’ is the prior probability of each label and ‘feature_probdist’ is the feature/value probability dictionary. Those two probability objects are used to create the classifier."
def train(labeled_featuresets, estimator=ELEProbDist):
...
# Create the P(label) distribution
label_probdist = estimator(label_freqdist)
...
# Create the P(fval|label, fname) distribution
feature_probdist = {}
...
return NaiveBayesClassifier(label_probdist, feature_probdist)

NLTK is open source. You can find the source-code here. Github has solid search functionality, so you should be able to find whatever it is you need using that. Specifically, the naive bayes classifier is here
and you can see the train method for it in that file.

Related

Accessing regmap RegFields

I am trying to find a clean way to access the regmap that is used with *RegisterNode for creating documentation and testing files. The TLRegisterNode has methods for generating the json through some Annotations. These are done in the regmap method by adding them to the ElaborationArtefacts object. Other protocols don't seem to have these annotations.
Is there anyway to iterate over the "regmap" Register Fields post elaboration or during?
I cannot just access the regmap as it's not really a val/var since it's a method. I can't quite figure out where this information is being stored. I don't really believe it's actually "storing" any information as much as it is simply creating the hardware to attach the specified logic to the RegisterNode based logic.
The JSON output is actually fine for me as I could just write a post processing script to convert JSON to my required formats, but I'm wondering if I can access this information OR if I could add a custom function call at the end. I cannot extend the case class *RegisterNode, but I'm not sure if it's possible to add custom functions to run at the end of the regmap method.
Here is something I threw together quickly:
//in *RegisterRouter.scala
def customregmap(customFunc: (RegField.Map*) => Unit, mapping: RegField.Map*) = {
regmap(mapping:_*)
customFunc(mapping:_*)
}
def regmap(mapping: RegField.Map*) = {
//normal stuff
}
A user could then create a custom function to run and pass it to the regmap or to the RegisterRouter
def myFunc(mapping: RegField.Map*): Unit = {
println("I'm doing my custom function for regmap!")
}
// ...
node.customregmap(myFunc,
0x0 -> coreControlRegFields,
0x4 -> fdControlRegFields,
0x8 -> fdControl2RegFields,
)
This is just a quick example I have. I believe what would be better, if something like this was possible, would be to have a Seq of functions that could be added to the RegisterNode that are ran at the end of the regmap method, similar to how TLRegisterNode currently works. So a user could add an arbitrary number and you still use the regmap call.
Background (not directly part of question):
I have a unified register script that I have built over the years in which I describe the registers for a particular IP. It works very similar to the RegField/node.regmap, except it obviously doesn't know about diplomacy and the like. It will generate the Verilog, but also a variety of files for DV (basic `defines for simple verilog simulations and more complex uvm_reg_block defines also with the ability to describe multiple of the IPs for a subsystem all the way up to an SoC level). It will also print out C Header files for SW and Sphinx reStructuredText for documentation.
Diplomacy actually solves one of the main issues I've been dealing with so I'm obviously trying to push most of my newer designs to Chisel/Diplo.
I ended up solving this by creating my own RegisterNode which is the same as the rocketchip RegisterNodes except that I use a different Elaboration Artifact to grab the info and store it for later.

Where exactly does trainable_variables method belong in Tensorflow?

I'm a newbie in both deep learning and tensorflow and now trying to learn how to implement deep learning codes based on function API (not keras) by following example codes.
Inside the codes I'm looking at, I found out sources saying 'gradients=tape.gradient(loss,model.trainable variables)'
I intuitionally got what trainable variables mean, however in order to understand clearly,I tried to search on tensorflow documentation (which module or class the method belongs to, which are key arguments, etc) ,but I wasn't able to find the information I want. ('trainable variables' method was not in their documentation index and I'm wondering why)
So can anyone please tell me the module/class which trainable_variable method belongs to, and which arguments it takes, and also how it is able to judge and get all the trainable variables from the model ?
The reason you did not find this method is because trainable_variables is not a method, but an attribute/property. The Model class has a trainable_variables attribute, which is not documented officialy. It is inherited from the base class Layer, and to put it shortly, the list (of trainable variables) gets populated as new layers are added, since all layers have an init parameter trainable (this comes from base class Layer too). You can check the source code if you want to: "the source of the property", "adding new weights to layer appends to the list".

mlmodel doesn’t work properly after conversion from Caffe using coremltools

I want to convert this NSFW model to CoreML model. What I did:
Download Anaconda 2.7
Install coremltools
Convert this yahoo nsfw model from here - https://github.com/yahoo/open_nsfw/tree/master/nsfw_model but I am not sure it’s Caffe v1 because Apple documentation says that only this version supported. Anyway…
I use this commands for conversion and it converted without any warnings.
coreml_model = coremltools.converters.caffe.convert(('resnet_50_1by2_nsfw.caffemodel', 'deploy.prototxt'), image_input_names='data')
coreml_model.save(’nsfw2.mlmodel')
I imported this model to my project and again all looks fine.
I prepared 224x224 images and use Vision framework like VNImageRequestHandler with cgImage and etc.
But!
All images return the same result
[<VNCoreMLFeatureValueObservation: 0x281b1daa0> 2E00F417-95C0-4AA1-A621-A0945BB5E095 requestRevision=1 confidence=1.000000 "prob" - "MultiArray : Double 1 x 1 x 2 x 1 x 1 array" (1.000000)]
How can I debug this issue and found out what’s wrong?
Maybe you're looking only at naughty images? ;-)
It's probably the image preprocessing. You didn't specify any preprocessing options while Caffe models usually normalize using ImageNet mean/std. Refer to my blog post for more info: https://machinethink.net/blog/help-core-ml-gives-wrong-output/
However, I don't see any normalization options in your deploy.prototxt, so perhaps it's not that.
How I would debug this: remove everything but the first layer from the Caffe model and convert to Core ML. Run this one-layer model in both Caffe and Core ML and compare the outputs. If they are different, something is up with how you're loading or preprocessing the input data.

How the use the pre-trained model(.caffemodel file) provided in the below link?

https://github.com/playerkk/face-py-faster-rcnn
the link above has indicated that a pretrained model is available.
enter image description here
After you download the pretrained weights ( a .caffemodel file), you can instantiate a caffe.Net object with the network definition (.prototxt file - from the repository you referred, test.prototxt), e.g.
net = caffe.Net(prototxt, caffemodel, caffe.TEST)
(I guess you would like to use the pretrained model for inference, if you would like to do transfer-learning on your data you should use caffe.TRAIN).
Then you should load the image, feed it into the input blobs, run net.forward on the image and extract the results from the output blobs - e.g. net.blobs['cls_score'].data, net.blobs['cls_prob'].data and net.blobs['bbox_pred'].data.
You can use the original py-faster-rcnn's demo with minor adjustments.
Good luck!

Interesting uses of fluent interfaces?

I am wondering where and when fluent interfaces are a good idea, so I am looking for examples. So far I have found only 3 useful cases, e.g. Ruby's collections, like
unique_words = File.read("words.txt").downcase.split.sort.uniq.length
and Fest (Java) for unit testing:
assertThat(yoda).isInstanceOf(Jedi.class)
.isEqualTo(foundJedi)
.isNotEqualTo(foundSith);
and JMock. Do you know of any other good examples that use a fluent interface?
jQuery. :)
StringBuilder: http://msdn.microsoft.com/en-us/library/system.text.stringbuilder(VS.71).aspx
Or
RSpec. Example from the home page:
# bowling_spec.rb
require 'bowling'
describe Bowling do
before(:each) do
#bowling = Bowling.new
end
it "should score 0 for gutter game" do
20.times { #bowling.hit(0) }
#bowling.score.should == 0
end
end
Ninject: http://www.ninject.org
For an example that doesn't come from general-purpose libraries, I built an automated regression suite for a configuration wizard. I created a state machine that fills in values on a wizard page, verifies those values are acceptable, then moves on to the next page. The code for each step in the state machine looks like:
step.Filler().Fill().Verify().GoForward();