I was consulting several references to discover how I may output trained Weka models into Java source code so that I may use the classifiers I am training in actual code for research applications I have been developing.
As I was playing with Weka 3.7, I noticed that while it does output Java code to its main text buffer when use simpler classification (supervised in my case this time) methods such as J48 decision tree, it removes the option (rather, it voids it by removing the ability to checkmark it and fades the text) to output Java code for RandomTree and RandomForest (which are the ones that give me the best performance in my situation).
Note: I am clicking on the "More Options" button and checking "Output source code:".
Does Weka not allow you to output RandomTree or RandomForest as Java code? If so, why? Or if it does and just doesn't put it in the output buffer (since RF is multiple decision trees which I imagine it doesn't want to waste buffer space), how does one go digging up where in the file system Weka outputs java code by default?
Are there any tricks to get Weka to give me my trained RandomForest as Java code? Or is Serialization of the output *.model files my only hope when it comes to RF and RandomTree?
Thanks in advance to those who provide help.
NOTE: (As an addendum to the answer provided below) If you run across a similar situation (requiring you to use your trained classifier/ML model in your code), I recommend following the links posted in the answer that was provided in response to my question. If you do not specifically need the Java code for the RandomForest, as an example, de-serializing the model works quite nicely and fits into Java application code, fulfilling its task as a trained model/hardened algorithm meant to predict future unlabelled instances.
RandomTree and RandomForest can't be output as Java code. I'm not sure for the reasoning why, but they don't implement the "Sourceable" interface.
This explains a little about outputting a classifier as Java code: Link 1
This shows which classifiers can be output as Java code: Link 2
Unfortunately I think the easiest route will be Serialization, although, you could maybe try implementing "Sourceable" for other classifiers on your own.
Another, but perhaps inconvenient solution, would be to use Weka to build the classifier every time you use it. You wouldn't need to load the ".model" file, but you would need to load your training data and relearn the model. Here is a starters guide to building classifiers in your own java code http://weka.wikispaces.com/Use+WEKA+in+your+Java+code.
Solved the problem for myself by turning the output of WEKA's -printTrees option of the RandomForest classifier into Java source code.
http://pielot.org/2015/06/exporting-randomforest-models-to-java-source-code/
Since I am using classifiers with Android, all of the existing options had disadvantages:
shipping Android apps with serialized models didn't reliably work across devices
computing the model on the phone took too much resources
The final code will consist of three classes only: the class with the generated model + two classes to make the classification work.
I have been using KNIME 2.7.4 for running analysis algorithm. I have integrated KNIME with our existing application to run in BATCH mode using the below command.
<<KNIME_ROOT_PATH>>\\plugins\\org.eclipse.equinox.launcher_1.2.0.v20110502.jar -application org.knime.product.KNIME_BATCH_APPLICATION -reset -workflowFile=<<Workflow Archive>> -workflow.variable=<<parameter>>,<<value>>,<<DataType>
Knime provide different kinds of plot which I want to use. However I am running the workflow in batch mode. Is there any option in KNIME where I can specify the Node Id and "View" option as a parameter to KNIME_BATCH_APPLICATION.
Would need suggestion or guidance to achieve this functionality.
I have posted this question in KNIME forum and got the satisfactory answer mentioned below
As per concept of command line execution, this requirement does not fit in. Also there is now way for batch executor to open the view of specific plot node.
Hence there could be two solutions
Solution 1
Write the output of workflow in a file and use any charitng plugin to plot the graph and do the drilldown activity.
Solution 2
Use jFreeChart and write the image using ImageWriter node which can be displayed in any screen.
I've just started using IPython for interactive development and exploratory research, which I've found really exciting with all the cool features and possibilities. I am using the Anaconda package manager to manage dependencies, which includes IPython.
From what I've read, one goal of the IPython team is to eventually integrate Sage Math (CAS) into IPython, as a cell magic. Does anyone know if this is still under development? Or rather, if I wanted to use Sage now, is writing an extension the only way to do this [1]?
[1] https://github.com/ipython/ipython/wiki/Extensions-Index
Also, if I install additional packages for scientific development, not included in the Anaconda distribution, is that as easy as just pip or do I have to go through a Anaconda package build to handle dependencies and such? If I were only using IPython, I could understand just doing easy_install or pip as recommended in the docs, but I believe that overwrites existing dependencies within Anaconda. If I use pip, how does that affect Anaconda dependencies if I do not install in an Anaconda environment, which I take is the equivalent as virtualenv.py, and is this the way to also set up revision control (i.e. Mercurial)?
To clarify, I do not want to run IPython from within Sage, I want to run Sage, as a CAS, from within IPython. I'd rather go the Sage approach of integrating domain specific languages. Or in contrast, will IPython extensions replace Sage?
I'm a self taught programmer, not a professional software developer. As an engineer, I am used to Matlab, Mathematica, and commercial solutions, that allow me to abstract away the plumbing. I'm trying to wrap my mind around getting everything glued together, but it's like a mix of spaghetti soup and a dynamic link library, due to lack of knowledge. I'm probably using the wrong approach.
What I want is Anaconda/Enthought package management (IPython, pandas, etc..), custom rolled Sage through hooks/extensions or magics, extensions to packages not included in Anaconda (i.e. Matlab see [1] above), and revision control with Git and Mercurial. How would professional developers set this up on a Mac or Linux box?
Answering the first question:
Sage is a huge collection of mathematical software, including IPython. There's no way we'd integrate all of that into IPython.
Possibly what you've heard is that we're going to integrate Sage-style 'interacts' into IPython. That's where you have a slider to control the value of some input variable, and the output updates as you move it around, based on a calculation written in Python. That is still on our roadmap to add to IPython.
Another possibility is that you're thinking of SymPy, a Python-based CAS. SymPy works well within IPython, especially if you call sympy.init_printing() to get the fancy representations of expressions.
I wrote an IPython extension to load Sage customizations into the IPython notebook---in fact, that's how many of the customizations to IPython are done for the normal Sage interface. It basically turns the IPython notebook into an interface to Sage (e.g., preparsing is done, etc.).
You do need to run it from Sage's copy of IPython, though. Just start the IPython notebook:
sage -ipython notebook
and then load the sage extension in a cell:
%load_ext sage.misc.sage_extension
Pretty soon we'll upgrade to IPython 1.0 (I've made the changes necessary, and it needs to be reviewed). If you want to run IPython 1.0 already, email the sage-support mailing list and I'll post instructions.
To answer your other question, Sage includes many packages that are not available in Anaconda. Sage depends on these packages heavily for many features. I suppose there is a possibility getting Sage and its dependencies distributed with something like Anaconda, but no one is working on that as far as I know. There is some work on packaging Sage up for different linux distributions and replacing the package manager for Sage.
To clarify, I do not want to run Ipython from within Sage, I want to run Sage, as a CAS, from within Ipython. I'd rather go the Sage approach of integrating domain specific languages. Or in contrast, will Ipython extensions replace Sage?
If you want to run Sage within Ipython, the easiest thing to do is to use Sage's copy of Ipython:
$ sage -ipython
Python 2.7.5 (default, Aug 1 2013, 18:11:00)
Type "copyright", "credits" or "license" for more information.
IPython 0.13.1 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.
In [1]: from sage.all import *
In [2]: integrate(x^2,x)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-2-006357f5d9c0> in <module>()
----> 1 integrate(x^2,x)
NameError: name 'x' is not defined
In [3]: var('x')
Out[3]: x
In [4]: integrate(x^2,x)
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-4-006357f5d9c0> in <module>()
----> 1 integrate(x^2,x)
/Users/.../sage-5.11.rc0/local/lib/python2.7/site-packages/sage/structure/element.so in sage.structure.element.Element.__xor__ (sage/structure/element.c:6754)()
RuntimeError: Use ** for exponentiation, not '^', which means xor
in Python, and has the wrong precedence.
In [5]: integrate(x**2,x)
Out[5]: 1/3*x^3
Note that there are some differences between this and Sage - for instance, there is no preparsing of the syntax. Presumably if you had Sage someplace your own installation of Ipython can find it, you could do this there too (though there is no easy_install and the Python versions might not match correctly!).
Please observe that Sage, as it currently is and should be in the near future, is a huge package that runs in POSIX (Linux-like) OSes only. If you intend it to run in Windows you must use VirtualBox with Sage running under a VM.
If you want a simple CAS that is simple to install and runs everywhere that Python does, you should consider taking a look at Sympy. It doesn't interact with all the software that Sage does, but is simple to use, lightweight (no external libraries needed) and is already included in the base Anaconda distribution.
Besides, it can be easily used inside other Python programs, since it's just like any other module and doesn't make changes to the Python syntax.
I need to implement an inverted beta function in MySQL (similar to Excel's BETAINV).
There is some related material is available on Wolfram MathWorld's Beta Distribution page.
Any clues on where to start implementing this functionality in MySQL?
Look at the cephes library, specifically cprob.tgz. Be warned that the licensing situation of that code seems to be unclear. The source code just says "free", which was a problem for Debian so apparently they got the author to relicense it under the GPL for them.
Okay, task done by moving inverted beta calculation to PHP and using code from PHPExcel package.
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 8 years ago.
Improve this question
What tweaks / addins / themes do you have rigged up to make your IDE awesome? For example, in Visual Studio I color themes, CodeRush draws lines between braces, I always install and use the Consolas font and I have it setup to sync my settings across computers for when I change hotkeys and whatnot with the help of FolderShare.
Also, this isn't Visual Studio specific, please feel free to mention what you do with Emacs or Eclipse or whatnot as many of us use a few.
ReSharper 4.1 for Visual Studio 2008. It's a beautiful thing. It looks for all kinds of code errors, optimizations, etc. My code is cleaner thanks to this handy Visual Studio plugin.
Optimizing the IDE will be the first step. Resharper helps a lot but sometimes some simple macros are more than enough.
First things first. Change the font from the default crappy one. Then start fiddling with the 'Options' dialog box.
At the recommendation of a friend, I installed Visual Assist for Visual Studio 2008 -- it is awesome. I swear it can read my mind.
[Note: I have no affiliation with them -- just a very happy customer]
I've done a lot, but I really shouldn't have. So in the last few years, I've toned down the number of macros, custom key mappings, custom toolbars, etc. For the most part, I'm of the opinion that developers should get used to the default behavior and appearance of their IDE. Then when you need to work on your colleague's machine, you still know how to get around and manage to help them out. Not to mention that a whole slew of customizations will get broken or rearranged or otherwise reset when an update comes out.
That said, there are a few things that I still do every time I set up an IDE to work on... for example, setting the number of concurrent builds in Visual Studio to be 1, because that feature is so broken that nothing will compile correctly with any greater setting. Apart from that, being an ace with the default behavior will ultimately make you more efficient than spending hours tweaking the software to make it just a little more fluid.
I like CodeSmart VB6 and CodeSmart VS.NET from Axtools http://www.axtools.com/
for advanced syntax highlighting, drawing lines between parts of If..then..else..endif, Do While ... loop and all other constructs. It also has great code auditors and many add-in functions.
Vi plugin!!
I use Emacs. My .emacs file is only a few hundred lines long, but does customize settings based on my machine's hostname and operating system, so that I can use the same config file pretty much anywhere.
Colorization - Custom - White Text on GreyishBlack, Consolas Font
HotKeys - CTRL+SHIFT+ALT+Z (Attach to Process) probably some others too...
Addins - DPack, Coderush, GhostDoc.
Toolbars Off
All Windows set to collapsed
I am not doing presentations with this machine - If I was it would be barebones.
eclipse plugins make my ide exactly the way I want it of course.
eclipse plugin central
I kinda like the default setup of VS, I only make sure about Consolas as the editor font, and tabsize 2 (tabs to spaces), and change the color of numbers (red).
For Java development using Eclipse I have a few plugins that are indispensable. The MyEclipse Workbench adds a lot of functionality to most of the built-in modules. It makes it very easy to deploy an application to multiple application servers, and enhances a lot of the built-in editors. The PMD plugin is great for searching for potential code issues. As mentioned in a previous post today, the Ganymede plugin really helps to highlight log entries.
I dont care much for fancy visual addons, so I left my IDE (Visiual Studio) in the standard look (other than MS Reference Sans Serif font).
I usually change the color scheme to have a black background instead of white.
I use the Zenburn color scheme with Proggy Clean for a font. It's like a comfy chair for my eyes.
Nothing. I hate dealing with all the breakages that inevitably result from updates, etc. So, I adapt myself to my IDE instead.
I've using a combination of ViEmu and ReSharper with a dark theme.
Oh, and I also hide most of the toolbars and turn off the animations to try to speed things up.
Silver background, 8pt Consolas, disable all toolbars and set tab spacing to 3 spaces. :)
For Visual Studio
Most important - Resharper - I bought my own copy so I don't have to badger my employer about it.
Change the colourisation/font - choose whatever suits you
Optimisation (vote up Gulzar's post with the link in it)
Don't try and make the IDE do everything, just because you can. (Kind of ironic seeing I use emacs as well). I personally really dislike integrated source code management.
Change some defaut file associations so double clicking certain file types doesn't kick off Visual Studio
Aside from Resharper I've actually found most other beneficial thing is not customising the IDE, but customising yourself to learn the keyboard shortcuts. Start with the big gains like Ctrl -, Ctrl Shift -, Ctrl Shift V, Ctrl Alt L etc. etc. and then gradually learn the rest of the shortcuts in order of how often you'd use them
Rather than customizing the IDE, I customized my error messages. I have a macro that expands to a #pragma warning statement that generates a compile-time message in the same format as MSVC++. Visual Studio can parse the resulting warning, so a double-click on the message opens the offending file in the IDE and takes me right to the line in question.
I've used the macro:
To "bookmark" a section of code, so developers will be nagged to fix it each time they build.
Within #if blocks to test for various compile-time conditions.
In headers, to see who #includes them, and where.
From vim you can set the makeprg (make program) variable to a command that will build your project, and the errorformat variable to a scanf-style string that describes the format of the build errors. From there:
:make will build your project
:cl lists all of the errors that match errorformat
:cc takes to you the current error
:cn takes you to the next error
:cp takes you to the previous error.
Out of the box, vim sets makeprg and errorformat to work with make and gcc, and all of the commands are documented within vim's built-in help.
I do Java development in Eclipse. Here are some of the plug ins I find useful:
Mylyn - hides project elements not relevant to the current context.
eUML2 - UML editor.
FindBugs - Static analysis tool to find common bugs in Java.
Crap4J - Another static analysis tool.
EclEmma - Code coverage plug-in for unit tests.
Edit: I forgot one:
Disable the spellchecker. :)
In visual studio 2005 I do these:
Bind F11 to fullscreen mode
Enable a vertical line at 80 characters: HKEY_CURRENT_USER\Software\Microsoft\VisualStudio\8.0\Text Editor\Guides = "RGB(196,196,196) 80" (Guides won't be present in the registry.)
Add the "Start Debugging", "Break All" and "Stop Debugging" buttons after the "Help" menu.
I am using Vim Cscope plugin.
Cscope is like 'ctags' on steroids and makes traversing code much easier.
I usually use it along with tags to find where a function is declared and then go directly to whatever code is calling this function.
I also use Vim's Rgrep plugin (recursive search) to search for files in the code hierarchy.
Create some basic macro such as printing bug fix code comments:
Public Sub WriteBugFix()
Dim TS As TextSelection = DTE.ActiveDocument.Selection
TS.Text = "'Edited for Bug Fixed By JK - " & Date.Now.ToShortDateString
End Sub
(This answer assumes the workstation is a GNU/Linux computer.)
Emacs makes an excellent IDE specifically because it can be greatly customized.
You customize Emacs by editing the .emacs file in your home directory. My .emacs is a symbolic link:
$ ln -s /home/bzimmerly/bin/emacs/emacs24/wbz.emacs.el .emacs
Since Emacs offers a variety of specialized major modes for program development and debugging, you can easily "roll your own" IDE design that works best for you. For example, when I'm programming in C or assembler, I like to have the left side running GDB mode, with the right side displaying the source being traced.
A little bit of LISP skill goes a long way to turning Emacs into a very powerful IDE. It is well worth the investment of time to learn how to use this powerful tool!
Finally, tools like Youtube are valuable places for learning how to do this. Just entering "Emacs as an IDE" on the Youtube search form will show videos of how people have modified Emacs for just such a purpose. There are videos on editing Python code, Javascript, Java, C, etc.