What data mining tools do you use? [closed] - open-source

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
Besides the two well-known open source tools RapidMiner and Weka, are there any other good tools (either open source or commercial) that you can recommend for data mining?
Thanks in advance!

My money is on R, see e.g. the Machine Learning task view.

How about the open source Orange data mining toolkit?
http://www.ailab.si/orange/

You can look at my project - Data Mining SDK.

According to the KDnuggets Poll 2011, RapidMiner once more is the most widely used data mining solution world-wide:
http://www.kdnuggets.com/2011/05/tools-used-analytics-data-mining.html

If it is commercial software, the following two are awesome:
SAS
SPSS

Another very powerful open-source tool is KNIME. In some respects it is better than RapidMiner. As for commercial tools, here's what I've tried:
1. Polyanalyst
2. SPSS Clementine
3. Kxen
4. Statistica Data Miner
5. MATLAB
I like Polyanalyst the best, but that's just my opinion.

According to the yearly KDnuggets Polls 2007, 2008, and 2009, RapidMiner is the most widely used Open Source Data Mining Solution among data mining experts world-wide:
KDnuggets Data Mining Tool Poll 2009
RapidMiner is open source and 100% Java. In my view it is much more flexible and offers significantly more functionality than Weka and KNIME.

JDMP http://www.jdmp.org/

The data mining tools I have used (also machine learning tools):
Weka: classification, clustering, association rules, decision trees, etc.
Cluto: clustering
libsvm: classification
From many posts, I find there are still other well-known tools that I haven't used:
Orange
R
RapidMiner
SAS
SPSS
There must be other useful tools that I'm not aware of.

Related

Suggestions for Dashboard/ Database Reporting software [closed]

Closed 11 years ago.
We are looking for a reporting suite that will allow us to analyse our data. I'm not sure of the exact terminology for such suites, but they are often known as 'Dashboard Software' or 'Database Reporting'.
An example is: Wonder Graphs
We are looking for a suite that will integrate with our MySQL database and provide us with:
'Live' graphical Interfaces (Graphs, Charts) for viewing our data which are automatically updated
The ability to 'Drill Down' using these charts to see more specific information.
For example if a chart shows total sales, we want to be able to click on that graph and be shown information on type of sales.
The ability to export to Excel
An easy-to-use user interface that allows non-technical users to create and customise their own views or dashboards.
If anyone can list software they use, have used, or know to be good that would be a great help.
If there is an open-source example available, that would be great; however, we are expecting to pay for such software.
Let me know if I have been too vague on details.
Thanks in advance,
James
There are a few things to consider. Firstly, how much data do you have? MySQL isn't designed to be an analytics database. If you have a "small amount" of data, then it doesn't matter. However, if you grow or plan to grow, you may want to copy the data over to an analytic database such as Infobright. Infobright does have an open-source option.
On top of the database, you have a few open-source BI solutions that will work very well. Take a look at Pentaho, Jaspersoft, and Actuate/BIRT. Actuate has some great drill down options, and they also have a way to easily get this data to a mobile device.
Full disclosure: I am the open-source guy for Infobright.
In my research I discovered the following site which seems to be a good place to start for anyone looking at this topic:
Dashboard Insight
This site includes a long list of available suites and tools.
We're currently using Dundas Dashboard and have had a lot of success with it. I am pretty sure it has a data source connector for MySQL; you just have to install a driver that they have available on their site.
All in all, great product, a little bit of a learning curve, but we've had a lot of success with it.
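Whichever suite is chosen, the drill-down described in the question (total sales, then sales by type) usually comes down to two aggregate queries over the same data. The following is only a minimal sketch of that idea: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the `sales` table, its columns, and the sample rows are all hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the real MySQL database.
conn = sqlite3.connect(":memory:")

# Hypothetical schema and sample data; a real sales table will differ.
conn.execute("CREATE TABLE sales (sale_type TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("online", 120.0), ("retail", 80.0), ("online", 50.0)],
)


def total_sales(conn):
    """Top-level figure: what the summary chart would show."""
    return conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]


def sales_by_type(conn):
    """Drill-down: the same figure broken out by type of sale."""
    return conn.execute(
        "SELECT sale_type, SUM(amount) FROM sales "
        "GROUP BY sale_type ORDER BY sale_type"
    ).fetchall()


print(total_sales(conn))
print(sales_by_type(conn))
```

A dashboard tool generates queries of this shape behind the charts; the sketch only shows what "clicking through" means at the data level.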

Best tutorial to learn SSIS [closed]

Closed 10 years ago.
Which book is best for learning SSIS? In my project we need to take input from a CSV file and, after processing the data in SQL Server 2008, export it back to an Excel file. ASP.NET is used as the UI for this.
Thanks,
Nabin
I completely agree with Cade in terms of simply working with it. I found that trying to follow specific "tutorials" to try and learn the package didn't really help but having a number of useful resources definitely came in handy.
At work, we had this book kicking around but really it just went over the flow objects available without going into any real-world examples. Jamie Thomson's blogs (here and here) are both excellent online resources though and have been really helpful for me personally.
Try this book:
http://www.amazon.com/Professional-Microsoft-Integration-Services-Programmer/dp/0470247959
The best way to learn SSIS is just to do it. It is probably best to start and then refer to the book. Because the tool is so GUI-intensive, I tended to get more out of the book when reading it later, once I was already somewhat familiar with the environment.
Reading the material sometimes won't solve your real-world migration, because it may miss some particular functionality related to your project. I worked on a scenario like yours, migrating a database to SQL using intermediate CSV or text files.
http://msdn.microsoft.com/en-us/library/dd537533(SQL.100).aspx
We migrated nearly 1 TB in 30 minutes using SSIS 2008.
The link above could help you find information on specific properties of the source file according to your requirements.
thanks
prav

How do free online OCR programs compare to commercial ones? [closed]

Closed 10 years ago.
How much better would commercial OCR software be compared to the stuff that's available online for free?
More specifically: reading text in pictures (things like book covers, etc.)
I work with OCR quite a lot and can definitely vouch that the commercial offerings are much better than what you can find out there for free. Yes, you can make a free one 'work', but it will take a lot of effort for sub-optimal results.
I recommend finding a product that uses ABBYY FineReader: it does a great job with little configuration.
You may want to consider whether you need an SDK provided by the OCR supplier or an end-user application. The SDK will provide position details, etc., of what it finds and offer a lot more in-depth control, but will be more expensive. The end-user package will basically just read everything it finds, but you may be able to set it to automatic or control it in a rudimentary way. It might be good enough for what you're trying to do, and may be a lot cheaper.
Get a trial version and give it a go!
Google's OCRopus is free, open source, and one of the best.

Open Source Data Mining Software [closed]

Closed 10 years ago.
I was wondering: what is the best open source software that I can use for non-binary association rule generation? I need a non-binary implementation because converting my currently non-binary data to binary data would not give the desired results.
Thanks, and I can't wait to hear your comments!
Also take a look at Weka
Check out:
RapidMiner
and
R with Rattle
Try the Orange data mining toolkit.
http://www.ailab.si/orange/
Try Data Mining SDK.
These days I like KNIME. See http://knime.org.
You could also try another one called Tanagra: http://eric.univ-lyon2.fr/~ricco/tanagra/en/tanagra.html
It's mainly for research purposes but works well and has good tutorials here:
http://data-mining-tutorials.blogspot.com
I have an open-source software named SPMF with more than 130 algorithms related to association rules mining, frequent itemset mining, sequential rule mining and sequential pattern mining. You can check my webpage for more details and to download it:
It is Java source code. It has a simple graphical user interface. It also has many specialized algorithms that you will not find in other data mining software.
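To make the "non-binary" part of the question concrete: one common way to mine rules over multi-valued attributes without collapsing them to yes/no flags is to treat each (attribute, value) pair as its own item. The sketch below is only an illustration of that idea in plain Python, not how any of the tools mentioned above work internally. The records, the function names `to_items` and `mine_rules`, and the thresholds are all hypothetical, and it only looks at rules with one item on each side.

```python
from itertools import combinations

# Hypothetical toy records: every attribute can take several values (non-binary).
records = [
    {"weather": "sunny", "temp": "hot", "play": "no"},
    {"weather": "sunny", "temp": "mild", "play": "no"},
    {"weather": "rainy", "temp": "mild", "play": "yes"},
    {"weather": "rainy", "temp": "cool", "play": "yes"},
    {"weather": "overcast", "temp": "hot", "play": "yes"},
]


def to_items(record):
    """Encode a record as a set of (attribute, value) items."""
    return frozenset(record.items())


def mine_rules(records, min_support=0.4, min_confidence=0.8):
    """Return (lhs, rhs, support, confidence) for single-item rules lhs -> rhs."""
    transactions = [to_items(r) for r in records]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support(frozenset([a, b]))
        if pair_support < min_support:
            continue
        # Try the rule in both directions.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support(frozenset([lhs]))
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


for lhs, rhs, sup, conf in mine_rules(records):
    print(f"{lhs} -> {rhs} (support={sup:.2f}, confidence={conf:.2f})")
```

This brute-force pairing is fine for a handful of attributes; for real data sets a dedicated implementation such as the tools suggested above would be the better choice.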

A powerful management tool for MySQL with similar features to SQL Server Management Studio [closed]

Closed 10 years ago.
I am currently working with a developer who is experienced with MS SQL, but not so much with MySQL. He has been cursing MySQL for having bugs, and also for being far harder to use.
Is it because his experience with Management Studio has been so good? It seems to me that his problems are with using phpMyAdmin.
For example, he cites not being able to cross join and compare between tables of different structures using MySQL. Is the problem actually our choice of management tool, or does MySQL really have the flaws my developer thinks it does? I hope not, as I have just been blown away by how fast various data management tasks have been in Studio Manager.
You should really check SQLYog. It's great, and has a community version.
Hate to rain on your parade of tools, but while some of the ones mentioned here are pretty cool, none of them have the mojo of SSMS.
In my MySQL work, I basically switched between SQLYog and MySQL GUI Tools, depending on what I was doing. SQLYog Enterprise Edition (i.e. the non-free one) also adds support for really basic schema code completion.
Quest Toad is good and has paid and free versions for MySQL.
The free version is no longer available.
More tools to try: EMS SQL Management Studio or MySQL GUI Tools (now called MySQL Workbench)
I would suggest Aqua Data Studio. I don't think that it has a free version, however it is pretty powerful and has a ton of great features that are similar to SQL Server Management Studio.
VSQL++ for MySQL is a powerful GUI database management tool for MySQL.