Which OCR Engine is better: Tesseract or OCRopus? [closed] - ocr

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I have tried Tesseract on an iPhone and found its accuracy to be 70% without image preprocessing. I also noticed that it seems to be poor at extracting digits. I have heard about the OCRopus OCR engine: which is better, Tesseract or OCRopus, in terms of digit extraction and when little image preprocessing is done?
Has anyone run tests using both engines comparing the results using the usual metrics?

Initially, OCRopus used Tesseract as its recognition engine, but its developers later replaced it with their own brand-new engine, which is still fresh and not mature. We ran an accuracy comparison about a year ago, and OCRopus was definitely losing to Tesseract, to say nothing of commercial engines. I have stopped following OCRopus's progress since then, but I do know that activity on the OCRopus support forum is close to zero now, which suggests that hardly anyone is using it. Most people use commercial engines; if price is an issue and they can tolerate lower accuracy, they use Tesseract. It is definitely the best of the open-source engines.
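For anyone who wants to rerun such a comparison, one of the usual metrics is character error rate (CER): the edit distance between the ground-truth text and the engine output, divided by the length of the ground truth. The following is a minimal sketch, not the procedure used in the comparison mentioned above. The function names are made up for illustration, and the sample strings in the demo are invented, not real output from either engine.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions each cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # drop a reference character
                            curr[j - 1] + 1,      # extra character in the output
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Edit distance normalised by the length of the ground-truth text."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Invented strings for illustration only; substitute real engine output.
    ground_truth = "Invoice 10482"
    engine_output = "Invoice 1O4B2"
    print(f"CER: {character_error_rate(ground_truth, engine_output):.3f}")
```

To focus on digit extraction, the same function can be applied after filtering both strings down to their digit characters.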

You can also check how active each project is via its "changes" link:
https://code.google.com/p/ocropus/source/list?repo=ocropy
https://code.google.com/p/tesseract-ocr/source/list
Tesseract is much busier.

Related

Is cloud functions a valid replacement/implementation of a distributed system? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
I want to process a list of data in parallel; processing one element won't affect the others.
With, for example, Google Pub/Sub + Cloud Functions, I could achieve something scalable and parallel, which looks like a distributed system.
I have little knowledge of distributed programming, and it seems that it takes a lot of time to master.
So I would like to know: is this a replacement for, or a valid implementation of, a distributed system?
For the specific use case you're talking about - dividing work among function invocations to run in parallel - yes, it sounds like that would be adequate.
I would be very hesitant to call it a full "distributed system" (at least not without your own very strict definition of what that really is). If you take Wikipedia's explanation of distributed computing, you might have a very basic system in place, but the lack of a direct peer-to-peer messaging system probably makes it unsuitable for many of the applications listed on that page.
The bottom line I think you should really consider is if it satisfies the requirements of the problem at hand. Whether or not it's a "distributed system" is mostly irrelevant - either it works or it doesn't for that use case.
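To make the use case concrete, here is a minimal local sketch of the fan-out pattern from the question: each element is processed independently, so the work can simply be handed to a pool of workers. This is only a stand-in that runs on one machine with Python's standard library; it is not the Pub/Sub or Cloud Functions API, and `process_item` and `process_all` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def process_item(item: int) -> int:
    """Hypothetical per-element work; it depends only on its own input."""
    return item * item


def process_all(items, max_workers: int = 4):
    # map() hands each element to a worker and returns results in input order.
    # This stays simple only because no element depends on another.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_item, items))


if __name__ == "__main__":
    print(process_all([1, 2, 3, 4]))
```

In the hosted setup, each function invocation would play the role of one worker call, and there is still no direct messaging between the workers.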

Is PostgreSQL or MySQL more popular with Node.js? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
In absolute terms, Postgres has more features and has been used at scale by Instagram, etc., while MySQL has a much bigger user base and has been used at scale by the likes of Facebook, Quora, etc. But how about in combination with NodeJS?
Which is more popular with NodeJS?
MySQL is probably more popular, solely in terms of userbase. (You sorta answered this yourself)
MySQL probably has more examples around the net which could help make things easier to set up. You'll probably also find MySQL more likely to come preloaded on a VPS if that's the sort of route you're taking. However setting up PostgreSQL on your server is not difficult, and there is plenty of documentation available.
It really depends on what your intentions are with your data. Digital Ocean wrote a nice concise comparison of MySQL and PostgreSQL found here
As for how these play with Node.js, in my experience the Node modules for PostgreSQL and MySQL are equally pleasant to work with. Ultimately it's more about picking the database that suits your data and what you want to do with it, then deciding how it fits into your Node stack.

What are the advantages and disadvantages of pretty printing JSON? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
Pretty-printed JSON will tend to be heavier than JSON rendered without pretty printing. Beyond that, I can't think of any other difference between pretty printing and not.
Let's say you want to provide web services for a public RESTful Web API, will it affect server performance, round-trip time, etc.?
So again, what are the advantages and disadvantages of pretty printing JSON?
Advantages
Easier to read
Disadvantages
The size of the data will increase
There will be some computational overhead
Thinking about this for all of 4 more seconds, I can't add any more to those lists! :P
Tools can read both versions just the same, but the pretty-printed versions are more readable for humans.
Why doesn't everyone pretty print?
For the same reason people don't turn on their car lights during the day, even though it would improve their visibility:
People don't think about it: json.dump() works, move on to the next problem.
It's not the default, so it takes a tiny bit of manual work
People don't see it as something they should optimize for.
People tend to micro-optimize other things (JSON string size, car battery usage)
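The trade-offs above are easy to check with Python's standard `json` module. This is a small sketch with a made-up payload; `encode_both` is a hypothetical helper name, and the actual size difference depends entirely on the shape of your data.

```python
import json


def encode_both(obj):
    """Return (compact, pretty) JSON encodings of the same object."""
    compact = json.dumps(obj, separators=(",", ":"))  # no optional whitespace
    pretty = json.dumps(obj, indent=2)                # not the default: opt in
    return compact, pretty


if __name__ == "__main__":
    # Made-up payload for illustration only.
    payload = {"id": 1, "tags": ["a", "b"], "owner": {"name": "example", "active": True}}
    compact, pretty = encode_both(payload)
    print(pretty)
    print(f"compact: {len(compact)} bytes, pretty: {len(pretty)} bytes")
    # Both encodings parse back to the same data.
    print(json.loads(compact) == json.loads(pretty))
```

For a public API, response compression typically removes much of the size difference, since the added whitespace is highly repetitive, but that is worth measuring on your own payloads.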

Making my own Carbon Footprint Calculator [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 12 years ago.
I'm trying to create my own carbon footprint calculator, but I'm having trouble finding all the proper equations and such online. Does anyone know of any decent resources?
Wow, that is a huge question. In part because "all the proper equations" really depend on who is doing the asking. I would start here: http://www.withouthotair.com/
This resource is HUGE for this. =)
I think this project sounds very interesting!
If you are familiar with web development, it would be very cool to make this a web-based project, which allows for constant growth and development of the equations. You could even make it so that users of your web site can view the equations you are using, and input their own equations. Maybe you could even consider some sort of mechanism to fold back user equations into the base - or set up multiple different bases for different users of different lifestyles.
I didn't directly answer your question, but I hope these concepts are interesting and useful to you.
-Brian J. Stinar-

What criteria do you use to quickly determine if a github project is finished/useable? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
When I browse github I have a hard time differentiating high quality code from half-finished crap without taking a serious look at the code. What are some good ways to quickly size up a project? Rubyforge allows people to designate a "Development Status". SourceForge has a "recommend" feature. Is there some feature that I've overlooked? I just look at the number of forks and watchers. Is there a better way? I don't see a checkout count, or any other measure of popularity.
I would check for documentation. Well advanced code should have associated documentation, while fledgling projects are too busy getting their code and architecture done to create documentation, which will probably have to change by the time they release anyway. Basically, writing documentation says to me that you think the code is stable and functional enough for users to be able to benefit from it.
Recent activity is a big one. If the project has no recent developer commits, or there are open bugs, tickets, issues, questions, etc. without developer responses, then move on.