Convert music wav file to text symbols [closed] - fft

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I want to take an audio WAV file (instrumental: violin, etc.) as input, detect the frequency of every tone, and write the results out as a text sequence in the order the tones were played. I think I should take an FFT spectrum at regular intervals to get the frequency values. Help me out on how to proceed.

This is a very difficult problem, and you will need a good knowledge of signal processing to get any kind of usable results. You're right that the FFT is a good starting point, but you should read some of the other posts here and papers around the web. Search for 'pitch estimation', 'pitch detection' or 'music transcription'. You'll need to understand how a complex sound is composed of a number of sinusoids at related frequencies ('harmonics'), and why taking the peak of the FFT won't necessarily give you the pitch (some instruments have a spectrum where the fundamental frequency, i.e. the pitch, isn't the largest peak).
The Wikipedia page on Pitch Detection gives a good starting point. I'd suggest reading a few papers on the Autocorrelation Method and Harmonic Sum Spectrum.
https://stackoverflow.com/search?q=pitch+estimation
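To make the autocorrelation method mentioned above concrete, here is a minimal sketch of single-frame pitch estimation with NumPy. The synthetic test signal has a second harmonic that is *stronger* than the fundamental, which is exactly the case where picking the biggest FFT peak fails; the frame length, frequency bounds, and test tone are all illustrative choices, not prescriptions.

```python
import numpy as np

def estimate_pitch_autocorr(frame, sample_rate, fmin=50.0, fmax=2000.0):
    """Estimate the fundamental frequency of one frame via autocorrelation."""
    frame = frame - np.mean(frame)
    # Full autocorrelation; keep non-negative lags only.
    corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    # Restrict the lag search to the plausible pitch range.
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    lag = min_lag + np.argmax(corr[min_lag:max_lag])
    return sample_rate / lag

# Synthetic test: a 440 Hz tone plus a *stronger* 2nd harmonic at 880 Hz.
# A naive "largest FFT peak" approach would report 880 Hz here.
sr = 44100
t = np.arange(2048) / sr
frame = 0.4 * np.sin(2 * np.pi * 440 * t) + 0.8 * np.sin(2 * np.pi * 880 * t)
print(estimate_pitch_autocorr(frame, sr))  # close to 440 Hz
```

For real instrument recordings you would run this on short overlapping windows and then smooth or median-filter the per-frame estimates before converting them to note symbols.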

Related

Can YOLO be negatively affected by unlabeled images? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 2 years ago.
I have two traffic related datasets. One contains traffic signs and the other traffic lights.
I want to merge the two datasets and train the model to detect both of them.
Will unlabeled traffic-signs from the traffic-light dataset affect the training process and vice versa?
From what I've read so far, YOLO also learns contextual information about the objects, hence my concern.
As you mentioned, and as I also found: "YOLO sees the entire image during training and test time so it implicitly encodes contextual information about classes as well as their appearance". My reading, though, is that it considers the areas directly around the labels when adding their information to the trained network. So you will likely only lose the information from the unlabeled items; they will not negatively impact the labeled items.

Differentiating b/w memory bound and compute bound CUDA kernels [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
I am trying to write a static analyzer that differentiates between data-intensive and computation-intensive CUDA kernels. As far as I have researched this topic, there is not much literature on it. One way to accomplish this is to calculate the CGMA (compute-to-global-memory-access) ratio of the kernel: if it is 'too high', the kernel is probably compute intensive; otherwise, memory intensive.
The problem with the above method is that I can't decide on a threshold value for the ratio. That is, above what value should a kernel be classified as compute intensive? One option is to use the ratio of CUDA cores to load/store units as the threshold. What does SO think?
I came across a paper that calculates a parameter called 'memory intensity': they first compute an 'activity factor', which is then used to derive the memory intensity. The paper is linked here; memory intensity is defined on page 6.
Does a better approach exist? I am kind of stuck in my research because of this, and desperately need help.
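One principled way to pick the threshold the question is asking about is the roofline model's ridge point: a kernel whose arithmetic intensity (FLOPs per byte of DRAM traffic) falls below peak compute divided by peak bandwidth cannot saturate the ALUs and is memory-bound. A minimal sketch of that classification rule; the device numbers below are illustrative, not taken from any specific GPU:

```python
def classify_kernel(flop_count, bytes_moved, peak_gflops, peak_gbps):
    """Roofline-style classification: compare the kernel's arithmetic
    intensity (FLOPs per byte of memory traffic) to the device's ridge
    point (peak compute / peak bandwidth, also in FLOPs per byte)."""
    intensity = flop_count / bytes_moved   # FLOPs / byte
    ridge = peak_gflops / peak_gbps        # FLOPs / byte
    return "compute-bound" if intensity >= ridge else "memory-bound"

# Illustrative device: 10 TFLOP/s peak, 500 GB/s bandwidth -> ridge = 20 FLOPs/byte.
# SAXPY does 2 FLOPs per 12 bytes moved, far below the ridge:
print(classify_kernel(2, 12, 10000, 500))  # memory-bound
```

For a static analyzer, `flop_count` and `bytes_moved` would come from counting arithmetic and global load/store instructions in the kernel (e.g. from PTX/SASS), which sidesteps the arbitrary-threshold problem: the threshold is a property of the target hardware, not a magic number.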

Designing a circuit that calculates Hamming distance? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 2 years ago.
I came across this question and couldn't find anything on it in textbooks or on the internet. It seems pretty unique.
I guess there would be some comparators and adders involved, but I have no clue where to start.
The first step will undoubtedly be XORing the two bit sets. Then you need to count the number of logical ones in the output. The best method for designing your circuit would be to make a complete analogy of the hack discussed in this question, explained perfectly in nneonneo's answer there. This results in an optimal tree of adders rather than sequential counting. The idea is that at each layer you know the maximum possible sum of a subset of the inputs, and therefore how many bits it fits in, eliminating the need for a carry bit. The programming approach is designed for 32 bits but is easily modified for fewer or more.
For more possible algorithms for computing Hamming weight see this link.
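The XOR-then-count idea is easy to prototype in software before committing it to gates. The SWAR popcount from the linked answer maps directly onto the layered adder tree described above: each masking step is one layer, summing adjacent fields twice as wide as in the previous layer. A 32-bit sketch:

```python
def popcount32(x):
    """SWAR population count. Each line is one layer of the adder tree."""
    x = x - ((x >> 1) & 0x55555555)                  # pairwise 2-bit sums
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333)   # 4-bit field sums
    x = (x + (x >> 4)) & 0x0F0F0F0F                  # 8-bit field sums
    return (x * 0x01010101 & 0xFFFFFFFF) >> 24       # fold bytes into top byte

def hamming_distance(a, b):
    """Hamming distance = number of ones in the XOR of the two words."""
    return popcount32(a ^ b)

print(hamming_distance(0b1011, 0b0110))  # 3
```

In hardware, the final multiply-and-shift would be replaced by one more layer of adders; the masks correspond to simply routing the right wires into each adder.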

Which OCR Engine is better: Tesseract or OCRopus? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I have tried Tesseract on iPhone and assessed its accuracy at about 70% without image preprocessing. I also noticed that it can be poor at extracting digits. I have heard about the OCRopus OCR engine: which is better, Tesseract or OCRopus, in terms of digit extraction and when little image preprocessing is done?
Has anyone run tests using both engines comparing the results using the usual metrics?
Initially OCRopus actually used Tesseract as its internal recognition engine, but later they switched to their own brand-new engine. It is still fresh and not mature. We ran an accuracy comparison about a year ago, and OCRopus was definitely losing to Tesseract, not even counting commercial engines. Since then I have stopped following OCRopus's progress, but what I definitely know is that activity on the OCRopus support forum is close to zero now. That suggests no one is using it. Mostly people use commercial engines, but if price is an issue and they can tolerate lower accuracy, they use Tesseract. It is definitely the best among the open-source engines.
You can also compare the activity of the two projects via their "changes" links:
https://code.google.com/p/ocropus/source/list?repo=ocropy
https://code.google.com/p/tesseract-ocr/source/list
Tesseract is much busier.
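On the "usual metrics" point: the standard figure for head-to-head OCR comparisons is the character error rate (CER), i.e. the Levenshtein edit distance between an engine's output and the ground-truth transcription, divided by the ground-truth length. A minimal sketch you could run over both engines' outputs on the same test images:

```python
def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) via dynamic
    programming, keeping only the previous row of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def cer(ocr_output, ground_truth):
    """Character error rate: edit distance / ground-truth length."""
    return levenshtein(ocr_output, ground_truth) / len(ground_truth)

print(cer("he1lo w0rld", "hello world"))  # 2 errors / 11 chars ~ 0.18
```

Running the same metric over a digits-only test set would directly answer the digit-extraction part of the question, rather than relying on anyone's impression.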

Making my own Carbon Footprint Calculator [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 12 years ago.
I'm trying to create my own carbon footprint calculator, but I'm having trouble finding all the proper equations and such online. Does anyone know of any decent resources?
Wow, that is a huge question. In part because "all the proper equations" really depend on who is doing the asking. I would start here: http://www.withouthotair.com/
This resource is HUGE for this. =)
I think this project sounds very interesting!
If you are familiar with web development, it would be very cool to make this a web-based project, which allows for constant growth and development of the equations. You could even make it so that users of your web site can view the equations you are using, and input their own equations. Maybe you could even consider some sort of mechanism to fold back user equations into the base - or set up multiple different bases for different users of different lifestyles.
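Whatever equations end up in that user-editable base, the core of the calculator is the same shape: each activity amount multiplied by an emission factor, summed. A minimal sketch of that core; the factor values below are placeholders for illustration only, not authoritative figures, and real ones should come from sources like the book linked above:

```python
# Hypothetical emission factors, kg CO2e per unit of activity.
# Illustrative values only -- substitute published figures.
EMISSION_FACTORS = {
    "car_km": 0.2,           # per km driven
    "electricity_kwh": 0.5,  # per kWh consumed
    "flight_km": 0.25,       # per km flown
}

def footprint_kg(activities):
    """Sum each activity amount times its emission factor."""
    return sum(amount * EMISSION_FACTORS[kind]
               for kind, amount in activities.items())

print(footprint_kg({"car_km": 100, "electricity_kwh": 300}))  # 170.0
```

Keeping the factors in a table like this (rather than hard-coding them into equations) is also what makes the "users can view and submit their own equations" idea practical: contributions become data, not code.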
I didn't directly answer your question, but I hope these concepts are interesting and useful to you.
-Brian J. Stinar-