Adjust MIDI Note Volume - language-agnostic

[I am doing this work in Java, but I think the question is language-agnostic.]
I have a MIDI Note On volume (called "data2," it's 0-127) that I am adjusting with a fader (0 to 127). The "math" I am using is simple:
newData2 = oldData2 * faderVolume / 127;
Zero works perfectly, and 127 does too, but the volumes close to the bottom of the range are way too loud, especially the louder notes. What might be a different relationship than a linear one (in pseudo-code would be great)? I will have to plug them into the code and try them, of course.
I realize that the answer may depend on the instrument that is playing the Note Ons (a BFD Kit in Ableton Live, which doesn't tell you much), but maybe it doesn't, and perhaps there's a standard way to adjust a MIDI Note On volume with a fader.

Your equation is correct. You are scaling the note-on velocity relative to the fader in a linear fashion. A couple of notes:
The parameter you are adjusting is velocity. This does not necessarily mean volume! The two are correlated for most synths (including your drum kit in Ableton), but velocity might not be as closely tied to volume as you think.
A velocity of 0 is equivalent to note-off and will never play a sound. I say this because if the difference between 0 and 1 is significant, it might be that volume isn't affected as much by the velocity parameter as you might think.
Finally, traditional mixer faders follow a logarithmic law. You might experiment with this, but again I think you are barking up the wrong tree with volume.
There is a MIDI message for channel volume that you should use for volume, and that is CC 7.
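If you still want to experiment with a non-linear fader law on velocity, here is a minimal sketch in Python (easy to port to Java). The function names and the default exponent of 2.0 are my own assumptions, not a standard; an exponent of 1.0 reproduces the linear formula from the question, and larger exponents pull the low end of the fader down faster. The second function builds the three bytes of a Control Change message for controller 7 (channel volume).

```python
def scale_velocity(velocity, fader, exponent=2.0):
    """Scale a note-on velocity (0-127) by a fader (0-127) along a power curve.

    exponent is an illustrative tuning knob, not a MIDI standard value:
    1.0 is the linear law from the question, larger values make low
    fader positions quieter.
    """
    if not 0 <= velocity <= 127 or not 0 <= fader <= 127:
        raise ValueError("velocity and fader must be in 0..127")
    gain = (fader / 127.0) ** exponent
    # Work in floating point, round once at the end, then clamp to the 7-bit range.
    return max(0, min(127, round(velocity * gain)))


def channel_volume_message(channel, volume):
    """Return the raw bytes of a Control Change 7 (channel volume) message."""
    if not 0 <= channel <= 15 or not 0 <= volume <= 127:
        raise ValueError("channel must be 0..15 and volume 0..127")
    # 0xB0 is the Control Change status byte; the low nibble carries the channel.
    return bytes([0xB0 | channel, 7, volume])
```

With a velocity of 100 and the fader at 64, the linear law gives 50 while the squared law gives 25, which is the kind of difference to listen for at the bottom of the range.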

As I said in my comment, when working with sound or audio, it is better to use doubles or floats (depending on the hardware or API specifications).
You are storing an integer in newData2. Convert the calculation to double or float instead (for precision).
e.g.
float newData2 = (float)oldData2 * (float)faderVolume / (float)127;
Hope this helps.

How to speed up the world in LibGDX without altering delta time

This is not a duplicate of this question, because I need a way to speed up my world without changing deltaTime and still have everything happen faster. Why can't I use deltaTime or change it? I'm using velocity Verlet symplectic integration to simulate orbital mechanics, and deltaTime needs to be as low as possible for increased precision. Therefore I have it set to Gdx.graphics.getDeltaTime() * 0.001f. I am using LibGDX in Java and, if it matters, I am using the Game class to structure my screens.
What have I tried or thought of?
Higher-order symplectic integrators would not require such a small deltaTime, but they are difficult to implement and are my plan B if this is not possible.
The question you say is not a duplicate is actually most likely your solution.
Your goal is a small time step of Gdx.graphics.getDeltaTime() * 0.001f. At a frame rate of 60 fps this can be written as 1f / 60f * 0.001f, so your current time step is about 0.000017. This is the value you want to use for Constants.TIME_STEP.
Then you just need to replace the call to WorldManager.world.step with a call to your physics function, passing Constants.TIME_STEP as the delta time.
However, because the time step is so small, the physics function will be called a large number of times, which means it has to be fast, or you will have to find a way to use a larger time step anyway.
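The idea can be sketched as a fixed-step accumulator where the world is sped up by running more physics steps per rendered frame while each step keeps the same small size. This is a language-neutral sketch in Python rather than LibGDX code: FixedStepRunner, time_scale and max_steps_per_frame are names I made up, and the step cap is my own addition to keep a slow frame from stalling the loop.

```python
TIME_STEP = 1.0 / 60.0 * 0.001  # fixed physics step from the answer, about 0.000017 s


class FixedStepRunner:
    """Runs a physics callback at a fixed step; time_scale speeds up the world."""

    def __init__(self, step_fn, time_scale=1.0, max_steps_per_frame=100_000):
        self.step_fn = step_fn                      # your physics function, called with TIME_STEP
        self.time_scale = time_scale                # 2.0 means the world runs twice as fast
        self.max_steps_per_frame = max_steps_per_frame  # hypothetical safety cap
        self.accumulator = 0.0

    def update(self, frame_delta):
        """Consume one rendered frame of time and return how many physics steps ran."""
        # Scaling the accumulated time changes how MANY steps run, not how big each step is.
        self.accumulator += frame_delta * self.time_scale
        steps = 0
        while self.accumulator >= TIME_STEP and steps < self.max_steps_per_frame:
            self.step_fn(TIME_STEP)
            self.accumulator -= TIME_STEP
            steps += 1
        return steps


calls = []
runner = FixedStepRunner(lambda dt: calls.append(dt), time_scale=2.0)
steps = runner.update(1.0 / 60.0)  # one 60 fps frame at double speed
```

At double speed one 60 fps frame runs roughly 2000 physics steps instead of roughly 1000, so the cost grows linearly with the speed-up, which is why the physics function has to be fast.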

Mapping Nonlinear Functions By Using Artificial Neural Network

I am dealing with a hard assignment and I am stuck on how to even start. What is the way to solve the following problem? Any help would be appreciated.
f(x)=1/x and x is between 0.1 and 1
The problem asks to train the network using the backpropagation algorithm with one hidden layer.
The training set will have 200 input/output patterns, the test set will have 100, and the validation set will have 50.
How can I solve this? Regards.
That sounds much more complicated than it actually is. The network does not know anything about what you actually want to represent with the input and output patterns, so do not worry about that. All you need to do is set up such a network (I assume that you know how to do that; otherwise just look around, there are a couple of libraries, and it is even possible to set it up quickly in Excel for testing purposes).
Then just run the training data against the network in a loop. Once the network is reasonably stable, store it and start testing.
I assume the representation of the patterns has been defined already? It's one of the most important points and determines the quality. The closer the x/y pairs are semantically, the closer their representation patterns have to be, meaning here the delta between x/y pairs. This matters in particular for the small x value/large y pairs!
Otherwise the network will not "understand" that and you can train forever, since there is no correct representation of the similarity, in this case the delta x and delta y.
For example, the value 7 in binary format is not close at all to the value 8. So if the network did not "learn" that, because it has never seen the 8, it will not work well.
So the closer the values, the more similar their representations should be for the network! That's the key.
Tweaking the parameters will then fine-tune your model.
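To make this concrete, here is a minimal sketch of a one-hidden-layer network trained with backpropagation on f(x)=1/x, in plain Python with no libraries. Everything beyond the assignment's numbers is my own assumption: 8 tanh hidden units, a linear output, a learning rate of 0.05, 200 epochs, and dividing the target by 10 so it lies in 0.1..1. Feeding x as a plain real number (rather than, say, a binary code) is what keeps nearby values close in representation.

```python
import math
import random

random.seed(0)  # fixed seed so the run is repeatable


def make_patterns(count):
    """Return count (x, target) pairs, x uniform in [0.1, 1], target = (1/x) / 10."""
    xs = [random.uniform(0.1, 1.0) for _ in range(count)]
    return [(x, (1.0 / x) / 10.0) for x in xs]


train, test, validation = make_patterns(200), make_patterns(100), make_patterns(50)

HIDDEN = 8  # assumed hidden layer size
w1 = [random.uniform(-0.5, 0.5) for _ in range(HIDDEN)]  # input -> hidden weights
b1 = [0.0] * HIDDEN                                      # hidden biases
w2 = [random.uniform(-0.5, 0.5) for _ in range(HIDDEN)]  # hidden -> output weights
b2 = [0.0]                                               # output bias (list so it can be mutated)


def forward(x):
    """Return (hidden activations, network output) for one input."""
    hidden = [math.tanh(w1[j] * x + b1[j]) for j in range(HIDDEN)]
    return hidden, sum(w2[j] * hidden[j] for j in range(HIDDEN)) + b2[0]


def mse(patterns):
    """Mean squared error of the network over a pattern set."""
    return sum((forward(x)[1] - target) ** 2 for x, target in patterns) / len(patterns)


def train_epoch(patterns, learning_rate):
    """One pass of stochastic gradient descent (backpropagation) over the patterns."""
    for x, target in patterns:
        hidden, output = forward(x)
        error = output - target
        for j in range(HIDDEN):
            # Backpropagate through tanh before w2[j] is changed.
            grad_hidden = error * w2[j] * (1.0 - hidden[j] ** 2)
            w2[j] -= learning_rate * error * hidden[j]
            w1[j] -= learning_rate * grad_hidden * x
            b1[j] -= learning_rate * grad_hidden
        b2[0] -= learning_rate * error


initial_mse = mse(train)
for _ in range(200):
    train_epoch(train, learning_rate=0.05)
final_mse = mse(train)
```

Compare mse(validation) between parameter choices while tuning, and report mse(test) only once at the end; multiply the network output by 10 to get back to 1/x.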

Motion Vectors and DCT residuals, are they related or independent?

I am working on a novel technique that uses already encoded H264 motion vectors from a pre-encoded video.
I need to know how the motion vectors and residuals are related. I need some very specific answers that I can't find answered anywhere else:
Are the motion vectors forward or backward? That is, does the vector indicate where the current block (4x4, 8x8, 8x4, ...) will be in the next frame (forward), or does it indicate where that block comes from (backward)?
In the case a block has multiple references (I don't know if that is even possible). How are those references added together? Mean? Weighted?
How is the residual error being compensated, per block (4x8, 8x4, etc)? Ignoring the sub blocks, and just partitioning the image in 8x8 chunks?
My ultimate goal, is to know from the video feed the "accuracy" of each motion vector. I can only do that with backwards prediction, and if the DCT residuals are per block. In that case I can measure the accuracy of the motion vector estimation by measuring the amount of residual error of that block.
Thanks in advance!!
PS: Reading through the 800 pages of H264 is no easy task....
The H264 standard is your friend. Also get the books by Ian Richardson, which are a bit more readable than the standard (but only a bit :)
"Are the motion vectors forward, or backward?" - they are backward. The MV for a block points to where that block came from.
"In the case a block has multiple references (I don't know if that is even possible). How are those references added together?" - it is possible, check out weightb and weightp options for x264. Can have up to two references, the explicit weights are encoded in the stream (I think as deltas from the neighbor weights, so usually zeros - but don't quote me on that; also I think whether weights are used is a flag somewhere, if not used the weights are equal by default)
"How is the residual error being compensated" - depends on the macroblock partitioning mode and transform size. The MVs are for each partition, the residuals are for the transform size tiled into the partition (so if a 16x16 is partitioned into two 16x8 and the transform is 8x8, each partition gets two transforms; if the transform is 4x4 each partition gets (16/4)x(8/4)=8 transforms).
For experiments, you can change the encoder settings to turn off B-frames and weighted P-frames, and also restrict the partitioning mode so that macroblocks are not partitioned (i.e. 16x16 only). This makes it much easier to try different motion vectors :)
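The partition/transform arithmetic above, plus the per-partition residual measure the question is after, can be sketched like this in Python. This is not tied to any decoder API: both function names are mine, and residual is assumed to be a 2D list of already-decoded residual samples that you have extracted yourself.

```python
def transforms_per_partition(part_width, part_height, transform_size):
    """Number of transform blocks tiled into one motion partition."""
    if part_width % transform_size or part_height % transform_size:
        raise ValueError("the transform must tile the partition exactly")
    return (part_width // transform_size) * (part_height // transform_size)


def residual_energy(residual, x, y, part_width, part_height):
    """Sum of squared residual samples inside one partition.

    residual is a hypothetical 2D list (rows of samples); (x, y) is the
    partition's top-left corner. A larger value suggests the motion vector
    predicted that partition less accurately.
    """
    return sum(residual[row][col] ** 2
               for row in range(y, y + part_height)
               for col in range(x, x + part_width))
```

Because the motion vector belongs to the partition while the residual is coded per transform block, summing over the whole partition is what lines the two up.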

The frequencies from the FFT are showing values that they shouldn't

I'm developing software that takes a monotonic .wav clip (piano) as input and shows the piano notes played in that clip. I'm using an FFT to calculate the frequencies, but it is giving me values such as 22360 Hz, whereas I expect to get around 260 to 600 Hz.
Can someone please help me with this?
Pianos put out a lot of powerful high harmonics or overtones, and thus an FFT should show amplitude in many high frequency bins. Perhaps you should use a pitch detection or estimation algorithm instead of just an FFT?
I think your problem is that you don't have enough samples, so the frequency resolution is poor. All you need to do is use more samples or just zero-pad. See here and here. That may help.
Hotpaw2 makes an important point about overtone content.
However another thing you will require is a window function to prevent frequency domain artifacts of the sampling interval from contaminating your result. The window function applied to the data before the FFT essentially fades the signal in and out smoothly to avoid this.
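Here is a small, self-contained sketch of windowing followed by picking the strongest bin, in plain Python with a naive DFT so that it needs no libraries (a real program would use an FFT routine). The function names, the 8000 Hz sample rate and the 440 Hz test tone are my own choices. Two details are worth checking in your own code: for real input only the bins below N/2 are meaningful, since the upper half mirrors them, and bin k corresponds to k * sample_rate / N Hz. Misreading either could produce an implausibly high frequency, though the question doesn't show enough to say that is what happened.

```python
import cmath
import math


def hann(length):
    """Hann window: fades the block in and out smoothly."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (length - 1)) for n in range(length)]


def peak_frequency(samples, sample_rate):
    """Frequency in Hz of the strongest bin below Nyquist, after Hann windowing."""
    count = len(samples)
    windowed = [s * w for s, w in zip(samples, hann(count))]
    best_bin, best_magnitude = 0, 0.0
    for k in range(1, count // 2):  # skip DC, stop below Nyquist
        # Naive DFT of bin k; O(N) per bin, fine for a short demo block.
        value = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / count)
                    for t in range(count))
        if abs(value) > best_magnitude:
            best_bin, best_magnitude = k, abs(value)
    return best_bin * sample_rate / count  # convert bin index to Hz


SAMPLE_RATE = 8000
BLOCK = 512
tone = [math.sin(2.0 * math.pi * 440.0 * n / SAMPLE_RATE) for n in range(BLOCK)]
estimate = peak_frequency(tone, SAMPLE_RATE)
```

The estimate can only land on a multiple of SAMPLE_RATE / BLOCK (15.625 Hz here), which illustrates the resolution point made above: a longer block gives finer bins.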

How to detect local maxima and curve windows correctly in semi complex scenarios?

I have a series of data and need to detect peak values in the series within a certain number of readings (window size) and excluding a certain level of background "noise." I also need to capture the starting and stopping points of the appreciable curves (ie, when it starts ticking up and then when it stops ticking down).
The data are high precision floats.
Here's a quick sketch that captures the most common scenarios that I'm up against visually:
One method I attempted was to pass a window of size X along the curve going backwards to detect the peaks. It started off working well, but I missed a lot of conditions initially not anticipated. Another method I started to work out was a growing window that would discover the longer duration curves. Yet another approach used a more calculus based approach that watches for some velocity / gradient aspects. None seemed to hit the sweet spot, probably due to my lack of experience in statistical analysis.
Perhaps I need to use some kind of a statistical analysis package to cover my bases vs writing my own algorithm? Or would there be an efficient method for tackling this directly with SQL with some kind of local max techniques? I'm simply not sure how to approach this efficiently. Each method I try it seems that I keep missing various thresholds, detecting too many peak values or not capturing entire events (reporting a peak datapoint too early in the reading process).
Ultimately this is implemented in Ruby and so if you could advise as to the most efficient and correct way to approach this problem with Ruby that would be appreciated, however I'm open to a language agnostic algorithmic approach as well. Or is there a certain library that would address the various issues I'm up against in this scenario of detecting the maximum peaks?
My idea is simple: once you have your window of interest, find all the peaks in that window by comparing each value with the previous and the next one. After this you will know where the peaks occur and can decide which is the best peak.
I wrote a simple script in MATLAB to show my idea!
My example uses the waveform from an audio file :-)
waveFile='Chick_eco.wav';
% wavread was removed from newer MATLAB releases; there, use [y, fs] = audioread(waveFile)
[y, fs, nbits]=wavread(waveFile);
subplot(2,2,1); plot(y); legend('Original signal');

% cut out the window of interest
startIndex=15000;
WindowSize=100;
endIndex=startIndex+WindowSize-1;
frame = y(startIndex:endIndex);
nframe=length(frame)

% find the peaks: positive samples greater than the previous one and
% at least as large as the next one
peaks = zeros(nframe,1);
k=3;
while(k <= nframe - 1)
    y1 = frame(k - 1);
    y2 = frame(k);
    y3 = frame(k + 1);
    if (y2 > 0)
        if (y2 > y1 && y2 >= y3)
            peaks(k)=frame(k);
        end
    end
    k=k+1;
end

% hide non-peak samples when overlaying on the plot
peaks2=peaks;
peaks2(peaks2<=0)=nan;
subplot(2,2,2); plot(frame); legend('Get Window Length = 100');
subplot(2,2,3); plot(peaks); legend('Where are the PEAKS');
subplot(2,2,4); plot(frame); legend('Peaks in the Window');
hold on; plot(peaks2, '*');

% print the position and value of every peak
for j = 1 : nframe
    if (peaks(j) > 0)
        fprintf('Local=%i\n', j);
        fprintf('Value=%i\n', peaks(j));
    end
end

% where the largest local maximum occurs
[maxivalue, maxi]=max(peaks)
You can see all the peaks and where they occur:
Local=37
Value=3.266296e-001
Local=51
Value=4.333496e-002
Local=65
Value=5.049438e-001
Local=80
Value=4.286804e-001
Local=84
Value=3.110046e-001
I'll propose a couple of different ideas. One is to use discrete wavelets, the other is to use the geographer's concept of prominence.
Wavelets: Apply some sort of wavelet decomposition to your data. There are multiple choices, with Daubechies wavelets being the most widely used. You want the low frequency peaks. Zero out the high frequency wavelet elements, reconstruct your data, and look for local extrema.
Prominence: Those noisy peaks and valleys are of key interest to geographers. They want to know exactly which of a mountain's multiple little peaks is tallest, and the exact location of the lowest point in the valley. Find the local minima and maxima in your data set. You should have a sequence of min/max/min/max/.../min. (You might want to add arbitrary end points that are lower than your global minimum.) Consider a min/max/min sequence. Classify each of these triples by the difference between the max and the larger of the two minima. Make a reduced sequence that replaces the smallest of these triples with the smaller of the two minima. Iterate until you get down to a single min/max/min triple. In your example, you want the next layer down, the min/max/min/max/min sequence.
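As a rough illustration of the prominence idea, here is a Python sketch (it ports directly to Ruby). Note that it is not the iterative triple-merging procedure described above: it computes the usual topographic prominence of each local maximum directly, by walking outwards until a higher sample is met. The function names and the min_prominence threshold are my own.

```python
def local_maxima(values):
    """Indices greater than their left neighbour and at least their right one."""
    return [i for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] >= values[i + 1]]


def prominence(values, peak):
    """Height of the peak above the higher of the two lowest points found
    walking left and right until a strictly higher sample (or the edge)."""
    height = values[peak]
    left_min = height
    i = peak - 1
    while i >= 0 and values[i] <= height:
        left_min = min(left_min, values[i])
        i -= 1
    right_min = height
    i = peak + 1
    while i < len(values) and values[i] <= height:
        right_min = min(right_min, values[i])
        i += 1
    return height - max(left_min, right_min)


def prominent_peaks(values, min_prominence):
    """Local maxima whose prominence is at least min_prominence."""
    return [i for i in local_maxima(values)
            if prominence(values, i) >= min_prominence]
```

For the series [0, 2, 1, 3, 0, 1, 0], the bump at index 1 has a prominence of only 1 because it sits on the shoulder of the taller peak at index 3, so a threshold of 2 keeps index 3 alone.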
Note: I'm going to describe the algorithmic steps as if each pass were distinct. Obviously, in a specific implementation, you can combine steps where it makes sense for your application. For the purposes of my explanation, it makes the text a little more clear.
I'm going to make some assumptions about your problem:
The windows of interest (the signals that you are looking for) cover a fraction of the entire data space (i.e., it's not one long signal).
The windows have significant scope (i.e., they aren't one pixel wide on your picture).
The windows have a minimum peak of interest (i.e., even if the signal exceeds the background noise, the peak must have an additional signal excess of the background).
The windows will never overlap (i.e., each can be examined as a distinct sub-problem out of context of the rest of the signal).
Given those, you can first look through your data stream for a set of windows of interest. You can do this by making a first pass through the data: moving from left to right, look for noise threshold crossing points. If the signal was below the noise floor and exceeds it on the next sample, that's a candidate starting point for a window (vice versa for the candidate end point).
Now make a pass through your candidate windows: compare the scope and contents of each window with the values defined above. To use your picture as an example, the small peaks on the left of the image barely exceed the noise floor and do so for too short a time. However, the window in the center of the screen clearly has a wide time extent and a significant max value. Keep the windows that meet your minimum criteria, discard those that are trivial.
Now examine your remaining windows in detail (remember, they can be treated individually). The peak is easy to find: pass through the window and keep the local max. With respect to the leading and trailing edges of the signal, you can see in the picture that the true extent of each signal is slightly larger than the span over which it exceeds the noise floor. In this case, you can use a finite difference approximation to calculate the first derivative of the signal. You know that the leading edge will be somewhat to the left of the window on the chart: look for a point at which the first derivative exceeds a positive noise floor of its own (the slope turns upwards sharply). Do the same for the trailing edge (which will always be to the right of the window).
Result: a set of time windows, the leading and trailing edges of the signals, and the peak that occurred in each window.
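The passes above can be sketched as follows, in Python for brevity (it translates almost line for line to Ruby). The function names and the parameters noise_floor, min_width, min_peak and slope_floor are placeholders for the thresholds you would have to choose for your data, and the first two passes are folded into one function.

```python
def find_windows(samples, noise_floor, min_width, min_peak):
    """Return (start, end, peak_index) for each run above noise_floor that is
    at least min_width samples long and reaches at least min_peak."""
    runs = []
    start = None
    for i, value in enumerate(samples):
        above = value > noise_floor
        if above and start is None:
            start = i                      # upward crossing: candidate start
        elif not above and start is not None:
            runs.append((start, i - 1))    # downward crossing: candidate end
            start = None
    if start is not None:                  # signal still above the floor at the end
        runs.append((start, len(samples) - 1))

    windows = []
    for start, end in runs:
        peak_index = max(range(start, end + 1), key=lambda k: samples[k])
        if end - start + 1 >= min_width and samples[peak_index] >= min_peak:
            windows.append((start, end, peak_index))
    return windows


def extend_edges(samples, start, end, slope_floor):
    """Widen a window outwards while the finite-difference slope towards the
    window still exceeds slope_floor."""
    while start > 0 and samples[start] - samples[start - 1] > slope_floor:
        start -= 1
    while end < len(samples) - 1 and samples[end] - samples[end + 1] > slope_floor:
        end += 1
    return start, end
```

For example, with samples = [0.1, 0.2, 1.5, 0.1, 0.1, 0.6, 2.0, 4.0, 3.0, 1.2, 0.4, 0.1], a noise floor of 1.0, a minimum width of 3 and a minimum peak of 3.0, the lone spike at index 2 is discarded and the window (6, 9) with its peak at index 7 is kept; extending its edges with a slope floor of 0.2 widens it to (4, 11).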
It looks like the definition of a window is the range of x over which y is above the threshold. So use that to determine the size of the window. Within that, locate the largest value, thus finding the peak.
If that fails, then what additional criteria do you have for defining a region of interest? You may need to nail down your implicit assumptions to more than 'that looks like a peak to me'.