Speex AEC on WP8

I'm working on a VoIP application for Windows Phone 8, and I want to cancel the echo produced when using speakerphone. Speex offers an AEC module, which I have tried to integrate into my application, but to no avail. My application works fine, but the echo persists. My code is based on the MS ChatterBox VoIP sample application, using WASAPI for capture and render. This is the form of the relevant sections (I tried to indicate what already existed and worked, and what is new):
Init:
// I've tried tail lengths between 100-500 ms (800-4000 samples @ 8 kHz)
echoState = speex_echo_state_init(80, 800);
speex_echo_ctl(echoState, SPEEX_ECHO_SET_SAMPLING_RATE, 8000);
Render (runs every 10ms):
Read 10ms (80 samples) of data from the network (8KHz, 16-bit, mono)
NEW - speex_echo_playback(echoState, networkData)
Upsample data to 48KHz
Render data
Capture (runs every 10ms):
Capture 10ms of data (48KHz, 16-bit, mono)
Downsample to 8KHz
NEW - speex_echo_capture(echoState, downsampledData, echoCancelledData)
Send echoCancelledData to network
After reading the Speex documentation and looking at some posts on this site (not a lot of Speex-on-WP8 questions, but a few for Android), I'm under the impression that this is, or is close to, the proper implementation of their API. So why isn't it working?
Thanks in advance

Related

How to control Turtlebot3 in Gazebo using a web interface?

If I have a simulated TurtleBot3 robot in Gazebo, how could I link it and control its movement from a self-made HTML/Bootstrap web interface? I have tried many tutorials, but none of them have worked (maybe because they are all from a few years ago). Would appreciate any recent links or tutorials!
You can do so by installing Gazebo, GzWeb, and the turtlebot3 package.
What is gzweb?
GzWeb is a client for Gazebo which runs in a web browser. It is usually installed on an Ubuntu server. Once the server is set up and running, clients can interact with the simulation simply by accessing the server's URL in a web browser.
For Gazebo and GzWeb installation, follow: http://gazebosim.org/tutorials?tut=gzweb_install&cat=gzweb
After creating a package using catkin, create a Python file turtlebot3_move_gz.py and add the following code to the script:
#!/usr/bin/env python3
import rospy
from geometry_msgs.msg import Twist

def talker():
    rospy.init_node('vel_publisher')
    pub = rospy.Publisher('/cmd_vel', Twist, queue_size=10)
    rate = rospy.Rate(2)
    move = Twist()        # the velocity message we will publish
    move.linear.x = 0.5   # linear velocity in the x direction
    move.angular.z = 0.0  # angular velocity around the z axis
    while not rospy.is_shutdown():
        pub.publish(move)
        rate.sleep()

if __name__ == '__main__':
    try:
        talker()
    except rospy.ROSInterruptException:
        pass
Save the file
Next steps
In terminal:
Launch the TurtleBot3 Gazebo simulation:
roslaunch turtlebot3_gazebo turtlebot3_world.launch
Start the GzWeb server in a new terminal. On the server machine, start gazebo or gzserver first; it's recommended to run in verbose mode so you see debug messages:
gzserver --verbose
Fire up another terminal to start npm:
npm start
Run your Python TurtleBot3 file from the catkin_ws directory:
rosrun name_of_the_package turtlebot3_move_gz.py
Open a browser that has WebGL and WebSocket support (i.e. most modern browsers) and point it to the IP address and port where the HTTP server is started, for example:
http://localhost:8080
To stop gzserver or the GzWeb servers, just press Ctrl+C in their terminals.
This is not something I have done before, but with a quick search I found some useful information.
You need to use rosbridge_suite, specifically rosbridge_server. The latter provides a low-latency bidirectional communication layer between a web browser and servers. This allows a website to talk to ROS using the rosbridge protocol.
Therefore, you need to have this suite installed and then what you can do is to use it to publish a Twist message from the website (based on the website UI controls) to Turtlebot's command topic.
Don't think of Gazebo in this equation. Gazebo is the simulator and uses ROS topics and services under the hood to simulate the robot. What you really need to focus on is how to make your website talk with ROS and publish a Twist message to the appropriate ROS topic.
I also found a JavaScript library from ROS called roslibjs that implements the rosbridge protocol specification. You can, therefore, use JavaScript to communicate with ROS and publish robot velocities to the TurtleBot.
An example excerpt from this tutorial (not tested):
<script type="text/javascript">
  var cmdVel = new ROSLIB.Topic({
    ros : ros,
    name : '/cmd_vel',
    messageType : 'geometry_msgs/Twist'
  });

  var twist = new ROSLIB.Message({
    linear : {
      x : 0.1,
      y : 0.2,
      z : 0.3
    },
    angular : {
      x : -0.1,
      y : -0.2,
      z : -0.3
    }
  });

  cmdVel.publish(twist);
</script>
As you can see above, the JavaScript code creates an instance of the Twist message with the linear and angular robot velocities and then publishes this message to ROS's /cmd_vel topic. What you need to do is integrate this into your website, make the velocities in this code dynamic based on the website UI controls, and start the rosbridge server.
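To make that concrete, here is a minimal, untested sketch of how simple buttons could feed dynamic velocities into that same publish call. The rosbridge WebSocket URL (ws://localhost:9090, the usual default), the roslib CDN path, and the drive() helper are all assumptions for illustration:

<script src="https://cdn.jsdelivr.net/npm/roslib@1/build/roslib.min.js"></script>
<button onclick="drive(0.2, 0.0)">Forward</button>
<button onclick="drive(0.0, 0.5)">Turn left</button>
<button onclick="drive(0.0, 0.0)">Stop</button>
<script type="text/javascript">
  // Connect to the running rosbridge server.
  var ros = new ROSLIB.Ros({ url : 'ws://localhost:9090' });

  var cmdVel = new ROSLIB.Topic({
    ros : ros,
    name : '/cmd_vel',
    messageType : 'geometry_msgs/Twist'
  });

  // Publish a Twist whose values come from the UI instead of being hard-coded.
  function drive(linearX, angularZ) {
    cmdVel.publish(new ROSLIB.Message({
      linear : { x : linearX, y : 0.0, z : 0.0 },
      angular : { x : 0.0, y : 0.0, z : angularZ }
    }));
  }
</script>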

Has anyone reversed engineered the protocol used by Apple's iOS Remote app for controlling an Apple TV over IP?

I'm curious if it's possible for me to write programs that can control an Apple TV, specifically an Apple TV 4th gen running tvOS 9.1.1, like Apple's Remote app for iOS can. I'd like to send it commands for navigating in the four cardinal directions, selecting an item on the screen, going up the navigation stack -- essentially what Apple's Remote app can do.
Has anyone done any work reverse engineering the protocol it uses? Cursory Googling has so far yielded only out-of-date results about earlier-generation Apple TVs and the DAAP protocol, which looks like something different from what I want.
I captured the traffic on my iPhone using tcpdump and analyzed it with Wireshark. The Remote app talks to the Apple TV with normal HTTP requests on port 3689.
The workflow of the app consists of four HTTP requests:
/server-info for getting info about the Apple TV. It responds with an Apple-proprietary DAAP (Digital Audio Access Protocol) response providing some tags about the device, like the display name.
/login is performed during connection, when the app displays the "Connecting to Apple TV..." message. It responds with a DAAP about the login status.
Here's the bottleneck. /home-share-verify validates the connection between the app and the Apple TV. This call needs a Client-DAAP-Validation header with a long unknown string value. According to Wikipedia, this seems to be a hash generated by a certificate exchange between verified sources, introduced in iTunes 7.0+ and never reverse engineered.
/ctrl-int/1/{controlpromptupdate|controlpromptentry|playstatusupdate} seem to be the calls made for the input buttons.
Some other minor calls are fired in between (like a Bonjour service update or a /databases call).
Here and here you can find more info. Hope this helps give an overview of how this simple (but protected) app works.
I wanted to tell Alexa to trigger my Apple TV, which would wake it up and, via HDMI-CEC, turn my TV on.
In order to do that, from your Mac/Linux/Windows machine simply run:
curl -XPOST -d 'cmcc\x00\x00\x00\x01\x30cmbe\x00\x00\x00\x04menu' 'http://10.1.1.56:3689/ctrl-int/1/controlpromptentry?prompt-id=144&session-id=1'
The abstract command is:
curl -XPOST -d 'cmcc\x00\x00\x00\x01\x30cmbe\x00\x00\x00\x04menu' 'http://{APPLETV_IP}:3689/ctrl-int/1/controlpromptentry?prompt-id={CONTROL_PAIR_ID}&session-id={CONTROL_SESSION_ID}'
I extracted the CONTROL_PAIR_ID and CONTROL_SESSION_ID by pointing my iPhone's Wi-Fi HTTP proxy settings at my Mac running Fiddler and then activating the old Apple TV Remote app, which displayed the requests the app was executing.
If you don't know how to set up an iPhone to work with Fiddler, see:
http://docs.telerik.com/fiddler/Configure-Fiddler/Tasks/ConfigureForiOS
I did manage to control my Apple TV (currently running tvOS 9.2) from a python script. It turns out that you don't need to use Home Sharing to have a remote app control the Apple TV. I don't know if the following method will work if Home Sharing is enabled, but with it disabled on the Apple TV, the iOS Remote app has the option to manually add a device. (This may require removing all of the devices it is already paired with, since that was unfortunately necessary for me to get it to display the 'Add a device' button.) Once I had paired my iPhone to the Apple TV, I recorded some of its requests, copied the pairing GUID, and then constructed some of my own requests.
The only three requests necessary to make are:
/login?pairing-guid=< your pairing guid here >&hasFP=1
Logs into the Apple TV. The last four bytes of the response are a session id, encoded as a big-endian four-byte integer.
/logout?session-id=< your session id here >
Logs out. Not strictly necessary, as I found that logging in simply gets you a new session id, but probably not a bad idea to do things the way it expects.
/ctrl-int/1/controlpromptentry?prompt-id=114&session-id=< your session id here >
Sends user input to the Apple TV. The data is one of several buffers that input a command, or possibly a moving touch. For movement in the cardinal directions, sending several of these requests to simulate a moving touch is necessary.
I have a python script demonstrating how to do this here:
http://pastebin.com/mDHc353A
Utilizes the requests library: http://docs.python-requests.org/en/master/
Also special thanks to Adam Miskiewicz / github user skevy, since I made use of this file in his atlas-backend repo that conveniently had the right buffers to send for movement: https://github.com/skevy/atlas-backend/blob/master/atlas/services/appletv.coffee
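For anyone who wants to experiment without the Python script, here is a minimal, untested Node.js (18+) sketch of the same three-request flow. The Apple TV IP and pairing GUID are placeholders, the menu payload is copied from the curl answer above, and the session-id parsing follows the big-endian note in the list:

// Hypothetical sketch, not tested against a real device.
const BASE = 'http://10.0.0.2:3689';       // replace with your Apple TV's IP
const PAIRING_GUID = '0x0000000000000001'; // placeholder; copy yours from a recorded pairing

async function pressMenu() {
  // 1. /login: the last four bytes of the DAAP response are the session id (big-endian).
  const loginRes = await fetch(`${BASE}/login?pairing-guid=${PAIRING_GUID}&hasFP=1`);
  const body = Buffer.from(await loginRes.arrayBuffer());
  const sessionId = body.readUInt32BE(body.length - 4);

  // 2. /ctrl-int/1/controlpromptentry: send the "menu" DAAP buffer from the curl example.
  const menu = Buffer.from('cmcc\x00\x00\x00\x01\x30cmbe\x00\x00\x00\x04menu', 'latin1');
  await fetch(`${BASE}/ctrl-int/1/controlpromptentry?prompt-id=114&session-id=${sessionId}`, {
    method: 'POST',
    body: menu,
  });

  // 3. /logout: optional, but plays nicely with the device's session handling.
  await fetch(`${BASE}/logout?session-id=${sessionId}`);
}

pressMenu().catch(console.error);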
For any people still checking out this question, I recommend checking out pyatv if you want to control your Apple TV through a Python or command-line interface.

Chrome extension to listen and capture streaming audio

Is it possible for a Chrome extension to listen for streaming audio from any of the browser's tabs? I would like to capture the streaming audio data and then analyse it.
Thanks
You could try three ways, though none of them provides a 100% guarantee of meeting your needs.
Before going into more detailed descriptions, I must note that Chrome extensions do not provide convenient tools for working at the per-connection level - the sufficiently low level required for stream capturing. This is by design. This is why the first way is:
To look at other browsers, for example Firefox, which provides low-level APIs for connections. They are already known to be used by similar extensions. You may have a look at MediaStealer. If you do not have a specific requirement to build your system on Chrome, you should possibly move to Firefox.
You can develop a Chrome extension which intercepts HTTP requests by means of the webRequest API, analyses their headers, and extracts media URLs (for example, those with an audio/mpeg MIME type in their HTTP headers); a sketch of this approach appears after this list. For a quick code example you may look at the following SO question - How to change response header in Chrome. Having the URL, you may force a download of the media as a file. It will land in the default downloads folder and may have an unfriendly name. (I made such an extension, but I do not have requirements for further processing.) If you need to further process such files, it can be a challenge to monitor them in the folder and run additional analysis in a separate program.
You may have a look at NPAPI plugins in general, and their streaming APIs in particular. I can imagine that you create a plugin registered for, again, the audio/mpeg MIME type, which receives the data via the NPP_NewStream, NPP_WriteReady and NPP_Write methods. The plugin can be wrapped into a Chrome extension. Though I have made NPAPI plugins, I never used this API, and I'm not sure it will work as expected. Nevertheless, I'm mentioning this possibility here for completeness. This method requires some coding other than web coding, meaning C/C++. NB: NPAPI plugins are deprecated and have not been supported in Chrome since September 2015.
Taking into account that you have some external (to the extension) "fingerprinting service" in mind, which sounds like intelligent data processing, you may be interested in building the whole system outside of a browser. For example, you could involve an HTTP proxy that saves media from passing traffic.
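As a rough, untested sketch of the second way above: a background-script listener that spots audio/mpeg responses with the webRequest API and forces a download. The exact header matching is an assumption, and the extension manifest must declare the "webRequest", "downloads" and host permissions:

// Hedged sketch (untested); the permissions noted above are required.
chrome.webRequest.onHeadersReceived.addListener(
  function (details) {
    var isAudio = (details.responseHeaders || []).some(function (h) {
      return h.name.toLowerCase() === 'content-type' &&
             h.value.indexOf('audio/mpeg') === 0;
    });
    if (isAudio) {
      // Force a download of the media URL; it lands in the default
      // downloads folder, possibly with an unfriendly name.
      chrome.downloads.download({ url: details.url });
    }
  },
  { urls: ['<all_urls>'] },
  ['responseHeaders']
);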
If you're writing a Chrome extension, you can use the Chrome tabCapture API to record audio.
chrome.tabCapture.capture({audio: true}, function(stream) {
    var recorder = new MediaRecorder(stream);
    [...]
});
The rest is left as an exercise to the reader; MDN has more documentation on how to use MediaRecorder.
When this question was asked in 2013, neither chrome.tabCapture nor MediaRecorder existed.
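As a starting point for that exercise, here is an untested sketch that collects the recording into a single Blob; the ten-second cutoff is arbitrary and the analysis step is left as a placeholder:

chrome.tabCapture.capture({audio: true}, function(stream) {
    var recorder = new MediaRecorder(stream);
    var chunks = [];

    // Collect encoded audio chunks as they become available.
    recorder.ondataavailable = function(e) { chunks.push(e.data); };
    recorder.onstop = function() {
        var blob = new Blob(chunks, { type: recorder.mimeType });
        // Hand `blob` to your analysis code here.
    };

    recorder.start();
    // Stop after ten seconds, for example.
    setTimeout(function() { recorder.stop(); }, 10000);
});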
Mac OS X solution using Soundflower: http://rogueamoeba.com/freebies/soundflower/
After installing Soundflower, it should appear as a separate audio device in the sound preferences (Apple > System Preferences > Sound). Divert the computer's audio to the 2ch option (stereo; 16ch is surround), then inside a DAW such as Audacity, set the audio input to Soundflower. Now the sound should be channeled to your DAW, ready for recording.
Note: having diverted the audio from the internal speakers to Soundflower, you will only be able to hear the audio if the 'Soundflowerbed' app is actually open. You know it's open if there's an 8-legged blob in the top-right menu bar. Clicking this icon gives you the Soundflower options.
My Privoxy log has the following entries:
2013-08-28 18:25:27.953 00002f44 Request: api.audioaddict.com/v1/di/listener_sessions.jsonp?_method=POST&callback=_AudioAddict_WP_ListenerSession_create&listener_session%5Bid%5D=null&listener_session%5Bis_premium%5D=false&listener_session%5Bmember_id%5D=null&listener_session%5Bdevice_id%5D=6&listener_session%5Bchannel_id%5D=178&listener_session%5Bstream_set_key%5D=webplayer&_=1377699927926
2013-08-28 18:25:27.969 0000268c Request: api.audioaddict.com/v1/ping.jsonp?callback=_AudioAddict_WP_Ping__ping&_=1377699927928
2013-08-28 18:25:27.985 00002d48 Request: api.audioaddict.com/v1/di/track_history/channel/178.jsonp?callback=_AudioAddict_TrackHistory_Channel&_=1377699927942
2013-08-28 18:25:54.080 00003360 Request: pub7.di.fm/di_progressivepsy_aac?type=.flv
So I got the stream URL and recorded it:
D:\Profiles\user\temp>wget pub7.di.fm/di_progressivepsy_aac?type=.flv
--18:26:32-- http://pub7.di.fm/di_progressivepsy_aac?type=.flv
=> `di_progressivepsy_aac#type=.flv'
Resolving pub7.di.fm... done.
Connecting to pub7.di.fm[67.221.255.50]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [video/x-flv]
[ <=> ] 1,234,151 8.96K/s
I got a file that can be played in any multimedia player.

Captured audio buffers are all silent on Windows Phone 8

I'm trying to capture audio using WASAPI. My code is largely based on the ChatterBox VoIP sample app. I'm getting audio buffers, but they are all silent (flagged AUDCLNT_BUFFERFLAGS_SILENT).
I'm using Visual Studio Express 2012 for Windows Phone. Running on the emulator.
I had the exact same problem and managed to reproduce it in the ChatterBox sample app if I set Visual Studio to native debugging and at any point stepped through the code.
Also, closing the App without going through the "Stop" procedure and stopping the AudioClient will require you to restart the emulator/device before being able to capture audio data again.
It nearly drove me nuts before I figured out the aforementioned problems, but I finally got it working.
So..
1. Be sure to NOT do native debugging
2. Always call IAudioClient->Stop(); before terminating the App.
3. Make sure you pass the correct parameters to IAudioClient->Initialize();
I've included a piece of code that works 100% of the time for me. I've left out error checking for clarity..
// Get the default communications capture device and activate an IAudioClient2 on it.
LPCWSTR pwstrDefaultCaptureDeviceId =
    GetDefaultAudioCaptureId(AudioDeviceRole::Communications);
HRESULT hr = ActivateAudioInterface(pwstrDefaultCaptureDeviceId,
    __uuidof(IAudioClient2), (void**)&m_pAudioClient);

// Query the device's mix format and derive the frame size in bytes.
hr = m_pAudioClient->GetMixFormat(&m_pwfx);
m_frameSizeInBytes = (m_pwfx->wBitsPerSample / 8) * m_pwfx->nChannels;

// Initialize the client in shared, event-driven mode.
hr = m_pAudioClient->Initialize(AUDCLNT_SHAREMODE_SHARED,
    AUDCLNT_STREAMFLAGS_NOPERSIST | AUDCLNT_STREAMFLAGS_EVENTCALLBACK,
    latency * 10000, 0, m_pwfx, NULL);

// Register the capture event and get the capture client service.
hr = m_pAudioClient->SetEventHandle(m_hCaptureEvent);
hr = m_pAudioClient->GetService(__uuidof(IAudioCaptureClient),
    (void**)&m_pCaptureClient);
And that's it.. Before calling this code I've started a worker thread that will listen to m_hCaptureEvent and call IAudioCaptureClient->GetBuffer(); whenever the capture event is triggered.
Of course, using Microsoft.Xna.Framework.Audio.Microphone works fine too, but it's not always an option to reference the XNA framework... :)
It was a really annoying problem which wasted about two complete days of mine. My problem was solved by setting AudioClientProperties.eCategory to AudioCategory_Communications instead of AudioCategory_Other.
After this long trial-and-error period, I am not sure the problem won't come back in the future, because the API doesn't behave very stably and every run may return a different result.
Edit: Yeah, my guess was true. Restarting the WP emulator makes the buffers silent again, but changing AudioClientProperties.eCategory back to AudioCategory_Other solves it again. I still don't know what is wrong with it or what the final solution is.
Again I encountered the same problem, and this time commenting out (removing) the line
properties.eCategory = AudioCategory_Communications;
solved the problem.
I can add my piece of advice for Windows Phone 8.1.
I made the following experiment.
Open capture device. Buffers are not silent.
Open render device with AudioDeviceRole::Communications. Buffers immediately go silent.
Close render device. Buffers are not silent.
Then I opened the capture device with AudioDeviceRole::Communications, and it worked fine all the time.
On Windows 10 the capture device works all the time, no matter whether you open it with AudioDeviceRole::Communications or not.
I've had the same problem. It seems like you can either use only AudioCategory_Other or create an instance of VoipPhoneCall and use only AudioCategory_Communications.
So the solution in my case was to use AudioCategory_Communications and create an outgoing VoipPhoneCall. You should implement the background agents as in the ChatterBox VoIP sample app for the VoipCallCoordinator to work.

Web Audio node connected to two gain nodes, connected to destination, duplicates speed / pitch

As the title says, if I have an audio node that emits sound and I connect it to two separate GainNodes, which in turn are connected to the AudioContext destination, the sound plays at double speed / double pitch (as if half the samples were sent to one gain node and half to the other, halving the playback time as well).
I have created a handy jsfiddle here; just drag your sound files into the black rectangle canvas and listen.
// audioContext: Web Audio context
// decoded: decoded audioBuffer
// gainNode1, gainNode2: gain nodes
var bSrc = audioContext.createBufferSource();
bSrc.connect(gainNode1);
bSrc.connect(gainNode2);
gainNode1.connect(audioContext.destination);
gainNode2.connect(audioContext.destination);
bSrc.buffer = decoded;
bSrc.loop = false;
// You'll hear two double-speed buffers playing in unison
bSrc.start(0);
Is that by design? What I would like is to exactly "duplicate" the sound (it will be sent down two different routes; the fiddle is just a proof of concept for a bigger project).
Edit:
I tested this on Chrome Version 24.0.1312.56 / Ubuntu 12.10 and the behaviour is present.
The behaviour is also present on Chrome Version 24.0.1312.68 / Ubuntu 12.10
On Chrome Version 24.0.1312.57 / Mac OSX, the Audio API works well and this behaviour is not present.
Could it be a Linux-only issue?
Sounds like a Linux implementation issue. It works for me in Chrome on OS X.