Transcoding audio/video/image files on an Android device - android-mediacodec

I am working on a chat application like WhatsApp, and I want to transcode media files before uploading them to the server. I have gone through many links but cannot decide which method I should use. Is there a straightforward way of transcoding on Android?
FFMPEG: I found that it is a highly CPU-intensive process, so it will consume more battery power.
MediaCodec: I want to do the transcoding using MediaCodec, but I cannot find clear steps that explain the process.
Best link to give idea about transcoding
Library to transcode using MediaCodec (it has many bugs)

We used both implementations in our video editing app. Basically, we use the MediaCodec implementation if the Android version is >= 4.3 and FFMPEG otherwise.
The problems with using FFMPEG:
As you said, it is a CPU-intensive process and thus consumes more battery.
The x264 encoder is licensed under the GPL, so you might want to use the OpenH264 encoder instead, which only supports the Baseline Profile; video quality is therefore not the best.
Since it uses a software encoder, processing is relatively slow, at least compared to the MediaCodec implementation.
MediaCodec also has some cons, for example:
If you want to do transcoding, the Android version needs to be >= 4.3 unless you want to deal with color format conversion yourself, which is a complete mess, since each vendor may have its own color format implementation. (Since 4.3, MediaCodec supports encoding from an input surface.)
Hardware encoders may behave differently on different models. (For example, some encoders may produce B-frames, which are not yet supported by Android's MediaMuxer, so you may want to use FFMPEG for the muxing part.)
So I would say: if you only support newer Android versions, use MediaCodec; but if you want to be safe (it is easier to write code that works on all devices) and do not really mind the performance, use FFMPEG with OpenH264.
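The rule of thumb in this answer can be sketched as a small selection function. This is only an illustration in Python (on Android the check would live in Java or Kotlin and read Build.VERSION.SDK_INT); the function name and the returned labels are hypothetical. Android 4.3 corresponds to API level 18.

```python
# Android 4.3 is API level 18, the first release where MediaCodec
# can encode from an input Surface.
MEDIACODEC_MIN_SDK = 18


def choose_transcoder(sdk_int: int, must_work_everywhere: bool = False) -> str:
    """Return a hypothetical backend label following the answer's rule of thumb."""
    if must_work_everywhere or sdk_int < MEDIACODEC_MIN_SDK:
        # Software path: slower and heavier on battery, but predictable.
        return "ffmpeg+openh264"
    # Hardware path: fast, but behaviour can differ between vendors.
    return "mediacodec"
```

The `must_work_everywhere` flag stands for the "play it safe and accept lower performance" case described above.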

Android's MediaCodec is a relatively better way to transcode on the client, since it uses its own low-level buffer processing. However, it does not offer as much freedom for tweaking as FFmpeg does.
As for MediaCodec itself, it is also CPU intensive because it has to hold and process the buffers, but far less so than FFmpeg.

Related

How to support all video formats in HTML5

I have developed a website where the user can upload videos.
Everything worked fine until I discovered that if the video codec is not supported, it will not be displayed by the browser.
I realized that the same video that my site cannot display plays perfectly on YouTube.
Is there a way to support all video formats in HTML5?
If not, is there any way I can convert the video to another format with JavaScript, or with Java on the backend?
Every help is welcome!!!
Current HTML5 implementations do not provide any way to reach your goal; even if they did, it would be very OS and browser specific. What I do to get this done is install a local application that "live encodes" the stream locally and streams the output to the HTML5 video element.
In fact, there is no way at all to support "all video formats" in one shot. The best you can currently do is use ffmpeg on the backend to transcode.
Simply install ffmpeg on the backend, and from your backend Java code use the Runtime.getRuntime().exec method to call an ffmpeg command line, like this:
ffmpeg -i "%yourvideo%" "%youroutput%.mp4"
Building an ffmpeg command line that is compatible with a wide range of input formats is a different topic altogether, but the command above will already handle many formats.
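A hedged sketch of the same idea, written in Python rather than Java: building the command as an argument list instead of a single string avoids quoting problems with file names that contain spaces (Java's ProcessBuilder accepts a list in the same way). The function names and the particular codec flags are illustrative assumptions, not part of the answer; `-c:v libx264`, `-c:a aac`, `-pix_fmt yuv420p` and `-movflags +faststart` are standard ffmpeg options often used for browser-friendly MP4 output, and libx264 must be enabled in your ffmpeg build.

```python
import subprocess
from pathlib import Path
from typing import List


def build_ffmpeg_command(src: str, dst_dir: str) -> List[str]:
    """Build an ffmpeg argument list that converts src to an H.264/AAC MP4."""
    out = Path(dst_dir) / (Path(src).stem + ".mp4")
    return [
        "ffmpeg",
        "-y",                       # overwrite the output if it exists
        "-i", src,                  # input file, any format ffmpeg can read
        "-c:v", "libx264",          # H.264 video (assumes libx264 is built in)
        "-pix_fmt", "yuv420p",      # pixel format most browsers can decode
        "-c:a", "aac",              # AAC audio
        "-movflags", "+faststart",  # move the index to the front of the file
        str(out),
    ]


def transcode(src: str, dst_dir: str) -> int:
    """Run ffmpeg and return its exit code (0 means success)."""
    return subprocess.run(build_ffmpeg_command(src, dst_dir)).returncode
```

A non-zero return value from `transcode` usually means ffmpeg could not read the upload, which is a reasonable point to reject the file and tell the user.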
Edit: please be aware of the consequences of transcoding on your server. It uses huge amounts of CPU, and offloading the work to the GPU is extremely complex. Because of this, you should not expect a pure-Java way to do the same job: it would cost even more CPU than the compiled C and assembler code of ffmpeg. Even if you find a way to do it natively in Java, it will take weeks or months of R&D to achieve what a simple command-line exec could do for you.
Edit2: another option might be online encoding services such as encoding.com. But those cost a lot of money compared to running ffmpeg locally.

Using DX11 and DXVA2

I am trying to test decoding an H.264/H.265 video (with just a single I-frame) using DX11 and DXVA2. This is on Windows 7, so I probably have to interop between two D3D11 devices, one with the 11.1 feature set and the other with 9.3. Since there is a severe lack of samples for loading an H.264 file and decoding it using DXVA, I was wondering whether there is a guide on how to lay out the data to feed into DXVA for decoding. I've read "How do I use Hardware accelerated video/H.264 decoding with directx 11 and windows 7?" as well as https://msdn.microsoft.com/en-us/library/windows/desktop/hh162912(v=vs.85).aspx but neither has any guide on how to do the above.
Thanks
If you want a working sample to understand how to feed data into DXVA, look here: MFNode. Under MFTDxva2Decoder, you will see how to feed data. It is for the MPEG-1/2 file format, but the same applies to H.264 (with some differences, of course).
EDIT
See my response: How do I use Hardware accelerated video/H.264 decoding with directx 11 and windows 7?

Display RTSP stream in Adobe AIR

I am working on a project that involves displaying a video feed from an IP camera using Adobe AIR. I know that Flash has no native support for the RTSP protocol, so I am evaluating all possible routes I can take to solve this issue:
Use Adobe Media Server to convert incoming RTSP stream to RTMP and then use Flash API (NetConnection & NetStream) directly.
Write a custom class to fetch, decode and display the stream in adobe AIR. [I am unable to confirm if this is possible due to insufficient info on the net]
Give up on RTSP and instead fetch JPEG/MJPEG sequence of images and display them in AIR relatively easily but with doubtful live performance. [due to JPEG/MJPEG refresh interval of IP camera and same interval separately in AIR]
Use DirectShow Video Source Filter for JPEG and M-JPEG IP Cameras to process the JPEG/MJPEG stream, create a virtual Webcam device (the filter does this automatically) and then use Camera class to display the video feed in AIR.
Use webcam 7, software designed to handle RTSP, JPEG/MJPEG and other stream protocols for many camera brands/models. It installs a driver that creates a virtual camera, which all other applications can then use as a normal webcam.
Unfortunately, this software is buggy and often becomes unstable (though possibly only with my particular camera model) and might even crash.
Are there any better, easier options that might not require any third-party software?
EDIT:
In case anybody else bumps into same problem:
As suggested by Rudolfs Bundulis, I decided to write a NativeProcess (ANE) that uses FFMPEG to fetch the RTSP stream data, transcode it, and feed it to Flash player.
You might want to look at these for more specific steps:
http://www.purplesquirrels.com.au/2013/02/converting-video-with-ffmpeg-and-adobe-air/
https://www.youtube.com/watch?v=6N7eN9wvAGQ
Take the route described in option 2: write an Adobe AIR native extension (ANE) that uses FFmpeg to handle the RTSP stream, decode it, and pass the RGB data back to AIR for rendering. The hardest part would be compiling FFmpeg if you need cross-platform functionality; however, since you mention DirectShow, which is Windows only, I assume you are bound to Windows. Zeranoe provides prebuilt FFmpeg libraries for Windows, Stack Overflow has a lot of topics on decoding a stream using FFmpeg, and then all you need is a callback to AIR and you're good.
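To make the "decode it and pass the RGB data back" step more concrete, here is a rough Python sketch of the FFmpeg side only (the ANE and the callback into AIR are not shown). It assumes the ffmpeg executable is run as a child process rather than linking the libraries, and the function names, URL and frame size are placeholders. With `-f rawvideo -pix_fmt rgb24`, every frame is exactly width * height * 3 bytes, so the consumer can cut the byte stream into frames without any container parsing.

```python
import subprocess
from typing import BinaryIO, Iterator, List


def rtsp_to_rgb_command(url: str, width: int, height: int) -> List[str]:
    """ffmpeg arguments that decode an RTSP stream to raw RGB24 on stdout."""
    return [
        "ffmpeg",
        "-rtsp_transport", "tcp",    # use TCP for the RTSP media transport
        "-i", url,
        "-an",                       # drop audio
        "-s", f"{width}x{height}",   # scale to a known frame size
        "-f", "rawvideo",
        "-pix_fmt", "rgb24",         # 3 bytes per pixel: R, G, B
        "pipe:1",                    # write to stdout
    ]


def read_frames(stream: BinaryIO, width: int, height: int) -> Iterator[bytes]:
    """Yield complete RGB24 frames; a trailing partial frame is dropped."""
    frame_size = width * height * 3
    while True:
        frame = stream.read(frame_size)
        if len(frame) < frame_size:
            return
        yield frame


def stream_frames(url: str, width: int, height: int) -> Iterator[bytes]:
    """Start ffmpeg and yield frames until the stream ends."""
    proc = subprocess.Popen(rtsp_to_rgb_command(url, width, height),
                            stdout=subprocess.PIPE)
    try:
        yield from read_frames(proc.stdout, width, height)
    finally:
        proc.kill()
```

Each yielded frame is what a native extension would hand to AIR as pixel data; fixing the frame size up front keeps the receiving side trivial.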

Decoding Audio / Audio Playback (AS3)

I'm interested in learning how to decode and playback audio in ActionScript 3. I understand how to write bytes to a Sound object using the SAMPLE_DATA event, so that's not really a problem. What I want to understand is how I could implement alternate audio formats for native playback inside of Flash Player.
I guess what I'm asking is: how do I take something in X format and "convert/decode" it to WAV format and write the bytes to a Sound object, playing back the audio? I'm interested in writing a decoder for FLAC audio and possibly OGG audio, as these seem to be some of the most widely used open source audio formats.
Can anyone give me any advice on this?
If you want to write a decoder, the first thing you should probably look at is the spec for the format you want to decode.
The ogg/vorbis spec can be found here: http://xiph.org/vorbis/doc/Vorbis_I_spec.html.
Also, it could help to take a look at (or maybe port) some other open source library that already does this (I'm not aware of any written in ActionScript), such as this one in Java: http://www.jcraft.com/jorbis/ (I don't know this library; I just found it by googling "ogg vorbis open source").
At any rate, you'll have to put in some work to get it working. I don't mean this to discourage you, but I'm not sure ActionScript is fast enough for real-time audio decoding.
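On the "convert to WAV" part of the question: a WAV file is essentially raw PCM samples behind a small RIFF header, so once a decoder has produced PCM, the remaining step is simple. The sketch below is in Python rather than ActionScript and only shows that last step; the decoder itself is the hard part and is not shown. The function name and the sine-wave stand-in data are illustrative assumptions.

```python
import io
import math
import struct
import wave
from typing import List


def pcm_to_wav(samples: List[float], sample_rate: int = 44100) -> bytes:
    """Wrap mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)          # mono
        wav.setsampwidth(2)          # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        # Clamp, scale to the signed 16-bit range, pack little-endian.
        ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
        wav.writeframes(struct.pack(f"<{len(ints)}h", *ints))
    return buf.getvalue()


# Stand-in for decoder output: 0.1 s of a 440 Hz tone.
tone = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(4410)]
wav_bytes = pcm_to_wav(tone)
```

In Flash, the SAMPLE_DATA handler takes floating-point samples directly, so a decoder there could likely skip the WAV wrapper and write its output samples straight to the event's byte array.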
You can try, but you're not going to have much grunt left to do other stuff. Prior to Flash 10, I wrote an article detailing a hack to feed PCM data into sound output in Flash. Someone got in touch because they had written an AS3 Ogg decoder, but... even after fully optimizing the code, it was found that AVM2 is not really up to the job. Basically, it's rather slow, and decoding OGG is quite processor intensive. I can't see that things will have changed much in the years since, because CPUs have become "wider" and not really that much faster. ActionScript is single threaded, so you can't offload to another core.
Probably worth checking out this... maybe performance has improved.
EDIT: Having said all that, as Juan has said, don't be discouraged by this answer. I suspect the computational demands of FLAC decoding are probably considerably less than those of OGG, and if DSP gets you excited, taking the time to figure all this stuff out is 100% worth it, even if the Flash route (possibly) leads to disappointment. Personally, I think that the MediaStreamSource for Silverlight looks really promising, but I haven't really dabbled in it that much.

Client-side image processing

We're building a web-based application that requires heavy image processing. We'd like this processing load to be on the client as much as possible, and we'd like to support as many platforms as possible (even mobiles).
Yeah, I know, wishful thinking
Here's the info:
Image processing is rasterization from some data. Think like creating a PNG image from a PDF file.
We don't have a lot of server power. So client-side processing is a bit of a must.
So, we're considering:
Flash - most widespread, but from what I read has lackluster development tools (and no iPhone/iPad support for now).
Silverlight - allows us to use the .NET CLR, so a big ++ (a lot of our code is in .NET). But it is not supported on most mobiles (rumored Android support in the future).
HTML5 + Javascript - probably the most "portable" option. The problem is having to rewrite all that image processing code in Javascript.
Any thoughts or architectures that might help?
Clarification: I don't need further ideas on what libraries are available for Silverlight and Javascript. My dilemma is
choosing Silverlight means no support for most mobiles
choosing Flash means we have to redevelop most of our code AND no iPhone/iPad support
HTML5 + Javascript we have to redevelop most of our code and not fully supported yet in all browsers
choosing two (Silverlight + Flash) will be too costly
Any out-of-the-box or bright ideas / alternatives I might be missing?
This is the sort of issue that software architects run up against all the time. As per usual, there is no ideal solution. You need to select which compromise is most acceptable to your business.
To summarise your problem, most of your image processing software is written in .NET. You'd like to run it client-side on mobile devices, but there is limited .NET penetration on mobiles. The alternatives with higher penetration (eg. Flash) would require you to re-write your code, which you can't afford to do. In addition, these alternatives are not supported on the iPhone/iPad.
What you ideally want is a way to run all your .NET code on most existing platforms, including iPhone/iPad. I can say with some confidence that no such solution currently exists - there is no "silver bullet" answer that you have overlooked.
So what will you need to compromise on? It seems to me that even if you redevelop in flash, you are still going to miss out on a major market (iPhone). And redeveloping software is extremely costly anyway.
Here is the best solution to your problem - you need to compromise on your "client side execution" constraint. If you execute server side, you get to keep your existing code, and also get to deploy to just about every mobile client, including the iPhone.
You said your server power is limited, but server processing power is cheap when compared to software development costs. Indeed, it is not all that expensive to outsource your server component and just pay for what you use. It's most likely that your application will only have low penetration to start off with. As the business grows, you will be able to afford to upgrade your server capacity.
I believe this is the best solution to your problem.
Host your image processing on Amazon EC2, Azure, or Google. IIRC EC2 has many common image processing problems packaged and all ready to go.
Azure is probably more familiar ground in terms of sharing code as a web service.
You just pay for CPU cycles, transfers, storage, etc.
I'm sure there will be Silverlight and JS people posting examples. Here are some image editors written in actionscript:
Phoenix
PhotoshopExpress
There is an ImageProcessing library to start with.
Plus, PixelBender is available in Flash Player 10; it's fast, it runs in a separate thread, and people do some pretty mad things with it.
HTH
Some help for the Silverlight part:
There is a Silverlight image editor called Thumba.
And Nokola recently made one called EasyPainter, and he will also provide the source code in the future.
For the image conversion I would recommend the open source library ImageTools, which also includes some basic effects.
Silverlight has a class for pixel manipulation of bitmaps called WriteableBitmap. The open source library WriteableBitmapEx is a collection of extension methods for Silverlight's WriteableBitmap. The WriteableBitmap API is very minimalistic, and there is only the raw Pixels array for such operations. The WriteableBitmapEx library tries to compensate for that with extension methods that are as easy to use as built-in methods.
Pixel shaders can also be used to make some fast and advanced effects. Although they are limited to Shader Model 2, shaders can be used for fast blurring, tinting and similar things.
DISCLAIMER: I consider myself an advocate of the Flash platform. I admire Silverlight's huge potential as a technology to deploy almost any .NET content through the browser, but it has low penetration, is horribly marketed and, although perceived as such by many (mostly people who know neither Flash nor Silverlight), is no competitor to Flash, just as Flash is no competitor to Silverlight. The idealist in me loves the idea of doing everything in HTML+JS using a standard, instead of relying on third-party proprietary software. But the truth is, JS is slow and the API is limited, and implementations of JS, HTML and CSS are terribly inconsistent across browsers.
If you really wanna stick to .NET and are so interested in targeting the iPhone and its siblings, then you might wanna check out MonoTouch.
Still, even though this may surprise you, I am going to tell you to use Flash. :)
Why? The image processing bit is the smallest part of your application. Whatever it is you are writing, I am very sure of that. I don't know about Silverlight, but in Flash the filters used by "Thumba" and "EasyPainter" can be created within a day, most of them simply using ConvolutionFilter, ColorMatrixFilter, DisplacementMapFilter and BitmapData::paletteMap, or even simply by applying one of the other filters Flash offers out of the box. Anything additional can be created using PixelBender, which was pointed out by George. The kernel language is a subset of C, so porting classic filters shouldn't be too time consuming. Alchemy (an LLVM backend targeting Flash Player 10) would also be an option worth investigating, although it's not very stable yet.
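For readers unfamiliar with what a convolution filter does, here is a minimal, dependency-free Python sketch of a 3x3 convolution on a grayscale image stored as a list of rows. It is not Flash's ConvolutionFilter API, only the underlying technique; the function name, the edge handling (clamping coordinates at the border) and the example kernels are assumptions chosen for illustration.

```python
from typing import List

Image = List[List[int]]  # grayscale rows, values 0..255


def convolve3x3(img: Image, kernel: List[List[float]], divisor: float = 1.0) -> Image:
    """Apply a 3x3 kernel to every pixel, clamping coordinates at the edges.

    The kernel is applied without flipping, which makes no difference
    for symmetric kernels like the ones defined below.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    # Neighbour position, clamped so border pixels repeat.
                    sy = min(max(y + ky - 1, 0), h - 1)
                    sx = min(max(x + kx - 1, 0), w - 1)
                    acc += img[sy][sx] * kernel[ky][kx]
            # Normalise and clamp to the valid 8-bit range.
            out[y][x] = min(255, max(0, round(acc / divisor)))
    return out


BOX_BLUR = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]  # use with divisor 9
IDENTITY = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
```

Blur, sharpen, emboss and edge detection differ only in the nine kernel values and the divisor, which is why many classic filters reduce to a single call of this kind.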
The biggest part of your app will be a lot of GUI design, GUI implementation, business logic etc. Flash is really great when it comes to simple, yet reasonably fast image manipulation, and with the Flex framework and MXML you have a powerful tool to productively create the GUI of your app, which can interoperate very well with a multitude of server solutions for virtually any platform.
Also, Flash has a great and active community, offering tons of tutorials, code snippets, libraries and frameworks, and a big ecosystem, with cross-compilation tools to deliver Flash content to other platforms (including the upcoming Flash CS5, or the mentioned Elips). I don't understand where you got the impression that the Flash platform lacks development tools. The difference from the .NET suite is that they are provided by a multitude of vendors. The upcoming Flash Player 10.1 was already pointed out by George, but nevertheless I wanted to stress that it makes many of the cross-platform considerations obsolete.
Last but not least, I'd like to point out Haxe. It allows compiling to SWF, but also to C++, using the very same API provided by NME, to target the iPhone. There is also work in progress on an Android backend. If you aren't planning to launch within the next 4-5 months, then this is definitely an option.
Your issue is a perfect target for the Haxe programming language. Haxe is written for the web and can compile to JavaScript, Flash and Objective-C (possibly Java/.NET soon).
So you do not choose which platform you are going to invest in, but which language. Haxe is easy to adopt for an ActionScript programmer.
It makes no sense to run your image processing algorithms in a JavaScript sandbox when Flash is available, because Flash will be much faster. It also makes no sense to run heavy image processing algorithms on a mobile device like the iPhone with JavaScript. I would only support JavaScript as a last-resort fallback.
If you do not want to use Haxe, I would go with Flash. You can deploy your Flash application to the iPhone as well, if that is your concern. This is also great because you get native ARM code. There are actually great tools for professional Flash development available; FDT and IntelliJ IDEA are two of them. The best Haxe IDE is probably FlashDevelop at the time of writing.
So I would definitely not use JavaScript as the only solution. Haxe is perfect for what you are trying to achieve. If you do not trust or do not want to invest in Haxe, you can use Flash because of the iPhone/iPad export.
Depending on your usecase I would also encourage you to look at cloud hosting like Amazon EC2 and Google AppEngine for instance. Hosting costs are cheap and scaling will be easy for your task. The experience will be much better when it comes to complex operations that can take even a lot of time on a desktop system.
In addition to other answers, another option may be a hybrid solution. For example, use Flash/Silverlight for the majority of your target audience and use server-side processing for those that don't support it (or you could create a native app for iP[hone|ad])
You may have to do something like this anyway, as the mobiles you are targeting may have insufficient processing power, depending on how complex your image processing gets.
Of course you still have the option of upgrading your server which, although you've currently discounted, is probably far cheaper than spending development time creating/deploying/testing a client-side solution.
You can use Silverlight for all Silverlight-enabled clients and, for non-Silverlight clients, do the image processing server side. Since the Silverlight code is C#, you can compile it twice so that (mostly) the same code works both in Silverlight and outside it (i.e. on the server). This gets you the best of both worlds.
You don't say what language "all that code" you'd have to rewrite is in. Might a semiautomated translation to Javascript be practical?
Perhaps you could start out server-side, as CraigS suggests, and then move functions into the client over time instead of rewriting all at once.
Have you checked the editor of Pixlr.com ?
Take a look at their API as well..
The best solution is to use silverlight (so you already have the code ready). If the client can't run it (mobile phones, etc) then process it server-side.
It's the best compromise.
Depends on the type of image processing and the end user experience you are targeting.
As you are looking to target mobile phones, your image processing will need to take into consideration the type of handset the user or the recipient has (if messaging via SMS/MMS), as different handsets have different screen resolutions and handle different image formats for main images and thumbnails.
I'd suggest that you consider a hybrid cloud architecture, as was mentioned in the Microsoft PDC keynotes this year. This would enable you to have your own server(s) to support your application, but if you require additional capacity you can scale out into the cloud using AppFabric.
Additionally, to maximise the market availability of your product pulling the image processing to a common reusable infrastructure allows you to target different platforms, exploiting the positives in each.
I have worked on a solution that hosted its image processing and delivery infrastructure server side and then built different UI offerings allowing sales via desktops, MNOs and AppStores. It can work and from a business perspective can offer economies of scale benefits.
Why not mention Java applets?
Good sides are:
almost all browsers support them?
needs the JRE installed?
supported on all OSes
Java provides the Java Advanced Imaging (JAI) toolkit, but if a C++ DLL can be called, that is best (JNI can call a C++ DLL).
In Python, one of the most popular libraries for image processing is Pillow. Through the Pyodide project (Python running inside the browser via Emscripten), it's possible to use libraries like Pillow and NumPy for image (or matrix) processing, and convert the output to a base64 string (via the Python standard library). This can then be passed to your <img> HTML element, either through the native JS document API or with a library like React.
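A small sketch of the last step described here, turning image bytes into a base64 string that an <img> element can display. To stay dependency-free, it encodes a tiny PNG by hand with the standard library; in a real Pyodide setup, Pillow would normally produce the PNG bytes instead. The function names are hypothetical.

```python
import base64
import struct
import zlib
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0..255


def _chunk(kind: bytes, data: bytes) -> bytes:
    """One PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def encode_png(rows: List[List[Pixel]]) -> bytes:
    """Encode rows of RGB pixels as an 8-bit truecolour PNG."""
    height, width = len(rows), len(rows[0])
    # Each scanline starts with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + bytes(c for px in row for c in px) for row in rows)
    # IHDR: width, height, bit depth 8, colour type 2 (RGB), then
    # compression, filter and interlace methods, all 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return (b"\x89PNG\r\n\x1a\n"
            + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", zlib.compress(raw))
            + _chunk(b"IEND", b""))


def to_data_uri(png: bytes) -> str:
    """Return a data URI usable as the src attribute of an <img> element."""
    return "data:image/png;base64," + base64.b64encode(png).decode("ascii")


red, blue = (255, 0, 0), (0, 0, 255)
uri = to_data_uri(encode_png([[red, blue], [blue, red]]))
```

On the JavaScript side, assigning the resulting string to the `src` property of an image element is enough to display it; no server round trip is involved.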
The way I see it, there's no one solution that meets all of your needs. Your best option, imo, is to go with Flash and hope that Adobe sets an agreement with Apple to get Flash on the iPhone/iPad. The major downside, of course, is you'll have to rewrite much of your code.
If the mobile sector isn't absolutely critical, then choose the Silverlight option for reasons you mentioned already. You could also use Silverlight in an out-of-browser mode to work as a desktop application.