Extract number of video or audio files from Wikipedia article - mediawiki

I'm trying to extract the number of video or audio files present in a Wikipedia article. I searched the APIs but didn't find one for that.
I did notice that when I use the API to extract the images for a specific page, an audio file with the .ogg extension appears in the list along with the images:
http://ar.wikipedia.org/w/api.php?format=xml&action=parse&page=%D8%AD%D9%88%D8%AB%D9%8A%D9%88%D9%86&prop=images&redirects=
I don't know whether this behavior can be generalized and used to count video and audio files. Does anyone have another way to do this?

Basically, all file types are treated equally by the API, but you can fetch the mediatype of each file and use that to pick out the video and audio files.
To get the mediatype of a file, you would use prop=imageinfo (this will be changed to the more accurate prop=fileinfo in future versions) for each file. As prop=images can be used as a generator, you can get the list of files and their mediatypes in a single API call, like this:
https://ar.wikipedia.org/w/api.php?action=query&generator=images&titles=%D8%AD%D9%88%D8%AB%D9%8A%D9%88%D9%86&redirects=&prop=imageinfo&iiprop=mediatype&continue=&format=xml
Here images is used as a generator that returns a list of files, and that list is in turn fed to the imageinfo call.
For each file, you will get something like this (shown here in JSON form, i.e. with format=json instead of format=xml):
"2014232": {
    "pageid": 2014232,
    "ns": 6,
    "title": "\u0645\u0644\u0641:06-Salame-Al Aadm 001.ogg",
    "imagerepository": "local",
    "imageinfo": [
        {
            "mediatype": "AUDIO"
        }
    ]
}
The mediatype can be any of the following (copy-and-paste from the manual):
UNKNOWN // unknown format
BITMAP // some bitmap image or image source (like psd, etc). Can't scale up.
DRAWING // some vector drawing (SVG, WMF, PS, ...) or image source (oo-draw, etc). Can scale up.
AUDIO // simple audio file (ogg, mp3, wav, midi, whatever)
VIDEO // simple video file (ogg, mpg, etc; do not include formats here that may contain executable sections or scripts!)
MULTIMEDIA // Scriptable Multimedia (flash, advanced video container formats, etc)
OFFICE // Office Documents, Spreadsheets (office formats possibly containing applets, scripts, etc)
TEXT // Plain text (possibly containing program code or scripts)
EXECUTABLE // binary executable
ARCHIVE // archive file (zip, tar, etc)
The default mapping of mimetype <=> mediatype is available here, though it's possible to override this for an individual wiki.
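As a rough illustration of the counting step, here is a minimal Python sketch. It assumes the response was requested with format=json and has the usual query/pages layout, with each page entry shaped like the sample above. The function names build_media_query_url and count_media_types are made up for this example, and the sketch does not follow continuation, so a page with many files may need more than one request.

```python
from collections import Counter
from urllib.parse import urlencode


def build_media_query_url(title, host="ar.wikipedia.org"):
    """Build the generator=images + prop=imageinfo query shown above, as JSON."""
    params = {
        "action": "query",
        "generator": "images",
        "titles": title,
        "redirects": "",
        "prop": "imageinfo",
        "iiprop": "mediatype",
        "continue": "",
        "format": "json",
    }
    return "https://{}/w/api.php?{}".format(host, urlencode(params))


def count_media_types(response):
    """Count files per mediatype in an already-decoded JSON response (a dict).

    Assumption: the entries live under response["query"]["pages"], keyed by
    page id, and each entry may carry an "imageinfo" list.
    """
    counts = Counter()
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        for info in page.get("imageinfo", []):
            counts[info.get("mediatype", "UNKNOWN")] += 1
    return counts
```

After decoding the response body with json.loads, counts["AUDIO"] and counts["VIDEO"] give the numbers asked for; a mediatype that never occurs simply counts as 0.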

Related

How to display images from an email with QWebView?

I have an email which contains perfectly formatted HTML, with the single exception that images are linked differently: <img width=456 height=384 id="_x0000_i1026" src="cid:X.MA2.1374935634#aol.com" alt="cid:X.MA4.1372453963#aol.com">. The email has other parts, including the image with this content ID. The problem is that I don't know how to point the QWebView at the data (which I have). Is there a way to add the image to its cache?
It's possible but not easy.
Basically you need to:
1- Provide your own QNetworkAccessManager-derived class, overriding createRequest() to catch the links that use the "cid" scheme:
QNetworkReply*
MyManager::createRequest(Operation op,
                         const QNetworkRequest& req,
                         QIODevice* outgoingData) // the "= 0" default belongs in the class declaration only
{
    // Serve cid: URLs ourselves; the reply object must be heap-allocated.
    if (op == GetOperation && req.url().scheme() == "cid")
        return new MyNetworkReply(req.url().path());
    else
        return QNetworkAccessManager::createRequest(op, req, outgoingData);
}
2- Connect it to the webview with:
MyManager* manager = new MyManager;
view->page()->setNetworkAccessManager(manager);
3- Provide an implementation of MyNetworkReply, which inherits from QNetworkReply (itself a QIODevice subclass). This is the complicated part: you need to provide at least readData(), bytesAvailable(), and a constructor that sets up the reply's HTTP headers and launches the actual asynchronous read with QTimer::singleShot().
4- Decode the attachment (probably from base64 if it's a picture) into a QByteArray for your MyNetworkReply::readData() to read from that.
There's a complete example on qt.gitorious.org written by Qt Labs developers in the Qt 4.6 days. They display an internally generated PNG, not an external mail attachment, but the general steps are as described above. See:
http://qt.gitorious.org/qt-labs/graphics-dojo/blobs/master/url-rendering/main.cpp
However, this code has a flaw with Qt 4.8. In the constructor for RendererReply, where it does:
open(ReadOnly|Unbuffered);
this should be:
open(ReadOnly);
Otherwise WebKit never reads the entire data and displays the broken-picture icon.
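Step 4 (finding the attachment that a cid: link refers to and decoding it) is independent of Qt, so it can be sketched separately. The following Python sketch uses the standard email module purely to illustrate the lookup; in the Qt application, the resulting bytes would be what MyNetworkReply::readData() serves. The function name find_cid_payload is hypothetical, and the sketch assumes the raw message bytes are available and that the part carries a Content-ID header matching the link.

```python
from email import message_from_bytes


def find_cid_payload(raw_message, cid):
    """Return the decoded bytes of the MIME part whose Content-ID matches cid.

    cid is the text after "cid:" in the <img src>. Content-ID headers are
    normally wrapped in angle brackets, so both sides are stripped of them.
    Returns None when no part matches.
    """
    wanted = cid.strip().strip("<>")
    msg = message_from_bytes(raw_message)
    for part in msg.walk():
        content_id = part.get("Content-ID")
        if content_id and content_id.strip().strip("<>") == wanted:
            # decode=True undoes the Content-Transfer-Encoding (e.g. base64)
            return part.get_payload(decode=True)
    return None
```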

How to add SoundCloud files to jPlayer?

Right now I have a website that uses jPlayer to stream MP3s. I also want to let the player stream tracks directly from SoundCloud.
I know this is possible, because one of my favorite music blogs, hillydilly, does this. From looking at their code, I see that their jPlayer setup has a few extra arguments, namely sc and sclink.
$(".play-music").each(function(){
    myPlaylist.add({
        title: $(this).attr('data-title'),
        artist: $(this).attr('data-artist'),
        mp3: $(this).attr('data-mp3'),
        url: $(this).attr('data-url'),
        sc: $(this).attr('data-sc'),
        sclink: $(this).attr('data-sclink')
    });
});
I tried looking through the rest of their code, but can't figure out how or where they implement sc and sclink. Any help?
If you look at their playlist, they're linking to SoundCloud for the mp3 property of each track:
myPlaylist.setPlaylist([
    {
        title:"Close Enough ft. Noosa",
        artist:"Ghost Beach",
        mp3:"http://api.soundcloud.com/tracks/79031167/stream?client_id=db10c5086fe237d1718f7a5184f33b51",
        url:"http://hillydilly.com/2012/12/top-20-songs/",
        sc:"true"
    },
    {
        title:"Always",
        artist:"Jahan Lennon",
        mp3:"http://api.soundcloud.com/tracks/80961874/stream?client_id=db10c5086fe237d1718f7a5184f33b51",
        url:"http://hillydilly.com/2012/12/top-20-songs/",
        sc:"true"
    }
]);
HTML5 'streams' are really just MP3s; you currently can't protect them the way you can with Flash, Silverlight, QuickTime, etc. If you open one of those links directly (like http://api.soundcloud.com/tracks/79031167/stream?client_id=db10c5086fe237d1718f7a5184f33b51), it'll download the MP3. So I'm guessing that if you set it up the same way, it should just work.
If you open Chrome's Network inspector (in dev tools: View > Developer > Developer Tools, then click Network) and click next on Hilly's player, you can see the track loading in the background.
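If the playlist is generated on the server, the mp3 links can be built from a track ID and a client ID. Below is a minimal Python sketch that only reproduces the URL pattern visible in the playlist above; the function names and the "YOUR_CLIENT_ID" placeholder are made up for this example, and whether the endpoint keeps working this way is up to SoundCloud.

```python
from urllib.parse import urlencode


def soundcloud_stream_url(track_id, client_id):
    """Build a stream URL following the pattern seen in the playlist above."""
    query = urlencode({"client_id": client_id})
    return "http://api.soundcloud.com/tracks/{}/stream?{}".format(track_id, query)


def playlist_entry(title, artist, track_id, client_id, page_url):
    """Return a dict with the same keys as the jPlayer playlist items above."""
    return {
        "title": title,
        "artist": artist,
        "mp3": soundcloud_stream_url(track_id, client_id),
        "url": page_url,
        "sc": "true",
    }
```

Serializing a list of such entries with json.dumps gives an array that can be passed to myPlaylist.setPlaylist(), for example playlist_entry("Always", "Jahan Lennon", 80961874, "YOUR_CLIENT_ID", "http://hillydilly.com/2012/12/top-20-songs/").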
Actually kreek's answer is only half the story.
If you want to stream SoundCloud songs, you first need to register your website on the developer page, developers.soundcloud.com.
Then use their super cool API with your own key. You can either load the JSON in your PHP and generate the links there, or load the information with jQuery's $.getJSON.
When streaming sounds from SoundCloud, you should also make sure to attribute them properly. The people at SoundCloud are very generous to let us use their database like that.
//sorry for the bad links (no credits)
How do I use external JSON...?
http://api.jquery.com/jQuery.getJSON/
http://developers.soundcloud.com/docs/api/buttons-logos

Embed DWG file in HTML

I want to ask how to embed a DWG file in an HTML page.
I have tried using a tag with Volo Viewer, but this solution runs only in IE, not in Firefox or Chrome.
Dwgview-x can do that, but it will need to be installed as a plug-in on client computers so that anyone can view the dwg file that you embed online.
There may be third-party ActiveX controls that you could use, but I think you will ultimately find that it's not practical for drawing files of even average complexity. I recommend creating DWF files (if you need a vector format) or PNG files on demand (using e.g. the free DWG TrueView from http://usa.autodesk.com/design-review/ ) and embedding those instead.
I use DWG Browser. It's a standalone program for reporting on and categorizing drawings with previews. It can also export to HTML.
They have a free demo download available.
http://www.graytechnical.com/software/dwg-browser/
You'll find what I think is the latest information on Autodesk's labs site here: http://labs.blogs.com/its_alive_in_the_lab/2014/01/share-your-autodesk-360-designs-on-company-web-sites.html
It looks like a DWG can be embedded (there is an example on this page), but clearly DWF is the way to go.
You can embed DWG file's content in an HTML page by rendering the file's pages as HTML pages or images. If you find it an attractive solution then you can do it using GroupDocs.Viewer API that allows you to render the document pages as HTML pages, images, or a PDF document as a whole. You can then include the rendered HTML/image pages or whole PDF document in your HTML page.
Using C#
ViewerConfig config = new ViewerConfig();
config.StoragePath = "D:\\storage\\";
// Create HTML handler (or ViewerImageHandler for rendering document as image)
ViewerHtmlHandler htmlHandler = new ViewerHtmlHandler(config);
// The guid is the unique document name
string guid = "sample.dwg";
// Get document pages in html form
List<PageHtml> pages = htmlHandler.GetPages(guid);
// Or Get document pages in image form using image handler
//List<PageImage> pages = imageHandler.GetPages(guid);
foreach (PageHtml page in pages)
{
// Get HTML content of each page using page.HtmlContent
}
Using Java
// Setup GroupDocs.Viewer config
ViewerConfig config = new ViewerConfig();
// Set storage path
config.setStoragePath("D:\\storage\\");
// Create HTML handler (or ViewerImageHandler for rendering document as image)
ViewerHtmlHandler htmlHandler = new ViewerHtmlHandler(config);
String guid = "Sample.dwg";
// Get document pages in HTML form
List<PageHtml> pages = htmlHandler.getPages(guid);
for (PageHtml page : pages) {
// Get HTML content of each page using page.getHtmlContent
}
Disclosure: I work as a Developer Evangelist at GroupDocs.

HTML5 audio from mongodb GridFS

I have a MongoDB database with audio files stored in GridFS. HTML5 audio tag works with a link to a method that gets audio from MongoDB:
$file = $grid->findOne(array('_id' => new MongoId($id)));
header('Content-Length: ' . $file->file['length']);
header('Content-Type: ' . $file->file['file_type']);
header('Content-Disposition: inline; filename="' . $file->file['filename'] . '"');
echo $file->getBytes();
All is good except for one thing: I can't use the seek bar to skip through the audio; it only plays from start to end.
Try adding an Accept-Ranges: bytes header. From http://html5doctor.com/html5-audio-the-state-of-play/:
Most audio-capable browsers enable seeking to new file positions
during a download. To allow this, you must enable range requests on
your server. Although enabled by default on web servers such as
Apache, you can verify by checking that your server responds with the
Accept-Ranges header.
Also, an X-Content-Duration: length_in_seconds header may help if the files are in Ogg format. From https://developer.mozilla.org/en-US/docs/Configuring_servers_for_Ogg_media:
The Ogg format doesn't encapsulate the duration of media, so for the
progress bar on the video controls to display the duration of the
video, Gecko needs to determine the length of the media using other
means.
There are two ways Gecko can do this. The best way is to offer an
X-Content-Duration header when serving Ogg media files. This header
provides the duration of the video in seconds (not in HH:MM:SS format)
as a floating-point value.
Both of these headers help the browser to determine the audio's duration before the file is fully downloaded so that seeking is possible, and the playhead can be positioned properly.
In order to support seeking, I expect that your script needs to handle range requests as well. Could you also supply your (sample) HTML page? Then I can experiment a little to see if I can come up with a better answer.
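To make "handling ranges" concrete, here is a minimal, framework-independent Python sketch of the logic such a script needs: parse a single bytes= Range header, then answer with 206 and a Content-Range header. The function names are made up for this example. The sketch ignores multi-range requests and falls back to a full 200 response for anything it cannot parse or satisfy, where a stricter server would answer 416. The same logic would have to be ported to the PHP script that serves the GridFS file.

```python
def parse_byte_range(header, size):
    """Return (start, end), inclusive, for a single 'bytes=' range, else None."""
    if not header or not header.startswith("bytes=") or size <= 0:
        return None
    spec = header[len("bytes="):]
    if "," in spec:  # several ranges: not handled in this sketch
        return None
    first, sep, last = spec.partition("-")
    if not sep:
        return None
    try:
        if first == "":
            # Suffix form "bytes=-N": the last N bytes of the file
            length = int(last)
            if length <= 0:
                return None
            start, end = max(size - length, 0), size - 1
        else:
            start = int(first)
            end = int(last) if last else size - 1
    except ValueError:
        return None
    if start >= size or start > end:
        return None
    return start, min(end, size - 1)


def range_response(data, range_header):
    """Return (status, headers, body) for a request with an optional Range header."""
    size = len(data)
    headers = {"Accept-Ranges": "bytes"}
    rng = parse_byte_range(range_header, size)
    if rng is None:
        headers["Content-Length"] = str(size)
        return 200, headers, data
    start, end = rng
    body = data[start:end + 1]
    headers["Content-Range"] = "bytes {}-{}/{}".format(start, end, size)
    headers["Content-Length"] = str(len(body))
    return 206, headers, body
```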

I have a public URL for an iCloud file, how can I get the DIRECT link to download in iOS?

I am able to generate public URLs for iCloud files. e.g. https://www.icloud.com/documents/dl/?p=3&t=BAKsXkcDP-p8sdTS8NgBLWRQxE281oe4hogA
Accessing such a URL from a browser, I see a landing page, and shortly afterwards the file downloads automatically. Fine.
However, I want to be able to download this file from my iOS app (with NSURLConnection). How can I do this? Maybe...
a) process the html headers to somehow determine the direct URL?
b) intercept the redirect/refresh that triggers the download on a browser?
c) somehow imitate a browser in order to trigger a download?
Thanks
PS. please give me the idiot's answer- I'm clueless about html etc.
Here is the html response I'm getting for the indirect URL above:
var SC_benchmarkPreloadEvents={headStart:new Date().getTime()}; -->iCloud - Loading ...window.SC=window.SC||{MODULE_INFO:{},LAZY_INSTANTIATION:{}};SC.buildMode="production";
SC.buildNumber="1FCS22.32292";SC.buildLocale="en-us";String.preferredLanguage="en-us";window.SC=window.SC||{MODULE_INFO:{},LAZY_INSTANTIATION:{}};SC._detectBrowser=function(userAgent,language){var version,webkitVersion,browser={};
userAgent=(userAgent||navigator.userAgent).toLowerCase();language=language||navigator.language||navigator.browserLanguage;
version=browser.version=(userAgent.match(/.*(?:rv|chrome|webkit|opera|ie)/: ([ );]|$)/)||[])[1];
webkitVersion=(userAgent.match(/webkit/(.+?) /)||[])[1];browser.windows=browser.isWindows=!!/windows/.test(userAgent);
browser.mac=browser.isMac=!!/macintosh/.test(userAgent)||(/mac os x/.test(userAgent)&&!/like mac os x/.test(userAgent));
browser.lion=browser.isLion=!!(/mac os x 10_7/.test(userAgent)&&!/like mac os x 10_7/.test(userAgent));
browser.iPhone=browser.isiPhone=!!/iphone/.test(userAgent);browser.iPod=browser.isiPod=!!/ipod/.test(userAgent);
browser.iPad=browser.isiPad=!!/ipad/.test(userAgent);browser.iOS=browser.isiOS=browser.iPhone||browser.iPod||browser.iPad;
browser.android=browser.isAndroid=!!/android/.test(userAgent);browser.opera=/opera/.test(userAgent)?version:0;
browser.isOpera=!!browser.opera;browser.msie=/msie/.test(userAgent)&&!browser.opera?version:0;
browser.isIE=!!browser.msie;browser.isIE8OrLower=!!(browser.msie&&parseInt(browser.msie,10)<=8);
browser.mozilla=/mozilla/.test(userAgent)&&!/(compatible|webkit|msie)/.test(userAgent)?version:0;
browser.isMozilla=!!browser.mozilla;browser.webkit=/webkit/.test(userAgent)?webkitVersion:0;
browser.isWebkit=!!browser.webkit;browser.chrome=/chrome/.test(userAgent)?version:0;
browser.isChrome=!!browser.chrome;browser.mobileSafari=/apple.*mobile/.test(userAgent)&&browser.iOS?webkitVersion:0;
browser.isMobileSafari=!!browser.mobileSafari;browser.iPadSafari=browser.iPad&&browser.isMobileSafari?webkitVersion:0;
browser.isiPadSafari=!!browser.iPadSafari;browser.iPhoneSafari=browser.iPhone&&browser.isMobileSafari?webkitVersion:0;
browser.isiPhoneSafari=!!browser.iphoneSafari;browser.iPodSafari=browser.iPod&&browser.isMobileSafari?webkitVersion:0;
browser.isiPodSafari=!!browser.iPodSafari;browser.isiOSHomeScreen=browser.isMobileSafari&&!/apple.*mobile.*safari/.test(userAgent);
browser.safari=browser.webkit&&!browser.chrome&&!browser.iOS&&!browser.android?webkitVersion:0;
browser.isSafari=!!browser.safari;browser.language=language.split("-",1)[0];browser.current=browser.msie?"msie":browser.mozilla?"mozilla":browser.chrome?"chrome":browser.safari?"safari":browser.opera?"opera":browser.mobileSafari?"mobile-safari":browser.android?"android":"unknown";
return browser};SC.browser=SC._detectBrowser();if(typeof SC_benchmarkPreloadEvents!=="undefined"){SC.benchmarkPreloadEvents=SC_benchmarkPreloadEvents;
SC_benchmarkPreloadEvents=undefined}else{SC.benchmarkPreloadEvents={headStart:new Date().getTime()}
}SC.setupBodyClassNames=function(){var el=document.body;if(!el){return}var browser,platform,shadows,borderRad,classNames,style;
browser=SC.browser.current;platform=SC.browser.windows?"windows":SC.browser.mac?"mac":"other-platform";
style=document.documentElement.style;shadows=(style.MozBoxShadow!==undefined)||(style.webkitBoxShadow!==undefined)||(style.oBoxShadow!==undefined)||(style.boxShadow!==undefined);
borderRad=(style.MozBorderRadius!==undefined)||(style.webkitBorderRadius!==undefined)||(style.oBorderRadius!==undefined)||(style.borderRadius!==undefined);
classNames=el.className?el.className.split(" "):[];if(shadows){classNames.push("box-shadow")
}if(borderRad){classNames.push("border-rad")}classNames.push(browser);if(browser==="chrome"){classNames.push("safari")
}classNames.push(platform);var ieVersion=parseInt(SC.browser.msie,10);if(ieVersion){if(ieVersion===7){classNames.push("ie7")
}else{if(ieVersion===8){classNames.push("ie8")}else{if(ieVersion===9){classNames.push("ie9")
}}}}if(SC.browser.mobileSafari){classNames.push("mobile-safari")}if("createTouch" in document){classNames.push("touch")
}el.className=classNames.join(" ")};(function(){var styles=[];if(window.devicePixelRatio==2||window.location.search.indexOf("2x")>-1){styles=["/applications/documents/download/en-us/1FCS22.32292/stylesheet#2x-packed.css"];
SC.APP_IMAGE_ASSETS=["/applications/documents/sproutcore/desktop/en-us/1FCS22.32292/stylesheet-no-repeat#2x.png","/applications/documents/coreweb/views/en-us/1FCS22.32292/stylesheet-no-repeat#2x.png","/applications/documents/sproutcore/ace/en-us/1FCS22.32292/stylesheet-no-repeat#2x.png","/applications/documents/sproutcore/ace/en-us/1FCS22.32292/stylesheet-repeat-x#2x.png","/applications/documents/sproutcore/ace/en-us/1FCS22.32292/stylesheet-repeat-y#2x.png","/applications/documents/download/en-us/1FCS22.32292/stylesheet-no-repeat#2x.png","/applications/documents/download/en-us/1FCS22.32292/stylesheet-repeat-x#2x.png"]
}else{styles=["/applications/documents/download/en-us/1FCS22.32292/stylesheet-packed.css"];
SC.APP_IMAGE_ASSETS=["/applications/documents/sproutcore/desktop/en-us/1FCS22.32292/stylesheet-no-repeat.png","/applications/documents/coreweb/views/en-us/1FCS22.32292/stylesheet-no-repeat.png","/applications/documents/sproutcore/ace/en-us/1FCS22.32292/stylesheet-no-repeat.png","/applications/documents/sproutcore/ace/en-us/1FCS22.32292/stylesheet-repeat-x.png","/applications/documents/sproutcore/ace/en-us/1FCS22.32292/stylesheet-repeat-y.png","/applications/documents/download/en-us/1FCS22.32292/stylesheet-no-repeat.png","/applications/documents/download/en-us/1FCS22.32292/stylesheet-repeat-x.png"]
}var head=document.getElementsByTagName("head")[0],len=styles.length,idx,css;for(idx=0;
idxSC.benchmarkPreloadEvents.headEnd=new Date().getTime();SC.benchmarkPreloadEvents.bodyStart=new Date().getTime();if(SC.setupBodyClassNames){SC.setupBodyClassNames()};SC.benchmarkPreloadEvents.bodyEnd=new Date().getTime();
As of July 2012, the following seems to work. But there's no guarantee that Apple won't change their scheme for generating these links, and it's possible that they would regard this as a private API and reject your app. So use it at your own risk.
The URL has two important parameters, p and t. The first seems to identify a server, while the second identifies the actual file. The direct download link is made by plugging these values into this URL:
https://p[p]-ubiquityws.icloud.com/ws/file/[t]
Looking at your example:
https://www.icloud.com/documents/dl/?p=3&t=BAKsXkcDP-p8sdTS8NgBLWRQxE281oe4hogA
p is 3, and t is BAKsXkcDP-p8sdTS8NgBLWRQxE281oe4hogA. So your direct download link would be
https://p3-ubiquityws.icloud.com/ws/file/BAKsXkcDP-p8sdTS8NgBLWRQxE281oe4hogA
Whenever I've published a link to iCloud, p has been 01; so it's possible that you might need to zero-pad your value in which case your URL would be
https://p03-ubiquityws.icloud.com/ws/file/BAKsXkcDP-p8sdTS8NgBLWRQxE281oe4hogA
It would be great to know whether that's necessary.
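The substitution described above is mechanical, so here is a small Python sketch of it, using only the URL pattern given in this answer. The function name and the pad_width parameter are made up for this example; pad_width exists only because it is unclear whether p needs zero-padding. In an iOS app the same steps would be done with the platform's URL and string classes.

```python
from urllib.parse import parse_qs, urlsplit


def icloud_direct_link(public_url, pad_width=0):
    """Plug the p and t query parameters into the direct-download pattern.

    pad_width=2 turns p=3 into "03"; the default leaves p unchanged.
    """
    params = parse_qs(urlsplit(public_url).query)
    p = params["p"][0].zfill(pad_width)
    t = params["t"][0]
    return "https://p{}-ubiquityws.icloud.com/ws/file/{}".format(p, t)
```

Calling it on the example link yields the p3 URL shown above, and passing pad_width=2 yields the p03 variant.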
In iCloud Drive / iOS8 the links are different, but you can still get a direct link to the files.
Original link:
https://www.icloud.com/attachment?u=https%3A%2F%2Fms-eu-ams-103-prod.digitalhub.com%2FB%2FATmkKK8ju8SRwQqDoEFKJzbRsxiuAXQ3PBcJBXw1Qot9jz68TkqjiiNu%2F%24%7Bf%7D%3Fo%3DAtenENR8OcvlNq6JMa331mr-8gCreXxwcfgQ26B5gFKo%26v%3D1%26x%3D3%26a%3DBclucinSeKmFAy2GJg%26e%3D1413787013%26k%3D%24%7Buk%7D%26r%3D567CC38A-FD1B-4DE6-B11B-4166A5669E1B-1%26z%3Dhttps%253A%252F%252Fp03-content.icloud.com%253A443%26s%3DlO5SolOouS9qhYz1oIxKDoGtMpo%26hs%3DovfPXj3b9XXz9lWKChBmyNq_cug&uk=OXDCcLTETbvUcOKdJ-vTdQ&f=Testdatei.vrphoto&sz=1212622
URL decoded to be more readable:
https://www.icloud.com/attachment?u=https://ms-eu-ams-103-prod.digitalhub.com/B/ATmkKK8ju8SRwQqDoEFKJzbRsxiuAXQ3PBcJBXw1Qot9jz68TkqjiiNu/${f}?o=AtenENR8OcvlNq6JMa331mr-8gCreXxwcfgQ26B5gFKo&v=1&x=3&a=BclucinSeKmFAy2GJg&e=1413787013&k=${uk}&r=567CC38A-FD1B-4DE6-B11B-4166A5669E1B-1&z=https%3A%2F%2Fp03-content.icloud.com%3A443&s=lO5SolOouS9qhYz1oIxKDoGtMpo&hs=ovfPXj3b9XXz9lWKChBmyNq_cug&uk=OXDCcLTETbvUcOKdJ-vTdQ&f=Testdatei.vrphoto&sz=1212622
Save the text between '?u=' and '&uk=' as an NSMutableString.
Save the values after 'uk=' and 'f=' as NSStrings.
In the first string, replace the text '${f}' with the 'f=' string and replace the text '${uk}' with the 'uk=' string.
If you need the file's size for any reason, it's the number after 'sz=', but this is not needed for the final link.
Voila, here is your direct link to the file:
https://ms-eu-ams-103-prod.digitalhub.com/B/ATmkKK8ju8SRwQqDoEFKJzbRsxiuAXQ3PBcJBXw1Qot9jz68TkqjiiNu/Testdatei.vrphoto?o=AtenENR8OcvlNq6JMa331mr-8gCreXxwcfgQ26B5gFKo&v=1&x=3&a=BclucinSeKmFAy2GJg&e=1413787013&k=OXDCcLTETbvUcOKdJ-vTdQ&r=567CC38A-FD1B-4DE6-B11B-4166A5669E1B-1&z=https%3A%2F%2Fp03-content.icloud.com%3A443&s=lO5SolOouS9qhYz1oIxKDoGtMpo&hs=ovfPXj3b9XXz9lWKChBmyNq_cug
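The same substitution can be sketched in a few lines of Python, shown here only to illustrate the string handling; an iOS app would do the equivalent with NSString methods. The function name is made up for this example. The sketch assumes the link carries u, uk and f query parameters as in the example, and that the file name needs no extra percent-encoding.

```python
from urllib.parse import parse_qs, urlsplit


def icloud_drive_direct_link(link):
    """Fill the ${f} and ${uk} placeholders of the u parameter.

    parse_qs percent-decodes each value once, which gives the same template
    as the "URL decoded" form shown above.
    """
    params = parse_qs(urlsplit(link).query)
    template = params["u"][0]
    return template.replace("${f}", params["f"][0]).replace("${uk}", params["uk"][0])
```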
It looks like the heavy lifting is done by the file referenced there:
https://www.icloud.com/applications/documents/download/en-us/1FCS22.32292/javascript-packed.js
I'd start there looking for the file name etc.