The right way to use SSML with Web Speech API

The Web Speech API specification says:
text attribute
This attribute specifies the text to be synthesized and
spoken for this utterance. This may be either plain text or a
complete, well-formed SSML document. For speech synthesis engines
that do not support SSML, or only support certain tags, the user
agent or speech engine must strip away the tags they do not support
and speak the text.
It does not provide an example of using text with an SSML document.
I tried the following in Chrome 33:
var msg = new SpeechSynthesisUtterance();
msg.text = '<?xml version="1.0"?>\r\n<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">ABCD</speak>';
speechSynthesis.speak(msg);
It did not work -- the voice attempted to narrate the XML tags. Is this code valid?
Do I have to provide an XMLDocument object instead?
I am trying to understand whether Chrome violates the specification (which should be reported as a bug), or whether my code is invalid.
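For reference, here is a minimal sketch of how such an SSML string could be assembled with the spoken text escaped, so that characters like & or < cannot make the document ill-formed. The helper names escapeXml, buildSsml and speakSsml are made up for illustration; whether a given engine actually honours the markup is a separate question.

```javascript
// Escape the five XML special characters so arbitrary text stays well-formed.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Wrap plain text in a <speak> root element (hypothetical helper).
function buildSsml(text, lang) {
  return '<?xml version="1.0"?>\n' +
    '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="' +
    escapeXml(lang) + '">' + escapeXml(text) + '</speak>';
}

// Speak the SSML if the Web Speech API is available; returns false otherwise.
function speakSsml(text, lang) {
  if (typeof speechSynthesis === 'undefined' ||
      typeof SpeechSynthesisUtterance === 'undefined') {
    return false; // not running in a browser that exposes the API
  }
  const msg = new SpeechSynthesisUtterance(buildSsml(text, lang));
  msg.lang = lang;
  speechSynthesis.speak(msg);
  return true;
}
```

In a browser, speakSsml('ABCD', 'en-US') would hand the same kind of document as above to the synthesizer.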

In Chrome 46 on Windows, with the language set to en, the text is parsed properly as an XML document; however, I see no evidence that the tags actually have any effect. I heard no difference between the <emphasis> and non-<emphasis> versions of this SSML:
var msg = new SpeechSynthesisUtterance();
msg.text = '<?xml version="1.0"?>\r\n<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US"><emphasis>Welcome</emphasis> to the Bird Seed Emporium. Welcome to the Bird Seed Emporium.</speak>';
msg.lang = 'en';
speechSynthesis.speak(msg);
The <phoneme> tag was also completely ignored, which made my attempt to speak IPA fail.
var msg = new SpeechSynthesisUtterance();
msg.text='<?xml version="1.0" encoding="ISO-8859-1"?> <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/10/synthesis http://www.w3.org/TR/speech-synthesis/synthesis.xsd" xml:lang="en-US"> Pavlova is a meringue-based dessert named after the Russian ballerina Anna Pavlova. It is a meringue cake with a crisp crust and soft, light inside, usually topped with fruit and, optionally, whipped cream. The name is pronounced <phoneme alphabet="ipa" ph="pævˈloʊvə">...</phoneme> or <phoneme alphabet="ipa" ph="pɑːvˈloʊvə">...</phoneme>, unlike the name of the dancer, which was <phoneme alphabet="ipa" ph="ˈpɑːvləvə">...</phoneme> </speak>';
msg.lang = 'en';
speechSynthesis.speak(msg);
This is despite the fact that the Microsoft speech API does handle SSML correctly. Here is a C# snippet, suitable for use in LinqPad:
var str = "Pavlova is a meringue-based dessert named after the Russian ballerina Anna Pavlova. It is a meringue cake with a crisp crust and soft, light inside, usually topped with fruit and, optionally, whipped cream. The name is pronounced /pævˈloʊvə/ or /pɑːvˈloʊvə/, unlike the name of the dancer, which was /ˈpɑːvləvə/.";
var regex = new Regex("/([^/]+)/");
if (regex.IsMatch(str))
{
str = regex.Replace(str, "<phoneme alphabet=\"ipa\" ph=\"$1\">word</phoneme>");
str.Dump();
}
SpeechSynthesizer synth = new SpeechSynthesizer();
PromptBuilder pb = new PromptBuilder();
pb.AppendSsmlMarkup(str);
synth.Speak(pb);
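The same substitution can be sketched in JavaScript to prepare the text for the browser tests above. markPhonemes is a hypothetical helper name, and the sketch assumes the IPA between the slashes contains no quotes or other XML special characters.

```javascript
// Turn /ipa/ spans into SSML <phoneme> elements, mirroring the C# regex above.
// The g flag replaces every match, as Regex.Replace does in C#.
function markPhonemes(text) {
  return text.replace(/\/([^\/]+)\//g, '<phoneme alphabet="ipa" ph="$1">word</phoneme>');
}

const sample = 'The name is pronounced /pævˈloʊvə/ or /pɑːvˈloʊvə/.';
console.log(markPhonemes(sample));
```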

There are bugs open with Chromium for this issue:
88072: Extension TTS API platform implementations need to support SSML
428902: speechSynthesis.speak() doesn't strip unrecognized tags (this bug has been fixed in Chrome as of Sept 2016)

I tried this in Chrome 104.0.5112.101 (on Linux) and it didn't work. The debugging console showed this message:
speechSynthesis.speak() without user activation is deprecated and will be removed
Adding a button, as mentioned in "The question of whether speechSynthesis is allowed to run without user interaction", does work for me, at least for speaking plain text; SSML-formatted text is still not handled.
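A rough sketch of the button approach follows. The speakOnClick helper, its env parameter and the #speak selector are assumptions made for illustration; env only exists so the function can also be exercised outside a browser.

```javascript
// Start speech from inside a click handler so it counts as user-activated.
// env is a hypothetical injection point: in a browser, pass window.
function speakOnClick(button, text, env) {
  const synth = env.speechSynthesis;
  const Utterance = env.SpeechSynthesisUtterance;
  button.addEventListener('click', () => {
    synth.cancel();                   // drop anything still queued
    synth.speak(new Utterance(text)); // runs inside the user gesture
  });
}

// Browser-only wiring; skipped where there is no window or no speech support.
if (typeof window !== 'undefined' && 'speechSynthesis' in window) {
  const button = document.querySelector('#speak'); // hypothetical button id
  if (button) {
    speakOnClick(button, 'Welcome to the Bird Seed Emporium.', window);
  }
}
```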

I have tested this: XML parsing seems to work properly on Windows, but it does not work properly on macOS.

Related

Is it possible to make "HTML to speech" same like "Text to speech"?

I have an unusual requirement. My existing app has a Text2Speech feature, for which I use AVSpeechSynthesizer to speak text. The requirement has now changed, and I need to convert the contents of HTML files to speech, something like HTML2Speech.
One solution we can think of:
Use HTML parsing to get all the text from the HTML, and use the same framework for Text2Speech.
But the client doesn't want that type of parsing; he wants an API or framework that provides an HTML2Speech feature directly.
Any suggestion or help will be highly appreciated.

I have worked with both HTML parsing and text-to-speech; you can do this in two steps.
From your client's perspective: any HTML2Speech API on the market may be paid, and you would depend on that API if you used it, while the native framework gives you and the client the same result.
Step 1:
Get an attributed string from the HTML with the code below (works on iOS 7+):
[[NSAttributedString alloc] initWithData:[htmlString dataUsingEncoding:NSUTF8StringEncoding]
options:@{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)}
documentAttributes:nil error:nil];
Then you can pass this attributed string to AVSpeechUtterance.
Step 2:
Use the method below to speak the string extracted from the HTML:
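/**
* "ConvertHTMLtoStrAndPlay" : This method speaks the plain-text content of the HTML-derived attributed string with the synthesizer.
*
* @param aBtnPlayPause : the play/pause button that triggered the action
* @param speechPaused : whether speech is currently paused
* @param aStrWithHTMLAttributes : attributed string created from the HTML
*/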
-(void)ConvertHTMLtoStrAndPlay:(UIButton*)aBtnPlayPause
isSpeechPaused:(BOOL)speechPaused
stringWithHTMLAttributes:(NSAttributedString*)aStrWithHTMLAttributes
{
if (synthesizer.speaking == NO && speechPaused == NO) {
AVSpeechUtterance *utterance = [[AVSpeechUtterance alloc] initWithString:aStrWithHTMLAttributes.string];
//utterance.rate = AVSpeechUtteranceMinimumSpeechRate;
if (IS_ARABIC) {
utterance.voice = [AVSpeechSynthesisVoice voiceWithLanguage:@"ar-au"];
}else{
utterance.voice = [AVSpeechSynthesisVoice voiceWithLanguage:@"en-au"];
}
[synthesizer speakUtterance:utterance];
}
else{
[synthesizer pauseSpeakingAtBoundary:AVSpeechBoundaryImmediate];
}
if (speechPaused == NO) {
[synthesizer continueSpeaking];
} else {
[synthesizer pauseSpeakingAtBoundary:AVSpeechBoundaryImmediate];
}
}
When you need to stop, use the code below to stop speech.
/**
* "StopPlayWithAVSpeechSynthesizer" : this method will stop the playing of audio on the application.
*/
-(void)StopPlayWithAVSpeechSynthesizer{
// Stop speaking immediately instead of waiting for the end of the current word.
[synthesizer stopSpeakingAtBoundary:AVSpeechBoundaryImmediate];
}
Hope this helps you build the HTML2Speech feature.

There are two parts to a solution here...
Presumably you don't care about the formatting in the HTML--after all, by the time it gets to the speech synthesizer, this text is to be spoken, not viewed. AVSpeechSynthesizer takes plain text, so you just need to get rid of the HTML markup. One easy way to do that is to create an NSAttributedString from the HTML, then ask that attributed string for its underlying plain-text string to pass text to the synthesizer.
In iOS 10 you don't even have to extract the string from an attributed string; you can pass an attributed string directly to AVSpeechUtterance.

One way or another, the HTML will always have to be parsed into something else, unless you want the raw files read aloud. If the client wants a direct HTML2Speech solution, you can provide a method that takes an HTML file as an argument and reads it aloud. What happens to the file under the hood should not bother the client much, as long as it's clean and doesn't cause problems.
What happens when the client asks for Markdown2Speech or XML2Speech? From what I see in your description, it is better for now to have one framework with two public methods, Text2Speech and HTML2Speech, that take a link to a file or an NSString as an argument.
So, as @rickster suggests, it can be an NSAttributedString or an NSString. There are a lot of parsers out there, or, if you want your own solution, you can remove everything between < and > and change the encoding.
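To illustrate the "remove everything between < and >" idea, here is a rough sketch in JavaScript (the language used for the browser snippets elsewhere on this page); the same steps carry over to other languages. htmlToPlainText is a made-up name, only a handful of entities are decoded, and a real HTML parser is more robust than regular expressions.

```javascript
// Crude HTML-to-text conversion: a sketch, not a substitute for a real parser.
function htmlToPlainText(html) {
  const entities = { amp: '&', lt: '<', gt: '>', quot: '"', apos: "'", nbsp: ' ' };
  return html
    // Drop script and style elements, whose content should never be spoken.
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, ' ')
    // Drop every remaining tag.
    .replace(/<[^>]*>/g, ' ')
    // Decode a few common entities (single pass, so "&amp;lt;" becomes "&lt;").
    .replace(/&(amp|lt|gt|quot|apos|nbsp);/g, (match, name) => entities[name])
    // Collapse the whitespace left behind by the removed markup.
    .replace(/\s+/g, ' ')
    .trim();
}

console.log(htmlToPlainText('<p>Hello <b>world</b> &amp; friends</p>'));
```

The resulting plain string is what would then be handed to the text-to-speech API.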

The safest method is to extract the text and use the existing text2speech API.
If you are sure that the browser will be Chrome, then the Speech Synthesis API may be helpful. But this API is still not fully adopted by all browsers, so it would be a risky solution.
You can find the necessary information about this API at
https://developers.google.com/web/updates/2014/01/Web-apps-that-talk-Introduction-to-the-Speech-Synthesis-API?hl=en
https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html#examples-synthesis
https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API
There is no direct API for HTML to speech except the Speech Synthesis API mentioned above. You can also try http://responsivevoice.org/, but I think it is also based on the browser's Speech Synthesis or on speech generation at the server. So to use it, you would still have to extract the text and pass it to the API to get the speech.

Non-english characters are not displayed correctly in IntelliJ 14

I have this code:
private static function askFromUser(cardId:uint):void {
var s:String = "У клиента " + cardId + " произошло задвоение данных.";
trace(s);
}
It shows:
[trace] � ������� 436 ��������� ��������� ������.
What is the problem?

Your IntelliJ probably does not have the correct File Encoding for that file. The encoding IntelliJ is using should be displayed in the Status Bar near the lower right-hand corner of your IntelliJ window. If you can't find it, this page goes into great detail about changing file encodings in IntelliJ. If you don't know what encoding your file is in, that can be difficult to determine.
I copied and pasted your code into a UTF-8 encoded file in my own IntelliJ and it displayed fine.
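To show how an encoding mismatch can produce output like the above, here is a small sketch in JavaScript. It assumes, purely for illustration, that the bytes are windows-1251 and are being read as UTF-8; the actual encodings involved in the question are not known. encodeCp1251 is a hypothetical helper that only covers ASCII and the basic Cyrillic letters.

```javascript
// Encode text as windows-1251 bytes. Only ASCII and U+0410..U+044F are handled,
// which is enough for this demonstration.
function encodeCp1251(text) {
  const bytes = [];
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp < 0x80) {
      bytes.push(cp);
    } else if (cp >= 0x0410 && cp <= 0x044f) {
      bytes.push(cp - 0x0410 + 0xc0); // the Cyrillic letters occupy 0xC0..0xFF
    } else {
      throw new RangeError('unsupported character: ' + ch);
    }
  }
  return Uint8Array.from(bytes);
}

// Reading those bytes as UTF-8 turns every Cyrillic letter into U+FFFD.
const cp1251Bytes = encodeCp1251('У клиента');
const misread = new TextDecoder('utf-8').decode(cp1251Bytes);
console.log(misread);
```

Each Cyrillic byte is an invalid or incomplete UTF-8 sequence on its own, so the decoder substitutes the replacement character while spaces and digits survive, which matches the shape of the trace output in the question.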

I don't have experience with IntelliJ IDEA, but I have tried embedding different fonts in a Flash project. There's a blog post detailing my experience with font embedding; maybe that will help you.

Is there a built in way to parse an incomplete xml-text in Javascript in the latest Firefox?

I have read the answers so far saying that DOMParser can't handle incomplete (and therefore not well-formed) XML data.
Since all major browsers can handle faulty HTML source, I wonder whether a workaround could get the browser to interpret the not well-formed XML data.
For example, by putting a manually written DOCTYPE declaration with an ATTLIST at the start of the data, then telling the browser to interpret it in a hidden frame, and then using the resulting DOM tree?
Is there a built-in way to parse an incomplete XML text in JavaScript in the latest Firefox?
or
What would the DOCTYPE and ATTLIST have to look like if the data contains unknown tags like <mytag> with attributes like nr="..." and date="...", and <anothertag> with attributes like nr="..." and upto="..."?

Just use a DOMParser.
function parseFragment(fragment) {
    var parser = new DOMParser(),
        doc = parser.parseFromString(fragment, "text/html");
    return doc.getElementsByTagName("body")[0];
}
and
var root = parseFragment('<foo><bar some="thing"><baz></bar>');
console.log(root.getElementsByTagName("bar")[0].getAttribute("some"));
// -> "thing"

How to retain the original html format when extracting content from web pages with boilerpipe?

I can extract the title and content (split into paragraphs) from web pages in my Android application, but fetching images sometimes fails.
However, I could not find a way to retain the HTML formatting (e.g. bold, hyperlinks, underline, or font size) in the extractor.
That is, if a sentence in the web page is bold, hyperlinked, or underlined, how can I extract BOTH the sentence itself and its formatting?
I tried this page: An article, both with the Web API and with the APIs in the local jar.
I would like to get the same result with the local APIs as with the Web API.
Could anyone share their experience with this issue?
Many thanks,
James
Edit #1
Here is the code:
signalUpdate(STATE.Start);
//
htmlDoc = HTMLFetcher.fetch(new URL(url));
//
doc = new BoilerpipeSAXInput(htmlDoc.toInputSource()).getTextDocument();
extraction.setTitle(doc.getTitle()); // obtaining title
ArticleExtractor.INSTANCE.process(doc); // obtaining content
SplitParagraphBlocksFilter.INSTANCE.process(doc);
contentBuilder.setLength(0);
for(TextBlock block : doc.getTextBlocks()) {
blockString = "<p>" + block.getText() + "</p>";
contentBuilder.append(blockString);
}
extraction.setContent(contentBuilder.toString());
// obtaining image
extractor = CommonExtractors.ARTICLE_EXTRACTOR;
ie = ImageExtractor.INSTANCE;
imgUrls = ie.process(new URL(url), extractor);
extraction.setImgUrls(imgUrls);
//
signalUpdate(STATE.Complete);
Actually, what I mean by "fail" is:
I could fetch images from some web sites. However, I could not get the image in the article mentioned above.

Using iText to convert HTML to PDF

Does anyone know if it is possible to convert an HTML page (URL) to a PDF using iText?
If the answer is 'no' then that is OK as well, since I will stop wasting my time trying to work it out and just spend some money on one of a number of components which I know can :)

I think this is exactly what you were looking for
http://today.java.net/pub/a/today/2007/06/26/generating-pdfs-with-flying-saucer-and-itext.html
http://code.google.com/p/flying-saucer
Flying Saucer's primary purpose is to render spec-compliant XHTML and CSS 2.1 to the screen as a Swing component. Though it was originally intended for embedding markup into desktop applications (things like the iTunes Music Store), Flying Saucer has been extended to work with iText as well. This makes it very easy to render XHTML to PDFs, as well as to images and to the screen. Flying Saucer requires Java 1.4 or higher.

I have ended up using ABCPdf from webSupergoo.
It works really well and for about $350 it has saved me hours and hours based on your comments above.

The easiest way of doing this is using pdfHTML.
It's an iText7 add-on that converts HTML5 (+CSS3) into pdf syntax.
The code is pretty straightforward:
HtmlConverter.convertToPdf(
"<b>This text should be written in bold.</b>", // html to be converted
new PdfWriter(
new File("C://users/mark/documents/output.pdf") // destination file
)
);
To learn more, go to http://itextpdf.com/itext7/pdfHTML

The answer to your question is actually two-fold. First of all you need to specify what you intend to do with the rendered HTML: save it to a new PDF file, or use it within another rendering context (i.e. add it to some other document you are generating).
The former is relatively easily accomplished using the Flying Saucer framework, which can be found here: https://github.com/flyingsaucerproject/flyingsaucer
The latter is actually a much more comprehensive problem that needs to be categorized further.
Using iText you won't be able to (trivially, at least) combine iText elements (i.e. Paragraph, Phrase, Chunk and so on) with the generated HTML. You can hack your way out of this by using the ContentByte's addTemplate method and generating the HTML to this template.
If you on the other hand want to stamp the generated HTML with something like watermarks, dates or the like, you can do this using iText.
So bottom line: You can't trivially integrate the rendered HTML in other pdf generating contexts, but you can render HTML directly to a blank PDF document.

Use the iText library.
Here is sample code that works well for me:
String htmlFilePath = filePath + ".html";
String pdfFilePath = filePath + ".pdf";
// create an html file on given file path
Writer unicodeFileWriter = new OutputStreamWriter(new FileOutputStream(htmlFilePath), "UTF-8");
unicodeFileWriter.write(document.toString());
unicodeFileWriter.close();
ConverterProperties properties = new ConverterProperties();
properties.setCharset("UTF-8");
if (url.contains(".kr") || url.contains(".tw") || url.contains(".cn") || url.contains(".jp")) {
properties.setFontProvider(new DefaultFontProvider(false, false, true));
}
// convert the html file to pdf file.
HtmlConverter.convertToPdf(new File(htmlFilePath), new File(pdfFilePath), properties);
Maven dependencies
<dependency>
<groupId>com.itextpdf</groupId>
<artifactId>itext7-core</artifactId>
<version>7.1.6</version>
<type>pom</type>
</dependency>
<dependency>
<groupId>com.itextpdf</groupId>
<artifactId>html2pdf</artifactId>
<version>2.1.3</version>
</dependency>

Use iText's HTMLWorker
Example

When I needed HTML to PDF conversion earlier this year, I tried the trial of Winnovative HTML to PDF converter (I think ExpertPDF is the same product, too). It worked great, so we bought a license from that company. I didn't look into it in much depth after that.
