Special Characters (German Umlauts) in Blazor manifest.json? - json

so I want to create a Microsoft Teams App using Visual Studio 2019, Teams Toolkit and Blazor, and I'm having a hard time getting Unicode Characters (German Umlaute ä, ö, and ü) to show up in my manifest.json - or rather in the Teams App Description page. I'm also pretty new at developing with Blazor and JSON.
I've tried the HTML-style &ouml; but this just gets passed right through.
I've tried the "\u00f6" but then it just shows up as "?".
How do I get Unicode characters into my manifest? Is there anything I'm missing? Do I have to switch to a different encoding? Where do I even see which encoding is being used?
manifest.json:
{
"$schema": "https://developer.microsoft.com/en-us/json-schemas/teams/v1.9/MicrosoftTeams.schema.json",
"manifestVersion": "1.9",
"version": "1.0.0",
"localizationInfo": {
"defaultLanguageTag": "de"
},
"developer": {
"name": "Römer R\u00f6mer",
...
is displayed as:
Any suggestions of what I'm missing?
EDIT
So as a few answers have suggested, I've tried saving the manifest.json in a different encoding (ANSI, UTF-8) but nothing works. It seems to me that Microsoft Teams is somehow not interpreting the manifest correctly. Which is weird, because the Description Page of some other apps include Umlauts and they are displayed correctly.

You can use Notepad++ to check and also change the encoding of a text file.
https://notepad-plus-plus.org/downloads/
As a developer from Austria who also has to fight with umlauts, I would recommend changing the encoding of the file.
You can change the encoding in Notepad++ via Menu bar -> Encoding.
EDIT: But by the way, I think that umlauts and special characters etc. should not be used in manifest files or source code files.

In VS you can use File/Save {your file} As...
then select to change the encoding in the drop down for the Save button
The docs say UTF-8 without BOM
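To rule out the file encoding as the cause, here is a minimal Python sketch (not part of Teams Toolkit; the function name inspect_and_normalize is made up for illustration). It assumes the manifest is valid JSON stored as UTF-8, reports whether a BOM is present, and rewrites the file as UTF-8 without BOM. It will not help if the display problem is on the Teams side.

```python
import codecs
import json
from pathlib import Path


def inspect_and_normalize(path):
    """Report whether the file starts with a UTF-8 BOM, then rewrite it as UTF-8 without BOM."""
    raw = Path(path).read_bytes()
    had_bom = raw.startswith(codecs.BOM_UTF8)
    # "utf-8-sig" decodes UTF-8 and drops a leading BOM if there is one.
    # A UnicodeDecodeError here means the file is not UTF-8 at all (e.g. saved as ANSI).
    text = raw.decode("utf-8-sig")
    data = json.loads(text)  # raises if the file is not valid JSON
    # ensure_ascii=False keeps "ö" as a literal character instead of "\u00f6".
    Path(path).write_text(
        json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return had_bom, data
```

Note that this re-serializes the JSON, so the original indentation and key spacing are not preserved.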

We are able to reproduce the issue. We have raised a bug for it, and the engineering team concerned is working on it.

Related

How to find all this kind of UNICODE � in multiple html pages (or, with notepad++)? [duplicate]

I have a bizarre problem: Somewhere in my HTML/PHP code there's a hidden, invisible character that I can't seem to get rid of. By copying it from Firebug and converting it I identified it as &#65279; or 'Zero width no-break space'. It shows up as a non-empty text node in my website and is causing a serious layout problem.
The problem is, I can't get rid of it. I can't see it in my files even when turning Invisibles on (duh). I can't seem to find it, no search tool seems to pick up on it. I rewrote my code around where it could be, but it seems to be somewhere deeper in one of the framework files.
How can I find characters by charcode across files or something like that? I'm open to different tools, but they have to work on Mac OS X.
You can't see the character because editors that support the byte-order mark (BOM) hide it. U+FEFF is the byte-order mark; U+FFFE is what it looks like when it is read with the wrong byte order. It is defined by the Unicode standard to indicate the order in which multi-byte code units are stored in a file. In UTF-8 it carries no byte-order information and acts only as a signature, which many Windows tools write by default.
To get rid of it, tell your editor to save the file either as ANSI/ISO-8859 or as Unicode without BOM. If your editor can't do so, you'll either have to switch editors (sadly) or use some kind of truncation tool, e.g. a hex editor that lets you see how the file really looks.
From a quick search, it seems that TextWrangler has a "UTF-8, no BOM" mode. Otherwise, if you're comfortable with the terminal, you can use Vim:
:set nobomb
and save the file. Presto!
The characters are always the very first in a text file. Editors with support for the BOM will not, as I mentioned, show it to you at all.
If you are using Textmate and the problem is in a UTF-8 file:
Open the file
File > Re-open with encoding > ISO-8859-1 (Latin1)
You should be able to see and remove the first character in file
File > Save
File > Re-open with encoding > UTF8
File > Save
It works for me every time.
It's a byte-order mark. Under Mac OS X: open terminal window, go to your sources and type:
grep -rn $'\xEF\xBB\xBF' *
It will show you the line numbers and filenames containing BOM.
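If grep is not at hand, the same search can be sketched in Python. This is only an illustration under the assumption that the files are UTF-8 and the BOM sits at the very start of a file; find_bom_files and strip_bom are made-up helper names.

```python
import codecs
import os


def find_bom_files(root):
    """Return sorted paths of files under root whose first bytes are the UTF-8 BOM (EF BB BF)."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    if handle.read(3) == codecs.BOM_UTF8:
                        hits.append(path)
            except OSError:
                continue  # unreadable file: skip it
    return sorted(hits)


def strip_bom(path):
    """Rewrite the file without its leading UTF-8 BOM; return True if one was removed."""
    with open(path, "rb") as handle:
        raw = handle.read()
    if not raw.startswith(codecs.BOM_UTF8):
        return False
    with open(path, "wb") as handle:
        handle.write(raw[len(codecs.BOM_UTF8):])
    return True
```

Running find_bom_files(".") from the source folder lists the candidates; strip_bom can then be applied to each hit after making a backup.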
In Notepad++, there is an option to show all characters. From the top menu:
View -> Show Symbol -> Show All Characters
I'm not a Mac user, but my general advice would be: when all else fails, use a hex editor. Very useful in such cases.
See "Comparison of hex editors" in WikiPedia.
I know it is a little late to answer this question, but I am adding how to change the encoding in Visual Studio; I hope it will be helpful for someone reading this later:
Go to File -> Save (your filename) As...
In the file dialog, click the small arrow next to the Save button -> Save with Encoding...
Click Yes (on the "Do you want to replace existing file" dialog)
Finally, select e.g. Unicode (UTF-8 without signature), which removes the BOM

International characters in website file names

I need to create a website (in PHP) that has filenames that include international characters.
For example: transportører.php (notice the 'o' with the diagonal line through it).
So I happily create the file, save it, and upload it to the web server. Whenever I LINK to this file, however, it all goes wrong. I'll have the usual link syntax:
<a href="transportører.php">My Link Text</a>
Upon clicking such a link, the web browser attempts to navigate to a non-existent page:
The requested URL /transportÃ¸rer.php was not found on this server.
Notice how the filename has been mutated? The "ø" character in "transportører.php" has been changed into the bizarre "Ã¸" symbol (that's not a comma after the "A", by the way, but an actual component of the symbol itself).
There's obviously some sort of translation going on here, but what, why, and how do I prevent it?
I think, it's two possible reasons:
html encoding
Possibly the encoding of the html file is wrong, so the link is actually pointing to a wrong path. Add
<meta charset="UTF-8">
in the head section of your file.
server settings
If the server is resolving the link wrongly (you can check this by typing the address of your norwegian-named.php in the browser and see if it is replaced), you need to know which server you are using and investigate in this direction. For apache, How to change the default encoding to UTF-8 for Apache? looks promising.
As the URL isn’t percent-encoded in the hyperlink, browsers assume¹ UTF-8 for percent-encoding it, where ø becomes %C3%B8.
However, your server seems to expect/use ISO 8859-1 (instead of UTF-8), where ø becomes %F8.
A quick fix would be to link to the ISO 8859-1 percent-encoded URL:
<a href="transport%F8rer.php">transportører</a>
(A better fix would be to let your server use UTF-8 for everything, and then to use the UTF-8 percent-encoded URL in the hyperlink.)
¹ Either by default, or because the linking page seems to use UTF-8 (at least according to the HTTP header Content-Type: text/html; charset=UTF-8).
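The two percent-encodings can be compared with a few lines of Python. This is just an illustration using the standard library's urllib.parse; it says nothing about how a particular server is configured.

```python
from urllib.parse import quote, unquote

name = "transportører.php"

# Percent-encode the same name under both encodings.
utf8_url = quote(name, encoding="utf-8")
latin1_url = quote(name, encoding="iso-8859-1")
print(utf8_url)    # transport%C3%B8rer.php
print(latin1_url)  # transport%F8rer.php

# Reading the UTF-8 bytes as ISO 8859-1 produces the typical garbled form.
misread = unquote(utf8_url, encoding="iso-8859-1")
print(misread)     # transportÃ¸rer.php
```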
Well, this is embarrassing. Everything was - in actual fact - working correctly. The 404 error made the filename LOOK "wrong" - e.g. transportÃ¸rer.php. However, this is actually correct. That is how HTML seems to reference the file "behind the scenes". So to the browser, "transportÃ¸rer.php" is synonymous with "transportører.php".
What was happening was that FileZilla (my FTP client) objects to international characters. It was changing the filename during upload, replacing the international characters with "something else". The filenames LOOKED correct on the screen (when I viewed the website folder with Linux Mint's native FTP client), but the underlying character coding was NOT correct. The web browsers could tell the difference, and hence didn't associate my links with the (mutated) file names, which triggered the 404 error.
The solution in a nutshell: I used Linux Mint native FTP to upload my files, overwriting the ones uploaded by FileZilla, and everything just sprang into life.
Thanks to everyone who offered advice... it was all good stuff, just not the solution in this particular case.

Greek letters - JSON

I have a JSON file with Greek names and letters, e.g.:
{
"playerId":1,
"name":"Τ. Παπαγιάννης",
"position":"Τερματοφύλακας",
"age":"21",
"number":"15",
"photo":"/papagiannis.jpg",
"details":"Ακαδημίες ΑΕΛ"
}
I display this file in the Google Chrome browser (through a local server called WAMP) and it is a mess.
Is there any way to change that?
Thank you.
Theo.
FIXED
I used notepad++ then went to Encoding->Encode to UTF-8. That simple.
You should set the encoding to UTF-8
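To see why the encoding matters, here is a small Python sketch built on an excerpt of the data above. It shows the same object stored as UTF-8, the "mess" a reader gets when those bytes are decoded with the wrong charset (ISO-8859-1 is only an assumed example of a wrong guess), and the pure-ASCII \uXXXX form that sidesteps the file encoding entirely.

```python
import json

player = {"playerId": 1, "name": "Τ. Παπαγιάννης", "position": "Τερματοφύλακας"}

# Greek letters kept as literal characters, stored as UTF-8 bytes.
as_utf8 = json.dumps(player, ensure_ascii=False).encode("utf-8")

# What a client shows if it wrongly assumes ISO-8859-1 for those bytes.
garbled = as_utf8.decode("iso-8859-1")

# Default ensure_ascii=True: every non-ASCII character becomes a \uXXXX escape,
# so the file is plain ASCII and reads the same under any ASCII-compatible encoding.
escaped = json.dumps(player)

print(garbled)
print(escaped)
```

Both as_utf8 (decoded as UTF-8) and escaped parse back to the same object with json.loads.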

HTML encoding issues - "Â" character showing up instead of "&nbsp;"

I've got a legacy app just starting to misbehave, for whatever reason I'm not sure. It generates a bunch of HTML that gets turned into PDF reports by ActivePDF.
The process works like this:
Pull an HTML template from a DB with tokens in it to be replaced (e.g. "~CompanyName~", "~CustomerName~", etc.)
Replace the tokens with real data
Tidy the HTML with a simple regex function that properly formats HTML tag attribute values (ensures quotation marks, etc., since ActivePDF's rendering engine hates anything but single quotes around attribute values)
Send off the HTML to a web service that creates the PDF.
Somewhere in that mess, the non-breaking spaces from the HTML template (the &nbsp;s) are encoding as ISO-8859-1 so that they show up incorrectly as an "Â" character when viewing the document in a browser (Firefox). ActivePDF pukes on these non-UTF8 characters.
My question: since I don't know where the problem stems from and don't have time to investigate it, is there an easy way to re-encode or find-and-replace the bad characters? I've tried sending it through this little function I threw together, but it doesn't change anything.
Private Shared Function ConvertToUTF8(ByVal html As String) As String
Dim isoEncoding As Encoding = Encoding.GetEncoding("iso-8859-1")
Dim source As Byte() = isoEncoding.GetBytes(html)
Return Encoding.UTF8.GetString(Encoding.Convert(isoEncoding, Encoding.UTF8, source))
End Function
Any ideas?
EDIT:
I'm getting by with this for now, though it hardly seems like a good solution:
Private Shared Function ReplaceNonASCIIChars(ByVal html As String) As String
Return Regex.Replace(html, "[^\u0000-\u007F]", " ")
End Function
Somewhere in that mess, the non-breaking spaces from the HTML template (the &nbsp;s) are encoding as ISO-8859-1 so that they show up incorrectly as an "Â" character
That'd be encoding to UTF-8 then, not ISO-8859-1. The non-breaking space character is byte 0xA0 in ISO-8859-1; when encoded to UTF-8 it'd be 0xC2,0xA0, which, if you (incorrectly) view it as ISO-8859-1, comes out as "Â ". That includes a trailing nbsp which you might not be noticing; if that byte isn't there, then something else has mauled your document and we need to see further up to find out what.
What's the regexp, and how does the templating work? There would seem to be a proper HTML parser involved somewhere if your &nbsp; strings are (correctly) being turned into U+00A0 NON-BREAKING SPACE characters. If so, you could just process your template natively in the DOM, and ask it to serialise using the ASCII encoding to keep non-ASCII characters as character references. That would also stop you having to do regex post-processing on the HTML itself, which is always a highly dodgy business.
Well anyway, for now you can add one of the following to your document's <head> and see if that makes it look right in the browser:
for HTML4: <meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
for HTML5: <meta charset="utf-8">
If you've done that, then any remaining problem is ActivePDF's fault.
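The byte-level explanation can be reproduced in a few lines of Python. This is only a sketch of the mechanism, not of the asker's VB.NET pipeline; repair_mojibake and to_ascii_with_references are made-up helper names, and the repair only works when the text really is UTF-8 that was misread as ISO-8859-1.

```python
NBSP = "\u00a0"

# U+00A0 encoded as UTF-8 is the two bytes C2 A0 ...
utf8_bytes = NBSP.encode("utf-8")
# ... and reading those bytes as ISO-8859-1 gives "Â" followed by a nbsp.
misread = utf8_bytes.decode("iso-8859-1")


def repair_mojibake(text):
    """Undo a UTF-8-read-as-ISO-8859-1 mix-up; raises if the text does not fit that pattern."""
    return text.encode("iso-8859-1").decode("utf-8")


def to_ascii_with_references(text):
    """Serialise as pure ASCII, keeping non-ASCII characters as numeric character references."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(utf8_bytes)                             # b'\xc2\xa0'
print(repair_mojibake(misread) == NBSP)       # True
print(to_ascii_with_references("a" + NBSP))   # a&#160;
```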
If any one had the same problem as me and the charset was already correct, simply do this:
Copy all the code inside the .html file.
Open notepad (or any basic text editor) and paste the code.
Go "File -> Save As"
Enter your file name "example.html" (Select "Save as type: All Files (*.*)")
Select Encoding as UTF-8
Hit Save and you can now delete your old .html file and the encoding should be fixed
Problem:
I was facing the same problem: we were sending '£' with some string in a POST request to a CRM system, but when we did the GET call from the CRM, it returned 'Â£' with the string content. So '£' was getting converted to 'Â£'.
Analysis:
After some research we found that in the POST call we had set the HttpWebRequest ContentType to "text/xml", while in the GET call it was "text/xml; charset=utf-8".
Solution:
We included charset=utf-8 in the POST request's ContentType, and it works.
In my case this (a with caret) occurred in code I generated from visual studio using my own tool for generating code. It was easy to solve:
Select single spaces ( ) in the document. You should be able to see lots of single spaces that are looking different from the other single spaces, they are not selected. Select these other single spaces - they are the ones responsible for the unwanted characters in the browser. Go to Find and Replace with single space ( ). Done.
PS: It's easier to see all similar characters when you place the cursor on one or if you select it in VS2017+; I hope other IDEs may have similar features
In my case I was getting a Latin cross sign instead of the nbsp, even though the page was correctly encoded as UTF-8. None of the above helped, and I tried it all.
In the end, changing the font for IE (with browser-specific CSS) helped: I was using Helvetica Neue as the body font, and changing it to Arial resolved the issue.
I was having the same sort of problem. Apparently it's because PHP doesn't assume UTF-8 on its own.
I was tearing my hair out at first when a '£' sign kept showing up as 'Â£', despite it appearing OK in Dreamweaver. Eventually I remembered I had been having problems with links relative to the index file, where pages would work with slideshows if viewed directly but not when pulled in with an include (but that's beside the point). Anyway, I wondered if this might be a similar problem, so instead of putting the charset declaration into the page that I was having problems with, I simply put it into the index.php file, and the problem was fixed throughout.
Here you can check it for all Special Characters in HTML
http://www.degraeve.com/reference/specialcharacters.php
I had this issue on a few of my websites too, and all I needed to do was customize the content filter for HTML entities. Before that, the more of them I deleted, the more I got. So just change your HTML filter or parsing function for the page; that worked for me. In my case it was mainly due to the HTML editors in most CMSs: the way they store and parse the data caused the issue. Maybe this will help in your case too.

mailto special characters

Is there a way to make the email client ( Outlook ) accept special characters coming from the mailto link in html? I'm trying to have a mailto link with german characters in the body, but in Outlook I get only strange characters.
Thanks
I just spent 2 days investigating this issue. Our issue was that mailto: links on our UTF-8 encoded web pages did not work for Outlook users if the subject= string contained non-ASCII characters, such as Norwegian characters. An example is:
"mailto:mail@coretrek.no?subject=julegløgg og fårikål"
From what I have learned so far, Outlook simply does not handle anything other than ASCII and iso-8859-1 characters. So when trying to click on the above mailto link (either from IE or Firefox), Outlook fails to decode the characters, leaving the subject broken and containing "weird" characters.
So the next step was to try to re-encode the pages in ISO-8859-1. What we did was to replace the original mailto link on the utf-8 page with a link to a "email-to-iso"-service, like this:
http://url.com/service.php?service=util.mailtoencode&mailto=mail%40coretrek.no%3Fsubject%3Demne+%C3%B8%C3%A6%C3%A5+emne
This page would convert the mailto characters to iso-8859-1 and then output the entire page content in iso-8859-1. A javascript on the page, containing "location.href='mailto:...'" was used to open the client's email client automatically.
So far everything seemed ok. This actually works in Internet Explorer, both with Thunderbird and Outlook (tested on IE7 on WinXP with Outlook express and TB 2).
BUT the problem now is actually Firefox. It seems like Firefox is unable to decode URL-encoded URLs containing characters found only in ISO-8859-1 but not in ASCII (like the Norwegian å, represented by %E5 when encoded). The same å is handled correctly if the page encoding is UTF-8, but it seems like the Firefox developers have forgotten to test special characters together with the ISO-8859-1 charset.
The result is that Firefox passes an un-decoded string (still containing %E5 instead of å) to the email client. And, amazingly, this is handled correctly by Outlook (which manages to decode the string itself), but NOT by Thunderbird, which probably has the same bug as Firefox. If you DON'T URL-encode the subject, the string is passed correctly to Thunderbird, but not to Outlook.
We have also been trying other encoding methods, like PHP's htmlentities, htmlspecialchars, base64 encoding etc., but all of them fail one way or the other.
So, summarized:
Pages encoded in utf-8:
IE fails always
FF -> Thunderbird: OK
FF -> Outlook: FAIL
Pages encoded in iso-8859-1:
IE: OK
FF -> Thunderbird: Fails if subject is URL-encoded, OK if not
FF -> Outlook: Fails if subject is not URL-encoded, OK if encoded
(This is on Windows; on Ubuntu Linux, FF and TB always work OK.)
Hoping this was helpful for others having the same problem.
In PHP I think the function that works best with Outlook is rawurlencode()
I think using a urlencode method should do what you're looking for. JavaScript has .encodeURI() methods on string objects, and .NET has the HttpUtility.UrlEncode method.
What language are you using?
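As a language-neutral illustration of what such a URL-encoding call produces, here is a Python sketch using urllib.parse.quote, which (like PHP's rawurlencode) encodes spaces as %20. The mailto_link helper and the example address are made up; whether a given mail client decodes the result correctly depends on the client, as the summary above shows.

```python
from urllib.parse import quote


def mailto_link(to, subject="", body="", encoding="utf-8"):
    """Build a mailto: URI with percent-encoded subject and body."""
    params = []
    if subject:
        # safe="" also escapes "/", "&" and "?" so they cannot break the query string.
        params.append("subject=" + quote(subject, safe="", encoding=encoding))
    if body:
        params.append("body=" + quote(body, safe="", encoding=encoding))
    query = "&".join(params)
    return "mailto:" + to + ("?" + query if query else "")


# Hypothetical address; the subject is the Norwegian example from above.
print(mailto_link("someone@example.com", "julegløgg og fårikål"))
# mailto:someone@example.com?subject=julegl%C3%B8gg%20og%20f%C3%A5rik%C3%A5l
print(mailto_link("someone@example.com", "julegløgg og fårikål", encoding="iso-8859-1"))
# mailto:someone@example.com?subject=julegl%F8gg%20og%20f%E5rik%E5l
```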
Actually, the solution is http://blogs.msdn.com/ie/archive/2007/02/12/International-Mailto-URIs-in-IE7.aspx and it is not nice.
Basically, in IE 7 and 8 the user must have enabled an advanced setting in Internet Options, something that practically no user will know about or have enabled.
You need to enable UTF-8 support for the mailto: protocol
From the main outlook window, click Tools -> Options -> mail format -> international options -> "Enable UTF-8 support for mailto: protocol".
rawurlencode() function works best with outlook,
tested with Firefox, Chrome & IE
As yandr indicated, this issue is an ongoing problem with Outlook.
Microsoft has published documentation that states that properly configured Outlook 2003 and 2007 attached to a properly configured Exchange server will default to supporting Unicode, but that doesn't really help you with the general public.
For reference, the "standard" you want to refer to for this is RFC 2047.
The solution that I have implemented to get around this limitation (with Swedish, actually) is to use a web form instead of a mailto: link. It requires more setup on the server side, but gives you a lot more control over the contact process.
I'm sure this isn't what you wanted to hear, but until the world stops using broken software from Microsoft, we'll continue to need workarounds like this.
It sounds like you need the page containing the mailto link to be in the encoding that Outlook is expecting. Without knowing any more about the situation, I'd try encoding the page in UTF-8 and ISO-8859-1.
The relevant 'more about the situation' would be what weird characters appear and what the page's encoding is currently.
If one is using SharePoint 2010, it seems Microsoft has been aware of this issue, and has supplied some functions to solve this.
The following will properly escape the link to the current page
escapeProperly(escapeProperlyCoreCore($(location).attr('href'), false, false, true))
In JavaScript you can use encodeURIComponent function for subject and body. Then it will show all special characters in email.
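// Example request; the addresses and company name are placeholders.
const emailRequest = {
  to: "abc@xyz.com",
  cc: "abc@xyz.com",
  subject: "Email Request - for <CompanyName>",
  body: `Hi All, \r\n \r\n This is my company <CompanyName>
\r\n Thanks`,
};
// encodeURIComponent escapes "&", spaces, line breaks and non-ASCII characters (as UTF-8)
// so they survive inside the mailto: query string.
const subject = encodeURIComponent(emailRequest.subject.replace("<CompanyName>", 'ABC & ** Company'));
const body = encodeURIComponent(emailRequest.body.replace("<CompanyName>", 'ABC & ** Company'));
// Navigating to the mailto: URI opens the user's default mail client.
window.location.href = (`mailto:${emailRequest.to}?cc=${emailRequest.cc}&subject=${subject}&body=${body}`);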