Telegram bot can't recognize Cyrillic text - JSON

Does anyone know how to make json() handle Cyrillic text? Postman displays it correctly, but Telegram does not. [enter image description here](https://i.stack.imgur.com/zcE9U.png)
I need the Cyrillic text to be recognized.
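
The question shows no code, so the following is only a guess at a common cause. In Python, `json.dumps` escapes non-ASCII characters by default. The result is valid JSON, but it looks wrong if the escaped text is shown to a user without being decoded. A minimal sketch, using a made-up `payload`:

```python
import json

# Hypothetical payload; the real bot's data is not shown in the question.
payload = {"text": "Привет"}

escaped = json.dumps(payload)                       # non-ASCII written as \uXXXX escapes
readable = json.dumps(payload, ensure_ascii=False)  # Cyrillic characters kept as-is

print(escaped)
print(readable)

# Both forms are the same JSON document and decode to identical data.
assert json.loads(escaped) == json.loads(readable) == payload
```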

Related

How to change JSON non-english non-text to "proper" non-english letters?

I scraped a non-English tweet, and I got a JSON file.
But the problem is that it is all written as this weird code.
\uc774\uc7ac\uba85 \ub300\ud45c\ub2d8\ud83d\udc99\n\ubd80\uc0b0 \uc88c\uccad\ub144 \uc601\uc0c1 \uaf2d \ub2e4\uc2dc \ubd10\uc8fc\uc138\uc694\ud83e\udef6\
How can I convert this into the actual non-English text?
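
The `\uXXXX` sequences are standard JSON string escapes, so a JSON parser should turn them back into characters. A minimal Python sketch, assuming the escaped text is available as a plain string (`raw` holds a shortened piece of the sample above):

```python
import json

# A shortened piece of the escaped text from the question.
raw = r"\uc774\uc7ac\uba85 \ub300\ud45c\ub2d8\ud83d\udc99"

# Wrapping the text in double quotes makes it a JSON string literal,
# so json.loads resolves every \uXXXX escape, including surrogate pairs (emoji).
decoded = json.loads(f'"{raw}"')
print(decoded)
```

If the whole file is valid JSON, calling `json.load` on the open file decodes every string in it the same way, with no need for the quote-wrapping step.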

Cannot type Vietnamese in HTML form

I made this basic HTML form with a "mailto" action, which displays what you typed in a Gmail window. The code works fine with English, but when I try Vietnamese (and, I assume, other languages as well), the Gmail window displays weird % signs and numbers. What could I add to the code so that it displays exactly the text I typed?
This is the code
This is the page rendered from the code, with Vietnamese typed into the form
and this is the Gmail window with the weird characters
P.S. At first I thought that the line charset = "utf-8" would take care of this kind of thing, but it seems it does not.
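
The "% signs and numbers" are most likely percent-encoding, which is how non-ASCII text is carried inside a URL such as a `mailto:` link; the receiving side is expected to decode it. The form's code is not shown here, so this is only an illustration of the encoding itself, sketched in Python with a sample string that is not from the question:

```python
from urllib.parse import quote, unquote

text = "Tiếng Việt"  # sample input, not taken from the question

encoded = quote(text)       # non-ASCII characters become UTF-8 bytes written as %XX
decoded = unquote(encoded)  # reverses the encoding

print(encoded)
assert decoded == text
```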

How to decode this text and display it as Arabic?

I have some text in a database that is encoded in some way, but I don't know how exactly, and the person who built the database and website left no documentation or comments of any kind. As an example, ÌÈÑÇä-1914 is stored in the database, but on the website it is displayed as جبران-1914. How do I decode ÌÈÑÇä-1914 in order to get the Arabic text?
I've tried decoding ÌÈÑÇä-1914 as windows-1256, ISO-8859-1, and UTF-8, and I've tried putting it in an HTML file with lang-eg in the html tag and meta charset=utf-8, but nothing lets me display it as Arabic text instead of indecipherable characters. How can I convert the encoded text into Arabic?
I realized the problem was that I was exporting the data incorrectly: when I exported it as ISO 8859-2 and converted it to Windows-1256 in Atom, it displayed properly.
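
For the sample string itself, the mismatch can be reproduced in Python. This is a sketch under one assumption: the bytes were written as Windows-1256 and later read back as Latin-1. Other stored values may have gone through a different pair of encodings.

```python
stored = "ÌÈÑÇä-1914"

# Assumption: the bytes were written as Windows-1256 but read back as Latin-1.
# Re-encoding as Latin-1 recovers the original bytes; decoding them as cp1256
# then interprets those bytes as Arabic letters.
fixed = stored.encode("latin-1").decode("cp1256")
print(fixed)
```

For this sample the result matches the text shown on the website, جبران-1914.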

Translate text inside HTML using Python

I have written an API that translates text from English to Hindi. However, if the text passed in is HTML, my code fails.
How do I get this right?
I have tried using py-translate, but this package is not able to convert HTML text properly.
I have also tried using the googleclient package; it is able to convert HTML text but can handle only one request at a time.
My API should be able to handle multiple requests and also deal with HTML text translation.
Any help is appreciated.
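
One possible approach is to parse the HTML, translate only the text between tags, and write the tags back unchanged. The sketch below uses the standard library's `html.parser`; `TextTranslator`, `translate_html`, and the `translate` callback are hypothetical names, and `str.upper` stands in for a real English-to-Hindi call. It does not address handling multiple requests.

```python
from html import escape
from html.parser import HTMLParser


class TextTranslator(HTMLParser):
    """Rebuilds the HTML, passing only visible text through `translate`."""

    def __init__(self, translate):
        super().__init__(convert_charrefs=True)
        self.translate = translate
        self.parts = []
        self.skip = False  # True while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        self.skip = tag in ("script", "style")
        self.parts.append(self.get_starttag_text())  # tag copied verbatim

    def handle_startendtag(self, tag, attrs):
        self.parts.append(self.get_starttag_text())  # e.g. <br/>

    def handle_endtag(self, tag):
        self.skip = False
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip or not data.strip():
            self.parts.append(data)  # code or whitespace: leave untouched
        else:
            # Entities were decoded by the parser, so escape the result again.
            self.parts.append(escape(self.translate(data), quote=False))

    def handle_comment(self, data):
        self.parts.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.parts.append(f"<!{decl}>")


def translate_html(markup, translate):
    parser = TextTranslator(translate)
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


# str.upper is a placeholder for a real translation function.
print(translate_html("<p>Hello <b>world</b></p>", str.upper))
```

Translating each text fragment separately keeps the markup intact, but a sentence split by inline tags is translated in pieces, which may hurt translation quality.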

What is this image stored as?

I want to extract these telephone numbers from the website, either as an image or if possible as a string.
Here is an example from the website: Link
As you can see the telephone number is an image.
However, I can't seem to view the image when I open the image source:
<img src="http://www.callmyname.sg/search/display_phone_number/VUhkVE1WOW5BV1lFWWxSbVhUdFRObGMzQlRBRU9nPT0=">
But when it is put into HTML and viewed in a browser, the image displays fine.
It's a solution to prevent people like you from scraping their website :)
The URL http://www.callmyname.sg/search/display_phone_number/VUhkVE1WOW5BV1lFWWxSbVhUdFRObGMzQlRBRU9nPT0= leads to a script that generates the image, probably based on the argument:
VUhkVE1WOW5BV1lFWWxSbVhUdFRObGMzQlRBRU9nPT0=
Since it ends with an equals sign, I tried to decode it as base64:
UHdTMV9nAWYEYlRmXTtTNlc3BTAEOg==
Now it looks even more like base64, so I tried another round:
PwS1_gfbTf];S6W70:
So it's clearly not plaintext that was simply encoded with base64; that would be ridiculous, since it would let you extract the number this way. They either use some special cipher or store the numbers in a database with this string as the identifier.
I don't think you can steal the phone number easily, except perhaps by using OCR.
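
The two decoding rounds can be reproduced with Python's standard `base64` module. The second round yields raw bytes, some of them non-printable, which is why the text form shown above looks shorter than the actual result:

```python
import base64

token = "VUhkVE1WOW5BV1lFWWxSbVhUdFRObGMzQlRBRU9nPT0="

first = base64.b64decode(token)   # still looks like base64 text
second = base64.b64decode(first)  # raw bytes, not readable text

print(first.decode("ascii"))
print(second)
```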
When you visit the URL, you will get garbage, since they do not send a proper MIME header:
�PNG IHDR�,���tRNS���7X}4IDATx���_HZo�g�� E��p��l��EHTx!]�DtQ�M�.x3��.dx�*b]Dl"]�D���bQq.B����Z2$��:ȡ�wq��9�s���Cx>W�}���ٳ��ڶ����]���Ǐ�/_���ݿ���ahh���\q����������555�=���*�"�*�*�f�����}uu�e�d2���o����?00p����J%ȴds���BB�˲�`�`0RJy����n�{cc�e�H$b�ۻ����(�~�_����A4�Z��_�V|��J�w�����t:��333.��ƕ������+^����L`���֑��W��3�X�" y���$p'U"��F���y���z&�ioo��萟�*� ����\�L&Sx����p�e���ׯ_R��y�J%�~����|qq��|e�Z%:�J�{��q��nW�ՉD"�J��~�n4��������̔Ty���qF���>BwGa�z����������8��ߡc�f��B�>!�Ub�N�s���|�F�^/B���Lj��i��NfJ��͛D"����� o!t��`����fvv�eم��V���D)�����x���d2966&�n� ^,0O4��(!D��l�h46�-�~��Tً>B�"�Q�>,�P��ok#U \�BU,�P���=G SA+GIEND�B`�
but it's really just an ordinary PNG image:
img http://www.callmyname.sg/search/display_phone_number/VUhkVU5scGlBV1lDWWdFelVEUUhZQWRvQlRZR013PT0=
It's a PNG image, but the server doesn't specify the right content header. It tells your browser that it's an HTML page in UTF-8 encoding, so you just see some garbage (including the letters PNG at the start).
The <img> tag, though, doesn't know how to display text, so it just tries to load the response as an image (and succeeds).
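
Because the header is unreliable here, a script that downloads the file could check the bytes instead. A minimal Python sketch; `looks_like_png` and `save_image` are hypothetical helper names, and the download step is left out:

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the fixed 8 bytes that start every PNG file


def looks_like_png(data: bytes) -> bool:
    """Identify a PNG by its leading bytes instead of trusting Content-Type."""
    return data.startswith(PNG_SIGNATURE)


def save_image(data: bytes, stem: str) -> str:
    """Write the bytes to disk, choosing the extension from the content."""
    path = stem + (".png" if looks_like_png(data) else ".bin")
    with open(path, "wb") as fh:
        fh.write(data)
    return path
```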
I don't see a way to extract the numbers other than by reading the image. Because it contains only digits and will have a similar format every time, maybe you can find a simple way to parse it instead of using a full-fledged OCR library.
It's actually a PNG file, generated by a computer before being displayed. You can reference it fine from any other page, though, and you should also be able to download it easily (right click, save as ...). Note: I tested this; make sure you save the image with the extension .png and not .html, which it will default to.
<img src="http://www.callmyname.sg/search/display_phone_number/QkNOVE1RODNBV1lDWWdVM1V6ZFZNZ1JyRFQ0Rk1BPT0=">