Is it possible to sniff out what font a browser is currently using from server side? - html

Basically I want to echo back to the user what font they are using. Is it possible to do? If there is a Wordpress plugin for it, that would be extra nice.
The page in question is going to show Chinese characters. My friend wants to have one column with a character as it looks in mainland China, and then a column with how it looks in Taiwan (there are slight, but distinct, differences - see the article at the link), both these columns will be pictures. And then a third column that displays the character using your font. It would be neat if there was a way to know which variant the user's font is displaying. But now that I write it out, it seems like a very hard problem.
Hacking Chinese on character differences

Related

What are these strange characters in HTML source?

My friend runs a website and had an e-mail from Google Safesearch informing him he was hosting a phishing page. Turns out his cPanel was bruteforced (weak password) and they uploaded some of the pages onto his server. He told me about it and I wanted to take a look at how sophisticated they are.
In many of the files, certain words/portions of text are strange. They display perfectly in a web browser, but are jumbled inside the HTML. I was wondering if anyone can tell me what this is?
Examples:
<title>WеlÑоmе tо еВаy: Sign in</title>
<span class="txtbox_title">Раsswоrd</span>
<a class="three" href="#">Fоrgоt yоur
It's also worth noting that there is normal text throughout the page that displays perfectly also.
I assume this is to stop the detection of certain words in the page, but I'm not sure. Any information would be great.
Edit: Originally was tagged as PHP. I realised that it probably shouldn't be so removed it. Be nice, kids.
Edit edit: For clarity, it's a phishing page targeting eBay users.
The examples I posted in the original post are (in order):
eBay: Sign In
Your Password
Forgot your [password]
As such I don't believe it to be any sort of malware, but a method of encrypting text to fight detection in browsers such as Chrome (which I assume detect 'hot' words in their algorithm).
They are UTF-8 encoded Cyrillic letters, and possibly other characters, chosen for their visual similarity to common Latin letters. You are viewing the page in an editor that interprets the data as Latin-1 rather than as UTF-8.
For example, what you see as “Ð¾” is actually two bytes, 0xD0 0xBE. When interpreted as UTF-8 data (which is what browsers do here), they represent “о” U+043E CYRILLIC SMALL LETTER O. It is identical with the common Latin letter “o” in visual appearance (in any font that contains both letters), but coded as a separate character due to belonging to a different writing system. To any program, they are quite distinct characters, unless the program has been separately coded to handle “confusables”.
Such confusion is often intentionally created for various reasons. You are probably right in assuming that here the purpose was “to stop the detection of certain words in the page”. When e.g. “Forgot” is written using Cyrillic o’s (Fоrgоt), normal Find operations will not find it when searching for “Forgot”.
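To make the byte-level explanation concrete, here is a minimal Python sketch. It decodes the two bytes 0xD0 0xBE both ways and shows that a plain substring search misses the spoofed word. The helper name non_latin_letters is made up for this illustration; it is a crude check based on Unicode character names, not a full confusables detector.

```python
import unicodedata

cyrillic_o = "\u043e"                  # CYRILLIC SMALL LETTER O
raw = cyrillic_o.encode("utf-8")       # the two bytes 0xD0 0xBE

as_utf8 = raw.decode("utf-8")          # what a browser displays
as_latin1 = raw.decode("latin-1")      # what a Latin-1 editor displays


def non_latin_letters(text):
    """Return (char, Unicode name) for every letter that is not Latin.

    Hypothetical helper: it only looks at the Unicode character name.
    """
    return [
        (ch, unicodedata.name(ch))
        for ch in text
        if ch.isalpha() and not unicodedata.name(ch, "").startswith("LATIN")
    ]


spoofed = "F\u043erg\u043et"           # "Forgot" written with Cyrillic o's

print(raw.hex(), repr(as_latin1))      # the bytes and their Latin-1 reading
print("Forgot" in spoofed)             # a plain search does not match
print(non_latin_letters(spoofed))      # the Cyrillic letters that were mixed in
```

A scanner that wants to catch this kind of spoofing would have to normalize or flag mixed-script words before matching, which is the "separately coded" handling mentioned above.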
My best guess is that it is a custom type of keylogger. The "WеlÑоmе tо еВаy" text would be parsed by the keylogger to output some data into a database that can be mined later for important information.
My second guess is that it is a means to scare or mess with the person who owns the site.
My third guess is that the virus was written in Chinese or some other language, and when the code was converted back into UTF-8, some of the unused characters came out as the strange content.
EDIT
My fourth guess is that the phishing website was programmatically fetching the source code of the eBay site and parsing it into its own HTML file, and that eBay has its own countermeasures against this type of attack: scrambling the letters in the source code.
If so, there must be some type of JavaScript that undoes the effects on the original source code.

Rendering barcodes in HTML with Code 128 font

Is it possible to render correct bar-codes in HTML using the Code 128 font?
The main content of the bar-code is fine in the browser (Firefox) but when I try to add the start code character I just get this character in the browser:
Ñ
This is character code 209 (outside the 7-bit ASCII range). I'm wondering if it even has a bar representation.
I'm using MVC but this is really just a HTML/CSS problem I think.
Thanks
This isn't quite what you asked for, but you can make barcodes using CSS: see http://unixshell.jcomeau.com/src/barcodes/memberships.html. I'm using code39 for this, but most other linear codes can be done the same way.
Are you sure that the client is going to have barcode font installed?
Server side image generation seems to be a better solution.
You may want to try Barcode.dll for barcode rendering.
It includes ASP.NET barcode control - just drag & drop.
Please note that this is a commercial product I developed.
I know this is years too late, but looking again at the question, I'm pretty sure you're just not using the right numeric code for your font. There is no single "Code 128 font". While 209 is shown by Wikipedia to be the correct "common" code for Start B, in various fonts I found online this is not the case. In this, Start B is 236; and here it's 204. Use the right code for your particular font, and you should get what you want.
A code point not encoded by the barcode font will be rendered by a default font, which is why you're seeing the N-tilde character (Ñ).
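As a rough Python sketch of what "use the right code for your font" means in practice: the symbol values (Start B = 104, Stop = 106) and the modulo-103 checksum come from Code 128 itself, but the mapping from values 95-106 to characters is font-specific. The high_offset parameter below is an assumption made for this example: it supposes the font places those symbols at one contiguous offset (105 gives Start B = 209, 100 gives 204). It also assumes value 0 is mapped to a plain space, which some fonts do not do. Check your font's documentation for its actual table.

```python
START_B, STOP = 104, 106   # symbol values defined by Code 128


def symbol_to_char(value, high_offset):
    """Map a Code 128 symbol value (0-106) to the character the font expects."""
    if value < 95:
        return chr(value + 32)        # printable ASCII, space through '~'
    return chr(value + high_offset)   # font-specific range (assumption)


def encode_code128b(text, high_offset=105):
    """Return start + data + checksum + stop characters for Code Set B."""
    values = [ord(ch) - 32 for ch in text]
    if any(v < 0 or v > 94 for v in values):
        raise ValueError("Code Set B data is limited to ASCII 32-126 here")
    # Checksum: start value plus each data value weighted by its 1-based position.
    weighted = sum(i * v for i, v in enumerate(values, start=1))
    checksum = (START_B + weighted) % 103
    symbols = [START_B] + values + [checksum, STOP]
    return "".join(symbol_to_char(v, high_offset) for v in symbols)


encoded = encode_code128b("A")               # Start B is chr(209) with offset 105
print([ord(ch) for ch in encoded])
```

The resulting string is what you would place inside the element styled with the barcode font. If the start character still shows up as an ordinary letter, the offset does not match the font.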

Embedding and Displaying chinese/japanese

I have been working on a subtitles engine for a Flash/FLV video player. On my Mac everything is great: nice aliased glyphs, all the characters displaying, etc. Switch to Windows, and it all goes out the window. Some machines with Eastern characters enabled display fine, but I can't guarantee all users will have this option selected.
I am using the TLFTextField, and I am pulling in UTF-8 XML with Chinese/Japanese characters.
I have tried embedding the required fonts/glyphs, but that pushes the file size up massively.
I have also tried changing it to Unicode, with no joy. Has anyone got any experience with displaying these characters while maintaining a low file size?
I'm not really offering a solution to your question, but if the user is wanting Chinese or Japanese subtitles, I'm pretty sure that they will have the correct encoding.

What is a good resource for HTML character codes -> glyph and

I've already found a good site to convert HTML character codes to their respective glyphs:
http://www.public.asu.edu/~rjansen/glyph_encoding.html
However, I need a bit more information. Does anyone know of a site like the one above that also provides information on what type of character code it is? Meaning, is it a special character? Is the glyph visible? Etc...
So far I have found some tables with this information, but they aren't as complete as the resource above. I would really like to get my hands on a complete table.
Thanks,
-Ben
HTML Entity Character Lookup
I like FileFormat.Info--e.g.: http://www.fileformat.info/info/unicode/char/20ac/index.htm
The character map on Ubuntu (and I assume most other Linux distros) is fantastic. You can search for any character by its name or description (e.g. "arrow") really easily.
Windows' character map is a poor imitation but kinda works too. It seems to decide that certain fonts (Arial, Verdana etc) can't display some characters, even though they work absolutely fine. (Hint: try MS's more recent font creations like Calibri for better results.)
Once you've found a character you can either:
Copy it and use it directly (requires pages to be UTF-8) like this: ↗
Insert it as a hexadecimal entity. The above character is "U+2197 North East Arrow" so the entity would be &#x2197;
Convert the hex code to decimal (the calculators on Windows and Linux can do this) and insert it as a decimal entity. The above example is &#8599;
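If you would rather not do the hex-to-decimal conversion by hand, a few lines of Python can produce all three forms from the character itself. This is only a small sketch; the function name entity_forms is made up for the example.

```python
import html


def entity_forms(ch):
    """Return the literal, hexadecimal and decimal HTML forms of one character."""
    code_point = ord(ch)
    return {
        "literal": ch,                      # needs the page to be served as UTF-8
        "hex": f"&#x{code_point:X};",       # hexadecimal numeric reference
        "decimal": f"&#{code_point};",      # decimal numeric reference
    }


forms = entity_forms("\u2197")              # U+2197 NORTH EAST ARROW
print(forms)

# Both numeric forms decode back to the same character.
print(html.unescape(forms["hex"]) == html.unescape(forms["decimal"]))
```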
Here's a quick, low-footprint way to look them up: &what;

What are the HTML entities for up and down triangles?

I've found the outlined versions, but I want the solid up and down triangles.
Does anyone know these entities?
All named HTML entities are specified in chapter 24 of the HTML standard. The only thing missing from the page are rendered entities, but you can easily create your own copy with the additional information by applying a simple regexp:
s/<!ENTITY (\S+)/<!ENTITY \1 &\1;/
Not all entities are named. For many, you need to specify the Unicode code point, either in decimal (&#9650; ▲, &#9660; ▼) or hex (&#x25B2; ▲, &#x25BC; ▼).
A little bit late, but you can use &blacktriangledown; (▾) and &blacktriangle; (▴) to make both the up and down filled-in triangles. I was looking for it myself and the alt codes didn't help, so I decided to share this. The same thing works for both left and right as well.
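To check which character a given entity actually produces, Python's standard html module can resolve both named and numeric references. The sketch below (describe is a made-up helper name) suggests that the named blacktriangle entities resolve to the small triangles, while the numeric codes 9650 and 9660 give the full-size ones.

```python
import html
import unicodedata


def describe(entity):
    """Resolve an HTML entity and return its code point and Unicode name."""
    ch = html.unescape(entity)
    if len(ch) != 1:
        # html.unescape leaves unknown entities untouched.
        raise ValueError(f"{entity!r} did not resolve to a single character")
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for entity in ("&blacktriangle;", "&blacktriangledown;", "&#9650;", "&#x25BC;"):
    print(entity, "->", describe(entity))
```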
I don't know if I've ever seen what you're looking for. Maybe a better way of doing it would be to create the arrows in Photoshop on a transparent background (.gif or .png format), and then load up the images.
Check that, you can do it through alt characters.
http://www.tedmontgomery.com/tutorial/ALTchrc.html
▼ ▲
Using the Alt characters on your computer keyboard is a big no-no if you are working on a web page, for many reasons: the encoding of the website, the encoding of the database driving the website (if any), the code page of the computer viewing the website, and the code page your own PC's keyboard is set to. Those are mostly factors you cannot control, so some people will see wonky letter combos or squiggle characters instead of what you intend. For web pages, use the HTML codes for those characters whenever you can, or at least entity-encode them and make sure the character encoding is declared in the HTML head of your site. That way people will see what you intend them to.
Now, if you are doing this in Word for a document that will be viewed in your own country, you are probably safe. But for online things (site coding or data entry) you should avoid this like the plague.
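One way to follow the "entity encode" advice for a whole piece of text is Python's built-in xmlcharrefreplace error handler, which turns every non-ASCII character into a decimal character reference. This is a minimal sketch, and ascii_safe is a made-up name. Note that it does not escape &, < or >, so it is not a substitute for normal HTML escaping.

```python
def ascii_safe(text):
    """Replace every non-ASCII character with a decimal character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(ascii_safe("\u25bc \u25b2"))   # the two solid triangles as numeric references
print(ascii_safe("plain text"))      # ASCII passes through unchanged
```

The output contains only ASCII, so it survives whatever code page the editor, database or server happens to use.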