I am having an issue with some characters (Ç, ~, í ...) when using Lazarus + Zeos + Access.
The problem is a bit weird: sometimes I can insert them properly, but sometimes the characters go crazy. For example:
When I am typing it is ok, the ç and ã
BUT when I exit the DBedit:
This happens sometimes, and sometimes the chars are registered just fine
Using zeos, with zeosconnection. ClientCodepage UTF8 / ControlsCodepage UTF8 / AutoEncodestrings true.
I tried changing the charset but the problem persists, and the worst thing is that sometimes it works and sometimes it seems to lose the charset...
This difference in behaviour occurs within the same run of the program. For example, when I save a change to an existing record in my database, everything is fine; then I try to create a new record and the problem occurs. The funny thing is: when I type "requisição" the chars stay the same, but when I go on to type "requisição de saída" the chars break. It seems as if the software is trying to auto-encode based on what I am typing.
Also, I have discovered that if I put an extra blank space after the word that has "...ção" in the end everything works as it should "requisição[][]de saída" where the [] are the two blank spaces
Any advice?
I am using the terminal in a chromebook to ssh into a remote server. When I run a MySQL (5.6) select query, sometimes one of the fields will return nonsense unicode (when the field should return an email address) and change the MySQL prompt from:
mysql>
to
└≤⎽─┌>
and whatever text I type is converted into weird unicode. The problem persists even after I exit MySQL
One of the values in your database happened to contain the byte sequence 0x1B, 0x28, 0x30 (ESC ( 0). When you ran the query, MySQL printed this byte sequence directly to your console. You can reproduce the effect from Python:
>>> print('\x1B\x28\x30')
Consoles use control characters (in particular 0x1B, ESC) as a way to allow applications to control aspects of the console other than pure text, such as colours and cursor movements. This behaviour is inherited from the old dumb-terminal devices that they are pretending to be (which is why they are also known as terminal emulators), along with some weirder tricks that we probably don't need any more. One of those is to switch permanently between different character sets (considered encodings, now, but this long predates Unicode).
One of those alternative character sets is the DEC Special Graphics Character Set which it looks like you have here. In this character set the byte 0x6D, usually used in ASCII for m, comes out as the graphical character └.
You could in principle reset your terminal to normal ASCII by printing the byte sequence 0x1B, 0x28, 0x42 (ESC ( B), but this tends to be a pain to arrange when your console is displaying rubbish.
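To make the character-set switch concrete, here is a small Python sketch. It does not touch the terminal; it only imitates what the terminal does after it has received ESC ( 0. The table is deliberately partial (just the five letters of the prompt in the question), and the names `DEC_GRAPHICS` and `render_in_dec_graphics` are made up for illustration.

```python
# Escape sequences that select the G0 character set: ESC ( <final byte>.
SELECT_DEC_GRAPHICS = "\x1b(0"  # 0x1B 0x28 0x30
SELECT_ASCII = "\x1b(B"         # 0x1B 0x28 0x42

# Partial DEC Special Graphics table: only the letters in "mysql".
DEC_GRAPHICS = {"l": "┌", "m": "└", "q": "─", "s": "⎽", "y": "≤"}


def render_in_dec_graphics(text):
    """Show how text looks once the terminal has switched character sets."""
    return "".join(DEC_GRAPHICS.get(ch, ch) for ch in text)


print(render_in_dec_graphics("mysql>"))  # the garbled prompt from the question
```

Writing `SELECT_ASCII` to the affected terminal is what switches it back.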
There are potentially other ways your console can become confused; it's not, in general, safe to print arbitrary binary data to the console. There even used to be nastier things you could do with the console by faking keyboard input, which made this a security problem, but today it's mostly just an annoyance.
However, one wouldn't normally expect to have any control codes in an e-mail address field. I suggest the application using the database should be doing some validation on the input it receives, and dropping or blocking all control codes (other than potentially newlines where necessary).
As a quick hack to clean this field for the specific case of the ESC character, you could do something like:
UPDATE things SET email=REPLACE(email, CHAR(0x1B), '');
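On the application side, the validation suggested above might look roughly like this in Python. This is only a sketch: the function name `strip_control_chars` is hypothetical, and it assumes that "control codes" means Unicode category Cc, with newlines allowed through.

```python
import unicodedata


def strip_control_chars(value, keep="\n"):
    """Drop every control character (category Cc) except those listed in keep."""
    return "".join(
        ch for ch in value
        if ch in keep or unicodedata.category(ch) != "Cc"
    )


dirty = "user\x1b(0@example.com"
print(repr(strip_control_chars(dirty)))
```

Like the SQL hack, this removes only the ESC byte itself; the "(0" that followed it stays in the value as ordinary text, so rejecting such input outright may be the better choice.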
I use SublimeText for writing Python. Every so often it will insert characters that I didn't type. Today's example:
Non-ASCII character '\xc2' in file
/path/to/my/project/forms.py on line 256, but no encoding
declared; see http://www.python.org/peps/pep-0263.html for details
(forms.py, line 256)
This doesn't happen to my colleagues and happens to me from time to time. I'm not sure what to do about it. I can delete the line and re-type it and it's fine. I have tried updating versions etc etc.
I don't want to just set the file encoding because I'm not actually typing non-ascii characters and that would be ignoring the actual problem.
Has anyone else found this? Solutions?
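That happens to me too! If you are on macOS, you're typing ⌥ + space, which inserts a non-breaking space (U+00A0, encoded in UTF-8 as the bytes 0xC2 0xA0) instead of a regular one; that is where the '\xc2' comes from. If you are on Windows/Linux, I guess it's Alt + space.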
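If you want to track down where the stray character sits rather than retype the line, a few lines of Python will do it. A minimal sketch; `find_non_ascii` is a made-up helper, and the sample text stands in for the contents of your own file.

```python
def find_non_ascii(text):
    """Return (line, column, character) for every non-ASCII character, 1-based."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((line_no, col_no, ch))
    return hits


# A non-breaking space hides between "==" and "1" on the second line.
source = "x = 1\nif x ==\u00a01:\n    pass\n"
print(find_non_ascii(source))
print("\u00a0".encode("utf-8"))  # the byte pair that starts with 0xC2
```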
I'm developing a website which lets people create their own translator. They can choose the name of the URL, and it is sent to a database and I use .htaccess to redirect website.com/nameoftheirtranslator
to:
website.com/translator.php?name=nameoftheirtranslator
Here's my problem:
Recently, I've noticed that someone has created a translator with special characters in the name -> "LAEFÊVËŠI".
But when it is processed (posted to a php file, then mysqli_real_escape_string) and added to the database it appears as simply "LAEFVI" - so you can see the special characters have been lost somewhere.
I'm not quite sure what to do here, but I think there are two paths:
Try to keep the characters and do some encoding (no idea where to start)
Ditch them and tell users to only use 'normal' characters in the names of their translators (not ideal)
I'm wondering whether it's even possible to have a url like website.com/LAEFÊVËŠI - can that be interpreted by the server?
EDIT1: I notice that Stack Overflow, on this very question, translates the special characters in my title to .../using-special-characters-in-urls! This seems like a great solution, I guess I could make a function that translates special characters like â to their normal equivalent (like a)? And I suppose I would just ignore other characters like /##"',&? Now that I think of it, there must be some fairly standard/good-practice strategies for getting around problems like this.
EDIT2: Actually, now that I think about it (more) - I really want this thing to be usable by people of any language (not just English), so I would really love to be able to have special characters in the urls. Having said this, I've just found that Google doesn't interpret â as a, so people may have a hard time finding the LAEFÊVËŠI translator if I don't translate the letters to normal characters. Ahh!
Okay, after that crazy episode, here's what happened:
Found out that I was removing all the non alpha-numeric characters with PHP preg_replace().
Altered preg_replace so it only removes spaces and used rawurlencode():
$name = mysqli_real_escape_string($con, rawurlencode( preg_replace("/\s/", '', $name) ));
Now everything is in the database encoded, safe and sound.
Used this rewrite rule RewriteRule ^([^/.]+)$ process.php?name=$1 [B]
Ran around in circles for 2 hours thinking my rewrite was wrong because I was getting "page not found"
Realised that process.php didn't have a rawurlencode() to read in the name
$name = rawurlencode($_GET['name']);
Now it works.
WOO!
Sleep time.
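For anyone who wants to see what the percent-encoding step does to a name, here is the same idea sketched in Python rather than PHP. It assumes `urllib.parse.quote` with `safe=""` as a rough stand-in for `rawurlencode()`; `encode_name` is a hypothetical helper, not part of the site's code.

```python
from urllib.parse import quote, unquote


def encode_name(name):
    """Remove whitespace, then percent-encode the UTF-8 bytes of the name."""
    compact = "".join(name.split())
    return quote(compact, safe="")


stored = encode_name("LAEFÊVËŠI")
print(stored)           # the ASCII-only form that goes into the database
print(unquote(stored))  # decoding restores the original name
```

The stored value contains only ASCII characters, which is why it survives both the database and the rewrite rule.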
I’m creating a MySQL database storing Chinese characters with associated pīnyīn pronunciations. I’ve set everything up to work in the UTF-8 charset, so I’m having no trouble with most of the symbols I’m using. Except, strangely, certain Latin characters with tone marks, and only when I write them into the database from $_POST, using PHP.
Those are: all characters with an acute accent (á, é, í, ó, ú), except ǘ (?!); and all characters with a grave accent (à, è, ì, ò, ù), again, except ǜ. When they are typed into a form and that form is submitted to the db, those characters are just cut off, as if they never existed. E.g., cháng is stored as chng. Any other characters (with a caron, like ǎ, or a macron, like ā) are written in fine, and so are actual Chinese characters.
Again, I’m using UTF-8 everywhere possible, and this sort of problem so far has been only experienced upon submitting data from a form. Before, I ran a script to manually insert an array, containing those characters, to the database, and everything went fine.
Any ideas?
You could post the pinyin in a numbered format instead,
e.g. cháng as cha2ng,
and then convert it back to accented pinyin in the PHP script with a mapping.
Here's a method to deal with it.
Convert numbered to accentuated Pinyin?
Hopefully, it helps you.
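The mapping idea can be sketched as follows (in Python here, purely to illustrate; the same approach carries over to PHP). It assumes the tone digit comes directly after the vowel it belongs to, as in the cha2ng example, and `numbered_to_accented` is a hypothetical name.

```python
import re
import unicodedata

# Tone digit -> combining diacritic (macron, acute, caron, grave).
TONE_MARKS = {"1": "\u0304", "2": "\u0301", "3": "\u030c", "4": "\u0300"}


def numbered_to_accented(text):
    """Turn a vowel followed by a tone digit into the accented vowel."""
    marked = re.sub(
        r"([aeiouü])([1-4])",
        lambda m: m.group(1) + TONE_MARKS[m.group(2)],
        text,
    )
    # NFC merges each vowel and its combining mark into one precomposed character.
    return unicodedata.normalize("NFC", marked)


print(numbered_to_accented("cha2ng"))
```

Only lowercase vowels are handled, and a neutral tone (no digit) passes through unchanged.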
I got a solution!
Before:
SELECT 'liàng' = 'liǎng';
Change to:
SELECT CONVERT('liàng' USING BINARY)= CONVERT('liǎng' USING BINARY) as equal;
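Presumably the first query treated the two strings as equal because the comparison used an accent-insensitive collation, while the binary conversion compares the raw bytes. The Python sketch below only imitates that difference with Unicode normalisation; it is not how MySQL implements collations, and `strip_accents` is a made-up helper.

```python
import unicodedata


def strip_accents(text):
    """Decompose to NFD and drop the combining marks (category Mn)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


a, b = "liàng", "liǎng"
print(strip_accents(a) == strip_accents(b))    # accent-insensitive view
print(a.encode("utf-8") == b.encode("utf-8"))  # byte-wise view
```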
I've asked this question before, here, however that solution didn't fix it when I looked closely. This is the problem.
For some reason, my MySQL table is converting single and double quotes into strange characters. E.g.
"aha"
is changed into:
“ahaâ€
How can I fix this, or detect this in PHP and decode everything??
Previously I tried running this query right after connecting to MySQL:
$sql="SET NAMES 'latin1'";
mysql_query($sql);
But that no longer has any effect. I'm seeing strings such as:
“aha†(for "aha")
It’s (for "It's")
etc.
Any ideas?
As per the answer to your original question, your input is actually in UTF-8, but the output you're seeing looks wrong because your output terminal and/or browser is set to the (single byte) character encoding "Windows 1252".
If you just make sure that your output is also set to UTF-8 then everything should be fine.
See Quotation marks turn to question marks
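You can reproduce the effect outside the database to convince yourself that the stored data is fine. A small Python sketch, with hypothetical helper names, that misreads UTF-8 bytes as Windows-1252 and then reverses the mistake:

```python
def misread_as_cp1252(text):
    """Encode as UTF-8, then decode those bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair(garbled):
    """Undo the misreading: recover the original bytes, then decode as UTF-8."""
    return garbled.encode("cp1252").decode("utf-8")


original = "It\u2019s"  # "It's" written with a typographic apostrophe
garbled = misread_as_cp1252(original)
print(garbled)
print(repair(garbled) == original)
```

The closing double quote is a special case: its last UTF-8 byte is 0x9D, which Windows-1252 leaves undefined, so Python raises a UnicodeDecodeError for it. That missing byte would also explain why the closing quote in the question shows up as a truncated â€.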