Using the Moodle user import from CSV, we have the problem that some German names with letters like Ö, ä, ü are imported incorrectly. I presume the problem is the encoding. Here are the two possibilities I tested:
ANSI-encoding: The German letters disappear, for example Michael Dürr appears like Michael Drr in the listed users to import.
UTF-8-encoding: The letters appear as Michael Drürr
Does anyone have a solution for this problem, or does it have to be fixed one by one in the user list?
I'm guessing the original file is using a different encoding. Try converting the CSV file to UTF-8 and then import it.
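A minimal sketch of such a conversion in Python, in case no suitable editor is at hand. The function name is made up for illustration, and the default source encoding is an assumption: cp1252 is what Windows tools usually mean by "ANSI" on Western European systems, so adjust it if your file comes from somewhere else.

```python
from pathlib import Path


def convert_to_utf8(src: str, dst: str, source_encoding: str = "cp1252") -> None:
    """Re-encode a text file as UTF-8.

    source_encoding is an assumption (cp1252, i.e. Windows "ANSI" on
    Western European systems); pass the real encoding if you know it.
    """
    # Work on raw bytes so line endings are passed through unchanged.
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    convert_to_utf8("users_ansi.csv", "users_utf8.csv")
```

If the guessed source encoding is wrong, the output will still be valid UTF-8 but contain the wrong characters, so check one of the affected names after converting.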
How do I correct the character encoding of a file?
You have to configure the database connection so that the encoding you choose for your web application (Moodle) is the same as the encoding your database connection uses.
Look for SET NAMES 'utf8' or similar if you use MariaDB/MySQL as the database.
And of course compare that with the encoding of your import file; you may need to convert it first. In any case, the encoding of your web GUI, the file, and the database connection (client character set) should be the same.
For the web application, check in your browser via View->Encoding or something similar, or check the meta tag for the encoding in your HTML source code.
For the file, use an editor that displays the characters correctly and indicates the charset.
For the database, it depends on your database.
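For the file check, if no editor is available, a rough guess can also be made from the bytes. This is only a heuristic sketch with a made-up function name: it cannot tell the many single-byte encodings apart, and the final cp1252 fallback is an assumption, not a detection.

```python
import codecs


def guess_encoding(data: bytes) -> str:
    """Return a rough guess at the encoding of the given file content."""
    # A byte order mark at the start is the only explicit hint.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # Pure ASCII content is valid in ASCII, UTF-8 and most 8-bit encodings.
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    # Non-ASCII bytes that form valid UTF-8 sequences are very likely UTF-8.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumption: fall back to the usual Windows "ANSI" code page.
        return "cp1252"
```

For example, `guess_encoding(open("users.csv", "rb").read())` (with a hypothetical file name) reports which of these cases the import file falls into.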
I have hard-coded Chinese characters in PeopleCode. They are written to a CSV file, and this CSV file is attached to an email notification. However, when the user receives the email and opens the CSV attachment, the Chinese characters are shown as weird symbols. By the way, I am using an App Engine program that runs on PSUNX.
Does anyone have a workaround for this?
The problem appears to be that you are not writing the same character set that your recipient is opening the file with. Since you are using UTF8, your choice does support the Chinese characters.
I see you have a couple of options:
Find out the character set your recipient is using and use that character set when writing the file.
Educate the recipient that the file is in UTF8 and that they may need to open it differently. Here is a link on how to open a CSV using UTF8 in Excel.
Alright, I managed to solve it using UTF8BOM.
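For readers not on PeopleCode, the same idea can be sketched in Python: write the file as UTF-8 with a byte order mark in front, which is the hint Excel typically uses to recognise UTF-8. The function name and the sample row are made up for illustration.

```python
import csv


def write_csv_with_bom(path: str, rows: list) -> None:
    """Write rows as a UTF-8 CSV file that starts with a byte order mark."""
    # The "utf-8-sig" codec prepends the UTF-8 BOM (bytes EF BB BF).
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        csv.writer(f).writerows(rows)


if __name__ == "__main__":
    # Hypothetical file name and data.
    write_csv_with_bom("report.csv", [["name", "city"], ["王小明", "上海"]])
```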
I am working on a Talend project where we are transforming data from thousands of XML files to CSV, and we set the CSV file encoding to UTF-8 in Talend itself.
But the issue is that some of the files are created as UTF-8 and some as ASCII. I am not sure why this is happening; the files should always be created as UTF-8.
As mentioned in the comments, UTF8 is a superset of ASCII. This means that any ASCII character has the same code point, and the same single-byte encoding, in UTF8 as in ASCII.
Any program identifying a file containing only ASCII characters will then simply assume it is ASCII encoded. It is only when you include characters outside of the ASCII character set that the file may be recognised by whatever heuristic the reading program uses.
The only exception to this is for file types that specifically state their encoding. This includes things like (X)HTML and XML which typically start with an encoding declaration.
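A quick way to convince yourself of this, sketched in Python (the helper name and sample strings are made up): text that contains only ASCII characters produces exactly the same bytes in both encodings, so nothing in the file can distinguish them.

```python
def same_bytes_in_ascii_and_utf8(text: str) -> bool:
    """True if the text encodes to identical bytes in ASCII and UTF-8."""
    try:
        return text.encode("ascii") == text.encode("utf-8")
    except UnicodeEncodeError:
        # The text contains a character outside ASCII, so there is
        # no ASCII encoding to compare against.
        return False


print(same_bytes_in_ascii_and_utf8("id;name\n1;Smith\n"))  # True
print(same_bytes_in_ascii_and_utf8("id;name\n1;Dürr\n"))   # False
```

So a file reported as ASCII is also a valid UTF-8 file; only files with characters outside ASCII can be reported as UTF-8.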
You can go to the Advanced tab of the tFileOutputDelimited (or other kind of tFileOutxxx) you are using and select UTF-8 encoding.
I am quite sure the Unix file utility makes assumptions based on the content of the file being in some range and/or having a specific start (magic numbers). In your case, if you generate a perfectly valid UTF-8 file but use only the ASCII subset, the file utility will probably flag it as ASCII. In that event you are fine, as you have a valid UTF-8 file. :)
To force Talend to produce a file the way you wish, you can add an additional column to your file (for example in a tMap) and put a non-ASCII character in this column. The generated file will then be recognised as UTF8, as the other repliers mentioned.
My table needs to support pretty much all characters (Japanese, Danish, Russian, etc.)
However, while saving the 2-columned table as CSV from Excel with UTF-8 encoding, then importing it with phpMyAdmin with UTF-8 encoding selected, a lot of the original characters go missing (the ones with special properties such as umlauts, accents, etc.) Also, anything following problematic characters is removed entirely. I haven't the slightest idea what is causing this problem.
EDIT: For those that come upon the same issue, I'd suggest opening your CSV file in Notepad++ and going to "Encoding > Convert to UTF-8" (not "Encode in UTF-8") first. Then import it. It will surely work.
I found an answer here:
https://help.salesforce.com/apex/HTViewSolution?id=000003837
Basically, save as a Unicode text file from Excel,
then replace all tabs with commas in a code-friendly text editor,
re-save as UTF-8,
and change the file extension from .txt to .csv.
Exporting directly from Excel to .csv causes problems with Japanese, which is why I went searching for help...
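The steps above can also be scripted. This is a sketch in Python with a made-up function name, assuming that Excel's "Unicode Text" export is tab-separated UTF-16 with a byte order mark; check your own export before relying on it.

```python
import csv


def unicode_text_to_utf8_csv(src: str, dst: str) -> None:
    """Convert a tab-separated UTF-16 text file into a UTF-8 CSV file."""
    # Assumption: src is UTF-16 with a BOM, fields separated by tabs.
    with open(src, "r", encoding="utf-16", newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        reader = csv.reader(fin, delimiter="\t")
        # The csv writer quotes fields that themselves contain commas,
        # which a plain tab-to-comma replace in an editor would not do.
        csv.writer(fout).writerows(reader)


if __name__ == "__main__":
    # Hypothetical file names.
    unicode_text_to_utf8_csv("export.txt", "export.csv")
```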
http://www.mamstore.co.uk/bin/pxisapi1.exe/catalogue?level=805838
Look where it's meant to say £5 T-shirts. Instead the '£' comes up as an invalid character, yet the exact same character is shown correctly just below on the products.
I am getting the same when I pull a PHP file's contents in with jQuery. The actual PHP file shows the characters correctly (without any head/body set, etc.), but as soon as I pull it into the site it suddenly has issues with it.
It's stored in an SQL DB on a custom-built CMS/WMS system.
Any suggestions would be much appreciated.
Cheers
Your page is encoded in UTF, but the character in the breadcrumbs is encoded in ISO. What encoding do you have in your database?
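To illustrate what such a mismatch looks like, here is a small Python sketch. It assumes that "UTF" means UTF-8 and "ISO" means ISO-8859-1 (Latin-1), which the thread does not confirm; the helper name is made up.

```python
# The same character has different byte representations:
pound_latin1 = "£".encode("latin-1")  # one byte:  0xA3
pound_utf8 = "£".encode("utf-8")      # two bytes: 0xC2 0xA3


def show_as_utf8(raw: bytes) -> str:
    """Decode bytes the way a UTF-8 page would, marking invalid bytes."""
    return raw.decode("utf-8", errors="replace")


# A lone 0xA3 byte is not valid UTF-8, so it turns into U+FFFD,
# the "invalid character" placeholder.
print(show_as_utf8(pound_latin1 + b"5 T-shirts"))
# The UTF-8 bytes decode cleanly.
print(show_as_utf8(pound_utf8 + b"5 T-shirts"))
```

Whichever side is off (page, file or database connection), the fix is to make them all agree on one encoding.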
I'm saving out a .csv file from Excel and importing it to a MySQL database (with phpMyAdmin 2.6.4-pl3).
A few fields have trademark symbols, but they show up as "ª". I thought it had something to do with the encoding of the fields from the database, but I have changed them and found no difference. UTF-8 at least shows the small 'a,' while others I have tried just convert it to a '?'. If I leave it at UTF-8 and manually go in after importing the .csv to change the 'ª' to '™' it works fine, but since I have about 150 products that would take forever.
I think the issue is that Excel does not export the .csv file as UTF-8, so the character gets lost. I am exporting this information to a PDF so I cannot use any standard web workarounds like I have seen on other posts.
Any ideas on a way to fix this? Thanks.
MySQL allows the specification of the encoding for each database. Either change the database's encoding to something useful, like UTF-8, or convert your input data to the current database encoding.
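Converting the input data first requires knowing what encoding Excel actually wrote. One possible explanation for the "ª", which is only a guess and not confirmed in this thread, is that the file was saved in Mac Roman: there '™' is the byte 0xAA, and the same byte read as Latin-1 is 'ª'. A Python sketch of that check and of the conversion, with made-up helper names:

```python
def misread(text: str, written_as: str, read_as: str) -> str:
    """Show how text looks when written in one encoding and read in another."""
    return text.encode(written_as).decode(read_as)


def recode(raw: bytes, source_encoding: str, target_encoding: str = "utf-8") -> bytes:
    """Convert raw file content from its real encoding to the target one."""
    return raw.decode(source_encoding).encode(target_encoding)


# Assumption: the CSV was written as Mac Roman and read as Latin-1.
print(misread("™", "mac_roman", "latin-1"))  # ª
```

If that guess matches your file, `recode(data, "mac_roman")` on the raw CSV bytes gives UTF-8 content that can be imported with UTF-8 selected; if it does not, try the encoding your Excel version really uses.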
Use an OpenOffice spreadsheet to import the data into SQL instead of Excel and a CSV/txt file.
You can convert the Excel or CSV file into an OpenOffice spreadsheet and import it in phpMyAdmin.