AWS Data Pipeline - issue with accented characters - MySQL

I am new to AWS Data Pipeline. I created a pipeline that successfully pulls all the content from RDS into an S3 bucket. Everything works, and I see my .csv file in the S3 bucket. But I store Spanish names in my table, and in the CSV I see "Garc�a" instead of "García".

It looks like the wrong code page is being used. Specify the correct code page and you should be fine. The following topic might help: Text files uploaded to S3 are encoded strangely?

AWS Data Pipeline is implemented in Java and uses JDBC (Java Database Connectivity) drivers (specifically, MySQL Connector/J for MySQL in your case) to connect to the database. According to the Using Character Sets and Unicode section of the documentation, the character set used by the connector is determined automatically from the character_set_server system variable on the RDS/MySQL server, which is set to latin1 by default.
If this setting is not correct for your application (run SHOW VARIABLES LIKE 'character%'; in a MySQL client to confirm), you have two options to correct this:
Set character_set_server to utf8 on your RDS/MySQL server. To make this change permanent from the RDS console, see Modifying Parameters in a DB Parameter Group for instructions.
Pass additional JDBC properties in your DataPipeline configuration to override the character set used by the JDBC connection. For this approach, add the following JDBC properties to your RdsDatabase or JdbcDatabase object (see properties reference):
"jdbcProperties": "useUnicode=true,characterEncoding=UTF-8"

This question is a little similar to Text files uploaded to S3 are encoded strangely? If it is the same issue, kindly see my answer there.

Related

Node.js/Express: select avatar from web client and save directly to MySQL database

Looking for some guidance here. I am building a Node.js/Express app with a MySQL database. I want to be able to click on the user's image on the web page (the initial image is generic), select a different image, and upload/save the new image into the MySQL database. I am using:
// Open the file picker, then show the path the browser reports for the chosen file
$('#file-input').trigger('click').change(function () {
    alert($('#file-input').val());
});
I get C:/fakepath... for the image location. I would like to know how to upload/save the selected image to the MySQL database. The connection to the database is established, and routes for regular data work just fine.
Before answering your question, I will suggest that you not save images in MySQL or any other database; use IPFS, a local application directory/folder, or, best of all, an AWS S3 bucket.
You can use the busboy.js or multer.js npm modules for uploading files to the server; there are lots of good reasons not to save any kind of file in a local database.
Now back to how you can save an image in the database. You first need to get the image into a form MySQL can store. An image is binary data, and depending on the image selected, the binary can be too big even for MySQL's TEXT datatype; converting the binary to hexadecimal makes it representable as text but also makes it larger, so a BLOB column is the better fit. You will also need a multipart/form-data form for the file upload.
You can easily find "How to upload a file in Node.js?" with a Google search. Still, if you need an example, here's one: "Upload file using multer.js"
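A minimal sketch of the multer approach, assuming a users table with a MEDIUMBLOB avatar column and the mysql2 driver; the route, table, and credentials are placeholders:

const express = require('express');
const multer = require('multer');
const mysql = require('mysql2/promise');

const app = express();
// Keep the upload in memory so its raw bytes can be written straight to MySQL
const upload = multer({ storage: multer.memoryStorage() });
const pool = mysql.createPool({ host: 'localhost', user: 'app', password: 'secret', database: 'mydb' });

// The client must POST multipart/form-data with the file field named "avatar"
app.post('/users/:id/avatar', upload.single('avatar'), async (req, res) => {
    // req.file.buffer holds the image bytes; the driver binds a Buffer as a BLOB
    await pool.query('UPDATE users SET avatar = ? WHERE id = ?', [req.file.buffer, req.params.id]);
    res.sendStatus(204);
});

app.listen(3000);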

User import from CSV German

Using the Moodle user import from CSV, we have the problem that some German names with letters like Ö, ä, ü are imported incorrectly. I presume that the problem is the encoding; here are the two possibilities that I tested:
ANSI encoding: the German letters disappear; for example, Michael Dürr appears as Michael Drr in the listed users to import.
UTF-8 encoding: the letters appear as Michael Drürr.
Does anyone have a solution for the problem, or does it have to be fixed one by one in the user list?
I'm guessing the original file is using a different encoding. Try converting the CSV file to UTF-8, then import.
How do I correct the character encoding of a file?
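If the file was exported from Excel or another Windows tool, it is often Windows-1252 ("ANSI"); a one-line conversion with iconv, where the source encoding and file names are assumptions to verify:

iconv -f WINDOWS-1252 -t UTF-8 users.csv > users-utf8.csv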
You have to configure the database connection to make sure the encoding you chose for your web application (Moodle) is the same as the encoding your database connection will choose.
Look for SET NAMES 'utf8' or similar if you use MariaDB/MySQL as the database.
And compare, of course, with the encoding of your import file; maybe you will need to convert it first. In any case, the encoding of your web GUI, the file, and the database connection (client character set) should all be the same.
For the web application, check in your browser via View -> Encoding or something similar, or check the meta tag for the encoding in your HTML source code.
For the file, use an editor or similar tool that displays the characters correctly and indicates the charset.
For the database, it depends on your database.
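For MariaDB/MySQL, a quick way to inspect and set the connection character set from any SQL client (utf8mb4 is assumed as the target encoding):

SHOW VARIABLES LIKE 'character_set%'; -- server, database, client, and connection charsets
SET NAMES 'utf8mb4'; -- sets the client/connection character set for this session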

Configure GeoServer to use an external EPSG database

GeoServer uses an embedded HSQL database with EPSG codes to perform coordinate conversions.
(Ref: http://docs.geoserver.org/latest/en/user/advanced/crshandling/manualepsg.html)
I am trying to find out how it can be configured to use an external database to load the EPSG codes, so that custom CRS can be maintained separately.
Please help.
I had to implement an extension similar to gt-epsg-postgresql-11.0.jar and replace it in WEB-INF/lib. The extension has only one class, which creates the data source; in this case it would be a MySQL data source instead of a PostgreSQL one.
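For illustration only, a rough sketch of what that single class might look like, assuming the GeoTools ThreadedEpsgFactory base class (the one the PostgreSQL module builds on) exposes a createDataSource() hook; class names, signatures, and the SPI registration under META-INF/services must be checked against the GeoTools version bundled with your GeoServer:

import javax.sql.DataSource;
import com.mysql.cj.jdbc.MysqlDataSource;
import org.geotools.referencing.factory.epsg.ThreadedEpsgFactory;

// Hypothetical factory that points the EPSG authority at an external MySQL database
public class ThreadedMySQLEpsgFactory extends ThreadedEpsgFactory {
    @Override
    protected DataSource createDataSource() {
        MysqlDataSource ds = new MysqlDataSource();
        ds.setURL("jdbc:mysql://db-host:3306/epsg"); // placeholder connection URL
        ds.setUser("geoserver");
        ds.setPassword("secret");
        return ds;
    }
}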

PHP / MySQL: Cannot submit a 2.5 MB file to the DB

I'm trying to upload a file into my MySQL DB. The BLOB field is declared as LONGBLOB (up to 4 GB). If I upload a 200 KB file, it gets saved correctly, but if I upload 2 MB there is no error (MAX_FILE_SIZE is more than 20 MB), yet the INSERT statement does not create any record.
I cannot execute the statement manually because the binary content of the file is too big.
Is there any limit on file uploads in the HTTP server (or PHP's $_FILES variable)?
Thanks for any help.
Check the setting of the max_allowed_packet MySQL server variable. If it's too small and your host won't increase it for you, you will need to split your file into smaller parts and upload them part by part, appending each new chunk to the already uploaded ones.
See also: http://dev.mysql.com/doc/refman/5.5/en/packet-too-large.html
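A quick way to check and, with sufficient privileges, raise the limit (64 MB is just an example value):

SHOW VARIABLES LIKE 'max_allowed_packet';
SET GLOBAL max_allowed_packet = 67108864; -- 64 MB; needs admin rights and applies to new connections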
Yes, there is a limit on your PHP server for the maximum file size allowed in an upload (the upload_max_filesize and post_max_size directives in php.ini).
You could try using software like MySQL Workbench and edit the database directly from your computer.
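For reference, the relevant php.ini directives look like this (values are illustrative; post_max_size must be at least as large as upload_max_filesize, and the web server may impose its own limit):

upload_max_filesize = 32M
post_max_size = 32M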

How to display localized characters

I have a MySQL database, PHP code to retrieve data from the database, and an Android program that receives the output from the PHP code via HTTP POST.
My localized characters display as question marks in my program. I have tried different charsets in my database: utf8_general_ci, utf8_unicode_ci, latin1_general_ci - still question marks. In HTML I could use an entity such as &oslash;, but not in an Android program - and I shouldn't have to.
First of all, where is this problem coming from? The database itself has no problems displaying localized characters with the utf8_ collations. Android also has no problems. Is it the HTTP POST request or PHP that has problems with this?
I would check layer by layer:
Even if the DB encoding is UTF-8, are you sure the value is properly stored?
Do you have a way to test the API (i.e. by using a web interface)? If needed, do packet inspection and check for the proper UTF-8 bytes.
Is your PHP API sending the correct encoding?
When reading the HTTP response in Android (i.e. parsing the stream), are you supplying an encoding, and is it the correct one?
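For the last point, a minimal Java sketch of reading the response with an explicit charset; the URL is a placeholder. On the PHP side, the usual companion fixes are setting the connection charset before querying (e.g. mysqli_set_charset($link, 'utf8mb4')) and sending a charset=utf-8 Content-Type header:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class Utf8Fetch {
    public static void main(String[] args) throws Exception {
        HttpURLConnection conn =
                (HttpURLConnection) new URL("http://example.com/api.php").openConnection();
        // Decode the body explicitly as UTF-8 instead of the platform default charset
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}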