MySQL database import causing issues with special characters (ě ř č ů) - mysql

Hi, I recently changed the hosting provider for my website. When doing this I exported the MySQL database I had in my previous cPanel phpMyAdmin. It had CHARACTER SET latin1 and COLLATE latin1_swedish_ci. After importing it into my new phpMyAdmin I saw there was an issue with displaying the characters written in Czech (ě ř č ů), which appeared as question marks or weird symbols. I also wasn't able to insert these letters at first, but after changing the table CHARSET to utf8 I'm able to insert them. But how do I export the data from my old database and import it into the new one without messing up the data? Here's what the database looks like:
SET SQL_MODE = "NO_AUTO_VALUE_ON_ZERO";
SET AUTOCOMMIT = 0;
START TRANSACTION;
SET time_zone = "+00:00";
/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8mb4 */;
--
-- Database: `sambajiu_samba`
--
-- --------------------------------------------------------
CREATE TABLE `bookings` (
`id` int(11) NOT NULL,
`fname` varchar(100) NOT NULL,
`surname` varchar(100) DEFAULT NULL,
`email` varchar(255) NOT NULL,
`telephone` varchar(100) NOT NULL,
`age_group` varchar(100) DEFAULT NULL,
`hear` varchar(100) DEFAULT NULL,
`experience` text,
`subscriber` tinyint(1) DEFAULT NULL,
`booking_date` varchar(255) DEFAULT NULL,
`lesson_time` varchar(255) NOT NULL,
`booked_on` datetime DEFAULT CURRENT_TIMESTAMP
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
ALTER TABLE `bookings` ADD PRIMARY KEY (`id`);
ALTER TABLE `bookings` MODIFY `id` int(11) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=345;

Czech is not handled by latin1. It would be better to use utf8mb4 (which can handle virtually everything in the world). Outside of MySQL, it is called "UTF-8".
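For example, a tentative conversion of the bookings table shown above (this changes the declared character set for new data; it does not repair rows that were already garbled, and the collation here is just one reasonable choice):
ALTER TABLE `bookings` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;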
How did you do the "export" and "import"? What is in the file? Can you get the hex of a small portion of the exported file -- we need to check what encoding was used for the Czech characters.
As for "as question mark or weird symbols", see question marks and Mojibake in Trouble with UTF-8 characters; what I see is not what I stored .
Your hex probably intended to say
Rezervovat trénink zda
In the middle of the hex is
C383 C2A9
which is the UTF-8 encoding of Ã©, that is, an é that has been encoded twice. When you display the data, you might see Ã©, or you might see the desired é. In the latter case, the browser is probably "helping" you by decoding the data twice. For further discussion of this, see "double encoding" in the link above.
"Fixing the data" is quite messy:
CONVERT(BINARY(CONVERT(CONVERT(
UNHEX('52657A6572766F766174207472C383C2A96E696E6B207A6461')
USING utf8mb4) USING latin1)) USING utf8mb4)
==> 'Rezervovat trénink zda'
But I don't think we are finished. That acute-e is a valid character in latin1. You mentioned 4 Czech accented letters that, I think, are not in latin1. Latin5 and dec8 may be relevant.
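If it does turn out to be the standard double-encoding case, here is a tentative sketch for repairing a column in place (the table and column are the ones from the dump above, assumed to now be utf8/utf8mb4; preview with a SELECT and take a backup before running any UPDATE):
-- Preview the repair; this SELECT changes nothing:
SELECT fname,
       CONVERT(BINARY(CONVERT(fname USING latin1)) USING utf8mb4) AS fixed
FROM bookings LIMIT 20;
-- If the preview looks right, apply it:
UPDATE bookings
SET fname = CONVERT(BINARY(CONVERT(fname USING latin1)) USING utf8mb4);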

Related

mysqldump - maintain character set and collations

I'm looking for the safest way to preserve my database data in a .sql backup.
This:
mysqldump -u root -p DBName > backupName.sql
also outputs these lines for my database:
DROP TABLE IF EXISTS `tableName`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!50503 SET character_set_client = utf8mb4 */;
CREATE TABLE `tableName` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`col1` int(11) unsigned NOT NULL,
`col2` int(11) unsigned NOT NULL,
...
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;
/*!40101 SET character_set_client = @saved_cs_client */;
How does this line affect the encoding?
/*!50503 SET character_set_client = utf8mb4 */;
I remember that the data was saved with some utf8 encoding, but not with utf8mb4; maybe utf8mb4 can correctly handle all its subsets, like utf8 and utf8_general_ci and utf8_unicode_ci?
(I'm using Ubuntu with MySQL 8)
Thanks
Yes, utf8mb4 is a superset of utf8.
utf8 supports only the Basic Multilingual Plane of Unicode, i.e. the code points that encode to 1, 2, or 3 bytes in UTF-8.
utf8mb4 supports everything utf8 does and, in addition, the supplementary planes of Unicode, i.e. the 4-byte code points.
As of MySQL 8.0.28, utf8 is now known as utf8mb3. It has been documented that a future release of MySQL will repurpose the utf8 alias to the utf8mb4 character set.
The character_set_client only describes the character set used by the client to encode character data it sends. This doesn't have to be the same as the character set used by each table, if there's a valid conversion path from the client character set into whatever is used by the respective table.
In other words, if you set the client character set to utf8mb4 and the table uses utf8 (a subset), it's fine as long as the client doesn't send 4-byte characters from the supplementary planes (which include, for example, emoji).
utf8_general_ci and utf8_unicode_ci are not character sets, they are collations. This doesn't affect storage of strings at all, but it affects the sort order used as indexes are built, and it also affects character equivalence for unique constraints.
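If the goal is a backup that round-trips character data faithfully, it can also help to be explicit about the character set the dump is written in rather than relying on the client default; a sketch using the standard mysqldump option (database name from the question):
mysqldump --default-character-set=utf8mb4 -u root -p DBName > backupName.sql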

Importing Wordpress database into Laravel Valet, rows not being imported

I'm copying a Wordpress site from a server to a local Valet environment.
When I export the database, I can see in the wp_options table that the rows such as the site_url etc are present.
Yet when I import the database into Valet via either wp-cli or phpMyAdmin, the wp_options table is missing all the usual rows you'd expect to find in a Wordpress site.
Instead there's just a handful of data for transients etc.
A similar thing happens with other tables as well.
I'm running:
macOS 10.4.3
phpMyAdmin 4.9.0.1
wp-cli 2.2.0
Laravel Valet 2.3.3
I've been exporting/importing just the wp_options table to speed up the time needed to troubleshoot things.
I've tried altering export/import settings such as compression and maximising output compatibility but to no avail.
My database knowledge is rather limited after that.
If I repeat the same process using a database from a different project, it works fine.
I'll paste sample wp_options data below.
Troublesome wp_options (slightly edited to hide client details):
DROP TABLE IF EXISTS `example_options`;
CREATE TABLE `example_options` (
`option_id` bigint(20) UNSIGNED NOT NULL,
`option_name` varchar(191) COLLATE utf8_unicode_ci DEFAULT NULL,
`option_value` longtext COLLATE utf8_unicode_ci NOT NULL,
`autoload` varchar(20) COLLATE utf8_unicode_ci NOT NULL DEFAULT 'yes'
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `example_options` (`option_id`, `option_name`, `option_value`, `autoload`) VALUES
(1, 'siteurl', 'https://www.example.com', 'yes'),
(2, 'blogname', 'Example Site', 'yes'),
(3, 'blogdescription', '', 'yes'),
...
Compare this to a dump from a different project, which works fine.
Note the WordPress version here is different, which changes the order of the rows that are added.
DROP TABLE IF EXISTS `wp_options`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `wp_options` (
`option_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`option_name` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
`option_value` longtext COLLATE utf8mb4_unicode_ci NOT NULL,
`autoload` varchar(20) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'yes',
PRIMARY KEY (`option_id`),
UNIQUE KEY `option_name` (`option_name`)
) ENGINE=InnoDB AUTO_INCREMENT=2376 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
/*!40101 SET character_set_client = @saved_cs_client */;
LOCK TABLES `wp_options` WRITE;
/*!40000 ALTER TABLE `wp_options` DISABLE KEYS */;
INSERT INTO `wp_options` VALUES
(1,'siteurl','https://www.example.com','yes'),(2,'home','https://www.example.com','yes'),
(3,'blogname','Example Site','yes'),
...
Once I've imported the database, browsing to the site results in an "Error establishing database connection" error.
If I try to use wp-cli I sometimes get "Error: One or more database tables are unavailable. The database may need to be repaired.", otherwise it's "Error establishing database connection".
A colleague has noticed it works if you untick the "Enclose export in a transaction" option when initially exporting the database.
This will tide me over for now, as it only affects local projects.
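Another avenue that may be worth trying (a sketch only, since the root cause isn't confirmed): export and import with wp-cli instead of phpMyAdmin, which sidesteps the transaction-wrapping export option entirely. The table name below assumes the standard wp_ prefix, as in the second dump:
# On the source server: export just the options table
wp db export wp_options.sql --tables=wp_options
# In the local Valet site directory: import it
wp db import wp_options.sql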

MYSQL - Storing unicode characters (emoji) in TEXT column

When trying to insert a unicode emoji character (😎) into a MySQL table, the insert fails with the error:
Incorrect string value: '\\xF0\\x9F\\x98\\x8E\\xF0\\x9F...' for column 'Title' at row 1
From what I've read about this issue, it's apparently caused by the table's default character set, and possibly the column's character set, being set incorrectly. This post suggests using utf8mb4, which I've tried, but the insert still fails.
Here's my table configuration:
CREATE TABLE `TestTable` (
`Id` int(11) NOT NULL AUTO_INCREMENT,
`InsertDate` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
`Title` text,
`Description` text,
`Info` varchar(250) CHARACTER SET utf8 DEFAULT NULL,
PRIMARY KEY (`Id`),
KEY `xId_TestTablePK` (`Id`)
) ENGINE=InnoDB AUTO_INCREMENT=2191 DEFAULT CHARSET=utf8mb4;
Note that the Title and Description columns don't have an explicitly stated character set. Initially the table had no default character set, and these two columns were set up with DEFAULT CHARSET=utf8mb4. However, when I altered the table's default charset to the same value, the column-level charsets were removed (presumably because the columns now inherit it from the table?).
Can anyone please help me understand how I can store these unicode values in my table?
It's worth noting that I'm on Windows, trying to perform this insert in MySQL Workbench. I have also tried using C# to insert into the database, specifying the character set with CHARSET=utf8mb4, but this returned the same error.
EDIT
To try and insert this data, I am executing the following;
INSERT INTO TestTable (Title) SELECT '😎😎';
Edit
Not sure if this is relevant or not, but my database is also set up with the same default character set;
CREATE DATABASE `TestDB` /*!40100 DEFAULT CHARACTER SET utf8mb4 */;
The connection needs to establish that the client is talking utf8mb4, not just utf8. This involves changing the parameters used at connection time, or executing SET NAMES utf8mb4 just after connecting.
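A minimal sketch of what that looks like in a Workbench session (the table, column, and emoji are the ones from the question; SET NAMES only affects the current session):
SET NAMES utf8mb4;  -- client, connection, and results character sets all become utf8mb4
INSERT INTO TestTable (Title) SELECT '😎😎';
SELECT Id, Title, HEX(Title) FROM TestTable ORDER BY Id DESC LIMIT 1;  -- expect F09F988E F09F988E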

SQL: Application cross-db maintain only one generic schema

I currently have an application based on MySQL, with an InnoDB schema (with constraints...).
My co-workers need to import this schema, so I export it as SQL files.
For example:
DROP TABLE IF EXISTS `admins`;
CREATE TABLE `admins` (
`id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
`username` varchar(45) NOT NULL,
`password` varchar(45) NOT NULL,
`email` varchar(45) DEFAULT NULL,
`creation_date` datetime NOT NULL,
`close_date` datetime DEFAULT NULL,
`close_reason` varchar(45) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=4 ;
Now, I would like to have a cross-database application, so:
I tried to import my previous SQL files into PostgreSQL, but it didn't work; my SQL files are MySQL-specific (for example the use of the ` character...).
I tried to export my schema with mysqldump and the compatibility mode --compatible=ansi. My goal: have one generic SQL file compatible with all major DBMSs. But it didn't work: PostgreSQL returns syntax errors.
--compatible=ansi returns:
DROP TABLE IF EXISTS "admins";
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE "admins" (
"id" smallint(5) unsigned NOT NULL AUTO_INCREMENT,
"username" varchar(45) NOT NULL,
"password" varchar(45) NOT NULL,
"email" varchar(45) DEFAULT NULL,
"creation_date" datetime NOT NULL,
"close_date" datetime DEFAULT NULL,
"close_reason" varchar(45) DEFAULT NULL,
PRIMARY KEY ("id")
);
/*!40101 SET character_set_client = @saved_cs_client */;
I even tried to export with --compatible=postgresql:
DROP TABLE IF EXISTS "admins";
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE "admins" (
"id" smallint(5) unsigned NOT NULL,
"username" varchar(45) NOT NULL,
"password" varchar(45) NOT NULL,
"email" varchar(45) DEFAULT NULL,
"creation_date" datetime NOT NULL,
"close_date" datetime DEFAULT NULL,
"close_reason" varchar(45) DEFAULT NULL,
PRIMARY KEY ("id")
);
/*!40101 SET character_set_client = @saved_cs_client */;
But that didn't work either...
I know there are tools to convert a MySQL schema to a PostgreSQL schema, but that isn't the goal...
My question: is it possible to have only one SQL file compatible with MySQL, PostgreSQL, SQLite... and not maintain a separate SQL file for each DBMS?
Thank you
My question: Is it possible to have only one SQL file compatible with MySQL, PostgreSQL, SQLite... and not maintain a separate SQL file for each DBMS?
Not easily with raw SQL, unless you wish to use a pathetic subset of the databases' supported features.
SELECTs and DML in SQL can be moderately portable, but DDL is generally a hopeless nightmare for all but the total basics. You'll want an abstraction tool that generates the SQL for you, handling database specific differences in sequences/generated keys, type naming, constraints, index creation, etc.
As just one example, let's look at auto-incrementing values / sequences, as frequently used for synthetic keys (a PostgreSQL rewrite of the admins table is sketched after the list):
MySQL: integer AUTO_INCREMENT
PostgreSQL: SERIAL (shorthand for a sequence)
MS-SQL: int IDENTITY(1,1)
Oracle (below 12c): No direct support, use a sequence.
Oracle (12c and above): NUMBER GENERATED BY DEFAULT ON NULL AS IDENTITY
.. and that's just for the very common task of a generated key. Lots of other fun differences exist. For example, MySQL has tinyint and unsigned int. PostgreSQL does not. PostgreSQL has bool and has bit(n) bitfields, range-types, PostGIS types, etc etc etc which most other DBs don't have. Even for things that're shared, quirks abound - specifying "4 byte signed integer" across all DBs isn't even trivial.
One option to help is Liquibase which I've heard good things about. Some people instead use an ORM to manage their DDL generation instead - though those tend to use, again, only the most primitive of database features.

Converting MySQL database to UTF16

I am trying to create this table in a MySQL database
CREATE TABLE IF NOT EXISTS `Scania` (
`GensetType` text CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`EngineType` text CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`Engine60Hz` int(11) NOT NULL,
`Alternator` text CHARACTER SET utf16 COLLATE utf16_unicode_ci NOT NULL,
`PriceEur` float NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
However, I receive an error message:
Error 1115 <42000> : Unknown character set: 'UTF 16'
I even tried to alter the database, but I received the same error:
ALTER DATABASE nordhavn charset='utf16'
I tried searching online for other methods to convert the database, but failed to find any possible solutions.
The utf16 character set has been available since MySQL 5.5.
I guess you're using an earlier version.
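Two quick checks you can run to confirm (a minimal sketch; both statements are standard MySQL):
SELECT VERSION();                    -- utf16 requires 5.5 or later
SHOW CHARACTER SET LIKE 'utf16%';    -- an empty result means the server doesn't know utf16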