MySQL mixing Charset & Collations

I have read various articles and topics on this forum to help me set up the charset & collation for my database, but I'm not sure about the choices I made. I would appreciate any comments or advice.
I'm using MySQL 5.5.
The database (used with PHP) will hold data in several languages (Chinese, French, Dutch, US English, Spanish, Arabic, etc.).
I will mainly insert data and look rows up by their IDs. I won't need to do full-text searching or text comparison.
So here is what I've done to create my database: I decided to use CHARSET utf8mb4 and COLLATION utf8mb4_unicode_ci.
ALTER DATABASE testDB CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
When I create the table:
CREATE TABLE IF NOT EXISTS sector (
idSector INT(5) NOT NULL AUTO_INCREMENT,
sectoreName VARCHAR(45) NOT NULL DEFAULT '',
PRIMARY KEY (idSector)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 AUTO_INCREMENT=0;
For some tables, I thought it was better to use utf8_bin.
Ex: timezone (contains 168,047 rows):
CREATE TABLE timezone (
zone_id int(10) NOT NULL,
abbreviation varchar(6) COLLATE utf8_bin NOT NULL,
time_start decimal(11,0) NOT NULL,
gmt_offset int(11) NOT NULL,
dst char(1) COLLATE utf8_bin NOT NULL,
KEY idx_zone_id (zone_id),
KEY idx_time_start (time_start)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=0;
So basically I would like to know if I'm on the right track or if I'm doing something that could lead to problems.

Different columns can have different character sets and/or collations, but...
If you compare columns of different charset or collation (WHERE a.x = b.y), indexes cannot be used.
utf8 does not handle all of Chinese, nor does it handle some Emoji. For those, you need utf8mb4.
On other issues...
In INT(5), the (5) means nothing. Check out SMALLINT UNSIGNED with a range of 0..65535.
time_start decimal(11,0) is strange for a time. If it is a unix timestamp, either TIMESTAMP or INT UNSIGNED should work ok. See also TIME.
dst char(1) COLLATE utf8_bin -- this takes 3 bytes, because of utf8. Perhaps you want CHARACTER SET ascii so it will be only 1 byte?
InnoDB tables really should be given an explicit PRIMARY KEY. (Probably zone_id?)
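Putting those points together, the timezone table might be redefined like this. A sketch only: it assumes time_start is a Unix timestamp and that zone_id uniquely identifies a row; if the table stores one row per transition, use (zone_id, time_start) or a surrogate key instead.
CREATE TABLE timezone (
zone_id int unsigned NOT NULL,
abbreviation varchar(6) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
time_start int unsigned NOT NULL,  -- Unix timestamp (assumption)
gmt_offset int NOT NULL,
dst char(1) CHARACTER SET ascii NOT NULL,  -- 1 byte instead of 3
PRIMARY KEY (zone_id),
KEY idx_time_start (time_start)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;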

You are making a good choice for your sectoreName column. Notice one thing: utf8mb4_unicode_ci is a good collation for most languages, but for Spanish it gets the alphabet wrong: in that language N and Ñ are considered different letters, and Ñ appears immediately after N in the collating sequence, whereas in other European languages they are treated as the same letter. So your Spanish-language users, when they ask for Niña, will get back both Niña and Nina. That may look like a mistake to them. (But they're probably used to getting this sort of thing from pan-European software applications.)
You should use utf8mb4 as your character set throughout any new application. So, use that instead of utf8 in your timezone table. Using the _bin collation for your abbreviation column is fine.
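For an existing table, the conversion can be done in place. A minimal sketch, using the timezone table from the question:
ALTER TABLE timezone CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE timezone MODIFY abbreviation varchar(6) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL;
-- CONVERT TO rewrites every string column; the second statement restores the _bin collation on abbreviation.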

Related

Proper configuration to mix Collations in SQL?

I'm a little confused by collations. I'm not sure whether the DB converts a column's collation to the table collation on a SELECT, or whether a collation is just a ruleset applied when comparing.
So what should I put as CHARSET and COLLATE? (10.4.11-MariaDB)
Here are some examples of what I have:
Case #1: I only SELECT the utf8_bin column and never compare it, but on the ascii column I do WHERE bot = ?:
CREATE TABLE `bots_trace` (
`id` int(10) UNSIGNED NOT NULL,
`bot` varchar(20) CHARACTER SET ascii COLLATE ascii_bin NOT NULL,
`info` varchar(2000) COLLATE utf8_bin DEFAULT NULL,
`seen` enum('yes','no') CHARACTER SET ascii COLLATE ascii_bin NOT NULL DEFAULT 'no'
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
I almost never ask the DB to do a utf8mb4_bin comparison or similar, just SELECT.
So which collations should I use in those cases? What should I use as DEFAULT and as COLLATE?
Case #2: The only time I ask the DB to do something with a utf8mb4 column is to check the mail:
CREATE TABLE `changed_email` (
`id` int(10) UNSIGNED NOT NULL,
`old_mail` varchar(256) COLLATE utf8mb4_bin NOT NULL,
`ctime` int(10) UNSIGNED NOT NULL,
`ip` varchar(94) CHARACTER SET ascii COLLATE ascii_bin NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;
SELECT id FROM changed_email WHERE old_mail = ? LIMIT 1
What should I do in this case? Since the only comparison I do is a utf8mb4_bin one, I'm assuming that is the correct CHARSET & COLLATE.
Also, I use PHP and I set mysqli_set_charset($link, 'utf8mb4'), which I needed in order to retrieve the data correctly. If I change some table's COLLATION to ascii, could I have trouble retrieving utf8mb4 data columns?
ascii encoding is a subset of utf8 which is a subset of utf8mb4. But that is probably irrelevant.
mysqli_set_charset() announces the CHARACTER SET of the data in the client.
MySQL, during INSERT, will convert the bytes from the encoding indicated by mysqli_set_charset to the encoding specified for the column in the table. Similarly, SELECT will convert in the other direction.
If you are only dealing with ascii characters, there is effectively no conversion, and no possibility of problems. If, on the other hand, you have accented letters or Emoji, and the announced encoding disagrees with the bytes actually sent, there will be problems.
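To watch the announcement and the conversion happen, a quick sketch from the SQL side (SET NAMES is the SQL analogue of mysqli_set_charset; this assumes the client really sends UTF-8 bytes):
SET NAMES utf8mb4;                      -- what mysqli_set_charset($link, 'utf8mb4') announces
SELECT HEX('é');                        -- C3A9: the bytes as sent by the client
SELECT HEX(CONVERT('é' USING latin1));  -- E9: what a latin1 column would actually store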
The above talks about CHARACTER SET, which is the "encoding" of letters. COLLATION is a different matter; this term refers to ordering, including case folding and accent stripping. For example, should 'a' = 'A' or not? For COLLATION ascii_general_ci or utf8mb4_..._ci, those are "equal". For any ..._bin collation they are "not equal", and one of them will consistently sort (think ORDER BY) before the other.
In some, but not all, situations, MySQL will allow mixing character sets or collations and "do the right thing". For example, when storing a character of one CHARACTER SET into a column of another, either it can be converted or it will be mangled: A is available in perhaps all character sets, but an accented A, for example, is not available in ascii.
In the case of COLLATION, when there is a conflict of collations, there may be a rule that says which collation to use, but often MySQL gives up and complains about an "illegal mix of collations".
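A sketch of both behaviors, assuming a utf8mb4 connection:
SELECT 'a' = 'A' COLLATE utf8mb4_general_ci;  -- 1: the _ci collation folds case
SELECT 'a' = 'A' COLLATE utf8mb4_bin;         -- 0: the _bin collation does not
SELECT 'a' COLLATE utf8mb4_general_ci = 'A' COLLATE utf8mb4_unicode_ci;
-- ERROR 1267: Illegal mix of collations: two different explicit collations cannot be reconciled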
Keep in mind that all of this comes from multiple places:
The column definition
The connection parameters (between client and MySQL server)
The bytes in the client.
A common example: latin1 accented letters cannot be interpreted as utf8 bytes, but they can be converted to utf8. This rears its ugly head when the connection specification disagrees with the bytes in the client.
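The difference between converting and reinterpreting can be seen directly; a sketch using latin1 é, which is the single byte E9:
SELECT CONVERT(_latin1 X'E9' USING utf8mb4);       -- é: properly converted
SELECT HEX(CONVERT(_latin1 X'E9' USING utf8mb4));  -- C3A9: the new two-byte encoding
-- Reinterpreting bytes under the wrong label, instead of converting them, is what produces Mojibake.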

Laravel database migration from old database UTF-8 encoding issue

Disclaimer: YES, I've done my search on Stack Overflow, and NO, I couldn't find an answer for this case.
I'm migrating data from a forum which has some legacy issues in its MySQL database. One of them is the storage of Emojis.
Donor database:
-- Server: 5.5.41-MariaDB
CREATE TABLE `forumtopicresponse` (
`id` int(10) UNSIGNED NOT NULL,
`topicid` int(10) UNSIGNED NOT NULL DEFAULT '0',
`userid` int(10) UNSIGNED NOT NULL DEFAULT '0',
`message` text NOT NULL,
`created` int(10) UNSIGNED NOT NULL DEFAULT '0'
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
In the message column I've got a message like this: Success!ðŸ‘ðŸ‘, which displays as "Success!👍👍"
Laravel target database:
-- Server: MySQL 5.7.x
CREATE TABLE `answers` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`topic_id` int(10) unsigned NOT NULL,
`user_id` int(10) unsigned NOT NULL,
`body` text CHARACTER SET utf8mb4,
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
...keys & indexes
) ENGINE=InnoDB AUTO_INCREMENT=1254419 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
In HTML the document has a <meta charset="utf-8"> and to display the field, I'm using
{!! nl2br(e($answer->body)) !!}
And with this it just displays as Success!ðŸ‘👠and not the Emojis.
Question
How can I migrate this data CLEAN and UTF-8 valid into my fresh database? I think I need some UTF encoding step, but I can't figure out which.
UPDATE! THE SOLUTION
Got it fixed. The only solution was to alter the table in the Donor database.
ALTER TABLE forumtopicresponse CHANGE message message LONGTEXT CHARACTER SET latin1;
ALTER TABLE forumtopicresponse CHANGE message message LONGBLOB;
Do NOT change the LONGBLOB back to LONGTEXT after that: I lost data that way.
When I migrated the LONGBLOB data to the Laravel target database, everything got migrated correctly: all special characters and Emojis are fixed and in UTF-8.
The Emoji 👍 is hex F09F918D. That is, it is a 4-byte string.
MySQL's CHARACTER SET = utf8 does not handle 4-byte UTF-8 strings, only 3-byte ones, thereby excluding many of the Emoji and some of Chinese.
When interpreted as latin1, those hex digits are ðŸ‘ (plus a 4th, but unprintable, character). Showing gibberish like that is called "Mojibake".
So, you have 2 problems:
Need to change the storage to utf8mb4 so you can store the Emoji.
Need to announce to MySQL that your client is speaking UTF-8, not latin1.
See "Best Practice" in Trouble with UTF-8 characters; what I see is not what I stored
And also see UTF-8 all the way through
Here's my list of fixes, but you must first correctly identify which case you have. Applying the wrong fix makes things worse.
There may be a 3rd mistake -- in moving the data from 5.5 to 5.7. Please provide those details.
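As a concrete sketch of those two fixes on the target side (table name taken from the question; verify against your actual schema first):
-- 1. Make sure every string column can store 4-byte characters:
ALTER TABLE answers CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
-- 2. Announce that the client sends UTF-8. In SQL:
SET NAMES utf8mb4;
-- In Laravel, set 'charset' => 'utf8mb4' in config/database.php instead.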

Create database, what's the right charset for my purpose

I know just a little about MySQL. I need to create a database to store game scores, collected from all over the world. (The game will be in every available store, including Chinese ones, etc.)
I'm worried about the charset. The DB schema will be similar to (pseudocode):
leaderboard("PhoneId" int primary key, name varchar(50), score smallint);
What will happen if a Chinese player submits a score under a name in Chinese characters? Should I specify something in the DB creation script?
create database if not exists `test_db`;
create table if not exists `leaderboard` (
`phoneid` integer unsigned NOT NULL,
`name` varchar(20) NOT NULL, -- error handling for this
`score` smallint unsigned NOT NULL default 0,
`timestamp` timestamp NOT NULL default CURRENT_TIMESTAMP,
PRIMARY KEY (`phoneid`)
);
UTF8 is your obvious choice.
For details on UTF8 and MySQL integration, you can go through the Tutorial pages:
http://dev.mysql.com/doc/refman/5.0/en/charset-unicode-utf8.html
http://dev.mysql.com/doc/refman/5.0/en/charset-unicode.html
There are certain things that need to be kept in mind while using the UTF8 charset in any database. For example, to save space with UTF-8, use VARCHAR instead of CHAR. Otherwise, MySQL must reserve three bytes for each character in a CHAR CHARACTER SET utf8 column, because that is the maximum possible length.
Similarly, you should analyze other performance constraints and design your database accordingly.
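To illustrate the CHAR-vs-VARCHAR point, a sketch with made-up names:
CREATE TABLE charset_demo (
fixed_name CHAR(50) CHARACTER SET utf8,   -- always reserves 50 x 3 = 150 bytes
var_name VARCHAR(50) CHARACTER SET utf8   -- stores a length prefix plus the actual bytes
) ENGINE=InnoDB;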

Illegal mix of collations (utf8_unicode_ci,IMPLICIT) and (utf8_general_ci,IMPLICIT) for operation '='

Error message on MySql:
Illegal mix of collations (utf8_unicode_ci,IMPLICIT) and (utf8_general_ci,IMPLICIT) for operation '='
I have gone through several other posts and was not able to solve this problem.
The part affected is something similar to this:
CREATE TABLE users (
userID INT UNSIGNED NOT NULL AUTO_INCREMENT,
firstName VARCHAR(24) NOT NULL,
lastName VARCHAR(24) NOT NULL,
username VARCHAR(24) NOT NULL,
password VARCHAR(40) NOT NULL,
PRIMARY KEY (userid)
) ENGINE = INNODB CHARACTER SET utf8 COLLATE utf8_unicode_ci;
CREATE TABLE products (
productID INT UNSIGNED NOT NULL AUTO_INCREMENT,
title VARCHAR(104) NOT NULL,
picturePath VARCHAR(104) NULL,
pictureThumb VARCHAR(104) NULL,
creationDate DATE NOT NULL,
closeDate DATE NULL,
deleteDate DATE NULL,
varPath VARCHAR(104) NULL,
isPublic TINYINT(1) UNSIGNED NOT NULL DEFAULT '1',
PRIMARY KEY (productID)
) ENGINE = INNODB CHARACTER SET utf8 COLLATE utf8_unicode_ci;
CREATE TABLE productUsers (
productID INT UNSIGNED NOT NULL,
userID INT UNSIGNED NOT NULL,
permission VARCHAR(16) NOT NULL,
PRIMARY KEY (productID,userID),
FOREIGN KEY (productID) REFERENCES products (productID) ON DELETE RESTRICT ON UPDATE NO ACTION,
FOREIGN KEY (userID) REFERENCES users (userID) ON DELETE RESTRICT ON UPDATE NO ACTION
) ENGINE = INNODB CHARACTER SET utf8 COLLATE utf8_unicode_ci;
The stored procedure I'm using is this:
CREATE PROCEDURE updateProductUsers (IN rUsername VARCHAR(24),IN rProductID INT UNSIGNED,IN rPerm VARCHAR(16))
BEGIN
UPDATE productUsers
INNER JOIN users
ON productUsers.userID = users.userID
SET productUsers.permission = rPerm
WHERE users.username = rUsername
AND productUsers.productID = rProductID;
END
I was testing with PHP, but the same error is given with SQLyog.
I have also tested recreating the entire DB, but to no avail.
Any help will be much appreciated.
The default collation for stored procedure parameters is utf8_general_ci and you can't mix collations, so you have four options:
Option 1: add COLLATE to your input variable:
SET @rUsername = 'aname' COLLATE utf8_unicode_ci; -- COLLATE added
CALL updateProductUsers(@rUsername, @rProductID, @rPerm);
Option 2: add COLLATE to the WHERE clause:
CREATE PROCEDURE updateProductUsers(
IN rUsername VARCHAR(24),
IN rProductID INT UNSIGNED,
IN rPerm VARCHAR(16))
BEGIN
UPDATE productUsers
INNER JOIN users
ON productUsers.userID = users.userID
SET productUsers.permission = rPerm
WHERE users.username = rUsername COLLATE utf8_unicode_ci -- COLLATE added
AND productUsers.productID = rProductID;
END
Option 3: add it to the IN parameter definition (pre-MySQL 5.7):
CREATE PROCEDURE updateProductUsers(
IN rUsername VARCHAR(24) COLLATE utf8_unicode_ci, -- COLLATE added
IN rProductID INT UNSIGNED,
IN rPerm VARCHAR(16))
BEGIN
UPDATE productUsers
INNER JOIN users
ON productUsers.userID = users.userID
SET productUsers.permission = rPerm
WHERE users.username = rUsername
AND productUsers.productID = rProductID;
END
Option 4: alter the field itself:
ALTER TABLE users CHARACTER SET utf8 COLLATE utf8_general_ci;
(Note: this changes only the table's default for new columns; to convert the existing columns as well, use ALTER TABLE users CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;)
Unless you need to sort data in Unicode order, I would suggest altering all your tables to use utf8_general_ci collation, as it requires no code changes, and will speed sorts up slightly.
UPDATE: utf8mb4/utf8mb4_unicode_ci is now the preferred character set/collation method. utf8_general_ci is advised against, as the performance improvement is negligible. See https://stackoverflow.com/a/766996/1432614
I spent half a day searching for answers to an identical "Illegal mix of collations" error with conflicts between utf8_unicode_ci and utf8_general_ci.
I found that some columns in my database were not specifically collated utf8_unicode_ci. It seems MySQL implicitly gave these columns the charset's default, utf8_general_ci.
Specifically, running a SHOW CREATE TABLE table1 query output something like the following:
| table1 | CREATE TABLE `table1` (
`id` int(11) NOT NULL,
`col1` varchar(4) CHARACTER SET utf8 NOT NULL,
`col2` int(11) NOT NULL,
PRIMARY KEY (`col1`,`col2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci |
Note the line 'col1' varchar(4) CHARACTER SET utf8 NOT NULL does not have a collation specified. I then ran the following query:
ALTER TABLE table1 CHANGE col1 col1 VARCHAR(4) CHARACTER SET utf8
COLLATE utf8_unicode_ci NOT NULL;
This solved my "Illegal mix of collations" error. Hope this might help someone else out there.
I had a similar problem, but it occurred inside a procedure, when my query parameter was set using a variable, e.g. SET @value = 'foo'.
What was causing this was a mismatch between collation_connection and the database collation. I changed collation_connection to match collation_database and the problem went away. I think this is a more elegant approach than adding COLLATE after each parameter/value.
To sum up: all collations must match. Use SHOW VARIABLES and make sure collation_connection and collation_database match (also check the table collation using SHOW TABLE STATUS LIKE 'table_name').
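A sketch of that check-and-align sequence:
SHOW VARIABLES LIKE 'collation%';            -- compare collation_connection and collation_database
SET collation_connection = utf8_unicode_ci;  -- align the connection with the database
SHOW TABLE STATUS LIKE 'table_name';         -- the Collation column shows each table's default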
A bit similar to @bpile's answer, my case was a my.cnf entry setting collation-server = utf8_general_ci. After I realized that (and after trying everything above), I forcefully switched my database to utf8_general_ci instead of utf8_unicode_ci, and that was it:
ALTER DATABASE `db` CHARACTER SET utf8 COLLATE utf8_general_ci;
This answer adds to @Sebas' answer above: setting the collation of my local environment. Do not try this on production.
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
Source of this solution
In my own case I had the following error:
Illegal mix of collations (utf8_general_ci,IMPLICIT) and (utf8_unicode_ci,IMPLICIT) for operation '='
$this->db->select("users.username as matric_no, CONCAT(users.surname,
' ',
users.first_name, ' ', users.last_name) as fullname")
->join('users', 'users.username=classroom_students.matric_no', 'left')
->where('classroom_students.session_id', $session)
->where('classroom_students.level_id', $level)
->where('classroom_students.dept_id', $dept)
->get('classroom_students'); // executes the query; table name assumed from the join above
After weeks of Google searching I noticed that the two fields I am comparing have different collations. The first one, i.e. username, is utf8_general_ci, while the second one is utf8_unicode_ci, so I went back to the structure of the second table and changed the second field (matric_no) to utf8_general_ci, and it worked like a charm.
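The fix described amounts to something like this (a sketch; the VARCHAR(24) type is an assumption, check your actual column definition):
ALTER TABLE classroom_students MODIFY matric_no VARCHAR(24) CHARACTER SET utf8 COLLATE utf8_general_ci;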
Despite finding an enormous number of questions about the same problem (1, 2, 3, 4), I have never found an answer that took performance into consideration, even here.
Although multiple working solutions have already been given, I would like to add a performance consideration.
EDIT: Thanks to Manatax for pointing out that option 1 does not suffer from performance issues.
Using Options 1 and 2, aka the COLLATE cast approach, can lead to a potential bottleneck, because any index defined on the column will not be used, causing a full scan.
Even though I did not try out Option 3, my hunch is that it suffers the same consequences as Options 1 and 2.
Lastly, Option 4 is the best option for very large tables when it is viable, i.e. when nothing else relies on the original collation.
Consider this simplified query:
SELECT
*
FROM
schema1.table1 AS T1
LEFT JOIN
schema2.table2 AS T2 ON T2.CUI = T1.CUI
WHERE
T1.cui IN ('C0271662' , 'C2919021')
;
In my original example, I had many more joins.
Of course, table1 and table2 have different collations.
Using the COLLATE operator to cast leads to indexes not being used, as the query plan shows.
[Image: visual query explanation when using the COLLATE cast]
On the other hand, Option 4 can take advantage of a possible index and leads to fast queries.
In the picture below, you can see the same query being run after applying Option 4, i.e. altering the schema/table/column collation.
[Image: visual query explanation after the collation change, without the COLLATE cast]
In conclusion, if performance is important and you can alter the collation of the table, go for Option 4.
If you have to act on a single column, you can use something like this:
ALTER TABLE schema1.table1 MODIFY `field` VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
This happens where a column is explicitly set to a different collation or the default collation is different in the table queried.
If you have many tables whose collation you need to change, run this query:
SELECT CONCAT('ALTER TABLE ', t.table_name, ' CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;')
FROM (SELECT table_name FROM information_schema.tables WHERE table_schema = 'SCHRMA') t;
This outputs the ALTER statements needed to convert all the tables, and each of their columns, to the correct collation.
I was also facing a problem while uploading an Excel file into a MySQL database from Laravel. In the Excel file, some addresses contained characters like PeterÕs. The error was:
SQLSTATE[HY000]: General error: 1267 Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8mb4_unicode_ci,COERCIBLE) for operation '=' (SQL: select count(*) as aggregate from `.....` where `e....` = kas#email.com and `first_name` = Gill and `surname` = Harries and `address` = 6 St.PeterÕs Close,,,Woodbridge,Suffolk,IP12 4EJ and `status` = 1 and `client`.`deleted_at` is null)
I tried many of the solutions mentioned above, but unfortunately none of them worked for me, so I'm posting the solution that worked for me. It may not work in every situation; the exception can differ, so you have to find a solution based on your own case.
I tried two types of solutions.
I put the offending code into a try {} catch () block, because the error occurred while executing the select() query. With that, the uploaded records get stored in the database, but wherever a charset issue occurs, a ? is inserted in place of the character. (A sample image was included in the original post.)
Alternatively, you can detect the charset that causes the issue and change the MySQL connection configuration with this code:
\Config::set('database.connections.mysql.charset', 'latin1');
\Config::set('database.connections.mysql.collation', 'latin1_bin');
\DB::purge('mysql');
Both solutions worked for me.

Handling a huge MYSQL Table

Hope you are all doing great. We have a huge MySQL table called 'posts'. It has about 70,000 records and has grown to about 10 GB in size.
My boss says that something has to be done to make this table easier for us to handle, because if it ever got corrupted, it would take us a lot of time to recover it. It is also slow at times.
What are possible solutions so that handling this table becomes easier for us in all respects?
The structure of the table is as follows:
CREATE TABLE IF NOT EXISTS `posts` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`thread_id` int(11) unsigned NOT NULL,
`content` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`first_post` mediumtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`publish` tinyint(1) NOT NULL,
`deleted` tinyint(1) NOT NULL,
`movedToWordPress` tinyint(1) NOT NULL,
`image_src` varchar(500) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
`video_src` varchar(500) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT NULL,
`video_image_src` varchar(500) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT NULL,
`thread_title` text CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`section_title` text CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`urlToPost` varchar(280) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT NULL,
`posts` int(11) DEFAULT NULL,
`views` int(11) DEFAULT NULL,
`forum_name` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT NULL,
`subject` varchar(150) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT NULL,
`visited` int(11) DEFAULT '0',
`replicated` tinyint(4) DEFAULT '0',
`createdOn` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
UNIQUE KEY `urlToPost` (`urlToPost`,`forum_name`),
KEY `thread_id` (`thread_id`),
KEY `publish` (`publish`),
KEY `createdOn` (`createdOn`),
KEY `movedToWordPress` (`movedToWordPress`),
KEY `deleted` (`deleted`),
KEY `forum_name` (`forum_name`),
KEY `subject` (`subject`),
FULLTEXT KEY `first_post` (`first_post`,`thread_title`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=78773 ;
Thanking You.
UPDATED
Note: although I am grateful for the replies, almost all the answers have been about optimizing the current database and not about how to handle large tables in general. Although I can optimize the database based on the replies I got, that really does not answer the question about handling huge databases. Right now I am talking about 70,000 records, but during the next few months, if not weeks, we are going to grow by an order of magnitude. Each record can be about 300 KB in size.
My answer is also an addition to two previous comments.
You've indexed half of your table. But if you take a look at some indexes (publish, deleted, movedToWordPress) you'll notice they are 1 or 0, so their selectivity is low (the number of distinct values divided by the number of rows). Those indexes are a waste of space.
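You can estimate a column's selectivity directly; a quick sketch (a value near 0 means an index on that column rarely narrows the search):
SELECT COUNT(DISTINCT publish) / COUNT(*) AS selectivity FROM posts;
-- for a 1/0 flag over ~70,000 rows this is about 0.00003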
Some things also make no sense.
tinyint(4) - that doesn't actually make it a 4-digit integer. The number there is only the display width. tinyint is 1 byte, so it has 256 possible values. I'm assuming something went wrong there.
Also, 10 gigs in size for just 75k records? How did you measure the size? And what hardware have you got?
Edit in regards to your updated question:
There are many ways to scale databases. I'll link one SO question/answer so you can get the idea what you can do: here it is.
The other thing you might do is get better hardware. Usually, the reason databases get slow as they grow in size is the HDD subsystem and the amount of memory left to work with the dataset. The more RAM you have, the faster it all gets.
Another thing you could do is split your table in two, so that one table holds the textual data and the other holds the data your system needs for searching or matching (you'd put the integer fields there).
Using InnoDB, you'd gain a huge performance boost if the two tables were connected via some sort of foreign key pointing to a primary key. Since InnoDB is built so that primary key lookups are fast, you open up several new possibilities for what you can do with your dataset. As your data gets increasingly huge, you can get enough RAM and InnoDB will try to buffer the dataset in RAM. There's an interesting thing called HandlerSocket that does some neat magic with servers that have enough RAM and use InnoDB. A sketch of such a split follows.
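A minimal sketch of the split, with made-up table names and a few columns taken from the original definition:
CREATE TABLE posts_meta (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`thread_id` int(11) unsigned NOT NULL,
`publish` tinyint(1) NOT NULL,
`views` int(11) DEFAULT NULL,
`createdOn` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;

CREATE TABLE posts_content (
`post_id` int(11) unsigned NOT NULL,
`content` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`post_id`),
FOREIGN KEY (`post_id`) REFERENCES posts_meta (`id`)
) ENGINE=InnoDB;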
In the end it really boils down to what you need to do and how you are doing it. Since you didn't mention that, it's hard to give an estimate here of what you should do.
My first step in optimizing would definitely be to tweak the MySQL instance and to back that big table up.
I guess you have to change some columns.
You can start by reducing your VARCHAR sizes:
image_src / video_src / video_image_src VARCHAR(500) is a little too much, I think (VARCHAR(100) is enough, I would say).
thread_title is TEXT but should be a VARCHAR(200?) if you ask me.
Same with section_title.
OK, here is your problem:
content longtext
Do you really need LONGTEXT here? LONGTEXT can hold up to 4 GB. If you change this column to TEXT, it would be a lot smaller:
TINYTEXT 256 bytes
TEXT 65,535 bytes ~64kb
MEDIUMTEXT 16,777,215 bytes ~16MB
LONGTEXT 4,294,967,295 bytes ~4GB
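If ~64 KB per post is enough, the change is a single statement; a sketch (take a backup first, since longer values would be truncated or rejected):
ALTER TABLE posts MODIFY `content` text CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL;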
Edit: I see you use a FULLTEXT index. I am quite sure that is storing a lot, a lot of data. You should use another mechanism for full-text searching.
In addition to what Michael commented, slowness can be an issue depending on how well the queries are optimized and whether they have matching indexes. I would try to find some of the culprit queries that are taking longer than you'd like and post them here on S/O to see if someone can help optimize them.