The goal of the query is simple: find the last entry for each value of a foreign key column.
In pseudocode, it is roughly:
select vehicleid , last_journey_point , last_journey_time from journeyTable.
Here is my SQL statement:
-- loconumber is an indexed column
-- journeyserla is an auto-increment primary key, int(11)
-- the table locojourney contains 400,000 records
-- the block of code below executes in 19 secs
with LocomotiveLastRun AS(
-- this block of code runs in 0.016 sec
SELECT locojourney.loconumber , MAX(locojourney.journeyserla) as lastrunid
FROM locojourney GROUP BY loconumber)
SELECT locojourney.CurrentCombiners , locojourney.JourneySerla ,
locojourney.From_RunPoint , locojourney.NEXT_RunPoint
FROM LocomotiveLastRun FORCE INDEX(lastrunid)
JOIN locojourney FORCE INDEX(PRIMARY) ON LocomotiveLastRun.lastrunid = locojourney.journeyserla
WHERE locojourney.ishoc = 'n'
EXPLAIN shows a derived table that uses no index, with "Using where" and type ALL.
This is the table definition:
-- SHOW CREATE TABLE locojourney
CREATE TABLE `locojourney` (
`trainID` smallint(5) NOT NULL,
`LocoNumber` varchar(5) CHARACTER SET latin1 COLLATE latin1_swedish_ci NOT NULL,
`CurrentLocoBase` varchar(10) CHARACTER SET latin1 COLLATE latin1_swedish_ci DEFAULT NULL,
`CurrentDuedate` date DEFAULT NULL,
`LocoConsist` varchar(10) CHARACTER SET latin1 COLLATE latin1_swedish_ci NOT NULL,
`CurrentLocoDomain` varchar(10) CHARACTER SET latin1 COLLATE latin1_swedish_ci DEFAULT NULL,
`DomainChange` varchar(10) CHARACTER SET latin1 COLLATE latin1_swedish_ci NOT NULL,
`FEDR` enum('N','Y') CHARACTER SET latin1 COLLATE latin1_swedish_ci DEFAULT 'N',
`LADR` enum('N','Y') CHARACTER SET latin1 COLLATE latin1_swedish_ci DEFAULT 'N',
`ISBANKER` enum('N','Y') CHARACTER SET latin1 COLLATE latin1_swedish_ci DEFAULT 'N',
`TrainName` varchar(10) CHARACTER SET latin1 COLLATE latin1_swedish_ci NOT NULL,
`WithOutLoad` enum('N','Y') CHARACTER SET latin1 COLLATE latin1_swedish_ci NOT NULL DEFAULT 'N',
`runRoute` varchar(50) CHARACTER SET latin1 COLLATE latin1_swedish_ci NOT NULL,
`From_RunPoint` varchar(10) CHARACTER SET latin1 COLLATE latin1_swedish_ci NOT NULL,
`From_RunTime` datetime NOT NULL,
`NEXT_RunPoint` varchar(10) CHARACTER SET latin1 COLLATE latin1_swedish_ci NOT NULL,
`NEXT_RunTime` datetime NOT NULL,
`Affects_Outage` enum('N','Y') CHARACTER SET latin1 COLLATE latin1_swedish_ci DEFAULT 'N',
`Affects_Mileage` enum('N','Y') CHARACTER SET latin1 COLLATE latin1_swedish_ci DEFAULT 'N',
`GroundDistance` double(5,2) DEFAULT '0.00',
`SHGallowance` int(11) DEFAULT '0',
`Outage` double(5,4) DEFAULT '0.0000',
`UnderServiceType` enum('FHT','CHG','DEP','MIX','DETN') CHARACTER SET latin1 COLLATE latin1_swedish_ci NOT NULL DEFAULT 'FHT',
`SubServiceHead` varchar(25) CHARACTER SET latin1 COLLATE latin1_swedish_ci NOT NULL DEFAULT 'RUN',
`IShoc` enum('N','Y') CHARACTER SET latin1 COLLATE latin1_swedish_ci DEFAULT 'N',
`CurrentCombiners` varchar(28) CHARACTER SET latin1 COLLATE latin1_swedish_ci DEFAULT NULL,
`RunSetSerla` varchar(25) CHARACTER SET latin1 COLLATE latin1_swedish_ci DEFAULT NULL,
`JourneySerla` int(11) NOT NULL AUTO_INCREMENT,
`NominationSerla` varchar(50) CHARACTER SET latin1 COLLATE latin1_swedish_ci DEFAULT NULL,
`Traction` enum('DSL','AC') CHARACTER SET latin1 COLLATE latin1_swedish_ci NOT NULL DEFAULT 'DSL',
`Trainload` smallint(4) NOT NULL DEFAULT '0',
`LeadAssist` enum('Y','N') CHARACTER SET latin1 COLLATE latin1_swedish_ci NOT NULL DEFAULT 'N',
`DEO` varchar(100) CHARACTER SET latin1 COLLATE latin1_swedish_ci DEFAULT NULL,
`DEOtime` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`JourneySerla`),
KEY `trainID` (`trainID`) USING BTREE,
KEY `routesection_idx` (`runRoute`) USING BTREE,
KEY `loconumber_idx` (`LocoNumber`) USING BTREE,
KEY `runsetserla_idx` (`RunSetSerla`) USING BTREE,
KEY `subservicehead_idx` (`SubServiceHead`) USING BTREE,
CONSTRAINT `locojourney_ibfk_1` FOREIGN KEY (`SubServiceHead`) REFERENCES `ineffective` (`IneffectiveHead`) ON UPDATE CASCADE,
CONSTRAINT `locojourney_ibfk_3` FOREIGN KEY (`runRoute`) REFERENCES `routesections` (`Sectionname`) ON DELETE RESTRICT ON UPDATE CASCADE,
CONSTRAINT `loconumber_fk` FOREIGN KEY (`LocoNumber`) REFERENCES `lococontainer` (`LocoNumber`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=345719 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
with LocomotiveLastRun AS(
-- this block of code runs in 0.016 sec
SELECT locojourney.loconumber , MAX(locojourney.journeyserla) as lastrunid
FROM locojourney
GROUP BY loconumber)
Why is this CTE subquery fast? Because your table already has an index on (loconumber, journeyserla). (InnoDB automatically appends the primary key to every secondary index.) This query can be satisfied with a loose index scan on that index, and those are fast.
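One way to check the loose index scan claim is to run EXPLAIN on the grouped subquery by itself. This is only a sketch against the locojourney table shown in the question; when MySQL picks a loose index scan, the Extra column typically reads "Using index for group-by", though the exact plan depends on your version and statistics.

```sql
-- Inspect the plan of the grouped subquery on its own.
-- Expect key = loconumber_idx and, for a loose index scan,
-- "Using index for group-by" in the Extra column.
EXPLAIN
SELECT loconumber, MAX(journeyserla) AS lastrunid
FROM locojourney
GROUP BY loconumber;
```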
Now for your main query:
Get rid of FORCE INDEX(). Don't even dream of using that unless you have at least a decade of SQL experience or you have read the source code for the InnoDB indexing stuff in MySQL. Notably, it's completely useless on the CTE because CTEs don't have indexes.
For clarity put your main (detail) table first and your CTE second.
For clarity recast the JOIN as a WHERE...IN...
Those three suggestions give us this:
WITH LocomotiveLastRun AS (...)
SELECT locojourney.CurrentCombiners , locojourney.JourneySerla ,
locojourney.From_RunPoint , locojourney.NEXT_RunPoint
FROM locojourney
WHERE journeyserla IN (SELECT lastrunid FROM LocomotiveLastRun)
AND locojourney.ishoc = 'n'
Now, it's plain what index can help this query.
An index on (ishoc) will help a bit. (Because InnoDB appends the primary key to every secondary index, it is effectively an index on (ishoc, journeyserla), so it helps with both WHERE conditions.) The query planner uses BTREE random access to find the first index row with the ishoc value 'n', then scans the values of the primary key to match them with the IN clause.
Instead of that index, a compound index that covers the query will help even more. Such a covering index helps especially because each row of your table is large, with many columns. That index mentions the columns in the WHERE clause and those you want to select, like this:
(ishoc, journeyserla, CurrentCombiners, From_RunPoint, NEXT_RunPoint)
The query planner can satisfy your query entirely from the index, which saves on disk reading time to satisfy the query. If you use your query a lot, this index is a good idea. But, it does consume disk space and slow down INSERT and UPDATE operations a bit.
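As a sketch, the covering index described above could be created like this. The index name ishoc_cover_idx is made up for illustration; pick whatever fits your naming scheme.

```sql
-- Hypothetical index name; the column list is the one given above.
ALTER TABLE locojourney
  ADD INDEX ishoc_cover_idx
    (IShoc, JourneySerla, CurrentCombiners, From_RunPoint, NEXT_RunPoint);
```

After adding it, rerun EXPLAIN on the main query; "Using index" in the Extra column for locojourney would indicate the query is being answered from the index alone.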
Read https://use-the-index-luke.com/
Give this a try:
SELECT lj.CurrentCombiners , lj.JourneySerla , lj.From_RunPoint , lj.NEXT_RunPoint
FROM ( SELECT MAX(journeyserla) as lastrunid
FROM locojourney
GROUP BY loconumber
) AS llr
JOIN locojourney AS lj ON llr.lastrunid = lj.journeyserla
WHERE lj.ishoc = 'n'
(time it and provide EXPLAIN for it)
My business scenario is shown in the figure above. A user can create multiple products, a product can have multiple modules, and a module can have multiple parameters. The parameters include variable types, variable names, and variable values.
Before starting, I thought queries against an MPTT (modified preorder tree traversal) table would be faster than against a plain two-table relational design, but the result is the complete opposite.
I now have two data table designs.
Option One:
CREATE TABLE `products` (
`product_id` bigint(20) NOT NULL AUTO_INCREMENT,
`product_name` varchar(30) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
`cfdversion` bigint(20) NULL DEFAULT NULL,
`product_info` varchar(30) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
`is_activated` tinyint(1) NULL DEFAULT NULL,
PRIMARY KEY (`product_id`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 226 CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Dynamic;
CREATE TABLE `person_param` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`product_id` bigint(20) NULL DEFAULT NULL,
`param_name` varchar(20) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
`var_type` varchar(10) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
`var_value` varchar(30) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
`var_name` varchar(30) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
`is_activated` tinyint(1) NULL DEFAULT 1,
`compute_value` varchar(20) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
`module_name` varchar(10) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
PRIMARY KEY (`id`) USING BTREE,
INDEX `product_id`(`product_id`) USING BTREE,
CONSTRAINT `person_param_ibfk_1` FOREIGN KEY (`product_id`) REFERENCES `products` (`product_id`) ON DELETE RESTRICT ON UPDATE RESTRICT
) ENGINE = InnoDB AUTO_INCREMENT = 19 CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Dynamic;
I connect the id of the product information with the parameter table.
Option two:
The products table is the same; only the table name is different.
The table that replaces person_param looks like this:
CREATE TABLE `mptt_param` (
`node_id` bigint(20) NOT NULL AUTO_INCREMENT,
`node_name` varchar(30) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
`lft` bigint(20) NOT NULL,
`rgt` bigint(20) NOT NULL,
`node_level` bigint(20) NOT NULL,
PRIMARY KEY (`node_id`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 1 CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Dynamic;
I added 200 products with 3 modules per product and 10 parameters per module.
Option 1's SQL statement:
SELECT * from `products` a RIGHT JOIN `person_param` b ON a.product_id=b.product_id WHERE a.product_id=246;
Option 2's SQL statement:
SELECT * FROM mptt_param WHERE lft>=(SELECT lft FROM mptt_param WHERE node_name='246') AND rgt<=(SELECT rgt FROM mptt_param WHERE node_name='246') ;
I don't know what the problem is. I hope you can give me some advice.
I have a SQL problem.
When I join the user, user_org, and org tables and filter on the user state, the index on user_id is not used. If I remove that condition, the index on user_id is used.
Why is that?
MySQL version: 5.7.32-log
Below are the specific SQL statements and the table structure.
SQL 1:
SELECT DISTINCT
USER.user_id,
USER.NAME,
USER.nickname,
USER.position,
USER.first_line_id,
USER.second_line_id,
USER.org_id,
user.state
FROM
USER INNER JOIN user_org ON USER.user_id = user_org.user_id
INNER JOIN org ON user_org.org_id = org.id
WHERE
( org.end_time IS NULL OR org.end_time > NOW( ) )
AND USER.state = 1
AND ( full_id LIKE 'H_ROOT.00000001.00000002.50060182.50091585.50095679.50092012.10148706.50092333.10161139%' )
EXPLAIN: the user_id index is not used
SQL 2:
SELECT DISTINCT
USER.user_id,
USER.NAME,
USER.nickname,
USER.position,
USER.first_line_id,
USER.second_line_id,
USER.org_id,
user.state
FROM
USER INNER JOIN user_org ON USER.user_id = user_org.user_id
INNER JOIN org ON user_org.org_id = org.id
WHERE
( org.end_time IS NULL OR org.end_time > NOW( ) )
-- AND USER.state = 1
AND ( full_id LIKE 'H_ROOT.00000001.00000002.50060182.50091585.50095679.50092012.10148706.50092333.10161139%' )
EXPLAIN: the user_id index is used
Table row counts:
USER: 356007
ORG: 142713
USER_ORG: 353088
Table schema:
SET NAMES utf8mb4;
SET FOREIGN_KEY_CHECKS = 0;
DROP TABLE IF EXISTS `user_org`;
CREATE TABLE `user_org` (
`user_id` varchar(32) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`org_id` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`created_at` datetime(0) NULL DEFAULT NULL,
`updated_at` datetime(0) NULL DEFAULT NULL,
PRIMARY KEY (`user_id`, `org_id`) USING BTREE,
INDEX `org_id`(`org_id`) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Dynamic;
SET NAMES utf8mb4;
SET FOREIGN_KEY_CHECKS = 0;
DROP TABLE IF EXISTS `user`;
CREATE TABLE `user` (
`user_id` varchar(32) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL COMMENT '工号',
`name` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL COMMENT '姓名',
`email` varchar(64) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '邮箱',
`email_private` varchar(64) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '个人邮箱',
`mobile` varchar(64) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL COMMENT '手机号',
`position` varchar(256) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT '' COMMENT '岗位',
`state` tinyint(4) NOT NULL DEFAULT 1 COMMENT '状态(1:启用;0:禁用)',
`org_id` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL COMMENT '部门编码',
PRIMARY KEY (`user_id`) USING BTREE,
INDEX `user_email_index`(`email`) USING BTREE,
INDEX `user_mobile_index`(`mobile`) USING BTREE,
INDEX `user_name_index`(`name`) USING BTREE,
INDEX `user_org_id_index`(`org_id`) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8 COLLATE = utf8_general_ci COMMENT = '用户表' ROW_FORMAT = Dynamic;
SET FOREIGN_KEY_CHECKS = 1;
SET NAMES utf8mb4;
SET FOREIGN_KEY_CHECKS = 0;
DROP TABLE IF EXISTS `org`;
CREATE TABLE `org` (
`id` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`name` varchar(256) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`parent_id` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`full_id` varchar(512) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`end_time` datetime(0) NULL DEFAULT NULL COMMENT '部门过期时间',
`created_at` datetime(0) NOT NULL DEFAULT CURRENT_TIMESTAMP(0) COMMENT '创建时间',
`updated_at` datetime(0) NOT NULL DEFAULT CURRENT_TIMESTAMP(0) COMMENT '更新时间',
`customer_code` varchar(40) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT '',
`org_type` varchar(32) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL COMMENT '组织类别',
`state` tinyint(4) NULL DEFAULT NULL COMMENT ' 1 正常 2 停用\r\n冗余目前还是用endtime来识别有效性',
PRIMARY KEY (`id`) USING BTREE,
INDEX `org_full_id_index`(`full_id`(255)) USING BTREE,
INDEX `org_name_index`(`name`(255)) USING BTREE,
INDEX `org_parent_id_index`(`parent_id`) USING BTREE,
INDEX `end_time`(`end_time`) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8 COLLATE = utf8_general_ci COMMENT = '组织表' ROW_FORMAT = Dynamic;
SET FOREIGN_KEY_CHECKS = 1;
STRAIGHT_JOIN: index not used
STRAIGHT_JOIN: index not used (v2)
FORCE INDEX: index not used
FORCE INDEX: index not used (v2)
What version of MySQL are you using? There have been Optimization and Index-limit changes that are relevant to your query and schema.
If you set end_time to some date in the distant future, you could avoid the OR by changing to simply end_time > NOW(). (OR used to be bad for performance.)
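A rough sketch of that change follows. The sentinel value '9999-12-31 23:59:59' is an assumption chosen for illustration, and any application code that treats a NULL end_time as "never expires" would need to be adjusted to match.

```sql
-- Assumption: '9999-12-31 23:59:59' is acceptable as a "never expires" marker.
UPDATE org
SET end_time = '9999-12-31 23:59:59'
WHERE end_time IS NULL;

-- The filter then no longer needs the OR:
SELECT org.id
FROM org
WHERE org.end_time > NOW()
  AND org.full_id LIKE 'H_ROOT.00000001.00000002.50060182.50091585.50095679.50092012.10148706.50092333.10161139%';
```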
The indexes you have for the many-to-many table (user_org) are optimal.
Index "prefixing" (full_id(255)) is problematic. It can be eliminated in newer versions. A plain INDEX(full_id) would be much more usable for a query that starts with full_id LIKE '...%'.
Perhaps you should change to utf8mb4? It is needed for the more obscure Chinese characters, plus some Emoji.
This index may be picked by the Optimizer; suggest you add it:
USER: INDEX(state, user_id)
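In DDL form, that suggestion might look like the sketch below. The index name user_state_user_id_idx is made up; whether the Optimizer actually picks the index depends on how selective state = 1 is, so check with EXPLAIN afterwards.

```sql
-- Hypothetical index name; the columns are the ones suggested above.
ALTER TABLE `user`
  ADD INDEX user_state_user_id_idx (state, user_id);
```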
If you don't actually need org.name to be a full 256 characters, lower it to 255. That way you can eliminate the prefixing:
ORG: INDEX(name)
See other options here: http://mysql.rjweb.org/doc.php/limits#767_limit_in_innodb_indexes
I already tried this solution, which says:
ALTER TABLE title
CHARACTER SET utf8
COLLATE utf8_unicode_ci;
OK, here are some screenshots which might help you.
Update
Here's what happens when I insert Japanese characters.
Update 2
SHOW CREATE TABLE gives this:
CREATE TABLE `productInfo` (
`pID` int(11) NOT NULL AUTO_INCREMENT,
`pOperation` varchar(40) CHARACTER SET latin1 DEFAULT NULL,
`year` year(4) DEFAULT NULL,
`season` varchar(10) CHARACTER SET latin1 DEFAULT NULL,
`pName` varchar(40) CHARACTER SET latin1 DEFAULT NULL,
`category` varchar(40) CHARACTER SET latin1 DEFAULT NULL,
`margin1` text CHARACTER SET latin1,
`margin2` text CHARACTER SET latin1,
PRIMARY KEY (`pID`)
) ENGINE=InnoDB AUTO_INCREMENT=12 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
Note that it says
DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
But now look at this query:
SELECT character_set_name, collation_name
FROM information_schema.columns
WHERE table_schema = 'trac_data'
AND table_name = 'productInfo'
AND column_name = 'pOperation';
which gives
character_set_name collation_name
'latin1' 'latin1_swedish_ci'
That's weird!
Update 3
SELECT hex(pOperation),pOperation FROM trac_data.productInfo;
gave 3F3F3F3F3F, which is the hex code for literal '?' characters and not for any Japanese characters, so no Japanese characters are actually being stored.
You have a mix of charsets in your table structure. The table itself uses utf8, but the column in question uses latin1. You have it defined that way. As long as the column has its own charset, you can change the table's or the schema's charset a thousand times; it won't have any effect on your column. So, instead, change the column's charset either to the default (to use that of the table) or to utf8 explicitly.
When you alter the column's charset, existing data will be converted (if possible). Your wrong input, however, stays wrong, so you have to fill in the data again.
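A minimal sketch of both variants, using the productInfo table from the question. It assumes the client connection also uses utf8 (for example via SET NAMES utf8); otherwise the characters can still be lost on the way in.

```sql
-- Variant 1: change only the affected column to utf8 explicitly.
ALTER TABLE productInfo
  MODIFY pOperation varchar(40) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT NULL;

-- Variant 2: convert every character column of the table in one statement.
ALTER TABLE productInfo
  CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
```

Rows that already contain '?' (hex 3F) cannot be recovered by either statement and have to be entered again.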
OK, I found the cause:
CREATE TABLE `productInfo` (
`pID` int(11) NOT NULL AUTO_INCREMENT,
`pOperation` varchar(40) CHARACTER SET latin1 DEFAULT NULL,
`year` year(4) DEFAULT NULL,
`season` varchar(10) CHARACTER SET latin1 DEFAULT NULL,
`pName` varchar(40) CHARACTER SET latin1 DEFAULT NULL,
`category` varchar(40) CHARACTER SET latin1 DEFAULT NULL,
`margin1` text CHARACTER SET latin1,
`margin2` text CHARACTER SET latin1,
PRIMARY KEY (`pID`)
) ENGINE=InnoDB AUTO_INCREMENT=12 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
I noticed that CHARACTER SET latin1 was present in each column definition.
So I just changed it to sjis and the problem was solved.
You have to set the database collation to UTF-8, not only the table collation:
Here is the SQL script result:
I have the table product_description from OpenCart.
I created a new search engine. Everything is OK, except that it is case sensitive.
How can I make it case insensitive?
I read in the MySQL documentation that I must change utf8_bin to utf8_general_ci.
But how can I do that without deleting all the indexes?
It's not only one table; I'm looking at 4 tables. Every table has around 4-5 indexes.
The site receives information non-stop. Loss of information is simply not acceptable.
I was wondering if there is a way to extract the keys, delete them, change the collation, and then add them again in a single operation. That way, I think there will be no data loss.
CREATE TABLE IF NOT EXISTS `product_description` (
`product_id` int(11) NOT NULL,
`language_id` int(11) NOT NULL,
`name` varchar(255) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
`short_description` text CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
`description` text CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
`meta_description` varchar(255) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
`meta_keyword` varchar(255) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
`tag` text CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
`custom_title` varchar(255) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT '',
PRIMARY KEY (`product_id`,`language_id`),
FULLTEXT KEY `description` (`description`),
FULLTEXT KEY `tag` (`tag`),
FULLTEXT KEY `ft_namerel` (`name`,`description`),
FULLTEXT KEY `name` (`name`,`short_description`,`description`,`meta_description`,`meta_keyword`,`tag`,`custom_title`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
Have you tried searching in boolean mode?
I deleted all the index keys and changed the collation; after that, I created the index keys again.
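As a possible alternative to dropping and re-creating the keys by hand, a single ALTER TABLE that converts all character columns at once should rebuild the indexes as part of the same statement. This is only a sketch, not what the poster ran; take a backup first and try it on a copy of the table, since the rebuild locks a MyISAM table while it runs.

```sql
-- Converts every character column to utf8_general_ci in one pass,
-- so the columns of each FULLTEXT key keep a common collation.
ALTER TABLE product_description
  CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
```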
Let's take an example hotels table:
CREATE TABLE `hotels` (
`HotelNo` varchar(4) character set latin1 NOT NULL default '0000',
`Hotel` varchar(80) character set latin1 NOT NULL default '',
`City` varchar(100) character set latin1 default NULL,
`CityFR` varchar(100) character set latin1 default NULL,
`Region` varchar(50) character set latin1 default NULL,
`RegionFR` varchar(100) character set latin1 default NULL,
`Country` varchar(50) character set latin1 default NULL,
`CountryFR` varchar(50) character set latin1 default NULL,
`HotelText` text character set latin1,
`HotelTextFR` text character set latin1,
`tagsforsearch` text character set latin1,
`tagsforsearchFR` text character set latin1,
PRIMARY KEY (`HotelNo`),
FULLTEXT KEY `fulltextHotelSearch` (`HotelNo`,`Hotel`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`,`HotelText`,`HotelTextFR`,`tagsforsearch`,`tagsforsearchFR`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_german1_ci;
In this table, for example, we have only one hotel with Region name = "Graubünden" (please note the umlaut character ü).
Now I want to get the same search match for both phrases:
'graubunden' and
'graubünden'
This is simple with MySQL's built-in collations in regular searches, as follows:
SELECT *
FROM `hotels`
WHERE `Region` LIKE CONVERT(_utf8 '%graubunden%' USING latin1)
COLLATE latin1_german1_ci
This works fine for both 'graubunden' and 'graubünden' and returns the proper result. The problem appears when we use MySQL full-text search.
What's wrong with this SQL statement?
SELECT
*
FROM
hotels
WHERE
MATCH (`HotelNo`,`Hotel`,`Address`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`, `HotelText`, `HotelTextFR`, `tagsforsearch`, `tagsforsearchFR`)
AGAINST( CONVERT('+graubunden' USING latin1) COLLATE latin1_german1_ci IN BOOLEAN MODE)
ORDER BY Country ASC, Region ASC, City ASC
This doesn't return any results.
Any ideas where the dog is buried?
When you define an individual CHARACTER SET for each of your columns, you override the default collation set at the table level.
Each of your columns has the default latin1 collation (which is latin1_swedish_ci). You can see it by running SHOW CREATE TABLE.
In FULLTEXT queries, indexed columns have a COERCIBILITY of 0; that is, all fulltext queries are converted to the collation used in the index, not vice versa.
You need to remove the CHARACTER SET definitions from your columns or explicitly set all columns to latin1_german1_ci:
CREATE TABLE `hotels` (
`HotelNo` varchar(4) NOT NULL default '0000',
`Hotel` varchar(80) NOT NULL default '',
`City` varchar(100) default NULL,
`CityFR` varchar(100) default NULL,
`Region` varchar(50) default NULL,
`RegionFR` varchar(100) default NULL,
`Country` varchar(50) default NULL,
`CountryFR` varchar(50) default NULL,
`HotelText` text,
`HotelTextFR` text,
`tagsforsearch` text,
`tagsforsearchFR` text,
PRIMARY KEY (`HotelNo`),
FULLTEXT KEY `fulltextHotelSearch` (`HotelNo`,`Hotel`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`,`HotelText`,`HotelTextFR`,`tagsforsearch`,`tagsforsearchFR`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_german1_ci;
INSERT
INTO hotels (hotelText, HotelTextFR, tagsforsearch, tagsforsearchFR)
VALUES ('text', 'text', 'graubünden', 'tags');
SELECT *
FROM hotels
WHERE MATCH (`HotelNo`,`Hotel`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`, `HotelText`, `HotelTextFR`, `tagsforsearch`, `tagsforsearchFR`)
AGAINST (CONVERT('+graubunden' USING latin1) COLLATE latin1_german1_ci IN BOOLEAN MODE)
ORDER BY
Country ASC, Region ASC, City ASC;
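If the table already holds data and re-creating it is not convenient, a sketch of the same fix on the existing table could look like this. It assumes you want every character column in latin1_german1_ci; the second statement only checks the result.

```sql
-- Convert all character columns (and rebuild the FULLTEXT index) in place.
ALTER TABLE hotels
  CONVERT TO CHARACTER SET latin1 COLLATE latin1_german1_ci;

-- Verify the per-column collation afterwards.
SELECT column_name, character_set_name, collation_name
FROM information_schema.columns
WHERE table_schema = DATABASE()
  AND table_name = 'hotels';
```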