MySQL word boundary for diacritic strings

I am having trouble matching words on word boundaries in strings that contain diacritics. My table and data:
CREATE TABLE `authors` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`first_name` varchar(250) DEFAULT NULL,
`last_name` varchar(250) DEFAULT NULL,
`initials` varchar(250) DEFAULT NULL,
`affiliation_available` tinyint(1) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `unique_names` (`first_name`,`last_name`,`initials`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
INSERT INTO authors VALUES (1,'Juan José','Palacios Gutiérrez','JJ',1);
I want word-boundary matching to work for strings with diacritics as well:
select * from authors where ( preg_rlike('/\bJosé\b/i',CONVERT(authors.first_name USING utf8)));
This fails, so I tried:
select * from authors where ( preg_rlike('/(?<!\w)José(?!\w)/i',CONVERT(authors.first_name USING utf8)));
That works. However:
select * from authors where ( preg_rlike('/(?<!\w)Jos(?!\w)/i',CONVERT(authors.first_name USING utf8)));
Ideally this query should return nothing, because the data contains no word Jos, yet it returns the row anyway.
Can you please help?
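The three results are what you would expect if the pattern is applied byte by byte instead of character by character. In UTF-8, "é" is stored as two bytes, and neither is an ASCII word character, so \b and (?!\w) see a boundary right after "Jos". The sketch below reproduces this with Python's re module as a stand-in (it is not preg_rlike). A bytes pattern plays the role of byte-wise matching and a str pattern the role of Unicode-aware matching. In PCRE the usual switch is the u modifier, e.g. '/\bJosé\b/iu'. Whether your preg_rlike build accepts it, and whether \w then covers accented letters, is an assumption to verify.

```python
import re

name = "Juan José"
raw = name.encode("utf-8")  # 'é' becomes the two bytes 0xC3 0xA9


def found(pattern, subject):
    """Return True if the case-insensitive pattern matches anywhere in subject."""
    return re.search(pattern, subject, re.IGNORECASE) is not None


# The three patterns from the question, in the same order.
patterns = [r"\bJosé\b", r"(?<!\w)José(?!\w)", r"(?<!\w)Jos(?!\w)"]

# Byte-wise matching: \w only knows ASCII letters, digits and underscore.
byte_mode = [found(p.encode("utf-8"), raw) for p in patterns]

# Character-wise matching: \w also covers accented letters such as 'é'.
char_mode = [found(p, name) for p in patterns]

print(byte_mode)  # [False, True, True]  -> same symptoms as in the question
print(char_mode)  # [True, True, False]  -> the behaviour that was expected
```

In byte mode the first pattern fails because \b needs a word character on one side, and the last byte of "é" is not one. The third pattern succeeds for the same reason: the byte after "Jos" does not count as a word character.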

Related

How to store translations in MySQL for use with joins?

I have a table that contains all translations of words:
CREATE TABLE `localtexts` (
`Id` int(11) NOT NULL,
`Lang` char(2) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT 'pe',
`Text` varchar(300) DEFAULT NULL,
`ShortText` varchar(100) NOT NULL,
`DbVersion` timestamp NOT NULL DEFAULT current_timestamp(),
`Status` int(11) NOT NULL DEFAULT 1
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
For example, here is a table that refers to localtexts:
CREATE TABLE `composes` (
`Status` int(11) NOT NULL DEFAULT 1,
`Id` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The table above has a foreign key from Id to localtexts.Id. When I need the English text, I run:
SELECT localtexts.text,
composes.status
FROM composes
LEFT JOIN localtexts ON composes.Id = localtexts.Id
WHERE localtexts.Lang = 'en';
I'm concerned about the performance of this design when many tables have to join with localtexts.
You might find that adding the following index to the localtexts table would speed up the query:
CREATE INDEX idx ON localtexts (Lang, id, text);
This index covers the WHERE clause, the join, and the SELECT list, so the lookup on localtexts can be answered from the index alone.

Slow search query with a one to many join

My problem is a slow search query with a one-to-many relationship between the tables. My tables look like this.
Table Assignment
CREATE TABLE `Assignment` (
`Id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`ProjectId` int(10) unsigned NOT NULL,
`AssignmentTypeId` smallint(5) unsigned NOT NULL,
`AssignmentNumber` varchar(30) NOT NULL,
`AssignmentNumberExternal` varchar(50) DEFAULT NULL,
`DateStart` datetime DEFAULT NULL,
`DateEnd` datetime DEFAULT NULL,
`DateDeadline` datetime DEFAULT NULL,
`DateCreated` datetime DEFAULT NULL,
`Deleted` datetime DEFAULT NULL,
`Lat` double DEFAULT NULL,
`Lon` double DEFAULT NULL,
PRIMARY KEY (`Id`),
KEY `idx_assignment_assignment_type_id` (`AssignmentTypeId`),
KEY `idx_assignment_assignment_number` (`AssignmentNumber`),
KEY `idx_assignment_assignment_number_external`
(`AssignmentNumberExternal`)
) ENGINE=InnoDB AUTO_INCREMENT=5280 DEFAULT CHARSET=utf8;
Table ExtraFields
CREATE TABLE `ExtraFields` (
`assignment_id` int(10) unsigned NOT NULL,
`name` varchar(30) NOT NULL,
`value` text,
PRIMARY KEY (`assignment_id`,`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
My search query
SELECT
`Assignment`.`Id`, COL_5_72, COL_5_73, COL_5_74, COL_5_75, COL_5_76,
COL_5_77 FROM (
SELECT
`Assignment`.`Id`,
`Assignment`.`AssignmentNumber` AS COL_5_72,
`Assignment`.`AssignmentNumberExternal` AS COL_5_73 ,
`AssignmentType`.`Name` AS COL_5_74,
`Assignment`.`DateStart` AS COL_5_75,
`Assignment`.`DateEnd` AS COL_5_76,
`Assignment`.`DateDeadline` AS COL_5_77,
CASE WHEN `ExtraField`.`Name` = "WorkDistrict" THEN
`ExtraField`.`Value` end as COL_5_78 FROM `Assignment`
LEFT JOIN `ExtraFields` as `ExtraField` on
`ExtraField`.`assignment_id` = `Assignment`.`Id`
WHERE `Assignment`.`Deleted` IS NULL -- Assignment should not be removed.
AND (1=1) -- Add assignment filters.
) AS q1
GROUP BY `Assignment`.`Id`
HAVING 1 = 1
AND COL_5_78 LIKE '%Amsterdam East%'
ORDER BY COL_5_72 ASC, COL_5_73 ASC;
Even though the table has only around 3,500 records, my query takes a couple of seconds to execute and return the results.
What is a better way to search in the related data? Should I just add a JSON field to the Assignment table and use the MySQL 5.7 JSON query features? Or did I make a mistake in designing my database?
You are selecting from a subquery, which forces MySQL to build an unindexed temporary table on every execution. Remove the subquery (you really don't need it here) and filter on the joined ExtraFields row directly; it will be much faster.
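One way to read that advice is to join ExtraFields on both the assignment id and the field name, then filter in WHERE. Because (assignment_id, name) is the primary key, each assignment contributes at most one row, so no GROUP BY or HAVING is needed. The sketch below runs the rewritten query against SQLite through Python's sqlite3 module, only so that it is executable. The trimmed schema and the sample rows are invented for illustration, and the AssignmentType join is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Assignment (
    Id INTEGER PRIMARY KEY,
    AssignmentNumber TEXT NOT NULL,
    AssignmentNumberExternal TEXT,
    Deleted TEXT
);
CREATE TABLE ExtraFields (
    assignment_id INTEGER NOT NULL,
    name TEXT NOT NULL,
    value TEXT,
    PRIMARY KEY (assignment_id, name)
);
-- Hypothetical sample data: assignment 3 is soft-deleted.
INSERT INTO Assignment VALUES
    (1, 'A-001', NULL, NULL),
    (2, 'A-002', 'EXT-2', NULL),
    (3, 'A-003', NULL, '2017-01-01 00:00:00');
INSERT INTO ExtraFields VALUES
    (1, 'WorkDistrict', 'Amsterdam East'),
    (1, 'Contact', 'Amsterdam East office'),
    (2, 'WorkDistrict', 'Rotterdam'),
    (3, 'WorkDistrict', 'Amsterdam East');
""")

# No derived table: join on the composite key and filter in WHERE.
query = """
SELECT a.Id, a.AssignmentNumber, a.AssignmentNumberExternal
FROM Assignment AS a
JOIN ExtraFields AS ef
  ON ef.assignment_id = a.Id
 AND ef.name = 'WorkDistrict'
WHERE a.Deleted IS NULL
  AND ef.value LIKE '%Amsterdam East%'
ORDER BY a.AssignmentNumber, a.AssignmentNumberExternal
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 'A-001', None)]
```

The leading wildcard in LIKE '%Amsterdam East%' still rules out an index on value, but the join itself can use the ExtraFields primary key.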

mysql match against not matching text in brackets

I'm trying to use MATCH ... AGAINST to return search results.
I have a problem, though: it doesn't return rows where the matching text is in brackets. Just getting rid of the brackets isn't an option, I'm afraid.
Running the SQL fiddle below, I would expect it to return two results, not one:
CREATE TABLE `courses` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`course_code` varchar(100) NOT NULL,
`course_name` varchar(100) DEFAULT NULL,
`startdate` varchar(100) DEFAULT NULL,
`starttimestamp` varchar(45) DEFAULT NULL,
`prospectus_title` varchar(500) DEFAULT NULL,
PRIMARY KEY (`id`),
FULLTEXT KEY `info` (`course_name`,`prospectus_title`,`course_code`)
) ENGINE=MyISAM AUTO_INCREMENT=981074 DEFAULT CHARSET=utf8;
INSERT INTO `courses` (`id`, `course_code`, `course_name`, `startdate`, `starttimestamp`, `prospectus_title`) VALUES
('1', '1234', 'vrqtest', 'time','time', 'vrqtest'),
('2', '5678', '(vrq)test', 'time','time','(vrq)test');
SELECT * FROM courses force index(info)
WHERE starttimestamp IS NOT NULL AND (
MATCH ( course_name ) against ('vrq*' in boolean mode ))
SQL Fiddle
That returns only the first record, but it should also return the second.
Any ideas?
Here's why:
https://dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_ft_min_word_len
ft_min_word_len
The minimum length of the word to be included in a
FULLTEXT index. Defaults to 4
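The full-text parser treats the brackets as word separators, so (vrq)test is indexed as the two words vrq and test. At three characters, vrq is shorter than ft_min_word_len and is never indexed, so vrq* has nothing to match in that row. Add an extra character (e.g. (vrqa)test) and it's matched. Alternatively, lower ft_min_word_len, restart the server and rebuild the FULLTEXT index.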
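To see the effect of the length limit, here is a rough Python simulation of the indexing step. It is a simplified model, not MySQL's actual parser: it splits on anything that is not a letter, digit or underscore and drops short words, ignoring stopwords and other details. The function names are made up for this illustration.

```python
import re

FT_MIN_WORD_LEN = 4  # MySQL's default minimum word length for MyISAM full-text indexes


def indexed_words(text, min_len=FT_MIN_WORD_LEN):
    """Approximate the words that end up in the index for one column value."""
    words = re.findall(r"\w+", text.lower())  # brackets act as separators
    return [w for w in words if len(w) >= min_len]


def prefix_matches(text, prefix, min_len=FT_MIN_WORD_LEN):
    """Approximate a boolean-mode prefix search such as 'vrq*'."""
    return any(w.startswith(prefix) for w in indexed_words(text, min_len))


print(indexed_words("vrqtest"))     # ['vrqtest']
print(indexed_words("(vrq)test"))   # ['test']   -> 'vrq' was too short to index
print(indexed_words("(vrqa)test"))  # ['vrqa', 'test']

print(prefix_matches("vrqtest", "vrq"))               # True
print(prefix_matches("(vrq)test", "vrq"))             # False
print(prefix_matches("(vrq)test", "vrq", min_len=3))  # True
```

The last call corresponds to lowering ft_min_word_len to 3: the word vrq is then kept, and the prefix search finds the second row as well.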

full text indexing only returns 2 rows

I have a table for food and hotels, like this:
CREATE TABLE `food_master` (
`id` int(6) unsigned NOT NULL auto_increment,
`caption` varchar(255) default NULL,
`category` varchar(10) default NULL,
`subcategory` varchar(10) default NULL,
`hotel` varchar(10) default NULL,
`description` text,
`status` varchar(10) default NULL,
`created_date` date default NULL,
`modified_date` date default NULL,
`chosen_mark` varchar(10) default 'no',
PRIMARY KEY (`id`),
FULLTEXT KEY `description` (`description`,`caption`)
) ENGINE=MyISAM AUTO_INCREMENT=15 DEFAULT CHARSET=latin1
And I have data in it. I use full text indexing in this table. I use the query
SELECT * FROM food_master am
WHERE MATCH(description, caption) AGAINST ('Chicken')
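This query works fine when I have two rows with 'Chicken' in the caption field, but when I add a third one it doesn't return a row.
Try IN BOOLEAN MODE. In the default natural-language mode, a MyISAM full-text search ignores words that appear in 50% or more of the rows, which is the likely reason the match stops working once a third row contains 'Chicken'. Boolean mode does not apply that threshold. From the manual: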
MySQL can perform boolean full-text searches using the IN BOOLEAN MODE
modifier. With this modifier, certain characters have special meaning
at the beginning or end of words in the search string.
SELECT * FROM food_master
WHERE MATCH(description, caption) AGAINST ('Chicken' IN BOOLEAN MODE)
Demo

Nested selects in MySQL

Setting:
Each page on my site has four widgets that are arranged in different orders (1-4).
I have a table 'content' and a table 'widgets'. I have a bridging table, content_widgets, that maps content.id to widgets.id.
Problem:
What I want to do is run a query that selects * from content along with additional columns widget_1, widget_2, widget_3, widget_4, each containing the id of the widget linked to that page.
I've been trying some nested selects all morning and can't seem to crack it. I've copied the MySQL dumps of the involved tables below :-).
CREATE TABLE `content` (
`id` int(11) NOT NULL auto_increment,
`permalink` varchar(64) character set latin1 NOT NULL,
`parent` int(11) NOT NULL default '1',
`title` varchar(128) character set latin1 NOT NULL,
`content` text character set latin1,
`content_type` varchar(16) NOT NULL default 'page',
PRIMARY KEY (`id`),
FULLTEXT KEY `title` (`title`,`content`,`meta_description`,`meta_keywords`)
)
CREATE TABLE `widgets` (
`id` int(11) unsigned NOT NULL auto_increment,
`title` varchar(64) default NULL,
`text` varchar(256) default NULL,
`image` varchar(128) default NULL,
`target` varchar(128) default NULL,
`code` varchar(32) default NULL,
PRIMARY KEY (`id`)
)
CREATE TABLE `content_widgets` (
`content_id` int(11) NOT NULL,
`widget_id` int(11) NOT NULL,
`order` tinyint(4) NOT NULL
)
thanks a lot!
You don't need a nested query - just a join. Assuming that you want to start with a content record and return the matching widgets....
SELECT c.*, w.*
FROM content c
LEFT JOIN (
content_widgets cw INNER JOIN widgets w
ON cw.widget_id=w.id
) ON c.id=cw.content_id
WHERE c.id=....
Although a simple inner join is a better idea if you know you've got the widgets:
SELECT c.*, w.*
FROM content c, content_widgets cw, widgets w
WHERE cw.widget_id=w.id
AND c.id=cw.content_id
AND c.id=....
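The joins above return one row per widget. To get the widget_1 to widget_4 columns the question asks for, one common approach (not part of the answer above) is conditional aggregation: group by the content row and pick the widget id for each order value. The sketch below runs it against SQLite through Python's sqlite3 module so that it is executable. The schema is trimmed and the sample rows are invented. In MySQL the same SELECT should work with c.* in place of the explicit columns, depending on the sql_mode in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE content (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE widgets (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE content_widgets (
    content_id INTEGER NOT NULL,
    widget_id INTEGER NOT NULL,
    `order` INTEGER NOT NULL
);
-- Hypothetical sample data: page 2 only has two widgets assigned.
INSERT INTO content VALUES (1, 'Home'), (2, 'About');
INSERT INTO widgets VALUES (10, 'News'), (11, 'Search'), (12, 'Login'), (13, 'Ads');
INSERT INTO content_widgets VALUES
    (1, 12, 1), (1, 10, 2), (1, 13, 3), (1, 11, 4),
    (2, 11, 1), (2, 10, 2);
""")

# One output row per page; each CASE picks the widget stored for that position.
query = """
SELECT c.id, c.title,
       MAX(CASE WHEN cw.`order` = 1 THEN cw.widget_id END) AS widget_1,
       MAX(CASE WHEN cw.`order` = 2 THEN cw.widget_id END) AS widget_2,
       MAX(CASE WHEN cw.`order` = 3 THEN cw.widget_id END) AS widget_3,
       MAX(CASE WHEN cw.`order` = 4 THEN cw.widget_id END) AS widget_4
FROM content c
LEFT JOIN content_widgets cw ON cw.content_id = c.id
GROUP BY c.id, c.title
ORDER BY c.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# (1, 'Home', 12, 10, 13, 11)
# (2, 'About', 11, 10, None, None)
```

The LEFT JOIN keeps pages that have fewer than four widgets; their missing positions come back as NULL.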