MySQL MATCH ... AGAINST not matching text in brackets

I'm trying to use MATCH ... AGAINST to return search results.
The problem is that it doesn't return rows where the matching text is inside brackets. Simply removing the brackets isn't an option, I'm afraid.
Running the SQL below, I would expect two results, not one:
CREATE TABLE `courses` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`course_code` varchar(100) NOT NULL,
`course_name` varchar(100) DEFAULT NULL,
`startdate` varchar(100) DEFAULT NULL,
`starttimestamp` varchar(45) DEFAULT NULL,
`prospectus_title` varchar(500) DEFAULT NULL,
PRIMARY KEY (`id`),
FULLTEXT KEY `info` (`course_name`,`prospectus_title`,`course_code`)
) ENGINE=MyISAM AUTO_INCREMENT=981074 DEFAULT CHARSET=utf8;
INSERT INTO `courses` (`id`, `course_code`, `course_name`, `startdate`, `starttimestamp`, `prospectus_title`) VALUES
('1', '1234', 'vrqtest', 'time','time', 'vrqtest'),
('2', '5678', '(vrq)test', 'time','time','(vrq)test');
SELECT * FROM courses force index(info)
WHERE starttimestamp IS NOT NULL AND (
MATCH ( course_name ) against ('vrq*' in boolean mode ))
That returns only the first record, but it should also return the second.
Any ideas?

Here's why:
https://dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_ft_min_word_len
ft_min_word_len
The minimum length of the word to be included in a
FULLTEXT index. Defaults to 4
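The brackets are not word characters, so (vrq)test is indexed as the two words vrq and test. At three characters, vrq is shorter than ft_min_word_len, so it is never indexed and vrq* has nothing to match. Add an extra character (e.g. (vrqa)test) and it's matched.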
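If changing the data is not possible, lowering the minimum word length is the other route. The following is a minimal sketch, assuming a MyISAM table and a server you are allowed to restart; the value 3 is an assumption chosen only because it would admit the word vrq.

```sql
-- Check the current minimum indexed word length (MyISAM full-text, default 4).
SHOW VARIABLES LIKE 'ft_min_word_len';

-- ft_min_word_len is not dynamic. Set it in the server option file, e.g.
--   [mysqld]
--   ft_min_word_len=3      (assumed value, short enough to index 'vrq')
-- and restart the server.

-- Existing MyISAM FULLTEXT indexes must be rebuilt to pick up the new length.
REPAIR TABLE courses QUICK;

-- Re-run the original search once the index has been rebuilt.
SELECT id, course_name
FROM courses
WHERE MATCH (course_name) AGAINST ('vrq*' IN BOOLEAN MODE);
```

Lowering the limit makes the index larger, since many more short words are stored, so it is worth checking the effect on a copy of the table first.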

Related

Second Subquery Inside INSERT Into saves int 0

Please read carefully: we have a query that inserts values into the table users. For the value member_id we run a subquery that selects the member's id from the table admin_users. The single quotes with + are there because we are trying to manipulate the query. This first subquery works correctly, but what happens with the second subquery?
The second subquery selects pass from the table settings. The table and the pass value definitely exist, and there is only one record, but this second subquery inside the INSERT INTO does not return anything useful. When the INSERT INTO finishes, all the values are stored correctly except the notes column, which ends up as 0. I don't know why, but if you delete all the ''+ the whole SQL statement works correctly. In this case, however, we cannot delete ''+ because we are altering the query. I need a solution for this issue.
INSERT INTO `users` (`username`,`password`,`number`,`member_id`,`exp_date`,`notes`) VALUES ('balvin','sjeneoeoe','3',''+(select id from `admin_users` where username = 'TEST')+'','1644622354','' + (select pass from `settings`));#;');
I have also tried modifying the second subquery like this, but it didn't work:
'' + (select pass from `settings` LIMIT 1)
'' + (select pass from `settings` GROUP BY pass LIMIT 1)
'' + (select pass from `settings` where id = 1 LIMIT 1)
Perhaps the problem is the data type of the pass column in settings or of the notes column in users:
CREATE TABLE `users` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`member_id` int(11) DEFAULT NULL,
`username` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`password` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`exp_date` int(11) DEFAULT NULL,
`notes` mediumtext COLLATE utf8_unicode_ci NOT NULL,
`number` int(11) NOT NULL DEFAULT '1',
PRIMARY KEY (`id`),
KEY `member_id` (`member_id`),
KEY `exp_date` (`exp_date`),
KEY `username` (`username`),
KEY `password` (`password`)
) ENGINE=InnoDB AUTO_INCREMENT=1702894 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
CREATE TABLE `settings` (
`id` int(11) NOT NULL,
`name` mediumtext COLLATE utf8_unicode_ci NOT NULL,
`pass` mediumtext COLLATE utf8_unicode_ci NOT NULL,
...
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
Write your insert as a select, not values.
Untested, but something like:
INSERT INTO `users` (`username`,`password`,`number`,`member_id`,`exp_date`,`notes`)
select 'balvin','sjeneoeoe','3',
(select id from `admin_users` where username = 'TEST'),
'1644622354',
(select pass from `settings`);
Note each sub-query must return a single row.
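The reason notes ends up as 0 is that + in MySQL is always arithmetic, never string concatenation, so both operands are converted to numbers. A small sketch of that behaviour follows; the literal 'abc' is only a stand-in for a non-numeric pass value.

```sql
-- In MySQL, + converts both operands to numbers before adding them.
SELECT '' + '42';    -- 42: a numeric string survives, as the id subquery did
SELECT '' + 'abc';   -- 0 (with a warning): a non-numeric string becomes 0

-- String concatenation uses CONCAT() instead.
SELECT CONCAT('', 'abc');   -- 'abc'
```

This is why the member_id subquery appears to work (its result is numeric) while the pass subquery does not.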

MySQL word boundary for diacritic strings

I am having trouble matching words on word boundaries in strings that contain diacritics.
CREATE TABLE `authors` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`first_name` varchar(250) DEFAULT NULL,
`last_name` varchar(250) DEFAULT NULL,
`initials` varchar(250) DEFAULT NULL,
`affiliation_available` tinyint(1) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `unique_names` (`first_name`,`last_name`,`initials`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
INSERT INTO authors VALUES (1,'Juan José','Palacios Gutiérrez','JJ',1);
I want word boundaries to work for strings with diacritics as well:
select * from authors where ( preg_rlike('/\bJosé\b/i',CONVERT(authors.first_name USING utf8)));
That fails, so I tried:
select * from authors where ( preg_rlike('/(?<!\w)José(?!\w)/i',CONVERT(authors.first_name USING utf8)));
That works. However:
select * from authors where ( preg_rlike('/(?<!\w)Jos(?!\w)/i',CONVERT(authors.first_name USING utf8)));
Ideally this query should return nothing, because there is no word Jos. Instead it returns the row.
Can you please help me?

MySQL fulltext selection performance

I have 1.5M rows in a table. Following is the table create code:
CREATE TABLE `jobs` (
`id` INT(8) NOT NULL AUTO_INCREMENT,
`job_id` VARCHAR(50) NOT NULL DEFAULT '',
`title` VARCHAR(255) NOT NULL DEFAULT '',
`company` VARCHAR(255) NOT NULL DEFAULT '',
`city` VARCHAR(50) NOT NULL DEFAULT '',
`state` VARCHAR(50) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
UNIQUE INDEX `job_id` (`job_id`),
FULLTEXT INDEX `search` (`title`, `company`, `city`, `state`)
)
COLLATE='utf8_general_ci'
ENGINE=MyISAM
The query below takes about 0.5 seconds, which is very high.
SELECT id, title, company, state, city FROM `jobs` WHERE MATCH (title, company, state, city) AGAINST ('software engineer in san fransisco california') LIMIT 0,10
How can I decrease the execution time and still return relevant results? Any suggestions?
So far I have tried the following, with no improvement at all:
-Searching a single field that contains the data of all 4 fields, but it did not matter.
-Using IN BOOLEAN MODE with >1 or >2, but then it gives me unrelated results.
-Repairing the table, increasing key_buffer_size from 16MB to 1GB, changing the table type to InnoDB, changing the character set from utf8 to latin1.
-Setting ft_max_word_len=1 and ft_stopword_file='' instead of the default values.
-Searching online for many hours, but no luck so far.
"Explain select..." output:
id;select_type;table;type ;possible_keys;key ;key_len;ref;rows;Extra
1 ;SIMPLE ;jobs ;fulltext;search ;search;0 ;\N ;1 ;Using where
Edit: Thank you for your suggestions, but there is no improvement at all.

MySQL-Query too slow

I am using the following tables in my MySQL-Database:
--
-- Table structure for table `company`
--
CREATE TABLE IF NOT EXISTS `company` (
`numb` varchar(4) NOT NULL,
`cik` varchar(30) NOT NULL,
`sNumber` varchar(30) NOT NULL,
`street1` varchar(255) NOT NULL,
`street2` varchar(255) NOT NULL,
`city` varchar(255) NOT NULL,
`state` varchar(100) NOT NULL,
`zip` varchar(100) NOT NULL,
`phone` varchar(255) NOT NULL,
`name` varchar(255) NOT NULL,
`dateChanged` varchar(30) NOT NULL,
`name2` varchar(255) NOT NULL,
`seriesId` varchar(30) NOT NULL,
`symbol` varchar(10) NOT NULL,
`exchange` varchar(20) NOT NULL,
PRIMARY KEY (`cik`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
INSERT INTO `company` (`numb`, `cik`, `sNumber`, `street1`, `street2`, `city`, `state`, `zip`, `phone`, `name`, `dateChanged`, `name2`, `seriesId`, `symbol`, `exchange`) VALUES
('6798', 'abc', '953551121', '701 AVENUE', '', 'GLENDALE', 'CA', '91201-2349', '818-244-8080', '', '', 'Public Store', '', 'PSA', 'NYSE')
--
-- Table structure for table `data`
--
CREATE TABLE IF NOT EXISTS `data` (
`id` int(100) NOT NULL AUTO_INCREMENT,
`number` varchar(100) NOT NULL,
`elementname` mediumtext NOT NULL,
`date` varchar(100) NOT NULL,
`elementvalue` longtext NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=18439;
INSERT INTO `data` (`id`, `number`, `elementname`, `date`, `elementvalue`) VALUES
(1, '0001393311-10-000004', 'StockholdersEquityIncludingPortionAttributableToNoncontrollingInterest', '2009-12-31', '3399777000')
--
-- Table structure for table `filing`
--
CREATE TABLE IF NOT EXISTS `filing` (
`number` varchar(100) NOT NULL,
`file_number` varchar(100) NOT NULL,
`type` varchar(100) NOT NULL,
`amendment` tinyint(1) NOT NULL,
`date` varchar(100) NOT NULL,
`cik` varchar(30) NOT NULL,
PRIMARY KEY (`accession_number`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
INSERT INTO `filing` (`number`, `file_number`, `type`, `amendment`, `date`, `cik`) VALUES
('0001393311-10-000004', '001-33519', '10-K', 0, '2009-12-31', '0000751653'),
('0000751652-10-000006', '001-08796', '10-K', 0, '2009-12-31', '0000751652')
The data table has around 22.000 entries, filing and company tables have around 400 entries each. I want to operate the database with a lot more entries in the future.
I perform the following query, which selects the newest item with a given type:
SELECT data.elementname, data.elementvalue, company.name2 FROM data
JOIN filing ON data.number = filing.number
JOIN company ON filing.cik = company.cik
WHERE elementname IN ('Elem1', 'Elem2', 'Elem3', 'Elem4', 'Elem5', 'ElemN')
AND number IN (
SELECT number
FROM filing
WHERE filing.cik IN ('cik1', 'cik2', 'cikN')
AND filing.type = '1L'
GROUP BY filing.cik
)
It takes between ~0.28 and 0.4 seconds to complete, which seems very slow.
When I run the query without the following line
WHERE filing.cik IN ('cik1', 'cik2', 'cikN')
it takes only ~0.035 seconds.
Any idea how to speed up the query or optimize the table structure? The table is growing rapidly and it's already too slow.
First off, the table structure you posted for filing is incorrect, as the primary key column you specified doesn't exist. I'll assume you mean number. Additionally, you didn't specify the table definition for company, which makes trying to provide advice for this somewhat difficult.
However, both of the comments are correct. You need some indexes. Based on the query, you should probably add the following indexes.
ALTER TABLE company ADD INDEX ( cik )
ALTER TABLE data ADD INDEX ( number )
I would also recommend taking a look at whether data.elementname actually needs to be a MEDIUMTEXT, which is a pretty huge column. If the rest of the data looks like the example data you provided, you should probably change it into a varchar. TEXT columns can cause some serious performance penalties due to the way they're stored.
Additionally, your PRIMARY KEY number columns, which are currently strings, look as though they could be reformatted into different columns that are actually of type INT. Keep in mind that VARCHAR PRIMARY KEY columns will not be as efficient as INTs, just because they're so much bigger.
Lastly, 22k rows isn't all that much data. You should take a look at your my.cnf settings. Your key_buffer value may be too small to fit the indexes entirely in memory. Additionally, you may want to consider using InnoDB for these tables, combined with an innodb_buffer_pool value that'll keep everything in memory.
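The column and configuration checks suggested above could be sketched roughly as follows. The length 255 is an assumption; confirm with the first query that no existing value is longer before shrinking the column.

```sql
-- Check that the longest existing value fits before changing the column type.
SELECT MAX(CHAR_LENGTH(elementname)) AS longest_name FROM data;

-- Assumes the result above is at most 255; pick a larger length otherwise.
ALTER TABLE data MODIFY elementname VARCHAR(255) NOT NULL;

-- Inspect the buffer sizes mentioned above.
SHOW VARIABLES LIKE 'key_buffer_size';
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
```

Running EXPLAIN on the original query before and after adding the indexes shows whether they are actually being used.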

full text indexing only returns 2 rows

I have a table for food and hotels, like this:
CREATE TABLE `food_master` (
`id` int(6) unsigned NOT NULL auto_increment,
`caption` varchar(255) default NULL,
`category` varchar(10) default NULL,
`subcategory` varchar(10) default NULL,
`hotel` varchar(10) default NULL,
`description` text,
`status` varchar(10) default NULL,
`created_date` date default NULL,
`modified_date` date default NULL,
`chosen_mark` varchar(10) default 'no',
PRIMARY KEY (`id`),
FULLTEXT KEY `description` (`description`,`caption`)
) ENGINE=MyISAM AUTO_INCREMENT=15 DEFAULT CHARSET=latin1
And I have data in it. I use full text indexing in this table. I use the query
SELECT * FROM food_master am
WHERE MATCH(description, caption) AGAINST ('Chicken')
This query works fine when I have 2 rows with 'Chicken' in the field 'caption', but when I add a third one it doesn't return a row.
Try IN BOOLEAN MODE:
MySQL can perform boolean full-text searches using the IN BOOLEAN MODE
modifier. With this modifier, certain characters have special meaning
at the beginning or end of words in the search string.
SELECT * FROM food_master
WHERE MATCH(description, caption) AGAINST ('Chicken' IN BOOLEAN MODE)
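One likely reason boolean mode helps here: for MyISAM tables, a natural-language full-text search ignores any word that appears in 50% or more of the rows, and boolean mode does not apply that threshold. In a table with only a handful of rows, a third 'Chicken' row could be enough to cross it. The sketch below is one way to check; whether this is the actual cause depends on the data in the table.

```sql
-- Compare the number of rows containing the term with the table size.
SELECT COUNT(*) AS total_rows,
       SUM(caption LIKE '%Chicken%' OR description LIKE '%Chicken%') AS rows_with_term
FROM food_master;

-- Boolean mode does not apply the 50% threshold.
SELECT id, caption
FROM food_master
WHERE MATCH(description, caption) AGAINST ('Chicken' IN BOOLEAN MODE);
```

If rows_with_term is at least half of total_rows, the natural-language query treats 'Chicken' as too common to be useful and matches nothing on it.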