MySQL equivalent of PostgreSQL's to_tsvector, @@ and to_tsquery?

In PostgreSQL, we can run a full text search against a table like this:
SELECT title
FROM pgweb
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'friend');
Source - http://www.postgresql.org/docs/current/static/textsearch-tables.html
How can we do a similar search in MySQL 5.5? It is quite easy in PostgreSQL.

You probably want MySQL's full text search functionality. Essentially you create a FULLTEXT index then search against it using MATCH() ... AGAINST.
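As a rough sketch of what that looks like for the pgweb example in the question (the column definitions and the index name ft_body are assumptions for illustration, and MyISAM is used because the question targets MySQL 5.5):

```sql
-- Hypothetical schema: only the table name and the title/body columns
-- come from the question; types and the index name are assumed.
CREATE TABLE pgweb (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(200) NOT NULL,
    body  TEXT NOT NULL,
    FULLTEXT KEY ft_body (body)   -- the FULLTEXT index that MATCH() uses
) ENGINE=MyISAM;                  -- on 5.5, FULLTEXT requires MyISAM

-- Rough counterpart of:
--   to_tsvector('english', body) @@ to_tsquery('english', 'friend')
SELECT title
FROM pgweb
WHERE MATCH(body) AGAINST('friend' IN NATURAL LANGUAGE MODE);
```

Keep in mind that this is not an exact equivalent: MySQL does not stem words the way PostgreSQL's 'english' configuration does, so 'friend' will not match 'friends' here. Boolean mode with a trailing wildcard, AGAINST('friend*' IN BOOLEAN MODE), is one way to approximate that.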
I'm not aware of a facility to set the search language per-query in MySQL, but that doesn't mean no such support exists. It wasn't clear if per-query language settings were a requirement for you.
The latest stable release of MySQL supports full text search on the modern transactional and crash-safe InnoDB table type as well as the unsafe MyISAM table type. If your MySQL only does FTS on MyISAM, it's time to upgrade: MySQL 5.6 supports full text search on InnoDB.
Alternately, if you really can't upgrade, you can store your important data in InnoDB tables and run a periodic query to update a MyISAM table you use as a materialized view for fulltext search only:
Create a new MyISAM table
INSERT INTO ... SELECT the data from the InnoDB table into the new MyISAM table
CREATE the fulltext index on the new MyISAM table
DROP the old MyISAM table you were using for fulltext indexing; and
finally ALTER TABLE ... RENAME the new MyISAM table to have the name of the old one.
You'll have a very short window during which the fulltext index is unavailable, between when you drop the old table and rename the new one. Your data also goes stale between view refreshes, though it's possible you can work around that with triggers (I don't use MySQL enough to know). If you can't live with these limitations, upgrade to 5.6.
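The refresh steps above could be sketched roughly like this. All table, column and index names here (articles, articles_fts, articles_fts_new, ft_body) are made up for illustration; adapt them to your schema:

```sql
-- Assumed source table: an InnoDB table holding the real data.
CREATE TABLE IF NOT EXISTS articles (
    id   INT UNSIGNED NOT NULL PRIMARY KEY,
    body TEXT NOT NULL
) ENGINE=InnoDB;

-- 1. Create a new MyISAM table.
CREATE TABLE articles_fts_new (
    id   INT UNSIGNED NOT NULL PRIMARY KEY,
    body TEXT NOT NULL
) ENGINE=MyISAM;

-- 2. Copy the data from the InnoDB table.
INSERT INTO articles_fts_new (id, body)
SELECT id, body FROM articles;

-- 3. Build the fulltext index after loading.
CREATE FULLTEXT INDEX ft_body ON articles_fts_new (body);

-- 4. Drop the old search table (if one exists from a previous refresh).
DROP TABLE IF EXISTS articles_fts;

-- 5. Rename the new table into place.
ALTER TABLE articles_fts_new RENAME TO articles_fts;
```

Searches then run against articles_fts and join back to articles on id when the other columns are needed.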
MySQL's full text search offers control of stopwords and other tuning. It's a solid offering that should do the job nicely.

Related

MySQL InnoDB table with a Hash Index

I have a table like this.
ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
Later I created a HASH index like this.
CREATE INDEX index ON table (column) USING HASH;
Later I tried some EXPLAIN queries, like
explain Select * from table where column=132;
And I see that the engine lists the index under possible_keys, and the key column shows the name of the index!
But the docs say that InnoDB doesn't allow hash indexes, so now I wonder why my InnoDB supposedly allows the hash index.
InnoDB silently changes "HASH" into "BTree". A BTree index does what a HASH does, plus more. Or do you think there is some good reason to want Hash?
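You can check this yourself. A minimal sketch, using a made-up table hash_demo and index idx_val (neither is from the question):

```sql
-- Hypothetical table for illustration only.
CREATE TABLE hash_demo (
    id  INT NOT NULL PRIMARY KEY,
    val INT NOT NULL
) ENGINE=InnoDB;

-- The statement is accepted even though InnoDB has no user-defined hash indexes.
CREATE INDEX idx_val ON hash_demo (val) USING HASH;

-- Look at the Index_type column for idx_val: on InnoDB it should
-- report BTREE rather than HASH.
SHOW INDEX FROM hash_demo;
```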
"Good reason" -- MySQL was created many years ago. It was designed to be 'lean and mean'. Many features were boiled down to "one size fits all": BTree for indexing; Nested Loop Join for JOINing, etc.
Meanwhile, for future expansion and pseudo compatibility, some common syntax variants were included -- HASH for indexing, DESC for index ordering, etc. Even though those "lie" about what will happen, the database engine still gives you the 'right' answer.
Over time, the most glaring shortcuts have been remedied.
Replication (3.xx?)
Transactions (Adding InnoDB in 4.0) (MyISAM had LOCK TABLES, but that was not really adequate.)
information_schema (4.1?) (versus a variety of SHOW commands) (Note: 8.0 overhauled it with the "data dictionary")
Character sets and collations (4.1) (vs "latin1_swedish_ci", which was good enough for the implementor.)
Stored routines (vs client code) (5.0)
Subqueries (TEMPORARY TABLEs were not adequate)
Various JOIN optimizations (5.6, 5.7, 8.0)
only_full_group_by (MariaDB 10.1?, 5.7)
ALTER not 'always' copying the table over (mostly 5.7)
"Generated" columns (5.7)
"Tablespaces" (5.7)
JSON datatype and functions
FULLTEXT and SPATIAL indexing in InnoDB (5.6 for FULLTEXT, 5.7 for SPATIAL) (so MyISAM can be deprecated)
DESC in INDEXes (8.0) (very few use cases really need this)
"Windowing" functions (MariaDB 10.2, then MySQL 8.0)
CTEs (MariaDB 10.2, then MySQL 8.0)
Security: Better password handling (4.1?, 5.6, 8.0)
HA (High Availability) (MariaDB with Galera; 8.0 with InnoDB Cluster)
At-rest encryption (8.0?)
Notice how the list is somewhat ordered from "must have" to "nice to have". Yet to come may include
Multi-threaded execution (Useless if you are I/O-bound anyway) (a very few use cases in 8.0)
HASH indexing (and other types) (MariaDB 10.4, only for UNIQUE on TEXT/BLOB)
Global UNIQUE and FOREIGN KEY for PARTITIONing. (Not that partitioning is very useful.)
More syntax compatibility with standards and other vendors (MariaDB already does a much better job of this)
Meanwhile, some things are going away (or have already gone away -- either in MariaDB or MySQL)
Compiling for a large variety of computers -- such as Atari
The Query Cache -- Handy for benchmarking, but not really useful in Production environments. And a major hassle to implement in any 'cluster' topology.
MyISAM has major deficiencies relative to InnoDB, and has very few benefits. (Arguably, the only benefit is less disk space needed.)
The feature in InnoDB is called the Adaptive Hash Index.
Whether InnoDB uses a hash index depends on the size of the table and how frequently it is queried. It's a completely internal strategy and normally not something you configure.
https://dev.mysql.com/doc/refman/5.7/en/innodb-adaptive-hash.html
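If you just want to see whether the feature is switched on for your server, something like the following should work (a sketch; the exact status output varies by version):

```sql
-- Global on/off switch for the adaptive hash index.
SHOW VARIABLES LIKE 'innodb_adaptive_hash_index';

-- The InnoDB monitor output has an
-- "INSERT BUFFER AND ADAPTIVE HASH INDEX" section with usage figures.
SHOW ENGINE INNODB STATUS;
```

This only shows that the mechanism is active; which pages get hashed is still decided by InnoDB itself.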

Creating indexes prior to LOAD DATA for performance in MySQL

The Amazon RDS Customer Data Import Guide for MySQL (written in 2009) provides the following tip to decrease load times for MySQL -
Create all secondary indexes prior to loading. This is counterintuitive for those familiar with other databases. Adding or modifying a secondary index causes MySQL to create a new table with the index changes, copy the data from the existing table to the new table, and drop the original table.
However, there are several articles and Stack Overflow posts from 2010+ that provide performance tests showing that creating indexes after loading is more performant. Where did this tip come from, and did it apply only to an older version of MySQL? If so, please provide exact details. Or does it still apply in specific cases?
The AWS recommendation to put secondary indexes in place before loading the data applied to older MySQL versions (< 5.5) because of the way secondary indexes were handled:
From the MySQL 5.5 docs:
Creating and dropping secondary indexes has traditionally involved
significant overhead from copying all the data in the InnoDB table.
The fast index creation feature of the InnoDB Plugin makes both CREATE
INDEX and DROP INDEX statements much faster for InnoDB secondary
indexes.
MySQL offers the following recommendation in the 5.5 documentation:
Because index maintenance can add performance overhead to many data
transfer operations, consider doing operations such as ALTER TABLE ...
ENGINE=INNODB or INSERT INTO ... SELECT * FROM ... without any
secondary indexes in place, and creating the indexes afterward.
If you use MySQL 5.5 or higher with AWS, you can take advantage of the Fast Index Creation feature, which significantly speeds up secondary index creation.
Fast Index Creation is a capability first introduced in the InnoDB Plugin, now part of the MySQL server in 5.5 and higher, that speeds up
creation of InnoDB secondary indexes by avoiding the need to
completely rewrite the associated table. The speedup applies to
dropping secondary indexes also.
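On 5.5 or higher, the "load first, index afterward" approach could look roughly like this. The table load_demo, its columns, the index names and the file path are all placeholders, not anything from the AWS guide:

```sql
-- Hypothetical table: primary key only, no secondary indexes yet.
CREATE TABLE load_demo (
    id          INT NOT NULL PRIMARY KEY,
    customer_id INT NOT NULL,
    created_at  DATETIME NOT NULL
) ENGINE=InnoDB;

-- Bulk load without secondary index maintenance overhead.
-- The path is a placeholder; the file must be readable by the server.
LOAD DATA INFILE '/tmp/load_demo.csv'
INTO TABLE load_demo
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(id, customer_id, created_at);

-- Add the secondary indexes afterward, in a single statement.
ALTER TABLE load_demo
    ADD INDEX idx_customer (customer_id),
    ADD INDEX idx_created (created_at);
```

It is worth timing both orders on your own data, since the difference depends on table size and MySQL version.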

InnoDB full-text search (no lucene)

I have a problem. I have a managed VPS server running MySQL 5.1.x. I am currently building a new database where I want to store tweets (via search, stream, timeline etc). So I want to use InnoDB database engine because of the row locking! But unfortunately, MySQL 5.1 does not support Full-text search in InnoDB tables.
The problem is that I cannot update my server by myself. So I cannot install MySQL 5.6 (that should support Full-text search) and I cannot install lucene (or solr or whatever).
Are there other options to achieve full-text search in MySQL? Or maybe in PostgreSQL (I have never used that before)?
The only other option I have so far is moving to an unmanaged VPS, but I'd prefer not to :)
You can take a look at the MyISAM engine, but it's not transactional.
The other possible solution is to create another table with the MyISAM or Aria engine and create a relationship between the new table and the "store tweets" table, so that before inserting into the new table you make sure that the "store tweets" table is not locked.

Does indexing a text in mysql improve searching text with LIKE '%search%'

I am using InnoDB engine in mysql version:5.1.61-community-log. The length of text to be searched is less than 100 characters. Does the performance of queries with LIKE '%searchstring%' improve with indexing the text ?
EDIT:
I am using this query for jquery auto-suggest component.
I am using a 3rd-party web hosting service, so upgrading is not an option for me.
No, adding an index won't help your LIKE '%foo%' queries (much*). A LIKE condition can only use the index efficiently if you have a constant prefix, such as LIKE 'foo%'
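You can compare the two cases with EXPLAIN. A small sketch, where the table suggest_demo and index idx_name are invented for the example:

```sql
-- Hypothetical table with an ordinary index on the searched column.
CREATE TABLE suggest_demo (
    id   INT NOT NULL PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    KEY idx_name (name)
) ENGINE=InnoDB;

-- Constant prefix: the optimizer can do a range scan on idx_name.
EXPLAIN SELECT id FROM suggest_demo WHERE name LIKE 'foo%';

-- Leading wildcard: no range scan is possible; at best the whole
-- index (or the whole table) has to be scanned.
EXPLAIN SELECT id FROM suggest_demo WHERE name LIKE '%foo%';
```

On a populated table you would typically see a range access for the first query and a full index or table scan for the second; on an empty or tiny table the plans may look different.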
You should consider a full text search instead. In MySQL 5.5 or older, only the MyISAM engine supports full text searches. In MySQL 5.6 or newer, InnoDB supports them as well.
If you are using an older version of MySQL, are unable to upgrade, and are unable to change the storage engine (because you need other features of InnoDB such as foreign key constraints), then you could consider creating a new MyISAM table which stores only the primary key and the text column. You can use this table to perform fast full text searches and join to the original table if you also need access to the other columns.
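A sketch of that side-table approach, assuming a made-up InnoDB table products and a MyISAM companion products_search (all names and columns are illustrative):

```sql
-- Assumed original table, kept on InnoDB.
CREATE TABLE IF NOT EXISTS products (
    id          INT NOT NULL PRIMARY KEY,
    description VARCHAR(100) NOT NULL,
    price       DECIMAL(10,2) NOT NULL
) ENGINE=InnoDB;

-- MyISAM table holding only the primary key and the text column.
CREATE TABLE products_search (
    product_id  INT NOT NULL PRIMARY KEY,
    description VARCHAR(100) NOT NULL,
    FULLTEXT KEY ft_description (description)
) ENGINE=MyISAM;

-- Populate the search table from the original.
INSERT INTO products_search (product_id, description)
SELECT id, description FROM products;

-- Full text search on the MyISAM copy, joined back for the other columns.
SELECT p.*
FROM products_search AS s
JOIN products AS p ON p.id = s.product_id
WHERE MATCH(s.description) AGAINST('keyboard');
```

You are responsible for keeping products_search in sync with products, for example from application code or with a periodic refresh.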
You could also consider using an external text search engine such as:
Sphinx
Lucene
(*) If the index you add is a covering index for your query, you will get a small improvement by adding an index. Due to the smaller width of the index, a full scan of the index will be faster than a full scan of the entire table.

MySQL FULLTEXT indexes issue

I’m trying to create a FULLTEXT index on an attribute of a table. MySQL returns
ERROR 1214: The used table type doesn’t support FULLTEXT indexes.
Any idea what I’m doing wrong?
You’re using the wrong type of table. MySQL supports a few different types of tables, but the most commonly used are MyISAM and InnoDB. MyISAM tables (and, in MySQL 5.6+, InnoDB tables) are the types that support full-text indexes.
To check your table’s type issue the following sql query:
SHOW TABLE STATUS
Looking at the result returned by the query, find your table and the corresponding value in the Engine column. If this value is anything except MyISAM or InnoDB, MySQL will throw an error if you’re trying to add FULLTEXT indexes.
To correct this, you can use the sql query below to change the engine type:
ALTER TABLE <table name> ENGINE = [MYISAM | INNODB]
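Put together, the check-and-fix sequence might look like this. The table notes, its columns and the index name ft_content are hypothetical, and the example assumes a pre-5.6 server where InnoDB cannot hold FULLTEXT indexes:

```sql
-- Hypothetical table that triggers ERROR 1214 on MySQL <= 5.5.
CREATE TABLE notes (
    id      INT NOT NULL PRIMARY KEY,
    content TEXT NOT NULL
) ENGINE=InnoDB;

-- Check the engine of just this table instead of scanning all of
-- SHOW TABLE STATUS.
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'notes';

-- Switch to an engine that supports FULLTEXT on this server version.
ALTER TABLE notes ENGINE = MyISAM;

-- Now the index can be added.
ALTER TABLE notes ADD FULLTEXT INDEX ft_content (content);
```

Note that switching to MyISAM gives up transactions and foreign keys on that table, so check that nothing depends on them first.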
Additional information (thought it might be useful):
MySQL uses different storage engines to optimize for the functionality that specific tables need. For example, before MySQL 5.5, MyISAM was the default engine on operating systems other than Windows; it performs SELECTs and INSERTs quickly but does not handle transactions. InnoDB was the default on Windows and can be used for transactions, but it requires more disk space on the server.
Up until MySQL 5.6, MyISAM was the only storage engine with support for full-text search (FTS). InnoDB FTS in MySQL 5.6 is syntactically identical to MyISAM FTS. Please read below for more details.
InnoDB Full-text Search in MySQL 5.6
On MySQL <= 5.5, the MySQL manual says that FULLTEXT indexes can only be created on tables with the MyISAM engine.
Are you using InnoDB? Before MySQL 5.6, the only table type that supports FULLTEXT is MyISAM.
Apart from the engine restriction, partitioned tables also do not support full-text indexes.