Create fulltext index on a VIEW - mysql

Is it possible to create a full text index on a VIEW?
If so, given two columns column1 and column2 on a VIEW, what is the SQL to get this done?
The reason I'd like to do this is I have two very large tables, where I need to do a FULLTEXT search of a single column on each table and combine the results. The results need to be ordered as a single unit.
Suggestions?
EDIT: This was my attempt at creating a UNION and ordering by each statements scoring.
(SELECT a_name AS name, MATCH(a_name) AGAINST('$keyword') as ascore
FROM a WHERE MATCH a_name AGAINST('$keyword'))
UNION
(SELECT s_name AS name,MATCH(s_name) AGAINST('$keyword') as sscore
FROM s WHERE MATCH s_name AGAINST('$keyword'))
ORDER BY (ascore + sscore) ASC
sscore was not recognized.

MySQL doesn't allow any form of indexes on a view, just on it's underlying tables. The reason for this is because MySQL only materializes a view when you select from it, due to the possibility of the underlying tables changing data all the time. If you had a view that returned 10 million rows, you'd have to apply a full text index to it every time you selected from it, and that takes a lot of time.
If you want full index functionality, then you might as well stick with the SQL script you've posted, and manually (or cronjob a script to) update the fulltext index of both tables on a nightly basis (or hourly if you're in that high traffic market).

A kindof ugly workaround that might be inefficient but gets the job done would be to join the view with the underlying table to access the fulltext columns:
/* join the view with the underlying table, that is used in the view: */
select item.* from items_view as item
left join items as raw_item on raw_item.id = item.id
/* use the `item` from the view as usual: */
where item.foo = 'bar'
/* but use `raw_item` when you need the fulltext index: */
where match (raw_item.content) against ('foo bar')
Of course, this isn't a relevant solution if you need to create a new column in the view and give it the fulltext index, but if you only need to access the original data in this way and you just need the view for other unrelated operations, then this might be the way to go.

Related

How can i speed up the left join in my query using indexes?

I am new to SQL. At the moment I am experiencing some slower MySQL queries. I think I need to improve my indexes but not sure how.
drop temporary table if exists temp ;
CREATE TEMPORARY TABLE temp
(index idx_a (EXTRACT_DATE, project_id, SERVICE_NAME) )
select distinct DATE(c.EXTRACT_DATETIME) as EXTRACT_DATE,p.project_id, p.project_name, c.CLUSTER_NAME, c.SERVICE_NAME,
UPPER(CONCAT(SUBSTRING_INDEX(c.ENV_NAME, '-', 1),'-',c.CLUSTER_NAME)) as CLUSTER_ID
from p
left join c
on p.project_id = c.project_id ;
The short answer is that you need indexes at least to optimize the lookups done by the JOIN. The explain shows that both tables you are joining are doing a full table scan, then joining them the hard was, using "block nested loop" which indicates it is not using an index.
It would help to at least create an index on c.project_id.
ALTER TABLE c ADD INDEX (project_id);
This would mean there is still a table-scan to read the p table (estimated 5720 rows), but at least when it needs to find the related rows in c, it only reads the rows it needs, without doing a table-scan of 287K rows for each row of p.
The query you posted in an earlier question had another condition:
where DAYNAME(c.EXTRACT_DATETIME) = 'Friday' ;
I don't know why you haven't included this condition in the new question you posted.
If this is still a condition you need to handle, this could help optimize the query further. MySQL 5.7 (which you said in the other question you are using) supports virtual columns, defined for an expression, and you can index virtual columns.
ALTER TABLE c
ADD COLUMN isFriday AS (DAYNAME(EXTRACT_DATETIME) = 'Friday'),
ADD INDEX (isFriday);
Then if you search on the new isFriday column, or even if you search on the same expression used for the virtual column definition, it will use the index.
So what you really need is an index on c that uses both columns, one for the join, and then for the additional condition.
ALTER TABLE c
ADD COLUMN isFriday AS (DAYNAME(EXTRACT_DATETIME) = 'Friday'),
ADD INDEX (project_id, isFriday);
You aren’t filtering on anything other than the outer join column. This leads me to expect that most of the rows in both tables are going to need reading. In order to do this only once, you may be best off using a hash join rather than a nested loop and index. A hash join will allow both tables to be read completely once rather than the back and forth approach of a nested loop which will likely mean the same pages read each time a row is looked up.
In order to use hash joins, you need to be running and a version of MySQL at least above version 8. It would be recommended to use the latest available stable release.

mysql index column order for join

I have two table (requests, results)
requests:
email
results:
email, processed_at
I now want to get all results that have a request with the same email and that have not been processed:
SELECT * FROM results
INNER JOIN requests ON requests.email = results.email
AND results.processed_at IS NULL
I have an index on each individual column, but the query is very slow. So I assume I need a multi column index on results:
I am just not sure which order the columns have to be:
ALTER TABLE results
ADD INDEX results_email_processed_at (email,processed_at)
ALGORITHM=INPLACE LOCK=NONE;
or
ALTER TABLE results
ADD INDEX results_processed_at_email (processed_at,email)
ALGORITHM=INPLACE LOCK=NONE;
Either composite index will be equally beneficial.
However, if you are fetching 40% of the table, then the Optimizer may choose to ignore any index and simply scan the table.
Is that SELECT the actual query? If not, please show us the actual query; a number of seemingly minor changes could make a big difference in optimization options.
Please provide EXPLAIN SELECT ... so we can see what it thinks with the current index(es). And please provide SHOW CREATE TABLE in case there are datatype issues that are relevant.
Not withstanding any indexing issues, you explicitly asked about all requests that WERE NOT processed. You have an INNER JOIN which means I WANT FROM BOTH Sides, so your NULL check in the where would never qualify.
You need a LEFT JOIN to the results table.
As for index, since the join is on the email, I would just have the EMAIL as the primary component of the index. By having a covering index and including the processed_at column would be faster as it would not have to go to the raw data page to qualify the results, but have index specifically ordered as (email, processed_at) so the EMAIL is first qualifier, THEN when it was processed comes along for the ride to complete the query requirement fields.

Index when using OR in query

What is the best way to create index when I have a query like this?
... WHERE (user_1 = '$user_id' OR user_2 = '$user_id') ...
I know that only one index can be used in a query so I can't create two indexes, one for user_1 and one for user_2.
Also could solution for this type of query be used for this query?
WHERE ((user_1 = '$user_id' AND user_2 = '$friend_id') OR (user_1 = '$friend_id' AND user_2 = '$user_id'))
MySQL has a hard time with OR conditions. In theory, there's an index merge optimization that #duskwuff mentions, but in practice, it doesn't kick in when you think it should. Besides, it doesn't give as performance as a single index when it does.
The solution most people use to work around this is to split up the query:
SELECT ... WHERE user_1 = ?
UNION
SELECT ... WHERE user_2 = ?
That way each query will be able to use its own choice for index, without relying on the unreliable index merge feature.
Your second query is optimizable more simply. It's just a tuple comparison. It can be written this way:
WHERE (user_1, user_2) IN (('$user_id', '$friend_id'), ('$friend_id', '$user_id'))
In old versions of MySQL, tuple comparisons would not use an index, but since 5.7.3, it will (see https://dev.mysql.com/doc/refman/5.7/en/row-constructor-optimization.html).
P.S.: Don't interpolate application code variables directly into your SQL expressions. Use query parameters instead.
I know that only one index can be used in a query…
This is incorrect. Under the right circumstances, MySQL will routinely use multiple indexes in a query. (For example, a query JOINing multiple tables will almost always use at least one index on each table involved.)
In the case of your first query, MySQL will use an index merge union optimization. If both columns are indexed, the EXPLAIN output will give an explanation along the lines of:
Using union(index_on_user_1,index_on_user_2); Using where
The query shown in your second example is covered by an index on (user_1, user_2). Create that index if you plan on running those queries routinely.
The two cases are different.
At the first case both columns needs to be searched for the same value. If you have a two column index (u1,u2) then it may be used at the column u1 as it cannot be used at column u2. If you have two indexes separate for u1 and u2 probably both of them will be used. The choice comes from statistics based on how many rows are expected to be returned. If returned rows expected few an index seek will be selected, if the appropriate index is available. If the number is high a scan is preferable, either table or index.
At the second case again both columns need to be checked again, but within each search there are two sub-searches where the second sub-search will be upon the results of the first one, due to the AND condition. Here it matters more and two indexes u1 and u2 will help as any field chosen to be searched first will have an index. The choice to use an index is like i describe above.
In either case however every OR will force 1 more search or set of searches. So the proposed solution of breaking using union does not hinder more as the table will be searched x times no matter 1 select with OR(s) or x selects with union and no matter index selection and type of search (seek or scan). As a result, since each select at the union get its own execution plan part, it is more likely that (single column) indexes will be used and finally get all row result sets from all parts around the OR(s). If you do not want to copy a large select statement to many unions you may get the primary key values and then select those or use a view to be sure the majority of the statement is in one place.
Finally, if you exclude the union option, there is a way to trick the optimizer to use a single index. Create a double index u1,u2 (or u2,u1 - whatever column has higher cardinality goes first) and modify your statement so all OR parts use all columns:
... WHERE (user_1 = '$user_id' OR user_2 = '$user_id') ...
will be converted to:
... WHERE ((user_1 = '$user_id' and user_2=user_2) OR (user_1=user_1 and user_2 = '$user_id')) ...
This way a double index (u1,u2) will be used at all times. Please not that this will work if columns are nullable and bypassing this with isnull or coalesce may cause index not to be selected. It will work with ansi nulls off however.

MySQL Join on 2 tables based on text field

I have two MySQL tables: "list" and "more"
The "list" table has upwards of 100,000 results.
The "more" table has upwards of 50,000 results.
The data from these tables cannot be combined.
The issue is, I have a column called "short_title" in both tables. They are both VARCHAR(255) and will contain a string like "short-title-here".
I'm using a simple query like:
SELECT L.title, M.more_info
FROM `list` L, `more` M
WHERE M.short_title = L.short_title
Since there are so many results in each table, and I need to be matching the results based on the "short_title" column which is a text field, it makes the queries EXTREMELY slow.
There is an INDEX on the "short_title" column in the "list" table, and the "short_title" column in my "more" table is UNIQUE
Is there anything I can do to the column (example: making them fulltext) that will make these queries faster?
Thank you in advance
**** UPDATE ****
I've changed my query to INNER JOIN the two tables.
The results of the explain query can be found here:
Try indexing the short_title fields.
This is your query (written using explicit joins):
SELECT L.title, M.more_info
FROM list L join
more M
on M.short_title = L.short_title;
You state that short_title is being stored as text. You can create a regular index on it by doing:
create index more_short_title on more(short_title(255));
The 255 is the number of bytes to be indexed. This should speed up the queries.
Also, with a name like short_title, why not just use a varchar() column rather than text?

What are views in MySQL?

What are views in MySQL? What's the point of views and how often are they used in the real world?
Normal Views are nothing more then queryable queries.
Example:
You have two tables, orders and customers, orders has the fields id, customer_id, performance_date and customers has id, first_name, last_name.
Now lets say you want to show the order id, performance date and customer name together instead of issuing this query:
SELECT o.id as order_id, c.first_name + ' ' + c.last_name as customer_name,
o.performance_date
FROM orders o inner join customers c
you could create that query as a view and name it orders_with_customers, in your application you can now issue the query
SELECT *
FROM orders_with_customer
One benefit is abstraction, you could alter the way you store the customers name, like inlcuding a middle name, and just change the views query. All applications that used the view continue to do so but include the middle name now.
It's simple: views are virtual tables.
Views are based on SELECT-queries on "real" tables, but the difference is that views do not store the information unlike real tables. A view only references to the tables and combines them the way SELECT says them to. This makes often used queries a lot more simplified.
Here's a simple example for you. Lets suppose you have a table of employees and a table of departments, and you'd like to see their salaries. First you can create a view for the salaries.
CREATE VIEW SALARIES
AS
SELECT e.name,
e.salary,
d.name
FROM employees AS e, deparments as d
WHERE e.depid = d.depid
ORDER BY e.salary DESC
This query lists the name of the employee, his/her salary and department and orders them by their salaries in descending order. When you've done this you can use queries such as:
SELECT * from SALARIES
On a larger scale you could make a view that calculates the average salary of the employees and lists who has a salary that's less than the average salary. In real life such queries are much more complex.
In mysql a view is a stored select statement
Look to this answer: MySQL Views - When to use & when not to
You can think of the view as a on-the-fly generated table. In your queries it behaves like an ordinary table but instead of being stored on the disk, it is created on the fly when it is necessary from a SQL statement that is defined when creating a view.
To create a view, use:
CREATE VIEW first_names AS SELECT first_name FROM people WHERE surname='Smith'
You can then use this view just as an ordinary table. The trick is when you update the table people the first_names view will be updated as well because it is just a result from the SELECT statement.
This query:
SELECT * FROM first_names
will return all the first names of people named Smith in the table people. If you update the people table and re-run the query, you will see the updated results.
Basically, you could replace views with nested SELECT statements. However, views have some advantages:
Shorter queries - nested SELECT statements make the query longer
Improved readability - if the view has a sensible name, the query is much easier to understand
Better speed - the view's SELECT statement is stored in the database engine and is pre-parsed, therefore it doesn't need to be transferred from the client and parsed over and over again
Caching and optimizations - the database engine can cache the view and perform other optimizations
They're a shorthand for common filters.
Say you have a table of records with a deleted column. Normally you wouldn't be interested in deleted records and hence you could make a view called Records that filters out deleted records from the AllRecords table.
This will make your code cleaning since you don't have to append/prepend deleted != 1 to every statement.
SELECT * FROM Records
would return all not-deleted records.
In simple words
In SQL, a view is a virtual table
based on the result-set of an SQL
statement.
A view contains rows and columns, just
like a real table. The fields in a
view are fields from one or more real
tables in the database.
You can add SQL functions, WHERE, and
JOIN statements to a view and present
the data as if the data were coming
from one single table.
Source
In addtion to this, you can insert rows in an underlying table from a view provided that the only one table is referred by the view and columns which are not referred in the view allow nulls.
A view is a way of pre-defining certain queries. It can be queried just like a table, is defined by a query rather than a set of on-disk data. Querying it allows you to query the table's results.
In most cases, querying a view can be seen as equivalent to using the view's defining query as a subquery in your main query. Views allow queries to be shorter and more modular (as the common parts are defined separately in the view). They also provide opportunity for optimization (although not all databases do so; I am not sure if MySQL provides any optimizations to make views faster or not).
If you update the underlying tables, queries against the view will automatically reflect those changes.