I have the following simplified table 'places', which contains 200,000+ rows:
placeId INT(10)
placeName VARCHAR (30)
placeNameEnglish VARCHAR (30)
placeName is a place name stored in the original language, e.g. Rhône
placeNameEnglish is the place name translated into English, e.g. Rhone
Currently I have two single-column indexes, one for placeName and one for placeNameEnglish, and I am running these LIKE pattern queries:
$testStr = 'rho';
SELECT placeId
FROM places
WHERE (placeName LIKE '$testStr%' OR placeNameEnglish LIKE '$testStr%')
I've done some research but can't quite get my head around multi-column indexes in this scenario. The question is: should I combine placeName and placeNameEnglish into a multi-column index or leave them as separate indexes?
UPDATE
Working on implementing the last suggestion by @Gordon Linoff.
Considering adding a table named translations instead of PlaceNames so that the same index can be used for multiple tables, i.e. a persons table that requires the same LIKE 'abc%' matching.
So far:
transId INT
parentId INT - either placeId or personId
parentTypeId TINYINT - either 1 to identify the places table or 2 for the persons table, etc (more tables could use this system at a later date)
languageId INT
transName VARCHAR
Should I also index the parentTypeId to accommodate the extra WHERE condition required to identify the correct parent table?
e.g. WHERE transName LIKE 'abc%' AND parentTypeId = 1
I imagine MySQL works like this: it first uses the index on transName to match transName LIKE 'abc%', then it filters those results using parentTypeId = 1.
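To sanity-check that intuition: with a composite index that puts the equality column (parentTypeId) first and the prefix-searched column (transName) second, the database can satisfy both conditions from the index in one pass, rather than matching on transName and filtering afterwards. A minimal runnable sketch, using SQLite via Python's sqlite3 as a stand-in for MySQL (all sample data invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE translations (
        transId      INTEGER PRIMARY KEY,
        parentId     INTEGER,
        parentTypeId INTEGER,  -- 1 = places, 2 = persons
        languageId   INTEGER,
        transName    TEXT
    )
""")
# Equality column first, then the prefix-searched (LIKE 'abc%') column:
conn.execute("CREATE INDEX idx_type_name ON translations (parentTypeId, transName)")

rows = [
    (1, 10, 1, 1, "Rhone"),       # a place
    (2, 11, 1, 1, "Rhineland"),   # another place
    (3, 20, 2, 1, "Rhona Smith"), # a person with a similar name
]
conn.executemany("INSERT INTO translations VALUES (?, ?, ?, ?, ?)", rows)

# Both conditions can be satisfied from the one composite index;
# the person row is excluded without touching the base table.
matches = conn.execute(
    "SELECT transId FROM translations "
    "WHERE parentTypeId = 1 AND transName LIKE 'Rh%'"
).fetchall()
print(matches)  # the two place rows only
```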
For this query:
SELECT placeId
FROM places
WHERE placeName LIKE '$testStr%' OR placeNameEnglish LIKE '$testStr%';
MySQL could use two indexes, one on places(placeName) and one on places(placeNameEnglish). The operation is called an index merge (see the MySQL documentation on index merge optimization). I wouldn't count on it, though. This query cannot fully use a composite index.
You can rephrase the query as:
SELECT placeId
FROM places
WHERE placeName LIKE '$testStr%'
UNION
SELECT placeId
FROM places
WHERE placeNameEnglish LIKE '$testStr%';
or:
SELECT placeId
FROM places
WHERE placeName LIKE '$testStr%'
UNION ALL
SELECT placeId
FROM places
WHERE placeId NOT IN (SELECT placeId FROM places WHERE placeName LIKE '$testStr%') AND
      placeNameEnglish LIKE '$testStr%';
These can take advantage of the two indexes.
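A minimal runnable sketch of the UNION rewrite (SQLite via Python's sqlite3 standing in for MySQL; note that SQLite does not accent-fold, so 'rho%' only hits the English column here, which is exactly why both columns are searched):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE places (
        placeId          INTEGER PRIMARY KEY,
        placeName        TEXT,
        placeNameEnglish TEXT
    )
""")
# One single-column index per searched column:
conn.execute("CREATE INDEX idx_name ON places (placeName)")
conn.execute("CREATE INDEX idx_name_en ON places (placeNameEnglish)")

conn.executemany(
    "INSERT INTO places VALUES (?, ?, ?)",
    [(1, "Rhône", "Rhone"), (2, "München", "Munich"), (3, "Paris", "Paris")],
)

# Each branch of the UNION can use its own index, and UNION
# de-duplicates rows that happen to match in both columns.
test_str = "rho"
place_ids = conn.execute(
    "SELECT placeId FROM places WHERE placeName LIKE ? || '%' "
    "UNION "
    "SELECT placeId FROM places WHERE placeNameEnglish LIKE ? || '%'",
    (test_str, test_str),
).fetchall()
print(place_ids)  # [(1,)] -- found via placeNameEnglish
```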
My recommendation, though, is to change the structure of your data. Have a table called PlaceNames (or something like that) with these columns:
placeNameId INT,
placeId INT,
languageId INT,
placeName VARCHAR(255)
That is, have a separate row for each language. Your query can then easily take advantage of an index on placeName(placeName).
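A runnable sketch of that layout (again SQLite via Python's sqlite3 as a stand-in; the languageId coding is invented for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PlaceNames (
        placeNameId INTEGER PRIMARY KEY,
        placeId     INTEGER,
        languageId  INTEGER,   -- e.g. 1 = local name, 2 = English (invented coding)
        placeName   TEXT
    )
""")
conn.execute("CREATE INDEX idx_placename ON PlaceNames (placeName)")

# One row per language for the same place:
conn.executemany(
    "INSERT INTO PlaceNames VALUES (?, ?, ?, ?)",
    [(1, 100, 1, "Rhône"), (2, 100, 2, "Rhone")],
)

# A single indexed column now covers every language:
ids = conn.execute(
    "SELECT DISTINCT placeId FROM PlaceNames WHERE placeName LIKE 'rho%'"
).fetchall()
print(ids)  # [(100,)] -- matched via the English row
```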
For your original question: Two separate INDEXes. But... You are working too hard:
For European place names, you don't need to search both columns. The case folding and accent insensitivity of utf8_unicode_ci (or virtually any collation other than utf8_bin) will do what you need:
mysql> SELECT 'Rhône' LIKE '%rho%', 'Rhône' LIKE '%xyz%';
+-----------------------+-----------------------+
| 'Rhône' LIKE '%rho%' | 'Rhône' LIKE '%xyz%' |
+-----------------------+-----------------------+
|                     1 |                     0 |
+-----------------------+-----------------------+
Edit: Based on the OP's comment, this is not a complete solution.
Related
Hello, I have a string that is stored in my database separated by commas,
e.g. (new south wales,Queensland,etc,etc).
My problem is that when I try to search for Queensland I am not able to get the result, but when I search for new south wales I get the record.
I also want to get the result when I search for queen, and so on.
I am new to PHP, so please help.
Short Term Solution
Use the FIND_IN_SET function:
WHERE FIND_IN_SET('Queensland', csv_column)
...because using LIKE with wildcards on both ends is risky, depending on how much or how little matches (and it also guarantees a table scan). Performance of LIKE with wildcards on both sides is on par with REGEXP, which is to say: bad.
Long Term Solution
Don't store comma separated values -- use a proper many-to-many relationship, involving three tables:
Things
thing_id (primary key)
Australian States
State_id (primary key)
State_name
Things_to_Auz_States
thing_id (primary key, foreign key to THINGS table)
State_id (primary key, foreign key to AUSTRALIAN_STATES table)
You'll need JOINs to get data out of the three tables, but if you want to know things like how many are associated to a particular state, or two particular states, it's the proper model.
Not really what you were asking, but just to be complete: you're going to have a lot of trouble unless you change your approach.
The correct way:
TableOne
--------
ThingID
TableTwo
--------
ThingID
Province
Then your database query becomes:
SELECT fields FROM TableOne WHERE ThingID IN
(SELECT ThingID from TableTwo WHERE Province = 'Queensland')
And what do you want to have happen when they search for "Australia"? Get back both Western Australia and South Australia?
By using REGEXP (note the quotes around the search string, without which the query is a syntax error; note also that the old mysql_* functions are deprecated in favor of mysqli or PDO with bound parameters):
$result = mysql_query("SELECT * FROM table WHERE column REGEXP '$your_search_string'");
I store destinations a user is willing to ship a product to in a varchar field like this:
userId | destinations | product
-------+--------------+-----------
1      | US,SE,DE     | apples
2      | US,SE        | books
3      | US           | mushrooms
1      | SE,DE        | figs
2      | UK           | Golf Balls
I was hoping this query would return all rows where US was present. Instead it returns only a single row.
select * from destinations where destinations IN('US');
How do I get this right? Am I using the wrong column type, or is it my query that's failing?
Current Results
US
Expected Results
US,SE,DE
US,SE
US
Try with FIND_IN_SET
select * from destinations where FIND_IN_SET('US',destinations);
Unfortunately, the way you've structured your table, you'll have to check for a pattern match for "US" in your string at the beginning, middle, or end.
One way you can do that is using LIKE, as follows:
SELECT *
FROM destinations
WHERE destinations LIKE ('%US%');
Another way is using REGEXP:
SELECT *
FROM destinations
WHERE destinations REGEXP '.*US.*';
Yet another is using FIND_IN_SET, as explained by Sadkhasan.
CAVEAT
None of these will offer great performance or data integrity, though. And they will all COMPOUND their performance problems when you add criteria to your search.
E.g. using FIND_IN_SET, proposed by Sadkhasan, you would have to do something like:
SELECT * FROM destinations
WHERE FIND_IN_SET('US',destinations)
OR FIND_IN_SET('CA',destinations)
OR FIND_IN_SET('ET',destinations);
Using REGEXP is a little better, though REGEXP is innately slow (note that | binds loosely, so wrapping the pattern in .* adds nothing):
SELECT *
FROM destinations
WHERE destinations REGEXP 'US|CA|ET';
SO WHAT NOW?
Your best bet would be switching to a 3NF design with destinations applying to products by splitting into 2 tables that you can join, e.g.:
CREATE TABLE products (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  userId INT NOT NULL REFERENCES users(id),
  name VARCHAR(255) NOT NULL
) ENGINE=InnoDB;
Then you would add what's called a composite key table, each row containing a productId and a single country, with one row per country.
CREATE TABLE product_destinations (
  productId INT NOT NULL REFERENCES products(id),
  country VARCHAR(2) NOT NULL,
  PRIMARY KEY (productId, country)
) ENGINE=InnoDB;
Data in this table would look like:
productId | country
----------|--------
1 | US
1 | CA
1 | ET
2 | US
2 | GB
Then you could structure a query like this:
SELECT p.*
FROM products AS p
INNER JOIN product_destinations AS d
ON p.id = d.productId
WHERE d.country IN ('US', 'CA', 'ET')
GROUP BY p.id;
It's important to add the GROUP BY (or DISTINCT in the SELECT clause), as a single product may ship to multiple countries, resulting in multiple matching rows; aggregation reduces those to a single result per product id.
An added bonus is you don't have to UPDATE your countries column and do string operations to determine if the country already exists there. You can let the database do that for you, and INSERT - preventing locking issues that will further compound your problems.
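A runnable sketch of the two tables and the de-duplicating query above (SQLite via Python's sqlite3; MySQL-specific bits like ENGINE=InnoDB are dropped):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        userId INTEGER NOT NULL,
        name TEXT NOT NULL
    );
    CREATE TABLE product_destinations (
        productId INTEGER REFERENCES products(id),
        country TEXT NOT NULL,
        PRIMARY KEY (productId, country)
    );
    INSERT INTO products VALUES (1, 1, 'apples'), (2, 2, 'books');
    INSERT INTO product_destinations VALUES
        (1, 'US'), (1, 'CA'), (1, 'ET'), (2, 'US'), (2, 'GB');
""")

# Without GROUP BY, product 1 would appear three times (US, CA and ET all match):
rows = conn.execute("""
    SELECT p.id, p.name
    FROM products AS p
    JOIN product_destinations AS d ON p.id = d.productId
    WHERE d.country IN ('US', 'CA', 'ET')
    GROUP BY p.id
""").fetchall()
print(rows)  # one row per product
```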
You can use this if your destination codes always have just two characters per country:
SELECT * FROM destinations WHERE destinations LIKE ('%US%')
To add another country:
SELECT * FROM destinations WHERE destinations LIKE ('%US%')
AND destinations LIKE ('%SE%')
Use AND or OR depending on the result you want.
I have a district table, in which we store a user's preferred districts in a district_id (varchar(250)) column. The values stored in this field look like 1 2 5 6 1, joined with \n. So please tell me, how can I search in this specific column?
Don't. Your design is absolutely horrible and this is why you are having this issue in the first place.
When you have a N-N relationship (a user can have many preferred districts and each district can be preferred by many users) you need to make a middle table with foreign keys to both tables.
You need:
A table for districts with only information about districts.
A table with users with only information about users.
A table for preferred districts by user with the district number and the user id as columns and foreign key constraints. This will make sure that any user can have an unlimited number of preferred districts with easy querying.
I would not recommend performing searches on data stored that way, but if you are stuck, it can be done with regular expressions.
You also have to handle matches at the start and end of the string, so a plain LIKE is not going to work.
MySQL Regular Expressions
Give this SQL a try. To search for the number 5, anchored so that values like 15 or 51 don't match:
SELECT * FROM `TABLE` WHERE `field` REGEXP '(^|\\n)5(\\n|$)';
If you want to match using the LIKE feature, it takes multiple conditions:
SELECT * FROM `TABLE` WHERE `field` LIKE '%\\n5\\n%' OR `field` LIKE '5\\n%' OR `field` LIKE '%\\n5' OR `field` = '5';
Note that you have to use a double \ to escape for a new line.
Easiest way is to just use a LIKE query, like this:
SELECT * FROM `preferred_districts` WHERE `district_id` LIKE '%6%';
To make sure it's the right one you'll receive (because this will also match ids 16, 26, 674, etc.) you'll have to check manually whether it's correct. In PHP (dunno if you use it) you could use the snippet below:
$id_field = "1\n2\n5\n6\n17";
$ids = explode("\n", $id_field);
if (in_array(6, $ids)) {
    echo 'Yup, found the right one';
}
Important: Although the above will work, your database design isn't what it should be. You should create what is sometimes called a pivot table between the districts and the users, something like below.
(Table 'users_preferred_districts')
user_id | district_id
--------+------------
2 | 1
2 | 17
9 | 21
Like this it's quite easy to retrieve the records you want...
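For example, with the pivot table above, both directions of lookup become plain indexed queries (SQLite via Python's sqlite3; sample rows taken from the table above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users_preferred_districts (
        user_id INTEGER,
        district_id INTEGER,
        PRIMARY KEY (user_id, district_id)
    )
""")
conn.executemany(
    "INSERT INTO users_preferred_districts VALUES (?, ?)",
    [(2, 1), (2, 17), (9, 21)],
)

# Which users prefer district 17? An indexed lookup, no string parsing:
users = conn.execute(
    "SELECT user_id FROM users_preferred_districts WHERE district_id = ?", (17,)
).fetchall()
print(users)  # [(2,)]

# And all of user 2's preferred districts:
districts = conn.execute(
    "SELECT district_id FROM users_preferred_districts WHERE user_id = ?", (2,)
).fetchall()
print(districts)  # districts 1 and 17
```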
I have used the MySQL function FIND_IN_SET() and got the desired result through it.
I got help from this tutorial.
http://www.w3resource.com/mysql/string-functions/mysql-find_in_set-function.php
I do not fully understand indexes and would like some precisions.
I have a table, named posts, which overtime might become very big.
Each post belongs to a category and a language, through 2 columns category_id and lang
If I create indexes on the columns category_id and lang, does this mean that the posts table will be "organized"/"classified" in MySQL by "blocks" of category_id and lang, giving better performance when I specify a category_id and/or a lang in my query?
Which type of index should be created, then?
I hope I'm clear enough here...
What an index does is create a "shadow" table consisting of only the indexed values, so MySQL only has to look through the index to find what you're looking for.
If you're doing a query with a WHERE like this:
WHERE zipcode = 555 AND phone = 12345678
you will want a composite index on (zipcode, phone).
If the query is only:
WHERE zipcode = 555
an index on zipcode alone is enough (and the composite index above also covers this case, since zipcode is its leftmost column).
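A sketch of the composite-index case (SQLite via Python's sqlite3; the EXPLAIN QUERY PLAN output format is SQLite-specific, but MySQL's EXPLAIN tells the same story): a single index on (zipcode, phone) serves both shapes of WHERE clause, because zipcode is the leftmost column of the index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts (id INTEGER PRIMARY KEY, zipcode INTEGER, phone INTEGER)"
)
# One composite index, with zipcode leftmost:
conn.execute("CREATE INDEX idx_zip_phone ON contacts (zipcode, phone)")

queries = [
    "SELECT * FROM contacts WHERE zipcode = 555 AND phone = 12345678",
    "SELECT * FROM contacts WHERE zipcode = 555",  # leftmost prefix: still indexed
]
for sql in queries:
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The plan names idx_zip_phone in both cases.
    print(plan[-1][-1])
```

A query on phone alone, by contrast, cannot seek into this index and would need its own.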
Background: I have a table with at most 2,000 rows, and the user should be able to search up to 6 columns.
I don't know in advance what they're looking for, and I want a combined search (search1 AND search2 AND ...).
Problem: In these columns I have an ID, not the plain description (i.e. I have the id of the town, not its name). So I was thinking about two solutions:
Create another table where I put keywords (1 key/row) and then search there using LIKE 'search1%' OR LIKE 'search2%' ...
Add a field to the existing table where I put all the keywords and then use a FULLTEXT index on that.
Which one is best? I know that with so few rows there won't be big performance problems, but I hope they'll grow :)
Example
This is my table:
ID | TOWN  | TYPE | ADDRESS
---+-------+------+-----------------
11 | 14132 | 3    | baker street 220
13 | 45632 | 8    | main street 12
14132 = London
45632 = New York
3 = Customer
8 = Admin
The user typing "London Customer" should find the first row.
If you're simply going to use a series of LIKEs, then I'd have thought it would make sense to make use of a FULLTEXT index, the main reason being that it would let you use more complex boolean queries in the future. (As @Quassnoi states, you can simply create an index if you don't have a use for a specific field.)
However, it should be noted that fulltext has its limitations - words that are common across all rows have a low "score" and hence won't match as prominently as if you'd carried out a series of LIKEs. (On the flipside, you can of course get a "score" back from a FULLTEXT query, which may be of use depending on how you want to rank the results.)
You don't have to create a separate field, since a FULLTEXT index can be created on multiple fields:
CREATE FULLTEXT INDEX fx_mytable_fields ON mytable (field1, field2, field3);
SELECT *
FROM mytable
WHERE MATCH(field1, field2, field3) AGAINST ('+search1 +search2')
This will return all records that contain search1 and search2 in either of the fields, like this:
field1 field2 field3
-- -- --
search1 something search2
or this:
field1 field2 field3
-- -- --
search1 search2 something something else
Given you've got the data in separate tables, you'd have to have a FULLTEXT index on each of the searchable fields in each table. After that, it's just a matter of building the query with the appropriate JOINs in place so you can fulltext MATCH AGAINST the text version of the field, and not the foreign key number.
SELECT user.id, user.name, town.name
FROM user
LEFT JOIN town ON user.town = town.id
WHERE MATCH(user.name) AGAINST (...) OR MATCH(town.name) AGAINST (...)
(A single MATCH() can only reference columns from one table's FULLTEXT index, hence the two separate MATCH calls.)