I store destinations a user is willing to ship a product to in a varchar field like this:
userId | destinations | product
-------|--------------|------------
1      | US,SE,DE     | apples
2      | US,SE        | books
3      | US           | mushrooms
1      | SE,DE        | figs
2      | UK           | Golf Balls
I was hoping this query would return all rows where US was present. Instead it returns only a single row.
select * from destinations where destinations IN('US');
How do I get this right? Am I using the wrong column type, or is my query at fault?
Current Results
US
Expected Results
US,SE,DE
US,SE
US
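To see why IN behaves this way, here is a minimal sketch using Python's built-in sqlite3 module standing in for MySQL (the semantics of IN are the same in both): IN compares the whole column value against each list item, so only the row whose entire string equals 'US' matches.

```python
import sqlite3

# Reconstruction of the table from the question, in an in-memory database.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE destinations (userId INT, destinations TEXT, product TEXT)")
con.executemany(
    "INSERT INTO destinations VALUES (?, ?, ?)",
    [(1, "US,SE,DE", "apples"), (2, "US,SE", "books"),
     (3, "US", "mushrooms"), (1, "SE,DE", "figs"), (2, "UK", "Golf Balls")],
)

# IN tests whole-string equality, not membership in the comma list,
# so only the row whose entire value is 'US' comes back.
rows = con.execute(
    "SELECT destinations FROM destinations WHERE destinations IN ('US')"
).fetchall()
print(rows)  # [('US',)]
```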
Try with FIND_IN_SET
select * from destinations where FIND_IN_SET('US',destinations);
Unfortunately, the way you've structured your table, you'll have to check for a pattern match for "US" in your string at the beginning, middle, or end.
One way you can do that is using LIKE, as follows:
SELECT *
FROM destinations
WHERE destinations LIKE ('%US%');
Another way is using REGEXP:
SELECT *
FROM destinations
WHERE destinations REGEXP '.*US.*';
Yet another is using FIND_IN_SET, as explained by Sadkhasan.
CAVEAT
None of these will offer great performance or data integrity, though. And they will all COMPOUND their performance problems when you add criteria to your search.
E.g. using FIND_IN_SET, proposed by Sadkhasan, you would have to do something like:
SELECT * FROM destinations
WHERE FIND_IN_SET('US',destinations)
OR FIND_IN_SET('CA',destinations)
OR FIND_IN_SET('ET',destinations);
Using REGEXP is a little better, though REGEXP is innately slow:
SELECT *
FROM destinations
WHERE destinations REGEXP 'US|CA|ET';
SO WHAT NOW?
Your best bet would be switching to a 3NF design with destinations applying to products by splitting into 2 tables that you can join, e.g.:
CREATE TABLE products (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
userId INT NOT NULL REFERENCES users(id),
name VARCHAR(255) NOT NULL
) ENGINE=InnoDB;
Then you would add a junction table with a composite key, each row containing a productId and a single country, one row per country.
CREATE TABLE product_destinations (
productId INT NOT NULL REFERENCES products(id),
country VARCHAR(2) NOT NULL,
PRIMARY KEY (productId, country)
) ENGINE=InnoDB;
Data in this table would look like:
productId | country
----------|--------
1 | US
1 | CA
1 | ET
2 | US
2 | GB
Then you could structure a query like this:
SELECT p.*
FROM products AS p
INNER JOIN product_destinations AS d
ON p.id = d.productId
WHERE d.country IN ('US', 'CA', 'ET')
GROUP BY p.id;
It's important to add the GROUP BY (or a DISTINCT in the SELECT clause): a single product may ship to multiple countries, producing multiple matching rows, and the aggregation reduces those to one result per product id.
An added bonus is that you no longer have to UPDATE a countries column and do string operations to check whether a country is already present. You simply INSERT a row and let the database enforce uniqueness, avoiding locking issues that would further compound your problems.
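The normalized design and query above can be sketched end to end with Python's sqlite3 standing in for MySQL (the sample products and countries are invented for illustration):

```python
import sqlite3

# Two-table design: products, plus a junction table with one row per
# (product, country) pair and a composite primary key.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    userId INT NOT NULL,
    name VARCHAR(255) NOT NULL
);
CREATE TABLE product_destinations (
    productId INT NOT NULL REFERENCES products(id),
    country VARCHAR(2) NOT NULL,
    PRIMARY KEY (productId, country)
);
INSERT INTO products VALUES (1, 1, 'apples'), (2, 2, 'books');
INSERT INTO product_destinations VALUES
    (1, 'US'), (1, 'CA'), (1, 'ET'), (2, 'US'), (2, 'GB');
""")

# One row per matching product, even when it ships to several of the
# requested countries -- the GROUP BY collapses the duplicate joins.
rows = con.execute("""
    SELECT p.id, p.name
    FROM products AS p
    INNER JOIN product_destinations AS d ON p.id = d.productId
    WHERE d.country IN ('US', 'CA', 'ET')
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()
print(rows)  # [(1, 'apples'), (2, 'books')]
```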
You can use this if your destinations are stored as two-character country codes.
SELECT * FROM destinations WHERE destinations LIKE ('%US%')
to add other country
SELECT * FROM destinations WHERE destinations LIKE ('%US%')
AND destinations LIKE ('%SE%')
^^^--> use AND or OR depending on the result you want.
Related
I'm debating between using a CASE statement or a lookup table to replace text from table2.columnB when table1.columnB = table2.columnA. I'd rather use a lookup table because it's easier to manage.
Our database pulls all the customer order information from our online store. It receives all the state names in full and I need to replace all instances of U.S. states with their 2-character abbreviation. (e.g. Texas -> TX)
How would I use a lookup table with this query for State?
Here's my query: http://sqlfiddle.com/#!9/e44aa3/12/0
Thank you in advance!
For your question how would add the lookup table in your code, you must add this join:
LEFT JOIN `state_abbreviations` AS `sa` ON `sa`.`shipping_zone` = `o`.`shipping_zone`
and change this line:
`o`.`shipping_zone` AS `State`
with:
COALESCE(`sa`.`zone_abbr`, `o`.`shipping_zone`) AS `State`
so you get the abbreviation returned.
See the demo.
Results:
Order ID | Name        | State | Qty | Option | Size | Product | Ref
---------|-------------|-------|-----|--------|------|---------|---------------
12345    | Mason Sklut | NC    | 1   | R      | L    | Tee     | R / Tee L
12346    | John Doe    | OH    | 2   | Bl     | S    | Hood    | 2x Bl / Hood S
Using a CASE expression is certainly an option. However, it does not scale well: there are 50+ states in the US, so you would need to write 50+ WHEN branches, like:
case state
when 'North Carolina' then 'NC'
when 'Ohio' then 'OH'
when ...
end
Creating a mapping table seems like a better idea. It is also a good way to enforce referential integrity (ie ensure that the names being used really are state names).
That would look like:
create table states (
code varchar(2) not null primary key,
name varchar(100) not null
);
In your original table, you want to have a column that stores the state code, with a foreign key constraint that references states(code) (you may also store the state name, but this looks like a less efficient option in terms of storage).
You can do the mapping in your queries with a join:
select t.*, s.name state_name
from mytable t
inner join states s on s.code = t.state_code
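Both answers above (the LEFT JOIN with COALESCE, and the mapping table) can be sketched together with Python's sqlite3 standing in for MySQL. The `orders` table and its sample rows are assumptions standing in for the real order schema; the COALESCE falls back to the raw value when no abbreviation is found:

```python
import sqlite3

# Lookup table mapping full state names to 2-character abbreviations,
# plus a made-up orders table to join against.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE states (
    code VARCHAR(2) NOT NULL PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);
CREATE TABLE orders (id INT, shipping_zone VARCHAR(100));
INSERT INTO states VALUES ('NC', 'North Carolina'), ('OH', 'Ohio');
INSERT INTO orders VALUES
    (12345, 'North Carolina'),
    (12346, 'Ohio'),
    (12347, 'Ontario');
""")

# LEFT JOIN keeps orders with no matching state; COALESCE then falls
# back to the original shipping_zone text ('Ontario' stays as-is).
rows = con.execute("""
    SELECT o.id, COALESCE(s.code, o.shipping_zone) AS State
    FROM orders AS o
    LEFT JOIN states AS s ON s.name = o.shipping_zone
    ORDER BY o.id
""").fetchall()
print(rows)  # [(12345, 'NC'), (12346, 'OH'), (12347, 'Ontario')]
```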
We want to select customers based on following parameters i.e. customer should be in:
specific city i.e. cityId=1,2,3...
specific customerId should be excluded i.e. customerId=33,2323,34534...
specific age i.e. 5 years, 7 years, 72 years...
This inclusion & exclusion list can be any long.
How should we design database for this:
Create separate table 'customerInclusionCities' for these inclusion cities and do like:
select * from customers where cityId in (select cityId from customerInclusionCities)
Some we do for age, create table 'customerEligibleAge' with all entries of eligible age entries:
i.e. select * from customers where age in (select age from customerEligibleAge)
and Create separate table 'customerIdToBeExcluded' for excluding customers:
i.e. select * from customers where customerId not in (select customerId from customerIdToBeExcluded)
OR
Create One table with Category and Ids.
i.e. Category1 for cities, Category2 for CustomerIds to be excluded.
Which approach is better, creating one table for these parameters OR creating separate tables for each list i.e. age, customerId, city?
IN ( SELECT ... ) can be very slow. Do your query as a single SELECT without subqueries. I assume all 3 columns are in the same table? (If not, that adds complexity.) The WHERE clause will probably have 3 IN ( constants ) clauses:
SELECT ...
FROM tbl
WHERE cityId IN (1,2,3...)
AND customerId NOT IN (33,2323,34534...)
AND age IN (5, 7, 72)
Have (at least):
INDEX(cityId),
INDEX(age)
(Negated things are unlikely to be able to use an index.)
The query will use one of the indexes; having both will give the Optimizer a choice of which it thinks is better.
Or...
SELECT c.*
FROM customers AS c
JOIN cityEligible AS b ON b.city = c.city
JOIN customerEligibleAge AS ce ON c.age = ce.age
LEFT JOIN customerIdToBeExcluded AS ex ON c.customerId = ex.customerId
WHERE ex.customerId IS NULL
Suggested indexes (probably as PRIMARY KEY):
customers: (city)
customerEligibleAge: (age)
customerIdToBeExcluded: (customerId)
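The JOIN-based variant, including the LEFT JOIN ... IS NULL exclusion pattern, can be sketched with Python's sqlite3 standing in for MySQL (all sample customers and IDs are invented):

```python
import sqlite3

# Inclusion lists as tables, exclusion handled by an anti-join.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customers (customerId INTEGER PRIMARY KEY, city INT, age INT);
CREATE TABLE cityEligible (city INTEGER PRIMARY KEY);
CREATE TABLE customerEligibleAge (age INTEGER PRIMARY KEY);
CREATE TABLE customerIdToBeExcluded (customerId INTEGER PRIMARY KEY);
INSERT INTO customers VALUES (10, 1, 5), (11, 1, 5), (12, 2, 7), (13, 9, 5);
INSERT INTO cityEligible VALUES (1), (2);
INSERT INTO customerEligibleAge VALUES (5), (7);
INSERT INTO customerIdToBeExcluded VALUES (11);
""")

# Inner joins enforce the inclusion lists; the LEFT JOIN plus
# "ex.customerId IS NULL" keeps only customers with no exclusion row.
rows = con.execute("""
    SELECT c.customerId
    FROM customers AS c
    JOIN cityEligible AS b ON b.city = c.city
    JOIN customerEligibleAge AS ce ON c.age = ce.age
    LEFT JOIN customerIdToBeExcluded AS ex ON c.customerId = ex.customerId
    WHERE ex.customerId IS NULL
    ORDER BY c.customerId
""").fetchall()
print(rows)  # customer 11 is excluded, customer 13's city is not eligible
```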
In order to discuss further, please provide SHOW CREATE TABLE for each table and EXPLAIN SELECT ... for any of the queries that actually work.
If you use the database only for that operation, I recommend the first solution; it is also very simple to deploy.
The second solution clutters the database with junk.
I have the following simplified table 'places', which contains 200,000+ rows:
placeId INT(10)
placeName VARCHAR (30)
placeNameEnglish VARCHAR (30)
placeName is a place name stored in the original language e.g. Rhône
placeNameEnglish is a place name translated into english e.g. Rhone
Currently I have two single column indexes - one for placeName and one for placeNameEnglish and am conducting these LIKE pattern queries:
$testStr = 'rho';
SELECT placeId
FROM places
WHERE (placeName LIKE '$testStr%' OR placeNameEnglish LIKE '$testStr%')
Done some research but can't quite get my head around multi-column indexes when used in this scenario. Question is, should I combine placeName and placeNameEnglish into a multi-column index or leave them as separate indexes?
UPDATE
Working on implementing the last approach suggested by Gordon Linoff.
Considering adding a table named translations instead of placeNames so that the same index can be used for multiple tables, i.e. a persons table that requires the same LIKE 'abc%' matching.
So far:
transId INT
parentId INT - either placeId or personId
parentTypeId TINYINT - either 1 to identify the places table or 2 for the persons table, etc (more tables could use this system at a later date)
languageId INT
transName VARCHAR
Should I also index the parentTypeId to accommodate the extra WHERE condition required to identify the correct parent table?
e.g. WHERE transName LIKE 'abc%' AND parentTypeId = 1
I imagine MySQL works like this: it first uses the index on transName to match transName LIKE 'abc%', then filters the results using parentTypeId = 1.
For this query:
SELECT placeId
FROM places
WHERE placeName LIKE '$testStr%' OR placeNameEnglish LIKE '$testStr%';
MySQL could use two indexes, one on places(placeName) and one on places(placeNameEnglish), via an operation called index merge (see here). I wouldn't count on it, though. This query cannot fully use a composite index.
You can rephrase the query as:
SELECT placeId
FROM places
WHERE placeName LIKE '$testStr%'
UNION
SELECT placeId
FROM places
WHERE placeNameEnglish LIKE '$testStr%';
or:
SELECT placeId
FROM places
WHERE placeName LIKE '$testStr%'
UNION ALL
SELECT placeId
FROM places
WHERE placeId NOT IN (SELECT placeId FROM places WHERE placename LIKE '$testStr%') AND
placeNameEnglish LIKE '$testStr%';
These can take advantage of the two indexes.
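The UNION rewrite can be sketched with Python's sqlite3 standing in for MySQL (sample place names are invented). The point to notice is that a place matching in both columns still appears only once, because UNION deduplicates:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE places (placeId INTEGER PRIMARY KEY,"
    " placeName TEXT, placeNameEnglish TEXT)"
)
con.executemany(
    "INSERT INTO places VALUES (?, ?, ?)",
    [(1, "Rhone", "Rhone"), (2, "Roma", "Rome"), (3, "Paris", "Paris")],
)

# Each branch can use its own single-column index; UNION (unlike
# UNION ALL) removes the duplicate hit on placeId 1.
rows = con.execute("""
    SELECT placeId FROM places WHERE placeName LIKE 'rho%'
    UNION
    SELECT placeId FROM places WHERE placeNameEnglish LIKE 'rho%'
""").fetchall()
print(rows)  # [(1,)] -- one row despite matching both branches
```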
My recommendation, though, is to change the structure of your data. Have a table called PlaceNames (or something like that) with these columns:
placeNameId INT,
placeId INT,
languageId INT,
placeName VARCHAR(255)
That is, have a separate row for each language. Your query can then easily take advantage of an index on placeName(placeName).
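The restructured table can be sketched with Python's sqlite3 standing in for MySQL (the sample rows and languageId values are invented; note that SQLite's default LIKE is accent-sensitive, unlike MySQL's utf8 collations, which the one-row-per-language design conveniently sidesteps):

```python
import sqlite3

# One row per (place, language); a single index on placeName now
# serves prefix searches across every language.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE PlaceNames (
    placeNameId INTEGER PRIMARY KEY,
    placeId INT,
    languageId INT,
    placeName VARCHAR(255)
);
CREATE INDEX idx_placename ON PlaceNames(placeName);
INSERT INTO PlaceNames VALUES
    (1, 10, 1, 'Rhône'), (2, 10, 2, 'Rhone'), (3, 20, 2, 'Rome');
""")

# The English spelling matches the prefix; DISTINCT collapses the
# result to one row per place.
rows = con.execute(
    "SELECT DISTINCT placeId FROM PlaceNames WHERE placeName LIKE 'Rho%'"
).fetchall()
print(rows)  # [(10,)]
```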
For your original question: Two separate INDEXes. But... You are working too hard:
For European place names, you don't need to search both columns. The case folding and accent insensitivity of utf8_unicode_ci (or virtually any collation other than utf8_bin) will do what you need:
mysql> SELECT 'Rhône' LIKE '%rho%', 'Rhône' LIKE '%xyz%';
+-----------------------+-----------------------+
| 'Rhône' LIKE '%rho%' | 'Rhône' LIKE '%xyz%' |
+-----------------------+-----------------------+
|                     1 |                     0 |
+-----------------------+-----------------------+
Edit Based on OP's comment, this is not a complete solution.
I use MySQL and I have a members table with a BLOB 'contacts' field containing a comma separated list of other members' IDs:
TABLE members:
id_member = 1
firstname = 'John'
contacts (BLOB) = '4,6,7,2,5'
I want to retrieve all the first names in the 'contacts' list of an individual, with a single query. I tried the following:
SELECT firstname from members WHERE id_member IN ( SELECT contacts FROM members WHERE id_member = 1 );
It returns only one row, but when I try:
SELECT firstname from members WHERE id_member IN ( 4,6,7,2,5 );
It returns all the first names from the list. I can use two queries to achieve this, but I thought I'd double check if there's a way to make it work with one simple, elegant query.
Thanks for reading, any help appreciated.
Jul
That seems like a very poor table design. Is it possible to change it?
If you can't change the design then you can handle comma separated values in MySQL by using FIND_IN_SET but it won't be able to use indexes efficiently:
SELECT firstname
FROM members
WHERE FIND_IN_SET(id_member, (SELECT contacts FROM members WHERE id_member = 1))
But rather than going this route, I'd strongly recommend that if possible you normalize your database. Consider using a join table instead of a comma separated list. Then you can find the entries you need by using joins and the search will be able to use an index.
If you're using a serialized BLOB type column to store these values then you're not going to be able to do what you want. A more SQL friendly approach is to create a relationship table that can be used as part of a JOIN operation, such as a member_contacts table that holds an association between one id_member value and another.
Expanding your comma separated list into individual records is a pretty simple mechanical process.
Can you change this DB structure? The contacts field really should be a related table rather than a column. Assuming a contacts table with this structure:
id_contact
id_member
Then you would use EXISTS instead:
SELECT m.firstname FROM members m WHERE EXISTS (SELECT 1 FROM contacts c WHERE c.id_member = 1 AND c.id_contact = m.id_member);
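A sketch of that normalized contacts design, using Python's sqlite3 standing in for MySQL (the member names and IDs are invented for illustration):

```python
import sqlite3

# Junction table: one row per (member, contact) pair instead of a
# comma separated BLOB.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE members (id_member INTEGER PRIMARY KEY, firstname TEXT);
CREATE TABLE contacts (id_member INT, id_contact INT);
INSERT INTO members VALUES (1, 'John'), (2, 'Ann'), (4, 'Bea'), (5, 'Max');
INSERT INTO contacts VALUES (1, 4), (1, 5), (2, 1);
""")

# First names of member 1's contacts, via EXISTS on the join table.
rows = con.execute("""
    SELECT m.firstname
    FROM members AS m
    WHERE EXISTS (SELECT 1 FROM contacts AS c
                  WHERE c.id_member = 1 AND c.id_contact = m.id_member)
    ORDER BY m.id_member
""").fetchall()
print(rows)  # [('Bea',), ('Max',)]
```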
I need help for this problem.
In MYSQL Table i have a field :
Field : artist_list
Values : 1,5,3,401
I need to find all records for artist uid 401
I do this
SELECT uid FROM tbl WHERE artist_list IN ('401');
It returns all records where the artist_list value is exactly '401', but if the value is 11,401 the query does not match.
Any idea ?
(I can't use the LIKE method because an artist uid of 3 would also match 30, 33, 3333...)
Short Term Solution
Use the FIND_IN_SET function:
SELECT uid
FROM tbl
WHERE FIND_IN_SET('401', artist_list) > 0
Long Term Solution
Normalize your data - this appears to be a many-to-many relationship already involving two tables. The comma separated list needs to be turned into a table of its own:
ARTIST_LIST
artist_id (primary key, foreign key to ARTIST)
uid (primary key, foreign key to TBL)
Your database organization is a problem; you need to normalize it. Rather than having one row with a comma-separated list of values, you should do one value per row:
uid | artist
----|-------
1   | 401
1   | 11
1   | 5
2   | 5
2   | 4
2   | 2
Then you can query:
SELECT uid
FROM table
WHERE artist = 401
You should also look into database normalization because what you have is just going to cause more and more problems in the future.
SELECT uid
FROM tbl
WHERE CONCAT(',', artist_list, ',') LIKE '%,401,%'
Although it would make more sense to normalise your data properly in the first place. Then your query would become trivial and have much better performance.
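The comma-wrapping trick above can be sketched with Python's sqlite3 standing in for MySQL, using SQLite's || concatenation operator in place of CONCAT (sample rows are invented). Wrapping both the column and the search term in commas is exactly what prevents the partial matches the asker worried about:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tbl (uid INTEGER PRIMARY KEY, artist_list TEXT)")
con.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [(1, "1,5,3,401"), (2, "11,401"), (3, "1401"), (4, "3")],
)

# ',1401,' does not contain ',401,', so uid 3 is correctly excluded,
# while uid 1 and uid 2 (which list 401 as an element) both match.
rows = con.execute("""
    SELECT uid FROM tbl
    WHERE ',' || artist_list || ',' LIKE '%,401,%'
    ORDER BY uid
""").fetchall()
print(rows)  # [(1,), (2,)]
```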