\n Separated Search in Column - mysql

I have a district table in which we store each user's preferred districts in the district_id field (varchar(250)). The stored value is a list of ids separated by \n, e.g. 1\n2\n5\n6\n1. How can I search within this specific column?

Don't. Your design is absolutely horrible and this is why you are having this issue in the first place.
When you have a N-N relationship (a user can have many preferred districts and each district can be preferred by many users) you need to make a middle table with foreign keys to both tables.
You need:
A table for districts with only information about districts.
A table with users with only information about users.
A table for preferred districts by user with the district number and the user id as columns and foreign key constraints. This will make sure that any user can have an unlimited number of preferred districts with easy querying.
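A minimal sketch of that three-table layout, run against an in-memory SQLite database through Python so it is self-contained. The table names, column names and sample rows are assumptions made for illustration; the same DDL idea carries over to MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE districts (district_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Middle table: one row per (user, preferred district) pair.
CREATE TABLE user_preferred_districts (
    user_id     INTEGER NOT NULL REFERENCES users(user_id),
    district_id INTEGER NOT NULL REFERENCES districts(district_id),
    PRIMARY KEY (user_id, district_id)
);
-- Hypothetical sample data.
INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
INSERT INTO districts VALUES (1, 'North'), (5, 'South'), (15, 'East');
INSERT INTO user_preferred_districts VALUES (1, 1), (1, 5), (2, 15);
""")


def users_preferring(district_id):
    """Return the names of users who prefer the given district."""
    rows = conn.execute(
        "SELECT u.name FROM users u "
        "JOIN user_preferred_districts p ON p.user_id = u.user_id "
        "WHERE p.district_id = ? ORDER BY u.name",
        (district_id,),
    ).fetchall()
    return [name for (name,) in rows]
```

Searching for district 5 is now a plain equality test, so district 15 can never match by accident.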

I would not recommend performing searches on data stored that way, but if you are stuck it can be done with regular expressions.
You have to deal with starting and ending matches for a string as well. So a regular LIKE is not going to work.
MySQL Regular Expressions
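Give this SQL a try. To search for the number 5, anchor it to the start of the string or a preceding newline, and to the end of the string or a following newline:
SELECT * FROM `TABLE` WHERE `field` REGEXP '(^|\\n)5(\\n|$)';
If you want to match using LIKE instead, it can be done using multiple rules:
SELECT * FROM `TABLE` WHERE `field` = '5' OR `field` LIKE '5\\n%' OR `field` LIKE '%\\n5\\n%' OR `field` LIKE '%\\n5';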
Note that you have to use a double \ to escape for a new line.
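To see why the match has to be anchored on both sides, here is a small sketch using Python's `re` module rather than MySQL's REGEXP. The two pattern names and the helper are made up for this illustration; the point is only that a pattern with optional newlines also finds the 5 inside 15.

```python
import re

# Optional newlines on both sides: matches any string that contains a "5".
LOOSE = re.compile(r"\n?5\n?")
# The 5 must sit between start-of-string/newline and newline/end-of-string.
ANCHORED = re.compile(r"(^|\n)5(\n|$)")


def contains_id(pattern, field):
    """True if the pattern finds a match anywhere in the stored field."""
    return pattern.search(field) is not None


field_with_15 = "1\n15\n6"  # does NOT contain district 5
field_with_5 = "1\n5\n6"    # contains district 5
```

With these inputs, `contains_id(LOOSE, field_with_15)` is a false positive, while the anchored pattern accepts only `field_with_5`.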

Easiest way is to just use a LIKE query, like this:
SELECT * FROM `preferred_districts` WHERE `district_id` LIKE '%6%';
To make sure it's the right one you'll receive (because this will also match id 16, 26, 674 etc.) you'll have to check manually if it's correct. In php (dunno if you use it) you could use the snippet below:
$id_field = "1\n2\n5\n6\n17"; // double quotes so that \n is a real newline
$ids = explode("\n", $id_field);
if(in_array(6, $ids)) {
echo 'Yup, found the right one';
}
Important: although the above will work, your database design isn't how it should be. You should create (what is sometimes called) a pivot table between the districts and the users, something like below.
(Table 'users_preferred_districts')
user_id | district_id
--------+------------
2 | 1
2 | 17
9 | 21
Like this it's quite easy to retrieve the records you want...

I have used mysql function FIND_IN_SET() and I got the desired result through this function.
I got help from this tutorial.
http://www.w3resource.com/mysql/string-functions/mysql-find_in_set-function.php

Related

SQL IN Clause only returning rows with first match in comma separated list of IDs

I have 5 users which have a column 'shop_access' (which is a list of shop IDs eg: 1,2,3,4)
I am trying to get all users from the DB which have a shop ID (eg. 2) in their shop_access
Current Query:
SELECT * FROM users WHERE '2' IN (shop_access)
BUT, it only returns users which have shop_access starting with the number 2.
E.g
User 1 manages shops 1,2,3
User 2 manages shops 2,4,5
User 3 manages shops 1,3,4
User 4 manages shops 2,3
The only one which will be returned when running the IN Clause is User 2 and User 4.
User 1 is ignored (which it shouldn't as it has number 2 in the list) as it does not start with the number 2.
I'm not in a position to currently go back and change the way this is set up, eg convert it to JSON and handle it with PHP first, so if someone can try to make this work without having to change the column data (shop_access) that would be ideal.
A portable solution is to use like:
where concat(',', shop_access, ',') like '%,2,%'
Or if the value to search for is given as a parameter:
where concat(',', shop_access, ',') like concat('%,', ?, ',%')
Depending on your database, there may be neater options available. In MySQL:
where find_in_set('2', shop_access)
That said, I would highly recommend fixing your data model. Storing CSV data in a database defeats the purpose of a relational database in many ways. You should have a separate table to store the user/shop relations, with each tuple on a separate row. Recommended reading: Is storing a delimited list in a database column really that bad?.
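The delimiter-wrapping trick can be tried out quickly with SQLite from Python. This is only a sketch: SQLite concatenates with `||` instead of `concat()`, and user 5 with its `12,20` list is a hypothetical row added to show that partial numbers do not match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, shop_access TEXT);
INSERT INTO users VALUES
    (1, '1,2,3'), (2, '2,4,5'), (3, '1,3,4'), (4, '2,3'),
    (5, '12,20');  -- hypothetical row: contains the digit 2 but not shop 2
""")


def users_with_shop(shop_id):
    """Ids of users whose comma separated shop_access contains shop_id."""
    rows = conn.execute(
        # Wrap both the column and the needle in commas so that only
        # whole list elements can match.
        "SELECT id FROM users "
        "WHERE (',' || shop_access || ',') LIKE ('%,' || ? || ',%') "
        "ORDER BY id",
        (str(shop_id),),
    ).fetchall()
    return [row[0] for row in rows]
```

For shop 2 this returns users 1, 2 and 4, including user 1, whose list does not start with 2.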
Also, you might want to consider using REGEXP here for an option:
SELECT *
FROM users
WHERE shop_access REGEXP '[[:<:]]2[[:>:]]';
-- [[:<:]] and [[:>:]] are word boundaries (MySQL 8.0.4+ uses the ICU regex library, where the equivalent is \\b)
SELECT * FROM users WHERE (shop_access = '2') OR (shop_access LIKE '2,%' OR shop_access LIKE '%,2,%' OR shop_access LIKE '%,2')

Mysql two ways to select where. Which way uses less resources and is faster?

For example have url like domain.com/transport/cars
Based on the url want to select from mysql and show list of ads for cars
Want to choose fastest method (method that takes less time to show results and will use less resources).
Comparing 2 ways
First way
Mysql table transport with rows like
FirstLevSubcat | Text
---------------------------------
1 | Text1 car
2 | Text1xx lorry
1 | Text another car
FirstLevSubcat Type is int
Then another mysql table subcategories
Id | NameOfSubcat
---------------------------------
1 | cars
2 | lorries
3 | dogs
4 | flats
Query like
SELECT Text, AndSoOn FROM transport
WHERE
FirstLevSubcat = (SELECT Id FROM subcategories WHERE NameOfSubcat = 'cars')
Or instead of SELECT Id FROM subcategories get Id from xml file or from php array
Second way
Mysql table transport with rows like
FirstLevSubcat | Text
---------------------------------
cars | Text1 car
lorries | Text1xx lorry
cars | Text another car
FirstLevSubcat Type is varchar or char
And query simply
SELECT Text, AndSoOn FROM transport
WHERE FirstLevSubcat = 'cars'
Please advise which way would use fewer resources and take less time to show results. I read in "SQL SELECT speed int vs varchar" that selecting where int is better than selecting where varchar.
So, as I understand it, the first way would be better?
The first design is much better, because you separate two facts in your data:
There is a category 'cars'.
'Text1 car' is in the Category 'cars'.
Imagine, in your second design you enter another car, but type in 'cors' instead of 'cars'. The dbms doesn't see this, and so you have created another category with a single entry. (Well, in MySQL you could use an enum column instead to circumvent this issue, but this is not available in most other dbms. And anyhow, whenever you want to rename your category, say from 'cars' to 'vans', then you would have to change all existing records plus alter the table, instead of simply renaming the entry once in the subcategories table.)
So stay away from your second design.
As to Praveen Prasannan's comment on sub queries and joins: That is nonsense. Your query is straight forward and good. You want to select from transport where the category is the desired one. Perfect. There are two groups of persons who would prefer a join here:
Beginners who simply don't know better and always join from the start and try to sort things out in the end.
Experienced programmers who know that some dbms often handle joins better than sub-queries. But this is a pessimistic habit. Better write your queries such that they are easy to read and maintain, as you are already doing, and only change this in case grave performance issues occur.
Yup. As the SO link in your question suggests, int comparison is faster than character comparison and yields faster fetches. Keeping this in mind, the first design would be considered the better design. However, sub queries are never recommended. Use a join instead.
eg:
SELECT t.Text, t.AndSoOn FROM transport t
INNER JOIN subcategories s ON s.ID = t.FirstLevSubcat
WHERE s.NameOfSubcat = 'cars'
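Both forms of the query can be checked side by side. The sketch below loads the sample rows from the question into an in-memory SQLite database (a stand-in for MySQL, so it says nothing about MySQL's timing) and only shows that the sub-query and the join return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subcategories (Id INTEGER PRIMARY KEY, NameOfSubcat TEXT UNIQUE);
CREATE TABLE transport (
    FirstLevSubcat INTEGER REFERENCES subcategories(Id),
    "Text" TEXT
);
INSERT INTO subcategories VALUES (1, 'cars'), (2, 'lorries'), (3, 'dogs'), (4, 'flats');
INSERT INTO transport VALUES
    (1, 'Text1 car'), (2, 'Text1xx lorry'), (1, 'Text another car');
""")

SUBQUERY_SQL = """
SELECT t."Text" FROM transport t
WHERE t.FirstLevSubcat = (SELECT Id FROM subcategories WHERE NameOfSubcat = ?)
ORDER BY t."Text"
"""

JOIN_SQL = """
SELECT t."Text" FROM transport t
INNER JOIN subcategories s ON s.Id = t.FirstLevSubcat
WHERE s.NameOfSubcat = ?
ORDER BY t."Text"
"""


def ads(sql, subcat_name):
    """Run one of the two queries and return the ad texts."""
    return [text for (text,) in conn.execute(sql, (subcat_name,))]
```

Which of the two is faster on a real MySQL server is something to measure with EXPLAIN on realistic data rather than assume.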

MySQL select users on multiple criteria

My team is working on a php/MySQL website for a school project. I have a table of users with typical information (ID, first name, last name, etc). I also have a table of questions with sample data like below. For this simplified example, all the answers to the questions are numerical.
Table Questions:
qid | questionText
1 | 'favorite number'
2 | 'gpa'
3 | 'number of years doing ...'
etc.
Users will have the ability to fill out a form to answer any or all of these questions. Note: users are not required to answer all of the questions and the questions themselves are subject to change in the future.
The answer table looks like this:
Table Answers:
uid | qid | value
37 | 1 | 42
37 | 2 | 3.5
38 | 2 | 3.6
etc.
Now, I am working on the search page for the site. I would like the user to select what criteria they want to search on. I have something working, but I'm not sure it is efficient at all or if it will scale (not that these tables will ever be huge - like I said, it is a school project). For example, I might want to list all users whose favorite number is between 100 and 200 and whose GPA is above 2.0. Currently, I have a query builder that works (it creates a valid query that returns accurate results - as far as I can tell). A result of the query builder for this example would look like this:
SELECT u.ID, u.name (etc)
FROM User u
JOIN Answer a1 ON u.ID=a1.uid
JOIN Answer a2 ON u.ID=a2.uid
WHERE 1
AND (a1.qid=1 AND a1.value>100 AND a1.value<200)
AND (a2.qid=2 AND a2.value>2.0)
I add the WHERE 1 so that in the for loops, I can just add " AND (...)". I realize I could drop the '1' and just use implode(and,array) and add the where if array is not empty, but I figured this is equivalent. If not, I can change that easy enough.
As you can see, I add a JOIN for every criteria the searcher asks for. This also allows me to order by a1.value ASC, or a2.value, etc.
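For reference, the one-JOIN-per-criterion builder described above could look roughly like the sketch below. It is written in Python against SQLite rather than PHP/MySQL, the function names and the operator whitelist are assumptions, and the sample answers are made up. Values are passed as bound parameters instead of being concatenated into the SQL.

```python
import sqlite3

ALLOWED_OPS = {"<", "<=", "=", ">=", ">"}  # operators are whitelisted, not bound


def build_search(criteria):
    """criteria: list of (qid, op, value). Returns (sql, params)."""
    joins, wheres, params = [], [], []
    for i, (qid, op, value) in enumerate(criteria, start=1):
        if op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {op!r}")
        alias = f"a{i}"  # one alias of the Answer table per criterion
        joins.append(f"JOIN Answer {alias} ON u.ID = {alias}.uid")
        wheres.append(f"({alias}.qid = ? AND {alias}.value {op} ?)")
        params += [qid, value]
    sql = "SELECT u.ID, u.name FROM User u " + " ".join(joins)
    if wheres:  # add WHERE only when there is at least one criterion
        sql += " WHERE " + " AND ".join(wheres)
    return sql + " ORDER BY u.ID", params


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User (ID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Answer (uid INTEGER, qid INTEGER, value REAL, PRIMARY KEY (uid, qid));
INSERT INTO User VALUES (37, 'ann'), (38, 'ben');
INSERT INTO Answer VALUES (37, 1, 150), (37, 2, 3.5), (38, 1, 42), (38, 2, 3.6);
""")


def search(criteria):
    """Build and run the query, returning (ID, name) rows."""
    sql, params = build_search(criteria)
    return conn.execute(sql, params).fetchall()
```

For example, `search([(1, ">", 100), (1, "<", 200), (2, ">", 2.0)])` asks for a favorite number between 100 and 200 and a GPA above 2.0.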
First question:
Is this table organization at least somewhat decent? We figured that since the number of questions is variable, and not every user answers every question, that something like this would be necessary.
Main question:
Is the query way too inefficient? I imagine that it is not ideal to join the same table to itself up to maybe a dozen or two times (if we end up putting that many questions in). I did some searching and found these two posts which seem to kind of touch on what I'm looking for:
Mutiple criteria in 1 query
This uses multiple nested (correct term?) queries in EXISTS
Search for products with multiple criteria
One of the comments by youssef azari mentions using 'query 1' UNION 'query 2'
Would either of these perform better/make more sense for what I'm trying to do?
Bonus question:
I left out above for simplicity's sake, but I actually have 3 tables (for number valued questions, booleans, and text)
The decision to have separate tables was because (as far as I could think of) it would either be that or have one big answers table with 3 value columns of different types, having 2 always empty.
This works with my current query builder - an example query would be
SELECT u.ID,...
FROM User u
JOIN AnswerBool b1 ON u.ID=b1.uid
JOIN AnswerNum n1 ON u.ID=n1.uid
JOIN AnswerText t1 ON u.ID=t1.uid
WHERE 1
AND (b1.qid=1 AND b1.value=true)
AND (n1.qid=16 AND n1.value<999)
AND (t1.qid=23 AND t1.value LIKE '...')
With that in mind, what is the best way to get my results?
One final piece of context:
I mentioned this is for a school project. While this is true, the eventual goal (it is an undergrad senior design project) is to have a department use our site for students creating teams for their senior design. For a rough estimate of size, every semester, the department would have somewhere around 200 or so students use our site to form teams. Obviously, when we're done, the department will (hopefully) check our site for security issues and other stuff they need to worry about (what with FERPA and all). We are trying to take into account all common security practices and scalability concerns, but in the end, our code may be improved by others.
UPDATE
As per nnichols suggestion, I put in a decent amount of data and ran some tests on different queries. I put around 250 users in the table, and about 2000 answers in each of the 3 tables. I found the links provided very informative
(links removed because I can't hyperlink more than twice yet) Links are in nnichols' response
as well as this one that I found:
http://phpmaster.com/using-explain-to-write-better-mysql-queries/
I tried 3 different types of queries, and in the end, the one I proposed worked the best.
First: using EXISTS
SELECT u.ID,...
FROM User u WHERE 1
AND EXISTS
(SELECT * FROM AnswerNumber
WHERE uid=u.ID AND qid=# AND value>#) -- or any condition on value
AND EXISTS
(SELECT * FROM AnswerNumber
WHERE uid=u.ID AND qid=another # AND some_condition(value))
AND EXISTS
(SELECT * FROM AnswerText
...
I used 10 conditions on each of the 3 answer tables (resulting in 30 EXISTS)
Second: using IN - a very similar approach (maybe even exactly?) which yields the same results
SELECT u.ID,...
FROM User u WHERE 1
AND (u.ID) IN (SELECT uid FROM AnswerNumber WHERE qid=# AND ...)
...
again with 30 subqueries.
The third one I tried was the same as described above (using 30 JOINs)
The results of using EXPLAIN on the first two were as follows: (identical)
The primary query on table u had a type of ALL (bad, though users table is not huge) and rows searched was roughly twice the size of the user table (not sure why). Each other row in the output of EXPLAIN was a dependent query on the relevant answer table, with a type of eq_ref (good) using WHERE and key=PRIMARY KEY and only searching 1 row. Overall not bad.
For the query I suggested (JOINing):
The primary query was actually on whatever table you joined first (in my case AnswerBoolean) with type of ref (better than ALL). The number of rows searched was equal to the number of questions answered by anyone (as in 50 distinct questions have been answered by anyone) (which will be much less than the number of users). For each additional row in EXPLAIN output, it was a SIMPLE query with type eq_ref (good) using WHERE and key=PRIMARY KEY and only searching 1 row. Overall almost the same, but a smaller starting multiplier.
One final advantage to the JOIN method: it was the only one I could figure out how to order by various values (such as n1.value). Since the other two queries were using subqueries, I could not access the value of a specific subquery. Adding the order by clause did change the extra field in the first query to also have 'using temporary' (required, I believe, for order by's) and 'using filesort' (not sure how to avoid that). However, even with those slow-downs, the number of rows is still much less, and the other two (as far as I could get) cannot use order by.
You could answer most of these questions yourself with a suitably large test dataset and the use of EXPLAIN and/or the profiler.
Your INNER JOINs will almost certainly perform better than switching to EXISTS but again this is easy to test with a suitable test dataset and EXPLAIN.

MySQL - How can i query a multi-value field to match a primary key of another table?

I have 2 tables. One table contains all of the states in the USA. The other table is just a list of stuff in those states.
My table structure looks something like this:
tbl_states - stateID (PK), stateName
tbl_stuff - stuffID, stuffName, relState
The values look like this
1 | Alabama
2 | Georgia
3 | Maryland
The relState column relates to the tbl_states.stateID column and i have it in this format. I plan to have a webform to select multiple states and assign the stuff in the states to the state.
1 | This is some stuff | 1,2 [ and this stuff is only AL, GA. ]
So I'm trying to figure out the best way to write the select statement for this. Is there some way to do it strictly with mysql?
Multi-valued fields in a database are a bad idea. Instead, resolve the many-to-many relationship between states and stuff like this:
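One way such a many-to-many layout might look is sketched below, using SQLite from Python so it can be run as-is. The link table name `tbl_stuff_states` and the sample rows are assumptions; `tbl_states` and `tbl_stuff` follow the question, with `relState` moved out of `tbl_stuff` into the link table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_states (stateID INTEGER PRIMARY KEY, stateName TEXT NOT NULL);
CREATE TABLE tbl_stuff (stuffID INTEGER PRIMARY KEY, stuffName TEXT NOT NULL);
-- Link table (hypothetical name): one row per (stuff, state) pair.
CREATE TABLE tbl_stuff_states (
    stuffID INTEGER NOT NULL REFERENCES tbl_stuff(stuffID),
    stateID INTEGER NOT NULL REFERENCES tbl_states(stateID),
    PRIMARY KEY (stuffID, stateID)
);
INSERT INTO tbl_states VALUES (1, 'Alabama'), (2, 'Georgia'), (3, 'Maryland');
INSERT INTO tbl_stuff VALUES (1, 'This is some stuff');
INSERT INTO tbl_stuff_states VALUES (1, 1), (1, 2);  -- stuff 1 is in AL and GA
""")


def stuff_in_state(state_name):
    """Names of all stuff rows linked to the given state."""
    rows = conn.execute(
        "SELECT s.stuffName FROM tbl_stuff s "
        "JOIN tbl_stuff_states ss ON ss.stuffID = s.stuffID "
        "JOIN tbl_states st ON st.stateID = ss.stateID "
        "WHERE st.stateName = ? ORDER BY s.stuffID",
        (state_name,),
    ).fetchall()
    return [name for (name,) in rows]


def states_for_stuff(stuff_id):
    """Names of all states linked to the given stuff row."""
    rows = conn.execute(
        "SELECT st.stateName FROM tbl_states st "
        "JOIN tbl_stuff_states ss ON ss.stateID = st.stateID "
        "WHERE ss.stuffID = ? ORDER BY st.stateName",
        (stuff_id,),
    ).fetchall()
    return [name for (name,) in rows]
```

A web form with multiple selected states then inserts one link row per selected state instead of writing a comma separated string.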
I came across this post while searching to do this myself and figured a second answer would be helpful. While it is true that multi-valued fields decrease search efficiency, impact scalability and promote data integrity problems, they can be necessary for simplicity, reporting, and integrating with other systems, as it was in my case.
Assuming the Tables:
Table: States
Id Name
235325235 'Alabama'
457457432 'Georgia'
334634636 'Maryland'
Table: Stuff
Id Text StateIds
1 'Some stuff' '235325235'
2 'Some Stuff for two states' '235325235,457457432'
The following query would return all stuffs for alabama
SELECT * FROM Stuff WHERE FIND_IN_SET('235325235', Stuff.StateIds);
Please note that I made your IDs more complex to lower the probability of accidental matches, and I would recommend using a GUID/UUID since you are using a string searching function.

Help with mysql query or database design to help me achieve this

I have been a long-time reader of this site but have never posted before, and I am hoping someone here can help. I am working on a multi-user web application which allows users to capture prospect information into our database. The users will be given default fields inside the app, such as name and email, which are grouped into data groups.
However, I want to allow the users to create as many custom fields as they wish and also allow them to choose which data group these custom fields belong to. I am struggling to figure out the best way to achieve this.
I currently have 3 tables in my database, which are as follows:
datagroups
group_id group_name order
1 test group 1 2
2 test group 2 1
datafields
field_id field_name group
1 field 1 2
2 field 2 1
3 field 3 2
customdatafields
custom_field_id field_name group
1 custom 1 1
2 custom 2 2
I am really puzzled on how I would create a query across the 3 tables so that I can produce the following display:
test group 2
- field 1
- field 3
- custom 2
test group 1
- field 2
- custom 1
One thing I need to keep in mind is that I may also allow the users to create custom data groups, so that needs to be factored into this.
Any input on this would be much appreciated.
Thanks
However i want to allow the users to
create as many custom fields as they
wish and also allow them to choose
which data group these custom fields
belong in group.
You've just made every casual user your database designer.
Does that sound like a good idea?
Anything that should be a row in a table will end up being a comma separated string in a single column. Data will end up looking like this, sooner or later.
Home: 123-456-7890, Work: 123-456-0987, Cell: 123-454-5678
H: 123-454-6453, W: 123-432-5746, 800: 1-800-555-1212
234-345-4567, 234-345-6785
323-123-4567 Don't call before 10:00 am
Why would people do that? Because ordinary people don't know how to design databases. They'll do the most expedient thing, which is just put data in any which way now, and figure out how to deal with it later.
Heck, most of the programmers on SO don't know how to design databases. Just read [database] questions for a week or so. (Including a dozen variants of this very one.)
You do not want to use MySQL or any SQL dbms for this. If you can't do a proper job of modeling to start with, a SQL dbms will just make you die before your time.
well, your described output is not tabular... so it is not directly what you will get from a sql statement.
however, you should be able to get listings by using UNION maybe something like this:
select g.group_id, g.group_name, f.field_name
from datagroups g,
(select field_name, field_id, `group` from datafields
union
select field_name, custom_field_id, `group` from customdatafields
) f
where f.`group` = g.group_id
order by g.`order`
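Since the result of that query is tabular, the grouped display has to be produced in application code. Below is a rough sketch of the whole round trip using SQLite from Python with the sample rows from the question. The `is_custom` sort column and the `render` helper are additions made for this illustration, and `"group"` and `"order"` are quoted because they are reserved words.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript('''
CREATE TABLE datagroups (group_id INTEGER PRIMARY KEY, group_name TEXT, "order" INTEGER);
CREATE TABLE datafields (field_id INTEGER PRIMARY KEY, field_name TEXT, "group" INTEGER);
CREATE TABLE customdatafields (custom_field_id INTEGER PRIMARY KEY, field_name TEXT, "group" INTEGER);
INSERT INTO datagroups VALUES (1, 'test group 1', 2), (2, 'test group 2', 1);
INSERT INTO datafields VALUES (1, 'field 1', 2), (2, 'field 2', 1), (3, 'field 3', 2);
INSERT INTO customdatafields VALUES (1, 'custom 1', 1), (2, 'custom 2', 2);
''')

QUERY = '''
SELECT g.group_name, f.field_name
FROM datagroups g
JOIN (
    SELECT field_name, "group", 0 AS is_custom FROM datafields
    UNION ALL
    SELECT field_name, "group", 1 AS is_custom FROM customdatafields
) f ON f."group" = g.group_id
ORDER BY g."order", f.is_custom, f.field_name
'''


def render():
    """Turn the flat result rows into group headings followed by fields."""
    lines, current = [], None
    for group_name, field_name in conn.execute(QUERY):
        if group_name != current:  # a new group starts: emit its heading
            lines.append(group_name)
            current = group_name
        lines.append(f"- {field_name}")
    return lines
```

Printing `"\n".join(render())` gives the layout asked for in the question, with default fields listed before custom ones inside each group.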