So I have some comma-separated values stored in a MySQL database.
For example:
ID  Name  parentID
1   Dave  1,4,6
2   Josh  2
3   Pete  10
4   Andy  2,10
Using this query
SELECT * FROM `table` WHERE `parentID` LIKE '%4%'
Only Dave is returned, which is correct.
However, if I select using LIKE '%1%', Pete and Andy are selected as well as Dave, because their parentID values contain the character '1'.
I need the query to be able to distinguish '10', for example, from '1'.
It needs to treat each comma-separated value as distinct, and to allow for the fact that there is no trailing comma after the last value.
Am I right in thinking perhaps REGEX could do the job instead?
Thanks.
You can use a regex to match "word boundaries":
WHERE parentID RLIKE '[[:<:]]4[[:>:]]'
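For example, against the sample table above (a sketch, keeping the question's table and column names), searching for 1 this way matches only Dave, because the word boundaries stop '1' from matching inside '10':
SELECT * FROM `table` WHERE `parentID` RLIKE '[[:<:]]1[[:>:]]'
(Note that MySQL 8.0 uses the ICU regex library, which does not support the [[:<:]] and [[:>:]] markers; there you would use \b word boundaries instead.)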
Or you can use a special function that parses elements of a comma-separated string:
WHERE FIND_IN_SET('4', parentID) <> 0
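As a complete statement (a sketch, again using the question's table name), FIND_IN_SET treats each comma-separated element as a whole value, so searching for 1 also returns only Dave:
SELECT * FROM `table` WHERE FIND_IN_SET('1', parentID) <> 0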
I agree with the comment from #Nanne.
You will also find that it's better to store data not in comma-separated lists, but in a normalized fashion. I don't know if you have freedom to change your schema at this time, but for what it's worth, read my answer to the question Is storing a delimited list in a database column really that bad?
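If you do get to change the schema, a minimal sketch of the normalized version could look like this (the table and column names here are made up for the illustration):
CREATE TABLE person_parent (
  person_id INT NOT NULL,
  parent_id INT NOT NULL,
  PRIMARY KEY (person_id, parent_id)
);

-- "who has parent 4?" then becomes a plain equality test, with no string parsing
SELECT person_id FROM person_parent WHERE parent_id = 4;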
Related
col_1  col_2
0      ab,bc,cd
1      bc,xy
2      zz,xx
3      ab
4      cc
5      ef,kk,ok
I want to select rows that have "ab" as one of the values in col_2. For example, in this case the 0th and 3rd rows will be selected.
So, is there any SQL query for that?
First, you should fix your data model. Storing multiple values in a string is just a misuse of strings. The correct data model would have a separate row for each col_1/col_2 combination.
Sometimes, we are stuck with other people's really bad decisions on data modeling. MySQL actually has a function to help deal with this, find_in_set().
You can use:
where find_in_set('ab', col_2) > 0
until you fix the data model.
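For instance, against the sample data above (a sketch, assuming the table is named t), this returns the rows with col_1 = 0 and col_1 = 3, because find_in_set() only matches whole comma-separated elements:
SELECT col_1, col_2 FROM t WHERE FIND_IN_SET('ab', col_2) > 0;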
I have a small problem: I have a table like this:
id|name|group|date_created
1|Volvo|1,3|06-04-2020 10:00:00
2|Audi|3|06-04-2020 10:00:00
etc....
Now I would like to get all the records that have the value 1 inside the group column.
I tried LIKE "%1%", but I don't think it's a good query. Can you point me in the right direction?
SELECT id FROM cars WHERE `group` LIKE '%1%'
The problem with your query is that it would wrongly match '1' against a list like '45,12,5' for example.
One method is to add commas on both ends before searching:
where concat(',', `group`, ',') like '%,1,%';
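As a full query (a sketch, using the question's table and column names), that trick looks like:
SELECT id FROM cars WHERE CONCAT(',', `group`, ',') LIKE '%,1,%';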
But in MySQL, it is much more convenient to use the string function find_in_set(), whose purpose is just what you are looking for, i.e. searching for a value in a comma-separated list:
select id from cars where find_in_set('1', `group`) > 0
Notes:
you should fix your data model, and have a separate table to store the relationship between ids and groups, with each pair on a separate row. Related reading: Is storing a delimited list in a database column really that bad?
group is a reserved word in MySQL, so it is not a good choice for a column name (you would need to surround it with backticks every time you use it, which is error-prone)
I need help with a selection. How can I get rows with unique attributes? For example, I have these JSON strings in the DB:
{"name":["name1"]};
{"name":["name2", "name1"]};
{"name":["name3", "name4"]};
{"name":["name3"]};
If I just try SELECT DISTINCT data->"$.name", I get all of the rows back, but I need to check every element, and if one has appeared before, not show that row.
Is it possible?
I want to get just rows 1 and 3, because rows 2 and 4 contain names we already have (I don't care about name2; in my case name2 is equivalent to name1).
I wanted to contribute this answer to the void. This will return all unique top-level JSON keys; unfortunately {'test':1} and {'test':1, 'word':1} will return two records, test and test, word. This may still be suitable for some.
SELECT DISTINCT JSON_KEYS(tags) as name FROM items WHERE JSON_LENGTH(tags) >= 1
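For instance (a quick sketch of the behaviour described above, run on literal JSON values):
SELECT JSON_KEYS('{"test": 1}');             -- returns ["test"]
SELECT JSON_KEYS('{"test": 1, "word": 1}');  -- returns ["test", "word"]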
SELECT DISTINCT JSON_UNQUOTE(features->"$.name[0]") as name
FROM data WHERE JSON_LENGTH(features->"$.name") = 1
So I just took the results where the name attribute has only one item, and those we can check for uniqueness. It's not the best solution, but I don't have another one yet.
I need to export a single column from a MySQL database which shows each entry only once. So in the following table:
id  author(s)         content
________________________________________
1   Bill, Sara, Mike  foo1
1   Sara              foo2
2   Bill, Sara, Mike  foo3
2   Sara              foo4
3   David             foo5
3   Mike              foo5
I would need to export a list of authors as "Bill, Sara, Mike, David" so that each name is shown only once.
Thanks!
UPDATE: I realize this may not be possible, so I am going to have to accept an exported list which simply eliminates any exact duplicates within the column, so the output would be: "Bill, Sara, Mike, Sara, David, Mike". Any help forming this query would be appreciated.
Thanks again!
It's possible to get the resultset, but I'd really only do this to convert the data to another table, with one row per author. I wouldn't want to run queries like this from application code.
The SUBSTRING_INDEX function can be used to extract the first, second, third, etc. author from the list, e.g.
SUBSTRING_INDEX(SUBSTRING_INDEX(authors,',', 1 ),',',-1) AS author1
SUBSTRING_INDEX(SUBSTRING_INDEX(authors,',', 2 ),',',-1) AS author2
SUBSTRING_INDEX(SUBSTRING_INDEX(authors,',', 3 ),',',-1) AS author3
But this gets messy at the end, because you get the last author when you retrieve beyond the length of the list.
So, you can either count the number of commas, with a rather ugly expression:
LENGTH(authors)-LENGTH(REPLACE(authors,',','')) AS count_commas
But it's just as easy to append a trailing comma, and then convert empty strings to NULL.
So, replace authors with:
CONCAT(authors,',')
And then wrap that in TRIM and NULLIF functions.
NULLIF(TRIM( foo ),'')
Then, you can write a query that gets the first author from each row, another query that gets the second author from each row (identical to the first query, just change the '1' to a '2'), another for the third author, and so on, up to the maximum number of authors in a column value. Combine all those queries together with UNION operations (this will eliminate the duplicates for you).
So, this query:
SELECT NULLIF(TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(a.authors,','),',',1),',',-1)),'') AS author
FROM unfortunately_designed_table a
UNION
SELECT NULLIF(TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(a.authors,','),',',2),',',-1)),'')
FROM unfortunately_designed_table a
UNION
SELECT NULLIF(TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(a.authors,','),',',3),',',-1)),'')
FROM unfortunately_designed_table a
UNION
SELECT NULLIF(TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(a.authors,','),',',4),',',-1)),'')
FROM unfortunately_designed_table a
This will return a resultset of unique author names (and undoubtedly a NULL). That only gets the first four authors in each list; you'd need to extend it to get the fifth, sixth, etc.
You can get the maximum count of entries in that column by finding the maximum number of commas and adding 1:
SELECT MAX(LENGTH(a.authors)-LENGTH(REPLACE(a.authors,',','')))+1 AS max_count
FROM unfortunately_designed_table a
That lets you know how far you need to extend the query above to get all of the author values (at the particular point in time you run the query... nothing prevents someone from adding another author to a list within a column at a later time).
After all the work to get distinct author values onto separate rows, you'd probably want to leave them as separate rows. It's easier to work with.
But, of course, it's also possible to convert that resultset back into a comma-delimited list, though the length of the string returned is limited by the group_concat_max_len session variable.
To get it back as a single row with a comma-separated list, take that whole mess of a query from above, wrap it in parens as an inline view (derived table), give it an alias, and use the GROUP_CONCAT function.
SELECT GROUP_CONCAT(d.author ORDER BY d.author) AS distinct_authors
FROM (
...
) d
WHERE d.author IS NOT NULL
If you think all of these expressions are ugly, and there should be an easier way to do this, unfortunately (aside from writing procedural code), there really isn't. The relational database is designed to handle information in tuples (rows), with each row representing one entity. Stuffing multiple entities or values into a single column goes against relational design. As such, SQL does not provide a simple way to extract values from a string into separate tuples, which is why the code to do this is so messy.
I want to convert an integer to text in a MySQL select query. Here's what a table looks like:
Languages
--------
1,2,3
I want to convert each integer to a language (e.g., 1 => English, 2 => French, etc.)
I've been reading up on the CONVERT and CAST functions in MySQL, but they mostly seem to focus on converting various data types to integers. I also couldn't find anything that dealt with the specific way I'm storing the data (multiple numbers in one field).
How can I convert the integers to text in a MySQL query?
UPDATE
Here's my MySQL query:
SELECT u.id, ulp.userid, ulp.languages, ll.id, ll.language_detail
FROM users AS u
JOIN user_language_profile AS ulp ON (ulp.userid = u.id)
JOIN language_detail AS ll ON (ulp.languages = ll.id)
Use either:
MySQL's ELT() function:
SELECT
ELT(Languages
, 'English' -- 1
, 'French' -- 2
-- etc.
)
FROM table_name
A CASE expression:
SELECT
CASE Languages
WHEN 1 THEN 'English'
WHEN 2 THEN 'French'
-- etc.
END
FROM table_name
Although, if possible I would be tempted to either JOIN with a lookup table (as #Mr.TAMER says) or change the data type of the column to ENUM('English','French',...).
UPDATE
From your comments, it now seems that each field contains a set (perhaps even using the SET data type?) of languages and you want to replace the numeric values with strings?
First, read Bill Karwin's excellent answer to "Is storing a delimited list in a database column really that bad?".
In this case, I suggest you normalise your database a tad: create a new language-entity table wherein each record associates the PK of the entities in the existing table with a single language. Then you can use a SELECT query (joining on that new table) with GROUP_CONCAT aggregation to obtain the desired list of language names.
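A rough sketch of what that could look like (the table and column names below are assumptions, not your actual schema): one row per user/language pair, joined to a language lookup table and aggregated with GROUP_CONCAT:
SELECT ul.userid,
       GROUP_CONCAT(l.language_name ORDER BY l.language_name SEPARATOR ', ') AS languages
FROM user_language AS ul
JOIN language AS l ON l.id = ul.language_id
GROUP BY ul.userid;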
Without such normalisation, your only option is to do string-based search & replace (which would not be particularly efficient); for example:
SELECT CONCAT_WS(',',
IF(FIND_IN_SET('1', Languages), 'English', NULL),
IF(FIND_IN_SET('2', Languages), 'French' , NULL),
-- etc.
)
FROM table_name
Why don't you make a number-language table and, when SELECTing, get the language associated with the number you selected?
This is better in case you want to add a new language: you only insert it into the table instead of changing all the queries in your code. It is also easier if others are using your code (they won't be happy debugging and editing all those queries).
From your other comments, are you saying that the languages field is a literal string with embedded commas?
From an SQL perspective, that's a pretty unworkable design. A variable number of languages should be stored in another table.
However, if you're stuck with what you've got, you might be able to construct a regexp replacement algorithm, but it seems terribly fragile, and I wouldn't recommend it. If you've got more than 9 languages, the following will be broken, and you would need the Regexp UDF, which introduces a bunch of complexity.
Assuming the simple case:
SELECT REPLACE(
         REPLACE(
           REPLACE(Languages, '1', 'English'),
           '2', 'French'),
         N, DESCRIPTION)
and so on. But I repeat: this is an awful data design. If it's possible to fix it to something like:
person            person_lang          language
==========        ============         =========
person_id  -----<  person_id
...                lang_id     >-----  lang_id
                                       lang_desc
Then I strongly suggest you do so.
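A minimal DDL sketch of that design (names follow the diagram; the column types are assumptions):
CREATE TABLE person (
  person_id INT PRIMARY KEY
);

CREATE TABLE language (
  lang_id   INT PRIMARY KEY,
  lang_desc VARCHAR(50) NOT NULL
);

CREATE TABLE person_lang (
  person_id INT NOT NULL,
  lang_id   INT NOT NULL,
  PRIMARY KEY (person_id, lang_id),
  FOREIGN KEY (person_id) REFERENCES person (person_id),
  FOREIGN KEY (lang_id)   REFERENCES language (lang_id)
);
With that in place, mapping numbers to language names is just a join, and adding a new language is a single INSERT into the language table.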