Length of a text list in MySQL (basic string manipulation) - mysql

I have a field of comma-separated lists in MySQL:
id field1
1 aa,bb,cc
2 aa
I would like to count the total number of elements, with overlap. In this case that would be 4: aa appears twice and so should be double-counted.
It would suffice to count the number of commas in each field and add 1, since my lists do not have quotes or escaping.

Let's try that:
select count(field1) + sum(CHAR_LENGTH(field1) - CHAR_LENGTH(REPLACE(field1,',','')))
from Table1;

Why do you use comma separated string?
Databases aren't made for that.
You have 2 possibilities :
- make a loop in a stored procedure,
- count it into the code which call this MySQL query...

Related

How to find variable pattern in MySql with Regex?

I am trying to pull a product code from a long set of string formatted like a URL address. The pattern is always 3 letters followed by 3 or 4 numbers (ex. ???### or ???####). I have tried using REGEXP and LIKE syntax, but my results are off for both/I am not sure which operators to use.
The first select statement is close to trimming the URL to show just the code, but oftentimes will show a random string of numbers it may find in the URL string.
The second select statement is more rudimentary, but I am unsure which operators to use.
Which would be the quickest solution?
SELECT columnName, SUBSTR(columnName, LOCATE(columnName REGEXP "[^=\-][a-zA-Z]{3}[\d]{3,4}", columnName), LENGTH(columnName) - LOCATE(columnName REGEXP "[^=\-][a-zA-Z]{3}[\d]{3,4}", REVERSE(columnName))) AS extractedData FROM tableName
SELECT columnName FROM tableName WHERE columnName LIKE '%___###%' OR columnName LIKE '%___####%'
-- Will take a substring of this result as well
Example Data:
randomwebsite.com/3982356923abcd1ab?random_code=12480712_ABC_DEF_ANOTHER_CODE-xyz123&hello_world=us&etc_etc
In this case, the desired string is "xyz123" and the location of said pattern is variable based on each entry.
EDIT
SELECT column, LOCATE(column REGEXP "([a-zA-Z]{3}[0-9]{3,4}$)", column), SUBSTR(column, LOCATE(column REGEXP "([a-zA-Z]{3}[0-9]{3,4}$)", column), LENGTH(column) - LOCATE(column REGEXP "^.*[a-zA-Z]{3}[0-9]{3,4}", REVERSE(column))) AS extractData From mainTable
This expression is still not grabbing the right data, but I feel like it may get me closer.
I suggest using
REGEXP_SUBSTR(column, '(?<=[&?]random_code=[^&#]{0,256}-)[a-zA-Z]{3}[0-9]{3,4}(?![^&#])')
Details:
(?<=[&?]random_code=[^&#]{0,256}-) - immediately on the left, there must be & or &, random_code=, and then zero to 256 chars other than & and # followed with a - char
[a-zA-Z]{3} - three ASCII letters
[0-9]{3,4} - three to four ASCII digits
(?![^&#]) - that are followed either with &, # or end of string.
See the online demo:
WITH cte AS ( SELECT 'randomwebsite.com/3982356923abcd1ab?random_code=12480712_ABC_DEF_ANOTHER_CODE-xyz123&hello_world=us&etc_etc' val
UNION ALL
SELECT 'randomwebsite.com/3982356923abcd1ab?random_code=12480712_ABC_DEF_ANOTHER_CODE-xyz4567&hello_world=us&etc_etc'
UNION ALL
SELECT 'randomwebsite.com/3982356923abcd1ab?random_code=12480712_ABC_DEF_ANOTHER_CODE-xyz89&hello_world=us&etc_etc'
UNION ALL
SELECT 'randomwebsite.com/3982356923abcd1ab?random_code=12480712_ABC_DEF_ANOTHER_CODE-xyz00000&hello_world=us&etc_etc'
UNION ALL
SELECT 'randomwebsite.com/3982356923abcd1ab?random_code=12480712_ABC_DEF_ANOTHER_CODE-aaaaa11111&hello_world=us&etc_etc')
SELECT REGEXP_SUBSTR(val,'(?<=[&?]random_code=[^&#]{0,256}-)[a-zA-Z]{3}[0-9]{3,4}(?![^&#])') output
FROM cte
Output:
I'd make use of capture groups:
(?<=[=\-\\])([a-zA-Z]{3}[\d]{3,4})(?=[&])
I assume with [^=\-] you wanted to capture string with "-","\" or "=" in front but not include those chars in the result. To do that use "positive lookbehind" (?<=.
I also added a lookahead (?= for "&".
If you'd like to fidget more with regex I recommend RegExr

Is it possible in MySQL to search a table for where a column only contains 1 comma?

I have this column in a table which is comma delimited to separate the values.
Here's the sample data:
2003,2004
2003,2005
2003,2006
2003,2004,2005
2003,2007
I want to get all data that contains only 1 comma.
I've been playing around with the '%' and '_' wildcards, but I can't seem to get the results I need.
SELECT column FROM table WHERE column like '%_,%'
Replace the , with '' empty set then take the original length less the replaced length. if 1 then only 1 comma if > 1 then more than 1 comma.
The length difference would represent the number of commas.
Length(column) - length(Replace(column,',','')) as NumOfCommas
or
where Length(column) - length(Replace(column,',','')) =1
While this may solve the problem, I agree with what others have indicated. Storing multiple values in a single column in a RDBMS is asking for more trouble. Better to normalize the data and get it to at least 3rd Normal form!
You can also use find_in_set() method which searches a value in comma separated list, by picking the last value of column using substring_index we can then check result of find_in_set should be 2 so that its the second and last value from list
select *
from demo
where find_in_set(substring_index(data,',',-1),data) = 2
Demo
Maybe another solution is to use regular expression in your case it can look like this ^[0-9]{4},[0-9]{4}$ :
SELECT * FROM MyTable WHERE ColName REGEXP '^[0-9]{4},[0-9]{4}$'
Or if you want all non comma one or more time :
SELECT * FROM MyTable WHERE ColName REGEXP '^[^,]*,[^,]*$'

find string in first 10 characters

I have a table with fields containing html code. I would like to select only those rows where htmlfield contain string "<p>[[{"fid":" but only in first 10 characters of the html field (this field contains more of such strings and I want to find only fields that contain the string in the beginning).
Is it possible to do such select?
You can use a SUBSTRING() to grab the fields with that string in them.
SELECT field1
FROM TableName
WHERE SUBSTRING(field1, 1, 12) = '<p>[[{"fid":'
Example
You can also try using the LIKE function. Where you can use the % wildcard at the end of your string to get fields that start with that string.
SELECT field1
FROM TableName
WHERE field1 LIKE '<p>[[{"fid":%'
Example
Do it like this:
select SUBSTRING(field_name,1, 10) from table_name;
It will display 10 characters only for that specific field.
Hope it helps

MYSQL - Query Certain String's Length

For example I have a table column : lorem,ipsum,mini,momo,testing
I need to select total 'Comma' 's count.
Is it possible to query it as result of : 4 because having 4 comma in the column ?
Does mysql have any trick like select character length by filtering certain words like :
SELECT CHAR_LENGTH(section,',') FROM table WHERE 1 = 1 ?
You are close. Just use subtraction:
select length(section) - length(replace(section, ',', ''))
However, you shouldn't be storing lists of things in strings. Instead, use a junction table with one row per item.

Why does this search query return nothing?

I have this table under user_name='high'
function_description :
akram is in a date
test
akram is studying
test4
kheith is male
test3
I want a query that returns results of field that have at least an 'akram'
SELECT *
FROM functions
WHERE 'isEnabled'=1
AND 'isPrivate'=1
AND user_name='high'
AND function_description LIKE '%akram%'
and this returns absolutely nothing!
Why?
You are listing the column names as if they are strings. This is why it returns nothing.
Try this:
SELECT *
FROM functions
WHERE user_name='high'
AND function_description LIKE '%akram%'
edit: After trying to re-read your question... are isEnabled and isPrivate columns in this table?
edit2: updated.. remove those unknown columns.
You are comparing strings 'isEnabled' with integer 1, which likely leads to the integer being converted to a string, and the comparison then fails. (The alternative is that the string is converted to an integer 0 and the comparison still fails.)
In MySQL, you use back-quotes, not single quotes, to quote column and table names:
SELECT *
FROM `functions`
WHERE `isEnabled` = 1
AND `isPrivate` = 1
AND `user_name` = 'high'
AND `function_description` LIKE '%akram%'
In standard SQL, you use double quotes to create a 'delimited identifier'; in Microsoft SQL Server, you use square brackets around the names.
Please show the schema more carefully (column names, sample values, types if need be) next time.