Is there any way to check for a regex expression in a comma separated values column?
I have a column named storeId with values such as EMP_0345,00345,OPS, and I need to get only the store IDs that consist of digits only.
I am able to get the valid store_ids with the regex REGEXP '^[0-9]+$', but how do I get the values in a comma separated values column?
You violated a dogma of database design: never store more than one value in a single field if you need to access the values separately.
Even if you manage to REGEX your way around this, you will run into massive performance trouble. The correct way to tackle this is to move the contents of the CSV column into rows of a join table, then simply match against the single values of that join table.
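A minimal sketch of that join-table idea, written in Python with the standard sqlite3 module standing in for MySQL so it can be run as-is. The table name store_ids, its columns, and both helper functions are hypothetical; only the sample value and the digit-only pattern come from the question.

```python
import re
import sqlite3

# Same digits-only pattern as the question's REGEXP '^[0-9]+$'.
NUMERIC = re.compile(r"[0-9]+")


def build_join_table(conn, rows):
    """Split each CSV storeId value into one row per value (hypothetical schema)."""
    conn.execute("CREATE TABLE store_ids (row_id INTEGER, store_id TEXT)")
    for row_id, csv_value in rows:
        for value in csv_value.split(","):
            conn.execute(
                "INSERT INTO store_ids (row_id, store_id) VALUES (?, ?)",
                (row_id, value.strip()),
            )


def numeric_store_ids(conn):
    """Return the single values that consist of digits only."""
    cur = conn.execute("SELECT store_id FROM store_ids ORDER BY row_id")
    return [store_id for (store_id,) in cur if NUMERIC.fullmatch(store_id)]


conn = sqlite3.connect(":memory:")
build_join_table(conn, [(1, "EMP_0345,00345,OPS")])
result = numeric_store_ids(conn)
print(result)  # ['00345']
```

Once the values live in their own rows in MySQL, the filter could presumably be the original condition applied directly to the join table column, for example WHERE store_id REGEXP '^[0-9]+$'.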
Your database design is flawed - given your current request.
Whether you should (try to) convert that column into several columns in the current (or a different) table, or rather into rows in a different table does primarily depend on whether or not there is some structure in that column's data.
With some inherent structure, you could use something like
SELECT
storeId
, SUBSTRING_INDEX(storeId, ',', 1) AS some_column
, SUBSTRING_INDEX(SUBSTRING_INDEX(storeId, ',', 2), ',', -1) AS store_id
, SUBSTRING_INDEX(storeId, ',', -1) AS another_column
FROM T
WHERE storeId REGEXP '^[^,]+,[0-9]+,[^,]+$'
;
to separate the values (and potentially populate newly added columns). The WHERE clause would let you differentiate between sets of rows with specific arrangements of values in the column in question.
See in action / more detail: SQLFiddle.
Please comment if and as adjustment / further detail is required, or update your request to provide more detailed input.
I have a MySQL column which contains a string of scores separated by a semi-colon eg: "5;21;24;25;26;28;117".
This column was created not by design, but by collecting the values from multiple rows in a table using GROUP_CONCAT and GROUP BY. The original data arrived as a spreadsheet with multiple rows with the ID value.
I can use a select clause with REPLACE function to replace the ; with a +.
SELECT `values`, REPLACE(`values`, ';', '+') AS score FROM [table_name] WHERE 1
values                 score
5;21;24;25;26;28;117   5+21+24+25+26+28+117
However what I need is the sum of: 5+21+24+25+26+28+117 to get a total of 246.
Is there any way to do this in MySQL without using some other scripting language?
The SELECT clause shows me a string of numbers joined with the + symbol.
I am looking for a way to evaluate that string to give me the result: 246
UPDATE:
As I was framing my question, I did more research and came up with this link which solves my problem:
(https://dba.stackexchange.com/questions/120747/evaluate-a-string-value-as-a-computed-expression-in-an-sql-statement-sthg-like).
I am keeping this question and the link to the answer here in case it helps other people searching for the same thing.
I have a column named views in a table, holding values like: 165,75,44,458,458,42,45
This column contains the user_ids of the users who viewed a link.
I want to explode it by the comma delimiter and count the ids.
I tried this, but it counts all the characters instead.
SQL:
SELECT LENGTH(views) FROM questions WHERE question_id=1;
This is the PHP version that I want to do in SQL
$viewers_ids = "165,75,44,458,458,42,45";
$num_of_views = count(array_filter(explode(",", $viewers_ids)));
You can count the commas:
select 1 + length(views) - length(replace(views, ',', ''))
That said, you should fix your data structure. You should not be storing multiple numeric ids in a string column -- that is just wrong in many ways. You should be using a junction/association table.
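As a quick check of the comma-counting expression, here is a small sketch that runs it through Python's sqlite3 module (SQLite is used only as a stand-in, since its length and replace functions behave the same way for this case). The second row with an empty string is an invented edge case, not data from the question.

```python
import sqlite3

# The expression from the answer, applied to the questions table.
COUNT_SQL = (
    "SELECT 1 + length(views) - length(replace(views, ',', '')) "
    "FROM questions WHERE question_id = ?"
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE questions (question_id INTEGER PRIMARY KEY, views TEXT)")
conn.executemany(
    "INSERT INTO questions VALUES (?, ?)",
    [(1, "165,75,44,458,458,42,45"), (2, "")],  # row 2 is a made-up edge case
)


def count_ids(question_id):
    """Count list entries as (number of commas) + 1."""
    (n,) = conn.execute(COUNT_SQL, (question_id,)).fetchone()
    return n


print(count_ids(1))  # 7
print(count_ids(2))  # 1, even though the list is empty
```

Note the difference from the PHP version in the question: array_filter drops empty entries, so an empty string counts as 0 there, while the comma-counting expression reports 1. Duplicate ids such as 458 are counted twice in both versions.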
I tried it the other way around, i.e. replacing all characters other than the comma:
SELECT LENGTH(REGEXP_REPLACE(views, '[^,]*', '')) + 1
FROM questions;
My immediate reaction is that your table structure needs to change - I would expect there to be another table that joins question links and user_ids with a "visit_id" perhaps, and use visit_id as a foreign key in the views table.
My second reaction is that this is a perfect problem for a shell utility - maybe sed 's/,/ /g' followed by wc -w, or a small awk script.
To do it in SQL, I might try to figure out how to count the number of commas, and add one. This answer gives you a function that will get you the number of commas by replacing them with "", then subtracting the resulting length from the length of the unmodified field: Count the number of occurrences of a string in a VARCHAR field?
When I run the following query, I am returned two entries with duplicate results. Why are duplicate results returned when I’m using distinct here? The primary keys are the house number, street name, and unit number.
SELECT distinct
house_num,
Street_name,
Unit_Designator,
Unit_Num
FROM voterinfo.voter_info
WHERE house_num = 420
AND street_name = "PARK"
AND Unit_Num = ''
AND Unit_Designator = '';
select distinct is a statement that ensures that the result set has no duplicate rows. That is, it filters out rows where every column is the same (and NULL values are considered equal).
It does not look at a subset of columns.
Sometimes, people use select distinct and don't realize that it applies to all columns. It is rather amusing when the first column is in parentheses -- as if parentheses make a difference (they don't).
Then, you might also have situations where values look the same but are not.
Consider this simple example where values differ by only a space at the end of the string:
select distinct x
from (select 'a' as x union all
select 'a '
) y;
Here is a db<>fiddle with this example.
This returns two rows, not 1.
Without sample data it is hard to say which of these situations you are referring to. But the rows that you think are "identical" really are not.
For fields with a CHAR or similar datatype (Street_name, Unit_Designator), there may be spaces that are not visible in the query editor; these need to be removed by applying appropriate trimming logic. Please refer to the link below:
MySQL select fields containing leading or trailing whitespace
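The trailing-space effect and the trimming fix can be reproduced with a short sketch. It uses Python's sqlite3 module as a stand-in, which compares strings exactly by default; in MySQL the outcome can depend on the collation in use, so treat this as an illustration of the idea rather than of a specific server's behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two values that look identical on screen but differ by a trailing space.
SOURCE = "(SELECT 'a' AS x UNION ALL SELECT 'a ') AS y"

raw = conn.execute(f"SELECT DISTINCT x FROM {SOURCE}").fetchall()
trimmed = conn.execute(f"SELECT DISTINCT trim(x) FROM {SOURCE}").fetchall()

print(len(raw))      # 2: 'a' and 'a ' are different values
print(len(trimmed))  # 1: after trimming they collapse into one row
```

Applied to the voter_info query, the analogous step would be to wrap the character columns in TRIM() inside the SELECT DISTINCT list, assuming stray whitespace really is the cause.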
I have a comma separated string like "1,2,3", and a column in a table that also contains comma separated values like "1,2,4,5,3". How do I get all records where any value in the string matches any value in the column?
for example
id---category
1---1,2,4,5
2---1,2,3,6
3---2,3,5
If I search for the string "1,2,3", I should get every record whose category contains 1 or 2 or 3, or 1,2 or 1,3 or 2,3, or 1,2,3. It should not return duplicate records; grouping them is fine.
Is it possible to get all these records with a single query?
Try this:
select * from table where category IN (1,2,3);
You should not be storing lists of ids in strings. Here are reasons why:
Values should be stored using the correct type. int <> string.
SQL has lousy string processing functions.
Foreign keys should be properly declared.
SQL will not be able to optimize these queries.
SQL has a great way to store lists. It is called a table not a string.
But, sometimes you are stuck with someone else's really, really, really, really, really bad data modeling decisions. You can do something, using regular expressions:
where category regexp replace($string, ',', '|')
or perhaps more accurately:
where concat(',', category, ',') regexp concat(',', replace($string, ',', ',|,'), ',')
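To see why the second form is the more accurate one, the two patterns can be rebuilt outside the database. This is only a sketch in Python's re module: the helper names are hypothetical, row 4 is an invented row added to expose the difference, and it assumes the ids are plain digits so nothing needs regex escaping.

```python
import re


def naive_pattern(search):
    # Mirrors: replace($string, ',', '|')  ->  "1|2|3"
    return search.replace(",", "|")


def delimited_pattern(search):
    # Mirrors: concat(',', replace($string, ',', ',|,'), ',')  ->  ",1,|,2,|,3,"
    return "," + search.replace(",", ",|,") + ","


def matches_naive(category, search):
    return re.search(naive_pattern(search), category) is not None


def matches_delimited(category, search):
    # The column is wrapped in commas too, like concat(',', category, ',').
    return re.search(delimited_pattern(search), "," + category + ",") is not None


# Rows 1-3 come from the question; row 4 is a made-up counterexample.
rows = {1: "1,2,4,5", 2: "1,2,3,6", 3: "2,3,5", 4: "11,12,13"}
search = "1,2,3"

naive_hits = [i for i, c in rows.items() if matches_naive(c, search)]
delimited_hits = [i for i, c in rows.items() if matches_delimited(c, search)]

print(naive_hits)      # [1, 2, 3, 4]  (row 4 matches because "11" contains "1")
print(delimited_hits)  # [1, 2, 3]
```

The comma wrapping forces each alternative to match a whole list entry, which is what keeps ids like 11 or 13 from matching a search for 1 or 3.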
I have a special data environment where I need the data returned in a certain way to populate a table.
This is my current query:
SELECT
bs_id,
IF(bs_board = 0, 'All Boards', (SELECT b_name FROM certboards WHERE b_id IN (REPLACE(bs_board, ';', ',')))) AS board
FROM boardsubs
As you can see I have an if statement then a special subselect.
The reason I have this is that the field bs_board is a varchar field containing multiple row IDs like so:
1;2;6;17
So, the query as it stands works, but it only returns the first matched b_name. I need it to return all matches. For instance, if the field was 1;2, it should return two boards, Board 1 and Board 2, in the same column. Later I can deal with adding a <br> in between each result.
The problem I am dealing with is that the names have to come back in a single column, whether that is two names or all of them, since the field can contain as many IDs as the original editor selected.
This will not work the way you're thinking it will work.
Let's say bs_board is '1;2;3'
In your query, REPLACE(bs_board, ';', ',') will resolve to '1,2,3', which is a single literal string. This makes your final subquery:
SELECT b_name FROM certboards WHERE b_id IN ('1,2,3')
which is equivalent to:
SELECT b_name FROM certboards WHERE b_id = '1,2,3'
The most correct solution to the problem is to normalize your database. Your current system of storing multiple values in a single field is exactly what you should never do with an RDBMS, and this is exactly why. The database is not designed to handle this kind of field. You should have a separate table with one row for each bs_board, and then JOIN the tables.
There are no good solutions to this problem. It's a fundamental schema design flaw. The easiest way around it is to fix it with application logic. First you run:
SELECT bs_id, bs_board FROM boardsubs
From there you parse the bs_board field in your application logic and build the actual query you want to run:
SELECT bs_id,
IF(bs_board = 0, 'All Boards', (SELECT b_name FROM certboards WHERE b_id IN (<InsertedStringHere>))) AS board
FROM boardsubs
There are other ways around the problem, but you will have problems with sorting order, matching, and numerous other problems. The best solution is to add a table and move this multi-valued field to that table.
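A rough sketch of what the added table could look like, run through Python's sqlite3 module as a stand-in for MySQL. The junction table boardsub_boards and all sample rows are hypothetical, and the 'All Boards' special case from the question is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE certboards (b_id INTEGER PRIMARY KEY, b_name TEXT);
CREATE TABLE boardsubs (bs_id INTEGER PRIMARY KEY);
-- Hypothetical junction table: one row per (subscription, board) pair,
-- replacing the '1;2;6' style string in bs_board.
CREATE TABLE boardsub_boards (bs_id INTEGER, b_id INTEGER);
INSERT INTO certboards VALUES (1, 'Board 1'), (2, 'Board 2'), (6, 'Board 6');
INSERT INTO boardsubs VALUES (10), (11);
INSERT INTO boardsub_boards VALUES (10, 1), (10, 2), (11, 6);
""")

rows = conn.execute("""
    SELECT bs.bs_id, group_concat(cb.b_name, '<br>') AS board
    FROM boardsubs AS bs
    JOIN boardsub_boards AS bb ON bb.bs_id = bs.bs_id
    JOIN certboards AS cb ON cb.b_id = bb.b_id
    GROUP BY bs.bs_id
    ORDER BY bs.bs_id
""").fetchall()

# Each row holds all board names for one subscription in a single column.
# The order of names inside the concatenated string is not guaranteed here.
for bs_id, board in rows:
    print(bs_id, board)
```

In MySQL the aggregate would be written as GROUP_CONCAT(cb.b_name SEPARATOR '<br>'), optionally with an ORDER BY inside the call to fix the order of the names.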
The b_id IN (REPLACE(bs_board, ';', ',')) will result in b_id IN ('1,2,6,17'), which is different from b_id IN (1,2,6,17), which is what you are looking for.
To make it work either parse the string before doing the query, or use prepared statements.
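One way to combine both suggestions is to parse the string in application code and bind each id as its own parameter. The sketch below uses Python with sqlite3 as a stand-in for the real database driver; the function board_names and the sample rows are hypothetical.

```python
import sqlite3


def board_names(conn, bs_board):
    """Parse a '1;2;6' style string and look the ids up with bound parameters."""
    ids = [int(part) for part in bs_board.split(";") if part.strip()]
    if not ids:
        return []
    # One placeholder per id, e.g. "?,?,?", so each id is a separate value
    # instead of one literal string like '1,2,6'.
    placeholders = ",".join("?" for _ in ids)
    sql = f"SELECT b_name FROM certboards WHERE b_id IN ({placeholders}) ORDER BY b_id"
    return [name for (name,) in conn.execute(sql, ids)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE certboards (b_id INTEGER PRIMARY KEY, b_name TEXT)")
conn.executemany(
    "INSERT INTO certboards VALUES (?, ?)",
    [(1, "Board 1"), (2, "Board 2"), (6, "Board 6")],
)

names = board_names(conn, "1;2")
print(names)  # ['Board 1', 'Board 2']
```

Only the placeholder list is interpolated into the SQL text; the ids themselves are converted to integers and passed as parameters, which avoids both the single-string problem and injection through the stored value.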