How to replace delimiters from a string in SQL Server - sql-server-2008

I have the following data
abc
pqr
xyz,
jkl mno
This is one string separated by delimiters like space, new line, comma, tab.
There could be two or more consecutive spaces or tabs or any delimiter after or before a word.
I would like to be able to do the following
Get the individual words, with all leading and trailing delimiters removed
Join the individual words with "OR"
I am trying to achieve this to build a T-SQL query separated by OR clause.
Thanks

I think you can achieve what you need using just SQL (although I think a programming language would be a better fit). Here is my approach.
Note that I will only handle commas, newlines and multiple spaces, but you can simply follow the same technique to remove the rest of your undesired characters.
Let's assume that we have a table named ExampleData with a column named DataBefore and another called DataAfter.
DataBefore: has the line value that you want to clean
DataAfter: will host the cleaned text
First we need to trim the leading and trailing space(s) from the text:
Update ExampleData
set DataAfter = LTRIM(RTRIM(DataBefore))
Second, we replace every comma and newline character (carriage return and line feed) with a space. It doesn't matter if we end up with several spaces in a row:
Update ExampleData
set DataAfter = replace(replace(replace(DataAfter,',',' '),char(13),' '),char(10),' ')
This is the point at which you may continue and replace any other unwanted character with a space, using the same technique.
So far we have text in which every comma, newline, tab, dash, etc. has been replaced by a space. Let's continue the cleanup.
We can now safely replace each run of spaces between words with a single space. This is done with the following SQL statement:
Update ExampleData
set DataAfter = replace(replace(replace(DataAfter,' ','<>'),'><',''),'<>',' ')
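The three nested replaces are easier to trust once traced by hand. Here is a minimal Python sketch of the same trick, not part of the original answer. The name collapse_spaces is made up for illustration, and like the SQL version it assumes the text contains no '<' or '>' characters.

```python
def collapse_spaces(text: str) -> str:
    """Collapse runs of spaces into one, mirroring the nested REPLACE calls."""
    marked = text.replace(" ", "<>")      # "a   b"    -> "a<><><>b"
    squeezed = marked.replace("><", "")   # "a<><><>b" -> "a<>b"
    return squeezed.replace("<>", " ")    # "a<>b"     -> "a b"


print(collapse_spaces("abc   pqr  xyz"))  # prints: abc pqr xyz
```

Every space becomes a '<>' pair, adjacent pairs meet as '><' and cancel out, and the one pair left per run is turned back into a space (or into ' OR ' in the next step).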
As per your needs, we need to place an OR between the words. This is achievable with this SQL statement:
Update ExampleData
set DataAfter = replace(replace(replace(DataAfter,' ','<>'),'><',''),'<>',' OR ')
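As a final step, which may or may not change anything, we remove any space at the end of the whole text, in case an unwanted character at the end of the text was replaced by a space. Note that this removes only the trailing space itself: if the text ended in a delimiter, the previous statement has already turned that space into a dangling OR, so in that case run the LTRIM(RTRIM(...)) step again after replacing the delimiters and before inserting the OR. The final trim is: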
Update ExampleData
set DataAfter = RTRIM(DataAfter)
we are now done. :)
as a test, I've generated the following text inside the DataBefore column:
this is just a, test, to be sure, that everything is, working, great .
and after running the previous commands, ended up with this value inside the DataAfter column:
this OR is OR just OR a OR test OR to OR be OR sure OR that OR everything OR is OR working OR great OR .
Hope that this is what you want, let me know if you need any extra help :)
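For comparison, since the answer mentions that a programming language may be a better fit, here is a rough Python sketch of the whole task. It is an illustration only: the function name words_joined_by_or is hypothetical, and the delimiter set (whitespace and commas) is an assumption based on the question.

```python
import re


def words_joined_by_or(text: str) -> str:
    """Split on runs of whitespace/commas and join the words with ' OR '."""
    # Leading or trailing delimiters produce empty strings, which are dropped.
    words = [w for w in re.split(r"[\s,]+", text) if w]
    return " OR ".join(words)


print(words_joined_by_or("abc\npqr\nxyz,\njkl \t mno"))
# prints: abc OR pqr OR xyz OR jkl OR mno
```

Because empty pieces are filtered out before joining, a delimiter at the start or end of the input never leaves a dangling OR.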

Related

easy way to query without putting everything in quotation marks

How do I query in MySQL without putting all the values in quotation marks by hand? (I have a big list and it would take too much time to quote every word.)
Example:
SELECT *
FROM names
WHERE names.first IN ("joe", "tom", "vincent")
Since you said the list is comma separated, simply use the 'find and replace' feature to find all commas and replace them with ","
The result should be joe","tom","vincent"," which you can simply copy into mysql.
All you then have to do is edit the start and end of the string
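If a scripting language is at hand, the same find-and-replace can be done in a couple of lines. This is only a sketch: quote_list is a made-up helper, and it assumes the values themselves contain no commas or double quotes (it does no escaping, so it is not safe for untrusted input).

```python
def quote_list(raw: str) -> str:
    """Turn 'joe, tom, vincent' into '"joe", "tom", "vincent"'."""
    # Assumption: items contain no commas or double quotes; nothing is escaped.
    items = [item.strip() for item in raw.split(",") if item.strip()]
    return ", ".join(f'"{item}"' for item in items)


print(quote_list("joe, tom, vincent"))  # prints: "joe", "tom", "vincent"
```

The output can be pasted between the parentheses of the IN (...) clause, with no need to fix up the start and end of the string.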

How to prevent entering text containing spaces in mySQL

It happens occasionally that users erroneously enter text with a trailing space in a text column, which is hard to spot visually. This can later cause problems when this text field has to be matched against another where the trailing space is not present. Is it possible in MySQL to enforce that a text string cannot contain a certain character (space in this case)?
Thankful for feedback!
There are a number of ways to achieve what you want:
Check the user input in your application and reject it if it contains a space. If your primary worry is the quality of the user inputs, then this is probably the best way to do this.
You can remove spaces (or just leading/trailing spaces) from the user input either in the application logic or using SQL.
If you opt for removing all spaces from the user input in SQL, then use the replace() function. If you just want to remove the leading and trailing spaces, then use the trim() function.
A simple way using a MySQL function is based on trim():
select
trim(' try with trim ')
, length (trim(' try with trim '))
, length (' try with trim ')
from dual ;
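If the cleanup is done in the application logic instead, the two options look like this in Python. This is a sketch for illustration; strip(" ") is used rather than a bare strip() so that only spaces are removed, which is an attempt to mirror what TRIM() does by default.

```python
padded = " try with trim "

trimmed = padded.strip(" ")          # remove leading and trailing spaces only
no_spaces = padded.replace(" ", "")  # remove every space, inner ones included

print(repr(trimmed), len(trimmed), len(padded))  # prints: 'try with trim' 13 15
print(repr(no_spaces))                           # prints: 'trywithtrim'
```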

mysql select query ignoring inner spaces

Banging my head against the wall with this one.
I have a table containing postcodes and street names, and I have another table where houses are listed for sale (where the street name is missing), and I am trying to get the street name for each postcode.
The problem is that table 1 stores the postcode without the space and table 2 which I am trying to update stores the post code with the space.
So in table 1 the postcode is stored as "l249pb" and table 2 it is stored as "l24 9pb".
Now if the postcodes were both stored in exactly the same format, i.e. without the space, I would expect this query to work:
UPDATE Table1
INNER JOIN Table2 ON ( Table1.PostCode = Table2.PostCode )
SET Table1.StreetName = Table2.StreetName
I have tried this but it won't work:
UPDATE Table1
INNER JOIN Table2 ON ( Table1.PostCode = REPLACE(Table2.PostCode,' ',''))
SET Table1.StreetName = Table2.StreetName
can anyone tell me how to check for a match ignoring spaces ( like a trim but removing every space )
Many thanks for any help you can offer.
With the data you've given your UPDATE runs just fine. Probably the whitespaces you see are not actually spaces, but something else, e.g. non-breaking spaces, tabs etc.
After normal SPACE, the next most common white spaces (which are not line breaks) are CHARACTER TABULATION (ie. horizontal tab) and NO-BREAK SPACE. You could use CHAR(9) and CHAR(160), respectively, to reference them in your query.
It also might be possible that your table viewer application shows line breaks as a space for brevity, so if replacing space, tab and nbsp isn't enough, try replacing those, too.
If you really need to replace all white space characters… Unfortunately there is no "white space wildcard" to use in MySQL. Technically, you could make a monster REPLACE(REPLACE(REPLACE(REPLACE…-call, which, in the end, would replace all whitespace characters with ''. For example, to replace every THREE-PER-EM SPACE, first look up its Unicode code point (U+2004), then you can replace its occurrences e.g. with:
REPLACE(PostCode, CHAR(0x2004 using ucs2), '')
There is a hackish shortcut to this: if you are sure that your data should contain only Latin-1 characters and no ? (question mark), you could CONVERT() the string to latin1 first, which replaces every character outside that range with ?, and then replace every ? with '':
REPLACE(CONVERT(PostCode using latin1), '?', '')
This can be useful in one-off, manual queries, but for continuing use, better replace the characters explicitly.
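To see what the shortcut does, here is an analogous round trip in Python. It illustrates the idea and is not MySQL's own conversion code; strip_non_latin1 is a hypothetical name.

```python
def strip_non_latin1(text: str) -> str:
    """Drop every character that cannot be represented in Latin-1."""
    # errors="replace" substitutes "?" for each unencodable character.
    as_latin1 = text.encode("latin-1", errors="replace").decode("latin-1")
    # Caveat carried over from the answer: genuine "?" characters vanish too.
    return as_latin1.replace("?", "")


print(strip_non_latin1("l24\u20049pb"))  # prints: l249pb
```

Note that NO-BREAK SPACE (U+00A0) is itself a Latin-1 character, so this trick leaves it in place. It has to be replaced explicitly.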
But first you should check your data input sanitizer/validator, so future records won't be such a mess. Perhaps you could consider running a bulk replace to normalize the data on PostCode column(s), if possible, before even trying to do your join query. Legacy systems with legacy data only get worse over time.
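A sanitizer along those lines could look roughly like the following Python sketch. It assumes the postcode passes through application code before it is stored, and normalize_postcode is a made-up name.

```python
def normalize_postcode(raw: str) -> str:
    """Remove every whitespace character, not just the plain space."""
    # str.isspace() is true for space, tab, line breaks, NO-BREAK SPACE
    # (U+00A0), THREE-PER-EM SPACE (U+2004) and other Unicode spaces.
    return "".join(ch for ch in raw if not ch.isspace())


for raw in ("l24 9pb", "l24\u00a09pb", "l24\t9pb\n", "l24\u20049pb"):
    print(normalize_postcode(raw))  # prints: l249pb (four times)
```

Storing the normalized form in both tables lets the join compare the columns directly, without a REPLACE() in the ON clause.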

Delete all characters before and after quotation marks

I have a CSV file, which has two columns and 4500 rows. In one column, I have several phrases that are surrounded by quotation marks. I need to delete all the text that comes before and after the quotation marks.
For example:
How would you say "Hello, my Friend" when speaking outside?
should become "Hello, my Friend"
I also have several rows that have the word NULL in the second column. I need these rows deleted in full.
What's the best way of doing something like this? I have been looking at regular expressions, but I'm not sure if they are flexible enough to do what I want to do, or how you would use them on a CSV file (I need the table structure to remain).
EDIT:
1) At the moment I am just using Apple Numbers, but I know that won't do it, so I am open to any suggestions. It must support Kanji characters.
2) I have removed all the NULL rows, so that is no longer needed (I simply added a column of numbers, sorted the table so all the NULLs were together, deleted them and then sorted back by the column of numbers).
Find a text editor that supports regular expression search and replace.
Something like this would match ,NULL in the second column: ^.*,NULL.*$. Replace it with "DELETEMEDELETEME" to mark the line, or with an empty string, or find a way to have it also match '\n' or '\r' so that the line break is caught and the entire line is removed.
Stripping out parts of the quoted string might work like this:
^(.*,){n}(.*)(\".*\")(.*)(,.*)$ replaced with \1\3\5, where n is the number of columns preceding the one you want to edit. Repeat (.*,) by hand if {n} is not available. It will depend on the regex flavor of your tool.
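If no suitable editor is available, a short script can do both jobs while keeping the table structure intact. The following Python sketch is one possible approach, not something from the original answer. The column positions, the clean_rows name and the sample data are assumptions. For a real file, open it with an explicit encoding such as encoding="utf-8" so that Kanji survive.

```python
import csv
import io
import re

QUOTED = re.compile(r'"[^"]*"')  # first double-quoted phrase, quotes included


def clean_rows(rows, text_col=0, null_col=1):
    """Yield rows with text_col cut down to its quoted phrase; skip NULL rows."""
    for row in rows:
        if row[null_col] == "NULL":
            continue  # drop the whole row
        match = QUOTED.search(row[text_col])
        if match:
            row = row[:text_col] + [match.group(0)] + row[text_col + 1:]
        yield row


# Hypothetical sample; inner quotes are doubled, as CSV requires.
sample = (
    '"How would you say ""Hello, my Friend"" when speaking outside?",greeting\n'
    "Some other phrase,NULL\n"
)
for row in clean_rows(csv.reader(io.StringIO(sample))):
    print(row)
```

Parsing with the csv module rather than a single regex over the raw line means commas inside the quoted phrase cannot be mistaken for column separators.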

MySQL - remove whitespace before and after a given string or char

Is it possible to update a row by removing the whitespace (1 or more spaces) before and after a given string or char? I need all spaces before and after a specific char (#) to be stripped but also leave the other spaces in the cell intact.
Example:
'This is a simple # example'
should be updated to
'This is a simple#example'
likewise:
'This is another #example'
should be updated to
'This is another#example'
I can do this using PHP but it would be much easier if there was a way to have this done in a single query.
There is no built-in regex replace you can use here (MySQL versions before 8.0 have no REGEXP_REPLACE). So the options you have are:
extend MySQL with a regex-replace:
You could use a UDF (user defined function, this one to be exact) to do a regex replace. You can then use something like \s*#\s* to indicate multiple spaces around an #.
Use the default replace:
You'd need to hardcode your replaces to account for several spaces before and after the #, a cumbersome (and never complete) task. You'd end up with a lot of replaces, or a repetition of a certain function (remove one space next to the # sign, 100 times over). To be clear, the double replace mentioned in the comments and in @luksch's answer will not suffice for more than 1 space.
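For reference, this is what the \s*#\s* pattern does, shown in Python because the question mentions that doing it outside the database is an option. It is a sketch only, and tighten_hash is a made-up name. Note that \s also matches tabs and line breaks; use " *# *" instead if only plain spaces should go.

```python
import re


def tighten_hash(line: str) -> str:
    """Remove all whitespace directly before and after every '#'."""
    return re.sub(r"\s*#\s*", "#", line)


print(tighten_hash("This is a simple   #   example"))  # prints: This is a simple#example
print(tighten_hash("This is another #example"))        # prints: This is another#example
```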
What about this:
SELECT REPLACE(REPLACE(line, ' #', '#'), '# ', '#') FROM tab1;
UPDATE:
to allow for the removal of any number of spaces before or after the # do this:
SELECT CONCAT(RTRIM(LEFT(line,LOCATE('#',line)-1)),'#',
LTRIM(SUBSTR(line,LOCATE('#',line)+1)))
FROM tab1;
see the new sqlfiddle to play around with this.
Note that if you have more than one # in the line this method only works for the first.
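The split-at-the-first-# logic of that query can be mirrored step by step in Python, which makes the limitation easy to see. This is an illustrative sketch with a hypothetical function name. Unlike the SQL, it returns the line unchanged when there is no # at all.

```python
def tighten_first_hash(line: str) -> str:
    """Trim spaces around the first '#' only, like the LOCATE-based query."""
    left, sep, right = line.partition("#")  # split at the first '#'
    if not sep:
        return line  # no '#' present: leave the line untouched
    # RTRIM the part before the '#', LTRIM the part after it.
    return left.rstrip(" ") + sep + right.lstrip(" ")


print(tighten_first_hash("This is a simple   #   example"))  # prints: This is a simple#example
print(tighten_first_hash("a  #  b  #  c"))                   # prints: a#b  #  c
```

The second call shows that any # after the first keeps its surrounding spaces.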