SQL: select unique substrings from the table by mask


There is a SQL table mytable that has a column mycolumn.
Each cell contains text, which may include substrings such as "this.text/31/" or "this.text/72/" (the numbers can be anything).
What SQL query would display a list of the unique substrings of this kind?
P.S. Some cells may contain several such substrings.
Answers to the questions from the comments:
The query is supposed to work on SQL Server.
The preferred output should contain the whole substring, not only the numeric part. In fact, the part between the first "/" and the second "/" might not be just a number.
The column type is (probably) varchar.
Example:
mycolumn contains such values:
abcd/eftthis.text/31/sadflh adslkjh
abcd/eftthis.text/44/khjgb ljgnkhj this.text/447/lhkjgnkjh
ljgkhjgadsvlkgnl
uygouyg/this.text/31/luinluinlugnthis.text/31/ouygnouyg
khjgbkjyghbk
The query should display:
this.text/31/
this.text/44/
this.text/447/

How about using a recursive CTE:
CREATE TABLE #myTable
(
myColumn VARCHAR(100)
)
INSERT INTO #myTable
VALUES
('abcd/eftthis.text/31/sadflh adslkjh'),
('abcd/eftthis.text/44/khjgb ljgnkhj this.text/447/lhkjgnkjh'),
('ljgkhjgadsvlkgnl'),
('uygouyg/this.text/31/luinluinlugnthis.text/31/ouygnouyg'),
('khjgbkjyghbk')
;WITH CTE
AS
(
SELECT MyColumn,
CHARINDEX('this.text/', myColumn, 0) AS startPos,
CHARINDEX('/', myColumn, CHARINDEX('this.text/', myColumn, 1) + 10) AS endPos
FROM #myTable
WHERE myColumn LIKE '%this.text/%'
UNION ALL
SELECT T1.MyColumn,
CHARINDEX('this.text/', T1.myColumn, C.endPos) AS startPos,
CHARINDEX('/', T1.myColumn, CHARINDEX('this.text/', T1.myColumn, c.endPos) + 10) AS endPos
FROM #myTable T1
INNER JOIN CTE C
ON C.myColumn = T1.myColumn
WHERE SUBSTRING(T1.MyColumn, C.EndPos, 100) LIKE '%this.text/%'
)
SELECT DISTINCT SUBSTRING(myColumn, startPos, endPos - startPos + 1) -- + 1 keeps the closing '/'
FROM CTE
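To check the expected output outside the database, the scanning logic of the recursive CTE can be sketched in Python. This is only an illustration under assumptions: the name extract_masked and its parameters are made up here, and str.find stands in for CHARINDEX.

```python
def extract_masked(text, prefix="this.text/", closer="/"):
    """Return every 'prefix...closer' substring in text, in order of appearance."""
    found = []
    pos = 0
    while True:
        start = text.find(prefix, pos)                  # like CHARINDEX(prefix, text, pos)
        if start == -1:
            break
        end = text.find(closer, start + len(prefix))    # closing '/' after the prefix
        if end == -1:
            break
        found.append(text[start:end + 1])               # + 1 keeps the closing '/'
        pos = end + 1                                   # continue after this occurrence
    return found


rows = [
    "abcd/eftthis.text/31/sadflh adslkjh",
    "abcd/eftthis.text/44/khjgb ljgnkhj this.text/447/lhkjgnkjh",
    "ljgkhjgadsvlkgnl",
    "uygouyg/this.text/31/luinluinlugnthis.text/31/ouygnouyg",
    "khjgbkjyghbk",
]

# The set plays the role of SELECT DISTINCT.
unique = sorted({s for row in rows for s in extract_masked(row)})
print(unique)
```

The loop restarts the search just past each closing slash, which is the same job the recursive member of the CTE does with endPos.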

Having a table named test with the following data:
COLUMN1
aathis.text/31/
this.text/1/
bbbthis.text/72/sksk
Could this be what you are looking for?
select SUBSTR(COLUMN1,INSTR(COLUMN1,'this.text', 1 ),INSTR(COLUMN1,'/',INSTR(COLUMN1,'this.text', 1 )+10) - INSTR(COLUMN1,'this.text', 1 )+1) from test;
result:
this.text/31/
this.text/1/
this.text/72/
I see your problem:
Assume the same table as above but now with the following data:
this.text/77/
xxthis.text/33/xx
xthis.text/11/xxthis.text/22/x
xthis.text/1/x
The following might help you:
SELECT SUBSTR(COLUMN1, INSTR(COLUMN1,'this.text', 1 ,1), INSTR(COLUMN1,'/',INSTR(COLUMN1,'this.text', 1 ,1)+10) - INSTR(COLUMN1,'this.text', 1 ,1)+1) FROM TEST
UNION
SELECT CASE WHEN (INSTR(COLUMN1,'this.text', 1,2 ) >0) THEN
SUBSTR(COLUMN1, INSTR(COLUMN1,'this.text', 1,2 ), INSTR(COLUMN1,'/',INSTR(COLUMN1,'this.text', 1 ,2),2) - INSTR(COLUMN1,'this.text', 1,2 )+1) end FROM TEST;
it will generate the following result:
this.text/1/
this.text/11/
this.text/22/
this.text/33/
this.text/77/
The downside is that you need to add a SELECT statement for every occurrence of "this.text" you might have. If a single cell can contain 100 occurrences of "this.text", that becomes a problem.

SQL> select SUBSTR(column_name,1,9) from tablename;
column_name
this.text

SELECT REGEXP_SUBSTR(column_name,'this\.text/[[:digit:]]+/')
FROM table_name
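The regular-expression idea can be tried out in Python as a rough sketch. Two caveats: REGEXP_SUBSTR returns only the first match per row by default, whereas re.findall below returns every match; and the cells list simply reuses the sample data from the earlier answer. The names MASK, cells and matches are illustrative.

```python
import re

# "\." matches a literal period; a bare "." would match any character.
MASK = re.compile(r"this\.text/[0-9]+/")

cells = [
    "this.text/77/",
    "xxthis.text/33/xx",
    "xthis.text/11/xxthis.text/22/x",
    "xthis.text/1/x",
]

# findall returns all non-overlapping matches in a cell; the set removes duplicates.
matches = sorted({m for cell in cells for m in MASK.findall(cell)})
print(matches)
```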

Related

Teradata SQL Split Single String into Table Rows

I have a single string, for example "(1111, Tem1), (0000, Tem2)", and I want to generate a table like this:
var1 var2
1111 Tem1
0000 Tem2
This is my code. I compute a lagged token and then filter the rows by token parity.
with var_ as (
select '(1111, Tem1), (0000, Tem2)' as pattern_
)
select tbb1.*, tbb2.result_string as result_string_previous
from(
select tb1.*,
min(token) over(partition by 1 order by token asc rows between 1 preceding and 1 preceding) as min_token
from
table (
strtok_split_to_table(1, var_.pattern_, '(), ')
returns (outkey INTEGER, token INTEGER, result_string varchar(20))
) as tb1) tbb1
inner join (select min_token, result_string from tbb1) tbb2
on tbb1.token = tbb2.min_token
where (token mod 2) = 0;
But it seems that I can't define a derived table in the FROM clause and then reference it directly in the JOIN.
Is it still possible to get the result I want with this approach, or is there a better way?
Thanks for your help.

I wouldn't split / recombine the groups. Split each group to a row, then split the values within the row, e.g.
with var_ as (
select '(1111, Tem1), (0000, Tem2)' as pattern_
),
split1 as (
select trim(leading '(' from result_string) as string_
from
table ( /* split at & remove right parenthesis */
regexp_split_to_table(1, var_.pattern_, '\)((, )|$)','c')
returns (outkey INTEGER, token_nbr INTEGER, result_string varchar(256))
) as tb1
)
select *
from table(
csvld(split1.string_, ',', '"')
returns (var1 VARCHAR(16), var2 VARCHAR(16))
) as tb2
;
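The same two-stage idea (one row per parenthesized group, then one column per value) can be sketched in Python to verify the expected rows. This is an illustration only; the regular expression and the names pattern_ and rows are assumptions, not part of the Teradata solution.

```python
import re

pattern_ = "(1111, Tem1), (0000, Tem2)"

# Each "(a, b)" group becomes one row; the two captures become var1 and var2.
rows = [
    {"var1": var1.strip(), "var2": var2.strip()}
    for var1, var2 in re.findall(r"\(([^,()]*),([^()]*)\)", pattern_)
]

for row in rows:
    print(row["var1"], row["var2"])
```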

Get a value in a single cell into multiple rows using SSRS

One particular field in the SQL Table has a value in the below format.
Value11,value12,Value13
Value21,value22,value23
...
...
I need each of the lines above to appear as a separate row in SSRS.
For example, the data above should produce 2 rows in the report.
Is there a way to do this using a reporting project in VS or Report Builder?
Thanks in advance.
Update
Below is the DDL for the table:
tblTest
[id] int
[Description] VARCHAR(MAX)
Let's assume there is only one record, inserted as below:
Insert Into tblTest
([id],[Description])
VALUES
(1, 'Value11,value12,Value13
Value21,value22,value23')
So there is a carriage return character in the Description value of the insert above, which gives the description two lines.
My requirement is that when I retrieve the data, I get it in the format below.
ID, Description
1, Value11,value12,Value13
1, Value21,value22,value23

You can use this SELECT for passing data to Reporting Services.
SELECT t1.id, t2.splittedDescriptions
FROM
(
SELECT tblTest.id,
CAST('<row>' + REPLACE(tblTest.[Description], CHAR(13) + CHAR(10), '</row><row>') + '</row>' AS XML) as xmlRow
FROM tblTest
) t1
CROSS APPLY
(
SELECT xmlTable.splittedRow.value('.', 'VARCHAR(MAX)') as splittedDescriptions
FROM t1.xmlRow.nodes('/row') AS xmlTable(splittedRow)
) t2
It uses XML and the nodes() method to split the description wherever it finds a CRLF.
It works with a single CRLF; if you need to handle a double CRLF, you can modify the SELECT accordingly.
Example - input data:
INSERT INTO tblTest ([id],[Description]) VALUES
(1, 'val11, val12, val13' + CHAR(13) + CHAR(10) + 'val21, val22, val23')
INSERT INTO tblTest ([id],[Description]) VALUES
(2, 'val31, val32, val33')
INSERT INTO tblTest ([id],[Description]) VALUES
(3, 'val41, val42, val43' + CHAR(13) + CHAR(10) + 'val51, val52, val53' + CHAR(13) + CHAR(10) + 'val61, val62, val63')
Example - output:
id splittedDescriptions
----------- --------------------
1 val11, val12, val13
1 val21, val22, val23
2 val31, val32, val33
3 val41, val42, val43
3 val51, val52, val53
3 val61, val62, val63
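To make the row expansion concrete, here is a small Python sketch of the same split-on-CRLF step applied to the example input. It only imitates what the XML nodes() trick produces; the names records and split_rows are made up for this illustration.

```python
# (id, Description) pairs matching the example INSERT statements.
records = [
    (1, "val11, val12, val13\r\nval21, val22, val23"),
    (2, "val31, val32, val33"),
    (3, "val41, val42, val43\r\nval51, val52, val53\r\nval61, val62, val63"),
]

# One output row per CRLF-separated line, repeating the id each time.
split_rows = [
    (rec_id, line)
    for rec_id, description in records
    for line in description.split("\r\n")
]

for rec_id, line in split_rows:
    print(rec_id, line)
```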

Use this:
select '1' as Id, Value11+','+value12+','+Value13 as Description into tblTest from XYZ;
Value11, value12 and Value13 should all be strings.

removing duplicate in a column with same value?

url link
1.247appliances.co.uk info#247appliances.co.uk info#247appliances.co.uk
2.365electrical.com sales#365electrical.com sales#365electrical.com sales#365electrical.com sales#365electrical.com|customerservices#365electrical.com sales#365electrical.com
In the table above, the link column of the first and second rows contains repeated values, but I need the result to be:
url link
1.247appliances.co.uk info#247appliances.co.uk
2.365electrical.com sales#365electrical.com customerservices#365electrical.com

Use DISTINCT when selecting the columns. Here is a sample:
SELECT DISTINCT column_name,column_name
FROM table_name;

This might not be an exact solution to your issue but I have tried to give you an idea.
Cheers
First create a Split Function
-- Parameters use the @ prefix; the value parameter is named @LinkValue to match the body.
CREATE FUNCTION dbo.Split (@sep CHAR(1), @LinkValue VARCHAR(512))
RETURNS TABLE
AS
RETURN (
WITH Pieces(pn, START, stop) AS (
SELECT 1, 1, CHARINDEX(@sep, @LinkValue)
UNION ALL
SELECT pn + 1, stop + 1, CHARINDEX(@sep, @LinkValue, stop + 1)
FROM Pieces
WHERE stop > 0
)
SELECT pn,
SUBSTRING(@LinkValue, START, CASE WHEN stop > 0 THEN stop-START ELSE 512 END) AS Link
FROM Pieces
)
I have created a temporary table for sample data (use your own table)
CREATE TABLE #RemoveDuplicateWords (URL VARCHAR(200), Link VARCHAR(200))
GO
INSERT INTO #RemoveDuplicateWords(URL,Link)
SELECT '1.247appliances.co.uk', 'info#247appliances.co.uk info#247appliances.co.uk'
UNION ALL
SELECT '2.365electrical.com','sales#365electrical.com sales#365electrical.com sales#365electrical.com'
GO
And Finally a SELECT query
SELECT
rd.URL,
st.Link
FROM #RemoveDuplicateWords rd
CROSS APPLY dbo.Split(' ',rd.LINK) AS st
GROUP BY
rd.URL,
st.Link
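The split-then-group idea can be checked with a short Python sketch that collapses the repeated values back into one cell per URL. It assumes, based on the expected output in the question, that both a space and "|" act as separators; the function name dedupe_links is hypothetical.

```python
import re


def dedupe_links(link):
    """Split on spaces or '|' and keep the first occurrence of each value, in order."""
    tokens = [t for t in re.split(r"[ |]+", link) if t]
    return " ".join(dict.fromkeys(tokens))  # dict keys preserve insertion order


table = {
    "1.247appliances.co.uk": "info#247appliances.co.uk info#247appliances.co.uk",
    "2.365electrical.com": (
        "sales#365electrical.com sales#365electrical.com sales#365electrical.com "
        "sales#365electrical.com|customerservices#365electrical.com sales#365electrical.com"
    ),
}

cleaned = {url: dedupe_links(link) for url, link in table.items()}
for url, link in cleaned.items():
    print(url, link)
```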

Oracle 10g SQL regexp like

I'd like to ask if it's possible in a regexp to identify in a given number if there are 3 instances of a set.
For instance:
123456141414
123456171717
In the example above we have 3x14 and 3x17, so the regexp_like query should return those numbers.
In general, it should return every number in which the same pair of digits occurs three times in a row.

Please try this:
SELECT INPUT_TEXT, REGEXP_SUBSTR(INPUT_TEXT, '([[:digit:]]{2})\1\1', 6) EXTRACTED
FROM MY_TABLE
WHERE REGEXP_INSTR(INPUT_TEXT, '([[:digit:]]{2})\1\1', 6) > 0
Input table values:
INPUT_TEXT
--------------
123456141414
123456171717
123456111111
141414123456
123456121234
Query result:
INPUT_TEXT EXTRACTED
-------------- --------------
123456111111 111111
123456141414 141414
123456171717 171717

If I read your updated requirements correctly, you're checking that you have six digits followed by a pair of digits repeated three times. In which case, Reza's response should be modified to:
select * from (
select '123456343434' str from dual union all
select '123456555555' str from dual union all
select '1234565555550' str from dual union all
select '123456232324' str from dual union all
select '123456111110' str from dual )
where regexp_like(str,'^([[:digit:]]{6})([[:digit:]]{2})\2\2$')
which gives:
STR
123456343434
123456555555
Edited to add: if you want to extract the actual digit pair that is repeated:
select regexp_replace(str,'^([[:digit:]]{6})([[:digit:]]{2})\2\2$','\2') result
from (
select '123456343434' str from dual union all
select '123456555555' str from dual union all
select '123456555555a' str from dual union all
select '123456232324' str from dual union all
select '123456111110' str from dual )
where regexp_like(str,'^([[:digit:]]{6})([[:digit:]]{2})\2\2$')
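The backreference pattern can be tried out in Python as a quick check of which sample strings qualify. This is a sketch, not Oracle code: [0-9] replaces [[:digit:]], fullmatch takes the place of the ^ and $ anchors, and the names TRIPLE, samples and repeated are illustrative.

```python
import re

# Six leading digits, then one two-digit pair repeated three times (\1 refers to the pair).
TRIPLE = re.compile(r"[0-9]{6}([0-9]{2})\1\1")

samples = [
    "123456343434",
    "123456555555",
    "1234565555550",
    "123456232324",
    "123456111110",
]

repeated = {}
for s in samples:
    m = TRIPLE.fullmatch(s)      # the whole string must match
    if m:
        repeated[s] = m.group(1)  # the pair that is repeated

print(repeated)
```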

Detect if value is number in MySQL

Is there a way to detect if a value is a number in a MySQL query? Such as
SELECT *
FROM myTable
WHERE isANumber(col1) = true

You can use Regular Expression too... it would be like:
SELECT * FROM myTable WHERE col1 REGEXP '^[0-9]+$';
Reference:
http://dev.mysql.com/doc/refman/5.1/en/regexp.html

This should work in most cases.
SELECT * FROM myTable WHERE concat('',col1 * 1) = col1
It doesn't work for non-standard numbers like
1e4
1.2e5
123. (trailing decimal)

If your data is 'test', 'test0', 'test1111', '111test', '111'
To select all records where the data is a simple int:
SELECT *
FROM myTable
WHERE col1 REGEXP '^[0-9]+$';
Result: '111'
(In regex, ^ means begin, and $ means end)
To select all records where an integer or decimal number exists:
SELECT *
FROM myTable
WHERE col1 REGEXP '^[0-9]+\\.?[0-9]*$'; -- matches e.g. 123.12
Result: '111' (same as last example)
Finally, to select all records where number exists, use this:
SELECT *
FROM myTable
WHERE col1 REGEXP '[0-9]+';
Result: 'test0' and 'test1111' and '111test' and '111'

SELECT * FROM myTable
WHERE col1 REGEXP '^[+-]?[0-9]*([0-9]\\.|[0-9]|\\.[0-9])[0-9]*(e[+-]?[0-9]+)?$'
Will also match signed decimals (like -1.2, +0.2, 6., 2e9, 1.2e-10).
Test:
drop table if exists myTable;
create table myTable (col1 varchar(50));
insert into myTable (col1)
values ('00.00'),('+1'),('.123'),('-.23e4'),('12.e-5'),('3.5e+6'),('a'),('e6'),('+e0');
select
col1,
col1 + 0 as casted,
col1 REGEXP '^[+-]?[0-9]*([0-9]\\.|[0-9]|\\.[0-9])[0-9]*(e[+-]?[0-9]+)?$' as isNumeric
from myTable;
Result:
col1 | casted | isNumeric
-------|---------|----------
00.00 | 0 | 1
+1 | 1 | 1
.123 | 0.123 | 1
-.23e4 | -2300 | 1
12.e-5 | 0.00012 | 1
3.5e+6 | 3500000 | 1
a | 0 | 0
e6 | 0 | 0
+e0 | 0 | 0
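The isNumeric column can be reproduced outside MySQL with a small Python sketch. It uses the same pattern, except that the doubled backslashes of the MySQL string literal become single backslashes in a Python raw string. The names NUMERIC, values and is_numeric are assumptions for this illustration.

```python
import re

# Optional sign, at least one digit somewhere around an optional decimal point,
# and an optional exponent such as e4, e-5 or e+6.
NUMERIC = re.compile(r"^[+-]?[0-9]*([0-9]\.|[0-9]|\.[0-9])[0-9]*(e[+-]?[0-9]+)?$")

values = ["00.00", "+1", ".123", "-.23e4", "12.e-5", "3.5e+6", "a", "e6", "+e0"]

# 1 if the value looks numeric, 0 otherwise, as in the isNumeric column.
is_numeric = {v: int(bool(NUMERIC.match(v))) for v in values}

for v in values:
    print(v, is_numeric[v])
```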

Returns numeric rows
I found that the following query works for me:
SELECT * FROM myTable WHERE col1 > 0;
This query returns only the rows whose col1 value converts to a number greater than zero, so it misses 0 and negative values.
Returns non numeric rows
If you want to check that a column is not numeric, try this trick (!col1 > 0):
SELECT * FROM myTable WHERE !col1 > 0;

This answer is similar to Dmitry's, using a POSIX character class. As written, the pattern matches unsigned integers only; it would need to be extended to allow decimals or signed numbers.
select * from myTable where col1 REGEXP '^[[:digit:]]+$'

Use a UDF (user-defined function).
CREATE FUNCTION isnumber(inputValue VARCHAR(50))
RETURNS INT
BEGIN
IF (inputValue REGEXP ('^[0-9]+$'))
THEN
RETURN 1;
ELSE
RETURN 0;
END IF;
END;
Then when you query
select isnumber('383XXXX')
--returns 0
select isnumber('38333434')
--returns 1
select isnumber(mycol) mycol1, col2, colx from tablex;
-- will return 1s and 0s for column mycol1
--you can enhance the function to take decimals, scientific notation , etc...
The advantage of using a UDF is that you can use it on the left or right side of a comparison in your WHERE clause. This greatly simplifies your SQL before it is sent to the database:
SELECT * from tablex where isnumber(columnX) = isnumber('UnkownUserInput');
hope this helps.

Another alternative that seems faster than REGEXP on my computer is
SELECT * FROM myTable WHERE col1*0 != col1;
This will select all rows where col1 starts with a nonzero numeric value.

Still missing this simple version:
SELECT * FROM myTable WHERE `col1` + 0 = `col1`
(addition should be faster than multiplication)
Or a slower version for further experimenting:
SELECT *,
CASE WHEN `col1` + 0 = `col1` THEN 1 ELSE 0 END AS `IS_NUMERIC`
FROM `myTable`
HAVING `IS_NUMERIC` = 1

You can use a regular expression; for more detail see https://dev.mysql.com/doc/refman/8.0/en/regexp.html
I used ^([,|.]?[0-9])+$, which handles decimal and floating-point numbers.
SELECT
*
FROM
mytable
WHERE
myTextField REGEXP "^([,|.]?[0-9])+$"

For a simple search, I recommend the comparison column*1 = column. Interestingly, it works, and it is fast on varchar/char fields:
SELECT * FROM myTable WHERE column*1 = column;
ABC*1 => 0 (NOT EQU **ABC**)
AB15*A => 15 (NOT EQU **AB15**)
15AB => 15 (NOT EQU **15AB**)
15 => 15 (EQUALS TRUE **15**)

SELECT * FROM myTable WHERE sign (col1)!=0
Of course, sign(0) is zero, but then you could extend your query to:
SELECT * FROM myTable WHERE sign (col1)!=0 or col1=0
UPDATE: This is not 100% reliable, because "1abc" would return a sign of 1, but "ab1c" would return zero, so this only works for text that does not begin with a number.

You can do this using CAST:
SELECT * from tbl where col1 = concat(cast(col1 as decimal), "")

I have found that this works quite well
if(col1/col1= 1,'number',col1) AS myInfo

Try dividing by 1:
select if(value/1>0 or value=0,'its a number', 'its not a number') from table
