Issue in Alphanumeric sorting with special char

I have to sort alphanumeric values containing some special characters in SQL Server. I have tried several ORDER BY clauses, but none gives the desired output: the values come back as 1, 10, 100, 101 when they should be 1, 2, 3, ..., 100.
I have also tried ordering by the alphabetic and numeric parts separately by splitting the value, but it didn't work.

Here is my solution. First, I create a function that extracts the digits from the string. Then I sort by the resulting number.
USE tempdb
GO
CREATE TABLE MyTable (ID INT, Col1 VARCHAR(100))
GO
-----
INSERT INTO MyTable (ID, Col1)
SELECT 1, 'CBSFBE20151202000017_000_1.tif'
UNION ALL
SELECT 2, 'CBSFBE20151202000017_000_10.tif'
UNION ALL
SELECT 3, 'CBSFBE20151202000017_000_2.tif'
UNION ALL
SELECT 4, 'CBSFBE20151202000017_000_3.tif'
UNION ALL
SELECT 5, 'CBSFBE20151202000017_000_11.tif'
-----
GO
CREATE FUNCTION dbo.fnGetNumberFromString (@strInput VARCHAR(255))
RETURNS VARCHAR(255)
AS
BEGIN
    -- Repeatedly delete the first non-digit character until only digits remain.
    DECLARE @intNumber int
    SET @intNumber = PATINDEX('%[^0-9]%', @strInput)
    WHILE @intNumber > 0
    BEGIN
        SET @strInput = STUFF(@strInput, @intNumber, 1, '')
        SET @intNumber = PATINDEX('%[^0-9]%', @strInput)
    END
    RETURN ISNULL(@strInput, '0')
END
GO
-----
SELECT *, dbo.fnGetNumberFromString(Col1) AS Number
FROM MyTable
order by CAST(dbo.fnGetNumberFromString(Col1) AS float), Col1
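Because the function keeps every digit in the name, the sort key for these sample values is an 18- or 19-digit number, which is more than a float can represent exactly. A possible alternative, sketched below, sorts only on the trailing counter. It assumes every Col1 value ends in `_<digits>.<extension>`, as the sample rows do; the aliases `t`, `p`, `rev_us` and `rev_dot` are made up for this example.

```sql
-- Sketch only: assumes each Col1 value ends with "_<digits>.<extension>".
SELECT t.ID, t.Col1
FROM MyTable AS t
CROSS APPLY (
    -- Positions of the last '_' and the last '.', counted from the end of the string.
    SELECT CHARINDEX('_', REVERSE(t.Col1)) AS rev_us,
           CHARINDEX('.', REVERSE(t.Col1)) AS rev_dot
) AS p
ORDER BY
    -- 1) everything before the last underscore, compared as text
    LEFT(t.Col1, LEN(t.Col1) - p.rev_us),
    -- 2) the digits between the last underscore and the last dot, compared as a number
    CAST(SUBSTRING(t.Col1,
                   LEN(t.Col1) - p.rev_us + 2,
                   p.rev_us - p.rev_dot - 1) AS INT);
```

For the five sample rows this should return the IDs in the order 1, 3, 4, 2, 5. The CAST fails if a value does not follow the assumed pattern.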

Related

SQL Query that counts the number of characters match in two text columns

I need to count how many characters are equal in two text columns (same size, in the same table).
For example:
RowNum: Template: Answers:
------- --------- --------
1 ABCDEABCDEABCDE ABCDAABCDBABCDC
2 EDAEDAEDAEDAEDA EDBEDBEDBEDBEDB
SELECT SOME_COUNT_FUNCTION (Template, Answers) should return:
RowNum: Result:
------- -------
1 12
2 10
The database is MySQL.

Not exactly MySQL, but here's something that works in SQL Server. Maybe it'll translate over.
DROP TABLE IF EXISTS #tmp
CREATE TABLE #tmp (
[RowNum] INT IDENTITY(1,1) PRIMARY KEY,
[Template] NVARCHAR(20),
[Answer] NVARCHAR(20),
[Result] INT
)
INSERT INTO #tmp
VALUES ('ABCDEABCDEABCDE','ABCDAABCDBABCDC', NULL),
('EDAEDAEDAEDAEDA','EDBEDBEDBEDBEDB', NULL)
--SELECT * FROM #tmp
DECLARE @current_template NVARCHAR(50) -- Variable to hold the current template
    , @current_answer NVARCHAR(50) -- Variable to hold the current answer
    , @template_char CHAR(1) -- Char for template letter
    , @answer_char CHAR(1) -- Char for answer letter
    , @word_index INT -- Index (position) within each word
    , @match_counter INT -- Match counter for each word
    , @max_iter INT = (SELECT TOP 1 RowNum FROM #tmp ORDER BY RowNum DESC) -- Max iterations
    , @row_idx INT = (SELECT TOP 1 RowNum FROM #tmp ORDER BY RowNum) -- Minimum RowNum as initial row index value.
WHILE (@row_idx <= @max_iter)
BEGIN
    SET @match_counter = 0 -- Reset match counter for each row
    SET @word_index = 1 -- Reset word index for each row
    SET @current_template = (SELECT [Template] FROM #tmp WHERE RowNum = @row_idx)
    SET @current_answer = (SELECT [Answer] FROM #tmp WHERE RowNum = @row_idx)
    WHILE (@word_index <= LEN(@current_template))
    BEGIN
        SET @template_char = SUBSTRING(@current_template, @word_index, 1)
        SET @answer_char = SUBSTRING(@current_answer, @word_index, 1)
        IF (@answer_char = @template_char)
        BEGIN
            SET @match_counter += 1
        END
        SET @word_index += 1
    END
    UPDATE #tmp
    SET Result = @match_counter
    WHERE RowNum = @row_idx
    SET @row_idx += 1
END
Get values from the temp table:
SELECT * FROM #tmp
Output:
RowNum Template Answer Result
1 ABCDEABCDEABCDE ABCDAABCDBABCDC 12
2 EDAEDAEDAEDAEDA EDBEDBEDBEDBEDB 10

If you are running MySQL 8.0, you can use a recursive query to compare the strings character by character:
with recursive chars as (
select rownum, template, answers, 1 idx, 0 res from mytable
union all
select
rownum,
template,
answers,
idx + 1,
res + ( substr(template, idx, 1) = substr(answers, idx, 1) )
from chars
where idx <= least(char_length(template), char_length(answers))
)
select rownum, max(res) result from chars group by rownum order by rownum
In the CTE (the with clause), the anchor (the query before union all) selects the whole table. The recursive member (the query after union all) compares the characters at the current position (idx), increments the result (res) if they match, and advances to the next position, until the shorter string is exhausted. The outer query then just aggregates by rownum.
Demo on DB Fiddle:
rownum | result
-----: | -----:
1 | 12
2 | 10
Please bear in mind that this query will not perform well against a large dataset. Other, slightly more efficient solutions exist (typically, using a number table instead of a recursive CTE), but basically, as commented by Gordon Linoff, you do want to fix your data structure if you need to run such queries. You should store each character in a separate row, along with its rownum and its index in the string. Materialize the proper data structure, and then you won't need to generate it on the fly in each and every query.
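As a rough illustration of that last suggestion, here is what the comparison could look like once each character lives in its own row. The table and column names (`template_chars`, `answer_chars`, `ch`) are hypothetical and do not come from the question.

```sql
-- Hypothetical normalized layout: one row per (rownum, position) pair.
CREATE TABLE template_chars (
    rownum INT     NOT NULL,
    idx    INT     NOT NULL,   -- 1-based position in the original string
    ch     CHAR(1) NOT NULL,
    PRIMARY KEY (rownum, idx)
);

CREATE TABLE answer_chars (
    rownum INT     NOT NULL,
    idx    INT     NOT NULL,
    ch     CHAR(1) NOT NULL,
    PRIMARY KEY (rownum, idx)
);

-- Matching characters per row: in MySQL a comparison evaluates to 1 or 0,
-- so SUM() over it counts the positions where both characters are equal.
SELECT t.rownum, SUM(t.ch = a.ch) AS result
FROM template_chars AS t
JOIN answer_chars  AS a
  ON a.rownum = t.rownum
 AND a.idx    = t.idx
GROUP BY t.rownum
ORDER BY t.rownum;
```

With this layout the count is a plain join on the primary key, with no recursion and no per-query string splitting.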

SQL: GROUP BY Clause for Comma Separated Values

Can anyone help me check for duplicate values among multiple comma-separated values? I have a customer table in which one can insert multiple comma-separated contact numbers, and I want to find duplicates based on the last five digits. For reference, check the attached screenshot. The required output is:
contact_no. count
97359506775 -- 2
390558073039-- 1
904462511251-- 1

I would advise you to redesign your database schema, if possible. Your current database violates First Normal Form since your attribute values are not indivisible.
Create a table where id together with a single phone number constitutes a key; this constraint ensures that no duplicates occur.
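A minimal sketch of such a table follows. The names `customer_phone`, `customer_id` and `phone` are invented for illustration, since the original schema is only visible in the screenshot.

```sql
-- Hypothetical normalized table: one row per customer and phone number.
CREATE TABLE customer_phone (
    customer_id INT         NOT NULL,
    phone       VARCHAR(20) NOT NULL,
    -- The composite key means the same number cannot be stored twice for one customer.
    PRIMARY KEY (customer_id, phone)
);

-- Each number is inserted as its own row instead of as part of a comma-separated list.
-- The values below are made-up sample data.
INSERT INTO customer_phone (customer_id, phone) VALUES (1, '5550100001');
INSERT INTO customer_phone (customer_id, phone) VALUES (1, '5550100002');
```

Repeating either insert would be rejected with a primary key violation, which is the constraint doing the duplicate check for you.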

I don't remember the details, but I will try to outline the idea (it's something I used a long time ago):
Create a table-valued function that takes the id and the phone-number string as input, and returns a table of id and individual phone numbers.
Use this function in a query, passing it the id and the phone-number string, so that for each id you get as many rows as there are phone numbers. CROSS APPLY/OUTER APPLY needs to be used.
Then you can check for the duplicates.
The function would be something like this:
CREATE FUNCTION udf_PhoneNumbers
(
    @Id INT
    ,@Phone VARCHAR(300)
) RETURNS @PhonesTable TABLE(Id INT, Phone VARCHAR(50))
AS
BEGIN
    DECLARE @CommaIndex INT
    DECLARE @CurrentPosition INT
    DECLARE @StringLength INT
    DECLARE @PhoneNumber VARCHAR(50)
    SELECT @StringLength = LEN(@Phone)
    SELECT @CommaIndex = -1
    SELECT @CurrentPosition = 1
    --index is 1 based
    WHILE @CommaIndex < @StringLength AND @CommaIndex <> 0
    BEGIN
        SELECT @CommaIndex = CHARINDEX(',', @Phone, @CurrentPosition)
        IF @CommaIndex <> 0
            SELECT @PhoneNumber = SUBSTRING(@Phone, @CurrentPosition, @CommaIndex - @CurrentPosition)
        ELSE
            SELECT @PhoneNumber = SUBSTRING(@Phone, @CurrentPosition, @StringLength - @CurrentPosition + 1)
        SELECT @CurrentPosition = @CommaIndex + 1
        INSERT INTO @PhonesTable VALUES(@Id, @PhoneNumber)
    END
    RETURN
END
Then run CROSS APPLY query:
SELECT
U.*
,UD.*
FROM yourtable U CROSS APPLY udf_PhoneNumbers(Userid, Phone) UD
This will give you a table with one row per phone number, on which you can run a query to find duplicates.
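The duplicate check on the last five digits could then look roughly like this. It reuses `yourtable`, `Userid`, `Phone` and `udf_PhoneNumbers` from above; the output aliases are made up, and trimming assumes the list may contain spaces after the commas.

```sql
-- Sketch: group the split-out numbers by their last five digits.
SELECT
    MIN(LTRIM(RTRIM(UD.Phone)))      AS contact_no,   -- one representative number per group
    RIGHT(LTRIM(RTRIM(UD.Phone)), 5) AS last_five,
    COUNT(*)                         AS occurrences
FROM yourtable U
CROSS APPLY udf_PhoneNumbers(U.Userid, U.Phone) UD
GROUP BY RIGHT(LTRIM(RTRIM(UD.Phone)), 5)
-- Add "HAVING COUNT(*) > 1" here to list only the duplicated endings.
ORDER BY occurrences DESC;
```

Without the HAVING clause every ending is listed with its count, which matches the shape of the output requested in the question.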

Writing stored procedure which flags duplicate values in a comma separated field in MySQL

I have a database table like this sample:
ID THINGS HAS_DUPLICATES
1 AAA, BBB, AAA NULL
2 CCC, DDD NULL
I am trying to write a stored procedure to flag duplicate values in THINGS field.
After calling the procedure the table will become like this:
ID THINGS HAS_DUPLICATES
1 AAA, BBB, AAA YES
2 CCC, DDD NO
Please be informed that I am trying to resolve it using only SQL and without normalizing my database. I am also aware of other approaches like writing PHP code.

Schema:
DROP TABLE IF EXISTS evilThings; -- orig table with dupes
CREATE TABLE evilThings
( ID INT AUTO_INCREMENT PRIMARY KEY,
THINGS TEXT NOT NULL,
HAS_DUPLICATES INT NULL
);
INSERT evilThings(ID,THINGS) VALUES
(1,"'AAA, BBB, AAA'"),
(2,"'CCC, DDD'");
CREATE TABLE notEvilAssocTable
( ai INT AUTO_INCREMENT PRIMARY KEY, -- no shuffle on inserts
ID INT NOT NULL,
THING VARCHAR(100) NOT NULL,
UNIQUE KEY `unqK_id_thing` (ID,THING) -- no dupes, this is honorable
);
Stored Proc:
DROP PROCEDURE IF EXISTS splitEm;
DELIMITER $$
CREATE PROCEDURE splitEm()
BEGIN
DECLARE lv_ID,pos1,pos2,comma_pos INT;
DECLARE lv_THINGS TEXT;
DECLARE particle VARCHAR(100);
DECLARE strs_done INT DEFAULT FALSE; -- string search done
DECLARE done INT DEFAULT FALSE; -- cursor done
DECLARE cur111 CURSOR FOR SELECT ID,THINGS FROM evilThings ORDER BY ID;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
-- Please note in the above, CURSOR stuff MUST come LAST else "Error 1337: Variable or condition decl aft curs"
-- -------------------------------------------------------------------------------------------------------------------
TRUNCATE TABLE notEvilAssocTable;
OPEN cur111;
read_loop: LOOP
SET strs_done=FALSE;
FETCH cur111 INTO lv_ID,lv_THINGS;
IF done THEN
LEAVE read_loop;
END IF;
SET pos1=1,comma_pos=0;
WHILE !strs_done DO
SET pos2=LOCATE(',', lv_THINGS, comma_pos+1);
IF pos2=0 THEN
SET pos2=LOCATE("'", lv_THINGS, comma_pos+1);
IF pos2!=0 THEN
SET particle=SUBSTRING(lv_THINGS,comma_pos+1,pos2-comma_pos-1);
SET particle=REPLACE(particle,"'","");
SET particle=TRIM(particle);
INSERT IGNORE notEvilAssocTable (ID,THING) VALUES (lv_ID,particle);
END IF;
SET strs_done=1;
ELSE
SET particle=SUBSTRING(lv_THINGS,comma_pos+1,pos2-comma_pos-1);
SET particle=REPLACE(particle,"'","");
SET particle=TRIM(particle);
INSERT IGNORE notEvilAssocTable (ID,THING) VALUES (lv_ID,particle);
SET comma_pos=pos2;
END IF;
END WHILE;
END LOOP;
CLOSE cur111; -- close the cursor
END$$
DELIMITER ;
Test:
call splitEm();
See results of split:
select * from notEvilAssocTable;
Note the gap at ai = 3, which comes from the INSERT IGNORE. It is simply the InnoDB auto-increment gap, an expected side effect: the ignored duplicate insert still uses up an auto-increment value. It is not a problem. The unique key forbids duplicates in the new table of split-out values, and the gap is a common by-product of that protection.
If you did not mean to have the single quote at the beginning and end of the string in the db, then change the routine accordingly.
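The procedure only splits the values; the flag itself still has to be set. One possible way to do that, sketched here under the assumption that THINGS never contains empty items or a trailing comma, is to compare the number of items in the string (commas plus one) with the number of distinct values that survived in notEvilAssocTable. The alias `distinct_things` is made up for this example.

```sql
-- Sketch: flag rows whose item count exceeds their distinct-value count.
-- HAS_DUPLICATES is an INT column in the schema above, so it receives 1 or 0.
UPDATE evilThings AS e
JOIN (
    SELECT ID, COUNT(*) AS distinct_things
    FROM notEvilAssocTable
    GROUP BY ID
) AS d ON d.ID = e.ID
SET e.HAS_DUPLICATES =
    (CHAR_LENGTH(e.THINGS) - CHAR_LENGTH(REPLACE(e.THINGS, ',', '')) + 1) > d.distinct_things;
```

Run it after `call splitEm();`. With the sample data, row 1 has three items but only two distinct values, so it should be flagged 1, and row 2 should be flagged 0.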

Here is the answer to my question, assuming the data in THINGS field are separated by a bar '|'. Our original table will be myTABLE:
ID THINGS THINGSCount THINGSCountUnique HAS_DUPLICATES
1 AAA|BBB|AAA NULL NULL NULL
2 CCC|DDD NULL NULL NULL
Step 1. Find the maximum number of values separated by a bar '|' in the THINGS field:
SELECT MAX(ROUND((CHAR_LENGTH(THINGS) - CHAR_LENGTH(REPLACE(THINGS,'|',''))) / CHAR_LENGTH('|')) + 1) FROM myTABLE;
Step 2. Assuming the answer from step 1 was 7, use the following SQL to split the data in the THINGS field into rows (there are many other ways to do the split):
CREATE TABLE myTABLE_temp
SELECT ID, SUBSTRING_INDEX(SUBSTRING_INDEX(myTABLE.THINGS, '|', n.n), '|', -1) THINGS
FROM myTABLE JOIN
( SELECT n FROM
( SELECT 1 AS N UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 ) a ) n
ON CHAR_LENGTH(THINGS) - CHAR_LENGTH(REPLACE(THINGS, '|', '')) >= n - 1
ORDER BY ID;
Our myTABLE_temp table will be something like:
ID THINGS
1 AAA
1 BBB
1 AAA
2 CCC
2 DDD
Step 3. Here we create two new tables to hold COUNT(THINGS) and COUNT(DISTINCT THINGS), as follows:
# THINGSCount
CREATE TABLE myTABLE_temp_2
SELECT ID, COUNT(THINGS) AS THINGSCount FROM myTABLE_temp GROUP BY ID;
# Remember to ADD INDEX to ID field
UPDATE myTABLE A INNER JOIN myTABLE_temp_2 B ON(A.ID = B.ID) SET A.THINGSCount = B.THINGSCount;
# THINGSCountUnique
CREATE TABLE myTABLE_temp_3
SELECT ID, COUNT(DISTINCT THINGS) AS THINGSCountUnique FROM myTABLE_temp GROUP BY ID;
# Remember to ADD INDEX to ID field
UPDATE myTABLE A INNER JOIN myTABLE_temp_3 B ON(A.ID = B.ID) SET A.THINGSCountUnique = B.THINGSCountUnique;
Final Step: Flag duplicate values:
UPDATE myTABLE SET HAS_DUPLICATES = IF(THINGSCount>THINGSCountUnique, 'DUPLICATES', 'NO');

How to split string in sql server and put it in a table

My input is like
(abcd#123, xyz#324, def#567)
I want an output
col1 Col2
abcd 123
xyz 324
def 567
and so on
Column 1 should contain abcd, xyz and so on, and column 2 should contain 123, 324 and so on, with each pair in its own row. The string can be of any length.
Thank you

Try this
SELECT LEFT('abcd#123', CHARINDEX('#', 'abcd#123')-1),
RIGHT('abcd#123', LEN('abcd#123') - CHARINDEX('#', 'abcd#123'))

You will need to use CHARINDEX() and SUBSTRING() functions to split your input values.

Your actual problem is turning a complex string into a table, I guess.
For that I found a function about a year ago that does exactly that:
CREATE FUNCTION [dbo].[fnSplitString] (
    @myString varchar(500),
    @deliminator varchar(10))
RETURNS
@ReturnTable TABLE (
    [id] [int] IDENTITY(1,1) NOT NULL,
    [part] [varchar](50) NULL
)
AS
BEGIN
    Declare @iSpaces int
    Declare @part varchar(50)
    Select @iSpaces = charindex(@deliminator,@myString,0)
    While @iSpaces > 0
    BEGIN
        Select @part = substring(@myString,0,charindex(@deliminator,@myString,0))
        Insert Into @ReturnTable(part)
        Select @part
        Select @myString = substring(@mystring,charindex(@deliminator,@myString,0)+ len(@deliminator),len(@myString) - charindex(' ',@myString,0))
        Select @iSpaces = charindex(@deliminator,@myString,0)
    END
    If len(@myString) > 0
        Insert Into @ReturnTable
        Select @myString
    RETURN
END
Create this function and you can do:
DECLARE @TestString varchar(50)
SET @TestString = 'abcd#123,xyz#324,def#567'
Select * from dbo.fnSplitString(@TestString, ',')
Result:
id| part
1 | abcd#123
2 | xyz#324
3 | def#567
You can combine this part with Leonardo's answer:
SELECT
LEFT(part, CHARINDEX('#', part)-1) As Col1,
RIGHT(part, LEN(part) - CHARINDEX('#', part)) As Col2
from dbo.fnSplitString(@TestString, ',')
to get your problem solved.
(A small note: the function has minor issues with whitespace, so please avoid spaces in the input.)
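If you are on SQL Server 2016 or later (database compatibility level 130 or higher), the built-in STRING_SPLIT function can replace the custom splitter. The sketch below assumes every item contains exactly one '#'. The aliases `s`, `p` and `item` are made up, and the items are trimmed so that spaces after the commas do no harm.

```sql
-- Sketch for SQL Server 2016+: split on ',' and then on '#'.
DECLARE @TestString varchar(50) = 'abcd#123, xyz#324, def#567';

SELECT
    LEFT(p.item, CHARINDEX('#', p.item) - 1)                   AS Col1,  -- text before '#'
    SUBSTRING(p.item, CHARINDEX('#', p.item) + 1, LEN(p.item)) AS Col2   -- text after '#'
FROM STRING_SPLIT(@TestString, ',') AS s
CROSS APPLY (SELECT LTRIM(RTRIM(s.value)) AS item) AS p;  -- trim each piece once
```

STRING_SPLIT does not guarantee the order of the returned rows, so add an explicit ORDER BY if the order matters.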

Fetch the occurrences of particular words in particular column of a table

I have about 200 words. I want to see how many times each of those words occurs in a column of a table.
E.g., say we have a table test with a column statements that has two rows:
How are you. It's been long since I met you.
I am fine how are you.
Now I want to find the occurrences of the words "you" and "how". The output should be something like:
word count
you 3
how 2
since "you" has 3 and "how" has 2 occurrences in the two rows.
How can I do this?

You can do it like this:
Split each phrase into words and put all the items in a separate table;
Remove all punctuation;
Run a SELECT that joins the created table to the words that you want to count.

The way I would approach this is to write a small user-defined function that returns the number of times one string appears in another, with some allowances for:
upper and lower case
common punctuation
I would then create a table with all of the words that I wish to search for, i.e. your list of 200. Then use the function to count the occurrences of each word in every phrase, put that in an inline view, and sum the results by search word.
Hence:
User Defined Function
DELIMITER $$
CREATE FUNCTION `get_word_count`(phrase VARCHAR(500),word VARCHAR(255), delimiter VARCHAR(1)) RETURNS int(11)
READS SQL DATA
BEGIN
DECLARE cur_position INT DEFAULT 1 ;
DECLARE remainder TEXT;
DECLARE cur_string VARCHAR(255);
DECLARE delimiter_length TINYINT UNSIGNED;
DECLARE total INT;
DECLARE result DOUBLE DEFAULT 0;
DECLARE string2 VARCHAR(255);
SET remainder = replace(phrase,'!',' ');
SET remainder = replace(remainder,'.',' ');
SET remainder = replace(remainder,',',' ');
SET remainder = replace(remainder,'?',' ');
SET remainder = replace(remainder,':',' ');
SET remainder = replace(remainder,'(',' ');
SET remainder = lower(remainder);
SET string2 = concat(delimiter,trim(word),delimiter);
SET delimiter_length = CHAR_LENGTH(delimiter);
SET cur_position = 1;
WHILE CHAR_LENGTH(remainder) > 0 AND cur_position > 0 DO
SET cur_position = INSTR(remainder, delimiter);
IF cur_position = 0 THEN
SET cur_string = concat(delimiter,remainder,delimiter);
ELSE
SET cur_string = concat(delimiter,LEFT(remainder, cur_position - 1),delimiter);
END IF;
IF TRIM(cur_string) != '' THEN
set result = result + (select instr(string2,cur_string) > 0);
END IF;
SET remainder = SUBSTRING(remainder, cur_position + delimiter_length);
END WHILE;
RETURN result;
END$$
DELIMITER ;
You might have to play with this function a little depending on what allowances you need to make for punctuation and case. Hopefully you get the idea here though!
Populate tables
create table search_word
(id int unsigned primary key auto_increment,
word varchar(250) not null
);
insert into search_word (word) values ('you');
insert into search_word (word) values ('how');
insert into search_word (word) values ('to');
insert into search_word (word) values ('too');
insert into search_word (word) values ('the');
insert into search_word (word) values ('and');
insert into search_word (word) values ('world');
insert into search_word (word) values ('hello');
create table phrase_to_search
(id int unsigned primary key auto_increment,
phrase varchar(500) not null
);
insert into phrase_to_search (phrase) values ("How are you. It's been long since I met you");
insert into phrase_to_search (phrase) values ("I am fine how are you?");
insert into phrase_to_search (phrase) values ("Oh. Not bad. All is ok with the world, I think");
insert into phrase_to_search (phrase) values ("I think so too!");
insert into phrase_to_search (phrase) values ("You know what? I think so too!");
Run Query
select word,sum(word_count) as total_word_count
from
(
select phrase,word,get_word_count(phrase,word," ") as word_count
from search_word
join phrase_to_search
) t
group by word
order by total_word_count desc;

Here is a solution:
SELECT SUM(total_count) as total, value
FROM (
SELECT count(*) AS total_count, REPLACE(REPLACE(REPLACE(x.value,'?',''),'.',''),'!','') as value
FROM (
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(t.sentence, ' ', n.n), ' ', -1) value
FROM table_name t CROSS JOIN
(
SELECT a.N + b.N * 10 + 1 n
FROM
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
ORDER BY n
) n
WHERE n.n <= 1 + (LENGTH(t.sentence) - LENGTH(REPLACE(t.sentence, ' ', '')))
ORDER BY value
) AS x
GROUP BY x.value
) AS y
GROUP BY value
Here is the full working fiddle: http://sqlfiddle.com/#!2/17481a/1
First we run a query to extract all the words, as explained by @peterm (follow his instructions if you want to change the maximum number of words processed). Then we turn that into a subquery and COUNT and GROUP BY the value of each word. Finally, another query on top strips the punctuation with REPLACE and groups again, so that words that differ only by an attached sign are merged, e.g. hello and hello! are counted as the same word.

Below is a simple solution for when you only need counts for certain words rather than complete statistics. Note that each query counts the rows that contain the word, not the number of occurrences within a row:
SELECT COUNT(*) FROM `words` WHERE `row1` LIKE '%how%';
SELECT COUNT(*) FROM `words` WHERE `row1` LIKE '%you%';
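If a row can contain the word more than once, a common trick is to compare the string length before and after removing the word. The sketch below reuses the `words` table and `row1` column from the queries above; the alias `occurrences` is made up. Like the LIKE version, it matches substrings, so "you" is also counted inside "your".

```sql
-- Sketch: count every occurrence of 'you', case-insensitively, across all rows.
-- (length before - length after removing the word) / word length = number of occurrences
SELECT SUM(
         (CHAR_LENGTH(`row1`) - CHAR_LENGTH(REPLACE(LOWER(`row1`), 'you', '')))
         DIV CHAR_LENGTH('you')
       ) AS occurrences
FROM `words`;
```

For the two sample statements in the question this should give 3, matching the expected count for "you".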
