SQL ORDER BY dilemma with numbers - mysql

I have a table which holds a varchar datatype. It holds 128 characters max.
I'm trying to order it alphabetically, and it works out fine, except for one little thing.
When I try to save a mixture of numbers and letters, it returns the 'literal' alphabetical order, which means 11 comes first before 2.
I have read almost all of the answers on the internet, but they are all workarounds that do not fit my specific problem.
Examples of values I want to put in order
Apartment
House
Dog
Cat
18 years old
2 years old
1 year old
But I want it to look like this.
1 year old
2 years old
18 years old
Apartment
Cat
Dog
House
It spans on a large database and I can't just split the numerical values apart from the text ones.
Also users who can use the program can modify it with Alphanumeric characters.
Any suggestions about my problem? Thanks.

Here is something I tried in SQL Server. It's neither elegant nor fit for production, but it may give you an idea.
SELECT StringValue,
CAST(SUBSTRING(StringValue, StartPos, EndPos - StartPos) AS INT) AsNumber,
SUBSTRING(StringValue, StartPos, EndPos - StartPos) NumberToken,
SUBSTRING(StringValue, EndPos, 1000) Rest,
StartPos,
EndPos
FROM
(SELECT
StringValue,
PATINDEX('[0-9]%', StringValue) StartPos,
PATINDEX('%[^0-9]%', StringValue) EndPos
FROM
(SELECT 'abc123xyz' StringValue
UNION SELECT '1abc'
UNION SELECT '11abc'
UNION SELECT '2abc'
UNION SELECT '100 zasdfasd') Sub1
) Sub2
ORDER BY AsNumber, Rest
Result:
StringValue    AsNumber  NumberToken  Rest        StartPos  EndPos
abc123xyz      0                      abc123xyz   0         1
1abc           1         1            abc         1         2
2abc           2         2            abc         1         2
11abc          11        11           abc         1         3
100 zasdfasd   100       100          zasdfasd    1         4

I would approach this as follows...
First, write an expression to convert the numeric stuff to integers, something like
SELECT CAST(SUBSTRING(<field>, 1, INSTR(<field>, ' ')) AS SIGNED), <field>
I would then use a UNION ALL statement, something like this:
SELECT CAST(SUBSTRING(<field>, 1, INSTR(<field>, ' ')) AS SIGNED), <field>, A.*
FROM <table> A
WHERE <field> LIKE <regular expression to get fields beginning with numbers>
UNION ALL
SELECT 999999,<field>,A.*
FROM <table> A
WHERE <field> NOT LIKE <regular expression to get fields beginning with numbers>
ORDER BY 1,2,3
The numbers will appear first, in numeric order. Since all of the alpha data has the same numeric key, it will appear sorted alphabetically after the numbers. Just be sure the alpha dummy key (999999) is larger than all of the numeric ones.
I don't have mySQL on this machine, but hopefully this gives you enough of a start to solve it
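The two-key idea above (a numeric key for digit-led values, one shared key for everything else) can be tried without the UNION ALL. The following is only a sketch, and it runs against SQLite through Python's sqlite3 module rather than MySQL. The table name items and the function natural_order are made up for illustration. In MySQL, the rough equivalents of the first two sort keys would be value REGEXP '^[0-9]' and CAST(value AS UNSIGNED).

```python
import sqlite3


def natural_order(values):
    """Sort strings so digit-led values come first, ordered by their leading number."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (value TEXT)")  # hypothetical table name
    conn.executemany("INSERT INTO items (value) VALUES (?)", [(v,) for v in values])
    rows = conn.execute(
        """
        SELECT value
        FROM items
        ORDER BY
            CASE WHEN value GLOB '[0-9]*' THEN 0 ELSE 1 END,  -- digit-led rows first
            CAST(value AS INTEGER),                           -- leading digits as a number
            value                                             -- alphabetical tie-break
        """
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


sample = ["Apartment", "House", "Dog", "Cat", "18 years old", "2 years old", "1 year old"]
print(natural_order(sample))
```

Text-only values all cast to 0, so within that group only the final alphabetical key decides the order.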

You can probably get away with something like this:
order by right(replicate(' ',30)+Column_name,30)

Try this order by:
ORDER BY RIGHT(REPLICATE('0',128)+value,128)
My test:
DECLARE @T TABLE
(
value VARCHAR(128)
)
INSERT INTO @T VALUES('Apartment'),
('House'),
('Dog'),
('Cat'),
('18 years old'),
('2 years old'),
('1 year old'),
('12 horses'),
('1 horse')
SELECT * FROM @T
ORDER BY RIGHT(REPLICATE('0',128)+value,128)
RESULTS:
Cat
Dog
House
1 horse
12 horses
Apartment
1 year old
2 years old
18 years old
If you find a case that this doesn't work please post it along with the sort order you would like and I can see if there's a fix.

Related

How do I use the BETWEEN operator for text searches in a MySQL database?

I have a SQL table on which I use the BETWEEN operator.
The BETWEEN operator selects values within a range. The values can be numbers, text, or dates.
stu_id  name           city     pin
1       Raj            Ranchi   123456
2       sonu           Delhi    652345
3       ANU            KOLKATA  879845
4       K.K's Company  Delhi    345546
5       J.K's Company  Delhi    123456
I have a query like this:-
SELECT * FROM student WHERE stu_id BETWEEN 2 AND 4 -- includes both 2 and 4
SELECT * FROM `student` WHERE name BETWEEN 'A' AND 'K' -- includes A, but not names starting with K
My question is: why are the names starting with K not included?
I want those names in the results as well.
Don't use between -- until you really understand it. That is just general advice. BETWEEN is inclusive, so your second query is equivalent to:
WHERE name >= 'A' AND
name <= 'K'
Because of the equality, 'K' is included in the result set. However, names longer than one character and starting with 'K' are not -- "Ka" for instance.
Instead, be explicit:
WHERE name >= 'A' AND
name < 'L'
Of course, BETWEEN can be useful. However, it is useful for discrete values, such as integers. It is a bit dangerous with numbers with decimals, strings, and date/time values. That is why I encourage you to express the logic as inequalities.
To supplement Gordon's answer, one way to get what you're expecting is to turn your name into a discrete set of values:
SELECT * FROM `student` WHERE LEFT(name, 1) between 'A' and 'K'
You need to appreciate that K.K's Company is alphabetically AFTER the letter K on its own, so it is not BETWEEN, in the same way that 4.1 is not BETWEEN 2 and 4.
Stripping the name down to its first character makes the query behave as you expect, but be cautious: you should generally avoid running functions on column values in a WHERE clause. With a million names, that's a million strings MySQL has to reduce to their first letter, and it may no longer be able to use an index on name, which hurts performance.
Instead, you could use:
SELECT * FROM `student` WHERE name >= 'A' and name < 'L'
which is more likely to permit the use of an index as you aren't manipulating the stored values before comparing them
This works because it asks for everything up to but not including L, which includes all of your names starting with K, even kzzzzzzzz. Numerically, it is equivalent to saying number >= 2 and number < 5, which gives you all the numbers starting with 2, 3 or 4 (like the 4.1 from before) but not 5.
Remember that BETWEEN is inclusive at both ends. Always revert to a pattern of a >= b and a < c, a >= c and a < d when you want to specify ranges that capture all possible values
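The difference between the inclusive BETWEEN and the half-open range can be checked on the sample names from the question. This is a minimal sketch that uses SQLite through Python's sqlite3 module, not MySQL. The helper select_names is hypothetical, and SQLite compares text case-sensitively by default, unlike typical MySQL collations, although for these five names the outcome should be the same.

```python
import sqlite3

# Sample names from the question (city and pin columns omitted for brevity).
STUDENTS = [
    (1, "Raj"),
    (2, "sonu"),
    (3, "ANU"),
    (4, "K.K's Company"),
    (5, "J.K's Company"),
]


def select_names(where_clause):
    """Run SELECT name FROM student WHERE <where_clause> on the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE student (stu_id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO student VALUES (?, ?)", STUDENTS)
    rows = conn.execute(
        "SELECT name FROM student WHERE " + where_clause + " ORDER BY name"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


inclusive = select_names("name BETWEEN 'A' AND 'K'")    # stops at the bare string 'K'
half_open = select_names("name >= 'A' AND name < 'L'")  # everything before 'L'
print(inclusive)
print(half_open)
```

Only the half-open version should return K.K's Company, because that string sorts after the one-character string 'K' but before 'L'.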
Compare in lexicographical order, 'K.K's Company' > 'K'
One option is to compare only the first character of the name. You can try this MySQL query using SUBSTRING; it will include the names starting with K as well.
SELECT * FROM student WHERE SUBSTRING(name FROM 1 FOR 1)
BETWEEN 'A' AND 'K';
I hope this helps.

How to query for a phrase on SQL database of words?

I am using MySQL and I have an SQL database of songs with a table that consists of 8 columns of information on the words of a song. Each row represents a single word from a song's lyrics:
songSerial - the serial number of the song
songName - the song name
word - a single word from the song's lyrics
row_number - the number of the row that the word is found
word_position_in_row - the number of the word in the row alone
house_number - the number of the house the word belongs to
house_row - the number of the row in the house that the word is found in
word_number - the number of the word out of all the songs lyrics
example for a row: { 4 , The Scientist , secrets , 8 , 4 , 2 , 1 , 37 }
Now I want to query all the songs that contain a group of words. For instance, all the songs that have the phrase "I Love You" in them. The words must be in that order and not from different rows or houses.
Here are scripts in my OneDrive for creating the database table and about 400 rows:
TwoTextScriptFilesAndTheirZip
Can anyone help ?
Thank you
One method is to use joins:
select sw1.*
from songwords sw1 join
songwords sw2
on sw2.songSerial = sw1.songSerial and
sw2.word_number = sw1.word_number + 1 join
songwords sw3
on sw3.songSerial = sw2.songSerial and
sw3.word_number = sw2.word_number + 1
where sw1.word = 'I' and sw2.word = 'love' and sw3.word = 'you';
Or, if you prefer:
where concat_ws(' ', sw1.word, sw2.word, sw3.word) = 'I love you'
This is worse from an optimization perspective (indexes using word do not help performance), but it is clear what the query is doing.
Searches of this type suggest using a full text index. The only caveat is that you will need to remove the stop word list and index all words, regardless of length. ("I" and "you" are typical examples of stop words.)
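The self-join on consecutive word_number values can be tried on a few made-up lyrics. This is a sketch under assumptions: it runs on SQLite through Python's sqlite3 module, not MySQL. The helpers songs_with_phrase3 and number_words are hypothetical, the table is reduced to three of the question's columns, and the same-row and same-house conditions are left out for brevity.

```python
import sqlite3


def number_words(serial, text):
    """Turn a lyric string into (songSerial, word_number, word) rows."""
    return [(serial, i, w) for i, w in enumerate(text.split(), start=1)]


def songs_with_phrase3(lyrics, w1, w2, w3):
    """Return serials of songs in which w1, w2, w3 appear as consecutive words."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE songwords (songSerial INTEGER, word_number INTEGER, word TEXT)"
    )
    conn.executemany("INSERT INTO songwords VALUES (?, ?, ?)", lyrics)
    rows = conn.execute(
        """
        SELECT DISTINCT sw1.songSerial
        FROM songwords sw1
        JOIN songwords sw2
          ON sw2.songSerial = sw1.songSerial
         AND sw2.word_number = sw1.word_number + 1   -- the next word
        JOIN songwords sw3
          ON sw3.songSerial = sw2.songSerial
         AND sw3.word_number = sw2.word_number + 1   -- the word after that
        WHERE sw1.word = ? AND sw2.word = ? AND sw3.word = ?
        ORDER BY sw1.songSerial
        """,
        (w1, w2, w3),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


lyrics = (
    number_words(1, "and I love you so")
    + number_words(2, "you love I more")
    + number_words(3, "I really love you")
)
print(songs_with_phrase3(lyrics, "I", "love", "you"))
```

Song 2 has the right words in the wrong order and song 3 has a word in between, so only song 1 should be returned.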
This is an expensive approach for a large table. Assuming word is not null, we could do something like this:
SET group_concat_max_len = 16777216 ;
SELECT t.songserial
, t.house_number
, t.row_number
FROM mytable t
GROUP
BY t.songserial
, t.house_number
, t.row_number
HAVING CONCAT(' ',GROUP_CONCAT(t.word ORDER BY t.word_position_in_row SEPARATOR ' '),' ')
LIKE CONCAT('% ','I love you',' %')
We would definitely want a suitable index available, e.g.
... ON `mytable` (`songserial`,`house_number`,`row_number`,`word`)
If one of the words in the phrase is infrequent, we might be able to optimize a bit with a search for that infrequent word first, and then get all of the words on the same row ...
SELECT t.songserial
, t.house_number
, t.row_number
FROM ( SELECT r.songserial
, r.house_number
, r.row_number
FROM mytable r
WHERE r.word = 'love'
GROUP
BY r.word
, r.songserial
, r.house_number
, r.row_number
) s
JOIN mytable t
ON t.songserial = s.songserial
AND t.house_number = s.house_number
AND t.row_number = s.row_number
GROUP
BY t.songserial
, t.house_number
, t.row_number
HAVING CONCAT(' ',GROUP_CONCAT(t.word ORDER BY t.word_position_in_row SEPARATOR ' '),' ')
LIKE CONCAT('% ','I love you',' %')
That inline view s would benefit from a covering index with word as the leading column
... ON `mytable` (`word`,`songserial`,`house_number`,`row_number`)
You look for these words and relative search positions: 1 = I, 2 = love, 3 = you. Let's compare them with two song lines:
And I love, love, love you
real pos: 1 2 3 4 5 6
search pos: - 1 2 2 2 3
diff: - 1 1 2 3 3
I miss you and I love you
real pos: 1 2 3 4 5 6 7
search pos: 1 - 3 - 1 2 3
diff: 0 - 0 - 4 4 4
If we look at the position deltas of the first line, we get 1 (twice), 2 (once), and 3 (twice).
For the second line we get deltas 0 (twice), and 4 (thrice).
So for the second song line we find a delta with as many matches as search words, for the first line not. The second line is a match.
And here is the query. I assume we have a temporary table search filled with the search words and relative positions for readability.
select distinct w.songserial, w.songname, w.house_number
from words w
join search s on s.word = w.word
group by
w.songserial, w.songname, w.row_number, w.house_number, w.house_row, -- song line
w.word_position_in_row - s.pos -- delta
having count(*) = (select count(*) from search);
This query is based on:
a song is identified by songserial + songname + house_number
a song line is identified by songserial + songname + row_number + house_number + house_row
This may be wrong; I don't know what house and house number mean in reference to a song. But that'll be easy to adjust.
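The position-delta grouping can be checked against the two example lines. This is a sketch that runs on SQLite through Python's sqlite3 module, not MySQL. The table is cut down to four columns, the column line_no stands in for row_number, punctuation is dropped from the lyrics, and the helpers line and lines_matching are hypothetical.

```python
import sqlite3


def line(serial, line_no, text):
    """Turn one lyric line into (songserial, line_no, word_position_in_row, word) rows."""
    return [(serial, line_no, i, w) for i, w in enumerate(text.split(), start=1)]


def lines_matching(words, phrase):
    """Return (songserial, line_no) pairs for lines containing the phrase in order."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE words (songserial INTEGER, line_no INTEGER, "
        "word_position_in_row INTEGER, word TEXT)"
    )
    conn.execute("CREATE TABLE search (pos INTEGER, word TEXT)")
    conn.executemany("INSERT INTO words VALUES (?, ?, ?, ?)", words)
    conn.executemany(
        "INSERT INTO search VALUES (?, ?)",
        list(enumerate(phrase.split(), start=1)),
    )
    rows = conn.execute(
        """
        SELECT DISTINCT w.songserial, w.line_no
        FROM words w
        JOIN search s ON s.word = w.word
        GROUP BY w.songserial, w.line_no,           -- song line
                 w.word_position_in_row - s.pos     -- delta
        HAVING COUNT(*) = (SELECT COUNT(*) FROM search)
        ORDER BY w.songserial, w.line_no
        """
    ).fetchall()
    conn.close()
    return rows


words = line(1, 1, "And I love love love you") + line(2, 1, "I miss you and I love you")
print(lines_matching(words, "I love you"))
```

In the first line no single delta collects three matches, while in the second line delta 4 does, so only the second line should be reported.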

MYSQL - Find rows, where part of search string matches part of value in column

I wasn't able to find this anywhere, here's my problem:
I have a string like '1 2 3 4 5' and then I have a mysql table that has a column, let's call it numbers, that look like this:
numbers
1 2 6 8 9 14
3
1 5 3 6 9
7 8 9 23 44
10
I am trying to find the easiest way (hopefully in a single query) to find the rows where any of the numbers in my search string (1 or 2 or 3 or 4 or 5) is contained in the numbers column. In the given example I am looking for rows 1, 2 and 3 (since they share numbers with my search string).
I am trying to do this with a single query and no loops.
Thanks!
The best solution would be to get rid of the column containing a list of values, and use a schema where each value is in its own row. Then you can use WHERE number IN (1, 2, 3, 4, 5) and join this with the table containing the rest of the data.
But if you can't change the schema, you can use a regular expression.
SELECT *
FROM yourTable
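WHERE numbers REGEXP '[[:<:]](1|2|3|4|5)[[:>:]]'
[[:<:]] and [[:>:]] match the beginning and end of words. (MySQL 8.0.4 and later use the ICU regex library, which does not support these markers; use \\b at both ends there instead.)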
Note that this type of search will be very slow if the table is large, because it's not feasible to index it.
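To see why the word-boundary markers matter, the same pattern can be tried outside the database. This is only an illustration in Python's re module, where \b plays the role of both boundary markers. The function rows_sharing_number is hypothetical, and the rows are the ones from the question.

```python
import re


def rows_sharing_number(rows, search):
    """Return the rows that contain at least one whole number from the search string."""
    alternatives = "|".join(re.escape(token) for token in search.split())
    # \b on both sides keeps '1' from matching inside '10' or '14'.
    pattern = re.compile(r"\b(?:" + alternatives + r")\b")
    return [row for row in rows if pattern.search(row)]


rows = ["1 2 6 8 9 14", "3", "1 5 3 6 9", "7 8 9 23 44", "10"]
print(rows_sharing_number(rows, "1 2 3 4 5"))
```

Without the boundaries, the row 7 8 9 23 44 would match through the 2 and 3 inside 23, and the row 10 would match through its 1.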
Here is a start point (split string function) : http://blog.fedecarg.com/2009/02/22/mysql-split-string-function/ := SplitString(string,delimiter,position)
Create a function so it converts a string to an array := stringSplitted(string,delimiter)
Create a function so it compares two arrays :=arrayIntersect(array1, array2)
SELECT numbers
FROM table
WHERE arrayIntersect(@argument, numbers)
Two function definitions with loops and one single query without any loop
SELECT * FROM MyTable WHERE (numbers LIKE '%1%' OR numbers LIKE '%2%')
Note that LIKE '%1%' also matches values such as 10 or 14. Alternatively, you can use REGEXP, something like this:
SELECT * FROM events WHERE id REGEXP '5587$'

Natural Sorting SQL ORDER BY

Can anyone lend me a hand as to what I should append to my ORDER BY statement to sort these values naturally:
1
10
2
22
20405-109
20405-101
20404-100
X
Z
D
Ideally I'd like something along the lines of:
1
2
10
22
20404-100
20405-101
20405-109
D
X
Z
I'm currently using:
ORDER BY t.property, l.unit_number
where the values are l.unit_number
I've tried doing l.unit_number * 1 and l.unit_number + 0 but they haven't worked.
Should I be using some sort of conditional ORDER BY, such as CASE WHEN IsNumeric(l.unit_number)?
Thank you.
This will do it:
SELECT value
FROM Table1
ORDER BY value REGEXP '^[A-Za-z]+$'
,CAST(value as SIGNED INTEGER)
,CAST(REPLACE(value,'-','')AS SIGNED INTEGER)
,value
The 4 levels of the ORDER BY:
1. REGEXP assigns any alpha line a 1 and non-alphas a 0.
2. SIGNED INT sorts all of the numbers by the portion preceding the dash.
3. SIGNED INT after removing the dash sorts any of the items with the same value before the dash by the portion after the dash. Potentially could replace number 2, but wouldn't want to treat 90-1 the same as 9-01 should the case arise.
4. The plain value sorts the letters alphabetically.
Demo: SQL Fiddle
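The four ORDER BY levels can be mirrored as a four-part sort key to check them against the sample data. This is a sketch in Python rather than SQL. The function unit_sort_key is hypothetical, and its inner helper leading_int only approximates MySQL's CAST(... AS SIGNED): it reads leading digits and ignores signs and leading whitespace.

```python
import re


def unit_sort_key(value):
    """Build a four-part key mirroring the four ORDER BY levels."""

    def leading_int(text):
        # Simplified stand-in for CAST(text AS SIGNED): leading digits, else 0.
        match = re.match(r"\d+", text)
        return int(match.group()) if match else 0

    is_alpha = 1 if re.fullmatch(r"[A-Za-z]+", value) else 0   # level 1
    before_dash = leading_int(value)                           # level 2
    dash_removed = leading_int(value.replace("-", ""))         # level 3
    return (is_alpha, before_dash, dash_removed, value)        # level 4: the value itself


units = ["1", "10", "2", "22", "20405-109", "20405-101", "20404-100", "X", "Z", "D"]
print(sorted(units, key=unit_sort_key))
```

Level 2 alone would leave 20405-109 and 20405-101 tied, which is what level 3 resolves.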

Oracle/MYSQL: Sort records from a select query on a column that contains alphanumeric values

I know that this question has been asked in various forms but my requirement happens to be a bit different.
Suppose I have a table that contains data as follows:
ID NAME VALUE
-----------------------------
1 ABC-2-2 X
2 PQRS-1-3 Y
3 ABC-3-2 Z
4 PQRS-1-4 A
5 PQRS-3-4 B
6 MNO-2-1 C
7 AAA-1 D
8 BBB-2 E
9 CCC-3 F
Now, the output that I'm expecting should look something like this:
ID NAME VALUE
-----------------------------
7 AAA-1 D
2 PQRS-1-3 Y
4 PQRS-1-4 A
8 BBB-2 E
6 MNO-2-1 C
1 ABC-2-2 X
9 CCC-3 F
3 ABC-3-2 Z
5 PQRS-3-4 B
Note that this is not a direct alpha-numeric sort. Instead, the value before the first "-" is ignored and the fields are sorted on what is after the first "-" in the name.
I'm not very familiar with PL/SQL and any kind of help on this would be appreciated.
Thanks.
PS: Note that this should work on both Oracle and MySQL.
For your example this would suffice (Oracle syntax):
ORDER BY SUBSTR(name,4)
If the number of characters before the first hyphen can vary, you can do this (again Oracle syntax):
ORDER BY SUBSTR(name,INSTR(name,'-')+1)
However that won't work if you have codes like:
AAA-10-1
AAA-8-1
AAA-9-1
and expect AAA-10-1 to appear after AAA-9-1. Then you will need to parse it further:
ORDER BY LPAD(SUBSTR(name,INSTR(name,'-')+1, INSTR(name,'-',1,2)-INSTR(name,'-')-1),10,'0'),
LPAD(SUBSTR(name,INSTR(name,'-',1,2)+1),10,'0')
(NB I have used LPAD(x,10,'0') to turn a value like '1' into '0000000001' and so on, rather than use TO_NUMBER since this could fail if there are any non-numerics in your data.)
Example:
with data as
(
select 'AAA-1' name from dual
union all
select 'PQR-1-4' name from dual
union all
select 'PQR-1-3' name from dual
union all
select 'AAA-10-10' name from dual
union all
select 'AAA-10-1' name from dual
union all
select 'AAA-9-10' name from dual
union all
select 'AAA-9-1' name from dual
)
select *
from data
ORDER BY LPAD(SUBSTR(name,INSTR(name,'-')+1, INSTR(name,'-',1,2)-INSTR(name,'-')-1),10,'0'),
LPAD(SUBSTR(name,INSTR(name,'-',1,2)+1),10,'0');
Output:
NAME
---------
PQR-1-3
PQR-1-4
AAA-9-1
AAA-9-10
AAA-10-1
AAA-10-10
AAA-1
And if AAA-1 should come first:
ORDER BY LPAD(SUBSTR(name,INSTR(name,'-')+1, INSTR(name||'-','-',1,2)-INSTR(name,'-')-1),10,'0'),
LPAD(SUBSTR(name,INSTR(name||'-','-',1,2)+1),10,'0') nulls first
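The zero-padding idea behind the LPAD expressions can be illustrated outside the database. This is a sketch in Python, not Oracle or MySQL. The functions text_key and padded_key are hypothetical, and unlike the SQL above, padded_key pads every segment after the first hyphen rather than exactly two.

```python
def text_key(name):
    """Plain text after the first hyphen, like SUBSTR(name, INSTR(name, '-') + 1)."""
    return name.partition("-")[2]


def padded_key(name, width=10):
    """Zero-pad each segment after the first hyphen so text order equals numeric order."""
    segments = name.split("-")[1:]
    return tuple(segment.rjust(width, "0") for segment in segments)


codes = ["AAA-10-1", "AAA-8-1", "AAA-9-1", "AAA-1"]
print(sorted(codes, key=text_key))
print(sorted(codes, key=padded_key))
```

With the plain text key, 10-1 sorts before 8-1 because the character 1 precedes 8. With padding, 0000000008 sorts before 0000000010. A code with no second segment, such as AAA-1, sorts first here because a shorter tuple precedes a longer one with the same prefix.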
Not sure about the MySQL syntax, but you can do this in Oracle:
select * from <your_table>
order by substr(name, 5)
In SQL Server (MSSQL), the equivalent syntax is:
select * from mytable order by substring(name,PATINDEX('%-%',name)+1,len(name)-PATINDEX('%-%',name))
SqlFiddle