How to order text that contains double colons (::) - mysql

To order by name I'm using 'order by name'
But the names contain double colons : '::'
How can I order by the text that occurs subsequent to the double colons ?
So :
aaaa::bbbb
aaaa::aaaa
aaaa::1234
aaaa::a1234
Will be ordered :
aaaa::1234
aaaa::aaaa
aaaa::a1234
aaaa::bbbb

Order by the substring, using locate to find where it starts:
order by substring(name, locate('::', name) + 2)
This will hurt performance, since no index can be used for the sort.
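For comparison, the same idea can be sketched in application code. This is only an illustration in Python, not part of the original answer; the helper name text_after is hypothetical. Note that a plain character-by-character comparison puts a1234 before aaaa, which differs slightly from the order listed in the question.

```python
def text_after(name, sep="::"):
    """Return the part of name after the first sep (hypothetical helper).

    If sep is absent, the whole name is used as the sort key. That is a
    choice made for this sketch, not something the SQL expression does.
    """
    _head, found, tail = name.partition(sep)
    return tail if found else name


names = ["aaaa::bbbb", "aaaa::aaaa", "aaaa::1234", "aaaa::a1234"]

# Sort on the text after the double colon, comparing by code point.
ordered = sorted(names, key=text_after)
print(ordered)
```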

You would have to create a new field in MySQL and store the second part of your text in it. ORDER BY relies on indexes and sorting algorithms (such as divide and conquer) that operate on whole column values.
As such, it cannot sort efficiently on a specific portion of a string, and even if you did manage to 'fake' a way of doing it, the performance would be terrible due to the lack of an index.
Sorry, I realise this probably isn't the answer you're looking for, but I'm afraid the best way is the slightly longer way. At least you can then do it at lightning-fast speeds if you add an index to the new field :)

You must split the text into two columns and order by the latter one. You can either split and join the columns in application code or use views and stored procedures to make it look like one column to a database client.
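A rough sketch of the two-column idea, with the split and join done in application code. Python's built-in sqlite3 module is used here only so that the example is self-contained; it stands in for MySQL, whose DDL and collation rules differ. The table name items, the column names prefix and suffix, and the index name are all made up for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (prefix TEXT NOT NULL, suffix TEXT NOT NULL)")
# An index on the second part lets the database sort on it cheaply.
conn.execute("CREATE INDEX idx_items_suffix ON items (suffix)")

names = ["aaaa::bbbb", "aaaa::aaaa", "aaaa::1234", "aaaa::a1234"]

# Split each name once, at insert time, instead of on every query.
rows = [tuple(name.split("::", 1)) for name in names]
conn.executemany("INSERT INTO items (prefix, suffix) VALUES (?, ?)", rows)

# Order by the second column and join the parts back together for display.
ordered = [
    prefix + "::" + suffix
    for prefix, suffix in conn.execute(
        "SELECT prefix, suffix FROM items ORDER BY suffix"
    )
]
print(ordered)
```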

About your sorting: by ASCII value, digits come before letters,
so aaaa::1234 should come first.
You can retrieve the values and sort them in PHP with natsort:
<?php
$arr = array("aaaa::bbbb","aaaa::aaaa","aaaa::1234","aaaa::a1234");
$sec=$arr;
natsort($sec);
print_r ($sec);
?>

You may try the following approach:
Get all records where all the data after :: is alphabetic
UNION
Get all records where all the data after :: is numeric

Related

Match specific string before user input

I have the following strings:
SDZ420-1241242,
AS42-9639263,
SPF3-2352353
I want to "escape" the SDZ420- part while searching and only search using the last digits, so far I've tried RLIKE '^[a-zA-Z\d-]' which works but I am confused on how to add the next digits (user input, say 1241242) to it. I cannot use LIKE '%$input' since that would return a row even if I just input '242' as the search string.
In simple words, a user input of '1241242' should return the row with 'SDZ420-1241242'. Is there any other approach other than creating a separate table with the numbers only?
Note that, without jumping through some crazy hoops, this search needs to hit every row in the table; if you have an index on this column, it is not going to be used. An index of the proper kind (which they tend to be) is generally used only when you search on the start of the value, i.e. with LIKE 'needle%', and not with RLIKE. If that's a problem, storing the digits separately and putting an index on them is probably the simplest way to solve your problem here.
To query for the final few digits, why not:
SELECT * FROM foo WHERE colName LIKE ?
with the string made in your programming language via:
String searchTerm = "%-" + digits;
You can also pass in the number as a string and use:
where substring_index(colname, '-', -1) = ?
This does not require changing the value in the application code.
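To make the matching rule concrete, here is a small Python sketch of what comparing the part after the last '-' means. It is an illustration only; the function names last_segment and find_by_digits are hypothetical.

```python
def last_segment(code, delimiter="-"):
    """Return the text after the last delimiter, or the whole string if none."""
    return code.rsplit(delimiter, 1)[-1]


def find_by_digits(codes, digits):
    """Keep only the codes whose final segment equals digits exactly."""
    return [code for code in codes if last_segment(code) == digits]


codes = ["SDZ420-1241242", "AS42-9639263", "SPF3-2352353"]

print(find_by_digits(codes, "1241242"))  # the full number matches one row
print(find_by_digits(codes, "242"))      # a partial number matches nothing
```

Because the comparison is an exact match on the whole final segment, a short input such as '242' does not match a longer number that merely ends in it.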

Performance of LIKE 'xyz%' v/s LIKE '%xyz'

I was wondering how the LIKE operator actually works.
Does it simply start from first character of the string and try matching pattern, one character moving to the right? Or does it look at the placement of the %, i.e. if it finds the % to be the first character of the pattern, does it start from the right most character and starts matching, moving one character to the left on each successful match?
Not that I have any use case in my mind right now, just curious.
edit: made question narrow
If there is an index on the column, putting constant characters in the front will lead your dbms to use a more efficient searching/seeking algorithm. But even at the simplest form, the dbms has to test characters. If it is able to find it doesn't match early on, it can discard it and move onto the next test.
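The difference can be sketched with a sorted list standing in for an index. This is a simplified model in Python, not how any particular DBMS is implemented; the function names are made up. A constant prefix lets the search jump straight to the first candidate and stop at the first non-match, whereas a leading wildcard forces a test of every value.

```python
from bisect import bisect_left


def prefix_matches(sorted_values, prefix):
    """Model of LIKE 'prefix%': seek into the sorted values, then read a range."""
    start = bisect_left(sorted_values, prefix)  # binary search, like an index seek
    matches = []
    for value in sorted_values[start:]:
        if not value.startswith(prefix):
            break  # values are sorted, so nothing after this can match
        matches.append(value)
    return matches


def suffix_matches(values, suffix):
    """Model of LIKE '%suffix': sort order does not help, so test every value."""
    return [value for value in values if value.endswith(suffix)]


values = sorted(["abc", "xyz", "xyzzy", "xy", "axyz", "xyz1"])

print(prefix_matches(values, "xyz"))
print(suffix_matches(values, "xyz"))
```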
The LIKE search condition uses wildcards to search for patterns within a string. For example:
WHERE name LIKE 'Mickey%'
will locate all values that begin with 'Mickey', optionally followed by any number of characters. Whether the match is case sensitive or accent sensitive depends on the collation (many default collations are neither), and you can use multiple % wildcards, for example
WHERE name LIKE '%mouse%'
will return all values with 'mouse' (or 'Mouse' or 'mousé') in it.
The % is inclusive, meaning that
WHERE name like '%A%'
will return all values that start with 'A', contain 'A', or end with 'A'.
You can use _ (underscore) for any character on a single position:
WHERE name LIKE '_at%'
will give you all values with 'a' as the second letter and 't' as the third. The first letter can be anything. For example: 'Batman'
In T-SQL, if you use [] you can find values in a range.
WHERE name LIKE '[c-f]%'
it will find any value beginning with letter between c and f, inclusive. Meaning it will return any value that start with c, d, e or f. This [] is T-SQL only. Use [^ ] to find values not in a range.
Finding all values that contain a number:
WHERE name LIKE '%[0-9]%'
returns everything that has a number in it. Example: 'Godfather2'
If you are looking for all values with the 3rd position to be a '-' (dash) use two underscores:
WHERE name LIKE '__-%'
It will return for example: 'Lo-Res'
Finding the values with names ends in 'xyz' use:
WHERE name LIKE '%xyz'
returns anything that ends with 'xyz'
Finding a % sign in a name use brackets:
WHERE name LIKE '%[%]%'
will return for example: 'Top%Movies'
Searching for [ use brackets around it:
WHERE name LIKE '%[[]%'
gives results as: 'New York [NY]'
The database collation's sort order determines both case sensitivity and the sort order for the range of characters. You can optionally use COLLATE to specify the collation sort order used by the LIKE operator.
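The % and _ wildcards described above can be modelled by translating a pattern into a regular expression. The following Python sketch is only an approximation: it assumes a case-insensitive collation, handles only % and _, and treats brackets as literal characters, so the T-SQL [] ranges are not covered. The function names are hypothetical.

```python
import re


def like_to_regex(pattern):
    """Translate a LIKE pattern using only % and _ into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")  # any run of characters, including none
        elif ch == "_":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    # IGNORECASE mimics a case-insensitive collation (assumption of this sketch).
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def like(value, pattern):
    """Return True if the whole value matches the LIKE pattern."""
    return like_to_regex(pattern).fullmatch(value) is not None


print(like("Mickey Mouse", "Mickey%"))
print(like("Batman", "_at%"))
print(like("Lo-Res", "__-%"))
print(like("at", "_at%"))
```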
Usually the main performance bottleneck is I/O. The efficiency of the LIKE operator matters only if your whole table fits in memory; otherwise I/O will take most of the time.
AFAIK Oracle can use indexes for prefix matching (LIKE 'abc%'), but these indexes cannot be used for more complex expressions.
Anyway, if you only have this kind of query, you should consider a simple index on the relevant column. (This is probably true for other RDBMSs as well.)
Otherwise the LIKE operator is generally slow, but most RDBMSs have some kind of full-text search solution. I think the main reason for the slowness is that LIKE is too general. Full-text indexes usually have many options that tell the database what you really want to search for, and with this additional information the DB can do its task more efficiently.
As a rule of thumb, if you want to search in a text field and you think performance can be an issue, consider your RDBMS's full-text search solution. If the real goal is not text searching but the search is some kind of "design side effect" (for example XML/JSON/statuses stored in a field as text), then you should probably consider a more efficient way of storing the data (if there is one).

Select area code from phone number entries

I want to select only the area code from a list of column entries populated by phone numbers. This is what I have:
SELECT LEFT(phone, 3) AS areacode, COUNT(phone) AS count
FROM registration
GROUP BY areacode;
The problem is, the entries aren't consistent. So some phone numbers start as +123-456-7899, and others with (123)-456-7899, and others with no symbol at the beginning.
So my question is: is there a way that I can ensure the SELECT LEFT starts at the first integer?
Thanks!
There are some things that SQL is just not meant for. This is one. I would select the phone number into a string, and do some pattern matching in your programming language of choice to find the area code.
-OR-
Change your table such that area code is a different column.
Two options (neither of which is SQL):
Select all phone numbers and use a programming language of your choice to programmatically strip out the unnecessary characters.
Clean the input to strip out all unnecessary characters prior to inserting it into the database.
SQL alone is not the best way to do this; rather, use SQL plus a programming language.
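As an illustration of the application-side approach, here is a minimal Python sketch that strips everything except digits and then takes the first three. It assumes, as in the question's examples, that the first three digits are always the area code (so a leading country code is not handled). The function name area_code and the sample list are made up.

```python
import re
from collections import Counter


def area_code(phone):
    """Return the first three digits of phone, ignoring all other characters."""
    digits = re.sub(r"[^0-9]", "", phone)  # drop '+', '(', ')', '-', spaces, ...
    return digits[:3] if len(digits) >= 3 else None


phones = ["+123-456-7899", "(123)-456-7899", "123-456-7899", "(555) 456-7899"]

# Count entries per area code, the equivalent of the GROUP BY in the question.
counts = Counter(area_code(phone) for phone in phones)
print(counts)
```

The same function could be applied before insertion, so that the cleaned area code is stored in its own column.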
There actually is a way to do this in SQL that was intentionally designed for this exact purpose.
SELECT SUBSTRING(office_phone_number, 1, 3) FROM contact;
Of course, this depends on how the number is stored in the table. If parentheses are present, your starting position would be off.
Here is more information:
MySQL substring function

extracting strings from mysql field

Total slow-moment day. I need to extract different parts of a field in a MySQL database, based on which language is selected.
ex:
<!--:en-->Overview<!--:--><!--:es-->Overview<!--:--><!--:fr-->Présentation<!--:--><!--:ar-->نظرة عامة<!--:-->
so if my language is french for example, i want the part between <!--:fr--> and <!--:-->
any ideas?
Strings processing is not the strongest part of MySQL. But here is one idea:
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(column_name, '<!--:fr-->', -1), '<!--:-->', 1) FROM table_name
The easier way would be to use a substring. First find the index of the language marker in the string. After that, find the index of the end marker (<!--:-->) and extract what's in the middle, which is the value you want.
A more elaborate way would be to use regular expressions. The implementation depends on the language you are coding in.
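The substring approach can be sketched in Python as follows. This is one possible reading of the steps, not the answerer's own code; the function name extract_language is hypothetical, and the sample field is the one from the question.

```python
def extract_language(text, lang, end_marker="<!--:-->"):
    """Return the text between <!--:lang--> and the next end marker, or None."""
    start_marker = "<!--:" + lang + "-->"
    start = text.find(start_marker)
    if start == -1:
        return None  # this language is not present in the field
    start += len(start_marker)
    end = text.find(end_marker, start)
    # If the end marker is missing, take everything up to the end of the field.
    return text[start:] if end == -1 else text[start:end]


field = (
    "<!--:en-->Overview<!--:--><!--:es-->Overview<!--:-->"
    "<!--:fr-->Présentation<!--:--><!--:ar-->نظرة عامة<!--:-->"
)

print(extract_language(field, "fr"))
```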

MySQL - Extracting numbers out of strings

In a MySQL database, I have a table which contains itemID, itemName and some other fields.
Sample records (respectively itemID and itemName):
vaX652bp_X987_foobar, FooBarItem
X34_bar, BarItem
tooX56, TOOX_What
I want to write a query which gives me an output like:
652, FooBarItem
34, BarItem
56, TOOX_What
In other words, I want to extract the number from the itemID column, with the condition that the extracted number should be the one that occurs after the first occurrence of the character "X" in the itemID column.
I am currently trying out locate() and substring() but could not (yet) achieve what I want..
EDIT:
Unrelated to the question - Can any one see all the answers (currently two) to this question ? I see only the first answer by "soulmerge". Any ideas why ? And the million dollar question - Did I just find a bug ?!
That's a horrible thing to do in MySQL, since it does not support extraction of regex matches (at least in versions before 8.0, which added REGEXP_SUBSTR). I would rather recommend pulling the data into your language of choice and processing it there. If you really must do this in MySQL, using unreadable combinations of LOCATE and SUBSTRING with multiple CASEs is the only thing I can think of.
Why don't you add a third column that stores the number alone, extracted (in PHP or similar) at the moment the record is inserted? This way you use a little more space to save a lot of processing.
Table:
vaX652bp_X987_foobar, 652, FooBarItem
X34_bar, 34, BarItem
tooX56, 56, TOOX_What
This isn't so unreadable :
SELECT 0+SUBSTRING(itemID, LOCATE("X", itemID)+1), itemName FROM tableName
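For readers who prefer to do this in application code, as the other answers suggest, here is a rough Python equivalent. It is a sketch under a few assumptions: the function name is made up, the search for "X" is case sensitive (MySQL's LOCATE follows the collation instead), and it returns None where the SQL expression above would yield 0.

```python
import re


def number_after_first_x(item_id):
    """Return the run of digits right after the first 'X' as an int, or None."""
    pos = item_id.find("X")  # case-sensitive, unlike LOCATE under many collations
    if pos == -1:
        return None
    match = re.match(r"[0-9]+", item_id[pos + 1:])
    return int(match.group()) if match else None


records = [
    ("vaX652bp_X987_foobar", "FooBarItem"),
    ("X34_bar", "BarItem"),
    ("tooX56", "TOOX_What"),
]

for item_id, item_name in records:
    print(number_after_first_x(item_id), item_name)
```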