Related
I want select rows in where a day in specific i.e. "Monday", but my type column is a timestamp "AAAA-MM-DD HH:MM:SS". I've searched but I don't how to select this.
My table is this, and the field is forex_pair_price_time (timestamp):
mysql> describe forex_pair_price;
+-------------------------+----------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------------+----------------+------+-----+-------------------+----------------+
...
| forex_pair_price_time | timestamp | NO | | CURRENT_TIMESTAMP | |
...
+-------------------------+----------------+------+-----+-------------------+----------------+
To find all records where the forex timestamp happen to land on Monday, we can try using DAYNAME:
SELECT *
FROM forex_pair_price
WHERE DAYNAME(forex_pair_price_time) = 'Monday';
SELECT forex_pair_price_time,
CASE
WHEN DAYOFWEEK(orex_pair_price_time) = 1 THEN "Sunday"
WHEN DAYOFWEEK(orex_pair_price_time) = 2 THEN "Monday"
WHEN DAYOFWEEK(orex_pair_price_time) = 3 THEN "Tuesday"
WHEN DAYOFWEEK(orex_pair_price_time) = 4 THEN "Wednesday"
WHEN DAYOFWEEK(orex_pair_price_time) = 5 THEN "Thursday"
WHEN DAYOFWEEK(orex_pair_price_time) = 6 THEN "Friday"
WHEN DAYOFWEEK(orex_pair_price_time) = 7 THEN "Saturday"
END Weekday
FROM forex_pair_price
http://www.mysqltutorial.org/mysql-weekday
Use WEEKDAY(date).
The WEEKDAY function returns a weekday index for a date i.e., 0 for Monday, 1 for Tuesday, … 6 for Sunday.
Select * from forex_pair_price
where weekday(forex_pair_price_time)=0
Or
Dayname(forex_pair_price_time)='Monday'
I think it is saver to use weekday because it is an compare to an integer instead of string. This is saver against problems with typos in String.
I have this table, only two columns, each record stores an interest rate for a given month:
id rate
===========
199502 3.63
199503 2.60
199504 4.26
199505 4.25
... ...
201704 0.79
201705 0.93
201706 0.81
201707 0.80
201708 0.14
Based on this rates, I need to create another table of accumulated rates which similar structure, whose data is calculated as function of a YYYYMM (month/year) parameter, this way (this formula is legally mandatory):
The month given as parameter has always rate of 0 (zero)
The month immediately previous has always a rate of 1 (one)
The previous months' rates will be (one) plus the sum of rates of months between that given month and the month given as parameter.
I'll clarify this rules with this example, given parameter 201708:
SOURCE CALCULATED
id rate id rate
=========== =============
199502 3.63 199502 360.97 (1 + sum(rate(199503) to rate(201707)))
199503 2.60 199503 358.37 (1 + sum(rate(199504) to rate(201707)))
199504 4.26 199504 354.11 (1 + sum(rate(199505) to rate(201707)))
199505 4.25 199505 349.86 (1 + sum(rate(199506) to rate(201707)))
... ... ... ...
201704 0.79 201704 3.54 (1 + rate(201705) + rate(201706) + rate(201707))
201705 0.93 201705 2.61 (1 + rate(201706) + rate(201707))
201706 0.81 201706 1.80 (1 + rate(201707))
201707 0.80 201707 1.00 (per definition)
201708 0.14 201708 0.00 (per definition)
Now I've already implemented a VB.NET function that reads the source table and generates the calculated table, but this is done in runtime at each client machine:
Public Function AccumRates(targetDate As Date) As DataTable
Dim dtTarget = Rates.Clone
Dim targetId = targetDate.ToString("yyyyMM")
Dim targetIdAnt = targetDate.AddMonths(-1).ToString("yyyyMM")
For Each dr In Rates.Select("id<=" & targetId & " and id>199412")
If dr("id") = targetId Then
dtTarget.Rows.Add(dr("id"), 0)
ElseIf dr("id") = targetIdAnt Then
dtTarget.Rows.Add(dr("id"), 1)
Else
Dim intermediates =
Rates.Select("id>" & dr("id") & " and id<" & targetId).Select(
Function(ldr) New With {
.id = ldr.Field(Of Integer)("id"),
.rate = ldr.Field(Of Decimal)("rate")}
).ToArray
dtTarget.Rows.Add(
dr("id"),
1 + intermediates.Sum(
Function(i) i.rate))
End If
Next
Return dtTarget
End Function
My question is how can I put this as a query in my database so it can be used dynamically by other queries which would use these accumulated rates to update debts to any given date.
Thank you very much!
EDIT
I managed to make a query that returns the data I want, now I just don't know how to encapsulate it so that it can be called from another query passing any id as argument (here I did it using a SET ... statement):
SET #targetId=201708;
SELECT
id AS id_acum,
COALESCE(1 + (SELECT
SUM(taxa)
FROM
tableSelic AS ts
WHERE
id > id_acum AND id < #targetId
LIMIT 1),
IF(id >= #targetId, 0, 1)) AS acum
FROM
tableSelic
WHERE id>199412;
That's because I'm pretty new to MySQL, I'm used to MS-Access where parametrized queries are very straightfoward to create.
For example:
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(id INT NOT NULL PRIMARY KEY
,rate DECIMAL(5,2) NOT NULL
);
INSERT INTO my_table VALUES
(201704,0.79),
(201705,0.93),
(201706,0.81),
(201707,0.80),
(201708,0.14);
SELECT *
, CASE WHEN #flag IS NULL THEN #i:=1 ELSE #i:=#i+rate END i
, #flag:=1 flag
FROM my_table
, (SELECT #flag:=null,#i:=0) vars
ORDER
BY id DESC;
+--------+------+-------------+-------+------+------+
| id | rate | #flag:=null | #i:=0 | i | flag |
+--------+------+-------------+-------+------+------+
| 201708 | 0.14 | NULL | 0 | 1 | 1 |
| 201707 | 0.80 | NULL | 0 | 1.80 | 1 |
| 201706 | 0.81 | NULL | 0 | 2.61 | 1 |
| 201705 | 0.93 | NULL | 0 | 3.54 | 1 |
| 201704 | 0.79 | NULL | 0 | 4.33 | 1 |
+--------+------+-------------+-------+------+------+
5 rows in set (0.00 sec)
Ok, I made it with a function:
CREATE FUNCTION `AccumulatedRates`(start_id integer, target_id integer) RETURNS decimal(6,2)
BEGIN
DECLARE select_var decimal(6,2);
SET select_var = (
SELECT COALESCE(1 + (
SELECT SUM(rate)
FROM tableRates
WHERE id > start_id AND id < target_id LIMIT 1
), IF(id >= unto, 0, 1)) AS acum
FROM tableRates
WHERE id=start_id);
RETURN select_var;
END
And them a simple query:
SELECT *, AccumulatedRates(id,#present_id) as acum FROM tableRates;
where #present_id is passed as parameter.
Thanks to all, anyway!
I make a cohort analysis processor. Input parameters: time range and step, condition (initial event) to exctract cohorts, additional condition (retention event) to check after each N hours/days/months. Output parameters: cohort analysis grid, like this:
0h | 16h | 32h | 48h | 64h | 80h | 96h |
cohort #00 15 | 6 | 4 | 1 | 1 | 2 | 2 |
cohort #01 1 | 35 | 8 | 0 | 2 | 0 | 1 |
cohort #02 0 | 3 | 31 | 11 | 5 | 3 | 0 |
cohort #03 0 | 0 | 4 | 27 | 7 | 6 | 2 |
cohort #04 0 | 1 | 1 | 4 | 29 | 4 | 3 |
Basically:
fetch cohorts: unique users who did something 1 in every period from time_begin every time_step.
find how many of them (in each cohort) did something 2 after N seconds, N*2 seconds, N*3, and so on until now.
In short - I have 2 solutions. One works too slow and includes a heavy select with joins for each data step: 1 day, 2 day, 3 day, etc. I want to optimize it by joining result for every data step to cohorts - and it's the second solution. It looks like it works but I'm not sure it's the best way and that it will give the same result even if cohorts will intersect. Please check it out.
Here's the whole story.
I have a table of > 100,000 events, something like this:
#user-id, timestamp, event_name
events_view (uid varchar(64), tm int(11), e varchar(64))
example input row:
"user_sampleid1", 1423836540, "level_end:001:win"
To make a cohort analisys first I extract cohorts: for example, users, who send special event '1st_launch' in 10 hour periods starting from 2015-02-13 and ending with 2015-02-16. All code in this post is simplified and shortened to see the idea.
DROP TABLE IF EXISTS tmp_c;
create temporary table tmp_c (uid varchar(64), tm int(11), c int(11) );
set beg = UNIX_TIMESTAMP('2015-02-13 00:00:00');
set en = UNIX_TIMESTAMP('2015-02-16 00:00:00');
select min(tm) into t_start from events_view ;
select max(tm) into t_end from events_view ;
if beg < t_start then
set beg = t_start;
end if;
if en > t_end then
set en = t_end;
end if;
set period = 3600 * 10;
set cnt_c = ceil((en - beg) / period) ;
/*works quick enough*/
WHILE i < cnt_c DO
insert into tmp_c (
select uid, min(tm), i from events_view where
locate("1st_launch", e) > 0 and tm > (beg + period * i)
AND tm <= (beg + period * (i+1)) group by uid );
SET i = i+1;
END WHILE;
Cohorts may consist the same user ids, though usually one user is exist only in one cohort. And in each cohort users are unique.
Now I have temp table like this:
user_id | 1st timestamp | cohort_no
uid1 1423836540 0
uid2 1423839540 0
uid3 1423841160 1
uid4 1423841460 2
...
uidN 1423843080 M
Then I need to again divide time range on periods and calculate for each period how many users from each cohort have sent event "level_end:001:win".
For each small period I select all unique users who have sent "level_end:001:win" event and left join them to tmp_c cohorts table. So I have something like this:
user_id | 1st timestamp | cohort_no | user_id | other fields...
uid1 1423836540 0 uid1
uid2 1423839540 0 null
uid3 1423841160 1 null
uid4 1423841460 2 uid4
...
uidN 1423843080 M null
This way I see how many users from my cohorts are in those who have sent "level_end:001:win", exclude not found by where clause: where t2.uid is not null.
Finally I perform grouping and have counts of users in each cohort, who have sent "level_end:001:win" in this particluar period.
Here's the code:
DROP TABLE IF EXISTS tmp_res;
create temporary table tmp_res (uid varchar(64) CHARACTER SET cp1251 NOT NULL, c int(11), cnt int(11) );
set i = 0;
set cnt_c = ceil((t_end - beg) / period) ;
WHILE i < cnt_c DO
insert into tmp_res
select concat(beg + period * i, "_", beg + period * (i+1)), c, count(distinct(uid)) from
(select t1.uid, t1.c from tmp_c t1 left join
(select uid, min(tm) from events_view where
locate("level_end:001:win", e) > 0 and
tm > (beg + period * i) AND tm <= (beg + period * (i+1)) group by uid ) t2
on t1.uid = t2.uid where t2.uid is not null) t3
group by c;
SET i = i+1;
END WHILE;
/*getting result of the first method: tooo slooooow!*/
select * from tmp_res;
The result I've got (it's ok that some cohorts are not appear on some periods):
"1423832400_1423890000","1","35"
"1423832400_1423890000","2","3"
"1423832400_1423890000","3","1"
"1423832400_1423890000","4","1"
"1423890000_1423947600","1","21"
"1423890000_1423947600","2","50"
"1423890000_1423947600","3","2"
"1423947600_1424005200","1","9"
"1423947600_1424005200","2","24"
"1423947600_1424005200","3","70"
"1423947600_1424005200","4","6"
"1424005200_1424062800","1","7"
"1424005200_1424062800","2","15"
"1424005200_1424062800","3","21"
"1424005200_1424062800","4","32"
"1424062800_1424120400","1","7"
"1424062800_1424120400","2","13"
"1424062800_1424120400","3","24"
"1424062800_1424120400","4","18"
"1424120400_1424178000","1","10"
"1424120400_1424178000","2","12"
"1424120400_1424178000","3","18"
"1424120400_1424178000","4","14"
"1424178000_1424235600","1","6"
"1424178000_1424235600","2","7"
"1424178000_1424235600","3","9"
"1424178000_1424235600","4","12"
"1424235600_1424293200","1","6"
"1424235600_1424293200","2","8"
"1424235600_1424293200","3","9"
"1424235600_1424293200","4","5"
"1424293200_1424350800","1","5"
"1424293200_1424350800","2","3"
"1424293200_1424350800","3","11"
"1424293200_1424350800","4","10"
"1424350800_1424408400","1","8"
"1424350800_1424408400","2","5"
"1424350800_1424408400","3","7"
"1424350800_1424408400","4","7"
"1424408400_1424466000","2","6"
"1424408400_1424466000","3","7"
"1424408400_1424466000","4","3"
"1424466000_1424523600","1","3"
"1424466000_1424523600","2","4"
"1424466000_1424523600","3","8"
"1424466000_1424523600","4","2"
"1424523600_1424581200","2","3"
"1424523600_1424581200","3","3"
It works but it takes too much time to process because there are many queries here instead of one, so I need to rewrite it.
I think it can be rewritten with joins, but I'm still not sure how.
I decided to make a temporary table and write period boundaries in it:
DROP TABLE IF EXISTS tmp_times;
create temporary table tmp_times (tm_start int(11), tm_end int(11));
set cnt_c = ceil((t_end - beg) / period) ;
set i = 0;
WHILE i < cnt_c DO
insert into tmp_times values( beg + period * i, beg + period * (i+1));
SET i = i+1;
END WHILE;
Then I get periods-to-events mapping (user_id + timestamp represent particular event) to temp table and left join it to cohorts table and group the result:
SELECT Concat(tm_start, "_", tm_end) per,
t1.c coh,
Count(DISTINCT( t2.uid ))
FROM tmp_c t1
LEFT JOIN (SELECT *
FROM tmp_times t3
LEFT JOIN (SELECT uid,
tm
FROM events_view
WHERE Locate("level_end:101:win", e) > 0)
t4
ON ( t4.tm > t3.tm_start
AND t4.tm <= t3.tm_end )
WHERE t4.uid IS NOT NULL
ORDER BY t3.tm_start) t2
ON t1.uid = t2.uid
WHERE t2.uid IS NOT NULL
GROUP BY per,
coh
ORDER BY per,
coh;
In my tests this returns the same result as method #1. I can't check the result manually, but I understand how method #1 work more and as far I can see it gives what I want. Method #2 is faster, but I'm not sure it's the best way and it will give the same result even if cohorts will intersect.
Maybe there are well-known common methods to perform a cohort analysis in SQL? Is method #1 I use more reliable than method #2? I work with joins not that often, that's why still do not fully understand joins magic yet.
Method #2 looks like pure magic, and I used to not believe in what I don't understand :)
Thanks for answers!
I am developing a hotel room reservation system, There having only 5 rooms. I want to get the date periods if 5 or more bookings are already done.
Example:
+-------------------------------------------------------------+
|Booking_from_date | Booking_to_date | Number_of_booking_rooms|
+------------------+-----------------+------------------------+
| 2013-01-01 | 2013-01-10 | 3 |
+------------------+-----------------+------------------------+
| 2013-01-06 | 2013-01-15 | 2 |
---------------------------------------------------------------
(now there are total 5 room booked between 2013-01-06 to 2013-01-10, so i want to get this date period).
I tried using MySql, but not achieved yet. Is it possible to create a query like this?
There may a different simpler way, but what i can think of is to split the days and group it yo get sum.
select date_add(a.booking_from_date, interval col1 day),
sum(Number_of_booking_rooms) from
(select * from table1)a,
(select 0 col1 union all
select 1 col1 union all
select 2 union all
select 3 union all
select 4 union all
select 5 union all
select 6 union all
select 7 union all
select 8 union all
select 9) b
where date_add(a.booking_from_date, interval col1 day) <= a.booking_to_date
group by date_add(a.booking_from_date, interval col1 day)
having sum(Number_of_booking_rooms) >= 5
http://sqlfiddle.com/#!2/87813/15
This is the result i got.
DATE SUM
January, 06 2013 00:00:00+0000 5
January, 07 2013 00:00:00+0000 5
January, 08 2013 00:00:00+0000 5
January, 09 2013 00:00:00+0000 5
January, 10 2013 00:00:00+0000 5
I guess this is not possible from just MySql query.But I have a solution by using loop. Please check this code, you will get the output which you want.
/* all active booking from mysql table. */
$this -> data['booked_dates'] = $this -> admin_model -> get_booked_dates_from();//(just a function to get all dates from table)
$array_index = 0;
$not_available_date_range = array();
//declaration
$not_available_date = array();
foreach ($this -> data['booked_dates'] as $key => $booked_date)// loop for each booking dates(each table row).
{
//flag to know package is to be blocked or not...
$block_status = 0;
$check_in_min_date = 0;
$check_out_max_date = 0;
$strDateFrom = $booked_date -> check_in_date;
$strDateTo = $booked_date -> check_out_date;
// Takes two dates, formatted in YYYY-MM-DD.
// Inclusive array of the dates between the from and to dates.
$aryRange = array();
$iDateFrom = strtotime($strDateFrom);
$iDateTo = strtotime($strDateTo);
if ($iDateTo >= $iDateFrom)//just a validation
{
array_push($aryRange, date('Y-m-d', $iDateFrom));
// first entry
$date_gap = 0;
while ($iDateFrom <= $iDateTo)//select each date...between check-in date and check-out date. one by one
{
$date_gap++;
$booking_count = 0;
for ($i = $key; $i <= (count($this -> data['booked_dates']) - 1); $i++)// loop for each booking. // loop have to reduce size. by using good logic..:)
{
$strDateFrom2 = strtotime($this -> data['booked_dates'][$i] -> check_in_date);
$strDateTo2 = strtotime($this -> data['booked_dates'][$i] -> check_out_date);
if (($iDateFrom >= $strDateFrom2) && ($iDateFrom <= $strDateTo2))
{
$booking_count = $booking_count + $this -> data['booked_dates'][$i] -> rooms;
}
}
if ($booking_count >= 5)// compare with maximum available rooms.
{
$block_status = 1;
//room should be blocked.
if ($check_in_min_date == 0)
{
$check_in_min_date = $iDateFrom;
$check_out_max_date = $iDateFrom;
//For if only one day is going to block.
}
else
{
$check_out_max_date = $iDateFrom;
//datefrom is incremented till the maximum booked date.
}
array_push($not_available_date, date('Y-m-d', $iDateFrom));
}
$iDateFrom += 86400;
// add 24 hours
}
}
else
{
echo 'Wrong input!';
}
if ($block_status)//if the room is blocked..
{
$not_available_date_range[$array_index]['check_in_date'] = date('Y-m-d', $check_in_min_date);
$not_available_date_range[$array_index]['check_out_date'] = date('Y-m-d', $check_out_max_date);
$array_index++;
}
}
$this -> data['blocked_dates'] = $not_available_date_range; // Now we have booking period where more than 5 bookings are done.
I want to sort the following data items in the order they are presented below (numbers 1-12):
1
2
3
4
5
6
7
8
9
10
11
12
However, my query - using order by xxxxx asc sorts by the first digit above all else:
1
10
11
12
2
3
4
5
6
7
8
9
Any tricks to make it sort more properly?
Further, in the interest of full disclosure, this could be a mix of letters and numbers (although right now it is not), e.g.:
A1
534G
G46A
100B
100A
100JE
etc....
Thanks!
update: people asking for query
select * from table order by name asc
People use different tricks to do this. I Googled and find out some results each follow different tricks. Have a look at them:
Alpha Numeric Sorting in MySQL
Natural Sorting in MySQL
Sorting of numeric values mixed with alphanumeric values
mySQL natural sort
Natural Sort in MySQL
Edit:
I have just added the code of each link for future visitors.
Alpha Numeric Sorting in MySQL
Given input
1A 1a 10A 9B 21C 1C 1D
Expected output
1A 1C 1D 1a 9B 10A 21C
Query
Bin Way
===================================
SELECT
tbl_column,
BIN(tbl_column) AS binray_not_needed_column
FROM db_table
ORDER BY binray_not_needed_column ASC , tbl_column ASC
-----------------------
Cast Way
===================================
SELECT
tbl_column,
CAST(tbl_column as SIGNED) AS casted_column
FROM db_table
ORDER BY casted_column ASC , tbl_column ASC
Natural Sorting in MySQL
Given input
Table: sorting_test
-------------------------- -------------
| alphanumeric VARCHAR(75) | integer INT |
-------------------------- -------------
| test1 | 1 |
| test12 | 2 |
| test13 | 3 |
| test2 | 4 |
| test3 | 5 |
-------------------------- -------------
Expected Output
-------------------------- -------------
| alphanumeric VARCHAR(75) | integer INT |
-------------------------- -------------
| test1 | 1 |
| test2 | 4 |
| test3 | 5 |
| test12 | 2 |
| test13 | 3 |
-------------------------- -------------
Query
SELECT alphanumeric, integer
FROM sorting_test
ORDER BY LENGTH(alphanumeric), alphanumeric
Sorting of numeric values mixed with alphanumeric values
Given input
2a, 12, 5b, 5a, 10, 11, 1, 4b
Expected Output
1, 2a, 4b, 5a, 5b, 10, 11, 12
Query
SELECT version
FROM version_sorting
ORDER BY CAST(version AS UNSIGNED), version;
Just do this:
SELECT * FROM table ORDER BY column `name`+0 ASC
Appending the +0 will mean that:
0,
10,
11,
2,
3,
4
becomes :
0,
2,
3,
4,
10,
11
I hate this, but this will work
order by lpad(name, 10, 0) <-- assuming maximum string length is 10
<-- you can adjust to a bigger length if you want to
I know this post is closed but I think my way could help some people. So there it is :
My dataset is very similar but is a bit more complex. It has numbers, alphanumeric data :
1
2
Chair
3
0
4
5
-
Table
10
13
19
Windows
99
102
Dog
I would like to have the '-' symbol at first, then the numbers, then the text.
So I go like this :
SELECT name, (name = '-') boolDash, (name = '0') boolZero, (name+0 > 0) boolNum
FROM table
ORDER BY boolDash DESC, boolZero DESC, boolNum DESC, (name+0), name
The result should be something :
-
0
1
2
3
4
5
10
13
99
102
Chair
Dog
Table
Windows
The whole idea is doing some simple check into the SELECT and sorting with the result.
This works for type of data:
Data1,
Data2, Data3 ......,Data21. Means "Data" String is common in all rows.
For ORDER BY ASC it will sort perfectly, For ORDER BY DESC not suitable.
SELECT * FROM table_name ORDER BY LENGTH(column_name), column_name ASC;
I had some good results with
SELECT alphanumeric, integer FROM sorting_test ORDER BY CAST(alphanumeric AS UNSIGNED), alphanumeric ASC
This type of question has been asked previously.
The type of sorting you are talking about is called "Natural Sorting".
The data on which you want to do sort is alphanumeric.
It would be better to create a new column for sorting.
For further help check
natural-sort-in-mysql
If you need to sort an alpha-numeric column that does not have any standard format whatsoever
SELECT * FROM table ORDER BY (name = '0') DESC, (name+0 > 0) DESC, name+0 ASC, name ASC
You can adapt this solution to include support for non-alphanumeric characters if desired using additional logic.
This should sort alphanumeric field like:
1/ Number only, order by 1,2,3,4,5,6,7,8,9,10,11 etc...
2/ Then field with text like: 1foo, 2bar, aaa11aa, aaa22aa, b5452 etc...
SELECT MyField
FROM MyTable
order by
IF( MyField REGEXP '^-?[0-9]+$' = 0,
9999999999 ,
CAST(MyField AS DECIMAL)
), MyField
The query check if the data is a number, if not put it to 9999999999 , then order first on this column, then order on data with text
Good luck!
Instead of trying to write some function and slow down the SELECT query, I thought of another way of doing this...
Create an extra field in your database that holds the result from the following Class and when you insert a new row, run the field value that will be naturally sorted through this class and save its result in the extra field. Then instead of sorting by your original field, sort by the extra field.
String nsFieldVal = new NaturalSortString(getFieldValue(), 4).toString()
The above means:
- Create a NaturalSortString for the String returned from getFieldValue()
- Allow up to 4 bytes to store each character or number (4 bytes = ffff = 65535)
| field(32) | nsfield(161) |
a1 300610001
String sortString = new NaturalSortString(getString(), 4).toString()
import StringUtils;
/**
* Creates a string that allows natural sorting in a SQL database
* eg, 0 1 1a 2 3 3a 10 100 a a1 a1a1 b
*/
public class NaturalSortString {
private String inStr;
private int byteSize;
private StringBuilder out = new StringBuilder();
/**
* A byte stores the hex value (0 to f) of a letter or number.
* Since a letter is two bytes, the minimum byteSize is 2.
*
* 2 bytes = 00 - ff (max number is 255)
* 3 bytes = 000 - fff (max number is 4095)
* 4 bytes = 0000 - ffff (max number is 65535)
*
* For example:
* dog123 = 64,6F,67,7B and thus byteSize >= 2.
* dog280 = 64,6F,67,118 and thus byteSize >= 3.
*
* For example:
* The String, "There are 1000000 spots on a dalmatian" would require a byteSize that can
* store the number '1000000' which in hex is 'f4240' and thus the byteSize must be at least 5
*
* The dbColumn size to store the NaturalSortString is calculated as:
* > originalStringColumnSize x byteSize + 1
* The extra '1' is a marker for String type - Letter, Number, Symbol
* Thus, if the originalStringColumn is varchar(32) and the byteSize is 5:
* > NaturalSortStringColumnSize = 32 x 5 + 1 = varchar(161)
*
* The byteSize must be the same for all NaturalSortStrings created in the same table.
* If you need to change the byteSize (for instance, to accommodate larger numbers), you will
* need to recalculate the NaturalSortString for each existing row using the new byteSize.
*
* #param str String to create a natural sort string from
* #param byteSize Per character storage byte size (minimum 2)
* #throws Exception See the error description thrown
*/
public NaturalSortString(String str, int byteSize) throws Exception {
if (str == null || str.isEmpty()) return;
this.inStr = str;
this.byteSize = Math.max(2, byteSize); // minimum of 2 bytes to hold a character
setStringType();
iterateString();
}
private void setStringType() {
char firstchar = inStr.toLowerCase().subSequence(0, 1).charAt(0);
if (Character.isLetter(firstchar)) // letters third
out.append(3);
else if (Character.isDigit(firstchar)) // numbers second
out.append(2);
else // non-alphanumeric first
out.append(1);
}
private void iterateString() throws Exception {
StringBuilder n = new StringBuilder();
for (char c : inStr.toLowerCase().toCharArray()) { // lowercase for CASE INSENSITIVE sorting
if (Character.isDigit(c)) {
// group numbers
n.append(c);
continue;
}
if (n.length() > 0) {
addInteger(n.toString());
n = new StringBuilder();
}
addCharacter(c);
}
if (n.length() > 0) {
addInteger(n.toString());
}
}
private void addInteger(String s) throws Exception {
int i = Integer.parseInt(s);
if (i >= (Math.pow(16, byteSize)))
throw new Exception("naturalsort_bytesize_exceeded");
out.append(StringUtils.padLeft(Integer.toHexString(i), byteSize));
}
private void addCharacter(char c) {
//TODO: Add rest of accented characters
if (c >= 224 && c <= 229) // set accented a to a
c = 'a';
else if (c >= 232 && c <= 235) // set accented e to e
c = 'e';
else if (c >= 236 && c <= 239) // set accented i to i
c = 'i';
else if (c >= 242 && c <= 246) // set accented o to o
c = 'o';
else if (c >= 249 && c <= 252) // set accented u to u
c = 'u';
else if (c >= 253 && c <= 255) // set accented y to y
c = 'y';
out.append(StringUtils.padLeft(Integer.toHexString(c), byteSize));
}
#Override
public String toString() {
return out.toString();
}
}
For completeness, below is the StringUtils.padLeft method:
public static String padLeft(String s, int n) {
if (n - s.length() == 0) return s;
return String.format("%0" + (n - s.length()) + "d%s", 0, s);
}
The result should come out like the following
-1
-a
0
1
1.0
1.01
1.1.1
1a
1b
9
10
10a
10ab
11
12
12abcd
100
a
a1a1
a1a2
a-1
a-2
áviacion
b
c1
c2
c12
c100
d
d1.1.1
e
MySQL ORDER BY Sorting alphanumeric on correct order
example:
SELECT `alphanumericCol` FROM `tableName` ORDER BY
SUBSTR(`alphanumericCol` FROM 1 FOR 1),
LPAD(lower(`alphanumericCol`), 10,0) ASC
output:
1
2
11
21
100
101
102
104
S-104A
S-105
S-107
S-111
This is from tutorials point
SELECT * FROM yourTableName ORDER BY
SUBSTR(yourColumnName FROM 1 FOR 2),
CAST(SUBSTR(yourColumnName FROM 2) AS UNSIGNED);
it is slightly different from another answer of this thread
For reference, this is the original link
https://www.tutorialspoint.com/mysql-order-by-string-with-numbers
Another point regarding UNSIGNED is written here
https://electrictoolbox.com/mysql-order-string-as-int/
While this has REGEX too
https://www.sitepoint.com/community/t/how-to-sort-text-with-numbers-with-sql/346088/9
SELECT length(actual_project_name),actual_project_name,
SUBSTRING_INDEX(actual_project_name,'-',1) as aaaaaa,
SUBSTRING_INDEX(actual_project_name, '-', -1) as actual_project_number,
concat(SUBSTRING_INDEX(actual_project_name,'-',1),SUBSTRING_INDEX(actual_project_name, '-', -1)) as a
FROM ctts.test22
order by
SUBSTRING_INDEX(actual_project_name,'-',1) asc,cast(SUBSTRING_INDEX(actual_project_name, '-', -1) as unsigned) asc
This is a simple example.
SELECT HEX(some_col) h
FROM some_table
ORDER BY h
order by len(xxxxx),xxxxx
Eg:
SELECT * from customer order by len(xxxxx),xxxxx
Try this For ORDER BY DESC
SELECT * FROM testdata ORDER BY LENGHT(name) DESC, name DESC
SELECT
s.id, s.name, LENGTH(s.name) len, ASCII(s.name) ASCCCI
FROM table_name s
ORDER BY ASCCCI,len,NAME ASC;
Assuming varchar field containing number, decimal, alphanumeric and string, for example :
Let's suppose Column Name is "RandomValues" and Table name is "SortingTest"
A1
120
2.23
3
0
2
Apple
Zebra
Banana
23
86.Akjf9
Abtuo332
66.9
22
ABC
SELECT * FROM SortingTest order by IF( RandomValues REGEXP '^-?[0-9,.]+$' = 0,
9999999999 ,
CAST(RandomValues AS DECIMAL)
), RandomValues
Above query will do sorting on number & decimal values first and after that all alphanumeric values got sorted.
This will always put the values starting with a number first:
ORDER BY my_column REGEXP '^[0-9]' DESC, length(my_column + 0), my_column ";
Works as follows:
Step1 - Is first char a digit? 1 if true, 0 if false, so order by this DESC
Step2 - How many digits is the number? Order by this ASC
Step3 - Order by the field itself
Input:
('100'),
('1'),
('10'),
('0'),
('2'),
('2a'),
('12sdfa'),
('12 sdfa'),
('Bar nah');
Output:
0
1
2
2a
10
12 sdfa
12sdfa
100
Bar nah
Really problematic for my scenario...
select * from table order by lpad(column, 20, 0)
My column is a varchar, but has numeric input (1, 2, 3...) , mixed numeric (1A, 1B, 1C) and too string data (INT, SHIP)