Cast to unsigned takes very long - MySQL

I have the following problem:
I want to query some data from 3 big SQL tables.
eintraege ~13000 rows // rubrik2eintrag ~9500 rows // rubriken ~425 rows
This query
SELECT eintraege.id AS id, eintraege.email, eintraege.eintrags_name, eintraege.telefon,
eintraege.typ, rubrik2eintrag.rubrik AS rubrik, eintraege.status,
IFNULL( GROUP_CONCAT( rubriken.bezeichnung ), '- Keine Rubrik zugeordnet' ) AS rubrikname
FROM eintraege
LEFT OUTER JOIN rubrik2eintrag ON rubrik2eintrag.eintrag = eintraege.id
LEFT OUTER JOIN rubriken ON rubrik = rubriken.rubrik_id
GROUP BY id
ORDER BY `id` DESC
LIMIT 0, 50
works fine for me (~2 seconds response time), but the entries do not appear in the correct order (e.g. the row with id 500 comes right before the row with id 3000).
So I cast the id to unsigned, like this:
ORDER BY CAST(`id` AS UNSIGNED) DESC
But now the query takes nearly 40 seconds.
Is there a better/faster way to get correctly ordered output?

Apparently, id is not defined as an integer (or numeric) datatype. That would explain the ordering: the rows are being ordered by string value.
Some possibilities:
Introduce a new column in the table with an integer datatype, populate/maintain the contents of that column, add an appropriate index with that column as the leading column, and change the query to order by the new column. (That would be the best MySQL approximation of a function-based index.)
Or, store the string value with leading zeros, so they are the same length.
000000000500
000000030000
Or, redefine the id column to be integer type.
Aside from those ideas... no, as long as id is stored as a string there's really no getting around a Using filesort operation to order the rows by integer value.
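To make the ordering difference concrete, here is a small Python sketch (not MySQL; the sample ids are made up). It shows that a descending string sort puts '500' ahead of '3000', and that zero-padding to a fixed width makes plain string order agree with numeric order.

```python
# Hypothetical sample ids, stored as strings the way the question's table apparently does.
ids = ["500", "3000", "42", "12999"]

# Descending string sort compares character by character, so "500" > "42" > "3000".
as_strings = sorted(ids, reverse=True)

# Descending numeric sort, the order the asker expects.
as_integers = sorted(ids, key=int, reverse=True)

# Left-pad with zeros to a fixed width; plain string order now matches numeric order.
padded = sorted((s.zfill(12) for s in ids), reverse=True)

print(as_strings)   # ['500', '42', '3000', '12999']
print(as_integers)  # ['12999', '3000', '500', '42']
print(padded)
```

The padded variant is the only one of the three that an ordinary index on a string column could serve without a per-row conversion.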

Related

Get statistical measures of a varchar field in snowflake

I have a field called MER_DATA in a Snowflake table, with values like the one shown below:
[43,44.25,44.5,42.75,44,44.25,42.75,43,42.5,42.5,36.75,42.25,42.75,43.25,43.25,43.25,42.75,43.5,42,43,43.75,43.75,43.25,41.75,43.25,42.5,43.25,42.75,43.25,43.5,43.25,43.25,43.75,...]
Each row holds approximately 4k numbers (this varies from row to row) and the data type of the field is varchar(30000). The table has around 700k rows.
Now I want to calculate the standard deviation for each row using the numbers in the list shown above.
I have tried doing this in MySQL using the following query:
select mac, `timestamp`, std(res), min(res), max(res)
from
(select mac, `timestamp`, r.res from table cmr ,
json_table(mer_data, '$[*]' columns (res float path '$'))r)T
group by mac, `timestamp`;
which gives me the right result but takes a long time for 700k rows.
I want to do the same in Snowflake. Is there an optimal way to do this?
Also, the query needs to run within 10 minutes in Snowflake. The MySQL query can take up to 1 hour.
Without the table definition and example source data it's difficult to produce a complete solution for your problem, but here is an example of how to do this using the STRTOK_SPLIT_TO_TABLE table function. It first splits your varchar numbers into rows, so we can then re-aggregate the values to get the standard deviation per row.
First generate some test data at the right scale:
Create or Replace Table cmr (mer_data varchar) as
With gen as (
select
uniform(1,700000, random()) row_num,
normal(50, 1, random(0))::decimal(4,2) num
from table(generator(rowcount => 2800000000)) v
)
Select listagg(num, ',') listNums from gen group by row_num
;
Check that we have 700k rows and a varying count of numbers per row.
Select
count(*) row_count,
min(REGEXP_COUNT( mer_data , '[,]' ))+1 min_num_count,
max(REGEXP_COUNT( mer_data , '[,]' ))+1 max_num_count
from cmr limit 10;
Split the varchar number lists to rows with STRTOK_SPLIT_TO_TABLE and group by the generated SEQ column to calculate the stddev of the VALUE.
Select
seq row_num,
-- VALUE comes back as a string; cast it so min/max compare numerically
stddev(value::float) stdListNums,
min(value::float) minNum, max(value::float) maxNum,
count(value) countListNums
from cmr, table(STRTOK_SPLIT_TO_TABLE(mer_data ,','))
Group By 1
;
For my data the query takes just over 3 minutes on an XSMALL Virtual Warehouse, and
a little over 30 seconds on a LARGE Virtual Warehouse.
The STRTOK_SPLIT_TO_TABLE function is described in the Snowflake documentation.
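When comparing results between the two systems, it may help to check one row by hand. The Python sketch below is only an illustration, assuming the value is a bracketed, comma-separated list as in the question; the sample string is a shortened, made-up row. It reports both the population and the sample standard deviation, because MySQL's STD() is the population form while Snowflake's STDDEV is the sample form (STDDEV_POP is the population form there), so the two queries differ slightly.

```python
import statistics

def row_stats(mer_data: str) -> dict:
    """Parse a list such as '[43,44.25,44.5]' and return summary statistics."""
    numbers = [float(tok) for tok in mer_data.strip("[] ").split(",") if tok.strip()]
    return {
        "count": len(numbers),
        "min": min(numbers),
        "max": max(numbers),
        "std_pop": statistics.pstdev(numbers),   # population form, like MySQL STD()
        "std_samp": statistics.stdev(numbers),   # sample form, like Snowflake STDDEV
    }

sample = "[43,44.25,44.5,42.75,44]"  # shortened, hypothetical row
stats = row_stats(sample)
print(stats)
```

With roughly 4k numbers per row the gap between the two forms is small, but it is enough to make an exact comparison of the outputs fail.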

Cannot subtract one value from another (unsigned)

I have a table stats with three columns:
id
up - number of upvotes (just like here at StackOverflow)
down - analogue to up
up and down are INT(11) and UNSIGNED (because they'll only be positive values).
Now when I want to fetch the ten items with the highest (up-down) value, I'm using this query:
SELECT id, up, down, (up-down) AS result
FROM stats
ORDER BY result DESC
LIMIT 0,10
But I'm getting an error
#1690 - BIGINT UNSIGNED value is out of range in
'(`database`.`stats`.`up` - `database`.`stats`.`down`)'
If I leave out the ORDER BY result DESC everything runs smoothly (except for the fact that they're not ordered by result, but the math of up-down is working).
What would I have to change in my query in order to retrieve the correct result? Or do I have to remove the UNSIGNED attribute? But isn't this an appropriate case where I should use that attribute?
Unsigned values remain unsigned, so you have a problem when the result would be negative. Cast to signed before the subtraction:
SELECT id, up, down, cast(up as signed) - cast(down as signed) AS result
FROM stats
ORDER BY result DESC
LIMIT 0, 10;
Or, keep your query and add a WHERE clause (note that this drops the rows where down exceeds up):
SELECT id, up, down, (up-down) AS result
FROM stats
WHERE up >= down
ORDER BY result DESC
LIMIT 0,10;
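The failure mode can be sketched in Python. This is an emulation under stated assumptions, not MySQL itself: the result of subtracting two unsigned integers is treated as BIGINT UNSIGNED (the default behaviour unless the NO_UNSIGNED_SUBTRACTION SQL mode is enabled), so a negative result has nowhere to go. The rows are hypothetical.

```python
UNSIGNED_MAX = 2**64 - 1  # upper bound of BIGINT UNSIGNED

def unsigned_subtract(up: int, down: int) -> int:
    """Subtract with an unsigned result type; a negative result does not fit."""
    result = up - down
    if not 0 <= result <= UNSIGNED_MAX:
        raise OverflowError("BIGINT UNSIGNED value is out of range")
    return result

def signed_subtract(up: int, down: int) -> int:
    """Counterpart of CAST(up AS SIGNED) - CAST(down AS SIGNED)."""
    return up - down

rows = [(1, 10, 3), (2, 4, 9), (3, 7, 7)]  # hypothetical (id, up, down)
ranked = sorted(rows, key=lambda r: signed_subtract(r[1], r[2]), reverse=True)
print([r[0] for r in ranked])  # [1, 3, 2]
```

Row 2 (4 up, 9 down) is the kind of row that triggers error 1690 in the original query; with the signed version it simply sorts last with a score of -5.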

How to sort the columns in the mysql database

I have a column named rating in a MySQL database table with values such as 1+, 2+, ... 9+, 10+, 12+. When I sort this column with the query
select * from tbl_app order by rating desc
I get 9+ as the highest value. Can anyone tell me how to get 12+ as the highest value?
SELECT rating,SUBSTR(rating,1,LENGTH(rating)-1) FROM tbl_app ORDER BY CAST(SUBSTR(rating,1,LENGTH(rating)-1) as SIGNED) DESC;
If the last character is always a '+', the SQL above will work.
What datatype does the rating column have? If it is varchar or text, then this query will not sort the values in descending numeric order.
Probably the easiest thing to do in MySQL is cast those odd looking strings to numbers:
order by cast(rating as unsigned) desc
-- or less explicitly
order by rating + 0 desc
Both of those casts will stop trying to convert the string to a number when they hit the + so you'll get them sorted numerically.
Simply removing the plus signs from the strings will still leave you with strings and '10' < '2' is just as true for strings as '10+' < '2+'. That's actually your whole problem: you're storing numbers as decorated strings when you should be storing them as integers and adding the + decorations when you display them. You really should fix your schema to make sense instead of adding ugly hacks to work around your schema's strange ideas.
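As a rough illustration of what those casts do, the Python sketch below emulates reading only the leading digits of each string and sorting on that number. The helper name leading_int is hypothetical, and Python's string comparison stands in for the column's collation.

```python
import re

def leading_int(text: str) -> int:
    """Return the leading run of digits as an int, or 0 if there is none."""
    match = re.match(r"\s*(\d+)", text)
    return int(match.group(1)) if match else 0

ratings = ["1+", "9+", "10+", "12+", "2+"]

by_string = sorted(ratings, reverse=True)                   # what the question sees
by_number = sorted(ratings, key=leading_int, reverse=True)  # what the cast achieves

print(by_string)  # ['9+', '2+', '12+', '10+', '1+']
print(by_number)  # ['12+', '10+', '9+', '2+', '1+']
```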
Try this:
select convert(replace(rating,'+',' '),unsigned integer) as x from tab order by x desc

Mysql query find specified rows

I have one table, trip_data. Every second I receive packets and insert the data into the database. The trip_data table contains four fields: trip_paramid, fuel_content, creation_time and vehicle_id. I want to select all rows where the difference between creation times is about 2 minutes (approximately, not exactly). The trip_data table contains 40 lakh (4 million) rows, so I need an optimized select query for this. Can anyone help with this? Here is the table schema and sample data for the trip table:
SQlFiddle demo
SELECT
tp.*
FROM
trip_parameters tp
GROUP BY
CONVERT(UNIX_TIMESTAMP (tp.creation_time)/(2*60), unsigned)
ORDER BY
tp.creation_time asc
Note that using UNIX_TIMESTAMP does not allow you to handle dates beyond year 2037. Using the following instead fixes the problem:
CONVERT(TIMESTAMPDIFF(SECOND,'1970-01-01 00:00:00',tp.creation_time)/(2*60), unsigned)
You can do it in one table scan using MySQL user-defined variables. Unfortunately, UDVs have a limited set of data types (integer, decimal, floating-point, binary or nonbinary string). So in this query I use a char variable @ti to store the previous datetime, using CAST to compare it with the Creation_time field. I set the initial value of this variable to (now()-10000000); you can use any date you wish that is earlier than MIN(Creation_time).
Here is the SQLFiddle demo
select * from
(
select trip_parameters.*,
if(ABS(TIMESTAMPDIFF(MINUTE,Creation_time,cast(@ti as datetime)))>=2,1,0) t,
@ti:=if(ABS(TIMESTAMPDIFF(MINUTE,Creation_time,cast(@ti as datetime)))>=2,
cast(Creation_time as char(100)),@ti)
from trip_parameters,
(select @ti:=cast(now()-10000000 as char(100))) a
order by creation_time
) t2
where T=1
order by creation_time
Try this
SELECT trip_paramid, fuel_content, creation_time, vehicle_id
FROM trip_parameters
GROUP BY FLOOR(UNIX_TIMESTAMP(creation_time)/120)
This takes one row from every 2-minute block.
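The bucketing idea behind FLOOR(UNIX_TIMESTAMP(creation_time)/120) can be sketched in Python. This is an illustration only, with made-up one-per-second readings. One difference is labeled in the code: the sketch keeps the earliest reading of each block, whereas MySQL's GROUP BY may return an arbitrary row of each group when ONLY_FULL_GROUP_BY is disabled.

```python
from datetime import datetime, timedelta, timezone

def one_per_block(timestamps, block_seconds=120):
    """Keep the earliest timestamp from each block_seconds-wide bucket."""
    first_in_block = {}
    for ts in sorted(timestamps):
        # Integer division of the epoch seconds plays the role of FLOOR(.../120).
        bucket = int(ts.timestamp()) // block_seconds
        first_in_block.setdefault(bucket, ts)  # earliest wins (an assumption of this sketch)
    return list(first_in_block.values())

start = datetime(2024, 1, 1, tzinfo=timezone.utc)            # hypothetical start time
readings = [start + timedelta(seconds=i) for i in range(300)]  # one reading per second

picked = one_per_block(readings)
print([int((ts - start).total_seconds()) for ts in picked])  # [0, 120, 240]
```

Because buckets are aligned to the epoch, two picked rows are exactly one block apart only when a reading exists at each block boundary; otherwise the gap is approximately 2 minutes, which matches what the question asks for.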

Sorting varchar field numerically in MySQL

I have a field number of type varchar. Even though it is of type varchar, it stores integer values with optional leading zeros. A sort orders them lexicographically ("42" comes before "9"). How can I order by numeric values ("9" to come before "42")?
Currently I use the query:
SELECT * FROM table ORDER BY number ASC
Try this
SELECT * FROM table_name ORDER BY CAST(field_name as SIGNED INTEGER) ASC
There are a few ways to do this:
Store them as numeric values rather than strings. You've already discounted that as you want to keep strings like 00100 intact with the leading zeros.
Order by the strings cast as numeric. This will work but be aware that it's a performance killer for decent sized databases. Per-row functions don't really scale well.
Add a third column which is the numeric equivalent of the string and index on that. Then use an insert/update trigger to ensure it's set correctly whenever the string column changes.
Since the vast majority of databases are read far more often than written, this third option above amortises the cost of the calculation (done at insert/update) over all selects. Your selects will be blindingly fast since they use the numeric column to order (and no per-row functions).
Your inserts and updates will be slower but that's the price you pay and, to be honest, it's well worth paying.
The use of the trigger maintains the ACID properties of the table since the two columns are kept in step. And it's a well-known idiom that you can usually trade off space for time in most performance optimisations.
We've used this "trick" in many situations, such as storing lower-cased versions of surnames alongside the originals (instead of using something like tolower), lengths of identifying strings to find all users with 7-character ones (instead of using len) and so on.
Keep in mind that it's okay to revert from third normal form for performance provided you understand (and mitigate) the consequences.
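A minimal, runnable sketch of the shadow-column idea follows, using Python's built-in sqlite3 module rather than MySQL so that it is self-contained. The table, column, and trigger names are hypothetical, and the trigger syntax is SQLite's; in MySQL one would more likely set NEW.number_int inside BEFORE INSERT and BEFORE UPDATE triggers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (number TEXT NOT NULL, number_int INTEGER);
    CREATE INDEX idx_items_number_int ON items (number_int);

    -- Keep number_int in step with number on insert and on update.
    CREATE TRIGGER items_after_insert AFTER INSERT ON items
    BEGIN
        UPDATE items SET number_int = CAST(NEW.number AS INTEGER)
        WHERE rowid = NEW.rowid;
    END;
    CREATE TRIGGER items_after_update AFTER UPDATE OF number ON items
    BEGIN
        UPDATE items SET number_int = CAST(NEW.number AS INTEGER)
        WHERE rowid = NEW.rowid;
    END;
""")

conn.executemany("INSERT INTO items (number) VALUES (?)", [("42",), ("9",), ("00100",)])
conn.execute("UPDATE items SET number = '007' WHERE number = '42'")

# Ordering uses the integer column; the stored strings keep their leading zeros.
ordered = [row[0] for row in conn.execute("SELECT number FROM items ORDER BY number_int")]
print(ordered)  # ['007', '9', '00100']
```

The SELECT never applies a function to the string column, which is the point of the approach: the conversion cost is paid once per write instead of once per row per query.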
Actually, I've found something interesting:
SELECT * FROM mytable ORDER BY LPAD(LOWER(mycol), 10,0) DESC
This allows you to order the field like:
1
2
3
10
A
A1
B2
10A
111
SELECT * FROM table ORDER BY number + 0
Trick I just learned: add '+0' to the varchar field in the ORDER BY clause:
SELECT * FROM table ORDER BY number+0 ASC
I now see this answer above. I wonder whether this typecasts the field to a number. I have not compared performance. Working great.
For a table with values like ER353, ER280, ER30, ER36,
the default sort will give
ER280
ER30
ER353
ER36
SELECT fieldname, SUBSTRING(fieldname, 1, 2) AS bcd,
CONVERT(SUBSTRING(fieldname, 3, 9), UNSIGNED INTEGER) AS num
FROM table_name
ORDER BY bcd, num;
The results will be in this order:
ER30
ER36
ER280
ER353
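The same split-then-compare idea, sketched in Python for illustration. It assumes, as the SQL above does, a fixed two-character prefix followed only by digits; the helper name is hypothetical.

```python
def prefix_then_number(code: str, prefix_len: int = 2):
    """Sort key: text prefix first, then the rest of the string as an integer."""
    return code[:prefix_len], int(code[prefix_len:])

codes = ["ER353", "ER280", "ER30", "ER36"]

default_order = sorted(codes)                          # plain string comparison
numeric_order = sorted(codes, key=prefix_then_number)  # prefix, then number

print(default_order)  # ['ER280', 'ER30', 'ER353', 'ER36']
print(numeric_order)  # ['ER30', 'ER36', 'ER280', 'ER353']
```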
You can get the ordering you need by using the following SQL query:
SELECT * FROM mytable ORDER BY ABS(mycol)
Given a column username containing VARCHARs like these:
username1
username10
username100
one could do:
SELECT username,
CONVERT(REPLACE(username, 'username', ''), UNSIGNED INTEGER) AS N
FROM users u
WHERE username LIKE 'username%'
ORDER BY N;
It is not cheap, but it does the job.
SELECT * FROM table ORDER BY number ASC
Should display what you want it to display. It looks like you're sorting by id, or number is not defined as an integer at the moment.
MySQL ORDER BY: sorting alphanumeric values in the correct order
Example:
SELECT `alphanumericCol` FROM `tableName` ORDER BY
SUBSTR(`alphanumericCol` FROM 1 FOR 1),
LPAD(lower(`alphanumericCol`), 10,0) ASC
Output:
0
1
2
11
21
100
101
102
104
S-104A
S-105
S-107
S-111
Another option to keep numerics at the top, then order by alpha:
IF(name + 0, name + 0, 9999999), name
Rough and ready: order by 1*field_name