I've stumbled on a previously asked and answered question here:
How to use comparison operator for numeric string in MySQL?
I absolutely agree with the answer being the best mentioned. But it left me with a question myself while I was trying to create my own answer. I was trying to select the first number and convert it to an integer. Next I wanted to compare that integer with a number (3 in case of the question).
This is the query I've created:
SELECT experience,
CONVERT(SUBSTRING_INDEX(experience,'-',1), UNSIGNED INTEGER) AS num
FROM employee
WHERE @num >= 3;
For the sake of simplicity, assume the data inside experience is: 4-8
The query doesn't return any errors. But it doesn't return the data either. I know it's possible to compare the data inside a column with a user defined variable. But is it possible to compare data (the integer in this case) with the variable like I'm trying to do?
This is purely out of curiosity and to learn something.
Yes, a derived table will do. The inner select block below is a derived table. And every derived table needs a name. In my case, xDerived.
The strategy is to compute the alias inside the derived table. A column alias cannot be referenced in the WHERE clause of the same SELECT, but coming out of the derived chunk is a clean column named num which the outer select is free to use.
Schema
create table employee
( id int auto_increment primary key,
experience varchar(20) not null
);
-- truncate table employee;
insert employee(experience) values
('4-5'),('7-1'),('4-1'),('6-5'),('8-6'),('5-9'),('10-4');
Query
select id,experience,num
from
( SELECT id,experience,
CONVERT(SUBSTRING_INDEX(experience,'-',1),UNSIGNED INTEGER) AS num
FROM employee
) xDerived
where num>=7;
Results
+----+------------+------+
| id | experience | num |
+----+------------+------+
| 2 | 7-1 | 7 |
| 5 | 8-6 | 8 |
| 7 | 10-4 | 10 |
+----+------------+------+
Note, your @num concept was faulty, but hopefully I interpreted what you meant to do above.
Also, I went with 7 not 3 because all your sample data would have returned, and I wanted to show you it would work.
The AS num clause names the result of the CONVERT as num; it does not create a variable named @num.
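As for why the original query runs without an error yet returns nothing: assuming the WHERE clause referenced a MySQL user variable that was never assigned, that variable evaluates to NULL, and a comparison with NULL is itself NULL, which WHERE treats as not true. Here is a minimal sketch of just that NULL behaviour. It uses Python's built-in sqlite3 module as a stand-in for MySQL (my assumption; SQLite has no user variables, so NULL is written literally).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (experience TEXT NOT NULL)")
conn.execute("INSERT INTO employee VALUES ('4-8')")

# NULL stands in for the unset user variable.
null_cmp = conn.execute("SELECT NULL >= 3").fetchone()[0]
rows = conn.execute(
    "SELECT experience FROM employee WHERE NULL >= 3"
).fetchall()

print(null_cmp)  # the comparison yields NULL (None in Python)
print(rows)      # so no row passes the WHERE clause
```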
You could repeat the convert
SELECT experience,CONVERT(SUBSTRING_INDEX(experience,'-',1),UNSIGNED INTEGER)
FROM employee
WHERE CONVERT(SUBSTRING_INDEX(experience,'-',1),UNSIGNED INTEGER) >= 3;
Or use a partial (derived) table (only one convert)
SELECT experience,num
FROM (select experience,
CONVERT(SUBSTRING_INDEX(experience,'-',1),UNSIGNED INTEGER) as num
FROM employee) as partialtable WHERE num>=3;
Much simpler. (Or at least much shorter.) This will work for the data as described, namely "number, -, other stuff".
SELECT experience,
0+experience AS 'FirstPart'
FROM employee
WHERE 0+experience >= 3
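Why? 0+string is evaluated as "convert the string to a number, then add it to 0". The implicit conversion reads the leading numeric portion of the string, stops at the first character that cannot be part of a number, and uses 0 if there is no numeric prefix. MySQL raises a truncation warning for the ignored remainder.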
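To make the conversion rule concrete, here is a rough Python model of it. This is a simplified sketch, not MySQL's actual parser: the function name leading_number and the regular expression are my own, and edge cases (very long numbers, unusual whitespace) are not covered.

```python
import re

# Simplified model of a numeric prefix: optional sign, digits with an
# optional fraction, and an optional exponent.
_PREFIX = re.compile(r"\s*([+-]?(?:[0-9]+\.?[0-9]*|\.[0-9]+)(?:[eE][+-]?[0-9]+)?)")

def leading_number(text: str) -> float:
    """Return the number at the start of text, or 0.0 if there is none."""
    match = _PREFIX.match(text)
    return float(match.group(1)) if match else 0.0

experience = ["4-8", "2-5", "10-4", "n/a"]
# Mirrors: WHERE 0+experience >= 3
kept = [e for e in experience if leading_number(e) >= 3]
print(kept)
```

Only the part before the dash takes part in the comparison, which is why the shortcut works for data shaped like "number, -, other stuff".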
Related
Suppose we have 2 numbers of 3 bits each attached together like '101100', which basically represents 5 and 4 combined. I want to be able to perform aggregation functions like SUM() or AVG() on this column separately for each individual 3-bit column.
For instance:
'101100'
'001001'
sum(first three column) = 6
sum(last three column) = 5
I have already tried the SUBSTRING() function, however, speed is the issue in that case as this query will run on millions of rows regularly. And string matching will slow the query.
I am also open to any other databases or technologies that may support this functionality.
You can use the function conv() to convert any part of the string to a decimal number:
select
sum(conv(left(number, 3), 2, 10)) firstpart,
sum(conv(right(number, 3), 2, 10)) secondpart
from tablename
See the demo.
Results:
| firstpart | secondpart |
| --------- | ---------- |
| 6 | 5 |
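For reference, the arithmetic behind those two sums can be checked outside the database. The sketch below is plain Python, not MySQL. The second half assumes the column could be stored as an integer instead of a string (my assumption, not something the question states), in which case a shift and a mask replace the substring calls.

```python
rows = ["101100", "001001"]

# Same idea as conv(left(number, 3), 2, 10) and conv(right(number, 3), 2, 10).
first_sum = sum(int(bits[:3], 2) for bits in rows)
second_sum = sum(int(bits[-3:], 2) for bits in rows)

# Hypothetical integer encoding of the same rows: shift and mask, no strings.
as_ints = [int(bits, 2) for bits in rows]
first_sum_bits = sum((value >> 3) & 0b111 for value in as_ints)
second_sum_bits = sum(value & 0b111 for value in as_ints)

print(first_sum, second_sum)
print(first_sum_bits, second_sum_bits)
```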
With the current understanding I have of your schema (which is next to none), the best solution would be to restructure your schema so that each data point is its own record instead of all the data points being in the same record. Doing this allows you to have a dynamic number of data points per entry. Your resulting table would look something like this:
id | data_type | value
ID is used to tie all of your data points together. If you look at your current table, this would be whatever you are using for the primary key. For this answer, I am assuming id INT NOT NULL but yours may have additional columns.
Data Type indicates what type of data is stored in that record. This would be the current table's column name. I will be using data_type_N as my values, but yours should be a more easily understood value (e.g. sensor_5).
Value is exactly what it says it is, the value of the data type for the given id. Your values appear to be all numbers under 8, so you could use a TINYINT type. If you have different storage types (VARCHAR, INT, FLOAT), I would create a separate column per type (val_varchar, val_int, val_float).
The primary key for this table now becomes a composite: PRIMARY KEY (id, data_type). Since your previously single record will become N records, the primary key will need to adjust to accommodate that.
You will also want to ensure that you have indexes that are usable by your queries.
Some sample values (using what you placed in your question) would look like:
1 | data_type_1 | 5
1 | data_type_2 | 4
2 | data_type_1 | 1
2 | data_type_2 | 1
Doing this, summing the values now becomes trivial. You would only need to ensure that data_type_N is summed with data_type_N. As an example, this would be used to sum your example values:
SELECT data_type,
SUM(value)
FROM my_table
WHERE id IN (1,2)
GROUP BY data_type
Here is an SQL Fiddle showing how it can be used.
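If you want to try the restructured layout without a MySQL server, here is a small self-contained sketch using Python's sqlite3 module as a stand-in (an assumption on my part; the table and column names follow the ones used above, and the column types are only approximations of what you would pick in MySQL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE my_table (
        id        INT     NOT NULL,
        data_type TEXT    NOT NULL,
        value     TINYINT NOT NULL,
        PRIMARY KEY (id, data_type)   -- one row per (id, data point)
    )
""")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, "data_type_1", 5), (1, "data_type_2", 4),
     (2, "data_type_1", 1), (2, "data_type_2", 1)],
)

totals = dict(conn.execute("""
    SELECT data_type, SUM(value)
    FROM my_table
    WHERE id IN (1, 2)
    GROUP BY data_type
"""))
print(totals)
```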
If I have a large table of floating-point numbers, can it improve read speed if I add a column that holds the integer part of each float? Maybe if the integer column is indexed, then when I need to select all the floats that start with a certain integer, it will "filter out" the values that are certainly not needed?
For example, if there are 10,000 numbers, 5,000 of which begin with 14 (14.232, 14.666, etc.), is there an SQL statement that can increase the selection speed if I add the integer column?
id | number | int_value |
1 | 11.232 | 11 |
2 | 30.114 | 30 |
3 | 14.888 | 14 |
.. | .. | .. |
3005 | 14.332 | 14 |
You can create a nonclustered index on the number column itself, and when selecting the data from the table you can filter with the LIKE operator. There is no need for an additional column:
Select * from mytable
where number like '14%'
First of all: Do you have performance issues? If not then why worry?
Then: You need to store decimals, but you are sometimes only interested in the integer part. Yes?
So you have one or more queries of the type
where number >= 14 and number < 15
or
where truncate(number, 0) = 14
Do you already have indexes on the number? E.g.
create index idx on mytable(number);
The first mentioned WHERE clause would probably benefit from it. The second doesn't, because when you invoke a function on the column, the DBMS doesn't see the relation to the index anymore. This shows it can make a difference how you write the query.
If the first WHERE clause is still too slow in spite of the index, you can create a computed column (ALTER TABLE mytable ADD numint int GENERATED ALWAYS AS (truncate(number, 0)) STORED), index that, and access it instead of the number column in your query. But I doubt that would speed things up noticeably.
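To see the difference between the two WHERE clauses for yourself, compare the query plans. On MySQL the tool for that is EXPLAIN. The sketch below only illustrates the general idea with Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN (a stand-in I am assuming here; SQLite's planner is not MySQL's, CAST replaces truncate, and the extra note column is made up so the index is not covering).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, number REAL, note TEXT)")
conn.execute("CREATE INDEX idx ON mytable(number)")

def plan(where_clause):
    """Return SQLite's plan description for a SELECT with this WHERE clause."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM mytable WHERE " + where_clause
    ).fetchall()
    return " / ".join(row[-1] for row in rows)  # last column is the description

range_plan = plan("number >= 14 AND number < 15")   # bare column: index usable
func_plan = plan("CAST(number AS INTEGER) = 14")    # function on column: index hidden

print(range_plan)
print(func_plan)
```

The exact wording of the plans depends on the SQLite version, so look for whether the index name appears rather than for a specific string.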
As to your example:
if there are 10,000 numbers, 5000 of which begin with 14
This is not called a large table, but a small one. And as you'd want half of the records anyway, the DBMS would simply read all records sequentially and look at the number. It doesn't make a difference whether it looks at an integer or a decimal number. (Well, some nanoseconds maybe, but nothing you would notice.)
I was doing some system testing and expected empty results from MySQL (5.7.21), but was surprised to get results.
My transactions table looks like this:
Column Data type
----------------------------
id | INT
fullnames | VARCHAR(40)
---------------------------
And I have some records
--------------------------------
id | fullnames
--------------------------------
20 | Mutinda Boniface
21 | Boniface M
22 | Some-other Guy
-------------------------------
My sample queries:
select * from transactions where id = "20"; -- gives me 1 record which is fine
select * from transactions where id = 20; -- gives me 1 record - FINE as well
Now it gets interesting when I try with these:
select * from transactions where id = "20xxx"; -- gives me 1 record - what is happening here?
What does MySQL do here??
MySQL plays fast and loose with type conversions. When implicitly converting a char to a number, it will take characters from the beginning of the string as long as they are digits, and ignore the rest. In your example, xxx aren't digits, so MySQL only takes the initial "20".
One way around this (which is horrible for performance, since you lose the use of any index you may have on the column) is to explicitly cast the numeric side to a character, so that the comparison is done on strings:
SELECT * FROM transactions WHERE CAST(id AS CHAR) = '20xxx';
EDIT:
Referencing the discussion about performance from the comments - performing the cast to a number on the client-side is probably the best approach, as it will allow you to avoid sending queries to the database when you know no rows should be returned (i.e., when your input is not a valid number, such as "20x").
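A minimal sketch of that client-side check, in Python. The helper name parse_id is made up for illustration, and the rule "optional sign followed by ASCII digits" is my assumption about what counts as a valid id in your application.

```python
import re

# Assumption: a valid id is an optional sign followed by ASCII digits only.
_INTEGER = re.compile(r"[+-]?[0-9]+")

def parse_id(raw: str):
    """Return raw as an int, or None if it is not a plain integer."""
    return int(raw) if _INTEGER.fullmatch(raw) else None

for raw in ("20", "20xxx"):
    value = parse_id(raw)
    if value is None:
        print(f"{raw!r}: not a number, skip the query")
    else:
        print(f"{raw!r}: query with id = {value}")
```

Passing the parsed integer as a bound parameter also means the database never has to convert a string at all.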
An alternative hack could be to cast the input to a number and back again to a string, and compare the lengths. If the lengths are the same it means the input string was fully converted into a number and no characters were omitted. This should be OK WRT performance, since this comparison is performed on an inputted string, not on a value from the column, and the column's index can still be used if the condition passes the short-circuit evaluation of the input:
SELECT *
FROM transactions
WHERE LENGTH(:input) = LENGTH(CAST(:input AS SIGNED)) AND id = :input;
I have a simple MySQL table made up of words and an associated number. The numbers are unique for each word. I want to find the first word whose index is larger than a given number. As an example:
-----------------------
| WORD: | F_INDEX: |
|---------------------|
| a | 5 |
| cat | 12 |
| bat | 4002 |
-----------------------
If I was given the number "9" I would want "cat" returned, as it is the first word whose index is larger than 9.
I know that I can get a full list of sorted rows by querying:
SELECT * FROM table_name ORDER BY f_index;
But I would, instead, like to make a MySQL query that does this. (The confusion lies in the fact that I'm unsure how to keep track of the current row in my query.) I know I can loop with something like this:
CREATE PROCEDURE looper(desired_index INT)
BEGIN
DECLARE current_index INT DEFAULT 0;
-- Loop here, setting current_index to whatever the next row's index is,
-- then compare it to desired_index, breaking out
-- if it is greater.
END;
Any help would be greatly appreciated.
Try this:
SELECT t.word
, t.f_index
FROM table_name t
WHERE t.f_index > 9
ORDER
BY t.f_index
LIMIT 1
It is much more efficient to have the database return the row you need, than it is to pull a whole bunch of rows and figure out which one you need.
For best performance of this query, you will want an index ON table_name (f_index,word).
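If you want to try the query without a MySQL server, here is a small sketch using Python's sqlite3 module as a stand-in (my assumption; the helper name first_word_after and the index name are made up, the table and data come from the question).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (word TEXT, f_index INTEGER)")
conn.execute("CREATE INDEX idx_f_index_word ON table_name (f_index, word)")
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [("a", 5), ("cat", 12), ("bat", 4002)],
)

def first_word_after(conn, desired_index):
    """Return (word, f_index) for the smallest f_index above desired_index, or None."""
    return conn.execute(
        "SELECT word, f_index FROM table_name "
        "WHERE f_index > ? ORDER BY f_index LIMIT 1",
        (desired_index,),
    ).fetchone()

print(first_word_after(conn, 9))
print(first_word_after(conn, 5000))
```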
Why don't you just use a MySQL statement to retrieve the first row whose f_index is greater than the value you pass in?
For example :
select word from table_name
where f_index > desired_index
order by f_index
limit 1
Any SQL to get first numbers not listed in my MySQL database table?
Ex:
Table:
Users
ID | Name | Number
------------------------
1 | John | 1456
2 | Phil | 345
3 | Jenny | 345612
In this case the SQL must return a list of rows with the numbers from 1 to 344, 346 to 1455, and 1457 to 345611.
Any suggestions? Maybe with some procedure?
I like the answer by @pst but would suggest another alternative.
Create a new table of unassigned numbers, insert a few thousand rows or so in there.
Present some of those numbers to the user.
When a number is used, delete it from the unassigned numbers table.
Periodically generate more unassigned numbers as needed.
The generation of those unassigned numbers could use the random method suggested by @pst, but using this method you move the uncertainty of how long it'll take to generate a list of unassigned numbers into a batch task rather than having to do it at the front end while the user is waiting. This probably isn't an issue if the usage of the number space is sparse, but as more of the number space becomes used, it becomes a bigger issue.
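A rough sketch of the "delete it when it is used" step, using Python's sqlite3 module as a stand-in for MySQL. The table name unassigned_numbers and the helper claim are made up for illustration, and the batch job that refills the pool is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE unassigned_numbers (number INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO unassigned_numbers VALUES (?)",
    [(n,) for n in (18, 42, 191, 193)],
)
conn.commit()

def claim(conn, number):
    """Remove number from the pool; True only if it was still unassigned."""
    with conn:  # commits on success, rolls back if an exception is raised
        cursor = conn.execute(
            "DELETE FROM unassigned_numbers WHERE number = ?", (number,)
        )
        return cursor.rowcount == 1

print(claim(conn, 42))  # first caller gets the number
print(claim(conn, 42))  # a second attempt finds it already gone
```

Checking the affected row count is what lets two users who were shown the same number find out which of them actually got it.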
Given the comment(s), my first approach would be use a "random number" probe. This approach assumes:
Number is indexed; and
There are "significantly fewer" users than available numbers
Approach:
Choose N (i.e. 1-10) numbers at random on the client;
Query the database for Number IN (ns..), or Number = n for N=1; then
Treat a number as available when the query returns no record for it.
A size of N=1 is likely "okay" in this case and it is the most trivial to implement although it will require at least 6 database requests to find 6 free numbers. A larger N would decrease the number of trips to the database.
Make sure to use transactions.
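The probe described above could look roughly like this. It is a sketch under assumptions: Python's sqlite3 module stands in for MySQL, the function name find_free_numbers and the upper bound max_number are made up, and the Users table is the one from the question.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (ID INTEGER PRIMARY KEY, Name TEXT, Number INTEGER UNIQUE)")
conn.executemany(
    "INSERT INTO Users VALUES (?, ?, ?)",
    [(1, "John", 1456), (2, "Phil", 345), (3, "Jenny", 345612)],
)

def find_free_numbers(conn, wanted=6, batch=10, max_number=999_999, rng=random):
    """Probe random candidates until `wanted` unused numbers are found."""
    free = []
    while len(free) < wanted:
        candidates = rng.sample(range(1, max_number + 1), batch)
        marks = ",".join("?" * len(candidates))
        taken = {row[0] for row in conn.execute(
            f"SELECT Number FROM Users WHERE Number IN ({marks})", candidates
        )}
        # A candidate is free when no record came back for it.
        free.extend(n for n in candidates if n not in taken and n not in free)
    return free[:wanted]

print(find_free_numbers(conn))
```

The probe only reads. The number can still be taken by someone else before the user picks it, so the final insert needs the transaction (and a unique constraint on Number) to be safe.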
SELECT 'start', 1 AS number FROM tableA
UNION
SELECT 'min', number - 1 number FROM tableA
UNION
SELECT 'max', number + 1 number FROM tableA
ORDER BY number
You can check the answer at http://www.sqlfiddle.com/#!2/851de/6
Then you can make a comparison of missing numbers when you populate the next time.
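If the comparison is done in application code, turning the used numbers into the missing ranges the question asks for takes only a few lines. A sketch in Python; the function name missing_ranges is made up, and it assumes the used numbers fit in memory.

```python
def missing_ranges(used, lowest=1):
    """Return (start, end) pairs covering the numbers below max(used) not in used."""
    ranges = []
    next_free = lowest
    for number in sorted(set(used)):
        if number > next_free:
            ranges.append((next_free, number - 1))
        next_free = max(next_free, number + 1)
    return ranges

# Numbers from the question's Users table.
print(missing_ranges([1456, 345, 345612]))
```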
Just use an auto increment column. The database will assign the next number automatically. You don't need to even know what it is at the time of the insert. Just tell the user the number he got, don't give him a choice at all.
Based on your comments, the approach below might work for you. It doesn't really answer your specific question, but it probably meets your requirements.
I'm going to assume your requirements cannot change (e.g., presenting users with 6 possible id choices). Frankly I think it's a bit of a weird requirement, but it makes for some interesting SQL. :-)
Here's my approach: generate 10 random numbers. Filter out any already in the database. Present 6 of these random numbers to your user. Random id numbers have very nice properties with respect to transactionality compared to sequential id numbers, so this should scale very nicely should your app become popular.
SELECT
temp.i
FROM
(
SELECT 18 AS i -- 10 random
UNION SELECT 42 -- numbers.
UNION SELECT 88
UNION SELECT 191 -- Let's assume
UNION SELECT 192 -- you generated
UNION SELECT 193 -- these in the
UNION SELECT 1000 -- application
UNION SELECT 123456 -- layer.
UNION SELECT 1092930
UNION SELECT 9892919
) temp
LEFT JOIN
mytable ON (temp.i = mytable.i)
WHERE
mytable.i IS NULL -- filter out collisions
LIMIT
6 -- limit results to 6
SQL pop quiz time!!!
Why does the line "WHERE mytable.i IS NULL" filter collisions? (Hint: How can mytable.i be null when it's a primary key?)
Here's some test data:
CREATE TABLE mytable (i BIGINT PRIMARY KEY) ;
INSERT INTO mytable VALUES (88), (3), (192), (123456) ;
Run the query above, and here's the result. Notice that 88, 192, and 123456 were filtered out, since they would be collisions against the test data.
+---------+
| i |
+---------+
| 18 |
| 42 |
| 191 |
| 193 |
| 1000 |
| 1092930 |
+---------+
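If you want to watch the join at work before the filter is applied, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The candidates table is my own substitute for the inline UNION, and I added an ORDER BY so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (i BIGINT PRIMARY KEY)")
conn.executemany("INSERT INTO mytable VALUES (?)", [(88,), (3,), (192,), (123456,)])

conn.execute("CREATE TABLE candidates (i BIGINT)")
conn.executemany(
    "INSERT INTO candidates VALUES (?)",
    [(n,) for n in (18, 42, 88, 191, 192, 193, 1000, 123456, 1092930, 9892919)],
)

# Every candidate appears once; the right-hand column is NULL when no row matched.
joined = conn.execute("""
    SELECT c.i, m.i
    FROM candidates c
    LEFT JOIN mytable m ON c.i = m.i
    ORDER BY c.i
""").fetchall()

free = [candidate for candidate, match in joined if match is None][:6]
print(joined)
print(free)
```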
And how to generate those random numbers? Probably rand() * 9223372036854775807 would work. (Assuming you don't want negative numbers!)
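Since the query above assumes the numbers come from the application layer anyway, they can be drawn there as integers. A sketch in Python; the function name random_candidates is made up. One reason to prefer this over multiplying a floating-point rand() is that a double carries only 53 bits of precision, so it cannot reach every 63-bit value.

```python
import random

MAX_BIGINT = 9223372036854775807  # largest signed 64-bit value, as above

def random_candidates(count=10, rng=random):
    """Return `count` distinct positive integers that fit in a signed BIGINT."""
    chosen = set()
    while len(chosen) < count:
        chosen.add(rng.randrange(1, MAX_BIGINT + 1))
    return sorted(chosen)

print(random_candidates())
```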