I have a table with a STATUS column of VARCHAR2(25) and a STATE column of VARCHAR2(2), along with a few more columns.
When filtering records from the table, I use both the STATUS and the STATE column in my query.
SELECT * FROM TAB WHERE STATUS = 'Active' AND STATE = 'WA';
Since the STATUS and STATE columns are of VARCHAR2 datatype, I would like to introduce two new columns, STATUS_ID and STATE_ID, with the NUMBER datatype. The STATUS and STATE values would be substituted with numeric values in STATUS_ID and STATE_ID, so that I can use the NUMBER columns instead of the VARCHAR2 columns in the WHERE clause.
SELECT * FROM TAB WHERE STATUS_ID = 1 AND STATE_ID = 2;
I'm comparing NUMBER against NUMBER and VARCHAR2 against VARCHAR2 only; there is no implicit or explicit datatype conversion in the query.
Will there be a performance improvement from having a NUMBER datatype instead of VARCHAR2 in the WHERE clause in Oracle Database?
In other words, is it true that the NUMBER datatype performs better than VARCHAR2 in a WHERE clause?
Thanks.
Performance will be the same, since Oracle stores numbers as packed strings (unlike some other databases). The only things you should consider before choosing a format are the operations you are going to perform on the value and the amount of computing power that will be needed to convert or look up values.
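If you want to see this for yourself, Oracle's DUMP function shows the datatype code and the raw bytes actually stored for a value. A quick sketch (the byte values in the comments are what you would typically see, not guaranteed on every configuration):

SELECT DUMP(1234)   AS number_storage,    -- e.g. Typ=2 Len=3: 194,13,35 (NUMBER, packed decimal)
       DUMP('1234') AS varchar2_storage   -- e.g. Typ=96 Len=4: 49,50,51,52 (one byte per character)
FROM dual;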
Related
Recently I discovered a performance issue in the following use case.
Initially I had a table "MyTable" with an indexed INT column "MyCode".
Later I needed to change the table structure, converting the "MyCode" column to VARCHAR (the index on the column was preserved):
ALTER TABLE MyTable CHANGE MyCode MyCode VARCHAR(250) DEFAULT NULL
Then I experienced unexpected latency; queries were being performed like:
SELECT * FROM MyTable where MyCode = 1234
This query completely ignored the index on the VARCHAR MyCode column; my impression was that it was full-scanning the table.
Converting the query to
SELECT * FROM MyTable where MyCode = "1234"
Performance returned to optimal, leveraging the VARCHAR index.
So the question is: how can this be explained, and how does MySQL actually treat indexing here? Or is there some DB setting to be changed to avoid this?
int_col = 1234 -- no problem; same type
char_col = "1234" -- no problem; same type
int_col = "1234" -- string is converted to number, then no problem
char_col = 1234 -- converting all the strings to numbers -- tedious
In the 4th case, the index is useless, so the Optimizer looks for some other way to perform the query. This is likely to lead to a "full table scan".
The main exception involves a "covering index", which is only slightly faster -- involving a "full index scan".
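A quick way to verify this on the table from the question is to compare the two execution plans with EXPLAIN (output abbreviated in the comments; this assumes the index on MyCode is still in place):

EXPLAIN SELECT * FROM MyTable WHERE MyCode = 1234;
-- type: ALL, key: NULL            (full table scan; the index cannot be used)

EXPLAIN SELECT * FROM MyTable WHERE MyCode = '1234';
-- type: ref, key: <MyCode index>  (index lookup; same type, no conversion needed)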
I accepted Rick James's answer because it gets to the point.
But I'd like to add more information after doing some testing.
The case in the question is: how does MySQL actually compare two values when the filtered column is of VARCHAR type and the value provided to filter by is not a string?
In this case you lose the opportunity to leverage the index on the VARCHAR column, with a dramatic loss of performance in a query that is supposed to be immediate and simple.
The explanation is that, when given a value whose type differs from VARCHAR, MySQL will perform a full table scan and, for every record, perform a CAST(varcharcol AS providedvaluetype) on the field and compare the result with the provided value.
E.g.
having a VARCHAR column named "code" and filtering
SELECT * FROM table WHERE code=1234
will full-scan every record, just like doing
SELECT * FROM table WHERE CAST(code as UNSIGNED)=1234
Notice that if you test it against 0
SELECT * FROM table WHERE CAST(code as UNSIGNED)=0
you'll get back ALL records whose string has no unsigned meaning to MySQL's CAST function, because CAST maps all of them to 0.
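For instance (hypothetical values; this is standard MySQL CAST behavior):

SELECT CAST('abc' AS UNSIGNED);   -- 0: no leading digits, so the string casts to 0
SELECT CAST('12ab' AS UNSIGNED);  -- 12: leading digits are used, the rest is ignored
SELECT CAST('' AS UNSIGNED);      -- 0: the empty string also casts to 0

So comparing against 0 matches 'abc', '' and every other value without a leading number.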
I have a large table and I have to sort the records by tag_id. Currently this column is VARCHAR, but all its values are numeric.
Will there be any performance improvement if I convert the column datatype from VARCHAR to INTEGER?
Below is the query I have to write:
SELECT * FROM xxx table ORDER BY tag_id;
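Worth noting independently of performance: a VARCHAR column sorts lexicographically, not numerically, so the conversion can also change the result order. A sketch with hypothetical values (xxx_table stands in for the placeholder table name above):

-- As VARCHAR, ORDER BY tag_id returns: '1', '10', '2', '25', '3'
-- As INTEGER (or with an explicit cast), it returns: 1, 2, 3, 10, 25
SELECT * FROM xxx_table ORDER BY CAST(tag_id AS UNSIGNED);  -- numeric order without altering the column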
I am developing an application which uses external datasources. My application supports multiple databases (MySQL, MS SQL, Teradata, Oracle, DB2, etc.). When I create a datasource, I allow the user to assign a primary key (PK) to it. I do not check whether the user-selected column is actually a primary key in the underlying database; I just want records that have a null/blank value in the user-selected primary key to be dropped while retrieving data. I have created a filter supporting all the databases except DB2 and Teradata.
Sample Query for other databases:
Select * from MY_TABLE where PK_COLUMN IS NOT NULL and PK_COLUMN !='';
Select * from MY_TABLE where PK_COLUMN IS NOT NULL AND cast(PK_COLUMN as varchar) !=''
DB2 and Teradata:
The PK_COLUMN !='' and cast(PK_COLUMN as varchar) !='' conditions give an error for the INT datatype in DB2 and Teradata because:
- a column of INT type cannot be given the above-mentioned conditions, and we also cannot cast INT columns to VARCHAR directly in DB2 and Teradata.
I want to create a query that drops null/blank values, given the table name and the user's PK column name as strings. (I do not know the PK column's type while creating the query, so the query should be uniform enough to support all datatypes.)
NOTE: The PK here is not an actual PK; it is just a dummy PK assigned by my application's user. So it can be a normal column and thus can have null/blank values.
I have created the following query:
Select * from MY_TABLE where PK_COLUMN IS NOT NULL AND cast(cast(PK_COLUMN as char) as varchar) !=''
My Question:
Will this solution (double casting) support all datatypes in DB2 and Teradata?
If not, can I come up with a better solution?
Of course you can cast an INT to a VARCHAR in both Teradata and DB2, but you have to specify the length of the VARCHAR; there's no default length.
Casting to a CHAR without a length defaults to CHAR(1) in Standard SQL, which might cause some "string truncation" error.
You need to cast to a VARCHAR(n) where n is the maximum length based on the DBMS.
Plus, there's no != operator in standard SQL; this should be <> instead.
Finally, there's a fundamental difference between an empty string and a NULL (except in Oracle); one or more blanks might also have a meaning and will be filtered out when compared to ''.
And what is an empty INT supposed to be? Even if your query worked, zero would be cast to '0', which is not equal to '', so the check would never treat it as empty anyway.
You should simply use IS NOT NULL, and add a check for an empty string only for character columns (and add an option letting the user decide whether an empty string counts as NULL).
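A minimal sketch of that approach, with the branch chosen by your application once it knows the column's type (names from the question; TRIM is optional, depending on whether all-blank values should count as empty):

-- Non-character columns: the NULL check is all that is needed.
SELECT * FROM MY_TABLE WHERE PK_COLUMN IS NOT NULL;

-- Character columns: additionally drop empty (or all-blank) strings.
SELECT * FROM MY_TABLE
WHERE PK_COLUMN IS NOT NULL
  AND TRIM(PK_COLUMN) <> '';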
In a MySQL database I have the following table:
id int Primary key
timestamp timestamp
tpid varchar
tpidno int
serialnumber int
command varchar
sequence int
startTime varchar
endTime varchar
PosData varchar
...
I also have 3 secondary indices:
tpid,tpidno
serialnumber
command
The table contains ~2.5M rows and it is about 500MB
Although I have complex queries that run fast, I see long delays on these two simple queries:
Select id, sequence, PosData
From myTable
Where serialNumber = 130541
and command = "myCommand"
and startTime = "20140106194300"
and endtime = "20140106200000"
(~4.4sec)
Select id
From myTable
Where serialNumber = 130541
and command = 'myCommand'
and sequence = 128
(~4.5sec)
Would additional indexes like
serialnumber, command
command, sequence
or
serialnumber, command, sequence
speed up the queries?
For the first query, is it possible that the datatype of startTime and endTime is the problem? Would it be better if they were INT instead of VARCHAR?
Any other suggestions?
A single index on (serialnumber, command) will definitely improve performance for those two queries. Furthermore, you could add the other columns to make it even faster. However, the best choice of the other columns depends on the data distribution and on which of the two statements is executed more often. It might not even be worth adding those columns if the first two are very selective.
The datatype for startTime and endTime is unfortunate, to say the least. Proper types will improve performance anywhere from "a little" to "a lot", depending on your SQL. The SQL above will be in the "a little" range.
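As a concrete sketch of the suggested index (names from the question; whether to append the extra columns depends on the considerations above):

ALTER TABLE myTable ADD INDEX idx_serial_cmd (serialnumber, command);

-- Possible extended variants, one per query:
-- ALTER TABLE myTable ADD INDEX idx_serial_cmd_seq (serialnumber, command, sequence);
-- ALTER TABLE myTable ADD INDEX idx_serial_cmd_times (serialnumber, command, startTime, endTime);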
Some refs:
How multi-column indexes work
Possible problems when using improper types (ex: varchar instead of numeric types)
I have 2 columns in my table: a VARCHAR(8) and an INT.
I want to auto-increment the INT column, and when I do, I want to copy the value into the VARCHAR(8) column, padded with 0's until it is 8 characters long. For example, if the INT column was incremented to 3, the VARCHAR(8) column would contain '00000003'.
My two questions are: what happens when the VARCHAR(8) column gets to '99999999'? I don't want to have duplicates.
How would I do this in MySQL?
If my values can be between 00000000 and 99999999, how many values can I have before I run out?
This is my alternative to just creating a random 8-character string and checking MySQL for duplicates. I thought this was a better approach and would allow for a greater number of values.
Because your formatted column depends upon, and is derivable from, the id column, your table design violates 3NF.
Either create a view that has your derived column in it (see this in sqlfiddle):
CREATE VIEW myview AS
SELECT *,
       -- 100000000 + id always has 9 digits; dropping the first one
       -- leaves id left-padded with zeros to 8 characters
       substring(cast(100000000 + id AS CHAR(9)), 2) AS formatted_id
FROM mytable
or just start your auto-increment at 10000000, then it will always be 8 digits long:
ALTER TABLE mytable AUTO_INCREMENT = 10000000;
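Incidentally, the derived column in the view can also be written with MySQL's LPAD, which pads to a fixed width directly (an equivalent alternative to the substring/cast trick above, not part of the original answer):

SELECT *, LPAD(id, 8, '0') AS formatted_id
FROM mytable;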
Simple: if the column is unique, it will throw an error telling you that the value already exists. But if it is not unique, after 99999999 you'll get an error message that the value is truncated.
As alternatives, why not use INT AUTO_INCREMENT, or a custom ID combining date/time, e.g.
YYMMDD-00000
This allows a maximum of 99999 records per day; the counter resets the next day.
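A rough sketch of composing such an ID in MySQL; the daily counter (42 here) is a hypothetical value your application would have to track and reset each day:

SELECT CONCAT(DATE_FORMAT(CURDATE(), '%y%m%d'), '-', LPAD(42, 5, '0')) AS custom_id;
-- e.g. '240115-00042' in the YYMMDD-00000 format above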