I need to mask an integer field in MySQL such that 9999911111 becomes 9900001111. I want to keep the first 2 digits and the last 4 digits, and set the remaining digits to 0, for the integers stored in the field.
I have written a query and it works, but I am not sure whether this is the right way to do it for integers.
update table_name
set field_name=CONCAT(SUBSTR(field_name, 1, 2),
REPEAT('0', CHAR_LENGTH(field_name) - 6),
SUBSTR(field_name, CHAR_LENGTH(field_name)-3, CHAR_LENGTH(field_name)));
Just trying a different approach.
SET @myVar = 344553543534;
SELECT @myVar - (SUBSTRING(@myVar, 4, LENGTH(@myVar) - 7) * 10000);
The formula above gives 344000003534 as the result. I tried it with different values and it worked.
So your query needs to change as shown below:
UPDATE table_name
SET field_name=
(field_name - (SUBSTRING(field_name, 4, LENGTH(field_name) - 7) * 10000));
Explanation :
Consider Number, a = 344553543534;
Expected Result, b = 344000003534;
c = (a - b) = 344553543534 - 344000003534 = 553540000;
Looking at c, 55354 is the run of digits that needs masking, and the trailing 0000 stands for the last 4 digits, which are left untouched.
So to get the masked value, we can use the formula b = a - c.
To get c, I used SUBSTRING(a, 4, LENGTH(a) - 7) * 10000.
EDIT: To keep only the first two digits, use 3 instead of 4 and 6 instead of 7. I had assumed that you needed to keep the first 3.
SET @myVar = 344553543534;
SELECT @myVar - (SUBSTRING(@myVar, 3, LENGTH(@myVar) - 6) * 10000);
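The subtraction trick above can be checked outside MySQL. Here is a minimal Python sketch, assuming non-negative integers; the function name mask_middle and its parameters are hypothetical and not part of the original answer:

```python
def mask_middle(n: int, keep_first: int = 2, keep_last: int = 4) -> int:
    """Zero every digit except the first keep_first and the last keep_last."""
    digits = str(n)
    middle_len = len(digits) - keep_first - keep_last
    if middle_len <= 0:
        return n  # number is too short, nothing to mask
    # Mirrors SUBSTRING(a, 3, LENGTH(a) - 6) * 10000 from the SQL above:
    # take the middle digits, shift them back into place, subtract.
    middle = int(digits[keep_first:keep_first + middle_len])
    return n - middle * 10 ** keep_last


print(mask_middle(9999911111))                   # keeps first 2 and last 4
print(mask_middle(344553543534, keep_first=3))   # keeps first 3 and last 4
```

With keep_first=3 this corresponds to the first version of the formula (4 and 7), and with the default keep_first=2 to the edited one (3 and 6).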
This is my SQL query. In flag (00000) every bit position has a different meaning, e.g. the 4th bit position is changed to 1 when the user is inactive. Here flag is a VARCHAR (string).
$sql="select flag from user where id =1"
I got
flag=10001 #it may be flag="00001" or flag="00101"
I want to update the 2nd bit of this flag to 1.
$sql="update user set flag='-1---' where id=1" #it may be flag='11001' or flag='01001' or flag='01110'
Actually, I want to update the 2nd bit of this flag to 1, but without writing out the whole value like flag='11001'. I want to do something like this:
$sql="update user set flag='--change(flag,2bit,to1)--' where id =1" #this is wrong
What can I do, using only one SQL query? Is it possible?
update user
set flag = lpad(conv((conv(flag, 2, 10) | 1 << 3), 10, 2), 5, '0')
where id = 1
conv(flag, 2, 10) converts the flag string from binary to decimal.
1 << 3 shifts a 1 bit 3 binary places to the left
| performs a binary OR of this, to set that bit. This arithmetic operation will automatically coerce the decimal string to a number; you can use an explicit CAST if you prefer.
conv(..., 10, 2) will convert the decimal string back to a binary string
lpad(..., 5, '0') adds leading zeroes to make the string 5 characters long
To set the bit to 0, you use:
set flag = lpad(conv((conv(flag, 2, 10) & ~(1 << 3)), 10, 2), 5, '0')
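To see what the two statements above do to the string, here is a minimal Python sketch of the same convert, OR/AND, convert back, pad sequence. The name set_flag_bit and the 1-based pos_from_left parameter are assumptions made for illustration:

```python
def set_flag_bit(flag: str, pos_from_left: int, on: bool = True) -> str:
    """Set or clear one bit of a binary string such as '10001', keeping its width."""
    width = len(flag)
    if not 1 <= pos_from_left <= width:
        raise ValueError("bit position outside the flag")
    shift = width - pos_from_left        # position 2 of 5 gives 3, as in 1 << 3
    value = int(flag, 2)                 # conv(flag, 2, 10)
    if on:
        value |= 1 << shift              # binary OR sets the bit
    else:
        value &= ~(1 << shift)           # AND with the inverted mask clears it
    return format(value, "b").zfill(width)   # conv(..., 10, 2) plus lpad


print(set_flag_bit("10001", 2))             # set the 2nd bit
print(set_flag_bit("11001", 2, on=False))   # clear it again
```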
You want to use the bitwise OR operator |
update user set flag = flag | (1 << 1) where id =1
If flag was 101, flag will now be 111.
If flag was 000, flag will now be 010.
1 << 1 shifts 1 up one bit, making it 10 (binary 2).
Edit: not tested, but use
update user set flag = cast(cast(flag AS SIGNED) | (1 << 1) AS CHAR) where id =1
If you are going to use a VARCHAR, you are better off using string manipulation functions: http://dev.mysql.com/doc/refman/5.0/en/string-functions.html
UPDATE user
SET flag = CONCAT(LEFT(flag, 1), '1', RIGHT(flag, 3))
WHERE id = 1
However, you probably want to convert this field to an INT so that you can use the bit functions: http://dev.mysql.com/doc/refman/5.0/en/bit-functions.html
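The LEFT/RIGHT concatenation above generalises to any position. A minimal Python sketch of that string-only approach, with a hypothetical helper name and a 1-based position counted from the left:

```python
def set_flag_char(flag: str, pos_from_left: int, bit: str = "1") -> str:
    """Replace the character at a 1-based position, leaving the rest untouched."""
    if not 1 <= pos_from_left <= len(flag):
        raise ValueError("position outside the flag")
    # Equivalent to CONCAT(LEFT(flag, pos - 1), bit, RIGHT(flag, length - pos)).
    return flag[:pos_from_left - 1] + bit + flag[pos_from_left:]


print(set_flag_char("10001", 2))
```

For position 2 of a 5-character flag this is LEFT(flag, 1), '1', RIGHT(flag, 3), as in the UPDATE above.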
Well, I have a table Data with millions of rows. I want to compute the correlation for every row, using all rows from the 1st up to the current row minus 1. E.g. the 1st row is omitted. The 2nd row's result column is filled with the correlation computed from the 1st row. The 3rd row's result column is filled with the correlation computed from the 1st and 2nd rows. And so on.
Correlation for the entire table can be calculated using:
SELECT (Count(*)*Sum(x*y)-Sum(x)*Sum(y))/
(sqrt(Count(*)*Sum(x*x)-Sum(x)*Sum(x))*
sqrt(Count(*)*Sum(y*y)-Sum(y)*Sum(y))) AS TotalCorelation FROM Data;
I want to avoid joins as much as possible, as they take a lot of time (sometimes even a timeout error, above 300 seconds). What's the alternative?
Example table Data Structure:
id, x, y, result
1 , 4, 2, null
2 , 6, 3, -0.2312
3 , 5, 5, 0.42312
4 , 6, 2, -0.5231
5 , 5, 5, 0.22312
6 , 3, 7, -0.2312
7 , 2, 9, 0.42231
8 , 7, 2, 0.32253
9 , 9, 5, 0.32431
id : primary key
x and y : The data
result: correlation
I think this is it:
SELECT d2.ID, d2.x, d2.y, d2.result,
(Count(*)*Sum(d1.x*d1.y)-Sum(d1.x)*Sum(d1.y))/
(sqrt(Count(*)*Sum(d1.x*d1.x)-Sum(d1.x)*Sum(d1.x))*
sqrt(Count(*)*Sum(d1.y*d1.y)-Sum(d1.y)*Sum(d1.y))) AS TotalCorelation
FROM Data d1
RIGHT JOIN Data d2 ON d1.id < d2.id
GROUP BY d2.ID
ORDER BY d2.ID
Without a closed form for calculating correlation of N+1 from N rows, you have to use a quadratic join like this.
I'm assuming that your basic formula is correct. But I'm not sure it is -- when I just run it on the total dataset, I don't get the result 0.32431, I get -0.552773693079.
Here's a linear implementation:
SET @SumX = 0;
SET @SumY = 0;
SET @Count = 0;
SET @SumX2 = 0;
SET @SumY2 = 0;
SET @SumXY = 0;
SELECT id, x, y,
@SumX := @SumX + x AS SumX,
@SumY := @SumY + y AS SumY,
@Count := @Count + 1 AS ct,
@SumX2 := @SumX2 + x*x AS SumX2,
@SumY2 := @SumY2 + y*y AS SumY2,
@SumXY := @SumXY + x*y AS SumXY,
IF(@Count > 1,
(@Count*@SumXY-@SumX*@SumY)/
(sqrt(@Count*@SumX2-@SumX*@SumX)*
sqrt(@Count*@SumY2-@SumY*@SumY)), NULL) AS TotalCorelation
FROM DATA
ORDER BY id
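For comparison, the same running-sums idea can be sketched outside the database. This is a minimal Python sketch, assuming the rows arrive ordered by id as (x, y) pairs; running_correlation is a hypothetical name, and the zero-variance guard is an addition that the SQL above does not have:

```python
from math import sqrt


def running_correlation(points):
    """For each row, Pearson's r over all rows up to and including that row."""
    n = sum_x = sum_y = sum_x2 = sum_y2 = sum_xy = 0
    results = []
    for x, y in points:
        # Update the six running totals, one pass over the data.
        n += 1
        sum_x += x
        sum_y += y
        sum_x2 += x * x
        sum_y2 += y * y
        sum_xy += x * y
        var_x = n * sum_x2 - sum_x * sum_x
        var_y = n * sum_y2 - sum_y * sum_y
        if n > 1 and var_x > 0 and var_y > 0:
            results.append((n * sum_xy - sum_x * sum_y) / (sqrt(var_x) * sqrt(var_y)))
        else:
            results.append(None)  # undefined for one row or constant x / y
    return results


rows = [(4, 2), (6, 3), (5, 5), (6, 2), (5, 5), (3, 7), (2, 9), (7, 2), (9, 5)]
print(running_correlation(rows)[-1])
```

Like the query above, each value includes the current row. To get the correlation over the preceding rows only, as the question asks, shift the results down by one row.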
Okay, I'm having some difficulties with order by. Here is the problem I need to solve:
In the database I have stored every tile of a map that is 101 x 101 tiles big. The table has 3 columns (ID, x, y), and now I need to select all the tiles within some radius. For example, I used this query:
SELECT *
FROM tile
WHERE ((x >= -3 AND x <= 3)
AND (y >= -3 AND y <= 3))
ORDER BY x ASC, y DESC;
This query selects all tiles in radius of 3 of the given coordinate (0|0) for now.
But, it doesn't sort them the way I want it to. Basically, the output must be like this.
But this is the closest I got.
http://prntscr.com/zqjd7
Edit:
Disregard the duplicate values; I had duplicate inputs for each coordinate and hadn't noticed.
It seems that your problem is with the ASC / DESC modifiers.
But since we're here, wouldn't you prefer to use a distance formula? Something like
SELECT x, y FROM tile WHERE
(
POW(x-@var1, 2) + POW(y-@var2, 2) <= POW(3, 2)
)
ORDER BY x DESC, y ASC;
Here, given a point P (m,n), the distance to a fixed point Q (x,y) is D(P,Q) = SQRT( (x-m)² + (y-n)² ). Since it has to be less than or equal to your desired radius (= 3), we get SQRT( (x-m)² + (y-n)² ) <= 3, or better, (x-m)² + (y-n)² <= 3², after squaring both sides.
In SQL, we write POW(x-m, 2) + POW(y-n, 2) <= POW(3, 2), meaning that the distance between (x,y) and (m,n) is less than or equal to 3.
As for @var1 and @var2, they are where you enter your input values. More specifically, they are session variables, but you don't really need them to perform a select; just substitute any number you want, e.g. you can choose the origin (0,0) by putting 0 in place of @var1 and @var2.
[Update]
Well... It's always a good idea to test your code before answering. In fact, I should have suggested ordering by y first, since what we care about first is ordering the rows as displayed on screen. The following code was (finally) tested (on a test DB); my last suggestion is to create the following index (index_y_x):
USE `test` ;
CREATE TABLE IF NOT EXISTS `test`.`tile` (
`id` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT ,
`x` INT(11) NULL DEFAULT 0 ,
`y` INT(11) NULL DEFAULT 0 ,
PRIMARY KEY (`id`) ,
INDEX `index_y_x` (`y` DESC, `x` ASC) )
ENGINE = InnoDB
DEFAULT CHARACTER SET = utf8;
INSERT tile (x,y) VALUES
(-2,-2),(-2, -1),(-2, 0),(-2, 1),(-2, 2),
(-1,-2),(-1, -1),(-1, 0),(-1, 1),(-1, 2),
(0,-2), (0, -1), (0, 0), (0, 1), (0, 2),
(1,-2), (1, -1), (1, 0), (1, 1), (1, 2),
(2,-2), (2, -1), (2, 0), (2, 1), (2, 2);
SELECT x, y FROM tile
WHERE POW(x-3, 2) + POW(y-3, 2) <= POW(3, 2)
ORDER BY y DESC, x ASC;
This returns items near the point (3,3), in a range of 3 units
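The filter and ordering can be checked against the same 5 x 5 sample data. A minimal Python sketch, where tiles_within is a hypothetical helper and the grid mirrors the INSERT above:

```python
def tiles_within(tiles, cx, cy, radius):
    """Tiles whose squared distance to (cx, cy) is at most radius squared."""
    hits = [(x, y) for x, y in tiles
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2]
    # Same ordering as ORDER BY y DESC, x ASC: top row first, left to right.
    return sorted(hits, key=lambda t: (-t[1], t[0]))


# Same 25 tiles as the sample INSERT: x and y from -2 to 2.
grid = [(x, y) for x in range(-2, 3) for y in range(-2, 3)]
print(tiles_within(grid, 3, 3, 3))
```

Comparing squared distances avoids the square root, which is the same simplification the POW(...) <= POW(3, 2) condition makes.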
Something like
SELECT COUNT(*) AS c FROM BANS WHERE typeid=6 AND (SELECT ipaddr,cidr FROM BANS) MATCH AGAINST 'this_ip';
The idea is to avoid fetching all records from the DB first and then matching them one by one.
If c > 0, then the IP matched a ban.
BANS table:
id int auto incr PK
typeid TINYINT (1=hostname, 4=ipv4, 6=ipv6)
ipaddr BINARY(128)
cidr INT
host VARCHAR(255)
DB: MySQL 5
IP and IPv type (4 or 6) is known when querying.
IP is for example ::1 in binary format
BANNED IP is for example ::1/64
Remember that IPs are not a textual address, but a numeric ID. I have a similar situation (we're doing geo-ip lookups), and if you store all your IP addresses as integers (for example, my IP address is 192.115.22.33 so it is stored as 3228767777), then you can lookup IPs easily by using right shift operators.
The downside of all these types of lookups is that you can't benefit from indexes and you have to do a full table scan whenever you do a lookup. The above scheme can be improved by storing both the network IP address of the CIDR network (the beginning of the range) and the broadcast address (the end of the range), so for example to store 192.168.1.0/24 you can store two columns:
network broadcast
3232235776, 3232236031
And then, to match an address, you simply do
SELECT count(*) FROM bans WHERE 3232235876 >= network AND 3232235876 <= broadcast
This would let you store CIDR networks in the database and match them against IP addresses quickly and efficiently by taking advantage of quick numeric indexes.
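The integer values quoted above can be reproduced with Python's standard ipaddress module. This is only a sketch of the range scheme, with hypothetical helper names; in practice the comparison would run in the database as in the SELECT above:

```python
import ipaddress


def ip_to_int(addr: str) -> int:
    """Dotted-quad IPv4 address as an unsigned 32-bit integer."""
    return int(ipaddress.IPv4Address(addr))


def cidr_bounds(cidr: str):
    """(network, broadcast) integers for a CIDR block, i.e. the two stored columns."""
    net = ipaddress.ip_network(cidr, strict=True)
    return int(net.network_address), int(net.broadcast_address)


def is_banned(addr: str, bans) -> bool:
    """True if the address falls inside any (network, broadcast) range."""
    ip = ip_to_int(addr)
    return any(network <= ip <= broadcast for network, broadcast in bans)


bans = [cidr_bounds("192.168.1.0/24")]
print(ip_to_int("192.115.22.33"), bans, is_banned("192.168.1.100", bans))
```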
Note from discussion below:
MySQL 5.0 includes a ranged query optimization called "index merge intersect" which can speed up such queries (and avoid full table scans), as long as:
There is a multi-column index that matches exactly the columns in the query, in order. So - for the above query example, the index would need to be (network, broadcast).
All the data can be retrieved from the index. This is true for COUNT(*), but is not true for SELECT * ... LIMIT 1.
MySQL 5.6 includes an optimization called MRR which would also speed up full row retrieval, but that is out of scope of this answer.
For IPv4, you can use:
SET @length = 4;
SELECT INET_NTOA(ipaddr), INET_NTOA(searchaddr), INET_NTOA(mask)
FROM (
SELECT
(1 << (@length * 8)) - 1 & ~((1 << (@length * 8 - cidr)) - 1) AS mask,
CAST(CONV(SUBSTR(HEX(ipaddr), 1, @length * 2), 16, 10) AS DECIMAL(20)) AS ipaddr,
CAST(CONV(SUBSTR(HEX(@myaddr), 1, @length * 2), 16, 10) AS DECIMAL(20)) AS searchaddr
FROM ip
) ipo
WHERE ipaddr & mask = searchaddr & mask
IPv4 addresses, network addresses and netmasks are all UINT32 numbers and are presented in human-readable form as "dotted-quads". The routing table code in the kernel performs a very fast bit-wise AND comparison when checking if an address is in a given network space (network/netmask). The trick here is to store the dotted-quad IP addresses, network addresses and netmasks in your tables as UINT32, and then perform the same 32-bit bit-wise AND for your matching. eg
SET @test_addr = inet_aton('1.2.3.4');
SET @network_one = inet_aton('1.2.3.0');
SET @network_two = inet_aton('4.5.6.0');
SET @network_netmask = inet_aton('255.255.255.0');
SELECT (@test_addr & @network_netmask) = @network_one AS IS_MATCHED;
+------------+
| IS_MATCHED |
+------------+
| 1 |
+------------+
SELECT (@test_addr & @network_netmask) = @network_two AS IS_NOT_MATCHED;
+----------------+
| IS_NOT_MATCHED |
+----------------+
| 0 |
+----------------+
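The same two checks can be replayed outside MySQL. A minimal Python sketch of the bit-wise AND match, using the addresses from the session above; in_network is a hypothetical helper and the standard ipaddress module stands in for inet_aton:

```python
import ipaddress


def to_int(addr: str) -> int:
    """Rough equivalent of inet_aton for a dotted-quad IPv4 address."""
    return int(ipaddress.IPv4Address(addr))


def in_network(addr: str, network: str, netmask: str) -> bool:
    """True if addr ANDed with the netmask equals the network address."""
    return (to_int(addr) & to_int(netmask)) == to_int(network)


print(in_network("1.2.3.4", "1.2.3.0", "255.255.255.0"))   # first query
print(in_network("1.2.3.4", "4.5.6.0", "255.255.255.0"))   # second query
```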
Generating IP Address Ranges as Integers
If your database doesn't support fancy bitwise operations, you can use a simplified integer based approach.
The following example is using PostgreSQL:
select (cast(split_part(split_part('4.0.0.0/8', '/', 1), '.', 1) as bigint) * (256 * 256 * 256) +
cast(split_part(split_part('4.0.0.0/8', '/', 1), '.', 2) as bigint) * (256 * 256 ) +
cast(split_part(split_part('4.0.0.0/8', '/', 1), '.', 3) as bigint) * (256 ) +
cast(split_part(split_part('4.0.0.0/8', '/', 1), '.', 4) as bigint))
as network,
(cast(split_part(split_part('4.0.0.0/8', '/', 1), '.', 1) as bigint) * (256 * 256 * 256) +
cast(split_part(split_part('4.0.0.0/8', '/', 1), '.', 2) as bigint) * (256 * 256 ) +
cast(split_part(split_part('4.0.0.0/8', '/', 1), '.', 3) as bigint) * (256 ) +
cast(split_part(split_part('4.0.0.0/8', '/', 1), '.', 4) as bigint)) + cast(
pow(256, (32 - cast(split_part('4.0.0.0/8', '/', 2) as bigint)) / 8) - 1 as bigint
) as broadcast;
Hmmm. You could build a table of the cidr masks, join it, and then compare the ip anded (& in MySQL) with the mask with the ban block ipaddress. Would that do what you want?
If you don't want to build a mask table, you could compute the mask as -1 << (x-cidr) with x = 64 or 32 depending.
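The -1 << (x-cidr) expression can be tried out as follows. This is a minimal Python sketch, not MySQL: Python integers are unbounded, so the result has to be truncated to the address width explicitly. The names prefix_mask and same_block are hypothetical:

```python
def prefix_mask(cidr: int, width: int = 32) -> int:
    """Mask with the top `cidr` bits set, for an address of `width` bits."""
    if not 0 <= cidr <= width:
        raise ValueError("prefix length outside the address width")
    all_ones = (1 << width) - 1
    # -1 is all ones; shifting left clears the host bits, the AND trims to width.
    return (-1 << (width - cidr)) & all_ones


def same_block(ip: int, ban_ip: int, cidr: int, width: int = 32) -> bool:
    """Compare both addresses after ANDing them with the prefix mask."""
    mask = prefix_mask(cidr, width)
    return (ip & mask) == (ban_ip & mask)


print(hex(prefix_mask(24)))
print(same_block(3232235876, 3232235776, 24))   # 192.168.1.100 vs 192.168.1.0/24
```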