Can't fetch a geolocation result from an imported MySQL database

I've downloaded Software77's free GeoIP country DB and imported it using Sequel Pro; the resulting table appears to have the same number of rows as the .csv.
The following queries, however, return 0 results:
SELECT * FROM s77_country WHERE ip_num_start = "3000762368";
SELECT country_code FROM s77_country WHERE 3000831958 BETWEEN ip_num_start AND ip_num_end
In the .csv, line 49046:
"3000762368","3001024511","ripencc","1269993600","RS","SRB","Serbia"
That looks to be in the range, so the result should have been "RS".
Here's the table setup:
CREATE TABLE `s77_country` (
`ip_num_start` int(11) DEFAULT NULL,
`ip_num_end` int(11) DEFAULT NULL,
`registry` varchar(255) DEFAULT NULL,
`assigned` bigint(11) DEFAULT NULL,
`country_code` varchar(255) DEFAULT NULL,
`country_code_long` varchar(255) DEFAULT NULL,
`country_name` varchar(255) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
What am I missing here?

You need to use BIGINT or INT UNSIGNED for your ip_num columns. IP numbers go above 2 billion, and a signed INT is capped below that.
Your theoretical maximum ip_num is 255*256^3 + 255*256^2 + 255*256 + 255 = 4294967295, roughly double what a signed INT can store:
http://dev.mysql.com/doc/refman/5.0/en/integer-types.html
Edit: INT UNSIGNED fits exactly that value.

Actually, the error happened one phase earlier.
The value 3000831958 doesn't fit into the range of a signed INT. Use BIGINT(11) or INT(11) UNSIGNED for both ip_num_start and ip_num_end.
The record you quoted was never inserted into the table (or was clipped to the INT maximum, depending on SQL mode) because of the range violation.
Your SELECT query is fine.
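To illustrate the range problem: the lookup value 3000831958 corresponds to the dotted address 178.221.15.214 (my own back-calculation, not a value from the question). A quick Python sketch:

```python
import socket
import struct

def ip_to_int(ip: str) -> int:
    """Convert a dotted-quad IPv4 address to its integer form,
    as used by the ip_num_start/ip_num_end columns."""
    return struct.unpack("!I", socket.inet_aton(ip))[0]

SIGNED_INT_MAX = 2**31 - 1    # 2147483647, MySQL signed INT ceiling
UNSIGNED_INT_MAX = 2**32 - 1  # 4294967295, fits INT UNSIGNED exactly

ip_num = ip_to_int("178.221.15.214")
print(ip_num)                   # 3000762368..3001024511 range: 3000831958
print(ip_num > SIGNED_INT_MAX)  # True: too big for signed INT
print(UNSIGNED_INT_MAX)         # 4294967295
```

Any address above 127.255.255.255 produces an ip_num beyond the signed INT range, which is why roughly half the dataset silently fails to load correctly.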


How to optimize a large mysql table (stock data)

I have a table with the following structure:
CREATE TABLE `trading_daily_price` (
`id` int(11) NOT NULL PRIMARY KEY AUTO_INCREMENT,
`date` date DEFAULT NULL,
`Symbol` varchar(20) DEFAULT NULL,
`Market` varchar(12) DEFAULT NULL,
`QuoteName` text,
`Price` float DEFAULT NULL,
`PriceChange` float DEFAULT NULL,
`PriceChangePct` float DEFAULT NULL,
`Volume` float DEFAULT NULL,
`DayLow` float DEFAULT NULL,
`DayHigh` float DEFAULT NULL,
`Week52Low` float DEFAULT NULL,
`Week52High` float DEFAULT NULL,
`Open` float DEFAULT NULL,
`High` float DEFAULT NULL,
`Bid` float DEFAULT NULL,
`BidSize` float DEFAULT NULL,
`Beta` float DEFAULT NULL,
`PrevClose` float DEFAULT NULL,
`Low` float DEFAULT NULL,
`Ask` float DEFAULT NULL,
`AskSize` float DEFAULT NULL,
`VWAP` float DEFAULT NULL,
`Yield` float DEFAULT NULL,
`Dividend` char(12) DEFAULT NULL,
`DivFrequency` varchar(24) DEFAULT NULL,
`SharesOut` float DEFAULT NULL,
`PERatio` float DEFAULT NULL,
`EPS` float DEFAULT NULL,
`ExDivDate` date DEFAULT NULL,
`MarketCap` float DEFAULT NULL,
`PBRatio` float DEFAULT NULL,
`Exchange` varchar(32) DEFAULT NULL,
`NewsTitle` varchar(1024) DEFAULT NULL,
`NewsSource` varchar(32) DEFAULT NULL,
`NewsPublicationDate` date DEFAULT NULL,
`NewsURL` varchar(256) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
I haven't found a good way to break it down; for the frontend presentation I need to display all of these columns. I am writing a query like:
SELECT * FROM trading_daily_price WHERE date='SOME_DATE' AND Symbol='%search_key%' ORDER BY 'column' LIMIT 10
The table has millions of records, and new records are added every day. The problem is that every query now takes a long time to produce output. On a 4GB DigitalOcean VPS with some tuning it runs nicely, but on GoDaddy business hosting it runs very slowly.
I want to know: is it a better idea to break the columns into multiple tables and use JOIN statements? Will that improve performance, or should I follow some other optimization approach?
As suggested by Madhur, I have added an INDEX on date, Symbol, and Market. It improves the speed of the query above, but the following query still takes a long time:
SELECT `date`,`Price` FROM trading_daily_price WHERE `Symbol` = 'GNCP:US' ORDER BY date ASC
Thanks in advance,
Rajib
As suggested by Madhur and JNevill, I found the only solution is to create multiple indexes as required.
For the first query,
SELECT * FROM trading_daily_price WHERE date='SOME_DATE' AND Symbol='%search_key%' ORDER BY 'column' LIMIT 10
we need to create an index as below:
CREATE INDEX index_DCS ON trading_daily_price (`date`, `column`, `Symbol`);
and for the second query,
SELECT `date`,`Price` FROM trading_daily_price WHERE `Symbol` = 'GNCP:US' ORDER BY date ASC
we need to create an index as below:
CREATE INDEX index_DPS ON trading_daily_price (`date`, `Price`, `Symbol`);
Thanks
You shouldn't need the (date, column, symbol) index for your first query, because you are searching Symbol with '%text%' and MySQL can only use the date part of such an index. An index on just date and column should be better, because MySQL can then utilize both columns of the index.
For your new query, you will need an index on (Symbol, date, Price). With that index, the query is fully covered and never needs to go back to the clustered index for the data.
Whether to split the table depends on your use case, in particular how you will handle old data. If old data won't be accessed frequently, you can consider splitting it off, but your application will need to cater for that.
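To make the point about the (Symbol, date, Price) index concrete: when an index contains every column a query touches, the engine can answer the query from the index alone. A small sketch using SQLite (not MySQL, but the planner behaves analogously; the table and data here are toy stand-ins):

```python
import sqlite3

# Toy version of trading_daily_price: the index (Symbol, date, Price)
# satisfies the whole query, so the engine should never have to visit
# the base table rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE trading_daily_price (date TEXT, Symbol TEXT, Price REAL)")
con.execute(
    "CREATE INDEX idx_sym_date_price "
    "ON trading_daily_price (Symbol, date, Price)"
)
con.executemany(
    "INSERT INTO trading_daily_price VALUES (?, ?, ?)",
    [("2018-01-0%d" % d, s, 10.0 + d)
     for d in range(1, 8) for s in ("GNCP:US", "AAPL:US")],
)

plan = con.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT date, Price FROM trading_daily_price "
    "WHERE Symbol = 'GNCP:US' ORDER BY date ASC"
).fetchall()
print(plan)  # the plan detail should show a search via idx_sym_date_price
```

If you drop the Price column from the index, the plan changes to one that must hit the table for each matching row, which is exactly the extra work the covering index avoids.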
Split up that table.
One table has the open/high/low/close/volume, indexed by stock and date.
Another table provides static information about each stock.
Perhaps another has statistics derived from the raw data.
Make changes like those, then come back for more advice/abuse.

How can I define a column as tinyint(4) using the Sequelize ORM

I need to define my table as
CREATE TABLE `test` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`o_id` int(11) unsigned NOT NULL,
`m_name` varchar(45) NOT NULL,
`o_name` varchar(45) NOT NULL,
`customer_id` int(11) unsigned NOT NULL,
`client_id` tinyint(4) unsigned DEFAULT '1',
`set_id` tinyint(4) unsigned DEFAULT NULL,
`s1` tinyint(4) unsigned DEFAULT NULL,
`s2` tinyint(4) unsigned DEFAULT NULL,
`s3` tinyint(4) unsigned DEFAULT NULL,
`review` varchar(2045) DEFAULT NULL,
`created_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
UNIQUE KEY `br_o_id_idx` (`o_id`),
KEY `br_date_idx` (`created_at`),
KEY `br_on_idx` (`o_name`),
KEY `br_mn_idx` (`m_name`)
)
But as far as I can see in the Sequelize documentation, it does not support TINYINT with a display width.
From the lack of ZEROFILL in your table definition, I suspect tinyint(4) probably does not do what you think it does. From 11.2.5 Numeric Type Attributes:
MySQL supports an extension for optionally specifying the display width of integer data types in parentheses following the base keyword for the type.
...
The display width does not constrain the range of values that can be stored in the column.
...
For example, a column specified as SMALLINT(3) has the usual SMALLINT range of -32768 to 32767, and values outside the range permitted by three digits are displayed in full using more than three digits.
I'm not sure whether other RDBMSs treat the number in parentheses differently, but from perusing the Sequelize source it looks like its authors are under the same incorrect impression you are.
That being said, the important part of your schema, namely storing those fields as TINYINTs (using only a single byte to hold values between 0 and 255), is sadly not available among the Sequelize DataTypes. I might suggest opening a PR to add it...
On the other hand, if you really are looking for ZEROFILL behavior and need to specify that display width of 4, you could use something like Sequelize.INTEGER(4).ZEROFILL, but obviously that would be wasteful of space in your DB.
For MySQL, the Sequelize.BOOLEAN data type maps to TINYINT(1). See
https://github.com/sequelize/sequelize/blob/3e5b8772ef75169685fc96024366bca9958fee63/lib/data-types.js#L397
and
http://docs.sequelizejs.com/en/v3/api/datatypes/
As noted by @user866762, the number in parentheses only affects how the data is displayed, not how it is stored. So TINYINT(1) vs. TINYINT(4) should have no effect on your data.

Pull records on or before 2 given dates

I was asked to pull records from the Match table between two given dates, say 2013/01/01 and 2013/02/29.
For one Match there are many possible bets once the relationship is applied. Say a Match started around 2013/01/05; the bets relating to it are the bets made on or before 2013/01/05.
What's odd is that my boss would also like to get the records from before the Match date occurred, because I already made a page that lets him view the bets of a certain match.
Since the page is a Match page in the first place, the search should be within the Match data, right?
My idea is to use something like
SELECT * from betdb
WHERE DateTime <= '2013/01/01 00:00:00' AND DateTime <= '2013/02/29 23:59:00'
But it yields only one record, dated
1931-01-29 00:00:00
which was possibly a typo in the data.
Is there a better way to pull the records that occurred before the supplied DateTime?
Match table definition
CREATE TABLE IF NOT EXISTS `matchdb` (
`MatchID` int(11) NOT NULL AUTO_INCREMENT,
`BookID` int(11) NOT NULL,
`Category` varchar(40) NOT NULL,
`MatchName` varchar(60) NOT NULL,
`StartTime` datetime NOT NULL,
`Result1` varchar(20) DEFAULT NULL,
`Result2` varchar(20) DEFAULT NULL,
`Result3` varchar(20) DEFAULT NULL,
PRIMARY KEY (`MatchID`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 AUTO_INCREMENT=1 ;
Bet table definition
CREATE TABLE IF NOT EXISTS `betdb` (
`BetID` int(11) NOT NULL,
`BookID` int(10) NOT NULL,
`PlayerID` int(10) NOT NULL,
`DateTime` datetime NOT NULL,
`MatchID` int(10) NOT NULL,
`BetType` varchar(40) NOT NULL,
`Bet` varchar(40) NOT NULL,
`BetAmount` float NOT NULL,
`Odds` float NOT NULL,
`BetResult` varchar(40) NOT NULL,
`Payout` float NOT NULL,
PRIMARY KEY (`BetID`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 AUTO_INCREMENT=1;
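No answer was recorded for this question, but note that the query above filters with <= on both bounds, so it effectively applies only the earlier date, which is why just the stray 1931 record comes back. A sketch of the presumably intended range filter, using SQLite for illustration (and 2013-02-28 as the end date, since 2013 had no February 29):

```python
import sqlite3

# The original query used "DateTime <= start AND DateTime <= end",
# which collapses to just "DateTime <= start". A BETWEEN (or
# ">= start AND <= end") filter is presumably what was meant.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE betdb (BetID INTEGER PRIMARY KEY, DateTime TEXT)")
con.executemany(
    "INSERT INTO betdb VALUES (?, ?)",
    [
        (1, "1931-01-29 00:00:00"),  # the suspected typo row
        (2, "2013-01-05 12:00:00"),
        (3, "2013-02-10 08:30:00"),
        (4, "2013-03-01 00:00:00"),  # outside the range
    ],
)

rows = con.execute(
    "SELECT BetID FROM betdb "
    "WHERE DateTime BETWEEN '2013-01-01 00:00:00' AND '2013-02-28 23:59:59' "
    "ORDER BY DateTime"
).fetchall()
print(rows)  # [(2,), (3,)]
```

ISO-formatted date strings compare correctly as text, which is why the BETWEEN works here; in MySQL the DATETIME column type gives the same ordering natively.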

Error in MySQL partitioning

I need to create the table in the format below, but it returns an error in the partitioning clause: Error Code: 1659. Field 'fldassigndate' is of a not allowed type for this type of partitioning. How do I resolve this error and get the partitioning to work?
CREATE TABLE tblattendancesetup (
fldattendanceid int(11) NOT NULL AUTO_INCREMENT,
flddept varchar(100) DEFAULT NULL,
fldemployee varchar(100) DEFAULT NULL,
fldintime varchar(20) DEFAULT NULL,
fldouttime varchar(20) DEFAULT NULL,
fldlateafter varchar(20) DEFAULT NULL,
fldearlybefore varchar(20) DEFAULT NULL,
fldweekoff varchar(20) DEFAULT NULL,
fldshiftname varchar(20) DEFAULT NULL,
fldassigndate varchar(20) DEFAULT NULL,
fldfromdate varchar(20) DEFAULT NULL,
fldtodate varchar(20) DEFAULT NULL,
fldrefid varchar(20) DEFAULT NULL,
UNIQUE KEY fldattendanceid (fldattendanceid),
KEY in_attendancesetup (fldemployee,fldintime,fldouttime,fldlateafter,fldearlybefore,fldfromdate,fldtodate,fldattendanceid),
KEY i_emp_tmp (fldemployee),
KEY i_emp_attendance (fldemployee)
)
PARTITION BY RANGE (fldassigndate)
(PARTITION p_Apr VALUES LESS THAN (TO_DAYS('2012-05-01')),
PARTITION p_May VALUES LESS THAN (TO_DAYS('2012-06-01')),
PARTITION p_Nov VALUES LESS THAN MAXVALUE );
From the MySQL manual (Section 18):
Data type of partitioning key. A partitioning key must be either an
integer column or an expression that resolves to an integer.
Neither dates nor varchars can be used directly for partitioning.
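As a side note on what TO_DAYS() produces: it maps a date to an integer day count (days since year 0), which is what makes it legal inside a RANGE partitioning expression. A Python back-of-envelope check (the constant offset of 365 between Python ordinals and TO_DAYS is an assumption checked only against the documented value TO_DAYS('1970-01-01') = 719528):

```python
from datetime import date

def to_days(d: date) -> int:
    # MySQL's TO_DAYS counts days from year 0; Python's date.toordinal()
    # counts from 0001-01-01, so the two differ by a constant 365.
    return d.toordinal() + 365

print(to_days(date(1970, 1, 1)))  # 719528
print(to_days(date(2012, 5, 1)))  # integer boundary usable for p_Apr
print(to_days(date(2012, 6, 1)))  # integer boundary usable for p_May
```

Each VALUES LESS THAN (TO_DAYS('...')) boundary in the corrected DDL is just one of these integers, so rows fall into partitions by simple integer comparison.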
Although this question is very old, I thought it would help those searching, as I once was.
The OP should first correct the column structure by using a proper type for the date fields, such as DATETIME.
Once that is done: I have not tried to set the keys exactly that way, but I have found that I needed a composite primary key if I wanted to keep the structure of my table the same. In this case it would be
PRIMARY KEY(`fldattendanceid`, `fldassigndate`)
Then, there is a TO_DAYS missing from the PARTITION BY RANGE expression. Lastly, I'm not 100 percent sure, but I don't think you can partition on a nullable field. Even if you can, it would be awful practice to do so.
CREATE TABLE tblattendancesetup (
fldattendanceid INT(11) NOT NULL AUTO_INCREMENT,
flddept VARCHAR(100) DEFAULT NULL,
fldemployee VARCHAR(100) DEFAULT NULL,
fldintime DATETIME DEFAULT NULL,
fldouttime DATETIME DEFAULT NULL,
fldlateafter DATETIME DEFAULT NULL,
fldearlybefore DATETIME DEFAULT NULL,
fldweekoff VARCHAR(20) DEFAULT NULL,
fldshiftname VARCHAR(20) DEFAULT NULL,
fldassigndate DATETIME NOT NULL,
fldfromdate DATETIME DEFAULT NULL,
fldtodate DATETIME DEFAULT NULL,
fldrefid VARCHAR(20) DEFAULT NULL,
PRIMARY KEY(`fldattendanceid`, `fldassigndate`),
KEY in_attendancesetup (fldemployee,fldintime,fldouttime,fldlateafter,fldearlybefore,fldfromdate,fldtodate,fldattendanceid),
KEY i_emp_tmp (fldemployee),
KEY i_emp_attendance (fldemployee)
)
PARTITION BY RANGE ( TO_DAYS(fldassigndate))
(PARTITION p_Apr VALUES LESS THAN (TO_DAYS('2012-05-01')),
PARTITION p_May VALUES LESS THAN (TO_DAYS('2012-06-01')),
PARTITION p_Nov VALUES LESS THAN MAXVALUE );
I hope this helps someone!

MySQL: Find and delete similar records - updated with example

I'm trying to dedupe a table where I know there are 'close' (but not exact) rows that need to be removed.
I have a single table with 22 fields, and uniqueness can be established by comparing 5 of those fields. Of the remaining 17 fields (including the unique key), there are 3 fields that make each row unique, meaning the straightforward dedup method will not work.
I was looking at the multi-table delete method outlined here: http://blog.krisgielen.be/archives/111 but I can't make sense of the final line of code (AND M1.cd*100+M1.track > M2.cd*100+M2.track), as I am unsure what the cd*100 part achieves...
Can anyone assist me with this? I suspect I could do better exporting the whole thing to Python, doing something with it, then re-importing it, but then (1) I'm stuck on how to do the dedup anyway, and (2) I had to break the data into chunks to import it into MySQL because it was timing out after 300 seconds, so it was a whole debacle to get it into MySQL in the first place... (I am a novice at both MySQL and Python.)
The table is a dump of some 40 log files from some testing. The test set for each log is some 20,000 files. The repeating values are either the test conditions, the file name/parameters, or the results of the tests.
SHOW CREATE TABLE:
CREATE TABLE `t1` (
`DROID_V` int(1) DEFAULT NULL,
`Sig_V` varchar(7) DEFAULT NULL,
`SPEED` varchar(4) DEFAULT NULL,
`ID` varchar(7) DEFAULT NULL,
`PARENT_ID` varchar(10) DEFAULT NULL,
`URI` varchar(10) DEFAULT NULL,
`FILE_PATH` varchar(68) DEFAULT NULL,
`NAME` varchar(17) DEFAULT NULL,
`METHOD` varchar(10) DEFAULT NULL,
`STATUS` varchar(14) DEFAULT NULL,
`SIZE` int(10) DEFAULT NULL,
`TYPE` varchar(10) DEFAULT NULL,
`EXT` varchar(4) DEFAULT NULL,
`LAST_MODIFIED` varchar(10) DEFAULT NULL,
`EXTENSION_MISMATCH` varchar(32) DEFAULT NULL,
`MD5_HASH` varchar(10) DEFAULT NULL,
`FORMAT_COUNT` varchar(10) DEFAULT NULL,
`PUID` varchar(15) DEFAULT NULL,
`MIME_TYPE` varchar(24) DEFAULT NULL,
`FORMAT_NAME` varchar(10) DEFAULT NULL,
`FORMAT_VERSION` varchar(10) DEFAULT NULL,
`INDEX` int(11) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`INDEX`)
) ENGINE=MyISAM AUTO_INCREMENT=960831 DEFAULT CHARSET=utf8
The only unique field is the primary key, `INDEX`.
Unique records can be established by looking at DROID_V, Sig_V, SPEED, NAME and PUID.
Of the ~900,000 rows, I have about 10,000 dups that are either a single duplicate of a record or up to 6 repetitions of the record.
Row examples: As Is
5;"v37";"slow";"10266";;"file:";"V1-FL425817.tif";"V1-FL425817.tif";"BINARY_SIG";"MultipleIdenti";"20603284";"FILE";"tif";"2008-11-03";;;;"fmt/7";"image/tiff";"Tagged Ima";"3";"191977"
5;"v37";"slow";"10268";;"file:";"V1-FL425817.tif";"V1-FL425817.tif";"BINARY_SIG";"MultipleIdenti";"20603284";"FILE";"tif";"2008-11-03";;;;"fmt/8";"image/tiff";"Tagged Ima";"4";"191978"
5;"v37";"slow";"10269";;"file:";"V1-FL425817.tif";"V1-FL425817.tif";"BINARY_SIG";"MultipleIdenti";"20603284";"FILE";"tif";"2008-11-03";;;;"fmt/9";"image/tiff";"Tagged Ima";"5";"191979"
5;"v37";"slow";"10270";;"file:";"V1-FL425817.tif";"V1-FL425817.tif";"BINARY_SIG";"MultipleIdenti";"20603284";"FILE";"tif";"2008-11-03";;;;"fmt/10";"image/tiff";"Tagged Ima";"6";"191980"
5;"v37";"slow";"12766";;"file:";"V1-FL425817.tif";"V1-FL425817.tif";"BINARY_SIG";"MultipleIdenti";"20603284";"FILE";"tif";"2008-11-03";;;;"fmt/7";"image/tiff";"Tagged Ima";"3";"193977"
5;"v37";"slow";"12768";;"file:";"V1-FL425817.tif";"V1-FL425817.tif";"BINARY_SIG";"MultipleIdenti";"20603284";"FILE";"tif";"2008-11-03";;;;"fmt/8";"image/tiff";"Tagged Ima";"4";"193978"
5;"v37";"slow";"12769";;"file:";"V1-FL425817.tif";"V1-FL425817.tif";"BINARY_SIG";"MultipleIdenti";"20603284";"FILE";"tif";"2008-11-03";;;;"fmt/9";"image/tiff";"Tagged Ima";"5";"193979"
5;"v37";"slow";"12770";;"file:";"V1-FL425817.tif";"V1-FL425817.tif";"BINARY_SIG";"MultipleIdenti";"20603284";"FILE";"tif";"2008-11-03";;;;"fmt/10";"image/tiff";"Tagged Ima";"6";"193980"
Row Example: As It should be
5;"v37";"slow";"10266";;"file:";"V1-FL425817.tif";"V1-FL425817.tif";"BINARY_SIG";"MultipleIdenti";"20603284";"FILE";"tif";"2008-11-03";;;;"fmt/7";"image/tiff";"Tagged Ima";"3";"191977"
5;"v37";"slow";"10268";;"file:";"V1-FL425817.tif";"V1-FL425817.tif";"BINARY_SIG";"MultipleIdenti";"20603284";"FILE";"tif";"2008-11-03";;;;"fmt/8";"image/tiff";"Tagged Ima";"4";"191978"
5;"v37";"slow";"10269";;"file:";"V1-FL425817.tif";"V1-FL425817.tif";"BINARY_SIG";"MultipleIdenti";"20603284";"FILE";"tif";"2008-11-03";;;;"fmt/9";"image/tiff";"Tagged Ima";"5";"191979"
5;"v37";"slow";"10270";;"file:";"V1-FL425817.tif";"V1-FL425817.tif";"BINARY_SIG";"MultipleIdenti";"20603284";"FILE";"tif";"2008-11-03";;;;"fmt/10";"image/tiff";"Tagged Ima";"6";"191980"
Please note, you can see from the index column at the end that I have cut out some other rows; I have only identified a very small set of repeating rows. Please let me know if you need any more 'noise' from the rest of the DB.
Thanks.
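For what it's worth, the cd*100 + M1.track expression the question asks about is just a way of collapsing two columns into one ordered number: cd is the major key, track the minor, and 100 an assumed upper bound on tracks per cd. The strict > comparison then imposes a total order on each duplicate pair, so the delete removes exactly one of the two. A tiny sketch:

```python
# Combine (cd, track) into a single comparable number, assuming
# track is always < 100. With this key, the condition
# "M1.cd*100 + M1.track > M2.cd*100 + M2.track" matches each
# duplicate pair exactly once (deleting the higher-ordered row,
# never both).
def order_key(cd: int, track: int) -> int:
    return cd * 100 + track

a = order_key(2, 5)   # cd 2, track 5  -> 205
b = order_key(1, 99)  # cd 1, track 99 -> 199
print(a > b)          # True: every track on cd 2 sorts after all of cd 1
```

The multiplier only works while the assumption holds; a cd with 100 or more tracks would break the ordering, which is why the blog's trick is a convenience rather than a general method.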
I figured out a fix: I had been using COUNT(*), which just counted everything in the table; by using COUNT(DISTINCT NAME) I am able to weed out the dup rows that fit the dup criteria (as set out by the field selection in the WHERE clause).
Example:
SELECT `PUID`,`DROID_V`,`SIG_V`,`SPEED`, COUNT(DISTINCT NAME) as Hit FROM sourcelist, main_small WHERE sourcelist.SourcePUID = 'MyVariableHere' AND main_small.NAME = sourcelist.SourceFileName
GROUP BY `PUID`,`DROID_V`,`SIG_V`,`SPEED` ORDER BY `DROID_V` ASC, `SIG_V` ASC, `SPEED`;
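In case it helps future readers: when the goal is to actually delete the duplicates rather than count them, the usual pattern keeps the row with the lowest primary key in each duplicate group and removes the rest. A sketch against a cut-down SQLite copy of the table (columns trimmed; the real table would group on DROID_V, Sig_V, SPEED, NAME and PUID):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t1 (idx INTEGER PRIMARY KEY, NAME TEXT, PUID TEXT)")
con.executemany(
    "INSERT INTO t1 (idx, NAME, PUID) VALUES (?, ?, ?)",
    [
        (191977, "V1-FL425817.tif", "fmt/7"),
        (191978, "V1-FL425817.tif", "fmt/8"),
        (193977, "V1-FL425817.tif", "fmt/7"),  # dup of 191977
        (193978, "V1-FL425817.tif", "fmt/8"),  # dup of 191978
    ],
)

# Keep the lowest idx per (NAME, PUID) group; delete everything else.
con.execute(
    "DELETE FROM t1 WHERE idx NOT IN "
    "(SELECT MIN(idx) FROM t1 GROUP BY NAME, PUID)"
)
print(con.execute("SELECT idx FROM t1 ORDER BY idx").fetchall())
# [(191977,), (191978,)]
```

In MySQL the same DELETE ... NOT IN (SELECT MIN(...) GROUP BY ...) shape works, though on a ~900,000-row MyISAM table you would want an index on the grouping columns first, or MySQL may require materializing the subquery into a temporary table.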