MySQL-Query too slow - mysql

I am using the following tables in my MySQL-Database:
--
-- Table structure for table `company`
--
CREATE TABLE IF NOT EXISTS `company` (
`numb` varchar(4) NOT NULL,
`cik` varchar(30) NOT NULL,
`sNumber` varchar(30) NOT NULL,
`street1` varchar(255) NOT NULL,
`street2` varchar(255) NOT NULL,
`city` varchar(255) NOT NULL,
`state` varchar(100) NOT NULL,
`zip` varchar(100) NOT NULL,
`phone` varchar(255) NOT NULL,
`name` varchar(255) NOT NULL,
`dateChanged` varchar(30) NOT NULL,
`name2` varchar(255) NOT NULL,
`seriesId` varchar(30) NOT NULL,
`symbol` varchar(10) NOT NULL,
`exchange` varchar(20) NOT NULL,
PRIMARY KEY (`cik`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
INSERT INTO `company` (`numb`, `cik`, `sNumber`, `street1`, `street2`, `city`, `state`, `zip`, `phone`, `name`, `dateChanged`, `name2`, `seriesId`, `symbol`, `exchange`) VALUES
('6798', 'abc', '953551121', '701 AVENUE', '', 'GLENDALE', 'CA', '91201-2349', '818-244-8080', '', '', 'Public Store', '', 'PSA', 'NYSE')
--
-- Table structure for table `data`
--
CREATE TABLE IF NOT EXISTS `data` (
`id` int(100) NOT NULL AUTO_INCREMENT,
`number` varchar(100) NOT NULL,
`elementname` mediumtext NOT NULL,
`date` varchar(100) NOT NULL,
`elementvalue` longtext NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=18439;
INSERT INTO `data` (`id`, `number`, `elementname`, `date`, `elementvalue`) VALUES
(1, '0001393311-10-000004', 'StockholdersEquityIncludingPortionAttributableToNoncontrollingInterest', '2009-12-31', '3399777000')
--
-- Table structure for table `filing`
--
CREATE TABLE IF NOT EXISTS `filing` (
`number` varchar(100) NOT NULL,
`file_number` varchar(100) NOT NULL,
`type` varchar(100) NOT NULL,
`amendment` tinyint(1) NOT NULL,
`date` varchar(100) NOT NULL,
`cik` varchar(30) NOT NULL,
PRIMARY KEY (`accession_number`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
INSERT INTO `filing` (`number`, `file_number`, `type`, `amendment`, `date`, `cik`) VALUES
('0001393311-10-000004', '001-33519', '10-K', 0, '2009-12-31', '0000751653'),
('0000751652-10-000006', '001-08796', '10-K', 0, '2009-12-31', '0000751652')
The data table has around 22.000 entries, filing and company tables have around 400 entries each. I want to operate the database with a lot more entries in the future.
I perform the following query, which selects the newest item with a given type:
SELECT data.elementname, data.elementvalue, company.name2 FROM data
JOIN filing ON data.number = filing.number
JOIN company ON filing.cik = company.cik
WHERE elementname IN ('Elem1', 'Elem2', 'Elem3', 'Elem4', 'Elem5', 'ElemN')
AND number IN (
SELECT number
FROM filing
WHERE filing.cik IN ('cik1', 'cik2', 'cikN')
AND filing.type = '1L'
GROUP BY filing.cik
)
It takes between ~0.28 and 0.4 seconds to complete, which appears to be very slow.
When i perform the query without the following line
WHERE filing.cik IN ('cik1', 'cik2', 'cikN')
it takes only ~0.035 seconds.
Any idea how to speed the query up or to optimize the table structure because the table is growing rapidly and it's already too slow.

First off, the table structure you posted for filing is incorrect, as the primary key you specified doesn't. I'll assume you mean number. Additionally, you didn't specify the table definition for company, which makes trying to provide advice for this somewhat difficult.
However, both of the comments are correct. You need some indexes. Based on the query, you should probably some the following indexes.
ALTER TABLE company ADD INDEX ( cik )
ALTER TABLE data ADD INDEX ( number )
I would also recommend taking a look at whether data.elementname actually needs to be a MEDIUMTEXT, which is a pretty huge column. If the rest of the data looks like the example data you provided, you should probably change it into a varchar. TEXT columns can cause some serious performance penalties due to the way they're stored.
Additionally, your PRIMARY KEY number columns, which are currently strings, look as though they could be reformatted into different columns that are actually of type INT. Keep in mind that VARCHAR PRIMARY KEY columns will not be as efficient as INTs, just because they're so much bigger.
Lastly, 22k rows isn't all that much data. You should a take a look at your my.cnf settings. Your key_buffer value may be too small to fit indexes entirely in memory. Additionally, you may want to consider using INNODB for these tables, combined with an innodb_buffer_pool value that'll keep everything in memory.

Related

Is it possible to make a batch insert/update if the uniqueness of the record is a bundle of two fields?

I have the following table structure (example)
CREATE TABLE `Test` (
`id` int(11) NOT NULL,
`order_id` int(11) NOT NULL,
`position_id` int(11) NOT NULL,
`name` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`price` decimal(10,2) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
ALTER TABLE `Test` ADD PRIMARY KEY (`id`);
ALTER TABLE `Test` MODIFY `id` int(11) NOT NULL AUTO_INCREMENT;
This table contains data that is constantly in need of updating. There is also new data that needs to be entered. Since there is a lot of data, it will take quite a long time to check each record to make it insert or update.
After studying the question, I realized that I need to use batch insert/update with:
INSERT on DUPLICATE KEY UPDATE
But the documentation says that the fields must have a unique index. But I don't have any unique fields, I can't use the ID field. The uniqueness of the record can only be in a combination of two fields order_id and position_id.
Is it possible to make a batch insert/update if the uniqueness of the record is a bundle of two fields?
You need a composite primary-key. You also don't need your AUTO_INCREMENT id column, so you can drop it.
Like so:
CREATE TABLE `Test` (
`order_id` int NOT NULL,
`position_id` int NOT NULL,
`name` varchar(255) NOT NULL COLLATE utf8mb4_unicode_ci,
`price` decimal(10,2) NOT NULL,
CONSTRAINT PK_Test PRIMARY KEY ( `order_id`, `position_id` )
) ENGINE=InnoDB
Then you can use INSERT ON DUPLICATE KEY UPDATE.

simple select query optimise

I have a table defined as follows:
CREATE TABLE IF NOT EXISTS `cards` (
`ID` int(11) NOT NULL,
`Name` varchar(200) NOT NULL,
`WorkerID` varchar(20) NOT NULL,
`pic` varchar(200) NOT NULL,
`expDate` bigint(20) NOT NULL,
`reminderSent` tinyint(4) NOT NULL,
`regNum` varchar(8) NOT NULL,
`cardType` varchar(200) NOT NULL
) ENGINE=MyISAM AUTO_INCREMENT=92 DEFAULT CHARSET=latin1;
ALTER TABLE `cards`
ADD PRIMARY KEY (`ID`), ADD KEY `cardsWorkerID_idx` (`WorkerID`);
But running:
explain
SELECT pic, expDate, Name, ID, cardType, regNum FROM cards WHERE workerID= 18
tells me it is scanning the entire table, even though I added an index to the workerID field. Can anyone explain what I'm missing?
The use of indexes depends on the size of the data. It also depends on the types used for the comparison. If you have a small table, then the SQL engine might decide that a scan is more efficient than using the index. This is particularly true if the table fits on a single data page.
In your case, though, the problem is might be data conversion. Use the appropriate typed constant for the comparison:
SELECT pic, expDate, Name, ID, cardType, regNum
FROM cards
WHERE workerID = '18';

mysql match against not matching text in brackets

I'm trying to use match against to return search results.
I have a problem though where it's not returning results where the matching text is in brackets. Just getting rid of the brackets isn't an option i'm afraid.
So running the sql fiddle below I would expect it to return two results, not one
CREATE TABLE `courses` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`course_code` varchar(100) NOT NULL,
`course_name` varchar(100) DEFAULT NULL,
`startdate` varchar(100) DEFAULT NULL,
`starttimestamp` varchar(45) DEFAULT NULL,
`prospectus_title` varchar(500) DEFAULT NULL,
PRIMARY KEY (`id`),
FULLTEXT KEY `info` (`course_name`,`prospectus_title`,`course_code`)
) ENGINE=MyISAM AUTO_INCREMENT=981074 DEFAULT CHARSET=utf8;
INSERT INTO `courses` (`id`, `course_code`, `course_name`, `startdate`, `starttimestamp`, `prospectus_title`) VALUES
('1', '1234', 'vrqtest', 'time','time', 'vrqtest'),
('2', '5678', '(vrq)test', 'time','time','(vrq)test');
SELECT * FROM courses force index(info)
WHERE starttimestamp IS NOT NULL AND (
MATCH ( course_name ) against ('vrq*' in boolean mode ))
SQL Fiddle
That returns only the first record but should also the second.
Any ideas?
Here's why:
https://dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_ft_min_word_len
ft_min_word_len
The minimum length of the word to be included in a
FULLTEXT index. Defaults to 4
(vrq)test is too short to match. Add an extra character (e.g. (vrqa)test ) and it's matched.

Avoid UNION for two almost identical tables in MySQL

I'm not very good at MySQL and i'm going to write a query to count messages sent by an user, based on its type and is_auto field.
Messages can be of type "small text message" or "newsletter". I created two entities with a few fields that differs between them. The important one is messages_count that is absent in table newsletter and it's used in the query:
CREATE TABLE IF NOT EXISTS `small_text_message` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`messages_count` int(11) NOT NULL,
`username` varchar(255) NOT NULL,
`method` varchar(255) NOT NULL,
`content` longtext,
`sent_at` datetime DEFAULT NULL,
`status` varchar(255) NOT NULL,
`recipients_count` int(11) NOT NULL,
`customers_count` int(11) NOT NULL,
`sheduled_at` datetime DEFAULT NULL,
`sheduled_for` datetime DEFAULT NULL,
`is_auto` tinyint(1) NOT NULL,
`user_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
And:
CREATE TABLE `newsletter` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`subject` varchar(78) DEFAULT NULL,
`content` longtext,
`sent_at` datetime DEFAULT NULL,
`status` varchar(255) NOT NULL,
`recipients_count` int(11) NOT NULL,
`customers_count` int(11) NOT NULL,
`sheduled_at` datetime DEFAULT NULL,
`sheduled_for` datetime DEFAULT NULL,
`is_auto` tinyint(1) NOT NULL,
`user_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
I ended up with a UNION query. Can this query be shortened or optimized since the only difference is messages_count that should be always 1 for newsletter?
SELECT
CONCAT('sms_', IF(is_auto = 0, 'user' , 'auto')) AS subtype,
SUM(messages_count * (customers_count + recipients_count)) AS count
FROM small_text_message WHERE status <> 'pending' AND user_id = 1
GROUP BY is_auto
UNION
SELECT
CONCAT('newsletter_', IF(is_auto = 0, 'user' , 'auto')) AS subtype,
SUM(customers_count + recipients_count) AS count
FROM newsletter WHERE status <> 'pending' AND user_id = 1
GROUP BY is_auto
I don't see any easy way to avoid a UNION (or UNION ALL) operation, that will return the specified result set.
I would recommend you use a UNION ALL operator in place of the UNION operator. Then the execution plan will not include the step that eliminates duplicate rows. (You already have GROUP BY operations on each query, and there is no way that those two queries can produce an identical row.)
Otherwise, your query looks fine just as it is written.
(It's always a good thing to consider the question, might there be a better way? To get the result set you are asking for, from the schema you have, your query looks about as good as it's going to get.)
If you are looking for more general DB advice, I recommend restructuring the tables to factor the common elements into one table, perhaps called outbound_communication or something, with all of your common fields, then perhaps have "sub tables" for the specific types to host the fields which are unique to that type. It does mean a simple JOIN is necessary to select all of the fields you want, but the again, it's normalized and actually makes situations like this one easier (one table holds all of the entities of interest). Additionally, you have the option of writing that JOIN just once as a "view", and then your existing code would not even need to change to see the two tables as if they never changed.
CREATE TABLE IF NOT EXISTS `outbound_communicaton` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`content` longtext,
`sent_at` datetime DEFAULT NULL,
`status` varchar(255) NOT NULL,
`recipients_count` int(11) NOT NULL,
`customers_count` int(11) NOT NULL,
`sheduled_at` datetime DEFAULT NULL,
`sheduled_for` datetime DEFAULT NULL,
`is_auto` tinyint(1) NOT NULL,
`user_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
CREATE TABLE `small_text_message` (
`oubound_communication_id` int(11) NOT NULL,
`messages_count` int(11) NOT NULL,
`username` varchar(255) NOT NULL,
`method` varchar(255) NOT NULL,
PRIMARY KEY (`outbound_communication_id`),
FOREIGN KEY (outbound_communication_id)
REFERENCES outbound_communicaton(id)
) ENGINE=InnoDB;
CREATE TABLE `newsletter` (
`oubound_communication_id` int(11) NOT NULL,
`subject` varchar(78) DEFAULT NULL,
PRIMARY KEY (`outbound_communication_id`),
FOREIGN KEY (outbound_communication_id)
REFERENCES outbound_communicaton(id)
) ENGINE=InnoDB;
Then selecting a text msg is like this:
SELECT *
FROM outbound_communication AS parent
JOIN small_text_message
ON parent.id = small_text_message.outbound_communication_id
WHERE parent.id = 1234;
The nature of the query is inherently the union of the data from the small text message and the newsletter tables, so the UNION query is the only realistic formulation. There's no join of relevance between the two tables, for example.
So, I think you're very much on the right lines with your query.
Why are you worried about a UNION?

SQL Code for INSERT while checking of multiple keys for duplicates

I have the following table:
CREATE TABLE IF NOT EXISTS `customer_list` (
`id` INT AUTO_INCREMENT,
`first_name` char(4) NOT NULL,
`last_name` varchar(80) NOT NULL,
`phone` varchar(50) NOT NULL,
`province` varchar(50) NOT NULL,
`country` varchar(30) NOT NULL,
`start_date` TIMESTAMP NOT NULL,
`end_date` TIMESTAMP NOT NULL,
PRIMARY KEY (id)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
I want to be able to insert into this table with the only restriction being that first_name, last_name and phone CANNOT be the same. If they are the same I want some sort of error returned to warn the end user that the record already exists - No insert/update/replace action is performed.
The key here is the INSERT statement must somehow check 3 fields for duplication. The error must only return if ALL 3 fields are duplicates. IE. 1 or 2 out of the 3 are allowed to be duplicates and still be entered.
Is this possible with one INSERT statement?
Try:
alter table customer_list add unique index(first_name, last_name, phone);