I can't get this to work
CREATE TABLE `oc_tax_class` (
`tax_class_id` int(11) NOT NULL,
`title` varchar(255) NOT NULL,
`description` varchar(255) NOT NULL,
`date_added` datetime NOT NULL,
`date_modified` datetime NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
-- --------------------------------------------------------
--
-- Table structure for table `oc_tax_rate`
--
CREATE TABLE `oc_tax_rate` (
`tax_rate_id` int(11) NOT NULL,
`geo_zone_id` int(11) NOT NULL DEFAULT 0,
`name` varchar(255) NOT NULL,
`rate` decimal(15,4) NOT NULL DEFAULT 0.0000,
`type` char(1) NOT NULL,
`date_added` datetime NOT NULL,
`date_modified` datetime NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
-- --------------------------------------------------------
--
-- Table structure for table `oc_tax_rule`
--
CREATE TABLE `oc_tax_rule` (
`tax_rule_id` int(11) NOT NULL,
`tax_class_id` int(11) NOT NULL,
`tax_rate_id` int(11) NOT NULL,
`based` varchar(10) NOT NULL,
`priority` int(5) NOT NULL DEFAULT 1
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
3 tables. I want oc_tax_class.title = oc_tax_rate.name
I believe, although I'm not sure, that I should
INSERT INTO oc_tax_class(title)
or
UPDATE oc_tax_class SET title = ...
SELECT oc_tax_rate.name, oc_tax_rule.tax_class_id
FROM oc_tax_rate
JOIN oc_tax_rule ON oc_tax_rate.tax_rate_id = oc_tax_rule.tax_rate_id
And then I don't know what to do next.
I need to copy values from one column to another table, passing through a connecting table.
MySQL supports a multi-table UPDATE syntax, but the documentation (https://dev.mysql.com/doc/refman/en/update.html) has pretty sparse examples of it.
In your case, this may work:
UPDATE oc_tax_class
JOIN oc_tax_rule USING (tax_class_id)
JOIN oc_tax_rate USING (tax_rate_id)
SET oc_tax_class.title = oc_tax_rate.name;
I did not test this. I suggest you test it first on a sample of your data, to make sure it works the way you want it to.
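One way to try the idea on sample data first is a small throwaway script. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the sample rows are made up, and because SQLite has no UPDATE ... JOIN it swaps in a correlated subquery instead (a form MySQL should also accept). The ORDER BY priority LIMIT 1 is my assumption about which rate name should win when a class has several rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE oc_tax_class (tax_class_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE oc_tax_rate  (tax_rate_id  INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE oc_tax_rule  (tax_rule_id  INTEGER PRIMARY KEY,
                           tax_class_id INTEGER NOT NULL,
                           tax_rate_id  INTEGER NOT NULL,
                           priority     INTEGER NOT NULL DEFAULT 1);
-- Made-up sample rows; class 1 has two rules, class 3 has none.
INSERT INTO oc_tax_class VALUES (1, 'old A'), (2, 'old B'), (3, 'no rule');
INSERT INTO oc_tax_rate  VALUES (10, 'VAT 20%'), (11, 'Eco tax'), (12, 'VAT 5%');
INSERT INTO oc_tax_rule  VALUES (100, 1, 10, 1), (101, 1, 11, 2), (102, 2, 12, 1);
""")

# Copy the rate name into the class title through the connecting table.
# The WHERE EXISTS keeps classes without any rule unchanged.
conn.execute("""
UPDATE oc_tax_class
SET title = (SELECT r.name
             FROM oc_tax_rule AS ru
             JOIN oc_tax_rate AS r ON r.tax_rate_id = ru.tax_rate_id
             WHERE ru.tax_class_id = oc_tax_class.tax_class_id
             ORDER BY ru.priority
             LIMIT 1)
WHERE EXISTS (SELECT 1
              FROM oc_tax_rule AS ru
              JOIN oc_tax_rate AS r ON r.tax_rate_id = ru.tax_rate_id
              WHERE ru.tax_class_id = oc_tax_class.tax_class_id)
""")

titles = dict(conn.execute("SELECT tax_class_id, title FROM oc_tax_class"))
print(titles)
```

Note that with the JOIN form, a class linked to several rates is matched more than once and it is not obvious which name ends up in title, so it is worth checking for such classes before running the update on real data.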
I have a MySQL query I wrote that displays the data I want it to, but it takes at least 30 secs - 1 min to run.
I researched how to create the nested SELECT query with the COUNT that I needed in order to display the data I required. The SQL is also part of a web page I have, and when I go from page to page it takes the same amount of time to load. I am sure there is a more efficient way to write the query so it loads fast, as there are only about 1,500 records in the ttb_shows table and about 11k in the ttb_books table. Below is the query.
-- DDL
CREATE TABLE `ttb_books` (
`book_id` int(11) NOT NULL,
`book_name` varchar(255) NOT NULL DEFAULT '',
`cover_image` varchar(255) DEFAULT NULL,
`show_id` int(11) NOT NULL DEFAULT '0',
`state_id` int(7) NOT NULL DEFAULT '0',
`notes` text,
`year` varchar(255) DEFAULT NULL,
`publisher` varchar(255) DEFAULT NULL,
`status_id` int(7) NOT NULL DEFAULT '0',
`no_pages` varchar(255) DEFAULT NULL,
`footer` text,
`opt1` varchar(255) NOT NULL DEFAULT '$5-10',
`opt2` varchar(255) DEFAULT NULL,
`opt3` varchar(255) DEFAULT NULL,
`opt4` varchar(255) DEFAULT NULL,
`opt5` varchar(255) DEFAULT NULL,
`owned` int(1) DEFAULT '0'
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE `ttb_shows` (
`show_id` int(11) NOT NULL,
`show_name` varchar(255) NOT NULL DEFAULT '',
`date_added` datetime NOT NULL DEFAULT '0000-00-00 00:00:00'
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
-- QUERY
SELECT ttb_shows.show_id, ttb_shows.show_name, ttb_shows.date_added,
COUNT(ttb_books.book_id) AS books,
(SELECT COUNT(ttb_books.owned) AS owned FROM ttb_books WHERE (owned=1 AND ttb_books.show_id = ttb_shows.show_id))
FROM ttb_shows LEFT JOIN ttb_books ON ttb_shows.show_id = ttb_books.show_id
GROUP BY ttb_shows.show_id, ttb_shows.show_name, ttb_shows.date_added
Thank you to all who are able to help with this. It is really appreciated!
You could avoid the correlated subquery for owned by using SUM with a CASE expression:
SELECT ttb_shows.show_id
, ttb_shows.show_name
, ttb_shows.date_added
, COUNT(ttb_books.book_id) AS books
, sum( case when ttb_books.owned = 1 then 1 else 0 end) AS owned
FROM ttb_shows
LEFT JOIN ttb_books ON ttb_shows.show_id = ttb_books.show_id
GROUP BY ttb_shows.show_id, ttb_shows.show_name, ttb_shows.date_added
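To convince yourself that the rewrite returns the same numbers as the original, you could compare both queries on a few rows. This is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL; the tables are cut down to the columns the query touches and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ttb_shows (show_id INTEGER PRIMARY KEY, show_name TEXT, date_added TEXT);
CREATE TABLE ttb_books (book_id INTEGER PRIMARY KEY, show_id INTEGER, owned INTEGER DEFAULT 0);
-- Invented sample data; show 3 has no books at all.
INSERT INTO ttb_shows VALUES (1, 'Show A', '2020-01-01'),
                             (2, 'Show B', '2020-02-01'),
                             (3, 'Empty',  '2020-03-01');
INSERT INTO ttb_books VALUES (1, 1, 1), (2, 1, 0), (3, 1, 1), (4, 2, 0);
""")

# Original shape: one correlated subquery per show.
with_subquery = conn.execute("""
SELECT s.show_id, s.show_name, s.date_added,
       COUNT(b.book_id) AS books,
       (SELECT COUNT(*) FROM ttb_books AS o
        WHERE o.owned = 1 AND o.show_id = s.show_id) AS owned
FROM ttb_shows AS s
LEFT JOIN ttb_books AS b ON s.show_id = b.show_id
GROUP BY s.show_id, s.show_name, s.date_added
ORDER BY s.show_id
""").fetchall()

# Rewritten shape: conditional aggregation in a single pass over the join.
with_case = conn.execute("""
SELECT s.show_id, s.show_name, s.date_added,
       COUNT(b.book_id) AS books,
       SUM(CASE WHEN b.owned = 1 THEN 1 ELSE 0 END) AS owned
FROM ttb_shows AS s
LEFT JOIN ttb_books AS b ON s.show_id = b.show_id
GROUP BY s.show_id, s.show_name, s.date_added
ORDER BY s.show_id
""").fetchall()

print(with_case)
print(with_subquery == with_case)
```

The DDL in the question shows no index on ttb_books.show_id; if none exists in the real database, adding one is probably worth trying as well, since both the join and the subquery look rows up by that column.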
Only you can really optimize this query, since you know the data. Outer joins like these tend to slow the output down. Either try to avoid the LEFT OUTER JOINs, or fetch the data in smaller chunks and then combine them into the final output.
My problem is a slow search query with a one-to-many relationship between the tables. My tables look like this.
Table Assignment
CREATE TABLE `Assignment` (
`Id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`ProjectId` int(10) unsigned NOT NULL,
`AssignmentTypeId` smallint(5) unsigned NOT NULL,
`AssignmentNumber` varchar(30) NOT NULL,
`AssignmentNumberExternal` varchar(50) DEFAULT NULL,
`DateStart` datetime DEFAULT NULL,
`DateEnd` datetime DEFAULT NULL,
`DateDeadline` datetime DEFAULT NULL,
`DateCreated` datetime DEFAULT NULL,
`Deleted` datetime DEFAULT NULL,
`Lat` double DEFAULT NULL,
`Lon` double DEFAULT NULL,
PRIMARY KEY (`Id`),
KEY `idx_assignment_assignment_type_id` (`AssignmentTypeId`),
KEY `idx_assignment_assignment_number` (`AssignmentNumber`),
KEY `idx_assignment_assignment_number_external`
(`AssignmentNumberExternal`)
) ENGINE=InnoDB AUTO_INCREMENT=5280 DEFAULT CHARSET=utf8;
Table ExtraFields
CREATE TABLE `ExtraFields` (
`assignment_id` int(10) unsigned NOT NULL,
`name` varchar(30) NOT NULL,
`value` text,
PRIMARY KEY (`assignment_id`,`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
My search query
SELECT
`Assignment`.`Id`, COL_5_72, COL_5_73, COL_5_74, COL_5_75, COL_5_76,
COL_5_77 FROM (
SELECT
`Assignment`.`Id`,
`Assignment`.`AssignmentNumber` AS COL_5_72,
`Assignment`.`AssignmentNumberExternal` AS COL_5_73 ,
`AssignmentType`.`Name` AS COL_5_74,
`Assignment`.`DateStart` AS COL_5_75,
`Assignment`.`DateEnd` AS COL_5_76,
`Assignment`.`DateDeadline` AS COL_5_77,
CASE WHEN `ExtraField`.`Name` = "WorkDistrict" THEN
`ExtraField`.`Value` end as COL_5_78 FROM `Assignment`
LEFT JOIN `ExtraFields` as `ExtraField` on
`ExtraField`.`assignment_id` = `Assignment`.`Id`
WHERE `Assignment`.`Deleted` IS NULL -- Assignment should not be removed.
AND (1=1) -- Add assignment filters.
) AS q1
GROUP BY `Assignment`.`Id`
HAVING 1 = 1
AND COL_5_78 LIKE '%Amsterdam East%'
ORDER BY COL_5_72 ASC, COL_5_73 ASC;
When the table is only around 3500 records my query takes a couple of seconds to execute and return the results.
What is a better way to search in the related data? Should I just add a JSON field to the Assignment table and use the MySQL 5.7 JSON query features? Or did I make a mistake in designing my database?
You are selecting from a subquery (a derived table), which forces MySQL to build an unindexed temporary table on every execution. Remove the subquery (you really don't need it here) and the query should be much faster.
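As a rough sketch of what "without the subquery" could look like, the search can join ExtraFields directly on the field name and filter on its value. The example below runs against Python's sqlite3 module as a stand-in for MySQL, with trimmed-down tables and invented rows; the AssignmentType join from the question is left out because that table's definition was not posted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Assignment (Id INTEGER PRIMARY KEY,
                         AssignmentNumber TEXT NOT NULL,
                         AssignmentNumberExternal TEXT,
                         Deleted TEXT);
CREATE TABLE ExtraFields (assignment_id INTEGER NOT NULL,
                          name TEXT NOT NULL,
                          value TEXT,
                          PRIMARY KEY (assignment_id, name));
-- Invented sample rows.
INSERT INTO Assignment VALUES (1, 'A-002', 'X1', NULL),
                              (2, 'A-001', 'X2', NULL),
                              (3, 'A-003', 'X3', NULL),
                              (4, 'A-004', 'X4', '2020-01-01 00:00:00'),
                              (5, 'A-005', 'X5', NULL);
INSERT INTO ExtraFields VALUES (1, 'WorkDistrict', 'Amsterdam East'),
                               (2, 'WorkDistrict', 'Region Amsterdam East 2'),
                               (3, 'WorkDistrict', 'Rotterdam'),
                               (4, 'WorkDistrict', 'Amsterdam East'),
                               (5, 'Notes',        'Amsterdam East');
""")

# Join straight to the one extra field being searched; the primary key
# (assignment_id, name) can serve this lookup, so no derived table is needed.
rows = conn.execute("""
SELECT a.Id, a.AssignmentNumber, a.AssignmentNumberExternal
FROM Assignment AS a
JOIN ExtraFields AS ef
  ON ef.assignment_id = a.Id AND ef.name = 'WorkDistrict'
WHERE a.Deleted IS NULL
  AND ef.value LIKE ?
ORDER BY a.AssignmentNumber, a.AssignmentNumberExternal
""", ("%Amsterdam East%",)).fetchall()

print(rows)
```

Because each assignment has at most one row per field name, this form needs no GROUP BY or HAVING. A LIKE pattern with a leading wildcard still has to inspect every candidate value, but only for the rows that survive the join.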
I am trying to query a large table (senddb.order_histories) that has close to 50M rows, and this is the MySQL query I am using:
FIRST APPROACH- inner join:
select a.id,
a.order_number,
a.sku_id,
a.fulfillment_status,
a.modified_by,
a.created_at,
a.updated_at
from senddb.order_line_items a
inner join (
select order_line_item_id,
order_number,
order_status,
order_status_description,
action,
modified_by,
created_at,
max(updated_at) as updated_at
from senddb.order_histories
where order_status in ('x','y','z')
and fulfillment_location = 'abcd'
group by order_line_item_id) as b
on a.id = b.order_line_item_id
and a.fulfillment_status = '2';
EXPLAIN output :
SECOND APPROACH- nested select:
select a.id,
a.order_number,
a.sku_id,
a.fulfillment_status,
a.modified_by,
a.created_at,
a.updated_at
from senddb.order_line_items a
where a.fulfillment_status = '2'
and a.id in (
select b.order_line_item_id from(
select order_line_item_id,
order_number,
order_status,
order_status_description,
action,
modified_by,
created_at,
max(updated_at) as updated_at
from senddb.order_histories
where
order_status in ('x','y','z')
and fulfillment_location = 'abcd'
group by order_line_item_id) as b);
I believe a nested select is a bad approach on large data, but I added it here anyway because it worked on my sample set. Both queries eventually time out after 600 seconds with the message: Error Code: 2013. Lost connection to MySQL server during query.
I would like to know if there are any ways to alter the query to make it run faster. I have already tried reducing the columns in the inner select / inner join, but that should not really be an issue IMO. I also looked up a solution that says "create a clustered index", but I wasn't really able to follow it. Any help is appreciated.
TABLE order_histories :
CREATE TABLE `order_histories` (
`id` int(4) unsigned NOT NULL AUTO_INCREMENT,
`order_number` varchar(24) DEFAULT NULL,
`order_status_description` varchar(255) DEFAULT NULL,
`datetime_stamp` datetime DEFAULT NULL,
`action` varchar(32) DEFAULT NULL,
`fulfillment_location` int(8) DEFAULT NULL,
`order_status` int(8) DEFAULT NULL,
`user_id` int(8) DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`modified_by` varchar(32) DEFAULT NULL,
`order_line_item_id` int(11) DEFAULT NULL,
`pooled` tinyint(1) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `order_histories_ecash_idx` (`order_number`),
KEY `order_line_item_id` (`order_line_item_id`)
) ENGINE=InnoDB AUTO_INCREMENT=454738178 DEFAULT CHARSET=latin1
TABLE order_line_items :
CREATE TABLE `order_line_items` (
`id` int(4) unsigned NOT NULL AUTO_INCREMENT,
`order_number` varchar(24) DEFAULT NULL,
`sku_id` int(8) DEFAULT NULL,
`original_price` float DEFAULT NULL,
`dept_description` varchar(100) DEFAULT NULL,
`description` varchar(100) DEFAULT NULL,
`quantity_ordered` int(8) DEFAULT NULL,
`gift_indicator` char(1) DEFAULT NULL,
`gift_wrap_flag` char(1) DEFAULT NULL,
`shipping_record_flag` char(1) DEFAULT NULL,
`gift_comments` varchar(100) DEFAULT NULL,
`item_status` char(1) DEFAULT NULL,
`tax_amount` float DEFAULT NULL,
`tax_rate` float DEFAULT NULL,
`upc` varchar(20) DEFAULT NULL,
`final_price` float DEFAULT NULL,
`line_number` int(8) DEFAULT NULL,
`master_line_number` int(8) DEFAULT NULL,
`gift_wrap_flag_type` char(1) DEFAULT NULL,
`color_code` varchar(4) DEFAULT NULL,
`size_id` varchar(6) DEFAULT NULL,
`width_id` varchar(6) DEFAULT NULL,
`brand` varchar(15) DEFAULT NULL,
`vpn` varchar(30) DEFAULT NULL,
`dept_number` int(8) DEFAULT NULL,
`class_number` int(8) DEFAULT NULL,
`non_merch_item` char(1) DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`modified_by` varchar(32) DEFAULT NULL,
`chain_id` int(11) DEFAULT NULL,
`fulfillment_location` int(11) DEFAULT NULL,
`fulfillment_date` datetime DEFAULT NULL,
`fulfillment_status` int(11) DEFAULT NULL,
`fulfillment_sales_associate` int(11) DEFAULT NULL,
`gift_wrap_line_number` int(11) DEFAULT NULL,
`shipping_type` int(11) DEFAULT NULL,
`order_track_info_id` int(11) DEFAULT NULL,
`store_tlog_updated` varchar(1) DEFAULT NULL,
`shipping_tlx_code` int(11) DEFAULT NULL,
`store_closed` tinyint(1) DEFAULT NULL,
`flags` int(11) DEFAULT NULL,
`deal_based_index` int(11) DEFAULT NULL,
`tlog_calc_ret_price` float DEFAULT NULL,
`tlog_amount` float DEFAULT NULL,
`tlog_retail_price` float DEFAULT NULL,
`tlog_ext_amount` float DEFAULT NULL,
`tlog_flag_1` int(11) DEFAULT NULL,
`tlog_flag_2` int(11) DEFAULT NULL,
`tlog_flag_3` int(11) DEFAULT NULL,
`time_remaining` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `order_line_items_ecash_idx` (`order_number`),
KEY `order_line_item_fulfillment_location_idx` (`fulfillment_location`),
KEY `order_line_item_fulfillment_status_idx` (`fulfillment_status`),
KEY `upc_idx` (`upc`),
KEY `sku_id_idx` (`sku_id`),
KEY `order_line_items_idx001` (`order_number`,`id`,`fulfillment_status`),
KEY `order_track_info_id` (`order_track_info_id`),
KEY `shipping_type_idx` (`shipping_type`,`non_merch_item`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=11367052 DEFAULT CHARSET=latin1
This query can be simplified:
select a.id,
a.order_number,
a.sku_id,
a.fulfillment_status,
a.modified_by,
a.created_at,
a.updated_at
from senddb.order_line_items a
inner join senddb.order_histories b on a.id = b.order_line_item_id
where b.order_status in ('x','y','z')
and b.fulfillment_location = 'abcd'
and a.fulfillment_status = '2';
Since you're only selecting values from table a, you don't need to select specific values from table b and can instead just apply your conditions. Outside of this, you need to ensure that b.order_line_item_id has an index on it. You can find more about creating indexes in the MySQL documentation for CREATE INDEX. I'm not an expert in MySQL, but something similar to this should work if senddb.order_histories.order_line_item_id isn't already indexed.
CREATE INDEX IX_order_histories_order_line_item_id ON order_histories (order_line_item_id);
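One thing to watch with the plain join: the original queries grouped the history rows by order_line_item_id, so a line item with several matching history rows came back once, whereas a direct join returns it once per matching history row. The sketch below illustrates this on invented data with Python's sqlite3 module standing in for MySQL, and shows an EXISTS variant that returns each line item once. The integer status and location values are placeholders for 'x', 'y', 'z' and 'abcd', and the composite index name and column order are my own guess at what might help the filter, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_line_items (id INTEGER PRIMARY KEY,
                               order_number TEXT,
                               fulfillment_status INTEGER);
CREATE TABLE order_histories (id INTEGER PRIMARY KEY,
                              order_line_item_id INTEGER,
                              order_status INTEGER,
                              fulfillment_location INTEGER);
-- Hypothetical composite index covering the filter and the join column.
CREATE INDEX ix_hist_loc_status_item
    ON order_histories (fulfillment_location, order_status, order_line_item_id);
INSERT INTO order_line_items VALUES (1, 'A', 2), (2, 'B', 2), (3, 'C', 1);
INSERT INTO order_histories VALUES
    (1, 1, 1, 77), (2, 1, 2, 77),  -- two matching history rows for item 1
    (3, 2, 1, 99),                 -- item 2: wrong location
    (4, 3, 1, 77);                 -- item 3: matches, but its status is 1
""")

# Plain join: one output row per matching history row.
join_ids = [row[0] for row in conn.execute("""
SELECT a.id
FROM order_line_items AS a
JOIN order_histories AS b ON a.id = b.order_line_item_id
WHERE b.order_status IN (1, 2, 3)
  AND b.fulfillment_location = 77
  AND a.fulfillment_status = 2
ORDER BY a.id
""")]

# EXISTS: each line item at most once, however many history rows match.
exists_ids = [row[0] for row in conn.execute("""
SELECT a.id
FROM order_line_items AS a
WHERE a.fulfillment_status = 2
  AND EXISTS (SELECT 1
              FROM order_histories AS b
              WHERE b.order_line_item_id = a.id
                AND b.order_status IN (1, 2, 3)
                AND b.fulfillment_location = 77)
ORDER BY a.id
""")]

print(join_ids, exists_ids)
```

Whether the join, the EXISTS form, or a SELECT DISTINCT is fastest on 50M rows depends on the data and the MySQL version, so comparing the EXPLAIN output of each is the safer guide.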
You need to read up on the optimization section of the MySQL docs. It contains a lot of information on how you can optimize your queries and data sets. The main idea here is to add indexes to the fields that are being used as the criteria in the WHERE clause of the SQL statements.
Basically, both of your alternatives are using a "sub-SELECT", not an INNER JOIN.
The syntax of a true JOIN is one of the following:
SELECT ...
FROM X INNER JOIN Y USING (field_list)
... or ...
SELECT ...
FROM X INNER JOIN Y ON (x.field1 = y.field2) ...
But in both cases the objects being joined are tables or views.
I'm going to presume ... admittedly, without checking ... that Nick Larsen's answer #1 adequately re-expresses your original query using JOINs.
(Notice how, in his answer, the shorthand identifiers A and B are introduced as referring to each of the two table-names mentioned in his query.)
Firstly, you need to decide whether a result set of 50 million rows is what you are asking for. Big data tables are not there so that you can select all their rows. They are there so that you can ask them questions using SQL queries. SQL is a query language, not a data loading language.
What's your purpose? If you want to copy the data, you can do that by loading it in batches, for example 1000 rows per query in a for loop. If you are loading the data for processing, you can do that in the same way.
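A batched read along those lines might look like the following sketch. It uses Python's sqlite3 module as a stand-in for a MySQL connection, a cut-down order_histories table filled with generated rows, and a hypothetical helper named iter_batches. It pages by primary key ("WHERE id > last seen id") rather than by OFFSET, which is my own choice here because OFFSET has to skip over all earlier rows on every query.

```python
import sqlite3

BATCH_SIZE = 1000

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_histories (id INTEGER PRIMARY KEY, order_status INTEGER)")
# Generated filler rows: ids 1..2500.
conn.executemany(
    "INSERT INTO order_histories (id, order_status) VALUES (?, ?)",
    [(i, i % 5) for i in range(1, 2501)],
)


def iter_batches(connection, batch_size=BATCH_SIZE):
    """Yield lists of rows, at most batch_size each, in primary-key order."""
    last_id = 0
    while True:
        rows = connection.execute(
            "SELECT id, order_status FROM order_histories "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        # Remember where this batch ended so the next query starts after it.
        last_id = rows[-1][0]


batch_sizes = [len(batch) for batch in iter_batches(conn)]
print(batch_sizes)
```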
If you want to derive statistical information, you can use an outer join and return a small number of rows using aggregate functions. But you shouldn't do that either. What you should do is decide what you want from the table and, preferably, run aggregate functions to store the useful information in a different table (in MySQL, mostly INSERT ... SELECT queries). You should never need to join a table of 50 million records in the first place.
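As an illustration of storing aggregates in a separate table, here is a minimal sketch with Python's sqlite3 module standing in for MySQL. The summary table order_history_summary and its columns are hypothetical names chosen for the example, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_histories (id INTEGER PRIMARY KEY,
                              order_line_item_id INTEGER,
                              order_status INTEGER,
                              updated_at TEXT);
INSERT INTO order_histories VALUES
    (1, 1, 1, '2020-01-01 10:00:00'),
    (2, 1, 2, '2020-01-02 09:00:00'),
    (3, 2, 1, '2020-01-03 12:00:00');

-- Hypothetical summary table: one row per line item.
CREATE TABLE order_history_summary (order_line_item_id INTEGER PRIMARY KEY,
                                    events INTEGER NOT NULL,
                                    last_updated_at TEXT);

-- Aggregate once and store the result, so later queries can join against
-- the small summary table instead of the full history.
INSERT INTO order_history_summary (order_line_item_id, events, last_updated_at)
SELECT order_line_item_id, COUNT(*), MAX(updated_at)
FROM order_histories
GROUP BY order_line_item_id;
""")

summary = conn.execute(
    "SELECT order_line_item_id, events, last_updated_at "
    "FROM order_history_summary ORDER BY order_line_item_id"
).fetchall()
print(summary)
```

A table like this has to be refreshed as new history rows arrive, so it suits reports that can tolerate slightly stale numbers.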
Telling you how to do something wrong using indexes wouldn't be the right thing here.
I have two tables:
CREATE TABLE `search_chat` (
`pubchatid` varchar(255) NOT NULL,
`profile` varchar(255) DEFAULT NULL,
`prefs` varchar(255) DEFAULT NULL,
`init` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`session` varchar(255) DEFAULT NULL,
`device` varchar(255) DEFAULT NULL,
`uid` int(10) DEFAULT NULL,
PRIMARY KEY (`pubchatid`)
);
and
CREATE TABLE `chats` (
`id` int(10) NOT NULL AUTO_INCREMENT,
`chatlog` varchar(255) DEFAULT NULL,
`block` varchar(2) DEFAULT '',
`whenadded` datetime DEFAULT NULL,
`pubchatid1` varchar(255) DEFAULT NULL,
`pubchatid2` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
);
So basically people chat with each other through a search system based on preferences. The further they are apart, the worse it is. So the query I have is simple:
SELECT *
FROM search_chat
WHERE levenshtein(profile, "[user_prefs]") < 20
AND pubchatid <> "[user_pubchatid]"
ORDER BY
levenshtein(profile, "[user_prefs]")
LIMIT 1
It is a shitty query in itself, but it does the job (everything between "[]" is a variable I put in, just to make it clear).
As you can see, this query only compares two people's preferences (prefs) with how they are (profile). So far so good.
I have been trying to make this query also check whether they have had previous chats. That is where "chats" comes in. I cannot get the query to check for a suitable user and see whether they have an open chat.
In chats, the "search_chat.pubchatid" can be either "chats.pubchatid1" or "chats.pubchatid2"
So somehow I have got to make these two work, making chats rule out options in search_chat.
Do you want something like this?
-- ... ( start of query as per your question )
and not exists (
select *
from chats
where ( ( chats.pubchatid1 = search_chat.pubchatid )
or ( chats.pubchatid2 = search_chat.pubchatid ) )
and -- ... add any restriction on how recent the chat was
)
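Put together with the query from the question, the idea could be sketched as below. This is an illustration only: it runs on Python's sqlite3 module instead of MySQL, registers a plain Python levenshtein function because SQLite has none built in, and uses invented users. It also assumes that "previous chats" means chats between the candidate and the current user, so the NOT EXISTS checks both columns against both parties; if any earlier chat at all should rule a candidate out, drop the comparisons against :me inside the subquery. Values are bound as parameters rather than pasted into the SQL string.

```python
import sqlite3


def levenshtein(a, b):
    """Edit distance between two strings (dynamic programming, row by row)."""
    if a is None or b is None:
        return None
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute or match
            ))
        previous = current
    return previous[-1]


conn = sqlite3.connect(":memory:")
conn.create_function("levenshtein", 2, levenshtein)
conn.executescript("""
CREATE TABLE search_chat (pubchatid TEXT PRIMARY KEY, profile TEXT);
CREATE TABLE chats (id INTEGER PRIMARY KEY, pubchatid1 TEXT, pubchatid2 TEXT);
-- Invented users: u1 is the closest match but has already chatted with 'me'.
INSERT INTO search_chat VALUES ('me', 'abcde'), ('u1', 'abcdf'),
                               ('u2', 'abxyz'), ('u3', 'zzzzz');
INSERT INTO chats (pubchatid1, pubchatid2) VALUES ('me', 'u1'), ('u3', 'u2');
""")

match = conn.execute("""
SELECT sc.pubchatid
FROM search_chat AS sc
WHERE levenshtein(sc.profile, :prefs) < 20
  AND sc.pubchatid <> :me
  AND NOT EXISTS (
        SELECT 1
        FROM chats AS c
        WHERE (c.pubchatid1 = sc.pubchatid AND c.pubchatid2 = :me)
           OR (c.pubchatid2 = sc.pubchatid AND c.pubchatid1 = :me)
      )
ORDER BY levenshtein(sc.profile, :prefs)
LIMIT 1
""", {"prefs": "abcde", "me": "me"}).fetchone()

print(match)
```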