How to improve the performance of this query Sql in MySql? - mysql

I have a query in SQL that takes about 15 seconds to return the result, I need to improve the performance, I've worked a few hours and I can't get a satisfactory result.
MySql 5.6
Table lote: 304053 rows
Table Prod_lista: 41525 rows
Time result: 15 Seconds
SET #CODIGO_EMPRESA = 1;
SET #CODIGO_FILIAL = 1;
SET #PESQUISA = '%';
SELECT
B.CODIGO_PRODUTO,
C.DESCRICAO,
B.SALDO_DISPONIVEL,
B.SALDO_RESERVADO,
B.SALDO_INDISPONIVEL,
B.SALDO_TERCEIROS,
B.SALDO_ENTREGUE,
C.SITUACAO
FROM
(SELECT
A.CODIGO_EMPRESA,
A.CODIGO_FILIAL,
A.CODIGO_PRODUTO,
SUM(A.SALDO_DISPONIVEL) AS SALDO_DISPONIVEL,
SUM(A.SALDO_RESERVADO) AS SALDO_RESERVADO,
SUM(A.SALDO_INDISPONIVEL) AS SALDO_INDISPONIVEL,
SUM(A.SALDO_TERCEIROS) AS SALDO_TERCEIROS,
SUM(A.SALDO_ENTREGUE) AS SALDO_ENTREGUE
FROM
LOTE A
WHERE
A.CODIGO_EMPRESA = #CODIGO_EMPRESA
AND A.CODIGO_FILIAL = #CODIGO_FILIAL
AND IFNULL(A.ENCERRADO, 0) = 0
AND A.TIPO NOT IN (4 , 5)
GROUP BY A.CODIGO_PRODUTO
ORDER BY NULL) B
INNER JOIN
PROD_LISTA C ON B.CODIGO_EMPRESA = C.CODIGO_EMPRESA
AND B.CODIGO_FILIAL = C.CODIGO_FILIAL
AND B.CODIGO_PRODUTO = C.CODIGO
AND IFNULL(C.SITUACAO, 1) = 1
WHERE
(B.CODIGO_PRODUTO LIKE #PESQUISA
OR C.CODIGOFABRICA LIKE #PESQUISA
OR C.CODIGOBARRA_COMPLETO LIKE #PESQUISA
OR C.DESCRICAO LIKE #PESQUISA
OR EXISTS( SELECT
D.CODIGOPRODUTO
FROM
PROD_CODIGO_BARRA D
WHERE
D.CODIGO_EMPRESA = #CODIGO_EMPRESA
AND D.CODIGO_FILIAL = #CODIGO_FILIAL
AND B.CODIGO_PRODUTO = D.CODIGOPRODUTO
AND D.CODIGOBARRA_COMPLETO LIKE #PESQUISA))
LIMIT 0 , 100
Structure of the batch table
CREATE TABLE `lote` (
`CODIGO_EMPRESA` int(3) unsigned NOT NULL,
`CODIGO_FILIAL` int(4) unsigned NOT NULL,
`CODIGO_LOTE` bigint(20) NOT NULL,
`CODIGO_LOTE_PAI` bigint(20) DEFAULT NULL,
`NUMERO_LOTE` varchar(15) DEFAULT NULL,
`TIPO` int(2) DEFAULT NULL ,
`DATAHORA` datetime DEFAULT NULL,
`CODIGO_DOCUMENTO` bigint(20) DEFAULT NULL,
`CODIGO_ITEM` int(5) DEFAULT NULL,
`CUSTO` decimal(21,10) DEFAULT '0.0000000000',
`CODIGO_PRODUTO` varchar(25) DEFAULT NULL,
`DESTINO_INICIAL` int(1) DEFAULT NULL,
`SALDO_INICIAL` decimal(15,4) DEFAULT '0.0000',
`SALDO_DISPONIVEL` decimal(15,4) DEFAULT '0.0000',
`SALDO_INDISPONIVEL` decimal(15,4) DEFAULT '0.0000',
`SALDO_TERCEIROS` decimal(15,4) DEFAULT '0.0000',
`SALDO_RESERVADO` decimal(15,4) DEFAULT '0.0000',
`SALDO_ENTREGUE` decimal(15,4) DEFAULT '0.0000',
`VENCIMENTO` date DEFAULT NULL,
`ENCERRADO` int(1) DEFAULT '0',
`CODIGO_CONTAGEM` bigint(11) unsigned DEFAULT NULL,
PRIMARY KEY (`CODIGO_EMPRESA`,`CODIGO_FILIAL`,`CODIGO_LOTE`),
KEY `IDX_CODIGO_LOTE` (`CODIGO_LOTE`),
KEY `IDX_CODIGO_PRODUTO` (`CODIGO_EMPRESA`,`CODIGO_FILIAL`,`CODIGO_PRODUTO`),
KEY `IDX_CONSULTA_PDV` (`CODIGO_EMPRESA`,`CODIGO_FILIAL`,`CODIGO_PRODUTO`,`ENCERRADO`,`TIPO`),
KEY `IDX_CONTAGEM_ESTOQUE` (`CODIGO_EMPRESA`,`CODIGO_FILIAL`,`CODIGO_PRODUTO`,`TIPO`,`ENCERRADO`),
KEY `IDX_CUSTO` (`CODIGO_EMPRESA`,`CODIGO_FILIAL`,`CUSTO`),
KEY `IDX_DOCUMENTO_ITEM` (`CODIGO_EMPRESA`,`CODIGO_FILIAL`,`CODIGO_DOCUMENTO`,`CODIGO_ITEM`),
KEY `IDX_EMPRESA_FILIAL` (`CODIGO_EMPRESA`,`CODIGO_FILIAL`),
KEY `IDX_ESTOQUE` (`CODIGO_EMPRESA`,`CODIGO_FILIAL`,`TIPO`,`ENCERRADO`),
KEY `IDX_FILTRO` (`CODIGO_EMPRESA`,`CODIGO_FILIAL`,`CODIGO_LOTE`,`DATAHORA`),
KEY `IDX_LOTE_COMP` (`CODIGO_EMPRESA`,`CODIGO_FILIAL`,`CODIGO_PRODUTO`,`DATAHORA`),
KEY `IDX_TIPO` (`TIPO`),
KEY `IDX_TIPO_CODIGO` (`CODIGO_EMPRESA`,`CODIGO_FILIAL`,`TIPO`,`CODIGO_PRODUTO`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
How can I improve SQL to have a faster result?

OR is a performance killer; try to avoid it.
B.CODIGO_PRODUTO LIKE "%" is equivalent to TRUE, but the Optimizer does not see that. It would be much better to construct the query dynamically instead of using #variables and picking values that eliminate clauses.
Also of importance is that the Optimizer may be able to use an index when there is not a leading wildcard:
x LIKE 'A%' -- possibly uses index
x LIKE '%' -- cannot use index
x LIKE '%B' -- cannot use index
x LIKE '#foo' -- cannot use index
int(4) -- the (4) is irrelevant. All INTs are the same 4-byte datatype. See SMALLINT and similar ones for smaller datatypes.
Don't blindly use NULL (most of your columns are NULLable); leave NULL for optional / unknown / etc, values.
IFNULL(A.ENCERRADO, 0) = 0 -- If you can arrange for this to have 2 values (0 and 1), you can simplify this expression and avoid using a function. Functions make expressions not 'sargable'. Once you have done that, the suggested index below may be useful.
When you have INDEX(a,b,c), there is no need for also having INDEX(a,b). For example: IDX_EMPRESA_FILIAL can be dropped.
These indexes may help:
D: (CODIGOPRODUTO, CODIGO_FILIAL, CODIGO_EMPRESA, CODIGOBARRA_COMPLETO)
A: (ENCERRADO, CODIGO_FILIAL, TIPO, CODIGO_EMPRESA)
There may be more suggestions. Apply most of the above, then come back for more suggestions. (And provide the other SHOW CREATE TABLEs.)

Correlated subqueries, like your EXISTS one, can be costly if they are correlated with a large number of results from the outer query. I'd suggest converting your EXISTS condition into something like this:
OR B.CODIGO_PRODUTO IN (
SELECT D.CODIGOPRODUTO
FROM PROD_CODIGO_BARRA AS D
WHERE D.CODIGO_EMPRESA = #CODIGO_EMPRESA
AND D.CODIGO_FILIAL = #CODIGO_FILIAL
AND D.CODIGOBARRA_COMPLETO LIKE #PESQUISA
)
With your correlated version, the subquery ends up being evaluated separately for every row coming out of the main FROM. But with this non-correlated version, the subquery is only evaluated once, and it's result set is used to check the rows coming out of the main FROM.

Related

Improve Laravel Eloquent Query

I have this relation in my model...
$this->hasMany('App\Inventory')->where('status',1)
->whereNull('deleted_at')
->where(function($query){
$query
->where('count', '>=', 1)
->orWhere(function($aQuery){
$aQuery
->where('count', '=' , 0)
->whereHas('containers', function($bQuery){
$bQuery->whereIn('status', [0,1]);
});
});
})
->orderBy('updated_at','desc')
->with('address', 'cabin');
And Sql query generated are :
select
*
from
`inventories`
where
`inventories`.`user_id` = 17
and `inventories`.`user_id` is not null
and `status` = 1
and `deleted_at` is null
and (
`count` >= 1
or (
`count` = 0
and exists (
select
*
from
`containers`
where
`inventories`.`id` = `containers`.`inventory_id`
and `status` in (0, 1)
)
)
)
and `inventories`.`deleted_at` is null
order by
`updated_at` desc
limit
10 offset 0
Unfortunately this take more than 2sec in MySql,
There are anyways to improve and reduce the query time for this?!
Each inventory has many containers. when inventory count is 0 (0 mean out of stock but sometimes there are disabled containers that mean inventory is not out of stock yet.) the real count is depend on count of containers with status [0,1] (containers have other statuses...).
I have an idea to have a column on inventory to count containers with [0,1] status, and update it in other processes to improve this query. but this take too much time and need to modify other process.
Inventories show create table
CREATE TABLE `inventories` (
`id` bigint unsigned NOT NULL AUTO_INCREMENT,
`user_id` bigint unsigned NOT NULL,
`cabin_id` bigint unsigned NOT NULL,
`address_id` bigint unsigned NOT NULL,
`count` mediumint NOT NULL,
`status` mediumint NOT NULL,
`name` varchar(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`available_at` datetime DEFAULT NULL,
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
`deleted_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=37837 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
Containers show create table
CREATE TABLE `containers` (
`id` bigint unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`inventory_id` bigint unsigned NOT NULL,
`order_id` bigint unsigned DEFAULT NULL,
`status` tinyint unsigned NOT NULL DEFAULT '1',
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=64503 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
Used Solution due comments (Thanks to #ysth #vixducis #Breezer ):
Changed Containers engine from MyISAM to InnoDB ,
Added INDEX to containers.inventory_id
And optimize code like below and limit whereHas select query
$this->hasMany('App\Inventory')->where('status',1)
->whereNull('deleted_at')
->where(function($query){
$query
->where('count', '>=', 1)
->orWhere('count', '=' , 0)
->whereHas('containers', function ($bQuery) {
$bQuery
->select('inventory_id')
->whereIn('status', [0, 1]);
});
})
->orderBy('updated_at','desc')
->with('address', 'cabin');
for whereHas we can use whereIn and subQuery like below
->whereIn('id', function ($subQuery) {
$subQuery->select('inventory_id')
->from('containers')
->whereIn('status', [0, 1]);
});
and for limiting select of dosentHave
->doesntHave('requests', 'and', function($query){
$query->select('inventory_id');
})
It looks like the containers table is still running on the MyISAM engine. While that engine is not deprecated, the development focus has shifted heavily towards InnoDB, and it should be a lot more performant. Switching to InnoDB is a good first step.
Secondly, I see that there is no index on containers.inventory_id. When experiencing performance issues when relating two tables, it's often a good idea to check whether adding an index on the column that relates the tables improves performance.
These two steps should make your query a lot faster.
when your data is big, whereHas statement sometimes run slowly because it use exists syntax. For more detailed explanation, you can read from this post.
To solve this, I prefer you to use mpyw/eloquent-has-by-non-dependent-subquery because it will use in syntax which will improve the performance. I already used this package on my project, and no problem until now.
Change to InnoDB.
inventories needs this composite index: INDEX(user_id, status, deleted_at, updated_at)
containers needs this composite index, not simply (inventory_id), but (inventory_id, status).
Redundant: inventories.user_id is not null because the test for 17 requires NOT NULL.
Redundant: deleted_at is null simply because it is in the query twice.

Speed Up A Large Insert From Select Query With Multiple Joins

I'm trying to denormalize a few MySQL tables I have into a new table that I can use to speed up some complex queries with lots of business logic. The problem that I'm having is that there are 2.3 million records I need to add to the new table and to do that I need to pull data from several tables and do a few conversions too. Here's my query (with names changed)
INSERT INTO database_name.log_set_logs
(offload_date, vehicle, jurisdiction, baselog_path, path,
baselog_index_guid, new_location, log_set_name, index_guid)
(
select STR_TO_DATE(logset_logs.offload_date, '%Y.%m.%d') as offload_date,
logset_logs.vehicle, jurisdiction, baselog_path, path,
baselog_trees.baselog_index_guid, new_location, logset_logs.log_set_name,
logset_logs.index_guid
from
(
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(path, '/', 7), '/', -1) as offload_date,
SUBSTRING_INDEX(SUBSTRING_INDEX(path, '/', 8), '/', -1) as vehicle,
SUBSTRING_INDEX(path, '/', 9) as baselog_path, index_guid,
path, log_set_name
FROM database_name.baselog_and_amendment_guid_to_path_mappings
) logset_logs
left join database_name.log_trees baselog_trees
ON baselog_trees.original_location = logset_logs.baselog_path
left join database_name.baselog_offload_location location
ON location.baselog_index_guid = baselog_trees.baselog_index_guid);
The query itself works because I was able to run it using a filter on log_set_name however that filter's condition will only work for less than 1% of the total records because one of the values for log_set_name has 2.2 million records in it which is the majority of the records. So there is nothing else I can use to break this query up into smaller chunks from what I can see. The problem is that the query is taking too long to run on the rest of the 2.2 million records and it ends up timing out after a few hours and then the transaction is rolled back and nothing is added to the new table for the 2.2 million records; only the 0.1 million records were able to be processed and that was because I could add a filter that said where log_set_name != 'value with the 2.2 million records'.
Is there a way to make this query more performant? Am I trying to do too many joins at once and perhaps I should populate the row's columns in their own individual queries? Or is there some way I can page this type of query so that MySQL executes it in batches? I already got rid of all my indexes on the log_set_logs table because I read that those will slow down inserts. I also jacked my RDS instance up to a db.r4.4xlarge write node. I am also using MySQL Workbench so I increased all of it's timeout values to their maximums giving them all nines. All three of these steps helped and were necessary in order for me to get the 1% of the records into the new table but it still wasn't enough to get the 2.2 million records without timing out. Appreciate any insights as I'm not adept to this type of bulk insert from a select.
'CREATE TABLE `log_set_logs` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`purged` tinyint(1) NOT NULL DEFAUL,
`baselog_path` text,
`baselog_index_guid` varchar(36) DEFAULT NULL,
`new_location` text,
`offload_date` date NOT NULL,
`jurisdiction` varchar(20) DEFAULT NULL,
`vehicle` varchar(20) DEFAULT NULL,
`index_guid` varchar(36) NOT NULL,
`path` text NOT NULL,
`log_set_name` varchar(60) NOT NULL,
`protected_by_retention_condition_1` tinyint(1) NOT NULL DEFAULT ''1'',
`protected_by_retention_condition_2` tinyint(1) NOT NULL DEFAULT ''1'',
`protected_by_retention_condition_3` tinyint(1) NOT NULL DEFAULT ''1'',
`protected_by_retention_condition_4` tinyint(1) NOT NULL DEFAULT ''1'',
`general_comments_about_this_log` text,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1736707 DEFAULT CHARSET=latin1'
'CREATE TABLE `baselog_and_amendment_guid_to_path_mappings` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`path` text NOT NULL,
`index_guid` varchar(36) NOT NULL,
`log_set_name` varchar(60) NOT NULL,
PRIMARY KEY (`id`),
KEY `log_set_name_index` (`log_set_name`),
KEY `path_index` (`path`(42))
) ENGINE=InnoDB AUTO_INCREMENT=2387821 DEFAULT CHARSET=latin1'
...
'CREATE TABLE `baselog_offload_location` (
`baselog_index_guid` varchar(36) NOT NULL,
`jurisdiction` varchar(20) NOT NULL,
KEY `baselog_index` (`baselog_index_guid`),
KEY `jurisdiction` (`jurisdiction`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1'
'CREATE TABLE `log_trees` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`baselog_index_guid` varchar(36) DEFAULT NULL,
`original_location` text NOT NULL, -- This is what I have to join everything on and since it's text I cannot index it and the largest value is above 255 characters so I cannot change it to a vachar then index it either.
`new_location` text,
`distcp_returncode` int(11) DEFAULT NULL,
`distcp_job_id` text,
`distcp_stdout` text,
`distcp_stderr` text,
`validation_attempt` int(11) NOT NULL DEFAULT ''0'',
`validation_result` tinyint(1) NOT NULL DEFAULT ''0'',
`archived` tinyint(1) NOT NULL DEFAULT ''0'',
`archived_at` timestamp NULL DEFAULT NULL,
`created_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`dir_exists` tinyint(1) NOT NULL DEFAULT ''0'',
`random_guid` tinyint(1) NOT NULL DEFAULT ''0'',
`offload_date` date NOT NULL,
`vehicle` varchar(20) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `baselog_index_guid` (`baselog_index_guid`)
) ENGINE=InnoDB AUTO_INCREMENT=1028617 DEFAULT CHARSET=latin1'
baselog_offload_location has not PRIMARY KEY; what's up?
GUIDs/UUIDs can be terribly inefficient. A partial solution is to convert them to BINARY(16) to shrink them. More details here: http://localhost/rjweb/mysql/doc.php/uuid ; (MySQL 8.0 has similar functions.)
It would probably be more efficient if you have a separate (optionally redundant) column for vehicle rather than needing to do
SUBSTRING_INDEX(SUBSTRING_INDEX(path, '/', 8), '/', -1) as vehicle
Why JOIN baselog_offload_location? Three seems to be no reference to columns in that table. If there, be sure to qualify them so we know what is where. Preferably use short aliases.
The lack of an index on baselog_index_guid may be critical to performance.
Please provide EXPLAIN SELECT ... for the SELECT in your INSERT and for the original (slow) query.
SELECT MAX(LENGTH(original_location)) FROM .. -- to see if it really is too big to index. What version of MySQL are you using? The limit increased recently.
For the above item, we can talk about having a 'hash'.
"paging the query". I call it "chunking". See http://mysql.rjweb.org/doc.php/deletebig#deleting_in_chunks . That talks about deleting, but it can be adapted to INSERT .. SELECT since you want to "chunk" the select. If you go with chunking, Javier's comment becomes moot. Your code would be chunking the selects, hence batching the inserts:
Loop:
INSERT .. SELECT .. -- of up to 1000 rows (see link)
End loop

Improve query speed suggestions

For self education I am developing an invoicing system for an electricity company. I have multiple time series tables, with different intervals. One table represents consumption, two others represent prices. A third price table should be still incorporated. Now I am running calculation queries, but the queries are slow. I would like to improve the query speed, especially since this is only the beginning calculations and the queries will only become more complicated. Also please note that this is my first database i created and exercises I have done. A simplified explanation is preferred. Thanks for any help provided.
I have indexed: DATE, PERIOD_FROM, PERIOD_UNTIL in each table. This speed up the process from 60 seconds to 5 seconds.
The structure of the tables is the following:
CREATE TABLE `apxprice` (
`APX_id` int(11) NOT NULL AUTO_INCREMENT,
`DATE` date DEFAULT NULL,
`PERIOD_FROM` time DEFAULT NULL,
`PERIOD_UNTIL` time DEFAULT NULL,
`PRICE` decimal(10,2) DEFAULT NULL,
PRIMARY KEY (`APX_id`)
) ENGINE=MyISAM AUTO_INCREMENT=28728 DEFAULT CHARSET=latin1
CREATE TABLE `imbalanceprice` (
`imbalanceprice_id` int(11) NOT NULL AUTO_INCREMENT,
`DATE` date DEFAULT NULL,
`PTU` tinyint(3) DEFAULT NULL,
`PERIOD_FROM` time DEFAULT NULL,
`PERIOD_UNTIL` time DEFAULT NULL,
`UPWARD_INCIDENT_RESERVE` tinyint(1) DEFAULT NULL,
`DOWNWARD_INCIDENT_RESERVE` tinyint(1) DEFAULT NULL,
`UPWARD_DISPATCH` decimal(10,2) DEFAULT NULL,
`DOWNWARD_DISPATCH` decimal(10,2) DEFAULT NULL,
`INCENTIVE_COMPONENT` decimal(10,2) DEFAULT NULL,
`TAKE_FROM_SYSTEM` decimal(10,2) DEFAULT NULL,
`FEED_INTO_SYSTEM` decimal(10,2) DEFAULT NULL,
`REGULATION_STATE` tinyint(1) DEFAULT NULL,
`HOUR` int(2) DEFAULT NULL,
PRIMARY KEY (`imbalanceprice_id`),
KEY `DATE` (`DATE`,`PERIOD_FROM`,`PERIOD_UNTIL`)
) ENGINE=MyISAM AUTO_INCREMENT=117427 DEFAULT CHARSET=latin
CREATE TABLE `powerload` (
`powerload_id` int(11) NOT NULL AUTO_INCREMENT,
`EAN` varchar(18) DEFAULT NULL,
`DATE` date DEFAULT NULL,
`PERIOD_FROM` time DEFAULT NULL,
`PERIOD_UNTIL` time DEFAULT NULL,
`POWERLOAD` int(11) DEFAULT NULL,
PRIMARY KEY (`powerload_id`)
) ENGINE=MyISAM AUTO_INCREMENT=61039 DEFAULT CHARSET=latin
Now when running this query:
SELECT i.DATE, i.PERIOD_FROM, i.TAKE_FROM_SYSTEM, i.FEED_INTO_SYSTEM,
a.PRICE, p.POWERLOAD, sum(a.PRICE * p.POWERLOAD)
FROM imbalanceprice i, apxprice a, powerload p
WHERE i.DATE = a.DATE
and i.DATE = p.DATE
AND i.PERIOD_FROM >= a.PERIOD_FROM
and i.PERIOD_FROM = p.PERIOD_FROM
AND i.PERIOD_FROM < a.PERIOD_UNTIL
AND i.DATE >= '2018-01-01'
AND i.DATE <= '2018-01-31'
group by i.DATE
I have run the query with explain and get the following result: Select_type, all simple partitions all null possible keys a,p = null i = DATE Key a,p = null i = DATE key_len a,p = null i = 8 ref a,p = null i = timeseries.a.DATE,timeseries.p.PERIOD_FROM rows a = 28727 p = 61038 i = 1 filtered a = 100 p = 10 i = 100 a extra: using where using temporary using filesort b extra: using where using join buffer (block nested loop) c extra: null
Preferably I run a more complicated query for a whole year and group by month for example with all price tables incorporated. However, this would be too slow. I have indexed: DATE, PERIOD_FROM, PERIOD_UNTIL in each table. The calculation result may not be changed, in this case quarter hourly consumption of two meters multiplied by hourly prices.
"Categorically speaking," the first thing you should look at is indexes.
Your clauses such as WHERE i.DATE = a.DATE ... are categorically known as INNER JOINs, and the SQL engine needs to have the ability to locate the matching rows "instantly." (That is to say, without looking through the entire table!)
FYI: Just like any index in real-life – here I would be talking about "library card catalogs" if we still had such a thing – indexes will assist both "equal to" and "less/greater than" queries. The index takes the computer directly to a particular point in the data, whether that's a "hit" or a "near miss."
Finally, the EXPLAIN verb is very useful: put that word in front of your query, and the SQL engine should "explain to you" exactly how it intends to carry out your query. (The SQL engine looks at the structure of the database to make that decision.) Although the EXPLAIN output is ... (heh) ... "not exactly standardized," it will help you to see if the computer thinks that it needs to do something very time-wasting in order to deliver your answer.

Why is this query being logged as "not using indexes"?

For some reason my slow query log is reporting the following query as "not using indexes" and for the life of me I cannot understand why.
Here is the query:
update scheduletask
set active = 0
where nextrun < date_sub( now(), interval 2 minute )
and enabled = 1
and active = 1;
Here is the table:
CREATE TABLE `scheduletask` (
`scheduletaskid` int(11) NOT NULL AUTO_INCREMENT,
`schedulethreadid` int(11) NOT NULL,
`taskname` varchar(50) NOT NULL,
`taskpath` varchar(100) NOT NULL,
`tasknote` text,
`recur` int(11) NOT NULL,
`taskinterval` int(11) NOT NULL,
`lastrunstart` datetime NOT NULL,
`lastruncomplete` datetime NOT NULL,
`nextrun` datetime NOT NULL,
`active` int(11) NOT NULL,
`enabled` int(11) NOT NULL,
`creatorid` int(11) NOT NULL,
`editorid` int(11) NOT NULL,
`created` datetime NOT NULL,
`edited` datetime NOT NULL,
PRIMARY KEY (`scheduletaskid`),
UNIQUE KEY `Name` (`taskname`),
KEY `IDX_NEXTRUN` (`nextrun`)
) ENGINE=InnoDB AUTO_INCREMENT=34 DEFAULT CHARSET=latin1;
Add another index like this
KEY `IDX_COMB` (`nextrun`, `enabled`, `active`)
I'm not sure how many rows your table have but the following might apply as well
Sometimes MySQL does not use an index, even if one is available. One
circumstance under which this occurs is when the optimizer estimates
that using the index would require MySQL to access a very large
percentage of the rows in the table. (In this case, a table scan is
likely to be much faster because it requires fewer seeks.)
try using the "explain" command in mysql.
http://dev.mysql.com/doc/refman/5.5/en/explain.html
I think explain only works on select statements, try:
explain select * from scheduletask where nextrun < date_sub( now(), interval 2 minute ) and enabled = 1 and active = 1;
Maybe if you use, nextrun = ..., it will macht the key IDX_NEXTRUN. In your where clause has to be one of your keys, scheduletaskid, taskname or nextrun
Sorry for the short answer but I don't have time to write a complete solution.
I believe you can fix your issue by saving date_sub( now(), interval 2 minute ) in a temporary variable before using it in the query, see here maybe: MySql How to set a local variable in an update statement (Syntax?).

mysql query optimization (wrong indexes?) avoid filesort

I will try to explain myself quickly.
I have a database called 'artikli' which has about 1M records.
On this table i run a lots of different queryies but 1 particular is causing problems (long execution time) when ORDER by is present.
This is my table structure:
CREATE TABLE IF NOT EXISTS artikli (
id int(11) NOT NULL,
name varchar(250) NOT NULL,
datum datetime NOT NULL,
kategorije_id int(11) default NULL,
id_valute int(11) default NULL,
podogovoru int(1) default '0',
cijena decimal(10,2) default NULL,
valuta int(1) NOT NULL default '0',
cijena_rezerva decimal(10,0) NOT NULL,
cijena_kupi decimal(10,0) default NULL,
cijena_akcija decimal(10,2) NOT NULL,
period int(3) NOT NULL default '30',
dostupnost enum('svugdje','samobih','samomojgrad','samomojkanton') default 'svugdje',
zemlja varchar(10) NOT NULL,
slike varchar(500) NOT NULL,
od_s varchar(34) default NULL,
od_id int(10) unsigned default NULL,
vrsta int(1) default '0',
trajanje datetime default NULL,
izbrisan int(1) default '0',
zakljucan int(1) default '0',
prijava int(3) default '0',
izdvojen decimal(1,0) NOT NULL default '0',
izdvojen_kad datetime NOT NULL,
izdvojen_datum datetime NOT NULL,
sajt int(1) default '0',
PRIMARY KEY (id),
KEY brend (brend),
KEY kanton (kanton),
KEY datum (datum),
KEY cijena (cijena),
KEY kategorije_id (kategorije_id,podogovoru,sajt,izdvojen,izdvojen_kad,datum)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
And this is the query:
SELECT artikli.datum as brojx,
artikli.izdvojen as i,
artikli.izdvojen_kad as ii,
artikli.cijena as cijena, artikli.name
FROM artikli
WHERE artikli.izbrisan=0 and artikli.prodano!=3
and artikli.zavrseno=0 and artikli.od_id!=0
and (artikli.sajt=0 or (artikli.sajt=1 and artikli.dostupnost='svugdje'))
and kategorije_id IN (18)
ORDER by i DESC, ii DESC, brojx DESC
LIMIT 0,20
What i want to do is to avoid Filesort which is very slow.
It would have been a big help if you'd provided the explain plan for the query.
Why do you think its the filesort which is causing the problem? Looking at the query you seem to be applying a lot filtering - which should reduce the output set significantly - but none of can use the available indexes.
artikli.izbrisan=0 and artikli.prodano!=3
and artikli.zavrseno=0 and artikli.od_id!=0
and (artikli.sajt=0 or (artikli.sajt=1 and artikli.dostupnost='svugdje'))
and kategorije_id IN (18)
Although I don't know what the pattern of your data is, I suspect that you might get a lot more benefit by adding an index on :
kategorije_id,izbrisan,sajt
Are all those other indexes really being used already?
Although you'd get a LOT more bang for your buck by denormalizing all those booleans (assuming that the table is normalised to start with and there are not hidden functional dependencies in there).
C.
The problem is that you don't have an index on the izdvojen, izdvojen_kad and datum columns that are used by the ORDER BY.
Note that the large index you have starting with kategorije_id can't be used for sorting (although it will help somewhat with the where clause) because the columns you are sorting by are at the end of the index.
Actually, the order by is not the basis for the index you want... but the CRITERIA you want to mostly match the query... Filter the smaller set of data out, you'll get smaller set of the table... I would change the WHERE clause a bit, but you'll know your data best. Put your smallest expected condition first and ensure an index is based on that... something like
WHERE
artikli.izbrisan = 0
and artikli.zavrseno = 0
and artikli.kategorije_id IN (18)
and artikli.prodano != 3
and artikli.od_id != 0
and ( artikli.sajt = 0
or ( artikli.sajt = 1
and artikli.dostupnost='svugdje')
)
and having a compound index on (izbrisan, zavrseno, kategorije_id)... I've mode the other != comparisons after as they are not specific key values, instead, they are ALL EXCEPT the value in question.