I have a CRM system that generates attributes using an EAV model. The problem, as you may well be aware, is that the EAV model requires complex queries to pull the data, since each attribute has to be returned in a separate column.
When using subqueries, MySQL performance sucks. I have to find a better way to write my queries by analyzing them using the given WHERE clause, sort order, and LIMIT (if any).
By subquery I am referring to a query that looks like this:
SELECT a.account_name, a.account_type, a.status, a.account_id, s.fieldValue, s2.last_training_on, s3.fieldValue
FROM accounts AS a
INNER JOIN clients AS c ON c.client_id = a.client_id
LEFT JOIN (
SELECT p.related_to AS account_id, decimal_value AS fieldValue
FROM df_answers_text AS p
INNER JOIN df_field_to_client_relation AS r ON r.field_id = p.field_id
WHERE p.field_id = '19' AND r.client_id = '7'
) AS s ON s.account_id = a.account_id
LEFT JOIN (
SELECT p.related_to AS account_id, datetime_value AS last_training_on
FROM df_answers_text AS p
INNER JOIN df_field_to_client_relation AS r ON r.field_id = p.field_id
WHERE p.field_id = '10' AND r.client_id = '7'
) AS s2 ON s2.account_id = a.account_id
LEFT JOIN (
SELECT
p.related_to
, CAST(GROUP_CONCAT(o.label SEPARATOR " | ") AS CHAR(255)) AS fieldValue
FROM df_answer_predefined AS p
INNER JOIN df_fields_options AS o ON o.option_id = p.option_id
INNER JOIN df_field_to_client_relation AS r ON r.field_id = o.field_id
WHERE o.is_place_holder = 0 AND o.field_id = '16' AND r.field_id = '16' AND r.client_id = '7'
GROUP BY p.related_to
) AS s3 ON s3.related_to = a.account_id
WHERE c.client_id = '7' AND c.status = 'Active' AND ( a.account_type = 'TEST' OR a.account_type = 'VALUE' OR s2.last_training_on > '2015-01-01 00:00:00') AND (s.fieldValue = 'Medium' OR s.fieldValue = 'Low' OR a.expType = 'Very High')
ORDER BY a.account_name
LIMIT 500;
I thought about creating temporary tables using the MEMORY engine with the content of the subqueries, like this:
CREATE TEMPORARY TABLE s (KEY(account_id, fieldValue)) ENGINE = MEMORY
SELECT p.related_to AS account_id, decimal_value AS fieldValue
FROM df_answers_text AS p
INNER JOIN df_field_to_client_relation AS r ON r.field_id = p.field_id
WHERE p.field_id = '19' AND r.client_id = '7';
CREATE TEMPORARY TABLE s2 (KEY(account_id), INDEX USING BTREE (last_training_on)) ENGINE = MEMORY
SELECT p.related_to AS account_id, datetime_value AS last_training_on
FROM df_answers_text AS p
INNER JOIN df_field_to_client_relation AS r ON r.field_id = p.field_id
WHERE p.field_id = '10' AND r.client_id = '7';
CREATE TEMPORARY TABLE s3 (KEY(related_to, fieldValue)) ENGINE = MEMORY
SELECT
p.related_to
, CAST(GROUP_CONCAT(o.label SEPARATOR " | ") AS CHAR(255)) AS fieldValue
FROM df_answer_predefined AS p
INNER JOIN df_fields_options AS o ON o.option_id = p.option_id
INNER JOIN df_field_to_client_relation AS r ON r.field_id = o.field_id
WHERE o.is_place_holder = 0 AND o.field_id = '16' AND r.field_id = '16' AND r.client_id = '7'
GROUP BY p.related_to;
Then my new query will look like this
SELECT a.account_name, a.account_type, a.status, a.account_id, s.fieldValue, s2.last_training_on, s3.fieldValue
FROM accounts AS a
INNER JOIN clients AS c ON c.client_id = a.client_id
LEFT JOIN s ON s.account_id = a.account_id
LEFT JOIN s2 ON s2.account_id = a.account_id
LEFT JOIN s3 ON s3.related_to = a.account_id
WHERE c.client_id = '7' AND c.status = 'Active' AND ( a.account_type = 'TEST' OR a.account_type = 'VALUE' OR s2.last_training_on > '2015-01-01 00:00:00') AND (s.fieldValue = 'Medium' OR s.fieldValue = 'Low' OR a.expType = 'Very High')
ORDER BY a.account_name
LIMIT 500;
DROP TEMPORARY TABLE s, s2, s3;
The problem I am facing now is that each temporary table is built from the entire data set in the database, which consumes a lot of time; but my outer query is only looking for 500 records sorted by a.account_name. If a temporary table has 1 million records, that is a waste of time and obviously gives me bad performance.
I am looking for a better way to pass the clauses down to the subqueries, so that I only create temporary tables with the data the outer query actually needs.
Note: these queries are generated dynamically by a GUI. I am unable to figure out how to extract the logic/clauses and properly pass them to the subqueries.
QUESTIONS
How can I look at the WHERE clause, parse it, and pass it down to the subqueries to reduce the amount of data in them? If all the conditions were combined with "AND" my life would be easier, but since I have a mix of "AND" and "OR" it is very complex.
Is there a better approach to this problem than using temporary tables?
EDITED
Here are my table definitions
CREATE TABLE df_answer_predefined (
answer_id int(11) unsigned NOT NULL AUTO_INCREMENT,
field_id int(11) unsigned DEFAULT NULL,
related_to int(11) unsigned DEFAULT NULL,
option_id int(11) unsigned DEFAULT NULL,
created_by int(11) unsigned NOT NULL,
created_on datetime DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (answer_id),
UNIQUE KEY un_row (field_id,option_id,related_to),
KEY field_id (field_id),
KEY related_to (related_to),
KEY to_delete (field_id,related_to),
KEY outter_view (field_id,option_id,related_to)
) ENGINE=InnoDB AUTO_INCREMENT=4946214 DEFAULT CHARSET=utf8;
CREATE TABLE df_fields_options (
option_id int(11) unsigned NOT NULL AUTO_INCREMENT,
field_id int(11) unsigned NOT NULL,
label varchar(255) DEFAULT NULL,
is_place_holder tinyint(1) NOT NULL DEFAULT '0',
is_default tinyint(1) NOT NULL DEFAULT '0',
sort smallint(3) NOT NULL DEFAULT '1',
status tinyint(1) NOT NULL DEFAULT '1',
PRIMARY KEY (option_id),
KEY i (field_id),
KEY d (option_id,field_id,is_place_holder)
) ENGINE=InnoDB AUTO_INCREMENT=155 DEFAULT CHARSET=utf8;
CREATE TABLE df_field_to_client_relation (
relation_id int(11) unsigned NOT NULL AUTO_INCREMENT,
client_id int(11) unsigned DEFAULT NULL,
field_id int(11) unsigned DEFAULT NULL,
PRIMARY KEY (relation_id),
UNIQUE KEY unique_row (field_id,client_id),
KEY client_id (client_id),
KEY flient_id (field_id)
) ENGINE=InnoDB AUTO_INCREMENT=26 DEFAULT CHARSET=utf8;
CREATE TABLE df_answers_text (
answer_id int(11) unsigned NOT NULL AUTO_INCREMENT,
notes varchar(20000) DEFAULT NULL,
datetime_value datetime DEFAULT NULL,
date_value date DEFAULT NULL,
us_phone_number char(10) DEFAULT NULL,
field_id int(11) unsigned DEFAULT NULL,
related_to int(11) unsigned DEFAULT NULL,
created_by int(11) unsigned NOT NULL,
created_on datetime DEFAULT CURRENT_TIMESTAMP,
modified_by int(11) DEFAULT NULL,
modified_on datetime DEFAULT NULL,
big_unsigned_value bigint(20) DEFAULT NULL,
big_signed_value bigint(19) DEFAULT NULL,
unsigned_value int(11) DEFAULT NULL,
signed_value int(10) DEFAULT NULL,
decimal_value decimal(18,4) DEFAULT NULL,
PRIMARY KEY (answer_id),
UNIQUE KEY unique_answer (field_id,related_to),
KEY field_id (field_id),
KEY related_to (related_to),
KEY big_unsigned_value (big_unsigned_value),
KEY big_signed_value (big_signed_value),
KEY unsigned_value (unsigned_value),
KEY signed_value (signed_value),
KEY decimal_Value (decimal_value)
) ENGINE=InnoDB AUTO_INCREMENT=2458748 DEFAULT CHARSET=utf8;
The query that takes the most time is the third subquery, the one with the alias s3.
Here is the execution plan for the query that is taking a long time (about 2 seconds):
UNIQUE(a,b,c)
INDEX (a)
DROP the INDEX, since the UNIQUE key is an INDEX and the INDEX is a prefix of the UNIQUE.
PRIMARY KEY(d)
UNIQUE(a,b,c)
Why have d at all? Simply say PRIMARY KEY(a,b,c).
FROM ( SELECT ... )
JOIN ( SELECT ... ) ON ...
optimizes poorly (until 5.6.6). Whenever possible turn JOIN ( SELECT ) into a JOIN with the table. As you suggested, using tmp tables may be better, if you can add a suitable index to the tmp table. Best is to try to avoid more than one "table" that is a subquery.
In a many-to-many relation table, don't include an id for the table, instead have only
PRIMARY KEY (a,b), -- for enforcing uniqueness, providing a PK, and going one direction
INDEX (b,a) -- for going the other way.
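Applied to df_field_to_client_relation from the question, that advice might look like this (a sketch only; it assumes nothing in the application references relation_id and that client_id/field_id can be made NOT NULL):
ALTER TABLE df_field_to_client_relation
  MODIFY client_id int(11) unsigned NOT NULL,
  MODIFY field_id int(11) unsigned NOT NULL,
  DROP COLUMN relation_id,                -- dropping the column drops its PRIMARY KEY
  DROP KEY unique_row,
  DROP KEY client_id,
  DROP KEY flient_id,
  ADD PRIMARY KEY (field_id, client_id),  -- enforces uniqueness, goes one direction
  ADD KEY (client_id, field_id);          -- goes the other way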
The EXPLAIN does not seem to match the SELECT you provided. Each is useless without the other.
Another approach that might help... Instead of
SELECT ..., s2.foo, ...
...
JOIN ( SELECT ... FROM x WHERE ... ) AS s2 ON s2.account_id = a.account_id
see if you can reformulate it as:
SELECT ...,
( SELECT foo FROM x WHERE ... AND related = a.account_id) AS foo, ...
...
That is, replace the JOIN subquery with a correlated subquery for the one value you need.
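For example, the s2 derived table in the question could become a correlated subquery in the SELECT list. A sketch (the UNIQUE KEY unique_answer (field_id, related_to) on df_answers_text guarantees at most one row per account, so the scalar subquery is safe):
SELECT a.account_name, a.account_type, a.status, a.account_id,
       ( SELECT p.datetime_value
           FROM df_answers_text AS p
          WHERE p.field_id = '10'
            AND p.related_to = a.account_id ) AS last_training_on
  FROM accounts AS a
 INNER JOIN clients AS c ON c.client_id = a.client_id
 WHERE c.client_id = '7' AND c.status = 'Active'
 ORDER BY a.account_name
 LIMIT 500;
One caveat: a SELECT-list alias such as last_training_on cannot be referenced in the WHERE clause, so a filter like last_training_on > '2015-01-01' would have to repeat the subquery or move into HAVING.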
The bottom line is that the EAV model sucks.
Hmmm... I don't see the need for this at all, since r is not used elsewhere in the query...
INNER JOIN df_field_to_client_relation AS r ON r.field_id = p.field_id
WHERE p.field_id = '19' AND r.client_id = '7'
It seems to be equivalent to
WHERE EXISTS ( SELECT * FROM df_field_to_client_relation
WHERE field_id = '19' AND client_id = '7' )
but why bother checking for existence?
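If that reasoning holds, the first derived table in the question collapses to a plain filtered SELECT. A sketch (it assumes field 19 really is assigned to client 7, which the GUI presumably guarantees before generating the query):
LEFT JOIN (
    SELECT p.related_to AS account_id, p.decimal_value AS fieldValue
    FROM df_answers_text AS p
    WHERE p.field_id = '19'   -- the join to df_field_to_client_relation is gone
) AS s ON s.account_id = a.account_id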
Related
I have an order table that is associated with two join tables: order_buyer and order_seller.
The table structure looks like this:
CREATE TABLE IF NOT EXISTS `order` (
id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
-- some fields...
created_at TIMESTAMP NULL DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP NULL DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (id)
);
CREATE TABLE IF NOT EXISTS order_seller (
id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
order_id INT(10) UNSIGNED NOT NULL,
seller_id INT(10) UNSIGNED NOT NULL,
created_at TIMESTAMP NULL DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP NULL DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (id),
CONSTRAINT fk_order_seller_order_id FOREIGN KEY (order_id) REFERENCES `order` (id) ON DELETE CASCADE,
CONSTRAINT fk_order_seller_seller_id FOREIGN KEY (seller_id) REFERENCES user (id) ON DELETE CASCADE
);
CREATE TABLE IF NOT EXISTS order_buyer (
id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
order_id INT(10) UNSIGNED NOT NULL,
buyer_id INT(10) UNSIGNED NOT NULL,
created_at TIMESTAMP NULL DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP NULL DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (id),
CONSTRAINT fk_order_buyer_order_id FOREIGN KEY (order_id) REFERENCES `order` (id) ON DELETE CASCADE,
CONSTRAINT fk_order_buyer_buyer_id FOREIGN KEY (buyer_id) REFERENCES user (id) ON DELETE CASCADE
);
If I am a buyer viewing my order history, I want to get all orders where the order_buyer.buyer_id is equal to my user ID. Same thing goes for the seller.
But my query is such that I don't know if it's the buyer or the seller making the query; I just have a user ID. So I need one of them to match, and if one matches then I want to collect that order.
The following query does not work because it gives me every single order:
SELECT `order`.*
FROM `order`
LEFT JOIN order_seller
ON order_seller.seller_id = 1
AND order_seller.order_id = order.id
LEFT JOIN order_buyer
ON order_buyer.buyer_id = 1
AND order_buyer.order_id = order.id
WHERE order.status = "PENDING"
The query below solves the problem above by removing the AND from the joins and making it a WHERE IN clause on the result set:
SELECT `order`.*
FROM `order`
LEFT JOIN order_seller
ON order_seller.seller_id = 1
LEFT JOIN order_buyer
ON order_buyer.buyer_id = 1
WHERE order.status = "PENDING"
AND order.id IN (order_buyer.order_id, order_seller.order_id)
My question is, is there a better query to achieve the same thing? This seems kind of dirty to me. Feels unnatural.
You can just combine the two queries:
SELECT `order`.*
FROM `order`
LEFT JOIN order_seller
ON order_seller.seller_id = 1
AND order_seller.order_id = order.id
LEFT JOIN order_buyer
ON order_buyer.buyer_id = 1
AND order_buyer.order_id = order.id
WHERE order.status = "PENDING"
AND order.id IN (order_buyer.order_id, order_seller.order_id)
It should be faster than your second query, but return the same result.
You can also change the last condition to
COALESCE(order_buyer.order_id, order_seller.order_id) IS NOT NULL
which would be essentially the same as Gordon's first query.
However - The problem here is that the engine will need to read all pending orders before it can filter them by userID. And you would have the same problem with EXISTS subqueries.
Instead I would use a UNION ALL query:
SELECT `order`.*
FROM `order`
JOIN order_seller ON order_seller.order_id = order.id
WHERE order.status = "PENDING" AND order_seller.seller_id = 1
UNION ALL
SELECT `order`.*
FROM `order`
JOIN order_buyer ON order_buyer.order_id = order.id
WHERE order.status = "PENDING" AND order_buyer.buyer_id = 1
To avoid too much code duplication you can use a UNION ALL subquery:
SELECT o.*
FROM (
SELECT order_id FROM order_seller WHERE seller_id = 1
UNION ALL
SELECT order_id FROM order_buyer WHERE buyer_id = 1
) x
JOIN `order` o ON o.id = x.order_id
WHERE o.status = "PENDING"
You will need at least an index on order(status). Since the primary key (order.id) is an implicit part of it, the index can be used for the JOIN-ON and the WHERE clause at the same time. For the other tables the indexes on order_seller(seller_id) and order_buyer(buyer_id) might be fine. But composite indexes on order_seller(seller_id, order_id) and order_buyer(buyer_id, order_id) would be better.
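Those indexes could be created like so (a sketch; the index names here are made up):
ALTER TABLE `order`      ADD INDEX idx_order_status (status);
ALTER TABLE order_seller ADD INDEX idx_seller_order (seller_id, order_id);
ALTER TABLE order_buyer  ADD INDEX idx_buyer_order  (buyer_id, order_id);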
You can use the first version, by adding a where clause:
SELECT o.*
FROM `order` o LEFT JOIN
order_seller os
ON os.seller_id = 1 AND
os.order_id = o.id LEFT JOIN
order_buyer ob
ON ob.buyer_id = 1 AND
ob.order_id = o.id
WHERE o.status = 'PENDING' AND
(os.order_id IS NOT NULL OR ob.order_id IS NOT NULL);
However, I would write the query using exists:
SELECT o.*
FROM `order` o
WHERE o.status = 'PENDING'
  AND (EXISTS (SELECT 1
               FROM order_seller os
               WHERE os.seller_id = 1 AND os.order_id = o.id
              )
       OR EXISTS (SELECT 1
               FROM order_buyer ob
               WHERE ob.buyer_id = 1 AND ob.order_id = o.id
              ));
For performance, you want indexes on each of the tables on (order_id, buyer_id) and (order_id, seller_id).
What about something like
select `order`.*
from `order`
where `order`.status = 'PENDING'
and `order`.id in (
select s.order_id from order_seller s where s.seller_id = 1
union all
select b.order_id from order_buyer b where b.buyer_id = 1)
or change it using exists
select o.*
from `order` o
where o.status = 'PENDING'
and (
exists (select 1
from order_seller s
where s.seller_id = 1 and s.order_id = o.id)
or
exists (select 1
from order_buyer b
where b.buyer_id = 1 and b.order_id = o.id))
Working on a support ticketing system with not a lot of tickets (~3,000). To get a summary grid of ticket information, there are five LEFT JOINs on a custom-field table (j25_support_field_value) containing about 10,000 records. The query runs too long (~10 seconds), and in cases with a WHERE clause it runs even longer (up to ~30 seconds or more).
Any suggestions for improving the query to reduce the time to run?
Four tables:
j25_support_tickets
CREATE TABLE `j25_support_tickets` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`category_id` int(11) NOT NULL DEFAULT '0',
`user_id` int(11) DEFAULT NULL,
`email` varchar(50) DEFAULT NULL,
`subject` varchar(255) DEFAULT NULL,
`message` text,
`modified_date` datetime DEFAULT NULL,
`priority_id` tinyint(3) unsigned DEFAULT NULL,
`status_id` tinyint(3) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=3868 DEFAULT CHARSET=utf8
j25_support_priorities
CREATE TABLE `j25_support_priorities` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(100) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=14 DEFAULT CHARSET=utf8
j25_support_statuses
CREATE TABLE `j25_support_statuses` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=7 DEFAULT CHARSET=utf8
j25_support_field_value (id, ticket_id, field_id, field_value)
CREATE TABLE `j25_support_field_value` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`ticket_id` int(11) DEFAULT NULL,
`field_id` int(11) DEFAULT NULL,
`field_value` tinytext,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=10889 DEFAULT CHARSET=utf8
Also, ran this:
SELECT LENGTH(field_value) len FROM j25_support_field_value ORDER BY len DESC LIMIT 1
note: the result = 38
The query:
SELECT DISTINCT t.id as ID
, (select p.title from j25_support_priorities p where p.id = t.priority_id) as Priority
, (select s.title from j25_support_statuses s where s.id = t.status_id) as Status
, t.subject as Subject
, t.email as SubmittedByEmail
, type.field_value AS IssueType
, ver.field_value AS Version
, utype.field_value AS UserType
, cust.field_value AS Company
, refno.field_value AS RefNo
, t.modified_date as Modified
FROM j25_support_tickets AS t
LEFT JOIN j25_support_field_value AS type ON t.id = type.ticket_id AND type.field_id =1
LEFT JOIN j25_support_field_value AS ver ON t.id = ver.ticket_id AND ver.field_id =2
LEFT JOIN j25_support_field_value AS utype ON t.id = utype.ticket_id AND utype.field_id =3
LEFT JOIN j25_support_field_value AS cust ON t.id = cust.ticket_id AND cust.field_id =4
LEFT JOIN j25_support_field_value AS refno ON t.id = refno.ticket_id AND refno.field_id =5
ALTER TABLE j25_support_field_value
ADD INDEX (`ticket_id`,`field_id`,`field_value`(50))
This index will work as a covering index for your query. It will allow the joins to use only this index to look up the values. It should perform massively faster than without this index, since currently your query would have to read every row in the table to find what matches each combination of ticket_id and field_id.
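You can verify that with EXPLAIN; for each of the five joined copies of j25_support_field_value, the Extra column should then include "Using index". A sketch, trimmed to one join:
EXPLAIN
SELECT t.id, type.field_value
FROM j25_support_tickets AS t
LEFT JOIN j25_support_field_value AS type
       ON t.id = type.ticket_id AND type.field_id = 1;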
I would also suggest converting your tables to InnoDB engine, unless you have a very explicit reason for using MyISAM.
ALTER TABLE tablename ENGINE=InnoDB
As above - a better index would help. You could probably then simplify your query into something like this too (join to the table only once):
SELECT t.id as ID
, p.title as Priority
, s.title as Status
, t.subject as Subject
, t.email as SubmittedByEmail
, case when v.field_id=1 then v.field_value else null end as IssueType
, case when v.field_id=2 then v.field_value else null end as Version
, case when v.field_id=3 then v.field_value else null end as UserType
, case when v.field_id=4 then v.field_value else null end as Company
, case when v.field_id=5 then v.field_value else null end as RefNo
, t.modified_date as Modified
FROM j25_support_tickets AS t
LEFT JOIN j25_support_field_value v ON t.id = v.ticket_id
LEFT JOIN j25_support_priorities p ON p.id = t.priority_id
LEFT JOIN j25_support_statuses s ON s.id = t.status_id;
You can do away with the subqueries for starters and just get them from another join. You can add an index to j25_support_field_value:
alter table j25_support_field_value add key(ticket_id, field_id);
I assume there is an index on id in j25_support_tickets; if not, and if the ids are unique, add a unique index: alter table j25_support_tickets add unique key(id); If they're not unique, remove the word unique from that statement.
In MySQL, a join usually requires an index on the field(s) that you are using to join on. This will hold up and produce very reasonable results with huge tables (100m+), if you follow that rule, you will not go wrong.
Are the ids in j25_support_tickets unique? If they are, you can do away with the DISTINCT; if not, or if you are getting exact duplicates in each row, still do away with the DISTINCT and add a GROUP BY t.id to the end of this:
SELECT t.id as ID
, p.title as Priority
, s.title as Status
, t.subject as Subject
, t.email as SubmittedByEmail
, type.field_value AS IssueType
, ver.field_value AS Version
, utype.field_value AS UserType
, cust.field_value AS Company
, refno.field_value AS RefNo
, t.modified_date as Modified
FROM j25_support_tickets AS t
LEFT JOIN j25_support_field_value AS type ON t.id = type.ticket_id AND type.field_id =1
LEFT JOIN j25_support_field_value AS ver ON t.id = ver.ticket_id AND ver.field_id =2
LEFT JOIN j25_support_field_value AS utype ON t.id = utype.ticket_id AND utype.field_id =3
LEFT JOIN j25_support_field_value AS cust ON t.id = cust.ticket_id AND cust.field_id =4
LEFT JOIN j25_support_field_value AS refno ON t.id = refno.ticket_id AND refno.field_id =5
LEFT JOIN j25_support_priorities p ON p.id = t.priority_id
LEFT JOIN j25_support_statuses s ON s.id = t.status_id;
Switch to InnoDB.
After switching to InnoDB, make the PRIMARY KEY for j25_support_field_value be (ticket_id, field_id) (and get rid of id). (Tacking on field_value(50) will hurt, not help.)
A PRIMARY KEY is a UNIQUE KEY, so don't have both.
Use VARCHAR(255) instead of the nearly-equivalent TINYTEXT.
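Put together, those changes might look like this (a sketch; it assumes no duplicate (ticket_id, field_id) pairs exist and that nothing references the old id column):
ALTER TABLE j25_support_field_value
  ENGINE = InnoDB,
  MODIFY ticket_id int(11) NOT NULL,   -- PK columns must be NOT NULL
  MODIFY field_id int(11) NOT NULL,
  MODIFY field_value VARCHAR(255),     -- instead of the nearly-equivalent TINYTEXT
  DROP COLUMN id,                      -- dropping it drops the old PRIMARY KEY
  ADD PRIMARY KEY (ticket_id, field_id);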
The EAV schema sucks; see my rant on EAV.
I am trying to count how many transactions have been completed for a course. I am trying to LEFT JOIN training_transactions with a count of all the rows where training_transaction_course = course_id and training_transaction_status = 'complete'. Here's the code I have so far:
SELECT training.*,
       Count(DISTINCT training_transactions.training_transaction_course) AS completed_training_payments
FROM training
LEFT JOIN users
       ON training.course_user = users.user_id
LEFT JOIN training_transactions
       ON training.course_user = training_transactions.training_transaction_user
WHERE course_id = ?
  AND training_transactions.training_transaction_status = 'complete'
  AND course_enabled = 'enabled'
My tables:
training transactions
CREATE TABLE IF NOT EXISTS `training_transactions` (
`training_transaction_id` int(11) NOT NULL,
`training_transaction_user` int(11) NOT NULL,
`training_transaction_course` int(11) NOT NULL,
`training_transaction_status` varchar(50) NOT NULL,
`training_transaction_enabled` varchar(50) NOT NULL DEFAULT 'enabled',
`training_transaction_date` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
training
CREATE TABLE IF NOT EXISTS `training` (
`course_id` int(11) NOT NULL,
`course_user` int(11) NOT NULL,
`course_type` varchar(255) NOT NULL,
`course_name` varchar(255) NOT NULL,
`course_location` varchar(255) NOT NULL,
`course_duration` varchar(255) NOT NULL,
`course_fitness_type` varchar(255) NOT NULL,
`course_instructor_name` varchar(255) NOT NULL,
`course_price` int(15) NOT NULL,
`course_start_date` date NOT NULL,
`course_max_attendees` int(8) NOT NULL,
`course_accommodation` varchar(255) NOT NULL,
`course_accommodation_price` varchar(255) NOT NULL,
`course_status` varchar(50) NOT NULL,
`course_enabled` varchar(10) NOT NULL DEFAULT 'enabled'
) ENGINE=InnoDB AUTO_INCREMENT=24 DEFAULT CHARSET=latin1;
As you can see, I am trying to get the count of completed transactions to deduct from course_max_attendees, so I can then check whether there are any places left.
You want to select trainings. So select from training. You want to show the transaction count with it, which you can do in a subquery in your select clause:
select
t.*,
(
select count(*)
from training_transactions tt
where tt.training_transaction_user = t.course_user
and tt.training_transaction_status = 'complete'
) as completed_training_payments
from training t
where t.course_id = ?
and t.course_enabled = 'enabled';
And here is the same with a join:
select
t.*, coalesce(tt.cnt, 0) as completed_training_payments
from training t
left join
(
select training_transaction_status, count(*) as cnt
from training_transactions
where training_transaction_status = 'complete'
group by training_transaction_status
) tt on tt.training_transaction_user = t.course_user
where t.course_id = ?
and t.course_enabled = 'enabled';
First, if you want to know how many completed transactions have taken place for each course, you can't get the users table involved; you would be aggregating away any user information.
Then, you must start with the course table, which it looks like you have named training. Now you want to count every completed transaction for each course. A LEFT JOIN works just about perfectly for this:
select t.course_name, count( * ) as completed_training_payments
from training t
left join training_transactions tt
on tt.training_transaction_user = t.course_user
and tt.training_transaction_status = 'complete'
where t.course_enabled = 'enabled'
group by t.course_name;
The problem with this is that it will give a count of "1" for every course with one completed transaction, but also for those with no completed transactions at all! So every row with a count of "1" would be suspect. The solution is to count keys, not rows. This is done with the SUM rather than the COUNT function.
select t.course_name, sum( case when tt.training_transaction_user is null then 0 else 1 end ) as completed_training_payments
from training t
left join training_transactions tt
on tt.training_transaction_user = t.course_user
and tt.training_transaction_status = 'complete'
where t.course_enabled = 'enabled'
group by t.course_name;
Since tt.training_transaction_user will only be NULL when there are no completed transactions at all, such a course will show a "count" of "0".
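The same result can be had with COUNT on a column from the outer-joined table, since COUNT(expr) ignores NULLs. A sketch using the column names from the schemas above:
select t.course_name,
       count(tt.training_transaction_user) as completed_training_payments
from training t
left join training_transactions tt
       on tt.training_transaction_user = t.course_user
      and tt.training_transaction_status = 'complete'
where t.course_enabled = 'enabled'
group by t.course_name;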
I'd like some help with a LEFT JOIN statement that's not doing what I (probably incorrectly) think it should do.
There are two tables:
cd:
CREATE TABLE `cd` (
`itemID` int(11) NOT NULL AUTO_INCREMENT,
`title` text NOT NULL,
`artist` text NOT NULL,
`genre` text NOT NULL,
`tracks` int(11) NOT NULL,
PRIMARY KEY (`itemID`)
)
loans
CREATE TABLE `loans` (
`itemID` int(11) NOT NULL,
`itemType` varchar(20) NOT NULL,
`userID` int(11) NOT NULL,
`dueDate` date NOT NULL,
PRIMARY KEY (`itemID`,`itemType`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
and I want to select all CDs that are not in loans, using a LEFT JOIN and then a WHERE dueDate IS NULL:
select
t.itemID,
t.artist as first,
t.title as second,
(select AVG(rating) from ac9039.ratings where itemType = 'cd' and itemId = t.itemID) as `rating average`,
(select COUNT(rating) from ac9039.ratings where itemType = 'cd' and itemId = t.itemID) as `number of ratings`
from
cd t left join loans l
on t.itemID = l.itemID
where l.itemType = 'cd' and l.dueDate is null;
This one, however, returns an empty table even though there are plenty of rows in cd with itemIDs that are not in loans.
Now, I was under the impression that the LEFT JOIN should preserve the left-hand side and fill the columns from the right-hand side with NULL values, but this does not seem to be the case. Can anyone enlighten me?
Your WHERE condition causes the problem. l.itemType = 'cd' will always be false when l.dueDate IS NULL is true. (All of the loans fields are NOT NULL, so dueDate can only be NULL when there is no matching record, but in that case the itemType field will be NULL too.)
Another point is that your query is semantically incorrect: you are trying to get the records from the cd table for which the loans table does not contain any rows.
The second table acts as a condition, so it belongs in the WHERE clause.
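Alternatively, if you want to keep the LEFT JOIN, move the itemType test into the ON clause and keep only the NULL check in the WHERE; this is the classic anti-join pattern (a sketch, with the rating subqueries trimmed):
select t.itemID, t.artist as first, t.title as second
from cd t
left join loans l
       on l.itemID = t.itemID
      and l.itemType = 'cd'    -- condition moved into ON, so unmatched CDs survive
where l.itemID is null;        -- keep only CDs with no matching loan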
Consider using the EXISTS statement to achieve your goal:
SELECT
t.itemID,
t.artist as first,
t.title as second,
(select AVG(rating) from ac9039.ratings where itemType = 'cd' and itemId = t.itemID) as `rating average`,
(select COUNT(rating) from ac9039.ratings where itemType = 'cd' and itemId = t.itemID) as `number of ratings`
FROM
cd t
WHERE
NOT EXISTS (SELECT 1 FROM loans l WHERE t.itemID = l.itemID AND L.itemType = 'cd')
Based on your data model, you have to add another condition to the subquery to filter out records that are out of date now (dueDate earlier than the current time). This is the case when you do not delete outdated loan records:
NOT EXISTS (SELECT 1 FROM loans l WHERE t.itemID = l.itemID AND l.itemType = 'cd' AND l.dueDate > NOW())
I am running a website with the two queries below, which are executed very repetitively. In the MySQL InnoDB process list I can see they are taking a lot of time; whenever I look at the processes they are there, they keep creating temporary tables, and they take long to execute, using a lot of memory and CPU.
They were really bad; I somehow managed to optimize them, but I am not able to get beyond that.
$getmoddetails = "SELECT a.id, a.name, a.defvar, a.description, a.icon, a.thumb, a.average_rating, a.total_rating, c.rating, a.group_access, d.long_name, a.editor_id, e.users_count
FROM dir_cat_item AS b
INNER JOIN dir_item AS a ON a.id = b.item_id
AND a.status = 'O'
LEFT JOIN dir_item_notation_user_map AS c ON a.id = c.item_id
AND c.user_id =%u
LEFT JOIN users AS d ON d.id = a.editor_id
LEFT JOIN (SELECT item_id, COUNT(*) AS users_count
FROM module
GROUP BY item_id) AS e ON e.item_id = b.item_id
WHERE a.id=%u";
$getnbModules_by_col = "SELECT
posx,COUNT(posx) as nb
FROM module WHERE
user_id = %u
AND profile_id = %u
GROUP BY posx
ORDER BY posx ASC";
Table index on Module
- item_id
- user_id
- profile_id
- uniq
For USERS Table
- id
- username
Any suggestions, please...
Update:
CREATE TABLE IF NOT EXISTS `module` (
`item_id` mediumint(8) unsigned NOT NULL DEFAULT '0',
`user_id` int(10) unsigned NOT NULL DEFAULT '0',
`profile_id` int(3) unsigned NOT NULL DEFAULT '0',
`posx` tinyint(3) unsigned NOT NULL DEFAULT '0',
`posy` tinyint(3) unsigned NOT NULL DEFAULT '0',
`posj` tinyint(3) unsigned NOT NULL DEFAULT '0',
`x` smallint(5) unsigned NOT NULL DEFAULT '0',
`y` smallint(5) unsigned NOT NULL DEFAULT '0',
`typ` char(1) CHARACTER SET utf8 NOT NULL DEFAULT 'D',
`variables` text COLLATE utf8_unicode_ci,
`uniq` smallint(5) unsigned NOT NULL DEFAULT '1',
`blocked` tinyint(1) unsigned NOT NULL DEFAULT '0',
`minimized` tinyint(1) unsigned NOT NULL DEFAULT '0',
`old_id` tinyint(3) unsigned DEFAULT NULL,
`feed_id` mediumint(8) unsigned NOT NULL DEFAULT '0',
`shared` varchar(33) COLLATE utf8_unicode_ci DEFAULT NULL,
`currentview` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
KEY `item_id` (`item_id`,`user_id`,`profile_id`,`uniq`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Index cardinality (from SHOW INDEX on module): item_id = 18, user_id = 393467, profile_id = 393467, uniq = 393467.
Thank you in advance
For best performance of that second query, you'd need an appropriate index on the module table, e.g.
... ON module (user_id, profile_id, posx)
For the first query, you'd likely benefit from a different index on the module table:
... ON module (item_id)
But without the table definitions, datatype and cardinality of the columns, it's really impossible to make definitive suggestions.
In your first query, I'd suggest you add a predicate in the inline view (derived table) aliased as e. I don't think MySQL is pushing the predicate from the outer query into the inline view.
( SELECT item_id
, COUNT(*) AS users_count
FROM module
WHERE item_id = %u
GROUP BY item_id
) AS e
You're going to need to supply the same value in that WHERE clause as you are providing in the WHERE clause on the outer query. From what I'm reading there...
e.item_id = b.item_id = a.id = %u
By adding that WHERE clause in the inline view, you should cut down the number of rows retrieved from the module table, and that derived table will have just one row. An index with a leading column of item_id would be a covering index. The EXPLAIN plan should show Using index and not show Using filesort.
If your first query is pulling back a relatively small number of rows, you might consider a correlated subquery on the module table, in place of a join to the derived table (aliased as e) in your first query (to avoid materializing a large derived table). (In general, however, correlated subqueries can be real performance killers. But in some cases, where the outer query is pulling back a small number of rows, the repeated execution of the subquery can actually perform better than generating a large derived table, which you only need a few rows from.)
SELECT a.id
, a.name
, a.defvar
, a.description
, a.icon
, a.thumb
, a.average_rating
, a.total_rating
, c.rating
, a.group_access
, d.long_name
, a.editor_id
, ( SELECT SUM(1)
FROM module e
WHERE e.item_id = b.item_id
AND e.item_id = %u
) AS users_count
FROM dir_cat_item b
JOIN dir_item a
ON a.id = b.item_id
AND b.item_id = %u
AND a.status = 'O'
LEFT
JOIN dir_item_notation_user_map c
ON c.item_id = a.id
AND c.item_id = %u
AND c.user_id = %u
LEFT
JOIN users d
ON d.id = a.editor_id
WHERE a.id = %u