I'm struggling to find a solution for my problem.
I basically have 3 tables - campaigns, users, campaign_user (pivot table - with campaign_id, user_id)
I have this query:
select * from `campaigns`
where `id` = 91
and (select count(*)
from `users`
inner join `campaign_user` on `users`.`id` = `campaign_user`.`user_id`
where `campaign_user`.`campaign_id` = `campaigns`.`id`
and `user_id` = 1) >= 1
That returns 0 results. I have checked that the relevant row in the campaign_user table exists.
The weird thing is that if I run the same query for another campaign id (89), it does return the expected result. Some campaign ids return as expected and some return 0 rows, which is weird and frustrating.
This does not happen on the production server, which runs MySQL 5.5.
But it happens in my VM, which runs MySQL 5.7.
I have no idea what the cause is. Any help would be really appreciated!
The most likely explanation is that the data is different on the two servers. However, you can simplify the query, which is why I'm answering. The users table is not needed in the subquery. So:
select c.*
from `campaigns` c
where c.`id` = 91 and
(select count(*)
from campaign_user cu
where cu.`campaign_id` = c.`id` and cu.user_id = 1
) >= 1;
This, in turn, can be simplified and made more efficient by using exists (or a left join) instead of count(*):
select c.*
from campaigns c
where c.id = 91 and
exists (select 1
from campaign_user cu
where cu.campaign_id = c.id and cu.user_id = 1
) ;
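To check that the two forms really select the same rows, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for MySQL, and the table contents and the helper name campaign_ids are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# SQLite stands in for MySQL; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE campaigns (id INTEGER PRIMARY KEY);
    CREATE TABLE campaign_user (campaign_id INTEGER, user_id INTEGER);
    INSERT INTO campaigns (id) VALUES (89), (90), (91);
    INSERT INTO campaign_user (campaign_id, user_id)
    VALUES (89, 1), (91, 1), (91, 2), (90, 2);
""")

# Original shape: count the matching pivot rows and compare with 1.
COUNT_FORM = """
    SELECT c.id FROM campaigns c
    WHERE c.id = ?
      AND (SELECT COUNT(*) FROM campaign_user cu
           WHERE cu.campaign_id = c.id AND cu.user_id = ?) >= 1
"""

# Suggested shape: stop at the first matching pivot row.
EXISTS_FORM = """
    SELECT c.id FROM campaigns c
    WHERE c.id = ?
      AND EXISTS (SELECT 1 FROM campaign_user cu
                  WHERE cu.campaign_id = c.id AND cu.user_id = ?)
"""

def campaign_ids(sql, campaign_id, user_id):
    """Hypothetical helper: run one of the two forms and return the ids."""
    return [row[0] for row in conn.execute(sql, (campaign_id, user_id))]

print(campaign_ids(COUNT_FORM, 91, 1), campaign_ids(EXISTS_FORM, 91, 1))
print(campaign_ids(COUNT_FORM, 90, 1), campaign_ids(EXISTS_FORM, 90, 1))
```

If both forms agree on your real data but the campaign still goes missing, that points back at the data (or the pivot rows) rather than at the query shape.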
In the query below I want to retrieve @MaximumRecords rows, but without cutting any ProjectId off partway: no ProjectId should have rows left over beyond the @MaximumRecords limit.
For example, if @MaximumRecords = 100 and ProjectId = 7 occupies rows 99-102, I want to retrieve only the rows for ProjectId = 1 through ProjectId = 6 (the query will run again later, starting at ProjectId = 7). How do I do that?
SELECT TOP (@MaximumRecords)
t1.ProjectId,
t1.Row2,
t2.Row3
FROM Table1 t1
JOIN Table2 t2 ON t1.ProjectId = t2.ProjectId
WHERE
t1.ProjectId > @InitialProjectId
ORDER BY
t1.ProjectId ASC
I worked up a solution using the Sales.SalesOrderHeader and Sales.SalesOrderDetail tables in the AdventureWorks2008R2 database using the technique to get a running total described here.
The basic idea is to get the running total of the row count for each SalesOrderID (ProjectId in your case) and then select all of the data for each SalesOrderID where the running total of the count is less than @MaximumRecords. You would then need to capture the maximum ID in the data returned and use that value as the starting point in the next run of your query.
This task gets a little easier with SQL Server 2012 which is also described in the link given above.
Here it is...
USE AdventureWorks2008R2
IF OBJECT_ID('tempdb..#Test', 'U') IS NOT NULL DROP TABLE #Test;
DECLARE @MaximumRecords INT
DECLARE @InitialSalesOrderID INT
SET @MaximumRecords = 500
SET @InitialSalesOrderID = 43663
SELECT a.SalesOrderID, COUNT(*) AS 'Count'
INTO #Test
FROM Sales.SalesOrderHeader a
INNER JOIN Sales.SalesOrderDetail b ON a.SalesOrderID = b.SalesOrderID
WHERE a.SalesOrderID > @InitialSalesOrderID
GROUP BY a.SalesOrderID
SELECT * FROM Sales.SalesOrderHeader a
INNER JOIN Sales.SalesOrderDetail b ON a.SalesOrderID = b.SalesOrderID
WHERE a.SalesOrderID IN (
SELECT
a.SalesOrderID
FROM
#Test a
WHERE (
SELECT
SUM(Count)
FROM
#Test b
WHERE
b.SalesOrderID <= a.SalesOrderID
) < @MaximumRecords
)
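The same running-total idea can be tried on a toy table. The sketch below is not the AdventureWorks query: it uses Python's sqlite3 with a hypothetical project_rows table, invented row counts, and a made-up function name whole_projects. It also compares with <= so that a project landing exactly on the limit is still included, whereas the query above uses <.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE project_rows (project_id INTEGER, payload TEXT)")

# Invented data: projects 1-6 hold 98 rows in total, project 7 holds rows 99-102.
row_counts = {1: 20, 2: 20, 3: 20, 4: 20, 5: 10, 6: 8, 7: 4, 8: 5}
for project_id, n in row_counts.items():
    conn.executemany(
        "INSERT INTO project_rows VALUES (?, ?)",
        [(project_id, f"row {i}") for i in range(n)],
    )

def whole_projects(conn, initial_project_id, maximum_records):
    """Return rows for complete projects only, never exceeding maximum_records."""
    sql = """
        WITH counts AS (
            -- rows per project, the equivalent of the #Test temp table
            SELECT project_id, COUNT(*) AS n
            FROM project_rows
            WHERE project_id > :initial
            GROUP BY project_id
        )
        SELECT r.project_id, r.payload
        FROM project_rows r
        WHERE r.project_id IN (
            SELECT a.project_id
            FROM counts a
            -- running total of row counts up to and including this project
            WHERE (SELECT SUM(b.n) FROM counts b
                   WHERE b.project_id <= a.project_id) <= :maximum
        )
        ORDER BY r.project_id
    """
    params = {"initial": initial_project_id, "maximum": maximum_records}
    return conn.execute(sql, params).fetchall()

first_batch = whole_projects(conn, 0, 100)
last_id = max(row[0] for row in first_batch)   # feed this into the next run
second_batch = whole_projects(conn, last_id, 100)
print(len(first_batch), last_id, len(second_batch))
```

The next run starts from the largest project id returned, which is the "capture the maximum ID" step described above.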
Noel
I have a doubt about the "IS NULL" check in MySQL. I have these 2 queries. The first one runs in about 300 seconds. The second one runs in less than 1 second!
Slow query:
SELECT count(distinct(u.id))
FROM ips_usuario AS u
JOIN ips_fatura AS f
ON ((u.id = f.ips_usuario_id) OR
(u.ips_usuario_id_titular IS NOT NULL AND
u.ips_usuario_id_titular = f.ips_usuario_id));
Fast query:
SELECT count(distinct(u.id))
FROM ips_usuario AS u
JOIN ips_fatura AS f
ON ((u.id = f.ips_usuario_id) OR
(u.ips_usuario_id_titular = f.ips_usuario_id));
All join conditions use indexed foreign key columns. The table ips_usuario has about 20,000 records and the table ips_fatura has about 500,000 records.
I am surprised that either is fast. I would suggest replacing them with exists:
SELECT COUNT(*)
FROM ips_usuario u
WHERE EXISTS (SELECT 1 FROM ips_fatura f WHERE u.id = f.ips_usuario_id) OR
EXISTS (SELECT 1 FROM ips_fatura f WHERE u.ips_usuario_id_titular = f.ips_usuario_id);
And for the version with the IS NOT NULL check:
SELECT COUNT(*)
FROM ips_usuario u
WHERE EXISTS (SELECT 1 FROM ips_fatura f WHERE u.id = f.ips_usuario_id) OR
(u.ips_usuario_id_titular IS NOT NULL AND
EXISTS (SELECT 1 FROM ips_fatura f WHERE u.ips_usuario_id_titular = f.ips_usuario_id)
)
For both of these, you want an index on ips_fatura(ips_usuario_id), which is the column both subqueries look up. You can check the EXPLAIN output to be sure that EXISTS is using the index. If not, the newer releases of MySQL use indexes for IN:
SELECT COUNT(*)
FROM ips_usuario u
WHERE u.id IN (SELECT f.ips_usuario_id FROM ips_fatura f) OR
u.ips_usuario_id_titular IN (SELECT f.ips_usuario_id FROM ips_fatura f);
In either case (EXISTS or IN) the goal is to do a "semi-join". That is, to find only the first row with a match rather than all matches. This is an important efficiency, because it allows the query to avoid duplicate removal.
I would speculate that the issue is the optimization of the OR; usually this results in inefficient JOIN algorithms. However, perhaps MySQL is smart in the case without the NULL check, and the addition of IS NOT NULL on the outer table throws it off.
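A small sketch of why the semi-join avoids duplicate removal, using Python's sqlite3 as a stand-in for MySQL. The four users and three invoices are invented sample rows, and scalar is a hypothetical helper; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ips_usuario (id INTEGER PRIMARY KEY, ips_usuario_id_titular INTEGER);
    CREATE TABLE ips_fatura (id INTEGER PRIMARY KEY, ips_usuario_id INTEGER);
    CREATE INDEX idx_fatura_usuario ON ips_fatura (ips_usuario_id);
    -- invented data: user 2 is a dependent of user 1, user 4 of user 3
    INSERT INTO ips_usuario VALUES (1, NULL), (2, 1), (3, NULL), (4, 3);
    -- only user 1 has invoices, three of them
    INSERT INTO ips_fatura (ips_usuario_id) VALUES (1), (1), (1);
""")

JOIN_CONDITION = """
    FROM ips_usuario u
    JOIN ips_fatura f
      ON u.id = f.ips_usuario_id
      OR u.ips_usuario_id_titular = f.ips_usuario_id
"""

def scalar(sql):
    """Hypothetical helper: return the single value a query produces."""
    return conn.execute(sql).fetchone()[0]

# The join materialises one row per (user, invoice) match ...
joined_rows = scalar("SELECT COUNT(*) " + JOIN_CONDITION)
# ... so DISTINCT has to collapse them again.
join_count = scalar("SELECT COUNT(DISTINCT u.id) " + JOIN_CONDITION)

# The semi-join only asks whether at least one invoice exists per user.
exists_count = scalar("""
    SELECT COUNT(*)
    FROM ips_usuario u
    WHERE EXISTS (SELECT 1 FROM ips_fatura f WHERE u.id = f.ips_usuario_id)
       OR EXISTS (SELECT 1 FROM ips_fatura f
                  WHERE u.ips_usuario_id_titular = f.ips_usuario_id)
""")

print(joined_rows, join_count, exists_count)
```

With 500,000 invoices the gap between the number of joined rows and the number of distinct users is what the DISTINCT has to pay for; timings in SQLite say nothing about MySQL's optimizer, so this only illustrates the row counts.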
I have a MySQL question which I can't seem to figure out.
I have two queries and some PHP, and I think it should be possible to merge them into one query.
What I want to do is: show all users that completed all modules of a certain programm with id=1
Query 1 -> this gives all modules of the programm:
SELECT `modules`.`id`
FROM `modules`
JOIN `linkModuleProgramm` ON `modules`.`id` = `linkModuleProgramm`.`module`
WHERE `linkModuleProgramm`.`programm` = 1
Query 2 -> this gives the user and the number of modules of the programm that are completed by this user:
SELECT `user`.`id`,
COUNT(*) as 'count'
FROM `user`
JOIN `linkUserModule` ON `user`.`id` = `linkUserModule`.`user`
WHERE `linkUserModule`.`status` = 1
AND `linkUserModule`.`module` IN (Query1)
GROUP BY `user`.`id`
Then PHP filters the users from Query2 where 'count' is equal to the number of results from Query1.
Does anyone have a suggestion?
Looks like I didn't understand your question correctly the first time. Another try, using two queries and a MySQL variable:
SET @MODULESCOUNT = (
SELECT COUNT(*)
FROM `modules`
JOIN `linkModuleProgramm` ON `modules`.`id` = `linkModuleProgramm`.`module`
WHERE `linkModuleProgramm`.`programm` = 1
);
SELECT `user`.`id`
FROM `user`
JOIN `linkUserModule` ON `user`.`id` = `linkUserModule`.`user`
WHERE `linkUserModule`.`status` = 1
AND `linkUserModule`.`module` IN (Query1)
GROUP BY `user`.`id`
HAVING COUNT(*) = @MODULESCOUNT;
There is also the possibility of using a subquery instead of a variable:
SELECT `user`.`id`
FROM `user`
JOIN `linkUserModule` ON `user`.`id` = `linkUserModule`.`user`
WHERE `linkUserModule`.`status` = 1
AND `linkUserModule`.`module` IN (Query1)
GROUP BY `user`.`id`
HAVING COUNT(*) =
(
SELECT COUNT(*)
FROM `modules`
JOIN `linkModuleProgramm` ON `modules`.`id` = `linkModuleProgramm`.`module`
WHERE `linkModuleProgramm`.`programm` = 1
)
but I don't like subqueries inside WHERE or HAVING clauses; MySQL sometimes decides to do unreasonably inefficient things because of them.
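To see the HAVING COUNT(*) = (subquery) pattern end to end, here is a sketch in Python's sqlite3 with invented rows. It is a simplification, not the exact query above: it reads linkModuleProgramm and linkUserModule directly instead of joining modules and user, and it assumes each (user, module) pair appears at most once in linkUserModule. The function name users_who_completed is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE linkModuleProgramm (module INTEGER, programm INTEGER);
    CREATE TABLE linkUserModule ("user" INTEGER, module INTEGER, status INTEGER);
    -- invented data: programm 1 consists of modules 10 and 11
    INSERT INTO linkModuleProgramm VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO linkUserModule VALUES
        (1, 10, 1), (1, 11, 1),              -- user 1 completed both
        (2, 10, 1), (2, 11, 0),              -- user 2 has not finished module 11
        (3, 10, 1), (3, 11, 1), (3, 12, 1);  -- user 3 also did an unrelated module
""")

def users_who_completed(conn, programm_id):
    """Users whose completed-module count equals the programm's module count."""
    sql = """
        SELECT lum."user"
        FROM linkUserModule lum
        WHERE lum.status = 1
          AND lum.module IN (SELECT module FROM linkModuleProgramm
                             WHERE programm = :p)
        GROUP BY lum."user"
        HAVING COUNT(*) = (SELECT COUNT(*) FROM linkModuleProgramm
                           WHERE programm = :p)
        ORDER BY lum."user"
    """
    return [row[0] for row in conn.execute(sql, {"p": programm_id})]

print(users_who_completed(conn, 1))
```

If duplicates per (user, module) are possible in the real table, COUNT(DISTINCT module) would be the safer comparison.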
For the longest time I couldn't figure out why incorrect COUNT(*) values were being returned. After incrementally removing parts of my query, I finally realized that the joined tables were the reason behind the incorrect values.
This is the query I'm working with:
SELECT `profiles`.`logo` AS logo,
`companies`.`company_name`,
`companies`.`url_slug`,
count(*)
FROM (`companies`)
JOIN `users` ON `users`.`id` = `companies`.`user_id`
JOIN `categories` ON `categories`.`company_id` = `companies`.`id`
JOIN `products` ON `products`.`company_id` = `companies`.`id`
JOIN `profiles` ON `profiles`.`company_id` = `companies`.`id`
WHERE `users`.`last_login` IS NOT NULL
AND `categories`.`category_id` = '3'
AND `products`.`active` = 1
AND `products`.`xmp_1` = 1
AND `products`.`xmp_2` = 1
AND `profiles`.`field_a` = 1
GROUP BY `companies`.`id`
Running this in my SQL program returns 28 rows, but the COUNT(*) column returns something like 1400. I'm not sure where to go from here. I need a column that returns 28 instead of 1400.
SQL Fiddle with sample data: http://sqlfiddle.com/#!2/97d2d/9
There's no way to do this with a simple query. Chris Hayes shows a way to do this with sub-queries and there are a variety of ways to do that.
The reason is that aggregate functions (like COUNT) only work on the group that a row represents. In this case, each row represents the set of joined results that have the same company id. In a simple query, there's no way for a field in a row to show aggregate information that extends beyond that group.
With subqueries, you can generate the first table, and then aggregate over that. Or, you could just use the fact that the total number of rows returned is included as metadata on the data that is sent back to whichever client submitted the query in the first place, thereby removing the need to include it in the data.
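The difference between "rows inside each group" and "number of groups" can be seen on a two-table toy schema. This is a sketch in Python's sqlite3 with invented companies and products, not the question's full five-table join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (id INTEGER PRIMARY KEY, company_name TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY, company_id INTEGER);
    -- invented data: Acme has 3 products, Globex has 2
    INSERT INTO companies VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO products (company_id) VALUES (1), (1), (1), (2), (2);
""")

# COUNT(*) next to GROUP BY counts the joined rows inside each group.
per_group = conn.execute("""
    SELECT c.id, COUNT(*)
    FROM companies c
    JOIN products p ON p.company_id = c.id
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

# Wrapping the grouped query and counting its rows gives the number of groups.
group_total = conn.execute("""
    SELECT COUNT(*)
    FROM (
        SELECT c.id
        FROM companies c
        JOIN products p ON p.company_id = c.id
        GROUP BY c.id
    ) AS grouped
""").fetchone()[0]

print(per_group, group_total)
```

The client-side alternative mentioned above is simply len() of the fetched result of the grouped query, which needs no extra column at all.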
The following query works:
SELECT *, COUNT(*) FROM (
SELECT `profiles`.`logo` AS logo,
`companies`.`company_name`,
`companies`.`url_slug`
FROM (`companies`)
JOIN `users` ON `users`.`id` = `companies`.`user_id`
JOIN `categories` ON `categories`.`company_id` = `companies`.`id`
JOIN `products` ON `products`.`company_id` = `companies`.`id`
JOIN `profiles` ON `profiles`.`company_id` = `companies`.`id`
WHERE `users`.`last_login` IS NOT NULL
#AND `categories`.`category_id` = '3'
AND `products`.`active` = 1
#AND `products`.`xmp_1` = 1
#AND `products`.`xmp_2` = 1
AND `profiles`.`field_a` = 1
GROUP BY `companies`.`id`
) AS stuff
I think your original COUNT(*) is counting the joined rows that fall inside each company group, rather than the number of groups that remain after grouping.
In the following query, I show the latest status of the sale (by stage, in this case the number 3). The query is based on a subquery in the status history of the sale:
SELECT v.id_sale,
IFNULL((
SELECT (CASE WHEN IFNULL( vec.description, '' ) = ''
THEN ve.name
ELSE vec.description
END)
FROM t_record veh
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign
INNER JOIN t_state ve ON ve.id_state = vec.id_state
WHERE veh.id_sale = v.id_sale
AND vec.id_stage = 3
ORDER BY veh.id_record DESC
LIMIT 1
), 'x') sale_state_3
FROM t_sale v
INNER JOIN t_quarters sd ON v.id_quarters = sd.id_quarters
WHERE 1 =1
AND v.flag =1
AND v.id_quarters =4
AND EXISTS (
SELECT '1'
FROM t_record
WHERE id_sale = v.id_sale
LIMIT 1
)
The query takes 0.0057 sec and returns 1011 records.
Because I have to filter the sales by the name of the state, which would mean repeating the subquery in the WHERE clause, I decided to rewrite the query using joins. In this case, I'm using the MAX function to obtain the latest status:
SELECT
v.id_sale,
IFNULL(veh3.State3,'x') AS sale_state_3
FROM t_sale v
INNER JOIN t_quarters sd ON v.id_quarters = sd.id_quarters
LEFT JOIN (
SELECT veh.id_sale,
(CASE WHEN IFNULL(vec.description,'') = ''
THEN ve.name
ELSE vec.description END) AS State3
FROM t_record veh
INNER JOIN (
SELECT id_sale, MAX(id_record) AS max_rating
FROM(
SELECT veh.id_sale, id_record
FROM t_record veh
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign AND vec.id_stage = 3
) m
GROUP BY id_sale
) x ON x.max_rating = veh.id_record
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign
INNER JOIN t_state ve ON ve.id_state = vec.id_state
) veh3 ON veh3.id_sale = v.id_sale
WHERE v.flag = 1
AND v.id_quarters = 4
This query returns the same results (1011 records), but the problem is that it takes 0.0753 sec.
Reviewing the possibilities, I found the factor that makes the difference in the speed of the query:
AND EXISTS (
SELECT '1'
FROM t_record
WHERE id_sale = v.id_sale
LIMIT 1
)
If I remove this clause, both queries take the same time. Why does it work better with it? Is there any way to use this clause with the joins? I hope you can help.
EDIT
I will show the results of EXPLAIN for each query respectively:
q1:
q2:
Interesting, so that little statement basically determines if there is a match between t_record.id_sale and t_sale.id_sale.
Why is this making your query run faster? Because WHERE conditions are applied before the subselects in the SELECT list are evaluated, so if there is no record to go with the sale, the subselect is never processed for that row. That saves you some time, which is why it works better.
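A reduced sketch of that effect in Python's sqlite3: the EXISTS filter decides which sales reach the SELECT list, and only those rows trigger the "latest record" subselect. The schema is heavily simplified and partly hypothetical (a single state column on t_record replaces the t_state_campaign and t_state joins), and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_sale (id_sale INTEGER PRIMARY KEY);
    -- 'state' is a hypothetical column standing in for the joined state name
    CREATE TABLE t_record (id_record INTEGER PRIMARY KEY, id_sale INTEGER, state TEXT);
    INSERT INTO t_sale VALUES (1), (2), (3);
    -- sale 3 has no history at all
    INSERT INTO t_record VALUES (1, 1, 'open'), (2, 1, 'closed'), (3, 2, 'open');
""")

LATEST_STATE = """
    SELECT v.id_sale,
           IFNULL((SELECT r.state
                   FROM t_record r
                   WHERE r.id_sale = v.id_sale
                   ORDER BY r.id_record DESC
                   LIMIT 1), 'x') AS sale_state
    FROM t_sale v
"""

# Without the filter, every sale is returned and the subselect runs for each.
unfiltered = conn.execute(LATEST_STATE + " ORDER BY v.id_sale").fetchall()

# With the filter, sales without history are dropped before the SELECT list.
filtered = conn.execute(LATEST_STATE + """
    WHERE EXISTS (SELECT 1 FROM t_record r WHERE r.id_sale = v.id_sale)
    ORDER BY v.id_sale
""").fetchall()

print(unfiltered)
print(filtered)
```

Whether this translates into a measurable gain depends on how many sales have no records and on the indexes available on t_record.id_sale.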
Is it going to work in your join syntax? I don't really know without having your tables to test against, but you can always just append it to the end of the WHERE clause and find out. Add the keyword EXPLAIN to the beginning of your query and you will get an execution plan, which will help you optimize things. Probably the best way to get better results with your join syntax is to add some indexes to your tables.
But I ask you, is this even necessary? You have a query returning in under 8 hundredths of a second. Unless this query is being run thousands of times an hour, it is not really taxing your DB at all, and your time is probably better spent making improvements elsewhere in your application.