Fixing SQL Query so it will become more Efficient - mysql

I've got 3 tables:
1. mobile_users - with id, phone_type, ...
2+3. iphone_purchases and android_purchases - with id, status, user_id, ...
I am trying to get all of the users who made 2 or more purchases.
A successful purchase is identified by status > 0.
I am also trying to get the total number of users in the mobile_users table in the same query.
This is the query I came up with:
SELECT COUNT(*) AS `users`,
( SELECT COUNT(*)
FROM `mobile_users`
) AS `total`
FROM `mobile_users`
WHERE `mobile_users`.`phone_type` = 'iphone'
AND ( SELECT COUNT(*)
FROM ( SELECT `status`,
`user_id`
FROM `iphone_purchases`
UNION
SELECT `status`,
`user_id`
FROM `android_purchases`
) AS `purchase_list`
WHERE `purchase_list`.`status` > 0
AND `purchase_list`.`user_id` = `mobile_users`.`id`
) >= 2
It's very slow, and I have to find a way to improve it.
Any help would be appreciated!
Edit:
Also, take into consideration that I'm building this query from sub-queries in PHP, and I'm adding more conditions to the WHERE clause.

Your query is just returning counts of users, not each user.
The following restructures your query. It counts the number of purchases for iphones and androids separately, and then combines them using left outer join. The where clause simply combines the counts:
select mu.*, i.cnt as iphones, a.cnt as androids
from mobile_users mu left outer join
(SELECT `user_id`, count(*) as cnt
FROM `iphone_purchases`
where `status` > 0
group by user_id
) i
on i.user_id = mu.id left outer join
(SELECT `user_id`, count(*) as cnt
FROM `android_purchases`
where `status` > 0
group by user_id
) a
on a.user_id = mu.id
where coalesce(i.cnt, 0) + coalesce(a.cnt, 0) >= 2;
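As a rough illustration of the restructured query, the sketch below runs the same per-table counts and outer joins against SQLite (through Python's sqlite3 module, used here only as a stand-in for MySQL) and returns the two numbers the question asked for. The sample rows and the helper names build_demo_db and count_repeat_buyers are made up for the example, and the phone_type filter is left out. Note that user 1 has two purchases with the same status: the UNION in the question would collapse those into one row, so that user would be missed there.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a few made-up users and purchases."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE mobile_users (id INTEGER PRIMARY KEY, phone_type TEXT);
        CREATE TABLE iphone_purchases (id INTEGER PRIMARY KEY, status INTEGER, user_id INTEGER);
        CREATE TABLE android_purchases (id INTEGER PRIMARY KEY, status INTEGER, user_id INTEGER);
        INSERT INTO mobile_users VALUES (1, 'iphone'), (2, 'iphone'), (3, 'android'), (4, 'android');
        INSERT INTO iphone_purchases (status, user_id) VALUES (1, 1), (1, 1), (1, 2), (0, 2), (1, 3);
        INSERT INTO android_purchases (status, user_id) VALUES (2, 3), (0, 2);
    """)
    return conn


def count_repeat_buyers(conn):
    """Return (users with 2+ successful purchases, total users)."""
    sql = """
        SELECT COUNT(*) AS users,
               (SELECT COUNT(*) FROM mobile_users) AS total
        FROM mobile_users mu
        LEFT JOIN (SELECT user_id, COUNT(*) AS cnt
                   FROM iphone_purchases
                   WHERE status > 0
                   GROUP BY user_id) i ON i.user_id = mu.id
        LEFT JOIN (SELECT user_id, COUNT(*) AS cnt
                   FROM android_purchases
                   WHERE status > 0
                   GROUP BY user_id) a ON a.user_id = mu.id
        WHERE COALESCE(i.cnt, 0) + COALESCE(a.cnt, 0) >= 2
    """
    users, total = conn.execute(sql).fetchone()
    return users, total


users, total = count_repeat_buyers(build_demo_db())
print(users, total)
```

Each purchases table is aggregated once, instead of re-running a correlated subquery for every row of mobile_users.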


Calculations using aliases (from a subquery to the same table) mySQL

I have a database that stores player kills in CS:GO, I am trying to write a query that can show each player's KD.
I've written a query that will show each player's kills and deaths using aliases.
SELECT
`Name`,
`SteamID` as PlayerID,
count(`EventType`) as kills,
(SELECT count(`EventType`)
FROM `logdata`
WHERE (`EventVariable` = PlayerID AND `EventType` = 'killed')
GROUP BY `EventVariable`
ORDER BY count(`EventType`) DESC) as deaths
FROM `logdata`
WHERE `EventType` = 'killed'
GROUP BY `EventType`, `Name`
ORDER BY kills DESC
(results limited to just bots, I didn't want to openly advertise my friends' SteamIDs)
To work out KD I just need to divide kills / deaths, but you can't reference aliases in the same SELECT list. I read that I should be able to wrap the aliases, e.g. (SELECT kills) / (SELECT deaths) as KD, but that doesn't work.
The table looks like this: (Limited to bots again)
I am currently working out KD in PHP using the result of my query but that isn't a great way of doing it. (I am unable to query who has the highest KD for example)
So, my question is, how would I go about calculating the KD if I am unable to make calculations using alias?
I might just write your query using two completely separate subqueries which compute the kills and deaths counts:
SELECT
n.Name,
COALESCE(t1.kill_cnt, 0) AS kills,
COALESCE(t2.death_cnt, 0) AS deaths,
CASE WHEN t2.death_cnt > 0
THEN CAST(t1.kill_cnt / t2.death_cnt AS CHAR(50))
ELSE 'NA' END AS ratio
FROM
( SELECT DISTINCT Name FROM logdata ) n
LEFT JOIN
(
SELECT Name, COUNT(*) AS kill_cnt
FROM logdata
WHERE EventType = 'killed'
GROUP BY Name
) t1
ON
n.Name = t1.Name
LEFT JOIN
(
SELECT EventVariable AS Name, COUNT(*) AS death_cnt
FROM logdata
WHERE EventType = 'killed'
GROUP BY EventVariable
) t2
ON
n.Name = t2.Name
Note that the subquery above which I have aliased as n is just intended to generate a complete list of all users in your database. Ideally, there should be a dedicated user table somewhere. If not, and you don't like my approach, then you will have to come up with some other way to obtain a list of all users.
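A minimal sketch of the same two-subquery idea, run against SQLite through Python's sqlite3 module as a stand-in for MySQL. The logdata rows are invented, and the players are identified by SteamID in both the kill and the death subquery, which is an assumption based on how the question compares EventVariable with SteamID. Unlike the query above, the ratio is kept numeric (NULL when a player has no deaths) so that ORDER BY sorts it as a number rather than as text; the multiplication by 1.0 is only needed because SQLite divides integers as integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logdata (SteamID TEXT, EventType TEXT, EventVariable TEXT)"
)
# Made-up events: (killer, event type, victim).
conn.executemany(
    "INSERT INTO logdata VALUES (?, ?, ?)",
    [
        ("A", "killed", "B"),
        ("A", "killed", "B"),
        ("A", "killed", "C"),
        ("B", "killed", "A"),
        ("C", "connected", ""),
        ("D", "killed", "C"),
    ],
)

KD_SQL = """
    SELECT n.SteamID,
           COALESCE(k.kill_cnt, 0)  AS kills,
           COALESCE(d.death_cnt, 0) AS deaths,
           CASE WHEN d.death_cnt > 0
                THEN COALESCE(k.kill_cnt, 0) * 1.0 / d.death_cnt
           END AS ratio              -- NULL when the player never died
    FROM (SELECT DISTINCT SteamID FROM logdata) n
    LEFT JOIN (SELECT SteamID, COUNT(*) AS kill_cnt
               FROM logdata
               WHERE EventType = 'killed'
               GROUP BY SteamID) k ON k.SteamID = n.SteamID
    LEFT JOIN (SELECT EventVariable AS SteamID, COUNT(*) AS death_cnt
               FROM logdata
               WHERE EventType = 'killed'
               GROUP BY EventVariable) d ON d.SteamID = n.SteamID
    ORDER BY ratio DESC
"""

kd_rows = conn.execute(KD_SQL).fetchall()
for row in kd_rows:
    print(row)
```

Because kills and deaths are plain columns of the derived tables here, the division can use them directly; no alias from the same SELECT list is referenced.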
Thanks to Tim for pointing me in the right direction and providing a query. I have made some changes to get the result I want and I wanted to post the final result.
SELECT
n.SteamID,
COALESCE(t1.kill_cnt, 0) AS kills,
COALESCE(t2.death_cnt, 0) AS deaths,
CASE WHEN t2.death_cnt > 0 THEN CAST(t1.kill_cnt / t2.death_cnt AS CHAR(50))
WHEN t1.kill_cnt = 0 THEN '0'
ELSE 'Infinite' END AS ratio
FROM
( SELECT DISTINCT SteamID FROM logdata ) n
LEFT JOIN
(
SELECT SteamID, COUNT(*) AS kill_cnt
FROM logdata
WHERE EventType = 'killed'
GROUP BY SteamID
) t1
ON
n.SteamID = t1.SteamID
LEFT JOIN
(
SELECT EventVariable AS SteamID, COUNT(*) AS death_cnt
FROM logdata
WHERE EventType = 'killed'
GROUP BY EventVariable
) t2
ON
n.SteamID = t2.SteamID
WHERE t1.kill_cnt > 0 or t2.death_cnt > 0
ORDER BY `ratio` DESC
I attempted to get KD 0 to show as such but that is not all that important at the end of the day, NULL is easy to work with.

MySQL row count

I have a very large table (~1 000 000 rows) and a complicated query with unions, joins and WHERE conditions (the user can select different ORDER BY columns and directions). I need to get a row count for pagination. If I run the query without counting rows, it completes very fast. How can I implement pagination in the fastest way?
I tried to use EXPLAIN SELECT and SHOW TABLE STATUS to get an approximate row count, but it is very different from the real row count.
My query is like this one (simplified):
SELECT * FROM (
(
SELECT * FROM table_1
LEFT JOIN `table_a` ON table_1.record_id = table_a.id
LEFT JOIN `table_b` ON table_a.id = table_b.record_id
WHERE table_1.a > 10 AND table_a.b < 500 AND table_b.c = 1
ORDER BY x ASC
LIMIT 0, 10
)
UNION
(
SELECT * FROM table_2
LEFT JOIN `table_a` ON table_2.record_id = table_a.id
LEFT JOIN `table_b` ON table_a.id = table_b.record_id
WHERE table_2.d < 10 AND table_a.e > 500 AND table_b.f = 1
ORDER BY x ASC
LIMIT 0, 10
)
) tbl ORDER BY x ASC LIMIT 0, 10
The query result without the limit is about 100 000 rows. How can I get this approximate count in the fastest way?
My production query looks like this:
SELECT SQL_CALC_FOUND_ROWS * FROM (
(
SELECT
articles_log.id AS log_id, articles_log.source_table,
articles_log.record_id AS id, articles_log.dat AS view_dat,
articles_log.lang AS view_lang, '1' AS view_count, '1' AS unique_view_count,
articles_log.user_agent, articles_log.ref, articles_log.ip,
articles_log.ses_id, articles_log.bot, articles_log.source_type, articles_log.link,
articles_log.user_country, articles_log.user_platform,
articles_log.user_os, articles_log.user_browser,
`contents`.dat AS source_dat, `contents_trans`.header, `contents_trans`.custom_text
FROM articles_log
INNER JOIN `contents` ON articles_log.record_id = `contents`.id
AND articles_log.source_table = 'contents'
INNER JOIN `contents_trans` ON `contents`.id = `contents_trans`.record_id
AND `contents_trans`.lang='lv'
WHERE articles_log.dat > 0
AND articles_log.dat >= 1488319200
AND articles_log.dat <= 1489355999
AND articles_log.bot = '0'
AND (articles_log.record_id NOT LIKE '%\_404' AND articles_log.record_id <> '404'
OR articles_log.source_table <> 'contents')
)
UNION
(
SELECT
articles_log.id AS log_id, articles_log.source_table,
articles_log.record_id AS id, articles_log.dat AS view_dat,
articles_log.lang AS view_lang, '1' AS view_count, '1' AS unique_view_count,
articles_log.user_agent, articles_log.ref, articles_log.ip,
articles_log.ses_id, articles_log.bot,
articles_log.source_type, articles_log.link,
articles_log.user_country, articles_log.user_platform,
articles_log.user_os, articles_log.user_browser,
`news`.dat AS source_dat, `news_trans`.header, `news_trans`.custom_text
FROM articles_log
INNER JOIN `news` ON articles_log.record_id = `news`.id
AND articles_log.source_table = 'news'
INNER JOIN `news_trans` ON `news`.id = `news_trans`.record_id
AND `news_trans`.lang='lv'
WHERE articles_log.dat > 0
AND articles_log.dat >= 1488319200
AND articles_log.dat <= 1489355999
AND articles_log.bot = '0'
AND (articles_log.record_id NOT LIKE '%\_404' AND articles_log.record_id <> '404'
OR articles_log.source_table <> 'contents')
)
) tbl ORDER BY view_dat ASC LIMIT 0, 10
Many thanks!
If you can use UNION ALL instead of UNION (which is shorthand for UNION DISTINCT), in other words, if you don't need to remove duplicates, you can try adding the counts of the two subqueries:
SELECT
(
SELECT COUNT(*) FROM table_1
LEFT JOIN `table_a` ON table_1.record_id = table_a.id
LEFT JOIN `table_b` ON table_a.id = table_b.record_id
WHERE table_1.a > 10 AND table_a.b < 500 AND table_b.c = 1
)
+
(
SELECT COUNT(*) FROM table_2
LEFT JOIN `table_a` ON table_2.record_id = table_a.id
LEFT JOIN `table_b` ON table_a.id = table_b.record_id
WHERE table_2.d < 10 AND table_a.e > 500 AND table_b.f = 1
)
AS cnt
Without ORDER BY and without UNION the engine might not need to create a huge temp table.
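To make the condition concrete, here is a small sketch on two made-up single-column tables (the names table_1 and table_2 are borrowed from the simplified query, the column x and the rows are invented), run through Python's sqlite3 module as a stand-in for MySQL. Adding the two counts matches the row count of UNION ALL, but not of plain UNION once the branches share a row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (x INTEGER);
    CREATE TABLE table_2 (x INTEGER);
    INSERT INTO table_1 VALUES (1), (2), (3);
    INSERT INTO table_2 VALUES (3), (4);   -- the value 3 appears in both tables
""")


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


sum_of_counts = scalar(
    "SELECT (SELECT COUNT(*) FROM table_1) + (SELECT COUNT(*) FROM table_2)"
)
union_all_count = scalar(
    "SELECT COUNT(*) FROM (SELECT x FROM table_1 UNION ALL SELECT x FROM table_2) AS u"
)
union_count = scalar(
    "SELECT COUNT(*) FROM (SELECT x FROM table_1 UNION SELECT x FROM table_2) AS u"
)

print(sum_of_counts, union_all_count, union_count)
```

So the summed count is exact only when the two branches cannot produce identical rows, or when duplicates are meant to be kept.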
Update
For your original query try the following:
Select only count(*).
Remove OR articles_log.source_table <> 'contents' from the first part (contents), since we know it's never true there.
Remove AND (articles_log.record_id NOT LIKE '%\_404' AND articles_log.record_id <> '404' OR articles_log.source_table <> 'contents') from the second part (news), since we know it's always true there: OR articles_log.source_table <> 'contents' always holds when source_table = 'news'.
Remove the joins with contents and news. You can join the *_trans tables directly using record_id.
Remove articles_log.dat > 0, since it's redundant with articles_log.dat >= 1488319200.
The resulting query:
SELECT (
SELECT COUNT(*)
FROM articles_log
INNER JOIN `contents_trans`
ON `contents_trans`.record_id = articles_log.record_id
AND `contents_trans`.lang='lv'
WHERE articles_log.bot = '0'
AND articles_log.dat >= 1488319200
AND articles_log.dat <= 1489355999
AND articles_log.record_id NOT LIKE '%\_404'
AND articles_log.record_id <> '404'
) + (
SELECT COUNT(*)
FROM articles_log
INNER JOIN `news_trans`
ON `news_trans`.record_id = articles_log.record_id
AND `news_trans`.lang='lv'
WHERE articles_log.bot = '0'
AND articles_log.dat >= 1488319200
AND articles_log.dat <= 1489355999
) AS cnt
Try the following index combinations:
articles_log(bot, dat, record_id)
contents_trans(lang, record_id)
news_trans(lang, record_id)
or
contents_trans(lang, record_id)
news_trans(lang, record_id)
articles_log(record_id, bot, dat)
Which combination is better depends on the data.
I might be wrong on one or more points, since I don't know your data and business logic. If so, try to adjust the rest.
You can get the count when you run the query using SQL_CALC_FOUND_ROWS, as explained in the documentation:
select SQL_CALC_FOUND_ROWS *
. . .
And then running:
select FOUND_ROWS()
However, the first run needs to generate all the data, so you are going to get up to 20 possible rows; I don't think it respects LIMIT in subqueries.
Given the structure of your query and what you want to do, I would first think about optimizing the query. For instance, is UNION really needed (it incurs overhead for removing duplicates)? As pointed out in a comment, your joins are really inner joins disguised as outer joins. Indexes might improve performance.
You might want to ask another question, providing sample data and desired results to get advice on such issues.

Count on multiple tables with missing zero counts

I am running this query to return data with count < 50. It works fine while the count is > 0 and < 50, but when the count becomes 0, it does not return the data. The count is defined by `coupons`.`status`. When the count is zero, there are no rows in the coupons table with status 1. This creates the issue, as the whole row is omitted.
SELECT count(*) AS count, clients.title, plans.name
FROM `coupons`
INNER JOIN `clients` ON `coupons`.`client_id` = `clients`.`id`
INNER JOIN `plans` ON `coupons`.`plan_id` = `plans`.`id`
WHERE `coupons`.`status` = 1
GROUP BY `coupons`.`client_id`, `coupons`.`plan_id`
HAVING count < 50
Please help me fix it.
Table definitions.
coupons (id, client_id, plan_id, customer_id, status, code)
plans (id, name)
clients (id, name...)
client_plans (id, client_id, plan_id)
Basically, a client can have multiple plans and a plan can belong to multiple clients.
The coupons table stores predefined coupons which can be allocated to customers. Non-allocated coupons have status 0, whereas allocated coupons get status 1.
Here I am trying to fetch the non-allocated coupon count per client and per plan, where the count is either less than 50 or has reached 0.
For example:
If the coupons table has 10 rows with client_id = 1, plan_id = 1 and status 1, it should return a count of 10, but when the table has 0 rows with client_id = 1, plan_id = 1 and status 1, the above query does not return anything.
Thank you all for your inputs, this worked.
select
sum(CASE WHEN `coupons`.`status` = 1 THEN 1 ELSE 0 END) as count,
clients.title,
plans.name
from
`clients`
left join
`coupons`
on
`coupons`.`client_id` = `clients`.`id`
left join
`plans`
on
`coupons`.`plan_id` = `plans`.`id`
group by
`coupons`.`client_id`,
`coupons`.`plan_id`
having
count < 50
With the inner joins, the query is not going to return any "zero" counts.
If you want to return "zero" counts, you are going to need an outer join somewhere.
But it's not clear what you are actually trying to count.
Assuming that what you are trying to get is a count of rows from coupons, for every possible combination of rows from plans and clients, you could do something like this:
SELECT COUNT(`coupons`.`client_id`) AS `count`
, clients.title
, plans.name
FROM `plans`
CROSS
JOIN `clients`
LEFT
JOIN `coupons`
ON `coupons`.`client_id` = `clients`.`id`
AND `coupons`.`plan_id` = `plans`.`id`
AND `coupons`.`status` = 1
GROUP
BY `clients`.`id`
, `plans`.`id`
HAVING `count` < 50
This is just a guess at result set you are expecting to return. Absent table definitions, example data, and the expected result, we're just guessing.
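The difference between the two join styles can be sketched on a handful of invented rows, here run against SQLite through Python's sqlite3 module as a stand-in for MySQL. The table and column names follow the question; the data is made up. The inner-join query drops every client/plan pair without a status 1 coupon, while the CROSS JOIN plus LEFT JOIN query reports those pairs with a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE plans   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE coupons (client_id INTEGER, plan_id INTEGER, status INTEGER);
    INSERT INTO clients VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO plans   VALUES (1, 'Basic'), (2, 'Pro');
    INSERT INTO coupons VALUES (1, 1, 1), (1, 1, 1), (1, 2, 0), (2, 1, 1);
""")

# Inner joins: pairs without a matching status = 1 coupon disappear.
inner_rows = conn.execute("""
    SELECT c.title, p.name, COUNT(*) AS cnt
    FROM coupons cp
    JOIN clients c ON c.id = cp.client_id
    JOIN plans   p ON p.id = cp.plan_id
    WHERE cp.status = 1
    GROUP BY c.id, p.id
    ORDER BY c.id, p.id
""").fetchall()

# Every client/plan pair, with the status filter moved into the ON clause
# so that unmatched pairs survive and are counted as 0.
outer_rows = conn.execute("""
    SELECT c.title, p.name, COUNT(cp.client_id) AS cnt
    FROM plans p
    CROSS JOIN clients c
    LEFT JOIN coupons cp
           ON cp.client_id = c.id
          AND cp.plan_id   = p.id
          AND cp.status    = 1
    GROUP BY c.id, p.id
    HAVING COUNT(cp.client_id) < 50
    ORDER BY c.id, p.id
""").fetchall()

print(inner_rows)
print(outer_rows)
```

COUNT(cp.client_id) counts only non-NULL values, which is why the unmatched pairs come out as 0 rather than 1.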
FOLLOWUP
Based on your comment, it sounds like you want conditional aggregation.
To "count" only the rows in coupons that have status=1, you can do something like this:
SELECT SUM( `coupons`.`status` = 1 ) AS `count`
, clients.title
, plans.name
FROM `coupons`
JOIN `plans`
ON `plans`.`id` = `coupons`.`plan_id`
JOIN `clients`
ON `clients`.`id` = `coupons`.`client_id`
GROUP
BY `clients`.`id`
, `plans`.`id`
HAVING `count` < 50
There are other expressions you can use to get the conditional "count". For example
SELECT COUNT( IF(`coupons`.`status`=1, 1, NULL) ) AS `count`
or
SELECT SUM( IF(`coupons`.`status`=1, 1, 0) ) AS `count`
or, for a more ANSI standards compatible approach
SELECT SUM( CASE WHEN `coupons`.`status` = 1 THEN 1 ELSE 0 END ) AS `count`
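A quick way to compare these expressions is to run them side by side. The sketch below does that in SQLite through Python's sqlite3 module (a stand-in for MySQL; the IF() variants are left out because they are MySQL-specific, and the helper name conditional_counts is made up). One difference worth knowing shows up on an empty input: the SUM forms return NULL, while the COUNT form returns 0.

```python
import sqlite3


def conditional_counts(statuses):
    """Return the three conditional counts for a list of status values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE coupons (status INTEGER)")
    conn.executemany(
        "INSERT INTO coupons VALUES (?)", [(s,) for s in statuses]
    )
    return conn.execute("""
        SELECT SUM(status = 1),
               COUNT(CASE WHEN status = 1 THEN 1 END),
               SUM(CASE WHEN status = 1 THEN 1 ELSE 0 END)
        FROM coupons
    """).fetchone()


print(conditional_counts([1, 1, 0, 0, 1]))
print(conditional_counts([]))
```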

Got Sql Error in Syntax and need efficient sql query

As per my requirements I wrote the query below, but it is not working.
The query is:
SELECT *
FROM T_INV_DTL T
LEFT JOIN (
SELECT inv_dtl_id,
Employee_id AS emp_id,
GROUP_CONCAT(DISTINCT Employee_id) AS Employee_id
FROM T_INV_INVESTIGATOR
GROUP BY
inv_dtl_id
)TII
ON T.inv_dtl_id = TII.inv_dtl_id
JOIN T_INVESTIGATION TI
ON T.inv_id = TI.inv_id
LEFT JOIN (
SELECT inv_dtl_id
FROM T_INV_BILL
GROUP BY
inv_dtl_id
)TIB
ON T.inv_dtl_id = TIB.inv_dtl_id
JOIN T_Insurance_company TIC
ON TI.client_id = TIC.ins_cmp_id
WHERE 1 T.Report_dt != '0000-00-00'
AND (
T.inv_dtl_id NOT IN (SELECT inv_dtl_id
FROM T_INV_BILL TIBS
WHERE TIBS.inv_dtl_id NOT IN (SELECT
inv_dtl_id
FROM
T_INV_BILL
WHERE
Bill_submitted_dt =
'0000-00-00'))
)
ORDER BY
Allotment_dt DESC
LIMIT 20
Can anyone tell me what the problem is, and can you please rewrite it as a more efficient query? (If we have more than 100 records, we also take the count for pagination, so it should be fast.)
T_INV_DTL is the main table and it connects to the others. My problem is that each entry in T_INV_DTL has multiple investigation bills in the table T_INV_BILL. Report_dt is in T_INV_DTL. So the outcome I need is a row whenever there's a report date in T_INV_DTL and not at least one bill date in T_INV_BILL.
I need the result when both hold: there's a report date in T_INV_DTL and not at least one bill date in T_INV_BILL (if all bills have a bill submitted date entered, the row is not needed).
While I admittedly don't know what issues you're having (please provide additional info), your query does look like it could be optimized.
Moving your WHERE criteria into the join should save 2 of your table scans:
SELECT *
FROM T_INV_DTL T
LEFT JOIN (
SELECT inv_dtl_id,
Employee_id AS emp_id,
GROUP_CONCAT(DISTINCT Employee_id) AS Employee_id
FROM T_INV_INVESTIGATOR
GROUP BY
inv_dtl_id
)TII
ON T.inv_dtl_id = TII.inv_dtl_id
JOIN T_INVESTIGATION TI
ON T.inv_id = TI.inv_id
LEFT JOIN (
SELECT inv_dtl_id
FROM T_INV_BILL
WHERE Bill_submitted_dt != '0000-00-00'
GROUP BY inv_dtl_id
)TIB
ON T.inv_dtl_id = TIB.inv_dtl_id
JOIN T_Insurance_company TIC
ON TI.client_id = TIC.ins_cmp_id
WHERE T.Report_dt != '0000-00-00'
AND TIB.inv_dtl_id IS NULL
ORDER BY
Allotment_dt DESC
LIMIT 20
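One thing worth checking before adopting the join version: it keeps a detail row only when that detail has no submitted bill at all, whereas the nested NOT IN filter in the question also keeps details with a mix of submitted and unsubmitted bills. The sketch below runs both filters on invented rows in SQLite, through Python's sqlite3 module as a stand-in for MySQL. Only the columns needed for the filter are modelled, and dates are stored as plain text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T_INV_DTL  (inv_dtl_id INTEGER PRIMARY KEY, Report_dt TEXT);
    CREATE TABLE T_INV_BILL (inv_dtl_id INTEGER, Bill_submitted_dt TEXT);
    INSERT INTO T_INV_DTL VALUES
        (1, '2017-01-01'),   -- no bills at all
        (2, '2017-01-02'),   -- one unsubmitted bill
        (3, '2017-01-03'),   -- one submitted and one unsubmitted bill
        (4, '2017-01-04'),   -- all bills submitted
        (5, '0000-00-00');   -- no report date
    INSERT INTO T_INV_BILL VALUES
        (2, '0000-00-00'),
        (3, '2017-02-01'), (3, '0000-00-00'),
        (4, '2017-02-02'), (4, '2017-02-03');
""")

# Anti-join as in the rewritten query: no submitted bill may exist.
anti_join_ids = [r[0] for r in conn.execute("""
    SELECT T.inv_dtl_id
    FROM T_INV_DTL T
    LEFT JOIN (SELECT inv_dtl_id
               FROM T_INV_BILL
               WHERE Bill_submitted_dt != '0000-00-00'
               GROUP BY inv_dtl_id) TIB ON TIB.inv_dtl_id = T.inv_dtl_id
    WHERE T.Report_dt != '0000-00-00'
      AND TIB.inv_dtl_id IS NULL
    ORDER BY T.inv_dtl_id
""")]

# Nested NOT IN filter as written in the question.
not_in_ids = [r[0] for r in conn.execute("""
    SELECT T.inv_dtl_id
    FROM T_INV_DTL T
    WHERE T.Report_dt != '0000-00-00'
      AND T.inv_dtl_id NOT IN (
            SELECT inv_dtl_id
            FROM T_INV_BILL TIBS
            WHERE TIBS.inv_dtl_id NOT IN (
                  SELECT inv_dtl_id
                  FROM T_INV_BILL
                  WHERE Bill_submitted_dt = '0000-00-00'))
    ORDER BY T.inv_dtl_id
""")]

print(anti_join_ids, not_in_ids)
```

Detail 3 is where the two differ, so which filter is right depends on how partially billed details should be treated.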

inner join + count + group by

I'm having trouble counting/grouping the results of an inner join.
I have two tables:
result_dump: which has two columns: email and result (the result value can be either open or bounce)
all_data: which has three columns: email, full_name and address
The first goal is to query the result_dump table and count, per email, the number of times the result is "open".
This query works great:
SELECT `email`, COUNT(*) AS count
FROM `result_dump`
WHERE `date` = "open"
GROUP BY `email`
HAVING COUNT(*) > 3
ORDER BY count DESC
The second goal is to take those results (anyone who "opened" more than 3 times) and pull in the 'full_name' and 'address' so I will have details on who opened an email more than 3 times.
I have this query and it works as far as getting the data together, but I can't figure out how to get the COUNT, HAVING and ORDER BY to work with the INNER JOIN.
SELECT *
FROM all_data
INNER JOIN result_dump ON
all_data.email = result_dump.email
where `result` = "open"
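SELECT all_data.email, all_data.full_name, all_data.address, COUNT(*) AS count
FROM all_data
INNER JOIN result_dump ON
all_data.email = result_dump.email
WHERE result_dump.`result` = "open"
GROUP BY all_data.email, all_data.full_name, all_data.address
HAVING COUNT(*) > 3
ORDER BY count DESC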
Nothing wrong with this one I think.
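For anyone who wants to try the join, grouping and HAVING together, here is a minimal sketch on invented rows, run in SQLite through Python's sqlite3 module as a stand-in for MySQL. The table and column names follow the question; the alias opens and the sample people are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE all_data    (email TEXT, full_name TEXT, address TEXT);
    CREATE TABLE result_dump (email TEXT, result TEXT);
    INSERT INTO all_data VALUES
        ('a@x.com', 'Al', '1 Oak'),
        ('b@x.com', 'Bo', '2 Ash'),
        ('c@x.com', 'Cy', '3 Elm');
""")
# a@x.com: 4 opens and 1 bounce, b@x.com: 3 opens, c@x.com: 5 opens.
events = (
    [("a@x.com", "open")] * 4 + [("a@x.com", "bounce")]
    + [("b@x.com", "open")] * 3
    + [("c@x.com", "open")] * 5
)
conn.executemany("INSERT INTO result_dump VALUES (?, ?)", events)

frequent_openers = conn.execute("""
    SELECT a.email, a.full_name, a.address, COUNT(*) AS opens
    FROM all_data a
    INNER JOIN result_dump r ON r.email = a.email
    WHERE r.result = 'open'
    GROUP BY a.email, a.full_name, a.address
    HAVING COUNT(*) > 3
    ORDER BY opens DESC
""").fetchall()

print(frequent_openers)
```

The WHERE clause filters rows before grouping, and HAVING filters the groups afterwards, which is why the bounce row and the 3-open user do not appear.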
Try the following query:
SELECT * FROM all_data AS a
INNER JOIN
(SELECT * FROM result_dump where email IN
(SELECT `email`
FROM `result_dump`
WHERE `date` = "open"
GROUP BY `email`
HAVING count(email) >3
ORDER BY count(email) DESC)) AS b
ON a.email = b.email
WHERE b.`result` = "open"
This works fine, try this:
SELECT title.title
,count(*)
,title.production_year
,title.id as movie_id
,title.flag as language
,movie_info.info
FROM title INNER JOIN movie_info ON title.id=movie_info.movie_id;