MySQL query - optimized - mysql

Is it possible to achieve this task with a single MySQL query?
Table urls. Fields {id, url}
1, www.mysite.kom
2, mysite.kom
3, anothersite.kom
Table logs. Fields {id, url_id, group_type - a number in the range of 1..10}
1, 1, 4
2, 1, 4
3, 2, 5
4, 2, 5
5, 3, 9
The result of the query in this example should be: 1 (mysite.kom and www.mysite.kom = 1)
THE GOAL:
Need to count all distinct urls recorded in logs table, but with a few conditions:
1) Urls with and without www. prefix, like mysite.kom and www.mysite.kom, should be counted as 1 (not 2).
2) Have group_type in the range of 4..6
3) Now, any of these urls with group_type 4..6, which appear in the list of those with group_type lower than 4 - should be ignored and not counted at all.
The SQL code:
SELECT COUNT(DISTINCT TRIM(LEADING 'www.' FROM b.url))
FROM logs a
INNER JOIN urls b
ON a.url_id = b.id
WHERE (group_type BETWEEN 4 AND 6)
----- and this condition below -----
AND TRIM(LEADING 'www.' FROM b.url)
NOT IN (
SELECT TRIM(LEADING 'www.' FROM b.url)
FROM logs a
INNER JOIN urls b
ON a.url_id = b.id
WHERE (group_type < 4)
)
If my sql query is correct, can it be optimized (to look more compact)?

SELECT COUNT(DISTINCT u.id) AS COUNT_QUES FROM urls u
INNER JOIN logs l
ON u.id=l.url_id
WHERE u.url NOT IN (SELECT A.url FROM
(SELECT * FROM urls u
WHERE SUBSTR(u.url,1,3)!='www')A,
(SELECT * FROM urls v
WHERE SUBSTR(v.url,1,3)='www')B
WHERE A.url=SUBSTR(B.url,5,LENGTH(B.url))
)
AND l.group_type BETWEEN 4 AND 6
AND u.id NOT IN
(SELECT DISTINCT u.id FROM urls u
INNER JOIN logs l
ON u.id=l.url_id
WHERE u.url NOT IN (SELECT A.url FROM
(SELECT * FROM urls u
WHERE SUBSTR(u.url,1,3)!='www')A,
(SELECT * FROM urls v
WHERE SUBSTR(v.url,1,3)='www')B
WHERE A.url=SUBSTR(B.url,5,LENGTH(B.url))
)
AND l.group_type < 4
)
Or, alternatively:
SELECT COUNT(DISTINCT CASE WHEN B.URL_ID IS NOT NULL AND FLAG1 = 1 AND FLAG2 = 0 THEN TRIM(LEADING 'www.' FROM A.URL) END)
FROM URLS A
LEFT JOIN (SELECT URL_ID,
MAX(CASE WHEN GROUP_TYPE BETWEEN 4 AND 6 THEN 1 ELSE 0 END) FLAG1,
MAX(CASE WHEN GROUP_TYPE < 4 THEN 1 ELSE 0 END) FLAG2
FROM LOGS
GROUP BY URL_ID) B
ON A.ID = B.URL_ID
Hope this works for you. Check this on SQLFiddle - http://sqlfiddle.com/#!2/1fde2/39

Here's one way:
SELECT COUNT(*) AS url_count
FROM ( SELECT TRIM(LEADING 'www.' FROM urls.url) AS trimmed_url,
MIN(logs.group_type) AS min_group_type
FROM logs
JOIN urls
ON urls.id = logs.url_id
GROUP
BY trimmed_url
) t
WHERE min_group_type BETWEEN 4 AND 6
;
But only you can judge whether it looks more compact to you, and only testing can determine whether it performs better.
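The MIN(group_type) idea can be checked against the sample data from the question. The following is a minimal sketch, assuming Python's built-in sqlite3 module rather than MySQL; SQLite has no TRIM(LEADING ... FROM ...), so the www. prefix is stripped with CASE and substr instead, and the helper name count_urls is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE urls (id INTEGER PRIMARY KEY, url TEXT);
    CREATE TABLE logs (id INTEGER PRIMARY KEY, url_id INTEGER, group_type INTEGER);
    INSERT INTO urls VALUES (1, 'www.mysite.kom'), (2, 'mysite.kom'), (3, 'anothersite.kom');
    INSERT INTO logs VALUES (1, 1, 4), (2, 1, 4), (3, 2, 5), (4, 2, 5), (5, 3, 9);
""")

# A url qualifies when its lowest group_type falls in 4..6: that means it has
# at least one row in 4..6 and no row below 4.
COUNT_SQL = """
    SELECT COUNT(*)
    FROM (SELECT trimmed_url
          FROM (SELECT CASE WHEN u.url LIKE 'www.%' THEN substr(u.url, 5)
                            ELSE u.url END AS trimmed_url,
                       l.group_type AS group_type
                FROM logs l
                JOIN urls u ON u.id = l.url_id) joined
          GROUP BY trimmed_url
          HAVING MIN(group_type) BETWEEN 4 AND 6) qualifying
"""


def count_urls(connection):
    """Return the number of distinct prefix-stripped urls that qualify."""
    return connection.execute(COUNT_SQL).fetchone()[0]


print(count_urls(conn))
```

With the sample rows only mysite.kom qualifies; adding a log row with group_type below 4 for either spelling of that url removes it from the count.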

Related

SQL Query that will omit records with the same ID based on values in another column

*UPDATE*
Upon further review, the table I am using also has a linenumber column. See the updated example data below. I feel like this could be extremely helpful in solving this, I'm just not sure how. Count the lines per PO: if the count is 1 it is a single-line PO, and if it is greater than 1 it is a multi-line PO. Does that do anything for us?
New here and to SQL so please forgive my ignorance. Hopefully this is an easy answer.
Looking to build 3 similar queries that will return purchase orders that contain more than 1 item and:
Contain NO Lot Controlled Items
Contain ALL Lot Controlled Items
Contain a MIX of Lot Controlled and Non-Lot Controlled Items
Data looks like this...
PONUMBER  ITEMNUMBER  LOTCONTROLLED  LINENUMBER
PO1.18    OSC1024     0              1
PO1.18    OSC1025     0              2
PO1.18    OSC1026     0              3
PO1.2     OSC1199     0              1
PO1.2     OSC1200     1              2
PO1.21    OSC1201     1              1
PO1.21    OSC1202     1              2
PO1.22    OSC1203     1              1
PO1.23    OSC1204     1              1
PO1.23    OSC1205     0              2
PO1.24    OSC1206     1              1
PO1.24    OSC1207     1              2
PO1.24    OSC1300     0              3
Query for NO Lot Controlled items works great...
SELECT
`POD`.`PONUMBER`,
`POD`.`ITEMNUMBER`,
`POD`.`LOTCONTROLLED`
FROM
table1 AS `POD`
INNER JOIN
(
SELECT `PONUMBER`, COUNT(`PONUMBER`)
FROM table1
WHERE `LOTCONTROLLED` = 0
GROUP BY `PONUMBER`
HAVING (COUNT(`PONUMBER`) > 1)
) as `POD1`
ON `POD`.`PONUMBER` = `POD1`.`PONUMBER`
I thought it would be as simple as changing the WHERE LOTCONTROLLED to be = 1, to get Purchase Orders with ALL Lot Controlled items, but that returns some Purchase Orders that have mixed lines as well.
How can I eliminate a purchase order from inclusion if any one of the lines are not lot controlled?
I like using NOT EXISTS here:
SELECT POD.*
FROM table1 POD
JOIN (SELECT PONUMBER
FROM table1 POD
WHERE NOT EXISTS (SELECT *
FROM table1 POD1
WHERE POD.PONUMBER = POD1.PONUMBER
AND POD1.LOTCONTROLLED = 1)
GROUP BY PONUMBER
HAVING COUNT(*) > 1
) POD1 ON POD.PONUMBER = POD1.PONUMBER
This will omit the PONUMBER from results if any record from that PONUMBER has LOTCONTROLLED = 1 or 0, depending on what you put in the exists subquery.
To get only records that have a mix, you can use COUNT().. HAVING:
SELECT PONUMBER,
ITEMNUMBER,
LOTCONTROLLED
FROM table1 POD
JOIN (SELECT PONUMBER
FROM table1
GROUP BY PONUMBER
HAVING COUNT(DISTINCT LOTCONTROLLED) = 2
) POD1 ON POD.PONUMBER = POD1.PONUMBER
Looks like you also need to join the queries by Lot Controlled too, so I added it to the Group By and Inner Select so it could be joined:
NO LOT CONTROLLED:
SELECT
`POD`.`PONUMBER`,
`POD`.`ITEMNUMBER`,
`POD`.`LOTCONTROLLED`
FROM
table1 AS `POD`
INNER JOIN
(
SELECT `PONUMBER`, `LOTCONTROLLED`, COUNT(`PONUMBER`)
FROM table1
WHERE `LOTCONTROLLED` = 0
GROUP BY `PONUMBER`, `LOTCONTROLLED`
HAVING (COUNT(`PONUMBER`) > 1)
) as `POD1`
ON `POD`.`PONUMBER` = `POD1`.`PONUMBER` AND `POD`.`LOTCONTROLLED` = `POD1`.`LOTCONTROLLED`
LOT CONTROLLED:
SELECT
`POD`.`PONUMBER`,
`POD`.`ITEMNUMBER`,
`POD`.`LOTCONTROLLED`
FROM
table1 AS `POD`
INNER JOIN
(
SELECT `PONUMBER`, `LOTCONTROLLED`, COUNT(`PONUMBER`)
FROM table1
WHERE `LOTCONTROLLED` = 1
GROUP BY `PONUMBER`, `LOTCONTROLLED`
HAVING (COUNT(`PONUMBER`) > 1)
) as `POD1`
ON `POD`.`PONUMBER` = `POD1`.`PONUMBER` AND `POD`.`LOTCONTROLLED` = `POD1`.`LOTCONTROLLED`
ALL LOT CONTROLLED:
SELECT
`POD`.`PONUMBER`,
`POD`.`ITEMNUMBER`,
`POD`.`LOTCONTROLLED`
FROM
table1 AS `POD`
INNER JOIN
(
SELECT `PONUMBER`, `LOTCONTROLLED`, COUNT(`PONUMBER`)
FROM table1
WHERE `LOTCONTROLLED` IN (0,1)
GROUP BY `PONUMBER`, `LOTCONTROLLED`
HAVING (COUNT(`PONUMBER`) > 1)
) as `POD1`
ON `POD`.`PONUMBER` = `POD1`.`PONUMBER` AND `POD`.`LOTCONTROLLED` = `POD1`.`LOTCONTROLLED`
Window functions are the simplest method, but you probably don't have those. So, just use the min() and max() of lotcontrolled. The basic query is:
select pod.*
from table1 pod join
(select ponumber, min(lotcontrolled) as min_lc, max(lotcontrolled) as max_lc
from table1 pod
group by ponumber
having count(*) > 1
) p
using (ponumber)
Then your three conditions are:
max_lc = 0 -- no lot controlled
min_lc = 1 -- all lot controlled
min_lc <> max_lc -- mixed
Some people might prefer the more verbose versions:
min_lc = max_lc and max_lc = 0 -- no lot controlled
min_lc = max_lc and max_lc = 1 -- all lot controlled
min_lc <> max_lc -- mixed
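To see how the min/max test sorts the sample purchase orders, here is a minimal sketch that assumes Python's built-in sqlite3 module in place of MySQL. It folds the three conditions into a single CASE expression purely for illustration; the labels 'none', 'all' and 'mixed' are made up here and are not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (ponumber TEXT, itemnumber TEXT,"
    " lotcontrolled INTEGER, linenumber INTEGER)"
)
# Sample rows copied from the question.
rows = [
    ("PO1.18", "OSC1024", 0, 1), ("PO1.18", "OSC1025", 0, 2),
    ("PO1.18", "OSC1026", 0, 3), ("PO1.2", "OSC1199", 0, 1),
    ("PO1.2", "OSC1200", 1, 2), ("PO1.21", "OSC1201", 1, 1),
    ("PO1.21", "OSC1202", 1, 2), ("PO1.22", "OSC1203", 1, 1),
    ("PO1.23", "OSC1204", 1, 1), ("PO1.23", "OSC1205", 0, 2),
    ("PO1.24", "OSC1206", 1, 1), ("PO1.24", "OSC1207", 1, 2),
    ("PO1.24", "OSC1300", 0, 3),
]
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)", rows)

# max = 0 means no lot-controlled lines, min = 1 means all lines are
# lot controlled, anything else is a mix. Single-line POs are dropped.
CLASSIFY_SQL = """
    SELECT ponumber,
           CASE WHEN MAX(lotcontrolled) = 0 THEN 'none'
                WHEN MIN(lotcontrolled) = 1 THEN 'all'
                ELSE 'mixed' END AS kind
    FROM table1
    GROUP BY ponumber
    HAVING COUNT(*) > 1
    ORDER BY ponumber
"""
classification = dict(conn.execute(CLASSIFY_SQL).fetchall())
print(classification)
```

PO1.22 has only one line, so the HAVING COUNT(*) > 1 filter leaves it out of the result entirely.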
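Try something like this (note that CONVERT(INT, ...) is SQL Server syntax; in MySQL a numeric lotcontrolled column can be summed directly):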
--No items in the group contain LotControlled
SELECT *
FROM your_table
WHERE ponumber IN (SELECT ponumber
FROM your_table
GROUP BY ponumber
HAVING Sum(CONVERT(INT, lotcontrolled)) = 0)
--All Items Contain
SELECT *
FROM your_table
WHERE ponumber IN (SELECT ponumber
FROM your_table
GROUP BY ponumber
HAVING Sum(CONVERT(INT, lotcontrolled)) = Count(*))
--mixed
SELECT *
FROM your_table
WHERE ponumber IN (SELECT ponumber
FROM your_table
GROUP BY ponumber
HAVING Sum(CONVERT(INT, lotcontrolled)) != Count(*)
AND Sum(CONVERT(INT, lotcontrolled)) > 0)

mysql need to extract median value from query

I have the following query from which I need to extract the median value of total_views.
SELECT
@rownum:=@rownum + 1 AS row_num, total_views, projectId
FROM
(SELECT
a.creation,
a.projectId,
devices,
browserIds,
devices + browserIds AS total_views
FROM
((SELECT
projectId, creation
FROM
event
WHERE
kind = 'project_creation'
AND creation > '2017-04-28') a
INNER JOIN ((SELECT
COUNT(DISTINCT deviceId) AS devices, projectId, creation
FROM
event
WHERE
kind = 'open' AND component = 'mobile'
GROUP BY projectId) b
JOIN (SELECT
COUNT(DISTINCT browserId) AS browserIds, projectId, creation
FROM
event
WHERE
kind = 'open' AND component = 'web'
GROUP BY projectId) c ON b.projectId = c.projectId) ON a.projectId = b.projectId
OR a.projectId = c.projectId)
ORDER BY total_views ASC) d,
(SELECT @rownum:=0) e
;
This is part of the result (row_num, total_views, projectId):
1 1 151
2 1 256
3 1 301
4 2 404
5 2 305
6 3 895
7 4 654
8 4 369
9 9 874
10 10 123
I need to extend the query to extract the median value of total_views.
Any ideas?
Found the solution: I needed to use the final value of the @rownum variable, rather than the row_num field, to determine the position of the middle value.
I then calculate the average of the total_views values in the middle of the result set
(the average of the two middle values if the result has an even number of rows; with an odd number of rows both positions coincide, so the average is simply the middle value).
Thus, using the condition:
WHERE row_num IN (FLOOR((@rownum + 1)/2), CEIL((@rownum + 1)/2))
full query:
SELECT avg(total_views) from
(SELECT
@rownum:=@rownum + 1 AS row_num, total_views, projectId
FROM
(SELECT
a.creation,
a.projectId,
devices,
browserIds,
devices + browserIds AS total_views
FROM
((SELECT
projectId, creation
FROM
event
WHERE
kind = 'project_creation'
AND creation > '2017-04-28') a
INNER JOIN ((SELECT
COUNT(DISTINCT deviceId) AS devices, projectId, creation
FROM
event
WHERE
kind = 'open' AND component = 'mobile'
GROUP BY projectId) b
JOIN (SELECT
COUNT(DISTINCT browserId) AS browserIds, projectId, creation
FROM
event
WHERE
kind = 'open' AND component = 'web'
GROUP BY projectId) c ON b.projectId = c.projectId) ON a.projectId = b.projectId
OR a.projectId = c.projectId)
ORDER BY total_views ASC) d,
(SELECT @rownum:=0) e) f WHERE row_num IN (FLOOR((@rownum + 1)/2), CEIL((@rownum + 1)/2))
;
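The arithmetic behind picking the middle row(s) can be checked outside the database. Below is a minimal Python sketch, assuming 1-based row numbers over a non-empty, sorted result; for n rows the middle positions are floor((n + 1) / 2) and ceil((n + 1) / 2), which coincide when n is odd. The function names are made up for illustration.

```python
import math
import statistics


def middle_positions(n):
    """Return the 1-based positions of the middle row(s) among n sorted rows."""
    return math.floor((n + 1) / 2), math.ceil((n + 1) / 2)


def median_by_position(values):
    """Average the value(s) at the middle position(s) of the sorted input."""
    ordered = sorted(values)
    low, high = middle_positions(len(ordered))
    return (ordered[low - 1] + ordered[high - 1]) / 2


# total_views column from the ten sample rows in the question
sample_views = [1, 1, 1, 2, 2, 3, 4, 4, 9, 10]
print(middle_positions(len(sample_views)), median_by_position(sample_views))
print(statistics.median(sample_views))  # cross-check with the standard library
```

For the ten sample rows this selects rows 5 and 6; for nine rows both positions are 5.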

mysql Sub queries in COUNT along with GROUP BY YEAR

Having some trouble figuring out the best way to do this.
Here is what I'm trying to do:
SELECT
YEAR(t.voucher_date) as period,
COUNT(t.id) as total_count,
(SELECT COUNT(t2.id) FROM booking_global as t2 where t2.booking_status = 'CONFIRMED') as confirmed,
(SELECT COUNT(t3.id) FROM booking_global as t3 where t3.booking_status = 'PENDING') as pending
FROM booking_global t
GROUP BY YEAR(t.voucher_date)
This produces the below result.
period total_count CONFIRMED PENDING
2014 4 5 3
2015 4 5 3
Expected Result
period total_count CONFIRMED PENDING
2014 4 3 1
2015 4 2 2
Here I want to get the CONFIRMED / PENDING counts for the respective years, rather than the count of all statuses.
I am not sure how to use my query as a sub query and run another query on the results.
The following should give you the right result:
SELECT
YEAR(t.voucher_date) as period,
COUNT(t.id) as total_count,
(SELECT COUNT(t2.id) FROM booking_global as t2 where t2.booking_status = 'CONFIRMED' and YEAR(t2.voucher_date) = YEAR(t.voucher_date)) as confirmed,
(SELECT COUNT(t3.id) FROM booking_global as t3 where t3.booking_status = 'PENDING' and YEAR(t3.voucher_date) = YEAR(t.voucher_date)) as pending
FROM booking_global t
GROUP BY YEAR(t.voucher_date)
You can have a subquery that calculates each booking_status for each year. The result of which is then joined on table booking_global. Example,
SELECT YEAR(t.voucher_date) voucher_date_year,
COUNT(t.id) total_count,
IFNULL(calc.confirmed_count, 0) confirmed_count,
IFNULL(calc.pending_count, 0) pending_count
FROM booking_global t
LEFT JOIN
(
SELECT YEAR(voucher_date) voucher_date_year,
SUM(booking_status = 'CONFIRMED') confirmed_count,
SUM(booking_status = 'PENDING') pending_count
FROM booking_global
GROUP BY YEAR(voucher_date)
) calc ON calc.voucher_date_year = YEAR(t.voucher_date)
GROUP BY YEAR(t.voucher_date)
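The SUM(condition) trick from the subquery also works directly in the grouped query, without the extra join. Here is a minimal sketch, assuming Python's built-in sqlite3 module in place of MySQL (so strftime('%Y', ...) stands in for YEAR()); the eight sample rows are invented so that they reproduce the expected result from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE booking_global (id INTEGER PRIMARY KEY,"
    " voucher_date TEXT, booking_status TEXT)"
)
# Hypothetical rows: 3 confirmed + 1 pending in 2014, 2 + 2 in 2015.
sample = [
    ("2014-01-10", "CONFIRMED"), ("2014-03-05", "CONFIRMED"),
    ("2014-07-21", "CONFIRMED"), ("2014-11-02", "PENDING"),
    ("2015-02-14", "CONFIRMED"), ("2015-05-30", "CONFIRMED"),
    ("2015-08-08", "PENDING"), ("2015-12-25", "PENDING"),
]
conn.executemany(
    "INSERT INTO booking_global (voucher_date, booking_status) VALUES (?, ?)",
    sample,
)

# A comparison evaluates to 1 or 0, so SUM() counts the matching rows per group.
YEARLY_SQL = """
    SELECT strftime('%Y', voucher_date) AS period,
           COUNT(id) AS total_count,
           SUM(booking_status = 'CONFIRMED') AS confirmed,
           SUM(booking_status = 'PENDING') AS pending
    FROM booking_global
    GROUP BY strftime('%Y', voucher_date)
    ORDER BY period
"""
yearly = conn.execute(YEARLY_SQL).fetchall()
for row in yearly:
    print(row)
```

Because every count comes from the same grouped scan, no correlated subquery per year is needed.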

Split values then resolve the values to a name

I need to be able to do something with my column (below) that can contain multiple values. The 'HearAboutEvent' column has multiple values separated by a comma. Each one of these values corresponds to an entry in another table. So the value of 11273 will equal facebook, 11274 will mean radio, and 11275 will mean commercial.
The data I am working with looks like this:
weather ID MemberID SubscriptionID DateEntered ParticipatedBefore ParticipatedBeforeCities WeatherDependent NonRefundable TShirtSize HearAboutEvent
Yes 24 18 1 2013-12-19 0 NULL 10950 10952 10957 11273, 11274, 11275
I am able to do the proper join to resolve the value of 'weather', note it is the first column and the 8th column.
This is the query I have created so far to resolve the values of WeatherDependent:
SELECT CFS1.Name as 'weather', *
FROM FSM_CustomForm_693 t
LEFT JOIN FSM_CustomFormSelectOptions CFS1 ON CFS1.ID = t.WeatherDependent
where t.ID = 24
Ultimately I need to have the data look like this:
weather ID MemberID SubscriptionID DateEntered ParticipatedBefore ParticipatedBeforeCities WeatherDependent NonRefundable TShirtSize HearAboutEvent
Yes 24 18 1 2013-12-19 0 NULL 10950 10952 10957 Facebook, radio, commercial
Things I think you could use to accomplish this are:
A Split TVF FUNCTION - http://msdn.microsoft.com/en-us/library/ms186755.aspx
CROSS APPLY - http://technet.microsoft.com/en-us/library/ms175156.aspx
STUFF & FOR XML PATH - http://msdn.microsoft.com/en-us/library/ms188043.aspx & http://msdn.microsoft.com/en-us/library/ms190922.aspx
Going one step further, you need something like this:
Excuse my profuse use of sub queries.
CREATE FUNCTION dbo.Split (@sep char(1), @s varchar(512))
RETURNS table
AS
RETURN (
WITH Pieces(pn, start, stop) AS (
SELECT 1, 1, CHARINDEX(@sep, @s)
UNION ALL
SELECT pn + 1, stop + 1, CHARINDEX(@sep, @s, stop + 1)
FROM Pieces
WHERE stop > 0
)
SELECT pn,
SUBSTRING(@s, start, CASE WHEN stop > 0 THEN stop-start ELSE 512 END) AS s
FROM Pieces
)
GO
SELECT
O.A,O.B,O.C,O.D,O.E,O.F,O.G,O.H,O.I,O.J,O.Stuffed
FROM (
SELECT
*
,STUFF((
SELECT ', ' + Name
FROM (
SELECT
V.*
,Y.Name
FROM (
SELECT
'Yes' AS A
,24 AS B
,18 AS C
,1 AS D
,'2013-12-19' AS E
,0 AS F
,NULL AS G
,10950 AS H
,10952 AS I
,10957 AS J
,'11273, 11274, 11275' AS K
)
AS V
CROSS APPLY dbo.Split(',',REPLACE(K,' ','')) AS P
JOIN (
SELECT 11273 AS Id , 'Facebook' AS Name UNION ALL
SELECT 11274 AS Id , 'radio' AS Name UNION ALL
SELECT 11275 AS Id , 'commercial' AS Name
)Y ON y.Id = p.s) ExampleTable
FOR XML PATH('')
), 1, 2, '' )
AS [Stuffed]
FROM (
SELECT
V.*
FROM (
SELECT
'Yes' AS A
,24 AS B
,18 AS C
,1 AS D
,'2013-12-19' AS E
,0 AS F
,NULL AS G
,10950 AS H
,10952 AS I
,10957 AS J
,'11273, 11274, 11275' AS K
)
AS V
CROSS APPLY dbo.Split(',',REPLACE(K,' ','')) AS P
JOIN (
SELECT 11273 AS Id , 'Facebook' AS Name UNION ALL
SELECT 11274 AS Id , 'radio' AS Name UNION ALL
SELECT 11275 AS Id , 'commercial' AS Name
)Y ON y.Id = p.s
)Z
) O
GROUP BY O.A,O.B,O.C,O.D,O.E,O.F,O.G,O.H,O.I,O.J,O.K,O.Stuffed
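The split, look up and re-join steps are easier to follow outside SQL. The following is a minimal Python sketch of the same idea, not a replacement for the T-SQL above; the function name resolve_options and the in-memory lookup dictionary are assumptions made for illustration, standing in for the split function and the options table.

```python
def resolve_options(raw_ids, names_by_id):
    """Turn a comma-separated id string into a comma-separated name string.

    Ids that are missing from the lookup are kept as their original number.
    """
    ids = [int(part) for part in raw_ids.split(",") if part.strip()]
    return ", ".join(names_by_id.get(i, str(i)) for i in ids)


# Lookup values taken from the question's description.
option_names = {11273: "Facebook", 11274: "radio", 11275: "commercial"}

print(resolve_options("11273, 11274, 11275", option_names))
```

Joining with ", " after the lookup avoids the leading-separator cleanup that STUFF performs in the SQL version.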

MySQL refinement of query

I'm using the following query to give me a set of the last 20 matches for a team. I want to find their goals scored in the last 20 matches and order the results by (goals scored, date):
SELECT * FROM (
SELECT *, `against` AS `goalsF` , `for` AS `goalsA`
FROM `matches` , `teams` , `outcomes`
WHERE(
`home_team_id`=7 AND `matches`.away_team_id = `teams`.team_id
OR
`away_team_id`=7 AND `matches`.home_team_id = `teams`.team_id
)
AND `matches`.score_id = `outcomes`.outcome_id
ORDER BY `date` DESC
LIMIT 0 , 20
) res
ORDER BY `goalsF`
The problem is that:
if the team we're looking up is the home team, we need to count "for" as their goals scored.
if the team is the away team, we need to count "against" as their goals scored.
So what I need to be able to do is something like:
if (`home_team_id`=7 AND `matches`.away_team_id = `teams`.team_id)
SELECT *, `for` AS `goalsF` , `against` AS `goalsA`
if (`away_team_id`=7 AND `matches`.home_team_id = `teams`.team_id)
SELECT *, `against` AS `goalsF` , `for` AS `goalsA`
But this must be performed on the subset of results. I'm not sure if this is even possible, but it is beyond my knowledge of MySQL.
Any help would be hugely appreciated.
Alan.
First, you really need to learn ANSI standard join syntax, where you put the join conditions in an on clause rather than in the where clause. Also, aliases make a query much more readable.
The following does the logic that you want, although it does not include the team names:
SELECT *
FROM (SELECT *,
(case when m.home_team_id = 7 then o.against end) as `goalsF` ,
(case when m.away_team_id = 7 then o.`for` end) as `goalsA`
FROM `matches` m join
`outcomes` o
on m.score_id = o.outcome_id
WHERE m.home_team_id = 7 or m.away_team_id = 7
ORDER BY `date` DESC
LIMIT 0 , 20
) res
ORDER BY `goalsF`
To get the team names, you should join twice to the teams table, once for the home team and once for the away team. You can do this either in the subquery or afterwards. It is also a good idea to explicitly mention the columns you are choosing and to include a table alias on every column reference:
SELECT *
FROM (SELECT m.*, o.*,
homet.team_name as hometeam_name, awayt.team_name as away_team_name,
(case when m.home_team_id = 7 then o.against end) as `goalsF` ,
(case when m.away_team_id = 7 then o.`for` end) as `goalsA`
FROM `matches` m join
`outcomes` o
on m.score_id = o.outcome_id join
teams homet
on homet.team_id = m.home_team_id join
teams awayt
on awayt.team_id = m.away_team_id
WHERE m.home_team_id = 7 or m.away_team_id = 7
ORDER BY `date` DESC
LIMIT 0 , 20
) res
ORDER BY `goalsF`
EDIT:
To get just goals for team 7, you can use:
(case when m.home_team_id = 7 then o.`for`
when m.away_team_id = 7 then o.against
end) as goals
To get goals for the other team:
(case when m.home_team_id = 7 then o.against
when m.away_team_id = 7 then o.`for`
end) as goals
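The two CASE expressions can be tried on a tiny data set. This is a minimal sketch, assuming Python's built-in sqlite3 module in place of MySQL and assuming, as the question describes, that outcomes.for holds the home team's goals and outcomes.against the away team's; the three matches, their scores and the column names goals_for and goals_against are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE outcomes (outcome_id INTEGER PRIMARY KEY,
                           "for" INTEGER, against INTEGER);
    CREATE TABLE matches (match_id INTEGER PRIMARY KEY, home_team_id INTEGER,
                          away_team_id INTEGER, score_id INTEGER, date TEXT);
    INSERT INTO outcomes VALUES (1, 3, 1), (2, 0, 2), (3, 1, 1);
    INSERT INTO matches VALUES (1, 7, 4, 1, '2013-01-05'),
                               (2, 5, 7, 2, '2013-01-12'),
                               (3, 8, 9, 3, '2013-01-19');
""")

# Pick "for" or "against" depending on whether the team played at home or away.
GOALS_SQL = """
    SELECT m.date,
           CASE WHEN m.home_team_id = :team THEN o."for"
                WHEN m.away_team_id = :team THEN o.against END AS goals_for,
           CASE WHEN m.home_team_id = :team THEN o.against
                WHEN m.away_team_id = :team THEN o."for" END AS goals_against
    FROM matches m
    JOIN outcomes o ON m.score_id = o.outcome_id
    WHERE m.home_team_id = :team OR m.away_team_id = :team
    ORDER BY m.date DESC
    LIMIT 20
"""
team_rows = conn.execute(GOALS_SQL, {"team": 7}).fetchall()
for row in team_rows:
    print(row)
```

Team 7 won 3-1 at home and 2-0 away in this made-up data, so both rows report the goals from team 7's point of view; the third match does not involve team 7 and is filtered out.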
EDIT II:
To get the "other" team name, the logic is similar. Replace the team name references in the select with:
(case when m.home_team_id = 7 then awayt.team_name
when m.away_team_id = 7 then homet.team_name
end) as other_team_name