GROUP BY ordering - mysql

I have the following query in mysql:
select territory_id, platform_type_id, p.store_url
from main_itemmaster m
inner join main_iteminstance i on m.id=i.master_id
inner join main_territorypricing p on p.item_id=i.id
inner join main_territorypricingavail a on a.tp_id=p.id
where imdb_url = 'http://imdb.com/title/tt1576422/'
group by platform_type_id
Which gives me the following:
territory_id platform_type_id store_url
US Amazon http://www.amazon.com/dp/B00EQIHJAG
PT ITUNES https://itunes.apple.com/pt/movie/id582142080
However, I want to do a GROUP BY to return the territory_id="US" first if that exists. How would I do that?
This is one way I tried which looks quite dirty but does work in the version of mysql I'm using:
select * from
(select territory_id, platform_type_id, p.store_url from main_itemmaster m
inner join main_iteminstance i on m.id=i.master_id
inner join main_territorypricing p on p.item_id=i.id
inner join main_territorypricingavail a on a.tp_id=p.id
where imdb_url = 'http://imdb.com/title/tt1576422/'
order by territory_id='us' desc
) x group by platform_type_id
Which gives:
territory_id platform_type_id store_url
US Amazon http://www.amazon.com/dp/B00EQIHJAG
US ITUNES https://itunes.apple.com/us/movie/id582142080
Which is the correct result set I'm looking to get.
Here is a link to a SQL fiddle. I condensed all the data into one table to focus on the GROUP BY statement: http://sqlfiddle.com/#!9/81c3b6/2/0

From the comments and the added SQL Fiddle, it seems you want a partitioned row number per platform that gives precedence to US, and then to select the first record of each partition. One way to do partitioned row numbers in MySQL is with user variables. Here is an example:
SELECT
territory_id
,platform_type_id
,store_url
FROM
( SELECT
*
,@PlatFormRowNum:= IF(@prevplatform = platform_type_id, @PlatFormRowNum + 1, 1) as PlatformRowNum
,@prevplatform:= platform_type_id
FROM
main_itemmaster m
CROSS JOIN (SELECT @prevplatform:='',@PlatFormRowNum:=0) var
ORDER BY
platform_type_id
,CASE WHEN territory_id = 'US' THEN 0 ELSE 1 END
,territory_id
) t
WHERE
t.PlatformRowNum = 1
ORDER BY
t.platform_type_id
SQL Fiddle: http://sqlfiddle.com/#!9/81c3b6/12
Basically, this partitions the row number by platform, orders US before any other territory, and then selects the first row for each platform. The one open question is how to choose which row to return when US is not available for a platform. Is simple ascending alphabetical order of territory_id acceptable? That is what the query above does.
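On MySQL 8.0 or later, the same partition-and-pick-first idea can probably be written with the ROW_NUMBER() window function instead of user variables. This is only a sketch: the table name pricing_demo and its three sample rows are hypothetical stand-ins for the condensed fiddle table, not the real schema.

```sql
-- Hypothetical stand-in for the single condensed table used in the fiddle.
CREATE TABLE pricing_demo (
  territory_id     VARCHAR(2),
  platform_type_id VARCHAR(20),
  store_url        VARCHAR(255)
);

INSERT INTO pricing_demo VALUES
  ('PT', 'ITUNES', 'https://itunes.apple.com/pt/movie/id582142080'),
  ('US', 'ITUNES', 'https://itunes.apple.com/us/movie/id582142080'),
  ('US', 'Amazon', 'http://www.amazon.com/dp/B00EQIHJAG');

-- Number the rows inside each platform, US first, then alphabetical.
SELECT territory_id, platform_type_id, store_url
FROM (
  SELECT p.*,
         ROW_NUMBER() OVER (
           PARTITION BY platform_type_id
           ORDER BY (territory_id = 'US') DESC, territory_id
         ) AS rn
  FROM pricing_demo p
) ranked
WHERE rn = 1                 -- keep only the preferred row per platform
ORDER BY platform_type_id;
```

With the sample rows above, both platforms come back with their US row. Unlike the user-variable version, this does not depend on the order in which MySQL evaluates expressions in the select list.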

Related

SELECT only first and last results

A client can have more than one equipment (SerialNo). Each equipment has a cost and every month there is data recorded for each equipment. I'm trying to select only the first and last result for each equipment based on the queried period.
"
SELECT i.SerialNo
, p.Name
, c.Cost
, ci.DataDate
, ci.Data
FROM install i
JOIN product p USING (ProductId)
JOIN counter c USING (InstallId)
JOIN counter_item ci USING (CounterId)
WHERE i.ClientId LIKE $clientId
AND MONTH(ci.DataDate) BETWEEN $mStart AND $mEnd
";
This select works but it retrieves all records between the starting date and finishing date.
I tried to get the top results and figured I would use a UNION ALL to combine them with the bottom results (ci.DataDate ASC), but it's not working. I only get the first record encountered.
GROUP BY i.SerialNo
ORDER BY ci.DataDate DESC
It's like ORDER BY has no effect at all.
In counter_item you can find the first and last DataDate per CounterId for the time range. Find these first by aggregation, then use that information to join the desired records:
SELECT i.SerialNo,
p.Name,
c.Cost,
ci.DataDate,
ci.Data
FROM install i
JOIN product p ON p.ProductId = i.ProductId
JOIN counter c ON c.InstallId = i.InstallId
JOIN
(
SELECT CounterId, MIN(DataDate) AS MinDate, MAX(DataDate) AS MaxDate
FROM counter_item
WHERE MONTH(DataDate) BETWEEN $mStart AND $mEnd
GROUP BY CounterId
) minmax ON minmax.CounterId = c.CounterId
JOIN counter_item ci ON ci.CounterId = minmax.CounterId
AND ci.DataDate IN (minmax.MinDate, minmax.MaxDate)
WHERE i.ClientId LIKE $clientId
ORDER BY i.SerialNo, ci.DataDate
You could do it the following way; this is just the general idea of how it could be done:
select * from table
where
([row] = (select max([row]) from table ) or
[row] = (select min([row]) from table ))
You may also be able to use a cross apply (this is SQL Server syntax; MySQL has no CROSS APPLY). Something like this untested, rough sample:
SELECT i.SerialNo,
p.Name,
c.Cost,
MIN(ci.DataDate) as MinDate,
b.MaxDate,
ci.Data
FROM install i
CROSS APPLY (SELECT
MAX(ci.DataDate) as MaxDate
FROM install
JOIN counter_item ci USING (CounterId)
WHERE i.ClientId LIKE $clientId
AND MONTH(ci.DataDate) BETWEEN $mStart AND $mEnd) b
WHERE i.ClientId LIKE $clientId
AND MONTH(ci.DataDate) BETWEEN $mStart AND $mEnd
GROUP BY i.SerialNo
ORDER BY ci.DataDate DESC

sql counts wrong number of likes

I have written an SQL statement that, besides all the other columns, should return the number of comments and the number of likes of a certain post. It works perfectly as long as I don't also try to get the number of times the post has been shared. When I do, it returns a wrong number of likes that seems to be some combination of the shares and the likes. Here is the code:
SELECT
[...],
count(CS.commentId) as shares,
count(CL.commentId) as numberOfLikes
FROM
(SELECT *
FROM accountSpecifics
WHERE institutionId= '{$keyword['id']}') `AS`
INNER JOIN
account A ON A.id = `AS`.accountId
INNER JOIN
comment C ON C.accountId = A.id
LEFT JOIN
commentLikes CL ON C.commentId = CL.commentId
LEFT JOIN
commentShares CS ON C.commentId = CS.commentId
GROUP BY
C.time
ORDER BY
year, month, hour, month
Could you also tell me whether you think this is an efficient SQL statement or whether you would do it differently? Thank you!
Do this instead:
SELECT
[...],
(select count(*) from commentShares CS where C.commentId = CS.commentId) as shares,
(select count(*) from commentLikes CL where C.commentId = CL.commentId) as numberOfLikes
FROM
(SELECT *
FROM accountSpecifics
WHERE institutionId= '{$keyword['id']}') `AS`
INNER JOIN account A ON A.id = `AS`.accountId
INNER JOIN comment C ON C.accountId = A.id
GROUP BY C.time
ORDER BY year, month, hour, month
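If you use JOINs, you get back one result set in which every like row is paired with every share row of the same comment. COUNT(any field) simply counts the non-NULL rows of that combined set, so when a comment has both likes and shares, both counts come out as likes times shares, which is the wrong thing. Subqueries are what you need here. Good luck!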
EDIT: as posted below, count(distinct something) can also work, but it's making the database do more work than necessary for the answer you want to end up with.
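The row multiplication described above can be reproduced on a tiny data set. This is a sketch only: the tables comment_demo, like_demo and share_demo, and the key columns likeId and shareId, are hypothetical and not taken from the asker's schema.

```sql
-- Hypothetical miniature schema: one comment with 3 likes and 2 shares.
CREATE TABLE comment_demo (commentId INT PRIMARY KEY);
CREATE TABLE like_demo    (likeId  INT PRIMARY KEY, commentId INT);
CREATE TABLE share_demo   (shareId INT PRIMARY KEY, commentId INT);

INSERT INTO comment_demo VALUES (1);
INSERT INTO like_demo    VALUES (1, 1), (2, 1), (3, 1);
INSERT INTO share_demo   VALUES (1, 1), (2, 1);

-- Joining both child tables yields 3 x 2 = 6 rows for comment 1,
-- so both plain counts come back as 6.
SELECT c.commentId,
       COUNT(l.commentId) AS likes_inflated,
       COUNT(s.commentId) AS shares_inflated
FROM comment_demo c
LEFT JOIN like_demo  l ON l.commentId = c.commentId
LEFT JOIN share_demo s ON s.commentId = c.commentId
GROUP BY c.commentId;

-- Correlated subqueries count each child table on its own: 3 likes, 2 shares.
SELECT c.commentId,
       (SELECT COUNT(*) FROM like_demo  l WHERE l.commentId = c.commentId) AS likes,
       (SELECT COUNT(*) FROM share_demo s WHERE s.commentId = c.commentId) AS shares
FROM comment_demo c;

-- COUNT(DISTINCT ...) also gives 3 and 2, but only on a row-unique column.
SELECT c.commentId,
       COUNT(DISTINCT l.likeId)  AS likes,
       COUNT(DISTINCT s.shareId) AS shares
FROM comment_demo c
LEFT JOIN like_demo  l ON l.commentId = c.commentId
LEFT JOIN share_demo s ON s.commentId = c.commentId
GROUP BY c.commentId;
```

The last query still builds all six joined rows before de-duplicating them, which is the extra work the edit above refers to.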
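Quick fix (this only counts correctly if DISTINCT is applied to a column that is unique per like or share row, such as a primary key; commentId is the join key, so it has at most one distinct value per comment):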
SELECT
[...],
count(DISTINCT CS.commentId) as shares,
count(DISTINCT CL.commentId) as numberOfLikes
Better approach:
SELECT [...]
, Coalesce(shares.numberOfShares, 0) As numberOfShares
, Coalesce(likes.numberOfLikes , 0) As numberOfLikes
FROM [...]
LEFT
JOIN (
SELECT commentId
, Count(*) As numberOfShares
FROM commentShares
GROUP
BY commentId
) As shares
ON shares.commentId = c.commentId
LEFT
JOIN (
SELECT commentId
, Count(*) As numberOfLikes
FROM commentLikes
GROUP
BY commentId
) As likes
ON likes.commentId = c.commentId

MAX() Function not working as expected

I've created sqlfiddle to try and get my head around this http://sqlfiddle.com/#!2/21e72/1
In the query, I have put a max() on the compiled_date column but the recommendation column is still coming through incorrect - I'm assuming that a select statement will need to be inserted on line 3 somehow?
I've tried the examples provided by the commenters below but I think I just need to understand this from a basic query to begin with.
As others have pointed out, the issue is that some of the select columns are neither aggregated nor used in the group by clause. Most DBMSs won't allow this at all, but MySQL is a little relaxed on some of the standards...
So, you need to first find the max(compiled_date) for each case, then find the recommendation that goes with it.
select r.case_number, r.compiled_date, r.recommendation
from reporting r
join (
SELECT case_number, max(compiled_date) as lastDate
from reporting
group by case_number
) s on r.case_number=s.case_number
and r.compiled_date=s.lastDate
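To see the two steps at work, here is the same join run against a few made-up rows. The table name reporting_demo, the column types, and the sample values are assumptions for illustration; only the three column names come from the query above.

```sql
-- Hypothetical sample data: case 1 has two reports, case 2 has one.
CREATE TABLE reporting_demo (
  case_number    INT,
  compiled_date  DATE,
  recommendation VARCHAR(50)
);

INSERT INTO reporting_demo VALUES
  (1, '2013-01-10', 'review'),
  (1, '2013-03-05', 'close'),
  (2, '2013-02-20', 'escalate');

-- Step 1 (derived table s): latest compiled_date per case.
-- Step 2 (join): fetch the full row that carries that date.
SELECT r.case_number, r.compiled_date, r.recommendation
FROM reporting_demo r
JOIN (
  SELECT case_number, MAX(compiled_date) AS lastDate
  FROM reporting_demo
  GROUP BY case_number
) s ON r.case_number = s.case_number
   AND r.compiled_date = s.lastDate
ORDER BY r.case_number;
-- case 1 -> 2013-03-05, 'close'; case 2 -> 2013-02-20, 'escalate'
```

Note that if two reports for the same case share the latest compiled_date, this pattern returns both of them.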
Thank you for providing the SQL Fiddle, but it only contains the reporting data. We would appreciate sample data for all of the tables.
Anyway, could you try this?
SELECT
`case`.number,
staff.staff_name AS `case owner`,
client.client_name,
`case`.address,
x.mx_date,
report.recommendation
FROM
`case` INNER JOIN (
SELECT case_number, MAX(compiled_date) as mx_date
FROM report
GROUP BY case_number
) x ON x.case_number = `case`.number
INNER JOIN report ON x.case_number = report.case_number AND report.compiled_date = x.mx_date
INNER JOIN client ON `case`.client_number = client.client_number
INNER JOIN staff ON `case`.staff_number = staff.staff_number
WHERE
`case`.active = 1
AND staff.staff_name = 'bob'
ORDER BY
`case`.number ASC;
Check the query below:
SELECT c.number, s.staff_name AS `case owner`, cl.client_name,
c.address, MAX(r.compiled_date), r.recommendation
FROM `case` c
INNER JOIN (SELECT r.case_number, r.compiled_date, r.recommendation
FROM report r ORDER BY r.case_number, r.compiled_date DESC
) r ON r.case_number = c.number
INNER JOIN client cl ON c.client_number = cl.client_number
INNER JOIN staff s ON c.staff_number = s.staff_number
WHERE c.active = 1 AND s.staff_name = 'bob'
GROUP BY c.number
ORDER BY c.number ASC
Try this:
SELECT
`case`.number,
staff.staff_name AS `case owner`,
client.client_name,
`case`.address,
(select MAX(compiled_date) from report where case_number=`case`.number),
report.recommendation
FROM
`case`
INNER JOIN report ON report.case_number = `case`.number
INNER JOIN client ON `case`.client_number = client.client_number
INNER JOIN staff ON `case`.staff_number = staff.staff_number
WHERE
`case`.active = 1 AND
staff.staff_name = 'bob'
GROUP BY
`case`.number
ORDER BY
`case`.number ASC

How to expand MySQL subquery/join to include all rows

This query below is perfect in producing the result for horse_id = 1 ... but I want to do this for all horses in the database. Can anyone share with me how to tweak this query so I can do that?
SELECT figures.entry_id,
max(figures.beyer)
FROM
( SELECT hrdb_lines.horse_id,
hrdb_entries.entry_id,
hrdb_lines.beyer
FROM hrdb_entries
INNER JOIN hrdb_lines
ON hrdb_lines.horse_id = hrdb_entries.horse_id
WHERE hrdb_lines.horse_id = 1
ORDER BY hrdb_lines.line_date DESC
LIMIT 2
) as figures
Perhaps I'm doing it all wrong too.
I think the following would generate the desired results:
SELECT `entry_id`, `beyer`
FROM (SELECT hrdb_entries.entry_id,
MAX( hrdb_lines.beyer ) AS beyer
FROM hrdb_entries
INNER JOIN hrdb_lines
ON hrdb_lines.horse_id = hrdb_entries.horse_id
GROUP BY hrdb_lines.horse_id
ORDER BY hrdb_lines.line_date DESC
) AS figures
If I'm understanding your question, something like this should be close:
SELECT
figures.horse_id,
figures.entry_id,
max(figures.beyer)
FROM
(SELECT
hrdb_lines.horse_id,
hrdb_entries.entry_id,
hrdb_lines.beyer
FROM hrdb_entries
INNER JOIN hrdb_lines ON hrdb_lines.horse_id = hrdb_entries.horse_id
ORDER BY hrdb_lines.line_date DESC
) as figures
GROUP BY figures.horse_id
One option to limit the MAX to just the most recent 2 beyer fields is to add a row number to the results and only include rows 1 and 2.
SELECT
figures.horse_id,
figures.entry_id,
max(figures.beyer)
FROM
(SELECT
@rn:=if(@prev_horse_id=hrdb_lines.horse_id,@rn+1,1) rn,
hrdb_lines.horse_id,
hrdb_entries.entry_id,
hrdb_lines.beyer,
@prev_horse_id:=hrdb_lines.horse_id
FROM hrdb_entries
INNER JOIN hrdb_lines ON hrdb_lines.horse_id = hrdb_entries.horse_id
INNER JOIN (SELECT @rn:=0, @prev_horse_id:=NULL) r
ORDER BY hrdb_lines.horse_id, hrdb_lines.line_date DESC
) as figures
WHERE rn <= 2
GROUP BY figures.horse_id
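If MySQL 8.0 or later is available, the "row number, then keep rows 1 and 2" step can likely be expressed with ROW_NUMBER() rather than user variables. The sketch below leaves out the join to hrdb_entries for brevity and uses a hypothetical table lines_demo with invented values.

```sql
-- Hypothetical stand-in for hrdb_lines with a few invented past performances.
CREATE TABLE lines_demo (horse_id INT, line_date DATE, beyer INT);

INSERT INTO lines_demo VALUES
  (1, '2014-01-01', 95),
  (1, '2014-02-01', 80),
  (1, '2014-03-01', 85),
  (2, '2014-01-15', 70),
  (2, '2014-02-15', 90);

-- Rank each horse's lines from newest to oldest, keep the newest two,
-- then take the best beyer among those two.
SELECT horse_id, MAX(beyer) AS best_recent_beyer
FROM (
  SELECT horse_id, beyer,
         ROW_NUMBER() OVER (PARTITION BY horse_id
                            ORDER BY line_date DESC) AS rn
  FROM lines_demo
) recent
WHERE rn <= 2
GROUP BY horse_id;
-- horse 1 -> 85 (the older 95 is excluded); horse 2 -> 90
```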

sql query very slow when another table gets fuller

I have the following query, but after some time when users start putting in more and more items in the "ci_falsepositives" table, it gets really slow.
The ci_falsepositives table contains a reference field from ci_address_book and another reference field from ci_matched_sanctions.
How can I rewrite the query and still be able to sort on each field?
For example, I still need to be able to sort on "hits" or "matches".
SELECT *, matches - falsepositives AS hits
FROM (SELECT c.*, IFNULL(p.total, 0) AS matches,
(SELECT COUNT(*)
FROM ci_falsepositives n
WHERE n.addressbook_id = c.reference
AND n.sanction_key IN
(SELECT sanction_key FROM ci_matched_sanctions)
) AS falsepositives
FROM ci_address_book c
LEFT JOIN
(SELECT addressbook_id, COUNT(match_id) AS total
FROM ci_matched_sanctions
GROUP BY addressbook_id) AS p
ON c.id = p.addressbook_id
) S
ORDER BY folder asc, wholename ASC
LIMIT 0,15
The problem has to be the SELECT COUNT(*) FROM ci_falsepositives sub-query. That sub-query can be written using an inner join between ci_falsepositives and ci_matched_sanctions, but the optimizer might do that for you anyway. What I think you need to do, though, is make that sub-query into a separate query in the FROM clause of the 'next query out' (that is, SELECT c.*, ...). Probably, that query is being evaluated multiple times - and that's what's hurting you when people add records to ci_falsepositives. You should study the query plan carefully.
Maybe this query will be better:
SELECT *, matches - falsepositives AS hits
FROM (SELECT c.*, IFNULL(p.total, 0) AS matches, IFNULL(f.falsepositives, 0) AS falsepositives
FROM ci_address_book AS c
LEFT JOIN (SELECT n.addressbook_id, COUNT(*) AS falsepositives
FROM ci_falsepositives AS n
JOIN ci_matched_sanctions AS m
ON n.sanction_key = m.sanction_key
GROUP BY n.addressbook_id
) AS f
ON c.reference = f.addressbook_id
LEFT JOIN
(SELECT addressbook_id, COUNT(match_id) AS total
FROM ci_matched_sanctions
GROUP BY addressbook_id) AS p
ON c.id = p.addressbook_id
) AS s
ORDER BY folder asc, wholename ASC
LIMIT 0, 15
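To act on the advice about studying the query plan, one possible starting point is to run EXPLAIN on the false-positives aggregation and check whether the join and grouping columns are indexed. This is a sketch against the tables named in the question: the index names are hypothetical, equivalent indexes may already exist, and whether they help depends on the data.

```sql
-- Hypothetical index names; run SHOW INDEX FROM <table> first to avoid duplicates.
CREATE INDEX idx_fp_addressbook_sanction
  ON ci_falsepositives (addressbook_id, sanction_key);
CREATE INDEX idx_ms_sanction_key
  ON ci_matched_sanctions (sanction_key);
CREATE INDEX idx_ms_addressbook
  ON ci_matched_sanctions (addressbook_id);

-- Inspect how MySQL plans to execute the aggregation on its own.
EXPLAIN
SELECT n.addressbook_id, COUNT(*) AS falsepositives
FROM ci_falsepositives AS n
JOIN ci_matched_sanctions AS m
  ON n.sanction_key = m.sanction_key
GROUP BY n.addressbook_id;
```

In the EXPLAIN output, a type of ALL on either table with a large rows estimate would suggest a full scan, which is the kind of cost that grows as ci_falsepositives fills up.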