How can I eliminate duplicates using MAX function? - mysql

I have these columns:
recommended_object_id, exhibitor_name, event_edition_id, timestamp
I want to hide/remove the duplicates in recommended_object_id so that it can serve as a primary key.
I successfully removed most of the duplicates, but a few recommendation IDs have more than one event edition ID, so those IDs still appear more than once.
A colleague of mine said I could eliminate those by using max(timestamp), but I could not get it to work.
My current query is this:
SELECT DISTINCT r.recommended_object_id, ed.exhibitor_name, sd.event_edition_id, r.object_type, max(r.timestamp)
FROM recommendations r
left join show_details sd on r.event_edition_id = sd.event_edition_id
left join exhibitor_details ed on r.recommended_object_id = ed.exhibitor_id
group by r.recommended_object_id, ed.exhibitor_name, sd.event_edition_id, r.object_type
order by r.recommended_object_id

If you want one row per recommended_object_id, the one with the most recent timestamp, then use window functions:
select r.*
from (select r.recommended_object_id, ed.exhibitor_name, sd.event_edition_id, r.object_type,
row_number() over (partition by recommended_object_id order by r.timestamp desc) as seqnum
from recommendations r left join
show_details sd
on r.event_edition_id = sd.event_edition_id left join
exhibitor_details ed
on r.recommended_object_id = ed.exhibitor_id
) r
where seqnum = 1
order by r.recommended_object_id;
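
To see the effect on a small data set, here is a minimal sketch that runs the same row_number() pattern through Python's built-in sqlite3 module, used only as a stand-in for MySQL 8 (SQLite 3.25 or later is needed for window functions). The sample rows are invented for illustration, and the two joins from the query above are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recommendations (
        recommended_object_id INTEGER,
        event_edition_id      INTEGER,
        object_type           TEXT,
        timestamp             TEXT
    );
    -- Invented sample rows: object 1 appears under two event editions.
    INSERT INTO recommendations VALUES
        (1, 10, 'exhibitor', '2020-01-01 09:00:00'),
        (1, 11, 'exhibitor', '2020-03-01 09:00:00'),
        (2, 10, 'exhibitor', '2020-02-01 09:00:00');
""")

# Number the rows of each object from newest to oldest, then keep row 1.
latest_rows = conn.execute("""
    SELECT recommended_object_id, event_edition_id, timestamp
    FROM (SELECT r.*,
                 row_number() OVER (PARTITION BY r.recommended_object_id
                                    ORDER BY r.timestamp DESC) AS seqnum
          FROM recommendations r) AS ranked
    WHERE seqnum = 1
    ORDER BY recommended_object_id
""").fetchall()

for row in latest_rows:
    print(row)
```

Object 1 comes back once, with the event edition of its most recent row, so recommended_object_id is unique in the result.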

Related

MySQL if row is duplicated prefer which matches field value

I have the following select query. I want to avoid getting the duplicate "EN" row when an "ES" row is present, i.e. prefer ES over EN.
SELECT s.soft_id,s.groupby,s.packageid,s.name,s.area,l.min,GROUP_CONCAT(DISTINCT JSON_ARRAY(s.version,s.detailid,s.filesize,s.updatetime)) versions
FROM software s
INNER JOIN langs l ON s.lang_id=l.lang_id
INNER JOIN devices_type t ON (s.familylock_id=t.familylock_id OR (s.familylock_id=0 AND s.devicelock_id=t.device_type_id))
INNER JOIN devices d ON t.device_type_id=d.device_type_id
INNER JOIN users u ON d.user_id=u.user_id
WHERE s.groupby IN(1,2,3)
AND u.token="abc"
AND d.serialno="123456789"
AND l.min IN("ES","EN")
GROUP BY s.soft_id,s.groupby,s.packageid,s.name,s.area,l.min ORDER BY s.name ASC
This is the example result (image not included).
You can test your query here: http://185.27.134.10/login.php?2=epiz_26706010wejghelqwdtg3e54gVGtSRk1VMUVRVE5QVkdzeFRWaDNhRWxUUldoSldIZzRaa2g0T0daSWVEaG1TSGhvVjIxc1JGb3lkRk5rV0U1cFZsRTlQUT09wejghelqwdtg3e54gsql102.epizy.comwejghelqwdtg3e54gepiz_26706010_test&db=epiz_26706010_test

I'd do it in two steps. First, count how many duplicate rows you have; by duplicate, I mean rows that are identical on one column but differ in ES vs. EN. Then, with the table sorted, select the last row among the duplicates.
See also: Get top first record from duplicate records having no unique identity

Outer join the table twice, looking for a single specific value with each join, and then coalesce the fields in the order you prefer:
SELECT s.soft_id, coalesce(les.min, len.min) As min, ...
FROM software s
LEFT JOIN langs les ON s.lang_id=les.lang_id AND les.min = 'ES'
LEFT JOIN langs len ON s.lang_id=len.lang_id AND len.min = 'EN'
...
WHERE s.groupby IN(1,2,3)
AND coalesce(les.lang_id, len.lang_id) IS NOT NULL
...
Window functions can do it more efficiently, but they are still not supported on many MySQL servers in the wild. If you're using 8.0 or later, you should look into that option.

You can use window functions as:
with q as (
<your query here>
)
select q.*
from (select q.*,
row_number() over (partition by q.soft_id order by field(q.min, 'ES', 'EN')) as seqnum
from q
) q
where seqnum = 1;
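
As a rough illustration of the "prefer ES over EN" ranking, the sketch below runs the same idea through Python's built-in sqlite3 module as a stand-in for MySQL 8. SQLite has no FIELD() function, so a CASE expression is swapped in for the ordering. The table q stands for the result of the original query, and its contents and the column name lang_code (playing the role of l.min) are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the CTE q; lang_code plays the role of l.min.
    CREATE TABLE q (soft_id INTEGER, name TEXT, lang_code TEXT);
    INSERT INTO q VALUES
        (1, 'Editor', 'EN'),
        (1, 'Editor', 'ES'),
        (2, 'Viewer', 'EN');
""")

# Rank ES first and EN second within each soft_id, then keep the top row.
preferred_rows = conn.execute("""
    SELECT soft_id, name, lang_code
    FROM (SELECT q.*,
                 row_number() OVER (
                     PARTITION BY q.soft_id
                     ORDER BY CASE q.lang_code
                                  WHEN 'ES' THEN 1
                                  WHEN 'EN' THEN 2
                                  ELSE 3
                              END
                 ) AS seqnum
          FROM q) AS ranked
    WHERE seqnum = 1
    ORDER BY soft_id
""").fetchall()

for row in preferred_rows:
    print(row)
```

Software 1 has both languages and keeps only its ES row, while software 2 has no ES row and falls back to EN.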

MySQL, joining a table where you require the max value from the second table

I have the below query:
SELECT users_service.id, name
FROM users_service
LEFT JOIN
(SELECT * FROM activity)
activity ON (users_service.id = activity.user_service_id)
WHERE admin_id = 1
However, this returns as many rows from the activity table as exist, i.e. multiple activity rows for each admin_id entry.
I want to return only the latest row from the activity table for each admin_id entry.
"Latest" could be determined by entry_date or id.
I tried using DISTINCT, MAX and LIMIT 1, but these all produced strange behavior.

Use ROW_NUMBER():
SELECT us.id, a.name
FROM users_service us LEFT JOIN
(SELECT a.*,
ROW_NUMBER() OVER (PARTITION BY a.user_service_id ORDER BY ? DESC) as seqnum
FROM activity a
) a
ON us.id = a.user_service_id AND a.seqnum = 1
WHERE us.admin_id = 1;
The ? is for the column that specifies the "most recent", which your question doesn't clarify.

You did not specify the column by which you determine the most recent activity. I call it datetime_col in the solution below:
SELECT usv.id
, name
FROM users_service usv
LEFT
JOIN activity act
on act.user_service_id = usv.id
and act.datetime_col = (select max(datetime_col)
from activity act_
WHERE act_.user_service_id = act.user_service_id)
WHERE usv.admin_id = 1

Mysql - 'Select max' from multiple joined tables doesn't return correct values

I have two tables: one is a list of addresses, and the other holds attendance dates and EmployeeIDNumbers to identify the engineer who attended. An engineer may have attended an address multiple times. I am trying to select the address name, the most recent attendance date and the corresponding engineerID.
select s.sitename, max(sd.scheduleddate), sd.EngineerID
from sites as s
left join scheduled_dates as sd on sd.idsites = s.idsites
group by s.idsites
This code correctly pulls each address and the most recent 'Scheduled Date', but does not pull the correct corresponding engineer ID. How do I get the engineerID from the same row as the max(scheduleddate)? I think this has something to do with the 'greatest-n-per-group' discussion, but I can't see how to implement that with a query that already has a join.

You can use a NOT EXISTS condition with a correlated subquery:
select s.sitename, sd.EngineerID, sd.scheduleddate
from sites as s
inner join scheduled_dates as sd on sd.idsites = s.idsites
where not exists (
select 1
from scheduled_dates sd1
where sd1.idsites = s.idsites
and sd1.scheduleddate > sd.scheduleddate
)
The condition ensures that no other record exists in scheduled_dates for the current site with a date greater than the one on the record being selected.
Note: I turned your LEFT JOIN into an INNER JOIN, since I believe it better fits your use case; feel free to revert this if needed.
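
A small runnable sketch of the NOT EXISTS pattern, using Python's built-in sqlite3 module as a stand-in for MySQL. The table and column names follow the question; the site names, dates and engineer IDs are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sites (idsites INTEGER PRIMARY KEY, sitename TEXT);
    CREATE TABLE scheduled_dates (
        idsites       INTEGER,
        scheduleddate TEXT,
        EngineerID    INTEGER
    );
    -- Invented sample data: site 3 has never been visited.
    INSERT INTO sites VALUES (1, 'Depot'), (2, 'Office'), (3, 'Warehouse');
    INSERT INTO scheduled_dates VALUES
        (1, '2019-01-10', 7),
        (1, '2019-06-15', 9),
        (2, '2019-03-20', 7);
""")

# Keep a visit only if no later visit exists for the same site.
latest_visits = conn.execute("""
    SELECT s.sitename, sd.EngineerID, sd.scheduleddate
    FROM sites AS s
    INNER JOIN scheduled_dates AS sd ON sd.idsites = s.idsites
    WHERE NOT EXISTS (
        SELECT 1
        FROM scheduled_dates sd1
        WHERE sd1.idsites = s.idsites
          AND sd1.scheduleddate > sd.scheduleddate
    )
    ORDER BY s.sitename
""").fetchall()

for row in latest_visits:
    print(row)
```

The engineer ID now comes from the same row as the latest date. Because of the INNER JOIN, the unvisited site does not appear in the result.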

In MySQL 8+, you can use window functions:
select s.sitename, sd.scheduleddate, sd.EngineerID
from sites s left join
(select sd.*,
row_number() over (partition by sd.idsites order by sd.scheduleddate desc) as seqnum
from scheduled_dates sd
) sd
on sd.idsites = s.idsites and sd.seqnum = 1;
Note that this also keeps all sites (which appears to be your intention), even those that have not been visited.

Using Max in group by

I have the following query and I want it to generate the result with just the latest date for the category for a store instead of giving out per date transaction:
SELECT c.store,d.node_name category, x.txn_dt, x.txn_tm time, count(c.txn_id) Buyer
FROM pos_swy.5_centerstore_triptype c
join pos_swy.3_txn_itm t on c.txn_id=t.txn_id
join pos_swy.1_upc_node_map d on t.upc_id=d.upc_id
join pos_swy.3_txn_hdr x on t.txn_id=x.txn_id
group by store,txn_dt,node_name;
I tried using max(x.txn_dt) but it really didn't solve the purpose.

You may need an ORDER BY with a LIMIT:
SELECT c.store,d.node_name category, max(x.txn_dt) max_date, x.txn_tm time, count(c.txn_id) Buyer
FROM pos_swy.5_centerstore_triptype c
join pos_swy.3_txn_itm t on c.txn_id=t.txn_id
join pos_swy.1_upc_node_map d on t.upc_id=d.upc_id
join pos_swy.3_txn_hdr x on t.txn_id=x.txn_id
group by node_name
order by max_date desc
limit 1
-- you can change limit 1 to whatever number of results you want

SQL query to get latest prices, depending on the date

I am trying to retrieve a list of products which have been updated, the table contains multiple updates of the products as it records the price changes.
I need to get the latest price change for every product, returning only the last update for each. I have the code below so far, but it returns only the very last update, for just one product.
SELECT dbo.twProducts.title, dbo.LowestPrices.productAsin, dbo.twProducts.sku,
dbo.LowestPrices.tweAmzPrice, dbo.LowestPrices.price, dbo.LowestPrices.priceDate
FROM dbo.aboProducts INNER JOIN
dbo.LowestPrices ON dbo.aboProducts.asin = dbo.LowestPrices.productAsin
INNER JOIN dbo.twProducts ON dbo.aboProducts.sku = dbo.twProducts.sku
WHERE (dbo.LowestPrices.priceDate =
(SELECT MAX(priceDate) AS Expr1
FROM dbo.LowestPrices AS LowestPrices_1))
I hope this makes sense; I am not sure if I have explained it in a way that's easy to understand.
Any questions please feel free to ask.

The easiest would be to use a CTE with ROW_NUMBER function:
WITH CTE AS
(
SELECT dbo.twProducts.title, dbo.LowestPrices.productAsin, dbo.twProducts.sku,
dbo.LowestPrices.tweAmzPrice, dbo.LowestPrices.price, dbo.LowestPrices.priceDate,
RN = ROW_NUMBER()OVER( PARTITION BY productAsin ORDER BY priceDate DESC)
FROM dbo.aboProducts INNER JOIN
dbo.LowestPrices ON dbo.aboProducts.asin = dbo.LowestPrices.productAsin
INNER JOIN dbo.twProducts ON dbo.aboProducts.sku = dbo.twProducts.sku
)
SELECT * FROM CTE WHERE RN = 1

I think the adjustment you are looking for is to correlate your subquery with the outer query on productAsin, rather than just matching on the overall latest date.
SELECT dbo.twProducts.title, dbo.LowestPrices.productAsin, dbo.twProducts.sku,
dbo.LowestPrices.tweAmzPrice, dbo.LowestPrices.price, dbo.LowestPrices.priceDate
FROM dbo.aboProducts INNER JOIN
dbo.LowestPrices ON dbo.aboProducts.asin = dbo.LowestPrices.productAsin
INNER JOIN dbo.twProducts ON dbo.aboProducts.sku = dbo.twProducts.sku
WHERE dbo.LowestPrices.priceDate IN
(SELECT MAX(LowestPrices_1.priceDate)
FROM dbo.LowestPrices AS LowestPrices_1
WHERE dbo.LowestPrices.productAsin = LowestPrices_1.productAsin)
This will match on the max(priceDate) for each product.
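
A minimal sketch of the correlated MAX subquery, run through Python's built-in sqlite3 module as a stand-in for SQL Server. Only the LowestPrices table is modeled, the joins to the product tables are left out, and the ASINs, prices and dates are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE LowestPrices (productAsin TEXT, price REAL, priceDate TEXT);
    -- Invented sample data: product A1 has two price updates.
    INSERT INTO LowestPrices VALUES
        ('A1', 10.0, '2014-01-01'),
        ('A1',  9.5, '2014-02-01'),
        ('B2', 20.0, '2014-01-15');
""")

# The subquery is evaluated per outer row and returns that product's latest date.
latest_prices = conn.execute("""
    SELECT lp.productAsin, lp.price, lp.priceDate
    FROM LowestPrices AS lp
    WHERE lp.priceDate = (SELECT MAX(lp1.priceDate)
                          FROM LowestPrices AS lp1
                          WHERE lp1.productAsin = lp.productAsin)
    ORDER BY lp.productAsin
""").fetchall()

for row in latest_prices:
    print(row)
```

Each product is compared against its own latest date, so B2 is returned even though its last update is older than the newest update of A1. If a product had two rows sharing the same latest priceDate, both would be returned.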
