Collapse rows with equal values in particular columns in one - mysql

I have a table with deals data written in it. I need to query said deals while joining on other tables, however d.originalorderid is not a unique entry and I get several duplicate entries. I want to:
For each unique d.originalorderid select one row
This row should be most recent (largest ID)
How would I go about this? This is the query I have right now.
SELECT d.id,
d.date,
d.ip, d.panmask,
d.merchantorderid,
d.amount,
d.cardholder,
d.bankhumanname,
d.cardtypeid,
d.bankcountrycode,
d.usercountrycode,
mc.paymentkey as merchantname,
dt.status,
d.merchantcontract,
dt.tag,
d.originalorderid,
ds.refnumber,
ds.dealauthcode,
mc.processingid,
pc.Name as processing,
d.customparams
FROM Deal as d
LEFT JOIN MerchantContract as mc ON mc.Id = d.MerchantContract
LEFT JOIN DealTrace as dt ON d.Id = dt.DealId
AND dt.id = (SELECT MAX(id)
FROM DealTrace WITH (nolock)
WHERE DealId = d.id)
LEFT JOIN DealSummary ds ON d.Id = ds.DealId
AND ds.id = (SELECT MAX(id)
FROM DealSummary WITH (nolock)
WHERE DealId = d.id)
LEFT JOIN Processing pc on mc.ProcessingId = pc.id
WHERE (d.MerchantContract IN ('12'))
ORDER BY ID desc OFFSET 0 ROWS FETCH NEXT 1000 ROWS ONLY

If I understand the requirement, instead of joining to the Deal table, join to a correlated subquery on it which returns the deal with the highest deal id which has the same original order id. Test it but I think its OK..
SELECT ....
FROM
(SELECT *
FROM Deal d1
WHERE d1.Id=(SELECT MAX(Id)
FROM Deal d2
WHERE d2.OriginalOrderId=d1.OriginalOrderId)) d
LEFT JOIN MerchantContract as mc ON mc.Id = d.MerchantContract
etc etc

Related

SQL Complex update query filter distinct values only

I have 3 tables with following columns.
Table: A with column: newColumnTyp1, typ2
Table: B with column: typ2, tableC_id_fk
Table: C with column: id, typ1
I wanted to update values in A.newColumnTyp1 from C.typ1 by following logic:
if A.typ2=B.typ2 and B.tableC_id_fk=C.id
the values must be distinct, if any of the conditions above gives multiple results then should be ignored. For example A.typ2=B.typ2 may give multiple result in that case it should be ignored.
edit:
the values must be distinct, if any of the conditions above gives multiple results then take only one value and ignore rest. For example A.typ2=B.typ2 may give multiple result in that case just take any one value and ignore rest because all the results from A.typ2=B.typ2 will have same B.tableC_id_fk.
I have tried:
SELECT DISTINCT C.typ1, B.typ2
FROM C
LEFT JOIN B ON C.id = B.tableC_id_fk
LEFT JOIN A ON B.typ2= A.typ2
it gives me a result of table with two columns typ1,typ2
My logic was, I will then filter this new table and compare the type2 value with A.typ2 and update A.newColumnTyp1
I thought of something like this but was a failure:
update A set newColumnTyp1= (
SELECT C.typ1 from
SELECT DISTINCT C.typ1, B.typ2
FROM C
LEFT JOIN B ON C.id = B.tableC_id_fk
LEFT JOIN A ON B.typ2= A.type2
where A.typ2=B.typ2);
I am thinking of an updateable CTE and window functions:
with cte as (
select a.newColumnTyp1, c.typ1, count(*) over(partition by a.typ2) cnt
from a
inner join b on b.type2 = a.typ2
inner join c on c.id = b.tableC_id_fk
)
update cte
set newColumnTyp1 = typ1
where cnt > 1
Update: if the columns have the same name, then alias one of them:
with cte as (
select a.typ1, c.typ1 typ1c, count(*) over(partition by a.typ2) cnt
from a
inner join b on b.type2 = a.typ2
inner join c on c.id = b.tableC_id_fk
)
update cte
set typ1 = typ1c
where cnt > 1
I think I would approach this as:
update a
set newColumnTyp1 = bc.min_typ1
from (select b.typ2, min(c.typ1) as min_typ1, max(c.typ1) as max_typ1
from b join
c
on b.tableC_id_fk = c.id
group by b.type2
) bc
where bc.typ2 = a.typ2 and
bc.min_typ1 = bc.max_typ1;
The subquery determines whether typ1 is always the same. If so, it is used for updating.
I should note that you might want the most common value assigned, instead of requiring unanimity. If that is what you want, then you can ask another question.

How can I grab all rows from one table, but filter the second table?

I am running a query on a PHP page that will pull all records from one table, INNER JOIN with two other tables and then list all of the results. However on the second table I only want the most recent record.
Here is my query
SELECT * FROM wn_trailer
INNER JOIN (
SELECT id, trailer_id, trailer_status, trailer_assigned, MAX(last_update), trailer_lat, trailer_long
FROM wn_trailer_history
) AS th ON wn_trailer.id = th.trailer_id
INNER JOIN wn_trailer_status ON wn_trailer_status.id = th.trailer_status
INNER JOIN wn_users ON wn_users.id = th.trailer_assigned
ORDER BY trailer_number ASC
The query runs but returns only the first record.
You want an additional JOIN to bring in the data on the last update date. Also, your subquery needs a GROUP BY:
SELECT *
FROM wn_trailer t INNER JOIN
(SELECT trailer_id, MAX(last_update) as max_last_update
FROM wn_trailer_history
GROUP BY trailer_id
) tht
ON t.id = tht.trailer_id INNER JOIN
wn_trailer_history th
ON th.trailer_id = tht.trailer_id AND
th.last_update = tht.max_last_update INNER JOIN
wn_trailer_status ts
ON ts.id = th.trailer_status INNER JOIN
wn_users u
ON u.id = th.trailer_assigned
ORDER BY trailer_number ASC;
I also added table aliases so the query is easier to write and to read.

Left join sql query

I want to get all the data from the users table & the last record associated with him from my connection_history table , it's working only when i don't add at the end of my query
ORDER BY contributions DESC
( When i add it , i have only the record wich come from users and not the last connection_history record)
My question is : how i can get the entires data ordered by contributions DESC
SELECT * FROM users LEFT JOIN connections_history ch ON users.id = ch.guid
AND EXISTS (SELECT 1
FROM connections_history ch1
WHERE ch.guid = ch1.guid
HAVING Max(ch1.date) = ch.date)
The order by should not affect the results that are returned. It only changes the ordering. You are probably getting what you want, just in an unexpected order. For instance, your query interface might be returning a fixed number of rows. Changing the order of the rows could make it look like the result set is different.
I will say that I find = to be more intuitive than EXISTS for this purpose:
SELECT *
FROM users u LEFT JOIN
connections_history ch
ON u.id = ch.guid AND
ch.date = (SELECT Max(ch1.date)
FROM connections_history ch1
WHERE ch.guid = ch1.guid
)
ORDER BY contributions DESC;
The reason is that the = is directly in the ON clause, so it is clear what the relationship between the tables is.
For your casual consideration, a different formatting of the original code. Note in particular the indented AND suggests the clause is part of the LEFT JOIN, which it is.
SELECT * FROM users
LEFT JOIN connections_history ch ON
users.id = ch.guid
AND EXISTS (SELECT 1
FROM connections_history ch1
WHERE ch.guid = ch1.guid
HAVING Max(ch1.date) = ch.date
)
We can use nested queries to first check for max_date for a given user and pass the list of guid to the nested query assuming all the users has at least one record in the connection history table otherwise you could use Left Join instead.
select B.*,X.* from users B JOIN (
select A.* from connection_history A
where A.guid = B.guid and A.date = (
select max(date) from connection_history where guid = B.guid) )X on
X.guid = B.guid
order by B.contributions DESC;

How to access parent column from a subquery within a join

I'm trying to left join the second table useri_ban based on the users' ids, with the extra condition: useri_ban.start_ban = max_start.
In order for me to calculate max_start, I have to run the following subquery:
(SELECT MAX(ub.start_ban) AS max_start, user_id FROM useri_ban ub WHERE ub.user_id = useri.id)
Furthermore, in order to add max_start to every row, I need to inner join this subquery's result into the main result. However, it seems that once I apply that join, the subquery is no longer able to access useri.id.
What am I doing wrong?
SELECT
useri.id as id,
useri.email as email,
useri_ban.warning_type_id as warning_type_id,
useri_ban.type as type,
useri.created_at AS created_at
FROM `useri`
inner join
(SELECT MAX(ub.start_ban) AS max_start, user_id FROM useri_ban ub WHERE ub.user_id = useri.id) `temp`
on `useri`.`id` = `temp`.`user_id`
left join `useri_ban` on `useri_ban`.`user_id` = `useri`.`id` and `useri_ban`.`start_ban` = `max_start`
Does this solve your problem? You need GROUP BY in the inner query instead of another join.
SELECT useri.id, useri.email, maxQuery.maxStartBan
FROM useri
INNER JOIN
(
SELECT useri_ban.user_id ubid, MAX(useri_ban.startban) maxStartBan
FROM useri_ban
GROUP BY useri_ban.user_id
) AS maxQuery
ON maxQuery.ubid = useri.id;

MySQL Inner Join with where clause sorting and limit, subquery?

Everything in the following query results in one line for each invBlueprintTypes row with the correct information. But I'm trying to add something to it. See below the codeblock.
Select
blueprintType.typeID,
blueprintType.typeName Blueprint,
productType.typeID,
productType.typeName Item,
productType.portionSize,
blueprintType.basePrice * 0.9 As bpoPrice,
productGroup.groupName ItemGroup,
productCategory.categoryName ItemCategory,
blueprints.productionTime,
blueprints.techLevel,
blueprints.researchProductivityTime,
blueprints.researchMaterialTime,
blueprints.researchCopyTime,
blueprints.researchTechTime,
blueprints.productivityModifier,
blueprints.materialModifier,
blueprints.wasteFactor,
blueprints.maxProductionLimit,
blueprints.blueprintTypeID
From
invBlueprintTypes As blueprints
Inner Join invTypes As blueprintType On blueprints.blueprintTypeID = blueprintType.typeID
Inner Join invTypes As productType On blueprints.productTypeID = productType.typeID
Inner Join invGroups As productGroup On productType.groupID = productGroup.groupID
Inner Join invCategories As productCategory On productGroup.categoryID = productCategory.categoryID
Where
blueprints.techLevel = 1 And
blueprintType.published = 1 And
productType.marketGroupID Is Not Null And
blueprintType.basePrice > 0
So what I need to get in here is the following table with the columns below it so I can use the values timestamp and sort the entire result by profitHour
tablename: invBlueprintTypesPrices
columns: blueprintTypeID, timestamp, profitHour
I need this information with the following select in mind. Using a select to show my intention of the JOIN/in-query select or whatever that can do this.
SELECT * FROM invBlueprintTypesPrices
WHERE blueprintTypeID = blueprintType.typeID
ORDER BY timestamp DESC LIMIT 1
And I need the main row from table invBlueprintTypes to still show even if there is no result from the invBlueprintTypesPrices. The LIMIT 1 is because I want the newest row possible, but deleting the older data is not a option since history is needed.
If I've understood correctly I think I need a subquery select, but how to do that? I've tired adding the exact query that is above with a AS blueprintPrices after the query's closing ), but did not work with a error with the
WHERE blueprintTypeID = blueprintType.typeID
part being the focus of the error. I have no idea why. Anyone who can solve this?
You'll need to use a LEFT JOIN to check for NULL values in invBlueprintTypesPrices. To mimic the LIMIT 1 per TypeId, you can use the MAX() or to truly make sure you only return a single record, use a row number -- this depends on whether you can have multiple max time stamps for each type id. Assuming not, then this should be close:
Select
...
From
invBlueprintTypes As blueprints
Inner Join invTypes As blueprintType On blueprints.blueprintTypeID = blueprintType.typeID
Inner Join invTypes As productType On blueprints.productTypeID = productType.typeID
Inner Join invGroups As productGroup On productType.groupID = productGroup.groupID
Inner Join invCategories As productCategory On productGroup.categoryID = productCategory.categoryID
Left Join (
SELECT MAX(TimeStamp) MaxTime, TypeId
FROM invBlueprintTypesPrices
GROUP BY TypeId
) blueprintTypePrice On blueprints.blueprintTypeID = blueprintTypePrice.typeID
Left Join invBlueprintTypesPrices blueprintTypePrices On
blueprintTypePrice.TypeId = blueprintTypePrices.TypeId AND
blueprintTypePrice.MaxTime = blueprintTypePrices.TimeStamp
Where
blueprints.techLevel = 1 And
blueprintType.published = 1 And
productType.marketGroupID Is Not Null And
blueprintType.basePrice > 0
Order By
blueprintTypePrices.profitHour
Assuming you might have the same max time stamp with 2 different records, replace the 2 left joins above with something similar to this getting the row number:
Left Join (
SELECT #rn:=IF(#prevTypeId=TypeId,#rn+1,1) rn,
TimeStamp,
TypeId,
profitHour,
#prevTypeId:=TypeId
FROM (SELECT *
FROM invBlueprintTypesPrices
ORDER BY TypeId, TimeStamp DESC) t
JOIN (SELECT #rn:=0) t2
) blueprintTypePrices On blueprints.blueprintTypeID = blueprintTypePrices.typeID AND blueprintTypePrices.rn=1
You don't say where you are putting the subquery. If in the select clause, then you have a problem because you are returning more than one value.
You can't put this into the from clause directly, because you have a correlated subquery (not allowed).
Instead, you can put it in like this:
from . . .
(select *
from invBLueprintTypesPrices ibptp
where ibtp.timestamp = (select ibptp2.timestamp
from invBLueprintTypesPrices ibptp2
where ibptp.blueprintTypeId = ibptp2.blueprintTypeId
order by timestamp desc
limit 1
)
) ibptp
on ibptp.blueprintTypeId = blueprintType.TypeID
This identifies the most recent records for all the blueprintTypeids in the subquery. It then joins in the one that matches.