I have 2 duplicate rows in a table. I want to delete only one of them and keep the other. How can I do that?
The PostgreSQL code might be a little different, but here's an example in T-SQL that does it with a CTE:
; WITH duplicates
AS (
SELECT ServerName ,
ProcessName ,
DateCreated ,
-- ORDER BY (SELECT NULL): the window needs an ORDER BY, but an integer index such as 1 is not accepted here
RowRank = ROW_NUMBER() OVER(PARTITION BY ServerName, ProcessName, DateCreated ORDER BY (SELECT NULL))
FROM dbo.ErrorLog
)
-- Delete through the CTE itself. Joining back to dbo.ErrorLog on the duplicated
-- columns would match every copy, including the one that should be kept.
DELETE FROM duplicates
WHERE RowRank > 1;
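For anyone without a SQL Server instance at hand, the keep-one-copy idea can be tried out with Python's built-in sqlite3 module. This is only a sketch under assumptions: SQLite stands in for the real database, the table is cut down to three text columns, and SQLite's hidden rowid column is what tells the otherwise identical rows apart. The helper name delete_duplicates is made up for the example.

```python
import sqlite3


def delete_duplicates(conn):
    """Keep the lowest-rowid copy of each (ServerName, ProcessName, DateCreated) group."""
    conn.execute(
        """
        DELETE FROM ErrorLog
        WHERE rowid NOT IN (
            SELECT MIN(rowid)
            FROM ErrorLog
            GROUP BY ServerName, ProcessName, DateCreated
        )
        """
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ErrorLog (ServerName TEXT, ProcessName TEXT, DateCreated TEXT)"
)
conn.executemany(
    "INSERT INTO ErrorLog VALUES (?, ?, ?)",
    [
        ("srv1", "backup", "2014-01-01"),
        ("srv1", "backup", "2014-01-01"),  # exact duplicate of the row above
        ("srv2", "import", "2014-01-02"),
    ],
)

delete_duplicates(conn)
remaining = conn.execute(
    "SELECT ServerName, ProcessName, DateCreated FROM ErrorLog ORDER BY ServerName"
).fetchall()
print(remaining)
```

The rowid trick is specific to SQLite, so treat it as an illustration of the idea rather than something to paste into another database.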
I have a table:
QUOTE
| id | value | mar_id | date |
And I am trying to select the latest row for each mar_id (market id). I have managed to achieve what I need from the query below:
SELECT
q.*
FROM quote q
WHERE q.date = (
SELECT MAX(q1.date)
FROM quote q1
WHERE q.mar_id = q1.mar_id
)
However, the query is incredibly slow (>60s), to the point that my database kills the connection.
I did an EXPLAIN to find out why and got the result:
I have a composite unique index QUO_UQ on mar_id, date which appears to be getting used.
Logically this doesn't seem like such a tough query to run. What can I do to make it more efficient?
An example of an uncorrelated subquery:
SELECT x.*
FROM quote x
JOIN
( SELECT mar_id
, MAX(date) date
FROM quote
GROUP
BY mar_id
) y
ON y.mar_id = x.mar_id
AND y.date = x.date;
select * from (
select mar_id, [date], row_number() over (partition by mar_id order by [date] desc) as [Rank]
from quote
group by mar_id, [date]) q where [Rank] = 1
Your query is fine:
SELECT q.*
FROM quote q
WHERE q.date = (SELECT MAX(q1.date)
FROM quote q1
WHERE q.mar_id = q1.mar_id
);
I recommend an index on quote(mar_id, date). This is probably the fastest method to get your result.
EDIT:
I'm curious if you find that this uses the index:
SELECT q.*
FROM quote q
WHERE q.date = (SELECT q1.date
FROM quote q1
WHERE q.mar_id = q1.mar_id
ORDER BY q1.date DESC
LIMIT 1
);
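To check that the correlated and the uncorrelated forms really pick the same rows, here is a small sketch using Python's sqlite3 module. The assumptions are mine: SQLite stands in for MySQL, the dates are ISO strings, and the sample rows are invented. It says nothing about speed, since SQLite's planner is not MySQL's; it only confirms that both queries return the latest row per mar_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE quote (id INTEGER PRIMARY KEY, value REAL, mar_id INTEGER, date TEXT)"
)
# Same shape as the composite unique index mentioned in the question.
conn.execute("CREATE UNIQUE INDEX quo_uq ON quote (mar_id, date)")
conn.executemany(
    "INSERT INTO quote (value, mar_id, date) VALUES (?, ?, ?)",
    [
        (1.0, 1, "2014-01-01"),
        (1.5, 1, "2014-01-02"),
        (2.0, 2, "2014-01-01"),
        (2.5, 2, "2014-01-03"),
    ],
)

# Correlated form: the subquery is evaluated per outer row.
correlated = conn.execute(
    """
    SELECT q.mar_id, q.date, q.value
    FROM quote q
    WHERE q.date = (SELECT MAX(q1.date) FROM quote q1 WHERE q1.mar_id = q.mar_id)
    ORDER BY q.mar_id
    """
).fetchall()

# Uncorrelated form: aggregate once, then join back.
joined = conn.execute(
    """
    SELECT x.mar_id, x.date, x.value
    FROM quote x
    JOIN (SELECT mar_id, MAX(date) AS max_date FROM quote GROUP BY mar_id) y
      ON y.mar_id = x.mar_id AND y.max_date = x.date
    ORDER BY x.mar_id
    """
).fetchall()

print(correlated)
print(joined)
```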
I'm writing a logging system for items where I track the quantity and type of various objects.
I need to write an INSERT query that only inserts a row if the quantity (qty) has changed since the last time.
This is the Query to get the last inserted Quantity:
SELECT qty FROM `qty` WHERE object='object_name' AND type='type' ORDER BY UNIX_TIMESTAMP(timestamp) DESC LIMIT 1
But how do I now say: insert only if the quantity given by the program differs from the quantity returned by the query above?
Edit:
Here is the Normal insert:
INSERT INTO `qty` (qty, object, type) VALUES ("quantity", "object_name", "type")
Edit:
I got it working now!
thanks everybody for the response! you guys are awesome :)
INSERT INTO qty (qty, object, type)
SELECT * FROM (SELECT 'qty-value', 'object-value', 'type-value') AS tmp
WHERE NOT EXISTS (
SELECT * FROM (SELECT qty FROM `qty` WHERE object = 'object-value' AND type = 'type-value' ORDER BY UNIX_TIMESTAMP( timestamp ) DESC LIMIT 1) as lastQTY WHERE qty = "qty-value"
) LIMIT 1;
If you want to insert new values, try matching the new values to the old values. If there is a match, then filter out the rows. I think the key is using insert . . . select rather than insert . . . values.
The following gives the idea:
INSERT INTO qty(qty, object, type)
select newrow.quantity, newrow.object_name, newrow.type
from (select @quantity as quantity, @object_name as object_name, @type as type
) as newrow left outer join
(SELECT qty.*
FROM qty
WHERE object = @object_name AND type = @type
ORDER BY UNIX_TIMESTAMP(timestamp) DESC
LIMIT 1
) oldrow
on newrow.quantity = oldrow.qty and
newrow.object_name = oldrow.object and
newrow.type = oldrow.type
-- no match means the latest row is missing or has a different qty
where oldrow.object is null;
I think this would do it.
This takes your input values and joins them against a subquery that finds the latest timestamp for the object and type. It then joins that against the qty table to fetch the row with that latest timestamp, but only if its qty equals the new qty.
The WHERE clause then checks that the qty from that last join is NULL (i.e. no matching record was found, assuming qty can never legitimately be NULL).
INSERT INTO `qty_test` (qty, object, type)
SELECT a.qty, a.object, a.type
FROM
(
SELECT 1 AS qty, 1 AS object, 1 AS type
) a
LEFT OUTER JOIN
(
SELECT object, type, MAX(timestamp) AS max_timestamp
FROM qty_test
GROUP BY object, type
) b
ON a.object = b.object
AND a.type = b.type
LEFT OUTER JOIN qty_test c
ON a.object = c.object
AND a.type = c.type
AND a.qty = c.qty
AND b.max_timestamp = c.timestamp
WHERE c.qty IS NULL
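The insert-only-if-changed idea from the answers above can also be sketched end to end with Python's sqlite3 module. Assumptions on my side: SQLite stands in for MySQL, an autoincrement id column decides which row is "latest" instead of the timestamp column, and the helper name insert_if_changed is made up for the example.

```python
import sqlite3


def insert_if_changed(conn, qty, obj, typ):
    """Insert a row unless the latest row for (obj, typ) already holds this qty.

    Returns True when a row was inserted.
    """
    before = conn.total_changes
    conn.execute(
        """
        INSERT INTO qty (qty, object, type)
        SELECT ?, ?, ?
        WHERE NOT EXISTS (
            SELECT 1
            FROM (SELECT qty FROM qty
                  WHERE object = ? AND type = ?
                  ORDER BY id DESC LIMIT 1) AS last_row
            WHERE last_row.qty = ?
        )
        """,
        (qty, obj, typ, obj, typ, qty),
    )
    return conn.total_changes > before


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE qty ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, qty INTEGER, object TEXT, type TEXT)"
)

results = [
    insert_if_changed(conn, 5, "bolt", "steel"),  # first row for this object
    insert_if_changed(conn, 5, "bolt", "steel"),  # same as latest, skipped
    insert_if_changed(conn, 7, "bolt", "steel"),  # changed
    insert_if_changed(conn, 5, "bolt", "steel"),  # changed back, logged again
]
row_count = conn.execute("SELECT COUNT(*) FROM qty").fetchone()[0]
print(results, row_count)
```

Only the latest row is compared, so a quantity that goes back to an earlier value is logged again, which is what a change log usually wants.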
I have a table CONTACT with a field opt_out.
The field opt_out may have values 'Y', 'N' and NULL.
I have a table CONTACT_AUDIT with fields
date
contact_id
field_name
value_before
value_after
When I add a new contact, a new row is added to the CONTACT table and nothing is added to the CONTACT_AUDIT table.
When I edit a contact, for example if I change the opt_out field value from NULL to 'Y', the opt_out field value in CONTACT table is changed and a new line is added to CONTACT_AUDIT table with values
date=NOW()
contact_id=<my contact's id>
field_name='opt_out'
value_before=NULL
value_after='Y'
I need to know the contacts who had opt_out='Y' at a given date.
I tried this :
SELECT count(*) AS nb
FROM contacts c
WHERE
( -- contact is optout now and has never been modified before
c.optout = 'Y'
AND c.id NOT IN (SELECT DISTINCT contact_id FROM contacts_audit WHERE field_name = 'optout')
)
OR ( -- we consider contacts where the last row before date in contacts_audit is optout = 'Y'
c.id IN (
SELECT ca.contact_id
FROM contacts_audit ca
WHERE date_created BETWEEN '2014-07-24' AND DATE_ADD( '2014-07-24', INTERVAL 1 DAY )
AND field_name = 'optout'
ORDER BY date_created
LIMIT 1
)
)
But MySQL does not support LIMIT inside an IN subquery.
So I tried with HAVING:
SELECT count(*) AS nb
FROM contacts c
WHERE
( -- contact is optout now and has never been modified before
c.optout = 'Y'
AND c.id NOT IN (SELECT DISTINCT contact_id FROM contacts_audit WHERE field_name = 'optout')
)
OR ( -- we consider contacts where the last row before date in contacts_audit is optout = 'Y'
c.id IN (
SELECT ca.contact_id
FROM contacts_audit ca
WHERE date_created BETWEEN '2014-07-24' AND DATE_ADD( '2014-07-24', INTERVAL 1 DAY )
AND field_name = 'optout'
HAVING MAX(date_created)
)
)
The query runs, but now I don't know how to tell whether the value corresponding to the subquery result is 'Y' or 'N'. If I add a WHERE clause to check only for 'Y' values, the 'N' values will be filtered out and I will not be able to tell whether the last value at that date was 'Y' or 'N'...
Thank you for your help
If I understand your problem correctly, you may want to use a UNION. I don't have MySQL to test it right now, but the code could be something like this. Tell me if this helps.
select c.id, c.optout
from contacts c
where c.optout = 'Y'
AND c.id NOT IN (SELECT DISTINCT contact_id FROM contacts_audit WHERE field_name = 'optout')
UNION
select c.id, c.optout from contacts c where c.id IN (
SELECT ca.contact_id
FROM contacts_audit ca
WHERE date_created BETWEEN '2014-07-24' AND DATE_ADD( '2014-07-24', INTERVAL 1 DAY )
AND field_name = 'optout'
HAVING MAX(date_created)
)
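One way to reason about "the value at a given date" is to rebuild it from the audit trail, and a small sqlite3 sketch in Python makes the three cases explicit. This is not the answerer's query, just an illustration under assumptions: SQLite stands in for MySQL, table and column names follow the queries in the question, dates are ISO strings, and every contact is assumed to have existed at the cut-off date (the schema shown has no creation date to check). The helper name optout_ids_at is hypothetical.

```python
import sqlite3

OPTOUT_AT_SQL = """
SELECT c.id
FROM contacts c
WHERE 'Y' = CASE
    -- a change exists at or before the cut-off: take what the newest one changed the value to
    WHEN EXISTS (SELECT 1 FROM contacts_audit ca
                 WHERE ca.contact_id = c.id AND ca.field_name = 'optout'
                   AND ca.date_created <= :as_of)
    THEN (SELECT ca.value_after FROM contacts_audit ca
          WHERE ca.contact_id = c.id AND ca.field_name = 'optout'
            AND ca.date_created <= :as_of
          ORDER BY ca.date_created DESC LIMIT 1)
    -- only later changes exist: take what the oldest one changed the value from
    WHEN EXISTS (SELECT 1 FROM contacts_audit ca
                 WHERE ca.contact_id = c.id AND ca.field_name = 'optout'
                   AND ca.date_created > :as_of)
    THEN (SELECT ca.value_before FROM contacts_audit ca
          WHERE ca.contact_id = c.id AND ca.field_name = 'optout'
            AND ca.date_created > :as_of
          ORDER BY ca.date_created ASC LIMIT 1)
    -- never audited: the current value has applied all along
    ELSE c.optout
END
ORDER BY c.id
"""


def optout_ids_at(conn, as_of):
    """Ids of contacts whose optout value was 'Y' at the given date."""
    return [row[0] for row in conn.execute(OPTOUT_AT_SQL, {"as_of": as_of})]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, optout TEXT)")
conn.execute(
    "CREATE TABLE contacts_audit (contact_id INTEGER, field_name TEXT, "
    "value_before TEXT, value_after TEXT, date_created TEXT)"
)
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?)",
    [(1, "Y"), (2, "N"), (3, "Y"), (4, "N"), (5, None)],
)
conn.executemany(
    "INSERT INTO contacts_audit VALUES (?, 'optout', ?, ?, ?)",
    [
        (2, None, "Y", "2014-07-20"),
        (2, "Y", "N", "2014-07-28"),
        (3, "N", "Y", "2014-07-30"),
        (5, None, "Y", "2014-07-10"),
        (5, "Y", None, "2014-07-22"),
    ],
)

at_july_24 = optout_ids_at(conn, "2014-07-24")
at_august_1 = optout_ids_at(conn, "2014-08-01")
print(at_july_24, at_august_1)
```

The CASE is used instead of COALESCE on purpose: a change back to NULL is a real value here and must not fall through to the next branch.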
I'm currently coalescing fields individually in MySQL queries, but I would like to coalesce whole records.
Is this possible?
SELECT la.id,
COALESCE(( SELECT name FROM lookup_changed l0,
( SELECT MAX(id) id
FROM lookup_changed
WHERE lookup_id = 26
) l1
WHERE l0.id = l1.id
), la.name) name,
COALESCE(( SELECT msisdn FROM lookup_changed l0,
( SELECT MAX(id) id
FROM lookup_changed
WHERE lookup_id = 26
) l1
WHERE l0.id = l1.id
), la.msisdn) msisdn
FROM lookup_added la
WHERE la.id = 26
@Alma Do - the pseudo-SQL is:
SELECT la.id,
MULTICOALESCE(( SELECT <name, msisdn> FROM lookup_changed l0,
( SELECT MAX(id) id
FROM lookup_changed
WHERE lookup_id = 26
) l1
WHERE l0.id = l1.id
), <la.name, la.msisdn>) <name, msisdn>
FROM lookup_added la
WHERE la.id = 26
Since COALESCE() "return[s] the first non-NULL argument", it sounds like you want to retrieve the "first non-NULL result from a set of queries":
-- syntax error
SELECT COALESCE(
SELECT a FROM ta,
SELECT b FROM tb
);
-- roughly equates to
( SELECT a AS val FROM ta WHERE a IS NOT NULL ORDER BY a LIMIT 1 )
UNION
( SELECT b AS val FROM tb WHERE b IS NOT NULL ORDER BY b LIMIT 1 )
ORDER BY val LIMIT 1 ;
Comments:
I added ORDER BY clauses, since without them "first row" means nothing.
The inner LIMIT 1 clauses are optional (but allow early trimming of the sub-results).
The parentheses around the subqueries are mandatory.
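For the original "coalesce a whole record" question, one possible shape is to join the latest lookup_changed row once and then choose between the two records as a unit. The sketch below uses Python's sqlite3 module. It is an illustration under assumptions rather than the answerer's approach: SQLite stands in for MySQL, the sample rows are invented, and the helper name current_record is hypothetical.

```python
import sqlite3

WHOLE_RECORD_SQL = """
SELECT la.id,
       -- lc.id is non-NULL exactly when a changed record exists, so all
       -- columns switch together instead of being coalesced one by one
       CASE WHEN lc.id IS NOT NULL THEN lc.name   ELSE la.name   END AS name,
       CASE WHEN lc.id IS NOT NULL THEN lc.msisdn ELSE la.msisdn END AS msisdn
FROM lookup_added la
LEFT JOIN lookup_changed lc
  ON lc.id = (SELECT MAX(id) FROM lookup_changed WHERE lookup_id = la.id)
WHERE la.id = ?
"""


def current_record(conn, lookup_id):
    """Latest changed record for the lookup if there is one, else the added record."""
    return conn.execute(WHOLE_RECORD_SQL, (lookup_id,)).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lookup_added (id INTEGER PRIMARY KEY, name TEXT, msisdn TEXT)")
conn.execute(
    "CREATE TABLE lookup_changed ("
    "id INTEGER PRIMARY KEY, lookup_id INTEGER, name TEXT, msisdn TEXT)"
)
conn.executemany(
    "INSERT INTO lookup_added VALUES (?, ?, ?)",
    [(26, "alice", "111"), (27, "bob", "333")],
)
conn.executemany(
    "INSERT INTO lookup_changed VALUES (?, ?, ?, ?)",
    [(1, 26, "alice b", "222"), (2, 26, "alice c", None)],
)

changed = current_record(conn, 26)
unchanged = current_record(conn, 27)
print(changed, unchanged)
```

Note the difference from per-column COALESCE: the NULL msisdn of the latest changed record is kept, rather than being filled in from the added record.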
This is my initial query:
SELECT bid_tag.*
FROM bid_tag join
(select serial_number, count(*) as cnt
from bid_tag where user_id = 0
group by serial_number
) tsum
on tsum.serial_number = bid_tag.serial_number and cnt > 1
order by bid_tag.serial_number
LIMIT 0, 21000;
Now from those results, I need to SELECT all where tag_design = 0 AND tag_size = 0 and then DELETE those records from the database.
I just don't know how to run a query on the results of an initial query.
Just replace SELECT with DELETE and it will delete the rows that would have been selected.
DELETE bid_tag.*
FROM bid_tag join
(select serial_number, count(*) as cnt
from bid_tag where user_id = 0
group by serial_number
) tsum
on tsum.serial_number = bid_tag.serial_number and cnt > 1
WHERE tag_design = 0 AND tag_size = 0;
-- ORDER BY and LIMIT are left out: MySQL does not allow them in a multi-table DELETE
Use an EXISTS term in your WHERE clause:
DELETE
FROM bid_tag btd
WHERE EXISTS (
SELECT 1
FROM (
SELECT bts.*
FROM bid_tag bts
JOIN (
SELECT serial_number, count(*) as cnt
FROM bid_tag btj
WHERE btj.user_id = 0
GROUP BY btj.serial_number
) tsum
ON ( tsum.serial_number = bts.serial_number
AND tsum.cnt > 1
)
WHERE bts.tag_design = 0
AND bts.tag_size = 0
ORDER BY bts.serial_number
LIMIT 0
, 21000
) rs_base
WHERE rs_base.id = btd.id -- PK column
)
;
The subquery in the EXISTS term can be nested further to contain another query on the result set of the original one. Just make sure that you always select the primary key of the table on which the deletion is to be performed.
Note that you probably don't want to restrict a delete operation to part of your result set, so check whether you really need to limit it to the top 21000 results. If you don't, drop the ORDER BY and LIMIT clauses.
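The delete-what-the-query-selected idea can be tried safely on a throwaway table with Python's sqlite3 module. This is a sketch under assumptions: SQLite stands in for MySQL, the sample rows are invented, and the ORDER BY / LIMIT part is left out. SQLite accepts a plain IN subquery on the table being deleted from; MySQL may reject that form unless the subquery is wrapped in a derived table, as the answers above do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bid_tag (id INTEGER PRIMARY KEY, serial_number TEXT, "
    "user_id INTEGER, tag_design INTEGER, tag_size INTEGER)"
)
conn.executemany(
    "INSERT INTO bid_tag VALUES (?, ?, ?, ?, ?)",
    [
        (1, "A", 0, 0, 0),  # A appears twice with user_id = 0, design/size 0: deleted
        (2, "A", 0, 1, 0),  # A again, but tag_design = 1: kept
        (3, "B", 0, 0, 0),  # B appears once with user_id = 0: kept
        (4, "A", 5, 0, 0),  # other user, serial still qualifies: deleted
        (5, "C", 0, 0, 0),  # C appears twice with user_id = 0: deleted
        (6, "C", 0, 0, 0),  # deleted
    ],
)

conn.execute(
    """
    DELETE FROM bid_tag
    WHERE tag_design = 0
      AND tag_size = 0
      AND serial_number IN (
          -- serial numbers that occur more than once for user_id = 0
          SELECT serial_number
          FROM bid_tag
          WHERE user_id = 0
          GROUP BY serial_number
          HAVING COUNT(*) > 1
      )
    """
)

remaining_ids = [row[0] for row in conn.execute("SELECT id FROM bid_tag ORDER BY id")]
print(remaining_ids)
```

Running the same WHERE clause as a SELECT first is a cheap way to see exactly which rows would go before deleting anything.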