Remove duplicates based on two columns [duplicate] - mysql

This question already has answers here:
How to select and/or delete all but one row of each set of duplicates in a table?
(2 answers)
How can I remove duplicate rows?
(43 answers)
Closed 1 year ago.
I have a flights table with a few columns, but somebody seems to have run a migration twice, which created the same data twice.
Each flight should appear only once for a given combination of flight_number and date.
Basically the table is looking like this at the moment:
flight_number | date
123 | 2021-09-16
123 | 2021-09-16
123 | 2021-09-17
124 | 2021-09-18
124 | 2021-09-18
Result I want:
flight_number | date
123 | 2021-09-16
123 | 2021-09-17
124 | 2021-09-18
Basically, keep only one row and remove the duplicates (rows with the same flight_number on the same date).
I'm looking for a DELETE query but couldn't find one that does this.
What query can help me achieve it?
Thanks!
EDIT: Yes, every row has an id column that is unique even if the rest of the data is the same.

You need to identify which rows to keep and which to remove; this can be done like this (MySQL 8.0+):
delete ff
from flight ff
inner join (
-- number the rows within each (flight_number, date) group, lowest id first
select id, row_number() over (partition by flight_number, date order by id) as rn
from flight
) dups
on ff.id = dups.id
where dups.rn > 1;
Basically, this uses ROW_NUMBER to number the rows within each partition, in this case each combination of flight_number and date, and then deletes every record whose row number is greater than 1, so the row with the lowest id in each group is kept.
The join must be on the unique id column. Joining on flight_number alone would match every row of a duplicated flight and delete all of them, including the one you want to keep. See this fiddle: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=58a4ac7235ea22b557116ad68c8449c3
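For trying the idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in (an assumption, not the asker's setup). It uses a `MIN(id)` subquery instead of ROW_NUMBER, which is a different technique that yields the same result: the lowest id in each (flight_number, date) group survives. The table name `flight` follows the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flight (id INTEGER PRIMARY KEY, flight_number INTEGER, date TEXT)"
)
rows = [
    (123, "2021-09-16"),
    (123, "2021-09-16"),
    (123, "2021-09-17"),
    (124, "2021-09-18"),
    (124, "2021-09-18"),
]
conn.executemany("INSERT INTO flight (flight_number, date) VALUES (?, ?)", rows)

# Keep the lowest id in each (flight_number, date) group and delete the rest.
conn.execute(
    """
    DELETE FROM flight
    WHERE id NOT IN (
        SELECT MIN(id) FROM flight GROUP BY flight_number, date
    )
    """
)

remaining = conn.execute(
    "SELECT flight_number, date FROM flight ORDER BY flight_number, date"
).fetchall()
print(remaining)
# [(123, '2021-09-16'), (123, '2021-09-17'), (124, '2021-09-18')]
```

Note that MySQL rejects a DELETE whose subquery reads the target table directly (error 1093), so there the subquery would have to be wrapped in a derived table, or the join form above used instead.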

Related

Joining missing dates from calendar table [duplicate]

This question already has answers here:
MySQL how to fill missing dates in range?
(6 answers)
Closed 4 years ago.
I have a table with information and dates, which have some missing ones, so I want to join that table with a calendar table to fill missing dates and set values in another column in the same row to null. This is an example:
Steps | Date
10 | 2018-04-30
20 | 2018-04-28
And I want to get the following:
Steps | Date
10 | 2018-04-30
null | 2018-04-29
20 | 2018-04-28
This is what I tried (real query, so you can point out if I'm doing something wrong):
SELECT sum(steps), date(from_unixtime(u.in_date)) as stepdate
FROM userdata u
RIGHT JOIN
time_dimension td
ON date(from_unixtime(u.in_date)) = td.db_date
AND user_id = 8
GROUP BY day(from_unixtime(in_date))
ORDER BY stepdate DESC;
I expected this query to do what I wanted, but it doesn't. The table time_dimension and its column db_date have all dates (ranging from 2017-01-01 to 2030-01-01), which is the one I'm trying to join userdata's in_date column (which is in unix_time).
Edit: I checked the following questions in SO:
Join to Calendar Table - 5 Business Days
What's the difference between INNER JOIN, LEFT JOIN, RIGHT JOIN and FULL JOIN?
Edit, regarding the duplicate: That question in particular is using intervals and date_add to compare against their table. I am using a calendar table instead to join them. While similar, I don't think they will have the same solution.
Solution: Thanks to xQBert, who pointed out the mistake:
PROBLEM: Because both the GROUP BY and the SELECT use the userdata table, you're basically ignoring the time dimension data. There is no 2018-04-29 date in userdata (for user 8), right? Fix the SELECT and GROUP BY to source from the time dimension data and the problem is solved.
So, I changed GROUP BY day(from_unixtime(in_date)) to GROUP BY td.db_date.
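Start from the calendar table and LEFT JOIN userdata to it (this is equivalent to the original RIGHT JOIN with the tables swapped, but easier to read). Select and group by td.db_date, which is already a date and needs no from_unixtime conversion, and keep the user_id filter in the ON clause: in a WHERE clause it would discard the calendar rows that have no matching userdata row, which are exactly the missing dates you want to see.
SELECT sum(u.steps), td.db_date AS stepdate
FROM time_dimension td
LEFT JOIN userdata u
ON date(from_unixtime(u.in_date)) = td.db_date
AND u.user_id = 8
GROUP BY td.db_date
ORDER BY stepdate DESC;
This assumes the time_dimension table serves as the calendar table. Add a WHERE condition on td.db_date if you only want part of its range.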
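The effect of filtering in the ON clause can be checked with a small sketch. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption), so `date(in_date, 'unixepoch')` replaces `date(from_unixtime(in_date))`; the three-day calendar, the extra user 9, and the `ts` helper are made up for illustration.

```python
import sqlite3
from datetime import datetime, timezone


def ts(year, month, day):
    """Unix timestamp for noon UTC on the given day."""
    return int(datetime(year, month, day, 12, tzinfo=timezone.utc).timestamp())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE time_dimension (db_date TEXT)")
conn.execute("CREATE TABLE userdata (user_id INTEGER, steps INTEGER, in_date INTEGER)")
conn.executemany(
    "INSERT INTO time_dimension VALUES (?)",
    [("2018-04-28",), ("2018-04-29",), ("2018-04-30",)],
)
conn.executemany(
    "INSERT INTO userdata VALUES (?, ?, ?)",
    [
        (8, 10, ts(2018, 4, 30)),
        (8, 20, ts(2018, 4, 28)),
        (9, 99, ts(2018, 4, 29)),  # another user; must not leak into user 8's result
    ],
)

# The user filter sits in ON, so calendar days without data for user 8 survive.
filled = conn.execute(
    """
    SELECT SUM(u.steps), td.db_date AS stepdate
    FROM time_dimension td
    LEFT JOIN userdata u
      ON date(u.in_date, 'unixepoch') = td.db_date
     AND u.user_id = 8
    GROUP BY td.db_date
    ORDER BY stepdate DESC
    """
).fetchall()
print(filled)
# [(10, '2018-04-30'), (None, '2018-04-29'), (20, '2018-04-28')]
```

Moving `u.user_id = 8` into a WHERE clause would drop the 2018-04-29 row, because its user_id is NULL after the outer join.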

Sum multiple columns showing all results [duplicate]

This question already has answers here:
SQL to add a summary row to MySQL result set
(4 answers)
Closed 4 years ago.
Is it possible to total two columns while still showing all of the results? At the moment, if I sum one of the columns in the select statement (e.g. sum(sales_amount)), only one line of the results is shown.
SALES_ID EMPLOYEE_ID PRODUCT_ID SALES_AMOUNT QUANTITY DATE
1 123148 4578947 80 1 01/01/2018
2 123148 5124578 80 1 01/01/2018
I want to keep the two rows shown above, plus an extra line showing a total of 160 under sales_amount and 2 under quantity.
This is arguably bad SQL because the semantics of the last row are different from the others. (Rows and columns should be logically interchangeable.) That said, as I write this, I see that @Gordon-Linoff has just given the answer that I was going to give. Still, I would argue that such aggregations should be separate.
The simplest way in this case is probably union all:
select SALES_ID, EMPLOYEE_ID, PRODUCT_ID, SALES_AMOUNT, QUANTITY, DATE
from t
union all
select NULL, NULL, NULL, SUM(sales_amount), sum(quantity), date
from t
group by date;
If your data is the result of an aggregation query, then rollup would be appropriate.
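As a quick check of the union all approach, here is a minimal sketch with Python's sqlite3 module standing in for MySQL (an assumption); the table name `t` follows the answer and the two rows come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE t (
        sales_id INTEGER, employee_id INTEGER, product_id INTEGER,
        sales_amount INTEGER, quantity INTEGER, date TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 123148, 4578947, 80, 1, "01/01/2018"),
        (2, 123148, 5124578, 80, 1, "01/01/2018"),
    ],
)

# Detail rows first, then one summary row per date with NULLs in the id columns.
result = conn.execute(
    """
    SELECT sales_id, employee_id, product_id, sales_amount, quantity, date
    FROM t
    UNION ALL
    SELECT NULL, NULL, NULL, SUM(sales_amount), SUM(quantity), date
    FROM t
    GROUP BY date
    """
).fetchall()
for row in result:
    print(row)
```

The summary row is the one whose first three columns are NULL (None in Python). Without an ORDER BY the database does not promise that it comes last, so sort explicitly if the position matters.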
SELECT SALES_ID, EMPLOYEE_ID, PRODUCT_ID, SALES_AMOUNT, QUANTITY, DATE,
(SALES_AMOUNT * QUANTITY) AS TOTAL
FROM (YOUR TABLE NAME)
If you use this query, the amount is multiplied by the quantity for each row and displayed in the new TOTAL column. Note that this is a per-row total, not a summary row.

Count and sum up all duplicate records in MySQL

I have a table with the following structure.
id name
1 john
2 ana
3 john
4 ana
5 peter
6 ana
7 Abrar
8 Raju
Duplicate entries in the table are as follows
john(2) duplicate
ana(3) duplicate
The names which are duplicates are john and ana.
My question is: how do I count the total number of records that are duplicates? Here it is 5 records.
Note: I looked at a similar question, but it explains how to add a third column showing, for each name, how many duplicates of that name exist. In my case I want the total number of duplicate records in the table regardless of name (here the query should return just the number 5).
Just take a count subquery on the query you already have in mind (or perhaps have already written):
SELECT SUM(cnt) AS total_duplicates
FROM
(
SELECT COUNT(*) AS cnt
FROM yourTable
GROUP BY name
HAVING COUNT(*) > 1
) t;
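The query can be checked against the sample data with a minimal sketch using Python's sqlite3 module as a stand-in for MySQL (an assumption); `yourTable` is the placeholder name from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (id INTEGER PRIMARY KEY, name TEXT)")
names = ["john", "ana", "john", "ana", "peter", "ana", "Abrar", "Raju"]
conn.executemany("INSERT INTO yourTable (name) VALUES (?)", [(n,) for n in names])

# Inner query: group sizes for names that occur more than once (john=2, ana=3).
# Outer query: add those group sizes together.
total_duplicates = conn.execute(
    """
    SELECT SUM(cnt) AS total_duplicates
    FROM (
        SELECT COUNT(*) AS cnt
        FROM yourTable
        GROUP BY name
        HAVING COUNT(*) > 1
    ) t
    """
).fetchone()[0]
print(total_duplicates)  # 5
```

If no name is duplicated, the inner query returns no rows and SUM yields NULL rather than 0; wrapping it in COALESCE(SUM(cnt), 0) avoids that.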

Iterate through a column and summarize findings

I have a table (t1) in MySQL that looks like this:
type time full
0 11 yes
1 22 yes
0 11 no
3 13 no
I would like to create a second table (t2) from this that will summarize the information found in t1 like the following:
type time num_full total
0 11 1 2
1 22 1 1
3 13 0 1
I want to be able to iterate through the type column in order to be able to start this summary, something like a for-loop. The types can be up to a value of n, so I would rather not write n+1 WHERE statements, then have to update the code every time more types are added.
Notice how t2 skipped the type of value 2? This has also been escaping me when I try looping. I only want the types that are actually found to have rows created in t2.
While a direct answer would be nice, it would be much more helpful to be pointed to some sources where I could figure this out, or both.
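This may do what you want:
create table if not exists t2 as
select type, time,
sum(`full` = 'yes') as num_full, -- counts the rows where full is 'yes'
count(*) as total
from t1
group by type, time
order by type, time;
GROUP BY only produces rows for the (type, time) combinations that actually occur in t1, so no loop is needed and absent types such as 2 are skipped automatically. You may need to adjust this depending on how you want to aggregate the time column.
This is a starting point for reference on the group by functions: https://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html
The CREATE TABLE syntax is documented here:
https://dev.mysql.com/doc/refman/5.6/en/create-table.html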
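To see that a single GROUP BY replaces the loop, here is a minimal sketch with Python's sqlite3 module standing in for MySQL (an assumption). It counts the 'yes' rows with a comparison inside SUM and names the last column `total`, as in the question's desired output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "full" is quoted because it is also an SQL keyword.
conn.execute('CREATE TABLE t1 (type INTEGER, time INTEGER, "full" TEXT)')
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [(0, 11, "yes"), (1, 22, "yes"), (0, 11, "no"), (3, 13, "no")],
)

# One output row per (type, time) pair that exists in t1; type 2 never appears.
conn.execute(
    """
    CREATE TABLE t2 AS
    SELECT type, time,
           SUM("full" = 'yes') AS num_full,  -- the comparison yields 1 or 0
           COUNT(*) AS total
    FROM t1
    GROUP BY type, time
    """
)

summary = conn.execute("SELECT * FROM t2 ORDER BY type, time").fetchall()
print(summary)
# [(0, 11, 1, 2), (1, 22, 1, 1), (3, 13, 0, 1)]
```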

Display a table with just the second duplicate rows removed yet keep the first row

So, I have a table with 3 columns, of which the first column consists of IDs and the last column consists of dates. What I need is, to sort the table by dates, and remove any duplicate IDs with a later date (and keep the ID with the earliest date).
For example,
This is what my table originally looks like -
123 Ryan 01/01/2011
345 Carl 03/01/2011
123 Lisa 01/02/2012
870 Tiya 06/03/2012
345 Carl 07/01/2012
I want my resultant table to look like this -
123 Ryan 01/01/2011
345 Carl 03/01/2011
870 Tiya 06/03/2012
I'm using VBA Access Code to find a solution for the above, and used SQL Queries too, however my resultant table either has no duplicates whatsoever or displays all the records.
Any help will be appreciated.
This will create a new table:
SELECT tbl.SName, a.ID, a.BDate
INTO NoDups
FROM tbl
INNER JOIN (
SELECT ID, Min(ADate) As BDate
FROM tbl GROUP BY ID) AS a
ON (tbl.ADate = a.BDate) AND (tbl.ID = a.ID);
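The same join against a MIN-per-ID subquery can be tried outside Access. This is a minimal sketch using Python's sqlite3 module (an assumption, as is the day/month reading of the sample dates); the dates are stored as ISO strings so that MIN compares them correctly as text, and the column names follow the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (ID INTEGER, SName TEXT, ADate TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        (123, "Ryan", "2011-01-01"),
        (345, "Carl", "2011-01-03"),
        (123, "Lisa", "2012-02-01"),
        (870, "Tiya", "2012-03-06"),
        (345, "Carl", "2012-01-07"),
    ],
)

# For each ID find its earliest date, then join back to fetch that whole row.
earliest = conn.execute(
    """
    SELECT t.ID, t.SName, t.ADate
    FROM tbl AS t
    INNER JOIN (
        SELECT ID, MIN(ADate) AS BDate
        FROM tbl
        GROUP BY ID
    ) AS a
      ON t.ID = a.ID AND t.ADate = a.BDate
    ORDER BY t.ADate
    """
).fetchall()
print(earliest)
# [(123, 'Ryan', '2011-01-01'), (345, 'Carl', '2011-01-03'), (870, 'Tiya', '2012-03-06')]
```

If an ID has two rows sharing the same earliest date, both are returned, so a further tie-breaker would be needed in that case.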