Display a table with just the second duplicate rows removed yet keep the first row - ms-access

So, I have a table with 3 columns: the first column holds IDs and the last column holds dates. I need to sort the table by date and, for any duplicated ID, remove the rows with a later date (keeping the row with the earliest date).
For example,
This is what my table originally looks like -
123 Ryan 01/01/2011
345 Carl 03/01/2011
123 Lisa 01/02/2012
870 Tiya 06/03/2012
345 Carl 07/01/2012
I want my resultant table to look like this -
123 Ryan 01/01/2011
345 Carl 03/01/2011
870 Tiya 06/03/2012
I'm using Access VBA to find a solution for the above, and have tried SQL queries too; however, my resultant table either has no duplicates whatsoever or displays all the records.
Any help will be appreciated.

This will create a new table:
SELECT tbl.SName, a.ID, a.BDate
INTO NoDups
FROM tbl
INNER JOIN (
SELECT ID, Min(ADate) As BDate
FROM tbl GROUP BY ID) AS a
ON (tbl.ADate = a.BDate) AND (tbl.ID = a.ID);
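As a rough check of the query above, here is a minimal sketch that runs the same join against the sample rows. It uses Python's built-in sqlite3 as a stand-in for Access, so SELECT ... INTO is replaced by CREATE TABLE ... AS SELECT. The column names SName and ADate come from the answer; storing the dates as ISO text and reading them as dd/mm/yyyy are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl (ID INTEGER, SName TEXT, ADate TEXT);
-- Sample rows from the question; dates read as dd/mm/yyyy (assumption)
INSERT INTO tbl VALUES
  (123, 'Ryan', '2011-01-01'),
  (345, 'Carl', '2011-01-03'),
  (123, 'Lisa', '2012-02-01'),
  (870, 'Tiya', '2012-03-06'),
  (345, 'Carl', '2012-01-07');

-- SQLite has no SELECT ... INTO, so CREATE TABLE ... AS SELECT stands in for it
CREATE TABLE NoDups AS
SELECT tbl.SName AS SName, a.ID AS ID, a.BDate AS BDate
FROM tbl
INNER JOIN (
    SELECT ID, MIN(ADate) AS BDate
    FROM tbl GROUP BY ID) AS a
ON tbl.ADate = a.BDate AND tbl.ID = a.ID;
""")

rows = conn.execute(
    "SELECT ID, SName, BDate FROM NoDups ORDER BY BDate"
).fetchall()
for row in rows:
    print(row)
```

Note that if two rows share both the ID and the earliest date, the join keeps both of them.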


Remove duplicated based on two columns [duplicate]

I have a flights table that consists of a few columns, but somebody seems to have run a migration twice, which created the same data twice.
Anyway, each combination of flight_number and date should appear only once.
Basically the table is looking like this at the moment:
flight_number date
123 2021-09-16
123 2021-09-16
123 2021-09-17
124 2021-09-18
124 2021-09-18
Result I want:
flight_number date
123 2021-09-16
123 2021-09-17
124 2021-09-18
Basically, keep only one row and remove the duplicates (rows where both the flight_number and the date are the same).
I'm looking for a DELETE SQL query but couldn't find the one like I am looking for.
What is the query that can help me achieve it?
Thanks!
EDIT: Yes, all the data has a column id that is unique even if the data is same.
You need to identify which rows to keep and which to remove. This can be done as follows:
delete ff from
flight ff
inner join (
select id, row_number() over (partition by flight_number, date order by id) as rn
from flight f
) dups
on ff.id = dups.id
where dups.rn > 1
Basically, this uses Row_Number to number the rows within each group: for each (partition) flight_number and date combination, the rows are numbered starting from 1, and any record whose row number is > 1 is deleted.
The join has to use the unique id column mentioned in the question's edit; joining on flight_number alone would also delete the rows you want to keep. A working example: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=58a4ac7235ea22b557116ad68c8449c3
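A minimal sketch of the same row-numbering idea, run with Python's built-in sqlite3 rather than MySQL. SQLite has no DELETE ... JOIN, so the rows to remove are selected with an IN subquery instead; that rewrite, the integer id column, and the need for a SQLite build with window functions (3.25 or later) are assumptions of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE flight (
    id INTEGER PRIMARY KEY,       -- unique id, as stated in the question's edit
    flight_number INTEGER,
    date TEXT
);
INSERT INTO flight (flight_number, date) VALUES
  (123, '2021-09-16'),
  (123, '2021-09-16'),
  (123, '2021-09-17'),
  (124, '2021-09-18'),
  (124, '2021-09-18');
""")

# Number the rows inside each (flight_number, date) group and
# delete every row after the first one in its group.
conn.execute("""
DELETE FROM flight
WHERE id IN (
    SELECT id FROM (
        SELECT id,
               ROW_NUMBER() OVER (
                   PARTITION BY flight_number, date ORDER BY id
               ) AS rn
        FROM flight
    )
    WHERE rn > 1
)
""")

remaining = conn.execute(
    "SELECT flight_number, date FROM flight ORDER BY flight_number, date"
).fetchall()
for row in remaining:
    print(row)
```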

Looking for a low footprint solution to GROUP rows using HAVING to filter

Here is a table
id date name
1 180101 josh
2 180101 peter
3 180101 julia
4 180102 robert
5 180103 patrick
6 180104 josh
7 180104 adam
I need to get all the names that share a date with 'josh'. How can I achieve this without grouping the whole table? I need to keep it efficient (this is not my real table; I have simplified my problem here. The real table has hundreds of thousands of records, and 99% of the rows have different dates, so rows that can be grouped by date are rare).
So basically what I want is: if 'josh' is the target, I need to get 'josh,peter,julia,adam' (actually the first 10 distinct names sharing a date with josh).
SELECT
COUNT(date) as datecount,
GROUP_CONCAT(DISTINCT name) as names,
FROM
table
GROUP BY
date
HAVING
datecount>1
// && name IN ('josh') would work nice for me, but im getting error because 'name' is not in GROUPED BY
LIMIT 10
Any ideas? As I mentioned, it needs to be fast, and most of the rows have unique dates.
Join the table with itself on date:
select distinct t1.name
from tbl t1
join tbl t2 using (date)
where t2.name = 'josh'
For the best performance you would have indexes on (name) and (date, name).
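The self-join can be tried on the sample rows with a small sketch like the following, using Python's built-in sqlite3 as a stand-in for MySQL. The table name tbl follows the answer; the index names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl (id INTEGER PRIMARY KEY, date TEXT, name TEXT);
INSERT INTO tbl VALUES
  (1, '180101', 'josh'),
  (2, '180101', 'peter'),
  (3, '180101', 'julia'),
  (4, '180102', 'robert'),
  (5, '180103', 'patrick'),
  (6, '180104', 'josh'),
  (7, '180104', 'adam');
-- The two indexes suggested in the answer (index names are hypothetical)
CREATE INDEX idx_name ON tbl (name);
CREATE INDEX idx_date_name ON tbl (date, name);
""")

# t2 picks josh's rows; t1 picks every row that shares a date with them.
names = sorted(
    row[0]
    for row in conn.execute("""
        SELECT DISTINCT t1.name
        FROM tbl t1
        JOIN tbl t2 USING (date)
        WHERE t2.name = 'josh'
        LIMIT 10
    """)
)
print(names)
```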

what does this sql query do? SELECT column_1 FROM table_1,table_2;

SELECT column_1 FROM table_1,table_2;
When I ran this on my database it returned a huge number of rows with duplicate column_1 values. I could not understand why I got these results. Please explain what this query does.
It gives you a cross product of table_1 and table_2.
In more layman's terms, it means that for each record in Table A, you get every record from Table B (all possible combinations).
TableA with 3 records and Table B with 3 records gives 9 total records in the result:
TableA-1/B-1
TableA-1/B-2
TableA-1/B-3
TableA-2/B-1
TableA-2/B-2
TableA-2/B-3
TableA-3/B-1
TableA-3/B-2
TableA-3/B-3
Often used as a basis for Cartesian Queries (which themselves are the means to generate, say, a list of future dates based on a recurrence schedule: give me all possible results for the next 6 months, then restrict that set to those whose factor matches my day of the week)
This is a 'valid' way of cross joining two tables, but it is not the preferred way. An explicit CROSS JOIN would be much clearer. A join condition would then be helpful to limit the results.
Imagine that I have 3 friends named Jhon, Ana and Nick in one table, and 2 T-shirts, a red one and a yellow one, in the other, and I want to know which T-shirt belongs to whom.
So with tableA: Friends and tableB: Tshirts, the query returns:
1|JHON | t-shirt_YELLOW
2|JHON | t-shirt_RED
3|ANA | t-shirt_YELLOW
4|ANA | t-shirt_RED
5|NICK | t-shirt_YELLOW
6|NICK | t-shirt_RED
As you can see, this join has no relational logic between Friends and Tshirts, so evaluating all the possible combinations generates what you call duplicates.
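To see the cross product concretely, here is a small sketch with Python's built-in sqlite3. The table names Friends and Tshirts come from the example above; the column names name and colour are assumptions. It runs the comma form and the explicit CROSS JOIN form and compares them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Friends (name TEXT);
CREATE TABLE Tshirts (colour TEXT);
INSERT INTO Friends VALUES ('JHON'), ('ANA'), ('NICK');
INSERT INTO Tshirts VALUES ('YELLOW'), ('RED');
""")

# Comma-separated tables with no join condition: every friend paired with every T-shirt
implicit = conn.execute("SELECT name, colour FROM Friends, Tshirts").fetchall()

# The same result written as an explicit CROSS JOIN
explicit = conn.execute(
    "SELECT name, colour FROM Friends CROSS JOIN Tshirts"
).fetchall()

print(len(implicit))                         # 3 friends x 2 T-shirts
print(sorted(implicit) == sorted(explicit))  # both forms give the same pairs
```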

Retrieve rows that have a first entry in 2014 in MySQL

I want to retrieve all rows from a table that have their first entry on or after 01/01/2014 but no later than 31/12/2014
Example of the table:
OID FK_OID Treatment Trt_DATE
1 100 19304 2011-05-24
2 100 19304 2011-08-01
3 100 19306 2014-03-05
4 200 19305 2012-02-02
5 300 19308 2014-01-20
6 400 19308 2014-06-06
For example, I would like to pull all entries for people who STARTED treatment in 2014. So above I would like to extract FK_OIDs 300 and 400 because their first entry is in 2014, but omit FK_OID 100 because they have 2 entries prior to 2014.
How do I go about this? I can extract all entries within a date range, but that brings back all entries in that range and doesn't omit anyone who has an entry prior to the start of the range. It just returns their first entry in 2014.
For those who need to see that I have tried something, see below.
I am not an experienced coder and this is the best I can manage.
SELECT
mod,
(select NHSNum from person p
WHERE
p.oid = t.fk_oid) as 'NHS'
FROM
timeline t
Where trt_date BETWEEN '2014-01-01' AND '2014-12-31'
ORDER BY trt_date ASC
This returns every treatment for 2014 regardless of whether it is the first ever one for that person. I want to omit anyone from this list who has had treatment before 01/01/2014 as well as only return the first treatment per person. For example, this code returns all treatments for all people in 2014. I only want their first one and only if it is their first one ever.
Thanks.
create table aThing
( oid int auto_increment primary key,
fk_oid int not null,
treatment int not null,
trt_date date not null
);
insert aThing (fk_oid,treatment,trt_date) values
(100, 19304, '2011-05-24'),
(100, 19304, '2011-08-01'),
(100, 19306, '2014-03-05'),
(200, 19305, '2012-02-02'),
(300, 19308, '2014-01-20'),
(400, 19308, '2014-06-06');
select fk_oid,dt
from
( select fk_oid,min(trt_date) as dt
from aThing
group by fk_oid
) xDerived
where year(dt)=2014;
+--------+------------+
| fk_oid | dt |
+--------+------------+
| 300 | 2014-01-20 |
| 400 | 2014-06-06 |
+--------+------------+
The inner, nested part becomes a derived table and is given the name xDerived. Even though it is just a result set, making it a derived table means it can be referred to by name. It is not a physical table, but a derived (virtual) one.
That derived table is a very simple group by with an aggregate function. It says: for every fk_oid, bring back exactly one row, holding the minimum trt_date for that fk_oid.
So if you have 10 million rows in aThing but only 17 distinct values for fk_oid, it returns only 17 rows, each holding the minimum trt_date for its fk_oid.
With that in place, the outer wrapper says: just show me those two columns, but only where the year is 2014. The reason for the wrapper is as follows.
The min has to be given an alias (dt), and the alias of an aggregate column is not available in the where clause of the same query. Wrapping the query in a derived table "cleanses" the alias, so to speak: from the outside, dt is an ordinary column and can be used in a where clause.
So I can't access it directly in the where clause of its own query, but when I wrap it in an envelope (a derived table), I can access it on the outside.
Showing this in more detail would mean listing the alternative attempts and the syntax errors they produce.
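The derived-table pattern can be tried with a short sketch using Python's built-in sqlite3. SQLite has no year() function, so strftime('%Y', ...) stands in for it here (an assumption of this sketch, not part of the answer). A HAVING clause that repeats the aggregate is included as a comparison; it is one alternative way to filter on the minimum, not the answer's method.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE aThing (
    oid INTEGER PRIMARY KEY,
    fk_oid INTEGER NOT NULL,
    treatment INTEGER NOT NULL,
    trt_date TEXT NOT NULL
);
INSERT INTO aThing (fk_oid, treatment, trt_date) VALUES
  (100, 19304, '2011-05-24'),
  (100, 19304, '2011-08-01'),
  (100, 19306, '2014-03-05'),
  (200, 19305, '2012-02-02'),
  (300, 19308, '2014-01-20'),
  (400, 19308, '2014-06-06');
""")

# Inner query: one row per fk_oid with its earliest date, aliased as dt.
# Outer query: dt is now an ordinary column, so WHERE can filter on it.
derived = conn.execute("""
    SELECT fk_oid, dt
    FROM (
        SELECT fk_oid, MIN(trt_date) AS dt
        FROM aThing
        GROUP BY fk_oid
    ) AS xDerived
    WHERE strftime('%Y', dt) = '2014'
    ORDER BY fk_oid
""").fetchall()

# Comparison: filter on the aggregate itself with HAVING, no wrapper needed.
having = conn.execute("""
    SELECT fk_oid, MIN(trt_date) AS dt
    FROM aThing
    GROUP BY fk_oid
    HAVING strftime('%Y', MIN(trt_date)) = '2014'
    ORDER BY fk_oid
""").fetchall()

print(derived)
print(derived == having)
```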
There's probably a more elegant solution, but this seems to satisfy the requirement...
SELECT x.*
FROM my_table x
JOIN
( SELECT fk_oid
, MIN(trt_date) min_date
FROM my_table
GROUP
BY fk_oid
HAVING min_date >= '2014-01-01'
) a
ON a.fk_oid = x.fk_oid
LEFT
JOIN my_table b
ON b.fk_oid = a.fk_oid
AND b.trt_date > '2014-12-31'
WHERE b.oid IS NULL;
Having gained a few years of experience with this, I decided to revisit it. The solution I now use regularly is:
SELECT t1.column1, t1.column2
FROM MyTable AS t1
LEFT OUTER JOIN MyTable AS t2
ON t1.fkoid = t2.fkoid
AND (t1.date > t2.date
OR (t1.date = t2.date AND t1.oid > t2.oid))
WHERE t2.fkoid IS NULL AND t1.date >= '2014-01-01' AND t1.date < '2015-01-01'
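Here is a minimal sketch of that self-join applied to the question's sample data, using Python's built-in sqlite3. The table name timeline and the column names fk_oid and trt_date are taken from the question rather than the generic names in the answer, and the upper bound on the date is included so the result matches the 2014-only requirement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE timeline (
    oid INTEGER PRIMARY KEY,
    fk_oid INTEGER,
    treatment INTEGER,
    trt_date TEXT
);
INSERT INTO timeline VALUES
  (1, 100, 19304, '2011-05-24'),
  (2, 100, 19304, '2011-08-01'),
  (3, 100, 19306, '2014-03-05'),
  (4, 200, 19305, '2012-02-02'),
  (5, 300, 19308, '2014-01-20'),
  (6, 400, 19308, '2014-06-06');
""")

# For each row t1, look for an earlier row t2 of the same person
# (ties on the date are broken by oid). If none exists, t1 is that
# person's first treatment; the date filter then keeps only 2014 starts.
first_in_2014 = conn.execute("""
    SELECT t1.fk_oid, t1.trt_date
    FROM timeline AS t1
    LEFT OUTER JOIN timeline AS t2
      ON t1.fk_oid = t2.fk_oid
     AND (t1.trt_date > t2.trt_date
          OR (t1.trt_date = t2.trt_date AND t1.oid > t2.oid))
    WHERE t2.fk_oid IS NULL
      AND t1.trt_date >= '2014-01-01'
      AND t1.trt_date < '2015-01-01'
    ORDER BY t1.fk_oid
""").fetchall()
print(first_in_2014)
```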

find out count of comma based value in MySql

I have two tables.
Table Emp
id name
1 Ajay
2 Amol
3 Sanjay
4 Vijay
Table Sports
Sport_name Played by
Cricket ^2^,^3^,^4^
Football ^1^,^3^
Vollyball ^4^,^1^
Now I want to write a query which will give me output like
name No_of_sports_played
Ajay 2
Amol 1
Sanjay 2
Vijay 2
So what would the MySQL query for this be?
I agree with the above answers/comments that you are not using a database for what a database is for, but here is how you could calculate your table from your current structure in case you have no control over that:
SELECT Emp.name, IF(Played_by IS NULL,0,COUNT(*)) as Num_Sports
FROM Emp
LEFT JOIN Sports
ON Sports.Played_by RLIKE CONCAT('[[:<:]]',Emp.id,'[[:>:]]')
GROUP BY Emp.name;
UPDATE: added IF(Played_by IS NULL,0,COUNT(*)) instead of COUNT(*). This means that an employee who doesn't play anything gets 0 as their Num_Sports. (I also added in those ^ characters and it still works.)
What it does is joins the Emp table to the Sports table if it can find the Emp.id in the corresponding Played_by column.
For example, if we wanted to see what sports Ajay played (id=1), we could do:
SELECT *
FROM Emp, Sports
WHERE Sports.Played_by LIKE '%1%'
AND Emp.id=1;
The query I gave as my solution is basically the query above, with a GROUP BY Emp.name to perform it for each employee.
The one modification is the use of RLIKE instead of LIKE.
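I use RLIKE '[[:<:]]employeeid[[:>:]]' instead of LIKE '%employeeid%'. The [[:<:]] and [[:>:]] markers just mean "make sure the employeeid you match is a whole word". (From MySQL 8.0.4, which uses the ICU regular expression library, these markers are no longer supported; \\b is the word-boundary marker there.)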
This prevents (e.g.) Emp.id 1 matching the 1 in the Played_by of 3,4,11,2.
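The whole-word matching idea can be sketched outside MySQL with Python's built-in sqlite3 and re modules. SQLite has no RLIKE, so this registers a user-defined REGEXP function and uses Python's \b word boundary in place of [[:<:]] and [[:>:]]; both substitutions are assumptions of this sketch. COUNT(s.sport_name) is used so that an employee with no matching sport would get 0.

```python
import re
import sqlite3

def regexp(pattern, value):
    # SQLite evaluates "X REGEXP Y" by calling regexp(Y, X)
    if value is None:
        return 0
    return 1 if re.search(pattern, value) else 0

conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)
conn.executescript("""
CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sports (sport_name TEXT, played_by TEXT);
INSERT INTO emp VALUES (1, 'Ajay'), (2, 'Amol'), (3, 'Sanjay'), (4, 'Vijay');
INSERT INTO sports VALUES
  ('Cricket',   '^2^,^3^,^4^'),
  ('Football',  '^1^,^3^'),
  ('Vollyball', '^4^,^1^');
""")

# Join each employee to every sport whose played_by list contains
# the employee id as a whole word, then count the matches.
counts = conn.execute(r"""
    SELECT e.name, COUNT(s.sport_name) AS num_sports
    FROM emp e
    LEFT JOIN sports s
      ON s.played_by REGEXP ('\b' || e.id || '\b')
    GROUP BY e.id, e.name
    ORDER BY e.id
""").fetchall()
print(counts)

# A plain substring test would match the 1 inside 11; the word boundary does not.
substring_hit = "1" in "3,4,11,2"
whole_word_hit = re.search(r"\b1\b", "3,4,11,2") is not None
print(substring_hit, whole_word_hit)
```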
You do not want to store your relationships in a column like that. Create this table:
CREATE TABLE player_sports (player_id INTEGER NOT NULL, sport_id INTEGER NOT NULL, PRIMARY KEY(player_id, sport_id));
This assumes you have an id column in your sports table. So now a player will have one record in player_sports for each sport they play.
Your final query will be:
SELECT p.name, COUNT(ps.player_id)
FROM players p, player_sports ps
WHERE ps.player_id = p.id
GROUP BY p.name;
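A minimal sketch of the normalized layout, using Python's built-in sqlite3. The players and sports tables and their ids are assumptions, filled in from the sample data in the question; only player_sports is defined in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sports (id INTEGER PRIMARY KEY, sport_name TEXT);
CREATE TABLE player_sports (
    player_id INTEGER NOT NULL,
    sport_id INTEGER NOT NULL,
    PRIMARY KEY (player_id, sport_id)
);
INSERT INTO players VALUES (1, 'Ajay'), (2, 'Amol'), (3, 'Sanjay'), (4, 'Vijay');
INSERT INTO sports VALUES (1, 'Cricket'), (2, 'Football'), (3, 'Vollyball');
-- One row per (player, sport) pair instead of a delimited list
INSERT INTO player_sports VALUES
  (2, 1), (3, 1), (4, 1),   -- Cricket
  (1, 2), (3, 2),           -- Football
  (4, 3), (1, 3);           -- Vollyball
""")

sport_counts = conn.execute("""
    SELECT p.name, COUNT(ps.player_id)
    FROM players p
    JOIN player_sports ps ON ps.player_id = p.id
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()
print(sport_counts)
```

With an inner join, a player who plays no sport does not appear in the result at all; a LEFT JOIN would list them with a count of 0.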