Conditional delete across multiple tables in MySQL

I have two tables. One of them contains files, the other one actions:
|Files | |Actions |
|---------| |------------|
|FileID | |ActionID |
|Filename | |ActionDate |
|... | |... |
|---------| |------------|
One file can have several actions. Those actions happened at a certain date.
Every now and then I want to delete files together with all of their actions, but only if at least one of a file's actions is older than, say, 1 year.
For example:
File 1 has 2 actions: Both actions happened a week ago. Do not delete
File 2 has 2 actions: Both actions happened 10 years ago. Delete
File 3 has 2 actions: One of them happened 10 years ago, the other one half a year ago. Delete
I would love to do that without having to do it in several steps (like selecting rows in my Perl script first and then iterating over them to delete them).
If this is too easy I can provide further challenge:
There is another table, let's call it 'State'. One state can again have multiple actions, and I also want to delete all the states that are referenced by the actions that are going to be deleted.
Any hints on how to do this highly appreciated!
edit
Oh my, I just realized that deleting from multiple tables at once is highly discouraged, especially when dealing with large amounts of data.
I assume this means there is no (decent) way around doing this within sql, correct?

For files and actions, you first need to find the files that have at least one action older than one year. The query below does that:
select FileID,
       sum(ActionDate < now() - interval 1 year) as need_to_delete
from Actions
group by FileID
having need_to_delete > 0
This gives you the IDs of the files that need to be deleted from the database.
Second, you need a multi-table DELETE joined with the query above, which deletes from both tables in a single statement:
delete f.*, a.*
from Files f
join Actions a
  on (f.FileID = a.FileID)
join (
  select FileID,
         sum(ActionDate < now() - interval 1 year) as need_to_delete
  from Actions
  group by FileID
  having need_to_delete > 0
) fa
  on (f.FileID = fa.FileID)
The states can be deleted by extending the query above in the same way; I am leaving that part to you.
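As a rough, runnable illustration of the same idea, here is a sketch in Python using the standard sqlite3 module. It is not the MySQL multi-table DELETE from the answer: SQLite has no such statement, so the sketch swaps in two DELETE statements inside one transaction. The table layout (including a FileID column on Actions), the helper names make_demo_db and purge_old_files, and the fixed cutoff date are all assumptions made for the example.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database holding the three example files."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Files (FileID INTEGER PRIMARY KEY, Filename TEXT);
        CREATE TABLE Actions (ActionID INTEGER PRIMARY KEY,
                              FileID INTEGER NOT NULL,
                              ActionDate TEXT NOT NULL);
        INSERT INTO Files VALUES (1, 'file1'), (2, 'file2'), (3, 'file3');
        INSERT INTO Actions (FileID, ActionDate) VALUES
            (1, '2023-06-01'), (1, '2023-06-02'),   -- both recent: keep
            (2, '2013-01-01'), (2, '2013-02-01'),   -- both old: delete
            (3, '2013-01-01'), (3, '2023-03-01');   -- one old: delete
    """)
    return conn


def purge_old_files(conn, cutoff):
    """Delete files with at least one action before `cutoff`, plus their actions."""
    with conn:  # single transaction: both statements commit or roll back together
        # Step 1: remove every file that owns at least one old action.
        conn.execute(
            "DELETE FROM Files WHERE FileID IN "
            "(SELECT FileID FROM Actions WHERE ActionDate < ?)",
            (cutoff,),
        )
        # Step 2: remove all actions whose file no longer exists.
        conn.execute(
            "DELETE FROM Actions WHERE FileID NOT IN (SELECT FileID FROM Files)"
        )


conn = make_demo_db()
purge_old_files(conn, "2023-01-01")
print(conn.execute("SELECT FileID FROM Files").fetchall())
```

Note that the second statement also removes any action that was already orphaned before the purge, which may or may not be what you want.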

Related

Access: finding the corresponding value of maximum value

I have a database in which I perform an audit on a set of required documents, for several locations of those documents.
So I have a table named Locations and a table named Documents, which are correlated through a 2 x 2 relationship.
Every document can have multiple versions. In my query, I want to see only the most recent version of each document, so the max(Id).
Now, every version can be 'audited' (checked) multiple times, for example 2 times each year. Each Audit/check is stored in a record, and I want to show only the most recent audit for each document, so Max(ID).
This is my Selection Query:
SELECT [~Locations].Location, [+DocuProperties].Category, [~Documents].[Document name], Max([DocuVersion].Id) AS MaxDocuID, Max([Audit].Id) AS MaxAuditID, [Audit].Conclusion
FROM ([~Documents] INNER JOIN ([~Locations] INNER JOIN ([+DocuLocation] INNER JOIN [+DocuProperties] ON [+DocuLocation].Id = [+DocuProperties].DocuLocation) ON [~Locations].Id = [+DocuLocation].Location) ON [~Locations].Id = [+DocuLocation].DocuName) INNER JOIN (DocuVersion INNER JOIN 2Audit ON [DocuVersion].Id = [Audit].DocuVersion) ON [+DocuProperties].Id = [DocuVersion].DocuLocation
GROUP BY [~Locations].Location, [+Docuproperties].Category, [~Documents].[Document name], [Audit].Conclusion
However: I do not wish to Group on Audit Conclusion, I wish to show the Audit conclusion that corresponds to the Max(Id) of that Audit.
So for every most recent Audit, I want to show the Conclusion. I want to show this conclusion for each Document, grouped by Category and by Location.
I know I need to build a nested subquery of some form, but I just can't get any code to work.
I hope anybody can help.
The basic idea is like this:
Table 1
DocuProperties
Id  Location  Category
1   15        1
2   15        1
3   14        2
(every location can have multiple document properties a.k.a. objects)
Table 2
DocuVersion
Id  DocuProperty  DocumentEndDate
1   1             01-01-2022
2   1             20-07-2023
3   2             31-07-2023
4   3             01-10-2023
etc.
(every DocuProperties record can have multiple versions; I have to check whether they are still valid, and also check some other criteria)
Table 3
Audit
Id  DocuVersion  Conclusion
1   1            Not Valid
2   1            Not Valid
3   2            Valid
4   4            Valid
(every version can be audited multiple times. Every audit can have a different conclusion)
Which I would like to translate into the following:
LASTAudit (a.k.a. the most recent audit of the most recent version of the most recent property)
Location  DocuPropertyId  DocuVersionId  AuditId  Conclusion
15        2               2              2        Not Valid
14        3               4              4        Valid
The IDs were easy to get right, as those were just Max(Id) functions. The problem was to get the Conclusion corresponding to that audit of that version of that object.
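One common way to fetch the column that belongs to a maximum Id is a correlated subquery rather than GROUP BY. The sketch below only illustrates that pattern on the Audit sample data, using Python's sqlite3 module instead of Access. It covers just the last step (latest audit per version), not the full Location/Category query, and the function names are made up for the example. Whether the same shape runs unchanged in Access is not tested here.

```python
import sqlite3


def make_audit_db():
    """Load the Audit sample rows from the question into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Audit (Id INTEGER PRIMARY KEY,
                            DocuVersion INTEGER NOT NULL,
                            Conclusion TEXT NOT NULL);
        INSERT INTO Audit VALUES
            (1, 1, 'Not Valid'),
            (2, 1, 'Not Valid'),
            (3, 2, 'Valid'),
            (4, 4, 'Valid');
    """)
    return conn


def latest_audit_per_version(conn):
    """Return (DocuVersion, AuditId, Conclusion) for the newest audit of each version."""
    return conn.execute("""
        SELECT a.DocuVersion, a.Id AS AuditId, a.Conclusion
        FROM Audit AS a
        WHERE a.Id = (SELECT MAX(a2.Id)          -- newest audit ...
                      FROM Audit AS a2
                      WHERE a2.DocuVersion = a.DocuVersion)  -- ... of the same version
        ORDER BY a.DocuVersion
    """).fetchall()


for row in latest_audit_per_version(make_audit_db()):
    print(row)
```

Because the outer query selects whole Audit rows, Conclusion comes along without having to appear in a GROUP BY.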

Joining and selecting multiple tables and creating new column names

I have very limited experience with MySQL past standard queries, but when it comes to joins and relations between multiple tables I have a bit of an issue.
I've been tasked with creating a job that will pull a few values from a mysql database every 15 minutes but the info it needs to display is pulled from multiple tables.
I have worked with it for a while to figure out the relationships between everything for the phone system and I have discovered how I need to pull everything out but I'm trying to find the right way to create the job to do the joins.
I'm thinking of creating a new table for the info I need, with columns named as:
Extension | Total Talk Time | Total Calls | Outbound Calls | Inbound Calls | Missed Calls
I know that I need to start with the extension ID from my 'user' table and match it with 'extensionID' in my 'callSession'. There may be multiple instances of each extensionID but each instance creates a new 'UniqueCallID'.
The 'UniqueCallID' field then matches to 'UniqueCallID' in my 'CallSum' table. At that point, I just need to be able to say "For each 'uniqueCallID' that is associated with the same 'extensionID', get the sum of all instances in each column or a count of those instances".
Here is an example of what I need it to do:
callSession table
UniqueCallID | extensionID
-------------|------------
A            | 123
B            | 123
C            | 123
callSum table
UniqueCallID | Duration | Answered
-------------|----------|---------
A            | 10       | 1
B            | 5        | 1
C            | 15       | 0
newReport table
Extension | Total Talk Time | Total Calls | Missed Calls
----------|-----------------|-------------|-------------
123       | 30              | 3           | 1
Hopefully that conveys my idea properly.
If I create a table to hold these values, I need to know how I would select, join and insert those things based on that diagram but I'm unable to construct the right query/statement.
You simply JOIN the two tables, and do a group by on the extensionID. Also, add formulas to summarize and gather the info.
SELECT
a.`extensionID` AS `Extension`,
SUM(b.`Duration`) AS `Total Talk Time`,
COUNT(DISTINCT a.`UniqueCallID`) AS `Total Calls`,
SUM(IF(b.`Answered` = 1,0,1)) AS `Missed Calls`
FROM `callSession` a
JOIN `callSum` b
ON a.`UniqueCallID` = b.`UniqueCallID`
GROUP BY a.`extensionID`
ORDER BY a.`extensionID`
You can use a join and group by
select
a.extensionID
, sum(b.Duration) as Total_Talk_Time
, count(b.Answered) as Total_Calls
, count(b.Answered) -sum(b.Answered) as Missed_calls
from callSession as a
inner join callSum as b on a.UniqueCallID = b.UniqueCallID
group by a.extensionID
This should do the trick. What you are being asked to do is to aggregate the number of and duration of calls. Unless explicitly requested, you do not need to create a new table to do this. The right combination of JOINs and AGGREGATEs will get the information you need. This should be pretty straightforward... the only semi-interesting part is calculating the number of missed calls, which is accomplished here using a "CASE" statement as a conditional check on whether each call was answered or not.
Pardon my syntax... My experience is with SQL Server.
SELECT CS.extensionID AS `Extension`, SUM(CA.Duration) AS `Total Talk Time`, COUNT(CS.UniqueCallID) AS `Total Calls`, SUM(CASE CA.Answered WHEN 0 THEN 1 ELSE 0 END) AS `Missed Calls`
FROM callSession CS
INNER JOIN callSum CA ON CA.UniqueCallID = CS.UniqueCallID
GROUP BY CS.extensionID
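To check the join-and-aggregate approach against the sample data from the question, here is a small sketch using Python's sqlite3 module rather than MySQL. The column types and the function names make_calls_db and call_report are assumptions for the example; the query itself follows the same JOIN plus GROUP BY shape as the answers above.

```python
import sqlite3


def make_calls_db():
    """Create callSession and callSum with the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE callSession (UniqueCallID TEXT PRIMARY KEY,
                                  extensionID INTEGER NOT NULL);
        CREATE TABLE callSum (UniqueCallID TEXT PRIMARY KEY,
                              Duration INTEGER NOT NULL,
                              Answered INTEGER NOT NULL);
        INSERT INTO callSession VALUES ('A', 123), ('B', 123), ('C', 123);
        INSERT INTO callSum VALUES ('A', 10, 1), ('B', 5, 1), ('C', 15, 0);
    """)
    return conn


def call_report(conn):
    """Return one (extension, talk time, total calls, missed calls) row per extension."""
    return conn.execute("""
        SELECT s.extensionID                                   AS Extension,
               SUM(c.Duration)                                 AS TotalTalkTime,
               COUNT(*)                                        AS TotalCalls,
               SUM(CASE WHEN c.Answered = 0 THEN 1 ELSE 0 END) AS MissedCalls
        FROM callSession AS s
        JOIN callSum AS c ON c.UniqueCallID = s.UniqueCallID
        GROUP BY s.extensionID
        ORDER BY s.extensionID
    """).fetchall()


print(call_report(make_calls_db()))  # [(123, 30, 3, 1)]
```

If the report really has to live in its own table, the same SELECT can feed an INSERT ... SELECT, but as noted above a plain query (or a view) is often enough.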

MS-Access 2010 DELETE Query LEFT JOIN

There's a lot of these issues floating around the net with many solutions, but I'm really struggling with this one.
I have a table [BaseHrs] which looks a little like this -
p_ID  b_Person  WeekNos  HrsRequired
1     A         2016-39  10
1     A         2016-40  10
1     A         2016-41  10
1     A         2016-42  10
1     B         2016-39  11
1     B         2016-40  11
1     B         2016-41  12
1     B         2016-42  09
The table continues with different p_ID, people & week numbers. There is no Primary Key and no indexing. This table also has no relationship with any other table.
It is populated from a Query connected to another table as well as a form for the [HrsRequired] field.
Scenario -
Project 1 (p_ID=1) has now been brought forward by two weeks and the BaseHrs table no longer needs the rows for [WeekNos] 2016-41 & 2016-42.
I initially use a query to show which weeks the project is now running on (qry_SelectNewDates).
I have started my delete query by first creating a Select query which looks like this -
SELECT BaseHrs.*
FROM BaseHrs
LEFT JOIN qry_SelectNewDates
ON BaseHrs.WeekNos = qry_SelectNewDates.WeekNos
WHERE (((BaseHrs.p_ID)=[Forms]![frm_Projects]![p_ID])
AND ((BaseHrs.WeekNos) Not In ([qry_SelectNewDates].[WeekNos])));
This works as intended.
Converting that into a delete query produces an error though. Delete Query -
DELETE BaseHrs.*, BaseHrs.p_ID, BaseHrs.WeekNos
FROM BaseHrs
LEFT JOIN qry_SelectNewDates
ON BaseHrs.WeekNos = qry_SelectNewDates.WeekNos
WHERE (((BaseHrs.p_ID)=[Forms]![frm_Projects]![p_ID])
AND ((BaseHrs.WeekNos) Not In ([qry_SelectNewDates].[WeekNos])));
Error message -
Could not delete from specified tables.
I realise that there is often an issue when trying to delete records in this way. I've tried using it with just 'DELETE.*' in the first line without luck.
I have also made an attempt at a nested Query, but I just can't figure out how to construct it. Any guidance?
**********EDIT**********
With advice from @SunKnight0 I have added a primary key to my BaseHrs table and got this query -
DELETE *
FROM BaseHrs
WHERE b_pKey IN
(SELECT BaseHrs.b_pKey
FROM BaseHrs
LEFT JOIN qry_SelectNewDates
ON (BaseHrs.WeekNos = qry_SelectNewDates.WeekNos)
WHERE (((BaseHrs.p_ID)=[Forms]![frm_Projects]![p_ID])
AND ((BaseHrs.WeekNos) Not In ([qry_SelectNewDates].[WeekNos]))));
This query appears to work but takes a huge amount of time to run. Is that as good as it gets?
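For comparison, one alternative shape for this kind of delete is a single DELETE with a correlated NOT EXISTS, which avoids both the join and the IN list. The sketch below shows it with Python's sqlite3 module, not Access. The NewDates table is a hypothetical stand-in for qry_SelectNewDates, the p_ID=2 row is invented to show that other projects are untouched, and nothing here is a claim about how Access would perform.

```python
import sqlite3


def make_hours_db():
    """Create BaseHrs with the sample rows plus a stand-in for qry_SelectNewDates."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE BaseHrs (p_ID INTEGER, b_Person TEXT,
                              WeekNos TEXT, HrsRequired INTEGER);
        CREATE TABLE NewDates (WeekNos TEXT);  -- hypothetical: weeks the project now runs
        INSERT INTO BaseHrs VALUES
            (1, 'A', '2016-39', 10), (1, 'A', '2016-40', 10),
            (1, 'A', '2016-41', 10), (1, 'A', '2016-42', 10),
            (1, 'B', '2016-39', 11), (1, 'B', '2016-40', 11),
            (1, 'B', '2016-41', 12), (1, 'B', '2016-42', 9),
            (2, 'A', '2016-41', 8);            -- another project, must survive
        INSERT INTO NewDates VALUES ('2016-39'), ('2016-40');
    """)
    return conn


def trim_base_hours(conn, p_id):
    """Delete the rows of project `p_id` whose week is no longer in NewDates."""
    with conn:  # commit on success, roll back on error
        conn.execute("""
            DELETE FROM BaseHrs
            WHERE p_ID = ?
              AND NOT EXISTS (SELECT 1 FROM NewDates AS n
                              WHERE n.WeekNos = BaseHrs.WeekNos)
        """, (p_id,))


conn = make_hours_db()
trim_base_hours(conn, 1)
print(conn.execute("SELECT p_ID, b_Person, WeekNos FROM BaseHrs").fetchall())
```

An index on the week column of whatever NewDates corresponds to would normally help a lookup like this, though that is a general observation and not something measured on the original database.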

MySQL deleting duplicates

I updated an old site a couple of months ago moving from a Joomla install to a bespoke system. In order to convert the data from the Joomla tables to the new format I wrote various php scripts which stepped through the old records by section, processed them and inserted into the new table. All was fine until I recently discovered I had forgotten to add the die() statement to the top of one of the scripts and somehow a searchbot has been merrily pinging that script over time to add precisely 610 duplicates in one particular section.
So the things I do know about the data is that the row with the lowest ID is the row I want to keep, and the duplication only exists in CATEGORY = 8. To be sure of a duplicate, the row ORIGINAL_ID will match.
Beyond SELECT, INSERT, DELETE, I'm no MySQL expert, so I'm confused as to how to approach this. What would the experts out there suggest?
Edit: Example code
ID  CATEGORY  TITLE  ORIGINAL_ID
1   7         A      1
2   8         A      2
3   8         A      2
4   8         B      3
5   8         C      4
6   8         A      2
In the above example, records 3 & 6 should be stripped, because they are in CATEGORY=8, have duplicate ORIGINAL_ID; but retain the row with the lowest id (row 2)
So, you want to identify records within Category 8, where there is another record with the same Category, Title and Original_id. You also want to check if that other record has a lower ID.
So:
Select *
from MYTABLE T1
where T1.CATEGORY = 8
and EXISTS (
select 1
from MYTABLE T2
where T2.CATEGORY = T1.CATEGORY
and T2.TITLE = T1.TITLE
and T2.ORIGINAL_ID = T1.ORIGINAL_ID
and T2.ID < T1.ID
)
If you run this and it returns only the records you wish to delete, replace the "select *" with a "delete" and re-run.
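Here is a runnable sketch of the keep-the-lowest-ID rule on the sample rows, using Python's sqlite3 module. It swaps the EXISTS test for an equivalent NOT IN over MIN(ID) per (CATEGORY, ORIGINAL_ID) group, and the function names are invented for the example. One caveat for MySQL itself: it generally rejects a DELETE whose subquery reads the table being deleted from, so there the usual workaround is a self-join form of DELETE or wrapping the subquery in a derived table.

```python
import sqlite3


def make_dupes_db():
    """Load the sample rows from the question, including the duplicates 3 and 6."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE MYTABLE (ID INTEGER PRIMARY KEY, CATEGORY INTEGER,
                              TITLE TEXT, ORIGINAL_ID INTEGER);
        INSERT INTO MYTABLE VALUES
            (1, 7, 'A', 1),
            (2, 8, 'A', 2),
            (3, 8, 'A', 2),
            (4, 8, 'B', 3),
            (5, 8, 'C', 4),
            (6, 8, 'A', 2);
    """)
    return conn


def delete_duplicates(conn, category=8):
    """Within `category`, keep only the lowest ID of each ORIGINAL_ID group."""
    with conn:  # commit on success, roll back on error
        conn.execute("""
            DELETE FROM MYTABLE
            WHERE CATEGORY = ?
              AND ID NOT IN (SELECT MIN(ID)        -- the row to keep per group
                             FROM MYTABLE
                             GROUP BY CATEGORY, ORIGINAL_ID)
        """, (category,))


conn = make_dupes_db()
delete_duplicates(conn)
print([row[0] for row in conn.execute("SELECT ID FROM MYTABLE ORDER BY ID")])
```

As with the SELECT-first advice above, it is worth running the inner query on its own and backing up the table before deleting anything.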

How to combine two rows with similar timestamps and return both

I have a MySQL table of bouts set up like this.
|------------------------|
|bouts |
|------------------------|
|boutID |
|recording_athlete |
|boutdate (timestamp) |
|opponent |
|recording_athlete_points|
|------------------------|
Each actual meeting between two people is recorded twice in the table, with a unique boutID and boutdate (reflecting the moment when it was actually entered, but within 5 minutes of the other), and the recording athlete of one is the opponent of the other, and vice versa. The two records are not necessarily consecutive. There are additional meetings for the two participants each day, separated by longer time intervals: we're looking for the two closest in both timestamp and ID number (assuming that these are the two that belong together).
I'm trying to SELECT records that belong together into one row (I realize, and want, that each meeting will then appear twice) such that it will output matched rows something like this:
boutID|recording_athlete|boutdate|opponent|recording_athlete_points|boutID_b|boutdate_b|opponent_points
01|John|2012-05-10 20:33:04|Jane|15|04|2012-05-10 20:36:12|10
04|Jane|2012-05-10 20:36:12|John|10|01|2012-05-10 20:33:04|15
Here is what I have so far, and where I think I need to go, but just can't figure out what to use. Some sort of interval statement? Or do I need a totally different structure?
SELECT
A.`boutID`,
A.`recording_athlete`,
A.`boutDate`,
A.`opponent`,
A.`recording_athlete_points`,
B.`boutID` as `boutID_b`,
B.`boutDate` as `boutdate_b`,
B.`recording_athlete_points` as `opponent_points`
FROM bouts A
INNER JOIN bouts B on(A.`fullName` = B.`opponent` AND ????? )
ORDER by A.`boutDate`
SELECT
...
FROM bouts A
JOIN bouts B
on A.fullName = B.opponent
AND B.boutdate between subdate(A.boutdate, interval 5 minute)
and adddate(A.boutdate, interval 5 minute)
If you create an index on boutdate:
create index bouts_boutdate_index on bouts(boutdate);
this query should perform well.
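A minimal sketch of the time-window self-join, run through Python's sqlite3 module instead of MySQL. It assumes the schema listed in the question, so recording_athlete stands in for the fullName column used in the queries above, and it additionally requires the two rows to name each other as opponents. The second pair of bouts (IDs 7 and 9) and the function names are invented to show that a later rematch is paired separately.

```python
import sqlite3


def make_bouts_db():
    """Create bouts with the sample meeting plus one invented later rematch."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE bouts (boutID INTEGER PRIMARY KEY, recording_athlete TEXT,
                            boutdate TEXT, opponent TEXT,
                            recording_athlete_points INTEGER);
        INSERT INTO bouts VALUES
            (1, 'John', '2012-05-10 20:33:04', 'Jane', 15),
            (4, 'Jane', '2012-05-10 20:36:12', 'John', 10),
            (7, 'John', '2012-05-10 22:00:00', 'Jane', 12),
            (9, 'Jane', '2012-05-10 22:02:30', 'John', 14);
    """)
    return conn


def paired_bouts(conn, window_seconds=300):
    """Join each record to its mirror record entered within `window_seconds`."""
    return conn.execute("""
        SELECT A.boutID, A.recording_athlete, A.boutdate, A.opponent,
               A.recording_athlete_points,
               B.boutID                   AS boutID_b,
               B.boutdate                 AS boutdate_b,
               B.recording_athlete_points AS opponent_points
        FROM bouts AS A
        JOIN bouts AS B
          ON B.recording_athlete = A.opponent
         AND B.opponent = A.recording_athlete
         -- strftime('%s', ...) converts a timestamp to Unix seconds
         AND ABS(CAST(strftime('%s', B.boutdate) AS INTEGER)
               - CAST(strftime('%s', A.boutdate) AS INTEGER)) <= ?
        ORDER BY A.boutdate
    """, (window_seconds,)).fetchall()


for row in paired_bouts(make_bouts_db()):
    print(row)
```

If two meetings between the same pair can fall inside the same window, this join would return extra combinations, so a tie-break on the smallest time difference would be needed on top of it.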