We have a fairly large database, set up via Symfony and Doctrine, which has been running for a while. Some of its entities were created with SoftDeletableTrait at the time. Some of those entities no longer require soft deletion, so we're going to delete the rows that have a 'deletedAt' value and then drop the SoftDeletableTrait.
We need to find which tables and rows currently reference those soft-deleted rows. For example:
TableA
ID name deleted_at
1 Foo NULL
2 Bar 01-01-2020 10:11:12
I want all tables and rows that reference TableA id=2:
TableFoo id=19
TableFoo id=21
TableBar id=7
If it were one table, we could use a subquery or a join to compare it with one other table, but we don't know how many other tables there are.
We could check the entity and see all its references, but TableA isn't the only table getting scrubbed. There are about 10 large tables, so we're looking for a (semi-)automatic method.
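One semi-automatic approach is to read the foreign-key metadata and run one lookup query per referencing column. On MySQL that metadata is exposed in information_schema.KEY_COLUMN_USAGE (TABLE_NAME, COLUMN_NAME, REFERENCED_TABLE_NAME, REFERENCED_COLUMN_NAME). The sketch below shows the same idea in Python against an in-memory SQLite database so that it runs as-is, using SQLite's PRAGMA foreign_key_list in place of information_schema. The schema, the a_id column and the helper find_references are assumptions made for illustration, and the approach only finds references that exist as real foreign key constraints.

```python
import sqlite3

def find_references(conn, target_table, target_id):
    """Return (table, id) pairs for rows whose foreign key points at target_table/target_id."""
    hits = []
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        # Each foreign_key_list row: (id, seq, table, from, to, on_update, on_delete, match)
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            ref_table, from_col = fk[2], fk[3]
            if ref_table != target_table:
                continue
            # Assumes every referencing table has an "id" column (hypothetical schema).
            rows = conn.execute(
                f'SELECT id FROM "{table}" WHERE "{from_col}" = ? ORDER BY id',
                (target_id,))
            hits.extend((table, row[0]) for row in rows)
    return hits

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (id INTEGER PRIMARY KEY, name TEXT, deleted_at TEXT);
    CREATE TABLE TableFoo (id INTEGER PRIMARY KEY, a_id INTEGER REFERENCES TableA(id));
    CREATE TABLE TableBar (id INTEGER PRIMARY KEY, a_id INTEGER REFERENCES TableA(id));
    INSERT INTO TableA VALUES (1, 'Foo', NULL), (2, 'Bar', '2020-01-01 10:11:12');
    INSERT INTO TableFoo VALUES (19, 2), (20, 1), (21, 2);
    INSERT INTO TableBar VALUES (7, 2), (8, 1);
""")
references = find_references(conn, "TableA", 2)
print(references)  # [('TableBar', 7), ('TableFoo', 19), ('TableFoo', 21)]
```

To cover all soft-deleted rows rather than one id, the inner query could instead join against the target table with a deleted_at IS NOT NULL condition.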
Related
Essentially, I have a table called Table1 with columns OrderNum and Book. There should never be duplicate records of any Book for each OrderNum; if there are, they need to be identified and deleted.
For example:
OrderNum 1 should only have Book1 listed once so the query must identify the other 2 Book1 listed for OrderNum 1 and delete them.
OrderNum 4 should only have Book2 listed once so the query must identify the other Book2 listed for OrderNum 4 and delete it.
After the query runs, Table1 should look like this:
I am working with MS Access queries, but I am looking for a solution that could work as a MySQL query as well.
I don't know how to do this gracefully on either MySQL or Access, because your table doesn't have a primary key column, which it rightfully should have. On Access, you could try creating a new table, then populating it using the following query:
INSERT INTO yourNewTable (OrderNum, Book)
SELECT DISTINCT OrderNum, Book
FROM yourTable;
Then, delete yourTable after you are done with the above query.
If you had a primary key/auto increment column in your table, let's say id, then you could use the following delete statement directly:
DELETE
FROM yourTable t1
WHERE EXISTS (SELECT 1 FROM yourTable t2
WHERE t2.OrderNum = t1.OrderNum AND
t2.Book = t1.Book AND
t2.id < t1.id);
This would leave, for each (OrderNum, Book) combination, the single record among duplicates which happens to have the lowest id value.
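Here is a small runnable sketch of that delete, written in Python against an in-memory SQLite database. The sample rows are made up, since the original example table is not shown, and the outer table is referenced by name rather than by alias to keep the statement portable. Note that MySQL may reject a DELETE whose subquery reads the table being deleted from (error 1093), in which case a multi-table DELETE with a self-join is a common alternative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE yourTable (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        OrderNum INTEGER,
        Book TEXT
    );
    -- Hypothetical sample data with duplicates for OrderNum 1 and 4.
    INSERT INTO yourTable (OrderNum, Book) VALUES
        (1, 'Book1'), (1, 'Book1'), (1, 'Book1'),
        (2, 'Book1'),
        (4, 'Book2'), (4, 'Book2');
""")

# Delete every row for which a row with the same pair and a lower id exists.
conn.execute("""
    DELETE FROM yourTable
    WHERE EXISTS (SELECT 1 FROM yourTable t2
                  WHERE t2.OrderNum = yourTable.OrderNum
                    AND t2.Book = yourTable.Book
                    AND t2.id < yourTable.id)
""")
remaining = conn.execute(
    "SELECT id, OrderNum, Book FROM yourTable ORDER BY id").fetchall()
print(remaining)  # [(1, 1, 'Book1'), (4, 2, 'Book1'), (5, 4, 'Book2')]
```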
I have a table as such:
id entity_id first_year last_year sessions_attended age
1 2020 1996 2008 3 34.7
2 2024 1993 2005 2 45.1
3 ... ... ...
id is auto-increment primary key, and entity_id is a foreign key that must be unique for the table.
I have a query that calculates first and last year of attendance, and I want to be able to update this table with fresh data each time it is run, only updating the first and last year columns:
This is my insert/update for "first year":
insert into my_table (entity_id, first_year)
( select contact_id, @sd:= year(start_date)
from
( select contact_id, event_id, start_date from participations
join events on participations.event_id = events.id where events.event_type_id = 7
group by contact_id order by event_id ASC) as starter)
ON DUPLICATE KEY UPDATE first_year_85 = @sd;
I have one similar that does "last year", identical except for the target column and the order by.
The queries alone return the desired values, but I am having issues with the insert/update queries. When I run them, I end up with the same values for both fields (the correct first_year value).
Does anything stand out as the cause for this?
Anecdotal Note: This seems to work on MySQL 5.5.54, but when run on my local MariaDB, it just exhibits the above behavior...
Update:
Not my table design to dictate. This is a CRM that allows custom fields to be defined by end-users, I am populating the data via external queries.
The participations table holds all event registrations for all entity_ids, but the start dates are held in a separate events table, hence the join.
The variable is there because the ON DUPLICATE UPDATE will not accept a reference to the column without it.
Age is actually slightly more involved: It is age by the start date of the next active event of a certain type.
Fields are being "hard" updated because the values in this table are pulled by in-CRM reports and searches; they need to be present and can't be calculated dynamically.
Since you have a 'natural' PK (entity_id), why have the id?
age? Are you going to have to change that column daily, or at least monthly? Not a good design. It would be better to have the constant birth_date in the table, then compute the ages in SELECT.
"calculates first and last year of attendance" -- This implies you have a table that lists all years of attendance (yoa)? If so, MAX(yoa) and MIN(yoa) would probably be a better way to compute things.
One rarely needs @variables in queries.
Munch on my comments; come back for more thoughts after you provide a new query, SHOW CREATE TABLE, EXPLAIN, and some sample data.
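To illustrate the MIN/MAX suggestion, here is a sketch that fills both year columns in a single upsert, with no user variables. It is written in Python against an in-memory SQLite database (3.24 or newer, for ON CONFLICT) so that it runs as-is. The table layouts and sample rows are assumptions based on the question. On MySQL or MariaDB the corresponding clause would be ON DUPLICATE KEY UPDATE first_year = VALUES(first_year), last_year = VALUES(last_year).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER PRIMARY KEY, event_type_id INTEGER, start_date TEXT);
    CREATE TABLE participations (contact_id INTEGER, event_id INTEGER);
    CREATE TABLE my_table (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        entity_id INTEGER UNIQUE,
        first_year INTEGER,
        last_year INTEGER
    );
    INSERT INTO events VALUES
        (1, 7, '1996-06-01'), (2, 7, '2001-06-01'), (3, 7, '2008-06-01'),
        (4, 3, '1990-06-01');
    INSERT INTO participations VALUES
        (2020, 1), (2020, 2), (2020, 3), (2020, 4), (2024, 2);
    -- A stale row that the upsert should overwrite.
    INSERT INTO my_table (entity_id, first_year, last_year) VALUES (2020, 1999, 1999);
""")

# One aggregate per contact; only events of type 7 count.
UPSERT = """
    INSERT INTO my_table (entity_id, first_year, last_year)
    SELECT p.contact_id,
           MIN(CAST(strftime('%Y', e.start_date) AS INTEGER)),
           MAX(CAST(strftime('%Y', e.start_date) AS INTEGER))
    FROM participations p
    JOIN events e ON p.event_id = e.id
    WHERE e.event_type_id = 7
    GROUP BY p.contact_id
    ON CONFLICT(entity_id) DO UPDATE SET
        first_year = excluded.first_year,
        last_year = excluded.last_year
"""
conn.execute(UPSERT)
rows = conn.execute(
    "SELECT entity_id, first_year, last_year FROM my_table ORDER BY entity_id").fetchall()
print(rows)  # [(2020, 1996, 2008), (2024, 2001, 2001)]
```

Because each column is updated from its own aggregate, the two fields cannot end up sharing one leftover variable value.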
I have a database with 2 tables, both tables have around 200,000 records.
Let's call these tables TableA and TableB.
Currently I have a function that triggers a select query. This query grabs all records in TableA that match a condition. Once I have that data, a foreach loop uses the data from TableA to check whether it matches any record in TableB.
The problem is that this takes a while because there are so many records. I know the way I'm doing it works, because it does what it's supposed to, but the script takes a good 3 minutes to finish. Is there a faster, more efficient way to do something like this?
Thank you in advance for the help.
PS: I'm using PHP.
The most efficient way to achieve what you want is to:
1. Create a primary key column for each table (if you do not already have one). Example schema where column "id" is a unique identifier for the table row:
TableA
id firstname lastname
1 Michael Douglas
2 Michael Jackson
TableB
id table_a_id pet
1 1 cat
2 2 ape
3 1 dog
Google, or search here on Stack Overflow, for how to create or add a primary key on a MySQL table. An example of creating TableA with a primary key:
CREATE TABLE `TableA` (
`id` int(11) unsigned AUTO_INCREMENT,
`firstname` varchar(100),
`lastname` varchar(100),
PRIMARY KEY (`id`)
)
2. Create an SQL-query to fetch what you need. For example:
To get all rows with at least one match in BOTH tables:
SELECT TableA.id, TableA.firstname, TableA.lastname, TableB.pet
FROM TableA
INNER JOIN TableB
ON TableA.id = TableB.table_a_id;
To instead get all rows from TableA, and only the matching rows from TableB:
SELECT TableA.id, TableA.firstname, TableA.lastname, TableB.pet
FROM TableA
LEFT JOIN TableB
ON TableA.id=TableB.table_a_id;
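As a rough illustration of letting the database do the matching in one query instead of a per-row loop, here is a sketch in Python against an in-memory SQLite database, using the example data above. The third TableA row and the index name are assumptions added for the demonstration. The same two queries could be issued once from PHP.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
    CREATE TABLE TableB (id INTEGER PRIMARY KEY, table_a_id INTEGER, pet TEXT);
    -- An index on the join column lets the engine look up matches instead of scanning.
    CREATE INDEX idx_tableb_table_a_id ON TableB (table_a_id);
    INSERT INTO TableA VALUES
        (1, 'Michael', 'Douglas'), (2, 'Michael', 'Jackson'), (3, 'Michael', 'Caine');
    INSERT INTO TableB VALUES (1, 1, 'cat'), (2, 2, 'ape'), (3, 1, 'dog');
""")

# Only rows that have a match in both tables.
inner_rows = conn.execute("""
    SELECT TableA.id, TableA.lastname, TableB.pet
    FROM TableA
    INNER JOIN TableB ON TableA.id = TableB.table_a_id
    ORDER BY TableA.id, TableB.id
""").fetchall()

# Every TableA row; pet is None where TableB has no match.
left_rows = conn.execute("""
    SELECT TableA.id, TableA.lastname, TableB.pet
    FROM TableA
    LEFT JOIN TableB ON TableA.id = TableB.table_a_id
    ORDER BY TableA.id, TableB.id
""").fetchall()

print(inner_rows)
print(left_rows)
```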
The answer to your question ultimately depends on what you mean by "if it matches."
Let's assume, for a moment, that you have primary keys on each of these tables, TableA and TableB, and that you're NOT matching on those, but that you have one or more other columns, the actual data that you're storing in each row, which you are considering for your matching. Let's call those ColA and ColB.
In that case you could use:
SELECT TableA.id, TableB.id, TableA.ColA, TableB.ColB
FROM TableA
LEFT JOIN TableB
ON (TableA.ColA = TableB.ColA)
AND (TableA.ColB = TableB.ColB);
... notice that we're using a complex expression on which to JOIN. You'd want to add an AND (TableA.XXX = TableB.XXX) for each column that you want to consider significant in your matching.
Of course I'm assuming that these tables don't share a common surrogate key (otherwise MicKri's JOIN would be simpler ... or a "NATURAL JOIN" would be even simpler still).
What you're doing, conceptually, is defining a pair of (mathematical) sets and finding the intersection between them. The complication of doing this in SQL is that real world tables often have these extra columns (surrogate primary keys, and foreign keys) which aren't attributes of the underlying entities ... but which serve to map relationships among them.
In my example I'm just showing a way to formulate a JOIN query that finds the intersection based only on the attributes that are significant for your purposes.
(By the way, the parentheses in my example are there for human legibility. They should not be required by your SQL engine ... though they don't hurt, either).
Here's one of a number of visual explanations of SQL JOINs that's handy for learning this sort of thing. An INNER JOIN is an intersection. The ON and WHERE clauses define the subsets of the data (columns and rows, respectively) which are to be related.
Let's say I have a table with 10 records labeled 1 through 10, and each record contains two fields. I want to create a query that shows me Field 1 of record N with Field 2 of record N+1. For example, the query would show Field 1 of record 3 with Field 2 of record 4. Is this possible?
It is possible and not particularly complex.
Given a table tblFoo with FooId as Primary Key and the two additional fields FooText and BarText, the SQL to get the desired results would look like this:
SELECT f1.FooText, f2.BarText
FROM tblFoo AS f1
LEFT JOIN tblFoo AS f2
ON f1.FooID +1 = f2.FooID
While it is simple to implement, performance may not be ideal for large tables, because the expression FooID + 1 can prevent the query engine from using the primary key as an index while retrieving the results.
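A quick runnable check of that self-join, sketched in Python against an in-memory SQLite database. The three sample rows are made up for the demonstration. Note that the approach assumes consecutive ids: where record N+1 is missing, the LEFT JOIN yields NULL (None in Python) for BarText.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblFoo (FooID INTEGER PRIMARY KEY, FooText TEXT, BarText TEXT);
    INSERT INTO tblFoo VALUES (1, 'foo1', 'bar1'), (2, 'foo2', 'bar2'), (3, 'foo3', 'bar3');
""")

# Pair FooText of record N with BarText of record N+1.
pairs = conn.execute("""
    SELECT f1.FooID, f1.FooText, f2.BarText
    FROM tblFoo AS f1
    LEFT JOIN tblFoo AS f2 ON f1.FooID + 1 = f2.FooID
    ORDER BY f1.FooID
""").fetchall()
print(pairs)  # [(1, 'foo1', 'bar2'), (2, 'foo2', 'bar3'), (3, 'foo3', None)]
```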
I have 2 tables like this
words(word_id, value);
word_map(sno(auto_inc), wm_id, service_id, word_id, base_id, root_id);
in which sno is auto incremented just for indexing.
wm_id is the actual id, which is unique within each service
(service_id and wm_id together form a unique key).
base_id and root_id reference wm_id, i.e., for each new word being inserted I store the wm_id values of its base and root words.
My requirement now is to delete the records from this table where a word's base_id or root_id does not exist in the table.
For example,
Suppose a new word has wm_id = 4, base_id = 2 and root_id = 1. Then there must be two other records with wm_id 2 and 1. If not, the record is an orphan and the record with wm_id = 4 must be deleted. Records with other wm_ids that have 4 as base_id or root_id must then also be deleted, since they become orphans once 4 is deleted, and so on.
Can anybody suggest a solution for this problem?
What I tried:
I tried to write a procedure using a WHILE loop that runs a query like:
delete from words_map where base_id not in (select wm_id from words_map) or root_id not in (select wm_id from words_map)
But deleting from or updating a table with this kind of nested query on the same table is not possible, so I am searching for an alternative.
What I am unsure about:
I thought of reading these wm_ids into an array and then deleting them one by one, but I don't think we have arrays in stored procedures.
Is a cursor an alternative for this situation?
Or is there a better solution for this problem?
EDIT 1: Please go through this http://sqlfiddle.com/#!2/a4b6f/15 for clear experimental data
Any early help would be appreciated.
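One way to approach the cascading part is to repeat the orphan delete until a pass removes nothing. The sketch below does this in Python against an in-memory SQLite database, which accepts the nested query on the same table. It is a simplification under stated assumptions: a single service (service_id is omitted), made-up sample rows, and a root word that references itself. On MySQL, the usual workaround for the "same table" restriction is to wrap the subquery in a derived table, e.g. NOT IN (SELECT wm_id FROM (SELECT wm_id FROM word_map) AS ids), though whether that is accepted can depend on the server version and optimizer settings.

```python
import sqlite3

def delete_orphans(conn):
    """Repeat the orphan delete until a pass removes nothing; return total rows deleted."""
    total = 0
    while True:
        cur = conn.execute("""
            DELETE FROM word_map
            WHERE base_id NOT IN (SELECT wm_id FROM word_map)
               OR root_id NOT IN (SELECT wm_id FROM word_map)
        """)
        if cur.rowcount == 0:
            return total
        total += cur.rowcount

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE word_map (
        sno INTEGER PRIMARY KEY AUTOINCREMENT,
        wm_id INTEGER NOT NULL UNIQUE,
        base_id INTEGER,
        root_id INTEGER
    );
    -- Hypothetical data: 5 points at a missing base (3); 6 and 7 depend on 5.
    INSERT INTO word_map (wm_id, base_id, root_id) VALUES
        (1, 1, 1), (2, 1, 1), (4, 2, 1),
        (5, 3, 1), (6, 5, 1), (7, 6, 5);
""")
deleted = delete_orphans(conn)
remaining = [r[0] for r in conn.execute("SELECT wm_id FROM word_map ORDER BY wm_id")]
print(deleted, remaining)  # 3 [1, 2, 4]
```

The loop terminates because every pass either deletes at least one row or stops, and the table is finite.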