We ran into a problem with our primary key. It was set to a meaningful value for ease of data entry, since all data was originally added directly. However, the meaningful value is no longer present in every entry, so we are moving to an auto-generated, non-meaningful key, and I have to update the database to reflect this.
So my products table has the columns serial (the original key) and Id (the new PK). My parts table has the 2 columns FK_serial (the old FK) and FK_product (the new FK, currently set to 0 for all entries).
Is there an UPDATE statement that will walk through the parts table and set FK_product to the value of Id in the products table where serial = FK_serial?
UPDATE parts
JOIN products
ON parts.FK_serial = products.serial
SET parts.FK_product = products.Id;
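For anyone who wants to try the backfill on a scratch database first, here is a minimal sketch using Python's built-in sqlite3 module. SQLite does not accept MySQL's UPDATE ... JOIN form, so the sketch swaps in a correlated subquery that should have the same effect; the sample rows and the name column are made up for illustration.

```python
import sqlite3


def backfill_fk(conn):
    """Copy products.Id into parts.FK_product wherever the old serial keys match."""
    conn.execute(
        """
        UPDATE parts
        SET FK_product = (SELECT Id FROM products
                          WHERE products.serial = parts.FK_serial)
        WHERE EXISTS (SELECT 1 FROM products
                      WHERE products.serial = parts.FK_serial)
        """
    )
    conn.commit()


# Hypothetical sample data: two products and three parts, one without a match.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (Id INTEGER PRIMARY KEY, serial TEXT UNIQUE);
    CREATE TABLE parts (name TEXT, FK_serial TEXT, FK_product INTEGER DEFAULT 0);
    INSERT INTO products (Id, serial) VALUES (1, 'A-100'), (2, 'B-200');
    INSERT INTO parts (name, FK_serial) VALUES
        ('bolt', 'A-100'), ('nut', 'B-200'), ('orphan', 'Z-999');
""")
backfill_fk(conn)
rows = conn.execute("SELECT name, FK_product FROM parts ORDER BY name").fetchall()
print(rows)
```

The WHERE EXISTS guard leaves parts with no matching serial at their old value of 0, which makes them easy to find afterwards.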
I have been having some frustration attempting to add data values to this table students. I have all the other data values and have dropped and created the column student_id. However, when trying to add the data with this query:
insert into students(student_id) values('1'),('2'),('3'),('4'),('5');
The data does not insert correctly: it creates new rows below the first 5, which already contain data.
It must be because of my NOT NULL values, but I have to keep the NOT NULL constraint.
Is there a query command that allows me to change data in existing rows that already hold values? I have been unsuccessful in finding this so far.
Here are some images to explain the problem further.
The query I have made to add my values to the table:
The data was inserted, but as new rows underneath the ones I need to map with a foreign key. I cannot use the column, because the top 5 rows still hold my NOT NULL default, which is required to let me create the foreign key.
It looks like your records were initially created without the student_id field. You want to UPDATE the existing records, but you are actually INSERTing new ones.
Update your students with statements of the form "UPDATE students SET student_id = X WHERE condition = Y".
Also, it looks like student_id is your primary key, in which case you should make it an AUTO_INCREMENT column.
Regards
INSERT is the wrong command since you want to update existing rows. The problem here lies within the fact that the order of the rows is nondeterministic and I think you cannot update them in one statement. One solution would be as follows:
UPDATE students SET student_id = 1 WHERE first_name = 'Berry';
UPDATE students SET student_id = 2 WHERE first_name = 'Darren';
I hope you really do have only 5 rows to update :-)
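If there are more than a handful of rows, the same per-row UPDATE can be driven from a mapping. This is a rough sketch using Python's sqlite3 module instead of MySQL; the table layout (a first_name column plus a NOT NULL student_id defaulting to 0) is an assumption based on the question, and it only works if first_name identifies each row uniquely.

```python
import sqlite3


def assign_student_ids(conn, ids_by_first_name):
    """Run one UPDATE per existing row instead of inserting new rows."""
    conn.executemany(
        "UPDATE students SET student_id = ? WHERE first_name = ?",
        [(student_id, name) for name, student_id in ids_by_first_name.items()],
    )
    conn.commit()


# Hypothetical table: rows already exist and student_id holds its default.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (first_name TEXT, student_id INTEGER NOT NULL DEFAULT 0);
    INSERT INTO students (first_name) VALUES ('Berry'), ('Darren');
""")
assign_student_ids(conn, {"Berry": 1, "Darren": 2})
students = conn.execute(
    "SELECT first_name, student_id FROM students ORDER BY student_id"
).fetchall()
row_count = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
print(students, row_count)
```

The row count stays the same after the call, which is the difference from the INSERT in the question.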
I've created multiple indexed tables that I want to tie into a new normalized version of an old table. I get everything indexed and the relations set and I get a "Duplicate entry '11' for key 'Primary' " error message.
Here's the code I'm using to populate the new table.
insert into dvdNormal(dvdId, dvdTitle, year, publicRating, dvdStudioId,
dvdStatusId, dvdGenreId)
(
select dvdId, dvdTitle, year, publicRating, studioId, statusId, genreId
from dvd d
join dvdStudio on d.studio = dvdStudio.studioName
join dvdStatus on d.status = dvdStatus.dvdStatus
join dvdGenre on d.genre = dvdGenre.genre);
I'm going to assume you were asking a question, and not just giving a status report.
The behavior you observe is (most likely) due to the insert statement attempting to insert a row that violates a UNIQUE (or PRIMARY KEY) constraint defined on the dvdId column in the target table (the table the statement is inserting rows into).
And either 1) the dvdId column is not unique in the table it's being retrieved from, or 2) there is more than one "matching" row in one of the other three tables.
For example, if dvdId is a column in dvd, and it's defined as UNIQUE, then case 1) doesn't apply.
But if that row from dvd has more than one "matching" row from one (or more) of the other three tables, then we'd expect the SELECT to generate "duplicate" values for dvdId.
For example, if the genre column is not unique in dvdGenre table, or studioName column is not unique in dvdStudio, we'd expect the query to return multiple copies of the row from dvd. The redundant data (duplicated values) is expected when we "denormalize" data.
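One way to check which lookup table is producing the extra rows is to look for values that occur more than once in the join column. The helper below is only a sketch: it uses Python's sqlite3 module rather than MySQL, the function name duplicated_values is made up, and the sample genre rows are invented.

```python
import sqlite3


def duplicated_values(conn, table, column):
    """Return (value, count) pairs for values that occur more than once.

    Identifiers cannot be bound as parameters, so only pass trusted names.
    """
    sql = (
        f'SELECT "{column}", COUNT(*) FROM "{table}" '
        f'GROUP BY "{column}" HAVING COUNT(*) > 1 ORDER BY "{column}"'
    )
    return conn.execute(sql).fetchall()


# Hypothetical lookup table in which 'Drama' was entered twice.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dvdGenre (genreId INTEGER PRIMARY KEY, genre TEXT);
    INSERT INTO dvdGenre (genre) VALUES ('Drama'), ('Comedy'), ('Drama');
""")
dupes = duplicated_values(conn, "dvdGenre", "genre")
print(dupes)
```

Any value reported here will multiply the matching dvd rows in the join, and with them the dvdId values.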
If we want to get the table loaded from the query, there are a couple of options.
If we want to store every row returned by the query, we would remove the UNIQUE constraint from the dvdId column. (There may also be other UNIQUE constraints that need to be removed from the target table.)
If we only want to store one copy of the row from dvd, along with values from one matching row from each of the other tables, we could leave the UNIQUE constraint, and use an INSERT IGNORE statement to avoid throwing a "duplicate key error". Any rows where that error would have been thrown will be discarded, and won't be inserted into the target table.
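The second option can be tried out on a scratch database. The sketch below uses Python's sqlite3 module, where the counterpart of MySQL's INSERT IGNORE is spelled INSERT OR IGNORE; the tables are cut down to a few columns and the rows are invented.

```python
import sqlite3


def load_ignoring_duplicates(conn):
    """Load dvdNormal from the join, silently skipping repeated dvdId values."""
    conn.execute("""
        INSERT OR IGNORE INTO dvdNormal (dvdId, dvdTitle, dvdGenreId)
        SELECT d.dvdId, d.dvdTitle, g.genreId
        FROM dvd AS d
        JOIN dvdGenre AS g ON d.genre = g.genre
        ORDER BY g.genreId
    """)
    conn.commit()


# Hypothetical data: one dvd row matches two genre rows, so the join yields dvdId 11 twice.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dvd (dvdId INTEGER PRIMARY KEY, dvdTitle TEXT, genre TEXT);
    CREATE TABLE dvdGenre (genreId INTEGER PRIMARY KEY, genre TEXT);
    CREATE TABLE dvdNormal (dvdId INTEGER PRIMARY KEY, dvdTitle TEXT, dvdGenreId INTEGER);
    INSERT INTO dvd VALUES (11, 'Example Title', 'Drama');
    INSERT INTO dvdGenre (genreId, genre) VALUES (1, 'Drama'), (2, 'Drama');
""")
load_ignoring_duplicates(conn)
loaded = conn.execute("SELECT dvdId, dvdTitle, dvdGenreId FROM dvdNormal").fetchall()
print(loaded)
```

Only one row for dvdId 11 ends up in the target table. Which of the matching genre rows it carries depends on the order the rows arrive in, so cleaning up the lookup table first is usually the safer fix.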
Because the column references aren't qualified, we can't actually tell which table the dvdId column is being returned from. We can't tell which table any of the columns are returned from. We can "guess" that genreId is being returned from the dvdGenre table, but to confirm that, we'd need to investigate the schema definition. It's not a problem for MySQL; it can look up the table definitions a whole lot faster than we can.
We could help the future reader of that SQL statement by qualifying the column references with the table name or a table alias.
I'm trying to remove doublettes (sometimes triplettes, unfortunately!) from a MySQL table. My issue is that the only unique data available is the primary key, so in order to identify doublettes, you have to take all the columns into account.
I've managed to identify all records that have doublettes and copied them along with their doublettes (including their primary keys) to the table temp. The source table is called translation and it has an integer primary key with the name TranslationID. How do I move on from here? Thanks!
edit Available columns are:
TranslationID
LanguageID
Translation
Etymology
Type
Source
Comments
WordID
Latest
DateCreated
AuthorID
Gender
Phonetic
NamespaceID
Index
EnforcedOwner
The duplicity issue resides with the rows with the Latest column assigned 1.
edit #2 Thank you, everyone for your time! I've solved the problem by using WouterH's answer, resulting in the following query:
DELETE from translation USING translation, translation as translationTemp
WHERE translation.Latest = 1
AND (NOT translation.TranslationID = translationTemp.TranslationID)
AND (translation.LanguageID = translationTemp.LanguageID)
AND (translation.Translation = translationTemp.Translation)
AND (translation.Etymology = translationTemp.Etymology)
AND (translation.Type = translationTemp.Type)
AND (translation.Source = translationTemp.Source)
AND (translation.Comments = translationTemp.Comments)
AND (translation.WordID = translationTemp.WordID)
AND (translation.Latest = translationTemp.Latest)
AND (translation.AuthorID = translationTemp.AuthorID)
AND (translation.NamespaceID = translationTemp.NamespaceID)
You can remove duplicates without a temporary table or subquery. Delete every row that has the same data as another row but a higher TranslationID, so that the copy with the lowest TranslationID survives (comparing with NOT ... = ... alone would match both rows of a pair and delete all copies):
DELETE from translation USING translation, translation as translationTemp
WHERE (translation.TranslationID > translationTemp.TranslationID)
AND (translation.LanguageID = translationTemp.LanguageID)
AND (translation.Translation = translationTemp.Translation)
AND (translation.Etymology = translationTemp.Etymology)
-- AND (compare the other fields here in the same way)
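Here is a small sketch of the self-join idea that can be run on throwaway data. It uses Python's sqlite3 module, which lacks MySQL's DELETE ... USING syntax, so the self-join is written as a correlated EXISTS instead. It keeps the row with the lowest TranslationID in each group by comparing with < rather than "not equal", because a not-equal test is symmetric and would match every copy. The column list is shortened and the sample rows are made up.

```python
import sqlite3

# Shortened column list for illustration; the real table has more columns.
COMPARED = ("LanguageID", "Translation", "Etymology")


def delete_duplicates(conn):
    """Delete rows that repeat an earlier row; return how many were removed."""
    # "IS" is SQLite's NULL-safe equality, so two NULL Etymology values match.
    same = " AND ".join(f"other.{c} IS translation.{c}" for c in COMPARED)
    cur = conn.execute(f"""
        DELETE FROM translation
        WHERE EXISTS (
            SELECT 1 FROM translation AS other
            WHERE other.TranslationID < translation.TranslationID
              AND {same}
        )
    """)
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE translation (
        TranslationID INTEGER PRIMARY KEY,
        LanguageID INTEGER, Translation TEXT, Etymology TEXT
    );
    INSERT INTO translation VALUES
        (1, 1, 'hello', NULL), (2, 1, 'hello', NULL),
        (3, 1, 'hello', NULL), (4, 2, 'hola', 'lat.');
""")
removed = delete_duplicates(conn)
kept = [r[0] for r in conn.execute(
    "SELECT TranslationID FROM translation ORDER BY TranslationID")]
print(removed, kept)
```

In MySQL, a plain = never matches NULL against NULL, so nullable columns such as Etymology may need the NULL-safe operator <=> to be treated as equal.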
Create a SELECT statement with your current SELECT as a sub-select, so that it returns a column of the IDs that should be removed. Then use that SELECT in a DELETE FROM statement.
Example (pseudo code):
SELECT1 = SELECT ... AS temp; # the table you have right now
SELECT2 = SELECT TranslationID FROM (SELECT1)
Final query will look like this:
DELETE FROM table_name WHERE TranslationID IN (SELECT2);
You just need to insert the SELECT with sub-select in the final query.
To stop duplicates in future, you can change your engine to InnoDB like this:
ALTER TABLE table_name ENGINE=InnoDB;
Then add a Unique constraint to the TranslationID field.
If the doublettes/triplettes are identical except for the primary key, then you can select all records from temp that are identical to another record except for having a larger primary key than that other record. This gives you every record in temp apart from the one with the minimum key in each group of doublettes/triplettes. You can then delete these records from translation.
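A possible shape for that query, sketched with Python's sqlite3 module rather than MySQL: the temp table is joined to itself to find every record that has an identical record with a smaller key, and those IDs are deleted from translation. Only two data columns are compared here and the rows are invented.

```python
import sqlite3


def delete_copies_listed_in_temp(conn):
    """Delete from translation every temp record that has an identical earlier record."""
    cur = conn.execute("""
        DELETE FROM translation
        WHERE TranslationID IN (
            SELECT later.TranslationID
            FROM "temp" AS later
            JOIN "temp" AS earlier
              ON earlier.TranslationID < later.TranslationID
             AND earlier.LanguageID = later.LanguageID
             AND earlier.Translation = later.Translation
        )
    """)
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE translation (
        TranslationID INTEGER PRIMARY KEY, LanguageID INTEGER, Translation TEXT
    );
    INSERT INTO translation VALUES
        (1, 1, 'hello'), (2, 1, 'hello'), (3, 1, 'hello'), (4, 2, 'hola');
    -- Hypothetical stand-in for the asker's temp table of duplicated records.
    CREATE TABLE "temp" AS SELECT * FROM translation WHERE Translation = 'hello';
""")
deleted = delete_copies_listed_in_temp(conn)
remaining = [r[0] for r in conn.execute(
    "SELECT TranslationID FROM translation ORDER BY TranslationID")]
print(deleted, remaining)
```

Because the subquery reads from temp and not from translation, the statement avoids deleting from the same table it is selecting from.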
Instead of identifying the lines that aren't unique, I would try to copy the valid data to a new table, and then remove the old one and replace it by this new, cleaned table.
I can see two ways:
Using the DISTINCT keyword in your SQL query;
Using a GROUP BY clause on all columns.
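The GROUP BY variant could look roughly like this. It is a sketch against SQLite through Python's sqlite3 module, with a shortened column list, invented rows, and a hypothetical translation_clean table name. Grouping on every column except the key and taking MIN(TranslationID) keeps one row per group.

```python
import sqlite3


def rebuild_without_duplicates(conn):
    """Copy one row per group into a new table, then swap it in for the old one."""
    conn.executescript("""
        CREATE TABLE translation_clean AS
            SELECT MIN(TranslationID) AS TranslationID, LanguageID, Translation
            FROM translation
            GROUP BY LanguageID, Translation;
        DROP TABLE translation;
        ALTER TABLE translation_clean RENAME TO translation;
    """)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE translation (
        TranslationID INTEGER PRIMARY KEY, LanguageID INTEGER, Translation TEXT
    );
    INSERT INTO translation VALUES (1, 1, 'hello'), (2, 1, 'hello'), (3, 2, 'hola');
""")
rebuild_without_duplicates(conn)
cleaned = conn.execute(
    "SELECT TranslationID, LanguageID, Translation FROM translation "
    "ORDER BY TranslationID"
).fetchall()
print(cleaned)
```

A table created from a SELECT does not inherit the primary key or indexes of the original, so those would have to be re-created on the new table afterwards.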
I have a table in Access which I'd like to substitute with a query which gathers data from the table and other new tables. The table is used by many queries which look to a primary key (autonumber) in the table, so the new query must have a primary key which is a unique combination of the primary keys of the tables used by the query. What can I do?
--EDIT--
Solution found: Since I want to "merge" tables with a query, and since the pk is an autonumber, I can define the new pk (of the query) by "expanding the numbering": I multiply both pkeys by 2 (because I have two tables) and add or subtract 1 to one of the two (or 1 for the first table and 2 for the second, and so on).
For example:
PK1 = 1,2,3,4,5,6
PK2 = 1,3,4,5,8,9,10 (some records may have been deleted, so the number is skipped)
new PK = (2*PK1, (2*PK2 + 1)) = (2,4,6,8,10,12),(3,7,9,11,17,19,21)
As you can see, they will never overlap: every key derived from PK1 is even and every key derived from PK2 is odd (because of the "+1"), so no value from one table can collide with a value from the other.
Hope it may help somebody
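The numbering scheme generalizes to any number of tables: multiply by the table count and add the table's index. The snippet below is a plain Python illustration of the arithmetic (the function names merged_key and split_key are made up); in Access the same expression would go into the query itself.

```python
def merged_key(table_index, pk, table_count=2):
    """Map (table, autonumber) to a single key that is unique across tables."""
    return table_count * pk + table_index


def split_key(key, table_count=2):
    """Recover (table_index, pk) from a merged key."""
    pk, table_index = divmod(key, table_count)
    return table_index, pk


pk1 = [1, 2, 3, 4, 5, 6]
pk2 = [1, 3, 4, 5, 8, 9, 10]
keys1 = [merged_key(0, pk) for pk in pk1]  # even numbers: 2 * PK1
keys2 = [merged_key(1, pk) for pk in pk2]  # odd numbers: 2 * PK2 + 1
print(keys1, keys2)
print(set(keys1) & set(keys2))  # empty set: the two ranges never collide
```

Being able to split a merged key back into its table and original autonumber is handy when a row found through the query has to be traced to its source table.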
Use a composite key (multiple-field primary key).
I'm developing a helpdesk-like system, and I want to employ foreign keys, to make sure the DB structure is decent, but I don't know if I should use them at all, and how to employ them properly.
Are there any good tutorials on how (and when) to use Foreign keys ?
edit The part where I'm the most confused at is the ON DELETE .. ON UPDATE .. part, let's say I have the following tables
table 'users'
id int PK auto_increment
department_id int FK (departments.id) NULL
name varchar
table 'departments'
id int PK auto_increment
name
users.department_id is a foreign key referencing the primary key of departments. How do ON UPDATE and ON DELETE work here when I want to delete the department or the user?
ON DELETE and ON UPDATE define how changes you make in the key (parent) table propagate to the dependent (child) table. With ON UPDATE CASCADE, changed key values are also changed in the dependent table to maintain the relation; with ON DELETE CASCADE, dependent records are deleted to maintain integrity.
Example: Say you have
Users: Name = Bob, Department = 1
Users: Name = Jim, Department = 1
Users: Name = Roy, Department = 2
and
Departments: id = 1, Name = Sales
Departments: id = 2, Name = Bales
Now if you change the departments table so that the first record reads id = 5, Name = Sales, then with ON UPDATE CASCADE the first two user records are also changed to read Department = 5. Without it, the default is to reject the change, so you wouldn't be allowed to make it!
Similarly, if you deleted Department 2, then with ON DELETE CASCADE you would also delete the record for Roy! And without it you wouldn't be allowed to remove the department without first removing Roy (or moving him to another department).
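The Bob/Jim/Roy example can be replayed on a scratch database. This sketch uses Python's sqlite3 module instead of MySQL (SQLite only enforces foreign keys after PRAGMA foreign_keys = ON); the build helper is made up for the demonstration, and the table layout follows the question.

```python
import sqlite3


def build(fk_actions):
    """Create the two tables with the given ON UPDATE / ON DELETE clauses."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
    conn.executescript(f"""
        CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE users (
            id INTEGER PRIMARY KEY,
            name TEXT,
            department_id INTEGER REFERENCES departments(id) {fk_actions}
        );
        INSERT INTO departments VALUES (1, 'Sales'), (2, 'Bales');
        INSERT INTO users (name, department_id) VALUES
            ('Bob', 1), ('Jim', 1), ('Roy', 2);
    """)
    return conn


# With CASCADE, both changes propagate to users.
cascading = build("ON UPDATE CASCADE ON DELETE CASCADE")
cascading.execute("UPDATE departments SET id = 5 WHERE id = 1")
cascading.execute("DELETE FROM departments WHERE id = 2")
users = cascading.execute(
    "SELECT name, department_id FROM users ORDER BY name").fetchall()
print(users)

# Without any action clause, the delete is rejected while Roy still refers to it.
plain = build("")
try:
    plain.execute("DELETE FROM departments WHERE id = 2")
    blocked = False
except sqlite3.IntegrityError:
    blocked = True
print(blocked)
```

Since department_id is nullable in the question's schema, ON DELETE SET NULL is a third option: the user survives and simply loses the department reference.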
You will need foreign keys if you are splitting your database into tables and you are working with a DBMS (e.g. MySQL, Oracle and others). I assume from your tags you are using MySQL.
If you don't use foreign keys, your database will become hard to manage and maintain. Normalisation, which relies on foreign keys, helps ensure data consistency.
See here for foreign keys, and here for why foreign keys are important in a relational database.
That said, denormalization is often used when efficiency is the main factor in the design. If that is your case, you may want to depart from what I have told you.
Hope this helps.