Inserting rows into a MySQL table containing a predefined primary key

I have been looking for a solution to this problem for some time and have not found anything satisfactory. I know similar questions have been answered many times, but the answers are usually workarounds rather than standard solutions.
The problem in my particular case is:
I have one table that contains a predefined primary key that cannot be an auto-increment column. Its values are predefined, and several other tables use it as a foreign key.
NID - my Primary Key
PID - the key from external source
Serial
bla1
bla2
NID is already in the table ids (the target table), but not in the source table
PID is already in the source file/table, but not in the target table
The other columns are in both tables
Each NID-PID pair would be a unique match, since the pairs are used further after matching.
Now I need to insert values into this table on a weekly basis. They will be sent to me in CSV/Excel files with hundreds of records, so a simple method would be best, especially because a simple import process is also easy to validate.
Since there is no auto increment PK, I get an error:
1062 - Duplicate entry '' for key 'NID'
I was thinking about creating a unique index on multiple fields, like:
CREATE UNIQUE INDEX unique_index ON ids (NID,PID);
But it did not work very well either:
1062 - Duplicate entry '107521' for key 'unique_index'
I also tried to create a separate table with the data to be imported, but I get the same error.
The question is: what is the best way to insert records into a table that has a predefined PK, and to keep doing so on a regular basis without altering existing data? What should I do to achieve this?
I would really appreciate any help since I'm stuck.
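One possible reading of the setup above, offered as a sketch rather than a confirmed solution: the weekly file does not bring new NID values, it brings PID values that have to be attached to rows already present in ids. Under that reading, the file could be loaded into a staging table and matched with an UPDATE ... JOIN instead of an INSERT, so no primary key value ever has to be invented. The staging table name ids_import, the column types, the file path, the header row, and the choice of Serial as the matching column are all assumptions and are not taken from the question.

```sql
-- Staging table for the weekly file (name and column types are assumptions).
CREATE TABLE ids_import (
  PID    VARCHAR(32) NOT NULL,
  Serial VARCHAR(64) NOT NULL,
  bla1   VARCHAR(255),
  bla2   VARCHAR(255)
);

-- Load the CSV. The path is a placeholder, the file is assumed to have one
-- header line, and LOCAL loading must be enabled on both client and server.
LOAD DATA LOCAL INFILE '/path/to/weekly.csv'
INTO TABLE ids_import
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(PID, Serial, bla1, bla2);

-- Validation step: staging rows that have no counterpart in ids.
SELECT s.*
FROM ids_import AS s
LEFT JOIN ids AS t ON t.Serial = s.Serial
WHERE t.NID IS NULL;

-- Attach each PID to its existing NID.
-- Assumes Serial identifies a row and that an unset PID is stored as NULL.
UPDATE ids AS t
JOIN ids_import AS s ON s.Serial = t.Serial
SET t.PID = s.PID
WHERE t.PID IS NULL;

-- Empty the staging table for next week's file.
TRUNCATE TABLE ids_import;
```

The SELECT before the UPDATE is what makes the weekly import easy to check: anything it returns is a row from the file that could not be matched and would otherwise be silently skipped.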

Related

Is there a data type in MySQL that is similar to a dynamic array in C or C++?

I want to structure a MySQL table with many of the usual columns, such as an index as an integer, a name as a varchar, etc. The thing about this table is that I want to include a column that has an unknown number of entries. I think the best way to do this (if possible) is to make one of the columns an array that can be changed like any other entry in a database. Suppose that when the record is created it has 0 entries. Then later, I want to add 1 or more. Maybe later still, I might want to remove 1 or more of these entries.
I know I could create the table with individual columns for each of these additions, but I may want as many as a hundred or more for one record. This seems very inefficient and very difficult to maintain. So the bottom-line question is can a column be defined as a dynamic array? If so, how? How can things be selectively added to or removed from it?
I'll take a stab in the dark and guess maybe make a table contain another table. I've never heard of this because my experience with MySQL has been mostly casual. I make databases and dynamic websites because I want to.
The way to do this in a relational database is to create another table. One column of that table holds a foreign key pointing to the primary key of the table that should have had the array (or multiple columns, if the primary key consists of more than one column). Another column holds the values that would be found in the array. If order matters, a third column stores values indicating the ordinality.
Something along the lines of:
CREATE TABLE elbat_array
  (id integer,
   elbat integer -- or whatever type the primary key column has
     NOT NULL,
   value text, -- or whatever type the values should have
   ordinality integer
     NOT NULL, -- optional
   PRIMARY KEY (id),
   FOREIGN KEY (elbat)
     REFERENCES elbat -- the other table
       (id) -- and its primary key column
     ON DELETE CASCADE,
   UNIQUE (elbat, ordinality)); -- positions are unique per parent row, not across the whole table
To add to the "array", insert rows into that table. To remove, delete rows. There can be as few as zero rows (i.e. "array" elements) or as many as disk space allows (unless you hit a limit of the DBMS first, but any such limit would be very large, so that should usually not be a problem).
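As a minimal sketch of those operations against the elbat_array table above: the sample values are made up, and the statements assume a parent row with id = 1 already exists in elbat.

```sql
-- Add three elements to the "array" of parent row 1.
-- id is not auto-incremented in the table above, so it is supplied explicitly.
INSERT INTO elbat_array (id, elbat, value, ordinality) VALUES
  (1, 1, 'first',  1),
  (2, 1, 'second', 2),
  (3, 1, 'third',  3);

-- Remove the second element.
DELETE FROM elbat_array
WHERE elbat = 1 AND ordinality = 2;

-- Read the remaining elements back in order ('first', then 'third').
SELECT value
FROM elbat_array
WHERE elbat = 1
ORDER BY ordinality;
```

Because of ON DELETE CASCADE, deleting the parent row from elbat removes all of its "array" elements as well.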
Also worth a read in that context: "Is storing a delimited list in a database column really that bad?" While it's not about an array type in particular, on the meta level it discusses why the values in a column should be atomic. An array would violate that just as a delimited list does.

Should I use my two columns that uniquely identify a record as primary key?

I started to design a database that tracks system events by following some online tutorials, and some of the easy examples start by assigning auto-incrementing IDs as primary keys. Looking at my database, I don't really need IDs. Out of all my columns, the timestamp and the device ID are the two columns that together identify a unique event.
What my program does right now is pull the events of the past x minutes from the system log and insert them into the database. However, I could be going so far into the past that the events overlap with what's already in the database. As I mentioned before, the timestamp and the device ID are the two fields that uniquely identify an event. My question is: should I use these two fields as my primary key and use "INSERT IGNORE" from now on so I can avoid having duplicate records?
It is good practice never to use business values as a table's primary key and always to use synthetic values, e.g. auto-increment ones, instead. This will make your life easier when business requirements change :)
We are currently struggling with exactly this situation: we have had a column with business values as the primary key for 2 years and are now painfully introducing an auto-increment one.
You may need a foreign key from another table to this one in the future, to link rows between the two tables. That is easier with a one-column primary key.
But if you don't need it now, there is no need to create a column just for that. The table can be altered later to add an auto-increment column and move the primary key to it.
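The two positions can be combined. The following is only a sketch with a hypothetical events table (the table name, column names and types are assumptions): a synthetic auto-increment primary key for future foreign keys, plus a UNIQUE key on the natural pair so that INSERT IGNORE skips overlapping events.

```sql
-- Hypothetical table: synthetic primary key plus a unique natural key.
CREATE TABLE events (
  id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  device_id  INT UNSIGNED    NOT NULL,
  event_time DATETIME        NOT NULL,
  message    VARCHAR(255),
  PRIMARY KEY (id),
  UNIQUE KEY uq_device_time (device_id, event_time)
);

-- Overlapping batches can be inserted repeatedly; rows whose
-- (device_id, event_time) pair already exists are skipped with a warning.
INSERT IGNORE INTO events (device_id, event_time, message) VALUES
  (7, '2024-01-01 10:00:00', 'boot'),
  (7, '2024-01-01 10:00:00', 'boot'),   -- duplicate of the row above, skipped
  (8, '2024-01-01 10:00:00', 'boot');

-- Two rows remain: one for device 7 and one for device 8.
SELECT COUNT(*) FROM events;
```

Keep in mind that INSERT IGNORE downgrades other errors (not only duplicate-key errors) to warnings, so it is worth checking SHOW WARNINGS during testing.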

Correct way of inserting new row into SQL table, but only if pair does not exist

This has been discussed before, however I cannot understand the answers I have found.
Essentially I have a table with three columns: memo, user and keyid (the last one is the primary key and AUTO_INCREMENT). I insert a pair of values (memo and user). If I try to insert that same pair again, the insert should not happen.
From what I found out, the methods to do this all depend on a unique key (which I've got, in keyid), but what I don't understand is that you still need to do a second query just to get the keyid of the existing pair (or get nothing, in which case you go ahead with the insertion).
Is there any way to do all of this in a single query? Or am I understanding what I've read (using REPLACE or IGNORE) wrong?
You need to set a UNIQUE KEY on user + memo,
ALTER TABLE mytable
ADD CONSTRAINT unique_user_memo UNIQUE (memo, user);
and then use INSERT IGNORE or REPLACE, according to your needs, when inserting. Your current unique key is the primary key, which is all well and good, but you need a second one in order to prevent the insertion of duplicate data. If you do not create a new unique key on the two columns together, you'll need to run a SELECT query before every insert to check whether the pair already exists.
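For the part of the question about getting the keyid back without a separate lookup, one option that is not mentioned in the answer above is MySQL's INSERT ... ON DUPLICATE KEY UPDATE combined with LAST_INSERT_ID(expr). The sketch below assumes the table layout described in the question; the column types and sample values are assumptions.

```sql
-- Assumed layout of the table from the question.
CREATE TABLE mytable (
  keyid  INT UNSIGNED NOT NULL AUTO_INCREMENT,
  memo   VARCHAR(120) NOT NULL,
  `user` VARCHAR(64)  NOT NULL,
  PRIMARY KEY (keyid),
  UNIQUE KEY unique_user_memo (memo, `user`)
);

-- Inserts the pair if it is new. If the pair already exists, the no-op
-- update makes LAST_INSERT_ID() report the existing row's keyid instead.
INSERT INTO mytable (memo, `user`)
VALUES ('buy milk', 'alice')
ON DUPLICATE KEY UPDATE keyid = LAST_INSERT_ID(keyid);

-- Returns the keyid of the new or the already existing row.
SELECT LAST_INSERT_ID();
```

Client libraries usually expose this value directly (for example as the "insert id" of the connection), so in practice no second statement has to be sent. Unlike REPLACE, this does not delete and re-create the existing row, so its keyid stays stable.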

MySQL - find all duplicate records

I have a table with 55 columns. This table is going to be populated with data from a CSV file. I have created a PHP script which reads in the CSV file and inserts the records.
Whilst scanning through the CSV file I noticed there are some rows that are duplicates. I want to eliminate all duplicate records.
My question is, what would be the best way of doing this? I assume it will be either one of these two options:
Remove / skip duplicate records at source, i.e. duplicate records will not be inserted in the table.
Insert all records from the CSV file, then query the table to find and remove all duplicate records.
For option one, would this be possible to do using MS Excel or even just a text editor?
For option 2, I came across some possible solutions but surely this would result in a rather large query. I am looking for something short and simple. Is this at all possible to do?
A good way is to define a key for the table. A key is a set of fields that makes each record unique and on which all other fields depend. (In the worst case the key will consist of all the columns in your table, but usually you can define a smaller key.) You can then have the database itself enforce that key, for example with a primary key constraint or a unique index.
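As a rough sketch of both options from the question, assuming a hypothetical table csv_import whose key consists of the two columns col_a and col_b (all three names are placeholders for the real table and key columns):

```sql
-- Option 2: after loading everything, list each key value that occurs
-- more than once, together with the number of copies.
SELECT col_a, col_b, COUNT(*) AS copies
FROM csv_import
GROUP BY col_a, col_b
HAVING COUNT(*) > 1;

-- Option 1: on a table that is still empty (or already de-duplicated),
-- let the database enforce the key ...
ALTER TABLE csv_import
  ADD UNIQUE KEY uq_csv_import_key (col_a, col_b);

-- ... so that the import script can use INSERT IGNORE and duplicate rows
-- from the CSV file are skipped instead of stored.
INSERT IGNORE INTO csv_import (col_a, col_b) VALUES ('x', 'y');
INSERT IGNORE INTO csv_import (col_a, col_b) VALUES ('x', 'y');  -- skipped
```

The ALTER TABLE statement fails if the table still contains duplicates for the chosen key, which is itself a quick way to find out whether any exist.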

Primary Key field with merged tables

Apologies for the noob question (I'm keenly learning as I go). I'd be grateful for some advice on the Primary Key.
I have 5 separate (unrelated) tables (Access 2003) containing similar fields that I will be merging (using Append queries) into a single new table. Each record is unique across the tables (no duplicates).
Each separate table already has a primary key field using the default autonumber method (1-n). This means (I'm thinking) that there will be many duplicate primary key numbers between tables.
Is it standard practice (and OK to do) to delete the existing primary key field and create a new one (autonumber; 1-n) upon merging? Should I do this before the merge (for each separate table) or after the merge (on the single new table)?
Create your new table first, with its structure, primary key and any other necessary metadata defined. Then run an append query (INSERT INTO ... SELECT) from each of the five tables, specifying the columns to copy into the new table. Since the AutoNumber column is already defined on the new table and you are not selecting the AutoNumber column of the old tables, the data should copy over and each inserted row will be assigned a new primary key value.
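In Access SQL, appending into an existing table is written as INSERT INTO ... SELECT (SELECT ... INTO creates a new table instead). A minimal sketch of one such append query follows; MergedTable, Table1, Field1 and Field2 are placeholder names, and the new table is assumed to already exist with its own AutoNumber primary key.

```sql
-- Append query for the first source table. The AutoNumber column of
-- MergedTable is left out of the column list, so Access assigns new values.
INSERT INTO MergedTable (Field1, Field2)
SELECT Field1, Field2
FROM Table1;
```

The same query would be repeated for each of the other four source tables, changing only the name in the FROM clause. Access runs one statement per query, so each append is saved and run separately.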