adding data to interrelated tables..easier way? - mysql

I am a bit rusty with MySQL and trying to jump in again, so sorry if this is too easy a question.
I basically created a data model that has a table called "Master" with required fields of a name and an IDcode, and then a "Details" table with a foreign key of IDcode.
Now here's where it's getting tricky. I am entering:
INSERT INTO Details (Name, UpdateDate) Values (name, updateDate)
I get an error saying IDcode on Details doesn't have a default value, so I add one, and then it complains that Field 'Master_IDcode' doesn't have a default value.
It all makes sense, but I'm wondering if there's an easy way to do what I am trying to do. I want to add data into Details and, if no IDcode exists, add an entry into the Master table. The problem is I have to first add the name to the fund Master, wait for a unique ID to be generated (for IDcode), then figure that out and add it to my query when I enter the master data. As you can imagine, the queries are probably going to get quite long since I have many tables.
Is there an easier way, where every time I add something it searches by name to see if a foreign key exists and, if not, adds it on all the tables that it's linked to? Is there a standard way people do this? I can't imagine that, with all the complex databases out there, people have not figured out an easier way.
Sorry if this question doesn't make sense. I can add more information if needed.
P.S. This may be a different question, but I have heard of Django for Python and that it helps create queries. Would it help my situation?
Thanks so much in advance :-)

(decided to expand on the comments above and put it into an answer)
I suggest creating a set of staging tables in your database (one for each data set/file).
Then use LOAD DATA INFILE (or insert the rows in batches) into those staging tables.
Make sure you drop indexes before the load, and re-create what you need after the data is loaded.
You can then make a single pass over the staging table to create the missing master records. For example, let's say that one of your staging tables contains a country code that should be used as a masterID. You could add the master record by doing something along the lines of:
insert
into master_table(country_code)
select distinct s.country_code
from staging_table s
left join master_table m on(s.country_code = m.country_code)
where m.country_code is null;
Then you can proceed and insert the rows into the "real" tables, knowing that all detail rows reference a valid master record.
If you need to get reference information along with the data (such as translating some code) you can do this with a simple join. Also, if you want to filter rows by some other table this is now also very easy.
insert
into real_table_x(
`key` -- KEY is a reserved word in MySQL, so it must be quoted with backticks
,colA
,colB
,colC
,computed_column_not_present_in_staging_table
,understandableCode
)
select x.`key`
,x.colA
,x.colB
,x.colC
,(x.colA + x.colB) / x.colC
,c.understandableCode
from staging_table_x x
join code_translation c on(x.strange_code = c.strange_code);
This approach is a very efficient one and it scales very nicely. Variations of the above are commonly used in the ETL part of data warehouses to load massive amounts of data.
One caveat with MySQL is that versions before 8.0.18 don't support hash joins, a join mechanism well suited to fully joining two tables. Those versions use nested loops instead, which means that you need to index the join columns very carefully.
InnoDB tables with their clustering feature on the primary key can help to make this a bit more efficient.
One last point. When you have the staging data inside the database, it is easy to add some analysis of the data and put aside "bad" rows in a separate table. You can then inspect the data using SQL instead of wading through CSV files in your editor.
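To make the staging flow above concrete, here is a minimal sketch that sets aside bad rows and then creates the missing master records. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs as-is. The staging_rejects table, the amount column, the sample data and the "bad row" rule (a missing country code) are all assumptions for illustration, not part of the answer.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE master_table    (country_code TEXT PRIMARY KEY);
    CREATE TABLE staging_table   (country_code TEXT, amount INTEGER);
    CREATE TABLE staging_rejects (country_code TEXT, amount INTEGER, reason TEXT);
    INSERT INTO master_table VALUES ('SE');
""")

# Pretend this is the result of the bulk load into the staging table.
conn.executemany(
    "INSERT INTO staging_table (country_code, amount) VALUES (?, ?)",
    [("SE", 10), ("NO", 20), ("NO", 5), (None, 7), ("", 3), ("DK", 1)],
)

# Hypothetical rule for a "bad" row: the country code is missing.
BAD_ROW = "country_code IS NULL OR country_code = ''"

# 1. Copy bad rows to a separate table for later inspection, then drop them.
conn.execute(
    "INSERT INTO staging_rejects "
    "SELECT country_code, amount, 'missing country_code' "
    f"FROM staging_table WHERE {BAD_ROW}"
)
conn.execute(f"DELETE FROM staging_table WHERE {BAD_ROW}")

# 2. Single pass over the staging table to create missing master records.
conn.execute("""
    INSERT INTO master_table (country_code)
    SELECT DISTINCT s.country_code
    FROM staging_table s
    LEFT JOIN master_table m ON s.country_code = m.country_code
    WHERE m.country_code IS NULL
""")
conn.commit()

master_codes = [r[0] for r in conn.execute(
    "SELECT country_code FROM master_table ORDER BY country_code")]
rejected = conn.execute("SELECT COUNT(*) FROM staging_rejects").fetchone()[0]
```

The rejected rows stay queryable in staging_rejects, so they can be inspected with plain SQL before deciding whether to fix and reload them.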

I don't think there's a one-step way to do this.
What I do is issue an
INSERT IGNORE (..) values (..)
to the master table, which will either create the row if it doesn't exist or do nothing (this relies on a unique index on the attribute you look up by), and then issue a
SELECT id FROM master where someUniqueAttribute = ..
The other option would be stored procedures/triggers, but they are still pretty new in MySQL and I doubt whether this would help performance.
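A minimal sketch of that two-step pattern, using Python's built-in sqlite3 module as a stand-in for MySQL so it can be run as-is. The table and column names (master, details, name, master_id, update_date) and the helpers get_or_create_master and add_detail are assumptions for illustration, not the asker's schema. Note that SQLite spells the statement INSERT OR IGNORE where MySQL uses INSERT IGNORE; both depend on a unique index on the lookup column.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE master (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE      -- the unique index the pattern relies on
    );
    CREATE TABLE details (
        id          INTEGER PRIMARY KEY,
        master_id   INTEGER NOT NULL REFERENCES master(id),
        update_date TEXT NOT NULL
    );
""")

def get_or_create_master(conn, name):
    """Insert the master row if it is missing, then return its id."""
    # SQLite syntax; the MySQL equivalent is INSERT IGNORE INTO ...
    conn.execute("INSERT OR IGNORE INTO master (name) VALUES (?)", (name,))
    row = conn.execute("SELECT id FROM master WHERE name = ?", (name,)).fetchone()
    return row[0]

def add_detail(conn, name, update_date):
    """Add a detail row, creating the master row first when needed."""
    master_id = get_or_create_master(conn, name)
    conn.execute(
        "INSERT INTO details (master_id, update_date) VALUES (?, ?)",
        (master_id, update_date),
    )
    return master_id

first = add_detail(conn, "Fund A", "2011-01-01")
second = add_detail(conn, "Fund A", "2011-02-01")  # reuses the existing master row
conn.commit()
```

Wrapping the lookup in one helper keeps the calling code short even when many tables hang off the same master.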

Related

Asking opinion about table structure

I'm working on a project to make a digital form of this paper (can't post image),
and the data will be displayed on a web page in a simple table view. There will be NO altering, deleting, or updating. It's just displaying (via SELECT * of course) the data that was entered.
The data will be inserted via an Android app and stored in a single MySQL table which has 30 columns.
The question is: is it a good idea to use a single table? I think there will be no complex operations in the SQL.
The other question is: am I violating some rules with this method?
I need your opinion. Thanks.
It's totally ok to use only one table, if that suits your needs. What you can do to make the database a little bit 'smarter' is add new tables for attributes in your paper that will be repeated. So, for example, the Soil Type could be another table where there are two columns, ID and Description, and you will use it as a foreign key in each record in the main table. You need this if you want your database to be in 3NF.
To sum up, yes you can have one table if that's all you need. However, adding more tables might help save some space and make your database more flexible. It's up to you to decide! :)
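As a rough illustration of the lookup-table idea, here is a small sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The main table name form_entry, its location column and the sample rows are assumptions; only the Soil Type table with ID and Description comes from the answer.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Lookup table: each soil type is stored once.
    CREATE TABLE soil_type (
        id          INTEGER PRIMARY KEY,
        description TEXT NOT NULL UNIQUE
    );
    -- Main table (hypothetical name) refers to the soil type by id.
    CREATE TABLE form_entry (
        id           INTEGER PRIMARY KEY,
        location     TEXT NOT NULL,
        soil_type_id INTEGER NOT NULL REFERENCES soil_type(id)
    );
    INSERT INTO soil_type VALUES (1, 'Clay'), (2, 'Sand');
    INSERT INTO form_entry (location, soil_type_id)
    VALUES ('Plot 1', 1), ('Plot 2', 2), ('Plot 3', 1);
""")

# The web view can still show the readable description with one join.
rows = conn.execute("""
    SELECT e.location, s.description
    FROM form_entry AS e
    JOIN soil_type AS s ON s.id = e.soil_type_id
    ORDER BY e.id
""").fetchall()
```

The description is stored once per soil type instead of once per record, which is where the space saving and flexibility come from.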

How to set up relational database tables for this many-to-many relationship?

I have a type of data called a chain. Each chain is made up of a specific sequence of another type of data called a step. So a chain is ultimately made up of multiple steps in a specific order. I'm trying to figure out the best way to set this up in MySQL that will allow me to do the following:
Look up all steps in a chain, and get them in the right order
Look up all chains that contain a step
I'm currently considering the following table set up as the appropriate solution:
TABLE chains
id date_created
TABLE steps
id description
TABLE chains_steps (this would be used for joins)
chain_id step_id step_position
In the table chains_steps, the step_position column would be used to order the steps in a chain correctly. It seems unusual for a JOIN table to contain its own distinct piece of data, such as step_position in this case. But maybe it's not unusual at all and I'm just inexperienced/paranoid.
I don't have much experience in all this so I wanted to get some feedback. Are the three tables I suggested the correct way to do this? Are there any viable alternatives and, if so, what are the advantages/drawbacks?
You're doing it right.
Consider a database containing the Employees and Projects tables, and how you'd want to link them in a many-to-many fashion. You'd probably come up with an Assignments table (or Project_Employees in some naming conventions).
At some point you'd decide you want not only to store each project assignment, but you'd also want to store when the assignment started, and when it finished. The natural place to put that is in the assignment itself; it doesn't make sense to store it either with the project or with the employee.
In further designs you might even find it necessary to store further information about the assignment, for example in an employee review process you may wish to store feedback related to their performance in that project, so you'd make the assignment the "one" end of a relationship with a Review table, which would relate back to Assignments with a FK on assignment_id.
So in short, it's perfectly normal to have a junction table that has its own data.
That looks fine, and it's not unusual for the join table to contain a position/rank field.
Look up all steps in a chain, and get them in the right order
SELECT * FROM chains_steps
LEFT JOIN steps ON steps.id = chains_steps.step_id
WHERE chains_steps.chain_id = ?
ORDER BY chains_steps.step_position ASC
Look up all chains that contain a step
SELECT DISTINCT chain_id FROM chains_steps
LEFT JOIN chains ON chains.id = chains_steps.chain_id
WHERE chains_steps.step_id = ?
I think that the plan you've outlined is the correct approach. Don't worry too much about the presence of step_position on your mapping table. After all the step_position is a bit of data that is directly related to a step in the context of a chain. So the chains_steps table is the right place for it IMHO.
Some things to think about:
Foreign keys - use 'em!
Unique key on the chains_steps table - can a step be present in more than one position in a single chain? What about in different chains?
Good luck!
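Putting the points above together, here is a minimal sketch of the three tables with foreign keys and a key on the junction table, using Python's built-in sqlite3 module as a stand-in for MySQL. The choice of (chain_id, step_position) as the primary key is an assumption: it lets a step appear more than once in a chain while guaranteeing one step per position. The sample data and helper names are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE chains (id INTEGER PRIMARY KEY, date_created TEXT NOT NULL);
    CREATE TABLE steps  (id INTEGER PRIMARY KEY, description  TEXT NOT NULL);
    CREATE TABLE chains_steps (
        chain_id      INTEGER NOT NULL REFERENCES chains(id),
        step_id       INTEGER NOT NULL REFERENCES steps(id),
        step_position INTEGER NOT NULL,
        -- one step per position in a chain; the same step may repeat
        PRIMARY KEY (chain_id, step_position)
    );
    INSERT INTO chains VALUES (1, '2012-01-01'), (2, '2012-01-02');
    INSERT INTO steps  VALUES (1, 'mix'), (2, 'bake'), (3, 'cool');
    -- columns: chain_id, step_id, step_position
    INSERT INTO chains_steps VALUES
        (1, 1, 1), (1, 2, 2), (1, 1, 3),
        (2, 3, 1), (2, 1, 2);
""")

def steps_in_chain(conn, chain_id):
    """All steps of one chain, in order."""
    rows = conn.execute("""
        SELECT s.description
        FROM chains_steps AS cs
        JOIN steps AS s ON s.id = cs.step_id
        WHERE cs.chain_id = ?
        ORDER BY cs.step_position
    """, (chain_id,))
    return [r[0] for r in rows]

def chains_with_step(conn, step_id):
    """Ids of all chains that contain the given step at least once."""
    rows = conn.execute("""
        SELECT DISTINCT chain_id
        FROM chains_steps
        WHERE step_id = ?
        ORDER BY chain_id
    """, (step_id,))
    return [r[0] for r in rows]
```

If a step must never repeat within a chain, a second unique key on (chain_id, step_id) would enforce that.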

Update a table with data from another where non FK columns match

I'm working on an online registry which was created by a previous programmer. I have to fix a bunch of data integrity issues revolving around postal codes and cities. I am trying to do a large update query using data from our table of Canadian postal codes and our table of registrants. My query seems to literally take infinite time on my development environment. Not sure why.
Create Temporary Table RegistrantToChange AS (
SELECT
intID, vcCity, vcPostalCode
FROM
tblRegistrantWebsiteSignUps
WHERE
vcPostalCode NOT LIKE '00%' AND vcPostalCode!=''
AND (vcCity = '' OR vcCity = 'unspecified')
);
UPDATE RegistrantToChange, tblPostalCodes
SET
vcPostalCode = tblPostalCodes.PostalCode
WHERE
vcCity = tblPostalCodes.CityName;
Pardon the horrific and inconsistent naming. I just recently took over this project and am still in the process of refactoring the whole thing.
vcCity in your temporary table is not indexed, and if tblPostalCodes.CityName is not indexed then the JOIN in the update has a lot of work to do and may take some time.
I would suggest creating the temporary table first with an index on vcCity, then perform an INSERT...SELECT to populate it. Ensure that tblPostalCodes.CityName is indexed and then perform your update.

How to merge 2 Records in innoDB MySQL databases

This is related to How to change ID in mysql
I also have checked other questions and none are quite like this one.
As we know, InnoDB has a feature: if I want to change the ID of a record, for example, then all other tables that point to the previous ID will magically be updated.
What about if I want to MERGE 2 records?
Say I have 2 businesses.
They have 2 IDs.
I want to merge them into one. I also want to use InnoDB's awesome feature to automatically change things.
I can't just change one of the IDs to the other ID. Or can I?
What would you do to merge 2 similar records in a database?
Of course what actually goes into the combined record will be business decisions.
Basically I just do not want to pinpoint all the other tables one by one. I think the ON UPDATE rule is there for a reason. Is there a way where I just change slaveID to masterID, keep ALL data in master the same, and then have the database itself (rather than my program) repoint all tables that point to slaveID to point to masterID? Of course, records for slaveID will be gone anyway.
For example, with the normal MySQL engine, you can change an ID, and then you have to go through all tables that point to the old ID and point them to the new ID instead. With InnoDB, that repointing is done by the database engine itself, which is kind of cool. Why would anyone use a non-InnoDB engine anyway?
I want to do the same but for merging.
Trying to set a record's primary key to an already existing value will simply result in a key violation error. While this seems simple at first glance, it has a side effect: you cannot use ON UPDATE CASCADE to merge two records. It will simply not work.
If you have the possibility to change the schema, you can use the old but good redirect trick:
(Assuming your IDs are positive, maybe unsigned ints)
Add a field redirect int not null default 0
Create a view:
CREATE VIEW tablename_view AS
SELECT
-- repeat next line for every field apart from redirect
IF(s.redirect>0, m.<fieldname>, s.<fieldname>) AS <fieldname>
FROM tablename AS s
LEFT JOIN tablename AS m ON s.redirect=m.id
When you merge a record (slave) into another record (master), run UPDATE tablename SET redirect=<id_of_master> WHERE id=<id_of_slave>
Adapt your select queries to select from tablename_view instead of tablename
Create and use a maintenance script to weed out merger slaves
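The redirect steps above can be sketched roughly as follows, using Python's built-in sqlite3 module as a stand-in for MySQL so the example runs as-is. The business table, its name column, the sample rows and the helpers merge and name_of are assumptions for illustration. SQLite's CASE expression is used here in place of MySQL's IF().

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE business (
        id       INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        redirect INTEGER NOT NULL DEFAULT 0   -- 0 means "not merged"
    );
    -- The view shows the master's data for any row that has been merged away.
    CREATE VIEW business_view AS
    SELECT s.id AS id,
           CASE WHEN s.redirect > 0 THEN m.name ELSE s.name END AS name
    FROM business AS s
    LEFT JOIN business AS m ON s.redirect = m.id;

    INSERT INTO business (id, name)
    VALUES (1, 'Acme Ltd'), (2, 'ACME Limited'), (3, 'Other Co');
""")

def merge(conn, slave_id, master_id):
    """Mark slave_id as merged into master_id; no child table is touched."""
    conn.execute(
        "UPDATE business SET redirect = ? WHERE id = ?", (master_id, slave_id)
    )

def name_of(conn, business_id):
    """Read through the view, so merged rows resolve to their master."""
    row = conn.execute(
        "SELECT name FROM business_view WHERE id = ?", (business_id,)
    ).fetchone()
    return row[0]

before = name_of(conn, 2)
merge(conn, slave_id=2, master_id=1)
after = name_of(conn, 2)
```

Rows in other tables that still point at id 2 keep working, because anything read through the view resolves to the master's data until the maintenance script repoints them.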

How to store data in mysql, to get the fastest performance?

I'm wondering which of the following two approaches would give me the fastest performance for a user messaging module on my site.
The first one I thought about is a multi-table setup with a connection table and a main table. The connection table holds the connection between the accounts and the messages table.
In this case, a query to get some data about the author and the messages he has sent would look like the following:
SELECT m.*, a.username
FROM messages AS m
LEFT JOIN connection_table
ON (message_id = m.id)
LEFT JOIN accounts AS a
ON (account_id = a.id)
WHERE m.id = '32341'
Inserting into it is a little bit more "complicated".
My other idea, and in my opinion the better solution to this problem, is to store the data I would otherwise put in a connection table in the same table where I store the message data. It sounds like I would get lots of duplicated entries, but no, because I have a field of type text which holds user IDs like this: *24*32*249*
If I want to query them, I use the MySQL LIKE operator. Deleting is another problem, but for this I have one more field where I store who has deleted the post.
Sadly, I don't know how to join on this.
So what would you recommend? Are there other ways?
Sounds like you are using an n:m relation. If so, don't put a list of IDs in a single column, but create a mapping table containing two columns: the primary key of table1 and the primary key of table2. Then selecting, inserting and deleting will all be easy and still cheap.
I wonder how many messages will be sent to multiple recipients? It might just be easier to have it all in one table (MessageID, SentFrom, SentTo, Message) and duplicate the row for multiple people. This obviously makes it extremely easy to query.
Definitely avoid storing multiple IDs in one field and using LIKE. That'll be a performance killer. Go with ThiefMaster's suggestion if you want something like that.
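For comparison with the *24*32*249* column, here is a minimal sketch of the mapping-table approach, using Python's built-in sqlite3 module as a stand-in for MySQL. The table and column names (message_recipients, sender_id, deleted), the sample data and the helpers inbox and delete_for are assumptions for illustration. The per-recipient deleted flag replaces the extra "who has deleted the post" field.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, username TEXT NOT NULL);
    CREATE TABLE messages (
        id        INTEGER PRIMARY KEY,
        sender_id INTEGER NOT NULL REFERENCES accounts(id),
        body      TEXT NOT NULL
    );
    -- One row per (message, recipient) instead of a '*24*32*249*' string.
    CREATE TABLE message_recipients (
        message_id INTEGER NOT NULL REFERENCES messages(id),
        account_id INTEGER NOT NULL REFERENCES accounts(id),
        deleted    INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (message_id, account_id)
    );
    CREATE INDEX idx_recipient_account ON message_recipients (account_id);

    INSERT INTO accounts VALUES (24, 'alice'), (32, 'bob'), (249, 'carol');
    INSERT INTO messages VALUES (1, 24, 'hello both'), (2, 32, 'hi alice');
    INSERT INTO message_recipients (message_id, account_id)
    VALUES (1, 32), (1, 249), (2, 24);
""")

def inbox(conn, account_id):
    """Messages visible to one recipient, found by indexed equality, not LIKE."""
    rows = conn.execute("""
        SELECT m.id, a.username, m.body
        FROM message_recipients AS r
        JOIN messages AS m ON m.id = r.message_id
        JOIN accounts AS a ON a.id = m.sender_id
        WHERE r.account_id = ? AND r.deleted = 0
        ORDER BY m.id
    """, (account_id,))
    return rows.fetchall()

def delete_for(conn, message_id, account_id):
    """Hide a message for one recipient without affecting the others."""
    conn.execute(
        "UPDATE message_recipients SET deleted = 1 "
        "WHERE message_id = ? AND account_id = ?",
        (message_id, account_id),
    )
```

Because the lookup is a plain equality on an indexed column, the database can use the index instead of scanning every message with a leading-wildcard LIKE.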