Saving records in MySQL that do not pre-exist - mysql

Let's say I have a pre-defined table called cities, with almost all the cities in my country.
When a user registers (user table), the column cities_id in the table user stores the city id from the table cities (a foreign key referencing the table cities), something like:
CREATE TABLE `cities` (
`id` int,
`city_name` varchar(100)
)
CREATE TABLE `user` (
`id` int,
`name` varchar(60)
`****`
`cities_id` FK
)
The user table stores the city id.
But what if I missed a few cities? How does the user then save his city name in the user table, which does not accept a city name but only IDs?
Can I have one more column city_name right after the cities_id in the table user something like
CREATE TABLE `user` (
`id` int,
`name` varchar(60)
`****`
`cities_id` FK
`city_name` varchar(100)
)
to record the data entered by the user at the time of registration? Can this be done?

You can add a type column to the cities table. When a user can't find his city in the list, let him type its name; you then create a corresponding record in the cities table with type set to a special status (so that operations staff can easily check and correct it), and save that record's id in the user record.
CREATE TABLE `cities` (
`id` int,
`city_name` varchar(100),
`type` int
)
CREATE TABLE `user` (
`id` int,
`name` varchar(60)
`****`
`cities_id` FK
)

As @Joakim mentioned in the comments, from a DB perspective, since cities_id is a foreign key referencing the cities table, inserting a record into the user table will fail if the city in question is not already in that table.
From a programming perspective, it is possible to have a missing city inserted automatically whenever a user registers. Assuming you are using Java and Hibernate, and the User entity contains a City entity with cascading enabled on that association, calling saveOrUpdate() on the user entity will insert the city record if it is not already there, and then insert the user record into the User table.

This is how I would quickly solve it.
Create an additional table to store the missing cities that users introduce:
CREATE TABLE `cities_users` (
`id` int,
`city_name` varchar(100),
`added_by` varchar(100),
`added_TS` DATETIME DEFAULT CURRENT_TIMESTAMP
);
Create a VIEW that UNIONs the 2 cities tables:
CREATE VIEW all_cities AS
SELECT id, city_name FROM `cities`
UNION ALL
SELECT id, city_name FROM `cities_users`;
Whenever a user registers, you query the VIEW to check whether the user's city exists. That way you'll know if a city exists in your original table OR among the cities introduced by users.
If not, you INSERT the new city into the cities_users table (along with the user that created it, for logging purposes).
You should generate a unique ID properly, i.e. one that can never exist in the cities table. You can do this in various ways; here's a quick example: start the cities_users IDs at an offset of 1 million and add 1 to the last ID for each new city. Your cities_users IDs will look like 1000001, 1000002, 1000003.
And finally, you insert the generated cities_users ID into the users table.
Having a separate table for user inputs should help you keep the database clean:
Your original cities table remains totally unchanged
You will know easily at all times the new cities added by whom and when, and you can create a small interface to review and manage that.
Your users are working for you to complete your database.
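The steps above can be sketched roughly as follows. This is only a minimal illustration, using Python's built-in sqlite3 module as a stand-in for MySQL; the helper names make_db and resolve_city_id and the constant USER_CITY_ID_OFFSET are made up for the example and are not from the answer.

```python
import sqlite3

# Hypothetical offset so user-added ids never collide with ids in `cities`.
USER_CITY_ID_OFFSET = 1_000_000


def make_db():
    """Create the two city tables and the view that unions them."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE cities (id INTEGER PRIMARY KEY, city_name TEXT NOT NULL);
        CREATE TABLE cities_users (
            id INTEGER PRIMARY KEY,
            city_name TEXT NOT NULL,
            added_by TEXT,
            added_TS TEXT DEFAULT CURRENT_TIMESTAMP
        );
        CREATE VIEW all_cities AS
            SELECT id, city_name FROM cities
            UNION ALL
            SELECT id, city_name FROM cities_users;
    """)
    return conn


def resolve_city_id(conn, city_name, added_by):
    """Return the id of city_name, inserting into cities_users if unknown."""
    row = conn.execute(
        "SELECT id FROM all_cities WHERE city_name = ?", (city_name,)
    ).fetchone()
    if row:
        return row[0]
    # Next id: one past the largest user-added id, starting above the offset.
    (max_id,) = conn.execute("SELECT MAX(id) FROM cities_users").fetchone()
    new_id = (max_id or USER_CITY_ID_OFFSET) + 1
    conn.execute(
        "INSERT INTO cities_users (id, city_name, added_by) VALUES (?, ?, ?)",
        (new_id, city_name, added_by),
    )
    return new_id


conn = make_db()
conn.execute("INSERT INTO cities (id, city_name) VALUES (1, 'Paris')")
known_id = resolve_city_id(conn, "Paris", "alice")      # found in cities
new_id = resolve_city_id(conn, "Smallville", "alice")   # added to cities_users
again_id = resolve_city_id(conn, "Smallville", "bob")   # found via the view
```

One thing to keep in mind with this design: a foreign key cannot reference a view, so the city id column in the users table would have to be stored without an FK constraint and checked by the application instead.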

If a user suggests a new city, you should create a new record in the cities table and store its city_id in the users table. This is the best way to store the table records.

I feel like it should be pointed out, despite answers to the contrary, that your original suggestion of adding a city_name column to the table will work fairly well.
If you allow both cities_id and city_name to be nullable, then you can validate in the application logic that one and only one of them is set.
The benefit of this approach is that it keeps your city table 'pure' and lets you easily count duplicates of, and analyse, the user-supplied cities.
It would, however, add a very sparse nullable city_name column to your table.
I guess it depends on how you want to get the city from the user (drop-down + text box for others, text box with suggestions, just a text box) and what you plan to do with the cities you have gathered.
You could even change the label to 'city (or nearest city)' with a hard-coded drop-down, or searchable drop-down, and not allow user-supplied cities.
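The "one and only one of them is set" check could look something like this in the application layer. It is just a sketch; the function name validate_city_fields is made up, and treating a blank string as "not set" is my own assumption.

```python
from typing import Optional


def validate_city_fields(cities_id: Optional[int], city_name: Optional[str]) -> bool:
    """Return True when exactly one of the two city fields is set."""
    has_id = cities_id is not None
    # Assumption: a blank or whitespace-only name counts as "not set".
    has_name = city_name is not None and city_name.strip() != ""
    return has_id != has_name


picked_from_list = validate_city_fields(3, None)
typed_by_user = validate_city_fields(None, "Smallville")
nothing_given = validate_city_fields(None, None)
both_given = validate_city_fields(3, "Smallville")
```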

If you have a buffer table where the raw data is put in, i.e. the relationship between city_name and user_name:
CREATE TABLE `buffer_city_user` (
`buffer_id` int,
`city_name` varchar(100),
`user_name` varchar(100)
);
you can first process the buffer table for new city_names - if found, insert into table cities.
Then insert the user info - any new city-names should already be in the cities table and no foreign key issues will occur.
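A rough sketch of that two-step processing, with Python's sqlite3 module standing in for MySQL. The column types and the sample rows are invented for the example; only the table names come from the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript("""
    CREATE TABLE cities (id INTEGER PRIMARY KEY, city_name TEXT UNIQUE NOT NULL);
    CREATE TABLE user (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        cities_id INTEGER NOT NULL REFERENCES cities(id)
    );
    CREATE TABLE buffer_city_user (
        buffer_id INTEGER PRIMARY KEY,
        city_name TEXT,
        user_name TEXT
    );
    INSERT INTO cities (city_name) VALUES ('Paris');
    INSERT INTO buffer_city_user (city_name, user_name)
        VALUES ('Paris', 'alice'), ('Smallville', 'bob');
""")

with conn:  # one transaction: commit on success, roll back on error
    # Step 1: add any city names from the buffer that are not in cities yet.
    conn.execute("""
        INSERT INTO cities (city_name)
        SELECT DISTINCT b.city_name
        FROM buffer_city_user b
        WHERE b.city_name NOT IN (SELECT city_name FROM cities)
    """)
    # Step 2: insert the users; every city now exists, so the FK is satisfied.
    conn.execute("""
        INSERT INTO user (name, cities_id)
        SELECT b.user_name, c.id
        FROM buffer_city_user b
        JOIN cities c ON c.city_name = b.city_name
    """)

city_count = conn.execute("SELECT COUNT(*) FROM cities").fetchone()[0]
users = conn.execute("""
    SELECT u.name, c.city_name
    FROM user u JOIN cities c ON c.id = u.cities_id
    ORDER BY u.name
""").fetchall()
```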

Related

How to detect deleted rows when migrating data

I have a main database and am moving data from that database to a second data warehouse on a periodic schedule.
Instead of migrating an entire table each time, I want to migrate only the rows that have changed since the process last ran. This is easy enough to do with a WHERE clause. However, suppose some rows have been deleted in the main database. I don't have a good way to detect which rows no longer exist, so that I can delete them on the data warehouse too. Is there a good way to do this? (As opposed to reloading the entire table each time, since the table is huge.)
It could be done in the following steps. In this example I am using a customer table:
CREATE TABLE CUSTOMERS(
ID INT NOT NULL,
NAME VARCHAR (20) NOT NULL,
AGE INT NOT NULL,
ADDRESS CHAR (25) ,
LAST_UPDATED DATETIME,
PRIMARY KEY (ID)
);
Create a CDC (change data capture) table to record deletions:
CREATE TABLE CUSTOMERS_CDC(
ID INT NOT NULL,
LAST_UPDATED DATETIME,
PRIMARY KEY (ID)
);
Create a trigger on the source table for the delete event, like below:
CREATE TRIGGER TRG_CUSTOMERS_DEL
ON CUSTOMERS
FOR DELETE
AS
INSERT INTO CUSTOMERS_CDC (ID, LAST_UPDATED)
SELECT ID, getdate()
FROM DELETED
In your ETL process where you are querying source for changes add deleted records information through UNION or create separate process like below:
SELECT ID, NAME, AGE, ADDRESS, LAST_UPDATED, 'I/U' STATUS
FROM CUSTOMERS
WHERE LAST_UPDATED > @lastpulldate
UNION
SELECT ID, null, null, null, LAST_UPDATED, 'D' STATUS
FROM CUSTOMERS_CDC
WHERE LAST_UPDATED > @lastpulldate
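To see the whole flow end to end, here is a small sketch using Python's sqlite3 module in place of the source database. The trigger is written in SQLite's syntax (OLD.id rather than the DELETED pseudo-table), the table is trimmed to three columns, and the sample rows and dates are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        last_updated TEXT NOT NULL
    );
    CREATE TABLE customers_cdc (
        id INTEGER PRIMARY KEY,
        last_updated TEXT NOT NULL
    );
    -- SQLite spelling of the delete trigger: OLD is the row being removed.
    -- OR REPLACE covers an id that is re-inserted and deleted again.
    CREATE TRIGGER trg_customers_del AFTER DELETE ON customers
    BEGIN
        INSERT OR REPLACE INTO customers_cdc (id, last_updated)
        VALUES (OLD.id, datetime('now'));
    END;
""")

# Inserts/updates and deletes since the last pull, tagged with a status.
CHANGES_SQL = """
    SELECT id, name, last_updated, 'I/U' AS status
    FROM customers
    WHERE last_updated > :last_pull
    UNION ALL
    SELECT id, NULL, last_updated, 'D' AS status
    FROM customers_cdc
    WHERE last_updated > :last_pull
    ORDER BY id
"""

last_pull = "2000-01-01 00:00:00"
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1, "Ann", "1999-06-01 00:00:00"),  # unchanged since last pull
        (2, "Bob", "2001-01-01 00:00:00"),  # changed since last pull
        (3, "Cy", "1999-06-01 00:00:00"),   # will be deleted below
    ],
)
conn.execute("DELETE FROM customers WHERE id = 3")  # fires the trigger

changes = [
    (row[0], row[3])
    for row in conn.execute(CHANGES_SQL, {"last_pull": last_pull})
]
```

The ETL job would then apply the 'I/U' rows as upserts and the 'D' rows as deletes on the warehouse side.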
If you just fire an update query, it won't update rows that have no match.
The way I see it: let's say you do it your way, with a WHERE clause. You'd have that as part of an update query, unless you are doing a CSV export. If you do a mysqldump of the rows you wish to update and create a new tempTable in the main database,
Then
UPDATE mainTable SET ... WHERE id IN (SELECT id FROM tempTable WHERE id > 0 AND id < 1000)
If there is no corresponding match, then no update gets run, and no error occurs, by using the id limits as parameters.

Auto Increment in sql with specific name

I need an auto-increment that produces values in a format like abc_1, abc_2, where abc_ is a constant prefix followed by the auto-incremented number. The code shown below is for a plain auto-increment.
CODE
sql = "CREATE TABLE MY_TABLE
(
table_id int NOT NULL AUTO_INCREMENT,
PRIMARY KEY(table_id),
table_1 varchar(45),
table_2 varchar(45),
table_3 varchar(999),
table_4 varchar(45)
)"
You have 2 options - both include keeping the autoincrement field exactly as it is.
The 1st option is to add a short char-type field, which simply stores your alpha part. When you want to retrieve the whole key, you can SELECT CONCAT(alpha_part, table_id) AS ID. This uses less storage, but requires more work in each select statement.
The 2nd option is to add a longer column that normally gets populated by an insert trigger. It simply stores the concatenation on creation, so you don't have to concatenate it when you want to select it. This option also makes it easier to create an index or clustered index.
CREATE TABLE MY_TABLE (
table_id int NOT NULL AUTO_INCREMENT, PRIMARY KEY(table_id),
alpha_part varchar(10) NOT NULL, -- This
display_id varchar(40) NOT NULL, -- OR This (not both)
table_1 varchar(45),
table_2 varchar(45),
table_3 varchar(999),
table_4 varchar(45) )
"Database Id" and "Insurance Policy Id" are two separate entities - they may contain the "same" number, but don't mix up what the database needs to perform effectively, with what your business application needs to generate IDs for customers. Business rules and database Id are separate entities. You can "seed" a policy Id from a database generated one, but if something changes the policy id (yes this happens) your database suddenly needs to be refactored and you don't want that to happen.
You could add another column to derive this value, then have a trigger that automatically updates this column to add the derived value whenever a row is inserted.
However, it is not clear why this would be needed. It is likely better to just store the number and derive the form abc_123 where that value needs to be used.
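Deriving the abc_123 form only where it is needed could look roughly like this. It is a sketch with Python's sqlite3 module standing in for MySQL; the helpers format_id and parse_id are hypothetical names, and the table is cut down to one data column.

```python
import sqlite3

PREFIX = "abc_"  # the constant part from the question


def format_id(table_id: int) -> str:
    """Build the display form from the stored numeric id."""
    return f"{PREFIX}{table_id}"


def parse_id(display_id: str) -> int:
    """Recover the numeric id from a display form such as 'abc_2'."""
    if not display_id.startswith(PREFIX):
        raise ValueError(f"unexpected id format: {display_id!r}")
    return int(display_id[len(PREFIX):])


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table ("
    "table_id INTEGER PRIMARY KEY AUTOINCREMENT, table_1 TEXT)"
)
for value in ("first", "second"):
    conn.execute("INSERT INTO my_table (table_1) VALUES (?)", (value,))

# Only the number is stored; the prefixed form is derived on the way out.
display_ids = [
    format_id(row[0])
    for row in conn.execute("SELECT table_id FROM my_table ORDER BY table_id")
]
```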
This was an interesting question, so I googled custom auto-increment structures and found some links. Most people say that it's better to use a trigger before insertion, and I think that can be one possible solution for your problem. Look at the following link.
http://www.experts-exchange.com/Database/MySQL/Q_27602627.html

Correct way to place data in a table with a OneToOne relationship

I am confused about the correct/most efficient way to place data in my database table when there is a OneToOne relationship.
For example, I have a users table.
I now wish for each user to be able to state his current country location.
I then want to be able to search the database for users by current location.
The way that I have done this is to create 3 separate tables, i.e.
Table one - users: just contains the user information:
CREATE TABLE users(
id MEDIUMINT UNSIGNED NOT NULL AUTO_INCREMENT,
firstName VARCHAR(30) NOT NULL,
lastName VARCHAR(40) NOT NULL,
PRIMARY KEY (id) -- MySQL requires an AUTO_INCREMENT column to be a key
);
Table two country list: a list of countries and respective Ids for each country
CREATE TABLE countrylist(
country_id MEDIUMINT UNSIGNED NOT NULL,
country VARCHAR(60) NOT NULL,
INDEX country_id ( country_id, country ),
INDEX countrylist (country, country_id ),
UNIQUE KEY (country)
);
Table 3; contains the userId and the countryId he lives in:
CREATE TABLE user_countrylocation(
country_id VARCHAR(60) NOT NULL,
id MEDIUMINT UNSIGNED NOT NULL,
INDEX country_id (country_id, id ),
INDEX user_id (id, country_id )
);
Alternatively, should I place the countryId in the users table and completely get rid of user_countrylocation? I.e. in each user row, I would place a country_id for the country he lives in.
The problem is that I have over 20 similar tables as above that give details on users, i.e. languages spoken, age group, nationality etc.
My concern is that if I place this information in each user's row in the users table, what would then be the most efficient way to search the database? That is why I opted for the style above.
So I would really appreciate some advice on the most efficient/correct way to plan the database.
If you are going to have a huge amount of data, then you should keep the same approach and use the following method to keep the one-to-one constraint satisfied.
If you don't have a huge amount of data, then you should keep the lookup tables like country and reference them from a column in the user table. You may then need to make such optional information columns nullable.
The most efficient and correct way is to first delete the data from the third table "user_countrylocation" for the user to be updated, then insert the new location for the user. Don't forget to use a transaction.
Your table 3 should have
country_id MEDIUMINT UNSIGNED NOT NULL,
instead of
country_id VARCHAR(60) NOT NULL,
and also change the column name from id to user_id in all tables.
If you are using a stored procedure, it would look like:
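-- MySQL syntax; in the mysql client, change the DELIMITER before running this.
CREATE PROCEDURE sp_UpdateUserCurrentCountry (
IN p_user_id MEDIUMINT UNSIGNED,
IN p_country_id MEDIUMINT UNSIGNED)
BEGIN
-- Replace the user's current country inside one transaction.
START TRANSACTION;
DELETE FROM user_countrylocation
WHERE user_id = p_user_id;
INSERT INTO user_countrylocation
(
country_id,
user_id
)
VALUES
(
p_country_id,
p_user_id
);
COMMIT;
END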
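The same delete-then-insert inside a transaction can also be done from application code. Below is a minimal sketch with Python's sqlite3 module standing in for MySQL; the function name update_user_country is made up, and the table uses the corrected integer country_id and the user_id column name suggested above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_countrylocation ("
    "country_id INTEGER NOT NULL, user_id INTEGER NOT NULL)"
)


def update_user_country(conn, user_id, country_id):
    """Replace the user's current country atomically."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "DELETE FROM user_countrylocation WHERE user_id = ?", (user_id,)
        )
        conn.execute(
            "INSERT INTO user_countrylocation (country_id, user_id) VALUES (?, ?)",
            (country_id, user_id),
        )


update_user_country(conn, 7, 1)  # first location
update_user_country(conn, 7, 2)  # user moves; old row is replaced
rows = conn.execute(
    "SELECT country_id, user_id FROM user_countrylocation"
).fetchall()
```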
One to One relations are usually mapped via Foreign Keys linking the two tables together. A third mapping table is only required for Many to Many relationships. So, you should ideally have a Foreign Key Country_ID in your Users table.
Your SELECT query would then look like
SELECT * FROM Users
WHERE Country_ID = (
SELECT Country_ID FROM Countries
WHERE Country_Name = 'USA'
);
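For comparison, here is a small sketch of the two-table layout with the foreign key in the users table, run against Python's sqlite3 module instead of MySQL. The index name idx_users_country and the sample rows are invented; the index is my own addition so that the search by country does not have to scan the whole table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript("""
    CREATE TABLE countries (
        country_id INTEGER PRIMARY KEY,
        country_name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        first_name TEXT NOT NULL,
        last_name TEXT NOT NULL,
        -- Nullable, so a user may leave the country unspecified.
        country_id INTEGER REFERENCES countries(country_id)
    );
    -- Hypothetical index to speed up "users by country" searches.
    CREATE INDEX idx_users_country ON users (country_id);

    INSERT INTO countries VALUES (1, 'USA'), (2, 'France');
    INSERT INTO users VALUES
        (1, 'Ann', 'Lee', 1),
        (2, 'Bob', 'Roy', 2),
        (3, 'Cy', 'Poe', 1);
""")

usa_users = [
    row[0]
    for row in conn.execute("""
        SELECT first_name FROM users
        WHERE country_id = (
            SELECT country_id FROM countries
            WHERE country_name = 'USA'
        )
        ORDER BY id
    """)
]
```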

Database schema for storing ints

I'm really new to databases so please bear with me.
I have a website where people can go to request tickets to an upcoming concert. Users can request tickets for either New York or Dallas. Similarly, for each of those locales, they can request either a VIP ticket or a regular ticket.
I need a database to keep track of how many people have requested each type of ticket (VIP and NY or VIP and Dallas or Regular and NY or Regular and Dallas). This way, I won't run out of tickets.
What schema should I use for this database? Should I have one row and then 4 columns (VIP&NY, VIP&Dallas, Regular&NY and Regular&Dallas)? The problem with this is it doesn't seem very flexible, thus I'm not sure if it's good design.
You should have one column containing a quantity, a column that specifies the type (VIP), and another that specifies the city.
To make it flexible you would do:
Table: location
Columns:
location_id integer
description varchar
Table: type
Columns:
type_id integer
description varchar
Table: purchases
Columns:
purchase_id integer
type_id integer
location_id integer
This way you can add more cities and more types, and you always insert them into purchases.
When you want to know how many you sold, you count them.
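Counting the requests per city and ticket type with this layout could look like the sketch below, using Python's sqlite3 module as a stand-in database. The sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE location (
        location_id INTEGER PRIMARY KEY,
        description TEXT NOT NULL
    );
    CREATE TABLE type (
        type_id INTEGER PRIMARY KEY,
        description TEXT NOT NULL
    );
    CREATE TABLE purchases (
        purchase_id INTEGER PRIMARY KEY,
        type_id INTEGER NOT NULL REFERENCES type(type_id),
        location_id INTEGER NOT NULL REFERENCES location(location_id)
    );
    INSERT INTO location VALUES (1, 'New York'), (2, 'Dallas');
    INSERT INTO type VALUES (1, 'VIP'), (2, 'Regular');
    -- One row per requested ticket.
    INSERT INTO purchases (type_id, location_id)
        VALUES (1, 1), (1, 1), (2, 1), (2, 2);
""")

# Count the rows per (city, ticket type) combination.
counts = {
    (city, ticket_type): n
    for city, ticket_type, n in conn.execute("""
        SELECT l.description, t.description, COUNT(*)
        FROM purchases p
        JOIN location l ON l.location_id = p.location_id
        JOIN type t ON t.type_id = p.type_id
        GROUP BY l.description, t.description
    """)
}
```

Combinations with no requests simply do not appear in the result, so a lookup such as counts.get(("Dallas", "VIP"), 0) is one way to treat them as zero.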
What you want to do is have one table with cities and one table with ticket types.
Then you create a weak association with [city, ticket type, number of tickets].
That table will have 2 foreign keys, therefore "weak".
But this enables you to add or remove cities etc. You can also add a table for concerts, and your weak table will then have another foreign key, "concert".
I think this is the most correct way to do it.
CREATE TABLE `tickets` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`locale` varchar(45) NOT NULL,
`ticket_type` varchar(45) NOT NULL,
PRIMARY KEY (`id`)
);
This is a simple representation of your table. Ideally you would have separate tables for locale and type. And your table would look like this:
CREATE TABLE `tickets` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`locale_id` int(11) NOT NULL,
`ticket_type_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
);

Database Tables - normalized enough?

Here are two tables I designed for managing user accounts.
create table if not exists users (
id int unsigned not null auto_increment,
username varchar(100) not null,
password binary(60) not null,
first_name varchar(100) not null,
last_name varchar(100) not null,
role_id int unsigned not null,
primary key(id),
unique(username)
);
create table if not exists roles (
id int unsigned not null auto_increment,
role varchar(100) not null,
primary key(id),
unique(role)
);
I think I need to normalize the first table, e.g. splitting the first table into some sort of user_info(first_name, last_name, ...) and account (username, password, role_id). The problem I have is that I am very uncertain of why I need to do this, as I can't really explain why it isn't in 3NF.
EDIT
A user can only have exactly one role (admin, poweruser, user).
You only need to separate the user information and account information if a user can have multiple accounts or an account can have multiple users. If the user-to-account relationship is always 1-to-1, then you're normalized as is.
Occasionally it makes sense to separate out columns in a 1-to-1 relationship if the columns in the second table will be used rarely. However, in this case, it seems as though both tables would always be populated, so there's nothing to be gained by separating those columns.
Decompose the users table further only if it's allowable to have a user id and username without a corresponding first name and last name. Otherwise it looks like your tables are already in 5NF.
I'm not an SQL expert, but these tables look very normalized to me. Normalizing a table can save space:
If you store the role name in a column and you have 20 users with 5 roles, each role name using 10 bytes, you will have 20 * 10 bytes = 200 bytes.
But if you normalize the table, as you have already done, you will only need 5 * 10 bytes = 50 bytes for the role names, 5 * 1 byte = 5 bytes for the ids in the role table and 20 * 1 byte = 20 bytes for the ids in the user table.
200 bytes not normalized
50 bytes + 20 bytes + 5 bytes = 75 bytes in normalized form.
This is only a very incomplete and basic calculation to show the background.
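The same back-of-the-envelope calculation written out in Python, in case you want to play with the numbers. The sizes are the simplified assumptions from above (10 bytes per role name, 1 byte per id), not real storage figures.

```python
USERS = 20
ROLES = 5
ROLE_NAME_BYTES = 10  # assumed size of one role name
ID_BYTES = 1          # assumed size of one id

# Role name repeated in every user row.
denormalized = USERS * ROLE_NAME_BYTES

# Role names stored once, plus an id per role and an id per user.
normalized = ROLES * ROLE_NAME_BYTES + ROLES * ID_BYTES + USERS * ID_BYTES

saved = denormalized - normalized
```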