Generate ID Keys for MySQL Values

Generate ID Keys for MySQL Values - mysql

I currently have a MySQL table that is formatted as such -
What I would like to do is move this data to a new table, but instead of having VARCHAR's for the part, location, and customer data, I would like to assign each of them autoincrementing id's based on the value. For example, part "DEF" would have an id of 1 and part "GHI" would have an id of 2. This is what the table would look like -
Is there an SQL query to do this?

You want the values in the part and loc columns to be auto-incrementing integers, or you have type tables for part and loc respectively with auto-incrementing integers?
Option-1:
Create the new table with a different name than the old one. Insert the entries into the new table from the old table mapping values to integers as you go.
INSERT INTO new_table_name (part, loc, quan, date, customer)
SELECT CASE
WHEN part = 'DEF' THEN 1
WHEN part = 'GHI' THEN 2
END
, CASE
WHEN loc = '...' THEN 1
WHEN loc = '...' THEN 2
WHEN loc = '...' THEN 3
END
, quan
, date
, customer
FROM original_table
Option-2:
The following is a sample type table for part:
If you have a type table for part and loc, you can do something like this...
SELECT part.id
, loc.id
, quan
, date
, customer
FROM original_table orig INNER JOIN part prt
ON orig.part = prt.value
INNER JOIN loc
ON orig.loc = loc.value
As far as I know, there is no way to use the auto-increment feature to directly generate values for the table you described.

It is good practice to create new tables and populate it rather than trying to change the existing one (Especially, if your application is live and you are dealing with customer data).
I would suggest you to create a new schema as follows:
Schema Diagram
You can populate all the new tables using your existing table and assign all the primary keys to Auto-increment.
This helps you to scale your application and maintain it easily.

Related

Copy data from few tables using SQL script

So let's say I have a table Car and it has primary key ID, columns BrandID (ref to table Brand), Price, Comment and others.
.
The thing I need to do is to copy columns Price and Comment to the new table.
But also for every Car element I need to go to Brand table and get specific Brand Name depending on BrandID value and also copy it to the new table
How can I accomplish this via SQL script?

Create the new table direct from the SELECT statement with JOINed tbales
CREATE TABLE NEW_TABLE SELECT Price, Comment, name FROM car c INNER JOIN brand ON b.ID = c.BrandID

You can use
CREATE TABLE ... SELECT statement...
Take a look at this mysql document for details
https://dev.mysql.com/doc/refman/8.0/en/create-table-select.html

Create Table (new columns) and columns from different table

So For example. I have 1 table
and the name of the table is Suppliers
Contains :
1. SupplierName
2. SupplierID
I want to create another new table name Contracts
which contain new columns for
1. ContractID (new column)
2. SupplierID(from "Suppliers" table)
3. ContractValue (new column)
How do i do it?
I have researched and most of them told me to use Create table and then select, But it wont work and also ive tried alter table but still not working.
CREATE TABLE Contracts (
ContractID INT NOT NULL,
SELECT SupplierID
FROM Suppliers,
ContractValue INT NOT NULL,
ContractStart DATE NOT NULL)
These codes are not working so I'm not sure what is the solution.
CREATE TABLE Contracts (
ContractID INT NOT NULL,
(SELECT SupplierID
FROM Suppliers),
ContractValue INT NOT NULL,
ContractStart DATE NOT NULL)
I expect the result to be new table with ContractID (new column), SupplierID (from table Suppliers) and another new column named ContractValue

Think of Select query result set as a table or data grid.
So "SELECT [some fields] FROM [some table]" returns data grid where each row contains some fields from the table.
Therefore you can define table as select query with data OR alternatively specify the structure and create empty table. Most likely you don't want to mix those two approaches.
In your case, SupplierID field of contract table is a reference to SupplierID of Supplier table. In SQL it's called "foreign key". Theoretically you can use select statement in order to create new table and when you play a lot with database queries, you'll choose most convenient and faster way depending on your needs.
But when you start learning, it's better to create an empty table with structure and then insert data using new fields and existing data for the foreign key.
Therefore, the query will be something like:
CREATE TABLE Contracts (
ContractID INT NOT NULL,
SupplierID INT NOT NULL,
ContractValue INT,
ContractStart DATE
);
And then you can insert data using existing values from supplier table:
INSERT INTO Contracts (SupplierID)
SELECT SupplierID FROM Suppliers
Of course this is very simplified description
First, you have to specify ContractID as primary key. Then the query above will work only if you specify primary key as auto increment value, otherwise you have to use some logic and specify it explicitly.
In addition you have to specify default values if you want to use NOT NULL fields.
You can also specify SupplierID as foreign key, so only existing values will be added and some other integrity relationships will be supported.
See any MySQL or SQL documentation for details.

I don't know whether the below way could solve your problem
Make a copy of Suppliers table
Delete unnecessary column from the copied table
Add new column that you want to it.

You can use CTAS command.
CREATE TABLE Contracts as
SELECT
0 as ContractID,
SupplierID,
0 as ContractValue,
now() as ContractStart
FROM Suppliers;
This will create a table with all fields. The default value is to specify the dataType. You can update the table with relevant value or have a join in the select clause itself.

The basic syntax for creating a table from another table is as follows
CREATE TABLE NEW_TABLE_NAME AS
SELECT [ column1, column2...columnN ]
FROM EXISTING_TABLE_NAME
[ WHERE ]
Here, column1, column2... are the fields of the existing table and the same would be used to create fields of the new table.
Example
Following is an example, which would create a table SALARY using the CUSTOMERS table and having the fields customer ID and customer SALARY −
SQL> CREATE TABLE SALARY AS
SELECT ID, SALARY
FROM CUSTOMERS;

last week I did, as you want to do.
Only two steps I was followed:
Export existing table.
Open in notepad++ and change the existing table name, add my new columns and Import.
Thanks

Convert an EAV table to JSON update query

I have a simple EAV table, that I want to convert to JSON/B and insert it into a column that I will add to the entity table.
This is meant to be used as a migration query.
My EAV :
Record ( id, ... )
RecInfos ( recordid, key, value )
For earch entry in the record table, it will create a json representation of each key / value that is can be found in the RecInfos table, and this will be send as an update on the Record table.
I am using postgresql 10.3

Here is what I was searching for :
update
record r
set
infos = (
select
json_agg(json_build_object('name',i.name,'value',i.value))
from
recinfos i
where
i.rec_id = r.id
)

Create table by different columns inserts

I have a source(web pages) that have common data and uncommon data that which I need to store in one table.
The data can look like this:
model: xyz, attr_1: xyz, attr_2: xyz
model: xyz, attr_3: xyz, attr_4: xyz
model: xyz, attr_1: xyz, attr_4: xyz
model: xyz, attr_1: xyz, attr_5: xyz
model: xyz, attr_15: xyz, attr_20: xyz
This data will generate this DML:
insert into table (model, attr_1, attr_2)values('xyz','xyz','xyz');
insert into table (model, attr_3, attr_4)values('xyz','xyz','xyz');
insert into table (model, attr_1, attr_4)values('xyz','xyz','xyz');
insert into table (model, attr_1, attr_5)values('xyz','xyz','xyz');
insert into table (model, attr_15, attr_20)values('xyz','xyz','xyz');
My problem is that I can't define the table before the insert commands so I can't know the columns and in every new insert I may discover new columns. I can't get all the insert commands before the actual insert. The only thing I think of is to insert every row to different table (using create table as insert into) and then use UNION ALL to create the final table. But this sound not so good idea.
EDIT I don't looking for normalized table.
The end result should be(as for the example):
table_name
id int
model varchar
attr_1 varchar
attr_2 varchar
attr_3 varchar
attr_4 varchar
attr_5 varchar
attr_15 varchar
attr_20 varchar

There's a really simple solution to this. You need to change your table:
table: model
modelName attribute value
xyz 1 xyz
xyz 2 xyz
Then when you do the INSERT, you would do:
INSERT INTO `model` (`modelName`, `attribute`, `value`) VALUES ('xyz', 1, 'xyz')
This is a normalized table structure that allows for n amount of attributes.

If you use an Array to get your data then you could use PHP's implode(', ', $array). But, you may not be using PHP. If that's the case you could always just concatenate what you're INSERTing with ,.

Right solution is to normalize your schema.
Create 2 tables: master table for main model - pretty much what you have now, but without attributes, and slave table to keep attributes. Something like this:
CREATE TABLE master (
master_id INTEGER PRIMARY KEY AUTOINCREMENT,
model VARCHAR(50)
);
CREATE TABLE attrs (
attr_id INTEGER PRIMARY KEY AUTOINCREMENT,
master_id INTEGER NOT NULL,
attr_name VARCHAR(20)
);
This schema is rather compact and has some important properties. For example, it allows you to keep arbitrary number of attributes associated with given model - it could be 0, or it could be 1000.
To insert data, you will need insert in master table first, and then to attrs table.
To retrieve data, use simple join like this:
SELECT m.model,
a.attr_name
FROM master m
JOIN attrs a ON m.model_id = a.model_id
WHERE ...

How to optimise change history data for MySQL

The previous table this data was stored in approached 3-4gb, but the data wasn't compressed before/after storage. I'm not a DBA so I'm a little out of my depth with a good strategy.
The table is to log changes to a particular model in my application (user profiles), but with one tricky requirement: we should be able to fetch the state of a profile at any given date.
Data (single table):
id, username, email, first_name, last_name, website, avatar_url, address, city, zip, phone
The only two requirements:
be able to fetch a list of changes for a given model
be able to fetch state of model on a given date
Previously, all of the profile data was stored for a single change, even if only one column was changed. But to get a 'snapshot' for a particular date was easy enough.
My first couple of solutions in optimising the data structure:
(1) only store changed columns. This would drastically reduce data stored, but would make it quite complicated to get a snapshot of data. I'd have to merge all changes up to a given date (could be thousands), then apply that to a model. But that model couldn't be a fresh model (only changed data is stored). To do this, I'd have to first copy over all data from current profiles table, then to get snapshot apply changes to those base models.
(2) store whole of data, but convert to a compressed format like gzip or binary or whatnot. This would remove ability to query the data other than to obtain changes. I couldn't, for example, fetch all changes where email = ''. I would essentially have a single column with converted data, storing the whole of the profile.
Then, I would want to use relevant MySQL table options, like ARCHIVE to further reduce space.
So my question is, are there any other options which you feel are a better approach than 1/2 above, and, if not, which would be better?

First of all, I wouldn't worry at all about a 3GB table (unless it grew to this size in a very short period of time). MySQL can take it. Space shouldn't be a concern, keep in mind that a 500 GB hard disk costs about 4 man-hours (in my country).
That being said, in order to lower your storage requirements, create one table for each field of the table you want to monitor. Assuming a profile table like this:
CREATE TABLE profile (
profile_id INT PRIMARY KEY,
username VARCHAR(50),
email VARCHAR(50) -- and so on
);
... create two history tables:
CREATE TABLE profile_history_username (
profile_id INT NOT NULL,
username VARCHAR(50) NOT NULL, -- same type as profile.username
changedAt DATETIME NOT NULL,
PRIMARY KEY (profile_id, changedAt),
CONSTRAINT profile_id_username_fk
FOREIGN KEY profile_id_fkx (profile_id)
REFERENCES profile(profile_id)
);
CREATE TABLE profile_history_email (
profile_id INT NOT NULL,
email VARCHAR(50) NOT NULL, -- same type as profile.email
changedAt DATETIME NOT NULL,
PRIMARY KEY (profile_id, changedAt),
CONSTRAINT profile_id_fk
FOREIGN KEY profile_id_email_fkx (profile_id)
REFERENCES profile(profile_id)
);
Everytime you change one or more fields in profile, log the change in each relevant history table:
START TRANSACTION;
-- lock all tables
SELECT #now := NOW()
FROM profile
JOIN profile_history_email USING (profile_id)
WHERE profile_id = [a profile_id]
FOR UPDATE;
-- update main table, log change
UPDATE profile SET email = [new email] WHERE profile_id = [a profile_id];
INSERT INTO profile_history_email VALUES ([a profile_id], [new email], #now);
COMMIT;
You may also want to set appropriate AFTER triggers on profile so as to populate the history tables automatically.
Retrieving history information should be straightforward. In order to get the state of a profile at a given point in time, use this query:
SELECT
(
SELECT username FROM profile_history_username
WHERE profile_id = [a profile_id] AND changedAt = (
SELECT MAX(changedAt) FROM profile_history_username
WHERE profile_id = [a profile_id] AND changedAt <= [snapshot date]
)
) AS username,
(
SELECT email FROM profile_history_email
WHERE profile_id = [a profile_id] AND changedAt = (
SELECT MAX(changedAt) FROM profile_history_email
WHERE profile_id = [a profile_id] AND changedAt <= [snapshot date]
)
) AS email;

You can't compress the data without having to uncompress it in order to search it - which is going to severely damage the performance. If the data really is changing that often (i.e. more than an average of 20 times per record) then it would be more efficient to for storage and retrieval to structure it as a series of changes:
Consider:
CREATE TABLE profile (
id INT NOT NULL autoincrement,
PRIMARY KEY (id);
);
CREATE TABLE profile_data (
profile_id INT NOT NULL,
attr ENUM('username', 'email', 'first_name'
, 'last_name', 'website', 'avatar_url'
, 'address', 'city', 'zip', 'phone') NOT NULL,
value CARCHAR(255),
starttime DATETIME DEFAULT CURRENT_TIME,
endtime DATETIME,
PRIMARY KEY (profile_id, attr, starttime)
INDEX(profile_id),
FOREIGN KEY (profile_id) REFERENCES profile(id)
);
When you add a new value for an existing record, set an endtime in the masked record.
Then to get the value at a date $T:
SELECT p.id, attr, value
FROM profile p
INNER JOIN profile_date d
ON p.id=d.profile_id
WHERE $T>=starttime
AND $T<=IF(endtime IS NULL,$T, endtime);
Alternately just have a start time, and:
SELECT p.id, attr, value
FROM profile p
INNER JOIN profile_date d
ON p.id=d.profile_id
WHERE $T>=starttime
AND NOT EXISTS (SELECT 1
FROM prodile_data d2
WHERE d2.profile_id=d.profile_id
AND d2.attr=d.attr
AND d2.starttime>d.starttime
AND d2.starttime>$T);
(which will be even faster with the MAX concat trick).
But if the data is not changing with that frequency then keep it in the current structure.

You need a slow changing dimension:
i will do this only for e-mail and telephone so you understand (pay attention to the fact of i use two keys, 1 as unique in the table, and another that is unique to the user that it concerns. This is, the table key identifies the the record, and the user key identifies the user):
table_id, user_id, email, telephone, created_at,inactive_at,is_current
1, 1, mario#yahoo.it, 123456, 2012-01-02, , 2013-04-01, no
2, 2, erik#telecom.de, 123457, 2012-01-03, 2013-02-28, no
3, 3, vanessa#o2.de, 1234568, 2012-01-03, null, yes
4, 2, erik#telecom.de, 123459, 2012-02-28, null, yes
5, 1, super.mario#yahoo.it, 654321,2013-04-01, 2013-04-02, no
6, 1, super.mario#yahoo.it, 123456,2013-04-02, null, yes
most recent state of the database
select * from FooTable where inactive_at is null
or
select * from FooTable where is_current = 'yes'
All changes to mario (mario is user_id 1)
select * from FooTable where user_id = 1;
All changes between 1 jan 2013 and 1 of may 2013
select * from FooTable where created_at between '2013-01-01' and '2013-05-01';
and you need to compare with the old versions (with the help of a stored procedure, java or php code... you chose)
select * from FooTable where incative_at between '2013-01-01' and '2013-05-01';
if you want you can do a fancy sql statement
select f1.table_id, f1.user_id,
case when f1.email = f2.email then 'NO_CHANGE' else concat(f1.email , ' -> ', f2.email) end,
case when f1.phone = f2.phone then 'NO_CHANGE' else concat(f1.phone , ' -> ', f2.phone) end
from FooTable f1 inner join FooTable f2
on(f1.user_id = f2.user_id)
where f2.created_at in
(select max(f3.created_at) from Footable f3 where f3.user_id = f1.user_id
and f3.created_at < f1.created_at and f1.user_id=f3.user_id)
and f1.created_at between '2013-01-01' and '2013-05-01' ;
As you can see a juicy query, to compare the user_with the previews user row...
the state of the database on 2013-03-01
select * from FooTable where table_id in
(select max(table_id) from FooTable where inactive_at <= '2013-03-01' group by user_id
union
select id from FooTable where inactive_at is null group by user_id having count(table_id) =1 );
I think this is the easiest way of implement what you want... you could implement a multi-million tables relational model, but then it would be a pain in the arse to query it
Your database is not big enough, I work everyday with one even bigger. Now tell me is the money you save in a new server worthy the time you spend on a super-complex relational model?
BTW if the data changes too fast, this approach cannot be used...
BONUS: optimization:
create indexes on created_at, inactive_at, user_id and the pair
perform partition (both horizontal and vertical)

if you try and put all occurring changes in different tables and later if you require an instance on some date you join them along and display by comparing dates, for example if you want an instance at 1st of july you can run a query with condition where date is equal or less than 1st of july and order it in asc ordering limiting the count to 1. that way the joins will produce exactly the instance it was at 1st of july. in this manner you can even figure out the most frequently updated module.
also if you want to keep all the data flat try range partitioning on the basis of month that way mysql will handle it pretty easily.
Note: by date i mean storing unix timestamp of the date its pretty easier to compare.

I'll offer one more solution just for variety.
Schema
PROFILE
id INT PRIMARY KEY,
username VARCHAR(50) NOT NULL UNIQUE
PROFILE_ATTRIBUTE
id INT PRIMARY KEY,
profile_id INT NOT NULL FOREIGN KEY REFERENCES PROFILE (id),
attribute_name VARCHAR(50) NOT NULL,
attribute_value VARCHAR(255) NULL,
created_at DATETIME NOT NULL DEFAULT GETTIME(),
replaced_at DATETIME NULL
For all attributes you are tracking, simply add PROFILE_ATTRIBUTE records when they are updated, and mark the previous attribute record with the DATETIME it was replaced at.
Select Current Profile
SELECT *
FROM PROFILE p
LEFT JOIN PROFILE_ATTRIBUTE pa
ON p.id = pa.profile_id
WHERE p.username = 'username'
AND pa.replaced_at IS NULL
Select Profile At Date
SELECT *
FROM PROFILE p
LEFT JOIN PROFIILE_ATTRIBUTE pa
ON p.id = pa.profile_id
WHERE p.username = 'username'
AND pa.created_at < '2013-07-01'
AND '2013-07-01' <= IFNULL(pa.replaced_at, GETTIME())
When Updating Attributes
Insert the new attribute
Update the previous attribute's replaced_at value
It would probably be important that the created_at for a new attribute match the replaced_at for the corresponding old attribute. This would be so that there is an unbroken timeline of attribute values for a given attribute name.
Advantages
Simple two-table architecture (I personally don't like a table-per-field approach)
Can add additional attributes with no schema changes
Easily mapped into ORM systems, assuming an application lives on top of this database
Could easily see the history for a certain attribute_name over time.
Disadvantages
Integrity is not enforced. For example, the schema doesn't restrict on multiple NULL replaced_at records with the same attribute_name... perhaps this could be enforced with a two-column UNIQUE constraint
Let's say you add a new field in the future. Existing profiles would not select a value for the new field until they save a value to it. This is opposed to the value coming back as NULL if it were a column. This may or may not be an issue.
If you use this approach, be sure you have indexes on the created_at and replaced_at columns.
There may be other advantages or disadvantages. If commenters have input, I'll update this answer with more information.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008