Importing CSV to MySQL while looking up foreign key - mysql

Please excuse any syntax errors in my examples; I am new to SQL.
For this question, let us suppose I have this hypothetical structure:
authors_list:
author_id INT NOT_NULL AUTO_INCREMENT PRIMARY
author_name VARCHAR(30) NOT_NULL
books_list:
book_id INT NOT_NULL AUTO_INCREMENT PRIMARY
book_author_id INT NOT_NULL FOREIGN_KEY(authors_list.author_id)
book_name VARCHAR(30) NOT_NULL
Generally when importing books, I would only know the book name and author name. I have finally figured out how to insert into books_list using only this data:
INSERT INTO `books_list`(`book_author_id`, `book_name`) VALUES ((SELECT `author_id` FROM `authors_list` WHERE `author_name` = 'SomeAuthorName'), 'SomeBookName')
However, I have a .csv file which only contains the columns author_name and book_name. I have previously been importing .csv files with phpMyAdmin, but those tables did not have foreign keys. Is there any way to import a .csv of the form described using this "on the fly lookup" functionality?

You can use SQL directly with LOAD DATA INFILE: http://dev.mysql.com/doc/refman/5.1/en/load-data.html
If you need more logic than id generation, you can import the data into another table first and then write a script or stored procedure that copies it from that table into books_list, applying whatever custom logic you need.
If your current approach works, keep using it; there will probably be a limit on the amount of data it can handle. If you hit that limit, use the staging-table approach sketched below.
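For example, a rough sketch of that staging-table import (the staging table name, column sizes and file name below are placeholders, not part of the original question):
-- 1. Load the raw CSV (author_name, book_name) into a staging table.
CREATE TEMPORARY TABLE books_staging (
  author_name VARCHAR(30) NOT NULL,
  book_name VARCHAR(30) NOT NULL
);
LOAD DATA LOCAL INFILE 'books.csv'
INTO TABLE books_staging
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 LINES -- skip the header row, if the file has one
(author_name, book_name);
-- 2. Copy into books_list, looking up the foreign key on the fly.
INSERT INTO books_list (book_author_id, book_name)
SELECT a.author_id, s.book_name
FROM books_staging s
JOIN authors_list a ON a.author_name = s.author_name;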

Related

Cannot make liquibase execute more than 1 script in one file

I have a Spring Boot application and an SQLite db.
If I put two statements in one file, only the first one changes the db: the table is created but no data is inserted, even though the log shows that both statements were executed.
If I put each statement in a separate file, everything is fine: the table is created and the data is inserted.
How can I make Liquibase execute more than one statement in a single file?
CREATE TABLE IF NOT EXISTS acl_class
(
id INTEGER PRIMARY KEY AUTOINCREMENT,
class character varying UNIQUE NOT NULL
);
INSERT INTO acl_class
(class)
VALUES
('models.User');
The same problem occurs with a MySQL db, except that there my app fails with the runtime error "Caused by: liquibase.exception.DatabaseException: You have an error in your SQL syntax;". The syntax is fine, though, and each table is created without any problem when run separately:
CREATE TABLE IF NOT EXISTS acl_class
(
id INT AUTO_INCREMENT PRIMARY KEY,
class VARCHAR(255) UNIQUE NOT NULL
);
CREATE TABLE IF NOT EXISTS acl_sid
(
id INT AUTO_INCREMENT PRIMARY KEY,
principal boolean NOT NULL,
sid VARCHAR(255) UNIQUE NOT NULL
);
Thanks to SteveDonie the problem was solved easily: I just added these two lines to the beginning of my file containing multiple statements:
--liquibase formatted sql
--changeset myname:create-multiple-tables splitStatements:true endDelimiter:;
and it worked!
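So the start of the combined changelog file would look roughly like this (reusing the MySQL statements from the question):
--liquibase formatted sql
--changeset myname:create-multiple-tables splitStatements:true endDelimiter:;
CREATE TABLE IF NOT EXISTS acl_class
(
id INT AUTO_INCREMENT PRIMARY KEY,
class VARCHAR(255) UNIQUE NOT NULL
);
-- The CREATE TABLE for acl_sid (and any further statements) follows in the same
-- file; with splitStatements:true Liquibase splits on the ; delimiter and runs
-- the statements one by one.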

Importing csv file into single column of a MySQL table

I have googled this a lot, and I have not found anything matching my problem.
I have a lot of time series containing readings from different sensors. Each time series is stored in a .csv file, so each file contains a single column.
I have to populate this MySQL table:
CREATE TABLE scheme.sensor_readings (
id int unsigned not null auto_increment,
sensor_id int unsigned not null,
date_created datetime,
reading_value double,
PRIMARY KEY(id),
FOREIGN KEY (sensor_id) REFERENCES scheme.sensors (id) ON DELETE CASCADE
) ENGINE = InnoDB;
while the sensors table is:
CREATE TABLE scheme.sensors (
id int unsigned not null auto_increment,
sensor_title varchar(255) not null,
description varchar(255) not null,
date_created datetime,
PRIMARY KEY(id)
) ENGINE = InnoDB;
Now, I need to fill the reading_value field with the values contained in the .csv files described above. An example of this kind of file:
START INFO
Recording Time *timestamp*
Oil Pressure dt: 1,000000 sec
STOP INFO
0,445328
0,429459
0,4245
0,445099
0,432434
0,433426
...
EOF
What I need is an SQL query that populates this table with the values read from a .csv file.
I cannot figure out how to proceed: should I use some sort of temporary table as a buffer?
I use HeidiSQL as Client.
The kind of tool you are looking for is called an ETL (Extract, Transform, Load) tool.
You can extract data from csv files (among others), transform it by adding the info from the sensors db-table (among other things), and load it into the sensor_readings db-table.
There are plenty of ETL tools on the market. Although I should stay agnostic, if you want something free, easy to learn and likely to cover your future needs, you may start by evaluating PDI (Pentaho Data Integrator, nicknamed Kettle). Go there, download the latest Data Integrator, unzip it and run spoon.bat / spoon.sh. There is a nice getting-started guide, and questions tagged Pentaho Data Integration on Stack Overflow usually get answered quite quickly.
Alternatively you may try Talend or plenty of others.
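If you would rather stay in plain SQL with a temporary table as a buffer (as the question suggests), a minimal sketch could look like the one below. It assumes the four-line header is always present, that the sensor row already exists with a known id (42 here is just a placeholder), and it simply uses NOW() for date_created; deriving real timestamps from the Recording Time and dt header values would need extra logic.
-- Staging table: one raw text column per CSV line (names are placeholders).
CREATE TEMPORARY TABLE readings_staging (raw_value VARCHAR(32));
LOAD DATA LOCAL INFILE 'oil_pressure.csv'
INTO TABLE readings_staging
LINES TERMINATED BY '\n'
IGNORE 4 LINES -- skip the START INFO .. STOP INFO header
(raw_value);
-- Copy into the real table, converting the decimal comma and skipping the EOF marker.
INSERT INTO scheme.sensor_readings (sensor_id, date_created, reading_value)
SELECT 42, NOW(), CAST(REPLACE(raw_value, ',', '.') AS DECIMAL(12,6))
FROM readings_staging
WHERE raw_value <> 'EOF';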

Can I populate a database from another database

I'm trying to create a data warehouse.
Is it possible to populate a table in db1 from data in db2?
For example
Corporate Database Table Route
CREATE TABLE ROUTE (
RouteID INTEGER(4) PRIMARY KEY,
RouteName VARCHAR (50) NOT NULL,
BoardingStop VARCHAR (50) NOT NULL,
AlightingStop VARCHAR (50) NOT NULL
);
Insert Information
INSERT INTO `ROUTE` (`RouteID`,`RouteName`,`BoardingStop`,`AlightingStop`)
VALUES (1,"ab","B","C")
Data warehouse table dimRoute
CREATE TABLE DimROUTE (
RouteID INTEGER(4),
RouteName VARCHAR (50) NOT NULL,
BoardingStop VARCHAR (50) NOT NULL,
AlightingStop VARCHAR (50) NOT NULL,
PRIMARY KEY(RouteID)
);
Populate the above table with data from the first table.
You can copy from one table into another table with INSERT INTO ... SELECT. See docs here: http://dev.mysql.com/doc/refman/5.7/en/insert-select.html
You can copy between tables in different databases on the same MySQL instance, provided you have privileges on both databases. Just use databasename.tablename syntax:
INSERT INTO warehouse.DimRoute
SELECT * FROM corporate.Route;
If the databases are hosted on different MySQL instances, you can dump data from the corporate instance and import to the data warehouse instance using mysqldump. Since your table is named differently in the data warehouse, this is a little bit tricky.
You could restore the data to its original table name, and then rename the table:
$ mysqldump --host=corporate corp_dbname ROUTE > route-dump.sql
$ mysql --host=datawarehouse dw_dbname < route-dump.sql
$ mysql --host=datawarehouse -e "RENAME TABLE ROUTE TO DimROUTE" dw_dbname
(I'm leaving out user/password options for brevity, but I suggest you use the config file for those.)
You just need a couple of queries to clone a table (with its indexes and keys) then populate it with the records:
CREATE TABLE DimROUTE LIKE ROUTE;
INSERT DimROUTE SELECT * FROM ROUTE;
Demo SQL Fiddle
Yes, you can. The technique you want is called Extract, Transform and Load (ETL). There are a number of tools you can use, which will help you automate and organise the process. Or you can roll your own solution.
It is quite common for reporting databases to be fed from other databases in this fashion.

Concatenating a string to an auto-incremented column which functions as primary key

I'm having some trouble putting together a table with a unique value. My current setup has two tables which, for all intents and purposes, can be the same as the one below. Because of redundancies in the data pulls I'm trying to use the auto-incremented value as the primary key, but since there are two tables I want to concatenate a string to the auto-incremented value, so my ID column would be:
Boop1, Boop2, Boop3 and Beep1, Beep2, Beep3 instead of 1, 2, 3 in both tables. That way the two tables are differentiated and do not end up with duplicate values when I add constraints.
CREATE TABLE IF NOT EXISTS `beep`.`boop` (
`ID` INT NOT NULL AUTO_INCREMENT,
`a` VARCHAR(15) NOT NULL,
`b` VARCHAR(255) NOT NULL,
`c` VARCHAR(50) NOT NULL,
`d` VARCHAR(50) NOT NULL,
PRIMARY KEY(`ID`))
ENGINE = InnoDB;
LOAD DATA LOCAL INFILE 'blah.csv' INTO TABLE `beep`.`boop`
FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' IGNORE 1 LINES SET DCMID = CONCAT('DCM'+`DCMID`);
The SET clause in the statement above is optional and was only there to try the concatenation, which I already know does not work.
I realize this cannot work since my datatype is an INT, so what would I have to do to keep my auto-increment while differentiating the two tables?
For reference, I am using LOAD DATA LOCAL INFILE and not INSERT (and I don't think bulk insert is available with MySQL Workbench); otherwise I would bulk insert and just use LAST_INSERT_ID().
The goal is plug and play for a data pull I perform, so I can archive my data quickly and run queries against it later. Using one INSERT per row of data would be extremely inefficient.
I was trying a DELIMITER-based trigger earlier, which in theory would have worked by altering the table after the LOAD DATA INFILE, but that requires SUPER privileges, which I do not have.
Is what I'm asking for even possible, or should I give up and find a workaround, or try to get SUPER privileges and use the trigger approach?
I'm not sure why you would do that, but you could use two different indexes. The first one is the auto-increment, and is populated by MySQL. The second one is your "prefixed" key, and is created by a trigger called after insert, where you update the column based on the first key and the prefix that you want.
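One practical note: a MySQL trigger cannot update the table it fires on, and an AFTER INSERT trigger cannot change the row that was just inserted, so the prefixed column is often easier to fill with an ordinary UPDATE run right after the LOAD DATA. A minimal sketch of that variant, assuming the boop table from the question and a hypothetical prefixed_id column:
-- Add a column to hold the prefixed key, alongside the numeric auto-increment.
ALTER TABLE `beep`.`boop` ADD COLUMN `prefixed_id` VARCHAR(20) UNIQUE;
-- After LOAD DATA has run, derive the prefixed value from the auto-incremented ID.
UPDATE `beep`.`boop`
SET `prefixed_id` = CONCAT('Boop', `ID`)
WHERE `prefixed_id` IS NULL;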

Questions about FriendFeed's MySql SchemaLess Design

Bret Taylor discussed the SchemaLess Design in this blog post: http://bret.appspot.com/entry/how-friendfeed-uses-mysql
It looks like they store objects of different classes in a single table, and then build additional index tables.
My question is: how do you build an index for one class?
For example, a user's blog post is {id, userid, title, body}, and a user's tweet is {id, userid, tweet}.
If I want to build an index over users' blog posts, how can I do that?
It's very simple -- perhaps simpler than you expect.
When you store a blog entity, you're going to insert to the main entities table of course. A blog goes like this:
CREATE TABLE entities (
id INT AUTO_INCREMENT PRIMARY KEY,
entity_json TEXT NOT NULL
);
INSERT INTO entities (id, entity_json) VALUES (DEFAULT,
'{userid: 8675309,
post_date: "2010-07-27",
title: "MySQL is NoSQL",
body: ... }'
);
You also insert into a separate index table for each logical type of attribute. Using your example, the userid for a blog is not the same as a userid for a tweet. Since you just inserted a blog, you then insert into index table(s) for blog attribute(s):
CREATE TABLE blog_userid (
id INT NOT NULL PRIMARY KEY,
userid BIGINT UNSIGNED,
KEY (userid, id)
);
INSERT INTO blog_userid (id, userid) VALUES (LAST_INSERT_ID(), 8675309);
CREATE TABLE blog_date (
id INT NOT NULL PRIMARY KEY,
post_date DATETIME,
KEY (post_date, id)
);
INSERT INTO blog_date (id, post_date) VALUES (LAST_INSERT_ID(), '2010-07-27');
Don't insert into any tweet index tables, because you just created a blog, not a tweet.
You know all rows in blog_userid reference blogs, because that's how you inserted them. So you can search for blogs of a given user:
SELECT e.*
FROM blog_userid u JOIN entities e ON u.id = e.id
WHERE u.userid = 8675309;
Re your comment:
Yes, you could add real columns to the entities table for any attributes that you know apply to all content types. For example:
CREATE TABLE entities (
id INT AUTO_INCREMENT PRIMARY KEY,
entity_type INT NOT NULL,
creation_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
entity_json TEXT NOT NULL
);
The columns for entity_type and creation_date would allow you to crawl the entities in chronological order (or reverse chronological order) and know which set of index tables matches the entity type of a given row.
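For instance, a crawl over the most recent blogs could look like this (the numeric code 1 for blogs is an assumed convention, not something FriendFeed documented):
-- Walk blog entities in reverse chronological order.
SELECT id, entity_json
FROM entities
WHERE entity_type = 1
ORDER BY creation_date DESC
LIMIT 100;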
They do not store objects of different classes in the same table. The 'entities' table they are referring to is used to store only one kind of entity.
For example, a typical entity in FriendFeed might look like this:
{
  "id": "71f0c4d2291844cca2df6f486e96e37c",
  "user_id": "f48b0440ca0c4f66991c4d5f6a078eaf",
  "feed_id": "f48b0440ca0c4f66991c4d5f6a078eaf",
  "title": "We just launched a new backend system for FriendFeed!",
  "link": "http://friendfeed.com/e/71f0c4d2-2918-44cc-a2df-6f486e96e37c",
  "published": 1235697046,
  "updated": 1235697046,
  ...
}
To understand the implementation better, have a look at the example given here: https://github.com/jamesgolick/friendly#readme