Importing Excel CSV to MySQL relational database? - mysql

I have two tables baskets and fruits; a basket has many different fruits so it's a one to many relation.
The tables are as follows:
baskets
id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
basket_name varchar(20)
fruits
id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
fruit_name varchar(20),
basket_id INT,
FOREIGN KEY (basket_id) REFERENCES baskets(id)
I have an Excel sheet with column names and data organized like this:
Basket Fruits
Basket_1 Apple, Mango, Banana, Pear
Basket_2 Mango, Strawberry, Plums, Banana, Grapes
Basket_3 Raspberry, Apple, Pear
What would be the best possible way to store all these fruits for a single basket in the relational architecture modeled above?
I converted my XLS file to CSV so it is easier to parse with scripts (in Ruby), and found that the layout changed: each fruit is now in a separate cell, and the number of cells varies from one basket (row) to the next.
My question is somewhat similar to this one: Import Excel Data to Relational Tables at MySQL
but the data is organized differently.
All suggestions are welcome!

These kinds of problems always seem to take a fair number of steps.
If the fruits are in separate cells, turn the rows into columns in Excel: copy all cells in one sheet, then use Paste Special - Transpose to paste them into a new sheet.
In a third sheet I would add a formula that concatenates the value in the first row (the basket name) with the value of each cell below it, starting in the second row. If sheetA has the freshly transposed values and sheetB is the new sheet, the formula would be =sheetA!a$1 & "," & sheetA!a2; copy that formula across a2:zz999 or however far it needs to go.
Then the columns need to be combined into one, which can be done manually or via UNION statements in MySQL after the sheet is imported:
SELECT COLUMN1 FROM SHEETB
UNION
SELECT COLUMN2 FROM SHEETB
UNION
...
You get the idea. Once the combined basket-fruit field is in one column, it's easy to separate out into two columns.
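If a script is an option, the same split can be done without the spreadsheet steps. The following is only a minimal sketch in Python (the question mentions Ruby; the logic carries over). It assumes the CSV has the basket name in the first cell and one fruit per remaining cell, as described in the question. The sample data, the load_baskets helper, and the use of an in-memory SQLite database as a stand-in for MySQL are all assumptions for illustration; with a MySQL driver the DDL and the placeholder style would differ.

```python
import csv
import io
import sqlite3

# Hypothetical sample mirroring the question: basket name first,
# then a variable number of fruit cells per row.
SAMPLE_CSV = """Basket,Fruits
Basket_1,Apple,Mango,Banana,Pear
Basket_2,Mango,Strawberry,Plums,Banana,Grapes
Basket_3,Raspberry,Apple,Pear
"""


def load_baskets(conn, csv_text):
    """Insert one baskets row per CSV line and one fruits row per fruit."""
    cur = conn.cursor()
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header line
    for row in reader:
        if not row:
            continue  # ignore blank lines
        basket_name, fruit_cells = row[0].strip(), row[1:]
        cur.execute("INSERT INTO baskets (basket_name) VALUES (?)", (basket_name,))
        basket_id = cur.lastrowid  # id generated for the basket just inserted
        for cell in fruit_cells:
            # Splitting on commas also covers a cell that still holds "Apple, Mango".
            for fruit in cell.split(","):
                fruit = fruit.strip()
                if fruit:
                    cur.execute(
                        "INSERT INTO fruits (fruit_name, basket_id) VALUES (?, ?)",
                        (fruit, basket_id),
                    )
    conn.commit()


# SQLite stand-in for the MySQL schema in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE baskets (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    basket_name VARCHAR(20)
);
CREATE TABLE fruits (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    fruit_name VARCHAR(20),
    basket_id INTEGER REFERENCES baskets(id)
);
""")
load_baskets(conn, SAMPLE_CSV)
```

Each basket is inserted first so that its generated id is available as the basket_id of every fruit row that belongs to it.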

Related

inserting multiple values into one row mySQL

How can I insert multiple values into one row?
My query
insert into table_RekamMedis values ('RM001', '1999-05-01', 'D01', 'Dr Zurmaini', 'S11', 'Tropicana', 'B01', 'Sulfa', '3dd1');
I can't insert two values into one row. Is there another way to do it?
I'm ignorant of the human language you use, so this is a guess.
You have two entities in your system. One is dokter, the other is script (prescription). Your requirement is to store zero or more scripts for each dokter. That is, the relationship between your entities is one-to-many.
In a relational database management system (SQL system) you do that with two tables, one per entity. Your dokter table will contain a unique identifier for each doctor, and the doctor's descriptive attributes.
CREATE TABLE dokter(
dokter_id BIGINT AUTO_INCREMENT PRIMARY KEY NOT NULL,
nama VARCHAR (100),
kode VARCHAR(10),
/* others ... */
);
And you'll have a second table for script
CREATE TABLE script (
script_id BIGINT AUTO_INCREMENT PRIMARY KEY NOT NULL,
dokter_id BIGINT NOT NULL,
kode VARCHAR(10),
nama VARCHAR(100),
dosis VARCHAR(100),
/* others ... */
);
Then, when a doctor writes two prescriptions, you insert one row in dokter and two rows in script. You make the relationship between script and dokter by putting the correct dokter_id into each script row.
Then you can retrieve this information with a query like this:
SELECT dokter.dokter_id, dokter.nama, dokter.kode,
script.script_id, script.kode, script.nama, script.dosis
FROM dokter
LEFT JOIN script ON dokter.dokter_id = script.dokter_id
Study up on entity-relationship data design. It's worth your time to learn and will enhance your career immeasurably.
You can't store multiple values in a single field but there are various options to achieve what you're looking for.
If you know that a given field can only have a set number of values then it might make sense to simply create multiple columns to hold these values. In your case, perhaps Nama obat only ever has 2 different values so you could break out that column into two columns: Nama obat primary and Nama obat secondary.
But if a given field could have any number of values, then it would likely make sense to create a separate table to hold those values, one row per value, so that it looks something like:
| NoRM  | NamaObat |
|-------|----------|
| RM001 | Sulfa    |
| RM001 | Anymiem  |
| RM001 | ABC      |
| RM002 | XYZ      |
And then you can combine that with your original table with a simple join:
SELECT * FROM table_RekamMedis JOIN table_NamaObat ON table_RekamMedis.NoRM = table_NamaObat.NoRM
The above takes care of storing the data. If you then want to query the data such that the results are presented in the way you laid out in your question, you could combine the multiple NamaObat fields into a single field using GROUP_CONCAT which could look something like:
SELECT GROUP_CONCAT(NamaObat SEPARATOR '\n')
...
GROUP BY NoRM
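To see the one-row-per-value layout and the grouped query working together, here is a small sketch in Python. It uses an in-memory SQLite database as a stand-in, which is an assumption: SQLite spells the call GROUP_CONCAT(column, separator) rather than MySQL's GROUP_CONCAT(column SEPARATOR ...), and the order of the concatenated values is not guaranteed without further work. The table name table_NamaObat and the sample rows are taken from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_NamaObat (NoRM TEXT, NamaObat TEXT);
INSERT INTO table_NamaObat (NoRM, NamaObat) VALUES
    ('RM001', 'Sulfa'),
    ('RM001', 'Anymiem'),
    ('RM001', 'ABC'),
    ('RM002', 'XYZ');
""")

# One result row per NoRM, with all of its NamaObat values joined into one string.
rows = conn.execute("""
    SELECT NoRM, COUNT(*) AS n, GROUP_CONCAT(NamaObat, ', ') AS obat
    FROM table_NamaObat
    GROUP BY NoRM
    ORDER BY NoRM
""").fetchall()

for no_rm, n, obat in rows:
    print(no_rm, n, obat)
```

The stored data stays normalized; the combined string exists only in the query result.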

Load Redshift Spectrum external table with CSVs with differing column order

This got no answer and I have a similar question, though I'll expand it.
Suppose I have 3 CSV files in s3://test_path/. I want to create an external table and populate it with the data in these CSVs. However, not only does column order differ across CSVs, but some columns may be missing from some CSVs.
Is Redshift Spectrum capable of doing what I want?
a.csv:
id,name,type
a1,apple,1
a2,banana,2
b.csv:
type,id,name
1,b1,orange
2,b2,lemon
c.csv:
name,id
kiwi,c1
I create the external database/schema and table by running this in Redshift query editor v2 on my Redshift cluster:
CREATE EXTERNAL SCHEMA test_schema
FROM DATA CATALOG
DATABASE 'test_db'
REGION 'region'
IAM_ROLE 'iam_role'
CREATE EXTERNAL DATABASE IF NOT EXISTS
;
CREATE EXTERNAL TABLE test_schema.test_table (
"id" VARCHAR,
"name" VARCHAR,
"type" SMALLINT
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
LOCATION 's3://test_path/'
TABLE PROPERTIES ('skip.header.line.count'='1')
;
I expect SELECT * FROM test_schema.test_table to yield:
| id | name   | type |
|----|--------|------|
| a1 | apple  | 1    |
| a2 | banana | 2    |
| b1 | orange | 1    |
| b2 | lemon  | 2    |
| c1 | kiwi   | NULL |
Instead I get:
| id   | name   | type |
|------|--------|------|
| a1   | apple  | 1    |
| a2   | banana | 2    |
| 1    | b1     | NULL |
| 2    | b2     | NULL |
| kiwi | c1     | NULL |
It seems Redshift Spectrum cannot match columns by name across files the way pandas.concat() can with data frames with differing column order.
No, your data needs to be transformed to align the data between files. Your Spectrum DDL specifies that the first row of the CSVs is ignored so the information you need isn't even being read.
If you want these files to be usable as one Spectrum table, you will need to transform them so the columns align and store the new files in S3. You can do this with Redshift if you also have a supporting piece of code that reads the column order from each file. You could write a Lambda to do this easily, or if your CSV files are fairly simple then a Glue crawler will work. Almost any ETL tool can do this as well. There are lots of choices, but these files are not Spectrum-ready as is.
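As a rough illustration of the transformation step, the sketch below rewrites each CSV so that its columns follow the order declared in the external table, leaving columns that are absent from a file empty. It is a minimal sketch in Python using only the standard csv module; the align_csv helper and the TARGET_COLUMNS name are hypothetical, the three files from the question are inlined as strings, and reading from and writing back to S3 is left out. Whether an empty field is then read as NULL depends on the SerDe and column type, so treat that part as something to verify.

```python
import csv
import io

# Column order declared in the external table definition.
TARGET_COLUMNS = ["id", "name", "type"]


def align_csv(text, columns=TARGET_COLUMNS):
    """Return the CSV rewritten with `columns` as its header and column order.

    Columns missing from the input are written as empty fields.
    """
    reader = csv.DictReader(io.StringIO(text))  # matches values to header names
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=columns,
        restval="",             # value for columns the input file lacks
        extrasaction="ignore",  # drop columns the table does not declare
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# The three files from the question, inlined for illustration.
files = {
    "a.csv": "id,name,type\na1,apple,1\na2,banana,2\n",
    "b.csv": "type,id,name\n1,b1,orange\n2,b2,lemon\n",
    "c.csv": "name,id\nkiwi,c1\n",
}
aligned = {name: align_csv(text) for name, text in files.items()}
```

Because the header is rewritten too, the existing 'skip.header.line.count'='1' table property would still apply to the aligned files.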

Two separate CSV files and Two tables that share a common column of a primary key

I'm working on a data warehouse project and was given three CSV files by my professor. The two I'm having trouble with are a master sales file and a customer lookup file. Both files share a CUST_ID column, and I need to insert or join the master sales data with the customer dimension data using CUST_ID as the primary key. This is my first SQL project, so I have no prior experience. Would it be best to join the tables on the ID, or to insert the master file data into CUST_DIM as well and delete the columns that don't pertain to a customer? Thank you in advance.
Best idea would be to use a JOIN for the CUST_ID with a method like:
SELECT CUST_DIM.CUST_BIRTH_DT, CUST_DIM.CUST_CITY_NM, CUST_DIM.CUST_STREET_ADD, CUST_DIM.CUST_POSTAL_CD, CUST_DIM.CUST_STATE_CD, CUST_DIM.CUST_NM, CUST_DIM.CUST_NO, CUST_DIM.CUST_PHONE_NO
FROM CUST_DIM
JOIN sales_filev1
ON CUST_DIM.CUST_NO=sales_filev1.**SALES ID NUMBER GOES HERE**
This would allow you to link both tables together and manipulate the result however you see fit.

mysql trying to import an odd format csv

I have an Excel CSV file in which the customer was recording invoices. They made a new column for each vendor and put the amounts under each column.
Like this:
| ABC Company | Jacks Garage | XYZ Company |
|-------------|--------------|-------------|
| 123.45      | 223.22       | 123.11      |
| 423.11      | 10.22        | 11.21       |
And so on. I am trying to work out how to get that into two columns (Vendor, Amount) so I can import the data into the actual table. There are about 200 vendors, so cutting and pasting manually would work but would take forever.
Can I do this with a loop somehow and insert the info into the 2 column table?
I would do this by writing a simple script in just about any language, e.g. Python, PHP, Ruby, or even Perl. Any of those languages makes it easy to read a text file, split the fields into an array, and post the fields into a database in whatever manner you want.
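A minimal sketch of such a script in Python might look like the following. It assumes the first CSV line holds the vendor names and every later line holds one amount per vendor, as in the question; the unpivot function name and the inlined sample are illustrative only, and the amounts are kept as strings so that nothing is lost to float rounding before the database sees them.

```python
import csv
import io

# Hypothetical sample in the layout from the question.
SAMPLE = """ABC Company,Jacks Garage,XYZ Company
123.45,223.22,123.11
423.11,10.22,11.21
"""


def unpivot(csv_text):
    """Turn one-column-per-vendor rows into a list of (vendor, amount) pairs."""
    reader = csv.reader(io.StringIO(csv_text))
    vendors = [name.strip() for name in next(reader)]  # header line = vendor names
    pairs = []
    for row in reader:
        for vendor, cell in zip(vendors, row):
            cell = cell.strip()
            if cell:  # skip empty cells, e.g. a vendor with fewer invoices
                pairs.append((vendor, cell))
    return pairs


pairs = unpivot(SAMPLE)
for vendor, amount in pairs:
    print(vendor, amount)
```

The resulting list can be handed to a database driver's bulk insert (for example a cursor's executemany) to fill the two-column table, which scales to 200 vendors without any per-vendor code.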
Alternatively, you could do this without writing code, but in the following steps:
Load the CSV file as-is into a table.
create table invoices_asis (
rownum serial primary key,
abc_company_amount numeric(9,2),
jacks_garage_amount numeric(9,2),
xyz_company_amount numeric(9,2)
);
load data infile 'invoices.csv' into table invoices_asis ignore 1 lines
(abc_company_amount, jacks_garage_amount, xyz_company_amount);
Then copy all data for each given vendor to your (vendor, amount) table.
create table invoices (
invoice_id serial primary key,
vendor varchar(20),
amount numeric(9,2)
);
insert into invoices (vendor, amount)
select 'ABC Company', abc_company_amount from invoices_asis;
insert into invoices (vendor, amount)
select 'Jacks Garage', jacks_garage_amount from invoices_asis;
insert into invoices (vendor, amount)
select 'XYZ Company', xyz_company_amount from invoices_asis;
Finally, drop the as-is table.
drop table invoices_asis;
I think what you want is to 'unpivot', for which there are many options (e.g. for Excel).
Insert a blank column at the extreme left then include that with your data at 4. in the example. At 6. move Column into ROWS above Row. Double-click the Grand Total and remove left-hand column.

Is it possible to reference a mysql table entry value from a second table entry dynamically?

I can't find anything about dynamically referencing one MySQL table entry to another. It may not be possible.
Essentially, I'd like to know if in MySQL you can do the equivalent to referencing the value of a certain Excel cell to another. For example, if in Excel I set Sheet 1 Cell A1 to some value like "MyVal". Then if I set Sheet 2 Cell A1 to "=Sheet1!A1" and Sheet 3 Cell A1 to "=Sheet2!A1" the value of Sheet 3 Cell A1 is "MyVal". If I go back to Sheet 1 Cell A1 and change the value to "MyNewVal" then the value is automatically updated on Sheet 2 Cell A1 and Sheet 3 Cell A1 to "MyNewVal".
My question is... in MySQL can I set the value of a certain entry in the first table to be dynamically linked to the value of a different entry in a second table such that when I query the first table (using existing PHP code) I get the value that's in the second table? I imagine that if it's possible then perhaps the value of the entry in the first table would look like a query that queries the second table for the correct value.
I understand how to write an UPDATE query in PHP to explicitly make the values the same but I don't want to change the existing php code. I want to link them in a relative/dynamic way. The short reason is that I don't want to change the PHP code since the same code is used on several of the sites I maintain and I want to keep the existing php code the same for cleaner maintenance/upgrading.
However, since the databases on the various sites are already different, it would be very clean to somehow dynamically link the appropriate entries in the different tables in the database itself.
Any help would be very appreciated. If this is possible, if you could just point me in the right direction, I'd be happy to do the research.
There are 2.5 ways to do this (basically two, but it feels like there's three):
From easiest to hardest...
Option 1:
If you need tableA to reflect tableB's value, don't store the value in tableA at all, just use tableB's value. Use either a join:
select a.*, b.col1
from tableA a
join tableB b on <some join condition>
or a subselect
select *, (select col1 from tableB where <some condition>) col1
from tableA
Option 2:
If you're happy with option 1, convert it to a view, which behaves like a table (except that there are restrictions on updating views that are joins):
create view myview as
select ... (one of the above selects)
Option 3:
Create a database trigger that fires when tableB's value is changed and copies the value over to the appropriate row/column in tableA
create trigger tableB_update
after update on tableB
for each row
update tableA set
tablea_col = new.col1
where id = new.tableA_id;
Note that new and old are special names given to the new and old rows so you can reference the values in the table being updated.
Choose the option that best suits your needs.
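To make the view option concrete, here is a small sketch in Python that shows a view picking up a change made in the underlying table. It runs against an in-memory SQLite database as a stand-in for MySQL, and the join column tableB_id is a hypothetical choice made only so the example has a join condition; the names tableA, tableB, col1 and myview follow the options above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tableB (id INTEGER PRIMARY KEY, col1 TEXT);
-- tableA stores only a reference to tableB, never a copy of the value.
CREATE TABLE tableA (
    id INTEGER PRIMARY KEY,
    tableB_id INTEGER REFERENCES tableB(id)
);
INSERT INTO tableB (id, col1) VALUES (1, 'MyVal');
INSERT INTO tableA (id, tableB_id) VALUES (1, 1);

-- The view resolves the reference every time it is queried.
CREATE VIEW myview AS
    SELECT a.id, b.col1
    FROM tableA a
    JOIN tableB b ON b.id = a.tableB_id;
""")

before = conn.execute("SELECT col1 FROM myview WHERE id = 1").fetchone()[0]
conn.execute("UPDATE tableB SET col1 = 'MyNewVal' WHERE id = 1")
after = conn.execute("SELECT col1 FROM myview WHERE id = 1").fetchone()[0]

print(before, "->", after)
```

Existing code that only reads could query the view in place of the table, which is what makes this option attractive when the application code should stay untouched.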
Databases don't really provide this type of facility, it's a completely different paradigm.
You can achieve the same results using joins, groupings or functions.
Alternatively, if you wish to save the representation, store the query as a view, which makes it more reusable from various interfaces. More information on views can be found here: http://www.mysqltutorial.org/mysql-views-tutorial.aspx
Anything more complex and you will need to look at some business analysis tools.
Perhaps you have oversimplified the question, but you should not need to use a trigger. Just join the tables and any time 'MyVal' is changed it will automatically be available through query.
CREATE TABLE Sheet1
(
`ID` int auto_increment primary key
, `A` varchar(5)
)
;
INSERT INTO Sheet1
(`A`)
VALUES
('MyVal')
;
CREATE TABLE Sheet2
(
`ID` int auto_increment primary key
, `Sheet1FK` int)
;
INSERT INTO Sheet2
(`Sheet1FK`)
VALUES
(1)
;
CREATE TABLE Sheet3
(
`ID` int auto_increment primary key
, `Sheet2FK` int)
;
INSERT INTO Sheet3
(`Sheet2FK`)
VALUES
(1)
;
Query 1:
select
sheet3.id id3
, sheet2.id id2
, sheet1.id id1
, sheet1.a
from sheet1
inner join sheet2 on sheet1.id = sheet2.sheet1fk
inner join sheet3 on sheet2.id = sheet3.sheet2fk
Results:
| ID3 | ID2 | ID1 | A |
|-----|-----|-----|-------|
| 1 | 1 | 1 | MyVal |