I am trying to import a CSV file that contains 5 columns and 100 rows. When I open the terminal and type SELECT * FROM cmc2, it only shows 1 row.
mysql> SELECT * FROM cmc2;
+--------------------------------+-------+------------+--------------------+--------+
| alt_name                       | price | market_cap | circulating_supply | volume |
+--------------------------------+-------+------------+--------------------+--------+
| Alt coin name;Price;Market cap |   554 |        714 |                630 |   NULL |
+--------------------------------+-------+------------+--------------------+--------+
1 row in set (0.01 sec)
This is the table I created:
mysql> create table cmc2 (alt_name varchar(30), price int(20), market_cap int(20), circulating_supply int(20), volume int(20));
As per the documentation, "CSV files must have one line for each row of data and have comma-separated fields." So you should replace the semicolons in your file with commas.
This should be an easy task if you have a tool like Word, Excel, etc., where you can do a search (Control + F) and select "Replace with...", and then import the CSV file with the required separator.
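Once the file is comma-separated, a minimal import sketch would look like the following (the file name cmc2.csv is an assumption, and IGNORE 1 LINES skips the header row that currently shows up as data in your output):
-- A sketch, assuming the converted file is named cmc2.csv and still contains its header line.
LOAD DATA LOCAL INFILE 'cmc2.csv'
INTO TABLE cmc2
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 LINES;
The same statement with FIELDS TERMINATED BY ';' would also load the original file without editing it; keeping the file comma-separated as described above works just as well.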
I have a MEDIUMTEXT blob in a table which contains paths, separated by newline characters. I'd like to add a "/" to the beginning of each line if it is not already there. Is there a way to write a query to do this with built-in procedures?
I suppose an alternative would be to write a Python script to get the field, convert it to a list, process each line and update the record. There aren't that many records in the DB, so I can take the processing delay (if it doesn't lock the entire DB or table). About 8K+ rows.
Either way would be fine. If the second option is recommended, do I need to know about specific locking semantics before getting into this, as it would be run on a live prod DB (of course, I'd take a DB snapshot)? But in-place updates would be best, to avoid downtime.
Demo:
mysql> create table mytable (id int primary key, t text );
mysql> insert into mytable values (1, 'path1\npath2\npath3');
mysql> select * from mytable;
+----+-------------------+
| id | t                 |
+----+-------------------+
|  1 | path1
path2
path3 |
+----+-------------------+
1 row in set (0.00 sec)
mysql> update mytable set t = concat('/', replace(t, '\n', '\n/'));
mysql> select * from mytable;
+----+----------------------+
| id | t                    |
+----+----------------------+
|  1 | /path1
/path2
/path3 |
+----+----------------------+
However, I would strongly recommend storing each path in its own row, so you don't have to think about this at all. In SQL, each column should store one value per row, not a set of values.
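For reference, a minimal sketch of that one-path-per-row layout (the table and column names here are illustrative, not taken from the question):
-- Illustrative normalized design: one path per row instead of a newline-separated blob.
CREATE TABLE mytable_paths (
  id   INT NOT NULL,            -- refers to mytable.id
  path VARCHAR(500) NOT NULL    -- a single path, '/' prefix included
);
With this layout, the '/' prefix becomes a simple per-row UPDATE instead of string surgery inside a blob.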
I created a table in MySQL on the macOS command line using the 'utf-8' charset:
mysql> CREATE TABLE tb_stu (id VARCHAR(20), name VARCHAR(20), sex CHAR(1), birthday DATE) default charset=utf8;
Query OK, 0 rows affected (0.02 sec)
mysql> SHOW TABLES;
+----------------+
| Tables_in_test |
+----------------+
| pet            |
| tb_stu         |
+----------------+
2 rows in set (0.00 sec)
mysql> show create table tb_stu \G
*************************** 1. row ***************************
       Table: tb_stu
Create Table: CREATE TABLE `tb_stu` (
  `id` varchar(20) DEFAULT NULL,
  `name` varchar(20) DEFAULT NULL,
  `sex` char(1) DEFAULT NULL,
  `birthday` date DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
I want to add some values to the 'tb_stu' table. I have a txt file containing Chinese strings:
1 小明 男 2015-11-02
2 小红 女 2015-09-01
3 张三 男 2010-02-12
4 李四 女 2009-09-10
and the txt file is in the 'utf-8' charset too!
➜ ~ file /Users/lee/Desktop/JAVA/Java从入门到精通/第18章--使用JDBC操作数据库/Example_18_02/tb_stu.txt
/Users/lee/Desktop/JAVA/Java从入门到精通/第18章--使用JDBC操作数据库/Example_18_02/tb_stu.txt: UTF-8 Unicode text
So I execute this at the mysql command line:
mysql> LOAD DATA LOCAL INFILE '/Users/lee/Desktop/JAVA/Java从入门到精通/第18章--使用JDBC操作数据库/Example_18_02/tb_stu.txt' INTO TABLE tb_stu;
Query OK, 4 rows affected, 4 warnings (0.01 sec)
Records: 4 Deleted: 0 Skipped: 0 Warnings: 4
but I get garbled characters in mysql:
mysql> select * from tb_stu;
+------+----------------+------+------------+
| id   | name           | sex  | birthday   |
+------+----------------+------+------------+
| 1 | å°æ˜Ž | ç | 2015-11-02 |
| 2 | å°çº¢ | å | 2015-09-01 |
| 3 | å¼ ä¸‰ | ç | 2010-02-12 |
| 4 | æŽå›› | å | 2009-09-10 |
+------+----------------+------+------------+
4 rows in set (0.00 sec)
It confuses me: the table in MySQL and the txt file are both 'utf-8', so why do I get the garbled characters? Thanks a lot!
You will need to investigate some more to understand your problem. One possibility is that your data was written into the DB correctly but is displayed incorrectly in your command line, due to a wrong encoding setting in your operating system environment or client. Another possibility is that the data was garbled (corrupted) when it was written, which means it is stored incorrectly in the DB. So I would suggest taking your original file with the properly displayed Chinese characters and converting it to a unicode sequence, then taking the data in the DB and converting it into a unicode sequence as well, and comparing the two to see whether your DB data is merely displayed incorrectly or is actually corrupted. This will help you understand your problem and then find a way to fix it. Here is a tool that can help you:
There is an open-source Java library MgntUtils (written by me) that has a utility that converts Strings to unicode sequences and vice versa:
result = "Hello World";
result = StringUnicodeEncoderDecoder.encodeStringToUnicodeSequence(result);
System.out.println(result);
result = StringUnicodeEncoderDecoder.decodeUnicodeSequenceToString(result);
System.out.println(result);
The output of this code is:
\u0048\u0065\u006c\u006c\u006f\u0020\u0057\u006f\u0072\u006c\u0064
Hello World
The library can be found on Maven Central or on GitHub. It comes as a Maven artifact, with sources and Javadoc.
Here is the Javadoc for the class StringUnicodeEncoderDecoder.
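For the MySQL side of that comparison, one quick check (a sketch; HEX() shows the stored bytes regardless of how the client renders the text):
-- If the rows were stored as correct UTF-8, 小明 should appear as E5B08FE6988E.
SELECT id, name, HEX(name) FROM tb_stu;
If the hex matches the UTF-8 bytes of the original characters, the data is stored correctly and only the display (client or terminal charset) is wrong; if not, the data was corrupted during LOAD DATA.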
This is the SQL file:
# mo.sql
DROP TABLE IF EXISTS mo;
## _CREATE_TABLE_
CREATE TABLE mo
(
name CHAR(30),
age INT,
salary INT
);
## _CREATE_TABLE_
LOAD DATA LOCAL INFILE 'moja-2001.txt' INTO TABLE mo;
I run it from a Linux terminal:
mysql -p cookbook < mo.sql
No warnings, no errors.
SELECT * FROM mo;
+----------------------+------+--------+
| name                 | age  | salary |
+----------------------+------+--------+
| jova jovic           |   24 |   NULL |
| ceda prashak         |   25 |   NULL |
| toma grobar 28 20001 | NULL |   NULL |
+----------------------+------+--------+
I created the txt file with the Geany text editor:
jova jovic 24 999
ceda prashak 25 1000
toma grobar 28 20001
Why is the salary column wrong? Why is the third row wrong as well?
The last row does not use the correct separator between values. I suspect the other rows separate the values using tabs, while the last one uses spaces; the same problem affects the salary column in the first two rows.
Make sure you use the same separator for all values. It's better to use a comma (it is visible and less error prone) and use the FIELDS TERMINATED BY clause of the LOAD DATA statement to inform MySQL about it.
Change the file to look like this:
jova jovic,24,999
ceda prashak,25,1000
toma grobar,28,20001
and import it like this:
LOAD DATA LOCAL INFILE 'moja-2001.txt' INTO TABLE mo FIELDS TERMINATED BY ',';
Read more about the LOAD DATA statement.
In Vertica 7.2, I'm using COPY with fdelimitedparser. I would like to be able to specify a date or datetime format for some but not all of the columns. Different date columns can have different formats.
I can't list all columns like when using COPY without a parser, since I have many files with different column combinations, and I would rather avoid writing a script to generate my copy command for each file.
Is there any way to do this?
Additionally, how do I know which parser natively accepts which date format?
Thanks!
You can use the Vertica FILLER option when loading data.
See the example here:
Transform data during load in Vertica
A small example as well:
dbadmin=> \! cat /tmp/file.csv
2016-19-11
dbadmin=> copy tbl1 (v_col_1 FILLER date FORMAT 'YYYY-DD-MM',col1 as v_col_1) from '/tmp/file.csv';
Rows Loaded
-------------
1
(1 row)
dbadmin=> select * from tbl1;
col1
------------
2016-11-19
(1 row)
dbadmin=> copy tbl1 (v_col_1 FILLER date FORMAT 'YYYY-MM-DD',col1 as v_col_1) from '/tmp/file.csv';
Rows Loaded
-------------
1
(1 row)
dbadmin=> select * from tbl1;
col1
------------
2016-11-19
2017-07-14
(2 rows)
Hope this helped.
You can use the FORMAT keyword as part of the COPY command.
See the example below from the Vertica Forum:
create table test3 (id int, Name varchar(16), dt date, f2 int);
CREATE TABLE
vsql=> \!cat /tmp/mydata.data
1|foo|29-Jan-2013|100.0
2|bar|30-Jan-2013|200.0
3|egg|31-Jan-2013|300.0
4|tux|01-Feb-2013|59.9
vsql=> copy test3
vsql-> ( id, Name, dt format 'DD#MON#YYYY', f2)
vsql-> from '/tmp/mydata.data' direct delimiter '|' abort on error;
Rows Loaded
-------------
4
(1 row)
vsql=> select * from test3;
 id | Name |     dt     |    f2
----+------+------------+----------
  1 | foo  | 2013-01-29 | 100.0000
  2 | bar  | 2013-01-30 | 200.0000
  3 | egg  | 2013-01-31 | 300.0000
  4 | tux  | 2013-02-01 |  59.9000
As I understand it, you need to choose between "simple to load" and "fast to consume", and a flex table adds some impact on the consumers. Some info on that: a flex table is row-based storage, so it consumes more disk space and has no ability to encode the data. You have the option to materialize the relevant columns as columnar, but then the data is persisted twice, in both row and columnar storage (load time will be slower, and it will require more disk space). At query time, if you plan to query only the materialized columns you should be OK, but if not, you should expect performance issues.
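For context, here is a rough sketch of that flex-table route (the names are illustrative and parser options vary between Vertica versions, so treat this as a sketch rather than a recipe):
-- Load everything into a flex table first, then materialize the columns that
-- are actually queried; as noted above, materialized columns are persisted a
-- second time in columnar form.
CREATE FLEX TABLE staging_flex();
COPY staging_flex FROM '/tmp/file.csv' PARSER fdelimitedparser(delimiter=',');
SELECT MATERIALIZE_FLEXTABLE_COLUMNS('staging_flex');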
I am writing an SQL query for Big SQL.
If it looks like this:
select t.city from table t where t.city like 'A%'
It works OK, but the next one fails:
select t.city from table t where t.city like 'A%' escape '\'
I only add the escape expression and it gives me the following error:
[Error Code: -5199, SQL State: 57067] DB2 SQL Error: SQLCODE=-5199, SQLSTATE=57067, SQLERRMC=Java DFSIO;1;2, DRIVER=4.15.82
I found this documentation http://www-01.ibm.com/support/knowledgecenter/SSPT3X_2.1.2/com.ibm.swg.im.infosphere.biginsights.bigsql.doc/doc/bsql_like_predicate.html?lang=en
So it seems escape should work.
If I escape the escape character, I get another error:
1) [Error Code: -130, SQL State: 22019] DB2 SQL Error: SQLCODE=-130, SQLSTATE=22019, SQLERRMC=null, DRIVER=4.15.82. 2) [Error Code: -727, SQL State: 56098] DB2 SQL Error: SQLCODE=-727, SQLSTATE=56098, SQLERRMC=2;-130;22019;, DRIVER=4.15.82
But if I use a character other than '\' as the escape, for example '/', it works fine.
Any ideas why this may happen?
Try this maybe. You might have to escape the escape character.
select t.city from table t where t.city like 'A%' escape '\\'
Based upon this sample:
\connect bigsql
drop table if exists stack.issue1;
create hadoop table if not exists stack.issue1 (
f1 integer,
f2 integer,
f3 varchar(200),
f4 integer
)
stored as parquetfile;
insert into stack.issue1 (f1,f2,f3,f4) values (0,0,'Detroit',0);
insert into stack.issue1 (f1,f2,f3,f4) values (1,1,'Mt. Pleasant',1);
insert into stack.issue1 (f1,f2,f3,f4) values (2,2,'Marysville',2);
insert into stack.issue1 (f1,f2,f3,f4) values (3,3,'St. Clair',3);
insert into stack.issue1 (f1,f2,f3,f4) values (4,4,'Port Huron',4);
select * from stack.issue1;
select * from stack.issue1 where f3 like 'M%';
\quit
I get the following results:
jsqsh --autoconnect --input-file=./t.sql --output-file=t.out
0 rows affected (total: 0.28s)
0 rows affected (total: 0.22s)
1 row affected (total: 0.37s)
1 row affected (total: 0.35s)
1 row affected (total: 0.38s)
1 row affected (total: 0.35s)
1 row affected (total: 0.35s)
5 rows in results(first row: 0.33s; total: 0.33s)
2 rows in results(first row: 0.26s; total: 0.26s)
cat t.out
+----+----+--------------+----+
| F1 | F2 | F3           | F4 |
+----+----+--------------+----+
|  1 |  1 | Mt. Pleasant |  1 |
|  0 |  0 | Detroit      |  0 |
|  4 |  4 | Port Huron   |  4 |
|  3 |  3 | St. Clair    |  3 |
|  2 |  2 | Marysville   |  2 |
+----+----+--------------+----+
+----+----+--------------+----+
| F1 | F2 | F3           | F4 |
+----+----+--------------+----+
|  1 |  1 | Mt. Pleasant |  1 |
|  2 |  2 | Marysville   |  2 |
+----+----+--------------+----+
This shows your syntax is correct. However, based upon the -5199 error code, this is an issue with the FMP processes not having enough memory, or an issue with the Hadoop I/O component. You can get further information on this error by issuing
db2 ? sql5199n
from the command line.
The SQL error message should have directed you to the node where the error occurred and to where the Big SQL log file and the associated reader log files are located.
The SQL5199 error generally means an issue with HDFS (you can do a db2 \? SQL5199 to get details on the message, as user bigsql). Check the bigsql and DFS logs to see if they give any pointers to the problem.
Hope this helps.