I use a txt file to load data into a MySQL database.
The txt file I use is the following:
http://www.repubblica.it/
http://www.repubblica.it/minify/sites/repubblica/nazionale/config_01.cache.php?name=site_home_css
http://www.repubblica.it/minify/sites/repubblica/nazionale/config_01.cache.php?name=social_home_css
http://quotidiano.repubblica.it/home?adv=t&source=homerepit
http://www.repubblica.it/servizi/mobile/index.html
http://inchieste.repubblica.it/
http://espresso.repubblica.it/
http://altoadige.gelocal.it/
http://corrierealpi.gelocal.it/
http://gazzettadimantova.gelocal.it/
http://gazzettadimodena.gelocal.it/
http://gazzettadireggio.gelocal.it/
http://mattinopadova.gelocal.it/
http://ilpiccolo.gelocal.it/
http://trentinocorrierealpi.gelocal.it/
http://lacittadisalerno.gelocal.it/
In my Java program I use the following MySQL statements to load the txt file:
//here I create the table
CREATE TABLE table (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY, url VARCHAR(1000) NOT NULL);
//here the txt is loaded
LOAD DATA LOCAL INFILE '/tmp/test/url.txt' INTO TABLE table (url);
All that works fine. I have my table with two columns: id and url.
When I try to search for a value in that table using a simple:
SELECT url FROM table WHERE url LIKE 'http://www.repubblica.it/'
or
SELECT url FROM table WHERE url ='http://www.repubblica.it/'
MySQL returns an empty result set (i.e. zero rows). (Query took 0.0004 sec)
Why am I not able to search for my values?
What am I doing wrong?
Thanks in advance
You need to use wildcards in your LIKE statement (LIKE '%..%') to match part of a string:
SELECT url FROM table WHERE url LIKE '%http://www.repubblica.it/%'
If you search directly in phpMyAdmin, you have an option to search by 'LIKE %..%'.
Maybe you have spaces before and after the string; try this:
UPDATE `table`
SET `url`=TRIM(`url`)
Then:
SELECT
`url`
FROM
`table`
WHERE
`url`='http://www.repubblica.it/'
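If trimming doesn't help, it is worth looking at what is actually stored, since hidden characters such as a trailing carriage return from a Windows-style text file will not show up in phpMyAdmin. A quick diagnostic sketch, using the table and column names from the question:
SELECT
`url`,
CHAR_LENGTH(`url`) AS len,
HEX(`url`) AS raw_bytes
FROM
`table`
WHERE
`url` LIKE 'http://www.repubblica.it/%';
If len is longer than the visible text, or the hex dump ends in 0D (a carriage return), the = comparison will never match.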
Related
I have a CSV file (500 KB) that contains entries in the following format:
16037549,poetry
598195,historical fiction
22466716,poetry
My table is built with the following command:
CREATE TABLE `genre` (
`book_id` int(11),
`genre` varchar(200)
)
The csv file is loaded with the following load command:
LOAD DATA local infile '...' into table genre character set 'utf8mb4' fields terminated by ',' optionally enclosed by '\"' escaped by '\"';
The problem is that the genre field is not correctly interpreted after the import. I cannot do a string match like the following; the result of the query is an empty set.
SELECT * from genre where genre = 'poetry';
However, the following would return the desired result:
SELECT * from genre where genre like 'poetry%';
It seems there is something around the string that gets added to the field during the import. I am fairly sure that both the genre field and the book_id field stay within the sizes set by the CREATE TABLE. I encounter the same problem with another file; in both cases the affected field is the last one on the line.
This problem doesn't exist if I only import a smaller file that contains 10 lines of data from the bigger CSV.
I have no idea how to debug the problem. Are there better ways to import the CSV, or any debugging tips?
Context: I want to upload the file to GCP, and the import statement is in the format accepted by Cloud SQL.
It turns out it is a line termination issue. If you want to upload to Cloud SQL, you need to convert the CSV to a Unix-style file (LF line endings). One method is using sed:
sed -e "s/\r//g" file > newfile
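If you are running LOAD DATA yourself rather than going through the Cloud SQL import tool, an alternative (a sketch based on the statement from the question, not verified against Cloud SQL) is to declare the Windows line ending so MySQL strips the carriage return during the load:
LOAD DATA local infile '...' into table genre character set 'utf8mb4' fields terminated by ',' optionally enclosed by '\"' escaped by '\"' lines terminated by '\r\n';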
I am trying to load csv files into a Hive table. I need to have it done through HDFS.
My end goal is to have the hive table also connected to Impala tables, which I can then load into Power BI, but I am having trouble getting the Hive tables to populate.
I create a table in the Hive query editor using the following code:
CREATE TABLE IF NOT EXISTS dbname.table_name (
time_stamp TIMESTAMP COMMENT 'time_stamp',
attribute STRING COMMENT 'attribute',
value DOUBLE COMMENT 'value',
vehicle STRING COMMENT 'vehicle',
filename STRING COMMENT 'filename')
Then I check and see the LOCATION using the following code:
SHOW CREATE TABLE dbname.table_name;
and find that it has gone to the default location:
hdfs://our_company/user/hive/warehouse/dbname.db/table_name
So I go to the above location in HDFS and upload a few CSV files manually, which are in the same five-column format as the table I created. This is where I expect the data to be loaded into the Hive table, but when I go back to dbname in Hive and open the table I made, all values are still NULL, and when I try to open it in the browser I get:
DB Error
AnalysisException: Could not resolve path: 'dbname.table_name'
Then I try the following code:
LOAD DATA INPATH 'hdfs://our_company/user/hive/warehouse/dbname.db/table_name' INTO TABLE dbname.table_name;
It runs fine, but the table in Hive still does not populate.
I also tried all of the above using CREATE EXTERNAL TABLE instead, specifying the HDFS path in the LOCATION argument. I also tried creating an HDFS location first, uploading the CSV files, and then running CREATE EXTERNAL TABLE with the LOCATION argument pointed at the pre-made HDFS location.
I already made sure I have the necessary authorization privileges.
My table will not populate with the CSV files, no matter which method I try.
What am I doing wrong here?
I was able to solve the problem using the following. The key difference is the ROW FORMAT DELIMITED ... FIELDS TERMINATED BY ',' clause: without it, Hive uses its default Ctrl-A field delimiter, so the comma-separated rows never split into columns and everything shows up as NULL.
CREATE TABLE IF NOT EXISTS dbname.table_name (
time_stamp STRING COMMENT 'time_stamp',
attribute STRING COMMENT 'attribute',
value STRING COMMENT 'value',
vehicle STRING COMMENT 'vehicle',
filename STRING COMMENT 'filename')
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
and
LOAD DATA INPATH 'hdfs://our_company/user/hive/warehouse/dbname.db/table_name' OVERWRITE INTO TABLE dbname.table_name;
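A ROW FORMAT DELIMITED ... FIELDS TERMINATED BY ',' clause is most likely also what was missing from the external-table attempts described in the question. A sketch of that variant, with a hypothetical staging directory as the LOCATION:
CREATE EXTERNAL TABLE IF NOT EXISTS dbname.table_name (
time_stamp STRING COMMENT 'time_stamp',
attribute STRING COMMENT 'attribute',
value STRING COMMENT 'value',
vehicle STRING COMMENT 'vehicle',
filename STRING COMMENT 'filename')
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 'hdfs://our_company/user/hive/csv_staging/';
With an external table, Hive reads whatever files sit in that directory directly, so no separate LOAD DATA step is needed.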
I am trying to read CSV data from an S3 bucket and create a table in AWS Athena. The table, as created, does not skip the header row of my CSV file.
Query Example :
CREATE EXTERNAL TABLE IF NOT EXISTS table_name (
`event_type_id` string,
`customer_id` string,
`date` string,
`email` string )
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ( "separatorChar" = "|", "quoteChar" = "\"" )
LOCATION 's3://location/'
TBLPROPERTIES ("skip.header.line.count"="1");
skip.header.line.count doesn't seem to work. I think AWS has some issue with this. Is there any other way that I could get around this?
This is what works in Redshift:
You want to use the table property ('skip.header.line.count'='1'), along with other properties if you want, e.g. 'numRows'='100'.
Here's a sample:
create external table exreddb1.test_table
(ID BIGINT
,NAME VARCHAR
)
row format delimited
fields terminated by ','
stored as textfile
location 's3://mybucket/myfolder/'
table properties ('numRows'='100', 'skip.header.line.count'='1');
This is a known deficiency.
The best method I've seen was tweeted by Eric Hammond:
...WHERE date NOT LIKE '#%'
This appears to skip header lines during a query. I'm not sure how it works, but it might also be a method for skipping NULLs.
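Applied to the table from the question, the same idea would look something like this (a sketch; because the header row is read as ordinary data, you filter out any row whose first column still contains the header text):
SELECT *
FROM table_name
WHERE event_type_id <> 'event_type_id';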
As of today (2019-11-18), the query from the OP seems to work, i.e. skip.header.line.count is honored and the first line is indeed skipped.
I was trying to insert a JSON file into a table which has only one column, VARCHAR2(4000), using SQL*Loader. After the load I see the file text spread across multiple rows instead of one; I want the whole file in one column and one row. Not sure why this is happening. Is there an option to specify this in the control file? Here is my control file:
LOAD DATA
INFILE 'c:\json\sample-order.json'
INTO TABLE at_jsondocs
FIELDS
( jsontext CHAR(4000) )
See Alex Poole's explanation here, but the column in your table should be a CLOB, and you need to structure your control file like this:
LOAD DATA
INFILE *
INTO TABLE at_jsondocs
(
x FILLER CHAR(1),
jsontext LOBFILE(CONSTANT "c:\json\sample-order.json") TERMINATED BY EOF
)
BEGINDATA
0
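For completeness, a minimal matching table definition (a sketch, using the table name from the question and the CLOB column the answer calls for):
CREATE TABLE at_jsondocs (
jsontext CLOB
);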
I have a table with three columns: NODEID, X, Y. NODEID is the primary key and is set as an INT(4) with AUTO_INCREMENT. I wish to add more data to this table by importing it from a CSV via the phpMyAdmin import. Questions:
What would the format of the CSV look like?
Is this possible, or does importing basically just replace the whole data set with the CSV?
As of now the CSV looks like:
1,-105.057578,39.785603
2,-105.038646,39.771132
3,-105.013045,39.771727
5,-105.045721,39.762055
6,-105.031777,39.76206
7,-105.046015,39.72835
8,-105.029796,39.728304
10,-104.930863,39.754579
11,-104.910624,39.754644
13,-104.930959,39.74367
16,-105.045802,39.685253
17,-105.032149,39.688557
18,-105.060891,39.657622
20,-105.042257,39.644086
etc...
Change the SQL that phpMyAdmin will run to this:
LOAD DATA INFILE '*FILEPATH*'
INTO TABLE *table*
(X, Y);
(You will only have to change the last line)
And your CSV should look like:
-105.057578,39.785603
-105.038646,39.771132
-105.013045,39.771727
-105.045721,39.762055
-105.031777,39.76206
-105.046015,39.72835
The last line tells MySQL to load only those two columns of data and insert NULL for any other columns. The auto-increment column will then be filled in as expected.