I am new to SQL databases and I am learning about the BLOB data type.
My SQL table structure looks like this:
CREATE TABLE test (
NAME VARCHAR(20),
AGE INT,
CITY VARCHAR(20),
FILE BLOB);
I insert values into it from the command line because I was observing the behavior of the BLOB field. As far as I know, a BLOB field encodes the data, but it does not show me encoded data when I retrieve the records. I inserted the following values:
INSERT INTO test VALUES ('Neha', 21, 'Lahore', 'Computer Science');
When I retrieve the records, it shows me the plain text 'Computer Science' in the FILE column, not encoded data.
Please help me understand how this is happening. What is the correct concept behind this?
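For what it's worth, a BLOB column stores the bytes exactly as they were inserted; MySQL applies no encoding, so the client simply prints the original text back. A minimal sketch to inspect the raw bytes, assuming the test table above:

-- HEX() shows the stored bytes; for 'Computer Science' these are just the
-- ASCII codes of the characters, i.e. nothing was encoded on insert
SELECT NAME, HEX(FILE) FROM test;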
I am working on some benchmarks and need to compare the ORC, Parquet, and CSV formats. I have exported TPC-H (SF1000) to ORC-based tables. To export it to Parquet, I can run:
CREATE TABLE hive.tpch_sf1_parquet.region
WITH (format = 'parquet')
AS SELECT * FROM hive.tpch_sf1_orc.region
When I try a similar approach with CSV, I get the error Hive CSV storage format only supports VARCHAR (unbounded). I would have assumed that it would convert the other data types (e.g. bigint) to text and store the column format in the Hive metadata.
I can export the data to CSV using trino --server trino:8080 --catalog hive --schema tpch_sf1_orc --output-format=CSV --execute 'SELECT * FROM nation', but then it gets emitted to a file. Although this works for SF1, it quickly becomes unusable at the SF1000 scale factor. Another disadvantage is that my Hive metastore wouldn't have the appropriate metadata (although I could patch it manually if nothing else works).
Does anyone have an idea how to convert my ORC/Parquet data to CSV using Hive?
In the Trino Hive connector, a CSV table can contain varchar columns only.
You need to cast the exported columns to varchar when creating the table:
CREATE TABLE region_csv
WITH (format = 'CSV')
AS SELECT
  CAST(regionkey AS varchar) AS regionkey,
  CAST(name AS varchar) AS name,
  CAST(comment AS varchar) AS comment
FROM region_orc;
Note that you will need to update your benchmark queries accordingly, e.g. by applying reverse casts.
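For example, a benchmark query that expects regionkey to be a bigint would need something like this (a sketch against the region_csv table created above):

SELECT CAST(regionkey AS bigint) AS regionkey, name
FROM region_csv
WHERE CAST(regionkey AS bigint) = 1;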
DISCLAIMER: Read the full post before using anything discussed here. It's not real CSV and you might screw up!
It is possible to create typed CSV-ish tables by using the TEXTFILE format with ',' as the field separator:
CREATE TABLE hive.test.region (
regionkey bigint,
name varchar(25),
comment varchar(152)
)
WITH (
format = 'TEXTFILE',
textfile_field_separator = ','
);
This will create a typed version of the table in the Hive catalog using the TEXTFILE format. TEXTFILE normally uses the ^A character (ASCII 1) as the separator, but when it is set to ',' the files resemble the structure of CSV.
IMPORTANT: Although it looks like CSV, it is not real CSV. It doesn't follow RFC 4180, because it doesn't properly quote and escape. The following INSERT will not be stored correctly:
INSERT INTO hive.test.region VALUES (
1,
'A "quote", with comma',
'The comment contains a newline
in it');
The text will be copied unmodified to the file, without escaping quotes or commas. To be proper CSV, it should have been written like this:
1,"A ""quote"", with comma","The comment contains a newline
in it"
Unfortunately, it is written as:
1,A "quote", with comma,The comment contains a newline
in it
This results in invalid data that will be read back as NULL columns. For this reason, this method can only be used when you have full control over the text-based data and are sure that it doesn't contain newlines, quotes, commas, etc.
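One way to reduce the risk is to check the data for the dangerous characters before exporting it; a minimal sketch in Trino SQL, assuming the region_orc table from the earlier answer:

SELECT count(*) AS unsafe_rows
FROM region_orc
WHERE comment LIKE '%,%'                     -- embedded commas
   OR comment LIKE '%"%'                     -- embedded quotes
   OR comment LIKE '%' || chr(10) || '%';    -- embedded newlines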
I have an XML file which contains rows like
<row Id="50720" UserId="24115" Name="Teacher" Date="2011-04-29T03:17:22.257" Class="3" TagBased="False" />
<row Id="50717" UserId="902" Name="c++" Date="2011-04-29T03:00:17.067" Class="3" TagBased="True" />
I want to store the data of this XML file in my MySQL database.
My current CREATE TABLE statement is:
create table Badges(
Id int,
UserId int,
Name nvarchar (50),
Date datetime,
Class tinyint,
TagBased bit
);
The SQL statement to import the XML file is as follows:
LOAD XML LOCAL INFILE '/media/anurag/Learning/iit_hyderabad/Sem_3/dataset /Badges.xml'
INTO TABLE Badges(Id, UserId, Name, Date, Class,TagBased);
But the TagBased column is not stored properly, because MySQL does not directly recognize False as 0 and True as 1.
The query above stores 1 for all fields, whether they are False or True.
Can you please help me with the right fix for the query?
You need to use a variable within the LOAD XML statement to convert the text True/False to a boolean true/false, which is the same as 1/0:
LOAD XML LOCAL INFILE '/media/anurag/Learning/iit_hyderabad/Sem_3/dataset /Badges.xml'
INTO TABLE Badges (Id, UserId, Name, Date, Class, @var)
SET TagBased = (@var = 'True');
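A quick way to verify the result after loading (nothing assumed beyond the Badges table above):

-- should report two groups, 0 and 1, instead of a single all-1 group
SELECT TagBased, COUNT(*) FROM Badges GROUP BY TagBased;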
I'm currently learning/testing Hive and can't seem to find a suitable solution to this problem:
I have log files which look like this:
IP, Date, Time, URL, Useragent
I currently have these in a table with those columns. The columns are delimited by '\t', but the URL carries some client-specific information, looking somewhat like this:
example.org/log.gif?userID=xxx&sex=m&age=y&subscriber=y&lastlogin=ddd
and I want to create a new table with these given value pairs: userID, sex, age, subscriber, lastlogin. Another problem is that the value pairs are not always complete; some are missing, like this:
example.org/log.gif?userID=xxx&sex=m&age=y&subscriber=y&lastlogin=ddd
example.org/log.gif?userID=xxx&sex=m&age=y&lastlogin=
This makes Hive's ... format delimited fields terminated by '&'; useless here, as far as I know, because it would put wrong values in the columns.
Is there a way to solve this problem in Hive with SQL and regex?
This can be done, albeit with two Hive tables. You first load data into one table with the columns:
IP, Date, Time, URL, Useragent
Here I recommend using an EXTERNAL Hive table - you aren't parsing the data and this Hive table doesn't need to exist for very long, so simply place Hive metadata on top of it:
CREATE EXTERNAL TABLE raw_log (
  ip string,
  `date` string,
  `time` string,
  url string,
  useragent string
)
-- the log columns are tab-delimited, so the field separator must be set
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '<hdfs_location_of_the_raw_log_folder>';
Then use an INSERT INTO query with the Hive regexp_extract(string subject, string pattern, int index) method (see https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF) to load it into the "final" table with the correct columns.
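A sketch of that second step, assuming the final table is named parsed_log (a hypothetical name) with one string column per value pair; the ([^&]*) groups also match an empty string, so incomplete URLs yield empty values (or NULL, depending on the Hive version) instead of shifting columns:

INSERT INTO TABLE parsed_log
SELECT
  regexp_extract(url, 'userID=([^&]*)', 1)     AS userid,
  regexp_extract(url, 'sex=([^&]*)', 1)        AS sex,
  regexp_extract(url, 'age=([^&]*)', 1)        AS age,
  regexp_extract(url, 'subscriber=([^&]*)', 1) AS subscriber,
  regexp_extract(url, 'lastlogin=([^&]*)', 1)  AS lastlogin
FROM raw_log;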
You can also write your own UDF, which would enable you to better handle the incomplete/missing values that you mention, albeit with the tradeoff that you have to re-compile and re-deploy a JAR every time the input data's format changes (see https://cwiki.apache.org/confluence/display/Hive/HivePlugins).
I have been given a text file as part of an assignment, and I have to use it to populate the database, but I don't understand how I am supposed to do that.
Here is an example of the text file:
STUDENT TABLE DATA:
The row names are in this order
Student ID, Student name, Programme, Level, Age
10, Lorry Ross,CS,1,18
102,Lydia Ken,CIS,1,18
103,Bob Chung,CS,1,18
See the documentation for LOAD DATA INFILE.
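A minimal sketch of such a statement, assuming the file is saved at /path/to/students.txt (a hypothetical path), that a Student table with matching columns already exists, and that the three header lines should be skipped:

LOAD DATA LOCAL INFILE '/path/to/students.txt'  -- hypothetical path
INTO TABLE Student
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 3 LINES
(StudentID, StudentName, Programme, Level, Age);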
I am using MySQL and I'm trying to populate the database from build.xml.
How can I insert a file as a BLOB in this table:
CREATE TABLE CONTENT (
idContent varchar(30) not null,
price int,
url blob,
primary key (idContent)
);
I've tried this:
INSERT INTO CONTENT VALUES ("Tecnico.png", 0, LOAD_FILE("src/Tecnico.png"));
but the url column returned is NULL; then I tried the entire path to the .png and it returned NULL too.
Can anyone help me please?
MySQL's LOAD_FILE() reads a file that is already present on the server; it does not upload the file from the client.
So, if the file is on the server and the full path is something like "/var/www/[AnotherFolder]/src/Tecnico.png",
try to use:
INSERT INTO CONTENT VALUES ("Tecnico.png", 0, LOAD_FILE('/var/www/[AnotherFolder]/src/Tecnico.png'));
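Note that LOAD_FILE() also returns NULL when the MySQL user lacks the FILE privilege, or when the file lies outside the directory allowed by secure_file_priv. You can check that setting with:

SHOW VARIABLES LIKE 'secure_file_priv';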