I have a text like this:
Exception in org.baharan.dominant.dao.core.nonPlanAllocation.INonPlanAllocationRepository.getAllGrid()
with cause = 'org.hibernate.exception.SQLGrammarException: could not extract ResultSet'
Caused by: java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist
I tokenize this text with word_tokenize in Python, and the output is:
Exception
org.baharan.dominant.dao.core.nonPlanAllocation.INonPlanAllocationRepository.getAllGrid
cause
'org.hibernate.exception.SQLGrammarException
could
extract
ResultSet'
Caused
java.sql.SQLSyntaxErrorException
ORA-00942
table
view
exist
But as you can see, the second line of the output contains several words joined by dots. How can I separate these into individual words?
I use this Python code:
>>> f = open('001.txt')
>>> text = [w for w in word_tokenize(f.read()) if w not in stopwords]
In fact, I want all the words to be separated like this:
Exception
org
baharan
dominant
dao
core
nonPlanAllocation
INonPlanAllocationRepository
getAllGrid
cause
'org
hibernate
exception
SQLGrammarException
could
extract
ResultSet'
Caused
java
sql
SQLSyntaxErrorException
ORA-00942
table
view
exist
f = "Exception in org.baharan.dominant.dao.core.nonPlanAllocation.INonPlanAllocationRepository.getAllGrid() \
with cause = 'org.hibernate.exception.SQLGrammarException: could not extract ResultSet' \
Caused by: java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist'"
s = ''
f_list = f.replace('.', ' ').split(' ')
for item in f_list:
    # print(item)
    s = s + ' ' + ''.join(item) + '\n'
print(s)
Output:
Exception
in
org
baharan
dominant
dao
core
nonPlanAllocation
INonPlanAllocationRepository
getAllGrid()
with
cause
=
'org
hibernate
exception
SQLGrammarException:
could
not
extract
ResultSet'
Caused
by:
java
sql
SQLSyntaxErrorException:
ORA-00942:
table
or
view
does
not
exist'
I found a simple way using RegexpTokenizer from nltk.tokenize, like this:
>>> from nltk.tokenize import RegexpTokenizer
>>> tokenizer = RegexpTokenizer(r'\w+')
The output after removing stopwords is as follows:
Exception
org
baharan
dominant
dao
core
nonPlanAllocation
INonPlanAllocationRepository
getAllGrid
cause
org
hibernate
exception
SQLGrammarException
could
extract
ResultSet
Caused
java
sql
SQLSyntaxErrorException
ORA
00942
table
view
exist
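Note that the `\w+` pattern treats every non-word character as a separator, so it also splits `ORA-00942` at the hyphen into `ORA` and `00942`. If hyphenated codes should stay whole, a slightly longer pattern can be used. The following is a minimal sketch using only Python's standard `re` module rather than NLTK; the `tokenize` helper and the small `STOPWORDS` set are assumptions made for illustration, not the stopword list used above.

```python
import re

# Hypothetical stopword set for this example; substitute your own list.
STOPWORDS = {"in", "with", "not", "by", "or", "does"}

# A token is a run of word characters, optionally continued by
# hyphen-joined runs, so "ORA-00942" stays one token while dots,
# quotes, colons and parentheses act as separators.
TOKEN_PATTERN = re.compile(r"\w+(?:-\w+)*")


def tokenize(text):
    """Split text into tokens and drop stopwords (case-insensitive)."""
    return [t for t in TOKEN_PATTERN.findall(text) if t.lower() not in STOPWORDS]


text = (
    "Exception in org.baharan.dominant.dao.core.nonPlanAllocation."
    "INonPlanAllocationRepository.getAllGrid()\n"
    "with cause = 'org.hibernate.exception.SQLGrammarException: "
    "could not extract ResultSet'\n"
    "Caused by: java.sql.SQLSyntaxErrorException: "
    "ORA-00942: table or view does not exist"
)

tokens = tokenize(text)
print("\n".join(tokens))
```

The same pattern string could presumably be passed to RegexpTokenizer in place of `r'\w+'`, since that class takes an ordinary regular expression.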
While doing Twitter sentiment analysis, I came across this error. Despite searching extensively for three days, I couldn't find a solution.
Error:
hive> select * from load_tweets;
OK
Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: org.codehaus.jackson.JsonParseException: Unexpected character ('O' (code 79)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
at [Source: java.io.StringReader@5a82bc58; line: 1, column: 2]
Time taken: 1.698 seconds
hive>
Here is the table creation statement:
hive> create external table load_tweets(id BIGINT,text STRING) ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe' LOCATION '/user/flume/tweets'
> ;
I am using the Cloudera JSON SerDe, version:
json-serde-1.3.6-SNAPSHOT-jar-with-dependencies.jar
I have checked that the jar file is properly added to the Hive classpath.
Database:
http://freetexthost.com/ik4jyogkfm
This is the data fetched by Flume.
I am following this article for fetching the data:
https://acadgild.com/blog/streaming-twitter-data-using-flume/
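The exception says the SerDe met a record starting with the character 'O' where it expected a JSON value, which suggests that at least one record in the table location is not plain JSON. One way to narrow this down is to copy a file from that location and check it line by line. The following is a rough sketch, assuming one JSON document per line; the `find_non_json_lines` helper and the sample records are hypothetical.

```python
import json


def find_non_json_lines(lines):
    """Return (line_number, error_message) for each non-empty line that is not valid JSON."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, str(exc)))
    return problems


# Made-up sample records: the second one is not JSON and starts with 'O'.
sample = [
    '{"id": 1, "text": "first tweet"}',
    'Obj is not JSON',
    '{"id": 2, "text": "second tweet"}',
]

bad = find_non_json_lines(sample)
for number, message in bad:
    print(f"line {number}: {message}")
```

For a real file, the list can be replaced by an open file object, for example `find_non_json_lines(open(path, encoding="utf-8", errors="replace"))`, where `path` points to a local copy of one Flume output file.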
I am running my Pig script in local mode in Eclipse.
When I try to store the output with JsonStorage, I get the following exception:
Exception in thread "main" java.lang.RuntimeException: Cannot instantiate:org.apache.pig.builtin.JsonStorage
at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:473)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.NonEvalFuncSpec(QueryParser.java:4976)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.StoreClause(QueryParser.java:3473)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1351)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:893)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:706)
at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1017)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:967)
at org.apache.pig.PigServer.registerQuery(PigServer.java:383)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:716)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
at org.apache.pig.PigServer.registerScript(PigServer.java:407)
at com.paypal.debugpig.DebugPig.main(DebugPig.java:13)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.pig.builtin.JsonStorage using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:458)
at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:470)
... 14 more
Pig script:
REGISTER C:/path/to/jar/pig.jar;
REGISTER C:/path/to/jar/UpperUDf/UpperUDf_fat.jar;
A = LOAD 'C:/path/to/data/file/student.txt' using PigStorage('\t') AS (name: chararray, age: int, gpa: float);
B = FOREACH A GENERATE myudfs.UPPER(name) ,age, gpa ;
Store B into 'output_student_Json' using org.apache.pig.builtin.JsonStorage();
When I dump the output or store it in a text file, it works, but the issue occurs when I try to store it in JSON format.
Any pointers appreciated
Thank you
I have verified it, and it works for me if I use the line below to store the output in JSON format:
store B into 'json_output' using JsonStorage();
My .csv file has every field enclosed in double quotes:
"13","9827259163","0","D","2"
"13","9827961481","0","D","2"
"13","9827202228","0","A","2"
"13","9827529897","0","A","2"
"13","9827700249","0","A","2"
"12","9883219029","0","A","2"
"17","9861065312","0","A","2"
"17","9861220761","0","D","2"
"13","9827438384","0","A","2"
"13","9827336733","0","D","2"
"13","9827380905","0","D","2"
"13","9827115358","0","D","2"
"17","9861475884","0","D","2"
"17","9861511646","0","D","2"
"17","9861310397","0","D","2"
"13","9827035035","0","A","2"
"13","9827304969","0","D","2"
"13","9827355786","0","A","2"
"13","9827702373","0","A","2"
As in MySQL, I tried using the "ENCLOSED BY" clause as follows:
CREATE EXTERNAL TABLE dnd (ServiceAreaCode varchar(50), PhoneNumber varchar(15), Preferences varchar(15), Opstype varchar(15), PhoneType varchar(10))
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
LOCATION '/dnd';
But it gives the following error:
NoViableAltException(26@[1704:103: ( tableRowFormatMapKeysIdentifier )?])
at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
at org.antlr.runtime.DFA.predict(DFA.java:144)
at org.apache.hadoop.hive.ql.parse.HiveParser.rowFormatDelimited(HiveParser.java:30427)
at org.apache.hadoop.hive.ql.parse.HiveParser.tableRowFormat(HiveParser.java:30662)
at org.apache.hadoop.hive.ql.parse.HiveParser.createTableStatement(HiveParser.java:4683)
at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:2144)
at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1398)
at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1036)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:199)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:404)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:322)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:975)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1040)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:359)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:456)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:466)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:748)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
FAILED: ParseException line 5:33 cannot recognize input near 'ENCLOSED' 'BY' ''"'' in serde properties specification
Is there a way I can directly import this file? Thanks in advance.
Here is another way: use a SerDe. Download the SerDe jar from this link: https://github.com/downloads/IllyaYalovyy/csv-serde/csv-serde-0.9.1.jar
Then run the following steps at the Hive prompt:
add jar path/to/csv-serde.jar;
create table dnd (ServiceAreaCode varchar(50), PhoneNumber varchar(15), Preferences varchar(15), Opstype varchar(15), PhoneType varchar(10))
row format serde 'com.bizo.hive.serde.csv.CSVSerde'
with serdeproperties(
"separatorChar" = "\,",
"quoteChar" = "\"")
stored as textfile
;
Then load the data from your path using the query below:
load data local inpath 'path/xyz.csv' into table dnd;
Then run:
select * from dnd;
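To see what the separatorChar and quoteChar properties are meant to achieve, the same kind of parsing can be sketched with Python's standard `csv` module. This only illustrates the quoting rules on two rows from the question; it is not how the SerDe is implemented, and the `parse_quoted_csv` helper is a hypothetical name.

```python
import csv
import io

# Two rows copied from the sample data in the question.
SAMPLE = (
    '"13","9827259163","0","D","2"\n'
    '"13","9827961481","0","D","2"\n'
)


def parse_quoted_csv(text, separator=",", quote='"'):
    """Parse delimited text whose fields are enclosed in a quote character."""
    reader = csv.reader(io.StringIO(text), delimiter=separator, quotechar=quote)
    return list(reader)


rows = parse_quoted_csv(SAMPLE)

# A naive split on the separator keeps the quotes as part of each value,
# which is what a plain delimited table without quote handling would store.
naive = SAMPLE.splitlines()[0].split(",")

print(rows[0])
print(naive)
```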
I managed to load quoted CSV data into a Hive table.
First, download the CSV SerDe (I downloaded csv-serde-1.1.2.jar).
Then:
hive> add jar /opt/hive-1.1.1/lib/csv-serde-1.1.2.jar;
hive> create table t1(schema) row format serde 'com.bizo.hive.serde.csv.CSVSerde' with serdeproperties ("separatorChar" = ",") LOCATION '/user/hive/warehouse/dwb/ot1/';
Then we have to add the SerDe to hive-site.xml as shown below, so that we can query the table from the Hive shell.
<property><name>hive.aux.jars.path</name><value>hdfs://master-ip:54310/hive-serde/csv-serde-1.1.2.jar</value></property>
In Hive, we can use a SerDe jar file to read data that is enclosed in double quotes.
For your problem, please refer to this link:
http://stackoverflow.com/questions/21156071/why-dont-hive-have-fields-enclosed-by-like-in-mysql
I am trying to execute the following code in Hibernate to create a .csv file from a MySQL database:
String sql =
"SELECT * INTO OUTFILE 'table.csv' FIELDS TERMINATED BY ','" +
" OPTIONALLY ENCLOSED BY '\"' LINES TERMINATED BY '\\n' " +
"FROM match INNER JOIN totala ON match_code= match";
The .csv file is created correctly, but then I get the following error:
Exception in thread "main" org.hibernate.exception.GenericJDBCException: could not execute query
at org.hibernate.exception.SQLStateConverter.handledNonSpecificException(SQLStateConverter.java:140)
at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:128)
at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:66)
at org.hibernate.loader.Loader.doList(Loader.java:2536)
at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2276)
at org.hibernate.loader.Loader.list(Loader.java:2271)
at org.hibernate.loader.custom.CustomLoader.list(CustomLoader.java:316)
at org.hibernate.impl.SessionImpl.listCustomQuery(SessionImpl.java:1842)
at org.hibernate.impl.AbstractSessionImpl.list(AbstractSessionImpl.java:165)
at org.hibernate.impl.SQLQueryImpl.list(SQLQueryImpl.java:157)
at datuak.DatuBasea.sortu_csv_fitxategia(DatuBasea.java:101)
at sortzailea.csv_sortzailea.Csv.Sortu(Csv.java:8)
at html_erauzlea.Nagusia.main(Nagusia.java:34)
Caused by: java.sql.SQLException: ResultSet is from UPDATE. No Data
at com.mysql.jdbc.ResultSet.next(ResultSet.java:2491)
at org.hibernate.loader.Loader.doQuery(Loader.java:825)
at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:274)
at org.hibernate.loader.Loader.doList(Loader.java:2533)
... 9 more
I think the problem is that I am executing a SELECT query that doesn't return any rows: it only creates a .csv file, while the method I call expects a ResultSet.
So, could someone give some suggestions?
Thanks in advance
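One alternative to letting the server write the file is to run an ordinary SELECT that does return rows and write the CSV on the client side. The following is only a sketch of that idea in Python, not Hibernate code: `sqlite3` stands in for MySQL so the example runs anywhere, and the `matches` and `totals` tables, their columns, and the `export_query_to_csv` helper are all hypothetical.

```python
import csv
import io
import sqlite3


def export_query_to_csv(conn, query, out):
    """Run a row-returning SELECT and write its rows to `out` as CSV.

    Non-numeric fields are quoted, roughly like OPTIONALLY ENCLOSED BY '"'.
    Returns the number of rows written.
    """
    cursor = conn.execute(query)
    writer = csv.writer(out, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    count = 0
    for row in cursor:
        writer.writerow(row)
        count += 1
    return count


# Hypothetical stand-in schema and data, created in memory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE matches (match_code INTEGER, home TEXT)")
conn.execute("CREATE TABLE totals (match_ref INTEGER, goals INTEGER)")
conn.execute("INSERT INTO matches VALUES (1, 'Athletic')")
conn.execute("INSERT INTO totals VALUES (1, 3)")

buffer = io.StringIO()
written = export_query_to_csv(
    conn,
    "SELECT m.match_code, m.home, t.goals "
    "FROM matches m INNER JOIN totals t ON m.match_code = t.match_ref",
    buffer,
)
print(buffer.getvalue(), end="")
```

In this form the query always produces a result set, so the calling code never has to read rows from a statement that only has a side effect. To write a real file, `buffer` can be replaced by a file opened with `open("table.csv", "w", newline="")`.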