Hadoop InvalidInputException - exception

root@priyal-Inspiron-N5030:/home/priyal# hadoop dfs -copyFromLocal in /in
root@priyal-Inspiron-N5030:/home/priyal# hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount in out
INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/root/.staging/job_1424175893740_0008
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://localhost:9000/user/root/in
Can someone please suggest what exactly the problem is and how I can resolve it? I am a new user.

Error
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://localhost:9000/user/root/in
says that the input path /user/root/in is not present in your HDFS. You can browse your file system using the simple web UI at
http://localhost:50070/
then click the Utilities drop-down -> Browse the file system, and check your folder hierarchy, i.e. /user/root/in.
If it does not exist, you can create the folder using the command
hadoop dfs -mkdir hdfs://localhost:9000/user/root/in
Now try to execute your copy command:
hadoop dfs -copyFromLocal /local/path/of/source/file hdfs://localhost:9000/user/root/in
Hope this helps :)
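As a quick sanity check before re-running the job, you can list the directory and relaunch the example jar from the question (hadoop fs, or hdfs dfs, is the non-deprecated form of hadoop dfs):
hadoop fs -ls /user/root/in
hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount in out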

Related

Getting an error Unable to locate appender "jmeter-log" for logger config "root" while running and creating the HTML dashboard for my JMeter scripts

I am trying to create a JMeter HTML report from a CSV results file on the command line, but I am getting the error below in my CMD. Please help me with what I need to change or enhance to get the results.
2020-07-23 16:47:20,385 main ERROR Null object returned for File in Appenders.
2020-07-23 16:47:20,409 main ERROR Unable to locate appender "jmeter-log" for logger config "root"
An error occurred: Cannot read test results file : XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
errorlevel=1
Press any key to continue . . .
The main problem is Cannot read test results file, and it occurs when you point JMeter to a non-existent file or to a file that cannot be read (you don't have permission to open the file in that location).
The other problem is with the JMeter logging configuration: either your log4j2.xml file is broken, or again you don't have proper read/write permissions for the folder where JMeter is installed. Try running the terminal with elevated rights and both errors should go away.
I solved this problem by installing the latest version of JMeter, 5.3. After that, no more logging/summarizer errors.
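For reference, the dashboard is normally produced in one of two ways: generated at the end of a non-GUI run, or generated afterwards from an existing results file. A hedged sketch (plan.jmx, results.jtl and the output folder are placeholders):
jmeter -n -t plan.jmx -l results.jtl -e -o /path/to/dashboard
jmeter -g results.jtl -o /path/to/dashboard
Note that the -o folder must be empty or not yet exist, otherwise JMeter refuses to write the report.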

Databrick csv cannot find local file

In a program I have a CSV extracted from Excel; I need to upload the CSV to HDFS and save it in Parquet format. Any Python or Spark version is fine, but no Scala, please.
Almost all discussions I came across are about Databricks; however, it seems it cannot find the file. Here are the code and the error:
df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").option("inferSchema","true").option("delimiter",",").load("file:///home/rxie/csv_out/wamp.csv")
Error:
java.io.FileNotFoundException: File file:/home/rxie/csv_out/wamp.csv does not exist
The file path:
ls -la /home/rxie/csv_out/wamp.csv
-rw-r--r-- 1 rxie linuxusers 2896878 Nov 12 14:59 /home/rxie/csv_out/wamp.csv
Thank you.
I found the issue now!
The file-not-found error is actually correct: I was using a Spark context with setMaster("yarn-cluster"), which means all worker nodes will look for the CSV file, and of course all worker nodes (except the one starting the program, where the CSV resides) do not have the file and hence error out. What I really should do is use setMaster("local").
FIX:
conf = SparkConf().setAppName('test').setMaster("local")
sc = SparkContext(conf=conf)
sqlContext = SQLContext(sc)
csv = "file:///home/rxie/csv_out/wamp.csv"
df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").option("inferSchema","true").option("delimiter",",").load(csv)
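As a side note, on Spark 2.x the CSV reader is built in, so the com.databricks.spark.csv package is optional. A minimal equivalent sketch, assuming the same local file and a hypothetical HDFS output path for the Parquet goal from the question:
from pyspark.sql import SparkSession

# local master, so the read resolves against the driver's filesystem
spark = SparkSession.builder.appName('test').master('local').getOrCreate()
df = spark.read.csv('file:///home/rxie/csv_out/wamp.csv', header=True, inferSchema=True, sep=',')
# hypothetical HDFS destination for the Parquet output
df.write.parquet('/user/rxie/wamp_parquet')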
Yes, you are right, the file should be present at all worker nodes.
Well, you can still read a local file in yarn cluster mode; you just need to distribute your file using addFile.
spark.sparkContext.addFile("file:///your local file path ")
Spark will copy the file to each node where an executor is created, so your file can be processed in cluster mode as well.
I am using Spark 2.3, so you may need to adjust your Spark context accordingly, but the addFile method remains the same.
Try this with YARN (cluster mode) and let me know if it works for you.
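A minimal sketch of that approach, assuming a SparkSession named spark and the file path from the question; SparkFiles.get resolves each node's local copy of the distributed file:
from pyspark import SparkFiles

# ship the local file to every node that runs an executor
spark.sparkContext.addFile("file:///home/rxie/csv_out/wamp.csv")
# read the per-node copy rather than the original driver-side path
df = spark.read.csv("file://" + SparkFiles.get("wamp.csv"), header=True, inferSchema=True)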

Read file from Cloudera CDSW Project with PySpark

I have a file sitting in my Cloudera project under "/home/cdsw/npi.json". I've tried using the following commands to read it with PySpark from my "local" CDSW project, but I can't get at it with any of them. They all throw the "Path does not exist:" error.
npi = sc.read.format("json").load("file:///home/cdsw/npi.json")
npi = sc.read.format("json").load("file:/home/cdsw/npi.json")
npi = sc.read.format("json").load("home/cdsw/npi.json")
As per this documentation, Accessing Data from HDFS
From the terminal, copy the file from the local file system to HDFS, using either -put or -copyFromLocal.
hdfs dfs -put /home/cdsw/npi.json /destination
where /destination is a directory in HDFS.
Then, read the file in PySpark.
npi = sc.read.format("json").load("/destination/npi.json")
For more information:
put
put [-f] [-p] [-l] <localsrc> ... <destination>
Copy files from the local file system into fs. Copying fails if the file already
exists, unless the -f flag is given.
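Putting the two steps together, a minimal sketch (assuming sc is the SparkSession from the question and that the put above succeeded):
npi = sc.read.format("json").load("hdfs:///destination/npi.json")
npi.show(5)
Spelling out the hdfs:// scheme makes it explicit that the path resolves against HDFS rather than the local CDSW filesystem.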

Neo.ClientError.Statement.ExternalResourceFailed on ubuntu

I'm trying to import a CSV file into a Neo4j DB using this script:
LOAD CSV FROM "file:///dataframe6.txt" AS line
RETURN count(*)
But I get following error:
Neo.ClientError.Statement.ExternalResourceFailed
Couldn't load the external resource at: file:/home/gaurav/sharing/dataframe6.txt
P.S.: I'm using an Ubuntu machine.
I added these lines to neo4j.conf: dbms.directories.import=/home/gaurav/sharing/
and dbms.security.allow_csv_import_from_file_urls=true
I also had to change the folder permissions for the neo4j user; it works now.
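For example, a hedged permissions fix (the service account is typically neo4j on Ubuntu package installs; adjust to your setup):
sudo chmod -R o+rX /home/gaurav/sharing
The lowercase r grants read access, and the capital X grants execute (traverse) permission only on directories, which is what Neo4j needs to reach the file.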

Migrate Apache OFBiz from Apache Derby to MySQL

I am using OFBiz in my organization. I want to migrate OFBiz from Derby to MySQL.
I followed the steps from this guide:
https://cwiki.apache.org/confluence/display/OFBIZ/How+to+migrate+OfBiz+from+Derby+to+MySQL+database but I got stuck at the end.
When I run the command java -jar ofbiz.jar -install, I get an exception:
C:\Users\sagar_vinod_khanke\Sagar\Apache OFBiz\Ofbiz\13.07>java -jar ofbiz.jar -install
Exception in thread "main" org.ofbiz.base.start.StartupException: Couldn't not fetch config instance
at org.ofbiz.base.start.Start.init(Start.java:202)
at org.ofbiz.base.start.Start.main(Start.java:127)
Caused by: java.io.IOException: Cannot load configuration properties : org/ofbiz/base/start/-install.properties
at org.ofbiz.base.start.Config.getPropertiesFile(Config.java:229)
at org.ofbiz.base.start.Config.readConfig(Config.java:297)
at org.ofbiz.base.start.Config.getInstance(Config.java:58)
at org.ofbiz.base.start.Start.init(Start.java:200)
... 1 more
Can you please help me?
Don't use the - prefix with install.
See revised Step-V
Step-V
1. Run the following command from command prompt:
java -jar ofbiz.jar install
2. Start OfBiz
3. Use webtools to import all data from XML:
a. Navigate to http://localhost:8080/catalog/
b. Go to Applications>WebTools
c. Go to section 'Entity XML Tools' and click the link 'XML Data Import Dir'. In the 'Absolute directory path:' field, enter the full path of the directory where you exported the data in Step-II.