I'm new to Pig/Hadoop.
I'm running Pig in local mode:
java -Xmx512m -Xmx1024m -cp $PIGDIR/pig.jar org.apache.pig.Main -Dpig.temp.dir=/tmp/$USER/$RANDOM -stop_on_failure -x local script-buzz.pig
My script-buzz.pig contains:
(...)
buzz = FOREACH files GENERATE chiron.buzz.Honey(file, id) as buzz_file, id;
Trying to write a folder/file from my UDF raises:
[JobControl] ERROR org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException as:felipehorta cause:org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: file:/Users/felipehorta/dev/ufrj/pig/pig-buzz/output
The following code must (!) write the files that are consumed by the next LOAD.
The jar works fine when run on its own: $ java -jar Pgm.jar *
(...)
public String exec(Tuple input) throws IOException {
try{
System.out.println(input.get(0).toString());
BumbleBee b = new BumbleBee(input.get(0).toString());
return b.writeRelation(input.get(1).toString());
} catch(Exception e){
System.err.println("Failed to process input; error - " + e.getMessage());
return null;
}
}
public String writeRelation(String folder) throws IOException {
try {
// writing file!
File output = new File("output/ERelation_" + folder + ".txt");
output.getParentFile().mkdirs();
FileWriter fw = new FileWriter(output);
String line = System.getProperty("line.separator");
fw.append("YEAR;WORD;COUNT" + line);
for (Integer year : buzzCandidates.keySet()) {
Map<String, Long> wordCounts = buzzCandidates.get(year);
for (String word : wordCounts.keySet()) {
long value = wordCounts.get(word);
if (value >= 3) {
fw.append(year + ";" + word.replace(" ", "_") + ";" + String.valueOf(value) + line);
}
}
}
fw.flush();
fw.close();
return output.getAbsolutePath();
} catch (Exception e) {
System.out.println(">>> ERROR!!\t" + e.getMessage());
return "ERROR";
}
}
I think it is a permissions problem when writing files from a UDF, but I don't know where to set the permissions. Any help?
Thanks in advance, fellows!
The error reads: Input path does not exist: file:/Users/felipehorta/dev/ufrj/pig/pig-buzz/output. Please check how the LOAD is used in the Pig script.
relation = load '/Users/felipehorta/dev/ufrj/pig/pig-buzz/output' using ...
would be the correct way of doing it.
I'm not sure this is the exact reason. It would be great if you could post the scripts.
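One thing that may be worth ruling out: the UDF writes to the relative path "output/...", which resolves against whatever working directory the JVM happens to have. Below is a minimal sketch in plain Java of the same writer taking an explicit base directory and returning the absolute path it wrote to. RelationWriter and its parameters are hypothetical names used for illustration, not part of the original UDF, and the sketch does not touch any Pig API.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

public class RelationWriter {
    // Hypothetical helper: writes "YEAR;WORD;COUNT" rows under an explicit base directory.
    public static Path writeRelation(Path baseDir, String folder,
                                     Map<Integer, Map<String, Long>> counts) throws IOException {
        // Resolve to an absolute path so the result does not depend on the JVM's working directory.
        Path output = baseDir.toAbsolutePath().resolve("ERelation_" + folder + ".txt");
        Files.createDirectories(output.getParent());
        // try-with-resources closes the writer even if a write fails.
        try (BufferedWriter w = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            w.write("YEAR;WORD;COUNT");
            w.newLine();
            for (Map.Entry<Integer, Map<String, Long>> byYear : counts.entrySet()) {
                for (Map.Entry<String, Long> e : byYear.getValue().entrySet()) {
                    // Same threshold as the original code: keep words seen at least 3 times.
                    if (e.getValue() >= 3) {
                        w.write(byYear.getKey() + ";" + e.getKey().replace(" ", "_") + ";" + e.getValue());
                        w.newLine();
                    }
                }
            }
        }
        return output;
    }
}
```

Logging the returned path makes it easy to compare with the path named in the "Input path does not exist" message.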
I have a Hadoop job running in EMR, and I am passing S3 paths as the input and output of the job.
When I run it locally everything works fine (as there is a single node).
However, when I run it in EMR on a 5-node cluster, I run into a "File already exists" IOException.
The output path has a timestamp in it, so the output path doesn't already exist in S3.
Error: java.io.IOException: File already exists:s3://<mybucket_name>/8_9_0a4574ca-96d0-47c8-8eb8-4deb82944d4b/customer/RawFile12.txt/1523583593585/TOKENIZED/part-m-00000
I have a very simple Hadoop app (primarily my mapper) which reads each line from a file and converts it using an existing library.
I'm not sure why each node is trying to write a file with the same name.
Here is the mapper:
public static class TokenizeMapper extends Mapper<Object,Text,Text,Text>{
public void map(Object key, Text value,Mapper.Context context) throws IOException,InterruptedException{
//TODO: Invoke Core Engine to transform the Data
Encryption encryption = new Encryption();
String tokenizedVal = encryption.apply(value.toString());
context.write(tokenizedVal,1);
}
}
And my reducer:
public static class TokenizeReducer extends Reducer<Text,Text,Text,Text> {
public void reduce(Text text,Iterable<Text> lines,Context context) throws IOException,InterruptedException{
Iterator<Text> iterator = lines.iterator();
int counter =0;
while(iterator.hasNext()){
iterator.next(); // advance the iterator; without this the loop never terminates
counter++;
}
Text output = new Text(""+counter);
context.write(text,output);
}
}
And my main class
public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
long startTime = System.currentTimeMillis();
try {
Configuration config = new Configuration();
String[] additionalArgs = new GenericOptionsParser(config, args).getRemainingArgs();
if (additionalArgs.length != 2) {
System.err.println("Usage: Tokenizer Input_File and Output_File ");
System.exit(2);
}
Job job = Job.getInstance(config, "Raw File Tokenizer");
job.setJarByClass(Tokenizer.class);
job.setMapperClass(TokenizeMapper.class);
job.setReducerClass(TokenizeReducer.class);
job.setNumReduceTasks(0);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
FileInputFormat.addInputPath(job, new Path(additionalArgs[0]));
FileOutputFormat.setOutputPath(job, new Path(additionalArgs[1]));
boolean status = job.waitForCompletion(true);
if (status) {
//System.exit(0);
System.out.println("Completed Job Successfully");
} else {
System.out.println("Job did not Succeed");
}
}
catch(Exception e){
e.printStackTrace();
}
finally{
System.out.println("Total Time for processing =["+(System.currentTimeMillis()-startTime)+"]");
}
}
I am passing these arguments when launching the cluster:
s3://<mybucket>/8_9_0a4574ca-96d0-47c8-8eb8-4deb82944d4b/customer/RawFile12.txt
s3://<mybucket>/8_9_0a4574ca-96d0-47c8-8eb8-4deb82944d4b/customer/RawFile12.txt/1523583593585/TOKENIZED
Appreciate any inputs.
Thanks
In the driver code you have set the number of reducers to 0, so the reducer code is not needed.
If you need to clear the output directory before launching the job, you can use this snippet to delete it if it exists:
FileSystem fileSystem = FileSystem.get(<hadoop config object>);
if(fileSystem.exists(new Path(<pathTocheck>)))
{
fileSystem.delete(new Path(<pathTocheck>), true);
}
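To try the check-then-delete idea on a local directory first, here is a rough equivalent using java.nio instead of the Hadoop FileSystem API. OutputDirCleaner is a hypothetical name, and this sketch only covers local paths, not s3:// or hdfs:// URIs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class OutputDirCleaner {
    // Deletes dir and everything below it. Returns false if there was nothing to delete.
    public static boolean deleteIfExists(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return false;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parents, so directories are empty when deleted.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
        return true;
    }
}
```

Calling it just before the job is submitted mirrors the exists/delete pair in the snippet above.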
I am looking for the best way to save game data in the Unity3D game engine.
At first, I serialized objects using BinaryFormatter.
But I heard this way has some issues and is not suitable for saving.
So, what is the best or recommended way of saving game state?
In my case, the save format must be a byte array.
"But I heard this way has some issues and is not suitable for saving."
That's right. On some devices, there are issues with BinaryFormatter. It gets worse when you update or change the class. Your old settings might be lost since the classes no longer match. Sometimes you get an exception when reading the saved data because of this.
Also, on iOS, you have to add Environment.SetEnvironmentVariable("MONO_REFLECTION_SERIALIZER", "yes"); or you will have problems with BinaryFormatter.
The best way to save is with PlayerPrefs and Json. You can learn how to do that here.
"In my case, the save format must be a byte array."
In this case, you can convert it to JSON and then convert the JSON string to a byte array. You can then use File.WriteAllBytes and File.ReadAllBytes to save and read the byte array.
Here is a generic class that can be used to save data. It is almost the same as the PlayerPrefs approach mentioned above, except that it saves the JSON data to a file instead.
DataSaver class:
public class DataSaver
{
//Save Data
public static void saveData<T>(T dataToSave, string dataFileName)
{
string tempPath = Path.Combine(Application.persistentDataPath, "data");
tempPath = Path.Combine(tempPath, dataFileName + ".txt");
//Convert To Json then to bytes
string jsonData = JsonUtility.ToJson(dataToSave, true);
//UTF8 keeps non-ASCII characters intact; Encoding.ASCII would replace them with '?'
byte[] jsonByte = Encoding.UTF8.GetBytes(jsonData);
//Create Directory if it does not exist
if (!Directory.Exists(Path.GetDirectoryName(tempPath)))
{
Directory.CreateDirectory(Path.GetDirectoryName(tempPath));
}
//Debug.Log(path);
try
{
File.WriteAllBytes(tempPath, jsonByte);
Debug.Log("Saved Data to: " + tempPath.Replace("/", "\\"));
}
catch (Exception e)
{
Debug.LogWarning("Failed To Save Data to: " + tempPath.Replace("/", "\\"));
Debug.LogWarning("Error: " + e.Message);
}
}
//Load Data
public static T loadData<T>(string dataFileName)
{
string tempPath = Path.Combine(Application.persistentDataPath, "data");
tempPath = Path.Combine(tempPath, dataFileName + ".txt");
//Exit if Directory or File does not exist
if (!Directory.Exists(Path.GetDirectoryName(tempPath)))
{
Debug.LogWarning("Directory does not exist");
return default(T);
}
if (!File.Exists(tempPath))
{
Debug.Log("File does not exist");
return default(T);
}
//Load saved Json
byte[] jsonByte = null;
try
{
jsonByte = File.ReadAllBytes(tempPath);
Debug.Log("Loaded Data from: " + tempPath.Replace("/", "\\"));
}
catch (Exception e)
{
Debug.LogWarning("Failed To Load Data from: " + tempPath.Replace("/", "\\"));
Debug.LogWarning("Error: " + e.Message);
//Nothing was read, so there is nothing to deserialize
return default(T);
}
//Convert to json string
string jsonData = Encoding.UTF8.GetString(jsonByte);
//Convert to Object
return JsonUtility.FromJson<T>(jsonData);
}
public static bool deleteData(string dataFileName)
{
bool success = false;
//Build the path to the saved file
string tempPath = Path.Combine(Application.persistentDataPath, "data");
tempPath = Path.Combine(tempPath, dataFileName + ".txt");
//Exit if Directory or File does not exist
if (!Directory.Exists(Path.GetDirectoryName(tempPath)))
{
Debug.LogWarning("Directory does not exist");
return false;
}
if (!File.Exists(tempPath))
{
Debug.Log("File does not exist");
return false;
}
try
{
File.Delete(tempPath);
Debug.Log("Data deleted from: " + tempPath.Replace("/", "\\"));
success = true;
}
catch (Exception e)
{
Debug.LogWarning("Failed To Delete Data: " + e.Message);
}
return success;
}
}
USAGE:
Example class to Save:
[Serializable]
public class PlayerInfo
{
public List<int> ID = new List<int>();
public List<int> Amounts = new List<int>();
public int life = 0;
public float highScore = 0;
}
Save Data:
PlayerInfo saveData = new PlayerInfo();
saveData.life = 99;
saveData.highScore = 40;
//Save data from PlayerInfo to a file named players
DataSaver.saveData(saveData, "players");
Load Data:
PlayerInfo loadedData = DataSaver.loadData<PlayerInfo>("players");
if (loadedData == null)
{
return;
}
//Display loaded Data
Debug.Log("Life: " + loadedData.life);
Debug.Log("High Score: " + loadedData.highScore);
for (int i = 0; i < loadedData.ID.Count; i++)
{
Debug.Log("ID: " + loadedData.ID[i]);
}
for (int i = 0; i < loadedData.Amounts.Count; i++)
{
Debug.Log("Amounts: " + loadedData.Amounts[i]);
}
Delete Data:
DataSaver.deleteData("players");
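The string-to-bytes-to-file round trip is not specific to Unity. As a rough, language-neutral illustration, here is the same idea in plain Java rather than C#. ByteSave and its methods are hypothetical names, and the JSON is treated as an opaque string. It uses UTF-8 so that non-ASCII characters survive the round trip.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ByteSave {
    // Encodes the JSON text as UTF-8 bytes and writes them, creating the parent directory if needed.
    public static void save(Path file, String json) throws IOException {
        Files.createDirectories(file.toAbsolutePath().getParent());
        byte[] bytes = json.getBytes(StandardCharsets.UTF_8);
        Files.write(file, bytes);
    }

    // Reads the bytes back and decodes them. Returns null if the file does not exist.
    public static String load(Path file) throws IOException {
        if (!Files.exists(file)) {
            return null;
        }
        byte[] bytes = Files.readAllBytes(file);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

The byte array in the middle is the part that satisfies the "must be a byte array" requirement; the file is just one place to put it.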
I know this post is old, but in case other users also find it while searching for save strategies, remember:
PlayerPrefs is not for storing game state. It is explicitly named "PlayerPrefs" to indicate its use: storing player preferences. It is essentially plain text. It can easily be located, opened, and edited by any player. This may not be a concern for all developers, but it will matter to many whose games are competitive.
Use PlayerPrefs for Options menu settings like volume sliders and graphics settings: things where you don't care that the player can set and change them at will.
Use I/O and serialization for saving game data, or send it to a server as Json. These methods are more secure than PlayerPrefs, even if you encrypt the data before saving.
In a Java service, there is a call to a function that has no declaration, and only a compile-time error appears. But the output is as expected, with no run-time errors. How is that possible? Can anyone please explain?
The error it shows is "The method functionName() is undefined".
Below is the code.
public static final void documentToStringVals(IData pipeline)
throws ServiceException {
// pipeline
IDataCursor pipelineCursor = pipeline.getCursor();
String success = "false";
IData inputDoc = null;
String outputValue = "";
String headerYN = "N";
boolean headerValue = false;
String delimiter = ",";
String newline = System.getProperty("line.separator");
if (pipelineCursor.first("inputDocument") ) {
inputDoc = (IData) pipelineCursor.getValue();
}
else {
throw new ServiceException("inputDocument is a required parameter");
}
if (pipelineCursor.first("delimiter") ) {
delimiter = (String) pipelineCursor.getValue();
}
if (pipelineCursor.first("headerYN") ) {
headerYN = (String) pipelineCursor.getValue();
}
if (headerYN.equalsIgnoreCase("Y")) {
headerValue = true;
}
try {
outputValue = docValuesToString(inputDoc, headerValue, delimiter);
outputValue += newline;
success = "true";
}
catch (Exception e) {
System.out.println("Exception in getting string from document: " + e.getMessage());
pipelineCursor.insertAfter("errorMessage", e.getMessage());
}
pipelineCursor.insertAfter("success", success);
pipelineCursor.insertAfter("outputValue", outputValue);
pipelineCursor.destroy();
}
The code you posted has no reference to "functionName", so I suspect there's a reference to it either in the shared code section or in another Java service in the same folder. Given that all Java services in a folder get compiled into a single class, and therefore all those services need to be compiled together, this could cause the error message when you're compiling the service above.
I'm trying to copy files for Tesseract to use, and no matter what I try it keeps giving me file-not-found exceptions. I don't understand why, because I have them in the assets folder. First I tried copying the specific tessdata folder, which wasn't working, so I tried putting them all directly under the general assets folder and copying each file from there into a new directory I created on the card called tessdata.
Here's an image of the files in the folder, the method for copying and the log errors that post:
And here's the code:
private void copyAssets() {
AssetManager assetManager = getAssets();
String[] files = null;
try {
files = assetManager.list("");
} catch (IOException e) {
android.util.Log.e(TAG, "Failed to get asset file list.", e);
}
for(String filename : files) {
InputStream in = null;
OutputStream out = null;
try {
in = assetManager.open(filename);
out = new FileOutputStream(tesspath+ "/" + filename);
copyFile(in, out);
in.close();
in = null;
out.flush();
out.close();
out = null;
} catch(IOException e) {
android.util.Log.e("tag", "Failed to copy asset file: " + filename, e);
}
}
}
I also tried this method from an example, which copies them from the tessdata folder within assets:
if (!(new File(tesspath + File.separator + lang + ".traineddata")).exists()) {
copyAssets();
/* try {
AssetManager assetManager = getAssets();
//open the asset manager and open the traineddata path
InputStream in = assetManager.open("tessdata/eng.traineddata");
android.util.Log.e(TAG, "OPENED SUCCESSFULLY IF NO ERROR BEFORE THIS");
OutputStream out = new FileOutputStream(tesspath + "/eng.traineddata");
android.util.Log.e(TAG, "WRITING NOW TO" + tesspath);
byte[] buf = new byte[8024];
int len;
while ((len = in.read(buf)) > 0) {
out.write(buf, 0, len);
}
in.close();
out.close();
} catch (IOException e) {
android.util.Log.e(TAG, "Was unable to copy " + lang
+ " traineddata " + e.toString());
android.util.Log.e(TAG, "IM PRINTING THE STACK TRACEs");
e.printStackTrace();
}
*/
} else {
processImage(STORAGE_PATH + File.separator + "savedAndroid.jpg");
}
Have you checked the SD card write permission in the manifest?
You will probably need this in AndroidManifest.xml:
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
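Separately from the permission, FileOutputStream throws FileNotFoundException when the parent directory of the target file does not exist, so it may be worth making sure the tessdata directory is created before copying. Here is a minimal sketch in plain Java with no Android classes. AssetCopy and copyTo are hypothetical names, and copyFile is only a guess at what the helper called in the question does.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class AssetCopy {
    // Copies everything from in to out; read() returns -1 at end of stream.
    public static void copyFile(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        int len;
        while ((len = in.read(buf)) != -1) {
            out.write(buf, 0, len);
        }
    }

    // Ensures targetDir exists, then copies the stream into targetDir/name.
    public static File copyTo(InputStream in, File targetDir, String name) throws IOException {
        if (!targetDir.isDirectory() && !targetDir.mkdirs()) {
            throw new IOException("Could not create directory " + targetDir);
        }
        File target = new File(targetDir, name);
        // try-with-resources closes both streams even if the copy fails part way.
        try (InputStream src = in; OutputStream out = new FileOutputStream(target)) {
            copyFile(src, out);
        }
        return target;
    }
}
```

In the question's code, the stream would come from assetManager.open(filename) and targetDir would be new File(tesspath).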