I am trying to load a JSON file (Email_Master.json) that lives in an Azure storage container, using a Pig script. The JSON file was itself generated by a Pig script and written to the container.
I get the following error when loading the file with a Pig script run through PowerShell:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1131: Could not find schema file
The command used is:
a = LOAD '$Azure_Path/Email_Master.json' USING JsonLoader();
How can I resolve this issue?
The issue is with the default container specified while provisioning the HDInsight cluster: the schema and header files are stored in the default container, so the loader cannot find them at the path being loaded. The sketch below can confirm where those files actually landed.
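A minimal sketch for confirming where the schema file landed, assuming the azure-storage-blob Python SDK and hypothetical account, key, and container names (Pig's JsonStorage writes a hidden .pig_schema file alongside the output, which JsonLoader then looks for):

from azure.storage.blob import BlobServiceClient

# Hypothetical account, key, and container names; adjust to your cluster.
service = BlobServiceClient(
    account_url="https://mystorageaccount.blob.core.windows.net",
    credential="my-account-key",
)
for name in ("defaultcontainer", "datacontainer"):
    container = service.get_container_client(name)
    for blob in container.list_blobs(name_starts_with="Email_Master.json"):
        # Expect entries like Email_Master.json/part-m-00000 and
        # Email_Master.json/.pig_schema; note which container holds them.
        print(name, blob.name)

If the .pig_schema file shows up only in the default container, loading via that container's fully qualified path (or passing an explicit schema string to JsonLoader) should avoid the error.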
I am trying to load a JSON file using the load_dataset method from the Hugging Face Datasets library in a Kaggle kernel. This is the code:
from datasets import load_dataset

data = load_dataset("json", data_files="/kaggle/input/dataset/gold_summaries_test.json")
I get the following error only when I am working in the Kaggle kernel.
AttributeError: 'list' object has no attribute 'keys'
A preview of the JSON file I am trying to load, gold_summaries_test.json, was attached.
This error does not occur when loading the file in Google Colab or in a Python console on my local machine.
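A minimal sketch of a common workaround, assuming the file is a top-level JSON array (which would explain the 'list' object error on the datasets version Kaggle ships): rewrite it as JSON Lines, which load_dataset reads without needing top-level keys. Only the input path comes from the question; the output path is hypothetical.

import json
from datasets import load_dataset

src = "/kaggle/input/dataset/gold_summaries_test.json"
dst = "/kaggle/working/gold_summaries_test.jsonl"  # Kaggle's writable directory

# Assumption: the file holds a top-level list of records.
with open(src) as f:
    records = json.load(f)

with open(dst, "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

data = load_dataset("json", data_files=dst)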
I am trying to load Avro files from S3 into a Redshift table. One of the Avro files is malformed, and when the COPY command tries to load that file it throws an exception and aborts the copy of the correct files as well. How can I skip the badly formatted file and copy the correct ones? Here is my code for loading the files:
COPY tmp.table
FROM 's3://{BUCKET}/{PREFIX}'
IAM_ROLE '{ROLE}'
FORMAT AVRO 's3://{BUCKET}/{AVRO_PATH}'
The error that I am getting is:
code: 8001
context: Cannot init avro reader from s3 file Incorrect Avro container file magic number
query: 19308992
location: avropath_request.cpp:438
process: query0_125_19308992 [pid=23925]
You can preprocess the s3://{BUCKET}/{PREFIX} files and create a manifest file listing only the Avro files that have the right format/schema; a sketch follows. Redshift can't do this for you and will try to process all files on the s3://{BUCKET}/{PREFIX} path.
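A minimal sketch of that preprocessing step, assuming boto3 and hypothetical bucket/prefix names: it keeps only the objects whose first four bytes match the Avro container magic number (the same check the error message refers to) and writes a Redshift COPY manifest.

import json
import boto3

BUCKET, PREFIX = "my-bucket", "avro/incoming/"  # hypothetical names
AVRO_MAGIC = b"Obj\x01"  # first four bytes of a valid Avro container file

s3 = boto3.client("s3")
entries = []
for page in s3.get_paginator("list_objects_v2").paginate(Bucket=BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        # Read just the first four bytes of each object.
        head = s3.get_object(Bucket=BUCKET, Key=obj["Key"], Range="bytes=0-3")["Body"].read()
        if head == AVRO_MAGIC:
            entries.append({"url": f"s3://{BUCKET}/{obj['Key']}", "mandatory": True})

s3.put_object(
    Bucket=BUCKET,
    Key=f"{PREFIX}manifest.json",
    Body=json.dumps({"entries": entries}),
)

Then point the COPY's FROM clause at the manifest file and add the MANIFEST keyword, so Redshift loads only the listed files.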
I have updated the neo4j.conf file but can't seem to get rid of this error after changing the file and restarting. I am just trying to load a JSON file through Neo4j and have added the line apoc.import.file.enabled=true to neo4j.conf, but it doesn't seem to be working for me; I'm still getting the error message:
Failed to invoke procedure 'apoc.load.json' Caused by
java.lang.RuntimeException : Import from files not enabled, please set
apoc.import.file.enabled=true in your neo4j.conf
I am using Neo4j CE 3.2.3 with the APOC 3.2.0.4 plugin, and I have used the right file path for the JSON file, as the same setup previously worked on my desktop computer (I'm just trying to replicate it on my laptop). The procedure apoc.load.json is also there when I list all procedures.
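A minimal sketch for checking whether the restarted server actually picked up the setting, assuming the official neo4j Python driver (a 1.x release for a 3.2 server), a local Bolt endpoint, and hypothetical credentials and file URL; it simply re-issues the failing call:

from neo4j import GraphDatabase

# Hypothetical connection details; adjust to your instance.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    record = session.run(
        "CALL apoc.load.json($url) YIELD value RETURN value LIMIT 1",
        url="file:///example.json",  # hypothetical file URL
    ).single()
    print(record)
driver.close()

If this still raises the same RuntimeException, the edited neo4j.conf is probably not the one the running instance reads (a common pitfall when more than one installation exists on the machine).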
I am trying to refresh my app to show changes on localhost, but Sencha Cmd seems unable to load my app.json file from the generated app folder structure. The file is definitely there, but I do not know why it is tripping up on this.
Here is the log:
Sencha Cmd v4.0.2.67
[ERR] Failed to load JSON from /Users/username/Downloads/sencha/app.json - com.google.gson.stream.MalformedJsonException: Unterminated array at line 80 column 20
It seems this error only occurs when an additional CSS file is added to the app.json file; a quick way to pinpoint the exact problem is sketched below.
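A minimal sketch for locating the bad spot, assuming a standard Python install and the path from the error above (note that Sencha Cmd tolerates comments in app.json, so a strict parser may flag those too):

import json

path = "/Users/username/Downloads/sencha/app.json"  # path from the error above
try:
    with open(path) as f:
        json.load(f)
    print("app.json parses cleanly")
except json.JSONDecodeError as e:
    # Reports the exact position, e.g. a missing comma or bracket after
    # the newly added css entry, which Gson reports as an unterminated array.
    print(f"Invalid JSON at line {e.lineno}, column {e.colno}: {e.msg}")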
Thanks.
I've created an SSIS package on machine X to retrieve data via a MySQL query from machine Y and write it to a SQL Server destination table on machine Z (a forced arrangement, since I am unable to connect to MySQL from Z and X is the only machine that has Navicat).
The package runs to a T when executed manually. I'm trying to schedule it on machine X against Z's DB: I've created the XML configuration file and placed it on Z, since the process runs against Z's DB, but the job fails when executed as a scheduled job.
I've added passwords to the config file, as they aren't saved automatically.
I suppose it's due to the different machines being used (the package on X running against Z's DB, with the config file on Z).
Here's the error:
Failed to open package file "D:\CSMS\SSIS\Random\Random\MySQlDBtoDWH11DataTransfer.dtsx" due to error 0x80070015 "The device is not ready." This happens when loading a package and the file cannot be opened or loaded correctly into the XML document. This can be the result of either providing an incorrect file name was specified when calling LoadPackage or the XML file was specified and has an incorrect format. End Error Could not load package "D:\CSMS\SSIS\Random\Random\MySQlDBtoDWH11DataTransfer.dtsx" because of error 0xC0011002. Description: Failed to open package file "D:\CSMS\SSIS\Random\Random\MySQlDBtoDWH11DataTransfer.dtsx" due to error 0x80070015 "The device is not ready." This happens when loading a package and the file cannot be opened or loaded correctly into the XML document. This can be the result of either providing an incorrect file name was specified when calling LoadPackage or the XML file was specified and has an incorrect format.
Unable to understand where I'm failing!
Are you using direct configuration, or indirect configuration (in which your XML config file path is saved in an environment variable)?
If you are using direct configuration, you need to make sure both machines have the same folder structure that is saved in the package.
If you are using an environment variable to point to the configuration file, make sure you have changed the variable's value to match the machine and folders where your configuration file lives.
To close this question: I've scheduled the package to run from a batch file and the process is running fine.