Read Parquet File Error Spark on Azure Synapse Workspace - exception

I am running a PySpark job in an Azure Synapse workspace. The job is failing with the following error. Can someone help me debug it?
The error occurs in a Spark application run by a pipeline on Azure Synapse.
Stacktrace: An error occurred while calling o1394.execute.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 94.0 failed 4 times, most recent failure: Lost task 0.3 in stage 94.0 (TID 2313) (vm-1d164027 executor 3): java.io.EOFException
at org.apache.parquet.bytes.BytesUtils.readIntLittleEndian(BytesUtils.java:85)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:520)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:505)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:499)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:476)
at

The stack trace shows a java.io.EOFException thrown from ParquetFileReader.readFooter, which means Spark reached the end of a file while trying to read its Parquet footer. This suggests that at least one of the Parquet files is incomplete or corrupt.
To debug this, inspect the Parquet files themselves. One way to do this is the "parquet-tools" command-line tool, which can examine the contents of Parquet files and help identify missing or corrupted data.
If parquet-tools does not reveal the cause, the problem could be a library implementation issue.
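As a quick first check before reaching for parquet-tools, the file's framing can be tested directly. The following is a minimal sketch, not part of the original answer. It assumes the suspect file has been copied to a local path, that it is an unencrypted Parquet file (which starts and ends with the 4-byte magic `PAR1`, with a 4-byte little-endian footer length just before the trailing magic), and the function name `check_parquet_footer` is hypothetical.

```python
import os

PARQUET_MAGIC = b"PAR1"  # magic bytes of an unencrypted Parquet file


def check_parquet_footer(path):
    """Return a short diagnosis string for a local Parquet file (hypothetical helper)."""
    size = os.path.getsize(path)
    # Smallest plausible file: 4-byte leading magic + 4-byte footer length + 4-byte trailing magic.
    if size < 12:
        return "too small"
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-8, os.SEEK_END)
        tail = f.read(8)
    if head != PARQUET_MAGIC or tail[4:] != PARQUET_MAGIC:
        return "bad magic"
    # The footer length is stored little-endian in the 4 bytes before the trailing magic.
    footer_len = int.from_bytes(tail[:4], "little")
    if footer_len + 12 > size:
        return "bad footer length"
    return "ok"
```

A result other than "ok" for any input file would be consistent with the EOFException above; a zero-length or partially written file is a typical candidate. Passing this check does not prove that the file is fully valid.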

SSIS Workflow issue when run by Task Scheduler

I have an SSIS workflow that is a little unreliable.
The Normal procedure should be as follows: Task Scheduler Job starts batch File. Batch File starts SSIS Job.
This process produces this error:
Error: 2020-12-08 07:10:43.95
Code: 0xC02090F5
Source: Data Flow Task Connect to Impala [2132]
Description: The component "Connect to Impala" (2132) was unable to process the data. ERROR [08S01] [Cloudera][ImpalaODBC] (120) Error while retrieving data from in Impala: [08S01] : SSL_read: error code: 0
End Error
Error: 2020-12-08 07:10:43.95
Code: 0xC0047038
Source: Data Flow Task SSIS.Pipeline
Description: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on component "Connect to Impala" (2132) returned error code 0xC02090F5. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing. There may be error messages posted before this with more information about the failure.
End Error
But when I start the batch file manually, the SSIS job executes successfully. Even when I start the job from within MS Visual Studio there are no issues.
Searching on Google hints that there could be an issue with the ODBC connection. I am using a 32-bit ODBC driver for Impala (User DSN). I also tried the 64-bit driver, but that doesn't work either.
I appreciate every answer.
Thanks
Problem solved
Please see this solution: https://community.cloudera.com/t5/Support-Questions/ERROR-SSL-read-error-code-0-with-IMPALA-from-R-programming/td-p/280561
It is about timeout parameters.

An unexpected error happened during phase Publishing of job

I am trying to use the ForgeApp with Revit. To do so, I am executing the workitem from Postman. During execution I get the error below.
An unexpected error happened during phase Publishing of job.
The relevant parts of the actual report (after removing some sensitive info) are as follows:
[10/15/2020 05:45:24] Finished running. Process will return: Success
[10/15/2020 05:45:24] ====== Revit finished running: revitcoreconsole ======
[10/15/2020 05:45:25] End Revit Core Engine standard output dump.
[10/15/2020 05:45:25] End script phase.
[10/15/2020 05:45:25] Start upload phase.
[10/15/2020 05:45:25] Error: Non-optional output [result.json] is missing.
[10/15/2020 05:45:25] Error: An unexpected error happened during phase Publishing of job.
[10/15/2020 05:45:25] Job finished with result FailedMissingOutput
[10/15/2020 05:45:25] Job Status:
From the error it is clear that the plugin failed during processing, but it doesn't give much idea as to why it failed. After receiving this error, I tried to debug locally by following https://forge.autodesk.com/blog/design-automation-debug-revit-plugin-locally
But during debugging it fails with the error below while executing the plugin itself. It executes OnStartup without any issues, but after that it never enters HandleDesignAutomationReadyEvent.
Managed Debugging Assistant 'FatalExecutionEngineError' : 'The runtime has encountered a fatal error. The address of the error was at 0xdb9b8a8d, on thread 0x3784*
So I am not sure how to go ahead and resolve this. If I can get this working somehow, either locally with the debugger or through Postman, it would help.
Further update - I have now found the root cause. Even though the error complained about the missing output file, leading me to believe the code was not running properly, the actual root cause was that the plugin was not finding the model from the inputFile parameter. This became clear once I put Console.Write calls in the plugin code, after which debugging became easy. At first I wasn't sure the Console output would appear in the report, and so I had turned the debug logs off. But as I had no other means of verbose logging, I added lots of Console.Write calls and that is how I found the root cause. Thanks

Azure pipeline getting error: [error]The read operation failed, see inner exception on mac hosted agent

I am suddenly getting this error and am trying to find out why it happened and, more importantly, how to debug such an error.
What does this line mean:
Error The read operation failed, see inner exception.
Where is this inner exception?
2020-09-30T18:47:22.0199830Z ##[section]Starting: Initialize job
2020-09-30T18:47:22.0201330Z Agent name: 'Hosted Agent'
2020-09-30T18:47:22.0201750Z Agent machine name: 'Mac-1601490664598'
2020-09-30T18:47:22.0202040Z Current agent version: '2.175.2'
2020-09-30T18:47:22.0219900Z Current image version: '20200904.1'
2020-09-30T18:47:22.0229850Z Agent running as: 'runner'
2020-09-30T18:47:22.0293150Z Prepare build directory.
2020-09-30T18:47:22.0595770Z Set build variables.
2020-09-30T18:47:22.0631220Z Download all required tasks.
2020-09-30T18:47:22.0751440Z Downloading task: CmdLine (2.164.2)
2020-09-30T18:48:02.2372880Z Downloading task: UseRubyVersion (0.165.2)
2020-09-30T18:48:48.2651220Z Downloading task: DownloadBuildArtifacts (0.167.2)
2020-09-30T18:51:03.2405560Z ##[warning]Failed to download task 'DownloadBuildArtifacts'. Error The read operation failed, see inner exception.
2020-09-30T18:51:03.2423990Z ##[warning]Inner Exception: {ex.InnerException.Message}
2020-09-30T18:51:03.2428450Z ##[warning]Back off 23.799 seconds before retry.
2020-09-30T18:53:07.4698560Z ##[warning]Failed to download task 'DownloadBuildArtifacts'. Error The read operation failed, see inner exception.
2020-09-30T18:53:07.4701220Z ##[warning]Inner Exception: {ex.InnerException.Message}
2020-09-30T18:53:07.4704340Z ##[warning]Back off 13.329 seconds before retry.
2020-09-30T18:57:08.7191850Z ##[error]The read operation failed, see inner exception.
2020-09-30T18:57:08.7198800Z ##[section]Finishing: Initialize job
You are not the only one who encountered this interruption, see this post.
I reviewed our internal service telemetry log; the issue you encountered appears to have been caused by a service event on our side. https://status.dev.azure.com/_history
Some exceptions occurred on our backend starting from 15:23:27 CST, which caused the pipeline interruption you encountered.
how to debug such an error
Normally it is hard for users to check the inner exception when using the hosted pool, because the detailed exception messages are recorded in our backend telemetry log. If you are blocked again in the future and would like to know the detailed message, you can contact our team by clicking on the Report outage button mentioned below:
Since the event has now been mitigated, I'm sure your pipelines will work fine if you re-run them now.

"File not found Exception" at runtime in Glassfish ESB 2.1

Good wishes of the day..!
In production we have GlassFish 2.1 servers hosting ESB applications, with two instances on each of two Linux boxes. The ESB application takes a client request, transforms it and sends it to the destination, then receives the response and sends it back to the client.
For the past few days we have been seeing a "File not found Exception" in the logs, thrown by WsdlQueryHelper of the HTTP BC. We analyzed the logs and found that it happens on only one instance (Instance 2 of Server 1), and only for a few requests on that instance. We checked that instance's service with a SOAP tool and it returns the appropriate response, so we understand that WsdlQueryHelper fails to process a few requests at runtime. The exception details from the logs are below:
*[#|2012-12-13T18:29:24.526+1100|FINE|sun-appserver2.1|com.sun.jbi.httpsoapbc.WsdlQueryHelper|_ThreadID=319;_ThreadName=httpWorkerThread-7092-0;ClassName=com.sun.jbi.httpsoapbc.WsdlQueryHelper;MethodName=;_RequestID=6fdd0535-24d4-4878-8c98-b48e2dea39eb;|init
query helper failed. javax.wsdl.WSDLException: WSDLException (at
/definitions/types/xsd:schema): faultCode=OTHER_ERROR: An error
occurred trying to resolve schema referenced at 'RouterSchema_v4.xsd',
relative to ''.: java.io.FileNotFoundException: This file was not
found:
file:/home/glassfish/GlassFishESBv21/glassfish/nodeagents/GLASSFISH-001-NA/GLASSFISH-001-instB/RouterSchema_v4.xsd
at com.ibm.wsdl.xml.WSDLReaderImpl.parseSchema(WSDLReaderImpl.java:918)
at com.ibm.wsdl.xml.WSDLReaderImpl.parseSchema(WSDLReaderImpl.java:678)
at com.ibm.wsdl.xml.WSDLReaderImpl.parseTypes(WSDLReaderImpl.java:639)
at com.ibm.wsdl.xml.WSDLReaderImpl.parseDefinitions(WSDLReaderImpl.java:339)
at com.ibm.wsdl.xml.WSDLReaderImpl.readWSDL(WSDLReaderImpl.java:2324)
at com.ibm.wsdl.xml.WSDLReaderImpl.readWSDL(WSDLReaderImpl.java:2288)
at com.ibm.wsdl.xml.WSDLReaderImpl.readWSDL(WSDLReaderImpl.java:2341)
at com.ibm.wsdl.xml.WSDLReaderImpl.readWSDL(WSDLReaderImpl.java:2362)
at com.sun.jbi.httpsoapbc.WsdlQueryHelper.<init>(WsdlQueryHelper.java:105)
at com.sun.jbi.httpsoapbc.embedded.JAXWSGrizzlyRequestProcessor.processSynchronousQueryResource(JAXWSGrizzlyRequestProcessor.java:293)
at com.sun.jbi.httpsoapbc.embedded.JAXWSGrizzlyRequestProcessor.service(JAXWSGrizzlyRequestProcessor.java:217)
at com.sun.enterprise.web.connector.grizzly.DefaultProcessorTask.invokeAdapter(DefaultProcessorTask.java:647)
at com.sun.jbi.httpsoapbc.embedded.JBIGrizzlyAsyncFilter.doFilter(JBIGrizzlyAsyncFilter.java:95)
at com.sun.enterprise.web.connector.grizzly.async.DefaultAsyncExecutor.invokeFilters(DefaultAsyncExecutor.java:175)
at com.sun.enterprise.web.connector.grizzly.async.DefaultAsyncExecutor.interrupt(DefaultAsyncExecutor.java:153)
at com.sun.enterprise.web.connector.grizzly.async.AsyncProcessorTask.doTask(AsyncProcessorTask.java:92)
at com.sun.enterprise.web.connector.grizzly.TaskBase.run(TaskBase.java:265)
at com.sun.enterprise.web.connector.grizzly.WorkerThreadImpl.run(WorkerThreadImpl.java:116)
Caused by: java.io.FileNotFoundException: This file was not found:
file:/home/glassfish/GlassFishESBv21/glassfish/nodeagents/GFESB_ASPAC_001-NA/GFESB_ASPAC_001-instB/RouterSchema_v4.xsd
at com.ibm.wsdl.util.StringUtils.getContentAsInputStream(StringUtils.java:199)
at com.ibm.wsdl.xml.WSDLReaderImpl.parseSchema(WSDLReaderImpl.java:840)
... 17 more |#]*
Could you please help us resolve the issue?
Regards,
Ram

Why isn't findbugs_result.xml generated?

I installed Jenkins 1.452 and FindBugs 4.34. After building and invoking FindBugs, findbugs_result.xml isn't generated, so the FindBugs report isn't generated either. The following message is printed:
[FINDBUGS] Collecting findbugs analysis files...
Finished: SUCCESS
Normally FindBugs is supposed to generate result.xml and parse it into warnings, so that we would see warnings or progress messages in the console output. FindBugs works normally for the same project in Hudson, and I configured Jenkins the same way as Hudson. Below is a configuration snapshot.
The OS is CentOS. Where can I find the FindBugs log? I think some errors may be reported there, and the log would be very helpful in my case.
When I change the name to '**/findbugs.xml', it throws another error:
[FINDBUGS] Collecting findbugs analysis files...
[FINDBUGS] Parsing 1 files in /home/irteam/.jenkins/jobs/hangame-dnest/workspace
[FINDBUGS] Parsing of file /home/irteam/.jenkins/jobs/hangame-dnest/workspace/common-build/common/findbugs.xml failed due to an exception:
org.dom4j.DocumentException: Sax error Nested exception: Invalid top-level element (expected BugCollection, saw project)
Do you have some clues?
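The last message says the parser expected a root element named BugCollection but saw project, so the file matched by the new pattern may not be a FindBugs result at all (for example, it could be a build file that happens to be named findbugs.xml; that is only a guess). A minimal sketch for checking what a candidate file actually contains is shown below; the helper name `root_element_name` is hypothetical, and the path should be whichever file the plugin reports.

```python
import xml.etree.ElementTree as ET


def root_element_name(path):
    """Return the root element's tag (namespace stripped) without parsing the whole file."""
    with open(path, "rb") as f:
        for _event, elem in ET.iterparse(f, events=("start",)):
            # The first "start" event is always the root element.
            return elem.tag.rsplit("}", 1)[-1]
    return None


def looks_like_findbugs_result(path):
    """True if the root element is BugCollection, the name the error message says is expected."""
    return root_element_name(path) == "BugCollection"
```

Running `looks_like_findbugs_result` on the file named in the log would show whether the pattern is picking up the wrong file or whether the analysis output itself is in an unexpected format.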