I realize this question has been asked before, but I think my failure is due to a different reason.
List<Tuple2<String, Integer>> collected = results.collect(); // results is the reduced JavaPairRDD
for (int i = 0; i < collected.size(); i++) {
    System.out.println(collected.get(i)._1);
}
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task not serializable: java.io.NotSerializableException: tools.MAStreamProcessor$1 at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1214) at
I have a simple 'map/reduce' program in Spark. The above lines take the results of the reduce step and loop through each resulting element. If I comment them out, I get no errors. I stayed away from 'forEach' and the concise for-each loop, thinking that the code they generate under the hood might produce elements that aren't serializable. I've gotten it down to a simple for loop, so I wonder why I am still running into this error.
Thanks,
Ranjit
Use the -Dsun.io.serialization.extendedDebugInfo=true flag to turn on serialization debug logging. It will tell you what exactly it's unable to serialize.
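For example, if you launch with spark-submit, the flag can be passed to both the driver and the executor JVMs through the standard extraJavaOptions settings (a sketch; substitute your actual jar and arguments):

spark-submit \
  --conf "spark.driver.extraJavaOptions=-Dsun.io.serialization.extendedDebugInfo=true" \
  --conf "spark.executor.extraJavaOptions=-Dsun.io.serialization.extendedDebugInfo=true" \
  your-app.jar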
The answer will have nothing to do with the lines you pasted. The collect() is not the source of the problem; it's just what triggers the computation of the RDD. If you don't compute the RDD, nothing gets sent to the executors, so the accidental inclusion of something non-serializable in an earlier step causes no problems until you call collect().
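The tools.MAStreamProcessor$1 in your stack trace is the compiler's name for an anonymous inner class, and an anonymous inner class always keeps a hidden reference to its enclosing instance. A common way this bites Spark jobs (a hypothetical sketch; the real contents of MAStreamProcessor are unknown):

// Hypothetical reconstruction, for illustration only.
import org.apache.spark.api.java.function.Function2;

public class MAStreamProcessor {
    // BAD: an anonymous Function2 compiles to MAStreamProcessor$1 and
    // captures the enclosing MAStreamProcessor instance; if that outer
    // class isn't Serializable, the task can't be serialized:
    //   pairs.reduceByKey(new Function2<Integer, Integer, Integer>() {
    //       public Integer call(Integer a, Integer b) { return a + b; }
    //   });

    // GOOD: a static nested class holds no reference to the outer instance
    // (Spark's Function2 already extends Serializable):
    static class Sum implements Function2<Integer, Integer, Integer> {
        public Integer call(Integer a, Integer b) { return a + b; }
    }
    //   pairs.reduceByKey(new Sum());
}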
I have implemented a Kafka Streams application. Let's say one of the fields of the object the stream is currently processing contains a number instead of a string value. Currently, when an exception is thrown in the processing logic, e.g. in the .transform() method, the whole stream is killed and my application stops processing data.
I would like to skip such an invalid record and keep processing the next records available on the input topic. Additionally, I don't want to implement any try-catch statements in my stream processing code.
To achieve this, I implemented a StreamsUncaughtExceptionHandler that returns the StreamThreadExceptionResponse.REPLACE_THREAD enum value in order to spawn a new thread and keep processing the next records waiting on the input topic. However, it turned out that the stream's consumer offset is not committed, so when a new thread is started, it picks up the old record that has just killed the previous stream thread... Since the logic is the same, the new thread also fails to process that record and dies again. The result is a loop that spawns a new thread and fails on the same record every time.
Is there any clean way of skipping the failing record and keeping the stream processing the next records?
Please note, I am not asking about DeserializationExceptionHandler or ProductionExceptionHandler.
When it comes to the application-level code, it is mostly up to the application how the exception is handled. This use case has come up before. See these previous Stack Overflow threads.
Example on handling processing exception in Spring Cloud Streams with Kafka Streams Binder and the functional style processor
How to stop sending to kafka topic when control goes to catch block Functional kafka spring
Try to see if those answers can be applied to your scenario.
You can filter out events that don't match a pattern, or validate the events before you transform them.
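For instance (a sketch; the topic names, the processing step, and the isValid check are placeholders, not from the question), a filter placed in front of the transform step drops malformed records without any try-catch in the processing code itself:

// Hypothetical topology, for illustration only.
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.KStream;

public class SkipInvalidRecords {
    static Topology buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> input = builder.stream("input-topic");
        input.filter((key, value) -> isValid(value))   // drop records that would fail
             .mapValues(String::toUpperCase)           // stand-in for the real processing
             .to("output-topic");
        return builder.build();
    }

    // Example check for the scenario in the question: reject values
    // that are numbers where a string is expected.
    static boolean isValid(String value) {
        return value != null && !value.matches("-?\\d+(\\.\\d+)?");
    }
}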
I'm trying to identify a bug in a (32-bit) program which could probably lead to code execution. So far I debugged the application with OllyDbg and ran my exploit code. Then OllyDbg gives me an exception.
If I press "Ctrl+F9", none of my shellcode seems to be executed.
In contrast, when the exception occurs and I step through the next instructions with "F8", I finally reach my shellcode and it gets executed.
If I run the application without OllyDbg, my shellcode also doesn't get executed.
Why does my shellcode get executed when I step through the next instructions, and otherwise not? What is the normal behavior when I run the application without a debugger?
Thanks a lot!
When an exception is raised in a thread the system will first check if a debugger is attached.
If a debugger is attached, the exception is reported to the debugger (and not to the faulting process or thread). In OllyDbg (and most debuggers) you then have a choice of what to do with that exception.
The first option is to pass the exception on to the faulting thread (Ctrl+F9 in OllyDbg).
The system will then look at the EXCEPTION_REGISTRATION_RECORD for the current thread, walk the list of EXCEPTION_REGISTRATION structures (each of these structures has an exception handler), and check whether a handler can handle the exception.
If a handler can handle the exception, the stack is unwound (to a certain point) and the thread might continue its life.
If no handler can handle the exception, the final handler is called and the program crashes (the system will then usually display a dialog box informing the user that the process crashed).
This is exactly the same behavior in the case no debugger is attached.
Thus, in your case, passing the exception on to the faulting thread (Ctrl+F9) will probably unwind the stack, and the thread will continue its execution after the location of the exception (or the whole application will simply crash if the exception couldn't be handled).
The second option - when a debugger is attached - is to not pass the exception to the faulting thread (using one of the step [into | over] / run buttons). In this case the system will not search for any handler, and the thread will either simply re-raise the exception (if the debugger can't pass over it) or continue execution as if nothing happened (if the debugger knows how to handle it).
If you want your shellcode to execute without problems, you should check which type of exception is raised (most probably an access violation on read/write, or a breakpoint exception) and correct the problem; look at the bottom of the OllyDbg window, which tells you which kind of exception has been raised.
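As a minimal illustration of the dispatch described above (an MSVC-specific sketch using __try/__except, not code from the target program): the filter expression decides whether a handler in the chain takes the exception, and if one does, the stack is unwound to that handler.

#include <windows.h>
#include <stdio.h>

int main(void)
{
    __try {
        int *p = NULL;
        *p = 42;                          /* raises EXCEPTION_ACCESS_VIOLATION */
    }
    __except (GetExceptionCode() == EXCEPTION_ACCESS_VIOLATION
                  ? EXCEPTION_EXECUTE_HANDLER   /* this handler takes it */
                  : EXCEPTION_CONTINUE_SEARCH)  /* otherwise keep walking the chain */
    {
        printf("handled: stack unwound to this handler\n");
    }
    return 0;
}

When a debugger is attached, it sees this exception first (the "first-chance" notification) before any of these handlers run.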
I'm programming in Prolog and sometimes I want to get a fail, but instead I get an exception. I can't understand why there should be a difference between them: if something couldn't execute, that means the predicate didn't succeed, so it's a simple fail. Am I missing something?
A failure means that what you're trying to prove is false. An exception means that what you're trying to prove doesn't make sense (e.g. trying to compute the square root of an atom) for some reason or that you bumped into some system limitation (e.g. exhausting available memory).
But you can easily convert any exception into a failure by writing:
catch(Goal, _, fail)
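For example (the exact error text varies between Prolog systems), evaluating sqrt/1 on an atom throws a type error, while the same goal wrapped in catch/3 simply fails:

?- X is sqrt(foo).
% throws an error: foo is not evaluable

?- catch(X is sqrt(foo), _, fail).
false.

Note that catching everything with _ also swallows things like resource errors, which can hide real problems; catching a specific error term, e.g. catch(Goal, error(type_error(_, _), _), fail), is usually safer.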
I am working with a Cortex-A9 and my program crashes because of an UNDEFINED_INSTRUCTION exception. The assembly line that causes this exception is according to my debugger's trace:
Trace #9999 : S:0x022D9A7C E92D4800 ARM PUSH {r11,lr}
Exception: UNDEFINED_INSTRUCTION (9)
I program in C, don't write any assembly or binary, and I am using gcc. Is this really the instruction that causes the exception, i.e. is the encoding of this PUSH instruction wrong, and hence a compiler/assembler bug? Or is the encoding correct and something strange is going on? Scrolling back in the trace, I found another PUSH instruction that does not cause errors and looks like this:
Trace #9966 : S:0x022A65FC E52DB004 ARM PUSH {r11}
And of course there are a lot of other PUSH instructions too. But I did not find any other that pushes specifically R11 and LR, so I can't compare.
I can't answer my own question, so I'm editing it instead:
Sorry guys, I don't know exactly what happened. I tried it several times and got the same error again and again. Then I turned the device off, went away, tried it again later, and now it works fine...
Maybe the memory was somehow corrupted due to overheating or something? I don't know. Thanks for your answers anyway.
I use gcc 4.7.2 btw.
I suspect something is corrupting the SP register. Load/store multiple instructions (of which PUSH is one alias) to unaligned addresses are undefined in the architecture, so if SP gets overwritten with something that's not a multiple of 4, then a subsequent push/pop will throw an undef exception.
Now, if you're on ARM Linux, there is (usually) a kernel trap for unaligned accesses left over from the bad old days which if enabled will attempt to fix up most unaligned load/store multiple instructions (despite them being architecturally invalid). However if the address is invalid (as is likely in the case of SP being overwritten with nonsense) it will give up and leave the undef handler to do its thing.
In the (highly unlikely) case that the compiler has somehow generated bad code that is fix-uppable most of the time,
cat /proc/cpu/alignment
would show some non-zero fixup counts, but as I say, it's most likely corruption - a previous function has smashed the stack in such a way that an invalid SP is loaded on return, that then shows up at the next stack operation. Best double-check your pointer and array accesses.
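As an illustration of the kind of corruption being described (a hypothetical sketch, not code from the asker's program): an overflow in a callee can clobber the registers saved on the stack, so the caller resumes with a bogus frame and the next stack operation faults.

#include <string.h>

/* Hypothetical: writing past the end of buf overwrites the saved
   r11/lr on the stack (and, depending on the frame layout, the value
   SP is restored from), so after the function returns, the next PUSH
   can raise an undefined-instruction exception on an unaligned SP. */
void victim(const char *input)
{
    char buf[8];
    strcpy(buf, input);   /* no bounds check: classic stack smash */
}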
I'm writing an application under Linux, using Qt library.
So, there are two QThreads. In one of the threads the pcap_next() function is called in a while loop. The threads often use each other's public members while running.
Without the pcap library (for example, reading packets from the hard disk) everything works fine, but when I try to put pcap's functions into a separate thread, I get a SEGFAULT error.
I can't understand how pcap works. It looks like pcap freezes the whole process, and because of this the threads can't access each other's public members.
The main run() function of pcap's thread looks like:
while (true)   // loop condition elided in the original
{
    Data = pcap_next(handle, &header);
    if (Data != NULL)
    {
        // processing functions
    }
}
Any ideas?
"Freezing the whole process" would keep the other threads from even running; it wouldn't cause the process to crash.
If your program makes simultaneous calls on a single pcap_t in more than one thread, other than some safe calls such as pcap_breakloop() (which will not interrupt a thread that's blocked - you'd need to deliver a signal in UN*X to do that), there is no guarantee that it will work.
If you never make simultaneous pcap calls on the same pcap_t in different threads, it should work.
I.e., you could open the device/savefile in one thread, getting a pcap_t, and, once that's done, have the same thread or another thread read packets from the pcap_t. You could not, however, have more than one thread read packets from the pcap_t.
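A minimal sketch of that pattern (the device name, snapshot length, and timeout are placeholders, and the processing is reduced to a print): one thread owns the pcap_t and is the only one that touches it.

#include <pcap.h>
#include <stdio.h>

/* Sketch: all calls on this pcap_t stay in one thread. */
int main(void)
{
    char errbuf[PCAP_ERRBUF_SIZE];
    pcap_t *handle = pcap_open_live("eth0", 65535, 1, 1000, errbuf);
    if (handle == NULL) {
        fprintf(stderr, "pcap_open_live: %s\n", errbuf);
        return 1;
    }
    struct pcap_pkthdr header;
    for (;;) {
        const u_char *data = pcap_next(handle, &header);
        if (data != NULL) {
            /* process the packet here, in this thread only; hand
               copies of the data (not the pcap_t) to other threads */
            printf("captured %u bytes\n", header.caplen);
        }
    }
    pcap_close(handle);   /* unreachable in this sketch */
    return 0;
}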
However, there could be something wrong with the way you're using pcap, in a fashion that would crash even in a single-threaded program. We'd have to see all your pcap calls to see whether that's the case.