Platform is Windows 7 SP1.
I recently spent some time debugging an issue that was caused by code passing an invalid parameter to one of the "safe" CRT functions. As a result, my application was terminated right away with no warning or anything -- not even a crash dialog.
At first, I tried to figure this out by attaching Windbg to my application. However, when the crash happened, by the time the code broke into Windbg, pretty much every thread had already been killed save for the ONE thread that Windbg broke in on. There was no clue as to what was wrong. So I attached Visual Studio as a debugger instead, and when my application terminated I saw every thread exiting with error code 0xc0000417. That is what gave me the clue that there was an invalid parameter issue somewhere.
Next, the way I went about debugging this was to once again attach Windbg to my application, but this time place breakpoints randomly (by trial & error) in various places like kernel32!TerminateThread, kernel32!UnhandledExceptionFilter and kernel32!SetUnhandledExceptionFilter.
Of the lot, placing a breakpoint at SetUnhandledExceptionFilter immediately showed the call stack of the offending thread when the crash occurred, along with the CRT function that we were calling incorrectly.
Question: Is there anything intuitive that should have told me to place a breakpoint on SUEF right away? I would like to understand this a bit better and not do it by trial and error. The second question is with respect to the error code I determined via Visual Studio: without resorting to VS, how do I determine thread exit codes in Windbg?
I was going to just comment, but this became bigger, so here is an answer.
Setting WinDbg as the postmortem debugger using windbg -I will also route all unhandled exceptions to WinDbg.
Running windbg -I registers WinDbg as the postmortem debugger.
By default, Auto is set to 1 in the AeDebug registry key.
If you don't want to debug every program, you can edit this to 0,
which provides you an additional do-you-want-to-debug option in the WER dialog.
reg query "hklm\software\microsoft\windows nt\currentversion\aedebug"
HKEY_LOCAL_MACHINE\software\microsoft\windows nt\currentversion\aedebug
Debugger REG_SZ "xxxxxxxxxx\windbg.exe" -p %ld -e %ld -g
Auto REG_SZ 0
Assuming you registered a postmortem debugger, run this code:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    unsigned long input[] = { 1, 45, 0xf001, 0xffffffff };
    int i = 0;
    char buf[5] = { 0 };
    for (i = 0; i < _countof(input); i++)
    {
        // The last value, 0xffffffff, needs 8 hex digits plus a terminator,
        // but buf only holds 5 bytes, so that call trips the CRT's invalid
        // parameter handling.
        _ultoa_s(input[i], buf, sizeof(buf), 16);
        printf("%s\n", buf);
    }
    return 1;
}
On the exception you will see a WER dialog like this.
You can now choose to debug this program.
Windows also writes the exit code for an unhandled exception to the event log.
You can use PowerShell to retrieve the most recent event like this:
PS C:\> Get-EventLog -LogName Application -Source "Application Error" -newest 1| format-list
Index : 577102
EntryType : Error
InstanceId : 1000
Message : Faulting application name:
ultos.exe, version: 0.0.0.0, time stamp: 0x577680f1
Faulting module name: ultos.exe, version:
0.0.0.0, time stamp: 0x577680f1
Exception code: 0xc0000417
Fault offset: 0x000211c2
Faulting process id: 0x4a8
Faulting application start time: 0x01d1d3aaf61c8aaa
Faulting application path: E:\test\ulto\ultos.exe
Faulting module path: E:\test\ulto\ultos.exe
Report Id: 348d86fc-3f9e-11e6-ade2-005056c00008
Category : Application Crashing Events
CategoryNumber : 100
ReplacementStrings : {ultos.exe, 0.0.0.0, 577680f1, ultos.exe...}
Source : Application Error
TimeGenerated : 7/1/2016 8:42:21 PM
TimeWritten : 7/1/2016 8:42:21 PM
UserName :
And if you choose to debug, you can view the call stack:
0:000> kPL
# ChildEBP RetAddr
00 001ffdc8 77cf68d4 ntdll!KiFastSystemCallRet
01 001ffdcc 75e91fdb ntdll!NtTerminateProcess+0xc
02 001ffddc 012911d3 KERNELBASE!TerminateProcess+0x2c
03 001ffdec 01291174 ultos!_invoke_watson(
wchar_t * expression = 0x00000000 "",
wchar_t * function_name = 0x00000000 "",
wchar_t * file_name = 0x00000000 "",
unsigned int line_number = 0,
unsigned int reserved = 0)+0x31
04 001ffe10 01291181 ultos!_invalid_parameter(
wchar_t * expression = <Value unavailable error>,
wchar_t * function_name = <Value unavailable error>,
wchar_t * file_name = <Value unavailable error>,
unsigned int line_number = <Value unavailable error>,
unsigned int reserved = <Value unavailable error>)+0x7a
05 001ffe28 0128ad96 ultos!_invalid_parameter_noinfo(void)+0xc
06 001ffe3c 0128affa ultos!common_xtox<unsigned long,char>(
unsigned long original_value = 0xffffffff,
char * buffer = 0x001ffea4 "",
unsigned int buffer_count = 5,
unsigned int radix = 0x10,
bool is_negative = false)+0x58
07 001ffe5c 0128b496 ultos!common_xtox_s<unsigned long,char>(
unsigned long value = 0xffffffff,
char * buffer = 0x001ffea4 "",
unsigned int buffer_count = 5,
unsigned int radix = 0x10,
bool is_negative = false)+0x59
08 001ffe78 012712b2 ultos!_ultoa_s(
unsigned long value = 0xffffffff,
char * buffer = 0x001ffea4 "",
unsigned int buffer_count = 5,
int radix = 0n16)+0x18
09 001ffeac 0127151b ultos!main(void)+0x52
0a (Inline) -------- ultos!invoke_main+0x1d
0b 001ffef8 76403c45 ultos!__scrt_common_main_seh(void)+0xff
0c 001fff04 77d137f5 kernel32!BaseThreadInitThunk+0xe
0d 001fff44 77d137c8 ntdll!__RtlUserThreadStart+0x70
0e 001fff5c 00000000 ntdll!_RtlUserThreadStart+0x1b
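On the first question: since the failing call funnels through the CRT's invalid parameter routine (ultos!_invalid_parameter and _invoke_watson in the stack above), you can also stop at the source by installing your own handler with _set_invalid_parameter_handler. A minimal sketch -- the handler name and the messages here are only illustrative:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <intrin.h>

// Illustrative handler: report the bad call, then break into the debugger.
// In release builds the expression/function/file arguments are typically NULL.
void my_invalid_parameter(const wchar_t *expression,
                          const wchar_t *function,
                          const wchar_t *file,
                          unsigned int line,
                          uintptr_t reserved)
{
    fwprintf(stderr, L"Invalid parameter in %ls (%ls:%u)\n",
             function ? function : L"<unknown>",
             file ? file : L"<unknown>", line);
    __debugbreak();   // stop here with the offending call still on the stack
}

int main(void)
{
    _set_invalid_parameter_handler(my_invalid_parameter);
    char buf[5] = { 0 };
    _ultoa_s(0xffffffff, buf, sizeof(buf), 16);   // same bad call as above
    return 0;
}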
I am struggling to understand why some of my code using Boost, which was working fine under Visual Studio 2017, now results in an access violation under Visual Studio 2019. I only encounter this failure in a Debug build; the Release build works fine.
What could I have set up incorrectly in my build, environment, or code, that could cause such a failure?
My environment:
Windows 10
Boost 1.74 (dynamic link)
Visual Studio 2019 v16.7.6
Compiling for C++ x64
The failing line of my code is this:
boost::filesystem::path dir = (boost::filesystem::temp_directory_path() / boost::filesystem::unique_path("%%%%-%%%%-%%%%-%%%%"));
The failing line in Boost.Filesystem is here, in boost/filesystem/path.hpp:
namespace path_traits
{  // without codecvt
  inline
  void convert(const char* from,
               const char* from_end,    // 0 for null terminated MBCS
               std::wstring & to)
  {
    convert(from, from_end, to, path::codecvt());
  }
The failure message reported by Visual Studio is as follows:
Exception thrown at 0x00007FF9164F1399 (vcruntime140d.dll) in ezv8.tests.exe: 0xC0000005: Access violation reading location 0xFFFFFFFFFFFFFFFF.
The call stack looks like this:
vcruntime140d.dll!00007ff9164f1550() Unknown
> boost_filesystem-vc142-mt-gd-x64-1_74.dll!wmemmove(wchar_t * _S1, const wchar_t * _S2, unsigned __int64 _N) Line 248 C++
boost_filesystem-vc142-mt-gd-x64-1_74.dll!std::_WChar_traits<wchar_t>::move(wchar_t * const _First1, const wchar_t * const _First2, const unsigned __int64 _Count) Line 204 C++
boost_filesystem-vc142-mt-gd-x64-1_74.dll!std::wstring::append(const wchar_t * const _Ptr, const unsigned __int64 _Count) Line 2864 C++
boost_filesystem-vc142-mt-gd-x64-1_74.dll!std::wstring::append<wchar_t *,0>(wchar_t * const _First, wchar_t * const _Last) Line 2916 C++
boost_filesystem-vc142-mt-gd-x64-1_74.dll!`anonymous namespace'::convert_aux(const char * from, const char * from_end, wchar_t * to, wchar_t * to_end, std::wstring & target, const std::codecvt<wchar_t,char,_Mbstatet> & cvt) Line 77 C++
boost_filesystem-vc142-mt-gd-x64-1_74.dll!boost::filesystem::path_traits::convert(const char * from, const char * from_end, std::wstring & to, const std::codecvt<wchar_t,char,_Mbstatet> & cvt) Line 153 C++
appsvcs.dll!boost::filesystem::path_traits::convert(const char * from, const char * from_end, std::wstring & to) Line 1006 C++
appsvcs.dll!boost::filesystem::path_traits::dispatch<std::wstring>(const std::string & c, std::wstring & to) Line 257 C++
appsvcs.dll!boost::filesystem::path::path<char [20]>(const char[20] & source, void * __formal) Line 168 C++
I use UTF-8 strings throughout my code, so I have configured boost::filesystem to expect UTF-8 strings as follows:
boost::nowide::nowide_filesystem();
The cause of this issue turned out to be inconsistent use of _ITERATOR_DEBUG_LEVEL. This setting does affect ABI compatibility. I was setting this flag (to 0) in my own code, but it was not set in the Boost build. The solution is to either remove the flag from one's own code, or add the flag to the Boost build by adding define=_ITERATOR_DEBUG_LEVEL=0 to the b2 arguments (from another Stack Overflow answer).
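A cheap guard that can catch this kind of mismatch early, at least within your own translation units (it cannot see how the Boost DLLs themselves were built), is a compile-time check in a common header. A sketch, assuming the project-wide value of 0 described above:
// Fails the build if some translation unit is compiled with a different
// _ITERATOR_DEBUG_LEVEL than the rest of the project expects.
#if defined(_ITERATOR_DEBUG_LEVEL) && _ITERATOR_DEBUG_LEVEL != 0
#error "_ITERATOR_DEBUG_LEVEL must be 0 to match the project and Boost build settings"
#endif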
I have a strange issue where my application crashes in nvcuda.dll after running for about 2 hours. After spending a lot of time trying to debug the issue, I think I have an idea of what's going on, but I'd like to know if anybody else has seen this problem.
My application launches most of its kernels in non-default streams, and this process can go on for hours before there is a need to use the default stream. Everything was working fine until I upgraded the drivers from some 320 version to the most recent 332.50 version (for K40m). What happens now is that if the app runs for about 2 hours and then makes any call which uses the default stream, it crashes during the call somewhere inside nvcuda.dll. At first I thought something was wrong with my kernels, but it happens even with basic calls like cudaMemcpy (which uses the default stream). The crash does not happen when the app has been running for, say, 1 hour or 1.5 hours. It took me a while to realize that it might be an issue with the driver, so I uninstalled the new driver and installed the old one (320.92) and the problem was gone! I repeated the same process (changing the driver, rebooting, then running the app again) multiple times and had 100% repro.
Unfortunately, I don't have a small, self-contained repro, but before I try to create one, has anybody seen something like that recently? The entry from Event Viewer at the time of the crash does not say much:
Faulting application name: <app>.exe, version: <version>, time stamp: 0x5316a970
Faulting module name: nvcuda.dll, version: 8.17.13.3250, time stamp: 0x52e1fa40
Exception code: 0xc00000fd
Fault offset: 0x00000000002226e7
Faulting process id: 0x1558
Faulting application start time: 0x01cf3831a2f3b71b
Faulting application path: <app>.exe
Faulting module path: C:\windows\SYSTEM32\nvcuda.dll
Report Id: aceb9a51-a433-11e3-9403-90b11c4725be
Faulting package full name:
Faulting package-relative application ID:
Update 1:
I now have a simple application which reproduces the crash both on K20m and K40m cards.
Update 2:
Updated sample app, was able to repro the crash. From the call stack it looks like there is a stack overflow somewhere in nvcuda.dll.
Steps:
1. Install the latest version (332.50) of the drivers on the machine.
2. In Visual Studio 2012 create a new CUDA 5.5 project.
3. Replace the contents of kernel.cu with the code below.
4. Compile and run the code on a machine with a K20m or K40m.
5. After approximately 2 hours of execution the app will crash and the entry below will be written into the event log.
6. Uninstall the driver and install the previous (e.g. 321.10) version of the driver.
7. Run the app, it should still be running after 2, 3 and more hours.
Log:
Faulting application name: CudaTests60.exe, version: 0.0.0.0, time stamp: 0x5317974f
Faulting module name: nvcuda.dll, version: 8.17.13.3250, time stamp: 0x52e1fa40
Exception code: 0xc00000fd
Fault offset: 0x000000000004f5cb
Faulting process id: 0x23d0
Faulting application start time: 0x01cf38ba16961e74
Faulting application path: d:\bin\test\CudaTests60.exe
Faulting module path: C:\windows\system32\nvcuda.dll
Report Id: 192506c4-a4be-11e3-9401-90b11c4b02c0
Faulting package full name:
Faulting package-relative application ID:
Code:
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <vector>
#include <stdio.h>
#include <assert.h>
#include <cublas_v2.h>
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <Windows.h>
int main()
{
cudaError_t cudaStatus;
{
int crow = 10000;
int ccol = 10000;
int cshared = 10000;
int xLength = crow * cshared;
int yLength = cshared * ccol;
int matLength = crow * ccol;
thrust::device_vector<float> x(xLength);
thrust::device_vector<float> y(yLength);
thrust::device_vector<float> mat(matLength);
thrust::fill(x.begin(), x.end(), 1.0f);
thrust::fill(y.begin(), y.end(), 1.0f);
thrust::fill(mat.begin(), mat.end(), .0f);
cudaStream_t ops;
cudaStatus = cudaStreamCreate(&ops);
assert(0 == cudaStatus);
cublasHandle_t cbh;
cublasStatus_t cbstatus;
cbstatus = cublasCreate(&cbh);
assert(0 == cbstatus);
cbstatus = cublasSetStream(cbh, ops);
assert(0 == cbstatus);
float alpha = 1;
float beta = 0;
float* px = thrust::raw_pointer_cast(x.data());
float* py = thrust::raw_pointer_cast(y.data());
float* pmat = thrust::raw_pointer_cast(mat.data());
ULONGLONG start = GetTickCount64();
ULONGLONG iter = 0;
while (true)
{
cbstatus = cublasSgemm(cbh, CUBLAS_OP_N, CUBLAS_OP_N, crow, ccol, cshared, &alpha, px, crow, py, cshared, &beta, pmat, crow);
assert(0 == cbstatus);
if (0 != cbstatus)
{
printf("cublasSgemm failed: %d.\n", cbstatus);
break;
}
cudaStatus = cudaStreamSynchronize(ops);
assert(0 == cudaStatus);
if (0 != cudaStatus)
{
printf("cudaStreamSynchronize failed: %d.\n", cudaStatus);
break;
}
ULONGLONG cur = GetTickCount64();
// Exit after 2 hours.
if (cur - start > 2 * 3600 * 1000)
break;
iter++;
}
// Crash will happen here.
printf("Before cudaMemcpy.\n");
float res = 0;
cudaStatus = cudaMemcpy(&res, px, sizeof(float), cudaMemcpyDeviceToHost);
assert(0 == cudaStatus);
if (0 == cudaStatus)
printf("After cudaMemcpy: %f\n", res);
else
printf("cudaMemcpy failed: %d\n", cudaStatus);
}
return 0;
}
I'm not surprised the program crashes right where you've indicated.
This line of code is illegal:
cudaStatus = cudaMemcpy(pmat, px, x.size() * sizeof(float), cudaMemcpyDeviceToHost);
Both pmat and px are pointers to device memory. However, you've requested cudaMemcpyDeviceToHost, which means the pmat pointer is interpreted as a host pointer and gets dereferenced during the copy operation. Dereferencing a device pointer in host code is illegal and will cause a seg fault.
With suitable modifications I ran your code on Linux and it indicates a seg fault at that line.
Note that I'm not disputing there may be a problem in the driver you indicated (bugs are possible!), but I don't think this code is reproducing anything related to a driver bug.
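For reference, a sketch of the two legal ways to express that copy, keeping the cudaStatus, pmat, px and x names from the quoted line (which variant is right depends on where the data should end up):
// Device-to-device: both pointers are device pointers, so the kind must say so.
cudaStatus = cudaMemcpy(pmat, px, x.size() * sizeof(float), cudaMemcpyDeviceToDevice);

// Device-to-host: copy into a host-side buffer (std::vector from <vector>)
// instead of dereferencing the device pointer pmat on the host.
std::vector<float> host(x.size());
cudaStatus = cudaMemcpy(host.data(), px, x.size() * sizeof(float), cudaMemcpyDeviceToHost);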
Bugs can be filed at: https://developer.nvidia.com/nvbugs/cuda/add You will need to log in with developer credentials.
As an aside, your code appears to take a designed exit after 2 hours. I don't see how it could be running longer as you've indicated:
7. Run the app, it should still be running after 2, 3 and more hours.
Unless there is something wrong with your tick count timing system, which I haven't validated.
The bug has been fixed in Tesla drivers starting version 333.11. If you have the same problem, make sure you've updated the drivers.
Here is my code:
#include <stdio.h>
#include <pcap.h>

void pcapdump(u_char* argument, const struct pcap_pkthdr* packet_header, const u_char* packet_content);

int main()
{
    int i = 0, devid, ret;
    char errbuf[PCAP_ERRBUF_SIZE];
    pcap_t *handle;
    bpf_u_int32 mask;
    bpf_u_int32 net;
    int num_packets = 500;
    pcap_dumper_t *p;
    pcap_if_t *alldevs;
    pcap_if_t *pdev;
    const struct pcap_pkthdr *packet_header;
    const u_char *packet_content;

    ret = pcap_findalldevs(&alldevs, errbuf);
    if (ret = -1)
    {
        printf("%s", errbuf);
    };
    for (pdev = alldevs; pdev; pdev = pdev->next)
        printf("#%d: %s %s %s\n", ++i, pdev->name, pdev->description, pdev->description);
    printf("select a device: ");
    scanf("%d", &devid);
    pdev = alldevs;
    while (--devid)
        pdev = pdev->next;
    printf("Selected %s \n", pdev->name);
    if (pcap_lookupnet(pdev->name, &net, &mask, errbuf) == -1)
    {
        printf("Couldn't get netmask for device %s: %s\n", pdev->name, errbuf);
        net = 0;
        mask = 0;
    };
    handle = pcap_open_live(pdev->name, BUFSIZ, 1, 0, errbuf);
    printf("Number of packets: %d\n", num_packets);
    pcap_dump_open(handle, "/home/jiangzhongbai/capturefiles/10.pcapng");
    pcap_loop(handle, num_packets, pcap_dump, NULL);
    pcap_dump_close(p);
    pcap_freealldevs(alldevs);
    pcap_close(handle);
    printf("\nCapture complete.\n");
    return 0;
}
The result is
eth0 (null) (null)
wlan0 (null) (null)
nflog Linux netfilter log (NFLOG) interface Linux netfilter log (NFLOG) interface
nfqueue Linux netfilter queue (NFQUEUE) interface Linux netfilter queue (NFQUEUE) interface
any Pseudo-device that captures on all interfaces Pseudo-device that captures on all interfaces
lo (null) (null)
select a device: 2
Selected wlan0
Number of packets: 500
Segmentation fault (core dumped)
I think there is something wrong with the function pcap_dump_open, but I don't know how to solve the segmentation fault (core dumped). Please help me.
If pcap_findalldevs() returns -1, don't just print an error message; quit, because alldevs isn't necessarily set to a valid value or to NULL.
Do not assume that pdev->description is non-null - only print it if it's non-null.
Assign the result of pcap_dump_open() to the variable p.
Pass p, rather than NULL, as the fourth argument to pcap_loop().
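Putting those points together, a sketch of what the corrected program might look like (the capture file path and packet count are kept from the question; the error-handling details are illustrative):
#include <stdio.h>
#include <pcap.h>

int main(void)
{
    int i = 0, devid, ret;
    char errbuf[PCAP_ERRBUF_SIZE];
    pcap_t *handle;
    int num_packets = 500;
    pcap_dumper_t *p;
    pcap_if_t *alldevs;
    pcap_if_t *pdev;

    ret = pcap_findalldevs(&alldevs, errbuf);
    if (ret == -1)                      /* quit; alldevs may not be valid */
    {
        fprintf(stderr, "%s\n", errbuf);
        return 1;
    }
    for (pdev = alldevs; pdev; pdev = pdev->next)
        printf("#%d: %s %s\n", ++i, pdev->name,
               pdev->description ? pdev->description : "(no description)");
    printf("select a device: ");
    scanf("%d", &devid);
    pdev = alldevs;
    while (--devid)
        pdev = pdev->next;
    printf("Selected %s\n", pdev->name);

    handle = pcap_open_live(pdev->name, BUFSIZ, 1, 0, errbuf);
    if (handle == NULL)
    {
        fprintf(stderr, "pcap_open_live failed: %s\n", errbuf);
        return 1;
    }
    printf("Number of packets: %d\n", num_packets);

    p = pcap_dump_open(handle, "/home/jiangzhongbai/capturefiles/10.pcapng");
    if (p == NULL)                      /* check for failure before using it */
    {
        fprintf(stderr, "pcap_dump_open failed: %s\n", pcap_geterr(handle));
        return 1;
    }
    pcap_loop(handle, num_packets, pcap_dump, (u_char *)p);  /* pass the dumper, not NULL */
    pcap_dump_close(p);
    pcap_freealldevs(alldevs);
    pcap_close(handle);
    printf("\nCapture complete.\n");
    return 0;
}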
The following code works correctly on x86 but not on the MIPS platform.
char *str = "11111111-22222222 r-xp 00000000 00:0e 1843624 /lib/libdl.so.0";
unsigned long long start_addr, stop_addr, offset;
char* access = NULL;
char* filename = NULL;
sscanf(str, "%llx-%llx %m[-rwxp] %llx %*[:0-9a-f] %*d %ms",
&start_addr, &stop_addr, &access, &offset, &filename);
printf("\n start : %x, stop : %x, offset : %x\n",start_addr,stop_addr,offset);
printf("\n Permission : %s\n",access);
printf("\n Filename : %s\n",filename);
On x86 it outputs:
start : 11111111, stop : 22222222, offset : 0
Permission : r-xp
Filename : /lib/libdl.so.0
But on MIPS it shows:
start : 7ff20f5b, stop : 11111111, offset : 0
Permission : (null)
Filename : (null)
I used the mipsel-linux-uclibc toolchain to compile. Can somebody help?
The C code generates a segmentation fault on SunOS as well, so it works on Linux but not on RISC architectures like MIPS(?). You should investigate why with the gdb debugger: see man gdb, or run gdb and insert your breakpoints; I leave that as an exercise.
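Whatever the root cause turns out to be (the %m allocation modifier is a POSIX extension and not every C library supports it), a defensive first step is to check how many conversions sscanf actually performed before using the pointers, and to print the unsigned long long values with %llx rather than %x. A sketch along those lines:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const char *str = "11111111-22222222 r-xp 00000000 00:0e 1843624  /lib/libdl.so.0";
    unsigned long long start_addr, stop_addr, offset;
    char *access = NULL;
    char *filename = NULL;

    int n = sscanf(str, "%llx-%llx %m[-rwxp] %llx %*[:0-9a-f] %*d %ms",
                   &start_addr, &stop_addr, &access, &offset, &filename);

    /* Only touch the outputs that were actually assigned. */
    printf("conversions: %d\n", n);
    if (n >= 4)
        printf("start : %llx, stop : %llx, offset : %llx\n",
               start_addr, stop_addr, offset);
    if (n >= 3 && access)
        printf("Permission : %s\n", access);
    if (n >= 5 && filename)
        printf("Filename : %s\n", filename);

    free(access);     /* %m buffers must be freed by the caller */
    free(filename);
    return 0;
}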
I am curious to understand divide-by-zero exception handling in Linux. When a divide-by-zero operation is performed, a trap is generated, i.e. interrupt 0 is raised on the processor, and ultimately a SIGFPE signal is sent to the process that performed the operation.
As I see it, the divide-by-zero exception handler is registered in the trap_init() function as
set_trap_gate(0, &divide_error);
I want to know in detail what happens between the trap being generated and the SIGFPE being sent to the process.
The trap handler is registered in the trap_init function in arch/x86/kernel/traps.c:
void __init trap_init(void)
..
set_intr_gate(X86_TRAP_DE, &divide_error);
set_intr_gate writes the address of the handler function into idt_table (see x86/include/asm/desc.h).
How is the divide_error function defined? As a macro in traps.c
DO_ERROR_INFO(X86_TRAP_DE, SIGFPE, "divide error", divide_error, FPE_INTDIV,
regs->ip)
And the macro DO_ERROR_INFO is defined a bit above in the same traps.c:
193 #define DO_ERROR_INFO(trapnr, signr, str, name, sicode, siaddr) \
194 dotraplinkage void do_##name(struct pt_regs *regs, long error_code) \
195 { \
196 siginfo_t info; \
197 enum ctx_state prev_state; \
198 \
199 info.si_signo = signr; \
200 info.si_errno = 0; \
201 info.si_code = sicode; \
202 info.si_addr = (void __user *)siaddr; \
203 prev_state = exception_enter(); \
204 if (notify_die(DIE_TRAP, str, regs, error_code, \
205 trapnr, signr) == NOTIFY_STOP) { \
206 exception_exit(prev_state); \
207 return; \
208 } \
209 conditional_sti(regs); \
210 do_trap(trapnr, signr, str, regs, error_code, &info); \
211 exception_exit(prev_state); \
212 }
(Actually it defines the do_divide_error function, which is called by a small asm-coded "entry point" stub with prepared arguments. The stub is declared in entry_32.S as ENTRY(divide_error) and in entry_64.S via the zeroentry macro: 1303 zeroentry divide_error do_divide_error.)
So, when a user divides by zero (and this operation reaches retirement in the out-of-order engine), the hardware generates a trap and sets %eip to the divide_error stub, which sets up the frame and calls the C function do_divide_error. The function do_divide_error will create the siginfo_t struct describing the error (si_signo=SIGFPE, si_addr=address of the failing instruction, etc.), then it will try to inform all notifiers registered with register_die_notifier (this is actually a hook, sometimes used by the in-kernel debugger kgdb; kprobes' kprobe_exceptions_notify - only for int3 or gpf; uprobes' arch_uprobe_exception_notify - again only for int3, etc.).
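To illustrate that notifier hook, here is a sketch of a small kernel module that registers with register_die_notifier and just logs divide errors as they pass by (written against a kernel of roughly that era; header locations and the pt_regs field name can differ between versions and architectures):
#include <linux/module.h>
#include <linux/notifier.h>
#include <linux/kdebug.h>     /* register_die_notifier, struct die_args, DIE_TRAP */

/* Log divide errors as they pass through notify_die(), before do_trap() runs. */
static int log_divide_error(struct notifier_block *nb, unsigned long val, void *data)
{
    struct die_args *args = data;

    if (val == DIE_TRAP && args->trapnr == 0 /* X86_TRAP_DE */)
        pr_info("divide error at ip=%lx\n", args->regs->ip);

    return NOTIFY_DONE;       /* don't swallow the trap; let do_trap() send SIGFPE */
}

static struct notifier_block divide_nb = {
    .notifier_call = log_divide_error,
};

static int __init divide_watch_init(void)
{
    return register_die_notifier(&divide_nb);
}

static void __exit divide_watch_exit(void)
{
    unregister_die_notifier(&divide_nb);
}

module_init(divide_watch_init);
module_exit(divide_watch_exit);
MODULE_LICENSE("GPL");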
Because DIE_TRAP is usually not blocked by any notifier, the do_trap function will then be called. Here is the short code of do_trap:
139 static void __kprobes
140 do_trap(int trapnr, int signr, char *str, struct pt_regs *regs,
141 long error_code, siginfo_t *info)
142 {
143 struct task_struct *tsk = current;
...
157 tsk->thread.error_code = error_code;
158 tsk->thread.trap_nr = trapnr;
170
171 if (info)
172 force_sig_info(signr, info, tsk);
...
175 }
do_trap sends a signal to the current process with force_sig_info, which will "force a signal that the process can't ignore". If there is an active debugger attached to the process (our process is being ptrace-d by gdb or strace), then the SIGFPE sent by do_trap is first reported to the debugger, which gets to see and handle it before the process does. If there is no debugger, the SIGFPE should kill our process while saving a core file, because that is the default action for SIGFPE (check man 7 signal, section "Standard signals", and look for SIGFPE in the table).
The process can't usefully set SIGFPE to be ignored (I'm not sure here), but it can define its own signal handler to handle the signal (there are examples of handling SIGFPE). Such a handler may just print %eip from siginfo, run backtrace() and die; or it may even try to recover the situation and return to the failed instruction. This can be useful, for example, in JITs like qemu, java, or valgrind; or in high-level languages like Java or GHC Haskell, which can turn SIGFPE into a language exception so that programs in those languages can handle it (for example, the relevant spaghetti in OpenJDK is in hotspot/src/os/linux/vm/os_linux.cpp).
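A minimal sketch of such a handler, installed with sigaction and SA_SIGINFO so that the siginfo_t with the faulting address is available (it only reports and exits; the fprintf call is not async-signal-safe and is used here purely for illustration):
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Report where the divide error happened, then die. Returning normally from
 * this handler would restart the same faulting instruction. */
static void fpe_handler(int sig, siginfo_t *info, void *ucontext)
{
    fprintf(stderr, "SIGFPE (si_code=%d) at instruction %p\n",
            info->si_code, info->si_addr);
    _exit(EXIT_FAILURE);
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = fpe_handler;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGFPE, &sa, NULL);

    volatile int zero = 0;
    return 1 / zero;          /* raises the divide error trap described above */
}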
There is a list of SIGFPE handlers in Debian, found via code search for sigaction SIGFPE or for signal SIGFPE.