LIRC driver option - default vs devinput - lirc

By default, the options in /etc/lirc/lirc_options.conf are as follows:
driver = devinput
device = auto
The article at https://learn.pi-supply.com/make/ir-remote-control-support-on-raspbian-buster-justboom/ suggests the following:
driver = default
device = /dev/lirc0
The suggested options do work for me. However, I am wondering whether the original settings are equivalent.
Also, is there a way to dump current lircd options? For example, which "device" is auto actually resolving to?

They are not the same. The devinput driver uses the kernel decoder, and feeds these decoded events to the lircd fifo. This fifo is what clients read from.
The default driver reads raw timing data from the kernel and does its own decoding using lircd.conf.
In general, if the devinput driver works, it can safely be used and is the simpler setup. The default driver is useful where kernel decoding doesn't work, for example when a remote isn't supported by the kernel, or when there is a need to send (blast) IR signals; the latter cannot be done using the devinput driver.
More info: https://www.lirc.org/html/configuration-guide.html
There is no way to dump the options as such. However, if you set the loglevel to debug and inspect the logs, for example with journalctl, the values are visible.
EDIT: /dev/lirc0 and friends provide the raw, undecoded data from the kernel. The devinput driver reads from a /dev/input/eventXX device. In both cases, 'auto' makes lircd use the first usable device it finds, which works as long as only one remote is connected.

Related

How to load raw binary into Qemu

All the information I found about Qemu relates to the Linux kernel, U-Boot or ELF binaries, so I can't quite figure out how to load a binary blob from an embedded device into a specific address and execute part of it. The code I want to run only does arithmetic, so there are no hardware dependencies involved.
I would start qemu with something like
qemu-arm -singlestep -g8000
attach gdb, set initial register state and jump to my starting address to single step through it.
But how do I initially load binary data to a specific address and eventually set up an additional ram range?
On "how to load a binary blob from an embedded device into a specific address and execute part of it": you can load a binary blob into softmmu (system-mode) QEMU with the generic loader (-device loader).
Regarding the proposed invocation:
qemu-arm -singlestep -g8000
This command line is for the linux-user QEMU invocation. It emulates a userspace Linux process of the guest architecture; it is unprivileged and does not provide support for any devices, including the generic loader. Try using qemu-system-arm instead.
It is in fact easy with the Unicorn framework, which works on top of Qemu. Based on the example in the website's documentation section, I wrote a Python script that loads the data, sets the registers, adds a hook that prints the important per-step information, and starts execution at the desired address, running until a target address.

Is cuDevicePrimaryCtxRetain() used for having persistent CUDA context objects between multiple processes?

Using only the driver API, I profiled a single process that calls cuCtxCreate. The cuCtxCreate overhead is nearly comparable to a 300 MB data copy to/from the GPU.
The CUDA documentation says of cuDevicePrimaryCtxRetain: "Retains the primary context on the device, creating it **if necessary**." Is this the expected behavior for repeated launches of the same process from the command line (such as running a process 1000 times to process 1000 different input images)? Does the device need CU_COMPUTEMODE_EXCLUSIVE_PROCESS for this to work as intended (re-using the same context across calls)?
For now, the profile looks the same even if I call that process multiple times. Even without the profiler, timings show a completion time of around 1 second.
Edit: According to the documentation, there is one primary context per device per process. Does this mean there won't be a problem when using it from a single multithreaded application?
What is the re-use time limit for the primary context? Is 1 second between processes okay, or does it have to be milliseconds to keep the primary context alive?
I'm already caching the PTX code in a file, so the only remaining overhead appears to be cuMemAlloc(), malloc() and cuMemHostRegister(); re-using the context from the previous run of the same process would therefore improve the timings considerably.
Edit-2: For cuDevicePrimaryCtxRetain, the documentation says "The caller must call cuDevicePrimaryCtxRelease() when done using the context." Is the caller here any process? Can I just use retain in the first called process and use release in the last called process in a list of hundreds of sequentially called processes? Does the system need a reset if the last process couldn't be launched and cuDevicePrimaryCtxRelease was not called?
Edit-3:
Is primary context intended for this?
process-1: retain (creates)
process-2: retain (re-uses)
...
process-99: retain (re-uses)
process-100: 1 x retain and 100 x release (to decrease counter and unload at last)
Everything is compiled for sm_30 and the device is a Grid K520.
The GPU was at boost frequency during cuCtxCreate().
The project was built as 64-bit (release mode) on Windows Server 2016, with the CUDA driver installed in Windows 7 compatibility mode (this was the only way that worked for K520 + Windows Server 2016).
tl;dr: No, it is not.
Is cuDevicePrimaryCtxRetain() used for having persistent CUDA context objects between multiple processes?
No. It is intended to let the driver API bind to a context that a library using the runtime API has already lazily created. Nothing more than that. Once upon a time it was necessary to create contexts with the driver API and then have the runtime bind to them. Now, with these APIs, you don't have to do that. You can, for example, see how this is done in TensorFlow.
Does this mean there won't be a problem when using multiple threaded single application?
The driver API has been fully thread safe since about CUDA 2.0.
Is caller here any process? Can I just use retain in first called process and use release on the last called process in a list of hundreds of sequentially called processes?
No. Contexts are always unique to a given process. They can't be shared between processes in this way.
Is primary context intended for this?
process-1: retain (creates)
process-2: retain (re-uses)
...
process-99: retain (re-uses)
process-100: 1 x retain and 100 x release (to decrease counter and unload at last)
No.

Compiling TCL libraries with TCL_MEM_DEBUG

I compiled the Tcl libraries with the mem flag, but when I tried to use them in my application I couldn't see any messages in the console. Will the trace messages go to standard output (the terminal), or are they written to log files?
When you compile Tcl with memory debugging enabled (with a POSIX-style configuration, this means that you passed --enable-symbols=mem or --enable-symbols=all to configure; I'm not certain what happens on Windows), there is a substantial amount of extra checking of memory allocation handling by default, and an extra Tcl command, memory, is defined. Some memory subcommands do cause messages to be written to stderr; you'll need to be running inside a suitable console to see them, which can be something of an issue on Windows if you are not aware of it. Other subcommands dump things to a named file.
FWIW, when developing Tcl I usually build with --enable-symbols=all except when doing performance testing. The various debugging options are known to have substantial impacts on the speed of Tcl's implementation (which is why it is a compile option rather than being always present, and consequently why the interface is rather rougher than for the rest of Tcl).

Memory access exception handling with MinGW on XP

I am trying to use the MinGW GCC toolchain on XP with some vendor code from an embedded project that accesses high memory (>0xFFFF0000) which is, I believe, beyond the virtual mem address space allowed in 'civilian' processes in XP.
I want to handle the memory access exceptions myself in a way that lets execution continue at the instruction following the exception, i.e. ignore it. Is there some way to do this with MinGW? Or with the MS toolchain?
The vastly simplified picture is thus:
/////////////
// MyFile.c
void VendorFunc_A(void); // defined in VendorFile.c
void MyFunc(void){
    VendorFunc_A();
}
/////////////////
// VendorFile.c
int VendorFunc_B(void);
void VendorFunc_A(void){
    VendorFunc_DoSomeDesirableSideEffect();
    VendorFunc_B();
    VendorFunc_DoSomeMoreGoodStuff();
}
int VendorFunc_B(void){
    volatile int *pHW_Reg = (volatile int *)0xFFFF0000; // hardware register address
    *pHW_Reg = 1; // Mem Access EXCEPTION HERE
    return(0); // I want to continue here
}
More detail:
I am developing an embedded project on an Atmel AVR32 platform with FreeRTOS using the AVR32-gcc toolchain. It is desirable to develop/debug high-level application code independently of the hardware (and the slow AVR32 simulator). Various gcc, makefile and macro tricks permit me to build my AVR32/FreeRTOS project in the MinGW/Win32 FreeRTOS port environment, and I can debug in eclipse/gdb. But the high-mem HW access in the (vendor supplied) AVR32 code crashes the MinGW exe (due to the mem access exception).
I am contemplating some combination of these approaches:
1) Manage the access exceptions in SW. Ideally I'd be creating a kind of HW simulator, but that'd be difficult and involve some gnarly assembly code, I think. A lot of the exceptions can likely just be ignored.
2) Create a modified copy of the AVR32 header files so as to relocate the HW register #defines into user process address space (and create some structs and linker sections that commit those areas of virtual memory space).
3) Conditionally compile the function calls that result in highMem/HW access, or alternatively use more macro tricks, so as to minimize code cruft in the 'real' HW target code. (There are other developers on this project.)
Any suggestions or helpful links would be appreciated.
This page is on the right track, but seems overly complicated, and is C++ which I'd like to avoid. But I may try it yet, absent other suggestions.
http://www.programmingunlimited.net/siteexec/content.cgi?page=mingw-seh
You need to figure out why the vendor code wants to write 1 to address 0xFFFF0000 in the first place, and then write a custom VendorFunc_B() function that emulates this behavior. It is likely that 0xFFFF0000 is a hardware register that will do something special when written to (e.g. change the baud rate on a serial port, or power up the laser, or ...). Once you know what happens when you write to this register on the target hardware, you can rewrite the vendor code to do something appropriate in the Windows build (e.g. write the string "Starting laser" to a log file). It is safe to assume that writing 1 to address 0xFFFF0000 on Windows XP will not be the right thing to do; the Windows XP memory protection system detects this and terminates your program.
I had a similar issue recently, and this is the solution I settled on:
Trap memory accesses inside a standard executable built with MinGW
First of all, you need to find a way to remap those address ranges (maybe with some undef/define combos) to some usable memory. If you can't do this, maybe you can hook the segfault and handle the write yourself.
I also use this to "simulate" some specific HW behavior inside a single executable, for some already written code. However, in my case, I found a way to redefine all the register access macros early on.
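As a rough illustration of the macro-remapping idea, here is a minimal sketch in C. It assumes the vendor code can be made to go through a register-access macro; the names TARGET_AVR32, HW_REG, HW_REG_BASE, HW_REG_COUNT, sim_regs and sim_reg are all hypothetical and do not come from the AVR32 headers. On the host build the macro resolves to an ordinary array in process memory, so the write no longer faults; on the target it resolves to the real address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical, chosen for illustration only. */
#define HW_REG_BASE  0xFFFF0000u  /* register address from the question */
#define HW_REG_COUNT 64u          /* assumed size of the register window */

#ifndef TARGET_AVR32
/* Host (MinGW) build: back the register window with ordinary RAM. */
static volatile uint32_t sim_regs[HW_REG_COUNT];

/* Translate a target register address into a slot of the simulated window. */
static volatile uint32_t *sim_reg(uint32_t addr)
{
    uint32_t index = (addr - HW_REG_BASE) / (uint32_t)sizeof(uint32_t);
    assert(addr >= HW_REG_BASE && index < HW_REG_COUNT);
    return &sim_regs[index];
}
#define HW_REG(addr) (*sim_reg(addr))
#else
/* Target build: access the real hardware address. */
#define HW_REG(addr) (*(volatile uint32_t *)(uintptr_t)(addr))
#endif

/* Vendor-style function rewritten against the macro. */
int VendorFunc_B(void)
{
    HW_REG(HW_REG_BASE) = 1;  /* on the host this lands in sim_regs[0] */
    return 0;
}
```

After calling VendorFunc_B() on the host, HW_REG(HW_REG_BASE) reads back 1, and a test harness can inspect or pre-load sim_regs to mimic hardware state. This only helps where the register accesses go through macros that can be redefined; code that hard-codes the pointer, as in the simplified example in the question, would still need editing or an exception handler.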

what is the difference between ZwOpenFile and NtOpenFile?

ZwOpenFile and NtOpenFile are both functions in ntdll.dll. ZwOpenFile is implemented the same way as NtOpenFile, but I don't understand why ZwOpenFile is included in ntdll.dll at all. Can anyone please explain the difference?
This is documented in MSDN:
A kernel-mode driver calls the Zw version of a native system services routine to inform the routine that the parameters come from a trusted, kernel-mode source. In this case, the routine assumes that it can safely use the parameters without first validating them. However, if the parameters might be from either a user-mode source or a kernel-mode source, the driver instead calls the Nt version of the routine, which determines, based on the history of the calling thread, whether the parameters originated in user mode or kernel mode. For more information about how the routine distinguishes user-mode parameters from kernel-mode parameters, see PreviousMode.
Basically it relates to how the parameters are validated.
Generally, kernel drivers should only use the ZwXxx() functions.
When called from user mode, the ZwXxx() and NtXxx() functions are exactly the same - they resolve to the same bits of code in ntdll.dll.
When called from a kernel-mode driver, the ZwXxx() variant ensures that a flag used by the kernel is set to indicate that the requestor mode (which is supposed to indicate the caller's mode) is kernel mode. If a kernel driver calls the NtXxx() variant, the requestor mode isn't explicitly set, so it's left alone and might indicate user or kernel mode, depending on what has occurred in the call stack up to this point.
If the requestor mode flag is set to user mode, the kernel will validate parameters, which might not be the right thing to do (especially if the kernel driver is passing in kernel mode buffers, as the validation will fail in that case). If it's set to kernel mode, the kernel implicitly trusts the parameters.
So the rules for using these API names generally boil down to: if you're writing a kernel driver, call the ZwXxx() version (unless you're dealing with special situations, and you know what you're doing and why). If you're writing a user mode component, it doesn't matter which set you call.
As far as I know, Microsoft only documents the NtXxx() for use in user-mode (where it indicates that they are the user-mode equivalent to the corresponding ZwXxx() function).
Adding an example to what has already been said, so that the OP or anyone else gets a complete picture.
NtXxx calls from user mode pass less trusted data (from user mode) to a more privileged layer (kernel mode). So the routine expects that the buffer has a valid user mode address, that the handles being passed are valid user mode handles, etc.
If a driver calls an NtXxx API instead of its ZwXxx equivalent, it has to ensure that valid user mode arguments are being passed, i.e. it cannot pass a kernel mode address (even if it is valid) or a kernel mode handle (see OBJ_KERNEL_HANDLE).
As already said, the ZwXxx equivalent of the API explicitly indicates (through the requestor level) that such parameter validation needs to be skipped, as the callee is at the same privilege level as the caller.
Here is a link to a good starting point for anyone who wants to go beyond the obvious:
https://www.osronline.com/article.cfm?id=257.