QEMU - Legacy interrupt stuck raised - qemu

I'm trying to develop a PCI device and I need to implement a legacy interrupt (not MSI or MSIX). I followed the example of edu.c but the IRQ is still raised when I load my driver.
I tried to look at other devices but no luck. Here is my code:
static void xxx_pci_realize(...)
{
// ....
pci_config_set_interrupt_pin(pci_conf, 1);
pci_set_irq(pdev, 0);
// ....
}
Does anyone have any idea of what is incorrect?
Thank you!

You should not be trying to mess with the state of the PCI lines in your realize method at all. That method is where the device is created, and happens only once at the start of the simulation. Interrupt lines should be raised and lowered in response to things happening while the system is running -- typically the guest writes a register and this causes you to do something that means you then raise an interrupt. Then the guest gets that interrupt, and tells the device "OK, I've dealt with this now", and then the device lowers the interrupt. You can see this pattern in the 'edu.c' device you mention.

It seems that the interrupt was not issued from my device: I have read somewhere that, on legacy IRQs, devices are chained up on the IRQ pins 10 and 11. So the CPU/kernel has no way to say which one issued the IRQ.
If I understood it well, when an IRQ is issued, each device having legacy interrupts has to say if the interrupt originated from it or not. It is done by setting registers that will be read by the driver, which will be able to handle the IRQ. And then the device following it in the list will do the same.
As I have adapted my QEMU PCI device to return 0 for the registers that indicate if the device has issued the IRQ, the IRQ passes and is disabled after that. So I assume it is because the IRQ was issued by another device.
If someone thinks it's not the real reason, I would be happy to know if there is another possibility :)
Thanks again #peter-maydell for your help!
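The two points made in this thread, that a device should assert its line in response to guest activity and deassert it on acknowledgement rather than in realize(), and that on a shared legacy line every driver must consult its own device's status register to decide whether the interrupt was "his", can be modelled stand-alone. The following is an added example and not QEMU or edu.c code: the register offsets, the Dev struct and the irq_level field are made up here to model the behaviour (in QEMU the equivalent calls would be pci_irq_assert()/pci_irq_deassert()).

```c
/* Added example: stand-alone model of the raise-on-event / lower-on-ack
 * pattern plus shared-line dispatch. Not QEMU code; register offsets,
 * the Dev struct and irq_level are hypothetical. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define REG_STATUS   0x04   /* hypothetical: read-only, bit0 = "I raised the IRQ" */
#define REG_INTR_ACK 0x08   /* hypothetical: write-1-to-clear acknowledgement */
#define REG_DOORBELL 0x0c   /* hypothetical: guest writes here to request work */
#define STATUS_IRQ_PENDING 0x1u

typedef struct {
    const char *name;
    uint32_t status;   /* device-side copy of the status register */
    int irq_level;     /* 0 = deasserted, 1 = asserted on the shared line */
} Dev;

/* The shared legacy line is high if ANY device on it is asserting. */
static int shared_line_level(const Dev *devs, int n) {
    for (int i = 0; i < n; i++) if (devs[i].irq_level) return 1;
    return 0;
}

/* Device-side MMIO write handler: this is where interrupt state changes
 * belong, not in realize(). Work requested through the doorbell completes
 * and raises the line; an ack clears status and lowers it. */
static void dev_mmio_write(Dev *d, uint32_t addr, uint32_t val) {
    switch (addr) {
    case REG_DOORBELL:
        /* ... do the requested work ... then signal completion */
        d->status |= STATUS_IRQ_PENDING;
        d->irq_level = 1;               /* pci_irq_assert() in QEMU */
        break;
    case REG_INTR_ACK:
        if (val & STATUS_IRQ_PENDING) {
            d->status &= ~STATUS_IRQ_PENDING;
            d->irq_level = 0;           /* pci_irq_deassert() in QEMU */
        }
        break;
    default:
        break;
    }
}

static uint32_t dev_mmio_read(const Dev *d, uint32_t addr) {
    return addr == REG_STATUS ? d->status : 0;
}

/* Guest-side ISR for a shared line: claims the interrupt only if this
 * device's own status register says it was the source. */
static bool driver_isr(Dev *d) {
    if (!(dev_mmio_read(d, REG_STATUS) & STATUS_IRQ_PENDING))
        return false;                   /* not ours: next handler on the line looks */
    dev_mmio_write(d, REG_INTR_ACK, STATUS_IRQ_PENDING);
    return true;
}

/* Walk every handler registered on the line, as a kernel does for a
 * shared IRQ. Returns how many handlers claimed it. */
static int dispatch_shared_irq(Dev *devs, int n) {
    int claimed = 0;
    for (int i = 0; i < n; i++) claimed += driver_isr(&devs[i]) ? 1 : 0;
    return claimed;
}

/* Scenario: two devices share the line, only the second one raises it.
 * Returns the line level after dispatch (expected 0) or a negative code. */
static int scenario_line_after_dispatch(void) {
    Dev devs[2] = { {"devA", 0, 0}, {"devB", 0, 0} };
    dev_mmio_write(&devs[1], REG_DOORBELL, 1);
    if (shared_line_level(devs, 2) != 1) return -1;     /* line must be up */
    if (dispatch_shared_irq(devs, 2) != 1) return -2;   /* only devB claims */
    return shared_line_level(devs, 2);                  /* devB acked: lowered */
}

/* Scenario: nothing raised the line; no handler may claim a spurious IRQ. */
static int scenario_spurious_claims(void) {
    Dev devs[2] = { {"devA", 0, 0}, {"devB", 0, 0} };
    return dispatch_shared_irq(devs, 2);
}

int main(void) {
    printf("line level after dispatch: %d\n", scenario_line_after_dispatch());
    printf("claims on a spurious IRQ: %d\n", scenario_spurious_claims());
    return 0;
}
```

A device that answers "not mine" to every status read, as in the follow-up above, simply falls through the driver_isr() check; the line is then left up or lowered by whichever other device actually owns the interrupt.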

Related

Does cuMemcpy "care" about the current context?

Suppose I have a GPU and driver version supporting unified addressing; two GPUs, G0 and G1; a buffer allocated in G1 device memory; and that the current context C0 is a context for G0.
Under these circumstances, is it legitimate to cuMemcpy() from my buffer to host memory, despite it having been allocated in a different context for a different device?
So far, I've been working under the assumption that the answer is "yes". But I've recently experienced some behavior which seems to contradict this assumption.
Calling cuMemcpy from another context is legal, regardless of which device the context was created on. Depending on which case you are in, I recommend the following:
If this is a multi-threaded application, double-check your program and make sure you are not releasing your device memory before the copy is completed
If you are using the cuMallocAsync/cuFreeAsync API to allocate and/or release memory, please make sure that operations are correctly stream-ordered
Run compute-sanitizer on your program
If you keep experiencing issues after these steps, you can file a bug with NVIDIA here.

Are there any CPU-state bits indicating being in an exception/interrupt handler in x86 and x86-64?

Are there any CPU-state bits indicating being in an exception/interrupt handler in x86 and x86-64? In other words, can we tell whether the main thread or exception handler is currently executed based only on the CPU registers' state?
No, there's no bit in the CPU itself (e.g. a control register) that means "we're in an exception or interrupt handler".
But there is hidden state indicating that you're in an NMI (Non-Maskable Interrupt) handler. Since you can't block them by disabling interrupts, and unblockable arbitrary nesting of NMIs would be inconvenient, another NMI won't get delivered until you run an iret. Even if an exception (like #DE div by 0) happens during an NMI handler, and that exception handler itself returns with iret even if you're not done handling the NMI. See The x86 NMI iret problem on LWN.
For normal interrupts, you can disable interrupts (cli) if you don't want another interrupt to be delivered while this one is being handled.
However, the interrupt controller (logically outside the CPU core, but actually part of modern CPUs) may need to be told when you're done handling an external interrupt. (Not a software-interrupt or exception). https://wiki.osdev.org/IDT_problems#I_can_only_receive_one_IRQ shows the outb instructions needed to keep the legacy PIC happy. (I don't know if this applies to more modern ways of doing interrupts, like MSI-X message-signalled interrupts.
That part of the OSdev wiki page might be specific to toy OSes that let the BIOS emulate legacy IBM-PC stuff.) But either way, that's only for external interrupts like PS/2 keyboard controller, hard drive DMA complete, or whatever (not exceptions), so it's unrelated to your Are Linux system calls executed inside an exception handler? question.
The lack of exception-state means there's no special instruction you have to run to "acknowledge" an exception before calling schedule() from what was an interrupt handler. All you have to do is make sure interrupts are enabled or not when they should or shouldn't be. (sti / cli, or pushf / popf to save/restore the old interrupt state.) And of course that your software data structures remain consistent and appropriate for what you're doing. But there isn't anything you have to do specifically to keep the CPU happy.
It's not like with user-space where a signal handler should tell the OS it's done instead of just jumping somewhere and running indefinitely. (In Linux, a signal handler can modify the main-thread program-counter so sigreturn(2) resumes execution somewhere other than where you were when it was delivered.) If POSIX or Linux signals were the (mental) model you were wondering about for interrupts/exceptions, no, it's not like that.
There is an interrupt-priority mechanism (CR8 in x86-64, or the LAPIC TPR (Task Priority Register)), but it does not automatically get set when the CPU delivers an interrupt. You can set it once (e.g. if you have a lot of high-priority interrupts to process on this core) and it persists across interrupts. (How is CR8 register used to prioritize interrupts in an x86-64 CPU?).
It's just a filter on what interrupt-numbers can get delivered to this core when interrupts are enabled (sti, IF=1 bit in RFLAGS). Apparently Windows makes some use of it, or did back in 2007, but Linux doesn't (or didn't).
It's not like you have to tell the CPU / LAPIC that you're done with this interrupt so it's ok for it to deliver another interrupt of this or lower priority.

Where is hardware exception handling entry / exit code stored

I know this question seems very generic as it can depend on the platform,
but I understand with procedure / function calls, the assembler code to push return address on the stack and local variables etc. can be part of either the caller function or callee function.
When a hardware exception or interrupt occurs tho, the Program Counter will get the address of the exception handler via the exception table, but where is the actual code to store the state, return address etc. Or is this automatically done at the hardware level for interrupts and exceptions?
Thanks in advance
Since you are asking about ARM and you tagged microcontroller, you might be talking about the ARM7TDMI but are probably talking about one of the Cortex-Ms. These work differently than the full-sized ARM architecture. The architectural reference manual that is associated with these cores (the ARMv6-M or ARMv7-M, depending on the core) documents that the hardware conforms to the ABI, plus stuff for an interrupt. So the return address, the PSR and registers 0 through 4 plus some others are all put on the stack, which is unusual for an architecture to do. R14, instead of getting the return address, gets an invalid address of a specific pattern, which is all part of the architecture. Unlike other processor IP, address spaces on the Cortex-Ms are encouraged or dictated by ARM; that is why you see RAM start at 0x20000000 usually on these and flash is less than that. There are some exceptions where they place RAM in the "executable" range, pretending to be Harvard when really modified Harvard. This helps with the 0xFFFxxxxx link register return address; depending on the manual they either yada yada over the return address or they go into detail as to what the patterns you find mean.
Likewise the address in the vector table is spelled out: something like the first 16 are system/ARM exceptions, then interrupts follow after that, where it can be up to 128 or 256 possible interrupts, but you have to look at the chip vendor (not ARM) documentation for that to see how many they exposed and what is tied to what. If you are not using those interrupts you don't have to leave a huge hole in your flash for vectors; just use that flash for your program (so long as you ensure you are never going to fire that exception or interrupt).
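The "invalid address of a specific pattern" in R14 and the hardware-pushed frame are what a fault handler's diagnostic code reads back. The following added example (not from this answer or from ARM) decodes them the way a HardFault helper typically does. The frame layout (r0, r1, r2, r3, r12, lr, return address, xPSR, lowest address first) and the EXC_RETURN bit meanings are taken from the ARMv7-M Architecture Reference Manual; the sample frame, its addresses and the helper names are constructed for the example.

```c
/* Added example: decode the ARMv7-M hardware-stacked exception frame and
 * the EXC_RETURN value found in LR on entry. Sample frame is constructed. */
#include <stdint.h>
#include <stdio.h>

typedef struct {            /* hardware-stacked frame, lowest address first */
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} ExcFrame;

typedef struct {
    int to_handler_mode;    /* 1: the interrupted context was itself a handler */
    int uses_psp;           /* 1: frame is on the process stack, else on MSP */
    int has_fp_state;       /* 1: extended frame with S0-S15/FPSCR (ARMv7E-M) */
    int valid;              /* 0: not an EXC_RETURN pattern at all */
} ExcReturn;

/* EXC_RETURN: bits 31:5 all ones; bit 4 set = basic (no FP) frame,
 * bit 3 set = return to Thread mode, bit 2 set = use PSP. */
static ExcReturn decode_exc_return(uint32_t lr) {
    ExcReturn r = {0, 0, 0, 0};
    if ((lr & 0xFFFFFFE0u) != 0xFFFFFFE0u) return r;    /* not 0xFFFFFFFx/Ex */
    r.valid = 1;
    r.has_fp_state    = !(lr & (1u << 4));
    r.to_handler_mode = !(lr & (1u << 3));
    r.uses_psp        = (lr & (1u << 2)) != 0;
    if (r.to_handler_mode && r.uses_psp) r.valid = 0;  /* handler mode on PSP: reserved */
    return r;
}

/* Size of the stacked frame in words: 8 basic, 26 with FP state
 * (S0-S15, FPSCR, reserved word), plus one pad word if the hardware
 * realigned the stack to 8 bytes (recorded in stacked xPSR bit 9). */
static int frame_size_words(const ExcFrame *f, uint32_t lr) {
    ExcReturn er = decode_exc_return(lr);
    int words = er.has_fp_state ? 26 : 8;
    if (f->xpsr & (1u << 9)) words += 1;
    return words;
}

/* Checks a fault handler applies before trusting the frame it found. */
static int frame_is_plausible(const ExcFrame *f, uint32_t lr) {
    ExcReturn er = decode_exc_return(lr);
    if (!er.valid) return 0;
    if (!(f->xpsr & (1u << 24))) return 0;   /* T bit clear: ARMv7-M only executes Thumb */
    if (f->pc & 1u) return 0;                /* stacked return address is halfword aligned */
    int was_handler = (f->xpsr & 0x1FFu) != 0;   /* IPSR of the interrupted context */
    return was_handler == er.to_handler_mode;    /* must agree with EXC_RETURN bit 3 */
}

/* Constructed sample: thread-mode code in flash (below 0x20000000) faulted. */
static const ExcFrame sample_frame = {
    .r0 = 0, .r1 = 0, .r2 = 0, .r3 = 0, .r12 = 0,
    .lr = 0x080001F7u,          /* caller's return address, Thumb bit set */
    .pc = 0x08000200u,          /* faulting instruction */
    .xpsr = 0x01000000u,        /* T bit set, IPSR = 0 (thread mode) */
};

int main(void) {
    uint32_t lr = 0xFFFFFFFDu;  /* thread mode, PSP, basic frame */
    ExcReturn er = decode_exc_return(lr);
    printf("valid=%d handler=%d psp=%d fp=%d words=%d plausible=%d pc=0x%08x\n",
           er.valid, er.to_handler_mode, er.uses_psp, er.has_fp_state,
           frame_size_words(&sample_frame, lr),
           frame_is_plausible(&sample_frame, lr), sample_frame.pc);
    return 0;
}
```

In a real handler the assembly prologue selects MSP or PSP according to EXC_RETURN bit 2 and passes that pointer to a C function like frame_is_plausible(); the stacked pc member is then the address to look up in the map file.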
For function calls, which occur at well defined (synchronous) locations in the program, the compiler generates executable instructions to manage the return address, registers and local variables. These instructions are integrated with your function code. The details are hardware and compiler specific.
For a hardware exception or interrupt, which can occur at any location (asynchronous) in the program, managing the return address and registers is all done in hardware. The details are hardware specific.
Think about how a hardware exception/interrupt can occur at any point during the execution of a program. And then consider that if a hardware exception/interrupt required special instructions integrated into the executable code then those special instructions would have to be repeated everywhere throughout the program. That doesn't make sense. Hardware exception/interrupt management is handled in hardware.
The "code" isn't software at all; by definition the CPU has to do it itself internally because interrupts happen asynchronously. (Or for synchronous exceptions caused by instructions being executed, then the internal handling of that instruction is what effectively triggers it).
So it's microcode or hardwired logic inside the CPU that generates the stores of a return address on an exception, and does any other stuff that the architecture defines as happening as part of taking an exception / interrupt.
You might as well ask where the code is that pushes a return address when the call instruction executes, on x86 for example, where the call instruction pushes return info onto the stack instead of overwriting a link register (the way most RISCs do).

Could ARM9 Prefetch Abort Exception be a software problem?

So I'm getting a "prefetch abort" exception on our arm9 system. This system does not have an MMU, so is there anyway this could be a software problem? All the registers seem correct to me, and the code looks right (not corrupted) from the JTAG point of view.
Right now I'm thinking this is some kind of hardware issue (although I hate to say it - the hardware has been fine until now).
What exactly is the exception you're getting?
Last time this happened to me, I went up the wrong creek for a while because I didn't realize an ARM "prefetch abort" meant the instruction prefetch, not data prefetch, and I'd just been playing with data prefetch instructions. It simply means that the program has attempted to jump to a memory location that doesn't exist. (The actual problem was that I'd mistyped "go 81000000" as "go 81000" in the bootloader.)
See also:
http://www.keil.com/support/docs/3080.htm (KB entry on debugging data aborts)
http://www.ethernut.de/en/documents/arm-exceptions.html (list of ARM exceptions)
What's the address that the prefetch abort is triggering on? It can occur because the program counter (PC or R15) is being set to an address that isn't valid on your microcontroller (this can happen even if you're not using an MMU: the microcontroller's address space likely has 'holes' in it that will trigger the prefetch abort). It could also occur if you try to prefetch an address that would be improperly aligned, but I think this depends on the microcontroller implementation (the ARM ARM lists the behavior as 'UNPREDICTABLE').
Is the CPU actually in Abort mode? If it's executing the Prefetch handler but isn't in abort mode that would mean that some code is branching through the prefetch abort vector, generally through address 0x0000000c but controllers often allow the vector addresses to be remapped.
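The three checks suggested in these answers (is the CPU really in Abort mode, does the faulting PC fall into a hole in the address map, is it misaligned for the state the core was in) can be scripted once and reused on every JTAG stop. The following is an added example, not code from the answers: the memory map is a constructed, hypothetical one with a hole above the boot ROM, and the function names are made up. The CPSR mode encoding (Abort mode M[4:0] = 0b10111, T bit = bit 5) and the rule that R14_abt holds the aborted instruction's address + 4 on a prefetch abort are from the ARM architecture.

```c
/* Added example: triage a prefetch abort from the register values a JTAG
 * debugger shows. Memory map and function names are hypothetical. */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define CPSR_MODE_MASK 0x1Fu
#define CPSR_MODE_ABT  0x17u       /* Abort mode, M[4:0] = 0b10111 */
#define CPSR_T_BIT     (1u << 5)   /* Thumb state */
#define PABT_VECTOR    0x0000000Cu /* default prefetch abort vector */

typedef struct { uint32_t start, end; const char *what; } Region;

/* Constructed map: note the holes between ROM, SRAM and external flash. */
static const Region MAP[] = {
    { 0x00000000u, 0x00020000u, "boot ROM / vectors" },
    { 0x10000000u, 0x10040000u, "internal SRAM" },
    { 0x80000000u, 0x82000000u, "external flash" },
};

typedef enum { PC_OK, PC_UNMAPPED, PC_MISALIGNED } PcStatus;

/* Instruction fetches must be word aligned in ARM state, halfword in Thumb. */
static PcStatus check_fetch_address(uint32_t pc, uint32_t cpsr) {
    uint32_t align = (cpsr & CPSR_T_BIT) ? 2u : 4u;
    if (pc % align) return PC_MISALIGNED;
    for (size_t i = 0; i < sizeof MAP / sizeof MAP[0]; i++)
        if (pc >= MAP[i].start && pc < MAP[i].end) return PC_OK;
    return PC_UNMAPPED;
}

typedef enum {
    TRIAGE_NOT_ABORT_MODE,   /* handler entered by a branch to 0x0000000C, not by an abort */
    TRIAGE_UNMAPPED_PC,      /* PC landed in a hole of the address map */
    TRIAGE_MISALIGNED_PC,    /* PC not aligned for the interrupted state */
    TRIAGE_LOOKS_VALID       /* address is fine: look at bus/hardware or the map itself */
} Triage;

/* cpsr_now: CPSR while stopped in the handler; lr_abt: R14 as seen there
 * (aborted instruction + 4 on a real prefetch abort); spsr_abt: the
 * interrupted CPSR, which tells us whether the code was ARM or Thumb. */
static Triage triage_prefetch_abort(uint32_t cpsr_now, uint32_t lr_abt, uint32_t spsr_abt) {
    if ((cpsr_now & CPSR_MODE_MASK) != CPSR_MODE_ABT)
        return TRIAGE_NOT_ABORT_MODE;
    uint32_t faulting_pc = lr_abt - 4u;
    switch (check_fetch_address(faulting_pc, spsr_abt)) {
    case PC_UNMAPPED:   return TRIAGE_UNMAPPED_PC;
    case PC_MISALIGNED: return TRIAGE_MISALIGNED_PC;
    default:            return TRIAGE_LOOKS_VALID;
    }
}

int main(void) {
    /* The bootloader typo from the answer above: "go 81000" instead of
     * "go 81000000" jumps to 0x00081000, which is a hole in this map. */
    Triage t = triage_prefetch_abort(0x000000D7u, 0x00081000u + 4u, 0x0000001Fu);
    printf("vector 0x%08x, triage result %d (1 = unmapped PC)\n", PABT_VECTOR, t);
    return 0;
}
```

When the result is TRIAGE_LOOKS_VALID the map itself is the next suspect: a region that is present on the datasheet but not enabled (chip select, wait states, remap register) behaves exactly like a hole.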

How to determine why a task gets destroyed, VxWorks?

I have a VxWorks application running on ARM uC.
First let me summarize the application:
The application consists of a 3rd party stack and a gateway application.
We have implemented an operating system abstraction layer to support OS independence.
The underlying stack has its own memory management & control facility which holds memory blocks in a doubly linked list.
For instance, we don't directly perform malloc/new, free/delete. Instead we call the OSA layer's routines, and it gets the memory from the OS, puts it in a list, then returns this memory to the application (routines: XXAlloc, XXFree, XXReAlloc).
And when freeing the memory we again use XXFree.
In fact this block is a struct which has:
- magic numbers indicating the beginning and end of the memory block
- the size that the user requested
- the size in reality, due to alignment
- previous and next pointers
- a pointer to the piece of memory given back to the application
- the link register, which shows where in the application xxAlloc is called
With this block structure the stack can check if a block is corrupted or not.
Also we have a pthread library which is ported from Linux that we use to:
- create/terminate threads (currently there are 22 threads)
- synchronization objects (events, mutexes, ...)
There is a main task called by taskSpawn, and later this task created the other threads.
This was a description of the application and its VxWorks interface.
The problem is:
One of the tasks suddenly gets destroyed by VxWorks, giving no information about what's wrong.
I also have a JTAG debugger and it hits the VxWorks taskDestroy() routine, but the call stack doesn't give any information, neither PC nor r14.
I'm suspicious of a specific routine in the code where a huge xxAlloc is done, but the problem occurs very sporadically, giving no clue that I can map to the source code.
I think the OS detects an exception and performs its handling silently.
Any help would be great.
Regards
It's resolved.
I did an isolated test. Allocated 20MB with malloc, memset with 0x55, and stopped the thread of my application.
And I wrote another thread which checks my 20MB for any data other than 0x55 being written.
And guess what!! Some other thread which belongs to other components in the CPU (someone else developed them) writes to my allocated space.
Thanks for your help
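The guarded block described in the question (head/tail magic, requested and actual size, list links, the caller's address) and the isolation test from the update (fill a region with 0x55, then scan for foreign writes) fit in one small allocator. The following is an added example, not the stack's or the OSA layer's code: xx_alloc/xx_check/xx_free, the magic values and the canary_scan helper are constructed here, and the caller address is passed in explicitly rather than read from the link register so it runs anywhere.

```c
/* Added example: a guard-block allocator with head/tail magic numbers and a
 * list walk that reports corrupted blocks, plus the 0x55 canary sweep from the
 * update above. Names and magic values are constructed, not from the project. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAGIC_HEAD 0xC0DEFACEu
#define MAGIC_TAIL 0xBEEF5A5Au
#define ALIGN 8u

typedef struct Block {
    uint32_t magic_head;         /* beginning-of-block marker */
    size_t requested;            /* what the caller asked for */
    size_t actual;               /* rounded up for alignment */
    struct Block *prev, *next;   /* doubly linked list of live blocks */
    void *caller;                /* where xx_alloc was called from (LR on the real target) */
    /* user data follows; the tail magic follows the user data */
} Block;

static Block *g_list = NULL;

static size_t round_up(size_t n) { return (n + ALIGN - 1) & ~(size_t)(ALIGN - 1); }
static uint32_t *tail_of(Block *b) { return (uint32_t *)((char *)(b + 1) + b->actual); }

void *xx_alloc(size_t n, void *caller) {
    size_t actual = round_up(n);
    Block *b = malloc(sizeof *b + actual + sizeof(uint32_t));
    if (!b) return NULL;
    b->magic_head = MAGIC_HEAD; b->requested = n; b->actual = actual; b->caller = caller;
    b->prev = NULL; b->next = g_list;
    if (g_list) g_list->prev = b;
    g_list = b;
    memcpy(tail_of(b), &(uint32_t){MAGIC_TAIL}, sizeof(uint32_t));
    return b + 1;
}

/* 0 = intact, 1 = head magic gone (underflow or foreign pointer),
 * 2 = tail magic gone (write past the end of the block). */
int xx_check(void *p) {
    Block *b = (Block *)p - 1;
    uint32_t tail;
    if (b->magic_head != MAGIC_HEAD) return 1;
    memcpy(&tail, tail_of(b), sizeof tail);
    return tail == MAGIC_TAIL ? 0 : 2;
}

/* Refuses to unlink a corrupted block; returns the xx_check() code. */
int xx_free(void *p) {
    int rc = xx_check(p);
    if (rc != 0) return rc;
    Block *b = (Block *)p - 1;
    if (b->prev) b->prev->next = b->next; else g_list = b->next;
    if (b->next) b->next->prev = b->prev;
    b->magic_head = 0;               /* a later double free fails the head check */
    free(b);
    return 0;
}

/* Walk every live block; returns how many are corrupted. */
int xx_check_all(void) {
    int bad = 0;
    for (Block *b = g_list; b; b = b->next) if (xx_check(b + 1)) bad++;
    return bad;
}

/* The isolation test: offset of the first byte that is no longer `fill`,
 * or -1 if nobody wrote into the region. */
long canary_scan(const unsigned char *buf, size_t len, unsigned char fill) {
    for (size_t i = 0; i < len; i++) if (buf[i] != fill) return (long)i;
    return -1;
}

/* Demo: one byte written past the requested size kills the tail magic,
 * the list walk finds the block, and after repair it frees cleanly. */
static int demo_overflow_detected(void) {
    char *p = xx_alloc(16, NULL);
    if (!p || xx_check(p) != 0) return 0;
    p[16] = 'X';                              /* one byte past the block */
    int rc = xx_check(p);                     /* 2 */
    int found = xx_check_all();               /* 1 */
    memcpy(tail_of((Block *)p - 1), &(uint32_t){MAGIC_TAIL}, sizeof(uint32_t));  /* repair for the demo */
    return (rc == 2 && found == 1 && xx_free(p) == 0) ? 1 : 0;
}

/* Demo of the canary sweep: a simulated foreign write at offset 200. */
static long demo_canary(void) {
    unsigned char region[256];
    memset(region, 0x55, sizeof region);
    region[200] = 0x00;
    return canary_scan(region, sizeof region, 0x55);
}

int main(void) {
    printf("overflow detected: %d, first foreign write at offset %ld\n",
           demo_overflow_detected(), demo_canary());
    return 0;
}
```

On the target, calling xx_check_all() from a periodic low-priority task (or from a task switch hook, as the answer below suggests) turns a silent taskDestroy() into a log line naming the first corrupted block and its recorded caller.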
If your task exits, taskDestroy() is called. If you are suspicious of huge xxAlloc, verify that the allocation code is not calling exit() when memory is exhausted. I've been bitten by this behavior in a third party OSAL before.
Sounds like you are debugging after integration; this can be a hell of a job.
I suggest breaking the problem into smaller pieces.
Process
1) You can get more insight by instrumenting the code and/or using VxWorks instrumentation (depending on which version). This allows you to get more visibility into what happens. Be sure to log everything to a file, so you can move back in time from the point where the task ends. Instrumentation is a worthwhile investment, as it will be handy on more occasions. Interesting hooks in VxWorks: taskHookLib
2) Memory allocation/deallocation is very fundamental functionality. It would be my first candidate for thorough (unit) testing in a well-defined multi-thread environment. If you have done this and no errors are found, I'd first start to look at why the task has ended.
Other possible causes
A task will also end when the work is done, so it may be a return caused by a not-so-endless loop. Especially if it is always the same task, this would be my guess.
And some versions of VxWorks have MMU support which must be considered.