Writing to RAM in EL1 on ARMv8 - exception

I am writing my own RTOS for the ARMv8 architecture, and I am using U-Boot. When my board boots, it switches to EL1, but I cannot read or write any values in RAM. Is there a way to disable the translation tables? Or is the problem elsewhere? Thanks in advance for any answers :)

I disabled the MMU via SCTLR_EL1. I also added some assembly code by hand, and with it I can read RAM. The code generated by GCC, however, halts the CPU. Here are the relevant lines from the .s files.
This works:
ldr w0, =__bss_size
This is generated by GCC and does not work:
adrp x19, __bss_size
ldr w0, [x19, #:lo12:__bss_size]

Related

QEMU MIPS32 - 16550 Uart Implementation on a Custom Board

I’m trying to use QEMU to emulate a piece of firmware, but I’m having trouble getting the UART device to properly update the Line Status Register and display the input character.
Details:
Target device: Qualcomm QCA9533 (Documentation here if you're curious)
Target firmware: VxWorks 6.6 with U-Boot bootloader
CPU: MIPS 24Kc
Board: mipssim (modified)
Memory: 512MB
Command used: qemu-system-mips -S -s -cpu 24Kc -M mipssim -nographic -device loader,addr=0xBF000000,cpu-num=0 -serial /dev/ttyS0 -bios target_image.bin
I have to apologize here, but I am unable to share my source. However, as I am attempting to retool the mipssim board, I have only made minor changes to the code, which are as follows:
Rebased bios memory region to 0x1F000000
Changed load_image_targphys() target address to 0x1F000000
Changed $pc initial value to 0xBF000000 (TLB remap of 0x1F000000)
Replaced the mipssim serial_init() call with serial_mm_init(isa, 0x20000, env->irq[0], 115200, serial_hd(0), DEVICE_NATIVE_ENDIAN).
While it seems like serial_init() is probably the currently accepted standard, I wasn’t having any luck with remapping it. I noticed the malta board had no issues outputting on a MIPS test kernel I gave it, so I tried to mimic what was done there. However, I still cannot understand how QEMU works and I am unable to find many good resources that explain it. My slog through the source and included docs is ongoing, but in the meantime I was hoping someone might have some insight into what I’m doing wrong.
The binary loads and executes correctly from address 0xBF000000, but hangs when it hits the first UART polling loop. A look at mtree in the QEMU monitor shows that the I/O device is mapped correctly to address range 0x18020000-0x1802003F, and when the firmware writes to the Tx buffer, gdb shows that the character is successfully written to memory. There's just no further action from the serial device to pull that character and display it, so the firmware endlessly polls the LSR waiting for an update.
Is there something I’m missing when it comes to serial/hardware interaction in QEMU? I would have assumed that remapping all of the existing functional components of the mipssim board would be enough to at least get serial communication working, especially since the target uses the same 16550 UART that mipssim does. Please let me know if you have any insights. It would be helpful if I could find a way to debug QEMU itself with symbols, but at the same time I’m not totally sure what I’d be looking for. Even advice on how to scope down the issue would be useful.
Thank you!
Well, after a lot of hard work I got the UART working. The answer lies in the serial_ioport_read() and serial_ioport_write() functions. These two functions are registered as the callbacks that QEMU invokes when data is read from or written to the MemoryRegion for the serial device (which is initialized in serial_init() or serial_mm_init()). They do a bit of masking on the address (passed in as addr) to determine which register is being referenced, then return the value from the SerialState struct corresponding to that register. It's surprisingly simple, but I guess everything seems simple once you've figured it out. The big turning point was the realization that QEMU effectively implements the serial device as a MemoryRegion with special functionality that is triggered on a memory operation.
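To make the idea concrete, here is a minimal standalone sketch of a register-dispatching device model. It is not QEMU's code: ToyUart, toy_uart_read(), toy_uart_write() and the other toy_* names are hypothetical, and only two 16550 registers are modelled. It also assumes the registers sit one byte apart, which may not match a board that uses a wider register stride.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model of a 16550-style UART. NOT QEMU's implementation. */
enum {
    TOY_REG_DATA = 0, /* RBR when read, THR when written */
    TOY_REG_LSR  = 5  /* Line Status Register */
};

#define TOY_LSR_DR   0x01u /* received data ready */
#define TOY_LSR_THRE 0x20u /* transmit holding register empty */
#define TOY_LSR_TEMT 0x40u /* transmitter empty */

typedef struct {
    uint8_t  rbr;      /* last received byte */
    uint8_t  lsr;      /* line status bits */
    uint8_t  last_tx;  /* last byte the guest transmitted */
    unsigned tx_count; /* number of bytes transmitted */
} ToyUart;

static void toy_uart_reset(ToyUart *s)
{
    assert(s != NULL);
    s->rbr = 0;
    s->last_tx = 0;
    s->tx_count = 0;
    s->lsr = TOY_LSR_THRE | TOY_LSR_TEMT; /* idle transmitter */
}

/* "Read callback": mask the address down to a register index. */
static uint64_t toy_uart_read(ToyUart *s, uint64_t addr)
{
    assert(s != NULL);
    switch (addr & 7) {
    case TOY_REG_DATA:
        s->lsr &= (uint8_t)~TOY_LSR_DR; /* reading consumes the byte */
        return s->rbr;
    case TOY_REG_LSR:
        return s->lsr;
    default:
        return 0; /* other registers are not modelled here */
    }
}

/* "Write callback": a store to THR is what triggers the transmit. */
static void toy_uart_write(ToyUart *s, uint64_t addr, uint64_t val)
{
    assert(s != NULL);
    switch (addr & 7) {
    case TOY_REG_DATA:
        s->last_tx = (uint8_t)val;
        s->tx_count++;
        /* The toy "sends" instantly, so the transmitter is empty again. */
        s->lsr |= TOY_LSR_THRE | TOY_LSR_TEMT;
        break;
    default:
        break; /* writes to unmodelled registers are ignored */
    }
}

/* Host side pushing a character into the device. */
static void toy_uart_receive(ToyUart *s, uint8_t ch)
{
    assert(s != NULL);
    s->rbr = ch;
    s->lsr |= TOY_LSR_DR;
}
```

The point of the sketch is that nothing happens "in memory" by itself: the status bits a polling loop waits on only change inside the callbacks, so if the guest's accesses never reach them (wrong base, wrong register spacing), the loop spins forever.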
Anyway, hope this helps someone in the future avoid the nightmare I went through. Cheers!

How to cross-compile source code for MIPS without delay-slot implemented

I am trying to cross-compile my C code for the MIPS architecture.
I am using a cross-compiler that I built on my host computer. I have an FPGA board that implements a multi-cycle (not pipelined) MIPS core. I was wondering if I could compile the code without delay slots, yet keep the other optimizations made by GCC.
Even the simplest optimization flag, -O, still uses delay slots. So I was wondering if there are any options to simply disable delay slots when cross-compiling for MIPS.

Several questions about CUDA and cuPrintf

I can successfully compile my code using cuPrintf with nvcc, but cannot compile it in the Visual Studio 2012 environment. It says that "volatile char *" cannot be converted to "const void *" in the "cudaMemcpyToSymbol" function.
cuPrintf doesn't seem to work; no cuPrintf call appears to be executed in the kernel code.
How can I make nvcc export a pdb file?
Is there any other convenient way to debug inside a kernel function? I have only one laptop.
1st, cuPrintf is deprecated (as far as I know it has never been released). You can print data from a kernel using printf(), but this is not a recommended way of debugging your kernels.
2nd, you are compiling with the CUDA nvcc compiler, and there is no such thing as a pdb file in CUDA. Do look at the -g and -G flags (host and device debug information), but be aware that they may dramatically increase your running time.
3rd, the best way to debug kernels is with Nsight (Visual Studio Edition).

Why is the initialization of the GPU taking very long on Kepler architecture and how to fix this?

When running my application, the very first cuda_malloc takes 40 seconds, which is due to the initialization of the GPU. When I build in debug mode this reduces to 5 seconds, and when I run the same code on a Fermi device it takes far less than a second (not even worth measuring in my case).
Now the funny thing is that if I compile for this specific architecture, using the flag sm35 instead of sm20, it becomes fast again. As I should not use any new sm35 features just yet, how can I compile for sm20 and not have this huge delay? Also, I am curious what is causing this delay. Is the machine code recompiled on the fly into sm35 code?
P.S. I run on Windows, but a colleague of mine encountered the same problem, probably on Windows. The device is a Kepler, driver version 320.
Yes, the machine code is recompiled on the fly. This is called the JIT-compile step, and it will occur any time the machine code does not match the device that is being used (and assuming valid PTX code exists in the executable.)
You can learn more about JIT-compile here. Note the discussion of the cache which should alleviate the issue after the first run.
If you specify compilation for both sm_20 and sm_35 (for example, by passing one -gencode option per architecture to nvcc), you can build a binary/executable that will run quickly on both types of devices, and you will also be notified during compilation if you use an sm_35 feature that is not supported on sm_20.

Self-modifying program: Why does it raise an exception?

Just for the purposes of experimenting and playing around, I wrote the following short x64 assembly program:
.code
AsmFun proc
mov rax, MyLabel
mov byte ptr [rax], 0C3h ; C3 is x64 machine code for "ret"
MyLabel:
mov rax, 239847 ; This isn't "ret"
AsmFun endp
end
(I then called the code from C.)
It compiles/assembles/links just fine, but when I step through the program, Visual Studio complains that an unhandled exception has been raised: "Access violation writing location [MyLabel]", where of course it doesn't actually say "[MyLabel]", but rather the address that MyLabel happens to be at in memory.
Why is this happening? Is it a Windows thing that was put in place to avoid security exploits?
I live in the Linux world, but perhaps you can adapt what I've found out.
Memory pages are generally read-only if they have execute permission, so writing into your own code faults. I got around this with mmap() and mprotect()... I'm sure there's something similar in Windows. It's a good bet the Mono source code would shed some light.
I used mmap() to allocate a new page with write access (but not read or execute). I populated it, then called mprotect() to change it to read-only and executable.
Don't forget... there are registers you want to avoid trashing. See the ABI documentation for further details.
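A minimal sketch of that mmap()/mprotect() sequence on Linux might look like the following. The names demo_code, demo_fn and demo_make_function are made up for illustration, the machine-code bytes are hand-assembled for x86 and AArch64 only, and some hardened systems may refuse the switch to executable. It does not cover Windows.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical demo payload: a function that just returns 42. */
#if defined(__x86_64__) || defined(__i386__)
/* mov eax, 42 ; ret */
static const unsigned char demo_code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };
#elif defined(__aarch64__)
/* mov w0, #42 ; ret (little-endian instruction words) */
static const unsigned char demo_code[] = { 0x40, 0x05, 0x80, 0x52,
                                           0xC0, 0x03, 0x5F, 0xD6 };
#else
#error "demo_code has no encoding for this architecture"
#endif

typedef int (*demo_fn)(void);

/* Builds the function in a fresh page; returns NULL on failure. */
static demo_fn demo_make_function(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    assert(sizeof demo_code <= len);

    /* Step 1: a new anonymous page that is writable but not executable. */
    void *page = mmap(NULL, len, PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return NULL;

    /* Step 2: populate it while it is still writable. */
    memcpy(page, demo_code, sizeof demo_code);

    /* Step 3: drop write access and make it readable + executable. */
    if (mprotect(page, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(page, len);
        return NULL;
    }

    /* GCC/Clang builtin; needed on AArch64, effectively a no-op on x86. */
    __builtin___clear_cache((char *)page, (char *)page + sizeof demo_code);

    /* Object-to-function pointer cast: fine on POSIX, not strict ISO C. */
    return (demo_fn)page;
}
```

Calling demo_make_function() and then invoking the returned pointer runs the generated bytes. The page is never writable and executable at the same time, which is the property the original program violated by storing into its own code section.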