Opened 12 years ago

Closed 12 years ago

#407 closed defect (notadefect)

HelenOS/ppc32 broken on latest Qemu

Reported by: Jakub Jermář
Owned by: Martin Decky
Priority: major
Milestone: 0.5.0
Component: helenos/kernel/ppc32
Version: mainline
Keywords:
Cc: mark.cave-ayland@…
Blocker for:
Depends on:
See also:

Description

I am logging this ticket mainly for tracking purposes.

There is a problem in HelenOS, Qemu or OpenBIOS that causes HelenOS to fail at boot. It affects mainline,1356 as well as much older revisions, and it occurs with Qemu 1.0 as well as with much older and with newer versions. The failure looks like this:

jermar@phantom:~/software/HelenOS.mainline$ qemu-system-ppc -cdrom image.iso -boot d
qemu: fatal: Trying to execute code outside RAM or ROM at 0x70015f70

NIP 70015f70   LR 70015f70 CTR 00000000 XER 00000000
MSR 00001000 HID0 00000000  HF 00000000 idx 1
TB 00000000 77872323 DECR 4294964801
GPR00 000000007000b4c0 000000000061df9c 00000000700204d4 000000007001af98
GPR04 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR08 0000000000000000 000000000000001c 0000000000000001 0000000000000001
GPR12 000000007001af70 0000000000000000 000000000021a334 000000000021a384
GPR16 000000000021a358 0000000000222484 000000000021aaa0 000000000021a30c
GPR20 00000000002346a0 0000000000220454 000000007001942c 0000000070019800
GPR24 0000000070019400 0000000000000000 000000007000a724 0000000070018d04
GPR28 000000000021a358 0000000000222484 000000000021aaa0 000000000021a30c
CR 22000022  [ E  E  -  -  -  -  E  E  ]             RES ffffffff
FPR00 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR04 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR08 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR12 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPSCR 00000000
 SRR0 70011aa8  SRR1 0000d030    PVR 00080301 VRSAVE 00000000
SPRG0 0061dff0 SPRG1 00000001  SPRG2 7001af70  SPRG3 08000000
SPRG4 00000000 SPRG5 00000000  SPRG6 00000000  SPRG7 00000000
 SDR1 07ff0000
Aborted
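
The MSR and SRR1 values in the dump above can be decoded into flag names with a few lines of shell. This is a minimal sketch, assuming the classic 32-bit PowerPC MSR bit layout; the function name `decode_msr` is made up for illustration and is not part of HelenOS or Qemu.

```shell
#!/bin/sh
# Decode a 32-bit PowerPC MSR value (hex digits, no 0x prefix) into flag names.
# Assumption: classic 32-bit PowerPC MSR bit layout.
decode_msr() {
    value=$((0x$1))
    flags=""
    for entry in POW:0x40000 ILE:0x10000 EE:0x8000 PR:0x4000 FP:0x2000 \
                 ME:0x1000 FE0:0x800 SE:0x400 BE:0x200 FE1:0x100 \
                 IP:0x40 IR:0x20 DR:0x10 RI:0x2 LE:0x1; do
        name=${entry%%:*}    # text before the colon
        mask=${entry##*:}    # text after the colon
        if [ $((value & $mask)) -ne 0 ]; then
            flags="${flags:+$flags }$name"
        fi
    done
    printf '%s\n' "${flags:-none}"
}

decode_msr 00001000   # MSR from the dump  -> ME
decode_msr 0000d030   # SRR1 from the dump -> EE PR ME IR DR
```

With only ME set, instruction and data address translation (IR, DR) are off at the time of the abort, while the saved SRR1 shows both enabled. Whether that is related to the "outside RAM or ROM" error is a guess and is not established by the dump alone.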

Interestingly, very old versions of Qemu (e.g. 0.11.1) do not exhibit this problem. Reportedly, newer versions of Qemu with older OpenBIOS do not show the problem either.

This issue is unfortunate because it means the newest version of Qemu cannot be used to test all supported targets.

Change History (9)

comment:1 by Martin Decky, 12 years ago

FYI: The system still boots fine on a real iMac G4. Therefore, the root cause of the problem might actually be in OpenBIOS.

comment:2 by Jakub Jermář, 12 years ago

Hm, this is interesting. Thanks for testing, btw. Note that I have been testing this against the default Qemu ppc machine, which is g3beige. I wonder if the results would be different for the other PowerMac machine.

In the meantime, I also experimented with various versions of Qemu and the OpenBIOS version from Qemu 0.11.1. It turns out that 0.11.1 is the last version that actually worked for me. The entire 0.12 series didn't work for some unrelated reasons, and I believe the situation with the 0.13 series was the same. Some of the later versions of Qemu would boot with the 0.11.1 OpenBIOS, but the emulator would still crash shortly after the kernel booted.
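
The combinations discussed in this comment (default g3beige machine versus the other PowerMac machine, bundled versus older OpenBIOS) can be written down as command lines. The helper below is only a sketch: `qemu_ppc_cmd` is a hypothetical name, `image.iso` is taken from the report, and `./openbios-ppc-0.11.1` is a placeholder path for a firmware image copied out of an older Qemu release. It prints the command instead of running it.

```shell
#!/bin/sh
# Print (not run) a qemu-system-ppc command line.
# $1: machine type, e.g. g3beige (the default) or mac99
# $2: optional path to an alternative OpenBIOS image (placeholder path below)
qemu_ppc_cmd() {
    machine=${1:-g3beige}
    bios=${2:-}
    cmd="qemu-system-ppc -M $machine -cdrom image.iso -boot d"
    if [ -n "$bios" ]; then
        cmd="$cmd -bios $bios"
    fi
    printf '%s\n' "$cmd"
}

qemu_ppc_cmd g3beige                          # default machine, bundled firmware
qemu_ppc_cmd mac99                            # the other PowerMac machine
qemu_ppc_cmd g3beige ./openbios-ppc-0.11.1    # older OpenBIOS image
```

To actually boot one of the variants, the printed line can be piped to `sh`.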

comment:3 by Mark Cave Ayland, 12 years ago

Hi Jermar,

I saw your message on Artyom's blog related to problems with HelenOS on ppc32 and QEMU, and as a developer on the OpenBIOS project wanted to try and help debug this.

AFAICT there are actually 2 bugs here: one in QEMU, and one in OpenBIOS. First, we need to sort out the bug in QEMU and then once that is done, it shouldn't be too hard to find out what the problem is in OpenBIOS.

Following your blog post, I took the OpenBIOS PPC image from QEMU 0.11 stable branch and performed a git bisect on the QEMU source (fixing some temporary breakages as required) which gave me the following as the first bad commit:

commit 41557447d30eeb944e42069513df13585f5e6c7f
Author: Alexander Graf <agraf@…>
Date: Fri Sep 10 15:08:34 2010 +0000

PPC: Redesign interrupt trigger path

According to the Book3S spec, the interrupt context starts with an MSR
value that is rather simple. If we leave out the HV case, it's almost
always 0.

To reflect this, let's redesign the way that MSR value gets calculated.
Using this, we also squash the bug where MSR_POW can slip through into
the interrupt handler MSR.

Reported-by: Thomas Monjalon <thomas.monjalon@…>
Signed-off-by: Alexander Graf <agraf@…>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@…>

:040000 040000 d5671d029caa3777d5ba670abf34ada79d1baf4b 6b6deb8c8a81683c75cdc001e79e9f4e1d85ab3f M target-ppc

Before this commit, the HelenOS 0.4.2 image from http://www.helenos.org/download works fine, and afterwards it hangs just after emitting "init: Spawning /srv/devfs" on the console.

I've tried using the same OpenBIOS from QEMU 0.11 stable on QEMU git master, and this gives exactly the same problem, whereas OpenBIOS from SVN trunk doesn't make it that far and stops at "Booting the kernel…". However, I'd suggest raising a bug report on qemu-devel so that the interrupt problem gets fixed with the old QEMU 0.11 OpenBIOS first. Alex Graf is still active on the PPC development front and should hopefully be able to help you with this.

Once this is working with the older OpenBIOS from 0.11, please raise another bug report on the OpenBIOS list, and we should then be able to fix the OpenBIOS problem without too much difficulty.

HTH,

Mark.

comment:4 by Mark Cave Ayland, 12 years ago

Cc: mark.cave-ayland@… added

comment:5 by Jakub Jermář, 12 years ago (in reply to: 3)

Hi Mark,

Thanks for taking the initiative and looking into these issues!

I will try to reproduce your findings tonight using the latest mainline build of HelenOS (with the barebone option turned on). There were some serious problems on our side with the PPC memory management in 0.4.3 and earlier, and I would like to rule out (or at least minimize) the possibility that it is actually HelenOS which is misbehaving.

Jakub

comment:6 by Jakub Jermář, 12 years ago

Ok, mainline,1408 behaves much like 0.4.2; the only difference is that devfs was renamed to locfs in the meantime. Qemu commit 41557447d30eeb944e42069513df13585f5e6c7f introduced a regression which makes HelenOS hang at the following line: "Spawning /srv/locfs".

Going to file a Qemu bug for this regression.

comment:7 by Jakub Jermář, 12 years ago

Logged Qemu Bug 942299.

comment:8 by Mark Cave Ayland, 12 years ago

I've just added a status update on these patches to the above Launchpad ticket: currently one has been applied, and the second has been rewritten and is still pending review.

comment:9 by Jakub Jermář, 12 years ago

Resolution: notadefect
Status: new → closed

Mark, thanks for working on this and for creating both QEMU patches. I am now going to close the HelenOS ticket as the issue is clearly not present in QEMU git and HelenOS mainline any more.
