Comments (107)
Did you ever enable UMB at b0
by any chance?
Please check whether it also crashes with
FreeDOS.
from comcom64.
Nope, rest of dosemu.conf is pretty standard stuff
$ cat test-imagedir/dosemu.conf
$_lpt1 = ""
$_hdimage = "dXXXXs/c:hdtype1 +1"
$_floppy_a = ""
$_com1 = "/tmp/ttyV0"
$_pktdriver=(on)
$_vnet = "tap"
$_tapdev = "tap0"
from comcom64.
Can't reproduce...
from comcom64.
Though I didn't configure any bridges.
Could you confirm that the crash happens not
when vlm initializes, but later? In which case I
should configure the network too.
from comcom64.
Configuring and activating the bridge does not
help, still no crash. I wonder if it crashes for
you at the init stage or when handling some
packet? Please attach a -D9+P log.
from comcom64.
I don't think it got as far as looking for the NetWare server, as the only traffic I see on tap0 is spanning tree for the bridge.
root@polly:~# tshark -i tap0
Running as user "root" and group "root". This could be dangerous.
Capturing on 'tap0'
1 0.000000000 fe80::a4e2:a1ff:fe19:ec0 → ff02::16 ICMPv6 110 Multicast Listener Report Message v2
2 0.179970211 fe80::a4e2:a1ff:fe19:ec0 → ff02::16 ICMPv6 110 Multicast Listener Report Message v2
3 0.619964140 a6:e2:a1:19:0e:c0 → Spanning-tree-(for-bridges)_00 STP 52 Conf. Root = 32768/0/52:54:00:e8:3d:3c Cost = 0 Port = 0x8002
4 2.603962331 a6:e2:a1:19:0e:c0 → Spanning-tree-(for-bridges)_00 STP 52 Conf. Root = 32768/0/52:54:00:e8:3d:3c Cost = 0 Port = 0x8002
5 4.619960918 a6:e2:a1:19:0e:c0 → Spanning-tree-(for-bridges)_00 STP 52 Conf. TC + Root = 32768/0/52:54:00:e8:3d:3c Cost = 0 Port = 0x8002
6 6.603963806 a6:e2:a1:19:0e:c0 → Spanning-tree-(for-bridges)_00 STP 52 Conf. TC + Root = 32768/0/52:54:00:e8:3d:3c Cost = 0 Port = 0x8002
7 8.619967041 a6:e2:a1:19:0e:c0 → Spanning-tree-(for-bridges)_00 STP 52 Conf. TC + Root = 32768/0/52:54:00:e8:3d:3c Cost = 0 Port = 0x8002
8 10.603963211 a6:e2:a1:19:0e:c0 → Spanning-tree-(for-bridges)_00 STP 52 Conf. TC + Root = 32768/0/52:54:00:e8:3d:3c Cost = 0 Port = 0x8002
9 12.619963403 a6:e2:a1:19:0e:c0 → Spanning-tree-(for-bridges)_00 STP 52 Conf. TC + Root = 32768/0/52:54:00:e8:3d:3c Cost = 0 Port = 0x8002
10 14.603966118 a6:e2:a1:19:0e:c0 → Spanning-tree-(for-bridges)_00 STP 52 Conf. TC + Root = 32768/0/52:54:00:e8:3d:3c Cost = 0 Port = 0x8002
11 16.619962525 a6:e2:a1:19:0e:c0 → Spanning-tree-(for-bridges)_00 STP 52 Conf. TC + Root = 32768/0/52:54:00:e8:3d:3c Cost = 0 Port = 0x8002
12 18.603961349 a6:e2:a1:19:0e:c0 → Spanning-tree-(for-bridges)_00 STP 52 Conf. TC + Root = 32768/0/52:54:00:e8:3d:3c Cost = 0 Port = 0x8002
13 20.619963745 a6:e2:a1:19:0e:c0 → Spanning-tree-(for-bridges)_00 STP 52 Conf. TC + Root = 32768/0/52:54:00:e8:3d:3c Cost = 0 Port = 0x8002
14 22.603963868 a6:e2:a1:19:0e:c0 → Spanning-tree-(for-bridges)_00 STP 52 Conf. TC + Root = 32768/0/52:54:00:e8:3d:3c Cost = 0 Port = 0x8002
15 24.619963950 a6:e2:a1:19:0e:c0 → Spanning-tree-(for-bridges)_00 STP 52 Conf. TC + Root = 32768/0/52:54:00:e8:3d:3c Cost = 0 Port = 0x8002
16 26.603980642 a6:e2:a1:19:0e:c0 → Spanning-tree-(for-bridges)_00 STP 52 Conf. TC + Root = 32768/0/52:54:00:e8:3d:3c Cost = 0 Port = 0x8002
17 28.619970755 a6:e2:a1:19:0e:c0 → Spanning-tree-(for-bridges)_00 STP 52 Conf. Root = 32768/0/52:54:00:e8:3d:3c Cost = 0 Port = 0x8002
18 30.603965086 a6:e2:a1:19:0e:c0 → Spanning-tree-(for-bridges)_00 STP 52 Conf. Root = 32768/0/52:54:00:e8:3d:3c Cost = 0 Port = 0x8002
19 32.619966633 a6:e2:a1:19:0e:c0 → Spanning-tree-(for-bridges)_00 STP 52 Conf. Root = 32768/0/52:54:00:e8:3d:3c Cost = 0 Port = 0x8002
Here's the 9+P log
packet-log.zip
from comcom64.
So please verify that it's not related to the
network packets (by disabling the bridge). Also,
we have different versions of some software;
can you try my versions (from the
screenshot)?
from comcom64.
There's nothing on the other side of the bridge yet, output packets should just be disappearing.
Have you a link to the different client?
from comcom64.
But anyway, I removed tap0 from the bridge.
root@polly:~# brctl delif virbr0 tap0
root@polly:~# brctl show
bridge name bridge id STP enabled interfaces
virbr0 8000.525400e83d3c yes virbr0-nic
But still crashes.
Here's the client I'm using
nwclient.zip
from comcom64.
pdether.exe.gz
New pdether.
from comcom64.
vlm.exe.gz
Older vlm.
from comcom64.
Crashed with your pdether.exe
Crashed with your vlm.exe
from comcom64.
I am on 32bit with hardware vm86()
from comcom64.
Try cpu_emu, though I don't suppose that
would help...
from comcom64.
net.cfg.gz
net.cfg
from comcom64.
Crashed with '-I cpuemu fullsim'
from comcom64.
Crashed with '-I cpuemu fullsim'
As always, you haven't switched $_cpu_vm?
from comcom64.
Crashed with your net.cfg
from comcom64.
With -I 'cpuemu fullsim' -I 'cpu_vm emulated'
dosemu crashes immediately; I don't even see the boot.
log from gdb
Thread 1 "dosemu.bin" received signal SIGSEGV, Segmentation fault.
0xb39fd253 in ?? ()
(gdb) bt
#0 0xb39fd253 in ?? ()
#1 0x081412fb in CloseAndExec_x86 (PC=652327, mode=3, ln=623) at codegen-x86.c:3201
#2 0x0811eaf4 in _Interp86 (PC=652327, basemode=3) at interp.c:623
#3 0x0811d2ce in Interp86 (PC=11003, mod0=3) at interp.c:395
#4 0x08131b7f in e_vm86 () at cpu-emu.c:1144
#5 0x0810f851 in do_vm86 (x=0x88db060 <vm86u>) at do_vm86.c:433
#6 0x0810f8be in _do_vm86 () at do_vm86.c:455
#7 0x0811006b in run_vm86 () at do_vm86.c:590
#8 0x081100e7 in loopstep_run_vm86 () at do_vm86.c:614
#9 0x080affe7 in main (argc=12, argv=0xbffff6e4) at emu.c:422
from comcom64.
That's another ticket I think!
from comcom64.
Oh I do see a warning intermittently from codegen-x86.c about not being able to represent 'large number' in an int, so perhaps it's a 32bit thing there.
from comcom64.
Don't use fullsim...
from comcom64.
What then, vm86sim?
from comcom64.
Yes!
from comcom64.
Appears the same
[New Thread 0x80cc2b40 (LWP 28467)]
ERROR: fdpp booting, this is very experimental!
Thread 1 "dosemu.bin" received signal SIGSEGV, Segmentation fault.
0xb39fd253 in ?? ()
(gdb) bt
#0 0xb39fd253 in ?? ()
#1 0x081412fb in CloseAndExec_x86 (PC=652327, mode=3, ln=623) at codegen-x86.c:3201
#2 0x0811eaf4 in _Interp86 (PC=652327, basemode=3) at interp.c:623
#3 0x0811d2ce in Interp86 (PC=11003, mod0=3) at interp.c:395
#4 0x08131b7f in e_vm86 () at cpu-emu.c:1144
#5 0x0810f851 in do_vm86 (x=0x88db060 <vm86u>) at do_vm86.c:433
#6 0x0810f8be in _do_vm86 () at do_vm86.c:455
#7 0x0811006b in run_vm86 () at do_vm86.c:590
#8 0x081100e7 in loopstep_run_vm86 () at do_vm86.c:614
#9 0x080affe7 in main (argc=12, argv=0xbffff6e4) at emu.c:422
I think this is a tangent to the original problem.
from comcom64.
CloseAndExec_x86
should not be called in vm86sim mode...
from comcom64.
Please see why it is called; it's very simple
(it just shouldn't be registered as a codegen callback)
from comcom64.
Seems to be switched on config.cpusim
value, but that doesn't seem to be modified in global.conf?
from comcom64.
#ifdef HOST_ARCH_X86
    if (config.cpusim)
        InitGen_sim();
    else
        InitGen_x86();
#else
    InitGen_sim();
#endif
InitGen_sim() should be in effect for you, InitGen_x86() not.
from comcom64.
| CPUEMU cpuemu
    {
#ifdef X86_EMULATOR
        config.cpuemu = $2;
        if (config.cpuemu > 4) {
            config.cpuemu -= 2;
#ifdef HOST_ARCH_X86
            config.cpusim = 1;
#endif
        }
So it's not modified directly from global.conf
from comcom64.
cpuemu : L_OFF { $$ = 0; }
| VM86 { $$ = 3; }
| FULL { $$ = 4; }
| VM86SIM { $$ = 5; }
| FULLSIM { $$ = 6; }
Not sure why there is a gap 0...3, but it should work.
from comcom64.
Looks like it silently accepts two -I
options, but one overrides the other. This seems to work
-I 'cpuemu vm86sim cpu_vm emulated'
CONF: config variable c_comline set
Parsing commandline statements.
CONF: Parsing commandline file.
CONF: simulated CPUEMU set to 3 for 586
CONF: CPU VM set to 2
CONF: config variable c_comline unset
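The workaround above amounts to folding several -I statements into one option before calling the binary. A minimal Python sketch of that idea follows; the helper name merge_I_options is made up for illustration and is not part of dosemu, and it only assumes that -I takes exactly one argument.

```python
def merge_I_options(argv):
    """Fold every `-I <statements>` pair into a single trailing -I option.

    Hypothetical helper: it only rearranges an argument list and does not
    know anything else about dosemu's command line.
    """
    merged, rest = [], []
    it = iter(argv)
    for arg in it:
        if arg == "-I":
            stmt = next(it, None)
            if stmt is None:
                raise ValueError("-I needs an argument")
            merged.append(stmt)
        else:
            rest.append(arg)
    if merged:
        # Statements are joined with spaces, as in the working command line.
        rest += ["-I", " ".join(merged)]
    return rest


print(merge_I_options(["dosemu.bin", "-I", "cpuemu vm86sim", "-I", "cpu_vm emulated"]))
```

With the two options from this thread, the result carries a single -I 'cpuemu vm86sim cpu_vm emulated', which is the form that was accepted here.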
from comcom64.
So finally a new log with -D9+P
log2.zip
from comcom64.
and yes still crashes with tap0 removed from the bridge.
from comcom64.
Not sure why there is a gap 0...3, but it should work.
Pretty sure it's for this horrid -= 2 fixup
#ifdef X86_EMULATOR
    config.cpuemu = $2;
    if (config.cpuemu > 4) {
        config.cpuemu -= 2;
#ifdef HOST_ARCH_X86
        config.cpusim = 1;
#endif
    }
    c_printf("CONF: %s CPUEMU set to %d for %d86\n",
        CONFIG_CPUSIM ? "simulated" : "JIT",
        config.cpuemu, (int)vm86s.cpu_type);
#endif
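To make the gap and the -= 2 fixup easier to follow, here is a small Python model of the two quoted snippets. It is only a sketch: the dictionary keys and the function name apply_cpuemu are illustrative, the token values are the ones quoted from the grammar, and it assumes X86_EMULATOR is defined.

```python
# Token values as quoted from the grammar (L_OFF, VM86, FULL, VM86SIM, FULLSIM).
CPUEMU_TOKENS = {"off": 0, "vm86": 3, "full": 4, "vm86sim": 5, "fullsim": 6}


def apply_cpuemu(keyword, host_is_x86=True):
    """Return (config.cpuemu, config.cpusim) as the quoted fixup would set them."""
    cpuemu = CPUEMU_TOKENS[keyword]
    cpusim = 0
    if cpuemu > 4:
        # The *sim variants reuse the vm86/full values and flag the simulator.
        cpuemu -= 2
        if host_is_x86:
            cpusim = 1
    return cpuemu, cpusim


for kw in CPUEMU_TOKENS:
    print(kw, apply_cpuemu(kw))
```

In this model vm86sim ends up as cpuemu 3 with cpusim set, which matches the "simulated CPUEMU set to 3" line in the log above.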
from comcom64.
Please file bugs about -I, the segfaults
and all that.
from comcom64.
-I is probably okay: the dosemu start script joins multiple ones together, but I was calling the binary directly. Incidentally, I noticed that the -valgrind option will have a similar problem, I think.
from comcom64.
So in the meantime I switched back to my versions and FreeDOS. Do you know where Novell has the command-line tools for querying shares etc. available to download? Is it 'net'?
from comcom64.
I created the branch "vg" (in both projects)
with the valgrind changes. Unfortunately I
can't test it right now as dosemu -valgrind
seems to be broken for me even w/o this patch...
Could you check what works for you?
from comcom64.
Using the vg branch of both, valgrind seems to work. Here's the last section before the problem
==26349==
==26349== Use of uninitialised value of size 4
==26349== at 0x815152A: Gen_sim (codegen-sim.c:2210)
==26349== by 0x81201CD: _Interp86 (interp.c:917)
==26349== by 0x811D2CD: Interp86 (interp.c:395)
==26349== by 0x8131B7E: e_vm86 (cpu-emu.c:1144)
==26349== by 0x810F850: do_vm86 (do_vm86.c:433)
==26349== by 0x810F8BD: _do_vm86 (do_vm86.c:455)
==26349== by 0x811006A: run_vm86 (do_vm86.c:590)
==26349== by 0x81100E6: loopstep_run_vm86 (do_vm86.c:614)
==26349== by 0x80AFFE6: main (emu.c:422)
==26349==
==26349== Invalid read of size 1
==26349== at 0x81450DB: Gen_sim (codegen-sim.c:642)
==26349== by 0x8120A16: _Interp86 (interp.c:983)
==26349== by 0x811D2CD: Interp86 (interp.c:395)
==26349== by 0x8131B7E: e_vm86 (cpu-emu.c:1144)
==26349== by 0x810F850: do_vm86 (do_vm86.c:433)
==26349== by 0x810F8BD: _do_vm86 (do_vm86.c:455)
==26349== by 0x811006A: run_vm86 (do_vm86.c:590)
==26349== by 0x81100E6: loopstep_run_vm86 (do_vm86.c:614)
==26349== by 0x80AFFE6: main (emu.c:422)
==26349== Address 0xda1f930 is in a rwx mapped file /dev/shm/dosemu_26349 (deleted) segment
==26349==
MCB corruption
ERROR: fdpp: abort at memmgr.cc:332
==26381==
==26381== HEAP SUMMARY:
==26381== in use at exit: 19,488,680 bytes in 7,864 blocks
==26381== total heap usage: 68,227 allocs, 60,363 frees, 29,697,750 bytes allocated
==26381==
==26381== LEAK SUMMARY:
==26381== definitely lost: 17,533 bytes in 8 blocks
==26381== indirectly lost: 776 bytes in 1 blocks
==26381== possibly lost: 41,433 bytes in 1,776 blocks
==26381== still reachable: 19,428,938 bytes in 6,079 blocks
==26381== suppressed: 0 bytes in 0 blocks
==26381== Rerun with --leak-check=full to see details of leaked memory
==26381==
==26381== For counts of detected and suppressed errors, rerun with: -v
==26381== Use --track-origins=yes to see where uninitialised values come from
==26381== ERROR SUMMARY: 3641 errors from 446 contexts (suppressed: 3 from 1)
Are there any options you'd like me to add to valgrind?
from comcom64.
OK, -valgrind didn't work for me because of the silly
-pg build. So I merged the valgrind support now, and
I am sure it will help to debug this.
You need to ignore the uninitialized errors because
valgrind doesn't have the notion of r/o memory.
But the problem is, vlm crashes for me under
valgrind w/o "MCB corruption" msg.
from comcom64.
Ignore invalid reads.
Hunt for invalid writes.
And update.
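One way to "ignore invalid reads and hunt for invalid writes" in a long log is to filter it mechanically. The sketch below assumes the usual memcheck layout seen in the logs above, where a report starts with "==PID== Invalid write of size N" and ends at a bare "==PID==" line; the function name invalid_writes is made up for illustration.

```python
import re

# First line of a memcheck invalid-write report, e.g.
# "==26349== Invalid write of size 1".
INVALID_WRITE = re.compile(r"^==(\d+)== Invalid write of size (\d+)")


def invalid_writes(lines):
    """Yield (pid, size, report_lines) for every invalid-write report."""
    lines = list(lines)
    i = 0
    while i < len(lines):
        m = INVALID_WRITE.match(lines[i])
        if not m:
            i += 1
            continue
        pid = m.group(1)
        report = []
        i += 1
        # Collect the backtrace up to the bare "==PID==" separator line.
        while i < len(lines) and lines[i].rstrip() != f"=={pid}==":
            report.append(lines[i].rstrip())
            i += 1
        yield int(pid), int(m.group(2)), report


# Typical use (the file name is only an example):
# for pid, size, report in invalid_writes(open("valgrind.log")):
#     print(pid, size, *report, sep="\n")
```

Invalid reads and uninitialised-value reports are skipped simply because they never match the pattern.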
from comcom64.
Use dosemu -valgrind
please.
from comcom64.
I didn't use the dosemu script, but I harvested your -valgrind options from it
VLG="valgrind --log-file=valgrind.log"
VLG_DOSEMU_ARGS="-I 'cpuemu vm86sim cpu_vm emulated cpu_vm_dpmi kvm'"
I hope that's alright?
Here are the log files
t1.zip
Edit: This is back on fdpp/master and dosemu2/devel
from comcom64.
No, that's not right.
Where is VG="valgrind --trace-children=yes --track-origins=yes"?
I think you are complicating things without any need
for this. The script works.
from comcom64.
Also, the valgrind log does not show the point
where the "MCB corruption" was printed.
from comcom64.
With the script that would be visible, as both print
to the console...
from comcom64.
I'm using your script now. The only way I know of to capture it is `> valgrind.log 2>&1`, but that mixes the stdout and stderr streams, so synchronisation is dubious
t2.zip
from comcom64.
It has only a couple of "Invalid write of size" msgs,
and I've seen that with freecom. Please try comcom32 -
it doesn't produce any errors (not because it's that
good, but just because valgrind can't catch kvm).
If the writes are indeed related to freecom, try to
disable dos=high,umb.
from comcom64.
I removed the "read" errors.
Not sure why they were there, but they
definitely do not help us fix this bug.
So please update.
from comcom64.
I added more patches, and now, loading
the entire ipx stack, there is only 1 read error
on my screen. If you have more than that,
then we are progressing. :)
from comcom64.
And that read error comes from lsl, which
probably traverses the mcb chain by hand.
from comcom64.
Tested latest git as before with both freecom and comcom32, although comcom seems to crash dosemu itself.
t4-comcom32.zip
t3-freecom.zip
I don't think the comcom crash is anything to do with valgrind as it seems to occur without it too.
C:\NWCLIENT>startnet
C:\NWCLIENT>SET NWLANGUAGE=ENGLISH
C:\NWCLIENT>C:\NWCLIENT\LSL.COM
Novell Link Support Layer for DOS ODI v2.20 (960401)
(c) Copyright 1990 - 1996, by Novell, Inc. All rights reserved.
BUFFERS 4 1514
The configuration file used was "C:\NWCLIENT\NET.CFG".
Max Boards 4, Max Stacks 4
Buffers 4, Buffer size 1514 bytes, Memory pool 0 bytes.
C:\NWCLIENT>
C:\NWCLIENT>rem C:\NWCLIENT\NE2000.COM
C:\NWCLIENT>C:\pdether\pdether
Ethernet Packet Driver MLID, v1.01, built Nov 01 1991 at 16:30:31
PDEther installed successfully.
C:\NWCLIENT>
C:\NWCLIENT>C:\NWCLIENT\IPXODI.COM
NetWare IPX/SPX Protocol v3.03 (960611)
(C) Copyright 1990-1995 Novell, Inc. All Rights Reserved.
Bound to logical board 1 (PDETHER) : Protocol ID 8137
C:\NWCLIENT>C:\NWCLIENT\VLM.EXE
VLM.EXE - NetWare virtual loadable module manager v1.21 (960514)
(C) Copyright 1993 - 1996 Novell, Inc. All Rights Reserved.
Patent pending.
Patent No. 5,349,642.
The VLM.EXE file is pre-initializing the VLMs.............
The VLM.EXE file is using extended memory (XMS).
ERROR: general protection at 0xb6052c10: 50
from comcom64.
Even though the comcom test was inconclusive I switched back to freecom and tried dos=low
first, then dos=low,noumb
and the fdpp crash/mcb corruption occurred with both.
from comcom64.
Fixed the error from the log.
Please re-do the log and insert manually
the separators so that it is clear what prog
produces what messages.
from comcom64.
Improved valgrinding a bit more.
from comcom64.
Here's the latest log, but I didn't manage to annotate it yet. Will keep trying but really need a DOS builtin command to write to unix stderr or stdout. I tried sprinkling 'unix echo.sh' in the startnet.bat where echo was a shell script that echoed command line to append to the log file, but strangely I got the echoes in the log, and the valgrind output in the DOS window!
t5.zip
from comcom64.
Why can't you just annotate it by hand?
from comcom64.
Please try again.
from comcom64.
I am currently running dosemu -valgrind -D9+ge.
It has been running for a few hours already, and is currently in
the process of starting ipxodi. :) The next thing to start is
vlm, and I'll have the full execution trace of the IPX
stack.
from comcom64.
Hand-annotated log; the positions were found by inserting 'PAUSE' into the batch file at each point. It seems like only VLM has problems.
t6.zip
from comcom64.
Cool, thanks.
So most errors are fixed, but the
crash is still there, and valgrind
shows no write errors, so it misses
the problem. This may mean that
the corruption comes from the fdpp side
and not from the dos side. If this is true,
we are very lucky. I'll work on checking
the fdpp side too, and we'll see.
The current code only checks the dos side,
and this gave nothing, so staying
optimistic.
from comcom64.
Added code to check for MCB corruption from fdpp side.
Please see the new log.
from comcom64.
new annotated logs t7.zip
from comcom64.
Added more annotations, but this is unlikely to
help. For some reason valgrind doesn't see the
corruption for you, but in all my tests it does...
So please update the log, and I'll get back to
this after processing another backlog of
regressions.
from comcom64.
Here's the latest (unfortunately I didn't have 'f' debug flag set, but the corruption event is still in test.log)
Strange how the valgrind memchk report ends up in the log, I didn't expect that.
t8.zip
from comcom64.
I see this in the log
==31987== More than 100 errors detected. Subsequent errors
==31987== will still be recorded, but in less detail than before.
Perhaps that's why you don't see the corruption in my logs?
from comcom64.
I also repeated the test and used dosdebug 'mcbs' command at each point
before LSL
dosdebug> mcbs
dosdebug>
ADDR(LOW) PARAS OWNER
0290:0000 0x0536 [DOS]
=> ADDR PARAS TYPE USAGE
0291:0000 0x000c [F] Files
029e:0000 0x0008 [D] Driver (EMUFS)
02a7:0000 0x0009 [D] Driver (EMS)
02b1:0000 0x029e [B] Buffers
0550:0000 0x0154 [F] Files
06a5:0000 0x008f [L] CDS Array
0735:0000 0x0080 [S] Stacks
07b6:0000 0x0010 [B] Buffers
07c7:0000 ------ [LINK]
0866:0000 0x0006 [FREE]
086d:0000 0x12c1 [COMMAND]
1b2f:0000 0x7e14 [FREE]
9944:0000 0x0619 [COMMAND]
9f5e:0000 0x0090 [COMMAND]
9fef:0000 0x0010 [COMMAND] (END)
before PDETHER
dosdebug> mcbs
dosdebug>
ADDR(LOW) PARAS OWNER
0290:0000 0x0536 [DOS]
=> ADDR PARAS TYPE USAGE
0291:0000 0x000c [F] Files
029e:0000 0x0008 [D] Driver (EMUFS)
02a7:0000 0x0009 [D] Driver (EMS)
02b1:0000 0x029e [B] Buffers
0550:0000 0x0154 [F] Files
06a5:0000 0x008f [L] CDS Array
0735:0000 0x0080 [S] Stacks
07b6:0000 0x0010 [B] Buffers
07c7:0000 ------ [LINK]
0866:0000 0x0006 [FREE]
086d:0000 0x00bc [COMMAND]
092a:0000 0x000f [FREE]
093a:0000 0x0306 [LSL]
0c41:0000 0x12c1 [02158]
1f03:0000 0x805a [FREE]
9f5e:0000 0x0090 [COMMAND]
9fef:0000 0x0010 [COMMAND] (END)
before IPXODI
dosdebug> mcbs
dosdebug>
ADDR(LOW) PARAS OWNER
0290:0000 0x0536 [DOS]
=> ADDR PARAS TYPE USAGE
0291:0000 0x000c [F] Files
029e:0000 0x0008 [D] Driver (EMUFS)
02a7:0000 0x0009 [D] Driver (EMS)
02b1:0000 0x029e [B] Buffers
0550:0000 0x0154 [F] Files
06a5:0000 0x008f [L] CDS Array
0735:0000 0x0080 [S] Stacks
07b6:0000 0x0010 [B] Buffers
07c7:0000 ------ [LINK]
0866:0000 0x0006 [FREE]
086d:0000 0x00bc [COMMAND]
092a:0000 0x000f [FREE]
093a:0000 0x0306 [LSL]
0c41:0000 0x00b5 [PDETHER]
0cf7:0000 0x12c1 [02158]
1fb9:0000 0x7fa4 [FREE]
9f5e:0000 0x0090 [COMMAND]
9fef:0000 0x0010 [COMMAND] (END)
before VLM
dosdebug> mcbs
dosdebug>
ADDR(LOW) PARAS OWNER
0290:0000 0x0536 [DOS]
=> ADDR PARAS TYPE USAGE
0291:0000 0x000c [F] Files
029e:0000 0x0008 [D] Driver (EMUFS)
02a7:0000 0x0009 [D] Driver (EMS)
02b1:0000 0x029e [B] Buffers
0550:0000 0x0154 [F] Files
06a5:0000 0x008f [L] CDS Array
0735:0000 0x0080 [S] Stacks
07b6:0000 0x0010 [B] Buffers
07c7:0000 ------ [LINK]
0866:0000 0x0006 [FREE]
086d:0000 0x00bc [COMMAND]
092a:0000 0x000f [FREE]
093a:0000 0x0306 [LSL]
0c41:0000 0x00b5 [PDETHER]
0cf7:0000 0x040f [IPXODI]
1107:0000 0x12c1 [02158]
23c9:0000 0x7b94 [FREE]
9f5e:0000 0x0090 [COMMAND]
9fef:0000 0x0010 [COMMAND] (END)
after crash
dosdebug> mcbs
dosdebug>
ADDR(LOW) PARAS OWNER
0290:0000 0x0536 [DOS]
=> ADDR PARAS TYPE USAGE
0291:0000 0x000c [F] Files
029e:0000 0x0008 [D] Driver (EMUFS)
02a7:0000 0x0009 [D] Driver (EMS)
02b1:0000 0x029e [B] Buffers
0550:0000 0x0154 [F] Files
06a5:0000 0x008f [L] CDS Array
0735:0000 0x0080 [S] Stacks
07b6:0000 0x0010 [B] Buffers
07c7:0000 ------ [LINK]
0866:0000 0x0006 [FREE]
086d:0000 0x00bc [COMMAND]
092a:0000 0x000f [04360]
093a:0000 0x0306 [LSL]
0c41:0000 0x00b5 [PDETHER]
0cf7:0000 0x040f [IPXODI]
1107:0000 0x0155 [VLM]
125d:0000 0x7804 [FREE]
8a62:0000 0x058c [04360]
8fef:0000 0x0fff [FREE]
9fef:0000 0x0010 [COMMAND] (END)
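A quick sanity check on such listings: in the DOS MCB layout each header occupies one paragraph, so the next header should sit at segment + paragraphs + 1. The Python sketch below applies that rule to the "after crash" listing above; the pairs are copied from the listing (from the first block after the [LINK] entry onward) and the function name chain_breaks is made up for illustration.

```python
# (segment, paragraphs) pairs copied from the "after crash" listing.
AFTER_CRASH = [
    (0x0866, 0x0006), (0x086D, 0x00BC), (0x092A, 0x000F), (0x093A, 0x0306),
    (0x0C41, 0x00B5), (0x0CF7, 0x040F), (0x1107, 0x0155), (0x125D, 0x7804),
    (0x8A62, 0x058C), (0x8FEF, 0x0FFF), (0x9FEF, 0x0010),
]


def chain_breaks(blocks):
    """Return (segment, expected_next, listed_next) for every entry where
    listed_next != segment + paragraphs + 1."""
    breaks = []
    for (seg, paras), (nxt, _) in zip(blocks, blocks[1:]):
        expected = seg + paras + 1
        if expected != nxt:
            breaks.append((seg, expected, nxt))
    return breaks


print(chain_breaks(AFTER_CRASH))
```

For the entries listed here the rule holds throughout, so the part of the chain that dosdebug prints looks arithmetically consistent even after the crash; that says nothing about blocks beyond the listed ones.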
from comcom64.
that bad memory was always zero until sometime during VLM load
dosdebug> d cc0d:0000
dosdebug>
cc0d:0000 06 CF 00 00 00 00 00 00 00 00 00 00 00 00 00 00 .O..............
cc0d:0010 00 00 54 CF 74 02 0E CC E1 02 0E CC 6B 03 0E CC ..TOt..La..Lk..L
cc0d:0020 8C 03 0E CC 79 04 0E CC 02 05 0E CC F4 00 0E CC ...Ly..L...Lt..L
cc0d:0030 14 01 0E CC 56 05 0E CC 0D 05 0E CC 00 00 00 00 ...LV..L...L....
cc0d:0040 4E 56 6C 6D 40 00 BB 00 00 55 BD 40 00 55 BD 01 NVlm@.;..U=@.U=.
cc0d:0050 00 55 BD 04 00 55 2E FF 1E C8 29 5D C3 55 BD 40 .U=..U..H)]CU=@
cc0d:0060 00 55 BD 01 00 55 BD 01 00 55 2E FF 1E C8 29 5D .U=..U=..U..H)]
cc0d:0070 C3 55 BD 40 00 55 BD 43 00 55 BD 06 00 55 2E FF CU=@.U=C.U=..U.
and it looks like code, not data or a minor corruption of an MCB
dosdebug> u cc0b:0000
dosdebug>
cc0b:0000 9C pushf
cc0b:0001 3D0516 cmp ax,1605
cc0b:0004 7406 je 000C ($+6)
cc0b:0006 9D popf
cc0b:0007 2EFF2E2706 jmp far word cs:[0627]
cc0b:000c 2EFF1E2706 call far word cs:[0627]
cc0b:0011 06 push es
cc0b:0012 0E push cs
cc0b:0013 07 pop es
cc0b:0014 58 pop ax
cc0b:0015 26891E2E06 mov es:[062E],bx
cc0b:001a 26A33006 mov es:[0630],ax
cc0b:001e BB2C06 mov bx,062C
dosdebug> u
dosdebug>
cc0b:0021 CF iret
I guess we need to see the previous entry in the MCB chain that led us here.
from comcom64.
Please update the logs.
from comcom64.
New logs t9.zip
from comcom64.
I am currently running dosemu -valgrind -D9+ge.
It has been running for a few hours already, and is currently in
the process of starting ipxodi. :)
I notice that after every run there are two orphaned valgrind processes left behind. I kill them (-9) but the system never feels responsive and the following valgrind runs take longer until I've rebooted the machine.
from comcom64.
If you know exactly which MCB gets
corrupted (it seems you do), maybe
you can check that the appropriate
fd_prot_mem() is called, and then
filter out all other addresses from that
call, to reduce the logging?
from comcom64.
the system never feels responsive and the following valgrind runs
take longer until I've rebooted the machine.
Maybe things got swapped out?
I've got 20gigorama and 8core CPU for
properly debugging dosemu2.
from comcom64.
I guess to properly run it under valgrind
with -D9+e, we'd need to wait for 20THz
cpus. :)
from comcom64.
If I wanted to tweak fdpp to log the binary at load, how would I convert the FP_DS_DX to char * for fdebug to handle?
diff --git a/kernel/inthndlr.c b/kernel/inthndlr.c
index d0f62c8..aaae7e7 100644
--- a/kernel/inthndlr.c
+++ b/kernel/inthndlr.c
@@ -1074,6 +1074,8 @@ dispatch:
case 0x4b:
break_flg = FALSE;
+ fdebug("#################### int21/4Bh Program exec\n");
+
rc = DosExec(lr.AL, MK_FP(lr.ES, lr.BX), FP_DS_DX);
goto short_check;
I injured my back again, so I can't sit for long periods without 'paying the price' when I get back up again!
from comcom64.
New logs t10.zip
from comcom64.
If I wanted to tweak fdpp to log the binary at load, how would I convert the FP_DS_DX to char *
For example with the GET_PTR() macro.
I injured my back again, so I can't sit for long periods
fdpp is not worth the health!
So please get well. :)
I'll take care of it while you are away.
from comcom64.
Would be good to get ssh for this btw.
from comcom64.
It appears that if you remove all *.vlm files, there is no
crash. This is why I wasn't able to reproduce it.
Now I can, though not with MCB corruption; it just
crashes.
Some debugging.
If you produce the -D+e log and search for "call 02B5"
in it, you'll see the following:
esi=000c0000 edi=00000059 ebp=0000356e esp=0000355c
vf=000b7216 cs=a28c ds=a1d9 es=c217
fs=0000 gs=0000 ss=8a73 flg=00030213
stk=0ace 8a73 356e 8a73 113a 8a73 1399 f6f4 0000 091e
Fetch e80114e8 at 000a2a5e mode 3
000a2a5e: e81401 a28c:019e call 02B5 ($+114)
CALL: ret=000001a1
** Jump taken to 000a2b75
(R) DR1=0000070a DR2=0000040a AR1=404ed462 AR2=404d5730
(R) SR1=00003556 TR1=000006d2
(R) RFL m=[BDA] v=0 cout=00000000 RES=00000000
== (1454) == Closing sequence at 000a2a61
(R) DR1=0000070a DR2=0000040a AR1=404ed462 AR2=404d5730
(R) SR1=00003556 TR1=000006d2
(R) RFL m=[BDA] v=0 cout=00000000 RES=00000000
eax=00000a07 ebx=0000070a ecx=00000000 edx=0000001a
esi=000c0000 edi=00000059 ebp=0000356e esp=0000355a
vf=000b7216 cs=a28c ds=a1d9 es=c217
fs=0000 gs=0000 ss=8a73 flg=00030213
stk=01a1 0ace 8a73 356e 8a73 113a 8a73 1399 f6f4 0000
Now search further for the nearest pop es:
eax=000002a9 ebx=00000024 ecx=00000000 edx=000002aa
esi=000c0000 edi=00000059 ebp=0000356e esp=0000355a
vf=000b7212 cs=a28c ds=a1d9 es=5252
fs=0000 gs=0000 ss=8a73 flg=00030246
stk=01a1 0ace 8a73 356e 8a73 113a 8a73 1399 f6f4 0000
Fetch 20b85a07 at 000a2c07 mode 3
000a2c07: 07 a28c:0347 pop es
(G) O_POP MODE+1 [DA]
(V) 000001a1
(R) DR1=000001a1 DR2=0000040a AR1=404ed43e AR2=404d5730
(R) SR1=0000355c TR1=000006ae
(R) RFL m=[DA] v=2 cout=00000000 RES=000002a9
(G) S_REG_WL ES [DA]
(R) DR1=000001a1 DR2=0000040a AR1=404ed43e AR2=404d5730
(R) SR1=0000355c TR1=000006ae
(R) RFL m=[DA] v=2 cout=00000000 RES=000002a9
(G) A_SR_SH4 ES [DA]
SetSeg REAL ES:01a1
eax=000002a9 ebx=00000024 ecx=00000000 edx=000002aa
esi=000c0000 edi=00000059 ebp=0000356e esp=0000355c
vf=000b7212 cs=a28c ds=a1d9 es=01a1
fs=0000 gs=0000 ss=8a73 flg=00030246
stk=0ace 8a73 356e 8a73 113a 8a73 1399 f6f4 0000 091e
Fetch 7a20b85a at 000a2c08 mode 3
000a2c08: 5a a28c:0348 pop dx
(G) O_POP1 [DA]
(R) DR1=000001a1 DR2=0000040a AR1=404ed43e AR2=404d5730
(R) SR1=0000355c TR1=000006ae
(R) RFL m=[DA] v=2 cout=00000000 RES=000002a9
(G) O_POP2 EDX [DA]
(V) 00000ace
(R) DR1=00000ace DR2=0000040a AR1=404ed43e AR2=404d5730
(R) SR1=0000355e TR1=000006ae
(R) RFL m=[DA] v=2 cout=00000000 RES=000002a9
(G) O_POP3 [DA]
(R) DR1=00000ace DR2=0000040a AR1=404ed43e AR2=404d5730
(R) SR1=0000355e TR1=000006ae
(R) RFL m=[DA] v=2 cout=00000000 RES=000002a9
eax=000002a9 ebx=00000024 ecx=00000000 edx=00000ace
esi=000c0000 edi=00000059 ebp=0000356e esp=0000355e
vf=000b7212 cs=a28c ds=a1d9 es=01a1
fs=0000 gs=0000 ss=8a73 flg=00030246
stk=8a73 356e 8a73 113a 8a73 1399 f6f4 0000 091e 051c
So we can see that at that point ES contains the return
address and DX got the subsequent stack word, so the state
is corrupted right here. The actual crash happens much
later, when a ret goes to a junk address.
So in this debugging session I was able to trace from the
actual crash back to the state corruption. But more debugging
is needed to find out why exactly it pops the return address by
mistake.
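The addresses in the trace can be cross-checked with plain real-mode arithmetic. The Python sketch below is only an aid for reading the log; the helper names linear and near_call_target are made up, and the numbers are taken from the "call 02B5" and "pop es" lines above.

```python
def linear(seg, off):
    """Real-mode linear address: segment * 16 + offset."""
    return (seg << 4) + off


def near_call_target(off, insn_len, disp16):
    """Return (return_offset, target_offset) for a near `call rel16` at `off`."""
    ret = (off + insn_len) & 0xFFFF
    return ret, (ret + disp16) & 0xFFFF


# a28c:019e holds e8 14 01, i.e. a 3-byte call with displacement 0x0114.
ret, target = near_call_target(0x019E, 3, 0x0114)
print(hex(linear(0xA28C, 0x019E)))   # address of the call in the log
print(hex(ret), hex(target))         # return offset and call target
print(hex(linear(0xA28C, target)))   # where "Jump taken to" should point
```

The return offset computed this way is 0x01a1, the same value that later shows up in ES after the pop es, which is why that pop is the point where the state goes wrong.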
from comcom64.
I distilled the ~500Gb of logs into smaller ones
that contain just the problematic function call
under freedos (good) and fdpp (bad).
They can be made even smaller by cutting off
at the pop es point:
logs_trim.tar.gz
One has esp=0000355c (bad) and the other esp=00003558 (good).
In the good log you can search for push:
000c5bac: 52 c58b:02fc push dx
(G) O_PUSH1 [DA]
(R) DR1=0000c41b DR2=0000080a AR1=41dd41a1 AR2=41d9a730
(R) SR1=0000355a TR1=00000001
(R) RFL m=[DA] v=2 cout=00000000 RES=00000000
(G) O_PUSH2 EDX [DA]
(V) 0000c41b
(R) DR1=0000c41b DR2=0000080a AR1=41dd41a1 AR2=41d9a730
(R) SR1=00003558 TR1=00000001
(R) RFL m=[DA] v=2 cout=00000000 RES=00000000
(G) O_PUSH2 ES [DA]
(V) 0000c41a
(R) DR1=0000c41a DR2=0000080a AR1=41dd41a1 AR2=41d9a730
(R) SR1=00003556 TR1=00000001
(R) RFL m=[DA] v=2 cout=00000000 RES=00000000
(G) O_PUSH3 [DA]
(R) DR1=0000c41a DR2=0000080a AR1=41dd41a1 AR2=41d9a730
(R) SR1=00003556 TR1=00000001
(R) RFL m=[DA] v=2 cout=00000000 RES=00000000
eax=00000a07 ebx=00000024 ecx=00000000 edx=0000c41b
esi=00000000 edi=00000059 ebp=0000356e esp=00003556
vf=00093246 cs=c58b ds=c4d8 es=c41a
fs=0c07 gs=c42b ss=8a73 flg=00030212
stk=c41a c41b 01a1 0ace 8a73 356e 8a73 0c17 8a73 1399
Which means that dx and es were pushed.
In the bad log there is no such part.
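The difference between the two runs can be replayed on the stk= words from the logs. This is a toy Python model, not emulator code: the stacks are the first four words of the stk= lines before the pop es in the bad log and after the push in the good log, and pop_es_dx is a made-up helper name.

```python
# Top-of-stack first, copied from the stk= lines in the two logs.
GOOD_STACK = [0xC41A, 0xC41B, 0x01A1, 0x0ACE]  # freedos: es and dx were pushed
BAD_STACK = [0x01A1, 0x0ACE, 0x8A73, 0x356E]   # fdpp: the push never happened


def pop_es_dx(stack):
    """Apply `pop es; pop dx` to a top-first stack; return (es, dx, rest)."""
    es, dx, *rest = stack
    return es, dx, rest


for name, stack in (("good", GOOD_STACK), ("bad", BAD_STACK)):
    es, dx, rest = pop_es_dx(stack)
    print(name, hex(es), hex(dx), "next word:", hex(rest[0]))
```

In the good case the two pops restore es and dx and leave the return offset 0x01a1 on top for the ret; in the bad case the same two pops consume 0x01a1 and 0x0ace, so the later ret takes whatever follows.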
from comcom64.
Made a disasm diff:
asm.diff.txt.gz
And the problematic hunk is:
02ee cmp dx,es:[0026]
02f3 jne 02C2 ($-33)
-02f5 cmp dx,es:[0001]
-02fa jne 02C2 ($-3a)
-02fc push dx
-02fe mov bx,es:[0003]
-0303 mov [06B2],bx
-0307 mov bx,es:[003C]
-030c test bx,bx
-030e je 0318 ($+8)
-0310 cmp word [06D2],031E
-0316 jnc 032E ($+16)
+02c2 add dx,es:[0003]
+02c7 mov es,dx
+02c9 inc dx
+02ca cmp byte es:[0000],4D
+02d0 je 02E5 ($+13)
+02e5 cmp word es:[0010],20CD
+02ec jne 02C2 ($-2c)
+02c2 add dx,es:[0003]
+02c7 mov es,dx
+02c9 inc dx
+02ca cmp byte es:[0000],4D
+02d0 je 02E5 ($+13)
+02e5 cmp word es:[0010],20CD
+02ec jne 02C2 ($-2c)
+02c2 add dx,es:[0003]
+02c7 mov es,dx
+02c9 inc dx
+02ca cmp byte es:[0000],4D
+02d0 je 02E5 ($+13)
+02e5 cmp word es:[0010],20CD
+02ec jne 02C2 ($-2c)
+02c2 add dx,es:[0003]
+02c7 mov es,dx
+02c9 inc dx
+02ca cmp byte es:[0000],4D
+02d0 je 02E5 ($+13)
+02d2 cmp byte cs:[0510],FF
+02d8 je 0325 ($+4b)
+0325 xor ax,ax
+0327 mov es,ax
+0329 mov dx,es:[00BA]
032e mov es,dx
0330 mov dx,es:[002C]
It searches for some signatures, seemingly a PSP (20CD),
and in the freedos case it finds whatever it needs. In the fdpp
case it finds nothing, and that triggers a crasher bug in
vlm itself: it is just not prepared to not find what it was
looking for.
The search procedure must be studied a bit more.
Full disasm traces, distilled:
disasms.tar.gz
They are quite small, 130 and 225 lines only.
I wonder if you can do the rest. :)
from comcom64.
Note that the push dx above is actually
push dx ; push es
because simx86 optimized them into a single push of
2 registers. I think it's kind of wrong to
do such an optimization at the interpreter
level (it could be done in the codegen backend
instead), but this is it.
from comcom64.
Ok, so it walks an mcb chain
searching for a self-owned PSP
(parent_psp==self), then it also
checks that mcb is owned by that
psp. I.e. it just searches for the
master psp of some process, but
finds nothing. So either fdpp produces
the corrupted psp, or the initial
search address is wrong.
Problem deciphered.
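The search described above can be written down as a short Python sketch over a flat real-mode memory image. It loosely mirrors the disassembled loop (MCB signature 4D at offset 0, 20CD at offset 10h, parent PSP at offset 26h, owner at offset 1); unlike the original code it also tests the starting block. The names word and find_self_owned_psp are made up for illustration.

```python
import struct


def word(mem, seg, off):
    """Read a little-endian 16-bit word at real-mode address seg:off."""
    return struct.unpack_from("<H", mem, (seg << 4) + off)[0]


def find_self_owned_psp(mem, first_mcb):
    """Walk the MCB chain from segment first_mcb and return the PSP segment
    of the first block that looks like a shell, or None if there is none."""
    mcb = first_mcb
    # Stop at the first block without the 'M' (0x4D) signature, or when the
    # header would lie outside the memory image.
    while (mcb << 4) + 0x28 <= len(mem) and mem[mcb << 4] == 0x4D:
        psp = mcb + 1                              # data starts one paragraph up
        if (word(mem, mcb, 0x10) == 0x20CD         # data begins with CD 20
                and word(mem, mcb, 0x26) == psp    # PSP parent field is itself
                and word(mem, mcb, 0x01) == psp):  # MCB owner is that same PSP
            return psp
        mcb = psp + word(mem, mcb, 0x03)           # next header: data start + size
    return None
```

A caller has to handle the None case; judging by the traces above, that is the path vlm is not prepared for when no shell-like block is found.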
from comcom64.
Mm, it is looking for a shell, because only
shell seems to have a self-owned PSP.
Why would it look for a shell...
from comcom64.
Damn, comcom32 bug...
from comcom64.
Thing is I'm seeing this with Freecom, perhaps there are two bugs?
from comcom64.
Very likely...
But this is all I could reproduce.
from comcom64.
Works very well for me now.
Please open another ticket if this is still a problem.
from comcom64.
And if you do, please make sure the MCB
of command.com is self-owned.
from comcom64.
I mean, with comcom32 it crashed for me
even under freedos. So this definitely had
to be fixed before anything else.
from comcom64.
Here's FreeCOM's MCB, you can see the owner is itself so I think that's all good
dosdebug> mcbs
dosdebug>
ADDR(LOW) PARAS OWNER
0291:0000 0x0536 [DOS]
=> ADDR PARAS TYPE USAGE
0292:0000 0x000c [F] Files
029f:0000 0x0008 [D] Driver (EMUFS)
02a8:0000 0x0009 [D] Driver (EMS)
02b2:0000 0x029e [B] Buffers
0551:0000 0x0154 [F] Files
06a6:0000 0x008f [L] CDS Array
0736:0000 0x0080 [S] Stacks
07b7:0000 0x0010 [B] Buffers
07c8:0000 ------ [LINK]
0871:0000 0x0006 [FREE]
0878:0000 0x00bc [COMMAND]
0935:0000 0x000f [04371]
0945:0000 0x0306 [LSL]
0c4c:0000 0x00b5 [PDETHER]
0d02:0000 0x040f [IPXODI]
1112:0000 0x015f [VLM]
1272:0000 0x77ef [04371]
8a62:0000 0x058c [04371]
8fef:0000 0x0fff [FREE]
9fef:0000 0x0010 [COMMAND] (END)
dosdebug> d 0878:0000
dosdebug>
0878:0000 4D 79 08 BC 00 00 00 00 43 4F 4D 4D 41 4E 44 00 My.<....COMMAND.
0878:0010 CD 20 06 1C 00 9A C0 00 00 00 0C 0A 89 08 25 F1 M ....@.......%q
0878:0020 00 F0 EF 0F D9 00 79 08 01 01 01 00 02 FF FF FF .po.Y.y......
0878:0030 FF FF FF FF FF FF FF FF FF FF FF FF F0 9F 7E 08 p.~.
0878:0040 89 08 14 00 18 00 79 08 00 00 60 00 00 00 00 00 ......y...`.....
0878:0050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0878:0060 CD 21 CB 00 00 00 00 00 00 00 00 00 00 20 20 20 M!K..........
0878:0070 20 20 20 20 20 20 20 20 00 00 00 00 00 20 20 20 .....
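For anyone who wants to double-check the "self-owned" reading of this dump, the relevant fields can be decoded mechanically. The Python sketch below uses only the first three rows of the hex column above and the usual MCB/PSP offsets (owner at 1, size at 3, name at 8, parent PSP at PSP+16h); decode_mcb is a made-up helper name.

```python
import struct

# Hex column of the first three rows of the "d 0878:0000" dump above.
ROWS = [
    "4D 79 08 BC 00 00 00 00 43 4F 4D 4D 41 4E 44 00",
    "CD 20 06 1C 00 9A C0 00 00 00 0C 0A 89 08 25 F1",
    "00 F0 EF 0F D9 00 79 08 01 01 01 00 02 FF FF FF",
]
DUMP = bytes.fromhex(" ".join(ROWS))


def decode_mcb(data, mcb_seg):
    """Decode an MCB header and the PSP fields that follow it."""
    sig, owner, paras = struct.unpack_from("<BHH", data, 0)
    psp = mcb_seg + 1
    parent = struct.unpack_from("<H", data, 0x10 + 0x16)[0]
    return {
        "signature": chr(sig),
        "owner": owner,
        "paras": paras,
        "name": data[8:16].rstrip(b"\0").decode("ascii"),
        "psp_signature": struct.unpack_from("<H", data, 0x10)[0],
        "parent_psp": parent,
        "self_owned": owner == psp and parent == psp,
    }


print(decode_mcb(DUMP, 0x0878))
```

Decoded this way, the owner and the parent PSP both come out as 0879, the segment right after the MCB at 0878, which agrees with the observation that FreeCOM's block is self-owned.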
I'm going to stop pursuing this as I've pretty much forgotten why I was trying this network client anyway. Unfortunately I use this machine for private stuff so I can't open it to the Internet, so let's leave it unresolved and perhaps a better test case will present itself sometime.
from comcom64.
Can anyone else still reproduce that, even
with the fixed comcom32?
from comcom64.
Built with -m32 and still no crash.
So this is not 32-bit specific.
Could you please upload a self-contained test-case?
from comcom64.
Here's what I have nwtest.tar.gz
1/ cd nwclient
2/ startnet
3/ Crash on VLM load of NETX.VLM
from comcom64.
Reproduced.
from comcom64.
So the crash happens because of your
dos=low,noumb, which is not in the default
config. I am shocked you haven't told me
that for over a week...
from comcom64.
The dos=low,noumb is an artefact from the experiment you asked me to do in #27 (comment). Before that I had no dos= line; would that equate to noumb?
from comcom64.
I had no dos=, would that equate to noumb?
Of course!
from comcom64.
Please, test with the default setup first,
otherwise this is getting ridiculously time-consuming
every now and then.
from comcom64.