VOGONS


Reply 81 of 110, by mkarcher

Rank l33t
douglar wrote on 2023-10-11, 02:08:

Great write up. Can I talk you into giving your version of the classic story about A20, himem, and the keyboard controller?

Yeah, but be prepared for some unpopular opinion mixed into it.

When Intel designed the 80286 processor, they didn't specifically target the IBM PC, and especially, Intel didn't target the IBM AT (which was not even designed while the 286 processor was designed). While the 80286 was designed, it wasn't expected that everyone would be running DOS the next 10 years. The 286 was designed for two use cases: One is the use as "Turbo 8086", which keeps the complete 8086 architecture and executes real-mode software. The other use case is for "advanced computers" using a multi-tasking operating system with a protected-mode kernel and suitable userspace software. The use case of a "hybrid computer" that is meant for both real-mode DOS software and modern protected mode multi-tasking software was obviously not envisioned by Intel.

Let's start with a small excursion into arithmetic logic unit design: Most computers use a number representation for negative integers called "2's complement". Instead of showing binary or hexadecimal numbers, I'm going to explain that concept using decimal numbers. Imagine you can only handle numbers with 3 digits. In that case, you can obviously calculate something like 015 + 033 = 048, or 333 + 456 = 789. But it's also possible to get a "carry" into the fourth digit (which doesn't exist), e.g. in 710 + 611 = 321 (+1000 as carry). For normal arithmetic calculations, the processor keeps a carry flag, and if you want to deal with numbers exceeding 16 bits (in computer technology) or 3 digits (in my example), you need to process the carry flag. The processor has an add-with-carry instruction for that. For example, if you add 000'710 and 000'611, you first calculate 710+611 to obtain 321, and then 000 + 000 + carry to obtain 001, resulting in a total result of 001'321. The most important observation is that generating a carry is not an error, or something that must be avoided, but normal operation. You can use that to your advantage: Observe how adding 999 has the same effect as subtracting 1! And that's actually how negative numbers are implemented in computers: If we deal with "signed numbers", we consider 000..499 positive and 500..999 negative, representing the numbers -500 to -1. On the other hand, when we deal with "unsigned numbers", we treat 999 as 999. How we treat 999 does not matter for how the addition is performed. The interpretation of 999 as -1 is equally valid as the interpretation as 999, and it depends on context which interpretation is "correct".
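
To make the 3-digit example above concrete, here is a minimal Python sketch of it. The names `add3` and `as_signed` are made up for this illustration; they only model the decimal toy machine from this post, not any real processor instruction.

```python
DIGITS = 3
MODULUS = 10 ** DIGITS  # 1000: there is no fourth digit, so results wrap here


def add3(a, b, carry_in=0):
    """Add two 3-digit numbers; return (3-digit result, carry out)."""
    total = a + b + carry_in
    return total % MODULUS, total // MODULUS


def as_signed(value):
    """Read a 3-digit pattern as signed: 000..499 stay positive, 500..999 mean -500..-1."""
    return value if value < MODULUS // 2 else value - MODULUS


# 000'710 + 000'611, done in two 3-digit steps like an add / add-with-carry pair
low, carry = add3(710, 611)      # low part, carry into the next step
high, _ = add3(0, 0, carry)      # high part consumes the carry
print(high, low)                 # 1 321

# Adding 999 has the same effect as subtracting 1
print(add3(42, 999)[0])          # 41
print(as_signed(999))            # -1
```

The same addition routine serves both interpretations; only `as_signed` (the reader of the result) decides whether 999 means 999 or -1.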

Now, back to the 8086 and its addressing: To make this discussion easier to follow, instead of segments and offsets being binary 16 bit numbers, and the segment being multiplied by 16, let's examine what happens in a similar architecture that uses 3-digit decimal numbers for segments and offsets, and the segment is multiplied by ten, with the full addressable memory range being 10'000 bytes, numbered from 0000 to 9999. Obviously, using a segment value of 000, we can address 0000..0999 using the offsets 000..999. A segment value of 100 provides a base of 1000 (as segment values are multiplied by 10, just by appending a zero to the segment value), so offsets 000..999 now reach addresses 1000..1999. If segment zero only contains 500 elements, the next segment could already start at segment value 050, which provides a base address of 0500. The physical addresses reachable from this base address are 0500..1499. In this kind of addressing, offset 999 is obviously no longer equivalent to -1, because treating the offset as -1 with a base of 0500 would target 0499, not 1499. This happens because the result has 4 digits, but the offset only has 3 digits. The equivalence of -1 and 999 only exists for 3-digit results, not for four-digit results. But now let's look at the segment base addresses: the highest segment we can choose is segment 999, which starts at 9990. The offsets 0..9 will reach into physical memory addresses 9990..9999 (the last 10 bytes of this hypothetical machine that can address 10'000 bytes). On the other hand, offsets 10..999 will yield addresses 0000..0989, because there is no fifth digit that the carry of the addition could go into. So you can interpret segment "999" as a segment near the end of the memory range, which only has 10 elements till the memory ends. At the same time, this is also segment -1, with a base address of -10, and permissible offsets inside this segment starting at 10. 
This way of handling "wrap-around", allowing negative segment numbers to exist (if you supply a high enough offset value), is a fundamental property of a segmented four-digit decimal addressing architecture. It is not a bug or a quirk, it just works as designed. Of course, in the 8086, the highest segment register value is not 999, but 0FFFFh (a hexadecimal number), being the same thing as 65535 (a decimal number), and this does not point 10 bytes, but 16 bytes before the end of the 1MB address space, but the idea is exactly the same: For the first 16 bytes inside segment 65535, you get the last 16 bytes of address space (an IBM PC has the end of the BIOS ROM there), and for the remaining 65520 bytes in that segment, you get to see the start of the address space, so the segment number is treated as -1 in this context. This happens because the highest address the 8086 can send to the mainboard has the hexadecimal value 0F'FFFFh, the next number would be 10'0000h, but this number is not representable by the 20 address lines (called A0..A19), so it wraps to physical address 0'0000h.
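
The address calculation described above can be sketched in a few lines of Python. The function names are hypothetical and only serve this illustration: one models the decimal toy machine, the other the 8086 with its 20 address lines.

```python
def physical_decimal_toy(segment, offset):
    """Toy machine from this post: segment * 10 + offset, truncated to 4 decimal digits."""
    return (segment * 10 + offset) % 10_000


def physical_8086(segment, offset):
    """8086: segment * 16 + offset, truncated to the 20 address lines A0..A19."""
    return ((segment << 4) + offset) & 0xFFFFF


# Toy machine, segment 999: ten bytes at the top, then wrap to the bottom
print(physical_decimal_toy(999, 9))          # 9999
print(physical_decimal_toy(999, 10))         # 0
print(physical_decimal_toy(999, 999))        # 989

# 8086, segment 0FFFFh: sixteen bytes at the top, then wrap to the bottom
print(hex(physical_8086(0xFFFF, 0x000F)))    # 0xfffff
print(hex(physical_8086(0xFFFF, 0x0010)))    # 0x0
print(hex(physical_8086(0xFFFF, 0xFFFF)))    # 0xffef
```

In both cases the wrap-around is nothing more than the dropped carry of the addition, exactly like the missing fourth digit in the arithmetic example.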

As the 80286 can address 16MB, it now has 4 extra address lines called A20..A23, allowing addresses up to 0FF'FFFFh. This means there is no intrinsic overflow at the end of the 1MB address space anymore, and treating segment 0FFFFh as -1 no longer works if the main board actually cares about A20..A23. And that's where the actual use of the 286 processor did not match the design expectations. Intel expected that computers using the 80286 as "turbo-8086" would not care about A20..A23 and would provide the compatible 1MB address space, while modern computers with modern operating systems would work in the protected virtual-address mode anyway, in which calculating with physical addresses at the application level makes no sense anymore. Intel did not expect a system that mainly operates in real-address mode, but has more than 1MB of address space, but that is exactly what the IBM AT does! So using a processor outside the use-case planned for it requires some creativity by the system designer - and IBM got sufficiently creative here. Their AT can either execute DOS and ignore the A20 address line, providing the address wraparound that is part of the 8086 system architecture, or it can be switched to protected mode using a BIOS call, which will switch the computer (not just the processor) into a different operation mode that honors the A20 line and switches the processor into the protected virtual-address mode. The idea by Intel was that an 80286 based computer either works in real mode all the time, or it just does some preliminary set-up in real mode, and then switches into protected mode for good. 
Intel could have made the choice between real mode and protected mode influenced by a pin on the processor instead of making protected mode a software-enabled option, but as the protected mode requires some data structures to work at all, most prominently the global descriptor table containing the kernel segment descriptors, having the initialization of the protected mode data structures run in real mode and do a controlled switch when this is done seemed the more viable option.

The keyboard interfaces of the XT and the AT are completely different. The keyboard interface of the XT consists of a synchronous serial receiver (no transmitter at all!) that receives a single byte from the keyboard and then triggers an interrupt; the CPU then reads the received byte from a discrete TTL logic shift register chip through a parallel interface chip, the Intel 8255 (a chip that connects 3 8-bit "ports" to a data bus). The XT keyboard interface had no intelligence at all. The most observable consequence is that the CPU can not tell the keyboard to turn on/turn off caps lock, num lock or scroll lock LEDs. This is no issue for the standard PC and XT keyboard - because they don't have any LEDs! With the AT, the keyboard port was extended to be a bi-directional communication port, and interfacing that port could no longer be built from discrete TTL components, so handling this bi-directional synchronous serial interface was implemented in software in an Intel 8042 microcontroller (which is a variant of the 8048 microcontroller). This controller is called the "keyboard controller". While the primary purpose of this 8042 microcontroller was to handle communication with the keyboard, there were some spare pins on it that could be used for different purposes. And that's how IBM decided to implement the mainboard mode switch between XT-compatible 1MB addressing and new-fangled 16MB addressing: One pin on the keyboard controller outputs a signal that is fed into an AND gate to mask the address line A20. This AND gate is called the "A20 gate", and sometimes, the control signal from the keyboard controller to this gate is also called "A20 gate". That's how the keyboard controller is involved in addressing memory on the IBM AT.
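
The effect of that AND gate can be modelled in a few lines of Python. This is only a sketch of the address math under the description above; `bus_address` and `real_mode_linear` are hypothetical names, and nothing here talks to a real keyboard controller.

```python
A20_BIT = 1 << 20  # the address line that carries the "1MB" bit


def bus_address(cpu_address, a20_enabled):
    """Address seen by the mainboard: A20 is ANDed with the gate control signal."""
    if a20_enabled:
        return cpu_address
    return cpu_address & ~A20_BIT  # gate closed: A20 is forced to 0


def real_mode_linear(segment, offset):
    """286 real mode: segment * 16 + offset, no truncation to 20 bits by the CPU."""
    return (segment << 4) + offset


ffff_0010 = real_mode_linear(0xFFFF, 0x0010)
print(hex(bus_address(ffff_0010, a20_enabled=False)))  # 0x0      (XT-compatible wrap)
print(hex(bus_address(ffff_0010, a20_enabled=True)))   # 0x100000 (first byte above 1MB)

# The gate masks only this one line, it does not truncate to 20 bits:
print(hex(bus_address(0x300000, a20_enabled=False)))   # 0x200000
```

For real-mode code the difference does not matter, because segment * 16 + offset never exceeds 10'FFEFh, so clearing A20 alone is enough to reproduce the 8086 wrap-around.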

HIMEM puts a new twist on the game: HIMEM puts the mainboard into "A20 unmasked" mode most of the time, but keeps the processor in real mode most of the time. So suddenly, address 10h in segment 0FFFFh no longer wraps down to address 0, but lets you peek into the first 64K of extended memory. "Extended memory" is the name for memory with a physical address of 1MB or higher, which should not be visible to 8086-compatible code. It could have been quite an easy hack - if there hadn't been a notable amount of real-mode software actually using negative segment numbers! This was not a bug, but a sensible design choice: When software proceeds forwards through a memory range that might exceed 64KB, it makes sense to "re-normalize addresses" from time to time by increasing the segment number by some amount to minimize the offset, so that the maximum possible number of bytes forward from the current address is visible without changing the segment address. This pattern is completely unproblematic. On the other hand, when you proceed backward through memory, the sensible choice is not to minimize the offset value, but to maximize it, by minimizing the segment number. If an algorithm like this got down to re-normalizing the segment/offset combination at a physical address of 32KB, it would choose an offset of just under 64KB (the maximum representable offset), and a segment with a base address of roughly negative 32KB (which would be about segment number -2048), and voilà, there is your negative segment number that works perfectly on an 8086 IBM PC, but will fail to work on an IBM AT with the A20 gate opened, as that offset in that segment would no longer point to 32KB, but to 1MB + 32KB, so it would point into the start of extended memory. Most notable is the Microsoft Linker used to link all the EXE utilities shipped with MS-DOS. 
It had the option to apply a very primitive EXE compression using run-length encoding in release builds (enabled by the command line switch /EXEPACK, thus this scheme got named "EXEPACK"). As expansion of that compression scheme causes the output to be bigger than the input, Microsoft chose the sensible thing to do: It processes the data backwards, so it can operate in-place. EXE files are not limited to 64KB, so this is a text-book example of an algorithm processing a data block that might exceed 64KB backwards, so it uses pointers normalized for maximal offset. So HIMEM and DOS worked together to smartly control the A20 gate (and on the IBM AT and compatible machines, this means they interfaced with the keyboard controller). When you load HIMEM and enable "DOS=HIGH" in CONFIG.SYS, DOS moves parts of the DOS kernel data structures into the first 64KB of extended memory (called the "high memory area"), and keeps the A20 gate open most of the time. But whenever an EXE file is started, the A20 gate gets closed (because that EXE file might be packed using EXEPACK), and it stays closed (and the addressing is now PC/XT compatible) until a system call into the DOS kernel is made, which re-opens the A20 gate. It is known that the EXEPACK unpacking code, which is executed first in EXEPACK-packed executables, does not perform any DOS call by itself, so when a system call happens, either the unpacking failed ("packed file is corrupt"), or it is finished and the actual application has started running. So in fact, when I wrote "a notable amount of real-mode software actually using negative segment numbers", I didn't actually write about different software vendors writing their own algorithms, but about a notable amount of DOS software linked with the Microsoft Linker with /EXEPACK enabled.
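
The two normalization styles can be sketched in Python as follows. This is my own illustration of the idea, not the actual EXEPACK unpacker code; all function names are hypothetical. Note that with 16-bit offsets and 16-byte segment steps, the round figures used above (offset 64KB, segment -2048) come out as offset 0FFF0h and segment -2047 (stored as 0F801h).

```python
def normalize_min_offset(physical):
    """Forward walking: biggest segment, offset 0..15, maximum room ahead."""
    return physical >> 4, physical & 0xF


def normalize_max_offset(physical):
    """Backward walking: offset 0FFF0h..0FFFFh, smallest segment (may go 'negative')."""
    offset = 0xFFF0 | (physical & 0xF)
    segment = ((physical - offset) >> 4) & 0xFFFF  # negative values wrap in 16 bits
    return segment, offset


def resolve(segment, offset, a20_enabled):
    """Physical address for a real-mode pointer, with A20 masked or passed through."""
    linear = (segment << 4) + offset
    return linear if a20_enabled else linear & ~(1 << 20)


seg, off = normalize_max_offset(0x8000)                # pointer to physical 32KB
print(hex(seg), hex(off))                              # 0xf801 0xfff0
print(hex(resolve(seg, off, a20_enabled=False)))       # 0x8000   (A20 masked: works)
print(hex(resolve(seg, off, a20_enabled=True)))        # 0x108000 (1MB + 32KB: broken)
```

The forward-normalized form never has this problem, because its segment value is always a genuine small positive number.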

OK, so HIMEM manages the A20 line, but that's not the only purpose of HIMEM. HIMEM also manages allocation of the extended memory, and handles data copying into and out of the extended memory. HIMEM provides a software interface conforming to the "Extended Memory Specification" (XMS), which keeps track of whether some software uses the HMA (which can not be allocated partially to different XMS API users), and of how much extended memory is lent to real mode by being used as HMA. The remaining extended memory can be allocated and freed as "Extended memory blocks" by different applications like RAMDRIVE.SYS or SMARTDRV.SYS, or by application programs. The only common use of the HMA was allocating the HMA to the DOS kernel using DOS=HIGH, but it would have been possible to not use DOS=HIGH, and have some other application make use of the HMA. As most users used DOS=HIGH anyway, I don't know of any applications making use of the HMA on their own. The XMS API also provides function calls to allocate and free UMBs, but this part of the XMS API is not implemented by HIMEM.SYS and thus not in scope of this post.

So I already wrote a very long post, but a lot of readers are waiting for the most well known elephant in the room to be addressed: As Intel intended the real-address mode to only be an intermediate set-up for software that is going to switch to protected mode, Intel skipped the logic to re-initialize the processor back to normal real-mode operation. There is no way to leave protected mode without completely re-initializing the 80286 processor, by resetting the processor using the processor reset pin. Issuing a processor reset does not mean issuing a system reset, though. On the IBM AT, the 80286 could be reset without any other parts of the system receiving a reset signal. This dedicated CPU reset signal to leave the protected mode was again generated by the 8042 keyboard controller, by giving it a special "pulse output pins" command. As there is no publicly documented way to access memory beyond 1024KB + 64KB (-16 bytes) without entering protected mode, the standard IBM BIOS call to "copy data between any addresses in conventional or extended memory" indeed switched the system to protected mode, enabled the A20 gate, performed the copy, and issued a processor reset afterwards. The BIOS initialization code then checks the status bits of the keyboard controller: If they indicate that it has not seen a system reset since the last complete POST, the BIOS knows that no one pushed the reset button (which would cause a system reset), but that this is a dedicated processor-only reset. In that case, the next step of the AT BIOS is to check the CMOS RAM, byte 15. This is called the "shutdown status" byte. This byte indicates how the computer is supposed to resume operation after a reset. Some codes are internal to the IBM BIOS, while others are officially documented. They mostly work by doing a minimal system reinitialization followed by a jump to an address specified in the BIOS data area, so that the application that invoked the reset can then continue in real mode.
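
The decision the BIOS makes at its reset entry point can be sketched as a small Python model. This is a simplified illustration, not IBM's BIOS code: the function and parameter names are hypothetical, and the concrete shutdown code 0Ah ("resume via the far pointer in the BIOS data area") is taken from commonly circulated CMOS documentation rather than from this post, so treat it as an assumption.

```python
SHUTDOWN_STATUS = 0x0F      # CMOS RAM byte 15, the "shutdown status" byte
RESUME_VIA_VECTOR = 0x0A    # assumption: code for "jump to the saved real-mode address"


def bios_reset_entry(post_completed_flag, cmos, resume_vector):
    """Model of the choice made after any CPU reset.

    post_completed_flag: keyboard controller status says no system reset
                         happened since the last complete POST.
    cmos:                dict modelling CMOS RAM (index -> byte).
    resume_vector:       (segment, offset) stored by the caller before the reset.
    """
    if not post_completed_flag:
        return ("full POST", None)              # power-on or reset button
    code = cmos.get(SHUTDOWN_STATUS, 0)
    if code == RESUME_VIA_VECTOR:
        return ("resume in real mode", resume_vector)
    return ("other shutdown code", code)        # BIOS-internal or unmodelled codes


# A program leaving protected mode: store the code and resume address, then reset the CPU
cmos = {SHUTDOWN_STATUS: RESUME_VIA_VECTOR}
print(bios_reset_entry(True, cmos, (0x1234, 0x0100)))
print(bios_reset_entry(False, cmos, (0x1234, 0x0100)))
```

The point of the model is only the ordering: keyboard controller status first, shutdown status byte second, and a full POST whenever the first check says this was a real system reset.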

The 286-12 with an AMI BIOS that our family had when I was young required around 1ms to switch from protected mode to real mode (as measured by a performance check utility supplied with a DOS extender), which is quite slow (12'000 clock cycles). It would be nice to be able to access extended memory without paying this cost every time after accessing it. And that's where the black magic comes in: There is an undocumented 286 opcode, known today under the name "286 LOADALL". This opcode was not intended for application or operating system use, but for in-system emulators. An in-system emulator is a hardware debugger plugged between the processor and the processor socket of a system, that is able to interrupt normal execution and investigate processor and system state, then continue execution. This works by having dedicated emulator ROM and RAM on the in-system emulator, and switching the processor from executing application code into executing debugger code. When the debugger was done and system code execution was meant to continue, the processor needed to be re-initialized to the state it had when it was interrupted. This in-system emulator has to work with real mode as well as with protected mode target code. Interrupting and resuming protected-mode code is not completely straightforward, because segment registers in protected mode contain segment numbers, not the actual properties of the segment, and when a segment register is loaded, that number is used to look up the segment descriptor, and the contents of the segment descriptor are put into an invisible shadow register of the processor called the "descriptor cache". If the descriptor is changed afterwards, the processor keeps using the old cached descriptor. For this to work even with breaks into the debugger code of an in-system emulator, the instruction LOADALL had to be able to directly load the segment descriptors, independent of the current descriptor table contents. 
As became known around 1988, the LOADALL instruction could not just be used to exit from the debugger code of an in-system emulator, but it could also be called from real-mode code. Now that's a game changer! This allows real-mode code to load any base address into the segment descriptor cache, even past the 1MB barrier, without entering protected mode, and thus also without the need to leave protected mode. 286 LOADALL is quite inconvenient, because it loads the register contents from a fixed memory address (that was supposed to be in emulator RAM, and in that context, the specific address surely makes sense), but this address is somewhere inside the memory allocated for the DOS kernel. MS-DOS 5.0 or newer reserved this space for HIMEM usage, so HIMEM can use the LOADALL approach instead of the switch-to-protected-mode approach to implement the "XMS copy" function.
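
Why LOADALL is such a game changer becomes clearer with a toy model of the descriptor cache. The following Python sketch is a simplification for illustration only: the class and function names are hypothetical, and it does not reproduce the real LOADALL memory image or any access-rights handling.

```python
from dataclasses import dataclass


@dataclass
class SegmentCache:
    """Hidden per-segment-register state the CPU actually uses for addressing."""
    base: int = 0
    limit: int = 0xFFFF


def load_segment_real_mode(cache, value):
    """Ordinary real-mode segment load: the base can only be value * 16."""
    cache.base = (value & 0xFFFF) << 4


def loadall(cache, base):
    """LOADALL-style load: the cached base is set directly (24-bit on the 286)."""
    cache.base = base & 0xFFFFFF


def access(cache, offset):
    """Address generated for a memory access through this segment register."""
    return cache.base + (offset & 0xFFFF)


ds = SegmentCache()

# The furthest an ordinary real-mode load can reach: 1MB + 64KB - 16 bytes
load_segment_real_mode(ds, 0xFFFF)
print(hex(access(ds, 0xFFFF)))   # 0x10ffef

# With a directly loaded base, real-mode code reaches anywhere in the 16MB space
loadall(ds, 0x200000)
print(hex(access(ds, 0x0000)))   # 0x200000
```

As long as the segment register is not reloaded in the ordinary way, every access through it keeps using the cached base, which is exactly what a copy routine needs.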

So, we can get rid of switching to protected mode to copy data from/to/between extended memory. But we can't get rid of masking A20 every time an executable packed by EXEPACK starts (which involves the keyboard controller on AT systems). And we can't get rid of resetting the 286 to leave protected mode once protected mode was actually entered. While HIMEM could skip entering protected mode, Windows 3.0 (and 3.1) running in standard mode could not, because the point of standard mode is to execute 286 code in protected mode. Yet, Windows 3.x needed to switch to real mode a lot: Every time a system service provided by DOS or the BIOS was called, the processor had to be switched back to real mode. But we can get rid of having to kindly ask the keyboard controller to reset the 286 processor and then wait until the 8042 gets around to handling the command, because most 286 mainboards provide an internal shortcut to generate a CPU reset: Whenever the 286 in protected mode gets so lost that it has no recovery procedure available (because something, like an invalid memory access, happened that needed to be reported to the OS (this is a fault), but while trying to report that invalid memory access to the OS, this turned out to again violate protection rules (now it's a double fault) and reporting this condition to the backup recovery handler also resulted in a protection violation (a triple fault)) it just gives up executing code, and it notifies the main board by issuing a special bus cycle, called the "shutdown cycle". Many main boards identify a shutdown cycle and generate a processor reset as recovery procedure. If this works, you can eliminate the keyboard controller from the path back from protected mode to real mode, yet you can't eliminate the reset itself.

Reply 82 of 110, by douglar

Rank Oldbie
mkarcher wrote on 2023-10-11, 18:54:

The BIOS initialization code then checks the status bits of the keyboard controller: If they indicate that it has not seen a system reset since the last complete POST, the BIOS knows that no one pushed the reset button (which would cause a system reset), but that this is a dedicated processor-only reset. In that case, the next step of the AT BIOS is to check the CMOS RAM, byte 15. This is called the "shutdown status" byte. This byte indicates how the computer is supposed to resume operation after a reset. Some codes are internal to the IBM BIOS, while others are officially documented. They mostly work by doing a minimal system reinitialization followed by a jump to an address specified in the BIOS data area, so that the application that invoked the reset can then continue in real mode.

Thanks for the great write up !!

Reply 83 of 110, by digger

Rank Oldbie

Indeed, thanks for the detailed information. It makes it all the more painful when you compare this to the elegance of the 68000 architecture, which did not require memory segmentation, was designed with forward-compatibility with 32-bit processors in mind, and yet came out years before the 80286.

It has been mentioned many times before, but it really makes one wonder what could have been, if only IBM had decided to base the original IBM PC on the 68000 instead of the 8088. So many nasty kludges, like the aforementioned A20 gate or having to switch back and forth between real mode and protected mode with a huge performance penalty, would not have been necessary, the platform would have been able to evolve much more naturally, and the transition into the 32-bit era would have gone way more smoothly.

Reply 84 of 110, by jakethompson1

Rank Oldbie
mkarcher wrote on 2023-10-11, 18:54:

OK, so HIMEM manages the A20 line, but that's not the only purpose of HIMEM. HIMEM also manages allocation of the extended memory, and handles data copying into and out of the extended memory. HIMEM provides a software interface conforming to the "Extended Memory Specification" (XMS), which keeps track of whether some software uses the HMA (which can not be allocated partially to different XMS API users), and of how much extended memory is lent to real mode by being used as HMA. The remaining extended memory can be allocated and freed as "Extended memory blocks" by different applications like RAMDRIVE.SYS or SMARTDRV.SYS, or by application programs. The only common use of the HMA was allocating the HMA to the DOS kernel using DOS=HIGH, but it would have been possible to not use DOS=HIGH, and have some other application make use of the HMA. As most users used DOS=HIGH anyway, I don't know of any applications making use of the HMA on their own. The XMS API also provides function calls to allocate and free UMBs, but this part of the XMS API is not implemented by HIMEM.SYS and thus not in scope of this post.

I believe Windows 2.x was the original target for the acquire and release HMA functionality. DOS=HIGH did not arrive until MS-DOS 5.0 several years later, after which it would have become a vestigial feature.

Reply 85 of 110, by Jo22

Rank l33t++
digger wrote on 2023-10-11, 21:40:

[..] It makes it all the more painful when you compare this to the elegance of the 68000 architecture, which did not require memory segmentation, was designed with forward-compatibility with 32-bit processors in mind, and yet came out years before the 80286.[..]

Interesting thoughts! 🙂

I often saw the 68010 more like an equivalent to an 80286, due to its optional MMU, the 68451.
The MC68010 also could do virtual memory and had a little buffer built-in, which was nice for repeating operations (loops).

The basic 68000 is more like an 80186 (CPU core only) in terms of features, maybe.
The 8086/8088 are a bit too simple to fit into the comparison, they're more in 6800 or 6502 territory (their larger 1 MB address space excepted).
The NEC V30 and up are a bit closer, maybe. Later models had integrated MMUs/24-Bit address space etc.

Personally, I often still wonder why the plain 68000 kept being used in home computers and consoles for so long,
considering that the 68010 was a worthy successor that's pin-compatible and slightly more elegant.

It also had kind of support for privileges, so that user/kernel mode code could be differentiated.
That was one compatibility issue that needed minor patching of application software, maybe.

But especially the optional MMU was nice, concept wise.
A modified Amiga OS could have taken advantage of it, maybe, considering the feature creep that the Amiga platform already had in terms of countless specialized support chips.

Hm. Maybe that was because of the clone chip market, not sure. 🤷‍♂️
By that time, the original 68000 was heavily being cloned/manufactured by other companies under license.
Being a 1970s technology, it maybe was easier to mass produce, with lower reject rates etc.

digger wrote on 2023-10-11, 21:40:

It has been mentioned many times before, but it really makes one wonder what could have been, if only IBM had decided to base the original IBM PC on the 68000 instead of the 8088. So many nasty kludges, like the aforementioned A20 gate or having to switch back and forth between real mode and protected mode with a huge performance penalty, would not have been necessary, the platform would have been able to evolve much more naturally, and the transition into the 32-bit era would have gone way more smoothly.

You're lucky. Some individual had converted an IBM PC to M68k not long ago. 🙂

https://hackaday.io/project/190838-ibm-pc-808 … -motorola-68000

https://hackaday.com/2023/05/02/ibm-pc-runs-b … 00-cpu-upgrade/

What I've always found most interesting of the PC platform was how it endured.
It's literally a hacky interim solution that outgrew itself.
Though not pretty, it became a platform with an actual heritage and family tree, sort of.
I think that's what made it human, in some way or another. It evolved over time.

Edit: I just remembered, the original Macintosh platform eventually had issues with the Motorola 680x0, too.
It was about 32-Bit addressing. Older applications were written with 24-Bit address space in mind and not 32-Bit ready.
So there was an A20 Gate like problem, too.
The solution was a product named "Mode32".

More information:

https://lowendmac.com/2015/32-bit-addressing-on-older-macs/

https://macgui.com/news/article.php?t=527

Edited.

"Time, it seems, doesn't flow. For some it's fast, for some it's slow.
In what to one race is no time at all, another race can rise and fall..." - The Minstrel


Reply 86 of 110, by Grzyb

Rank l33t
digger wrote on 2023-10-11, 21:40:

It has been mentioned many times before, but it really makes one wonder what could have been, if only IBM had decided to base the original IBM PC on the 68000 instead of the 8088. So many nasty kludges, like the aforementioned A20 gate or having to switch back and forth between real mode and protected mode with a huge performance penalty, would not have been necessary, the platform would have been able to evolve much more naturally, and the transition into the 32-bit era would have gone way more smoothly.

In 1981, Motorola wasn't yet ready for mass production of the 68000, and the price was prohibitive.
So, IBM would have to postpone the PC release until 1984 or so.

There were plenty of problems with the 68k series as well, e.g.:
- no FPU option before the 020
- the FPU in 040 wasn't fully functional compared to the earlier discrete FPUs
- the MMU in 030 was incompatible with earlier discrete MMUs
- the MMU in 040 was incompatible with 030

Zaglądali do kufrów, zaglądali do waliz, nie zajrzeli do dupy - tam miałem klimatyzm.

Reply 87 of 110, by Jo22

Rank l33t++
Grzyb wrote on 2023-10-12, 05:14:

In 1981, Motorola wasn't yet ready for mass production of the 68000, and the price was prohibitive.
So, IBM would have to postpone the PC release until 1984 or so. [..]

I suppose we can be grateful that the TI TMS9900 wasn't among their list?
An IBM-branded TI99/4A wouldn't have been a very pretty sight.

Edit: https://spectrum.ieee.org/the-inside-story-of … -microprocessor

Edit: In hindsight, the IBM PC Model 5150 wasn't yet a real Personal Computer as we know it. If we use the long-lived AT architecture as a reference, I mean.

With CGA graphics, 16 or 64 KB of RAM, ROM BASIC, the cassette interface and lack of an HDD it was specced more like a home computer, still.
Or in MDA configuration, the equivalent to an electronic typewriter with a disk drive (similar to the Wang and Olivetti models).

The IBM PC/XT Model 5160 did improve on this, but it wasn't until the IBM PC/AT Model 5170 that the platform had matured (Tandy 2000 excluded, since it wasn't by IBM).

It was a full 16-Bit architecture from the start, supported HDDs, had a Real-Time clock w/ a CMOS RAM and introduced EGA,
the first IBM graphics standard that wasn't being laughed at (Hercules was okay and widely accepted, but not by IBM).

The performance was good enough to do a bit of multi-tasking or multi-program operation.
Utilities like TopView or DesqView ran more bearably on an 80286 processor than on an 8088.
If you did actual business work in your office back in the 80s, this had affected the workflow.

Imagine being expected to finish work on time, but that darn 4,77 MHz PC didn't catch up with you.
Replacing and upgrading it was less costly than not being able to be competitive.

I know this sounds strange, but the impression of the performance of the original IBM PC being poor isn't new.

I've read magazine articles of the mid-1980s that said that someone could just make a cup of coffee and then come back to see if the PC has finished.

That's where these many CPU accelerator boards come into play, I suppose.
They upgraded existing 4,77 MHz 8088 PCs to another performance level.

With the IBM PC/AT and the 80286 CPU, things slowly changed for the better.
The 80386 was still being wished for, of course, but a PC/AT or clone at 8 or 10 MHz was still fine for most use cases
(the 10 MHz type was the fastest common type in ca. 1988; 12 MHz wasn't so widely available yet).


Reply 88 of 110, by GigAHerZ

Rank Oldbie
Jo22 wrote on 2023-10-12, 08:28:
Grzyb wrote on 2023-10-12, 05:14:

In 1981, Motorola wasn't yet ready for mass production of the 68000, and the price was prohibitive.
So, IBM would have to postpone the PC release until 1984 or so. [..]

I suppose we can be grateful that the TI TMS9900 wasn't among their list?
An IBM-branded TI99/4A wouldn't have been a very pretty sight.

Why not? Maybe we would have gotten native hardware-accelerated multi-tasking from day 0?
The TI99 was a disaster because it had so little RAM that every program eventually stored its data in video RAM, over the slow 8-bit bus connecting to the video chip.
TMS9900 is a fascinating chip.

What am I missing here?

"640K ought to be enough for anybody." - And i intend to get every last bit out of it even after loading every damn driver!

Reply 89 of 110, by mkarcher

User metadata
Rank l33t
Rank
l33t
Jo22 wrote on 2023-10-12, 08:28:

With CGA graphics, 16 or 64 KB of RAM, ROM BASIC, the cassette interface and lack of an HDD it was specced more like a home computer, still.
Or in MDA configuration, the equivalent to an electronic typewriter with a disk drive (similar to the Wang and Olivetti models).

As I heard the story, the intended base configuration of 16KB without any floppy drive never hit the market, because IBM realized that they couldn't compete with home computers anyway, so they went directly to the more professionally equipped version with a floppy drive. As the IBM BIOS design is to load the boot sector of the floppy to physical address 31K, an IBM PC that can boot from floppy requires at least 32KB of RAM. So that's the base model they actually sold. Rumor has it that IBM employees were able to get the 16KB no-floppy configuration internally, though. The original 5150 mainboard wasn't limited to 16KB or 64KB: it took anything from 16KB to 64KB, so 32KB and 48KB were supported as well.
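To make the 31K / 32KB arithmetic concrete, here is a minimal Python sketch. It assumes the usual boot sector load address of 0x7C00 (which is exactly 31K), a 512-byte boot sector, and RAM installed in 16KB steps as on the 16KB-64KB board; the function name is only for illustration.

```python
KB = 1024
BOOT_LOAD_ADDR = 0x7C00    # physical address the BIOS loads the boot sector to (31K)
BOOT_SECTOR_SIZE = 512     # assumed size of one floppy boot sector in bytes


def min_ram_kb(load_addr=BOOT_LOAD_ADDR, size=BOOT_SECTOR_SIZE, step_kb=16):
    """Smallest RAM size, in multiples of step_kb, that holds the whole boot sector."""
    end = load_addr + size                 # first byte past the boot sector
    steps = -(-end // (step_kb * KB))      # ceiling division
    return steps * step_kb


print(BOOT_LOAD_ADDR // KB)   # 31
print(min_ram_kb())           # 32
```

The boot sector ends at 0x7E00, which is past the 16KB mark but still below 32KB, so a 16KB machine cannot hold it and 32KB is the smallest configuration that can.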

Nevertheless, you are right that the design of the IBM PC is basically a "home computer specced up a little bit", so compromises in future-proofness are to be expected. While IBM used the "16-bit" 8088 processor that had an address space exceeding 64KB, the whole set of support chips, the DMA controller, the interrupt controller, the timer controller are from Intel's 8-bit series. They were designed to support an 8080 or 8085, not an 8086. The PC/XT architecture was not designed to be extended to "native 16 bits".

With the AT, IBM realized that they had to extend to 16 bits, and the AT works fine as a 16-bit 80286 machine. If you don't insist on running outdated 8086 software on it, but go "native 286", you don't have to deal with the A20 gate (A20 can be enabled all the time), and you don't have to deal with resets to get the processor out of protected mode. The only tribute you have to pay to the XT compatibility of the AT with native 286 software is the non-contiguous memory due to the "adapter RAM/ROM region" and the BIOS between 640KB and 1MB. It's a shame the AT didn't get a sensible DMA controller, though; instead they just band-aided on a second obsolete 8-bit / 64KB DMA controller chip to support the new 16-bit DMA channels.
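For readers who want to see what the A20 gate actually changes for real-mode software, here is a small Python sketch. It models only the address calculation (segment times 16 plus offset) and the effect of forcing address line A20 low; the function name and parameters are made up for illustration, not taken from any real API.

```python
def real_mode_address(segment, offset, a20_enabled):
    """Physical address the bus sees for a real-mode segment:offset pair."""
    linear = ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)  # can reach 0x10FFEF
    if not a20_enabled:
        linear &= ~(1 << 20)   # gate closed: address line A20 is forced to 0
    return linear


# FFFF:0010 is the first byte above 1MB.
print(hex(real_mode_address(0xFFFF, 0x0010, a20_enabled=True)))    # 0x100000
print(hex(real_mode_address(0xFFFF, 0x0010, a20_enabled=False)))   # 0x0
```

With the gate closed the access wraps to the bottom of memory, as on an 8086; with it open the same segment:offset pair reaches the first 64KB (minus 16 bytes) above 1MB.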

Reply 90 of 110, by jakethompson1

User metadata
Rank Oldbie
Rank
Oldbie
mkarcher wrote on 2023-10-12, 21:52:

With the AT, IBM realized that they had to extend to 16 bits, and the AT works fine as a 16-bit 80286 machine. If you don't insist on running outdated 8086 software on it, but go "native 286", you don't have to deal with the A20 gate (A20 can be enabled all the time), and you don't have to deal with resets to get the processor out of protected mode. The only tribute you have to pay to the XT compatibility of the AT with native 286 software is the non-contiguous memory due to the "adapter RAM/ROM region" and the BIOS between 640KB and 1MB. It's a shame the AT didn't get a sensible DMA controller, though; instead they just band-aided on a second obsolete 8-bit / 64KB DMA controller chip to support the new 16-bit DMA channels.

Is this related to the AT disk controller switching from DMA to PIO because ISA DMA is so slow vs. rep insw?

Reply 91 of 110, by Disruptor

User metadata
Rank Oldbie
Rank
Oldbie
jakethompson1 wrote on 2023-10-12, 22:06:

Is this related to the AT disk controller switching from DMA to PIO because ISA DMA is so slow vs. rep insw?

Not just this. And the 8088/8086 had no REP insw. This was introduced in the 80188/80186. But the V20 had REP insw.
Good performing AT interface cards had either shared memory or busmaster DMA. But no no never those Intel 8 bit chips...

Reply 92 of 110, by mkarcher

User metadata
Rank l33t
Rank
l33t
jakethompson1 wrote on 2023-10-12, 22:06:
mkarcher wrote on 2023-10-12, 21:52:

With the AT, IBM realized that they had to extend to 16 bits, and the AT works fine as a 16-bit 80286 machine. If you don't insist on running outdated 8086 software on it, but go "native 286", you don't have to deal with the A20 gate (A20 can be enabled all the time), and you don't have to deal with resets to get the processor out of protected mode. The only tribute you have to pay to the XT compatibility of the AT with native 286 software is the non-contiguous memory due to the "adapter RAM/ROM region" and the BIOS between 640KB and 1MB. It's a shame the AT didn't get a sensible DMA controller, though; instead they just band-aided on a second obsolete 8-bit / 64KB DMA controller chip to support the new 16-bit DMA channels.

Is this related to the AT disk controller switching from DMA to PIO because ISA DMA is so slow vs. rep insw?

I get why they ran the 8-bit DMA channels as slow as they did - they wanted to keep the XT hard disk controllers working in the AT. In the PC/XT, the DMA controller ran at the single system clock of 4.77MHz. IBM didn't dare to clock the stuff up to 6MHz, which would likely not only make cards stop working, but also exceed the specified clock of the Intel 8237 (I need to admit I skipped looking up what variants exist). Instead, IBM divided the 6MHz down to 3MHz, making ISA DMA noticeably slower on the AT than it was on the XT. The later versions of the AT increased the DMA clock back to 4MHz, which might be acceptable for PC-compatible 8-bit DMA.

On the other hand, REP INSW was quite fast on the 286, it doesn't have any stupid 64K or 128K boundaries (like the outdated DMA controller did), and as long as you don't do multitasking, there is no advantage of "DMA in the background" over "PIO in the foreground".
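The 64K / 128K boundary restriction is easy to express as a check. The following Python sketch assumes the usual AT arrangement where an 8-bit channel cannot cross a 64KB physical boundary and a 16-bit channel cannot cross a 128KB one; the function name and parameters are hypothetical, chosen only for this example.

```python
def crosses_dma_page(phys_addr, length, sixteen_bit=False):
    """True if a buffer of `length` bytes at `phys_addr` straddles a DMA page boundary."""
    if length <= 0:
        raise ValueError("length must be positive")
    page = (128 if sixteen_bit else 64) * 1024   # page size in bytes
    first_page = phys_addr // page
    last_page = (phys_addr + length - 1) // page
    return first_page != last_page


print(crosses_dma_page(0xFFF0, 0x20))                    # True
print(crosses_dma_page(0xFFF0, 0x20, sixteen_bit=True))  # False
```

A driver using the mainboard DMA controller has to split or relocate any buffer for which this returns True, whereas a REP INSW loop can simply fill the buffer wherever it happens to be.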

The EISA DMA controller showed how a sensible DMA controller could have been designed. It allows tighter timings (at least if the memory side of the DMA transfer is on-board memory and not ISA bus memory), it does away with the 64K / 128K "pages" of DMA, it supports full 32-bit addressing, and IIRC it even supports scatter/gather DMA. We got better DMA support in some later chipsets (e.g. the ISA bridge of the Intel Saturn 486 chipset supports most features of the EISA DMA controller that do not require the EISA connector), but support was so sparse that neither hardware vendors nor operating system vendors felt like investing money in supporting EISA-based DMA using that DMA controller. You had some nice EISA busmaster DMA cards (e.g. the Adaptec 2742) for sure, but I don't think first-party DMA (that is DMA with addresses generated by the mainboard DMA controller) on EISA was used a lot.

With the condensed timings supported by EISA host bridges, DMA on the ISA bus started to make sense in some niche cases - especially if executing DMA in the background pays off.

Reply 93 of 110, by BitWrangler

User metadata
Rank l33t++
Rank
l33t++

I think I tried years ago to use an XT HDD controller in an AT with no luck. Neither on a 5170 nor on a later 386 board. Or does that mean "as long as you strap the IRQ 7 line to IRQ 14"?

Unicorn herding operations are proceeding, but all the totes of hens teeth and barrels of rocking horse poop give them plenty of hiding spots.

Reply 94 of 110, by Grzyb

User metadata
Rank l33t
Rank
l33t
BitWrangler wrote on 2023-10-12, 23:48:

I think I tried years ago to use an XT HDD controller in an AT with no luck. Neither on a 5170 nor on a later 386 board. Or does that mean "as long as you strap the IRQ 7 line to IRQ 14"?

I tried it many times, successfully.
The only problem was: XT HDD controllers come with a BIOS extension that can't coexist with the HDD support in the AT BIOS.
Solution: disable the AT BIOS HDD support by setting all HDDs to "None" in CMOS Setup.

Zaglądali do kufrów, zaglądali do waliz, nie zajrzeli do dupy - tam miałem klimatyzm.

Reply 95 of 110, by BitWrangler

User metadata
Rank l33t++
Rank
l33t++

Ah, maybe I was expecting to configure it in setup.

Unicorn herding operations are proceeding, but all the totes of hens teeth and barrels of rocking horse poop give them plenty of hiding spots.

Reply 96 of 110, by DEAT

User metadata
Rank Newbie
Rank
Newbie
aries-mu wrote on 2023-10-03, 08:38:

Will Himem make it XMS?

Yes, though after confirming with a few runs via HIMEM I recommend setting the /TESTMEM:OFF switch to significantly improve the time it'll take for HIMEM to load. FDXMS286 will also work for FreeDOS, though paradoxically despite being designed for 286 PCs it requires the /PS switch to initialise correctly on generic AT boards.

Will Smartdrv or other caching software be able to utilize it?

Yes. I've noticed an improvement with benchmarking on WinTune 2.0.

Assuming I can source a 286 PC, what would be the best way to upgrade it?

Replacing a soldered PLCC68 CPU with a socketed one is a good start, same with replacing soldered oscillators with socketed ones. High-quality RAM is a must for faster CPU speeds.

RAM like 4 MB or even 8 MB?

Extremely chipset-dependent, as other people have pointed out. The Headland HT18 chipset will allow the full 16MB that a 286 CPU can access, but compared to the Headland HT12 it does not have built-in EMS support and it is not as fast clock-for-clock, if wolf_286 benchmarks are any indication. For reference, I believe I was getting 17.1 FPS with a Cirrus Logic GD5429 on a Headland HT12 @ 25MHz with 0 wait states (that mobo died), while I get 15.6 FPS with the Headland HT18 @ 25MHz with unknown wait states.

With very careful management of EMS and XMS memory on a Headland HT12 w/4MB of RAM, you can run Master of Magic and 1830: Railroads & Robber Barons without issues.

ISA SVGA, what? Cirrus Logic?

If you only care about DOS, any Cirrus Logic GD542x or WD90c3x card will be sufficient. Tseng ET4000 is overrated and has compatibility issues. I'm still in the process of determining what is ideal for a Win 3.1-centric build or a mix of DOS and Win 3.1.

rasz_pl wrote on 2023-10-03, 21:58:

Are there any non Windows VGA games requiring more than 1MB that will run acceptably on 286? I mentioned Sam & Max: Hit the Road because that is one of those rare games with VGA/>1MB and 286 requirements.

The floppy versions of Sam & Max and Day of the Tentacle will both run on a V20 without EMS, confirmed with my NuXT. Curiously, Indiana Jones and the Fate of Atlantis and Monkey Island 2 both require a 286 since they do an explicit CPU check. Neither game requires XMS or EMS.

Here's a dump of commercial/shareware games from my spreadsheet of games installed on my CF card that have an explicit 286 requirement and will utilise XMS, those that require XMS are bolded:

Blake Stone: Aliens of Gold (v1.0 only, all subsequent versions have a soft 386 requirement where it'll hang while loading any level)
Champions of Zulula
Disciples of Steel
Dune II
Electranoid
Legends of Valour
MicroProse Formula One Grand Prix
Nomad
Operation Body Count (allows higher screen sizes)
Revolt of Don's Knights
Rules of Engagement 2
Spear of Destiny
The Clue!
Wolfenstein 3D

And here's the list of games with an explicit 286 requirement and will utilise EMS, those that require EMS are bolded:

1830: Railroads and Robber Barons (may work on a V20, requires 2.7MB of EMS but my 8MB EMS card is 16-bit only)
Arcy 2
Armour-Geddon
Battle Bugs
Blake Stone: Aliens of Gold (v1.0 only)
Challenge of the Five Realms
Corridor 7 (floppy version) (without EMS, sound produces glitchy noise)
Darklands
Darkstrike ATF - Advanced Tactical Fighter
Devil Land
Gene Splicing
King Arthur's Knights of the Round Table
Lode Runner: The Legend Returns
Master of Magic (may work on a V20, requires 2.7MB of EMS but my 8MB EMS card is 16-bit only)
Mystic Towers
Nomad
Reunion
Spear of Destiny
The Aethra Chronicles: Celystra's Bane
The Clue!
The Incredible Machine 2
Total Carnage
Wing Commander 1
Wolfenstein 3D
Yendorian Tales Book I: Chapter 2

Additionally, Ashes of Empire also requires XMS or EMS, but it will run on a 8088. Speaking of, here's a list of games that will run on a XT with EMS, once again those that require it are bolded:

8088

Aces of the Pacific (v1.0 only, all other versions and the 1946 expansion require a 386)
Alien Breed
Ashes of Empire
Diggers
Igor: Objective Uikokahonia
Hoyle Classic Card Games (1993)
Ken's Labyrinth
Lemmings 2: The Tribes
Prince of Persia 2: The Shadow and the Flame
Quick Majik Adventure
SEAL Team
The Summoning
Veil of Darkness
Wizardry 7: Crusaders of the Dark Savant
Wolfsbane
Worlds of Ultima: Martian Dreams
Worlds of Ultima: Savage Empire

V20

Bram Stoker's Dracula
Doofus
Hexx: Heresy of the Wizard
Hired Guns
International Tennis
Laser Squad
Master of Orion
Monster Bash
Populous 2: Trials of the Olympian Gods
Space Hulk (03/03/1993 version only - all other versions require a 386)
Spaceward Ho!
Star Wars: X-Wing
Super 3D Noah's Ark
The Terminator 2029 (including Operation Scour expansion)
Wing Commander 2

So yes, with a motherboard chipset that has built-in EMS support there is more than enough justification for having more than 1MB of RAM without even needing to mention Windows 3.1 as a reason.

Finally, without XMS there are several more games that need UMB drivers and DOSMAX to free up enough conventional memory to run. They are sorted by CPU requirement and assume BUFFERS=20 or higher in CONFIG.SYS; some will run with BUFFERS=10 or lower, but that is not recommended with spinning rust:

8088

Aces of the Pacific (v1.0 only)
Catcher
Cybergenic Ranger: Secret of the Seventh Planet
Heartlight
Igor: Objective Uikokahonia
International Sensible Soccer
Lemmings 2: The Tribes (if using MSMOUSE.COM and EMS for digitised sound)
SEAL Team
Sensible Soccer: European Champions - 92/93 Edition
Tex Murphy: Martian Memorandum
Ultrabots
Wolfsbane

V20

D&D Stronghold: Kingdom Simulator
Daemonsgate
Enigma
Hannibal
Maelstrom
Operation Europe: Path to Victory 1939-45
Protostar: War on the Frontier
Romance of the Three Kingdoms III: Dragon of Destiny
Sid Meier's Colonization
Siege
Space Hulk (03/03/1993 version only)
Tegel's Mercenaries
Walls of Rome

286

Battle Bugs
Blake Stone: Aliens of Gold (v1.0 only)
Challenge of the Five Realms
Frankenstein
Nippon Safes Inc.
Nomad
Stack Up
Star Control 2 (claims to require 566KB of conventional memory, but this is incorrect)

For the following 286 games that require EMS, I cannot confirm whether they require UMB+DOSMAX or XMS, as I can't use EMS and UMB at the same time with my motherboard:

Armour-Geddon
Darkstrike ATF: Advanced Tactical Fighter (explicit requirement of 563KB conventional, which I barely reached but it crashed after a couple of minutes in-game)
Darklands (explicit requirement of 576KB conventional)
Gene Splicing (explicit requirement of 580KB conventional)
The Incredible Machine 2 (explicit requirement of 560KB conventional, crashed while loading)

Last edited by DEAT on 2023-10-13, 03:30. Edited 2 times in total.

Reply 97 of 110, by BitWrangler

User metadata
Rank l33t++
Rank
l33t++

Looked up a couple of your bolded ones like Alien Breed, and they say 640K minimum, with EMS optional for digitised sound.

Unicorn herding operations are proceeding, but all the totes of hens teeth and barrels of rocking horse poop give them plenty of hiding spots.

Reply 98 of 110, by Horun

User metadata
Rank l33t++
Rank
l33t++

Yes, there are good reasons to have 2MB or 4MB on a 286. Even GEOS (GeoWorks) supported extended RAM and could use it if needed....

Hate posting a reply and then have to edit it because it made no sense 😁 First computer was an IBM 3270 workstation with CGA monitor. Stuff: https://archive.org/details/@horun

Reply 99 of 110, by DEAT

User metadata
Rank Newbie
Rank
Newbie
BitWrangler wrote on 2023-10-13, 02:09:

Looked up a couple of your bolded ones like Alien Breed, and they say 640K minimum, with EMS optional for digitised sound.

Retested and you're correct - looks like I had a typo in my spreadsheet. Corrected, thanks for that!