Gw Temp


Article - 'Understanding hardware' by Guest

An item about Miscellaneous posted on Jan 28, 2005


An article that explains about hardware and programming, beyond the luxury of the "makers".


Understanding the hardware

We live in 2004. In the past, you needed to understand how the hardware worked in order to program games. But times have changed, and now we have those 'gamemaker' things and special libraries and kits to make our lives seem like paradise. Still, if you want to program your own game in a real programming language, it is very relevant to know how the hardware works. Remember: read this only if you're interested in it. It is not mandatory for development, but it can help you understand 'why' certain things look the way they do in a programming language. In fact, I discourage the use of this document if you haven't looked into 'hello world' yet and coded your own small mini game (this document will only confuse you).

Programming itself is a way of figuring out how something really works: you have an idea or problem in your head that you want to solve, and then you have to put it on paper somehow (make it concrete) with a language. A higher level language will do everything for you that seems logical. A low level language often requires understanding of the hardware before things become rational. A high level language is encouraged.

A functional language is often very expressive, but sometimes cryptic. The goal of such a language, like Haskell, is to achieve something in the shortest development time possible. However, these applications don't always run as fast as those written in more imperative (less functional) languages. Imperative means you have to tell the language what you are doing all the time; you often have to write a lot of repetitive code etc. An imperative language is discouraged. It is best to get something that suits your current level of 'know how' and 'know when'.

Often, people use assembly (a very imperative method of programming) to show how hardware works. I don't agree with this anymore. Please, never consider getting into assembly; it's almost never worth your time. Exceptions would be figuring out ways to optimise your code (making it faster) or assembly-only architectures or machines (e.g. shader language). Though, even those often ship with a simple C compiler.

What you have to understand is that the eventual 'product' of your code will be machine code. Machine code is the most basic thing there is for a processor. The code consists of a list of operations produced by a compiler; this could be a C++ compiler, for instance. Sometimes compilers produce something other than finished machine code. For example, there are compilers that take C++ code and turn it into object files. These object files are then sent to a 'linker', which assembles the final machine code (or other output). This is how it often goes, but it is not relevant; the eventual output will usually still be machine code.

The machine code is thus a list of instructions. An instruction consists of an 'opcode' and data. The opcode means 'operating code' and the data are the 'parameters' of this opcode; together, they're an instruction. A processor, for example the one that's in your computer now (x86), can read those instructions. There are instructions that can interact with your hardware; for example, there are opcodes that send data to your harddrive (abstractly speaking) or videocard. Sometimes an opcode needs a parameter: say you have an operation called "print", then it still needs to know 'what' to print. If you had print "hello", you would have the opcode for print and the argument "hello". There can be multiple arguments.

A processor itself reads instructions as big as its 'architecture size'. For your computer, that is likely 32 bit, and if you're on an AMD64, it's 64 bit. If you have, for example, a 16 bit processor, you could make an instruction have 8 bits for the operating code and 8 bits for the parameters. There is also a trick to make it more, but we won't be talking about that just yet.
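To make the 16 bit example a bit more concrete, here is a minimal sketch of packing an opcode and a parameter into one instruction and taking them apart again. The layout (opcode in the high 8 bits, parameter in the low 8 bits) and the function names are my own assumptions for illustration; a real processor defines its own layout.

```c
#include <stdint.h>

/* Hypothetical 16 bit instruction layout (an assumption for this sketch):
   high 8 bits = opcode, low 8 bits = parameter. */
uint16_t encode_instruction(uint8_t opcode, uint8_t parameter) {
    return (uint16_t)(((uint16_t)opcode << 8) | parameter);
}

/* Extract the opcode from the high 8 bits. */
uint8_t decode_opcode(uint16_t instruction) {
    return (uint8_t)(instruction >> 8);
}

/* Extract the parameter from the low 8 bits. */
uint8_t decode_parameter(uint16_t instruction) {
    return (uint8_t)(instruction & 0xFF);
}
```

So with a made-up opcode 0x01 and the parameter 0x2A, encode_instruction gives the 16 bit value 0x012A, and the two decode functions give back 0x01 and 0x2A.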

The processor practically grabs this list of instructions from a binary file (often an .exe or .dll), loads it, and iterates over the instructions. Iteration means it starts at the beginning of the list and works its way to the end, but in the meantime the iterator can change its place. The place it's currently at is called the PC, the program counter (small bit of irrelevant information there).
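Here is a rough sketch of that iteration, written as a tiny pretend processor in C. The opcodes (OP_NOP, OP_JUMP, OP_HALT), the Instruction struct and the run function are all made up for this example; no real CPU works exactly like this.

```c
#include <stddef.h>

/* Made-up opcodes, for illustration only. */
enum { OP_NOP = 0, OP_JUMP = 1, OP_HALT = 2 };

typedef struct {
    int opcode;
    size_t parameter; /* for OP_JUMP: index of the instruction to go to */
} Instruction;

/* Runs the program and returns how many instructions were executed.
   max_steps guards against a program that jumps in circles forever. */
size_t run(const Instruction *program, size_t length, size_t max_steps) {
    size_t pc = 0;        /* program counter: index of the current instruction */
    size_t executed = 0;

    while (pc < length && executed < max_steps) {
        Instruction current = program[pc];
        executed++;
        if (current.opcode == OP_HALT) {
            break;                    /* stop iterating */
        } else if (current.opcode == OP_JUMP) {
            pc = current.parameter;   /* the iterator changes its place */
        } else {
            pc++;                     /* default: move on to the next instruction */
        }
    }
    return executed;
}
```

The only interesting part is that a jump does nothing more than overwrite pc; everything else just moves one step forward.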

For a processor, everything is made up of 0 and 1. Binary. Practically, all your hardware is made up of that too (not concretely of course, unless you have some supernatural understanding of how the universe works). We humans made protocols, consisting of interfaces, so we can turn binary into something useful. A protocol is a standard for communication; the pieces of hardware need to communicate with each other (keyboard <-> software <-> screen). The interfaces are the protocols put into action.

An example of a standard would be the 'byte'. A byte is exactly 8 bits long. The best way to use these 8 bits is to store a number. With 8 bits, you can make a number ranging from 0 to 255. On top of this 'standard' we build another 'enumerational' standard, which we call ASCII. ASCII is an international standard for symbols represented in bytes (each one called a 'char'). For example, we say that when a byte contains the value 97 it is the letter 'a', and when it is 98 it is the letter 'b'. We do this because we cannot concretely make the letters reside in the computer memory; there is no way of making the computer figure out what 'a' would be without an international standard. The ASCII standard has 128 standard symbols, numbered from 0 to 127, so any program that uses this standard will work on any platform that has implemented it. That means if I'm on an IBM computer and I type "hello" in an ASCII-using text editor, I could save it as a raw ASCII text file and load it onto a Mac and the letters would still be the same. If we did not have this standard, it wouldn't look like "hello" on a Mac anymore; instead, it would look like some garbage.
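You can see for yourself that letters are just numbers with a small C sketch. The function name ascii_codes is made up here, and the sketch assumes your compiler and platform use ASCII for characters (practically all of them do).

```c
#include <stddef.h>

/* Copies the numeric code of each character in text into codes.
   Writes at most max_codes entries and returns how many were written.
   Assumes the platform stores characters as ASCII. */
size_t ascii_codes(const char *text, int *codes, size_t max_codes) {
    size_t count = 0;
    while (text[count] != '\0' && count < max_codes) {
        codes[count] = (unsigned char)text[count]; /* a char is just a small number */
        count++;
    }
    return count;
}
```

Feeding it "hello" fills the array with 104, 101, 108, 108, 111, which is exactly what ends up in the raw text file.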

Thus standards and protocols are the most important thing in CS, ever. (hail the open source community)

If you don't understand all that I just told you (I don't blame you yet), then listen to this story. First, you have to accept that text and ASCII are human creations: we have made the standard, it is there, and it represents a way to put down text in an abstract binary way. Now, there are also some scientists who do not know much about computer science but like researching deep space. The silliest thing you could do is send a digital data disc into space with some human books and pictures on it.


For those aliens, our protocols mean nothing. For them, it will be like a big array of 1's and 0's. They don't have ASCII, and they don't know that we use bytes. Maybe they use units of 10 bits. Even if they figured out everything was partitioned in 8 bit sections, they would have no way of telling that 97 means 'a'. The best we could do is send a 'piece of paper' with it that explains the protocol in a really abstract and simple way. I figure the best way to put it to aliens is to show them 8 boxes, and all 128 different combinations with the letters they stand for.

Even if we did THAT, they can't open the images; they could only read the text. Why? They don't know our protocol for digital image data or how we partition our files. Even 'that' is a protocol built on top of another protocol. For example, we have a format for our images that represents the image by making the first two bytes the 'width' of the image and the following two the 'height', and then we follow everything up with an array of 3 byte structures, where the three bytes are the R, G and B values respectively. Now, aliens may do it the other way around, or not even use RGB. Maybe they use more than 2 bytes, who knows; they can't know.

Even if they did, there is the next problem. They don't have our hardware. They can't read the disk. Sure, even if they figured that out (laser technology), they still have to figure out where to start reading the disk. Maybe they will start from the outer ring, and not the inner! Oh, so you say we give them a CD drive with it? Won't work: a CD drive uses an interface which is built on a hardware protocol, here the protocol that connects the drive to the mainboard. They could have fixed all these problems by sending a computer into space with all the data on it and only an 'on' button on it. Also with an instruction pamphlet which visualises what button to press in 2 steps, PREFERABLY with the numbers on them in binary AND arabic. First an image with the computer on it, and then one with the button pushed.
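The image format from the story (two bytes of width, two bytes of height, then 3 byte R G B structures) can be read with a few lines of C. This is only a sketch of that made-up format. The names Image, parse_image and red_at are mine, and I'm assuming the two-byte numbers are stored most significant byte first and the pixels row by row, which the story doesn't specify (and which is exactly the kind of thing the aliens couldn't know either).

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t width;
    uint16_t height;
    const uint8_t *pixels; /* width * height structures of 3 bytes: R, G, B */
} Image;

/* Reads the 4 byte header and checks that enough pixel data follows.
   Returns 1 on success, 0 if the data is too short for the size it claims.
   Assumption: two-byte values are stored most significant byte first. */
int parse_image(const uint8_t *data, size_t length, Image *out) {
    if (length < 4) return 0;
    out->width  = (uint16_t)((data[0] << 8) | data[1]);
    out->height = (uint16_t)((data[2] << 8) | data[3]);
    size_t needed = 4 + (size_t)out->width * out->height * 3;
    if (length < needed) return 0;
    out->pixels = data + 4;
    return 1;
}

/* Red value of the pixel at column x, row y.
   Assumption: rows are stored one after another, top to bottom. */
uint8_t red_at(const Image *image, size_t x, size_t y) {
    return image->pixels[(y * image->width + x) * 3];
}
```

Notice how much of this is pure agreement: swap the byte order or the order of R, G and B and the same bytes give a completely different picture.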

Back to the architectures. If you have a CPU, it will read instructions and understand what to do with them as long as they have been compiled and linked for that architecture (sometimes they are also compatible with other architectures in the same family if those follow the same protocol, like x86 code will run on 286, 386, 486 etc. processors; your processor is probably a 686). However, not all the 'binary' data is instructions; sometimes it's an abstract data structure instead, like ASCII text, or maybe an image, or something else.

The CPU can get its instructions and data from various places, and it eats and feeds (input and output) them through various methods. The most important are the 'stack', the 'registers' and the 'RAM', and there is also often an 'i/o map' (note that here that means a binary map; a map of what the binary data really means). Sometimes we call it a memory map when the i/o map is interlinked with the RAM (irrelevant).

The CPU stack is often in the lower cache part, but not always. The stack is really important. A stack is also an abstract name for something that exists in our real world too (well duh, it's a stack). Basically you 'push' things on the stack and you 'pop' things off, like you put a plate on a stack and take one off. A stack of plates is FILO, "first in last out" (also called LIFO, "last in first out"): whatever gets pushed on last gets popped off first. There is also FIFO, "first in first out", where whatever gets pushed in first gets popped out first; that structure is usually called a queue rather than a stack.

A stack is important because you can, for example, have a CPU operation 'add' which adds two numbers. It could get these two numbers from the stack: it simply 'pops' two values off and the CPU adds those two numbers. The result is then pushed onto the stack again. This is how a basic virtual stack machine works. What happens inside to add those two numbers is currently not relevant (that is more part of 'developing hardware'). The size of the stack for a CPU is often limited. On a concrete CPU stack you CANNOT push on anything you want. Often, the items in the stack are as big as the CPU architecture size; for you that probably means 32 bits (or 64 on AMD64). So if you're on a 32 bit architecture and you want to push on some ASCII text, then you have to realise one ASCII character requires 1 byte, and one byte is 8 bits. We're talking about 32 bits here, so you have to shove in something of the size 32 / 8 = 4 characters. If you want to push only one byte, you could 'nullify' the other three (0 is a standard for 'here ends the array of data'). A stack can also reside in RAM, but in a more abstract way than the CPU stack (you'll have to program it yourself with the available instructions).
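A stack that lives in RAM, together with an 'add' operation that works on it, could look something like this in C. It's a minimal sketch: the names Stack, stack_push, stack_pop and op_add and the capacity of 8 items are my own choices, and every item is one 32 bit word like on the CPU described above.

```c
#include <stdint.h>

#define STACK_CAPACITY 8   /* deliberately small: a real stack is limited too */

typedef struct {
    uint32_t items[STACK_CAPACITY]; /* each item is one 32 bit word */
    int count;                      /* how many items are on the stack */
} Stack;

/* Puts a value on top. Returns 1 on success, 0 if the stack is full. */
int stack_push(Stack *s, uint32_t value) {
    if (s->count >= STACK_CAPACITY) return 0;
    s->items[s->count++] = value;
    return 1;
}

/* Takes the top value off. Returns 1 on success, 0 if the stack is empty. */
int stack_pop(Stack *s, uint32_t *value) {
    if (s->count <= 0) return 0;
    *value = s->items[--s->count];
    return 1;
}

/* The 'add' operation: pop two values, push their sum.
   Returns 0 (and changes nothing) if fewer than two values are available. */
int op_add(Stack *s) {
    uint32_t a, b;
    if (s->count < 2) return 0;
    stack_pop(s, &a);
    stack_pop(s, &b);
    return stack_push(s, a + b);
}
```

Push 2, push 3, call op_add, and the only thing left on the stack is 5. The operation itself never needs to be told where its numbers are; that is the whole charm of a stack machine.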

Why do we use stacks and not just 'linear memory'? Mainly we do it because the stack is 'really fucking fast'. RAM is often slow because you need to interface with it, whereas the cache is actually on the CPU. That means it takes fewer 'CPU cycles' to get some data. Powerful processors often allow for a lot of high level cache (high level is what you use, the lower is what the CPU uses when it needs a buffer stack). PS: I'm not 100% sure on that.

Next are the registers. Registers are practically also the size of the architecture, but you can look at them as variables. The registers reside on the CPU, and they influence how the CPU works; you may use them to figure out things about the hardware or to configure it. CPU registers are one thing, but if you look at the GBA, for example, you will see that in the memory map there are also registers for other parts of the hardware, like the sound registers on the sound chip. It is important to understand that registers are often just part of how the hardware works and is configured on a 'lower binary represented level'.

For convenience let's talk about CPU registers. An example register would be the PC, the program counter. If you increase this counter, you practically skip the next instruction. On 32 bit systems registers are 32 bit. However, processors haven't always been 32 bit; they once were 16 bit (and others were 8 bit, like the NES). Sometimes a processor allows part of a register to be used on its own, and then we talk about the 'higher' and 'lower' part. A CPU also has general purpose registers; these are treated like 'very fucking fast' variables. You use them for intensive calculations, though often you don't get to pick them yourself because your compiler will do it for you (with really smart software you will never in your life figure out). For example, on x86 you have the A register (they're named after letters: A, B, C, D). When I want to 'talk' to the full 32 bit A register we say EAX (irrelevant), its lower 16 bits are called AX, and we use AH and AL to talk to the higher and lower 8 bits of AX (irrelevant). Some registers on the x86 CPU are not the size of the CPU but emulated 16 bit because we used to have those in the past (they're not really 16 bit but they work like that). Another example of a register is the 'flag' register; this register records things about the last operation, like whether the result was zero or whether something went wrong.
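On a real x86 the full 32 bit A register is called EAX, its lower 16 bits AX, and the two bytes of AX are AH (higher) and AL (lower). You can't touch real registers from plain C, but a small sketch can show what 'the lower 16 bits' and 'the higher byte' of a 32 bit value mean. The function names are made up for this example.

```c
#include <stdint.h>

/* Lower 16 bits of a 32 bit register value (what AX is to EAX). */
uint16_t low_word(uint32_t reg) {
    return (uint16_t)(reg & 0xFFFF);
}

/* Higher byte of the lower 16 bits (what AH is to EAX). */
uint8_t high_byte(uint32_t reg) {
    return (uint8_t)((reg >> 8) & 0xFF);
}

/* Lowest byte (what AL is to EAX). */
uint8_t low_byte(uint32_t reg) {
    return (uint8_t)(reg & 0xFF);
}
```

If the register holds 0x12345678, then the low word is 0x5678, its higher byte is 0x56 and its lower byte is 0x78. It's all the same bits; we just agree on names for parts of them.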

That's it for registers (for now). Next is RAM. RAM means random access memory. In practice it means you can read AND write to it, at any position you like. It's flexible. There's also ROM, from which you may only read data. And that's just boring :) There are a million different methods and things out there, especially weird hardware like flashcards and stuff, but it all comes down to being writeable and readable or not, fuck what they want you to believe.

So it's just a big list of data. However, when we talk about RAM it has become the fashion that we mean the RAM that interfaces with the CPU. The RAM is so important that they decided there should be a special bridge between the two. For example, the CPU has instructions with arguments that read data from the RAM, or the CPU may execute code that resides in RAM.

But that's not how it always works. You see, a CPU needs to interface with all hardware at any time in almost any operation. The way they connected the CPU to the other hardware defines how to interface with this hardware. The size of the map is determined by the size of the CPU architecture. We have a 32 bit processor; this allows for 32 bit memory mapping. We've assigned it all to RAM. When I talk about a 'reference' in the CPU and this is for example 0, then I talk about the first byte of data inside the RAM. When I talk about 1, I'm getting the second byte from the RAM, and so on up to 0xFFFFFFFF (the 32 bit maximum in hexadecimal).

The GBA has a memory map that does not have its complete map attached to the RAM. This is also because it doesn't have much RAM, and it would be a waste not to use the full 32 bit potential for other purposes. If I tell the GBA about 0x100000 I might very well be talking to the screen and not to the RAM. When I change anything in 'this area', something will also change on the screen (if you get what I mean). But I could also, for example, read from address (or register) 0x17000 to get information about what 'buttons' are pressed. (These addresses are only there to illustrate the idea; they are not the real GBA ones.) This 'memory map' is important when you want to program for a specific piece of hardware (here the GBA).
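To illustrate the idea of a memory map, here is a sketch that tells you which piece of hardware an address belongs to. The layout is completely made up (it only borrows the example numbers from the text and is NOT the real GBA map), and the names Region and region_for are mine.

```c
#include <stdint.h>

typedef enum {
    REGION_RAM,
    REGION_BUTTONS,
    REGION_SCREEN,
    REGION_UNMAPPED
} Region;

/* Made-up layout for illustration, NOT the real GBA memory map. */
#define RAM_END       0x0FFFFu   /* RAM occupies 0x00000 .. 0x0FFFF */
#define BUTTONS_ADDR  0x17000u   /* one register holding the button state */
#define SCREEN_START  0x100000u  /* screen memory occupies 0x100000 .. 0x10FFFF */
#define SCREEN_END    0x10FFFFu

/* Decides which hardware an address talks to. */
Region region_for(uint32_t address) {
    if (address <= RAM_END) return REGION_RAM;
    if (address == BUTTONS_ADDR) return REGION_BUTTONS;
    if (address >= SCREEN_START && address <= SCREEN_END) return REGION_SCREEN;
    return REGION_UNMAPPED;
}
```

On real hardware this decision is wired into the machine itself; the program just reads or writes an address and the right chip answers.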

Let's say I have an operating code "is this 1", which pops an address from the stack and checks whether the value at this address is '1'. If it is one, it will pop the next address and change the PC to it (and thus change what code to execute). So let's say I have hardware at address 0x10 that knows if the A button is pressed (1 if pressed, 0 if not). Then I simply push '0x10' onto the stack, together with the location of the code for "when this key is pressed", and then I issue the "is this 1" operation. In a higher level language it would look something like this:

if (*(volatile char*)0x10 == 1) {
    printf("A button is pressed!");
}

or something similar.
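Since you can't just read address 0x10 on a normal PC, here is a sketch of the "is this 1" operation on a pretend machine with its own little memory, stack and PC. Everything here (Machine, push, pop, op_is_this_1) is made up for illustration. I'm also assuming two things the description leaves open: the jump target is pushed first and the address to check is pushed last, and the target gets popped off in both cases so the stack stays tidy.

```c
#include <stdint.h>

#define MEM_SIZE   256
#define STACK_SIZE 16

typedef struct {
    uint8_t  memory[MEM_SIZE];   /* pretend address space; 0x10 = A button */
    uint32_t stack[STACK_SIZE];
    int      sp;                 /* number of items on the stack */
    uint32_t pc;                 /* program counter */
} Machine;

/* Push silently ignores a full stack; fine for a sketch. */
static void push(Machine *m, uint32_t value) {
    if (m->sp < STACK_SIZE) m->stack[m->sp++] = value;
}

/* Pop returns 0 when the stack is empty; fine for a sketch. */
static uint32_t pop(Machine *m) {
    return m->sp > 0 ? m->stack[--m->sp] : 0;
}

/* "is this 1": pop an address and a jump target; if the byte at the address
   is 1, jump to the target, otherwise carry on with the next instruction. */
void op_is_this_1(Machine *m) {
    uint32_t address = pop(m);
    uint32_t target  = pop(m);
    if (address < MEM_SIZE && m->memory[address] == 1) {
        m->pc = target;
    } else {
        m->pc = m->pc + 1;
    }
}
```

To use it you would set memory[0x10] to 1 (button pressed), push the location of the "button pressed" code, push 0x10, and call op_is_this_1; afterwards the PC points at the "button pressed" code.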

However, now we have a problem. The processor needs to know whether something is simply a constant or a reference (or a reference to a reference etc). Why? Well, let's say I have the 'add' instruction, which adds two values from the stack and pushes the result on the stack. When I say 0x10 + 0x10 it will simply give me 0x20 (whatever), because it will always think 0x10 is a constant where it 'could be' the state of button A instead! This is why we often have arguments in the instructions, or flags. A flag is an option that can be true or false (e.g. constant or reference).
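A sketch of such a flag in C: every operand carries a flag that says whether its value is a constant or an address to look up. The names Operand, resolve and add_operands are made up, the memory is just an array standing in for the real thing, and an out-of-range address simply reads as 0 in this sketch.

```c
#include <stdint.h>

#define MEMORY_SIZE 256

typedef struct {
    int      is_reference; /* the flag: 0 = constant, 1 = address to read from */
    uint32_t value;
} Operand;

/* Turns an operand into the number it stands for. */
uint32_t resolve(const Operand *operand, const uint8_t *memory) {
    if (!operand->is_reference) return operand->value;  /* plain constant */
    if (operand->value >= MEMORY_SIZE) return 0;        /* out of range: reads as 0 here */
    return memory[operand->value];                      /* look the value up */
}

/* 'add' that respects the flag of each operand. */
uint32_t add_operands(const Operand *a, const Operand *b, const uint8_t *memory) {
    return resolve(a, memory) + resolve(b, memory);
}
```

With the button at address 0x10 pressed (so memory[0x10] is 1), adding two constants 0x10 gives 0x20, but adding the reference 0x10 to the constant 0x10 gives 0x11. Same numbers in the instruction, different meaning because of the flag.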

Yes complicated complicated. But it's really not. Generally, if we have a language like C the thing is a lot more clear. When we do this in C:

10 + 10

It will use two constants. When we do:

avariable + 10

It will add a reference to a constant (variables are either 'references' to the RAM or a 'stack index').

A stack index is something on top of the 'constant/reference' thing. The stack index is a number used to tell the CPU where on the stack the value resides. When the numbers 5, 8 and 10 reside on a stack and 10 is the top, then stack index -2 would point to the value 5. We use stack indices because they're really fucking fast. Also, it's not always '-2'; sometimes it's '2', it just depends on your style or the CPU. No big deal, really, it's all software sometimes. Don't worry about it, the compiler will do everything for you; good compilers know *exactly* what to do. I know this stuff because I want to write my own compilers.
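The 5, 8, 10 example can be written out as a small C sketch. The function name stack_peek is made up, and I'm using the negative style of indexing here, where 0 is the top and -2 is two items below it; as said, other conventions exist.

```c
#include <stddef.h>

/* Looks at a value on the stack without popping it.
   items[count - 1] is the top. offset 0 is the top, -1 the item below it, etc.
   Returns 1 and stores the value in *out, or 0 if offset points outside the stack. */
int stack_peek(const int *items, size_t count, int offset, int *out) {
    if (offset > 0 || count == 0) return 0;
    size_t depth = (size_t)(-(long long)offset); /* how far below the top */
    if (depth >= count) return 0;
    *out = items[count - 1 - depth];
    return 1;
}
```

With the stack 5, 8, 10 (10 on top), index 0 gives 10, index -1 gives 8 and index -2 gives 5. It is fast because it is just one subtraction and one read; nothing has to be popped or pushed.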

So what about those functions I make myself? How do they work? Well, they're 'not really there' at all. They're, again, protocol. Sometimes some basic functionality resides in your BIOS (a ROM chip on your mainboard, sometimes flashable); for example, your BIOS can write text to the screen. Anyway, a compiler lets you use these functions and compiles them into machine code. For example when I have:

void afunction() { printf("hello"); }

int main() { afunction(); return 0; }

The compiler will turn all of this into instructions the CPU can read. Either it will simply put the body (compound; code) of 'afunction' at the place where 'afunction' is used, or it will put the code of 'afunction' somewhere in the machine code, remember the location, and then do a 'jump' instruction to that label (this is more common). This is not relevant to know; what is relevant is that functions are not really there at all.

Jumps work pretty simply. All a jump does is set the PC (program counter, I remind you AGAIN). A more complicated form of 'jumping' is 'calling'. Calling is very very important for functions to operate hierarchically (that means using functions in functions). 'call' jumps to a place and pushes the value of the PC on the stack, remembering from where it called 'afunction' (so here that would be in main). Then, when it arrives at 'afunction' it will do the printf, and when it finds the } (virtually) it will do a 'ret' instruction. RET pops a value from the stack and sets this as the program counter, returning to 'main' (hence we call it 'return').
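Call and ret can be sketched on a pretend CPU with nothing but a PC and a small stack. The names Cpu, cpu_call and cpu_ret are made up, and I'm assuming that every instruction takes exactly one slot, so the place to return to is simply the PC plus one. The stack is kept tiny on purpose; a call that doesn't fit is refused instead of crashing.

```c
#include <stdint.h>

#define RETURN_STACK_SIZE 4   /* tiny on purpose */

typedef struct {
    uint32_t pc;                        /* program counter */
    uint32_t stack[RETURN_STACK_SIZE];  /* remembered return locations */
    int      depth;                     /* how many calls are still open */
} Cpu;

/* 'call': remember where to continue afterwards, then jump to target.
   Returns 1 on success, 0 if the stack is already full. */
int cpu_call(Cpu *cpu, uint32_t target) {
    if (cpu->depth >= RETURN_STACK_SIZE) return 0;
    cpu->stack[cpu->depth++] = cpu->pc + 1; /* the instruction after the call */
    cpu->pc = target;
    return 1;
}

/* 'ret': pop the remembered location and continue there.
   Returns 1 on success, 0 if there is nothing to return to. */
int cpu_ret(Cpu *cpu) {
    if (cpu->depth <= 0) return 0;
    cpu->pc = cpu->stack[--cpu->depth];
    return 1;
}
```

If the PC is at 10 and you call the code at 50, the PC becomes 50 and the stack remembers 11. After the ret the PC is 11 again, so 'main' just continues as if nothing happened.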

If you make too many hierarchies (often through recursion), you will 'overflow' the stack. This means the stack is too small, or too stuffed, for your CPU to handle, and your program will crash (or on simple hardware your computer will freeze). Example:

void afunction() { afunction(); }

So I hope you understand how hardware works now a bit better... See ya later.