Tutorial - 'Comprehensive C++ Tutorial for Beginners - Part 5' by AzureFenrir

AzureFenrir returns with the fifth part of his C++ Beginners tutorial, this time discussing File I/O and basic C structures to prepare you for object-oriented programming.

Comprehensive C++ Tutorial for Beginners - Part 5

Well, well, we're at five already, aren't we? It's just one more tutorial (Introduction to OOP) until the end of this series! This tutorial will simply cover some other necessary stuff in C++, such as basic data structures and functions. I'll also add a bit of basic C++ file input :).

So...since I like old-fashioned C++ (and Dn7 continuously criticizes me because of it), I will start off by showing you how to use the newer ANSI headers and their more advanced string class.



A Look at a More Modern C++: ANSI Headers and strings

Now, remember how I've always told you to include headers like this:

#include <headername.h>
#include <headername2.h>
#include <headername3.h>
// ...


Well, the more modern version of C++ uses another set of headers (called the ANSI headers), which support quite a few more modern functions to make your programming life easier. Instead of the above lines, you should use:

#include <headername>
#include <headername2>
#include <headername3>
// ...

using namespace std;


Whoa! What is this "using namespace std" thing? Well, to ensure that their functions do not somehow conflict with yours due to an identical name, all ANSI functions and classes are encased in a namespace called "std", which serves to keep them apart from your functions (by keeping them in a different area). The command "using namespace std" simply tells C++ to also look inside the "std" namespace whenever you use a name, so that you can write "cout" instead of "std::cout". The catch is that your own names can now clash with the ones in "std" again, so don't reuse names like "string" for your own stuff.

So...what if you do not put this statement on top of your .cpp file? Well, in this case, all of the ANSI classes will still remain in the "std" namespace, and you will have to do this to refer to...say...the cout object:

std::cout << "Hello, Gaming World!\n";

BTW, the "\n" at the end refers to the "Line Feed" character. This character will cause an effect similar to endl (basically, it will start a new line). The difference is that endl also flushes the output right away.
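Going back to namespaces for a moment, here is a minimal sketch of why they keep names apart. The namespace "tutorial" and the function "show_both" are made up for this example; only std::max and std::cout come from the ANSI headers.

```cpp
#include <algorithm>
#include <iostream>

// Hypothetical namespace, invented for this example only.
namespace tutorial
{
    // Same name as std::max, but it lives in a different namespace.
    long max(long a, long b)
    {
        return (a > b) ? a : b;
    }
}

// Calls both functions. Neither call is ambiguous, because each
// name is written with its namespace in front of it.
void show_both(long a, long b)
{
    std::cout << "tutorial::max says " << tutorial::max(a, b) << "\n";
    std::cout << "std::max says " << std::max(a, b) << "\n";
}
```

Calling show_both(2, 5) from main should report 5 twice. If this file had "using namespace std" at the top, you would still be safe as long as you kept writing tutorial::max in full.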

Now, you are probably wondering why we need to use the ANSI headers instead of the regular ".h" headers. Well, in addition to being the newest C++ trend, the ANSI headers also offer a lot of things that the old ".h" headers don't...and all of them are designed to make your life much easier than C++'s normal living hell.

Let's use strings to set an example. Do you remember declaring old C-style strings from the second part of my tutorial? Well, it's quite a hassle to keep track of pointers and string sizes, and the string operations are sometimes quite annoying. I know I used to HATE the pointer and string errors that came from character manipulation and dynamic string arrays.

So...what do the ANSI headers offer that will solve this problem? Well, how about a string class?

The ANSI string class is quite easy to use. Once you #include <string>, it can be declared like this:

string NewString;

So...why is this string so much more advantageous? Well, for one, forget about the strcpy function. Instead, just set the string directly with the equal operator:

NewString = "Hello, Gaming World!";

And what about the strcat function that adds a string to the end of another if there's room? Why do we need that when we can simply use +=, without needing to check whether the string goes out of the allotted buffer size?

NewString += " Hello, Bartek Gnaido!";

Not to mention that you could still access individual characters:

NewString[1] = 'u';

The string class also offers a c_str() function that returns a constant C-style string (a const char *), which is quite handy for certain C and older C++ functions (as well as anything that requires an old-school string):

cout << NewString.c_str() << endl;

This class also offers a slew of other useful functions such as find(), replace(), and insert(), but for the sake of a short introduction to ANSI, I won't describe them here.
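For the curious, here is a quick sketch of those three functions in action. The function name "edit_greeting" and the words being swapped are just made up for illustration; find(), replace(), and insert() themselves are part of the string class.

```cpp
#include <string>

// Builds the greeting used above, then edits it with
// find(), replace(), and insert().
std::string edit_greeting()
{
    std::string text = "Hello, Gaming World!";

    // find() returns the position of the first match,
    // or std::string::npos if there is no match.
    std::string::size_type pos = text.find("Gaming");
    if (pos != std::string::npos)
    {
        // replace(position, length, new text):
        // swap the 6 characters of "Gaming" for "Coding".
        text.replace(pos, 6, "Coding");
    }

    // insert(position, text): put "Big " right before "World".
    text.insert(text.find("World"), "Big ");

    return text;
}
```

The returned string is "Hello, Coding Big World!". Notice that the string grew by itself when we inserted text, with no buffer sizes to worry about.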



Old School Data Structures: Structs and Unions

How ironic! Right after I show you how to use the newer ANSI headers, I revert right back to old-school structures.

So what are structs and unions? Well, they are custom data types that you, the user, can create. Both of these data structures can contain multiple submembers of data, allowing you to easily declare many variables with one declaration.

We'll start with structs. The basic declaration for structs is as follows:

struct MyStruct {
    int member1;
    double member2;
    char member3;
    string member4;
}; // Don't forget this semicolon!


The members are the variables that the struct "contains." Your program can use any or all of these variables once it declares a new variable with this new data type:

MyStruct MyNewVariable;

Now, you can access MyNewVariable's members by doing something like this:

MyNewVariable.member1 = 3;
MyNewVariable.member3 = 'u';
if (MyNewVariable.member2 == 3.251)
{
    MyNewVariable.member4 = "Hello, Gaming World!";
}


So what did we do here? Well, when we declared that MyNewVariable has a MyStruct data type, C++ automatically created a fresh copy of all of MyStruct's members and gave them to MyNewVariable. Therefore, MyNewVariable now contains an integer, a double, a character, and a string. (The string starts out empty, but the number and character members contain leftover garbage until you set them.)

If you declared another MyStruct-type variable:

MyStruct MyNewVariable2;

It will also have its own copy of all of the structure's members. Therefore, setting, say, the member2 member of MyNewVariable will do nothing to the member2 member of MyNewVariable2, and vice versa. They are independent variables, and changes to one don't affect the other at all.
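If you want to see that independence for yourself, something like the following sketch should do. The function name "structs_are_independent" and the numbers are just made up for the test; the struct is the same MyStruct from above.

```cpp
#include <string>

struct MyStruct {
    int member1;
    double member2;
    char member3;
    std::string member4;
};

// Returns true if changing one variable leaves the other one untouched.
bool structs_are_independent()
{
    MyStruct MyNewVariable;
    MyStruct MyNewVariable2;

    // Give each variable its own value for member2.
    MyNewVariable.member2 = 3.251;
    MyNewVariable2.member2 = 1.5;

    // Now change only the first variable...
    MyNewVariable.member2 = 99.0;

    // ...and check that the second one still holds its old value.
    return MyNewVariable2.member2 == 1.5;
}
```

The function returns true, since each variable carries its own member2.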

Unions are just like structs: they have member variables, and all of them can be accessed. However, unions store their members in the same place, which means that changing one of the members would change all of the other members, as well. For example, let's look at the following union:

union MyUnion {
    int member1;
    char member2;
};


Let's assume that you declare a new variable of this type and set the new variable's member1 to, say, 1280:

MyUnion MyNewVariable;
MyNewVariable.member1 = 1280;


Next, let's set this variable's member2 to...say...'d':

MyNewVariable.member2 = 'd';

Then, let's output the value of member1 to the console:

cout << MyNewVariable.member1 << endl;

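Ho! What is this? MyNewVariable's member1 has changed! Well, unions store all of their members in the same memory space, which means that member1 shares a memory space with member2. Therefore, changing the char member will also change a portion of the integer member. On a typical PC, you will most likely see 1380 instead of 1280, because 'd' is character number 100 and it overwrote the lowest byte of the integer (1280 + 100 = 1380).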
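Here is a small sketch that checks the "same memory space" claim directly by comparing addresses, instead of relying on what member1 happens to print. The function name "members_share_address" is made up for this example.

```cpp
union MyUnion {
    int member1;
    char member2;
};

// Returns true if both members begin at the same memory address.
bool members_share_address()
{
    MyUnion MyNewVariable;
    MyNewVariable.member1 = 1280;

    // Taking the address of a member does not read or change its value.
    const void *address1 = &MyNewVariable.member1;
    const void *address2 = &MyNewVariable.member2;

    return address1 == address2;
}
```

This returns true for a union. If you change the keyword "union" to "struct" and try again, member2 gets its own space after member1 and the function returns false.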

Unions were mostly used for multivariable compatibility in the old C days, but can also be combined with structs for a neat trick:

union MyUnion {
    long member1;

    struct MyStruct {
        char byte1;
        char byte2;
        char byte3;
        char byte4;
    } member2;
};


Now, you can store a long number in member1, and use member2 to access the individual bytes in member1 (assuming that a long is four bytes on your compiler; on some systems it is eight). Neat, eh? :P
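One word of caution: the C++ standard does not actually promise that reading one union member after writing another will work, even though most compilers allow it. If you want to play it safe, a different technique (not the union trick) is to copy the bytes out with memcpy. The sketch below assumes nothing about the size of a long; the function names "split_into_bytes" and "join_bytes" are made up for illustration.

```cpp
#include <cstring>

// Copies the raw bytes of "value" into "bytes".
// "bytes" must have room for sizeof(long) characters.
void split_into_bytes(long value, unsigned char *bytes)
{
    std::memcpy(bytes, &value, sizeof(value));
}

// Rebuilds the number from the bytes that split_into_bytes produced.
long join_bytes(const unsigned char *bytes)
{
    long value = 0;
    std::memcpy(&value, bytes, sizeof(value));
    return value;
}
```

To use it, declare a buffer such as "unsigned char bytes[sizeof(long)];", call split_into_bytes(1280, bytes), and look at bytes[0], bytes[1], and so on. Which byte comes first depends on your processor.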



Functions in C++

Do you remember how every C++ program should start with the function "main"?

int main()
{
    // Code here, Blah Blah
    return 0;
}


Did you know that this mysterious "main" is actually a C++ function? Well, C++ functions are declared in a similar fashion:

return_type function_name()
{
    // Code here, Blah Blah
    return return_value;
}


Was that very confusing? Well, let's make a simple function that calculates the value of 3 + 4 to demonstrate:

long add()
{
    return (3 + 4);
}


Do you understand now? The return_type specifies what the function will return (like string, long, char, double, etc.), and the function_name specifies the name of the function. You can insert your function's code in the code here part, and use the "return" keyword to return a value.

To use C++ functions, you must call them, like this:

MyVariable = add();

The above code will assign the value 7 (since 3 + 4 = 7 on Earth) to MyVariable.

But wait...a function would be pretty useless if the user can't specify anything, right? So, let's expand our previous simplified definition of functions to include these parameters:

return_type function_name(type argument1, type argument2, ...)
{
    // Code here, Blah Blah
    return return_value;
}


There. The arguments are basically custom "variables" that the user can pass to your function to specify something (such as the two numbers that should be added, the interest rate of an investment, etc.). With this in mind, let's make the above function more useful by making it add ANY two numbers that the user wants:

long add(long a, long b)
{
    return (a + b);
}


Now the user can add any two numbers that he/she wants. For example, let's assume that you want to add 4 and 5 together and store the result in MyVariable. You would then use:

MyVariable = add(4, 5);

That's quite easy to understand, right? Remember that a function can have as many lines of code as you want - it doesn't always have to be just one "return" statement.
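For instance, here is a sketch of a function with a few more lines in it. The name "sum_up_to" is just something I made up; it uses a loop to add up every whole number from 1 to n.

```cpp
// Adds up every whole number from 1 to n.
// Returns 0 if n is smaller than 1.
long sum_up_to(long n)
{
    long total = 0;

    for (long i = 1; i <= n; i++)
    {
        total += i;
    }

    return total;
}
```

Calling sum_up_to(4) gives 10, since 1 + 2 + 3 + 4 = 10.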

Finally, you can make a function return NOTHING (useful for functions that perform tasks instead of calculations) by making "void" the return_type:

void function_name(type argument1, type argument2, ...)
{
    // Code here, Blah Blah
    return;
}


Obviously, in this case, you can't return a value. As a result, you could simply use "return" by itself to exit the function and return to main (or the calling routine).
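As a rough sketch, a void function with an early "return" could look like this. The global variable "Total" and the function "add_to_total" are made-up names for this example.

```cpp
#include <iostream>

long Total = 0; // Hypothetical running total, shared by the whole program

// Adds "amount" to Total and reports the new value.
// Amounts of zero or less are ignored.
void add_to_total(long amount)
{
    if (amount <= 0)
    {
        return; // Nothing to add, so exit the function right here
    }

    Total += amount;
    std::cout << "Total is now " << Total << "\n";
}
```

Since the function returns nothing, you call it on a line by itself, like "add_to_total(5);", instead of assigning it to a variable.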

BTW: Functions are usually placed on the bottom of your CPP file, right after function main(). However, the compiler reads your file from top to bottom, so when it reaches a call to your function inside main(), it has not seen that function yet, and you will get an error.

If you didn't understand the above statement, that is OK. Just remember that functions usually go after main, and function prototypes are usually needed for functions to work...wait a minute. What is a function prototype?

Well, it's basically a version of the function without any code, like this:

return_type function_name(type argument1, type argument2, ...);

Everything in the prototype must be the exact same as your actual function, except for the fact that it has no coding and is terminated by a semicolon. This prototype will ensure that C++ will know that your function exists, and won't get confused looking for it and throw out an error.

Oh, and these prototypes go before the main function.
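Put together, the layout could look like the sketch below. To keep the example short, a made-up function called "add_three" plays the part of main(): it calls add() before the compiler has seen add()'s code, which only works because of the prototype on top.

```cpp
// The prototype: tells the compiler that add() exists.
long add(long a, long b);

// Stands in for main() in this sketch. It calls add() even though
// add()'s code has not appeared yet.
long add_three(long a, long b, long c)
{
    return add(add(a, b), c);
}

// The actual function, sitting at the bottom of the file.
long add(long a, long b)
{
    return (a + b);
}
```

If you delete the prototype line, the compiler should complain that it does not know what "add" is when it reaches add_three.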



An Introduction to File Input and Output

You probably know what files are, and if you don't...well, a file is a chunk of data stored in your computer. For example, every Word Document is a file, and so is every program, every spreadsheet, every e-mail, every web page, every DrFunk 60-Second Paint Picture™, and...well, you get the picture.

Let's start with the headers. Obviously, file input/output requires its own header, called "fstream":

#include <fstream.h>

Or, if you like, the ANSI version (newer compilers no longer ship the ".h" version, so this is the safer choice):

#include <fstream>
using namespace std;


Now, after that #include is there, we need the actual code for file input/output. So, to do that, we need to declare a new fstream object:

fstream file;

So...what is this fstream class? Well, it gives us all the functions and operators that we need to read and write to files in C++.

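For the sake of this tutorial, I will start with outputting to files. Before you can write to a file, you must first OPEN it. Note that every backslash in the path has to be doubled, because a single backslash inside a C++ string starts a special character (like the "\n" from earlier):

file.open("C:\\Path\\Filename.ext", ios::out);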

Or, if you like, you could do this when you declare the file object, like:

fstream file("C:\\Path\\Filename.ext", ios::out);

The ios::out means that we want to write stuff to the file. You can also use ios::app to add stuff to the end of the file and thus not erase its contents.

After you open a file, the rest is actually quite simple. You can write to files in the same way that you write to consoles:

file << "I want to write this to the file!" << endl;

In fact, you can even use the input/output manipulators here (as long as you #include <iomanip>), like this:

file << "Azure's Idiotic Autobio" << setfill('.') << setw(20) << "Page 5" << endl;

Finally, when you are done, you can close the file, like:

file.close();
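Here is how those pieces might fit together in one function. This is only a sketch: the function name "write_page" is made up, and it takes the file path as an argument so that you can point it at any file you like. It also checks is_open(), a member of fstream that tells you whether the file was actually opened.

```cpp
#include <fstream>
#include <iomanip>

// Writes two lines to the file at "path".
// Returns false if the file could not be opened.
bool write_page(const char *path)
{
    std::fstream file(path, std::ios::out);
    if (!file.is_open())
    {
        return false;
    }

    file << "I want to write this to the file!" << std::endl;
    file << "Azure's Idiotic Autobio" << std::setfill('.')
         << std::setw(20) << "Page 5" << std::endl;

    file.close();
    return true;
}
```

Remember that ios::out wipes whatever was in the file before, so try this on a file that you don't mind losing.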

You could also get stuff from files. The concept is similar to writing stuff, except the open statement is a bit different:

file.open("C:\\Path\\Filename.ext", ios::in);

Or, again, you could do this when you declare the file object, like:

fstream file("C:\\Path\\Filename.ext", ios::in);

Now, just use the variable "file" as if it was "cin":

file >> Variable1 >> Variable2 >> Variable3 >> String1;

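You can also detect if a file has ended (so you could stop trying to input stuff). The "eof" function only becomes true after a read has already run into the end of the file, so the safest way is to test the read itself, and then ask "eof" why it failed:

if (file >> Variable1) // The read worked
{
    // Use Variable1 here
}
else if (file.eof()) // The read failed because the file has ended
{
    // Stop reading
}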
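A common way to read a whole file is to put the read itself inside a loop condition, since "file >> something" counts as false once a read fails. Here is a minimal sketch; the function name "sum_numbers" is made up, and it assumes that the file holds whole numbers separated by spaces or new lines.

```cpp
#include <fstream>

// Adds up every whole number stored in the file at "path".
// Returns 0 if the file cannot be opened or holds no numbers.
long sum_numbers(const char *path)
{
    std::fstream file(path, std::ios::in);

    long total = 0;
    long number;

    // The loop stops as soon as a read fails, which happens at the
    // end of the file (or at text that is not a number).
    while (file >> number)
    {
        total += number;
    }

    return total;
}
```

For a file that contains "3 4 5", the function returns 12.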




Intermediate Stuff: Binary File Input/Output

First of all, what is binary I/O? Well, it's similar to text I/O, but it reads and writes raw bytes exactly as they are stored in the file, instead of formatted text and numbers.

To perform binary I/O, you must first add an ios::binary to your open statement:

file.open("C:\\Path\\Filename.ext", ios::in | ios::binary);

Note that the separator is a SINGLE straight line (Bitwise OR instead of Logical OR, if you are curious).

Now, there are two ways to read and write binary files. One way is character-by-character using the "get" and "put" statements:

file.get(MyCharacter);
file.put('C');
file.put((char) 210);


The "(char)" cast simply changes the number 210 to the 'Ò' character (character #210 in the Windows character set) for compatibility with the "put" statement.

Obviously, reading files character-by-character is quite annoying. In fact, this was the method that I first used when I made my "Rast Unlocker" utility. Unfortunately, the program took hours to read my sample project's .ldb file, which is really only ~100 KB. So...we need a better way to read from these binary files.

Luckily, C++ also offers the "read" and "write" functions, which are used like this:

file.read(buffer, size);
file.write(buffer, size);


So...let's assume that you want to stuff four bytes of binary data into a "long integer" variable (a long is four bytes on many compilers and eight on others, which is why we let sizeof count them for us). Then, you would use:

long MyLongVariable;
file.read((char *) &MyLongVariable, sizeof(MyLongVariable));


So...I guess I definitely have to explain that, right? Well, file.read requires a character buffer (char *) as its first value. Therefore, we have to pass the variable's address (&MyLongVariable) so that "read" could actually modify the variable...but that address is a long*, so we have to change it to a character buffer first using (char *).

The sizeof() operator returns the number of bytes that a variable takes up, which is required by the read and write functions to ensure that the buffer does not overflow.

NOTE: If you did not understand the above explanation, do not worry! Try reading through my pointer tutorial first, and if you still do not understand this section, then just pat yourself on the back and remember that this is intermediate stuff. You will be able to understand it soon enough. I just inserted it here for those people who MIGHT understand the technical stuff.

Write is used in the same fashion as read.
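To show both halves together, here is a rough sketch that saves a long to a binary file and loads it back. The function names "save_long" and "load_long" are made up, and the path is whatever you pass in. Note that the standard write function wants a "const char *" buffer, so the cast is slightly different from the one used with read.

```cpp
#include <fstream>

// Writes the raw bytes of "value" to the file at "path".
// Returns false if the file could not be opened or written.
bool save_long(const char *path, long value)
{
    std::fstream file(path, std::ios::out | std::ios::binary);
    if (!file.is_open())
    {
        return false;
    }

    file.write((const char *) &value, sizeof(value));
    file.close();
    return !file.fail();
}

// Reads the bytes back into a long.
// Returns 0 if the file could not be opened.
long load_long(const char *path)
{
    long value = 0;
    std::fstream file(path, std::ios::in | std::ios::binary);
    file.read((char *) &value, sizeof(value));
    return value;
}
```

A file written this way is only guaranteed to load correctly on the same kind of machine and compiler, since the size and byte order of a long can differ elsewhere.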



Random Access for File I/O

Random Access means that you can access and read from/write to any point of the file. For example, you could read half of a text file, and then jump to the end. You could also change certain characters in the middle of the text file, or swap random words around. The power is with you :).

Random Access works on binary files. If you didn't understand the "read" and "write" functions earlier, then you will have to use the "get" and "put" functions to read and write one character at a time.

The most important functions for random access are "seekg" and "seekp", which move the pointer to a certain place in the file. You use "seekg" to move the get pointer (Read), and "seekp" to move the put pointer (Write).

The functions share a syntax:

file.seek?(offset, origin);

Offset is the number of characters that you want to move, and origin can be either "ios::beg", "ios::cur", or "ios::end". What does this mean? Well, origin specifies either "beginning," "current location," or "end." If origin is "ios::beg", then seek? will move the current location pointer to offset characters away from the beginning. Likewise, seek? will move the current location pointer to offset characters away from the current location if origin is "ios::cur", and offset characters away from the end if origin is "ios::end" (use a negative offset to move backwards).

For example, the following will move the current location pointer (get) six characters right:

file.seekg(6, ios::cur);

And this will move the current location pointer (put) to the end of the file:

file.seekp(0, ios::end);

There are also two functions, "tellg" and "tellp", that will tell you where the current location pointers are:

cout << "The get pointer is currently at " << file.tellg() << endl;
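Here is a small sketch that puts seekg and tellg to work. Both function names ("file_size" and "char_at") are made up for this example, and the path is whatever file you pass in.

```cpp
#include <fstream>

// Returns the size of the file in bytes, or -1 if it cannot be opened.
long file_size(const char *path)
{
    std::fstream file(path, std::ios::in | std::ios::binary);
    if (!file.is_open())
    {
        return -1;
    }

    file.seekg(0, std::ios::end); // Jump to the end of the file
    return (long) file.tellg();   // The position at the end is the size
}

// Returns the character that sits "offset" characters away from the
// beginning of the file, or '\0' if it cannot be read.
char char_at(const char *path, long offset)
{
    std::fstream file(path, std::ios::in | std::ios::binary);
    char c = '\0';

    file.seekg(offset, std::ios::beg); // Jump straight to the character
    file.get(c);                       // c is left alone if the read fails

    return c;
}
```

For a file that contains "abcdef", file_size returns 6 and char_at with an offset of 2 returns 'c' (offsets start counting at zero).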

And that's really all for random access files. I guess I got a little more technical than I needed to, so if you don't understand the last two sections, then don't worry. They are more intermediate stuff that most beginners do not need to understand yet.




Phew! About two and a half hours have passed, and I finally finished another Beginners' C++ Tutorial. The next one will cover the basics of classes and OOP...and don't worry, I won't put any of the more intermediate stuff on it.

So...until then, so long. And make sure to share any mangos that you stumble upon :P.