Class notes
Mathematics 1264: C/C++ programming, Hilary 2015
Colm Ó Dúnlaing
January 27, 2015

Note to student. The course notes will be published on the web, section by section: the exam syllabus will be included some weeks before the end of the semester.

Plan of course (tentative schedule, subject to change)

• Week 1: Hexadecimal numbers, machine code, assembler code, languages. C and C++. Hello world in C and in C++. char, short, int, long, float, double, pointer, arrays, unsigned types. Sign extension.
• Week 2: Programming assignment 1. Integer arithmetic. Variables. Assignment statements (day of week). Quiz 1.
• Week 3: Arrays and initialisation. Programming assignment 2. for-loops and output. While-loops. Command-line arguments. Input through scanf() and fgets() and >>.
• Week 4: Programming assignment 3. If-statements. Functions and subroutines. Simulating functions and subroutines. Quiz 2.
• Week 5: Programming assignment 4. Functions and subroutines continued: more about variables. 2-dimensional arrays. C string library. Pointers, malloc() and calloc().
• Week 6: Programming assignment 5. Structured types in C and C++. Matrix example. Quiz 3.
• Week 7 is Reading Week.
• Week 8: Programming assignment 6. Classes in C++. C++ Standard Template Library (STL).
• Week 9: Programming assignment 7. STL continued. Quiz 4.
• Week 10: Programming assignment 8. Matrix example again. St. Patrick’s Day. C++ armadillo linear algebra library.
• Week 11: Last Programming assignment 9? Files. The cut-and-paste principle. Quiz 5.
• Week 12: Review. Good Friday.

1 Hex numbers, machine code, assemblers, languages

1.1 Octal and hexadecimal numbers

Our decimal number system is derived from the human hand. All computer data is stored as patterns of 0s and 1s. A ‘bit’ is a binary digit, i.e., 0 or 1, or an object which can take these values. There is a multiplicative effect, so that 8 bits combined together can take $2^8 = 256$ different values. A byte is a group of 8 bits.
The binary string 01001000 represents

    $0 + 0 \times 2 + 0 \times 2^2 + 1 \times 2^3 + 0 \times 2^4 + 0 \times 2^5 + 1 \times 2^6 + 0 \times 2^7 = 8 + 64 = 72$

(that is, 72 in decimal, of course).

It is easy to list the binary strings of length 3 in ascending order:

    000, 001, 010, 011, 100, 101, 110, 111

The rightmost bit is called the ‘low-order bit.’ The low-order bit changes most often; the next bit changes half as often; the high-order bit changes only once.

It is easy to convert a bitstring into an octal string. Simply put it in groups of 3, starting from the right. Thus

    01001000
    01 001 000
     1   1   0

On the other hand, interpreting 110 as an octal string we get

    $0 + 1 \times 8 + 1 \times 8^2 = 72$

(again, 72 in decimal). It is a coincidence that all the octal digits in 01 001 000 are 0 or 1, so it ‘looks’ like a binary number. To correct the ambiguity, one can use $(\ldots)_b$ to indicate ‘to base b’. Then without ambiguity

    $(01001000)_2 = (110)_8 = (72)_{10}$

Octal numbers give a compact way to represent bitstrings. So do hexadecimal numbers, which are numbers to base 16. We need 16 hex digits to form hexadecimal numbers. One uses a,b,c,d,e,f (or A,B,C,D,E,F) for the digits ≥ 10. Every hex digit equals four binary digits. The hex digits convert to octal, binary, and decimal as follows

    hex      0    1    2    3    4    5    6    7    8    9    a    b    c    d    e    f
    octal    0    1    2    3    4    5    6    7    10   11   12   13   14   15   16   17
    binary   0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111
    decimal  0    1    2    3    4    5    6    7    8    9    10   11   12   13   14   15

There are procedures for addition, subtraction, multiplication, and division, in binary, octal, and hex. Addition is easy. For example (each calculation is ‘staggered’ from right to left to show the ‘carries’).

    binary

       10100011
      +11010101
      ---------
             10
            10
           10
           1
          1
         1
        1
      10
      ---------
      101111000

    octal

       243
      +325
      ----
        10
        7
       5
      ----
       570

    hex

       a3
      +d5
      ---
        8
      17
      ---
      178

    decimal

       163
      +213
      ----
       376

Multiplication and division in binary involve a fairly large number of very simple steps.
Except for trivial cases, multiplication and division are always ‘long multiplication’ and ‘long division.’ We shall not attempt them by hand.

1.2 Features of a computer

A computer has several components, including central memory, central processor, hard disc, and terminal (or monitor). Long-term data is on the hard disc; the central processor works on short-term data in the central memory.

Here is a C program

    #include <stdio.h>
    main()
    {
      printf("Hello\n");
      printf("there\n");
    }

Create a file hello.c containing the above lines, then run

    gcc hello.c

This will create a file a.out which the computer can run as a program:

    aturing% a.out

(The ‘aturing% ’ is a ‘command-line prompt.’) This will cause the message

    Hello
    there

to be written to the terminal.

Question: what’s the '\n' for?

a.out is in machine code. A computer accepts instructions in a very compact form called its machine code. A machine program (also called a ‘binary’ or ‘executable’) is a list of instructions in machine code. In the 1970s, with small microprocessors, it was common to write programs directly in machine code.

Here is an example of machine code on an Intel computer. (I think that this tabulates the a.out file compiled from the above program, but it may be something different.) Nowadays, most machine-code programs have thousands of lines like these. The instructions are given in hex.
    Memory
    Address   Machine instructions
    --------------------------------------------------------
    00000210  69 6e 5f 75 73 65 64 00 5f 5f 6c 69 62 63 5f 73
    00000220  74 61 72 74 5f 6d 61 69 6e 00 47 4c 49 42 43 5f
    00000230  32 2e 30 00 00 00 02 00 02 00 01 00 00 00 00 00
    00000240  01 00 01 00 24 00 00 00 10 00 00 00 00 00 00 00
    00000250  10 69 69 0d 00 00 02 00 56 00 00 00 00 00 00 00
    00000260  d8 95 04 08 06 05 00 00 d0 95 04 08 07 01 00 00
    00000270  d4 95 04 08 07 02 00 00 55 89 e5 83 ec 08 e8 61
    00000280  00 00 00 e8 c8 00 00 00 e8 f3 01 00 00 c9 c3 00
    00000290  ff 35 c8 95 04 08 ff 25 cc 95 04 08 00 00 00 00
    000002a0  ff 25 d0 95 04 08 68 00 00 00 00 e9 e0 ff ff ff
    000002b0  ff 25 d4 95 04 08 68 08 00 00 00 e9 d0 ff ff ff
    000002c0  31 ed 5e 89 e1 83 e4 f0 50 54 52 68 20 84 04 08
    000002d0  68 c0 83 04 08 51 56 68 84 83 04 08 e8 bf ff ff
                                                        (1.1)

Although letters on the terminal look like ordinary newsprint, say, under close inspection the letters spelling Hello are just patterns of dots.

(Dot-matrix picture of the word ‘Hello’.)

How are these letters stored on a computer? They could be stored as 7 × 5 patterns of zeroes and 1s, where 0 means ‘no dot’ and 1 means ‘dot.’ This would require 35 bits per letter. Instead, all characters are stored as 8-bit patterns under an internationally agreed code, the ASCII code. To learn more, type

    man ascii

The ASCII code for H is 01001000, for e is 01100101, and so on.

Question. The ASCII code for H has octal value 110. The ASCII code for e is 01100101 as a bitstring. What is it in octal? in decimal?

Figure 1 shows the basic computer components.

Conclusions. The computer stores all data as patterns of 0s and 1s, called bitstrings. All characters appear on the screen as patterns of dots.

Central memory, processor, hard disc. When you have edited and saved your program hello.c, it is now stored on the hard disc (in ASCII, of course). It is data. It is the processor which does the work of the computer. Its job is to read instructions from central memory and execute them.
The instructions are contained in executable programs. When you type

    gcc hello.c

the computer copies an executable program called gcc into central memory, then executes that program on the data contained in hello.c. It produces a new executable program which is usually called a.out and stores it on disc. When you type (on ‘jbell’, say)

    jbell% a.out

the computer copies a.out into central memory and executes it, with the results as described.

Figure 1: Parts of a computer (hard disc, central memory, processor, and terminal; the terminal shows ‘Hello’ and ‘there’, each held in memory as a string of 8-bit ASCII codes ending in a newline and a null byte).

2 Anatomy of a C program, and a C++

Here is a Hello, World program in C.

    #include <stdio.h>
    main()
    {
      printf("Hello, World\n");
    }

• The printf() statement prints the message on the terminal (screen, monitor). This action is called output.
• The statement printf() is not ‘part’ of the C language; it is a separate routine whose general properties are in the file stdio.h which is stored in some recognised place in the computer. The #include statement is necessary; otherwise gcc will not recognise the printf() statement. The file stdio.h is called a ‘header file.’ Hence the suffix .h.
• The real business of the program is in the

      main ()
      {
        ....
      }

  Every C program must contain this, called the ‘main routine.’
• The C program should be stored in a file hello.c or something: the .c suffix shows it is a C program.
• gcc hello.c produces an executable file a.out as already discussed.

And here is one in C++

    #include <iostream>
    using namespace std;
    int main ()
    {
      cout << "Hello, World" << endl;
      return 0;
    }

• This time there is no printf() statement; the output is produced by cout << ... etcetera. cout represents the terminal, and the important facts are stored in the file iostream (no .h).
The endl is ‘end-of-line.’ One can use "\n" as in C, or put the \n after ‘World.’
• using namespace std; has to be there (semicolon and all). It involves some complicated ideas which we ignore for now.
• The main() routine is presented slightly differently, with the int and the return 0;. This is not of much interest.
• This should be stored in a file like hello.cpp. The .cpp suffix indicates a C++ program. To compile it, use g++ rather than gcc:

      aturing% g++ hello.cpp

• The executable program will be in a.out, just as with the C program.

3 Various types of computer data

Machine instructions generally manipulate data stored in central memory. Data is organised as follows (there is some repetition here).

• The fundamental unit of data is a bit, something which can have two values, 0 or 1. Central memory is a very large collection of bits, possibly billions.
• Before the 1970s central memory was composed of many (about a million) small doughnut-shaped magnets threaded together with copper wire and called magnetic core memory. Hence the word ‘core’ used to mean central memory, and ‘core dump’ for a display of the contents of central memory (usually following a program crash). Nowadays, billions of bits of memory are stored on a single chip.
• Bits are never read singly. Memory is grouped into 8-bit units called bytes. Each byte then can have $2^8 = 256$ values. A byte then corresponds to a number in the range 0 (00000000) to 255 (11111111).
• The ASCII character set maps all printable characters, such as 0, a, &, *, to byte values. It also covers nonprintable characters such as carriage return, backspace, ctrl-U, etcetera (ctrl-G is 07 in hex; it should make a sound when pressed, or printed).
• As far as I know, the smallest piece of data in C is a single byte, and the keyword is char because of the ASCII conventions. In other words, when you need to present data byte-by-byte in a C program, you will use the word char.
• Next is short (short integer).
In our system this appears to be two bytes with 65536 different values. In the 1990s the default integer length was 16 bits (short). Now that memory is much more abundant, the default is 32 bits.
• Next is int (integer), 32 bits. The range is from −2147483648 to 2147483647. About ±2 billion.
• Next is long. On 32-bit machines this appears to be 4 bytes, on 64-bit machines this is 8 bytes.
• Memory addresses are important in C. There is no special keyword for ‘memory address’ (they are introduced in another way) but all memory addresses occupy 4 bytes or 8 bytes. On 32-bit machines the range is $0 \ldots 2^{32} - 1$. The highest memory address is 4294967295, so at most 4 gigabytes can be addressed.
• Next is float. In our system this appears to be a 4-byte representation of floating-point numbers.
• Next is double. In our system this appears to be an 8-byte representation of floating-point numbers.

The following program shows the size (number of bytes) in each data type. It uses features of C which will not be introduced until late in the term.

    #include <stdio.h>
    main()
    {
      /* sizeof gives an unsigned size; the (int) makes it match %d */
      printf("char %d bytes\n", (int) sizeof(char));
      printf("short %d bytes\n", (int) sizeof(short));
      printf("int %d bytes\n", (int) sizeof(int));
      printf("float %d bytes\n", (int) sizeof(float));
      printf("long %d bytes\n", (int) sizeof(long));
      printf("double %d bytes\n", (int) sizeof(double));
      /*
       * Working with addresses is an advanced topic.
       * Just to give a foretaste,
       * ‘short *’ means ‘address of a short integer’,
       * ‘int *’ means ‘address of an int’, and
       * so on. All these addresses are 4 or 8 bytes,
       * depending on the machine.
       */
      printf("address of char %d bytes\n", (int) sizeof(char * ));
      printf("address of short %d bytes\n", (int) sizeof(short * ));
      printf("address of int %d bytes\n", (int) sizeof(int * ));
      printf("address of float %d bytes\n", (int) sizeof(float * ));
      printf("address of long %d bytes\n", (int) sizeof(long * ));
      printf("address of double %d bytes\n", (int) sizeof(double * ));
    }

Output when run on my 32-bit office PC:

    char 1 bytes
    short 2 bytes
    int 4 bytes
    float 4 bytes
    long 4 bytes
    double 8 bytes
    address of char 4 bytes
    address of short 4 bytes
    address of int 4 bytes
    address of float 4 bytes
    address of long 4 bytes
    address of double 4 bytes

Output when run on the 64-bit machine aturing:

    char 1 bytes
    short 2 bytes
    int 4 bytes
    float 4 bytes
    long 8 bytes
    double 8 bytes
    address of char 8 bytes
    address of short 8 bytes
    address of int 8 bytes
    address of float 8 bytes
    address of long 8 bytes
    address of double 8 bytes

Here are the internal representations of various numbers (most of them need explanation). Each row shows ten bytes in memory order; the bytes beyond the size of the type are zero padding.

    char z:              7a 00 00 00 00 00 00 00 00 00
    short 43:            2b 00 00 00 00 00 00 00 00 00
    short -43:           d5 ff 00 00 00 00 00 00 00 00
    short -9:            f7 ff 00 00 00 00 00 00 00 00
    short -32768:        00 80 00 00 00 00 00 00 00 00
    short 32767:         ff 7f 00 00 00 00 00 00 00 00
    int -2:              fe ff ff ff 00 00 00 00 00 00
    int 300:             2c 01 00 00 00 00 00 00 00 00
    int -300:            d4 fe ff ff 00 00 00 00 00 00
    int 70000:           70 11 01 00 00 00 00 00 00 00
    int -70000:          90 ee fe ff 00 00 00 00 00 00
    int -2147483648:     00 00 00 80 00 00 00 00 00 00
    int 2147483647:      ff ff ff 7f 00 00 00 00 00 00
    long -3:             fd ff ff ff 00 00 00 00 00 00
    float 1234.560059:   ec 51 9a 44 00 00 00 00 00 00
    double 1234.560000:  0a d7 a3 70 3d 4a 93 40 00 00
    string hello:        68 65 6c 6c 6f 00 00 00 00 00

4 Integer arithmetic

(4.1) 2s complement. A short integer is 4 hex digits, 16 bits, or 2 bytes long so it can represent at most $2^{16} = 65536$ different integers. We might expect it to take values 0 to 65535, but instead half of the values are negative. The range of values is from −32768 to 32767.
Notice that 43 decimal is represented as 2b 00 hex. This shows that on our machines the first byte is low-order, the second is high-order. It is said humorously that on Intel processors, numbers are stored little-endian, meaning that the low-order byte (but not the low-order bit) is stored before the high-order byte. We should preferably write it with high-order byte first:

    00 2b

This represents $2 \times 16 + 11 = 43$, as expected.

Next, −9 is represented as f7 ff, or, high-order byte first, ff f7. Normally ff ff would represent $2^{16} - 1$ and ff f7 would be $2^{16} - 9$. The general rules are as follows.

• Let $N = 2^{15}$. (The same idea holds for 32-bit and 64-bit integers, except that then $N = 2^{31} = 2{,}147{,}483{,}648$ and $N = 2^{63} = 9{,}223{,}372{,}036{,}854{,}775{,}808$ respectively.)
• An integer x is in short integer range if $-N \le x \le N - 1$.
• If $-N \le x \le N - 1$, then the 2s-complement form of x is

      $x' = \begin{cases} x & \text{if } 0 \le x \le N - 1 \\ 2N + x & \text{if } -N \le x \le -1 \end{cases}$

  Thus a 2s-complement short integer has ‘face value’ between 0000 and ffff (hexadecimal) or 0 and 65535 (decimal), and the signed integer it represents is in the range −32768 . . . 32767.
• If $x'$ and $y'$ are 2s-complement integers, then their 2s-complement sum is $(x' + y') \bmod 2N$, i.e., $(x' + y') \bmod 65536$.
• Modular arithmetic: x mod y is the remainder on dividing x by y. For example, 11 mod 4 = 3.

(4.2) Proposition. Let x and y be two integers within the range of short integers, i.e., $-32768 \le x, y \le 32767$. If x + y is also in this range, then 2s-complement addition will produce the correct answer in 2s-complement form.

Partial proof. If x and y are both nonnegative, then $0 \le x + y < 2^{15}$ and there is no carry to the 16th bit.

If x and y are both negative, and x + y is in range, then $2^{16} > 2^{16} + x + y \ge 2^{15}$. But $(2^{16} + x) + (2^{16} + y) = 2^{16} + (2^{16} + x + y)$. The remainder modulo $2^{16}$ is $2^{16} + x + y$, and it is between $2^{15}$ and $2^{16} - 1$, which is correct.

Case one negative, the other nonnegative: skipped.

(4.3) Converting decimal to short.
Positive numbers can be converted to hexadecimal by repeatedly dividing by 16. For example, to convert 12345 to hex,

    12345 ÷ 16 = 771 remainder 9, i.e., 12345 = 16 × 771 + 9
    771 ÷ 16 = 48 remainder 3, i.e., 771 = 16 × 48 + 3
    48 ÷ 16 = 3 remainder 0, i.e., 48 = 16 × 3 + 0

    $12345 = 16 \times (16 \times (16 \times 3 + 0) + 3) + 9$
    $12345 = 16^3 \times 3 + 16^2 \times 0 + 16 \times 3 + 9$
    $(12345)_{10} = (3039)_{16}$

To convert a negative integer x to 2s-complement, first convert |x| to hex, then subtract from ffff, then add 1. This is the same as subtracting from $2^{16}$, as required. For example, to convert −12345 to 2s-complement short integer,

    $(12345)_{10} = (3039)_{16}$
    ffff − 3039 = cfc6
    cfc6 + 1 = cfc7
    Little endian: c7 cf

Negatives. If x is in short integer range, and so is −x, and the short integer representation of x is y, then −x is represented as $2^{16} - y$, whether x is positive or negative. The only case where x is in range, but not −x, is x = −32768, where $y = 2^{16} - y = 32768$.

(4.4) Floating point numbers are the computerese version of high-precision decimal numbers. They can be broken down into exponent e and mantissa m (both integers) and represent $m \times 2^e$. Further details later.

(4.5) To distinguish between hex and decimal, we write, for example, $(23)_{16} = (35)_{10}$.

5 Variables

An integer variable in C or C++ is a named item stored as an integer. It must be declared. Its value can be altered through assignment statements.

    #include <stdio.h>
    main()
    {
      int x,y;
      x = 1;
      y = 2;
      printf("x is %d, y is %d, x+y is %d\n", x, y, x+y);
    }

This example shows two integer variables. To output any information about them, you need the formatting in the printf() statement. It is easy to guess what it prints.

The string "x is %d, y is %d, x+y is %d\n" is called a format string. The values of x, y, x + y are inserted into the three places where %d occurs. There are two variants of the %d format.

• %8d will insert an 8-digit number, right justified, padded on the left with blanks.
If the number is already at least 8 digits long, the 8 in %8d has no effect.
• %08d is like %8d, but it pads with zeroes, not blanks.

6 For loops

Here is a simple program.

    #include <stdio.h>
    main()
    {
      int i;
      for ( i=0; i<5; i = i+1 )
        printf( "hello\n");
    }

The output is

    hello
    hello
    hello
    hello
    hello

• Every C program must contain one section main(){ ...}.
• This program uses one variable, an integer i.
• printf() prints to the terminal. It is essential, but it is not part of the C language proper. The line #include <stdio.h> tells gcc that there is a file (somewhere) called stdio.h which needs to be included. It helps explain the printf() statement.
• The text "hello\n" is called a character-string constant. It includes the newline (or carriage-return) character \n.
• The statement i = i+1; means replace the variable i (stored somewhere in central memory) by the new value i+1. There is a shorter way to write this:

      ++i;

  This abbreviation should be used with care: it is more complicated than it looks.
• The for (...) ... statement is called a for-loop. It operates as follows.
• i is set to 0, then compared to 5.
• 0 < 5, so the statement printf("hello\n"); is executed.
• i is incremented to 1, and again compared to 5.
• 1 < 5, so the print statement is executed.
• And so on, with i = 0, 1, 2, 3, 4. Then i is incremented to 5, 5 is not < 5, so the loop terminates and the program terminates.

TEMPLATE for a for-loop

    for ( <initial action> ;
          <while condition holds true> ;
          <between-step action> )
      <statement> ;

    OR, in place of the single statement,

    {
      <group of statements>
    }

Indentation. A group of statements should be indented further than the curly braces, which should be level with the ‘for.’ A single statement should be indented further than the ‘for.’ (Indentation makes it easier to understand the program structure.)

We can have a single statement printf("hello\n"); or a group of statements, each terminated by semicolon, and the group between braces (see below).

BEST PRACTICE.
It is better to group statements between braces, even when there is only one.

Semicolons. There must be a semicolon after each statement, including the last in a group.

While condition is true: Only the condition is given, such as i<7. The symbol < means ‘less than,’ of course. Other relations include

    Mathematical form   ≤    =    ≥    >    ≠
    C form              <=   ==   >=   >    !=

Here is another example.

    #include <stdio.h>
    main()
    {
      int i, j;
      for ( i=0; i<5; ++i )
      {
        for ( j=0; j<i; ++j )
          printf ( " " );
        printf( "hello\n");
      }
    }

The output is

    hello
     hello
      hello
       hello
        hello

Printing strings and integers. The general printf statement has the form

    printf ( <format control string>, item_1, ... item_n );

The minimal possibility is where no items are printed, just the format string. This was used in printf("hello\n");. More generally, the items are matched with parts of the control string to produce a formatted output. (Hence the f, for formatted, in printf.)

If the item is a character string like "hello\n", it should be matched by %s. If it is an integer (or a short integer), it should be matched by %d. For example, the following code prints out a multiplication table.

    #include <stdio.h>
    main()
    {
      int n, i;
      n = 7;
      printf("%d times table\n\n", n);
      for ( i=0; i<10; ++i )
        printf("%d times %d is %d\n", n, i, n*i );
    }

Note: n*i means n × i. The output is

    7 times table

    7 times 0 is 0
    7 times 1 is 7
    7 times 2 is 14
    7 times 3 is 21
    7 times 4 is 28
    7 times 5 is 35
    7 times 6 is 42
    7 times 7 is 49
    7 times 8 is 56
    7 times 9 is 63

It would look better if the 0 and 7 in the first two lines were aligned with the right-hand ends of the lines below them. This can be done with the statement

    printf("%d times %d is %2d\n", n, i, n*i );

The general %d-format rules are

• %d causes an integer (or short integer) value to be converted to the shortest possible ASCII string and printed.
• %5d causes an integer to be converted to an ASCII string of length ≥ 5 and printed.
If (counting digits and possible minus sign) there are < 5 ASCII characters, it is padded on the left with blanks.

There are two more variations.

• %07d causes an integer to be converted to an ASCII string of length ≥ 7 and printed. If necessary, it is padded with zeroes on the left. If negative, the zeroes come after the minus sign, of course.
• %-07d is like %07d except padding is with blanks on the right. The minus sign is about alignment, and has nothing to do with the fact that numerical data is being printed. Note that it cancels out the zero-padding!
• Summary. Integers are formatted in printf statements by including the following in the format control string. Special notation: the square brackets are for optional items. The angle brackets are for descriptions.

      %[-][0][<minimum width>]d

  Examples: %d %3d %-3d %010d %-10d

Rules for formatting a character string are simpler. Strings are formatted in printf statements by

    %[-][<minimum width>]s

Examples: %s %10s %-10s

7 Printf formats tabulated, first draft

    %d      signed decimal output
    %8d     right justify with blanks
    %08d    right justify with zeroes
    %-8d    left justify
    %c      (ascii) character
    %s      string
    %8s     right justify
    %-8s    left justify
    %x      hexadecimal
    %o      octal

8 Assignment statements with modular arithmetic

Arithmetic assignment statements use the following operators (and more):

    + - * / %

• ‘*’ stands for multiplication, of course.
• ‘/’, when applied to integers, means integer division, that is, it is rounded to an integer.
• ‘%’ means integer remainder on division, and is a variant of the mathematical ‘remainder modulo’ operator.
• When m and n are both positive, then m/n is the quotient (rounded down) and m%n is the remainder, which is nonnegative.
• When m is negative and n positive, then m/n is rounded up and m%n is negative (or zero), which is not the same as the mathematical form.
• One needs to allow for this.
For example, assuming n is positive and m is nonnegative, (m − 1) mod n should be converted to

      (m+n-1)%n

• Division and remainder when n is negative: whatever the rules are, they are not worth remembering; there is no reason one should ever want to perform integer division by a negative number.

Example. Given a date in the form dd mm yy, where these three numbers are positive integers in the correct ranges, and it is understood that the date is in this century, then the following expression gives a number between 0 and 6 where 0 means Sunday and 6 means Saturday:

    $\left( yy + \left\lfloor \frac{yy}{4} \right\rfloor + \text{offset}[mm - 1] + dd + 6 + C \right) \bmod 7$

where offset[] is an integer array. This hasn’t been introduced yet, but its usage is rather intuitive.

    int offset[12] = {0,3,3,6,1,4,6,2,5,0,3,5};

The constant 6 is the adjustment for the years 2000 to 2099 (for example, 27 January 2015 gives 15 + 3 + 0 + 27 + 6 = 51, and 51 mod 7 = 2, a Tuesday), and C is a correction for leap years: in January and February of a leap year, subtract 1, because the extra day only ‘kicks in’ in March.

Allowing some leeway with integer arrays, you know enough now to convert the above formula to C or C++ code, except for the ‘correction term’ C, which requires an if-statement, not yet introduced.

9 Command-line arguments

To be able to supply your program with command-line arguments, add a bit to your main() section. First, more notation about characters and character strings.

• A single character (as opposed to a ‘string’ of characters) is represented with a single quote, such as 'a', 'A', '\n', '\0'.
• The null character is represented as '\0', 8 zero-bits or 00000000.
• A character string is an array of characters, terminated with a null character. For example, "hello\n" is stored as an array of seven characters, including the final null character.
• For technical reasons, a character string may be declared using a * rather than a [] notation, e.g., char * x

    #include <stdio.h>
    main ( int argc, char * argv[] )
    {
      int i;
      for (i=0; i<argc; ++i)
        printf ( "%s\n", argv[i] );
    }

If one compiles this program and types

    a.out a quick brown fox

the four character strings "a", "quick", "brown", and "fox" are called command-line arguments. The result is

    a.out
    a
    quick
    brown
    fox

Partial explanation. You can, as shown, use argc as you would use an integer variable. It means the number of character strings on the ‘command line,’ including the a.out. The minimum value is 1.

The variable argv is an array of character strings. Its size is not given, but argv[i] is the i-th command argument, valid for i between 0 and argc-1.

The command-line arguments are character strings, but they can be converted to integers, etcetera, through another #include:

    #include <stdlib.h>

If x is a character string, then atoi (x) is the integer value of x. If x does not represent an integer then atoi(x) is just zero. For example,

    #include <stdio.h>
    #include <stdlib.h>
    main ( int argc, char * argv[] )
    {
      int dd, mm, yy;
      dd = atoi ( argv[1] );
      mm = atoi ( argv[2] );
      yy = atoi ( argv[3] );
      printf ("Date is %02d/%02d/%02d\n", dd, mm, yy);
    }

10 If-statements

(10.1) Conditions and if-statements. An if-statement has the form (mind the INDENTATION)

    if ( <condition> )
      <statement; or {group}>

and, optionally,

    else
      <statement; or {group}>

The condition must be in parentheses.

    if ( <condition> ) . . .

Programming languages usually use the word ‘then.’ C doesn’t. The condition is in parentheses and ‘then’ is understood.

Statement or group of statements? It is best practice to use curly brackets always, as otherwise one gets into a mess. (If I forget to do so, remind me.)

    if ( x == 1 )
    {
      printf ("hello\n");
    }
    else
    {
      printf ("goodbye\n");
    }

Conditions are converted to integers.
In a.out a condition such as argc == 2 is tested and an integer produced: 1 for true and 0 for false. More generally, any integer value can be used as a condition; nonzero is treated as true and zero as false.

Complex if-statements. The basic ‘if-statement’ relations are

    ==, <, <=, >, >=, !=

They can be grouped into more complex statements using && for ‘and,’ || for ‘or,’ and ! for ‘not.’ For example, to test if a 4-digit year is a leap-year,

    if ( yy % 400 == 0 || ( yy % 4 == 0 && yy % 100 != 0 ) )

Every fourth year is a leap year, except for centuries; every fourth century is a leap year.

More complex conditions can be constructed with && || ! for and, or, not. The DOUBLE ampersand and double bar are important; single ampersand and single bar have a different meaning.

For example, suppose yy represents a year, including the century, not just the last two digits. According to the Gregorian calendar, a leap year is

• divisible by 4, and
• either is not divisible by 100 or is divisible by 400.

This means that only one century in 4 is a leap-year; so on average the year is $365\frac{97}{400}$ days long, apparently a good approximation.

This can be expressed in C:

    ...
    int leapyear, yy;
    ....
    leapyear = yy % 4 == 0 && ( yy % 100 != 0 || yy % 400 == 0 ) ;
    if ( leapyear )
      ....

There are rules about the order of evaluation in the expression

    yy % 4 == 0 && ( yy % 100 != 0 || yy % 400 == 0 )

To be really sure, you can fully parenthesise the expression, getting

    (yy % 4 == 0) && ( (yy % 100 != 0) || (yy % 400 == 0) )

There are certain rules about order of evaluation, but it’s hard to remember them all. Better safe than sorry.