0% found this document useful (0 votes)
2 views

06-arrays

The lecture covers advanced topics in memory management and arrays, focusing on improving the efficiency of prime number calculations using C programming. It discusses the importance of arrays, memory access, and potential pitfalls like segmentation faults when accessing out-of-bounds memory. The session also introduces the sieve of Eratosthenes for finding prime numbers and emphasizes the concept of memory-bound functions.

Uploaded by

carlotta.hoelzle
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
2 views

06-arrays

The lecture covers advanced topics in memory management and arrays, focusing on improving the efficiency of prime number calculations using C programming. It discusses the importance of arrays, memory access, and potential pitfalls like segmentation faults when accessing out-of-bounds memory. The session also introduces the sieve of Eratosthenes for finding prime numbers and emphasizes the concept of memory-bound functions.

Uploaded by

carlotta.hoelzle
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 60

PT1 today’s lecture will be quite a bit harder and faster already

memory and arrays


professor patrick baudisch
pt1
hasso-plattner institute
anyone else who needs access to Monday’s “error & exit” lecture?
we already talked about
bridging the gap between pt1 and your first coding job

progress so far: seven students showed up and their task is to write simple apps
and games for dualPanto in Unity

next step: Unity game jam on dualpanto with Eva Krebs and Corinna Jaschek
tomorrow Friday 1pm (or join at 3pm)

anyone else feels like they need a boost to get into a coding job? join now
it would be sufficient to
#include <stdio.h>

remember iPrime() + test bed?


int isPrime(int n)

divide by primes only ☹ let’s run this for 30.000…


{
int i;
for(i = 2; i <= n/2; i++)

< 1sec… pretty fast


if(n % i == 0)
return 0;
return 1;
} let’s run this for 300.000…
int main(void) { 7sec… well, still pretty ok, right
int n;
printf("Enter a positive integer: "); probably just the printf()
fflush(stdout);
scanf("%d", &n);
nope, still 7 seconds
int i;
for (i = 2; i < n; i++)
what if we ran this for 3.000.000?
if (isPrime(i))
printf("%d is prime\n", i);
what’s the issue/how to improve?
return 0;

<30sec brains>
}

; // printf("%d is prime\n", i); let’s “comment it out”


if we want to divide by prime numbers only, we need to somehow need remember
the earlier prime numbers. How many variables will we need for this?

later today, we will rewrite isPrime() so that it runs… fast!


(at the expense of having lot’s of new issues)
interesting

<30sec brains>
memory
let’s talk about accessing RAM

random-access memory ::
allows data items to be read or written in
almost the same amount of time irrespective
of the physical location of data inside the memory.
00001 inside:
enumerated memory cells
00000
00016

so far have we accessed (00068, but no need to know)


memory using… int i

this gives a name to one cell of idea: address n cells with one variable…
memory
let’s use a single variable name, say a,
if we want to store prime plus a length, say a[135],
numbers, we need to access to say that we want to use the name ‘a’
n memory locations… to refer to 135 cells of memory
with a single variable name?

<30sec>
arrays
array (general term) ::
a systematic arrangement of similar objects, usually in rows and columns.
each element will hold an int reserve memory for 8 elements

int a[8]; 7
a[0] = 7;
address the first element write a 7 into it

german: feld

array (computing) ::
a data type that is meant to describe a collection of elements,
each selected by one or more indices.
#include <stdio.h>
int a[8];
a[0] = 7;
int main(void) { quick warm-up exercise
1. define an int array with 10 elements
2. write a 0 into element 0, a 1 into element 1…
3. print the contents of the array

return 0;
}

<3min REPL>
#include <stdio.h>

int main(void) {
start at 0 not ‘<=’ because array started at 0
int a[10];
for (int i = 0; i < 10; i++)
a[i] = i;
for (int i = 0; i < 10; i++)
printf("a[%d] = %d\n", i, a[i]);
return 0;
}

<30sec brainstorming>
666
writing past the end
of the array you mean: you
what will happen? expect C to check
an error message? bounds for you?
int a[100]; nope.
for (int i = 0; i < 666; i++) this is C, after all
a[i] = i;

what is going wrong here, what will happen?

<30sec brainstorming>
00001

00000
00016
(address of a)
int a[100]

we reserved this…

…but now we are overwriting whatever is


here (hopefully just data…)
= C’s worst safety issue)
let’s have some fun…
#include <stdio.h> memory management says
“segmentation fault” what might
int main(void) { that be?
int i;
int a[10] = {0}; let’s put a big number here…
int b[10] = {0}; what will happen now?

b[15] = 666;
b[15] = 666; I am wondering what will happen? oh!
for (i = 0; i <= 9; i++)
for printf("a[%d]= %d\n",
(i = 0; i <= 9; i++)i, a[i]);
printf("done\n");
printf("a[%d]= %d\n", i, a[i]);
return 0;

<30sec brains>
}
segmentation fault ::
says that the program has attempted to access a "restricted" area of memory,
which is pretty much any memory not allocated to the program
one common approach to reducing the risk

#define size_of_a 100


int a[size_of_a];

for (int i = 0; i < size_of_a; i++)


a[i] = rand();

storing array sizes/boundaries in a MACRO or Const is an opportunity to


1. to keep code consistent and reduce the risk of overflow bugs
2. give the size a name that documents its purpose
3. collect lots of #define (and const) in one space for easy configuration

give it a try, but (if you need it just once) do not be dogmatic about it
initializing
arrays
note the curly brackets

int a[5] = {1,2,3,4,5}; // a = [1,2,3,4,5]

int b[] = {1,2,3,4,5}; // b = [1,2,3,4,5]


// thus works int b[10] = {0};
int c[10] = {1,2,3,4,5};
// c = [1,2,3,4,5], rest set to 0

int d[5] = {1,2,3,4,5,6,7,8,9,10};


// d = [1,2,3,4,5]
main.c:9:1: warning: excess elements in array initializer
[enabled by default]

do you like that this is just a warning (in the light of the error and exit lecture?)
my 2ct: C is great, but you know…
what does it do?
<vote>
#include <stdio.h>
int a[5] = {1,2,3,4,5}; // a = [1,2,3,4,5]
int b[] = {1,2,3,4,5};
int c[10] = {1,2,3,4,5};
d[0] = 1
int d[5] = {1,2,3,4,5,6,7,8,9,10};
d[1] = 2
int main(void) {
int i;
d[2] = 3
// Disable stdout buffering d[3] = 4
setvbuf(stdout, NULL, _IONBF, 0); d[4] = 5
for (i = 0; i <= 10; i++) { d[5] = 0
printf("d[%d] = %d\n", i, d[i]); d[6] = 946082336
fflush(stdout);
d[7] = 32687
}
d[8] = 0
printf("Done\n");
return 0;
d[9] = 0
} d[10] = 0
Done

whatever was in memory there; 6-10 were never written


arrays as parameter
int sumUpFiveElements(int a[5]) {
return a[0] + a[1] + a[2] + a[3] + a[4];
}

int sumUpFiveElements(int a[]) {


return a[0] + a[1] + a[2] + a[3] + a[4];
}

how do you write sumUpElements() for any number of elements?


int sumUp(int a[], int size) {
int i, sum = 0;
for (i = 0; i < size; i++)
sum += a[i];
return sum;
}
ok?
<brainstorming>
int sumUpFiveElements(int a[5]) {
return a[0] + a[1] + a[2] + a[3] + a[4];
}

is invoking this function slow or fast?

a programming language could


[ ] copy the array (aka “pass the array by value”) or
[ ] just tell the called function were it is (aka “by reference” (kind-of))

given what you know about C, which one would you guess it is?
can you write a quick program that allows you to test your assumption?

<30sec brainstorming>
go do it

general approach: pass array into function, modify an element, see if modified.

#include <stdio.h>
void modifyFirstElement(int array[3]){
array[0] = 2;
}
int main (void) {

<3min repl>
int b[3] = {1,1,1};
modifyFirstElement(b);
printf("b[0] = %d\n", b[0]);
b[0] has changed, thus…
return 0;
C passes arrays “by reference”
}
(because it is fast)
this is passed by value

int square(int x) {…}


statements are “executed”,
call by value :: expressions are “evaluated”
same difference.
is the most common evaluation strategy, used in languages as different as C and Scheme. In call by value, the argument expression is evaluated, and the resulting value is bound to the corresponding variable
in the function (typically by copying the value into a new memory region).
call by reference ::
an evaluation strategy where a function receives an implicit reference to a variable used as argument, rather than a copy of its value.
This typically means that the function can modify (i.e. assign to) the variable used as argument—something that will be seen by its caller. Call by reference can therefore be used to provide an additional
call by reference ::
actually, C does not really support call by reference. It simulates it using pointers (objects representing the memory addresses of other objects, see later today).
In C this may cause memory safety errors such as null pointer dereferences, and also may be confusing.

http://stackoverflow.com/questions/33622787/are-pointers-considered-a-method-of-calling-by-reference-in-c
In languages that support true "call by reference", such as C++, we can write
swap(x,y); // does not work in C

and the fact that x and y are passed as pointers


is only declared in the definition of swap, e.g.,
void swap(int &x, int &y); // cannot say that in C

http://stackoverflow.com/questions/17423218/diff-between-call-by-reference-and-call-by-pointer
!
int sumUpFiveElements(int a[5]) {
return a[0] + a[1] + a[2] + a[3] + a[4];
}

int sumUpFiveElements(int a[]) {


return a[0] + a[1] + a[2] + a[3] + a[4];
}

how do you write sumUpElements() for any number of elements?


int sumUp(int a[], int size) {
int i, sum = 0;
for (i = 0; i < size; i++)
sum += a[i];
return sum;
} passing a[] does not pass a[].
It just passes the address of the first element

<ah, this is why>


make this a black-and-white image as part of touch detection

passes a pointer to the image = fast

void threshold(float image[], float thresh, const int w, const int h)


{
for (int y = 0; y < h; ++y) {
for (int x = 0; x < w; ++x) {
if (image[y * w + x] > thresh) {
image[y * w + x] = 1.0f;
} else {
image[y * w + x] = 0.0f;
}
}
}
}

call by reference makes computer vision practical


#include <stdio.h>
int main(void) {
int i;
int a[10] = {0}; just for once we pushed it too far.
int b[10] = {0}; (but give us a week, and we’ll work around it… ;))

b = a; main.c:8:4: error: incompatible types when


assigning to type 'int[10]' from type 'int *'
for (i = 0; i <= 9; i++)
printf("b[%d]= %d\n", i, a[i]);
return 0;
}
will this compile?

<30sec brainstorming> http://stackoverflow.com/questions/7882735/why-cant-i-assign-arrays-as-a-b


ready for isPrime()
example: sieve of Eratosthenes
finding prime numbers by crossing out multiples of found primes
C

how to store this sieve?


e.g., int sieve[100]; // as boolean <20sec
brains>
// some other time we will save memory
// by packing Booleans more tightly
#define SIEVE_SIZE 30000
#define FALSE 0
#define TRUE 1

void eratosthenes(int sieve[], int sieveSize) int main(void)


{ {
int i;
int sieve[SIEVE_SIZE];

eratosthenes(sieve, SIEVE_SIZE);

for (i = 0; i < SIEVE_SIZE; i++)


if (sieve[i])
printf("%d is prime\n", i);

return 0;
}
}
#define SIEVE_SIZE 30000 make 300,000
#define FALSE 0 pretty quick!
#define TRUE 1

void eratosthenes(int sieve[], int sieveSize) int main(void)


{ {
int i, j; int i;
sieve[0] = FALSE; int sieve[SIEVE_SIZE];
sieve[1] = FALSE;
eratosthenes(sieve, SIEVE_SIZE);
for (i = 2; i < sieveSize; i++)
sieve[i] = TRUE;
for (i = 0; i < SIEVE_SIZE; i++)
if (sieve[i])
for (i = 2; i * i < sieveSize; i++) ; // printf("%d is p.\n", i);
printf("%d is prime\n", i);
if (sieve[i])
for (j = i*i; j<sieveSize; j+=i)
return 0; comment out
sieve[j] = FALSE;
}
}
make 3,000,000
“exited with non-zero status”/
“Segmentation fault: 11” …what?

let’s get back to this question later…

<30sec brainstorming>
we turned isPrime() into a …

memory bound function ::


situation in which the time to complete a given computational problem is decided primarily by the amount of memory required to hold data.
another
example
let’s implement one of the most important data structures in programming…
the stack
#include <stdio.h>
// define stack as an array here
// remember what stack is pointing to
// ok to leave out all checks and bounds

void push(int i) {

int pop(void) {

int main(void) {
push(10); push(20); push(30);
printf("stack holds %d, %d, %d\n", pop(), pop(), pop());
return 0;
}
#include <stdio.h>
int stack[255];
int top = 0;
// ahhh... leaving out all checks and bounds

void push(int i) {
stack[top++] = i;
}
i++ tends to be
int pop(void) { used together with --i
return stack[top--]; (not with i--)
}

int main(void) { (also: function parameters are not


push(10); push(20); push(30); evaluated in a defined order in C)
printf("stack holds %d, %d, %d\n", pop(), pop(), pop());
return 0;
}
find the bug
<brainstorming>
#include <stdio.h>
// ok to leave out all checks and bounds

void push(int stack[], int *top, int i) {


stack[(*top)++] = i;
}

int pop(int stack[], int *top) {


return stack[--(*top)];
}

int main(void) {
int stack[255]; versions without global variables
int top= 0;
push(stack, &top, 10); pass stack and top as parameters
push(stack, &top, 20); (look at this version when you have
push(stack, &top, 30);
printf("stack holds %d, %d, %d\n", learned about pointers)
pop(stack, &top),
pop(stack, &top),
pop(stack, &top));
return 0;
}
function calls in C (and most other languages are implemented using a stack)

can you guess how this works?

<60sec brainstorming>
so what is a “frame”…?
(stack here shown as growing upwards)
#include <stdio.h>

int main(void) {
int i;
int a[10] = {0};
int b[10] = {0};

b[15] = 666;
we were wondering what would happen
for (i = 0; i <= 9; i++)they are next to each other on the stack
printf("a[%d]= %d\n", i,stack grows downwards, arrays upwards
a[i]);
printf("done\n");
return 0;
}
how to make program
goto anywhere?

<brains>
more geeky details…
so how big is the stack in the repl machines (in kilo bytes)?

when we ran sieve with 3,000,000 we got this


“exited with non-zero status”/
“Segmentation fault: 11” …what?

aka “stack overflow”

how much RAM did we use up (assuming int = 4 bytes)?


3Mio x sizeof(int),… probably 12MB

so the stack on the repl machines must be < 12MB

<30sec brainstorming>
if you really want to know
#include <sys/resource.h>
#include <stdio.h>

int main (void)


{
struct rlimit limit;

getrlimit (RLIMIT_STACK, &limit);


printf ("\nStack Limit = %ld \n",
limit.rlim_cur);
return 0;
} Stack Limit = 8388608

http://stackoverflow.com/questions/53827/checking-available-stack-size-in-c
there are other ways to manage memory
! malloc(), when we talk about pointers
summary
00001

00000
00016
(00068, but no need to know)
int i

we just gave a name to many cells idea: if we want space for n cells…
of memory
let’s use a variable name, say a,
this allows us to compute bigger plus a length, say a[135],
things faster, as we can keep a lot to say that we want to use the name ‘a’
of partial solutions around to refer to 135 cells of memory
! we could handle a lot of memory
memory "! speed are to a with a single variable
certain extent interchangeable
end
go download the slides
and reprise, practice

today’s lecture was quite a bit harder and faster already

professor patrick baudisch


pt1

memory and arrays


hasso-plattner institute

You might also like