Systems, GUI Notes
1.2 Processes
Def: an instance of a program that is being executed (running instance)
- OS provides: logical flows of control (1 flow -> single-threaded, multiple flows -> multi-threaded); a private, protected address space; abstracted resources (e.g., file descriptors).
- Context switching: the OS loads multiple processes into memory and switches to another process if the current process is blocked or waiting for user input. Time-sharing: the OS also switches processes periodically to make sure that all processes make progress. The saved state includes the current program text, the location in the text (PC/IP), and all variable state (globals, heap, stack, CPU registers). Interacts with mode switching.
- Dual-Mode Operation: kernel mode (privileged, system) vs. user mode (non-privileged).
- Mode Switching: User -> Kernel is protected and happens for external reasons (I/O, keyboard) or internal reasons (software interrupt, a.k.a. trap or exception). Kernel -> User uses a special privileged instruction (return from interrupt).
- SUMMARY: A mode switch guarantees that the kernel gains control when needed, to react to external events and to handle error situations. Entry into the kernel is controlled. Not all mode switches lead to context switches: the kernel decides if and when, subject to process state transitions and scheduling policies. A mode switch does not change the identity of the current process/thread.
Process States
- OSes keep track of the status of each process:
  - RUNNING: the process is executing its instructions on a CPU.
  - READY: the process is ready to execute on a CPU, but currently is not (it is waiting for a CPU to be assigned).
  - BLOCKED: the process is not ready to execute on a CPU because it is waiting for some event; it cannot currently make use of a CPU even if one is available.
- NB: in systems whose kernel supports multi-threading, the states are maintained for each thread separately.
Process State Transitions
- RUNNING → BLOCKED: the process cannot continue because it first must wait for something, e.g.
  - for input (keystroke, file from disk, network message, data from a Unix pipe)
  - for exclusive access to a resource (acquire a lock)
  - for a signal from another thread/process
  - for time to pass (e.g., sleep(2) sys call)
  - for a child process to terminate
- BLOCKED → READY: the process becomes ready when that something finally becomes available. The OS adds the process to a ready queue data structure.
- READY → RUNNING: the process is chosen by the scheduler. Only 1 process can be chosen per CPU; this requires a scheduling policy if demand exceeds supply.
- RUNNING → READY: the process is descheduled. The OS preempted the process to give another READY process a turn or, rarely, the process voluntarily yielded the CPU.
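
The four transitions above can be written down as a small lookup table. This is only an illustrative sketch: the names ALLOWED and transition are made up for the example and do not correspond to any real kernel interface.

```python
# Hypothetical model of the process state machine described in these notes.
ALLOWED = {
    ("RUNNING", "BLOCKED"): "process must wait for an event",
    ("BLOCKED", "READY"): "the awaited event occurred",
    ("READY", "RUNNING"): "chosen by the scheduler",
    ("RUNNING", "READY"): "preempted, or yielded voluntarily",
}


def transition(state, new_state):
    """Return new_state if the move is one of the four legal transitions."""
    if (state, new_state) not in ALLOWED:
        raise ValueError(f"illegal transition {state} -> {new_state}")
    return new_state
```

Note that BLOCKED → RUNNING is absent from the table: a process whose event arrived must first become READY and then be picked by the scheduler.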
Job Control
- Job control: Some systems provide the ability to stop (suspend) a process for some time, and continue it later with all its state intact, e.g., Ctrl-Z in Linux. This mechanism is separate from the state transitions caused by events processes wait for: events can still arrive for stopped processes.
Programmer’s View
Process state transitions are guided by decisions or events outside the programmer’s control (user actions, user input, I/O events, interprocess communication, synchronization) and/or by decisions made by the OS (scheduling decisions). They may occur frequently and over small time scales: e.g., on Linux, preemption may occur every 4ms for RUNNING processes, and when processes interact on shared resources (locks, pipes) they may frequently block/unblock. For all practical purposes, these transitions, and the resulting execution order, are unpredictable. The resulting concurrency requires that programmers not make any assumptions about the order in which processes execute; rather, they must use signaling and synchronization facilities to coordinate any process interactions.
Process Management
OSes provide APIs (system calls) to manage processes:
- Process creation: includes a way to set up the new process’s environment.
- Process termination: normal termination (exit(), return from main()) or abnormal termination (due to misbehavior: “crash”; due to outside intervention: “kill”). In either case, the OS cleans up (reclaims all memory, closes all low-level file descriptors).
- Process interaction: examples include waiting for a process to finish, stopping/continuing a process, and changing a process’s scheduling and other attributes.
- Reporting and profiling facilities.
The OS provides facilities to be used by or in coordination with control programs (shell, GUI, Task Manager). Examples include Ctrl-C and Ctrl-Z.
Process Management (Unix)
Unix separates process creation from loading a new program:
- The fork() system call creates a new process, but does not load a new program. The newly created process is called a child process (the creating process is referred to as the parent). Corollary: Unix processes form a tree-like hierarchy.
- Child processes may inherit parts of their environment from their parents, but are otherwise distinct entities.
- The child process then may change/set up the environment and, when ready, load a new program that replaces the current program but retains certain aspects of the environment (exec()).
- The parent has the option of waiting (via wait()) for the child process to terminate, which is also called “joining” the child process. The parent can also learn how the child process terminated, e.g. the code that the child passed to exit().
fork() - Keeps the program and process, but also creates a new process. The new process is a clone of the parent; the child’s state is a (now separate) copy of the parent’s state, including everything: heap, stack, file descriptors. Called once, returns twice (once in the parent, once in the child).
exec() - Keeps the process, but discards the old program and loads a new program. Reinitializes process state (clears heap + stack, starts at the new program’s main()), except that it retains file descriptors. If successful, it is called once but does not return. Includes multiple variants (execvp(), etc.).
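
The fork/exec/wait pattern described above can be sketched with Python's os module, which wraps the same Unix system calls. The helper name spawn_and_wait and the choice to exec a second Python interpreter are assumptions made for illustration; the sketch only runs on Unix-like systems.

```python
import os
import sys


def spawn_and_wait(exit_code):
    """Fork a child, exec a new program in it, and return the child's exit code."""
    pid = os.fork()                       # called once, returns twice
    if pid == 0:
        # Child: replace the current program. On success execvp() never returns.
        try:
            os.execvp(sys.executable,
                      [sys.executable, "-c", f"import sys; sys.exit({exit_code})"])
        finally:
            os._exit(127)                 # reached only if the exec failed
    # Parent: wait for ("join") the child and decode how it terminated.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)     # the code the child passed to exit()
    return -os.WTERMSIG(status)           # child was killed by a signal
```

Calling spawn_and_wait(7) in the parent returns the value the child passed to exit, which is how a shell learns the exit status of a command.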
Standard Streams -
By convention, file descriptors 0, 1, 2 are used for the standard input, standard output, and standard error streams. Programs do not have to open any files; the streams are preconnected, so programs can use them without needing any additional information. Control programs (the shell), or the program starting a program, can set them up to refer to some regular file, terminal device, or something else. When used, they access the underlying kernel object in the same way as if the program had opened it itself. Programs should, in general, avoid changing their behavior depending on the specific type of object their standard streams are connected to. Exceptions exist, e.g., the flushing strategy of C’s stdio depends on whether standard output is a terminal or not; another is the Python 2 sys.stdout.encoding fiasco.
Pipes - A Unix pipe is a FIFO, bounded buffer that provides the abstraction of a unidirectional stream of bytes flowing from writer to reader.
- Writers: can store data in the pipe as long as there is space; block if the pipe is full until a reader drains the pipe.
- Readers: drain the pipe by reading from it; if it is empty, block until a writer writes data.
- Pipes provide a classic “bounded buffer” abstraction that is safe (no race conditions, no shared memory, handled by the kernel) and that provides flow control that automatically controls relative progress: e.g., if the writer is BLOCKED but the reader is READY, the reader will be scheduled, and vice versa.
- Pipes are created unnamed; their file descriptor table entries provide for automatic cleanup.
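
A minimal sketch of a pipe between a child (writer) and its parent (reader), using Python's os.pipe and os.fork. The function name pipe_roundtrip is hypothetical; the point is that the reader sees end-of-file only after every copy of the write end has been closed.

```python
import os


def pipe_roundtrip(message):
    """Send bytes from a child (writer) to the parent (reader) through a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child is the writer: close the unused read end, write, then exit.
        try:
            os.close(read_fd)
            view = memoryview(message)
            while view:
                written = os.write(write_fd, view)   # blocks while the pipe is full
                view = view[written:]
        finally:
            os._exit(0)                              # exiting closes write_fd
    # Parent is the reader: close its copy of the write end so EOF can be seen.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)               # blocks until data or EOF
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)
```

With a message larger than the pipe's capacity, the writer blocks until the reader drains the pipe, which is the flow control described above.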
Unix Signals - Unix signals present a uniform mechanism that allows the kernel to inform processes of events of interest from a small predefined set (< 32). They are traditionally represented by their integer number, sometimes associated with some optional additional information. These events fall into 2 groups:
1. Synchronous: caused by something the process did (aka “internally generated event”).
2. Asynchronous: not related to what the process currently does (aka “externally generated event”).
The uniform API includes provisions for programs to determine actions to be taken for signals, which include:
- terminating the process, optionally with core dump
- ignoring the signal
- invoking a user-defined handler
- stopping the process (in the job control sense)
- continuing the process
Sensible default actions support user control and fail-stop behavior when faults occur.
Signal    Notes   Meaning
SIGILL    (1)     Illegal Instruction
SIGABRT   (1)     Program called abort()
SIGFPE    (1)     Floating Point Exception (e.g. integer division by zero, but not usually IEEE 754 division by 0.0)
SIGSEGV   (1)     Segmentation Fault - catch all for memory and privilege violations
SIGPIPE   (1)     Broken Pipe - attempt to write to a closed pipe
SIGTTIN   (2)     Terminal input - attempt to read from terminal while in background
SIGTTOU   (2)     Terminal output - attempt to write to terminal while in background
(1) Default action: terminate the process
(2) Default action: stop the process

Signal    Notes   Meaning
SIGINT    (1, 3)  Interrupt: user typed Ctrl-C
SIGQUIT   (1, 3)  Interrupt: user typed Ctrl-\
SIGTERM   (3)     User typed kill pid (default)
SIGKILL   (2, 3)  User typed kill -9 pid (urgent)
SIGALRM   (1, 3)  An alarm timer went off (alarm(2))
SIGCHLD   (1)     A child process terminated or was stopped
SIGTSTP   (1)     Terminal stop: user typed Ctrl-Z
SIGSTOP   (2)     User typed kill -STOP pid
(1) These are sent by the kernel, e.g., terminal device driver
(2) SIGKILL and SIGSTOP cannot be caught or ignored
(3) Default action: terminate the process
First, a signal is sent (via the kernel) to a target process. Some signals are sent internally by the kernel (e.g. SIGALRM, SIGINT, SIGCHLD). User processes can use the kill(2) system call to send signals to each other (subject to permission); the kill(1) command or your shell’s built-in kill command do just that. raise(3) sends a signal to the current process. Sending makes the signal become “pending”. Then (possibly some time later) the target process receives the signal and performs the action (ignore, terminate, or call handler). Aside: the details of how processes learn about pending signals and how they react to them are complicated, but handled by the kernel. Here we focus on what user programmers need to observe when using signals.
Async-Signal Safety -
Is it safe to manipulate data from a signal handler while that same data is being manipulated by the program that was executing (and interrupted) when the signal was delivered? In general, is it safe to call a function from a signal handler while that same function was executing when the signal was delivered? Answer: it depends. POSIX defines a list of functions for which it is safe, the so-called async-signal-safe functions; see signal-safety(7) for a list and the book’s Web Aside: Async-signal Safety. printf() is not async-signal-safe (it acquires the console lock). Two strategies to write async-signal-safe programs:
1. Don’t call async-signal-unsafe functions in a signal handler.
2. Block signals while calling unsafe functions in the main control flow (or when manipulating shared data).
Blocking/Masking Signals -
If signals are masked/blocked most of the time in the main program, signal handlers can call most
functions, but signal delivery may be delayed. If a signal is not masked most of the time, signal handlers
must be very carefully implemented. In practice, coarse-grained solutions are perfectly acceptable unless
there is a requirement that bounds the maximum allowed latency in which to react to a signal. Side note:
OSes face the same trade-off when implementing (hardware) interrupt handlers.
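
The "block signals around a critical section" strategy can be sketched with Python's signal module (signal.raise_signal needs Python 3.8 or later). The function name deferred_delivery_demo and the use of SIGUSR1 are illustrative assumptions, and the sketch must run in the main thread because Python only allows handlers to be installed there. Python signal handlers are also not the same thing as C handlers, so treat this as a model of the blocked/pending/delivered sequence rather than of async-signal safety itself.

```python
import signal


def deferred_delivery_demo():
    """Block SIGUSR1, raise it, observe it pending, then unblock to deliver it."""
    events = []
    old_handler = signal.signal(
        signal.SIGUSR1, lambda signum, frame: events.append("handler ran"))
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
    try:
        signal.raise_signal(signal.SIGUSR1)      # the signal becomes pending
        if signal.SIGUSR1 in signal.sigpending():
            events.append("pending while blocked")
        events.append("critical section done")   # handler has not interfered
    finally:
        # Restoring the old mask unblocks SIGUSR1, so delivery happens now.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    if old_handler is not None:
        signal.signal(signal.SIGUSR1, old_handler)
    return events
```

The cost of this coarse-grained approach is exactly the latency discussed above: the handler cannot run until the mask is restored.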
Foreground vs Background vs Stopped -
User expectations:
- The shell waits for foreground jobs before outputting a new prompt.
- Foreground jobs receive user input.
- Foreground jobs can have full control of the terminal (e.g. vim).
- Background jobs execute, but do not prevent further user interaction with the shell.
- Stopped jobs are neither foreground nor background.
OS support:
- Minimal notion of fg/bg inside the OS.
- OSes do maintain a foreground process group id for each terminal: control keys (Ctrl-Z, Ctrl-C) are turned into signals sent to the foreground process group.
- Certain terminal operations cause a process to be stopped with SIGTTOU/SIGTTIN if attempted while the calling process’s group is not the foreground group.
- In Linux, look for the plus (+) in the STAT column of ps output to see the fg process group.
The shell’s task is to relay the user’s expectations to the OS, and to inform the user of any events that result, while maintaining internal state that accurately reflects the state of each job.
Process Groups - Purpose: to group processes for the purposes of signal delivery. Sending a signal to a process group sends it to all processes that are part of that group. This applies both to signals sent via a system call (kill(2) or killpg(3)) and to signals sent by the kernel (e.g. SIGTSTP, SIGINT, etc.).
Simple, cooperative management scheme:
- Any process is part of exactly one process group at all times.
- Each group has a leader whose pid is used to determine its process group id. NB: process groups may persist even if the leader process has already exited, as long as there are still members alive.
- Any process may create a new process group, declaring itself as the leader.
- Any process may join (or be assigned to) an existing process group, subject to permission restrictions.
Intended use: though the API is open to all processes, it’s commonly used by control programs (shells) to arrange processes into groups that correspond to the jobs the shell manages, allowing the user to kill entire groups and for Ctrl-Z/Ctrl-C to be sent to entire groups.
Nice default behavior: a fork()’d child inherits the process group of its parent, making it automatically subject to any signals delivered to the group.
Linking Summary -
- The compiler resolves certain symbolic names, but passes any that are global in extent on to the linker as references in relocatable object files.
- The linker merges object files to produce an executable, computing a virtual address space layout in the process.
- The executable contains the text and data needed to load a program into memory.
- We have ignored so far: lexical scoping rules (global vs. local to a compilation unit); rules the linker applies when deciding how to resolve an external reference; static and dynamic libraries.
Functions
static void f();      | maybe            | makes sense only if defined in same header file
static void f() { }   | no               | usually ok when inlining is intended
void g();             | no               | recommended way of declaring global functions
extern void g();      | no               | recommended way of declaring global functions
void g() { }          | multiply-defined | violates ODR
Variables
static int v;         | no               | separate copies of v! Likely wrong.
static int w = 42;    | no               | separate copies of w! Likely wrong.
int v;                | multiply-defined | violates ODR
extern int v;         | no               | recommended way of declaring a global variable
int v = 42;           | multiply-defined | violates ODR
Multithreading -
Def: General-purpose OSes already provide the ability to execute processes concurrently. In many applications, we would like to pursue multiple, concurrent computations simultaneously within a process, e.g.
- Parallel computing: perform multiple tasks or work on shares of data simultaneously.
- Overlap I/O & computation: checksum and repair while downloading in a file sharing program.
- Serve a UI while performing background activity (spell check, contact server or backend for autosuggestions).
- Handle multiple clients simultaneously in a network server.
Such application-level concurrency is supported by having multiple threads of execution.
Threads vs Processes -
Processes provide concurrent, separate logical flows of control within a system/machine. Threads provide separate logical flows of control within a process. Processes share machine resources, files on disk, inherited file descriptors, and terminals, and do not share address space. Threads share address space and open file descriptors, and do not share stack & registers. Think of threads as multiple programs executing concurrently within a shared process, sharing all data and resources, but maintaining separate stacks and execution state.
Cooperative Multi-Threading-
- It’s possible to maintain multiple control flows entirely without kernel-level support.
- Exists in multiple variants in different languages, known as coroutines or user-level threads depending on the variant.
- Requires a primitive that saves & restores execution state.
- Non-preemptive model: a thread’s access to the CPU is not preempted (taken away) unless the thread yields access to the CPU voluntarily.
- Yield may be directed (saying which coroutine should run next) or undirected (run something else next), e.g. the uthreads example.
- In some higher-level languages, functions can “yield” temporary results as their execution state is saved and restored (e.g., Python or ES6 yield).
- Can be combined with asynchronous I/O: yield a promise object that represents an in-progress operation (async/await).
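
Python generators give a compact way to try out the non-preemptive model. In this sketch each generator plays the role of a user-level thread and a small round-robin loop plays the scheduler; worker and run_round_robin are names invented for the example, not part of any threading library.

```python
from collections import deque


def worker(name, steps):
    """A cooperative thread: it runs until it voluntarily yields."""
    for i in range(steps):
        yield f"{name}{i}"          # give up the CPU; execution state is saved here


def run_round_robin(threads):
    """Undirected-yield scheduler: resume each runnable coroutine in turn."""
    ready = deque(threads)
    trace = []
    while ready:
        current = ready.popleft()
        try:
            trace.append(next(current))   # restore state, run to the next yield
            ready.append(current)         # still runnable: back of the queue
        except StopIteration:
            pass                          # this coroutine has finished
    return trace
```

For example, run_round_robin([worker("a", 2), worker("b", 3)]) interleaves the two workers one step at a time. A worker that never yields would monopolize the CPU, which is the main weakness of the cooperative model.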
CompLang Notes
Early Precursors: Pseudocodes (a response to the problems of machine code: poor readability, poor modifiability, tedious expression coding, machine deficiencies). Short Code: developed by Mauchly; expressions were coded left to right. Speedcoding: pseudo ops for arithmetic and math functions, conditional and unconditional branching, auto-increment registers for array access.
First Compiling System: UNIVAC Compiling System, developed by Grace Hopper; pseudocode was expanded into machine code.
Fortran:
First designed in 1954 for the new IBM 704 computer, which had index registers and floating point hardware. Led to the idea of compiled programming languages.
Design process of Fortran: no need for dynamic storage; need for good array handling and counting loops; no string handling, decimal arithmetic, or powerful input/output. Fortran II (independent compilation, fixed the bugs). Fortran IV (explicit type declarations, logical selection statement, subprogram names could be parameters). Fortran 77 (became the new standard in 1978; character string handling, if-then-else statement). Fortran 90 (added modules, dynamic arrays, pointers, recursion, CASE statement, parameter type checking). Main domain: scientific computing.
Major Additions: compiled programming languages, named variables with types, structured control flow (if statements and do loops).
Functional Programming:
Ex. Scheme, ML, pure LISP, FP. Created to formalize computation using mathematical functions, with recursion in place of iteration and state. LISP (LISt Processing language) was the first functional programming language. It processes data in lists and has symbolic computation, only two data types (atoms and lists), syntax based on lambda calculus, no need for variables or assignment, and control via recursion and conditional expressions. It was the dominant language for Artificial Intelligence. ML, Haskell, and F# are also functional programming languages, but use very different syntax.
ALGOL (Algorithmic Language): Important since it was the result of an effort to design a universal language. ALGOL 58 formalized the concept of type and defined names, arrays, parameters, subscripts, compound statements, the semicolon as separator, and the assignment operator; it had no I/O and was not meant to be implemented. New features in ALGOL 60: block structure (local scope), parameter passing methods, recursion, and arrays; still no I/O and no string handling. Served as a foundation for many languages and standardized mathematical notation.
COBOL (Purpose and Contributions): Design goals: easy to use, broaden the base of computer users, and must not be biased by current compiler problems. Environment of development: based on FLOW-MATIC, whose features included names up to 12 characters with embedded hyphens and English names for arithmetic operators (no arithmetic expressions). Contributions: first macro facility in a high-level language, hierarchical data structures (records), nested selection statements, long names (up to 30 characters) with hyphens, and a separate data division. Design problems: whether to allow arithmetic expressions and subscripts, and fights among manufacturers. Main domain: business computing.
PL/I: Attempted to combine features of FORTRAN, COBOL, and ALGOL. Was overly complex, had a large, inconsistent feature set, and lacked strong type safety. Motivation: scientific users began to need more elaborate I/O, and business users began to need floating point and arrays for MIS. The obvious solution was to build a new computer, with a new language, to do both scientific and business computing. Initially called NPL. Concerns: many new features were poorly designed, and the language was too large and too complex.
ADA: Successful since it had a huge design effort involving hundreds of people, money, and
eight years. Contributions of packages (support for data abstraction), exception handling
(elaborate), generic program units, and concurrency (tasking model). Had strong type checking,
was designed for safety and reliability, featured concurrency support, and was standardized.
Simula: First Object-Oriented Language (Classes, Objects, Inheritance), laid the foundation for
C++, and other modern OOP languages
Reasons to Separate Lexical and Syntax Analysis: Simplicity (less complex approaches can be used for lexical analysis; separating them simplifies the parser), Efficiency (separation allows optimization of the lexical analyzer), Portability (parts of the lexical analyzer may not be portable, but the parser is always portable).
Top-Down Parsing: starts from the start symbol and expands production rules until matching the input (root to leaves). Recursive descent is the usual example; it needs a grammar without left recursion (e.g., a right-recursive one).
Bottom-Up Parsing: starts from the input symbols (leaves to root) and applies reductions until reaching the start symbol.
Most Common Top-Down Parsing Method: Recursive Descent Parsing (uses a set of mutually recursive functions, one for each non-terminal, to parse the input). Examples of such functions are expr, term, and factor. Shortcomings of this approach: it cannot handle left-recursive grammars (left recursion leads to infinite recursion), and it is limited to LL(k) grammars (meaning it can only predict the next steps based on a fixed number of lookahead tokens).
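
A small recursive-descent evaluator shows the expr/term/factor structure. The grammar is an assumption chosen for the sketch (integers, + - * /, parentheses), with the left recursion rewritten as loops: expr → term {(+|-) term}, term → factor {(*|/) factor}, factor → NUMBER | ( expr ). The names Parser, tokenize, and evaluate are illustrative.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split the input into number and operator tokens (the lexical analysis step)."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"bad character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        # One token of lookahead is enough for this grammar.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):      # loop instead of left recursion
            op = self.advance()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor(self):
        token = self.advance()
        if token == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")


def evaluate(text):
    parser = Parser(text)
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value
```

Writing expr as "expr → expr + term" and translating it directly would make expr() call itself before consuming any input, which is the infinite recursion mentioned above.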
Most Common Bottom-Up Parsing Method: LR (Left-to-right scan, Rightmost derivation in reverse). LR(1) and LALR(1) parsers are widely used in compiler design. Uses shift-reduce actions (shift moves the next token to the top of the parse stack; reduce replaces the handle on the top of the parse stack with its corresponding LHS). Advantages: they work for nearly all grammars that describe programming languages; they work on a larger class of grammars than other bottom-up algorithms, but are as efficient as any other bottom-up parser; and they can detect syntax errors.
Sextuple Attributes for Variables: Name (identifier), Address (memory location assigned to
variable), Value (the actual data stored), Type (Determine the set of values and operations
allowed), Lifetime (the time during which the variable exists in memory), Scope (region where
variable can be accessed).
Alias: Occurs when two or more variables refer to the same memory location.
Data Type: Defines the set of values a variable can take, the operations that can be performed on
those values, the memory representation of values. Determines the range of values of variables
and the set of operations that are defined for values of that type; in the case of floating point, type
also determines the precision.
Binding: An association between an entity and an attribute, such as between a variable and its
type or value or between an operation and a symbol.
Two Broad Categories of Binding: Static Binding (Early): first occurs before run time (e.g., at compile time) and remains unchanged throughout program execution. Example: variable types in C (int x = 5;). Dynamic Binding (Late): first occurs during execution or can change during execution of the program. Example: Python variables (x = 5, then x = "hello").
Static Scope: Determined at compile time based on the block structure of the program. Resolved
by looking at the nearest enclosing scope.
Dynamic Scope: Determined at runtime, based on the calling sequence of functions. A variable is
resolved by searching back through the call stack.
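
Python uses static (lexical) scoping, so it can show one half of the contrast directly; the other half has to be emulated. In this sketch dynamic_lookup is a hypothetical helper that models dynamic scoping by searching a list of dictionaries standing in for the call stack.

```python
x = "global x"


def show():
    return x                # free variable: which x does this refer to?


def caller():
    x = "caller's x"        # local to caller(); not visible inside show()
    return show()           # static scoping: show() still sees the global x


def dynamic_lookup(name, call_stack):
    """Emulate dynamic scoping: search active frames, most recent call first."""
    for frame in reversed(call_stack):
        if name in frame:
            return frame[name]
    raise NameError(name)
```

Here caller() returns the global value because show() is nested in the global block. Under dynamic scoping the lookup would instead walk the call chain global → caller → show and find the caller's binding, as dynamic_lookup("x", [{"x": "global x"}, {"x": "caller's x"}, {}]) does.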
Referencing Environment: The collection of all names (variables) that are visible at a specific point (statement) in a program. Ex. under static scoping the referencing environment is determined by block nesting; under dynamic scoping it is determined by the call stack at runtime.
Chapter 6 Data Types
Data Type: defines a collection of data objects and a set of predefined operations on those objects.
Variable’s Descriptor: the collection of the attributes of a variable. An object represents an instance of a
user-defined (abstract data) type.
Design Issue for All Data Types: What operations are defined and how are they specified?
Primitive Data Types: Those not defined in terms of other data types. Can be merely reflections of the
hardware. Others require only a little non-hardware support for their implementation. Almost all
programming languages provide a set of primitive data types.
Rectangular Array: Multi-dimensioned array in which all rows have the same number of elements.
Supported by F# and C#.
Jagged Matrix (Array): Has rows with varying number of elements (Possible when multidimensional
arrays actually appear as arrays of arrays. Supported by C, C++, C#, F# and Java)
Slice: Some substructure of an array; nothing more than a referencing mechanism. Only useful in
languages that have array operations.
# Can be skipped
Associative Arrays: An unordered collection of data elements that are indexed by an equal number of
values called keys. Design issues (what is the form of references to elements, is the size static or
dynamic?) Built-in type in Perl, Python, Ruby, and Swift.
Records: are used when a collection of data values is heterogeneous. Access to array elements is much slower than access to record fields, because subscripts are dynamic (field names are static).
Implementation of Record Type: Offset address relative to the beginning of the records is associated with
each field.
Record Types: A record is a possibly heterogeneous aggregate of data elements in which the individual
elements are identified by names. Design issues (what is the syntactic form of references to the field, are
elliptical references allowed). C# defines a record using the record keyword:
public record Person(string Name, int Age, double Weight);
var person1 = new Person("Alice", 30, 65.5);
Console.WriteLine($"Name: {person1.Name}, Age: {person1.Age}, Weight: {person1.Weight}");
COBOL Records: uses level numbers to show nested records, others use recursive definition.
Record field references:
COBOL (field_name OF record_name_1 OF … OF record_name_n)
Others (dot notation): record_name_1.record_name_2. … .record_name_n.field_name
Fully qualified references must include all record names.
Elliptical references allow leaving out record names as long as the reference is unambiguous, for example in COBOL: FIRST, FIRST OF EMP-NAME, and FIRST OF EMP-REC are elliptical references to the employee’s first name.
Tuple: A data type that is similar to a record, except that the elements are not named. Used in Python, ML, and F# to allow functions to return multiple values.
Ex. Python: myTuple = (3, 5.8, 'apple'). ML: val myTuple = (3, 5.8, "apple"). F#: let tup = (3, 5, 7), then let a, b, c = tup.
List Types: Lists in Lisp and Scheme are delimited by parentheses and use no commas: (A B C D) and (A (B C) D). Data and code have the same form, so the interpreter needs to know when a list is data; we quote it with an apostrophe: '(A B C) is data.
Lists in Scheme: CAR returns the first element of its list parameter; CDR returns the remainder of the list after the first element is removed; CONS puts its first parameter into its second parameter (a list) to make a new list; LIST returns a new list of its parameters.
Lists in ML: written in brackets, with elements separated by commas; all elements must be of the same type. CONS is the binary operator :: (3 :: [5, 7, 9] → [3, 5, 7, 9]); hd corresponds to CAR and tl to CDR.
Lists in F#: like ML, but elements are separated by semicolons, and hd and tl are methods of the List class.
List Types in Python: mutable, elements can be of any type, created with an assignment and referenced with subscripting.
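
The Scheme list primitives map onto Python list operations. The functions car, cdr, and cons below are illustrative helpers written for these notes, not Python built-ins; unlike Scheme's CONS they copy the list rather than sharing its tail.

```python
def car(lst):
    """First element of a non-empty list (Scheme CAR, ML hd)."""
    return lst[0]


def cdr(lst):
    """Everything after the first element (Scheme CDR, ML tl)."""
    return lst[1:]


def cons(item, lst):
    """New list with item in front; the argument list is left unchanged."""
    return [item] + lst
```

For the nested list (A (B C) D), written ['A', ['B', 'C'], 'D'] in Python, car(cdr(...)) picks out the inner list ['B', 'C'].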
Union: a type whose variables are allowed to store different type values at different times during
execution. Design Issue (should type checking be required?)
Type Checking: type checking of unions requires that each union include a type indicator called a
discriminant.
The Pointer Type variable has a range of values that consist of memory addresses and a special value, nil.
Provide the power of indirect addressing. Provide a way to manage dynamic memory. A pointer can be
used to access a location in the area where storage is dynamically created (heap).
Pointer Design Issues: What are the scope and lifetime of a pointer variable? What is the lifetime of a heap-dynamic variable? Are pointers restricted as to the type of value to which they can point? Are pointers used for dynamic storage management, indirect addressing, or both? Should the language support pointer types, reference types, or both?
Pointer Operations: assignment and dereferencing. Assignment is used to set a pointer variable’s value to some useful address. Dereferencing yields the value stored at the location represented by the pointer’s value; it can be explicit or implicit. C++ uses an explicit operation via *.
Dangling Pointers (Dangerous): A pointer points to a heap-dynamic variable that has been deallocated.
Problems in heap management.
Tombstone: an extra heap cell that is a pointer to the heap-dynamic variable. The actual pointer variable points only at tombstones; when a heap-dynamic variable is de-allocated, the tombstone remains but is set to nil. Costly in time and space.
Locks-and-keys: Pointer values are represented as (key, address) pairs: heap-dynamic variables are
represented as variable plus cell for integer lock value. When a heap-dynamic variable is allocated, lock
value is created and placed in lock cell and key cell of pointer.
Lost Heap-Dynamic Variable: An allocated heap-dynamic variable that is no longer accessible to the user program (often called garbage). How it happens: pointer p1 is set to point to a newly created heap-dynamic variable; p1 is later set to point to another newly created heap-dynamic variable, so the first one can no longer be reached. The process of losing heap-dynamic variables is called memory leakage.
Pointers in C and C++: Extremely flexible, point to any variable regardless of when or where it was
allocated, used for dynamic storage management and addressing, pointer arithmetic, explicit
dereferencing and address-of operators.
Reference Types: C++ includes a special kind of pointer type called a reference type that is used primarily for formal parameters (it gives the advantages of both pass-by-reference and pass-by-value). Java extends C++’s reference variables and allows them to replace pointers entirely (references refer to objects rather than to addresses).
Type Checking: Generalize the concept of operands and operators to include subprograms and
assignments. The activity of ensuring that the operands of an operator are of compatible types. A
compatible type is one that is either legal for the operator, or is allowed under language rules to be
implicitly converted, by compiler-generated code, to a legal type (this automatic conversion is called a
coercion).
Type error: the application of an operator to an operand of an inappropriate type. If all type bindings are
static, nearly all type checking can be static. If type bindings are dynamic, type checking must be
dynamic.
Strongly typed: a language is strongly typed if type errors are always detected. The advantage of strong typing is that it allows the detection of the misuses of variables that result in type errors. C and C++ are not strongly typed: unions are not type checked, and parameter type checking can be avoided. Java and C# are almost strongly typed (because of explicit type casting). ML and F# are strongly typed. Coercion rules strongly affect strong typing: they can weaken it.
Type Equivalence (by name): two variables have equivalent types if they are in either the same declaration or in declarations that use the same type name. Easy to implement but highly restrictive (subranges of integer types are not equivalent with integer types, and formal parameters must be the same type as their corresponding actual parameters).
Arithmetic Expressions: their evaluation was one of the motivations for the development of the first programming languages. They consist of operators, operands, parentheses, and function calls. Binary operators are infix, except in Scheme and LISP, in which they are prefix. Most unary operators are prefix, but the ++ and -- operators in C-based languages can be either prefix or postfix. Design issues: operator precedence rules, operator associativity rules, order of operand evaluation, operand evaluation side effects, operator overloading, and type mixing in expressions.
Operator: a unary operator has one operand, a binary operator has two operands, and a ternary operator has
three operands.
Precedence rules: define the order in which “adjacent” operators of different precedence levels are
evaluated. Typical precedence levels, highest first: parentheses; unary operators; **; * and /; + and -.
Associativity rules: define the order in which adjacent operators with the same precedence level are
evaluated. Typical associativity rules: left to right, except ** which is right to left; unary operators
sometimes associate right to left (e.g., in FORTRAN). Precedence and associativity rules can be
overridden with parentheses.
Ruby: All arithmetic, relational, and assignment operators, as well as array indexing, shifts, and bitwise
logical operators, are implemented as methods. One result of this is that these operators can all be
overridden by application programs.
Scheme: All arithmetic and logic operations are explicitly called subprograms.
Conditional Expressions: available in C-based languages (C, C++). Example: average = (count == 0) ? 0 : sum /
count. This is evaluated like this: if (count == 0) average = 0; else average = sum / count.
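The same conditional expression exists in JavaScript, so the equivalence with if/else can be sketched there. The function names average and averageWithIf are made up for this illustration.

```javascript
// Conditional expression: yields 0 when count is 0, otherwise the quotient.
function average(sum, count) {
  return count === 0 ? 0 : sum / count;
}

// The same logic written as an if/else statement.
function averageWithIf(sum, count) {
  let result;
  if (count === 0) {
    result = 0;
  } else {
    result = sum / count;
  }
  return result;
}

console.log(average(10, 4)); // 2.5
console.log(average(10, 0)); // 0
```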
Operand Evaluation Order: Variable (fetch the value from memory), Constants (sometimes a fetch from
memory, sometimes the constant is in the machine language instruction), parenthesized expressions
(evaluate all operands and operators first), and most interesting case is when an operand is a function call.
#Skippable
Referential Transparency: A program has the property of referential transparency if any two expressions
in the program that have the same value can be substituted for one another anywhere in the program,
without affecting the action of the program. Advantages: semantics of a program is much easier to
understand if it has referential transparency.
Functional Language: Because they do not have variables, programs in pure functional languages are
referentially transparent: Functions cannot have state (which would be stored in local variables), and if a
function uses an outside value, it must be a constant. Value depends only on its parameters.
# End of Skippable
Overloaded Operators: Use of an operator for more than one purpose is called operator overloading.
Some are common (+ for int and float) and others are potential trouble (* in C and C++, which is both
multiplication and dereference), causing loss of compiler error detection and loss of readability.
User-defined overloading (C++, C#, and F#), when sensibly used, can be an aid to readability. Potential
problems are that users can define nonsense operations and that readability may suffer.
Type Conversion:
Narrowing Conversion: one that converts an object to a type that cannot include all of the values of the
original type ex. float to int
Widening Conversion: one in which an object is converted to a type that can include at least
approximations to all of the values of the original type ex. int to float.
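A small sketch of both directions. JavaScript has a single number type, so this example assumes typed arrays (Int8Array, Float32Array) as stand-ins for fixed-width integer and float types; the variable names are illustrative only.

```javascript
// Narrowing: the target type cannot hold every value of the source type.
const truncated = Math.trunc(3.99);       // the fractional part is lost
const wrapped = new Int8Array([300])[0];  // 300 does not fit in 8 bits and wraps

// Widening: every source value has at least an approximation in the target.
const exact = new Float32Array([1024])[0];            // represented exactly
const approximate = new Float32Array([16777217])[0];  // rounded to a nearby float

console.log(truncated, wrapped);   // 3 44
console.log(exact, approximate);   // 1024 16777216
```

The last value shows why widening is described as preserving "at least approximations": a 32-bit float cannot represent every large integer exactly.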
Mixed Mode: a mixed-mode expression is one that has operands of different types (ex. int * float). A coercion
is an implicit type conversion; its disadvantage is that it decreases the type error detection ability of the
compiler. In most languages, all numeric types are coerced in expressions, using widening conversions. In
ML and F#, there are no coercions in expressions.
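JavaScript coerces aggressively, which makes it a convenient place to see how coercion hides type errors. The sketch below contrasts implicit coercion with a hypothetical strictMultiply helper that rejects mixed operands instead.

```javascript
// Implicit coercions: none of these raise an error.
const product = "3" * 2;       // the string is coerced to a number
const concatenated = "3" + 2;  // the number is coerced to a string
const difference = "abc" - 1;  // the coercion fails silently and yields NaN

// Hypothetical helper that refuses to coerce, so the mistake is detected.
function strictMultiply(a, b) {
  if (typeof a !== "number" || typeof b !== "number") {
    throw new TypeError("operands must be numbers");
  }
  return a * b;
}

console.log(product, concatenated, difference); // 6 32 NaN
```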
Explicit Type Conversion: called casting.
Errors in Expressions: caused by inherent limitations of arithmetic (e.g., division by zero) and by
limitations of computer arithmetic (e.g., overflow). Often ignored by the run-time system.
Relational Expressions: Use relational operators and operands of various types. Evaluate to some Boolean
representation. The operator symbols used vary somewhat among languages. JavaScript and PHP have === and
!==, which are similar to == and != but do not coerce their operands.
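A quick illustration of the difference in JavaScript (the constant names are only for this example):

```javascript
const looseEqual = (1 == "1");      // the string is coerced to a number first
const strictEqual = (1 === "1");    // different types, no coercion
const looseNotEqual = (0 != "");    // "" is coerced to 0, so the operands compare equal
const strictNotEqual = (0 !== "");  // no coercion, so the operands differ

console.log(looseEqual, strictEqual);       // true false
console.log(looseNotEqual, strictNotEqual); // false true
```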
Boolean Expressions: Operands are Boolean and the result is Boolean. C89 has no Boolean type: it uses the int
type with 0 for false and nonzero for true. One odd characteristic of C’s expressions: a < b < c is a legal
expression, but the result is not what you might expect: the left operator is evaluated first, producing 0 or
1, and that result is then compared with the third operand.
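JavaScript behaves analogously to C here (a Boolean result is converted to 0 or 1 before the second comparison), so the pitfall can be sketched in it. The constant names are illustrative.

```javascript
// (3 > 2) is true, which becomes 1; then 1 > 1 is false.
const chained = 3 > 2 > 1;

// (3 < 2) is false, which becomes 0; then 0 < 1 is true.
const surprising = 3 < 2 < 1;

// What was probably intended: two comparisons joined with &&.
const intended = 3 > 2 && 2 > 1;

console.log(chained, surprising, intended); // false true true
```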
Short Circuit Evaluation: An expression in which the result is determined without evaluating all the
operands and/or operators. Example: (13 * a) * (b / 13 - 1): if a is zero, there is no need to evaluate
(b / 13 - 1). C, C++, and Java use it for the Boolean operators (&& and ||), but also provide bitwise Boolean
operators that are not short circuit (& and |). All logical operators are short-circuit evaluated in Ruby,
Perl, ML, F#, and Python. Short-circuit evaluation exposes the potential problem of side effects in
expressions.
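JavaScript has the same split between short-circuit (&& and ||) and bitwise (& and |) operators, so the side-effect issue can be sketched with a counter. The names check and contains are hypothetical.

```javascript
let calls = 0;
function check() {
  calls += 1; // side effect: records that the operand was evaluated
  return true;
}

const skippedAnd = false && check(); // right operand is never evaluated
const skippedOr = true || check();   // right operand is never evaluated
const callsAfterLogical = calls;

const bitwise = false & check();     // both operands are always evaluated
const callsAfterBitwise = calls;

// Typical use: the index test guards the array access.
function contains(list, index, key) {
  return index < list.length && list[index] === key;
}

console.log(callsAfterLogical, callsAfterBitwise); // 0 1
```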
Assignments:
Functional Language:
Expressions as Building Blocks (computation is the evaluation of expressions, which
depend solely on the referencing environment).
Referential Transparency: Expressions in a purely functional language are referentially
transparent, meaning their value depends only on the referencing environment and not on any
external state.
Imperative Language:
Ordered Series of Changes: Computation is typically an ordered series of changes to the
values of variables in memory.
Assignments: These are the primary means of changing variable values in memory.
Computing by Side Effects: Imperative programming often involves computing through
side effects, where changes to the state of the program are crucial.
Assignment Side Effects: A side effect occurs when a programming construct influences subsequent
computation in any way other than by returning a value. Expressions always produce a value and may
have side effects. Statements are executed solely for their side effects (imperative programming).
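A minimal sketch of the distinction, using two hypothetical functions: square only returns a value, while squareAndLog also changes state outside itself.

```javascript
// No side effects: the result depends only on the argument.
function square(n) {
  return n * n;
}

// Same return value, but each call also appends to an outside array.
const log = [];
function squareAndLog(n) {
  log.push(n);
  return n * n;
}

const pureTwice = square(3) + square(3);  // could be rewritten as 2 * square(3)
const impureTwice = squareAndLog(3) + squareAndLog(3);
const entriesLogged = log.length;         // rewriting as 2 * squareAndLog(3) would log once, not twice

console.log(pureTwice, impureTwice, entriesLogged); // 18 18 2
```

Both sums are equal, yet only the first expression can be rearranged freely; rearranging the second changes what the program does.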
# Not as important
Context Based: A variable may refer to the value of the variable (r-value) or to its location (l-value), a
named container for a value.
Value Model of Variable: An expression is either an l-value or an r-value based on the context in which it
appears. Built-in types (can’t be passed uniformly to methods expecting class type parameters).
Reference Model of Variable: a variable is a named reference for a value–every variable is an l-value.
Orthogonality: Features can be used in any combination, the combinations all make sense, and the
meaning of given features is consistent. (Algol 68: orthogonality was a principal design goal)
Initialization: Imperative languages do not always initialize the values of variables in declarations. Three
reasons why they should: static variables local to a subroutine, statically allocated variables, and
preventing accidental use of uninitialized variables. In addition to built-in types, for an orthogonal
approach, aggregates (built-up structured values of user-defined composite types) are needed (C, Ada, ML). A
language can provide a default value. Using an uninitialized variable as a dynamic semantic error: run-time
detection could be expensive. Definite assignment: no use of uninitialized variables, because every possible
control path assigns a value. Constructors: initialization versus assignment.
Chapter 9 Subprograms
● Are subprograms process abstraction or data abstraction?
● What are the general characteristics of subprograms?
● What are the basic definitions for subprograms? (Definition, call, header)
● What are parameters? (Formal parameters, actual parameters, positional parameters, keyword
parameters)
● How do procedures and functions differ?
● What are the design issues for subprograms?
● What is a referencing environment?
● What are the three parameter passing modes? How are those modes implemented? How do
common languages support those modes?
● How are subprograms passed as parameters to subprograms?
● What is an overloaded subprogram?
● What is a generic subprogram?
● What is a closure?
● What is a coroutine?
Programming Languages
Pascal:
REMEMBER: begin/end delimit a block, var declares any extra variables, header form is (a, b, c: TYPE): TYPE
For Loops ->
{ Sums the even elements of arr (for-in loop, Free Pascal/Delphi syntax) }
function sumlist(arr: array of Integer): Integer;
var
  num, sum: Integer;
begin
  sum := 0;
  for num in arr do
    if num mod 2 = 0 then
      sum := sum + num;
  sumlist := sum;
end;
1. How do you write basic Pascal?
program HelloWorld;
begin
writeln('Hello, world!');
end.
2. How are blocks formed?
begin
writeln('This is a block');
end.
3. How are functions written? - returns a value
function Add(x, y: Integer): Integer;
begin
Add := x + y;
end;
4. How are procedures written? - functions but don’t return value
procedure PrintHello;
begin
writeln('Hello!');
end;
5. How are files included?
program MyProgram;
uses crt; { Includes the crt library for console handling }
begin
writeln('Hello, world!');
end.
Haskell:
sum . filter even -> keeps only the even integers of a list, then sums them
1. How awesome is Haskell? Purity, laziness, functional programming, concise and
expressive code focuses on functions and immutability.
2. How to write functions?
Ex. functionName arguments = expression
add :: Int -> Int -> Int
add x y = x + y
3. How to implement conditions?
if x > 10 then "Greater than 10" else "Not greater than 10"
OR
checkNumber x
| x > 10 = "Greater than 10"
| otherwise = "Not greater than 10"
Prolog →
● How are facts stated in Prolog?
● How are rules stated in Prolog?
● How does Prolog try to satisfy a predicate?
● Know how the basic predicates work (e.g., member, append, not)
● How is a variable set to a number?
Ruby →
● How is a class made in Ruby?
● How is a subclass made in Ruby?
● How does a method of a subclass in Ruby call its parent's method of the same name?
● How do you access an attribute in Ruby?
● How can you add setters and/or getters to an attribute of a Ruby class?
● Know basic control structures (if, for, while, etc.)
Rust:
fn discriminate(a: f64, b: f64, c: f64) -> f64
{
    // Discriminant of a*x^2 + b*x + c; returns -1.0 when it is negative
    let d = (b * b) - 4.0 * a * c;
    if d < 0.0 { -1.0 } else { d }
}
React
Declarative: design simple views for each state in the application; React updates and re-renders the
components whose data changes.
Component Based: encapsulated components that manage their own state, and composition of
components to make complex UIs.
Learn Once, Write Anywhere: No assumptions about the rest of the technology stack, rendering on server
using Node, and React Native for mobile applications.
React.Component: React lets you define components as classes or functions; components defined as classes
currently provide more features. To define a React component class, you need to extend
React.Component:
class Welcome extends React.Component { render() { return <h1>Hello, {this.props.name}</h1>; } }
Hook: a special function that lets you “hook into” features.
useState - a hook that lets you add React state to function components.
useEffect - a hook that can be viewed as componentDidMount, componentDidUpdate, and
componentWillUnmount combined.
States: components can maintain internal state data via this.state, and when a component’s state changes,
the rendered output is updated by calling render again.
React Development Steps:
UI structure (break the UI into a component hierarchy): In the mock, draw boxes around every component
and name them; the result is a hierarchy of components. Single responsibility principle: a component should
ideally only do one thing. Arrange the components into a hierarchy.
Static Version (build a static version in React): The easiest way is to build a version that takes your data
model and renders the UI but has no interactivity. Building a static version requires a lot of typing, whereas
adding interactivity requires a lot of thinking and not a lot of typing. Build components that reuse other
components and pass data using props; do not use state.
Minimal UI state: Identify the minimal and complete representation of UI state: interactivity requires a
user to be able to trigger changes to your underlying data model. Determine the absolute minimal
representation of the app’s state and compute everything else on-demand. A piece of data isn’t state if it is
passed in from a parent via props, if it remains unchanged over time, or if you can compute it from any
other state or props.
State implementation: identify where the state should live. For each piece of state, identify every component
that renders something based on that state and find a common owner component (a single component above
all the components that need the state in the hierarchy). Either the common owner or another component
higher up in the hierarchy should own the state.
Inverse data flow: add inverse data flow. React is all about one-way data flow down the component
hierarchy. React makes this data flow explicit and helps a developer to understand how a program works.
It does require more coding than traditional two-way data binding. Event-driven programming:
components should only update their own state, a component passes a callback (event handler) to a child
component down the component hierarchy.
Component Creation: Always start component names with a capital letter; React treats components
starting with lowercase letters as DOM tags. Write a JavaScript function that accepts props and returns a
React element:
function Welcome(props) { return <h1>Hello, {props.name}</h1>; }
Component Extraction: It is acceptable, even desirable, to split a larger component into a collection of
smaller components. Name props from the component’s own point of view rather than the context in
which it is being used. Extracting components requires effort but over time builds a palette of reusable
components. A good rule of thumb for identifying reusable components: extract a part of the UI if it is used
several times or if it is complex enough on its own.
Component Props: A component must never modify its own props. All React components must act like
pure functions with respect to their props; for data that changes, use state instead. Ex. of a pure function:
function sum(a, b) { return a + b; }
React Elements: Elements are the smallest building blocks of React apps and describe what is seen
on the screen: const element = <h1>Hello, world</h1>. They are not DOM elements; they are plain
objects that are cheap to create. React elements are not React components. Applications usually have a
single root DOM node.
Data Flows Down: Neither parent nor child components can know if a certain component is stateful or
stateless, and they shouldn’t care. State is often called local or encapsulated, it is not accessible to any
component other than the one that owns and sets it. A component may choose to pass its state down as
props to its child components.
Lifting State Up: Often, several components need to reflect the same changing data; lift the shared state
up to their closest common ancestor. In a child component, replace the state parameter with a props
parameter (props are read-only). Make the component controlled: in addition to the props parameter, the
child receives a corresponding callback prop, and that callback is invoked wherever the state
was previously changed.
State Management: There should be a single “source of truth” for any data that changes in a React
application. It is located up in the component hierarchy, the common ancestor of components that need
the data. Rely on the top-down flow. Since any state “lives” in some component and that component can
change it, the surface area for bugs is greatly reduced.
React Design Patterns: A design pattern is a named pair of a problem and its solution, an effective solution
to a common problem, e.g., Model-View-Controller. Other categories: creational, structural, behavioral,
concurrency. Factory: an interface for creating a single object that lets subclasses decide which class to
instantiate. Facade: a unified interface to a set of interfaces in a subsystem. Iterator: accessing the
elements of an aggregate object without exposing its underlying representation. Monitor: an object whose
methods are subject to mutual exclusion.
Design Antipatterns: Many solutions to different problems that are not effective (lead to more brittle code,
code is less performant and less maintainable). A commonly-used pattern of action that, despite on the
surface being an appropriate and effective response to a problem, has significant negative outcomes.
Layout Components: Their primary concern is helping to arrange other components on the page. The
main idea of layout components is that other components shouldn’t know or care where it is that they’re
actually being displayed on the page. Ex. split-screen, list, or modal.
Container Components: Take care of all the data loading and other data management for their child
components. The approach is to take that logic out into the container and the container then takes care of
loading that data and passes it automatically to the children components. The children components should
not know where their data is coming from or how to manage it. Ex. user data loader, resource loader, and
data sources.
Uncontrolled Components: Keeps track of all its own internal state and really the only time the data gets
out of that component is when some event occurs. Most cases the component itself is the one that keeps
track of its own state. The parent component would pass a function to actually get the values of that
component’s state when the submit event is triggered.
Controlled Components: Controlled components have their parent take care of keeping track of the state
and that state is then usually passed through to a controlled component as a prop. The state of the
component is passed through as props. More reusable and easier to test.
Higher-order Components: components that return another component. Most React components simply
return some JSX, which represents the DOM elements that component wants to be rendered in its place.
Higher-order components are used to share behavior between several other components or to add extra
functionality to an existing component.
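The wrapping idea can be sketched without React itself. In this framework-free illustration a "component" is assumed to be a plain function from props to a markup string; UserCard and withLoading are hypothetical names, and real higher-order components would return JSX rather than strings.

```javascript
// Stand-in component: props in, markup string out (no React involved).
function UserCard(props) {
  return `<p>${props.name}</p>`;
}

// Higher-order function: takes a component and returns a new component
// that adds loading behavior around it.
function withLoading(Component) {
  return function WithLoading(props) {
    if (props.loading) {
      return "<p>Loading...</p>";
    }
    return Component(props);
  };
}

const UserCardWithLoading = withLoading(UserCard);

console.log(UserCardWithLoading({ loading: true }));
console.log(UserCardWithLoading({ loading: false, name: "Ada" }));
```

The loading behavior lives in one place and can be applied to any other component of the same shape.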
Custom Hooks: Combine basic hooks to provide extra hooks, allow us to share complex behaviors. Put
some frequently used logic into a custom hook and simply use the hook in different components.
Functional Languages: Design of the functional languages is based on mathematical functions
MUI - React Library that provides a robust, customizable, and accessible library of foundational and
advanced components (MUI core: foundational, MUI X: advanced, Templates: application, Design Kits:
components). Developed mobile-first, a strategy to first write code for mobile devices, and then scale up
components as necessary using CSS media queries. Components work in isolation and are
self-supporting.
Mid-Term 2 GUI
Graphics (Graphics, Vector Geometry, Coordinate Systems, Graphical Primitives)
Computer Graphics: The science and art of communicating visually via a computer’s display and its
interaction devices. Often means the whole field of study that involves these tools and the pictures they
produce. Better set of tools for plotting curves and presenting the data they encounter in their other studies
or work. Want to be more productive and communicate ideas better.
Combining Models: To produce a representation of a particular view of the scene (Model of objects,
model of the light, Geometric model, and mathematical model)
Matrix Transformation: Scale a model object with matrix S and then translate it with T to the correct
position, i.e., I = ST. Let T be a new transformation matrix and let C be the current transformation. The new
transformation C’ will be C’ = CT. For calls in the order T1, T2, T3, we get C’ = C T1 T2 T3. To display the
model object, we transform its coordinates in the vertex shader.
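The post-multiplication rule C’ = CT can be sketched in 2D with 3x3 homogeneous matrices. This example assumes points are column vectors, so in C T1 T2 the matrix written last is applied to the point first (with row vectors the order would read the other way). All function names are made up for the illustration.

```javascript
// 3x3 matrices stored as arrays of rows.
function multiply(a, b) {
  const out = [[0, 0, 0], [0, 0, 0], [0, 0, 0]];
  for (let i = 0; i < 3; i++) {
    for (let j = 0; j < 3; j++) {
      for (let k = 0; k < 3; k++) {
        out[i][j] += a[i][k] * b[k][j];
      }
    }
  }
  return out;
}

function translation(tx, ty) { return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]; }
function scaling(sx, sy) { return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]; }

// Applies matrix m to the point (x, y, 1) treated as a column vector.
function transformPoint(m, [x, y]) {
  return [
    m[0][0] * x + m[0][1] * y + m[0][2],
    m[1][0] * x + m[1][1] * y + m[1][2],
  ];
}

const identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]];

// Calls in order: translate, then scale. C' = C * T1 * T2.
let current = identity;
current = multiply(current, translation(10, 5)); // T1
current = multiply(current, scaling(2, 2));      // T2
const scaledThenMoved = transformPoint(current, [1, 1]);

// Reversing the call order gives a different composite.
const reversed = multiply(multiply(identity, scaling(2, 2)), translation(10, 5));
const movedThenScaled = transformPoint(reversed, [1, 1]);

console.log(scaledThenMoved, movedThenScaled); // [ 12, 7 ] [ 22, 12 ]
```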
Basic Dataset Types: The four basic dataset types are tables, networks, fields, and geometry: other
possible collections of items include clusters, sets, and lists. These datasets are made up of different
combinations of the five data types: items, attributes, links, positions, and grids. For any of these dataset
types, the full dataset could be available immediately in the form of a static file, or it might be dynamic
data processed gradually in the form of a stream. The type of an attribute can be categorical or ordered,
with a further split into ordinal and quantitative. The ordering direction of attributes can be sequential,
diverging, or cyclic.
Commonly Used Data Visualization Types: Amounts, Distributions, Proportions, x-y relationships,
Geospatial data, and Uncertainty.
Amounts: The most common approach to visualizing (i.e., numerical values shown for some set of
categories) is using bars, either vertically or horizontally arranged. We can also use dots at the location
where the corresponding bar would end.
Distributions: Histograms (binning the data) and density plots (data probability distribution) provide the
most intuitive visualizations of a distribution, but both require arbitrary parameter choices and can be
misleading. Cumulative densities and quantile-quantile (q-q) plots always represent the data faithfully but
can be more difficult to interpret.
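The "arbitrary parameter choice" for histograms is the number of bins. A minimal sketch, assuming equal-width bins over the data range (the histogram function and the sample values are illustrative only):

```javascript
// Counts how many values fall into each of binCount equal-width bins.
function histogram(values, binCount) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const width = (max - min) / binCount || 1; // avoid a zero width when all values are equal
  const counts = new Array(binCount).fill(0);
  for (const v of values) {
    // The maximum value is clamped into the last bin.
    const index = Math.min(Math.floor((v - min) / width), binCount - 1);
    counts[index] += 1;
  }
  return counts;
}

const sample = [1, 2, 2, 3, 4, 8, 9, 10];
console.log(histogram(sample, 3)); // [ 4, 1, 3 ]
console.log(histogram(sample, 2)); // [ 5, 3 ]
```

The same data looks like it has a gap with three bins and like a simple skew with two, which is the sense in which the bin choice can mislead.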
Proportions: Pie Chart (emphasizes that the individual parts add up to a whole and highlights simple
fractions), Bars (proportions are easily compared in side-by-side bars), and Stacked bars (useful when
comparing multiple sets of proportions).
X-Y Relationships: Scatterplot (shows one quantitative variable relative to another), Bubble chart (dot
size represents the third variable), Paired scatterplot (the variables along the x and y axes are measured in
the same units), and Slopegraph (paired points connected by straight lines).
Scatter Plot: A graph in which values of two variables are plotted along two axes, the pattern of the
resulting points revealing any correlation present. Data (model): an array of 2D points (x, y coordinates); a
point is represented as a JavaScript object with two properties, x and y. Data Processing (controller):
preprocess data, as needed, to be passed to the view. Visualization (view): draw and label the x- and
y-axes and draw scatter points on the graph.
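The model / controller / view split can be sketched in plain JavaScript. This is an illustration under assumptions: linearScale, layoutScatter, and renderScatter are hypothetical helpers, the view only emits an SVG string, and axes and labels are left out.

```javascript
// Model: an array of points with x and y properties.
const points = [{ x: 0, y: 0 }, { x: 5, y: 10 }, { x: 10, y: 20 }];

// Maps a value from a data domain onto a pixel range.
function linearScale([d0, d1], [r0, r1]) {
  return (v) => r0 + ((v - d0) / (d1 - d0)) * (r1 - r0);
}

// Controller: converts data coordinates to pixel coordinates for the view.
function layoutScatter(data, width, height) {
  const xs = data.map((p) => p.x);
  const ys = data.map((p) => p.y);
  const xScale = linearScale([Math.min(...xs), Math.max(...xs)], [0, width]);
  // SVG y grows downward, so the range is flipped.
  const yScale = linearScale([Math.min(...ys), Math.max(...ys)], [height, 0]);
  return data.map((p) => ({ cx: xScale(p.x), cy: yScale(p.y) }));
}

// View: draws one circle per point.
function renderScatter(circles, width, height) {
  const marks = circles
    .map((c) => `<circle cx="${c.cx}" cy="${c.cy}" r="3"/>`)
    .join("");
  return `<svg width="${width}" height="${height}">${marks}</svg>`;
}

const circles = layoutScatter(points, 200, 100);
console.log(renderScatter(circles, 200, 100));
```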
Charts: A chart refers to any flat layout of data in a graphical manner. The data points, which can be
individual values or objects in arrays, may contain categorical, quantitative, topological, or unstructured
data. All charts consist of several graphical elements that are drawn or derived from the dataset being
represented. These graphical elements may be: Graphical primitives, like circles or rectangles, more
complex, multipart, graphical objects like the boxplot, and supplemental pieces like axes and labels.
Charts Creation: Creating and formatting axis components, creating legends, using line and area
generators for charts, creating complex shapes consisting of multiple types of SVG elements.
Functionality: Data-binding, Data transformation, Generators (take data and return the SVG drawing code
to create graphical objects based on that data, abstracting the process of writing a <path> d attribute),
Components (create an entire set of graphical objects necessary for a particular chart component), and
Layouts (take in one or more arrays of data, and sometimes generators, and append attributes to the data
necessary to draw it in certain positions or sizes, either statically or dynamically).
Data Layouts: When a dataset is associated with a layout, each of the objects in the dataset has attributes
that allow for drawing the data. Layouts are a preprocessing step that formats your data so that it’s ready to
be displayed in the form you’ve chosen.
Layouts: A layout is a function that modifies a dataset for graphical representation. A layout
encapsulates a strategy for laying out data elements visually, relative to each other. Ex. histogram, pie
chart, stack, Sankey, word cloud. Layouts take a set of input data, apply an algorithm or heuristic, and
output the resulting positions/shapes for a cohesive display of data. Layouts typically operate across a
collection of data as a whole, rather than individually. Layout instances are sometimes functions that can
be configured and then applied to a set of data (other times, specific methods or event handlers are used
for data input and position output).
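As a sketch of "take input data, apply an algorithm, output positions", here is a hand-written pie layout. It is not D3's implementation; pieLayout and the field names are assumptions for illustration. Note that it works on the collection as a whole, since each slice depends on the total.

```javascript
// Decorates each datum with the start and end angle of its pie slice.
function pieLayout(data, valueOf = (d) => d.value) {
  const total = data.reduce((sum, d) => sum + valueOf(d), 0);
  let angle = 0;
  return data.map((d) => {
    const startAngle = angle;
    angle += (valueOf(d) / total) * 2 * Math.PI;
    return { data: d, startAngle, endAngle: angle };
  });
}

const slices = pieLayout([
  { label: "a", value: 1 },
  { label: "b", value: 1 },
  { label: "c", value: 2 },
]);

// Slice "c" holds half of the total, so it spans half of the circle.
console.log(slices.map((s) => [s.data.label, s.startAngle, s.endAngle]));
```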
—--- Quiz 1 ^^
Creating Layouts: Design (default arrangement, user defined parameters, resizing and size definition),
Implementation (initialization, getter/setter functions, data processing, elements positioning), Testing
(using data sets of various sizes to check visual appearances), and Extending (providing additional
functionality and customization).
Discussion: To make code more reusable, follow the two patterns that already exist in D3, layouts and
components: components create graphical elements, like the axis component. Layouts decorate data for
the purpose of drawing, like pie chart layout. Plugins follow a getter/setter pattern popular with D3 that
allows people to use method-chaining. In making layouts and generators, use the call functionality in D3
by passing a <g> element to the rendering function.
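The getter/setter pattern with method chaining can be sketched as follows. barLayout and its width and padding options are hypothetical; the point is only the shape of the pattern: each accessor returns the current value when called without arguments and returns the layout itself when called with one.

```javascript
function barLayout() {
  let width = 100;
  let padding = 2;

  // The layout itself is a function applied to the data.
  function layout(values) {
    const step = width / values.length;
    return values.map((v, i) => ({ x: i * step, width: step - padding, value: v }));
  }

  // Getter when called with no argument, chainable setter otherwise.
  layout.width = function (w) {
    if (!arguments.length) return width;
    width = w;
    return layout;
  };

  layout.padding = function (p) {
    if (!arguments.length) return padding;
    padding = p;
    return layout;
  };

  return layout;
}

const bars = barLayout().width(300).padding(10);
console.log(bars.width());      // 300
console.log(bars([4, 8, 15]));  // three bars at x = 0, 100, 200, each 90 wide
```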
Geospatial Data: Map (takes coordinates on the globe and projects them onto a flat surface), Choropleth
(coloring regions in the map according to the data), Cartogram (distort the regions according to data), and
Cartogram Heatmap (simplify each region in a cartogram into a square).
Uncertainty: Error Bars (indicate the range of likely values for some estimate or measurement,
horizontally and/or vertically), and Graded error bars (multiple ranges at the same time, where each
range corresponds to a different degree of confidence).
Hierarchical Visualization: Understanding hierarchical data principles, learning about circle packs, using
dendrograms, working with treemaps, and employing partitions.
Ex. Circle Packing (Nodes are sized by value. Circle packs don’t use space efficiently, and encoding a
numerical value with radius is not efficient. The best use is when leaf nodes map well to individual things of
the same type that we don’t think of as varying in size.)
Ex. Dendrogram (Shows each node using the same symbology. The use of lines to demonstrate
connections between the nodes gives more visual structure to the lineage rather than the links or
the nodes separately.)
Ex. Partition (No space is wasted on links, and the value of each node is encoded in the length of the
node. Easier to evaluate the numerical difference between the nodes. Useful to quickly and effectively
measure the values encoded in the nodes).
Ex. Treemaps (It is difficult to evaluate the area of rectangles and understand the value mapped to that
area. Good for numerical hierarchical data and comparison of rough value and aggregated value across
categories. Demographic data)
Scalable Vector Graphics (SVG): A text-based image format that provides high quality, scalable web graphics.
Has a structure similar to HTML: DOM objects, property attributes, and absolute positions.
OpenGL (Open Graphics Library): OpenGL controls whatever hardware is in use; the programmer calls its
functions instead of controlling the hardware directly. A collection of routines that the programmer can call
along with a model of how the routines work together to produce graphics (2D and 3D graphics). Not
platform-specific. The API only deals with rendering graphics.
OpenGL Primitives: low-level graphics API that provides access to graphics hardware features. Available
primitives (points, lines, triangles).
OpenGL Pipeline: Sends vertex and image data, configuration and state changes, and rendering
commands to OpenGL. Vertices are processed into primitives and rasterized into fragments. Then they are
merged into a frame buffer. Useful for identifying exactly what work application must perform to
generate the results. Allows customization of each stage of the graphics pipeline through customized
shader programs.
OpenGL ES: The basis of WebGL, the standard for 3D graphics on the Web and a part of the HTML5 family of
technologies. There are several open source JavaScript toolkits.
Pipeline Stages: Transformations, lighting, clipping, texturing, environmental effects, etc. on large
datasets. The size of data and the complexity of the calculations performed on it can impact performance.
Application design should balance the work performed in each pipeline stage to the capabilities of the
renderer.
Fragment Shader: The fragment shader implements a general-purpose programmable method for
operating on fragments (pixels).
Polygons: OpenGL displays only triangles, which are simple (edges don’t cross), convex (all points on line
segments between two points in a polygon are also in the polygon), and flat (all vertices are in the same
plane).
Data-Driven Documents (D3): Mike Bostock 2011, open source, lightweight JavaScript Library for
manipulating web documents and creating custom interactive data based web visualizations without
predefined forms or specific features. Data driven, based on web standards, support for DOM
manipulation, dynamic properties, dynamic, data driven, element creation and manipulation, custom
visualizations, no predefined formats, and interactions, animations, and transitions.
React (D3): React acts as a wrapper for D3 code and uses it in the render method. The D3-centric approach
creates a container element in React and puts the D3 code in useEffect. Use D3 code in the render method to
create DOM elements. Use D3 in useEffect to update style.
Using D3: A ‘glue’ that connects HTML5, DOM, and CSS. It provides wrappers for the JavaScript DOM API.
First step when writing a D3 program is to select a DOM object or a collection of objects. Once the
selection is created, it can be modified (its attributes/styling can be modified), children elements can be
added or removed. The statements are usually in the form of chained functions. Data can be used to
inform modification and add/removal. An SVG element can be used for graphics/visualization.
Function of Data: DOM manipulation functions can take as a parameter either a constant value or a function
of data. These anonymous functions of data (list/array) are called when using data binding
(described next). An arrow function expression could be used. In addition to the data item and index
parameters, this refers to the current DOM object (in a regular function; arrow functions do not bind their
own this).
Data Binding Functions: data (binds elements to data), join (enters/updates/exits elements based on data),
enter (gets the enter selection (data missing elements)), exit (gets the exit selection (elements missing
data)), and datum (gets or sets element data (without joining)). Calling data creates three arrays: enter (data
without corresponding DOM elements), update (DOM elements mapped to data), and exit (DOM elements
now missing data).
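How the three arrays are formed can be sketched conceptually without D3. This is not D3's code: computeJoin is a hypothetical helper, existing elements are represented only by their keys, and matching is done by key (D3 matches by index unless a key function is supplied).

```javascript
// boundKeys: keys of elements already on the page; data: the new dataset.
function computeJoin(boundKeys, data, keyOf) {
  const bound = new Set(boundKeys);
  const incoming = new Set(data.map(keyOf));
  return {
    enter: data.filter((d) => !bound.has(keyOf(d))),   // data without an element
    update: data.filter((d) => bound.has(keyOf(d))),   // data matched to an element
    exit: boundKeys.filter((k) => !incoming.has(k)),   // elements without data
  };
}

const joined = computeJoin(
  ["a", "b", "c"],
  [{ id: "b" }, { id: "c" }, { id: "d" }],
  (d) => d.id
);

console.log(joined.enter);  // [ { id: 'd' } ]
console.log(joined.update); // [ { id: 'b' }, { id: 'c' } ]
console.log(joined.exit);   // [ 'a' ]
```

A renderer would then append elements for enter, modify those in update, and remove those in exit.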
D3 and JSON: Data arrays can contain JavaScript objects and JSON objects. JavaScript object literal:
{html: "Level 1", class: "level1"}. JSON object (also a valid JavaScript object literal): {"html": "Level 1",
"class": "level1"}. Helper functions: keys (lists the keys of an associative array), values (lists the values
of an associative array), and entries (lists the key-value entries of an associative array).
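The same three listings are available in standard JavaScript through Object.keys, Object.values, and Object.entries, which this sketch uses in place of any D3 helper. Note that the built-in entries come back as [key, value] pairs.

```javascript
// A JavaScript object literal and the equivalent JSON text.
const literal = { html: "Level 1", class: "level1" };
const parsed = JSON.parse('{"html": "Level 1", "class": "level1"}');

const keys = Object.keys(parsed);       // property names
const values = Object.values(parsed);   // property values
const entries = Object.entries(parsed); // [key, value] pairs

console.log(keys);    // [ 'html', 'class' ]
console.log(values);  // [ 'Level 1', 'level1' ]
console.log(entries); // [ [ 'html', 'Level 1' ], [ 'class', 'level1' ] ]
```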
Large Data Sets: Large datasets often require using canvas to render them to maintain performance. For
interactivity pair an SVG layer with the canvas layer and deal with activating and deactivating them in
interaction functions. There are many different views and visualization techniques. Check the available
implementations (D3 examples gallery) and use them.
Mixed Mode Rendering: using built-in canvas rendering for D3 shapes. Creating large random datasets of
multiple types. Using canvas drawing in conjunction with SVG to draw large datasets. Optimizing
geospatial, network, and traditional data visualization. Working with quadtrees to enhance spatial search
performance. A key characteristic is interactivity with large data sets along with good graphics
performance. Pair an SVG layer with the canvas layer and deal with activating and deactivating them in
interaction functions. It is implemented by placing interactive SVG elements below a <canvas> element; this
requires that the <canvas> has its pointer-events style set to none, even if it has a transparent background,
in order to register click events on the <svg> element underneath it.
Types of Visualization: Scientific Visualization: The use of interactive visual representation of specific
data, typically physically based, to amplify cognition. Information Visualization: the use of interactive
visual representations of abstract, non-physically based data to amplify cognition. Will focus on
information visualization and some basic types of charts/plots/graphs.
Visualization Stages: The collection and storage of data. Preprocessing / transforming the data into
something that is easier to manipulate. Mapping from the selected data to a visual representation. The
human perceptual and cognitive system (the perceiver).
Information Visualization User Tasks: Overview (provide an overview of the entire collection of data),
Zoom (drill down from the abstract view to the detail view of items of interest), Filter (eliminate
unnecessary or unimportant details), Details on Demand (select an item or group and retrieve additional
information as needed), Relate (view relationships among items), History (maintain a history of
actions to support undo, replay, and refinement), Extract (extract sub-collections and query
parameters).
Coordinated Multiple Views (CMV): Interactive visual analysis and, at the same time, an active area of
research. Several visualization techniques are combined and applied on collected data to make it easier to
find causal relationships and uncover unforeseen connections. We use linking and brushing to combine
different visualization methods to overcome the shortcomings of single techniques.
Linking: Helps us show how a point, or set of points, behaves in each of the views. Accomplished by
selecting, highlighting, emphasizing these points. The selected points could be drawn as a filled circle
while the remaining points could be drawn as unfilled circles.
Brushing: Selecting a subset of the data points with an input device (an interaction technique). Extends the
concept of linking. Interactively selected (by mouse) and all views are dynamically updated. Selecting a
region of points in one view results in those points reflected in the other views. Selection in Editor and
BarChart of Project 1 is an example with the App State and Event Handlers.
Web Storage API: Provides mechanisms by which browsers can store key/value pairs in a much more
intuitive fashion than using cookies. Two mechanisms: sessionStorage (maintains a separate storage area
for the duration of the page session) and localStorage (does the same thing but persists even when the
browser is closed and reopened; it is cleared only through JavaScript or by clearing the browser's stored data).
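A minimal sketch of storing and restoring a selection through the getItem/setItem calls that both sessionStorage and localStorage expose. The MemoryStorage class, the "selection" key, and the helper names are hypothetical; the stand-in class exists only so the example also runs outside a browser.

```javascript
// In-memory stand-in with the same getItem/setItem/removeItem calls as the
// browser's Storage objects (hypothetical helper for running outside a browser).
class MemoryStorage {
  constructor() {
    this.items = new Map();
  }
  getItem(key) {
    return this.items.has(key) ? this.items.get(key) : null;
  }
  setItem(key, value) {
    this.items.set(key, String(value)); // Web Storage keeps strings only
  }
  removeItem(key) {
    this.items.delete(key);
  }
}

// Structured values go through JSON because only strings are stored.
function saveSelection(storage, selectedIds) {
  storage.setItem("selection", JSON.stringify(selectedIds));
}

function loadSelection(storage) {
  const raw = storage.getItem("selection");
  return raw === null ? [] : JSON.parse(raw);
}

// In a browser, pass window.localStorage or window.sessionStorage instead.
const storage = new MemoryStorage();
saveSelection(storage, [2, 5]);
const restored = loadSelection(storage);
```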
Computer Graphics: Combining models (a model of the objects, a model of the light, a geometric model,
and a mathematical model) to produce a representation of a particular view of the scene.
Virtual Camera: Looks at a scene from a specific location, and with some orientation or attitude. We can
create a coordinate system whose origin is at the center of the camera, whose z-axis points opposite the
view direction, and whose x- and y-axes point to the right and to the top of the camera, respectively.
Texture: Vertices of each of the six faces of the die (shown in an exploded view) are assigned texture
coordinates (a few are indicated by the arrows in the diagram). Image is then used to determine the
appearance of each face of the die as if the texture were a rubber sheet stretched onto the face.
Perspectives: Computer graphics can be viewed from several perspectives: its applications, the various
fields that are involved in the study of graphics, some of the tools that make the images produced by
graphics so effective, some numbers that help convey the scale at which computer graphics works, and the
elementary ideas required to develop a graphics program.
QUIZ 2 — ^^^
GUI component: An element that displays an information arrangement changeable by the user and
provides a single interaction point for a given kind of data. GUI components are basic visual building
blocks which, combined in an application, hold all the data processed by the application and the available
interactions on this data. To appear, every GUI component must be part of a containment hierarchy. A
layout manager determines the size and position of the components within a container. Uses computer
graphics to display individual GUI components and the overall GUI.
Containment Hierarchy: A tree of components that has a top-level container as its root.
Graphics Support: Most GUI frameworks include support for graphics, which includes 2D drawing
primitives, and more and more frameworks support 3D drawing. Support can be provided through wrapper
classes for an existing graphics library (OpenGL) or through custom framework classes and services. Support
classes usually include a Canvas (provides a rectangular area where drawing takes place) and a Graphics
context (maintains the state (attributes) for drawing).
Drawing Primitives: In low-level graphics libraries, such as OpenGL, the basic drawing shapes are points,
lines, and triangles (everything else is done in shaders, making graphics programming very involved).
Graphics libraries in GUI frameworks usually provide a layer of abstraction with a richer set of
drawing primitives that are easier to use (the set also includes rectangle, oval, polygon,
text, etc.; a single method draws a primitive of a given size at a specific location).
Graphic Tools: Both software and hardware. Hardware tools: include video monitors, graphics cards, and
printers that display graphics. They also include input devices such as a mouse, data glove, or trackball
that let users point to items and draw figures. Software tools: the operating system, editor, compiler, and
debugger you commonly use. Graphics routines: functions to draw a simple line or circle, functions to
manage windows with pull-down menus, input, and dialog boxes, and functions that allow the programmer to
set up a camera in a 3D coordinate system and take snapshots of objects.
Device Independent Graphics: Device-independent graphics libraries allow the programmer to use a
common set of functions within an application and to run the same application on a variety of systems
and displays. A graphics API provides access to library functionality. Most graphics APIs
support limited primitives: points, line segments, polygons, and some curves and surfaces.
Immediate and Retained-Mode: Immediate Mode (Quartz, AWT): a thin layer providing efficient access to
graphics output devices. Retained Mode: uses a scene graph, a special-purpose database containing the
scene description/representation (using templates, UI controls, look and feel, and layout managers).
Color: A perceptual phenomenon; the eye's response is approximately logarithmic. Intensity is independent of hue. Degree of
saturation of a color is independent of intensity and hue.
Three-color Theory: The human eye has two types of sensors: rods (monochromatic, night vision) and cones (color
sensitive; there are three types of cones, so only three values are sent).
Primary Colors: Red, green, and blue occupy much of this horseshoe shape. Associated with monospectral
lights of equal brightness. Color can be used to code data; color-usage rules are based on
physiological rather than aesthetic considerations, so apply color conservatively. Color models: RGB (red, green, blue),
HSL (hue, saturation, lightness), HSV (hue, saturation, value), CMYK (cyan, magenta, yellow, and
black).
Math Colors: Color components are stored separately in the frame buffer, usually 8 bits per component.
Additive Color: formed by adding amounts of three primaries (RGB); used by LCDs, projection systems, and
positive films. Subtractive Color: formed by filtering white light with CMY filters.
Interpolating Colors: For colors that are very similar, almost any scheme will work, including interpolating
the RGB coefficients. No single scheme is right for all circumstances.
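Interpolating the RGB coefficients can be sketched as a per-channel linear blend. The function name interpolateRgb and the [r, g, b] array representation are assumptions for illustration; this is only one of the possible schemes mentioned above.

```javascript
// Linear interpolation of each RGB channel; t = 0 gives `from`, t = 1 gives `to`.
// Channels are 8-bit values (0-255) and the result is rounded to integers.
function interpolateRgb(from, to, t) {
  return from.map((channel, i) => Math.round(channel + (to[i] - channel) * t));
}

const midGray = interpolateRgb([0, 0, 0], [255, 255, 255], 0.5);   // [128, 128, 128]
const quarterWay = interpolateRgb([255, 0, 0], [0, 0, 255], 0.25); // [191, 0, 64]
```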
Drawing Order: SVG has no "layer" or depth concept and no z-index; the order in which elements are coded
determines their drawing order (later elements are painted on top of earlier ones). A <style> tag can be included within the svg element.
SVG: Scalable (supports different display resolutions), Vector (geometric objects (lines)), and Graphics
(structured description of vector and mixed raster graphics). SVG is a text-based image format, similar in
structure to HTML, exposed as DOM objects, and rendered in retained mode.
SVG Graphics: Supports three fundamental types of graphics: shapes, text, and raster images (a raster image represents an
array of values that specify the paint color and opacity at a series of points).
Basic Syntax: elements and attributes in lower case since XML is case-sensitive. Svg element is a
drawing canvas for SVG drawings. Visual elements are included within the svg element (rect, circle, text,
etc.). Attribute values must be placed inside quotes.
viewBox: The viewBox attribute allows you to specify that a given set of graphics stretch to fit a particular
container element. The value is a list of four numbers min-x, min-y, width and height, separated by
whitespace and/or a comma, which specify a rectangle in user space which should be mapped to the
bounds of the viewport established by the given element, taking into account aspect ratio. Negative values
for width or height are not permitted and a value of zero disables rendering of the element.
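A sketch of how a viewBox rectangle can be mapped onto a viewport while taking the aspect ratio into account. It assumes the default preserveAspectRatio behavior (xMidYMid meet: uniform scale, content centered); the function name and object shapes are made up for illustration.

```javascript
// Maps a user-space point to viewport pixels, assuming uniform scaling
// with centering (the default "xMidYMid meet" aspect-ratio handling).
function viewBoxToViewport(point, viewBox, viewport) {
  // Use the smaller scale so the whole viewBox fits inside the viewport.
  const scale = Math.min(viewport.width / viewBox.width, viewport.height / viewBox.height);
  // Center the scaled viewBox in the leftover space.
  const offsetX = (viewport.width - viewBox.width * scale) / 2;
  const offsetY = (viewport.height - viewBox.height * scale) / 2;
  return {
    x: (point.x - viewBox.minX) * scale + offsetX,
    y: (point.y - viewBox.minY) * scale + offsetY,
  };
}

// viewBox="0 0 100 50" shown in a 200 x 200 viewport
const wideBox = { minX: 0, minY: 0, width: 100, height: 50 };
const square = { width: 200, height: 200 };
const center = viewBoxToViewport({ x: 50, y: 25 }, wideBox, square); // { x: 100, y: 100 }
```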
Basic Shapes: Rect, circle, ellipse, line, polyline, polygons. These shape elements are equivalent to a path
element. They may be stroked, filled and used as clip paths. All of the properties available for path
elements also apply to basic shapes.
Paths: A path represents the outline of a shape, which can be filled or stroked. A path can also be used as a clipping
path, to describe animation, or to position text. Paths are defined in SVG using the path element, whose data contains
moveto, lineto, curveto, arc, and closepath commands.
Moveto: M/m starts a new sub-path at (x, y); M uses absolute and m uses relative coordinates. If a moveto is followed by multiple
coordinate pairs, the subsequent pairs are treated as implicit lineto commands.
Closepath: Z/z closes the current subpath by connecting it back to the current subpath’s initial point. Z and
z have the identical effect.
Lineto: L/l draws a line from the current point to the given (x, y) coordinates, which become the new
current point. L indicates absolute coordinates, l indicates relative coordinates. H/h draws a horizontal line and V/v draws a
vertical line from the current point.
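The moveto, lineto, and closepath commands can be combined into path data with a small helper. The name pathFromPoints is hypothetical; the sketch only uses the absolute M, L, and Z commands.

```javascript
// Builds the "d" attribute value for an SVG <path> from a list of [x, y] points.
// The first point becomes a moveto (M), the rest become linetos (L),
// and an optional closepath (Z) connects back to the first point.
function pathFromPoints(points, close = false) {
  const commands = points.map(([x, y], i) => `${i === 0 ? "M" : "L"} ${x} ${y}`);
  if (close) {
    commands.push("Z");
  }
  return commands.join(" ");
}

const triangle = pathFromPoints([[0, 0], [10, 0], [10, 10]], true);
// triangle === "M 0 0 L 10 0 L 10 10 Z"
```

The resulting string can be assigned to the d attribute of a path element.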
Math Modeling: Right-handed systems (x-axis to right, y-axis is up, z-axis is backwards).
Transformations (used to scale, translate, rotate, reflect and shear shapes and objects. Transforms its
vertices, can use both algebraic and matrix notations).
Windows and Viewports: Use natural (world) coordinates for what we are drawing. A graphics library converts
our coordinates to screen coordinates when we set up a screen window and a viewport. The viewport may be
smaller than the screen window. The default viewport is the entire screen window. Conversion requires
scaling and shifting to map the world window to the viewport within the screen window. Windows are described by left,
top, right, bottom (w.l, w.t, w.r, w.b). Viewports have the same kinds of values (v.l, v.t, v.r, v.b). Mapping should be proportional.
Mapping Rectangles: We can map any aligned rectangle to any other aligned rectangle. If the aspect ratios
of the two rectangles are not the same, distortion will result.
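A proportional mapping between two aligned rectangles can be sketched as a scale and shift per axis. The l, t, r, b field names follow the left/top/right/bottom convention above; the function name and sample rectangles are assumptions for illustration.

```javascript
// Returns a function that maps world-window points to viewport points.
// sx = a * x + c and sy = b * y + d, chosen so window edges land on viewport edges.
function makeWindowToViewport(win, view) {
  const a = (view.r - view.l) / (win.r - win.l); // horizontal scale
  const c = view.l - a * win.l;                  // horizontal shift
  const b = (view.t - view.b) / (win.t - win.b); // vertical scale (negative when y flips)
  const d = view.b - b * win.b;                  // vertical shift
  return ([x, y]) => [a * x + c, b * y + d];
}

// World window with y pointing up, viewport in pixels with y pointing down.
const toScreen = makeWindowToViewport(
  { l: 0, t: 10, r: 10, b: 0 },
  { l: 0, t: 0, r: 100, b: 100 }
);
const topMiddle = toScreen([5, 10]); // [50, 0]
const bottomLeft = toScreen([0, 0]); // [0, 100]
```

Both rectangles here have the same aspect ratio (1:1), so no distortion is introduced.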
Immediate Mode: Quartz, AWT - thin layer providing efficient access to graphics output devices.
Retained Mode: Uses a scene graph, a special purpose database containing the scene
description/representation (UI controls (widgets), look and feel, layout managers, and not often used for
3D).
Homogeneous Coordinates: Define a point in a plane using three coordinates instead of two. A 2D point (x, y) is
expressed in a 2D homogeneous coordinate system as (x, y, 1) or (2x, 2y, 2) or in general (w*x, w*y, w) with w nonzero.
A 2D homogeneous point can be transformed into P(x’, y’, 1).
2D Translation: Algebraic Notation: x’ = x + t_x, y’ = y + t_y, 1 = 1. Matrix Notation (column vectors, matrix rows
separated by semicolons): [x’ y’ 1]^T = [1 0 t_x; 0 1 t_y; 0 0 1] [x y 1]^T.
2D Scaling: Scaling relative to a point (p_x, p_y): translate (p_x, p_y) to the origin (t_x = -p_x, t_y = -p_y),
scale, then translate back (t_x = p_x, t_y = p_y).
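The translate, scale, translate-back sequence can be written as a product of 3x3 homogeneous matrices. This is a minimal sketch using plain nested arrays and column vectors; all function names are made up for illustration.

```javascript
// 3x3 homogeneous matrices stored as arrays of rows.
const translate = (tx, ty) => [[1, 0, tx], [0, 1, ty], [0, 0, 1]];
const scale = (sx, sy) => [[sx, 0, 0], [0, sy, 0], [0, 0, 1]];

// Matrix product a * b: result[i][j] = sum over k of a[i][k] * b[k][j].
function multiply(a, b) {
  return a.map((row) => b[0].map((_, j) => row.reduce((sum, v, k) => sum + v * b[k][j], 0)));
}

// Applies m to the point (x, y, 1) and divides by w to return a 2D point.
function applyToPoint(m, [x, y]) {
  const [xh, yh, w] = m.map((row) => row[0] * x + row[1] * y + row[2]);
  return [xh / w, yh / w];
}

// Scaling about (px, py): translate to the origin, scale, translate back.
// With column vectors the rightmost matrix is applied first.
function scaleAbout(sx, sy, [px, py]) {
  return multiply(translate(px, py), multiply(scale(sx, sy), translate(-px, -py)));
}

const doubleAboutOneOne = scaleAbout(2, 2, [1, 1]);
const moved = applyToPoint(doubleAboutOneOne, [2, 2]); // [3, 3]
const fixed = applyToPoint(doubleAboutOneOne, [1, 1]); // [1, 1], the pivot stays put
```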
Canvas: An HTML element used to draw graphics using scripting. Creates a fixed size drawing surface
that exposes one or more rendering contexts, which are used to create and manipulate the content shown.
It is initially blank, and to display something a script first needs to access the rendering context and draw
on it. The canvas element has a DOM getContext function, used to obtain the rendering context and its
drawing functions: takes one parameter, the type of context, for WebGL, the context is “webgl”.
Canvas versus SVG: Unlike SVG, canvas does not include a hierarchy of objects (it is not retained mode).
Drawing with canvas produces higher performance, but slightly less crisp graphics. There are also no
elements (such as <path>) to attach mouse events to. Instead, handlers are attached to the canvas itself, and the canvas
coordinates of the mouse event (pixel) must be mapped to the corresponding drawn element.
Coordinate System: OpenGL is right-handed, x-axis to right, y-axis is up, z-axis is backwards.
Transformations: Let T be a new transformation matrix and let C be the current transformation. The new
transformation C’ will be C’ = CT. Thus, to display the model object we transform its coordinates in
the vertex shader.
Normalized Device Coordinates (NDC): Vertices are transformed step by step through several coordinate
systems before finally being transformed to NDC. There are five coordinate systems
(local, world, view, clip, and screen space).
Interactions with canvas: The graphics API has no knowledge of input or output; the canvas element provides a wrapper for
connections with the rest of the GUI. Resize, mouse, and other events are handled by the canvas and used to
update graphics. We need to map from the screen coordinate system to the world coordinate system.
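A sketch of the screen-to-world mapping for a mouse event. It assumes the event supplies clientX/clientY and that rect has the left, top, width, and height fields returned by the canvas element's getBoundingClientRect(); the world rectangle and the function name are hypothetical.

```javascript
// Converts mouse client coordinates to world coordinates.
// rect: position and size of the canvas on the page.
// world: the world rectangle shown in the canvas (left/right/top/bottom).
function clientToWorld(clientX, clientY, rect, world) {
  const u = (clientX - rect.left) / rect.width;  // 0 at the left edge, 1 at the right edge
  const v = (clientY - rect.top) / rect.height;  // 0 at the top edge, 1 at the bottom edge
  return {
    x: world.left + u * (world.right - world.left),
    y: world.top + v * (world.bottom - world.top),
  };
}

// Using the NDC square as the world: x and y run from -1 to 1, y pointing up.
const ndc = { left: -1, right: 1, top: 1, bottom: -1 };
const canvasRect = { left: 10, top: 20, width: 200, height: 100 };
const middle = clientToWorld(110, 70, canvasRect, ndc); // { x: 0, y: 0 }
const topLeft = clientToWorld(10, 20, canvasRect, ndc); // { x: -1, y: 1 }
```

In a browser the function would typically be called from a mouse event listener registered on the canvas.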
Network Visualization: Creating adjacency matrices and arc diagrams, using the force-directed layout,
using constrained forces, representing directionality, and adding and removing network nodes and edges.
HW4 Usage - The HTML file contains vertex and fragment shaders and initially the body is empty. In the
JavaScript file in the init function canvas and SVG elements are created and added to the body using D3.
SVG element is used to add text (title). The canvas element is used for rendering the line and the
rectangle. The vertices array contains points used to specify the line (vertex 0 and 1) and the rectangle
(vertices 2-5). A transformation matrix (scaling in x and y, translation in x and y) is used to transform
from the SVG coordinate system to the WebGL Normalized Device Coordinate system. The rectangle is
drawn as a triangle fan.
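The scaling-plus-translation matrix and the triangle fan can be sketched as follows. This is not the homework's code: the function names are hypothetical, and it assumes SVG coordinates run from (0, 0) at the top left to (width, height) at the bottom right.

```javascript
// Matrix that maps SVG coordinates (origin top-left, y down) to NDC
// (x and y in [-1, 1], y up): scale by 2/width and -2/height, then shift.
function svgToNdcMatrix(width, height) {
  return [
    [2 / width, 0, -1],
    [0, -2 / height, 1],
    [0, 0, 1],
  ];
}

// Applies a 3x3 matrix to the homogeneous point (x, y, 1).
function transformPoint(m, [x, y]) {
  return [m[0][0] * x + m[0][1] * y + m[0][2], m[1][0] * x + m[1][1] * y + m[1][2]];
}

// Lists the triangles a fan produces: every triangle shares the first vertex.
function fanToTriangles(first, count) {
  const triangles = [];
  for (let i = 1; i < count - 1; i += 1) {
    triangles.push([first, first + i, first + i + 1]);
  }
  return triangles;
}

const toNdc = svgToNdcMatrix(400, 200);
const ndcCenter = transformPoint(toNdc, [200, 100]); // center of the SVG area
const rectangleTriangles = fanToTriangles(2, 4);     // vertices 2-5 -> [[2, 3, 4], [2, 4, 5]]
```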
MERN Stack: MongoDB (document-oriented NoSQL database used to store the application data), Node
(JS runtime environment used to run JavaScript on a machine outside a browser), Express (framework
layered on top of Node used to build the backend of a site (web server) using Node functions and
structures), React (GUI framework used to build the user interface of the single-page web application).
Architecture: User interacts with the React UI components at the application front-end residing in the
browser. This frontend is served by the application backend residing in a server, through Express running
on top of Node. Any interaction that causes a data change request is sent to the Node based Express
server. The Express server retrieves data from the MongoDB database if required, and returns the data to
the frontend of the application. Data is presented to the user.
MongoDB:
A document database: A record in MongoDB is a document, which is a data structure composed of field
and value pairs.
MongoDB documents are similar to JSON objects. MongoDB stores documents in collections. The values
of fields may include other documents, arrays, and arrays of documents.
The advantages of using documents are: Documents (i.e., objects) correspond to native data types in many
programming languages. Embedded documents and arrays reduce need for expensive joins. Dynamic
schema supports fluent polymorphism
Express Middleware: Middleware functions have access to the request object (req), the response object
(res), and the next function in the application’s request-response cycle. The next function is a function in
the Express router which, when invoked, executes the middleware succeeding the current middleware.
Middleware functions can perform the following tasks: Execute any code. Make changes to the request
and the response objects. End the request-response cycle. Call the next middleware in the stack.
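The role of next can be illustrated with a simplified model of a middleware stack in plain JavaScript. This is not Express itself: runMiddleware and the plain req/res objects are stand-ins invented for the sketch so it runs without installing anything.

```javascript
// Simplified model of a middleware stack (not Express's implementation).
// Each middleware gets (req, res, next); calling next() runs the following one.
function runMiddleware(stack, req, res) {
  let index = 0;
  function next() {
    const middleware = stack[index];
    index += 1;
    if (middleware) {
      middleware(req, res, next);
    }
  }
  next();
}

const req = { path: "/datasets", trace: [] };
const res = { finished: false, body: null };

runMiddleware(
  [
    // Executes code, then passes control on.
    (request, response, next) => { request.trace.push("logger"); next(); },
    // Changes the request object, then passes control on.
    (request, response, next) => { request.user = "guest"; request.trace.push("auth"); next(); },
    // Ends the request-response cycle by not calling next().
    (request, response) => { response.body = `hello ${request.user}`; response.finished = true; },
    // Never runs, because the previous middleware ended the cycle.
    (request) => { request.trace.push("never reached"); },
  ],
  req,
  res
);
```

A middleware that neither calls next() nor ends the response would leave the request hanging, which is why every middleware does one or the other.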
HTTP:
The Hypertext Transfer Protocol (HTTP) is designed to enable communications between clients and
servers. HTTP works as a request-response protocol between a client and server. The two most common
HTTP methods are: GET and POST: GET is used to request data from a specified resource. POST is used
to send data to a server to create/update a resource.
GET:
BACK button/Reload: Harmless
Bookmarked: Can be bookmarked. Cached: Can be cached. Encoding Type: application/x-www-form-urlencoded.
History: Parameters remain in browser history.
Restrictions on Data Length: Yes. When sending data, the GET method adds the data to the URL, and the
length of a URL is limited in practice (2048 characters is a commonly cited maximum; the actual limit depends on the browser and server).
Restrictions on Data Type: Only ASCII characters allowed.
Security: GET is less secure compared to POST because the data sent is part of the URL. Never use GET
when sending passwords or other sensitive information.
Visibility: Data is visible to everyone in the URL.
POST:
BACK button/Reload: Data will be re-submitted (browser should alert the user that the data are about to
be re-submitted)
Bookmarked: Cannot be bookmarked. Cached: Not cached. Encoding Type: application/x-www-form-urlencoded or multipart/form-data.
History: Parameters are not saved in browser history.
Restrictions on Data Length: no restrictions
Restrictions on Data Type: no restrictions, binary is also allowed.
Security: Safer than GET because parameters are not stored in browser history or web server logs
Visibility: Data is not displayed in the URL.
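The visibility difference between GET and POST can be seen by building the same parameters both ways with the standard URL and URLSearchParams classes. The helper names and the example.com address are placeholders for illustration.

```javascript
// GET: the parameters become part of the URL (the query string).
function buildGet(baseUrl, params) {
  const url = new URL(baseUrl);
  url.search = new URLSearchParams(params).toString();
  return { method: "GET", url: url.toString() };
}

// POST: the URL stays clean and the parameters travel in the request body.
function buildPost(baseUrl, params) {
  return {
    method: "POST",
    url: baseUrl,
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams(params).toString(),
  };
}

const getRequest = buildGet("https://example.com/search", { q: "d3 charts" });
// getRequest.url === "https://example.com/search?q=d3+charts"
const postRequest = buildPost("https://example.com/search", { q: "d3 charts" });
// postRequest.body === "q=d3+charts"
```

Either description could then be handed to fetch(request.url, { method, headers, body }).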
Postman:
A collaboration platform for API development.
Quickly and easily send REST, SOAP, and GraphQL requests directly within Postman.
Can be used to test GET and POST requests.
Free version available
Mongoose:
Installation: npm install mongoose
Everything is derived from a Schema: A schema maps to a MongoDB collection. Defines the shape of the
documents within that collection.
Example:
const schema = new mongoose.Schema({ name: String, size: String });
A model is a class with which we construct documents: constructors derived/compiled from schema
definitions:
Example: const Datasets = mongoose.model('Datasets', schema);
Document is an instance of a class.
Schema: A schema can have an arbitrary number of fields—each one represents a field in the documents
stored in MongoDB.
Examples:
const { Schema } = mongoose; // assumes mongoose has already been required/imported
const schema = new Schema({
  name: String,
  binary: Buffer,
  living: Boolean,
  updated: { type: Date, default: Date.now }, // pass the function itself so it runs per document
  age: { type: Number, min: 18, max: 65, required: true },
  mixed: Schema.Types.Mixed,
  _someId: Schema.Types.ObjectId,
  array: [],
  ofString: [String], // You can also have an array of each of the other types too.
  nested: { stuff: { type: String, lowercase: true, trim: true } },
});
Model
Models are constructors compiled from Schema definitions.
An instance of a model is called a document.
Models are responsible for creating and reading documents from the underlying
MongoDB database.
Calling mongoose.model() on a schema compiles a model:
const schema = new mongoose.Schema({ name: String, size: String });
const Tank = mongoose.model('Tank', schema);
The .model() function makes a copy of schema.
Add everything to schema, including hooks, before calling .model()
Document:
Mongoose documents represent a one-to-one mapping to documents as stored in MongoDB
A document is an instance of a model. Creating and saving to the database:
const Tank = mongoose.model('Tank', yourSchema);
const small = new Tank({ size: 'small' });
await small.save();
// or
await Tank.create({ size: 'small' });
// or, for inserting large batches of documents
await Tank.insertMany([{ size: 'small' }]);
Query
Mongoose models provide several static helper functions for CRUD (Create, Read, Update, and Delete)
operations. Each of these functions returns a mongoose Query object:
Model.deleteMany()
Model.deleteOne()
Model.find()
Model.findById()
React Native: React Native provides a number of built-in Core Components to use in an app:
Basic Components, User Interface, List Views, iOS-specific, Android-specific, Others
There are third party libraries: React Native Community and Expo (TypeScript)
Styling: Styling is done using JavaScript. All of the core components accept a prop named style. The style
names and values usually match how CSS works on the web, except names are written using camel
casing, e.g., backgroundColor. The style prop can be a plain JavaScript object. It can also be an array of
styles: the last style in the array has precedence, which can be used to inherit styles.
As a component grows in complexity, it is often cleaner to use StyleSheet.create to define several styles in
one place: const styles = StyleSheet.create({ container: { marginTop: 50 }, ... });
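The "last style wins" rule for style arrays can be modeled with a plain-JavaScript merge. flattenStyle below is a hypothetical stand-in written for this sketch, not React Native code, and the style objects are made-up examples.

```javascript
// Simplified model of how an array of styles resolves: entries are merged
// left to right, so properties in later entries override earlier ones.
function flattenStyle(style) {
  if (!Array.isArray(style)) {
    return { ...style };
  }
  return style.reduce((merged, entry) => ({ ...merged, ...flattenStyle(entry) }), {});
}

const base = { backgroundColor: "white", marginTop: 50 };
const highlighted = { backgroundColor: "yellow" };

// "Inherits" marginTop from base and overrides backgroundColor.
const resolved = flattenStyle([base, highlighted]);
// resolved.backgroundColor === "yellow", resolved.marginTop === 50
```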
Interactions: Users interact with mobile apps mainly through touch. They can use a combination of
gestures, such as tapping on a button, scrolling a list, or zooming on a map. React Native provides
components to handle all sorts of common gestures, as well as a comprehensive gesture responder system
to allow for more advanced gesture recognition.
Example: Button
<Button
onPress={() => {
alert('You tapped the button!');
}}
title="Press Me"
/>
iOS
The iOS system is based on the same technologies used by Mac OS X, namely the Mach kernel and BSD
interfaces. iOS apps run in a UNIX-based system and have full support for threads, sockets, and many of
the other technologies typically available at that level. However, there are places where the behavior of
iOS differs from that of Mac OS X. To manage program memory, iOS uses essentially the same virtual
memory system found in Mac OS X:
Each program still has its own virtual address space, but unlike Mac OS X, the amount of usable virtual
memory is constrained by the amount of physical memory available.
For security reasons, iOS places each app (including its preferences and data) in a sandbox at install time.
Sandbox
A sandbox is a set of fine-grained controls that limit the app’s access to files, preferences, network
resources, hardware, etc. As part of the sandboxing process, the system installs each app in its own
sandbox directory, which acts as the home for the app and its data. To help apps organize their data, each
sandbox directory contains several well-known subdirectories for placing files.
Design Patterns
Model-View-Controller: this design pattern governs the overall structure of your app.
Delegation: this design pattern facilitates the transfer of information and data from one object to another.
Target-action: this design pattern translates user interactions with buttons and controls into code that your
app can execute. Block objects: use blocks to implement callbacks and asynchronous code.
Sandboxing: all iOS apps are placed in sandboxes to protect the system and other apps.
The structure of the sandbox affects the placement of your app’s files and has implications for data
backups and some app-related features.
Android
Android is a software stack for mobile devices that includes an operating system, middleware and key
applications:
Application framework for reuse and replacement of components.
Dalvik virtual machine optimized for mobile devices.
Integrated browser based on the open source WebKit engine.
Optimized graphics powered by a custom 2D graphics library; 3D graphics based on the OpenGL ES.
SQLite for structured data storage.
Media support for common audio, video, and still image formats (MPEG4, H.264, MP3, AAC, AMR,
JPG, PNG, GIF).
GSM Telephony (hardware dependent).
Bluetooth, EDGE, 3G, and WiFi (hardware dependent).
Camera, GPS, compass, and accelerometer (hardware dependent).
Rich development environment
Application
Android applications are written in Java. The Android SDK tools compile the code, along with any data
and resource files, into an Android package, an archive file with an .apk suffix. Once installed on a
device, each Android application lives in its own security sandbox:
The Android operating system is a multi-user Linux system in which each application is a different user.
By default, the system assigns each application a unique Linux user ID. The system sets permissions for
all the files in an application so that only the user ID assigned to that application can access them.
Each process has its own virtual machine (VM), so an application's code runs in isolation from other
applications. By default, every application runs in its own Linux process.
JavaFX:
JavaFX is an open source, client application platform for desktop, mobile and embedded systems built on
Java. It is a collaborative effort by many individuals and companies with the goal of producing a modern,
efficient, and fully featured toolkit for developing rich client applications.
Application:
A basic JavaFX application has classes from 3 packages:
javafx.application: the application life-cycle classes.
javafx.stage: the top-level container classes for the content.
javafx.scene: the core set of base classes for the content.
Content: a scene graph is a tree data structure, most commonly found in graphical applications and
libraries such as vector editing tools, 3D libraries, and video games.
The JavaFX scene graph is a retained mode API: maintains an internal model of all graphical objects.
It always knows what objects to display, what areas of the screen need repainting, and how to render it all
in the most efficient way.
Use the scene graph API instead of invoking primitive drawing methods directly: the system
automatically handles the rendering. Significantly reduces the amount of application code.
Qt:
Qt is a cross-platform application and UI framework.
It can be used to write web-enabled applications once and deploy them across desktop, mobile and
embedded operating systems without rewriting the source code.
A C++ framework for high performance cross-platform software development.
Provides a rich set of standard widgets that can be used to create graphical user interfaces for applications.
Layout managers are used to arrange and resize widgets to suit the user’s screen, language and fonts.
Current version 6.5.
MUI X Charts
@mui/x-charts is an MIT-licensed library for rendering charts that relies on D3.js for data manipulation and
SVG for rendering.
MUI X Charts provides three levels of customization: single components with great defaults,
extensive configuration props, and subcomponents for flexible composition. Supported charts are Bar Chart,
Line Chart, Pie Chart, Scatter Chart, Sparkline, Gauge, and Heatmap.
Supported Features:
Axis: associates values with element positions (LineChart, BarChart, ScatterChart).
Custom components: Creating custom chart components and managing dimensions.
Legend: UI element mapping symbols and colors to the series' label.
Stacking: displaying the decomposition of values.
Styling: charts customization.
Tooltips and highlights: provides extra data on chart items.
Tooltip: Use built in tooltip functionality. Customize by using context dependent text. Depends on the
type of the chart
Linking: Linking helps us show how a point, or set of points, behaves in each of the views. This is
accomplished by selecting/highlighting/emphasizing these points: For example, the selected points could
be drawn as a filled circle while the remaining points could be drawn as unfilled circles. A typical
application of this would be to show how an outlier shows up in each of the individual pairwise plots.
Brushing: Brushing means selecting a subset of the data points with an input device (an interaction
technique). Brushing extends the concept of linking a bit further. The points to be highlighted are
interactively selected (e.g., by a mouse) and all views are dynamically updated (ideally in real time):
Selecting a region of points in one view results in those points reflected in the other views.
Selection: Editor: use check boxes. Charts: provide a click handler. The click handler is used to update the
selection based on the data point index. The coloring is done through props, based on a data index. Create
a utility function that returns a color based on the current selection. Maintain a list of selected data points.
Selection is synced and works in all views.
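The selection list and the color utility can be sketched in plain JavaScript, independent of any chart library. The function names and the two color values are assumptions chosen for illustration.

```javascript
// Toggles a data point index in the list of selected points
// (what a click handler or a check box handler would call).
function toggleSelected(selected, index) {
  return selected.includes(index)
    ? selected.filter((i) => i !== index)
    : [...selected, index];
}

// Utility that returns a color based on the current selection.
function colorForIndex(index, selected) {
  return selected.includes(index) ? "orange" : "steelblue";
}

let selected = [];
selected = toggleSelected(selected, 2); // [2]
selected = toggleSelected(selected, 0); // [2, 0]
selected = toggleSelected(selected, 2); // [0]

// Every view derives its colors from the same list, which keeps them in sync.
const barColors = [0, 1, 2].map((i) => colorForIndex(i, selected));
// barColors: ["orange", "steelblue", "steelblue"]
```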
Rubber Banding:
A technique used for displaying the segments, rectangles (and other primitives) that change as they are
manipulated interactively. Line segment: Press the mouse button to specify the location of the start point
of the segment. Move the mouse and the current location of the mouse is the end point of the segment.
Release the mouse to remove the line segment.
Rectangle: Press the mouse button to specify the location of the start point of a diagonal of the rectangle.
Move the mouse and the current location of the mouse is the end point of the diagonal of the rectangle.
Release the mouse to remove the rectangle.
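The press, move, release sequence for a rubber-band rectangle can be sketched as a small state holder. The names are hypothetical; in a GUI the three methods would be called from the mouse-down, mouse-move, and mouse-up handlers, and the returned rectangle would be redrawn on every move.

```javascript
// Tracks a rubber-band rectangle between mouse press and mouse release.
function createRubberBand() {
  let start = null;   // start point of the diagonal (set on press)
  let current = null; // end point of the diagonal (follows the mouse)

  // Rectangle spanned by the diagonal, or null when no drag is active.
  function rectangle() {
    if (!start) {
      return null;
    }
    return {
      x: Math.min(start.x, current.x),
      y: Math.min(start.y, current.y),
      width: Math.abs(current.x - start.x),
      height: Math.abs(current.y - start.y),
    };
  }

  return {
    press(x, y) { start = { x, y }; current = { x, y }; },
    move(x, y) { if (start) { current = { x, y }; } },
    // Returns the final rectangle and removes the rubber band.
    release() { const finished = rectangle(); start = null; current = null; return finished; },
    rectangle,
  };
}

const band = createRubberBand();
band.press(10, 10);
band.move(4, 30);
const whileDragging = band.rectangle(); // { x: 4, y: 10, width: 6, height: 20 }
const finalRectangle = band.release();
const afterRelease = band.rectangle();  // null, the rubber band is gone
```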