[UNIT 1] ocr a level computer science notes
1.1 - The characteristics of contemporary processors, input, output & storage devices
Registers
Small memory cells that operate at a very high speed, used to temporarily store data. All arithmetic, logical
and shift operations occur in these registers.
- Program Counter (PC) - Holds the address of the next instruction to be executed
- Accumulator (ACC) - Stores the results from calculations processed by the ALU
- The processor accesses other general-purpose registers where temporary values are stored
while calculations are completed. Any result resides in the accumulator
- Memory Address Register (MAR) - Holds the address of a location that is to be read from or
written to
- Memory Data Register (MDR) - Temporarily stores data that has been read or data that needs to
be written
- Current Instruction Register (CIR) - Holds the current instruction being executed, divided up into
operand and opcode
Buses
Bus - A set of parallel wires which connect two or more components inside the CPU
- The communication channels linking the CPU with the RAM & I/O devices
- 3 buses in the CPU: data bus, control bus & address bus (collectively called the system bus)
The width of the bus is the number of parallel wires the bus has.
- The width of the bus is directly proportional to the number of bits that can be transferred
simultaneously at any given time.
- Buses are typically 8, 16, 32 or 64 wires wide
Data Bus
A bi-directional bus used for transporting data and instructions between components.
- Bi-Directional Bus - A bus where bits can be carried in both directions
Address Bus
Used to transmit the memory addresses specifying where data is to be sent to or retrieved from
- The width of the address bus determines the number of addressable memory locations: n address lines can
select 2^n different locations
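A quick way to see how address bus width relates to addressable memory is to compute it. This is a minimal sketch, assuming each address line carries one bit and each address selects one memory location; the function name is made up for illustration.

```python
def addressable_locations(address_bus_width: int) -> int:
    """Number of distinct addresses an n-line address bus can select (2 ** n)."""
    return 2 ** address_bus_width


# Each extra address line doubles the number of locations that can be addressed.
for width in (8, 16, 32):
    print(width, "lines ->", addressable_locations(width), "locations")
```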
Control Bus
A bi-directional bus used to transmit control signals between internal and external components.
- Coordinates the use of the address and data buses and provides status information between
system components.
Fetch-Decode-Execute Cycle
Fetch
1. The program counter (PC) holds the address of the next instruction to be executed. The contents of
the PC are copied to the memory address register (MAR), which is connected to the address bus.
2. The address of the next instruction to be executed is placed on the address bus
3. Once the address of the instruction is on the address bus, the control unit instructs a memory read
operation to allow the contents of the memory location to be transferred to the processor.
4. The instruction that is stored at that address is transferred using the data bus from the main memory
to the processor, and is saved in the memory buffer register (MBR)/memory data register (MDR).
- Simultaneously, the contents of the program counter (PC) are incremented by one so that
they point to the address of the next instruction that needs to be fetched
5. The contents of the memory data register (MDR) are copied to the current instruction register (CIR).
- This ensures that the current instruction is kept safe so that the memory data register (MDR)
can be used during the execute stage, in order to store additional data that is needed.
Decode
1. The control unit decodes the instruction that is kept in the current instruction register (CIR). This
involves splitting the instruction into operand and opcode to determine what type of instruction
needs to be carried out, checking if additional data are required from memory, and figuring out
where these are kept in main memory.
Execute
1. The instruction is executed.
- The exact sequence of operations depends on the type of instruction that is being executed.
- For example, for an arithmetic instruction, any required data are fetched from the main
memory, then the calculation is executed by the Arithmetic and Logic Unit (ALU), and the
result of the instruction is stored in the accumulator, a general-purpose register, or back into
main memory.
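The cycle above can be sketched as a very small simulator. This is an illustrative model only: the four-instruction set (LDA, ADD, STA, HLT) and the memory layout are hypothetical, and real processors do far more in each stage.

```python
# Hypothetical memory: addresses 0-3 hold instructions, 10-12 hold data.
memory = {
    0: ("LDA", 10),    # load the value at address 10 into the accumulator
    1: ("ADD", 11),    # add the value at address 11 to the accumulator
    2: ("STA", 12),    # store the accumulator at address 12
    3: ("HLT", None),  # stop
    10: 7,
    11: 5,
    12: 0,
}


def run(memory):
    pc, acc = 0, 0
    while True:
        # Fetch: PC -> MAR, memory[MAR] -> MDR, increment PC, MDR -> CIR
        mar = pc
        mdr = memory[mar]
        pc += 1
        cir = mdr
        # Decode: split the instruction into opcode and operand
        opcode, operand = cir
        # Execute: the steps depend on the type of instruction
        if opcode == "LDA":
            mar = operand
            mdr = memory[mar]
            acc = mdr
        elif opcode == "ADD":
            mar = operand
            mdr = memory[mar]
            acc += mdr
        elif opcode == "STA":
            mar = operand
            mdr = acc
            memory[mar] = mdr
        elif opcode == "HLT":
            return acc
        else:
            raise ValueError(f"unknown opcode {opcode}")


final_acc = run(memory)
print(final_acc, memory[12])
```

Note how the MAR and MDR are reused during the execute stage, which is why the instruction is first copied into the CIR.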
Number of Cores
Core - An independent processor that is able to run its own fetch-execute cycle.
- A computer with multiple cores can complete more than one fetch-execute cycle at any given time.
- A computer with dual cores can theoretically complete tasks twice as fast as a computer with a
single core.
- However, not all programs are able to utilise multiple cores efficiently as they have not been
designed to do so, so this is not always possible.
Cache
Cache - Fast, relatively small capacity set of locations that sit close to the processor used to store the most
frequently used instructions & data
- Level 1 (L1) Cache: Part of the circuitry of each core, is the smallest and fastest cache. Very fast
memory cells with a small capacity. (2-64KB)
- Level 2 (L2) Cache: It is slower than L1 cache and tends to be larger & often shared by cores.
Relatively fast memory cell, with a medium sized capacity. (256KB-2MB)
- Level 3 (L3) Cache: Sits on the processor or near it on the motherboard. Much larger and slower
memory cell.
Advantages of Pipelining
- Improved Performance: Pipelining reduces the overall execution time by increasing the output of
tasks.
- Parallel Processing: Tasks are executed concurrently, making efficient use of system resources
and maximising performance.
- Reduced Latency: The pipeline structure minimises idle time between instructions, resulting in
faster data processing
Disadvantages of Pipelining
- Structural Hazards: When there are not enough hardware resources to handle all of the pipelined
instructions (more likely with CISC processors)
- Control Hazards: when a program branches and so the next instruction to be executed is not
necessarily the one that has been fetched
- Data Hazards: When there is an attempt to read a piece of data that has not yet been written by the
previous instruction (more likely with CISC processors)
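The performance benefit of pipelining can be estimated with a little arithmetic. This is a rough sketch, assuming an ideal three-stage pipeline (fetch, decode, execute) where every stage takes one clock cycle and no hazards occur; the function names are made up for illustration.

```python
def cycles_without_pipeline(instructions: int, stages: int = 3) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return instructions * stages


def cycles_with_pipeline(instructions: int, stages: int = 3) -> int:
    """The first instruction fills the pipeline, then one completes per cycle."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1)


print(cycles_without_pipeline(10), cycles_with_pipeline(10))
```

In practice the hazards listed above force the pipeline to stall or be flushed, so the real saving is smaller than this ideal figure.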
Computer Architecture
Von Neumann
Includes the basic components of the computer and processor (single control unit, ALU, registers and
memory units) in which a shared memory & shared data bus is used for both data and instructions.
- Built on the stored program concept (A program must be stored in main memory to be executed &
instructions are fetched one at a time & executed sequentially)
- Includes:
- a Processor
- a Memory unit that can communicate directly with the processor
- Connections for input and output devices
- Secondary storage for saving/backing up data
Harvard
Has physically separate memories for instructions and data, more commonly used with embedded
processors.
- Useful for when memories have different characteristics
- Example: instructions may be read only, while data may be read-write.
- Allows optimisation of the size of individual memory cells & their buses depending on your needs,
- Example: the instruction memory can be designed to be larger so a larger word size can be
used for instructions
Advantages of Von Neumann architecture:
- Cheaper to develop as the control unit is easier to design
- Programs can be optimised in size
Advantages of Harvard architecture:
- Quicker execution as data & instructions can be fetched in parallel
- Memories can be different sizes, which can make more efficient use of space
Contemporary
Contemporary processors use a combination of Harvard and Von Neumann architecture.
- Von Neumann is used when working with data and instructions in main memory
- Harvard architecture is used to divide the cache into instruction cache and data cache.
Direct Comparison
Reduced Instruction Set Computers (RISC)
- Instruction set is made up of a small number of simple, fixed-length instructions
- The compiler has to do more work to translate high level code into machine code
- More RAM is required to store the code
- Pipelining is possible since each instruction takes one clock cycle
- One instruction is executed per clock cycle
Complex Instruction Set Computers (CISC)
- Instruction set is made up of a large number of complex, variable-length instructions
- The compiler has less work to translate high level code into machine code
- Less RAM is required since code is shorter
- Many specialised instructions are made, even though only a few of them are used
- Instructions can take several clock cycles to be executed
Parallel Systems
System that can complete instructions & tasks simultaneously with a single core, by using threading
- Threading - use of multiple threads running at the same time and performing different tasks in a
single program
- Thread - a single sequential flow of control within a program
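Threading within a single program can be sketched with Python's standard threading module. This is an illustrative example only: the worker function and thread names are hypothetical, and the order in which the two threads run is decided by the operating system.

```python
import threading

results = []
lock = threading.Lock()  # protects the shared list


def worker(name, count):
    """One sequential flow of control: record `count` steps under this name."""
    for step in range(count):
        with lock:
            results.append((name, step))


# Two threads performing their tasks within the same program.
threads = [threading.Thread(target=worker, args=(name, 3)) for name in ("A", "B")]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()  # wait for both threads to finish

print(len(results))
```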
Output Devices
A device which can be used to send out information from the computer
- Examples:
- Speakers: Used to output sound from a computer
- Printers: Used to output information from a computer onto paper
- Projector: Used to project video content from a computer onto a screen
Magnetic Tape
Long stretches of tape wound into reels were passed through readers which would check the polarity of the
tape and read off a binary value
- Bulky way to store data
- Improvements made by reducing the width of the tape and writing in diagonal lines as opposed to
horizontal tracks
- However, the technology was superseded by more modern technology like ROM cartridges
Floppy Disks
A thin (usually flexible) magnetic disk enclosed in plastic to protect the disk from dust and dirt.
- Incredibly portable (due to thin size & low weight)
- Good for exchanging small amounts of data
- Typical storage capacity of 1MB (later versions as much as 200MB)
In flash memory, information is stored in pages, which are grouped into blocks.
The preferred logic gate used for storing small quantities of data, such as code to be executed, is NOR.
- For larger files, like photos and videos, NAND is the preferred logic gate.
Blu-Ray
An advancement over DVDs, Blu-ray discs have more than 5x as much storage as traditional DVDs
- Useful for storing high-resolution films
Virtual Storage
Storing information remotely so that it can be accessed by any computer with access to the same system,
for example over the Internet. Also known as ‘cloud storage’
- Examples: Cloud storage services (Google Drive, Microsoft OneDrive, iCloud, etc.)
- Often an abstraction of multiple drives acting like one.
- Information stored in the cloud is actually stored on 100s of hard drives or SSDs formatted to
act as a single piece of storage
- Virtual/Cloud storage is facilitated by the internet, and is usually owned and managed by
commercial organisations
- Therefore, disadvantages include: Inaccessible without internet connection, Accessibility
limited by user’s network & potential costs from companies for more storage if needed
- Virtual storage can be configured so that it automatically synchronises with local
drives.
- This means that there is always a local copy, in case there is no available internet
connection & any changes made locally will be synchronised when the connection is
next available
- However, advantages include: Easily accessible over internet, Can be easily shared with
others, security managed by external companies, no need to buy storage hardware
1.2 - Software & Software Development
Memory Management
Paging
Splitting up memory into equal-sized sections known as pages, with programs being made up of a certain
number of equally-sized pages
- These can then be swapped between main memory and the hard disk as needed when a program
is being run
The logical address space is divided into memory units called pages.
- When a page is loaded into the main memory, it is stored in a page frame,
- Page Frame - A block of sequential addresses that is the same size (has the same number
of addresses) as a page.
- Paging allows memory to be allocated in a non-contiguous manner (pages of the same process do
not need to be stored together, but can be allocated wherever there is free space in the main
memory)
- A page table is used to keep track of which page frame is allocated to each page.
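The role of the page table can be sketched as a simple address translation. This is a minimal model, assuming a page size of 1024 addresses and a made-up page table; real operating systems and memory management hardware are considerably more involved.

```python
PAGE_SIZE = 1024  # assumed number of addresses per page (and per page frame)

# Hypothetical page table: page number -> page frame number.
# Note that the frames are not contiguous or in order.
page_table = {0: 5, 1: 2, 2: 7}


def translate(logical_address: int) -> int:
    """Convert a logical address into a physical address using the page table."""
    page, offset = divmod(logical_address, PAGE_SIZE)
    if page not in page_table:
        raise LookupError(f"page {page} is not in main memory")
    return page_table[page] * PAGE_SIZE + offset


print(translate(1030))  # page 1, offset 6 -> frame 2
```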
Segmentation
The splitting up of memory into logical sized divisions, known as segments, which vary in size.
- These are representative of the structure and logical flow of the program, with segments being
allocated to blocks of code such as conditional statements or loops.
The memory blocks that are allocated to processes are divided into segments of different sizes to fit the
varying memory requirements of each process.
- The segments do not need to be stored contiguously across a fixed address space, and they can be
moved in and out of memory as required.
- The OS tracks the allocation of memory for each process using a segment table, which records
where each segment required for a process is physically located.
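A segment table lookup can be sketched in a similar way to a page table lookup, except that each segment has its own length. This is an illustrative model only: the base addresses and lengths are made up, and the function name is hypothetical.

```python
# Hypothetical segment table: segment number -> (base address, length).
segment_table = {
    0: (4000, 300),  # e.g. a loop
    1: (1000, 120),  # e.g. a conditional block
}


def translate_segment(segment: int, offset: int) -> int:
    """Return the physical address of `offset` within `segment`."""
    base, length = segment_table[segment]
    if not 0 <= offset < length:
        raise IndexError("offset is outside the segment")
    return base + offset


print(translate_segment(1, 20))
```

Because segments vary in size, the offset has to be checked against the length of that particular segment, whereas every page has the same size.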
Virtual Memory
Using a section of the hard drive to act as RAM when the space in main memory is insufficient to store
programs being used.
- Sections of programs that are not currently in use are temporarily moved into virtual memory
through paging, freeing up memory for other programs in RAM.
Interrupts
Role of Interrupts
Interrupt - Signals generated by software or hardware to indicate to the processor that a process needs
attention.
Buffers
Buffer - a small amount of memory inside a device where work currently being handled by the device is
stored
- Buffers are used whenever data needs to be sent or read from a device
- So devices such as printers and hard drives will all have buffers
System Stack
When the CPU gives priority to the interrupt, it keeps track of where it was in the program that it was
executing using the system stack
- It transfers the current contents of the registers to the system stack so that it can resume when the
interrupt has been processed.
- It will then call an interrupt service routine (ISR) to deal with the interrupt
Interrupts can have different priorities, and so some interrupts may themselves be suspended while a
higher priority interrupt is being processed
- When the main process needs to be resumed, the values that were sent to the system stack are
retrieved (last in, first out), and execution resumes
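The way the system stack lets execution resume, even when interrupts are nested, can be sketched as follows. This is a simplified model: the register names, interrupt names and the `service_interrupt` function are hypothetical, and a real ISR would do actual work.

```python
system_stack = []
log = []


def service_interrupt(registers, name, nested=None):
    """Push the registers, run a stand-in ISR, then pop to resume."""
    system_stack.append(dict(registers))      # save the current state
    registers["PC"] = f"ISR:{name}"           # the ISR overwrites the registers
    registers["ACC"] = 0
    log.append(f"servicing {name}")
    if nested is not None:
        # A higher priority interrupt arrives while this one is being serviced.
        service_interrupt(registers, nested)
        log.append(f"resumed {name}")
    saved = system_stack.pop()                # last in, first out
    registers.clear()
    registers.update(saved)


registers = {"PC": 42, "ACC": 9}
service_interrupt(registers, "keyboard", nested="power_fail")
print(registers, log)
```

After both interrupts have been serviced, the registers hold their original values and the stack is empty again.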
Scheduling
Scheduling - Ensuring all sections of programs being run (known as ‘jobs’) receive a fair amount of
processing time by implementing various scheduling algorithms
- Scheduling methods can be:
- Pre-emptive: Jobs are actively made to start and stop by the operating system.
- For example: Multilevel Feedback Queues, Shortest Remaining Time, Round Robin
- Non-pre-emptive: Once a job is started, it is left alone until it is completed.
- For example: First Come First Served, Shortest Job First
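The two non-pre-emptive examples can be compared by working out how long jobs wait on average. This is a minimal sketch, assuming all jobs arrive at the same time and their run times are known in advance; the job lengths are made up.

```python
def average_wait(burst_times):
    """Average waiting time when jobs run to completion in the given order."""
    waited, elapsed = 0, 0
    for burst in burst_times:
        waited += elapsed   # this job waited for everything before it
        elapsed += burst
    return waited / len(burst_times)


jobs = [8, 2, 4]                   # hypothetical run times, in arrival order
fcfs = average_wait(jobs)          # First Come First Served
sjf = average_wait(sorted(jobs))   # Shortest Job First
print(fcfs, sjf)
```

Running the shortest jobs first reduces the average wait, but a long job may be delayed repeatedly if short jobs keep arriving.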
Each process will take a certain amount of time to run, which may be:
- Bound - Will complete within a certain time
- Unbound - Will continue to run until the user finishes (e.g. playing a game or music)
Processing States
Due to the fact that only one process can run at a time on a single core processor (or on any one core of a
multicore processor) any process can be in one of three states:
- Running – the process is being run by the CPU
- Ready to Run – the process is in the queue waiting to be processed
- Blocked – the process is waiting for an I/O operation to complete (e.g. reading data from the hard
drive)
As there can be many processes ongoing at the same time, the OS maintains queues for those processes
in ‘Ready to run’ & ‘Blocked’ states
When a process is in the ‘running’ state, many things can happen to it:
- Complete & Close – if the process has completed, it can remove itself from the scheduler
- Interrupted by the Scheduler (pre-empted) – the scheduling algorithm has determined that the
process has had enough processing time, and so will swap in another process and place the current
process back into the ready to run queue.
- Become Blocked - if a process is waiting for an I/O event, it will be moved immediately to the
blocked queue so that it does not waste processing time
- Give Up CPU Time - some processes will voluntarily give up processing time to allow other
processes time on the CPU
Process Swapping
While a process is in the running state, the program data & instructions will occupy the CPU registers
- When the process is swapped out, this data will need to be saved so that the process can resume
where it left off
Starvation - When a process keeps getting pre-empted by higher priority processes, so that it never
actually gets processor time, or does not get enough processor time to complete.
One of the main purposes of the scheduling algorithm is to prevent either of these situations happening
Round Robin
Each job is given a section of processor time (a time slice)
within which it is allowed to execute
- If the process has not finished when this time is up, it
stops running and the computer switches to the next
process.
- The process that has been suspended will only
be able to resume running when it is next
allocated processor time.
- Round robin is a pre-emptive algorithm as it allows the
OS to remove a process from the CPU before it is completed.
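Round robin can be sketched with a queue of jobs and a fixed time slice. This is an illustrative model only, assuming all jobs are ready at the start and ignoring the cost of switching between processes; the job names and times are made up.

```python
from collections import deque


def round_robin(jobs, time_slice):
    """jobs maps a job name to the time it needs. Returns the order jobs finish in."""
    queue = deque(jobs.items())
    finished = []
    while queue:
        name, remaining = queue.popleft()
        if remaining > time_slice:
            # Time slice used up: pre-empt the job and send it to the back of the queue.
            queue.append((name, remaining - time_slice))
        else:
            finished.append(name)
    return finished


order = round_robin({"A": 5, "B": 2, "C": 4}, time_slice=2)
print(order)
```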
Processes are chosen from different queues based on the priority of each
queue and how much processing time each process has already had:
- Initially, processes are added to a queue with a certain level of priority.
- If a process uses too much CPU time, it is moved to a lower priority queue and if a process has
been idle for a long time, it is moved to a higher priority queue.
- This helps avoid starvation.
- Processes that depend on input/output devices require a lot of processing time, so they are kept in
high priority queues, and processes that are quick to complete are served first.
Distributed systems are useful when processor-intensive tasks need to be completed and a single
processor cannot provide enough computational power on its own.
Embedded
Used in computers which only serve a specific purpose, and are therefore built to perform a small range of
specific tasks, catered towards a specific device
- Provide a reliable platform for specific applications to carry out their processes.
- Often used to provide hardware reliability and ensure efficient use of resources, although this is
usually at the cost of flexibility of the system.
- Likely to be customised for the device on which they are installed, so that they best harness the
resources of the hardware to ensure the device is optimised for the best performance
- Limited in their functionality & hard to update
- Consume significantly less power than other types of OS
Multi-Tasking
Enable the user to carry out tasks simultaneously through the use of time slicing to switch quickly between
programs and applications in memory.
- Specifically designed to enable the management of multiple processes at once so the CPU can be
given the right tasks at the right time to make it appear that all of the programs are running quickly
and simultaneously
Multi-User
Provides the facilities to enable several users to all access a system’s resources at the same time
- Controls the consumption of resources so that users can access the same system at the same time,
without adversely affecting the other users
- Therefore a scheduling algorithm must be used to ensure processor time is shared fairly
between jobs.
- Without a suitable scheduling algorithm, there is a risk of processor starvation
Real-Time
Supports applications that need to process data and produce a particular output within a
guaranteed time period
- Not likely to be as flexible as other operating systems as they need to ensure a consistent response
to inputs within a guaranteed time period.
- Commonly used in time-critical computer systems (e.g. plane autopilot systems, self-driving cars)
Real-time operating systems are optimised to process priority jobs incredibly quickly.
- They will also have ‘fail-safe’ features so that in the event of a piece of hardware failing there is
another component that can quickly be used
The Program Counter register points to the location of the BIOS upon each start-up of the computer as the
BIOS is responsible for running various key tests before the operating system is loaded into memory, such
as:
- POST (Power-on self test) which ensures that all hardware is correctly connected and functional
- Checking the CPU clock, memory & processor are operational
- Testing for external memory devices connected to the computer
BIOS is critical to the computer system as it is only after these checks are completed that the operating
system can be loaded into RAM from secondary storage
BIOS is stored in ROM, but changes, configurations & updates to the original BIOS are stored on the
CMOS (complementary metal-oxide semiconductor), a small chip within the computer.
Device Drivers
Small computer programs which are provided to the operating system
and allow the operating system to interact with peripheral devices
- When a piece of hardware is used, the device driver
communicates this request to the operating system which can
then produce the relevant output
- Specific to the computer’s architecture & operating system, so
different drivers must be used for different device types &
different operating systems
- Drivers extend the functionality of an operating system
When a program requests to interact with a hardware device, it calls a routine within the driver software of
that device, which will cause the device to perform specific tasks.
- The drivers may also receive requests from the device that require the processor to be interrupted.
- The driver will then provide the interrupt service routine to the operating system
Typically, drivers are installed using software routines that are either built into the operating system, or
where necessary, downloaded from the internet.
- If a new device is added to the system, the OS will detect it and try to install the relevant device
driver
- It is able to identify this by querying the device for the vendor ID (VID) and product ID (PID) which
can then be checked against a database of known drivers
Virtual Machines
Virtual Machine - A theoretical computer in that it is a software implementation of a computer system.
- Occurs when a host system runs software that allows other software to behave as if it were running
on a hardware system
- It provides an environment with a translator for intermediate code to run
- Commonly used to create a development environment for programmers to test programs on
different operating systems
- Can be configured to replicate any combination of hardware so that the software running on it works
as if it were accessing certain devices, even if they don't exist.
Host System - The computer running the virtual machine
Guest system - the virtual machine itself
- The guest system will have its own RAM, processor and hard drive allocation, but will have no
knowledge of the host system
- It does not know that it is being run as a virtual machine
- The virtual machine runs as a sandbox, a self contained environment
At run-time, the original program is translated into intermediate code, which is then passed to a virtual
machine.
- The VM then runs this code and determines how to control the host computer so that it can carry out
the relevant instructions.
- Running intermediate code in a virtual machine can also be considerably slower compared to
running low-level code on the device it was designed for
Bespoke Software
Software that is developed to meet the user’s specific requirements.
- As bespoke software is developed for a specific task and/or user, it is unlikely that other users will
wish to buy it, and therefore, development costs, ongoing support & maintenance costs are high
- May have bugs because only a few users will ever get to test them (compared to general purpose
software which is often beta tested through perhaps many thousands of end users)
- If the user’s needs were not correctly specified before the software was developed, the software
may not meet the requirements of the users (dependent on the development methodology used)
Applications Software
Software designed to be used by the end-user to perform one specific task
- Requires systems software in order to run
- Examples: desktop publishing, word processors, spreadsheets, web browsers
- Application software can be general purpose (e.g. word processing software), special purpose
(mathematical calculator software) or bespoke
Systems Software
Low-level software that is responsible for running the computer system smoothly, interacting with hardware
and generally providing a platform for applications software to run.
- The user does not directly interact with systems software but it ensures high performance for the
user.
- Examples: library programs, utility programs, operating system, device drivers.
Utilities
Utility Software - System software integral to ensuring the consistent, high performance of the operating
system
- Each utility program has a specific function linked to the maintenance of the operating system
- Examples: Compression, Defragmentation, Anti-virus, Backup, Encryption, Formatting Software
Encryption Software
Scrambles information using an algorithm & a key so that it is not understandable to anyone without the key
- Does not prevent the data being intercepted, it prevents it from being understood after interception
- The most common use is to encrypt data when it is transmitted, but some OS automatically encrypt
secondary storage to prevent data theft in the event that the device is lost
- Most cloud providers also encrypt data when it is stored
Formatting Software
Prepares the storage device for data storage by creating sectors & tracks which data can be stored
(magnetic storage)
- Creates a special list called the File Allocation Table (FAT)
- This is used to keep track of where data is located, each time a new file is written to a hard
disk, the table is updated with where the data is physically stored in the disk
- When you format a disk, the data on the disk is not deleted, it is the entry on the FAT which is
deleted
- The computer can no longer locate any data & will think it’s empty, but the data actually
remains until it is overwritten
- There are utilities that will forensically wipe a disk by writing 0s to every memory location
Defragmentation Software
As files are stored as singular ‘blocks’ contiguously, over time, these ‘blocks’ get fragmented over the hard
disk, making reading files longer as the hard drive needs to be searched in multiple places instead of one
contiguous area
- Rearranges the hard disk so that all ‘blocks’ of a program are placed together and all free space is
collated at the end
+ This helps to shorten read/write times
- Not used on SSDs, as they already have very fast seek times, so even if files are stored across
many locations, it does not provide any meaningful performance benefit to defragment an SSD
- SSDs also have a finite number of write cycles before they start to degrade, so defragmentation
would needlessly use up these
Compression Software
Reduces the size of files.
- Reasons why it is needed
- Less storage space required
- Faster download times – improving online experience
- Improved streaming of video/audio files
There are two ways in which compression software might reduce the size of a file, lossy or lossless:
- Lossy Compression: When unrequired data is removed from a file.
- MP3s are an example of this where sound quality may reduce but not to a point which is
noticeable by the listener.
- Lossless Compression: This is when data is temporarily removed from the file, but added back
(rebuilt) when the file is to be used again.
- Zip files are an example of this. They will need to be unzipped (extracted) to be useable
again
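Lossless compression can be demonstrated with Python's built-in zlib module, which uses the same family of algorithm as many zip tools. This is a small sketch with made-up sample data; highly repetitive data like this compresses far better than typical files.

```python
import zlib

original = b"AAAAABBBBBCCCCC" * 100        # repetitive sample data
compressed = zlib.compress(original)       # smaller representation for storage or transfer
restored = zlib.decompress(compressed)     # rebuilt exactly when needed again

print(len(original), len(compressed), restored == original)
```

The restored data is identical to the original, which is what makes the compression lossless.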
Antivirus Software
Responsible for detecting potential threats to the computer, alerting the user and removing these threats.
Backup Software
Automatically creates routine copies of specific files selected in specific time periods by the user.
- This means that in the event of a power failure, malicious attack or other accident, files can be
recovered.
Open Source
Open Source - Software for which the source code is freely available to download in which the code can be
inspected and modified to suit the specific requirements of the user
- Although it is usually free, open source software is still licensed, commonly under the GNU General
Public Licence (GPL)
- This prevents someone taking the ideas from a piece of open source software, modifying it &
then releasing it as closed source
Advantages
+ Can be modified and improved by anyone
+ Technical support from online community
+ Can freely share the software with others
+ Software licence is usually free
Disadvantages
- Support available online may be insufficient or incorrect
- No user manuals
- Lower security as may not be developed in a controlled environment
Closed Source
Closed Source - Software that requires the user to have a licence to use, in which the source code is not
editable or freely available
- It is developed and provided to the user as a fully compiled, executable set of files
Advantages
+ Thorough, regular and well-tested updates
+ Company owning software provides expert support & user manuals
+ High levels of security as developed professionally
+ The company is obliged to provide software that is “fit for purpose” as it is covered by the Trade
Descriptions Act
Disadvantages
- Licence restricts how many people can use the software at once
- Users cannot modify and improve software themselves
- Licence usually has to be paid for
Translators
Translator - a program that converts high-level source code into low-level object code, which is then ready
to be executed by a computer
Machine Code
Made up of instructions & data represented by bits. Each instruction consists of an operator (or opcode)
and an operand
- Operator/Opcode - the actual instruction that the CPU needs to decode & execute. The instructions
that a CPU understands are given by its instruction set
- Operand - Any data that the instruction needs to work
Assembly Language
An abstraction of machine code that uses a small set of commands which represent machine code
instructions (mnemonics)
- Assembly language commands have a one to one relationship with machine language commands
- E.g. instead of 1011, a command like ADD would be used
- CPU Specific: The instructions used are dependent on the instruction set of the processor.
High-Level Languages
A programming language with strong abstraction from the details of the computer, which is easily written in
and read by humans (E.g. Python, Java, C++)
+ Offer a further abstraction of machine code. Uses (close to) natural language to represent key
functions that are then carried out by one or more machine language commands
+ Easier to write for humans, but still is easily converted to machine code so that it can be processed
by the CPU
- An instruction in high-level languages can represent many instructions that a CPU will carry out.
- Writing in source code can therefore lead to a CPU running commands that may not actually
be directly required, causing the program to run slower than it could
Low-Level Languages
A programming language that provides little or no abstraction from a computer's instruction set architecture
(E.g. Assembly Language)
- They may be used because one instruction in low level language represents a single instruction, a
programmer can control precisely the actions of the CPU & how it uses the RAM
- This means that programmers can write highly efficient programs, which will run faster & use
memory more resourcefully.
Compilers
Translate high-level code into machine code all at once, after carrying out a number of checks and reporting
back any errors.
+ Produces a fully optimised executable file (but does not execute code)
+ Code can be run without a translator being present
+ Runs faster than interpreted code after being compiled
- This initial compilation process is longer than using an interpreter or an assembler
- If changes need to be made, the whole program must be recompiled.
- Once code has been compiled to produce machine code, it can only be executed on certain devices
- compiled code is specific to a particular processor type and operating system.
Most commercial software applications are distributed as compiled code so that the source code is
protected. The executable file is produced in a form that cannot be read by a human or readily
reverse-engineered.
Interpreters
Translate and execute code line-by-line. These are mostly used in the development stage of a program.
+ Can quickly stop & produce an error if a line contains an error.
+ This feature makes interpreters useful for testing sections of code and pinpointing errors, as
time is not wasted compiling the entire program before it has been fully debugged.
+ Code can be executed on a range of platforms as long as the right interpreter is available, thus
making interpreted code more portable
- Translator must be present on computer being run
- Slower than running compiled code, as it needs to be translated each time it is executed
Assemblers
Translate assembly code into machine code.
- Each line of assembly code is equivalent to almost exactly one line of machine code, so code is translated
on an almost one-to-one basis.
- Relatively simple translation process, as compared to translating high-level into machine code
Comparing Translators
Multiple Platforms
- Compiler: Code is compiled for a specific CPU instruction set and OS
- Interpreter: Will run anywhere there is a suitable interpreter
Stages of Compilation
Lexical Analysis
First stage of compilation: whitespace & comments are removed from the code & the remaining code is
analysed for keywords and names of variables and constants.
- These are then replaced with tokens, and information about the token associated with each keyword
or identifier is stored in a symbol table.
- Tokens: Each aspect of the code has its purpose in the code identified (a command, a variable
name, a value, arithmetic operator, etc.) & each token is added to a symbol table.
- Example:
while flag = False:
print “not found”;
#terminates when item is found
goes through lexical analysis to become:
while flag=False:
print“not found”;
Symbol Table - Used by the compiler to keep track of all of the elements identified in the program
- The format of the symbol table will vary depending on the compiler, but may typically contain:
- An identifier
- the kind of item (variable, keyword, operator etc.)
- the data type
- run time memory address or value
- scope & restrictions (local vs global)
Token Stream
- The aim of lexical analysis is to convert the code into a token stream, based on the symbol table. The
token stream is then passed to the next stage, which checks whether it is in an acceptable order
according to the syntax rules of the language
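The lexical analysis steps above can be sketched in Python. This is a minimal illustration, not how any particular compiler works: the token kinds, the `tokenise` function and the symbol table layout are all assumptions made for this example, and the example source uses straight quotes.

```python
import re

# Hypothetical token kinds for a tiny Python-like language (assumed for illustration)
TOKEN_SPEC = [
    ("COMMENT", r"#[^\n]*"),
    ("STRING", r'"[^"\n]*"'),
    ("KEYWORD", r"\b(?:while|print|if|else)\b"),
    ("BOOLEAN", r"\b(?:True|False)\b"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("NUMBER", r"\d+"),
    ("OPERATOR", r"[=+\-*/<>]"),
    ("SEPARATOR", r"[:;()]"),
    ("WHITESPACE", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenise(source):
    """Return (token stream, symbol table) for the given source text."""
    tokens = []
    symbol_table = {}
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind in ("WHITESPACE", "COMMENT"):
            continue  # whitespace & comments are removed
        if kind == "IDENTIFIER":
            # Record each identifier once; later stages would add type, scope etc.
            symbol_table.setdefault(text, {"kind": "variable"})
        tokens.append((kind, text))
    return tokens, symbol_table


source = 'while flag = False:\n    print "not found";\n#terminates when item is found'
tokens, symbols = tokenise(source)
for token in tokens:
    print(token)
```

The comment and all whitespace disappear from the token stream, and `flag` is the only entry in the symbol table. Characters that match no pattern are silently skipped here; a real lexer would report them as errors.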
Syntax Analysis
Tokens are analysed against the grammar & rules of the programming language.
- Any tokens that break the rules of the programming language are flagged up as syntax errors &
added to a list of errors.
- Examples of Syntax Errors: undeclared variable type, incomplete set of brackets
- An abstract syntax tree is produced, which is an unambiguous data structure representation of the
source code in a syntactically correct form
- Further detail about identifiers is also added to the symbol table.
- Semantic analysis is also carried out at the syntax analysis stage, where logic mistakes within the
program are detected.
- Examples of semantic errors: multiple declaration, undeclared identifiers
Code Generation
A separate program is created that is distinct from the original source code.
- The abstract syntax tree produced in the syntax analysis stage is used to produce machine code.
- The code that is generated is in binary form, known as the object code.
- This is a major distinguishing feature between compilation and interpretation; interpreters do not
produce a separate executable file
- Libraries that are used within the code are not included in this, this is where linkers are used
Optimisation
Searches through the code for areas it could be made more efficient by detecting & removing insignificant,
redundant parts of code
- Repeated sections of code may be grouped and replaced with a more efficient piece of code which
produces the same result.
- The aim of optimisation is to make the code faster to execute, although this stage can significantly
add to the overall time taken for compilation.
- There is a danger, however, that excessive optimisation may alter the way in which the program
behaves.
Loaders
Part of the operating system responsible for loading executable files into memory from
secondary storage and preparing the program for execution
- Loaders are programs provided by the operating system.
- When a file is executed, the loader retrieves the library or subroutine from the given memory
location
Use of Libraries
Libraries - Pre-compiled programs which can be incorporated within other programs using either static or
dynamic linking.
- Ready-to-use and error free, so save time developing & testing modules
- Can be reused within multiple programs
- Often used to provide a specialised range of functions which would otherwise require a lot of time
and effort to develop, so save programmers from having to recreate existing functions & instead
make use of others’ expertise.
- Popular libraries provide mathematical and graphical functions.
1.2.3 - Software Development
General Stages of Software Development
System Development Life Cycle (SDLC) - Describes the different stages that happen during a software or
system project
Software can be developed using a variety of methodologies, however, they all have a general form of:
Feasibility
Can be evaluated using ‘TELOS’
- Technical: is the project possible considering the technology available and accessible?
- Economic: can the project be financed in the short-term and the long-term?
- Legal: can the project be solved within the law?
- Operational: can the project be successfully implemented and maintained?
- Scheduling: can the project be completed given the time available?
Analysis
- Stakeholders state what they require from the finished product. This information is used to clearly
define the problem and the system requirements. Requirements may be defined by:
- Analysing strengths and weaknesses with current way this problem is being solved
- Considering types of data involved including inputs, outputs, stored data and amount of data
- Aspects to be considered:
- Features: High-level statements of what the system will do
- Requirements: What the system needs to do to develop the features
- Success Factors: How we will check that we have met the requirements
Design
Translating requirements into feasible solutions, as well as defining & justifying any limitations that the
project may have. A test plan may also be designed at this stage.
The different aspects of the new system are designed, such as:
- Inputs: volume, methods, frequency
- Outputs: volume, methods, frequency
- Security features: level required, access levels
- Hardware set-up: compatibility
- User interface: menus, accessibility, navigation
Development
The design from the previous stage is used to split the project into individual, self-contained modules, which
are allocated to teams for programming.
- These are all then tested
Testing
The program is tested against the test plan formed in the Design stage. There are various types of testing
that can be carried out:
1. White Box Testing: A form of testing carried out by software development teams in which the test
plan is based on the internal structure of the program. All of the possible routes through the program
are tested
- The structure of the code is tested & so requires knowledge of how the code is written
- Each path through the code must be tested to check that the code runs as expected
- Unit Tests: Designed to check that a section of code is working in isolation from the rest of
the code
- This stage of testing will use valid, invalid/erroneous & boundary/extreme inputs to make
sure that code can handle such inputs correctly
2. Black Box Testing: A form of testing where the software is tested without the testers being aware of
the internal structure of the software. It can be carried out both within the company and by
end-users.
- The test plan traces through inputs and outputs within the software
- Doesn’t require knowledge of how the code works, only of the required functionality of
the program
- Testers will normally be end users who are not familiar with the code, but are familiar
with what the code needs to do
- Usually follows a test script, which defines how the test should be carried out
- Test scripts specify the actions to take & the data to use in the test
- Based on the requirements, the expected outcomes will be known & can be compared to the
actual outcome; if they are the same then the test is passed
- If not, it fails and is sent back to the developers for remedial action
3. Alpha Testing: Carried out in-house by the software development teams within the company. Bugs
are pinpointed and fixed.
- Alpha builds are for the developer as they may be unstable or non-functional in some areas,
but give an idea of the current state of system development
4. Beta Testing: Carried out by end-users after alpha testing has been completed. Feedback from
users is used to inform the next stage of development.
- Beta builds are done once the main functionality is all complete & the system is assessed to
be stable
- End users test the system as if they were using the end product in order to find any
remaining bugs or issues
- Any bugs which are found are passed back to the development team to be fixed
- Companies use beta releases to expose the software to as many end users as
possible so that it is thoroughly tested
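The unit-testing idea from white box testing (valid, invalid/erroneous & boundary inputs) can be sketched in Python. The function `is_valid_mark` and its 0 to 100 range are hypothetical, chosen only to give the test data something to check.

```python
def is_valid_mark(mark):
    """Return True if mark is a whole number from 0 to 100 inclusive (hypothetical rule)."""
    return isinstance(mark, int) and not isinstance(mark, bool) and 0 <= mark <= 100


# Each entry: (input, expected outcome, type of test data)
test_data = [
    (50, True, "valid"),
    (0, True, "boundary"),
    (100, True, "boundary"),
    (-1, False, "invalid"),
    (101, False, "invalid"),
    ("fifty", False, "erroneous"),
]

# Compare the actual outcome with the expected outcome for every test
results = [(value, is_valid_mark(value) == expected) for value, expected, _ in test_data]
all_passed = all(passed for _, passed in results)
print("All tests passed" if all_passed else "Some tests failed")
```

Each test compares the actual outcome with the expected outcome, in the same way a test script would; a failing test would be passed back to the developers.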
Importance of Testing
- Untested or poorly tested code will result in unstable software & possibly unmet requirements,
which will lead to user dissatisfaction
- Developers are not the best people to test code as they make subconscious assumptions
about how the code will be used, which may not be true of an end user
- An independent user is more likely to find bugs, as they will use the software in the same
way a user might
- The better tested code is, the more stable & robust it will be
- The earlier on in the development process that bugs are found, the easier they will be to fix (and
less expensive to fix)
Implementation
Where the system is made available to the end users
This may mean installing the solution onto a production server & setting up the support mechanism
- May also include producing & shipping physical products
- May include making a download available through online services
- In the immediate period after implementation, the development team will generally retain
responsibility for support
Once the testing stage has been used to make the appropriate changes to the software, it is installed onto
the users’ systems
Evaluation
Effectiveness of the software is evaluated against the system requirements defined at the analysis stage to
evaluate its suitability in solving the problem.
- Different criteria are considered, including robustness, reliability, portability and maintainability
Maintenance
Any errors or improvements that could be made to the software are flagged up by the end-users.
- Programmers will regularly send out software updates to fix any bugs, security issues or make any
needed improvements.
There may also be a roadmap for how the developers see the system developing over the coming years.
Waterfall Lifecycle
Based on a series of stages which are completed in sequence, from start to finish.
1. Feasibility & Problem Definition
2. Requirements
3. Analysis
4. Design
5. Development
6. Implementation
7. Evaluation
8. Maintenance
The end of each phase is a formal milestone where documentation is reviewed & approved by
stakeholders
- Change management is important in the waterfall method in order to incorporate information that
was not known at an earlier stage but is found to be important later
Advantages
+ Straightforward to manage
+ Clearly documented
+ Rigid & Structured
Disadvantages
- Lack of flexibility
- No risk analysis
- Limited user involvement
- Relies on accurate & meticulous planning
Agile Methodologies
A collection of methodologies which aim to improve the flexibility of software development and adapt to
changes in user requirements faster
- It is also easier to make improvements or changes to the software.
Advantages
+ Produces high quality code
+ Flexible to changing requirements
+ Regular user input
Disadvantages
- Poor documentation
- Requires consistent interaction between user & programmer
- Projects can sometimes take longer than anticipated where users make continuous requests for
minor changes or new features/functionality
Extreme Programming
An agile model in which the development team consists of a pair of programmers alongside a
representative end-user.
- The model is built on ‘user stories’: system requirements are specified by the end-user and used
when designing the program.
- The aim of paired programming is to produce high-quality code, as the code is written by one
person and critiqued by the other so it is improved as it is written.
- Each iteration through the cycle generates what is called a ‘working version’, as opposed to a
‘prototype’, of the program, meaning it could function as the final product
At the start of each release, there is a planning phase that determines what will be developed and how it
will be tested.
- There is a heavy emphasis on development standards and strict version control.
- At the end of each release, there is a feedback stage that will inform the next release.
Advantages
+ Produces high quality code
+ Constant user involvement means high usability
+ Having a user on each team means that there is a strong focus on getting the user requirements
right
Disadvantages
- High cost of two people working on one project
- Teamwork is essential
- End-user may not be able to be present
- Projects can sometimes take longer than anticipated where users make continuous requests for
minor changes
Then the cycle repeats continuously until a suitable solution with all the user requirements met is produced.
Advantages
+ Thorough risk-analysis and mitigation
+ Caters to changing user needs
+ Produces prototypes throughout
Disadvantages
- Expensive to hire risk assessors
- Lack of focus on code efficiency
- High costs due to constant prototyping
- Can easily go overtime & become more expensive
Advantages
+ Caters to changing user requirements
+ Highly usable finished product
+ Focus on core features, reducing development time
Disadvantages
- Poorer quality documentation
- Fast pace may reduce code quality
Everyday language
- Too time consuming
- Inaccurate (ambiguous & open to interpretation)
Extreme Programming
Advantages
+ Produces high quality code
+ Constant user involvement means high usability
+ Having a user on each team means that there is a strong focus on getting the user requirements right
Disadvantages
- High cost of two people working on one project
- Teamwork is essential
- End-user may not be able to be present
- Projects can sometimes take longer than anticipated where users make continuous requests for
minor changes
Suitable for
- Small to medium projects with unclear initial requirements requiring excellent usability.
- Not suitable for projects where the users are hard to contact or do not want to be actively
involved, because of its emphasis on continuous feedback
- Functional: uses the concept of reusing a set of functions, which form the core of the program
- Programs are made up of lines of code consisting of function calls, often combined within
each other.
- Closely linked to mathematics
- Examples: Haskell, C# & Java
- Logic: Uses code which defines a set of facts and rules based on the problem
- Queries are used to find answers to problems.
- Example: Prolog
Procedural Languages
Procedural Languages - Languages that use traditional data types such as integers and strings which are
built into the language, & also provide data structures like dictionaries and arrays.
- Used for a wide range of software development as it is very simple to implement
- May not be possible/efficient to solve all kinds of problems with procedural languages
Structured Programming
- Subsection of procedural programming in which the control flow is given by four main programming
structures:
- Sequence: Code is executed line-by-line, from top to bottom.
- Selection: A certain block of code is run if a specific condition is met, using IF statements.
- Iteration: A block of code is executed a certain number of times or while a condition is met.
Iteration uses FOR, WHILE or REPEAT UNTIL loops.
- Recursion: Functions are expressed in terms of themselves. Functions are executed, calling
themselves, until a certain condition known as a base case (which does not call the function)
is met.
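The four structures can be seen together in a short Python sketch. Factorial is used here only as a familiar example; it is not taken from the notes.

```python
def factorial_recursive(n):
    """Recursion: the function is expressed in terms of itself."""
    if n <= 1:          # selection, and the base case (no further call is made)
        return 1
    return n * factorial_recursive(n - 1)


def factorial_iterative(n):
    """Iteration: a FOR loop repeats a block a certain number of times."""
    total = 1           # sequence: statements run top to bottom
    for i in range(2, n + 1):
        total *= i
    return total


print(factorial_recursive(5))   # 120
print(factorial_iterative(5))   # 120
```

Both versions give the same result; the recursive version stops because every call moves closer to the base case.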
Assembly Language
Assembly Language - An abstraction of machine code that uses a small set of commands which
represent machine code instructions (mnemonics)
- Low-level language & is the next step up from machine code
- Converted to machine code using an assembler before it is executed
- Each mnemonic is represented by a numeric code
- Commands that assembly language uses are processor-specific, as it directly interacts with the
CPU’s special purpose registers.
- Allows for direct interaction with hardware so is useful in embedded systems.
- Each instruction in assembly language is typically equivalent to almost one line of machine code.
Mnemonic - Instruction - Function
ADD - Add - Add the value at the given memory address to the value in the Accumulator
SUB - Subtract - Subtract the value at the given memory address from the value in the Accumulator
STA - Store - Store the value in the Accumulator at the given memory address
LDA - Load - Load the value at the given memory address into the Accumulator
INP - Input - Allows the user to input a value which will be held in the Accumulator
HLT - Halt - Stops the program at that line, preventing the rest of the code from executing
BRA - Branch always - Branches to a given address no matter the value in the Accumulator. This is an
unconditional branch
Example: LMC program which returns the remainder (the modulus) when num1 is divided by num2
- BRP (branch if positive): branches back to the ‘positive’ label, so num2 keeps being subtracted while
the result of num1 minus num2 is zero or positive
Code
INP
STA num1
INP
STA num2
LDA num1
positive STA num1
SUB num2
BRP positive
LDA num1
OUT
HLT
num1 DAT
num2 DAT
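To trace what the LMC program above does, the same steps can be mirrored in Python. This is only a sketch: the function name `lmc_modulus` and the variable `acc` (standing in for the Accumulator) are assumptions, and the check for num2 being 0 or less is added here because the LMC version would never halt in that case.

```python
def lmc_modulus(num1, num2):
    """Mirror of the LMC program: subtract num2 repeatedly until the result goes negative."""
    if num2 <= 0:
        raise ValueError("num2 must be positive, otherwise the loop never ends")
    acc = num1              # LDA num1
    while True:
        num1 = acc          # positive STA num1 (remember the last non-negative value)
        acc -= num2         # SUB num2
        if acc < 0:         # BRP positive only branches when the result is zero or positive
            break
    return num1             # LDA num1, OUT


print(lmc_modulus(7, 3))    # 1
print(lmc_modulus(6, 3))    # 0
```

The value stored in num1 is always the last result that was not negative, which is the remainder.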
Addressing modes allow for a much greater number of locations for data to be stored, as the size of the
operand would otherwise constrain the number of addresses that could be accessed.
- Specifies how the operand should be interpreted
- Part of the opcode
Examples
Immediate: The operand is the actual value to use, which in this case is the 005 in the
‘memory address’ column, so the value in the accumulator is 5 (005)
Direct: The operand is the memory address that holds the value. The value held at
memory address 005 is given in the ‘contents’ column next to that address. In this case it is 9,
as 9 is next to 005 in the table, making the value in the accumulator 9
Indirect: The operand is a memory address (005) whose contents (9) are used as a further
address (009). The value in the ‘contents’ column next to memory address 009 is the final
value in the accumulator, 19
Indexed: Find the memory address needed in the ‘memory address’ column (005), then goes to the right
column, ‘index register’, and takes that value in the ‘index register’ column, to make the value in the
accumulator 7 (007)
- Add the operand to the value in the index register, then go to the resulting address to find the
value
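The four addressing modes can be compared with a small Python sketch. The contents of addresses 005 and 009 follow the example above; the index register value (2) and the contents of address 007 (42) are hypothetical values invented for this illustration, as is the function name `resolve`.

```python
# Memory modelled as address -> contents
memory = {5: 9, 7: 42, 9: 19}   # 5 -> 9 and 9 -> 19 follow the example; 7 -> 42 is hypothetical
index_register = 2              # hypothetical value


def resolve(mode, operand):
    """Return the value loaded into the accumulator for the given addressing mode."""
    if mode == "immediate":
        return operand                            # the operand is the value itself
    if mode == "direct":
        return memory[operand]                    # the operand is the address of the value
    if mode == "indirect":
        return memory[memory[operand]]            # the operand points to another address
    if mode == "indexed":
        return memory[operand + index_register]   # operand + index register gives the address
    raise ValueError(f"unknown addressing mode: {mode}")


for mode in ("immediate", "direct", "indirect", "indexed"):
    print(mode, resolve(mode, 5))
```

With operand 005 this gives 5, 9 and 19 for immediate, direct and indirect addressing, matching the worked example; the indexed result depends entirely on the assumed index register value.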
Object-Oriented Languages
Object Oriented Programming - Programming paradigm built on entities called objects formed from classes
which have certain attributes & methods
- Supports well structured code & reusability
- Considers the problem to be solved as a series of data objects
Classes
Class - a template for an object, which defines the attributes and methods of an object
- Object - a particular instance of a class
- State is given by attributes which give an object’s properties.
- Behaviour is defined by the methods associated with a class, which describe the actions it can
perform
- Can be used to create objects by instantiation
- Can be used to create multiple objects with the same set of attributes and methods.
- Usually associated with an entity.
- Example: a class called ‘Library’. It could have attributes ‘number_of_books’ and
‘number_of_computers’ and methods ‘add_book’ and ‘remove_book’. Similarly, ‘Book’ could
also be a class.
Instantiation
Instantiation - the creation of an object (or an “instance” of a given class) in object-oriented programming
- Objects are created from a class through the process of instantiation
- A constructor is used to instantiate an object from the class
- Can instantiate any number of objects from the class
- All instances will have the same attributes & methods, but the values of the attributes will be
unique to the particular instance of the object
Methods
A program subroutine that represents an action an object can perform
- Setter - a special type of method used in OOP that sets the value of a particular attribute
- Getter - a special type of method used in OOP which retrieves the value of a given attribute
- Getters & setters are used to make sure attributes cannot be directly accessed & edited by
users (encapsulation)
- Attributes are declared as private so can only be altered by public methods.
- Every class must also have a constructor method, which is called ‘new’
Attributes
Data recorded as a variable associated with an object
- Attributes are declared as private, so can only be altered by public methods
Inheritance
A subclass can inherit data & behaviour from a superclass
- Base Class - The first class within a hierarchy of classes
- Subclass will inherit common attributes from the parent & may inherit the common methods
- Subclass may then also add attributes & methods of their own
- Inheritance would be expressed as: class Biography inherits Book
Encapsulation
A method of maintaining data integrity by only allowing class methods to access data in an object’s
attributes
- An object encapsulates both its data & methods into a single entity
- Any one object cannot affect the way in which any other object functions
- Means that the class can hide how it works whilst allowing other developers to use it through its
interface (e.g. its public attributes & methods)
Polymorphism
The ability to process objects differently, depending on their subclass
- Linked to inheritance (super & subclasses)
- Means objects can behave differently depending on their class.
- Can result in the same method producing different outputs depending on the object involved.
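The ideas of classes, instantiation, encapsulation, inheritance and polymorphism can be shown in one short Python sketch. Python is used here rather than exam pseudocode, so the constructor is `__init__` instead of ‘new’, and attributes are only private by convention (a leading underscore). The book titles, page counts and method names are made up for illustration.

```python
class Book:
    """Superclass (base class): a template for book objects."""

    def __init__(self, title, pages):      # constructor, run when an object is instantiated
        self._title = title                # leading underscore: private by convention
        self._pages = pages

    def get_pages(self):                   # getter
        return self._pages

    def set_pages(self, pages):            # setter validates before changing the attribute
        if pages <= 0:
            raise ValueError("pages must be positive")
        self._pages = pages

    def describe(self):
        return f"{self._title} ({self._pages} pages)"


class Biography(Book):                     # Biography inherits Book
    def __init__(self, title, pages, subject):
        super().__init__(title, pages)     # reuse the superclass constructor
        self._subject = subject            # extra attribute of the subclass

    def describe(self):                    # overriding gives polymorphic behaviour
        return f"{self._title}: a biography of {self._subject}"


shelf = [Book("Networking Basics", 200), Biography("A Life in Code", 320, "Ada Lovelace")]
shelf[0].set_pages(210)
descriptions = [item.describe() for item in shelf]   # same call, different behaviour
print(descriptions)
```

The same `describe` call produces different output depending on the class of the object, and the page count can only be changed through the setter, which rejects invalid values.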
Lossy Compression
Compressing a file by removing some of its data to reduce its size
+ More effective at reducing space than lossless compression
+ Can achieve very high compression ratios
- Audio becomes less clear & images become more pixelated
- In many cases the loss in quality is very small, to the point where it is barely perceptible
Common Formats:
- MP3: Audio
- JPEG: Images
- MPEG: Video
Lossless Compression
Compressing a file by rearranging its contents in a way that allows the original data to be reconstructed
from the compressed data with no loss of information
Common Formats:
- PNG: Images
- GIF: Images/Short animations
- TIFF: Images
- ZIP: Compressing 1 or more files together
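Lossless compression can be demonstrated with Python's standard `zlib` module, which implements the DEFLATE method also used in ZIP files. The sample data below is made up and deliberately repetitive, so it compresses well.

```python
import zlib

original = b"AAAAABBBBBCCCCC" * 100        # made-up, highly repetitive sample data
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), "bytes before,", len(compressed), "bytes after")
print("Identical after decompression:", restored == original)
```

The decompressed data is identical to the original, which is what distinguishes lossless from lossy compression.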
Encryption
Encryption - a way to convert data (plaintext) into ciphertext so that only authorised parties can
unscramble it and understand its content
- Cipher - an algorithm used to perform encryption and decryption
- Key (to a cipher) - a set of parameters to guide the algorithm
Ciphers
- Substitution Cipher - Encrypting text by substituting one character with another
- E.g. Caesar Cipher: offsetting all the letters in a phrase to
Cryptanalysis
Cryptanalysis - The process of breaking a code, to figure out a plaintext from ciphertext without knowing the
key
- A brute force approach can be used, but ciphers often have weaknesses that can be exploited to speed up the process
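The Caesar cipher mentioned under Ciphers, and a brute force attack on it, can be sketched in Python. The shift of 3 and the message are arbitrary choices for this example, and the function name `caesar` is an assumption.

```python
def caesar(text, shift):
    """Shift each English letter by `shift` places, wrapping around the alphabet."""
    result = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)   # spaces and punctuation are left unchanged
    return "".join(result)


ciphertext = caesar("HELLO WORLD", 3)
print(ciphertext)               # KHOOR ZRUOG

# Brute force: there are only 25 possible shifts, so try them all
candidates = {shift: caesar(ciphertext, -shift) for shift in range(1, 26)}
print(candidates[3])            # HELLO WORLD
```

Because the key space is so small, every possible key can be tried in an instant, which is why the Caesar cipher offers no real security.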
Symmetrical Encryption
A method of encryption where both the sender and receiver share the same private key, which they
distribute to each other during a key exchange
- This key is used for both encrypting and decrypting data.
- Key exchanges have to happen over the internet, so could potentially be intercepted
Asymmetrical Encryption
A method of encryption where a public key, available to everyone is used to encrypt the data, and the data
is decrypted by the paired private key, only known to the recipient
- Public key can be published anywhere, private key kept secret
- Together they are known as a key pair & are mathematically related to each other
- A single key cannot be used to both encrypt & decrypt communication.
- Messages encrypted with the recipient’s public key can only be decrypted with the recipient’s
private key, which should only be in the possession of the recipient.
- Example: If someone wants to send you a message, they must first find your public key
- The message is then encrypted with your public key, meaning that only you can decrypt it, as
you are the only one with the private key which is paired with the public one
- To prove that a message has been sent by you, you can encrypt it using your private key.
- This means that anyone can decrypt it (as your public key is available to anyone) and by
doing so, can guarantee that you encrypted the message, as only you have access to the
private key.
- This forms the basis of digital signature systems
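The relationship between the two keys in a key pair can be illustrated with textbook RSA-style arithmetic. The notes do not name a specific algorithm, so RSA is an assumption here, and the primes are tiny illustrative values that would be far too small for real use. The message is a single number rather than text.

```python
# Toy key pair built from two tiny primes (illustrative values only)
p, q = 61, 53
n = p * q                     # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                        # public exponent
d = pow(e, -1, phi)           # private exponent: modular inverse of e (Python 3.8+)

public_key, private_key = (e, n), (d, n)


def apply_key(value, key):
    """Encrypt or decrypt a number with the given key using modular exponentiation."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


message = 65

# Confidentiality: encrypt with the public key, decrypt with the private key
ciphertext = apply_key(message, public_key)
recovered = apply_key(ciphertext, private_key)

# Proof of origin: encrypt with the private key, anyone can check with the public key
signature = apply_key(message, private_key)
verified = apply_key(signature, public_key) == message

print(recovered == message, verified)
```

Whatever one key of the pair does, only the other key can undo, which is why neither key on its own can both encrypt and decrypt.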
Hashing
Hashing - A process in which an algorithm is used on an input (a key) to turn it into a fixed size value (a hash)
- The process of applying an algorithm to a piece of data to derive a numerical value, which can be used
for verification purposes
- There are a vast number of hash functions (algorithms) that do this
- The output of a hash function can’t be reversed to form the key
- One way process
- No need for reversing, as it is used to verify the integrity of the data, rather than encrypt it
- Takes input data of any size and reduces it to a fixed size
Uses of Hashing
Storing Passwords
- Passwords are converted into a numerical value of fixed size, which is stored in a file
- No need to store or transfer the actual password, only the hash value.
- The output of a hash function can’t be reversed to form the key.
- Makes hashing useful for storing passwords.
- A password entered by a user can be hashed and checked against the stored hash value to see if it is
correct
- A successful hacker would only gain access to the hash values
- Which are useless, as they can’t be reversed to gain the passwords.
Many people often choose the same password, which would then give the same hash value, which
introduces a potential vulnerability
- Salt - A random piece of data that is added to the start of the password before it is passed through
the hashing algorithm
- Is stored alongside the hash value
- The password is hashed and the hashed password is sent to the server. The salt is added to
the hashed password, the hashed password & salt are hashed together, and the result is then stored.
- This means that if someone gets the hash, they cannot reverse engineer the
password, as the salt is integrated in the hashed password
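Salted password storage can be sketched with Python's standard `hashlib` module. This follows the simple definition above (a random salt added to the start of the password before hashing) and uses SHA-256 as an assumed hash function. The function names are made up, and real systems generally use a deliberately slow, purpose-built password hashing function instead.

```python
import hashlib
import hmac
import secrets


def hash_password(password, salt=None):
    """Return (salt, hash). A new random salt is generated if none is supplied."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt, digest


def check_password(password, salt, stored_digest):
    """Hash the attempt with the stored salt and compare it with the stored hash."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, stored_digest)


# Two users choose the same password but get different salts, so different hashes
salt_a, hash_a = hash_password("letmein")
salt_b, hash_b = hash_password("letmein")

print(hash_a == hash_b)                              # False
print(check_password("letmein", salt_a, hash_a))     # True
print(check_password("wrong", salt_a, hash_a))       # False
```

Only the salt and the hash need to be stored; the password itself is never saved.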
Hash Tables
Hashing can be used to speed up access to records in a data structure
- To find a table index, a hash function is applied to the data contained in the record, which generates a
value
- The record is then inserted at this index position in the table
- When the record needs to be retrieved, applying the hash function to the search item gives the index
of the record
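A hash table built on these steps can be sketched in Python. The hash function (sum of character codes modulo the table size), the table size of 11 and the use of lists to hold records that share an index are all assumptions chosen to keep the example short.

```python
TABLE_SIZE = 11


def hash_index(key):
    """Simple illustrative hash function: sum of character codes modulo the table size."""
    return sum(ord(ch) for ch in key) % TABLE_SIZE


# Each position holds a list so that two keys with the same index can both be stored
table = [[] for _ in range(TABLE_SIZE)]


def insert(key, record):
    table[hash_index(key)].append((key, record))


def lookup(key):
    """Hash the search key to go straight to the right position, then check that slot only."""
    for stored_key, record in table[hash_index(key)]:
        if stored_key == key:
            return record
    return None


insert("ab", "first record")
insert("ba", "second record")    # same letters, so the same index as "ab"
print(lookup("ab"), "|", lookup("ba"), "|", lookup("zz"))
```

Only one position in the table is examined on lookup, rather than every record, which is what makes access fast.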
Digital Signatures
Extra data that is appended to a file, which is unique to the sender and very difficult to forge.
Digital signatures help to ensure that you only receive legitimate files, encrypted with your key, from
acceptable sources
- The sender’s private key is used to generate a digital code called a digital signature, which verifies
the file’s content and the sender
1.3.2 - Databases
Relational Databases
Database - a persistent organised store of related data
- Persistent - Permanent, stored on secondary storage
- Organised - Data stored in records and fields
- This could be in a single table or a collection of tables.
- Most programs will store data using a database
Terminology
- Entity - An item of interest about which information is stored
- An entity becomes a table in a database
- Record - one instance of an entity in a database
- Attributes - Characteristics of an entity
- Becomes a field in the database (a column in a table)
- Table - Collection of data that relates to an entity (e.g. students)
- Record - A collection of data about a single entity (e.g. a student)
- Field - A unique piece of data about an entity (student surnames)
- Field Name - An identifier for the single piece of data (e.g. ‘surnames’).
Flat File
Flat File - a database that consists of a single file, a single table of data
- Will most likely be based around a single entity and its attributes.
- Typically written out like this: Entity1(Attribute1, Attribute2, Attribute3 ...)
- Underlined item is the primary key
Primary Key
Primary Key - Unique identifier for each record in the table.
- In the above example, the unique identifier is the CarID as this is always different for each row in the
table.
- The primary key is shown by underlining it in its description: Entity1(Attribute1, Attribute2, Attribute3
...)
Foreign Key
Foreign Key - The attribute which links two tables together.
- A primary key that is present in another table & does not need to
be unique
- Will exist in one table as the primary key and act as the foreign
key in another
- The foreign key is shown using an asterisk.
- Example: DoctorID is the foreign key in the Patient table.
Secondary Key
Secondary Key - A key that can be used as an alternative index to access or sort records in a table quicker
- Allows a database to be searched quickly
- Less accurate than a primary key, as secondary keys can be identical
- Example Doctor/Patient: 2 patients can have the same surname, but not the same PatientID
- Example Doctor/Patient - The patient is unlikely to remember their patientID but will know their
surname.
- Therefore, a secondary index (secondary key) is set up on the surname attribute.
- This makes it possible to order and search by surname which makes it easier to find specific
patients in the database.
One-to-many
One record in one table can be associated with many records in another table.
- Example: A school has many teachers, but teachers are only associated with one school
Many-to-many
One entity can be associated with many other entities and the same applies the other way round.
- Example: Students and courses
- Each student can enrol in more than one course & each course can have more than one
student.
Normalisation
Normalisation - The process of coming up with the best possible layout for a relational database
This tries to achieve:
- No redundancy (unnecessary duplicates)
- Consistent data throughout linked tables
- Data is logically grouped so that related data is stored together
- Records can be added and removed without issues
- Complex queries can be carried out
Rules of 2NF:
- Database is already in 1NF
- All partial dependencies have been removed
Partial Dependencies
- When certain fields only depend on part of the primary key
- This only applies when the primary key is a composite key
- Composite Key: a type of primary key that uses multiple fields, instead of 1 unique field
- Partial dependencies can be removed by splitting the partially dependent fields from the rest of the
table, creating 2 separate tables
Example:
This means that the fields ‘course name’, ‘lecturer initials’ &
‘lecturer name’ have partial dependencies, as they don’t
depend on every field in the composite primary key
To get rid of this partial dependency, you can take the fields
‘course number’, ‘course name’, ‘lecturer initials’ & ‘lecturer
name’, and put them into a new separate table.
- This has created a many-to-many relationship
between the 2 tables, which is not ideal for databases
- This many-to-many relationship can be solved through the usage of a linking table
- Linking table - A table that links 2 other tables together, through the use of their primary keys
- With this example, you first create a new primary key called ‘student number’, assigning
each student a student number and then removing all of the extra entries for each student,
so that their names don’t come up multiple times in the students table
- Using the new 2 tables, you can create a linking table with the only fields being the primary
keys of those 2 tables (‘student number’ and ‘course number’)
- Now you can use this linking table, the students table and the course table to locate records
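A minimal sketch of the linking table idea, using Python's built-in sqlite3 module. The table names (Student, Course, Enrolment), the column names and the sample rows are assumptions made up for illustration, not the data from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (StudentNumber INTEGER PRIMARY KEY, StudentName TEXT);
    CREATE TABLE Course  (CourseNumber  TEXT PRIMARY KEY, CourseName TEXT);
    -- Linking table: its only fields are the primary keys of the other 2 tables
    CREATE TABLE Enrolment (
        StudentNumber INTEGER REFERENCES Student(StudentNumber),
        CourseNumber  TEXT    REFERENCES Course(CourseNumber),
        PRIMARY KEY (StudentNumber, CourseNumber)
    );
    INSERT INTO Student VALUES (1, 'Student A'), (2, 'Student B');
    INSERT INTO Course  VALUES ('C1', 'Databases'), ('C2', 'Networks');
    INSERT INTO Enrolment VALUES (1, 'C1'), (1, 'C2'), (2, 'C1');
""")

def courses_for(student_number):
    """Use the linking table to find the course names for one student."""
    rows = conn.execute(
        """SELECT Course.CourseName
           FROM Enrolment
           JOIN Course ON Enrolment.CourseNumber = Course.CourseNumber
           WHERE Enrolment.StudentNumber = ?
           ORDER BY Course.CourseName""",
        (student_number,),
    ).fetchall()
    return [name for (name,) in rows]

print(courses_for(1))
```

Each student and each course is stored once; the many-to-many relationship lives only in the rows of the linking table.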
Third Normal Form (3NF)
A database which is in second normal form & has no transitive (non-key) dependencies
- Transitive (non-key) dependency - When a value of a field is determined by the value of another
field that isn’t the primary key
- Once transitive dependencies have been removed, you should not be able to determine a value
in a field from a non-key field
- Tables with only 2 fields (e.g. linking tables) are always in 3NF, as a transitive dependency needs
at least 3 fields (a key and 2 non-key fields)
Rules of 3NF
- The database is already in 2NF
- There are no transitive/non-key dependencies
Example:
We need to make sure that this table is in 3NF. To do this, you have to
check if the values in each field can be determined by non-key fields.
- The ‘course number’ field does not have to be checked
because it is the primary key
- First, you check the ‘course name’ field
- Can ‘course name’ be determined by ‘course
number’? Yes
- This is fine, as ‘course number’ is the primary
key
- Can ‘course name’ be determined by ‘lecturer initials’?
No
- Can ‘course name’ be determined by ‘lecturer name’? No
- Then you check the next field: ‘lecturer initials’
- Can ‘lecturer initials’ be determined by ‘course number’? Yes
- Can ‘lecturer initials’ be determined by ‘course name’? Yes
- This is a transitive (non-key) dependency, meaning that this table is not in 3NF
- Can ‘lecturer initials’ be determined by ‘lecturer name’? No
- Then you check the last field: ‘lecturer name’
- Can ‘lecturer name’ be determined by ‘course number’? Yes
- Can ‘lecturer name’ be determined by ‘course name’? Yes
- This is a transitive (non-key) dependency, meaning that this table is not in 3NF
- Can ‘lecturer name’ be determined by ‘lecturer initials’? Yes
- This is a transitive (non-key) dependency, meaning that this table is not in 3NF
Therefore, this table is not in 3NF, because 2 of its fields have transitive (non-key) dependencies
- To put this into 3NF, you can split the 2 fields with transitive dependencies (‘lecturer initials’ and
‘lecturer name’) into a separate table
In converting from 1NF to 3NF, we have gone from 1 table, a flat file, to 4 tables, a relational database.
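The "can field X be determined by field Y?" check used above can be sketched in Python. The function name determines and the sample rows are assumptions for illustration. Note that sample data can only disprove a dependency; a True result just means no counter-example was found in these rows.

```python
def determines(rows, determinant, dependent):
    """True if each value of `determinant` always appears with the same value of `dependent`."""
    seen = {}
    for row in rows:
        key, value = row[determinant], row[dependent]
        # setdefault stores the first value seen for this key, then returns the stored value
        if seen.setdefault(key, value) != value:
            return False
    return True

# Made-up sample rows for a course table
courses = [
    {"course_number": "C1", "course_name": "Databases",
     "lecturer_initials": "AB", "lecturer_name": "A. Brown"},
    {"course_number": "C2", "course_name": "Networks",
     "lecturer_initials": "AB", "lecturer_name": "A. Brown"},
    {"course_number": "C3", "course_name": "Algorithms",
     "lecturer_initials": "CD", "lecturer_name": "C. Dale"},
]

# Determined by the primary key: fine
print(determines(courses, "course_number", "course_name"))
# Determined by a non-key field: a transitive dependency
print(determines(courses, "lecturer_initials", "lecturer_name"))
# Not determined: the same lecturer teaches 2 different courses
print(determines(courses, "lecturer_initials", "course_name"))
```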
Indexing
Indexing - Method used to store the position of each record ordered by a certain attribute.
- Used to look up and access data quickly.
- Primary key is automatically indexed; however, the primary key is almost never queried since it is
not normally remembered.
- Secondary keys are used instead and indexed to make the table easier and faster to search
through on those particular attributes.
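A minimal sketch of a secondary index, using Python's built-in sqlite3 module and the doctor/patient example. The table layout, the index name idx_patient_surname and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# PatientID is the primary key, so it is indexed automatically
conn.execute("CREATE TABLE Patient (PatientID INTEGER PRIMARY KEY, Surname TEXT)")
conn.executemany(
    "INSERT INTO Patient VALUES (?, ?)",
    [(1, "Khan"), (2, "Smith"), (3, "Khan")],  # made-up sample rows
)
# Secondary index on the surname attribute
conn.execute("CREATE INDEX idx_patient_surname ON Patient (Surname)")

def find_by_surname(surname):
    rows = conn.execute(
        "SELECT PatientID FROM Patient WHERE Surname = ? ORDER BY PatientID",
        (surname,),
    ).fetchall()
    return [patient_id for (patient_id,) in rows]

print(find_by_surname("Khan"))

# Ask SQLite how it plans to run the search; the exact wording varies by version
for row in conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Patient WHERE Surname = 'Khan'"
):
    print(row[-1])
```

Two patients share a surname, so the search returns both PatientIDs; the index only changes how quickly they are found, not which records are returned.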
Methods of Capturing
Capturing Data
Method of data capture must be
- Accurate: Errors in databases may cause problems in future
There are a range of manual & automated methods, depending on the context
Manual
- Paper-based forms, which would be completed and the information is then manually entered into
the database
- These forms may have been formatted to support Optical Character Recognition (OCR)
- This is when the form is scanned and the data is read electronically
- This method is likely to be error prone and so is slowly becoming obsolete
Automated
- A variation on OCR is Optical Mark Recognition (OMR)
- This uses specially formatted forms to capture large amounts of data and is used for
multiple choice tests (e.g. 11+), surveys & the lottery
- Other automated methods include barcodes & QR codes
- Banks scan cheques using Magnetic Ink Character Recognition (MICR)
- All details, excluding amount, are printed in a magnetic ink which can be recognised by a
computer
- Amount must be entered manually
Selecting
Selecting the correct data is an important part of data preprocessing.
- This could involve only selecting data that fits certain criteria to reduce the volume of input.
- Example: Speed cameras will only select cars going above a certain speed
- Then, background information will be removed so only the number plate is added to a record
Data Validation
- Data entry methods will use validation to ensure that all the required fields are populated with the
correct type of data
- Does not ensure that data is correct, just ensures it is valid in the context
Methods of Data Validation
- Presence Check: Ensures there is an item in every field
- Format Check: Makes sure the format of data is correct (e.g. DD/MM/YYYY vs MM/DD/YYYY)
- Range Check: Checks that the data falls within an expected range of values
- Type Check: Ensures that data entered into a field is of the correct data type
- Uniqueness Check: Ensures that an item is not entered multiple times into a database
- Consistency Check: Confirms the data’s been entered in a logically consistent way
- Code Check: A code check ensures that a field is selected from a valid list of values or follows
certain formatting rules
- Constraint Check: Checks the consistency of entered data, for example against a regular
expression
- Length Check: Ensures that the appropriate number of characters are entered into the field
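A few of the checks above can be sketched as small Python functions. The function names and the DD/MM/YYYY rule are assumptions for illustration; real data entry systems usually build these checks into the form or the database.

```python
import re
from datetime import datetime

def presence_check(value):
    """The field must not be empty."""
    return value is not None and str(value).strip() != ""

def type_check(value, expected_type):
    """The value must be of the expected data type."""
    return isinstance(value, expected_type)

def range_check(value, low, high):
    """The value must fall between low and high (inclusive)."""
    return low <= value <= high

def length_check(value, min_len, max_len):
    """The number of characters must be within the allowed limits."""
    return min_len <= len(value) <= max_len

def format_check_date(value):
    """The value must look like DD/MM/YYYY and be a real calendar date."""
    if not re.fullmatch(r"\d{2}/\d{2}/\d{4}", value):
        return False
    try:
        datetime.strptime(value, "%d/%m/%Y")
        return True
    except ValueError:
        return False

print(format_check_date("25/12/2024"))  # valid as DD/MM/YYYY
print(format_check_date("12/25/2024"))  # month 25 does not exist
```

These checks only show that the data is valid, not that it is correct: an age of 71 passes a range check even if the person is actually 17.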
Data Verification
- Has to be done manually, by looking at the data
Managing
Instead of being selected, collected data can alternatively be managed using SQL to sort, restructure &
select certain sections
Exchanging Data
Exchanging data - the process of transferring the collected data.
- Common method: Electronic Data Interchange (EDI)
- This doesn’t require human interaction and enables data transfer from one computer to
another
Example:
SELECT MovieTitle, DatePublished
FROM Movie
WHERE DatePublished BETWEEN #01/01/2000# AND
#31/12/2005#
ORDER BY DatePublished
AND
Used to add extra conditions that both need to be met to select records within an SQL statement
- Example: Selecting movies where the movie name starts with A and has over 100 million downloads
- SELECT *
FROM Movies
WHERE name LIKE 'A%'
AND downloads > 100000000
OR
Used to add extra conditions that either can be met to select records within an SQL statement
- Example: Selecting movies where the name starts with A or B
- SELECT *
FROM Movies
WHERE name LIKE 'A%'
OR name LIKE 'B%'
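A minimal sketch of AND and OR in action, run through Python's built-in sqlite3 module. The movie names and download figures are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (name TEXT, downloads INTEGER)")
conn.executemany(
    "INSERT INTO Movies VALUES (?, ?)",
    [
        ("Avalanche", 150_000_000),  # made-up sample rows
        ("Atlas", 90_000_000),
        ("Borderline", 500),
        ("Comet", 200_000_000),
    ],
)

def names(sql):
    return [row[0] for row in conn.execute(sql)]

# AND: both conditions must be met
both = names(
    "SELECT name FROM Movies "
    "WHERE name LIKE 'A%' AND downloads > 100000000 ORDER BY name"
)
# OR: either condition can be met
either = names(
    "SELECT name FROM Movies "
    "WHERE name LIKE 'A%' OR name LIKE 'B%' ORDER BY name"
)
print(both)
print(either)
```

The pattern 'A%' is written in quotes, and the % matches any characters that follow the A.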
DELETE (FROM)
Used to delete records in a table based upon a certain condition
- Example: Deleting movies that have less than 1000 downloads
- DELETE FROM Movies
WHERE downloads < 1000
INSERT (INTO)
Used to insert a new record into a database table
- Example:
- INSERT INTO table_name (column1, column2, ...)
VALUES (value1, value2, ...)
DROP (TABLE)
Used to delete a table from a relational database
- Example: Deleting a table called ‘directors’
- DROP TABLE Directors
JOIN
Provides a method of combining rows from multiple tables based on a common field between them
- This also acts as an equivalent to INNER JOIN
- Example: Joining together 2 tables ‘Movies’ and ‘Directors’
- SELECT Movie.MovieTitle, Director.DirectorName, Movie.MovieCompany
FROM Movie
JOIN Director
ON Movie.DirectorName = Director.DirectorName
WILDCARDS
Symbols that stand in for other values in an SQL statement: an asterisk (*) after SELECT selects every
single field when accessing a record, while a percentage symbol (%) in a LIKE pattern matches any
sequence of characters
- Example:
- SELECT *
FROM Movie
WHERE DatePublished > 2008
Referential Integrity
Referential integrity - the accuracy and consistency of data within a relational database
- All foreign keys in a table must represent a valid primary key in a source table
- Ensures that information is not removed if it is required elsewhere in a linked database
- If two database tables are linked, one of these tables cannot be deleted as the other table
requires its contents
- One way to maintain referential integrity would be to enforce a cascade delete constraint on the
primary key relationship between tables
- Essentially, if a primary key is deleted from one table, the associated records within a related
table will also be deleted
- Cascade Delete - If a record is deleted, which is present as a foreign key in a linked table,
all records in the linked tables which contain the value as a foreign key are also deleted
- Cascade Update is a similar concept, but for updates instead of deletes
- It can also prevent a record from being added to a table when its foreign key has no matching
primary key in another table
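A minimal sketch of both behaviours, using Python's built-in sqlite3 module. The Director and Movie tables and their rows are assumptions for illustration. SQLite only enforces foreign keys once PRAGMA foreign_keys is switched on; other database systems have their own settings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores foreign keys without this
conn.executescript("""
    CREATE TABLE Director (DirectorID INTEGER PRIMARY KEY, DirectorName TEXT);
    CREATE TABLE Movie (
        MovieID    INTEGER PRIMARY KEY,
        MovieTitle TEXT,
        DirectorID INTEGER REFERENCES Director(DirectorID) ON DELETE CASCADE
    );
    INSERT INTO Director VALUES (1, 'Director A');
    INSERT INTO Movie VALUES (10, 'Movie X', 1);
""")

# 1. A foreign key must match an existing primary key
try:
    conn.execute("INSERT INTO Movie VALUES (11, 'Movie Y', 99)")  # no director 99
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# 2. Cascade delete: removing the director also removes their movies
conn.execute("DELETE FROM Director WHERE DirectorID = 1")
remaining = conn.execute("SELECT COUNT(*) FROM Movie").fetchone()[0]

print(rejected, remaining)
```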
Data Integrity
The maintenance & consistency of data in a data store, as the data store must reflect the reality that it
represents
- The Database Management System (DBMS) helps to ensure that all data is consistent, and there is
always data integrity
Transaction Processing
Transaction - a single operation executed on data.
- A collection of operations can also sometimes be considered a transaction
- Each transaction must succeed or fail as a complete unit, and can never be partially complete
- Transaction Processing: Any information processing that is divided into individual, indivisible
operations (transactions)
All relational databases have certain base functionality, referred to as CRUD. Each aspect of CRUD has its
equivalent SQL command(s):
- Create (INSERT or CREATE)
- Retrieve (SELECT)
- Update (UPDATE)
- Delete (DELETE)
ACID
To ensure data integrity, transaction processing in all database management systems (DBMS) must
conform to a set of rules, ACID, which describes the ideal properties of every database transaction:
- Atomicity: A transaction must be processed in its entirety or not at all, no partial transactions
- Consistency: A transaction/change must retain the overall state of the database and maintain the
referential integrity rules between linked tables, taking the database from one valid state to another
- Isolation: Simultaneous executions of transactions should not interfere with one another and should
lead to the same result as if they were executed one after the other
- Durability: Once a transaction has been executed, its changes must remain in place & not be lost,
regardless of the circumstances surrounding it, such as a system failure or a
power cut
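Atomicity can be sketched with Python's built-in sqlite3 module: a transfer between two accounts either completes fully or is rolled back. The Account table, the CHECK rule and the balances are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO Account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(amount):
    """Move money from A to B as one transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE Account SET balance = balance + ? WHERE name = 'B'", (amount,)
            )
            conn.execute(
                "UPDATE Account SET balance = balance - ? WHERE name = 'A'", (amount,)
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    return dict(conn.execute("SELECT name, balance FROM Account"))

failed = transfer(500)       # A cannot afford this, so B's update is undone too
after_failed = balances()
succeeded = transfer(30)
after_success = balances()
print(failed, after_failed, succeeded, after_success)
```

In the failed transfer, B's balance had already been increased when A's update broke the rule; the rollback removes that partial change so the database never shows half a transaction.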
Features of DBMS
- Allows different applications to access the data at the same time
- Controls access to the data
- Security features are provided to limit who can do what
- Provides backups & the ability to restore from a backup if a disaster occurs
- Supports a query language, and other languages which can be used to do CRUD & structure the
database
- Can enforce referential integrity
- Prevents any operations that could damage the relationships between tables of data
- Controls concurrency, when multiple people are accessing the same database
Record Locking
Record Locking - The process of preventing simultaneous access to records in a database
- Used to prevent inconsistencies or a loss of updates.
- While one person is editing a record, the record is ‘locked’, which prevents others from accessing the
same record until the first user has completed their transaction
- An issue with this is deadlock
- Where releasing one locked record depends on another record which is also locked, & so there is
a cyclic dependency that cannot be resolved
Example: Deadlock
- User 1 accesses Customer 1’s record and as a result locks Customer 1’s record.
- Simultaneously User 2 accesses Customer 2’s record and as a result locks Customer 2’s
record.
- Now User 1 tries to access Customer 2’s record, and User 2 tries to access Customer 1’s record.
- User 1 waits for Customer 2’s record to be free and User 2 waits for Customer 1’s record to
be free and as they are both waiting, there is no progress causing a deadlock
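The cyclic dependency in this example can be sketched as a "who is waiting for whom" check in Python. The function name has_deadlock and the dictionary layout are assumptions for illustration; a real DBMS uses its own detection and recovery methods.

```python
def has_deadlock(waits_for):
    """waits_for maps each user to the user holding the record they need (or None)."""
    for start in waits_for:
        seen = set()
        current = start
        # Follow the chain of waiting users until it ends or repeats
        while current is not None and current not in seen:
            seen.add(current)
            current = waits_for.get(current)
        if current is not None:  # the chain came back to a user already visited
            return True
    return False

# User 1 holds Customer 1's record and wants Customer 2's (held by User 2),
# while User 2 wants Customer 1's record (held by User 1)
deadlocked = {"User 1": "User 2", "User 2": "User 1"}
# Here User 2 is not waiting for anything, so User 1 will eventually proceed
no_deadlock = {"User 1": "User 2", "User 2": None}

print(has_deadlock(deadlocked))
print(has_deadlock(no_deadlock))
```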
Data Redundancy
Data Redundancy - the process of having one or more copies of the data in physically different locations
- This means that if there is any damage to one copy the others will remain unaffected and can be
recovered
- This is important because some data is extremely important and companies cannot afford to
lose it
1.3.3 - Networks
Characteristics of Networks
Network - two or more computers connected together with the ability to transmit information and resources
between each other
- Node - Singular device on a network
Advantages of Networks
+ Enables digital communication between people
+ Enables the sharing of digital information
+ Enables the sharing of peripheral devices, such as printers & scanners
+ Enables computers to be updated with the latest software from a central point
+ Enables distributed processing (the ability for a single program to run simultaneously at various
computers)
+ Enables an organisation to keep their data private and manage certain aspects of the network (e.g.
security, access to programs, access to data, availability of devices and services, etc.)
Disadvantages of Networks
- Requires expertise to install & maintain a large network, which can be costly
- Several security issues from unauthorised access to data
- Measures to secure a network include:
- Passwords, strong passwords use a range of character types
- Not allowing users to install software
- With wireless access, use encryption
- Changing passwords frequently
- Challenges surrounding continuity of service 24/7
- Disaster recovery plans - in the event of the network going down
- Failover system - transferring to another system if problems occur
- Redundancy of equipment - use of alternate equipment if required
- Regular backup regime - ensuring the data is always available even if the live data source
experiences issues
Network Topologies
Types of Topology
- Physical Topology: The physical layout of the wires and components which form the network. (e.g.
bus, star, mesh)
- Logical Topology: An abstract representation of how data flows and is transmitted within a
network, independent of its physical layout
Comparing Topologies
Bus
All the terminals (devices) are connected to a backbone cable, the ends of which are plugged into a
terminator.
Advantages
+ Relatively inexpensive to set up
+ Doesn’t require any additional hardware
Disadvantages
- If backbone cable fails, the entire network gets disconnected
- As traffic increases, performance decreases
Star
Use a central node, often a switch, to direct data through the network. MAC (Media Access Control)
addresses, which are unique to a device, are used to identify each device on the network.
Advantages
+ Performance is consistent even with heavy network traffic
+ If one cable fails, only that single terminal is affected
+ Transmits data faster, giving better performance than bus topology
+ No data collisions
Disadvantages
- Expensive due to switch & cabling
- If the central switch fails, the rest of the network fails
Mesh
Every node is connected to every other node. Mesh topologies are most commonly found with wireless
technology such as Wi-Fi. (Partial Mesh - Majority of nodes are connected with each other, but not all)
Advantages
+ If using a wireless network, there is no cabling cost
+ As the number of nodes increases, the reliability and speed of the network become better
+ Nodes are automatically incorporated
+ Nodes don’t go through a central switch, improving speed
Disadvantages
- If using a wireless network, devices with wireless capability (which increases cost) must be purchased
- If using a wired network, a large quantity of cable is required compared to other network topologies
like bus and star, which is expensive
- Maintaining the network is difficult
Standards - A set of hardware & software specifications that allow manufacturers to create products
& services that are compatible with each other
- Formal standards mean any type of device can connect to & work on the network, if it implements
the standards
- Make it easy for hardware manufacturers to ensure interoperability
Handshaking - An initial communication between 2 devices to decide how communication will operate, by
deciding on a set of protocols to use
- Used when 2 devices communicate over the internet by deciding on certain protocols
- Simply carried out by 1 device sending a list of protocols to use to another device & receiving an
acknowledgement back in return
- If the devices fail to agree with the choice of protocols, another set must be chosen, or they will fail
to communicate
- HTTPS (Hypertext Transfer Protocol Secure) - Used for secure web page rendering, a way for a
client & server to securely send & receive requests & deliver HTML pages
- Has added encryption & authentication
- Used whenever a website deals with sensitive information, such as passwords or bank
details
- POP3 (Post Office Protocol) - Used for accessing emails, and deletes them from the server after
being downloaded to the device
- IMAP (Internet Message Access Protocol) - Used for email access, but keeps the email on the
server after accessing it, so it can be accessed across multiple devices
- Maintains synchronicity between devices
- FTP (File Transfer Protocol) - Used for the transmission of files over networks
- Usually over a Wide Area Network (WAN)
- People often use FTP Clients: software applications that sit on top of the actual FTP protocol
- When you interact with the program, the client generates & sends the appropriate
FTP commands
- Address Resolution Protocol (ARP) - a protocol that maps an Internet Protocol
(IP) address to a Media Access Control (MAC) address
- Used when a host needs to broadcast an IP address in order to find the MAC address of the
destination device
- Dynamic Host Configuration Protocol (DHCP) - a client/server protocol that automatically
provides an Internet Protocol (IP) host with its IP address and other related configuration information
- Used by the router to assign devices on a network IP addresses
Logical:
- Bit rate
- Error detection
- Packet size
- Packet ordering
- Routing
- Compression and encryption
- Digital signatures
Error Detection
Checksum
A value calculated from the bits in a transmitted message, which is compared to the value calculated from
the received message to check that the data has arrived without errors
- Calculated from the data in the packet using an algorithm and added to the packet
- Receiving computer calculates the checksum in the same way when it receives the packet and
compares it to the value in the packet
- If they match, the packet has been transmitted successfully
- The most common checksum method in use is the cyclic redundancy check (CRC)
- Picks up over 99% of errors
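The sender and receiver steps above can be sketched with the CRC-32 function in Python's built-in zlib module. The function names and the sample payload are assumptions for illustration.

```python
import zlib

def make_packet(payload: bytes):
    """Sender: calculate the checksum from the data and attach it to the packet."""
    return payload, zlib.crc32(payload)

def is_intact(payload: bytes, checksum: int) -> bool:
    """Receiver: recalculate the checksum in the same way and compare."""
    return zlib.crc32(payload) == checksum

payload, checksum = make_packet(b"HELLO")
corrupted = b"HELLP"  # the last character was changed in transit

print(is_intact(payload, checksum))
print(is_intact(corrupted, checksum))
```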
Parity Bits
An extra bit of data that is added to the packet, used to indicate whether the packet contains an even or
odd number of 1s
- Whether to use even or odd parity is decided prior to transmission
- Even Parity: Where the parity bit is set so that the total number of 1s in the packet is even
- Odd Parity: where the parity bit is set so that the total number of 1s in the packet is odd
- Can be used for error correction, as well as detection
- By treating the data as a grid & adding a parity bit for each row and each column, so that two
parity bits cover each bit in the message
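A minimal sketch of even parity in Python, first with a single parity bit and then with the grid approach used for correction. The function names and the bit patterns are assumptions for illustration.

```python
def add_even_parity(bits):
    """Append a parity bit so that the total number of 1s is even."""
    return bits + [sum(bits) % 2]

def check_even_parity(bits_with_parity):
    return sum(bits_with_parity) % 2 == 0

def locate_error(grid):
    """grid holds data plus a parity column and a parity row (even parity).
    Returns (row, column) of a single flipped bit, or None."""
    bad_rows = [r for r, row in enumerate(grid) if sum(row) % 2]
    bad_cols = [c for c in range(len(grid[0])) if sum(row[c] for row in grid) % 2]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    return None

sent = add_even_parity([1, 0, 1, 1, 0, 0, 1])
one_flip = sent.copy()
one_flip[0] ^= 1            # a single bit error is detected
two_flips = one_flip.copy()
two_flips[1] ^= 1           # a second error cancels out the first

grid = [
    [1, 0, 1, 0],   # data bits 1 0 1 + row parity
    [0, 1, 1, 0],   # data bits 0 1 1 + row parity
    [1, 1, 0, 0],   # column parity bits
]
damaged = [row.copy() for row in grid]
damaged[1][2] ^= 1          # flip one bit in transit

print(check_even_parity(sent), check_even_parity(one_flip), check_even_parity(two_flips))
print(locate_error(damaged))
```

A single parity bit cannot spot an even number of flipped bits, and cannot say which bit is wrong. In the grid, the one row and one column with the wrong parity cross at the faulty bit, so it can be flipped back.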
Layering
Layering - To divide the complex task of networking into smaller, simpler tasks that together form a network
- The hardware and/or software for each layer has a defined responsibility, & each layer provides a
service to the layer above it
Advantages of Layering
+ Reducing a complex problem into smaller sub-problems
+ Devices can be manufactured to operate at a particular layer
+ Products from different vendors will work together
TCP/IP Stack
TCP/IP (Transmission Control/Internet Protocol) Stack - A stack of networking protocols that work
together passing packets during communication
- Each layer gives a choice of protocol, which is responsible for adding information to the packet
before passing it to the next layer
- When it is received, the protocol on the receiving machine will read the data that was added by the
same protocol on the sending machine
Application Layer
Specifies what protocol needs to be used in order to relate the application to the data that’s being sent
- Protocols: FTP, HTTP, HTTPS, SMTP, IMAP, POP3
When Sending
- Uses an appropriate protocol relating to whatever application is being used to transmit data
- Based at the top of the stack.
- Prepares the data in a format ready to be passed to the next layer – transport
When Receiving
- Receives data from the transport layer
- Presents the data to the user through the application
Transport Layer
Uses TCP to establish an end-to-end connection & maintain conversations between the source and
recipient computer
- Protocols: TCP, UDP
When Sending
- After a connection is established, the transport layer splits the data into packets
- Labels these packets with:
- Packet number
- Total number of packets the original data was split up into
- Port number being used for communication
- Port - The application on the device that needs the packet
- Port Number - A number corresponding to the application on the device that needs
the packet
- If any packets get lost, the transport layer requests retransmissions of these lost packets
- Passes the packets onto the network/internet layer
When Receiving
- Reassembles packets in the correct order
- Acknowledges receipt of packets that have been successfully transmitted
- Requests transmission of lost packets
- Receives data from the network/internet layer
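The splitting, labelling and reassembly described above can be sketched in Python. This follows the simplified model in these notes (packet number, total and port number); the function names, the dictionary layout and the port value are assumptions for illustration, and real TCP headers are more detailed.

```python
import random

def split_into_packets(data: bytes, size: int, port: int):
    """Sender: split the data and label each packet."""
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    return [
        {"number": n, "total": len(chunks), "port": port, "payload": chunk}
        for n, chunk in enumerate(chunks, start=1)
    ]

def reassemble(packets):
    """Receiver: put packets back in order, or report which ones are missing."""
    total = packets[0]["total"]
    received = {p["number"]: p["payload"] for p in packets}
    missing = [n for n in range(1, total + 1) if n not in received]
    if missing:
        return None, missing  # these would be requested again
    return b"".join(received[n] for n in range(1, total + 1)), []

message = b"HELLO WORLD"
packets = split_into_packets(message, size=4, port=443)
random.shuffle(packets)  # packets may arrive in any order

print(reassemble(packets))
lost_one = [p for p in packets if p["number"] != 2]
print(reassemble(lost_one))
```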
Internet/Network Layer
Adds the source and destination IP addresses, forming socket addresses/numbers (or separates them
when receiving)
- Protocols: IP
When Sending
- Adds to each packet:
- Source IP Address
- Destination IP Address
- Routers operate on the network/internet layer & the router is what uses the IP addresses to forward
the packets
- Forms a Socket Address/Number - The combination of the IP address and the port number
- Sockets are then used to specify which device the packets must be sent to and the application
being used on that device.
When Receiving
- Separates Socket address/number into IP & Port number, then passes that to the transport layer
Link Layer
The connection between the network devices, adding the MAC address identifying the Network Interface
Cards of the source and destination computers
- Responsible for adding the Media Access Control (MAC) Address of the
- Source device
- Destination device
- For devices on the same network, the destination MAC address is the address of the
recipient computer, otherwise, it will be the MAC address of the router
- When transmitting data between routers over a WAN, the MAC address is
changed at each hop on the route
- Operating system device drivers also sit here
- Protocols: Ethernet, fibre optic, Wi-Fi
Communication between devices on different networks (WAN) requires both the network layer, which
uses IP addresses to create a packet, & the link layer, which wraps the packet in a frame
- The destination MAC address will be the MAC address of the router
- When the frame reaches the router, the router works out where it needs to send the packet by
looking at the destination IP address in the header
- Frame - the unit of transmission in a link layer protocol, and consists of a link layer header
followed by a packet
- The router then sets its own MAC address as the source & the next device as the destination
- It can also use the Address Resolution Protocol (ARP) to find the destination MAC
address if the router doesn’t already know it
- If there are several routers on the path to the destination, the source & destination MAC
addresses will be overwritten at every hop
- Once the frames reach their final destination, they will travel up through the layers of the TCP/IP
stack in reverse order, stripping off the headers & tails as they go
Communication between devices on the same network (LAN) only needs the link layer frame, as the
destination MAC address is the address of the recipient computer
On the recipient’s computer these layers are looked at from bottom to top,
but on the sender's end, it is from top to bottom.
- Once the destination has been reached, the MAC address is removed
by the link layer
- Then the IP addresses are removed by the Network Layer, then the
transport layers remove the port number and reassemble the packets
- Finally, the application layer presents the data to the recipient in the
form it was requested in
Example: Sending a Message
Next level
- Your Internet Service Provider (ISP)’s DNS Server
1. A user requests a URL (e.g. www.bbc.co.uk) via a browser (Chrome, Safari, Firefox, etc.)
2. The browser sends the domain name to a Domain Name System (DNS) & maps it to an IP address
- If the URL is frequently requested, the IP address will be cached in the browser cache and
returned straight to the browser
- This is received by the DNS resolver server
- The DNS resolver server then queries a DNS root name server
- The DNS root name server responds with the address of the top-level domain server
(TLD) for .com (or .uk, .org, .edu, .ca, etc.)
- The resolver makes a request to the .com (.uk, .org, .edu, etc.) TLD server
- The TLD server then responds with the IP address of the domain’s name server (e.g.
google.com, gov.uk)
- The recursive resolver sends a query to the domain’s name server
- The IP address of the website is then returned to the resolver from the name server
- The DNS resolver responds to the web browser with the IP address of the URL
3. The DNS maps the domain name to an IP address & returns it to the browser
4. The browser uses the IP address to establish a connection, going through the TCP/IP Stack
5. The requested web page/resource is returned to the client’s web browser
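The lookup chain above (cache, root server, TLD server, domain's name server) can be sketched as a toy simulation in Python. Every server name, the domain www.example.com and the IP address (taken from a range reserved for documentation) are made up for illustration; no real DNS queries are sent.

```python
# Hypothetical lookup tables standing in for real DNS servers
ROOT_SERVER = {"com": "tld-server-com"}
TLD_SERVERS = {"tld-server-com": {"example.com": "ns-example"}}
NAME_SERVERS = {"ns-example": {"www.example.com": "192.0.2.10"}}

cache = {}

def resolve(hostname):
    """Return (ip_address, list of servers asked), mimicking the resolver's steps."""
    if hostname in cache:
        return cache[hostname], ["cache"]
    labels = hostname.split(".")
    asked = []
    tld_server = ROOT_SERVER[labels[-1]]                      # root points to the TLD server
    asked.append("root")
    name_server = TLD_SERVERS[tld_server][".".join(labels[-2:])]  # TLD points to the name server
    asked.append("tld")
    ip_address = NAME_SERVERS[name_server][hostname]         # name server holds the IP address
    asked.append("name server")
    cache[hostname] = ip_address                              # remember it for next time
    return ip_address, asked

first = resolve("www.example.com")
second = resolve("www.example.com")
print(first)
print(second)
```

The second request is answered from the cache, so none of the servers need to be asked again.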
Hierarchy
The hierarchy of domains continues from top to bottom (of the tree) and from right to left in the URL.
- Some top-level domains are further divided administratively.
- Example: The .uk top-level domain has many subdomains (.co, .ac, .gov); in these
instances, the name of the authority that has registered the domain will be the third level
The Cloud
The Cloud - a network of servers hosted on the internet which offer a range of services to store and
process data
- Cloud Computing - the provision of computing resources and services over the internet
Advantages
+ Users do not need to ‘buy’ and install the software themselves
+ Any device can access the service if they have an internet connection
+ There is no need to upgrade the software (this will be handled by the provider)
+ Collaboration can occur with multiple people working on the same document at once.
+ Work is automatically saved / backed up
+ Lower costs than setting up your own servers
Disadvantages
- Sensitive company data may be stored in another country which may not adhere to the same data
laws as the country the company is in
- Completely reliant on the network / internet / provider
- Service costs may vary
Virtual Networks
Virtual Network - a network which uses software to subdivide a physical network (LAN or WAN) into
smaller ones
- For a simple virtual network, software will re-divide a Local Area Network (LAN) into a series of
smaller networks so that users of the smaller network can share information separately from the rest
of the members of the larger network
Virtual Private Networks (VPNs) - When software re-divides a wide area network (WAN) into a series of
smaller networks so that groups of people spread across a country / the world can communicate and share
information separately from other members of a network
Data Packets
Data Packets - segments of data containing a header, a payload and a trailer
- Header:
- Sender & recipient IP addresses
- The sender and the recipient’s IP addresses act like a postcode, allowing the packet
to be delivered to the correct destination & enabling the recipient device to trace
where the packet came from.
- Protocol being used
- The protocol allows the recipient computer to understand how to interpret the packet.
- Order of the packets
- Upon arriving at the recipient device, packets are reconstructed in the appropriate
order as specified in the header.
- Time To Live / Hop Limit
- The Time To Live (TTL) tells the packet when to expire so that it does not travel
forever.
- Payload
- Raw data to be transmitted
- Trailer
- Checksum, or cyclic redundancy check
- The trailer contains a code used to detect whether any errors have occurred during
transmission.
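The header, payload and trailer can be sketched as a Python dataclass, with the Time To Live reduced at every hop. The field names, the use of CRC-32 for the trailer and the addresses (from ranges reserved for documentation) are assumptions for illustration, not a real packet format.

```python
import zlib
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Packet:
    # Header
    source_ip: str
    destination_ip: str
    protocol: str
    sequence_number: int
    ttl: int
    # Payload
    payload: bytes
    # Trailer
    checksum: int

def make_packet(source_ip, destination_ip, protocol, sequence_number, ttl, payload):
    return Packet(source_ip, destination_ip, protocol, sequence_number, ttl,
                  payload, zlib.crc32(payload))

def forward(packet):
    """One hop: reduce the TTL by 1 and discard the packet once it reaches 0."""
    if packet.ttl <= 1:
        return None
    return replace(packet, ttl=packet.ttl - 1)

packet = make_packet("192.0.2.1", "198.51.100.7", "TCP", 1, 2, b"hi")
after_one_hop = forward(packet)
after_two_hops = forward(after_one_hop)
print(after_one_hop.ttl, after_two_hops)
```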
Packet Switching
Packet switching - a method of communication in which data is communicated using packets across a
network.
- In this method of communication, packets are sent across the most efficient route, which can vary
for each packet
- Packets are sent via whichever route the node decides is least congested
- Supports store & forward transmission
- Store & Forward Transmission - a method of switching data packets by the switching
device that receives the data frame and then checks for errors before forwarding the packets
Advantages
+ Multiple methods to ensure data arrives intact (e.g. checksums and cyclic redundancy checks)
+ Multiple routes can be used between devices, so if one path breaks, another can be used
Disadvantages
- Time is spent deconstructing and reconstructing the data packets
- Must wait for all packets to arrive before data can be received
Circuit Switching
Circuit switching - a method of communication where a direct link is created between two devices
- This direct link is maintained for the duration of the entire conversation between devices
- Circuit switching requires the two devices to transfer and receive data at the same rate
- Link created is dedicated & temporary, with a fixed amount of bandwidth, which only lasts until the
transmission is complete
- All data uses the same path
- Dedicated bandwidth
- Physical path between source & destination
- No store & forward transmission
- Store & Forward Transmission - a method of switching data packets by the switching
device that receives the data frame and then checks for errors before forwarding the packets
Advantages
+ Data arrives in a logical order which results in a quicker reconstruction of the data.
+ Enables two users to hold a call without delay in speech
+ Guarantees quality of transmission through dedicated bandwidth
Disadvantages
- Bandwidth is wasted during periods of time in which no data is being sent
- Devices must transfer & receive data at the same rate
- Using switches means electrical interference may be produced, which can corrupt or destroy data
Network Threats
Hackers
An unauthorised person who attempts to gain access to a computer system with the intent of damaging
data or somehow harming that system
- Black-hat hackers - An unauthorised person who attempts to gain access to a computer system
through nefarious means with the intent of stealing, damaging or harming data or a system
- White-hat (ethical) hackers - Security experts employed by companies, who attempt to gain
access to a computer system through nefarious means in order to find vulnerabilities in a system
and fix them
- Grey-hat hackers - Security experts, who are not employed by companies, who attempt to locate
vulnerabilities in a computer system as a hobby, usually reporting them to the company
Malware
An umbrella term for any computer code written with malicious intent to frustrate or harm. These have
various effects including:
- Deleting, corrupting or encrypting files
- Causing devices to crash, reboot or slow down
- Reducing network speeds
- Logging keyboard inputs & sending them to hackers
Virus
Pieces of code capable of copying themselves & spreading throughout a system
- Typically designed to have a detrimental effect (e.g. corrupting a system, destroying data)
Spyware
A form of malware that obtains information about a user's computer activities by covertly transmitting data
from their device
- Can steal all sorts of information, such as:
- Internet surfing habits
- Email addresses
- Visited web pages
- Downloads/download habits
- Passwords
- Credit card numbers
- Keystrokes (this is done using keyloggers)
- Cookies
Social Engineering
An umbrella term for several different manipulation techniques that exploit human error
- The aim is usually to obtain private information, access to a restricted system or money
- Lures users into exposing data, spreading malware or providing access to a system through:
- Baiting
- Scareware
- Pretexting/blagging
- Phishing
- Pharming
- Shoulder-surfing
- Quid pro quo
- Vishing
Phishing
An online fraud technique used to trick users into giving out personal information, such as usernames,
passwords & bank details
- Perpetrators disguise themselves as a trustworthy source in an electronic communication (e.g.
email or fake website)
Pharming
An online fraud technique using malicious code installed on a PC or server, which misdirects users to
fraudulent websites without their knowledge
Network Security
The aim of network security is to:
- Only allow authorised users to access what they need
- Prevent unauthorised access
- Minimise the potential damage caused by unauthorised access
Firewalls
A piece of software or hardware configured to let only certain types of traffic through it
- Can be set up to prevent
- Unauthorised internet traffic from outside a LAN
- Users in a LAN from accessing parts of the internet prohibited by an organisation they are a
part of
- Can block certain ports & types of traffic
- Can inspect data travelling across it, to see if it looks suspicious
- Can consist of two network interface cards (NICs) between the user and the Internet.
- It passes the packets between these two NICs and compares them against packet filters set
by the firewall software.
- Packet filters - The preconfigured rules for packets
- Operating systems & home routers often come with built-in firewalls, whereas more sophisticated
ones can be purchased separately
Packet filtering / static filtering limits network access in accordance with administrator rules and policies
- Works by examining the source IP, destination IP and the protocols being used as well as the ports
being requested.
- When access is denied by a firewall, two things can occur.
- Dropped: A packet is denied and the sender is not notified of the error
- Rejected: A packet is denied and the sender is notified of the error
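The packet filtering described above can be sketched in a few lines of Python. This is only an illustration, not how any particular firewall is implemented: the Rule class, the filter_packet function and the example rules are all hypothetical, and the destination IP check is left out to keep the sketch short.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass
class Rule:
    source_net: str           # source addresses the rule applies to, in CIDR form
    dest_port: Optional[int]  # destination port, or None for any port
    protocol: Optional[str]   # "TCP" / "UDP", or None for any protocol
    action: str               # "allow", "drop" or "reject"


def filter_packet(rules, source_ip, dest_port, protocol, default="drop"):
    """Return the action of the first rule that matches the packet."""
    for rule in rules:
        in_network = ip_address(source_ip) in ip_network(rule.source_net)
        port_ok = rule.dest_port is None or rule.dest_port == dest_port
        protocol_ok = rule.protocol is None or rule.protocol == protocol
        if in_network and port_ok and protocol_ok:
            return rule.action
    return default  # no rule matched, so the packet is silently dropped


# Hypothetical rules set by an administrator
rules = [
    Rule("192.168.1.0/24", 443, "TCP", "allow"),  # HTTPS from the LAN
    Rule("0.0.0.0/0", 23, "TCP", "reject"),       # Telnet: deny and notify the sender
]

print(filter_packet(rules, "192.168.1.20", 443, "TCP"))  # allow
print(filter_packet(rules, "10.0.0.5", 23, "TCP"))       # reject
print(filter_packet(rules, "10.0.0.5", 80, "TCP"))       # drop
```

Rules are checked in order and the first match wins, which is why the default action only applies when nothing in the list matches.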
Proxies
Proxy server - A server which receives all internet requests before sending them out onto the internet
- Acts as an intermediary, collecting and sending data on behalf of the user.
- All traffic travels through the device on its way in & out of a network
- Will look at each request and analyse it against its protocols / rules & if it meets the rules then it will
pass it on to the internet, & the user will access the service required.
- However, if the request is for a website / service which is on a blocked / filtered list, it will not
be forwarded to the internet, & the user will be unable to access the service required
- Some proxy servers are dedicated to caching common websites so that a user can browse a copy
of a site without accessing it directly.
- This can reduce traffic to certain websites as a result.
Encryption
A way to convert data (plaintext) into ciphertext so that only authorised parties can unscramble it and
understand its content
- A way of keeping data secure when transmitting it over the Internet.
- Makes data unreadable if it is intercepted.
- Data is encrypted and decrypted using a set of keys that only authorised personnel have access to
User Training
This is the process of educating users on network safety & security, so they are less vulnerable to social
engineering attacks that depend on people as the weak point
Anti-Malware Software
These are applications that identify, notify the user of and remove malware from a device
- These often come pre-installed, but can be purchased
- Crucial to make sure that these are updated with the latest patches, so that when flaws are
discovered & patched, the device is in-line with the newest version
Network Hardware
Network Interface Card
The card within a device required to form a wired network connection between a node and another
networked device
- Usually built into the device, and carries a unique media access control (MAC) address that
identifies the device.
- Media Access Control (MAC) Address - a 48-bit value coded into the device, usually
written as a twelve digit hexadecimal number
- Responsible for placing packets onto network cables in the form of electrical signals or pulses of
light
- Encodes the data according to the physical protocol being used on the network
- Ethernet is a very common protocol for data transmission
- Built into the motherboard, so no extra hardware is needed
- 2 Types: Wired & Wireless
- Converts the data that is to be sent into appropriate signals that can be carried across the medium:
- These signals will be voltages through a wire, or pulses of light through a fibre
- In a Wireless Network Interface Card (W-NIC), signals will be modulated radio waves
Switch
Device used to direct the flow of data across a network by enabling multiple wired devices to connect so
that the network traffic can be forwarded on to the next point in the network
- Most commonly used in networks using a star topology
- Contains multiple Ethernet ports
- Uses MAC addresses and data frames rather than IP addresses and packets
- Creates and maintains a MAC address table for all devices that are connected
- MAC Address Table - where the switch stores information about the other Ethernet
interfaces to which it is connected on a network.
- Enables the switch to send outgoing data (Ethernet frames) on the specific port
required to reach its destination, instead of flooding (broadcasting the data on all
ports)
- As a result, is able to forward traffic to only the intended recipient which reduces data collisions and
overall network traffic
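A rough sketch of how a MAC address table lets a switch avoid flooding is shown below. The Switch class and the MAC addresses are made up for illustration, and a real switch also ages out old entries, which this sketch ignores.

```python
class Switch:
    def __init__(self, port_count):
        self.ports = list(range(port_count))
        self.mac_table = {}  # maps a MAC address to the port it was last seen on

    def receive(self, in_port, source_mac, dest_mac):
        """Learn the sender's port, then return the ports the frame is sent out of."""
        self.mac_table[source_mac] = in_port
        if dest_mac in self.mac_table:
            return [self.mac_table[dest_mac]]           # forward on one port only
        return [p for p in self.ports if p != in_port]  # unknown destination: flood


switch = Switch(port_count=4)
print(switch.receive(0, "AA:AA:AA:AA:AA:AA", "BB:BB:BB:BB:BB:BB"))  # [1, 2, 3]
print(switch.receive(1, "BB:BB:BB:BB:BB:BB", "AA:AA:AA:AA:AA:AA"))  # [0]
print(switch.receive(0, "AA:AA:AA:AA:AA:AA", "BB:BB:BB:BB:BB:BB"))  # [1]
```

The first frame has to be flooded because the table is empty; once both devices have sent a frame, traffic between them only uses their own two ports.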
Wireless Access Point (WAP)
Device which connects to a wired network and provides wireless network radio signal for wireless devices
to connect to
- More commonly combined with a router to enable internet access.
- Used in mesh networks
- Will generally have a maximum number of devices that can connect
- Will generally have a physical broadcast range within which connections are possible (which can be
also influenced by other factors)
- Uses Wi-Fi or another similar standard
- Wi-Fi range is limited & easily interfered with by various materials (e.g. walls/ceilings,
electromagnetic waves from other devices like microwaves)
- Uses a broadcast signal, so security is very important
Hub
Device which allows more than one device to be connected in a network by providing a point of connection
- Any message sent to a hub is sent to all nodes connected to that hub
- This leads to the potential for ‘data collisions’ – when two computers transmit data at the
same time
- When the packets collide, they can get corrupted and so both are discarded
- Part of the Ethernet protocol called Carrier Sense Multiple Access with Collision Detection
(CSMA/CD) is used to handle this.
- Collisions can cause a severe reduction in network throughput, & so hubs are only suitable for use
in small networks
- Much slower than a switch at data transfer (despite having similar function)
Routers
Responsible for moving data from one network to another, e.g. from your home network to the internet, or
across different networks on the internet
- Used to determine the best route that a packet should take for the next stage of its journey
- Switch the packet from the inbound port to the outbound port
- In order to do this, routers maintain a routing table which stores information about which
routers are connected to which ports
- Used to connect two or more networks together.
- Allow private, home networks to connect to the Internet
Routers will share their routing tables with other routers using the Routing Information Protocol (RIP, part of
the TCP/IP stack), so that each router gets information about more networks than only those that it is
directly connected to.
- RIP also helps the router determine the most efficient route across the network, generally involving
the smallest number of ‘hops’
- Home routers, although they have the same function, only need to know how to connect to the ISP,
so will have a static routing table
Gateway
Translates the protocols so that networks can communicate with each other by removing the header from
packets before adding the data to packets using the new protocol
- Used when routing packets between networks, when protocols are not the same between them
- Reformat the data into the format required by the new network
- The data can then continue its way to the destination
Cables
Needed in a network to connect the different pieces of hardware together
- Within a network, ethernet cables are commonly used
- Ethernet is the name of the protocol - the actual cables are really unshielded twisted pair
copper cables with an RJ45 connector
- Different Types: Twisted pair, Coaxial, Fibre optic
- Between networks, fibre optic cable may be used – very fast but expensive
Coaxial Cables
Metal wire made up of 4 cylindrical components (solid conductor wire, a layer of insulation, a grounding
conductor & a layer of exterior insulation)
- Can be affected by noise from external magnetic fields (but to a lesser extent than twisted copper
pair)
- Provides moderate bandwidth
- More expensive than twisted pair, cheaper than fibre optic
- Used for feedlines (radio transmission), antenna receivers, networking, digital audio & cable
television
Client-Server Networks
A network that consists of terminal(s), known as clients, connected to a server, which they can request
resources from
- The server is a powerful, central computer.
- Holds all of the important information & resources
- Has greater processing power than the terminals
- Manages network traffic
- Records user activity
- Controls user authentication
- Clients make requests to the server for data, communication & other services
- Servers need to be high spec devices, as they need to be able to:
- Handle requests from multiple clients simultaneously
- Respond quickly to requests
- Have high availability (available 24/7 & reliable)
- Work in an environment with lots of other servers (e.g. high durability)
- If the server fails it will not be able to provide resources, hence the importance of a failover &
redundancy
- Failover: The process of swapping over authority to the backup hardware
- Redundancy: A piece of backup hardware that is used in case of a fail in the main server
- Types of servers
- Print
- Web
- Database
- Mail
- Application
Client-server networks are best suited to organisations with a large number of computers or situations where
many computers need access to the same information
Advantages
- Central backups are carried out so there is no need for individual backups; shared data is easy to back up
- Data and resources can be shared between clients
- Easy to manage file security, as all data is stored in 1 location
- Easier to install software & security updates to all computers
- Security & backups centrally managed
Disadvantages
- Relatively expensive to set up & maintain
- Functionality of terminals depends on the server; if this fails, performance falls
- Trained staff are required to maintain the server
- Users will lose network access if the server fails
- Server is a single point of failure
Peer-to-Peer Networks
A network in which computers are of equal status and are connected to each other so that they can share
files independently
- Peer: A computer connected to the network
- All computers are of equal status
- Each device effectively acts as both a server and client, as it can both provide and request
resources
- Commonly used for file sharing
- Especially in piracy, since it’s almost impossible to trace the origin of files
- For larger files
- Each peer individually downloads parts of the file
- Once a peer has enough of the file, it can start sharing with other peers
- Other peers then download parts of the file from peers that have already downloaded
those parts
- Once a peer has a whole file, it can be used as a seed for other peers to start
downloading the file
- This model enables large files to be downloaded without the need for a powerful
server
- Each peer is responsible for its own security & data backup
- Peers usually have their own printers
- Can send print jobs to another peer to process, but that peer must be switched on to
communicate with the printer
Peer-to-peer networks are best suited to smaller organisations with fewer computers, or where fewer users
need access to the same data
Advantages
- Allows all users to share & request resources
- Not dependent on a central server
- Specialist staff are not required
Disadvantages
- Backups must be performed separately
- May be difficult to locate resources & maintain a well-ordered file store
The router maintains a table which enables it to direct all traffic to the correct device within the LAN
- Achieved by adding a unique port number to each packet
Essential Tags
- <html>, </html> - All code written within these tags is interpreted as HTML
- <body>, </body> - Defines the content in the main content area of the webpage
- <head>, </head> - Defines the browser tab or window heading area
- <title>, </title> - Defines the text that appears with the tab or window heading area
- <h1> </h1>, <h2> </h2> & <h3> </h3> - Heading styles in decreasing sizes, h1 being the largest
- <link> link to an external sheet (e.g. a CSS style sheet: <link rel="stylesheet" href="styles.css"> )
- <p></p> defines a paragraph of text
- <img src="imageFilename.png" alt="alt text for the image" style="width: 100px; height: 70px;">
- <div></div> forms a division in the HTML page, which can be assigned identifiers or classes so that it
can be referenced in external files, such as CSS or Javascript
- <div id = "theDivision"> </div>
- <div class = "myDiv"> </div>
- <form> </form> Defines a web form where users can input data
- <input> Defines where a user can have input into a web form (enter text, select items, submit, etc.)
- Example:
- <form action="/theform.php" method="get">
<input type="text" id="fname" name="fname"><br><br>
<input type="submit" value="Submit">
</form>
- Types of input: button, checkbox, color, date, datetime-local, email, file, hidden, image,
month, number, password, radio, range, reset, search, submit, tel, text, time, url, week
- <ol> </ol> Defines an ordered list, in which items will be numbered
- <ul> </ul> Defines an unordered list, in which items will be bullet pointed
- <li> </li> Defines a list item
- <script> </script> Signifies that all code between these tags will be identified as Javascript
HTML Code
<!DOCTYPE html>
<html>
<!-- everything within the <html> </html> tags is read as HTML code -->
<head>
<!-- This is the head tag, which holds information about the page, such as its title and style sheets -->
<meta charset="utf-8">
<meta name="viewport" content="width=device-width">
<!-- This is a title tag, which sets the text shown in the browser tab or window heading -->
<title>replit</title>
<link href="style.css" rel="stylesheet" type="text/css" />
<!-- Above is a link to an external CSS file -->
<!-- This CSS code is embedded within the HTML page, it only affects the styling of all h2 tags ON THIS
PAGE ONLY! -->
<style>
h2{
color:#34fe88;
}
</style>
</head>
<body>
<!-- This is a body tag, where the content of the webpage is contained -->
<script src="script.js"></script>
<script>
var name = prompt("What is your name? ", "");
alert("Welcome to the webpage, " + name + "!");
document.write("Hello, " + name + "! This is Javascript embedded into the HTML file!");
</script>
<p> above is a link to an external script, the js script. And this message is enclosed in paragraph tags</p>
<br>
<h1> This is heading 1 </h1>
<p> the styling of this heading is in an external style sheet, which applies to all h1 elements on every
page in this project</p>
<div id="colourfulHeader">
<h4> Identifiers &amp; Classes </h4>
<p>identifiers are only used once on a page, and are used to style specific elements on a page. when using
css, this is prefixed with a hashtag (#)</p>
<p>a class is used to style multiple elements on a page, and is used to style elements with the same style.
when using css, this is prefixed with a full stop (.)</p>
</div>
<div class="classExample">
<p> This is a paragraph tag within a div tag. a div tag just shows a division in the HTML code. it is being
stylised using a class named "classExample" in the external style sheet</p>
</div>
<ol>
<li>ordered list, with numbers</li>
<li>the items are in order with numbers</li>
<li>which is defined by the ol tag</li>
</ol>
<ul>
<li>this list is unordered, with bullet points</li>
<li>defined by the ul tag </li>
<li>and have bullet points</li>
</ul>
</body>
</html>
2. Internal Styling: When the styling of a tag is applied within the HTML code, within style tags, and
applies to all instances of that tag within that page
a. Only applies to the page it is embedded in
b. Only applies to the type of tag specified in the style code (e.g. h1, h2, title, etc.)
c. Must be applied with correct CSS syntax within <style></style> tags within the HTML code
d. Overrides external styling, but can be overridden by in-line styling
e. Example: (within HTML code):
i. <style>
h2{
color:#34fe88;
}
</style>
Essential Properties
These can be applied to the entire HTML page, or just a single container of text, using either in-line, internal
or external styling.
- background-color
- border-color
- border-style
- border-width
- font-family
- font-size
- height
- width
- color (with named & hex colours)
Examples:
CSS Code
html {
height: 100%;
width: 100%;
border-color: #ed719e;
border-width: 10px;
border-style:dotted;
}
h1{
color:#e392fe;
font-size: 20px;
font-family: "Lucida Console", "Courier New", monospace;
}
#colourfulHeader{
background-color: lightpink;
font-family: "verdana";
}
.classExample {
background-color: lemonchiffon;
color: black;
border-radius: 20%;
border-color: black;
border-width: 2px;
border-style: solid;
}
Javascript
General Syntax Rules
- All statements (lines that output, declare a variable, etc.) should end with a semicolon
- All sections of code are contained within curly brackets {}, with the final curly bracket being on the
line below the section it contains, on its own separate line
- Javascript is case sensitive, so ‘age = 18’ is not the same as writing ‘Age = 18’ and will be treated
as 2 separate variables with no affiliation
Variables
Can be declared in 4 ways:
- Automatically (typically seen as bad practice to do this)
- age = 18;
- Using ‘var’ (only used to support older browsers)
- var age = 18;
- Using ‘let’ (should be used when value should be able to be reassigned)
- let age = 18;
- Using ‘const’ (should be used when value should not be changed during the program)
- const age = 18;
After declaring, the keyword does not need to be repeated when the variable is used again (e.g. if you wanted
to change age later on, you could put ‘age = 16’ instead of ‘var age = 16’, bearing in mind that variables
declared with ‘const’ cannot be changed through reassignment later in the program)
Casting
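// Boolean to number using Number()
bool = false;
num = Number(bool); // num is 0
// Boolean to number using the unary plus operator
bool = false;
num = +bool; // num is 0
// Number to Boolean using Boolean()
num = 0;
bool = Boolean(num); // bool is false
// Number to Boolean using double negation
num = 0;
bool = !!num; // bool is false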
Outputting to Screen
This can be done in 3 different ways:
- By changing the contents of an HTML element
- Example:
- chosenElement = document.getElementById("example");
chosenElement.innerHTML = "Hello World";
- In this example, the item in HTML code contained within the tag where id = “example”
is replaced with “Hello World”
Key aspects:
- var i = 0: Defines the starting number for the iteration
- i < 5: Defines the condition for the loop to continue; the code repeats while i is less than 5, so it loops 5 times
- i++: Defines the increment of i after each iteration, the ++ being a representation of +1
- All parts of the for statement must be separated by semicolons
Key aspects:
- var i = 0: Declaration of the variable to be the condition for the loop
- i < 5: The condition to be met for the iteration to be continued
- i++: Defines the increment of i after each iteration, the ++ being a representation of +1
Arithmetic
Operator Description
+ Add
- Subtract
* Multiply
** Exponentiation
/ Division
++ Increment
-- Decrement
Arithmetic Assignment
+= x += y x=x+y
-= x -= y x=x-y
*= x *= y x=x*y
/= x /= y x=x/y
%= x %= y x=x%y
**= x **= y x = x ** y
Logical
Operator Description
== Equal to
!= Not equal to
The output can be done using any of the 3 methods described in the ‘Outputting to Screen’ section
String Handling
indexOf() - Returns the index of the first occurrence of a specified value
let str = "Hello, World!";
let index = str.indexOf("o");
alert(index);
lastIndexOf() - Returns the index of the last occurrence of a specified value
let str = "Hello, World!";
let lastIndex = str.lastIndexOf("o");
alert(lastIndex);
replace() - Returns a new string with some or all matches of a pattern replaced by a replacement.
let str = "Hello, World!";
let newStr = str.replace("World", "JavaScript");
alert(newStr);
replaceAll() - Returns a new string with all matches of a pattern replaced by a replacement.
let str = "Hello, World! World!";
let newStr = str.replaceAll("World", "JavaScript");
alert(newStr);
trim() - Removes whitespace from both ends of a string.
let str = " Hello, World! ";
let trimmed = str.trim();
alert(trimmed);
Functions
Functions are defined with the word function before the name of the function itself
Example:
- function multiply(num1, num2) {
var total = num1 * num2;
return total;
}
Procedures
Defined with the word function before the name of the procedure itself & does not return a value to the main
program
Example:
function multiply(num1, num2) {
var total = num1 * num2;
alert(total);
}
Arrays
Declaring an Array
- Literal Notation
- Example: let array = [1, 2, 3, 4, 5];
- Array Constructor Method
- Example: let array = new Array(1, 2, 3, 4, 5);
- Array.of() Method
- Example: let array = Array.of(1, 2, 3, 4, 5);
Array Functions
pop() - Removes the last element from an array and returns that element
let array = [1, 2, 3];
let lastElement = array.pop();
alert(lastElement);
at() - Returns the element at the given index, allowing for positive and negative integers
let array = [1, 2, 3, 4, 5];
let element = array.at(2);
alert(element);
indexOf() - Returns the first index at which a given element can be found in the array
let array = [1, 2, 3];
let index = array.indexOf(2);
alert(index);
lastIndexOf() - Returns the last index at which a given element can be found in the array
let array = [1, 2, 3, 2];
let lastIndex = array.lastIndexOf(2);
alert(lastIndex);
Web Crawlers
Software used to collect information about websites to build an index of web pages
- Work by traversing the internet, one web page at a time by using links on websites
- Collect keywords & phrases from the linked web pages, then add this information to the index
- Collect & add meta data from websites, which is information provided by the website owner
PageRank Algorithm
An algorithm used to rank web pages, determining the order in which web pages are displayed when a
search is conducted, in which higher ranked pages will show up first (at the top of the page)
- The PageRank algorithm was devised in 1996 by Larry Page and Sergey Brin, the founders of
Google
PageRanks of webpages are constantly being recalculated & updated, and each time the calculations are
run, the process produces slightly more accurate results
- Eventually, the PageRank of each page will stabilise, when further iterations of the algorithm
would produce a negligible change (if any), so wouldn’t cause a difference in rank
Factors Affecting PageRank
- There are multiple factors which determine the page rank of a page:
- How many incoming links it has from other web pages
- The quality of the links to a page
- The page rank of the web pages that link to it
- Popularity of the page
- Popularity of incoming pages
Damping Factor
The probability, at any step, that the person will continue following links from the webpage they are
currently on
- The remaining probability (1 - damping factor) is known as the teleportation probability
- Usually set to 85%, due to the random surfer model
- Stops the page rank of pages linked to page A from having too much effect
- Based on the idea of a random surfer (random surfer model) who will follow a link from a page 85%
of the time, but 15% of the time will go somewhere else entirely
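To see how the damping factor feeds into the repeated recalculation, here is a small sketch. It assumes the commonly taught form PR(A) = (1 - d) + d × (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)), where C(T) is the number of outgoing links on page T; the three-page web and the function name page_rank are invented for the example.

```python
def page_rank(links, damping=0.85, iterations=100):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    ranks = {page: 1.0 for page in pages}  # every page starts with the same rank
    for _ in range(iterations):
        new_ranks = {}
        for page in pages:
            # Each page that links here passes on its rank, shared equally
            # between all of its outgoing links.
            incoming = sum(
                ranks[other] / len(links[other])
                for other in pages
                if page in links[other]
            )
            new_ranks[page] = (1 - damping) + damping * incoming
        ranks = new_ranks
    return ranks


# Hypothetical web: A links to B and C, B links to C, C links to A
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = page_rank(links)
for page, rank in sorted(ranks.items()):
    print(page, round(rank, 3))
```

Page C ends up with the highest rank because it has the most incoming links, and after enough iterations the values stop changing noticeably, which is the stabilising behaviour described earlier.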
Thin Clients
Advantages:
- Easy to set up, maintain and add terminals on a network as little client installation required
- Software can be managed centrally
Disadvantages:
- Reliant on the server
- Requires high end hardware for reliability
- Higher demand on the server
Thick Clients
Advantages:
- Robust and reliable providing greater up time
Disadvantages:
- More expensive client side hardware required
Server-Side Processing
- Can put strain on a server, so it is important that the server only processes data it has to (e.g. for
security reasons)
- Can slow the web experience down due to data transfer between client and server
- Good for security related validation checks (e.g. looking up passwords on database)
Client-Side Processing
- Reduces the load on a server by offloading processes to the client
- Reduces web traffic as less data gets sent / received
- Good for initial data entry validation (e.g. presence/format checks)
128  64   32   16   8    4    2    1
2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
128  64   32   16   8    4    2    1
                    1
4. Subtract next largest power of 2: 5 - 4 = 1
5. Record a 1 in the ‘4’ position:
128  64   32   16   8    4    2    1
                    1    1
6. Subtract next largest power of 2: 1 - 1 = 0
7. Record 1 in the ‘1’ position
128  64   32   16   8    4    2    1
                    1    1         1
8. Fill in 0s for the rest of the positions skipped:
128  64   32   16   8    4    2    1
0    0    0    0    1    1    0    1
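The subtraction method used in this example (which converts 13) can be written as a short function. This is just one way to express it; the name denary_to_binary and the 8-bit default are choices made for this sketch.

```python
def denary_to_binary(number, bits=8):
    """Convert a non-negative integer to binary by subtracting powers of 2."""
    if not 0 <= number < 2 ** bits:
        raise ValueError("number does not fit in the given number of bits")
    digits = ""
    for position in range(bits - 1, -1, -1):
        place_value = 2 ** position  # 128, 64, 32, ..., 1
        if place_value <= number:
            number -= place_value    # e.g. 13 - 8 = 5, then 5 - 4 = 1
            digits += "1"            # record a 1 in this position
        else:
            digits += "0"            # position skipped, so fill in a 0
    return digits


print(denary_to_binary(13))  # 00001101
```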
Place Value:
16    1
16^1  16^0
Letter/Number Representations:
-
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 1 2 3 4 5 6 7 8 9 A B C D E F
Example - 42 in Hexadecimal
1. 42 > 16, and so 42 / 16 = 2.625,
a. Integer value of the division is 2, therefore the first digit is ‘2’
2. Remainder of this division is 10
a. This corresponds to the letter ‘A’ in the table above, therefore the second digit is ‘A’
Therefore, 42 in hexadecimal is 2A
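The same divide-and-remainder steps can be sketched in code. The function below is illustrative only and is limited to two hexadecimal digits (0 to 255) to match the place value table above.

```python
HEX_DIGITS = "0123456789ABCDEF"  # index 10 is 'A', index 15 is 'F'


def denary_to_hex(number):
    """Convert a denary value from 0 to 255 into two hexadecimal digits."""
    if not 0 <= number <= 255:
        raise ValueError("expected a value between 0 and 255")
    sixteens, units = divmod(number, 16)  # 42 gives quotient 2, remainder 10
    return HEX_DIGITS[sixteens] + HEX_DIGITS[units]


print(denary_to_hex(42))  # 2A
```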
Two’s Complement
A method of representing negative numbers in binary that works by making the most significant bit
negative.
- For example, with eight bits (a byte) the most significant bit, usually 128, represents -128.
Subtracting 12 from 8
- This is done in five bit two’s complement (where the left-most bit represents -16),
- 8 is 01000 & -12 is 10100.
The two’s complement numbers are then added using regular binary
addition & the result is 11100, which is -4 in denary (-16 + 8 + 4)
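The subtraction above can be checked with a short sketch. The helper names are invented for this example, and any carry out of the most significant bit is simply discarded.

```python
def to_twos_complement(value, bits):
    """Return the two's complement bit pattern of value as a string."""
    if not -(2 ** (bits - 1)) <= value < 2 ** (bits - 1):
        raise ValueError("value does not fit in the given number of bits")
    return format(value % (2 ** bits), f"0{bits}b")


def from_twos_complement(pattern):
    """Return the denary value of a two's complement bit pattern."""
    value = int(pattern, 2)
    if pattern[0] == "1":            # the most significant bit is negative
        value -= 2 ** len(pattern)
    return value


def subtract(a, b, bits=5):
    """Work out a - b by adding the two's complement of b to a."""
    total = int(to_twos_complement(a, bits), 2) + int(to_twos_complement(-b, bits), 2)
    return format(total % (2 ** bits), f"0{bits}b")  # discard any carry out


print(to_twos_complement(8, 5))       # 01000
print(to_twos_complement(-12, 5))     # 10100
print(subtract(8, 12))                # 11100
print(from_twos_complement("11100"))  # -4
```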
Example: In denary, in the number 7.52 × 10^13, the mantissa would be ‘7.52’ & the exponent is ‘13’
Floating point binary numbers are represented in two’s complement, so the left most bit of both the
mantissa and the exponent is always a sign bit.
- This is so it can represent both positive & negative mantissas, as well as both positive & negative
exponents
Mantissa Exponent
-1 1/2 1/4 1/8 1/16 1/32 1/64 1/128 1/256 1/512 -32 16 8 4 2 1
The point is implied to be immediately after the sign bit of the mantissa, however, there is no physical bit
allocated to be the point
- All mantissa values are fractional between (close to) +1 (this being 511/512) and -1
- All exponents are integers between -32 and 31 (in this example)
Converting from positive floating point to denary:
1. Rewrite the number in ‘standard form’ (a × 2^b)
2. Convert the exponent to denary
3. Move the binary point in the mantissa according to the magnitude of the exponent
a. Positive Exponent: Move to the right
b. Negative Exponent: Move to the left
4. Read the converted number
Mantissa Exponent
-1 1/2 1/4 1/8 1/16 1/32 1/64 1/128 1/256 1/512 -32 16 8 4 2 1
0 1 1 0 1 0 0 0 0 0 0 0 0 0 1 1
4 2 1 1/2
1 1 0 1
Mantissa Exponent
-1 1/2 1/4 1/8 1/16 1/32 1/64 1/128 1/256 1/512 -32 16 8 4 2 1
1 0 1 1 1 0 0 0 0 0 0 0 0 0 1 1
4 2 1 1/2
1 0 0 1
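The two worked examples above (a 10-bit mantissa with a 6-bit exponent) can be checked with a sketch like the one below. It works out mantissa × 2^exponent directly instead of moving the point by hand, and the function names are assumptions made for this example.

```python
def twos_complement_to_int(bits):
    """Read a bit string as a two's complement integer."""
    value = int(bits, 2)
    return value - 2 ** len(bits) if bits[0] == "1" else value


def floating_point_to_denary(mantissa, exponent):
    """Convert a two's complement mantissa and exponent to a denary value."""
    # The binary point sits just after the sign bit, so the mantissa read as
    # an integer has to be scaled down by 2^(number of mantissa bits - 1).
    fraction_bits = len(mantissa) - 1
    exponent_value = twos_complement_to_int(exponent)
    return twos_complement_to_int(mantissa) * 2.0 ** (exponent_value - fraction_bits)


print(floating_point_to_denary("0110100000", "000011"))  # 6.5
print(floating_point_to_denary("1011100000", "000011"))  # -4.5
```

The first call matches the positive example (0110.1 in binary is 6.5) and the second matches the negative example, whose magnitude is 4.5.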
Process of Normalisation
1. Identify whether mantissa is positive or negative
2. Shift the mantissa ‘point’ left or right so that the mantissa begins 0.1 or 1.0
a. Positive Mantissa: 0.1
b. Negative Mantissa: 1.0
3. Change exponent according to number of places shifted
a. Left: Increase exponent by magnitude of shift
b. Right: Reduce exponent by magnitude of shift
4. Combine to create full normalised floating point number
Mantissa Exponent
-1 1/2 1/4 1/8 1/16 1/32 1/64 1/128 1/256 1/512 -32 16 8 4 2 1
0 0 0 1 1 0 0 0 0 0 0 0 0 1 0 1
1. This mantissa is positive, so it must be shifted to have a 0.1 at the beginning
2. To normalise the mantissa the point needs to be moved 2 to the right (discarding 0s):
Mantissa
0 1 1 0 0 0 0 0 0 0
Exponent
-32 16 8 4 2 1
0 0 0 0 1 1
Mantissa Exponent
-1 1/2 1/4 1/8 1/16 1/32 1/64 1/128 1/256 1/512 -32 16 8 4 2 1
0 1 1 0 0 0 0 0 0 0 0 0 0 0 1 1
Mantissa Exponent
-1 1/2 1/4 1/8 1/16 1/32 1/64 1/128 1/256 1/512 -32 16 8 4 2 1
1 1 1 1 1 0 0 0 1 0 0 0 0 1 0 1
Mantissa
1 0 0 0 1 0 0 0 0 0
-32 16 8 4 2 1
0 0 0 0 0 1
Mantissa Exponent
-1 1/2 1/4 1/8 1/16 1/32 1/64 1/128 1/256 1/512 -32 16 8 4 2 1
1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1
Mantissa Exponent
-1 1/2 1/4 1/8 1/16 1/32 1/64 1/128 1/256 1/512 -32 16 8 4 2 1
0 0 1 0 1 0 0 0 0 0 1 1 1 1 0 1
Mantissa
0 1 0 1 0 0 0 0 0 0
Exponent
-32 16 8 4 2 1
1 1 1 1 0 0
-1 1/2 1/4 1/8 1/16 1/32 1/64 1/128 1/256 1/512 -32 16 8 4 2 1
0 1 0 1 0 0 0 0 0 0 1 1 1 1 0 0
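The three normalisation examples above can be reproduced with a short sketch. It only handles the case shown in these notes, where the point moves to the right and the exponent is reduced; the function names are invented here, and it does not check whether the exponent goes out of range.

```python
def twos_complement_to_int(bits):
    """Read a bit string as a two's complement integer."""
    value = int(bits, 2)
    return value - 2 ** len(bits) if bits[0] == "1" else value


def normalise(mantissa, exponent):
    """Return the normalised (mantissa, exponent) pair as bit strings."""
    if "1" not in mantissa:
        return mantissa, exponent  # zero cannot be normalised
    exponent_value = twos_complement_to_int(exponent)
    # While the first two bits match (00... or 11...) the second bit is
    # redundant, so shift the mantissa left and reduce the exponent by 1.
    while mantissa[0] == mantissa[1]:
        mantissa = mantissa[1:] + "0"
        exponent_value -= 1
    width = len(exponent)
    return mantissa, format(exponent_value % (2 ** width), f"0{width}b")


print(normalise("0001100000", "000101"))  # ('0110000000', '000011')
print(normalise("1111100010", "000101"))  # ('1000100000', '000001')
print(normalise("0010100000", "111101"))  # ('0101000000', '111100')
```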
0 1 1 0 1 0 0 0 0 1
Subtraction
Similar to regular binary subtraction, this involves converting to two’s complement & adding
1. Make the exponents the same.
2. The mantissa of the number to be subtracted must be converted to two’s complement.
a. This is performed by flipping all the bits and adding one.
Now binary addition is carried out on the two numbers, before the result is normalised.
Bitwise Manipulation
Logical Shifts
This is a shift performed on binary numbers
- There are two varieties: logical shift left and logical shift right.
A shift involves moving all of the bits in a binary number a specified number of places to the right or to the
left.
- This can be thought of as adding a number of leading or trailing zeros.
For example, perform a logical shift left by three places to the binary 10010110
- A logical shift left by three places is the same as adding three trailing zeros. The result is therefore
10010110000.
The result of a logical shift is a multiplication (or division if shifting right) by two to the power of the number
of places shifted.
- Example: A logical shift left by one place has the effect of doubling (2^1) the initial number
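The shift from the example can be tried out with Python's shift operators, which behave as logical shifts for non-negative integers. The last part assumes an 8-bit register, which the notes do not specify, to show what happens when bits fall off the left-hand end.

```python
value = 0b10010110           # 150 in denary

shifted_left = value << 3    # same as multiplying by 2**3
shifted_right = value >> 1   # same as integer division by 2**1

print(format(shifted_left, "b"), shifted_left)    # 10010110000 1200
print(format(shifted_right, "b"), shifted_right)  # 1001011 75

# In a fixed-width register, bits shifted past the left-hand end are lost.
eight_bit_result = (value << 3) & 0xFF
print(format(eight_bit_result, "08b"))            # 10110000
```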
Masks
A mask can be applied to binary numbers by combining them with a logic gate
- These logic gates are usually AND, OR or XOR
AND OR XOR
A B P A B P A B P
0 0 0 0 0 0 0 0 0
0 1 0 0 1 1 0 1 1
1 0 0 1 0 1 1 0 1
1 1 1 1 1 1 1 1 0
By ‘masking’ a number, you’re applying the corresponding logic gate to each bit of the original number,
using a number known as the ‘mask’
Examples:
AND OR XOR
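As a quick illustration of each mask, the snippet below applies AND, OR and XOR to the same 8-bit number. The values of number and mask are arbitrary choices for this sketch.

```python
number = 0b10101010
mask = 0b00001111

# AND keeps only the bits where the mask has a 1 (the rest become 0)
print(format(number & mask, "08b"))  # 00001010

# OR sets every bit where the mask has a 1
print(format(number | mask, "08b"))  # 10101111

# XOR flips every bit where the mask has a 1
print(format(number ^ mask, "08b"))  # 10100101
```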
Character Sets
Character Set - A published collection of codes and corresponding characters which can be used by
computers for representing text.
- Two widely used character sets are ASCII and Unicode.
Unicode
Character set which uses a varying number of bits, allowing for over 1 million different characters, many of
which have yet to be allocated.
- Because of this, Unicode has enough capacity to represent a wealth of different languages, symbols
and emoji
1.4.2 - Data Structures
Arrays
More array functions are outlined within 1.3.4 - Web Technologies: Arrays
Array - An ordered, finite set of elements of a single type
- Unless stated in the question, arrays are always taken to be zero-indexed, which means that the
first element in the array is considered to be at position zero
One-Dimensional (1D)
A 1D array is a linear array; it only needs to be accessed in the form array[index]
- Can be created like this: array = [1, 23, 12, 14, 16, 29, 12]
- And then accessed (& printed) like this: print(array[3])
- Which would output ‘14’
Records
Record - one instance of an entity in a database (more info in 1.3.2 - Databases)
- Commonly referred to as a row in a file (table) and is made up of fields.
- Each field in the record can be identified by recordName.fieldName after it has been created
- Used in databases, for example, a file containing 3 records, where each record has 4 fields:
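One possible way to model such a file in Python is sketched below, using a dataclass as the record type. The Student record, its four field names and the three rows of data are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Student:            # hypothetical record type with 4 fields
    student_id: int
    first_name: str
    surname: str
    year_group: int

# A 'file' containing 3 records (made-up data)
students = [
    Student(1, "Ada", "Lovelace", 12),
    Student(2, "Alan", "Turing", 13),
    Student(3, "Grace", "Hopper", 12),
]

first = students[0]
print(first.surname)      # fields are accessed as recordName.fieldName
```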
Lists
List - a data structure consisting of a number of ordered items where the items can occur more than once
- Elements can be accessed in the form list[index]
- Values are stored non-contiguously: they do not have to be stored next to each other in memory
- Can also contain elements of more than one data type
List Manipulation
format: listName.function(parameters)
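The listName.function(parameters) format can be illustrated with Python's built-in list methods. These are Python's own method names and may differ from the names used in exam pseudocode; the values are illustrative.

```python
items = [3, "apple", 3]           # duplicates and mixed data types are allowed

items.append(7.5)                 # add to the end
items.insert(1, "pear")           # add at a given index
items.remove(3)                   # remove the first occurrence of a value
last = items.pop()                # remove and return the final item
position = items.index("apple")   # find the index of a value

print(items, last, position)
```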
Tuples
Tuple - An ordered set of values of any type
- Immutable, so cannot be changed, & elements cannot be added or removed once it has been
created
- Attempting to do so will result in an error (in Python, a TypeError raised when the code runs)
- Initialised using regular brackets instead of square brackets
- Example: newTuple = ("Value1", 2, 33.21)
- Elements in a tuple are accessed in a similar way to elements in an array, for example:
- print(newTuple[0])
>> Value1
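A short sketch of immutability in Python, reusing the tuple from the example above. In Python, the failed assignment raises a TypeError at run time, which is caught here so the script can continue.

```python
newTuple = ("Value1", 2, 33.21)

try:
    newTuple[0] = "Changed"       # tuples do not allow item assignment
except TypeError as error:
    message = str(error)
    print("Error:", message)

print(newTuple[0])                # the tuple is unchanged
```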
Linked Lists
Linked List - A dynamic data structure used to hold an ordered sequence
- Items in the list don’t have to be in contiguous data locations
- Each item is called a node, & contains a data field alongside another address called a link or pointer
field
- Each list also stores a predetermined ‘Start’ pointer variable & a variable for the next free index,
which gets updated as the list fills up
- Doubly Linked List: When an extra pointer is added to every node in a linked list so it can point to
the previous and next items
- Circular Linked List: When the pointer of the last node points to the start node
- The order of items can be changed by changing the pointers, but items cannot be directly accessed
like in arrays (no ‘random access’)
- Random Access - The ability to access a specific element directly, given its index
When traversing a linked list, the algorithm begins at the index given by the ‘Start’ pointer and outputs the
values at each node until it finds that the pointer field is empty/null.
- This signals that the end of the linked list has been reached.
Example 1
In the table shown on the right:
- The ‘Data’ field contains the value of the actual data which is part of
the list
- The ‘Pointer’ field contains the address of the next item in the list
- The variable ‘Start’ contains the index of the first item in the list
- The variable ‘NextFree’ contains the index of the next free space in the
list
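A minimal sketch of a linked list stored in an array, with 'Start' and 'NextFree' variables as described above. The data values and pointer values are made up for illustration, and None stands in for the null pointer.

```python
# Each node is [data, pointer]; None marks the end of the list (null pointer)
nodes = [
    ["Cat", 2],       # index 0
    ["Ant", 0],       # index 1
    ["Dog", None],    # index 2
    [None, None],     # index 3 (unused)
]
start = 1             # index of the first item in the list
next_free = 3         # index of the next free space

def traverse(nodes, start):
    """Follow the pointers from 'start' and collect each data value in order."""
    values = []
    current = start
    while current is not None:
        data, pointer = nodes[current]
        values.append(data)
        current = pointer
    return values

print(traverse(nodes, start))
```

The items are stored in the order Cat, Ant, Dog, but following the pointers visits them as Ant, Cat, Dog, so the logical order depends only on the pointers.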
Graphs
This is a set of vertices/nodes connected by edges/arcs. There are 3 different types:
- Directed Graph: The edges can only be traversed in one direction
- Undirected Graph: The edges can be traversed in both directions
- Weighted Graph: A ‘cost’ is attached to each edge/arc
Key Terms
- Neighbours - 2 nodes that are connected to each other by an edge
- Degree - The number of other nodes that it is connected to
- Loop - An edge that connects a node to itself
- Path - A sequence of nodes that are connected by edges
- Cycle - A closed path, i.e. a path that starts and ends at the same node (& no node is visited more
than once)
Computers are able to process graphs by using an adjacency matrix or an adjacency list
Adjacency Matrix - A square matrix used to represent a finite graph, where the elements of the matrix
indicate whether pairs of vertices are adjacent or not in the graph
+ More convenient to work with due to quicker access times
+ Easy to add nodes
Adjacency List - A collection of unordered lists used to represent a finite graph, where each unordered list
within an adjacency list describes the set of neighbours of a particular vertex in the graph
+ More efficient for large, sparse networks
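The two representations can be compared with a small sketch. The four vertices and three edges below are an assumed example of an undirected, unweighted graph.

```python
vertices = ["A", "B", "C", "D"]                 # assumed example graph
edges = [("A", "B"), ("A", "C"), ("C", "D")]    # undirected, unweighted

index = {vertex: i for i, vertex in enumerate(vertices)}
matrix = [[0] * len(vertices) for _ in vertices]
adjacency_list = {vertex: [] for vertex in vertices}

for u, v in edges:
    # Undirected graph: record each edge in both directions
    matrix[index[u]][index[v]] = 1
    matrix[index[v]][index[u]] = 1
    adjacency_list[u].append(v)
    adjacency_list[v].append(u)

for row in matrix:
    print(row)
print(adjacency_list)
```

The matrix always stores one entry per pair of vertices, while the list only stores the edges that exist, which is why the list tends to suit large, sparse graphs.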
Stack
Stack - An abstract data type that holds an ordered, linear sequence of
items, a last in, first out (LIFO) structure
- Used to reverse an action, such as to go back a page in web
browsers or press the ‘undo’ button
- Implemented using a pointer which points to the top of the stack,
where the next piece of data will be inserted.
- Can be implemented as either a static structure or a dynamic
structure.
- Static: Stack with a predetermined size
- Dynamic: Stack without a predetermined size
- Static stacks are better for when the maximum size required
is known in advance, & are easier to implement & make
more efficient use of memory.
Stack Manipulation
format: stackName.function(parameters)
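A sketch of a static stack, following the description above in which the pointer marks where the next item will be inserted. The class and method names (Stack, push, pop, peek, is_empty, is_full) are assumptions and may not match the names used in exam pseudocode.

```python
class Stack:
    """Static stack: fixed capacity, 'top' is where the next item will go."""

    def __init__(self, capacity):
        self.items = [None] * capacity
        self.top = 0

    def is_empty(self):
        return self.top == 0

    def is_full(self):
        return self.top == len(self.items)

    def push(self, value):
        if self.is_full():
            raise OverflowError("stack is full")
        self.items[self.top] = value
        self.top += 1

    def pop(self):
        if self.is_empty():
            raise IndexError("stack is empty")
        self.top -= 1
        return self.items[self.top]

    def peek(self):
        if self.is_empty():
            raise IndexError("stack is empty")
        return self.items[self.top - 1]

history = Stack(3)
for page in ["home", "news", "sport"]:
    history.push(page)
print(history.pop())   # last in, first out
```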
Queue
Queue - An abstract data type that holds an ordered, linear sequence of items, a first in, first out (FIFO)
structure
- Linear Queue - A data structure consisting of an array
- Items are added into the next available space in the queue, starting from the front
- Items are removed from the front of the queue
- Make use of 2 pointers: one pointing to the front of the queue & one pointing to the back of the
queue (where the next item can be added)
- Priority Queue - A data structure where each element in the queue has a priority
- When new elements are added to the queue, they are inserted ahead of those of lower
priority and behind elements of equal priority
- Static Queue: Queue where the size is predetermined
- Circular Queue: Type of static queue in which, once the rear pointer reaches the maximum size of the
queue, it can loop back to the front of the array & store values there, provided that the space is
empty
- Dynamic Queue: Queue where size is not predetermined
Key Terms
- Enqueue: Adding items to the back of the queue
- Dequeue: Removing items from front of the queue
Syntax: queueName.function(Parameters)
[Table - Initialise, Enqueue and Dequeue operations for a Linear Queue and a Circular Queue]
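A minimal sketch of a circular queue built on a fixed-size array. The class name, method names and the extra size counter are assumptions made for this example; the modulo operator is what makes the pointers loop back to the front of the array.

```python
class CircularQueue:
    """Static queue whose pointers wrap around to the start of the array."""

    def __init__(self, max_size):
        self.items = [None] * max_size
        self.front = 0      # index of the item at the front
        self.rear = 0       # index where the next item will be added
        self.size = 0       # number of items currently stored

    def is_empty(self):
        return self.size == 0

    def is_full(self):
        return self.size == len(self.items)

    def enqueue(self, value):
        if self.is_full():
            raise OverflowError("queue is full")
        self.items[self.rear] = value
        self.rear = (self.rear + 1) % len(self.items)   # wrap around
        self.size += 1

    def dequeue(self):
        if self.is_empty():
            raise IndexError("queue is empty")
        value = self.items[self.front]
        self.items[self.front] = None
        self.front = (self.front + 1) % len(self.items)  # wrap around
        self.size -= 1
        return value

queue = CircularQueue(3)
for job in ["a", "b", "c"]:
    queue.enqueue(job)
print(queue.dequeue())   # first in, first out
queue.enqueue("d")       # reuses the freed space at the front of the array
print(queue.items)
```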
Tree
Tree - A connected, undirected graph with no cycles
- Nodes are connected to other nodes using branches, with the lower-level nodes being the children
of the higher-level nodes
Key Terms
- Node - An item in the tree
- Root Node - The start node for traversals, a single
node which does not have any incoming edges
- Edge/Branch/Arc - Connects two nodes together
- Child - A node with incoming edges
- Parent - A node with outgoing edges
- Subtree - Subsection of a tree consisting of a parent
and all the children of a parent
- Leaf - A node with no children
Post-Order Traversal
Follows the order: left subtree, right subtree, root node
- Starts at the leftmost child node, then its parent node, then the next
leftmost child node, then rightmost child node, then the parent node,
then that node’s parent node, then the rightmost child node, then its
parent node until you reach the root node
Example:
- Order of Traversal: 7, 5, 10, 12, 11, 9, 34, 25, 20, 15
- The leftmost child node (7), then its parent node (5), then the next leftmost child node (10), then
rightmost child node (12), then the parent node (11), then that node’s parent node (9), then the
rightmost child node (34), then its parent node (25, 20) until you reach the root node (15)
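The traversal can be sketched recursively. The tree shape below is an assumption reconstructed from the order of traversal given above (the original diagram is not reproduced here), and the Node class is a hypothetical helper.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def post_order(node):
    """Visit the left subtree, then the right subtree, then the node itself."""
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.value]

# Assumed tree shape, chosen to be consistent with the traversal order above
root = Node(15,
            Node(9,
                 Node(5, Node(7)),
                 Node(11, Node(10), Node(12))),
            Node(20, None,
                 Node(25, None, Node(34))))

print(post_order(root))
```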
Hash Table
Hash Table - an array which is coupled with a hash function
- Hash Function - A function that can be used to map data of arbitrary size to fixed-size values
- Takes in data (a key) and releases an output (the hash)
- Maps the key to an index in the hash table
- Ideally, each piece of data is mapped to a unique value (when two keys map to the same value, a
collision occurs)
- Used for indexing, as they provide fast access to data due to keys having a unique, one-to-one
relationship with the address at which they are stored
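A minimal sketch of a hash table, assuming a very simple hash function (the sum of the character codes modulo the table size). The table size, keys and values are illustrative, and this sketch does not handle collisions: a second key that hashes to the same index would overwrite the first.

```python
TABLE_SIZE = 11    # assumed table size

def simple_hash(key, table_size=TABLE_SIZE):
    """Map a string key to an index: sum of character codes mod table size."""
    return sum(ord(character) for character in key) % table_size

table = [None] * TABLE_SIZE

def insert(key, value):
    # No collision handling: an existing entry at this index is overwritten
    table[simple_hash(key)] = (key, value)

def lookup(key):
    entry = table[simple_hash(key)]
    if entry is not None and entry[0] == key:
        return entry[1]
    return None

insert("cat", "meow")
insert("dog", "woof")
print(simple_hash("cat"), lookup("cat"))
```

Looking up a value only requires recomputing the hash, so no search through the array is needed.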
Truth Table - A table showing all possible combinations of inputs and, for each combination, the output that
the circuit will produce
- Variables in expression are filled in on the columns, and all the possible combinations are listed
beneath them
- The result of the expression given is then shown on another column heading for the output, labelled
as either a random letter (typically C, P, or Y)
Truth tables can be used to evaluate operations step by step, for example:
- Solve:
- Truth Table:
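As a separate illustration (not the worked example from the notes), a truth table can be generated by trying every combination of inputs. The expression used here, P = (A AND B) OR C, and the helper name truth_table are assumptions for this sketch.

```python
from itertools import product

def truth_table(expression, number_of_inputs):
    """Return one row per input combination: the inputs followed by the output."""
    rows = []
    for values in product([0, 1], repeat=number_of_inputs):
        rows.append(values + (int(expression(*values)),))
    return rows

rows = truth_table(lambda a, b, c: (a and b) or c, 3)

print("A B C P")
for row in rows:
    print(*row)
```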
Karnaugh Maps
Karnaugh Map - a version of a truth table for a Boolean expression that is laid out in a way that makes it
easier to simplify
- Filled in corresponding to the expression’s truth table
- Can be filled in from 1-4 inputs
- For 3/4 inputs: Each row is 1 bit different than the previous row
Example:
P = (A ^ B) V C
Step 1: Expression - P = (A ^ B) V C
Step 2: Fill Headings
Step 3: Fill in rest of Karnaugh Map
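Filling in the map for this expression can be sketched as follows. The layout (columns for AB, rows for C) is an assumption, since the original diagram is not reproduced; the column order 00, 01, 11, 10 means neighbouring headings differ by only one bit.

```python
# Column headings for AB in an order where neighbours differ by one bit
GRAY_ORDER = [(0, 0), (0, 1), (1, 1), (1, 0)]

def karnaugh_map(expression):
    """Return rows for C = 0 and C = 1, with columns in GRAY_ORDER for (A, B)."""
    return [[int(expression(a, b, c)) for a, b in GRAY_ORDER] for c in (0, 1)]

kmap = karnaugh_map(lambda a, b, c: (a and b) or c)

print("C \\ AB  00 01 11 10")
for c, row in enumerate(kmap):
    print(c, "      ", *row, sep="  ")
```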
2. Label the headings for each input (A / B, A / NOT B, NOT A / NOT B etc.)
3. For each group, work out its expression by focusing on what’s common to all cells in the group
- Bearing in mind that, a 0 is NOT [column heading] & 1 is [column heading]
- if a column heading is AB for example,
- 00 = NOT A / NOT B
- 01 = NOT A / B
- 11 = A / B
- 10 = A / NOT B
4. Apply OR between each mini expression to create the simplified expression of the Karnaugh map
Example:
Step 1: Karnaugh Map & Group
Step 2: Label Headings
Step 3: Label Group Expressions
Step 4: Apply OR between mini expressions
¬C^D v A^B
Simplifying Boolean Expressions
There are 5 types of laws used to simplify Boolean expressions
- Commutative
- Distributive
- Associative
- De Morgan’s
- Double Negation
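Each law can be checked by testing every combination of inputs, as in the sketch below. The helper name equivalent is an assumption; the laws themselves are written in their standard forms.

```python
from itertools import product

def equivalent(left, right, number_of_inputs):
    """True if both expressions give the same output for every input combination."""
    return all(
        bool(left(*values)) == bool(right(*values))
        for values in product([False, True], repeat=number_of_inputs)
    )

commutative = equivalent(lambda a, b: a and b, lambda a, b: b and a, 2)
distributive = equivalent(lambda a, b, c: a and (b or c),
                          lambda a, b, c: (a and b) or (a and c), 3)
associative = equivalent(lambda a, b, c: (a or b) or c,
                         lambda a, b, c: a or (b or c), 3)
de_morgan_1 = equivalent(lambda a, b: not (a and b),
                         lambda a, b: (not a) or (not b), 2)
de_morgan_2 = equivalent(lambda a, b: not (a or b),
                         lambda a, b: (not a) and (not b), 2)
double_negation = equivalent(lambda a: not (not a), lambda a: a, 1)

print(commutative, distributive, associative,
      de_morgan_1, de_morgan_2, double_negation)
```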
Logic Circuit
This is the logic circuit for a D-type flip flop.
- Uses 4 NAND gates & updates the value of Q to the value
of D whenever the clock (CLK) ticks, on a rising edge.
- The value of Q is the stored value
- Q will only take on D’s value when the clock signal is on the
upward rise
- Between these moments, Q will remain unchanged
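The behaviour described above can be modelled in a few lines. This is a behavioural sketch only: it does not simulate the four NAND gates, and the class and method names are assumptions.

```python
class DFlipFlop:
    """Behavioural model: Q takes the value of D only on a rising clock edge."""

    def __init__(self):
        self.q = 0
        self._previous_clock = 0

    def update(self, d, clock):
        # A rising edge is a change in the clock signal from 0 to 1
        if self._previous_clock == 0 and clock == 1:
            self.q = d
        self._previous_clock = clock
        return self.q

flip_flop = DFlipFlop()
for d, clock in [(1, 0), (1, 1), (0, 1), (0, 0), (0, 1)]:
    print("D =", d, "CLK =", clock, "Q =", flip_flop.update(d, clock))
```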
Adders
Adder - a logic circuit which adds together the number of inputs which are true, and outputs that number in
binary
- 2 Types: Half adder & full adder
- Because the full adder has a carry input, the circuits can be chained together to form a ripple adder
- At each stage, B and C-in can be connected to the previous adder’s S and C-out, and a new input
can be attached to A.
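A sketch of both adder types using bitwise operators in place of gates. The ripple adder here uses the conventional arrangement, in which each stage's carry output feeds the next stage's carry input to add two multi-bit numbers; this is an assumption and may differ from the wiring shown in the original diagram.

```python
def half_adder(a, b):
    """Return (sum, carry) for two input bits."""
    return a ^ b, a & b

def full_adder(a, b, carry_in):
    """Return (sum, carry_out) for two input bits plus a carry input."""
    partial_sum, carry_1 = half_adder(a, b)
    total, carry_2 = half_adder(partial_sum, carry_in)
    return total, carry_1 | carry_2

def ripple_adder(a_bits, b_bits):
    """Add two equal-length bit lists, least significant bit first."""
    carry = 0
    result = []
    for a, b in zip(a_bits, b_bits):
        total, carry = full_adder(a, b, carry)
        result.append(total)
    return result + [carry]

print(full_adder(1, 1, 1))                  # three true inputs
print(ripple_adder([1, 1, 0], [1, 0, 1]))   # 3 + 5, least significant bit first
```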
An individual who can be identified by personal data is referred to as the data subject, with the law
protecting them in eight specific ways as laid out by the Government:
The Act also gives data subjects certain rights, such as the right to request a copy of the data held about
them, the right to correct the data held about them and the right to prevent marketing using contact details
provided by the data subject.
- As of 2018, The Data Protection Act (1998) has been replaced by the General Data Protection
Regulation (GDPR) and the Data Protection Act (2018)
The consequences of the second and third offences are generally worse than the first, with each offence
being punishable with imprisonment
The owner’s rights cover the way that the work can be used, e.g. copying, adapting, broadcasting, or
lending. The legislation covers:
- Literary works
- Dramatic works
- Musical works
- Artistic works
- Typographical works
- Sound recordings
- Films
- Digital assets (under 1992 extension)
In 1992, the Act was extended to include computer programs as a type of literary work, meaning that
copyright automatically applies to code.
- If an individual believes that their work has been copied, it is their responsibility to take action under
the Act
Trademark - a type of intellectual property consisting of a recognizable sign, design, or expression that
identifies a product or service from a particular source and distinguishes it from others
- Protects indications (e.g. a logo) of the commercial source of a product or service and remains in
force as long as it is actively used or registered
- A trademark owner can be an individual, business organisation, or any legal entity
Under the Act, the police and specified public bodies can, with appropriate authorisation:
- Demand that internet service providers (ISPs) provide access to their customers' digital
communications, without informing the customer
- Carry out mass surveillance of digital communications
- Demand that ISPs fit equipment to allow for digital surveillance
- Demand that someone hand over the keys to encrypted information
- Intercept and monitor ongoing digital communication
- Keep secret the existence of interception warrants and any data collected under them, even from
being revealed in court
This Act is particularly controversial as its powers extend to small agencies like local councils.
- Some people also feel that the Act is an invasion of privacy, or that it is often improperly used
1.5.2 - Moral & Ethical Issues
Moral, Ethical, Social & Cultural
Ethics - Moral principles that govern a person's behaviour or the conducting of an activity
- Usually stemming from the expectations of a society/company/group of people
- Ethical - Relating to moral principles or the branch of knowledge dealing with these
Morals - Standards of behaviour; principles of right & wrong, usually to do with an individual
- Moral - Concerned with the principles of right and wrong behaviour
- Morality - Principles concerning the distinction between right & wrong or good & bad behaviour
Morals usually refer to personal beliefs influenced by factors such as society, culture, and individual
experiences, whereas ethics are guidelines established by communities or specific groups outlining
acceptable and unacceptable actions or behaviours.
- Social - Discussing the behaviours & expectations of society
- Cultural - Discussing how ethnic groups, countries, and religious groups view different attitudes and
behaviours.
Social Media
Automated decision making (ADM) is used to determine what different users are shown on their social media feeds.
- Based on many aspects, such as users’ interactions, inference about their interests, ‘likes’ on
content, ‘followers’, recency of content, location, language, etc.
- Feared that solely reinforcing people’s interests creates a dangerous bubble in which their beliefs
are never challenged, leading to a close-minded society & issues such as ‘echo chambers’ &
confirmation bias
- Echo Chamber - An environment in which a person encounters only beliefs or opinions that
coincide with their own, so that their existing views are reinforced & alternative ideas are not
considered
- Confirmation Bias - the tendency to interpret new evidence as confirmation of one's
existing beliefs or theories
Criminal Reoffending
Across the world, law enforcement & criminal justice authorities use ADM systems to profile people, predict
their supposed future behaviour, and assess their alleged ‘risk’ of criminality or re-offending in the future.
- These predictions, profiles, and risk assessments can influence, inform, or result in policing &
criminal justice outcomes, including constant surveillance, stop and search, fines, questioning,
arrest, detention, prosecution, sentencing, and probation.
- Combined with AI, these ADM systems reproduce & reinforce discrimination on grounds including
but not limited to race, socio-economic status, & nationality
- This can infringe fundamental rights, including the right to a fair trial and the presumption of
innocence, the right to private & family life, and data protection rights
Hiring
ADM has improved productivity and made certain application processes more convenient for employers, by
filtering through candidate CVs much quicker without needing human input.
- Companies are able to hire workers faster using algorithms which can screen candidates for certain
desired qualities before the interview stage
- However, relying entirely on these algorithms could result in people being treated unfairly, as these
algorithms do not consider extenuating circumstances and are unable to process information with
the same consideration of contextual factors that humans are able to provide.
Driverless Cars
In driverless cars, the use of ADM often means decisions are made faster than humans are capable of
reacting, so it has the potential to save lives.
- However, this raises ethical questions about how to decide who should be harmed if a scenario
arises in which either a pedestrian or the driver must be harmed.
- This then raises questions about who is responsible for the consequences of this decision.
Machine Learning
Machine learning - A branch of artificial intelligence (AI) that focuses on developing systems that
can learn and improve from experience without being explicitly programmed.
- Involves using algorithms & statistical models to enable computers to identify patterns, make
decisions, and predict outcomes based on data.
Key Concepts
- Data: The input that the system uses to learn patterns. It can be structured (e.g. spreadsheets) or
unstructured (e.g. text & images)
- Model: A mathematical representation of a system that processes the data and makes predictions
or decisions based on it
- Training: The process of feeding data into a machine learning algorithm so it can learn from it and
improve its model
Expert Systems
This is a computer program that uses AI to simulate the judgement and behaviour of a human or an
organisation that has expertise and experience in a particular field.
- Usually intended to complement, not replace, human experts.
- Replicate the knowledge and experience an expert in a particular subject would have.
- Made up of a knowledge base which consists of a set of facts and rules which are used to build an
inference engine, which is interrogated to find diagnoses
Neural Networks
This is a method in artificial intelligence that teaches computers to process data in a way that is inspired by
the human brain.
- Uses a type of machine learning process, called deep learning, that uses interconnected nodes or
neurons in a layered structure that resembles the human brain.
- These ‘learn’ from a set of data that they are given and this knowledge can be applied to new data
sets, in the same way a human is able to.
- Used in pattern detection & picking up on financial fraud.
Voice Recognition
This is a deep learning technique used to identify, distinguish, and authenticate a particular person’s voice,
which evaluates an individual’s unique voice biometrics, including frequency and flow of pitch, and natural
accent.
- This is distinctly different from speech recognition, which only recognizes spoken words, whereas
voice recognition identifies the speaker.
- AI is seen within voice recognition systems which are now common within smart home systems
such as Google Home and Amazon’s Alexa.
- These have increased convenience for people but raise questions about privacy, as they are
required to be constantly switched on to function.
Environmental Effects
With technological devices being produced cheaply and widely, they have become affordable for a large
proportion of the world population, and therefore a core part of society.
- However, the effects on the planet as a result of our consumption will go on to impact future
generations as well as biodiversity.
Advantages
+ Push for Renewables: There has been a push in the UK for renewable energy which counteracts
the effects of increased electricity consumption to an extent
+ Environmentally-friendly Technology: The use of engineering and technological approaches to
understand and address issues that affect the environment with the aim of fostering environmental
improvement
+ Smart Home Systems: Some use temperature sensors to determine when heating should
be switched on and motion sensors to switch off lights when a room is empty.
+ Computer Modes: PCs & laptops offer ‘Sleep’ and ‘Stand-by’ features and some newly
developed car engines are designed to prevent them from idling so as to reduce emissions.
Disadvantages
- Peer Pressure: Combined with the affordability of electronics, people throw away their old (&
sometimes fully functioning) devices at an alarming rate
- E-Waste: Some device components are built of mercury and radioactive isotopes which are toxic
and can contaminate water supplies
- Often, e-waste is shipped to third world countries with lower environmental standards to be
disposed of, which is widely considered to be immoral & unacceptable
- Electricity Demand: As more devices are created, the demand for electricity increases with it to
power these devices.
- This requires the burning of non-renewable resources, such as fossil fuels, which emit
greenhouse gases, contributing to global warming & climate change
- Rates of climate change have accelerated over the past decade & they will continue to increase
if the current behaviour continues
Censorship & the Internet
Censorship - The act of suppressing the content that people are able to view, publish and access, usually
conducted by governments or private institutions
- May be done on the basis that such material is considered objectionable, harmful, sensitive, or
"inconvenient"
- However, can also exist on a smaller level, such as within a school in which pupils may be
prevented from accessing material deemed to be unsuitable.
- Within the workplace, censorship may be used to maintain high productivity and prevent
distractions
Political Ideas
Some countries use censorship to block out other political opinions.
- There is much debate about the extent to which the government should be able to control what we
have access to and decide what is best for the public
- In the UK, ISPs block websites with content associated with terrorism & extremist political
beliefs.
- There is fear that censorship may be used to block out alternative political beliefs.
- At this point, censorship would not be acting to protect the country but rather to push a
certain ideology, which some people consider to be unethical and unacceptable
Monitoring Behaviour
Surveillance Systems
Computers are used to monitor people’s behaviour in various environments.
- In many workplaces, employers monitor productivity by tracking the websites and applications
workers are accessing and the time spent on these
Types of Piracy
- Counterfeiting: Producing & selling unauthorised copies of software
- Cracking: Removing or bypassing copy protection mechanisms or licence keys to enable software
without a valid licence (e.g. bypassing a paywall and accessing paid content for free)
Websites must be laid out in a way that makes it easy for users to navigate between pages.
- Menus are a common tool used to provide this navigation
- In English-speaking countries, menus are displayed on the left-hand side of the page.
- In countries such as Egypt or the UAE, where Arabic is the primary language, these menus
may instead be displayed on the right-hand side of the page as Arabic is read from right to
left.
This is particularly important for online stores, as a well laid-out website which is easy-to-navigate will
attract more customers.
Colour Paradigms
When choosing a colour scheme for a website, web developers must take into account how different
colours are interpreted around the world.
- Some colours are regarded as unlucky in certain cultures and have other negative connotations.
- Example: White is associated with mourning in the Middle East, but is associated with purity in
Western cultures.
- Typically, a neutral colour scheme with widely positive connotations will be chosen, such as green
which represents luck & nature
Character Sets
In order to make websites accessible to as wide an audience as possible, the contents must be translated
into multiple languages.
- Some character sets are too small to accommodate all of the characters of a language, such as
ASCII
- ASCII only uses seven bits & so is unable to represent all of the characters in any languages
that don’t utilise the English alphabet or that use accented characters.
- Unicode is the preferred character set as it is able to represent over a million characters,
so it can be used to represent languages with vastly different characters