Database Answer
Review questions:
1.1 Definitions of Key Database Terms
1. Data: Raw facts and figures with no context on their own, which can
be processed to gain meaningful information.
2. Database: An organized collection of data that is stored and
accessed electronically, often structured in tables for easy querying
and updating.
3. DBMS (Database Management System): Software that manages
and facilitates interactions with databases, allowing users to create,
read, update, and delete data.
4. Database System: A combination of the DBMS, the database itself,
and the applications that interact with the database.
5. Database Catalog: A set of metadata that describes the structure
and definitions of objects (e.g., tables, indexes) within the database.
6. Program-Data Independence: The separation of data from
application logic, allowing changes in data structure without altering
the programs that use the data.
7. User View: A subset of the database tailored to meet the specific
needs of a user or group of users.
8. DBA (Database Administrator): The person responsible for
managing and maintaining the database system, including tasks like
backup, recovery, and access control.
9. End User: The individuals or applications that interact with the
database for retrieving, entering, or modifying data.
10. Canned Transaction: Predefined, repetitive queries or
transactions designed for standard operations, often used by non-
technical end users.
11. Deductive Database System: A database system with
capabilities to deduce or infer information from the stored data,
usually based on logic programming.
12. Persistent Object: Data objects that retain their state even
after the application that created them has ended, typically stored
in a database.
13. Meta-data: Data that describes other data, such as
information about data structure, format, and constraints.
14. Transaction-Processing Application: Applications that
involve multiple operations on a database and ensure data integrity
and consistency even in concurrent usage.
1.2 Main Types of Actions Involving Databases
1. Defining the Database: Establishing the data structure, format,
and constraints (schema definition) that the database will follow.
2. Constructing the Database: Populating the database with initial
data to enable immediate use.
3. Manipulating the Database: Operations such as querying,
updating, and deleting data, allowing users to work with stored
information.
4. Sharing the Database: Allowing multiple users and applications to
access the data concurrently, requiring mechanisms to manage
transactions and ensure data integrity.
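The first three actions can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the table layout and the sample rows are assumptions chosen to resemble the STUDENT data discussed later, and sharing (action 4) is not shown because it needs several concurrent connections.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# 1. Define: declare the structure and constraints (the schema).
conn.execute(
    "CREATE TABLE student ("
    " student_number INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " major TEXT NOT NULL)"
)

# 2. Construct: load the initial data (illustrative rows).
conn.executemany(
    "INSERT INTO student (student_number, name, major) VALUES (?, ?, ?)",
    [(17, "Smith", "CS"), (8, "Brown", "CS")],
)
conn.commit()

# 3. Manipulate: update a row, then query the result.
conn.execute("UPDATE student SET major = 'MATH' WHERE name = 'Brown'")
conn.commit()
majors = dict(conn.execute("SELECT name, major FROM student"))
print(majors)
```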
1.3 Characteristics of the Database Approach vs. Traditional File
Systems
The database approach organizes data independently from application
programs, which allows for better consistency, reduced redundancy, and
easier data sharing. In contrast, traditional file processing keeps data in
files that each application program defines and maintains for itself, making
it harder to maintain consistency, share data across applications, and
manage data dependencies. The database approach also offers program-data
independence, multiple user views, and centrally enforced security.
1.4 Responsibilities of the DBA and Database Designers
DBA: Manages overall database maintenance, performance tuning,
backup, recovery, and ensuring data security and access control.
Database Designers: Responsible for creating the data model,
structuring the schema, and ensuring efficient data storage,
integrity, and usability.
1.5 Types of Database End Users and Their Activities
1. Casual Users: Occasionally query the database, often using ad-hoc
queries.
2. Naive/Parametric Users: Rely on pre-built applications to perform
repetitive tasks, like data entry or transactions.
3. Sophisticated Users: Write complex queries of their own; often
analysts or developers who know the DBMS facilities well.
4. Standalone Users: Maintain personal databases and perform
regular tasks using packaged applications.
1.6 Capabilities Provided by a DBMS
A DBMS should offer:
Data Definition and Storage: Defining, creating, and storing
structured data.
Data Manipulation and Retrieval: Efficient querying, updating,
and modification capabilities.
Transaction Management: Ensuring reliable, consistent, and
atomic processing of transactions.
Concurrency Control: Managing simultaneous data access to
prevent conflicts.
Backup and Recovery: Protecting data integrity and restoring
databases in case of failure.
Security Management: Controlling access based on user roles and
permissions.
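Transaction management is the easiest of these capabilities to demonstrate in a few lines. The sketch below assumes SQLite through Python's sqlite3 module, and the grade_report table and its rows are illustrative. A transaction that fails halfway leaves the data exactly as it was before.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE grade_report ("
    " student_number INTEGER NOT NULL,"
    " section_identifier INTEGER NOT NULL,"
    " grade TEXT NOT NULL CHECK (grade IN ('A', 'B', 'C', 'D', 'F')),"
    " PRIMARY KEY (student_number, section_identifier))"
)
conn.execute("INSERT INTO grade_report VALUES (8, 102, 'B')")
conn.commit()

try:
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE grade_report SET grade = 'A' "
            "WHERE student_number = 8 AND section_identifier = 102"
        )
        # 'Z' violates the CHECK constraint, so the whole transaction,
        # including the UPDATE above, is undone.
        conn.execute("INSERT INTO grade_report VALUES (17, 102, 'Z')")
except sqlite3.IntegrityError:
    pass

grade = conn.execute(
    "SELECT grade FROM grade_report WHERE student_number = 8"
).fetchone()[0]
row_count = conn.execute("SELECT COUNT(*) FROM grade_report").fetchone()[0]
print(grade, row_count)
```

The grade is still the original one afterwards, which is the atomicity property: either every statement in the transaction takes effect or none does.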
1.7 Differences Between Database Systems and Information
Retrieval Systems
Database systems are structured for transaction processing, with strict
data consistency, integrity, and structured querying (e.g., SQL). They
typically handle structured data with defined schemas. Information
retrieval systems, however, focus on retrieving relevant data from
unstructured sources, such as documents or multimedia, often using
keyword matching or search indexes instead of structured querying.
Identify some informal queries and update operations that you
would expect to apply to the database shown in Figure 1.2.
Example Informal Queries:
1. List the names of all students majoring in Computer Science.
o Query: Which students have "CS" as their major?
2. Retrieve the course name and credit hours for all courses
taught in the Computer Science department.
o Query: Show all courses under the "CS" department along with
their credit hours.
3. Find the grades of student Smith in the Fall 2008 semester.
o Query: What are the grades of student Smith for courses he
took in Fall 2008?
4. List all the prerequisite courses for course CS3380.
o Query: What are the prerequisite courses required for
CS3380?
5. Retrieve the instructor's name for all sections offered in Fall
2008.
o Query: Which instructors are teaching in Fall 2008, and what
courses are they teaching?
6. Show the list of all students enrolled in the section of the
"Discrete Mathematics" course taught in Fall 2007.
o Query: Which students are enrolled in section 85 of the
MATH2410 course in Fall 2007?
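Two of the informal queries above (numbers 1 and 4) could be written formally as follows. This is a sketch only: it uses SQLite through Python's sqlite3 module, the column names are assumed, and the rows are a small illustrative sample rather than a full copy of Figure 1.2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT, major TEXT);
CREATE TABLE course (course_number TEXT PRIMARY KEY, course_name TEXT,
                     credit_hours INTEGER, department TEXT);
CREATE TABLE prerequisite (course_number TEXT, prerequisite_number TEXT);

-- Illustrative sample rows.
INSERT INTO student VALUES (17, 'Smith', 'CS'), (8, 'Brown', 'CS');
INSERT INTO course VALUES
  ('CS3320', 'Data Structures', 4, 'CS'),
  ('MATH2410', 'Discrete Mathematics', 3, 'MATH'),
  ('CS3380', 'Database', 3, 'CS');
INSERT INTO prerequisite VALUES ('CS3380', 'CS3320'), ('CS3380', 'MATH2410');
""")

# Query 1: names of all students majoring in CS.
cs_majors = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM student WHERE major = ? ORDER BY name", ("CS",)
    )
]

# Query 4: prerequisite courses of CS3380, with their names.
prereqs = conn.execute(
    "SELECT c.course_number, c.course_name "
    "FROM prerequisite p "
    "JOIN course c ON c.course_number = p.prerequisite_number "
    "WHERE p.course_number = ? "
    "ORDER BY c.course_number",
    ("CS3380",),
).fetchall()

print(cs_majors)
print(prereqs)
```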
Example Update Operations:
1. Change the major of student Brown to Mathematics.
o Update: Modify the major of Brown from "CS" to "MATH."
2. Add a new course "Algorithms" (course number CS4400, 4
credit hours) to the Computer Science department.
o Update: Insert a new record in the COURSE table for CS4400
with "Algorithms" as the course name and 4 credit hours
under the CS department.
3. Update the grade of student Brown in the section of "Data
Structures" from B to A.
o Update: Modify Brown's grade in section 102 from B to A in
the GRADE_REPORT table.
4. Remove student Brown from the section of "Database"
(section 135).
o Update: Delete Brown's enrollment (Student_number = 8)
from section 135 in the GRADE_REPORT table.
5. Add a new section for the course "Intro to Computer
Science" (course number CS1310) for Spring 2009 with
instructor Johnson.
o Update: Insert a new section record in the SECTION table with
a new identifier, for course CS1310 in Spring 2009, taught by
Johnson.
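Update operations 1 to 3 above might look like this in practice. The sketch assumes SQLite through Python's sqlite3 module, with assumed column names and only the rows needed for the example. All three changes are grouped into one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT, major TEXT);
CREATE TABLE course (course_number TEXT PRIMARY KEY, course_name TEXT,
                     credit_hours INTEGER, department TEXT);
CREATE TABLE grade_report (student_number INTEGER, section_identifier INTEGER,
                           grade TEXT);
INSERT INTO student VALUES (8, 'Brown', 'CS');
INSERT INTO grade_report VALUES (8, 102, 'B');
""")

with conn:  # commit all three changes together, or none of them
    # 1. Change Brown's major to MATH.
    conn.execute("UPDATE student SET major = ? WHERE name = ?", ("MATH", "Brown"))
    # 2. Add the new course CS4400.
    conn.execute(
        "INSERT INTO course VALUES (?, ?, ?, ?)",
        ("CS4400", "Algorithms", 4, "CS"),
    )
    # 3. Raise Brown's grade in section 102 from B to A.
    cur = conn.execute(
        "UPDATE grade_report SET grade = 'A' "
        "WHERE student_number = 8 AND section_identifier = 102 AND grade = 'B'"
    )
    changed = cur.rowcount  # number of grade rows actually modified

major = conn.execute(
    "SELECT major FROM student WHERE student_number = 8"
).fetchone()[0]
new_course = conn.execute(
    "SELECT course_name, credit_hours FROM course WHERE course_number = 'CS4400'"
).fetchone()
print(major, new_course, changed)
```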
What is the difference between controlled and uncontrolled
redundancy? Illustrate with examples.
- Uncontrolled Redundancy:
Uncontrolled redundancy occurs when the same data is duplicated across
multiple places without proper management. This often leads to
inconsistencies, data anomalies, and wasted storage space.
Changes to one copy of the data may not be reflected in other copies,
causing discrepancies.
Example of Uncontrolled Redundancy: Suppose you have a
university database where student contact details are stored in
multiple places:
o In a student information file (e.g., name, phone number,
address).
o In a financial aid file.
o In an emergency contact file.
If a student changes their address, but the update is only applied to the
student information file and not to the financial aid or emergency contact
files, the database will have inconsistent and outdated information. This
leads to potential errors when contacting the student for financial or
emergency purposes.
- Controlled Redundancy:
Controlled redundancy is intentional and managed within a database
system to improve performance and ensure data consistency. It’s typically
done by having a primary source of truth for data, such as storing data in
one table and referencing it from other tables through foreign keys. The
DBMS ensures consistency between these tables by enforcing integrity
constraints.
Example of Controlled Redundancy: In a relational database,
data normalization is used to reduce redundancy, but some
controlled redundancy can exist for performance reasons. For
instance, consider the following structure in a university database:
o A Student table stores each student's personal information.
o A Grade_Report table stores grades along with a student
number that references the Student table (foreign key).
Here, the student number is duplicated in the Grade_Report table for
quick lookup, but the actual personal details of the student are only stored
once in the Student table. If a student's personal information is updated,
the change is made in one place, ensuring consistency. The redundancy is
controlled through foreign key relationships and ensures integrity
between the tables.
Key Differences:
Uncontrolled Redundancy: Leads to data inconsistency, wasted
space, and update anomalies. Data changes in one place may not
reflect in others.
Controlled Redundancy: Intentional, managed, and helps improve
efficiency without leading to data inconsistency. Referential integrity
is maintained by the DBMS.
Summary:
In a well-designed database, controlled redundancy is often necessary
for efficient querying and maintaining relationships between entities,
while uncontrolled redundancy is typically a sign of poor database
design and can lead to problems like data inconsistencies and anomalies.
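The foreign-key example can be tried out with a short sketch. It assumes SQLite through Python's sqlite3 module; the addresses are made up, and the other rows are illustrative. The student's address is stored once, and the DBMS rejects a grade row whose student number does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign-key enforcement off unless it is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student (
    student_number INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    address TEXT NOT NULL
);
CREATE TABLE grade_report (
    student_number INTEGER NOT NULL REFERENCES student(student_number),
    section_identifier INTEGER NOT NULL,
    grade TEXT
);
INSERT INTO student VALUES (17, 'Smith', '1 Old Road');
INSERT INTO grade_report VALUES (17, 112, 'B');
""")

# The address lives in one place, so a single UPDATE keeps it current
# for every query that reaches it through the student number.
with conn:
    conn.execute(
        "UPDATE student SET address = '2 New Street' WHERE student_number = 17"
    )

# A grade row pointing at a nonexistent student is rejected by the DBMS.
try:
    conn.execute("INSERT INTO grade_report VALUES (99, 112, 'A')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

address = conn.execute(
    "SELECT s.address FROM grade_report g "
    "JOIN student s ON s.student_number = g.student_number "
    "WHERE g.section_identifier = 112"
).fetchone()[0]
print(rejected, address)
```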
Specify all the relationships among the records of the database
shown in Figure 1.2.
The database in Figure 1.2 consists of five entities or tables: STUDENT,
COURSE, SECTION, GRADE_REPORT, and PREREQUISITE. The
relationships among the records in this database can be described as
follows:
1. STUDENT – GRADE_REPORT Relationship:
Type: One-to-Many (1:N)
Description: Each student can enroll in multiple sections (through
Section_identifier in the GRADE_REPORT table), but each entry in
the GRADE_REPORT table corresponds to only one student.
Relationship: The Student_number field in the GRADE_REPORT
table is a foreign key that references the Student_number field in
the STUDENT table.
Explanation: Each student can have multiple grade records (for
different sections) in the GRADE_REPORT table. However, each
grade report is associated with exactly one student.
2. COURSE – SECTION Relationship:
Type: One-to-Many (1:N)
Description: A course can have multiple sections, but each section
belongs to one course.
Relationship: The Course_number field in the SECTION table is a
foreign key that references the Course_number field in the
COURSE table.
Explanation: A course such as "Discrete Mathematics" (MATH2410)
can be offered in different semesters and years, taught by different
instructors. Each section is a unique instance of a course being
taught.
3. SECTION – GRADE_REPORT Relationship:
Type: One-to-Many (1:N)
Description: A section can have multiple students enrolled, but
each grade report entry refers to only one section.
Relationship: The Section_identifier field in the GRADE_REPORT
table is a foreign key that references the Section_identifier field in
the SECTION table.
Explanation: Each section can have many students enrolled in it,
and their grades are stored in the GRADE_REPORT table. However,
each grade record is associated with a single section.
4. COURSE – PREREQUISITE Relationship:
Type: One-to-Many (1:N)
Description: A course can have multiple prerequisites, and each
PREREQUISITE record pairs exactly one course with one prerequisite course.
Relationship: The Course_number field in the PREREQUISITE
table is a foreign key that references the Course_number field in
the COURSE table. Similarly, the Prerequisite_number field also
references the Course_number field in the COURSE table.
Explanation: The PREREQUISITE table records the relationships
between courses and their prerequisites. For example, CS3380
(Database) has two prerequisites: CS3320 (Data Structures)
and MATH2410 (Discrete Mathematics). Each course can have
multiple prerequisite courses.
Summary of Relationships:
1. STUDENT – GRADE_REPORT: One student can have many grade
reports, but each grade report is for one student.
2. COURSE – SECTION: One course can have many sections, but each
section belongs to one course.
3. SECTION – GRADE_REPORT: One section can have many grade
reports (representing students' grades), but each grade report
corresponds to one section.
4. COURSE – PREREQUISITE: One course can have multiple
prerequisite courses, and each PREREQUISITE record links one course to
one of its prerequisites.
These relationships define how the tables are connected and which fields
are used to join their records.
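One way to double-check the relationships listed above is to declare them as foreign keys and ask the DBMS to report them back. The sketch below assumes SQLite through Python's sqlite3 module; the column names are assumed, and foreign_keys is a hypothetical helper written for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course (course_number TEXT PRIMARY KEY, course_name TEXT);
CREATE TABLE section (
    section_identifier INTEGER PRIMARY KEY,
    course_number TEXT REFERENCES course(course_number)
);
CREATE TABLE grade_report (
    student_number INTEGER REFERENCES student(student_number),
    section_identifier INTEGER REFERENCES section(section_identifier),
    grade TEXT
);
CREATE TABLE prerequisite (
    course_number TEXT REFERENCES course(course_number),
    prerequisite_number TEXT REFERENCES course(course_number)
);
""")


def foreign_keys(table):
    """Return sorted (child column, parent table, parent column) triples."""
    # Each PRAGMA row is (id, seq, table, from, to, on_update, on_delete, match).
    rows = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    return sorted((r[3], r[2], r[4]) for r in rows)


links = {t: foreign_keys(t) for t in ("section", "grade_report", "prerequisite")}
for table, keys in links.items():
    for child_col, parent, parent_col in keys:
        print(f"{table}.{child_col} -> {parent}.{parent_col}")
```

Each printed line is the "many" side pointing at the "one" side. PREREQUISITE shows up with two links into COURSE, one for the course and one for its prerequisite.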
Give some additional views that may be needed by other user
groups for the database shown in Figure 1.2.
1. Instructor View:
Purpose: Instructors may need to view only the information relevant to
the courses they are teaching and the students enrolled in their sections.
View:
o Course names, section identifiers, semester, year, and a list of
students enrolled in each section (with their grades).
o Rows are restricted to the sections that the instructor is
teaching.
2. Student View:
Purpose: Students need a view that allows them to see the courses they are enrolled in, the
sections they are taking, and their corresponding grades.
View:
o Course name, section identifier, semester, year, and grade.
o The view will be filtered by the student’s ID (or student number).
View:
o Course name, department, section identifier, semester, year, and instructor name.
Additionally, a count of students enrolled in each section can be provided.
4. Academic Advisor View:
Purpose: Academic advisors need a view of the courses a student has completed, the grades
they've received, and whether they've met the prerequisites for advanced courses.
View:
o Student name, course name, section identifier, semester, year, grade, and
prerequisites for each course.
5. Registrar View:
Purpose: The registrar office needs to maintain records of student enrollments, courses
offered, and the grades students have received. This view would help in generating transcripts
and other academic records.
View:
o Student name, course name, section identifier, semester, year, and final grades.
6. Academic Planner View:
Purpose: Academic planners or curriculum designers need a view to assess course offerings
and prerequisites to ensure that students are meeting the required course sequences.
View:
o Course names, department, prerequisites, and the number of sections offered.
Each of these views tailors the relevant data from the database to specific user needs while
ensuring users only access the data necessary for their roles.
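As an illustration, the student view could be defined once in the database and then filtered by student number. This is a sketch assuming SQLite through Python's sqlite3 module; student_view and transcript are hypothetical names, the column names are assumed, and the rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (course_number TEXT PRIMARY KEY, course_name TEXT);
CREATE TABLE section (section_identifier INTEGER PRIMARY KEY, course_number TEXT,
                      semester TEXT, year TEXT);
CREATE TABLE grade_report (student_number INTEGER, section_identifier INTEGER,
                           grade TEXT);

-- Illustrative sample rows.
INSERT INTO course VALUES ('CS1310', 'Intro to Computer Science'),
                          ('CS3320', 'Data Structures');
INSERT INTO section VALUES (119, 'CS1310', 'Fall', '08'),
                           (102, 'CS3320', 'Spring', '08');
INSERT INTO grade_report VALUES (17, 119, 'C'), (8, 102, 'B');

-- The view joins the three tables once; users query it like a table.
CREATE VIEW student_view AS
SELECT g.student_number, c.course_name, s.section_identifier,
       s.semester, s.year, g.grade
FROM grade_report g
JOIN section s ON s.section_identifier = g.section_identifier
JOIN course c ON c.course_number = s.course_number;
""")


def transcript(student_number):
    """Return the rows of student_view that belong to one student."""
    return conn.execute(
        "SELECT course_name, section_identifier, semester, year, grade "
        "FROM student_view WHERE student_number = ? "
        "ORDER BY section_identifier",
        (student_number,),
    ).fetchall()


smith_rows = transcript(17)
print(smith_rows)
```

The other views would follow the same pattern, with a different column list and a different filter (instructor name, department, and so on).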
Intro to Computer Science   CS     1310   4   CS
Discrete Mathematics        MATH   2410   3   MATH
Database                    CS     3380   3   CS
2. SECTION Table:
o Add a Course_prefix column to store the department prefix
separately from the course code.
o This way, only the prefix needs to be updated when the
department changes.
Restructured SECTION Table:
Section_identifier   Course_prefix   Course_code   Semester   Year   Instructor
92                   CS              1310          Fall       07     Anderson
119                  CS              1310          Fall       08     Anderson
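Splitting an existing course number into the two new columns could be done with a small helper like the one below. This is a sketch only: split_course_number is a hypothetical function, and it assumes every course number is a run of letters followed by a run of digits.

```python
import re


def split_course_number(course_number):
    """Split e.g. 'CS1310' into the pair ('CS', '1310')."""
    match = re.fullmatch(r"([A-Za-z]+)\s*(\d+)", course_number.strip())
    if match is None:
        raise ValueError(f"unexpected course number format: {course_number!r}")
    # The code stays a string so that any leading zeros are preserved.
    return match.group(1).upper(), match.group(2)


for number in ("CS1310", "MATH2410", "CS3380"):
    prefix, code = split_course_number(number)
    print(prefix, code)
```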