AI Fundamental Midterm Quizzes - Jei
Green- Correct
Red- Incorrect
Question 2
Hierarchical clustering is a type of ______________ clustering.
a.
Hybrid
b.
Flat
c.
Deep
d.
Hierarchical
Question 3
Correct
Mark 1.00 out of 1.00
Can the least squares method be used for nonlinear data sets?
a.
It depends on the data set
b.
It depends on the method used to transform the data set
c.
Yes
d.
No
Question 4
Correct
Mark 1.00 out of 1.00
Hierarchical clustering is sensitive to the ______________ of the data.
a.
Variance
b.
All of the above
c.
Scale
d.
Outliers
Question 5
Correct
Mark 1.00 out of 1.00
How is the Hebb rule different from the delta rule?
a.
The Hebb rule uses the input and output to update the weights, while the delta rule uses
the error between the output and target
b.
The Hebb rule uses the output and target to update the weights, while the delta rule uses
the input and output
c.
The Hebb rule uses the error between the output and target to update the weights, while
the delta rule uses the input and output
d.
The Hebb rule uses the input and output to update the weights, while the delta rule uses
the error between the output and target
Question 6
Correct
Mark 1.00 out of 1.00
How does the k-means algorithm determine which data points belong to which cluster?
a.
By evaluating the variance of each cluster
b.
By computing the distance between data points and the centroid of each cluster
c.
By comparing the data point to the characteristics of each cluster
d.
By evaluating the probability that a data point belongs to each cluster
Question 7
Incorrect
Mark 0.00 out of 1.00
How is the final set of clusters determined in the k-means algorithm?
a.
By selecting the set of clusters that maximize the sum of squared errors
b.
By selecting the set of clusters that maximize the within-cluster variance
c.
By selecting the set of clusters that minimize the sum of squared errors
d.
By selecting the set of clusters that minimize the within-cluster variance
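A minimal Python sketch of the two k-means steps that the questions above refer to, assuming Euclidean distance and points stored as tuples. The function names and the toy data are made up for illustration and are not from the course material. Each point is assigned to its nearest centroid, each centroid is then moved to the mean of its points, and the sum of squared errors (SSE) does not increase from one iteration to the next.

```python
import math

def assign_clusters(points, centroids):
    """Assignment step: index of the nearest centroid for each point."""
    return [
        min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
        for p in points
    ]

def update_centroids(points, labels, centroids):
    """Update step: move each centroid to the mean of its assigned points."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:  # an empty cluster keeps its previous centroid
            new_centroids.append(old)
            continue
        dims = len(members[0])
        new_centroids.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(dims))
        )
    return new_centroids

def sum_squared_errors(points, labels, centroids):
    """Within-cluster sum of squared distances to the assigned centroid."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))

# Hypothetical toy data: two obvious groups and two starting centroids.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids = [(0.0, 0.0), (10.0, 10.0)]

labels = assign_clusters(points, centroids)
sse_before = sum_squared_errors(points, labels, centroids)
centroids = update_centroids(points, labels, centroids)
sse_after = sum_squared_errors(points, labels, centroids)
```

In a full run the two steps repeat until the labels stop changing. The result is a local minimum of the SSE, which is why the starting centroids matter.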
Question 8
Correct
Mark 1.00 out of 1.00
How is the Hebb rule used in the training of a neural network?
a.
It is used to determine the structure of the neural network
b.
It is used to determine the input to the neural network
c.
It is used to calculate the output of the neural network
d.
It is used to adjust the weights of the neural network based on the input and output
Question 9
Incorrect
Mark 0.00 out of 1.00
Can the Naive Bayes classifier handle missing or incomplete data?
a.
It can handle incomplete data but not missing data
b.
It can handle missing data but not incomplete data
c.
Yes, it can handle missing or incomplete data
d.
No, it cannot handle missing or incomplete data
Question 10
Incorrect
Mark 0.00 out of 1.00
How is KNIME different from other data analysis tools?
a.
It has a user-friendly interface
b.
It is free
c.
It is open source
d.
It allows users to build custom data pipelines
Question 11
Correct
Mark 1.00 out of 1.00
Hierarchical clustering can be either ______________ or ______________.
a.
Non-parametric, parametric
b.
Unsupervised, supervised
c.
Iterative, recursive
d.
Agglomerative, divisive
Question 12
Correct
Mark 1.00 out of 1.00
How can the sensitivity to the initial placement of centroids be addressed in the k-means
algorithm?
a.
By using a hierarchical clustering approach
b.
By normalizing the data prior to clustering
c.
By using a different clustering algorithm
d.
By using the k-means++ initialization method
Question 13
Correct
Mark 1.00 out of 1.00
How does the Naive Bayes classifier calculate the probability of a data point belonging to a
particular class?
a.
By using the least squares method
b.
By using the Bayes theorem
c.
By using the maximum likelihood estimation
d.
By using the gradient descent algorithm
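A minimal Python sketch of how a categorical Naive Bayes classifier could turn counts into class probabilities, assuming no smoothing and a tiny invented weather-style dataset. The names `train` and `posterior` are hypothetical. The score for each class is the prior times the product of per-feature likelihoods (the "naive" independence assumption), normalised as in Bayes' theorem.

```python
from collections import Counter, defaultdict

def train(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    # feature_counts[(class, feature_index)][value] -> count
    feature_counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(label, i)][value] += 1
    return class_counts, feature_counts

def posterior(row, class_counts, feature_counts):
    """P(c | x) is proportional to P(c) * product over i of P(x_i | c)."""
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total  # prior P(c)
        for i, value in enumerate(row):
            score *= feature_counts[(c, i)][value] / n_c  # likelihood P(x_i | c)
        scores[c] = score
    evidence = sum(scores.values())  # normalising constant P(x)
    return {c: s / evidence for c, s in scores.items()} if evidence else scores

# Hypothetical training data: (outlook, temperature) -> play?
rows = [
    ("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"),
    ("rainy", "mild"), ("rainy", "hot"), ("sunny", "mild"),
]
labels = ["no", "no", "no", "yes", "yes", "yes"]

class_counts, feature_counts = train(rows, labels)
probs = posterior(("rainy", "mild"), class_counts, feature_counts)
```

Without smoothing, a feature value never seen with a class drives that class's score to zero. Practical implementations usually add Laplace smoothing to avoid this.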
Question 14
Correct
Mark 1.00 out of 1.00
How does supervised learning differ from unsupervised learning?
a.
Supervised learning involves predicting a continuous value, while unsupervised learning
involves predicting a categorical value
b.
Supervised learning involves labeled data, while unsupervised learning involves unlabeled
data
c.
Supervised learning involves predicting a value, while unsupervised learning involves
clustering data
d.
Supervised learning involves clustering data, while unsupervised learning involves
predicting a value
Question 15
Correct
Mark 1.00 out of 1.00
How does the least squares method handle outliers in the data set?
a.
It ignores them
b.
It gives them more weight
c.
It gives them less weight
d.
It removes them
Question 16
Correct
Mark 1.00 out of 1.00
How are batch learning algorithms typically used?
a.
To predict continuous values in real-time
b.
To classify data in batch mode
c.
To predict continuous values in batch mode
d.
To classify data in real-time
Question 17
Correct
Mark 1.00 out of 1.00
Can the least squares method be used for multiple linear regression?
a.
Yes
b.
It depends on the data set
c.
It depends on the method used to transform the data set
d.
No
Question 18
Correct
Mark 1.00 out of 1.00
How can the tendency of the k-means algorithm to produce suboptimal results when the
clusters are not spherical be addressed?
a.
By using the k-means++ initialization method
b.
By using a different clustering algorithm
c.
By using a hierarchical clustering approach
d.
By normalizing the data prior to clustering
Question 19
Correct
Mark 1.00 out of 1.00
How is the line of best fit calculated using the least squares method?
a.
By minimizing the variance of the data set
b.
By minimizing the mean of the data set
c.
By minimizing the sum of the squares of the errors between the data points and the line of
best fit
d.
By minimizing the sum of the absolute values of the errors between the data points and the
line of best fit
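A minimal Python sketch of simple least squares for one predictor, assuming plain lists of numbers. The helper names `fit_line` and `sse` and the sample data are invented for illustration. The slope is the centred cross-product sum divided by the centred sum of squares, the intercept is the mean of y minus the slope times the mean of x, and the fitted line has a smaller sum of squared errors than any nearby line.

```python
def fit_line(xs, ys):
    """Closed-form least squares fit of y = slope * x + intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

def sse(xs, ys, slope, intercept):
    """Sum of squared vertical errors between the points and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data that lie close to y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.0]

slope, intercept = fit_line(xs, ys)
best = sse(xs, ys, slope, intercept)
worse_slope = sse(xs, ys, slope + 0.1, intercept)
worse_intercept = sse(xs, ys, slope, intercept + 0.1)
```

When the x and y values are already centred on zero, the slope formula reduces to the sum of x times y divided by the sum of x squared, and the intercept is zero.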
Question 20
Correct
Mark 1.00 out of 1.00
Hierarchical clustering is a type of ______________ technique.
a.
Classification
b.
Clustering
c.
Regression
d.
Dimensionality reduction
Question 2
The KL distance between two discrete probability distributions P and Q is defined as:
a.
The sum of the logarithm of the ratio of the probabilities of each event in P and Q
b.
The sum of the ratio of the probabilities of each event in P and Q
c.
The sum of the differences between the probabilities of each event in P and Q
d.
The sum of the products of the probabilities of each event in P and Q
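A minimal Python sketch of the standard Kullback-Leibler divergence (called "KL distance" in this quiz) for discrete distributions given as lists of probabilities. The function name and the example distributions are made up. In the standard definition each log ratio is weighted by the probability under P, so D(P || Q) = sum of p_i * log(p_i / q_i).

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) = sum_i p_i * log(p_i / q_i), using the natural log.

    Terms with p_i == 0 contribute 0. This sketch assumes q_i > 0
    wherever p_i > 0; otherwise the divergence is infinite.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical distributions over two events.
p = [0.5, 0.5]
q = [0.9, 0.1]

d_pq = kl_divergence(p, q)  # information lost when Q approximates P
d_qp = kl_divergence(q, p)  # generally a different number
d_pp = kl_divergence(p, p)  # identical distributions
```

The three values illustrate the properties asked about in this quiz: the divergence is non-negative, it is zero when the two distributions are identical, and it is not symmetric, which is why it is not a true distance metric.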
Question 3
Correct
Mark 1.00 out of 1.00
The KL distance is often used in machine learning and artificial intelligence to compare two
probability distributions, such as a model's predicted distribution and the true distribution.
In this context, the KL distance can be used as a:
a.
Activation function
b.
Cost function
c.
Loss function
d.
Kernel function
Question 4
Correct
Mark 1.00 out of 1.00
In information theory, the KL distance can be used to measure the information lost when
approximating one distribution with another. Which of the following is NOT a property of the
KL distance in this context?
a.
It is zero only when the two distributions are identical
b.
It is always positive
c.
It is non-negative
d.
It is non-symmetric
Question 5
Correct
Mark 1.00 out of 1.00
The KL distance is also known as what other measure?
a.
Shannon entropy
b.
Joint entropy
c.
Cross-entropy
d.
Mutual information
Question 6
Correct
Mark 1.00 out of 1.00
The KL distance can be used to measure the information lost when approximating one
distribution with another. In this context, the distribution being approximated is known as
the:
a.
Approximation distribution
b.
Target distribution
c.
Reference distribution
d.
Base distribution
Question 7
Correct
Mark 1.00 out of 1.00
The KL distance is often used in machine learning to evaluate the performance of a
classification model. In this context, a low KL distance indicates that the model's predicted
class probabilities are:
a.
Very different from the true class probabilities
b.
Somewhat similar to the true class probabilities
c.
Very similar to the true class probabilities
d.
Somewhat different from the true class probabilities
Question 8
Correct
Mark 1.00 out of 1.00
The ______________ linkage criterion is a popular choice for hierarchical clustering, which
merges the two clusters based on the distance between their centroids.
a.
Average
b.
Complete
c.
Centroid
d.
Single
Question 9
Correct
Mark 1.00 out of 1.00
In hierarchical clustering, the distance between clusters is typically measured using the
______________ criterion.
a.
Linkage criterion
b.
Euclidean distance
c.
Cosine similarity
d.
Manhattan distance
Question 10
Correct
Mark 1.00 out of 1.00
The KL distance can be used to measure the difference between two probability
distributions in terms of the information content of the distributions. In this context, the KL
distance is also known as:
a.
The information ratio
b.
The information distance
c.
The information gain
d.
The information divergence
Question 11
Correct
Mark 1.00 out of 1.00
The ______________ linkage criterion is a popular choice for hierarchical clustering, which
merges the two clusters based on the mean distance between their points.
a.
Single
b.
Complete
c.
Average
d.
Centroid
Question 12
Correct
Mark 1.00 out of 1.00
The ______________ linkage criterion is a popular choice for hierarchical clustering, which
merges the two clusters that have the minimum distance between them.
a.
Average
b.
Single
c.
Complete
d.
Centroid
Question 13
Correct
Mark 1.00 out of 1.00
What are some advantages of batch learning algorithms?
a.
They can learn from a small amount of data
b.
They can learn from streaming data in real-time
c.
They can learn from very large datasets
d.
They can learn from a limited amount of resources
Question 14
Correct
Mark 1.00 out of 1.00
How is the y-intercept of the line of best fit calculated using the least squares method?
a.
By dividing the mean of the y values by the slope
b.
By dividing the sum of the y values by the number of data points
c.
By subtracting the slope from the mean of the y values
d.
By dividing the sum of the product of the x values and the y values by the sum of the x
values
Question 15
Correct
Mark 1.00 out of 1.00
The KL distance is always non-negative and is equal to zero only when the two probability
distributions are:
a.
Uniformly distributed
b.
Identically distributed
c.
Independently distributed
d.
Mutually exclusive
Question 16
Correct
Mark 1.00 out of 1.00
In hierarchical clustering, the final clusters are represented using a ______________
diagram.
a.
Dendrogram
b.
Bar chart
c.
Scatter plot
d.
Line graph
Question 17
Correct
Mark 1.00 out of 1.00
What are some disadvantages of batch learning algorithms?
a.
They are slow to adapt to changes in the data
b.
They require a large amount of resources
c.
They require a small amount of data
d.
They are prone to overfitting
Question 18
Correct
Mark 1.00 out of 1.00
The ______________ linkage criterion is a popular choice for hierarchical clustering, which
merges the two clusters that have the maximum distance between them.
a.
Average
b.
Complete
c.
Centroid
d.
Single
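A minimal Python sketch comparing the four linkage criteria named in the questions above, assuming Euclidean distance and clusters stored as lists of coordinate tuples. All function names and the two toy clusters are invented for illustration. Each criterion defines the distance between two clusters differently; agglomerative clustering then merges the pair of clusters whose linkage distance is smallest.

```python
import math
from itertools import product

def pairwise(a, b):
    """All point-to-point distances between cluster a and cluster b."""
    return [math.dist(p, q) for p, q in product(a, b)]

def single_linkage(a, b):
    """Distance between the closest pair of points."""
    return min(pairwise(a, b))

def complete_linkage(a, b):
    """Distance between the farthest pair of points."""
    return max(pairwise(a, b))

def average_linkage(a, b):
    """Mean of all pairwise distances."""
    distances = pairwise(a, b)
    return sum(distances) / len(distances)

def centroid(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    dims = len(cluster[0])
    return tuple(sum(p[d] for p in cluster) / len(cluster) for d in range(dims))

def centroid_linkage(a, b):
    """Distance between the two cluster centroids."""
    return math.dist(centroid(a), centroid(b))

# Hypothetical clusters on a line, so the distances are easy to check by hand.
a = [(0.0, 0.0), (1.0, 0.0)]
b = [(3.0, 0.0), (5.0, 0.0)]

results = {
    "single": single_linkage(a, b),
    "complete": complete_linkage(a, b),
    "average": average_linkage(a, b),
    "centroid": centroid_linkage(a, b),
}
```

Single linkage tends to chain elongated clusters together, while complete linkage favours compact ones. Average and centroid linkage sit between the two.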
Question 19
Correct
Mark 1.00 out of 1.00
How is the slope of the line of best fit calculated using the least squares method?
a.
By dividing the sum of the y values by the sum of the x values
b.
By dividing the sum of the product of the x values and the y values by the sum of the x
values
c.
By dividing the sum of the product of the x values and the y values by the sum of the
squares of the x values
d.
By dividing the sum of the y values by the sum of the squares of the x values
Question 20
Correct
Mark 1.00 out of 1.00
The KL distance is often used in natural language processing to compare the distribution of
words in a document with the distribution of words in a reference corpus. In this context, a
low KL distance indicates that the document is:
a.
Somewhat similar to the reference corpus
b.
Somewhat different from the reference corpus
c.
Very similar to the reference corpus
d.
Very different from the reference corpus
Question 2
What is an example of a batch learning algorithm used for feature selection tasks?
a.
Mutual information
b.
Recursive feature elimination
c.
All of the above
d.
Variance threshold
Question 3
Correct
Mark 1.00 out of 1.00
What is a key characteristic of Bayesian networks?
a.
They are based on probability theory
b.
They use linear algebra for prediction
c.
They are trained on large amounts of data
d.
They use decision trees for prediction
Question 4
Correct
Mark 1.00 out of 1.00
What is a node in a Bayesian network?
a.
A probabilistic relationship between two variables
b.
A variable in the system being modeled
c.
All of the above
d.
A point in the network where two or more edges meet
Question 5
Correct
Mark 1.00 out of 1.00
What is an example of a batch learning algorithm used for classification tasks?
a.
Support vector machine
b.
Decision tree
c.
K-nearest neighbors
d.
Linear regression
Question 6
Correct
Mark 1.00 out of 1.00
What is an example of a real-world application of directed acyclic graphs (DAGs)?
a.
Computer networks
b.
Social media networks
c.
Data pipelines
d.
All of the above
Question 7
Correct
Mark 1.00 out of 1.00
What is KNIME used for?
a.
Data mining
b.
All of the above
c.
Data analysis
d.
Data visualization
Question 8
Correct
Mark 1.00 out of 1.00
What is an example of a batch learning algorithm used for clustering tasks?
a.
All of the above
b.
K-means
c.
Agglomerative clustering
d.
DBSCAN
Question 9
Correct
Mark 1.00 out of 1.00
What is an example of a latent variable?
a.
The output of a model
b.
The features of a model
c.
A hidden or unobserved variable that affects the observed variables
d.
The weights of a model
Question 10
Correct
Mark 1.00 out of 1.00
What is a parent node in a Bayesian network?
a.
A node that is a direct descendant of another node in the network
b.
A node that has no parents or children in the network
c.
A node that is a direct ancestor of another node in the network
d.
None of the above
Question 11
Correct
Mark 1.00 out of 1.00
What is an example of a batch learning algorithm used for dimensionality reduction tasks?
a.
t-SNE
b.
Principal component analysis
c.
Multidimensional scaling
d.
All of the above
Question 12
Correct
Mark 1.00 out of 1.00
What is an example of a classification task in supervised learning?
a.
Predicting the stock price for the next day based on historical data
b.
Determining whether an email is spam or not
c.
Predicting the price of a house based on its characteristics
d.
Grouping customers into different segments based on their spending habits
Question 13
Correct
Mark 1.00 out of 1.00
What is a perceptron?
a.
A type of deep learning neural network
b.
A type of unsupervised learning algorithm
c.
A type of machine learning algorithm for classification tasks
d.
A type of artificial neuron that can be trained to recognize patterns
Question 14
Correct
Mark 1.00 out of 1.00
What is a Bayesian network used for?
a.
To optimize the use of resources
b.
All of the above
c.
To perform machine learning tasks
d.
To model and predict the behavior of systems
Question 15
Correct
Mark 1.00 out of 1.00
What is an example of a batch learning algorithm used for regression tasks?
a.
Linear regression
b.
Decision tree
c.
K-nearest neighbors
d.
Support vector machine
Question 16
Correct
Mark 1.00 out of 1.00
What is a batch learning algorithm?
a.
An algorithm that processes the training data in small groups or batches
b.
An algorithm that processes the training data one example at a time
c.
An algorithm that processes all of the training data at once
d.
An algorithm that processes the training data in real-time
Question 17
Correct
Mark 1.00 out of 1.00
What is an example of a batch learning algorithm?
a.
K-nearest neighbors
b.
All of the above
c.
Linear regression
d.
Support vector machine
Question 18
Correct
Mark 1.00 out of 1.00
What is a directed acyclic graph (DAG)?
a.
A graph in which the edges have a direction and there are cycles
b.
A graph in which the edges have a direction and there are no cycles
c.
A graph in which the edges do not have a direction and there are cycles
d.
A graph in which the edges do not have a direction and there are no cycles
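A minimal Python sketch of checking whether a set of directed edges forms a DAG, using the standard-library `graphlib` module (Python 3.9 or later). The function name `topological_order` and the example edges are made up for illustration. A directed graph is acyclic exactly when it has a topological ordering, so the sorter either returns an ordering or reports a cycle.

```python
from graphlib import CycleError, TopologicalSorter

def topological_order(edges):
    """Return the nodes in dependency order, or None if the graph has a cycle.

    edges is an iterable of (source, target) pairs, read as "source -> target".
    """
    sorter = TopologicalSorter()
    for source, target in edges:
        sorter.add(target, source)  # target depends on source
    try:
        return list(sorter.static_order())
    except CycleError:
        return None

# Hypothetical pipeline: load -> clean -> train.
pipeline = [("load", "clean"), ("clean", "train")]
order = topological_order(pipeline)

# Adding an edge back to the start creates a cycle, so this is no longer a DAG.
cyclic = pipeline + [("train", "load")]
cyclic_order = topological_order(cyclic)
```

The same property is what lets a Bayesian network factor its joint distribution node by node: every node can be processed after all of its parents.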
Question 19
Correct
Mark 1.00 out of 1.00
What is an edge in a Bayesian network?
a.
A probabilistic relationship between two variables
b.
A variable in the system being modeled
c.
A point in the network where two or more nodes meet
d.
None of the above
Question 20
Correct
Mark 1.00 out of 1.00
What is an example of a regression task in supervised learning?
a.
Grouping customers into different segments based on their spending habits
b.
Predicting the price of a house based on its characteristics
c.
Determining whether an email is spam or not
d.
Predicting the stock price for the next day based on historical data
Question 1
What is supervised learning used for?
a.
Both classification and regression tasks
b.
Regression tasks
c.
Classification tasks
d.
Unsupervised learning tasks
Question 2
Correct
Mark 1.00 out of 1.00
What is the disadvantage of the Naive Bayes classifier?
a.
It is unable to handle large amounts of data
b.
It is slower to train and predict
c.
It is inflexible
d.
It is less accurate
Question 3
Correct
Mark 1.00 out of 1.00
What is the advantage of the Naive Bayes classifier over other classifiers?
a.
It is faster to train and predict
b.
It is more flexible
c.
It is able to handle large amounts of data
d.
It is more accurate
Question 4
Correct
Mark 1.00 out of 1.00
What is the "M" step in the EM algorithm?
a.
The step where the expectation of the latent variables is calculated
b.
The step where the model parameters are updated
c.
The step where the likelihood of the model is maximized
d.
The step where the prediction accuracy of the model is calculated
Question 5
Correct
Mark 1.00 out of 1.00
What is the EM algorithm used to optimize in the "M" step?
a.
The prediction accuracy of the model
b.
The latent variables
c.
The model parameters
d.
The likelihood of the model
Question 6
Correct
Mark 1.00 out of 1.00
What is the advantage of using the Gaussian Naive Bayes classifier over other types of
Naive Bayes classifiers?
a.
It is more accurate
b.
It is faster to train and predict
c.
It is able to handle continuous features
d.
It is able to handle categorical features
Question 7
Correct
Mark 1.00 out of 1.00
What is the main disadvantage of the Hebb rule?
a.
It is slow to converge
b.
It is unable to handle large datasets
c.
It is prone to overfitting
d.
It is unable to handle nonlinear relationships
Question 8
Correct
Mark 1.00 out of 1.00
What is the main advantage of using a directed acyclic graph (DAG) over other types of
graphs?
a.
DAGs are easier to understand and visualize
b.
DAGs can represent more complex relationships between data
c.
DAGs are more efficient for storing and processing data
d.
All of the above
Question 9
Correct
Mark 1.00 out of 1.00
What is the "E" step in the EM algorithm?
a.
The step where the model parameters are updated
b.
The step where the prediction accuracy of the model is calculated
c.
The step where the expectation of the latent variables is calculated
d.
The step where the likelihood of the model is maximized
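A minimal Python sketch of the E and M steps for a one-dimensional Gaussian mixture with two components, assuming hand-picked starting parameters and a small invented dataset. The function names are hypothetical. The E step computes each component's responsibility for each point (the expected value of the latent assignment), and the M step re-estimates the weights, means, and variances from those responsibilities. The log-likelihood does not decrease from one iteration to the next.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def e_step(data, weights, means, variances):
    """Expectation: responsibility of each component for each data point."""
    resp = []
    for x in data:
        joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        resp.append([j / total for j in joint])
    return resp

def m_step(data, resp):
    """Maximisation: update the parameters using the responsibilities as soft counts."""
    weights, means, variances = [], [], []
    for j in range(len(resp[0])):
        n_j = sum(r[j] for r in resp)
        mean = sum(r[j] * x for r, x in zip(resp, data)) / n_j
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / n_j
        weights.append(n_j / len(data))
        means.append(mean)
        variances.append(max(var, 1e-6))  # floor guards against a collapsed component
    return weights, means, variances

def log_likelihood(data, weights, means, variances):
    """Log-likelihood of the data under the current mixture parameters."""
    return sum(
        math.log(sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)))
        for x in data
    )

# Hypothetical data drawn around 1 and around 5, with rough starting guesses.
data = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
params = ([0.5, 0.5], [0.0, 6.0], [1.0, 1.0])

history = [log_likelihood(data, *params)]
for _ in range(20):
    resp = e_step(data, *params)
    params = m_step(data, resp)
    history.append(log_likelihood(data, *params))

weights, means, variances = params
```

Like k-means, EM only reaches a local optimum, so different starting parameters can lead to different fits.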
Question 10
Correct
Mark 1.00 out of 1.00
What is the equation for the Hebb rule?
a.
w(new) = w(old) + η(target - output)x(input)
b.
w(new) = w(old) + η(input - output)x(target)
c.
w(new) = w(old) + η(output - target)x(input)
d.
w(new) = w(old) + η(output)x(input)
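A minimal Python sketch contrasting the Hebb rule with the delta rule for a single neuron, assuming a learning rate of 0.1 and weights stored as a plain list. The function names and sample values are made up for illustration. The Hebb update uses only the input and the output, while the delta update is driven by the error between the target and the output.

```python
def hebb_update(weights, inputs, output, lr=0.1):
    """Hebb rule: w_new = w_old + lr * output * input (no target involved)."""
    return [w + lr * output * x for w, x in zip(weights, inputs)]

def delta_update(weights, inputs, target, output, lr=0.1):
    """Delta rule: w_new = w_old + lr * (target - output) * input."""
    return [w + lr * (target - output) * x for w, x in zip(weights, inputs)]

# Hypothetical single training example.
weights = [0.0, 0.0]
inputs = [1.0, -1.0]

hebb = hebb_update(weights, inputs, output=1.0)
delta = delta_update(weights, inputs, target=1.0, output=0.0)

# Once the output matches the target, the delta rule stops changing the weights.
settled = delta_update(hebb, inputs, target=1.0, output=1.0)
```

The Hebb update keeps growing the weights whenever input and output are active together, which is one reason error-driven rules such as the delta rule are preferred for supervised training.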
Question 11
Correct
Mark 1.00 out of 1.00
What is the Hebb rule?
a.
A rule used to calculate the output of a neural network
b.
A rule used to determine the input to a neural network
c.
A rule used to adjust the weights in a neural network
d.
A rule used to determine the structure of a neural network
Question 12
Correct
Mark 1.00 out of 1.00
What is the learning rule for a perceptron called?
a.
The Hebbian Rule
b.
The Delta Rule
c.
The Backpropagation Algorithm
d.
The Perceptron Learning Algorithm
Question 13
Correct
Mark 1.00 out of 1.00
What is the Kullback-Leibler (KL) distance used for?
a.
To measure the uncertainty of a probability distribution
b.
To measure the dissimilarity between two probability distributions
c.
To measure the predictability of a probability distribution
d.
To measure the similarity between two probability distributions
Question 14
Correct
Mark 1.00 out of 1.00
What is the Naive Bayes classifier used for?
a.
To predict the probability of an event occurring
b.
To classify data into different categories based on certain features
c.
All of the above
d.
To predict the value of a continuous variable
Question 15
Correct
Mark 1.00 out of 1.00
What is the least squares method used for?
a.
To find the line of best fit for a set of data
b.
To calculate the variance of a data set
c.
To solve systems of linear equations
d.
To calculate the mean of a data set
Question 16
Correct
Mark 1.00 out of 1.00
What is the EM algorithm used to estimate in the "E" step?
a.
The likelihood of the model
b.
The latent variables
c.
The model parameters
d.
The prediction accuracy of the model
Question 17
Correct
Mark 1.00 out of 1.00
What is the main goal of the EM algorithm?
a.
To minimize the cost or loss function of a model
b.
To minimize the error between the predicted and actual values of the data
c.
To maximize the prediction accuracy of the model
d.
To maximize the likelihood of a model given the data
Question 18
Correct
Mark 1.00 out of 1.00
What is the main advantage of the Hebb rule?
a.
It is easy to implement
b.
It is able to handle large datasets
c.
It is fast to converge
d.
It is able to handle nonlinear relationships
Question 19
Correct
Mark 1.00 out of 1.00
What is the assumption made by the Naive Bayes classifier?
a.
That the features in the data are normally distributed
b.
That the features in the data are uniformly distributed
c.
That the features in the data are dependent on each other
d.
That the features in the data are independent of each other
Question 20
Correct
Mark 1.00 out of 1.00
What is the EM algorithm used for?
a.
Regression
b.
All of the above
c.
Classification
d.
Clustering