Spark Streaming - Malay

This document contains multiple-choice questions about Spark Streaming concepts and features. It covers topics such as:
- What DStreams internally represent (a sequence of RDDs arriving at discrete time intervals)
- Common data sources for Spark Streaming, such as Kafka, Flume, and Twitter
- The role of the receiver in dividing streams into blocks, and the Block Management Master tracking block IDs
- Key configurations such as the batch interval, and how they relate to windows and sliding intervals
- That DStreams are immutable, like RDDs, and represent a continuous stream of data
- Basic and advanced data sources, and common transformations and actions such as reduceByKey

Uploaded by Mahesh VP
© All Rights Reserved

Which among the following can act as a data source for Spark Streaming? --- All the options
DStreams are internally a collection of _______. --- RDDs
HDFS cannot be a sink for Spark Streaming. --- False
We cannot configure Twitter as a data source for Spark Streaming. --- False
DStreams can be created from an existing DStream. --- True
DStreams cannot be created directly from sources such as Kafka and Flume. --- False
Internally, a DStream is represented as a sequence of _____ arriving at discrete time intervals. --- RDDs
DStreams are internally a collection of --- RDDs
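The answers above can be pictured in plain Python. This is an illustrative sketch, not the Spark API: each "RDD" is modeled as an immutable tuple of the records that arrived during one batch interval, and the DStream is simply the time-ordered sequence of those tuples. The function name `to_dstream` is invented for this sketch.

```python
from itertools import groupby

def to_dstream(records, batch_interval):
    """Sketch of a DStream: bucket (arrival_time, value) records into one
    immutable tuple ("RDD") per batch interval, in arrival order."""
    key = lambda r: r[0] // batch_interval          # which interval a record falls in
    ordered = sorted(records, key=key)
    return [tuple(v for _, v in grp) for _, grp in groupby(ordered, key=key)]

events = [(0.2, "a"), (0.7, "b"), (1.1, "c"), (2.5, "d")]
print(to_dstream(events, batch_interval=1.0))  # [('a', 'b'), ('c',), ('d',)]
```

Using tuples (rather than lists) mirrors the immutability the quiz points out: like RDDs, batches are never modified in place; transformations produce new ones.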
The receiver divides the stream into blocks and keeps them in memory. --- True
Block Management units in the worker nodes report to ____. --- Block Management Master in the Driver
Block Management Master keeps track of ___. --- Block IDs
Starting point of a streaming application is _______. --- ssc.start()

What is a Window Duration/Size? --- The interval over which a fold operation is performed on top of DStreams
Sliding Interval is the interval at which sliding of the window occurs. --- True
Which among the following needs to be a multiple of the batch interval? --- Window duration
There can be multiple DStreams in a single window. --- True
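The window questions above fit together as follows, sketched in plain Python (not the Spark API; `windowed_sums` is an invented name): a window covers the last `window_duration / batch_interval` batches and advances every `sliding_interval / batch_interval` batches, which is exactly why both durations must be multiples of the batch interval.

```python
def windowed_sums(batches, batch_interval, window_duration, sliding_interval):
    """Sketch of a windowed fold: sum everything inside each window position."""
    assert window_duration % batch_interval == 0    # must be a multiple
    assert sliding_interval % batch_interval == 0   # must be a multiple
    w = window_duration // batch_interval           # batches per window
    s = sliding_interval // batch_interval          # batches per slide
    results = []
    for end in range(w, len(batches) + 1, s):
        window = batches[end - w:end]               # several batches in one window
        results.append(sum(sum(b) for b in window))
    return results

batches = [[1], [2, 3], [4], [5]]                   # one list per 1-second batch
print(windowed_sums(batches, 1, 2, 1))              # [6, 9, 9]
```

With a window of 2 batches sliding every 1 batch, consecutive windows overlap by one batch, which is why each window can contain data from multiple DStream batches at once.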

With Spark Streaming, the incoming data is split into micro-batches. --- True
Which among the following is true about Spark Streaming? --- All the options
Who is responsible for keeping track of the Block IDs? --- Block Management Master in the Driver
Data sources for Spark Streaming that come under the 'Advanced sources' category include --- All the options
Batch interval is configured at --- creation of the Spark Streaming Context
For every batch interval, the Driver launches tasks to process a block. --- True
What is the programming abstraction in Spark Streaming? --- DStreams
What is a batch interval? --- The interval at which a DStream is processed
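The receiver/driver split described above can be sketched in plain Python. This is a loose analogy, not Spark code (the `Receiver` class and `run_batch` function are invented for illustration): the receiver chops incoming records into fixed-size blocks held in memory, and once per batch interval the driver "launches a task" over the blocks received so far.

```python
from collections import deque

class Receiver:
    """Sketch of a receiver: buffers records and emits fixed-size blocks."""
    def __init__(self, block_size):
        self.block_size = block_size
        self.blocks = deque()                        # blocks kept in memory
        self._buffer = []

    def on_record(self, record):
        self._buffer.append(record)
        if len(self._buffer) == self.block_size:
            self.blocks.append(tuple(self._buffer))  # seal the block
            self._buffer = []

def run_batch(receiver):
    """Driver side: process every block received during the last batch interval."""
    processed = []
    while receiver.blocks:
        block = receiver.blocks.popleft()
        processed.extend(w.upper() for w in block)   # the "task" on each block
    return processed

r = Receiver(block_size=2)
for word in ["spark", "streaming", "micro", "batch"]:
    r.on_record(word)
print(run_batch(r))  # ['SPARK', 'STREAMING', 'MICRO', 'BATCH']
```

In real Spark Streaming the batch interval is fixed when the StreamingContext is created (e.g. `StreamingContext(sc, 1)` for one-second batches), and data only starts flowing after `ssc.start()`.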
DStreams are internally --- a collection of RDDs
What is a Sliding Interval? --- The interval at which sliding of the window occurs
Which among the following is true about window operations? --- Window duration should be a multiple of the batch interval
DStreams are immutable. Choose the right option. --- Yes; like RDDs, DStreams are immutable
We specify ___________ when we create the streaming context. --- batch interval
DStreams are --- a collection of RDDs
Which among the following are Basic Sources of Spark Streaming? --- File streams and socket connections (Kafka, Flume, and Twitter are advanced sources)
reduceByKey is a --- Transformation
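In Spark, reduceByKey is a transformation: it produces a new RDD/DStream of merged key-value pairs rather than returning a value to the driver (which is what an action such as count() does). Its per-key merge semantics can be sketched in plain Python (`reduce_by_key` is an invented stand-in, not the Spark API):

```python
from functools import reduce
from collections import defaultdict

def reduce_by_key(pairs, f):
    """Sketch of reduceByKey: group values by key, then fold each group with f."""
    grouped = defaultdict(list)
    for k, v in pairs:
        grouped[k].append(v)
    return {k: reduce(f, vs) for k, vs in grouped.items()}

pairs = [("a", 1), ("b", 2), ("a", 3)]
print(reduce_by_key(pairs, lambda x, y: x + y))  # {'a': 4, 'b': 2}
```

Because the merge function is applied pairwise, it should be associative (and in Spark, commutative) so the result does not depend on how values are grouped across partitions.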
The receiver receives data from the streaming sources at the start of the _________. --- Streaming Context
DStream represents a continuous stream of data. --- True
Spark Streaming has two categories of sources: basic sources and advanced sources. --- True
