3150713_(3)_merged[1]
Seat No.: _____________
MARKS
Q.1 (a) Briefly discuss major coding styles supported by python programming language. 03
(b) What will be the output of the below python code? 04
import numpy as np
a = np.array([[1, 2.5, 3],[4, 5, 6.5]])
print(a.shape)
print(a.ndim)
print(a.itemsize)
print(a.nbytes)
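One way to check the answer is to run the snippet and compare it with the values reasoned out by hand. The sketch below assumes NumPy's usual type promotion, where mixing integers and floats produces a float64 array (8 bytes per element). The names `shape`, `ndim`, `itemsize` and `nbytes` are introduced here only for illustration.

```python
import numpy as np

# Mixing ints and floats makes NumPy promote every element to float64.
a = np.array([[1, 2.5, 3], [4, 5, 6.5]])

shape = a.shape        # (rows, columns)
ndim = a.ndim          # number of axes
itemsize = a.itemsize  # bytes per element
nbytes = a.nbytes      # total bytes = number of elements * itemsize

print(shape)     # (2, 3)
print(ndim)      # 2
print(itemsize)  # 8
print(nbytes)    # 48
```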
(c) Write a python program to take two strings and position value (starting from 07
zero) as input from the user. Insert the second string in the first string after the
given position. (Example: String-1 = "Gujarat University", String-2 = "Technological ",
position = 7, Output = "Gujarat Technological University")
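A possible answer sketch follows. It assumes that "after the given position" means the second string starts one character past the zero-based index, which is the reading that reproduces the example output. The function names `insert_after` and `main` are illustrative.

```python
def insert_after(first: str, second: str, position: int) -> str:
    """Insert `second` into `first` right after the zero-based index `position`."""
    if not 0 <= position < len(first):
        raise ValueError("position must be a valid index of the first string")
    cut = position + 1  # the split falls one character past the given index
    return first[:cut] + second + first[cut:]


def main() -> None:
    """Read the three values from the user and print the combined string."""
    first = input("String-1: ")
    second = input("String-2: ")
    position = int(input("Position (starting from zero): "))
    print(insert_after(first, second, position))


example = insert_after("Gujarat University", "Technological ", 7)
print(example)  # Gujarat Technological University
```

Calling `main()` runs the interactive version that the question asks for.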
Q.2 (a) Discuss characteristics of Set and Dictionary data structures in python. 03
(b) Write a short note on TF-IDF transformations. 04
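To make the idea concrete, here is a minimal sketch of one common textbook variant: term frequency is the count divided by the document length, and inverse document frequency is ln(N / df). Libraries often use smoothed variants, so their numbers may differ. The sample documents and the function name `tf_idf` are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: the number of documents in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


docs = ["data science is fun", "python is popular", "data wrangling with python"]
scores = tf_idf(docs)
for doc, weights in zip(docs, scores):
    print(doc, {term: round(w, 3) for term, w in weights.items()})
```

A term that appears in only one document ("fun") gets a higher weight than a term shared by several documents ("is"), which is the effect the transformation is meant to produce.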
(c) Explain different stages of the data science pipeline. 07
OR
(c) Discuss major libraries available with python language with its application in 07
the field of Data Science.
Q.3 (a) Explain different data structures available in pandas library. 03
(b) Below is the content of a csv file named "data.csv". 04
Name,Salary
abc,10000
xyz,20000
pqr,40000
Write a python program to read data from given csv file and print each pair of
names and salaries in the format "Name -> Salary". (Example: abc -> 10000 and
so on)
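One possible answer, sketched with the standard `csv` module. An in-memory copy of the file content stands in for "data.csv" so that the example runs on its own. The helper name `format_salaries` is illustrative.

```python
import csv
import io


def format_salaries(csv_file):
    """Yield "Name -> Salary" strings from an open CSV file with a header row."""
    reader = csv.DictReader(csv_file)  # the first row supplies the column names
    for row in reader:
        yield f"{row['Name']} -> {row['Salary']}"


# In-memory stand-in for data.csv so the sketch runs without the file.
sample = io.StringIO("Name,Salary\nabc,10000\nxyz,20000\npqr,40000\n")
lines = list(format_salaries(sample))
for line in lines:
    print(line)
```

With the real file, the same helper could be used as `with open("data.csv", newline="") as f: ...`, passing `f` to `format_salaries`.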
(c) Why is it important to find missing values in a dataset? Discuss different 07
approaches to handle missing values with appropriate examples.
OR
Q.3 (a) Write a short note on categorical variables. 03
(b) Consider below dataframe in python 04
  A B C
0 7 1 4
1 2 5 8
2 3 6 9
Write a python program to create boxplot from the above data. What type of
important observation about data can be derived from the boxplot?
OR
Q.4 (a) Explain how to add markers and change line color and line style in a line graph 03
using matplotlib library.
(b) An adjacency matrix for a graph with four nodes named 0, 1, 2, 3 is given below. 04
0 1 2 3
0 0 1 1 0
1 0 0 0 0
2 0 1 0 1
3 1 0 0 0
Write a python program to draw the undirected graph from above matrix using
networkx library.
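The given matrix is not symmetric, so a sketch of the answer has to pick an interpretation. The one below assumes that a non-zero entry in either direction produces a single undirected edge. It only extracts the edge list in plain Python; the function name `undirected_edges` is illustrative.

```python
matrix = [
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 1],
    [1, 0, 0, 0],
]


def undirected_edges(adj):
    """Collect each connected pair once, ignoring direction and self-loops."""
    edges = set()
    for i, row in enumerate(adj):
        for j, value in enumerate(row):
            if value and i != j:
                edges.add((min(i, j), max(i, j)))
    return sorted(edges)


edges = undirected_edges(matrix)
print(edges)  # [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3)]
```

With networkx and matplotlib installed, this list could be loaded into a `networkx.Graph()` through `add_edges_from(edges)` and drawn with `networkx.draw`.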
(c) Below is the data about average temperature (in °C) at a place for 15 days. 07
temp = [30, 34, 35, 32, 38, 26, 29, 45, 42, 32, 40, 33, 36, 34, 36]
Write a python program to create histogram with 4 bins from the above data.
What type of important observation about data can be derived from the
histogram?
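Before plotting, it can help to see what a 4-bin histogram actually counts. The sketch below assumes equal-width bins that span the minimum to the maximum value, with the maximum falling in the last bin, and that the values are not all identical. It prints a text histogram; `histogram_counts` is an illustrative helper, not a library function.

```python
temp = [30, 34, 35, 32, 38, 26, 29, 45, 42, 32, 40, 33, 36, 34, 36]


def histogram_counts(values, bins):
    """Return (bin edges, counts) for equal-width bins from min to max."""
    low, high = min(values), max(values)
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value would land on index `bins`, so clamp it to the last bin.
        index = min(int((v - low) / width), bins - 1)
        counts[index] += 1
    edges = [low + i * width for i in range(bins + 1)]
    return edges, counts


edges, counts = histogram_counts(temp, 4)
for i, count in enumerate(counts):
    print(f"{edges[i]:5.2f} to {edges[i + 1]:5.2f} | {'#' * count}")
```

The counts come out as 3, 6, 4 and 2, so most days fall between about 31 °C and 35 °C, and only two days are in the hottest bin. With matplotlib installed, `plt.hist(temp, bins=4)` would draw the chart itself.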
Q.5 (a) What is the use of %timeit and %%timeit magic functions? 03
(b) Discuss importance of covariance and correlation in EDA. 04
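A small worked example may help when answering this. The sketch below computes the sample covariance (dividing by n - 1) and the Pearson correlation by hand. The data (`hours`, `marks`) are made up for illustration.

```python
import math


def covariance(x, y):
    """Sample covariance: average product of deviations, divided by n - 1."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


def correlation(x, y):
    """Pearson correlation: covariance rescaled to the range [-1, 1]."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


hours = [1, 2, 3, 4, 5]   # hypothetical study hours
marks = [2, 4, 6, 8, 10]  # hypothetical marks, exactly twice the hours

cov = covariance(hours, marks)
corr = correlation(hours, marks)
print(cov, corr)
```

Covariance shows only the direction of the relationship and depends on the units of the variables. Correlation removes the units, so a perfectly linear relationship like this one gives a value of 1.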
(c) What are regression and classification problems? Explain any one with suitable 07
example.
OR
Q.5 (a) Explain different interfaces from Scikit-learn library. 03
(b) Explain the importance of chi-square test in EDA. 04
(c) Write a note on different descriptive statistics measures for numeric data. 07
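As an illustration of the measures such a note would cover, the sketch below uses the standard `statistics` module on a small made-up data set. It uses the population forms of variance and standard deviation; the sample forms would be `statistics.variance` and `statistics.stdev`.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical numeric observations

summary = {
    "mean": statistics.mean(data),              # central tendency
    "median": statistics.median(data),          # middle value
    "mode": statistics.mode(data),              # most frequent value
    "range": max(data) - min(data),             # spread between extremes
    "population_variance": statistics.pvariance(data),
    "population_std_dev": statistics.pstdev(data),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```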
*************
Seat No.: ________ Enrolment No.___________
MARKS
Q.1 (a) Write a single line code to get the value of "type" from the given dictionary in 03
such a way that it does not produce any error or exception even if any key from
the dictionary is misspelled. e.g. batters is misspelled as bateers. Still, your code
must traverse the dictionary and fetch the value “Regular” of the key “type”.
{
"batters": {
"batter": [
{
"batter": [
{
"batter": [{
"type": "Regular"
}]
}]
}]
}
}
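One possible reading of this question is that the lookup must not depend on the intermediate key names at all. The sketch below, offered as an assumption rather than the intended answer, serializes the dictionary with `json.dumps` and pulls out the value with a regular expression in a single line. It assumes that a "type" key with a string value exists somewhere in the structure; if it does not, `re.search` returns `None` and the line fails.

```python
import json
import re

data = {"batters": {"batter": [{"batter": [{"batter": [{"type": "Regular"}]}]}]}}

# Single line: search the serialized text, so outer key names never matter.
value = re.search(r'"type":\s*"([^"]*)"', json.dumps(data)).group(1)
print(value)  # Regular

# The same line still works when an outer key is misspelled.
misspelled = {"bateers": {"batter": [{"batter": [{"batter": [{"type": "Regular"}]}]}]}}
value_misspelled = re.search(r'"type":\s*"([^"]*)"', json.dumps(misspelled)).group(1)
```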
(b) What is the chi-square test? Why is it necessary in data analysis? 04
(c) Explain following string functions with suitable example. 07
len, count, title, lower, upper, find, rfind, replace
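A compact illustration of the listed functions on one made-up string. Strings are immutable, so every method shown returns a new value rather than changing `s`.

```python
s = "data science data"

results = {
    "len": len(s),                             # number of characters
    "count": s.count("data"),                  # non-overlapping occurrences
    "title": s.title(),                        # first letter of each word in upper case
    "lower": "Data Science".lower(),           # all characters in lower case
    "upper": s.upper(),                        # all characters in upper case
    "find": s.find("data"),                    # index of the first match, -1 if absent
    "rfind": s.rfind("data"),                  # index of the last match, -1 if absent
    "replace": s.replace("data", "big data"),  # every occurrence replaced
}

for name, value in results.items():
    print(f"{name}: {value!r}")
```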
Q.2 (a) List and explain the reasons which make python programming popular in Data 03
Science.
(b) Differentiate: Dictionary and List 04
(c) What do you mean by Exploratory Data Analysis? List and explain the tasks which 07
need to be performed in EDA.
OR
(c) Define Standardization. Explain Z-score standardization with suitable example. 07
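A minimal worked example of the formula z = (x - mean) / standard deviation, assuming the population standard deviation is used. The marks are made up and the helper name `z_scores` is illustrative.

```python
import statistics


def z_scores(values):
    """Rescale values to mean 0 and standard deviation 1."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation (assumed)
    return [(v - mean) / std for v in values]


marks = [10, 20, 30, 40, 50]  # hypothetical marks
standardized = z_scores(marks)
print([round(z, 3) for z in standardized])
```

After the transformation the middle value maps to 0 and the remaining values are expressed as a number of standard deviations from the mean, which makes variables measured on different scales comparable.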
Q.5 (a) Provide your views on Data wrangling with suitable example. 03
(b) Define covariance and explain its importance with appropriate example. 04
(c) Elaborate XPath in detail with relatable example. 07
OR
Q.5 (a) Explain Hashing Tricks and its importance with suitable example. 03
(b) Explain importance of Legends, Labels and Annotations in Graphs. 04
(c) Describe sampling along with its types in detail with suitable example. 07
*************
Seat No.: ________ Enrolment No.___________
Marks
(c) Write a python code to draw a bar chart utilizing at least three 07
properties of it.
Q.5 (a) Explain Classification and clustering class of Scikit-learn. 03
(b) Illustrate the concept of regression class of Scikit-learn with 04
the help of small example.
(c) What is the use of the hashing trick and hash functions in Scikit-learn? 07
Explain in detail with an example.
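The core idea can be sketched without any library: each token is hashed to a column index in a fixed-size vector, so no vocabulary has to be stored. The code below is a hand-rolled illustration using MD5 for a stable hash. It is not Scikit-learn's own implementation, which provides this through classes such as `HashingVectorizer` and `FeatureHasher`. The name `hash_features` and the vector size are assumptions.

```python
import hashlib


def hash_features(tokens, n_features=8):
    """Map tokens into a fixed-length count vector using a hash function."""
    vector = [0] * n_features
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).hexdigest()
        index = int(digest, 16) % n_features  # the same token always gets the same slot
        vector[index] += 1
    return vector


first = hash_features("python is fun and python is popular".split())
second = hash_features("python is fun and python is popular".split())
print(first)
```

Different tokens can collide in the same slot. That is the price paid for a fixed memory footprint, and a larger `n_features` makes collisions less likely.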
OR
Q.5 (a) What do you mean by categorical data? Explain with small 03
example.
(b) Explain the use of the skew() and kurtosis() functions. 04
(c) Explain EDA approach in detail. 07
***********
Seat No.: ________ Enrolment No.___________
MARKS
Q.1 (a) List Advantages of Python. 03
(b) Differentiate Numpy and Pandas. 04
(c) Explain Exploratory Data Analysis (EDA). 07
*************
Seat No.: ________ Enrolment No.___________
MARKS
Q.1 (a) Differentiate the list and dictionary data types of python by their 03
characteristics along with example in brief.
(b) What do you mean by slicing operation in string of python? Write an 04
example of slicing to fetch first name and last name from full name
of person and display it.
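A short sketch of one possible answer, assuming the full name contains exactly one space between the first and last name. The sample name is made up.

```python
full_name = "Ada Lovelace"  # hypothetical input

space = full_name.find(" ")        # index of the separating space
first_name = full_name[:space]     # slice from the start up to the space
last_name = full_name[space + 1:]  # slice from just after the space to the end

print("First name:", first_name)
print("Last name:", last_name)
```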
(c) What are the basic activities performed as part of the data science 07
pipeline? Summarize and explain in brief.
Q.2 (a) What are the core competencies needed to become a data scientist? 03
Explain in brief.
(b) Compare and summarize four different coding styles supported by 04
Python language.
(c) Summarize the characteristics of NumPy, Pandas, Scikit-Learn and 07
matplotlib libraries along with their usage in brief.
OR
(c) What do you mean by prototyping? List the phases of prototyping 07
and experimentation process and explain in brief.
Q.3 (a) Compare the numpy and pandas on the basis of their characteristics 03
and usage.
(b) For what purpose is sampling used? Demonstrate random sampling 04
with example.
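Simple random sampling without replacement can be demonstrated with the standard `random` module. The population of 20 record ids and the seed are made up. Fixing the seed only makes the demonstration repeatable.

```python
import random

population = list(range(1, 21))  # hypothetical ids of 20 records

rng = random.Random(42)              # seeded generator for a repeatable demo
sample = rng.sample(population, 5)   # 5 distinct items, each equally likely

print(sample)
```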
(c) What is the need of streaming the data? Explain data uploading and 07
streaming data with example.
OR
Q.3 (a) How is XPath useful for the analysis of HTML data? Explain in brief. 03
(b) Define term n-gram. Explain the TF-IDF techniques. 04
(c) List the techniques to handle missing data. Explain various 07
techniques with example.
Q.4 (a) List various types of graph/chart available in the pyplot of matplotlib 03
library for data visualization. Explain any two of them in brief.
(b) What kind of data is analyzed with the bag-of-words model? Explain it with 04
example.
(c) What do you mean by time series data? How can we plot it? Explain 07
it with an example to plot a trend over time.
OR
Q.4 (a) Compare bar graph, box-plot and histogram with respect to their 03
applicability in data visualization.
(b) Define stemming. Explain the concept of stemming with example. 04
(c) What is the use of scatter-plot in data visualization? Can we draw 07
trendline in scatter-plot? Explain it with example.
Q.5 (a) Define the term Data wrangling. Explain the steps needed to perform 03
data wrangling.
(b) Why do we need to perform Z-score standardization in EDA? Justify it 04
with example.
(c) What is the use of hash functions in EDA? Describe various hashing 07
tricks along with examples.
OR
Q.5 (a) What do you mean by Exploratory Data Analysis (EDA)? How is the t-test 03
useful for EDA?
(b) What do you mean by covariance? What is the importance of 04
covariance in data analysis? Explain it with example.
(c) List different ways of defining descriptive statistics for 07
numeric data. Explain them in brief.
*************
Seat No.: ________ Enrolment No.___________
MARKS
Q.1 (a) What is the role of Python in Data science? 03
(b) Differentiate List and Tuple in Python 04
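The central difference, mutability, can be shown in a few lines. The values are made up for illustration.

```python
marks_list = [70, 85, 90]   # list: square brackets, mutable
marks_tuple = (70, 85, 90)  # tuple: parentheses, immutable

marks_list[0] = 75          # allowed: list items can be reassigned
marks_list.append(60)       # allowed: lists can grow

try:
    marks_tuple[0] = 75     # not allowed: tuples do not support item assignment
    tuple_changed = True
except TypeError:
    tuple_changed = False

# A tuple of hashable items can be used as a dictionary key; a list cannot.
lookup = {marks_tuple: "original marks"}

print(marks_list, marks_tuple, tuple_changed)
```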
(c) Explain the data science pipeline in detail. 07
Seat No.: ________ Enrolment No.___________
MARKS
Q.6 (a) List the type of plots that can be drawn using matplotlib. 03
(b) Write a python program to read data from CSV files using pandas. 04
(c) Explain pie chart plot with appropriate examples. 07
Q.8 (a) Define EDA. List the tasks that need to be carried out in EDA. 03
(b) How can hash functions be useful to solve data science problems? 04
(c) Define the regression problem. How can it be solved using Scikit-learn? 07
*************