Gujarat Technological University
Seat No. _____________
MARKS
Q.1 (a) Briefly discuss the major coding styles supported by the Python programming language. 03
(b) What will be the output of the following Python code? 04
import numpy as np
a = np.array([[1, 2.5, 3],[4, 5, 6.5]])
print(a.shape)
print(a.ndim)
print(a.itemsize)
print(a.nbytes)
(c) Write a Python program to take two strings and a position value (starting from 07
zero) as input from the user. Insert the second string into the first string after the
given position. (Example: String-1 = "Gujarat University", String-2 = "Technological ",
position = 7, Output = "Gujarat Technological University")
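One possible approach to Q.1 (c) is sketched below. It assumes that "after the given position" means after the character at that zero-based index, which reproduces the example in the question. The function name `insert_after` is illustrative and not part of the question.

```python
def insert_after(first: str, second: str, position: int) -> str:
    """Return `first` with `second` inserted after the character at `position`."""
    # Assumption: position is a zero-based index into the first string.
    if not 0 <= position < len(first):
        raise ValueError("position must be a valid index into the first string")
    cut = position + 1  # slice point just past the character at `position`
    return first[:cut] + second + first[cut:]


# Values taken from the example in the question.
result = insert_after("Gujarat University", "Technological ", 7)
print(result)
```

In an interactive program, the two strings could be read with `input()` and the position converted with `int()` before calling the function.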
Q.2 (a) Discuss the characteristics of the Set and Dictionary data structures in Python. 03
(b) Write a short note on TF-IDF transformations. 04
(c) Explain different stages of the data science pipeline. 07
OR
(c) Discuss the major libraries available in Python and their applications in 07
the field of Data Science.
Q.3 (a) Explain the different data structures available in the pandas library. 03
(b) Below is the content of a CSV file named "data.csv". 04
Name,Salary
abc,10000
xyz,20000
pqr,40000
Write a Python program to read data from the given CSV file and print each pair of
names and salaries in the format "Name -> Salary". (Example: abc -> 10000 and
so on)
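A minimal sketch of one way to answer Q.3 (b), using the standard `csv` module. To keep the example self-contained, the file content from the question is held in a string and wrapped in `io.StringIO`. That stand-in and the helper name `name_salary_lines` are assumptions made for illustration.

```python
import csv
import io

# Stand-in for the content of "data.csv" as given in the question.
SAMPLE = "Name,Salary\nabc,10000\nxyz,20000\npqr,40000\n"


def name_salary_lines(file_obj):
    """Return one 'Name -> Salary' string per data row of the CSV."""
    reader = csv.DictReader(file_obj)  # first row supplies the column names
    return [f"{row['Name']} -> {row['Salary']}" for row in reader]


lines = name_salary_lines(io.StringIO(SAMPLE))
for line in lines:
    print(line)
```

With the actual file, the same helper could be called inside `with open("data.csv", newline="") as f:` by passing `f` in place of the `StringIO` object.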
(c) Why is it important to find missing values in a dataset? Discuss different 07
approaches to handle missing values with appropriate examples.
OR
Q.3 (a) Write a short note on categorical variables. 03
(b) Consider the following DataFrame in Python. 04
  A B C
0 7 1 4
1 2 5 8
2 3 6 9
Write a Python program to create a boxplot from the above data. What important
observations about the data can be derived from the boxplot?
OR
Q.4 (a) Explain how to add markers and change the line color and line style in a line graph 03
using the matplotlib library.
(b) An adjacency matrix for a graph with four nodes named 0, 1, 2, 3 is given below. 04
  0 1 2 3
0 0 1 1 0
1 0 0 0 0
2 0 1 0 1
3 1 0 0 0
Write a Python program to draw the undirected graph from the above matrix using
the networkx library.
(c) Below is the data about the average temperature (in °C) at a place for 15 days. 07
temp = [30, 34, 35, 32, 38, 26, 29, 45, 42, 32, 40, 33, 36, 34, 36]
Write a Python program to create a histogram with 4 bins from the above data.
What important observations about the data can be derived from the
histogram?
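As a plotting-free sketch for Q.4 (c), the snippet below computes the counts that a 4-bin histogram of the data would display. It assumes equal-width bins spanning the minimum to the maximum value, with the maximum falling in the last bin. The helper name `bin_counts` is hypothetical.

```python
temp = [30, 34, 35, 32, 38, 26, 29, 45, 42, 32, 40, 33, 36, 34, 36]


def bin_counts(values, bins):
    """Count values in `bins` equal-width bins between min and max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value lands on the last edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts


edges, counts = bin_counts(temp, 4)
# Text rendering of the histogram: one row of '#' marks per bin.
for i, c in enumerate(counts):
    print(f"{edges[i]:.2f} to {edges[i + 1]:.2f}: {'#' * c}")
```

To draw the actual figure, the same list could be passed to matplotlib with `plt.hist(temp, bins=4)` followed by `plt.show()`.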
Q.5 (a) What is the use of %timeit and %%timeit magic functions? 03
(b) Discuss the importance of covariance and correlation in EDA. 04
(c) What are regression and classification problems? Explain any one with a suitable 07
example.
OR
Q.5 (a) Explain the different interfaces of the Scikit-learn library. 03
(b) Explain the importance of chi-square test in EDA. 04
(c) Write a note on different descriptive statistics measures for numeric data. 07
*************