Da Ans (GKJ)
Data science involves asking the right questions, analyzing data, and using
models to make informed decisions.
Structured Data: Data organized in a clearly defined format, such as database
tables (e.g., sales records, employee information).
TYPES:
Univariate Analysis (Single Variable)
Box Plot: Visualizes the spread and skewness of data using quartiles.
Line Plot: Shows trends over time for time series data.
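As a rough illustration of the two plots described above, the following sketch draws a box plot and a line plot of the same series with Matplotlib. The monthly sales figures are made up for the example, and the non-interactive "Agg" backend is an assumption so the script also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical sample: twelve monthly sales figures (made-up numbers).
monthly_sales = np.array([12, 15, 14, 18, 21, 19, 25, 24, 22, 28, 30, 45])

# The quartiles that a box plot is built from: Q1, median (Q2), Q3.
q1, median, q3 = np.percentile(monthly_sales, [25, 50, 75])
iqr = q3 - q1  # interquartile range, the height of the box

fig, (ax_box, ax_line) = plt.subplots(1, 2, figsize=(9, 3))

# Box plot: the box spans Q1 to Q3 and the inner line marks the median.
ax_box.boxplot(monthly_sales)
ax_box.set_title("Box plot of monthly sales")

# Line plot: the same values in time order, which exposes the trend.
months = np.arange(1, 13)
ax_line.plot(months, monthly_sales, marker="o")
ax_line.set_xlabel("Month")
ax_line.set_ylabel("Sales")
ax_line.set_title("Line plot of monthly sales")

fig.tight_layout()
# In an interactive session, call plt.show() here to display the figure.
```

The box plot summarizes the distribution and ignores ordering, while the line plot keeps the time order and ignores the distribution, so the two views complement each other.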
Python is highly preferred for data science due to its simplicity, extensive
libraries for data analysis (like Pandas, NumPy), and strong community
support.
It also offers tools for data visualization (Matplotlib, Seaborn) and machine
learning (Scikit-learn).
8. Features of Python:
Strong community support: Active community provides libraries and
resources for data science.
Identifies Patterns and Trends: Visual tools like line charts, heatmaps, and bar
graphs help identify patterns that are not immediately apparent in raw data.
11. Discuss bar chart, line chart, area chart, and pie chart with examples.
Bar Chart: Visualizes categorical data with rectangular bars where the length
represents the data value. Example: Sales of different products.
Line Chart: Shows trends over time by connecting data points with a line.
Example: Stock prices over a month.
Area Chart: Similar to a line chart but with the area below the line filled with
color. Example: Visualizing how the proportions of different market segments
change over time.
Pie Chart: Circular chart divided into slices representing proportions. Example:
Market share of companies in a sector.
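The four chart types above can be sketched in a single Matplotlib figure. All product names, prices, and shares below are invented for illustration, and the "Agg" backend is an assumption so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt

# Hypothetical data for each chart type (made-up numbers).
products = ["A", "B", "C"]
units_sold = [120, 80, 150]
days = list(range(1, 8))
stock_price = [101, 103, 102, 106, 108, 107, 111]
segment_online = [10, 12, 15, 18, 20, 23, 25]
segment_retail = [20, 19, 18, 18, 17, 16, 15]
companies = ["X", "Y", "Z"]
market_share = [50, 30, 20]  # percentages, summing to 100

fig, axes = plt.subplots(2, 2, figsize=(9, 6))

# Bar chart: one bar per category, bar height = value.
axes[0, 0].bar(products, units_sold)
axes[0, 0].set_title("Bar: units sold per product")

# Line chart: points connected in time order.
axes[0, 1].plot(days, stock_price, marker="o")
axes[0, 1].set_title("Line: stock price per day")

# Area chart: stacked filled regions, one per market segment.
axes[1, 0].stackplot(days, segment_online, segment_retail,
                     labels=["Online", "Retail"])
axes[1, 0].legend(loc="upper left")
axes[1, 0].set_title("Area: segment size over time")

# Pie chart: each slice is a share of the whole.
axes[1, 1].pie(market_share, labels=companies, autopct="%1.0f%%")
axes[1, 1].set_title("Pie: market share")

fig.tight_layout()
# In an interactive session, call plt.show() here to display the figure.
```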
Data Manipulation and Cleaning: Python's libraries like Pandas and NumPy
make it practical to clean, transform, and handle large datasets.
Integration and Scalability: Python integrates well with other tools and
technologies, facilitating scalable data science solutions.
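To make the data manipulation and cleaning point concrete, here is a minimal Pandas sketch. The `sales` table and its column names are hypothetical, and filling the gap with the median is just one possible choice of strategy.

```python
import numpy as np
import pandas as pd

# Hypothetical sales records with one missing value and one duplicated row.
sales = pd.DataFrame({
    "product": ["A", "B", "B", "C", "C"],
    "units": [10, 5, 5, np.nan, 8],
    "price": [2.5, 4.0, 4.0, 3.0, 3.0],
})

# Cleaning: drop exact duplicate rows, then fill the missing unit count
# with the median of the remaining values (an assumed strategy).
clean = sales.drop_duplicates().copy()
clean["units"] = clean["units"].fillna(clean["units"].median())

# Manipulation: derive a new column and aggregate it per product.
clean["revenue"] = clean["units"] * clean["price"]
revenue_by_product = clean.groupby("product")["revenue"].sum()
print(revenue_by_product)
```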
Power BI: Microsoft's tool for data visualization and business intelligence.
Seaborn: Built on Matplotlib for creating informative and attractive graphics.
16. List and explain the components of Python used for data science.
17. Explain different types of data visualization tools with their features.
Data science is a multi-step process that involves extracting insights from data.
Key phases:
1. Discovery:
2. Data Preparation:
3. Model Planning:
Algorithm Selection: Decide which algorithms are appropriate for your
problem.
4. Model Building:
5. Deployment:
6. Communication:
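The middle phases above (data preparation, model planning, model building) and the final reporting step can be sketched in a few lines with Scikit-learn. This is only an illustration: the bundled Iris dataset, the 70/30 split, and logistic regression as the selected algorithm are assumptions made for the example, not choices taken from these notes.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Data preparation: load the data and hold out 30% for evaluation.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Model planning: pick an algorithm (assumed here: scaled logistic regression).
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))

# Model building: fit the chosen model on the training split.
model.fit(X_train, y_train)

# Communication: report one simple, easy-to-explain metric on held-out data.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Held-out accuracy: {accuracy:.2f}")
```

Discovery and deployment are not shown, since they depend on the business problem and the target environment rather than on code alone.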
1. Data Collection:
2. Data Cleaning:
Build models to understand data relationships.
4. Data Visualization:
Data analytics transforms raw data into actionable insights. Here are the main
types:
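import pandas as pd
import matplotlib.pyplot as plt

# Load the Iris dataset (assumes iris.csv is in the working directory)
iris = pd.read_csv("iris.csv")

# Preview the first rows and the summary statistics
print(iris.head())
print(iris.describe())

# Histogram of a single variable
iris['sepal_length'].hist()
plt.show()

# Scatter plot of two variables
plt.scatter(iris['sepal_length'], iris['sepal_width'])
plt.show()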
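# Mean and correlation of the numeric columns only
# (the 'species' column holds text, so it must be excluded)
print(iris.mean(numeric_only=True))
print(iris.corr(numeric_only=True))

# Box plot of sepal length grouped by species
iris.boxplot(column='sepal_length', by='species')
plt.show()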
Pros:
Cross-platform support.
Cons:
Feature Selection: Choosing the most relevant features for model building.
Example:
# Replace missing numeric values with the mean of their column
iris.fillna(iris.mean(numeric_only=True), inplace=True)
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
iris_scaled = scaler.fit_transform(iris.drop('species', axis=1))
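By default, MinMaxScaler rescales each column to the range [0, 1] using x' = (x - min) / (max - min), computed per feature. The sketch below checks that formula by hand against Scikit-learn on a small made-up array, so it does not depend on iris.csv being available.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical measurements: four rows, two features (made-up numbers).
features = np.array([
    [5.1, 3.5],
    [4.9, 3.0],
    [6.3, 3.3],
    [7.0, 3.2],
])

# Manual min-max scaling, applied column by column.
col_min = features.min(axis=0)
col_max = features.max(axis=0)
manual_scaled = (features - col_min) / (col_max - col_min)

# The same transformation through Scikit-learn.
sklearn_scaled = MinMaxScaler().fit_transform(features)

# Both results should agree up to floating-point rounding.
print(np.allclose(manual_scaled, sklearn_scaled))
```

After scaling, the smallest value in each column becomes 0 and the largest becomes 1, which keeps features with large raw ranges from dominating distance-based models.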