Welcome to my analysis of the data job market, focusing on data analyst roles. This project was created out of a desire to navigate and understand the job market more effectively. It delves into the top-paying and in-demand skills to help find optimal job opportunities for data analysts.

The data is sourced from [Luke Barousse's Python Course](https://lukebarousse.com/python), which provides the foundation for my analysis and contains detailed information on job titles, salaries, locations, and essential skills. Through a series of Python scripts, I explore key questions such as the most demanded skills, salary trends, and the intersection of demand and salary in data analytics.

# The Questions

Below are the questions I want to answer in my project:

1. What are the skills most in demand for the top 3 most popular data roles?
2. How are in-demand skills trending for Data Analysts?
3. How well do jobs and skills pay for Data Analysts?
4. What are the optimal skills for data analysts to learn? (High Demand AND High Paying)

# Tools I Used

For my deep dive into the data analyst job market, I harnessed the power of several key tools:

- **Python:** The backbone of my analysis, allowing me to analyze the data and find critical insights. I also used the following Python libraries:
    - **Pandas Library:** Used to analyze the data.
    - **Matplotlib Library:** Used to visualize the data.
    - **Seaborn Library:** Helped me create more advanced visuals.
- **Jupyter Notebooks:** The tool I used to run my Python scripts, which let me easily include my notes and analysis.
- **Visual Studio Code:** My go-to for executing my Python scripts.
- **Git & GitHub:** Essential for version control and sharing my Python code and analysis, ensuring collaboration and project tracking.

# Data Preparation and Cleanup

This section outlines the steps taken to prepare the data for analysis, ensuring accuracy and usability.

## Import & Clean Up Data

I start by importing necessary libraries and loading the dataset, followed by initial data cleaning tasks to ensure data quality.

```python
# Parse each stringified skill list in 'job_skills' into a real list, leaving missing values as-is
df['job_skills'] = df['job_skills'].apply(lambda x: ast.literal_eval(x) if pd.notna(x) else x)
```
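
As a minimal, self-contained sketch of this cleanup step, the snippet below runs the same conversion on a tiny hand-made DataFrame instead of the real dataset. The rows are invented for illustration, and the `job_title_short` and `job_posted_date` column names are assumptions that do not appear in the code above.

```python
import ast

import pandas as pd

# Hypothetical stand-in rows; the real dataset is loaded elsewhere in the project.
df = pd.DataFrame({
    "job_title_short": ["Data Analyst", "Data Engineer", "Data Scientist"],
    "job_posted_date": ["2023-01-15 10:00:00", "2023-02-20 08:30:00", "2023-03-05 14:45:00"],
    "job_skills": ["['sql', 'excel']", None, "['python', 'sql']"],
})

# Convert the posting date from text to a datetime (assumed column name).
df["job_posted_date"] = pd.to_datetime(df["job_posted_date"])

# Turn each stringified list into a real Python list; missing values pass through unchanged.
df["job_skills"] = df["job_skills"].apply(
    lambda x: ast.literal_eval(x) if pd.notna(x) else x
)

print(df["job_skills"].tolist())
```

Using `ast.literal_eval` rather than `eval` means only Python literals are parsed, which is the safer choice for text read from a file.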

## Filter US Jobs

To focus my analysis on the U.S. job market, I apply filters to the dataset, narrowing down to roles based in the United States.

```python
# Keep only postings whose country is the United States
df_US = df[df['job_country'] == 'United States']
```

# The Analysis

Each Jupyter notebook in this project investigates a specific aspect of the data job market. Here's how I approached each question:

## 1. What are the most demanded skills for the top 3 most popular data roles?

To find the most demanded skills for the top 3 most popular data roles, I first identified the three most common job titles and then took the top 5 skills for each of them. This highlights the most popular job titles and their top skills, showing which skills I should pay attention to depending on the role I'm targeting.

View my notebook with detailed steps here: [2_Skill_Demand](2_Skill_Demand.ipynb).
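
The counting step described above can be sketched roughly as follows. This is not the notebook's code: the rows are invented, the `job_title_short` column name is an assumption, and `skill_count`, `jobs_total`, and `top_skills` are illustrative names.

```python
import pandas as pd

# Hypothetical stand-in for df_US with already-parsed skill lists.
df_US = pd.DataFrame({
    "job_title_short": ["Data Analyst", "Data Analyst", "Data Scientist",
                        "Data Engineer", "Data Analyst"],
    "job_skills": [["sql", "excel"], ["sql", "python"], ["python", "sql"],
                   ["python", "aws"], None],
})

# The three most common job titles.
top_roles = df_US["job_title_short"].value_counts().head(3).index.tolist()

# One row per (posting, skill); postings without skills are dropped from the skill counts.
df_skills = df_US.explode("job_skills").dropna(subset=["job_skills"])
skill_counts = (
    df_skills.groupby(["job_title_short", "job_skills"])
    .size()
    .reset_index(name="skill_count")
)

# Total postings per title, used to express each skill as a share of postings.
job_totals = (
    df_US["job_title_short"].value_counts()
    .rename_axis("job_title_short")
    .reset_index(name="jobs_total")
)
skill_perc = skill_counts.merge(job_totals, on="job_title_short")
skill_perc["skill_percent"] = 100 * skill_perc["skill_count"] / skill_perc["jobs_total"]

# Top 5 skills for each of the top roles.
top_skills = (
    skill_perc[skill_perc["job_title_short"].isin(top_roles)]
    .sort_values(["job_title_short", "skill_count"], ascending=[True, False])
    .groupby("job_title_short")
    .head(5)
)
print(top_skills)
```

Dividing by the number of postings per title, rather than by the number of skill mentions, is what makes the percentages read as "share of job postings that request this skill".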

### Visualize Data

@@ -21,16 +84,22 @@ plt.show()

### Results

![Likelihood of Skills Requested in the US Job Postings](images/Likelihood_of_Skills_Requested_in_the_US_Job_Postings.png)

*Bar graph visualizing the top 5 skills for each of the top 3 data roles and how often they appear in job postings.*

### Insights:

- SQL is the most requested skill for Data Analysts and Data Scientists, with it in over half the job postings for both roles. For Data Engineers, Python is the most sought-after skill, appearing in 68% of job postings.
- Data Engineers require more specialized technical skills (AWS, Azure, Spark) compared to Data Analysts and Data Scientists who are expected to be proficient in more general data management and analysis tools (Excel, Tableau).
- Python is a versatile skill, highly demanded across all three roles, but most prominently for Data Scientists (72%) and Data Engineers (65%).

## 2. How are in-demand skills trending for Data Analysts?

To find how skills are trending in 2023 for Data Analysts, I filtered data analyst positions and grouped the skills by the month of the job postings. This got me the top 5 skills of data analysts by month, showing how popular skills were throughout 2023.

View my notebook with detailed steps here: [3_Skills_Trend](3_Skills_Trend.ipynb).
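
One way to sketch the monthly grouping described above is shown below, on invented rows. The `df_DA_US` name, the `job_posted_date` column, and the helper column `job_posted_month_no` are assumptions for illustration, not names confirmed by this README.

```python
import pandas as pd

# Hypothetical stand-in for US data analyst postings.
df_DA_US = pd.DataFrame({
    "job_posted_date": pd.to_datetime(
        ["2023-01-10", "2023-01-20", "2023-02-05", "2023-02-18"]
    ),
    "job_skills": [["sql", "excel"], ["sql"], ["python", "sql"], ["excel"]],
})
df_DA_US["job_posted_month_no"] = df_DA_US["job_posted_date"].dt.month

# One row per (posting, skill), then count skill mentions per month.
df_exploded = df_DA_US.explode("job_skills")
monthly_counts = pd.crosstab(
    df_exploded["job_posted_month_no"], df_exploded["job_skills"]
)

# Express each count as a percentage of that month's postings.
monthly_totals = df_DA_US.groupby("job_posted_month_no").size()
monthly_percent = monthly_counts.div(monthly_totals, axis=0) * 100

# Keep the five skills with the most mentions over the whole period.
top5 = monthly_counts.sum().sort_values(ascending=False).head(5).index
monthly_percent = monthly_percent[top5]
print(monthly_percent)
```

Normalizing by postings per month keeps months with more job ads from looking like spikes in demand for every skill at once.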

### Visualize Data

```python
@@ -48,7 +117,7 @@ plt.show()

### Results

![Trending Top Skills for Data Analysts in the US](images/Skills_Trend.png)
*Bar graph visualizing the trending top skills for data analysts in the US in 2023.*

### Insights:
@@ -58,7 +127,9 @@ plt.show()

## 3. How well do jobs and skills pay for Data Analysts?

To identify the highest-paying roles and skills, I kept only jobs in the United States and looked at their median salaries. First, though, I looked at the salary distributions of common data jobs such as Data Scientist, Data Engineer, and Data Analyst to get an idea of which jobs are paid the most.

View my notebook with detailed steps here: [4_Salary_Analysis](4_Salary_Analysis.ipynb).

#### Visualize Data

@@ -73,10 +144,9 @@ plt.show()

#### Results

![Salary Distributions of Data Jobs in the US](images/Salary_Distributions_of_Data_Jobs_in_the_US.png)
*Box plot visualizing the salary distributions for the top 6 data job titles.*

#### Insights

- There's a significant variation in salary ranges across different job titles. Senior Data Scientist positions tend to have the highest salary potential, with up to $600K, indicating the high value placed on advanced data skills and experience in the industry.
@@ -87,6 +157,8 @@ plt.show()

### Highest Paid & Most Demanded Skills for Data Analysts

Next, I narrowed my analysis and focused only on data analyst roles. I looked at the highest-paid skills and the most in-demand skills. I used two bar charts to showcase these.
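
The two rankings behind those bar charts can be sketched as below. The rows are invented, and `df_DA_US` and the `salary_year_avg` column are assumed names; the real notebook may compute this differently.

```python
import pandas as pd

# Hypothetical stand-in for US data analyst postings with salary information.
df_DA_US = pd.DataFrame({
    "job_skills": [["sql", "excel"], ["sql", "python"], ["python"], ["excel", "sql"]],
    "salary_year_avg": [80000.0, 100000.0, 110000.0, 70000.0],
})

# One row per (posting, skill), then count postings and take the median salary per skill.
df_exploded = df_DA_US.explode("job_skills")
skill_stats = df_exploded.groupby("job_skills")["salary_year_avg"].agg(["count", "median"])

# Two views of the same table: ranked by pay and ranked by demand.
top_pay = skill_stats.sort_values("median", ascending=False).head(10)
top_demand = skill_stats.sort_values("count", ascending=False).head(10)

print(top_pay)
print(top_demand)
```

The median is used instead of the mean so that a handful of very high salaries does not dominate a skill's ranking.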

#### Visualize Data

```python
@@ -106,7 +178,7 @@ plt.show()

#### Results
Here's the breakdown of the highest-paid & most in-demand skills for data analysts in the US:

![The Highest Paid & Most In-Demand Skills for Data Analysts in the US](images/Highest_Paid_and_Most_In_Demand_Skills_for_Data_Analysts_in_the_US.png)
*Two separate bar graphs visualizing the highest paid skills and most in-demand skills for data analysts in the US.*

#### Insights:
@@ -117,7 +189,11 @@ Here's the breakdown of the highest-paid & most in-demand skills for data analys

- There's a clear distinction between the skills that are highest paid and those that are most in-demand. Data analysts aiming to maximize their career potential should consider developing a diverse skill set that includes both high-paying specialized skills and widely demanded foundational skills.

## 4. What are the most optimal skills to learn for Data Analysts?

To identify the most optimal skills to learn (the ones that are both the highest paid and the highest in demand), I calculated each skill's share of job postings and its median salary. Putting the two side by side makes it easy to see which skills score well on both.

View my notebook with detailed steps here: [5_Optimal_Skills](5_Optimal_Skills.ipynb).
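
A rough sketch of that calculation follows, again on invented rows. The `skill_percent` and `median_salary` column names match the plotting code later in this README; `df_DA_US`, `salary_year_avg`, and the `skill_limit` cutoff of 30% are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical stand-in for US data analyst postings with salary information.
df_DA_US = pd.DataFrame({
    "job_skills": [["sql", "excel"], ["sql", "python"], ["python", "oracle"],
                   ["excel", "sql"], ["sql"]],
    "salary_year_avg": [80000.0, 100000.0, 110000.0, 70000.0, 90000.0],
})
total_jobs = len(df_DA_US)

# Count postings and take the median salary for every skill.
skills = (
    df_DA_US.explode("job_skills")
    .groupby("job_skills")["salary_year_avg"]
    .agg(skill_count="count", median_salary="median")
)

# Share of postings that mention each skill.
skills["skill_percent"] = 100 * skills["skill_count"] / total_jobs

# Keep only skills above an (assumed) demand threshold, ranked by pay.
skill_limit = 30
high_demand = skills[skills["skill_percent"] > skill_limit].sort_values(
    "median_salary", ascending=False
)
print(high_demand)
```

The threshold matters: without it, a skill that appears in a single well-paid posting (like `oracle` in this toy data) would top the list despite having almost no demand.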

#### Visualize Data

@@ -132,13 +208,81 @@ plt.show()

#### Results

![Most Optimal Skills for Data Analysts in the US](images/Most_Optimal_Skills_for_Data_Analysts_in_the_US.png)
*A scatter plot visualizing the most optimal skills (high paying & high demand) for data analysts in the US.*

#### Insights:

- The skill `Oracle` appears to have the highest median salary of nearly $97K, despite being less common in job postings. This suggests a high value placed on specialized database skills within the data analyst profession.

- More commonly required skills like `Excel` and `SQL` have a large presence in job listings but lower median salaries compared to specialized skills like `Python` and `Tableau`, which not only have higher salaries but are also moderately prevalent in job listings.

- Skills such as `Python`, `Tableau`, and `SQL Server` are towards the higher end of the salary spectrum while also being fairly common in job listings, indicating that proficiency in these tools can lead to good opportunities in data analytics.

### Visualizing Different Technologies

Let's also show the technology category of each skill in the graph, adding color labels based on the technology (e.g., {Programming: Python}).
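
To attach a technology label to each skill, one option is a small lookup table merged onto the skill statistics. The sketch below assumes a hand-written mapping and made-up numbers; only the `df_DA_skills_tech_high_demand`, `technology`, `skill_percent`, and `median_salary` names come from the plotting code in this README, and everything else is hypothetical.

```python
import pandas as pd

# Hypothetical technology-to-skills mapping; the project's real mapping may differ.
technology_to_skills = {
    "programming": ["python", "r"],
    "analyst_tools": ["excel", "tableau", "power bi"],
    "databases": ["oracle", "sql server"],
}

# Flatten the mapping into one (technology, skill) row per skill.
df_technology = pd.DataFrame(
    [(tech, skill) for tech, skills in technology_to_skills.items() for skill in skills],
    columns=["technology", "skills"],
)

# Placeholder skill statistics (values are made up purely to show the merge).
df_DA_skills_high_demand = pd.DataFrame({
    "skills": ["python", "tableau", "oracle"],
    "skill_percent": [30.0, 25.0, 10.0],
    "median_salary": [3.0, 2.0, 4.0],
})

# Left merge keeps every skill, even one missing from the mapping.
df_DA_skills_tech_high_demand = df_DA_skills_high_demand.merge(
    df_technology, on="skills", how="left"
)
print(df_DA_skills_tech_high_demand)
```

A left merge is used so that a skill without a category still shows up in the plot data (with a missing `technology`) instead of silently disappearing.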

#### Visualize Data

```python
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.ticker import PercentFormatter

# Create a scatter plot
scatter = sns.scatterplot(
    data=df_DA_skills_tech_high_demand,
    x='skill_percent',
    y='median_salary',
    hue='technology',  # Color by technology
    palette='bright',  # Use a bright palette for distinct colors
    legend='full'  # Ensure the legend is shown
)
plt.show()
```

#### Results

![Most Optimal Skills for Data Analysts in the US with Coloring by Technology](images/Most_Optimal_Skills_for_Data_Analysts_in_the_US_with_Coloring_by_Technology.png)
*A scatter plot visualizing the most optimal skills (high paying & high demand) for data analysts in the US with color labels for technology.*

#### Insights:

- The scatter plot shows that most of the `programming` skills (colored blue) tend to cluster at higher salary levels compared to other categories, indicating that programming expertise might offer greater salary benefits within the data analytics field.

- The database skills (colored orange), such as Oracle and SQL Server, are associated with some of the highest salaries among data analyst tools. This indicates a significant demand and valuation for data management and manipulation expertise in the industry.

- Analyst tools (colored green), including Tableau and Power BI, are prevalent in job postings and offer competitive salaries, showing that visualization and data analysis software are crucial for current data roles. This category not only has good salaries but is also versatile across different types of data tasks.

# What I Learned

Throughout this project, I deepened my understanding of the data analyst job market and enhanced my technical skills in Python, especially in data manipulation and visualization. Here are a few specific things I learned:

- **Advanced Python Usage**: Utilizing libraries such as Pandas for data manipulation, Seaborn and Matplotlib for data visualization, and other libraries helped me perform complex data analysis tasks more efficiently.
- **Data Cleaning Importance**: I learned that thorough data cleaning and preparation are crucial before any analysis can be conducted, ensuring the accuracy of insights derived from the data.
- **Strategic Skill Analysis**: The project emphasized the importance of aligning one's skills with market demand. Understanding the relationship between skill demand, salary, and job availability allows for more strategic career planning in the tech industry.

# Insights

This project provided several general insights into the data job market for analysts:

- **Skill Demand and Salary Correlation**: There is a relationship between the demand for specific skills and the salaries these skills command, though the most in-demand skills are not always the highest paid. Advanced and specialized skills like Python and Oracle often lead to higher salaries.
- **Market Trends**: There are changing trends in skill demand, highlighting the dynamic nature of the data job market. Keeping up with these trends is essential for career growth in data analytics.
- **Economic Value of Skills**: Understanding which skills are both in-demand and well-compensated can guide data analysts in prioritizing learning to maximize their economic returns.

# Challenges I Faced

This project was not without its challenges, but it provided good learning opportunities:

- **Data Inconsistencies**: Handling missing or inconsistent data entries required careful consideration and thorough data-cleaning techniques to ensure the integrity of the analysis.
- **Complex Data Visualization**: Designing effective visual representations of complex datasets was challenging but critical for conveying insights clearly and compellingly.
- **Balancing Breadth and Depth**: Deciding how deeply to dive into each analysis while maintaining a broad overview of the data landscape required constant balancing to ensure comprehensive coverage without getting lost in details.

# Conclusion

This exploration into the data analyst job market has been incredibly informative, highlighting the critical skills and trends that shape this evolving field. The insights I gained enhance my understanding and provide actionable guidance for anyone looking to advance their career in data analytics. As the market continues to change, ongoing analysis will be essential to stay ahead in data analytics. This project is a good foundation for future explorations and underscores the importance of continuous learning and adaptation in the data field.