DEV Community

Stacy Gathu


An Introduction to Fundamental Libraries in Python for Data Science.

Getting started with Python as a data scientist can feel overwhelming, with new jargon flying everywhere even for the most basic tasks. It is therefore essential to first have an idea of the basic libraries that exist, what they do and when to use them before taking on the task of using them in your code. Here is a brief and hopefully helpful intro to the most common libraries in Python for beginners.

Pandas

Pandas is an open-source library for working with tabular data, first developed by Wes McKinney in 2008. Pandas has two main data structures (a data structure is a container that holds data in a specific way): the Series and the DataFrame. A pandas Series is a one-dimensional array of labelled data with an index attached to it, whilst a DataFrame is a two-dimensional table of data with labelled rows and columns, in which each column is a Series.

Here is an example of a pandas Series:

0          Apple
1         Banana
2         Cherry
3    Dragonfruit
dtype: object
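
A Series like the one above can be built from a plain Python list. This is a minimal sketch: the variable name `fruits` is only an illustration, and because no index is passed, pandas assigns the default labels 0 to 3 seen in the left-hand column.

```python
import pandas as pd

# Build a Series from a list; pandas adds a default integer index (0, 1, 2, ...)
fruits = pd.Series(["Apple", "Banana", "Cherry", "Dragonfruit"])

print(fruits)

# Look up a single value by its index label
second_fruit = fruits[1]
print(second_fruit)
```

The exact `dtype` line printed at the bottom may differ between pandas versions.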

And here is one of a DataFrame:

         fruit  quantity price
0        Apple        25   $30
1       Banana        30   $10
2       Cherry        30   $20
3  Dragonfruit         5   $50
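
One common way to create a DataFrame like this is from a dictionary, where each key becomes a column name and each list becomes that column's values. The sketch below assumes the hypothetical variable name `inventory`, and it keeps the prices as text (for example `"$30"`) to match the table above.

```python
import pandas as pd

# Each dictionary key becomes a column; each list holds that column's values
inventory = pd.DataFrame({
    "fruit": ["Apple", "Banana", "Cherry", "Dragonfruit"],
    "quantity": [25, 30, 30, 5],
    "price": ["$30", "$10", "$20", "$50"],
})

print(inventory)

# Selecting a single column returns a Series
quantities = inventory["quantity"]
print(type(quantities))
```

Selecting the `quantity` column returns a Series, which illustrates how a DataFrame is made up of multiple Series sharing one index.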
