Basics of Pandas — Part 2

Aarish Alam
Published in Analytics Vidhya
4 min read · Nov 23, 2020

--

In my previous article I addressed some common questions that beginners face while working with datasets. This article continues from where that one left off.

I’ll continue to demonstrate further concepts using the same dataset (UFO) that I used in the first part.

How do I sort a pandas DataFrame or a Series?

UFO-Dataset head()

I have slightly modified the ‘Time’ column in our dataset for the purpose of explaining this method.

The code below demonstrates how to sort the ‘Time’ column in ascending or descending order.

#by default, sort_values() sorts in ascending order
ufo.Time.sort_values()
#sorts in descending order
ufo.Time.sort_values(ascending=False)
#another way to sort: order the whole DataFrame by the 'Time' column
ufo.sort_values('Time')
Sorted ‘Time’ column in ascending order
Sorted ‘Time’ column in descending order
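To make the difference between sorting a Series and sorting a DataFrame concrete, here is a minimal sketch. The small `ufo_demo` frame is a made-up stand-in for the UFO data (its values are an assumption for illustration, not rows from the real dataset).

```python
import pandas as pd

# Hypothetical stand-in for the UFO data; the values are invented for illustration.
ufo_demo = pd.DataFrame({
    "City": ["Ithaca", "Willingboro", "Holyoke", "Abilene"],
    "State": ["NY", "NJ", "CO", "KS"],
    "Time": [1930, 1936, 1931, 1933],
})

# Sorting a Series returns a new, sorted Series; the original is left unchanged.
asc = ufo_demo["Time"].sort_values()
desc = ufo_demo["Time"].sort_values(ascending=False)

# Sorting the DataFrame by a column reorders whole rows, not just that column.
by_time = ufo_demo.sort_values("Time")

print(asc.tolist())
print(desc.tolist())
print(by_time["City"].tolist())
```

Note that none of these calls modify `ufo_demo` itself; each one hands back a new object.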

What is this Inplace parameter in most of the methods of DataFrame?

At its core, the inplace parameter helps you decide how you want to affect the underlying data of the pandas object. Do you want to change the DataFrame object you are working on and overwrite what was there before? Or do you want to get back a modified copy that you can assign to a different variable, so that the original data stays available?

When you pass inplace=True, the change is applied directly to your current DataFrame and the method returns None. With inplace=False (the default), the original DataFrame is left untouched and the method returns a copy with the change applied.

Examples

#makes changes to the original DataFrame
ufo.drop('Colors Reported', axis=1, inplace=True)
#sorts the rows of the original DataFrame by 'Time'
ufo.sort_values('Time', inplace=True)
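The following sketch contrasts the two behaviours side by side. The tiny `df` frame is hypothetical and only borrows the column names from the UFO dataset.

```python
import pandas as pd

# Hypothetical two-row frame; column names borrowed from the UFO dataset.
df = pd.DataFrame({
    "City": ["Ithaca", "Holyoke"],
    "Colors Reported": ["RED", None],
    "Time": [1936, 1931],
})

# inplace=False (the default): a new DataFrame is returned, df keeps all its columns.
dropped = df.drop("Colors Reported", axis=1)
columns_before = list(df.columns)

# inplace=True: df itself is modified and the call returns None.
result = df.drop("Colors Reported", axis=1, inplace=True)
columns_after = list(df.columns)

# The same applies to sorting the DataFrame by a column.
df.sort_values("Time", inplace=True)

print(columns_before)
print(columns_after)
print(result)
```

Because an inplace call returns None, writing something like `df = df.drop(..., inplace=True)` would replace your DataFrame with None, which is a common beginner slip.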

How do I filter rows of a pandas DataFrame by column value?

The code below returns the rows of the DataFrame where the ‘City’ column equals ‘Ithaca’.

ufo[(ufo['City']=='Ithaca')]
city==’Ithaca’

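We can also use comparison operators such as < and > to filter data, and combine several conditions with & and |.

Note that ‘&’ is the ‘and’ operator and ‘|’ is the ‘or’ operator. Each condition must be wrapped in its own parentheses.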

ufo[(ufo['Time']<1940) & (ufo['Time']>1935)]
1935<Time<1940
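Here is a small sketch of how these filters work under the hood: the comparison produces a boolean Series, and indexing with it keeps only the rows marked True. The `ufo_demo` frame and its values are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the UFO data.
ufo_demo = pd.DataFrame({
    "City": ["Ithaca", "Willingboro", "Holyoke", "Abilene"],
    "Time": [1930, 1936, 1938, 1941],
})

# A comparison on a column yields one True/False value per row.
is_ithaca = ufo_demo["City"] == "Ithaca"

# '&' keeps rows where both conditions hold.
between = ufo_demo[(ufo_demo["Time"] > 1935) & (ufo_demo["Time"] < 1940)]

# '|' keeps rows where at least one condition holds.
either = ufo_demo[(ufo_demo["City"] == "Ithaca") | (ufo_demo["Time"] > 1940)]

print(is_ithaca.tolist())
print(between["City"].tolist())
print(either["City"].tolist())
```

The parentheses matter because & and | bind more tightly than the comparison operators in Python.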

How do I change the data type of a pandas Series?

There are several ways to change the data type of a pandas Series. Let us look at a few of them.

The astype method is the most common of them. First, let us look at the default dtypes of the columns in our DataFrame.

data-types

Initially all the columns have the ‘object’ dtype. Let us typecast one of them using the astype method.

#to convert an object column to integer, typecast it to string and then to integer
ufo['Time']=ufo['Time'].astype(str).astype(int)
changed dtype of Time column
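As a rough sketch of two of these options: astype works when every value can be converted, while pd.to_numeric with errors="coerce" is one way to cope with values that cannot. The sample Series below are made up for illustration.

```python
import pandas as pd

# Years stored as strings (hypothetical values).
times = pd.Series(["1930", "1936", "1931"])

# astype converts every value; it raises an error if any value cannot be parsed.
as_int = times.astype(int)

# pd.to_numeric with errors="coerce" turns unparseable values into NaN instead of failing.
messy = pd.Series(["1930", "unknown", "1931"])
coerced = pd.to_numeric(messy, errors="coerce")

print(as_int.dtype)
print(coerced.tolist())
```

Because NaN is a floating-point value, the coerced Series ends up with a float dtype rather than an integer one.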

How do I use string methods in pandas?

String methods for a pandas Series are accessed via ‘str’. The code below shows how to convert every value in a particular column to uppercase using a string method.

#displays the names of cities in uppercase letters
ufo.City.str.upper()
upper() method of string
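A few other methods on the str accessor behave the same way, applying to every value in the Series at once. This is a minimal sketch using invented city names.

```python
import pandas as pd

# Hypothetical city names with inconsistent casing.
cities = pd.Series(["ithaca", "new york", "Holyoke"])

upper = cities.str.upper()    # every letter uppercase
title = cities.str.title()    # first letter of each word uppercase
lengths = cities.str.len()    # number of characters in each value

print(upper.tolist())
print(title.tolist())
print(lengths.tolist())
```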

You can do a whole lot of operations on strings. One widely used technique in data analysis is checking whether a particular string is present in a given column. Suppose you have to find how many data points correspond to the ‘NY’ state. str.contains() gives True or False for each row, and summing those values gives the number of matches.

ufo.State.str.contains('NY').sum()
#note: .count() here returns the number of non-missing rows (18247), not the number of matches
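One thing worth checking with str.contains() is the difference between count() and sum() on the result. count() counts every non-missing entry, whether it is True or False, while sum() adds up only the True values. The sketch below uses a made-up `states` Series, and passes na=False (an existing str.contains parameter) so that missing values are treated as non-matches.

```python
import pandas as pd

# Hypothetical state codes, including one missing value.
states = pd.Series(["NY", "NJ", "NY", None, "CO"])

# na=False makes the missing entry count as "does not contain".
mask = states.str.contains("NY", na=False)

total_rows = mask.count()   # counts every entry, True or False
ny_rows = mask.sum()        # counts only the True entries

# For an exact match on the whole value, a plain comparison also works.
exact_ny = (states == "NY").sum()

print(total_rows, ny_rows, exact_ny)
```

So to count matching rows, sum() is the one to reach for; count() only tells you how many rows were checked.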

Other regularly used str methods include str.replace(), which replaces a given substring or pattern with another string (the two are passed as comma-separated arguments).

This marks the end of part 2 of the series “Basics of Pandas”. I’ll talk about some more basic questions that arise while working with pandas in my next article.
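As a quick illustration of str.replace(), here is a minimal sketch on invented city names. Passing regex=False asks pandas to treat the first argument as a literal string rather than a regular expression.

```python
import pandas as pd

# Hypothetical city names.
cities = pd.Series(["New York", "Los Angeles", "Ithaca"])

# Replace every space with an underscore; values without a space are unchanged.
underscored = cities.str.replace(" ", "_", regex=False)

print(underscored.tolist())
```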

Thanks 😉
