Generating errors

We've already seen the raise keyword in passing. raise Exception is the simplest way to make your program stop when something goes wrong. In a notebook or console environment, it stops the current cell or function (it doesn't crash the session). You have to raise <something>: Exception is the most general case ("something happened"). Other possibilities include TypeError: some…
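
As a minimal sketch of raising something more specific than Exception, the hypothetical helper below (parse_age and describe are illustrative names, not from the original post) raises TypeError for an input of the wrong type. It also uses ValueError, another built-in exception that the excerpt does not mention, for a value of the right type but an unacceptable content.

```python
def parse_age(value):
    """Convert value to a non-negative int, raising a specific error on bad input."""
    if not isinstance(value, (int, str)):
        # Wrong kind of object entirely: TypeError is the conventional choice.
        raise TypeError(f"expected int or str, got {type(value).__name__}")
    age = int(value)  # int() itself raises ValueError for text such as "abc"
    if age < 0:
        # Right type, unacceptable value: ValueError (an assumption, not from the post).
        raise ValueError(f"age must be non-negative, got {age}")
    return age


def describe(value):
    """Call parse_age and report which error, if any, was raised."""
    try:
        return f"ok: {parse_age(value)}"
    except (TypeError, ValueError) as err:
        return f"{type(err).__name__}: {err}"
```

Calling describe(3.5) returns a message instead of stopping the program, because the raised TypeError is caught; calling parse_age(3.5) directly would stop the current cell or function as described above.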

- Find duplicate rows: duplicated()
  - Determine which duplicates to mark: keep
  - Specify the columns to check for duplicates: subset
  - Count duplicate / non-duplicate rows
- Remove duplicate rows: drop_duplicates()
  - keep, subset
  - inplace
- Aggregate based on duplicate elements: groupby()

The following data is used as an example. Row #6 is a duplicate of row #3. The sample CSV file is linked below. Find…
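
The methods in the outline above can be sketched on a small hand-made DataFrame. The data here is hypothetical and stands in for the post's sample CSV file, which is not reproduced in this excerpt; the last row is deliberately an exact copy of the third.

```python
import pandas as pd

# Hypothetical data; the row at index 4 duplicates the row at index 2.
df = pd.DataFrame(
    {
        "name": ["Alice", "Bob", "Carol", "Dave", "Carol"],
        "state": ["NY", "CA", "TX", "CA", "TX"],
        "point": [10, 20, 30, 40, 30],
    }
)

# duplicated() marks repeats; by default (keep='first') the first occurrence is not marked.
dup_mask = df.duplicated()

# keep=False marks every member of a duplicate group, including the first.
all_dup_mask = df.duplicated(keep=False)

# subset restricts the comparison to the given column(s).
state_dup_mask = df.duplicated(subset="state")

# Counting: True counts as 1 when summed.
n_duplicates = int(dup_mask.sum())

# drop_duplicates() returns a new DataFrame; the original is unchanged
# unless inplace=True is passed.
deduped = df.drop_duplicates()
deduped_last = df.drop_duplicates(subset="state", keep="last")

# Aggregating rows that share a key instead of discarding them.
totals = df.groupby("state")["point"].sum()
```

Note that drop_duplicates() keeps the original index labels, so deduped_last has the index 0, 3, 4 rather than 0, 1, 2.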

There are two kinds of sorting available in pandas:
- By label
- By actual value

Let us consider an example with an output.

    import pandas as pd
    import numpy as np

    # 10 rows of random values with a deliberately shuffled index
    unsorted_df = pd.DataFrame(np.random.randn(10, 2), index=[1, 4, 6, 2, 3, 5, 9, 8, 0, 7], columns=['col2', 'col1'])
    print(unsorted_df)

Its output is as follows (the values are random, so they differ on every run):

           col2       col1
    1 -2.063177   0.537527
    4  0.142932  -0.684884
    6 …
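
The two kinds of sorting correspond to the pandas methods sort_index() (by label) and sort_values() (by actual value). The sketch below uses small fixed values instead of random ones, an assumption made only so that the results are reproducible.

```python
import pandas as pd

# Fixed, hypothetical values so the ordering can be checked by eye.
unsorted_df = pd.DataFrame(
    {"col2": [3, 1, 2, 1], "col1": [0.5, 2.0, -1.0, 0.1]},
    index=[2, 0, 3, 1],
)

# Sorting by label: rows are reordered by their index labels.
by_label = unsorted_df.sort_index()
by_label_desc = unsorted_df.sort_index(ascending=False)

# Sorting by actual value: rows are reordered by the contents of a column.
by_value = unsorted_df.sort_values(by="col1")

# Several columns: ties in col2 are broken by col1.
by_two = unsorted_df.sort_values(by=["col2", "col1"])
```

Both methods return a new DataFrame and leave unsorted_df as it was.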

Inserting rows and columns

You can insert rows or columns using the relevant worksheet methods:
- openpyxl.worksheet.worksheet.Worksheet.insert_rows()
- openpyxl.worksheet.worksheet.Worksheet.insert_cols()
- openpyxl.worksheet.worksheet.Worksheet.delete_rows()
- openpyxl.worksheet.worksheet.Worksheet.delete_cols()

The default is one row or column. For example, to insert a row at 7 (before the existing row 7):

    >>> ws.insert_rows(7)

Deleting rows and columns

To delete the columns F:H:

    >>> …
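
A self-contained sketch of these methods on an in-memory workbook follows. The cell contents are hypothetical. The delete_cols(6, 3) call is one way to remove columns F:H using the method's (idx, amount) signature; it is not necessarily the exact call elided from the excerpt.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

# Fill 10 rows x 8 columns (A..H) with labels such as "r3c2".
for i in range(1, 11):
    ws.append([f"r{i}c{j}" for j in range(1, 9)])

# Insert one empty row before the existing row 7; old row 7 becomes row 8.
ws.insert_rows(7)
inserted_row_value = ws["A7"].value  # the new row is empty
shifted_value = ws["A8"].value       # content that used to be in A7

# Delete three columns starting at column 6 (F), i.e. F, G and H.
ws.delete_cols(6, 3)
remaining_columns = ws.max_column
last_cell_row1 = ws.cell(row=1, column=5).value
```

Nothing is written to disk here; calling wb.save() with a filename of your choice would persist the result.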

The applymap() method works only on a pandas DataFrame; the function is applied to every element individually. The apply() method works on both Series and DataFrames; depending on the type of function provided, it operates either on whole Series (rows or columns) or on individual elements. The map() method works only on a pandas Series, where type…
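
The difference between the three methods can be sketched as below, using a small hypothetical DataFrame. One caveat that the excerpt predates: since pandas 2.1, DataFrame.applymap() is deprecated in favor of the equivalent DataFrame.map(), so the sketch picks whichever of the two is available.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Element-wise on a DataFrame: DataFrame.map() on pandas >= 2.1,
# applymap() on older versions.
elementwise = df.map if hasattr(df, "map") else df.applymap
squared = elementwise(lambda x: x ** 2)

# apply() on a DataFrame passes a whole column (axis=0, the default)
# or a whole row (axis=1) to the function as a Series.
col_ranges = df.apply(lambda col: col.max() - col.min())
row_sums = df.apply(lambda row: row["a"] + row["b"], axis=1)

# apply() on a Series passes one element at a time.
plus_one = df["a"].apply(lambda x: x + 1)

# Series.map() is element-wise and also accepts a dict;
# values missing from the dict become NaN.
labels = df["a"].map({1: "one", 2: "two"})
```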

A crosstab computes aggregated metrics among two or more columns in a dataset that contains categorical values.

Import Modules

    import pandas as pd
    import seaborn as sns

Get Tips Dataset

Let's get the tips dataset from the seaborn library and assign it to the DataFrame df_tips.

    df_tips = sns.load_dataset('tips')

Each row represents a unique meal at a…
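
To illustrate what a crosstab computes without downloading anything, the sketch below builds a tiny hypothetical table (df_meals, loosely shaped like the tips data but not taken from it) and passes two categorical columns to pd.crosstab(). By default the result counts co-occurrences; with values and aggfunc it aggregates a third column instead.

```python
import pandas as pd

# Hypothetical meals; not the real seaborn tips dataset.
df_meals = pd.DataFrame(
    {
        "day": ["Thur", "Thur", "Fri", "Fri", "Fri", "Sat"],
        "time": ["Lunch", "Dinner", "Lunch", "Dinner", "Dinner", "Dinner"],
        "total_bill": [10.0, 20.0, 12.0, 30.0, 18.0, 25.0],
    }
)

# Frequency table: how many meals fall into each (day, time) pair.
counts = pd.crosstab(df_meals["day"], df_meals["time"])

# Aggregated metric: mean total_bill for each (day, time) pair.
# Pairs with no rows come out as NaN.
mean_bill = pd.crosstab(
    df_meals["day"],
    df_meals["time"],
    values=df_meals["total_bill"],
    aggfunc="mean",
)
```

The same calls should work on df_tips by swapping in its column names.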

Any groupby operation involves one of the following operations on the original object:
- Splitting the object
- Applying a function
- Combining the results

In many situations, we split the data into sets and apply some functionality to each subset. In the apply functionality, we can perform the following operations:
- Aggregation: computing…
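
The split, apply, combine sequence can be sketched as follows for the aggregation case. The team names and scores are hypothetical and serve only to make the three steps visible.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "team": ["Kings", "Devils", "Kings", "Devils", "Riders"],
        "points": [876, 789, 741, 812, 690],
    }
)

# Split: groupby() partitions the rows by the values in "team".
grouped = df.groupby("team")

# Iterating over the groups yields (name, sub-DataFrame) pairs.
group_sizes = {name: len(group) for name, group in grouped}

# Apply + combine: an aggregation is computed per group and the
# per-group results are assembled into one Series indexed by team.
mean_points = grouped["points"].mean()

# Several aggregations at once produce a DataFrame, one column each.
summary = grouped["points"].agg(["sum", "max"])
```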