Select multiple columns with condition in Pandas
How to select multiple columns along with a condition based on the column of a Pandas dataFrame column. It was asked by one of my fellow teacher. Before we solve the issue let’s try to understand what is the problem.
Suppose we have a CSV file with the following data
RLN | NAME | Eng | hin | math | sci |
1 | Harish Kumar | 50 | 47 | 45 | 40 |
2 | Guneet Arora | 49 | 46 | 47 | 48 |
3 | Naysha Singh | 40 | 45 | 50 | 49 |
4 | Vasu Kumar | 43 | 46 | 48 | 44 |
5 | Divya Sagar | 43 | 38 | 42 | 41 |
The problem is he wants the name and marks obtained by all the students who are able to score more than 40 marks. The first solution was something like this
>>>import pandas as pd >>>df = pd.read_csv(r'c:/users/rakesh/desktop/qp.csv') >>>df RLN NAME Eng hin math sci 0 1 Harish Kumar 50 47 45 40 1 2 Guneet Arora 49 46 47 48 2 3 Naysha Singh 40 45 50 49 3 4 Vasu Kumar 43 46 48 44 4 5 Divya Sagar 43 38 42 41 >>>df1 = df[['NAME','Eng']]
The above part will generate a new dataframe from our existing dataframe having only two columns ‘NAME’ and ‘Eng’. The result of the df1 is something like this
>>>df1 RLN NAME Eng hin math sci 0 1 Harish Kumar 50 47 45 40 1 2 Guneet Arora 49 46 47 48 2 3 Naysha Singh 40 45 50 49 3 4 Vasu Kumar 43 46 48 44 4 5 Divya Sagar 43 38 42 41
now filter this data frame with our filter command
>>>df1[df['Eng'>40]] NAME Eng 0 Harish Kumar 50 1 Guneet Arora 49 3 Vasu Kumar 43 4 Divya Sagar 43
BUT the answer was not acceptable by my friends as he wanted all this query in a single row. so we came up with the following solution with the pandas query method
>>>df[['NAME','Eng']].query('Eng>40') NAME Eng 0 Harish Kumar 50 1 Guneet Arora 49 3 Vasu Kumar 43 4 Divya Sagar 43
Hope you people will love the solution. If you have any doubt please feel free to contact us.