CBSE Class 11 & 12 Computer Science and Informatics Practices Python Materials, Video Lecture

Python Pandas - Introduction

Following are the key points about Pandas:
  1. Pandas is an open-source Python library.
  2. The name Pandas is derived from "panel data", a term used in econometrics for multidimensional structured datasets.
  3. Pandas is a popular choice for data analysis and data science because it is simple and easy to use.
  4. The main author of Pandas is Wes McKinney.
  5. Pandas offers high-performance, easy-to-use data structures and data analysis tools.

Data Analysis - The process of evaluating large datasets using analytical and statistical tools to discover useful information and draw conclusions that support business decision-making.

Pandas Installation

The Pandas library can be installed in your Python environment by opening a command prompt and running the following command:
pip install pandas
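
Once the command finishes, a quick way to confirm that the installation worked is to import the library and print its version. This is only a minimal check; the alias pd is a widely used convention, not a requirement.

```python
# Quick check that pandas is installed and can be imported.
import pandas as pd  # "pd" is the conventional short alias

# Prints the installed version string (the exact value depends on your setup).
print(pd.__version__)
```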

Why Pandas is used in Python

Pandas is one of the most widely used libraries in the scientific Python ecosystem for data analysis. It is capable of many tasks, including:
  • It can read and write many file formats (CSV, Excel, JSON, etc.) and handle data of different types (integer, float, string, etc.).
  • It can perform calculations along either axis of the data, i.e. row-wise and column-wise analysis.
  • It can easily select subsets of data from bulky datasets and even combine multiple datasets.
  • It supports visualization by integrating with libraries such as matplotlib and seaborn.
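
The tasks listed above can be sketched in a few lines. The student names and marks below are made-up sample data, and io.StringIO is used in place of a real CSV file so that the example runs on its own; with an actual file you would pass its path to read_csv instead.

```python
import io
import pandas as pd

# Hypothetical CSV content; io.StringIO stands in for a file on disk.
csv_text = "name,maths,science\nAsha,80,90\nRavi,70,60\nMeena,95,85\n"
marks = pd.read_csv(io.StringIO(csv_text))

# Column-wise calculation: average marks in each subject.
subject_means = marks[["maths", "science"]].mean()

# Row-wise calculation: total marks of each student (axis=1 works across columns).
marks["total"] = marks[["maths", "science"]].sum(axis=1)

# Select a subset: only the students whose total is above 150.
toppers = marks[marks["total"] > 150]

# Combine two datasets: add the rows of a second (made-up) table below the first.
extra = pd.DataFrame(
    {"name": ["John"], "maths": [60], "science": [75], "total": [135]}
)
combined = pd.concat([marks, extra], ignore_index=True)

# Write the combined data back out as CSV text (no path given, so a string is returned).
csv_out = combined.to_csv(index=False)

print(subject_means)
print(toppers)
print(csv_out)
```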

Pandas Data Structure

Data Structure - A data structure is a particular way of organizing data in a computer so that it can be used effectively. For example, we can store a list of items using the list data structure.
  • Python has four commonly used built-in data structures: list, tuple, dictionary and set.
  • Pandas offers many data structures to handle a variety of data.
  • Of these, the two basic data structures of Pandas are Series and DataFrame.
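
For comparison with the Pandas structures, the four built-in Python data structures mentioned above can be sketched as follows. The variable names and values are made up for illustration.

```python
# list: ordered and changeable
marks_list = [80, 70, 95]
marks_list.append(60)

# tuple: ordered but cannot be changed after creation
point = (3, 4)

# dictionary: stores key -> value pairs
student = {"name": "Asha", "age": 16}

# set: keeps only unique items (the duplicate "maths" is dropped)
subjects = {"maths", "science", "maths"}

print(marks_list)
print(point)
print(student["name"])
print(len(subjects))
```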

Series - A Series is a Pandas data structure that represents a one-dimensional array-like object containing an array of data and an associated array of data labels, called its index. 
[Figure: Pandas Series]
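
A minimal sketch of a Series, assuming some made-up marks labelled by student name. It shows the two parts named in the definition: the data and its index.

```python
import pandas as pd

# Hypothetical marks; the index holds the data labels (student names).
marks = pd.Series([80, 70, 95], index=["Asha", "Ravi", "Meena"])

print(marks)

# Access a value by its label.
print(marks["Ravi"])

# Access a value by its position.
print(marks.iloc[0])

# The two parts of a Series: the labels and the underlying data.
print(list(marks.index))
print(marks.values)
```

If no index is given, Pandas labels the values with integers starting from 0.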

DataFrame - A DataFrame is a two-dimensional, labelled, array-like Pandas data structure that stores an ordered collection of columns, each of which can hold data of a different type.
[Figure: Pandas DataFrame]
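
A minimal sketch of a DataFrame, built here from a dictionary of made-up student records. Each key becomes a column, and the columns hold different types of data (text, whole numbers and decimals).

```python
import pandas as pd

# Hypothetical student records; each dictionary key becomes a column.
students = pd.DataFrame(
    {
        "name": ["Asha", "Ravi", "Meena"],
        "age": [16, 17, 16],
        "percentage": [85.5, 72.0, 91.25],
    }
)

print(students)

# Number of rows and columns.
print(students.shape)

# Data type of each column (the exact type names depend on the pandas version).
print(students.dtypes)

# A single column taken from a DataFrame is a Series.
age_column = students["age"]
print(type(age_column).__name__)
```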

Note: Both Series and DataFrame are objects in Python.
