
Data Analysis and Scientific Computing with Pandas and NumPy
Explore the functionalities of Pandas and NumPy, essential tools for data analysis and scientific computing in Python. Learn about data structures in Pandas, numerical computations in NumPy, and why these libraries are indispensable for efficient data handling and computation tasks.
Presentation Transcript
BMEG3105: Introduction to Pandas and NumPy
Qinze YU (qzyu22@cse.cuhk.edu.hk)
Wednesday, March 19, 2025
Department of Computer Science and Engineering (CSE), The Chinese University of Hong Kong (CUHK)
Pandas
Pandas: a data analysis and manipulation tool built on top of the Python programming language.
Types of data structure in Pandas

Data Structure   Dimensions   Description
Series           1            1D labeled homogeneous array, size-immutable
DataFrame        2            General 2D labeled, size-mutable tabular structure
Panel            3            General 3D labeled, size-mutable array (deprecated and removed from recent pandas releases)
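As a rough illustration of the table above, the sketch below builds a Series and a DataFrame and checks their dimensions. The variable names and values are made up for this example, and Panel is left out because it is no longer available in recent pandas releases.

```python
import pandas as pd

# A Series is 1D: one labeled column of values sharing a single dtype.
ages = pd.Series([22, 19, 39], index=["Tom", "Lily", "Jim"], name="Age")

# A DataFrame is 2D: labeled rows and columns; columns may differ in dtype.
people = pd.DataFrame(
    {"Age": [22, 19, 39], "Gender": ["Male", "Female", "Male"]},
    index=["Tom", "Lily", "Jim"],
)

print(ages.ndim, people.ndim)  # 1 2

# DataFrames are size-mutable: a new column can be added in place.
people["Adult"] = people["Age"] >= 18
print(people.shape)  # (3, 3)
```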
DataFrame

Name     Age   Gender
Tom      22    Male
Lily     19    Female
Jim      39    Male
Sherry   27    Female

Each column of the DataFrame is a Series (labeled Series1, Series2, and Series3 on the slide).
DataFrame

Column   Type
Name     String
Age      Integer
Gender   String
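A minimal sketch of how the example table could be built in pandas and how its column types can be inspected. The variable names `df` and `age` are chosen here for illustration. Depending on the pandas version, string columns are reported either as `object` or as a dedicated string dtype, so the exact `dtypes` output is not shown.

```python
import pandas as pd

# Rebuild the example table: one dictionary key per column.
df = pd.DataFrame({
    "Name": ["Tom", "Lily", "Jim", "Sherry"],
    "Age": [22, 19, 39, 27],
    "Gender": ["Male", "Female", "Male", "Female"],
})

# Each column has its own type (integer for Age, string-like for the others).
print(df.dtypes)

# Selecting a single column returns a Series.
age = df["Age"]
print(type(age).__name__)  # Series
print(age.mean())          # 26.75
```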
NumPy
What is NumPy? NumPy is the fundamental package for scientific computing in Python. It provides a multidimensional array object.
What is NumPy? NumPy, SciPy, and Matplotlib provide MATLAB-like functionality in Python. NumPy features: typed multidimensional arrays (matrices), fast numerical computations (matrix math), and high-level math functions.
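The three features listed above can be sketched in a few lines. The arrays `a` and `b` are arbitrary values picked for this example.

```python
import numpy as np

# Typed multidimensional array: every element shares one dtype.
a = np.array([[1.0, 2.0], [3.0, 4.0]])
print(a.dtype)  # float64

# Fast numerical computation: matrix multiplication with the @ operator.
b = np.eye(2)  # 2 x 2 identity matrix
product = a @ b
print(np.allclose(product, a))  # True

# High-level math functions operate on whole arrays at once.
print(np.sqrt(a))
print(a.sum(axis=0))  # [4. 6.]
```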
Why do we need NumPy? Pure Python performs numerical computations slowly. Example: for a 1000 x 1000 matrix multiply, a Python triple loop takes > 10 min, while NumPy takes ~0.03 seconds.
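The comparison can be tried on a smaller scale with the sketch below. It uses 100 x 100 matrices instead of 1000 x 1000 so that the loop version finishes quickly, and the helper `matmul_loops` is written for this example only. Actual timings depend on the machine, so none are quoted here.

```python
import time
import numpy as np

def matmul_loops(a, b):
    """Multiply two matrices given as nested lists using a triple loop."""
    n, m, p = len(a), len(b), len(b[0])
    out = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            total = 0.0
            for k in range(m):
                total += a[i][k] * b[k][j]
            out[i][j] = total
    return out

n = 100  # assumption: reduced from 1000 to keep the demo short
rng = np.random.default_rng(0)
a = rng.random((n, n))
b = rng.random((n, n))

start = time.perf_counter()
slow = matmul_loops(a.tolist(), b.tolist())
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
fast = a @ b
numpy_seconds = time.perf_counter() - start

print(f"triple loop: {loop_seconds:.4f} s, NumPy: {numpy_seconds:.6f} s")
print(np.allclose(slow, fast))  # True: both methods give the same result
```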
Array: structured lists of numbers, for example vectors, matrices, images, and tensors.
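Each of these is the same NumPy array type with a different number of dimensions. The shapes below are arbitrary, and treating an image as height x width x color channels is a common convention assumed for this sketch.

```python
import numpy as np

vector = np.array([1, 2, 3])                  # 1D, shape (3,)
matrix = np.zeros((2, 3))                     # 2D, shape (2, 3)
image = np.zeros((4, 5, 3), dtype=np.uint8)   # 3D: height x width x RGB
tensor = np.zeros((2, 3, 4, 5))               # 4D, shape (2, 3, 4, 5)

for name, arr in [("vector", vector), ("matrix", matrix),
                  ("image", image), ("tensor", tensor)]:
    print(name, arr.ndim, arr.shape)
```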
Distance and Similarity: distance estimates the similarity between objects. Low distance means high similarity; high distance means low similarity.
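Common ways to calculate distance

Manhattan Distance: $\text{Manhattan Distance} = \sum_{i=1}^{n} |x_i - y_i|$

Hamming Distance: $\text{Hamming Distance} = \sum_{i=1}^{n} \mathbb{1}(x_i \neq y_i)$

Here $x$ and $y$ are two objects with $n$ components each, and $\mathbb{1}(\cdot)$ equals 1 when its condition holds and 0 otherwise.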
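Both distances can be sketched with NumPy as follows. The function names and the sample vectors are chosen for this example and do not come from the slides.

```python
import numpy as np

def manhattan_distance(x, y):
    """Sum of absolute component-wise differences."""
    x, y = np.asarray(x), np.asarray(y)
    if x.shape != y.shape:
        raise ValueError("x and y must have the same shape")
    return np.abs(x - y).sum()

def hamming_distance(x, y):
    """Number of positions at which x and y differ."""
    x, y = np.asarray(x), np.asarray(y)
    if x.shape != y.shape:
        raise ValueError("x and y must have the same shape")
    return int(np.count_nonzero(x != y))

print(manhattan_distance([1, 2, 3], [4, 0, 3]))      # 5
print(hamming_distance([1, 0, 1, 1], [1, 1, 1, 0]))  # 2
```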
Example

$\text{Hamming Distance} = \sum_{i=1}^{n} \mathbb{1}(x_i \neq y_i)$

GCGAACGTAA
GCGTAGGTAA

Hamming Distance? Similarity?
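One way to work through this example is sketched below. The Hamming distance follows directly from its definition. The slides do not define a similarity formula, so the fraction of matching positions is used here as an assumption.

```python
def hamming_distance(s, t):
    """Count the positions at which two equal-length sequences differ."""
    if len(s) != len(t):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(s, t))

seq1 = "GCGAACGTAA"
seq2 = "GCGTAGGTAA"

distance = hamming_distance(seq1, seq2)

# Assumption: similarity = matching positions / sequence length.
similarity = (len(seq1) - distance) / len(seq1)

print(distance)    # 2 (the sequences differ at positions 4 and 6)
print(similarity)  # 0.8
```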
Thanks!