For many researchers, Python is a first-class tool mainly because of its libraries for storing, manipulating, and gaining insight from data. Several resources exist for individual pieces of this data science stack, but only with the Python Data Science Handbook do you get them allÃIPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and other related tools.Working scientists and data crunchers familiar with reading and writing Python code will find this comprehensive desk reference ideal for tackling day-to-day issues: manipulating, transforming, and cleaning data; visualizing different types of data; and using data to build statistical or machine learning models. Quite simply, this is the must-have reference for scientific computing in Python.
With this handbook, you'll learn how to use:
- IPython and Jupyter: provide computational environments for data scientists using Python
- NumPy: includes the ndarray for efficient storage and manipulation of dense data arrays in Python
- Pandas: features the DataFrame for efficient storage and manipulation of labeled/columnar data in Python
- Matplotlib: includes capabilities for a flexible range of data visualizations in Python
- Scikit-Learn: for efficient and clean Python implementations of the most important and established machine learning algorithms
About the Author
Jake VanderPlas is a long-time user and developer of the Python scientific stack. He finished his PhD in Astronomy at the University of Washington in 2012, and currently works as an interdisciplinary research director at the University of Washington, conducts his own astronomy research, and spends time advising and consulting with local scientists from a wide range of fields.
Table of Contents
1. Chapter 1 IPython: Beyond Normal Python
2. Chapter 2 Introduction to NumPy
3. Chapter 3 Data Manipulation with Pandas
4. Chapter 4 Visualization with Matplotlib
5. Chapter 5 Machine Learning