One of the main benefits of Python is the vast array of pre-existing packages (also called libraries), written by other Python users and available for installation. You can find Python packages on PyPI, the Python Package Index.
If you use the Anaconda distribution of Python, many libraries come pre-installed. This tutorial covers the steps needed to install additional packages.
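Before installing anything, it can help to check whether a package is already available in your environment. The snippet below is a minimal sketch that uses the standard library's `importlib.util.find_spec`. The helper name `is_importable` and the module names in the loop are illustrative choices, not part of any particular distribution.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the import system can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Illustrative module names; replace them with the ones you care about.
for name in ("numpy", "pandas", "json"):
    status = "available" if is_importable(name) else "not installed"
    print(f"{name}: {status}")
```

This check uses the name you would write after `import`, which is not always the same as the name a package is published under on PyPI.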
Here are some resources for popular data science Python libraries:
- Bokeh - Bokeh helps you build beautiful graphics, ranging from simple plots to complex dashboards with streaming datasets
- Keras - Keras focuses on debugging speed, code elegance & conciseness, maintainability, and deployability.
- Matplotlib - Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python
- NLTK - NLTK is a leading platform for building Python programs to work with human language data
- NumPy - The fundamental package for scientific computing with Python
- pandas - pandas provides high-performance, easy-to-use data structures and data analysis tools
- SciPy - Fundamental algorithms for scientific computing in Python
- scikit-learn - Simple and efficient tools for predictive data analysis
- Scrapy - An open source and collaborative framework for extracting the data you need from websites
- Seaborn - Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics
- statsmodels - statsmodels provides classes and functions for the estimation of many different statistical models, as well as for conducting statistical tests and statistical data exploration
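To see which of these libraries are present in your current environment, and at what version, one option is the standard library's `importlib.metadata` module (Python 3.8 and later). The sketch below is one way to do it. The `DISTRIBUTIONS` list and the `installed_versions` helper are names chosen here for illustration, and the entries are the PyPI distribution names as best understood, which can differ from the display names in the list.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed PyPI distribution names for the libraries listed above.
DISTRIBUTIONS = [
    "bokeh",
    "keras",
    "matplotlib",
    "nltk",
    "numpy",
    "pandas",
    "scipy",
    "scikit-learn",
    "scrapy",
    "seaborn",
    "statsmodels",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(DISTRIBUTIONS).items():
    print(f"{name}: {ver or 'not installed'}")
```

The output depends on your environment. With Anaconda, many of these will likely already report a version.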