Data Science Hub

Pythonic Resources

Last updated 1 year ago

The list below is a collection of Python packages used for visualization and several ML tasks, compiled by Thomas L Vincent.

Data Visualization

  • Altair - a declarative statistical visualization library for Python, based on Vega and Vega-Lite. The documentation for this library can be found here.

  • Bokeh - a Python library for creating interactive visualizations for modern web browsers. The documentation for this library can be found here.

  • Dash - written on top of Plotly.js and React.js, Dash is ideal for building and deploying data apps with customized user interfaces. The documentation for this library can be found here.

  • diagrams - lets you draw the cloud system architecture in Python code. The documentation for this library can be found here.

  • folium - makes it easy to visualize data that’s been manipulated in Python on an interactive Leaflet map. The documentation for this library can be found here.

  • igraph - a library for creating and manipulating graphs. It is intended to be as powerful (i.e., fast) as possible to enable the analysis of large graphs. The documentation for this library can be found here.

  • matplotlib - a comprehensive library for creating static, animated, and interactive visualizations in Python. The documentation for this library can be found here.

  • networkx - a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. The documentation for this library can be found here.

  • pandas-profiling - creates HTML profiling reports from pandas DataFrame objects. The documentation for this library can be found here.

  • Plotly - Plotly’s Python graphing library makes interactive, publication-quality graphs. The documentation for this library can be found here.

  • plotnine - an implementation of a grammar of graphics in Python, based on R’s ggplot2 library. The documentation for this library can be found here.

  • seaborn - a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics. The documentation for this library can be found here.

  • Streamlit - turns data scripts into shareable web apps in minutes, all in pure Python, with no front-end experience required. The documentation for this library can be found here.

General Purpose (Tabular) Machine Learning

  • annoy - approximate nearest neighbors in C++/Python, optimized for memory usage and loading/saving to disk.

  • imbalanced-learn - a Python package to tackle the curse of imbalanced datasets in machine learning. The documentation for this library can be found here.

  • hummingbird - a library for compiling trained traditional ML models into tensor computations. The documentation for this library can be found here.

  • lifetimes - a Python library to help model customer behavior and measure Customer Lifetime Value. The documentation for this library can be found here.

  • metric-learn - efficient Python implementations of several popular supervised and weakly-supervised metric learning algorithms. The documentation for this library can be found here.

  • milk - a machine learning toolkit in Python with a strong emphasis on speed and low memory usage. Its focus is on supervised classification, with several classifiers available: SVMs (based on libsvm), k-NN, random forests, and decision trees. The documentation for this library can be found here.

  • pyBrain - a Python library to develop and implement neural networks. The documentation for this library can be found at http://pybrain.org/docs/index.html.

  • pycaret - a low-code machine learning library in Python that automates machine learning workflows. The documentation for this library can be found here.

  • pymc3 - a probabilistic programming library for Python that allows users to build Bayesian models with a simple Python API. The documentation for this library can be found here.

  • scikit-learn - a multi-purpose machine learning library in Python. The documentation for this library can be found here.

  • statsmodels - a Python module that provides classes and functions for estimating many different statistical models, conducting statistical tests, and exploring statistical data. The documentation for this library can be found here.

  • XGBoost - an optimized distributed gradient boosting library designed to be highly efficient, flexible, and portable. The documentation for this library can be found here.

ML Explainability and Feature Interpretation

  • eli5 - a Python library for debugging/inspecting machine learning classifiers and explaining their predictions. The documentation for this library can be found here.

  • lime - a Python library to help explain the predictions of any machine learning classifier. A more thorough explanation of the methodology is available here.

  • omniXAI - a Python machine-learning library for explainable AI (XAI), offering omni-way explainable AI and interpretable machine learning capabilities. The documentation for this library can be found here.

  • shap - a game theoretic approach to explain the output of any machine learning model.

  • yellowbrick - a Python library that provides a suite of visual analysis and diagnostic tools to facilitate machine learning model selection. The documentation for this library can be found here.

Hyper-parameter Optimization

  • hyperopt - distributed asynchronous hyper-parameter optimization. The documentation for this library can be found here.

  • optuna - an open source hyperparameter optimization framework to automate hyperparameter search. The documentation for this library can be found here.

  • ray - an open source framework packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library. The documentation for this library can be found here.

  • scikit-optimize - a simple and efficient library that implements several methods for sequential model-based optimization. The documentation for this library can be found here.

  • tpot - a Python automated machine learning tool that optimizes machine learning pipelines using genetic programming. The documentation for this library can be found here.

Time Series

  • Auto_TS - automatically builds ARIMA, SARIMAX, VAR, FB Prophet, and XGBoost models on time series data sets with a single line of code. The documentation for this library can be found here.

  • darts - a Python library for easy manipulation and forecasting of time series. It contains a variety of models, from classics such as ARIMA to deep neural networks. The documentation for this library can be found here.

  • luminol - a lightweight Python library for time series data analysis. The two major functionalities it supports are anomaly detection and correlation. The documentation for this library can be found here.

  • Prophet - a library for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. The documentation for this library can be found here.

  • sktime - provides an easy-to-use, flexible, and modular open-source framework for a wide range of time series machine learning tasks. The documentation for this library can be found here.

  • statsforecast - lightning fast forecasting with statistical and econometric models. The documentation for this library can be found here.

  • tsfresh - automates the extraction of relevant features from time series data. The documentation for this library can be found here.

  • pyod - a Python toolkit that provides access to a wide range of algorithms for detecting outliers in multivariate data. The documentation for this library can be found here.

  • pyts - a Python package dedicated to time series classification. It aims to make time series classification easily accessible by providing preprocessing and utility tools, and implementations of several time series classification algorithms. The documentation for this library can be found here.

Survival Analysis

  • lifelines - a complete survival analysis library, written in pure Python. The documentation for this library can be found here.

  • scikit-survival - a Python module for survival analysis built on top of scikit-learn. The documentation for this library can be found here.

  • pysurvival - an open source Python package for survival analysis modeling, built upon commonly used machine learning packages such as NumPy, SciPy, and PyTorch. The documentation for this library can be found here.

Causal Inference

  • Causal ML - provides a suite of uplift modeling and causal inference methods that allow users to estimate the Conditional Average Treatment Effect (CATE) or Individual Treatment Effect (ITE) from experimental or observational data. The documentation for this library can be found here.

  • doWhy - an end-to-end library for causal inference. The documentation for this library can be found here.

  • EconML - applies machine learning techniques to estimate individualized causal responses from observational or experimental data. The suite of estimation methods provided in EconML represents the latest advances in causal machine learning. The documentation for this library can be found here.

  • scikit-uplift - an uplift modeling Python package that provides fast sklearn-style model implementations, evaluation metrics, and visualization tools. The documentation for this library can be found here.

Recommendation & Ranking

  • lightFM - a Python implementation of a number of popular recommendation algorithms for both implicit and explicit feedback. The documentation for this library can be found here.

  • surprise - a Python scikit for building and analyzing recommender systems. The documentation for this library can be found here.

  • pyTerrier - a Python framework for performing information retrieval experiments and implementing learn-to-rank pipelines. The documentation for this library can be found here.

  • python-recsys - a Python library for implementing a recommender system. The documentation for this library can be found here.

Natural Language Processing

  • allennlp - an open-source NLP research library, built on PyTorch. The documentation for this library can be found here.

  • bert-embedding - token-level embeddings from the BERT model on mxnet and gluonnlp. The documentation for this library can be found here.

  • fastText - a library for efficient learning of word representations and sentence classification. The documentation for this library can be found here.

  • flair - a very simple framework for NLP that ships with state-of-the-art models for a range of NLP tasks. The documentation for this library can be found here.

  • fuzzywuzzy - fuzzy string matching in Python. The documentation for this library can be found here.

  • gensim - a free open-source Python library for representing documents as semantic vectors. The documentation for this library can be found here.

  • NLTK - a leading platform for building Python programs to work with human language data. The documentation for this library can be found here.

  • Stanford CoreNLP Python - a Python wrapper for Stanford CoreNLP tools.

  • spacy - a library for advanced natural language processing in Python and Cython. The documentation for this library can be found here.

  • stanza - the Stanford NLP Group’s official Python NLP library. It contains support for running various accurate natural language processing tools on 60+ languages and for accessing the Java Stanford CoreNLP software from Python. The documentation for this library can be found here.

  • textblob - a Python library for processing textual data. It provides a consistent API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging and noun phrase extraction. The documentation for this library can be found here.

  • transformers - provides thousands of pretrained models to perform tasks on different modalities such as text, vision, and audio. The documentation for this library can be found here.

  • pattern - a web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis, and visualization. The documentation for this library can be found here.

  • polyglot - supports various multilingual applications and offers a wide range of analysis and broad language coverage. The documentation for this library can be found here.

  • vaderSentiment - a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains.

  • word_forms - accurately generates all possible forms of an English word. The documentation for this library can be found here.

Computer Vision

  • NiLearn - makes it easy to use many advanced machine learning, pattern recognition, and multivariate statistical techniques on neuroimaging data. The documentation for this library can be found here.

  • OpenCV - an open-source library that includes several hundred computer vision algorithms. The documentation for this library can be found here.

Miscellaneous

  • Google Python Style Guide - a good style guide to follow when developing in Python.

  • black - the uncompromising Python code formatter.
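Since this page only lists packages, a practical first step is to check which of them are already installed. The snippet below is a minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+). The helper name `installed_versions` and the `candidates` selection are hypothetical, and the distribution names are assumptions about how the packages are published; they can differ from the labels used on this page.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in distributions:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in the current environment.
            found[name] = None
    return found


if __name__ == "__main__":
    # Hypothetical selection; the names are assumed distribution names and
    # may not match the labels used in the list on this page.
    candidates = ["altair", "bokeh", "seaborn", "scikit-learn", "statsmodels", "xgboost"]
    for name, ver in installed_versions(candidates).items():
        print(f"{name:15s} {ver or 'not installed'}")
```

The helper returns `None` rather than raising, so a single missing package does not interrupt the report.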