
Pythonic Resources
The list below is a collection of Python packages for visualization and several machine learning tasks, compiled by Thomas L Vincent.
Data Visualization
- matplotlib: a comprehensive library for creating static, animated, and interactive visualizations in Python.
- pandas-profiling: creates HTML profiling reports from pandas DataFrame objects.
General Purpose (Tabular) Machine Learning
- annoy: approximate nearest neighbors in C++/Python, optimized for memory usage and loading/saving to disk.
- imbalanced-learn: a Python package to tackle the curse of imbalanced datasets in machine learning.
- hummingbird: a library for compiling trained traditional ML models into tensor computations.
- metric-learn: efficient Python implementations of several popular supervised and weakly-supervised metric learning algorithms.
- pyBrain: a Python library to develop and implement neural networks. Documentation: http://pybrain.org/docs/index.html
- scikit-learn: a multi-purpose machine learning library in Python.
- statsmodels: a Python module that provides classes and functions for estimating many different statistical models, conducting statistical tests, and exploring statistical data.
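As an illustration of the class-imbalance problem mentioned for imbalanced-learn, the sketch below shows random oversampling, one simple resampling strategy. It uses only the standard library and is not the imbalanced-learn API; the function name `random_oversample` and the toy data are assumptions made for this example.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate random minority-class rows until every class has as many
    rows as the largest class. Hypothetical helper, for illustration only."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        # Rows that belong to this class in the original data.
        pool = [r for r, l in zip(rows, labels) if l == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels


# Toy dataset: four "neg" rows and one "pos" row.
rows = [[0.1], [0.2], [0.3], [0.4], [0.9]]
labels = ["neg", "neg", "neg", "neg", "pos"]

balanced_rows, balanced_labels = random_oversample(rows, labels)
balanced_counts = Counter(balanced_labels)
```

After resampling, both classes in `balanced_counts` have the same number of rows. Oversampling should normally be applied to the training split only, so that duplicated rows do not leak into evaluation data.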
ML Explainability and Feature Interpretation
- shap: a game theoretic approach to explain the output of any machine learning model.
- yellowbrick: a Python library that provides a suite of visual analysis and diagnostic tools to facilitate machine learning model selection.
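To make the "game theoretic approach" behind shap more concrete, here is a minimal sketch that computes exact Shapley values for a tiny toy model by averaging each feature's marginal contribution over every ordering of the features. It does not use the shap library; `shapley_values`, `toy_model`, and the feature names are hypothetical, and this brute-force approach is only practical for a handful of features.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: average each feature's marginal contribution
    over all orderings in which features can be added."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value_fn(frozenset(present))
            present.add(f)
            after = value_fn(frozenset(present))
            totals[f] += after - before
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_model(present):
    """Hypothetical model output given the set of features that are 'present'."""
    score = 0.0
    if "age" in present:
        score += 2.0
    if "income" in present:
        score += 1.0
    if "age" in present and "income" in present:
        score += 1.0  # interaction term shared by both features
    return score


phi = shapley_values(toy_model, ["age", "income", "zip"])
```

In this toy setup the attributions in `phi` sum to the difference between the model output with all features present and the output with none, which is the efficiency property that makes Shapley values attractive for explanations. The unused `zip` feature receives an attribution of zero.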
Hyper-parameter Optimization
- scikit-optimize: a simple and efficient library that implements several methods for sequential model-based optimization.
Time Series
- statsforecast: lightning fast forecasting with statistical and econometric models.
Survival Analysis
- scikit-survival: a Python module for survival analysis built on top of scikit-learn.
- pysurvival: an open source Python package for survival analysis modeling, built upon commonly used machine learning packages such as NumPy, SciPy and PyTorch.
Causal Inference
- scikit-uplift: an uplift modeling Python package that provides fast sklearn-style model implementations, evaluation metrics and visualization tools.
Recommendation & Ranking
- python-recsys: a Python library for implementing a recommender system.
Natural Language Processing
- bert-embedding: token-level embeddings from the BERT model on mxnet and gluonnlp.
- fuzzywuzzy: fuzzy string matching in Python.
- Stanford CoreNLP Python: a Python wrapper for Stanford CoreNLP tools.
- transformers: provides thousands of pretrained models to perform tasks on different modalities such as text, vision, and audio.
- vaderSentiment: a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains.
- word_forms: accurately generates all possible forms of an English word.
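As a rough idea of what fuzzy string matching looks like in practice, the sketch below scores string similarity on a 0 to 100 scale and picks the closest candidate from a list. It relies on the standard library's `difflib` rather than fuzzywuzzy itself; the helper names `similarity` and `best_match` and the sample strings are assumptions made for this example.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0-100 similarity score between two strings, ignoring case."""
    ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return round(100 * ratio)


def best_match(query, choices):
    """Return the candidate string most similar to the query."""
    return max(choices, key=lambda choice: similarity(query, choice))


cities = ["New York", "Newark", "Los Angeles"]
match = best_match("new yrok", cities)  # misspelled query
```

Here `match` is the candidate that shares the most matching characters with the misspelled query. A dedicated library is usually a better fit for large candidate lists or for token-based comparisons.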
Computer Vision
Miscellaneous
- Google Python Style Guide: a good style guide to follow when developing in Python.
- black: the uncompromising Python code formatter.