Stars
Learn how to design, develop, deploy and iterate on production-grade ML applications.
My best practices for training on large datasets using PyTorch.
An extension of XGBoost to probabilistic modelling
PyTorch implementation of the Quasi-Recurrent Neural Network - up to 16 times faster than NVIDIA's cuDNN LSTM
Implementation of the L-LDA model (Labeled Latent Dirichlet Allocation) in Python
scikit-learn gradient-boosting-model interactions
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
renatoviolin / xlnet
Forked from zihangdai/xlnet
XLNet: fine tuning on RTX 2080 GPU - 8 GB
Official public repository for PM4Py (Process Mining for Python) — an open-source library for exploring, analyzing, and optimizing business processes with Python.
scikit-learn: machine learning in Python
Visualizing Deep Neural Network Decisions: Prediction Difference Analysis
Deep Learning papers reading roadmap for anyone who is eager to learn this amazing tech!
Presentations from H2O meetups & conferences by the H2O.ai team
Scalable Data Science, course sets in big data using Apache Spark over Databricks and their mathematical, statistical and computational foundations using SageMath.
Learn Julia the hard way!
An intuitive library to add plotting functionality to scikit-learn objects.
Python Library for Causal and Probabilistic Modeling using Bayesian Networks
The most cited deep learning papers
TensorFlow Tutorial and Examples for Beginners (support TF v1 & v2)
A library of sklearn compatible categorical variable encoders
Scikit-learn compatible implementations of the Random Rotation Ensemble idea of (Blaser & Fryzlewicz, 2016)
Adaptative Hybrid Extreme Rotation Forest (AdaHERF)
Distributed Deep learning with Keras & Spark
A Python implementation of global optimization with Gaussian processes.
Greedy feature selection based on ROC AUC
2nd place solution for Allstate Claims Severity competition at Kaggle
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials,…
120+ interactive Python coding interview challenges (algorithms and data structures). Includes Anki flashcards.