🏠
Working from home

smougel/README.md

Hi there 👋

  • 🔭 I’m the founder of https://www.seocopilot.fr : helping companies improve their SEO organically
  • 🌱 I’m currently learning about Data Science / AI / Deep Learning / Machine learning (Never ending process)
  • 👯 I’m looking to collaborate on anything with a meaningful purpose, when I have time!
  • 📫 How to reach me: https://www.linkedin.com/in/smougel/
  • 😄 Pronouns: He/His

I have 20 years of experience as a software engineer (full-stack developer), and I am now adding a new card to my deck as a Data Scientist / AI Engineer. I love learning new things about deep learning, CNNs and sequence models (many thanks to Andrew Ng and Coursera).

  • 💻 Programming languages : Python, JavaScript, PHP
  • 🗂️ Databases : MariaDB / MySQL / Redis
  • 💍 Front : React / Redux / CSS / HTML5
  • 🔧 Web frameworks : Symfony / Laravel / CodeIgniter
  • 🔩 DB frameworks : Active Record, Doctrine ORM
  • ⚙️ Backend : Writing workers & daemons
  • ⌚ Load & Queue management : Beanstalkd
  • 📊 DataViz : Matplotlib, Plotly, Seaborn
  • 🧪 Data science : Pandas, NumPy, scikit-learn, TensorFlow, PyTorch, Keras

Projects

I will open source them as soon as possible.

Exploratory Data Analysis (Data for good): 🌳 🌲 🌱 🏢 Paris Trees 🌳 🌲 🌱

Goal : Help the city of Paris become a smart city.

Optimization of tree maintenance

Data source : opendata.paris.fr

My work : https://github.com/smougel/eda_paris_tree (Notebook & Presentation)

Exploratory Data Analysis (Open food facts) 🍕 🍇 🍓 🧀 🍔 🍫

Goal : Analyse and find healthy products / Inform people about nutritional metrics

My work : https://github.com/smougel/eda_open_food_facts (Notebook & Presentation)

Credit scoring ✒️ 💯

Build a model to detect people able to repay their loan... or not...

💡 Process :

  • exploratory data analysis
  • data cleaning
  • feature engineering
  • sampling / train & test split
  • model training : SVM, Neural Networks, Logistic Regression, Random forest
  • variable importance evaluation with LIME

📏 Metrics : Precision / Recall / F1-Score
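As a reminder of what these three metrics measure, here is a minimal, dependency-free sketch. It is not the project's code (which relies on Scikit-learn); the function name `precision_recall_f1` and the toy labels are made up for illustration.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up labels: 1 = loan not repaid, 0 = loan repaid.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

Precision penalizes false alarms, recall penalizes missed defaults, and F1 is their harmonic mean, which is why it is a common single number for imbalanced problems such as credit scoring.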

🪛 Hyperparameter tuning with grid-search and cross-validation
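The idea behind grid search with cross-validation can be sketched without any library. This is only an illustration under stated assumptions: the project itself uses Scikit-learn, whereas the sketch below swaps in a tiny one-dimensional k-nearest-neighbours classifier, and every name (`k_fold_indices`, `knn_predict`, `cv_accuracy`, `grid_search`) and data point is hypothetical.

```python
from collections import Counter


def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, val_idx) pairs; sample i belongs to fold i % n_folds."""
    for fold in range(n_folds):
        val_idx = [i for i in range(n_samples) if i % n_folds == fold]
        train_idx = [i for i in range(n_samples) if i % n_folds != fold]
        yield train_idx, val_idx


def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def cv_accuracy(xs, ys, k, n_folds):
    """Share of samples predicted correctly when each one is held out once."""
    correct = 0
    for train_idx, val_idx in k_fold_indices(len(xs), n_folds):
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        correct += sum(knn_predict(train_x, train_y, xs[i], k) == ys[i] for i in val_idx)
    return correct / len(xs)


def grid_search(xs, ys, grid, n_folds=3):
    """Evaluate every candidate k and keep the one with the best CV accuracy."""
    results = {k: cv_accuracy(xs, ys, k, n_folds) for k in grid}
    best_k = max(results, key=results.get)
    return best_k, results


# Made-up 1-D feature with two classes; the last sample is deliberately mislabeled.
xs = [0.10, 0.90, 0.20, 0.80, 0.30, 0.70, 0.15, 0.85, 0.25, 0.75, 0.35, 0.65, 0.22]
ys = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1]
best_k, results = grid_search(xs, ys, grid=[1, 3, 5])
print(best_k, results)
```

Each candidate value is scored only on data it was not fitted on, so the selected hyperparameter is less likely to be an artifact of one lucky split.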

Tech used : Python, Scikit-learn, Matplotlib, Seaborn

Project : https://github.com/smougel/credit_scoring/tree/master (Notebook)

Customer segmentation for e-commerce 👨 👧 🧓 👽 🙆

Unsupervised learning task

Dataset : Olist.com

Gain insights about user behavior and discover buyer characteristics

💡 Process :

  • exploratory data analysis
  • data wrangling
  • feature engineering
  • dimensionality reduction : Principal Component Analysis
  • clustering : k-means, DBSCAN
  • elbow method
  • High-dimensionality visualization : t-SNE, UMAP
  • Analysis of cluster stability

📏 Metrics : Adjusted Rand Index (ARI)
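For cluster stability, the ARI compares two labelings of the same customers (for example, two k-means runs) and ignores how the clusters are named. Below is a minimal sketch of the standard pair-counting formula using only the standard library; the function name and toy labels are illustrative assumptions, not code from the project.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same samples."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same samples")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: nothing to disagree on
    # Pairs of samples grouped together in both labelings, in A only, in B only.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / total_pairs  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings use a single cluster
    return (sum_cells - expected) / (max_index - expected)


# Same partition with renamed clusters: perfect agreement.
same = adjusted_rand_index([0, 0, 1, 1, 2, 2], [1, 1, 2, 2, 0, 0])
# Partitions that split every pair apart: worse than chance.
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(same, crossed)
```

A score near 1 across repeated runs suggests the segmentation is stable; scores near 0 mean the agreement is no better than random assignment.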

AI for Restaurants 🍽️ 🍝

😡 Customer dissatisfaction discovery

🪄 Automatic photo classification : Menu card, Food picture 🍝, Outdoor picture

Dataset : https://www.yelp.com/dataset

💡 Process :

  • exploratory data analysis
  • data wrangling for NLP (🤬 stop words, lemmatization, stemming, tokenization)
  • data wrangling for photos (contrast normalization, resizing, noise filtering)
  • Topic discovery : LDA
  • Convolutional neural networks
  • Regularization (Dropout)
  • Exploration of filters learned by the CNN (Thanks to François Chollet)
  • Use of Yelp API

Tech used : TensorFlow, NLP, Sequence model, LSTM, CNN, Keras, OpenCV

Bad Buzz Detection in comments 🗣️ 👍 / 👎

Dataset : https://www.kaggle.com/kazanova/sentiment140 (1.6M Tweets)

Goal : Sentiment analysis from tweets. Benchmark with Microsoft Azure Sentiment Analysis.

💡 Process :

  • exploratory data analysis
  • data wrangling (lemmatization / stemming / tokenization)
  • modeling (Basic to advanced : logistic regression, TF-IDF, LSTM)
  • benchmark with Azure Machine Learning Services
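To make the TF-IDF step concrete, here is a minimal sketch of the plain weighting (term frequency times `log(N / df)`), with whitespace tokenization and no smoothing. These choices are assumptions for illustration; library implementations usually add smoothing and normalization, and the example tweets are invented.

```python
from collections import Counter
from math import log


def tf_idf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Invented tweets, already lower-cased and stripped of punctuation.
tweets = ["great flight great crew", "flight delayed again", "great service"]
weights = tf_idf(tweets)
print(weights[0])
```

Terms that appear in every document get a weight of zero, which is what lets a linear model such as logistic regression focus on the words that actually discriminate between positive and negative tweets.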

🪛 Hyperparameter tuning

📏 Metrics : F-Beta Score
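The F-beta score generalizes F1 by weighting recall `beta` times as much as precision. A minimal sketch from confusion-matrix counts follows; the function name and the counts are made up for illustration and are not results from the project.

```python
def fbeta_from_counts(tp, fp, fn, beta=1.0):
    """F-beta from true positives, false positives and false negatives."""
    b2 = beta * beta
    denominator = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denominator if denominator else 0.0


# Made-up counts: precision = 6 / 8 = 0.75, recall = 6 / 10 = 0.6.
f1 = fbeta_from_counts(tp=6, fp=2, fn=4, beta=1.0)
f2 = fbeta_from_counts(tp=6, fp=2, fn=4, beta=2.0)      # leans toward recall
f_half = fbeta_from_counts(tp=6, fp=2, fn=4, beta=0.5)  # leans toward precision
print(f1, f2, f_half)
```

For bad buzz detection, a `beta` above 1 would favour catching more negative comments at the cost of more false alarms; which value fits best depends on the cost of a missed bad buzz.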

Tech used : Word embeddings (Word2Vec & FastText), TensorFlow & Keras

Image segmentation for autonomous driving 🤖 ❤️ 🚗

Dataset : Cityscapes

💡 Process :

  • exploratory data analysis
  • data wrangling (Picture to binary mask)
  • data augmentation (Random cropping, flipping, mirroring)
  • modeling (Basic to advanced : Fully connected layers to U-Net architecture)
  • ☁️ training in the cloud (w/ Microsoft Azure : compute instance provisioning)
  • Model serving via Flask API hosted on Microsoft Azure

🪛 Hyperparameter tuning

📏 Metrics : Jaccard index
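The Jaccard index (intersection over union) measures how well a predicted mask overlaps the ground truth. Here is a minimal sketch for a single binary mask stored as nested lists; the function name and the 3x3 masks are illustrative assumptions, and a real pipeline would typically compute this on tensors, per class.

```python
def jaccard_index(mask_true, mask_pred):
    """Intersection over union of two equal-sized binary masks (lists of rows)."""
    intersection = 0
    union = 0
    for row_true, row_pred in zip(mask_true, mask_pred):
        for t, p in zip(row_true, row_pred):
            intersection += 1 if (t and p) else 0
            union += 1 if (t or p) else 0
    # Two empty masks agree perfectly by convention.
    return intersection / union if union else 1.0


# Made-up 3x3 masks: 1 = pixel belongs to the class (e.g. "road"), 0 = background.
mask_true = [[0, 1, 1],
             [0, 1, 1],
             [0, 0, 0]]
mask_pred = [[0, 0, 1],
             [0, 1, 1],
             [0, 1, 0]]
iou = jaccard_index(mask_true, mask_pred)
print(iou)
```

Unlike pixel accuracy, this score is not inflated by large background regions, which matters for street scenes where small classes occupy few pixels.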

Tech used : TensorFlow, Keras, CNN, U-Net, Flask, Azure Services

Content recommendation for news reading 🧚 🪄 📚

Dataset : News Portal User Interactions by Globo.com https://www.kaggle.com/gspmoreira/news-portal-user-interactions-by-globocom#clicks_sample.csv

💡 Process :

  • exploratory data analysis (w/ t-SNE visualization of news embeddings)
  • data wrangling
  • modeling : content filtering and collaborative filtering
  • ☁️ training in the cloud (w/ Microsoft Azure : compute instance provisioning)
  • Use of serverless Azure Function for model serving / Azure Storage
  • Integration with a Node.js mobile app

📏 Metrics : Similarity measure (dot product, cosine)
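A content-based recommender can rank unread articles by the cosine similarity between a user profile vector and each article embedding. The sketch below assumes tiny 2-D embeddings and hypothetical names (`cosine_similarity`, `recommend`, article ids `a` to `d`); it illustrates the similarity measure only and is not the project's model.

```python
from math import sqrt


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    """Dot product normalized by both vector lengths (0.0 for a zero vector)."""
    norm = sqrt(dot(u, u)) * sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def recommend(user_vector, article_embeddings, already_read, k=2):
    """Return the ids of the k unread articles most similar to the user vector."""
    scored = [
        (cosine_similarity(user_vector, embedding), article_id)
        for article_id, embedding in article_embeddings.items()
        if article_id not in already_read
    ]
    scored.sort(reverse=True)
    return [article_id for _, article_id in scored[:k]]


# Made-up 2-D embeddings; the user profile is the embedding of the article read.
articles = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.5, 0.5]}
suggestions = recommend(articles["a"], articles, already_read={"a"}, k=2)
print(suggestions)
```

The plain dot product also rewards vectors with a large norm, while cosine similarity only looks at direction; which one works better depends on whether embedding magnitude carries meaning, such as popularity.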

Tech used : TensorFlow, Sparse Tensor, Matrix factorization, Azure Services

Chatbot for vacation booking 🤖 🪄 🌴 ☀️

Dataset : Microsoft frames dataset (Dialogs between two humans via a chat interface) https://www.microsoft.com/en-us/research/project/frames-dataset/

💡 Process :

  • exploratory data analysis (w/ t-SNE visualization of news embeddings)
  • data wrangling
  • LUIS Training
  • Integration w/ Microsoft bot framework

📏 Metrics : Similarity measure (dot product, cosine)

Tech used : Microsoft LUIS, Microsoft Bot Framework, Azure Application Insights, Unit Testing

Pinned

  1. eda_open_food_facts (Public)

    Jupyter Notebook

  2. eda_paris_tree (Public)

    EDA for Paris trees

    Jupyter Notebook

  3. pytorch-Deep-Learning (Public)

    Forked from Atcold/NYU-DLSP20

    Deep Learning (with PyTorch)

    Jupyter Notebook

  4. content_recommendation (Public)

    Jupyter Notebook

  5. travelbot (Public)

    Travel chatbot developed with Azure Machine Learning / LUIS / Microsoft Chatbot Framework Builder

    Python

  6. batteurMDR/ggj2015 (Public)

    JavaScript