.. _l-expoinfo2a:

Presentations (2A)
==================

*in preparation*

.. contents::
    :local:
    :depth: 2

:ref:`Main course page `

Principle
+++++++++

A single course cannot cover the many aspects of machine learning.
Not all of them are useful day to day in a data scientist's work,
and research in the field is very active, so writing a course that
stays up to date is wishful thinking. It is important to stay curious
and to learn on your own. The goal of this presentation is to spend
10 to 15 minutes on a topic or a problem not covered in class.
It is prepared in a group and relies on reading articles
or library documentation. The presentation must include:

* A summary of what you read.
* The problem the presented method addresses.
* Are there libraries that implement the method?
  If so, are they easy to use? If not, is the algorithm simple to implement?
* Do you have any criticisms of the method?

The presentation may cover one of the topics below or another
of your choice, subject to approval. During the presentation,
every member of the group must speak.

Topics - 2018
+++++++++++++

*Topics chosen in 2018*

Missing value imputation
^^^^^^^^^^^^^^^^^^^^^^^^

* `Imputation de données manquantes `_
* `Scalable Tensor Factorizations for Incomplete Data `_
* `Much ado about nothing: A comparison of missing data methods and software to fit incomplete data regression models `_

Linear regression and variable selection
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* `Sélection bayésienne de variables en régression linéaire `_
* `Régression linéaire bayésienne `_
* `bayesian_linear_regression.py `_

Gradient descent
^^^^^^^^^^^^^^^^

* `Dual Principal Component Pursuit `_
* `Adam: A Method for Stochastic Optimization `_
* `HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent `_
* `Sparse Online Learning via Truncated Gradient `_

Clustering
^^^^^^^^^^

* `Consensus Clustering `_
* `A Survey of Clustering Ensemble Algorithms `_

Privacy
^^^^^^^

* `k-anonymity `_
* `A General Survey of Privacy-Preserving Data Mining Models and Algorithms `_
* `Privacy Preserving Data Mining `_

Improved Random Forests
^^^^^^^^^^^^^^^^^^^^^^^

* `Random Rotation Ensembles `_
* `Extremely randomized trees `_
* `Learning to rank with extremely randomized trees `_
* `Mondrian Forests: Efficient Online Random Forests `_

Factorization Machines
^^^^^^^^^^^^^^^^^^^^^^

* `Factorization Machines `_
* `Field-aware Factorization Machines in a Real-world Online Advertising System `_
* `Contextual and Position-Aware Factorization Machines for Sentiment Classification `_

Non-classical linear regressions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* `Intelligible Models for Classification and Regression `_
* `Isotonic regression `_
* `Online Isotonic Regression `_
* `Iteratively reweighted least squares `_
* `RANSAC `_
* `Multivariate Convex Regression with Adaptive Partitioning `_
* `Lattice Regression `_

Community detection
^^^^^^^^^^^^^^^^^^^

* `Fast unfolding of communities in large networks `_
* `Partitioning Well-Clustered Graphs: Spectral Clustering Works! `_
* `A Spectral Algorithm with Additive Clustering for the Recovery of Overlapping Communities in Networks `_

Yield management
^^^^^^^^^^^^^^^^

* `Le yield management pour les nuls `_
* `Machine-learning pour la prédiction des prix dans le secteur du tourisme en ligne `_
* `Yield Management at American Airlines `_
* `Perishability of Data: Dynamic Pricing under Varying-Coefficient Models `_

Auctions
^^^^^^^^

* `Learning Algorithms for Second-Price Auctions with Reserve `_
* `Learning Simple Auctions `_
* `A Structural Model of Sponsored Search Advertising Auctions `_
* `Bayesian Methods for Media Mix Modeling with Carryover and Shape Effects `_

Spiking Neural Networks
^^^^^^^^^^^^^^^^^^^^^^^

* `Spiking neural networks, an introduction `_
* `A Minimal Spiking Neural Network to Rapidly Train and Classify Handwritten Digits in Binary and 10-Digit Tasks `_
* `Training Deep Spiking Neural Networks Using Backpropagation `_
* `Spiking Neural Networks: Principles and Challenges `_
* `Python Tutorial: How to Write a Spiking Neural Network Simulation From Scratch `_

*Topics not chosen in 2018*

LIME method
^^^^^^^^^^^

* `LIME `_
* `"Why Should I Trust You?": Explaining the Predictions of Any Classifier `_
* `Defining Locality for Surrogates in Post-hoc Interpretablity `_
* module `eli5 `_

Multi-label classification
^^^^^^^^^^^^^^^^^^^^^^^^^^

* `Multiclass-Multilabel Classification with More Classes than Examples `_
* `A Ranking-based KNN Approach for Multi-Label Classification `_

Bandits
^^^^^^^

* `Learning to Interact `_
* `Thompson Sampling with the Online Bootstrap `_

Outliers
^^^^^^^^

* `BoostClean: Automated Error Detection and Repair for Machine Learning `_
* `Outlier Detection Techniques `_
* `RANSAC `_
* `Scorpion: Explaining Away Outliers in Aggregate Queries `_

Anomaly detection
^^^^^^^^^^^^^^^^^

* `Robust Random Cut Forest Based Anomaly Detection On Streams `_

Education
^^^^^^^^^

* `Multi-Armed Bandits for Intelligent Tutoring Systems `_
* `Object learning through active exploration `_

Bias detection
^^^^^^^^^^^^^^

* `On Over-fitting in Model Selection and Subsequent Selection Bias in Performance Evaluation `_
* `Learning Theory of Distributed Regression with Bias Corrected Regularization Kernel Network `_
* `Identifying Significant Predictive Bias in Classifiers `_
* `On the reduction of biases in Big Data sets for the detection of irregular power usage `_

Robustness
^^^^^^^^^^

* `Preserving Statistical Validity in Adaptive Data Analysis `_

Policy evaluation, causal relationships
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* `Machine Learning and Causal Inference for Policy Evaluation `_
* `Recursive Partitioning for Heterogeneous Causal Effects `_
* `Machine Learning Meets Instrumental Variables `_
* `Synthetic Control Methods and Big Data `_
* `To Explain or to Predict? `_

Game theory, economics, deep reinforcement learning
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* `Artificial Intelligence as Structural Estimation: Economic Interpretations of Deep Blue, Bonanza, and AlphaGo `_
* `When Machine Learning Meets AI and Game Theory `_

Automated machine learning
^^^^^^^^^^^^^^^^^^^^^^^^^^

* `Probabilistic Matrix Factorization for Automated Machine Learning `_
* `Probabilistic Matrix Factorization `_
* `auto-sklearn `_
* `Python Implementation of Probabilistic Matrix Factorization Algorithm `_
* `Matrix Factorization-based algorithms `_

Robust PCA
^^^^^^^^^^

* `ROBPCA: A New Approach to Robust Principal Component Analysis `_
* `A Unified Framework for Outlier-Robust PCA-like Algorithm `_
* `Robust Stochastic Principal Component Analysis `_
* `Online Robust PCA via Stochastic Optimization `_
* `Online PCA for Contaminated Data `_

Topics - 2019
+++++++++++++

Topics already chosen last year may be reused, provided you add
an article that is not in the list and was published in 2019.

Anonymization
^^^^^^^^^^^^^

* `Estimating the success of re-identifications in incomplete datasets using generative models `_

Automated machine learning
^^^^^^^^^^^^^^^^^^^^^^^^^^

* `Probabilistic Matrix Factorization for Automated Machine Learning `_
* `auto-sklearn `_

Conformal Prediction
^^^^^^^^^^^^^^^^^^^^

* `A Tutorial on Conformal Prediction `_
* `Regression Conformal Prediction with Nearest Neighbours `_

Causality
^^^^^^^^^

* `Machine Learning Methods Economists Should Know About `_
* `Counterfactual Inference `_
* `The State of Applied Econometrics: Causality and Policy Evaluation `_
* `Estimating Treatment Effects with Causal Forests: An Application `_

Regime change detection
^^^^^^^^^^^^^^^^^^^^^^^

* `Selective review of offline change point detection methods `_
* `ruptures `_

Feature extraction
^^^^^^^^^^^^^^^^^^

* `Unsupervised Feature Extraction by Time-Contrastive Learning and Nonlinear ICA `_

Interpretability
^^^^^^^^^^^^^^^^

* `Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV) `_,
  `tutorial `_
* `To Trust Or Not To Trust A Classifier `_,
  `Mind the Gap: A Generative Approach to Interpretable Feature Selection and Extraction `_
* `DALEX: Explainers for Complex Predictive Models in R `_

Fused Lasso
^^^^^^^^^^^

* `Sparsity and smoothness via the fused lasso `_
* `Structured Association `_
* `Properties and Refinements of the Fused Lasso `_
* `Adaptive Generalized Fused-Lasso: Asymptotic Properties and Applications `_

k-nearest neighbours
^^^^^^^^^^^^^^^^^^^^

* `Neighbourhood Components Analysis `_

Privacy
^^^^^^^

* `A General Approach to Adding Differential Privacy to Iterative Training Procedures `_,
  `tensorflow/privacy `_

Very high dimension
^^^^^^^^^^^^^^^^^^^

* `Making Decision Trees Feasible in Ultrahigh Feature and Label Dimensions `_
* `Identifying a Minimal Class of Models for High–dimensional Data `_
* `The xyz algorithm for fast interaction search in high-dimensional data `_

Variable selection
^^^^^^^^^^^^^^^^^^

* `Complete Search for Feature Selection in Decision Trees `_