Warith Harchaoui, Ph.D.

Expert in Artificial Intelligence

Computer Vision,
Natural Language Processing,
and Machine Learning

In Data Science,
No Data,
No Science



I am Warith Harchaoui, Ph.D. (in Applied Mathematics) and passionate about artificial intelligence (AI). I am building my career in artificial intelligence both in research and in companies.

In 2008, I began my never-ending learning process with the best AI scientists thanks to the École Normale Supérieure de Cachan (MVA M.Sc.). After several experiences in the field of Computer Vision in startups and worldwide companies, I pursued my research endeavor at the École Normale Supérieure de Paris in the Willow / Sierra laboratories, which are ranked among the best in the world.

Later, I reinforced the corporate component of my career in Data Science at Oscaro.com, the e-commerce world leader in car parts, from 2014 to 2020. While keeping my operational responsibilities, I completed my Ph.D. in Applied Mathematics from 2016 to 2020 with Charles Bouveyron at the MAP5 Lab of Université de Paris. Today, as a Research Fellow at Jellysmack, my expertise in image, sound, video and text processing encourages me to embrace my dream: bringing ideas from artificial intelligence to the real world.


Ambition in Artificial Intelligence

AI Media

To the best of my efforts, I believe in the idea that Artificial Intelligence (AI) should be understood as a revolution comparable to agriculture 10,000 years before Christ, the invention of writing 3,000 years before Christ or even printing in the 15th century. By AI, I mean the widest definition: Mathematics applied with computers known as Statistical Learning, Pattern Recognition, Machine Learning, Data Science and even Signal Processing for various media such as sound, image, video, text and even tabular data. The tangible impact of AI profoundly changes the relationship between our minds and the world. For companies, almost all areas of our contemporary world are now impacted by this Science.

Philosophically, AI is the automated emergence of certain natural intelligence aspects such as learning, decision making, adaptation, prediction, imitation, content production etc. thanks to some Applied Mathematics with computers. Concretely, Artificial Intelligence is the science that allows a machine to execute tasks without exhaustively enumerating scenarios by human intervention. AI is the natural extension of automation from matter to information.

In practice, this ambition translates into accomplishments towards this inevitable milestone for mankind. I maintain close relationships with academia, such as the MAP5 lab, Université de Paris, INRIA Masaai, the Executive MBA of Rennes School of Business and the think tank 4th Revolution, and with companies like Jellysmack. I also do operational consulting as an entrepreneur, with Ircam Amplify for music analysis and monetization and with VizioSense for embedded Computer Vision.

They Say

Throughout my professional experience, I have come to understand the importance of fostering a collaborative work environment where all voices and ideas are heard. A project whose team members, colleagues and collaborators do not get along well always ends up paying the price. This is why I collect feedback from my collaborators and colleagues to improve my work and my team's work.


“Warith is a Bible in AI”
Annual performance review at Jellysmack, 2022

The diversity of the problems to be solved at Jellysmack supports the idea of a scientific culture and a business culture without borders. For science, it is about stepping back and letting creativity flow through the inspiration of published literature. For the success of Jellysmack, it is about contributing to the innovation necessary for a competitive market at the service of internet video creators.

Executive MBA Rennes School of Business

“An unlikely but incredible duo!”
Student, 2022

Delivering a 2-day / 16-hour course with Laurent Pantanacce for a group of a dozen experienced students is a challenge. The feedback we have received justifies all the effort!
The course is about Artificial Intelligence and its uses with respect to tech, product and clients.

Ircam Amplify

“Thank you for this great job.”
Nathalie Birocheau, founder of Ircam Amplify, 2021

It has been a privilege to work with the corporate side of the French Music Research Center, Ircam, in order to satisfy the pragmatic needs of the music industry. Working with Matthieu Bouxin reminded me of the era when we programmed together in Java at our engineering school while listening to Pow Wow and Queen.


“Warith's expertise in AI and Computer Vision has been key to shape the R&D team.”
Maxime Schacht, founder of VizioSense, 2021

This experience in operational consulting in Computer Vision has set the bar high for my future collaborations: no more impostor syndrome, just work! Translating the best that research has to offer into practical applications is what gets me up in the morning.

MAP5 Lab of Université Paris Descartes

“We feel your personality in your Ph.D. manuscript!”
Erwann Le Pennec, Professor at École polytechnique and referee of my Ph.D. defense, 2020

Great people like Pr. Charles Bouveyron (my academic thesis supervisor), Dr. Stéphane Raux (my corporate thesis supervisor), Dr. Pierre-Alexandre Mattei and Pr. Andrés Almansa did me an honour by helping me accomplish my Ph.D. work in the warmth of the MAP5 lab and with the pugnacity of the Oscaro company.
I am proud of a challenge I was able to take up: a state-of-the-art chapter that remains relevant today.


“This cooperation with the university brought great value to the company I founded, Oscaro, with the product Cerbero.”
Pierre-Noël Luiggi, founder of Oscaro.com, 2020

Going back and forth between academia and the Oscaro.com company during my Ph.D. was a great experience. The core questioning of my Ph.D. naturally emerged there: customer groups, handling uncertainty for prediction, and key performance indicators that are useful and understandable to my corporate hierarchy and academic peers.

Favorite Books in Artificial Intelligence

In the rapidly advancing field of Artificial Intelligence, it is common to observe a high volume and pace of both scientific and non-scientific publications. Fortunately, experienced scientists do take the time to write comprehensive books that provide valuable insights and surveys of the field. Beyond the concise format of publications at top AI conferences, it is beneficial to delve deeper into the mathematical and algorithmic complexities of full-length books in order both to understand the shorter works and to use the various toolkits available online effectively. It is in this context that I present a list of books that I find particularly noteworthy, along with some comments, for readers who are eager to engage in the exciting “AI adventure”.

I have not yet commented on all the books I like. Indeed, it is difficult for me to comment on books by people I admire in a way that is useful for readers, so it takes me time.

Machine Learning

Machine Learning is the science of learning from data, which is experience gained by analyzing data instead of explicit programming. This is done by using computer-based algorithms that analyze input data, identify patterns using statistical techniques and make decisions or predictions. As a result, the machine can adapt to new and unprecedented data and make more accurate predictions.
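As a toy illustration of learning from data instead of explicit programming, the sketch below recovers a linear rule from a handful of noisy examples with ordinary least squares. The data points and the function name `fit_line` are made up for this example; real applications rely on richer models and dedicated libraries.

```python
# Toy illustration: "learning" a rule close to y = 2x + 1 from noisy examples
# instead of programming it explicitly. The data below are made up.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Noisy observations of an unknown rule (hypothetical measurements).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

slope, intercept = fit_line(xs, ys)

# The fitted model can now predict for an input it has never seen.
prediction = slope * 5.0 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction at x=5: {prediction:.2f}")
```

The program never states the rule; it only estimates it from examples, which is the essence of the definition above.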

Pattern Recognition and Machine Learning
Christopher M. Bishop, 2006

This top-notch 738-page textbook offers a complete overview of the fields of Pattern Recognition and Machine Learning. The “pattern recognition” part of the title reminds me how pragmatic the field is for engineers building things that people do not understand yet, called engines. It might be subjective to say, but I really love this book (except for the peculiar cover; otherwise it would be perfect!). The mathematical background is not too heavy and is kindly refreshed for readers.

Machine Learning: A Probabilistic Perspective
Kevin Murphy, 2012

These three books (of approximately a thousand pages each) cover a wide range of topics in detail, including probability, optimization and linear algebra for machine learning, with particular attention to conditional random fields, L1 sparsity regularization and deep learning. People with mathematical backgrounds will find them a great reference, and they are also a good choice for self-study.

The attempt to unify traditional and more recent topics provides a valuable coherence and reflection for developing a culture. These books are not only about fundamentals but also about the state of the art. Ideally, an undergraduate student considering a doctoral thesis should at least try to read the first volume, “Machine Learning: A Probabilistic Perspective”: if it is not fascinating to the student, maybe he/she should not pursue a Ph.D. in Machine Learning.

It is written in an easy-to-understand style, with pseudo-code for the most important algorithms and plenty of examples from real-world fields like biology, text processing, computer vision, and robotics. Instead of just giving you a bunch of random tricks and techniques, the book takes a closer look at graphical models to tackle probabilistic modelling in a clear and concise way.

Bayesian Reasoning and Machine Learning
David Barber, 2012

This 735-page book explains how established tools are used in a rapidly spreading range of industrial applications, including search engines, DNA sequencing, stock market analysis, and robot locomotion. Beyond sterile discussions about “Bayesians vs. Frequentists” (troll discussions that are the Machine Learning equivalent of “emacs vs. vim” or “Linux vs. Windows”), this book is the first I can think of that explains what “Bayesian Modelling” or “Graphical Models” actually mean. This hands-on text opens opportunities for computer science students with some taste for mathematics to go further.

This book narrates advancements in the field of machine learning and graphical models. Before reading this book, I did not understand the circles and arrows in articles claiming they were graphical models. Now these drawings are much clearer to me and sometimes I do some myself. I can even say that what makes this book unique is the integration of multiple disciplines through the use of graphical models. In addition, the transition from traditional artificial intelligence to modern machine learning, executed with finesse, adds to the value of the book. It is written with clarity and, as such, should be accessible to a diverse audience, including those with varying levels of mathematical proficiency.

Computer Vision

Computer vision is an area of artificial intelligence that aims to replicate the capabilities of human vision by teaching computers to interpret and comprehend the visual environment in a similar way to humans. This field is applied to a variety of tasks, such as facial recognition, object detection, autonomous driving, and medical imaging.

Computer Vision: Algorithms and Applications, 2nd Edition
Richard Szeliski, 2022

This 2nd edition of the book (1212 pages) is pleasantly entertaining yet covers almost all important subjects in Computer Vision: Filtering, Recognition, Feature Matching, Image Alignment, Motion Estimation, Computational Photography, Robotic Vision, Depth Estimation (with 2 or even 1 photograph(s) of the same scene), 3D, Rendering...

I highly recommend this book for newcomers trying to dive into the field.

Computer Vision: A Modern Approach, 2nd Edition
David Forsyth and Jean Ponce, 2011

This textbook (800 pages) was written by two living legends in Computer Vision: David A. Forsyth and Jean Ponce. Here the main objective is to develop a scientific culture and strengthen mathematical reflexes for handling classic Computer Vision problems, from image modelling to understanding human activity.

The book is particularly comprehensive about building image features, computational geometry, image preprocessing, segmentation and object recognition, which gives insight beyond Computer Vision.

Multiple View Geometry in Computer Vision
Richard Hartley and Andrew Zisserman, 2004

The book (670 pages) covers the basic principles of Computer Vision, specifically with regard to understanding the structure of real-world scenes and reconstructing them using geometric, algebraic and algorithmic principles. This is fundamental not only for 3D representations but also for understanding 2D perspective in images and videos. Absorbing the writing style of Richard Hartley and Andrew Zisserman is also valuable for becoming a researcher oneself.

Natural Language Processing

Natural Language Processing (NLP) allows computers to interpret and comprehend human language. This is achieved through the use of algorithms and software that analyze large amounts of data and extract the meaning of text, enabling computers to understand language in a similar way to humans. NLP is applied in various contexts, including search engine optimization, automatic summarization, sentiment analysis, and natural language generation.

Neural Network Methods in Natural Language Processing
Yoav Goldberg, 2017

This long article (76 pages), which can be combined with the associated longer book (309 pages), is a pretty fine first read on Natural Language Processing that finally works in practice! How can numbers express the words and expressions of human beings? How can the terrific idea of embeddings be used even beyond NLP? How can we use the Deep Learning artillery to accomplish wonders since the seminal word2vec approaches of 2013? Readers will appreciate the author's straightforward and clear explanations.

Foundations of Statistical Natural Language Processing
Chris Manning and Hinrich Schütze, 1999

This 620-page book is old but it summarizes very well all the good practices of non-deep Natural Language Processing. It is nicely written and a good source of inspiration even for non-NLP-related problems, especially for pre-processing data. One can recommend this book for understanding at least the problems at hand in recent publications, such as part-of-speech tagging, context-free grammars, topic extraction or information retrieval.

Signal Processing

Signal processing deals with the manipulation, analysis, and transformation of signals. These signals may include sound waves, radio waves, images, and data from medical instruments. Signal processing is applied in a variety of contexts, including audio enhancement, data analysis in medical imaging, and more. Essentially, signal processing involves extracting useful information from signals, such as frequency, amplitude, or color, or modifying a signal to make it crisper.
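To make "extracting frequency from a signal" concrete, here is a minimal sketch that computes a naive discrete Fourier transform of a synthetic signal and reads off its dominant frequency. The signal (a 5 Hz tone plus a weaker 12 Hz tone sampled at 64 Hz) and the helper name `dft_magnitudes` are assumptions chosen for illustration; production code would use a fast Fourier transform from a numerical library.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Magnitudes of the discrete Fourier transform for bins 0..n/2 (naive O(n^2) version)."""
    n = len(signal)
    magnitudes = []
    for k in range(n // 2 + 1):
        coefficient = sum(
            signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        magnitudes.append(abs(coefficient))
    return magnitudes

# One second of a synthetic signal: a strong 5 Hz tone plus a weaker 12 Hz tone.
sample_rate = 64  # samples per second (assumed for this example)
signal = [
    math.sin(2 * math.pi * 5 * t / sample_rate)
    + 0.3 * math.sin(2 * math.pi * 12 * t / sample_rate)
    for t in range(sample_rate)
]

magnitudes = dft_magnitudes(signal)

# Skip bin 0 (the constant component) and pick the strongest frequency bin.
dominant_bin = max(range(1, len(magnitudes)), key=lambda k: magnitudes[k])
dominant_frequency_hz = dominant_bin * sample_rate / len(signal)
print(f"dominant frequency: {dominant_frequency_hz} Hz")
```

Because the signal lasts exactly one second, bin `k` corresponds to `k` Hz, so the strongest bin directly names the dominant tone.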

A Wavelet Tour of Signal Processing, 3rd Edition
Stéphane Mallat, 2008

One can recommend this legendary book (published in several editions) even to readers who do not like wavelets. The great value of this book lies in its explanations of the links between algebra and signal processing (bases and projections), its refreshing insights into what a Fourier transform is, time-frequency analysis, sparsity, space scales, compression, inverse problems... all this with a pleasant writing style. The associated website, A Wavelet Tour of Signal Processing, is just magical! I cannot help citing its awesome sibling website, Numerical Tours by Gabriel Peyré (who extended the book in this last edition).

This book also exists in French.

Reinforcement Learning

Reinforcement Learning (RL) is about making machines learn from their environment and perform actions that maximize rewards. To do this, the computer is given a numerical goal or objective, and then receives feedback after each action it performs in the form of punishment and reward, also numerical. The computer then adjusts its actions according to this feedback, learning from its mistakes and optimizing its behavior over time. Reinforcement learning is applied in a variety of fields, including games, robotics and autonomous vehicles. In general terms, reinforcement learning is the process of teaching a computer to perform the most efficient actions in a given environment in order to maximize the rewards.
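The feedback loop described above can be sketched with tabular Q-learning on a tiny corridor: the agent starts at the left end, and only reaching the right end yields a reward. The environment, the constants and the function names (`step`, `q_learning`) are hypothetical and chosen purely for illustration; this is one classical RL algorithm among many, not a summary of the whole field.

```python
import random

N_STATES = 5                 # corridor cells 0..4; cell 4 is the goal
ACTIONS = ("left", "right")

def step(state, action):
    """Hypothetical environment: move along the corridor, reward 1 on reaching the goal."""
    next_state = max(state - 1, 0) if action == "left" else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with a purely random exploration policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)  # explore at random
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Move the estimate towards "reward now + discounted best future value".
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q

q = q_learning()

# Greedy policy read from the learned values, for every non-terminal state.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

Nobody tells the agent that "right" is the good direction; it infers this from the numerical rewards alone.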

Reinforcement Learning, 2nd Edition
Richard S. Sutton and Andrew G. Barto, 2018

As the name suggests, this 557-page book provides an in-depth introduction to Reinforcement Learning (RL) from two authority figures of this community: R. Sutton and A. Barto. This book is a must-read to understand RL, and it does not assume prerequisite knowledge (for an undergraduate). It is perfect for a person who wants to know more about RL, and this second edition has been updated with deep learning approaches.

In the new chapters for this edition, readers can appreciate the relationships between RL and Optimal Control, as well as a chapter focused on famous feats such as AlphaGo, AlphaGo Zero, Atari game playing and IBM Watson.

Algorithms and Optimization

Algorithms are sets of instructions for solving problems in a systematic way. Optimization is the process of identifying the most efficient way to solve a problem. In essence, algorithms and optimization can be viewed as sister techniques for improving efficiency. Algorithms and optimization are the cornerstones of artificial intelligence to adjust model parameters to fit the data. Mastering algorithms and optimization techniques provides the theoretical, but more importantly practical, tools to create new ones and adapt old ones to meet the specificity of your real-world problems.

Introduction to Algorithms
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein, 2009

This 1312-page book is legendary. Don't be fooled by the word introduction or its relatively advanced age (2009): I would consider anyone very competent if they master this book. It is considered a must-read for many members of the AI community and even the wider computer community.

What makes it so special is that the chapters are both comprehensive and precise, with a particular effort to be simple but not simplistic. In practice, I have saved a lot of time in my work thanks to the chapters on multiprocessing calculations and on how to use divide-and-conquer algorithms, dynamic programming and greedy algorithms to solve general problems, whatever programming language happens to be preferred or fashionable.
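As a small, language-agnostic illustration of two of these paradigms, the sketch below solves a coin-change problem both greedily and with dynamic programming. The coin system `[1, 3, 4]` and the function names are assumptions picked so that the two strategies disagree; this is a classic textbook-style exercise, not an excerpt from the book.

```python
def greedy_coin_count(coins, amount):
    """Greedy strategy: always take the largest coin that still fits."""
    count = 0
    for coin in sorted(coins, reverse=True):
        count += amount // coin
        amount %= coin
    return count if amount == 0 else None

def dp_coin_count(coins, amount):
    """Dynamic programming: build the optimal answer for every smaller amount first."""
    infinity = float("inf")
    best = [0] + [infinity] * amount
    for value in range(1, amount + 1):
        for coin in coins:
            if coin <= value and best[value - coin] + 1 < best[value]:
                best[value] = best[value - coin] + 1
    return best[amount] if best[amount] != infinity else None

coins = [1, 3, 4]
greedy_result = greedy_coin_count(coins, 6)   # picks 4 + 1 + 1
dp_result = dp_coin_count(coins, 6)           # finds 3 + 3
print(greedy_result, dp_result)
```

The greedy choice is fast but can be suboptimal, while dynamic programming trades a little memory for a guaranteed optimum; knowing when each applies is exactly the kind of reflex the book trains.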

Convex Optimization
Stephen Boyd and Lieven Vandenberghe, 2004

The “Boyd” is a gentle yet rigorous “first book” of 727 pages for newcomers to Numerical Optimization. Every time we hear about training or learning from data, it is basically optimization, even beyond AI. Convex optimization problems are special cases that can be solved reliably to global optimality, and they can be used to tackle non-convex problems through successive approximations, which makes them crucial in Machine Learning (and Deep Learning).
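To illustrate why convex problems are such convenient special cases, here is a minimal sketch of gradient descent on a convex quadratic function. The function f(x, y) = (x - 1)^2 + 2 (y + 2)^2, the step size and the starting points are all made up for this example; whatever the starting point, the iterates reach the same global minimum.

```python
def gradient(point):
    """Gradient of the convex function f(x, y) = (x - 1)^2 + 2 * (y + 2)^2."""
    x, y = point
    return (2.0 * (x - 1.0), 4.0 * (y + 2.0))

def gradient_descent(start, step_size=0.1, iterations=200):
    """Plain gradient descent with a fixed step size."""
    x, y = start
    for _ in range(iterations):
        gx, gy = gradient((x, y))
        x -= step_size * gx
        y -= step_size * gy
    return x, y

# Two very different starting points (arbitrary choices).
minimum_a = gradient_descent((10.0, 10.0))
minimum_b = gradient_descent((-7.0, 3.0))
print(minimum_a, minimum_b)
```

Both runs end at the unique minimizer (1, -2). On a non-convex function, the same procedure could stop at different local minima depending on where it starts.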

The exercises are so good that sometimes I suspect that scientists writing articles are inspired by the exercises in this book and simply extend them into valuable publications. I also appreciate this book for developing intuitions and interpretations of the concepts and methods. I cannot write about this book without mentioning its well-known solver toolbox CVXPY, which is really useful for scientists and practitioners.

Numerical Optimization, 2nd Edition
J. Frédéric Bonnans, J. Charles Gilbert, Claude Lemaréchal and Claudia A. Sagastizábal, 2006

Numerical Optimization is ubiquitous in science and engineering as nicely explained in the introduction. It is a key component of many algorithms in machine learning, signal processing, image processing, computer vision, robotics, and many other fields.

This 508-page book presents the main concepts and algorithms in a unified and accessible manner, with a focus on the practical aspects of their implementation. The authors are famous in this field and have been teaching this course for many years; they also have experience with optimization problems in energy management, geoscience and life sciences.

When in doubt while imagining new AI/ML algorithms, this is the book I would read first for confirmation and inspiration. When people don't find their answers in the “Boyd”, I recommend that one. The book is intended for graduate students and researchers but I have no problem admitting I consult it on a regular basis.

This book also exists in French.

Numerical Recipes, 3rd Edition
William H. Press, Saul A. Teukolsky, William T. Vetterling and Brian P. Flannery, 2007

“Numerical Recipes” is a famous and comprehensive 1256-page book on scientific computing techniques. It covers a wide range of topics, including linear algebra, the computer science involved, and various numerical methods and algorithms.

This is typically the kind of book that could help you design heavy computational algorithms in C/C++ or Fortran called from high-level languages like Python. It is quite rare to find such an easy and precise book to read, co-authored by world experts from academia and industry. I have been using this book for many years and still find it useful. It is a must-have book for any serious scientist or engineer who wants to deliver reliable software on a large scale.

Computational Optimal Transport
Gabriel Peyré and Marco Cuturi, 2019

This 209-page book reviews the topic of Optimal Transport with a focus on numerical methods and their applications at various scales: small, medium and large. A standout feature of this book is the accompanying website, which boasts impressive teaching materials, a wealth of literature, and high quality toolboxes like the Python Optimal Transport (POT) toolbox (developed by Rémi Flamary and Nicolas Courty).

Starting with a history of Optimal Transport (invented by Gaspard Monge in 1781), the book guides readers through a comprehensive survey of the field, especially the concept of entropic regularization and how it has enabled the use of Optimal Transport in large-scale settings in fields like Imaging Sciences (such as Color or Texture Processing), Computer Vision, Image Graphics (for shape manipulation), and Machine Learning (for tasks like Regression, Clustering, Classification, Density Fitting, and even Content Generation by imitation). To the best of my knowledge, this is the only book to cover the topic of Optimal Transport with such a precise computational angle.
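For readers curious about what entropic regularization looks like in code, here is a minimal pure-Python sketch of Sinkhorn iterations, the classical scaling algorithm associated with it. The two small histograms, the squared-distance cost, the value of `epsilon` and the iteration count are assumptions chosen for readability; for real problems, a maintained toolbox such as POT is the better choice.

```python
import math

def sinkhorn(a, b, cost, epsilon=1.0, iterations=1000):
    """Entropy-regularized optimal transport plan between histograms a and b."""
    n, m = len(a), len(b)
    # Gibbs kernel: a cheap move gets a large weight.
    kernel = [[math.exp(-cost[i][j] / epsilon) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iterations):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

# Two histograms on the points 0, 1, 2 (made-up masses summing to 1).
source_points = [0.0, 1.0, 2.0]
target_points = [0.0, 1.0, 2.0]
a = [0.5, 0.3, 0.2]
b = [0.2, 0.3, 0.5]
cost = [[(x - y) ** 2 for y in target_points] for x in source_points]

plan = sinkhorn(a, b, cost)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(len(a))) for j in range(len(b))]
print(row_sums, col_sums)
```

Each entry `plan[i][j]` is the mass moved from source point `i` to target point `j`; the row and column sums recover `a` and `b`. A smaller `epsilon` gets closer to unregularized Optimal Transport, at the price of slower and numerically more delicate iterations.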

Contributions in Artificial Intelligence

Embracing both the corporate and academic worlds in the field of artificial intelligence is probably the best conscious choice in my career so far. Industry and academia are two very different worlds, but in my experience they are also complementary. Sometimes the line gets blurred in institutions like OpenAI or in the research arms of very large companies, such as Microsoft Research, Baidu Research, Amazon Research or Google Research (not an exhaustive list and in random order), which contribute groundbreaking papers and open-source toolkits just as excellent scholars in top-level universities would.

To be honest, starting from an industrial problem allows me to limit some scientific ramblings (which I like so much because they are a source of creativity in disguise!) in favor of a greater impact on the real world... which justifies the efforts. In the end, I am rewarded by the satisfaction of making someone else's life easier, relieved from her/his original problem. So, in my own way, I try to contribute with all my might to this millennium dream of artificial intelligence. Luckily, I have been fortunate to continue to work in both academic and corporate environments with publications, teachings and prototypes / engines in production.

Artificial Intelligence for Business — 2nd Edition
Warith Harchaoui, Laurent Pantanacce and Nicolas Renard, December 2022, Rennes School of Business

This Executive MBA 16-hour course in 2 days is about AI and its uses with respect to tech, product and clients in companies with Laurent Pantanacce and Nicolas Renard. We focus on topics such as machine learning, natural language processing, and computer vision and their applications to business.

Through this course, students will gain an understanding of how AI can be used to improve business processes and operations, as well as gain insights from data. They will also learn how to use AI to develop solutions for everyday business problems. The ingredients were taken from Jellysmack, INSEAD, Stanford, Coursera and our own real-world experience. In this 2nd edition, Generative AI burst onto the scene in different forms, for text and images, changing the way we think about our own intelligence.

Generalised Mutual Information for Discriminative Clustering
Louis Ohl, Pierre-Alexandre Mattei, Charles Bouveyron, Warith Harchaoui, Arnaud Droit, Mickaël Leclercq and Frédéric Precioso, 2022, NeurIPS (formerly NIPS)

The paper focuses on deep clustering, which is a type of machine learning technique used to group data into categories (or clusters). The method consists in using a measure called mutual information (MI) to train a neural network (or a deep network). We have found that using GEMINI (for Generalised Mutual Information) provides good clusters. GEMINI can also automatically determine the appropriate number of clusters to use. This is an important consideration because in deep and non-deep clustering, the number of clusters is usually not known in advance. We have also shown that GEMINI is more efficient than classical MI for deep clustering.

Artificial Intelligence for Business — 1st Edition
Warith Harchaoui, Laurent Pantanacce, March 2022, Rennes School of Business

This Executive MBA 16-hour course in 2 days is about AI and its uses with respect to tech, product and clients in companies with Laurent Pantanacce. We focus on topics such as machine learning, natural language processing, and computer vision and their applications to business.

Through this course, students will gain an understanding of how AI can be used to improve business processes and operations, as well as gain insights from data. They will also learn how to use AI to develop solutions for everyday business problems. The ingredients were taken from Jellysmack, INSEAD, Stanford, Coursera and our own real-world experience.

Thoughts in 2021 about Hardware in Artificial Intelligence
Warith Harchaoui, November 2021

All that Artificial Intelligence in your pocket! This little article just puts in writing some thoughts about AI in 2021. Expressions such as “Big Data” or “Moore's law” seem way behind us but are still meaningful, always revisited and even reinvented as the AI communities witness at least three game-changing feats.

Optimal transport-based machine learning to match specific expression patterns in omics data
Thi Thanh Yen Nguyen, Olivier Bouaziz, Warith Harchaoui, Christian Neri, Antoine Chambaz, July 2021

We present algorithms designed to learn a pattern of correspondence between two data sets in situations where it is desirable to match elements that exhibit a relationship belonging to a known parametric model. The challenge is to better understand micro-RNA (miRNA) regulation in the striatum of Huntington's disease (HD) mice. While dealing with miRNA and messenger-RNA (mRNA) data, the biological hypothesis is that if a miRNA induces the degradation of a target mRNA or blocks its translation into proteins, or both, then the profile of the former should be similar to that of the latter, up to an affine transformation. Thanks to entropy-regularized optimal transport using the Sinkhorn algorithm, we derive either several co-clusters or several sets of matched elements. A simulation study with associated code illustrates how the algorithms work and perform.

Value in Data — White Paper for AI
Warith Harchaoui and Laurent Pantanacce, June 2021, Closerie des Lilas in Paris for 4ème Révolution

Every industrial revolution has been powered by a driving force: a raw material, a source of energy, a creative technology that redefines the economy. Since the 19th century, we can list steam, coal, oil, electricity, radio, the transistor, computers, and today artificial intelligence as examples of these driving forces. Our fourth industrial revolution is troubling because its commodity is abstract: data. [...]

Why does this document officially exist?
This document is an 8-page white paper chapter that my friend Alkéos Michaïl asked my mentor Laurent Pantanacce and myself to write for the think tank 4ème Révolution. Although some chapters from several authors were neither gathered nor published, I thought it would be a shame to let this one go to waste, as it is a good introduction to the topic of data value for executives, managers, and scientists who acknowledge that AI and its raw material, data, have changed their jobs.

Why does this document really exist?
It was just a good excuse to write down a few ideas with my mentor and friend Laurent at the Closerie des Lilas in Paris, like French intellectual poets and philosophers, instead of letting our passionate discussions vanish into thin air!

This document also exists in French.

Invitation à l'intelligence artificielle du texte
Warith Harchaoui, May 2021, IUT de l'université de Paris

In artificial intelligence, the “text” medium is special because, unlike sound, image or video, it is not a physical signal. Rather, it is a “symbolic signal” stemming directly from human intelligence since its invention three millennia before Christ.

This talk presents a few concepts of artificial intelligence for text (or NLP, for Natural Language Processing). The goal is to convince an audience of first-year computer science students and curious young science enthusiasts to play with data and to program on these topics, which are profoundly changing almost every area of our contemporary world.

True Story for a Rare Punctuation Mark
Warith Harchaoui, March 2021

Dear Professor,

On the second day of March in the year 2021, my friend Quentin and I paid you a visit at the Stanislas prep school in Paris to express our gratitude on behalf of many students for imparting upon us the satisfaction mark. For almost two decades, I have made use of this punctuation, and it has made me feel as though I belong to a guild of enlightened scholars, much like yourself, who continue to teach us even today. It is difficult for me to accurately convey to others the significance of this mark, but I will attempt to explain why it holds such special meaning for me. [...]

I would like to thank the local newspaper of Stanislas, L'Échos de Stan (page 30), for publishing this letter to my dear Physics teacher Yves Dupont, who played an essential role in my career in Science.

This letter was first written in French.

Learning Representations using Neural Networks and Optimal Transport (Ph.D.)
Warith Harchaoui, September 2016 to October 2020, MAP5 — Université Paris Descartes

My Ph.D. work was about artificial intelligence for:

  • how to make groups, e.g. among clients, genes, images;
  • how to show the distinctive attributes of data, e.g. for clients and images;
  • how to estimate the confidence level of an automatic decision, e.g. for industrial constraints, health, security and even justice.

These problems share a common scientific question: how should we represent data? For that purpose, I revisited a mathematical concept called Optimal Transport with a widely known algorithmic tool called Neural Networks (nicknamed “Deep Learning” since approximately 2010).

Great people like Pr. Charles Bouveyron (my academic thesis supervisor), Dr. Stéphane Raux (my corporate thesis supervisor), Dr. Pierre-Alexandre Mattei, Pr. Andrés Almansa, Thi Thanh Yen Nguyen, Pr. Olivier Bouaziz and Pr. Antoine Chambaz did me an honour by helping me accomplish this work in the warmth of the MAP5 lab and with the pugnacity of the Oscaro company.

Rencontre avec Luc Julia, l’IA n’existe pas !⎜ORLM-363
Luc Julia and Warith Harchaoui, February 2020, IUT de l'université de Paris

An informal discussion on the web TV show ORLM (On refait le Mac), hosted by Olivier Frigara, with Luc Julia and me. Luc Julia has been a pioneer of artificial intelligence in California since the late 1990s, co-founder of Siri, CTO of Samsung Innovation and now at Renault (at the time of writing, in 2022). He presents his book L'IA n'existe pas. We discuss the stakes and impacts of artificial intelligence on society, the economy and therefore our lives, while I sit there, head down, mumbling my way through this first television appearance!

Une introduction aux réseaux de neurones
Warith Harchaoui, December 2018, MAP5 — Université Paris Descartes, Institut Henri Poincaré, École 42

From a single neuron to a layer of neurons, then to several layers, sometimes convolutional, and even to several opposing neural networks, we see certain aspects of the mystery of intelligence emerge through a technology that is a game changer in almost every area of our contemporary world.

This talk was given three times: at the IHP (Institut Henri Poincaré) on January 26, 2018, at Université Paris Descartes on November 30, 2018 and at École 42 on December 18, 2018.

Wasserstein Adversarial Mixture Clustering (WAMiC) — Poster
Warith Harchaoui, Pierre-Alexandre Mattei, Andrés Almansa and Charles Bouveyron, Summer 2018, Data Science Summer School — École Polytechnique

Clustering complex data is a key element of unsupervised learning which is still a challenging problem. In this work, we introduce a deep approach for unsupervised clustering based on a latent mixture living in a low-dimensional space. We achieve this clustering task through adversarial optimization of the Wasserstein distance between the real and generated data distributions.

The proposed approach also allows both dimensionality reduction and model selection. We achieve competitive results on difficult datasets made of images, sparse and dense data.

This work finally resulted in a chapter in my Ph.D. manuscript called Wasserstein Clustering.

Artificial Intelligence, Machine Learning, Computer Vision and Natural Language Processing with Python
Warith Harchaoui, Mohamed Chelali, Matias Tassano, Pierre-Louis Antonsanti and Azedine Mani, Last updated in December 2022 (since December 2018), MAP5 — Université Paris Descartes

Artificial Intelligence needs heavy computations. During the 2010s, the Deep Learning community paved the way for hardware acceleration, historically with Graphics Processing Units (GPUs) diverted from their original usage for the benefit of Applied Mathematics research beyond graphics.

The aim of this webpage is to present a cheat sheet for programming in Machine Learning (i.e. Statistical Learning, Pattern Recognition, Artificial Intelligence, Data Science) for a tremendous range of applications, such as Computer Vision, Sound Processing and Natural Language Processing.

This page has been extensively used, at least at the MAP5 lab, Oscaro.com and Jellysmack, to conduct Applied Mathematics research in Machine Learning (ML), Computer Vision (CV) and Natural Language Processing (NLP) in Python. Please feel free to contact me (Warith Harchaoui) for improvements and suggestions.