CERC DS4DM – Data Science for Real-Time Decision-Making

After 10 years of existence, the chair will close its doors on January 31, 2025. We would like to thank the sponsors, members and partners for their support and dedication throughout the years. Their contributions have been invaluable in achieving our mission and goals and we are deeply grateful.

On this page, we are pleased to share facts about the chair and the achievements that have paved the way for the integration of Machine Learning (ML) and Operations Research (OR). We are thrilled with the progress of this relatively new research area and eagerly anticipate future breakthroughs.

What is the Chair about?

The Canada Excellence Research Chair in Data Science for Real-Time Decision-Making aims to develop new tools and methodologies that allow enormous volumes of data from multiple sources to be processed and analyzed in real time, in order to obtain usable knowledge and to automate decision-making. By combining the analysis of highly targeted data with real-time decision-making, its tools based on mathematical models will help organizations improve performance by creating highly customized outputs that take into account the environments, needs and individual behaviors of their clients or users. The resulting applications will foster new business models based on accurate depictions of user behaviors and expectations, combined with competitors’ responses. The many sectors that could benefit include transportation management, energy, health care and manufacturing, as well as supply chain management and logistics.

A few facts

68 students graduated (9 interns / 17 master's students / 26 PhDs / 16 postdocs)
More than 50% of the CERC team are from visible minorities or ethnic minorities
Around 20 different nationalities, with female students making up 20% (14/68)
Around 20 prizes and awards received for the research work
22 industrial partners (30 out of 68 students were involved in an industrial project)
Democratization of the approach with the Ecole library

General achievements of the CERC research program

Nowadays, applications in all areas of applied mathematics benefit from the availability of large quantities of data that are automatically collected by sensors, including mobile devices, in sectors as diverse as transportation, retail, logistics, telecommunications and healthcare, to mention just a few. Data flows through the system at high speed, in a variety of formats and, as mentioned, in large quantities. This represents an incredible opportunity for decision-making because, unlike in the past, data provides detailed knowledge of the problem at hand, which in turn means that decision-making, or Optimization (the two terms are often used here as synonyms), is data driven. On the one hand, although analyzing data and extracting actionable information from it is not free, modern statistical learning, in particular Deep Learning, has recently shown a high degree of success in knowledge acquisition in many important areas such as image and speech recognition and natural language processing. On the other hand, decades of research on the theory and practice of Mathematical Optimization have led to the development of sophisticated algorithms and mature software tools that are routinely used in decision-making and are highly successful. Knowledge acquisition by Machine Learning (ML) and decision-making by Mathematical Optimization (MO) are therefore key ingredients of the Artificial Intelligence knowledge revolution, which is already affecting our society and has the potential to affect it more and more in the near future. In this context, the research mission of the Canada Excellence Research Chair (CERC) in “Data Science for real-time decision-making” is to advance the methodological aspects of the integration between Machine Learning and Mathematical Optimization.

The research developed within the CERC has followed three main directions exemplified by the following three questions:

1) What can Discrete Optimization do for Machine Learning?

2) What can Machine Learning do for Discrete Optimization?

3) Which are the application areas that would be out of reach without a fine integration of Machine Learning and Mathematical Optimization?

The basic research idea behind question 1) is that Discrete Optimization has been disregarded in Machine Learning due to its theoretical complexity and due to the fact that most of the decisions in classical ML are inherently continuous and unconstrained. We argue that theoretical complexity is not such a strong barrier in practice and that in several ML contexts making discrete decisions is very relevant. In particular, the CERC has explored the use of Mixed-Integer Programming (MIP) for classical classification problems [Belotti et al. 2016; Shen et al. 2017] and the use of column generation for learning tasks [Jena et al. 2017].

For question 2), the activity in the Chair concentrated especially on the use of ML to learn and automate algorithmic tasks that are important in the solution of MIP problems. It has been observed [Lodi 2012] that MIP solvers often execute those tasks, i.e., make decisions, in a heuristic way, and ML is a very suitable candidate for learning effective heuristic policies from data. One crucial example among many is the so-called “variable selection” problem: at every node of a branch-and-bound algorithm for MIP, one needs to decide which of the variables assuming fractional values in the Linear Programming relaxation has to be constrained (branched on) to generate the child nodes. Such a repetitive decision problem (i) is crucial for the performance of any branch-and-bound algorithm, (ii) does not yet have a satisfying mathematical solution and (iii) is addressed by heuristic approaches, some of which are computationally expensive. Developing ML algorithms for variable selection is an extremely active research area [Lodi & Zarpellon 2017] and the CERC has started more than one effort to attack it (discussed in two presentations at the recent INFORMS meeting). Other examples of the use of ML for MIP tasks include learning a classifier to automate the linearization decision for Mixed-Integer Quadratic Programming problems [Bonami et al. 2018], predicting the evolution of a branch-and-bound algorithm for MIP [Zarpellon et al. 2018], and selecting columns in column generation algorithms.
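To make the variable selection decision concrete, here is a minimal Python sketch of one classical hand-crafted rule, "most fractional" branching. It is only an illustration of the kind of heuristic that a learned policy would replace, not the Chair's method; the function names and the dictionary-based LP solution format are assumptions made for this example.

```python
import math


def fractional_candidates(lp_values, integer_vars, tol=1e-6):
    """Return the integer-constrained variables whose LP value is fractional."""
    return [
        name
        for name in integer_vars
        if abs(lp_values[name] - round(lp_values[name])) > tol
    ]


def most_fractional(lp_values, integer_vars, tol=1e-6):
    """Pick the candidate whose LP value is farthest from an integer.

    Returns None when the LP solution is already integral on integer_vars.
    """
    candidates = fractional_candidates(lp_values, integer_vars, tol)
    if not candidates:
        return None

    def score(name):
        frac = lp_values[name] - math.floor(lp_values[name])
        return min(frac, 1.0 - frac)  # distance to the nearest integer

    return max(candidates, key=score)


def child_bounds(name, lp_values):
    """Bounds that define the two child nodes created by branching on `name`."""
    value = lp_values[name]
    return (name, "<=", math.floor(value)), (name, ">=", math.ceil(value))


# Hypothetical LP relaxation values at one branch-and-bound node.
lp_solution = {"x": 0.5, "y": 2.9, "z": 1.0}
chosen = most_fractional(lp_solution, ["x", "y", "z"])
left_child, right_child = child_bounds(chosen, lp_solution)
```

In this toy node, "z" is already integral, so only "x" and "y" are candidates, and the rule branches on "x". An ML approach would replace the fixed `score` function with a policy learned from data.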

Concerning question 3), namely the application areas in which the synergy between ML and MO is likely to improve our understanding of complex problems, we would like to mention the healthcare sector as an example. The CERC is very active in the mathematical aspects of kidney transplantation and exchange, combining deep learning to predict the quality of a patient/donor match with discrete optimization and game theory that exploit this prediction to guarantee a fair allocation of kidneys, both for individual patients (or hospitals) and for society as a whole [Carvalho et al. 2017; Luck et al. 2017 and 2018].

Besides those three research questions, which belong to the core of the synergy between Machine Learning and Mathematical Optimization, the members of the Chair have been active in more classical methodological aspects of MO such as cutting plane generation [Bonami et al. 2016; Bonami et al. 2017; Rosat et al. 2017; Aardal et al. 2018], multi-level optimization [Carvalho et al. 2018; Baggio et al. 2018; Dan et al. 2018], Stochastic Programming [Lodi et al. 2016; Rostami et al. 2018], Quadratic Programming [Rostami et al. 2018; Furini et al. 2018], etc. Finally, contributions to some specific data-driven discrete optimization problems have been published, see for example [Mai & Lodi 2017; D] on facility location, [Rostami et al. 2017; Gmira et al. 2017] on vehicle routing, [Lodi et al. 2017; Olivier et al. 2018] on bin packing, [Lodi & Moradi 2018] on communication networks, [Anjos et al. 2018] on energy distribution, etc.

In 2020, one of the most notable achievements is the NeurIPS paper in which we show that ML can be used to make fast and accurate branching decisions within SCIP, a competitive open-source Mixed-Integer Programming solver. More generally, this area is growing rapidly, with the number of submissions on arXiv and at machine learning conferences increasing. However, our experience with the topic made us realize that a lack of reproducibility and a high barrier to entry hinder progress in the area. To accelerate research and keep the Chair a leader on the topic, we started the development of a new open-source library, called “Ecole”, that will offer a standardized platform for research and industrial applications. This library will offer uniform Markov decision process interfaces, problem generators, feature extractors and benchmarking tools in robust C++ code on top of SCIP. A first release was produced at the end of 2020.

Software development project

The research activities within the Chair have been consolidated through the development of Ecole, which stands for Extensible Combinatorial Optimization Learning Environments and aims to expose a number of control problems arising in combinatorial optimization solvers as Markov Decision Processes. Rather than trying to predict solutions to combinatorial optimization problems directly, the philosophy behind Ecole is to work in cooperation with the state-of-the-art Mixed Integer Linear Programming solver SCIP, which acts as a controllable algorithm.
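The idea of treating a solver as a controllable algorithm can be illustrated with a small, self-contained Python sketch. This is not Ecole's API and does not call SCIP: the class `ToyDivingEnv`, its `reset`/`step` methods and the toy knapsack "solver" are all hypothetical names invented for this example. They only show the general shape of a Markov Decision Process loop in which a policy makes the decisions that a solver would otherwise make heuristically.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDivingEnv:
    """Hypothetical toy environment (not Ecole's API).

    The "solver" fixes one binary knapsack variable per step. The action
    chooses which unfixed item to consider next: the item is packed if it
    fits in the remaining capacity, and skipped otherwise.
    """

    values: list
    weights: list
    capacity: int
    remaining: int = field(init=False, default=0)
    unfixed: list = field(init=False, default_factory=list)

    def _observation(self):
        return {"remaining": self.remaining, "unfixed": tuple(self.unfixed)}

    def reset(self):
        """Start a new episode and return (observation, done)."""
        self.remaining = self.capacity
        self.unfixed = list(range(len(self.values)))
        return self._observation(), not self.unfixed

    def step(self, action):
        """Fix the chosen item and return (observation, reward, done)."""
        if action not in self.unfixed:
            raise ValueError(f"item {action} is already fixed or unknown")
        self.unfixed.remove(action)
        reward = 0
        if self.weights[action] <= self.remaining:
            self.remaining -= self.weights[action]
            reward = self.values[action]  # value gained by packing the item
        return self._observation(), reward, not self.unfixed


def greedy_ratio_policy(env, observation):
    """Hand-written policy: pick the unfixed item with the best value/weight."""
    return max(observation["unfixed"], key=lambda i: env.values[i] / env.weights[i])


def run_episode(env, policy):
    """Run one episode and return the total reward collected."""
    observation, done = env.reset()
    total = 0
    while not done:
        action = policy(env, observation)
        observation, reward, done = env.step(action)
        total += reward
    return total


env = ToyDivingEnv(values=[60, 100, 120], weights=[10, 20, 30], capacity=50)
total_value = run_episode(env, greedy_ratio_policy)
```

In a learning setting, `greedy_ratio_policy` would be replaced by a trained model that maps observations to actions, while the environment keeps ownership of the solving process.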

More information

Partners

The Chair relies on various public and private partners in hospitals and industrial sectors to play an active role in technological, economic and social development.