Optimization: Its Role in Machine Learning
A machine learning algorithm tries to find a function (in a predictive modeling problem) or a distribution (in a statistical modeling problem) that maximizes the probability of the observations given the model. The process of maximizing (or minimizing) such a quantity is known as optimization, and most machine learning problems boil down to optimizing an objective function in one way or another. In this article we will look at the role optimization plays in machine learning.
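To make this concrete, here is a minimal sketch (not from the original article) of treating "maximize the probability of the observations" as a numerical optimization problem. The model is a Gaussian with unknown mean and fixed unit variance, and we minimize the negative log-likelihood of some toy observations; the data values are invented for illustration.

```python
import numpy as np
from scipy.optimize import minimize

observations = np.array([2.1, 1.8, 2.4, 2.0, 2.2])  # toy data, assumed for illustration

def negative_log_likelihood(mu):
    # Gaussian negative log-likelihood with sigma = 1,
    # dropping constants that do not affect the optimum
    return 0.5 * np.sum((observations - mu) ** 2)

result = minimize(negative_log_likelihood, x0=np.array([0.0]))
print("estimated mean:", result.x[0])        # close to the sample mean
print("sample mean   :", observations.mean())
```

For this particular model the optimizer simply recovers the sample mean, which is the well-known closed-form maximum-likelihood estimate; the point of the sketch is that the same numerical machinery applies when no closed form exists.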
In machine learning, optimization is the process of finding the minimum (or maximum) value of a given function, or the parameter values that achieve it. This function is referred to as the objective function in the field of mathematical programming, a discipline with close ties to mathematical analysis and numerical methods.
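As a small illustration, here is a hand-rolled gradient descent loop that finds the minimum of a simple quadratic objective; the function, starting point, and learning rate are made-up choices for this sketch, not anything prescribed above.

```python
def objective(w):
    return (w - 3.0) ** 2          # minimized at w = 3

def gradient(w):
    return 2.0 * (w - 3.0)         # derivative of the objective

w = 0.0                            # initial guess
learning_rate = 0.1
for step in range(100):
    w -= learning_rate * gradient(w)   # move against the gradient

print(w)  # close to 3.0
```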
Optimization problems have been studied since antiquity, but it was not until the 19th century that mathematics began to be applied systematically to solving practical problems of this kind. The development of mathematical optimization has been driven by the need for greater efficiency and productivity across many areas of science and engineering. Today, optimization is commonly used in disciplines such as operations research, industrial engineering, economics, mathematics, and computer science to achieve a wide range of goals, including minimizing costs, maximizing production, and finding appropriate policy solutions for complex systems.
Many optimization algorithms have been developed, and they are broadly divided into deterministic and stochastic approaches. The most popular families are linear and nonlinear programming, global optimization methods such as evolutionary algorithms and simulated annealing, local optimization methods such as the method of steepest descent, and dynamic programming. Beyond these, there are many more specialized approaches that can be grouped in a variety of ways depending on the features they share with one another.
The most widely used class of optimization problems is linear programming, which has applications in logistics, production planning, and transportation scheduling, among many other fields. The simplex method for solving linear programs was first published by G. Dantzig as an efficient way of solving large-scale linear programs, and it has been at the core of operations research and industrial engineering ever since. Many extensions of linear programming have been developed, including integer and mixed-integer linear programming, second-order cone programming, and general nonlinear optimization (a generalization of linear programs), along with extensions of the simplex method for solving several of them.
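To give a flavour of what a linear program looks like in practice, here is a small production-planning sketch using SciPy's linprog solver; the products, profits, and resource limits are invented for illustration.

```python
from scipy.optimize import linprog

# Maximize 40*x1 + 30*x2 (profit); linprog minimizes, so negate the objective.
c = [-40.0, -30.0]

# Resource constraints (hypothetical numbers):
A_ub = [[1.0, 1.0],    # x1 + x2   <= 40 assembly hours
        [2.0, 1.0]]    # 2*x1 + x2 <= 60 machine hours
b_ub = [40.0, 60.0]

result = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print("production plan:", result.x)     # optimal quantities of each product
print("maximum profit :", -result.fun)  # undo the sign flip
```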
Global optimization methods such as simulated annealing, genetic algorithms, and ant colony optimization have applications in fields as diverse as quantum physics, computer science, electricity generation, economics, and climate modeling. Local or ad hoc methods also have a wide variety of applications, such as determining the structure of organic molecules like proteins.
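As an illustration of a stochastic global method, here is a minimal simulated-annealing loop on a one-dimensional function with several local minima; the test function, neighborhood, and cooling schedule are arbitrary choices made for this sketch.

```python
import math
import random

def f(x):
    return x * x + 10.0 * math.sin(x)   # multimodal test function

x = 10.0                # initial solution
best_x, best_f = x, f(x)
temperature = 5.0

for step in range(5000):
    candidate = x + random.uniform(-1.0, 1.0)      # random neighbor
    delta = f(candidate) - f(x)
    # Always accept improvements; accept worse moves with a temperature-dependent probability.
    if delta < 0 or random.random() < math.exp(-delta / temperature):
        x = candidate
        if f(x) < best_f:
            best_x, best_f = x, f(x)
    temperature *= 0.999                            # geometric cooling

print(best_x, best_f)   # near the global minimum around x ≈ -1.3
```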
Although there has been considerable work on improving efficiency and scalability through parallel computing and specialized hardware, these are generally regarded as implementation approaches rather than algorithmic paradigms. In linear algebra and matrix computations, specialized algorithms exploit matrix structure, and some numerical optimization algorithms have been developed specifically for vector architectures.
There is an information gap between basic research in operations research and applied fields such as engineering, which often prevents new approaches from being adopted quickly. In addition, optimization has traditionally relied on expert knowledge, but the recent trend towards more software-based approaches should help close this gap.
Optimization and artificial intelligence (AI) research has gained importance due to the widespread use of optimization in areas such as mathematical programming, game theory, forecasting, control systems, operations management, and finance. AI is concerned with understanding and building agents that behave intelligently and “learn” from experience or from knowledge acquired through problem-solving. Machine learning methods for building such agents have grown considerably in popularity over the past decade because they can be applied to a wide variety of problems across many application domains. Many machine learning problems, most commonly supervised classification tasks, are themselves formulated and solved as optimization problems.
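As a concrete example of a supervised classification problem expressed as optimization, here is a small logistic regression model fitted by gradient descent on made-up data; it minimizes the average log loss, which plays the role of the objective function in this learning problem.

```python
import numpy as np

# Toy dataset: one feature, binary labels (invented for illustration).
X = np.array([[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.zeros(X.shape[1])
b = 0.0
learning_rate = 0.1

for step in range(2000):
    p = sigmoid(X @ w + b)                 # predicted probabilities
    grad_w = X.T @ (p - y) / len(y)        # gradient of the average log loss
    grad_b = np.mean(p - y)
    w -= learning_rate * grad_w            # gradient descent update
    b -= learning_rate * grad_b

print("weights:", w, "bias:", b)
print("predicted probabilities:", sigmoid(X @ w + b).round(2))
```

The training loop is nothing more than the steepest-descent idea mentioned earlier, applied to a loss function defined by labeled data, which is the sense in which learning reduces to optimization.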
Originally published at https://protonautoml.com on July 20, 2021.