      Deep Learning: An Introduction for Applied Mathematicians

      Preprint

          Abstract

Multilayered artificial neural networks are becoming a pervasive tool in a host of application fields. At the heart of this deep learning revolution are familiar concepts from applied and computational mathematics; notably, in calculus, approximation theory, optimization and linear algebra. This article provides a very brief introduction to the basic ideas that underlie deep learning from an applied mathematics perspective. Our target audience includes postgraduate and final-year undergraduate students in mathematics who are keen to learn about the area. The article may also be useful for instructors in mathematics who wish to enliven their classes with references to the application of deep learning techniques. We focus on three fundamental questions: What is a deep neural network? How is a network trained? What is the stochastic gradient method? We illustrate the ideas with a short MATLAB code that sets up and trains a network. We also show the use of state-of-the-art software on a large-scale image classification problem. We finish with references to the current literature.
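The article's illustrative training code is written in MATLAB; purely as a rough orientation (not the authors' code), the sketch below sets up a small fully connected network in NumPy and trains it with the stochastic gradient method on invented toy data. The layer sizes, learning rate and data are assumptions made for this example.

```python
# Minimal sketch (not the paper's MATLAB code): a tiny two-layer sigmoid
# network trained with the stochastic gradient method on invented 2-D data.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: points in the plane, labelled 1 if they lie inside the unit circle.
X = rng.uniform(-1.5, 1.5, size=(200, 2))
y = (np.sum(X**2, axis=1) < 1.0).astype(float).reshape(-1, 1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Network: 2 inputs -> 8 hidden units -> 1 output, all sigmoid activations.
W1, b1 = rng.standard_normal((2, 8)) * 0.5, np.zeros(8)
W2, b2 = rng.standard_normal((8, 1)) * 0.5, np.zeros(1)

eta = 0.5  # learning rate
for step in range(10_000):
    i = rng.integers(len(X))              # pick one training point at random
    x, t = X[i:i+1], y[i:i+1]
    a1 = sigmoid(x @ W1 + b1)             # forward pass
    a2 = sigmoid(a1 @ W2 + b2)
    # Backpropagation for the squared-error cost 0.5 * (a2 - t)**2.
    d2 = (a2 - t) * a2 * (1 - a2)
    d1 = (d2 @ W2.T) * a1 * (1 - a1)
    W2 -= eta * a1.T @ d2; b2 -= eta * d2.ravel()   # stochastic gradient step
    W1 -= eta * x.T @ d1;  b1 -= eta * d1.ravel()

acc = np.mean((sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2) > 0.5) == (y > 0.5))
print(f"training accuracy: {acc:.2f}")
```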


Most cited references (3)


          Understanding Deep Convolutional Networks

Deep convolutional networks provide state-of-the-art classification and regression results over many high-dimensional problems. We review their architecture, which scatters data with a cascade of linear filter weights and non-linearities. A mathematical framework is introduced to analyze their properties. Computations of invariants involve multiscale contractions, the linearization of hierarchical symmetries, and sparse separations. Applications are discussed.
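To make "a cascade of linear filter weights and non-linearities" concrete, here is a minimal hypothetical NumPy sketch (not code from the paper): a 1-D signal is passed through two filter banks, each convolution followed by a pointwise modulus non-linearity. The signal, filter sizes and random filters are illustrative assumptions.

```python
# Hypothetical sketch of a cascade of linear filters and non-linearities
# (the generic structure the review analyses), not code from the paper.
import numpy as np

rng = np.random.default_rng(1)
signal = rng.standard_normal(64)            # made-up 1-D input

def layer(x, filters):
    """One stage: linear filtering (convolution) followed by a modulus
    non-linearity, applied to every channel/filter pair."""
    out = []
    for channel in np.atleast_2d(x):
        for h in filters:
            out.append(np.abs(np.convolve(channel, h, mode="same")))
    return np.array(out)

filters1 = [rng.standard_normal(5) for _ in range(3)]   # first filter bank
filters2 = [rng.standard_normal(5) for _ in range(2)]   # second filter bank

u1 = layer(signal, filters1)    # 3 channels of length 64
u2 = layer(u1, filters2)        # 3 * 2 = 6 channels of length 64
print(u1.shape, u2.shape)       # (3, 64) (6, 64)
```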

            Stochastic Separation Theorems

A set S is linearly separable if each x ∈ S can be separated from the rest of S by a linear functional. We study random N-element sets in R^n for large n and demonstrate that for N < a exp(bn) such sets are linearly separable with probability p, p > 1 − ϑ, for a given (small) ϑ > 0. Constants a, b > 0 depend on the probability distribution and the constant ϑ. The results are important for machine learning in high dimension, especially for correction of unavoidable mistakes of legacy Artificial Intelligence systems.
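As an informal illustration of the separability claim (my own sketch, not from the paper), the experiment below draws N points uniformly from the unit ball in R^n and checks how often one point can be cut off from all the others by the simple linear functional z ↦ ⟨x₀, z⟩. The sample sizes, the uniform-in-ball distribution and the choice of functional are assumptions made for this example.

```python
# Small numerical illustration (my sketch, not from the paper): for random
# points in the unit ball of R^n, test how often one point is separated
# from all the others by the linear functional z -> <x0, z>.
import numpy as np

rng = np.random.default_rng(2)

def random_ball(N, n):
    """N points drawn uniformly from the unit ball in R^n."""
    g = rng.standard_normal((N, n))
    g /= np.linalg.norm(g, axis=1, keepdims=True)   # uniform on the sphere
    r = rng.uniform(size=(N, 1)) ** (1.0 / n)       # radial part
    return g * r

def separable_fraction(N, n, trials=200):
    """Fraction of trials in which <x0, y> < <x0, x0> for every other y,
    i.e. the functional z -> <x0, z> is strictly larger at x0 than at the
    rest of the sample."""
    hits = 0
    for _ in range(trials):
        X = random_ball(N, n)
        x0 = X[0]
        if np.all(X[1:] @ x0 < x0 @ x0):
            hits += 1
    return hits / trials

for n in (5, 20, 100):
    print(n, separable_fraction(N=1000, n=n))
# The stochastic separation theorems predict that, for fixed N, this
# fraction tends to 1 as the dimension n grows.
```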

              Stochastic gradient descent in continuous time


                Author and article information

Date: 17 January 2018
arXiv ID: 1801.05894
Record ID: 50bc9547-25c8-4194-b081-a40dbc65ab57
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
MSC classes: 97R40, 68T01, 65K10, 62M45
Subjects: math.HO, cs.LG, math.NA, stat.ML
