Artificial intelligence (AI), sometimes called machine intelligence, is intelligence demonstrated by machines, unlike the natural intelligence displayed by humans and animals. Leading AI textbooks define the field as the study of "intelligent agents": any device that perceives its environment and takes actions that maximize its chance of successfully achieving its goals. Colloquially, the term "artificial intelligence" is often used to describe machines (or computers) that mimic "cognitive" functions that humans associate with the human mind, such as "learning" and "problem solving". Artificial neural networks (ANNs), usually called neural networks (NNs), are computing systems vaguely inspired by the biological neural networks that constitute animal brains. There is a lot of hype these days around Artificial Intelligence and its technologies. We will separate this hype from reality with the help of the Gartner chart.
In this article I explain the key terminology associated with Artificial Intelligence (AI), the transition from Machine Learning to Deep Learning, the basic building blocks for the study of AI, and Artificial Neural Networks (ANNs). I will also talk about the hype vs reality of AI technologies.
Key Takeaways from this post are as below:
- Hype vs Reality?
- Machine learning vs Deep learning
- Applications of ANN/DL
- Fundamentals of Artificial neural network
- Forward Pass
- Backpropagation
- Feed forward networks
- Activation Functions
- Specific Applications
Hype vs Reality
The graph above clearly shows which technologies are likely to remain relevant in the coming years and which will become obsolete. It is very helpful if you are planning to up-skill yourself in any of them.
Brief history of ANN and DL
ImageNet Challenge 2012
Why second wave?
- With the advancement of the Internet of Things (IoT), more data is available from systems and sensors
- More compute power: GPUs, multi-core CPUs
- Training of deep architectures has become faster
- With more data in place, data-driven decisions are in greater demand than decisions based on experience alone
Machine learning vs Deep learning
As the image above makes clear, in machine learning one has to program the feature extraction step and then pass those features to a model for classification or prediction, so the two steps happen independently. With very large amounts of data, however, it becomes nearly impossible to extract the right features by hand and then pass them to the model. Deep learning takes advantage of the high computation power available nowadays: it takes huge amounts of raw data and processes it through a series of deep layers, and the last layer, called the output layer, ultimately produces the classification result. So feature extraction and classification both happen within the deep layers.
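The two pipelines can be contrasted in a short Python sketch. Everything here is a toy placeholder, not a real model: the feature extractor, classifier, and layers are invented for illustration only.

```python
# Classical ML: feature extraction and classification are separate steps.
def extract_features(raw):
    # Hand-engineered by the programmer, e.g. size and mean of the raw signal
    return [len(raw), sum(raw) / len(raw)]

def ml_pipeline(raw, classifier):
    features = extract_features(raw)   # step 1: manual feature extraction
    return classifier(features)        # step 2: model sees only the features

# Deep learning: raw data goes straight in; the hidden layers learn the
# features, and the final (output) layer produces the classification.
def dl_pipeline(raw, layers):
    signal = raw
    for layer in layers:               # features emerge inside the layers
        signal = layer(signal)
    return signal                      # last layer = classification result

classifier = lambda f: f[1] > 0.5
print(ml_pipeline([1, 0, 1, 1], classifier))   # → True
```

The point of the sketch is only the shape of the two pipelines: in the first, `extract_features` is written by a human; in the second, the layers themselves play that role.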
Thinking is possible even with a “small” brain
Results of above Experiments
- Pigeons were able to discriminate between Van Gogh and Chagall with 95% accuracy (when presented with pictures they had been trained on)
- Discrimination still 85% successful for previously unseen paintings of the artists
- Mice can memorize mazes and the odors of contraband (drugs/chemicals/explosives)
Applications of ANN/DL
Identification of different objects in an image.
Annotating what is happening in an image, i.e. labeling the image with the best-fitting description.
Identifying what a video is about, for example what kind of sport is being played in it.
Natural language processing
Processing raw text to extract important information. A few applications of NLP:
- semantic parsing
- search query retrieval
- sentence modeling
Once trained, the network can generate steering commands from the video images of a single center camera.
So, how does the brain work?
The signal travels along the axon, from the nucleus to the synapse.
Biological neural networks
- Fundamental units are termed neurons.
- Connections between neurons are synapses.
- The adult human brain consists of about 100 billion neurons and 1,000 trillion synaptic connections.
- That is roughly equivalent to a computer with a one-trillion-bits-per-second processor.
Neurons in various species
Artificial neural network
Activation function plays an important role in classifying the objects in different categories.
DL is a class of ML algorithms that use a cascade of many layers of nonlinear processing units for feature extraction and transformation.
A neural network model is composed of a set of layers. There are many types of layers, and each layer has many parameters, so there are effectively infinitely many possible network architectures.
Artificial neural network
Artificial neurons are the elementary units of an artificial neural network. An artificial neuron receives one or more inputs (representing dendrites) and sums them to produce an output, or activation (representing the neuron's axon).
Usually each input to a node is weighted, and the weighted sum is passed through a non-linear function known as an activation function.
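As a minimal sketch, a single artificial neuron with a sigmoid activation might look like this (the input values, weights, and bias below are arbitrary example numbers, not anything learned):

```python
import math

def neuron(inputs, weights, bias):
    """A single artificial neuron: weighted sum of the inputs plus a bias,
    passed through a sigmoid activation function."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-weighted_sum))  # sigmoid squashes into (0, 1)

# Example: two inputs with hand-picked weights
output = neuron([0.5, 0.8], weights=[0.4, 0.6], bias=0.1)
print(round(output, 3))  # → 0.686
```

The weighted sum here is 0.5·0.4 + 0.8·0.6 + 0.1 = 0.78, and the sigmoid of 0.78 is about 0.686.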
Artificial neural model: Perceptron
General Training Steps
- Learning means changing the weights
- In the very simplest case:
- Start with random weights
- If the output is correct, do nothing
- If the output is too high, decrease the weights attached to high inputs
- If the output is too low, increase the weights attached to high inputs
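The steps above can be sketched as a tiny perceptron trainer. This is an illustrative toy, not production code: weights start at zero rather than random for reproducibility, and the learning rate and epoch count are arbitrary choices.

```python
def train_perceptron(data, lr=0.1, epochs=20):
    """Perceptron learning rule: leave the weights alone when the output
    is correct, otherwise nudge them up or down by the error."""
    n = len(data[0][0])
    weights = [0.0] * n
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in data:
            z = sum(x * w for x, w in zip(inputs, weights)) + bias
            output = 1 if z > 0 else 0
            error = target - output          # 0 if correct
            if error != 0:                   # too high -> decrease, too low -> increase
                weights = [w + lr * error * x for x, w in zip(inputs, weights)]
                bias += lr * error
    return weights, bias

# Learn the logical AND function
and_data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train_perceptron(and_data)
predict = lambda x: 1 if sum(xi * wi for xi, wi in zip(x, w)) + b > 0 else 0
print([predict(x) for x, _ in and_data])  # → [0, 0, 0, 1]
```

AND is linearly separable, so the perceptron convergence theorem guarantees this loop settles on correct weights.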
Feed forward nets
What are hidden layers?
They are non-linear combinations of the inputs.
So they do not depend linearly on the inputs (the dependence is non-linear).
They are, in effect, engineered features of the original inputs!
- Feature engineering is so difficult because for each type of data and each type of problem, different features do well
- Neural networks can potentially build features hierarchically
But how do we learn the hidden weights? This is where backpropagation comes in.
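The standard answer is backpropagation: push the output error back through the network and adjust every weight in proportion to its contribution to that error. Here is a minimal sketch for a 2-2-1 sigmoid network trained on squared error; biases are omitted for brevity, and the learning rate is an arbitrary choice.

```python
import math, random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

random.seed(0)
w_hidden = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]  # input -> hidden
w_out = [random.uniform(-1, 1) for _ in range(2)]                         # hidden -> output
lr = 0.5

def forward(x):
    h = [sigmoid(sum(xi * wi for xi, wi in zip(x, row))) for row in w_hidden]
    y = sigmoid(sum(hi * wi for hi, wi in zip(h, w_out)))
    return h, y

def train_step(x, target):
    """One forward pass followed by one backpropagation update."""
    h, y = forward(x)
    # Output layer: derivative of squared error through the sigmoid
    delta_out = (y - target) * y * (1 - y)
    # Hidden layer: push the error back through the output weights
    delta_hidden = [delta_out * w_out[j] * h[j] * (1 - h[j]) for j in range(2)]
    for j in range(2):
        w_out[j] -= lr * delta_out * h[j]
        for i in range(2):
            w_hidden[j][i] -= lr * delta_hidden[j] * x[i]
    return 0.5 * (y - target) ** 2           # loss before the update

x, target = [1.0, 0.0], 1.0
loss_before = train_step(x, target)
loss_after = 0.5 * (forward(x)[1] - target) ** 2
print(loss_after < loss_before)  # the update reduces the error on this example
```

Each `delta` is just the chain rule applied layer by layer, which is all backpropagation is.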
Neural Network simulator
Please refer to https://www.mladdict.com/neural-network-simulator for a live visualization and for understanding by doing.
In neural networks, the activation function of a node defines the output of that node given an input or set of inputs.
A standard computer chip circuit can be seen as a digital network of activation functions that can be "ON" (1) or "OFF" (0), depending on the input. This is similar to the behavior of the linear perceptron in neural networks.
It is the nonlinear activation function that allows such networks to compute nontrivial problems using only a small number of nodes.
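A few common activation functions, written as minimal Python sketches:

```python
import math

def step(z):
    """The hard threshold of the classic perceptron: ON (1) or OFF (0)."""
    return 1.0 if z > 0 else 0.0

def sigmoid(z):
    """Smooth and bounded in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    """Bounded in (-1, 1) and approximately the identity near the origin."""
    return math.tanh(z)

def relu(z):
    """Non-linear yet cheap to compute: max(0, z)."""
    return max(0.0, z)

for f in (step, sigmoid, tanh, relu):
    print(f.__name__, round(f(0.5), 4), round(f(-0.5), 4))
```

The step function is what the linear perceptron uses; the smooth ones (sigmoid, tanh, ReLU) are differentiable where it matters, which is what makes gradient-based training possible.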
Properties of activation functions

Nonlinear: when the activation function is non-linear, a two-layer neural network can be proven to be a universal function approximator.

Continuously differentiable: this property is necessary for enabling gradient-based optimization methods.

Finite range:
- When the range of the activation function is finite, gradient-based training methods tend to be more stable.
- When the range is infinite, smaller learning rates are typically necessary.

Monotonic: when the activation function is monotonic, the error surface associated with a single-layer model is guaranteed to be convex. Smooth functions with a monotonic derivative have been shown to generalize better in some cases.

Approximates identity near the origin:
- The neural network will learn efficiently when its weights are initialized with small random values.
- When the activation function does not approximate the identity near the origin, special care must be taken when initializing the weights.
Common layer types include:
- Dense layer
- Dropout layer

I will explain the working of each layer in detail in the next article.
Parameters to vary for Model Tuning
Recap of evaluation measures
That covers the basics of Artificial Neural Networks. In my upcoming posts I will write in detail about each topic mentioned here. So stay tuned!!!
Concept Building in Machine Learning: https://ashutoshtripathi.com/machine-learning/
Originally published at http://ashutoshtripathi.com on September 14, 2020.