The Tree of Machine Learning Algorithms is a simplified schema to rationalize the types of learning paradigms used by categories of algorithms. Just as a tree’s branches grow stronger and wider with an expansive network of roots, the machine learning tree is strengthened using a network of data.
Data can be structured or unstructured, can come from transactions or images, and can be generated by a sensor (as in the Internet of Things) or by simple data entry. Data is everything, but not all data has the potential to provide business value. Discovery is the key process for finding, cleansing and curating data before it is sent to machine learning algorithms.
There are three machine learning archetypes:
Unsupervised machine learning
This includes any algorithm where the learning model is based only on input data (X), with no corresponding output variables. The goal of unsupervised learning is to model the underlying structure or distribution of the data in order to learn more about it. This is called “unsupervised learning” because there are no correct answers and there is no teacher. Algorithms are left to their own devices to discover and present the interesting structure in the data.
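As an illustration of learning from inputs alone, here is a minimal sketch of k-means clustering, one common unsupervised algorithm. The data points, the function name `k_means` and its parameters are assumptions made up for this example; the article itself does not prescribe a particular algorithm.

```python
import random

def k_means(points, k, iterations=20, seed=0):
    """Group 2-D points into k clusters using only the inputs X (no labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:
                centroids[i] = (
                    sum(p[0] for p in cluster) / len(cluster),
                    sum(p[1] for p in cluster) / len(cluster),
                )
    return centroids, clusters

# Hypothetical data: two loose groups of points, with no labels attached.
X = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (8.0, 8.2), (7.9, 8.1), (8.3, 7.8)]
centroids, clusters = k_means(X, k=2)
```

Nobody tells the algorithm which group a point belongs to; the two clusters emerge from the structure of the data itself.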
Supervised machine learning
If you have input variables (X) and an output variable (Y), and you use an algorithm to learn the mapping function from the input to the output, then you’re probably looking at supervised machine learning.
Y = f(X)
The goal is to approximate the mapping function so well that when you have new input data (X), you can predict the output variables (Y) for that data. It is called “supervised learning” because the process of an algorithm learning from the training data set can be thought of as a teacher supervising the learning process. We know the correct answers; the algorithm iteratively makes predictions on the training data and is corrected by the teacher. Learning stops when the algorithm achieves an acceptable level of performance.
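The loop described above (predict, compare with the known answers, correct, stop when good enough) can be sketched with a simple line fit trained by gradient descent. This is only one possible instance of supervised learning; the training data, the function name `fit_line`, the learning rate and the tolerance are all assumptions chosen for the example.

```python
def fit_line(xs, ys, learning_rate=0.05, tolerance=1e-6, max_epochs=100_000):
    """Learn w and b so that f(x) = w * x + b approximates the labelled data."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(max_epochs):
        # Predict on the training data and compare with the known answers.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        mse = sum(e * e for e in errors) / n
        if mse < tolerance:  # acceptable level of performance reached
            break
        # The "teacher's" correction: step against the gradient of the error.
        w -= learning_rate * 2 * sum(e * x for e, x in zip(errors, xs)) / n
        b -= learning_rate * 2 * sum(errors) / n
    return w, b

# Hypothetical training set generated from y = 2x + 1.
X = [0.0, 1.0, 2.0, 3.0, 4.0]
Y = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(X, Y)
prediction = w * 5.0 + b  # predict Y for a new, unseen input X = 5
```

Here the learned `w` and `b` play the role of the mapping function f, and `prediction` is the estimated output for an input the algorithm never saw during training.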
Reinforcement machine learning
This type of machine learning allows machines and software agents to automatically determine ideal behaviours within a specific context to maximize performance. For an agent to learn its behaviours, all that’s required is simple reward feedback. This category of algorithms shows great promise. It includes techniques like Q-learning, a model-free reinforcement learning technique that can be used, for example, to teach a robot to pick up a device from a specific box and put it in a container. Whether it succeeds or fails, the robot memorizes the object, gains knowledge and trains itself to do this job with great speed and precision.
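To make the idea of learning from reward feedback concrete, here is a minimal sketch of tabular Q-learning. The environment is a made-up five-cell corridor, far simpler than a robot picking up objects, in which the agent is rewarded only when it reaches the last cell. The function name `train_q_table` and the values of `alpha`, `gamma` and `epsilon` are assumptions for illustration.

```python
import random

def train_q_table(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical corridor: reward 1 for reaching the last cell."""
    rng = random.Random(seed)
    actions = (-1, +1)  # step left or step right
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        while state != n_states - 1:
            # Explore occasionally (or on ties), otherwise exploit the best known action.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state = min(max(state + actions[a], 0), n_states - 1)
            reward = 1.0 if next_state == n_states - 1 else 0.0
            # Q-learning update: move the estimate toward reward + discounted best future value.
            q[state][a] += alpha * (reward + gamma * max(q[next_state]) - q[state][a])
            state = next_state
    return q

q_table = train_q_table()
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
```

The agent is never told which way to go. It only receives a reward at the end, and the table of values it builds from that feedback is enough to recover a sensible policy.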
Deep learning is a subset of machine learning that can be applied to all three paradigms using artificial neural networks (ANNs). ANNs are composed of multiple nodes that imitate the biological neurons of the human brain. The nodes are connected by links and interact with each other. Each node takes input data and performs simple operations on it, and the result of these operations is passed to other nodes. The output at each node is called its “activation value” or “node value”. Each link is associated with a weight. ANNs are capable of learning, which takes place by altering the weight values.
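The sketch below shows a single node in isolation: it computes an activation value from weighted inputs, and it learns by having its weights altered to reduce the error. A real ANN chains many such nodes together. The sigmoid activation, the logical OR task and the names `activation` and `train_node` are assumptions chosen to keep the example small.

```python
import math

def activation(weights, bias, inputs):
    """Weighted sum of the inputs passed through a sigmoid: the node's activation value."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def train_node(samples, epochs=5000, learning_rate=0.5):
    """Learn by repeatedly nudging the link weights to reduce the output error."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            out = activation(weights, bias, inputs)
            # Gradient of the squared error through the sigmoid.
            delta = (out - target) * out * (1.0 - out)
            weights = [w - learning_rate * delta * x for w, x in zip(weights, inputs)]
            bias -= learning_rate * delta
    return weights, bias

# Hypothetical task: a single node learning the logical OR of two inputs.
OR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_node(OR_SAMPLES)
outputs = [round(activation(weights, bias, x)) for x, _ in OR_SAMPLES]
```

After training, the rounded activation values in `outputs` can be compared with the targets in `OR_SAMPLES` to see whether the weight updates have captured the pattern.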
Convolutional neural networks are much like ordinary neural networks, but they are specialized to take images as input, so they are used very successfully for image recognition.
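What makes these networks suited to images is the convolution operation: a small grid of weights (a kernel) slides across the image and responds to local patterns such as edges. The sketch below applies one hand-picked kernel to a tiny made-up image. In a real network the kernel weights would be learned rather than chosen by hand, and the function name `convolve2d` is an assumption for this example.

```python
def convolve2d(image, kernel):
    """Slide a small kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Hypothetical 4x4 grayscale image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked vertical-edge kernel; a real network would learn these weights.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d(image, kernel)
# feature_map == [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The non-zero column in the feature map marks where the image changes from dark to bright, which is the kind of local feature later layers can build on for recognition.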
So, there you have it: a brief voyage along the branches of the tree of machine learning. It may take some time for that tree to mature within your organization, but the results can be incredible.
Be sure to bookmark this page and refer to it to help drive your machine learning journey.
Enrico Galimberti is the director of consulting services at Teradata Italy. He has broad experience in information data management and has developed significant expertise in a range of enterprise transformation services: enterprise data warehouse, big data transformation, analytic and BI solution implementation, business process re-engineering and IT operations improvement, using technology as a critical instrument for enabling change. Enrico has also supported CDOs and CIOs in implementing innovation and digital strategies.