Machine Learning VS Natural Language Processing

Introduction:

Technological advancements have transformed our daily lives. Emerging technologies such as Artificial Intelligence, Cyber Security, Machine Learning, and Natural Language Processing are becoming increasingly prevalent. Their results show up as new features in smartphones and laptops, automation devices, and more.

Artificial Intelligence is one of the most discussed of these technologies. Its disciplines are advanced and intricate, and terms such as Machine Learning and Natural Language Processing are often used interchangeably, even though they refer to distinct domains. This article explains the fundamental differences between Machine Learning and Natural Language Processing.

Machine Learning (ML) and Natural Language Processing (NLP) are both sub-domains of Artificial Intelligence (AI), which is a branch of Computer Science (CS). The Venn diagram depicted below illustrates the relationship among five main domains, namely CS, AI, ML, DL, and NLP.

CS = Computer Science

AI = Artificial Intelligence

ML = Machine Learning

DL = Deep Learning

NLP = Natural Language Processing



Artificial Intelligence:

Artificial Intelligence is the branch of Computer Science concerned with machines or systems that exhibit cognitive abilities, rational thinking, and decision-making that resemble human intelligence. It is a wide domain comprising many concepts and sub-domains. Machine Learning and Natural Language Processing are subsets of Artificial Intelligence.

Machine Learning:

What is Machine Learning?

Machine Learning can be considered a form of applied statistics. This sub-field of Artificial Intelligence uses statistical techniques to build applications that learn automatically from data provided in the form of structured observations. The main aim of a machine learning algorithm is to make an application learn autonomously from experience in order to improve accuracy on a specific task. Machine Learning algorithms are trained to find characteristic patterns in the data provided to them (the training data), and this learning is used to make decisions and predictions on new, unseen data. In general, the better the algorithm and the larger and more varied the training data, the better the learning and the more accurate the decisions and predictions on new data.

Basic Steps in building a Machine Learning Application:



  • Obtaining/Collecting Data

The fundamental step in developing a Machine Learning model is to collect appropriate data. In general, the larger and more varied the data, the more accurate the resulting predictions or decisions.

  • Preparing Data

Real-world data is often noisy and may contain unnecessary elements. It is crucial to provide the data in an organized format so that the model can be trained properly.

  • Training a model on the data using an algorithm

Based on the kind of problem and the output labels, an appropriate algorithm is chosen and used to train the model on the training data.

  • Testing the model

After the model has been trained successfully, it can be used to make predictions on new, unseen data. Evaluation measures such as Accuracy and True Positive Rate can be used to determine how good the model is.
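
As a rough illustration of the two measures named in this step, the sketch below computes Accuracy and True Positive Rate in plain Python. The function names and the toy label lists are made up for this example and are not taken from any particular library.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def true_positive_rate(y_true, y_pred, positive=1):
    """Fraction of actual positives that the model predicted as positive."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred)
                   if t == positive and p == positive)
    false_neg = sum(1 for t, p in zip(y_true, y_pred)
                    if t == positive and p != positive)
    actual_pos = true_pos + false_neg
    return true_pos / actual_pos if actual_pos else 0.0


# Hypothetical true labels and model predictions for eight test examples.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(accuracy(y_true, y_pred))            # 0.75 (6 of 8 correct)
print(true_positive_rate(y_true, y_pred))  # 0.75 (3 of 4 positives found)
```

Accuracy looks at all predictions, while True Positive Rate looks only at the examples that are actually positive, so the two can differ considerably when the classes are imbalanced.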

  • Improving the model

Training and testing Machine Learning models is an iterative process. The model's results are reviewed and, based on the observations, new techniques are applied to improve the model.

The main types of Machine Learning are explained as follows:

  • Supervised Learning:

      • One of the most common types of learning.

      • Used when the data is labeled, that is, when the expected output for each training example is known.

      • Can further be categorized into two types:

          • Regression (Output Label = Continuous)

          • Classification (Output Label = Discrete)

      • Examples: Decision Trees, Naïve Bayes, and K-Nearest Neighbors (KNN).
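
As a rough illustration of classification with labeled data, the sketch below implements a tiny K-Nearest Neighbors classifier in plain Python. The toy points, the labels "small" and "large", and the function name knn_predict are all made up for this example.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    # Pair every training point's distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical labeled training data: two well-separated groups of 2-D points.
train_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.5), (7.0, 7.0), (6.5, 6.0)]
train_y = ["small", "small", "small", "large", "large", "large"]

print(knn_predict(train_X, train_y, (1.2, 1.4)))  # small
print(knn_predict(train_X, train_y, (6.8, 6.4)))  # large
```

KNN does no real training: the labeled examples are stored and consulted at prediction time, which keeps the sketch short but scales poorly to large datasets.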

  • Unsupervised Learning:

      • Used when the data is not labeled.

      • Allows the model to work on its own to discover patterns and information that were not previously detected.

      • Types:

          • Clustering (To find natural groups/clusters in data)

          • Association (To establish association between data objects)

      • Examples: K-Means and Principal Component Analysis (PCA).
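
To make clustering concrete, here is a minimal K-Means sketch in plain Python. The data points, the hand-picked starting centroids, and the function name k_means are assumptions made for this example; real implementations usually choose the starting centroids randomly and stop once the assignments no longer change.

```python
import math


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(
                    tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
                )
            else:
                new_centroids.append(old)
        centroids = new_centroids
    return centroids, clusters


# Hypothetical unlabeled data with two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]
centroids, clusters = k_means(points, centroids=[(1.0, 1.0), (9.0, 8.5)])

print(centroids)  # [(1.5, 1.5), (8.5, 8.5)]
```

No labels are supplied anywhere: the two groups emerge only from the distances between the points.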

  • Semi-supervised Learning:

      • Makes use of both labeled and unlabeled data for training.

      • Generally uses a combination of a small amount of labeled data and a large amount of unlabeled data.

      • Examples: Self Training and Mixture models.
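
The idea behind Self Training can be sketched as follows: fit a model on the few labeled examples, label the unlabeled examples it is confident about, and repeat. The sketch below uses a deliberately simple one-dimensional nearest-centroid model. The margin rule, the data, and the function names are assumptions chosen for this example, and the sketch assumes at least two classes are present in the labeled data.

```python
def class_centroids(labeled):
    """Mean feature value per class for (value, label) pairs."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def self_train(labeled, unlabeled, margin=1.0, rounds=5):
    """Repeatedly pseudo-label the unlabeled values the model is confident about."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    for _ in range(rounds):
        centroids = class_centroids(labeled)
        confident, uncertain = [], []
        for x in remaining:
            ranked = sorted((abs(x - c), y) for y, c in centroids.items())
            # Confident only if the nearest centroid is clearly closer
            # than the runner-up.
            if ranked[1][0] - ranked[0][0] >= margin:
                confident.append((x, ranked[0][1]))
            else:
                uncertain.append(x)
        if not confident:
            break  # nothing new to learn from
        labeled.extend(confident)
        remaining = uncertain
    return labeled, remaining


# Two labeled examples and four unlabeled ones (hypothetical values).
labeled, remaining = self_train(
    labeled=[(1.0, "low"), (9.0, "high")],
    unlabeled=[2.0, 8.0, 4.8, 5.2],
)
print(labeled)
print(remaining)
```

The values close to a labeled example receive pseudo-labels, while the ambiguous values in the middle stay unlabeled instead of being guessed.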

  • Reinforcement Learning:

      • Reinforcement Learning algorithms learn how to act in a specific environment.

      • Learning is done based on interaction with and feedback from the environment.

      • Examples: Q-learning, where the environment is typically modeled as a Markov Decision Process.
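
Learning from interaction and feedback can be illustrated with a small tabular Q-learning sketch. The five-cell corridor environment, the reward of 1 at the right end, and the parameter values are all assumptions made for this example. The agent explores with purely random moves, which is enough here because Q-learning estimates the value of the best action regardless of how it explores.

```python
import random

N_STATES = 5        # states 0..4; reaching state 4 ends the episode
ACTIONS = (-1, +1)  # move left or move right


def step(state, action):
    """Environment feedback: next state, reward, and whether the goal was reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            action = rng.choice(ACTIONS)  # purely exploratory behaviour
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: move the estimate toward
            # reward + discounted value of the best next action.
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)]
            )
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best known action for every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]


print(greedy_policy(q_learning()))  # [1, 1, 1, 1], i.e. always move right
```

The agent is never told that moving right is correct. It infers this from the reward it receives at the goal, which propagates backwards through the table of Q-values.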

Natural Language Processing: