Artificial General Intelligence – Pro et Contra

AGI

“There is a thin domain of research that, while having ambitious goals of making progress towards human-level intelligence, is also sufficiently grounded in science and engineering methodologies to bring real progress in technology. That’s the sweet spot.” – Yann LeCun

Mind Map

Before we delve into the benefits of AGI, let us first look at what AGI is:

Artificial General Intelligence is based on four key principles:

  • The essence of intelligence is thought, i.e., rational deliberation, which is necessarily sequential
  • The ideal model of thought is logical inference based on concepts
  • Perception is at a lower level than thought
  • Intelligence is based on ontology

Computers with AGI can think, comprehend, learn and apply AI techniques to solve real life challenges. AGI can handle unfamiliar problems. It is referred to as deep or strong AI. The capabilities of AGI are listed below:

  • Sensory perception
  • Fine motor skills
  • Natural Language Understanding
  • Natural Language Processing

Now let us look at the benefits of AGI:

  • AGI could provide solutions to the world’s problems related to health, hunger, and poverty.
  • AGI can automate processes and improve efficiency in companies.
  • AGI can execute tasks without manual supervision.
Singularity

AGI has problems and disadvantages as mentioned below:

  • Singularity can be the effect of AGI
  • AI can be destructive to mankind
  • It can be a weapon for human extinction
  • AGI can evolve without any rules or principles

Note: The singularity is a hypothetical point in time when technological growth becomes uncontrollable and irreversible, resulting in unforeseeable changes to mankind.

“The robot has some objective and pursues it brilliantly to the destruction of mankind. And it’s because it’s the wrong objective. It’s the old King Midas problem. We’ve got to get the right objective,” he explains, “and since we don’t seem to know how to program it, the right answer seems to be that the robot should learn – from interacting with and watching humans – what it is humans care about.” – Stuart Russell

The King Midas problem refers to the “Golden Touch” myth. In Greek mythology, Midas, a rich and greedy king, acquired the ability to turn everything he touched into gold. No sooner had he received the gift than everything he touched was transformed into gold, including his daughter.

One proposed safeguard against the singularity is to develop a simulation tool that models an AGI machine under different techniques. Such a simulation could help in predicting how AGI would behave toward mankind.

Simulation

Regression

There are different methods of Regression used in machine learning. The different techniques are listed below:

  1. Linear Regression
  2. Polynomial Regression
  3. Ridge Regression
  4. Lasso Regression
  5. Non Parametric Regression
  6. K-Nearest Neighbor Regression
  7. Kernel Regression

Regression is also categorized by the number of explanatory variables: simple regression has a single explanatory variable, and multiple regression has more than one.

Regression types

                      

 In the next section, linear regression is discussed in detail.

 Linear Regression


                                  

Linear regression is a very popular modeling method. It relates a dependent variable to one or more independent variables. The dependent variable is continuous, while the independent variables can be continuous or discrete. In linear regression, the independent variable (Z) and the dependent variable (W) are used to identify the relationship between them. That relationship is modeled as the best-fit straight line, which is also referred to as the regression line.

It is represented by the equation W = mZ + c + err, where c is the intercept, m is the slope of the line, and err is the error term. The fitted function is used to predict the value of W for a given Z. Simple linear regression has a single independent variable.

Multiple linear regression has more than one independent variable. In that case, the task is to find the best linear fit relating the dependent variable to all of the independent variables.

The least squares method is used to find the fit in both simple and multiple linear regression. The method minimizes the sum of the squares of the differences between each point and the line. The deviations are squared before being added so that positive and negative values do not cancel out.
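To make the least squares idea concrete, here is a minimal sketch in plain Python. It is not the article's Linear_Regression.py, only an illustration of the closed-form fit for W = mZ + c with one independent variable. The function name `fit_line` and the sample points are assumptions made for this example.

```python
def fit_line(z_values, w_values):
    """Return (m, c) minimizing the sum of squared residuals for W = m*Z + c."""
    n = len(z_values)
    z_mean = sum(z_values) / n
    w_mean = sum(w_values) / n
    # Slope: co-variation of Z and W divided by the variation of Z.
    s_zw = sum((z - z_mean) * (w - w_mean) for z, w in zip(z_values, w_values))
    s_zz = sum((z - z_mean) ** 2 for z in z_values)
    m = s_zw / s_zz
    # Intercept: the fitted line passes through the point of means.
    c = w_mean - m * z_mean
    return m, c


# Made-up points that lie exactly on W = 2Z + 1, so err is zero everywhere.
z = [0.0, 1.0, 2.0, 3.0, 4.0]
w = [1.0, 3.0, 5.0, 7.0, 9.0]
m, c = fit_line(z, w)
print(m, c)
```

With noisy data the same function returns the slope and intercept that give the smallest total squared deviation.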

Code Snippet : Linear_Regression.py

Instructions for Running the Code

pip install numpy

pip install tensorflow

python Linear_Regression.py

Output of the code Execution

Classification


Classification plays an important role in pharma, health care, economics, and other fields. Huge data repositories exist in these domains and are used for planning and innovation. Patterns are identified and analyzed for forecasting and prediction.

A good example of a classification task is the analysis of X-ray images. Labels are assigned for disease characteristics such as tumors. The label values can be yes and no. Automated image analysis helps bring down the time needed to analyze X-ray images.

X Ray Images

The classifier executes the classification algorithm, which should be fast and precise. The training data set is small to start with and covers all the features of the model. For X-ray images, these features are X-ray parameters such as bones, head, and other body parts.

  Xray Image analysis

                          

Training can follow different learning patterns. The classification algorithm takes a feature vector as input, which represents the features in numeric form. Suppose the goal is to classify images of dogs into different classes based on a set of features. The feature vector would then consist of size, appearance, purpose, and hair color.
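As a rough illustration of turning such features into numbers, the sketch below builds a feature vector for a dog. The category lists, the function names, and the choice of one-hot encoding are assumptions made for this example, not something the article prescribes.

```python
# Hypothetical category lists, invented for this illustration.
HAIR_COLORS = ["black", "brown", "white"]
PURPOSES = ["herding", "sporting", "toy"]


def one_hot(value, categories):
    """Encode a categorical value as a list with a single 1.0 entry."""
    return [1.0 if value == category else 0.0 for category in categories]


def dog_feature_vector(size_cm, hair_color, purpose):
    """Concatenate the numeric size with one-hot encoded categorical features."""
    return [float(size_cm)] + one_hot(hair_color, HAIR_COLORS) + one_hot(purpose, PURPOSES)


vector = dog_feature_vector(55, "brown", "herding")
print(vector)
```

Each position in the resulting list has a fixed meaning, which is what lets a classifier compare one dog with another.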

The techniques used for the classification are presented in the next section.

Linear Regression

Linear regression can be pressed into service as a classifier by modeling the relationship between the observed parameters and a numeric label. The observations are numerically fit to a line using simple linear regression. The line is drawn to be the best fit, that is, the one closest to the points.

Consider a scenario where one group of values is labeled Yes (value 1) and the other is labeled No (value 0). Linear regression might fail to classify such data, as shown in the picture below.

Linear Regression

                             

Perceptron

A perceptron is an algorithm for binary classification. It takes as input data that belongs to one of two classes. Its output is a linear partition that separates one class from the other. A binary classifier assigns each data element a boolean label such as yes or no.
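The following is a minimal sketch of the classic perceptron learning rule, trained on the logical AND function as a made-up, linearly separable data set. The function names, the learning rate, and the number of epochs are assumptions chosen for the illustration.

```python
def predict(weights, bias, x):
    """Return 1 if the point lies on the positive side of the line, else 0."""
    activation = sum(w_i * x_i for w_i, x_i in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, labels, epochs=20, learning_rate=1.0):
    """Adjust weights and bias only when a training point is misclassified."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            error = target - predict(weights, bias, x)  # -1, 0, or 1
            weights = [w_i + learning_rate * error * x_i
                       for w_i, x_i in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Logical AND: only the point (1, 1) is labeled 1 (yes).
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]
weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
print(predictions)
```

The rule is only guaranteed to settle on a separating line when the two classes are linearly separable.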

Perceptron

                  

Naive Bayes Classifier

A Naive Bayes classifier is based on Bayes' theorem. According to Bayes' theorem, the probability of an event C happening, given that D has occurred, can be calculated. D is the evidence for C happening, and C is referred to as the hypothesis. The features (predictors) are assumed to be independent of each other. The algorithm is referred to as naive because it assumes that one feature does not affect another.

The Bayesian posterior probability depends on the prior, the likelihood, and the evidence (data).

P(C/D)  = (P(D/C) P(C) )  / P(D)

           

Naive Bayes Method

As an example of Naive Bayes classification, consider objects that need to be classified by color as YELLOW or BLUE. New objects need to be classified as they arrive. The class label is assigned based on the objects already observed. We compare the number of BLUE objects observed with the number of YELLOW ones. Suppose there are three times as many BLUE objects as YELLOW. A new case is then three times as likely to be labeled BLUE as YELLOW. Bayesian analysis refers to this as the prior probability. The previous observations determine the prior probabilities, and the BLUE and YELLOW percentages are used to predict outcomes.

Suppose there are 80 objects in total, 60 of which are BLUE and 20 of which are YELLOW. The prior probabilities of class membership are:

Prior probability for BLUE  = 60/80 = 3/4 = 0.75

Prior probability for YELLOW = 20/80 = 1/4 = 0.25

A new object (white circle) needs to be classified, as shown in the picture above. The naive Bayes classifier combines the prior probabilities with the likelihood of this new object under each class. The number of points of each color near the new object is used for calculating the likelihood of the object being BLUE or YELLOW. The likelihood of the object given YELLOW is higher than the likelihood given BLUE.

Naive Bayes Classifier

We draw a circle around the white object and count the BLUE and YELLOW objects inside it. The circle contains 5 BLUE and 10 YELLOW objects. The likelihood of the object given BLUE is therefore 5/60, and given YELLOW it is 10/20. Multiplying each likelihood by its prior gives 0.75 x 5/60 = 0.0625 for BLUE and 0.25 x 10/20 = 0.125 for YELLOW, so the white object is classified as YELLOW.
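The counts in this example can be combined into a score for each class. The sketch below assumes the likelihood of a class is estimated as the number of nearby objects of that color divided by the total number of objects of that color. The variable names are invented for the illustration.

```python
from fractions import Fraction

# Counts taken from the example: all observed objects, and those inside the circle.
totals = {"BLUE": 60, "YELLOW": 20}
nearby = {"BLUE": 5, "YELLOW": 10}
n_all = sum(totals.values())

scores = {}
for label in totals:
    prior = Fraction(totals[label], n_all)               # share of all objects
    likelihood = Fraction(nearby[label], totals[label])  # share found near the new object
    scores[label] = prior * likelihood                   # proportional to the posterior

predicted = max(scores, key=scores.get)
print(scores, predicted)
```

Although BLUE has the larger prior, the evidence near the new object tips the decision toward YELLOW.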

Likelihood Analysis

                         

Let us take another example that applies Bayes' theorem. A deck consists of 52 cards. The goal is to find the probability of a card being a Queen.

The total number of cards in the deck is 52, and the number of Queens is 4. The probability of a card being a Queen is:

P(Queen) = 4/52 = 1/13

The probability of a card being a Queen given that it is a face card can be calculated using Bayes' theorem.

P(Queen/Face) = (P(Face/Queen) P(Queen)) / P(Face)

Every Queen is a face card, and a standard deck has 12 face cards (Jack, Queen, and King in each of the four suits), so:

P(Face/Queen) = 1

P(Queen) = 1/13

P(Face) = 12/52 = 3/13

P(Queen/Face) = (1 x (1/13)) / (3/13) = 1/3
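As a check on this kind of calculation, the sketch below enumerates a standard 52-card deck and applies Bayes' theorem with exact fractions. It assumes the usual definition of face cards as Jack, Queen, and King, which gives 12 face cards in the deck. The variable names are made up for the example.

```python
from fractions import Fraction

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]
face_ranks = {"J", "Q", "K"}  # assumption: aces are not counted as face cards

p_queen = Fraction(sum(1 for rank, _ in deck if rank == "Q"), len(deck))
p_face = Fraction(sum(1 for rank, _ in deck if rank in face_ranks), len(deck))
p_face_given_queen = Fraction(1)  # every Queen is a face card

# Bayes' theorem: P(Queen/Face) = P(Face/Queen) * P(Queen) / P(Face)
p_queen_given_face = p_face_given_queen * p_queen / p_face
print(p_queen, p_face, p_queen_given_face)
```

Counting directly gives the same answer: 4 of the 12 face cards are Queens.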

Decision Trees


A decision tree represents the classified groups as a tree. Among supervised classification methods, decision tree learning is very popular. The features come from domains that are finite and discrete, and each element of the target domain is called a class. Each non-leaf node of the tree is labeled with an input feature. The arcs coming out of a node are labeled with the possible values of that feature. The leaves of the tree are labeled with a class or with probability values over the classes.

To classify an input, the algorithm starts at the root and, at each node, follows the arc that matches the input's value for that feature. The algorithm stops when it reaches a leaf, which gives the classification.
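The structure described here can be sketched as nested dictionaries. The tree below is hand-built rather than learned, and its feature names, values, and leaf probabilities are all hypothetical.

```python
# Hypothetical tree: non-leaf nodes name a feature, arcs carry feature values,
# and leaves hold probability values for each class.
TREE = {
    "feature": "coat",
    "branches": {
        "long": {"leaf": {"herding": 0.7, "toy": 0.3}},
        "short": {
            "feature": "size",
            "branches": {
                "small": {"leaf": {"herding": 0.1, "toy": 0.9}},
                "large": {"leaf": {"herding": 0.8, "toy": 0.2}},
            },
        },
    },
}


def classify(node, example):
    """Follow the arcs from the root until a leaf is reached."""
    while "leaf" not in node:
        value = example[node["feature"]]
        node = node["branches"][value]
    class_probabilities = node["leaf"]
    # Report the most probable class at the leaf.
    return max(class_probabilities, key=class_probabilities.get)


label = classify(TREE, {"coat": "short", "size": "small"})
print(label)
```

A learning algorithm would choose the features and leaf probabilities from training data; here they are fixed by hand to show only the traversal.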

ML Modeling

“Machine learning will increase productivity throughout the supply chain.” ~Dave Waters

Training Models

The training data is labeled by domain experts, and the machine learning model is trained on the labeled data. Data that is ambiguous is evaluated and validated by the domain experts. The training data set is used for learning purposes.

Training flow

                      

Evaluation

Machine learning plays an important role in solving the complex problems. Machine Learning techniques are applied to develop learning models for forecasting. The machine learning models help in generating business value for the enterprise.

Model Evaluation

                                    

Once evaluated, the model is used for predictions. The learning model is used for forecasting, reporting, discovery, planning, optimization, and analysis purposes in the organization.

Model Usage

                                     

Machine learning models are built from the training data, but unseen data is what shows whether a model is effective. To validate the predictions and make the model trustworthy, we need additional unseen data. The model should not simply memorize the training data; it has to generalize in order to forecast future scenarios. The training data sets may or may not be linearly separable.

Nonlinear and linear separability

                     

Note: A linearly separable data set can be split by a line, plane, or hyperplane. The points of one class lie in one half-space, and the points of the other class lie in the other half-space.

Machine learning models are evaluated using measures such as the number of errors and the mean squared error. The performance of the model is very important for any machine learning engagement. The evaluation is based on predictions for unseen, out-of-sample data. The accuracy of the predictions is an important evaluation measure.
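The measures named above are simple to compute. The sketch below shows mean squared error for numeric forecasts, and the error count and accuracy for class labels. The function names and the sample values are made up for the illustration.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def error_count(actual_labels, predicted_labels):
    """Number of predictions that do not match the actual label."""
    return sum(1 for a, p in zip(actual_labels, predicted_labels) if a != p)


# Regression-style evaluation on made-up numbers.
mse = mean_squared_error([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])

# Classification-style evaluation on made-up labels.
actual = ["yes", "no", "yes", "no"]
predicted = ["yes", "yes", "yes", "no"]
errors = error_count(actual, predicted)
accuracy = 1 - errors / len(actual)
print(mse, errors, accuracy)
```

Both measures should be computed on data the model did not see during training.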

The model’s evaluation is based on two methods:

  1. Hold out
  2. Cross Validation

Hold out

A separate test data set is a prerequisite for evaluating a model. The data used for developing the model needs to be different from the test data set. Otherwise, the prediction algorithm may simply hold in memory the label for each training point and score well without generalizing. This scenario is called overfitting. Hold-out evaluation tests the model on unseen data instead of only the data it was trained on. The effectiveness of the learning model is measured by its accuracy on the unseen data. In the hold-out method, the data set is divided into three subsets:

  1. Training Set
  2. Validation Set
  3. Test set (Unseen data)
  Data Subsets

                                            

The training data set is used for building the forecasting models. The validation set is used for evaluating and tuning the learning model during the training phase. The test (unseen) data is used for estimating the future effectiveness of the model. The hold-out method is simple and fast. However, its results can have high variance, because the measured accuracy depends on which data points end up in each subset.
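A hold-out split can be sketched in a few lines. The 60/20/20 proportions, the fixed random seed, and the function name are assumptions chosen for this example; the article does not specify them.

```python
import random


def hold_out_split(data, train_fraction=0.6, validation_fraction=0.2, seed=0):
    """Shuffle the data and cut it into training, validation, and test subsets."""
    items = list(data)
    random.Random(seed).shuffle(items)  # seeded so the split is repeatable
    n_train = round(len(items) * train_fraction)
    n_validation = round(len(items) * validation_fraction)
    train_set = items[:n_train]
    validation_set = items[n_train:n_train + n_validation]
    test_set = items[n_train + n_validation:]  # the remainder stays unseen
    return train_set, validation_set, test_set


train_set, validation_set, test_set = hold_out_split(range(10))
print(len(train_set), len(validation_set), len(test_set))
```

Changing the seed changes which points land in the test subset, which is one source of the variance in hold-out results.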

Cross Validation

Cross validation separates the data used for evaluation from the data used for training. The training data set is used for model learning and training. The held-out data set is used for evaluating the effectiveness of the model.

K-fold cross-validation is one of the cross-validation methods. The data set is divided into k subsets, which are referred to as folds. Typical values of k are 5 and 10. Each of the subsets is used in turn for testing and validating the model. The model performance is the average error over the k subsets.

In four-fold cross-validation, the data is separated into 4 subsets. The models are trained step by step. The first model uses the first subset for testing and the other subsets for training. This is repeated for each of the 4 subsets. The effectiveness of the model is measured over 4 trials, one per fold. Every data point is used for testing exactly once and for training in k-1 trials. Because all of the data is used for fitting, the bias of the error estimate comes down. Averaging over the folds also reduces the variance, since every point serves in both the testing and the training role.
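The fold bookkeeping can be sketched as follows. To keep the example self-contained, the model being evaluated is a deliberately trivial one that always predicts the mean of its training values. That model, the function names, and the data are assumptions for illustration only.

```python
def k_fold_indices(n_samples, k):
    """Return k (train_indices, test_indices) pairs covering every sample once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        folds.append((train_idx, test_idx))
        start += size
    return folds


def cross_validate(values, k):
    """Average test MSE of a predict-the-training-mean model over k folds."""
    fold_errors = []
    for train_idx, test_idx in k_fold_indices(len(values), k):
        mean = sum(values[i] for i in train_idx) / len(train_idx)
        mse = sum((values[i] - mean) ** 2 for i in test_idx) / len(test_idx)
        fold_errors.append(mse)
    return sum(fold_errors) / k


folds = k_fold_indices(8, 4)
average_error = cross_validate([2.0, 4.0, 6.0, 8.0, 2.0, 4.0, 6.0, 8.0], 4)
print(folds[0], average_error)
```

In practice the data is usually shuffled before the folds are cut, which this sketch leaves out for brevity.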

In the next blog article, we look into different types of machine learning algorithms, such as supervised and unsupervised learning.

IASA : AI Architecture Training Program

IASA Global was established in 2002. IASA is an international, non-profit business association dedicated to the advancement and sharing of issues related to software architecture in the enterprise, product, education and government sectors. They are committed to improving the quality of the IT architecture industry by developing and delivering standards, education programs, and accreditation programs and services that optimize the development of the architecture profession. IASA Global has created the world’s first and only ITABoK (IT Architecture Body of Knowledge), which contains 250 skill sets that are critical for every business and IT professional to possess in order to deliver the strategic value of technology to the business.

The IASA Architect AI Architecture Training Program is a basic course on AI enterprise architecture. This program is a defined baseline for successful IT architects who are implementing AI in enterprises. This initiative involves the advancement of best practices and education while delivering AI Enterprise programs and services to IT architects of all levels around the world.

The AI Architecture certification helps in ensuring that you are on the Enterprise Architect path. This demonstrates that you are taking the necessary steps to become a fully qualified architect to create an AI architecture. AI architecture has important factors such as the selection of machine learning frameworks and scalable solutions for automation. The AI reference architecture typically shows a workflow for automation solutions. Many AI frameworks and platforms such as TensorFlow, Keras, NLTK, PyTorch, Google AI, IBM Watson, Microsoft Azure ML, and Amazon SageMaker are evolving and changing features rapidly. The AI architecture needs to have the flexibility and adaptability to handle this change. AI architecture helps in scaling, delivering speed and automating processes in the organization.

The AI architecture course explains the machine learning workflows and capabilities such as feature extraction, training, analytics, data collection, data analysis, data selection, project packaging, model tuning, evaluation, inference, validation, and deployment. The course will help in architecting AI applications for recommendation, forecasting, video analysis, image analysis, text analytics, document analysis, voice to text, speech recognition, search, conversational agents, translation, intelligent assistants, and transcription. Related topics include NLP/NLU, deep learning, Knowledge Studio, data refinery, IoT platforms, machine learning, natural language classifiers, knowledge mining, cognitive search, decision-making applications, bots, and robotic process automation.

In daily life, we come across many applications while working with customers and enterprises. The typical use cases where AI Architecture will help are:

  • Spam & Email – Filtering & User preferences based content analysis
  • Predictive Analytics – Credit Worthiness and Loan Applications
  • OCR : Pattern Recognition – Text, Images, Video and Audio
  • Biometrics: Identity Management & Security
  • Machine Learning Models: Life Insurance – Mortality rates, life expectancy
  • Medical Expense Prediction Model: patient history & medical claim history
  • Coverage Risk model: Liability & Property Insurance
  • Fraud Detection: Credit Card usage and activity patterns
  • Social Network Analysis: Relationship & Influence Analysis

Ecommerce websites use AI techniques and methods in their implementation. They have the following AI-related features:

  • Historical data related to customer transactions analysed for customer demographics
  • Shopping carts of the customer analysed for abandonment
  • Price analysis of the products using the historical data
  • Next best action for the customer based on their preferences and previous purchases
  • Web page analytics related to customer browsing time for a product
  • Customer information related to profile, billing, and shipping addresses analysed for demographics 
  • Referral websites tracked by the customer views and click stream analysis
  • Patterns related to customer rating and reviews of the products
  • Marketing campaign effectiveness based on email, sms and web channels
  • Recommendations based on customer history related to browsing, usage and behavior.
  • Conversion of the shopping from view to a buy – analysis 

Recommendations made to the customer, whether derived from other customers or from the merchant, are analysed using the approaches listed below:

  • Collaborative Filtering
  • Content based Filtering
  • Train Matchbox Recommendation
  • Score Matchbox Recommendation

AI Modeling and Architectural development involves identifying modeling techniques, selecting algorithms, designing tests, developing models, assessing models and training the models. The other methods like Ensemble techniques help in combining and selecting multiple approaches based on scenarios. The AI model is validated and tested before using for unseen scenarios.

Enterprises are keen to evaluate AI and machine learning techniques and to develop models for decision making using data science and algorithms. Enterprise leadership is interested in getting its architects trained through experiential learning and in avoiding failures by using reference architectures, patterns, and anti-patterns. RPA is another area that enterprises want to evaluate and implement alongside AI and machine learning, voice, and natural language processing algorithms. Leadership is also interested in domain-specific use cases where RPA has been successful.

Course Outline

The Iasa AI Architecture Course for IT Architects gives you an architect knowledge of Artificial Intelligence frameworks and tools for developing AI Enterprise IT architectures that meet the demands of modern business. The curriculum is developed by AI Enterprise architects for IT architects, which is one of the basic ideas behind Iasa. The course focuses on: 

  • Data Requirements: discovering and understanding the demands and needs of the business.
  • Data Modeling Principles: The principles followed by AI Architects related to data modeling.
  • Business Case: To find the return on the investment and justify the need for AI Architecture.
  • Machine Learning Solutions: Machine learning solutions for AI Architectural requirements – Predictive analytics, pattern detection, regression model, and recommendations
  • AI Architecture Practice: How best to build the practice of architecture within your organization

Target Audience

Iasa’s basic course is aimed at IT Architects who want to become involved in AI architectural work. You should have basic knowledge and experience of system development and AI. You work as a developer, project manager, information modeler, or process developer in AI Architecture projects.

Course Materials

The course will have around 20 to 30 Certification Questions. The workshop will have an assignment that will be included in the course. The assignment will have an application using machine learning algorithms. The course will have a workbook where the participant can apply the course concepts in their organization. By the end of the course, the workbook will help the participant to chart out the AI Strategy in the organization.

Course Modules

Module 00: Course Introduction

In the course introduction, we will cover the background and details of the Core Pre-Work and the basis of the ITABoK as well as working descriptions of AI Architecture and AI Architecture practices.

Content:

  • Course Schedule Review
  • Data Architecture Challenges
  • Iasa Proposition
  • What is AI architecture?
  • Machine Learning Process

Module 1: AI Architecture – Data Requirements

This module covers data sources, data formats, data mapping and user options for data analysis.

Content:

  • Data Sources
  • Data Formats
  • Data Mapping
  • Data Analysis Options

 For AI, data formats like image, video, text, and audio are covered. Different data sources like social media, blogs, news feeds, media websites, and other sources are presented in this module.

Module 2: AI Architecture – Data Modelling Principles

In architect engagement, we cover the data modeling principles and selection process for the right model. This module explains the training and evaluation of the data model.

Content:

  • Data Model selection
  • Training
  • Evaluation
  • Parameter Tuning

In this module, machine learning models, training, and validation parameters are discussed. Different machine learning algorithms are covered, such as classification, regression, neural network, decision tree, and random forest techniques. The learning methods covered are supervised, unsupervised, and reinforcement techniques.

Module 3: Business Case

Iasa fully covers the ‘demand’ or business drivers that underlie AI architecture. This module explains the business value in driving AI Technology strategy through business decisions. We explore different business models, customer-driven architecture, and many other aspects of the business domain. The key aspects which are explored are innovation, automation, business value generation, and technology skills required by the engineering resources.

Content:

  • Business Model
  • Business Value Generation
  • Business Capabilities
  • Process Automation
  • Creating AI skills-based Teams

In this module, AI technology benefits are discussed through business case modeling, value generation, capabilities, automation, and AI engineering topics. The takeaways for this module will be the business benefits of rapid analysis prediction & processing, accurate forecasts, cut down in the process time due to automation and improvement in compliance.

Module 4: AI Architecture – Machine Learning Solutions

When describing AI architecture, we work through the details of tracing the business value decisions related to the machine learning solutions. We look at the value streams, capabilities, applications affected by AI, design tradeoff analysis, architecture assessment methods, viewpoints, and architecturally significant requirements.

Content:

  • Automated Advisors
  • Conversational AI solutions
  • Fraud Detection Solutions
  • Compliance solutions
  • Predictive Analytics

In this module, Predictive analytics, pattern detection, regression model, and recommendations are discussed in detail. The takeaways for this module will be real-time business decision making, eliminating manual tasks, enhancing security, reducing operating expenses, and improving business benefits through AI for your organization.

Module 5: AI Architecture Practice

The final module summarizes and expands on the AI Architecture process and the adoption of the set of programs and skills for the organization. The skill assessment provides a solid foundation for the student to review and understand their current AI skills and to create a growth plan for their future.

Content:

  • Growing Competencies
  • Skills Assessment
  • Handling the Automation Issues
  • Maturity Assessment
  • Engagement Model

The takeaway for this module will be to create an enterprise AI roadmap with assessments of the current team and skills of required team members. The student will be able to chart out a center of excellence or competency center development strategy by training and developing the engineers of their organization. The skill requirements will be based on the AI architectural platform. After this course, the student will be able to pick the right AI platform from IBM, Microsoft, Amazon, and Google AI platforms.

Course Summary

Exam Information

The Certified IT Architect – AI Architecture (CITA-AIA) credential is awarded to those who qualify based on a combination of criteria including education, experience and test-based examination of professional knowledge of AI architectural skills and management.

The CITA-AIA credential is awarded by achieving a 70% or higher on the CITA-AIA examination. The exam consists of 75 multiple-choice/true/false questions.

The AI Architecture exam is available online, anytime, via Iasa’s Learning Management System. If attending an onsite course, the exam is proctored on the last day. If attending an online course, access is given on the last day of the course as well. Students will be given 2.5 hours each to complete the exam.

Watch out for the course announcements from IASA Global regarding this AI Course.

Types of Machine learning

Supervised Learning

Supervised learning creates a model from historical data that can be used for forecasting on unseen data. The machine learning technique reads the input data set together with the expected output data. The model is trained to forecast the outputs for new scenarios.

The supervised machine learning can be categorized as:

  1. Regression
  2. Classification

The fitting of the data is done in the Regression method. The data is partitioned in the Classification method. Supervised learning is very popular in the machine learning space.

In supervised learning, the input variable Z is transformed by the mapping function g to produce the output variable W.

W = g(Z)

The mapping function is then used to forecast the output variable W for new input data Z. The aim is to find the mapping function. The method is referred to as supervised learning because it resembles a manager supervising an employee's learning process. The supervisor checks the training process and the forecasts on the training data set. The supervisor also validates the outputs for unseen data, and the technique targets a set goal for effectiveness.
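One very simple way to approximate a mapping function g from labeled examples is a nearest-neighbor lookup. The sketch below is only one possible choice of g, and the training pairs and function name are invented for the illustration.

```python
def nearest_neighbor_predict(training_pairs, z_new):
    """Approximate g(z_new) by the output of the closest training input."""
    closest_z, closest_w = min(training_pairs, key=lambda pair: abs(pair[0] - z_new))
    return closest_w


# Made-up labeled examples: each pair is (input Z, expected output W).
training_pairs = [(1.0, "small"), (2.0, "small"), (8.0, "large"), (9.0, "large")]

prediction = nearest_neighbor_predict(training_pairs, 7.5)
print(prediction)
```

Here the expected outputs in the training pairs play the role of the supervisor: they tell the method what the right answer was for each known input.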

Let us look at the examples for classification in the following section. The first example is related to classification of dogs.

Classification of Dogs

There are different types of dogs. Dogs can be classified into the following groups.

  1. Herding
  2. Sporting
  3. NonSporting
  4. Working
  5. Hounds
  6. Terriers
  7. Toy

Dogs have different characteristics, and each group has a set of features that are used to identify the dog. This is a good example of supervised learning, where we have to classify dog images into various groups based on features.

There are around 560 breeds of dogs presented in the  word cloud  below:

Dog Breed Word cloud

                              

Another example is classification of cats.

Classification of Cats

Below is the word cloud of 100 cat breeds. Each breed has different characteristics and features that can be used to categorize the images.

Cat breed word cloud

                           

Some of the features of a cat are body type, coat, and the pattern of the skin and coat. The shape of the face is another important factor for cat classification.

Note: A clowder is a group of cats. It is also referred to as a glaring, which is the apt word when the cats in the group are very different from each other. A kindle is a group of kittens.

In the case of regression, data is distributed across different dimensions, and information needs to be retrieved from it. The models need to evolve based on the data set, and the prediction errors need to be minimized. This is the regression method described above.

Dogs and cats problems have different challenges and learning is different in each case.  Features need to be analyzed and the models need to be fitted to the data available for prediction.

  Regression

                               

In machine learning terms, classification and regression are both part of a broader class called supervised learning. Machine learning has evolved with the data and processing power available at each particular time.

Classical Machine Learning

Classical Machine learning consists of different phases such as modeling, evaluation and methods such as supervised and unsupervised learning. There are different techniques within the supervised and unsupervised learning which are presented in the next sections.


Machine learning refers to code that learns its logic implicitly from data rather than having it explicitly programmed. The input for the code is the data provided for training and learning purposes. Machine learning is a part of computer science and is related to artificial intelligence. The data is gathered, staged, and cleansed for training and learning purposes.

Modelling

The real world has different workflows and procedures that can be modeled using mathematics. A machine learning model is based on the mathematical model of the procedure. Learning is achieved by using the data provided. Data is collated from databases and devices and is ingested from different data sources.

Data is transformed, normalized, and cleansed before the data set is created for learning. Data is analyzed and patterns are identified for forecasting. Data set features are analyzed and identified for feature set creation.
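As a small illustration of the cleansing and normalizing steps, the sketch below drops missing readings and rescales the rest to the range 0 to 1. Min-max scaling is one common choice rather than the only one, and the function name and raw values are assumptions for the example.

```python
def min_max_normalize(values):
    """Rescale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    low, high = min(values), max(values)
    if high == low:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(value - low) / (high - low) for value in values]


# Made-up raw readings; None marks a missing value.
raw = [10.0, 20.0, None, 40.0]

cleansed = [value for value in raw if value is not None]  # cleansing step
normalized = min_max_normalize(cleansed)                  # normalization step
print(cleansed, normalized)
```

Putting features on a comparable scale keeps a feature with large raw numbers from dominating the others during learning.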

Different sets of features from the data are used for selection of the approach. For example, for regression the complexity and the degree of the polynomial are the key factors. The model based on mathematics is chosen from a group of candidates.  Most of the time, the simplest model is the best one for prediction and forecasting.

“We consider it a good principle to explain the phenomena by the simplest hypothesis possible”. – Ptolemy

 Models can be selected from different approaches such as listed below:

  • Support Vector Machine
  • Logistic Regression
  • Others

Machine learning algorithms are categorized into three types:

  1. Supervised Machine Learning
  2. Unsupervised Machine Learning
  3. Reinforcement Learning

Before we look at different types of machine learning algorithms, let us look at the machine learning models, features and model creation, training and evaluation of the models in the next blog article.