“There is a thin domain of research that, while having ambitious goals of making progress towards human-level intelligence, is also sufficiently grounded in science and engineering methodologies to bring real progress in technology. That’s the sweet spot.” – Yann LeCun
Before we delve into the benefits of AGI, let us first look at what AGI is:
Artificial General Intelligence is based on four key principles:
The essence of intelligence is thought, i.e., rational deliberation, which is necessarily sequential
The ideal model of thought is logical inference based on concepts
Perception is at a lower level than thought
Intelligence is based on ontology
Computers with AGI can think, comprehend, learn and apply AI techniques to solve real life challenges. AGI can handle unfamiliar problems. It is referred to as deep or strong AI. The capabilities of AGI are listed below:
Fine motor skills
Natural Language Understanding
Natural Language Processing
Now let us look at the benefits of AGI:
AGI can provide solutions to the world’s problems related to health, hunger, and poverty.
AGI can automate processes and improve efficiency in companies
AGI can execute tasks without manual supervision
AGI also has problems and disadvantages, as listed below:
AGI could lead to the singularity
AI can be destructive to mankind
It can be a weapon for human extinction
AGI can evolve without any rules or principles
Note: The singularity is a hypothetical point in time when technological growth becomes uncontrollable and irreversible, resulting in unforeseeable changes to mankind.
“The robot has some objective and pursues it brilliantly to the destruction of mankind. And it’s because it’s the wrong objective. It’s the old King Midas problem. We’ve got to get the right objective,” he explains, “and since we don’t seem to know how to program it, the right answer seems to be that the robot should learn – from interacting with and watching humans – what it is humans care about.” – Stuart Russell
The King Midas problem refers to the “Golden Touch” myth. King Midas, a rich and greedy king in Greek mythology, acquired the ability to turn everything he touched into gold. He had hardly started using it when everything he touched, including his daughter, was transformed into gold.
One proposed solution for the singularity is to develop a simulation tool that simulates an AGI machine with different techniques. Such a simulation would help in predicting how AGI might behave towards mankind.
There are different methods of regression used in machine learning. The techniques are listed below:
Non Parametric Regression
K-Nearest Neighbor Regression
The type of regression depends on the number of explanatory variables: single (simple) or multiple.
In the next section, linear regression is discussed in detail.
Linear regression is a very popular modeling method. It involves dependent and independent variables. The dependent variable is continuous. Independent variables can be continuous or discrete. Linear regression identifies the relationship between the independent variable (Z) and the dependent variable (W). The relationship is modeled as a best-fit straight line, also referred to as the regression line.
It is represented by the equation W = mZ + c + err, where c is the intercept, m is the slope of the line and err is the error term. The fitted function W = mZ + c is used to predict the value of the dependent variable. Simple linear regression has a single independent variable.
Multiple linear regression has more than one independent variable. In that case, it finds the best linear fit relating the dependent variable to all of the independent variables.
The least squares method is used for finding the fit in both simple and multiple linear regression. The method minimizes the sum of the squares of the differences between each point and the fitted line. The deviations are squared before being added so that positive and negative values do not cancel out.
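To make the least squares idea concrete, here is a minimal Python sketch for the single-variable case W = mZ + c. The function names fit_line and sum_squared_error and the sample data are made up for illustration; the slope and intercept formulas are the standard closed-form least squares solution for one independent variable.

```python
def fit_line(z, w):
    """Return (m, c) that minimize the sum of squared errors of w = m*z + c."""
    n = len(z)
    if n != len(w) or n < 2:
        raise ValueError("need at least two paired observations")
    z_mean = sum(z) / n
    w_mean = sum(w) / n
    # Slope is the co-variation of z and w divided by the variation of z.
    s_zw = sum((zi - z_mean) * (wi - w_mean) for zi, wi in zip(z, w))
    s_zz = sum((zi - z_mean) ** 2 for zi in z)
    if s_zz == 0:
        raise ValueError("all z values are identical, slope is undefined")
    m = s_zw / s_zz
    # The least squares line always passes through the point of means.
    c = w_mean - m * z_mean
    return m, c


def sum_squared_error(z, w, m, c):
    """Sum of squared differences between each point and the line."""
    return sum((wi - (m * zi + c)) ** 2 for zi, wi in zip(z, w))


# Hypothetical observations scattered around the line w = 2z + 1.
z = [1.0, 2.0, 3.0, 4.0, 5.0]
w = [3.1, 4.9, 7.2, 8.8, 11.0]

m, c = fit_line(z, w)
print(f"slope={m:.2f}, intercept={c:.2f}")
print("fitted SSE:", sum_squared_error(z, w, m, c))
print("SSE for the guess m=2, c=1:", sum_squared_error(z, w, 2.0, 1.0))
```

Any other choice of slope and intercept gives a sum of squared errors at least as large as the fitted one, which is what "best fit" means here.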
In pharma, health care, economics and other fields, classification plays an important role. Huge data repositories exist in these domains and are used for planning and innovation. Patterns are identified and analyzed for forecasts and predictions.
A good example of a classification task is the analysis of X-ray images. Labels are assigned for disease characteristics such as tumors. The label values can be yes or no. Automated image analysis helps bring down the time needed to analyze X-ray images.
The classifier executes the classification algorithm, which should be fast and precise. The training data set is small to start with and covers all the parameters that are features of the model, such as bones, head, and other body parts in the X-ray images.
The learning patterns can differ from one training run to another. A feature vector is used as the input to the classification algorithm. The vector represents the features in numeric form. Let us say the goal is to classify images of dogs into different classes based on a set of features. The feature vector will consist of size, appearance, purpose, and hair color.
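As a rough illustration of how such features can be turned into numbers, the sketch below encodes a dog as a numeric feature vector. The category lists, the ordinal encoding for size and the one-hot encoding for the other features are all assumptions made for this example; "appearance" is left out because it would need a more precise definition before it could be encoded.

```python
# Hypothetical category lists, invented for this example.
SIZES = ["small", "medium", "large"]
PURPOSES = ["companion", "herding", "hunting"]
HAIR_COLORS = ["black", "brown", "white"]


def one_hot(value, categories):
    """Encode a category as a list with a single 1.0 at its position."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == category else 0.0 for category in categories]


def dog_feature_vector(size, purpose, hair_color):
    """Build a numeric feature vector from three descriptive features."""
    # Size has a natural order, so it is encoded as its rank (0, 1, 2).
    size_rank = float(SIZES.index(size))
    # Purpose and hair color have no order, so each gets a one-hot block.
    return [size_rank] + one_hot(purpose, PURPOSES) + one_hot(hair_color, HAIR_COLORS)


vector = dog_feature_vector("large", "herding", "brown")
print(vector)  # [2.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0]
```

One-hot encoding avoids implying an order between values such as hair colors, at the cost of a longer vector.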
The techniques used for the classification are presented in the next section.
Linear regression can be used as a simple classifying technique in which the relationships between the observed parameters are modeled. The observed parameters are numerically fit to a line using simple linear regression. The line is drawn to be the best fit, that is, the line closest to the points.
Consider a scenario where one group of values is labeled Yes, with value 1, and the other group is labeled No, with value 0. Linear regression might fail in classifying such data, as shown in the picture below.
A perceptron is an algorithm for binary classification. The algorithm takes input data that belongs to one of two classes. The output is a linear partition that separates one class from the other. A binary classifier labels the data elements with a boolean value such as yes or no.
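Here is a minimal perceptron sketch, assuming the classic perceptron learning rule (weights are nudged whenever a point is misclassified) and a toy data set: the logical AND of two inputs, chosen because it is linearly separable. The function names and the default parameter values are illustrative.

```python
def train_perceptron(samples, labels, epochs=20, learning_rate=1.0):
    """Learn weights and a bias that linearly separate two classes (0 and 1)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in zip(samples, labels):
            activation = sum(wi * xi for wi, xi in zip(weights, x)) + bias
            predicted = 1 if activation > 0 else 0
            # update is 0 for a correct prediction, +/- learning_rate otherwise.
            update = learning_rate * (target - predicted)
            if update != 0:
                weights = [wi + update * xi for wi, xi in zip(weights, x)]
                bias += update
                mistakes += 1
        if mistakes == 0:
            break  # every training point is on the correct side of the line
    return weights, bias


def predict(weights, bias, x):
    """Return 1 (yes) or 0 (no) depending on the side of the linear partition."""
    activation = sum(wi * xi for wi, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


# Toy data: logical AND, which a single straight line can separate.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]

weights, bias = train_perceptron(samples, labels)
print([predict(weights, bias, x) for x in samples])
```

If the two classes cannot be separated by a line, this loop simply runs out of epochs without finding a perfect partition.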
Naive Bayes Classifier
A Naive Bayes classifier is based on Bayes’ theorem. According to Bayes’ theorem, the probability of an event C happening, given that D has occurred, can be calculated. D is the evidence for C happening and C is referred to as the hypothesis. The features are assumed to be independent of each other given the class. The algorithm is referred to as naive because it assumes that one feature does not affect another.
The Bayesian posterior probability depends on the prior probability, the likelihood and the evidence (data).
P(C|D) = (P(D|C) P(C)) / P(D)
To give an example of Naive Bayes classification, let us look at objects that need to be classified by color as YELLOW or BLUE. New objects need to be classified as they arrive. The class label is applied based on the objects already observed. We compare the number of BLUE objects observed so far with the number of YELLOW objects. Let us say there are three times as many BLUE objects as YELLOW objects. A new case is then three times as likely to have the BLUE label as the YELLOW label. Bayesian analysis refers to this belief as the prior probability. The previous observations decide the prior probabilities, and the BLUE and YELLOW percentages are used for predicting outcomes.
Let us say there is a total of 80 objects, 60 of which are BLUE and 20 of which are YELLOW. The prior probabilities of class membership are:
Prior probability for BLUE = 60/80 = 3/4 = 0.75
Prior probability for YELLOW = 20/80 = 1/4 = 0.25
A new object (white circle) needs to be classified, as shown in the picture above. The naive Bayes classifier combines the prior probabilities with the likelihood of this new object under each class. The number of points of each color near the new object is used for calculating the probability of the new object being BLUE or YELLOW. The likelihood of the object given YELLOW is higher than the likelihood given BLUE.
We look at the circle around the white object to check how many BLUE and YELLOW objects it contains. The circle has 5 BLUE and 10 YELLOW objects. The membership of the new white object depends on both this local evidence and the overall numbers of YELLOW and BLUE objects observed so far.
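The numbers above can be combined in a few lines of Python. This sketch assumes that the likelihood of the new object under a class is the fraction of that class which falls inside the circle (5 of 60 for BLUE, 10 of 20 for YELLOW); that reading of the example, and the variable names, are assumptions made here.

```python
from fractions import Fraction

totals = {"BLUE": 60, "YELLOW": 20}      # all objects observed so far
in_circle = {"BLUE": 5, "YELLOW": 10}    # objects near the new white object

n_objects = sum(totals.values())

# Prior: share of each class among all observed objects.
priors = {c: Fraction(totals[c], n_objects) for c in totals}

# Likelihood (assumed): share of each class that lies inside the circle.
likelihoods = {c: Fraction(in_circle[c], totals[c]) for c in totals}

# Unnormalized posterior = prior * likelihood, then divide by the evidence.
scores = {c: priors[c] * likelihoods[c] for c in totals}
evidence = sum(scores.values())
posteriors = {c: scores[c] / evidence for c in totals}

predicted = max(posteriors, key=posteriors.get)
for c in totals:
    print(c, "prior", priors[c], "likelihood", likelihoods[c], "posterior", posteriors[c])
print("predicted class:", predicted)
```

Under these assumptions the posterior is 1/3 for BLUE and 2/3 for YELLOW, so the new object is labeled YELLOW even though BLUE has the larger prior.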
Let us take another example that applies Bayes’ theorem. A deck of cards consists of 52 cards. The goal is to find the probability of a card being a Queen.
The total number of cards in the deck is 52. The total number of Queen cards is 4. The probability of a card being a Queen is:
P(Queen) = 4/52 = 1/13
The probability of a card being a Queen, given that the card has a face on it, can be calculated using Bayes’ theorem.
P(Queen|Face) = (P(Face|Queen) P(Queen)) / P(Face)
Every Queen is a face card, and 12 of the 52 cards (the Jacks, Queens and Kings) are face cards, so:
P(Face|Queen) = 1
P(Queen) = 1/13
P(Face) = 12/52 = 3/13
P(Queen|Face) = (1 x (1/13)) / (3/13) = 1/3
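The card calculation can be checked with exact fractions. The sketch below assumes a standard 52-card deck in which the face cards are the Jacks, Queens and Kings of the four suits, 12 cards in total; the variable names are illustrative.

```python
from fractions import Fraction

DECK_SIZE = 52
QUEENS = 4
FACE_CARDS = 12  # Jack, Queen and King in each of the four suits

p_queen = Fraction(QUEENS, DECK_SIZE)
p_face = Fraction(FACE_CARDS, DECK_SIZE)
p_face_given_queen = Fraction(1)  # every Queen is a face card

# Bayes' theorem: P(Queen|Face) = P(Face|Queen) * P(Queen) / P(Face)
p_queen_given_face = p_face_given_queen * p_queen / p_face
print(p_queen_given_face)  # 1/3

# Cross-check by direct counting: 4 Queens among the 12 face cards.
print(Fraction(QUEENS, FACE_CARDS))  # 1/3
```

Both routes agree, which is a useful sanity check whenever the conditional probability can also be counted directly.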
A decision tree is used for representing the classified groups. Among the supervised classification learning methods, decision tree learning is a very popular one. The features come from domains which are finite and discrete. Each element of the target domain is called a class. In the tree, each non-leaf node is labeled with an input feature. The arcs coming out of a node are labeled with the possible values of that feature. The leaves of the tree are labeled with a class or with probability values over the classes.
To classify an input, the algorithm starts at the root and follows the arcs that match the input’s feature values. The algorithm stops when it reaches a leaf, which gives the classification.
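The structure described above can be sketched with nested dictionaries. The tree below is hand-built and entirely hypothetical (the features, values and probabilities are invented); it only illustrates how a classification walks from the root along the arcs to a leaf, not how a tree is learned from data.

```python
# Each non-leaf node names a feature; each arc is keyed by a feature value;
# each leaf holds probability values for the classes.
TREE = {
    "feature": "outlook",
    "arcs": {
        "sunny": {
            "feature": "humidity",
            "arcs": {
                "high": {"leaf": {"play": 0.0, "stay": 1.0}},
                "normal": {"leaf": {"play": 1.0, "stay": 0.0}},
            },
        },
        "overcast": {"leaf": {"play": 1.0, "stay": 0.0}},
        "rainy": {"leaf": {"play": 0.6, "stay": 0.4}},
    },
}


def classify(node, example):
    """Follow the arcs matching the example's feature values until a leaf."""
    while "leaf" not in node:
        value = example[node["feature"]]
        node = node["arcs"][value]
    probabilities = node["leaf"]
    # The predicted class is the one with the highest probability at the leaf.
    label = max(probabilities, key=probabilities.get)
    return label, probabilities


label, probabilities = classify(TREE, {"outlook": "sunny", "humidity": "high"})
print(label, probabilities)
```

Because the feature domains are finite and discrete, every possible input reaches exactly one leaf.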
“Machine learning will increase productivity throughout the supply chain.” ~Dave Waters
The training data is labeled by domain experts. The machine learning model is trained on the labeled data. Ambiguous data is evaluated and validated by the domain experts. The training data set is used for learning purposes.
Machine learning plays an important role in solving complex problems. Machine learning techniques are applied to develop learning models for forecasting. The machine learning models help in generating business value for the enterprise.
The evaluated model is used for predictions. The learning model is used for forecasting, reporting, discovery, planning, optimization and analysis purposes in the organization.
Machine learning models are built from the training data, but unseen data is very important for making the model more effective. To validate and check the predictions, we need more unseen data, which makes the model trustworthy. The model should not simply memorize the training data; it has to forecast future scenarios. The training data sets might or might not be linearly separable.
Note: A data set is linearly separable when a line, plane or hyperplane splits the input set so that the points of one class lie in one half-space and the points of the other class lie in the other half-space.
Machine learning models are evaluated using measures such as the number of errors and the mean squared error. The performance of the model is very important for any machine learning engagement. The evaluation of the model is based on predictions for unseen, out-of-sample data. The accuracy of the predictions is an important evaluation measure.
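The three measures mentioned above are straightforward to compute. This is a small sketch with illustrative function names and made-up predictions: error count and accuracy for a classifier, mean squared error for a regression model.

```python
def error_count(actual, predicted):
    """Number of predictions that differ from the actual labels."""
    return sum(1 for a, p in zip(actual, predicted) if a != p)


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    return 1 - error_count(actual, predicted) / len(actual)


def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical classifier output on four unseen examples.
true_labels = [1, 0, 1, 1]
predicted_labels = [1, 1, 1, 0]
print(error_count(true_labels, predicted_labels))  # 2
print(accuracy(true_labels, predicted_labels))     # 0.5

# Hypothetical regression output on two unseen examples.
print(mean_squared_error([3.0, 5.0], [2.0, 7.0]))  # 2.5
```

These numbers are only meaningful when the predictions come from data the model did not see during training.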
The model’s evaluation is based on two methods: hold-out and cross-validation.
A test data set is a prerequisite for the model’s evaluation. The data used for developing the model needs to be different from the test data set. Otherwise, the prediction algorithm may simply hold in memory the label for each training point. This scenario is called overfitting. Hold-out evaluation is about testing the model on unseen data instead of just the data it was trained on. The effectiveness of the learning model is measured by its accuracy on the unseen data. In the hold-out method, the data set is split into three subsets. The subsets are:
Training set
Validation set
Test set (Unseen data)
The training data set is used for building the forecasting models. The validation set is used for evaluating and tuning the learning model during the training phase. The test data, or unseen data, is used for evaluating the future effectiveness of the model. The hold-out method is simple and fast to run. Its results can have high variability, because the measured accuracy depends on which data points end up in each subset.
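A hold-out split can be sketched in a few lines. The 60/20/20 proportions, the fixed random seed and the function name holdout_split are assumptions chosen for this example; the article does not prescribe particular proportions.

```python
import random


def holdout_split(data, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the data and split it into training, validation and test sets."""
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave room for a test set")
    rows = list(data)
    # Shuffle a copy so the three subsets are random samples of the data.
    random.Random(seed).shuffle(rows)
    n_train = round(len(rows) * train_frac)
    n_val = round(len(rows) * val_frac)
    train = rows[:n_train]
    validation = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]  # whatever remains is the unseen data
    return train, validation, test


train, validation, test = holdout_split(range(100))
print(len(train), len(validation), len(test))  # 60 20 20
```

Changing the seed changes which points land in the test set, which is one source of the variability in hold-out results.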
Cross-validation is about separating the evaluation data from the training data set. The training data set is used for model learning and training. The unseen data set is used for evaluating the effectiveness of the model.
K-fold cross-validation is one of the cross-validation methods. The data set is divided into k subsets, which are referred to as folds. k is typically between 5 and 10. Each of those subsets is used once for testing and validating the model. The model performance is the average error over the k different subsets.
In four-fold cross-validation, the data is separated into 4 subsets. The models are trained step by step. The first model uses the first subset for testing and the other subsets for training. This is repeated for each of the 4 subsets. The effectiveness of the model is measured over 4 trials with 4 folds (data sets). Every data point is used for testing once and for training in k-1 trials. The bias of the error estimate comes down because most of the data is used for fitting. The variance also comes down because every data point ends up in a test set exactly once.
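The fold mechanics can be sketched as follows. The helper names are illustrative, the folds are taken in order without shuffling (an assumption made to keep the example short), and the "model" is deliberately trivial: it predicts the mean of the training targets, so that the focus stays on how the folds are used.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    # Spread any remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def fit_mean(z_train, w_train):
    """Trivial stand-in model: always predict the mean training target."""
    return sum(w_train) / len(w_train)


def mse(model, z_test, w_test):
    """Mean squared error of the constant prediction on one test fold."""
    return sum((wi - model) ** 2 for wi in w_test) / len(w_test)


def cross_validate(z, w, k, fit, error):
    """Average the test error over k folds."""
    fold_errors = []
    for train_idx, test_idx in k_fold_indices(len(z), k):
        model = fit([z[i] for i in train_idx], [w[i] for i in train_idx])
        fold_errors.append(error(model, [z[i] for i in test_idx], [w[i] for i in test_idx]))
    return sum(fold_errors) / k


# Hypothetical data: eight observations, four folds of two points each.
z = [1, 2, 3, 4, 5, 6, 7, 8]
w = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1, 14.0, 16.2]
print(cross_validate(z, w, 4, fit_mean, mse))
```

In practice the data is usually shuffled before the folds are formed, so that each fold is a representative sample.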
In the next blog article, we look into different types of machine learning algorithms such as supervised learning and unsupervised learning.
IASA Global was established in 2002. IASA is an international, non-profit business association dedicated to the advancement and sharing of issues related to software architecture in the enterprise, product, education and government sectors. It is committed to improving the quality of the IT architecture industry by developing and delivering standards, education programs, and accreditation programs and services that optimize the development of the architecture profession. IASA Global has created the world’s first and only ITABoK (IT Architecture Body of Knowledge), which contains 250 skill sets that are critical for every business and IT professional to possess in order to deliver the strategic value of technology to the business.
The IASA Architect AI Architecture Training Program is a basic course on AI enterprise architecture. This program is a defined baseline for successful IT architects who are implementing AI in enterprises. This initiative involves the advancement of best practices and education while delivering AI Enterprise programs and services to IT architects of all levels around the world.
The AI Architecture certification helps in ensuring that you are on the Enterprise Architect path. It demonstrates that you are taking the necessary steps to become a fully qualified architect who can create an AI architecture. AI architecture has important factors such as the selection of machine learning frameworks and scalable solutions for automation. The AI reference architecture typically shows a workflow for automation solutions. Many AI frameworks and platforms such as TensorFlow, Keras, NLTK, PyTorch, Google AI, IBM Watson, Microsoft Azure ML, and Amazon SageMaker are evolving and changing features rapidly. The AI architecture needs to have the flexibility and adaptability to handle the change. AI architecture helps in scaling, delivering speed and automating processes in the organization.
The AI architecture course explains the machine learning workflows and capabilities such as feature extraction, training, analytics, data collection, data analysis, data selection, project packaging, model tuning, evaluation, inference, validation, and deployment. The course will help in architecting AI applications for recommendation, forecasting, video analysis, image analysis, text analytics, document analysis, voice to text, speech recognition, search, conversational agents, translation, intelligent assistants, and transcription. It also touches on NLP/NLU, deep learning, knowledge studio, data refinery, IoT platforms, machine learning, natural language classifiers, knowledge mining, cognitive search, decision-making applications, bots, and robotic process automation.
In daily life, we come across many applications while working with customers and enterprises. The typical use cases where AI Architecture will help are:
Spam & Email – Filtering & User preferences based content analysis
Predictive Analytics – Credit Worthiness and Loan Applications
OCR : Pattern Recognition – Text, Images, Video and Audio
Biometrics: Identity Management & Security
Machine Learning Models: Life Insurance – Mortality rates, life expectancy
Medical Expense Prediction Model: patient history & medical claim history
Fraud Detection: Credit Card usage and activity patterns
Social Network Analysis: Relationship & Influence Analysis
Ecommerce websites use AI techniques and methods in their implementation. They have the following AI-related features:
Historical data related to customer transactions analysed for customer demographics
Shopping carts of the customer analysed for abandonment
Price analysis of the products using the historical data
Next best action for the customer based on their preferences and previous purchases
Web page analytics related to customer browsing time for a product
Customer information related to profile, billing, and shipping addresses analysed for demographics
Referral websites tracked by the customer views and click stream analysis
Patterns related to customer rating and reviews of the products
Marketing campaign effectiveness based on email, sms and web channels
Recommendations based on customer history related to browsing, usage and behavior.
Conversion of the shopping from view to a buy – analysis
Recommendations from other customers and from the merchant to the customer are analysed using the approaches listed below:
Content based Filtering
Train Matchbox Recommendation
Score Matchbox Recommendation
AI Modeling and Architectural development involves identifying modeling techniques, selecting algorithms, designing tests, developing models, assessing models and training the models. The other methods like Ensemble techniques help in combining and selecting multiple approaches based on scenarios. The AI model is validated and tested before using for unseen scenarios.
Enterprises are keen to evaluate AI and machine learning techniques and to develop models for decision making using data science and algorithms. Enterprise leadership is interested in getting its architects trained through experiential learning and in avoiding failures by using reference architectures, patterns and anti-patterns. RPA is another area which enterprises want to evaluate and implement together with AI and machine learning, voice, and natural language processing algorithms. Leadership is interested in domain-specific use cases where RPA has been successful.
The Iasa AI Architecture Course for IT Architects gives you an architect’s knowledge of Artificial Intelligence frameworks and tools for developing AI Enterprise IT architectures that meet the demands of modern business. The curriculum is developed by AI Enterprise architects for IT architects, which is one of the basic ideas behind Iasa. The course focuses on:
Data Requirements: discovering and understanding the demands and needs of the business.
Data Modeling Principles: The principles followed by AI Architects related to data modeling.
Business Case: To find the return on the investment and justify the need for AI Architecture.
Machine Learning Solutions: Machine learning solutions for AI Architectural requirements – Predictive analytics, pattern detection, regression model, and recommendations
AI Architecture Practice: How best to build the practice of architecture within your organization
Iasa’s basic course is aimed at IT Architects who want to become involved in AI architectural work. You should have basic knowledge and experience of system development and AI. You work as a developer, project manager, information modeler or process developer in AI Architecture projects.
The course will have around 20 to 30 Certification Questions. The workshop will have an assignment that will be included in the course. The assignment will have an application using machine learning algorithms. The course will have a workbook where the participant can apply the course concepts in their organization. By the end of the course, the workbook will help the participant to chart out the AI Strategy in the organization.
Module 00: Course Introduction
In the course introduction, we will cover the background and details of the Core Pre-Work and the basis of the ITABoK as well as working descriptions of AI Architecture and AI Architecture practices.
Course Schedule Review
Data Architecture Challenges
What is AI architecture?
Machine Learning Process
Module 1: AI Architecture – Data Requirements
This module covers data sources, data formats, data mapping and user options for data analysis.
Data Analysis Options
For AI, data formats like image, video, text, and audio are covered. Different data sources like social media, blogs, news feeds, media websites, and other sources are presented in this module.
Module 2: AI Architecture – Data Modelling Principles
In architect engagement, we cover the data modeling principles and selection process for the right model. This module explains the training and evaluation of the data model.
Data Model selection
In this module, machine learning models, training, and validation parameters are discussed. Different machine learning algorithms such as classification, regression, neural network, decision tree, and random forest techniques are covered. The learning methods covered are supervised, unsupervised, and reinforcement techniques.
Module 3: Business Case
Iasa fully covers the ‘demand’ or business drivers that underlie AI architecture. This module explains the business value in driving AI Technology strategy through business decisions. We explore different business models, customer-driven architecture, and many other aspects of the business domain. The key aspects which are explored are innovation, automation, business value generation, and technology skills required by the engineering resources.
Business Value Generation
Creating AI skills-based Teams
In this module, AI technology benefits are discussed through business case modeling, value generation, capabilities, automation, and AI engineering topics. The takeaways for this module will be the business benefits of rapid analysis, prediction and processing, accurate forecasts, reduced process time due to automation, and improved compliance.
Module 4: AI Architecture – Machine Learning Solutions
When describing AI architecture, we work through the details of tracing the business value decisions related to the machine learning solutions. We look at the value streams, capabilities, applications affected by AI, design tradeoff analysis, architecture assessment methods, viewpoints, and architecturally significant requirements.
Conversational AI solutions
Fraud Detection Solutions
In this module, Predictive analytics, pattern detection, regression model, and recommendations are discussed in detail. The takeaways for this module will be real-time business decision making, eliminating manual tasks, enhancing security, reducing operating expenses, and improving business benefits through AI for your organization.
Module 5: AI Architecture Practice
The final module summarizes and expands on the AI Architecture process and the adoption of the set of programs and skills for the organization. The skill assessment provides a solid foundation for the student to review and understand their current AI skills and to create a growth plan for their future.
Handling the Automation Issues
The takeaway for this module will be to create an enterprise AI roadmap with assessments of the current team and skills of required team members. The student will be able to chart out a center of excellence or competency center development strategy by training and developing the engineers of their organization. The skill requirements will be based on the AI architectural platform. After this course, the student will be able to pick the right AI platform from IBM, Microsoft, Amazon, and Google AI platforms.
The Certified IT Architect – AI Architecture (CITA-AIA) credential is awarded to those who qualify based on a combination of criteria including education, experience and test-based examination of professional knowledge of AI architectural skills and management.
The CITA-AIA credential is awarded for achieving a score of 70% or higher on the CITA-AIA examination. The exam consists of 75 multiple-choice and true/false questions.
The AI Architecture exam is available online, anytime, via Iasa’s Learning Management System. If attending an onsite course, the exam is proctored on the last day. If attending an online course, access is given on the last day of the course as well. Students will be given 2.5 hours each to complete the exam.
Watch out for the course announcements from IASA Global regarding this AI Course.
Supervised learning is about creating a model from historical data that can be used for forecasting on unseen data. The machine learning technique reads the input data set and the expected output data. The model is trained to forecast the outputs for new scenarios.
The supervised machine learning can be categorized as:
In the regression method, a model is fitted to the data. In the classification method, the data is partitioned into classes. Supervised learning is very popular in the machine learning space.
In the supervised learning technique, the input variable Z is transformed by the mapping function g to create the output variable W.
W = g(Z)
New input data Z is used for forecasting the output variable W using the mapping function. The aim is to find the mapping function. This method is referred to as supervised learning because it is like a manager supervising an employee’s learning process. The supervisor checks the training process and the forecasts on the training data set. The supervisor validates the outputs for unseen data, and the technique targets a set goal for effectiveness.
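As a toy illustration of finding a mapping function g from labeled examples, the sketch below uses the nearest-neighbor rule: the forecast for a new Z is the known output of the closest training input. Nearest neighbor is just one possible choice of g, picked here for brevity; the function name learn_mapping and the data are made up.

```python
def learn_mapping(z_train, w_train):
    """Return a function g so that W = g(Z), learned from labeled pairs."""
    pairs = list(zip(z_train, w_train))
    if not pairs:
        raise ValueError("training data must not be empty")

    def g(z_new):
        # Forecast the output of the training input closest to z_new.
        nearest_z, nearest_w = min(pairs, key=lambda pair: abs(pair[0] - z_new))
        return nearest_w

    return g


# Historical data: inputs Z with their expected outputs W (hypothetical).
z_train = [1.0, 2.0, 8.0, 9.0]
w_train = ["small", "small", "large", "large"]

g = learn_mapping(z_train, w_train)
print(g(1.5))  # small
print(g(7.0))  # large
```

The expected outputs in w_train play the role of the supervisor: they are what the forecasts are checked against.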
Let us look at the examples for classification in the following section. The first example is related to classification of dogs.
Classification of Dogs
There are different types of dogs. Dogs can be classified into the following groups.
Dogs have different characteristics, and each group has a set of features which are used to identify the dog. This is a good example of supervised learning, where we have to classify the dog images into various groups based on features.
There are around 560 breeds of dogs presented in the word cloud below:
Another example is classification of cats.
Below is the word cloud of 100 cat breeds. Each breed has different characteristics and features that can be used to categorize the images.
Some of the features or characteristics of a cat are body type, coat, and the pattern of the skin and coat. The shape of the face is another important factor for cat classification.
Note: A clowder is a group of cats. It is also referred to as a glaring. When the cats in a group are very different from each other, glaring is the right word. A kindle is a group of kittens.
In the case of regression, data is distributed across different dimensions. Information needs to be retrieved from it. The models need to evolve based on the data set, and the errors need to be minimized for prediction. Regression is the method described above.
Dogs and cats problems have different challenges and learning is different in each case. Features need to be analyzed and the models need to be fitted to the data available for prediction.
In machine learning terms, these two types, classification and regression, are part of a broader class called supervised learning. Machine learning has evolved with the data and processing power available at each particular time.
Classical Machine learning consists of different phases such as modeling, evaluation and methods such as supervised and unsupervised learning. There are different techniques within the supervised and unsupervised learning which are presented in the next sections.
Classical Machine Learning
Machine learning is about code that learns its logic from data instead of having that logic explicitly coded. The input for the code is the data provided for training and learning purposes. Machine learning is a part of computer science and is related to Artificial Intelligence. The data is gathered, staged, and cleansed for training and learning purposes.
The real world has different workflows and procedures which can be modeled using mathematics. A machine learning model is based on the mathematical model of the procedure. Learning is achieved by using the data provided. Data is collated from databases and devices. The data ingestion is done from different data sources.
Data is transformed, normalized, and cleansed before the data set is created for learning. Data is analyzed and patterns are identified for forecasting. Data set features are analyzed and identified for feature set creation.
Different sets of features from the data are used for selecting the approach. For example, for regression, the complexity and the degree of the polynomial are the key factors. The mathematical model is chosen from a group of candidates. Most of the time, the simplest model that fits the data is the best one for prediction and forecasting.
“We consider it a good principle to explain the phenomena by the simplest hypothesis possible”. – Ptolemy
Models can be selected from different approaches such as listed below:
Support Vector Machine
Machine Learning Algorithms are categorized into three types.
Supervised Machine Learning
Unsupervised Machine Learning
Before we look at different types of machine learning algorithms, let us look at the machine learning models, features and model creation, training and evaluation of the models in the next blog article.