Technology Blog Posts by SAP
Raja_Gupta1
Product and Topic Expert

This is part of a blog series in which I explain 21 frequently used Generative AI jargon terms in simple words. To keep it simple and interesting, I also provide an analogy for each one.

Note: I am publishing it as a series. Parts 4 to 21 are yet to be published.

 

  1. Prompt Engineering
  2. AI Model [Current Blog]
  3. Foundation Model
  4. AI Hallucination
  5. Retrieval-Augmented Generation (RAG)
  6. Grounding
  7. Natural Language Processing (NLP)
  8. Explainable AI
  9. Prompt Injection Attack
  10. Overfitting and Underfitting
  11. Multimodality
  12. Autoencoders
  13. Computer Vision
  14. Transfer Learning
  15. AI Detectors
  16. Adversarial Attacks
  17. Data Augmentation
  18. Generative Adversarial Networks (GANs)
  19. Variational Autoencoders (VAEs)
  20. Transformer-Based Models
  21. AI Poisoning Attacks

This is the 2nd blog in this series. The jargon term is AI Model.

Side Note: I strongly recommend that you go through the blog series Generative AI for Beginner. It will only take 90 minutes of your time. No prerequisites.

Let’s start!

 

An Analogy to Understand AI Model

Imagine there is a super-smart genie living in your computer. It’s not the blue, wish-granting kind, but more like a program with a knack for crunching numbers and solving puzzles faster than you can say “Abracadabra!”.

This genie has 2 major powers. First, it can understand and process tons of information. It can perform complex calculations or analyze vast amounts of text in seconds.

Second, it keeps learning new things. It gets smarter and more accurate over time with experience.

Similar to the genie, an AI model is a computer program that can understand and process tons of information and can be trained on huge amounts of data to perform specific tasks.

 

So, what exactly is an AI Model?

An AI Model is a computer program that has been trained to perform specific tasks by learning from data. The AI models (programs) usually include complex mathematical and computational techniques to process vast amounts of data and extract meaningful insights.

The image below summarizes the important points about an AI Model.

 

[Image: summary of the important points about an AI Model]

 

How does a typical AI Model work?

AI Models have 3 major components — Algorithms, Data and Parameters.

Algorithms

AI models use mathematical formulas and rules that define the model’s behavior and how it processes information.

Data

The data is the foundation of everything an AI model does. There are basically two types of data used in AI models: Training Data, which the model learns from, and Input Data, which the model receives when it is asked to produce a result.

Parameters

AI models have adjustable elements, called parameters, which are tuned during training to optimize the model’s performance.
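
To make the three components concrete, here is a minimal sketch of a toy model in plain Python. It is only an illustration: the straight-line formula, the function name predict, and the values of weight and bias are all made up for this example and do not come from any real AI model.

```python
# Hypothetical toy model: it predicts an output from an input using a straight line.

def predict(input_value, weight, bias):
    """Algorithm: a fixed formula that turns an input into an output."""
    return weight * input_value + bias

# Parameters: adjustable numbers (made-up values; training would normally find them).
weight = 2.0
bias = 1.0

# Input data: a new value given to the model when we want a result.
input_value = 3.0

print(predict(input_value, weight, bias))  # 2.0 * 3.0 + 1.0 = 7.0
```

Real AI models follow the same idea, but with far more complex formulas and millions or billions of parameters.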

 

Training of AI model

AI models need huge datasets to get trained. The training data may include text, images, videos, numbers, or any other format of data. Some powerful AI models are trained on very large portions of the publicly available data on the Internet.

1. The training data first goes through some cleaning and pre-processing to make sure that it’s consistent and noise-free.

2. The prepared data is then fed to the AI model.

3. The AI model’s algorithm analyzes the training data and searches for underlying patterns and relationships between different data points.

4. Based on the analysis, the AI model adjusts its internal parameters to better represent the discovered patterns.

Steps 2, 3 and 4 are repeated over and over again with different sets of training data. This helps the AI model become more skilled at recognizing the patterns.
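
The training steps above can be sketched in a few lines of plain Python. This is a simplified illustration, not how any particular AI model is trained: the data is made up and follows the hidden pattern y = 2 * x, the model has a single parameter called weight, and the adjustment rule is basic gradient descent, which is one common technique that I am assuming here for the example.

```python
# Made-up raw data following the hidden pattern y = 2 * x, with one broken record.
raw_data = [(1.0, 2.0), (2.0, 4.0), (None, 5.0), (3.0, 6.0), (4.0, 8.0)]

# Step 1: clean the data by dropping incomplete records.
training_data = [(x, y) for x, y in raw_data if x is not None and y is not None]

learning_rate = 0.01  # how big each parameter adjustment is (assumed value)

def train(weight, data, epochs):
    """Repeat the feed, analyze, adjust cycle for a number of passes (epochs)."""
    for _ in range(epochs):
        for x, y in data:              # Step 2: feed the prepared data to the model
            prediction = weight * x
            error = prediction - y     # Step 3: measure how far off the model is
            weight -= learning_rate * error * x  # Step 4: adjust the parameter
    return weight

# The model starts with no knowledge (weight = 0.0) and learns from the data.
weight = train(0.0, training_data, epochs=200)
print(f"Learned weight: {weight:.2f}")  # prints "Learned weight: 2.00"
```

After enough repetitions, the parameter settles close to 2, which is the pattern hidden in the training data.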

 

Examples of AI Model

Some of the popular examples of AI models are:

GPT (Generative Pre-trained Transformer)

Developed by OpenAI, GPT is a language model. It’s the AI model behind ChatGPT.

BERT (Bidirectional Encoder Representations from Transformers)

Developed by Google, BERT is a powerful language model designed to understand context and nuances in language. It’s widely used for natural language processing tasks.

AlphaGo

Developed by DeepMind, AlphaGo is an AI model that has mastered the game of Go and achieved superhuman performance. AlphaGo defeated human champions, including Lee Sedol in 2016.

 

Let’s Create a Simple AI Model

We can build a simple AI model using Python. In this example, we will create a basic AI model to identify objects in images.

Expected Output

An Image Recognition Model to identify objects in images (e.g., cats and dogs).

Note — If you are new to Python, don’t worry. Just complete this 2-minute crash course on Python.

Follow the steps below:

 

Step 1 — Install Necessary Libraries

Run the command below to install the TensorFlow library:

pip install tensorflow
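
If you want to confirm that the installation worked before moving on, a small check like the one below may help. It is only a sketch: the helper name is_installed is my own, and it simply asks Python whether the package can be found in the current environment.

```python
import importlib.util

def is_installed(package_name):
    """Return True if the package can be found in the current environment."""
    return importlib.util.find_spec(package_name) is not None

if is_installed("tensorflow"):
    import tensorflow as tf
    print(f"TensorFlow version: {tf.__version__}")
else:
    print("TensorFlow is not installed; run: pip install tensorflow")
```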

 

Step 2 — Write the Python code below to create the AI model. This code snippet trains a basic image recognition model on the CIFAR-10 dataset and then makes a prediction on an image from the test set.

# Import Libraries: Import TensorFlow and other necessary libraries.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPooling2D
from tensorflow.keras.datasets import cifar10

 

# Load and preprocess the CIFAR-10 dataset
# CIFAR-10 dataset contains 60,000 color images in 10 classes (e.g., airplanes, cars, birds).
(x_train, y_train), (x_test, y_test) = cifar10.load_data()


# Preprocess Data: Normalize the image data by dividing by 255.0.
x_train, x_test = x_train / 255.0, x_test / 255.0


# Define the Model: Create a simple convolutional neural network (CNN),
# with layers for convolution, pooling, flattening, and dense layers.
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')  # 10 classes in CIFAR-10
])


# Compile the Model.
# Configure for training by specifying optimizer, loss function, and metrics.
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])


# Train the Model on the training data
model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))


# Make predictions on test data
predictions = model.predict(x_test)


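# Output the predicted class index (0 to 9) for the first test image.
# .numpy() converts the TensorFlow tensor into a plain number for printing.
predicted_class = tf.argmax(predictions[0]).numpy()
print(f"Predicted class: {predicted_class}")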
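
The model prints a class number, not a name. The sketch below shows how the 10 numbers produced by the softmax layer could be turned into a readable label. The class names follow the standard CIFAR-10 label order; the function name describe_prediction and the example probabilities are made up for illustration and are not output from the model above.

```python
# CIFAR-10 label names in index order (0 to 9).
CLASS_NAMES = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

def describe_prediction(probabilities):
    """Return the most likely class name and its probability."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return CLASS_NAMES[best_index], probabilities[best_index]

# Made-up softmax output for one image: one probability per class, summing to 1.
example = [0.02, 0.01, 0.05, 0.60, 0.03, 0.20, 0.04, 0.02, 0.02, 0.01]

name, probability = describe_prediction(example)
print(f"Predicted class: {name} ({probability:.0%})")  # prints "Predicted class: cat (60%)"
```

With the trained model, predictions[0] from the code above could be passed in place of the made-up list.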

 

A model like this, trained on a suitable dataset, can be used in a variety of use cases, for example in a smartphone app that identifies the breed of a dog when you take a picture of it.

 

Next Blog

21 Generative AI Jargons Simplified: 3 — Foundation Model