Artificial Intelligence and Machine Learning Blogs
MarioDeFelipe
Active Contributor

Meta released its state-of-the-art model, Llama 3.1 405B, a few days ago, and I have some exciting news. If you are thinking "Why should I care, I am SAP", I will give you 6 reasons why you should consider it.

  1. The first open-source LLM to know ABAP (don't hurt me)
  2. The first open-source LLM to know SAP APIs
  3. It knows HANA SQL.
  4. It allows commercial use, so you can build your SaaS with it 💰
  5. It's multilingual by design.
  6. It introduces a significantly longer context length of 128K, enhanced tool use, and stronger reasoning capabilities. Text summarization or long code analysis can be done without splitting the input into pieces.

Short INTRO

Meta has been training this model since January on tens of thousands of GPUs, so let's give Zuckerberg some credit: that is many millions of dollars we don't need to spend to train a model like it, because it's open source.

Remember, there are 3 things we care about in AI:

1. The compute

2. The Algorithm

3. The Data and use case

Don't worry about the compute or the algorithm. As we mentioned, Meta invested billions of dollars developing and training this model so you can focus on number 3: the data and the use case.

For running the model, I will provide a couple of options, but model inference is not simple. Be suspicious of oversimplified explanations of "how to easily run an LLM". Large models are large, and performance degrades fast once they are under load (and we want them to be used), not to mention the cost of storage, compute, and RAG. The best tip is always to focus on the quality of the data and the use case you want to give this software.

Llama is the kind of model we would use internally in a company, because corporations don't want users to casually hand a sensitive PDF to OpenAI and get a summary out of it. Corporations protect their employees from sending regulated data to open endpoints; they keep it private.

HOW to call it from BTP (Llama running on Amazon Bedrock)

To call the Llama model served by Amazon Bedrock from your SAP BTP application, assuming you already have an Amazon Bedrock account and the API ready to be called, follow the steps to set up SAP AI Core as a proxy for Bedrock and then expose Bedrock via the SAP AI Core proxy to your application.

Llama 3.1 405B available on US West 2 (Oregon). By Author

The official documentation for this is here: https://aws.amazon.com/blogs/awsforsap/power-your-business-with-secure-and-scalable-generative-ai-se...

The unofficial information is here: https://community.sap.com/t5/technology-blogs-by-members/generative-ai-for-sap-vi-consume-amazon-bed...

and here: https://community.sap.com/t5/technology-blogs-by-members/the-5-steps-to-consume-amazon-bedrock-and-a...
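Once the proxy is in place, the call from your BTP application is plain HTTPS. Here is a minimal sketch in Python; the deployment URL, request path, and payload schema are assumptions, since the exact contract depends on your AI Core serving template, so check the linked documentation:

```python
# Hedged sketch of calling Llama (served by Amazon Bedrock, proxied through
# SAP AI Core) from a BTP application. URL, path, and payload field names
# below are placeholders/assumptions, not the official contract.
import json
import urllib.request

AI_CORE_DEPLOYMENT_URL = "https://<ai-core-host>/v2/inference/deployments/<deployment-id>"  # hypothetical
RESOURCE_GROUP = "default"

def build_payload(prompt: str, max_tokens: int = 256) -> dict:
    """Assemble a simple completion request body (field names are assumptions)."""
    return {"prompt": prompt, "max_tokens": max_tokens, "temperature": 0.2}

def call_llama(prompt: str, token: str) -> str:
    """POST the prompt to the proxied endpoint and return the generated text."""
    request = urllib.request.Request(
        f"{AI_CORE_DEPLOYMENT_URL}/invoke",  # path is an assumption
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # OAuth token from your BTP service key
            "AI-Resource-Group": RESOURCE_GROUP,
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read())["generation"]  # response field is an assumption
```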

Can we run Llama 3.1 405B on AI Core?

To run the Llama 3.1 405B model, we need a significant amount of RAM; around 128 GB is a reasonable minimum, and more RAM significantly improves performance. In practice, it will run in 64 GB but is very slow, 128 GB is still slow, and 256 GB makes it decent. This is a big model and the requirements are significant.
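To see why the numbers are this large, a back-of-the-envelope calculation of the weight footprint at different quantization levels helps (weights only; the KV cache and activations add overhead on top, so treat these as rough lower bounds):

```python
# Rough memory estimate for Llama 3.1 405B weights at different quantization
# levels. Back-of-the-envelope only: ignores KV cache and activation overhead.
PARAMS = 405e9  # 405 billion parameters

BYTES_PER_PARAM = {
    "fp16": 2.0,  # half precision
    "q8": 1.0,    # 8-bit quantization
    "q4": 0.5,    # 4-bit quantization (e.g. the q40 model format)
}

def weights_gb(quant: str) -> float:
    """Approximate size of the model weights in GB for a quantization level."""
    return PARAMS * BYTES_PER_PARAM[quant] / 1e9

for quant in BYTES_PER_PARAM:
    print(f"{quant}: ~{weights_gb(quant):.0f} GB")
```

This is why even a heavily quantized 405B model needs a cluster-scale amount of RAM, and why 256 GB across several devices is a workable target.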

SAP AI Core has a limited resource plan (as of August 2024). It maps to AWS g4 family instances with NVIDIA T4 Tensor Core GPUs:

Resource Plan Specifications for AWS

| Resource Plan ID | GPUs | CPU Cores | Memory (GB) |
|---|---|---|---|
| Infer-S | 1 T4 | 3 | 10 |
| Infer-M | 1 T4 | 7 | 26 |
| Infer-L | 1 T4 | 15 | 58 |
We can't run Llama 3.1 405B on BTP without AI clusters. Fortunately, distributed inference is something that may help. Let me introduce the Distributed Llama project.

Distributed Llama

Distributed Llama is a project that allows us to run an LLM across multiple devices. It uses tensor parallelism and is optimized to keep the amount of data required for synchronization low. Distributed Llama distinguishes between two types of nodes that you can run on your devices:

  • Root Node — the application that acts as the root node of your cluster, coordinating the cluster.
  • Worker Node — the application that functions as a worker, executing instructions from the root node.

For us on BTP this is good because we don't have many GPUs but we do have CPUs, and Distributed Llama supports only CPU inference (as of August 2024).

 

AI cluster topology, 4 devices, total 256 GB RAM. Image by Author

The root node on the first device plus 3 worker nodes on the remaining devices will provide the required 256 GB; Distributed Llama splits RAM usage across all devices.

Follow the steps detailed in the project's GitHub repository until, finally, you run Llama on the master device.

./dllama-api \
--model models/llama3_1_405b_instruct_q40/dllama_model_llama3_1_405b_instruct_q40.m \
--tokenizer models/llama3_1_405b_instruct_q40/dllama_tokenizer_llama3_1_405b_instruct_q40.t \
--buffer-float-type q80 \
--max-seq-len 2048 \
--nthreads 4
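Before the root node can serve requests, each worker device needs its worker process running, and the root command needs their addresses. A sketch of what that looks like, assuming the flags from the Distributed Llama README (the port, IPs, and thread counts below are illustrative; check the README for the exact flags in your version):

```shell
# On each of the 3 worker devices (port and thread count are illustrative):
./dllama worker --port 9998 --nthreads 4

# On the root device, point the API server at the workers:
./dllama-api \
  --model models/llama3_1_405b_instruct_q40/dllama_model_llama3_1_405b_instruct_q40.m \
  --tokenizer models/llama3_1_405b_instruct_q40/dllama_tokenizer_llama3_1_405b_instruct_q40.t \
  --buffer-float-type q80 \
  --max-seq-len 2048 \
  --nthreads 4 \
  --workers 10.0.0.2:9998 10.0.0.3:9998 10.0.0.4:9998
```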

 

Test for SAP APIs knowledge

My basic test for LLMs is to ask for the mandatory field of the Document Info Record Service API.

SAP API sourced knowledge by Llama. Image by Author

This is not an easy test; the only other model that has not failed on the same question is GPT-4.

The use case of LLMs for function calling is described in this blog post.

The ABAP Code

I give this code to the LLMs; my expectation is that they discover the incorrect syntax around the CATCH statement outside the loop (a CATCH/ENDTRY without an opening TRY), something almost all models miss:

DATA: lt_mara TYPE TABLE OF mara,
      ls_mara TYPE mara,
      lv_matnr TYPE mara-matnr.

SELECT matnr, maktx FROM mara INTO TABLE lt_mara.

LOOP AT lt_mara INTO ls_mara.
  IF ls_mara-matnr = lv_matnr.
    WRITE: / ls_mara-matnr, ls_mara-maktx.
  ENDIF.
ENDLOOP.

CATCH cx_sy_itab_line_not_found INTO DATA(lx_itab_error).
  WRITE: / 'Internal table error:', lx_itab_error->get_text( ).
ENDTRY.

Let me ask the model if the above statement's syntax is correct:

ABAP Syntax check. Image by Author

 

Corrected syntax:

DATA: lt_mara TYPE TABLE OF mara,
      ls_mara TYPE mara,
      lv_matnr TYPE mara-matnr.

lv_matnr = 'some_value'.  " Initialize lv_matnr with a value

SELECT * FROM mara INTO TABLE lt_mara.

LOOP AT lt_mara INTO ls_mara.
  IF ls_mara-matnr = lv_matnr.
    WRITE: / ls_mara-matnr, ls_mara-maktx.
  ENDIF.
ENDLOOP.

IF sy-subrc <> 0.
  WRITE: / 'No records found in MARA table'.
ENDIF.

I am no ABAPer, but I believe this is more decent code.
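One wrinkle remains even in the corrected version: MAKTX is not a field of MARA but of the description table MAKT, so a version that actually compiles needs a join. A hedged sketch, not verified on a system, with the selection value purely illustrative:

```abap
* Hedged sketch: MAKTX lives in MAKT, not MARA, so a join is needed.
DATA(lv_matnr) = CONV matnr( 'SOME_VALUE' ).  " illustrative value

SELECT m~matnr, t~maktx
  FROM mara AS m
  INNER JOIN makt AS t
    ON t~matnr = m~matnr
   AND t~spras = @sy-langu
  WHERE m~matnr = @lv_matnr
  INTO TABLE @DATA(lt_result).

LOOP AT lt_result INTO DATA(ls_result).
  WRITE: / ls_result-matnr, ls_result-maktx.
ENDLOOP.

IF lt_result IS INITIAL.
  WRITE: / 'No records found in MARA/MAKT'.
ENDIF.
```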

HANA SQL

I ask the LLM:

Can I update 2 columns from 2 tables joined by foreign key with one statement in SAP HANA?

The answer should be NO, but sometimes I get this:

Incorrect response from MistralAI Large. Image by Author

 

And this is the correct response:

Llama 3.1 getting it right. Image by Author

 

I am quite satisfied with it. Llama got it right that I can't update columns in two different HANA tables with a single statement.
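For reference, the standard workaround is one UPDATE per table, executed in the same transaction. A sketch with made-up table and column names (ORDERS and ORDER_ITEMS joined by ORDER_ID are purely illustrative):

```sql
-- Hypothetical schema: ORDERS (header) and ORDER_ITEMS, joined by ORDER_ID.
-- UPDATE targets one table, so two statements are needed;
-- running them in one transaction keeps the change atomic.
UPDATE orders
   SET status = 'SHIPPED'
 WHERE order_id = 42;

UPDATE order_items
   SET status = 'SHIPPED'
 WHERE order_id = 42;

COMMIT;
```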

Conclusion

Llama 3.1 405B is a beast. It is a large model and will require a significant amount of resources if we want to run it ourselves, but services like Amazon Bedrock do the heavy lifting of running LLMs for us and simply expose a private API.

Although Llama 3.1 has not been released specifically for the enterprise segment, it is exceeding by far all the business tests I have executed in my day-to-day SAP activities. Meta has done a very good job.

Sources

Meta AI Blog: https://llama.meta.com/
Meta Llama 3.1: https://ai.meta.com/research/publications/the-llama-3-herd-of-models/
Model Accessibility: https://llama.meta.com/llama-downloads/
Try on Hugging Face: https://huggingface.co/chat/
Usage of Llama 3.1: https://llama.meta.com/docs/getting-the-models/405b-partners/
Research document link: https://www.rivista.ai/wp-content/uploads/2024/07/452387774_1036916434819166_4173978747091533306_n.p...

11 Comments
asiervs
Explorer

Very interesting @MarioDeFelipe ! A pleasure to read you!

ttrapp
SAP Mentor

Thanks for posting this! Unfortunately the link to the research document gives:

URL signature expired

Best Regards,
Tobias

MarioDeFelipe
Active Contributor
paul_snyman
Explorer

Great work Mario, keep it up!

joachimrees1
Active Contributor

Maybe I am missing something, but if it's
ls_mara TYPE mara,

Then there's no
ls_mara-maktx.

(MAKTX is in MAKT...)

-> Syntax error in the "Corrected syntax" block.

JayeDutton
Discoverer

@MarioDeFelipe Thanks for the info!

Generative or Regurgitative?

I had a play to see if it could generate a program using an example I already had to compare the result; It gave me an exchange between Vikram and Mathew on a similar topic:

JayeDutton_0-1723595537937.png

I then adjusted the response length and the same prompt then went on to tell me how to debug an ABAP program and create an ABAP program to send an email with an attachment... both unrelated to the prompt.

JayeDutton_1-1723596017821.png

*Note: I was able to find some use cases where it performed as well as or better than other models. Just not code generation.

SonicPlanet
Explorer

Hello Mario,

Does SAP provide the ability to fine-tune the Llama 3.1 405B model against a specific/limited amount of the client's [internal] data, to improve the lackluster performance you mentioned?

Is there the capability to use ABAP or HANA SQL for embeddings using the vector database?

(As an ex-BW/HANA developer, I moved to the Snowflake platform as I did not know SAP had a vector engine and did embeddings.) I am interested in seeing how the SQL statements performed, knowing how slow ABAP performance is.

Vitaliy-R
Developer Advocate

Hi @SonicPlanet 

Is there the capability to use ABAP or HANA SQL for embeddings using the vector database?

Please check https://community.sap.com/t5/technology-blogs-by-sap/sap-hana-cloud-vector-engine-quick-faq-referenc... by @shabana 

FrankStienhans
Discoverer

Nice article.

On your HANA SQL example: 

Can I update 2 columns from 2 tables joined by foreign key with one statement in SAP HANA?


Please note that the statement can be read in different ways

A)

  1. Can I update 2 columns from 2 tables
  2. joined by foreign key ...

B)

  1. Can I update 2 columns
  2. from 2 tables joined by foreign key ...

You meant A), but the AI might occasionally understand B).
Further, as LLMs are eager to help, they probably have a bias towards B).

In AI it is critical to understand that the AI does not have our context unless we explain that context to it. That is also why other humans misunderstand us so often.

MarioDeFelipe
Active Contributor

Hi @FrankStienhans good to see you here

No, it's not a prompt issue, it's a hallucination. The question is well understood, but the LLM makes up the answer incorrectly. We cannot update two different tables in a single UPDATE statement in SAP HANA, if I am not mistaken; HANA does not support multi-table updates in a single statement, and Mistral gives the answer incorrectly. Probably it was either trained on bad data or it totally made up the statement. We can't solve this by prompt engineering; this incorrect data is in the model weights.

Meta took a very good approach with Llama: it was probably trained on less data, but more accurate data. It's better to have a model that knows good stuff than a lot of stuff with mixed good and bad data, for reasons like this.

FrankStienhans
Discoverer

@MarioDeFelipe 

We generally don't observe hallucination issues. However, with your prompt our AI misunderstood the request (but produced valid SQL).

When I changed the prompt to:

Can I update 2 columns from 2 tables joined by foreign key with one statement in SAP HANA? Those 2 to be changed columns are each in separate tables

It answered correctly.

Screenshot 2024-09-12 at 7.33.47 AM.png