Metering of SAP BTP GenAI offerings

essay88
Explorer

Hi,

I was reading about the new GenAI offerings on BTP to understand their capabilities and metering. I came across a statement in the standard documentation which I could not understand.

[Screenshot: excerpt from the SAP documentation]

The last paragraph states - "A GenAI token corresponds to a block of 1000 tokens from the LLM service provider."

Does this statement mean that 1 token of the SAP BTP GenAI hub is equivalent to 1,000 tokens from the Azure OpenAI service? And so 1,000 BTP GenAI tokens will contain 1,000,000 tokens from the Azure OpenAI service?

Can someone please confirm whether I am reading it right?

Ivan-Mirisola
Product and Topic Expert

Hi @essay88,

I believe the wording in the documentation is not helping here (you can submit your concerns directly through the documentation's feedback function so SAP can improve the wording for other customers).

In essence, the documentation is trying to explain that a conversion ratio is applied for every 1,000 tokens you use of a particular LLM. That ratio varies between input and output, and from one LLM to another. Check the example table in the same documentation for each ratio.

The example formula in the documentation shows how capacity units (CUs) are calculated:

Capacity units = x/1000 * 0.000210 + y/1000 * 0.00274

Where:

  • x is the number of input tokens
  • y is the number of output tokens

So, first you determine the number of input blocks you need by dividing 'x' by 1000.
Then you multiply that number of blocks by the input ratio of the particular LLM (here we are using GPT-35-Turbo's ratio).
We do the same for the output and then add the two values together to get the number of CUs.

This is simply stating that for every 1 block of input tokens you will have to add 0.000210 CUs.
And for every 1 block of output tokens used you will have to add 0.00274 CUs.
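
The calculation above can be sketched in a few lines of Python. This is only an illustration, not SAP's implementation: the function and constant names are made up for this example, the two ratios are the GPT-35-Turbo values quoted in this post, and it assumes partial blocks are counted proportionally, as the formula suggests.

```python
# Conversion ratios per block of 1,000 LLM tokens, as quoted in this post
# for GPT-35-Turbo. Other models have different ratios (see the table in
# the documentation); these names are illustrative, not an SAP API.
INPUT_CU_PER_BLOCK = 0.000210
OUTPUT_CU_PER_BLOCK = 0.00274
BLOCK_SIZE = 1000  # LLM tokens per block


def capacity_units(input_tokens, output_tokens,
                   input_ratio=INPUT_CU_PER_BLOCK,
                   output_ratio=OUTPUT_CU_PER_BLOCK):
    """Estimate CUs as x/1000 * input_ratio + y/1000 * output_ratio."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Assumption: fractional blocks are allowed (no rounding up to whole blocks).
    input_blocks = input_tokens / BLOCK_SIZE
    output_blocks = output_tokens / BLOCK_SIZE
    return input_blocks * input_ratio + output_blocks * output_ratio


if __name__ == "__main__":
    # 1,000 input tokens and 1,000 output tokens
    print(f"{capacity_units(1000, 1000):.5f}")  # prints 0.00295
```

For real estimates, use the official calculator and the ratios published for the model you actually plan to use.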

Therefore, no, the following statement is wrong:

"1,000 BTP GenAI tokens will contain 1,000,000 tokens from Azure Open AI service".

The correct statement should be something along the following lines:

"By paying SAP for (0.000210 + 0.00274) CUs, you will be able to use 1000 input and 1000 output tokens with the GPT-35-Turbo LLM".

Suffice it to say that each CU in BTP corresponds to a monetary value, which is paid to SAP only.

Therefore, if you pay 1.04 EUR you will be able to use 1000 input tokens & 1000 output tokens.

The calculator is your friend here: it lets you enter any number of input and output tokens to estimate what your scenario requires. It does the math for you and returns the CUs needed to cover the input/output tokens for each LLM you plan to use.

Once you have it in place, you will be consuming cloud credits in BTP according to the amount of CUs you consume. 

While estimating BTP costs, keep in mind that Generative AI hub requires the Extended service plan of SAP AI Core, which is metered in Capacity Units, not tokens. You must also add the Standard service plan for the AI Launchpad service, which is metered by the number of tenants created. Also, AI Core is rarely used by itself in BTP: you will usually need an application runtime for a front end that consumes it, and that front end may use several other BTP services.

Hope this clarifies.

Best regards,
Ivan

essay88
Explorer

Thank you @Ivan-Mirisola. It helps. I have provided feedback on the documentation page.