After 5 to 10 minutes the instance should start. You are then given an inference-only URL endpoint, as well as an example curl statement to test it.
curl https://<your end point goes here> \
  -X POST \
  -d '{"inputs":"What is ABAP Cloud?"}' \
  -H "Authorization: Bearer <your hf access token>" \
  -H "Content-Type: application/json"
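For readers who would rather test the endpoint from Python than from curl, the same request can be sketched with the standard library alone. This is a minimal sketch, not part of the original app: the function name `build_inference_request` and the placeholder URL and token are assumptions for illustration, and the actual network call is left commented out because it needs a running endpoint.

```python
import json
import urllib.request


def build_inference_request(endpoint_url: str, hf_token: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that the curl example performs."""
    payload = json.dumps({"inputs": prompt}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {hf_token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder values (assumptions): replace with your own endpoint URL and token.
    request = build_inference_request(
        "https://example.invalid/your-endpoint", "hf_xxx", "What is ABAP Cloud?"
    )
    # Sending requires a running endpoint, so the call is left commented out.
    # with urllib.request.urlopen(request) as response:
    #     print(json.loads(response.read()))
    print(request.get_method(), request.full_url)
```

Keeping the request construction separate from the send step makes it easy to inspect the headers and body before any tokens are spent on the endpoint.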
cf cups hf_llama2_inf_service -p '{"url": "https://<Hugging Face Inference endpoint>", "api_key": "<hf access token>"}'
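Once the user-provided service created by `cf cups` is bound to the app, Cloud Foundry exposes its credentials to the app through the `VCAP_SERVICES` environment variable, under the `user-provided` key. The sketch below shows roughly what a lookup by service name involves, using only the standard library. The helper `get_user_provided_credentials` and the sample values are assumptions for illustration; the `env.get_service(...)` call in server.py presumably wraps a similar lookup.

```python
import json
import os
from typing import Optional


def get_user_provided_credentials(service_name: str, vcap_json: Optional[str] = None) -> dict:
    """Return the credentials dict of a user-provided service from VCAP_SERVICES."""
    # Fall back to the real environment variable when no JSON string is passed in.
    raw = vcap_json if vcap_json is not None else os.environ.get("VCAP_SERVICES", "{}")
    services = json.loads(raw)
    for entry in services.get("user-provided", []):
        if entry.get("name") == service_name:
            return entry.get("credentials", {})
    raise KeyError(f"service {service_name!r} not found in VCAP_SERVICES")


if __name__ == "__main__":
    # Sample payload with placeholder values, mirroring the JSON passed to `cf cups -p`.
    sample = json.dumps({
        "user-provided": [
            {
                "name": "hf_llama2_inf_service",
                "credentials": {"url": "https://example.invalid", "api_key": "hf_xxx"},
            }
        ]
    })
    creds = get_user_provided_credentials("hf_llama2_inf_service", sample)
    print(sorted(creds))
```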
#requirements.txt
#CHANGES: add huggingface library
#---------------------------------------
...
huggingface_hub
#manifest.yaml
#CHANGES: add service
#---------------------------------------
...
- hf_llama2_inf_service
#server.py
# CHANGES:
# - add Hugging Face functionality
# - get hf service credentials
# - add hf inference call logic
#---------------------------------------
...
from huggingface_hub import InferenceClient
...
...
# Get the Hugging Face credentials
service_name = "hf_llama2_inf_service"
hf_api_key = env.get_service(name=service_name).credentials['api_key']
hf_llama2_inf_url = env.get_service(name=service_name).credentials['url']
print("Hugging Face API Key assigned")
...
...
elif llm == 'Llama-2-7b-chat-hf':
    # Streaming client
    client = InferenceClient(hf_llama2_inf_url, token=hf_api_key)
    # Generation parameters
    gen_kwargs = dict(
        max_new_tokens=1024,
        top_k=50,
        top_p=0.95,
        temperature=0.8,
        stop_sequences=["\nUser:", "<|endoftext|>", "</s>"],
    )
    stream = client.text_generation(prompt, stream=True, details=True, **gen_kwargs)
    # Yield each generated token
    for r in stream:
        # Skip special tokens
        if r.token.special:
            continue
        # Stop if we encounter a stop sequence
        if r.token.text in gen_kwargs["stop_sequences"]:
            break
        # Yield the generated token
        print(r.token.text, end="")
        yield r.token.text
...
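The token-filtering loop in server.py can be tried out without a live endpoint. The following is a minimal sketch of the same skip-special and stop-sequence logic running against a hand-built fake stream. The `Token` and `StreamResponse` classes and the `filter_stream` helper are hypothetical stand-ins for illustration, not the huggingface_hub types; they only mimic the `r.token.text` and `r.token.special` attributes that the loop reads.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Sequence


@dataclass
class Token:
    """Hypothetical stand-in for the token carried by each streamed response."""
    text: str
    special: bool = False


@dataclass
class StreamResponse:
    """Hypothetical stand-in for one item of the streamed generation output."""
    token: Token


def filter_stream(stream: Iterable[StreamResponse], stop_sequences: Sequence[str]) -> Iterator[str]:
    """Yield token texts, skipping special tokens and ending at a stop sequence."""
    for r in stream:
        if r.token.special:
            continue  # special tokens are not shown to the user
        if r.token.text in stop_sequences:
            break  # a token that equals a stop sequence ends generation
        yield r.token.text


if __name__ == "__main__":
    fake_stream = [
        StreamResponse(Token("ABAP")),
        StreamResponse(Token("<s>", special=True)),
        StreamResponse(Token(" Cloud")),
        StreamResponse(Token("</s>")),
        StreamResponse(Token("ignored")),
    ]
    print("".join(filter_stream(fake_stream, ["\nUser:", "<|endoftext|>", "</s>"])))
```

Note that the check compares a single token's text against each stop sequence, so a stop sequence that the model emits across several tokens would not be caught by this loop.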