Technology Blog Posts by SAP
qiushi_wang1
Product and Topic Expert

Introduction

I wanted to add an AI-powered business card scanner to an MDK app. Instead of spending hours reading documentation, I used Cline AI assistant with tools and resources from MCP servers to learn from examples and build the feature through natural conversation. The whole process took about 30 minutes instead of the usual 4-8 hours.

This is the story of how it went, including the mistakes I made and how I fixed them.

Prerequisites

Before starting, ensure you have:

  • An MDK app project open in VS Code
  • The Cline extension for VS Code, with the GitHub MCP Server and MDK MCP Server configured
  • An SAP BTP subaccount with SAP AI Core set up and a GPT-4o deployment
  • A BTP destination (named "AICoreAPI" in this example) pointing to your AI Core instance
  • SAP Mobile Services for deploying the built app

Step 1: Learning from an Example

I started by asking Cline to learn from an existing example app that already had AI Core integration. I gave it this prompt:

"learn MDK Action.Type.AICore.Chat.Completions from an example app at https://github.com/gqswang/cloud-mdk-samples/tree/main/Showcase_Apps/AICoreApp/MDKGenAI.mdkproject with github mcp server"

Cline used the GitHub MCP Server to access and analyze the repository, finding the AICoreApp showcase. Within minutes, Cline explained how AI Core integration works in MDK. The key insights were:

The correct action type is Action.Type.AICore.Chat.Completions - not a REST service action like I initially thought.

Dynamic execution is the way to go. Instead of hardcoding everything in the action file, you execute the action from JavaScript and pass parameters at runtime. This gives you flexibility to change the AI prompt, temperature, and other settings based on what the user is doing.
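The dynamic pattern can be sketched as below. The action path and the "Body" property name are illustrative assumptions, not copied from the example app; in a real MDK rule this function would be the default export, with `context` being the clientAPI object MDK passes in:

```javascript
// Sketch: executing the AI Core action from JavaScript and passing the
// request at runtime, instead of hardcoding it in the .action file.
function askAI(context, requestBody) {
    // executeAction accepts an object with the action path plus property
    // overrides, so prompt, temperature, etc. can vary per call.
    return context.executeAction({
        "Name": "/MDKGenAI/Actions/Customer/ChatCompletions.action", // project-specific path
        "Properties": {
            "Body": JSON.stringify(requestBody) // runtime overrides travel here
        }
    });
}
```

This keeps the .action file minimal and moves all request shaping into the rule, where it can react to user context.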

OpenAI Function Calling ensures you get structured JSON responses instead of unpredictable text. You define a schema for what you want back, and the AI follows it.
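A minimal sketch of such a function-calling definition, abbreviated to four fields for readability (the function name "extract_business_card" is illustrative, not taken from the project):

```javascript
// Sketch: an OpenAI function-calling tool definition that forces the
// model to return JSON matching this schema instead of free-form text.
function buildExtractionTool() {
    return {
        type: "function",
        function: {
            name: "extract_business_card",
            description: "Extract contact fields from a business card image",
            parameters: {
                type: "object",
                properties: {
                    firstName: { type: "string" },
                    lastName:  { type: "string" },
                    phone:     { type: "string" },
                    email:     { type: "string" }
                },
                required: ["firstName", "lastName"]
            }
        }
    };
}
```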

Image processing uses MDK's built-in binaryToBase64String() utility to convert photos into a format the AI can understand.

The example showed me that GPT-4o can handle both text prompts and images in a single request, which is perfect for scanning business cards. There was no need to opt for the latest, much more expensive model.
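A sketch of building such a combined text-plus-image request. In the MDK rule the base64 string would come from binaryToBase64String(); here it is simply a parameter:

```javascript
// Sketch: a GPT-4o chat request pairing a text prompt with a
// base64-encoded photo in one user message.
function buildVisionRequest(promptText, base64Image) {
    return {
        model: "gpt-4o",
        temperature: 0.1, // low temperature: factual extraction, not creativity
        messages: [{
            role: "user",
            content: [
                { type: "text", text: promptText },
                { type: "image_url",
                  image_url: { url: "data:image/jpeg;base64," + base64Image } }
            ]
        }]
    };
}
```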

Below is a screenshot of VSCode with Cline at step 1, learning from an example:

learning.png

Step 2: Building the Feature

With that knowledge, I asked Cline to implement the business card scanner:

"now please use similar approach to add a new functionality to perform photo taking at customer detail page, to take photo of customer business card, analyze it with AI and extract Name, Phone number, Address, email, etc. please use MDK MCP server, but upgrade MDK schema version to 25.9 first"

Cline used the MDK MCP Server to create all the necessary files:

Application.app - Upgraded the schema version from 24.4 to 25.9 to support the new AI Core action type.

Services/AzureOpenAI.service - A simple service definition pointing to the "AICoreAPI" destination on BTP.

Actions/Customer/ChatCompletions.action - The AI Core action with minimal configuration. The real magic happens at runtime.

Rules/Customer/ScanBusinessCard.js - The JavaScript rule that does the heavy lifting. It takes the photo, converts it to base64, structures the AI request with a detailed prompt and JSON schema, calls the AI Core action, parses the response, and populates the form fields.

Pages/Customer/Customer_BusinessCard_Edit.page - A page with an attachment control for taking photos and form fields for the extracted data.

Actions/Customer/NavToBusinessCardEdit.action - Navigation to open the scanning page as a modal.

The implementation used a temperature of 0.1 (very low) because we want factual extraction, not creative responses. The JSON schema defined 11 fields to extract: firstName, lastName, phone, email, street, city, state, postalCode, country, company, and jobTitle.
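Because the schema constrains the output, the rule can parse the extracted fields directly. A minimal sketch, assuming the response follows the standard OpenAI function-calling shape (which Azure OpenAI via the AI Core destination proxies):

```javascript
// Sketch: pulling the structured fields out of a function-calling
// response. The extracted data arrives as a JSON string in the tool
// call's arguments rather than as free-form text.
function parseExtraction(response) {
    const call = response.choices[0].message.tool_calls[0];
    return JSON.parse(call.function.arguments);
}
```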


Step 3: Validation and Debugging

This is where things got interesting. I asked Cline to validate and build the code:

"please use MDK MCP server to validate and build the generated code, fix the error if any"

Validation Passed

The MDK validator ran and reported 0 errors and 0 warnings. All file references were correct, the schema version was valid, and the action types were properly configured. There were 42 informational messages about platform-specific properties (like iOS-only or Android-only features), but those are normal for cross-platform apps.

First Build Failed

The build failed with a webpack error: it tried to process a markdown file (the draft of this blog post) sitting in the project root. MDK uses webpack to bundle everything, and webpack doesn't know how to handle markdown files.

Fix: Moved the markdown file to the Desktop. The rule is simple - only application metadata files belong in the project directory.

Second Build Succeeded

After removing the markdown file, the build completed successfully in about 2 seconds. It created two artifacts:

  • bundle.js (231 KB) - the compiled application
  • uploadBundle.zip (232 KB) - ready to deploy to Mobile Services

Pre-existing Bug Found

During validation, we also discovered an unrelated bug in the Dashboard page. It referenced rule files with .rule.js extensions, but the actual files only had .js extensions. This would have caused runtime errors. We fixed it by correcting the file references.

This showed me that validation doesn't just check new code - it can catch old bugs too.

Runtime Error

After deploying and testing, I hit a runtime error:

"Error scanning business card: TypeError: Cannot read properties of undefined (reading 'setValue')"

The AI extraction was working perfectly - I could see in the logs that it correctly extracted all the data from the business card. But the form fields weren't being populated.

Adding Logging: I asked Cline to add comprehensive logging to see what was happening. The logs showed that pageProxy.getControl() was returning undefined for all the form controls.

Root Cause: The controls were nested inside a SectionedTable, and getControl() only works for top-level controls.

The Fix: Changed from getControl() to evaluateTargetPath() with the #Control: syntax. This method can access controls at any nesting level.
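The corrected access pattern can be sketched as a small helper (the control name passed in is illustrative):

```javascript
// Sketch: resolving a form control via its #Control: target path, which
// works for controls nested inside a SectionedTable, unlike
// pageProxy.getControl(), which only sees top-level controls.
function setField(context, controlName, value) {
    const control = context.evaluateTargetPath("#Control:" + controlName);
    if (control) {
        control.setValue(value); // populate the extracted value
    }
}
```

Guarding for undefined keeps one missing control from aborting the whole population loop.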

After this fix, everything worked. The AI analyzed the business card photo, extracted all the information, and automatically populated the form fields.

All the code changes implemented by Cline with MCP servers throughout these steps are available in this pull request.

Below is a screen recording of the new feature running on the iOS simulator:

runtime.gif


What I Learned

MCP servers are incredible for learning. Instead of reading documentation for hours, I learned from a working example in minutes. Cline used the GitHub MCP Server to access and analyze the entire repository, extracting the key patterns I needed.

Logging is essential when debugging. The initial error message was misleading. Only after adding detailed logging did I see that the AI extraction was perfect and the problem was with control access.

Validation catches more than you expect. It found a pre-existing bug that had nothing to do with my new feature. This kind of comprehensive checking prevents future problems.

The control access pattern matters. getControl() vs evaluateTargetPath() is not obvious from documentation. This is the kind of thing you learn by doing, and having good logging helps you figure it out quickly.

AI-assisted development is iterative. Each step built on the previous one: learn from examples, implement based on that learning, validate to catch issues, debug systematically when things don't work, and reflect on what you learned.

Time savings are real. What would normally take 4-8 hours took about 30 minutes. That's an 87-94% reduction in development time.


The Development Workflow

The process followed this pattern:

  1. Learn - Cline used MCP servers to access and analyze example code
  2. Implement - Cline generated all necessary files based on the learned patterns
  3. Validate - Cline used MDK MCP Server tools to check for errors
  4. Build - Fixed issues and created deployment artifacts
  5. Debug - Added logging to understand runtime problems
  6. Fix - Applied the correct solution based on what the logs revealed

This isn't about AI replacing developers. It's about AI amplifying what developers can do. Cline learned from examples faster than I could read docs, but I still needed to guide it with clear prompts, interpret the results, and debug systematically when things went wrong.


Practical Takeaways

Let Cline use MCP servers to learn from examples first, before writing any code. It's faster and more effective than reading documentation.

Add comprehensive logging early, especially when integrating new technologies. Logs reveal the real story that error messages hide.

Validate frequently to catch issues before they compound. Don't wait until everything is built.

Be specific with prompts. Treat AI assistants like expert colleagues - give them context, share complete error messages, and ask for explanations when needed.

Document as you go. Writing things down helps you understand them better and creates a reference for next time.


Summary

This experience showed me that AI-assisted development with tools like Cline and MCP servers isn't just faster - it's a different way of working. You learn by example instead of by documentation. You iterate quickly instead of planning everything upfront. You debug systematically with comprehensive logging instead of guessing.

The combination of human creativity and judgment with AI's ability to learn from examples and MCP servers' domain-specific knowledge creates something more powerful than any of them alone.

This is my first experiment with AI-assisted coding using MDK and GitHub MCP servers. All code changes discussed in this blog are available in the pull request. If you want to use this code, ensure the SAP AI Core service configuration matches your own BTP environment. For detailed setup instructions, refer to my previous blog: SAP Build Apps integration with SAP AI Core services: Part 1 - Setup.

In future posts, I'll share more insights on integrating SAP MDK with various AI technologies.


Resources