Solved: Problems with CONTENT_HEX in SO_DOCUMENT_READ_API1

alejandro_bindi · ‎2013 May 30

I developed a program which uses GOS attachments list to allow attachment of files. In a subsequent step, those attachments must be read and processed. So, I read the file contents using (according to note 927407) class CL_BINARY_RELATION and afterwards function module SO_DOCUMENT_READ_API1.

The problem is that when processing those contents, the files end up damaged. It happens particularly with DOCX / PPTX / XLSX files (the newer Office formats).

If I download those files (using GUI_DOWNLOAD) or email them (using CL_DOCUMENT_BCS->ADD_ATTACHMENT), the corresponding Office applications warn on opening that the file is damaged, and then repair it.

Comparing CONTENT_HEX with the DATA_TAB from GUI_UPLOAD of the same file (DIFF tool in debugger), I've narrowed the difference to the last bytes of the table. On GUI_UPLOAD one (undamaged file), the end is "0000000000...", while the one returned by SO_DOCUMENT_READ_API1 is "0202020202..."

I've seen that inside SO_DOCUMENT_READ_API1, the hex content is filled up using the text content, by function module SO_SOLITAB_TO_SOLIXTAB.

I've tried to work around this by doing the conversion myself with CL_BCS_CONVERT=>SOLI_TO_SOLIX, but the same happens.

I've run out of ideas and found no SMP note on this, so if anyone else has had the same problem, please help.

Thanks

alejandro_bindi · ‎2013 May 30

SOLVED: The problem actually was the size calculation.

On GUI_DOWNLOAD, I was calculating wrongly myself.

On CL_DOCUMENT_BCS->ADD_ATTACHMENT, the standard method called COUNT_DOC_SIZE is wrong!

The solution is to always use DOCUMENT_DATA-DOC_SIZE returned by SO_DOCUMENT_READ_API1, instead of calculating. Of course, COUNT_DOC_SIZE should be corrected, but I found no note.
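A minimal sketch of that fix for the download case. The variables `lv_doc_id` and `lv_filename` are hypothetical placeholders, not names from the original program. The point is that DOC_SIZE goes straight into BIN_FILESIZE instead of a size derived from the row count; a value such as rows times 255 would typically include the padding of the last table row.

```abap
DATA: lv_doc_id   TYPE sofolenti1-doc_id,  " hypothetical: ID obtained via CL_BINARY_RELATION
      lv_filename TYPE string VALUE 'C:\temp\attachment.docx', " hypothetical target path
      ls_doc_data TYPE sofolenti1,
      lt_hex      TYPE solix_tab,
      lv_size     TYPE i.

" Read the binary content together with the document metadata
CALL FUNCTION 'SO_DOCUMENT_READ_API1'
  EXPORTING
    document_id                = lv_doc_id
  IMPORTING
    document_data              = ls_doc_data
  TABLES
    contents_hex               = lt_hex
  EXCEPTIONS
    document_id_not_exist      = 1
    operation_no_authorization = 2
    x_error                    = 3
    OTHERS                     = 4.
IF sy-subrc <> 0.
  RETURN.
ENDIF.

" Take the size reported by the function module; do not recompute it
lv_size = ls_doc_data-doc_size.

" BIN_FILESIZE cuts the output at the real length, dropping row padding
CALL FUNCTION 'GUI_DOWNLOAD'
  EXPORTING
    bin_filesize = lv_size
    filename     = lv_filename
    filetype     = 'BIN'
  TABLES
    data_tab     = lt_hex
  EXCEPTIONS
    OTHERS       = 1.
```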

Former Member · ‎2015 Mar 19

Hi Alejandro

With reference to your post about conversion errors while reading docx/pptx/xlsx files using SO_DOCUMENT_READ_API1, could you elaborate on the solution you found?

I too am facing the same issue while reading the content of a docx file attached to a purchase requisition (ME53N) attachment list.

The DOCUMENT_DATA-DOC_SIZE parameter is just being returned by the FM, so how did it solve the issue?

Thanks in advance

Best Regards

Rohan D Kannikar

alejandro_bindi · ‎2015 Mar 19

Hello Rohan, as I said in my latest post, as long as you use DOCUMENT_DATA-DOC_SIZE for either task (downloading or emailing), you should be fine since it is calculated correctly. Pass that variable along with your binary (typed x) internal table which holds the contents to the output method / f.m.
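For the emailing case, a sketch along the same lines. It assumes `ls_doc_data` and `lt_hex` were already filled by SO_DOCUMENT_READ_API1; the subject texts and the attachment type 'BIN' are illustrative assumptions, not values from the original program. The size is passed explicitly so that it does not have to be derived from the table.

```abap
DATA: ls_doc_data TYPE sofolenti1,   " filled earlier by SO_DOCUMENT_READ_API1
      lt_hex      TYPE solix_tab,    " CONTENTS_HEX from the same call
      lt_body     TYPE soli_tab,     " mail body, left empty in this sketch
      lv_att_size TYPE sood-objlen,
      lv_msg      TYPE string,
      lo_document TYPE REF TO cl_document_bcs,
      lx_bcs      TYPE REF TO cx_bcs.

" Hand over the size reported by the function module unchanged
lv_att_size = ls_doc_data-doc_size.

TRY.
    lo_document = cl_document_bcs=>create_document(
                    i_type    = 'RAW'
                    i_text    = lt_body
                    i_subject = 'Attachment test' ).     " hypothetical subject

    lo_document->add_attachment(
      i_attachment_type    = 'BIN'                      " assumption for this sketch
      i_attachment_subject = 'attachment.docx'          " hypothetical name
      i_attachment_size    = lv_att_size
      i_att_content_hex    = lt_hex ).
  CATCH cx_bcs INTO lx_bcs.
    lv_msg = lx_bcs->get_text( ).
    MESSAGE lv_msg TYPE 'E'.
ENDTRY.
```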

Regards

Former Member · ‎2015 Mar 20

Hi Alejandro

I need the text that is retrieved into the it_content internal table in order to display it in a Smart Form. When reading a docx document attached to a purchase requisition (transaction ME53N), the internal table contains garbage values, although the contents of an attached Notepad (plain text) file are read into it correctly.

CALL FUNCTION 'SO_DOCUMENT_READ_API1'
  EXPORTING
    document_id    = 'FOL39000000000004EXT40000000000110'
    filter         = zlc_x
  IMPORTING
    document_data  = wa_data
  TABLES
    object_content = it_content
    contents_hex   = it_solix.

Can you help regarding this issue?

Best Regards

Rohan D Kannikar

alejandro_bindi · ‎2015 Mar 23

Hi Rohan, yours is a different kind of problem then. You are basically trying to read a file as plain text which is NOT plain text. You would have the same exact issue if you uploaded that word file from your PC using GUI_UPLOAD instead of reading it from the system by SO_DOCUMENT_READ_API1.

Most probably you could try using OLE/DOI to interpret the file (research about interface I_OI_WORD_PROCESSOR_DOCUMENT), or since it's a DOCX file, maybe there are some XSLT transformations available to extract the text. I haven't used those methods though so I can't provide more info.
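As a starting point for the XSLT route, here is a hedged sketch of one preparatory step that is not mentioned in this thread: a DOCX file is a ZIP archive whose main text lives in the entry word/document.xml, so the archive has to be unpacked first. The sketch uses SCMS_BINARY_TO_XSTRING and CL_ABAP_ZIP for that. It assumes `lt_hex` and `lv_size` come from SO_DOCUMENT_READ_API1; all variable names are illustrative.

```abap
DATA: lt_hex  TYPE solix_tab,   " CONTENTS_HEX from SO_DOCUMENT_READ_API1
      lv_size TYPE i,           " DOCUMENT_DATA-DOC_SIZE from the same call
      lv_docx TYPE xstring,
      lv_xml  TYPE xstring,
      lo_zip  TYPE REF TO cl_abap_zip.

" Build one xstring of exactly lv_size bytes from the 255-byte rows
CALL FUNCTION 'SCMS_BINARY_TO_XSTRING'
  EXPORTING
    input_length = lv_size
  IMPORTING
    buffer       = lv_docx
  TABLES
    binary_tab   = lt_hex
  EXCEPTIONS
    failed       = 1
    OTHERS       = 2.
IF sy-subrc <> 0.
  RETURN.
ENDIF.

" Open the DOCX as a ZIP archive and extract the main document part
CREATE OBJECT lo_zip.
lo_zip->load(
  EXPORTING
    zip             = lv_docx
  EXCEPTIONS
    zip_parse_error = 1
    OTHERS          = 2 ).
IF sy-subrc = 0.
  lo_zip->get(
    EXPORTING
      name                    = 'word/document.xml'
    IMPORTING
      content                 = lv_xml
    EXCEPTIONS
      zip_index_error         = 1
      zip_decompression_error = 2
      OTHERS                  = 3 ).
ENDIF.
```

After this, `lv_xml` holds the raw XML of the document body, which would still need an XSLT transformation or an XML parser to reduce it to plain text.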

Regards

Former Member · ‎2015 Apr 10

Hi Rohan,

If you have found a solution for this, please let me know.

I am facing the same problem while reading an XML attachment in class CL_DOCUMENT_BCS.

While reading the contents_hex values, I get garbage values like '0000000000000' at the end of the string.

I tried to convert the xstring values to a string, and those values became '################################'.

But I couldn't remove them using any of the standard approaches, such as:

1) REPLACE ALL OCCURRENCES OF '##' IN text1 WITH ' '.

2) TRANSLATE text1 USING '## '.

However, if I copy the same string into the normal SE38 editor and apply the approaches above, it works.

If you have identified a solution, please let me know.
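One possible approach, following the size fix described earlier in this thread, is to cut the binary data to its real length before converting it to text, rather than removing characters afterwards. The '#' characters may simply be how non-displayable characters (here, the converted zero bytes) are shown, which would explain why searching for a literal '#' finds nothing. The sketch below assumes the true byte length `lv_length` is known (for example from DOC_SIZE) and that the XML is UTF-8 encoded; both are assumptions, and the variable names are illustrative.

```abap
DATA: lv_xstring TYPE xstring,  " attachment content, possibly with trailing 00 bytes
      lv_length  TYPE i,        " assumed: real size in bytes, e.g. from DOC_SIZE
      lv_text    TYPE string.

" Drop the padding bytes before any conversion to text takes place
IF lv_length > 0 AND lv_length < xstrlen( lv_xstring ).
  lv_xstring = lv_xstring(lv_length).
ENDIF.

TRY.
    " UTF-8 is an assumption; use the encoding declared in the XML header
    lv_text = cl_abap_codepage=>convert_from(
                source   = lv_xstring
                codepage = `UTF-8` ).
  CATCH cx_parameter_invalid_range
        cx_sy_codepage_converter_init
        cx_sy_conversion_codepage.
    CLEAR lv_text.
ENDTRY.
```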
