How to split a Microsoft Word document into images in PHP using the Aspose.Words REST API and get the result in the response body

We want to make use of ‘splitDocument’ or ‘saveAs’ from the Document API to convert a single .docx file into .png images for each page.

Neither of these calls lets us get the images back directly, so we need a separate API call for each image. Our documents average 40 pages, so when every “get image” call takes 1.5 seconds, it results in 60 seconds of overhead.

We would like to see options like:
  • return the raw images (base64 encoded?) in the first splitDocument / saveAs call
  • return a .zip with all images in it
I heard from Billy Lundy that you are already busy implementing this, but please keep me up to date, because this would drastically reduce your server load (the number of Other API operations calls).

Hi Richard,

You have already reported this issue and our developers have been working on it. The good news is that the issue has been resolved and the fix is already live. It will be announced in the public release next week, and the respective topics will be added to the documentation for your reference.

Best Regards,

Hi Richard,

In the meantime, you can update your URI to use this new feature. URI for split resource will be:

http://api.aspose.com/v1.1/words/Sample.doc/split?format=png&zipoutput=true

and output document name can be extracted from JSON e.g.

$json->SplitResult->ZippedPages->Href
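
As a rough illustration of the two steps above, here is a minimal PHP sketch. The helper names buildSplitUri and zippedPagesHref are hypothetical and not part of any SDK, the sample response body is made up for the example, and request signing/authentication and the HTTP call itself are deliberately left out.

```php
<?php
// Sketch only: builds the split URI shown above and reads
// SplitResult->ZippedPages->Href from a JSON response body.
// Authentication and the actual HTTP request are omitted.

/**
 * Build the split URI (hypothetical helper, not an SDK function).
 */
function buildSplitUri(string $documentName, string $format = 'png', bool $zipOutput = true): string
{
    $query = http_build_query([
        'format'    => $format,
        // The flag is sent as the literal string "true" or "false".
        'zipoutput' => $zipOutput ? 'true' : 'false',
    ]);

    return 'http://api.aspose.com/v1.1/words/' . rawurlencode($documentName) . '/split?' . $query;
}

/**
 * Return the ZIP file name from a JSON response body,
 * or null when the body is not valid JSON or ZippedPages is missing/null.
 */
function zippedPagesHref(string $responseBody): ?string
{
    $json = json_decode($responseBody);

    return $json->SplitResult->ZippedPages->Href ?? null;
}

// Example usage with an illustrative response body.
$uri  = buildSplitUri('Sample.doc');
$href = zippedPagesHref('{"SplitResult":{"ZippedPages":{"Href":"Sample.png.zip"}}}');
```

The null return makes it easy to fall back to the per-page download when no ZIP is reported.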

Best Regards,

Hi Muhammad, sorry for opening another issue, but I was unable to find the earlier one, so I reported it again.


Great to hear that the feature is live already! This will definitely speed up our conversion process a lot.

Some questions:

Hi Richard,

1. Yes, it will be one call and return JSON/XML.

2. Yes, it contains a link to the output ZippedPages ZIP file, which can be downloaded from Aspose for Cloud or the specified third-party storage.

3. Yes, you can use Download File example.

4. Yes, ZIP feature is supported for all formats supported by split resource.

5. Sure. We will add it once these features are publicly released next week.

Best Regards,

Hi Muhammad, I’ve tried to test the “zipOutput” feature, but it does not seem to work yet. The SDK still only puts images in the online folder, not the expected .zip file.

The only result I get with the following call

http://api.aspose.com/v1.1/words/028b222f72d9d41662970183d0e9689641e2a2d7.docx/split?format=png&zipOutput=028b222f72d9d41662970183d0e9689641e2a2d7.docx.zip&appSID=53d3be57-9038-4360-a6ab-28017f8a7b93

is the following json:

{#1173
  +"SourceDocument": {#1174
    +"Href": "48822db50d78437ae8ed98385b102fd6a7598e7a.docx"
    +"Rel": "self"
    +"Type": null
    +"Title": null
  }
  +"Pages": array:19 [
    0 => {#1175}
    1 => {#1176}
    2 => {#1177}
    3 => {#1178}
    4 => {#1179}
    5 => {#1180}
    6 => {#1181}
    7 => {#1182}
    8 => {#1183}
    9 => {#1184}
    10 => {#1185}
    11 => {#1186}
    12 => {#1187}
    13 => {#1188}
    14 => {#1189}
    15 => {#1190}
    16 => {#1191}
    17 => {#1192}
    18 => {#1193}
  ]
  +"ZippedPages": null
}

Ah, now I see what went wrong in my previous call. The “zipOutput” parameter is a boolean, not the desired zip filename.

The next problem I face is that the OutputResult->Pages array is null. I’d like to know the number of pages directly, and be able to iterate over them.

{#1173
  +"SourceDocument": {#1174
    +"Href": "fabb9c4c13887d735b293302dee59d7f6ca8e60e.docx"
    +"Rel": "self"
    +"Type": null
    +"Title": null
  }
  +"Pages": null
  +"ZippedPages": {#1175
    +"Href": "fabb9c4c13887d735b293302dee59d7f6ca8e60e.png.zip"
    +"Rel": "zippedpages"
    +"Type": null
    +"Title": null
  }
}

Otherwise I have to determine those things from the number of images in the .zip, which seems weird to me. Please always return the Pages, because we should be able to do things based on that. The .zip content should represent what is in the Pages array.
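
For illustration, the workaround described above might look roughly like the following PHP sketch. The function names are hypothetical, the ZIP fallback needs PHP's zip extension, and it assumes the downloaded archive contains nothing but the page images.

```php
<?php
// Sketch only: prefer the Pages array when the API returns it,
// otherwise count the entries in the downloaded ZIP.

/**
 * Number of pages reported by a decoded SplitResult object,
 * or null when Pages is missing or null (as in the dump above).
 */
function reportedPageCount(object $splitResult): ?int
{
    $pages = $splitResult->Pages ?? null;

    return is_array($pages) ? count($pages) : null;
}

/**
 * Fallback: count the entries of a ZIP file already downloaded to disk.
 * Assumes every entry is one page image (no folders or extra files).
 */
function countZipEntries(string $zipPath): int
{
    $zip = new ZipArchive();
    if ($zip->open($zipPath) !== true) {
        throw new RuntimeException("Cannot open ZIP file: $zipPath");
    }
    $count = $zip->numFiles;
    $zip->close();

    return $count;
}

// Example usage with a response shaped like the second dump above.
$result = json_decode('{"Pages":null,"ZippedPages":{"Href":"example.png.zip"}}');
$count  = reportedPageCount($result); // null here, so the ZIP would have to be inspected
```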

The WordsDocument saveAs documentation does not work.


Aspose.Words Cloud - API References

Hi Richard,

Can you please elaborate on what exactly you want in Pages? Do you want it to return an integer value indicating the total number of pages in the ZIP file, or do you want it to return the names of the output pages?

Regarding saveAs, it is working fine at my end. Can you please share your code to reproduce the issue?

Best Regards,

The link of the PUT action for the WordsDocumentSaveAs documentation does not work. The following link is shown there:


Conversion Settings|Aspose Words Cloud Docs

Concerning the Pages: I’d expect that when zipped output is returned, the SplitPages array is also set, in the same way as it was without the zipOutput parameter.

We have been testing the splitDocument .zip function on our QA environment.


The performance is really poor: splitting a 9.8 MB .docx document of only 47 pages took more than two minutes.

At that speed, converting to .pdf and splitting the .pdf into one image per page on our own server is even faster.

Please explain why the splitting takes that long. Otherwise this seemingly great feature is not usable for anyone.

Hi Richard,

The WordsDocument saveAs documentation does not work.

http://api.aspose.com/v1.1/Help

Do you mean that the Test API feature does not work when you try it using the online test API?

Best Regards,

No, just that the documentation link is broken.


The problem with the huge performance delay on splitting + downloading a .zip is more important to us at the moment.

Hi Richard,

Regarding the link issue, it will be resolved soon.

As far as the performance of the split resource is concerned, these were my observations:

  • Tested a 30 MB file with images and fewer pages. It took less than 5 seconds.
  • Tested a 10 MB file with 75 pages. It took around 2 minutes.

Based on the above tests, the more text and pages a document has, the longer it takes to split. Can you please also share your document? This will help us test it at our end once the issue is resolved.
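
To compare timings like these on both sides, a small measuring helper could be used. The following PHP sketch is only an illustration: timeCall and secondsPerPage are hypothetical names, and the closure stands in for whatever code actually performs the split request.

```php
<?php
// Sketch only: time an arbitrary operation and express it per page.

/**
 * Run $operation and return [result, elapsed seconds].
 * hrtime(true) is a monotonic clock in nanoseconds (PHP 7.3+).
 */
function timeCall(callable $operation): array
{
    $start   = hrtime(true);
    $result  = $operation();
    $seconds = (hrtime(true) - $start) / 1e9;

    return [$result, $seconds];
}

/**
 * Average time per page for a split that took $seconds in total.
 */
function secondsPerPage(float $seconds, int $pages): float
{
    if ($pages <= 0) {
        throw new InvalidArgumentException('Page count must be positive.');
    }

    return $seconds / $pages;
}

// Example usage: the closure is a placeholder for the real split call.
[$result, $elapsed] = timeCall(function () {
    return 'placeholder for the split response';
});

// The second observation above (about 2 minutes for 75 pages) works out to
// roughly 1.6 seconds per page.
$perPage = secondsPerPage(120.0, 75);
```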

Best Regards,

Hi Muhammad,


1) We can confirm the second scenario, splitting a file with many pages takes > 2 minutes.

2) We would love to help you test the 30 MB files, but currently we have to pay for every call we make. A demo account only allows 100 calls and 1 MB per file.

Can you please set up a test account for us with a 30 MB (or higher) upload limit and a generous number of document calls? We only discovered the poor performance last week, when we were already live.

Hi Richard,

The Pages property returning null instead of the pages array and the performance issue have been logged in our issue tracking system as SAASWORDS-206 and SAASWORDS-207 respectively. We will keep you updated on these issues in this thread.

As far as testing larger files is concerned, you can also test the files with Aspose.Words for .NET, because this is not an Aspose for Cloud specific issue; it will be resolved in Aspose.Words for .NET first.

Best Regards,

You are asking me to test with a totally different product? We do not have a .NET development environment, so that will not be possible. Please keep us up to date about releases for the Cloud products. Thanks for now!

Hi Richard,

In fact, the 30 MB limit is not available to all customers. It has been allocated to a limited number of customers with custom plans in order to maintain performance.

We will check the possibility of a test account for you after the issue is resolved. Sorry for the inconvenience.

Best Regards,

Hi Muhammad, any update on the performance issues? Does the wordcount include the textareas already? Thanks

Hi Richard,

The performance issue has a high priority, but it can take some time because it has to be fixed in the codebase first. We cannot share a concrete ETA, but it is expected to be fixed in the near future.

Does the wordcount include the textareas already?

Yes.

Best Regards,