How to split a Microsoft Word document into images in PHP using the Aspose.Words REST API and get the result in the response body

a.user · February 18, 2015, 8:49am

We want to make use of ‘splitDocument’ or ‘saveAs’ from the Document API to convert a single .docx file into .png images for each page.

Neither of these calls lets us get the images back directly, so we need to make a new API call for each image. Our documents average 40 pages, so when every “get image” call takes 1.5 seconds, that adds up to 60 seconds of overhead.

We would like to see options such as:

- return the raw images (base64 encoded?) in the first splitDocument / saveAs call
- return a .zip with all images in it

I heard from Billy Lundy that you were already working on this. Please keep me up to date, because it would drastically reduce your server load (the number of “Other API operations” calls).

muhammad.ijaz · February 19, 2015, 1:35am

Hi Richard,

You have already reported this issue, and our developers have been working on it. The good news is that it has been resolved and is already live. It will be announced in the public release next week, and the relevant topics will be added to the documentation for your reference.

Best Regards,

muhammad.ijaz · February 19, 2015, 1:51am

Hi Richard,

In the meantime, you can update your URI to use this new feature. The URI for the split resource will be:

http://api.aspose.com/v1.1/words/Sample.doc/split?format=png&zipoutput=true

and the output document name can be extracted from the JSON, e.g.

$json->SplitResult->ZippedPages->Href

Best Regards,
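
For readers following along, here is a minimal PHP sketch of reading that link from the response. It assumes the response body is JSON with a top-level "SplitResult" key, as the accessor above implies. The helper name zippedPagesHref() and the sample body are made up for illustration and are not part of the Aspose SDK.

```php
<?php
// Sketch only: extract the ZIP link from a split response body.
// Assumption: the JSON has the shape SplitResult -> ZippedPages -> Href,
// as implied by $json->SplitResult->ZippedPages->Href in this thread.

function zippedPagesHref(string $responseBody): ?string
{
    $json = json_decode($responseBody);

    // isset() is safe here even when ZippedPages is null or missing.
    if (!is_object($json) || !isset($json->SplitResult->ZippedPages->Href)) {
        return null;
    }

    return $json->SplitResult->ZippedPages->Href;
}

// Hypothetical sample body, not a real API response.
$sample = '{"SplitResult":{"Pages":null,"ZippedPages":{"Href":"Sample.png.zip","Rel":"zippedpages"}}}';
echo zippedPagesHref($sample), PHP_EOL;
```

Returning null instead of throwing lets the caller decide what to do when no ZIP link is present, for example when the flag was not recognised.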

a.user · February 19, 2015, 4:53am

Hi Muhammad, sorry for opening another issue, but I was unable to find the earlier one, so I reported it again.

Great to hear that the feature is live already! This will definitely speed up our conversion process a lot.

Some questions:

1. Do I need just one call for “splitDocument”? Will that call still return JSON?
2. Does the JSON contain a ZippedPages link, or is the ZIP included in the JSON itself? And is the ZIP stored on the Aspose server after the split?
3. Can the ZippedPages file be retrieved via the Folder class?
4. Is the ZIP feature also added to the other “split” calls, for instance in the .pdf class?
5. Can you also add documentation to GitHub? https://github.com/aspose-words-cloud/aspose-words-cloud-php/blob/master/tests/Aspose/Words/SplitDocumentToFormatTests.php

muhammad.ijaz · February 19, 2015, 8:32am

Hi Richard,

1. Yes, it will be one call and return JSON/XML.

2. Yes, it contains a link to the output ZippedPages ZIP file, which can be downloaded from Aspose for Cloud or the specified third-party storage.

3. Yes, you can use the Download File example.

4. Yes, the ZIP feature is supported for all formats supported by the split resource.

5. Sure. We will add it once these features are publicly released next week.

Best Regards,

a.user · February 24, 2015, 9:56am

Hi Muhammad, I’ve tried to test the “zipOutput” feature, but it does not seem to work yet. The SDK still only puts images in the online folder, not the expected .zip file.

The only result I get with the following call

http://api.aspose.com/v1.1/words/028b222f72d9d41662970183d0e9689641e2a2d7.docx/split?format=png&zipOutput=028b222f72d9d41662970183d0e9689641e2a2d7.docx.zip&appSID=53d3be57-9038-4360-a6ab-28017f8a7b93

is the following json:

{#1173
  +"SourceDocument": {#1174
    +"Href": "48822db50d78437ae8ed98385b102fd6a7598e7a.docx"
    +"Rel": "self"
    +"Type": null
    +"Title": null
  }
  +"Pages": array:19 [
    0 => {#1175}
    1 => {#1176}
    2 => {#1177}
    3 => {#1178}
    4 => {#1179}
    5 => {#1180}
    6 => {#1181}
    7 => {#1182}
    8 => {#1183}
    9 => {#1184}
    10 => {#1185}
    11 => {#1186}
    12 => {#1187}
    13 => {#1188}
    14 => {#1189}
    15 => {#1190}
    16 => {#1191}
    17 => {#1192}
    18 => {#1193}
  ]
  +"ZippedPages": null
}

a.user · February 24, 2015, 11:23am

Ah, now I see what went wrong in my previous call. The “zipOutput” parameter is a boolean, not the desired ZIP filename.
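
A small PHP sketch of building the URI with the flag as a boolean could look like this. The base URL and parameter names are copied from the URIs quoted in this thread. buildSplitUri() is a hypothetical helper, not an SDK function, and request signing (appSID and signature) is deliberately left out.

```php
<?php
// Sketch only: build the split URI with zipOutput passed as a boolean flag.
// Assumptions: base URL and query parameter names are taken from the URIs
// quoted in this thread; authentication parameters are omitted.

function buildSplitUri(string $fileName, string $format, bool $zipOutput): string
{
    $query = http_build_query(
        [
            'format'    => $format,
            // http_build_query() renders a PHP bool as 1/0,
            // so the literal string is passed instead.
            'zipOutput' => $zipOutput ? 'true' : 'false',
        ],
        '',
        '&'
    );

    return 'http://api.aspose.com/v1.1/words/' . rawurlencode($fileName) . '/split?' . $query;
}

echo buildSplitUri('Sample.docx', 'png', true), PHP_EOL;
```

Typing the argument as bool makes it harder to repeat the mistake of passing a filename where the flag is expected.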

The next problem I face is that the OutputResult->Pages array is null. I’d like to know the number of pages directly and be able to iterate over them.

{#1173
  +"SourceDocument": {#1174
    +"Href": "fabb9c4c13887d735b293302dee59d7f6ca8e60e.docx"
    +"Rel": "self"
    +"Type": null
    +"Title": null
  }
  +"Pages": null
  +"ZippedPages": {#1175
    +"Href": "fabb9c4c13887d735b293302dee59d7f6ca8e60e.png.zip"
    +"Rel": "zippedpages"
    +"Type": null
    +"Title": null
  }
}

Otherwise I have to derive those things from the number of images in the .zip, which seems odd to me. Please always return Pages, because we need to be able to act on it. The .zip content should match what is in the Pages array.
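
Until Pages is always populated, client code has to handle both shapes of the result. The following PHP sketch shows one way to do that, assuming the decoded result looks like the dumps above. pageCountFromSplitResult() is a hypothetical helper name, not part of the SDK.

```php
<?php
// Sketch only: read the page count from a decoded split result when possible.
// Assumption: $result has a "Pages" property that is either an array of page
// links or null, as in the two dumps shown in this thread.

function pageCountFromSplitResult(object $result): ?int
{
    if (isset($result->Pages) && is_array($result->Pages)) {
        return count($result->Pages);
    }

    // Pages is null (as observed when zipOutput is enabled): the caller has
    // to fall back to counting the entries in the downloaded ZIP.
    return null;
}
```

A null return value signals that the count is not available from the response, which keeps the ZIP fallback an explicit decision in the calling code.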

a.user · February 25, 2015, 2:54am

The WordsDocument saveAs documentation does not work.

Aspose.Words Cloud - API References

muhammad.ijaz · February 26, 2015, 2:25am

Hi Richard,

Can you please elaborate on what exactly you want in Pages? Do you want it to return an integer indicating the total number of pages in the ZIP file, or the names of the output pages?

Regarding saveAs, it is working fine at my end. Can you please share your code to reproduce the issue?

Best Regards,

a.user · February 27, 2015, 9:37am

The link of the PUT action for the WordsDocumentSaveAs documentation does not work. The following link is shown there:

Conversion Settings|Aspose Words Cloud Docs

Concerning Pages: I’d expect that when the zipped output is returned, the SplitPages array is still set in the same way as it was without the zipOutput parameter.

a.user · February 27, 2015, 11:33am

We have been testing the splitDocument .zip function on our QA environment.

The performance is really poor: splitting a 9.8 MB .docx document of only 47 pages took more than two minutes.

At that rate, converting to .pdf and splitting the .pdf into one image per page on our own server is even faster.

Please explain why the splitting takes that long. Otherwise this seemingly great feature is not usable for anyone.

muhammad.ijaz · March 2, 2015, 5:51am

Hi Richard,

The WordsDocument saveAs documentation does not work.
http://api.aspose.com/v1.1/Help

Do you mean that the Test API feature does not work when you try to test it using the online test API?

Best Regards,

a.user · March 2, 2015, 6:14am

No, just that the documentation link is broken.

The problem with the huge performance delay on splitting + downloading a .zip is more important to us at the moment.

muhammad.ijaz · March 3, 2015, 4:54am

Hi Richard,

Regarding the link issue, it will be resolved soon.

As far as the performance of the split resource is concerned, these were my observations:

- Tested a 30 MB file with images and few pages. It took less than 5 seconds.
- Tested a 10 MB file with 75 pages. It took around 2 minutes.

Based on these tests, a document with more text and more pages takes longer to split. Can you please also share your document? This will help us test it at our end once the issue is resolved.

Best Regards,

a.user · March 3, 2015, 10:17am

Hi Muhammad,

1) We can confirm the second scenario: splitting a file with many pages takes more than 2 minutes.

2) We would love to help you test the 30 MB files, but currently we have to pay for all calls we make. A demo account only allows 100 calls and 1 MB per file.

Can you please create a test account for us with a 30 MB (or higher) upload limit and a large number of document calls? We only discovered the poor performance last week, when we were already live.

muhammad.ijaz · March 4, 2015, 5:41am

Hi Richard,

The Pages property returning null instead of the pages array, and the performance problem, have been logged in our issue tracking system as SAASWORDS-206 and SAASWORDS-207 respectively. We will keep you updated on these issues in this thread.

As far as testing larger files is concerned, you can also test the files with Aspose.Words for .NET, because this is not an Aspose for Cloud specific issue; it will be resolved in Aspose.Words for .NET first.

Best Regards,

a.user · March 4, 2015, 6:17am

You are asking me to test with a totally different product? We do not have a .NET development environment, so that will not be possible. Please keep us up to date about releases for the Cloud products. Thanks for now!

muhammad.ijaz · March 5, 2015, 4:06am

Hi Richard,

In fact, the 30 MB limit is not available for all customers. It has been allocated to a limited number of customers with custom plans in order to maintain performance.

We will check the possibility of a test account for you after the issue is resolved. Sorry for the inconvenience.

Best Regards,

a.user · March 18, 2015, 2:59am

Hi Muhammad, any update on the performance issues? Does the wordcount include the textareas already? Thanks

muhammad.ijaz · March 18, 2015, 7:39am

Hi Richard,

The performance issue has high priority, but it may take some time because it will be fixed in the codebase first. We cannot share a concrete ETA, but it is expected to be fixed in the near future.

Does the wordcount include the textareas already?

Yes.

Best Regards,