PDF to HTML Conversion with SVG image in C#

Hi Tilal Ahmad,
Thanks for your help. It is working for me now, but I am facing a new issue. When I convert the PDF to HTML with Aspose's online demo, the output is plain HTML with an HTML table. When I convert it through code, however, it generates HTML with SVG, which is of no use to me because I want to scrape the information from the PDF file. I have attached both files; please have a look.
exempt.zip (77.6 KB)

WithOnlinDemo.zip (21.9 KB)

@adnanofpk

You need to set the partsEmbeddingMode and rasterImagesSavingMode properties as follows. This should help you get the required result.

// Instantiate the PdfApi
PdfApi pdfApi = new PdfApi(ClientSecret, ClientId);
string name = "exempt.pdf";
string TestDataFolder = "C:\\Users\\hp 840 g3\\Downloads";
string resultName = "exempt.zip";

// Upload the PDF file to cloud storage
using (var file = System.IO.File.OpenRead(Path.Combine(TestDataFolder, name)))
{
    // Await the upload so the stream is not disposed before the upload finishes
    var uploadResponse = await pdfApi.UploadFileAsync(name, file);
    Console.WriteLine("uploaded...");
}
// Convert the PDF in cloud storage to HTML
var response = await pdfApi.PutPdfInStorageToHtmlAsync(name: name, outPath: resultName, partsEmbeddingMode: "EmbedAllIntoHtml", rasterImagesSavingMode: "AsEmbeddedPartsOfPngPageBackground");
Console.WriteLine(response);

It is better than before, but not exactly the same as what I received from the Aspose demo URL. The demo uses an HTML table with no absolute positioning, whereas the code generates divs with absolute positions, and the order of the divs does not match the order of the text in the PDF.
exempt.zip (70.7 KB)

@adnanofpk

Please share the Aspose app URL. It will help us to investigate and address your issue.

Here is the demo url.

@adnanofpk

Thanks, we are looking into it and will share our findings with you soon.

Is there any update regarding the issue? Also, can you let me know whether there is any limit on file size or on the number of pages that can be converted with a free account?

@adnanofpk

We have logged a ticket (PDFCLOUD-3677) to check the output difference between the Aspose.PDF app and the Aspose.PDF Cloud API. We will share an update with you as soon as the investigation is completed.

No, we don’t have any limitations on file size for the free account. However, please note that each container (node) has limited memory. Memory consumption depends on the file structure: a 10 MB file can sometimes take 1 GB of memory while being processed, and a 100 MB file can take anywhere from 100 MB to a much larger amount. You can try the conversion with your big files; if you face any issues, you can share your input document with us via a free file-sharing service for investigation.

Can you give me an estimate of how many days or hours it will take to resolve the issue, so I can plan my task accordingly?

@adnanofpk

We have planned the ticket investigation for the coming week. Hopefully, we will share the solution or an ETA next week.

Hi Tilal,
I have not heard any update from you. If I move from the free to a paid membership, will it reduce the time needed to fix this issue?

@adnanofpk

We have made an initial investigation and noticed that the Cloud API needs some changes to produce the same results as the Aspose.PDF app. We will keep you updated about the issue resolution progress in this forum thread.

Is there any ETA for its completion? How long will I have to wait? My application is complete now and is just waiting on this issue.

@adnanofpk

I have asked my colleague, working on the fix, to share an ETA. I will notify you as soon as I get an update.

@adnanofpk

In reference to the above reported issue, we want to inform you that, unfortunately, the fix needs some major revamping in Aspose.PDF Cloud. Meanwhile, you can use Aspose.Words Cloud for the PDF to HTML conversion. It will serve the purpose. Please let us know if you need any assistance in this regard.

Hi Tilal,
Can I use the Aspose desktop app with the same billing model, i.e., usage-based instead of a fixed amount per year?

@adnanofpk

No, as far as I know, you cannot use usage-based billing with aspose.app. However, I will double-check and let you know.

Meanwhile, please check the Aspose.Words Cloud sample code and output for your reference.
exempt.zip (26.0 KB)

Aspose.Words.Cloud.Sdk.Configuration config = new Aspose.Words.Cloud.Sdk.Configuration();
config.ClientId = ClientId;
config.ClientSecret = ClientSecret;

WordsApi wordsApi = new WordsApi(config);

string localFile = @"C:\Temp\exempt.pdf";

// Open the source PDF; the using blocks dispose each stream even if a call throws
using (var requestDocument = System.IO.File.OpenRead(localFile))
{
    // Convert the PDF to HTML
    var convertRequest = new ConvertDocumentRequest(requestDocument, "html");
    using (var actual = await wordsApi.ConvertDocument(convertRequest))
    using (var fileStream = System.IO.File.Create(@"C:\Temp\exempt.html"))
    {
        // Write the converted HTML to disk
        actual.CopyTo(fileStream);
    }
}
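
Since the goal is to scrape the data, here is a minimal sketch of reading the table cells back out of the saved HTML. It assumes the output contains plain, non-nested <table> markup; the HtmlTableScraper class and its ExtractRows method are hypothetical names used for illustration and are not part of any Aspose SDK. It uses regular expressions from the .NET base library only, so a dedicated HTML parser would likely be more robust for irregular markup.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of any Aspose SDK): pulls cell text out of
// simple, non-nested HTML tables.
public static class HtmlTableScraper
{
    // Returns one list of cell texts per <tr> element found in the HTML.
    public static List<List<string>> ExtractRows(string html)
    {
        var rows = new List<List<string>>();
        var options = RegexOptions.IgnoreCase | RegexOptions.Singleline;

        foreach (Match row in Regex.Matches(html, @"<tr\b[^>]*>(.*?)</tr>", options))
        {
            var cells = new List<string>();
            foreach (Match cell in Regex.Matches(row.Groups[1].Value, @"<t[dh]\b[^>]*>(.*?)</t[dh]>", options))
            {
                // Replace nested tags with spaces, decode entities, collapse whitespace.
                string text = Regex.Replace(cell.Groups[1].Value, "<[^>]+>", " ");
                text = WebUtility.HtmlDecode(text);
                text = Regex.Replace(text, @"\s+", " ").Trim();
                cells.Add(text);
            }

            // Skip rows that contain no <td> or <th> cells.
            if (cells.Count > 0)
            {
                rows.Add(cells);
            }
        }

        return rows;
    }
}
```

To use it, read the saved file with System.IO.File.ReadAllText(@"C:\Temp\exempt.html"), pass the string to HtmlTableScraper.ExtractRows, and loop over the returned rows. Because nested tags are replaced with spaces, text that the converter splits across inline tags in the middle of a word will come back with an extra space.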

That worked for me. Thanks a lot for your help.


@adnanofpk

It is good to know that you managed to resolve the issue using the Aspose.Words Cloud API. However, we will also notify you when the original issue is resolved in the Aspose.PDF Cloud API.

sure, thank you
