Free Support Forum - aspose.cloud

Issues converting PDF to HTML


#1

I need to generate HTML from a PDF document, and I’ve used the API method “PutPdfInRequestToHtml” to do so.

I tried both with and without fixedLayout; however, I have issues with both outputs.

  1. With fixedLayout - It positions text with the “left” and “top” styles, which causes 2 major issues:
    i. It is not responsive, and
    ii. Within a paragraph, each line ends at exactly the same character as in the PDF instead of the text continuing on the same line on a wider screen, maybe because each line is treated as its own paragraph tag.

  2. Without fixedLayout - The resulting HTML is completely misaligned.

Thanks,
Hemant


#2

@HemantJain

Thanks for your inquiry. We would appreciate it if you could share your sample input PDF document along with the problematic output HTML and the expected HTML document. It will help us investigate your issue and guide you accordingly.

P.S.: You can attach the documents as a zip file.


#3

Hi,

Please find the files attached.
FixedLayout True
Test.pdf is the input PDF document and result.zip is the output HTML.
Test.pdf (127.1 KB)
result.zip (116.9 KB)

FixedLayout False
Test-en.pdf is the input PDF document and resultFixedFalse.zip is the output HTML.
resultFixedFalse.zip (115.9 KB)
Test-en.pdf (142.0 KB)

Thanks,
Hemant


#4

@HemantJain

Thanks for sharing your input and output documents here. We are looking into these and will guide you accordingly. Meanwhile, please share your expected HTML document as well; it will help us understand your requirements exactly.


#5

Hi,

Attached is roughly the expected HTML document. The expectation is that paragraphs flow continuously instead of breaking at each line, and that the output is responsive.

Test-en-ExpectedOutput.zip (9.3 KB)

Thanks,
Hemant


#6

@HemantJain

Thanks for sharing the expected HTML document. I am investigating the behavior of the PutPdfInRequestToHtml API method and will update you soon.

Meanwhile, you may try the Aspose.Words Cloud ConvertDocument API. Testen_AW.zip (17.2 KB)

curl -X PUT "https://api.aspose.cloud/v4.0/words/convert?format=html&outPath=Temp/Testen_AW.html" \
  -H "accept: application/xml" \
  -H "authorization: Bearer [Access_Token]" \
  -H "Content-Type: multipart/form-data" \
  -F "File=@C:/Temp/Test-en.pdf"
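
For anyone scripting this call instead of running curl by hand, here is a minimal Python sketch that assembles the same URL and headers. The endpoint, query parameters, and header names are copied from the curl command in this post; the function name `build_convert_request` is hypothetical, and the sketch only builds the request rather than sending it.

```python
from urllib.parse import urlencode

# Endpoint copied from the curl command in this post.
BASE_URL = "https://api.aspose.cloud/v4.0/words/convert"


def build_convert_request(access_token, out_path, fmt="html"):
    """Return (url, headers) matching the curl command in this post.

    Hypothetical helper for illustration only; it does not contact the API.
    """
    # safe="/" keeps the slash in outPath readable, as in the curl URL.
    query = urlencode({"format": fmt, "outPath": out_path}, safe="/")
    url = f"{BASE_URL}?{query}"
    headers = {
        "accept": "application/xml",
        "authorization": f"Bearer {access_token}",
    }
    return url, headers


# "[Access_Token]" is the same placeholder used in the curl command.
url, headers = build_convert_request("[Access_Token]", "Temp/Testen_AW.html")
print(url)
```

The returned pair can then be passed to whichever HTTP client you prefer. With the third-party requests library, for example, something like `requests.put(url, headers=headers, files={"File": open("Test-en.pdf", "rb")})` should correspond to the `-F` upload, although that is an assumption and has not been run against the service.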

#7

Thanks, Tilal. I’ve tried what you suggested and it is working well for me now. I really appreciate your help at the right time.

Thanks once again.


#8

Hi Tilal,

I’m sure this Aspose.Words ConvertDocument call was working for PDF files yesterday, but today it is giving an error.

Please see the attached screenshot.

image.png (79.8 KB)

Can you please help me?


#9

@HemantJain

Thanks for your inquiry. I’m afraid I’m unable to replicate the error. It seems you are using an old SDK version, 19.2. Please use the latest SDK version; hopefully it will resolve the issue.