Free Support Forum - aspose.cloud

Extracting paragraphs from pdf issues

When we extract a paragraph from a PDF, the extracted text is incomplete.

For example, if the paragraph contains the text:

"SaaSpose is a new cloud-based document generation, conversion and automation platform for developers. SaaSpose’s APIs gives developers on all platforms total control over documents and file formats. It interoperates seamlessly with other cloud services."

The result after extraction is:

"SaaSpose is a new cloud-based document generation, conversion and automation platform forSaaSpose is a new cloud-based document generation, conversion and automation platform for"

It looks like the extraction stops at the first break tag it encounters.

Can you please notify us when this is fixed? Thank you!

Hi,

Thank you for using our product.

Please share your sample code and template file with us so that we can reproduce the issue. We will check it and get back to you soon.

Sorry for the inconvenience,

Hi,

Thank you for sharing the template file via email.

I tested your issue with the latest version of Aspose.Pdf for .NET, v7.4, and was unable to reproduce the problem: the complete PDF text is extracted using the TextAbsorber. Are you using an older version of Aspose.Pdf? If so, please download and try the latest version. If you still face the issue, please share your system environment details with us, i.e. OS, .NET Framework version, and whether the application is x64 or x86. This will help us replicate the issue on our end.

Sorry for the inconvenience,

Moved to: https://forum.aspose.com/t/extracting-paragraphs-from-pdf-issues/91813