Python Search and Replace Text in PDF

Hello, I tried a basic text replace but could not get it to work. The PDF was generated by Adobe Illustrator. I tried saving the file with various PDF standards and compatibility settings, without success.

The response: {'code': 200, 'matches': 0, 'status': 'OK'}

The code used (Python):

import os
import asposepdfcloud
from asposepdfcloud.apis.pdf_api import PdfApi

# Credentials are redacted
pdf_api_client = asposepdfcloud.api_client.ApiClient(
    app_key="xxxxxxxxxxxxxxxxx",
    app_sid="xxxxxxx-xxxx-xxxxx-xxxxx-xxxxxxx")

pdf_api = PdfApi(pdf_api_client)
filename = 'pagina2.pdf'     # local file
remote_name = 'pagina2.pdf'  # name in cloud storage

# Upload the local file to cloud storage
pdf_api.upload_file(remote_name, filename)

# Replace 'PDF' with 'David Aviles'; regex is passed as a boolean rather than the string 'true'
text_replace1 = asposepdfcloud.models.TextReplace(old_value='PDF', new_value='David Aviles', regex=True)
text_replace_list = asposepdfcloud.models.TextReplaceListRequest(text_replaces=[text_replace1])

response = pdf_api.post_document_text_replace(remote_name, text_replace_list)
print(response)

pagina2.pdf (46.1 KB)

@gapCloser

This appears to be expected behavior. In Aspose.PDF, we try to mimic Adobe Acrobat as closely as possible, and I am also unable to search and replace the text in this PDF using Adobe Acrobat.

P.S.: Please never share your credentials (Client Id and Secret) in a public post.

Is there any way to make the text detectable?

@gapCloser

I have logged a ticket PDFCLOUD-2605 for further investigation. We will share our findings with you soon.

@gapCloser

We have analyzed your shared document and found that its characters use a special encoding that is specific to this document. Simply put, the text consists of graphic elements, so we cannot read or work with the text in this document.

As a workaround, you may convert the PDF document into an image and then use OCR for text recognition.
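
For anyone who wants to try that workaround locally, here is a minimal sketch. It does not use the Aspose Cloud API. By default it assumes the third-party packages pdf2image (which needs Poppler) and pytesseract (which needs Tesseract) are installed. The function names ocr_pdf_text and pages_containing are made up for illustration, and the render and recognize parameters are only there so that other tools can be swapped in. Note that this only recovers the text for searching; it does not replace anything inside the PDF.

```python
from typing import Callable, Iterable, List, Optional


def ocr_pdf_text(
    pdf_path: str,
    render: Optional[Callable[[str], Iterable[object]]] = None,
    recognize: Optional[Callable[[object], str]] = None,
) -> List[str]:
    """Return the OCR text of each page of pdf_path, in page order."""
    if render is None:
        # Assumption: pdf2image is installed; it rasterizes each page to an image.
        from pdf2image import convert_from_path
        render = lambda path: convert_from_path(path, dpi=300)
    if recognize is None:
        # Assumption: pytesseract is installed; it reads text from an image.
        import pytesseract
        recognize = pytesseract.image_to_string
    return [recognize(page) for page in render(pdf_path)]


def pages_containing(page_texts: List[str], needle: str) -> List[int]:
    """Return the 1-based page numbers whose OCR text contains needle."""
    return [i for i, text in enumerate(page_texts, start=1) if needle in text]
```

With both packages installed, pages_containing(ocr_pdf_text('pagina2.pdf'), 'PDF') would list the pages where OCR found the word. OCR accuracy depends on the font and the rendering resolution, so results may vary.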