Hello,
I am getting the following error: “Message: java.net.SocketTimeoutException: timeout
HTTP response code: 0
HTTP response body: null
HTTP response headers: null”
It happens only when I try to process many documents (more than 50). When I process 2-3 documents, they are processed successfully.
If I re-process a document that returned a timeout, but run it on its own, it succeeds.
The size of the document does not seem to have any impact: most of the PDF files contain 3-10 pages (as images).
These are the settings I use:
private OCRSettingsRecognizePdf getSettingsRecognizePdf() {
    OCRSettingsRecognizePdf settings = new OCRSettingsRecognizePdf();
    settings.setDsrConfidence(DsrConfidence.HIGH);
    settings.setDsrMode(DsrMode.TEXT_DETECTOR);
    settings.setMakeBinarization(false);
    settings.setMakeSkewCorrect(true);
    settings.setMakeContrastCorrection(false);
    settings.setMakeUpsampling(false);
    settings.setResultType(ResultType.TEXT);
    return settings;
}
I use it in the code like this:
RecognizePdfApi api = this.getRecognizePdfApi();
OCRSettingsRecognizePdf settings = this.getSettingsRecognizePdf();
OCRRecognizePdfBody requestBody = new OCRRecognizePdfBody();
requestBody.setImage(fileContent);
requestBody.setSettings(settings);
try {
    // Send PDF to ASPOSE OCR to start processing
    String taskId = api.postRecognizePdf(requestBody);
    ...
This code is run by a Java batch job.
Another batch job checks the status every 5 minutes using the given taskId:
OCRResponse apiResponse = api.getRecognizePdf(taskId);
–
You can check the server logs for 17/12/2024; you will find more than 100 timeouts.
Do you perhaps have concurrency limits for the OCR process?
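If such a limit exists, one possible client-side mitigation would be to cap the number of simultaneous submissions and retry only on timeouts. The sketch below is an illustration, not something taken from the Aspose SDK: `ThrottledSubmitter`, `maxConcurrent`, `maxAttempts` and `initialBackoffMillis` are hypothetical names, and it assumes (unverified) that the exception thrown by the SDK carries the `SocketTimeoutException` in its cause chain.

```java
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

// Hypothetical helper: limits concurrent calls and retries on socket timeouts.
public class ThrottledSubmitter {
    private final Semaphore permits;
    private final int maxAttempts;
    private final long initialBackoffMillis;

    public ThrottledSubmitter(int maxConcurrent, int maxAttempts, long initialBackoffMillis) {
        this.permits = new Semaphore(maxConcurrent);
        this.maxAttempts = maxAttempts;
        this.initialBackoffMillis = initialBackoffMillis;
    }

    // Runs the call while holding a permit; retries with doubling backoff on timeouts.
    public <T> T submit(Callable<T> call) throws Exception {
        permits.acquire();
        try {
            long backoff = initialBackoffMillis;
            for (int attempt = 1; ; attempt++) {
                try {
                    return call.call();
                } catch (Exception e) {
                    // Rethrow anything that is not a timeout, or when attempts are exhausted.
                    if (!isTimeout(e) || attempt >= maxAttempts) {
                        throw e;
                    }
                    Thread.sleep(backoff);
                    backoff *= 2;
                }
            }
        } finally {
            permits.release();
        }
    }

    // Assumption: the timeout is reachable through the exception's cause chain.
    static boolean isTimeout(Throwable e) {
        for (Throwable t = e; t != null; t = t.getCause()) {
            if (t instanceof SocketTimeoutException) {
                return true;
            }
        }
        return false;
    }
}
```

With the code above, the submission could then look like `String taskId = submitter.submit(() -> api.postRecognizePdf(requestBody));`, where `submitter` is a shared `ThrottledSubmitter` instance. Suitable values for the limits would depend on whatever the service actually allows.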
Another question concerns the response time of the request “api.getRecognizePdf(taskId);”. It takes 5-10 seconds to receive a response even if the status is PENDING. Is this normal?
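To report exact figures for the status call, the latency can be measured on the client side. This is only a minimal sketch: `CallTimer` and `Timed` are hypothetical helper names, not part of the Aspose SDK, and the measurement includes network time as well as server processing time.

```java
import java.util.concurrent.Callable;

// Hypothetical helper: measures how long a single call takes.
public class CallTimer {

    // Holds the call result together with the elapsed wall-clock time.
    public static final class Timed<T> {
        public final T value;
        public final long elapsedMillis;

        Timed(T value, long elapsedMillis) {
            this.value = value;
            this.elapsedMillis = elapsedMillis;
        }
    }

    // Executes the call and records its duration using the monotonic clock.
    public static <T> Timed<T> time(Callable<T> call) throws Exception {
        long start = System.nanoTime();
        T value = call.call();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new Timed<>(value, elapsedMillis);
    }
}
```

For example, `CallTimer.Timed<OCRResponse> timed = CallTimer.time(() -> api.getRecognizePdf(taskId));` would give the response in `timed.value` and the duration in `timed.elapsedMillis`, which can then be logged per taskId.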