I’ve been trying to compare two .doc documents using the /words/online/put/compareDocument API endpoint. The API responds with a 200 OK status code, yet the response body contains content that looks garbled and that I’m unable to convert into a .doc document. The response Content-Type is ‘multipart/mixed; boundary="…"’. You can see the response body in the attached screenshot.
I’m using TypeScript to send the requests to the API (I write the requests myself, without the library, due to some conflicts it causes). Is that response correct? If so, how do I convert it into a .doc document? If not, why do I get it, and what does a request to that endpoint look like using only plain TypeScript? In short: how do I get a correct response file which I can work with?
Yes, the response is correct. Since the API response is multipart, you need to parse it to get the resultant document. For example, please check the sample Node.js code for comparing two Word documents from the local drive; it will give you an idea of how to resolve your issue.
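For illustration, here is a rough sketch of what such a request can look like without the SDK, assuming Node.js 18+ (global fetch, FormData and Blob). The base URL version segment, the form part names (document, compareData, comparingDocument) and the compareData fields are assumptions taken from the SDK's request model, so please verify them against the API reference:

import { readFile } from "node:fs/promises";

// Assumed base URL (API version segment) and form part names -- please verify them
// against the current Aspose.Words Cloud API reference before use.
const BASE_URL = "https://api.aspose.cloud/v4.0";

export async function compareDocuments(accessToken: string): Promise<Buffer> {
    const form = new FormData();
    form.append("document", new Blob([await readFile("testOrig.doc")]), "testOrig.doc");
    form.append("comparingDocument", new Blob([await readFile("testDest.doc")]), "testDest.doc");
    // CompareData fields mirror the SDK's model; treat them as assumptions.
    form.append("compareData", JSON.stringify({
        ComparingWithDocument: "testDest.doc",
        Author: "user",
    }));

    const response = await fetch(`${BASE_URL}/words/online/put/compareDocument`, {
        method: "PUT",
        headers: { Authorization: `Bearer ${accessToken}` },
        body: form,
    });
    if (!response.ok) {
        throw new Error(`Compare request failed: ${response.status} ${await response.text()}`);
    }

    // Keep the multipart/mixed body as raw bytes; decoding it as text would corrupt the .doc part.
    return Buffer.from(await response.arrayBuffer());
}

The returned Buffer is the raw multipart/mixed body, which still needs to be split into its parts to extract the document.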
Furthermore, please note that the Aspose.Words Cloud SDK for Node.js ships with .ts files, so you can use the SDK with TypeScript. However, if you are having issues using it, please share some details of those issues. We will look into them and help you use the SDK.
Thanks for your fast response @tilal.ahmad. Before opening this question I had already tried to parse this multipart response, unsuccessfully, and I am also unable to resolve my issue using the code in your response.
So how can I parse the response without using the asposewordscloud library (preferably without using any library at all)?
And on top of that: why does it say the following inside the response?
{
    …
    "FileName": "testDest.doc",
    "SourceFormat": "Docx",
    …
}
Why is the SourceFormat Docx if I’m downloading a .doc document? Is that supposed to be like that?
@tilal.ahmad
I know how to parse JSON, but the response I’m getting is not JSON. It’s multipart/mixed content in which the first part is JSON, though that part is not relevant to me. The second part is what matters, as it contains the file. But if I remove everything except the binary data (which I assume is the file) and save it as a .doc document, Word tells me on opening that the file is corrupted beyond repair. Why is that the case, and how do I fix it? How do I get a usable file?
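To be concrete, the response body I get is structured roughly like this (headers and values abbreviated):

--<boundary>
Content-Disposition: …; name="Model"; …

{ …, "FileName": "testDest.doc", "SourceFormat": "Docx", … }
--<boundary>
Content-Disposition: …; name="Document"; …

<binary .doc bytes>
--<boundary>--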
@tilal.ahmad
In the following you can see the current code. accessToken is the access token generated by the https://api.aspose.cloud/connect/token endpoint (and is definitely valid). this.file1.file and this.file2.file are of type File. The output variable stores my attempt at parsing the response. And as I said: the resulting file is a corrupted MS Word document.
Thanks for sharing the sample code. We have logged a ticket (WORDSCLOUD-2409) to investigate the problem and will keep you updated about the issue resolution progress in this forum thread.
We might have an idea of why the issue occurs: the response we get when sending the request without the library is encoded as UTF-8, whereas when sending the request with the library, the response is encoded as ANSI. We tried setting the loadEncoding parameter to ansi, but that yields an error response (500, message="‘ansi’ is not a supported encoding name. For information on defining a custom encoding, see the documentation for the Encoding.RegisterProvider method. (Parameter ‘name’)").
How do we change the encoding so that we get back ANSI, which we may be able to process?
Thanks for sharing your findings. We are already working on a solution that does not use the Aspose.Words Cloud SDKs but a plain REST client, and we will share the sample code with you shortly.
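Meanwhile, one point worth stressing: the response body must be kept as raw bytes. Reading it as a UTF-8 (or any other) string and then re-encoding it is enough to corrupt the binary document part, so changing loadEncoding is unlikely to help here. A minimal sketch, assuming the request is sent with Node.js 18+ fetch and an already-built multipart form:

// Hypothetical helper: sends an already-built multipart/form-data request and returns
// the multipart/mixed response body as raw bytes plus its Content-Type header.
async function sendCompareRequest(
    url: string,
    accessToken: string,
    form: FormData,
): Promise<{ rawBody: Buffer; contentType: string }> {
    const response = await fetch(url, {
        method: "PUT",
        headers: { Authorization: `Bearer ${accessToken}` },
        body: form,
    });
    // Read the body as raw bytes; response.text() would decode (and corrupt) the document part.
    const rawBody = Buffer.from(await response.arrayBuffer());
    const contentType = response.headers.get("content-type") ?? "";
    return { rawBody, contentType };
}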
The code below is extracted from the current SDK and shows in detail how the CompareDocumentOnline response data is parsed. Hopefully it will give you an idea of how to parse the response yourself.
/**
 * Create a CompareDocumentOnlineResponse from the raw multipart body and response headers.
 */
createResponse(_response: Buffer, _headers: http.IncomingHttpHeaders): any {
    const result = new CompareDocumentOnlineResponse();
    // Split the multipart/mixed body into its parts using the boundary from the headers.
    const boundary = getBoundary(_headers);
    const parts = parseMultipart(_response, boundary);
    // The "Model" part is JSON metadata about the resulting document.
    result.model = ObjectSerializer.deserialize(JSON.parse(findMultipartElement(parts, "Model").body.toString()), "DocumentResponse");
    // The "Document" part contains the compared document itself (possibly several files).
    const partDocument = findMultipartElement(parts, "Document");
    result.document = parseFilesCollection(partDocument.body, partDocument.headers);
    return result;
}
/**
 * Get boundary from IncomingHttpHeaders
 */
export function getBoundary(headers: http.IncomingHttpHeaders): string {
    return parseContentType(headers["content-type"]);
}

/**
 * Get boundary value from content-type header
 */
function parseContentType(contentType: string): string {
    // e.g. 'multipart/mixed; boundary="..."' -> the boundary value without quotes
    return contentType.split(" ")[1].split("=")[1].slice(1, -1);
}
/**
 * Parse multipart response body for the given boundary
 */
export function parseMultipart(body: Buffer, boundary: string): any {
    const allParts = [];
    let partHeaders = [];
    let buffer = [];
    // State machine: looking for the first boundary, reading part headers,
    // reading part content, or just past a part's closing boundary.
    const UNKNOWN = 0;
    const PART_HEADERS = 1;
    const CONTENT = 4;
    const PART_END = 5;
    let state = UNKNOWN;
    let lastline = '';
    for (let i = 0; i < body.length; i++) {
        const oneByte = body[i];
        const prevByte = i > 0 ? body[i - 1] : null;
        const newLineDetected = (oneByte === 0x0a) && (prevByte === 0x0d);
        const newLineChar = (oneByte === 0x0a) || (oneByte === 0x0d);
        if (!newLineChar) {
            lastline += String.fromCharCode(oneByte);
        }
        if ((UNKNOWN === state) && newLineDetected) {
            // Skip everything up to the first boundary line.
            if (("--" + boundary) === lastline) {
                state = PART_HEADERS;
                lastline = '';
            }
        } else if ((PART_HEADERS === state) && newLineDetected) {
            // Collect header lines until the empty line that separates headers from content.
            if (lastline !== '') {
                partHeaders.push(lastline);
            } else {
                state = CONTENT;
            }
            lastline = '';
        } else if (CONTENT === state) {
            if (lastline.length > (boundary.length + 4)) {
                lastline = '';
            }
            if (("--" + boundary) === lastline) {
                // Boundary reached: everything collected so far, minus the trailing CRLF
                // and the partial boundary characters, is the part body.
                const part = {
                    headers: partHeaders.reduce((headers, header) => {
                        if (header.indexOf(':') !== -1) {
                            const [key, value] = header.split(/:\s+/);
                            headers[key.toLowerCase()] = value;
                        }
                        return headers;
                    }, {}),
                    body: Buffer.from(buffer.slice(0, buffer.length - lastline.length - 1))
                };
                allParts.push(part);
                buffer = [];
                lastline = '';
                state = PART_END;
                partHeaders = [];
            } else {
                buffer.push(oneByte);
            }
            if (newLineDetected) {
                lastline = '';
            }
        } else if (PART_END === state) {
            if (newLineDetected) {
                state = PART_HEADERS;
            }
        }
    }
    return allParts;
}
/**
 * Get multipart part by name
 */
export function findMultipartElement(parts: any[], name: string): any {
    for (const part of parts) {
        const disp = part.headers['content-disposition'];
        const subs = disp.split(';');
        let subn = null;
        subs.forEach(element => {
            if (element.trim().startsWith("name=")) {
                subn = element.trim().substr(5).replace(new RegExp('"', 'g'), '');
            }
        });
        if (subn === name) {
            return part;
        }
    }
    return null;
}
/**
 * Get files collection from Response
 */
export function parseFilesCollection(response: Buffer, headers: http.IncomingHttpHeaders): Map<string, Buffer> {
    const result = new Map<string, Buffer>();
    if (headers["content-type"]?.startsWith("multipart/mixed")) {
        const boundary = getBoundary(headers);
        const parts = parseMultipart(response, boundary);
        for (const part of parts) {
            const disp = part.headers['content-disposition'];
            const subs = disp.split(';');
            let filename = null;
            subs.forEach(element => {
                if (element.trim().startsWith("filename=")) {
                    filename = element.trim().substr(9).replace(new RegExp('"', 'g'), '');
                }
            });
            result.set(filename, part.body);
        }
    } else {
        result.set("", response);
    }
    return result;
}
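Putting it together outside the SDK might look roughly like this (a sketch, not official sample code; it assumes the raw response body and its Content-Type header have already been obtained, for example as in the earlier sketches in this thread):

import { writeFile } from "node:fs/promises";

// Hypothetical glue code: "rawBody" is the multipart/mixed response kept as a Buffer,
// "contentType" is the value of the response's Content-Type header.
export async function saveComparisonResult(rawBody: Buffer, contentType: string): Promise<void> {
    const boundary = getBoundary({ "content-type": contentType });
    const parts = parseMultipart(rawBody, boundary);

    // The "Model" part carries the JSON metadata, the "Document" part the resulting file(s).
    const documentPart = findMultipartElement(parts, "Document");
    const files = parseFilesCollection(documentPart.body, documentPart.headers);

    for (const [name, data] of files) {
        // Write each returned file; fall back to a fixed name if none is given.
        await writeFile(name || "compareResult.doc", data);
    }
}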