ByteArrayInputStream in = new ByteArrayInputStream( response.contentString().getBytes( "UTF-8" ) );
ByteArrayOutputStream byteStream = new ByteArrayOutputStream();
tidy().parseDOM( in, byteStream );
response.setContent( new NSData( byteStream.toByteArray() ) );
Pretty neat little cleanup trick ... Incidentally, if you use Project Wonder, switching to:
ByteArrayInputStream in = response.content().stream();
ERXRefByteArrayOutputStream out = new ERXRefByteArrayOutputStream();
tidy().parseDOM(in, out);
response.setContent(out.toNSData());
should give you better memory use and performance ... the comparison, starting with the original version, is:
1) response.contentString() (copy if NSData-backed)
2) .getBytes() = copy
3) new ByteArrayInputStream(byte[]) = no copy (it just wraps the array you hand it)
4) writing to ByteArrayOutputStream = copy
5) byteStream.toByteArray() = copy
6) new NSData(byte[]) = copy
the Wonder variant is:
1) response.content() (copy if String-backed)
2) writing to ByteArrayOutputStream = copy
There might be one more copy in the Wonder variation (I didn't spend too much time digging into .content()), but that still cuts the number of copies at least in half. If you don't use Wonder, you can just steal ERXRefByteArrayOutputStream -- it's basically BAOS, but modified to give direct access to the byte buffer. BAOS normally only hands back a copy, which sort of sucks most of the time.
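For anyone who wants to see the idea without pulling in Wonder, here is a rough sketch of a BAOS subclass that exposes its buffer. This is NOT the ERXRefByteArrayOutputStream source; the class name DirectByteArrayOutputStream and the methods buffer() and toInputStream() are made up for illustration. It only relies on the protected buf and count fields that java.io.ByteArrayOutputStream provides to subclasses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

// Hypothetical stand-in for illustration, not the actual Wonder class.
class DirectByteArrayOutputStream extends ByteArrayOutputStream {

    // Returns the internal buffer itself rather than a copy.
    // Only the first size() bytes are valid; the rest is spare capacity.
    public synchronized byte[] buffer() {
        return buf;
    }

    // Wraps the valid part of the internal buffer without copying it.
    public synchronized ByteArrayInputStream toInputStream() {
        return new ByteArrayInputStream(buf, 0, count);
    }

    public static void main(String[] args) throws Exception {
        DirectByteArrayOutputStream out = new DirectByteArrayOutputStream();
        out.write("<p>hello</p>".getBytes("UTF-8"));

        // toByteArray() allocates a fresh array on every call ...
        System.out.println(out.toByteArray() != out.toByteArray()); // prints true
        // ... while buffer() keeps handing back the same backing array.
        System.out.println(out.buffer() == out.buffer());           // prints true
        // The wrapping stream sees exactly the bytes written so far.
        System.out.println(out.toInputStream().available() == out.size()); // prints true
    }
}
```

The catch is that the returned array can be longer than the data, so callers have to respect size(), and any later write may swap in a bigger buffer, which leaves earlier references pointing at stale bytes. That is presumably why plain BAOS plays it safe and copies.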