[ASP] XMLHTTP object reads external pages in unicode?

They have: 17 posts

Joined: Dec 2004

Hi guys,

I have an ASP script that visits a given URL and saves the output as an HTML file. This is the readUrl function:

Function readUrl(url)
    Dim xml
    Set xml = Server.CreateObject("Msxml2.XMLHTTP.4.0")
    xml.Open "GET", url, False
    xml.Send
    readUrl = xml.responseText
    Set xml = Nothing
End Function

I cannot write the result of this function to a text file unless I do a format check using another function:

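' Guesses whether vTxt is held as a two-byte (Unicode) string by comparing
' the character code of the 2nd character with the raw 2nd and 3rd bytes.
Function getFileFormat(ByRef vTxt)
    Const UNICODE = -1, ASCII = 0
    Dim a2, w2, w3
    a2 = Asc(Mid(vTxt, 2, 1))    ' code of the 2nd character
    w3 = AscB(MidB(vTxt, 3, 1))  ' 3rd byte
    w2 = AscB(MidB(vTxt, 2, 1))  ' 2nd byte
    getFileFormat = ASCII
    If a2 <> w2 And a2 = w3 Then getFileFormat = UNICODE
End Function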

So I can create the text file using:

content = readUrl(url)
Set newFile = fso.CreateTextFile(file,true,getFileFormat(content))
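As a rough illustration (in Python rather than VBScript, and not part of the original script), the format check above can be mimicked byte by byte. The sketch assumes the string is held in memory as UTF-16LE, which is how VBScript stores its strings, and it approximates Asc with ord, which only matches for ASCII-range characters. The function name and sample strings are made up.

```python
UNICODE, ASCII = -1, 0

def get_file_format(text: str) -> int:
    """Mimic of the getFileFormat check, assuming UTF-16LE storage in memory."""
    raw = text.encode("utf-16-le")  # assumed in-memory layout of the string
    a2 = ord(text[1])               # stands in for Asc(Mid(vTxt, 2, 1))
    w3 = raw[2]                     # stands in for AscB(MidB(vTxt, 3, 1))
    w2 = raw[1]                     # stands in for AscB(MidB(vTxt, 2, 1))
    if a2 != w2 and a2 == w3:
        return UNICODE
    return ASCII

# Hypothetical samples: plain ASCII-range text still reports UNICODE,
# because the 2nd byte is the zero high byte of the 1st character.
for sample in ("<html>", "Hello world"):
    print(sample, get_file_format(sample))
```

Under these assumptions the check reports UNICODE for any text whose first two characters are in the ASCII range, whatever encoding the page was served in.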

The file created is always in Unicode instead of ASCII. If I force it to write ASCII, I get an error because the content is in Unicode. This makes the HTML files double in size and weird-looking in some applications (one space after every character).
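The doubling and the "space after every character" effect can be reproduced with a small Python sketch. It is only an illustration, not the ASP code: the sample markup is made up, and it assumes the Unicode file is written as UTF-16 little-endian, two bytes per character.

```python
# Hypothetical sample standing in for a fetched page.
html = "<html><body>Hello</body></html>"

ascii_bytes = html.encode("ascii")      # one byte per character
utf16_bytes = html.encode("utf-16-le")  # two bytes per character, no BOM

size_ratio = len(utf16_bytes) / len(ascii_bytes)

# For ASCII-range characters every second byte is a zero byte, which
# some applications display as a blank after each character.
high_bytes = set(utf16_bytes[1::2])

print(len(ascii_bytes), len(utf16_bytes), size_ratio)
print(utf16_bytes[:8])
```

Decoding the UTF-16 bytes gives back the same text, so nothing is lost; the file is just stored with a wider encoding.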

Has anybody used XMLHTTP or a similar thing? Do you know how to avoid this problem?

Thanks

He has: 370 posts

Joined: Dec 2004

Sorry, I haven't used that particular object. In PHP you can use the file() or fopen() functions to get the HTML of any page. You'll want to read the documentation on these functions:
http://www.php.net/manual/en/function.file.php and http://www.php.net/manual/en/function.fopen.php
or, for PHP >= 4.3.0, http://www.php.net/manual/en/function.file-get-contents.php

There may also be a comparable function in ASP that you just haven't found. Hope this is at least some help.

They have: 17 posts

Joined: Dec 2004

I may go with PHP if I cannot find a solution. I assume PHP works under Windows 2003 with no problem. I just need to rewrite the functions and the main iteration part. Thanks for the links, they look really useful.

As for a similar ASP function: no, we don't have one. You have to write your own function in ASP using a Microsoft or third-party object. Even the latest ASP.NET depends on objects instead of built-in functions. That is just its nature.

They have: 17 posts

Joined: Dec 2004

I found this message in another forum:

...
XMLHTTP returns the results as a unicode string.

To solve this, I used Server.CreateObject("WinHttp.WinHttpRequest.5.1") instead. All properties and methods are the same.
...
