iconbill Valued Community Member
Joined: 07 March 2007 Location: Netherlands
Online Status: Offline Posts: 76
Posted: 01 April 2008 at 12:08am
We are trying to read a Greek message in the following scenarios (ASP/VBScript).
First we send a message from Google with the subject 'Αίτηση Εγγραφής' (Greek text, we are told).
Scenario 1:
- ASP-File is saved as Unicode (UTF-8)
- no codepage in header: <%@LANGUAGE="VBSCRIPT"%>
- response.codepage = 65001
- response.charset = "utf-8"
- <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
- using: Set objPOP3 = Server.CreateObject("MailBee.POP3")
- objPOP3.CodepageMode = 1
- objPOP3.Codepage = 65001
- Browser (FF2) encoding is Unicode (UTF-8)
Reading and displaying the message gives a wrong subject, and reading it and inserting it into the database doesn't work either. If we manually insert the correct subject on that page, the insert works fine.
Scenario 2:
- ASP-File is saved as Greek (ISO)
- codepage 1252 in header: <%@LANGUAGE="VBSCRIPT" CODEPAGE="1252"%>
- no response.codepage
- no response.charset
- <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-7" />
- using: Set objPOP3 = Server.CreateObject("MailBee.POP3")
- objPOP3.CodepageMode = 1
- objPOP3.Codepage = 65001
- Browser (FF2) encoding is Greek (iso-8859-7)
This will display the subjects correctly but won’t let us insert them into our database.
We think it should be scenario 1, because we also want to support other character sets (Chinese, Japanese, Russian, etc.), but setting everything to Unicode doesn't work. We've read a ton of documentation about this but keep going around in circles. Could you please tell us how to read this kind of message subject?
Kind regards,
Marco
Summary of the file retrieved from our POP server:
-----------------------------------------------------
Subject: =?ISO-8859-7?B?RndkOiDB3/Tn8+cgxePj8eH23vI=?=
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary="----=_Part_4362_32362342.1206964970490"
------=_Part_4362_32362342.1206964970490
Content-Type: text/plain; charset=ISO-8859-7
Content-Transfer-Encoding: base64
Content-Disposition: inline
wd/05/PnIMXj4/Hh9t7yCg==
------=_Part_4362_32362342.1206964970490
Content-Type: text/html; charset=ISO-8859-7
Content-Transfer-Encoding: base64
Content-Disposition: inline
PHNwYW4gY2xhc3M9ImdtYWlsX3F1b3RlIj48L3NwYW4+PGRpdj7B3/Tn8+cgxePj8eH23vIgPC9k
aXY+Cg==
------=_Part_4362_32362342.1206964970490--
--------------------------------------------------------
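For readers who want to check what the message above actually contains, here is a minimal sketch in Python (not ASP, and not the MailBee API; it only uses Python's standard library to inspect the bytes). It decodes the Subject header and the text/plain part from the dump, both of which are base64-encoded ISO-8859-7. The variable names are illustrative.

```python
import base64
from email.header import decode_header, make_header

# Subject header and text/plain body copied from the message dump above.
raw_subject = "=?ISO-8859-7?B?RndkOiDB3/Tn8+cgxePj8eH23vI=?="
raw_body = "wd/05/PnIMXj4/Hh9t7yCg=="

# decode_header splits the header into (bytes, charset) parts;
# make_header joins those parts back into a Unicode string.
parts = decode_header(raw_subject)
subject = str(make_header(parts))

# The body part declares charset=ISO-8859-7 and base64 transfer encoding.
body = base64.b64decode(raw_body).decode("iso-8859-7")

# Once the text is Unicode it can be re-encoded as UTF-8 for a UTF-8 page.
subject_utf8 = subject.encode("utf-8")

print(subject)            # Fwd: Αίτηση Εγγραφής
print(parts[0][1])        # iso-8859-7
print(repr(body))         # 'Αίτηση Εγγραφής\n'
print(len(subject_utf8))  # 34
```

The message itself is therefore intact: the subject is "Fwd: " followed by the Greek text, so any garbling happens after the header has been decoded, when the text is converted or displayed.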

Andrew AfterLogic Support
Joined: 28 April 2006 Location: United States
Online Status: Offline Posts: 1189
Posted: 01 April 2008 at 6:00am
We've implemented the following sample which displays (in UTF-8) the subject you provided correctly:
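Code:
<%
' Declare UTF-8 in the Content-Type header sent to the browser
Response.Charset="utf-8"
Set msg = Server.CreateObject("mailbee.message")
' 65001 is the Windows code page identifier for UTF-8
msg.Codepage = 65001
' Load the saved message from disk
msg.ImportFromFile "D:\projects\test\test.eml"
Response.Write msg.Subject
%>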
Also, the following articles should be useful for you:
Chinese or russian characters from .asp form via Stored procedure into MS SQL 2005
ASP - Locales and Codepages
Setting the Code Page for String Conversions
Best regards,
Andrew

iconbill Valued Community Member
Joined: 07 March 2007 Location: Netherlands
Online Status: Offline Posts: 76
Posted: 01 April 2008 at 7:05am
Thank you Andrew.
We found the answer! When I look in the SQL Server manager, all my inserted Chinese and Greek message text is a mess... BUT when I read these records with ASP in UTF-8, everything is just fine! I drew the wrong conclusion: we thought all the inserted data was wrong in SQL Server, but it was not!
In case you know how to display the Chinese and Greek text in the SQL manager, please let us know.
Thank you

Andrew AfterLogic Support
Joined: 28 April 2006 Location: United States
Online Status: Offline Posts: 1189
Posted: 02 April 2008 at 2:23am
If you store UTF-8 text in a varchar field of the database, it will be unreadable because the SQL manager doesn't decode UTF-8. However, if you use the nvarchar type (Unicode, two bytes per char), it should be readable.
Best regards,
Andrew

iconbill Valued Community Member
Joined: 07 March 2007 Location: Netherlands
Online Status: Offline Posts: 76
Posted: 02 April 2008 at 3:53am
We were/are storing the data in a nvarchar field (including the use of the N prefix when inserting). I think it has something to do with the collation (we're using SQL_Latin1_General_CP1_CI_AS), but we can't (won't) change that.

Andrew AfterLogic Support
Joined: 28 April 2006 Location: United States
Online Status: Offline Posts: 1189
Posted: 02 April 2008 at 4:07am
If you store data in UTF-8, it doesn't matter whether it's stored in a varchar or an nvarchar field, because text encoded in UTF-8 can be stored in any text field; it doesn't require two bytes per char. UTF-8 uses a variable number of bytes per character.
UTF-8 is not the same as the two-byte Unicode encoding that nvarchar uses.
Best regards,
Andrew
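The difference between UTF-8 bytes and two-byte Unicode text can be sketched in a few lines of Python (used here only to show the bytes; it is not ASP or SQL Server code). The sketch rests on an assumption the posters do not confirm: that on a page running with code page 1252, each UTF-8 byte ends up stored as its own character. That would explain why the rows look garbled in the SQL manager yet read back correctly on a UTF-8 page.

```python
text = "Αίτηση"  # first word of the subject discussed in this thread

utf8_bytes = text.encode("utf-8")       # variable width: 2 bytes per Greek letter
utf16_bytes = text.encode("utf-16-le")  # 2 bytes per letter for this text

# Assumed failure mode: every UTF-8 byte is treated as one Windows-1252
# character, so 6 Greek letters become 12 unrelated characters.
garbled = utf8_bytes.decode("cp1252")

# Reversing the same step recovers the original text, which is why the
# data looks wrong in one tool but correct when read back as UTF-8.
restored = garbled.encode("cp1252").decode("utf-8")

print(len(text), len(garbled))  # 6 12
print(restored == text)         # True

# Bytes per character: Latin "A", Greek capital alpha, a Chinese character.
for ch in ("A", "Α", "中"):
    print(ch, len(ch.encode("utf-8")), len(ch.encode("utf-16-le")))
```

The loop prints 1, 2 and 3 bytes for UTF-8 against 2 bytes each for UTF-16, which is the "variable number of bytes per character" point made above.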

iconbill Valued Community Member
Joined: 07 March 2007 Location: Netherlands
Online Status: Offline Posts: 76
Posted: 04 April 2008 at 12:26am
Thank you Andrew for your fast help.
We are now able to send UTF-8 text such as Greek or Chinese to every client, and to receive and display it correctly!
For everyone who reads this post in the future:
1. We now use the following code in every ASP page:
<%@LANGUAGE="VBSCRIPT" CODEPAGE="1252"%>
Response.CharSet = "utf-8"
2. And this in the POP page:
objPOP3.CodepageMode = 0
objPOP3.Codepage = 65001
3. Remark: We do not use Server.HTMLEncode() anymore; it's not needed in UTF-8 format. If you use it (as we did before), everything is displayed wrongly.
4. In JavaScript, when passing values in a string, use 'encodeURIComponent(value)'.
Hope this helps other people! (It took a lot of time before we understood the UTF-8/Unicode/SQL/ASP combinations and settings.)
Kind regards,
Marco
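As a rough illustration of what encodeURIComponent does to non-ASCII text, the Python sketch below percent-encodes the UTF-8 bytes of the subject and decodes them again. It is an approximation rather than an exact equivalent: with `safe=""`, Python's `quote` also escapes a few punctuation characters such as `!` and `'` that encodeURIComponent leaves alone.

```python
from urllib.parse import quote, unquote

value = "Αίτηση Εγγραφής"

# Each character is converted to its UTF-8 bytes and every byte is
# written as %XX; safe="" means "/" is escaped as well.
encoded = quote(value, safe="")

# The receiving side reverses the step and gets the original text back.
decoded = unquote(encoded)

print(encoded[:18])      # %CE%91%CE%AF%CF%84
print(decoded == value)  # True
```

Because the encoded value is plain ASCII, it passes through a URL unchanged whatever code page the page itself uses.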

Andrew AfterLogic Support
Joined: 28 April 2006 Location: United States
Online Status: Offline Posts: 1189
Posted: 04 April 2008 at 4:39am
Thanks for the summary.
Quote:
3. Remark: We do not use Server.HTMLEncode() anymore; it's not needed in UTF-8 format. If you use it (as we did before), everything is displayed wrongly.

HTML encoding is still necessary, though. Admittedly, Server.HTMLEncode() is locale-dependent and cannot be used for locales other than the server's default. The workaround is to use the following code:
Code:
Function WMHTMLEncode(strEncode)
    Dim strTmpEncode
    If Not IsNull(strEncode) Then
        ' Replace "&" first so the entities added below are not escaped again
        strTmpEncode = Replace(strEncode, "&", "&amp;")
        strTmpEncode = Replace(strTmpEncode, "<", "&lt;")
        strTmpEncode = Replace(strTmpEncode, ">", "&gt;")
        strTmpEncode = Replace(strTmpEncode, """", "&quot;")
    Else
        strTmpEncode = ""
    End If
    WMHTMLEncode = strTmpEncode
End Function
Best regards,
Andrew
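The same idea can be tried outside ASP. The Python sketch below mirrors the four replacements WMHTMLEncode is meant to perform (the function name `html_encode` is illustrative, not part of any library). It shows that only the markup characters are escaped, while the Greek text passes through untouched.

```python
def html_encode(text):
    """Escape &, <, > and the double quote; None becomes an empty string."""
    if text is None:
        return ""
    # "&" must be handled first, otherwise the "&" inside "&lt;" and the
    # other entities would itself be escaped on a later pass.
    replacements = (("&", "&amp;"), ("<", "&lt;"), (">", "&gt;"), ('"', "&quot;"))
    for char, entity in replacements:
        text = text.replace(char, entity)
    return text


sample = '<b>Αίτηση & "Εγγραφής"</b>'
encoded = html_encode(sample)
print(encoded)  # &lt;b&gt;Αίτηση &amp; &quot;Εγγραφής&quot;&lt;/b&gt;
```

Leaving non-ASCII characters alone is safe only because the page is served as UTF-8; the escaping is there to stop message text from being interpreted as markup.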

iconbill Valued Community Member
Joined: 07 March 2007 Location: Netherlands
Online Status: Offline Posts: 76
Posted: 04 April 2008 at 7:17am
Thank you!
The more applications I build, the fewer built-in VBScript functions I use.
Have a nice weekend!