Visual Basic Internet Routines
UrlCanonicalize: Proper URL Path Encoding and Decoding
     
Posted:   Monday July 09, 2001
Updated:   Monday December 26, 2011
     
Applies to:   VB4-32, VB5, VB6
Developed with:   VB6, Windows 2000
OS restrictions:   See prerequisites below
Author:   VBnet - Randy Birch
     

Related:  

UrlUnescape: Encoding and Decoding URL Escape Characters 
UrlCreateFromPath: Proper URL Path Conversion from a DOS Path
UrlGetPart: Determine the Constituent Parts of a URL
     
 Prerequisites
Shlwapi.dll version 5.00 or greater, Windows XP, 2000, NT4 with IE 5 or later, Windows 98, or Windows 95 with IE 5 or later. 

Another of the Shell Lightweight Utility APIs, UrlCanonicalize takes a URL string and converts it into canonical form. The function performs tasks such as replacing unsafe characters with their escape sequences and collapsing sequences like "..\" (as shown in the first three text boxes). The pszUrl parameter takes either a URL string, which must include a valid scheme such as "http://", or a remote or local file name. pszCanonicalized receives the returned null-terminated string for the URL. pcchCanonicalized holds the size of that buffer, in characters, on entry, and the demo code below relies on it holding the length of the returned string on exit.

If a URL string contains '/../' or '/./', UrlCanonicalize will normally treat the characters as indicating navigation in the URL hierarchy and will simplify the URL accordingly. For instance, "/hello/cruel/../world" will be simplified to "/hello/world". If the URL_DONT_SIMPLIFY flag is set in dwFlags, the function will not simplify the URL. In this case, "/hello/cruel/../world" will be left as is.
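The difference between the two behaviours can be seen with a short test routine. This is only a sketch: it assumes the routine is pasted into the same form as the Form Code listing on this page, so that the EncodeUrlCanonicalize helper and the URL_DONT_SIMPLIFY constant are available. The routine name ShowSimplifyDifference and the www.example.com address are made up for illustration.

```vb
Private Sub ShowSimplifyDifference()

  'hypothetical test URL; any URL with a
  'valid scheme and a '/../' segment will do
   Const sTestUrl As String = "http://www.example.com/hello/cruel/../world"

  'no flags: '/../' is treated as navigation
   Debug.Print EncodeUrlCanonicalize(sTestUrl, 0&)

  'URL_DONT_SIMPLIFY: '/../' is treated as literal text
   Debug.Print EncodeUrlCanonicalize(sTestUrl, URL_DONT_SIMPLIFY)

End Sub
```

Calling ShowSimplifyDifference from the Immediate window prints both results, one per line, for comparison.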

Specific flags are provided to customize the behaviour of UrlCanonicalize:

Flag                     Description
URL_DONT_SIMPLIFY        Treat '/./' and '/../' in a URL string as literal characters, not as shorthand for navigation. See the discussion above.
URL_ESCAPE_PERCENT       Convert any occurrence of '%' to its escape sequence.
URL_ESCAPE_SPACES_ONLY   Replace only spaces with escape sequences. This flag takes precedence over URL_ESCAPE_UNSAFE, but does not apply to opaque URLs.
URL_ESCAPE_UNSAFE        Replace unsafe values with their escape sequences. This flag applies to all URLs, including opaque URLs.
URL_PLUGGABLE_PROTOCOL   Handle URLs with client-defined pluggable protocols, according to the W3C specification. This flag does not apply to standard protocols such as ftp, http, gopher, and so on. If this flag is set, UrlCanonicalize will not simplify URLs, so there is no need to also set URL_DONT_SIMPLIFY.
URL_UNESCAPE             Unescape any escape sequences that the URL contains, with two exceptions. The escape sequences for '?' and '#' will not be unescaped. If one of the URL_ESCAPE_XXX flags is also set, the URL will be unescaped, then escaped.

The string passed to UrlCanonicalize can be any local or remote URL or file string.
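Because each flag is a distinct bit value, flags can be combined with the Or operator before being passed in dwFlags. The following is a minimal sketch, again assuming it lives in the same form as the Form Code listing on this page (for the constants and the EncodeUrlCanonicalize helper). The routine name ShowCombinedFlags and the test URL are hypothetical.

```vb
Private Sub ShowCombinedFlags()

   Dim dwFlags As Long

  'escape unsafe characters such as spaces,
  'but leave any '/../' segments untouched
   dwFlags = URL_ESCAPE_UNSAFE Or URL_DONT_SIMPLIFY

   Debug.Print EncodeUrlCanonicalize( _
      "http://www.example.com/my docs/../a b.htm", dwFlags)

End Sub
```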

 BAS Module Code
None.

 Form Code
To a form add a command button (Command1) and six text boxes (Text1 through Text6) along with the following code:

Option Explicit
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' Copyright ©1996-2011 VBnet/Randy Birch, All Rights Reserved.
' Some pages may also contain other copyrights by the author.
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' Distribution: You can freely use this code in your own
'               applications, but you may not reproduce 
'               or publish this code on any web site,
'               online service, or distribute as source 
'               on any media without express permission.
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Private Const MAX_PATH                As Long = 260
Private Const ERROR_SUCCESS           As Long = 0

'Convert any occurrence of '%' to
'its escape sequence.
Private Const URL_ESCAPE_PERCENT      As Long = &H1000

'Replace only spaces with escape
'sequences. This flag takes precedence
'over URL_ESCAPE_UNSAFE, but does not
'apply to opaque URLs.
Private Const URL_ESCAPE_SPACES_ONLY As Long = &H4000000

'Treat '/./' and '/../' in a URL string
'as literal characters, not as shorthand
'for navigation.
Private Const URL_DONT_SIMPLIFY      As Long = &H8000000

'Unescape any escape sequences that
'the URL contains, with two exceptions.
'The escape sequences for '?' and '#'
'will not be unescaped. If one of the
'URL_ESCAPE_XXX flags is also set, the
'URL will be unescaped, then escaped.
Private Const URL_UNESCAPE            As Long = &H10000000

'Replace unsafe values with their
'escape sequences. This flag applies
'to all URLs, including opaque URLs.
Private Const URL_ESCAPE_UNSAFE       As Long = &H20000000

'Handle URLs with client-defined
'pluggable protocols, according to
'the W3C specification. This flag
'does not apply to standard protocols
'such as ftp, http, gopher, and so on.
'If this flag is set, UrlCanonicalize will
'not simplify URLs, so there is no need
'to also set URL_DONT_SIMPLIFY.
Private Const URL_PLUGGABLE_PROTOCOL  As Long = &H40000000

Private Declare Function UrlCanonicalize Lib "shlwapi" _
   Alias "UrlCanonicalizeA" _
  (ByVal pszURL As String, _
   ByVal pszCanonicalized As String, _
   pcchCanonicalized As Long, _
   ByVal dwFlags As Long) As Long


Private Sub Form_Load()

   Text1.Text = "c:\my documents\vbnet articles\..\random access.doc"
   Text2.Text = ""
   Text3.Text = ""
   Text4.Text = "http://vbnet code lib/net code/net code/ip address.htm"
   Text5.Text = ""
   Text6.Text = ""
   Command1.Caption = "Canonicalize"

End Sub


Private Sub Command1_Click()
     
   Dim sUrl As String
   Dim sUrlEsc As String

  'use the original string in Text1 for
  'demo, and show results in Text2 and Text3
   sUrl = Text1.Text
   sUrlEsc = EncodeUrlCanonicalize(sUrl, URL_ESCAPE_UNSAFE)
   Text2.Text = sUrlEsc
   Text3.Text = EncodeUrlCanonicalize(sUrlEsc, URL_UNESCAPE)
   
   
  'use the original string in Text4 for
  'demo, and show results in Text5 and Text6
   sUrl = Text4.Text
   sUrlEsc = EncodeUrlCanonicalize(sUrl, URL_ESCAPE_UNSAFE)
   Text5.Text = sUrlEsc
   Text6.Text = EncodeUrlCanonicalize(sUrlEsc, URL_UNESCAPE)
   
End Sub


Private Function EncodeUrlCanonicalize(ByVal sUrl As String, _
                                       ByVal dwFlags As Long) As String

   Dim sUrlEsc As String
   Dim dwSize As Long
     
   If Len(sUrl) > 0 Then
      
      sUrlEsc = Space$(MAX_PATH)
      dwSize = Len(sUrlEsc)
      
      If UrlCanonicalize(sUrl, _
                         sUrlEsc, _
                         dwSize, _
                         dwFlags) = ERROR_SUCCESS Then
                   
         EncodeUrlCanonicalize = Left$(sUrlEsc, dwSize)
      
      End If  'If UrlCanonicalize
   End If  'If Len(sUrl) > 0

End Function
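
The function above uses a fixed MAX_PATH buffer, which a long URL can outgrow once its unsafe characters are expanded into escape sequences. The sketch below shows one possible way to retry with a larger buffer. It assumes that UrlCanonicalize returns E_POINTER and stores the required size in pcchCanonicalized when the buffer is too small; that behaviour is documented for related Shlwapi URL functions, but it is an assumption here and should be verified on the target system. The name CanonicalizeWithRetry is hypothetical, and the routine relies on the declarations already present in this form.

```vb
Private Function CanonicalizeWithRetry(ByVal sUrl As String, _
                                       ByVal dwFlags As Long) As String

  'standard HRESULT for an invalid pointer or
  'too-small buffer
   Const E_POINTER As Long = &H80004003

   Dim sBuffer As String
   Dim dwSize As Long
   Dim hr As Long

   If Len(sUrl) = 0 Then Exit Function

  'first attempt with the usual MAX_PATH buffer
   sBuffer = Space$(MAX_PATH)
   dwSize = Len(sBuffer)
   hr = UrlCanonicalize(sUrl, sBuffer, dwSize, dwFlags)

  'assumption: on E_POINTER, dwSize now holds the
  'required size; add one character as a safety
  'margin for the terminating null and retry once
   If hr = E_POINTER And dwSize >= Len(sBuffer) Then
      sBuffer = Space$(dwSize + 1)
      dwSize = Len(sBuffer)
      hr = UrlCanonicalize(sUrl, sBuffer, dwSize, dwFlags)
   End If

   If hr = ERROR_SUCCESS Then
      CanonicalizeWithRetry = Left$(sBuffer, dwSize)
   End If

End Function
```

It can be called in place of EncodeUrlCanonicalize in Command1_Click. Like the original, it returns an empty string on failure.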
Copyright ©1996-2011 VBnet and Randy Birch. All Rights Reserved.