URL Helpers
Functions for working with URLs.
Contains implementations of functions from urllib.parse
that handle bytes and strings.
class werkzeug.urls.BaseURL(scheme, netloc, path, query, fragment)
-
Superclass of URL and BytesURL.
Create new instance of _URLTuple(scheme, netloc, path, query, fragment)
property ascii_host: Optional[str]
-
Works exactly like host but will return a result that is restricted to ASCII. If it finds a netloc that is not ASCII, it will attempt to IDNA-encode it. This is useful for socket operations when the URL might include internationalized characters.
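The ASCII restriction described for ascii_host can be approximated with the standard library. The sketch below is not Werkzeug's implementation: `ascii_host_sketch` is a hypothetical helper, and it assumes Python's built-in `idna` codec is an acceptable stand-in for the conversion.

```python
from typing import Optional
from urllib.parse import urlsplit


def ascii_host_sketch(url: str) -> Optional[str]:
    """Return the host of ``url`` restricted to ASCII (hypothetical helper)."""
    host = urlsplit(url).hostname
    if host is None:
        return None
    try:
        # The built-in "idna" codec turns non-ASCII labels into Punycode.
        return host.encode("idna").decode("ascii")
    except UnicodeError:
        # Assumed fallback: drop whatever cannot be represented in ASCII.
        return host.encode("ascii", "ignore").decode("ascii")


print(ascii_host_sketch("http://\u2603.net/path"))  # xn--n3h.net
```

The result is a plain ASCII name that can be handed to socket APIs.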
property auth: Optional[str]
-
The authentication part in the URL if available, None otherwise.
decode_netloc()
-
Decodes the netloc part into a string.
- Return type
decode_query(*args, **kwargs)
-
Decodes the query part of the URL. This is a shortcut for calling url_decode() on the query argument. The arguments and keyword arguments are forwarded to url_decode() unchanged.
encode_netloc()
-
Encodes the netloc part to an ASCII safe URL as bytes.
- Return type
get_file_location(pathformat=None)
-
Returns a tuple with the location of the file in the form (server, location). If the netloc is empty in the URL or points to localhost, it’s represented as None.
The pathformat by default is autodetection but needs to be set when working with URLs of a specific system. The supported values are 'windows' when working with Windows or DOS paths and 'posix' when working with posix paths.
If the URL does not point to a local file, the server and location are both represented as None.
property host: Optional[str]
-
The host part of the URL if available, otherwise None. The host is either the hostname or the IP address mentioned in the URL. It will not contain the port.
join(*args, **kwargs)
-
Joins this URL with another one. This is just a convenience function for calling into url_join() and then parsing the return value again.
- Parameters
- Return type
property password: Optional[str]
-
The password if it was part of the URL, None otherwise. This undergoes URL decoding and will always be a string.
property port: Optional[int]
-
The port in the URL as an integer if it was present, None otherwise. This does not fill in default ports.
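The host and port behavior can be illustrated with `urllib.parse.urlsplit`, which this module mirrors. This is a standard-library sketch, not a call into Werkzeug, and the example URLs are made up for illustration.

```python
from urllib.parse import urlsplit

with_port = urlsplit("https://example.com:8443/index")
without_port = urlsplit("https://example.com/index")

# The host never includes the port.
print(with_port.hostname, with_port.port)        # example.com 8443

# No default port (such as 443 for https) is filled in.
print(without_port.hostname, without_port.port)  # example.com None
```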
property raw_password: Optional[str]
-
The password if it was part of the URL, None otherwise. Unlike password this one is not being decoded.
property raw_username: Optional[str]
-
The username if it was part of the URL, None otherwise. Unlike username this one is not being decoded.
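The difference between the raw and decoded credential properties can be sketched with the standard library. The assumption here is that `urlsplit`, which leaves credentials percent-encoded, plays the role of the raw_* properties, and `unquote` plays the role of the decoding step. The URL is hypothetical.

```python
from urllib.parse import unquote, urlsplit

parts = urlsplit("https://j%40ne:p%40ss@example.com/")

# Undecoded values, comparable to raw_username / raw_password.
raw_username, raw_password = parts.username, parts.password

# URL-decoded values, comparable to username / password.
username = unquote(raw_username)
password = unquote(raw_password)

print(raw_username, raw_password)  # j%40ne p%40ss
print(username, password)          # j@ne p@ss
```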
replace(**kwargs)
-
Return a URL with the same values, except for those parameters given new values by whichever keyword arguments are specified.
- Parameters
-
kwargs (Any) –
- Return type
to_iri_tuple()
-
Returns a URL tuple that holds an IRI. This will try to decode as much information as possible in the URL without losing information, similar to how a web browser does it for the URL bar.
It’s usually more interesting to directly call uri_to_iri() which will return a string.
- Return type
to_uri_tuple()
-
Returns a BytesURL tuple that holds a URI. This will encode all the information in the URL properly to ASCII using the rules a web browser would follow.
It’s usually more interesting to directly call iri_to_uri() which will return a string.
- Return type
to_url()
-
Returns a URL string or bytes depending on the type of the information stored. This is just a convenience function for calling url_unparse() for this URL.
- Return type
property username: Optional[str]
-
The username if it was part of the URL, None otherwise. This undergoes URL decoding and will always be a string.
class werkzeug.urls.BytesURL(scheme, netloc, path, query, fragment)
-
Represents a parsed URL in bytes.
Create new instance of _URLTuple(scheme, netloc, path, query, fragment)
decode(charset='utf-8', errors='replace')
-
Decodes the URL to a tuple made out of strings. The charset is only being used for the path, query and fragment.
- Parameters
- Return type
encode_netloc()
-
Returns the netloc unchanged as bytes.
- Return type
class werkzeug.urls.URL(scheme, netloc, path, query, fragment)
-
Represents a parsed URL. This behaves like a regular tuple but also has some extra attributes that give further insight into the URL.
Create new instance of _URLTuple(scheme, netloc, path, query, fragment)
encode(charset='utf-8', errors='replace')
-
Encodes the URL to a tuple made out of bytes. The charset is only being used for the path, query and fragment.
- Parameters
- Return type
werkzeug.urls.iri_to_uri(iri, charset='utf-8', errors='strict', safe_conversion=False)
-
Convert an IRI to a URI. All non-ASCII and unsafe characters are quoted. If the URL has a domain, it is encoded to Punycode.
>>> iri_to_uri('http://\u2603.net/p\xe5th?q=\xe8ry%DF')
'http://xn--n3h.net/p%C3%A5th?q=%C3%A8ry%DF'
- Parameters
-
- iri (Union[str, Tuple[str, str, str, str, str]]) – The IRI to convert.
- charset (str) – The encoding of the IRI.
- errors (str) – Error handler to use during bytes.encode.
- safe_conversion (bool) – Return the URL unchanged if it only contains ASCII characters and no whitespace. See the explanation below.
- Return type
There is a general problem with IRI conversion with some protocols that are in violation of the URI specification. Consider the following two IRIs:
magnet:?xt=uri:whatever
itms-services://?action=download-manifest
After parsing, we don’t know if the scheme requires the //, which is dropped if empty, but conveys different meanings in the final URL if it’s present or not. In this case, you can use safe_conversion, which will return the URL unchanged if it only contains ASCII characters and no whitespace. This can result in a URI with unquoted characters if it was not already quoted correctly, but preserves the URL’s semantics. Werkzeug uses this for the Location header for redirects.
Changelog
Changed in version 0.15: All reserved characters remain unquoted. Previously, only some reserved characters were left unquoted.
Changed in version 0.9.6: The safe_conversion parameter was added.
New in version 0.6.
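The two steps described for iri_to_uri (Punycode for the domain, percent-quoting for non-ASCII and unsafe characters) can be approximated with the standard library. This is a rough sketch under stated assumptions, not Werkzeug's algorithm: `iri_to_uri_sketch` is a hypothetical name, the sets of safe characters are a guess at "reserved characters remain unquoted", and userinfo in the netloc is ignored.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Assumed set of characters left unquoted (reserved characters plus "%").
SAFE = "/%:@!$&'()*+,;="


def iri_to_uri_sketch(iri: str) -> str:
    """Convert an IRI string to an ASCII-only URI (hypothetical helper)."""
    parts = urlsplit(iri)
    host = parts.hostname or ""
    # Encode the domain to Punycode with the built-in "idna" codec.
    netloc = host.encode("idna").decode("ascii") if host else ""
    if parts.port is not None:
        netloc = f"{netloc}:{parts.port}"
    # Percent-quote non-ASCII characters as UTF-8; existing escapes are kept
    # because "%" is treated as safe.
    path = quote(parts.path, safe=SAFE)
    query = quote(parts.query, safe=SAFE + "?")
    fragment = quote(parts.fragment, safe=SAFE + "?")
    return urlunsplit((parts.scheme, netloc, path, query, fragment))


print(iri_to_uri_sketch("http://\u2603.net/p\xe5th?q=\xe8ry%DF"))
# http://xn--n3h.net/p%C3%A5th?q=%C3%A8ry%DF
```

Because the sketch always reassembles the URL from parsed components, it is subject to the same ambiguity about an empty `//` that safe_conversion is meant to avoid.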
werkzeug.urls.uri_to_iri(uri, charset='utf-8', errors='werkzeug.url_quote')
-
Convert a URI to an IRI. All valid UTF-8 characters are unquoted, leaving all reserved and invalid characters quoted. If the URL has a domain, it is decoded from Punycode.
>>> uri_to_iri("http://xn--n3h.net/p%C3%A5th?q=%C3%A8ry%DF")
'http://\u2603.net/p\xe5th?q=\xe8ry%DF'
- Parameters
- Return type
Changelog
Changed in version 0.15: All reserved and invalid characters remain quoted. Previously, only some reserved characters were preserved, and invalid bytes were replaced instead of left quoted.
New in version 0.6.
werkzeug.urls.url_decode(s, charset='utf-8', include_empty=True, errors='replace', separator='&', cls=None)
-
Parse a query string and return it as a MultiDict.
- Parameters
-
- s (AnyStr) – The query string to parse.
- charset (str) – Decode bytes to string with this charset. If not given, bytes are returned as-is.
- include_empty (bool) – Include keys with empty values in the dict.
- errors (str) – Error handling behavior when decoding bytes.
- separator (str) – Separator character between pairs.
- cls (Optional[Type[ds.MultiDict]]) – Container to hold result instead of MultiDict.
- Return type
Changelog
Changed in version 2.0: The decode_keys parameter is deprecated and will be removed in Werkzeug 2.1.
Changed in version 0.5: In previous versions “;” and “&” could be used for url decoding. Now only “&” is supported. If you want to use “;”, a different separator can be provided.
Changed in version 0.5: The cls parameter was added.
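The effect of include_empty and separator can be sketched with `urllib.parse.parse_qsl`. This is a standard-library approximation: `decode_query_sketch` is a hypothetical helper, and a plain dict of lists stands in for MultiDict.

```python
from collections import defaultdict
from urllib.parse import parse_qsl


def decode_query_sketch(query, include_empty=True, separator="&"):
    """Parse a query string into a dict of value lists (hypothetical helper)."""
    values = defaultdict(list)
    pairs = parse_qsl(query, keep_blank_values=include_empty, separator=separator)
    for key, value in pairs:
        # Repeated keys keep every value, as a multi-valued dict would.
        values[key].append(value)
    return dict(values)


print(decode_query_sketch("a=1&a=2&b="))               # {'a': ['1', '2'], 'b': ['']}
print(decode_query_sketch("a=1&b=", include_empty=False))  # {'a': ['1']}
print(decode_query_sketch("a=1;b=2", separator=";"))   # {'a': ['1'], 'b': ['2']}
```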
werkzeug.urls.url_decode_stream(stream, charset='utf-8', include_empty=True, errors='replace', separator=b'&', cls=None, limit=None)
-
Works like url_decode() but decodes a stream. The behavior of stream and limit follows functions like make_line_iter(). The generator of pairs is directly fed to the cls so you can consume the data while it’s parsed.
- Parameters
-
- stream (IO[bytes]) – a stream with the encoded querystring
- charset (str) – the charset of the query string. If set to None no decoding will take place.
- include_empty (bool) – Set to False if you don’t want empty values to appear in the dict.
- errors (str) – the decoding error behavior.
- separator (bytes) – the pair separator to be used, defaults to &
- cls (Optional[Type[ds.MultiDict]]) – an optional dict class to use. If this is not specified or None the default MultiDict is used.
- limit (Optional[int]) – the content length of the URL data. Not necessary if a limited stream is provided.
- Return type
Changelog
Changed in version 2.0: The decode_keys and return_iterator parameters are deprecated and will be removed in Werkzeug 2.1.
New in version 0.8.
werkzeug.urls.url_encode(obj, charset='utf-8', sort=False, key=None, separator='&')
-
URL encode a dict/MultiDict. If a value is None it will not appear in the result string. Per default only values are encoded into the target charset strings.
- Parameters
-
- obj (Union[Mapping[str, str], Iterable[Tuple[str, str]]]) – the object to encode into a query string.
- charset (str) – the charset of the query string.
- sort (bool) – set to True if you want parameters to be sorted by key.
- separator (str) – the separator to be used for the pairs.
- key (Optional[Callable[[Tuple[str, str]], Any]]) – an optional function to be used for sorting. For more details check out the sorted() documentation.
- Return type
Changelog
Changed in version 2.0: The encode_keys parameter is deprecated and will be removed in Werkzeug 2.1.
Changed in version 0.5: Added the sort, key, and separator parameters.
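The handling of None values, sort, key, and separator can be sketched with `urllib.parse.urlencode`. This is an approximation built on the standard library: `encode_query_sketch` is a hypothetical helper and only accepts a plain mapping.

```python
from urllib.parse import urlencode


def encode_query_sketch(params, sort=False, key=None, separator="&"):
    """Encode a mapping as a query string (hypothetical helper)."""
    # Pairs whose value is None are left out of the result.
    pairs = [(k, v) for k, v in params.items() if v is not None]
    if sort:
        pairs.sort(key=key)
    # Encode each pair on its own so a custom separator can be used.
    return separator.join(urlencode([pair]) for pair in pairs)


params = {"b": "x y", "a": "1", "c": None}
print(encode_query_sketch(params))                 # b=x+y&a=1
print(encode_query_sketch(params, sort=True))      # a=1&b=x+y
print(encode_query_sketch(params, separator=";"))  # b=x+y;a=1
```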
werkzeug.urls.url_encode_stream(obj, stream=None, charset='utf-8', sort=False, key=None, separator='&')
-
Like url_encode() but writes the results to a stream object. If the stream is None a generator over all encoded pairs is returned.
- Parameters
-
- obj (Union[Mapping[str, str], Iterable[Tuple[str, str]]]) – the object to encode into a query string.
- stream (Optional[IO[str]]) – a stream to write the encoded object into or None if an iterator over the encoded pairs should be returned. In that case the separator argument is ignored.
- charset (str) – the charset of the query string.
- sort (bool) – set to True if you want parameters to be sorted by key.
- separator (str) – the separator to be used for the pairs.
- key (Optional[Callable[[Tuple[str, str]], Any]]) – an optional function to be used for sorting. For more details check out the sorted() documentation.
- Return type
-
None
Changelog
Changed in version 2.0: The encode_keys parameter is deprecated and will be removed in Werkzeug 2.1.
New in version 0.8.
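The two modes described for url_encode_stream (write to a stream, or return a generator when the stream is None) can be sketched as follows. This uses only the standard library; `iter_pairs` and `encode_stream_sketch` are hypothetical names, not Werkzeug functions.

```python
import io
from urllib.parse import quote_plus


def iter_pairs(params):
    """Yield encoded "key=value" strings, skipping None values."""
    for key, value in params.items():
        if value is not None:
            yield f"{quote_plus(str(key))}={quote_plus(str(value))}"


def encode_stream_sketch(params, stream=None, separator="&"):
    """Write encoded pairs to ``stream``, or return a generator if it is None."""
    pairs = iter_pairs(params)
    if stream is None:
        # Without a stream the separator is not used at all.
        return pairs
    for index, pair in enumerate(pairs):
        if index:
            stream.write(separator)
        stream.write(pair)
    return None


buffer = io.StringIO()
encode_stream_sketch({"a": "1", "b": "x y"}, buffer)
print(buffer.getvalue())                                  # a=1&b=x+y
print(list(encode_stream_sketch({"a": "1", "n": None})))  # ['a=1']
```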
werkzeug.urls.url_fix(s, charset='utf-8')
-
Sometimes a user supplies a URL that isn’t a valid URL because it contains unsafe characters such as spaces. This function can fix some of these problems in a similar way to how browsers handle data entered by the user:
>>> url_fix('http://de.wikipedia.org/wiki/Elf (Begriffskl\xe4rung)')
'http://de.wikipedia.org/wiki/Elf%20(Begriffskl%C3%A4rung)'
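A similar repair can be sketched with the standard library by quoting only the path and query. This is an illustration, not Werkzeug's implementation: `url_fix_sketch` is a hypothetical helper, and the sets of characters left unquoted are assumptions.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def url_fix_sketch(url: str) -> str:
    """Percent-quote unsafe characters in the path and query (hypothetical)."""
    parts = urlsplit(url)
    # "%" is treated as safe so already-quoted input is left alone.
    path = quote(parts.path, safe="/%+$!*'(),")
    query = quote(parts.query, safe=":&%=+$!*'(),")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


print(url_fix_sketch("http://de.wikipedia.org/wiki/Elf (Begriffskl\xe4rung)"))
# http://de.wikipedia.org/wiki/Elf%20(Begriffskl%C3%A4rung)
```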
werkzeug.urls.url_join(base, url, allow_fragments=True)
-
Join a base URL and a possibly relative URL to form an absolute interpretation of the latter.
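The joining rules can be illustrated with `urllib.parse.urljoin`, the standard-library function with the same signature. The URLs are made up for illustration, and Werkzeug's own function may differ in edge cases.

```python
from urllib.parse import urljoin

base = "http://example.com/a/b"

relative = urljoin(base, "c?x=1")               # resolved against the base directory
parent = urljoin(base, "../c")                  # ".." climbs one path segment
absolute = urljoin(base, "https://other.org/z")  # an absolute URL replaces the base

print(relative)  # http://example.com/a/c?x=1
print(parent)    # http://example.com/c
print(absolute)  # https://other.org/z
```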
werkzeug.urls.url_parse(url, scheme=None, allow_fragments=True)
-
Parses a URL from a string into a URL tuple. If the URL is lacking a scheme it can be provided as second argument. Otherwise, it is ignored. Optionally fragments can be stripped from the URL by setting allow_fragments to False.
The inverse of this function is url_unparse().
- Parameters
- Return type
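The role of the scheme argument and the parse/unparse round trip can be shown with `urllib.parse.urlsplit` and `urlunsplit`. They stand in here for url_parse() and url_unparse(), and the URLs are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# The scheme argument is used only because the URL itself has none.
parts = urlsplit("//example.com/search?q=werkzeug#top", scheme="https")
print(parts.scheme, parts.netloc, parts.path, parts.query, parts.fragment)
# https example.com /search q=werkzeug top

# A scheme already present in the URL wins over the argument.
explicit = urlsplit("http://example.com/", scheme="https")
print(explicit.scheme)  # http

# Unparsing the five components gives back a URL string.
rebuilt = urlunsplit(parts)
print(rebuilt)  # https://example.com/search?q=werkzeug#top
```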
werkzeug.urls.url_quote(string, charset='utf-8', errors='strict', safe='/:', unsafe='')
-
URL encode a single string with a given encoding.
- Parameters
- Return type
Changelog
New in version 0.9.2: The unsafe parameter was added.
werkzeug.urls.url_quote_plus(string, charset='utf-8', errors='strict', safe='')
-
URL encode a single string with the given encoding and convert whitespace to “+”.
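The difference between the two quoting functions can be sketched with their standard-library counterparts, `quote` and `quote_plus`. The `encoding` keyword plays the role of charset here. There is no standard-library equivalent of the unsafe parameter, so it is not shown.

```python
from urllib.parse import quote, quote_plus

# "/" and ":" are left alone because they are listed as safe.
path_quoted = quote("a b/c:d?", safe="/:")

# quote_plus turns spaces into "+" and quotes everything else unsafe.
form_quoted = quote_plus("a b&c=d")

# The encoding decides which bytes a non-ASCII character becomes.
utf8_quoted = quote("caf\xe9")
latin1_quoted = quote("caf\xe9", encoding="latin-1")

print(path_quoted)    # a%20b/c:d%3F
print(form_quoted)    # a+b%26c%3Dd
print(utf8_quoted)    # caf%C3%A9
print(latin1_quoted)  # caf%E9
```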
werkzeug.urls.url_unparse(components)
-
The reverse operation to url_parse(). This accepts arbitrary tuples as well as URL tuples and returns a URL as a string.
werkzeug.urls.url_unquote(s, charset='utf-8', errors='replace', unsafe='')
-
URL decode a single string with a given encoding. If the charset is set to None no decoding is performed and raw bytes are returned.
werkzeug.urls.url_unquote_plus(s, charset='utf-8', errors='replace')
-
URL decode a single string with the given charset and decode “+” to whitespace.
With the default errors='replace', undecodable bytes are replaced instead of raising an error. If you want a different behavior you can set errors to another handler such as 'strict'.
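The unquoting behavior, including the effect of the errors argument, can be sketched with the standard-library counterparts `unquote`, `unquote_plus`, and `unquote_to_bytes`. Using `unquote_to_bytes` for the "no charset" case is an assumption about the closest equivalent.

```python
from urllib.parse import unquote, unquote_plus, unquote_to_bytes

decoded = unquote("caf%C3%A9")             # percent escapes decoded as UTF-8
plus_decoded = unquote_plus("a+b%26c")     # "+" additionally becomes a space

# %DF alone is not valid UTF-8, so the error handler decides the outcome.
replaced = unquote("%DF", errors="replace")  # U+FFFD replacement character
raw = unquote_to_bytes("%DF")                # no decoding, raw bytes

try:
    unquote("%DF", errors="strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

print(decoded, plus_decoded, repr(replaced), raw, strict_failed)
```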
© 2007–2022 Pallets
Licensed under the BSD 3-clause License.
https://werkzeug.palletsprojects.com/en/2.1.x/urls/