API Reference
ietfparse.algorithms
Implementations of algorithms from various specifications.
- remove_url_auth(): removes and returns the auth portion of a URL. This is particularly handy for processing URLs from configuration files or environment variables.
- rewrite_url(): modifies a portion of a URL.
- select_content_type(): selects the best match between an HTTP Accept header and a list of available Content-Types.
This module implements some of the more interesting algorithms described in IETF RFCs.
ietfparse.algorithms.IDNA_SCHEMES
A collection of schemes that use IDN encoding for their hosts.
class ietfparse.algorithms.RemoveUrlAuthResult
ietfparse.algorithms.remove_url_auth(url)
Removes the user & password and returns them along with a new URL.
Parameters: url (str) – the URL to sanitize
Returns: a tuple containing the authorization portion and the sanitized URL. The authorization is a simple user & password tuple.
>>> auth, sanitized = remove_url_auth('http://foo:bar@example.com')
>>> auth
('foo', 'bar')
>>> sanitized
'http://example.com'
The return value from this function is a simple named tuple with the following fields:
- auth: the username and password as a tuple
- username: the username portion of the URL or None
- password: the password portion of the URL or None
- url: the sanitized URL
>>> result = remove_url_auth('http://me:secret@example.com')
>>> result.username
'me'
>>> result.password
'secret'
>>> result.url
'http://example.com'
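The behavior described above can be approximated with the standard library alone. The sketch below is not ietfparse's implementation; the name strip_auth is hypothetical and only urllib.parse is assumed.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_auth(url):
    """Return ((user, password), url_without_auth) for ``url``.

    Hypothetical helper for illustration; not part of ietfparse.
    """
    parts = urlsplit(url)
    # Rebuild the network location without the "user:password@" prefix.
    # Note: ``hostname`` is lower-cased and loses IPv6 brackets, which a
    # production implementation would need to handle.
    netloc = parts.hostname or ""
    if parts.port is not None:
        netloc = f"{netloc}:{parts.port}"
    sanitized = urlunsplit(parts._replace(netloc=netloc))
    return (parts.username, parts.password), sanitized


auth, sanitized = strip_auth("http://foo:bar@example.com:8080/path?q=1")
print(auth)       # ('foo', 'bar')
print(sanitized)  # http://example.com:8080/path?q=1
```

Returning the credentials separately from the sanitized URL makes it easy to log the URL without leaking the password.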
ietfparse.algorithms.rewrite_url(input_url, **kwargs)
Create a new URL from input_url with modifications applied.
Parameters:
- input_url (str) – the URL to modify
- fragment (str) – if specified, this keyword sets the fragment portion of the URL. A value of None will remove the fragment portion of the URL.
- host (str) – if specified, this keyword sets the host portion of the network location. A value of None will remove the network location portion of the URL.
- password (str) – if specified, this keyword sets the password portion of the URL. A value of None will remove the password from the URL.
- path (str) – if specified, this keyword sets the path portion of the URL. A value of None will remove the path from the URL.
- port (int) – if specified, this keyword sets the port portion of the network location. A value of None will remove the port from the URL.
- query – if specified, this keyword sets the query portion of the URL. See the description of the query parameter below.
- scheme (str) – if specified, this keyword sets the scheme portion of the URL. A value of None will remove the scheme. Note that this will make the URL relative and may have unintended consequences.
- user (str) – if specified, this keyword sets the user portion of the URL. A value of None will remove the user and password portions.
- enable_long_host (bool) – if this keyword is specified and it is True, then the host name length restriction from section 3.2.2 of RFC 3986 is relaxed.
- encode_with_idna (bool) – if this keyword is specified and it is True, then the host parameter will be encoded using IDN. If this value is provided as False, then the percent-encoding scheme is used instead. If this parameter is omitted or included with a different value, then the host parameter is processed using IDNA_SCHEMES.
Returns: the modified URL
Raises: ValueError – when a keyword parameter is given an invalid value
If the host parameter is specified and not None, then it will be processed as an Internationalized Domain Name (IDN) if the scheme appears in IDNA_SCHEMES. Otherwise, it will be encoded as UTF-8 and percent encoded.
The handling of the query parameter requires some additional explanation. You can specify a query value in three different ways: as a mapping, as a sequence of pairs, or as a string. This flexibility makes it possible to meet the wide range of finicky use cases.
If the query parameter is a mapping, then the key and value pairs are sorted by key before they are encoded. Use this form whenever possible.
If the query parameter is a sequence of pairs, then each pair is encoded in the given order. Use this form when you need to control the parameter order.
If the query parameter is a string, then it is used as-is. Avoid this form where possible: no URL escaping is performed, so it can easily result in broken URLs. It exists as the obvious pass-through case.
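The three query forms can be illustrated with urllib.parse.urlencode. This is a sketch of the rules as described, not the library's code; encode_query is a hypothetical name, and the exact escaping ietfparse applies (for example "+" versus "%20" for spaces) may differ.

```python
from collections.abc import Mapping
from urllib.parse import urlencode


def encode_query(query):
    """Encode ``query`` following the three cases described above.

    Hypothetical helper for illustration; not part of ietfparse.
    """
    if isinstance(query, str):
        # Strings pass through untouched, so the caller must escape them.
        return query
    if isinstance(query, Mapping):
        # Mappings are sorted by key for a deterministic result.
        pairs = sorted(query.items())
    else:
        # Sequences of pairs keep the caller's order.
        pairs = list(query)
    return urlencode(pairs)


print(encode_query({"b": "2", "a": "1 1"}))       # a=1+1&b=2
print(encode_query([("b", "2"), ("a", "1 1")]))   # b=2&a=1+1
print(encode_query("b=2&a=1 1"))                  # b=2&a=1 1 (unescaped)
```

The last call shows why the string form is risky: the space reaches the URL unescaped.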
ietfparse.algorithms.select_content_type(requested, available)
Selects the best content type.
Parameters:
- requested – a sequence of ContentType instances
- available – a sequence of ContentType instances that the server is capable of producing
Returns: the selected content type (from available) and the pattern that it matched (from requested)
Return type: tuple of ContentType instances
Raises: NoMatch when a suitable match was not found
This function implements the Proactive Content Negotiation algorithm as described in sections 3.4.1 and 5.3 of RFC 7231. The input is the Accept header as parsed by parse_http_accept_header() and a list of parsed ContentType instances. The available sequence should be a sequence of content types that the server is capable of producing. The selected value should ultimately be used as the Content-Type header in the generated response.
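To make the negotiation idea concrete, here is a heavily simplified sketch. It is not the library's algorithm: it ignores media type parameters, it represents requested entries as hypothetical (pattern, quality) pairs of plain strings and floats rather than ContentType instances, and it raises LookupError where the library raises NoMatch.

```python
def specificity(pattern, candidate):
    """Rank how closely ``pattern`` matches ``candidate`` (0 means no match)."""
    if pattern == candidate:
        return 3
    pattern_type, _, pattern_subtype = pattern.partition("/")
    candidate_type = candidate.partition("/")[0]
    if pattern == "*/*":
        return 1
    if pattern_subtype == "*" and pattern_type == candidate_type:
        return 2
    return 0


def negotiate(requested, available):
    """Return (selected, matched_pattern) for simplified inputs.

    ``requested`` is a list of (pattern, quality) pairs and ``available``
    is a list of media type strings. Both shapes are assumptions made
    for this sketch.
    """
    best = None
    for candidate in available:
        matches = [
            (specificity(pattern, candidate), quality, pattern)
            for pattern, quality in requested
        ]
        matches = [match for match in matches if match[0] > 0]
        if not matches:
            continue
        # The most specific matching pattern decides the quality.
        rank, quality, pattern = max(matches)
        if quality <= 0:
            continue  # explicitly rejected by the client
        if best is None or (quality, rank) > best[0]:
            best = ((quality, rank), candidate, pattern)
    if best is None:
        raise LookupError("no acceptable content type")
    return best[1], best[2]


requested = [("text/*", 0.5), ("application/json", 1.0), ("*/*", 0.1)]
print(negotiate(requested, ["text/html", "application/json"]))
```

The first element of the result would become the response's Content-Type; the second records which Accept entry justified the choice.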
ietfparse.datastructures
Important data structures.
- ContentType: MIME Content-Type header.
This module contains data structures that were useful in implementing this library. If a data structure might be useful outside of a particular piece of functionality, it is fully fleshed out and ends up here.
class ietfparse.datastructures.ContentType(content_type, content_subtype, parameters=None, content_suffix=None)
A MIME Content-Type header.
Internet content types are described by the Content-Type header from RFC 2045. It was reused across many other protocol specifications, most notably HTTP (RFC 7231). This header’s syntax is described in section 5.1 of RFC 2045. In its most basic form, a content type header looks like text/html. The primary content type is text with a subtype of html. Content type headers can include parameters as name=value pairs separated by semicolons.
RFC 6839 added the ability to identify the semantic value of a representation with the content type and to identify the document format with a content type suffix. For example, application/vnd.github.v3+json is used to identify documents that match version 3 of the GitHub API that are represented as JSON documents. The same entity encoded as msgpack would have the content type application/vnd.github.v3+msgpack. In this case, the content type identifies the information that is in the document and the suffix is used to identify the content format.
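The pieces named above (type, subtype, suffix, parameters) can be pulled apart with plain string operations. The sketch below is illustrative only: split_content_type is a hypothetical helper, it treats the text after the last "+" as the suffix, and it does not handle quoted parameter values that contain semicolons.

```python
def split_content_type(value):
    """Split a Content-Type value into (type, subtype, suffix, parameters).

    Hypothetical helper for illustration; not part of ietfparse.
    """
    media_type, *raw_params = [part.strip() for part in value.split(";")]
    content_type, _, subtype = media_type.partition("/")
    suffix = None
    if "+" in subtype:
        # Assumption: the structured suffix follows the last "+".
        subtype, suffix = subtype.rsplit("+", 1)
    parameters = {}
    for raw in raw_params:
        if not raw:
            continue  # tolerate a trailing semicolon
        name, _, param_value = raw.partition("=")
        parameters[name.strip().lower()] = param_value.strip().strip('"')
    return content_type.lower(), subtype.lower(), suffix, parameters


print(split_content_type("application/vnd.github.v3+json; charset=UTF-8"))
print(split_content_type("text/html"))
```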
class ietfparse.datastructures.LinkHeader(target, parameters=None)
Represents a single link within a Link header.
target
The target URL of the link. This may be a relative URL so the caller may have to make the link absolute by resolving it against a base URL as described in section 5 of RFC 3986.
parameters
Possibly empty sequence of name and value pairs. Parameters are represented as a sequence since a single parameter may occur more than once.
The Link header is specified by RFC 5988. It is one of the methods used to represent hypermedia links between HTTP resources.
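Two points about link targets and parameters lend themselves to a short illustration: resolving a relative target against a base URL, and reading a parameter that occurs more than once. The values below are hypothetical stand-ins for a parsed link, not output from ietfparse.

```python
from urllib.parse import urljoin

# Hypothetical stand-ins for a parsed link's target and parameters.
target = "/items?page=2"
parameters = [("rel", "next"), ("hreflang", "en"), ("hreflang", "de")]

# Resolve the relative target against the URL of the response it came from.
base_url = "https://api.example.com/items?page=1"
absolute = urljoin(base_url, target)

# A sequence of pairs preserves repeated parameters; a dict would drop one.
languages = [value for name, value in parameters if name == "hreflang"]

print(absolute)   # https://api.example.com/items?page=2
print(languages)  # ['en', 'de']
```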
ietfparse.errors
Exceptions raised from within ietfparse.
All exceptions are rooted at RootException, so you can catch it to implement error handling behavior associated with this library’s functionality.
exception ietfparse.errors.MalformedLinkValue
Value specified is not a valid link header.
exception ietfparse.errors.NoMatch
No match was found when selecting a content type.
exception ietfparse.errors.RootException
Root of the ietfparse exception hierarchy.
exception ietfparse.errors.StrictHeaderParsingFailure(header_name, header_value)
Non-standard header value detected.
This is raised when “strict” conformance is enabled for a header parsing function and a header value fails due to one of the “strict” rules.
See ietfparse.headers.parse_forwarded() for an example.
ietfparse.headers
Functions for parsing headers.
- parse_accept(): parse an Accept value
- parse_accept_charset(): parse an Accept-Charset value
- parse_cache_control(): parse a Cache-Control value
- parse_content_type(): parse a Content-Type value
- parse_forwarded(): parse an RFC 7239 Forwarded value
- parse_link(): parse an RFC 5988 Link value
- parse_list(): parse a comma-separated list that is present in so many headers
This module also defines classes that might be of some use outside of the module. They are not designed for direct usage unless otherwise mentioned.
ietfparse.headers.parse_accept(header_value)
Parse an HTTP accept-like header.
Parameters: header_value (str) – the header value to parse
Returns: a list of ContentType instances in decreasing quality order. Each instance is augmented with the associated quality as a float property named quality.
Accept is a class of headers that contain a list of values and an associated preference value. The ever-present Accept header is a perfect example. It is a list of content types and an optional parameter named q that indicates the relative weight of a particular type. The most basic example is:
Accept: audio/*;q=0.2, audio/basic
This states that the client prefers the audio/basic content type but will accept other audio subtypes at an 80% markdown.
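The ordering by q can be sketched without the library. The helper below, sort_by_quality, is hypothetical and returns plain (media_type, quality) tuples instead of ContentType instances; it also breaks ties by original position only, which may not match the library's tie-breaking.

```python
def sort_by_quality(header_value):
    """Return (media_type, quality) pairs in decreasing quality order.

    Hypothetical helper for illustration; not part of ietfparse.
    """
    entries = []
    for position, item in enumerate(header_value.split(",")):
        media_type, *params = [part.strip() for part in item.split(";")]
        quality = 1.0  # default weight when no q parameter is given
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                quality = float(value)
        # Negate quality so an ascending sort yields highest quality first.
        entries.append((-quality, position, media_type))
    return [(media_type, -negated) for negated, _, media_type in sorted(entries)]


print(sort_by_quality("audio/*;q=0.2, audio/basic"))
# [('audio/basic', 1.0), ('audio/*', 0.2)]
```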
ietfparse.headers.parse_accept_charset(header_value)
Parse the Accept-Charset header into a sorted list.
Parameters: header_value (str) – header value to parse
Returns: list of character sets sorted from highest to lowest priority
The Accept-Charset header is a list of character set names with optional quality values. The quality value indicates the strength of the preference where 1.0 is a strong preference and less than 0.001 is outright rejection by the client.
Note
Character sets are rejected by setting the quality value to less than 0.001. If a wildcard is included in the header, then it will appear BEFORE values that are rejected.
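The rejection threshold and the position of the wildcard can be shown with a small standard-library sketch. It is an approximation of the documented ordering, not the library's code: order_preferences is a hypothetical name and it returns the rejected names separately for clarity.

```python
REJECT_BELOW = 0.001  # threshold quoted in the note above


def order_preferences(header_value):
    """Return (names_by_priority, rejected_names) for a weighted list header.

    Hypothetical helper for illustration; not part of ietfparse.
    """
    weighted = []
    for item in header_value.split(","):
        name, _, params = item.strip().partition(";")
        quality = 1.0  # default weight when no q parameter is given
        for param in params.split(";"):
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                quality = float(value)
        weighted.append((name.strip().lower(), quality))
    # A stable sort keeps header order among equally weighted names.
    ordered = sorted(weighted, key=lambda pair: -pair[1])
    names = [name for name, _ in ordered]
    rejected = [name for name, quality in ordered if quality < REJECT_BELOW]
    return names, rejected


print(order_preferences("iso-8859-5;q=0, utf-8, *;q=0.5"))
# (['utf-8', '*', 'iso-8859-5'], ['iso-8859-5'])
```

Because rejected names carry the lowest weights, the wildcard naturally sorts ahead of them.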
ietfparse.headers.parse_accept_encoding(header_value)
Parse the Accept-Encoding header into a sorted list.
Parameters: header_value (str) – header value to parse
Returns: list of encodings sorted from highest to lowest priority
The Accept-Encoding header is a list of encodings with optional quality values. The quality value indicates the strength of the preference where 1.0 is a strong preference and less than 0.001 is outright rejection by the client.
Note
Encodings are rejected by setting the quality value to less than 0.001. If a wildcard is included in the header, then it will appear BEFORE values that are rejected.
ietfparse.headers.parse_accept_language(header_value)
Parse the Accept-Language header into a sorted list.
Parameters: header_value (str) – header value to parse
Returns: list of languages sorted from highest to lowest priority
The Accept-Language header is a list of languages with optional quality values. The quality value indicates the strength of the preference where 1.0 is a strong preference and less than 0.001 is outright rejection by the client.
Note
Languages are rejected by setting the quality value to less than 0.001. If a wildcard is included in the header, then it will appear BEFORE values that are rejected.
ietfparse.headers.parse_cache_control(header_value)
Parse a Cache-Control header, returning a dictionary of key-value pairs.
Any Cache-Control directives that do not take a value, such as public or no-cache, will be returned with a value of True if they are set in the header.
Parameters: header_value (str) – Cache-Control header value to parse
Returns: the parsed Cache-Control header values
Return type: dict
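The mapping from directives to dictionary entries can be sketched as follows. This is an illustration of the idea, not ietfparse's implementation: cache_control_to_dict is a hypothetical name, values are kept as strings, and quoted values containing commas are not handled.

```python
def cache_control_to_dict(header_value):
    """Turn a Cache-Control value into a dict; bare directives map to True.

    Hypothetical helper for illustration; not part of ietfparse.
    """
    directives = {}
    for item in header_value.split(","):
        item = item.strip()
        if not item:
            continue
        name, separator, value = item.partition("=")
        name = name.strip().lower()
        if separator:
            # Directive with an argument, for example max-age=3600.
            directives[name] = value.strip().strip('"')
        else:
            # Bare directive, for example public or no-cache.
            directives[name] = True
    return directives


print(cache_control_to_dict("public, max-age=3600, no-cache"))
# {'public': True, 'max-age': '3600', 'no-cache': True}
```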
ietfparse.headers.parse_content_type(content_type, normalize_parameter_values=True)
Parse a content-type-like header.
Returns: a ContentType instance
ietfparse.headers.parse_forwarded(header_value, only_standard_parameters=False)
Parse an RFC 7239 Forwarded header.
Parameters:
- header_value (str) – value to parse
- only_standard_parameters (bool) – if this keyword is specified and given a truthy value, then a non-standard parameter name will result in StrictHeaderParsingFailure
Raises: ietfparse.errors.StrictHeaderParsingFailure is raised if only_standard_parameters is enabled and a non-standard parameter name is encountered
This function parses an RFC 7239 HTTP header into a list of dict instances with each instance containing the param values. The list is ordered as received from left to right and the parameter names are folded to lower case strings.
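The list-of-dicts shape and the strict mode can be approximated with the standard library. The sketch below is not the library's parser: split_forwarded is a hypothetical name, it raises ValueError where the library raises StrictHeaderParsingFailure, and it does not handle commas or semicolons inside quoted values. The four standard parameter names come from RFC 7239.

```python
STANDARD_PARAMETERS = {"by", "for", "host", "proto"}  # defined by RFC 7239


def split_forwarded(header_value, only_standard_parameters=False):
    """Return one dict of lower-cased parameter names per forwarded element.

    Hypothetical helper for illustration; not part of ietfparse.
    """
    elements = []
    for element in header_value.split(","):
        params = {}
        for pair in element.split(";"):
            name, separator, value = pair.partition("=")
            name = name.strip().lower()
            if not separator or not name:
                continue
            if only_standard_parameters and name not in STANDARD_PARAMETERS:
                raise ValueError(f"non-standard parameter: {name}")
            params[name] = value.strip().strip('"')
        if params:
            elements.append(params)
    return elements


print(split_forwarded('For=192.0.2.43;Proto=https, for="[2001:db8::1]"'))
# [{'for': '192.0.2.43', 'proto': 'https'}, {'for': '[2001:db8::1]'}]
```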
ietfparse.headers.parse_http_accept_header(header_value)
Parse an HTTP accept-like header.
Parameters: header_value (str) – the header value to parse
Returns: a list of ContentType instances in decreasing quality order. Each instance is augmented with the associated quality as a float property named quality.
Accept is a class of headers that contain a list of values and an associated preference value. The ever-present Accept header is a perfect example. It is a list of content types and an optional parameter named q that indicates the relative weight of a particular type. The most basic example is:
Accept: audio/*;q=0.2, audio/basic
This states that the client prefers the audio/basic content type but will accept other audio subtypes at an 80% markdown.
Deprecated since version 1.3.0: Use parse_accept() instead.
ietfparse.headers.parse_link(header_value, strict=True)
Parse an HTTP Link header.
Returns: a sequence of LinkHeader instances
Raises: ietfparse.errors.MalformedLinkValue – if the specified header_value cannot be parsed
ietfparse.headers.parse_link_header(header_value, strict=True)
Parse an HTTP Link header.
Returns: a sequence of LinkHeader instances
Raises: ietfparse.errors.MalformedLinkValue – if the specified header_value cannot be parsed
Deprecated since version 1.3.0: Use parse_link() instead.
ietfparse.headers.parse_list(value)
Parse a comma-separated list header.
Parameters: value (str) – header value to split into elements
Returns: list of header elements as strings
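Splitting on commas is slightly subtle because header elements may be quoted strings that contain commas. The sketch below shows one way to handle that with a small state machine. It is an assumption about reasonable behavior, not the library's implementation: split_list is a hypothetical name, and it keeps the surrounding quotes on quoted elements.

```python
def split_list(value):
    """Split a comma-separated header value, ignoring commas inside quotes.

    Hypothetical helper for illustration; not part of ietfparse.
    """
    elements, current = [], []
    in_quotes = escaped = False
    for char in value:
        if escaped:
            # Character following a backslash inside a quoted string.
            current.append(char)
            escaped = False
        elif char == "\\" and in_quotes:
            current.append(char)
            escaped = True
        elif char == '"':
            in_quotes = not in_quotes
            current.append(char)
        elif char == "," and not in_quotes:
            elements.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    elements.append("".join(current).strip())
    # Drop empty elements produced by doubled or trailing commas.
    return [element for element in elements if element]


print(split_list('a, "b, c", d'))  # ['a', '"b, c"', 'd']
```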
ietfparse.headers.parse_list_header(value)
Parse a comma-separated list header.
Parameters: value (str) – header value to split into elements
Returns: list of header elements as strings
Deprecated since version 1.3.0: Use parse_list() instead.