API Reference

ietfparse.algorithms

Implementations of algorithms from various specifications.

This module implements some of the more interesting algorithms described in IETF RFCs.

ietfparse.algorithms.IDNA_SCHEMES

A collection of schemes that use IDN encoding for their host.

ietfparse.algorithms.rewrite_url(input_url, **kwargs)

Create a new URL from input_url with modifications applied.

Parameters:
  • input_url (str) – the URL to modify
  • fragment (str) – if specified, this keyword sets the fragment portion of the URL. A value of None will remove the fragment portion of the URL.
  • host (str) – if specified, this keyword sets the host portion of the network location. A value of None will remove the network location portion of the URL.
  • password (str) – if specified, this keyword sets the password portion of the URL. A value of None will remove the password from the URL.
  • path (str) – if specified, this keyword sets the path portion of the URL. A value of None will remove the path from the URL.
  • port (int) – if specified, this keyword sets the port portion of the network location. A value of None will remove the port from the URL.
  • query – if specified, this keyword sets the query portion of the URL. See the comments for a description of this parameter.
  • scheme (str) – if specified, this keyword sets the scheme portion of the URL. A value of None will remove the scheme. Note that this will make the URL relative and may have unintended consequences.
  • user (str) – if specified, this keyword sets the user portion of the URL. A value of None will remove the user and password portions.
  • enable_long_host (bool) – if this keyword is specified and it is True, then the host name length restriction from RFC 3986#section-3.2.2 is relaxed.
  • encode_with_idna (bool) – if this keyword is specified and it is True, then the host parameter will be encoded using IDN. If this value is provided as False, then the percent-encoding scheme is used instead. If this parameter is omitted or included with a different value, then the host parameter is processed using IDNA_SCHEMES.
Returns:

the modified URL

Raises:

ValueError – when a keyword parameter is given an invalid value

If the host parameter is specified and not None, then it will be processed as an Internationalized Domain Name (IDN) if the scheme appears in IDNA_SCHEMES. Otherwise, it will be encoded as UTF-8 and percent encoded.
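The two host encodings described above can be illustrated with the standard library alone. This is a minimal sketch, not the code that rewrite_url runs: the helper name encode_host and the contents of SKETCH_IDNA_SCHEMES are assumptions made for the example.

```python
from urllib.parse import quote

# Hypothetical stand-in for IDNA_SCHEMES; the real contents may differ.
SKETCH_IDNA_SCHEMES = {'http', 'https'}


def encode_host(host, scheme):
    """Encode host as IDN for listed schemes, else percent-encode UTF-8."""
    if scheme in SKETCH_IDNA_SCHEMES:
        return host.encode('idna').decode('ascii')
    return quote(host.encode('utf-8'))


print(encode_host('bücher.example', 'http'))    # IDN form
print(encode_host('bücher.example', 'mailto'))  # percent-encoded form
```

The first call yields the xn-- form of the name, while the second yields percent-encoded UTF-8 bytes.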

The handling of the query parameter requires some additional explanation. You can specify a query value in three different ways: as a mapping, as a sequence of pairs, or as a string. This flexibility makes it possible to meet a wide range of finicky use cases.

If the query parameter is a mapping, then the key/value pairs are sorted by key before they are encoded. Use this method whenever possible.

If the query parameter is a sequence of pairs, then each pair is encoded in the given order. Use this method if you need to control the parameter order.

If the query parameter is a string, then it is used as-is. This form SHOULD BE AVOIDED because no URL escaping is performed, so it can easily result in broken URLs. It exists as the plain pass-through case.
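The three query forms can be sketched with urllib.parse.urlencode. The helper encode_query is hypothetical and only mirrors the documented ordering rules. The exact escaping that rewrite_url applies (for example, how a space is encoded) may differ from what the standard library produces here.

```python
from collections.abc import Mapping
from urllib.parse import urlencode


def encode_query(query):
    """Hypothetical helper mirroring the three documented query forms."""
    if isinstance(query, str):
        return query  # used as-is; no escaping is applied
    if isinstance(query, Mapping):
        return urlencode(sorted(query.items()))  # sorted by key
    return urlencode(list(query))  # pairs kept in the given order


print(encode_query({'b': '2', 'a': '1 x'}))      # mapping: keys sorted
print(encode_query([('b', '2'), ('a', '1 x')]))  # pairs: order preserved
print(encode_query('b=2&a=1 x'))                 # string: passed through
```

Note that the string form leaves the space unescaped, which is exactly the kind of broken URL the warning above refers to.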

ietfparse.algorithms.select_content_type(requested, available)

Selects the best content type.

Parameters:
  • requested – a sequence of ContentType instances
  • available – a sequence of ContentType instances that the server is capable of producing
Returns:

the selected content type (from available) and the pattern that it matched (from requested)

Return type:

tuple of ContentType instances

Raises:

NoMatch – when a suitable match was not found

This function implements the Proactive Content Negotiation algorithm as described in sections 3.4.1 and 5.3 of RFC 7231. The input is the Accept header as parsed by parse_http_accept_header() and a list of parsed ContentType instances. The available sequence should be a sequence of content types that the server is capable of producing. The selected value should ultimately be used as the Content-Type header in the generated response.
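The idea behind proactive negotiation can be sketched in a few lines. This is a simplified illustration, not the library's implementation. It works on plain strings instead of ContentType instances, ignores media type parameters, and raises LookupError where the real function raises NoMatch. The names pattern_matches, specificity, and select are hypothetical.

```python
def pattern_matches(pattern, candidate):
    """True when a requested 'type/subtype' pattern covers a candidate."""
    p_type, p_sub = pattern.split('/')
    c_type, c_sub = candidate.split('/')
    return p_type in ('*', c_type) and p_sub in ('*', c_sub)


def specificity(pattern):
    """Rank exact types above 'type/*' above '*/*'."""
    return sum(part != '*' for part in pattern.split('/'))


def select(requested, available):
    """requested: list of (pattern, quality); available: list of types."""
    best = None
    for candidate in available:
        matches = [(specificity(p), q, p) for p, q in requested
                   if pattern_matches(p, candidate)]
        if not matches:
            continue
        # The most specific matching pattern decides the quality.
        _, quality, pattern = max(matches)
        if quality > 0 and (best is None or quality > best[0]):
            best = (quality, candidate, pattern)
    if best is None:
        raise LookupError('no acceptable content type')
    return best[1], best[2]


requested = [('text/html', 1.0), ('application/json', 0.8), ('*/*', 0.1)]
print(select(requested, ['application/xml', 'application/json']))
```

Here application/xml only matches the low-quality wildcard, so application/json is selected together with the pattern it matched.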

ietfparse.datastructures

Important data structures.

This module contains data structures that were useful in implementing this library. If a data structure might be useful outside of a particular piece of functionality, it is fully fleshed out and ends up here.

class ietfparse.datastructures.ContentType(content_type, content_subtype, parameters=None)

A MIME Content-Type header.

Parameters:
  • content_type (str) – the primary content type
  • content_subtype (str) – the content sub-type
  • parameters (dict) – optional dictionary of content type parameters

Internet content types are described by the Content-Type header from RFC 2045. It has been reused across many other protocol specifications, most notably HTTP (RFC 7231). This header’s syntax is described in RFC 2045#section-5.1. In its most basic form, a content type header looks like text/html. The primary content type is text with a subtype of html. Content type headers can include parameters as name=value pairs separated by semicolons.
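To make the header syntax concrete, the sketch below splits a header value into the three pieces that the ContentType constructor accepts. RFC 2045 section 5.1 separates each name=value parameter from the media type with a semicolon. The helper split_content_type is hypothetical and far less thorough than a real parser: it does not handle semicolons inside quoted values.

```python
def split_content_type(header_value):
    """Hypothetical helper: split into (type, subtype, parameters)."""
    media_type, *raw_params = header_value.split(';')
    content_type, _, content_subtype = media_type.strip().partition('/')
    parameters = {}
    for raw in raw_params:
        name, _, value = raw.strip().partition('=')
        if name:
            parameters[name.lower()] = value.strip('"')
    return content_type.lower(), content_subtype.lower(), parameters


print(split_content_type('text/html; charset=UTF-8'))
```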

class ietfparse.datastructures.LinkHeader(target, parameters=None)

Represents a single link within a Link header.

target

The target URL of the link. This may be a relative URL so the caller may have to make the link absolute by resolving it against a base URL as described in RFC 3986#section-5.

parameters

Possibly empty sequence of name and value pairs. Parameters are represented as a sequence since a single parameter may occur more than once.

The Link header is specified by RFC 5988. It is one of the methods used to represent hypermedia links between HTTP resources.
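The two attributes can be illustrated without the library. In this sketch, target and parameters are stand-in values shaped like the attributes described above, and base_url is a hypothetical request URL. Resolution uses urllib.parse.urljoin, which follows the RFC 3986 reference resolution rules.

```python
from urllib.parse import urljoin

# Stand-in values for LinkHeader.target and LinkHeader.parameters.
target = '../page/2'
parameters = [('rel', 'next'), ('hreflang', 'en'), ('hreflang', 'de')]

base_url = 'http://example.com/articles/list'  # hypothetical request URL
absolute = urljoin(base_url, target)
print(absolute)

# A sequence keeps repeated names; a dict would keep only the last one.
hreflangs = [value for name, value in parameters if name == 'hreflang']
print(hreflangs)
```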

ietfparse.errors

Exceptions raised from within ietfparse.

All exceptions are rooted at RootException, so you can catch it to implement error handling behavior associated with this library’s functionality.

exception ietfparse.errors.MalformedLinkValue

Value specified is not a valid link header.

exception ietfparse.errors.NoMatch

No match was found when selecting a content type.

exception ietfparse.errors.RootException

Root of the ietfparse exception hierarchy.

ietfparse.headers

Functions for parsing headers.

This module also defines classes that might be of some use outside of the module. They are not designed for direct usage unless otherwise mentioned.

ietfparse.headers.parse_accept(header_value)

Parse an HTTP accept-like header.

Parameters: header_value (str) – the header value to parse
Returns: a list of ContentType instances in decreasing quality order. Each instance is augmented with the associated quality as a float property named quality.

Accept is a class of headers that contain a list of values and an associated preference value. The ever-present Accept header is a perfect example. It is a list of content types and an optional parameter named q that indicates the relative weight of a particular type. The most basic example is:

Accept: audio/*;q=0.2, audio/basic

This states that the client prefers the audio/basic content type but will accept other audio sub-types with an 80% markdown in quality.
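The ordering by quality can be sketched as follows. The helper sort_by_quality is hypothetical. It returns plain (media range, quality) tuples rather than ContentType instances and ignores parameters other than q.

```python
def sort_by_quality(header_value):
    """Hypothetical helper: order media ranges by their q parameter."""
    entries = []
    for position, item in enumerate(header_value.split(',')):
        media_range, *params = [part.strip() for part in item.split(';')]
        quality = 1.0  # q defaults to 1 when absent
        for param in params:
            name, _, value = param.partition('=')
            if name.strip().lower() == 'q':
                quality = float(value)
        # Negate quality so that a plain ascending sort puts the best first;
        # position keeps the header order for equal qualities.
        entries.append((-quality, position, media_range))
    return [(media, -neg_q) for neg_q, _, media in sorted(entries)]


print(sort_by_quality('audio/*;q=0.2, audio/basic'))
```

For the example header, audio/basic sorts first with an implicit quality of 1.0, followed by audio/* at 0.2.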

ietfparse.headers.parse_accept_charset(header_value)

Parse the Accept-Charset header into a sorted list.

Parameters: header_value (str) – header value to parse
Returns: list of character sets sorted from highest to lowest priority

The Accept-Charset header is a list of character set names with optional quality values. The quality value indicates the strength of the preference where 1.0 is a strong preference and less than 0.001 is outright rejection by the client.

Note

Character sets are rejected by setting the quality value to less than 0.001. Rejected values are still returned: if a wildcard is included in the header, then it will appear BEFORE values that are rejected.
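One way to picture the resulting order is the sketch below. It is an assumption-laden illustration, not the library's code. The helper order_qualified is hypothetical, and placing the wildcard after every accepted value (regardless of its own quality) is an assumption. The text above only guarantees that the wildcard precedes rejected values.

```python
def order_qualified(header_value):
    """Hypothetical helper: accepted values, then '*', then rejected."""
    accepted, rejected, saw_wildcard = [], [], False
    for position, item in enumerate(header_value.split(',')):
        name, _, param = item.replace(' ', '').partition(';')
        quality = float(param[2:]) if param.startswith('q=') else 1.0
        if quality < 0.001:
            rejected.append(name)
        elif name == '*':
            saw_wildcard = True
        else:
            accepted.append((-quality, position, name))
    ordered = [name for _, _, name in sorted(accepted)]
    if saw_wildcard:
        ordered.append('*')
    return ordered + rejected


print(order_qualified('latin1;q=0, utf-8, *;q=0.5, ascii;q=0.8'))
```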

ietfparse.headers.parse_accept_encoding(header_value)

Parse the Accept-Encoding header into a sorted list.

Parameters: header_value (str) – header value to parse
Returns: list of encodings sorted from highest to lowest priority

The Accept-Encoding header is a list of encodings with optional quality values. The quality value indicates the strength of the preference where 1.0 is a strong preference and less than 0.001 is outright rejection by the client.

Note

Encodings are rejected by setting the quality value to less than 0.001. Rejected values are still returned: if a wildcard is included in the header, then it will appear BEFORE values that are rejected.

ietfparse.headers.parse_accept_language(header_value)

Parse the Accept-Language header into a sorted list.

Parameters: header_value (str) – header value to parse
Returns: list of languages sorted from highest to lowest priority

The Accept-Language header is a list of languages with optional quality values. The quality value indicates the strength of the preference where 1.0 is a strong preference and less than 0.001 is outright rejection by the client.

Note

Languages are rejected by setting the quality value to less than 0.001. Rejected values are still returned: if a wildcard is included in the header, then it will appear BEFORE values that are rejected.

ietfparse.headers.parse_cache_control(header_value)

Parse a Cache-Control header, returning a dictionary of key-value pairs.

Any Cache-Control directives that do not take a value, such as public or no-cache, are returned with a value of True if they are set in the header.
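The shape of the returned dictionary can be sketched like this. The helper sketch_cache_control is hypothetical and keeps directive values as strings. Whether parse_cache_control converts values such as max-age to integers is not stated here, so treat that detail as an assumption. The sketch also splits naively on commas, so quoted values containing commas are not handled.

```python
def sketch_cache_control(header_value):
    """Hypothetical helper: directives without a value map to True."""
    directives = {}
    for item in header_value.split(','):
        name, has_value, value = item.strip().partition('=')
        if name:
            directives[name.lower()] = value.strip('"') if has_value else True
    return directives


print(sketch_cache_control('public, max-age=3600, no-cache'))
```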

Parameters: header_value (str) – Cache-Control header value to parse
Returns: the parsed Cache-Control header values
Return type: dict

ietfparse.headers.parse_content_type(content_type, normalize_parameter_values=True)

Parse a content-type-like header.

Parameters:
  • content_type (str) – the string to parse as a content type
  • normalize_parameter_values (bool) – setting this to False will enable strict RFC 2045 compliance, in which content parameter values are case preserving.
Returns:

a ContentType instance

ietfparse.headers.parse_http_accept_header(header_value)

Parse an HTTP accept-like header.

Parameters: header_value (str) – the header value to parse
Returns: a list of ContentType instances in decreasing quality order. Each instance is augmented with the associated quality as a float property named quality.

Accept is a class of headers that contain a list of values and an associated preference value. The ever-present Accept header is a perfect example. It is a list of content types and an optional parameter named q that indicates the relative weight of a particular type. The most basic example is:

Accept: audio/*;q=0.2, audio/basic

This states that the client prefers the audio/basic content type but will accept other audio sub-types with an 80% markdown in quality.

Deprecated since version 1.3.0: Use parse_accept() instead.

ietfparse.headers.parse_link(header_value, strict=True)

Parse an HTTP Link header.

Parameters:
  • header_value (str) – the header value to parse
  • strict (bool) – set this to False to disable semantic checking. Syntactical errors will still raise an exception. Use this if you want to receive all parameters.
Returns:

a sequence of LinkHeader instances

Raises:

ietfparse.errors.MalformedLinkValue – if the specified header_value cannot be parsed
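For orientation, the sketch below shows the general shape of a Link header and of the parsed result: a target plus a sequence of name and value pairs per link. It is a deliberately naive illustration with a hypothetical helper named sketch_parse_link. It performs no semantic checking, raises no MalformedLinkValue, and assumes that parameter values contain no commas, semicolons, or angle brackets.

```python
import re


def sketch_parse_link(header_value):
    """Hypothetical helper: return a list of (target, parameters) tuples."""
    links = []
    # Each link is '<target>' followed by ';name=value' parameters.
    for match in re.finditer(r'<([^>]*)>([^<]*)', header_value):
        target, rest = match.group(1), match.group(2)
        params = []
        for piece in rest.split(';'):
            piece = piece.strip().rstrip(',').strip()
            if not piece:
                continue
            name, _, value = piece.partition('=')
            params.append((name.strip(), value.strip().strip('"')))
        links.append((target, params))
    return links


header = ('<http://example.com/p2>; rel="next", '
          '<http://example.com/p9>; rel="last"; title="End"')
for target, params in sketch_parse_link(header):
    print(target, params)
```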

ietfparse.headers.parse_link_header(header_value, strict=True)

Parse an HTTP Link header.

Parameters:
  • header_value (str) – the header value to parse
  • strict (bool) – set this to False to disable semantic checking. Syntactical errors will still raise an exception. Use this if you want to receive all parameters.
Returns:

a sequence of LinkHeader instances

Raises:

ietfparse.errors.MalformedLinkValue – if the specified header_value cannot be parsed

Deprecated since version 1.3.0: Use parse_link() instead.

ietfparse.headers.parse_list(value)

Parse a comma-separated list header.

Parameters: value (str) – header value to split into elements
Returns: list of header elements as strings

ietfparse.headers.parse_list_header(value)

Parse a comma-separated list header.

Parameters: value (str) – header value to split into elements
Returns: list of header elements as strings

Deprecated since version 1.3.0: Use parse_list() instead.
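Splitting a comma-separated header is subtler than str.split because quoted strings may contain commas. The standard library's urllib.request.parse_http_list shows the idea. Whether parse_list() treats quoted elements in exactly the same way is an assumption, so take this only as an illustration of the problem being solved.

```python
from urllib.request import parse_http_list

# The comma inside the quoted string does not split the element.
elements = parse_http_list('gzip, "quoted, value", br')
print(elements)
```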