Type evaluation¶
Type evaluation is a mechanism for replacing complex overloads and version checks. It provides a restricted subset of Python that can be executed by type checkers to customize the behavior of a particular function.
Motivation¶
Consider the
definition of round()
in typeshed:
@overload
def round(number: SupportsRound[Any]) -> int: ...
@overload
def round(number: SupportsRound[Any], ndigits: None) -> int: ...
@overload
def round(number: SupportsRound[_T], ndigits: SupportsIndex) -> _T: ...
With type evaluation, this could instead be written as:
@evaluated
def round(number: SupportsRound[_T], ndigits: SupportsIndex | None = None):
if ndigits is None:
return int
else:
return _T
This makes it easier to see at a glance what the difference is between various overloads.
Other features of type evaluation, as proposed here, include customizable error messages, branching on the type of an argument, and branching on whether an argument was provided as a positional or keyword argument.
Type evaluation functions can replace most complex overloads with simpler, more readable code. They solve a number of problems:
Type evaluation functions provide ways to implement several type system features that have been previously requested, including:
Marking a function, parameter, or parameter type as deprecated.
Accepting
Sequence[str]
but notstr
.Checking whether two generic arguments are overlapping
Error messages involving overloads are often hard to read. Type evaluation functions enable the author of a function to provide custom error messages that clearly point out the issue in the user’s code.
Complex overloads can be difficult to understand and write. Type evaluation functions provide a more natural interface that is closer to how such functions are written at runtime.
The precise behavior of overloads is not specified and varies across type checkers. The behavior of type evaluation functions is more precisely specified.
Specification¶
This section specifies how type evaluation works, without commentary. Discussion, with motivating use cases, is provided in the “Discussion” section below. The examples in this section are meant to clarify the semantics only.
Type-evaluated functions may be declared both at runtime and in stub files, similar to the existing @overload
mechanism. At runtime, the evaluated function must immediately precede
the function implementation:
@evaluated
def round(number: SupportsRound[_T], ndigits: SupportsIndex | None = None):
if ...
def round(number, ndigits=None):
return number.__round__(ndigits)
In stubs, the implementation is omitted.
When a type checker encounters a call to a function for which a type evaluation has been provided, it should do the following:
Validate that the arguments to the call are compatible with the type annotations on the parameters to the evaluation function, as with a normal call.
Symbolically evaluate the body of the type evaluation until it reaches a
return
statement, which provides the type that the call should return. During this symbolic evaluation, each argument is set to the value it has at the call site that is being evaluated.If execution reached a
return
statement, return the type provided by that statement. Otherwise, return the type set in the evaluation function’s return annotation, orAny
if there is no return annotation.
Type checkers are encouraged to provide a strictness option that produces an error if an evaluation function is missing a type annotation on a parameter or return type. However, no error should be provided if the return annotation is missing and all branches (including error branches) return a type.
The default value of a parameter to an evaluation function
may be either ...
or any value that is valid inside
Literal[...]
. If an argument with default X
is not
provided in a call, the type of the argument within the
evaluation function is Literal[X]
. If the default is
...
, the type is the parameter’s annotation instead.
Simple examples to demonstrate the semantics:
@evaluated
def always_returns(x: int):
return str
always_returns("x") # error: "x" is not an int
always_returns() # error: not enough arguments
reveal_type(always_returns(1)) # str
@evaluated
def always_errors(x: int):
show_error("error")
x = always_errors(1) # error
reveal_type(x) # Any
@evaluated
def always_errors_with_type(x: int) -> str:
show_error("error")
x = always_errors(1) # error
reveal_type(x) # str
@evaluated
def with_defaults(x: int = ..., y: int = 1) -> None:
reveal_type(x)
reveal_type(y)
with_defaults() # x is "int", y is "Literal[1]"
with_defaults(1) # x and y are both "Literal[1]"
Supported features¶
The body of a type evaluation uses a restricted subset of Python. The only supported features are:
if
statements andelse
blocks. These can only contain conditions of the form specified below.return
statements with return values that are interpretable as type annotations. This indicates the type that the function returns in a particular condition.pass
statements, which do nothing.Calls to
show_error()
, which cause the type checker to emit an error. These are discussed further below.Calls to
reveal_type(arg)
, where arg is one of the arguments to the type evaluation function. These cause the type checker to emit a message showing the current type ofarg
. This is a debugging feature.
Conditions in if
statements may contain:
A call to one of the following functions, which are covered in more detail below:
is_provided()
, which returns whether a parameter was explicitly provided in a call.is_positional()
, which returns whether a parameter was provided through a positional argument.is_keyword()
, which returns whether a parameter was provided through a keyword argument.is_of_type()
, which returns whether a parameter is of a particular type.
Expressions of the form
arg <op> <constant>
, where<op>
is one ofis
,is not
,==
, or!=
. This is equivalent to(not) is_of_type(arg, Literal[<constant>], exclude_any=True)
.<constant>
may be any value that is valid insideLiteral
(None
, a string, a bool, an int, or an enum member).Version and platform checks that are otherwise valid in stubs, as specified in PEP 484.
Multiple conditions combined with
and
oror
.A negation of another condition with
not
.
show_error()¶
The show_error()
special function has the following
signature:
def show_error(message: str, /, *, argument: Any | None = ...): ...
The message
parameter must be a string literal.
Calls to this function cause the type checker to emit an
error that includes the given message.
Execution continues past the show_error()
call as normal.
If the argument
parameter is provided, it must be one of
the parameters to the function, indicating the parameter that
is causing the error. The type checker may use this
information to produce a more precise error (for example, by
pointing the error caret at the specified argument in the
call site).
is_provided(), is_positional(), and is_keyword()¶
These special functions have the following signatures:
def is_provided(arg: Any, /) -> bool: ...
def is_positional(arg: Any, /) -> bool: ...
def is_keyword(arg: Any, /) -> bool: ...
arg
must be one of the parameters to the function.
is_provided()
returns True if the parameter was explicitly
provided in the call; that is, the default value was not
used. Similarly, is_positional()
returns True if the
parameter was provided as a positional argument, and
is_keyword()
returns True if the parameter was provided
as a keyword argument.
Parameters in Python can be provided in three ways, which we call argument kinds for the purpose of this specification:
POSITIONAL
: at the call site, either a single positional argument or a variadic one (*args
)KEYWORD
: at the call site, either a sinngle keyword argument or a variadic one (**kwargs
)DEFAULT
: no value provided at the call site; the default defined in the function is used
Static analyzers must add a fourth kind in the presence
of calls with *args
and **kwargs
:
UNKNOWN
: the kind cannot be statically determined. This can happen in the following situations:A positional-only parameter with a default in a call with
*args
of unknown size.A keyword-only parameter with a default in a call with
**kwargs
of unknown size.A positional-or-keyword parameter that matches either of the above conditions.
A positional-or-keyword parameter (with or without a default) in a call with both
*args
and**kwargs
.
The three special functions map to these kinds as follows:
is_provided()
: kind isPOSITIONAL
orKEYWORD
is_positional()
: kind isPOSITIONAL
is_keyword()
: kind isKEYWORD
Thus, there is no way to distinguish between DEFAULT
and UNKNOWN
, and a parameter for which is_provided()
returns False in the type evaluator may actually be
provided at runtime.
For variadic parameters (*args
and **kwargs
), the
kind is either DEFAULT
if no arguments are provided
to the parameter, or either POSITIONAL
(for *args
)
or KEYWORD
(for **kwargs
) if arguments may be
provided. If the type checker can
prove that a variadic argument is empty, is_provided()
may return False. (For example, given a definition
def f(*args)
and a call f(*())
, is_provided(args)
may return False.)
Examples:
@evaluated
def reject_arg(arg: int = 0) -> None:
if is_provided(arg):
show_error("error")
args: Any = ...
kwargs: Any = ...
reject_arg() # ok
reject_arg(0) # error
reject_arg(arg=0) # error
reject_arg(*args) # ok
reject_arg(**kwargs) # ok
@evaluated
def reject_star_args(*args: int) -> None:
if is_provided(args):
show_error("error")
reject_star_args() # ok
reject_star_args(1) # error
reject_star_args(*(1,)) # error
reject_star_args(*()) # may error, depending on type checker
@evaluated
def reject_star_kwargs(**kwargs: int) -> None:
if is_provided(kwargs):
show_error("error")
reject_star_kwargs() # ok
reject_star_kwargs(x=1) # error
reject_star_kwargs(**{"x": 1}) # error
reject_star_args(**{}) # may error, depending on type checker
@evaluated
def reject_keyword(arg: int = 0) -> None:
if is_keyword(arg):
show_error("error")
reject_keyword() # ok
reject_keyword(0) # ok
reject_keyword(arg=0) # error
reject_keyword(*args) # ok
reject_keyword(**kwargs) # ok
@evaluated
def reject_positional(arg: int = 0)-> None:
if is_positional(arg):
show_error("error")
reject_keyword() # ok
reject_keyword(0) # error
reject_keyword(arg=0) # ok
reject_keyword(*args) # ok
reject_keyword(**kwargs) # ok
@evaluated
def invalid(arg: object) -> None:
if is_provided(x): # error, not a function parameter
show_error("error")
is_of_type()¶
The special is_of_type()
function has the following
signature:
def is_oF_type(arg: object, type: Any, /, *, exclude_any: bool = True) -> bool: ...
arg
must be one of the parameters to the function and
type
must be a form that the type checker would accept
in a type annotation.
If exclude_any
is False, is_of_type(x, T)
returns true if x
is
compatible with T
; that is, if the type checker would
accept an assignment _: T = x
.
If the exclude_any
parameter is True (the default), normal type checking
rules are modified so that Any
is no longer compatible with
any other type, but only with another Any
. All other types
are still compatible with Any
.
Examples:
@evaluated
def length_or_none(s: str | None = None):
if is_of_type(s, str, exclude_any=False):
return int
else:
return None
any: Any = ...
opt: int | None = ...
reveal_type(length_or_none("x")) # int
reveal_type(length_or_none(None)) # None
reveal_type(length_or_none(opt)) # int | None
reveal_type(length_or_none(any)) # int
@evaluated
def length_or_none2(s: str | None):
if is_of_type(s, str):
return int
elif is_of_type(s, None):
return None
else:
return Any
reveal_type(length_or_none2("x")) # int
reveal_type(length_or_none2(None)) # None
reveal_type(length_or_none2(opt)) # int | None
reveal_type(length_or_none2(any)) # Any
@evaluated
def nested_any(s: Sequence[Any]):
if is_of_type(s, str):
show_error("error")
elif is_of_type(s, Sequence[str]):
return str
else:
return int
anyseq: Sequence[Any] = ...
nested_any("x") # error
reveal_type(nested_any(["x"])) # str
reveal_type(nested_any([1])) # int
reveal_type(nested_any(any)) # int
reveal_type(nested_any(anyseq)) # int
Interaction with unions¶
Type checkers should apply normal type narrowing rules to arguments that are of Union types. If only some members of a Union match a condition, both branches of the conditional are taken, with the parameter type narrowed appropriately in each case. The return type of the function is the union of the two branches.
For example:
@evaluated
def switch_types(arg: str | int):
if is_of_type(arg, str):
return int
else:
return str
reveal_type(switch_types(1)) # str
reveal_type(switch_types("x")) # int
union: int | str
reveal_type(switch_types(union)) # int | str
Generic evaluators¶
If any type variables appear in the parameters of the type evaluation function, the type checker should first solve those and use the solution in the body of the function:
@evaluated
def identity(x: T):
return T
reveal_type(evaluated(int())) # int
As a result, is_of_type()
checks that use a type variable work:
@evaluated
def safe_upcast(typ: Type[T1], value: object):
if is_of_type(value, T1):
return T1
show_error("unsafe cast")
return Any
reveal_type(safe_upcast(object, 1)) # object
reveal_type(safe_upcast(int, 1)) # int
safe_upcast(str, 1) # error
Type compatibility¶
The type of an evaluated function is compatible with a
Callable
with the same arguments and returning the
Union
of the possible return types, and with any
Callable
for which the evaluation function would
return a compatible type given the same arguments.
Examples:
@evaluated
def maybe_path(path: str | None):
if path is None:
return None
else:
return Path
_: Callable[[str | None], Path | None] = maybe_path # ok
_: Callable[[None], None] = maybe_path # ok
_: Callable[[str], Path] = maybe_path # ok
_: Callable[[str | None], Path] = maybe_path # error
_: Callable[[str], Path | None] = maybe_path # ok
_: Callable[[Literal["x"]], Path] = maybe_path # ok
Runtime behavior¶
At runtime, the @evaluated
decorator returns a dummy function
that throws an error when called, similar to @overload
. In
order to support dynamic type checkers, it also stores the
original function, keyed by its fully qualified name.
A helper function is provided to retrieve all registered evaluation functions for a given fully qualified name:
def get_type_evaluations(
fully_qualified_name: str
) -> Sequence[Callable[..., Any]]: ...
For example, if method B.c
in module a
has an evaluation function,
get_type_evaluations("a.B.c")
will retrieve it.
Dummy implementations are provided for the various helper
functions (is_provided()
, is_positional()
, is_keyword()
,
is_of_type()
, and show_error()
). These throw an error
if called at runtime.
The reveal_type()
function has a runtime implementation
that simply returns its argument.
Discussion¶
Interaction with Any¶
The below is an evaluation function for a simplified
version of the open()
builtin:
@evaluated
def open(mode: str):
if is_of_type(mode, Literal["r", "w"]):
return TextIO
elif is_of_type(mode, Literal["rb", "wb"]):
return BinaryIO
else:
return IO[Any]
What should open()
return if the type of the mode
argument is Any
? With the equivalent code expressed
using overloads, existing type checkers do not agree:
pyright picks the first overload that matches and returns
int
, since Any
is compatible with None
; mypy and pyanalyze
see that multiple overloads might match and return Any
.
There are good reasons for both choices,
as discussed here
by Eric Traut. In particular, mypy’s behavior is more sound
for a type checker, but pyright’s behavior helps generate
better autocompletion suggestions in a language server.
Type evaluation functions potentially have the same ambiguity, so in order to provide predictable behavior across type checkers, we need to specify a single behavior.
As specified above, our choice is to treat Any
specially
by default within evaluation functions, making it
incompatible with other types, both within is
/==
comparisons and within the is_of_type
primitive.
This behavior makes it easiest
to write evaluation functions that read naturally and
behave as desired. In particular, this choice makes
open(Any)
return IO[Any]
, which is both the most
intuitive and the most useful result.
The most natural alternative is to make is_of_type()
follow
normal type compatibility rules, where Any
is compatible
with everything. But this would create confusing behavior
for evaluation functions like the one for open()
:
open()
would returnTextIO
ifmode
isAny
, which is too precise in general.The order of the
BinaryIO
andTextIO
checks would matter: the function would behave differently if the two checks were flipped. This would be a subtle behavior that is not obvious to readers of the code.There would be no obvious way to provide a customized fallback behavior for
Any
. Technically, a check likeis_of_type(mode, Literal["r"]) and is_of_type(mode, Literal["w"])
could be used to check forAny
(onlyAny
is compatible with both literals), but this would be obscure and unreadable.It would be difficult to show an error for a particular parameter value. For example, a stub for
open()
might want to show a warning if the deprecatedrU
mode is used. The obvious way to do that would be to writeif mode == "rU": show_error(...)
, but if this returned true forAny
, we would show the error formode: Any
.
As an additional example, consider functions that take
some object or None
and return either None
or a
transformed version of the object, like this:
@evaluated
def maybe_path(path: str | None):
if path is None:
return None
else:
return Path
Functions of this form are fairly common, and it is
natural to write them with the trivial branch (None
) first,
both in the implementation and in the evaluation function.
But if path is None
would be true for Any
, the evaluation
function would return None
, which is bad both for type
checkers and for autocomplete suggestions.
One downside of this behavior is that type checkers may
incorrectly flag is None
checks after a maybe_path()
call as unreachable. However, such checks are usually only
enabled in a strict mode, and Any
should be rare in
strictly typed code. Type checkers could also provide a
mechanism that labels types derived from an evaluation
function that used Any
to disable diagnostics about
unreachable code.
Another alternative would be to use a mechanism similar to
mypy-style overload resolution: conditions that match due to
Any
would essentially match neither branch and simply
return Any
. This behavior would avoid returning any
overly precise types, but it would be useless for
autocompletion suggestions and would remove a lot of
useful type precision. For example, there would be no way
for the open()
evaluation function to produce IO[Any]
.
Argument kind functions¶
The three argument kind functions is_provided()
,
is_positional()
, and is_keyword()
are useful in various
ways:
Functions implemented in C sometimes change behavior depending on the presence of an argument, without a meaningful default. For example,
dict.pop(key)
returns the key’s value type (or else it raises an exception), butdict.pop(key, default)
returns either the value type or the type ofdefault
. Currently overloads are necessary to represent this behavior, butis_provided()
provides an alternative.It is common for new versions of Python to add or remove parameters. For example,
zip()
gained astrict=
keyword argument in Python 3.10. Usingis_provided()
with asys.version_info
check, we can provide an error if the parameter is used in an older version, without duplicating the entire function definition.Similarly, new versions of Python often change parameters from positional-or-keyword to positional-only or vice versa. Version checks can be used with
is_positional()
oris_keyword()
to reflect such changes in the stub.Library authors who want to evolve an API sometimes want to make a function parameter keyword-only. An evaluation function can be used to warn users who pass the parameter positionally without changing the runtime parameter kind, so that users have time to adapt before the runtime code is broken.
As an example, this is the current implementation of sum()
in typeshed:
if sys.version_info >= (3, 8):
@overload
def sum(__iterable: Iterable[_T]) -> _T | Literal[0]: ...
@overload
def sum(__iterable: Iterable[_T], start: _S) -> _T | _S: ...
else:
@overload
def sum(__iterable: Iterable[_T]) -> _T | Literal[0]: ...
@overload
def sum(__iterable: Iterable[_T], __start: _S) -> _T | _S: ...
This is how it could be implemented using @evaluated
:
@evaluated
def sum(__iterable: Iterable[_T], start: _S = ...):
if not is_provided(start):
return _T | Literal[0]
if sys.version_info < (3, 8) and is_keyword(start):
show_error("start is a positional-only argument in Python <3.8", argument=start)
return _T | _S
Generic evaluators¶
The specification for generic evaluators allows creating an evaluator that checks whether two types have any overlap:
T1 = TypeVar("T1")
T2 = TypeVar("T2")
@evaluated
def safe_contains(elt: T1, container: Container[T2]) -> bool:
if not is_of_type(elt, T2) and not is_of_type(container, Container[T1]):
show_error("Element cannot be a member of container")
lst: List[int]
safe_contains("x", lst) # error
safe_contains(True, lst) # ok (bool is a subclass of int)
safe_contains(object(), lst) # ok (List[int] is a subclass of Container[object])
Thus, type evaluation provides a way to implement checks similar to mypy’s strict equality flag directly in stubs.
Compatibility¶
The proposal is fully backward compatible.
Type evaluation functions are going to be most frequently useful in library stubs, where it is often important that multiple type checkers can parse the stub. In order to unblock usage of the new feature in stubs, type checker authors could simply ignore the body of evaluation functions and rely on the signature. This would still allow other type checkers to fully use the evaluation function.
Possible extensions¶
The following features may be useful, but are deferred for now for simplicity.
Error categories¶
It may be useful to provide hints to the type checker
about the severity of a show_error()
call. For example,
deprecation warnings could be marked so that the user can
control whether to show them.
One possibility is to add a keyword-only argument
category: str = ...
to show_error()
. We would specify
some standard categories that can be used in typeshed:
deprecation
(for deprecated behavior)python_version
(for wrong Python version)platform
(for wrong sys.platform)warning
(for miscellaneous non-blocking issues)
Type checkers could add support for additional categories as desired. Other type checkers would be expected to silently ignore unrecognized category strings.
Reusable error messages¶
Because show_error()
requires a string literal as the
message, typeshed would contain a lot of hardcoded string
messages about version changes.
Some possible solutions include:
Allow the message to be a variable of
Literal
type instead of a string literal. However, this would not allow customizing an error message to include e.g. the name of the argument or the Python version when some behavior changed.Allow the message to be a call to
.format()
on a string literal orLiteral
variable, where all the arguments are function arguments or literals:show_error(NEW_IN_VERSION.format(arg, "3.10"))
.Allow the message to be a call to another evaluation function that returns a string literal instead of a type. This would allow even more complex logic for emitting the error message.
The last option could look like this:
@evaluated
def added_in_py_version(feature: str, version: str):
return f"{feature} was added in Python {version}"
def zip(strict: bool = False):
if is_provided(strict) and sys.version_info < (3, 10):
show_error(
added_in_py_version("strict", "3.10"),
argument="strict"
)
Adding attributes¶
A common pattern in type checker plugins is for the plugin
to add some extra attribute to the object. For example,
@functools.total_ordering
inserts various dunder methods
into the class it decorates.
We could add an add_attributes()
primitive that given
a type and a dictionary of attributes, modifies the type
to add these attributes.
Usage could look like this:
@evaluated
def total_ordering(cls: Type[T]):
return add_attributes(
cls,
{"__eq__": Callable[[T, T], bool]}
)
Status¶
A partial implementation of this feature is available in pyanalyze:
from pyanalyze.extensions import evaluated, is_provided
@evaluated
def simple_evaluated(x: int, y: str = ""):
if is_provided(y):
return int
else:
return str
def simple_evaluated(*args: object) -> Union[int, str]:
if len(args) >= 2:
return 1
else:
return "x"
Currently unsupported features include:
Type compatibility for evaluated functions.
Overloaded evaluated functions.
Areas that need more thought include:
Interaction with overloads. It should be possible to register multiple evaluation functions for a function, treating them as overloads.
Interaction with
__init__
and self types. How does an eval function set the self type of a function? Perhaps we can have the return type have special meaning just for__init__
methods.