pyanalyze.stacked_scopes

Implementation of scope nesting in pyanalyze.

This module is responsible for mapping names to their values in pyanalyze. Variable lookup happens mostly through a series of nested dictionaries. When pyanalyze sees a reference to a name inside a nested function, it will first look at that function’s scope, then in the enclosing function’s scope, then in the module scope, and finally in the builtin scope containing Python builtins. Each of these scopes is represented as a Scope object, which by default is just a thin wrapper around a dictionary. However, function scopes are more complicated in order to track variable values accurately through control flow structures like if blocks. See the FunctionScope docstring for details.

Other subtleties implemented here include the handling of global variables (which are tracked separately from normal variables) and the skipping of class scopes in name lookup.

class pyanalyze.stacked_scopes.VisitorState

The phase of type checking. Pyanalyze first runs a collecting phase that collects name assignments and usages, and then a checking phase that reports errors (see StackedScopes.get()).

class pyanalyze.stacked_scopes.CompositeVariable(varname: str, attributes: Sequence[str | KnownValue])

A varname used to implement constraints on instance variables.

For example, access to self.x would make us use CompositeVariable('self', ('x',)). If a function contains a check for isinstance(self.x, int), we would put a Constraint on this CompositeVariable.

Also used for subscripts. Access to a[1] uses CompositeVariable('a', (KnownValue(1),)). These can be mixed: a[1].b corresponds to CompositeVariable('a', (KnownValue(1), 'b')).
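
As an illustration (a hedged sketch in the style of the Constraint example below; the expected narrowing is indicated in comments), such a constraint is what lets attribute access be narrowed:

from typing import Optional

class A:
    x: Optional[int]

    def f(self) -> None:
        # Access to self.x is tracked as CompositeVariable('self', ('x',)).
        if isinstance(self.x, int):
            # The isinstance() check puts a Constraint on that
            # CompositeVariable, so the attribute is narrowed here.
            reveal_type(self.x)  # int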

class pyanalyze.stacked_scopes.VarnameWithOrigin(varname: str, origin: frozenset[Optional[object]] = frozenset({None}), indices: collections.abc.Sequence[tuple[Union[str, pyanalyze.value.KnownValue], frozenset[Optional[object]]]] = ())
class pyanalyze.stacked_scopes.Composite(value: Value, varname: VarnameWithOrigin | None = None, node: AST | None = None)

A pyanalyze.value.Value with information about its origin. This is useful for setting constraints.

value: Value

Alias for field number 0

varname: VarnameWithOrigin | None

Alias for field number 1

node: AST | None

Alias for field number 2

class pyanalyze.stacked_scopes.AbstractConstraint

Base class for abstract constraints.

We distinguish between abstract and concrete constraints. Abstract constraints are collected from conditions, and may be null constraints, concrete constraints, or an AND or OR of other abstract constraints. When we add constraints to a scope, we apply the abstract constraints to produce a set of concrete constraints. For example, a null constraint produces no concrete constraints, and an AND constraint AND(C1, C2) produces both C1 and C2.

Concrete constraints are instances of the Constraint class.
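
As a minimal sketch (assuming pyanalyze is installed; the constructor signatures are those documented below), applying an AND constraint yields its constituent concrete constraints, while the null constraint yields none:

from pyanalyze.stacked_scopes import (
    NULL_CONSTRAINT,
    AndConstraint,
    Constraint,
    ConstraintType,
    VarnameWithOrigin,
)

x = VarnameWithOrigin("x")
is_truthy = Constraint(x, ConstraintType.is_truthy, True, None)
is_int = Constraint(x, ConstraintType.is_instance, True, int)

# AND(C1, C2) produces both C1 and C2 when applied.
print(list(AndConstraint((is_truthy, is_int)).apply()))

# The null constraint produces no concrete constraints.
print(list(NULL_CONSTRAINT.apply()))  # []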

apply() Iterable[Constraint]

Yields concrete constraints that are active when this constraint is applied.

invert() AbstractConstraint

Return an inverted version of this constraint.

class pyanalyze.stacked_scopes.Constraint(varname: VarnameWithOrigin | None, constraint_type: ConstraintType, positive: bool, value: Any, inverted: Constraint | None = None)

A constraint is a restriction on the value of a variable.

Constraints are tracked in scope objects, so that we know which constraints are active for a given usage of a variable.

For example:

def f(x: Optional[int]) -> None:
    reveal_type(x)  # Union[int, None]
    assert x
    # Now a constraint of type is_truthy is active. Because
    # None is not truthy, we know that x is of type int.
    reveal_type(x)  # int

varname: VarnameWithOrigin | None

The varname that the constraint applies to.

constraint_type: ConstraintType

Type of constraint. Determines the meaning of value.

positive: bool

Whether this is a positive constraint or not. For example, for an is_truthy constraint, if x: leads to a positive constraint and if not x: leads to a negative one.

value: Any

The type for an is_instance constraint; an object the variable is identical to (in the sense of the is operator) for is_value; unused for is_truthy; a pyanalyze.value.Value object for is_value_object.

apply_to_value(value: Value) Iterable[Value]

Yield values consistent with this constraint.

Produces zero or more values consistent both with the given value and with this constraint.

The value may not be a MultiValuedValue.
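
For example (a sketch under the same assumptions as above), an is_instance constraint keeps only values compatible with the given type:

from pyanalyze.stacked_scopes import Constraint, ConstraintType, VarnameWithOrigin
from pyanalyze.value import KnownValue

is_int = Constraint(VarnameWithOrigin("x"), ConstraintType.is_instance, True, int)
print(list(is_int.apply_to_value(KnownValue(3))))    # a single value compatible with int
print(list(is_int.apply_to_value(KnownValue("s"))))  # [] (inconsistent with the constraint)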

class pyanalyze.stacked_scopes.NullConstraint

Represents the absence of a constraint.

pyanalyze.stacked_scopes.NULL_CONSTRAINT = NullConstraint()

The single instance of NullConstraint.

class pyanalyze.stacked_scopes.PredicateProvider(varname: VarnameWithOrigin, provider: Callable[[Value], Value], value_transformer: Callable[[Value, type[AST], object], Value] | None = None)

A form of constraint implemented through a predicate on a value.

If a function returns a PredicateProvider, equality checks on the return value will produce a predicate constraint.

Consider the following code:

def two_lengths(tpl: Union[Tuple[int], Tuple[str, int]]) -> int:
    if len(tpl) == 1:
        return tpl[0]
    else:
        return tpl[1]

The impl for len() returns a PredicateProvider, with a provider attribute that returns the length of the object represented by a value. In turn, the equality check (== 1) produces a constraint of type predicate, which filters away any values that do not match the length of the object.

In this case, there are two values: a tuple of length 1 and one of length 2. Only the first matches the constraint, so the type is narrowed down to that tuple and the code typechecks correctly.

class pyanalyze.stacked_scopes.EquivalentConstraint(constraints: tuple[AbstractConstraint, ...])

Represents multiple constraints that are either all true or all false.

class pyanalyze.stacked_scopes.AndConstraint(constraints: tuple[AbstractConstraint, ...])

Represents the AND of two or more constraints.

class pyanalyze.stacked_scopes.OrConstraint(constraints: tuple[AbstractConstraint, ...])

Represents the OR of two or more constraints.

class pyanalyze.stacked_scopes.Scope(scope_type: ScopeType, variables: dict[str | CompositeVariable, Value] = <factory>, parent_scope: Scope | None = None, scope_node: object | None = None, scope_object: object | None = None, simplification_limit: int | None = None, declared_types: dict[str, tuple[Value | None, bool, AST]] = <factory>)

Represents a single level in the scope stack.

May be a builtin, module, class, or function scope.

add_constraint(abstract_constraint: AbstractConstraint, node: object, state: VisitorState) None

Constraints are ignored outside of function scopes.

scope_used_as_parent() Scope

Class scopes are skipped in scope lookup, so don’t set them as parent scopes.

class pyanalyze.stacked_scopes.FunctionScope(parent_scope: Scope, scope_node: object | None = None, simplification_limit: int | None = None)

Keeps track of the local variables of a single function.

FunctionScope is designed to produce the correct value for each variable at each point in the function, unlike the base Scope class, which assumes that each variable has the same value throughout the scope it represents.

For example, given the code:

x = 3
x = 4
print(x)

FunctionScope will infer the value of x to be KnownValue(4), but Scope will produce a pyanalyze.value.MultiValuedValue because it does not know whether the assignment of 3 or of 4 is active.

The approach taken is to map each usage node (a place where the variable is used) to a set of definition nodes (places where the variable is assigned to) that could be active when the variable is used. Each definition node is also mapped to the value assigned to the variable there.

For example, in the code:

x = 3  # (a)
print(x)  # (b)

(a) is the only definition node for the usage node at (b), and (a) is mapped to KnownValue(3), so at (b) x is inferred to be KnownValue(3).

However, in this code:

if some_condition():
    x = 3  # (a)
else:
    x = 4  # (b)
print(x)  # (c)

both (a) and (b) are possible definition nodes for the usage node at (c), so at (c) x is inferred to be a MultiValuedValue([KnownValue(3), KnownValue(4)]).

These mappings are implemented as the usage_to_definition_nodes and definition_node_to_value attributes on the FunctionScope object. They are created completely during the collecting phase. The basic mechanism uses the name_to_current_definition_nodes dictionary, which maps each local variable to a list of active definition nodes. When pyanalyze encounters an assignment, it updates name_to_current_definition_nodes to map to that assignment node, and when it encounters a variable usage it updates usage_to_definition_nodes to map that usage to the current definition nodes in name_to_current_definition_nodes. For example:

# name_to_current_definition_nodes (n2cdn) = {}, usage_to_definition_nodes (u2dn) = {}
x = 3  # (a)
# n2cdn = {'x': [(a)]}, u2dn = {}
print(x)  # (b)
# n2cdn = {'x': [(a)]}, u2dn = {(b): [(a)]}
x = 4  # (c)
# n2cdn = {'x': [(c)]}, u2dn = {(b): [(a)]}
print(x)  # (d)
# n2cdn = {'x': [(c)]}, u2dn = {(b): [(a)], (d): [(c)]}

However, this simple approach is not sufficient to handle control flow inside the function. To handle this case, FunctionScope supports the creation of subscopes and the combine_subscopes operation. Each branch in a conditional statement is mapped to a separate subscope, which contains an independently updated copy of name_to_current_definition_nodes. After pyanalyze visits all branches, it runs the combine_subscopes operation on all of the branches’ subscopes. This operation takes, for each variable, the union of the definition nodes created in all of the branches. For example:

# n2cdn = {}, u2dn = {}
if some_condition():
    # subscope 1
    x = 3  # (a)
    print(x)  # (b)
    # n2cdn = {'x': [(a)]}, u2dn = {(b): [(a)]}
else:
    # subscope 2
    x = 4  # (c)
    print(x)  # (d)
    # n2cdn = {'x': [(c)]}, u2dn = {(b): [(a)], (d): [(c)]}
# combine_subscopes([subscope 1, subscope 2]) happens
# n2cdn = {'x': [(a), (c)]}, u2dn = {(b): [(a)], (d): [(c)]}
print(x)  # (e)
# n2cdn = {'x': [(a), (c)]}, u2dn = {(b): [(a)], (d): [(c)], (e): [(a), (c)]}

This model applies most cleanly to if blocks, but try-except can also be analyzed using this approach. Loops are more complicated, because variable usages sometimes need to be mapped to definition nodes later in the same loop body. For example, in code like this:

x = None
for _ in (1, 2):
    if x:
        print(x[1])  # (a)
    else:
        x = (1, 2)  # (b)

a naive approach would infer that x is None at (a). To take care of this case, pyanalyze visits the loop body twice during the collecting phase, so that usage_to_definition_nodes can add a mapping of (a) to (b). To handle break and continue correctly, it also uses a separate “loop scope” that ends up combining the scopes created by normal control flow through the body of the loop and by each break and continue statement.

Try-finally blocks are handled by visiting the finally block twice. Essentially, we treat:

try:
    TRY-BODY
finally:
    FINALLY-BODY
REST-OF-FUNCTION

as equivalent to:

if <empty>:
    FINALLY-BODY
    return
else:
    TRY-BODY
    FINALLY-BODY
    REST-OF-FUNCTION

This correctly expresses that variables used in the FINALLY-BODY can have either the values set in the TRY-BODY or those set before the try-finally. It does not express that the TRY-BODY may have been interrupted at any point, but that does not matter for our purposes. It has the disadvantage that the finally body is visited twice, which may cause some errors to be reported twice.

A similar approach is used to handle loops, where the body of the loop may not be executed at all. A for loop of the form:

for TARGET in ITERABLE:
    FOR-BODY
else:
    ELSE-BODY

is treated like:

if <empty>:
    TARGET = next(iter(ITERABLE))
    FOR-BODY
else:
    ELSE-BODY

Special logic is also needed to take care of globals (which are tracked separately from normal variables) and variable lookups without a node context. For the latter, the name_to_all_definition_nodes dictionary maps each variable name to all of its possible definition nodes.

add_constraint(abstract_constraint: AbstractConstraint, node: object, state: VisitorState) None

Add a new constraint.

Constraints are represented as assignments of fake values, which are _ConstrainedValue objects. These contain a constraint and a set of definition nodes where the unconstrained variable could have been defined. When we try to retrieve the value of a variable, we look at the values in each of the definition nodes and at the constraint.

The node argument may be any unique key, although it will usually be an AST node.

get_all_definition_nodes() dict[str | CompositeVariable, set[object]]

Return a copy of name_to_all_definition_nodes.

suppressing_subscope() Iterator[dict[str | CompositeVariable, list[object]]]

A suppressing subscope is a subscope that may suppress exceptions inside of it.

This is used to implement try and with blocks. After code like this:

x = 1
try:
    x = 2
    x = 3
except Exception:
    pass

The value of x may be 1, 2, or 3, depending on whether and where an exception was thrown.

To implement this, we keep track of all assignments inside the block and give them effect, so that after the suppressing subscope ends, each variable’s definition nodes include all of these assignments.

subscope() Iterator[dict[str | CompositeVariable, list[object]]]

Create a new subscope, to be used for conditional branches.

class pyanalyze.stacked_scopes.StackedScopes(module_vars: dict[str, Value], module: ModuleType | None, *, simplification_limit: int | None = None)

Represents the stack of scopes in which Python searches for variables.

add_scope(scope_type: ScopeType, scope_node: object, scope_object: object | None = None) Iterator[None]

Context manager that adds a scope of this type to the top of the stack.

ignore_topmost_scope() Iterator[None]

Context manager that temporarily ignores the topmost scope.

allow_only_module_scope() Iterator[None]

Context manager that allows only lookups in the module and builtin scopes.

get(varname: str | CompositeVariable, node: object, state: VisitorState) Value

Gets a variable of the given name from the current scope stack.

Parameters:
  • varname (Varname) – varname of the variable to retrieve

  • node (Node) – AST node corresponding to the place where the variable lookup is happening. FunctionScope uses this to decide which definition of the variable to use; other scopes ignore it. It can be passed as None to indicate that any definition may be used. This is used, among other things, when looking up names in outer scopes. Although this argument should normally be an AST node, it can be any unique, hashable identifier, because sometimes a single AST node sets multiple variables (e.g. in ImportFrom nodes).

  • state (VisitorState) – The current VisitorState. Pyanalyze runs the collecting phase to collect all name assignments and map name usages to their corresponding assignments, and then the checking phase to locate any errors in the code.

Returns pyanalyze.value.UNINITIALIZED_VALUE if the name is not defined in any known scope.

get_with_scope(varname: str | CompositeVariable, node: object, state: VisitorState) tuple[Value, Scope | None, frozenset[object | None]]

Like get(), but also returns the scope object the name was found in.

Returns a (pyanalyze.value.Value, Scope, origin) tuple. The Scope is None if the name was not found.

get_nonlocal_scope(varname: str | CompositeVariable, using_scope: Scope) Scope | None

Gets the defining scope of a non-local variable.

set(varname: str | CompositeVariable, value: Value, node: object, state: VisitorState) None

Records an assignment to this variable.

value is the value that is being assigned to varname. The other arguments are the same as those of get().

subscope() AbstractContextManager[dict[str | CompositeVariable, list[object]]]

Creates a new subscope (see the FunctionScope docstring).

loop_scope() AbstractContextManager[list[dict[str | CompositeVariable, list[object]]]]

Creates a new loop scope (see the FunctionScope docstring).

combine_subscopes(scopes: Iterable[dict[str | CompositeVariable, list[object]]], *, ignore_leaves_scope: bool = False) None

Merges a number of subscopes back into their parent scope.

scope_type() ScopeType

Returns the type of the current scope.

current_scope() Scope

Returns the current scope dictionary.

module_scope() Scope

Returns the module scope of the current scope.

contains_scope_of_type(scope_type: ScopeType) bool

Returns whether any scope in the stack is of this type.

is_nested_function() bool

Returns whether we’re currently in a nested function.
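
To tie these methods together, here is a minimal usage sketch (assuming pyanalyze is installed; the VisitorState and ScopeType member names are assumptions based on the phases and scope kinds described above):

from pyanalyze.stacked_scopes import ScopeType, StackedScopes, VisitorState
from pyanalyze.value import KnownValue, UNINITIALIZED_VALUE

scopes = StackedScopes({"MODULE_CONSTANT": KnownValue(3)}, module=None)
state = VisitorState.check_names  # assumed member name for the checking phase

# Module-level variables passed to the constructor are visible.
print(scopes.get("MODULE_CONSTANT", None, state))

# Names not defined in any scope resolve to UNINITIALIZED_VALUE.
print(scopes.get("missing", None, state) is UNINITIALIZED_VALUE)  # True

# Assignments recorded inside a function scope are visible within it.
with scopes.add_scope(ScopeType.function_scope, scope_node=None):
    scopes.set("x", KnownValue(4), object(), state)  # any unique node works
    print(scopes.get("x", None, state))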

pyanalyze.stacked_scopes.constrain_value(value: Value, constraint: AbstractConstraint, *, simplification_limit: int | None = None) Value

Create a version of this value with the constraint applied.
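
For example (same assumptions as in the sketches above; unite_values is assumed to be the pyanalyze.value helper for building a union), a truthiness constraint narrows away None:

from pyanalyze.stacked_scopes import (
    Constraint,
    ConstraintType,
    VarnameWithOrigin,
    constrain_value,
)
from pyanalyze.value import KnownValue, unite_values

maybe_none = unite_values(KnownValue(1), KnownValue(None))
truthy = Constraint(VarnameWithOrigin("x"), ConstraintType.is_truthy, True, None)
print(constrain_value(maybe_none, truthy))  # only the value for 1 remains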

pyanalyze.stacked_scopes.uniq_chain(iterables: Iterable[Iterable[T]]) list[T]

Returns a flattened list, collapsing equal elements but preserving order.
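
A small example of the documented behavior:

from pyanalyze.stacked_scopes import uniq_chain

print(uniq_chain([[1, 2], [2, 3], [1]]))  # [1, 2, 3]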