pyanalyze.stacked_scopes¶
Implementation of scope nesting in pyanalyze.
This module is responsible for mapping names to their values in pyanalyze. Variable lookup happens
mostly through a series of nested dictionaries. When pyanalyze sees a reference to a name inside a
nested function, it will first look at that function’s scope, then in the enclosing function’s
scope, then in the module scope, and finally in the builtin scope containing Python builtins. Each
of these scopes is represented as a Scope object, which by default is just a thin
wrapper around a dictionary. However, function scopes are more complicated in order to track
variable values accurately through control flow structures like if blocks. See the
FunctionScope docstring for details.
Other subtleties implemented here:
- Multiple assignments to the same name result in pyanalyze.value.MultiValuedValue.
- Globals are represented as pyanalyze.value.ReferencingValue, and name lookups for such names are delegated to the pyanalyze.value.ReferencingValue’s scope.
- Class scopes except the current one are skipped in name lookup.
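The lookup order described above can be sketched with a toy scope stack. This is only an illustration under stated assumptions: ToyScope, toy_lookup, and the UNINITIALIZED sentinel are hypothetical stand-ins and are not part of pyanalyze.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical sentinel standing in for "name not defined in any known scope".
UNINITIALIZED = object()


@dataclass
class ToyScope:
    """Toy stand-in for a scope: a thin wrapper around a dictionary."""

    scope_type: str  # one of "builtin", "module", "class", "function"
    variables: Dict[str, object] = field(default_factory=dict)


def toy_lookup(stack: List[ToyScope], name: str) -> Tuple[object, Optional[ToyScope]]:
    """Search from the innermost scope outwards.

    Class scopes other than the current (innermost) scope are skipped.
    """
    current = stack[-1]
    for scope in reversed(stack):
        if scope.scope_type == "class" and scope is not current:
            continue
        if name in scope.variables:
            return scope.variables[name], scope
    return UNINITIALIZED, None


stack = [
    ToyScope("builtin", {"len": len}),
    ToyScope("module", {"x": "module x"}),
    ToyScope("class", {"x": "class x"}),
    ToyScope("function", {}),
]
```

From inside the function scope, toy_lookup(stack, "x") skips the enclosing class scope and finds the module-level binding. When the class scope itself is the innermost scope (stack[:3]), its own binding is found first.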
- class pyanalyze.stacked_scopes.VisitorState(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)¶
The phase of type checking.
- class pyanalyze.stacked_scopes.CompositeVariable(varname: str, attributes: Sequence[str | KnownValue])¶
A varname used to implement constraints on instance variables.
For example, access to self.x would make us use CompositeVariable('self', ('x',)). If a function contains a check for isinstance(self.x, int), we would put a Constraint on this CompositeVariable.
Also used for subscripts. Access to a[1] uses CompositeVariable('a', (KnownValue(1),)). These can be mixed: a[1].b corresponds to CompositeVariable('a', (KnownValue(1), 'b')).
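As a rough illustration of the mapping described above, the following sketch turns an expression string into a (varname, attributes) pair using the standard ast module. KnownValueStub, CompositeVariableStub, and to_composite are hypothetical names made up for this example; the sketch assumes Python 3.9 or later (where a subscript's slice is the index expression itself) and does not show how pyanalyze builds these objects.

```python
import ast
from typing import NamedTuple, Optional, Tuple, Union


class KnownValueStub(NamedTuple):
    """Hypothetical stand-in for pyanalyze.value.KnownValue."""

    val: object


class CompositeVariableStub(NamedTuple):
    """Hypothetical stand-in for CompositeVariable."""

    varname: str
    attributes: Tuple[Union[str, KnownValueStub], ...]


def to_composite(source: str) -> Optional[CompositeVariableStub]:
    """Map expressions such as `self.x` or `a[1].b` to a composite varname."""
    node = ast.parse(source, mode="eval").body
    parts = []
    # Peel attribute accesses and constant subscripts off, outermost first.
    while True:
        if isinstance(node, ast.Attribute):
            parts.append(node.attr)
            node = node.value
        elif isinstance(node, ast.Subscript) and isinstance(node.slice, ast.Constant):
            parts.append(KnownValueStub(node.slice.value))
            node = node.value
        else:
            break
    # Only a plain name at the root with at least one access qualifies.
    if isinstance(node, ast.Name) and parts:
        return CompositeVariableStub(node.id, tuple(reversed(parts)))
    return None
```

Expressions that are not rooted in a plain name, such as a call result, yield None in this sketch.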
- class pyanalyze.stacked_scopes.VarnameWithOrigin(varname: str, origin: frozenset[Optional[object]] = frozenset({None}), indices: collections.abc.Sequence[tuple[Union[str, pyanalyze.value.KnownValue], frozenset[Optional[object]]]] = ())¶
- class pyanalyze.stacked_scopes.Composite(value: Value, varname: VarnameWithOrigin | None = None, node: AST | None = None)¶
A pyanalyze.value.Value with information about its origin. This is useful for setting constraints.
- varname: VarnameWithOrigin | None¶
Alias for field number 1
- node: AST | None¶
Alias for field number 2
- class pyanalyze.stacked_scopes.AbstractConstraint¶
Base class for abstract constraints.
We distinguish between abstract and concrete constraints. Abstract constraints are collected from conditions, and may be null constraints, concrete constraints, or an AND or OR of other abstract constraints. When we add constraints to a scope, we apply the abstract constraints to produce a set of concrete constraints. For example, a null constraint produces no concrete constraints, and an AND constraint AND(C1, C2) produces both C1 and C2.
Concrete constraints are instances of the Constraint class.
- apply() Iterable[Constraint] ¶
Yields concrete constraints that are active when this constraint is applied.
- invert() AbstractConstraint ¶
Return an inverted version of this constraint.
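The distinction between abstract and concrete constraints can be sketched with a small toy hierarchy. All class names below are hypothetical. The null and AND behavior follows the description above; inverting an AND into an OR (De Morgan) and having the toy OR yield no concrete constraints are assumptions of this sketch, not statements about pyanalyze's implementation.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


class ToyAbstract:
    """Toy base class mirroring the apply()/invert() interface."""

    def apply(self) -> Iterable["ToyConcrete"]:
        raise NotImplementedError

    def invert(self) -> "ToyAbstract":
        raise NotImplementedError


@dataclass(frozen=True)
class ToyConcrete(ToyAbstract):
    varname: str
    kind: str  # e.g. "is_truthy"
    positive: bool

    def apply(self):
        yield self

    def invert(self):
        return ToyConcrete(self.varname, self.kind, not self.positive)


@dataclass(frozen=True)
class ToyNull(ToyAbstract):
    def apply(self):
        return iter(())  # a null constraint produces no concrete constraints

    def invert(self):
        return self


@dataclass(frozen=True)
class ToyAnd(ToyAbstract):
    constraints: Tuple[ToyAbstract, ...]

    def apply(self):
        # AND(C1, C2) produces both C1 and C2.
        for constraint in self.constraints:
            yield from constraint.apply()

    def invert(self):
        # Assumption of this sketch: NOT(A and B) == (NOT A) or (NOT B).
        return ToyOr(tuple(c.invert() for c in self.constraints))


@dataclass(frozen=True)
class ToyOr(ToyAbstract):
    constraints: Tuple[ToyAbstract, ...]

    def apply(self):
        # Conservative simplification: the toy OR narrows nothing.
        return iter(())

    def invert(self):
        return ToyAnd(tuple(c.invert() for c in self.constraints))
```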
- class pyanalyze.stacked_scopes.Constraint(varname: VarnameWithOrigin | None, constraint_type: ConstraintType, positive: bool, value: Any, inverted: Constraint | None = None)¶
A constraint is a restriction on the value of a variable.
Constraints are tracked in scope objects, so that we know which constraints are active for a given usage of a variable.
For example:
    def f(x: Optional[int]) -> None:
        reveal_type(x)  # Union[int, None]
        assert x
        # Now a constraint of type is_truthy is active. Because
        # None is not truthy, we know that x is of type int.
        reveal_type(x)  # int
- varname: VarnameWithOrigin | None¶
The varname that the constraint applies to.
- positive: bool¶
Whether this is a positive constraint or not. For example, for an is_truthy constraint, if x would lead to a positive and if not x to a negative constraint.
- value: Any¶
Type for an is_instance constraint; value identical to the variable for is_value; unused for is_truthy; pyanalyze.value.Value object for is_value_object.
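To make the is_truthy example for Constraint concrete, here is a minimal sketch that models a union as a tuple of Python types and narrows it. The function apply_is_truthy is hypothetical, and the model is deliberately crude: it only knows that None is never truthy.

```python
from typing import Tuple

NoneType = type(None)


def apply_is_truthy(members: Tuple[type, ...], positive: bool) -> Tuple[type, ...]:
    """Narrow a toy union (a tuple of types) with an is_truthy constraint."""
    if positive:
        # `if x` / `assert x`: None is never truthy, so it drops out.
        return tuple(t for t in members if t is not NoneType)
    # `if not x`: both None and a falsy int such as 0 remain possible,
    # so this toy model does not narrow at all.
    return members
```

With members (int, NoneType), the positive constraint leaves only int, matching the reveal_type(x) result in the Constraint example, while the negative constraint leaves the union unchanged in this model.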
- class pyanalyze.stacked_scopes.NullConstraint¶
Represents the absence of a constraint.
- pyanalyze.stacked_scopes.NULL_CONSTRAINT = NullConstraint()¶
The single instance of NullConstraint.
- class pyanalyze.stacked_scopes.PredicateProvider(varname: VarnameWithOrigin, provider: Callable[[Value], Value], value_transformer: Callable[[Value, type[AST], object], Value] | None = None)¶
A form of constraint implemented through a predicate on a value.
If a function returns a PredicateProvider, equality checks on the return value will produce a predicate constraint.
Consider the following code:
    def two_lengths(tpl: Union[Tuple[int], Tuple[str, int]]) -> int:
        if len(tpl) == 1:
            return tpl[0]
        else:
            return tpl[1]
The impl for len() returns a PredicateProvider, with a provider attribute that returns the length of the object represented by a value. In turn, the equality check (== 1) produces a constraint of type predicate, which filters away any values that do not match the length of the object.
In this case, there are two values: a tuple of length 1 and one of length 2. Only the first matches the constraint, so the type is narrowed down to that tuple and the code typechecks correctly.
- class pyanalyze.stacked_scopes.EquivalentConstraint(constraints: tuple[AbstractConstraint, ...])¶
Represents multiple constraints that are either all true or all false.
- class pyanalyze.stacked_scopes.AndConstraint(constraints: tuple[AbstractConstraint, ...])¶
Represents the AND of multiple constraints.
- class pyanalyze.stacked_scopes.OrConstraint(constraints: tuple[AbstractConstraint, ...])¶
Represents the OR of multiple constraints.
- class pyanalyze.stacked_scopes.Scope(scope_type: ~pyanalyze.stacked_scopes.ScopeType, variables: dict[str | ~pyanalyze.stacked_scopes.CompositeVariable, ~pyanalyze.value.Value] = <factory>, parent_scope: ~pyanalyze.stacked_scopes.Scope | None = None, scope_node: object | None = None, scope_object: object | None = None, simplification_limit: int | None = None, declared_types: dict[str, tuple[~pyanalyze.value.Value | None, bool, ~ast.AST]] = <factory>)¶
Represents a single level in the scope stack.
May be a builtin, module, class, or function scope.
- add_constraint(abstract_constraint: AbstractConstraint, node: object, state: VisitorState) None ¶
Constraints are ignored outside of function scopes.
- class pyanalyze.stacked_scopes.FunctionScope(parent_scope: Scope, scope_node: object | None = None, simplification_limit: int | None = None)¶
Keeps track of the local variables of a single function.
FunctionScope is designed to produce the correct value for each variable at each point in the function, unlike the base Scope class, which assumes that each variable has the same value throughout the scope it represents.
For example, given the code:
    x = 3
    x = 4
    print(x)
FunctionScope will infer the value of x to be KnownValue(4), but Scope will produce a pyanalyze.value.MultiValuedValue because it does not know whether the assignment to 3 or 4 is active.
The approach taken is to map each usage node (a place where the variable is used) to a set of definition nodes (places where the variable is assigned to) that could be active when the variable is used. Each definition node is also mapped to the value assigned to the variable there.
For example, in the code:
    x = 3  # (a)
    print(x)  # (b)
(a) is the only definition node for the usage node at (b), and (a) is mapped to KnownValue(3), so at (b) x is inferred to be KnownValue(3).
However, in this code:
    if some_condition():
        x = 3  # (a)
    else:
        x = 4  # (b)
    print(x)  # (c)
both (a) and (b) are possible definition nodes for the usage node at (c), so at (c) x is inferred to be a MultiValuedValue([KnownValue(3), KnownValue(4)]).
These mappings are implemented as the usage_to_definition_nodes and definition_node_to_value attributes on the FunctionScope object. They are created completely during the collecting phase. The basic mechanism uses the name_to_current_definition_nodes dictionary, which maps each local variable to a list of active definition nodes. When pyanalyze encounters an assignment, it updates name_to_current_definition_nodes to map to that assignment node, and when it encounters a variable usage it updates usage_to_definition_nodes to map that usage to the current definition nodes in name_to_current_definition_nodes. For example:
    # name_to_current_definition_nodes (n2cdn) = {}, usage_to_definition_nodes (u2dn) = {}
    x = 3  # (a)
    # n2cdn = {'x': [(a)]}, u2dn = {}
    print(x)  # (b)
    # n2cdn = {'x': [(a)]}, u2dn = {(b): [(a)]}
    x = 4  # (c)
    # n2cdn = {'x': [(c)]}, u2dn = {(b): [(a)]}
    print(x)  # (d)
    # n2cdn = {'x': [(c)]}, u2dn = {(b): [(a)], (d): [(c)]}
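The bookkeeping for straight-line code can be reproduced with a few dictionaries. The class below is a toy sketch, not pyanalyze's FunctionScope: the attribute names are borrowed from the description above, while ToyFunctionScope, its set and use methods, and the use of strings as node identifiers are assumptions made for illustration.

```python
from typing import Dict, List


class ToyFunctionScope:
    """Toy model of definition-node tracking for straight-line code."""

    def __init__(self) -> None:
        self.name_to_current_definition_nodes: Dict[str, List[str]] = {}
        self.usage_to_definition_nodes: Dict[str, List[str]] = {}
        self.definition_node_to_value: Dict[str, object] = {}

    def set(self, name: str, value: object, node: str) -> None:
        # An assignment replaces the active definition nodes for the name.
        self.definition_node_to_value[node] = value
        self.name_to_current_definition_nodes[name] = [node]

    def use(self, name: str, node: str) -> None:
        # A usage records whichever definition nodes are currently active.
        current = self.name_to_current_definition_nodes.get(name, [])
        self.usage_to_definition_nodes[node] = list(current)

    def values_at(self, usage_node: str) -> List[object]:
        # Possible values at a usage: the values of its definition nodes.
        return [
            self.definition_node_to_value[definition]
            for definition in self.usage_to_definition_nodes[usage_node]
        ]


scope = ToyFunctionScope()
scope.set("x", 3, "a")  # x = 3     (a)
scope.use("x", "b")     # print(x)  (b)
scope.set("x", 4, "c")  # x = 4     (c)
scope.use("x", "d")     # print(x)  (d)
```

After these four calls, usage (b) maps to definition (a) and usage (d) maps to definition (c), mirroring the final comment in the walkthrough above.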
However, this simple approach is not sufficient to handle control flow inside the function. To handle this case, FunctionScope supports the creation of subscopes and the combine_subscopes operation. Each branch in a conditional statement is mapped to a separate subscope, which contains an independently updated copy of name_to_current_definition_nodes. After pyanalyze visits all branches, it runs the combine_subscopes operation on all of the branches’ subscopes. This operation takes, for each variable, the union of the definition nodes created in all of the branches. For example:
    # n2cdn = {}, u2dn = {}
    if some_condition():
        # subscope 1
        x = 3  # (a)
        print(x)  # (b)
        # n2cdn = {'x': [(a)]}, u2dn = {(b): [(a)]}
    else:
        # subscope 2
        x = 4  # (c)
        print(x)  # (d)
        # n2cdn = {'x': [(c)]}, u2dn = {(b): [(a)], (d): [(c)]}
    # combine_subscopes([subscope 1, subscope 2]) happens
    # n2cdn = {'x': [(a), (c)]}, u2dn = {(b): [(a)], (d): [(c)]}
    print(x)  # (e)
    # n2cdn = {'x': [(a), (c)]}, u2dn = {(b): [(a)], (d): [(c)], (e): [(a), (c)]}
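A toy version of subscopes and combine_subscopes might look like the sketch below. ToyBranchingScope is hypothetical, node identifiers are plain strings, and the sketch ignores one case: a variable assigned in only one branch and undefined before the conditional is treated as if it were always defined.

```python
from contextlib import contextmanager
from typing import Dict, Iterable, Iterator, List

NodeMap = Dict[str, List[str]]


class ToyBranchingScope:
    """Toy model of per-branch definition tracking."""

    def __init__(self) -> None:
        self.n2cdn: NodeMap = {}  # name_to_current_definition_nodes
        self.u2dn: NodeMap = {}   # usage_to_definition_nodes

    def set(self, name: str, node: str) -> None:
        self.n2cdn[name] = [node]

    def use(self, name: str, node: str) -> None:
        self.u2dn[node] = list(self.n2cdn.get(name, []))

    @contextmanager
    def subscope(self) -> Iterator[NodeMap]:
        # Each branch works on its own copy of the current definitions.
        saved = self.n2cdn
        branch = dict(saved)
        self.n2cdn = branch
        try:
            yield branch
        finally:
            self.n2cdn = saved

    def combine_subscopes(self, branches: Iterable[NodeMap]) -> None:
        # For each variable, take the union of the branches' definition nodes.
        merged: NodeMap = {}
        for branch in branches:
            for name, nodes in branch.items():
                target = merged.setdefault(name, [])
                for node in nodes:
                    if node not in target:
                        target.append(node)
        self.n2cdn.update(merged)


scope = ToyBranchingScope()
with scope.subscope() as subscope_1:  # if some_condition():
    scope.set("x", "a")
    scope.use("x", "b")
with scope.subscope() as subscope_2:  # else:
    scope.set("x", "c")
    scope.use("x", "d")
scope.combine_subscopes([subscope_1, subscope_2])
scope.use("x", "e")                   # print(x) after the conditional
```

The usage at (e) ends up mapped to both (a) and (c), which is the situation in which a multi-valued result is inferred.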
This model applies most cleanly to if blocks, but try-except can also be analyzed using this approach. Loops are more complicated, because variable usages sometimes need to be mapped to definition nodes later in the same loop body. For example, in code like this:
    x = None
    for _ in (1, 2):
        if x:
            print(x[1])  # (a)
        else:
            x = (1, 2)  # (b)
a naive approach would infer that x is None at (a). To take care of this case, pyanalyze visits the loop body twice during the collecting phase, so that usage_to_definition_nodes can add a mapping of (a) to (b). To handle break and continue correctly, it also uses a separate “loop scope” that ends up combining the scopes created by normal control flow through the body of the loop and by each break and continue statement.
Try-finally blocks are handled by visiting the finally block twice. Essentially, we treat:
    try:
        TRY-BODY
    finally:
        FINALLY-BODY
    REST-OF-FUNCTION
as equivalent to:
    if <empty>:
        FINALLY-BODY
        return
    else:
        TRY-BODY
        FINALLY-BODY
        REST-OF-FUNCTION
This correctly expresses that variables used in the FINALLY-BODY can have either the values set in the TRY-BODY or those set before the try-finally. It does not express that the TRY-BODY may have been interrupted at any point, but that does not matter for our purposes. It has the disadvantage that the finally body is visited twice, which may lead to some errors being reported twice.
A similar approach is used to handle loops, where the body of the loop may not be executed at all. A for loop of the form:
    for TARGET in ITERABLE:
        FOR-BODY
    else:
        ELSE-BODY
is treated like:
    if <empty>:
        TARGET = next(iter(ITERABLE))
        FOR-BODY
    else:
        ELSE-BODY
Special logic is also needed to take care of globals (which are kept track of separately from normal variables) and variable lookups without a node context. For the latter, the name_to_all_definition_nodes dictionary maps each variable name to all possible definition nodes.
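The reason for visiting a loop body twice, as described in the discussion of loops above, can be seen in a heavily simplified sketch. It models a body consisting of a usage (a) followed by an assignment (b), with one definition ("init") before the loop. The helper names and the way the state at the end of the body is merged with the state at loop entry are assumptions of this sketch, not a description of pyanalyze's loop scope.

```python
from typing import Dict, List

NodeMap = Dict[str, List[str]]


def visit_loop_body(n2cdn: NodeMap, u2dn: NodeMap) -> None:
    """Toy visit of a body: `print(x)  # (a)` then `x = (1, 2)  # (b)`."""
    usage = u2dn.setdefault("a", [])
    for node in n2cdn.get("x", []):
        if node not in usage:
            usage.append(node)
    n2cdn["x"] = ["b"]


def collect_loop(passes: int) -> NodeMap:
    """Visit the body `passes` times and return the usage mapping."""
    n2cdn: NodeMap = {"x": ["init"]}  # x = None before the loop
    u2dn: NodeMap = {}
    for _ in range(passes):
        entry = list(n2cdn["x"])
        visit_loop_body(n2cdn, u2dn)
        # Assumed merge: the next iteration may start from either the
        # definitions at loop entry or those at the end of the body.
        n2cdn["x"] = entry + [n for n in n2cdn["x"] if n not in entry]
    return u2dn
```

With a single pass the usage at (a) only sees the definition before the loop; with two passes it also sees (b), the assignment that appears later in the same body.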
- add_constraint(abstract_constraint: AbstractConstraint, node: object, state: VisitorState) None ¶
Add a new constraint.
Constraints are represented as assignments of fake values, which are _ConstrainedValue objects. These contain a constraint and a set of definition nodes where the unconstrained variable could have been defined. When we try to retrieve the value of a variable, we look at the values in each of the definition nodes and at the constraint.
The node argument may be any unique key, although it will usually be an AST node.
- get_all_definition_nodes() dict[str | CompositeVariable, set[object]] ¶
Return a copy of name_to_all_definition_nodes.
- suppressing_subscope() Iterator[dict[str | CompositeVariable, list[object]]] ¶
A suppressing subscope is a subscope that may suppress exceptions inside of it.
This is used to implement try and with blocks. After code like this:
    x = 1
    try:
        x = 2
        x = 3
    except Exception:
        pass
The value of x may be any of 1, 2, and 3, depending on whether and where an exception was thrown.
To implement this, we keep track of all assignments inside the block and give them effect, so that after the suppressing subscope ends, each variable’s definition nodes include all of these assignments.
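The idea of recording every assignment inside the block can be sketched as follows. ToySuppressingScope is a hypothetical toy; it uses strings as node identifiers and assumes that the definitions active before the block stay possible afterwards, which matches the example where x may still be 1.

```python
from contextlib import contextmanager
from typing import Dict, Iterator, List

NodeMap = Dict[str, List[str]]


class ToySuppressingScope:
    """Toy model of a subscope in which any statement may raise."""

    def __init__(self) -> None:
        self.n2cdn: NodeMap = {}
        self._recorders: List[NodeMap] = []

    def set(self, name: str, node: str) -> None:
        self.n2cdn[name] = [node]
        # Every enclosing suppressing subscope remembers this assignment.
        for recorder in self._recorders:
            recorder.setdefault(name, []).append(node)

    @contextmanager
    def suppressing_subscope(self) -> Iterator[None]:
        before = {name: list(nodes) for name, nodes in self.n2cdn.items()}
        recorded: NodeMap = {}
        self._recorders.append(recorded)
        try:
            yield
        finally:
            self._recorders.pop()
            # Afterwards, any assignment made in the block may be the active one.
            for name, nodes in recorded.items():
                merged = list(before.get(name, []))
                for node in nodes:
                    if node not in merged:
                        merged.append(node)
                self.n2cdn[name] = merged


scope = ToySuppressingScope()
scope.set("x", "x = 1")
with scope.suppressing_subscope():  # try:
    scope.set("x", "x = 2")
    scope.set("x", "x = 3")
```

After the block, all three assignments are candidate definition nodes for x, so a later usage would see any of the values 1, 2, and 3.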
- subscope() Iterator[dict[str | CompositeVariable, list[object]]] ¶
Create a new subscope, to be used for conditional branches.
- class pyanalyze.stacked_scopes.StackedScopes(module_vars: dict[str, Value], module: ModuleType | None, *, simplification_limit: int | None = None)¶
Represents the stack of scopes in which Python searches for variables.
- add_scope(scope_type: ScopeType, scope_node: object, scope_object: object | None = None) Iterator[None] ¶
Context manager that adds a scope of this type to the top of the stack.
- ignore_topmost_scope() Iterator[None] ¶
Context manager that temporarily ignores the topmost scope.
- allow_only_module_scope() Iterator[None] ¶
Context manager that allows only lookups in the module and builtin scopes.
- get(varname: str | CompositeVariable, node: object, state: VisitorState) Value ¶
Gets a variable of the given name from the current scope stack.
- Parameters:
varname (Varname) – name of the variable to retrieve
node (Node) – AST node corresponding to the place where the variable lookup is happening. FunctionScope uses this to decide which definition of the variable to use; other scopes ignore it. It can be passed as None to indicate that any definition may be used. This is used, among other cases, when looking up names in outer scopes. Although this argument should normally be an AST node, it can be any unique, hashable identifier, because sometimes a single AST node sets multiple variables (e.g. in ImportFrom nodes).
state (VisitorState) – The current VisitorState. Pyanalyze runs the collecting phase to collect all name assignments and map name usages to their corresponding assignments, and then the checking phase to locate any errors in the code.
Returns pyanalyze.value.UNINITIALIZED_VALUE if the name is not defined in any known scope.
- get_with_scope(varname: str | CompositeVariable, node: object, state: VisitorState) tuple[Value, Scope | None, frozenset[object | None]] ¶
Like get(), but also returns the scope object the name was found in.
Returns a (pyanalyze.value.Value, Scope, origin) tuple. The Scope is None if the name was not found.
- get_nonlocal_scope(varname: str | CompositeVariable, using_scope: Scope) Scope | None ¶
Gets the defining scope of a non-local variable.
- set(varname: str | CompositeVariable, value: Value, node: object, state: VisitorState) None ¶
Records an assignment to this variable.
value is the value that is being assigned to varname. The other arguments are the same as those of get().
- subscope() AbstractContextManager[dict[str | CompositeVariable, list[object]]] ¶
Creates a new subscope (see the FunctionScope docstring).
- loop_scope() AbstractContextManager[list[dict[str | CompositeVariable, list[object]]]] ¶
Creates a new loop scope (see the FunctionScope docstring).
- combine_subscopes(scopes: Iterable[dict[str | CompositeVariable, list[object]]], *, ignore_leaves_scope: bool = False) None ¶
Merges a number of subscopes back into their parent scope.
- scope_type() ScopeType ¶
Returns the type of the current scope.
- contains_scope_of_type(scope_type: ScopeType) bool ¶
Returns whether any scope in the stack is of this type.
- is_nested_function() bool ¶
Returns whether we’re currently in a nested function.
- pyanalyze.stacked_scopes.constrain_value(value: Value, constraint: AbstractConstraint, *, simplification_limit: int | None = None) Value ¶
Create a version of this value with the constraint applied.
- pyanalyze.stacked_scopes.uniq_chain(iterables: Iterable[Iterable[T]]) list[T] ¶
Returns a flattened list, collapsing equal elements but preserving order.
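The documented behavior of uniq_chain can be approximated with the sketch below. This is not the library's source: uniq_chain_sketch is a hypothetical reimplementation that additionally assumes the elements are hashable.

```python
from typing import Hashable, Iterable, List, Set, TypeVar

T = TypeVar("T", bound=Hashable)


def uniq_chain_sketch(iterables: Iterable[Iterable[T]]) -> List[T]:
    """Flatten the iterables, keeping only the first occurrence of each element."""
    seen: Set[T] = set()
    result: List[T] = []
    for iterable in iterables:
        for item in iterable:
            if item not in seen:
                seen.add(item)
                result.append(item)
    return result
```

For instance, flattening the definition-node lists of two branches that share a node keeps that node once, at the position where it first appeared.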