package documentation

Simple static analysis library for Python, based on beniget.

Goals and non-goals

The main goal of this project is to provide a simple framework, compatible with the standard library, to statically analyse a collection of related Python modules. The initial intent is to support static analyzers and API documentation generators that work with the ast module.

Trade-offs

Libstatic tries to be relatively lightweight and fast, so here are some trade-offs:

  • Only provide intra-procedural analyses.
  • Partial path sensitivity: libstatic relies on over-approximations. Some unreachable execution paths are filtered out, but impossible paths might still be considered.
  • No pointer or shape analysis: Aliasing that happens in non-trivial ways will not be detected.
  • No soundness guarantees: libstatic ignores the effects of eval-like functions, setattr, etc. on the program state. It doesn’t make worst-case sound assumptions, but rather "reasonable" ones.
  • Incomplete type system: While basic type inference is provided, libstatic does not carry the complexity to support full-featured type-checking.

The model

The core model, provided by beniget, is basically two directed graphs linking definitions to their uses and vice versa. We call these data structures Def-Use chains and Use-Def chains. This model is extended in order to include imported names, including the ones from wildcard imports.
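To make the two graphs concrete, here is a minimal sketch that uses only the standard ast module. It is not beniget's or libstatic's implementation: the helper names build_chains and summarize are hypothetical, and the sketch assumes straight-line module-level code with plain names only (no branches, scopes, imports or augmented assignments).

```python
import ast

SOURCE = """\
x = 1
y = x + 1
x = 2
print(x, y)
"""


def build_chains(tree: ast.Module):
    """Build toy def-use and use-def maps for straight-line module code."""
    def_use = {}   # definition node -> list of nodes that read it
    use_def = {}   # use node -> the definition it reads
    current = {}   # name -> most recent definition node
    for stmt in tree.body:
        names = [n for n in ast.walk(stmt) if isinstance(n, ast.Name)]
        # Reads are resolved first: the right-hand side of an assignment
        # is evaluated before its targets are bound.
        for node in names:
            if isinstance(node.ctx, ast.Load) and node.id in current:
                definition = current[node.id]
                def_use[definition].append(node)
                use_def[node] = definition
        for node in names:
            if isinstance(node.ctx, ast.Store):
                current[node.id] = node
                def_use[node] = []
    return def_use, use_def


def summarize(def_use):
    """Map (name, definition line) to the lines where it is read."""
    return {(d.id, d.lineno): [u.lineno for u in uses]
            for d, uses in def_use.items()}


def_use, use_def = build_chains(ast.parse(SOURCE))
print(summarize(def_use))
```

In this toy input the first definition of x is read only on line 2, while the read on line 4 is linked to the second definition. Names without a known definition, such as print, are simply skipped here.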

All ast nodes categorized as a use or a definition have a corresponding Def instance. Definitions are represented using one of the specialized Def subclasses: Mod, Cls, Func, Var, etc. The direct users of a definition are accessible with Def.users(), which returns a collection of Def instances (generally wrapping an ast.Name or ast.alias).

Additionally, reachability analysis cuts down the number of potential definitions for a given user, giving more precise results. From there, we can trace the genuine definition of any symbol, provided it is in the system, as well as find all references to a given symbol across all modules in the project.
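The effect of reachability on the set of candidate definitions can be sketched with the standard library alone. This is an illustration under stated assumptions, not libstatic's algorithm: static_truth and reachable_defs are hypothetical helpers, only constant tests and `sys.version_info >= (...)` comparisons are recognised, and the target version is passed in by hand.

```python
import ast

SOURCE = """\
import sys
if sys.version_info >= (3, 0):
    def text(b): return b.decode()
else:
    def text(b): return b
"""


def static_truth(test: ast.expr, version=(3, 11)):
    """Return True or False when the test is statically decidable, else None."""
    if isinstance(test, ast.Constant):
        return bool(test.value)
    # Only the shape `sys.version_info >= (major, minor, ...)` is handled.
    if (isinstance(test, ast.Compare) and len(test.ops) == 1
            and isinstance(test.ops[0], ast.GtE)
            and ast.unparse(test.left) == "sys.version_info"):
        try:
            bound = ast.literal_eval(test.comparators[0])
        except (ValueError, TypeError):
            return None
        if isinstance(bound, tuple):
            return version >= bound
    return None


def reachable_defs(tree: ast.Module, name: str, version=(3, 11)):
    """Collect function definitions of `name` found on reachable branches."""
    found = []

    def visit(body):
        for stmt in body:
            if isinstance(stmt, ast.FunctionDef) and stmt.name == name:
                found.append(stmt)
            elif isinstance(stmt, ast.If):
                truth = static_truth(stmt.test, version)
                if truth is not False:   # branch may execute
                    visit(stmt.body)
                if truth is not True:    # else-branch may execute
                    visit(stmt.orelse)

    visit(tree.body)
    return found


tree = ast.parse(SOURCE)
print([f.lineno for f in reachable_defs(tree, "text")])
```

When the test cannot be decided, both branches are kept, which mirrors the over-approximation described in the trade-offs: the result may contain definitions that never execute, but a branch is only dropped when it is known to be unreachable.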

The Def-Use chains, the Use-Def chains, and other analysis results are made available through the State.

How to use the library

The Project and State classes represent the primary high-level interface for the library (some other lower-level parts can be used independently). The API is designed to work with existing code that uses the standard ast module.

Keep in mind that all modules should be added before calling analyze_project(). For performance reasons, it does not analyze the use of the builtins or any other dependent modules by default. Use Project(builtins=True) to analyze builtins, or Project(dependencies=True) to recursively find and load any dependent modules (including builtins). See Options for other arguments.
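As a rough illustration of what recursively finding dependent modules involves, the sketch below walks import statements with the standard ast module. It is not how libstatic resolves dependencies: imported_modules and collect_dependencies are hypothetical helpers, module sources come from an in-memory dict instead of the filesystem, and relative imports are ignored.

```python
import ast


def imported_modules(tree: ast.AST) -> set[str]:
    """Return the absolute module names imported anywhere in the tree."""
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names.add(node.module)
    return names


def collect_dependencies(root: str, sources: dict[str, str]) -> list[str]:
    """Worklist traversal: every known module reachable from `root`."""
    seen, todo = [], [root]
    while todo:
        name = todo.pop()
        if name in seen or name not in sources:
            continue  # already handled, or source not available
        seen.append(name)
        todo.extend(sorted(imported_modules(ast.parse(sources[name]))))
    return seen


# Hypothetical three-module project held in memory.
sources = {
    "app": "import util\nfrom models import User\n",
    "util": "import os\n",
    "models": "import util\n",
}
print(sorted(collect_dependencies("app", sources)))
```

Modules whose source is not available (os in this example) are skipped. The sketch shows why discovery has to finish before analysis starts: the full set of modules is only known once the worklist is empty.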

The State instance acts like a façade and presents accessors for several kinds of analyses.

Module __main__ Undocumented
Package _analyzer This package contains analysis code highly coupled with the global State.
Package _lib No package docstring; 5/13 modules documented

From __init__.py:

Class Arg Model a function argument definition.
Class Attr Model an attribute definition.
Class ClosedScope Model a closed scope (abstract). Closed scopes have <locals>.
Class Cls Model a class definition.
Class Comp Model the definition of a generator or comprehension.
Class Def Model a use or a definition, either named or unnamed, and its users.
Class Func Model a function definition.
Class Imp Model an imported name definition.
Class Lamb Model the definition of a lambda function.
Class Mod Model a module definition.
Class NameDef Model the definition of a name (abstract).
Class NodeLocation No class docstring; 0/4 instance variable, 0/1 method, 1/1 class method documented
Class OpenScope Model an open scope (abstract).
Class Options Undocumented
Class Project A project is a high-level class to analyze a collection of modules together.
Class Scope Model a python scope (abstract).
Class State The Project's state: container and accessors for analysis results.
Class TopologicalProcessor Base class for processing objects in topological order. Decoupled from the concrete types so it can be re-used for several order-sensitive analyses.
Class Type The type of a Python expression.
Class Var Model a variable definition.
Exception StaticAmbiguity Definition is ambiguous.
Exception StaticAttributeError Attribute not found.
Exception StaticCodeUnsupported Syntax is unsupported.
Exception StaticEvaluationError The evaluation could not be completed.
Exception StaticException Base exception for the library.
Exception StaticImportError An import target could not be found.
Exception StaticNameError Unbound name.
Exception StaticStateIncomplete Missing required information about analyzed tree. Shouldn't be raised under normal usage of the library.
Exception StaticTypeError A node has an unexpected type.
Exception StaticUnknownValue Used by literal eval when a used value is not known.
Exception StaticValueError Can't make sense of analyzed syntax tree.
Function load_path Load a project from a Python package/module path in the filesystem. Project.analyze_project() must still be called after loading a path into the project.
def load_path(project: Project, path: Path, exclude: Sequence[str] | None = None):

Load a project from a Python package/module path in the filesystem. Project.analyze_project() must still be called after loading a path into the project.

>>> from pathlib import Path
>>> from libstatic import Project, load_path
>>> p = Project()
>>> load_path(p, Path('./libstatic'))
>>> # then call p.analyze_project()