In this post we will see how to statically declare variables with Python types, discuss why static typing matters so much for performance, and finally touch on reference counting.
Until now, we have utilized the cdef keyword to statically declare variables with a C type. But it is also possible to use cdef to statically declare variables with a Python type. We can do this for the built-in types like list, tuple, and dict; extension types like NumPy arrays; and many others.
Not all Python types can be statically declared: they must be implemented in C and Cython must have access to the declaration. The built-in Python types already meet these requirements, and declaring them is straightforward. Here are a few examples:
cdef list postal_codes, modified_postal_codes
cdef dict names_from_postal_codes
cdef str pname
cdef set unique_postal_codes
The variables in this example are full Python objects. Under the hood, Cython declares them as C pointers to some built-in Python struct type. They can be used like ordinary Python variables, but are constrained to their declared type:
# ...initialize names_from_postal_codes...
postal_codes = list(names_from_postal_codes.keys())
As you can see in this example, we can access the keys attribute of names_from_postal_codes just as we would in pure Python. Dynamic variables can be initialized from statically declared Python types:
other_postal_codes = postal_codes
del other_postal_codes[0]
Here, deleting the 0th element via other_postal_codes will delete the 0th element of postal_codes as well, since they are referring to the same list.
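The aliasing described here is ordinary Python behavior, so it can be checked without compiling anything. Below is a minimal plain-Python sketch; the variable names follow the text, while the dictionary contents are made up purely for illustration.

```python
# Plain-Python sketch of the aliasing behavior described above.
# The postal codes and names are invented sample data.
names_from_postal_codes = {"10001": "Ada", "94105": "Grace", "60601": "Alan"}

postal_codes = list(names_from_postal_codes.keys())
other_postal_codes = postal_codes  # a second name for the same list object

del other_postal_codes[0]          # remove the 0th element through the alias

print(postal_codes)                        # the original name sees the change
print(other_postal_codes is postal_codes)  # both names refer to one object
```

In Cython the only difference would be the cdef list declaration on postal_codes; the shared-reference behavior stays the same.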
One difference between postal_codes and other_postal_codes is that postal_codes can only ever refer to Python list objects, while other_postal_codes can refer to any Python type. Cython will enforce the constraint on postal_codes at compile time and at runtime.
At this stage, there is one important thing to keep in mind: where a Python built-in type such as int or float has the same name as a C type, the C type takes precedence. This is almost always what we want.
When we are adding, subtracting, or multiplying scalars, the operations have Python semantics when the operands are dynamically typed Python objects. They have C semantics when the operands are statically typed C variables.
Division and modulus deserve special mention. C and Python behave quite differently when computing the modulus with signed integer operands: C rounds toward zero, while Python rounds toward negative infinity. For example, -1 % 5 evaluates to 4 with Python semantics; with C semantics, however, it evaluates to -1.
When dividing two integers, Python always checks the denominator and raises a ZeroDivisionError when it is zero, while C has no such safeguards in place.
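To make the two rounding rules concrete, here is a small plain-Python sketch. The helpers c_div and c_mod are hypothetical names introduced only for this illustration; they imitate C's truncate-toward-zero rule in Python and are not part of Cython or the C API. They do not imitate C's behavior for a zero denominator, which is undefined in C.

```python
def c_div(a: int, b: int) -> int:
    """Integer division that truncates toward zero, as C does (b must be nonzero)."""
    q = abs(a) // abs(b)
    # Same signs give a positive quotient, different signs a negative one.
    return q if (a < 0) == (b < 0) else -q


def c_mod(a: int, b: int) -> int:
    """Remainder consistent with c_div, so a == b * c_div(a, b) + c_mod(a, b)."""
    return a - b * c_div(a, b)


python_result = -1 % 5      # Python semantics: floor division
c_result = c_mod(-1, 5)     # C semantics: truncation toward zero

# Python checks the denominator and raises instead of dividing by zero.
try:
    1 // 0
    zero_division_raised = False
except ZeroDivisionError:
    zero_division_raised = True

print(python_result, c_result, zero_division_raised)
```

The two remainders differ only when the operands have opposite signs; for non-negative operands both rules agree.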
Cython uses Python semantics by default for division and modulus even when the operands are statically typed C scalars. To obtain C semantics, we can use the cdivision compiler directive, either at the global module level with a directive comment like this one:
# cython: cdivision=True
Or at the function level in the form of a decorator, like this one:
cimport cython

@cython.cdivision(True)
def divides(int x, int y):
    return x / y
Another option is to use a context manager inside a function, as you can see in this example:
cimport cython

def remainder(int x, int y):
    with cython.cdivision(True):
        return x % y
Great, that’s how we can utilize static typing for Python types.
Let's now talk about how static typing helps performance. It may look strange at first that Cython allows static declaration of variables with built-in Python types. If you have been following this series, you may be asking: why not just use Python’s dynamic typing as usual? The answer leads to a general Cython principle: the more static type information we provide, the better Cython can optimize the result. As always, there are exceptions to this rule, but it is more often true than not.
For example, the following code simply appends a Recipe object to a dynamically typed dynamic_recipes variable:
dynamic_recipes = make_recipes(...)
# ...
dynamic_recipes.append(Recipe())
# ...
The Cython compiler will generate code that can handle any Python object, and that tests at runtime whether dynamic_recipes is a list. If it is not, this code will still run as long as the object has an append method that takes an argument. Under the hood, the generated code first looks up the append attribute on the dynamic_recipes object (using PyObject_GetAttr or a similar function) and then calls that method using the fully general PyObject_Call Python/C API function. This essentially imitates what the Python interpreter would do when running equivalent Python bytecode.
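The two-step lookup-then-call sequence can be imitated in plain Python. The sketch below is only an analogy for what the generated C code does: dynamic_append is a hypothetical helper, and Recipe is a stand-in for the class used in the text.

```python
from collections import deque


class Recipe:
    """Hypothetical stand-in for the Recipe class used in the text."""


def dynamic_append(container, item):
    # Step 1: look up "append" by name on whatever object we were given
    # (roughly the role PyObject_GetAttr plays in the generated code).
    method = getattr(container, "append")
    # Step 2: call the result through the general call mechanism
    # (roughly the role of PyObject_Call).
    method(item)


recipes_list = []
recipes_deque = deque()

# Any object with a one-argument append method works, not just lists.
dynamic_append(recipes_list, Recipe())
dynamic_append(recipes_deque, Recipe())

print(len(recipes_list), len(recipes_deque))
```

The flexibility comes at a cost: every call pays for the name lookup and the general call, even when the object is always a list.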
Suppose we statically declare a static_recipes Python list and use it instead:
cdef list static_recipes = make_recipes(...)
# ...
static_recipes.append(Recipe())
# ...
Now Cython can generate specialized code that directly calls either the PyList_SET_ITEM or the PyList_Append function from the C API. This is what PyObject_Call in the previous example ends up calling anyway, but static typing allows Cython to remove the dynamic dispatch on static_recipes. Cython supports several built-in statically declarable Python types, including complex, type, object, array, and slice, among others.
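As a rough plain-Python analogy for removing dynamic dispatch on a list, we can call the list type's own append directly instead of looking the method up on the object. This is only a sketch of the idea: list_only_append is a hypothetical helper, and real Cython emits C API calls rather than anything like this Python code.

```python
from collections import deque


def list_only_append(container, item):
    # Call list's append directly rather than looking up "append" on the
    # object. This skips the per-object attribute lookup, but it only
    # accepts genuine list objects.
    list.append(container, item)


static_recipes = []
list_only_append(static_recipes, "pancakes")

# A non-list container is rejected, much as a statically typed list
# variable can only ever refer to list objects.
try:
    list_only_append(deque(), "pancakes")
    rejected = False
except TypeError:
    rejected = True

print(static_recipes, rejected)
```

The trade-off mirrors the one in the text: the specialized call is more direct, but it gives up the ability to work with any object that happens to have an append method.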