Introduction#
| |
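The listing that belongs here is not reproduced. A minimal sketch consistent with the description that follows might look like this; the string value is an assumption, while the name my_variable and the value 2026 come from the text.

```python
# Hypothetical reconstruction: the string value is an assumption.
my_variable = "Python in Production"  # first, the name refers to a string
my_variable = 2026                    # then the same name is assigned an integer
print(my_variable)
print(type(my_variable))
```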
This Python code assigns a string to a variable, then assigns an integer to the same variable, and finally prints its value. What do you think will happen if we run this? Will it run without errors?
If you’re familiar with Python even at a very basic level, this is a simple question. It will run without errors and just print 2026. Here in this example, the second assignment overwrites the first one, making my_variable an integer. You can also confirm this by printing type(my_variable) to see <class 'int'>.
Now, let’s add a str type hint in the first line:
| |
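The annotated listing is not reproduced either. A minimal sketch, assuming a hypothetical string value; the str annotation and the integer assignment on the next line are taken from the text.

```python
# Hypothetical reconstruction of the annotated variant; the string value is an assumption.
my_variable: str = "Python in Production"  # annotated as str
my_variable = 2026                         # an int is assigned anyway
print(my_variable)
```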
We annotated my_variable as a string, but in the next line we assigned an integer. Now this time, do you think it will still run without errors? Think about it for a moment.
Again, yes: nothing magical happens just because we added a type hint. Python is a dynamically typed language, which means the type of a variable is determined at runtime. Other well-known examples of dynamically typed languages include PHP, Ruby, Lua, R, and JavaScript. This contrasts with statically typed languages, where the type of a variable is fixed at compile time, such as in C, C++, Java, Go, Rust, and TypeScript.
Where it gets interesting is that we can use static type checkers like mypy, pyright, and ty to catch potential runtime errors even before running the code!
If we run mypy on this file:
$ mypy simple_example_typed.py
we get an error:
error: Incompatible types in assignment (expression has type "int", variable has type "str") [assignment]
But how is this helpful? Why are we getting an error? After all, the code above is perfectly valid Python, and Python lets us do this. Why should we care if a type checker complains about it?
Today we’re going to answer these questions and talk about static typing or simply typing in Python. This is one of the most important modern approaches to writing high-quality, production-ready Python code. The dynamic nature of Python is good for development, scripting, and quick prototyping, but we will see that for production code we want to be explicit about the types of our variables and do not want to change their types at runtime. Using typing, your code becomes less prone to bugs, easier to understand, and easier to maintain. We will also see how you can utilize AI coding tools together with typing to generate much better Python code.
Our approach is not to provide a tutorial on typing syntax, but rather to help you understand what typing actually does and why it is important. As AI coding assistants take on more of the actual code writing, understanding the essential ideas behind typing matters more than memorizing syntax. We will also not go into the details of different type checkers and how they compare, but below you will find links to some that are frequently used in production workflows. Because the tooling around Python and software development is constantly evolving, our north star is to focus on the core ideas that will stay with you for years to come. Given the current pace of advancements in AI, years is an eternity. So, you will greatly benefit from understanding these ideas if your work involves using Python. Whether you’re a data scientist, ML engineer, data engineer, or software engineer (whatever this title will be in the future), type hints will make your Python code noticeably better and can directly or indirectly help you move into more senior roles.
Static Typing in a Codebase#
In a codebase with many variables and functions, it quickly becomes difficult to keep track of types manually. It’s time-consuming, it adds cognitive load, and it’s error-prone. This is true not only for us humans but also for language models. For example, if a string variable in our codebase is later changed to an integer and we call a string method on it, we will get an error at runtime.
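The failure mode described above can be sketched in a few lines. The function shout and the variable release are hypothetical names chosen for illustration only.

```python
def shout(label):
    """Return the label in upper case; silently assumes label is a str."""
    return label.upper()


release = "2026"
print(shout(release))  # fine: str objects have an upper() method

release = 2026  # later change: the same name now holds an int
try:
    shout(release)
except AttributeError as exc:
    # The mistake only surfaces when this line actually executes.
    print(f"runtime failure: {exc}")
```

Nothing in the untyped code warns about the change ahead of time; the error appears only on the code path that calls the string method.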
As we just saw, static typing catches errors that would otherwise be deferred to runtime, making it one of the most effective tools for production Python code. We do not need to run Python to get the error; we just need to run a static type checker. In fact, in production workflows, a Continuous Integration pipeline typically runs a static type checker to catch these bugs automatically before the code is merged or run. In a future lesson, we will talk about CI in more detail.
A Latent Bug Example#
Here’s an example where static typing makes a real difference.
Suppose we have three membership tiers: gold, silver, and bronze, and a function that applies a discount based on a user’s membership tier.
| |
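The original listing is not reproduced here. A minimal sketch of what latent_bug.py might contain follows; the discount rates are assumptions, with the gold rate chosen so that a price of 100.0 yields the 80.0 reported in the text. The names discounts and discount are mentioned later in the text.

```python
# Hypothetical sketch of latent_bug.py; the discount rates are assumptions.
def apply_discount(price, tier):
    discounts = {"gold": 0.2, "silver": 0.1, "bronze": 0.05}
    discount = discounts[tier]  # raises KeyError for any tier not listed here
    return price * (1 - discount)


if __name__ == "__main__":
    print(apply_discount(100.0, "gold"))
```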
If you test this code, it works as expected.
$ python latent_bug.py
80.0
Suppose a new “platinum” tier is added.
$ python -c "from latent_bug import apply_discount; \
discounted_price = apply_discount(100.0, 'platinum')"
KeyError: 'platinum'
We get a runtime crash, on a payment path!
You may be wondering how we could add a new tier without updating this function. Isn’t it our job to update the code? Yes, but forgetting to do so is a very common mistake, especially in large codebases. If nobody ever forgot, we would never have type errors in production, and the world would be a much better place.
Using Literal Types#
Now let’s add type hints. Price is a float. We use Literal to restrict the tier parameter to exactly the values we support. The return type is also a float.
| |
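The typed listing is not reproduced here. A minimal sketch of what latent_bug_literal.py might look like, assuming the same hypothetical discount rates; only the signature types (float, Literal, float) come from the text.

```python
from typing import Literal


# Hypothetical sketch of latent_bug_literal.py; the discount rates are assumptions.
def apply_discount(price: float, tier: Literal["gold", "silver", "bronze"]) -> float:
    discounts = {"gold": 0.2, "silver": 0.1, "bronze": 0.05}
    discount = discounts[tier]
    return price * (1 - discount)


try:
    # A static type checker flags this argument; Python itself only fails
    # when the line actually runs.
    apply_discount(100.0, "platinum")
except KeyError as exc:
    print(f"still a runtime KeyError: {exc}")
```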
The code would still crash during runtime, but a static type checker catches this.
$ mypy latent_bug_literal.py
error: Argument 2 to "apply_discount" has incompatible type "Literal['platinum']"; expected "Literal['gold', 'silver', 'bronze']" [arg-type]
The type checker sees that "platinum" is not one of the allowed values and flags it immediately. This is the kind of bug that is trivial for a machine to find but surprisingly easy for a human to miss.
But notice, this only works because we used Literal to restrict the type. If we had hinted tier as just str, the type checker would see nothing wrong. The string "platinum" is a perfectly valid str type, so type checkers would report no errors and the bug would still make it to production. Just adding some type hints is not going to save us if we don’t properly restrict the types.
You might also be tempted to make some runtime checks like this:
| |
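The runtime-check listing is not reproduced here. One plausible shape, offered only as a sketch: the exact guard and error message are assumptions, while the ValueError comes from the text.

```python
from typing import Literal


# Hypothetical sketch of the runtime-guard variant; rates and message are assumptions.
def apply_discount(price: float, tier: Literal["gold", "silver", "bronze"]) -> float:
    discounts = {"gold": 0.2, "silver": 0.1, "bronze": 0.05}
    if tier not in discounts:
        # The guard only swaps one runtime exception for another.
        raise ValueError(f"Unknown tier: {tier}")
    return price * (1 - discounts[tier])
```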
This looks more robust, but it is a bad idea. We did not solve our problem, we just replaced a KeyError with a ValueError that we still need to catch at runtime. And how would we handle it? Apply no discount? Apply the largest discount? No answer will make sense because this is a business logic error. On top of that, the tier input to this function is not external, it does not come from a user or a config file. It comes from within our own code. A runtime guard for something our own code controls just bloats the function for no benefit. This is an example where static type checking is a better tool than runtime checks: it catches the mistake at the call site, before the code ever runs.
Fully Typed Version#
Should we also add type hints to the discounts dictionary and the discount variable inside the function? Yes, it is a good practice to do so. Here is the fully typed version:
| |
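The fully typed listing is not reproduced here. A minimal sketch, assuming the same hypothetical discount rates; the dictionary's key type matches the one quoted in the type checker message later in this section.

```python
from typing import Literal


# Hypothetical sketch of the fully typed version; the discount rates are assumptions.
def apply_discount(price: float, tier: Literal["gold", "silver", "bronze"]) -> float:
    discounts: dict[Literal["gold", "silver", "bronze"], float] = {
        "gold": 0.2,
        "silver": 0.1,
        "bronze": 0.05,
    }
    discount: float = discounts[tier]
    return price * (1 - discount)
```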
The type checker can often infer the types of local variables from their assignments, so these inner annotations are not strictly necessary. But being explicit makes the code self-documenting. Anyone or any language model reading this function can immediately see what every variable is supposed to be without tracing through the logic. It also gives us more ground to catch bugs. This time, even if someone changed the tier parameter back to str in the function signature, the type checker would still catch the problem at the discounts[tier] line:
error: Invalid index type "str" for "dict[Literal['gold', 'silver', 'bronze'], float]"; expected type "Literal['gold', 'silver', 'bronze']" [index]
Because the dictionary’s key type is Literal["gold", "silver", "bronze"], it refuses to be indexed by an arbitrary str. The more thoroughly you type your code, the more layers of protection you get.
Open-Closed Principle#
Notice there’s something deeper going on here. The fact that this function breaks whenever a tier it doesn’t know about shows up is a violation of the open-closed principle, one of the well-known SOLID principles of software design. The open-closed principle says that code should be open for extension but closed for modification. You should be able to add new behavior without changing existing code. Here, every time a new membership tier is added, this function, and any other function that uses a tier, needs to change. If someone forgets to update the relevant functions, the program will crash at runtime. The type checker is not just catching a bug; it’s also pointing at a design problem.
So, another benefit of typing is that it can nudge you toward better design. Design patterns can help you satisfy the open-closed principle, but until you implement one, typing will still be there to catch these bugs. In a future lesson, we will talk about design patterns in more detail but let’s do a better fix right now.
Instead of a dictionary lookup, we can encode the discount values directly into the type system using an enum:
| |
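The enum listing is not reproduced here. A minimal sketch of what latent_bug_enum.py might contain, assuming hypothetical discount rates stored as the enum values; the names Tier and apply_discount come from the text.

```python
from enum import Enum


class Tier(Enum):
    """Membership tiers; each value is the discount rate (assumed numbers)."""

    GOLD = 0.2
    SILVER = 0.1
    BRONZE = 0.05


def apply_discount(price: float, tier: Tier) -> float:
    # No lookup table: the rate travels with the enum member itself.
    return price * (1 - tier.value)


print(apply_discount(100.0, Tier.GOLD))
```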
The function no longer knows or cares about which specific tiers exist. It just takes a Tier and applies it. If someone adds a PLATINUM = 0.3 to the enum, the function doesn’t need to change, it will just work. That’s the open-closed principle in action. Moreover, the type checker still has your back:
$ mypy latent_bug_enum.py
error: "type[Tier]" has no attribute "PLATINUM" [attr-defined]
It tells you that PLATINUM doesn’t exist on Tier yet.
So let’s add it:
| |
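The updated listing is not reproduced here. A minimal sketch of the one-line change, where PLATINUM = 0.3 is the value given in the text and the other rates remain assumptions:

```python
from enum import Enum


class Tier(Enum):
    GOLD = 0.2
    SILVER = 0.1
    BRONZE = 0.05
    PLATINUM = 0.3  # the only line that was added


def apply_discount(price: float, tier: Tier) -> float:
    return price * (1 - tier.value)  # unchanged


print(apply_discount(100.0, Tier.PLATINUM))  # the text reports 70.0 for this call
```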
One line added to the enum. The apply_discount function didn’t change, and no other code that uses membership tiers needs to change either. The type checkers are happy. The call returns 70.0, a 30% discount, exactly as expected, and most importantly, the platinum user is happy.
Practical Tips#
How much should you use type hints? If you’re new to typing, use them as much as possible. An additional benefit of adding type hints is that you will also need to think about the data structures you use and the operations you perform on them.
If you’re working with an existing untyped codebase, you don’t need to add type hints everywhere at once. Start by typing the most important parts such as public function signatures, module boundaries, and areas where bugs tend to appear. This is called gradual typing, and it’s one of the strengths of Python’s approach. Most type checkers support this incremental adoption: mypy, for example, by default does not check the bodies of functions that have no annotations, and its configuration lets you tighten the rules one module at a time.
If you are in a job that involves working with data, you will often work with notebooks. Static type checkers may not run natively on notebooks, since a notebook is a JSON file rather than plain Python source that a type checker can parse. But one of the greatest advantages of open source software development is that someone else has probably already solved your problem. For notebooks, you can use tools that link each notebook to a Python file, and then run the type checker on the associated Python file. Below you will find some links to tools that can do this.
Finally, AI coding assistants can take full advantage of typing to generate better code. This can be as simple as asking them to generate code with type hints and to run a type checker on it. The main idea is to let the AI know about the best practices for writing production Python code. We will cover more of these best practices in each lesson.
History of Typing in Python#
Python was created by Guido van Rossum and first released in 1991. Its core design principles were readability, simplicity, and elegance. “Beautiful is better than ugly” is the first line of the Zen of Python, and the community has a word for code that follows this philosophy: Pythonic.
Consider what it takes to write a simple hello world program. In statically typed languages you need includes, main functions, class definitions, or explicit return types just to print something to standard out, whereas in Python you can just write a single line of code.
print("Hello, World!")
One part of that simplicity is dynamic typing. Variables don’t need type declarations, and functions don’t need signatures specifying what goes in and what comes out. This keeps the code concise, but as we already saw, it comes with tradeoffs once codebases grow. Untyped codebases with thousands of lines of code and dozens of contributors became harder to maintain, harder to refactor, and more prone to bugs that only appeared at runtime.
Python is a community-driven language, and the community responded to this growing need with PEP 484, proposed in 2014, and the typing module, which shipped with Python 3.5 in 2015. For the first time, Python had an official vocabulary for expressing types, and third-party static type checkers could use that vocabulary to check your code before it runs. Since then, the typing system has continued to evolve, with new features arriving in nearly every Python release.
It’s worth mentioning how JavaScript solved the same problem. JavaScript, another dynamically typed language, faced identical growing pains. The solution there was TypeScript, a superset of JavaScript. You write .ts files, run them through a transpiler, and out comes .js.
Python took a different approach. Instead of creating a new language, it embedded type annotations directly into the existing syntax. There is no TypedPython, no separate .tpy extension, no transpiler. A type-annotated Python file is still just a Python file. The annotations are there, but they don’t change how the code executes.
This separation is an intentional design choice, and there are good reasons for it. First, it preserves backwards compatibility. Existing Python code doesn’t break when you add type hints. Second, it keeps the interpreter simple and fast, with no type-checking overhead at runtime. Third, and perhaps most importantly, it lets the type-checking ecosystem evolve independently and faster than the language itself. Python release cycles are long, but type checkers can ship improvements every week. Python’s approach turned out to be both elegant and powerful enough.
Under the Hood#
If you made it this far, it is a good time to look under the hood to understand what’s actually happening when Python encounters type hints.
In Python everything is an object, and every object has a type. The vast majority of the Python code you will encounter in the wild is executed by CPython, the reference implementation of Python, which is written in C. So, ironically, Python is usually executed by an interpreter written in a statically typed language. Another contrast between Python and C is that Python is a strongly typed language, while C is a weakly typed language, but this is a topic for another lesson.
In CPython, every value is represented by a C struct that starts with a PyObject header.
// Include/object.h (CPython 3.14, simplified)
typedef struct _object {
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
} PyObject;
This header contains just two fields: a reference count (ob_refcnt) that tracks how many references point to this object, and a type pointer (ob_type) that tells Python what kind of object this is. The actual data lives in extended structs like PyLongObject for ints and PyUnicodeObject for strings. These embed PyObject at the top so they can all be treated uniformly through pointer casting. This is the “everything is an object” part of Python.
In the scenario without type hints, when you write x = "Python in Production", Python allocates a PyUnicodeObject on the heap to hold the string, and the name x gets mapped to that object. How that mapping works depends on scope, and is beyond the scope (pun intended) of this lesson. What matters here is that the type information lives inside the object itself, not attached to the name x. This is what makes Python dynamically typed at its core. When you reassign x = 2026, Python allocates a new PyLongObject on the heap for the integer and updates the mapping so x now maps to the new object. The name x doesn’t care what type it refers to. The old PyUnicodeObject has its reference count decremented immediately, and if no other references remain, it is deallocated on that same bytecode instruction. This is CPython’s primary memory management mechanism, reference counting.
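Both points, that the type lives on the object and that CPython frees an object as soon as its last reference goes away, can be observed from Python itself. The sketch below is CPython-specific; the Payload class and the watcher name are hypothetical, and a small class is used because str objects do not support weak references.

```python
import weakref

x = "Python in Production"
first_type = type(x)   # the type is read from the object, not from the name
x = 2026
second_type = type(x)  # same name, different object, different type
print(first_type, second_type)


class Payload:
    """Hypothetical stand-in object that supports weak references."""


y = Payload()
watcher = weakref.ref(y)  # observes the object without keeping it alive
y = 2026                  # rebinding drops the only strong reference
print(watcher() is None)  # on CPython the Payload object is already freed
```

Other Python implementations may reclaim the object later, since immediate deallocation is a property of reference counting rather than of the language.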
Type hints don’t change any of what we just saw. When you write x: int = 2026 with the type hint, Python still allocates the PyLongObject for 2026 and maps the name x to it, exactly as before. The annotation changes nothing about how the value is stored or how the variable behaves. Python stores the annotations as a dictionary, and you can inspect them yourself:
x: int = 2026
print(__annotate__(1))
$ python3.14 annotations.py
{'x': <class 'int'>}
So, type hints are metadata, not instructions. Python records them but doesn’t act on them, and they do not affect runtime behavior or performance. External tools such as type checkers and language servers read these annotations and enforce the types statically.
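The same idea can be seen on a function, where the annotations are available through the __annotations__ attribute on the Python versions in common use. A minimal sketch, with the function double being a hypothetical example:

```python
def double(value: int) -> int:
    """Annotated for ints, but nothing enforces that at runtime."""
    return value * 2


# The hints are stored as plain metadata on the function object.
print(double.__annotations__)

print(double(21))    # the intended use
print(double("ab"))  # a str is accepted too: the hint is never checked at runtime
```

A static type checker would reject the last call, while the interpreter runs it without complaint.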
Conclusion#
Today we covered the essential ideas behind typing in Python. Whether you’re building an API, a data pipeline, or a machine learning model, static typing will catch bugs that testing alone won’t, make your code self-documenting, and save you and your team time during code reviews.
If you’re not using type hints yet, start today. Pick one file, add annotations to its function signatures, and run a type checker. You might be surprised by what it finds.
Earlier, we saw that typing catches classes of mistakes your tests forgot to cover. The opposite is also true: testing can catch some of the things that typing alone won’t, which makes the two complementary tools rather than substitutes.
| |
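The listing is not reproduced here. A minimal sketch of the scenario, assuming a hypothetical value of 1.4 for the new tier and a hypothetical invalid_tiers helper standing in for the kind of check a unit test would make:

```python
from enum import Enum


class Tier(Enum):
    GOLD = 0.2
    SILVER = 0.1
    BRONZE = 0.05
    PLATINUM = 0.3
    DIAMOND = 1.4  # hypothetical typo: still a valid float, so it type-checks


def apply_discount(price: float, tier: Tier) -> float:
    return price * (1 - tier.value)


def invalid_tiers() -> list[str]:
    """Return the names of tiers whose discount rate is outside [0, 1]."""
    return [tier.name for tier in Tier if not 0.0 <= tier.value <= 1.0]


print(apply_discount(100.0, Tier.DIAMOND))  # a negative price
print(invalid_tiers())
```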
Here the developer added the diamond tier, but there is a typo: the discount value is larger than 1. Now, if we call the apply_discount function with the diamond tier, we will have to give items for free or even worse, we will have to pay the customer for whatever item they bought since the discounted price will be negative. Typing is not going to save us here, but testing will.
In the next lesson, we will look at testing your code. We will understand how typing and testing together form such a powerful pair for production Python code. See you there!
