💾 Archived View for dmerej.info › blog › 0071-giving-mypy-a-go.gmi captured on 2021-12-04 at 18:04:22. Gemini links have been rewritten to link to archived content
⬅️ Previous capture (2021-11-30)
-=-=-=-=-=-=-
2018, Jun 11 - Dimitri Merejkowsky License: CC By 4.0
This is a follow up to the I don't need types[1] article.
I left a teaser explaining I'll be giving concrete examples using a Python project, so here goes.
As you may have guessed, I'm going to talk about type annotations and the mypy[2] static type checker.
Here's a quick overview of how mypy works, in case you are not already familiar with it.
Let's use a rather contrived example:
def is_odd(num): return num % 2 == 0 if is_odd("this sentence contains %s"): print("ok")
We have a function `is_odd`, which obviously only works with numbers, and we call it with a string.
But since the string contains `%s`, Python will happily assume we are trying to format it, and the bug will go unnoticed. `is_odd` will simply return False, because we are *also* allowed to compare strings and numbers with `==` in Python.
Without type annotations, mypy detects nothing:
$ mypy foo.py <nothing>
But if we add type annotations and re-run mypy, we do get an error message:
def is_odd(num: int) -> bool: return num % 2 == 0 if is_odd("this sentence contains %s"): print("ok")
$ mypy foo.py error: Argument 1 to "is_odd" has incompatible type "str"; expected "int"
Thus, you can use mypy in a loop:
This is called "gradual typing".
The goal of the experiment is to check if going through the trouble of adding type annotations is worth it.
Let me define what I mean by "worth it".
There are many things said about types annotations like:
Those may be true, but they are hard to measure.
So I asked myself something else:
I know the second question is a bit subjective. I'm still going to use this metric because it's one that really matters to me.
To test this hypothesis, I needed an existing project. I used tsrc[3], the command-line tool we use at work[4] to manage multiple git repositories and automate GitHub and GitLab review automation.
3: https://github.com/dmerejkowsky/tsrc
I chose it because:
So here is the protocol I used:
Before I continue, I should tell you I used mypy with three important options:
Let's look at trivial patches first:
def find_workspace_path() -> Path: head = os.getcwd() tail = True while tail: tsrc_path = os.path.join(head, ".tsrc") if os.path.isdir(tsrc_path): return Path(head) else: head, tail = os.path.split(head) raise tsrc.Error("Could not find current workspace")
This function starts by checking there is a `.tsrc` hidden directory in the working path. If not, it goes through every parent directory and check if the `.tsrc` directory is here. If it reaches the root of the file system (the second value of `os.path.split` is None), it raises an exception.
mypy did not like that `tail` started as a boolean, and then was assigned to a string.
We can fix that by using a non-empty string, with a value that reveals the intention:
- tail = True + tail = "a truthy string" while tail:
An other way would be to use a type like `Union[bool,str]` but that would be more confusing I think.
Anyway, I'm not sure the quality of the code improved there. No point for mypy.
class Options: def __init__(self, url, shallow=False) -> None: self.shallow: bool = shallow self.url: str = url def save_config(options: Options): config = dict() config["url"] = options.url config["shallow"] = options.shallow with self.cfg_path.open("w") as fp: ruamel.yaml.dump(config, fp)
We are using `save_config` to serialize a "value object" (the `Options` class) into a YAML file.
mypy saw the first two lines, `config = dict()`, `config["url"] = options.ul` and wrongly deduced that `config` was a dict from strings to strings.
Then it complained about `config["shallow"]` that was assigned to a boolean.
We can fix that by forcing the `config` type to be a dict from string to `Any`: `Any` is a "magic" type we can use precisely for this kind of situation[5].
5: http://mypy.readthedocs.io/en/latest/kinds_of_types.html#the-any-type
- config = dict() + config: Dict[str, Any] = dict()
This a bit annoying, but the type annotation makes it clearer what the `config` is. 1 point for mypy.
When you use the GitLab API, you often have to use the 'project id'. The doc says you can use the `<namespace>/<project_name>` string if it is "url encoded", like this:
Get info about the Foo project in the FooFighters namespace: GET /api/v4/projects/FooFighters%2Foo
In python, the naive approach does not work:
import urllib.parse >>> urllib.parse.quote("FooFighters/Foo") FooFighters/Foo
Instead you have to specify a list of "safe" characters, that is characters that you *don't* need to encode.
The doc says the default value is "/".
Thus, here's what we use in `tsrc.gitlab`:
project_id = "%s/%s" % (namespace, project) encoded_project_id = urllib.parse.quote(project_name, safe=list())
mypy saw that the default value was a string, so he complained we were using a list instead. We fixed it by:
- encoded_project_name = urllib.parse.quote(project_name, safe=list()) + encoded_project_name = urllib.parse.quote(project_name, safe="")
Here, mypy forced us to follow an implicit convention. There are two ways to represent a list of characters in Python. A real list: `['a', 'b', 'c']`, or a string: `"abc"`. The authors of the `urllib.quote()` function decided to use the second form, so it's a good thing we follow this convention too.
An other win for mypy.
We already use a lot of tools in the hope of catching bugs:
Despite of this, mypy found bugs that tests, all the linters and all the reviewers did not find.
Here are a few of them. Feel free to try and find the bug yourself, the answer will be given after the ⁂ symbol.
class GitLabAPIError(GitLabError): def __init__(self, url: str, status_code: int, message: str) -> None: ... def handle_stream_errors(response: requests.models.Response) -> None: if response.status_code >= 400: raise GitLabAPIError( response.url, "Incorrect status code:", response.status_code)
This code wraps calls to the GitLab API, and arrange for a particular `GitLabAPIError` to be raised, when we make a "stream" request. (For instance, to download a GitLab CI artifact):
<center>⁂</center>
The problem here is that we inverted the `status_code` and `message` parameter. Easy to fix:
- raise GitLabAPIError( - response.url, "Incorrect status code:", response.status_code) + raise GitLabAPIError( + response.url, response.status_code, "Incorrect status code")
The bug was not caught because the code in question was actually copy/pasted from a CI script (and you usually don't write tests for CI scripts).
We actually don't need streamed responses anywhere in tsrc, so this is in fact dead code (and it is now gone from the code base)
def handle_json_errors(response: requests.models.Response): try: json_details = response.json() except ValueError: json_details["error"] = ( "Expecting json result, got %s" % response.text ) ... url = response.url if 400 <= status_code < 500: for key in ["error", "message"]: if key in json_details: raise GitLabAPIError(url, status_code, json_details[key]) raise GitLabAPIError(url, status_code, json_details) if status_code >= 500: raise GitLabAPIError(url, status_code, response.text)
This one is slightly more interesting. It is located near the previous one and handles errors for the calls to the GitLab API which returns JSON objects.
We of course have to catch 500 errors, which hopefully do not happen often.
In case of a status code between 400 and 499, we know there was a problem in the request we made, but we need to tell the user why the request was rejected.
Most of the time the GitLab API returns a JSON object containing a `error` or `message` key, but sometimes neither key is found in the returned object, and sometimes the text of the response is not even valid JSON code.
So we have to check for both keys in the JSON object, and if not found (if we exit the for loop), just store the entire JSON response in the exception.
<center>⁂</center>
The bug is in the second `raise GitLabAPIError`. We are passing an entire object where the `GitLabAPIError` expected a string.
The fix was:
- raise GitLabAPIError(url, status_code, json_details) + raise GitLabAPIError(url, status_code, json.dumps(json_details, indent=2))
Again, this was hard to catch with tests. The case where the JSON returned by GitLab did *not* contain a `error` or `message` key only happened once in the lifetime of the project (which explain why the code was written), so manual QA and unit tests did not need to check this code path.
Anyway, note we did not blindly wrote something like `str(json_details)` to convert the JSON object to a string. We found out it was used in a message displayed to the end user, thus we use `json.dumps(json_details), indent=2)` to make sure the message contains neatly indented JSON and is easy to read.
class LocalManifest: def __init__(self, workspace_path: Path) -> None: hidden_path = workspace_path.joinpath(".tsrc") self.clone_path = hidden_path.joinpath("manifest") self.cfg_path = hidden_path.joinpath("manifest.yml") self.manifest: Optional[tsrc.manifest.Manifest] = None def update(self) -> None: ui.info_2("Updating manifest") tsrc.git.run_git(self.clone_path, "fetch") tsrc.git.run_git(self.clone_path, "reset", "--hard", branch) def load(self) -> None: yml_path = self.clone_path.joinpath("manifest.yml") if not yml_path.exists(): message = "No manifest found in {}. Did you run `tsrc init` ?" raise tsrc.Error(message.format(yml_path)) self.manifest = tsrc.manifest.load(yml_path) @property def copyfiles(self) -> List[Tuple[str, str]]: return self.manifest.copyfiles def get_repos(self) -> List[tsrc.Repo]: assert self.manifest, "manifest is empty. Did you call load()?" return self.manifest.get_repos(groups=self.active_groups)
After you run `tsrc init git@example.com/manifest.git`, the manifest is cloned inside `<workspace>/.tsrc/manifest`.
Thus, the contents of the `manifest.yml` is in `<workspace>/.tsrc/manifest/manifest.yml>`.
The `LocalManifest` class represent this manifest repository.
Here's what happens when you run `tsrc sync`:
<center>⁂</center>
It's not really a bug, but mypy forced us to acknowledge that `LocalManifest.manifest` starts as None, and only gets its real value after `.load()` has been called.
We already have an `assert` in place in `get_repos()`, but mypy forced us to add a similar check in the `copyfiles` getter:
@property def copyfiles(self) -> List[Tuple[str, str]]: + assert self.manifest, "manifest is empty. Did you call load()?" return self.manifest.copyfiles
Some of the tests for tsrc are what we call end-to-end tests:
Here's how they work:
Thus, we don't have to mock file systems or git commands (which is doable but pretty hard), and things are easy to debug because in case of problem we can just `cd` to the test directory and inspect the state of the git repositories by hand.
Any way, those tests are written with a test helper called `GitServer`.
You can use this class to create git repositories, push files on some branches, and so on:
Here's what the helper looks like:
def add_repo(self, name: str, default_branch="master") -> str: ... url = self.get_url(name) return url def push_file(self, name: str, file_path: str, *, contents="", message="") -> None: ... full_path = ... full_path.parent.makedirs_p() full_path.touch() if contents: full_path.write_text(contents) commit_message = message or ("Create/Update %s" % file_path) run_git("add", file_path) run_git("commit", "--message", commit_message) run_git("push", "origin", "--set-upstream", branch) def tag(self, name: str, tag_name: str) -> None: run_git("tag", tag_name) run_git("push", "origin", tag_name) def get_tags(self, name: str) -> List[str]: src_path = self.get_path(name) rc, out = tsrc.git.run_git_captured(src_path, "tag", "--list") return out
You can use it like this:
def test_tsrc_sync(tsrc_cli: CLI, git_server: GitServer) -> None: git_server.add_repo("foo/bar") git_server.add_repo("spam/eggs") manifest_url = git_server.manifest_url tsrc_cli.run("init", manifest_url) git_server.push_file("foo/bar", "bar.txt", contents="this is bar") tsrc_cli.run("sync") bar_txt_path = workspace_path.joinpath("foo", "bar", "bar.txt") assert bar_txt_path.text() == "this is bar"
<center>⁂</center>
The bug is in `get_tags`:
def get_tags(self): rc, out = tsrc.git.run_git(src_path, "tag", raises=False) - return out + return out.splitlines()
The `get_tags` methods is also dead code. It has an interesting story.
The need for the `GitServer` helper occurred when I started writing end-to-end tests for tsrc and discovered I needed a test framework to write those end-to-end test.
It was vital that the `GitServer` implementation uses clean code.
So to have a clean implementation of the `GitServer` I of course used the best technique I know of: TDD.
You can find them in the [test_git_server.py](https://github.com/dmerejkowsky/tsrc/blob/main/tsrc/test/helpers/test_git_server.py[6] file.
6: https://github.com/dmerejkowsky/tsrc/blob/main/tsrc/test/helpers/test_git_server.py
Anyway, I was writing an end-to-end test for tsrc that involved tags.
I thought: "OK, I need a `.tag()` method in `GitServer`. So I also need a `get_tags()` method to check the tag was actually pushed". Thus I wrote the `get_tags` method, forgetting to write a failing test for `GitServer` first (that still happens after 5 years of TDD, so don't worry if it happens to you too.). At that point I only had my end-to-end test failing, so I made it pass and completely forgot about the `get_tags()` method. Oh well.
In the implementation of tsrc we often have to loop over a list of items, perform actions an each of them and record which one failed, and when the loop is done, display the error messages to the user.
For instance:
tsrc sync
Or:
$tsrc foreach :: Running `ls foo` on every repo
So in order to keep things <abbr title="Don't Repeat Yourself">DRY</abbr>, we have some high-level code that only deals with loop and error handling:
First, a generic `Task` method.
(Note that you can have your own generic types[7] with mypy. This is *awesome* because without it you get laughed at by C++ programmers)
7: https://mypy.readthedocs.io/en/latest/generics.html
class Task(Generic[T], metaclass=abc.ABCMeta): @abc.abstractmethod def description(self) -> str: pass @abc.abstractmethod def display_item(self, item: T) -> str: pass @abc.abstractmethod def process(self, item: T) -> None: pass
And then, a generic `SequentialExecutor` who knows how to execute a given task on a list of items and display the outcome:
class SequentialExecutor(Generic[T]): def __init__(self, task: Task[T]) -> None: self.task = task self.errors: List[Tuple[Any, tsrc.Error]] = list() def process(self, items: List[T]) -> None: if not items: return True ui.info_1(self.task.description()) self.errors = list() for item in enumerate(items): ... self.process_one(item) if self.errors: self.handle_errors() def process_one(self, item: T) -> None: try: self.task.process(item) except tsrc.Error as error: self.errors.append((item, error)) def handle_errors(self) -> None: # Display nice error message ... raise ExecutorFailed()
Thus we can inherit from `Task` to implement `tsrc sync`:
class Syncer(tsrc.executor.Task[Repo]): def process(self, repo: Repo) -> None: ui.info(repo.src) self.fetch(repo) self.check_branch(repo) self.sync_repo_to_branch(repo) def check_branch(self, repo: Repo): current_branch = tsrc.git.get_current_branch(repo_path) if not current_branch: raise tsrc.Error("Not on any branch")
Ditto for `tsrc foreach`:
class CmdRunner(tsrc.executor.Task[Repo]): def process(self, repo: Repo) -> None: full_path = self.workspace.joinpath(repo.src) rc = subprocess.call(cmd, ...) if rc != 0: raise CommandFailed
Did you spot the bug?
<center>⁂</center>
The bug is in here:
def process(self, items: List[T]) -> None: if not items: return True
This is the result of a bad refactoring. `Executor` used to track success of task by looking at the return value the `process()` method.
After a while, I found out it was clearer to just use exceptions for that, mostly when I implemented the `Syncer` class. (You can just raise an exception instead of adding lots of `ifs`).
But the early `return True` was left. Here `mypy` found something that would have puzzled an external reader. Almost everything function or method dealing with Executors and Tasks in tsrc either return None or raise Exception. What is that boolean doing here?
There were of course no tests left that checked the return value of `process` (they got refactored at the same time of the rest of the code), so the bug went unnoticed.
The last changed required by mypy was also pretty interesting. Here's the deal.
In tsrc, we often need to run `git` commands, but we can run them in two very different ways:
Here's what the implementation looked like:
def run_git(working_path, *cmd, raises=True): """ Run git `cmd` in given `working_path` If `raises` is True and git return code is non zero, raise an exception. Otherwise, return a tuple (returncode, out) """ git_cmd = list(cmd) git_cmd.insert(0, "git") options = dict() if not raises: options["stdout"] = subprocess.PIPE options["stderr"] = subprocess.STDOUT process = subprocess.Popen(git_cmd, cwd=working_path, **options) if raises: process.wait() else: out, _ = process.communicate() out = out.decode("utf-8") returncode = process.returncode if raises: if returncode != 0: raise GitCommandError(working_path, cmd) else: return returncode, out
And here's how to use it:
# run `git fetch`, leaving the output as-is run_git(foo_path, "fetch") # run git remote get-url origin rc, out = run_git(foo_path, "remote", "get-url", "origin", raises=False): if rc == 0: # Handle the case where there is no remote called 'origin' ... else: # Do something with out
Here's an other place we used `run_git`: we have a command named `tsrc status`, which displays a summary of the whole workspace. Here what's its output looks like:
tsrc status
And here is the implementation:
class GitStatus: def update_remote_status(self): _, ahead_rev = run_git( self.working_path, "rev-list", "@{upstream}..HEAD", raises=False ) self.ahead = len(ahead_rev.splitlines()) _, behind_rev = run_git( self.working_path, "rev-list", "HEAD..@{upstream}", raises=False ) self.behind = len(behind_rev.splitlines())
Found the bug yet?
<center>⁂</center>
The code in `update_remote_status()` assumes that `git rev-list` outputs a list of commits, and then count the lines in the output.
This works well if there *is* a remote branch configured:
$ git revlist HEAD..@{upstream} 30f729a6a0ec3926cf063f5f8a3953b89d7b252e ef564f0ef38a163beb3db52474ac4e256a6c2cd4
But if not remote is configured, the git command will fail with a message looking like:
$ git revlist HEAD..@{upstream} fatal: no upstream configured for branch 'dm/foo'
Since `update_remote_status()` does not check the return code, `self.ahead_rev` and `self.behind_rev` both get set to `1`, and the output looks like:
Oops.
<center>⁂</center>
But there is more to it than just this bug.
Notice the type of the return value of `run_git` depends *on the value of the `raises` parameter*.
At first I tried to annotate the function with a `Union` type, like this:
def run_git(working_path: Path, *cmd: str, raises=True) -> Union[Tuple[int, str], None]: pass
But then mypy complained for each and every line that used `run_git` with `raises=False`:
rc, out = run_git(..., raises=False) foo.py:39: error: 'builtins.None' object is not iterable
I thought about this and found out it was cleaner to split `run_git` into `run_git` and `run_git_captured`:
def run_git(working_path: Path, *cmd: str) -> None: """ Run git `cmd` in given `working_path` Raise GitCommandError if return code is non-zero. """ git_cmd = list(cmd) git_cmd.insert(0, "git") returncode = subprocess.call(git_cmd, cwd=working_path) if returncode != 0: raise GitCommandError(working_path, cmd) def run_git_captured(working_path: Path, *cmd: str, check: bool = True) -> Tuple[int, str]: """ Run git `cmd` in given `working_path`, capturing the output Return a tuple (returncode, output). Raise GitCommandError if return code is non-zero and check is True """ git_cmd = list(cmd) git_cmd.insert(0, "git") options: Dict[str, Any] = dict() options["stdout"] = subprocess.PIPE options["stderr"] = subprocess.STDOUT returncode = process.returncode if check and returncode != 0: raise GitCommandError(working_path, cmd, output=out) return returncode, out
True, there's a bit more code and slightly duplication but:
Now the intent about whether you want to capture the output or not is encoded *in the name of the function* (`run_git_captured` instead of `run_git`).
Also note that you can't forget about checking the return code anymore.
You can either write:
_, out = run_git_captured(repo_path, "some-cmd") # We know `out` is *not* an error message, because if `some-cmd` failed, an # exception would have been raised before `out` was set.
Or you can be explicit about your error handling:
rc, out = run_git_captured(repo_path, "some-cmd", check=False) if rc == 0: # Ditto, out can't be an error message else: # Need to handle error here
That alone would be a good reason to use mypy I think :)
Well, the `mypy` changes have been merged[8] and the CI now runs `mypy` in strict mode on every pull request.
8: https://github.com/dmerejkowsky/tsrc/pull/110
I'm still curious about the other benefits of type annotations I could not check (maintainability, code comprehension, ease of refactorings ...).
I guess we'll see how the next refactoring in tsrc goes.
We saw how mypy, while stilling making relative few false positives, still found inconsistencies, a few bugs, and even design problems.
So, my advice is for you to use it if you get a chance. Cheers!
----