💾 Archived View for dmerej.info › en › blog › 0041-rewriting-z-from-scratch.gmi captured on 2024-12-17 at 09:55:24. Gemini links have been rewritten to link to archived content


2017, May 21 - Dimitri Merejkowsky
License: CC By 4.0

z[1] is a tool that remembers every directory you visit in your terminal, and then lets you jump between those directories quickly.

1: https://github.com/rupa/z

Let's try to rewrite this functionality from scratch; maybe we'll learn a few things along the way.


Motivation

I started using `z` about one year ago.

It worked fine except for one thing.

Let's say I have two directories matching `spam`: `/path/to/spam-eggs` and `/other/path/to/eggs-with-spam`.

`z` will compute a score for each directory that depends on how often and how recently it was accessed.

So, in theory:

a directory that has low ranking but has been accessed recently will **quickly** have higher rank than a directory accessed frequently a long time ago. (excerpt of `z`'s README, emphasis is mine)

But in practice, if you switch from working in `spam-eggs` to working in `eggs-with-spam`, you will get the wrong answer for `z spam` a few times ("quickly" does not mean "instantly").

Also, *because* the algorithm uses the date of access, you cannot (easily) predict which directory it will choose.
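To make the problem concrete, here is a small sketch of a score that mixes frequency and recency. The weights and the `frecency` and `best_match` helpers are hypothetical, chosen only for illustration; they are not taken from `z`'s source. The point is that a directory visited one minute ago can still lose to a directory with a much higher visit count.

```python
HOUR, DAY, WEEK = 3600, 86400, 604800

def frecency(rank, last_access, now):
    """Hypothetical score: the visit count, weighted by how recent the last visit is."""
    age = now - last_access
    if age < HOUR:
        return rank * 4
    if age < DAY:
        return rank * 2
    if age < WEEK:
        return rank / 2
    return rank / 4

def best_match(entries, query, now):
    """Return the matching path with the highest score, or None if nothing matches."""
    matches = [(frecency(rank, ts, now), path)
               for path, (rank, ts) in entries.items() if query in path]
    return max(matches)[1] if matches else None

# path -> (number of visits, timestamp of last visit)
now = 1_500_000_000
entries = {
    "/other/path/to/eggs-with-spam": (24, now - 2 * HOUR),  # frequent, two hours ago
    "/path/to/spam-eggs": (4, now - 60),                     # rare, one minute ago
}

# The directory we just left still wins: 24 * 2 = 48 against 4 * 4 = 16
print(best_match(entries, "spam", now))
```

With these made-up weights, `spam-eggs` needs several more visits before it overtakes `eggs-with-spam`, and the answer also changes as time passes without any new visit.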

Step one: choose a database format

`z` uses a database that looks like this:

/other/path/to/eggs-with-spam|24|1495372880
/path/to/spam-eggs|4|1495372491
...

Each line contains the path, the number of times it was accessed, and the timestamp of the last visit, separated by `|`.

We'll use `json`, and since we want an algorithm that does not depend on time, we'll only store the number of accesses:

{
  "/other/path/to/eggs-with-spam": 24,
  "/path/to/spam-eggs": 4,
  ...
}

Why `json`? Because it's quite easy to read and write (including pretty print) in any language, while still being editable by humans.

It is still possible to add new data later, should we need to.
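Reading and writing such a database takes only a few lines. The sketch below is one possible way to do it, not necessarily what the real script does: the signatures (an explicit `db_path` argument) and the `cwd-history.json` file name are assumptions made for the example.

```python
import json
import os
import tempfile

def read_db(db_path):
    """Return the path -> access count mapping, or an empty dict if there is no database yet."""
    try:
        with open(db_path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}

def write_db(db_path, entries):
    """Pretty-print the entries so the file stays easy to edit by hand."""
    with open(db_path, "w") as f:
        json.dump(entries, f, indent=2, sort_keys=True)
        f.write("\n")

def add_to_db(db_path, path):
    """Increment the access count of path, starting at 1 for a new directory."""
    entries = read_db(db_path)
    entries[path] = entries.get(path, 0) + 1
    write_db(db_path, entries)

# Usage with a throwaway database file (hypothetical file name)
db_path = os.path.join(tempfile.mkdtemp(), "cwd-history.json")
add_to_db(db_path, "/path/to/spam-eggs")
add_to_db(db_path, "/path/to/spam-eggs")
print(read_db(db_path))
```

Using `indent=2` keeps one entry per line, which is what makes the file pleasant to fix by hand.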

Step two: decide how to handle non-existing directories

Let's say you work in `/tmp/foo`, and no other directory is called `foo`. You also create the directories `/tmp/foo/src` and `/tmp/foo/include`.

With `z`, once the three paths `/tmp/foo`, `/tmp/foo/src` and `/tmp/foo/include` are stored in the database, they will stay there forever.

This means that if you remove `/tmp/foo`, `z foo` will still try to go into the non-existing directory. But, if you re-create `/tmp/foo` later on, `z foo` will work again.

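In our rewrite, we'll deal with this situation another way: a clean-up step removes every path that no longer exists from the database, and a path is removed as soon as `cd` into it fails.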

Step three: write a command line tool

I decided to name the tool `cwd-history`, and to use a command-line syntax looking like `git`, with several possible "verbs" for the various actions, such as `cwd-history add`, `cwd-history list` and `cwd-history remove`.

The code is on GitHub[2] if you want to take a look.

2: https://github.com/dmerejkowsky/dotfiles/blob/master/bin/cwd-history

Some notes:

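import os

def get_db_path():
    # The database lives under ~/.local/share/zsh; create the directory if needed
    zsh_share_path = os.path.expanduser("~/.local/share/zsh")
    os.makedirs(zsh_share_path, exist_ok=True)
    # ...

def add_to_db(path):
    # Resolve symlinks so that each directory is stored under a single path
    path = os.path.realpath(path)
    # ...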

def clean_db():
    cleaned = 0
    entries = read_db()
    for path in list(entries.keys()):
        if not os.path.exists(path):
            cleaned += 1
            del entries[path]
    if cleaned:
        print("Cleaned", cleaned, "entries")
        write_db(entries)
    else:
        print("Nothing to do")

import operator

def list_db():
    entries = read_db()
    # Sort by number of accesses, least visited first
    sorted_entries = sorted(entries.items(),
                            key=operator.itemgetter(1))
    print("\n".join(x[0] for x in sorted_entries))
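The excerpts above can be tied together with a `git`-like command line. Here is a minimal sketch using `argparse` sub-commands. It works on an in-memory dict instead of the JSON file to stay short, and the `build_parser` and `run` helpers, as well as the exact verb names (`clean` in particular), are assumptions made for this example rather than a copy of the real script.

```python
import argparse
import os

def build_parser():
    parser = argparse.ArgumentParser(prog="cwd-history")
    verbs = parser.add_subparsers(dest="verb", required=True)
    # Verbs that take a directory as argument
    for name in ("add", "remove"):
        verbs.add_parser(name).add_argument("path")
    # Verbs without arguments
    verbs.add_parser("list")
    verbs.add_parser("clean")
    return parser

def run(entries, argv):
    """Apply one verb to the entries dict and return the lines to print."""
    args = build_parser().parse_args(argv)
    if args.verb == "add":
        entries[args.path] = entries.get(args.path, 0) + 1
    elif args.verb == "remove":
        entries.pop(args.path, None)
    elif args.verb == "clean":
        for path in list(entries):
            if not os.path.exists(path):
                del entries[path]
    elif args.verb == "list":
        # Least visited first, same order as list_db()
        return sorted(entries, key=entries.get)
    return []

db = {}
for argv in (["add", "/tmp/foo"], ["add", "/tmp/foo"], ["add", "/tmp/foo/src"]):
    run(db, argv)
print("\n".join(run(db, ["list"])))
```

Keeping the verbs as thin wrappers around a dict makes each of them easy to test without touching the real database.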

Step four: hook into zsh

We want to call `cwd-history add $(pwd)` every time `zsh` changes the current working directory.

This is done by writing a function and adding it to a special array, `chpwd_functions`:

function register_cwd() {
  cwd-history add "$(pwd)"
}
typeset -gaU chpwd_functions
chpwd_functions+=register_cwd

Step five: filtering results

Instead of trying to guess the best result, we'll let the user choose by hooking into `fzf`.

I already talked about `fzf` in a previous article[3]. The gist of it is that you can pipe the output of any command into `fzf`, and let the user interactively select one result from the list.

3: fzf for the win

The implementation looks like this:

  cwd_list=$(cwd-history list)
  ret="$(echo $cwd_list| fzf --no-sort --tac --query=${1})"
  cd "${ret}"
  if [[ $? -ne 0 ]]; then
    cwd-history remove "${ret}"
  fi

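Notes: `--no-sort` and `--tac` make `fzf` keep the order of `cwd-history list` but reversed, so the most visited directories come first, and if `cd` fails the entry is removed from the database right away.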

Step six: from the shell to neovim and vice-versa

One last thing. Since I do most of my editing in neovim, I'm always looking for ways to achieve similar behaviors in my shell and in my editor.

So let's see how we can transfer information about visited directories from one tool to another.

From neovim to the shell

This is kind of an ugly hack.

First, I add an autocommand to write neovim's current directory into a hard-coded file in `/tmp/`:

" Write cwd when leaving
function! WriteCWD()
  call writefile([getcwd()], "/tmp/nvim-cwd")
endfunction

autocmd VimLeave * silent call WriteCWD()

And then, I wrap the call to `neovim` in a function that reads the content of the file and then calls `cd`, but only if `neovim` exited successfully.

# Change working dir when Vim exits
function neovim_wrapper() {
  nvim "$@"
  if [[ $? -eq 0 ]]; then
    cd "$(cat /tmp/nvim-cwd 2>/dev/null || echo .)"
  fi
}

From the shell to neovim

To go the other way, I just call `fzf#run()` from the fzf.vim[4] plugin with `cwd-history list` as source and `:tcd` as sink:

4: https://github.com/junegunn/fzf.vim

function! ListWorkingDirs()
  call fzf#run({
        \ 'source': "cwd-history list",
        \ 'sink': "tcd"
        \})
endfunction

command! -nargs=0 ListWorkingDirs :call ListWorkingDirs()
nnoremap <leader>l :ListWorkingDirs<CR>

Conclusion

And there you have it.

`z` is 458 lines of `zsh` code.

My re-implementation is 75 lines of Python, 6 lines of `zsh`, and 8 lines of `vimscript`.

It shares the database between the shell and the editor, it is never wrong, and the database stays clean and editable by hand.

Not bad I think :)

PS: You can also use `z` directly with `fzf` with a few lines of code, as shown in the fzf wiki.
