Mgamma

Introduction

It is February 2023 and, after some detours in pangenomics and vcflib, I am back on linear mixed models (LMMs). I have decided the time is right to pursue a new tool named `mgamma`, mostly because mgamma sounds nice and may be about right for people in genetical genomics ;).

Mgamma will be written in Zig with a dash of Guile, mostly because these are minimalistic languages that complement each other. I am taking a cue from Joe Armstrong, who said that part of the success of Erlang was its few dependencies; even so, I am settling for Zig and Guile. Development will be hosted on our own git instance.

Bringing in lmdb (20230208)

The first step for mgamma is to read and write lmdb files. Thanks to work by Arun Isaac we can store versioned and deduplicated vectors in an lmdb file. I created a build environment that includes lmdb as a shared lib. The library can simply be linked in by passing `-llmdb`.

Next, I tried compiling

https://github.com/lithdew/lmdb-zig/

but it is not even happy with zig version 0.9. One particular error

contrib/lmdb-zig/lmdb.zig:749:23: error: expected type 'builtin.CallModifier', found '@TypeOf(.{})'
    const rc = @call(.{}, function, args);

is probably easy to fix, but I have to see whether I want to carry the lmdb-zig code along, since lmdb itself is a simple C library. Zig has a translation command:

zig translate-c $GUIX_ENVIRONMENT/include/lmdb.h > lmdb.zig

and the generated file compiles. The only problem is that I am now getting

mgamma.zig:22:9: error: variable of type 'type' must be const or comptime
    var env =  [*c]?*lmdb.MDB_env;
        ^~~

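The compiler complains because the right-hand side of the assignment is a type rather than a value. I'll have to look at the C code and maybe roll my own opaque pointer. The lmdb-zig package uses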

try call(c.mdb_env_create, .{&inner});

(note the ampersand) where

inner: ?*c.MDB_env

Meanwhile lmdb simply says

int mdb_env_create(MDB_env **env)

How hard can it be? The lmdb docs are at

http://www.lmdb.tech/doc/group__mdb.html

This led to a blog post on Zig pointers

zig-pointers-and-c.gmi
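
For reference, the `MDB_env **env` signature above is the usual C out-parameter pattern: the caller owns a pointer variable and passes its address, which is why lmdb-zig writes `&inner`. Below is a minimal sketch of that pattern in plain C. Note that `toy_env`, `toy_env_create`, `toy_env_close` and `run_demo` are hypothetical stand-ins and not part of lmdb, so the example builds without the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for lmdb's MDB_env. In lmdb the struct layout is
   hidden from callers; it is spelled out here only to keep the sketch in
   one file. */
typedef struct toy_env {
    size_t map_size;
} toy_env;

/* Same shape as `int mdb_env_create(MDB_env **env)`: returns 0 on success
   and stores the new handle through the out-parameter. */
int toy_env_create(toy_env **env)
{
    if (env == NULL)
        return -1;
    toy_env *e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->map_size = 0;
    *env = e;   /* write the handle into the caller's variable */
    return 0;
}

/* Releases a handle obtained from toy_env_create. */
void toy_env_close(toy_env *env)
{
    free(env);
}

/* Caller side: declare a pointer *value*, then pass its address. */
int run_demo(void)
{
    toy_env *env = NULL;            /* a null pointer value, not a type */
    int rc = toy_env_create(&env);  /* note the ampersand */
    if (rc != 0)
        return rc;
    assert(env != NULL);            /* the callee filled in the handle */
    toy_env_close(env);
    return 0;
}
```

Calling `run_demo()` from a `main` function returns 0 on success. In Zig the same idea would read roughly `var env: ?*lmdb.MDB_env = null;` followed by `lmdb.mdb_env_create(&env)`, which matches the lmdb-zig call quoted earlier.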

The genotype format

This Common Lisp (CL) code builds an lmdb genotype file

https://github.com/genenetwork/cl-gn/blob/main/genodb.lisp

it exports

:genotype-db-current-matrix
:genotype-db-matrix
:genotype-db-matrix-ref
:genotype-db-matrix-row-ref
:genotype-db-matrix-column-ref

Kinship (20230319)

In the next phase we are going to compute the kinship matrix:

mgamma-kinship.gmi

Command line parsing

Zig

Guile