💾 Archived View for breadpunk.club › ~toast › granitedb.gmi captured on 2023-01-29 at 03:17:25. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

████████╗ ██████╗  █████╗ ███████╗████████╗
╚══██╔══╝██╔═══██╗██╔══██╗██╔════╝╚══██╔══╝
   ██║   ██║   ██║███████║███████╗   ██║   
   ██║   ██║   ██║██╔══██║╚════██║   ██║   
   ██║   ╚██████╔╝██║  ██║███████║   ██║   
   ╚═╝    ╚═════╝ ╚═╝  ╚═╝╚══════╝   ╚═╝

Go to site menu

GraniteDB

This page is a bit of a mess at the moment, it's still being written

GraniteDB is a minimalist relational database management system. Currently in-development, being written in Rust and targetting Linux.

Project goals

A single standalone server binary no larger than 10MB
Performant under 10,000 concurrent reads and 1000 concurrent writes
A compact and fully-specified disk format
A succinct, expressive, and orthogonal query language
A byte-based (rather than text-based) over-the-wire format

The aim is to design a new database system from first principles, implementing only the functionality that is strictly required from a database. No user management, no access control, no views. Just tables, records, and relationships.

Ideally it will be possible to write a proof-of-concept Granite client for Uxn, to show that the system is simple.

Points of difference from traditional relational database systems

TODO:

- Make notes on the short-comings of SQL as a query language (perhaps as a separate page)

- text-vs-byte over-the-wire format

- no need for user management for current use-cases

- relationships between tables being explicit

Query language

Here's a query written in SQL:

SELECT part.name FROM part 
INNER JOIN supplier ON part.supplier_id=supplier.id 
WHERE supplier.city="Auckland";

Here's the same query, but written in Quarry, the GraniteDB query langauge:

part.name, part->supplier.name : part->supplier.city="Auckland"

Both of these queries would return a response like the following:

<<<table>>>

The relationships between tables are explicitly named in Granite. In the query above, the relationship between `part.supplier_id` and `supplier.id` is defined inside the database. To get the supplier name for each part one can just use `part->supplier.name`.

To elaborate further, each row in the `part` table points to exactly one row in the `supplier` table (via the `part.supplier_id` column pointing to the `supplier.id` column). The `->` syntax is used to traverse a one-to-one relationship, where for `part->supplier` there is guaranteed to be exactly one supplier for each part. This relationship can also be traversed backwards as a many-to-one relationship with the `=>` syntax, where `supplier=>part` would return zero or more parts for each row of the `supplier` table.

Relationships and foreign-key constraints are bundled together, so relationships are always safely constrainted and constraints can always be traversed with no extra work.

Syntax

A **table** is a set of tabular data. A table contains zero or more columns, zero or more rows, and zero or more relationships. Neither columns, rows, or relationships are ordered.

A **column** comprises a name and a data type. Each row of a table contains exactly one data value for each column of the table.

A **row** contains as many data values as the table has columns.

A **relationship** is a link between a column in one table and a column in another table.

As an example, consider the following tables:

`part`, with columns `id` and `name`
`supplier`, with columns `id`, `name`, and `city`
`supplier_part`, with columns `part_id`, `supplier_id` linking to the `part` and `supplier` tables

CREATE TABLE part (id: u64, name: str);

CREATE TABLE supplier (id: u64, name: str, city: str);

CREATE TABLE supplier_part (part_id: u64, supplier_id: u64)
  RELATIONSHIP part_id->part.id
  RELATIONSHIP supplier_id->supplier.id;

A **table expression** is an expression that returns a table.

`part` is a valid table expression, returning the full content of the `part` table.
`part.name` returns a single column.
`part : part.name="Bolt"` filters rows.
`part : part.name ∈ ["Bolt", "Screw"]`uses a **column literal**
`part->supplier.name` uses a relationship

Table expressions have two parts. The left half restricts a table to a subset of columns, the right half restricts the rows.

More ideas

Enum fields. Declare a string field as `*str[1..20]`, and a hidden table is created to handle it. Instead of using up to 20 bytes to store and index on the value, only 1 byte is used, and no joining is needed. This is a neat idea, but we lose one of the main advantages of a foreign table, which is that the foreign-key constraint prevents typos from creating new erroneous entries. How about we have some kind of indicator to show that a table is being used for this purpose? Or can this be automatically detected? Yeah, do that, automatically detect the tiny join. The extra piece of join syntax is no hassle, `order->status.name` vs `order.status`.

`{order->part.name, order.quantity}` will return a row for each order.

`{+order.quantity}` will return a single scalar.

`{/order.quantity}` will return a single scalar, returning the column default if empty.

`{/order.quantity, +order.quantity}` will throw an error, need unique names on columns.

`{avg: /order.quantity, sum: +order.quantity}` will return a single row.

`{order->part.name, +order.quantity}` will return a row for each order. `quantity` **is not** summed.

`{order->part.name, {+order.quantity}}` will return a row for each order. `quantity` **is** summed.

`{order->part.name!, +order.quantity}` will return a row for each _part_. `quantity` is summed over each part.

`{order->part.name, status.name}` will return an error, there's so far no use for Cartesian products in Granite.

`{order->part.name!, order.status}` will throw an error, can't shrink down `status`.

`{order->part.name!, order.status, +order.quantity}` will throw an error, can't shrink down `status`.

`{order->part.name!, order.status!, +order.quantity}` will return a row for each part and status, summing `quantity` over them.

`{part.name, part->supplier.name | part.name=["Nail","Rivet"]}` uses the `=` symbol to check for membership of a scalar in a column expression.

`[]` is a column expression, `()` is a row expression. `name: value` is an alias.

`order->part.(name, supplier.name)` as a way of concisely referring to multiple fields, following from the row expression syntax.

`DELETE { table | condition }`

`TABLE table (column_name: type,)`

Testing the language

TABLE part (
    id: u32!,
    name: str[1..20],

    orders: id => order.part_id,
);

TABLE supplier (
    id: u32!,
    name: str[1..20],
    
    orders: id => order.supplier_id,
);

TABLE order (
    part_id: u32!,
    supplier_id: u32!,
    status_id: u8,
    project_id: u32?,
    quantity: u32,
    cost: d64.2,

    part: part_id -> part.id,
    supplier: supplier_id -> supplier.id,
    status: status_id -> status.id,
    project: project_id -> project.id,
);

TABLE status (
    id: u8!,
    name: str[1..20],
);

TABLE project (
    id: u32!,
    name: str[1..20],
    manager_id: u64?,
    budget: d64.2,
    
    manager: manager_id -> employee.id,
    orders: id => order.project_id,
);

TABLE department (
    id: u32!,
    name: str[1..20],
    code: str[3],
);

TABLE employee (
    id: u32!,
    name: str[1..],
    department_id: u64?,
    manager_id: u64?,

    department: department_id -> department.id,
    manager: manager_id -> employee.id,
)


// Get the total number of parts in the database
{ #part }

// Get a list of all projects that have exceeded their budgets, with their current cost and budgets
{ ~project!, project.name, project.budget, cost_to_date: +project=>orders.cost | cost_to_date > budget }

// Get the total quantity of parts that have ever been ordered from Fisher Appliances Ltd
{ +order.quantity | order.supplier->name="Fisher Appliances Ltd" }

// Get the IDs and names of all parts where fewer than 1000 have ever been purchased
{ order->part.id!, order->part.name | +order.quantity > 1000 }

// Get the names of all managers
// None values are excluded from optional relationships unless `table->relationship.column?` is specified
{ ~employee->manager.id!, employee->manager.name }