💾 Archived View for dioskouroi.xyz › thread › 29430720 captured on 2021-12-03 at 14:04:38. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

Pg_GraphQL: A GraphQL Extension for PostgreSQL

Author: samwillis

Score: 218

Comments: 65

Date: 2021-12-03 14:59:16

Web Link

________________________________________________________________________________

akulkarni wrote at 2021-12-03 15:48:28:

As co-founder of another database company in the PostgreSQL ecosystem, I have to say that I'm really impressed with the quality and velocity of launches from the Supabase team. Nothing else to add, except please keep up the great work!

kiwicopple wrote at 2021-12-03 16:00:10:

thanks Ajay, we're also big fans of yours here at Supabase :)

nextaccountic wrote at 2021-12-03 17:24:15:

Hey it's unrelated but, what's the relationship between Supabase and Postgrest?

My understanding is that Postgrest is a separate project that began development in an unrelated way to Supabase, and Supabase is just an user. Is that correct?

But, Supabase is also a contributor to Postgrest, right? Does Supabase employs engineers dedicated to it?

oliverrice wrote at 2021-12-03 17:31:27:

> What's the relationship between Supabase and Postgrest?

Supabase uses PostgREST for its automatic/reflected REST API

> Supabase is just an user. Is that correct?

yep!

> Does Supabase employs engineers dedicated to it?

https://supabase.com/blog/2020/06/15/supabase-steve-chavez

nextaccountic wrote at 2021-12-03 18:06:44:

Cool, thanks!

nextaccountic wrote at 2021-12-03 17:19:27:

What's your company?

mrkurt wrote at 2021-12-03 17:58:58:

Timescale. Another Postgres adjacent startup that's doing amazing work.

AdriaanvRossum wrote at 2021-12-03 17:24:00:

Click his name if you want to find out :)

cmrajan wrote at 2021-12-03 15:58:39:

I've explored multiple Graphql solutions for PostgreSQL in the past and couldn't convince of the resolvers based approach and always felt that an extension would be the right solution. The closest I've come across and used so far is Graphjin

https://github.com/dosco/graphjin

Thanks Supabase for the fantastic extension!

lucasyvas wrote at 2021-12-03 16:52:40:

There is no way every cloud provider won't jump on this as soon as it's stable for their Postgres offerings.

darksaints wrote at 2021-12-03 17:33:53:

Hopefully so. It is awesome that supabase is releasing this as open source, as they could have easily kept this proprietary. I probably won't ever be able to use supabase for my primary job (the company is obnoxiously invested in an Azure/AWS hybrid cloud). But I do have some profitable side projects with the ability to choose my own service providers, and this sort of developer goodwill really goes a long way.

yurisagalov wrote at 2021-12-03 18:25:00:

Supabase is open source. I believe the hosted version is on AWS, but there's no reason you couldn't run it on your company's Azure/AWS directly? :)

sporkland wrote at 2021-12-03 18:08:02:

I enjoyed Sam Newman's take on this capability [1]

Great to see AWS providing direct data coupling as a service. /s

This service allows you to directly map a GraphQL endpoint to a database table. It’s like putting getters and setters on an object and claiming your encapsulating private variables. The end result is coupling between GraphQL clients and the underlying datasource.

Information hiding is a key concept in independent change. Can I change the provider (of the GraphQL) endpoint independent of the clients? Directly exposing internal data structure makes this very difficult.

And [2]:

So a few people have asked why I have this snarky response. What is my problem with this service? Well, to be clear, it’s not an issue with GraphQL, it’s an issue with direct coupling with underlying datasources #thread

The service as advertised makes it simple to map a GraphQL definition against a database. Now, what’s the problem with this? Well, the devil here is in the detail. But fundamentally it comes down to how important information hiding is to you.

... see the thread for more ...

[1]

https://twitter.com/samnewman/status/1346541251617828877

[2]

https://twitter.com/samnewman/status/1346749556583780352

Twisol wrote at 2021-12-03 18:31:15:

I don't think Sam is wrong, exactly -- and that second thread does add some of the missing nuance -- but I do think there's a key missing piece here.

He's absolutely right that you shouldn't couple directly to the underlying representation. But Postgres lets you transparently define views that can be queried (and, with a little more elbow grease, updated) just like any other table. You can provide decoupling from within the database, and do so on-demand as your domain and your data model evolve.

I don't enjoy planting a separate bespoke API server on top of the database. Usually you end up lifting many of the same capabilities the database already has (auth, batching, ...) to your custom API, so a lot of the server is just boilerplate. Many API operations are natural consequences of your data model; there's little business or engineering value-add once you've settled on the latter, you're just writing glorified FFI bindings.

Lazy engineering will cause problems no matter what architectural stack you end up using. But a state-first architecture doesn't have to mean a complete loss of loose coupling -- it just means different techniques for achieving it.

andrewingram wrote at 2021-12-03 16:16:45:

Whilst I don't like the paradigm of generating a GraphQL schema from your database (or vice-versa!), I appreciate this one for having a specific goal of being able to run as part of a small database VM. So congrats on the release!

darksaints wrote at 2021-12-03 17:24:31:

This is awesome! Really creative approach to the problem. Does this also support custom datatypes that can be marshalled to json? For example, PostGIS geometries?

oliverrice wrote at 2021-12-03 17:37:38:

Currently, any datatypes that are not in the allow list

https://supabase.github.io/pg_graphql/reflection/#type-conve...

are cast as strings, but prioritizing a JSON conversion if one is available is a great idea that we'll look into

darksaints wrote at 2021-12-03 17:45:59:

Personally I like the idea of allowing the user to create a cast:

https://www.postgresql.org/docs/current/sql-createcast.html

There are some weird permissions issues to work out IIRC, but this allows the user to specify how the json should be marshalled, and then can be implicitly or explicitly used by any sql statements. This is how PostgREST solves this problem, and I currently have a few custom casts created for range and multirange datatypes through my PostgREST server.

oreilles wrote at 2021-12-03 17:51:41:

Don't know how your JSON generation works but when using `row_to_json()`, all PostGIS geometries are automatically casted to GeoJSON objects by default which is very convenient.

ndejaco wrote at 2021-12-03 18:36:07:

I see this as a neat way of exposing a graphql interface but a pattern with a number of limitations that are traded off against the goal of reducing the network latency. Definitely useful for users who want a single datasource and only want to expose access through GraphQL (cannot scale db resources indepedently of the api access layer). Wondering if the graphql engine could make use of the postgres stats collector for optimizing graphql resolver execution ex) resolver dispatch scheduling or pre-execution query optimization?

mmargerum wrote at 2021-12-03 16:07:16:

Thanks for giving this to the community. Will definitely be looking at you for a SASS app i'm building.

wiradikusuma wrote at 2021-12-03 17:22:49:

I use Dgraph, a "native" GraphQL DB. But I still transform the incoming/outgoing JSON to Java object and transforming the object again to/from the DB. Why? 2 things: Business Logic and Security/Access Control. Yes I can put (some) Business Logic in Dgraph DB, but it feels "leaky" like Stored Procedure (maybe because I come from Java). I feel it's tiring and stupid, but I don't know a better solution.

I reckon for this extension, the business logic uses Stored Procedure and Access Control uses PG's user role? Many apps I know simply have 1 user "myappuser" (or even default user) to access its DB.

Twisol wrote at 2021-12-03 18:43:45:

> Many apps I know simply have 1 user "myappuser" (or even default user) to access its DB.

Sure; either those apps don't need to differentiate access between their users, in which case one role is sufficient, or they reimplement their own auth system, in which case you'd use Postgres' own rather robust auth system instead. It comes down to the needs of the domain; you'll solve the problems differently depending on what approach you take, but you need to solve the same problems.

Yes -- I've found it very tiring, as you put it, to keep reimplementing the same boilerplate in every API server just to lift the operations my database can already support out to an HTTP frontend. Postgres' auth means I don't have to make or press into service a separate auth system, and there are multiple ways to handle business logic orthogonally.

Stored procedures and triggers work well, but are synchronous within the current transaction, and sometimes simply don't map well to the domain needs. You can also use the AWAIT and NOTIFY statements to set up asynchronous external workers. I find this has a positive effect on the data model, as you're forced to consider what states a system will pass through during an asynchronous flow.

Apaec wrote at 2021-12-03 16:30:54:

How does it compare with

https://github.com/solidsnack/GraphpostgresQL

? It's like 7 years older and it takes the same approach, it seems.

solidsnack9000 wrote at 2021-12-03 17:34:43:

GraphpostgresQL is no longer maintained and hasn't been updated for many years.

I am the person who wrote it. It was a proof-of-concept, written in PL/pgSQL. This made it fairly easy to set up and test but made maintenance and contributions very difficult.

oliverrice wrote at 2021-12-03 16:48:56:

Very cool! I was not familiar with this project. It does takes a similar approach to query building. It is described in the README as an alpha POC

oliverrice wrote at 2021-12-03 15:35:18:

author here! happy to answer questions

andrew_ wrote at 2021-12-03 18:14:01:

We use Postgraphile heavily, it drives the entirety of our API. A few feature questions; Is there any plan to support, or has there been any conversation around:

- custom directives that run custom code

- custom routes on the server that do custom things? Or perhaps proxying to another app to handle alternate/custom routes

- schema injection, custom resolver logic

thelastbender12 wrote at 2021-12-03 15:52:14:

> After tallying the resources reserved for PostgreSQL, PostgREST, Kong, GoTrue, and a handful of smaller services, we were left with a total memory budget of ... 0 MB

This was a really interesting bit to me, could you give detail on how the memory usage gets split up across these? Thank you.

oliverrice wrote at 2021-12-03 16:19:07:

sure, if we're talking free-tier:

Postgres runs in a 1 GB VM all by itself with some optimizations to get the most that limited hardware

All other services are in a second 1 GB VM.

You can see a list of those services here

https://github.com/supabase/supabase/blob/master/docker/dock...

The memory use per process can differs by use-case, so its important to leave a bit of headroom. For example, PostgREST's memory consumption can grow if the amount of data being returned from its queries is large.

By the time you include:

- supabase studio

- kong

- auth

- storage

- meta

if they each take only 100MB, its pretty snug on a 1 GB VM!

httgp wrote at 2021-12-03 15:55:30:

Very excited about!

Do you have performance comparisons for the same datasets with Hasura / Graphile?

oliverrice wrote at 2021-12-03 16:01:53:

We don't have an apples-to-apples comparison for similar hardware and queries at this point but here are the links to performance figures for pg_graphql and grahile. Hasura does not publish throughput number afaik

pg_graphql:

https://supabase.github.io/pg_graphql/performance/

graphile:

https://www.graphile.org/postgraphile/performance/

We'll certainly be keeping an eye on performance as it gets closer to GA

kiwicopple wrote at 2021-12-03 16:02:40:

just want to give a big shout out to Oli - his implementation really ingeneous, leveraging multiple parts of the Supabase stack while also being agnostic enough to work natively inside Postgres (or with other tools)

{supabase team}

config_yml wrote at 2021-12-03 17:00:55:

So you're calling Graphql via PostgREST, but I didn't ready about why you're including Graphql in the first place. Isn't it orthogonal to PostgREST which is already used in your stack?

oliverrice wrote at 2021-12-03 17:28:21:

PostgREST does solves a few of the core challenges that GraphQL is intended to address like over/under-selection and resource embedding (relationships).

Even so, there has still been a lot of interest in GraphQL so users can leverage the growing ecosystem for things reflecting the data model/types for client usage, offline caching, etc

rlili wrote at 2021-12-03 18:08:25:

Does this work in YugabyteDB/CockroachDB?

kaspermarstal wrote at 2021-12-03 17:04:12:

I've been looking for something like this for a long time, so cool! Great work!

Does it work with TimescaleDB?

nwienert wrote at 2021-12-03 16:41:23:

Congrats on release. We use Hasura but haven’t been happy with their speed of iteration, lack of communication of various large bugs, and lack of M1 support / communication.

But we do use a lot of their more advanced features like being able to use aggregates in sorts, aggregates in results, custom functions, etc. What are your plans there and will you have a public roadmap?

oliverrice wrote at 2021-12-03 17:02:14:

There is high level public roadmap that covers the minimum features required before we'd consider going GA here:

https://supabase.github.io/pg_graphql/roadmap/

Its early days, so the conversations around aggregates haven't happened yet, but I'm optimistic that they'll make an appearance in a future release

rubyist5eva wrote at 2021-12-03 17:50:46:

how do you implement search/sort against something like this?

xrd wrote at 2021-12-03 15:37:52:

I've been using Hasura extensively and love it. I'm curious what the delta is between the two. It's so simple to run Hasura on dokku and instantly get an amazing UI for postgres with the bonus of graphql. It would take a lot for me to switch.

piaste wrote at 2021-12-03 15:46:25:

> It would take a lot for me to switch.

We were pretty happy with Hasura but had to switch to Postgraphile due to poor multi-database support, bummer.

(Postgraphile is not as polished as Hasura in some ways, but since it can used as a library, it's easy to dynamically create N instances of it with different configurations at runtime. Hasura required our ops team to define a new instance of the service in the docker-compose.yml file for each database)

jensneuse wrote at 2021-12-03 18:52:55:

Hey, I'd be curious to hear your thoughts on our solution. It allows you to combine any number of Databases with PostgreSQL, MySQL, SQLite and SQLServer are supported, here's the full list of supported DataSources: [0]

Our solution comes with a feature called Namespacing [1], which means, every API has its own namespace so there are 0 collisions between the different types and fields. It even goes so far that we also namespace directives so you can have a combined schema of multiple GraphQL APIs and can still use the namespaced directives on fields from that particular upstream.

Disclaimer, I'm the founder of WunderGraph.

[0]:

https://wundergraph.com/docs/overview/datasources/overview#o...

[1]:

https://wundergraph.com/docs/overview/features/api_namespaci...

kall wrote at 2021-12-03 16:11:08:

Pretty recently, as of 2.0, hasura has added support for multiple databases.

piaste wrote at 2021-12-03 17:28:05:

It's a VERY fragile support:

https://github.com/hasura/graphql-engine/issues/6648

Basically, if two entities in the databases have the same names, Hasura fails unless you manually define a unique `custom_name` for each such entity.

Given that the most common multiple database scenario involves different databases with either the exact same schema (one-db-per-tenant) or similar schemas (staging vs. production database), it forces you to painstakingly set a custom_name for basically every single entity in your db.

Thankfully there is an API so in theory you could set this programmatically, but it still means that your client code needs to be manually kept in-sync with whatever custom name generator rule you used.

the_arun wrote at 2021-12-03 16:10:06:

What are the use cases we could use a direct plugin to database for GraphQL? I mean where do we write business logic before data becomes GraphQL?

warent wrote at 2021-12-03 16:12:28:

Security definers are a powerful feature of postgres that allow a lot of business logic to live in the database. With an extension like this, I reckon postgres can theoretically become a first class public api for a surprising number of use cases.

brap wrote at 2021-12-03 15:30:26:

That’s interesting. How does it deal with authentication and authorization? It it handles those well, it could be a real “serverless” solution for most CRUD apps.

clessg wrote at 2021-12-03 15:58:14:

As the sibling comments point out, it uses Postgres Row Level Security (RLS). For an approachable introduction to how Supabase's auth works:

https://supabase.com/docs/guides/auth

> 1. A user signs up. Supabase creates a new user in the auth.users table.

> 2. Supabase returns a new JWT, which contains the user's UUID.

> 3. Every request to your database also sends the JWT.

> 4. Postgres inspects the JWT to determine the user making the request.

> 5. The user's UID can be used in policies to restrict access to rows.

> Supabase provides a special function in Postgres, auth.uid(), which extracts the user's UID from the JWT. This is especially useful when creating policies.

For instance, say you have a `todos` table and want to make it so users can only read their own todos - you could have an RLS policy `todos.user_id = auth.uid()`. Afterward, `SELECT * FROM todos` will only return the authenticated user's todos. (Equivalent to manually issuing `SELECT * FROM todos WHERE todos.user_id = auth.uid()`.)

There's also `auth.role()` so you can easily restrict access by role: `auth.role() = "admin"`

majkinetor wrote at 2021-12-03 18:04:53:

Is there integrated way to achieve claim based approach where access to a given resource is governed by both uid (lets say org uid, and all users belong to particular org) and set of claims (lets say there are at least CRUD claims for all tables and I can set for any user any combination of them, or create a role containing them then assign that role to a user) ?

oliverrice wrote at 2021-12-03 15:43:09:

The extension is compatible with your existing row level security policies. If you connect to the database as a role like `authenticated` then those policies will be applied.

The columns and tables that are visible are also controlled by the role.

One cool thing about that approach is you could run e.g. an admin API and a user facing API all from that same endpoint by executing as different postgres roles!

dljsjr wrote at 2021-12-03 15:33:13:

They mention in the article that auth is handled by reusing the existing row level security you'd already use to secure the DB (i.e. `CREATE POLICY`).

nikivi wrote at 2021-12-03 15:14:45:

Curious how it compares to

https://www.graphile.org/postgraphile/

underbluewaters wrote at 2021-12-03 17:53:38:

I had to make sure someone had plugged postgraphile here. It's a great system. What impresses me most about it is how it grows with the complexity of your application. There are so many thoughtful points of extensibility built in, I always have confidence that I can go into the docs and find a way to do what I need to.

That reminds me, I need to figure out how to get my employer to sponsor the project...

gardnr wrote at 2021-12-03 19:07:31:

You could hire one of the maintainers to consult on your project.

oliverrice wrote at 2021-12-03 15:52:40:

its very similar in goals to postgraphile

as samwillis mentioned, the memory footprint is tiny, which is a big perk for supabase's platform (or if you're self hosting) but its also fully language agnostic which opens up lots of options for extensibility:

For simple use-cases you can expose the graphql functionality over http using a PostgREST as described here

https://supabase.github.io/pg_graphql/quickstart/

but, if you want more configuration like adding in middleware or wiring it up to an existing backend application, you can do that from any programming language that can connect to postgres, rather than only javascript

samwillis wrote at 2021-12-03 15:20:10:

The article suggests they built it so that by running it within the DB it would have less compute overhead and they could match GraphQL requests to SQL transactions 1:1.

ancharm wrote at 2021-12-03 15:29:20:

How long before someone just goes into the Postgres source code and creates true native support for GraphQL

whazor wrote at 2021-12-03 15:38:13:

An extension is the best next thing to native support and in my opinion a better place for such functionality.

halfmatthalfcat wrote at 2021-12-03 16:12:52:

Wow, with Relay built-in. Looks great!

agilob wrote at 2021-12-03 17:07:16:

PostgreSQL has the great shiny features, but still doesn't support zero downtime upgrades?

supa-osum wrote at 2021-12-03 16:23:20:

How does it compare to Apache Age?

oliverrice wrote at 2021-12-03 16:44:20:

Apache Age is based around the openCypher graph query language rather than GraphQL

pictur wrote at 2021-12-03 17:28:32:

supabase look like so helpful tool. which has anyone use in production? does it have a hasura like event trigger mechanism?

oliverrice wrote at 2021-12-03 17:42:17:

Yes, event triggers are built right into Postgres so they're supported out-of-the-box

Event triggers are the mechanism pg_graphql uses to keep the GraphQL schema up-to-date with the SQL schema