💾 Archived View for jb55.com › log › 2021-07-12-protoverse-metaverse-protocol.gmi captured on 2024-12-17 at 09:18:04. Gemini links have been rewritten to link to archived content
Ever since I read the book Ready Player One I've been fascinated by the idea of the metaverse. Imagine that instead of joining an IRC chatroom, you joined a virtual room with objects: chairs, tables, robots, other people. A virtual shared space where you could meet up with your friends and hang out. This is typically envisioned as some sort of virtual reality where you meet with people face to face, but the metaverse could be much more general than that.
I started wondering: what could a metaverse protocol look like? Is such an ambitious project even possible? How do we avoid the mess that is the web? Could we keep it simple and extensible? I believe I have a pretty good plan on how to achieve this: I'm calling it the protoverse.
Before we start thinking about the ingredients needed to build a metaverse protocol, let's first look at some high-level goals:
The protoverse is the metaverse protocol I'm working on that is trying to achieve all these goals. At a high level, protoverse is a network of abstract virtual spaces. It achieves this with a few key ideas:
You define a "space" with a high-level description like so:
(room (shape rectangle)
      (material "gold")
      (name "Satoshi's Den")
      (width 10) (depth 10) (height 100)
      (group
        (table (id welcome-desk)
               (name "welcome desk")
               (material "marble")
               (width 1) (depth 2) (height 1)
               (light (name "desk")))
        (chair (id welcome-desk-chair)
               (name "fancy"))
        (light (location ceiling)
               (name "ceiling")
               (state off)
               (shape circle))))
When you first connect to a server, you pull this high-level description to quickly get an idea of where you are and what types of entities are in the environment. The server could dynamically generate this description or it could be static. This "space" is analogous to an HTML document on the web. So far so good.
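To make the "pull the description, then see what is in it" step concrete, here is a rough Python sketch of a reader for the space format. It is not the protoverse parser: the grammar is guessed from the single example above (parentheses, double-quoted strings, bare atoms, integers), and the `parse` and `attr` helper names are made up for illustration.

```python
import re

# Tokens: "(" or ")", a double-quoted string, or a bare atom such as welcome-desk.
# This grammar is an assumption inferred from the one example space.
TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')


def parse(text):
    """Parse a single s-expression into nested Python lists."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == ")":
            raise ValueError("unexpected ')'")
        if tok == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise ValueError("missing ')'")
            pos += 1  # consume the closing parenthesis
            return items
        if tok.startswith('"'):
            return tok[1:-1]  # strip the surrounding quotes
        return int(tok) if tok.isdigit() else tok

    tree = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the expression")
    return tree


def attr(node, key):
    """Return the value of a simple (key value) pair inside node, or None."""
    for child in node[1:]:
        if (isinstance(child, list) and len(child) == 2
                and child[0] == key and not isinstance(child[1], list)):
            return child[1]
    return None


SPACE = """(room (shape rectangle) (material "gold") (name "Satoshi's Den")
  (width 10) (depth 10) (height 100)
  (group
    (table (id welcome-desk) (name "welcome desk") (material "marble")
           (width 1) (depth 2) (height 1) (light (name "desk")))
    (chair (id welcome-desk-chair) (name "fancy"))
    (light (location ceiling) (name "ceiling") (state off) (shape circle))))"""

room = parse(SPACE)
group = next(c for c in room[1:] if isinstance(c, list) and c[0] == "group")
print(attr(room, "name"), [entity[0] for entity in group[1:]])
```

The first element of every list is treated as the entity or attribute kind, which is enough for a client to list what is in the room before fetching anything heavier.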
At this point, if there is a more detailed description of the room, the client could start pulling additional texture and model information via protocol messages.
Due to this level-of-detail feature, simple text clients can still do something useful here. For instance, if you just want to get an idea of what the room is about without rendering anything, you can use a text-description client:
$ ./protoverse serve index.space
serving protoverse on port 1988...

$ ./protoverse client proto://localhost
Output:
There is a clean and shiny rectangular room made of solid gold called Satoshi's Den. It contains four objects: a welcome desk table, fancy chair, throne chair and ceiling light.
As you can see, in this case the client simply parses the high-level description and outputs a description of the room. More advanced clients could render a 2D representation of the room, and even more advanced clients could render full VR-capable experiences.
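A text-description client along these lines could be sketched as follows. This is only an illustration of the idea, not the real client: the room is written out by hand as nested Python lists (standing in for whatever the client's parser produces), the `attr`, `join_words` and `describe` helpers are hypothetical, and the wording is much plainer than the actual output shown above.

```python
# The example room as nested lists; the first item of each list is its kind.
# This hand-written tree is an assumed stand-in for a parsed space.
ROOM = ["room",
        ["shape", "rectangle"], ["material", "gold"], ["name", "Satoshi's Den"],
        ["width", 10], ["depth", 10], ["height", 100],
        ["group",
         ["table", ["id", "welcome-desk"], ["name", "welcome desk"],
          ["material", "marble"], ["width", 1], ["depth", 2], ["height", 1],
          ["light", ["name", "desk"]]],
         ["chair", ["id", "welcome-desk-chair"], ["name", "fancy"]],
         ["light", ["location", "ceiling"], ["name", "ceiling"],
          ["state", "off"], ["shape", "circle"]]]]


def attr(node, key):
    """Return the value of a simple [key, value] pair inside node, or None."""
    for child in node[1:]:
        if (isinstance(child, list) and len(child) == 2
                and child[0] == key and not isinstance(child[1], list)):
            return child[1]
    return None


def join_words(items):
    """Join items as 'a, b and c'."""
    if len(items) <= 1:
        return "".join(items)
    return ", ".join(items[:-1]) + " and " + items[-1]


def describe(room):
    """Build a plain-English summary of a room and its top-level objects."""
    parts = [attr(room, "shape"), room[0]]
    sentence = "There is a " + " ".join(p for p in parts if p)
    material = attr(room, "material")
    if material:
        sentence += f" made of {material}"
    name = attr(room, "name")
    if name:
        sentence += f" called {name}"
    sentence += "."

    objects = []
    for child in room[1:]:
        if isinstance(child, list) and child[0] == "group":
            for entity in child[1:]:
                label = attr(entity, "name")
                objects.append(f"{label} {entity[0]}" if label else entity[0])
    if objects:
        sentence += f" It contains {len(objects)} objects: {join_words(objects)}."
    return sentence


print(describe(ROOM))
```

Because the summary only needs names and kinds, a client like this never has to fetch textures or models at all.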
To support a wide variety of experiences, we need some concept of computability within the metaverse. This would be the analog of JavaScript on the web. The protoverse uses WebAssembly (WASM) to enable computation for clients interacting with the metaverse. WASM was originally devised as a generic virtual machine for the web, but it is general enough to use for cases beyond that.
With WASM you can use any programming language that compiles to WebAssembly to code the metaverse. Protoverse comes with an embedded WASM interpreter that can execute WASM code. You will be able to augment clients to render your space in greater detail, show HUD elements, create multiplayer games, etc.
You can already do a lot without client computation. For instance, your space could be served dynamically, which you could periodically fetch to get an updated description of the room. This would be equivalent to "refresh the page" on the web, except due to the level-of-detail nature of the protoverse, you wouldn't need to refetch the entire room. The client could cache models and other details that have been previously fetched.
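The caching idea can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `DetailCache` class, the idea of keying details by entity id, the `fake_fetch` stand-in for a network request, and the `new-lamp` entity are all made up, and the sketch ignores cache invalidation entirely.

```python
class DetailCache:
    """Remember detail data (models, textures) that was already fetched."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable that actually requests the data
        self._store = {}

    def get(self, entity_id):
        # Only hit the network the first time an entity's details are needed.
        if entity_id not in self._store:
            self._store[entity_id] = self._fetch(entity_id)
        return self._store[entity_id]


requests = []


def fake_fetch(entity_id):
    """Stand-in for a protocol request; records which ids were asked for."""
    requests.append(entity_id)
    return f"<model data for {entity_id}>"


cache = DetailCache(fake_fetch)

# Ids seen on the first visit, then again after refetching the description.
first_visit = ["welcome-desk", "welcome-desk-chair"]
after_refresh = ["welcome-desk", "welcome-desk-chair", "new-lamp"]

for entity_id in first_visit + after_refresh:
    cache.get(entity_id)

print(requests)
```

After the refresh, only the entity that was not seen before triggers a new fetch; the desk and chair are served from the cache.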
The default, high-level description of the room could include position information, so you will be able to see things that have moved when you "refetch" the state of the room. State updates like this could be a bit jarring, so most likely you wouldn't want to reload the room for position updates. Instead, these can be served via "object state/position update" network messages.
What you do with these network messages could be handled automatically for simple cases by the client, but otherwise could be handled by WASM code served by the protoverse server.
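One way to picture this split is a dispatcher that handles position updates itself and passes everything else to handlers supplied by the server. The Python sketch below is purely illustrative: the message shapes (`type`, `id`, `pos`, `state`), the `Client` class and the `light-state` message are invented here, and a plain Python function stands in for server-provided WASM code.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0


class Client:
    def __init__(self):
        self.entities = {}         # entity id -> Entity
        self.lights = {}           # entity id -> "on" / "off"
        self.custom_handlers = {}  # message type -> handler (WASM stand-in)

    def handle(self, msg):
        """Apply one update message; return which path handled it."""
        kind = msg["type"]
        if kind == "position":
            # Simple case: the client updates the position by itself.
            entity = self.entities.setdefault(msg["id"], Entity())
            entity.x, entity.y, entity.z = msg["pos"]
            return "builtin"
        handler = self.custom_handlers.get(kind)
        if handler is not None:
            # Anything else goes to code supplied by the server.
            handler(self, msg)
            return "custom"
        return "ignored"


def set_light_state(client, msg):
    """Hypothetical server-provided handler for a made-up message type."""
    client.lights[msg["id"]] = msg["state"]


client = Client()
client.custom_handlers["light-state"] = set_light_state

print(client.handle({"type": "position", "id": "welcome-desk-chair",
                     "pos": (1.0, 0.0, 2.5)}))
print(client.handle({"type": "light-state", "id": "ceiling", "state": "on"}))
print(client.handle({"type": "something-else"}))
```

Unknown message types are ignored rather than treated as errors, so a simple client can still connect to a server that sends messages it doesn't understand.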
Thanks to WASM, we can offload much of the rendering to WASM code that chooses how to render its environment. This does affect accessibility, so we need to be careful, but it does have the benefit of avoiding a huge pain point of the web: the massive growth of specifications required to implement web functionality. If we have a very thin client with a small set of rendering APIs (Vulkan? Curses?), then protoverse servers can provide any experience they desire. They could serve full multiplayer video games!
I still have more to think about with respect to server-to-server communication, but there is some interesting potential here. For now, the protocol only cares about client-to-server communication, such as updating entity positions. I think it makes sense for there to be a variety of server-to-server protocols for something like the metaverse; I just haven't thought too deeply about what those could be yet.
The design space for metaverse protocols is huge. I would love to brainstorm new ideas about how I could improve the protoverse. If you have any ideas feel free to send your thoughts to the protoverse mailing list at:
https://lists.sr.ht/~jb55/protoverse-discuss
Also, patches welcome! I'm currently working on the protoverse WASM interpreter. If you want to help hack on the project feel free to email patches to `jb55@jb55.com`
http://git.jb55.com/protoverse
That's all for now. I plan on posting more protoverse updates here on my gemlog in the future!