I really like this quote from the manual:
<<There are a class of "ideal attractors" in engineering, concepts like "everything is an object," "homoiconicity," "purely functional," "pure capability system," etc. Engineers fall into orbit around these ideas quite easily. Systems that follow these principles often get useful properties out of the deal.
However, going too far in any of these directions is also a great way to find a deep reservoir of unsolved problems, which is part of why these are popular directions in academia.
In the interest of shipping, we are consciously steering around unsolved problems, even when it means we lose some attractive features.>>
I wonder if the attributes of Hubris and similar systems -- real-time, lack of dynamism -- will become "ideal attractors" for developers not working in problem domains where these things are absolutely required, especially as the backlash against the complexity at higher layers of the software stack continues to grow. In other words, I wonder if a sizable number of developers will convince themselves that they need an embedded RTOS like Hubris for a project where Linux running on an off-the-shelf SBC (or even PC) would work well enough.
Right now we are going in the opposite direction. Web developers on HN refuse to learn proper embedded programming, and instead stack abstraction on top of abstraction with MicroPython, using Raspberry Pis for every job under the sun.
It is a shame that Arduino/AVR never bothered implementing support for the full C++ library. If the full power of C++ were available to the end user, then perhaps alternatives like MicroPython would be less attractive.
On the contrary, because the experience isn't the same: with MicroPython you drop a .py file over USB and you are done.
There is a REPL experience, and Python comes with batteries, even if MicroPython's are tinier.
Python is the new BASIC in such hardware.
Also, the Arduino folks don't seem to have that high an opinion of C++:
https://www.youtube.com/watch?v=KQYl6th8AKE
I think MicroPython is a boon to beginners in the space. They have the option to go deeper when the projects run into limitations.
I would tend to disagree because MicroPython is so abstracted that it resembles writing regular Python on a server more than it does anything embedded.
Just as an example, WiFi setup in MicroPython resembles configuring a server far more than programming an ESP32 with esp-idf. All you do is give it the connection details, and MicroPython seems to handle details like trying to reconnect in the background. It's not far off from what systemd-networkd or similar provides. esp-idf forces you to handle that yourself, and to think about what you want to happen in that situation.
MicroPython also doesn't support threads afaict, so you don't even have to handle scheduling threads.
I like MicroPython as a way to run Raspberry Pi like stuff on the cheap, and it's a great learning tool in that sense, but you're still too far from the hardware to really be learning about embedded systems.
Sure, that's happening, but some of us are also tempted to use Rust for applications where an easier to learn, more popular language would be good enough.
This doesn’t seem like an apt comparison. Using the wrong tool for the job is expensive forever; learning a tough language pays dividends forever.
If the predictions are true that we will see more and more specialized hardware due to the end of Moore's Law, then we will see more OS services just looking like services running on separate processors. Special-purpose hardware doesn't need a batteries-included operating system. We could argue whether a modular OS still counts as general purpose, but I'll let you guys do that.
With IPC, latency becomes the elephant in the room. An RTOS can't remove that, but it can help.
My Android phone is full of processes talking among each other over Android IPC, including the drivers themselves.
It is already more common than monolith defenders think.
Containerisation already goes a way towards this, each executable is bundled with what it needs and nothing else. And the next step is one app per hardware, where you have maybe a minimal stripped OS that just launches your app, effectively a container in hardware. I think Google does this a fair amount.
The jump from this to RTOS is large, though. The abstractions are different. The limitations are different. You probably need to rewrite anything you need. And what do you gain? Mostly only predictable latency, and the ability to run on very limited (but cheap) hardware. Which you need why?
Also, massively reduced energy consumption if you choose the right hardware (e.g. ultra low power microcontrollers).
You can get a lot of the way towards that without needing a RTOS though.
> Which you need why?
If I'm selling a million Tamagotchis I'd rather use the 5 cent part over the 50 cent one and pocket the extra $.45M. A $5 part is a nonstarter.
> Containerisation already goes a way towards this, each executable is bundled with what it needs and nothing else.
Maybe in theory, but in practice most people still ship an entire OS in their containers (most of the time it's Alpine and it's not too big a deal, but too many times it's an entire Debian!)
I think you are right, and I would add Turing incompleteness to that list. If your problem isn't Turing complete, then a Turing-complete language is probably not the best tool for the job. Incomplete languages actually give you _more_ power than a Turing-complete language in that case. Completeness is a feature of a language, and like all features it has tradeoffs. The ability to express more kinds of problems comes at the cost of not being able to leverage the contours of a particular problem (e.g. monotonic data, no cycles) to improve things like speed, debuggability, and parallelization. Exploiting those contours can enable things that seem completely out of reach today: rewinding the state of your program to debug production errors, modifying programs as they're running, automatically parallelizing computations across an arbitrary number of cores, or querying the provenance of any variable in your program.
Datalog's incompleteness, for example, allows it to resolve queries faster than a complete language like Prolog, thanks to the simplifying inferences it can make about the code.
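To make the termination point concrete, here is a toy sketch (in Rust, purely illustrative, not any particular engine): a single Datalog-style rule evaluated bottom-up to a fixed point. Monotonic rules over a finite set of facts are guaranteed to halt, which a general Prolog search is not.
```rust
use std::collections::HashSet;

// rule: reach(a, d) :- reach(a, b), reach(b, d).
// Evaluated bottom-up until no new facts can be derived.
fn transitive_closure(edges: &[(u32, u32)]) -> HashSet<(u32, u32)> {
    let mut reach: HashSet<(u32, u32)> = edges.iter().copied().collect();
    loop {
        let mut new = Vec::new();
        for &(a, b) in &reach {
            for &(c, d) in &reach {
                if b == c && !reach.contains(&(a, d)) {
                    new.push((a, d));
                }
            }
        }
        if new.is_empty() {
            return reach; // fixed point: nothing new can be derived
        }
        reach.extend(new);
    }
}

fn main() {
    let paths = transitive_closure(&[(1, 2), (2, 3), (3, 4)]);
    assert!(paths.contains(&(1, 4))); // a derived fact, not in the input
}
```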
I doubt it. Real time adds its own flavor, and a small OS doesn't come with everything you might want. It's useful when it's what you want, not when you don't want something else.
That impulse will occur as they try to do more and more complex things with systemd.
> In the interest of shipping, we are consciously steering around unsolved problems, even when it means we lose some attractive features
Huh, this is more like pragmatism than "Hubris".
We'd like to think so, but it's really hard to convince people that writing an entire new OS in Rust is the pragmatic choice. The name is a good reminder of the balance there, in my humble opinion.
To which I would add that naming a system "Pragmatic" may well be an act of unspeakable hubris, one that the gods would surely punish with a plague of architectural astronauts...
Haha, pulling a bit of reverse psychology there?
Memento Mori
"Hubris" is supposed to be a name, not a description.
We are getting an increasing number of interesting Rust operating systems for different uses.
- Hubris for deep embedded
- Redox OS for Desktop/Server
- Tock for embedded
- Xous for trusted devices (https://xobs.io/announcing-xous-the-betrusted-operating-syst...)
I assume there are more.
We also built a Rust framework called FerrOS (https://github.com/auxoncorp/ferros) atop the formally-verified seL4 microkernel.
It looks like it has a similar set of usage idioms to Hubris, in that it tries to set up as much as possible ahead of time, assembling what is in effect an application-specific operating system where everything your use case needs is put together at build time as a bunch of communicating tasks running on seL4.
We recently added a concise little persistence interface that pulls in TicKV (https://docs.tockos.org/tickv/index.html) from the Tock project you referenced above, and some provisions are being added for more dynamic task handling based on asks from an automotive OEM.
Also, sincerest h/t to the Tock folks.
Looks like a really cool idea to build on the formally verified C code and write everything else in Rust on top.
I definitely think all the embedded Rust projects and others will end up sharing lots of code.
I would like to spend some time working, Datomic-style, on a more powerful immutable DB that could run on different KV interfaces. Lots of configuration should be more immutable.
>Redox OS
What is interesting about this OS except that it is made with Rust? I mean, some interesting architecture, new exciting features that are not in the Windows/Unix world?
My question is whether Redox OS is more similar to the other hobby OSes we see here, or something more well thought out like Plan9, Fuchsia, or Singularity? Is there a link where someone that is not a Rust fanboy (but maybe an OS fanboy) reviewed/described it in more detail?
Redox is Plan9-inspired except it goes a little further with some of the core concepts.
Instead of everything being a file, everything is a scheme (basically a URL).
I thought Unix was “everything is a file” and Plan9 was “everything is a filesystem”?
Maybe so. I guess the point is, the concept that different types of resources need different protocols is baked in, rather than picking one type of abstraction and applying it to every type of resource.
https://doc.redox-os.org/book/ch04-10-everything-is-a-url.ht...
There are some filesystem-like usage patterns built on top of that which are universal, but they're more limited.
https://doc.redox-os.org/book/ch04-06-schemes.html
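A sketch of the flavor of it (the scheme paths here are illustrative; see the linked book chapters for the real scheme names): the ordinary file API is the entry point, but the scheme prefix routes the call to a completely different resource provider.
```rust
use std::fs::File;
use std::io::Read;

fn main() -> std::io::Result<()> {
    // Routed to the filesystem scheme...
    let mut f = File::open("file:/etc/hostname")?;
    let mut name = String::new();
    f.read_to_string(&mut name)?;

    // ...while another scheme behind the very same calls could be a
    // network connection, a display, a pseudoterminal, etc.
    println!("{}", name);
    Ok(())
}
```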
Microsoft is a bit schizophrenic in Rust's adoption.
On one side of the fence you have Azure IoT Edge, done in a mix of .NET and Rust.
https://msrc-blog.microsoft.com/2019/09/30/building-the-azur...
On the other side you have the Azure Sphere OS with a marketing story about secure IoT, yet the SDK is C only,
https://docs.microsoft.com/en-us/azure-sphere/app-developmen...
To the point they did a blog post trying to defend their use of C,
https://techcommunity.microsoft.com/t5/internet-of-things-bl...
_> Microsoft is a bit schizophrenic in Rust's adoption._
It's only "schizophrenic" to the degree that you expect an organization of thousands of individuals to behave like a single human mind.
I expect the business unit that sells "secure" IoT devices, to follow the guidelines from Microsoft Security Response Center, but that is expecting too much I guess.
> To the point they did a blog post trying to defend their use of C,
I do a lot of Rust work, but C still occupies a prominent place in my toolkit.
In fact, this is fairly standard among all of the professionals I know. The idea that companies need to abandon C and only do Rust as soon as possible is more of an idealistic idea on internet communities.
Sure, but they should be honest about it instead of marketing Azure Sphere OS as an unbreakable castle when it happens to be built on quicksand.
Additionally, they could also allow for all GCC languages like Ada and C++, given it is the SDK toolchain, instead of being C only, if the security marketing from Azure Sphere is to be taken seriously.
> Microsoft is a bit schizophrenic in Rust's adoption.
This is pretty famously not limited to Microsoft's views on Rust. It's probably true at any sufficiently large company, though Microsoft is the tech company that gets discussed the most. At a previous employer, there was a division that was gung ho on Go and React, while another was on Java and Ember.
My point was more related to the views on security than Rust.
Microsoft Security advises 1. managed languages, 2. Rust, 3. C++ with Core Guidelines.
Then comes out this group selling "secure IoT" devices with a C only SDK.
Interesting that MS would adopt Go or Java. I would have thought that everything in the GC'd lang space would have to be C#. Is Java common there?
To clarify, my previous company (the one which had Go and Java) is not Microsoft
Ah, I misread the context
Rust is the language of choice for dApps on Solana as well (which I thought was an interesting choice.)
That is completely true. Rust is where it is going for cryptocurrencies that need to have very secure smart-contracts and scale up to millions of users with tens of thousands of transactions per second.
Even if it is completely unrelated, it is at least production-ready and, as Rust projects go, will see more use than these projects ever will.
Rust in Cryptocurrency is mostly a marketing play (and I say this as someone who does a lot of Rust).
Are there even any cryptocurrencies that allowed less-safe languages like C in the first place?
In my opinion (as a Rust dev), Rust is weirdly over-complicated for what they're trying to do. Common typed scripting languages are basically more than sufficient (and safe) for these applications.
It's only a matter of time before a crypto project overplays the safety of Rust and then has a huge heist due to a logic bug, which will further contribute to jokes about Rust programmers. Most of the Rust devs I know are wary of Rust crypto projects.
We need to distinguish nodes written in Rust vs smart contracts.
Given the context assuming smart contracts you may have a point. Often consensus will be much slower and a limiting factor in execution. Solana may be a bit different here in their efforts to parallelise independent transactions.
High level provable languages always seemed like a good idea to me for smart contracts. As you say Rust doesn't necessarily seem like the sweetspot for this.
The Ethereum EVM assembly is wildly unsafe but regularly used for the sake of gas and other things that would otherwise be impossible (most hilariously, string manipulation). Solidity is unsafe with respect to things like overflow. It doesn't have memory unsafety in the traditional sense, partly because it is allocate-only and you don't have to deal with bounds checking yourself.
If that includes C++ then basically every cryptocurrency that’s not written in Rust (Bitcoin, Ethereum, Monero…)
Rust won't protect smart contracts from logic bugs.
It depends on how much logic and/or arithmetic you can get away with encoding into the type system. We abuse the heck out of it to restrict things like register & field access, state-machine transitions, and also track resource allocation/consumption. That said, it's incredibly painful to develop anything that way, and it also doesn't ultimately prevent a different problem of the "model" you've written down in the types being wrong. So, it's not a panacea, and it's incredibly difficult, but it can winnow down the surface area of potential problems and bugs... or at least move them to compile-time.
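For a minimal taste of the typestate flavor of this (illustrative only, not the poster's actual code): the state machine lives in the type system, so calling a method in the wrong state is a compile error rather than a runtime bug.
```rust
use std::marker::PhantomData;

struct Stopped;
struct Running;

struct Device<State> {
    _state: PhantomData<State>,
}

impl Device<Stopped> {
    fn new() -> Self {
        Device { _state: PhantomData }
    }
    // Consumes the stopped handle and returns a running one; the old
    // handle is gone, so there is no "forgot to start" code path.
    fn start(self) -> Device<Running> {
        Device { _state: PhantomData }
    }
}

impl Device<Running> {
    fn send(&self, _byte: u8) { /* touch hardware here */ }
}

fn main() {
    let dev = Device::new();
    // dev.send(0x42);     // does not compile: no `send` on Device<Stopped>
    let dev = dev.start();
    dev.send(0x42);        // fine: the type proves we started the device
}
```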
I don’t think anyone said it would.
> Rust is where it is going for cryptocurrencies that need to have very secure smart-contracts
There is an implication here that Rust will help make smart contracts "secure", but AFAIK the vulnerabilities in smart contracts have been in their logic, not in their memory/type safety or whathaveyou.
I have an embedded real-time control project that is currently written in Rust, but runs with RTIC, a framework which is conceptually similar (no dynamic allocation of tasks or resources) but also has some differences. RTIC is more of a framework for locks and critical sections in an interrupt-based program than a full-fledged RTOS. Looking through the docs, here are the main differences (for my purposes) I see:
1. In Hubris, all interrupt handlers dispatch to a software task. In RTIC, you can dispatch to a software task, but you can also run the code directly in the interrupt handler. RTIC is reliant on Cortex-M's NVIC for preemption, whereas Hubris can preempt in software (assuming it is implemented). This does increase the minimum effective interrupt latency in Hubris, and if not very carefully implemented, the jitter also (see the sketch after this comment).
2. Hubris compiles each task separately and then pastes the binaries together, presumably with a fancy linker script. RTIC can have everything in one source file and builds everything into one LTO'd blob. I see the Hubris method as mostly a downside (unless you want to integrate binary blobs, for example), but it might have been needed for:
3. Hubris supports Cortex-M memory protection regions. This is pretty neat and something that is mostly out of scope for RTIC (being built around primitives that allow shared memory, trying to map into the very limited number of MPU regions would be difficult at best). Of course, it's Rust, so in theory you wouldn't need the MPU protections, but if you have to run any sort of untrusted code this is definitely the winner.
Hubris does support shared memory via leases, but I'm not sure how it manages to map them into the very limited 8 Cortex-M MPU regions. I'm quite interested to look at the implementation when the source code is released.
Edit: I forgot to mention the biggest difference, which is that because tasks have separate stacks in Hubris, you can do blocking waits. RTIC may support async in the future but for now you must manually construct state machines.
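To illustrate difference 1 in shape only (a conceptual sketch; this is not the real RTIC or Hubris API):
```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Style A (RTIC-like): the application work runs directly in the
// interrupt handler, so the NVIC alone bounds the latency.
fn irq_handler_style_a() {
    handle_byte(read_uart());
}

// Style B (Hubris-like): the handler only posts a notification; a task
// blocked on that notification is then scheduled by the kernel to do
// the work, adding a context switch to the minimum latency.
static UART_NOTIFICATION: AtomicBool = AtomicBool::new(false);

fn irq_handler_style_b() {
    UART_NOTIFICATION.store(true, Ordering::Release);
    // ...the kernel would mark the waiting task runnable here...
}

fn uart_task_style_b() {
    loop {
        // stand-in for a blocking "wait for notification" syscall
        while !UART_NOTIFICATION.swap(false, Ordering::Acquire) {}
        handle_byte(read_uart());
    }
}

fn read_uart() -> u8 { 0 /* stand-in for a UART data register read */ }
fn handle_byte(_b: u8) { /* application logic */ }

fn main() { irq_handler_style_a(); } // shapes only; nothing to demo live
```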
> Hubris does support shared memory via leases, but I'm not sure how it manages to map them into the very limited 8 Cortex-M MPU regions.
What I did in a similar kernel was dynamically map them from a larger table on faults, sort of like you would with a soft fill TLB. When you turn off the MPU in supervisor mode you get a sane 'map everything' mapping, leaving all 8 entries to user code.
The way LDM/STM restart after faults is amenable to this model on the M series cores.
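In outline, that looks something like this (all names are hypothetical; no real kernel or register interface is shown):
```rust
const HW_SLOTS: usize = 8; // Cortex-M MPUs commonly expose 8 regions

#[derive(Clone, Copy)]
struct Region {
    base: u32,
    len: u32,
}

struct TaskMpuState {
    all_regions: Vec<Region>,       // full per-task table, any size
    hw: [Option<Region>; HW_SLOTS], // what's loaded in the MPU right now
    victim: usize,                  // trivial round-robin eviction
}

impl TaskMpuState {
    /// Called from the memory-management fault handler. Returns true if
    /// the access was legal (slot filled; the faulting instruction can
    /// be restarted) and false if it was a genuine protection violation.
    fn on_fault(&mut self, addr: u32) -> bool {
        match self
            .all_regions
            .iter()
            .find(|r| addr >= r.base && addr - r.base < r.len)
        {
            Some(&r) => {
                self.hw[self.victim] = Some(r); // evict + fill one slot
                self.victim = (self.victim + 1) % HW_SLOTS;
                true // kernel now restarts the faulting LDM/STM
            }
            None => false, // not in the task's table: fault the task
        }
    }
}

fn main() {
    let mut t = TaskMpuState {
        all_regions: vec![Region { base: 0x2000_0000, len: 0x1000 }],
        hw: [None; HW_SLOTS],
        victim: 0,
    };
    assert!(t.on_fault(0x2000_0010));  // legal: soft-filled into a slot
    assert!(!t.on_fault(0x4000_0000)); // illegal: real fault
}
```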
Neat, I didn't know that the MPU fault handler was complete enough to allow for restarts.
Now that the source is available, I took a look at what hubris does - it is not actually anything fancy, just a static list of up to 8 MPU regions per task [1].
It seems that leases aren't actually shared memory, but rather just grant permission for a memcpy-like syscall [2]. This is slightly better than plain message passing as the recipient gets to decide what memory it wants to access, but is still a memcpy.
[1]
https://github.com/oxidecomputer/hubris/blob/8833cc1dcfdbf10...
[2]
https://hubris.oxide.computer/reference/#_borrow_read_4
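A tiny model of why a lease is "slightly better than plain message passing but still a memcpy" (names hypothetical, not the real syscall interface; see the borrow_read link above for the actual API):
```rust
struct Lease<'a> {
    mem: &'a [u8], // the byte range the client chose to lend
}

/// Kernel-mediated read: the server names (offset, length) and the kernel
/// copies at most that much out of the leased range. The server decides
/// what to access, but never holds a pointer into the client's memory.
fn borrow_read(lease: &Lease, offset: usize, dest: &mut [u8]) -> Result<usize, ()> {
    if offset > lease.mem.len() {
        return Err(()); // out-of-range borrow: refused by the kernel
    }
    let n = dest.len().min(lease.mem.len() - offset);
    dest[..n].copy_from_slice(&lease.mem[offset..offset + n]);
    Ok(n) // still a copy -- no shared mapping was ever created
}

fn main() {
    let client_buf = [1u8, 2, 3, 4, 5];
    let lease = Lease { mem: &client_buf };
    let mut server_buf = [0u8; 3];
    assert_eq!(borrow_read(&lease, 1, &mut server_buf), Ok(3));
    assert_eq!(server_buf, [2, 3, 4]);
}
```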
I don't think they link the binaries. It's more like, put them each on executable flash in separate places and the kernel just calls them.
The intent here seems to be that each binary has no need (and no ability) to get all up in another binary's business. Nothing shared, except access to the RTOS.
It doesn't need to link any symbols, but I believe it does need to do relocations if the code isn't PIC, and to relocate the task's statically allocated RAM.
Your link does not appear to work, maybe this one [1] is intended instead? I can't resolve that name, at least.
[1]
Maybe they want to make it possible for closed source and open source tasks to be mixed.
In general we plan on making the system as open sourced as we possibly can, so there's no specific thought currently being put into closed source tasks. While it could work, the build system doesn't actually support that at all right now.
I assumed you guys wouldn't do this, but thought this could lead to a larger adoption in this space.
Alternatively I thought maybe you needed to have some closed source component from a vendor and could only include it like that.
Yeah, I hear you on the adoption thing for sure, though to be honest, right now we are laser focused on shipping Oxide's product, and so if nobody else but us uses Hubris, that is 100% okay. As our CONTRIBUTING.md mentions, we aren't yet at the stage where we're trying to grow this as an independent thing.
It's true that vendors are likely to be an area where we have to deal with some things being closed source, though we're not simply accepting that as a given. This is one reason we're writing so much of our own software, and also, poking at the bits that must remain closed:
https://oxide.computer/blog/lpc55
I definitely didn't mean it as a feature request - blobs aren't actually that common in embedded (esp32 and some motor driver libraries are the most common exceptions), so I don't think it's important for adoption. In fact, not supporting it enables future ergonomics improvements and code sharing between tasks, so I appreciate that it's not a driving factor in the design.
> esp32 and […] are the most common exceptions
Well, nearly _anything_ having to do with wireless is typically blobs :/
Even Nordic has blobby SoftDevices, though you don't have to use them since Apache NimBLE exists (and rubble in Rust, though that's only usable for advertising for now).
Now that it's open, how open are y'all to MRs? I want to port it to a few archs, but I'm not sure whether to hard fork or try to upstream.
In addition to everything that steveklabnik said, it would be interesting to know which architectures you're eyeing, as some are much more modest (e.g., other Cortex-M parts/boards) than others (e.g., RISC-V). Things get gritty too with respect to I/O, where variances across different MCUs and boards can run the gamut from slight to dramatic. As steveklabnik points out, we are very much focused on our own products -- but also believe that Hubris will find many homes far beyond them, so we're trying to strike a balance...
I was eyeing RISC-V M/U-Mode with PMP. That's the closest thing to the semantics of Cortex-M from a memory protection perspective I can think of that's in common use still, plus I've got ESP-C3 and K210 dev boards laying around. I've been wanting to use them in my home automation, am cursed with the knowledge of what a nice RTOS feels like, and well, those yaks won't shave themselves.
Sounds like I should plan to do that on my own at the moment, but I'll keep it in a github style fork in case y'all's focus moves in that direction.
We spent a ton of time trying to strike a balance in our CONTRIBUTING.md. Basically, we are happy to get PRs, but at the same time, we reserve the right to ignore them completely at the moment. We're trying to focus on shipping our product, and so are unlikely to be able to spend time shepherding PRs that aren't directly related to that. It's not you, it's us. For now. :) So yeah, we love to see the activity, and please give that a try if you'd like, but it's unlikely we'd merge new arches upstream at this point in time.
Word, makes sense. One of the major reasons why I'm interested in hubris in the first place is the strong opinion I have that systems code particularly should have a use case more important than "hey, look what I can do with this systems code". Lack of spoons on y'all's part kind of comes with that territory.
instead of having an operating system that knows how to dynamically create tasks at run-time (itself a hallmark of multiprogrammed, general purpose systems), Cliff had designed Hubris to fully specify the tasks for a particular application at build time, with the build system then combining the kernel with the selected tasks to yield a single (attestable!) image.
I worked briefly at John Deere, and their home-grown operating system (called "JDOS", written in C) also baked every application into the system at compile time. This was my only embedded experience, but I assumed this was somewhat common for embedded operating systems?
It's been a long time since I've worked in that world but in the micro-controller world it is common.
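For readers who haven't seen the pattern: conceptually, "baking applications in at compile time" means the kernel ships with a fixed task table instead of a create-task syscall. A sketch (illustrative Rust, not JDOS's or Hubris's actual format):
```rust
#[derive(Clone, Copy)]
struct TaskDesc {
    name: &'static str,
    entry: fn(),        // entry point, known at link time
    stack_words: usize, // stack budget, fixed up front
    priority: u8,
}

fn supervisor() { /* ... */ }
fn uart_driver() { /* ... */ }
fn net_stack() { /* ... */ }

// The whole task set is a compile-time constant: nothing to allocate or
// tear down at run time, and the image can be attested as a single unit.
static TASKS: [TaskDesc; 3] = [
    TaskDesc { name: "supervisor", entry: supervisor,  stack_words: 256, priority: 0 },
    TaskDesc { name: "uart",       entry: uart_driver, stack_words: 128, priority: 1 },
    TaskDesc { name: "net",        entry: net_stack,   stack_words: 512, priority: 2 },
];

fn main() {
    for t in &TASKS {
        println!("task {} (prio {}, {} stack words)", t.name, t.priority, t.stack_words);
    }
}
```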
The Hubris debugger, Humility…
That is some great naming
I'd like to hear more about Oxide's development process. Was this designed on an index card, and then implemented? Or was it done with piles and piles of diagrams and documents before the first code was committed? Was it treated as a cool, out-there idea that's worth exploring, and then it gradually looked better and better?
It's hard to get software organizations to do ambitious things like this, and it's impressive that this was done on a relatively short timescale. I think the industry could learn a lot from how this was managed.
So, the Hubris repo itself will show a bunch of that history, but in particular, Cliff used the "sketch" nomenclature for the earliest ideas. I think in those first days, our thinking was that we were hitting a bunch of headwind on other approaches -- and had an increasingly concrete idea of what our own alternative might look like. I think whenever doing something new and bold, you want to give yourself some amount of time to at least get to an indicator that the new path is promising. For Hubris, this period was remarkably brief (small number of weeks), which is a tribute to the tremendous focus that Cliff had, but also how long some of the ideas had been germinating. Cliff also made the decision to do the earliest work on the STM32F407 Discovery; we knew that this wouldn't be the ultimate hardware that we would use for anything, but it is (or was!) readily attainable and just about everything about that board is known.
To summarize, it didn't take long to get to the point where Hubris was clearly the direction -- and a working artifact was really essential for that.
Cool, thanks. This matches my experience with ambitious projects that actually succeed:
- Start with a really good engineer getting frustrated with existing stuff - but frustrated in a _targeted_ way, and over a long period of time, not just a few weeks of grumpiness.
- Let them loose for a few weeks to sketch an alternative.
- Pause, and then the hardest part - smell whether this is going in the right direction. This just takes good taste!
- Make a decisive cut - either it's not working, or Let's Do It!
I can think of four or five ambitious projects I've been on or around that have really worked well, and they all seem to have worked in this way. I don't think I realized this clearly until this comment thread - thank you.
It was probably done using RFDs (Requests for Discussion). You can read more on the process here [1].
But someone from Oxide would need to tell you exactly how many RFDs it took to design and implement Hubris.
[1]
https://oxide.computer/blog/rfd-1-requests-for-discussion
There was an RFD for Hubris; it laid out the basic concepts and design goals, as well as non-goals. But after that, it's largely just iterating. When I joined there were four or five people working on Hubris regularly; we have weekly meetings where we sync up, talking about what we're working on, and discuss things.
Can anyone explain to a non-server person what Oxide hopes to accomplish? Is it basically just a new server with its own OS that makes it more secure?
Their target market is essentially private cloud space combined with turnkey rack space - a pretty common on-premise setup where you order not individual servers but complete racks that are supposed to be "plug and play" in your DC (in practice YMMV; I spent a month fighting a combination of manglement and Rackable mess).
You can think of the final product in this case as a pretty big hypervisor cluster that is delivered complete. I'll admit more than once I'd kill for that kind of product, and I suspect that the price/performance ratio might actually be pretty good.
The operating system in this case is used for the internal service processor bits (compare: Minix 3 on Intel ME, whatever proprietary RTOS is on AMD's PSP, etc.) that help keep the whole thing running and shipshape.
>a pretty common on-premise setup
I have been wondering if it will become a thing in some cloud hosting services as well. I guess we need to see their pricing.
Depending on the market, I would be totally unsurprised to see some cloud providers using turnkey racks (though they might usually have nicer deals with places like Quanta), and Oxide could definitely strike some contracts there, though the question is how it would mesh with the existing setup.
Bingo. My guess is this is for control plane microcontrollers
Bryan has also complained in interviews about how many microcontrollers are already on your motherboard and how few of them Linux really controls. It's all proprietary and god knows what's actually running in there (and how many bugs and security vulnerabilities they have).
None of those are a great situation for multitenant scenarios.
This doesn't have to be control plane only. It could also be IO subsystems.
That said, Oxide doesn't put as much control in the hands of the owner as Raptor does, but Raptor doesn't provide a high-integration rack like that :<
It is primarily being used for the root of trust as well as our service processor, aka "totally not a BMC."
Just to be clear, this OS, Hubris, is for the service processor. It's an OS for firmware, not the main OS that will run on the CPU.
However, they will likely ship with something derived from Illumos and the bhyve hypervisor. You can then provision VMs through the API (likely integrated with tools like Terraform or whatever). You will likely not interact directly with Illumos.
It's basically an attempt to help people make running a data center easier.
Pretty much "let's redo datacenter hardware from the ground up for current requirements, cutting off legacy things we don't need anymore"
But the BMC would be the #1 item on my list of "things I don't need any more". How do you come up with a scratch legacy-free universe that still includes BMCs?
Because BMC is a term for a function (which turns out to be very useful and important), not a specific technology (I like the tongue-in-cheek "totally not a BMC" used by some people from Oxide).
You don't need low-level remote management anymore? Or what specifically are you associating with the term "BMC"? (i.e. for me, "BMC" is "turn it off and on and force it to boot from network, remotely")
Correct. The larger my installation becomes, the less I care about the state of individual machines. The ability to promptly remediate a single broken machine becomes irrelevant at scale.
But at scale you now have more and more machines that are going offline. That, to me, tends to push the organization more and more towards having something doing this sort of management. And without a BMC-like system, that means more in-person work, which again, at scale, becomes a real cost burden.
It sounds to me more like at the scale you are at you are no longer the person making sure that individual computers are still running, and so are forgetting that this job needs to be done.
So if a machine behaves odd/goes away and its OS doesn't respond you don't want management plane to be able to redeploy it/run hardware checks/... automatically?
If you put it that way you make it too simple. The question is whether I want a second, smaller computer inside my larger computer that may at any time corrupt memory, monkey with network frames, turn off the power, assert PROCHOT, or do a million other bad things. It's not just a tool with benefits. It has both benefits and risks, and in my experience the risks are not worth those benefits.
But we are talking in the context of a project which specifically aims to do these things without the baggage of other platforms. And these things are BMC functions.
That's exactly what they are doing. They are removing as many things from the BMC as possible. It only contains a few things; it boots and hands over, and allows for some remote control of the low level. That's it.
I haven't looked at Oxide in depth. Hubris seems to be about reducing the Attack Surface of a server by
* decreasing the active codebase by at least three orders of magnitude
* using no C-Code (Rust only?)
* most code is kernel independent and not privileged (e.g. drivers, task management, crash recovery)
Also: Administration is mostly done by rebooting components.
Hubris is for system management components like the BMC, not for the main CPU.
I think this OS is intended to run in an embedded context where there are significant memory constraints; read its description: no runtime allocations, etc.
I linked two speeches where he goes over this in a bit more detail, but I hope the presentation opens it up even more.
No, it's a rack level design with the target market being not companies that buy single servers and fill racks with them but hyperscalers that need to fill whole datacenters with servers.
Basically there is a huge range of scale levels where having hardware on-prem makes sense financially.
Also, Bryan Cantrill has some sort of personal nitpick with modern servers basically being x86 PCs in a different form factor, and with the fact that in modern servers hardware and software do not cooperate at all (and on some occasions, hardware gets in the way of software).
> but hyperscalers that need to fill whole datacenters with servers.
I strongly doubt this is aimed at Amazon, Google, or Microsoft (hyperscalers). They all already have their own highly customized hardware and firmware. If that is their target I wish them luck. There’s no margin and a ton of competition in that space and as long as they’ve been working on this that feels like a pretty poor gamble.
What I believe this is actually targeting is small enterprise and up. A company that has dozens to thousands of servers. They’re willing to pay a premium for an easier go to market.
There's a big "turnkey rack" market, where multiple servers might be delivered as complete racks and are supposed to be already wired up and everything.
All ranges of business except very small turn up in those purchases.
> I strongly doubt this is aimed at Amazon, Google, or Microsoft (hyperscalers).
Indeed, that is not aimed at _those_ hyperscalers.
They are pretty much the "only" hyperscalers. The only two you could add are possibly Alibaba and Tencent Cloud.
I think IBM (SoftLayer), Hetzner, and OVH would disagree. They may not have the breadth of services, but they measure their scale in datacenters, not servers.
Joyent?
https://www.joyent.com/press/samsung-to-acquire-joyent
Which was a result of Joyent being unable to scale like the hyperscalers, because there was no third party that could make the hardware as well as the hyperscalers do. That's what Oxide is for: to fix what Joyent was unable to do, and to enable others to become hyperscalers.
Hyperscalers have already moved their custom stuff in a direction quite far from x86 PCs (how many new form-factors and interconnects and whatnot are under the Open Compute Project already?) while the typical Supermicro/Dell/HPE/whatever boxes available to regular businesses are still in that "regular PC" world. This is what they're trying to solve, yeah.
If anyone else wondered about the term BMC:
https://www.servethehome.com/explaining-the-baseboard-manage...
So this is what Cantrill has been talking about.
https://www.youtube.com/watch?v=XbBzSSvT_P0
https://www.youtube.com/watch?v=cuvp-e4ztC0
The github links don't work, are the repositories still private?
It's going to be presented at a talk in ~9 hours, so probably:
https://talks.osfc.io/osfc2021/talk/JTWYEH/
Yeah, also noticed and got a little bit upset about this. I mean, publishing a website with broken links does not seem very smart nor make very much sense to me.
Ha, sorry -- HN scooped us on our own announcement! We had the intention of turning the key on everything early this morning Pacific time, but we put the landing page live last night just to get that out of the way. Needless to say, we were a bit surprised to see ourselves as the #2 story on HN when we hadn't even opened it yet! It's all open now, with our apologies for the perceived delay -- and never shall we again underestimate HN sleuths! ;)
Did you forget to put a license in the repo? I'm guessing you meant to release it under the MPL.
That was one of the PRs that was to be merged before opening up, yes. I merged it one minute before you made your comment :)
https://github.com/oxidecomputer/hubris/pull/270
(And yes it's MPL)
That's _very_ impressive. Response time of -1 minute? Best I've ever seen.
;-)
Ha!
That said, if Hubris OS will be presented at a talk later today, I guess things seem more clear to me.
Am kind of very much looking forward to the talk and the presentation of the operating system, btw.
I guess we need to keep ourselves busy with some docs, as that one works.
yeah, sounds like a very good suggestion to me :)
Has Oxide released any information on the price range of one of their machines? I assume if they're targeting mid-size enterprises it would be outside what I would consider buying for hobby use, but it would be sweet in the future if there was a mini-Oxide suitable for home labs.
AFAIK they won't even sell individual machines, the product is a whole rack.
Since they aim to open source everything, there probably will be a way to use their management plane and stuff with a homelab eventually :)
no C code in the system. This removes, by construction, a lot of the attack surface normally present in similar systems.
Not to be too pedantic here, but it's important to note that the absence of C code, while arguably a benefit overall, doesn't by itself guarantee anything with regard to safety/security. I suppose there's going to necessarily be at least some "unsafe" Rust and/or raw assembly instructions sprinkled throughout, but I can't yet see that myself (as of the time of writing this comment, the GitHub links are responding with 404).
Nonetheless, it's always refreshing to see good documentation and source code being provided for these kinds of things. Many companies in this space, even these days, sadly continue to live by the outdated values of hiding behind "security through obscurity", which is somehow championed (though using different words) as a benefit even to their own customers, so it's encouraging that others (Oxide among them) are really starting to take a different approach and making their software/firmware publicly available for inspection by anyone inclined to do so.
To be clear, that sentence refers to the sum total of the things in the previous sentence, not just the bit about C. And it's "a lot of the attack surface" and not "guarantee" for a reason. We don't believe that simply being written in Rust makes things automatically safe or secure.
There is some unsafe Rust and _some_ inline assembly, yes. I imagine a lot less than folks may think.
I will admit perhaps I was a bit too loose with my own interpretation of the statement there. I think maybe this was influenced by my being tired of grand statements others have made in the past about the infallibility of writing code in Rust (even with liberal usage of "unsafe" without a proper understanding of what this implies).
It's all too often I see some cargo cult-style declaration of "no more C; it's all Rust" as if that has somehow solved all problems and absolved the programmer of the responsibility for ensuring their code is otherwise safe and correct (granted, Rust _does_ make this easier to do), and IMHO this just ends up doing a disservice both to those proclaimers and ultimately to Rust itself. To be clear, this is not a statement against Rust by any means but rather a complaint against the conduct of some of its practitioners.
With that being said, I feel I really also need to state here that I absolutely do not believe the above to be the case with the announcement here...Even to the contrary, I would say, as the name "Hubris" says it all. It's great to see Rust used in practice like this, and I look forward to seeing more of the details in the code itself!
> it's all Rust" as if that has somehow solved all problems and absolved the programmer of the responsibility for ensuring their code is otherwise safe and correct (granted, Rust does make this easier to do)
Rust is all about making it easier to ensure safety and correctness, yeah! It's still a tough job, but significantly easier than C or C++.
Which is why they state _a lot of the attack_ and not _all of the attacks_.
As someone who's only worked with a prepared hardware kit (a dsPIC33F on an Explorer 16 that came with cables and the debugging puck), if I want to pick up the board they recommend in the blog post, do I need to make sure I get any other peripherals?
This all seems very cool, and I badly want to poke at embedded stuff again, but I have whatever the opposite of a green thumb is for hardware. Advice would be appreciated ^_^
How are these docs being built? I really like how they look, and it seems to be asciidoc-based, but I can't seem to find a build script for them.
There's an open PR with the details, I set it to deploy every push to that PR for now so we could make quick fixes. It just runs asciidoctor, nothing fancy.
https://github.com/oxidecomputer/hubris/pull/272
(specifically
https://github.com/oxidecomputer/hubris/pull/272/files#diff-...
)
HTML says
<meta name="generator" content="Asciidoctor 2.0.16">
so I guess
That's what bootable Modula I offered on the PDP-11, over 40 years ago.
This needs citation, but what does the "that" even refer to? I'm genuinely curious because there's little on it; did you use it? Did it survive? And what particular aspect of Hubris and Humility reminds you of this system?
Compile all the processes together, allocating all resources at compile time. Modula I had device register access, interrupt access, cooperative multitasking (async, the early years) and it worked moderately well on PDP-11 machines.
Yes, I did use it. We wrote an operating system in it at an aerospace company.[1] It didn't work out well. Sort of OK language, weak compiler, not enough memory given the poor code generation. It was just too early to be doing that back in 1978-1982. We got the thing running, and it was used for a classified high-security application, once.
[1]
https://apps.dtic.mil/sti/pdfs/ADA111566.pdf
Thanks for the reference -- that's helpful. KSOS is not a widely known system, but I can certainly see some similarities in approach with Hubris.
That said, there are far more differences than there are similarities, and it's a gross oversimplification to say -- or even imply -- that the work in Hubris somehow replicates (or is even anticipated by) KSOS. More generally, I find this disposition -- that a new technology is uninteresting because it was "done" decades ago -- to be generally incurious, dour, discouraging, and (as in this case) broadly wrong on the facts. We as a team have as much reverence for history as any you will ever find; it is not unreasonable to ask those who have lived that history to return the favor by opening their minds to new ideas and implementations -- even if they remind them of old ones.
No, not KSOS. The Modula 1 environment. That was one of Wirth's early languages. Modula 2 is better known. Modula 1 was for embedded. This was the first language to have compile-time multitasking, something very rarely seen since. Here's a better reference.[1]
One of the most unusual features is that the Modula 1 compiler computed stack size at compile time, so that stacks, too, were allocated at compile time. Recursion required declaring a limit on the maximum recursion depth.
It's interesting as a working example of how minimal you can go running on bare metal and stay entirely in a high level language. Few languages since have been specifically dedicated to such a minimal environment.
This is what you need for programming your toaster or door lock.
[1]
https://www.sciencedirect.com/science/article/pii/S147466701...
The supervisor model reminds me a bit of how BEAM (Erlang/Elixir) works although I'm sure that's probably where the similarities end.
As much as most of this is way over my head, I'm always fascinated to read about new ground-up work like this.
Their repo is a rare case which embraced git submodules. For some reason they generate a lot of friction and are not used often.
They are in fact a giant pain, but sometimes, still the best option.
Did you try https://github.com/ingydotnet/git-subrepo ? Looks like it vendors in other repositories, making submodules entirely transparent for consumers while still allowing a submodule workflow for authors.
We didn't, thanks for the tip!
Their mention of individually restarting components and "flexible inter-component messaging" really reminds me of the BEAM. Very exciting!
Reminds me of QNX. It was an amazing OS and restarting display drivers over the network was just one of its amazing abilities.
Not an accident. Cliff was influenced by QNX, Minix, L3 and L4 in his design (specifically, QNX proxies directly inspired Hubris notifications). And for me personally, QNX -- both the technology and the company -- played an outsized role in my own career, having worked there for two summers while an undergraduate.[0] (And even though it's been over 20 years since he passed, just recalling the great technology and people there makes me miss the late Dan Hildebrand[1] who had a singular influence on me, as well as so many others.)
[0]
http://dtrace.org/blogs/bmc/2007/11/08/dtrace-on-qnx/
[1]
There should be a retweet on HN!
QNX was one of the most impressive OSes I've seen, especially for its time. From the full OS on a 1.44MB floppy disk to restartable drivers, real-time, etc. IPC with messages is built in, and most things ran in userland.
It ended up in the hands of BlackBerry, which is probably not the best home for it...
Edit: I googled out of curiosity, and despite being closed source, it seems to still be marketed by BlackBerry and is supposedly a market leader in embedded OSes! More than 20 years later, well deserved.
https://www.automotiveworld.com/news-releases/blackberry-qnx...
Yes, it's still around. Sadly it nowadays feels a bit neglected and isn't quite keeping up. They have their markets that have little other choice and/or are very conservative about switching to something else, and they will live off those for quite a while.
Most of Cisco's high-end service provider routers were running IOS-XR on top of QNX. They switched from QNX to a Linux kernel (specifically, Wind River) around 7 years back.
QNX is still going strong even with BlackBerry. I worked at a company recently that heavily relied on QNX for safety-critical embedded OS projects. Its Qt and QML integration made rapid prototyping a snap. Unfortunately it requires pretty (relatively) hefty processors, so I never personally got to use it.
It is an amazing OS,
No surprise, since Bryan Cantrill worked at QNX for a short time in the 90s.
I feel like Rust is everywhere and nowhere at the same time; how do they do it?
"They" is just a lot of software engineers who really really like it and want to be using it, but can't use it in their day job, so continue to talk and talk and talk about it in the hopes that it's used more. I'm one of them (not using it currently, want to use it).
I think the reference provides more info than the announcement above itself:
https://hubris.oxide.computer/reference
Looks amazing imo. Waiting for github code :D
https://github.com/oxidecomputer/hubris
I'm not familiar w/the details of the Cortex-Ms -- do any of them support SMT/multicore? Does Hubris have a scheduler which can support a multithreaded/core cpu?
Definitely no SMT (lol). Multicore is quite rare in the deep embedded space, though FreeRTOS has added some SMP support now:
https://www.freertos.org/2021/10/freertos-adds-reference-imp...
— note that it's _just now_; that fact alone should tell you a lot.
And often multicore in embedded is not SMP, rather MCUs with multiple cores often end up running independent programs / OS instances on the different cores.
I asked because I was considering porting to a DSP that has multiple threads, but it wouldn't make sense unless it had a scheduler that would work.
How do the independent cores mediate access to shared I/O on buses? (that includes RAM too)
The multiple cores are in a SoC, it's handled by bus fabric
Refreshing to see this seems tailored for RISC-V and ARM, rather than being just another x86 OS. RISC is the future, and the future is exciting!
RISC vs CISC is a legacy of the past. All ISAs now are a bit of RISC and a bit of CISC.
It's an embedded project. It's for _microcontrollers_. It's the Cortex-M kind of ARM, not Cortex-A. x86 isn't even a consideration in this space.
when i started working on a recent realtime project i used linux, although i wanted to do bare metal. but that was not an option because of all the drivers necessary, and i knew i wanted to use the GPU and the cortex A processor i am using. i am still wondering if there is really no solution to this situation.
Depending on what you need, you might end up wasting a ton of time reinventing the wheel if you go down the bare metal route.
Maybe a unikernel (if applicable) could be a good compromise?
that is true, but fine tuning linux to get what you want is a huge investment in time as well.
i don’t need all the features of a complete OS. i only need a small set of features, and i want them to work in a specific way.
Interesting choice of names, Hubris and Humility. Combined with the style of the page, it gives me a solemn and heavy feeling, especially compared to most projects presented, which tend to be very "positive energy and emojis". Their website is also beautiful.
Though I wonder who the target for this is. Is this for cloud providers themselves, for people that self host, for hosters? For everyone?
I think the names are very clever.
The OS is named Hubris. Building a new Operating System does take a lot of confidence.
The debugger is named Humility. It can be humbling to know your program is not working correctly and use a tool to discover how it is broken.
Impatience would be a great name for the task scheduler. (Because you want your task to run NOW!)
Laziness would be a great name for a hardware-based watchdog timer. (Because you keep on putting it off / resetting it until later.)
Compare:
Cantrill has talked quite a few times about this, it is for people that still build their own data centers.
Their podcast is similarly interesting, even for me, who has no real (professional) interest in or knowledge about building computers.
Speaking of interesting names, their control plane is called Omicron:
https://github.com/oxidecomputer/omicron
The target market is users that want to build their own cloud infrastructure, but don't have the scale required to go directly to ODMs to have their own custom designs manufactured.
When Cantrill and his team work on something, HN listens, and for good reason. Startups like Oxide show that there's still room for a lot of innovation on a smaller scale, even within fields like HW.
> Their website is also beautiful
But horribly broken for me (mobile firefox), with text cut off at borders and overlaid by images.
Apologies, pushing a fix now. I broke this earlier today!
Much better - thanks!
Same is true in mobile Safari (iOS) but I'll cut them some slack as long as it doesn't work in Chrome on iOS (since then it would be a Chrome specific hack, since Chrome on iOS uses the same engine as Safari.)
It's really a breath of fresh air.
So for a new OS like this, how does one compile their program for it?
You need to write your program for Hubris specifically; we don't support POSIX or anything. I want to have better "getting started" docs sometime soon.
Hey folks! The 404s are because we were planning on actually publishing this a bit later today, but it seems like folks noticed the CNAME entry. Happy to talk about it more, though obviously it'll be easier to see details once things are fully open.
EDIT: blog post is up:
https://oxide.computer/blog/hubris-and-humility
and the GitHub should be open.
EDIT 2: The HN story now points to this blog post, thanks mods!
I don't get why they had to reinvent the wheel, just use libreboot or whatever.
libreboot is a great project but it's only viable on specific x86 computers, and the chips we're using for the tasks Humility/Hubris is taking on are ARM.
What do Chromebooks use?
https://libreboot.org/docs/hardware/c201.html
says
> NOTE: support for this machine is dropped in recent Libreboot releases. It will be re-added at a later date. For now, please use Libreboot 20160907 on this machine.
Guessing by that number, this means support was discontinued five years ago.
Oxide’s work is always interesting and basically a perfect confluence of all of my combined hardware and software experience to date.
However, I can't quite get over their policy of paying everyone the same salary of $175,000 (https://news.ycombinator.com/item?id=26348836). I'd love to apply and work on these things, but I wouldn't love the idea of sacrificing $xxx,000 per year for the privilege of building someone else's startup.
Does anyone know if they have some variability in equity compensation at least? I’m no stranger to taking significant compensation in startup equity, but it would have to be significant enough to make up for the significant comp reduction relative to just about every other employer in these domains.
Do you know how much money $175kUSD is to a lot of people?
Yes! Of course it is, but employment is a market like anything else.
The difference between $175K and market rate compensation (which may be significantly higher right now, considering the job market and the skills they’re asking for) is captured entirely by the founders and investors. We shouldn’t be shaming people for expecting a higher portion of the value they create in a competitive market like this.
But the fixed salary method creates a lot of secondary problems for a company: It can become a revolving door as people join for quick experience and resume-building, but then leave as soon as they can get a higher paying job somewhere else.
> The difference between $175K and market rate compensation (which may be significantly higher right now, considering the job market and the skills they’re asking for) is captured entirely by the founders and investors
Why wouldn't some % of this be captured by your teammates?
That’s why I asked about equity. If they’re giving significant equity then I have no problems with the $175K compensation limit.
One of my favorite employers right out of college did almost exactly this same thing: Everyone gets paid the same (although they had a couple tiers) and a lot of talk about how we’re all equal.
I believed it, until the acquisition event, when I realized that the founders and early team members literally had 100-1000X as much equity as I did. All of the talk about paying everyone the same suddenly seemed like a cruel joke.
Also it looks like they did a 3% inflation bump in the last month, as it now says $180,250 on their careers page.
Another 10% salary makes a much bigger difference when your finances are out of control. Once you are living below your means it moves your retirement date around. It's not enough to move your Fuck You, I'm Moving to Tuscany date by an appreciable amount.
(The bump was done nine months ago, it even talks about it in the linked post itself.)
Did that make it onto the careers page at the same time?
I don't remember to be honest. I would hope so!
I feel like I looked a month ago and it was still a round number. Happy to be wrong though.
It's a startup, you shouldn't compare it with FAANG, but to other startups, in which case I think it's competitive, is it not?
I have some friends I'd be visiting in Cupertino if they had offered that much about 3 years ago, but they didn't. The real money was going to be based off of grants and getting a promotion back to their current level, and that was all too abstract for moving to the Valley, so they passed.
Worked out as their boss quit in the interim and now they're the boss.
In my first-hand experience interviewing at other startups (Q3 and Q4 2021), it's definitely not competitive. Startups are unbelievably well-funded right now, and capital is cheap and easy to come by. Even small startups don't hesitate to compensate their employees well, because they know it's their only shot at attracting the type of talent that gets them to exit.
Basically, it doesn’t make sense for a startup to be frugal with compensation right now.
Unless you interpret it as their way of hiring only promising junior candidates or remote workers from locations where $175K is a lot of money?
Regardless, it’s weird to pay everyone the same amount of money because you’re basically pretending experience doesn’t matter. This leads to more experienced people leaving for other companies where they can (easily) get paid more while the less experienced people won’t leave because it’s a boost over what they’d get at other places.
I’ve worked at places with HR-mandated salary caps before. The best people always leave because there’s no hope of moving up and there’s no real incentive to work any harder than anyone else earning the same amount (as long as you avoid getting fired).
All of this is more or less answered in the blog entry from nine months ago.[0] We haven't really done a follow-up, but since that blog entry, we attracted a new wave of absolutely outstanding folks. I don't think that I'm speaking out of turn to say that this is the best team that any of us have ever worked on -- and to the contrary, experience matters a great deal, as the team is _exclusively_ experienced. I would acknowledge that we are a different kind of company, and one that attracts a different kind of technologist. (And frankly, given the issue that you take with our compensation, you would likely take issue with many other aspects of our hiring process as well -- and that's okay! Not every company needs to be a fit for every person.)
[0]
https://oxide.computer/blog/compensation-as-a-reflection-of-...
I can't speak for anyone else, but I certainly would like to see more companies with a similar level of transparency, which do a similar equal-pay compensation scheme, and which exclusively attract experienced engineers. I had decided, even before your blog post, that this is exactly how I am going to run my company.