💾 Archived View for dioskouroi.xyz › thread › 24992970 captured on 2020-11-07 at 00:48:17. Gemini links have been rewritten to link to archived content
________________________________________________________________________________
Cloud computing seems to be a winner-take-all scenario. For example, if you use AWS and need a message broker service, you'll use theirs. If you use Azure, you'll use their version. Development seems like just hooking up these components. I can't tell if this is a good thing or a bad thing.
It's a bad thing - whenever you cannot make a choice you become a slave to tyranny. Infra lock-in means you're a slave to their whims (financial, legal, competitiveness, whatever).
This is justified by "less maintenance and easier deployment", but the reality of the situation is that it's not worth giving your freedom up for. And to a lesser degree: if your platform becomes popular, you end up spending the same amount of time tweaking and optimising to match the idiosyncrasies of their implementation anyway.
But the most important part is vendor lock-in, it’s bad.
In how many cases are you truly not locked into anything? Even if you host your own RabbitMQ you're still coupled to the software. I'm not sure being coupled to cloud service X is any worse than OSS product Y. With the latter, you tend to need more expertise to run it yourself.
At the end of the day, you can still rewrite your code and switch in both cases. You can end up in a tough spot if the OSS community loses interest in the software you've already bet your complicated app on, as well
How is using AWS or Azure's managed RabbitMQ service lock-in? You can easily switch to someone else's managed service, or roll your own, since it's still just good old RabbitMQ.
I was responding to the parents more general point about cloud services being "winner take all"
AWS has SQS and Kinesis, which are much better queueing options in the scenario where you're in AWS doing new dev and can pick a technology. This is more likely to do with opening doors for large and complex applications that can't be rewritten to come into the AWS cloud. Unsexy stuff, but there's some fun engineering to be done in that realm - if you find puzzles and shitshows fun, anyway :-)
How are those much better? If you use SQS and write your code for it, you are stuck on a proprietary platform. Also, SQS is super basic and actually requires a bunch of code to do anything beyond the trivial - although yes, it seems reliable and well supported, at least in my experience. I was actually really waiting for AWS to support Rabbit, since it seems to hit the right combo of features, usability and platform independence for me, and it looked friendlier than ActiveMQ.
If you're using a decent framework, there's a good chance it already does most of the work for you. With Ruby on Rails, there's the ActiveJob abstraction, which you can hook up to different backends like SQS or Redis with a few lines of code. In addition, AWS has a lot of out-of-the-box integration between SNS/SQS and other services like CloudWatch and Lambda (in fact, with Lambda you don't need any special code).
If you have a Lambda function processing SQS messages, they just get dumped into your handler method, and if your function runs successfully they get automatically removed from the queue. If your Lambda fails, the message reappears after the visibility timeout, subject to your redrive policy.
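To make that concrete, here's a minimal sketch of what such a handler looks like. The event shape mirrors what the SQS event source mapping delivers to Lambda; the JSON message bodies and the "do real work" step are assumptions for illustration.

```python
import json

def handler(event, context):
    """Minimal sketch of a Lambda handler wired to an SQS event source.

    Lambda delivers a batch of messages in event["Records"]. If this
    function returns without raising, Lambda deletes the batch from the
    queue; if it raises, the messages become visible again after the
    queue's visibility timeout, subject to the redrive policy.
    """
    results = []
    for record in event["Records"]:
        body = json.loads(record["body"])  # assumes JSON message bodies
        results.append(body)               # placeholder for real work
    return results
```

Note there is no explicit `delete_message` call anywhere - the deletion (or redelivery) is driven entirely by whether the handler raises.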
"sqs & kinesis"? These are two vastly different queuing systems. It's like saying "SSDs and Tape" are a better storage system than X.
Neither SQS nor Kinesis support the same functionality as RabbitMQ.
> AWS has sqs & kinesis which are much better queueing options in that scenario where you’re in AWS doing new dev and can pick a technology.
Why are they better?
They've been around for quite some time, so they have a much wider customer base and bigger teams supporting them. AWS services can be a bit choppy in the beginning, so imo, especially with queues, I'd wait for it to bake.
I agree. SQS has done nothing but improve over time.
They aren't. They're technically more correct but not always the practical best choice.
RabbitMQ is a smart play as Rabbit is very easy to use, understand, and troubleshoot at the low end (which is where I suspect the vast majority of queue systems live).
It also has a feature which is actually really hard to do (and sqs doesn't do). Guaranteed delivery of a message _once_.
That was THE reason we never migrated to SQS, there are scenarios where SQS can double deliver. Our codebase was built up from nothing over time and couldn't gracefully handle double delivery of messages in all scenarios. We could have refactored, but it wasn't worth the work when we were already doing a half billion in revenue without getting even close to the limitations of rabbit AND were close to selling (which we ultimately did).
AWS is great at selling multiple slight variations of the same product. If you look you can usually find ONE variation that works for you. The real test will be if the billing isn't garbage (garbage billing is why we didn't use their other AMQP service and part of the reason why we don't use things like EKS or Managed SFTP despite having the need).
> Guaranteed delivery of a message once
That flies in the face of my distributed systems knowledge. It's not possible in some failure cases.
If your acknowledgement of a message gets lost (because either server involved, or the pipe in between, fails), you've processed the message already but the queue server will think you haven't. It either has to resend it (duplicate delivery) or it ignores acknowledgements altogether (dropping messages it sent you but you didn't process - maybe because your server failed). So the choice when there is a failure in the system is between at-least-once and at-most-once; exactly-once cannot be guaranteed.
I'm not aware of any way around that predicament.
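The usual way around it in practice is at-least-once delivery plus an idempotent consumer: deduplicate on a message ID before doing the work. A minimal sketch of that pattern (the in-memory set is an illustration; a real system would use a durable store, e.g. a database table with a unique constraint, so deduplication survives restarts):

```python
class IdempotentConsumer:
    """Deduplicates at-least-once deliveries by message ID."""

    def __init__(self, process):
        self.process = process   # the actual message handler
        self.seen = set()        # IDs already processed

    def handle(self, message_id, payload):
        if message_id in self.seen:
            return False         # duplicate delivery: skip it
        self.process(payload)
        self.seen.add(message_id)  # mark AFTER processing succeeds
        return True
```

Marking the ID as seen only after processing succeeds preserves at-least-once semantics: if the handler crashes mid-message, the redelivered copy is processed again rather than silently dropped.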
You are correct, a better description is that their path to 'deliver exactly once to the best of your ability' is clearer.
If I remember correctly, SQS is hard-limited to a fairly short timeout for requeueing messages that were delivered but not acked. In Rabbit it's much more configurable.
Also, regular Rabbit hosts support the kludge pattern of "just run one host and accept that if it goes poof you can lose messages", which is useful if you don't want to bother with the complexity of clustering or are on a shoestring budget.
Lastly you get a nice user interface with the management plugin and you can stand it up locally with docker compose (without depending on AWS for dev or any of the 'aws but on your laptop' solutions).
Yeah, those are nice features to have. Plus you don't get the platform lock in.
SQS also supports FIFO queues, which have once-only delivery and ordering. Any reason those didn't work for you?
Aren't they expensive with performance limitations?
Yes we could do that, but we had already been using rabbit in a bunch of places. It made no sense to change it.
Nope. If only. There are four considerations which are rarely, if ever, mentioned amid the mountains of well-organised hype:
The first is application complexity. A lot of real workloads are quite complex but do not need to scale out much. The cloud and all the hype are organised around simple workloads that need to scale out easily, which is an easy win - for example Netflix, or a SaaS application with a few tens of endpoints and a React front end. Wide, sprawling real businesses are a terrible fit and tend to get rather expensive rather quickly when you start putting their workloads into the cloud. There is marginal aggregate cost benefit over actually buying hardware ($4m-a-year SQL Server clusters are a reality in the cloud); the real benefit is only agility.
The second is that simply "hooking up components" sounds really easy. But it's not. I think perhaps 50% of my time is spent working out why X won't talk to Y, or why Z is broken, and hitting some opaque abstraction which doesn't allow me to get to the bottom of the problem. It's very, very easy to turn your deployment into a complete tangle of chaos and circular dependencies which are very hard to rationalise and automate, even with state-of-the-art automation tools (which, I will say, tend to melt in your hands). This is an extra layer on top of the same concerns you had before, not a replacement for them.
Thirdly, we have to work out the difference between mature products and hype. Nearly all solutions are described in little blog snippets that make things look really easy for a specific and narrow use case, but realistically things are really fucking complicated and in some cases absolutely awfully documented. In a lot of cases, including AWS, it's actually hard to find someone at the cloud vendor who knows how something works when you break it. And sometimes there are solutions which are just absolutely dire - again pointing the finger here at Amazon's managed ElasticSearch.
Fourthly, you end up a perpetual bean counter, afraid of the Rube Goldberg machine waking up in the middle of the night due to some event you didn't anticipate and drinking the contents of your credit card in a few minutes. Some of the cost-management and spot-instance-management software automates this rather nicely into a whole cluster of new failure modes, as if the complexity weren't enough already. A trite version of this is: saving money costs money, and sometimes the benefits are less than the costs.
So what you end up doing is trading your original problems for a set of new and shiny ones which are possibly even more complicated.
But at least you only have one vendor to shout at, which is a net win if you've ever tried to get HPE and Cisco to work out what fucked up mess is going on between their two lumps of iron.
I digress but be careful with assumptions about it being magical unicorns. They poop and you have to shovel it.
good - my job is really easy
bad - my job is really boring
I beg to differ on your second point - at my company we've fully embraced AWS and putting vendor lock-in issues aside, the end result is focusing more on the application and less on the minutiae of operational issues which is a big win. This makes things much more interesting since you can get there faster and consequently take on more impactful projects in the same timeframe. This in general is a boon for developers in my experience.
Agreed. The flip side isn't vendor lock-in though, it is building complex systems under the guise of scale.
Somehow I don't think physical laborers ever complain when they get new and more powerful tools to make their job easier.
It's only software engineers that bemoan their lives getting easier, so they can spend more time working on other problems higher up the abstraction chain.
The 'problems higher up the abstraction chain' are the ones that are closer to labour, or factory work: mundane and repetitive, relatively speaking easy, requiring less thought and being to some extent trainable as working within a pattern/template.
How about when the new powerful tools help get rid of some of them due to productivity gain?
And you can spend that time thinking about how not to use a message broker... And maybe just use lower level MANAGED services, like SQS or SNS.
RIP
Nah, they support so many platforms and AWS is only one of them. I also think their Heroku business might actually benefit from this. Now they don't even have to manage the servers themselves -- they can essentially re-sell AWS's service with custom support if they wanted and charge customers to move the cost to Heroku's bill rather than AWS's (you'd be surprised how many do this).
Reselling AWS's offerings would cost more than running it on bare VMs; I'm not sure they would be able to compete on price with them.
It also doesn't make sense to rewrite their current software which is probably abstracted for multi-cloud to support re-selling.
> Reselling AWS's offerings would cost more than running it on bare VMs; I'm not sure they would be able to compete on price with them.
True -- I do think passing on the cost and taking a tiny margin with drastically reduced maintenance cost could be an attractive business model at scale though.
> It also doesn't make sense to rewrite their current software which is probably abstracted for multi-cloud to support re-selling.
I have no idea what their current software looks like, do you have any inside knowledge?
If they have abstracted, then they probably have multiple implementations of a similar API -- this is just changing _one_ of them (or maybe even cloning it to reduce possibility of breakage). This might be as simple as just changing the AWS-specific provisioner to call out to AmazonMQ instead of EC2, or changing some code that generates terraform/pulumi scripts.
One thing I think they'd have to deal with is the fact that they support custom plugins that AmazonMQ may not.
The backups are also something that would differ greatly, along with metrics that rely on internal APIs which AWS may not provide access to.
Like other AWS products (RDS, Elasticache), there’s limitations since they provide protocol interoperability with proprietary tech behind the scenes.
Of the FAANG companies, Amazon strikes me as the one using the most open source code while at the same time not having much to show as open source - Firecracker is a (relative) toy.
They must be laughing their socks off at the top of FAANG: thousands upon thousands commit millions of hours to open source, which they then run for a nice profit.
This is because reliability, ease of setup, and support are actually just as important as what is being run.
If you're not using AWS you're not fully leveraging your software team, because that means you've got people spending time building and supporting these sorts of internal systems. That should only be done when you reach large scale, at which point people have leveraged other AWS synergies making it harder to exit the platform.
>That should only be done when you reach large scale, at which point people have leveraged other AWS synergies making it harder to exit the platform.
And then you're bleeding money!
And drowning in a sea of complexity...
I really doubt you have any real-world experience with AWS if you try to sell it as a less complex alternative to anything, really. AWS is perhaps the most complex and arcane service provider ever, and it is progressively getting worse as the service keeps growing and changing.
And no, learning a specific AWS service is not a solid career investment. Tell that to anyone who tried to learn CloudFormation, then SAM, and now has to forget everything because AWS pushes you to use CDK.
> If you're not using AWS you're not fully leveraging your software team, because that means you've got people spending time building and supporting these sorts of internal systems.
This is simply not true at all, and flies in the face of real world usage.
The only concrete and objective selling point of AWS is its global coverage of data centers, and the infrastructure they have in place for delivering reliable global-scale web services.
The problem is that the companies who actually operate at such a scale and with such tight operational requirements can be counted on your fingers. That count then drops to a fraction once you start to do a cost/benefit analysis.
The rest of the world is quite honestly engaged in cargo cult software development.
And no, doing AWS is not simpler or more efficient. You might launch an EC2 instance with a couple of clicks, but navigating a service designed for global scale, with multiple levels of redundancy and tight integration and dependencies across half a dozen AWS offerings which may or may not be redundant or competing... no, that is not simple, nor does it allow for any kind of time efficiency.
Hell, with AWS you do not learn how to manage or operate infrastructure. With AWS you learn the AWS dashboard, and you learn Pavlovian reactions for which button to press when you hear an alarm. You never fully grasp the impact or repercussions of pressing a button, and you have absolutely no idea what impact that click will have on your monthly bill.
In contrast, if you need to run microservices chatting through a message broker, then your system on OVH or Hetzner or any other bare-bones provider will consist of a bunch of nodes, where one of them runs RabbitMQ and everyone else points to it. You can get everything running from scratch on a cluster managed by Docker Swarm in about 15 to 20 minutes. In the end you have a far simpler service running at a fraction of the cost, in a far more manageable environment.
AWS is resume-driven development fueled by cargo cult development.
For some stacks, sure. There are a good number where you'll run into scalability/cost problems fairly quickly.
Do they need to? The point of FOSS isn't to get freebies from big corporations. It's so that software remains in the hands of the users. If Amazon doesn't contribute back, or if they fork it, that's fine, because RabbitMQ will always be there.
Amazon contribute plenty to open source, both in code and in money, although they can definitely afford to do more sponsorship.
I'm curious why you'd characterize firecracker as a toy?
It’s (literally) Google’s crosvm with some changes. AWS forked an existing codebase, made some changes that mostly consisted of removing functionality to tailor the VMM for their use case, and then made a big PR push about “open sourcing” Firecracker as though it was a project they built from scratch. The announcement devoted 1 sentence to the fact that it’s a crosvm fork, and that’s not even the only thing covered in that one sentence.
https://aws.amazon.com/blogs/aws/firecracker-lightweight-vir...
Toy in what sense? It’s the backbone of most serverless offering from AWS.
Toy in the sense that it’s not a significant contribution to the open source community.
Edit: In comparison to the scale of Amazon and the scale of contribution of other similarly-sized tech companies. Firecracker would rank as a more major contribution in my book if it wasn’t a cut-down fork of a pre-existing (and still active!) project.
Define significant? If you're expecting every open source software user to push a million commits before they can be called "contributors", then apart from Microsoft/Facebook I don't see any significant contributors.
The context up thread is that Amazon contributes a remarkably small amount to open source compared to the amount of code that Amazon produces and the amount of open source projects that Amazon depends on. Also I find it interesting that you don’t think that Microsoft and Facebook are comparable to Amazon.
After watching a presentation on firecracker from Amazon team it seemed like it is more of a playground for their junior developers than some high profile library. Not trying to be dismissive, just an impression.
Firecracker has 13K stars on Github [1]. I'm not in this space. Why does a toy have so much activity (issues, pull requests) and stars?
[1] https://github.com/firecracker-microvm/firecracker
Off-topic: we are evaluating RabbitMQ vs ActiveMQ at the moment; ActiveMQ won on one detail: we need delayed messages (e.g. publish {...} in 2h), and ActiveMQ seems to support those in a clustered setup, while Rabbit supports them only on one node, which does not fit the business case (we cannot lose messages). I worked with RabbitMQ before and it was great, but this seems indeed an issue. Anyone here with some insights?
Hopefully I'm not misunderstanding your use case, but I would think you could accomplish this by publishing a durable message to a highly available, durable queue with no consumers, setting a TTL on the message with a policy to publish to a "dead letter exchange" that fronts the queue or queues with your eventual consumers. The durability flags ensure both message and queue survive a restart, while the high availability policy on the queue ensures that each node in the cluster has a copy. And I'm sure there are a few similar patterns that would garner the same desired behavior.
I'm not taking a position on whether this is an acceptable level of complexity for the desired feature, of course, just pointing out how one might accomplish it if Rabbit is otherwise desirable.
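The TTL + dead-letter-exchange pattern described above can be sketched with pika. The exchange, queue, and routing-key names here are hypothetical; the testable part below is just a pure helper that builds the queue arguments, since actually declaring the queue requires a live broker.

```python
def delay_queue_arguments(delay_ms, dead_letter_exchange, routing_key):
    """Arguments for a durable 'holding' queue with no consumers:
    messages expire after delay_ms and are republished to the
    dead-letter exchange, where the real consumers' queue is bound."""
    return {
        "x-message-ttl": delay_ms,                       # per-queue TTL
        "x-dead-letter-exchange": dead_letter_exchange,  # where expired msgs go
        "x-dead-letter-routing-key": routing_key,
    }

# Against a live broker it would look roughly like this (requires pika):
#
#   import pika
#   conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
#   ch = conn.channel()
#   ch.exchange_declare("work", exchange_type="direct", durable=True)
#   ch.queue_declare(
#       "delay-2h", durable=True,
#       arguments=delay_queue_arguments(2 * 60 * 60 * 1000, "work", "jobs"),
#   )
#   # Publish persistent messages to "delay-2h" with no consumers attached;
#   # two hours later they reappear via the "work" exchange.
```

One caveat worth knowing: a per-queue TTL means every message in that holding queue gets the same delay, so supporting several different delays means declaring one holding queue per delay value.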
Open source is free to do what you will, but how many PR's does AWS send back to origin?
Disclosure: I work at AWS and this is my personal opinion.
I've seen growing levels of AWS contribution back to upstream projects over the past four years. Teams start out by operating a piece of software at scale, whether it is Redis, Kubernetes, etc. After they have operated it for a while they discover the bugs or performance issues, or customers of the service complain about something. At that point the team now has enough real world experience with that software to begin to contribute back to upstream.
It takes time: to learn the ins and outs of the software well enough to know where and what improvements should be made, to understand the software's design and history well enough not to make bad suggestions or contributions that were already determined to be dead ends in the past, and to earn the approval of the community and existing maintainers enough to get significant contributions accepted in the first place.
Hey Thief,
Which team do you work on? Why would you cheer taking other people's work and defend monetizing it? Do you publish PRs secretly pulled from public records, huh? You are not welcome here.
I don't know about RabbitMQ, but for Redis AWS sends back a lot - so much that one of their developers is now a core team member.
https://redislabs.com/blog/redis-core-team-update/
They have quite a number of commits from various authors to the Linux kernel at least:
linux (master=) $ git shortlog -ns --author amazon
    79  Arthur Kiyanovski
    76  Gal Pressman
    59  David Woodhouse
    51  Sameeh Jubran
    45  Netanel Belgazal
    38  KarimAllah Ahmed
    35  SeongJae Park
    30  Jan H. Schönherr
    26  Frank van der Linden
    18  Andra Paraschiv
    12  Paul Durrant
    11  Talel Shenhar
    10  Shay Agroskin
    ...
Usually zero.
Another bold innovative move from AWS.
To be honest, charging for things that other people have created and released for free _is_ pretty bold! ha
Is it bold? If a group of people put something together and release the details for free, they have to expect others will profit off it. This is the fate of all useful open source software that doesn't have some kind of non-commercial clause (which arguably moves it outside the ideals of open source, as it restricts use of the source code).
Not really. This has been going on for decades. Examples: early web hosts running the LAMP stack: Linux/Apache/MySQL/PHP. Eventually people put an admin UI on top (control panels, like Plesk and cPanel.) The core functionality is all free software. The value is not having to configure and manage all that stuff yourself.
There is absolutely nothing wrong with charging for this. Have you heard of "Wordpress hosting" for example?
You are still free to use the free open source version.
.. and not contributing back to the community and original authors
I know a lot of HNers believe that AWS just consumes OSS and doesn't contribute back, I just wanted to share what AWS has to say about it-
https://aws.amazon.com/blogs/opensource/setting-the-record-s...
I am in no way defending or attacking anyone; I just want to provide a data point.
What a bizarre post! Seems really strange to me to see such a 'toys out of the pram' reaction piece like that from an organisation as big and corporate as AWS.
That it feels the need to respond like that makes me see it in worse light over the matter rather than better.
Will it force RabbitMQ to also change its open source license model, like Redis?
Semantically idiotic in my experience with the founder
AMQP 0.9.1 only. Azure Service Bus supports AMQP 1.0 and does not require any servers on the tenant side.
That's true, but a lot of software standardised on RabbitMQ specifically, including its client libraries. That means there's still a lot of demand for AMQP 0.9.1 over 1.0. (RabbitMQ does have a plugin for AMQP 1.0, however.)
Amazon MQ supports ActiveMQ (AMQP 1.0) and RabbitMQ (AMQP 0.9.1), so take your pick.