What would happen if instead of users swarming existing servers when a fediverse service was put in the spotlight, each user spun up their own micro-instance and tried to federate with existing servers?
There’s always the odd person who decides to host a personal fediverse service in their homelab for themselves, but would the fediverse work if that was actually the primary mode of interaction? Or would it fail in a similar way to now where the servers which receive the most federation requests need to scale up?
Presumably federation traffic is easier to scale than browser requests, since it’s an async process.
The way activitypub works is that each community has a list of every server that has at least one subscriber to that community.
Every time someone does something in that community, the community sends all those servers a message that tells them what just happened.
So instead of informing a few hundred servers of your one upvote on a post, it would have to inform basically every user (that is, every user’s server).
It would be bad; it’s not designed to do that.
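To make the fan-out described above concrete, here is a rough sketch. It is a toy model, not Lemmy's actual code: the `Community` class, its methods, and the `user{i}.example` hostnames are all made up for illustration, and it assumes one delivery per distinct server regardless of how many subscribers live there.

```python
from urllib.parse import urlparse


class Community:
    """Toy model of a community that tracks which servers to notify."""

    def __init__(self, name):
        self.name = name
        self.followers = set()  # actor URLs, e.g. "https://lemmy.ml/u/alice"

    def follow(self, actor_url):
        self.followers.add(actor_url)

    def target_servers(self):
        # One entry per distinct host, no matter how many followers live there.
        return {urlparse(url).netloc for url in self.followers}

    def deliveries_for(self, activity):
        # In a real server each pair would become a queued HTTP POST.
        return [(host, activity) for host in sorted(self.target_servers())]


def demo():
    # Scenario A: 1000 subscribers who all share one instance.
    shared = Community("example")
    for i in range(1000):
        shared.follow(f"https://lemmy.ml/u/user{i}")

    # Scenario B: 1000 subscribers, each on a hypothetical single-user instance.
    solo = Community("example")
    for i in range(1000):
        solo.follow(f"https://user{i}.example/u/me")

    vote = {"type": "Like", "object": "post-1"}
    return len(shared.deliveries_for(vote)), len(solo.deliveries_for(vote))


print(demo())  # (1, 1000)
```

Under these assumptions a single upvote costs one outgoing message when everyone shares an instance, and one message per subscriber when everyone self-hosts.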
Possibly failure, because setup isn’t just a simple out-of-the-box plop. And I can’t see how pings from 5000 microservers are better than 5000 users looking to register. But that’s more of a question than an informed opinion.
Maybe I should clarify with “each user successfully spun up…” I’m mostly curious if the 5000 microservers trying to federate is a more sustainable access pattern than 5000 users hitting the website.
Since federation is an async process, it can be optimized on both ends in a way that user browser requests cannot.
At the same time, federation would overall result in more bandwidth being used because not every user wants to view every post in the frontend.
Sustainable in what sense?
It’s way more sustainable in the sense of “one website is not controlling the entirety of the experience of a given type of service for 5000 users”, for example. I think it’s important to talk about specific kinds of sustainability, and specific threats to it.
Things to consider (apart from bandwidth-related considerations):
- technical knowledge necessary to safely and securely run and maintain a service
- space, time, and resources (including financial) to do so
- ability, willingness, and energy to moderate a service (this is where Big Tech platforms are falling flat on their faces, for example, and where smaller fedi communities work pretty damn well)
That Ansible playbook works great; it’s just a bash script away from regular-user DIY.
I’ve watched people who never used a computer install blockchain nodes and miners (including the networks). If someone wants to do it, they WILL figure it out.
I don’t think so. Take a community on beehaw.org, for example. It can have, say, 1000 subscribers from lemmy.ml but only needs to send content to lemmy.ml once as it comes in. All 1000 subscribers see the cached copy from lemmy.ml, and a message is only sent back to beehaw.org for comments, votes, etc. With everyone having their own instance, beehaw.org would have to send updates to each one instead of sending an update to one instance and 1000 users seeing it. A good level to strive for is many small communities of, say, a few thousand users (1-5 thousand or so). That way no single server gets too massive, but federation requests aren’t overwhelming instances either.
What you’re describing is no longer federation but full P2P. From a purely technical point of view, it may work, but the biggest problem will be abuse (spam, excessive resource use, illegal content). When a new instance shows up, how do you know if it’s a spammer or not? And if an instance is blocked by another instance, whose side should you be on?
It wouldn’t really be full P2P: I’d expect moderated communities to act as a funnel through which everyone interacts with each other. I wasn’t really considering the hypothetical micro-instances to be like a normal server, since even when federated it’s unlikely that they would consume as much federation bandwidth as a large instance. Most people wouldn’t run a community, simply because they don’t want to moderate it.
Realistically, the abuse problems you mention can already happen today if someone wants them to. It’s easier to make an account on an existing server with a fresh email, spam a bit, and get banned than it is to register a new domain ($) and federate before doing the same. I think social networks would have a lot less spam if every time you wanted to send an abusive message, you had to spend $10 to burn a domain name.
Most of the content would still live on larger servers, so you end up moderating in the same place. Not much difference between banning an abusive user on your instance and banning an abusive single-user instance.
I hadn’t even thought of the moderating yet.
I would be shocked if it worked well, seeing as it wasn’t designed for that.
Even if it did though, where would we be having this conversation? It would work more like a texting app than any kind of community.
It’s a similar concept to email, so I would imagine there will always be big players who will have a reputation of trustworthiness/reliability.
The whole concept here seems to favor spinning up your own “cache” instance between you and the content you want (similar to how old email clients worked, downloading emails from the mail server and never live-fetching them), which is fabulous for distributing the load. Discovery takes a back seat when doing that, but it’s still pretty doable.
I think the main difference between fediverse and email WRT cache instances is that if you create a cache instance for email, you’re only caching your personal emails. If you create a cache instance for a lemmy community, you’re caching every event on the community.
My intuition says there’s probably a breakpoint in community size where the cost of federating all events to the users who subscribe to them becomes greater than the cost of individually serving API requests to them on demand. Primarily because you’ll be caching a far greater amount of content than you actually consume, unlike with email.
Edit: That said, scaling out async work queues is a heck of a lot easier than scaling out web servers and databases. That fact alone might skew the breakpoint far enough that only communities with millions of subscribers see a flip in the cost equation…
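A back-of-the-envelope sketch of that breakpoint, under loudly stated assumptions: every subscriber runs their own instance, community activity grows linearly with subscriber count, and the cost numbers are arbitrary made-up units. None of these figures are measured; the function names and parameters are invented for illustration.

```python
def push_cost(subscribers, events_per_subscriber, cost_per_delivery):
    """Daily cost of federating every event to every subscriber's own instance."""
    events = subscribers * events_per_subscriber  # assumed: activity scales with size
    return events * subscribers * cost_per_delivery


def pull_cost(subscribers, requests_per_subscriber, cost_per_request):
    """Daily cost of serving each subscriber's API or browser requests on demand."""
    return subscribers * requests_per_subscriber * cost_per_request


def breakpoint_size(events_per_subscriber, requests_per_subscriber,
                    cost_per_delivery, cost_per_request):
    """Community size at which push_cost equals pull_cost in this toy model."""
    return (requests_per_subscriber * cost_per_request) / (
        events_per_subscriber * cost_per_delivery
    )


# Hypothetical numbers: 0.5 events and 50 requests per subscriber per day.
equal_cost = breakpoint_size(0.5, 50, cost_per_delivery=1, cost_per_request=10)
cheap_queue = breakpoint_size(0.5, 50, cost_per_delivery=0.25, cost_per_request=10)
print(equal_cost, cheap_queue)  # 1000.0 4000.0
```

In this model push cost grows with the square of the community size while pull cost grows linearly, so a breakpoint exists. Making queued deliveries cheaper relative to web requests moves the breakpoint up proportionally, which is the effect the edit above speculates about.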
Maybe I’m wrong (I’ve only been on Lemmy since yesterday morning), but if you host your own instance you’re only caching the communities you are interested in. If you never cared about a community or interacted with an instance, then that data will never reach your instance. Federated doesn’t imply full redundancy.
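A tiny sketch of that point, assuming an instance only receives events for communities it has subscribed to. The community names and event counts are invented for illustration.

```python
def inbound_events(subscriptions, events_per_day):
    """Events per day that reach an instance: only its subscribed communities count."""
    return sum(events_per_day[name] for name in subscriptions)


# Hypothetical network with three communities and made-up activity levels.
events_per_day = {
    "technology@a.example": 5000,
    "cooking@b.example": 300,
    "memes@c.example": 20000,
}

my_subscriptions = {"cooking@b.example"}

received = inbound_events(my_subscriptions, events_per_day)
whole_network = sum(events_per_day.values())
print(received, whole_network)  # 300 25300
```

A personal instance following one quiet community sees only that community's traffic. The earlier concern still applies within a subscribed community, though: you receive all of its events, not just the ones you end up reading.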