After moving from lemmy.ml to programming.dev, I’ve noticed that web responses are fulfilled much more quickly, even for content on federated instances like lemmy.ml and lemmy.world.
It seems like this shouldn’t make such a big difference. If a large instance is overloaded, it’s overloaded, whether the traffic is coming from clients with accounts on that instance or from other federated instances.
Can this be explained entirely by response caching?


Yes, caching. When you ask for a remote community, your instance doesn’t go fetch it right then. In fact, it doesn’t fetch at request time at all: it serves its own local copy, and the instance hosting the remote community pushes new data to yours as it appears.
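
To make the push model concrete, here is a minimal sketch in Python. It is a toy model, not Lemmy's actual code: the `Instance` class, its methods, and the `rust` community name are all made up for illustration. The point it shows is that a read on the small instance is answered from its local copy and never touches the big instance.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical model of a federated instance (not Lemmy's real API)."""
    name: str
    # Local copies of posts, keyed by community name.
    store: dict = field(default_factory=dict)
    # For each community hosted here, the instances that follow it.
    followers: dict = field(default_factory=dict)
    # How many client reads this instance has had to serve.
    reads_served: int = 0

    def follow(self, host: "Instance", community: str) -> None:
        """Register this instance as a follower of a community on `host`."""
        host.followers.setdefault(community, []).append(self)
        self.store.setdefault(community, [])

    def publish(self, community: str, post: str) -> None:
        """Store a new post locally, then push it to every follower."""
        self.store.setdefault(community, []).append(post)
        for follower in self.followers.get(community, []):
            follower.receive(community, post)

    def receive(self, community: str, post: str) -> None:
        """Accept a pushed post and add it to the local copy."""
        self.store.setdefault(community, []).append(post)

    def read(self, community: str) -> list:
        """Serve a client read from the local copy only."""
        self.reads_served += 1
        return list(self.store.get(community, []))


big = Instance("lemmy.ml")
small = Instance("programming.dev")

small.follow(big, "rust")        # done once, when someone subscribes
big.publish("rust", "Post A")    # the push happens at write time
posts = small.read("rust")       # the read is purely local
```

In this sketch the big instance does work once per new post per following instance, not once per reader, so its load doesn’t slow down reads on the smaller instance.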