Off-and-on trying out an account over at @[email protected] due to scraping bots bogging down lemmy.today to the point of near-unusability.

  • 59 Posts
  • 1.53K Comments
Joined 2 years ago
Cake day: October 4th, 2023

  • He could probably run an NFS server that isn’t a closed box, and have that use the Synology box as its storage. That’d give him whatever options Linux and/or his chosen NFS server provide for fairly prioritizing writes or for increasing cache size (say he has bursty load and blows through the cache on the Synology NAS; a Linux NFS server with more write cache available could potentially slurp up writes quickly and then hand them off to the NAS more slowly).

    Honestly, though, I think a preferable option, if one doesn’t want to mess with client global VM options (which wouldn’t be my first choice, but it sounds like OP is okay with it), is just to crank up the timeout options on the NFS clients, as I mention in my other comment. That works if he just doesn’t want timeout errors percolating up and doesn’t mind the NAS taking a while to finish whatever it’s doing in some situations. It’s possible that he tried that, but I didn’t see it in his post.

    NFSv4 has leases, and (I haven’t tested it, but it’s plausible to me from a protocol standpoint) it might be possible to set things up such that as long as a lease can be renewed, outstanding file operations don’t time out, even if they’re taking a long time. The Synology NAS might be able to renew leases promptly as long as it’s reachable, even while it’s doing a lot of writing, and so avoid client timeouts on that front. You’d still find out if your NFS server wedged or you lost connectivity to it, because your leases would go away within a bounded amount of time, but long-running operations wouldn’t hit a time-to-complete timeout. No guarantee; it’s just something I might look into if I were hitting this myself.


  • That’s a global VM setting, which is also going to affect your other filesystems mounted by that Linux system, which may or may not be a concern.

    If that is an issue, you might also consider the following (I haven’t tested these, but I’d expect them to work):

    • Passing the sync mount option on the client for the NFS mount. That will use no writeback caching for that filesystem, which may impact performance more than you want.

    • Increasing the timeo= and/or retrans= NFS mount options on the client. These keep the client from timing out and deciding that the NFS server is taking excessively long (though an operation may still take longer to complete if the server is slow to respond).
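    For reference, a sketch of what those mount options might look like; the server path and the values here are illustrative, not recommendations:

    ```
    # /etc/fstab -- illustrative NFS client entry. timeo= is in tenths
    # of a second (600 = 60 s before a retry); retrans= is the number
    # of retries before a "server not responding" major timeout.
    nas:/volume1/data  /mnt/data  nfs  vers=4.1,timeo=600,retrans=5  0  0

    # Same idea with the sync option (no client writeback caching):
    # nas:/volume1/data  /mnt/data  nfs  vers=4.1,sync  0  0
    ```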


  • We will ⁠take other steps to stabilise energy markets, including working with certain producing nations to increase output.

    I suspect that, for natural gas and Europe, the problem here is gonna be similar to the issue when Russia cut natural gas off. You might be able to ramp up extraction in Africa or the US or South America or wherever. But you’re going to have a hard time rapidly getting more liquefaction capacity online near the extraction sites, to convert the gas to a form that can viably be moved to major natural gas importers. That’s something like a 4-5 year window to build out more capacity, from past reading.

    https://en.wikipedia.org/wiki/List_of_countries_by_natural_gas_production

    Looking at that, the major natural gas producers in Europe, outside Russia, are Norway (#8 globally) and the UK (#22 globally). Those guys are directly on the European pipeline network. If they can ramp up their production and have pipeline capacity, they can get it to the rest of Europe.

    But outside of that, there’s only so much liquefaction capacity.

    US gasoline and diesel prices promptly rose after the conflict kicked off, because the market for those is global. Anyone in the world can buy oil and transport it without much trouble, so prices tend to be correlated around the world. But the natural gas market requires scarce, specialized liquefaction and regasification facilities for long-distance transport, and as a result it’s fragmented and decoupled. Henry Hub (US natural gas) prices aren’t up. The major European natural gas index (TTF) is well up. The divergence means there isn’t a way to move adequate natural gas between the two markets, so European consumers will be disproportionately hit by natural gas price increases.


  • Frankly, I don’t see why sending funds requires unanimous EU approval. Hungary wasn’t even contributing, based on the article:

    Hungary, Slovakia and the Czech Republic approved the idea with the crucial caveat that they did not have to contribute to the loan.

    So I’d expect that any group of countries that do want to send funds could do so absent Hungary without even affecting the total commitment.

    Sanctions on goods out of Russia or something, sure. The EU controls what gets into the EU at the customs border, so you have to have an EU-level decision; otherwise goods can enter via another EU member, and there’s no control of the flow of goods internal to the EU. The EU structure doesn’t permit that to be done below the EU level.

    But I don’t believe that there’s any intrinsic reason that Hungary need sign off on funds going to Ukraine, assuming that other countries honestly do want to send that money. A country can send funds wherever it wants, on an individual or collective basis.


  • What makes this worse is that git servers are the most pathologically vulnerable to the onslaught of doom from modern internet scrapers because remember, they click on every link on every page.

    The especially disappointing thing is that, for the specific case Xe was running into, a better-written scraper could recognize that this is a public git repository and just git clone the thing, getting all the useful code without the overhead. It’s not even “this scraper is scraping data that I don’t want it to have”; it’s “this scraper is too dumb to scrape the thing efficiently, and is blowing both the scraper’s resources and the server’s resources downloading innumerable redundant copies of the data”.

    It’s probably just as well, since the protection is relevant for other websites, and he probably wouldn’t have done it if he hadn’t been getting his git repo hammered, but…

    EDIT: Plus, I bet that the scraper was requesting a ton of files at once from the server, since he said that it was unusable. You have a zillion servers to parallelize requests over; you could write a scraper that requests one file at a time per server, which is common courtesy, and you’re still going to be bandwidth-constrained if you’re schlorping up the whole Internet. Xe probably wouldn’t have even noticed.
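    As a sketch of the “just clone it” point, with a throwaway local repository standing in for the public server (all paths here are illustrative):

    ```shell
    # Throwaway local repo standing in for the public git server.
    git init -q /tmp/demo-src
    git -C /tmp/demo-src -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "initial"

    # A mirror clone fetches every ref in one transfer, instead of
    # crawling a web UI link by link:
    rm -rf /tmp/demo-mirror.git   # clean slate for the demo
    git clone -q --mirror /tmp/demo-src /tmp/demo-mirror.git

    # Later refreshes move only new objects:
    git -C /tmp/demo-mirror.git fetch -q --prune
    ```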


  • https://en.wikipedia.org/wiki/National_Helium_Reserve

    The National Helium Reserve, also known as the Federal Helium Reserve, was a strategic reserve of the United States, which once held over 1 billion cubic meters (about 170,000,000 kg)[a] of helium gas.

    The Bureau of Land Management (BLM) transferred the reserve to the General Services Administration (GSA) as surplus property, but a 2022 auction[10] failed to finalize a sale.[11] On June 22, 2023, the GSA announced a new auction of the facilities and remaining helium.[12] The auction of the last helium assets was due to take place in November, 2023.[13] Though the last of the Cliffside reserve was to be sold by November 2023, more natural gas was discovered at the site than was previously known, and the Bureau of Land Management extended the auction to January 25, 2024 to allow for increased bids.[14] In 2024 the remaining reserve was sold to the highest bidder, Messer Group.[15]

    Arguably not the best timing on that.


  • Sure. What that guy is using is actually not the most interesting diagram style, IMHO, for automatic layout of network maps if you want large-scale stuff, which is where automatic layout really shines. I have some scripts floating around somewhere that will generate very large network maps: run a bunch of traceroutes, geolocate IPs, dump the results into an sqlite database, and then generate an automatically laid-out Internet network map. I don’t want to go to the trouble of anonymizing the addresses and locations right now, but if you have a graphviz graph and want to try playing with it, I used:

    goes looking

    Ugh, it’s Python 2, a decade-and-a-half old, and never got ported to Python 3. Lemme gin up an example for the non-hierarchical graphviz stuff:

    graph.dot:

    graph foo {
        a--b
        a--d
        b--c
        d--e
        c--e
        e--f
        b--d
    }
    

    Processed with:

    $ sfdp -Goverlap=prism -Gsep=+5 -Gesep=+4 -Gremincross -Gpack -Gsplines=true -Tpdf -o graph.pdf graph.dot
    

    Generates the laid-out graph as graph.pdf.

    That’ll take a ton of graphviz edges and lay them out nicely in a non-hierarchical map, trying to avoid crossing edges and the like. For more complicated maps where it can’t use straight lines, it’ll use splines to curve edges around nodes. You can create massive network maps like this. Note that I last looked at graphviz’s automated layout stuff about 15 years ago, so it’s possible that they have better layout algorithms now, but this can deal with enormous numbers of nodes and will do reasonable things with them.

    I just grabbed his example because it was the first graphviz network map example that came up on a Web search.
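    The traceroute-to-graphviz pipeline described above can be sketched in shell; canned traceroute-style output stands in for a live run here (with real data you’d feed it `traceroute -n` output instead):

    ```shell
    # Canned hop list standing in for real `traceroute -n` output
    # (documentation-range IPs as placeholders):
    printf ' 1  192.0.2.1  1.2 ms\n 2  198.51.100.7  5.0 ms\n 3  203.0.113.9  9.1 ms\n' > /tmp/trace.txt

    # Turn consecutive hops into undirected graphviz edges:
    {
      echo 'graph net {'
      awk '{ip=$2} prev {printf "  \"%s\"--\"%s\"\n", prev, ip} {prev=ip}' /tmp/trace.txt | sort -u
      echo '}'
    } > /tmp/net.dot

    # Then lay it out non-hierarchically, as with the sfdp invocation above:
    #   sfdp -Goverlap=prism -Gsplines=true -Tpdf -o net.pdf /tmp/net.dot
    cat /tmp/net.dot
    ```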


  • You have all your devices attached to a console server, with a serial console configured on each serial port, and, for devices that support accessing the BIOS over serial, that enabled too so you can get at it remotely, right? Either a dedicated hardware console server, or some server on your network with a multiport serial card or a USB-to-multiport-serial adapter, right? So that if networking fails on one of those devices, you can fire up minicom or similar on the console server and get into the device to fix whatever’s broken?

    Oh, you don’t. Well, that’s probably okay. I mean, you probably won’t lose networking on those devices.
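    One way the software side of this could look, assuming ser2net (version 4’s YAML config); the device path, TCP port, and baud rate are placeholders:

    ```
    # /etc/ser2net.yaml -- expose one device's serial console on TCP
    # port 2001; connect with e.g. `telnet consoleserver 2001`.
    connection: &nas-console
      accepter: tcp,2001
      connector: serialdev,/dev/ttyUSB0,115200n81,local
    ```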


  • You have remote power management set up for the systems in your homelab, right? A server set up that you can reach to power-cycle other servers, so that if they wedge in some unusable state and you can’t be physically there, you can still reboot them? A managed/smart PDU or something like that? Something like one of these guys?

    Oh. You don’t. Well, that’s probably okay. I mean, nothing will probably go wrong and render a device in need of being forcibly rebooted when you’re physically away from home.


  • You have squid or some other forward HTTP proxy set up to share a cache among all the Web-accessing devices on your network, to minimize duplicate traffic?

    And you have a shared caching DNS server set up locally, something like BIND?

    Oh. You don’t. Well, that’s probably okay. I mean, it probably doesn’t matter that your devices are pulling duplicate copies of data down. Not everyone can have a network that minimizes latency and avoids inefficiency across devices.
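    For the squid half, a minimal shared forward-cache sketch; the subnet and cache size are placeholder values:

    ```
    # squid.conf -- minimal shared forward cache for a home LAN.
    acl localnet src 192.168.1.0/24
    http_access allow localnet
    http_access deny all
    http_port 3128
    # 10 GB on-disk cache:
    cache_dir ufs /var/spool/squid 10000 16 256
    maximum_object_size 512 MB
    ```

    Point clients at port 3128 and they share one cache.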