• mesa@piefed.social
    11 days ago

    Google and OpenAI suck:

    Google’s legal theory has another significant problem: the requirement that a TPM must “effectively control” access. Just last week, a court rejected Ziff Davis’s attempt to turn robots.txt into a 1201 violation when OpenAI allegedly ignored its crawling restrictions. The court’s reasoning is directly applicable here:

    OpenAI slammed my small server into the ground until I put fail2ban in front of it. It was really bad, like thousands of requests per second bad.

    • apftwb@lemmy.world
      11 days ago

      How does fail2ban prevent scraping? My understanding was that fail2ban works on failed login attempts.

      • mesa@piefed.social
        11 days ago

        There are some premade scripts out there that make it do more. I have it hooked up to nginx and other such logs. It's commonly used to catch failed logins on web login portals too, not just ssh. It can work with any grep-able log file.

        I just took two scripts other people had made, verified they worked on my mini PC, and set it loose. Within about 10 minutes it caught most scrapers and banned their IPs.
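
        For anyone wondering how that works in practice, the idea is roughly: match each log line with a regex, count matches per IP inside a time window, and ban anything over a threshold (fail2ban calls these settings findtime and maxretry). Here is a rough Python sketch of just the counting part. It is not one of the scripts I used and not fail2ban's actual code; `find_offenders`, `max_hits`, and `window` are made-up names, and it assumes nginx's default "combined" access-log format with lines in time order.

```python
import re
from collections import defaultdict, deque
from datetime import datetime, timedelta

# Start of an nginx "combined" access-log line: client IP, two fields, [timestamp].
# Assumption: the default log format is in use.
LINE_RE = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\]')


def find_offenders(lines, max_hits=100, window=timedelta(seconds=10)):
    """Return the IPs that made at least max_hits requests within one window.

    Hypothetical helper for illustration; assumes lines are in chronological order.
    """
    recent = defaultdict(deque)  # ip -> timestamps still inside the window
    offenders = set()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not look like access-log entries
        ip = match.group("ip")
        if ip in offenders:
            continue  # already flagged, no need to keep counting
        ts = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        hits = recent[ip]
        hits.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while ts - hits[0] > window:
            hits.popleft()
        if len(hits) >= max_hits:
            offenders.add(ip)
    return offenders
```

        The real thing goes one step further and runs a ban action (usually a firewall rule) for each flagged IP, then lifts it after a configured ban time. This sketch only tells you who would get flagged.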