drafts.csswg.org keeps blocking my IP, rate limits are too low #11354

Closed
Loirooriol opened this issue Dec 11, 2024 · 38 comments

@Loirooriol
Contributor

The server doesn't respond. Not even ping drafts.csswg.org works.

@plinss

@Loirooriol
Contributor Author

I can access it if I use the data connection from my phone, but I can't access it from my home connection. Maybe my IP is banned or something, I'm behind CGNAT :(

@svgeesus
Contributor

It is working for me.

I have been banned in the past though for requesting drafts too quickly :(

@Loirooriol
Contributor Author

Works again for me. Comparing traceroute -I drafts.csswg.org, previously it got stuck in ae22.gw4.scz1.netarch.akamai.com (23.203.158.53), now it reaches smtp.csswg.org (45.79.94.155) after that.

@plinss
Member

plinss commented Dec 11, 2024

You most likely triggered a CrowdSec block. This can happen for exceeding rate limits (among other things) and will block all traffic from your IP to the server for 4 hours.

If it keeps happening let me know what your IP address is and I can look at what caused the block, and adjust rate limits if needed. It's a balance between allowing normal access and blocking scrapers.

@svgeesus
Contributor

Image

@plinss

Individual drafts still work, but the overall drafts index is down.

@svgeesus svgeesus reopened this Jan 15, 2025
@svgeesus
Contributor

Still down. Individual drafts do work if you know the address, but the overall index is still missing.

@Loirooriol
Contributor Author

@plinss My IP seems blocked again, it's 77.75.177.5, I wonder if I'm sharing it with a bot due to CGNAT.

@Loirooriol Loirooriol reopened this Feb 6, 2025
@plinss
Member

plinss commented Feb 7, 2025

Your IP was blocked due to rate limiting violations. The only UA string associated with that IP in the logs is "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:134.0) Gecko/20100101 Firefox/134.0", so it doesn't look like a bot, just hitting too many specs too close together (each image load counts, but it does allow higher bursts for those). If you get a 429 error, just back off for a while to avoid an IP block.
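
For anyone hitting the drafts from a script rather than a browser, the "back off on 429" advice could be sketched roughly as below. This is only an illustration under assumptions: the helper names `backoffDelayMs` and `fetchPolitely`, the 2 second base delay, and the 5 minute cap are made up for this example, and nothing in this thread says whether the server sends a `Retry-After` header, so the code falls back to exponential backoff when it is absent.

```javascript
// Hypothetical helper: how long to wait before retrying after a 429.
// Assumption: if a Retry-After header is present in its delta-seconds form,
// it is honoured; the HTTP-date form is not handled in this sketch.
function backoffDelayMs(attempt, retryAfter = null, baseMs = 2000, maxMs = 5 * 60 * 1000) {
  if (typeof retryAfter === "string" && retryAfter.trim() !== "") {
    const seconds = Number(retryAfter);
    if (Number.isFinite(seconds) && seconds >= 0) return seconds * 1000;
  }
  // Exponential backoff: base, 2x base, 4x base, ... capped at maxMs.
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Hypothetical wrapper around the standard fetch() API that retries on 429.
async function fetchPolitely(url, maxAttempts = 4) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const response = await fetch(url);
    if (response.status !== 429) return response;
    const delay = backoffDelayMs(attempt, response.headers.get("Retry-After"));
    await new Promise((resolve) => setTimeout(resolve, delay));
  }
  throw new Error(`still rate limited after ${maxAttempts} attempts: ${url}`);
}
```

With these made-up defaults the waits would be 2 s, 4 s, 8 s and 16 s before giving up.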

@plinss plinss closed this as completed Feb 7, 2025
@Loirooriol
Contributor Author

I didn't see any error before getting blocked, though.

@Loirooriol
Contributor Author

Blocked again, same IP, these limits seem way too low.

@Loirooriol
Contributor Author

Loirooriol commented Feb 10, 2025

And blocked again. Pretty sure I only loaded https://drafts.csswg.org/css2/ and https://drafts.fxtf.org/filter-effects-2/. Then blocked when trying to load https://drafts.csswg.org/css-flexbox-1/

@Loirooriol Loirooriol reopened this Feb 10, 2025
@Loirooriol Loirooriol changed the title from "drafts.csswg.org is down" to "drafts.csswg.org keeps blocking my IP, rate limits are too low" Feb 10, 2025
@Loirooriol
Contributor Author

OK, ironically after whatwg/html#11005, I need to switch back to using GitHub Pages, e.g. https://w3c.github.io/csswg-drafts/css-flexbox/

I have written this Violentmonkey script to avoid the redirection to the official website:

// ==UserScript==
// @name        CSSWG drafts in Github Pages
// @namespace   https://github.com/Loirooriol/
// @match       https://w3c.github.io/csswg-drafts/*
// @grant       none
// @version     1.0
// @author      Oriol Brufau
// @description Avoids redirecting away from the CSSWG drafts hosted in Github Pages
// @run-at      document-start
// ==/UserScript==

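// Note: "beforescriptexecute" is a non-standard event that only Firefox supports.
addEventListener("beforescriptexecute", function listener(event) {
  // The script that performs the redirect is recognized by its use of "githubPrefix".
  if (event.target.textContent.includes("githubPrefix")) {
    event.preventDefault(); // cancel that script before it runs
    removeEventListener("beforescriptexecute", listener); // nothing else needs blocking
  }
});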

@Loirooriol
Contributor Author

And how long are these blocks?? I'm still blocked, almost 12h later.

@plinss
Member

plinss commented Feb 10, 2025

I increased the burst limit on images, this should help you while still catching crawlers.

I don't see your IP on the current block list, either it expired or you're using a different IP now.

Blocks are for 4 hours per violation, so the 4th time you get blocked is for 16 hours.
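
As a rough illustration of that arithmetic, here is a small sketch. It assumes the duration simply scales linearly with the number of blocks so far, which is what the 4th-block example implies; the function name `blockDurationHours` is invented for this example and is not taken from the server's configuration.

```javascript
// Assumption: each block lasts 4 hours multiplied by how many times
// the IP has been blocked so far (linear escalation).
const HOURS_PER_VIOLATION = 4;

function blockDurationHours(violationCount) {
  if (!Number.isInteger(violationCount) || violationCount < 1) {
    throw new RangeError("violationCount must be a positive integer");
  }
  return HOURS_PER_VIOLATION * violationCount;
}

for (const n of [1, 2, 3, 4]) {
  console.log(`block #${n}: ${blockDurationHours(n)} hours`);
}
// block #1: 4 hours
// block #2: 8 hours
// block #3: 12 hours
// block #4: 16 hours
```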

@plinss plinss closed this as completed Feb 10, 2025
@Loirooriol
Contributor Author

So maybe it's something else. My IP is still 77.75.177.5, and I can't reach drafts.csswg.org, e.g.

$ ping drafts.csswg.org
PING drafts.csswg.org (45.79.94.155) 56(84) bytes of data.
$ traceroute -I drafts.csswg.org
traceroute to drafts.csswg.org (45.79.94.155), 30 hops max, 60 byte packets
 1  localhost (192.168.1.1)  0.398 ms  0.467 ms  0.534 ms
 2  100.127.255.254 (100.127.255.254)  2.618 ms  2.665 ms  2.952 ms
 3  10.12.0.13 (10.12.0.13)  3.404 ms  3.476 ms  3.650 ms
 4  bcn-b1-link.ip.twelve99.net (62.115.36.96)  3.516 ms  3.591 ms  3.720 ms
 5  prs-bb2-link.ip.twelve99.net (62.115.136.120)  32.858 ms  32.954 ms  33.034 ms
 6  ash-bb2-link.ip.twelve99.net (62.115.140.107)  106.573 ms  104.916 ms  105.147 ms
 7  rest-b2-link.ip.twelve99.net (62.115.121.216)  105.492 ms  105.540 ms  105.586 ms
 8  akamai-ic-386429.ip.twelve99-cust.net (62.115.190.161)  105.395 ms  104.901 ms  104.929 ms
 9  ae3.r21.iad02.mag.netarch.akamai.com (23.209.165.106)  117.152 ms  117.337 ms  117.407 ms
10  ae20.r01.iad02.icn.netarch.akamai.com (23.209.165.91)  117.180 ms  117.205 ms  117.233 ms
11  ae30.r01.ord01.icn.netarch.akamai.com (23.32.62.83)  124.192 ms  124.277 ms  124.529 ms
12  ae16.r01.sjc01.icn.netarch.akamai.com (23.32.62.79)  174.875 ms  174.885 ms  174.928 ms
13  ae1.r12.sjc01.ien.netarch.akamai.com (23.207.232.37)  175.102 ms  175.144 ms  174.465 ms
14  ae22.gw4.scz1.netarch.akamai.com (23.203.158.53)  170.736 ms  170.764 ms  170.756 ms
15  * * *
16  * * *
17  * * *
18  * * *
19  * * *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *

@Loirooriol
Contributor Author

OK it has been fixed just now, I guess you did something, thanks!

@Loirooriol Loirooriol reopened this Apr 1, 2025
@Loirooriol
Contributor Author

Happening again (still same IP).

@AtkinsSJ
Contributor

AtkinsSJ commented Apr 1, 2025

I've frequently been hitting this lately, but only just came across this issue. I just assumed the server was overloaded or something. Is there some way the limits can be raised generally? All I'm doing is opening multiple specs at a time, to cross-reference things.

I'm also using Firefox on Linux, maybe that combination causes spammy requests?

EDIT: My IP is 92.29.234.208 in case that's helpful.

@AtkinsSJ
Contributor

AtkinsSJ commented Apr 2, 2025

I brought this up with Ladybird developers and a couple of other people have been caught by this, just for trying to look at specs to implement them. Obviously I'm completely unaware of what level of scraping/bot traffic you were getting without this, so I don't know how bad it is - but if the blocking can be relaxed then we'd all appreciate it!

@sideshowbarker sideshowbarker reopened this Apr 5, 2025
@sideshowbarker
Member

Multiple implementor contributors to the Ladybird project have been repeatedly running into this problem. Please fix whatever the root cause of it is so that this doesn't keep happening. We don't want to keep dealing with it case by case. The CSS WG drafts are the only specs we're running into this problem with. We're able to access all other W3C specs without running into rate limiting. So all that's needed here is to minimally make the behavior for accessing CSS WG drafts be the same as for all other W3C drafts.

@sideshowbarker
Member

I'd also like to request that this be added to the agenda for the next working-group meeting, and I'd be happy to join the meeting to discuss the problem and how the group plans to fix it.

@plinss
Member

plinss commented Apr 5, 2025

The "root cause" is that the server is constantly getting hit by hundreds of crawlers that ignore robots.txt. The traffic also spikes when multi-billion dollar companies, who don't contribute to the draft servers at all, post links to drafts in press and social media releases touting their latest features, or even crawl the server themselves to feed their LLMs.

While the drafts are currently served by two fairly robust servers, they're not able to handle the load without multiple layers of bot defenses. Rate limiting, followed by IP blocks, has been an effective tool to that end and is not going to get turned off any time soon.

The rate limits for regular drafts are currently 1 request per second with bursts of up to 50 requests; images allow bursts of up to 500. Multiple rate limit violations are required before IP blocks take effect. If Ladybird contributors are hitting that limit, then they're either behind a single NAT or proxy, or they need to rethink their draft access. If you want to provide specific IPs, I can look at the logs and see what's going on. I can also add a few IPs, or a small range to the blocking safelist (provided they're not abusing the server themselves).
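
To make "1 request per second with bursts of up to 50" concrete, here is a minimal token-bucket sketch. It is an assumption that the limit behaves like a token bucket at all; the real server may use a different algorithm (a leaky bucket, for instance) with slightly different edge behavior, and the `TokenBucket` class is invented purely for illustration.

```javascript
// Illustrative token bucket; NOT the server's actual implementation.
class TokenBucket {
  constructor(ratePerSecond, burst) {
    this.rate = ratePerSecond; // tokens refilled per second
    this.capacity = burst;     // maximum tokens that can accumulate
    this.tokens = burst;       // start with a full bucket
    this.last = 0;             // time of the previous request, in seconds
  }

  // Returns true if a request arriving at time `now` (in seconds) is allowed.
  allow(now) {
    const elapsed = Math.max(0, now - this.last);
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.rate);
    this.last = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false; // this is where a server would answer 429
  }
}

// Numbers quoted above for regular drafts: 1 request/second, burst of 50.
const drafts = new TokenBucket(1, 50);
let allowed = 0;
for (let i = 0; i < 60; i++) {
  if (drafts.allow(0)) allowed++; // 60 requests arriving at the same instant
}
console.log(allowed);         // 50
console.log(drafts.allow(1)); // true: one token refilled after a second
console.log(drafts.allow(1)); // false: the bucket is empty again
```

Under this model, opening several specs at once drains the bucket quickly, because every subresource request counts against it, and it then refills at only one request per second.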

As far as "addressing the root cause", you can either get everyone to stop crawling the site, get someone to start sponsoring the hosting so I can scale up the servers, or find someone to take over hosting it entirely. I'm more than happy for someone else to take over, but anyone offering to do so, or replace the server with a different solution, would be best served by starting out asking what the server is really hosting and what kinds of loads it's under, before proposing half-assed solutions.

I'm currently paying about $150/month out of my own pocket for the servers, for no personal gain, and have been doing so for years. And that's not taking my time into account, which has been considerable. So maybe being a bit less demanding here would also be a good place to start.

@plinss plinss closed this as completed Apr 5, 2025
@sideshowbarker
Member

On behalf of the W3C team, I am conveying that we do not consider this resolved.

Please don't close this issue again until we have agreement with the working group and with the W3C team that it's actually resolved.

@sideshowbarker sideshowbarker reopened this Apr 5, 2025
@plinss
Member

plinss commented Apr 5, 2025

On behalf of the un-sponsored person paying for the server, doing all the work, and currently having their time wasted, I am conveying that I consider your tone and behavior here inappropriate and unprofessional.

This issue was resolved on April 2 at 1:08PM PDT when the rate limits were last adjusted.

You have provided no information that anyone has experienced a rate-limit based IP block since then. What information you have provided is vague, anecdotal, and not actionable.

Please do not reopen this issue without first confirming that: legitimate users (i.e. not bots) are currently blocked, those blocks have started since the rate limits were last adjusted (see above where repeated blocks result in longer bans), and providing the date and time the block started and the IP address(es) affected. Otherwise there's no actionable information and nothing to do here. If there are left over long-term blocks from before the rate limits last changed, I also need IP addresses to release those.

If you want to start a general conversation with the WG about replacing the draft server, feel free to do so. I welcome it. File another issue for that, and please begin the conversation with a practical plan, a source of funding, or replacement servers, not just complaints about the status quo.

This issue is about IPs being blocked due to rate limit violations. Stop hijacking it.

@svgeesus
Contributor

svgeesus commented Apr 7, 2025

> I'd also like to request that this be added to the agenda for the next working-group meeting, and I'd be happy to join the meeting to discuss the problem and how the group plans to fix it.

@sideshowbarker See

@svgeesus
Contributor

svgeesus commented Apr 7, 2025

> anyone offering to do so, or replace the server with a different solution, would be best served by starting out asking what the server is really hosting and what kinds of loads it's under, before proposing half-assed solutions.

Good point. Do you have that information to hand?

@ADKaster
Contributor

> Please do not reopen this issue without first confirming that: legitimate users (i.e. not bots) are currently blocked, those blocks have started since the rate limits were last adjusted (see above where repeated blocks result in longer bans), and providing the date and time the block started and the IP address(es) affected. Otherwise there's no actionable information and nothing to do here. If there are left over long-term blocks from before the rate limits last changed, I also need IP addresses to release those.

For a concrete example of rate limits being too-aggressively applied, I visited https://drafts.csswg.org/css-fonts-4/ at around 9:10 am UTC today from a link on mdn (https://developer.mozilla.org/en-US/docs/Web/CSS/@font-face#formal_syntax).

I immediately realized "oh, I wanted the specific link to font face lower down in the mdn page", so I closed the tab, and clicked on the link to https://drafts.csswg.org/css-fonts/#font-face-rule from https://developer.mozilla.org/en-US/docs/Web/CSS/@font-face#specifications instead. I was hit with an immediate 429.

@svgeesus
Contributor

svgeesus commented Apr 21, 2025

Blocked from 98.217.98.87 at 14:27 EST 21 April 2025 while attempting to access https://drafts.csswg.org/css-color-5/#absolute-color
Block resolved by 14:29

@svgeesus svgeesus reopened this Apr 21, 2025
@plinss
Member

plinss commented Apr 22, 2025

The only log entries I find for that IP on the draft server are:

98.217.98.87 - - [17/Apr/2025:07:20:47 -0700] "GET /css3-images/ HTTP/2.0" 302 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko)"
98.217.98.87 - - [17/Apr/2025:07:20:48 -0700] "GET /css-images-3/ HTTP/2.0" 200 75828 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko)"
98.217.98.87 - - [17/Apr/2025:07:20:51 -0700] "GET /css3-images/ HTTP/2.0" 302 0 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 18_4 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/18.4 Mobile/15E148 Safari/604.1"
98.217.98.87 - - [17/Apr/2025:07:20:52 -0700] "GET /css-images-3/ HTTP/2.0" 200 75828 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 18_4 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/18.4 Mobile/15E148 Safari/604.1"

No 429's, no IP blocks. Are you sure about that IP?

Also, if the rate limit defenses kicked in you'd have been blocked for a minimum of 4 hours. A 2 minute block must have been something else.
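
The kind of check described above, scanning one IP's entries for 429 responses, could be sketched like this. The regular expression assumes the access-log layout visible in the entries quoted above; `parseLogLine` and `countStatuses` are hypothetical helper names, not tools actually used on the server.

```javascript
// Sketch only: matches the log layout seen in the entries quoted above:
// ip ident user [time] "method path protocol" status bytes "referer" "user-agent"
const LOG_LINE = /^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) (\S+)" (\d{3}) (\d+|-) "([^"]*)" "([^"]*)"$/;

function parseLogLine(line) {
  const m = LOG_LINE.exec(line);
  if (m === null) return null; // line does not fit the assumed layout
  return { ip: m[1], time: m[2], method: m[3], path: m[4], status: Number(m[6]), userAgent: m[9] };
}

// Tallies response status codes for a single client IP.
function countStatuses(lines, ip) {
  const counts = new Map();
  for (const line of lines) {
    const entry = parseLogLine(line);
    if (entry === null || entry.ip !== ip) continue;
    counts.set(entry.status, (counts.get(entry.status) ?? 0) + 1);
  }
  return counts;
}

// The first two entries quoted above, copied verbatim.
const sample = [
  '98.217.98.87 - - [17/Apr/2025:07:20:47 -0700] "GET /css3-images/ HTTP/2.0" 302 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko)"',
  '98.217.98.87 - - [17/Apr/2025:07:20:48 -0700] "GET /css-images-3/ HTTP/2.0" 200 75828 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko)"',
];

const counts = countStatuses(sample, "98.217.98.87");
console.log(counts.get(429) ?? 0); // 0
```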

@plinss
Member

plinss commented Apr 22, 2025

> For a concrete example of rate limits being too-aggressively applied, I visited https://drafts.csswg.org/css-fonts-4/ at around 9:10 am UTC today from a link on mdn (https://developer.mozilla.org/en-US/docs/Web/CSS/@font-face#formal_syntax).

@ADKaster I need an IP address to be able to check the logs. Also, did you just get a 429 response or was your IP blocked?

@Loirooriol
Contributor Author

I just got blocked while trying to add myself into https://wiki.csswg.org/planning/paris-2025
IP: 77.75.177.5

@plinss
Member

plinss commented May 1, 2025

I released that block and increased the burst limit on the wiki (which was lower than the draft server)

@plinss plinss closed this as completed May 1, 2025
@Loirooriol
Contributor Author

Yeah, I could reload the page. When trying to save, I got blocked again.

@Loirooriol Loirooriol reopened this May 1, 2025
@plinss
Member

plinss commented May 1, 2025

It helps if I actually reload the server so the new configuration kicks in...

@Loirooriol
Contributor Author

I'm blocked again, adding myself to the wiki is impossible

@Loirooriol Loirooriol reopened this May 1, 2025
@plinss
Member

plinss commented May 1, 2025

I increased the rate limit on the wiki some more and unblocked you

@plinss plinss closed this as completed May 1, 2025