Sciencemadness Discussion Board

Issues Loading the SM Website

chempyre235 - 24-10-2025 at 05:27

I've been having problems loading pages on this site for the past three or four days, and based on the short list of visitors in the last 24 hours, I suspect I'm not alone. Do any of the mods know what the problem is?

Radiums Lab - 24-10-2025 at 06:24

I am also facing similar issues for the past few days.

DraconicAcid - 24-10-2025 at 13:18

Same here (on multiple computers).

RedDwarf - 24-10-2025 at 15:21

I've been having the same issue for several months: the website becomes unavailable for a period of time (greater than an hour, sometimes several).

Admagistr - 24-10-2025 at 17:07

Yes, it happens to me too. It's unpleasant...

esquizete_electrolysis - 26-10-2025 at 12:56

I've noticed it as well. Usually the https side of the site is unloadable, but the http service stays up most of the time.

chempyre235 - 27-10-2025 at 06:21

It seems, at least for now, that the loading issues have subsided over the weekend.

Sulaiman - 27-10-2025 at 07:31

I have had problems with SM lately - something has changed recently.
Also, if I have more than one tab open with SM, things may go really slow.
(so, check for unused tabs)

I am using Chrome on Android.

BromicAcid - 27-10-2025 at 16:17

If I recall correctly, this was brought up a few months ago; Polverone stepped in and said the issue is that the site is being constantly scraped for AI learning/training, and that is overwhelming the server.

It was mentioned in this thread:
https://www.sciencemadness.org/whisper/viewthread.php?tid=16...

[Edited on 10/28/2025 by BromicAcid]

chempyre235 - 28-10-2025 at 06:34

I'd forgotten about that until you brought it up (I even posted in that thread :D). I saw that someone had suggested using a CAPTCHA, but services like Cloudflare are designed to limit bot traffic and speed up websites and networks without making users complete challenges. I've got no idea of pricing, though. However, it seems to be commonly used even on sites whose hosts I wouldn't expect to have much funding, so maybe it's feasible.

Twospoons - 19-11-2025 at 12:16

This is still a problem. I've been unable to access SM for days at a time.

mayko - 19-11-2025 at 18:59

might be the first time I've seen this... :(

scimad_ddosd.png - 76kB

violet sin - 19-11-2025 at 20:46

I've had trouble as well. I could usually get in about once a day, but later in the day it would time out over and over. Sometimes it's been a few days in a row; I always chalked it up to my poor coverage and Mint Mobile service. But it sounds like a larger problem, and my suspicions were anecdotal at best.

j_sum1 - 19-11-2025 at 22:22

Not ideal, but I have found a workaround that works much of the time.

View the http version of the site instead of https.

[Edited on 20-11-2025 by j_sum1]

bnull - 22-11-2025 at 05:14

Frigging flaming hell. I was reading a post and clicked to open one of the attachments in a new tab when, all of a sudden, I got a 403 Forbidden everywhere in the tree. Emptied the cache, tried a private tab. Nothing.

This is the first time I've seen a 403 here. As I write, I'm accessing the forum through a VPN (I'm not in Japan).

Radiums Lab - 22-11-2025 at 05:34

@j_sum1 mentioned a workaround for this in the Secret Santa thread; it works well for me.

[Edited on 22-11-2025 by Radiums Lab]

bnull - 22-11-2025 at 08:05

I had been using that trick occasionally, but it does nothing about the 403.

Keras - 22-11-2025 at 11:33

It seems to have been better for the past couple of days. Could the server have run out of VM?

beta4 - 23-11-2025 at 06:18

After a long time away from this forum, I tried to connect back today and kept getting "Connection timeout". At first I was worried this corner of the internet was gone, but after some more digging I found that the website had just become so slow that the browser times out every time.

For those having the same problem, here's how I "fixed" it. I'm using Firefox; I went to about:config and increased the following timeouts to 300 seconds:

network.http.connection-timeout
network.http.tls-handshake-timeout

Although with this trick I can log in and browse the forum, it is so slow that every page load takes minutes, so it's far from a real fix.


Since it looks like the slowdown is due to AI bots crawling the site massively, may I suggest that the sysadmins install Anubis?

https://anubis.techaro.lol/

It's a firewall designed specifically to stop AI crawlers, and I've seen it used effectively for other websites sharing the same problem. I've never installed it or configured it though, so I do not have direct experience with it.
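From skimming the docs, the idea seems to be to chain it in front of the web server as a reverse proxy, so every visitor solves a small proof-of-work challenge before reaching the forum. A rough, untested sketch of the nginx side (the port numbers are placeholders I made up):

Code:

# Untested sketch: nginx hands forum traffic to Anubis, which challenges
# the client and then proxies verified visitors on to the real forum
# backend. Both ports below are illustrative placeholders.
server {
    listen 443 ssl;
    server_name www.sciencemadness.org;

    location / {
        proxy_pass http://127.0.0.1:8923;   # assumed Anubis listen port
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

Anubis itself would then be pointed at the actual forum process; the exact settings would come from its documentation, not from me.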

Radiums Lab - 5-12-2025 at 19:31

The forum is finally running again after several days. I was scared that it was dead.

greenlight - 5-12-2025 at 20:08

Are the downtimes going to get increasingly longer until one day it never comes back?

I was quite worried for those three days too.

Radiums Lab - 5-12-2025 at 20:19

I noticed one thing, though: Polverone was online after many days. He might have fixed it.

bnull - 6-12-2025 at 02:54

Here's another thing: the number of "guests" goes from 300-400 when the forum is running smoothly to over 1200 when it becomes unstable.

I hope this AI bubble bursts and takes the parent companies down with it.

[Edited on 6-12-2025 by bnull]

BromicAcid - 6-12-2025 at 05:56

Quote: Originally posted by bnull  
Here's another thing: the number of "guests" goes from 300-400 when the forum is running smoothly to over 1200 when it becomes unstable.

I hope this AI bubble bursts and takes the parent companies down with it.


It is insane to me. Our forum used to have at most 20 or so guests on at a time; now it's never out of the hundreds, and of course these are not real people. Sadly, we used to have more members active than guests at any given time, but now it's usually just one or two of us on the main page.

bnull - 7-12-2025 at 02:35

The last time I checked yesterday we had 268 "guests". Right now we have 558.

bnull - 8-12-2025 at 07:44

704 "guests": http plus talk works, https plus whisper does not.

chempyre235 - 10-12-2025 at 12:32

I've just now been able to get on the SM forum pages. As of even this morning, it hadn't been working. I was able to log on about five days ago, then got endless loading and error messages until now.

Not even the http/talk trick had been working.

[Edited on 12/10/2025 by chempyre235]

Radiums Lab - 10-12-2025 at 16:08

I used to hate that human-verification thing on other websites; now I understand its importance.

RogueRose - 12-12-2025 at 20:48

I was having problems loading the site for at least 12 days or so; my browser history only goes back that far b/c of an OS reload. This is the first time in at least that long that I've been able to load the site. Does anyone know how long the site was having problems? I'm in the US, if that matters.

jackchem2001 - 2-1-2026 at 04:38

I think the site has always had occasional connectivity problems; you just have to try again later. Not a problem.

bnull - 2-1-2026 at 06:34

No problem in the last two weeks. Number of guests has been below 500 as far as I know.

Twospoons - 14-1-2026 at 13:13

Site unreachable yesterday, repeated timeouts today. I guess we're getting hammered by bots again.

Dr.Bob - 14-1-2026 at 17:54

Is there anything we can do to help? Or could Polverone or someone create a backup site or an email list to connect through if the site is down?

Radiums Lab - 14-1-2026 at 19:45

The guest count is more than 500. That explains the site being down.

bnull - 15-1-2026 at 03:15

I knew we were close to an event when the number of leeches jumped from 320 to 580. It would be interesting if we could plot the number of leeches as a function of time. Maybe there is a pattern, maybe not.
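Something like this little Python script could collect the data (a rough sketch on my part: it assumes the front page shows text like "562 guests", and that regex would need checking against the real markup):

Code:

# Rough sketch: poll the forum front page every 15 minutes and append
# the "guests online" count to a CSV for plotting later. The URL and
# the regex are assumptions to be checked against the actual page.
import csv
import re
import time
from datetime import datetime, timezone

import requests

URL = "http://www.sciencemadness.org/talk/"   # http often works when https won't
GUESTS = re.compile(r"(\d+)\s+guest", re.IGNORECASE)

def poll_once():
    """Return the current guest count, or None if the page won't load."""
    try:
        page = requests.get(URL, timeout=60).text
    except requests.RequestException:
        return None
    match = GUESTS.search(page)
    return int(match.group(1)) if match else None

with open("leeches.csv", "a", newline="") as f:
    writer = csv.writer(f)
    while True:
        writer.writerow([datetime.now(timezone.utc).isoformat(), poll_once()])
        f.flush()
        time.sleep(900)   # one sample every 15 minutes

Then a pattern would show up (or not) in a simple plot of the CSV.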

BromicAcid - 15-1-2026 at 17:27

Someone's gotta train the AI models, it might as well be us :D

I'm sure lots of issues like this are getting buried in the models thanks to us:

https://www.youtube.com/watch?v=W2xZxYaGlfs

[Edited on 1/16/2026 by BromicAcid]

bnull - 15-1-2026 at 17:42

Just imagine if they had access to Detritus.

521 "guests" right now.

[Edited on 16-1-2026 by bnull]

bnull - 21-1-2026 at 12:07

1611 leeches ten hours ago, 862 now.

Twospoons - 21-1-2026 at 12:34

Is there no way to limit guest bandwidth or concurrent numbers? Is the server software too old for this?

BromicAcid - 21-1-2026 at 14:35

Quote: Originally posted by Twospoons  
Is there no way to limit guest bandwidth or concurrent numbers? Is the server software too old for this?


I have to wonder the same thing, or at least what hosting is costing now. We had a hosting crunch several years back; are we burning through money or goodwill now? Do we need to start locking things down? Is this becoming untenable? I couldn't access the site this morning; more than 1000 of those buggers.

bariumbromate - 21-1-2026 at 18:01

Polverone could try Nepenthes? https://zadzmo.org/code/nepenthes/

It might deter AI scrapers?

pesco - 24-1-2026 at 21:49

The site is unreachable most of the time. When it does become accessible, posting fails every time.
It's becoming unusable :(

Please limit the number of guests allowed simultaneously, or allow only logged-in users.

bnull - 25-1-2026 at 03:24

What about limiting how quickly a client is allowed to move between threads? Human operators take seconds to do that; leeches take fractions of a second.
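In nginx terms, for instance, that might look something like the sketch below (purely illustrative; I don't know what the server actually runs, and the numbers are guesses):

Code:

# Illustrative only: allow each IP roughly one page per second, with a
# small burst so a human opening a few tabs isn't punished. A leech
# firing many requests per second gets 429 responses instead.
# (limit_req_zone goes in the http{} block of a real config.)
limit_req_zone $binary_remote_addr zone=perip:20m rate=1r/s;

server {
    listen 80;
    server_name www.sciencemadness.org;

    location /whisper/ {
        limit_req zone=perip burst=10 nodelay;
        limit_req_status 429;
        proxy_pass http://127.0.0.1:8080;   # assumed forum backend
    }
}

It wouldn't stop a scraper that rotates thousands of IPs, but it would blunt the worst of the hammering.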

yobbo II - 25-1-2026 at 16:03


Hello,

[img]https://www.sciencemadness.org/whisper/files.php?pid=516736&aid=67913[/img]

At least the spammers could not (effectively) lock you out of the site.

Shoot The Dog.jpg-thumb.jpg - 20kB Spammer Meme.jpg - 203kB saw.jpg - 16kB

These joyous activities have no effect on Bots.
Welcome to modernity!!
What to do?

Yob

j_sum1 - 25-1-2026 at 19:20

This is a frustrating problem and I agree with the bulk of what has been said here. The solution may be as simple as a captcha to access the site.

It does expose a bigger problem, one which has been discussed but for which there is no current solution.
Polverone basically has the keys to the board. It began as his baby and he is really the only one who can implement changes. However, his visits are extremely infrequent; his life has moved on. Practically, this means the board is vulnerable to all manner of potential problems, and not all of them can be solved through software upgrades.

My thoughts are that a steady stream of emails from members may be the best way to catch his attention. Nothing big or spammy. Just a simple request in the subject line.

pesco - 25-1-2026 at 22:18

A captcha might not work.
It's easy to train AI to pass a captcha, and I suspect the AI crawlers have that ability built in already.

Only three solutions I can see working for sure:
- only logged-in users, but that makes the forum unavailable to the wider public
- extensive filters/ACLs, but those take time and effort to implement
- the best one, next post :D

Although I specialised in embedded systems and networking rather than web server configuration, I have had some experience with a bunch of Apache installs and could help with the filters/ACLs as time allows.

Something has to be done, as the forum becomes unusable more and more often, and it will only get worse as the AI hype grows and ever smaller entities try their luck in this tech.
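As a taste of what I mean by filters: something like this in Apache 2.4 would turn away the AI crawlers that announce themselves by User-Agent (an untested sketch; the bot list is just a small sample and the path is assumed):

Code:

# Untested sketch: tag requests whose User-Agent matches known AI
# crawlers (small illustrative sample), then refuse them access to
# the forum. Needs mod_setenvif and Apache 2.4 Require syntax.
BrowserMatchNoCase "GPTBot|ClaudeBot|CCBot|Bytespider|meta-externalagent" ai_bot

<Location "/whisper">
    <RequireAll>
        Require all granted
        Require not env ai_bot
    </RequireAll>
</Location>

Of course this only stops bots that identify themselves honestly; anything spoofing a browser User-Agent sails straight through, which is why rate limits would still be needed on top.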

pesco - 25-1-2026 at 22:46

Best solution: make a mirror, then keep one login-only, excluding AI crawlers, and the other on free public access.

That way we can comfortably read and post on the members-only version while the crawlers are free to clog up the public copy.

I don't have private servers anymore, but I do have one virtual server on IONOS, and if the specs are good enough I am happy to put a mirror on it. I use it only for my email server, so I am not bothered about space or bandwidth. Email is not interactive anyway, and its space quota is separate from the web space quota.

bnull - 26-1-2026 at 11:02

Quote: Originally posted by j_sum1  
It does expose a bigger problem, one which has been discussed but for which there is no current solution.
Polverone basically has the keys to the board. It began as his baby and he is really the only one who can implement changes. However, his visits are extremely infrequent; his life has moved on. Practically, this means the board is vulnerable to all manner of potential problems, and not all of them can be solved through software upgrades.

I wonder if he would agree to setting up a board of trustees.

Twospoons - 26-1-2026 at 13:25

The bigger problem is these AI crawlers chewing up server bandwidth that has to be paid for by someone. Hosting ain't free.
They're getting this AI training data for nothing, and this is part of an internet-wide issue of AI gobbling up people's hard work without compensation, then using it to make money. It's theft, basically.

Twospoons - 29-1-2026 at 12:39

814 'guests'?! This is getting out of hand. Reading around, it seems we're not the only ones getting hit by aggressive bots.

bnull - 29-1-2026 at 14:43

1454 now and the forum is still usable. Weird.

bnull - 31-1-2026 at 19:51

All right, this one's got to be a joke.

leeches.jpg - 227kB

And yet the forum was running smoothly, https and whisper and all.

charley1957 - 1-2-2026 at 07:12

Today is the first day I've been able to get on the website in a week. I kept getting timeouts and "server stopped responding" messages.


yobbo II - 3-2-2026 at 17:26

01:17 GMT 4 Feb


Were there normally (before the bots) this many guests?

The bots may have registered?

Yob

guests.gif - 15kB

bnull - 4-2-2026 at 04:06

The guests are bots, or rather the majority of them are. Count 20 of the guests as real people; what remains will be those bloody leeches.

bnull - 6-2-2026 at 03:14

Same joke again. Ha ha. :mad:

More leeches.jpg - 223kB

Different scrapers?

charley1957 - 6-2-2026 at 04:25

Wow, second day in a row I’ve been able to get in. Fast too, instantly. Are there that many bots out there all using this website at one time? I’m not real up on how all that works, what the bots are for, who runs them, etc. It’s my partial understanding that at least some of them are here to get data for training AI models. That’s just from some of the discussion in here about it. But are they all here doing that, all at one time?

bnull - 10-2-2026 at 14:51

Stable. Number of 'guests' below 800.

bnull - 22-2-2026 at 02:37

Well...
leeches21022026.jpg - 234kB

Twospoons - 22-2-2026 at 17:13

Certain times seem to be worse than others: I can rarely load the website before midday (local time is GMT+12).

esquizete_electrolysis - 23-2-2026 at 17:18

I haven't been able to load the site in the last week, but I had an idea during that time. Why not implement a captcha to view the site? It's better than having to log in to view it, and many other forums (albeit on more modern codebases, mostly XenForo) have used one to great success. It may have the downside of making threads harder to find via Google, but I reckon that's a fair trade-off for being able to use the site more than once a week.

Edit: I wanted to flesh out my thought a bit more so this doesn't seem like a piss in the wind. Since the bulk of the traffic is likely AI crawlers, some simpler measures could be tried before a captcha. I also don't particularly care for most captcha providers, since Google datamines as much as possible and Cloudflare has a tendency to pull service for nebulous reasons.

The simplest is an updated robots.txt. If it's mostly crawlers from the most popular sources, this should take care of the bulk. No doubt one is already in use, but it may need updating; GitHub typically hosts decent block lists for this.

If that doesn't work, the next step would be a noindex tag on most pages (a sketch of how that could be served is at the end of this post). That would unfortunately neuter the ability to search for relevant info in our more expansive threads, though it could be applied to body text only, so post titles would still remain searchable.

If neither of those worked, it would be worth checking whether the site is actively being attacked. At that point, implementing a captcha along with a DDoS-protection framework like CrowdSec could be a decent way of mitigating the issue. It's unlikely it's this deep, but who knows; there may be a long-disgruntled loser who wants to take down this wonderful resource.

Finally, scorched earth: put the noindex tag on every page and require a trust token to access anything other than the login page. This could be expanded to a list of trusted IPs, or cookies granting a trust token. All of that is probably overkill, since it amounts to requiring users to log in to use the site.
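For the noindex option, it wouldn't even need changes to the forum code; a response header at the server level would do. A sketch, assuming Apache with mod_headers (the URL pattern is my guess at the thread pages):

Code:

# Sketch: serve a noindex hint on thread pages via mod_headers.
# Honest search engines drop the pages from their index; crawlers
# that ignore the header are unaffected, so this is only step one.
<LocationMatch "^/whisper/viewthread\.php">
    Header set X-Robots-Tag "noindex"
</LocationMatch>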

[Edited on 24-2-2026 by esquizete_electrolysis]

bnull - 24-2-2026 at 04:14

The leeches always find a workaround. Solving captchas is within their abilities. And robots.txt is not a rule; crawlers comply only if they were instructed to do so. A gentlemen's agreement, in fact.

If it were an attack, there would be more than just plain unavailability. If you have read older threads, there was a guy who messed up posts and changed passwords and such until he was caught. I can't remember which post it was.

Is there an inverse relation between leech number and ease of access?


leeches_20260224.jpg - 124kB

esquizete_electrolysis - 25-2-2026 at 16:30

Quote: Originally posted by bnull  
The leeches always find a workaround. Solving captcha is within their abilities. robots.txt is not a rule, the crawlers comply if they were instructed to do so. A gentlemen's agreement, in fact.


Perhaps, but compliance is necessary for any large company to continue its operations. Legal precedent (at least in the US) has been set that treats noncompliance as either trespassing (kinda odd) or a DMCA violation.

I did look, and we do have a decent robots.txt; however, it only blocks some of the large crawlers: 11 to be precise, with only 4 of those being AI-specific. Of those 4, only 1 is still in major use, while the other 35 crawlers that make up 90% of crawler traffic are not blocked.

I attached both ours (as a quote, since it's short) and a random AI megalist I pulled off GitHub for comparison.
Quote:

User-agent: Amazonbot
User-agent: Amazonbot/0.1
User-agent: GPTBot **Note: This is the only blocked one of their 6 bots.
User-agent: Applebot
User-agent: SemrushBot
User-agent: Bytespider
User-agent: meta-externalagent
User-agent: coccocbot-web
User-agent: AhrefsBot
User-agent: PetalBot
User-agent: BLEXBot


Attachment: robots.txt (3kB)
This file has been downloaded 20 times
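For anyone unfamiliar with the format: these lists are just groups of User-agent lines followed by a Disallow, one group per policy. The names below are a few real AI crawlers, as an example of the kind of entries the megalist adds:

Code:

# Example robots.txt entries (a small sample, not the full megalist).
# Grouped user-agents share the Disallow rule that follows them.
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: CCBot
User-agent: Google-Extended
User-agent: PerplexityBot
Disallow: /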


chempyre235 - 26-2-2026 at 08:38

It's been almost two weeks since I was last able to access the forum (I was getting a bit worried). I now see that the sheer number of bots constantly on this forum is far past what this site was designed for. As I type this, there are well over 900 users on the site, with only three members (including myself) logged in.

macckone - 26-2-2026 at 10:49

Compliance with robots.txt is necessary under the Computer Fraud and Abuse Act. It is technically illegal to exceed the authority granted to access a website. They can also be sued for damages, which at this point are substantial.

bnull - 26-2-2026 at 14:26

Quote:
Compliance with robots.txt is necessary under the computer fraud and abuse act.

No. First of all, robots.txt was created in 1994 and
Quote:
"is not an official standard backed by a standards body, or owned by any commercial organisation. It is not enforced by anybody, and there no guarantee that all current and future robots will use it. Consider it a common facility the majority of robot authors offer the WWW community to protect WWW server against unwanted accesses by their robots." (From https://www.robotstxt.org/)

As I wrote before, compliance with robots.txt is based on a gentlemen's agreement, which the leeches are not obliged to honor.

The Computer Fraud and Abuse Act was enacted in 1986, almost a decade before robots.txt. It does not define what constitutes unauthorized access.

Quote:
It is technically illegal to exceed authority granted to access a website.

Yes, but unauthorized access requires that the leeches scrape content that is visible only to logged-in members (the u2u system and the References, Whimsy, and Detritus subforums). Content that is visible to the general public without a username and password (the rest of the subforums) does not fall under that definition. From Van Buren v. United States (attachment, page 5):
Quote:
We must decide whether Van Buren also violated the Computer Fraud and Abuse Act of 1986 (CFAA), which makes it illegal "to access a computer with authorization and to use such access to obtain or alter information in the computer that the accesser is not entitled so to obtain or alter."

He did not. This provision covers those who obtain information from particular areas in the computer—such as files, folders, or databases—to which their computer access does not extend. It does not cover those who, like Van Buren, have improper motives for obtaining information that is otherwise available to them.

robots.txt is not legally binding, so whether the leeches comply with it or not is immaterial for the definition of unauthorized access.

Attachment: Van Buren.pdf (207kB)
This file has been downloaded 29 times

BromicAcid - 26-2-2026 at 19:26

Since we're so popular, poison?

https://rnsaffn.com/poison3/

Edit: I suppose we have Detritus, though, which is about the same thing.

[Edited on 2/27/2026 by BromicAcid]

Twospoons - 26-2-2026 at 19:51

There's always the nuclear option: no guest access. I really don't like the idea of blocking casual users, but the current situation is blocking everyone.

[Edited on 27-2-2026 by Twospoons]

BromicAcid - 27-2-2026 at 18:40

Quote: Originally posted by Twospoons  
There's always the nuclear option: no guest access. I really don't like the idea of blocking casual users, but the current situation is blocking everyone.


Or maybe just block access to everything except Beginnings or something?