Sciencemadness Discussion Board
Author: Subject: Issues Loading the SM Website
bnull
National Hazard
****




Posts: 994
Registered: 15-1-2024
Location: East Woods
Member Is Offline

Mood: Fecking annoyed

[*] posted on 29-1-2026 at 14:43


1454 now and the forum is still usable. Weird.



Quod scripsi, scripsi.

B. N. Ull

We have a lot of fun stuff in the Library.

Read The ScienceMadness Guidelines. They exist for a reason.
bnull
National Hazard
****




Posts: 994
Registered: 15-1-2024
Location: East Woods
Member Is Offline

Mood: Fecking annoyed

[*] posted on 31-1-2026 at 19:51


All right, this one's got to be a joke.
leeches.jpg - 227kB
And yet the forum was running smoothly, https and whisper and all.




charley1957
Hazard to Others
***




Posts: 197
Registered: 18-2-2012
Location: Texas
Member Is Offline

Mood: Hotter than the hinges of Hell! Well, not yet, but fixing to be!

[*] posted on 1-2-2026 at 07:12


Today is the first day I’ve been able to get on the website in a week. I kept getting timeouts and "server stopped responding" messages.





You can’t claim you drank all day if you didn’t start early in the morning.
yobbo II
National Hazard
****




Posts: 820
Registered: 28-3-2016
Member Is Offline

Mood: No Mood

[*] posted on 3-2-2026 at 17:26


01:17 GMT 4 Feb


Are there normally (before the bots) this many guests?

The bots may have registered?

Yob

guests.gif - 15kB
bnull
National Hazard
****




Posts: 994
Registered: 15-1-2024
Location: East Woods
Member Is Offline

Mood: Fecking annoyed

[*] posted on 4-2-2026 at 04:06


The guests are bots, or rather the majority of the guests are. Take 20 guests as real persons and what remains will be those bloody leeches.



bnull
National Hazard
****




Posts: 994
Registered: 15-1-2024
Location: East Woods
Member Is Offline

Mood: Fecking annoyed

[*] posted on 6-2-2026 at 03:14


Same joke again. Ha ha. :mad:

More leeches.jpg - 223kB

Different scrapers?




charley1957
Hazard to Others
***




Posts: 197
Registered: 18-2-2012
Location: Texas
Member Is Offline

Mood: Hotter than the hinges of Hell! Well, not yet, but fixing to be!

[*] posted on 6-2-2026 at 04:25


Wow, second day in a row I’ve been able to get in. Fast too, instantly. Are there that many bots out there all using this website at one time? I’m not real up on how all that works, what the bots are for, who runs them, etc. It’s my partial understanding that at least some of them are here to get data for training AI models. That’s just from some of the discussion in here about it. But are they all here doing that, all at one time?



bnull
National Hazard
****




Posts: 994
Registered: 15-1-2024
Location: East Woods
Member Is Offline

Mood: Fecking annoyed

[*] posted on 10-2-2026 at 14:51


Stable. Number of 'guests' below 800.



bnull
National Hazard
****




Posts: 994
Registered: 15-1-2024
Location: East Woods
Member Is Offline

Mood: Fecking annoyed

[*] posted on 22-2-2026 at 02:37


Well...
leeches21022026.jpg - 234kB




Twospoons
International Hazard
*****




Posts: 1392
Registered: 26-7-2004
Location: Middle Earth
Member Is Offline

Mood: A trace of hope...

[*] posted on 22-2-2026 at 17:13


Certain times seem to be worse than others: I can rarely load the website before midday (local time is GMT+12).



Helicopter: "helico" -> spiral, "pter" -> with wings
esquizete_electrolysis
Harmless
*




Posts: 26
Registered: 9-10-2018
Location: N.C. RT
Member Is Offline


[*] posted on 23-2-2026 at 17:18


Haven't been able to load the site in the last week, but had an idea during that time. Why not implement a captcha to view the site? It's better than having to log in to view it, and many other forums (albeit on more modern codebases, mostly XenForo) have used it to great success. It may have the downside of reducing the ability to find threads via Google, but I reckon that's a fair trade-off for being able to use the site more than once a week.

Edit: I wanted to flesh out my thought a bit more so this doesn't seem like a piss in the wind. Since the bulk of the traffic is likely AI crawlers, some simpler systems could be implemented before a captcha. I also don't particularly care for most captcha providers, since Google datamines as much as possible and Cloudflare has a tendency to pull service for ethereal reasons.

The simplest is an updated robots.txt config. If it's mostly crawlers from the most popular sources, this should take care of the bulk. I assume one is already in use, but it may need updating. GitHub typically hosts decent lists for this.

If that doesn't work, the next step would be including a noindex tag on most pages. This would unfortunately neuter the ability to search up relevant info in our more expansive threads. This could be applied to body text so post titles will still remain searchable.
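As a rough illustration of the noindex idea, here is a minimal Python sketch. The page kinds ("thread" and "index") and both function names are hypothetical and not part of the forum software; only the `noindex` value, the robots meta tag, and the `X-Robots-Tag` header are standard. Keep in mind that noindex applies per page and only asks search engines not to list it; it does not stop a crawler from requesting the page.

```python
def robots_meta(page_kind):
    """Return the robots meta tag for a page, or "" if it may be indexed.

    Hypothetical page kinds: "thread" pages carry post bodies,
    "index" pages carry only forum and thread titles.
    """
    if page_kind == "thread":
        return '<meta name="robots" content="noindex">'
    return ""


def robots_headers(page_kind):
    """Same decision expressed as an HTTP response header."""
    if page_kind == "thread":
        return {"X-Robots-Tag": "noindex"}
    return {}


if __name__ == "__main__":
    for kind in ("thread", "index"):
        print(kind, repr(robots_meta(kind)), robots_headers(kind))
```

With this split, title listings stay searchable while pages that hold post bodies are kept out of search results.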

If neither of these worked, it would be more a matter of seeing whether the site is actively being attacked by someone. At that point, implementing a captcha along with DDoS protection frameworks like CrowdSec could be a decent way of easing the issue. It's unlikely it's this deep, but who knows, there may be a long-disgruntled loser who wants to take down this wonderful resource.

Finally, scorched earth. Have the noindex tag on every page and require a trusted tag to access anything other than the login page. This could be expanded to a list of trusted IPs, or to using cookies to grant a trust token. All of that is unnecessary, though, since it still amounts to requiring users to log in to use the site.
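For what the cookie trust token could look like, here is a minimal sketch in Python using an HMAC-signed value. Everything in it is an assumption made for illustration: the secret key, the one-week lifetime, the token layout, and the function names are hypothetical and have nothing to do with how the forum software actually works.

```python
import hashlib
import hmac
import time

# Hypothetical values: a real deployment would load the key from private config.
SECRET_KEY = b"replace-with-a-long-random-secret"
TOKEN_LIFETIME = 7 * 24 * 3600  # seconds a token stays valid


def _sign(payload):
    """Hex HMAC-SHA256 signature of the payload string."""
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(client_id, now=None):
    """Build a cookie value of the form client_id:expiry:signature."""
    current = time.time() if now is None else now
    payload = f"{client_id}:{int(current + TOKEN_LIFETIME)}"
    return f"{payload}:{_sign(payload)}"


def verify_token(token, now=None):
    """Accept the token only if the signature matches and it has not expired."""
    try:
        client_id, expires, signature = token.rsplit(":", 2)
    except ValueError:
        return False  # not in the expected three-part layout
    expected = _sign(f"{client_id}:{expires}")
    # Constant-time comparison, done on bytes so odd input cannot raise.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False
    current = time.time() if now is None else now
    return expires.isdigit() and int(expires) > current


if __name__ == "__main__":
    cookie = issue_token("guest-1")
    print(cookie, verify_token(cookie))
```

The server would set the token as a cookie once a visitor passes whatever check is chosen (login, captcha, trusted IP) and verify it on later requests without needing a database lookup.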

[Edited on 24-2-2026 by esquizete_electrolysis]
bnull
National Hazard
****




Posts: 994
Registered: 15-1-2024
Location: East Woods
Member Is Offline

Mood: Fecking annoyed

[*] posted on 24-2-2026 at 04:14


The leeches always find a workaround. Solving captchas is within their abilities. robots.txt is not a rule; crawlers comply only if they were instructed to do so. A gentlemen's agreement, in fact.

If it were an attack, there would be more than just plain unavailability. If you have read older threads, there was a guy who messed up posts and changed passwords and stuff until he was caught. I can't remember which post it was.

Is there an inverse relation between leech number and ease of access?


leeches_20260224.jpg - 124kB




esquizete_electrolysis
Harmless
*




Posts: 26
Registered: 9-10-2018
Location: N.C. RT
Member Is Offline


[*] posted on 25-2-2026 at 16:30


Quote: Originally posted by bnull  
The leeches always find a workaround. Solving captcha is within their abilities. robots.txt is not a rule, the crawlers comply if they were instructed to do so. A gentlemen's agreement, in fact.


Perhaps, but compliance is necessary for any large company to continue its operations. Legal precedent (at least in the US) has been set that treats noncompliance as either trespassing (kinda odd) or a violation of the DMCA.

I did look, and we do have a decent robots.txt. However, it only blocks some of the large crawlers, 11 to be precise, and only 4 of those are AI-specific. Of those 4, only 1 is still in major use, while the other 35 that make up 90% of crawler traffic are not blocked.

I attached both ours (as a quote since it's short) and a random AI megalist one I pulled off GitHub for comparison.
Quote:

User-agent: Amazonbot
User-agent: Amazonbot/0.1
User-agent: GPTBot **Note: This is the only blocked one of their 6 bots.
User-agent: Applebot
User-agent: SemrushBot
User-agent: Bytespider
User-agent: meta-externalagent
User-agent: coccocbot-web
User-agent: AhrefsBot
User-agent: PetalBot
User-agent: BLEXBot


Attachment: robots.txt (3kB)
This file has been downloaded 26 times
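A quick way to check which crawlers a robots.txt actually covers is Python's standard urllib.robotparser. The sketch below is illustrative only. It assumes the eleven User-agent lines quoted above are followed by a single "Disallow: /" rule (the quote shows only the agent names, not the rule), and the candidate names other than GPTBot are examples of publicly documented crawlers, not names taken from the server logs. Python's parser also matches agent names by substring, which real crawlers may or may not do.

```python
from urllib.robotparser import RobotFileParser

# Agent list as quoted above. The final "Disallow: /" line is an assumption.
ROBOTS_TXT = """\
User-agent: Amazonbot
User-agent: Amazonbot/0.1
User-agent: GPTBot
User-agent: Applebot
User-agent: SemrushBot
User-agent: Bytespider
User-agent: meta-externalagent
User-agent: coccocbot-web
User-agent: AhrefsBot
User-agent: PetalBot
User-agent: BLEXBot
Disallow: /
"""

# Example crawler names to test; only GPTBot comes from the quoted file.
CANDIDATES = ["GPTBot", "ChatGPT-User", "OAI-SearchBot",
              "ClaudeBot", "CCBot", "PerplexityBot"]


def blocked_agents(robots_txt, agents, path="/"):
    """Map each agent name to True if robots_txt disallows it from path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: not parser.can_fetch(agent, path) for agent in agents}


if __name__ == "__main__":
    for agent, blocked in blocked_agents(ROBOTS_TXT, CANDIDATES).items():
        print(f"{agent:15} {'blocked' if blocked else 'allowed'}")
```

Swapping in the text of the attached megalist for ROBOTS_TXT would show how many of the same candidates that list covers.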

chempyre235
Hazard to Others
***




Posts: 207
Registered: 21-10-2024
Location: Between Nb and Tc
Member Is Offline

Mood: Quite distracted

[*] posted on 26-2-2026 at 08:38


It's been almost two weeks since I've been able to access the forum (I was getting a bit worried). I now see that the sheer number of bots constantly on this forum is far past what this site was designed for. As I type this, there are well over 900 users on this site, with only three members (including myself) logged in.



"However beautiful the strategy, you should occasionally look at the results." -Winston Churchill

"I weep at the sight of flaming acetic anhydride." -@Madscientist

"...the elements shall melt with fervent heat..." -2 Peter 3:10
macckone
Dispenser of practical lab wisdom
*****




Posts: 2211
Registered: 1-3-2013
Location: Over a mile high
Member Is Offline

Mood: Electrical

[*] posted on 26-2-2026 at 10:49


Compliance with robots.txt is necessary under the Computer Fraud and Abuse Act. It is technically illegal to exceed authority granted to access a website. They can also be sued for damages, which at this point are substantial.
bnull
National Hazard
****




Posts: 994
Registered: 15-1-2024
Location: East Woods
Member Is Offline

Mood: Fecking annoyed

[*] posted on 26-2-2026 at 14:26


Quote:
Compliance with robots.txt is necessary under the Computer Fraud and Abuse Act.

No. First of all, robots.txt was created in 1994 and
Quote:
"is not an official standard backed by a standards body, or owned by any commercial organisation. It is not enforced by anybody, and there no guarantee that all current and future robots will use it. Consider it a common facility the majority of robot authors offer the WWW community to protect WWW server against unwanted accesses by their robots." (From https://www.robotstxt.org/)

As I wrote before, compliance with robots.txt is based on a gentlemen's agreement, which the leeches are not obliged to honor.

The Computer Fraud and Abuse Act was enacted in 1986, almost a decade before robots.txt. It does not define what constitutes unauthorized access.

Quote:
It is technically illegal to exceed authority granted to access a website.

Yes, but unauthorized access requires that the leeches scrape content that is only visible to logged members (the u2u system and the References, Whimsy, and Detritus subforums). The content that is visible to the general public without the necessity of username and password (the rest of the subforums) does not fall under that definition. From Van Buren v. United States (attachment, page 5):
Quote:
We must decide whether Van Buren also violated the Computer Fraud and Abuse Act of 1986 (CFAA), which makes it illegal "to access a computer with authorization and to use such access to obtain or alter information in the computer that the accesser is not entitled so to obtain or alter."

He did not. This provision covers those who obtain information from particular areas in the computer—such as files, folders, or databases—to which their computer access does not extend. It does not cover those who, like Van Buren, have improper motives for obtaining information that is otherwise available to them.

robots.txt is not legally binding, so whether the leeches comply with it or not is immaterial for the definition of unauthorized access.

Attachment: Van Buren.pdf (207kB)
This file has been downloaded 34 times




BromicAcid
International Hazard
*****




Posts: 3323
Registered: 13-7-2003
Location: Wisconsin
Member Is Offline

Mood: Rock n' Roll

[*] posted on 26-2-2026 at 19:26


Since we're so popular, poison?

https://rnsaffn.com/poison3/

Edit: I suppose we have Detritus though which is about the same thing.

[Edited on 2/27/2026 by BromicAcid]




Shamelessly plugging my attempts at writing fiction: http://www.robvincent.org
Twospoons
International Hazard
*****




Posts: 1392
Registered: 26-7-2004
Location: Middle Earth
Member Is Offline

Mood: A trace of hope...

[*] posted on 26-2-2026 at 19:51


There's always the nuclear option: no guest access. I really don't like the idea of blocking casual users, but the current situation is blocking everyone.

[Edited on 27-2-2026 by Twospoons]




BromicAcid
International Hazard
*****




Posts: 3323
Registered: 13-7-2003
Location: Wisconsin
Member Is Offline

Mood: Rock n' Roll

[*] posted on 27-2-2026 at 18:40


Quote: Originally posted by Twospoons  
There's always the nuclear option: no guest access. I really don't like the idea of blocking casual users, but the current situation is blocking everyone.


Or maybe just block access to everything except Beginnings or something?



