Sciencemadness Discussion Board
Author: Subject: Tired of reporting spam
Melgar
Anti-Spam Agent
*****




Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline

Mood: Estrified

[*] posted on 24-10-2018 at 21:33


BotKilla in action.

I gave it something of an overhaul today, which lets me run the spam tests independently of the main script. That makes it a lot easier to test spam that the filter didn't catch, and to quickly write and test new rules that might do a better job of catching it.

If any spam does get through the filter, it'd help to move it to Detritus so I can take a look at it later and see how it got through.

botkilla.png - 46kB




The first step in the process of learning something is admitting that you don't know it already.

I'm givin' the spam shields max power at full warp, but they just dinna have the power! We're gonna have to evacuate to new forum software!
streety
Hazard to Others
***




Posts: 110
Registered: 14-5-2018
Member Is Offline


[*] posted on 25-10-2018 at 12:16


Are you checking for false positives?
Melgar
Anti-Spam Agent
*****




Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline

Mood: Estrified

[*] posted on 25-10-2018 at 13:33


Yes, so far just that one, and that was because of a glitch in the way links were checked against the whitelist. I PMed him and told him what happened, and included the text of his post in case he wanted to repost it. I don't think he's been back in a few days though. Once I fixed that glitch, it no longer got flagged. Really, the scrutiny is just a lot more on recently-registered users, especially when posting links. His link would have been whitelisted though, had the whitelisting been working correctly at the time. I also raised the threshold for what gets deleted and added a few more conditions that can trigger additional flags. The "linking from unrecognized domain" flag has really been a key predictor of spam, by a long shot. And the "registered today" flag too, just because it focuses on the narrow subset of members who post 99% of the spam.

The algorithm has worked flawlessly, actually; it was just a stupid bug in my code that was responsible for that one false positive. :P
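
A minimal Python sketch of the flag-and-threshold idea described above. The flag names follow the post; the weights, the threshold, the `Post` fields, and the tiny whitelist are all hypothetical and chosen only for illustration, so this should not be read as BotKilla's actual rules.

```python
from dataclasses import dataclass
from datetime import date
from urllib.parse import urlparse

# Hypothetical whitelist; the real one is a separate, much longer list.
WHITELIST = {"sciencemadness.org", "wikipedia.org"}

# Hypothetical weights and threshold, picked so that no single flag
# can trigger a deletion by itself.
FLAG_WEIGHTS = {"unrecognized_domain": 2, "registered_today": 2, "first_post": 1}
DELETE_THRESHOLD = 4


@dataclass
class Post:
    text: str
    links: list
    author_registered: date
    author_post_count: int


def domain_of(url):
    """Return the lower-cased host of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_whitelisted(domain):
    """Accept a whitelisted domain or any subdomain of one."""
    return any(domain == w or domain.endswith("." + w) for w in WHITELIST)


def flags_for(post, today):
    """Collect the set of flags raised by a post."""
    flags = set()
    # A link whose host cannot be parsed also counts as unrecognized.
    if any(not is_whitelisted(domain_of(u)) for u in post.links):
        flags.add("unrecognized_domain")
    if post.author_registered == today:
        flags.add("registered_today")
    if post.author_post_count == 0:
        flags.add("first_post")
    return flags


def should_delete(post, today):
    """Delete only when the summed flag weights reach the threshold."""
    score = sum(FLAG_WEIGHTS[f] for f in flags_for(post, today))
    return score >= DELETE_THRESHOLD
```

With these made-up weights, a brand-new member posting a whitelisted link scores 3 and is left alone, while the same member posting a link to an unknown domain scores 5 and is deleted.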




fusso
International Hazard
*****




Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline


[*] posted on 25-10-2018 at 14:50


Maybe BotKilla needs another feature:
temporarily ban a new account if it has more than a certain # of spam posts deleted, say 5?




Melgar
Anti-Spam Agent
*****




Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline

Mood: Estrified

[*] posted on 25-10-2018 at 18:06


I only have moderation privileges; I think banning someone requires an admin. Not that it matters, since it looks like spam is almost always deleted within a minute of being posted. Really, this is a temporary measure to keep this board usable until we can commit to switching forum software and make sure we're all on the same page about how we go about it. Obviously, the sooner we do this, the better.



phlogiston
International Hazard
*****




Posts: 1375
Registered: 26-4-2008
Location: Neon Thorium Erbium Lanthanum Neodymium Sulphur
Member Is Offline

Mood: pyrophoric

[*] posted on 31-10-2018 at 15:21


Does BotKilla have a day off, or did the spammers find a way to circumvent it?
I just browsed through 5 pages of today's posts and found just 10 'real' threads in them.




-----
"If a rocket goes up, who cares where it comes down, that's not my concern said Wernher von Braun" - Tom Lehrer
Melgar
Anti-Spam Agent
*****




Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline

Mood: Estrified

[*] posted on 31-10-2018 at 18:32


Oops. There was some strange character encoding in one of the spam posts that crashed BotKilla, and I didn't notice what happened until just a little while ago. I think I've covered all the bases now, but the only way to know for sure is to just let it run and keep checking it.
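
The post doesn't say which encoding caused the crash, so the following is only a sketch of one common defence, assuming the page arrives as raw bytes: try a couple of likely encodings, then fall back to a lossy decode that cannot raise. The function name and the encoding list are hypothetical.

```python
def safe_decode(raw, encodings=("utf-8", "cp1252")):
    """Decode bytes without ever raising UnicodeDecodeError.

    Tries each candidate encoding strictly, in order. If none fits,
    decodes as UTF-8 and substitutes U+FFFD for undecodable bytes.
    """
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    return raw.decode("utf-8", errors="replace")
```

The lossy fallback means a strangely encoded post may lose a few characters, but the spam checks still get a string to work on instead of an exception.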

Really though, this is turning into quite a lot of responsibility. It's nothing I can't handle, but it would certainly help if there was compensation that came with it. I guess I should probably start a new thread on the subject to see what people think.




fusso
International Hazard
*****




Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline


[*] posted on 1-11-2018 at 02:05


Quote: Originally posted by Melgar  
Really though, this is turning into quite a lot of responsibility. It's nothing I can't handle, but it would certainly help if there was compensation that came with it. I guess I should probably start a new thread on the subject to see what people think.
Maybe put the thread in whimsy?



fusso
International Hazard
*****




Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline


[*] posted on 1-11-2018 at 11:39


Hidden spam in signature:
https://www.sciencemadness.org/whisper/viewthread.php?tid=65...




unionised
International Hazard
*****




Posts: 5102
Registered: 1-11-2003
Location: UK
Member Is Offline

Mood: No Mood

[*] posted on 4-11-2018 at 02:41


What the hell is it with mangosteens?
Anyway, I thought I'd say thanks to whoever else was involved in the episode about 15 minutes ago where a whole bunch of spam was quickly detected + wiped.
Very satisfying.
j_sum1
Administrator
********




Posts: 6218
Registered: 4-10-2014
Location: Unmoved
Member Is Offline

Mood: Organised

[*] posted on 4-11-2018 at 02:55


Just you and Melgar's BotKilla this time, unionised.

I've got no idea what a mangosteen is either, and I am not googling it.
unionised
International Hazard
*****




Posts: 5102
Registered: 1-11-2003
Location: UK
Member Is Offline

Mood: No Mood

[*] posted on 4-11-2018 at 04:57


Nice to know I achieved something today.
A mangosteen is a fruit, BTW. Nice enough. Nothing special, except it's a bit of a novelty for most people in the UK or US.
fusso
International Hazard
*****




Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline


[*] posted on 4-11-2018 at 06:12


Quote: Originally posted by unionised  
Nice to know I achieved something today.
A mangosteen is a fruit, BTW. Nice enough. Nothing special, except it's a bit of a novelty for most people in the UK or US.
Mangosteen is also a... typhoon, aka hurricane/cyclone :P



Tsjerk
International Hazard
*****




Posts: 3022
Registered: 20-4-2005
Location: Netherlands
Member Is Offline

Mood: Mood

[*] posted on 4-11-2018 at 08:04


Very nice work, Melgar! Could I have a copy of your code? Just out of curiosity.
Melgar
Anti-Spam Agent
*****




Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline

Mood: Estrified

[*] posted on 4-11-2018 at 08:18


Quote: Originally posted by unionised  
What the hell is it with mangosteens?
Anyway, I thought I'd say thanks to whoever else was involved in the episode about 15 minutes ago where a whole bunch of spam was quickly detected + wiped.
Very satisfying.

Oh right. If my U2U message count shoots up due to spam reports, that's my cue to check on the BotKilla script. Indeed, it had malfunctioned and needed to be restarted. Since I let this thing loose with a lot of untested code initially, I've been operating under the principle that it's better to have it stop when it malfunctions. That way, I can see what caused the problem, and there isn't a malfunctioning bot on the loose, deleting posts. But like the last three times that's happened, it's been some sort of temporary network error, so next time that happens I'll set it to ignore that specific error and just keep trying again until it works.
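
One way to "ignore that specific error and keep trying" can be sketched as a small retry helper. This is an assumption-laden illustration, not the script's code: `with_retries`, the choice of `ConnectionError` and `TimeoutError` as the "temporary network errors", and the doubling delay are all hypothetical. Unlike the plan in the post, it gives up after a fixed number of attempts so that a permanent outage still surfaces.

```python
import time

# Exception types treated as temporary network trouble (an assumption).
TRANSIENT_ERRORS = (ConnectionError, TimeoutError)


def with_retries(action, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call action(), retrying only on transient network errors.

    The wait doubles after each failure. Any other exception, or a
    transient one on the final attempt, propagates to the caller so the
    bot still stops on problems it does not understand.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except TRANSIENT_ERRORS:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

Only the listed error types are retried, which keeps the "stop when it malfunctions" principle for everything else.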

When it finally does get running again, it keeps track of how many spam posts it's deleted per hour, and I admit that seeing spam text scroll past faster than I can read it, and watching that rate temporarily shoot up into the tens of thousands is quite satisfying. :D




Melgar
Anti-Spam Agent
*****




Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline

Mood: Estrified

[*] posted on 4-11-2018 at 08:41


Quote: Originally posted by Tsjerk  
Very nice work, Melgar! Could I have a copy of your code? Just out of curiosity.

OK, here's the main script:

https://github.com/toldani/sm-transition/blob/master/botkill...

And here's the domain whitelist that links are checked against:

https://github.com/toldani/sm-transition/blob/master/sm-link...

No single flag is enough to trigger a deletion, but most spam is deleted when a user's first post includes a link to a domain not on the whitelist. The whitelist was generated by scanning 14 years of forum history and storing all the domains that were linked to at least ten times.
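
The whitelist-building step can be sketched in a few lines of Python. This is not the linked script; `build_whitelist`, the URL regex, and the "strip a leading www." normalisation are assumptions made for illustration. Only the ten-link cutoff comes from the description above.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Rough URL matcher for forum text; brackets and quotes end the match.
URL_RE = re.compile(r"https?://[^\s<>\"'\[\]]+")


def build_whitelist(posts, min_links=10):
    """Return the set of domains linked at least min_links times.

    posts is any iterable of post bodies (plain strings).
    """
    counts = Counter()
    for text in posts:
        for url in URL_RE.findall(text):
            host = (urlparse(url).hostname or "").lower()
            if host.startswith("www."):
                host = host[4:]
            if host:
                counts[host] += 1
    return {domain for domain, n in counts.items() if n >= min_links}
```

For example, a history with ten links to one site and nine to another yields a whitelist containing only the first.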




unionised
International Hazard
*****




Posts: 5102
Registered: 1-11-2003
Location: UK
Member Is Offline

Mood: No Mood

[*] posted on 4-11-2018 at 14:44


I'm sure the question was asked earlier but... can you get the code to spot posts in Cyrillic?
Melgar
Anti-Spam Agent
*****




Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline

Mood: Estrified

[*] posted on 4-11-2018 at 15:51


Quote: Originally posted by unionised  
I'm sure the question was asked earlier but... can you get the code to spot posts in Cyrillic?

Yeah, it already does. Also, Chinese, Tagalog, Arabic, and Thai. Although I've really only seen much Cyrillic and Chinese.
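
How the script recognises these scripts isn't stated here, so the following is only one possible approach, sketched with Python's standard `unicodedata` module: measure what fraction of a post's letters have Unicode names from the listed scripts. The function name and the prefix list are hypothetical, and the "TAGALOG" prefix matches the Unicode Tagalog (Baybayin) block, which is an assumption about what "Tagalog" means in the post.

```python
import unicodedata

# Unicode character-name prefixes for the scripts mentioned above.
SCRIPT_PREFIXES = ("CYRILLIC", "CJK", "ARABIC", "THAI", "TAGALOG")


def script_fraction(text):
    """Return the fraction of letters in text that belong to the listed scripts.

    Digits, punctuation, and whitespace are ignored. Text with no
    letters at all gives 0.0.
    """
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    hits = sum(
        1 for c in letters
        if unicodedata.name(c, "").startswith(SCRIPT_PREFIXES)
    )
    return hits / len(letters)
```

A caller could raise a flag when the fraction passes some cutoff, so that a single quoted foreign word in an otherwise English post is not enough to trip it.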

Right now, it only catches new threads and not spam that's been tacked onto the end of existing threads, but that's mostly because that kind of spam is less common and usually gets deleted before I have a chance to figure out how to detect it.




Texium
Administrator
********




Posts: 4508
Registered: 11-1-2014
Location: Salt Lake City
Member Is Online

Mood: PhD candidate!

[*] posted on 13-11-2018 at 10:14


In case anyone was wondering, I reported the Melgar impersonator as Spam in order to swiftly destroy it. So that's where it went.



Come check out the Official Sciencemadness Wiki
They're not really active right now, but here's my YouTube channel and my blog.
Melgar
Anti-Spam Agent
*****




Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline

Mood: Estrified

[*] posted on 18-11-2018 at 18:07


Oh, ok. I was kinda wondering what happened with that.

It's nice to see so many new threads being started that aren't spam. :) Also, I've configured BotKilla to wait five minutes after it crashes, then start running again. The main thing crashing it now seems to actually be if a thread gets deleted between when BotKilla determines a thread to be spam, and when it tries to navigate to that thread to delete it. So really it was just a 404 error that only became a problem because I hadn't set up error handling in the "delete thread" function. But since it seems to cause way more problems when it's down than when it's up and malfunctioning slightly, it should now recover from any errors that it throws, and go back to hunting spam. :D
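
The two fixes described here, tolerating a 404 in the "delete thread" step and restarting five minutes after a crash, might look roughly like this in Python. It is a sketch under assumptions, not the actual script: `send_delete`, `delete_thread`, `supervise`, and the optional restart cap are hypothetical names and choices, and HTTP errors are assumed to surface as `urllib.error.HTTPError`.

```python
import time
import urllib.error

RESTART_DELAY = 300  # five minutes, as described in the post


def delete_thread(thread_id, send_delete):
    """Delete a thread, treating 'already gone' as a non-error.

    send_delete is a hypothetical callable that performs the HTTP
    request. Returns True if it succeeded, False if the thread had
    already been removed (HTTP 404). Other HTTP errors propagate.
    """
    try:
        send_delete(thread_id)
        return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise


def supervise(main, delay=RESTART_DELAY, sleep=time.sleep, max_restarts=None):
    """Run main(); after a crash, wait delay seconds and run it again.

    Returns whatever main() returns if it ever finishes normally. With
    max_restarts set, the last exception is re-raised once that many
    restarts have been used up.
    """
    restarts = 0
    while True:
        try:
            return main()
        except Exception as exc:
            if max_restarts is not None and restarts >= max_restarts:
                raise
            restarts += 1
            print(f"crashed with {exc!r}; restarting in {delay} s")
            sleep(delay)
```

The delay gives a human moderator time to remove whatever caused the crash before the next attempt.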




Mr. Rogers
Hazard to Others
***




Posts: 184
Registered: 30-10-2017
Location: Ammonia Avenue
Member Is Offline

Mood: No Mood

[*] posted on 19-11-2018 at 08:45


I haven't seen any spam in two days. Something is obviously working.
12thealchemist
Hazard to Others
***




Posts: 181
Registered: 1-1-2014
Location: The Isle of Albion
Member Is Offline

Mood: Rare and Earthy

[*] posted on 19-11-2018 at 11:55


I'm afraid you've just jinxed it. Two spam posts this evening, but different to the usual ones - paper writing and increased business



Just my two pennyworth
My YouTube channel: www.youtube.com/channel/UC4t9tVlAk7ww1wgCVW4yUjg
Elements collected so far: 65; to collect: Ln, Rb, Sr, Ba, F, Kr, radioactives
Melgar
Anti-Spam Agent
*****




Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline

Mood: Estrified

[*] posted on 19-11-2018 at 13:45


I lowered the refresh time from 60 seconds to 30 seconds, meaning that spam slated for automatic deletion will only be visible half as long now. It's not that big of a deal, but if you guys don't mind, could you wait about a minute to see if spam gets deleted automatically before reporting it? Most of the spam that gets reported does end up getting auto-deleted a few seconds later.

The only spam I've seen getting through regularly is the super vague kind, with no links and terrible English. Since those posts are the epitome of pointlessness, I'm not worried about them, and have just been deleting them manually.




fusso
International Hazard
*****




Posts: 1922
Registered: 23-6-2017
Location: 4 ∥ universes ahead of you
Member Is Offline


[*] posted on 19-11-2018 at 13:45


Quote: Originally posted by Melgar  
It's nice to see so many new threads being started that aren't spam. :) Also, I've configured BotKilla to wait five minutes after it crashes, then start running again.
Why not configure it to restart sooner, preferably instantly?



Melgar
Anti-Spam Agent
*****




Posts: 2004
Registered: 23-2-2010
Location: Connecticut
Member Is Offline

Mood: Estrified

[*] posted on 19-11-2018 at 13:53


Quote: Originally posted by fusso  
Why not configure it to restart sooner, preferably instantly?

Sometimes it crashes because of content in a spam message. In that case, BotKilla has to wait for a human moderator to delete the spam that it got stuck on. If it just restarted instantly, it could get stuck in an infinite loop until it goes haywire and starts marauding through the forums destroying everything it sees.

Actually it'd probably just generate a huge repetitive log file and crash in a way that it can't recover from so easily, but a malfunctioning robot wearing Crips colors and killing everything in its path is more fun to imagine, no?



