Author: Subject: Tired of reporting spam
Tsjerk
International Hazard

Posts: 1594
Registered: 20-4-2005
Location: Netherlands
Member Is Offline

Mood: Mood

 Quote: Originally posted by Magpie Would it be possible to have the forum software automatically delete: 1. posts in Russian?

Cyrillic, you mean?
Melgar
Anti-Spam Agent

Posts: 2002
Registered: 23-2-2010
Location: NYC
Member Is Offline

Mood: Aromatic

Ok, my proprietary spam-destroying system "botkilla" has been activated. There was just an insane amount of spam, and even with the threshold turned down pretty low, there wasn't a single false positive. So I'm going to try leaving it on overnight. It runs every three minutes, and so far has gotten all the spam that's been posted in the last few hours. I'll probably tweak the sensitivity tomorrow, if all goes well. In the meantime, if you want to test it, try starting a new thread with the words "mature" or "adult" or "galleries" or Cyrillic characters in the title and see how long it stays up.

I'll make it so that there are additional tests for those cases, but in the meantime, I think most of you will notice a huge drop in the amount of visible spam. And with the moderators no longer having quite as many spam reports to deal with, we might actually be able to respond to U2U messages, if this works.
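For anyone curious what a title check like the one described above might look like, here is a minimal sketch. It is not botkilla's code: the word list only contains the three examples from the post, and the function name `title_flags` and the "cyrillic in title" flag text are made up for illustration.

```python
import re

# Hypothetical word list: only the three examples mentioned in the post.
SPAM_WORDS = {"mature", "adult", "galleries"}

# Basic Cyrillic Unicode block, U+0400 to U+04FF.
CYRILLIC = re.compile(r"[\u0400-\u04FF]")


def title_flags(title):
    """Return a list of reasons why a thread title looks spammy."""
    flags = []
    # Compare whole lowercase Latin words so "adulterant" does not match "adult".
    words = set(re.findall(r"[a-z]+", title.lower()))
    if words & SPAM_WORDS:
        flags.append("spam words in title")
    if CYRILLIC.search(title):
        flags.append("cyrillic in title")
    return flags
```

Matching whole words rather than substrings is a deliberate choice here, since a chemistry board will see plenty of legitimate titles containing words like "adulterant".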

The first step in the process of learning something is admitting that you don't know it already.

I'm givin' the spam shields max power at full warp, but they just dinna have the power! We're gonna have to evacuate to new forum software!
j_sum1

Posts: 4673
Registered: 4-10-2014
Location: Oz
Member Is Offline

Mood: Metastable, and that's good enough.

 Quote: Originally posted by Melgar In the meantime, if you want to test it, try starting a new thread with the words "mature" or "adult" or "galleries" or Cyrillic characters in the title and see how long it stays up.

I did just that. here

It seems to be staying up -- it has been longer than 3 mins and there have been no replies in the thread. I will guess that the reason is that it is not a new registration. (Does your botkilla test for that?) I might try again later with a dummy registration.

Ok. I registered with the name, jsumxgwsukspamtester and posted something with a spammy title. It lasted only a couple of minutes before vanishing.
Account is not deleted or banned but the garbage is gone.

Well done, Melgar.
<clap><clap><clap><clap><clap><clap><clap><clap><clap><clap>

[Edited on 10-10-2018 by j_sum1]
streety
Hazard to Others

Posts: 109
Registered: 14-5-2018
Member Is Offline

Nice to see some automation. From your description and the behavior of your bot I assume you are not re-using any of the program I sent you.

I can also send you all the spam messages my script has collected over the past few months. Hopefully it will help with training your own implementation. I'll wait for the current batch to be cleared.

Some of the spam posts have 80+ views. I don't think I've ever seen that before. Melgar, will that be your script?

[Edited on 10-10-2018 by streety]
streety
Hazard to Others

Posts: 109
Registered: 14-5-2018
Member Is Offline

The attached file contains a sqlite database file with three tables: topic, member and forum.

This is the definition for the tables:

 Code:
from sqlalchemy import Column, Integer, Text, DateTime, ForeignKey
from sqlalchemy.orm import relationship
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class Topic(Base):
    """One thread, identified by the forum's own topic_id."""
    __tablename__ = 'topic'

    id = Column(Integer, primary_key=True)
    num_posts = Column(Integer)
    num_views = Column(Integer)
    first_post = Column(Text)
    topic_title = Column(Text)
    topic_id = Column(Integer, index=True)
    post_date = Column(DateTime)
    delete_date = Column(DateTime, nullable=True)  # set once the post is deleted
    forum_id = Column(Integer, ForeignKey('forum.id'))
    member_id = Column(Integer, ForeignKey('member.id'))

    def __init__(self, num_posts, num_views, first_post, topic_title,
                 topic_id, post_date, forum_id, member_id):
        self.num_posts = num_posts
        self.num_views = num_views
        self.first_post = first_post
        self.topic_title = topic_title
        self.topic_id = topic_id
        self.post_date = post_date
        self.forum_id = forum_id
        self.member_id = member_id

    def __repr__(self):
        # The format string was lost when the post was rendered;
        # '<Topic %r>' is a reconstruction.
        return '<Topic %r>' % (self.topic_title)


class Member(Base):
    """The account that started a topic."""
    __tablename__ = 'member'

    id = Column(Integer, primary_key=True)
    name = Column(Text)
    register_date = Column(DateTime)
    deleted_date = Column(DateTime, nullable=True)
    topics = relationship('Topic', backref='member', lazy='dynamic')

    def __init__(self, name, register_date):
        self.name = name
        self.register_date = register_date

    def __repr__(self):
        return '<Member %r>' % (self.name)


class Forum(Base):
    """The subforum a topic was posted in."""
    __tablename__ = 'forum'

    id = Column(Integer, primary_key=True)
    forum_id = Column(Integer)
    name = Column(Text)
    topics = relationship('Topic', backref='forum', lazy='dynamic')

    def __init__(self, forum_id, name):
        self.forum_id = forum_id
        self.name = name

    def __repr__(self):
        return '<Forum %r>' % (self.name)

If a post has been deleted the delete_date field will be set.

Attachment: db.sqlite.tar.gz (4.8MB)
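As a rough illustration of how the delete_date field could be used, the sketch below lists every deleted topic together with the member who posted it, using only the standard sqlite3 module. The in-memory tables are a cut-down stand-in for the attached db.sqlite, the two rows are made up, and `deleted_topics` is a hypothetical helper name.

```python
import sqlite3


def deleted_topics(conn):
    """Return (topic_title, member name) for every topic whose delete_date is set."""
    query = """
        SELECT t.topic_title, m.name
        FROM topic AS t
        JOIN member AS m ON m.id = t.member_id
        WHERE t.delete_date IS NOT NULL
        ORDER BY t.post_date
    """
    return conn.execute(query).fetchall()


# Tiny in-memory stand-in for db.sqlite so the sketch runs on its own.
# Only the columns needed by the query are created; all rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member (
        id INTEGER PRIMARY KEY, name TEXT,
        register_date DATETIME, deleted_date DATETIME);
    CREATE TABLE topic (
        id INTEGER PRIMARY KEY, topic_title TEXT,
        post_date DATETIME, delete_date DATETIME,
        member_id INTEGER REFERENCES member(id));
    INSERT INTO member VALUES
        (1, 'example_spammer', '2018-10-09 12:00:00', NULL),
        (2, 'example_chemist', '2015-01-01 00:00:00', NULL);
    INSERT INTO topic VALUES
        (1, 'made-up spam title', '2018-10-09 12:05:00', '2018-10-09 12:08:00', 1),
        (2, 'made-up chemistry question', '2018-10-09 13:00:00', NULL, 2);
""")

rows = deleted_topics(conn)
```

Against the real file, replacing `":memory:"` with the path to the extracted database and skipping the `executescript` call should be enough, assuming the table and column names match the definitions above.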

Melgar
Anti-Spam Agent

Posts: 2002
Registered: 23-2-2010
Location: NYC
Member Is Offline

Mood: Aromatic

I'm posting a more detailed description of what botkilla looks for in Whimsy, but I'll mention here that the script does keep logs of every post it deletes. Here's the one j_sum1 seems to have made:

 Code:
title: mature spam poke out
username: jsumxgwsukspamtester
replies: 0
last_poster: jsumxgwsukspamtester
tid: 96794
spam_score: 12
flags:
- spam words in title
- registered since yesterday
thread_text: "Just a quick test of the new botkilla
\r\nj_sum1"

Here's the link to a longer description of what it does, for whoever can access Whimsy:

I think I'm going to have to implement all the rules manually, just because I want to lower the chances of false positives as much as possible.
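To make the idea of hand-written rules with a score and a list of flags concrete, here is a minimal sketch. The two flag names are taken from the log entry above; the weights, the threshold, the reading of "registered since yesterday" as "under 24 hours old", and all function names are assumptions for illustration, not botkilla's actual values.

```python
from datetime import datetime, timedelta

SPAM_WORDS = {"mature", "adult", "galleries"}  # examples named earlier in the thread


def has_spam_words(post, now):
    return bool(SPAM_WORDS & set(post["title"].lower().split()))


def registered_recently(post, now):
    # Assumption: "registered since yesterday" means an account under 24 hours old.
    return now - post["registered"] < timedelta(days=1)


# (flag text, weight, test). Weights and threshold are invented for illustration.
RULES = [
    ("spam words in title", 5, has_spam_words),
    ("registered since yesterday", 4, registered_recently),
]
THRESHOLD = 8


def score_post(post, now):
    """Add up the weights of every rule that fires and record which ones did."""
    score, flags = 0, []
    for name, weight, test in RULES:
        if test(post, now):
            score += weight
            flags.append(name)
    return score, flags


def is_spam(post, now):
    return score_post(post, now)[0] >= THRESHOLD
```

With these made-up numbers no single rule reaches the threshold on its own, so a spammy title from a long-standing account stays up while the same title from a day-old account is removed. That is one way to keep false positives low.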

[Edited on 10/10/18 by Melgar]

JJay
International Hazard

Posts: 3321
Registered: 15-10-2015
Member Is Offline

Mood: resigned

Modern antispam software has very little trouble with false positives. Quit making excuses; if you're incompetent, turn in your resignation.

I'm no longer involved in this forum.
Melgar
Anti-Spam Agent

Posts: 2002
Registered: 23-2-2010
Location: NYC
Member Is Offline

Mood: Aromatic

Yes, I took the script down for like a half hour while I worked on it, and that's what's accumulated during that time period. Look at the times on all of those posts.

fusso
International Hazard

Posts: 1522
Registered: 23-6-2017
Location: ∥ universe
Member Is Offline

Mood: Mood

0 posts?

Useful sites:
Balance Chemical Equation: http://www.webqc.org/balance.php
Molecular mass and elemental composition calculator: https://www.webqc.org/mmcalc.php
Solubility table: https://en.wikipedia.org/wiki/Solubility_table
Azeotrope table: https://en.wikipedia.org/wiki/Azeotrope_tables
It's not crime if noone finds out - Nyaruko
j_sum1

Posts: 4673
Registered: 4-10-2014
Location: Oz
Member Is Offline

Mood: Metastable, and that's good enough.

That's how much post count really means.
Tsjerk
International Hazard

Posts: 1594
Registered: 20-4-2005
Location: Netherlands
Member Is Offline

Mood: Mood

@fusso; what do you mean with that post? It is a guy who registered in 2011 and he didn't post anything. So what?

@Melgar; nice work! No more spam for at least a couple of days!

[Edited on 14-10-2018 by Tsjerk]
fusso
International Hazard

Posts: 1522
Registered: 23-6-2017
Location: ∥ universe
Member Is Offline

Mood: Mood

 Quote: Originally posted by Tsjerk @fusso; what do you mean with that post? It is a guy who registered in 2011 and he didn't post anything. So what?
I said something wrong?!:O

Tsjerk
International Hazard

Posts: 1594
Registered: 20-4-2005
Location: Netherlands
Member Is Offline

Mood: Mood

No, nothing wrong, I just don't understand what you mean with that screenshot.

Ah ok, it is a post but his count is on zero, well, I wouldn't worry too much about it.

[Edited on 14-10-2018 by Tsjerk]
fusso
International Hazard

Posts: 1522
Registered: 23-6-2017
Location: ∥ universe
Member Is Offline

Mood: Mood

 Quote: Originally posted by Tsjerk No, nothing wrong, I just don't understand what you mean with that screenshot. Ah ok, it is a post but his count is on zero, well, I wouldn't worry too much about it. [Edited on 14-10-2018 by Tsjerk]
My date format is yy/mm/dd, so that bot registered in 2018, not 2011.

The screenshot shows the bot's spam post but its post count is 0, and these 2 observations contradict each other.
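The mix-up over the registration year is easy to reproduce. In this small sketch the date string is a hypothetical example (the screenshot's actual value is not reproduced in the thread); it shows the same text giving two different years depending on the assumed field order.

```python
from datetime import datetime

# Hypothetical date string, not the value from the screenshot.
stamp = "18/10/11"

# Read as yy/mm/dd (the format described above).
as_ymd = datetime.strptime(stamp, "%y/%m/%d").date()

# Read as dd/mm/yy (the board's default style, e.g. 14-10-2018).
as_dmy = datetime.strptime(stamp, "%d/%m/%y").date()
```

Here `as_ymd` falls in 2018 and `as_dmy` in 2011, which is the kind of confusion described above.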

[Edited on 18/10/14 by fusso]

Tsjerk
International Hazard

Posts: 1594
Registered: 20-4-2005
Location: Netherlands
Member Is Offline

Mood: Mood

Ah, then I understand why I didn't get it

Edit: All spam is disappearing in minutes!

[Edited on 15-10-2018 by Tsjerk]
Melgar
Anti-Spam Agent

Posts: 2002
Registered: 23-2-2010
Location: NYC
Member Is Offline

Mood: Aromatic

It probably catches a new spam post every 5-10 minutes. There was a bug in my script that would make it stop running if several unlikely things happened at once. It's fixed now though, so that shouldn't happen anymore.

I don't know why that user has no posts, but I recognize that user as a spammer, since I've been periodically checking the logs. And I'm not going to worry about a spammer's post count.

phlogiston
International Hazard

Posts: 1262
Registered: 26-4-2008
Location: Neon Thorium Erbium Lanthanum Neodymium Sulphur
Member Is Offline

Mood: pyrophoric

Melgar, THANK YOU! It is such a joy to find nearly only non-spam in 'today's post' and browse for interesting contributions again.

-----
"If a rocket goes up, who cares where it comes down, that's not my concern said Wernher von Braun" - Tom Lehrer
fusso
International Hazard

Posts: 1522
Registered: 23-6-2017
Location: ∥ universe
Member Is Offline

Mood: Mood

@Melgar Can your program detect spam posts in other threads?

j_sum1

Posts: 4673
Registered: 4-10-2014
Location: Oz
Member Is Offline

Mood: Metastable, and that's good enough.

 Quote: Originally posted by fusso @Melgar Can your program detect spam posts in other threads?

No.
Report those. Or, preferably, send a mod a u2u. These will need a manual deletion. The volume of "reported post" u2us has gone way down but I still don't read all of them if the board is looking clear of spam. (I presume the other mods are the same.) If you want to attract our attention then a specific message will work better.
Melgar
Anti-Spam Agent

Posts: 2002
Registered: 23-2-2010
Location: NYC
Member Is Offline

Mood: Aromatic

 Quote: Originally posted by fusso @Melgar Can your program detect spam posts in other threads?

It could, but doesn't. The main reason is that I haven't seen enough of those types of posts to come up with a good method of identifying them. A spam post would have to stay up long enough for me to write the code to detect it and then test that code on it to make sure that everything works properly. By the time that all could take place, the spam is typically gone, thanks to the quick action of a mod.

I've noticed that spam reports are about 10x as frequent when the script has been temporarily taken down, so I'm assuming that the occasional spam that appends itself to other threads is a pretty minor problem. I could come up with rules to use to eliminate it easily enough, but a few of those posts would need to stay up long enough to make sure those rules work to identify them. I'll leave it up to others to determine whether it's worth a coordinated effort to address.

Mr. Rogers
Hazard to Others

Posts: 179
Registered: 30-10-2017
Location: Ammonia Avenue
Member Is Offline

Mood: No Mood

Does the sign-up system use CAPTCHAs? I don't recall. These posts don't look like real people. It's like generic Viagra SPAM.

[Edited on 21-10-2018 by Mr. Rogers]
fusso
International Hazard

Posts: 1522
Registered: 23-6-2017
Location: ∥ universe
Member Is Offline

Mood: Mood

 Quote: Originally posted by Mr. Rogers Does the sign-up system use CAPTCHAs? I don't recall. These posts don't look like real people. It's like generic Viagra SPAM. [Edited on 21-10-2018 by Mr. Rogers]
No, no antispam features in registration.

[Edited on 181021 by fusso]

fusso
International Hazard

Posts: 1522
Registered: 23-6-2017
Location: ∥ universe
Member Is Offline

Mood: Mood

SPAMMERS HAVE EVOLVED

SPAMMERS HAVE EVOLVED TO PRETEND TO BE CHEMISTS

Melgar
Anti-Spam Agent

Posts: 2002
Registered: 23-2-2010
Location: NYC
Member Is Offline

Mood: Aromatic

 Quote: Originally posted by fusso SPAMMERS HAVE EVOLVED TO PRETEND TO BE CHEMISTS

My guess is it's just copying text from some other thread. BotKilla would get that one anyway, because it's linking to an unrecognized domain, and it registered within the last 48 hours.
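A minimal sketch of how those two conditions could be combined is below. The allowlist of "recognized" domains, the URL pattern, and the function names are all assumptions for illustration; how BotKilla actually decides which domains it recognizes is not described here.

```python
import re
from datetime import datetime, timedelta
from urllib.parse import urlparse

# Hypothetical allowlist; the real list of recognized domains is not given.
KNOWN_DOMAINS = {"sciencemadness.org", "wikipedia.org", "webqc.org"}

URL_RE = re.compile(r"https?://[^\s<>]+")


def unrecognized_domains(text):
    """Return the set of linked hostnames that are not on the allowlist."""
    found = set()
    for url in URL_RE.findall(text):
        host = (urlparse(url).hostname or "").lower()
        # Accept the listed domain itself and any of its subdomains.
        known = any(host == d or host.endswith("." + d) for d in KNOWN_DOMAINS)
        if host and not known:
            found.add(host)
    return found


def new_account_with_unknown_link(text, registered, now):
    """True if the account is under 48 hours old and links to an unknown domain."""
    is_new = now - registered < timedelta(hours=48)
    return is_new and bool(unrecognized_domains(text))
```

Because the check looks only at the links and the account age, copying believable chemistry text from another thread does not help the spammer.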

fusso
International Hazard

Posts: 1522
Registered: 23-6-2017
Location: ∥ universe
Member Is Offline

Mood: Mood

Quote: Originally posted by Melgar
 Quote: Originally posted by fusso SPAMMERS HAVE EVOLVED TO PRETEND TO BE CHEMISTS

My guess is it's just copying text from some other thread. BotKilla would get that one anyway, because it's linking to an unrecognized domain, and it registered within the last 48 hours.
Even if it's only copying, it does seem to know what it should copy.
