public virtual MemoryStream: Anti-BlogSpam Measures

Thursday, November 20, 2003 12:25 AM

Anti-BlogSpam Measures

Current anti-blogspam measures focus on keeping spam out of weblogs. I'm beginning to wonder if this is the correct approach.

Why? Because this only gives the spammers an incentive to spam even more. I would guess a lot of blogspam currently goes undetected by blog owners and stays on their weblogs doing exactly what the spammers want it to do: improve their Google pagerank. Suppose anti-spam measures get better and more weblog owners start automatically or manually removing this filth from their systems... Suppose we get rid of 9 out of every 10 blogspam messages... or suppose it's 99 out of 100... it won't matter. The spammers will simply counteract this by spamming more. It's the same tactic as with email spam. Only a small percentage is effective, so they up the scale to where even a small percentage will make a big difference.

So how can we fight this? Approaches like Joseph Duemer's or Adam Kalsey's may hit them where it hurts, but I don't think they scale very well. These tactics simply take way too much time and effort on the part of the weblogger to have much of an effect against this plague. Jay Allen's MT-BlackList does a great job at detecting spam, but some manual tweaking is still needed and as I stated before, the more effective we get at blocking spam, the more blogspam will be thrown at us.

But what if, instead of removing the blogspam, we simply replaced its link targets? We'll keep the spam, but instead of having a "viagra" link pointing to their site, we'll change it to point to one of a number of standard urls we all agree upon. As anti-blogspam measures get more effective and we detect more blogspam, more links to these urls will be generated, poisoning what the spammers covet most: their Google pagerank. Left unchecked by the spammers, soon these urls will have a higher pagerank than their own sites.

This will leave the spammers with only one solution: stop spamming weblogs that employ this tactic. It will force them to blacklist us and leave us alone, for if they don't, they'll get exactly the opposite result from what they're after.
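
The link-replacement step described above could look roughly like this in Python. This is only a sketch, and it assumes the comment has already been flagged as spam by something like MT-BlackList. The decoy urls, the function name and the regex-based matching are all made up for illustration; a real plugin would probably want a proper HTML parser.

```python
import re
from itertools import cycle

# Hypothetical decoy urls. The post only says "standard urls we all agree upon";
# these placeholders are not real targets.
DECOY_URLS = [
    "http://decoy.example/one",
    "http://decoy.example/two",
]

# Matches href="..." or href='...' and captures the prefix, the quote and the target.
HREF_RE = re.compile(r'(href\s*=\s*)(["\'])(.*?)\2', re.IGNORECASE)


def replace_link_targets(comment_html, decoys=DECOY_URLS):
    """Return comment_html with every link target swapped for a decoy url.

    The comment text and link text are kept; only the targets change.
    Decoys are handed out in rotation, starting from the first on each call.
    """
    pool = cycle(decoys)

    def swap(match):
        prefix, quote = match.group(1), match.group(2)
        return f"{prefix}{quote}{next(pool)}{quote}"

    return HREF_RE.sub(swap, comment_html)


if __name__ == "__main__":
    spam = '<a href="http://spam.example/pills">viagra</a>'
    print(replace_link_targets(spam))
```

The spam text stays on the page, so search engines still see a "viagra" link, but it now points at a decoy rather than at the spammer's site.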

Comments

I think you're right, to some extent. As Dave did with his referrals, just preventing comments from being indexed by search engines (via robots.txt or the robots meta tag) would render spam useless. *However* (and this is a big if), it would require *everyone* to do so. Just as you say, if 1% of blogs don't prevent search engines from indexing their content, you'll still see spammers.

Posted by Michael Fagan at November 20, 2003 1:13 AM

And what's different about that approach that will prevent an uptick in spam? If, as you say, spammers will increase their efforts as anti-spam methods become effective, then why is this any different? Under your scenario, spammers would send out even more spam in hopes of counteracting the PageRank redirection of anti-spammers.

You are also assuming that the only purpose behind blog spam is and always will be to increase PageRank. What about other types of spam? How would you propose we combat that?

Posted by Adam Kalsey at November 20, 2003 1:15 AM

Adam: what's different about this approach is that if applied at a large scale, it forces spammers to be selective about where they spam. Currently a brute force approach works great for blogspammers because sites that block their spam don't hurt them. If however weblogs start changing the blogspam links, a brute force approach may generate more links to the blogspam-decoy urls than it will to the spammer's own sites, thus they would hurt their pagerank instead of helping it. The only way to counteract this would be for the spammers to selectively spam to sites that do not employ this tactic. Yes, it would not stop them completely, but it would keep them away from your weblog.

I'm not sure about other types of spam - currently blogspam seems to be focused on pagerank only. When they expand their focus, we'll have to likewise change our tactics to fight them.

Posted by Luke Hutteman at November 20, 2003 1:27 AM

These links could go to sites about anti-spam techniques, stopping spam, and so on. Or if it's selling a product, maybe to a site explaining why not to buy products that use spam to promote themselves. Once enough such links gather, that could affect the product manufacturers themselves. The key is to get the corporations and businesses involved whose products the spam is promoting. I get tons of spam selling things like Norton software, and others from big companies. I'm pretty sure these companies don't want their products promoted in such a way.
Even Viagra... after all, Pfizer is a multibillion dollar company; they should police how their products are being sold and advertised on the net. They arrest people selling them on the streets after all...

All those p*nis enlargement spams can be redirected to a site explaining how none of these products actually work and so on... Maybe a collaborative anti-spam blog, and have products like mt-blacklist replace spamlinks with links to that.

It should work to some extent.

Posted by KO at November 20, 2003 6:10 AM

Another approach could be to generate a small image and require users to enter the text they see in the image before they can leave a comment. This seems like a pretty simple way to prevent automated blog-spamming and could certainly be used in conjunction with your link-replacement approach, if needed.

Posted by Josh Christie at November 20, 2003 9:18 AM

I'm not sure I'm understanding why this approach is better than Jay Allen's. Your comment about the 'manual tweaking' needed for MT-Blacklist for example. Surely there would still be some manual tweaking involved with your approach as well. Am I missing something?

Posted by Kev Spencer at November 20, 2003 11:24 AM

This approach is not one to replace MT-BlackList, but rather would be an addition to MT-BlackList and other current automated ways to identify blogspam. We would still need something like MT-BlackList to initially identify blogspam so that it can be handled in an automated fashion. The difference is that where MT-BlackList currently removes spam, I propose instead altering it to attack their pagerank.

Simply blocking spam doesn't punish the spammer. There is currently no incentive for spammers to try and avoid weblogs that have MT-BlackList installed. As more webloggers start using tools like MT-BlackList, blogspammers will simply start spamming more to keep generating the same amount of links. And with more and more blogspam arriving at your weblog, you're forced to keep manually updating your blacklist to fight it.

If, however, instead of blocking their spam, a plugin simply altered its links to point to a number of standard urls, there would be a price for the spammers when they target these blogs, as they would make their pagerank go down instead of up. The only way out for the spammers would be to blacklist these blogs and stop spamming them. They'd have to instead specifically target weblogs that do not employ this tactic.

Posted by Luke Hutteman at November 20, 2003 11:56 AM

Hi Luke - I'm afraid I don't see this happening.
I can't believe anyone would _leave_ spam on their site, even if the urls do point elsewhere. I wouldn't have hundreds of bits of crap on my site all pointing to a genuine Viagra company - that would defeat the point of anti-spam measures. I don't care where the links point, I just don't want pointless posts cluttering up my site.

I can see your point about trying to fight them on another front, but really, this would hurt the blog owner as much as the spammer, and you'd have to go to the trouble of doing it manually. Why manually? Well think about it, if you could do it automatically then you must have a system for identifying the spam that has sneaked past blacklist - and if that's the case, why not add that recognition ability to blacklist and stop them at submission?

Well done for putting your idea out there, but I can't see any merit in it I'm afraid (all my ideas got shot down as well :o)

Cheers - Dunstan

Posted by Dunstan at November 20, 2003 12:46 PM

BTW, this was my idea, which was along the same lines as yours, i.e. neutralise the spam:
http://www.1976design.com/blog/archive/2003/11/16/50/

And like your idea, it could be combined with blacklist to pick up the stragglers.

Posted by Dunstan at November 20, 2003 12:51 PM

I like it.

Posted by Tim Marman at November 20, 2003 1:09 PM

Dunstan: your anti-spam idea certainly removes the reason for blogspam, but unless everyone would do this, spammers will keep coming. And since neutralized comments don't do them any harm, there's no incentive for them to try and avoid your weblog.

Keeping spam comments on your blog is indeed something that seems counterintuitive. After all, you want that filth removed from your blog. But there's nothing stopping you from changing the style of these comments to make them less readable and less obtrusive for your human visitors. You could put it in a small font, color it to blend into the background, use strikethrough, etc. The text and (changed) link will still be there for search engines to pick up, but your real visitors won't be inconvenienced much.

Also, the idea is that if this were to take off, blogspammers would be forced to start actively avoiding your weblog, meaning there won't be much spam to worry about in the first place.

Posted by Luke Hutteman at November 20, 2003 2:01 PM

Adding to what Luke said just before, you could put all the unreadable-but-still-hyperlinked blogspam in a special area, like after all the non-spam comments. Display them with #blogspam {line-height:0%;color:white}, or even better #blogspam {display:none}, and they will not even be obnoxious.

Posted by Tom Passin at November 21, 2003 12:04 AM

Why don't you just use a redirecting technology for links posted on your site?

For example, if someone puts a link to http://msdn.microsoft.com/architecture/patterns/esp/, let the blog rendering software instead make a link to some local http://www.hutteman.com/redirect.pl?http://msdn.microsoft.com/architecture/patterns/esp/ or even put the links in a list so they won't show up at all and have an ID number in the redirector instead, like http://www.hutteman.com/redirect.pl?10234.

With this approach, posting stuff in your blog won't help the spammers' page ranking at all, and the more blogs use this, the less inclined blogspammers should be to continue their stuff, since it won't help them anymore.

just my 2 cents...
Sam

Posted by Sam at November 21, 2003 2:34 AM
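
Sam's two redirector variants above could be sketched in Python along these lines. The redirect.pl address is taken from his example, but the script itself is hypothetical, and the class and function names are invented here. Unlike his example, the direct variant percent-encodes the target so it survives as a query string. Whether a search engine passes any rank through the redirect depends on how the redirector responds, which this sketch does not cover.

```python
from urllib.parse import quote

# Address from the comment's example; the script behind it is hypothetical.
REDIRECTOR = "http://www.hutteman.com/redirect.pl"


def direct_redirect_url(target):
    """Variant 1: carry the (percent-encoded) target in the query string."""
    return f"{REDIRECTOR}?{quote(target, safe='')}"


class LinkTable:
    """Variant 2: keep targets in a list and expose only a numeric id."""

    def __init__(self, first_id=1):
        self._id_by_target = {}
        self._target_by_id = {}
        self._next_id = first_id

    def redirect_url(self, target):
        # Reuse the existing id if this target was seen before.
        if target not in self._id_by_target:
            self._id_by_target[target] = self._next_id
            self._target_by_id[self._next_id] = target
            self._next_id += 1
        return f"{REDIRECTOR}?{self._id_by_target[target]}"

    def lookup(self, link_id):
        """What the redirector would resolve an id to (None if unknown)."""
        return self._target_by_id.get(link_id)


if __name__ == "__main__":
    table = LinkTable(first_id=10234)
    url = "http://msdn.microsoft.com/architecture/patterns/esp/"
    print(direct_redirect_url(url))
    print(table.redirect_url(url))
```

With the id variant the original url never appears in the page at all, which is the point Sam makes about the links not showing up.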

I once had a weblog. Long ago. I don't keep that as a hobby any more, but I continue to read others' writings. Because I do not keep a weblog anymore, this new problem of blog spamming doesn't concern me directly, but you have come up with a very well thought out, promising solution to this problem. Even though this concerns only bloggers right now, I think that your solution and the thinking behind it are vital to solving the overall internet problem of spam. The anti-SPAM measures in the capital right now won't solve anything; they will just create more overseas SPAM projects and make the spammers sneakier. Very good, Luke. I like the way you think. For the first time, I have heard someone present a viable end solution to spam of any sort. Everyone else is concerned with simply blocking it.

Thanks, Dave, for the link.

Posted by Josh at November 22, 2003 4:28 PM

Many posters seem to be missing the point. For different reasons, I do not think this will catch on, but the idea is that if you employ this technique spammers are actually harming themselves, rather than just being eliminated. This would make spamming the site counterproductive, not just useless. The reason I doubt this would work is that it requires a critical mass or centralized control. A way to subscribe to spammers' urls or ip addresses, so that the system is automagically updated, would be helpful. Additionally, an agreement with Google for a way to have the links not appear (display: none;), but be processed for page rank, would be helpful. What about a way to tag something as blogspam for Google's robots or Feedster? It could happen...

Posted by theCoach at November 24, 2003 3:27 PM

Isn't it amusing that an entry with this particular topic would attract spam comments four months after being published?

Posted by BillSaysThis at March 17, 2004 9:45 PM

spam? where?

Posted by Luke Hutteman at March 17, 2004 11:44 PM

http://www.hutteman.com/weblog/2003/11/20-144.html#comment-1274

At least, it showed up in my SR feed... LOL

Posted by BillSaysThis at March 18, 2004 12:49 AM

oh they were here alright, but they never get to stay for long... or come back with the same url for that matter (thanks to mt-blacklist). But since I just had to remove three more blogspams on this thread, I think I'll shut it down...

Posted by Luke Hutteman at March 18, 2004 9:28 AM

This discussion has been closed. If you wish to contact me about this post, you can do so by email.