Comment by Dario Giovannetti (kynikos) -
Thursday, 21 January 2016, 03:57 GMT
Comment by Shulhan (sulhan) -
Thursday, 21 January 2016, 08:05 GMT
> The malicious accounts are created serially after few
> minutes, so the captcha is arguably broken.
So the problem is not the user creation process? AFAICR, the wiki has
an option to confirm users by email, right?
Comment by Jakub Klinkovský (lahwaacz) -
Thursday, 21 January 2016, 08:50 GMT
Comment by Pierre Schmitz (Pierre) -
Monday, 01 February 2016, 20:02 GMT
Due to ongoing spam that cannot be defeated even by more
Arch-specific Captchas, I had to disable account creation for now. The
spammers started to post large content, which could lead to a
denial-of-service attack. I'll see if I can come up with a cleverer
solution.
Comment by Shulhan (sulhan) -
Friday, 05 February 2016, 05:52 GMT
I just remembered something: did the spam start after the MediaWiki
upgrade? If so, I think one solution is to either
downgrade it to the previous version or upgrade it to the latest
stable release.
Comment by brian downing (bsd) -
Wednesday, 10 February 2016, 12:13 GMT
Since the Captcha is not rotating, it's always the same. That's not
much of a captcha.
Even if the community were to create a list of Arch captchas, it
would be trivial for the vandal to code for the list of questions
and valid responses.
Spammer -> vandal (not really spam, but vandalism)
The Arch-specific Captcha is "cute", but if it is not effective,
maybe it's time to replace it with a more standard captcha.
Comment by Vladimir Panteleev (CyberShadow) -
Thursday, 11 February 2016, 14:16 GMT
Hi,
I made an anti-spam plugin for a wiki I administer,
http://wiki.dlang.org/. It's also a domain-specific question/answer
CAPTCHA, but the questions are randomly generated. After some tweaking,
we've had zero spam since then.
Here's my code:
https://github.com/CyberShadow/DCaptcha-MW/blob/master/DCaptcha.php
https://github.com/CyberShadow/dcaptcha/blob/master/dcaptcha.d
(It had a much larger variation of questions before, but people
complained that some may have been too hard and would scare away
newbies.)
> The Arch specific Captcha is "cute" but if it is non
> effective, maybe its time to replace it with a more standard
> captcha.
Unfortunately, in my experience, you will have a much worse time
with a standard CAPTCHA. I think spammers can buy 1000 reCAPTCHA
solutions for $5 or so.
Let me know if I can help.
Comment by brian downing (bsd) -
Thursday, 11 February 2016, 20:45 GMT
What about Google's reCAPTCHA? Is that compromised as well?
Comment by Jakub Klinkovský (lahwaacz) -
Thursday, 11 February 2016, 20:56 GMT
In [8] in one of the comments above, reCAPTCHA is evaluated as
having low effectiveness at stopping spam.
But the QuestyCaptcha module in the "official" ConfirmEdit
extension might be a good alternative (it would probably solve the
problem with rotation), though it is still in beta state. And of
course we'd need to build the database of questions ourselves.
Comment by Vladimir Panteleev (CyberShadow) -
Thursday, 11 February 2016, 21:04 GMT
Yes, that's the one I meant. But to clarify, it's not
"compromised" in a technical or algorithmic sense. Spambot
operators can buy in bulk CAPTCHA "solutions" - i.e. access to an
API (which they can plug into their spam botnet) which connects
CAPTCHA-defended websites with humans who can solve them cheaply.
Given the cost of labor in countries such as China, this is very
cost-effective and allows them a net profit. They will also do the
same for static challenge/response questions, which is probably
the reason for the recent spam wave discussed here.
The important part here is that, with few exceptions, spammers are
not going to be bothered to customize their spamming software for
individual websites - the cost/benefit ratio is too high. Thus, the
registration form must present a challenge that can neither be
outsourced to somewhere else for low pay (as reCAPTCHA can) nor be
defeated with a one-time effort (as a static challenge such as the
current one can).
Comment by Vladimir Panteleev (CyberShadow) -
Thursday, 11 February 2016, 21:08 GMT
We've used QuestyCaptcha with a set of questions on wiki.dlang.org
before. It didn't work too well.
The issue is that there is almost zero penalty for failing a
CAPTCHA challenge. Thus, if you have 100 questions and the spammer
can solve (or outsource solving) one, they get a 1% success rate.
Given how many requests a spam botnet can make to the wiki server,
it will still be enough to flood the wiki with spam.
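To put rough numbers on this, here is a minimal Python sketch. It assumes each registration attempt is shown one question picked uniformly at random and that a failed attempt costs the bot nothing; the function name and the figure of 10,000 attempts are illustrative assumptions, not measurements from any wiki.

```python
def expected_spam_accounts(total_questions, known_questions, attempts):
    """Expected number of successful registrations.

    Assumes (hypothetically) that every attempt draws one question
    uniformly at random and that failures carry no penalty.
    """
    if total_questions <= 0:
        raise ValueError("total_questions must be positive")
    # Success rate is known/total; multiply by the number of attempts.
    return known_questions * attempts / total_questions


# 100 questions, 1 known answer (a 1% success rate), 10,000 attempts.
print(expected_spam_accounts(100, 1, 10_000))  # 100.0
```

Under these assumptions, adding more static questions only divides the spam volume by a constant factor, which the botnet can offset by sending more requests.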
Comment by Vladimir Panteleev (CyberShadow) -
Thursday, 11 February 2016, 21:48 GMT
An example of something that could work is to modify the current
challenge (output of "pacman -V|base64|head -1") by adding a
randomly generated string, so it becomes "(echo hhE6qhrQQ8;pacman
-V)|base64|head -1". Although simple, this requires writing custom
(site-specific) code to defeat, which is unlikely to happen.
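A rough Python sketch of how a server might generate and check such a challenge follows. It is not taken from any existing extension: the function names are hypothetical, the server is assumed to hold a reference copy of the "pacman -V" output, and base64 is assumed to wrap its output at 76 characters (the GNU coreutils default), so "head -1" yields the first 76 characters.

```python
import base64
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def make_challenge(nonce_length=10):
    """Return a fresh random nonce and the command the user should run."""
    nonce = "".join(secrets.choice(ALPHABET) for _ in range(nonce_length))
    command = f"(echo {nonce};pacman -V)|base64|head -1"
    return nonce, command


def expected_answer(nonce, pacman_output):
    """Compute what the command would print.

    pacman_output is the server's reference copy of `pacman -V` as bytes
    (an assumption: real output may differ between pacman versions).
    """
    data = (nonce + "\n").encode("ascii") + pacman_output  # echo adds "\n"
    encoded = base64.b64encode(data).decode("ascii")
    return encoded[:76]  # first line of base64 output wrapped at 76 columns


def check_answer(nonce, submitted, pacman_output):
    """Compare the user's submission with the expected first line."""
    return submitted.strip() == expected_answer(nonce, pacman_output)
```

Because the nonce changes with every registration form, a stored answer from an earlier attempt no longer matches, which is the property the proposal relies on.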
Comment by brian downing (bsd) -
Friday, 12 February 2016, 10:09 GMT
If there is a dedicated individual, or group, that is determined
to cause damage through the creation of accounts, then even a
perfect CAPTCHA is only going to slow them down.
We should be looking at the attack vector(s) and thinking of ways
to stop the attack as well as slow the attackers down.
Comment by Vladimir Panteleev (CyberShadow) -
Friday, 12 February 2016, 11:06 GMT
Brian, spammers are not going to target the Arch Wiki
specifically. Their goal is to target as many websites at once as
possible with a minimum per-website effort. If you are dealing
with a targeted attack (and this is not the case here), then that
is a completely different situation requiring different
counter-measures.
I don't see any fixable attack vectors here to speak of. IP
blacklists / DNSBLs do not work because botnets grow faster than
these lists can keep up. Heuristics involving JavaScript checks,
UA sniffing etc. can all be defeated en masse (unless Arch Wiki
implements a custom registration form, but even that can be defeated
by outsourcing registration to a human).
Additionally, I would not recommend exploring "slowing them down"
as a pursuable direction. Even though the CAPTCHA is often solved
by humans, the spamming is done by bots, and it will continue to
eat into wiki editors' and administrators' time until solved.
Comment by brian downing (bsd) -
Friday, 12 February 2016, 12:39 GMT
I was under the impression this was a targeted attack. How can you
be certain that it is not?
Since there was an Arch-specific captcha, they would have had to go
to a certain amount of effort to answer it the first time.
That doesn't sound like an easy drive-by, nor low-hanging
fruit.
Slowing them down does make it more expensive in time and
pennies, and if they're looking for easy drive-bys, then slowing
would help to some degree. Looks like there is no **perfect**
solution, but the collection of measures will be additive.
If they can afford to hire humans, then the CAPTCHA has little
effect other than slowing them down and adding cost, however
little it may be.
Comment by Vladimir Panteleev (CyberShadow) -
Friday, 12 February 2016, 13:36 GMT
> I was under the impression this was a targeted attack. How
> can you be certain that it is not?
Well... probably better to ask, targeted by whom, and to what
degree? As in, who had to spend time to target the Arch Wiki, and
how much? It is entirely possible that the spambot operator did
not have to perform any action to target the Arch Wiki
specifically. I'm not certain, and what follows is mostly
conjecture as far as it concerns this case, but this is what I've
seen:
It is known that spambot operators use CAPTCHA-solving services
(there is even a common API), so I think it's very likely that a
similar service exists for Questy-like CAPTCHAs. For example, the
forum spam software XRumer has a database of 170,000 questions and
answers. Services such as Amazon's Mechanical Turk are often used
for defeating such challenges.
Another trick that is used on some shady websites is that in place
of using a CAPTCHA service directly, they connect to a service
which sends you other websites' CAPTCHAs. Thus, the user on the
shady website (A) is actually solving the CAPTCHA on a
registration form of a different victim website (B). Apart from a
delay, this is completely transparent, so in our case it could've
appeared as, "In the context of <Arch Linux Wiki>, what is
the answer to the question: What is the output of `pacman ...`?"
Website A still knows whether the CAPTCHA answer is correct
(because website B provides that), and as a result website A's
operator wins a bit of money and website B gets spammed.
So, the person who solved the CAPTCHA (running Arch Linux, or
finding the answer on Google, or asking an Arch Linux user) may
not have even known what the answer would be used for.
> Slowing them down does makes it more expensive in time and
> pennies, and if they're looking for easy drive-bys, then slowing
> would help to some degree.
Indeed, but it's important to target the most costly areas. CPU
time or bandwidth on a botnet you own is effectively free. The
most expensive part is almost surely the human involvement. Any
Arch Linux user can solve the CAPTCHA I proposed once, but to
defeat my proposal one would need to write custom code.
Anyway, what would you propose? I could have a go at implementing
the QuestyCaptcha plugin I proposed if you like.
Comment by David Runge (dvzrv) -
Friday, 01 April 2016, 23:29 GMT
I've had similar issues recently and solved them by blocking the
attackers' outdated Chrome version (it was the same across
many different IPs) with a simple http_user_agent rule in
nginx.
I know this is probably the most brutal way of doing it, but the
attackers used a version of Chrome from way way back in the day,
so it was safe to assume that no real users would be harmed by
this choice.
if ($http_user_agent ~ Chrome/<attackers chrome version>) {
    return 403;
}
Comment by Jakub Klinkovský (lahwaacz) -
Sunday, 22 May 2016, 12:14 GMT
Update of the current state: the account creation has been
suspended again due to another spam wave on May 19.
Since the wiki's CAPTCHA has zero effect even with very
site-specific questions, we should take the approach of the
Wikimedia Foundation and prevent the vandalism from ever
happening, instead of assuming good faith of all created accounts.
The AbuseFilter extension, suggested in the very first post,
provides a ready-to-go solution, which is extensively tested on
Wikipedia and its sister projects, where even anonymous editing is
allowed. Besides blocking abusive changes to articles' content, it
can also be used to throttle actions of new accounts (if the
server uses memcached), which would give us a chance to continually
improve the filter without having to worry about extensive
cleanup. I hope that this will help to restore normal account
registration once and for all.
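For illustration only, the idea of throttling new accounts can be sketched as a sliding-window counter in Python. This is not AbuseFilter's implementation or configuration syntax; the class name and parameters are hypothetical, and an in-process dictionary stands in for a shared cache such as memcached.

```python
import time
from collections import defaultdict, deque


class NewAccountThrottle:
    """Allow at most max_actions per account within window_seconds."""

    def __init__(self, max_actions, window_seconds, clock=time.monotonic):
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self._history = defaultdict(deque)  # account -> action timestamps

    def allow(self, account):
        """Record and permit the action, or refuse it if over the limit."""
        now = self.clock()
        history = self._history[account]
        # Forget actions that have fallen out of the time window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_actions:
            return False
        history.append(now)
        return True


# Example: a new account may make 2 edits per 60 seconds.
throttle = NewAccountThrottle(max_actions=2, window_seconds=60)
print([throttle.allow("new_user") for _ in range(3)])  # [True, True, False]
```

A limit like this caps how much a single fresh account can damage before an administrator or a content filter reacts, which is what keeps the cleanup small.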