Bots are Both a Strength and a Vulnerability
In the world of ICOs, Telegram is where the tokens are. It's now a must-use platform to engage the public, as Telegram metrics are featured heavily on the charts of ICO analytics sites. If you are a founder running an ICO you absolutely must have constant engagement with it. Telegram offers a wide-open array of flexible tools to help you automate this work, but we've discovered (and neutralized) one previously unaddressed critical attack point.
Bots are now commonplace; users expect them to be a part of every community on Telegram, Slack, Facebook, etc. They help us interact with the service in a more relaxed and meaningful way, they support the community by enforcing rules, and they perform small routine tasks so we don't have to. As a channel admin you definitely want service bots, and they are easy to set up. Have a look at https://core.telegram.org/bots to get started.
The Telegram bot can manage your account, send messages in bulk, and even interact with other bots within channels. There are now thousands of airdrops all over Telegram, offering perks and tokens to users who engage with the channels. This raises the obvious question as to the best way to automate these processes to achieve maximum return on your resources spent. You want the "Bounty Hunters" to join in, but you also have to defend against a handful of bad actors automatically registering tens of thousands of accounts.
Here we tell the normally untold story of running an ICO on Telegram, how we got hammered by botnets that filled our channel to 100,000 Users within 2 ½ days, and the way we finally managed to cut them down by using the Telegram API.
Smashed by Bots Straight to the Max Cap
At DREAM our plan was to launch straight into our token sale by offering an airdrop to our loyal community. It was designed in part to reward them for helping bring our project all this way, and to celebrate our collaboration with our Indorse.io partners. We deployed our signup bot, and invited the first 10,000 users who registered themselves (and remained until the time of our token sale), offering them a modest token drop as a gesture of appreciation.
As they say, "No good deed goes unpunished". When we began our cross-platform promotion of DREAM.ac with Indorse.io, we invited the public to sign up with both and join our Telegram channel. A couple of hours into the process we started to receive traffic, and we were excited to see people joining. We started out at ~780 long-term members, and by the time we went to bed we had hit 1500. When we got up the next morning, we celebrated having passed 7000 members.
That joy soon turned to dread as it became apparent that black hats were hitting us with thousands of bogus signups per hour. We were under attack by signup botnets! The pace was relentless, picking up speed to ~15 joins/min and maxing out at ~35/min. It absolutely hammered our channel, filling it to its max capacity of 100,000 in only 2 ½ days.
Getting the Upper Hand
Ultimately we prevailed. Through the Telegram API each account can register and pass on routine jobs to an automated script. Many libraries also provide convenient wrappers to make this task easy to perform. You don’t even need advanced developer skills to get the job done. Here is how we ultimately cut them down by using the Telegram API.
Separating the Wheat From the Chaff
How do you go about ensuring you aren’t going to drop all your airdrop-bounty reward tokens to a bunch of Telegram scraper-bots? And how do you do this without hitting the genuine supporters who legitimately signed up to grow your ecosystem? You have to look for bot “fingerprints”.
The hunt for clues was on. We started collecting the data we had available from our different sources and combining it into a dataset that would help us solve that problem.
We had registration logs from the channel giving us: timestamp information on the joins and identities, logged registration data from the signup bot we deployed, and the officially logged channel information from the Telegram API.
Loading this data gave us a nice overview of the channel's activity, but in its raw form it did not yet reveal who was a bot and who was not.
After we joined and cleaned the datasets to include only the relevant information, we started looking for clues that would allow us to pinpoint real signups with a high confidence, while punishing bots for what they are best at: speed!
To inform our models about bot-typical behavior, we manually extracted some features by looking at the distribution of the data itself and analyzing how specific behaviors varied with those features throughout our channel.
We found that many accounts never actually got activated, meaning they just came in, signed up for the drop, and never showed any other activity from that point on.
Having a look at how long it took all our accounts to register themselves (both with the channel and finish the signup interaction with our bot), we could clearly see a similar picture.
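A minimal sketch of that measurement, assuming each log has already been reduced to a mapping from Telegram user ID to a timestamp. The names `channel_joins`, `signup_done`, and `signup_durations` are hypothetical and the sample data is illustrative only; this is not the released code.

```python
from datetime import datetime


def signup_durations(channel_joins, signup_done):
    """Return seconds between joining the channel and finishing the bot signup.

    Both arguments map a user ID to a datetime. Users missing from the
    signup log are skipped, because no duration can be computed for them.
    """
    durations = {}
    for user_id, joined_at in channel_joins.items():
        finished_at = signup_done.get(user_id)
        if finished_at is None:
            continue
        durations[user_id] = (finished_at - joined_at).total_seconds()
    return durations


# Illustrative data only, not real log entries.
channel_joins = {
    1: datetime(2018, 1, 1, 12, 0, 0),
    2: datetime(2018, 1, 1, 12, 0, 0),
    3: datetime(2018, 1, 1, 12, 5, 0),
}
signup_done = {
    1: datetime(2018, 1, 1, 12, 0, 4),
    2: datetime(2018, 1, 1, 12, 9, 30),
}

durations = signup_durations(channel_joins, signup_done)
```

Sorting or plotting the resulting values is one way to see whether a cluster of implausibly fast signups stands apart from the rest.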
What it Means to be a Bot
Given that bot accounts were:
- the fastest to register
- the quickest to leave
- the least informative in their profiles
- the least active in the channels
- the most likely to never ever return even once
we did not even have to look at the highly suspicious email domains that were used to register with our bot.
Many were even careless enough to use one and the same wallet address for over 70 different accounts (but not as many as we would have hoped). Most fake accounts only registered 1 or 2 accounts with the same wallet address (indicating they were willing to swallow the bitter pill of later having to join all those mini transactions into a larger deposit). This made our job that much harder.
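Spotting reused wallets is a simple counting exercise. The sketch below assumes the signup log can be read as `(user_id, wallet_address)` pairs and that addresses are hex strings that can safely be compared in lower case; `shared_wallets` and the sample values are hypothetical.

```python
from collections import Counter


def shared_wallets(registrations, min_accounts=2):
    """Return wallet addresses used by at least `min_accounts` distinct users.

    `registrations` is an iterable of (user_id, wallet_address) pairs.
    Addresses are lower-cased so differently cased copies count as one.
    """
    # A set removes repeated submissions by the same user.
    unique_pairs = {(user_id, wallet.lower()) for user_id, wallet in registrations}
    counts = Counter(wallet for _, wallet in unique_pairs)
    return {wallet: n for wallet, n in counts.items() if n >= min_accounts}


# Illustrative data only.
registrations = [
    (101, "0xAAA"),
    (102, "0xaaa"),
    (103, "0xBBB"),
    (103, "0xBBB"),  # the same user registering twice is not reuse
]
reused = shared_wallets(registrations)
```

Raising `min_accounts` trades recall for precision: a high threshold catches only the blatant cases, while a threshold of 2 also catches the more careful operators described above.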
The problem you see here is typical in getting a model off the ground. You are missing labeled data with which to begin, right?
So what we did here (and of course this can be massively expanded upon when more time is available to dig into the model definition) was to come up with statistical measures of significant differences in behavior, an approach in the spirit of “unsupervised learning”.
Our manual approach brought us a result we were initially quite happy with. It brought down the number of accounts by almost 50%. A quick win for sure!
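One simple way to turn such fingerprints into a decision is to count how many of them each account shows. The sketch below is a simplified illustration and not the released code: the `Account` fields, the 10-second cutoff, and the `min_score` of 3 are all placeholder assumptions.

```python
from dataclasses import dataclass


@dataclass
class Account:
    user_id: int
    signup_seconds: float   # time from joining to finishing the signup
    messages_sent: int      # activity in the channel
    has_profile_info: bool  # e.g. a username or profile photo is set
    returned: bool          # seen again after the signup


def bot_score(account, fast_signup_seconds=10.0):
    """Count how many bot fingerprints an account shows (0 to 4)."""
    return sum([
        account.signup_seconds < fast_signup_seconds,
        account.messages_sent == 0,
        not account.has_profile_info,
        not account.returned,
    ])


def flag_bots(accounts, min_score=3):
    """Return the IDs of accounts showing at least `min_score` fingerprints."""
    return [a.user_id for a in accounts if bot_score(a) >= min_score]


# Illustrative data only.
accounts = [
    Account(1, signup_seconds=3.0, messages_sent=0, has_profile_info=False, returned=False),
    Account(2, signup_seconds=240.0, messages_sent=5, has_profile_info=True, returned=True),
    Account(3, signup_seconds=6.0, messages_sent=2, has_profile_info=True, returned=True),
]
flagged = flag_bots(accounts)
```

Requiring several fingerprints at once protects a genuine supporter who happens to sign up quickly, as account 3 does here.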
Deploying the Model
To free our channel from the bots’ grasp, we then wrote a quick management class that we could use to interact with the channel.
Its purpose was to:
- Collect information about the registered Users in our channel for model training
- Provide a convenient interface for connecting our user accounts to the Telegram API, so we could run the script in parallel and make the best use of our time before Telegram ultimately throttled us
- Ensure optimization measures to reduce the load on the API in cases when we had to interrupt and restart the process
- Build a solid foundation for future expansion to ensure we could pass on standard clean up processes to our bot
The class itself was a quick and painless wrapper around the Telethon library, adding the necessary login, connection, and caching routines to the API call methods.
The main function we needed to optimize was the loop over all marked accounts, which bans them from the channel by their IDs rather than their usernames.
Telegram does not directly provide a way to get a user by their Telegram ID, but instead offers a full user object that can be retrieved from the list of users in a given channel.
So first we implemented a function that would efficiently fetch the users from the channel, and then create a mapping that allowed us to efficiently retrieve a user object by its telegram ID in the ban loop that follows.
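The mapping step can be sketched as follows. The helper names `build_user_index` and `users_to_ban` are hypothetical, and `SimpleNamespace` objects stand in for real user objects; with Telethon the list would typically come from something like `client.get_participants(channel)`, whose user objects expose an `id` attribute.

```python
from types import SimpleNamespace


def build_user_index(users):
    """Map each user's numeric ID to the full user object."""
    return {user.id: user for user in users}


def users_to_ban(index, marked_ids):
    """Look up marked IDs, skipping any that are no longer in the channel."""
    return [index[user_id] for user_id in marked_ids if user_id in index]


# Stand-ins for the user objects a client library would return.
participants = [
    SimpleNamespace(id=11, username="alice"),
    SimpleNamespace(id=22, username=None),
    SimpleNamespace(id=33, username="carol"),
]
index = build_user_index(participants)
targets = users_to_ban(index, marked_ids=[22, 44])
```

Building the dictionary once turns every later lookup into a constant-time operation, which matters when the channel holds 100,000 members. It also works for accounts without a username, such as user 22 above.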
To speed up resets that occur (as we were running this function from our local machines, instead of taking the time to deploy it to our servers), we implemented a caching routine.
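A caching routine of this kind can be as small as a JSON file on disk. This sketch assumes the fetched data has already been converted to JSON-serializable dictionaries; `load_or_fetch`, `fake_fetch`, and the file name are hypothetical.

```python
import json
import os
import tempfile


def load_or_fetch(cache_path, fetch):
    """Return cached data if the file exists, else call `fetch` and store it."""
    if os.path.exists(cache_path):
        with open(cache_path, "r", encoding="utf-8") as handle:
            return json.load(handle)
    data = fetch()
    with open(cache_path, "w", encoding="utf-8") as handle:
        json.dump(data, handle)
    return data


calls = []


def fake_fetch():
    """Stand-in for the slow API call; records how often it runs."""
    calls.append(1)
    return [{"id": 11, "username": "alice"}, {"id": 22, "username": None}]


cache_path = os.path.join(tempfile.mkdtemp(), "participants.json")
first = load_or_fetch(cache_path, fake_fetch)
second = load_or_fetch(cache_path, fake_fetch)  # served from disk
```

After a restart only the first call pays for the API round trips. Deleting the cache file forces a fresh fetch when the member list has changed.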
Now we were finally set to deploy the actual banning function to the channel. Unfortunately the Telegram API does not provide an efficient way to pass a list of user IDs to apply a batch job, but instead requires a single interaction with the server for each action you want to take on a given user in your channel. This was for sure annoying and puzzling, but we pressed on.
This required us to throttle our requests to the server to an acceptable speed. That speed was hard to find in the official documentation, as only the limits for sending messages were explicitly stated. The approach we used was to catch the error payload we received from the server, note the sleep time it demanded, and calculate the ratio of that sleep time to the number of requests we had made in the meantime. With this approach we ended up with a smoothly running process at a somewhat “reasonable” speed.
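The throttling idea can be sketched like this. `RateLimited` is a hypothetical stand-in for the server's error (Telethon reports it as `FloodWaitError`, which carries the wait in a `seconds` attribute), and `run_throttled`, `max_retries`, and `fake_ban` are assumptions made for illustration rather than the released implementation.

```python
import time


class RateLimited(Exception):
    """Hypothetical stand-in for the server's flood-wait error."""

    def __init__(self, seconds):
        super().__init__(f"wait {seconds}s")
        self.seconds = seconds


def run_throttled(user_ids, action, sleep=time.sleep, max_retries=5):
    """Apply `action` to every ID, adapting the pause between requests.

    When the server demands a wait, that wait is divided by the number of
    requests that succeeded since the previous wait, and the result becomes
    the new per-request delay.
    """
    delay = 0.0
    sent_since_wait = 0
    for user_id in user_ids:
        for _ in range(max_retries):
            try:
                action(user_id)
            except RateLimited as error:
                sleep(error.seconds)  # honor the wait the server asked for
                delay = error.seconds / max(sent_since_wait, 1)
                sent_since_wait = 0
            else:
                sent_since_wait += 1
                break
        else:
            raise RuntimeError(f"giving up on user {user_id}")
        sleep(delay)
    return delay


attempts = []
slept = []


def fake_ban(user_id):
    """Pretend ban call that trips the rate limit once, on the fourth request."""
    attempts.append(user_id)
    if len(attempts) == 4:
        raise RateLimited(30)


final_delay = run_throttled([1, 2, 3, 4, 5], fake_ban, sleep=slept.append)
```

Passing `sleep` as a parameter keeps the demo instant; in real use the default `time.sleep` applies. The retry cap prevents an endless loop if the server keeps refusing the same request.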
We have released the code we used to accomplish this on GitHub as open source. We hope the community can put this to good use - because as crypto-enthusiasts we are all in the same boat!
– Frank Fichtenmueller, CTO
The DREAM platform isn’t just another untested beta program on a white paper… It’s live and being used right now to hire blockchain professionals. The token sale will enable DREAM’s innovative team to take DREAM to the next level by integrating A.I. and incorporating our platform token.
- Telegram: https://t.me/dreamtoken
- Twitter: https://twitter.com/DREAM_Ecosystem
- Facebook: https://www.facebook.com/dream.token/
- LinkedIn: https://www.linkedin.com/company/dream-ac
- Google+: https://plus.google.com/+DREAMEcosystem
- YouTube: https://www.youtube.com/channel/UCISkXKQ7V2TSb71W-x1B6zA
- Reddit: https://www.reddit.com/r/DREAMToken/
- Bitcointalk: https://bitcointalk.org/index.php?topic=2956231
- Bitcoingarden: https://bitcoingarden.org/forum/index.php?topic=29226