May 31st 2018
In this tutorial, I talk about how to build a Twitter bot, based on the @newsycombinator bot, that scrapes content from the web and posts it to Twitter hourly. We write a Ruby script, deploy it to the hosting service Heroku, and set it up to run automatically once an hour.
(electronic music)
- [Instructor] So one thing that people always ask me is, how do you build a Twitter bot? Now, I've built one in the past. I built this one back in April 2008. It's got over 181,000 followers, and it's only about 30 lines of code that runs once an hour. How does it work? Well, it basically takes the top five stories from Hacker News, news.ycombinator.com, and just posts them online. Now, we can make one ourselves. I'm not gonna use Hacker News. I'm actually gonna use Designer News instead and, again, take the top five from there and post them to an account that I've just set up. It's a totally fresh account. No tweets so far. I could just tweet them out manually, but let's automate this. If you do wanna follow this, it's dnsuperhi on Twitter. So what we're gonna do is get the stories from Designer News and put them on our account. Now, how do I do this? I've already set up the Twitter account, but I need to set up a Twitter app. I'll talk about that in a minute. I'll set up my project using Ruby. I'll tweet things out, I'll stop the duplications, and I'll get this online to run once an hour. So those are the five steps I'm gonna do: set up Twitter, set up the project, tweet things out, stop dupes, and get online.
(electronic music)
So the first thing to do is log in as the Twitter account that you want to post from. Now, I've logged in as dnsuperhi, and the next thing we need to do is go to apps.twitter.com. You should, again, be logged in as this account. Here we have our Twitter apps. Now, I can make a Twitter app for myself, or I can make them for lots of people. In this instance, I'm just gonna do it for one account. I'm gonna create a new app, and I'm gonna call it something like Autoposting DN.
This auto posts tweets from Designer News. And I'm just gonna put the superhi website in there, but it can be any URL you want to put there. Don't worry about callback URLs. That's just for things that log in with Twitter. We've read the developer agreement, so let's just create that. And we should have Autoposting DN. Now, there's a few things that we need to use from this. What we're gonna use is down here. We're gonna have some keys and access tokens, so I'm just gonna click this. Now, the two things (again, I will delete this later, and I will have changed these, so you can't steal them and post from my account) are in the application settings, which is this one and this one. I can read and write, 'cause obviously I want to write to Twitter. Now, I'm gonna generate an access token as well for this particular account. So I'll just go create one there.
Here we go. So now I've got four things. I've got my consumer key and consumer secret, which are basically the whole app's login. And then I've got an access token and access token secret for the specific dnsuperhi account, which is this one and this one. So next, what you need to do is set up some Ruby to actually post things to Twitter.
(electronic music)
So there's three things I actually need to pull in for this project. We're gonna use something called a Ruby gem. Now, a Ruby gem is just kind of like a library, some files that basically make our code a little bit easier to work with. I'm gonna use the Twitter Ruby gem to interact with Twitter itself. I'm gonna use a second one called httparty, which is to "make HTTP fun"; basically, we wanna grab content from Designer News. And the third one is Nokogiri, to take that content from Designer News and make it useful. Now, what I'm gonna do is, in my Finder, I'm gonna go into Sites. I've got a Sites folder, but you can put it anywhere. I'm gonna make a new folder called designernewsbot.
Now, I'm gonna open this in a text editor or a code editor. Mine is called Atom. You can get it from atom.io. And it just makes it a little bit easier to see what's going on. I'm gonna add two files in here. The first one I'm gonna add is called the Gemfile.
So it's a capital G, then em, and then file. And the things I want to put in here are the gems I want to use. So I'm gonna write gem, and then twitter. Now, I can find this out on the gem's page: it says to install, use gem install twitter. Or I could just put gem twitter in my actual Gemfile. What about httparty? Again, you install it in the same way, so just below here I'll put gem httparty, and the third one is Nokogiri. How to install that is down here, in the same way: gem nokogiri.
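Written out, the Gemfile described above might look like the sketch below. The source line is not part of this step: the video only adds it later, when preparing the project for Heroku, and the https form of the RubyGems address is an assumption here. The comments are illustrative.

```ruby
# Gemfile
# The source line is added later in the video (Heroku step); https is assumed here.
source "https://rubygems.org"

gem "twitter"   # talks to the Twitter API
gem "httparty"  # fetches content over HTTP
gem "nokogiri"  # parses the fetched feed
```

Running bundle install in the project folder then installs these three gems and their dependencies.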
And we're gonna save that. Now, the next thing I'm gonna do is actually make a folder first. I'm gonna make a folder called bin. Bin just stands for binary executable, basically something that can run code. Now, my Gemfile is just gonna work in the background, but this is where my code is gonna run. Now I'm gonna make a file in here. I can call it whatever I like, but I'm gonna call it update.rb. It needs the .rb on the end. Now, I'm just gonna put very quickly in here, puts "hello". So what I want to see eventually is something that says hello. I'm just gonna save this again, and what we're gonna do is see if this works. To see if this works, we actually need to go to a command line. Now, I've already got mine set up over here; mine's called Terminal. You can go into Applications, Utilities, Terminal if you are on a Mac. Or it's the Command Prompt if you're on Windows. So I'm just gonna go in here, and currently I'm just in my home folder, which is riklomas. So I need to go into Sites, I need to go into designernewsbot, and then we wanna run some code. So I'm gonna do cd Sites; cd means change directory. So now I'm in Sites. This is kinda similar to clicking in here and then clicking in there. And I'm gonna go cd designernewsbot.
And there we go. We are in that. Now, if you haven't set up Ruby or Rails or anything in the past, you can just do gem install bundler, and that will install the things we need to set up the Gemfile itself. Once you have done that, all you need to do next is bundle install. Now, what bundle install will do is go through your Gemfile and add all of these gems to your code base. So I'm just gonna run that, and what it'll do is pull in everything that it needs. Now, yours might take a minute or two; mine is quick because I've already installed these gems in the past. Next, all I'm gonna do is see if my update.rb runs.
To see if it runs, I'm gonna do ruby bin/update.rb.
And what we should see now is just the word hello. What we'll do next is add some real code based on the gems that we've added.
(electronic music)
In our code, we don't just want to say hello, we actually wanna do things with it. So how do we do things with our code? The first thing we need to do is get rid of this line of code, and then set up our gems in this update.rb. We've got them in our project, but we want to use them in this file. So I'm gonna require twitter first of all, 'cause I wanna post things to Twitter. Next, I want to require httparty, 'cause we wanna get things from the Internet. And then I want to require nokogiri, 'cause I want to do something with that content itself. So the first thing that we're gonna do is set up Twitter. Now, on the Twitter gem page, it'll tell you how to configure the options, and we've got all this information already. I could copy and paste this, but I will write it out properly and follow along with how it works. So client equals this stuff, and then we set up the configuration. Now, what does this actually mean? Let's go through it. First of all, I'm gonna set up a variable called twitter. This is gonna equal Twitter, with a capital T, then REST, which is a way of interacting with Twitter itself, and then Client. We're gonna make a new one, and we're gonna add a do with some config. A do always needs an end. And how do we want to config this? In the config, it's gonna be config.consumer_key, with an underscore. This is gonna equal something in a string. A string is just some letters and numbers, and I'll add them in a minute. config.consumer_secret equals something. The next one is config.access_token, and the last one is config.access_token_secret, and that equals something.
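The configuration walked through above could be sketched as follows. The video pastes the four values into the file as strings; this sketch instead reads them from environment variables, which is an assumption made so the secrets stay out of the source file, and the TWITTER_ variable names and both helper names are hypothetical.

```ruby
# Names of the four settings the video configures on the client.
CREDENTIAL_NAMES = %w[consumer_key consumer_secret access_token access_token_secret].freeze

# Hypothetical helper: looks up each credential in an environment variable such
# as TWITTER_CONSUMER_KEY. The variable names are an assumption, not from the video.
# Raises KeyError if any of them is missing.
def credentials_from(env)
  CREDENTIAL_NAMES.map { |name| [name.to_sym, env.fetch("TWITTER_#{name.upcase}")] }.to_h
end

# Builds the client with the do/end config block described in the video.
# The gem is required inside the method so the helper above can be tried
# without the twitter gem installed.
def build_twitter_client(env = ENV)
  require "twitter"
  creds = credentials_from(env)
  Twitter::REST::Client.new do |config|
    config.consumer_key        = creds[:consumer_key]
    config.consumer_secret     = creds[:consumer_secret]
    config.access_token        = creds[:access_token]
    config.access_token_secret = creds[:access_token_secret]
  end
end
```

With the four variables set, twitter = build_twitter_client gives a client equivalent to the one in the video, with the keys pasted in as strings.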
Now, what do these things mean? Where do they come from? Well, we can get them from our app. So we've got the consumer key, so I'm gonna copy that one into here. Then the consumer secret goes in here. Then the access token from further down, and then the access token secret. And again, I'm just gonna save that file. Now what we can do is basically tweet out whatever we like. Below here, how do we make a tweet? Well, using all of this config, which is up here, we can just do twitter.update, and then brackets. In here, we can put some quotes and say, Hi this is our first tweet! Now, if I save this and go back to my command line, I can just press up or type the command out again. If I run this now, it looks like it's not done anything, because we haven't printed anything out to say it's finished, but if I go back to my account and refresh the page, we should see, Hi this is our first tweet! So next, what we want to do is set it up so that we can get content from Designer News and post that instead of text we've written ourselves.
(electronic music)
So I could just scrape the content of Designer News itself, so I'd just get all the titles from the homepage, but there's actually an easier way, if I scroll down to the bottom, which is its RSS feed, which gives us a very structured data set that we can use. Now, this looks like complicated stuff. It's basically just how the content is structured. So what I want to do is get all the items, and then get the title, the description, maybe the link as well. Now, how do we actually pull this into our script? If I go into our script, again, I don't wanna say twitter.update, this is our first tweet. Instead, what I want to do is get that RSS feed. So rss: we're gonna make a new variable. This is gonna equal, well, how do we get stuff from the Internet? The way we're gonna do this is we're gonna use HTTParty, and we're gonna use .get.
And then some brackets to do this thing. Now, what kind of URL do we wanna pull in? Well, the URL is up here. So I'm just gonna copy that in: https, designernews.co, and then that at the end. Now, this gets the data. We want to turn this data into a usable format. To do that, we're gonna use Nokogiri. So with Nokogiri, we're gonna say, turn this into a document, doc, that we can use. This is gonna equal Nokogiri, and then we're gonna use the XML version of this. Don't worry too much about these double colons, which are up here. We're just following the instructions on the site, so a lot of this is copy and paste. And in brackets, we're gonna put this rss feed. So we're gonna get the content and turn it into something usable. What is something usable, though? Well, Nokogiri has a way to do a kind of CSS selector. So what we're gonna say is, let's get all of the items as a whole, so all of that, then all of this, and all of this, and so on. So in Nokogiri, we're gonna take that doc that we just made, and I'm gonna use the CSS selector to find each item.
Now, I don't wanna use all the items, because there's quite a lot on that page. I'm just gonna take the first five instead. Now, on these, I'm gonna do a .each. And for each item, I want to do something. What do I want to do with it? Well, I'm gonna call this a variable of my choice. I'm gonna call it item, and every do needs an end. So our code is gonna live in here. Now, for each item, there's a title, description, the link, and the pubDate, and all this extra info. We wanna basically format our tweet. We're gonna take this data and turn it into a tweet. So the first thing I need is the title, so title equals something. Well, for each item, this variable that we just made for each one that comes through, we're gonna do the same thing we did up here. In the CSS style, we're gonna get the title. So I've got this, then this, and then this, and so on, for the first five. Now, that's the whole tag itself. I wanna find just the text in there. That would just give the title so far. Next thing, I wanna put the link in as well. So I'm gonna make the link equal to, well, this is gonna look quite similar. We wanna get the description, which is this one, or this one, or this one.
And again, we're gonna get the text. Now, one thing you might notice is that not all of the descriptions start with http; some of the items are just text posts. If they are text posts, we just wanna go to the actual link, to Designer News. So we're gonna say unless link.start_with?, with a question mark, http.
Then, we're gonna do something. We're gonna override that link instead, and make it equal not to the description but to the actual link in here. So again, .text. Now, the last thing we're gonna do is update Twitter. So we're gonna take those five items and post them to Twitter. So twitter.update. Now, this is gonna be a string, just like we had with Hi this is our first tweet. What we're gonna do instead is put the title and link in there. So the first thing is the title. Now, this is a variable in a string. Then we're gonna put a space, because we wanna have a gap. And then we're gonna put the link, and we're gonna save the file. So to recap: we set up Twitter, we get the RSS feed, turn it into something useful, find all the items, take the first five, and then do a loop for each item. Get the title, get the description, if the description isn't a link, get the link itself, and update Twitter. Let's see if this works. So now, I'm just gonna run the file again.
And again, it doesn't do anything in particular. But if I go back to the site and refresh, we should now see five tweets.
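The fetch, format, and post steps described above can be sketched roughly like this. The method names are hypothetical, and the feed address is left as a parameter because the transcript does not spell it out in full. Plain Ruby's String method is start_with?, which is what the sketch uses for the http check. Calling .body on the HTTParty response before handing it to Nokogiri is also an assumption, made so that Nokogiri always receives a plain string.

```ruby
# Fetches the feed and returns its first `limit` item nodes.
# The gems are required inside the method so the rest of this sketch
# can be tried without them installed.
def fetch_items(feed_url, limit = 5)
  require "httparty"
  require "nokogiri"
  rss = HTTParty.get(feed_url)
  doc = Nokogiri::XML(rss.body)
  doc.css("item").first(limit)
end

# Builds the tweet text for one item: the title, a space, then a link.
# Falls back to the item's own link element when the description is not a URL
# (a text post).
def tweet_for(item)
  title = item.css("title").text
  link = item.css("description").text
  link = item.css("link").text unless link.start_with?("http")
  "#{title} #{link}"
end

# Posts each item through any client that responds to `update`,
# such as the Twitter gem's REST client set up earlier.
def post_items(items, client)
  items.each { |item| client.update(tweet_for(item)) }
end
```

In the script this would be used as post_items(fetch_items(feed_url), twitter), where feed_url is the RSS address copied from the Designer News page. Passing the client in as an argument keeps the formatting logic easy to try out with a stand-in object before posting anything for real.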
So the problem with the code is obviously, if I run this again and again and again very quickly, how do I make sure there's no duplicates? So next, we'll make sure there's no dupes.
(electronic music)
Now if I rerun this code again, like this, Twitter's clever enough to actually dedupe things. The reason for that is it doesn't want things posted again and again and again, to stop spam. But we're running this once an hour, and Twitter does let duplicates through if they're an hour or more apart, so we have to write some code to make sure we don't repost things that have already been posted. The way that we do that is we check our Twitter timeline, find any URLs that were already posted, and dedupe the Designer News feed against them. How do we do that? Now, the first thing we're gonna do is just quickly highlight this area and comment it out. I've just done this in my code editor using command and forward slash, just as a quick thing. Now, how do I get my user timeline? On the Twitter gem page, this is where we set it up. Further down is how you can use it. Further down, you can see here, fetch the timeline of tweets by a user. So we wanna find the dnsuperhi tweets. What I'm gonna do is write a new variable up here. So this is called latest_tweets.
This is gonna equal, well, they call it client, but we set it up as twitter. So twitter.user_timeline.
Which timeline? Our timeline, dnsuperhi. So let's see what that says. So we're gonna puts the latest_tweets.
And we save that. I'm just gonna quickly go back to my command line and run this code, and what we'll see is tweet, tweet, tweet, tweet, tweet. So how do we think about this? How do we actually look at this? Now, each of these is a Twitter tweet, and on the sidebar here is all this information about a tweet. There's all this information that we can get, all the way down here. Now, I'm gonna do this quickly, but what I want to find is all the previous links in tweets.
So the way that I'm gonna do this is I'm gonna set up a new variable called previous_links.
This is gonna equal latest_tweets, and we're gonna take each tweet and map it. Map just means turn it into something else; here, basically, find the links within the tweets. We're gonna do something to each tweet, and every do needs an end. Now, again, I'm gonna do this pretty quickly. So if tweet, which is just this thing that we passed in, has any urls at all, so tweet.urls.any?. Every if needs an end. And if it does have a URL, we're gonna turn that tweet into its first link, because obviously there could be multiple links, and we're gonna pass in, oops, the expanded_url, basically the full URL.
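The mapping described above might be sketched like this. The method name is hypothetical. Two details are assumptions not taken from the video: the call to to_s, added in case expanded_url is a URI object and not a plain string, and the compact at the end, which drops the nil entries that map produces for tweets without links.

```ruby
# Returns the first expanded URL of every tweet that contains a link.
# `tweets` can be the array returned by user_timeline, or any objects
# with the same urls / expanded_url shape.
def previous_links_from(tweets)
  links = tweets.map do |tweet|
    # to_s is an assumption: it turns a URI-like object into a plain string so it
    # can later be compared with the link text taken from the feed.
    tweet.urls.first.expanded_url.to_s if tweet.urls.any?
  end
  links.compact # drop the nils left by tweets that had no links
end
```

In the script this would be called as previous_links = previous_links_from(twitter.user_timeline("dnsuperhi")).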
Twitter gives you two different options. One is the t.co one, the shortened URL. We want the expanded one, because we're matching against it further down. Now, what I'm gonna do down here is uncomment my code again. There we go. And what I want to do is, if this link has already been posted on Twitter, basically don't post it, because we don't wanna fill up the timeline with the same things once an hour. So what I'm gonna do is check whether this link, this one here, has already been posted in this list of tweets. How do we do that? Well, around my twitter.update, I'm gonna say, unless, in my previous_links, any of these links are the link that's going to be posted from here.
Unless basically means don't do it. Don't do it if any of the previous_links match. Now, if they don't match, obviously then we want to update Twitter, because we wanna post new tweets. And every unless needs an end as well, just like every if. So only if this link isn't in the previous links can we post this Twitter update. So this stops any duplication of tweets. It stops us annoying people's timelines. So again, let's try and run this. We shouldn't see anything at the end.
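A minimal sketch of that guard, assuming previous_links is an array of plain strings. It uses Array#include? for the membership check; the method name and the title-and-link pairs are hypothetical stand-ins for the values built inside the loop.

```ruby
# Posts only the entries whose link has not been posted before.
# `entries` is a list of [title, link] pairs; `client` is anything that
# responds to `update`, such as the Twitter gem's REST client.
def post_new_links(entries, previous_links, client)
  entries.each do |title, link|
    # unless: skip the update when the link is already on the timeline
    client.update("#{title} #{link}") unless previous_links.include?(link)
  end
end
```

Run twice against the same feed and timeline, the second run posts nothing, which is the behaviour the video is after.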
Oops. Wrong number of arguments. Ta-da-da. Tweets.
Oh, it's not any? It's actually just include?. So if previous_links include this link. Again, I'm just gonna save that. And we can see where that went wrong: it's bin/update.rb, line number 31, which is this line, and it said any doesn't exist. So again, I used the wrong thing. Let's see if it runs now. And hopefully, again, that runs, no complaints in there. Again, this shouldn't do anything, so if I refresh, there's still five tweets there. Next, what we're going to do is run this once an hour, and how do we do that? Well, we don't want to do it manually; we wanna put it online and run the script once an hour. We'll do that next.
(electronic music)
So how do we get this code that's currently on my computer online? I'm gonna be using Heroku to host this file and run it once an hour. All you need to do is sign up for an account and follow the instructions. You will also need to install the Heroku command line tool, the CLI. There are instructions for how to install that and also how to get logged in. Now, you might need to restart your terminal, and then do cd Sites, cd designernewsbot for instance, to get back to your folder. But I've already set this up. Now, there's two things we need to do to add this to Heroku. The first one is in my Gemfile. Basically I need to tell it where the gems live. And where they live online is the source, http://rubygems.org.
Now, if I save this, I also need to do bundle install again. That basically just brings everything up to date. Now, the next thing I need to do is set up git. Git is usually already installed on a computer, and what we're gonna do is just do git init.
Now, git init sets up a git repository in this folder itself. So there we go. The next thing I need to do is add all of the files that are currently in here, so the Gemfile and the update.rb. I'm gonna do git add ., and the dot just means the current folder, which is the designernewsbot folder. Add everything that's currently in there. There we go. The next thing I need to do is kind of like a big save, a commit. The message I'm gonna use for this one says get online; if I do any updates in the future, I'll do another commit and say, update to this script. I'm gonna do git commit -a, a to include every file that's currently added, and m, which means add a message. And the message that we're gonna add is, get these files onto heroku. And then end that message. Now, as you can see here, it's created these files and added them into a commit. The next thing you'll want to do is actually send this code to Heroku. If you've looked into Heroku, you should be able to do this: heroku create, then the name of the app you want to create. I'm gonna call mine designernewsbot, and I'll just put superhi on the end in case it's already taken, and there we go. I've just added this to Heroku. Now what you should see when you're logged in to Heroku (I've got quite a lot on here, but somewhere down here, if I refresh, where is it? There it is) is designernewsbotsuperhi. Now, I need to send my code from my computer to Heroku, and to do that, we're gonna do git push heroku master.
We're gonna send our code, which is currently on the git master branch, to Heroku's master branch as well. So if I do this, this will take my files and package them all up. It's detected it's a Ruby app. It's trying to install all the gems, just like we did. There we go. It's installing everything. And installing twitter.
So hopefully in a minute or two, this should work.
And there we go. Compressing, and there we go, done. Launching. And if we go to that URL right now, we should see an application error, which is correct, because we don't have any code for the (chuckles), well, we don't have a website. We've got a script in the background. How do we run this script once an hour? I'm gonna go into my add-ons and configure them. And I'm gonna find, in the add-ons, the scheduler. Now, there's a few; I'm gonna use Heroku's own scheduler, the free one, and provision it. Now if I click down here again, that has been added. I can go and set up my script. I'm gonna add a new job. When do I want it to run? Every hour. The next one is due at 20 past. And what code do I need to run? Well, the same as what we were running earlier. If I press up a few times, there it is: ruby bin/update.rb. So, ruby bin/update.rb.
Now if I save this, this will now run once an hour. And all I need to do is sit back and relax and watch my followers come in. So that is how easy it is to get a bot online.
Now I've got 181,000 followers (chuckles), in 30 odd lines of code. Pretty straightforward. Again, my code runs on Heroku in exactly the same way as we've done here. And hopefully you can make your own bot for whatever you want to make.
If you liked this video, check out our Ruby on Rails course. We talk about how to build really complex software in a really simple, fast way.
(electronic music)