Yesterday we covered how you would go about controlling pins of your arduino over the internet using the Arduino Ethernet Shield set up as a server. Today we are going to take a look at using the shield as a client to get information off of a web page, and report back. I used this method a few months ago when I made the Nixie Twitter follower counter dubbed Twixie.
The ethernet shield can be used to access any non-password protected site with ease, but getting the information back to you is the hard part. For Twixie I created a special php page that queried the twitter API and displayed only the twitter count. This made it so I didn’t have to tell the arduino what to look for, or scour endless lines of HTML looking for a single number. For our example im going simplier. I created a PHP file that just outputs a random alphanumeric string. I did this because getting everyone setup with an API account somewhere is beyond the scope of this article, and unneeded to prove the concept and get you started. But, the idea is that you could easily take that PHP file (any web accessible file) and taylor it to display whatever you need.
In the client mode, the ethernet shield is able to access a webpage and return what is read. But the reading is done one byte at a time, and it reads the entire thing. So it can be like a needle in a haystack on large pages. Even if the page we are reading contains only the information we need, there is extra information at the beginning that is sent to the arduino. You never see it, but a web server actually sends extra information know as a “header” that tells the browser various information about that page (this is different than HTML’s tag).
Because of this, we need a way to tell the arduino what is junk, and what is the good stuff. To do this we are going to surround the information in < >. When the Arduino starts to read the page we tell it to ignore everything until it sees “<“. From this point on we tell the arduino to record each following character until it sees our ending character “>“. At this point the arduino has everything it needs, and disconnects from the server then reports back with the data it found.
The Ethernet Shield library does not come with DNS support out of the box meaning we can not simply access the website we need by its simple URL (like http://arduino.cc). No, sadly, we need to have access the site through an IP address. For instance, bildr’s IP address is 22.214.171.124 and you can actually access bildr like so http://126.96.36.199/~bildr/ – Not every web-server allows you to do this, but when you can it is typically IPADDRESS/~ACCOUNT_USERNAME – So you can see the PHP file I created for you here http://188.8.131.52/~bildr/examples/ethernet/
For more detail: Getting Data From The Web – Arduino + Ethernet