SEO Experts Stickyeyes Crawl the Web Faster with Cloudant DBaaS

By Lynnette Nolan

With search rankings changing in real time, SEO experts can’t afford to lose tracking time or to spend it on database administration. No one understands that better than Ben Davies, head of technical development at Stickyeyes, a UK-based digital marketing agency.


Stickyeyes provides SEO analytics, including Google AdWords pay-per-click (PPC) keyword advertising, down to the hour. The company builds tools that generate reports from its data, and getting those reports to clients is crucial.

Ben and his small team of PHP developers were having problems scaling MySQL as the company added clients and its Web crawlers needed to store increasing volumes of data. Lagging replication led to data corruption, which had Ben and his team coming in on weekends to resolve the issues. To focus on providing clients with actionable site improvements, Stickyeyes needed to offload day-to-day database management. With only five developers, they didn’t have time to waste on the painful master-slave replication topologies that MySQL forced them to use.

We spoke with Ben about making the switch from MySQL to Cloudant.

Image from http://www.ldodds.com/projects/slug/, via Elroy Serrao on Flickr

Cloudant: Why did Stickyeyes make the switch to Cloudant?

Ben: We were running MySQL and having a lot of problems with log replication and consistency. MySQL is well designed for high volumes of reads, but we do high volumes of writes. We had seven terabytes, spread across two or three databases. Queries had multiple left joins. Reports took several days to run. There were hundreds of tables. Our replication would lag behind and get out of sync. When it came to the point where we were constantly dealing with corrupted slaves, we asked, “Are we using the right database solution here?”

We looked at Hadoop and were drawn to NoSQL because the data model was a better fit. We’re crawling Google, and HTML pages are like documents — they don’t fit well into rows and tables. Originally, we found BigCouch. Because Cloudant open-sourced it, we decided to try it. We installed it ourselves and the master-master architecture handled replication with ease.

You guys are smart. All we have to do is give you data and ask for it back. We don’t have to worry about the administration. Cloudant lets us take on the big boys. We’re a small team, five devs strong, and we’re putting in 700 gigabytes per month. We needed to find someone who could take the database management off our hands. We don’t have staff for a full-time administrator to manage the database side of things and make sure replications aren’t bogging down our application. We’d need another four or five devs to keep things running otherwise.

Cloudant: How did the migration go?

Ben: It can take weeks to learn the ins and outs of MySQL. Getting new devs up to speed with Cloudant was a definite plus. Everyone understands HTTP requests. People don’t want to lag behind other team members. Within a week of explaining how things work, a dev who has never seen NoSQL is already up and running. Plus, we were no longer worrying about replication, or if MySQL would fall over. When you’re collecting data based on time, you have to keep moving. We can’t miss an hour of tracking. When MySQL is falling over, you’re running around by the seat of your pants.

Cloudant: There can be some ramp-up time when moving from MySQL to NoSQL, but that doesn’t seem to have been much of an issue for Stickyeyes?

Ben: Cloudant’s database architecture was a better fit, and we were up and running in a couple of weeks. We switched all our tools over, and were live soon after. There was a learning curve but, having come from a strong relational background, we were quite savvy.

Modifying the code was simple because it’s all just HTTP. Fine-tuning our Web crawlers only took a couple of days. It was quite straightforward, really.
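To illustrate what "it's all just HTTP" can look like in practice, here is a minimal Python sketch of storing one crawl result as a JSON document. It assumes the CouchDB-style document API that Cloudant exposes (a PUT to /{database}/{document ID}). The account URL, database name, document ID, and fields are hypothetical and are not taken from Stickyeyes' code. The request is only constructed here, not sent.

```python
import json
import urllib.request
from urllib.parse import quote


def build_put_request(base_url, db, doc_id, doc):
    """Build (but do not send) an HTTP PUT that stores `doc` under `doc_id`."""
    # Percent-encode the ID so spaces, colons, or slashes are safe in a URL path.
    url = f"{base_url}/{quote(db, safe='')}/{quote(doc_id, safe='')}"
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical account, database, and document, for illustration only.
request = build_put_request(
    "https://example-account.cloudant.com",
    "rankings",
    "seo tools:2013-05-01T14",
    {"keyword": "seo tools", "position": 3, "hour": "2013-05-01T14"},
)
print(request.get_method(), request.full_url)
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` with valid credentials; authentication and error handling are left out of this sketch.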

With MySQL, log replication meant lag times that sometimes resulted in corruption. With Cloudant, we were able to keep writing new data while we were importing the old data, so we didn’t miss a beat.

Cloudant: How has the performance been?

Ben: We’re able to pull all of our data out of Cloudant like that! Our reporting tools give internal and external clients insights into Google and how they’re doing. We’re providing correlation analysis based on keyword positions and showing how, across all data, clients are affected by Google’s algorithms. Because we no longer have to worry about data storage, we just keep collecting so we can see more patterns that help our clients improve their search rankings. We’re giving them feedback in real time, showing the effects of website tweaks on their Google rankings. And since Cloudant handles the database administration, our developers can focus on the software.

Cloudant: Is there anything we could have done to make moving to Cloudant easier?

Ben: If anything, guides that illustrated scenarios (‘if you come from this DB, this is what you should know...’) could have helped the transition. There was some trial and error, but it came down to attacking the problem in a different way. We didn’t know how to deal with MapReduce-generated views that well, so we tried to pull the whole document out, with a pre-determined doc ID. Cloudant helped us structure our views.
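The two approaches Ben contrasts, fetching a whole document by a predetermined ID versus querying a MapReduce view, can be sketched in a few lines of Python. This is an in-memory analogy only: real Cloudant views are JavaScript functions stored in design documents and computed on the server. The ID scheme, field names, and sample data below are hypothetical.

```python
from collections import defaultdict


def make_doc_id(keyword, hour):
    """Derive a predictable document ID so a record can be fetched directly."""
    return f"{keyword}:{hour}"


def map_doc(doc):
    """Map step: emit one (key, value) pair per document, here (keyword, position)."""
    yield doc["keyword"], doc["position"]


def reduce_best(values):
    """Reduce step: collapse values sharing a key to the best (lowest) position."""
    return min(values)


def run_view(docs):
    """Toy stand-in for a MapReduce view: group mapped values by key, then reduce."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_doc(doc):
            grouped[key].append(value)
    return {key: reduce_best(values) for key, values in sorted(grouped.items())}


# Hypothetical crawl results, stored under predetermined IDs.
store = {}
for keyword, hour, position in [
    ("seo tools", "2013-05-01T14", 5),
    ("seo tools", "2013-05-01T15", 3),
    ("web crawler", "2013-05-01T14", 8),
]:
    store[make_doc_id(keyword, hour)] = {
        "keyword": keyword,
        "hour": hour,
        "position": position,
    }

# Approach 1: direct lookup, which works only when the ID is known in advance.
direct = store[make_doc_id("seo tools", "2013-05-01T15")]

# Approach 2: a view answers a question that spans many documents.
best_positions = run_view(store.values())
print(direct["position"], best_positions)
```

Direct lookup is cheap when the keyword and hour are already known, while a view suits questions that cut across documents, such as the best position each keyword has reached.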

We had questions like, “how do we filter the data the best way?” So we were in the Cloudant channel on IRC a lot, which really helped. We were able to reach engineers who actually wrote Cloudant’s software, and they were quick and attentive. It’s nice to have mentors who are available on IRC. Support and tickets are great, but IRC was almost like having a dev in our office.

Cloudant: That’s great. Thanks for taking the time!


You can learn more about Stickyeyes’ products and services on their website at stickyeyes.com.

— Lynnette Nolan, marketing communications specialist, Cloudant
