Where Are My Rankings?
The author's views are entirely their own (excluding the unlikely event of hypnosis) and may not always reflect the views of Moz.
UPDATE: 9/7/2012 4:00pm Pacific
We have fixed our rankings service issues:
- We are collecting today's rankings as planned.
- The Keyword Difficulty Tool is back up and running.
- We are backfilling the previous 5 days' worth of data completely (Sunday through Thursday), but rankings from Friday the 31st and Saturday the 1st will be missing. Today's data will be complete by the end of the day, and the backfill will be complete by Monday the 10th.
- Rank Tracker is still not functional, but we are working hard to fix it.
By the way, we are hiring. If you or someone you know would like to help develop our products, drop us a line.
Over the last 3 weeks, you may have noticed some instability in our Rankings tools, in the form of missing data and error messages stating that some tools are unavailable. On Friday, we experienced a totally different, unrelated problem with our rankings data. We expect to have an updated prognosis for that problem by tomorrow, but we want to fill you in on what went down at Mozplex to cause these issues in the first place. To be as transparent as possible about what happened and how we're working to fix the issue, below is a summary of what was impacted, the work we did to get things going again, and what we’re doing in the future to make the system more resilient.
Database issues? What gives?
Impacted services
- Custom reports
- On-page reports
- Historical rankings CSVs
- Rankings
- Keyword Difficulty & Full SERP Analysis reports
Work completed to get things going again
- Created scripts to heal the different broken states of jobs
- Added more nodes to speed up processing and help in future failures
- Improved monitoring to get information about failures and performance bottlenecks
- Improved performance in multiple areas
Future work
- Improving health checks and threshold monitoring of Riak nodes and subsystem dependencies
- Adding more Riak nodes
- Beefing up queue and job execution monitoring and alarming
- Creating a dependency matrix that indicates what’s impacted when something goes down
- Improving fault tolerance in parts of the system
- Providing additional excess service capacity
- Creating system operations documentation for dealing with emergency scenarios and how to recover
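To give a rough idea of what a dependency matrix like the one mentioned above could look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the component names, the edges between them, and the `impacted_by` helper are hypothetical and do not describe our actual architecture. It simply records which component depends on which, then walks that map to list everything affected when one piece fails.

```python
from collections import deque

# Hypothetical dependency map: each key depends directly on the listed items.
# These names and edges are invented for illustration only.
DEPENDS_ON = {
    "rankings": ["job_queue", "riak"],
    "historical_rankings_csv": ["rankings"],
    "custom_reports": ["rankings"],
    "keyword_difficulty": ["riak"],
    "job_queue": [],
    "riak": [],
}


def impacted_by(failed, depends_on=DEPENDS_ON):
    """Return the sorted names of everything that depends on `failed`,
    directly or through other components."""
    # Invert the map so we can look up "who depends on X".
    dependents = {}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(name)

    # Breadth-first walk outward from the failed component.
    seen = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


if __name__ == "__main__":
    for component in ("job_queue", "riak"):
        print(component, "->", impacted_by(component))
```

Under these made-up edges, a failure in the hypothetical `job_queue` would flag rankings and everything built on top of rankings, while leaving `keyword_difficulty` untouched. That kind of answer is what a dependency matrix is meant to give quickly during an incident.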