According to Fishkin, algorithmic calculations begin on the spider's first crawl of a site, which it discovers through a hyperlink. Registration data seems to play a part, such as the date the domain was purchased and the length of time for which it is registered. Domains registered for more than a year are thought less likely to be spam or throwaway domains.
However, spikes in popularity can be a red flag for spamming techniques, and thus more qualitative analysis of those links is needed. Here are some of the factors that go into link analysis to protect against link spamming:
- Freshness: How often is a link appearing or disappearing?
- Trustworthiness: How trusted is the source of the links (e.g., is it from a .gov or .edu domain)?
- Speed, Number, Topic: How fast are links to a website appearing? Why are there 5000 new links to a website in one day? Is this the result of spam or of some newsworthy event like the death of Peter Jennings? Links about a tsunami would be deemed more natural than thousands of new "Buy Pharmaceuticals Online" links.
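The "speed" factor above can be illustrated with a small sketch. It is only an assumption about how such a check might look, not a description of any search engine's actual method: the function name `flag_link_spikes`, the 7-day window, the 10x ratio, and the 100-link floor are all hypothetical choices made for the example. It flags days on which the number of new links is far above the recent median.

```python
from statistics import median


def flag_link_spikes(daily_new_links, window=7, ratio=10.0, min_links=100):
    """Return the indices of days whose new-link count spikes above the trailing median.

    All thresholds are illustrative assumptions, not known search-engine values.
    """
    spikes = []
    for day, count in enumerate(daily_new_links):
        # Baseline is built from up to `window` days before the current day.
        history = daily_new_links[max(0, day - window):day]
        if not history:
            continue  # the first day has no baseline to compare against
        baseline = max(median(history), 1)  # guard against a zero baseline
        if count >= min_links and count / baseline >= ratio:
            spikes.append(day)
    return spikes


# Hypothetical data: a quiet site that gains 5000 links on one day.
counts = [12, 9, 15, 11, 10, 14, 13, 5000, 40, 12]
print(flag_link_spikes(counts))  # prints [7]
```

A flagged day is only a prompt for the qualitative checks in the list, such as the topic and trustworthiness of the new links. On its own it cannot tell a newsworthy event from spam.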
Domain ranking history seems to be another point of examination. Jumps in page rank are apt to be scrutinized more heavily. Seasonality or burstiness may also play a role, much like the current-event example above.