- The simple answer is yes, Google could, but in practice, it’s unlikely to be that significant.
- The long answer is below.
Background: SEMRush’s SEO Ranking Factors Study
If you haven’t heard of this, which rock have you been hiding under? This is the second study this year, with roughly the same conclusions, namely that Direct Traffic, Time on Site, Pages Per Session and Bounce Rate are the four most important ranking factors. We’ll come to this in a while, but notice anything about those factors?
The study was conducted across rankings for 600,000 keywords, from head terms to mid-tail and long-tail terms (see Basic Keyword Research for an outline explanation). Rather than rely on pure correlation, they chose a number of statistical methods to analyse the data, the most important of which was the Random Forest algorithm. These fun little number crunchers build a whole forest of decision trees and combine their answers: the mode of the trees’ votes when predicting a class, or the mean of their predictions when predicting a number. It’s very clever, but we are heading back to means and modes.
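To make the “means and modes” point concrete, here is a minimal sketch of the final step of a random forest only: combining the answers of many trees. It does not train any trees, and it is not a reconstruction of the study’s method. The function names, vote labels and numbers are made up for illustration.

```python
from collections import Counter
from statistics import mean


def forest_classify(tree_votes):
    """Classification: return the most common label (the mode) among the trees' votes."""
    return Counter(tree_votes).most_common(1)[0][0]


def forest_regress(tree_predictions):
    """Regression: return the mean of the trees' numeric predictions."""
    return mean(tree_predictions)


# Hypothetical output of five already-trained trees for one page.
votes = ["top10", "top10", "page2", "top10", "page2"]
positions = [3.0, 5.0, 4.0, 6.0, 2.0]

print(forest_classify(votes))     # top10
print(forest_regress(positions))  # 4.0
```

Each tree on its own is a fairly crude predictor. The forest gets its accuracy from averaging out their individual mistakes.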
So far, so good?
There is an issue with universes in this study. The sites considered all rank; the sites that don’t rank weren’t considered. So it’s likely there are plenty of sites with vast amounts of direct traffic that don’t rank for much. It should also be noted that Direct Traffic is comparative, that is, the sites are compared to their ranking competitors. Once you know that, you know it’s not possible for Direct Traffic to be an SEO Ranking Factor in isolation. Compare this with SE Land’s SEO Ranking Factors.
Is Direct Traffic an SEO Ranking Factor?
- The answer to that is no. Not a *direct* ranking factor.
- It is likely, however, that it’s an indirect ranking factor.
If it were a direct ranking factor, the search engine rankings would be a simple ordering of websites by traffic, and Facebook would be first for everything. It’s the same as the PageRank argument: if PageRank were the most important factor, results would be ordered by it. They aren’t and never were.
It is, however, likely to be an indirect ranking factor. More eyeballs on the site and more visits mean more likelihood that users will share; that there is great content on the site that people will link to; that the site is useful and enjoyable for a class of users interested in a topic; that they’ll mention it; that the site will become *known* for a certain topic or topics. All of those things increase the likelihood of other ranking factors being bumped: inbound links, mentions, citations, positive sentiment, all of them. I talk about these things in Basic External Linking & Mentions.
There is also the lengthy argument that top rankings are a virtuous circle: better rankings beget better traffic beget better rankings and so on.
Finally, there’s also the strong argument that Direct Traffic is the waste bin of logging and analytics. It is a collection of genuine Direct Traffic from bookmarks and type-ins, combined with the detritus of the logs.
- Can’t work out the referrer? Dump it in Direct.
- Browser gets confused and strips the referrer? Dump it in Direct.
- Privacy plugin installed, stripping the referrer? Dump it in Direct.
- WordPress links marked rel=noreferrer? Dump it in Direct.
- Something borks on the Analytics script? Go on, guess.
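The waste-bin behaviour in the list above is easy to picture in code. This is a minimal sketch of how a logging script might bucket a visit by its referrer, not how any particular analytics product does it. The function name, the `example.com` host and the short list of search hosts are all assumptions for illustration.

```python
from urllib.parse import urlparse

# Illustrative only: a real tool would keep a far longer list.
SEARCH_HOSTS = {"www.google.com", "www.bing.com", "duckduckgo.com"}


def classify_source(referrer, own_host="example.com"):
    """Bucket a visit by its referrer; anything we cannot identify lands in 'direct'."""
    if not referrer:
        # Stripped by the browser, a privacy plugin or rel=noreferrer,
        # or a genuine bookmark or type-in. We cannot tell which.
        return "direct"
    host = urlparse(referrer).netloc.lower()
    if not host:
        # Referrer present but unparseable: also dumped in direct.
        return "direct"
    if host == own_host or host.endswith("." + own_host):
        return "internal"
    if host in SEARCH_HOSTS:
        return "organic"
    return "referral"


print(classify_source(None))                                # direct
print(classify_source("https://www.google.com/search?q=x"))  # organic
print(classify_source("https://blog.example.org/post"))      # referral
```

Note that the genuine bookmark and the stripped referrer take exactly the same path through the function, which is the whole problem.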
Even so, we know that Google is pretty smart, and the people who work for it, apart from AdWords reps, are pretty smart. So how could they construct a way to see through the gloom and use Direct Traffic as an SEO Ranking Factor?
How Could Google Use Direct Traffic as an SEO Ranking Factor?
Let’s start with the things we’re fairly sure Google can’t do (a short list):
- It doesn’t read Google Analytics data and doesn’t use it in search. Even though Google is pretty greedy about data collection, there’s too much reputational risk to do this, and too many privacy issues to surmount, even with anonymised data. There also hasn’t been a whistle-blower, and by now you’d have expected one to surface, iron-clad exit NDA or no.
- It can’t read webserver logs.
- It can’t tell if you’re on a site for a length of time because you’re interested in it, or because you’re eating a sausage sandwich.
- Told you it was a short list.
Now let’s think about the things we do know Google can do:
- It carries a heap of traffic through its DNS servers. Every request is logged (remember PRISM?). That data can be crunched.
- It has a small browser by the name of Chrome. You may have heard of it. With a ~60% usage rate on the internet, the majority of traffic goes through its browser. That’s why they felt confident to push HTTPS warnings to users: HTTP to HTTPS Website Update Required Due To Chrome 62
- That browser does pre-fetch and pre-render requests for every page you visit, and every link on those pages. Visiting a brand new page in Google Chrome has been known to send Googlebot scurrying out shortly after to retrieve it for indexing (see Find Crawl Index for more info on how Google crawls & indexes web pages). That’s a lot of data.
- Of course, Google Chrome can also record every page you visit from search and what you do on your return to search. That’s how personalised search works; for more info see Google Reveals Search Uses Click Data to Update Rankings, and for the refutation see This Week Google Says They DON’T Use Click Data for Search Rankings.
Also let’s think about the things Google may be able to do:
- It has some very clever people on board – the kind of people who understand Random Forest algorithms, but neglect to spot they are looking at the wood, not the trees.
- Speaking of Chrome above, let’s not forget Android, that small mobile OS that Google licenses for free. Remember, if it’s free you’re the product. I think we can guess that they are likely to be sniffing data and web visits through Android. Let’s not forget the importance of mobile to Google: 57% of Google Search Traffic Now Mobile.
- I’ve run big websites for a long time, and have worked with big clients: direct traffic is reasonably predictable. Ups, downs, quiet periods, slow times, the lot usually follows a distinct and identifiable pattern. If I can see a pattern by eye, Google can certainly work it out algorithmically. By the same token, it’s also possible to tell real spikes (front page of Reddit, anyone?) from fake spikes caused by gamed traffic (if you’re going to fake, you have to fake reality).
- Direct traffic will be incredibly noisy, but Google has the smarts, and the data to know reasonably well which websites are popular, which receive the most traffic and which are liked and enjoyed by users.
- On the basis of that, it is very possible that Google does give a slight bump to well-known, well-trafficked websites if they meet all the normal relevancy criteria. We know from the Vince update in 2009 and the Google Quality Raters’ Guidelines that they prefer to surface known or expected websites for a query, sometimes even to the detriment of sites which probably should rank for that query (whatever happened to AOL Sucks?).
- So, do they? There is likely to be something in the bowels of the algorithm, but it is likely to be post-relevancy and a part of the quality algorithm. It is not a direct ranking factor. Think an indirect ranking factor, or reflected glory.
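On the point about predictable traffic patterns: here is a minimal sketch of how a spike could be flagged against a day-of-week baseline. It says nothing about how Google actually does it. The function names, the 3-standard-deviation threshold and the visit numbers are assumptions, and the sketch only flags that a day is unusual. Deciding whether the spike is real or gamed would need more signals than this.

```python
from statistics import mean, stdev


def weekday_baseline(history):
    """history: list of (weekday, visits) pairs, weekday 0-6.

    Returns {weekday: (mean, stdev)} for weekdays with at least two observations.
    """
    by_day = {}
    for weekday, visits in history:
        by_day.setdefault(weekday, []).append(visits)
    return {d: (mean(v), stdev(v)) for d, v in by_day.items() if len(v) >= 2}


def is_spike(weekday, visits, baseline, threshold=3.0):
    """True if visits sit more than `threshold` standard deviations above the usual level."""
    avg, sd = baseline[weekday]
    if sd == 0:
        # Perfectly flat history: any increase at all is unusual.
        return visits > avg
    return (visits - avg) / sd > threshold


# Hypothetical direct visits on four previous Mondays (weekday 0).
history = [(0, 1000), (0, 1050), (0, 950), (0, 1020)]
baseline = weekday_baseline(history)

print(is_spike(0, 1100, baseline))  # False: a good Monday, but within the normal range
print(is_spike(0, 5000, baseline))  # True: front page of Reddit, or something fishy
```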
What about Direct Traffic, Time on Site, Bounce Rate and Pages per Session?
The eagle-eyed amongst you will have noticed that these are not the Four Horsemen of the Apocalypse but four signs of a great website: one that users enjoy, that has great content, that is engaging and useful. These are the trees that the woody Forest Algorithm and its interpreters cannot see.
- Direct Traffic: even allowing for analytics rubbish-dumping, direct is a good sign of bookmarks and type-ins. Your domain is known.
- Time on Site: well, your content is really engaging, now isn’t it? In-depth, readable, interesting.
- Bounce Rate: don’t get me started on bounce rate. You can fake it easily, but assuming you’re not, people who do something on site are engaged, whether they click a link, watch a video or fire an event.
- Pages per Session: lucky you! Not only do you have a site that people visit, they go on to visit more pages. Your content must be super interesting and valuable to them, you lucky people.
- All these factors are things that will end up with better search rankings by virtue of being a better website, providing the website is set up so that Google can crawl and interpret its relevancy – again look at Find Crawl Index for info on this. If you’re doing some SEO on it, all the better.
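For anyone unsure how the three engagement numbers above are worked out, here is a minimal sketch from a list of sessions. The field names (`pages`, `events`, `seconds`) are made up, and the bounce definition used here (one page view and no interaction events) is a simplifying assumption. Real analytics tools each have their own definitions.

```python
def engagement_metrics(sessions):
    """sessions: list of dicts with 'pages' (page views), 'events' (interactions), 'seconds'."""
    total = len(sessions)
    if total == 0:
        return {"bounce_rate": 0.0, "pages_per_session": 0.0, "avg_time_on_site": 0.0}
    # Assumed definition: a bounce is a single page view with no interaction at all.
    bounces = sum(1 for s in sessions if s["pages"] == 1 and s["events"] == 0)
    return {
        "bounce_rate": bounces / total,
        "pages_per_session": sum(s["pages"] for s in sessions) / total,
        "avg_time_on_site": sum(s["seconds"] for s in sessions) / total,
    }


# Four hypothetical sessions.
sessions = [
    {"pages": 1, "events": 0, "seconds": 10},   # a bounce
    {"pages": 3, "events": 1, "seconds": 120},
    {"pages": 1, "events": 2, "seconds": 60},   # one page, but they did something
    {"pages": 5, "events": 0, "seconds": 210},
]

print(engagement_metrics(sessions))
# {'bounce_rate': 0.25, 'pages_per_session': 2.5, 'avg_time_on_site': 100.0}
```

The third session shows why bounce rate is so easy to fake: fire an event on page load and nobody ever bounces.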
A Note About Correlation and Causation:
People are very quick to shout “correlation does not equal causation”, and it’s true, it doesn’t. But it does give you a very big clue about what might be going on and deserves the respect of a hearing and investigation.
Occam’s Razor is often stated as: “other things being equal, simpler explanations are generally better than more complex ones”, or as a chap called Theodore Woodward put it: “When you hear hoofbeats, think of horses not zebras”. You can also translate this as Keep It Simple, Stupid. The obvious answer is also quite possibly correct.
As Sherlock Holmes put it: “How often have I said to you that when you have eliminated the impossible, whatever remains, however improbable, must be the truth?”
Don’t dismiss Direct Traffic as an SEO Ranking Factor, and don’t dismiss the findings as correlation not equalling causation.