Content Optimization: Revisiting Topic Modeling, LDA & Our Labs Tool

Many times as SEOs, we think about the "on-page optimization" process as simply following the best practices for placing our targeted keywords (and possibly, some variations of them) on the page. For years, search engines have been doing work with topic modeling (this paper from Berkeley researchers does a nice job exploring the concept as it relates to IR).

For those of you who've been following our blog posts about research into this area over the past few months, you know we've hit some stumbling blocks. At the PRO Training Seminar in London, Ben Hendrickson shared our latest findings - much more conservative numbers, but consistent with the data and defensible.

As you can see, these numbers are lower than our previous datapoints, and the true correlation of the tool in its current format (which has been re-engineered) is now between the "LDA even vs. odds" and "LDA w/ bias" numbers. This makes our version less predictive than, say, # of links or linking root domains, but more predictive than any other on-page factor we analyzed (save features around exact/partial keyword match domains).

If you've got more questions about LDA, the tool, topic modeling in general or anything else related, feel free to ask below.

For those who have been following the posts closely, you may have noticed that a number of individuals who often don't like the work SEOmoz publishes were particularly vehement in criticizing our LDA research. I don't have much that can address those concerns other than to say - as before, we're still in the early stages of work on this. It's very challenging to do from a coding, mathematics and analysis perspective, so more polished results may still be several months away.

Many times as SEOs, we think about the “on-page optimization” process as simply following the best practices for placing our targeted keywords (and possibly, some variations of them) on the page. My previous blog post about Perfecting Keyword Targeting covers this in some detail. But, we also know that search engines aren’t nearly naive enough to care only about the individual terms/phrases that the user queries. For years, search engines have been doing work with topic modeling (this paper from Berkeley researchers does a nice job exploring the concept as it relates to IR).

While it’s hard for us as SEOs to know how far this work has taken them, we can reasonably assume that the other words and phrases you use on a page influence its ranking, in addition to how and where you use the targeted query term.
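To make the idea a bit more concrete, here's a minimal sketch of how a topic-based comparison might work. It assumes each query and page has already been reduced to a vector of topic weights, and it compares them with cosine similarity. The topic vectors, the three made-up topics and the `cosine_similarity` helper are all hypothetical illustrations; this is not how the Labs tool or any search engine is known to score pages.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical topic weights over three made-up topics.
# In practice a topic model would infer these from the text.
query_topics = [0.8, 0.1, 0.1]
page_on_topic = [0.7, 0.2, 0.1]    # mostly about the same topic as the query
page_off_topic = [0.1, 0.1, 0.8]   # mentions the keyword, but is about something else

on_topic_score = cosine_similarity(query_topics, page_on_topic)
off_topic_score = cosine_similarity(query_topics, page_off_topic)
```

Under these assumptions, the on-topic page scores higher than the off-topic one even if both contain the exact query term, which is the intuition behind looking past individual keywords.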

For those of you who’ve been following our blog posts about research into this area over the past few months, you know we’ve hit some stumbling blocks. Initially, we thought the free LDA Labs Tool had an extremely high correlation with Google.com rankings (higher even than most link-based metrics). However, after analyzing some results from others who ran tests, we saw bias in our results and went back to the drawing board.
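For readers wondering what "correlation with rankings" means in practice, here's a rough sketch. It assumes a rank correlation such as Spearman's, which this post doesn't specify, and the scores and positions below are invented purely for illustration, not taken from our data.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation: the Pearson correlation of the two rank lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

# Invented example: a topical score for the pages ranking 1st through 5th.
positions = [1, 2, 3, 4, 5]
scores = [0.9, 0.7, 0.8, 0.4, 0.3]

rho = spearman(scores, positions)
```

Because position 1 is the best ranking, a strongly negative `rho` here means higher scores tend to go with better positions. A value near zero would mean the score tells you little about where a page ranks.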

At the PRO Training Seminar in London, Ben Hendrickson shared our latest findings – much more conservative numbers, but consistent with the data and defensible.

LDA Correlation October 2010

As you can see, these numbers are lower than our previous datapoints, and the true correlation of the tool in its current…
