How much of Wikipedia is in your Wiki?
Take Wikipedia, get rid of all its help, user, and discussion pages, and you’re basically left with articles, definitions, and stubs. That’s the core of Wikipedia, the encyclopedia we all love (and some of us hate, mainly for plagiarism by students). Now, take a guess at what percentage of that core is made up of articles. My gut feeling was a lot, say, 90+ percent or so, which isn’t that far off. According to Voss (2005, p. 6), it’s somewhere between 90 and 95 percent (for the German Wikipedia), depending on what counts as a stub. So far, so good.
But Wikipedia is open to the public, that is, to most Internet users worldwide (’cept China, e.g.). That’s certainly different for any corporate wiki. An interesting question is thus: just how much of Wikipedia is in your wiki? Or, in other words, what is the percentage of articles in corporate wikis?
In our research project, we have access to a couple of corporate wikis. The data keeps rollin’ in, and among the first things I did was to strip one of the wikis of all help, user, and discussion pages. I was left with a little more than 700 pages in total. Far smaller than the claimed two million articles of the English Wikipedia, of course, yet a manageable size for some genre analysis.
So, 700+ pages of qualitative data analysis later, I took the screenshot below (yes, a screenshot: Graphviz wouldn’t render the PNG in time, whereas generating the DOT itself only took a couple of minutes).

Nodes are pages and edges are links between pages. The network is pretty dense (I’ll calculate some measures later on), sporting somewhat more than 2,500 links among the 700+ pages. The red nodes are articles as one would expect to find them in Wikipedia, that is, they are pieces of writing covering a particular topic, they are structured in a particular manner (e.g., introduction, body, conclusion), and they are authored by members of the organization. All in all, there are 68 articles among the 700+ pages. That’s a little less than 10 percent, and quite a different picture from Wikipedia.
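For the curious, here is a back-of-the-envelope check of those numbers in Python. It is only a minimal sketch: the post gives "700+" pages and "somewhat more than 2,500" links, so the round values below are assumptions standing in for the exact counts, and the helper names are made up for illustration. The density formula is the standard one for a directed network without self-links, which may or may not be the measure used in a later analysis.

```python
# Round figures taken from the text above. PAGES and LINKS are lower
# bounds ("700+", "2,500+"), so treat them as assumptions, not exact counts.
PAGES = 700
LINKS = 2500
ARTICLES = 68


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def directed_density(nodes: int, edges: int) -> float:
    """Fraction of all possible directed links (no self-links) that exist."""
    return edges / (nodes * (nodes - 1))


article_share = share(ARTICLES, PAGES)
density = directed_density(PAGES, LINKS)
links_per_page = LINKS / PAGES

print(f"articles: {article_share:.1f} percent of all pages")
print(f"directed density: {density:.4f}")
print(f"average links per page: {links_per_page:.1f}")
```

With these rounded inputs the article share comes out just under 10 percent, which matches the figure quoted above.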
To be honest, a little less than another third of all the pages in this corporate wiki are actually articles, too. But they are more or less copy & paste jobs of already published articles. I decided to name them features, just to distinguish them from articles written by organizational members. Features serve the same function as articles, that is, they cover a particular topic, and so on. However, in terms of wiki functionality, they are simply mirrors of someone else’s work from outside the organization. Features are of lesser interest since they don’t come about through the (cooperative) authoring of organizational members. Still, articles and features taken together barely make up 40 percent of the corporate wiki!
Back to the articles that are actually of interest. Take a closer look at the following subnetwork of only articles and other pages the articles link to or are linked from.

The articles are fairly well linked with the other pages. However, among the articles themselves, there are hardly any links at all (it’s hard to see, but I colored the inter-article links in red, too). This suggests that the articles cover substantially different topics, and a look at the articles themselves confirms this assumption.
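To make the idea of this subnetwork a bit more concrete, here is a minimal Python sketch that filters a link list down to the links touching at least one article and writes the result as Graphviz DOT text, with articles and inter-article links in red. The page names and the toy link list are entirely hypothetical, and keeping only links that touch an article is my reading of "articles and the pages they link to or are linked from", not necessarily how the picture above was produced.

```python
# Toy data: page names and the set of articles are hypothetical.
LINKS = [
    ("ArticleA", "Glossary"),
    ("Glossary", "ArticleB"),
    ("ArticleA", "ArticleB"),
    ("Home", "Glossary"),
    ("Home", "News"),
]
ARTICLES = {"ArticleA", "ArticleB"}


def article_subnetwork(links, articles):
    """Keep only the links that start or end at an article."""
    return [(src, dst) for src, dst in links if src in articles or dst in articles]


def to_dot(links, articles):
    """Render links as DOT text; articles and inter-article links are red."""
    lines = ["digraph wiki {"]
    pages = sorted({page for link in links for page in link})
    for page in pages:
        attrs = ' [color="red"]' if page in articles else ""
        lines.append(f'  "{page}"{attrs};')
    for src, dst in links:
        both_articles = src in articles and dst in articles
        attrs = ' [color="red"]' if both_articles else ""
        lines.append(f'  "{src}" -> "{dst}"{attrs};')
    lines.append("}")
    return "\n".join(lines)


sub = article_subnetwork(LINKS, ARTICLES)
inter = [(src, dst) for src, dst in sub if src in ARTICLES and dst in ARTICLES]
dot_text = to_dot(sub, ARTICLES)

print(f"{len(inter)} of {len(sub)} links in the subnetwork connect two articles")
print(dot_text)
```

The ratio of inter-article links to all links in the subnetwork is one simple way to put a number on the "hardly any links at all" impression.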
These first findings suggest a couple of additional questions, for example: What genres do we find in corporate wikis other than articles and the expected definitions and stubs? Are those other genres also found in other corporate communication media (e.g., in corporate blogs, document management systems, etc.)? And, if not, are those other genres in any way innovative means for the organization?
I’ll follow up on those questions in a later post.