Long-term readers may recall that at WebSci'12 last year, I published a paper on the representation of disciplines in the Web Science conference series. It was motivated by the ongoing discussion and (at times) uncertainty in the WebSci community about its own disciplinary composition, and about how to define Web Science itself. The paper explained this problem, described a method for gaining initial insight into disciplinary presence at the WebSci conference, and shared early results.

The previous paper was a great first step and I'm pleased with it, but it had a couple of weaknesses. It used a small data corpus (69 WebSci papers), and relied heavily on subjective interpretations of graph structures and taxonomies. As a helpful reviewer of the paper pointed out, attempts at subject demarcation require knowledge of the political and ideological boundaries that have developed over the years: for example, some people see criminology as a field that stands outside sociology, while others see it as a sub-discipline.

I worked with Georgetta Bordea and Paul Buitelaar to better understand WebSci research while addressing the above issues, and will be presenting our work at WebSci'13. We worked with a much bigger corpus, just shy of 500 Web Science publications. To reduce the subjectivity of interpreting the data, we ran an expert survey of terms rather than attempting to map terms to disciplines ourselves.

As to the precise method and results, I'll quote the abstract:

We applied Natural Language Processing and topic extraction to a corpus of Web Science material, analysing it with graphing and visualisation tools, MatLab and an expert survey. We discovered four communities within Web Science, and trends in the conference series over time (a strong impact from collocation) and format (posters covering a broader range of topics than papers). The expert survey linked highly ranked terms with disciplines, yielding strong links with Communication, Computer Science, Psychology, and Sociology. Controversially, experts described highly ranked topics and suggested disciplines (extracted from WebSci CFPs) as not reflecting the nature of Web Science.

Want to hear more? Come to my talk on Friday 3rd, at 11am!

A visualisation of the extracted terms (colours indicate communities).