Computing statistics for pages in a course

One of the useful things that can be done in Canvas is to access the pages in a course and process them in some way. In this example, compute_stats_for_pages_in_course.py, we access each page and compute some statistics about it, ranging from numbers of characters, words, sentences, ... to various readability scores. We compute these statistics using Erin Hengel's package Textatistic.

This example program fetches the list of all the pages for a specific course, then fetches each page and passes the page's content to Textatistic. Finally, it outputs a spreadsheet of values (for example, statistics_for_course_11.csv).

From this spreadsheet one can produce plots and label specific data points to get an idea of how readable the pages are, or which pages have many words that are outside of a simple vocabulary (note that one can add words to or delete words from this vocabulary, or even substitute one's own word list). For example, the figure below shows the Flesch-Kincaid score with several of the higher values highlighted and labeled with the associated file name. (The whole Excel spreadsheet is statistics_for_course_11-20160712-1.xlsx.)

fleschkincaid_score-plot-20160712.png
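
The fetch, score, and export steps described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code of compute_stats_for_pages_in_course.py itself. The names `base_url`, `token`, `compute_stats`, and `out_path` are hypothetical, and the sketch assumes the standard Canvas REST endpoints for listing and fetching pages, with pagination via the `Link` response header.

```python
import csv
import json
import re
import urllib.request
from html.parser import HTMLParser

# Tags after which a space is inserted so adjacent paragraphs do not run together.
_BLOCK_TAGS = {"p", "div", "li", "br", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}


class _TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in _BLOCK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in _BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Strip tags so that only the readable text is scored."""
    parser = _TextExtractor()
    parser.feed(html or "")
    parser.close()
    return " ".join("".join(parser.parts).split())


def next_link(link_header):
    """Return the rel="next" URL from a Link header, or None if there is none."""
    for part in (link_header or "").split(","):
        match = re.match(r'\s*<([^>]+)>\s*;\s*rel="next"', part)
        if match:
            return match.group(1)
    return None


def get_json(url, token):
    """GET one API URL; return (decoded JSON, URL of the next result page)."""
    request = urllib.request.Request(url, headers={"Authorization": "Bearer " + token})
    with urllib.request.urlopen(request) as response:
        return json.load(response), next_link(response.headers.get("Link"))


def list_pages(base_url, course_id, token):
    """Yield every page summary in the course, following pagination."""
    url = f"{base_url}/api/v1/courses/{course_id}/pages?per_page=100"
    while url:
        pages, url = get_json(url, token)
        yield from pages


def write_stats(base_url, course_id, token, compute_stats, out_path):
    """Fetch each page, score its text, and write one CSV row per page.

    compute_stats is a caller-supplied (hypothetical) function mapping
    plain text to a dict of statistics.
    """
    rows = []
    for summary in list_pages(base_url, course_id, token):
        page_url = f"{base_url}/api/v1/courses/{course_id}/pages/{summary['url']}"
        page, _ = get_json(page_url, token)
        text = html_to_text(page.get("body"))
        if text:  # skip pages with no readable content
            rows.append({"url": summary["url"], **compute_stats(text)})
    if rows:
        with open(out_path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0]))
            writer.writeheader()
            writer.writerows(rows)
    return rows
```

With Textatistic installed, `compute_stats` could be something like `lambda text: Textatistic(text).dict()`, assuming that package's dictionary interface. Keeping the scoring function as a parameter means the fetching and CSV code does not depend on any particular readability library.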

For the same data, we can plot the Dale-Chall score, as shown below:

dalechall_score.png

Sentence statistics (sent_count) and word statistics (word count, polysyllable word count, and Not Dale-Chall count) are shown below:

sentence_and_word_statistics-20160712.png

Not Dale-Chall statistics, i.e., number of non-simple words:

non-dalechall-statistics-20160712.png

The following figure shows the ratio of Not Dale-Chall words to total words for each page (a larger fraction means that more of the page's words are not simple words):

fraction-notdalechall-20160713.png
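
The ratio plotted above can be computed from the spreadsheet with a few lines of code. The sketch below is illustrative only: it assumes the CSV has columns named `word_count` and `notdalechall_count` (the Textatistic attribute names) plus a page-name column, here hypothetically called `url`; adjust these to match the actual header row.

```python
import csv


def load_rows(csv_path):
    """Read the statistics CSV into a list of dicts, one per page."""
    with open(csv_path, newline="") as f:
        return list(csv.DictReader(f))


def notdalechall_fractions(rows, name_column="url"):
    """Map each page name to notdalechall_count / word_count."""
    fractions = {}
    for row in rows:
        words = float(row["word_count"])
        if words > 0:  # skip empty pages to avoid dividing by zero
            fractions[row[name_column]] = float(row["notdalechall_count"]) / words
    return fractions


def top_pages(fractions, n=5):
    """Return the n pages with the largest fraction of non-simple words."""
    return sorted(fractions.items(), key=lambda item: item[1], reverse=True)[:n]
```

For example, `top_pages(notdalechall_fractions(load_rows("statistics_for_course_11.csv")))` would list the pages that are candidates for labeling in such a plot.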

An improved program, compute_stats_for_course.py, keeps more information about the module structure and calculates the readability metrics. There is also a program, augments-course-stats-with-plots.py, to augment the sheet of information about the pages. Both are available at https://github.com/gqmaguirejr/Canvas-tools. Note that these programs can be edited to change which language(s) are removed and which are kept. The program does not yet pass the appropriate language information to the hyphenation package.
An example of part of a chart of SMOG scores is shown below. The string shown on the left of each bar is the name of the module containing the page.

Bar chart of some SMOG scores from course 11