A decade of OLab3: what do 25 million data points tell us?

Our main demonstration server at https://demo.openlabyrinth.ca has now been up and running for a decade. And all through this time, we have been collecting a steady flow of activity metrics on how our materials are used. Every click, every response, even the thoughtful pauses, have all been recorded. So what does this tell us?

With over 25 million data points, we can safely say that this server is well used. Originally set up as a quick test bed to demonstrate what OpenLabyrinth v3 can do, it has been used by many for all sorts of purposes. We emphasize that this is a demo box with no guarantee of uptime, but many authors and groups remain keen to use it. And it has been pretty reliable.

We just published a tech report on the various analytics that we have pulled from its databases. For those who are more technically inclined, feel free to peruse the report which delves into these numbers in more detail, along with some description of how we generated them.

We pulled most of this data directly from the SQL database that runs this server. But we have also used data pulled from our Learning Record Store, using xAPI statements.

This has given us a valuable source of orthogonal data to correlate with our SQL reports. This server has recorded nearly half a million user sessions and over 1.5 million clicks. There are over 2000 maps and over 4000 registered users. And this is just for a single server.

We run 8 servers for PiHPES-related services at UofC. Around the world, we are aware of many other installations of OpenLabyrinth but we do not track these. We know that users from around the world enjoy OpenLabyrinth. A few years ago, we looked at the geolocations of some of them.

Now, let’s try to find those maps that we think have been played significantly, i.e. with 6 or more clicks. Here are the first 30 rows of MostPlayedCaseNamesView. The Cnt column shows how many clicks each map has accumulated.

map_id name Cnt
624 Angus – IPE case 50172
24 Emil, Palliative Care 45820
23 Mildred Blonde 42034
33 John’s back again 24860
553 Digital Professionalism 22697
21 Chester Angermeier 21812
206 Suturing Session 16621
321 Cathy 1 (CAMH ODT Core Course) 16221
388 Lackadaisical Larry – a virtual learner 12814
933 Náhly pôrod v domácnosti 12706
335 Cathy 2 (CAMH ODT Core Course) 12065
937 GDM 11867
304 Sam (CAMH ODT Core Course) 11607
49 VP on VPs 11389
486 Obs SCT Case Series 10187
578 Rushing to Keep Up 10086
5 Welcome 10084
639 Tehotná žena 9844
940 Starostlivosť o ženu pri neefektívnom dojčení 9605
272 Sarah-Jane Pritchard 9495
771 Virtuele SOLK Patiënt: Mevrouw de Vos 9413
770 Virtuele SOLK Patiënt: Mevrouw de Graaf 9180
1551 TTalk: Allison resolving conflicts 9114
909 Perinatálna strata 8890
489 Ed Nekke 8196
207 Kendal Sweetman 8090
1679 Podpora bondingu po pôrode 8014
1922 Shoulder pain in a tennis player 8008
346 Abdominal Pain for SharcFM 7884
640 Diferenciálna diagnostika ikteru 7711

Each time a user clicks on a Link taking them to another Node, it is recorded, along with the state of Counters, the timestamp, user_id and whether this was part of a Course or Scenario. A single session can generate many thousands of rows in this table for cases like Medical Careers. But in crude terms it is an accurate representation of engagement or interaction with a case. The user cannot generate increased data or the appearance of activity by merely clicking randomly on the screen. Only valid links are recorded.
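As a rough sketch of how click counts like those in the table above could be derived from such records, here is a small Python example using an in-memory SQLite database. The `clicks` and `maps` tables, their column names and the sample rows are all hypothetical stand-ins; the real OLab3 schema and the definition of MostPlayedCaseNamesView may well differ. We also assume here that the "6 or more clicks" threshold applies to the total per map.

```python
import sqlite3

# Hypothetical, simplified schema: one row per valid link click.
# The real OLab3 table and column names may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE maps (map_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE clicks (
        click_id   INTEGER PRIMARY KEY,
        session_id INTEGER,
        map_id     INTEGER,
        user_id    INTEGER,
        clicked_at INTEGER   -- Unix timestamp in seconds
    );
""")
conn.executemany("INSERT INTO maps VALUES (?, ?)",
                 [(1, "Case A"), (2, "Case B")])

# Made-up sample data: Case A gets 8 clicks, Case B gets only 3.
rows = [(None, 10, 1, 100, 1000 + i) for i in range(8)]
rows += [(None, 11, 2, 101, 2000 + i) for i in range(3)]
conn.executemany("INSERT INTO clicks VALUES (?, ?, ?, ?, ?)", rows)

MIN_CLICKS = 6  # threshold mentioned in the text


def most_played(conn, min_clicks=MIN_CLICKS, limit=30):
    """Return (map_id, name, Cnt) rows, most-clicked maps first."""
    query = """
        SELECT m.map_id, m.name, COUNT(*) AS Cnt
        FROM clicks AS c
        JOIN maps AS m ON m.map_id = c.map_id
        GROUP BY m.map_id, m.name
        HAVING COUNT(*) >= ?
        ORDER BY Cnt DESC
        LIMIT ?
    """
    return conn.execute(query, (min_clicks, limit)).fetchall()


print(most_played(conn))
```

Because only valid link clicks produce rows, a plain `COUNT(*)` per map is enough here; no de-duplication of stray clicks is attempted.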

How much time is spent on a case? This is surprisingly difficult to answer across the board. While we can tell exactly how long each session was, the variation is so wide, with many power law distributions, that it is not meaningful to describe them with things like the median, maximum or standard deviation. We have lots of sessions where the user drops the case almost immediately (not the Droids they were looking for).
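To illustrate why single summary numbers struggle with this kind of data, here is a small Python sketch using synthetic durations. The Pareto distribution, its shape parameter and the 30-second scale are assumptions chosen only to produce a heavy tail; none of this is real OLab3 data.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Synthetic session durations in seconds, drawn from a heavy-tailed
# Pareto distribution. Shape and scale are illustrative assumptions.
ALPHA = 1.1
SCALE_SECONDS = 30
durations = [SCALE_SECONDS * random.paretovariate(ALPHA) for _ in range(10_000)]


def summarize(values):
    """Collect the usual summary statistics for a list of durations."""
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "max": max(values),
        "stdev": statistics.stdev(values),
    }


stats = summarize(durations)
for name, value in stats.items():
    print(f"{name:>6}: {value:12.1f} s")
```

In a sample like this, a handful of very long sessions pulls the mean and standard deviation far away from the typical session, so the different statistics tell quite different stories about the same data.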

For a more complete look at how we analyzed this, check out our OLab3 case analytics tech report on Dataverse. We looked in more depth at cases which saw significant use, and used some approaches to filter out outlying data.

For an initial look at how much time might have been spent productively on the server, we estimate that it comes to 10,500 hours of play time. This in itself is not inconsiderable and compares favorably with the activity metrics that we examined for our YouTube Clinisnips series (see "100,000 hours: a decade on YouTube").

Which is the most popular case by amount of time spent? We found that to be Angus McWhindae, map 624, an interprofessional education case that was designed to be played in parallel by multiple small discussion groups. This case has accumulated more than 232 hours of learner interaction time.

Let’s try the second one, map 24, Emil, one of our oldest palliative care cases and quite complex. This case has accumulated 151 hours of significant learner interaction. And Mildred (a case on dizziness that we use in a large number of clinical teaching situations with individual learners) has 6200 sessions. This case has accumulated over 109 hours of interaction time.

To give you an illustration of the power law curves that we see when looking at how long a user spends per session, we plotted the duration in seconds on the x-axis against the frequency count of sessions on the y-axis.

For reference, 6000 seconds, which we used for our cutoff, is 100 minutes. This was the longest example of a credible, “truly played and engaged” session that we could find. This assessment of “engaged” was based on other metrics from that session, such as user_responses and consistent time spent per Node (no long interval while the user forgot about the case).

Without filtering for very long or very short duration sessions, the curve is much more exaggerated but also less helpful.
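The filtering described above can be sketched in a few lines of Python. The 6000-second upper cutoff comes from the text; the 30-second lower cutoff, the function names and the sample clicks are hypothetical, and measuring a session from its first to its last click is an assumption that ignores time spent on the final Node.

```python
MAX_SECONDS = 6000  # upper cutoff from the text (100 minutes)
MIN_SECONDS = 30    # hypothetical lower cutoff, not taken from the report


def session_durations(clicks):
    """Map session_id -> seconds between its first and last click.

    `clicks` is an iterable of (session_id, unix_timestamp) pairs.
    """
    first, last = {}, {}
    for sid, ts in clicks:
        first[sid] = min(ts, first.get(sid, ts))
        last[sid] = max(ts, last.get(sid, ts))
    return {sid: last[sid] - first[sid] for sid in first}


def filter_sessions(durations, lo=MIN_SECONDS, hi=MAX_SECONDS):
    """Keep only sessions whose duration lies within [lo, hi] seconds."""
    return {sid: d for sid, d in durations.items() if lo <= d <= hi}


def total_hours(durations):
    """Sum the durations and convert seconds to hours."""
    return sum(durations.values()) / 3600


# Made-up sample clicks for four sessions.
sample_clicks = [
    ("a", 0), ("a", 5),                    # 5 s: dropped almost immediately
    ("b", 100), ("b", 700), ("b", 1900),   # 1800 s
    ("c", 0), ("c", 90000),                # 90000 s: probably a forgotten tab
    ("d", 50), ("d", 5450),                # 5400 s
]

kept = filter_sessions(session_durations(sample_clicks))
print(sorted(kept), total_hours(kept))
```

Only sessions "b" and "d" survive both cutoffs in this sample, which is the kind of trimming that makes the remaining curve easier to read.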

What about ‘case completion’ metrics? As noted in the report, we found this quite difficult to assess. It does raise the higher issue in the minds of some educators as to whether the learner should be given credit for ‘completing’ the case. Indeed, this is foundational to SCORM and most other badge or credit-related systems that purport to assess ‘learning’.

If you have a resource that is essentially linear in style (such as most online course materials, which are just page-turners: a means to send information to the learner), then getting to the last page means that the learner ‘completed’ the lesson. Of course, this means very little educationally. 

If we accept the premise that the majority of our users are experienced learners and are discriminating in the use of their time, they will not waste time on activities that do not contribute to their learning. There is very little frivolous play of these cases – that aspect is not like YouTube at all. So in relative terms for comparing cases with each other, or learners with each other, these simple metrics would appear to have value. Of course, this should be tested more thoroughly but at least this paradata gives us the tools to measure these factors.

The purpose of this report was to illustrate not just the extent to which OLab has been used, but also the wide variety of paradata that can be extracted from this usage. It does allow us to weed out some weaker areas (of which there are many), and potentially learn from the stronger examples.

It does provide a range of objective data that is orthogonal to our usual evaluation tools: happiness ratings and similar subjective surveys of learner opinions. The business world decided decades ago that customer satisfaction surveys tell you little but activity streams (who buys/does what and when) are a rich source of data that can be captured without threatening the woman-on-the-street with a clipboard. I think we are all a little tired of surveys.
