
The Nuances of Data-Informed MOOC Design | A Research Talk with Mary Ellen Wiltrout

Mary Ellen Wiltrout, director of online and blended learning initiatives at MIT's Department of Biology, highlights several research projects, including studies of engagement in self-paced versus instructor-paced courses, course design that deters dishonest behavior on MOOC summative assessments, and data-informed revision of formative assessments. This research talk is part of the center's Innovation Insights speaker series. #mooc #onlinelearning #research #highered

Center for Academic Innovation


Thanks. Mary Ellen leads digital learning strategy for biology at MIT. She mentors instructors, postdocs, and students; manages projects involving MOOCs and hybrid learning experiences; and conducts research on the design of digital learning experiences. This is really cool: three of her MOOCs are ranked on the best online courses of all time list by Class Central. Prior to her current position, Mary Ellen earned her PhD in biology from MIT and then taught at Harvard. Her broader roles include co-leading the MIT Digital Learning Lab, organizing conferences and workshops on digital learning, and serving on committees across MIT. I'm very pleased to welcome Mary Ellen. She'll give her talk, and we'll save questions for towards the end. Mary Ellen, go ahead and take it away.

Thank you for that introduction, and thank you for having me today. I'm excited to share the work we've done over the years, to give you some idea of what I mean by the nuances of data-informed MOOC design and what we've learned.

So why do we at MIT invest in creating open online courses? That really comes down to the core of our mission. The mission statement itself says we aim to advance knowledge and education in STEM and other areas of scholarship to best serve the nation and the world. My own start with MOOCs, massive open online courses that many of you on this Zoom are probably familiar with, came in late 2012, when we were beginning to discuss a possible introduction to biology MOOC at MIT. At the time I was still working at Harvard, but by January 2013 I had switched back to MIT to work full time on creating that introduction to biology MOOC, which we released to the world a few months later, and that started my career in this space.

What I want to discuss today is how the studies we've done over the years have challenged common assumptions that people make when designing MOOCs.
Some of the assumptions we'll discuss today, with highlights from the studies, include: that video length matters the most for engagement; that you need to organize your course content into many, many small sections and not let any get too long; that open online courses must run self-paced; that you need to proctor and police to control cheating; that increased enrollment is good for engagement; that instructor presence does not influence enrollment or engagement much; and that instructors do not have to revise after initial course development. These are assumptions that many, or at least some, people hold when it comes to course design.

Before I get into those specific studies, I wanted to mention that in an essay in Frontiers we discussed how, in all of our MOOCs from the beginning, we have applied principles from the learning sciences in what we do and how we do it, and in that essay we summarized how we use the learning sciences and the research behind those decisions. In the outer ring you can see the different learning sciences, such as multimedia learning theory or the idea of concrete examples, and the colored circles on the inside show how those learning sciences apply to the various course components, whether summative assessments, formative assessments, or the multimedia.
To address the first assumption I mentioned, study one asks: is video length the most important factor for engagement? When we worked on this study, we weren't originally going after that question; that wasn't the purpose of the paper. What we noticed was that we had created animated videos, like the one you'll see here on my left. For these, we went through a scripting process to use precise language, following multimedia principles for video design, and the video was fully animated with voiceover. That is in contrast to the majority of the videos in our MOOCs at the time, and still today, where someone is teaching in an MIT classroom, we show what they are writing on the board, and we add in images where they may have used slides in the classroom. We noticed that in the animated videos there seemed to be a drop-off in engagement compared to the classroom videos, so we wanted to look: is this just one video, or is there really a trend of lower engagement? The percent of learners who viewed the video and completed it was much lower, between 60 and 65%, for the animated videos compared to the classroom videos. What you can also notice in this data is that for both classroom and animated videos there is no very noticeable trend as length gets longer. So length is not the biggest factor causing the loss of engagement here; it's actually the video style, despite designing the animations with the best intentions around learning principles.
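As a rough illustration of the kind of comparison described here, the sketch below computes per-video completion rates by style and checks how strongly length tracks completion. The file name and column names are hypothetical, not the study's actual data pipeline.

```python
# Illustrative sketch only, assuming a hypothetical per-view table with columns:
#   video_id, style ('animated' or 'classroom'), length_sec, learner_id, completed (bool)
import pandas as pd

views = pd.read_csv("video_views.csv")  # hypothetical export

# Completion rate per video: share of viewers who watched it to the end
per_video = (
    views.groupby(["video_id", "style", "length_sec"])["completed"]
    .mean()
    .rename("completion_rate")
    .reset_index()
)

# Compare styles: is the drop-off tied to style rather than length?
print(per_video.groupby("style")["completion_rate"].describe())

# Within each style, how strongly does length correlate with completion?
for style, grp in per_video.groupby("style"):
    corr = grp["length_sec"].corr(grp["completion_rate"])
    print(f"{style}: correlation(length, completion) = {corr:.2f}")
```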
For the second assumption: does content structure influence engagement? What I mean by that is: you may recognize from an Open edX-style course that you have a learning sequence, with videos appearing in a row under some title like "Week 2, Lecture 3," and usually there are questions interspersed between the videos, so there's a video, then a question that appears immediately after it, in between another short video segment. In this study we actually didn't use the MOOC population; we used students at MIT who were using the MOOC materials in their on-campus course, and for them these materials were completely optional, with nothing for a grade. We wanted to know whether the position of ungraded material affects students' behavior in attempting a problem. There was no requirement to do these problems; we just knew there was high activity on these practice materials, and we wanted to see whether position matters. So we set up a content experiment with group A and group B. For the first problem in the experiment, group A would see it inside the video sequence, meaning it appeared between the video segments, or outside, meaning it sat in a separate section labeled problem set or practice problem set; then that same group would see another problem placed the reverse way. So it wasn't that one group got everything in the same location; it was a mix. When we looked at students' behavior in completing these optional questions, out of 20 questions there were 18 they were more likely to complete if the question was inside the video sequence than if it was outside in a section called problem set or practice problem set. We ran the study over two years and the results reproduced: in the next year, 16 of 18 problems were more likely to be completed when they were inside the video sequence.
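The sketch below shows one way such a content experiment could be tallied: for each problem, compare attempt rates between learners who saw it inside the video sequence and those who saw it outside. The table layout and file name are assumptions for illustration, not the published analysis.

```python
# Illustrative sketch only, assuming one row per (learner, problem) with columns:
#   problem_id, placement ('inside' or 'outside'), attempted (0/1)
import pandas as pd
from scipy.stats import fisher_exact

events = pd.read_csv("content_experiment.csv")  # hypothetical export

results = []
for pid, grp in events.groupby("problem_id"):
    inside = grp[grp.placement == "inside"]["attempted"]
    outside = grp[grp.placement == "outside"]["attempted"]
    table = [[inside.sum(), len(inside) - inside.sum()],
             [outside.sum(), len(outside) - outside.sum()]]
    _, p = fisher_exact(table)  # does placement change the attempt rate for this problem?
    results.append({
        "problem_id": pid,
        "inside_rate": inside.mean(),
        "outside_rate": outside.mean(),
        "p_value": p,
    })

summary = pd.DataFrame(results)
# How many problems were attempted more often when placed inside the video sequence?
print((summary.inside_rate > summary.outside_rate).sum(), "of", len(summary))
```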
The next question and study looked at instructor-paced versus self-paced courses. The assumption may be that an open online course needs to run self-paced, so that you're giving learners flexibility to complete and more of a chance to do the work they want to do within an optional course. Instructor-paced means that content is released in gradual pieces, there are due dates at certain times throughout the course, and overall the course has a short duration; in this study that was about 50 or 60 days. In the self-paced courses, all the content is released at the start and fully available to anyone, there's one due date at the very end, and the course can be open for as long as a year. In this study we looked at the impact of the self-paced runs versus the instructor-paced runs. For our quantitative biology workshop course we had years where it was self-paced and years where it was instructor-paced, so we can compare them; in this case, 2016 and 2017 were the self-paced years, and we went back to instructor-paced because we had the feeling that there was less engagement in the self-paced years. We wanted to look not only at raw numbers but at proportions: is there less or more engagement in the different modes? Consider "participated," people who clicked into the course at least once; "explored," meaning they clicked into half of the chapters of the course; "completed," meaning they passed the threshold to earn a certificate, or simply passed the threshold to complete; and "verified," meaning they signed up for the paid certificate track. If you look at those metrics for the self-paced years compared to the instructor-paced years, almost always, and especially for explored, completed, and verified, the percentages of learners are lower than in the instructor-paced runs.

Now, did the self-paced learners take advantage of the greater amount of time? Yes: when we looked at when they submitted problems, the span from the first submission to the last covered a greater number of days, with some people using a range of more than 300 days. Compare that to instructor-paced, where the range really falls within that 50 or 60 days because the course starts and ends within that period. So people who completed in self-paced runs used a larger range of days, but we also noticed that the total number of distinct days on which they were active was actually fewer than when the course ran instructor-paced. Even though the course is open longer and they use a larger range, the number of individual days on which they logged in and completed questions was smaller. That may mean self-paced mostly helps the people who can go quickly through a course. Overall, we didn't find that self-paced increased the engagement metrics of participating, exploring, completing, or verifying, and we also looked at posting to the forum, attempting a problem, and watching a video. Completers in self-paced runs did submit over a greater range of days, but they submitted on fewer distinct days and attempted fewer problems per day.
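A minimal sketch of the proportion comparison described above, assuming per-learner activity flags per run; the column names and file are hypothetical stand-ins for the course data.

```python
# Illustrative sketch only, assuming one row per enrolled learner per run with flags:
#   run, pacing ('self' or 'instructor'), clicked, explored_half_chapters, completed, verified
import pandas as pd

learners = pd.read_csv("learners_by_run.csv")  # hypothetical export

metrics = (
    learners.groupby(["run", "pacing"])
    .agg(
        enrolled=("clicked", "size"),
        participated=("clicked", "mean"),                 # clicked into the course at least once
        explored=("explored_half_chapters", "mean"),      # reached half of the chapters
        completed=("completed", "mean"),                  # passed the certificate threshold
        verified=("verified", "mean"),                    # signed up for the paid track
    )
)
# Proportions (not raw counts) make self-paced and instructor-paced runs comparable
print(metrics.round(3))
```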
In the next study I'm going to talk about a competency exam model that we originally started with the introduction to biology course and also use for another biochemistry MOOC. In this model we are looking at assessment: what can we do in summative assessment to maintain the integrity of the certificate, so that learners who earn it feel they really have an accomplishment, and so that it cannot be easily obtained by others through cheating? Early on, after a year or two of MOOCs, there were studies showing how learners, because it's so easy to register for a course, could register with multiple usernames and use techniques like harvesting to take answers from one account to another. To combat that sort of behavior, we designed the course setup and structure so that, within the course, the formative assessment for learning was available self-paced with feedback and was ungraded. We set that up from the beginning: everything released at once, so everyone could work at whatever pace they wanted for several months, through material that included between-video questions as well as more in-depth problem set and exam-style questions. Then we had a competency exam as the summative assessment in the course. It was available for one specific week at the end of the course run; it was timed; and it was verified-track access only, and even though that may be more common now, at the time this was one of the first courses to restrict the summative assessment to the verified track. It was randomized per problem, with multi-part questions: every page of the exam has one of four different versions, and they look very similar, so at first glance it seems like everyone has the same exam. There's no feedback on correctness, no indication of right or wrong, and no explanations; there's no exam access after completing; and there's no discussion forum during this time where people could publicly discuss questions, only email access. We also try to design the exam so it really measures whether learners know the material: there are over 100 questions, distributed across Bloom's levels from simple knowledge-based questions to application and prediction, and we try to test as many of the course's learning objectives as possible.

In this study, which I completed with former MIT colleagues who are now in Israel and Spain, we took the approach of looking at aberrant performance behaviors: how learners performed on questions and when the patterns looked odd. We compared a run with no competency exam model to four runs with the competency exam model, and we could see that every time the course ran with the competency exam model there was less of this aberrant performance. Then we looked at a signature in response-time patterns, and we saw that when the competency exam was running there were fewer abnormalities in the response times learners gave. In both cases we're talking only about the people who completed and earned a certificate.
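The talk doesn't spell out the detection method, so the sketch below is only a generic screen for one kind of response-time abnormality: correct answers submitted implausibly fast for a given item. The schema and thresholds are assumptions, not the study's actual signature.

```python
# Illustrative sketch only, assuming a hypothetical log with one row per graded submission:
#   learner_id, item_id, correct (0/1), response_time_sec
import pandas as pd

log = pd.read_csv("exam_submissions.csv")  # hypothetical export

# For each item, derive a robust "too fast" cutoff from its response-time distribution
q1 = log.groupby("item_id")["response_time_sec"].transform(lambda s: s.quantile(0.25))
iqr = log.groupby("item_id")["response_time_sec"].transform(
    lambda s: s.quantile(0.75) - s.quantile(0.25)
)
log["too_fast"] = log["response_time_sec"] < (q1 - 1.5 * iqr).clip(lower=0)

# Correct answers given implausibly fast are one simple signal worth a closer look
suspicious = (
    log[log["correct"] == 1]
    .groupby("learner_id")["too_fast"]
    .mean()
    .sort_values(ascending=False)
)
print(suspicious.head(10))
```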
In the next study I'm going to talk about enrollment, and whether higher enrollment leads to higher engagement. Like many others, during the pandemic we saw a spike in the number of learners, and you can see here that during the 2020 runs of the course the enrollment rate was much higher than before the pandemic. This is work we did with Kitty Blackwell, who is now a graduate student in the department but did this as a remote undergraduate project before becoming a grad student. We wanted to look at engagement: not just how many enrolled, but whether they participated. Through 2019, the percent who participated, meaning they clicked into the course at least once, ranged from 68 to 71%; in the 2020 runs there was a spike in March 2020, but by July 2020 there was already a dip in participation, with people who registered not necessarily participating in the course at all. Then we looked at video viewing; we looked at everything, including problem sets and forums, but video viewing really stood out as showing a difference. Compared to 2019, in March 2020 the certified learners, the people who earned a certificate, viewed a high number of unique videos; but by the July 2020 run those numbers dip a good bit, and the number of unique videos viewed by someone as an auditor, a verifier, or someone who is certified is lower, meaning there's less engagement. We can look at that pattern in another way through the verified track. The 2020 runs had the highest number of people signing up for the verified track, and as a percentage that was also higher; but the percent who earned a certificate on the verified track decreased for the 2020 runs, from almost 18% down to around 12% for the July 2020 run. Another interesting behavior we see in this course model is that it's not necessarily that learners are failing the exam; it's that they don't even show up and try. If you take the percent of verified-track participants who actually attempted the exam, the pass rate among those who tried is much higher, roughly 45% before; by the July 2020 run it goes down to about 35% of those who tried. So higher enrollment doesn't necessarily mean learners are more engaged.
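A small sketch of the two rates discussed here, per run: participation among the enrolled, and pass rate conditioned on actually attempting the exam. The flags and file name are hypothetical.

```python
# Illustrative sketch only, assuming one row per enrolled learner per run with flags:
#   run, verified (bool), clicked (bool), attempted_exam (bool), passed_exam (bool)
import pandas as pd

df = pd.read_csv("enrollment_engagement.csv")  # hypothetical export

per_run = df.groupby("run").agg(
    enrolled=("clicked", "size"),
    participation_rate=("clicked", "mean"),
)

# Certificate rate on the verified track, and pass rate among those who actually tried the exam
verified = df[df["verified"]]
attempters = verified[verified["attempted_exam"]]
per_run["cert_rate_verified"] = verified.groupby("run")["passed_exam"].mean()
per_run["pass_rate_of_attempters"] = attempters.groupby("run")["passed_exam"].mean()

print(per_run.round(3))
```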
Another aspect is not just designing the course but what you do when the course is live and running: does sending course emails matter? I think the sending of emails has dropped off a lot over the last 10 years, and it's hard to measure whether anything is happening, whether it's worth sending or not. This shows an example: on March 2nd I sent an email to everyone in a course that had ended or recently ended, pointing them to other courses that would be starting soon or were open, and I used Bitly links so that, in an informal-study kind of way, I could look at the clicks. On March 2nd I see a spike in the number of clicks on my links for all of the courses I mentioned in the email, so I know that, at minimum, people are clicking through to the registration pages of those courses when I send those emails. Overall, newer courses have fewer clicks here, while older courses are into the thousands or even tens of thousands of clicks on the links I've created and shared by email or through our website. Engagement is a little harder to measure: the blue line here shows active learners, people who just click into the course, and the orange and green lines show watching a video or trying a problem. You see fluctuations week to week that could be due to emails or other factors; in this course, after the March 2nd email, maybe a few more people are clicking in, but there's not necessarily a big impact on engagement with problems or videos. In other courses running more recently, where I've been sending some emails over the last month or so, you can see a pattern of increasing problem completion as well as video watching, though that could also just be a factor of the course reaching the point of a summative exam. Pulling apart engagement here would take more study to really understand what matters. In terms of enrollment numbers, the thing I've seen affect enrollment the most is simply MIT sending an email to people who have signed up for other MIT courses, specifically the MITx open online courses. When two of our cell biology courses and our genetics course were advertised in an email that went out on January 6th, enrollment spiked significantly around that date for the three courses that appeared in the email. We also had another cell biology course open for registration that was not included in the email, which served as a good control (the green line here) showing that if a course is not in the email, you won't see that spike in enrollment.
There are other cases with smaller peaks, maybe because of the emails I'm sending, but relative to the MIT email they're not noticeable in the plot.

The final study I'll talk about today is the idea of course revision. After you've designed the course, run it, and looked at how students performed, what do you do? Do you just leave it to continue running, or do you go back and revise? I think it's easiest for all of us to ignore it and not do the work of evaluating how the content is working for learners, but we wanted to set up a structure and a revision framework to think through our process of revising courses. In this work, which we did with Mooney Caitlyn Darcy, our general pattern is to run the course, focusing in this case on questions used for summative assessment, and identify low-performing questions; hypothesize the source of the low performance; evaluate the cause; and then implement revisions and see the results. Going through that process: after identifying low performance, we had two categories for the source, either instructor-based or learner-based, and you'll see in a second what we mean by instructor-controlled versus learner. On the instructor side, the cause is something like unclear wording or narrowness in the grading, which is more our fault for why performance was low on a given question. On the learner side, it's what we infer from the answers they gave or from discussion-forum posts: there may be inadequate scaffolding, which is partly our fault as well, or there could be disregarded instructions or a failure to synthesize multiple concepts. After we hypothesize the cause, we decide a mode of action: adjusting the grading code, modifying wording, emphasizing key instructions, or perhaps giving hints.

So is that work worth it? Is there a difference when you go through that process? We looked at a cell biology course and analyzed its quiz questions over three runs. We found 21 questions that met the criterion of performance below a 70% threshold, meaning fewer than 70% of learners answered correctly. Dividing those questions into sources, we categorized four as instructor and 17 as learner, assigned causes in the categories given here, and then decided how to improve. In some cases the improvement path was to leave the question unrevised, because sometimes performance is low simply because the question is harder, and nothing obvious is missing to explain why learners get it wrong. If we look at the results for these changes from run one compared to runs two and three, you can see from the green and purple bars that there is a jump in performance to above the 70% threshold after making these revisions. It's not every question, but for a number of them we see that jump, and in a number of cases the increase is significant. So in this case the time is worth it if we want to be sure learners are getting the best experience when going through these quizzes, not feeling frustrated and possibly deciding to quit because it's a frustrating experience.
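A short sketch of the flag-and-recheck step in this framework, assuming a per-run table of question performance; the 70% threshold comes from the talk, while the schema and file name are hypothetical.

```python
# Illustrative sketch only, assuming one row per (run, question) with the fraction correct:
#   run (1, 2, 3), question_id, pct_correct
import pandas as pd

scores = pd.read_csv("quiz_question_performance.csv")  # hypothetical export
THRESHOLD = 0.70  # criterion from the talk: below 70% correct flags a question

wide = scores.pivot(index="question_id", columns="run", values="pct_correct")

# Questions flagged as low performing in run 1 (candidates for revision)
flagged = wide[wide[1] < THRESHOLD]
print(f"{len(flagged)} questions below {THRESHOLD:.0%} in run 1")

# After revision, how many flagged questions rose above the threshold in run 2 or 3?
improved = flagged[(flagged[2] >= THRESHOLD) | (flagged[3] >= THRESHOLD)]
print(f"{len(improved)} of {len(flagged)} cleared the threshold after revision")
```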
Going back to the seven assumptions and the challenges that came up in these studies. First, "video length matters the most for engagement": I've shown you in our early study that video content and style can actually outweigh length. For the advice to organize course content into many small sections, I've shown that learners engage more with fewer breaks in the content sequence, and there have been other studies since that paper showing in different ways that more, smaller sections didn't necessarily help and actually caused drop-off. For the assumption that open online courses must be self-paced for flexibility, there's no evidence, at least in our study, that self-paced mode resulted in more engagement or completion; there are other reasons to choose self-paced over instructor-paced, but engagement or completion might not be what you gain. It's very common practice now, with a lot of discussion, to proctor and police to control cheating, because that discussion became so prevalent during 2020; but we have shown that course design alone can deter cheating, and you may not need to go to those other lengths if you design one setup, with as many techniques as possible, to deter cheating using evidence-based methods. "Increased enrollment means more engagement": we find that it does not necessarily correlate; many people enrolling, especially during 2020, may not have been as committed, and other factors of attention and distraction during that period may have led to less engagement. "Instructor presence cannot influence enrollment or engagement much": we find that even though an instructor email might not produce huge spikes, there's definitely something there; the numbers show that instructor emails matter, and the school can affect things like enrollment even more. And "instructors do not have to revise the course after the initial development": we find that revision improves learner performance, so it's worth the time and effort to keep revising as you run open online courses, or other online courses, or any course really.

The key takeaways for today: open online courses are not all the same, so make decisions considering the nuances. Always be testing your assumptions; even if something has been shown in one paper to work a certain way, that doesn't mean your course content and your videos are the same scenario with the same outcome. And always be looking at the data; there's access to so much data for online courses, and often people don't even look to see what's going on week to week. If you use it as best you can, you may be able to find the patterns and abnormalities worth looking into in more detail. With that, I want to thank you for your time today. I'm looking forward to any questions or discussion on these topics, and I hope this summary and these highlights from our studies have been helpful in making you think about these design processes. Thank you.
Thank you. As folks have questions or comments you want to raise, I'll invite you to use the reaction button on the bottom bar and go ahead and raise your hand, and we'll take questions from there. As folks are thinking about what they'd like to ask and getting situated, I'll start with one. It's about study five, on enrollment and engagement. I just wanted to hear you pick that apart a little more: of course it was a unique time, and there were all kinds of factors as to why somebody might not have been super motivated in July, in summer 2020. I was wondering, if you reran that study now, do you think you'd find the same result?

Yeah, I think that's a very good question for the time, because I feel, without doing the detailed data analysis, that engagement has stayed at that lower level since the 2020 period. Before 2020, people were more likely to keep going, to get further in the course, to watch more videos, or to do more problems, and that sort of behavior has dropped off even more after the pandemic. It could be something about attention that has changed over that period, or maybe something about lower motivation. Ideally we want people to be intrinsically motivated, so that they complete these courses, or do whatever parts they want, out of their own interests and goals to learn. So I would say you would probably still see those kinds of results. In general, I've seen that when courses are broadcast, advertised in some way to a broader audience, such as a link in the New York Times or an email that goes out to all edX learners, that will probably lead to more people joining and then quitting than if it goes out to a more targeted email list of people who have done biology courses before. The people joining may not share the expectation, which those who have taken the MITx courses in biology or other disciplines have, that this is truly a full experience of trying to complete a course very much like what happens on the MIT campus, not something you could finish in a few hours in one day. So we would get more people who enroll but quickly lose interest.
Thanks. Any other questions from the audience here? Oh, Peter, are you... I'm not hearing you; I see you're unmuted, but I can't hear you. Okay, how about now? That's good. Great, thank you. So, a question: thank you again for a wonderful presentation and for sharing these studies. I was wondering, for the first study regarding the video styles, were you able to tease out further characteristics of the styles of videos that seem to elicit more attention? You have what looks like a traditional lecture kind of video and then the animated example kind of video. Were there other styles that you observed, and things you noticed as far as learning engagement and how long people attended to these videos?

Yeah, we definitely looked at other factors that could be causing this, and we were able to eliminate possibilities more than really pinpoint the cause. We had different hypotheses, for example: what if it's the speaker? The speaker we had in the animations was a woman, and the speaker in the lectures, for most of them, was male. So we tested using all of the voices of the people involved in the course, overlaid on an animation, and we did not see a difference in that engagement drop; no matter who was speaking, we would see the drop. Our ending theory, which we didn't necessarily pursue, was that maybe it comes down to natural voice. When someone is speaking in a room with MIT students, it's much easier for them to engage person to person, look at the audience, respond to the audience's feedback, and speak in a more natural tone. These are really subtle things that, to most instructors, aren't necessarily noticeable, but you may feel them when you're watching a video online. With the scripted videos, someone is more or less reading the script, and even if they practice as much as they can, it's scripted, and maybe not a natural tone or natural flow, even though the person doing the voiceover was really good at it; it doesn't mean it sounded as natural or was as engaging. Or it may be that students feel the classroom content is the priority, whatever was happening in the classroom, and if something was animated they considered it more optional: "oh, it's just an animated video, so I can ignore that, but if it's in the classroom then I have to take it much more seriously." Those are some of the ideas we had.

Thank you. Just a quick comment or observation: I wonder, with the learners you surveyed or observed, mostly MIT students, whether the expectation of these students, given that they're more predisposed to or used to lecture-style courses, versus, say, someone in the workforce outside of school, had something to do with it: "I'm more used to watching a lecturer, a professor in a classroom, therefore I pay more attention," versus someone else who is in a workplace looking for professional development. I wonder if you might have a comment about that?

Yeah, that would totally make sense, except that this study was done with the MOOC learners. We know MOOC learners tend to be professionals who already have degrees too, so it may be the same type of people: they went to college, they have this signaling that this is the format they're supposed to learn in. So it may be the same sort of question, but it wasn't the MIT students; even with the MOOC learners, this is the pattern we were seeing. Thank you. Thank you.
So the experiment there is like putting a robot in the classroom and letting it go, and we can do that now, right, with AI: just have the robot give the lecture and see what happens. Yeah, you could do that at MIT.

Well, I'll ask one more. This is on study seven, where you were looking at these low-performing questions and then revising them to make them better. I was wondering whether there are certain students who were getting most of those questions wrong, or more of those questions wrong; how distributed were those incorrectly answered questions across the learner population?

Yeah, in these quizzes there are definitely learners doing poorly and learners doing really well across those questions. Some people generally have no room for improvement because they're already getting most of them right, but as an overall population there's room for improvement. We didn't specifically look at jumps between low-performing versus average versus high-performing students to see where the gains mostly were. But in that same study we did look at repeat learners, because in MOOCs people can take the course as many times as they want: are we seeing this jump just because people are more familiar with the content, taking the course for the second or third time? We were able to eliminate that as a factor; the increase in performance wasn't because of the learners taking the course multiple times.

Okay, well, I'll just invite us to thank Mary Ellen one more time. Thanks for joining us and sharing your studies. Chelsea, if you want to go ahead and put up the next event: the event coming up quite soon is the EdTech Showcase. For those of you around campus, we'd love to see you; it's two weeks from Wednesday, I think. Come join us in the afternoon and you can learn about a couple of the different tools in the showcase. Following that, our next research seminar will be given by our very own Sarah Dysart in early March, and we'll be looking forward to hearing about the study she did for her PhD. So come join us again, either on the 21st or in early March for our next one, and thanks again, Mary Ellen, and thanks to all for joining this afternoon.

Thank you for having me. Thanks, everyone, for joining.
