
Tracker: Full learner data utilization #549

@kohlhase

Description

We are currently not utilizing all the learner data we potentially have access to for generating updates to the learner model. This wastes many opportunities to improve the learner experience in ALeA.

This issue is a design space and tracker for accomplishing full learner data utilization.

We have the following current and potential data streams we could use to gain information about the learners:

  1. ALeA Browsing stream: which concepts are visible, where the mouse is, which clicks, hovers, and video playback events occur, ...
  2. Gaze stream: where learners are looking, what they are fixating on, ... <-- eye-tracking data
  3. Facial Expression stream: which micro-expressions occur, ...

We already have access to the first one; the latter two we can get by interpreting the video stream from the web camera. There may be others, e.g. audio, ...
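
To make the design space concrete, here is a minimal TypeScript sketch of what the three raw event streams could look like; all type and field names are hypothetical, not an agreed-upon schema.

```typescript
// Hypothetical raw-event types for the three streams discussed above.
// Names and fields are illustrative only.

interface BrowsingEvent {
  kind: "visibility" | "mousemove" | "click" | "hover" | "video";
  timestamp: number;                 // ms since epoch
  target?: string;                   // id / CSS path of the DOM element involved
  detail?: Record<string, unknown>;  // e.g. playback position for video events
}

interface GazeEvent {
  kind: "fixation" | "saccade";      // derived from webcam-based eye tracking
  timestamp: number;
  x: number;                         // viewport coordinates
  y: number;
  durationMs?: number;               // for fixations
}

interface FacialExpressionEvent {
  kind: "micro-expression";
  timestamp: number;
  label: string;                     // classifier output, e.g. "confusion"
  confidence: number;                // 0..1
}

type RawEvent = BrowsingEvent | GazeEvent | FacialExpressionEvent;
```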

In their raw form, these data streams are high-volume/low-value, so we need to interpret them (in multiple steps) to gain knowledge about learner competency updates that can be sent to the learner model provider (LMP). But they also include low-volume/high-value interactions for complex/specific learning objects (LOs), like solving quiz questions, where we can directly interpret the interactions; these we already have somewhat under control in ALeA, so we concentrate on the more numerous raw events here.

The first interpretation step is to correlate all the raw (browser, video-analysis) events with the ALeA semantics to create semantically relevant events (SREs): we conjecture that we are only interested in events that can be correlated to semantic objects (and that all other raw events can be disregarded). Here we correlate them to LOs. Note that LOs are nested (sections, subsections, slides, logical paragraphs, down to definienda and term references -- all of them have FLAMS URIs in ALeA4, so we can reference those).
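
As a strawman, an SRE could simply be a raw event annotated with the FLAMS URI of the innermost LO it correlates to, plus the chain of enclosing LOs; again, all names are hypothetical.

```typescript
// Hypothetical SRE shape: a raw event (see the RawEvent union sketched above)
// annotated with the LO(s) it correlates to. Since LOs are nested, we keep the
// whole chain of FLAMS URIs from the innermost LO outwards.

interface SemanticallyRelevantEvent {
  raw: RawEvent;         // the underlying browser / video-analysis event
  loUri: string;         // FLAMS URI of the innermost correlated LO
  loChain: string[];     // enclosing LOs, innermost first (paragraph, slide, section, ...)
  sessionId?: string;    // set once the event has been assigned to a session
}
```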

In all of this we should make sure that we can experiment with and evaluate these SREs. For that we need to have a way to record and export the SRE streams for external analysis.

This gives the first set of tasks at the SRE level:

  • Catalog all SRE types and document their structure and components.
  • Build an analysis function that turns streams of raw events into SRE streams (see the sketch after this list).
  • "sessions" are a special (complex) SRE, they correspond to extended, contiguous times of learner activity (from start to stop). They provide semantic context to the SREs involved.
  • Provide an SRE study API, so that we can expose study participants to controlled ALeA interaction settings and record the SRE streams their interactions produce.
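
Purely as an illustration of the tasks above, the analysis function, sessions, and the study API could have signatures roughly like these (all names made up):

```typescript
// Hypothetical signatures for the SRE-level tasks.

// Turn a stream of raw events into a stream of SREs; raw events that cannot
// be correlated to any LO are dropped.
declare function rawToSRE(
  raw: AsyncIterable<RawEvent>,
  resolveLO: (ev: RawEvent) => { loUri: string; loChain: string[] } | null
): AsyncIterable<SemanticallyRelevantEvent>;

// Sessions as a special, complex SRE: an extended, contiguous span of learner
// activity that provides semantic context to the SREs it contains.
interface Session {
  sessionId: string;
  start: number;                          // ms since epoch
  end: number;
  events: SemanticallyRelevantEvent[];
}

// Study API: record the SRE stream produced under a controlled interaction
// setting and export it for external analysis.
interface SREStudyAPI {
  startRecording(studyId: string): void;
  stopRecording(studyId: string): Session[];
  exportRecording(studyId: string, format: "json" | "csv"): Promise<string>;
}
```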

We assume that every session corresponds to one or more study interactions of the learner (though it may be preempted by external events). So the second interpretation level is to group SREs into study interaction patterns (SIPs) that carry some study intent, which we want to uncover in this step. Here are some initial SIP types, ordered by complexity (a possible encoding is sketched after the list):

1.a. Encounter (A): i.e. "perceive but do not interact with an LO A", e.g. reading over a pink or blue word (reading over other words does not count as an SRE and must therefore be disregarded).
1.b. Visit (A): i.e. "perceive and significantly interact with an LO A". The significant interaction can be hovering over a blue word long enough that it is plausible to assume the learner is reading it. But it can also be reading on after encountering a pink word (definiendum) and then over its definiens. Here the time period is one indicator of "significance".
2.a. Revisit (A): i.e. visit A again (probably intentionally) after another sequence S of SIPs. Here the intermediary sequence S is important for significance and meaning (see below).
2.b. Follow-Up (A,B): i.e. visit B while studying A; here we have a hierarchy between A and B, where A still carries the study intent.
3.a. Forage (A): i.e. forage for information about the LO A. This covers all kinds of information-gathering activity.
4. Study (G): any purposeful interaction with the LOs to close a competency gap G (in terms of Bloom's taxonomy).
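
One possible way to encode these SIP types is as a tagged union (a sketch only; the field names, and in particular the competency-gap representation, are assumptions):

```typescript
// Hypothetical tagged union for the SIP types listed above.
type SIP =
  | { kind: "encounter"; lo: string }                               // 1.a
  | { kind: "visit"; lo: string; durationMs: number }               // 1.b
  | { kind: "revisit"; lo: string; between: SIP[] }                 // 2.a: the intermediary sequence S
  | { kind: "follow-up"; intentLO: string; visitedLO: string }      // 2.b: A carries the study intent
  | { kind: "forage"; lo: string; sources: string[] }               // 3.a
  | { kind: "study"; gaps: CompetencyGap[]; justification: SIP[] }; // 4

// A competency gap in terms of Bloom's taxonomy.
interface CompetencyGap {
  lo: string;            // FLAMS URI of the LO the gap refers to
  dimension: "remember" | "understand" | "apply" | "analyze" | "evaluate" | "create";
  delta: number;         // estimated change in competency
}
```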

The last one, Study (G), is what we can report to the LMP. The lower-level (levels 1-3) SIPs can act as justification for the "Study G" judgement of the interpretation. The judgements (a study gap G together with its justification J) are communicated to the LMP, which uses G for the update and puts J into the logs.

The list of SIPs above is by no means complete (and probably not even correct). Consider the following example: suppose we identify this SIP sequence (a sub-session?):
a. visit A
b. visit B (an example for A)
c. revisit A
d. revisit B

We can interpret this as trying to understand A (examples generally impart understanding). So the study gap G would be an increase in "understand A" and "remember B", with the sub-session above as a justification.
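
In the hypothetical SIP encoding sketched after the SIP list above, this sub-session and its interpretation might look roughly as follows (URIs, durations, and deltas are invented for illustration):

```typescript
// The example sub-session (visit A, visit B, revisit A, revisit B) and the
// "Study G" judgement it could produce.

const subsession: SIP[] = [
  { kind: "visit", lo: "<FLAMS URI of A>", durationMs: 40_000 },
  { kind: "visit", lo: "<FLAMS URI of B>", durationMs: 25_000 },  // B is an example for A
  { kind: "revisit", lo: "<FLAMS URI of A>", between: [] },
  { kind: "revisit", lo: "<FLAMS URI of B>", between: [] },
];

const judgement: SIP = {
  kind: "study",
  gaps: [
    { lo: "<FLAMS URI of A>", dimension: "understand", delta: 0.1 },
    { lo: "<FLAMS URI of B>", dimension: "remember", delta: 0.1 },
  ],
  justification: subsession,
};
```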

This gives the first set of tasks at the SIP level:

  • Catalog all SIP types and document their structure and components.
  • Build an analysis function that turns SRE streams into SIP streams (see the sketch after this list).
  • Group SIP sequences (sub-sessions) of the various levels into SIPs of the next higher level.
  • Assemble a stream of judgements and communicate it to the LMP (this will need documentation and content negotiation with Jonas).
  • Provide a SIP study API, as above.
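
Corresponding hypothetical signatures for the SIP-level tasks (the concrete LMP interface is exactly what the documentation and content negotiation mentioned above would have to fix):

```typescript
// Hypothetical signatures for the SIP-level tasks.

// Turn the SRE stream of one session into a SIP stream.
declare function sreToSIP(session: Session): SIP[];

// Group a sub-session of lower-level SIPs into a SIP of the next higher level,
// if a known pattern (like the visit/revisit example above) matches.
declare function liftSIPs(subsession: SIP[]): SIP | null;

// A judgement as communicated to the LMP: the study gap(s) G plus the
// justification J (the lower-level SIPs that led to it).
interface Judgement {
  gaps: CompetencyGap[];
  justification: SIP[];
}

interface LMPClient {
  submitJudgements(learnerId: string, judgements: Judgement[]): Promise<void>;
}
```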

FINALLY (and not necessarily directly related), here are some ideas about how ALeA content generation relates to MiKo's "four-models" conceptualization, given the above (placeholder signatures for the three functions are sketched after the list):

  1. We started specifying a function that creates personalized learner model updates from the domain model D (which needs to be consulted for the internal semantics of the domain) and the interactions I@LL of a learner L on some LOs LL: f(D, L:I@LL) --> LM(L)
  2. This personalized learner model is fed into a function f(LOc, LM(L), RD) --> LO(L) that constructs a personalized LO from the curated learning objects model LOc, the personal learner model LM(L), and the rhetoric/didactic model RD. Note that LO(L) is usually a complex LO, e.g. a whole guided tour consisting of loads of smaller LOs.
  3. And finally, the model RD can be induced from learner interactions using a function f(D, LOc + C:I@LL) --> RD that observes whole cohorts C of learners and their interactions on the curated learning objects model.
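
Written as (placeholder) signatures, the three functions could look like this; the model types are opaque stand-ins and all names are made up:

```typescript
// Placeholder types for the four models; purely illustrative.
type DomainModel = unknown;            // D
type LearnerModel = unknown;           // LM(L)
type CuratedLOModel = unknown;         // LOc
type RhetoricDidacticModel = unknown;  // RD
type PersonalizedLO = unknown;         // LO(L), usually a complex LO (e.g. a guided tour)
type Interactions = SIP[];             // I@LL: interactions of a learner on some LOs LL

// 1. f(D, L:I@LL) --> LM(L): personalized learner model update.
declare function updateLearnerModel(d: DomainModel, i: Interactions): LearnerModel;

// 2. f(LOc, LM(L), RD) --> LO(L): construct a personalized (complex) LO.
declare function generateLO(
  loc: CuratedLOModel,
  lm: LearnerModel,
  rd: RhetoricDidacticModel
): PersonalizedLO;

// 3. f(D, LOc + C:I@LL) --> RD: induce RD from the interactions of whole cohorts C.
declare function induceRD(
  d: DomainModel,
  loc: CuratedLOModel,
  cohortInteractions: Interactions[]
): RhetoricDidacticModel;
```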
