Version: 12.14.0

Content Management / Data classification

Abstract

As described in the overview, personalisation requires a set of classified data that ML models can associate with one or more users. Good classification is the primary criterion for a powerful engine. This document describes our approach and its impact on technical systems.

TODO

Components impacted in PoC

CMS
navida-pro-be-bawu-bff-service
navida-pro-be-motivation-service

Components planned for impact

CMS
All features displaying content inside the app

Philosophy

To classify content efficiently in the long term, our approach should define rules that every piece of content has to follow to comply with the personalisation logic.

The personalisation system will not adapt to each feature. Instead, each feature should make the appropriate effort to integrate with it. This applies to classification, but also to using personalisation inside the feature.

Inside our system, classification relies on "Keywords", formerly named Personalisation Hints. These hints will be assigned, manually or automatically, to a wide range of content. They therefore require some control.

1. Hints are a controlled set, not a free-form set

In other words: not everyone can define personalisation hints. Hints should be carefully defined and added for a purpose. They are not tags that can be created on the fly.

There are two reasons behind this:

We do not want to mistakenly introduce hints that would lead the engine to harmful predictions. For instance, pregnancy is a sensitive topic for personalisation, because you do not control when exactly the recommendation will be performed. In case of a baby loss, the consequences could be particularly harmful.
Hints might be duplicated, because a content writer did not notice that a synonym with a similar meaning already exists. This would weaken the predictive power of the engine.

2. All content integrates hints with the same data structure

Any feature that wants its content to be taken into account by the personalisation engine should make the following changes:

1. If the content is defined in the CMS: add a field that references the CMS content type defining hints, PersonalisationHint.
2. If the content is provided by a third party, an advanced analysis of the data should be performed to find out how to link the hints with the content. For the Proof of Concept, we will rely on a translation table (also named mapping table) that we will use to assign hints based on words found in the content.
3. Finally, make sure each item of your content sent to the frontend (in JSON) contains the attribute "personalisationHints". This is the attribute the frontend will use to track the interest of the user.

For step 3, you can rely on the following JSON Schema to adjust your content:

{
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://aok.de/schemas/personalisation-consumable.schema.json",
    "title": "Personalisation Consumable",
    "description": "Data that can be consumed by the personalisation stack",
    "type": "object",
    "properties": {
        "personalisationHints": {
            "description": "Hints describing how the current data reflects the user interest.",
            "type": "array",
            "items": {
                "type": "string"
            }
        }
    }
}
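
As an illustration of what this schema accepts, here is a minimal sketch in Python that checks an item by hand, without a schema library. The function name and the sample items are hypothetical and only serve the example.

```python
from typing import Any


def has_valid_personalisation_hints(item: Any) -> bool:
    """Check an item against the schema above, without a schema library.

    The schema does not list "personalisationHints" as required, so an
    object without the attribute is still considered valid here.
    """
    if not isinstance(item, dict):
        return False  # "type": "object"
    if "personalisationHints" not in item:
        return True
    hints = item["personalisationHints"]
    # "type": "array" with "items": {"type": "string"}
    return isinstance(hints, list) and all(isinstance(h, str) for h in hints)


# Hypothetical content items, for illustration only.
motivation_item = {"title": "Morning stretch", "personalisationHints": ["yoga", "relaxing"]}
broken_item = {"title": "Morning stretch", "personalisationHints": "yoga"}

print(has_valid_personalisation_hints(motivation_item))  # True
print(has_valid_personalisation_hints(broken_item))      # False
```

Since the schema itself does not mark the attribute as required, a feature that wants to enforce its presence would need an additional check on top of this one.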

Everything below this line is scoped to our Proof of Concept.

CMS

As defined above for any feature, we adapt the content of Motivation and Gesundheitkurse to include new fields for personalisation. We also create a new content type for personalisation hints, which will be used by all features.

An overview diagram is available at the end of this section.

Feature flag

Personalisation is a cross-functional feature that may or may not be used, depending on each AOK. For instance, during the PoC, only BaWü will enable the feature.

We should add a new feature flag inside the Features and Mapping content type, specific to Personalisation. It should be disabled on PLUS configuration and enabled on BaWü.
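
A minimal sketch of how a consumer could read such a flag, assuming a per-AOK configuration object. The flag name "personalisation" and the structure of FEATURE_FLAGS are assumptions made for this example, not the actual Features and Mapping content type.

```python
# Hypothetical per-AOK feature configuration, for illustration only.
FEATURE_FLAGS: dict[str, dict[str, bool]] = {
    "PLUS": {"personalisation": False},
    "BaWü": {"personalisation": True},
}


def is_personalisation_enabled(aok: str) -> bool:
    """Return the flag for an AOK, defaulting to disabled when unknown."""
    return FEATURE_FLAGS.get(aok, {}).get("personalisation", False)


print(is_personalisation_enabled("BaWü"))  # True
print(is_personalisation_enabled("PLUS"))  # False
```

Defaulting to disabled keeps any AOK that has not been configured out of the PoC.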

New content type : Personalisation Hint

Personalisation hints should start simple. This content type has only two fields: the hint, and a short description to help content writers grasp the exact meaning of the term.

In a way, personalisation hints are close to taxonomies. However, we should keep in mind a few constraints:

A hint cannot be deleted if it is in use by any content.
A hint cannot be modified once created. The only possible action is to delete it.
Hints are not free-form: creating a new hint on the fly while editing content should be strictly forbidden.

Caution: Not all content writers are supposed to create new personalisation hints. This should be a specific permission granted to only a few users. The exact person responsible for these personalisation hints is yet to be determined.
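
The constraints above can be sketched as a small in-memory registry. This is an illustration only: the class and method names and the can_manage_hints parameter are hypothetical, and in practice the rules would be enforced by the CMS itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a hint cannot be modified once created
class PersonalisationHint:
    hint: str
    description: str


class HintRegistry:
    def __init__(self) -> None:
        self._hints: dict[str, PersonalisationHint] = {}
        self._usage: dict[str, set[str]] = {}  # hint -> ids of content using it

    def create(self, hint: str, description: str, can_manage_hints: bool) -> PersonalisationHint:
        # Only users holding the dedicated permission may create hints.
        if not can_manage_hints:
            raise PermissionError("creating hints requires a dedicated permission")
        if hint in self._hints:
            raise ValueError(f"hint {hint!r} already exists")
        created = PersonalisationHint(hint, description)
        self._hints[hint] = created
        self._usage[hint] = set()
        return created

    def assign(self, hint: str, content_id: str) -> None:
        # No on-the-fly creation: the hint must already exist.
        if hint not in self._hints:
            raise KeyError(f"unknown hint {hint!r}")
        self._usage[hint].add(content_id)

    def unassign(self, hint: str, content_id: str) -> None:
        self._usage.get(hint, set()).discard(content_id)

    def delete(self, hint: str) -> None:
        # A hint still referenced by content cannot be deleted.
        if self._usage.get(hint):
            raise ValueError(f"hint {hint!r} is still in use")
        self._hints.pop(hint, None)
        self._usage.pop(hint, None)

    def exists(self, hint: str) -> bool:
        return hint in self._hints
```

Renaming a hint is deliberately not offered: the only way to change one is to delete it once it is unused and create a new one.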

Changes on Motivation

Motivation content should now carry personalisation hints. To do so, a new field should be added that references terms coming from our new content type "Personalisation Hint".

Hints can be assigned to or removed from a motivation with no specific constraints, like any other field.

New content type : Terms mapping for Gesundheitkurse

Gesundheitkurse relies on third-party content from Queo, which we reach through the AOK APIProxy. Since we cannot change the data at the source, we need a way to assign personalisation hints on the fly.

To do so, for our Proof of Concept we will introduce a mapping table between terms found inside the content and our hints. One term found inside the content can lead to multiple personalisation hints.

A sample of events can be found in the appendix. The exact fields still need to be defined precisely, but some relevant ones to look for might be:

training_title
topics[].name
descriptions[].title
descriptions[].text
booking_data.training_title
booking_data.event_topic

For instance, suppose we define the mapping yoga -> [yoga, sports, relaxing]. If the term "yoga" is found in either training_title or topics[].name, the three hints yoga, sports and relaxing should be associated with this content by navida-pro-be-bawu-bff-service.
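
The mapping step can be sketched as follows. This is a minimal illustration, assuming a case-insensitive substring match and an event shape reduced to the two fields of the example. TERMS_MAPPING, searchable_text and assign_hints are hypothetical names, not the actual service code.

```python
from typing import Any

# Hypothetical mapping table: term found in the content -> hints to assign.
TERMS_MAPPING: dict[str, list[str]] = {
    "yoga": ["yoga", "sports", "relaxing"],
}


def searchable_text(event: dict[str, Any]) -> str:
    """Concatenate the fields we look at (assumed here), lowercased."""
    parts = [event.get("training_title", "")]
    parts += [topic.get("name", "") for topic in event.get("topics", [])]
    return " ".join(parts).lower()


def assign_hints(event: dict[str, Any], mapping: dict[str, list[str]]) -> list[str]:
    """Collect the hints of every mapped term found in the event, without duplicates."""
    text = searchable_text(event)
    hints: list[str] = []
    for term, term_hints in mapping.items():
        if term.lower() in text:
            for hint in term_hints:
                if hint not in hints:
                    hints.append(hint)
    return hints


# Hypothetical event, reduced to the fields used above.
event = {"training_title": "Yoga für Einsteiger", "topics": [{"name": "Bewegung"}]}
enriched = {**event, "personalisationHints": assign_hints(event, TERMS_MAPPING)}
print(enriched["personalisationHints"])  # ['yoga', 'sports', 'relaxing']
```

A plain substring match is the simplest option but can over-match (a term contained in a longer, unrelated word). Word-boundary matching may be preferable once the mapping table grows.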

New content type : Onboarding questionnaire

Onboarding is a new feature that will provide a first set of hints to fuel the recommendation engine. It is a simple multiple-choice questionnaire. Each choice provides a set of hints for personalisation.

Each question has a title, an optional description, a set of choices and a flag to accept one or multiple choices. More details can be found in the content structure overview.
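
A minimal sketch of this structure and of how answers could turn into hints. The field names (label, hints, multiple) and the function hints_from_answer are assumptions for illustration. The actual content model is the one described in the content structure overview.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Choice:
    label: str
    hints: list[str]  # hints provided when this choice is selected


@dataclass
class Question:
    title: str
    choices: list[Choice]
    multiple: bool = False             # accept one or multiple choices
    description: Optional[str] = None  # optional description


def hints_from_answer(question: Question, selected: list[int]) -> list[str]:
    """Return the de-duplicated hints of the selected choices (by index)."""
    if not question.multiple and len(selected) > 1:
        raise ValueError("this question accepts a single choice")
    hints: list[str] = []
    for index in selected:
        for hint in question.choices[index].hints:
            if hint not in hints:
                hints.append(hint)
    return hints


# Hypothetical question, for illustration only.
question = Question(
    title="What would you like to focus on?",
    multiple=True,
    choices=[
        Choice("Yoga", ["yoga", "relaxing"]),
        Choice("Running", ["sports"]),
        Choice("Meditation", ["relaxing"]),
    ],
)
print(hints_from_answer(question, [0, 2]))  # ['yoga', 'relaxing']
```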

Update on CMS APIs

Change type	Route	Request type	Response / Description of change
New route	/v1/getEventPersonalisationMapping	GET	List of terms mapping with personalisation hints
New route	/v1/getOnboardingQuestionnaire	GET	List of questionnaire items, sorted by order
API Change	/v1/getMotivations	GET	Add list of personalisation hints for each motivation item
API Change	/v1/getMotivationDetail	GET	Add list of personalisation hints
