Content Management / Data classification
Abstract
As described in the overview, personalisation requires a set of classified data that ML models can associate with one or more users. Good classification is the primary criterion for a powerful engine. This document describes our approach and its impact on technical systems.
TODO
- Personalisation hints (Keywords)
- Change on motivation content
- Feature flag
- Translation table for Queo
- Philosophy
- JSON Schema to apply on all content
- Content mapping on Queo data
- Hint naming strategy
Components impacted in PoC
- CMS
- navida-pro-be-bawu-bff-service
- navida-pro-be-motivation-service
Components planned for impact
- CMS
- All features displaying content inside the app
Philosophy
To classify content efficiently over the long term, our approach defines rules that every piece of content has to follow to comply with the personalisation logic.
The personalisation system will not adapt to each feature. Instead, each feature should make the necessary effort to integrate with it. This applies to classification, but also to using personalisation inside the feature.
Inside our system, classification relies on "Keywords", formerly named Personalisation Hints. These hints will be assigned, manually or automatically, to a wide range of content. They therefore require some control.
1. Hints are a controlled set, not a free-form set
In other words, not everyone can define personalisation hints. Hints should be carefully defined and added for a purpose. They are not tags that can be defined on the fly.
There are two reasons behind this:
- We do not want to mistakenly introduce hints that would lead the engine to harmful predictions. For instance, pregnancy is a sensitive topic for personalisation, because we do not control exactly when the recommendation will be made. In the case of a baby loss, the consequences could be particularly harmful.
- Hints might be duplicated because a content writer did not notice that a synonym with a similar meaning already exists. This would weaken the predictive power of the engine.
2. All content integrates hints with the same data structure
Any feature that wants its content to be taken into account by the personalisation engine should make the following changes:
- If content is defined in the CMS: add a field to the content type that links to the hints defined in the CMS (PersonalisationHint).
- If content is provided by a third party: an in-depth analysis of the data should be performed to find out how to link the hints with the content. For the Proof of Concept, we will rely on a translation table (also named mapping table) that we will use to assign hints based on words found in the content.
- Finally, make sure each item of your content sent to the frontend (in JSON) contains the attribute "personalisationHints". This is the attribute the frontend will use to track the interest of the user.
Regarding the last point, you can rely on the following JSON Schema to adjust your content:
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://aok.de/schemas/personalisation-consumable.schema.json",
  "title": "Personalisation Consumable",
  "description": "Data that can be consumed by the personalisation stack",
  "type": "object",
  "properties": {
    "personalisationHints": {
      "description": "Hints describing how the current data reflects the user interest.",
      "type": "array",
      "items": {
        "type": "string"
      }
    }
  }
}
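As an illustration, a backend could check outgoing items against this schema before sending them to the frontend. The following is a minimal hand-written sketch, not a full JSON Schema validator; the function name `hint_errors` is hypothetical. Note that the schema as written does not list "personalisationHints" under "required", so the sketch treats a missing attribute as valid.

```python
from typing import Any


def hint_errors(item: Any) -> list[str]:
    """Return a list of problems; an empty list means the item conforms.

    Hypothetical helper mirroring the schema above: the item must be an
    object and, when present, personalisationHints must be an array of strings.
    """
    if not isinstance(item, dict):
        return ["item must be a JSON object"]
    if "personalisationHints" not in item:
        # The schema does not mark the attribute as required.
        return []
    hints = item["personalisationHints"]
    if not isinstance(hints, list):
        return ["personalisationHints must be an array"]
    return [
        f"personalisationHints[{i}] must be a string"
        for i, hint in enumerate(hints)
        if not isinstance(hint, str)
    ]
```

If the attribute should be mandatory for every item, the check for a missing key would need to return an error instead.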
Everything below this line is scoped to our Proof of Concept.
CMS
As defined above for any feature, we adapt the Motivation and Gesundheitkurse content to include new fields for personalisation. We also create a new content type for personalisation hints, which will be used by all features.
An overview diagram is available at the end of this section.
Feature flag
Personalisation is a cross-functional feature that may or may not be used, depending on each AOK. For instance, during the PoC, only BaWü will enable the feature.
We should add a new feature flag, specific to Personalisation, inside the Features and Mapping content type. It should be disabled on the PLUS configuration and enabled on BaWü.
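A minimal sketch of how a service might read such a flag. The configuration keys ("plus", "bawu"), the flag name and the function name are illustrative assumptions; the real values would come from the Features and Mapping content type.

```python
# Hypothetical per-AOK configuration; keys and flag name are illustrative only.
FEATURE_FLAGS = {
    "plus": {"personalisation": False},
    "bawu": {"personalisation": True},
}


def is_personalisation_enabled(aok: str) -> bool:
    """Unknown AOKs and missing flags default to disabled."""
    return FEATURE_FLAGS.get(aok, {}).get("personalisation", False)
```

Defaulting to disabled keeps personalisation switched off for any AOK that has not explicitly opted in.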
New content type : Personalisation Hint
Personalisation hints should start simple. This content type only has two fields: the hint, and a short description to help content writers understand the exact meaning behind the term.
In a way, personalisation hints are close to taxonomies. However, we should keep a few constraints in mind:
- A hint cannot be deleted if it is in use in any content.
- A hint cannot be modified once created. The only possible action is to delete it.
- Hints are not free-form: creating a new hint on the fly while editing content should be strictly forbidden.
Caution: Not all content writers are supposed to create new personalisation hints. This should be a specific permission granted to only a few users. The exact person responsible for these personalisation hints is yet to be determined.
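The constraints above can be sketched as a small in-memory registry. This only illustrates the rules and is not the CMS implementation; the class names, method names and the `can_manage_hints` parameter are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PersonalisationHint:
    # frozen=True: a hint cannot be modified once created.
    hint: str
    description: str


@dataclass
class HintRegistry:
    hints: dict[str, PersonalisationHint] = field(default_factory=dict)
    # Maps each hint to the ids of the content items that use it.
    usage: dict[str, set[str]] = field(default_factory=dict)

    def create(self, hint: str, description: str, can_manage_hints: bool) -> None:
        # Creating hints is a specific permission, not open to every writer.
        if not can_manage_hints:
            raise PermissionError("creating hints requires a specific permission")
        if hint in self.hints:
            raise ValueError(f"hint {hint!r} already exists")
        self.hints[hint] = PersonalisationHint(hint, description)

    def assign(self, content_id: str, hint: str) -> None:
        # Hints are a controlled set: unknown hints are never created on the fly.
        if hint not in self.hints:
            raise KeyError(f"unknown hint {hint!r}")
        self.usage.setdefault(hint, set()).add(content_id)

    def unassign(self, content_id: str, hint: str) -> None:
        self.usage.get(hint, set()).discard(content_id)

    def delete(self, hint: str) -> None:
        # A hint that is still in use cannot be deleted.
        if self.usage.get(hint):
            raise ValueError(f"hint {hint!r} is still in use")
        self.hints.pop(hint, None)
        self.usage.pop(hint, None)
```

There is deliberately no update method: changing a hint means deleting it, once it is no longer in use, and creating a new one.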
Changes on Motivation
Motivation content should now carry personalisation hints. To do so, a new field should be added that references terms from our new content type "Personalisation Hint".
Hints can be assigned to or removed from a motivation with no specific constraints, like any other field.
New content type : Terms mapping for Gesundheitkurse
Gesundheitkurse relies on third-party content from Queo, which we can reach through the AOK APIProxy. Since we cannot change the data at the source, we should find a way to assign personalisation hints on the fly.
To do so, for our Proof of Concept we will introduce a mapping table between terms found inside the content and our hints. One term found inside the content can lead to multiple personalisation hints.
A sample of events can be found in the appendix. This still needs to be defined precisely, but some relevant fields to look for might be:
- training_title
- topics[].name
- descriptions[].title
- descriptions[].text
- booking_data.training_title
- booking_data.event_topic
For instance, we define the mapping yoga -> [yoga, sports, relaxing]: if the term "yoga" is found in either training_title or topics[1].name, the three hints yoga, sports, relaxing should be associated with this content by navida-pro-be-bawu-bff-service.
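The lookup described above could be sketched as follows. This is an illustration under several assumptions: the event shape is a guess based on the field names listed earlier, matching is a case-insensitive substring search, all topic names are scanned, and only training_title and topics[].name are inspected. The names `TERM_MAPPING`, `searchable_texts` and `assign_hints` are hypothetical.

```python
from typing import Any

# Hypothetical mapping table: term found in the content -> personalisation hints.
TERM_MAPPING = {"yoga": ["yoga", "sports", "relaxing"]}


def searchable_texts(event: dict[str, Any]) -> list[str]:
    """Collect the text of training_title and every topics[].name."""
    texts = []
    title = event.get("training_title")
    if isinstance(title, str):
        texts.append(title)
    for topic in event.get("topics") or []:
        name = topic.get("name") if isinstance(topic, dict) else None
        if isinstance(name, str):
            texts.append(name)
    return texts


def assign_hints(event: dict[str, Any], mapping: dict[str, list[str]] = TERM_MAPPING) -> list[str]:
    """Return the de-duplicated hints for every mapped term found in the event."""
    haystack = " ".join(searchable_texts(event)).lower()
    hints: list[str] = []
    for term, term_hints in mapping.items():
        if term.lower() in haystack:
            for hint in term_hints:
                if hint not in hints:
                    hints.append(hint)
    return hints
```

Substring matching is the simplest option but can over-match (a short term may appear inside an unrelated word); word-boundary matching would be a stricter alternative.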
New content type : Onboarding questionnaire
Onboarding is a new feature that will provide a first set of hints to fuel the recommendation engine. It is a simple multiple-choice questionnaire. Each choice provides a set of hints for personalisation.
Each question has a title, an optional description, a set of choices and a flag indicating whether one or multiple choices are accepted. More details can be found in the content structure overview.
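A possible shape for this content, sketched as Python dataclasses. The class, field and function names are assumptions for illustration; the actual structure is defined in the content structure overview.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Choice:
    label: str
    hints: list[str]  # hints contributed when this choice is selected


@dataclass
class Question:
    title: str
    choices: list[Choice]
    multiple: bool = False  # whether more than one choice may be selected
    description: Optional[str] = None


def hints_from_answer(question: Question, selected: list[int]) -> list[str]:
    """Collect the de-duplicated hints of the selected choices, by index."""
    if not question.multiple and len(selected) > 1:
        raise ValueError("this question accepts a single choice")
    hints: list[str] = []
    for index in selected:
        for hint in question.choices[index].hints:
            if hint not in hints:
                hints.append(hint)
    return hints
```

The hints collected across all answered questions would form the first set of hints for the user.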
Update on CMS APIs
Change type | Route | Request type | Response / Description of change |
---|---|---|---|
New route | /v1/getEventPersonalisationMapping | GET | List of term mappings to personalisation hints |
New route | /v1/getOnboardingQuestionnaire | GET | List of questionnaire items, sorted by order |
API Change | /v1/getMotivations | GET | Add list of personalisation hints for each motivation item |
API Change | /v1/getMotivationDetail | GET | Add list of personalisation hints |
Overview - Content structure
Change on logic
Such changes in data