Below is a description of modelling alternatives and some questions to discuss, originally posted by Noah Hellman as a GitHub issue.
Overview of the issue
The primary issue with modelling physical activity is that there are many
activities. Some of these activities are similar to each other while some are
not. There will therefore unavoidably be overlapping elements between different
activities.
For example, cycling and running can share a lot of elements. When recording a
cycling session, relevant data include distance, duration, speed, heart rate,
incline/elevation change and energy expenditure. All of these metrics may also
be relevant for a running session.
There are, however, also elements that may only be relevant for either cycling
or running. Things like power output, pedaling rate and bicycle type are only
applicable to cycling, while things like stride length and footwear are only
applicable to running.
The problem is therefore: how do we model all possible activities without
having to model elements that are shared between different activities multiple
times?
Types of sharing
Before we jump into solutions, we can group shared elements into the types
below.
Common for all
Common elements are those that can be relevant for any type of physical
activity. Examples may be heart rate, energy expenditure, PAL and MET.
Vertically shared / inherited
Vertically shared elements are elements that are inherited by all activities
“below” a given activity when activities are arranged in a hierarchy. Let’s say
we structure activities like this:
├── Distance activity
│   ├── Cycling
│   ├── Running
│   │   ├── Long distance running
│   │   ├── Middle distance running
│   │   └── Sprinting
│   └── Swimming
└── Stationary activity
    └── Strength training
        ├── Bodyweight training
        └── Free weight training
All activities under the distance branch will inherit elements like distance
and speed. Further, all activities under the running branch will inherit
elements like stride length and other running dynamics.
Elements that are common for all activities can also be considered inherited
elements as all activities inherit them from the root activity.
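
As a rough illustration of vertical sharing, the hierarchy above can be sketched as a chain of Python dataclasses in which each level only declares the elements it adds. The class and field names are hypothetical and chosen for this example only; they are not taken from any existing archetype.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class PhysicalActivity:
    # Elements common to all activities, inherited from the root.
    heart_rate_bpm: Optional[float] = None
    energy_expenditure_kcal: Optional[float] = None


@dataclass
class DistanceActivity(PhysicalActivity):
    # Elements inherited by everything under the distance branch.
    distance_m: Optional[float] = None
    speed_m_per_s: Optional[float] = None


@dataclass
class Running(DistanceActivity):
    # Elements inherited by everything under the running branch.
    stride_length_m: Optional[float] = None


@dataclass
class Swimming(DistanceActivity):
    # Adds nothing in this sketch; it only inherits.
    pass


def element_names(activity_type) -> List[str]:
    """List every element of an activity type, inherited ones first."""
    return [f.name for f in fields(activity_type)]
```

Here `element_names(Running)` lists the root elements, then the distance elements, then `stride_length_m`, while `element_names(Swimming)` stops at the distance elements.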
Horizontally shared
It may, however, be difficult, if not impossible, to arrange activities in a
hierarchy where all shared elements are inherited vertically. There may
therefore be elements that are instead shared horizontally. In the above
example, data about elevation gain may be of interest to both walking/running
and cycling activities but not to swimming. These elements cannot be inherited
from “Distance activity” because they are not relevant for all distance
activities. Instead, walking/running and cycling share these elements
horizontally.
In this example it is possible to make the elements for elevation change
inherited by adding another layer to the hierarchy, like this:
├── Varying elevation activity
│   ├── Cycling
│   └── Running
└── Static elevation activity
But adjusting the hierarchy so that all elements are inherited may result in a
very messy hierarchy, and may even be impossible if two distant activities
happen to share elements.
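
The distinction between inherited and horizontally shared elements can be made concrete with a small sketch. It assumes a hypothetical mapping from the sibling activities under one parent to their element sets, and splits the elements into those shared by all siblings (which the parent could hold), those shared by some but not all (horizontally shared), and those specific to one activity. The element names are illustrative only.

```python
from typing import Dict, Set, Tuple

# Hypothetical element sets for the siblings under "Distance activity".
ELEMENTS: Dict[str, Set[str]] = {
    "Cycling": {"distance", "speed", "elevation_gain", "power_output"},
    "Running": {"distance", "speed", "elevation_gain", "stride_length"},
    "Swimming": {"distance", "speed"},
}


def classify_shared(
    elements: Dict[str, Set[str]],
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split elements into (inheritable, horizontally shared, specific)."""
    all_sets = list(elements.values())
    # Elements every sibling has can be moved up to the common parent.
    inheritable = set.intersection(*all_sets)
    # Count how many siblings use each element.
    counts: Dict[str, int] = {}
    for element_set in all_sets:
        for element in element_set:
            counts[element] = counts.get(element, 0) + 1
    # Shared by more than one sibling, but not by all of them.
    horizontal = {e for e, n in counts.items() if n > 1} - inheritable
    # Used by exactly one sibling.
    specific = {e for e, n in counts.items() if n == 1}
    return inheritable, horizontal, specific
```

With the sets above, `elevation_gain` ends up in the horizontally shared group: cycling and running have it, swimming does not, so it cannot live on the common parent.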
Solutions
Below are some possible solutions that have been discussed so far. Each
solution has a list of pros and cons that are labeled either “usage” or
“modelling”. “Usage” means that the pro or con applies to the usage of the
archetypes after they have been reviewed and published. “Modelling” means that
it applies to the modelling and review process before publishing, including
Single large archetype
One option is to use a single large archetype for all types of physical
activity. In this case, all elements for all possible activities will be
included in this archetype.
Pros:
- (modelling) Simplest approach to modelling.
- (usage) Easy to find which archetype to use.
Cons:
- (modelling) It’s hard to account for all possible activities at first and it
  may require frequent updating.
- (modelling) May become very large and cumbersome to maintain.
- (usage) Cumbersome to use if it is large and includes many elements that are
  not relevant for the activity you wish to record.
- (usage) Sessions with multiple activities (e.g. a triathlon) must be divided
  into separate observations.
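
To make the usage drawback concrete, here is a minimal sketch of the single-archetype approach as one Python record type that holds elements for every activity. All names and values are hypothetical and only meant to illustrate the idea.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class PhysicalActivityRecord:
    """One record type with elements for every activity (names hypothetical)."""
    activity: str
    duration_s: Optional[float] = None
    distance_m: Optional[float] = None
    heart_rate_bpm: Optional[float] = None
    # Cycling-only elements.
    power_output_w: Optional[float] = None
    pedaling_rate_rpm: Optional[float] = None
    bicycle_type: Optional[str] = None
    # Running-only elements.
    stride_length_m: Optional[float] = None
    footwear: Optional[str] = None


def unused_elements(record: PhysicalActivityRecord) -> List[str]:
    """Names of the elements that were left empty in this record."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


# Illustrative running session; the values are made up.
run = PhysicalActivityRecord(
    activity="Running", duration_s=1800.0, distance_m=5000.0, stride_length_m=1.1
)
```

For the running record above, `unused_elements(run)` includes all the cycling-only elements, and that list grows with every activity added to the record type.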
Specialization
Specializations of a main “Physical activity” observation can be used to handle
vertically shared elements. By arranging activities in a hierarchy and creating,
for each node, an observation that specializes its parent, all elements can be
accounted for, provided that all elements are vertically shared. When creating
a new activity, only new elements will have to be modelled.
Pros:
- (usage) Simple to use: find the archetype whose name most closely matches
  the activity that you wish to record. It will only include elements that
  are related to that activity.
Cons:
- (modelling) Requires careful planning when arranging the activities, to
  avoid major rearrangement later if more activities appear and new patterns
  emerge.
- (modelling) Many archetypes to review if multi-level specialization is used.
- (modelling) Cannot be reused for non-observation archetypes. May require
  similar elements to be modelled multiple times.
- (usage) Sessions with multiple activities must be separated into multiple
  observations (one for each activity).
Single-level specialization has been experimented with in the sbrs archetypes.
Clusters and cluster slots
Clusters along with cluster slots can be used to account for horizontally
shared elements. One can use a “Physical activity” observation with general
elements and cluster slots for specific elements. It’s possible to use clusters
for a whole activity or clusters for a specific set of elements that are shared
between multiple activities.
Pros:
- (usage) Sessions with multiple activities can be placed in a single
  observation.
- (modelling) Can be reused for non-observation archetypes, e.g.
Cons:
- (modelling) May be a lot of cluster archetypes to review.
- (usage) May be cumbersome and/or confusing to set up all clusters when
Clusters have been experimented with in the clst archetypes.
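
The cluster approach is composition rather than inheritance. The sketch below is not a rendering of the clst archetypes; it is a hypothetical Python model in which a list attribute stands in for a cluster slot, clusters can be nested, and one small cluster for elevation change is reused by two activity clusters. All names and values are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Cluster:
    name: str
    elements: Dict[str, float] = field(default_factory=dict)
    # Nested clusters, standing in for a cluster slot inside a cluster.
    clusters: List["Cluster"] = field(default_factory=list)


@dataclass
class PhysicalActivityObservation:
    # General elements that apply to the whole session.
    general: Dict[str, float] = field(default_factory=dict)
    # Stands in for the cluster slot for activity-specific elements.
    clusters: List[Cluster] = field(default_factory=list)


def elevation_cluster(gain_m: float) -> Cluster:
    """A small cluster shared horizontally by cycling and running."""
    return Cluster("Elevation change", {"elevation_gain_m": gain_m})


# A triathlon recorded as one observation; the values are made up.
triathlon = PhysicalActivityObservation(
    general={"duration_s": 9000.0},
    clusters=[
        Cluster("Swimming", {"distance_m": 1500.0}),
        Cluster("Cycling", {"distance_m": 40000.0}, [elevation_cluster(320.0)]),
        Cluster("Running", {"distance_m": 10000.0}, [elevation_cluster(85.0)]),
    ],
)
```

The same "Elevation change" cluster appears under cycling and running but not under swimming, and all three activities sit in a single observation.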
Separate observations
Separate observations can be used for all types of shared elements. By separate
observation we mean creating an observation that is not a specialization of
“Physical activity” and using the new observation next to it. For example, a
“Pulse/Heart beat” observation already exists; to record heart rate with it, it
must be placed next to the “Physical activity” observation.
Pros:
- (usage) Can be reused on their own for other observation purposes.
Cons:
- (usage) Splits a single session into multiple observations. Observations
  for the same session are linked with each other only by time. They are also
  linked by their common composition, but there may be multiple sessions
  in a single composition.
- (modelling) May be many archetypes to review if used extensively.
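
The linking-by-time drawback can be sketched as follows. The example assumes a hypothetical `Observation` type with a start and end time, and treats a composition as a plain list. With two sessions in the same composition, the only way to pair a heart rate observation with its activity is to compare time intervals. The names, dates and times are illustrative.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Observation:
    archetype: str
    start: datetime
    end: datetime


def overlaps(a: Observation, b: Observation) -> bool:
    """True if the two observations share any part of their time intervals."""
    return a.start < b.end and b.start < a.end


def linked_to(session: Observation, composition: List[Observation]) -> List[Observation]:
    """Other observations in the composition that overlap the session in time."""
    return [o for o in composition if o is not session and overlaps(o, session)]


# Two sessions on the same day in one composition; the times are made up.
morning_run = Observation(
    "Physical activity", datetime(2024, 1, 1, 7, 0), datetime(2024, 1, 1, 7, 45)
)
morning_pulse = Observation(
    "Pulse/Heart beat", datetime(2024, 1, 1, 7, 0), datetime(2024, 1, 1, 7, 45)
)
evening_ride = Observation(
    "Physical activity", datetime(2024, 1, 1, 18, 0), datetime(2024, 1, 1, 19, 0)
)
evening_pulse = Observation(
    "Pulse/Heart beat", datetime(2024, 1, 1, 18, 5), datetime(2024, 1, 1, 18, 55)
)
composition = [morning_run, morning_pulse, evening_ride, evening_pulse]
```

Here `linked_to(morning_run, composition)` finds only the morning pulse observation. The pairing depends entirely on the recorded times, since the composition alone does not say which pulse belongs to which session.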
Combinations of the above
As all of the above approaches have both pros and cons, it might be possible to
find a sweet spot by combining them and creating a decent balance. Both the
sbrs and clst archetypes have attempted to combine them in different ways. The
sbrs archetypes primarily use specialization but also have clusters to account
for horizontal sharing. The clst archetypes have a single main observation with
cluster slots, but the clusters for these slots are themselves specialized.
Further discussion may be required to find a good solution. Some questions that
I can think of are:
- Do you agree with the pros and cons listed above?
- Are there other pros and cons to the solutions?
- Are there other types of solutions that could be used?
- Can you think of more ways to combine the different solutions?
- Are there small or big alterations that can be made to the existing