Attendees: Johan L, Shayna, Ivan, Mikael, Matthew, Enrique, Hannah, Alex
0 - Agenda bash
1 - Project review
a. General
Discuss whether we should move some of Roland's project repos (which have been moved under Sunet) to be under IdPy:
SUNET/openid4v - ??
SUNET/satosa-idpy - frontend - becomes part of SATOSA
SUNET/satosa-openid4vci - extension to proxy - becomes MR under SATOSA
SUNET/fedservice - under idpy
SUNET/idpy-sdjwt - under idpy
Need time to understand - should probably slack Roland about what changes they are introducing
Also there is a document that explains the process for adopting the code under IdPy which Ivan was going to share.
e. Any other project (pyFF, djangosaml2, pyMDOC-CBOR, etc)
Enrique mentioned an issue in pyFF where entities in different federations have the same entity id. This may be the same entity registered in different federations, or it could be two different entities in different federations that happen to share an entity id. Enrique has this use case in Seamless Access, where entities are registered in two different federations with different entity categories. He needs to be able to merge certain entity attributes in this case.
The way things are now, the entity in the first federation would be overwritten by the one in the second federation.
We may want to merge different entity categories, which may differ from one federation to another.
For filtering in mdq, we would need the attributes, which could have different values in different federations, to be merged.
Enrique's PR was merged to allow the select pipe to produce a list of entities with duplicates. But the overwriting happens before the select pipe, in the load pipe: all the entities from all the sources are squashed into one dictionary, keyed by entity id.
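To illustrate the overwriting described above, here is a minimal sketch. It is a simplified, hypothetical model of a load step and not pyFF's actual code; the function name `load_all`, the source names, and the entity dicts are all made up for illustration.

```python
# Hypothetical, simplified model of a load step; NOT pyFF's real implementation.
def load_all(sources):
    """Squash entities from every source into one dict keyed by entity id."""
    store = {}
    for source_name, entities in sources:
        for entity in entities:
            # A later source silently replaces an earlier entry with the same id.
            store[entity["entityID"]] = dict(entity, source=source_name)
    return store


# Made-up sample data: the same entity id appears in two federations.
sources = [
    ("federation-a", [{"entityID": "https://sp.example.org", "categories": ["cat-a"]}]),
    ("federation-b", [{"entityID": "https://sp.example.org", "categories": ["cat-b"]}]),
]
store = load_all(sources)
```

In this sketch only the `federation-b` copy survives, so anything running after the load step never sees the `federation-a` registration.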
One idea would be to key the entities in the load pipe not only by entity id but also by metadata service, but there is some concern that this could introduce problems. When the user requests an entity id, how would we know which one to choose? Could this be handled in the MDQ service? Would the rest of the attributes give a fairly good heuristic to distinguish between those cases?
The select pipe can default to the current behavior (overwriting the first with the second) or use a heuristic downstream to either merge the duplicated entities or keep them separate because they are different.
Mikael warns that we need to be careful not to change the default heuristic, otherwise we might break all federations. A new heuristic would need to be approved for each aggregator.
We need to better define what happens when we get conflicts so that someone is notified, instead of just silently getting the last loaded entity. We will need a feature flag so we don't break deployments.
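A rough sketch of the idea discussed above: key by (entity id, source), warn on conflicts, and keep the current "last loaded wins" result unless a flag is set. Everything here is hypothetical (the `keep_duplicates` flag, the function name, the data shapes); it is only meant to make the discussion concrete, not to propose an implementation for pyFF.

```python
import logging
from collections import defaultdict

log = logging.getLogger("duplicate-entities-sketch")


def load_entities(sources, keep_duplicates=False):
    """Return {entity_id: [entity, ...]}.

    keep_duplicates is a hypothetical feature flag. When False (the default),
    only the last loaded entity is returned, mirroring the current behavior.
    """
    by_key = {}  # (entity_id, source_name) -> entity
    for source_name, entities in sources:
        for entity in entities:
            by_key[(entity["entityID"], source_name)] = entity

    # Group the per-source copies under their shared entity id, in load order.
    grouped = defaultdict(list)
    for (entity_id, _source), entity in by_key.items():
        grouped[entity_id].append(entity)

    result = {}
    for entity_id, candidates in grouped.items():
        if len(candidates) > 1:
            # Notify someone instead of silently keeping the last one.
            log.warning("entity id %s found in %d sources", entity_id, len(candidates))
        result[entity_id] = candidates if keep_duplicates else candidates[-1:]
    return result


# Made-up sample data: one entity id registered in two sources.
sample_sources = [
    ("federation-a", [{"entityID": "https://sp.example.org", "registrar": "a"}]),
    ("federation-b", [{"entityID": "https://sp.example.org", "registrar": "b"}]),
]
default_view = load_entities(sample_sources)
flagged_view = load_entities(sample_sources, keep_duplicates=True)
```

With the flag off the result matches today's behavior, so existing deployments would not change; with it on, both copies stay available for a downstream heuristic to merge or keep apart.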
Ivan wants to establish whether this is a current problem, or whether we are trying to think ahead to a potential problem, especially in the case where the entity ids are the same in two different federations but the actual entities are different. We should document that we make the assumption that there will not be different entities in different federations that have the same entity id. However, if they are the same entity but in different federations, there is an issue with the entity exposing different things in each federation.
Merging multi-value attributes: What if there is an entity with an entity category in InCommon (say, "hide from discovery") that is not present when that same entity is in eduGAIN? If we merge them, something unintended would happen. We need to examine these cases more thoughtfully before doing any merging, since some of them are based on policy.
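A small sketch of why a naive merge is risky. It assumes, purely for illustration, that merging means taking the union of entity categories across federations; the category string and the data layout are placeholders, not real metadata.

```python
def merge_categories(registrations):
    """Union the entity categories from every federation's registration."""
    merged = set()
    for categories in registrations.values():
        merged.update(categories)
    return merged


# Placeholder data: the category is set in one federation only.
registrations = {
    "InCommon": {"hide-from-discovery"},
    "eduGAIN": set(),
}
merged = merge_categories(registrations)
```

After the union, the merged entity carries the category everywhere, even though the eduGAIN registration never asserted it. An intersection would have the opposite problem and drop it for InCommon, which is why the choice is a policy question rather than a purely technical one.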
pyFF - Microsoft Entra ID causes a problem with hash marks in entity ids. pyFF doesn't support fetching such an entity by URL, only by sha1sum. pyFF refuses to load a file when hash marks appear in it.
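For reference, a sketch of the two common ways to turn an entity id into an MDQ request identifier: the `{sha1}` form and the percent-encoded form. The entity id below is made up, and this says nothing about which forms pyFF accepts beyond what is noted above. A raw `#` in a URL starts the fragment, which clients do not send to the server, so it would have to be encoded as `%23` to be transmitted at all.

```python
import hashlib
from urllib.parse import quote


def mdq_sha1_identifier(entity_id):
    """Build the "{sha1}<hex digest>" identifier form for an entity id."""
    return "{sha1}" + hashlib.sha1(entity_id.encode("utf-8")).hexdigest()


def mdq_urlencoded_identifier(entity_id):
    """Percent-encode the entity id so it can be used as one path segment."""
    return quote(entity_id, safe="")


# Made-up entity id containing a hash mark.
entity_id = "https://idp.example.org/metadata#tenant"
sha1_form = mdq_sha1_identifier(entity_id)
encoded_form = mdq_urlencoded_identifier(entity_id)
```

The `{sha1}` form sidesteps the problem because the hash mark never appears in the request path.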
pyMDOC-CBOR
2 - AOB
Welcome to Alex - he has been working with Ivan looking at the proxy/SATOSA, focusing on projects under GÉANT Core AAI (formerly eduTEAMS). He has a lot of experience with different languages and libraries, and has a lot of ideas on how we can approach and structure things better.
Need to create a list of questions generated from Matthew's efforts to write his own IdP / test SP, then work through a few of them at a time weekly.