On Fri, Feb 21, 2025 at 11:21:23AM +0000, Alex Stuart wrote:
Hi Enrique and all,
Finally there was some discussion on pyFF's issue 289 [2] (When an
entity is loaded from 2 sources, entity data from the 1st source is
lost). I started by showing some "proof of concept" code (just around 15
lines, see [3]) that addresses this issue.
<snip>
A few concerns with this issue were discussed:
* How do consumers (e.g. an MDQ service) cope when they have duplicates
in the metadata and are asked for a particular entity? In thiss-mdq,
when the metadata is loaded, entities are deduplicated, but a number of
(multivalued) entity attributes are merged.
* What happens when one entityID present in 2 md sources corresponds to
different entities (name collisions)? This is a difficult problem, but
somewhat orthogonal to the issue, since it is also present in pyFF's
current form (currently, one of the entities would just disappear).
* Can this be abused, if federation A has less strict requirements for
some entity attribute than federation B? Possibly, yes; this would need
some risk assessment by the working group. Some of the metadata would
not be affected, for example registrationAuthority.
As far as I know, the mechanism for trying to merge metadata for two entities
with the same entityID has not been attempted before in a production
environment, precisely because of the last two issues you raise.
- eduGAIN has a precedence algorithm that chooses one of the clashing
entities. The clashes can be seen in the entities database at
https://technical.edugain.org/entities by choosing "only clashes" from the
entity clashes drop-down. It turns out there are some 3-way collisions in the
eduGAIN upstream, so I'd advise that any algorithm you build allows for
arbitrary numbers of entities with the same entityID.
Yes that's taken into account.
- The UK federation has a collision avoidance algorithm which is complementary
to eduGAIN's.
Your last sentence talks about registrationAuthority, which I think is a very
interesting point. The registrationAuthority string indicates which
organisation takes responsibility and liability for the entity's metadata. So
when pyFF merges metadata from multiple registrationAuthorities, which one
will you take? The SAML V2.0 Metadata Extensions for Registration and
Publication Information spec [4] states two things to take into account:
- The <mdrpi:RegistrationInfo> element MUST NOT appear more than once
- registrationAuthority is the unique identifier of the authority that
registered the entity
For the SeamlessAccess use case, I think a union of all
registrationAuthorities would make sense. This union would not be
meant to be provided to the metadata consumer (e.g. the party querying
the MDQ), but just to be used for indexing purposes by the (enhanced)
MDQ service. So idp1, registered in both regauth1 and regauth2, would
be returned by queries that select by regauth and specify either of
them, and the metadata returned would not need to include regauth info.
In this case I don't see a possibility of abuse.
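A minimal sketch of such a union index, not taken from pyFF or
thiss-mdq: the function name, the (entityID, registrationAuthority)
record shape and the example URLs are all assumptions made for
illustration. It accepts any number of occurrences of the same entityID.

```python
from collections import defaultdict


def index_by_regauth(records):
    """Map each registrationAuthority to the set of entityIDs it registered.

    `records` is an iterable of (entity_id, registration_authority) pairs,
    one per occurrence of an entity in a metadata source, so the same
    entityID may appear any number of times.
    """
    index = defaultdict(set)
    for entity_id, regauth in records:
        index[regauth].add(entity_id)
    return dict(index)


# Hypothetical data: idp1 is registered by two authorities.
records = [
    ("https://idp1.example.org/idp", "https://regauth1.example.org"),
    ("https://idp1.example.org/idp", "https://regauth2.example.org"),
    ("https://idp2.example.org/idp", "https://regauth1.example.org"),
]
index = index_by_regauth(records)
```

A query by either regauth1 or regauth2 finds idp1, while the index
itself never needs to appear in the metadata that is returned.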
For cases where there is a possibility of abuse, like entity category or
assurance certification, a mitigation might be to merge by
intersection rather than union. Again, the use case I contemplate is to
use those merges for indexing purposes, and not to generate Frankenstein
metadata.
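To make the intersection idea concrete, here is a small sketch under
assumed names (merge_by_intersection and the category URIs are
hypothetical, not existing pyFF code). A value survives only if every
source asserts it, so a more permissive source cannot add anything.

```python
def merge_by_intersection(value_lists):
    """Return the values present in every one of the given collections."""
    value_sets = [set(values) for values in value_lists]
    if not value_sets:
        return set()
    return set.intersection(*value_sets)


# Hypothetical entity categories for one entityID seen in three sources.
categories = merge_by_intersection([
    ["https://example.org/category/a", "https://example.org/category/b"],
    ["https://example.org/category/a"],
    ["https://example.org/category/a", "https://example.org/category/c"],
])
```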
I understand that the use case I contemplate is fairly narrow, and there
may be others that are not compatible. Perhaps an idea might be for a
tool like pyFF to confine all this merged data into a
"for-indexing-purposes" attribute in the resulting structures - and
maybe offer other merging strategies to be confined into other
attributes. Just brainstorming, I understand this would get too complex
too soon.
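As a rough illustration of that brainstorm, the sketch below keeps one
copy of the metadata untouched and confines the merged values to a
separate for_indexing attribute, with one strategy per attribute.
Everything here is hypothetical: the class, the attribute names and the
choice of the first copy as the published one are assumptions, not
pyFF behaviour.

```python
from dataclasses import dataclass, field


def union(value_sets):
    return set().union(*value_sets)


def intersection(value_sets):
    return set.intersection(*value_sets) if value_sets else set()


# Hypothetical attribute names and strategy choices, for illustration only.
STRATEGIES = {
    "registration_authority": union,
    "entity_category": intersection,
}


@dataclass
class MergedEntity:
    entity_id: str
    metadata: dict  # one source's copy, left untouched
    for_indexing: dict = field(default_factory=dict)


def merge_copies(entity_id, copies):
    """Merge one or more copies of an entity, each a dict of attribute lists."""
    # Assumption: the first copy is the one published; choosing it is a
    # separate precedence question that this sketch does not address.
    merged = MergedEntity(entity_id, metadata=copies[0])
    for attr, strategy in STRATEGIES.items():
        merged.for_indexing[attr] = strategy([set(c.get(attr, ())) for c in copies])
    return merged


copies = [
    {"registration_authority": ["https://regauth1.example.org"],
     "entity_category": ["cat-a", "cat-b"]},
    {"registration_authority": ["https://regauth2.example.org"],
     "entity_category": ["cat-a"]},
]
merged = merge_copies("https://idp1.example.org/idp", copies)
```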
What's the "working group" you mention in your last issue? I do think the
issue
That's most probably me misremembering a comment made by Ivan, I'm not
good at taking notes at meetings.
of merging metadata needs to be explored more. Perhaps I should join the IdPy
developers call on Monday?
+1
Regards,
Alex S
[4] http://docs.oasis-open.org/security/saml/Post2.0/saml-metadata-rpi/v1.0/cs01/saml-metadata-rpi-v1.0-cs01.html
In the end, we agreed that more discussion is needed to reach a
definitive conclusion, and that any solution is going to carry problems
that can at best be mitigated, not fully solved.
Best regards,
[1] https://github.com/IdentityPython/pyFF/issues/291
[2] https://github.com/IdentityPython/pyFF/issues/289
[3] https://github.com/enriquepablo/pyFF/commit/0fb326d6043c1a3c6c2bb9a431cf4a98e600270f
--
Enrique Pérez Arnaud
—
Alex Stuart (he/him)
Trust and Identity technical architect
alex.stuart(a)jisc.ac.uk