Notes: IdPy developers meeting, 24 February 2025 - Idpy-discuss

25 Feb 2025


Attendees: Shayna, Mikael, Matthew, Alex Stuart, Enrique
Thank you to Enrique for providing a synopsis of some of the discussion of
PyFF issue 289, below.
0 - Agenda bash
1 - Project review
a. General  -
b. OIDC libraries - https://github.com/IdentityPython (idpy-oidc,
JWTConnect-Python-CryptoJWT, etc)
c. Satosa - https://github.com/IdentityPython/SATOSA
   - Matthew is working on the container refresh - trying to get default
      configuration right. He is trying to decide on what to put in place of
      SAMLtest.id.
d. pySAML2 - https://github.com/IdentityPython/pysaml2
e. Any other project (pyFF, djangosaml2, pyMDOC-CBOR, etc)
   - There was much discussion on pyFF issue 289
      <https://github.com/IdentityPython/pyFF/issues/289>
      - There are two issues: distinguishing between name collisions and
         the same entity in two different federations. It should not be hard to
         distinguish name collisions with heuristics or distance measurements.
         Confidence that an entity appearing in two different federations is
         the same one is harder to establish: they may have the same title or
         same description.
      - Take subsets of all the entities that the MDQ might know about
         and create merged data with a merging strategy. It would be good to
         make that data available, but make it clear that it is merged.
      - Mikael states they use pyFF as an authoritative source. If they
         select which entity is published then they will be making decisions
         for their users. He would like a clearer policy on how the merge is
         done. Right now, the "replace with the latest-read entity id" rule is
         hidden in the code. There are provisions in the code for how to merge
         when there are conflicts. We could invent a way to do the merge.
         Enrique would like to look into this. Mikael states that whenever
         pyFF is run as a daemon and the update endpoint is called, the
         provisions are applied. Mikael is not sure how other people are using
         pyFF - eduGAIN uses pyFF - how do they do aggregation? Alex says
         there is an algorithm using a precedence rule, but he is not sure if
         that is using pyFF functionality or if there is something additional
         written in Python which makes that selection.
      - Mikael would like a cleaner way to define how auto merging is
         done, and Enrique would like to have all the data available. Maybe in
         the select pipeline, there is some way to configure/indicate using
         different lists. The aggregate is like a hard rule/limit, and the
         discovery service is more a convenience approach. But if the entity
         is not part of the discovery flow, it won't be used at all.
      - Here are Enrique's notes summarizing the discussion, with a
         potential solution via Alex (thank you, Enrique!):
         - There is a need to provide SeamlessAccess with filtering
            capabilities based on entity data that is merged from different
            sources. For example, we might want to filter IdP data based on
            registration authority, which will be different in each metadata
            source; or based on entity categories, which may or may not differ
            between sources.
         - We have been discussing in pyFF's issue #289 the possibility
            of performing this merge in pyFF. However, we are reaching the
            consensus that if we take this responsibility in pyFF, we would
            need to account for all possible use cases that would benefit from
            this merged data, and not just SeamlessAccess's use case. This
            will require more extensive research and discussion among more
            interested parties, and much care regarding communication and
            documentation.
         - So one possibility to move forward with this, suggested by a
            comment from Alex Stuart regarding the UK federation, would be to
            move the responsibility of merging data downstream of pyFF in
            SeamlessAccess, i.e., to thiss-mdq. This would entail using pyFF
            to produce one discojson output for each of the metadata sources
            aggregated by SeamlessAccess, and preparing thiss-mdq to consume
            all these outputs, merging them in whatever way seems convenient
            for SeamlessAccess.
         - We will discuss this within SeamlessAccess, and we'll be back
            with whatever we conclude.
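
To make the downstream-merge idea above concrete, here is a minimal Python
sketch. It is not how pyFF or thiss-mdq work today; it only illustrates one
possible strategy: a precedence rule decides which source supplies the
top-level fields, while every per-source record is kept so that filtering on
source-specific data (such as registration authority) remains possible. The
field names ("entityID", "title", "registrationAuthority"), the "merged_from"
key, the source names and the example URLs are all assumptions made for
illustration.

```python
def merge_sources(sources, precedence):
    """Merge per-source entity lists into one dict keyed by entityID.

    sources:    dict mapping a source name to a list of entity dicts
    precedence: list of source names, highest precedence first
    """
    merged = {}
    for name in precedence:
        for entity in sources.get(name, []):
            entity_id = entity["entityID"]
            # The first (highest-precedence) source seen supplies the
            # top-level fields; later sources do not overwrite them.
            record = merged.setdefault(entity_id, {**entity, "merged_from": {}})
            # Keep every source's own record so nothing is silently lost
            # and the result is visibly marked as merged.
            record["merged_from"][name] = entity
    return merged


def filter_by_registration_authority(merged, authority):
    """Return entityIDs registered by `authority` in at least one source."""
    return [
        entity_id
        for entity_id, record in merged.items()
        if any(
            src.get("registrationAuthority") == authority
            for src in record["merged_from"].values()
        )
    ]


# Hypothetical example data: the same IdP seen through two federations.
IDP = "https://idp.example.org/idp"
OTHER = "https://other.example.org/idp"
sources = {
    "fed_a": [
        {"entityID": IDP, "title": "Example IdP",
         "registrationAuthority": "https://fed-a.example"},
    ],
    "fed_b": [
        {"entityID": IDP, "title": "Example University",
         "registrationAuthority": "https://fed-b.example"},
        {"entityID": OTHER, "title": "Other IdP",
         "registrationAuthority": "https://fed-b.example"},
    ],
}

merged = merge_sources(sources, precedence=["fed_a", "fed_b"])
matches = filter_by_registration_authority(merged, "https://fed-b.example")
```

With this approach the precedence list is explicit configuration rather than a
rule hidden in code, and a consumer can still see which sources contributed to
each merged entry.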
      - Matthew asked where the pyFF documentation is. Mikael says
         there are documents exported to readthedocs, and an old version of
         inline documentation in the code that is not picked up by modern
         tools. Mikael is trying to improve the documentation and test
         without breaking anything.
      - Matthew wanted to use pyFF for a private thiss.js deployment
         but couldn't quite get it working. For the pipelines, it uses a
         domain-specific language that is not easy to work with. We are stuck
         with that right now unless there is a new major version. Without a
         working example it's hard to get something going - but there are
         examples in the unit tests.
      - Instead of pyFF, the UK federation uses the Shibboleth MDA
         framework. eduGAIN uses both pyFF and Shibboleth MDA. It's a
         complicated framework to set up. The Spring config files are
         XML-based. In both cases, the SAML pipelines refresh.
   - Mikael is still working on upstreaming the OpenID Federation
      code bases / Wallet repositories
2  - AOB