Attendees: Shayna, Mikael, Matthew, Alex Stuart, Enrique
Thank you to Enrique for providing a synopsis of some of the discussion of
PyFF issue 289, below.
0 - Agenda bash
1 - Project review
a. General -
b. OIDC libraries -
https://github.com/IdentityPython (idpy-oidc,
JWTConnect-Python-CryptoJWT, etc)
c. Satosa -
https://github.com/IdentityPython/SATOSA
- Matthew is working on the container refresh and trying to get the
default configuration right. He is deciding what to put in place of
SAMLtest.id.
d. pySAML2 -
https://github.com/IdentityPython/pysaml2
e. Any other project (pyFF, djangosaml2, pyMDOC-CBOR, etc)
- There was much discussion on pyFF issue 289
<https://github.com/IdentityPython/pyFF/issues/289>
- There are two issues: distinguishing between name collisions and
the same entity in two different federations. It should not be hard to
distinguish name collisions with heuristics or distance measurements.
Confidence in the same entity in two different federations ?? They may
have the same title or same description.
- Take subsets of all the entities that the MDQ might know about and
create merged data with a merging strategy. It would be good to make
that data available, but make it clear that it is merged.
- Mikael states they use pyFF as an authoritative source. If they
select which entity is published then they will be making decisions for
their users. He would like a clearer policy on how the merge is done.
Right now, the "replace with the latest-read entity id" rule is hidden
in the code. There are provisions in the code for how to merge when
there are conflicts. We could invent a way to do the merge. Enrique
would like to look into this. Mikael states that whenever pyFF is run
as a daemon and the update endpoint is called, the provisions are
applied. Mikael is not sure how other people are using pyFF - eduGAIN
uses pyFF - how do they do aggregation? Alex says there is an algorithm
using a precedence rule, but he is not sure if that is using pyFF
functionality or if there is something additional written in Python
which makes that selection.
- Mikael would like a cleaner way to define how auto merging is done,
and Enrique would like to have all the data available. Maybe in the
select pipeline, there is some way to configure/indicate using
different lists. The aggregate is like a hard rule/limit, and the
discovery service is more a convenience approach. But if the entity is
not part of the discovery flow, it won't be used at all.
- Here are Enrique's notes summarizing the discussion, with a
potential solution via Alex. Thank you, Enrique!
- There is a need to provide SeamlessAccess with filtering
capabilities based on entity data that is merged from different
sources. For example, we might want to filter IdP data based on
registration authority, which will be different in each metadata
source; or based on entity categories, that may or may not differ in
different sources.
- We have been discussing in pyFF's issue #289 the possibility of
performing this merge in pyFF. However, we are reaching the consensus
that if we take this responsibility in pyFF, we would need to account
for all possible use cases that would benefit from this merged data,
and not just SeamlessAccess' use case. This will require more extensive
research and discussion among more interested parties, and much care
regarding communication and documentation.
- So one possibility to move forward with this, suggested by a
comment from Alex Stuart regarding the UK federation, would be to move
the responsibility of merging data to the downstream of pyFF in
SeamlessAccess, i.e., to thiss-mdq. This would entail using pyFF to
produce one discojson output for each of the metadata sources
aggregated by SeamlessAccess, and prepare thiss-mdq to consume all
these outputs, merging them in whatever way it seems convenient for
SeamlessAccess.
- We will discuss this within SeamlessAccess, and we'll be back
with whatever we conclude.
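
A rough sketch of what such a downstream merge could look like, for
illustration only. It is written in Python and is not thiss-mdq code.
It assumes each per-source discojson output has already been loaded
into a list of dicts, and that entries carry an "entityID" key plus
fields such as "title" and "registrationAuthority"; those field names,
the source names and the helper functions are assumptions made for the
example. Instead of letting the latest-read entry win, it keeps every
value together with the source it came from, so filtering can still
look at per-source data.

```python
from collections import defaultdict


def merge_sources(sources):
    """Merge per-source entity lists, keyed by entityID.

    `sources` maps a (hypothetical) source name to a list of entity
    dicts. No value is overwritten: each field collects
    (source_name, value) pairs, so the origin of every value is kept.
    """
    merged = {}
    for source_name, entities in sources.items():
        for entity in entities:
            record = merged.setdefault(entity["entityID"], defaultdict(list))
            for field, value in entity.items():
                if field != "entityID":
                    record[field].append((source_name, value))
    return {entity_id: dict(fields) for entity_id, fields in merged.items()}


def registered_by(merged, authority):
    """Return entityIDs for which any source reports this authority."""
    return sorted(
        entity_id
        for entity_id, fields in merged.items()
        if any(value == authority
               for _, value in fields.get("registrationAuthority", []))
    )


# Made-up data: the same IdP published by two federations.
sources = {
    "federation-a": [{
        "entityID": "https://idp.example.org/idp",
        "title": "Example IdP",
        "registrationAuthority": "https://fed-a.example/",
    }],
    "federation-b": [{
        "entityID": "https://idp.example.org/idp",
        "title": "Example IdP",
        "registrationAuthority": "https://fed-b.example/",
    }],
}
merged = merge_sources(sources)
```

With data shaped like this, the IdP stays visible under either
registration authority, and the merged record makes it clear which
source contributed which value.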
- Matthew asked where the pyFF documentation is. Mikael says there
are documents exported to readthedocs, and an old version of inline
documentation in the code that is not picked up by modern tools. Mikael
is trying to improve the documentation and test without breaking
anything.
- Matthew wanted to use pyFF for a private thiss.js deployment but
couldn't quite get it working. For the pipelines, it uses a
domain-specific language that is not easy to work with. We are stuck
with that right now unless there is a new major version. Without a
working example it's hard to get something going, but there are
examples in the unit tests.
- Instead of pyFF, the UK federation uses the Shibboleth MDA
framework. eduGAIN uses both pyFF and Shibboleth MDA. It's a
complicated framework to set up. The Spring config files are XML-based.
In both cases, the SAML pipelines refresh.
- Mikael is still working on upstreaming the OpenID Federation
code bases / Wallet repositories
2 - AOB