Hi Leif,
I added 2 pipes to buildin.py:
- publish_html creates static HTML views of IDPs and SPs, using XSLT based on Peter Schober’s alternative to MET;
- publish_split: similar to store, but added validUntil and creates signed XML-file per EntityDescriptor. This can be consumed dynamically by ADFS in an IDP role.
I put it directly into buildin.py because it shares some code with the sign pipe. Is this viable from your PoV - if yes, I would make an PR.
Cheers, Rainer
Hi all,
being part of Commons Conservancy brought up yet another subject,
which is whether we should add a header with license information in
every file in the projects under idpy. This is not something done in
an abstract way, there is a specific format modelling this information
(see https://spdx.org/ and https://reuse.software/ - more specifically
https://reuse.software/practices/2.0/) Still, I find it problematic.
We want to open up the question to the wider community and consider
their thoughts on this. The forwarded message below is discussing this
subject. You can see the question we posed, the answer we got and my
comments. Feel free to tell us what you think on this.
---------- Forwarded message ---------
Date: Thu, 16 May 2019 at 09:56
> ---------- Forwarded message ----------
> Date: May 8, 2019, 8:15 AM -0700
>
> > Why does CC think having a single license file per project is
> > insufficient? Our thought is that if we can avoid adding a header to
> > every single file, that would be nice, esp. given we already have this
> > info in the license file and we have the Note Well.
>
>
> this is not just our opinion, but something that is an industry and
> community standard for legal compliance these days. When companies like
> Siemens, Samsung or Honeywell use some code in one of the hundreds or
> thousands of devices and systems in their product line, they need to be
> able to provide the correct license and a download of the exact version.
> This means machine readability too.
>
I've actually observed the opposite of that. Communities abandon the
"license in every file" model, and just use a single LICENSE file in
the root of the project. The LICENSE file contains license
information, that is, it is not a single license but it has exception
sections and so on.
> To quote from https://reuse.software/practices/2.0/ :
>
> Scroll to the section "2. Include a copyright notice and license in each
> file"...
>
> "Source code files are often reused across multiple projects, taken from
> their origin and repurposed, or otherwise end up in repositories where
> they are separate from its origin. You should therefore ensure that all
> files in your project have a comment header that convey that file’s
> copyright and license information: Who are the copyright holders and
> under which license(s) do they release the file?
>
Continuing from above, the standardization of package-management
formats and tools has helped exactly with that: to avoid distribution
of single files, and instead provide packages and modules. It is bad
practice and considered a hack to copy files. Nobody liked that model
and everyone is moving away; it is unstructured, it becomes
unmanageable and it will cause problems.
> It is highly recommended that you keep the format of these headers
> consistent across your files. It is important, however, that you do not
> remove any information from headers in files of which you are not the
> sole author.
>
> You must convey the license information of your source code file in a
> standardised way, so that computers can interpret it. You can do this
> with an SPDX-License-Identifier tag followed by an SPDX expression
> defined by the SPDX specifications."
>
> (the text goes on for a while after this, to clarify the point but this
> is the basic gist of it)
>
> There is a nice Python tool to check:
>
> https://github.com/fsfe/reuse-tool
>
> I hope this makes sense
>
Well, it does not make complete sense. We're talking about licensing a
project. A project is not just code; there are data files (html, xml,
yaml, json files), binary files (archives/zip, images, audio, video,
etc), text files (configs, ini-files, etc) all "not-code". How do you
mark those files? Does the LICENSE file need a license-header? The
json format does not define comments, how do you add a header there?
If a binary file does not get a license header, why should a file with
code get one?
I would expect there to be a way to have the needed information
unified. If the files themselves cannot provide this information it
has to be external; thus the LICENSE file. If someone is worried about
somebody else re-using single files that do not have license
information (a python file, a png image, etc) there is really nothing
you can do (the DRM industry has been trying to solve for a long time;
and still your best bet is "social DRM").
Since, we're developing on open source with a permissive license, even
if someone does that, should we be happy that someone is actually
using what we built or sad that the files they copied did not have a
license header? And if they include the license information of that
copied file in their project's LICENSE file, is this solved?
Having pointed these contradictions, I am thinking that the "license
in every file" model seems to be a step backwards. It is introducing
overhead and does not really solve the problem, while at the same time
it enables a culture of bad practice (copying files around).
Cheers,
--
Ivan c00kiemon5ter Kanakarakis >:3
Attendees:
Ivan, Giuseppe, Scott, Heather, John
Regrets:
Roland, Matthew
1 - Status of architecture documentation
Ivan wrote up a page on middleware: https://github.com/IdentityPython/SATOSA/wiki/Middlewares
Will be adding more on the usage of of middleware, enabling and disabling them.
Question raised if ASGI is something we want to consider in place of or in addition to WSGI. ASGI is generally a good thing. If performance was critical, this might be of interest where we’re worried about reading/writing to disk, timing, and encryption. Mostly, we don’t need the asynchronicity so the cost in terms of refactoring the code is not worth it.
2 - GitHub review
a. OIDC - https://github.com/IdentityPython (JWTConnect-Python-OidcRP, JWTConnect-Python-CryptoJWT, etc)
Giuseppe and Roland have been working through new feature requests (e.g., a configuration validator) which might also be of interest for pySAML2. See https://github.com/IdentityPython/oidcendpoint/issues/70. Many ways to implement this, suggest including the documentation in the validator. This could be a schema. Ivan suggests looking at https://pydantic-docs.helpmanual.io/
b. Satosa - https://github.com/IdentityPython/SATOSA
Added the middleware documentation to the wiki space (see above).
Fixed the metadata generation for the virtual frontend.
c. pySAML2 - https://github.com/IdentityPython/pysaml2
New release is out, which includes changes from Johan needed for the eIDAS proxy node. Mostly changes in the aggregate maps. Some fixes in the docs. New general config option for creating entity attributes that we don’t have specific configurations for. Allows us to set the friendly name attribute and values. See https://github.com/IdentityPython/pysaml2/commit/e5d0b4f0760144430d885165d4…
NameID format config has been changed, and how we handle the NameID format policy. This was a breaking change. See https://github.com/IdentityPython/pysaml2/commit/0c1873da1f280d4921b9c9b3da…
Working on changing the default algorithms. We do need to decide what we want the new defaults to be (currently SHA1). Changing the defaults will be considered a breaking change. Still need to work through how this will be configured. SHA1 is hardcoded in some places. We need better defaults for signing and digesting pieces of XML documents. Given what we do (we are not in a high performance environment) the slight performance hit of SHA512 over SHA256 may not be significant. When issuing responses with poor vendor implementations of SPs, we have to consider whether they can handle the more complex algorithms. We will go with SHA512 as our choice, but need to be clear on documenting how to change the default and consider how to set this as a per-SP choice.
We use a cert to check the types, and whether some things are verified. This is not ideal; certs are stripped out when python code is compiled. Will replace the certs with exception raises, making the type of exception reasonable.
d. pyFF - https://github.com/IdentityPython/pyFF
3 - AOB
Backend initiated responses
Scott has a modification to the SAML front end to support unsolicited flows, so that you can have a URL and specify all the details you want for the flow. This has been useful in COmanage when, after a registration flow, there is enough information to send the user to a specific SP based on their enrollment. If anyone else needs this kind thing, please contact Scott.
Typehints - Should we be using these? Yes, where they make sense. If we can have actually schemas that can verify input/output, and within the system we can then rely on the data that has been verified, that would be good. Typehints should be set where the API is defined.
Should we consider incorporating Django SAML into idpy? Giuseppe will reach out to the Django SAML team and see if they are interest, and then send a message to the idpy-discuss list for feedback.
Thanks! Heather
Attendees:
Giuseppe, Heather, Ivan, John P, Hannah, Christos
Regrets:
Scott
Note:
0 - Agenda bash
1 - Status of architecture documentation
All this started around the idea of middleware; these will be a configurable set of things. Ivan continues to develop his ideas on this, has not yet started writing these down.
Will also be creating an identifier for any request that comes in; that will allow us to mark requests as unsolicited as needed, and change how we process it.
Microservices look like middlewares, just not at the network level.
Alternatively, may start developing something called interceptors instead of middlewares; Ivan still investigating the possibilities.
Look to the wiki space in Satosa GitHub for upcoming documentation. May also be creating diagrams (using https://plantuml.com/) that will explain at least part of the flow.
We still need better basic documentation to get people started; problem is Satosa is so complicated, it’s hard to put together good documentation.
In terms of diagrams, Ivan is also looking at:
- https://www.websequencediagrams.com/
- https://plantuml.com/
- https://app.diagrams.net/
- http://blockdiag.com/en/blockdiag/index.html
- https://www.lucidchart.com/
- https://www.gliffy.com/
- https://creately.com/
- https://miro.com/
One question about using signals to help automate the documentation, but that won’t work for Satosa, which doesn’t use a framework that would support that.
Note that Rainer created some graph diagrams for Satosa as well about a year ago ( https://github.com/IdentityPython/SATOSA/pull/263)
2 - GitHub review
a. OIDC - https://github.com/IdentityPython (JWTConnect-Python-OidcRP, JWTConnect-Python-CryptoJWT, etc)
b. Satosa - https://github.com/IdentityPython/SATOSAhttps://github.com/IdentityPython/SATOSA/pull/322
People are bookmarking the discovery service as their initial landing page to a service, but that leaves the SAML backend confused as to which front end the discovery service belongs to in that scenario. Scott, Ivan, and Matthew have talked about whether the backend should have its own discovery flow. We would need to define a list of strategies on where to get the target service, and the backend would need to make that part of the context so the front end would know what to pick up and send back to the service. This is doable, but not trivial to code.
Could throw an error that lets the user know they need to start at whatever service they want to go to. Not sure this would work either, because the user may well not know the URL to the service URL. Alternatively, have a generic landing page that contains links to all the services and let them pick from there. This page would be created by the deployer, and not automatically generated by Satosa. Can be implemented in the same way that the unhandled exceptions will be handled.
Ivan will need to finish writing up and then implementing middleware, and then we can work on this issue.
c. pySAML2 - https://github.com/IdentityPython/pysaml2
Ivan has been looking at refactoring name_id_format (https://github.com/IdentityPython/pysaml2/issues/669)
Ivan has a patch, but it is a breaking change. New option is called name_id_policy_format. The policy format specifics what happens with an authN request is created, and the name_id_format options is only used to define what’s in the metadata. Name_id_format can have multiple values (the order of the values matter) while the name_id_policy_format only takes a single value which, if not specified, will not be set. The “none” string is now gone.
Ivan plans to do a non-breaking release first with some other changes, and then will introduce this breaking change.
Also looking at https://github.com/IdentityPython/pysaml2/pull/628 that will allow us to disable weak algorithms. This is a good initial step, but it’s not really enough. pySAML2 defines different algorithms that it knows how to work with, as does xmlsec. We currently ask xmlsec to give us that list, getting a different set depending on xmlsec version, and then we compare it to what pySAML2 knows how to handle. If we follow this pattern, we will need to update pySAML2 with the list of what it can handle. Alternatively, we can store a restricted last that will focus on what to disable, rather than what to support. The trick is that we need the proper namespace names out of xmlsec rather than the short names we get now. We also need to define proper defaults (right now, the default is SHA1). We should disallow md5.
Hopes to have something by the end of the week that will get us moving in the right direction by configuring the choice of algorithms.
d. pyFF - https://github.com/IdentityPython/pyFF
3 - AOB
Thanks! Heather