Hi Leif,
I added 2 pipes to buildin.py:
- publish_html creates static HTML views of IDPs and SPs, using XSLT based on Peter Schober’s alternative to MET;
- publish_split: similar to store, but adds validUntil and creates one signed XML file per EntityDescriptor. These files can be consumed dynamically by ADFS in an IDP role.
I put it directly into buildin.py because it shares some code with the sign pipe. Is this viable from your PoV? If yes, I will make a PR.
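To illustrate the publish_split idea, here is a minimal sketch of the splitting step only. It is not the pyFF code: the function name `split_entities`, the seven-day validity window and the use of the standard-library XML parser are all assumptions made for the example, and signing is left out entirely.

```python
import copy
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"
ET.register_namespace("md", MD_NS)


def split_entities(aggregate_xml, valid_for=timedelta(days=7), now=None):
    """Return {entityID: xml_string}, one document per EntityDescriptor.

    Hypothetical helper: `now` should be a timezone-aware datetime and
    defaults to the current time in UTC.
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    valid_until = (now + valid_for).strftime("%Y-%m-%dT%H:%M:%SZ")

    root = ET.fromstring(aggregate_xml)
    documents = {}
    for entity in root.iter(f"{{{MD_NS}}}EntityDescriptor"):
        # Copy the element so the aggregate itself is left untouched.
        standalone = copy.deepcopy(entity)
        standalone.set("validUntil", valid_until)
        # A real pipe would sign `standalone` here; this sketch does not.
        documents[entity.get("entityID")] = ET.tostring(
            standalone, encoding="unicode"
        )
    return documents
```

Each returned string could then be written to its own file, named after the entityID in whatever way the consumer expects.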
Cheers, Rainer
Hi all,
being part of Commons Conservancy brought up yet another subject,
which is whether we should add a header with license information in
every file in the projects under idpy. This is not something done in
an abstract way; there is a specific format modelling this information
(see https://spdx.org/ and https://reuse.software/ - more specifically
https://reuse.software/practices/2.0/). Still, I find it problematic.
We want to open up the question to the wider community and consider
their thoughts on this. The forwarded message below is discussing this
subject. You can see the question we posed, the answer we got and my
comments. Feel free to tell us what you think on this.
---------- Forwarded message ---------
Date: Thu, 16 May 2019 at 09:56
> ---------- Forwarded message ----------
> Date: May 8, 2019, 8:15 AM -0700
>
> > Why does CC think having a single license file per project is
> > insufficient? Our thought is that if we can avoid adding a header to
> > every single file, that would be nice, esp. given we already have this
> > info in the license file and we have the Note Well.
>
>
> this is not just our opinion, but something that is an industry and
> community standard for legal compliance these days. When companies like
> Siemens, Samsung or Honeywell use some code in one of the hundreds or
> thousands of devices and systems in their product line, they need to be
> able to provide the correct license and a download of the exact version.
> This means machine readability too.
>
I've actually observed the opposite of that. Communities abandon the
"license in every file" model, and just use a single LICENSE file in
the root of the project. The LICENSE file contains license
information, that is, it is not a single license but it has exception
sections and so on.
> To quote from https://reuse.software/practices/2.0/ :
>
> Scroll to the section "2. Include a copyright notice and license in each
> file"...
>
> "Source code files are often reused across multiple projects, taken from
> their origin and repurposed, or otherwise end up in repositories where
> they are separate from its origin. You should therefore ensure that all
> files in your project have a comment header that convey that file’s
> copyright and license information: Who are the copyright holders and
> under which license(s) do they release the file?
>
Continuing from above, the standardization of package-management
formats and tools has helped exactly with that: to avoid distribution
of single files, and instead provide packages and modules. It is bad
practice and considered a hack to copy files. Nobody liked that model
and everyone is moving away; it is unstructured, it becomes
unmanageable and it will cause problems.
> It is highly recommended that you keep the format of these headers
> consistent across your files. It is important, however, that you do not
> remove any information from headers in files of which you are not the
> sole author.
>
> You must convey the license information of your source code file in a
> standardised way, so that computers can interpret it. You can do this
> with an SPDX-License-Identifier tag followed by an SPDX expression
> defined by the SPDX specifications."
>
> (the text goes on for a while after this, to clarify the point but this
> is the basic gist of it)
>
> There is a nice Python tool to check:
>
> https://github.com/fsfe/reuse-tool
>
> I hope this makes sense
>
Well, it does not make complete sense. We're talking about licensing a
project. A project is not just code; there are data files (html, xml,
yaml, json files), binary files (archives/zip, images, audio, video,
etc), text files (configs, ini-files, etc) all "not-code". How do you
mark those files? Does the LICENSE file need a license-header? The
json format does not define comments, how do you add a header there?
If a binary file does not get a license header, why should a file with
code get one?
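For concreteness, the tag the quoted text asks for is a single comment line such as `# SPDX-License-Identifier: Apache-2.0`. A rough sketch of a check for it follows. The helper name `find_spdx_expression` and the ten-line search window are assumptions for illustration; this is not how reuse-tool works.

```python
import re

# Matches the tag and captures the expression after it, ignoring a
# trailing comment closer such as "*/" or "-->".
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+?)\s*(?:\*/|-->)?\s*$")


def find_spdx_expression(text, max_lines=10):
    """Return the SPDX expression found near the top of `text`, or None."""
    for line in text.splitlines()[:max_lines]:
        match = SPDX_TAG.search(line)
        if match:
            return match.group(1)
    return None


print(find_spdx_expression("# SPDX-License-Identifier: Apache-2.0\nimport os\n"))
print(find_spdx_expression('{"key": "value"}'))
```

The second call returns None: a JSON document has no comment syntax in which to carry the tag, which is the case raised above.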
I would expect there to be a way to have the needed information
unified. If the files themselves cannot provide this information it
has to be external; thus the LICENSE file. If someone is worried about
somebody else re-using single files that do not have license
information (a python file, a png image, etc) there is really nothing
you can do (the DRM industry has been trying to solve this for a long
time, and still your best bet is "social DRM").
Since we're developing open source with a permissive license, even
if someone does that, should we be happy that someone is actually
using what we built or sad that the files they copied did not have a
license header? And if they include the license information of that
copied file in their project's LICENSE file, is this solved?
Having pointed out these contradictions, I am thinking that the "license
in every file" model seems to be a step backwards. It is introducing
overhead and does not really solve the problem, while at the same time
it enables a culture of bad practice (copying files around).
Cheers,
--
Ivan c00kiemon5ter Kanakarakis >:3
Attendees:
Ivan, Giuseppe, Heather, Hannah, Roland, Scott, John P, Johan, Nikos
1 - Status of architecture documentation & Normalizing idpy projects (see email from Ivan, “Subject: [idpy-discuss] Normalizing across all projects”, 10 November 2020)
Before we make djangosaml2 part of idpy, we should have a stronger set of guidelines on what we expect in terms of managing a project (using semver, readthedocs, change logs, etc). We could also use something called “cookie cutter”, which handles much of the setup of these kinds of things for projects in GitHub. While we would not create new project spaces for projects that come in with their own repository, it would provide a model and technical documentation on what we think projects should look like. Note that this would also let us normalize how we handle issues, what labels we use, PR templates, a common FAQ, tooling to build and test packaging, and boilerplate (README, LICENSE, etc).
• Example: https://github.com/SUNET/eduid-webapp/tree/master/cookiecutter-app
Ivan will continue to write up his thoughts in email. While Ivan can make these decisions for Satosa and pySAML2, we need input from other maintainers.
Discussion:
• We can agree to semver and pep8, but some of the other things we’ve been talking about are very heavy for some of the smaller projects.
• Do we know who our customers are, and what they need? Will they be taking our libraries and using them to build their own use cases? Or are they looking for a packaged service they can just run? We can’t cater to both. Can we have two categories of rules? One size won’t fit all projects.
• What about change logs? They send a good signal to deployers about what to watch for when they upgrade (or help them decide if an upgrade is required).
Consensus:
• semver
• pep8
• readthedocs
• README and LICENSE files
• PR and issue templates (already exist)
• Change logs
3 - GitHub review
a. OIDC - https://github.com/IdentityPython (JWTConnect-Python-OidcRP, JWTConnect-Python-CryptoJWT, etc)
New project for the documentation on session management (code is still in oidcendpoint). Since readthedocs can’t handle documentation for forks, this allows the publication of material before the code is even final. Has not pushed to readthedocs yet; will aim to have that done before the next call.
b. Satosa - https://github.com/IdentityPython/SATOSA
No changes.
c. pySAML2 - https://github.com/IdentityPython/pysaml2
There was a bug in the redirect binding, where the request was also going to be signed. The problem was that by default, if the signing algorithm was not specified, the right parameters were not produced. This has been fixed. The same PR will include Giuseppe’s work on configurable signing and digest algorithms (see https://github.com/IdentityPython/pysaml2/pull/744). Ivan will be pushing a commit with partial notes (in the form of bullet points), and other people involved will help turn that into proper documentation. We will need to refactor the whole process of signing, encryption, and decryption.
d. pyFF - https://github.com/IdentityPython/pyFF
4 - AOB
Thanks! Heather
Hello everyone,
at the last idpy meeting, we started a discussion around normalizing
how idpy projects operate. The goal is to agree on a baseline on how
we treat different project aspects and how we can harmonize those in
order to give users uniform expectations. A user that is familiar with
how project-x works, should automatically be familiar with how
project-y works when they decide to use that too. The first step is to
agree on guidelines, and then have each project steadily work its way
to that. In the process, we should effectively be cultivating a
culture on how we manage projects, and not blindly follow policies.
The initial discussion started around documentation, versioning and
changelogs. Many more things can and should be discussed. Those
include topics such as tooling, CI/CD, testing, packaging, release
schedules, issues management, git-workflow, etc. I will expand on some
of those topics and we can further discuss in this or separate threads
or in the following calls about each one.
## Documentation
I think we can all agree that documentation is important and needed. I
make a distinction between documentation whose audience is a user and
documentation whose audience is a developer. The user-documentation
consists mainly of the project API, how it is used, and use-case
scenarios. The developer-documentation consists of architectural
diagrams, design choices, separation of entities, and general guidance
on how modules are separated and code is organized.
Being part of the python community, the natural tool around
documentation is https://readthedocs.org/. Readthedocs works in tandem
with Git(Hub) projects, and can automatically be kept in sync with a
repo.
- Do we agree to use https://readthedocs.org/ to publish documentation?
- Should we request PRs to include documentation? Or should we
fill in the documentation when creating a new release?
## Managing change
### Versioning
What we actually want to do with versioning is _signal_ the type of
changes that are part of a new release. The most common versioning
scheme is semantic versioning (https://semver.org/) - other things out
there include CalVer (https://calver.org/), Apache versioning
(https://apr.apache.org/versioning.html) and more. However, I think
semver is quite common and probably everyone will agree that it's
fine. Most importantly it achieves the goal of signaling in a simple
way:
> Given a version number MAJOR.MINOR.PATCH, increment the:
>
> - MAJOR version when you make incompatible API changes,
> - MINOR version when you add functionality in a backwards
> compatible manner, and
> - PATCH version when you make backwards compatible bug fixes.
>
> Additional labels for pre-release and build metadata are available
> as extensions to the MAJOR.MINOR.PATCH format.
- Do you think we should use semver?
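
As an illustration of how the three rules quoted above translate into version bumps, here is a minimal sketch. The `Version` class and the change labels "breaking", "feature" and "fix" are hypothetical names chosen for the example; pre-release and build metadata are not handled.

```python
import re
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text):
        """Parse a plain MAJOR.MINOR.PATCH string."""
        match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", text)
        if match is None:
            raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
        return cls(*map(int, match.groups()))

    def bump(self, change):
        """Return the next version for the given kind of change."""
        if change == "breaking":   # incompatible API change
            return Version(self.major + 1, 0, 0)
        if change == "feature":    # backwards compatible functionality
            return Version(self.major, self.minor + 1, 0)
        if change == "fix":        # backwards compatible bug fix
            return Version(self.major, self.minor, self.patch + 1)
        raise ValueError(f"unknown change type: {change!r}")

    def __str__(self):
        return f"{self.major}.{self.minor}.{self.patch}"


print(Version.parse("1.4.2").bump("feature"))
```

Because the class is a tuple, versions also compare in the expected order, so a deployer can tell at a glance (or in a script) whether an upgrade crosses a MAJOR boundary.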
### Changelog
Same as with documentation, the Changelog comes in two flavours - the
changelog that is intended for developers, and the changelog that is
intended for users (aka release notes). The technical-changelog is the
VCS-log itself and contains purely technical entries on how the code
was modified to achieve a certain effect. We have the technical
changelog for free.
On the other hand, the user-changelog has the form of a set of release
notes. The release notes should be enough to explain what’s going on
to someone deploying the package, or to a user that wants to use new
functionality - it is a summary of the new release value. This type of
changelog cannot be autogenerated, and we must write and curate it by
hand. As with versioning, there are some formats out there, but I
think something like keepachangelog (https://keepachangelog.com/) is
easy to use and very close to what most of us would end up with while
trying to specify a consistent format.
When should the user-changelog be updated? Usually this is done
at release-time.
- Should we maintain a changelog/release-notes?
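
To show what a consistent format buys us, here is a small sketch that pulls the notes for one release out of a file written in the keepachangelog style, where each release starts with a heading like `## [1.1.0] - 2020-11-10`. The function name `release_notes` is an assumption, and the sample entries used with it are made up for the example.

```python
import re

# Release headings look like "## [1.1.0] - 2020-11-10" or "## [Unreleased]".
RELEASE_HEADING = re.compile(
    r"^## \[(?P<version>[^\]]+)\](?: - (?P<date>\d{4}-\d{2}-\d{2}))?\s*$"
)


def release_notes(changelog, version):
    """Return the text between the heading for `version` and the next release."""
    lines = []
    inside = False
    for line in changelog.splitlines():
        heading = RELEASE_HEADING.match(line)
        if heading:
            if inside:
                break  # reached the next release heading
            inside = heading.group("version") == version
            continue
        if inside:
            lines.append(line)
    return "\n".join(lines).strip()


# Made-up changelog, for illustration only.
SAMPLE = """# Changelog

## [Unreleased]

## [1.1.0] - 2020-11-10
### Added
- Example feature entry.

## [1.0.1] - 2020-10-01
### Fixed
- Example bug fix entry.
"""

print(release_notes(SAMPLE, "1.1.0"))
```

The same text could be reused for the release description on GitHub or PyPI, so the notes are written once, by hand, and published everywhere.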
I will continue this subject with more topics, soon. I'm looking
forward to your thoughts!
Cheers,
--
Ivan Kanakarakis
Attendees:
Ivan, Heather, Giuseppe, Johan, Roland, John, Hannah, Matthew, Christos, Leif
Regrets:
Scott
1 - Status of documentation
Hannah is still working on the pySAML2 documentation.
Scott K is working on reorganizing the documentation he’s written on GitHub (how readthedocs is organized and presented).
No update on the architecture documentation, but it is coming up in priority.
Board considers documentation (both user and developer) critical as a way to grow the community.
2 - GitHub review
a. OIDC - https://github.com/IdentityPython (JWTConnect-Python-OidcRP, JWTConnect-Python-CryptoJWT, etc)
Regarding documentation, we need a central site that describes how it all fits together. Could we use readthedocs for this? Possibly, but we would need to link from the project sites to the documentation site.
We have some new people programming in this space: Ori Mizrahi and David Hess.
Roland is working on the session management backend. This is mostly working; the focus is now on the tests, to make sure it works in all cases. When that’s done, it will be very different from the old version. Documentation will be absolutely required (background, design choices, how to use it in context). See branch “new session handling” in https://github.com/IdentityPython/oidcendpoint
b. Satosa - https://github.com/IdentityPython/SATOSA
Within eduTEAMS, had an incident that may be related to the memory leak. Unclear, and no more info is available.
We have a contribution for an Apple signing backend.
c. pySAML2 - https://github.com/IdentityPython/pysaml2
Ivan is working through the algorithm PRs (744 and 745). They aren’t quite passing tests, but as soon as those are sorted, they will be merged.
• https://github.com/IdentityPython/pysaml2/pull/744
• https://github.com/IdentityPython/pysaml2/pull/745
Namespace prefix changes to make them more readable. This is mostly a “nice-to-have”.
• https://github.com/IdentityPython/pysaml2/pull/326
• https://github.com/IdentityPython/pysaml2/pull/625
Priority list:
• Make algorithms configurable, enforcing the policies (see https://github.com/IdentityPython/pysaml2/pull/744 and 745)
• Getting away from xmlsec1 and instead working in memory. Expect we will switch to lxml.
• Architecture documentation
AOB
• At the Board meeting, we talked about normalizing some things across all projects. Example: documentation using readthedocs; how we manage changes and versions (e.g., semantic versioning); change logs. The group needs to discuss this further. We need to add these to future agendas (and to the mailing list) for discussion.
• The goal isn’t to enforce this harmonization immediately, but to give each project a direction to start making changes.
• If we use readthedocs, it is structured by projects, so even general documentation will need to be structured as a “project”.
• For change logs, there is what’s automatically available from revision control, but that doesn’t always adequately describe what actually happened. What may be more useful would be a set of release notes. The release notes should be enough to explain what’s going on to someone deploying the package, and the git log is what you need as a developer to understand what’s going on. We discussed this and semantic versioning at a TIIME meeting in 2019. pySAML2 is not using semver because they decided to have Ivan only work on the current/future work, and only back ports handled the patch version.
• Ivan will post a summary and a recommendation around this discussion re: versioning and change logs to the list.
Thanks! Heather