Quote from navaru on November 19, 2011, 01:46
This project is new. To make sure we don't mess up too badly from the beginning (we eventually will in some cases, by writing a lot of bugs), we, as a community, should create some standards.
As a start, we all need to work out our *list of needs*, the building blocks, and come up with a list of standards on how to write a package and a package manager, so that we can easily find what we need among what others have written -- in order not to reinvent the wheel every time.
There is a pattern with humans: they find something new, play with it, make something interesting and stop -- this is the case where someone writes a very good library with bad documentation and no longer supports it, so you'll find yourself using it for a while and then... "oh crap, new Raspberry Pi updates, and no library support...", and what do you do? "#include <frustration_mode>"
We also need a style guide on how to write code so that we can all read it, and on how to write a good read.me (documentation) <-- a must
Please feel free to post your opinion. If you don't have one, search Google.
Have you considered forgoing binary packages altogether and standardizing on Gentoo?
As nice as binary package managers are, they are the result of compromises that stem from the following issues:
The sheer quantity of software available.
The sheer number of options available when building software.
The multitude of ISAs available (e.g. alpha, arm, mips, ppc, sparc, x86) and revisions to them (e.g. instruction set extensions)
Differing ABI conventions (e.g. little endian versus big endian, how to do function calls to both normal and variadic argument functions)
Build options (e.g. Qt support versus GTK+ support, GnuTLS versus OpenSSL)
Legal issues from software patents (e.g. S3 texture compression)
Legal issues from trademarks (e.g. Firefox branding)
Distribution vendor priorities (try using CentOS as a desktop distribution to learn what I mean)
This has led to the following compromises:
1. Multiple repositories, which makes it difficult to locate software and poses security risks due to the ability for anyone to make one, including malicious third parties
2. Software built to cover as many use cases as possible, which can result in the omission of debugging symbols, bloat from the inclusion of debugging symbols, bloat from runtime performance optimizations, missing performance optimizations, missing features stemming from legal issues or features that are undesirable (e.g. KDE's semantic desktop), absent security features (e.g. PIC to enable the kernel to implement ASLR, SSP, etcetera) and/or slowdowns from present security features. Avoiding all issues on this list is logically impossible and the reality of that has led binary distributions to divide big packages into many little ones, at the expense of ease of use.
3. Outdated software to avoid ABI incompatibility issues or for lack of testing. This mandates new software be provided through distribution upgrades, which rely on run-once upgrade scripts that suffer from quality problems due to limited testing and that will often leave a system in an inconsistent state upon failure.
4. Limited support for various architectures (e.g. try Ubuntu on SPARC, PPC, IA64). The support that is available results in a significant duplication of effort.
5. Liberal patching of packages to introduce fixes for software flaws or to provide distribution unique functionality, which can often introduce issues upstream never would. This also requires laborious vulnerability tracking to determine when things need to be backported and this often misses security vulnerabilities. A researcher recently made a tool to help automate this practice, but no tool is perfect and it is not yet certain that it will fix things.
Numbers 1, 2 and 3 often cause advanced users to compile their own software. In those situations, advanced users will often do "wget url/to/tarball; tar -xvzf ./tarball; cd tarball directory; ./configure; make; make install". That can work in the short term, but it is often the source of problems:
End-users who manually compile software must do dependency management themselves and it is easy to horribly break things.
Tools meant to address dependency issues when compiling your own packages introduce new problems, which is presumably why they are not recommended by various distributions. For instance, if you upgrade python from 2.6 to 2.7 on Ubuntu 10.04 using checkinstall and name the resulting package "python", dependency conflicts will disable apt-get.
Upgrades can break things long after the installation due to either API incompatibility or poorly handled file collisions.
Attempts to remove obsolete packages will often break things outside of the package manager that depend on them or potentially inside of the package manager if the user upgraded something in a way that did not inform the package manager of its dependencies.
Security vulnerabilities will usually go unpatched in self-compiled software because end-users have little direct incentive to update them manually, even if they installed them properly.
Even when those problems are avoided, there is still the problem of root directory pollution because things are no longer being managed by the package manager, which can lead to issues stated in #3 in my compromise list and unfortunately, users in such situations receive little help. Such problems are always "their fault", not the distribution's fault. That is despite the fact that the design compromises taken by the distribution put them in a situation where they felt the need to do such things, and I imagine that it is very alienating to many new users.
Gentoo makes a few compromises of its own, but in doing that, it resolves all of the issues that stem from compromises that other distributions make. Here is how Gentoo avoids issues stemming from each of the compromises I listed binary distributions as having:
1. Gentoo's package manager introduces an abstraction known as an ebuild. Ebuilds are wrappers for whatever upstream provides, be it a tarball, makeself archive or repository (Git, SVN, CVS, Mercurial, etcetera). They are rather succinct and once one is written, a very small team can maintain a great number of them.
2. The abstractions provided by Gentoo's package manager provide defaults that users can override to make nearly all decisions that binary distributions make for them, but with far less effort. This avoids the need to break packages into smaller ones, making it significantly easier for users to find what they want, although it is not perfect. Make-specific options are abstracted into a global make file (with the option to create per-package env files) and build options are abstracted into system-wide and per-package USE flags. USE flags can be configured in ways that prevent the package manager from handling upgrades gracefully (e.g. a specific USE flag is needed to resolve dependencies, but it is set for package version 1.0 and then version 1.0.1 becomes available). Experience at system administration enables users to avoid such problems while the corresponding issues in binary distributions have no resolution.
3. Gentoo makes it easy to resolve ABI incompatibility through recompilation, which enables software to always be up to date or close to it. It even provides a tool called revdep-rebuild, which can automate detection of ABI incompatibility issues. It is theoretically possible for the package manager to be modified in a way that permits ebuild maintainers to provide information that should enable the package manager to detect such situations beforehand, but the Gentoo developers are still considering proposals to do that.
4. The design of Gentoo makes it significantly easier to support multiple architectures due to the deduplication of effort provided by ebuild files.
5. Gentoo attempts to stay close to upstream, which makes backporting functionality largely unnecessary and the patches that are done are always sent to upstream. Usually, the patches that are done are to fix issues involved in supporting multiple architectures and/or security features. Security vulnerabilities that upstream fixed years ago are incredibly rare because Gentoo stays close to upstream.
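To make the global make file and the USE flags mentioned in the list above more concrete, here is a minimal sketch. It only writes example settings into a scratch directory so they can be looked at; nothing touches the live configuration. The package app-misc/example is hypothetical, the flag choices simply mirror the Qt versus GTK+ and GnuTLS versus OpenSSL examples from earlier, and the real location of make.conf has varied between Portage versions.

```shell
#!/bin/sh
# Sketch only: writes example Portage settings into a scratch directory
# so they can be reviewed before anything is copied into /etc.
# app-misc/example is a hypothetical package; flag names vary per package.

scratch="$(mktemp -d)"
mkdir -p "$scratch/portage"

# Global settings (make.conf): compiler flags, make options, system-wide USE flags.
cat > "$scratch/make.conf" <<'EOF'
CFLAGS="-O2 -pipe"
CXXFLAGS="${CFLAGS}"
MAKEOPTS="-j2"
USE="gtk -qt4"
EOF

# Per-package USE flags (package.use): one "category/package flags..." entry per line.
cat > "$scratch/portage/package.use" <<'EOF'
app-misc/example gnutls -ssl
EOF

echo "Example settings written to $scratch"
```

A leading minus disables a flag, so the hypothetical package above would be built against GnuTLS and not OpenSSL, while everything else follows the system-wide choice of GTK+ over Qt.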
With that said, Gentoo's package manager should enable work done on other architectures to much more easily translate into support for the Raspberry Pi than it would on a binary distribution. It should also be flexible enough to produce binaries for each individual's needs, so there is little need for fragmentation.
The primary focus of the Raspberry Pi is in education, which is an area in which Gentoo excels by making it easy to modify the system source code within the safety net provided by its package manager. You simply need to do "ebuild $(equery which package-category/package-name) unpack", make changes and then do "ebuild $(equery which package-category/package-name) merge". For something more permanent, you can create a local overlay (e.g. /usr/local/portage) for a place to store permanent changes. That only needs to be done once and then you can follow a simple procedure for all packages, which is as follows:
1. Copy the package's subdirectory to there from /usr/portage (possibly deleting things like older versions so you don't duplicate them and the changelog so you don't have useless files)
2. Place the patch in the files subdirectory
3. Modify the ebuild to apply the patch
4. Update the manifest file (i.e. "ebuild /path/to/modified/ebuild digest")
5. Run "emerge --oneshot updated-package" to install the update.
Note that in #5, the flag --oneshot is optional, but generally recommended. Portage makes a distinction between things you actually want on the system and the things that you need to have them. Everything you specify to emerge is automatically assumed to be something you want unless you specify --oneshot (or -1, which is the short version). That distinction enables obsolete packages to be removed through "emerge --depclean --ask", which is a difficult task on other distributions.
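The five-step overlay procedure above can be sketched as a small script. This is a dry run that only prints the commands it would execute, and it makes several assumptions that are not from the procedure itself: the package app-misc/example, version 1.0 and the patch ./fix.patch are hypothetical, and the overlay at /usr/local/portage is assumed to be set up already.

```shell
#!/bin/sh
# Dry-run sketch of the overlay workflow; it only prints the commands.
# Hypothetical inputs: package app-misc/example, version 1.0, patch ./fix.patch.

PORTDIR=/usr/portage          # main tree, as named in the post
OVERLAY=/usr/local/portage    # local overlay, assumed to be configured already
DRY_RUN=1                     # change to 0 to actually execute the commands

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

patch_package() {
    atom="$1"; version="$2"; patch_file="$3"
    category="${atom%/*}"
    name="${atom#*/}"
    ebuild_file="$OVERLAY/$atom/$name-$version.ebuild"

    # Step 1: copy the package directory into the overlay.
    run mkdir -p "$OVERLAY/$category"
    run cp -r "$PORTDIR/$atom" "$OVERLAY/$category/"
    # Step 2: place the patch in the files subdirectory.
    run mkdir -p "$OVERLAY/$atom/files"
    run cp "$patch_file" "$OVERLAY/$atom/files/"
    # Step 3 is manual: edit the ebuild so that it applies the patch.
    # A real (non dry-run) session would have to pause here for that edit.
    echo "# edit $ebuild_file to apply $(basename "$patch_file")"
    # Step 4: regenerate the manifest.
    run ebuild "$ebuild_file" digest
    # Step 5: install without recording the package as explicitly wanted.
    run emerge --oneshot "$atom"
}

patch_package app-misc/example 1.0 ./fix.patch
```

Printing the commands first is a deliberate choice: it lets you check the paths before anything is copied, and step 3 is a manual edit in any case.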
The only downside to patching software on a Gentoo system is that you need to modify the ebuilds every time a new version is available unless you explicitly tell Gentoo to use the older version, but it isn't very hard to keep your patches alive for newer versions. Gentoo will present a color-coded list of packages each time you update, so it is rather easy to watch for the ones you customized and update the ebuild to reflect your changes. Most of the time, you just need to rename the ebuild and update the manifest file. The tactic of staying at an older version is also possible, but for packages whose dependencies change rapidly, you will need to move dependencies into your overlay as time passes.
It is rather easy to create new ebuilds for Gentoo when existing ones do not exist and doing this enables people to avoid the issues that often occur when they compile software themselves. In my opinion, such issues serve as a barrier to things that are of real educational value, so avoiding them is a good thing.
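To give an idea of how little a new ebuild can contain, here is a minimal sketch. Everything specific in it is an assumption for illustration: the package name, the example.org URLs, the license and the choice of EAPI 4. It also assumes that upstream ships a tarball that builds with the usual ./configure and make steps, in which case the package manager's default build phases should be enough and no functions need to be written.

```shell
# Hypothetical file: app-misc/example/example-1.0.ebuild
# Assumes upstream ships a tarball that builds with ./configure && make,
# so the default build phases need no overrides.

EAPI=4

DESCRIPTION="Example tool packaged for illustration"
HOMEPAGE="http://www.example.org/"
# ${P} is set by Portage to name-version, here example-1.0.
SRC_URI="http://www.example.org/dist/${P}.tar.gz"

LICENSE="GPL-2"
SLOT="0"
# Testing keyword for ARM, the Raspberry Pi's architecture.
KEYWORDS="~arm"
IUSE=""
```

Placed in a local overlay and given a manifest, a file like this is what would let the package manager track the software in place of a hand-run "make install".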
With that said, there are two main compromises Gentoo makes:
1. It has no release schedule.
2. All software is typically compiled from source.
These are made to avoid the issues derived from compromises made by other distributions. In my opinion, these compromises are more natural for open source software. With that said, Gentoo does make attempts to mitigate them.
Issue #1 is only an issue because organizations rely on release schedules to coordinate regression testing and quality assurance. Gentoo does internal quality assurance via its tinderbox efforts, which attempt to automate the discovery of problems before users encounter them by compiling all software available for Gentoo repeatedly and detecting build failures. Those efforts have led to the discovery of numerous issues, many of which have resulted in patches sent to upstream. Gentoo also marks packages as either stable, testing or masked. Gentoo will only install stable packages by default. Software updates must wait approximately 30 days before entering the stable tree with the only exception being security updates. Those updates that do enter the stable tree enter only after manual review. Users who are willing can upgrade specific packages or the entire system to the testing tree versions and they tend to detect issues that the tinderbox efforts cannot. It is practically always possible for them to recover from issues they encounter and they tend to catch issues before they reach the stable tree.
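As a rough sketch of how a user opts into the testing tree, the script below writes the two kinds of setting into a scratch directory for inspection. The package app-misc/example is hypothetical, "arm" is used as the architecture keyword because the Raspberry Pi is an ARM board, and the per-package file name is an assumption about the Portage version in use (older releases used the name package.keywords).

```shell
#!/bin/sh
# Sketch only: writes example keyword settings to a scratch directory.
# app-misc/example is a hypothetical package; "arm" is the ARM keyword.
scratch="$(mktemp -d)"

# Whole system on the testing tree: one line in make.conf.
echo 'ACCEPT_KEYWORDS="~arm"' > "$scratch/make.conf.testing"

# Only one package from the testing tree: a per-package entry instead.
echo 'app-misc/example ~arm' > "$scratch/package.accept_keywords"

echo "Example keyword settings written to $scratch"
```

The tilde marks the testing keyword; without either setting, only packages marked stable for the architecture are installed.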
Third party organizations can implement their own fixed release schedule and regression testing practices by taking a snapshot of its tree and maintaining it independently of the main tree. This is often done by forks of Gentoo. Some examples include Funtoo Linux, Sabayon Linux and Gentoo Prefix. It is usually fairly easy to switch a fork to the main tree, which basically converts it into a Gentoo Linux installation. The only real exceptions to this are LiveCD based derivatives of Gentoo (e.g. Pentoo Linux, System Rescue CD, etcetera) and Gentoo Prefix. Gentoo Prefix is somewhat special in that it is an official fork of Gentoo. It permits the installation of Gentoo on other operating systems, including Solaris and Microsoft Windows NT, although it can be forked by others should they desire it.
Gentoo attempts to address issue #2 in the following ways:
Gentoo's package manager supports ccache, which enables rapid recompilation of packages.
Gentoo supports distributed compilation via distcc, which can partially
Gentoo Linux supports compilation in a tmpfs, which can increase compilation speed while reducing disk IO. The feature is more a consequence of the design of the Linux kernel than anything Gentoo Linux did, but it works and people do it all the time.
Gentoo has partial support for binary packages, although it is not widely known. However, there are three distinct types of binary packages on Gentoo and two of them are significantly different from their counterparts on binary distributions.
The various types of binary packages are as follows:
Ebuilds for binary packages exist when sources from upstream are unavailable. Some popular software that falls into this category are Google Chrome, Opera, the Nvidia binary driver, the Oracle JVM, VMWare Player and various video games. Packages in this category that involve Linux kernel modules do not integrate well into Gentoo or Linux in general, but others tend to work well. They usually rely on bundled libraries that would be unnecessary bloat had sources been available.
Gentoo's developers provide precompiled versions of open source software for older systems when the time taken to compile them is considered to be excessive. Open Office and Libre Office are in this category. Packages in this category are targeted to Gentoo stable and rarely work on both Gentoo testing and Gentoo stable due to API incompatibilities. They also don't integrate well with USE flags, although they are entirely optional and with the removal of chromium-bin, their numbers are dwindling.
Gentoo's package manager can generate binary packages from ebuilds and avoid software compilation by making use of them. It supports the specification of a list of BINHOST servers and it will use them similarly to how apt-get uses .deb files or yum uses .rpm files, but there are some significant differences, the full extent of which I have not explored. The main difference is that the package manager will refuse to install binary packages unless they appear to be exactly what it would have produced had it compiled them. I have not fully explored how strict it is, but the areas where there can be variation are different global settings in make.conf (e.g. CFLAGS, CXXFLAGS, LDFLAGS, global USE flags, portage FEATURE settings, etcetera), different per-package settings in /etc/portage/, a different system profile (which affects USE flags, FEATURE flags, LDFLAGS, which ebuilds are available, etcetera), different ebuilds and the system compiler. You can always mimic a Gentoo installation in a chroot directory on a faster system for the purpose of generating binary packages for installation through this feature. Others just physically remove the hard drive, install it in a faster computer and then chroot into it, which is the same thing without using this feature.
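The build-time mitigations and the binary package feature described above can be sketched together. The script writes an example make.conf fragment to a scratch file and prints the related commands without running them. The cache size, job count, tmpfs size, the host binhost.example.org and the package app-misc/example are all assumptions for illustration, and distcc additionally needs its helper hosts configured, which is not shown.

```shell
#!/bin/sh
# Sketch only: example settings go to a scratch file for review,
# not into the live make.conf. Sizes, job count, host and package
# names below are illustrative assumptions.

fragment="$(mktemp)"
cat > "$fragment" <<'EOF'
# ccache reuses earlier compiler output; distcc spreads jobs over helpers;
# buildpkg keeps a binary package of everything that gets built.
FEATURES="ccache distcc buildpkg"
CCACHE_SIZE="2G"
MAKEOPTS="-j4"
# Hypothetical server offering binary packages built with matching settings.
PORTAGE_BINHOST="http://binhost.example.org/packages"
EOF

echo "Review $fragment, then merge the lines into make.conf by hand."
# Building in RAM: mount a tmpfs over Portage's default build directory.
echo "tmpfs build dir:  mount -t tmpfs -o size=1G tmpfs /var/tmp/portage"
# On the fast machine or chroot: compile and keep the binary package.
echo "build host:       emerge --buildpkg app-misc/example"
# On the slow machine: prefer binary packages, fetching them from the BINHOST.
echo "target:           emerge --usepkg --getbinpkg app-misc/example"
```

Because the package manager checks that a binary package matches what it would have built itself, the build host or chroot needs the same make.conf, profile and per-package settings as the target for this to pay off.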
With that said, I highly recommend that the Raspberry Pi community standardize on Gentoo. It is the only distribution flexible enough to minimize duplication of work while permitting everyone to get what they want out of it. *BSD fans would probably also appreciate Gentoo/FreeBSD, which replaces the GNU/Linux components with FreeBSD's counterparts. Interest in it is low, but with some work, it probably could support the Raspberry Pi, provided that FreeBSD does, while minimizing the duplication of effort involved in supporting multiple kernels. Anyone that wants to try Gentoo without setting up a computer from scratch can install Gentoo Prefix, which runs on most operating systems available, including Microsoft Windows NT provided Interix (i.e. Windows Services for UNIX) is installed.
I doubt that the Raspberry Pi will standardize on any single distribution, but I highly encourage people to consider what I had to say about standardization. I believe that I identified some important issues and that if we really care about education, it would be better for us to talk about ways of resolving them than to fight the distribution wars. Fighting among ourselves would only serve to alienate students and that will hurt everyone.