Discussion:
Summary of the Arm ports BoF at DC18
Steve McIntyre
2018-09-20 08:24:02 UTC
[ Please note the cross-post and Reply-To ]

Hi folks,

As promised, here's a quick summary of what was discussed at the Arm
ports BoF session in Hsinchu. Apologies for the delay in posting...

Thanks to the awesome efforts of our video team, the session is
already online [1]. I've taken a copy of the Gobby notes too,
alongside my small set of slides for the session. [2]

Ports update
============

arm64

* Current port, first released with Jessie
* Working well
* More devices available now, more coming
* Real server hardware!
* Simple choice of kernel with DTB and/or ACPI

armhf

* Current port, first released with Wheezy
* Hard-float ABI, v7, VFPv3-D16
+ Standard for 32-bit Arm Linux distros, so we should have binary
compatibility with other distros
* armmp kernel & DTBs
+ Potentially massive set of supported devices
* UEFI is becoming more of a thing on armhf, and will make it easier
to run armhf VMs on top of arm64 hardware

armel

* Current port, first released with Lenny
* Soft-float ABI, used to be v4t, now v5te
* I'd been planning to drop from testing/release, but others have
stepped up to keep armel running
* Still supported upstream for most languages, but it can also be a
problem where newer language runtimes expect v7 as a minimum
baseline. Fixable, but takes effort. People will need to keep
working on this. We'll need to manage expectations as to what is
supported/supportable in armel. (There's a small sketch of the
typical toolchain-level issue just after this list.)
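
As a purely illustrative sketch of that baseline issue (hypothetical
code, not a specific failing package): C11 atomics are one of the
usual culprits. On ARMv7 the compiler can inline them using
ldrex/strex, but on the armel baseline (v5te) it has to fall back to
library calls and kernel-provided helpers, which newer language
runtimes don't always expect to need:

  /* atomic_demo.c - hypothetical example of the "v7 as minimum
   * baseline" problem at the toolchain level; details vary by
   * compiler version.
   *
   * On armhf (ARMv7) gcc can inline this with ldrexd/strexd. On
   * armel (ARMv5TE) those instructions don't exist, so gcc emits
   * calls to __atomic_* library routines instead and the link needs
   * -latomic.  Runtimes that assume atomics are always inline and
   * lock-free therefore need extra porting work on armel.
   */
  #include <stdatomic.h>
  #include <stdio.h>

  static _Atomic long long counter;   /* 64-bit: never inline on v5te */

  int main(void)
  {
      atomic_fetch_add(&counter, 1);
      printf("counter = %lld\n", (long long)atomic_load(&counter));
      return 0;
  }

  /* armhf: gcc -O2 atomic_demo.c            (inlined)
   * armel: gcc -O2 atomic_demo.c -latomic   (library calls)
   */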

Buildds and hardware
====================

* Existing Marvell Armada XP hardware we're using for armel/armhf is
not great in terms of supportability, e.g. needs a manual button
press to reboot after power failure. The machines are quite fast
and powerful, but they're still development boards and are missing
features we'd like such as support for more than one disk to do
RAID.

* There's a wider choice of arm64 machines available, e.g. Seattle,
X-Gene, ThunderX, Macchiatobin, Centriq, Synquacer, to suit a range
of budgets and requirements. I'm hoping that we'll soon see a
commercially successful, readily available arm64 server machine
that we'll be able to just buy off the shelf...

* DSA would like to get rid of the Marvell machines, replacing them
with more manageable server machines. Supporting dev boards is hard
work. AFAICS there are not any worthwhile 32-bit Arm server
machines that we could sensibly use as buildds. Hence, we'll need
to start switching to using arm64 machines as buildds for our
32-bit ports too. There was already some discussion about this on
the mailing list - see the thread around [3]

Building 32-bit software on arm64
=================================

There are some issues here :-(

* Some arm64 machines won't run the A32 ISA; server vendors are often
targeting 64-bit only and cramming more cores into a CPU by
leaving out the 32-bit support

* We've started using an arm64 machine (arm-arm-01, a Seattle-based
machine which includes A32 support) to build armhf. Unfortunately,
there have been some problems detected already:

+ Alignment - Arm CPUs care about alignment, but we've been
running the 32-bit buildds with kernel support enabled for
fixing up user-space alignment problems in software. No such
option exists in the arm64 kernel, so we get SIGBUS errors
instead. (A minimal sketch of this class of bug follows just
after this list.)

+ #define mixups - glibc fails its testsuite due to a mismatch
in the 32- and 64-bit definitions of MINSIGSTKSZ. A patch has
already been developed by the Arm kernel team, but it's not
upstream yet.

+ It seems that our armhf Haskell binaries are mis-targeted
(ARMv6, not ARMv7?), so as well as needing lots of alignment
fixups they're also using old-fashioned primitives for
barriers. Looks like it will need fixing up and completely
rebuilding...

+ I'm doing rebuilds of the armhf *and* armel archive to check if
there are any other problems to be found. [UPDATE: very nearly
finished, expect results and analysis very shortly...]

+ Possible that we could mask some of the building problems using
32-bit VMs running on the arm64 machines, but that won't solve
the problems properly - these binaries still won't run properly
on arm64 machines like they should. Much better to fix things
like alignment issues properly, IMHO.

+ Fixing alignment problems also helps other ports like m68k and
sparc, which have long had the same issues. They only exist
because developers can get away with invalid assumptions on x86
where the hardware will fix things at runtime. Even there,
proper alignment will typically improve performance.
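
To make the alignment problem concrete, here's a minimal sketch
(hypothetical code, not one of the packages that actually failed) of
the class of bug the rebuilds keep finding:

  /* align_demo.c - illustration only.
   *
   * Casting a misaligned pointer to a wider type is undefined
   * behaviour in C, but it "works" on x86 because the hardware fixes
   * it up at runtime.  On 32-bit Arm the compiler may emit ldrd/strd
   * or VFP loads/stores for the 64-bit access, and those require
   * proper alignment: a 32-bit kernel with user-space fixups enabled
   * (echo 2 or 3 into /proc/cpu/alignment) silently emulates the
   * access, but the arm64 kernel has no such fixup for 32-bit
   * processes, so the same binary dies with SIGBUS on an arm64
   * buildd.
   */
  #include <stdint.h>
  #include <string.h>
  #include <stdio.h>

  int main(void)
  {
      unsigned char buf[16] = {0};
      uint64_t *p = (uint64_t *)(buf + 1);   /* deliberately misaligned */

      *p = 0x1122334455667788ULL;            /* may become strd/vstr */
      printf("direct read: %llx\n", (unsigned long long)*p);

      /* The portable fix: use memcpy (or packed accessors), which the
       * compiler turns into accesses that are safe on any alignment. */
      uint64_t v;
      memcpy(&v, buf + 1, sizeof(v));
      printf("memcpy read: %llx\n", (unsigned long long)v);
      return 0;
  }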

Discussion
==========

* arm64 hardware for reproducible builds? Use reproducible builds to
systematically detect arm64 vs. armhf issues

* Ubuntu are also building armhf on arm64 and are seeing the same
alignment problems we are. Should we disable the alignment fixups
on the armhf buildds? Probably, to get consistent behaviour. Even
better, turn on reporting of alignment fixups so we can find and
file bugs (see the sketch at the end of this list). We should also
do this for autopkgtest to find more alignment faults at runtime.

* I'm expecting to pick up several Synquacer machines (24-core Cortex
A53) to use for Debian, donated by Linaro. Some will become
buildds, and I want to get more to use for autopkgtest, debian-ci,
reproducible builds etc. [UPDATE: 3 of these are now in Vancouver,
ready to set up as buildds]

* When will people be able to get hold of arm64 server machines
readily? It's a depressing story - lots of vendors in theory, but
not much commercial success so far :-(

* We've had arm64 openstack images for a while now.

* Lots of the large cloud-based services (e.g. Travis) are starting
to support arm64 too. It's an ongoing process.

* Vagrant has been experimenting with newer U-Boot versions which
include the new UEFI support. This should make installation easier
on some arm64 and armhf devices. There are possible issues with
devicetree here - there was discussion about problems with the
devicetree moving target, and hence the move to ACPI for some
devices like arm64 servers. Standards like SBBR and EBBR are meant
to help here, defining lowest common standards for how systems
should work.

* Is armel going to continue? Are the buildd support issues the only
blocking problems? To clarify - the new arm64 builders for armhf
*should* also work for armel too. Wait for the rebuild results to
see how that works; I'd expect it to be OK. We apparently have
offers of donations to help keep armel alive. Let's see!
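
For reference, the kernel knob being discussed above is
/proc/cpu/alignment, which only exists on 32-bit Arm kernels. The
value written is a bitmask for handling user-space alignment faults:
1 = warn (log), 2 = fix up, 4 = send SIGBUS. A rough sketch of
switching a buildd or autopkgtest host from silent fixups (2) to
reported ones (3) - normally this would just be a one-line write from
an init script, the C form is only for illustration:

  /* alignment_mode.c - hypothetical helper, needs root.
   *
   * Writing 3 gives warn + fix up, so faulting packages show up in
   * dmesg and can have bugs filed against them; writing 5 would give
   * warn + SIGBUS, closer to what 32-bit processes get on an arm64
   * kernel.
   */
  #include <stdio.h>

  int main(void)
  {
      FILE *f = fopen("/proc/cpu/alignment", "w");
      if (!f) {
          perror("/proc/cpu/alignment");   /* e.g. on an arm64 kernel */
          return 1;
      }
      fputs("3\n", f);
      fclose(f);
      return 0;
  }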


[1] http://meetings-archive.debian.net/pub/debian-meetings/2018/DebConf18/2018-08-04/arm-ports-bof.webm
[2] https://www.einval.com/~steve/talks/Debconf18-arm-BoF/
[3] https://lists.debian.org/debian-arm/2018/06/msg00062.html
--
Steve McIntyre, Cambridge, UK. ***@einval.com
"When C++ is your hammer, everything looks like a thumb." -- Steven M. Haflich
Paul Gevers
2018-09-20 19:49:45 UTC
Hi Steve,
Post by Steve McIntyre
* I'm expecting to pick up several Synquacer machines (24-core Cortex
A53) to use for Debian, donated by Linaro. Some will become
buildds, and I want to get more to use for autopkgtest, debian-ci,
reproducible builds etc. [UPDATE: 3 of these are now in Vancouver,
ready to set up as buildds]
Cool, very cool. Regarding autopkgtest/ci, do you already have any
(rudimentary) plans for how you want to handle this? E.g. should DSA
manage these machines, and does the infra get access? Are there other
possibilities to host these machines?

Paul
Steve McIntyre
2018-09-20 20:39:14 UTC
Post by Paul Gevers
Hi Steve,
Post by Steve McIntyre
* I'm expecting to pick up several Synquacer machines (24-core Cortex
A53) to use for Debian, donated by Linaro. Some will become
buildds, and I want to get more to use for autopkgtest, debian-ci,
reproducible builds etc. [UPDATE: 3 of these are now in Vancouver,
ready to set up as buildds]
Cool, very cool. Regarding autopkgtest/ci, do you already have any
(rudimentary) plans for how you want to handle this? E.g. should DSA
manage these machines, and does the infra get access? Are there other
possibilities to host these machines?
That's a good question. :-)

Right now I only have the first 3 machines, and I'm waiting for
another batch to get more. As soon as we see some more, let's see what
we can work out. I don't know if the DSA folks are interested in
hosting these, but I'm sure we can find a solution either way.
--
Steve McIntyre, Cambridge, UK. ***@einval.com
"Because heaters aren't purple!" -- Catherine Pitt
Steve McIntyre
2018-10-28 17:01:05 UTC
Post by Paul Gevers
Post by Steve McIntyre
* I'm expecting to pick up several Synquacer machines (24-core Cortex
A53) to use for Debian, donated by Linaro. Some will become
buildds, and I want to get more to use for autopkgtest, debian-ci,
reproducible builds etc. [UPDATE: 3 of these are now in Vancouver,
ready to set up as buildds]
Cool, very cool. Regarding autopkgtest/ci, do you already have any
(rudimentary) plans for how you want to handle this? E.g. should DSA
manage these machines, and does the infra get access? Are there other
possibilities to host these machines?
did you discuss getting some of these synquacer machines to vagrant for
reproducible builds testing?
Responding to both of you...

I now have 3 more basic Synquacer machines in my possession, ready to
order new cases, RAM, etc. (They ship in desktop cases with a single
1TB hard drive and 4 GiB of RAM.) Again, I'm hoping to pick more up in
the future, but supply is still limited.

Again, I'll reiterate - these machines are *not* fast for
single-threaded workloads, but they have a lot of cores so it's
perfectly reasonable to run lots of things in parallel. For my own
build testing, I've been running up to 6 sbuilds in parallel for
better throughput, since lots of our builds don't parallelise well
individually. They should also work well for multiple VMs running in
parallel.

So, practical questions...

Hardware setup: I've configured the earlier 3 as buildds in little 1U
cases with 32GiB RAM and mirrored SSDs, but for $reasons they're not
yet installed and running. I'm assuming that a similar spec would be
wanted for autopkgtest/ci and reproducible builds? If so, we'll need
to ask for approval for funds for that - it cost ~£750 per machine to
do it. I had offers of funds at DC18 which I'm about to chase to help.

Hosting: Talking to DSA, it seems they're not too keen on hosting /
managing new machines for these projects. What are current hosting
arrangements for you folks?
--
Steve McIntyre, Cambridge, UK. ***@einval.com
Who needs computer imagery when you've got Brian Blessed?
Antonio Terceiro
2018-10-28 21:00:47 UTC
Post by Steve McIntyre
Post by Paul Gevers
Post by Steve McIntyre
* I'm expecting to pick up several Synquacer machines (24-core Cortex
A53) to use for Debian, donated by Linaro. Some will become
buildds, and I want to get more to use for autopkgtest, debian-ci,
reproducible builds etc. [UPDATE: 3 of these are now in Vancouver,
ready to set up as buildds]
Cool, very cool. Regarding autopkgtest/ci, do you already have any
(rudimentary) plans for how you want to handle this? E.g. should DSA
manage these machines, and does the infra get access? Are there other
possibilities to host these machines?
did you discuss getting some of these synquacer machines to vagrant for
reproducible builds testing?
Responding to both of you...
First of all, thank you for going after this.
Post by Steve McIntyre
I now have 3 more basic Synquacer machines in my possession, ready to
order new cases, RAM, etc. (They ship in desktop cases with a single
1TB hard drive and 4 GiB of RAM.) Again, I'm hoping to pick more up in
the future, but supply is still limited.
Again, I'll reiterate - these machines are *not* fast for
single-threaded workloads, but they have a lot of cores so it's
perfectly reasonable to run lots of things in parallel. For my own
build testing, I've been running up to 6 sbuilds in parallel for
better throughput, since lots of our builds don't parallelise well
individually. They should also work well for multiple VMs running in
parallel.
For CI, in my experience the limiting factor is I/O, because a lot
of time is spent installing packages, so running too many jobs in
parallel doesn't really help if the storage can't keep up.
Post by Steve McIntyre
So, practical questions...
Hardware setup: I've configured the earlier 3 as buildds in little 1U
cases with 32GiB RAM and mirrored SSDs, but for $reasons they're not
yet installed and running. I'm assuming that a similar spec would be
wanted for autopkgtest/ci and reproducible builds?
Yes
Post by Steve McIntyre
If so, we'll need to ask for approval for funds for that - it cost
~£750 per machine to do it. I had offers of funds at DC18 which I'm
about to chase to help.
Let me know if there is something I can do to help here.
Post by Steve McIntyre
Hosting: Talking to DSA, it seems they're not too keen on hosting /
managing new machines for these projects. What are current hosting
arrangements for you folks?
Currently all the amd64 CI nodes are VMs on Amazon EC2; there is no
arrangement for hosting actual hardware.
Holger Levsen
2018-10-29 17:20:56 UTC
Post by Steve McIntyre
I now have 3 more basic Synquacer machines in my possession, ready to
order new cases, RAM, etc. (They ship in desktop cases with a single
1TB hard drive and 4 GiB of RAM.) Again, I'm hoping to pick more up in
the future, but supply is still limited.
very nice!
Post by Steve McIntyre
Again, I'll reiterate - these machines are *not* fast for
single-threaded workloads, but they have a lot of cores so it's
perfectly reasonable to run lots of things in parallel. For my own
build testing, I've been running up to 6 sbuilds in parallel for
better throughput, since lots of our builds don't parallelise well
individually. They should also work well for multiple VMs running in
parallel.
6 sbuilds with how much RAM?
Post by Steve McIntyre
So, practical questions...
Hardware setup: I've configured the earlier 3 as buildds in little 1U
cases with 32GiB RAM and mirrored SSDs, but for $reasons they're not
yet installed and running. I'm assuming that a similar spec would be
wanted for autopkgtest/ci and reproducible builds?
yes. 32GB RAM sounds reasonable. 100GB, or better 200GB, of SSD is fine for us.
Post by Steve McIntyre
If so, we'll need
to ask for approval for funds for that - it cost ~£750 per machine to
do it. I had offers of funds at DC18 which I'm about to chase to help.
cool, thanks! (for everything! :)
Post by Steve McIntyre
Hosting: Talking to DSA, it seems they're not too keen on hosting /
managing new machines for these projects. What are current hosting
arrangements for you folks?
we have a zoo of armhf/arm64 machines in Vagrant's basement, 8 arm64
sleds at codethink and lots of virtual machines in the profitbricks
cloud. we maintain them ourselves.
--
cheers,
Holger

-------------------------------------------------------------------------------
holger@(debian|reproducible-builds|layer-acht).org
PGP fingerprint: B8BF 5413 7B09 D35C F026 FE9D 091A B856 069A AA1C
Adrian Bunk
2018-11-11 18:40:23 UTC
Post by Steve McIntyre
...
Hardware setup: I've configured the earlier 3 as buildds in little 1U
cases with 32GiB RAM and mirrored SSDs, but for $reasons they're not
yet installed and running. I'm assuming that a similar spec would be
wanted for autopkgtest/ci
...
One thing that might be worth considering would be using one or more of
the old Marvell buildds for armhf autopkgtest.

Otherwise it will become pretty hard to keep a non-NEON armhf baseline
for bullseye when these are no longer visible on the buildds.

I am aware of the problems with the old hardware and that this would
not be a long-term solution, but for some time it could help with that.
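
To spell out the kind of breakage that only shows up on non-NEON
hardware (my own illustration, not a specific package): the armhf
baseline is VFPv3-D16 without NEON, so anything that gets built with
-mfpu=neon, e.g. via build-system autodetection, runs fine on the
newer boxes but dies with SIGILL on baseline machines like the old
Marvell buildds:

  /* neon_demo.c - hypothetical illustration.
   * Builds only with something like: gcc -O2 -mfpu=neon neon_demo.c
   * Runs on NEON-capable CPUs; raises SIGILL (illegal instruction)
   * on non-NEON armhf hardware, which is exactly the class of
   * baseline violation that keeping such machines around catches.
   */
  #include <arm_neon.h>
  #include <stdio.h>

  int main(void)
  {
      float32x4_t a = vdupq_n_f32(1.0f);
      float32x4_t b = vdupq_n_f32(2.0f);
      float32x4_t c = vaddq_f32(a, b);      /* NEON vadd.f32 */
      printf("lane 0 = %f\n", (double)vgetq_lane_f32(c, 0));
      return 0;
  }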

cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed