It’s been two weeks since my last update, when the MAAS 2.3 beta 2 release was announced! Since then, the MAAS team has split its time between participating in internal events (and recovering from travel) and continuing to focus on stabilizing MAAS 2.3. As such, I’m happy to provide the updates from the past couple of weeks.
MAAS 2.3
In the past couple of weeks the team has been focused on stabilizing MAAS 2.3 and has fixed the following issues:
LP: #1705594 [2.2, HA] rackd errors after fresh install
LP: #1722848 [2.3, HWTv2] Memtester test is not robust
LP: #1724677 [2.x] [critical] TFTP back-end failed right after node repeatedly requests same file via tftp
LP: #1727073 [2.3, HA] rackd — 12% connected to region controllers.
LP: #1696485 [2.2, HA] MAAS dhcp does not offer up multiple domains to search
LP: #1696661 [2.2, HA] MAAS should offer multiple DNS servers in HA case
LP: #1721268 [2.3, UI, HWTv2] Metrics table (e.g. from fio test) is not padded to MAAS’ standard
LP: #1721886 [2.3, UI, HWTv2] Hardware Test tab doesn’t auto-update
LP: #1722671 [2.3, pod] Unable to delete a machine or a pod if the pod no longer exists
LP: #1724235 [2.3, HWTv2] Aborted test should not show as failure
LP: #1726865 [snap,2.3beta3] maas init uses the default gateway in the default region URL
LP: #1724904 Changing PXE lease in DHCP snippets global sections does not work
LP: #1680819 [2.x, UI] Tooltips go off screen
MAAS 2.4
I’m happy to announce that the roadmap for MAAS 2.4 has now been defined, and it is targeted for April 2018. However, I’ll keep a bit of suspense, as we will announce the upcoming features once MAAS 2.3 final has been released. Stay tuned!
This past week, the MAAS team met face to face in NYC! The week was concentrated on finalizing the improvements that users will be able to enjoy in MAAS 2.3 and on preparing for the first beta release. While MAAS 2.3.0 beta 1 will be announced separately, we wanted to bring you an update on the work the team has been doing over the past couple of weeks.
MAAS 2.3 (current development release)
Hardware Testing Phase 2
Backend work to support the new UX changes for hardware testing. This includes websockets handlers, directives and triggers.
UI – Add ability to upload custom hardware tests scripts.
UI – Update the machine listing page to show hardware status. This shows the status of hardware testing, whether pending, running, failed, degraded, or timed out.
UI – Implement new designs for Hardware Testing:
Add cards (new design) on node details pages that include metrics (if tests have been run) and hardware test information.
Add a new Hardware Test tab that better surfaces the status of hardware tests per component.
Add a more detailed log view of hardware test results.
Surface hardware test results per storage device on each of the block devices (on the machines details page).
Add ability to view all test results performed on each of the components over time.
Add actions to the switch listing page (still under a feature flag).
Fetch Wedge 100 switch metadata using the FRUID API endpoint on the BMC.
UI – Add websockets and triggers to support the UI changes for switches.
UI – Update the UI to display the vendor and model on the switch listing page (behind feature flag)
Add a DHCP status column on the ‘Subnets’ tab.
Add architecture filters
Implement a new design for the node details page:
Consolidate the Summary tab for machines, devices, controllers, and switches into cards.
Add a new Settings tab, combined with the Power tab to allow editing different components of machines, devices, controllers, etc.
Consolidate commissioning output and installation logs in a “Log” tab.
Update VLAN and Space details page to no longer allow inline editing.
Update VLAN page to include the IP ranges tables.
Convert the Zones page into AngularJS (away from YUI).
Add warnings when changing a Subnet’s mode (Unmanaged or Managed).
Rack controller deployment
Add ability to deploy any machine as a rack controller via the API.
Add volume_groups, raids, cache_sets, and bcaches fields to the Machine API output.
#1711320 [2.3, UI] Can’t ‘Save changes’ and ‘Cancel’ on machine/device details page
#1696270 [2.3] Toggling Subnet from Managed to Unmanaged doesn’t warn the user that behavior changes
#1717287 maas-enlist doesn’t work when provided with serverurl with IPv6 address
#1718209 PXE configuration for dhcpv6 is wrong
#1718270 [2.3] MAAS improperly determines the version of some installs
#1718686 [2.3, master] Machine lists shows green checks on components even when no tests have been run
#1507712 cli: maas logout causes KeyError for other profiles
#1684085 [2.x, Accessibility] Inconsistent save states for fabric/subnet/vlan/space editing
#1718294 [packaging] dpkg-reconfigure for region controller refers to an incorrect network topology assumption
We have improved libmaas to allow managing block devices and partitions.
Add ability to list machine’s block devices.
Add ability to update, create and delete block devices.
Add ability to list machine’s partitions.
Add ability to update, create and delete partitions.
Add ability to format/unformat partitions and block devices.
Add ability to mount/unmount partitions and block devices.
The release of a new version of libmaas will be announced separately.
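As a rough illustration of what these storage operations enable, here is a small Python sketch that works over API-shaped block device data. The dict fields (`name`, `size`, `partitions`) mirror the spirit of the API’s block device representation, but the exact field names here are an assumption for illustration, not verified against a released libmaas.

```python
# Hypothetical sketch: inspect a machine's block devices as returned by
# the MAAS API. Field names are illustrative assumptions.

def unpartitioned_devices(block_devices):
    """Return the names of block devices that have no partitions yet."""
    return [dev["name"] for dev in block_devices if not dev.get("partitions")]

def total_storage(block_devices):
    """Sum the size (in bytes) of all block devices on a machine."""
    return sum(dev["size"] for dev in block_devices)

devices = [
    {"name": "sda", "size": 500 * 10**9, "partitions": [{"path": "/dev/sda1"}]},
    {"name": "sdb", "size": 1000 * 10**9, "partitions": []},
]

print(unpartitioned_devices(devices))  # ['sdb']
print(total_storage(devices))          # 1500000000000
```

With helpers like these, a script could pick out the unformatted disks on a newly commissioned machine before creating partitions on them via the library.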
The MAAS team has been working on a new CLI that is based on (and uses) MAAS’ Python client library. The work that has been done includes:
Add ability to log in/log out via user and password.
Add ability to switch between profiles.
Add support for interactive login.
Add help command.
Ability to list nodes, machines, devices, controllers.
Ability to list all components in the networking model (subnets, vlans, spaces, fabrics).
Ability to obtain details on machines, devices and controllers.
Ability to obtain details on subnets, vlans, spaces, fabrics.
Ability to perform actions on machines (with the exception of testing and rescue mode).
Add ability to perform actions for multiple nodes
Add a ‘maas ssh’ command.
When listing, add support for automatic paging.
Add ability to view output in different formats (pretty, plain, json, yaml, csv).
Show progress indication on actions that are synchronous or blocking.
The release of the new CLI will be announced separately.
MAAS has now introduced an improved hardware testing framework. This new framework allows MAAS to test individual components of a single machine, as well as providing better feedback to the user for each of those tests. This feature has introduced:
Ability to define a custom testing script with a YAML definition – Each custom test can be defined with YAML that will provide information about the test. This information includes the script name, description, required packages, and other metadata about what information the script will gather. This information can then be displayed in the UI.
Ability to pass parameters – Adds the ability to pass specific parameters to the scripts. For example, in upcoming beta releases, users would be able to select which disks they want to test if they don’t want to test all disks.
Running tests individually – Improves how hardware tests are run per component. This allows MAAS to run tests against any individual component (such as a single disk).
Adding additional performance tests
Added a CPU performance test with 7z.
Added a storage performance test with fio.
Please note that individual results for each of the components are currently only available over the API. Upcoming beta releases will include various UI improvements that will allow the user to better surface and interact with these new features.
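Since per-component results are only exposed over the API for now, a consumer might group them by the component each script ran against. The sketch below does that over illustrative result dicts; the field names (`hardware_type`, `physical_blockdevice`) are assumptions about the payload shape, not a documented schema.

```python
# Hypothetical sketch: group hardware test results by the component they
# ran against. Result dicts and field names are illustrative assumptions.
from collections import defaultdict

def results_by_component(script_results):
    """Map each component (disk name or hardware type) to its test results."""
    grouped = defaultdict(list)
    for result in script_results:
        # Storage tests target a specific disk; otherwise fall back to
        # the broader hardware type (cpu, memory, ...).
        key = result.get("physical_blockdevice") or result["hardware_type"]
        grouped[key].append((result["name"], result["status"]))
    return dict(grouped)

results = [
    {"name": "smartctl-validate", "status": "Passed",
     "hardware_type": "storage", "physical_blockdevice": "sda"},
    {"name": "fio", "status": "Passed",
     "hardware_type": "storage", "physical_blockdevice": "sda"},
    {"name": "7z", "status": "Running",
     "hardware_type": "cpu", "physical_blockdevice": None},
]

print(results_by_component(results))
```

This is the kind of per-disk, per-CPU breakdown the upcoming UI work is meant to surface directly.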
Rack Controller Deployment in Whitebox Switches
MAAS now has the ability to install and configure a MAAS rack controller once a machine has been deployed. As of today, this feature is only available when MAAS detects that the machine is a whitebox switch. As such, all MAAS certified whitebox switches will be deployed with a MAAS rack controller. Currently certified switches include the Wedge 100 and the Wedge 40.
Please note that this feature makes use of the MAAS snap to configure the rack controller on the deployed machine. Since snap store mirrors are not yet available, the machine will require internet access to install the MAAS snap.
Improved DNS Reloading
This new release introduces various improvements to the DNS reload mechanism. This allows MAAS to be smarter about when to reload DNS after changes have been automatically detected or made.
UI – Controller Versions & Notifications
MAAS now surfaces the version of each running controller, and notifies users of any version mismatch between the region and rack controllers. This helps administrators identify mismatches when upgrading a multi-node MAAS cluster, such as an HA setup.
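The skew check itself boils down to comparing the version each controller reports. Here is a minimal sketch of that idea (the controller dicts are illustrative, not the actual API payload):

```python
# Minimal sketch of a version-skew check across controllers.
# Controller data below is illustrative, not a real MAAS API payload.

def version_mismatches(controllers):
    """Return the set of distinct versions if more than one is present,
    otherwise an empty set (no skew)."""
    versions = {c["version"] for c in controllers}
    return versions if len(versions) > 1 else set()

controllers = [
    {"hostname": "region-1", "version": "2.3.0"},
    {"hostname": "rack-1", "version": "2.3.0"},
    {"hostname": "rack-2", "version": "2.2.2"},
]

print(version_mismatches(controllers))  # e.g. {'2.3.0', '2.2.2'}
```

A non-empty result is what would drive a “version mismatch” notification to the administrator.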
Issues fixed in this release
#1702703 Cannot run maas-regiond without /bin/maas-rack
#1711414 [2.3, snap] Cannot delete a rack controller running from the snap
#1712450 [2.3] 500 error when uploading a new commissioning script
#1714273 [2.3, snap] Rack Controller from the snap fails to power manage on IPMI
#1715634 ‘tags machines’ takes 30+ seconds to respond with list of 9 nodes
#1676992 [2.2] Zesty ISO install fails on region controller due to postgresql not running
#1703035 MAAS should warn on version skew between controllers
#1708512 [2.3, UI] DNS and Description Labels misaligned on subnet details page
#1711700 [2.x] MAAS should avoid updating DNS if nothing changed
#1712422 [2.3] MAAS does not report form errors on script upload
#1712423 [2.3] 500 error when clicking the ‘Upload’ button with no script selected.
#1684094 [2.2.0rc2, UI, Subnets] Make the contextual menu language consistent across MAAS
#1688066 [2.2] VNC/SPICE graphical console for debugging purpose on libvirt pod created VMs
Hello MAASters! This is the development summary for the past couple of weeks:
MAAS 2.3 (current development release)
Hardware Testing Phase 2
Added parameters form for script parameters validation.
Accept and validate results from nodes.
Added hardware testing 7zip CPU benchmarking builtin script.
WIP – ability to send parameters to test scripts and process results of individual components. (e.g. will provide the ability for users to select which disk they want to test, and capture results accordingly)
WIP – disk benchmark test via Fio.
Network beaconing & better network discovery
MAAS controllers now send out beacon advertisements every 30 seconds, regardless of whether or not any solicitations were received.
Backend changes to automatically detect switches (during commissioning) and make use of the new switch model.
Introduce base infrastructure for NOS drivers, similar to the power management one.
Install the Rack Controller when deploying a supported Switch (Wedge 40, Wedge 100)
UI – Add a switch listing tab behind a feature flag.
Minor UI improvements
The version of MAAS installed on each controller is now reported on the controller details page.
Added ability to power on, power off, and query the power state of a machine.
Added PowerState enum to make it easy to check the current power state of a machine.
Added ability to reference the children and parent interfaces of an interface.
Added ability to reference the owner of a node.
Added base level `Node` object that `Machine`, `Device`, `RackController`, and `RegionController` extend from.
Added `as_machine`, `as_device`, `as_rack_controller`, and `as_region_controller` to the Node object. Allowing the ability to convert a `Node` into the type you need to perform an action on.
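The conversion methods above follow a simple pattern: the same underlying node can be re-viewed as the concrete type whose actions you need. The following is a simplified illustration of that pattern, not the libmaas source:

```python
# Simplified illustration of the Node hierarchy and as_* conversion
# pattern described above. This mirrors the idea, not the real libmaas code.

class Node:
    def __init__(self, system_id, data=None):
        self.system_id = system_id
        self.data = data or {}

    def as_machine(self):
        # Re-view the same node as a Machine so machine-only actions
        # (e.g. deploy) become available.
        return Machine(self.system_id, self.data)

    def as_device(self):
        return Device(self.system_id, self.data)

class Machine(Node):
    def deploy(self):
        return f"deploying {self.system_id}"

class Device(Node):
    pass

node = Node("abc123")
machine = node.as_machine()
print(machine.deploy())  # deploying abc123
```

This keeps listing code generic over `Node` while still letting callers reach type-specific operations when needed.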
LP: #1676992 – force Postgresql restart on maas-region-controller installation.
LP: #1708512 – Fix DNS & Description misalignment
LP: #1711714 – Add cloud-init reporting for deployed Ubuntu Core systems
LP: #1684094 – Make context menu language consistent for IP ranges.
LP: #1686246 – Fix docstring for set-storage-layout operation
MAAS 2.2.0 is currently available in the following MAAS team PPA.
Please note that MAAS 2.2 will replace the MAAS 2.1 series, which will go out of support. We are holding MAAS 2.2 in the above PPA for a week, to give users enough notice that it will replace the 2.1 series. In the following weeks, MAAS 2.2 will be backported into Ubuntu Xenial.
For a while, I have been wanting to write about MAAS and how it can easily deploy workloads (especially OpenStack) with Juju, and the time has finally come. This will be the first of a series of posts in which I’ll provide an overview of how to quickly get started with MAAS and Juju.
What is MAAS?
I think that MAAS does not require an introduction, but if you really need one, this awesome video will provide a far better explanation than the one I can give in this blog post.
MAAS has been designed in such a way that it can be deployed in different architectures and network environments. MAAS can be deployed in either a single-node or a multi-node architecture, which allows it to scale to meet your needs. It has two basic components: the MAAS Region Controller and the MAAS Cluster Controller.
The MAAS Region Controller is the component users interface with, and it controls the Cluster Controllers. It hosts the WebUI and API. The Region Controller also runs the MAAS metadata server for cloud-init, as well as the DNS server. It additionally configures an rsyslogd server to log the installation process, and a proxy (squid-deb-proxy) used to cache Debian packages. The preseeds used for the different stages of the process are also stored here.
The MAAS Cluster Controller only interfaces with the Region Controller and is in charge of provisioning in general. The Cluster Controller is where the TFTP and DHCP server(s) are located, and where both the PXE files and ephemeral images are stored. It is also the Cluster Controller’s job to power the managed nodes on and off (if configured).
As you can see in the image above, MAAS can be deployed on either a single node or multiple nodes. MAAS’s design makes it highly scalable, allowing you to add more Cluster Controllers, each managing a different pool of machines. A single-node scenario can become a multi-node scenario by simply adding more Cluster Controllers. Each Cluster Controller has to register with the Region Controller, and each can be configured to manage a different network. The intent is that each Cluster Controller manages a different pool of machines in different networks (for provisioning), allowing MAAS to manage hundreds of machines. This is completely transparent to users, because MAAS presents the machines as a single pool, all of which can be used for deploying and orchestrating your services with Juju.
How Does It Work?
MAAS has 3 basic stages. These are Enlistment, Commissioning and Deployment which are explained below:
The enlistment process is the process by which a new machine is registered with MAAS. When a new machine is started, it will obtain an IP address and PXE boot from the MAAS Cluster Controller. The PXE boot process will instruct the machine to load an ephemeral image that will run and perform an initial discovery process (via a preseed fed to cloud-init). This discovery process will obtain basic information such as network interfaces, MAC addresses and the machine’s architecture. Once this information is gathered, a request to register the machine is made to the MAAS Region Controller. Once this happens, the machine will appear in MAAS in the Declared state.
The commissioning process is the process where MAAS collects hardware information, such as the number of CPU cores, RAM, disk size, etc., which can later be used as constraints. Once the machine has been enlisted (Declared state), it must be accepted into MAAS in order for the commissioning process to begin and for it to become ready for deployment. For example, in the WebUI, an “Accept & Commission” button will be present. Once the machine is accepted into MAAS, it will PXE boot from the MAAS Cluster Controller and will be instructed to run the same ephemeral image (again). This time, however, the commissioning process will be instructed to gather more information about the machine, which will be sent back to the MAAS Region Controller (via cloud-init from the MAAS metadata server). Once this process has finished, the machine’s information will be updated and it will change to the Ready state. This state means that the machine is ready for deployment.
Once the machines are in the Ready state, they can be used for deployment. Deployment can happen with either juju or the maas-cli (or even the WebUI). The maas-cli will only allow you to install Ubuntu on the machine, while juju will not only allow you to deploy Ubuntu on it, but will also allow you to orchestrate services. When a machine has been deployed, its state will change to Allocated to <user>. This state means that the machine is in use by the user who requested its deployment.
Once a user no longer needs the machine, it can be released, and its status will change from Allocated to <user> back to Ready. This means that the machine will be turned off and made available for later use.
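The lifecycle described above can be sketched as a simple transition table. The state names follow this post (Declared, Commissioning, Ready, Allocated); the action names and the exact internal state names in a given MAAS version may differ.

```python
# Sketch of the machine lifecycle from this post as a transition table.
# State and action names are taken from the post, not MAAS internals.

TRANSITIONS = {
    ("Declared", "accept"): "Commissioning",   # Accept & Commission
    ("Commissioning", "finish"): "Ready",      # hardware info gathered
    ("Ready", "deploy"): "Allocated",          # deployed via juju/maas-cli/WebUI
    ("Allocated", "release"): "Ready",         # machine released by the user
}

def next_state(state, action):
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action!r} from state {state!r}")

state = "Declared"
for action in ("accept", "finish", "deploy", "release"):
    state = next_state(state, action)
print(state)  # Ready
```

Note how releasing a machine returns it to Ready rather than to Declared: it stays enlisted and commissioned, available for the next deployment.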
But… How do Machines Turn On/Off?
Now, you might be wondering how the machines are turned on and off, and who is in charge of that. MAAS can manage power devices, such as IPMI/iLO, Sentry Switch CDUs, or even virsh. By default, we expect that all the machines controlled by MAAS have IPMI/iLO cards. If your machines do, MAAS will attempt to auto-detect and auto-configure your IPMI/iLO cards during the Enlistment and Commissioning processes. Once the machines are accepted into MAAS (after enlistment), they will be turned on automatically and commissioned (that is, if IPMI was discovered and configured correctly). This also means that every time a machine is deployed, it will be turned on automatically.
Note that MAAS not only handles physical machines; it can also handle virtual machines, hence the virsh power management type. However, you will have to manually configure the details in order for MAAS to manage these virtual machines and turn them on and off automatically.
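To make the virsh power type concrete, here is a sketch of how power actions map onto virsh commands. Only the command construction is shown; MAAS’s real driver additionally handles connection URIs, credentials, and error reporting, and the `qemu:///system` default here is just an assumption for illustration.

```python
# Sketch: map MAAS-style power actions to virsh commands. The real
# driver does more (credentials, error handling); this shows the mapping.

def virsh_command(action, domain, uri="qemu:///system"):
    actions = {
        "on": "start",        # boot the VM
        "off": "destroy",     # hard power-off
        "query": "domstate",  # report current power state
    }
    if action not in actions:
        raise ValueError(f"unknown power action: {action}")
    return ["virsh", "-c", uri, actions[action], domain]

print(virsh_command("on", "maas-node-1"))
# ['virsh', '-c', 'qemu:///system', 'start', 'maas-node-1']
```

Running the resulting command list (e.g. via `subprocess.run`) against a libvirt host is all it takes to power a VM on or off on MAAS’s behalf.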