# The One-Out-of-m Multicore Problem

Jim Anderson, Kenan Professor, University of North Carolina at Chapel Hill

Work supported by NSF and AFOSR

#### **Outline**

- Problems caused by multicore.
  - » "The one-out-of-m problem."
  - » Why this is an important problem.
- Basic solution strategy.
  - » MC<sup>2</sup> (mixed-criticality on multicore).
  - » Hardware management in MC<sup>2</sup>.
- Brief overview of recent work.
  - » Key focus: features of real-world task systems that break hardware isolation.

#### The One-Out-Of-m Multicore Problem

» In many safety-critical domains, we would like to be able to exploit the computational capacity of multicore. *However:* 



Image source: http://www.as.northropgrumman.com/products/nucasx47b/assets/lgm\_UCAS\_3\_0911.jpg

- When using an m-core platform in a safety-critical domain, analysis pessimism can be so great that the capacity of the "additional" m − 1 cores is entirely negated.
- » We call this the "one-out-of-m" problem.
  - In avionics, this problem has led to the common practice of simply disabling all but one core if highly critical system components exist.
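
The arithmetic behind the one-out-of-m problem can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that interference analysis inflates every task's provisioned execution time by a single factor `inflation` relative to running with the other cores disabled; the function name and all numbers are hypothetical and are not taken from the talk.

```python
def effective_capacity(m: int, inflation: float) -> float:
    """Capacity of an m-core platform, measured in units of one
    interference-free core, when every execution-time budget is
    inflated by `inflation` (hypothetical, uniform across tasks)."""
    if m < 1 or inflation < 1.0:
        raise ValueError("need m >= 1 and inflation >= 1.0")
    return m / inflation


# Hypothetical inflation factors for a quad-core platform.
for factor in (1.0, 2.0, 4.0, 5.0):
    print(f"m=4, inflation={factor}: capacity={effective_capacity(4, factor):.2f}")
```

Under this simplified model, once the inflation factor reaches m, the platform is worth no more than a single core, which is the situation in which disabling the other cores becomes the pragmatic choice.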

#### Roots of the problem:

- Shared hardware that is not predictably managed.
  - See the FAA position paper "CAST 32" for an extensive discussion of problems caused by multicore.
- Excessive pessimism in provisioning tasks.
  - Mixed-criticality analysis seeks to address this.


# What is Mixed-Criticality Analysis?

(Vestal [RTSS '07])

- Each task is assigned a criticality level.
- Each task has provisioned execution time (PET) specified at <u>each</u> criticality level.
  - » PETs at higher levels are (typically) larger.
- Example: Assuming criticality levels A (highest), B, C, etc., task τ<sub>i</sub> might have PETs C<sub>i</sub><sup>A</sup> = 20, C<sub>i</sub><sup>B</sup> = 12, C<sub>i</sub><sup>C</sup> = 5, ...
- Rationale: Will use more pessimistic analysis at high levels, more optimistic at low levels.

- The task system is correct at Level X iff all Level-X tasks meet their timing requirements assuming all tasks have Level-X PETs.
- Some "weirdness" here: there is no longer just one system, but <u>several</u>: the Level-A system, the Level-B system, ...
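
The per-level correctness condition can be sketched in Python. This is an illustrative sketch under stated assumptions, not the analysis used in MC<sup>2</sup>: a single-processor utilization bound (total utilization at most 1, deadlines equal to periods) stands in for a real schedulability test, and the Level-X check counts only tasks at Level X or higher, on the assumption that less critical tasks have lower priority and cannot interfere. The `Task` class, the periods, and the PETs of `tau_2` and `tau_3` are hypothetical; only the PETs 20, 12, and 5 come from the example above.

```python
from dataclasses import dataclass

# Criticality levels, highest first (A is the most critical).
LEVELS = ("A", "B", "C")


@dataclass
class Task:
    name: str
    level: str     # the task's own criticality level
    period: float  # hypothetical period, also used as the deadline
    pets: dict     # PET at each level, e.g. {"A": 20, "B": 12, "C": 5}


def utilization_at(tasks, level):
    """Utilization seen by the Level-`level` analysis: every counted task
    uses its Level-`level` PET. Assumption: tasks less critical than
    `level` are excluded because they cannot interfere."""
    rank = LEVELS.index(level)
    return sum(t.pets[level] / t.period
               for t in tasks if LEVELS.index(t.level) <= rank)


def correct_at(tasks, level, capacity=1.0):
    """Stand-in schedulability test: total utilization within capacity."""
    return utilization_at(tasks, level) <= capacity


tasks = [
    Task("tau_1", "A", 100, {"A": 20, "B": 12, "C": 5}),
    Task("tau_2", "B", 50, {"A": 30, "B": 20, "C": 10}),
    Task("tau_3", "C", 40, {"A": 40, "B": 25, "C": 12}),
]

for level in LEVELS:
    print(level, round(utilization_at(tasks, level), 2), correct_at(tasks, level))

# For contrast: provisioning every task with its Level-A PET.
all_level_a = sum(t.pets["A"] / t.period for t in tasks)
print("all tasks at Level-A PETs:", round(all_level_a, 2))
```

With these made-up numbers, the Level-A, Level-B, and Level-C systems each pass the stand-in test, whereas provisioning every task at Level A would need about 1.8 processors' worth of capacity. That gap is the pessimism that mixed-criticality provisioning aims to avoid.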


# **Our Solution Strategy**

- W.r.t. lessening capacity loss generally (even on uniprocessors), two orthogonal approaches have been investigated previously:
  - » Hardware-management techniques that reduce hardware interference.
  - » Mixed-criticality analysis techniques that enable less critical tasks to be provisioned less pessimistically.



- Our work focuses broadly on research questions that arise when applying <u>both</u> approaches together.
  - » We are addressing such questions in the context of MC<sup>2</sup> (mixed-criticality on multicore), a resource-allocation and analysis framework that we developed.



# MC<sup>2</sup>: Starting Assumptions

- Modest core count (e.g., 2-8).
  - » Quad-core in avionics would be a tremendous innovation.

- Modest number of criticality levels (e.g., 2-5).
  - » 2 may be too constraining.
  - » ∞ isn't practically interesting.
  - » These levels may not necessarily match DO-178B/C.


Main motivation: To develop a framework, reasonably plausible from an avionics point of view, that allows interesting design tradeoffs to be investigated.

A Non-Goal: Developing a framework that could really be used in avionics today.
































#### Our Actual Allocation Scheme






## **Experimental Evaluations**

- We have assessed the value of hardware management w.r.t.
  - » individual tasks through experiments involving benchmark programs,
  - » entire task systems from a schedulability point of view.

As a Function of Allocated LLC Area






This is One Out of About 500 Graphs






Uniprocessor EDF (the current de facto standard)

## d-Aware Schedulability Study





#### Recent Work

#### **Dealing with Shared Pages**

- Real-world task systems share memory pages.
- In recent work, we've dealt with these sources of sharing:
  - » "Explicit" read/write sharing due to producer/consumer relationships [RTSS'16].
  - » "Implicit" read-only sharing due to shared libraries [RTAS'17].
  - » Sharing due to interrupt-driven I/O [under construction].
- We've also investigated:
  - » Applications that must support mode changes [under construction].

# MC<sup>2</sup> Papers

(Available at http://www.cs.unc.edu/~anderson/papers.html)

- J. Anderson, S. Baruah, and B. Brandenburg, "Multicore Operating-System Support for Mixed Criticality," *Proc. of the Workshop on Mixed Criticality: Roadmap to Evolving UAV Certification*, 2009.
  - » A "precursor" paper that discusses some of the design decisions underlying MC<sup>2</sup>.
- M. Mollison, J. Erickson, J. Anderson, S. Baruah, and J. Scoredos, "Mixed Criticality Real-Time Scheduling for Multicore Systems," *Proc. of the 7<sup>th</sup> IEEE International Conf. on Embedded Software and Systems*, 2010.
  - » Focus is on **schedulability**, i.e., how to check timing constraints at each level and "shift" slack.
- J. Herman, C. Kenna, M. Mollison, J. Anderson, and D. Johnson, "RTOS Support for Multicore Mixed-Criticality Systems," *Proc. of the 18<sup>th</sup> RTAS*, 2012.
  - » Focus is on RTOS design, i.e., how to reduce the impact of RTOS-related overheads on high-criticality tasks due to low-criticality tasks.
- B. Ward, J. Herman, C. Kenna, and J. Anderson, "Making Shared Caches More Predictable on Multicore Platforms," *Proc. of the 25<sup>th</sup> ECRTS*, 2013.
  - » Adds **shared cache management** to a two-level variant of MC<sup>2</sup>. The approach in today's talk is different.
- J. Erickson, N. Kim, and J. Anderson, "Recovering from Overload in Multicore Mixed-Criticality Systems," *Proc. of the 29<sup>th</sup> IPDPS*, 2015.
  - » Adds virtual-time-based scheduling to Level C.


- M. Chisholm, B. Ward, N. Kim, and J. Anderson, "Cache Sharing and Isolation Tradeoffs in Multicore Mixed-Criticality Systems," *Proc. of the 36<sup>th</sup> RTSS*, 2015.
  - » Presents linear-programming-based techniques for optimizing LLC area allocations.
- N. Kim, B. Ward, M. Chisholm, C.-Y. Fu, J. Anderson, and F.D. Smith, "Attacking the One-Out-Of-m Multicore Problem by Combining Hardware Management with Mixed-Criticality Provisioning," *Proc. of the 22<sup>nd</sup> RTAS*, 2016.
  - » Adds shared hardware management to MC<sup>2</sup>.
- M. Chisholm, N. Kim, B. Ward, N. Otterness, J. Anderson, and F.D. Smith, "Reconciling the Tension Between Hardware Isolation and Data Sharing in Mixed-Criticality, Multicore Systems," *Proc. of the 37<sup>th</sup> RTSS*, 2016.
  - » Adds support for data sharing to MC<sup>2</sup>.
- N. Kim, M. Chisholm, N. Otterness, J. Anderson, and F.D. Smith, "Allowing Shared Libraries while Supporting Hardware Isolation in Multicore Real-Time Systems," *Proc. of the 23<sup>rd</sup> RTAS*, 2017 (to appear).
  - » Adds selective sharing of libraries to MC<sup>2</sup>.

#### Thanks!

Questions?



CMAAS, Apr. 2017