This tutorial introduces the use of a small-footprint real-time operating system (RTOS) on an Arm Cortex-M microcontroller. If you are used to writing procedural 'C' code on small 8-/16-bit microcontrollers, you may doubt the need for such an operating system. If you are not familiar with using an RTOS in real-time embedded systems, you should read this tutorial before dismissing the idea. Using an RTOS represents a more sophisticated design approach that fosters structured code development, which is enforced by the RTOS application programming interface (API).

The RTOS structure allows you to take a more object-orientated design approach, while still programming in 'C'. The RTOS also provides you with multithreaded support on a small microcontroller. Together, these two features create quite a shift in design philosophy, moving us away from thinking about procedural 'C' code and flow charts. Instead, we consider the fundamental program threads and the flow of data between them. The use of an RTOS also has several additional benefits which may not be immediately obvious. Since an RTOS-based project is composed of well-defined threads, it helps to improve project management, code reuse, and software testing.

The tradeoff is that an RTOS has additional memory requirements and increases interrupt latency. Typically, the Keil RTX5 RTOS requires 500 bytes of RAM and 5 KB of code, but remember that some of what the RTOS provides would have to be written in your program anyway. We now have a generation of small low-cost microcontrollers that have enough on-chip memory and processing power to support the use of an RTOS. Developing using this approach is therefore much more accessible.

We will first look at setting up an introductory RTOS project for a Cortex-M based microcontroller. Next, we will go through each of the RTOS primitives and how they influence the design of our application code. Finally, when we have a clear understanding of the RTOS features, we will take a closer look at the RTOS configuration options. If you are used to programming a microcontroller without an RTOS (that is, bare metal), there are two key things to understand as you work through this tutorial. In the first section, we will focus on creating and managing threads. The key concept here is to think of them as concurrent objects running in parallel. In the second section, we will look at how to communicate between threads. The key concept there is synchronization of the concurrent threads.

Prerequisites

It is assumed that you have Keil MDK installed on your PC. For download and installation instructions, please visit the Getting Started page. Once you have set up the tool, open Pack Installer:

Use the Search box on the Devices tab to look for the STM32F103 device.
On the Packs tab, download and install the latest Keil:STM32F1xx_DFP pack and the latest Hitex:CMSIS_RTOS2_Tutorial pack.

Note: It is assumed that you are familiar with Arm Keil MDK and have basic 'C' programming knowledge.

First Steps with Keil RTX5

The RTOS itself consists of a scheduler which supports round-robin, pre-emptive and co-operative multitasking of program threads, as well as time and memory management services. Inter-thread communication is supported by additional RTOS objects, including thread flags and event flags, semaphores, mutexes, message passing and a memory pool system. As we will see, interrupt handling can also be accomplished by prioritized threads which are scheduled by the RTOS kernel.

Accessing the CMSIS-RTOS2 API

To access any of the CMSIS-RTOS2 features in our application code, it is necessary to include the following header file.

#include <cmsis_os2.h>

This header file is maintained by Arm as part of the CMSIS-RTOS2 standard. For Keil RTX5, this is the default API. Other RTOS will have their own proprietary API but may provide a wrapper layer to implement the CMSIS-RTOS2 API so they can be used where compatibility with the CMSIS standard is required.

Threads

The building blocks of a typical 'C' program are functions which we call to perform a specific procedure and which then return to the calling function. In CMSIS-RTOS2, the basic unit of execution is a "Thread". A Thread is very similar to a 'C' procedure but has some very fundamental differences.

unsigned int procedure (void) {
  ...
  return(ch);
}

void thread (void) {
  while(1) {
    ...
  }
}

__NO_RETURN void Thread1(void *argument) {
  while(1) {
    ...
  }
}

While we always return from our 'C' function, once started an RTOS thread must contain a loop so that it never terminates and thus runs forever. You can think of a thread as a mini self-contained program that runs within the RTOS. With the Arm Compiler, it is possible to optimize a thread by using a __NO_RETURN macro. This attribute reduces the cost of calling a function that never returns.

An RTOS program is made up of a number of threads, which are controlled by the RTOS scheduler. This scheduler uses the SysTick timer to generate a periodic interrupt as a time base. The scheduler will allot a certain amount of execution time to each thread. So thread1 will run for 5 ms then be de-scheduled to allow thread2 to run for a similar period; thread2 will give way to thread3 and finally control passes back to thread1. By allocating these slices of runtime to each thread in a round-robin fashion, we get the appearance of all three threads running in parallel to each other.
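The round-robin behaviour described above can be modelled on a PC without any RTOS. The sketch below is only an illustration and is not how RTX5 is implemented: `rr_next()` and `rr_simulate()` are hypothetical helpers, the loop counter stands in for the SysTick interrupt, and the slice length of 5 ticks assumes a 1 ms tick to match the 5 ms example.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_THREADS 3u   /* thread1..thread3 from the text (illustrative) */
#define SLICE_TICKS 5u   /* assumed time slice: 5 ticks of 1 ms each      */

/* Return the index of the thread that runs after `current`. */
static unsigned rr_next(unsigned current)
{
    return (current + 1u) % NUM_THREADS;
}

/*
 * Simulate `total_ticks` timer ticks, starting with thread 0.
 * run_ticks[i] receives the number of ticks for which thread i was running.
 * Returns the index of the thread that would run on the next tick.
 */
static unsigned rr_simulate(unsigned total_ticks, unsigned run_ticks[NUM_THREADS])
{
    unsigned current = 0u;
    unsigned used = 0u;              /* ticks consumed in the current slice */

    assert(run_ticks != NULL);
    for (size_t i = 0; i < NUM_THREADS; i++) {
        run_ticks[i] = 0u;
    }

    for (unsigned tick = 0u; tick < total_ticks; tick++) {
        run_ticks[current]++;
        if (++used == SLICE_TICKS) { /* slice expired: switch threads */
            current = rr_next(current);
            used = 0u;
        }
    }
    return current;
}
```

Simulating 30 ticks gives each of the three threads 10 ticks of run time and hands control back to the first thread, which is the appearance of parallel execution that the scheduler creates.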

Conceptually we can think of each thread as performing a specific functional unit of our program with all threads running simultaneously. This leads us to a more object-orientated design, where each functional block can be coded and tested in isolation and then integrated into a fully running program. This not only imposes a structure on the design of our final application but also aids debugging, as a particular bug can be easily isolated to a specific thread. It also aids code reuse in later projects. When a thread is created, it is also allocated its own thread ID. This is a variable which acts as a handle for each thread and is used when we want to manage the activity of the thread.

osThreadId_t id1, id2, id3;

In order to make the thread-switching process happen, we have the code overhead of the RTOS and we have to dedicate a CPU hardware timer to provide the RTOS time reference. In addition, each time we switch running threads, we have to save the state of all the thread variables to a thread stack. Also, all the runtime information about a thread is stored in a thread control block, which is managed by the RTOS kernel. Thus the “context switch time”, that is, the time to save the current thread state and load up and start the next thread, is a crucial figure and will depend on both the RTOS kernel and the design of the underlying hardware.
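To make the idea of a context switch more concrete, here is a toy model that runs on a PC. It is a sketch under simplifying assumptions, not the RTX5 implementation: the register file has only four registers, and `sim_cpu_t`, `sim_tcb_t` and the functions below are hypothetical names. The outgoing thread's registers are pushed onto its own stack, its stack position is kept in its control block, and the incoming thread's registers are popped from its stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_REGS    4u    /* toy register file; a real Cortex-M has more */
#define STACK_WORDS 16u   /* toy per-thread stack size                   */

typedef struct {
    uint32_t regs[NUM_REGS];
} sim_cpu_t;

typedef struct {
    uint32_t stack[STACK_WORDS]; /* per-thread stack                      */
    size_t   sp;                 /* saved stack index, stack grows down   */
} sim_tcb_t;

/* Push the CPU registers onto the thread's stack. */
static void context_save(const sim_cpu_t *cpu, sim_tcb_t *tcb)
{
    assert(tcb->sp >= NUM_REGS);
    for (size_t i = 0; i < NUM_REGS; i++) {
        tcb->stack[--tcb->sp] = cpu->regs[i];
    }
}

/* Pop the CPU registers from the thread's stack (reverse order of save). */
static void context_restore(sim_cpu_t *cpu, sim_tcb_t *tcb)
{
    assert(tcb->sp + NUM_REGS <= STACK_WORDS);
    for (size_t i = NUM_REGS; i > 0; i--) {
        cpu->regs[i - 1] = tcb->stack[tcb->sp++];
    }
}

/* Prepare a thread that has never run: its stack holds an all-zero context. */
static void tcb_init(sim_tcb_t *tcb)
{
    sim_cpu_t zero = { { 0u } };
    tcb->sp = STACK_WORDS;
    context_save(&zero, tcb);
}

/* Save the outgoing thread and load the incoming one. */
static void context_switch(sim_cpu_t *cpu, sim_tcb_t *from, sim_tcb_t *to)
{
    context_save(cpu, from);
    context_restore(cpu, to);
}
```

In this model, the work done by `context_switch()` grows with the number of registers that must be saved and restored, which is one reason the context switch time depends on the underlying hardware as well as on the kernel.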

The Thread Control Block contains information about the status of a thread. Part of this information is its run state. In a given system, only one thread can be running and all the others will be suspended but ready to run. The RTOS has various methods of inter-thread communication (signals, semaphores, messages). Here, a thread may be suspended to wait to be signaled by another thread or interrupt before it resumes its ready state, whereupon it can be placed into running state by the RTOS scheduler.

State	Description
Running	The currently running thread
Ready	Threads ready to run
Wait	Blocked threads waiting for an OS event

At any given moment only a single thread can be running. The remaining threads that are ready to run will be scheduled by the kernel. Other threads may be waiting for an OS event. When that event occurs, the waiting thread returns to the ready state and is again scheduled by the kernel.
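The three states and the moves between them can be written down as a small state machine. This is a simplified model for illustration only; the enum and function names are hypothetical and do not come from the CMSIS-RTOS2 API. The point to notice is that a waiting thread never goes straight back to running: the OS event only makes it ready, and the scheduler decides when it runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The three states from the table above. */
typedef enum { STATE_RUNNING, STATE_READY, STATE_WAIT } thread_state_t;

/* Things that can happen to a thread in this simplified model. */
typedef enum {
    EV_DISPATCH,  /* scheduler selects the thread to run                 */
    EV_PREEMPT,   /* time slice expired or a higher priority is ready    */
    EV_BLOCK,     /* running thread starts waiting for an OS event       */
    EV_OS_EVENT   /* the awaited OS event has occurred                   */
} thread_event_t;

/*
 * Apply `ev` to `*state`. Returns true if the transition is legal in this
 * model; on an illegal transition the state is left unchanged.
 */
static bool thread_transition(thread_state_t *state, thread_event_t ev)
{
    assert(state != NULL);
    switch (ev) {
    case EV_DISPATCH:
        if (*state != STATE_READY) return false;
        *state = STATE_RUNNING;
        return true;
    case EV_PREEMPT:
        if (*state != STATE_RUNNING) return false;
        *state = STATE_READY;
        return true;
    case EV_BLOCK:
        if (*state != STATE_RUNNING) return false;
        *state = STATE_WAIT;
        return true;
    case EV_OS_EVENT:
        if (*state != STATE_WAIT) return false;
        *state = STATE_READY;
        return true;
    }
    return false;
}
```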

Starting the RTOS

To build a simple RTOS program, we declare each thread as a standard 'C' function and also declare a thread ID variable for each function.

void thread1 (void *argument);
void thread2 (void *argument);

osThreadId_t thrdID1, thrdID2;

Once the processor leaves the reset vector, we enter the main() function as normal. Once in main(), we must call osKernelInitialize() to set up the RTOS. It is not possible to call any RTOS function before the osKernelInitialize() function has successfully completed. Once osKernelInitialize() has completed, we can create further threads and other RTOS objects. One way to do this is to create a launcher thread, which in the example below is called app_main(). Inside the app_main() thread, we create all the RTOS threads and objects we need to start our application running. As we will see later, it is also possible to dynamically create and destroy RTOS objects as the application is running. Next, we call osKernelStart() to start the RTOS and the scheduler's task switching. Before starting the RTOS, you can run any initialization code you need to set up peripherals and initialize the hardware.

void app_main(void *argument) {
  // Create the application threads from the launcher thread
  T_led_ID1 = osThreadNew(led_Thread1, NULL, &ThreadAttr_LED1);
  T_led_ID2 = osThreadNew(led_Thread2, NULL, &ThreadAttr_LED2);
  osDelay(osWaitForever);            // Launcher has no more work to do
  while (1)
    ;
}

int main(void) {
  IODIR1 = 0x00FF0000;               // Do any C code you want
  osKernelInitialize();              // Initialize the kernel
  osThreadNew(app_main, NULL, NULL); // Create the app_main() launcher thread
  osKernelStart();                   // Start the RTOS
}

When threads are created they are also assigned a priority. If there are a number of threads ready to run and they all have the same priority, they will be allotted run time in a round-robin fashion. However, if a thread with a higher priority becomes ready to run, the RTOS scheduler will de-schedule the currently running thread and start the high priority thread running. This is called pre-emptive priority-based scheduling. When assigning priorities, you have to be careful because the high priority thread will continue to run until it enters a waiting state or until a thread of equal or higher priority is ready to run.
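The selection rule described above can be sketched as a small function that runs on a PC. This is a model for illustration, not the RTX5 scheduler: `sim_thread_t` and `pick_next()` are hypothetical names, and the sketch assumes that a larger number means a higher priority. The highest-priority ready thread always wins, and threads of equal priority take turns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    int  priority;  /* larger value = higher priority (assumption)     */
    bool ready;     /* false while the thread is in the waiting state  */
} sim_thread_t;

/*
 * Pick the next thread to run: the highest priority among ready threads.
 * Ties are broken round-robin by searching from the slot after `last`,
 * the index of the thread that ran most recently.
 * Returns the chosen index, or -1 if no thread is ready.
 */
static int pick_next(const sim_thread_t *t, size_t n, size_t last)
{
    int best = -1;

    assert(t != NULL || n == 0);
    for (size_t k = 1; k <= n; k++) {
        size_t i = (last + k) % n;
        if (t[i].ready &&
            (best < 0 || t[i].priority > t[(size_t)best].priority)) {
            best = (int)i;
        }
    }
    return best;
}
```

If a high-priority thread stays ready, `pick_next()` keeps returning it and the lower-priority threads never run. This is why a high-priority thread should enter a waiting state when it has finished its work.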

Options	Description
osFlagsWaitAny	Wait for any flag to be set (default)
osFlagsWaitAll	Wait for all flags to be set
osFlagsNoClear	Do not clear flags that have been specified to wait for

Bitmask	Description
osMutexRecursive	The same thread can consume a mutex multiple times without locking itself.
osMutexPrioInherit	While a thread owns the mutex, its priority is raised to that of the highest priority thread waiting for the mutex.
osMutexRobust	Notify threads that acquire a mutex that the previous owner was terminated.
