Tinkering TI MSP430F5529

Hardware Multiplier Module – MPY32

Many high-end microcontrollers include an internal hardware multiplier. What does this hardware do? It performs multiplication, and that is all. Why does a hardware multiplier matter when we code in C and a simple multiplication operator is enough to do the job? Without a hardware multiplier, multiplication has to be implemented in software, by routines that are hidden from the C programmer. Unlike dedicated hardware, a software routine takes CPU time and resources because the operation is emulated. The situation resembles hardware rendering versus software rendering. We like good graphics when playing games. A general-purpose computer without a dedicated graphics card may struggle to run a game with detailed graphics, while a gaming computer with a dedicated graphics card runs the same game at full performance. Like the graphics card, a hardware multiplier is a dedicated accelerator that is often needed in time-critical computations and digital signal processing (DSP).

The MSP430F5529 has a 32-bit hardware multiplier named MPY32. Like the DMA controller, it is a peripheral and not part of the CPU, so the multiplication itself does not interfere with CPU activity. However, the CPU is still needed to load operands into the multiplier and to read the results back. It supports the following operations and features:

• Unsigned multiply

• Signed multiply

• Unsigned multiply accumulate

• Signed multiply accumulate

• 8-bit, 16-bit, 24-bit, and 32-bit operands

• Saturation

• Fractional numbers

• 8-bit and 16-bit operation compatible with 16-bit hardware multiplier

• 8-bit and 24-bit multiplications without requiring a “sign extend” instruction
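
To make the "multiply accumulate" and "saturation" items more concrete, here is a minimal host-side model of a signed 16-bit multiply-accumulate that clamps to the 32-bit range. It is only a sketch of the concept in portable C: the function name mac_s16_sat is hypothetical, it does not touch any MPY32 register, and the exact saturation behaviour of the real module should be checked against the device family user's guide.

```c
#include <stdint.h>

/* Software model of a signed multiply-accumulate with saturation.
 * Hypothetical helper for illustration; the MPY32 does this in hardware. */
int32_t mac_s16_sat(int32_t acc, int16_t a, int16_t b)
{
    /* A 16 x 16 bit signed product always fits in 32 bits. */
    int32_t product = (int32_t)a * (int32_t)b;

    /* Accumulate in 64 bits so an overflow can be detected. */
    int64_t sum = (int64_t)acc + (int64_t)product;

    if (sum > INT32_MAX)
    {
        return INT32_MAX;   /* clamp to most positive value */
    }
    if (sum < INT32_MIN)
    {
        return INT32_MIN;   /* clamp to most negative value */
    }
    return (int32_t)sum;
}
```

Without saturation the accumulator would wrap around on overflow, which in a DSP filter shows up as a large error of the wrong sign; clamping keeps the error bounded.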

Software-based multiplication keeps the CPU busy. In its simplest form, software multiplication is repeated addition. We do not see it happening because the whole process takes place at the machine/assembly level while we code in C.
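
As a rough idea of what such a hidden routine can look like, the sketch below multiplies two unsigned 16-bit numbers using only shifts and additions. This is the common textbook shift-and-add method, shown as an assumption about the general approach; it is not taken from the TI runtime library, and the name mul_shift_add is hypothetical.

```c
#include <stdint.h>

/* Multiply two unsigned 16-bit values using only shifts and additions.
 * Illustrative only: not necessarily what any particular compiler emits. */
uint32_t mul_shift_add(uint16_t a, uint16_t b)
{
    uint32_t multiplicand = a;   /* widened so shifted copies do not overflow */
    uint32_t product = 0;

    while (b != 0)
    {
        if (b & 1u)              /* is the lowest bit of the multiplier set? */
        {
            product += multiplicand;
        }
        multiplicand <<= 1;      /* the next bit is worth twice as much */
        b >>= 1;                 /* move on to the next multiplier bit */
    }
    return product;
}
```

The loop runs at most 16 times for 16-bit operands, so even without dedicated hardware the CPU needs far fewer steps than one addition per unit of the multiplier.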

Code Example

#include "driverlib.h"
#include "delay.h"
#include "lcd.h"
#include "lcd_print.h"

void clock_init(void);
void timer_T0A5_init(void);

void main(void)
{
    unsigned int i = 0;
    unsigned int j = 0;

    signed long res = 0;
    signed int num1 = 263;
    signed int num2 = 249;
    unsigned int timer_count = 0;

    WDT_A_hold(WDT_A_BASE);

    clock_init();
    timer_T0A5_init();

    LCD_init();
    LCD_clear_home();

    LCD_goto(4, 0);
    LCD_putstr("Software");
    LCD_goto(1, 1);
    LCD_putstr("Multiplication");

    delay_ms(2000);
    LCD_clear_home();

    LCD_goto(0, 0);
    LCD_putstr("263(x)249=");
    LCD_goto(0, 1);
    LCD_putstr("T.CNT = ");

    Timer_A_startCounter(__MSP430_BASEADDRESS_T0A5__,
                         TIMER_A_CONTINUOUS_MODE);

    for(i = 0; i < num1; i++)
    {
        for(j = 0; j < num2; j++)
        {
            res++;
        }
    }

    timer_count = Timer_A_getCounterValue(__MSP430_BASEADDRESS_T0A5__);

    Timer_A_stop(__MSP430_BASEADDRESS_T0A5__);

    print_I(10, 0, res);
    print_I(10, 1, timer_count);

    delay_ms(4000);
    LCD_clear_home();

    res = 0;
    timer_count = 0;
    Timer_A_clear(__MSP430_BASEADDRESS_T0A5__);

    LCD_goto(4, 0);
    LCD_putstr("Hardware");
    LCD_goto(1, 1);
    LCD_putstr("Multiplication");

    delay_ms(2000);
    LCD_clear_home();

    LCD_goto(0, 0);
    LCD_putstr("263(x)249=");
    LCD_goto(0, 1);
    LCD_putstr("T.CNT = ");

    Timer_A_startCounter(__MSP430_BASEADDRESS_T0A5__,
                         TIMER_A_CONTINUOUS_MODE);

    MPY32_setOperandOne16Bit(MPY32_MULTIPLY_UNSIGNED,
                             num1);

    MPY32_setOperandTwo16Bit(num2);

    res = MPY32_getResult();

    timer_count = Timer_A_getCounterValue(__MSP430_BASEADDRESS_T0A5__);

    Timer_A_stop(__MSP430_BASEADDRESS_T0A5__);

    print_I(10, 0, res);
    print_I(10, 1, timer_count);

    delay_ms(4000);

    num1 = -99;
    num2 = 660;

    LCD_clear_home();

    LCD_goto(0, 0);
    LCD_putstr("Signed Software");
    LCD_goto(1, 1);
    LCD_putstr("Multiplication");

    delay_ms(2000);
    LCD_clear_home();

    LCD_goto(0, 0);
    LCD_putstr("-99(x)660=");
    LCD_goto(0, 1);
    LCD_putstr("T.CNT = ");

    Timer_A_startCounter(__MSP430_BASEADDRESS_T0A5__,
                         TIMER_A_CONTINUOUS_MODE);

    res = (((signed long)num1) * ((signed long)num2));

    timer_count = Timer_A_getCounterValue(__MSP430_BASEADDRESS_T0A5__);

    Timer_A_stop(__MSP430_BASEADDRESS_T0A5__);

    print_I(10, 0, res);
    print_I(10, 1, timer_count);

    delay_ms(4000);
    LCD_clear_home();

    res = 0;
    timer_count = 0;

    Timer_A_clear(__MSP430_BASEADDRESS_T0A5__);

    LCD_goto(0, 0);
    LCD_putstr("Signed Hardware");
    LCD_goto(1, 1);
    LCD_putstr("Multiplication");

    delay_ms(2000);
    LCD_clear_home();

    LCD_goto(0, 0);
    LCD_putstr("-99(x)660=");
    LCD_goto(0, 1);
    LCD_putstr("T.CNT = ");

    Timer_A_startCounter(__MSP430_BASEADDRESS_T0A5__,
                         TIMER_A_CONTINUOUS_MODE);

    MPY32_setOperandOne16Bit(MPY32_MULTIPLY_SIGNED,
                             num1);

    MPY32_setOperandTwo16Bit(num2);

    res = MPY32_getResult();

    timer_count = Timer_A_getCounterValue(__MSP430_BASEADDRESS_T0A5__);

    Timer_A_stop(__MSP430_BASEADDRESS_T0A5__);

    print_I(10, 0, res);
    print_I(10, 1, timer_count);

    while(1)
    {
    };
}

void clock_init(void)
{
    PMM_setVCore(PMM_CORE_LEVEL_3);

    GPIO_setAsPeripheralModuleFunctionInputPin(GPIO_PORT_P5,
                                               (GPIO_PIN4 | GPIO_PIN2));

    GPIO_setAsPeripheralModuleFunctionOutputPin(GPIO_PORT_P5,
                                                (GPIO_PIN5 | GPIO_PIN3));

    UCS_setExternalClockSource(XT1_FREQ,
                               XT2_FREQ);

    UCS_turnOnXT2(UCS_XT2_DRIVE_4MHZ_8MHZ);

    UCS_turnOnLFXT1(UCS_XT1_DRIVE_3,
                    UCS_XCAP_3);

    UCS_initClockSignal(UCS_FLLREF,
                        UCS_XT2CLK_SELECT,
                        UCS_CLOCK_DIVIDER_4);

    UCS_initFLLSettle(MCLK_KHZ,
                      MCLK_FLLREF_RATIO);

    UCS_initClockSignal(UCS_SMCLK,
                        UCS_XT2CLK_SELECT,
                        UCS_CLOCK_DIVIDER_1);

    UCS_initClockSignal(UCS_ACLK,
                        UCS_XT1CLK_SELECT,
                        UCS_CLOCK_DIVIDER_1);
}

void timer_T0A5_init(void)
{
    Timer_A_initContinuousModeParam ContinuousModeParam =
    {
         TIMER_A_CLOCKSOURCE_SMCLK,
         TIMER_A_CLOCKSOURCE_DIVIDER_1,
         TIMER_A_TAIE_INTERRUPT_DISABLE,
         TIMER_A_DO_CLEAR,
         false
    };

    Timer_A_stop(__MSP430_BASEADDRESS_T0A5__);

    Timer_A_initContinuousMode(__MSP430_BASEADDRESS_T0A5__,
                               &ContinuousModeParam);
}

Hardware Setup

Explanation

To demonstrate the merits of a hardware multiplier, I wrote a simple program that performs a few calculations and records a timer count for each one to determine how long it took. Two kinds of multiplication, unsigned and signed, are performed both with the hardware module and in software, using the same numbers in both cases. Timer TA0 is clocked from SMCLK, which is sourced from XT2CLK with no division, so it runs at 4 MHz. No interrupt is used, to keep things simple. Time is measured in timer ticks and not in seconds or other units: the larger the tick count, the longer the calculation took. The timer is started right before a calculation and stopped immediately after it completes.

void timer_T0A5_init(void)
{
    Timer_A_initContinuousModeParam ContinuousModeParam =
    {
         TIMER_A_CLOCKSOURCE_SMCLK,
         TIMER_A_CLOCKSOURCE_DIVIDER_1,
         TIMER_A_TAIE_INTERRUPT_DISABLE,
         TIMER_A_DO_CLEAR,
         false
    };

    Timer_A_stop(__MSP430_BASEADDRESS_T0A5__);

    Timer_A_initContinuousMode(__MSP430_BASEADDRESS_T0A5__,
                               &ContinuousModeParam);
}
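
Since the timer runs at 4 MHz, each tick corresponds to 250 ns. A small helper like the one below can turn a raw tick count into a time. It is a sketch that assumes the 4 MHz timer clock described above; TIMER_CLOCK_HZ and ticks_to_ns are hypothetical names and not part of driverlib.

```c
#include <stdint.h>

/* Assumed timer clock: SMCLK sourced from a 4 MHz XT2, divider 1. */
#define TIMER_CLOCK_HZ 4000000UL

/* Convert a raw 16-bit timer tick count to nanoseconds. */
uint32_t ticks_to_ns(uint16_t ticks)
{
    /* Multiply first in 64 bits to avoid overflow and keep precision. */
    return (uint32_t)(((uint64_t)ticks * 1000000000ULL) / TIMER_CLOCK_HZ);
}
```

With this helper, 17 ticks come out as 4250 ns and 13500 ticks as 3375000 ns.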

The first calculation is an unsigned multiplication of the numbers 263 and 249. As stated before, software multiplication in its simplest form is repeated addition, and the code below does exactly that:

for(i = 0; i < num1; i++)
{
    for(j = 0; j < num2; j++)
    {
        res++;
    }
}

The numbers to be multiplied set the bounds of two nested loops, and a variable called res is incremented on every pass of the inner loop. In effect, this behaves like a rudimentary multiplication. The result of this multiplication is 65487. The calculation takes about 13500 counts, or about 3.4 ms at 4 MHz.

The timer is then cleared and restarted. This time, however, the hardware multiplier is used to multiply the same numbers.

MPY32_setOperandOne16Bit(MPY32_MULTIPLY_UNSIGNED, 
                         num1);

MPY32_setOperandTwo16Bit(num2);

res = MPY32_getResult();

This time the same calculation takes just 17 ticks, or about 4 µs. Hardware multiplication is therefore roughly 800 times faster than the loop-based software multiplication (13500 / 17 ≈ 794).

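The same comparison is repeated with signed numbers, -99 and 660, this time using the compiler's multiplication operator for the software case. The result of this calculation is -65340, and it takes 36 timer ticks, or 9 µs. Note that the listing does not call Timer_A_clear() between the unsigned hardware run and this one, so this figure probably includes the ticks left on the counter from the previous measurement.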

res = (((signed long)num1) * ((signed long)num2));

Again, the hardware multiplier is used for the same calculation, and again it takes 17 ticks, or about 4 µs. Unlike the first example with unsigned numbers, the time difference is small this time, because when the compiler sees the multiplication operator it emits an optimized multiply routine at the assembly/machine level instead of counting in nested loops. The time is thus reduced significantly, but in this measurement it is still not as fast as driving the hardware multiplier directly.
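
The signed run above selects MPY32_MULTIPLY_SIGNED instead of MPY32_MULTIPLY_UNSIGNED. The host-side sketch below illustrates why the mode matters: the same 16-bit pattern gives a very different product depending on whether it is treated as signed or unsigned. The function names are hypothetical and the code models the arithmetic only; it does not access the MPY32.

```c
#include <stdint.h>

/* Product when both 16-bit patterns are interpreted as unsigned. */
uint32_t mul_as_unsigned(uint16_t a, uint16_t b)
{
    return (uint32_t)a * (uint32_t)b;
}

/* Product when both 16-bit patterns are interpreted as signed. */
int32_t mul_as_signed(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}
```

For example, -99 stored in 16 bits is the pattern 0xFF9D, which reads as 65437 when unsigned. Multiplying by 660 gives -65340 in the signed interpretation but 43188420 in the unsigned one, so the multiplier mode has to match the type of the operands.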

Demo
