Getting started with the DUE (Part 2)


I got a mail with an interesting question (Thanks Andrew!) regarding the performance of the DUE versus the UNO when using PlainFFT.



I built this simple sketch based on the example which comes with the PlainFFT package:

#include "PlainFFT.h"
#include "math.h"

PlainFFT FFT; /* Create FFT object */
/* Simulation parameters */
#define TWO_PI 6.283185307179586476925286766559
const uint16_t _samples = 128;
const float _samplingFrequency = 200.0;
const float _signalFrequency = 50.0;
const float _signalIntensity = 100.0;
/*
These are the input and output vectors
Input vectors receive computed results from FFT
*/
float _vReal[_samples];
float _vImag[_samples];

void setup()
{
	/* Needed by Serial.println() in loop(); the baud rate is an arbitrary choice */
	Serial.begin(115200);
	// BlinkLed(3, 200);
	/* Build raw data */
	for (uint16_t i = 0; i < _samples; i++) {
		float abscissa = (i / _samplingFrequency);
		_vReal[i] = (sin(abscissa * TWO_PI * _signalFrequency) * (_signalIntensity / 2.0));
		_vImag[i] = 0.0;
	}
}

void loop()
{
	uint32_t tip = millis();
	for (uint16_t i = 0; i < 100; i++) {
		/* Weigh data */
		FFT.Windowing(_vReal, _samples, FFT_WIN_TYP_HANN, FFT_FORWARD);
		/* Compute FFT */
		FFT.Compute(_vReal, _vImag, _samples, FFT_FORWARD);
		/* Compute magnitudes */
		FFT.ComplexToReal(_vReal, _vImag, _samples, FFT_SCL_TYP_AMPLITUDE);
	}
	uint32_t top = millis();
	/* Print the average duration of one pass, in ms */
	Serial.println((top - tip) / 100.0, 2);
}

Plain trivial!

Running this sketch on the UNO shows that each pass (windowing, FFT and magnitude computation) takes 62 ms, while running it on the DUE cuts the execution time down to … 11 ms. That’s still a lot. Why? Well, the DUE runs approximately 5 times faster than the UNO (84 MHz vs 16 MHz). However, PlainFFT uses 32-bit… floats, which the DUE’s micro-controller does not like so much: it has no floating-point unit, so every float operation is emulated in software. The ultimate performance would be achieved with fixed-point maths…
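To give an idea of what fixed-point maths looks like, here is a minimal sketch of the common Q15 format, where a 16-bit integer stores a value in [-1, 1) scaled by 32768. This is not how PlainFFT works (it uses floats), and the names floatToQ15, mulQ15 and q15ToFloat are hypothetical helpers made up for this illustration.

```cpp
#include <stdint.h>

// Q15: a signed 16-bit integer holding (value * 32768), for values in [-1, 1).
// All names below are hypothetical helpers, not part of PlainFFT.

// Convert a float to Q15, saturating at both ends of the range.
int16_t floatToQ15(float x) {
    float scaled = x * 32768.0f;
    if (scaled >= 32767.0f) return 32767;
    if (scaled <= -32768.0f) return -32768;
    // Round to nearest instead of truncating
    return (int16_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

// Multiply two Q15 values using integer operations only.
// The casts force a 32-bit product (int is only 16 bits wide on the UNO).
// The right shift of a negative value is assumed to be arithmetic,
// which is the case with gcc on both AVR and ARM.
int16_t mulQ15(int16_t a, int16_t b) {
    int32_t product = (int32_t)a * (int32_t)b; // Q30 intermediate result
    int32_t result = product >> 15;            // back to Q15
    if (result > 32767) result = 32767;        // only happens for -1 * -1
    return (int16_t)result;
}

// Convert a Q15 value back to float, e.g. for printing.
float q15ToFloat(int16_t x) {
    return x / 32768.0f;
}
```

For example, 0.5 is stored as 16384, and mulQ15(16384, 16384) returns 8192, which is 0.25. The multiplication itself boils down to one integer multiply and one shift, with no floating-point emulation involved.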

Here is another example which illustrates the situation, using another library: PlainFIR. The latest version, which I am preparing, performs real-time filtering, which led me to use integer maths. The results were obtained for a 17-tap FIR filter: UNO 100 µs, DUE 5 µs! Whaow! How come? In this case, I took full advantage of the 32-bit versus 8-bit architecture. Under these conditions, integer maths on large numbers takes … 4 times less time.
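For readers who wonder what an integer FIR filter may look like, here is a minimal sketch with 17 taps. It is not the PlainFIR code: the FirFilter structure and the firInit and firProcess functions are hypothetical names, and I assume Q15 coefficients (value × 32768) whose absolute values sum to at most 1.0, so that the 32-bit accumulator cannot overflow.

```cpp
#include <stdint.h>
#include <stddef.h>

const size_t TAPS = 17; // same tap count as the filter discussed above

// Hypothetical filter state, not taken from PlainFIR
struct FirFilter {
    int16_t coeffs[TAPS];  // Q15 coefficients, sum of magnitudes <= 1.0 (assumed)
    int16_t history[TAPS]; // circular buffer holding the latest samples
    size_t head;           // index of the newest sample in history
};

// Copy the coefficients and clear the sample history.
void firInit(FirFilter &f, const int16_t *coeffs) {
    for (size_t i = 0; i < TAPS; i++) {
        f.coeffs[i] = coeffs[i];
        f.history[i] = 0;
    }
    f.head = 0;
}

// Push one sample in, get one filtered sample out.
int16_t firProcess(FirFilter &f, int16_t sample) {
    // Advance the circular buffer and store the new sample
    if (++f.head == TAPS) f.head = 0;
    f.history[f.head] = sample;

    // Multiply-accumulate: coeffs[0] meets the newest sample,
    // coeffs[1] the previous one, and so on
    int32_t acc = 0;
    size_t idx = f.head;
    for (size_t i = 0; i < TAPS; i++) {
        acc += (int32_t)f.coeffs[i] * (int32_t)f.history[idx];
        idx = (idx == 0) ? TAPS - 1 : idx - 1;
    }
    // Scale the result back from Q30 to the range of the input samples
    return (int16_t)(acc >> 15);
}
```

Each output sample costs 17 multiplications of 16-bit values into a 32-bit result, plus 17 additions on 32 bits. These are the kind of operations that a 32-bit core performs natively, while an 8-bit core has to split each of them into several instructions.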

Let’s recap: [gain of 4 (32 bits vs 8 bits)] x [gain of 5 (84 MHz vs 16 MHz)] = bingo! The code executes 20 times faster on the DUE than on the UNO, which is more or less the expected figure. And that makes a huge difference. Let’s consider real-time processing of a signal which is sampled 44100 times per second, that is to say in “CD quality” conditions. This means that we have less than 23 µs (1/44100 s ≈ 22.7 µs) for: converting the analog signal to digital, storing the acquired data, executing the filtering process, setting the filtered value and converting the digital value to an analog signal! That’s a lot of work to do in a short period of time where every µs counts! In other words, the UNO can perform simple audio mods, but at this sampling rate it definitely fails to execute even the simplest loop within the time interval between two signal samples.
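The time budget can be checked with a few lines of arithmetic. The sketch below assumes the nominal clock frequencies of the two boards (16 MHz for the UNO and 84 MHz for the DUE's SAM3X8E); the function names samplePeriodMicros and cyclesPerSample are hypothetical.

```cpp
#include <stdint.h>

// Time between two consecutive samples, in microseconds.
constexpr double samplePeriodMicros(double sampleRateHz) {
    return 1e6 / sampleRateHz;
}

// Number of CPU clock cycles available between two samples
// (integer division, so the result is rounded down).
constexpr uint32_t cyclesPerSample(uint32_t cpuHz, uint32_t sampleRateHz) {
    return cpuHz / sampleRateHz;
}
```

At 44100 samples per second, samplePeriodMicros(44100.0) gives about 22.7 µs. In terms of clock cycles, cyclesPerSample(16000000UL, 44100UL) gives 362 for the UNO and cyclesPerSample(84000000UL, 44100UL) gives 1904 for the DUE. A budget of 362 cycles must cover the ADC, the filter and the DAC, which shows how tight things are on the UNO.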

Let’s see what we can get from the DUE. Unfortunately, the DUE suffers from a couple of drawbacks which drove me to use external components in order to achieve acceptable performance. Among these drawbacks, the slowness of analogRead and analogWrite is the worst, and the DAC pins do not provide a rail-to-rail output voltage. That’s really sad. The good news for the followers/fans of Arduinoos is that I will explain in the next weeks how to overcome these problems at the cost of a few € (6 €). So far, I have managed to perform ADC and DAC within a 3.5 µs time frame, which leaves quite some time to perform additional tasks.
