Boolean and bitwise expressions are often used in C when programming microcontrollers or writing device drivers. With some knowledge of boolean algebra complex algorithms can often be greatly simplified. This can improve performance and aid in code readability.
My lab makes use of some high voltage (±250V) switching electronics for conducting ultrasound research. Recently we ran into some trouble resulting in a number of channels getting damaged. We believe that this occurred due to the short circuiting of some channels. Since they will need to be replaced anyways it seemed appropriate to make some very minor hardware revisions.
One of the current limitations of the hardware is that switching times are much too slow for our applications. These can be improved by reducing the size of the output resistors used to mix the outputs of the Positive and Negative voltage rails together. However these resistors also serve to limit the current passed between power supplies in the event of a shorted channel.
These shorts can occur both internally to the circuit and externally due to the connected electronics. I will be examining the internal case today. While the best way to fix both cases is to add foldback current limiting to each channel the electronics are already very complicated and we would prefer not to make major modifications. Instead we can apply a patch in software. Before I explain lets first understand the problem leading to the internal shorts.
Shown above is a highly simplified diagram of a single output channel. The switches are controlled by means of a 2 to 4 decoder; given 2 bits of input 1 of the 3 available output states is chosen. There is also an additional state where all switches are open. Assuming the switches are ideal only one can ever be closed at any given time.
Unfourtunately, nothing is ever ideal. In practice this means that one switch will always open or close faster than the others. At a transition between states this will result in a short period of time where the two power supplies are connected and the current between them is limited only by the pair of resistors shown. As stated previously, these must be reduced to increase the circuit's operating speed.
One way to minimize this effect with only a small amount of additional
software delay is to insert a very short transition to ground between
any transition between VPP
and VNN
. At first glance this may
seem useless since this will simply limit the resistance to R between
a voltage rail and ground instead of 2R between the two voltage
rails. However, in our use case the load is highly capacitive. At
a transition between VPP
and VNN
the current to the second rail
will be both the current that passes between the supplies and an
additional spike from trying to discharge the load. By inserting a
ground state in the middle the load can mostly discharge and thus
the current driving it to the next state will be reduced.
Before stating the problem in a more formal way I need to define the 4 states available to each output channel.
Due to a hardware implementation detail the states are defined as follows:
HIZ |
VPP |
VNN |
GND |
---|---|---|---|
00 |
01 |
10 |
11 |
Here HIZ
represents the additional state where all switches are open.
Given a channels current output state and knowledge of the next output state should the channel first transition to ground?
The following outputs are required:
VPP
or VNN
and the next state
is the negative of the current state then the channel must first
go to GND
.VPP
or VNN
and the next state is the
same then the channel must not go to GND
.HIZ
or GND
then it
doesn't matter if the channel goes to GND
.With these requirements stated a first pass can be taken at writing the code that implements this.
Before giving the code let me note that each channel needs only 2 bits of information to define its state. Since the smallest unit of storage available in most computer architectures is an 8 bit integer each byte will represent 4 output channels. Lets look at the code:
1enum { HIZ = 0, VPP = 1, VNN = 2, GND = 3 };
2uint8_t get_interim_pattern(uint8_t prev, uint8_t next) {
3 uint8_t interim = next;
4 for (uint8_t i = 0; i < 8; i += 2) {
5 bool gnd = (prev & (3 << i) == GND) || (next & (3 << i) == GND);
6 bool hiz = (prev & (3 << i) == HIZ) || (next & (3 << i) == HIZ);
7 bool trans = (((prev & (3 << i)) + (next & (3 << i))) >> i) == 3;
8 if (!gnd && !hiz && trans)
9 interim |= (GND << i);
10 }
11 return interim;
12}
First, the interim pattern is initialized to the next pattern. Then
each pair of bits stored in the 8 bit integers is looped through
and the previous and next states are analyzed. Each pair is checked
to see if the current or next state is either GND
or HIZ
and
the result is stored in some temporary variables. The magic number
3 here, represented in binary as 11
, is a bitmask used for checking
the value of 2 bits at a time. Next the states are checked for a
transition between VNN
and VPP
. Since the summation of their
two binary representations is 3 the result of their addition can
be checked for equality with 3. Some care just has to be taken to
make sure that their summation ends up in the lowest two bits. This
is achieved with the right shift by i
. Finally all three results
can be combined to determine if that channel should transition to
GND
prior to going to the next state. If the channel should go
to GND
the two relevant bits are set using an OR
operation.
This calculation is obviously quite complicated and should ideally
be optimized. Looking at it again as I write this, I realize that
the checks for HIZ
and GND
are not necessary and are covered
by the check for a transition. However, even if those checks were
removed this would still not be ideal since the final check will
always be limited to operation on 2 bits at a time. This is because
an addition where the result is greater than 3 will result in a
carry that will effect the result of the adjacent pair of bits.
Instead of trying to optimize this further it may be beneficial to
try another approach.
Given the problem statement as shown previously, this problem can
be thought of as a 4 bit boolean (yes/no) function. The 4 bit inputs
are the 2 bits associated with the previous state and the 2 bits
associated with the next state. The outcome is a true/false (yes/no)
used for determining if the channel should first transition to
GND
. Additionally there are many pairs of previous and next states
where the intermediate state doesn't matter. This is a prime example
of a problem where a Karnaugh Map can be used to simplify the boolean
expression. Thus this problem can be summarized by the map shown
below:
N | 00 |
01 |
11 |
10 |
|
---|---|---|---|---|---|
P | |||||
00 |
X |
X |
X |
X |
|
01 |
X |
0 |
X |
1 |
|
11 |
X |
X |
X |
X |
|
10 |
X |
1 |
X |
0 |
Previous states are listed along the rows and Next states
are listed along the columns. Positions with a 1
represent the
pairs where a GND
must be inserted between states, positions with
0
represent pairs where a GND
should not be inserted, and finally
positions with an X
represent pairs where it doesn't matter if
GND
is inserted in the middle or not.
To find a simplified boolean expression we want to group together
either ones or zeros in the fewest number of groupings containing
a power of 2 cells. Groupings can only be made from connected
(adjacent) cells but the cells wrap around (think of bending the
map into a sphere). If we group ones together we will get the actual
boolean expression and if we group zeros together we will get the
complimentary expression. The most simplified expression will come
from the fewest number of the largest groupings and cells with an
X
can be assigned whatever value is convenient to achieve this
goal.
After trying a few different options I arrived at the following groupings of ones with everything else removed for readability:
N | 00 |
01 |
11 |
10 |
|
---|---|---|---|---|---|
P | |||||
00 |
X |
X |
|||
01 |
X |
1 |
|||
11 |
X |
X |
|||
10 |
1 |
X |
This gives a boolean equation of F = (!P0)(N0) OR (P0)(!N0)
which
is nice because it only depends on 2 out of the 4 available bits.
Actually this equation can be simplified further: the expression
is simply F = P0 XOR N0
. This leads to the first code simplification:
1enum { HIZ = 0, VPP = 1, VNN = 2, GND = 3 };
2uint8_t get_interim_pattern(uint8_t prev, uint8_t next) {
3 uint8_t interim = next;
4 for (uint8_t i = 0; i < 8; i += 2)
5 if ((prev & (1 << i)) ^ (next & (1 << i)))
6 interim |= (GND << i);
7 return interim;
8}
While the simplification above is good, it still operates on pairs of bits in the integer. Since the intermediate operations don't do anything that would have side effects, this can simply extend to whole 8 bit integers:
1uint8_t get_interim_pattern(uint8_t prev, uint8_t next) {
2 uint8_t gset = (prev & 0x55) ^ (next & 0x55);
3 return next | (gset * 3);
4}
There are a few new magic numbers here so let me explain.
First, the hexadecimal number 0x55
is a bitmask which keeps the
lower bit of each pair of 2 bits in the 8 bit integer. This is
applied to both the next and previous patterns, the results are
XOR
ed together, and the final result is stored in the variable
gset
. This variable stores a binary number of the form 0b0G0G0G0G
where the value of G
determines whether the channel associated
with the pair of bits should first transition to GND
or not.
Next gset
is multiplied by 3
. This number comes from some
knowledge about binary. If gset
is of the form 0b0A0B0C0D
then
a multiplication by 3
will produce a value of 0bAABBCCDD
. In
effect it duplicates each of the lower bits into their higher
positions. This is convenient because a channel can be set to GND
by an OR
with 0b11
. If the previous XOR
calculation determined
that the channel should be set to GND
then it's G
value will
be 1
. Multiplying by 3
will duplicate it into the second position.
On the other hand, if the value of G
was 0
the second bit will
also be 0
.
Finally the result is OR
ed with the next state to determine the
interim pattern. This pattern will be equivalent to next apart from
the channels which will go through a transition and are therefore
set to GND
.
A quick note on over-optimization. One might be tempted to eliminate
the multiplication since on some low power devices it will need to
be implemented with software emulation. For this specific situation,
where the intermediate result is of the form 0b0G0G0G0G
, the
following will multiply by 3:
1/* multiplies a number of the form 0b0X0X0X0X by 3
2 * the result will be incorrect if there is a 1 in
3 * any of the higher bit positions
4 */
5uint8_t soft_mul_by_three(uint8_t n) {
6 return ((n << 1) + 0x55) & n;
7}
On devices with either hardware multiplication or well optimized software routines this will lead to more generated bytecode. This is even true when the code appears inline and not as a separate function.
Update 2023-10-05:
A better way to multiply by 3 which gives the same amount of bytecode and is always correct, assuming the result doesn't overflow, is:
1uint8_t soft_mul_by_three(uint8_t n) {
2 return (n << 1) + n;
3}
A uint16_t
could be returned to avoid overflow.
This post explained how a Karnaugh map was used to take a complicated and slow 12 line function and reduce it to a fast 4 line function. When operating on large amounts of channels this method can easily be extended to operating on 32 or 64 bit integers. On the 32 bit microcontrollers we use in our lab this method can be applied to 32 bit integers giving a further 4x speed up.