Coron Wah-Pedal W-220

Now something completely different: Coron Guitar Effect pedals

Yea right! A couple of years ago I was doing music and like any other guitarist I have amassed a pile of guitar effects.

If you search the net for guitar effect pedals you’ll usually find tons of information including schematics and reviews. I have a wah which I wasn’t able to find much information on, so I decided to take it apart and do a technical review. The pedal was in need of a good clean-up anyway.

The Coron Wah-Pedal W-220.

So first I have to admit that I haven’t played it much. Wah is just not my thing. I mean – I totally love the sound, but I never got that hand/foot coordination thing going.

If you open it up you’ll see that its mechanics are quite sturdy and similar to more highly praised brands like Vox or Dunlop.


Also at the first look the circuit board looks like just another Vox style two transistor Wah. I wanted to find out exactly which schematic they’ve copied, so I dug a little bit deeper and I found:

It’s not a 100% clone! There is more going on inside. Quickly I drew out the schematic:

coron w-220 schematic

What you see here is mostly the two transistor Vox style Wah.

They use 2SC732 transistors. Nice high gain transistors with excellent hFE linearity. The inductor only has half the inductance usually found in Wah pedals, but that has been compensated by doubling C2 from 4.7µF to 10µF. Quality film capacitors for all remaining caps in the audio path. The big 220nF caps have for sure been a lot more expensive than ceramic disk style caps.

The Feedback Path

Remarkable however is the negative feedback path right at the bottom (R12, C6, R13).

This is something I haven’t seen in Wah circuits before and it’s quite clever: It is frequency dependent, so it linearizes the frequency response and lowers distortion a bit. The frequency dependency makes it much more pronounced in the low frequencies, so it kind of works like a presence control giving a tighter low-end. It also moves the wah bumps a bit up in the frequencies.

Here is a simulation of three pot positions with (green) and without (red) the feedback path:


Actual Traces

I have them as well: I did a small test by generating white-noise using my mbed LPC1768 board, dividing it down to around 0.3Vpp and running it into the Wah. I changed the pot values: Both extremes and some in-between values. The output was sampled using my PC sound-card and converted to a spectrum using FFT:


The output is noisy, true, but you can see that the bumps in the frequency response end up at roughly the same spots as in the simulation. Nice.

But wtf is this?

Somehow it makes me wonder if the Coron engineers actually knew what they were doing: Notice the resistors R12 (4.7k) and R13 (220k): They’re in series and have 10% tolerance. Whatever R12 is supposed to do, it has no effect because R12 is smaller than the 10% tolerance of R13 (22k). It’s not simply used as a jumper either, because there is no trace to jump over. This resistor is completely useless.

And while we’re at it: R9, the 47 Ohm resistor right after the power source, does nothing as well. It’s tiny compared to the 22k and 1k collector resistors, and these have a local bypass in C5 anyway. I could see the point if the pedal were powered from a wall-wart. In that case R9 and C5 would form a low-pass filter to get rid of power-supply hiss, but this pedal doesn’t even have a jack for external power. Only battery is allowed.

There are a few wtf’s on the trace side of the PCB as well, like needless ground loops and so on. But hey, not critical, it works, and doing PCBs in the ’80s was harder than today.

Measuring the Inductor

No technical review of a Wah would be complete without taking a look at the inductor: I measured mine as about 320mH with 12.8 Ohm series resistance. The bobbin is type EE-1916 and the core looks to be laminated steel. The copper wire is very thin. I haven’t measured it, but it looks like 0.1mm wire.

For measuring the inductor I used nearly the same setup as for taking the frequency response curves. It’s super simple and doesn’t need any special parts:

Here is how:

You take your inductor and a high quality capacitor of known value. You likely have an idea of what range the inductance is in, so pick a capacitor that results in a notch in the low audio range, around 1kHz or so.

Then connect these as a notch filter as shown below:


Now you use an audio editor like Audacity, generate a few seconds of white noise and play them into the filter while sampling from the output. Do a spectrum/FFT analysis of the sampled data and a very deep notch will appear. This is the resonant frequency of your LC filter.

To get from that frequency to the inductance just calculate as:

L = 1/(4 pi^2 C F^2)

And you’re done..
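The formula is easy to get wrong by a factor of a thousand when juggling nF and kHz, so here is a minimal C sketch of it. The function name and the example values in the note below (an 82nF capacitor and a notch at 982Hz) are made up for illustration; they are not measurements from the pedal.

```c
#include <assert.h>

/* Pi, spelled out so the sketch does not depend on M_PI being available. */
#define PI 3.14159265358979323846

/*
 * Inductance in henries from the resonant (notch) frequency of an LC filter:
 *   L = 1 / (4 * pi^2 * C * F^2)
 * c_farads: value of the known capacitor in farads
 * f_hz:     notch frequency read off the spectrum plot, in hertz
 */
double inductance_from_notch(double c_farads, double f_hz)
{
    assert(c_farads > 0.0 && f_hz > 0.0);
    return 1.0 / (4.0 * PI * PI * c_farads * f_hz * f_hz);
}
```

Calling inductance_from_notch(82e-9, 982.0) gives roughly 0.32H. Keep everything in base units (farads, hertz, henries) and convert only for display.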

Wrap up

The Coron W-220 is not a straight clone of the Vox Wah but a variation of it: it’s tighter in the low registers. This may be just what you’re looking for. In a band context you don’t want to compete with the bass player, and if your guitar or rig is already dark-sounding to begin with, this Wah may be a good fit as well.

Build quality is on par with flagships like Vox and Dunlop. They didn’t try to squeeze the last penny out of the parts cost.

These cheap ’80s guitar pedals from Asia tend to have a bad reputation overall. Sometimes this is justified, especially for the Boss-style knock-offs with the plastic housing (does yours still have a working battery compartment?). Sometimes they’re pretty good though, and always worth a try.

Oh, for completeness’ sake, here are the images I used to trace the schematic. You can directly overlay them to see the components and traces at the same time:




LPC Microcontroller flash checksum utility

I wrote a utility to fix something annoying in my micro-controller work-flow.

If you’re coding for the LPC17xx micro-controllers under Linux using OpenOCD and the GNU tool-chain you’re likely familiar with the following message during flashing:

Warn : Verification will fail since checksum in image (0x00000000) to be written to flash is different from calculated vector checksum …

This message won’t prevent you from flashing a working image because OpenOCD will calculate the checksum on the fly. It will however prevent you from verifying the data, because verification is done against the unpatched binary.

If you’re working with mbed and the external compiler tool-chain you’ll likely have seen this output a thousand times:

***** You must modify vector checksum value in *.bin and *.hex files.

This is where my lpcpatchelf utility comes into play:

It will process an ARM binary in ELF format and fix the checksum. This gets rid of the messages and allows OpenOCD to verify.

The source-code is up on my new github:

To use it, compile it and put it somewhere in your path. (You may have to install the libelf-dev package which I need to manipulate the elf files).

Once done you can process the files by just calling:

lpcpatchelf -f mybinary.elf

right after the linker call in your makefile.

This calculates the valid checksum and updates the elf file in place. Once done you can then use objcopy to get your .bin or .hex files if needed.

The utility will work for LPC17xx and most other LPC micro-controllers by default. Some LPC micro-controllers have a slightly different algorithm though. For example, in the LPC20xx family the checksum is calculated in the same way but placed at a different position.

You can define this position explicitly via the -c option. For example, this call will patch the elf file for the LPC20xx family:

lpcpatchelf -c 5 -f mybinary.elf
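For the curious, the checksum rule itself is tiny. The following is a minimal C sketch of it, not the actual lpcpatchelf source: the function name is hypothetical, it works on a plain array instead of an ELF section, and it assumes the usual rule that the first eight vector table entries must sum to zero, with the checksum in entry 7 on the LPC17xx and in entry 5 for the -c 5 case.

```c
#include <assert.h>
#include <stdint.h>

/* Number of 32-bit vector table entries covered by the boot ROM checksum. */
#define LPC_CHECKED_VECTORS 8u

/*
 * Patch the checksum slot so that the first eight vector table entries
 * sum to zero (modulo 2^32). `slot` is the index of the checksum entry:
 * 7 for the LPC17xx layout, 5 for the layout selected with -c 5.
 * Returns the checksum value that was written.
 */
uint32_t lpc_patch_vector_checksum(uint32_t vectors[LPC_CHECKED_VECTORS],
                                   unsigned slot)
{
    uint32_t sum = 0;
    unsigned i;

    assert(slot < LPC_CHECKED_VECTORS);

    for (i = 0; i < LPC_CHECKED_VECTORS; i++) {
        if (i != slot)
            sum += vectors[i];          /* unsigned wrap-around is intended */
    }
    vectors[slot] = (uint32_t)0 - sum;  /* two's complement of the sum */
    return vectors[slot];
}
```

The slot that gets overwritten is a reserved vector, so patching it does not change the behaviour of the program.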

I hope you find it as useful as I do. If you have questions just drop a comment below..


SWP Tracer/Sniffer

It’s possible to modify the SWP Transceiver front-end circuit presented earlier into a fully functional SWP Tracer/Sniffer.


This is almost the same circuit as the SWP transceiver but I moved some parts around to make the signal flow a bit clearer.

I don’t drive the SWP_TX signal myself anymore. Instead the SWP_TX becomes an output which is directly connected to a SWP master (aka NFC controller chip):

The SWP slave / SWP sim part remains the same, and so does the TX/RX signal splitter circuit.

The trick to get it working is to feed-back the extracted RX signal as a current-sink on the SWP master side.

Here it’s done with a fast BFS20 transistor as a switch.

R1 defines the current. At a SWP signaling voltage of 1.8V the 1.8kOhm resistor sinks roughly 1mA of current. Due to VCE(sat) of Q1 the actual current is slightly lower, but that’s okay. The specification allows us to go down to 600µA. We’re on the safe side here.
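As a quick sanity check of that margin, here is a minimal C sketch of the sink current. The function name is hypothetical, and the 0.1V saturation voltage used in the note below is an assumption for illustration, not a value taken from the BFS20 datasheet.

```c
#include <assert.h>

/*
 * Current in microamps that flows through R1 while Q1 is switched on.
 * v_swp:    SWP signaling voltage in volts (1.8V here)
 * vce_sat:  saturation voltage of Q1 in volts (assumed, check the datasheet)
 * r1_ohms:  value of R1 in ohms
 */
double swp_sink_current_ua(double v_swp, double vce_sat, double r1_ohms)
{
    assert(r1_ohms > 0.0 && v_swp >= vce_sat);
    return (v_swp - vce_sat) / r1_ohms * 1e6;
}
```

With 1.8V, an assumed 0.1V across Q1 and 1.8k this gives roughly 940µA, comfortably above the 600µA minimum.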

R2 is the usual base resistor with C1 acting as a speed-up capacitor to improve the switching speed.

The IO voltages are 1.8V for SWP_TX and 3.3V for SWP_RX.

You can now connect SWP_TX and SWP_RX to a logic-analyzer/micro-controller and trace away.

For this circuit you can’t replace the parts with slower devices. The signal already takes a complete round-trip through the opamp and comparator. The overall propagation delay should be small enough not to cause any confusion on the SWP master side.

Happy Hacking!


SWP Reader – The Analog Part

I think it’s time to lift the curtain on what the analog part of my SWP reader project looks like. This is the exact same circuit that I’ve used in the last two prototypes. I’m going to describe the circuit block by block and show the whole thing at the very end.


There are a bunch of signals and voltages that connect to the circuit. Those are:

5V       main supply voltage, taken from USB
V+       9.5V, higher voltage to supply the opamps
V-       -4.5V, negative supply for opamp and comparator
SWP_TX   SWP transmit signal, 3.3V
SWP_RX   SWP receive signal, 3.3V
SIM_SWP  connects to the C6 pin of the SIM-card
DAC      supplies a reference voltage between 0 and 3.3V

Transceiver Front-End

This is the heart of the SWP transceiver. It takes the digital TX signal and sends it to the SIM-card while extracting the SWP RX signal by measuring the current drawn by the SIM. The architecture is built around the transimpedance amplifier circuit with some tweaks.

R1/R2 form a voltage divider that converts the incoming 3.3V signal to 1.8V. R2 is also doing double duty as a pull-down. The level-shifted signal directly feeds into the non-inverting input of U1.

The SIM-card SWP pin is directly connected to the inverting opamp input. Yes, I’m using the opamp input as an output here. Since negative feedback is present via R3, the voltage at the two inputs will always be very close to each other, so the SIM will always see the SWP_TX signal.

R3 and the opamp itself are where the magic takes place. Any current that is flowing into the SIM card will cause a voltage drop across R3, and we see this voltage drop at the output of the opamp along with the input signal added to it.

With a maximum SWP signaling current of 1mA we’ll see 1.8V for RX plus 1.8V for the TX signal. Here is a simulation screen-shot with a pulse-train of one-bits on the TX and alternating ones and zeros on the RX:

The spikes on the edges are caused by the parasitic capacitance of the SIM card and its socket. When switching, the charge stored in this capacitance causes a very short burst of current flow. This manifests itself as the spikes on the signal transitions. The ringing is not present at the SIM card terminal though.

If you want to substitute another opamp, make sure that the gain-bandwidth product and the slew rate are sufficient. You need a fast part. I would not go below 100MHz GBW and 100V/µs slew-rate. The LT1227 does a really good job here.
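The DC levels at the opamp output described in this section can be sketched in a few lines of C. This is an idealized model only (no spikes, no bandwidth limits), the function name is hypothetical, and the 1.8k value for R3 used in the note below is an assumption inferred from the "1mA gives 1.8V" figure, not read off the schematic.

```c
#include <assert.h>

/*
 * Ideal (DC) output voltage of the transceiver opamp.
 * v_tx:     level-shifted TX voltage at the non-inverting input, in volts
 * i_sim_a:  current flowing into the SIM-card SWP pin, in amps
 * r3_ohms:  feedback resistor R3 in ohms
 */
double swp_tia_output_v(double v_tx, double i_sim_a, double r3_ohms)
{
    assert(r3_ohms > 0.0 && i_sim_a >= 0.0);
    /* The feedback loop holds the SIM pin at v_tx, so the output has to sit
       higher by the voltage drop that the SIM current causes across R3. */
    return v_tx + i_sim_a * r3_ohms;
}
```

With v_tx at 1.8V, 1mA of SIM current and an assumed 1.8k for R3 the output sits at about 3.6V; with no SIM current it simply follows the TX signal.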

RX Signal Extractor

This part is straight forward. The mixed RX/TX signal gets converted back to a digital signal with the help of a comparator.


At first the R4/R5 voltage divider brings the mixed RX/TX signal voltage down into the safe range. This is necessary because the power-hungry LT1016 gets powered from the 5V rail instead of the (rather weak) V+. The reference signal from the DAC gets a bit of noise-filtering via R10/C5. Finally R7 provides a good deal of hysteresis for a clean output signal.

The conversion of the comparator output down to 3.3V level is a bit dirty but has worked fine so far. I just load the output of the comparator using R8. This reduces the output voltage to the required level and also provides some termination. If you want to be on the safe side you’d rather put a Zener diode clamp in here. R9 and C4 limit the slew-rate to something sane and reduce EMI.

Power Supply

Not much to see here. The power-supply is built around an LT1054 switched-capacitor voltage converter. That’s pretty much the same kind of chip as the ICL7660, MAX1044 and others. I used low ESR ceramic chip capacitors exclusively.

There is some ripple left on the generated voltages, but that is not causing any issues. The digital outputs look clean and communication over SWP works perfectly.


The SD103BW are very cool Schottky diodes by the way. Good spec and *cheap*. They also survive 1.5A peak current. Robust little buggers.


That’s all the analog stuff you need to talk SWP with a SIM-card. How you generate the SWP signals is up to you. I use a Xilinx CPLD for this which talks SPI to a micro-controller and drives/samples the SWP signals from this analog circuit. I’ll likely write about this another day.

If you want to build something upon this circuit don’t forget that you also need to control the power and reset line of the SIM.

The circuit, as is, has no issues reaching the full 1.69 Mbit/s data-rate of the SWP bus. You can even run it at a higher speed without degrading the signal much. The SIM-card that I’m testing with stops responding at about 2 Mbit/s (way out of spec) but the signals themselves still look fine.

For completeness’ sake, here is the entire schematic in one image with the required decoupling capacitors added:



SWP Reader Evolution

I’ve been working on my SWP reader for about a year now, so I think it’s a good time to dump some photos and show how the project evolved:

The first “proof of concept” prototype:

For this prototype I decided to stay with plain old through-hole packages and build the analog part in a modular way. The restriction to through-hole packages had a great influence on what parts I could use because most of the good stuff is in SMD these days.

On the top I’ve used a XuLA-200 FPGA board. A really nice breakout for the Spartan 3A FPGA family. The analog part consists of (left to right): LT1227 current feedback OpAmp as the main SWP transceiver, good old LM311 as a one-bit A/D converter (too slow for high-speed SWP, but good enough up to 400 kbit/s). On the right there is a local power supply based around the LT1054 to generate the supply voltages for the OpAmp.

This prototype worked right away and was great for doing the first data exchanges between the SIM and my PC. In the end I decided to abandon the FPGA in favor of a microprocessor with better connectivity to the PC side.

Entering Prototype 2:

The FPGA is gone and has been replaced by a Cortex-M3 CPU. The blue board is a mbed LPC1768 which is quite nice and easy to work with (and no, mbed doesn’t force you into their online-compiler anymore). The red board below is a Xilinx CoolRunner-II CPLD breakout from Dangerous Prototypes which I use to translate the data-stream from SPI to SWP and vice versa. Going from 200k FPGA gates down to 64 gates was not a big deal because most of the complex stuff is now running in software on the Cortex-M3. I even have plenty of gates and flip-flops available, so I can add some more shenanigans if I want to.

Finally all the analog stuff is now in SMD packages on the black board. The circuit is almost the same as in the first prototype except that I’ve upgraded the comparator from the LM311. Since I had to order samples from Linear anyway, I thought: “Let’s get one of the finest comparators as well”. This turned out to be a mistake because the chip was *way* too fast for my needs, so I had to slow it down with some external circuitry. In one of the next revisions I’ll change that chip to something cheaper and more sane.

Along with the stuff already seen I’ve also added a voltage tracker to power the SIM and a PWM to DC converter to set the comparator threshold voltage (both based on a good old and trusty LM358).

Here is another shot of the same board with USB connectors attached. Most of the wiring between boards is on the backside:



I was really happy with how prototype 2 turned out. It was working fine, however due to all the long connections running on the backside of the board the signal quality was questionable:

These are SWP zero-bits transmitted at around 1.7MHz, taken at the opamp output. You can clearly see how the clocks from the CPU and from the CPLD leak into the received signal. The signal was good enough not to cause any transmission errors though.

Nonetheless I decided to bring everything on one board. The end result is this:

The Single Board Prototype:

This was the first board I designed with KiCad and it turned out pretty good. Due to a bug in the version of KiCad I was using I missed two unrouted wires, so I had to patch the board. Nonetheless it’s working great. The micro-controller has been changed from LPC1768 to the low pin count version LPC1758. The analog circuit is still almost the same except that there is no PWM to DC converter anymore. The LM358 is still there but works exclusively as a voltage tracker for the SIM supply.

I’ve also added the ISO7816 interface, so the board is feature complete.

The Ethernet interface has not been assembled and probably never will. Due to the two air-wires I’ll do another revision of this board anyway, and I’ll change the Ethernet PHY chip from the QFN package (not shown, it’s on the back of the board) to something in LQFP package for easier hand assembly.

The next version will also get a new (cheaper) analog part. The chips from Linear are really nice and all, but they are so damn expensive. I’ll probably change the OpAmp to a TSH82 dual opamp, one half for the SWP transceiver job, and the other to power the SIM. The comparator could be a TS3011. These parts are much cheaper and will still be more than fast enough. I’ll also get rid of the LT1054 voltage converter because no supply other than 5V will be needed anymore.

So, that’s it so far. Right now I’m working on the software side of things. I’m busy porting the mbed based C++ code to C and the new micro-controller. I’ll probably do a FreeRTOS port as well but I haven’t decided on this yet.


Ultra simple ISO-7816 Interface

While laying out a PCB for my SWP reader project I realized that I had never tested the ISO-7816 (aka contact) interface. I probably forgot that because it’s not all that difficult and not that interesting, but I’d rather see it working before I order PCBs.

So I spent an hour or two on the internet looking for inspiration on how other people did it. There are lots of specialized chips for this purpose out there, but sourcing is always a problem and since it’s “just” a simple serial interface I was more interested in a simple hack that will work.

Turns out there are a lot of simple SIM/Smart-card readers out there that just do this, and they pretty much all look like this:


Here Q1 is running as an open collector driver with R1 as a pull-up resistor. This transistor will invert the signal, that’s why there is an additional inverter A1 in front of the base. R2 and C1 are the usual base-resistor and speed-up capacitor.

You’ll find variations of this basic circuit all over the net. Sometimes they omit the speed-up capacitor, sometimes you find buffers in the RX-UART path, but that’s it basically.

I heated up my soldering iron and gave this circuit a test drive, and lo and behold: It works as expected (aka good enough in practice).
So problem solved, move on.

Not so. In the middle of the night it came to me, that almost the entire circuit is unnecessary. What does it really do? On the Q1 collector we see a replica of the TX-UART signal. The SIM card IO pin (which is just an open collector IO-pin) is able to pull the signal down at will without causing a short to TX-UART.

RX-UART picks up this signal and echoes back either what comes from TX-UART or from IO. Neither pin is pulling up the signal, that’s what R1 is doing.

So how about this:


It’s working just the same, just faster and with fewer parts.

If TX-UART is transmitting and IO is listening, the signal will just pass through R1. If TX-UART stops transmitting, the UART will go into idle-state (logic high). This effectively ties R1 to VCC and we have exactly the same behaviour as with the more “complex” circuit.

If the SIM transmits something, SIM-IO just pulls the line down to ground. A bit of current will flow out of TX-UART, but that’s fine. Compared to driving an LED from a GPIO pin that’s nothing.

If you feel inspired to try this out, here is a short how-to:

  • Configure the UART on the micro-controller side for 9600 baud, 8 data bits, two stop-bits and even parity.
  • Power the SIM VCC pin, have reset low.
  • Apply a clock signal 372 times the UART baud-rate: 3.57 MHz.
  • Wait a little for the SIM/SmartCard to stabilize.
  • Raise the reset line to take the SIM-card out of reset.

And then watch the Answer to Reset (ATR) signature arriving at your micro-controller UART-RX pin. You’re now ready to implement the ISO7816-3 T=0 or T=1 protocol and do some real data-exchange. With practically any micro-controller and just a simple resistor.
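The clock arithmetic from the how-to, plus a first plausibility check on the ATR, can be sketched in C like this. The function names are hypothetical and nothing here touches real hardware. The check on the first ATR byte assumes the card uses the direct convention, where that byte (TS) reads as 0x3B.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Clock cycles per bit that a card uses right after reset. */
#define ISO7816_DEFAULT_CLOCKS_PER_BIT 372ul

/* Card clock in Hz needed for a given UART baud-rate. */
unsigned long iso7816_clock_hz(unsigned long baud)
{
    assert(baud > 0 && baud <= ULONG_MAX / ISO7816_DEFAULT_CLOCKS_PER_BIT);
    return baud * ISO7816_DEFAULT_CLOCKS_PER_BIT;
}

/* Returns 1 if the first ATR byte (TS) announces the direct convention. */
int iso7816_ts_is_direct(uint8_t ts)
{
    return ts == 0x3B;
}
```

For 9600 baud iso7816_clock_hz() returns 3571200, which is the 3.57 MHz from the list. If the first byte you receive is not 0x3B, double check parity, stop-bits and the clock before suspecting the card.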

Oh, by the way. The IO pin of the SIM is allowed to sink up to 500µA, so if you run into problems with stray capacitance just lower R1. Minimum values are:

  • 3.6k for 1.8V supply
  • 6.6k for 3.3V supply
  • 10k for 5V supply.
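These minimum values are just Ohm’s law with the 500µA limit. A minimal C sketch, using integer millivolts and microamps to avoid rounding surprises (the function name is hypothetical):

```c
#include <assert.h>

/*
 * Smallest pull-up resistor in ohms that keeps the current through the
 * SIM IO pin at or below i_max_ua when the pin pulls the line to ground.
 * vcc_mv:    level on the other end of R1, in millivolts
 * i_max_ua:  maximum allowed sink current, in microamps
 * The result is rounded up so the limit is never exceeded.
 */
unsigned long min_pullup_ohms(unsigned long vcc_mv, unsigned long i_max_ua)
{
    assert(i_max_ua > 0 && vcc_mv <= 100000ul);
    return (vcc_mv * 1000ul + i_max_ua - 1ul) / i_max_ua;
}
```

min_pullup_ohms(1800, 500), min_pullup_ohms(3300, 500) and min_pullup_ohms(5000, 500) reproduce the 3.6k, 6.6k and 10k from the list.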

NFC SWP Physical Layer – How it works

As I’m currently building an NFC-SWP reader device I have to tackle quite some challenges simply because there is no single chip solution out there that you can simply connect to USB and a SIM card. Most NFC Controllers can of course talk SWP, but they will not work as a simple and transparent bridge. Therefore I’ll design my own solution.

To do so, it is crucial to understand how the protocol works on the lowest level.

The SWP physical layer is quite a unique thing. It allows the NFC SIM-card and the NFC controller to exchange data at a rate of 1.7 megabit/second full duplex over just a single wire. And they not only transmit bidirectional data, they transmit a clock signal for synchronization as well.


How did they pull that off?

First have a look at the S1 signal. This is the signal that transmits the clock and the data from the NFC controller towards the NFC SIM. Each bit gets transmitted using a full cycle, and the pulse-width of the signal defines if a zero or one bit gets transmitted.

Here is a picture of the S1 signal transmitting a stream of zeroes:


Each bit starts with the rising edge. The voltage high duration of each cycle is 25% of the cycle length. This is interpreted as a zero-bit. Likewise, if the voltage high duration is 75% of the cycle length a one-bit gets transmitted. Again a picture of one-bits to illustrate this:


Extracting the clock and data from this signal is easy. Each clock-cycle starts with the rising edge of the signal. For the clock extraction the falling edge can simply be ignored.

Getting data-bits is easy as well. All you have to do to extract the bits from this signal is to take a look at the voltage at the middle of the cycle. In the images I’ve aligned the numbers denoting the bit-value to this place. I use this method when I debug SWP signals taken with a logic-analyzer. In a hardware-design it is probably not feasible because you’ll never know where exactly the middle of the cycle is until the cycle has ended. I don’t know for sure, but I bet they measure the durations of the high and low periods and extract the bit from that.
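Both views (sampling the middle of the cycle, or comparing the high time against the period) boil down to the same 50% decision. Here is a minimal C sketch of it; the function names are hypothetical and real hardware would additionally have to deal with tolerances and rise/fall times.

```c
#include <assert.h>

/*
 * Decode one S1 bit from the measured high time and the full cycle length,
 * both in the same unit (e.g. logic-analyzer samples or nanoseconds).
 * Nominal duty cycles are 25% for a zero and 75% for a one, so the
 * decision threshold sits at 50% of the cycle.
 */
int swp_s1_decode_bit(unsigned long high_time, unsigned long period)
{
    assert(period > 0 && high_time <= period);
    return (2ul * high_time > period) ? 1 : 0;
}

/* Nominal high time for a bit value, the inverse of the function above. */
unsigned long swp_s1_high_time(int bit, unsigned long period)
{
    assert(period >= 4);
    return bit ? (3ul * period) / 4ul : period / 4ul;
}
```

Because only the ratio matters, the decoder keeps working when the NFC-controller changes its clock rate from one cycle to the next.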

For completeness’ sake, here is a picture of a signal with some ones and zeros:


This is how one side of the communication works. I’ve simplified a bit and left out things like fall- and rise-times, voltage-levels, tolerances and so on. Also it is worth noting that the clock-rate is not fixed. The NFC-controller is allowed to change the clock rate at will as long as it stays within the allowed range.

How does the SIM transmit data?

Now we have seen how the NFC-controller talks to the SIM-card and how the synchronization works. But the SIM card probably wants to transmit data as well. How does this work?

The NFC-SIM can not transmit data by applying a voltage to the SWP link because the link is always driven by the NFC-Controller. However, the NFC-SIM can draw current from the SWP-link without interfering with the NFC-controller.

Take a look at one of the S1 signal images again. In each clock cycle there will always be a voltage high period. During these high periods the SIM can load the SWP link and draw some current. During the voltage low periods it can’t, because to draw a current a voltage must be present.

This leads to the fact that the signal from the SIM (S2) will always be modulated by the signal S1 generated by the NFC-controller.

To make things a bit easier to understand here is a picture of signals S1 and S2. The voltage domain is shown in blue while the current domain is shown in red.

First S1 transmitting some one and zero bits while S2 (the SIM) transmitting a stream of zeros:

Not much going on on the S2 signal. That’s what zero-bits look like.


And now S1 transmitting the same data again while S2 is transmitting a stream of ones:

On the S2 signal, one bits become a copy of the S1 bits.


And finally both signals transmitting a bunch of bits:

You probably already guessed it. One bits in S2 become copies of S1, zero bits are just a flat line.

The NFC-controller can read the S2 bit-stream by measuring the current consumption of the SWP link just before it generates the falling edge.
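The way S2 rides on top of S1 can be written down as a tiny idealized model in C. The function names are hypothetical, the model ignores any idle current the SIM may draw, and the 600µA decision level is the minimum logic-high current quoted in the levels section of this post.

```c
#include <assert.h>

/* Minimum current in microamps that counts as a logic high on S2. */
#define SWP_S2_HIGH_MIN_UA 600u

/*
 * Current drawn by the SIM at one instant. The SIM can only load the link
 * while S1 is high, so S2 is the S1 level gated by the bit the SIM sends.
 */
unsigned swp_s2_current_ua(int s1_is_high, int s2_bit, unsigned one_current_ua)
{
    assert(one_current_ua >= SWP_S2_HIGH_MIN_UA);
    return (s1_is_high && s2_bit) ? one_current_ua : 0u;
}

/* Controller side: current sampled just before the falling edge of S1. */
int swp_s2_read_bit(unsigned current_ua)
{
    return current_ua >= SWP_S2_HIGH_MIN_UA ? 1 : 0;
}
```

Sampling just before the falling edge guarantees that S1 is still high, which is the only time the current carries any information.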


Timing, Timing and Levels:

In the general case communication over SWP must be done at a bit-rate between 200 kilobit/s and 1 megabit/s. A SIM card may also announce the capability to go faster (up to 1.69 megabit/s) or slower (down to 100 kilobit/s).

And they mean it! While troubleshooting I’ve tried to run SWP at a slower clock-rate. It didn’t work at all. There is some wiggle room, but not much.

The S1 (voltage) is in practice a 1.8V digital signal regardless of the SIM-card supply voltage. The levels that define the high and low regions change somewhat with the different supply voltage classes, but if you provide a clean digital 1.8V signal you won’t run into issues.

In practice it seems that NFC SIM-cards are very forgiving about the voltage applied to the SWP-pin as long as you don’t exceed the supply voltage.

Note that these are ballpark figures, within 10% or so of the real thing. You’ll find the exact values in the specification ETSI TS 102 613 (you’ll find it via Google), chapter 7.1.3.

The S2 (current) signaling definition is much simpler. Independent of the class, a logic high is defined by drawing 600µA or more. The SIM should not draw more than 1mA though.

Why did they come up with such an oddball protocol?

To be honest, I don’t know exactly.

A couple of years ago, while I was working on a NFC middle-ware, I had good contacts with a NFC chip manufacturer. While on site, during lunch I asked exactly this question. The answer was more pragmatic than I expected:

During the stone ages of smart-cards the contact interface (aka the gold pads on your SIM that make contact with the phone) was defined. The same interface has been re-used by SIM-cards because SIMs *are* just small smart-cards.

Back then writing to smart-cards required a dedicated programming voltage. Nowadays no one needs the programming voltage anymore, so the pin was always unconnected. Since they wanted NFC functionality in their SIMs they repurposed this pin and use it for SWP now.

If they had two free pins we would probably have something much simpler. Doing stuff in the current domain has a price: It consumes power. Now SIM cards usually end up in mobile phones. Going low power and having a long standby time is a thing, so I’ve heard. Fortunately the SIM is not communicating over SWP most of the time.

There is one thing that I don’t understand at all though: Going current domain and doing three things over a single wire: All fine. But if you read the specification of SWP you’ll find a lot of details where some crazy timing constraints are required. Take the required bit-rates of up to 1 megabit per second for example, let alone 1.7 megabit/second in the high-speed case. I can’t come up with any use-case that even remotely needs such a high bandwidth.


About a failed Circuit Idea

I had a circuit idea in mind that I’ve never been able to try out. I’m working on a SWP reader device right now (that’s a device that should directly talk to NFC enabled SIM cards).

So recently while browsing through semiconductor lists I came across the TSH70 OpAmp. This is a great part: it’s much cheaper than the OpAmp I’m currently using and the specs just about meet my minimum requirements.

Also, contrary to the OpAmp I’m currently using, it’s a normal voltage feedback OpAmp. This allows for a new transceiver design that I came up with.


Here you’ll see the idea in action: The SWP-TX signal (1.8V digital logic) goes directly into the non-inverting input. The OpAmp circuit itself is just a voltage follower with a transistor booster (Q1).

The voltage seen at the C6 input of the SIM card should be identical to the SWP-TX signal, and in fact it is.

If the SIM card wants to transmit data it does so by drawing current (Rule of thumb: 1mA current equals a logic one). So the task is to measure current at high speeds.

If the SIM card draws current this current will be sourced by the Q1 emitter and not the OpAmp output. The majority of current that passes Q1 emitter is again sourced from the collector.

On the top of the circuit you’ll see Q2 and Q3. These form a current-mirror, i.e. whatever current is drawn from the Q2 collector will also be drawn from the Q3 collector.

Long story short: The current drawn from the SIM card will appear at the collector of Q3 (just mirrored). Adding a load-resistor R5 converts this current to a voltage and we can measure it.

And here is what it looks like in a spice simulation:


Blue is the control signal that controls the SIM current. If it’s high the current flowing into C6 is 1mA.

Red is the SWP-TX signal. I’ll show it along with the other curves so you can see that the actual voltage across the SIM does not affect the output much.

Green is the SWP-RX signal.

This green signal looks great eh? Nice, defined edges, just a little bit of ripple. Very low propagation delay. I could directly connect this signal to a micro-processor pin and start reading data.

Except it won’t work like this. I completely forgot to add some parasitic capacitance across the emulated SIM card.

Here is the same circuit with C1 added in. It’s just a tiny 10pF capacitor that should emulate the capacitance of the SIM card itself along with sockets and so on.


And this is the signal response after the capacitor has been added:


Now the received signal rings a lot, and the propagation delay went completely through the roof. It’s almost half as long as the pulse itself!

What happened? Once the capacitor has been charged and the SWP-TX signal goes back to zero, there is no quick way for the capacitor to discharge! Q1 can only source current to C1, not sink any. The only way to lose charge and lower the voltage across C1 is to slowly leak through R4. And this completely messes up the negative feedback loop of the OpAmp (not its fault!).

I could lower R4 to allow for faster discharge, but then again more current will flow through the transistors and I mess up my nice green output signal.

I could probably replace Q1 with a proper push-pull stage. That’s something I’ll try one day. Right now I’m staying with my “tried and trusted” SWP analog front-end. It has a different topology where the parasitic capacitance actually speeds things up! It requires the much more expensive current-feedback OpAmp I’ve mentioned, but it doesn’t show this defect.

Lesson learned: small parasitics can mess up things much more than expected.


DSP default cache-sizes not optimal?

While debugging some DSP code yesterday I came across a performance oddity. Adding more code lowered the performance of an unrelated function.

By itself this is not *that* odd. It happens if the size of your code is larger than your first level code-cache and different functions start to kick each other out of the cache. However, in my little toy program this was unlikely. I had only around 20kb of code and the code-cache is 32kb in size.

Better safe than sorry, I thought, and took a look at how the caches are configured. Big and pleasant surprise: Two of them are running at half the maximum size for no good reason:

In my case after DSP-boot I got:

Level 1 Data-Cache 32k
Level 1 Code-Cache 16k
Level 2 Cache      32k

However, the maximum possible cache sizes for the BeagleBoard are

Level 1 Data-Cache 32k (no change)
Level 1 Code-Cache 32k (16kb larger)
Level 2 Cache      64k (32kb larger)

So 48kb of valuable cache has been left unused. Changing the cache sizes is easy:

  #include <bcache.h>

  // and somewhere at the start of main()
  BCACHE_Size size;
  size.l1dsize = BCACHE_L1_32K;
  size.l1psize = BCACHE_L1_32K;
  size.l2size  = BCACHE_L2_64K;
  BCACHE_setSize (&size);

That still leaves you the 48kb of L1DSRAM for single cycle access and 32kb of L2RAM to talk with the video accelerators. Oh – and it gave a noticeable performance boost.

Btw- it’s very possible that this only applies to the DspLink configuration that I am using.


It turned out that the reason for the smaller cache sizes is the default DspLink configuration. You can override this if you add the following lines to your project’s TCF file. Just put them somewhere between utils.importFile(“”); and prog.gen():

prog.module("GBL").C64PLUSL2CFG  = "64k";
prog.module("GBL").C64PLUSL1DCFG = "32k";
prog.module("GBL").C64PLUSL1PCFG = "32k";

var IRAM = prog.module("MEM").instance("IRAM");
IRAM.len = 32768;

This will configure the OMAP3530 DSP with:

L2 Cache      64kb
L1 Data-Cache 32kb
L1 Code-Cache 32kb
L1DSRAM       48kb
IRAM (L2 RAM) 32kb

Faster Cortex-A8 16-bit Multiplies

I did a small and fun assembler SIMD optimization job last week. The target architecture was ARMv6, but since the code will run on the iPhone I tried to keep the code fast on the Cortex-A8 as well.

When I did some profiling on my BeagleBoard I got some surprising results: The code ran faster than it should. This was odd. Never happened to me.

Fast forward 5 hours and lots of micro-benchmarking:

The 16 bit multiplies SMULxy on the Cortex-A8 are a cycle faster than documented!

They take one cycle for issue and have a result-latency of three cycles (rule of thumb, it’s a bit more complicated than that). And this applies to all variants of this instruction: SMULBB, SMULBT, SMULTT and SMULTB.

The multiply-accumulate variants of the 16 bit multiplies execute as documented: Two cycles issue and three cycles result-latency.

This is nice. I have used the 16 bit multiplies a lot in the past but stopped using them because I thought they offered no benefit over the more general MUL instruction on the Cortex-A8. The SMULxy multiplies mix very well with the ARMv6 SIMD multiplies. Both of them work on 16 bit integers, but the SIMD instructions take packed 16 bit arguments and use both halves, while the SMULxy instructions use only one 16 bit half of each argument, and you can specify whether you want the top or the bottom half. Very flexible.

All this leads to nice code sequences. For example a three element dot-product of signed 16 bit quantities. Something that is used quite a lot for color-space conversion.

Assume these register values on entry:

              MSB16      LSB16

      r1: | ignored  |    a1    |
      r2: | ignored  |    a2    |
      r3: |    b1    |    c1    |
      r4: |    b2    |    c2    |

And this code sequence:

    smulbb      r0, r1, r2      
    smlad       r0, r3, r4, r0

Gives a result: r0 = (a1*a2) + (b1*b2) + (c1*c2)
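If you want to check what the two instructions compute without ARM hardware at hand, here is a small portable C reference model. It is only a sketch: the function names are made up for illustration, and the Q flag that SMLAD sets on accumulate overflow is not modelled (the sum simply wraps).

```c
#include <stdint.h>

/* Pack two signed 16 bit values into one 32 bit register image.
   Hypothetical helper, only used to build example inputs. */
uint32_t pack16(int16_t top, int16_t bottom)
{
    return ((uint32_t)(uint16_t)top << 16) | (uint32_t)(uint16_t)bottom;
}

/* Sign-extended bottom and top halfword of a register value. */
int32_t bottom16(uint32_t r) { return (int16_t)(uint16_t)(r & 0xFFFFu); }
int32_t top16(uint32_t r)    { return (int16_t)(uint16_t)(r >> 16); }

/* Model of: smulbb rd, rn, rm   ->  rd = bottom(rn) * bottom(rm) */
int32_t smulbb_ref(uint32_t rn, uint32_t rm)
{
    return bottom16(rn) * bottom16(rm);
}

/* Model of: smlad rd, rn, rm, ra
   rd = ra + bottom(rn)*bottom(rm) + top(rn)*top(rm)
   The sum is formed in unsigned arithmetic so that it wraps around
   instead of relying on signed overflow. */
int32_t smlad_ref(uint32_t rn, uint32_t rm, int32_t ra)
{
    uint32_t sum = (uint32_t)ra
                 + (uint32_t)(bottom16(rn) * bottom16(rm))
                 + (uint32_t)(top16(rn) * top16(rm));
    return (int32_t)sum;
}

/* The two-instruction sequence from the text:
   r0 = (a1*a2) + (b1*b2) + (c1*c2) */
int32_t dot3_ref(uint32_t r1, uint32_t r2, uint32_t r3, uint32_t r4)
{
    int32_t r0 = smulbb_ref(r1, r2);   /* a1*a2, top halves ignored */
    return smlad_ref(r3, r4, r0);      /* + b1*b2 + c1*c2 */
}
```

For example, with a1=2, a2=3, b1=4, b2=6, c1=-5 and c2=7, the call dot3_ref(pack16(0, 2), pack16(0, 3), pack16(4, -5), pack16(6, 7)) returns 6 + 24 - 35 = -5, whatever is stored in the top halves of r1 and r2.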

On the Cortex-A8 this will schedule like this:

    smulbb      r0, r1, r2          Pipe0
    nop                             Pipe1   (free, can be used for non-multiplies)
    smlad       r0, r3, r4, r0      Pipe0
    nop                             Pipe1   (free, can be used for non-multiplies)
    blocked, because smlad is a multi-cycle instruction.

The result (r0) will be available three cycles later (cycle 6) for most instructions. You can execute whatever you want in-between as long as you don’t touch r0.

Note that this is a special case: The SMULBB instruction in cycle0 generates the result in R0. If the next instruction is one of the multiply-accumulate family and the register is used as the accumulate argument, a special forwarding path of the Cortex-A8 kicks in and the result latency is lowered to just one cycle. Cool, ain’t it?

Btw: Thanks to Måns/MRU. He was so kind and verified the timing on his beagleboard.
