Decode 5G NR Wireless Signals on an FPGA
Overview
Learn how to transition 3GPP 5G New Radio (NR) and other wireless communications algorithms to FPGA-based implementation by building a connected workflow and using hardware-proven IP and reference applications.
Highlights
- Workflow and methodology
- How to use 5G IP blocks
- Overview of the 5G NR Cell Search reference application
- Targeting Xilinx Zynq-based hardware
- Generating SystemVerilog verification components
- Example customer projects
About the Presenters
As a senior applications engineer, Jeff Miller focuses on supporting customers adopting HDL code generation and 5G/LTE technology. Customer projects have included HDL designs for high-performance FFT, FIR, matrix mathematics, encryption, custom floating point, and LTE receivers. Prior to joining MathWorks, Jeff worked at Applied Signal Technology doing signal intelligence, and at Morphics Technology doing commercial wireless communications. Jeff has a Master’s of Electrical Engineering from Georgia Tech and a Master’s of Education from the University of Arizona.
Jack Erickson is responsible for product marketing and product management for the HDL product family at MathWorks. Prior to joining MathWorks, he spent over 20 years at Cadence Design Systems, Inc., as an applications engineer and in product marketing for simulation, RTL synthesis, and high-level synthesis. He has a BSEE from Tufts University and an MBA from Worcester Polytechnic Institute.
Recorded: 8 Sep 2020
Hi, I'm Jack Erickson, and I'm here with Jeff Miller. Today we will be focusing on how to decode 5G New Radio wireless signals, deploy that on an FPGA, and interface with the verification team. In terms of the agenda, Jeff is going to start off outlining the workflow and the methodology we have set up here, as well as providing an overview of the 5G New Radio cell search reference application that comes with Wireless HDL Toolbox and how to use some of the 5G IP blocks that we have available. Then I will come back and describe how we target Xilinx Zynq-based hardware and generate verification components to help the production design team, and finally wrap up with some example customer projects. So for now, I'll hand it over to Jeff.
Thanks, Jack. So this is Jeff Miller here, and what I'd like to start off with is just a product overview of Wireless HDL Toolbox. It can be thought of as an add-on to HDL Coder in particular that provides more advanced communications HDL capabilities. That comes in two main forms: reference designs for the wireless standards, for OFDM in particular, and the IP blocks that you need to build up those receivers-- the forward error correction blocks (LDPC, Polar, Turbo, Viterbi), the OFDM IP blocks (OFDM Mod, Demod, Channel Estimation), and the utility blocks.
And I'll just show you quickly in the documentation and the libraries where you would find it. If you open up your Simulink Library Browser, you can go down to Wireless HDL Toolbox and see how it's organized into the forward error correction and the OFDM modulation. You can drill in here and see the Viterbi decoder, in particular, for LTE. The LDPC is a big IP block for 5G, as is Polar. Among the OFDM IP blocks here, LTE needs a custom FFT of 1536 points.
Here we also have generic OFDM modulators and demodulators. A huge value as well is the reference designs. If you look for cell search for 5G, and we'll get into this in a bit more detail later, there's actually a reference design that does 5G cell search.
And as a matter of fact, if we look at the examples from Wireless HDL Toolbox, we'll see those reference receivers for both 5G and 4G LTE built into the product. Each of those receivers comes with a very comprehensive write-up, not just of the Simulink HDL Coder model, but also of the wireless standard aspect of it: understanding things like PSS and SSS from the wireless standard perspective.
What I'd also like to do is just quickly review the motivation for having products like this and HDL Coder. From my background, I've spent the past decade doing probably 200-plus customer projects with MathWorks and customers that are taking signal processing designs, in particular communications designs, from where the algorithm engineer is typically most comfortable, which is the MATLAB environment, all the way to HDL implementation on an actual hardware board.
And so there's a kind of disconnect between the algorithm engineers, who are comfortable in MATLAB, working with floating-point data, being able to hold the whole array at once, and who understand the mathematical algorithm, and the hardware engineers, who have to be at the lower level of VHDL or Verilog. They have to be fixed point. They have to have a hardware architecture. And there is that gap between the two.
A product like this, and HDL Coder, enables the two teams and the two engineers to bridge the gap between those domains. So this is the challenge that motivates the workflow for both HDL Coder and Wireless HDL Toolbox. And here's how we go about addressing it.
With LTE Toolbox, 5G Toolbox, you can actually have a reference algorithm, be it for your whole receiver or just an individual component, like a Turbo Decoder or a Polar Decoder. In the MATLAB version, again you can hold the whole data at once, but from a hardware perspective, these things would typically be done serially, sample by sample. In terms of being able to compare a reference receiver IP block that's in a Simulink HDL Coder model, back to the MATLAB algorithm, one of the first things you're going to have to do is have a version that's sample-based. You're going to start needing a little bit more detail, things like data-valid control, other control signals, start of frame, end of frame.
And so the idea is that you evolve the algorithm from the original high-level behavioral understanding in MATLAB to the details required to actually implement it on hardware. Step one here is starting off with the algorithm in MATLAB. And then what makes this methodology go is that you can actually have a Simulink model run in parallel against MATLAB, and then compare the results against each other. That's what we'll see when we get into the reference design, and it'll be a little easier to follow in the IP block portion as well.
There are some things you have to think about as you add the detail, for example, a peak detector. A peak detector in MATLAB, when you can hold the whole waveform at once, is relatively straightforward. You can just get the global maximum and use that to get your alignment. In hardware, it has to be more serial based. You have to look over a certain window and then be willing to say something cleared a threshold, and call that a peak.
You also need more control signals. If you're sending one sample at a time to a Polar Decoder, you need to tell it where the start of the frame is, where the end of the frame is, and whether any individual sample is valid or not. So we're going to have to continuously add more detail, but we'll always be able to test back against the original MATLAB.
Fixed-point conversion is again another level of detail you need as you evolve from something that's behavioral to something that would generate efficient HDL. What we see here is a magnitude-squared computation, and what you can see is that before we go into the multipliers, we typically want the inputs to be 18-bit numbers, which map well to either Xilinx or Altera devices. When you actually do your mathematics here for fixed point, you typically do it at full precision. So if you square two 18-bit numbers you'd get a 36-bit result, and when you add them together for the magnitude you would get a 37-bit result.
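To make the contrast concrete, here is a minimal Python sketch, not the reference design's actual detector. It compares a frame-based global maximum with a sample-by-sample detector that only sees a short sliding window and declares a peak when the window's center sample clears a threshold and is the largest value in the window. The function names, the window length, and the threshold value are all assumptions chosen for illustration.

```python
from collections import deque


def global_peak(samples):
    """Frame-based view: the whole waveform is in memory, so take the global maximum."""
    return max(range(len(samples)), key=lambda i: samples[i])


def streaming_peaks(samples, threshold, window):
    """Serial view: one sample per step, with only `window` samples of history.

    A peak is declared when the sample at the center of the window exceeds
    the threshold and is the largest value currently in the window.
    """
    buf = deque(maxlen=window)
    mid = window // 2
    peaks = []
    for n, x in enumerate(samples):
        buf.append(x)
        if len(buf) < window:
            continue  # window not yet full
        center = buf[mid]
        if center > threshold and center == max(buf):
            # Index of the center sample in the original stream.
            peaks.append(n - (window - 1 - mid))
    return peaks


if __name__ == "__main__":
    corr = [0, 1, 0, 2, 9, 3, 0, 1, 0, 0]  # made-up correlation magnitudes
    print(global_peak(corr))
    print(streaming_peaks(corr, threshold=5, window=5))
```

Note that the streaming version reports the peak a few samples late (half the window), which is the kind of latency a hardware implementation has to account for.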
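As a rough illustration of those control signals, the following Python sketch turns a frame into a per-clock stream of (data, start, end, valid) tuples and then reassembles frames from that stream. The tuple layout, the function names, and the optional idle gap are assumptions for illustration; they are not the actual bus definition used by the Wireless HDL Toolbox blocks.

```python
def frame_to_stream(frame, gap=0):
    """Yield one (data, start, end, valid) tuple per clock cycle.

    `gap` idle cycles (valid=False) are inserted after each sample to mimic
    a source that does not produce data on every clock.
    """
    last = len(frame) - 1
    for i, d in enumerate(frame):
        yield (d, i == 0, i == last, True)
        for _ in range(gap):
            yield (0, False, False, False)


def stream_to_frames(stream):
    """Rebuild frames from the sample stream, ignoring cycles where valid is False."""
    frames, current = [], None
    for data, start, end, valid in stream:
        if not valid:
            continue
        if start:
            current = []  # a new frame begins on this sample
        if current is not None:
            current.append(data)
            if end:
                frames.append(current)
                current = None
    return frames


if __name__ == "__main__":
    stream = list(frame_to_stream([1, 0, 1, 1], gap=1))
    print(stream_to_frames(stream))
```

Round-tripping a frame through the stream and back is the same idea as testing the sample-based model against the original frame-based MATLAB reference.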
The methodology I teach customers is that in the arithmetic blocks themselves you always do it full precision, and then when needed you quantize back down to 18-bit numbers for calculations thereafter. And lastly, once that level of detail has been worked through, you can then use HDL Coder to generate efficient HDL for the design that you've built up. In the first portion of today's presentation, I'm going to go through a model that would generate efficient HDL, and in the second portion, Jack Erickson's going to come back on, and he'll actually show taking that generated HDL and deploying it on actual live hardware.
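The bit growth described above can be checked with plain integer arithmetic. This Python sketch assumes signed two's-complement 18-bit inputs and models the final quantization as simple truncation of the low-order bits; the helper names are hypothetical, and a real design might round or saturate instead.

```python
def fits_signed(value, bits):
    """True if `value` is representable as a signed two's-complement number of `bits` bits."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1


def mag_squared(i, q, in_bits=18):
    """Full-precision magnitude squared of an I/Q pair of signed `in_bits`-bit integers."""
    assert fits_signed(i, in_bits) and fits_signed(q, in_bits)
    ii = i * i  # 18 x 18 -> 36-bit product
    qq = q * q  # 18 x 18 -> 36-bit product
    assert fits_signed(ii, 2 * in_bits) and fits_signed(qq, 2 * in_bits)
    return ii + qq  # sum of two 36-bit values -> 37 bits


def truncate(value, in_bits, out_bits):
    """Keep the `out_bits` most significant bits by dropping low-order bits."""
    return value >> (in_bits - out_bits)


if __name__ == "__main__":
    worst = -(1 << 17)                 # most negative 18-bit value
    total = mag_squared(worst, worst)  # worst-case input pair
    print(fits_signed(total, 36), fits_signed(total, 37))
    print(truncate(total, 37, 18))     # back to 18 bits for later stages
```

The worst-case input shows why the sum needs the extra bit: it does not fit in 36 signed bits but does fit in 37.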
And so with that what I'll do is I'll start with a 5G cell search reference example. And let me point out some of the write up in the product. And so as I noted, this is a pretty thorough write up. It goes from the theory of primary synchronization search, secondary synchronization search, and then the model itself in quite a bit of detail.
We'll highlight a few different parts; obviously we can't do that exhaustively in a 40-minute presentation. But it also links to the extended version that actually targets a Zynq device, a reference example that builds on this one and puts the apparatus around it to handle the AXI registers and the AXI streaming, so that you can deploy this on a Zynq and actually run it on the live hardware. And for understanding the algorithm, it also provides MATLAB reference code for the algorithm itself.
So that's written up, and again here is the methodology of starting with MATLAB, moving to the more detailed version in Simulink, and comparing the two against each other. And with that I'm going to open up the model and give it a quick run. And I'll point out some of the highlights, but again this is one of our more involved reference designs.
So let's poke around a little bit here, now that we've got the results. What it's showing here, for the MATLAB reference algorithm, is what it detected: the cell ID, the quality measurement on the PSS correlation, and also the computed frequency offset. That's going to ultimately become a value into an NCO. And for each of them it's showing the Simulink implementation version's results as well, so that you can see that they align quite well.
It also gets into the correlation results. For either 5G or 4G, there are effectively three PSS correlations that you're going to run in parallel, looking at the quality of the peak. And you can actually see that plot here. OK, so you can see the two outputs and the correlations. You can see here that in both cases PSS 0 is clearly occurring and breaking the threshold. The threshold adjusts based on input power, so you can see that the threshold is actually going up and that the peak is clearing it.
In particular, in the Simulink model I wanted to go through some of the stuff that's in there; obviously it's a lot to follow. So if you look at some of the different inputs that are coming in here, here's your baseband IQ coming in, potentially from an A-to-D converter, and whether that input is even valid. We've done some of these on some of the different SDR boards, and depending on your carrier frequency and the quality of your oscillator, the frequency offset might be large enough, typically when it gets to the subcarrier spacing over two, that you actually have to make a coarse correction and iterate a bit.
So there's the opportunity to take in a carrier offset externally, or through some looping with the Arm. In this case here, you can potentially sweep over subcarrier spacings as well, and I can show you where in the model it's actually doing that. You can provide a coarse offset. When you get further down the path, especially when you start decoding things like the MIB, at that point you lock in the cell ID, you get alignment for that given cell ID, and you get the SSB.
Looking under the hood in here, right at the front end where we're doing the digital down conversion, as I noted there may potentially be a coarse offset. That can come from attempting to correct for the large carrier offsets on the board itself. So you can actually input here a frequency offset that drives an NCO, and then there's the mixer right there to make the adjustment. Depending on what the A-to-D rate is relative to the multiples of 30.72 megahertz that we're looking to lock into for the subcarrier spacings, there can potentially need to be some decimations.
One highlight in particular is in here. In this case, I believe it's just doing the 15 and 30 kilohertz spacings. So you're coming in with a waveform that's already been decimated down to the rate for the 30 kilohertz subcarrier spacing, and by adding another half-band filter, you're providing another waveform for the 15 kilohertz subcarrier spacing. And the idea is that the way we convey this rate change is that the wires themselves are actually always at the same rate.
It's the duty cycle on the valid signal that's actually conveying the rate change, and that's what enables this model, on the fly, to take an external input and then effectively tap off the correct waveform at the right sample rate. At that point, once you have the waveform correctly sampled, you can start doing things like PSS detection, where ultimately what you're doing is running correlators here, which are matched filters, as one would expect, against the three PSS correlation patterns, 0, 1, and 2, and you're looking for quality peaks.
When you get a quality peak, that gets you the timing alignment. And once you have timing alignment, then you can do the OFDM demodulation. And so if you were to look under here, this is where we have the OFDM demodulator IP block. That's going to remove the cyclic prefix and take the FFT. At that point you have the resource grid. That's really important, because the secondary synchronization search is much more efficient with the plus ones and minus ones in the frequency domain.
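The idea that the valid duty cycle, rather than the clock, carries the sample rate can be sketched in a few lines of Python. Here a trivial two-sample average stands in for the half-band filter, which is an assumption made to keep the example short; the point is only that the output produces one tuple per input clock, with valid asserted half as often.

```python
def decimate_by_two(stream):
    """Consume (data, valid) once per clock and emit (data, valid) once per clock.

    The output rate is halved by asserting valid on every second valid input,
    not by slowing down the clock. A two-sample average is a placeholder for
    a real half-band filter.
    """
    held = None
    for data, valid in stream:
        if not valid:
            yield (0, False)       # idle cycle passes through as idle
        elif held is None:
            held = data            # first sample of the pair: nothing to emit yet
            yield (0, False)
        else:
            yield ((held + data) / 2, True)
            held = None


if __name__ == "__main__":
    full_rate = [(x, True) for x in [2, 4, 6, 8]]
    half_rate = list(decimate_by_two(full_rate))
    print(half_rate)
    print(sum(v for _, v in half_rate) / len(half_rate))  # valid duty cycle
```

Because both streams run on the same clock, downstream logic can select either one without any clock-domain change, which is what lets the model tap off the waveform for a given subcarrier spacing on the fly.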
So that's a quick overview of the receiver design. Again, we can't do that comprehensively in a short presentation. But with that, I do want to talk briefly about the IP blocks. The IP blocks and their examples in the product work off the same methodology that we've been talking about. There's some notion of the correct answer, in this case for a Polar Decoder, and that can exist in the form of MATLAB code. And then under the hood in the Simulink implementation model, we have that IP block, and it's being driven by the same input that comes from MATLAB, with the control signals inserted.
And with that, let me show what this looks like from the documentation, and work through that. So if you actually go to the documentation of the Polar Decoder, there's a write up of the block. If you actually scroll to the bottom, it'll take you through the algorithm. And in particular, it also gives some performance specs on the latency, and then also the synthesis results taken to a particular Xilinx Zynq device are actually included in the documentation as well.
And for me, what really makes this easier to use and understand are these off-the-shelf examples that show how to use it correctly. So this example here shows how to synthesize input for a Polar Encoder, whose output then goes to a Polar Decoder, and those two Simulink models are tested against their MATLAB equivalents. So I'll go ahead and run these in action. And what we'll see here is a MATLAB script that actually drives both an encoder model and a decoder model and then checks them.
So here we can see it's opened up the encoder model. Now it's going to run the decoder. And so it's got a write-up here, and it's actually telling us if there are any differences as a result of the HDL implementation. In this case there are some optimizations taken from an implementation standpoint, but it is in fact comparing the two against each other. And if we go to the script here, this line right here is where we actually have MATLAB run a Simulink model-- in this case, on that line, the encoder-- and then we go ahead and look at the results and compare them against each other.
So just a quick summary of my portion-- we went through some of the core components of the product, which are the IP blocks, such as forward error correction, and the reference designs, 4G receivers and 5G receivers in particular. We covered a methodology for going from high-level MATLAB to a more detailed Simulink HDL Coder implementation. For the 5G reference design, we looked at cell search. Obviously, there is a more detailed write-up. And for the IP blocks, we went through a Polar Encoder and Decoder. With that I want to pass it back to Jack Erickson.
Thank you Jeff. Yeah, once you have simulated this streaming fixed-point hardware behavior, and verified that it matches the algorithm, you can deploy to an FPGA. I'm going to use the example that Jeff referred to earlier as the extended version that comes with the Zynq radio hardware support package. Basically, what I'll do here is generate HDL for the hardware subsystem, and the AXI register mapping to plug into a reference design. The hardware support package ships reference designs which define the board-specific mappings.
So when HDL Coder generates code, it can map the hardware I/O to either device I/O or to AXI registers to communicate with the rest of the system. This example also has a model of the software control layer that can be deployed to the Arm, but I won't cover that here. You'll notice that this design has been refined a bit for targeting. The hardware core is the same as what Jeff was showing; it's just a model reference to that design. The biggest difference is on the output. Instead of just streaming raw data out, this uses the valid signaling with some control logic to store and then serialize the report data.
And then this control logic packetizes it for an AXI stream output. When I bring up the HDL Workflow Advisor, the reference designs from the hardware support package are available for me to pick. Here, I'll use the ZC706 with the FMCOMMS module and only target the receive path for the reference design. I only have Vivado 2019.1 on my machines, so I'll ignore that warning; it's for setting up the embedded project. And here's where you map the I/O to the AXI registers in the reference design. This example has it set up already.
Then the target frequency depends on the baseband sampling rate. This design was built for a sampling rate of 61.44 megahertz. And then I can just have it generate the HDL and the IP core. I've already run it, so we don't have to wait, because it takes about 10 minutes to run. It generates the HDL and reports. And there are separate files for the core design, because in this design it was just model referenced.
And an IP core report for the top level. So that shows you how to integrate it into Vivado IP Integrator, if you're handing off to a larger project for integration. But if you follow this full example, it walks you all the way through generating and downloading the bitstream.
The other important part of interfacing with hardware projects is verification. If this application ends up being used in a larger chip, we can generate SystemVerilog components to help with chip-level verification. These components are generated C code wrapped with SystemVerilog's Direct Programming Interface (DPI), so they can be called just like SystemVerilog code. You can generate the design itself as a reference model, or generate the waveforms we used for testing.
And we can generate these from Simulink or from MATLAB. It really comes down to whether you want these components to be frame-based or streaming, fixed point or not, and so forth, for use in your test bench. For example, using the MATLAB waveform generator that comes with the cell search reference application, it's as simple as making fixes so the code is compatible with C code generation, and getting with the verification team to figure out what they want to be configurable in their environment.
In this case, we just made the cell ID, signal to noise ratio, and frequency offset all configurable. Then this dpigen command just generates it all. You have to define the data types for the input parameters, but it can figure out the output types. And that generates the C code and builds it into a shared library along with the SystemVerilog interface and example usage. And then you can just hand all that over to the verification team, and they can build it in.
So for more details, I encourage you to check out this short video, and your verification team will be happy if you share that with them as well. Finally, I'd like to point you to a couple of customer projects that have used Wireless HDL Toolbox. The first one is RF Pixels, who needed a baseband design to test the RF modules they're developing. And they saved themselves at least a year of engineering development by deploying the LTE version of the reference application onto an RFSoC. Most importantly, by using proven off-the-shelf IP, they could focus on their differentiation, which was the RF hardware.
The second one is CS Corporation in Korea. They were building a 5G RF repeater before we had the 5G version of this reference application. But because the reference application is a Simulink design that you can modify and test, they were able to modify our LTE version for 5G. And finally, there's the wireless communications research group at Intel Labs, who needed to create a working prototype of a wireless time-sensitive network for industrial applications, based on an 802.11ax transceiver. They built a working model in Simulink, generated the hardware and software components, and were able to quickly make changes and regenerate, saving themselves a lot of time.
So I know we've covered a lot of ground today, and with that it's difficult to get into a lot of depth, but we've left you with some pointers where you can learn more about these topics. The Wireless HDL Toolbox offers a tremendous amount of IP in the form of white box reference applications with full documentation. Wireless blocks and all of these designs have been proven on hardware in the field. It also offers methodologies and workflows and expert assistance from folks like Jeff to help you get up and running with standards content, so you can focus on your differentiation.