The Lempel-Ziv (LZ) compression methods are among the most popular algorithms for lossless data compression. LZ methods use a table-based compression model in which table entries are substituted for repeated strings of data.

LZRW3 is a well-known LZ-type algorithm, developed by Ross Williams in the 1990s, which offers a useful combination of high throughput and good compression performance. It also has the advantage of being very efficient to implement in hardware, unlike the majority of compression algorithms, which tend to favour software implementations.
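To illustrate the general idea of substituting table entries for repeated strings, the following is a minimal Python sketch of a table-driven LZ scheme. It is a deliberately simplified toy and not the LZRW3 format: it uses an exact dictionary lookup on 3-byte strings instead of a hash, and emits (distance, length) copy tokens rather than a packed byte stream. The names `toy_compress` and `toy_expand`, the token layout, and the 3 to 18 byte match limits are all assumptions made for illustration.

```python
def toy_compress(data: bytes, min_len: int = 3, max_len: int = 18):
    """Return a list of tokens: ("lit", byte) or ("copy", distance, length)."""
    table = {}   # maps a min_len-byte string to the last position it was seen at
    tokens = []
    i = 0
    while i < len(data):
        key = data[i:i + min_len]
        candidate = None
        if len(key) == min_len:
            candidate = table.get(key)
            table[key] = i            # remember this position for later matches
        if candidate is not None:
            # Extend the match as far as the data (and max_len) allows.
            length = 0
            while (length < max_len and i + length < len(data)
                   and data[candidate + length] == data[i + length]):
                length += 1
            tokens.append(("copy", i - candidate, length))
            i += length
        else:
            tokens.append(("lit", data[i]))
            i += 1
    return tokens


def toy_expand(tokens) -> bytes:
    """Rebuild the original bytes from the token list."""
    out = bytearray()
    for token in tokens:
        if token[0] == "lit":
            out.append(token[1])
        else:
            _, distance, length = token
            for _ in range(length):   # byte-by-byte so overlapping copies work
                out.append(out[-distance])
    return bytes(out)
```

In this toy, repetitive input collapses into a few literal tokens followed by copy tokens, which is the effect that a real LZ implementation encodes compactly in its output stream.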

The Helion LZRW3 core implements the LZRW3 data compression algorithm in Altera FPGAs without the need for external memory. It is capable of handling data throughputs in excess of 1 Gigabit/sec, and is ideal for improving system performance and efficiency in data communications, networking and data storage applications.

General Description

The Helion LZRW3 core operates on one input block at a time (see later section for more information on block sizing). An LZRW3 operation is started whenever the “go” input to the core is asserted and the core is not busy. The data input length and mode inputs need to be valid during the go cycle, after which the processing starts; the core busy flag is asserted whilst compression or decompression is in progress.

Whilst the core is busy, data can be passed into the core for processing. The data input and output interfaces are byte-wide, and use associated ready/taken data flow control signals to transfer data to/from the core. Input data is pushed into the core by the user application, and output data is pushed out from the core to the user application.
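As a rough illustration of this style of flow control, the Python sketch below models a byte-wide transfer clock by clock. The exact signal semantics are an assumption here: the sender is taken to hold "ready" high alongside valid data, and the receiver to pulse "taken" in each cycle where it accepts a byte. The function name `push_bytes` is hypothetical, and the core's interface specification remains the authority on actual timing.

```python
def push_bytes(data: bytes, taken_per_cycle):
    """Simulate pushing bytes over an assumed ready/taken interface.

    taken_per_cycle: iterable of bools, the receiver's "taken" on each clock.
    Returns (bytes accepted by the receiver, clock cycles elapsed).
    """
    received = bytearray()
    index = 0
    cycles = 0
    for taken in taken_per_cycle:
        if index >= len(data):
            break                 # every byte has been transferred
        cycles += 1
        ready = True              # assumption: the sender always has data waiting
        if ready and taken:
            received.append(data[index])
            index += 1            # the next byte is presented on the next clock
    return bytes(received), cycles
```

A cycle in which "taken" is low simply stalls the transfer, which is how a data-dependent pipeline can hold off its data source without losing bytes.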

Once a whole data block has been processed, a single-cycle pulse is output on “done” to indicate that the core has finished and that the output length (in bytes) and status outputs are valid. The status output shows whether the compression attempt was successful or not. Further block processing can then be started.

The Helion LZRW3 core processes data at a peak rate of 1 byte per clock on the faster (uncompressed) interface; the slower (compressed) interface necessarily runs at a lower rate than this. The actual rate depends on exactly what is happening in the internal pipeline at any one time, which by the very nature of data compression/expansion is data dependent. As a generalisation, however, the data rate is typically 70-90% of this maximum, and the performance figures given in the tables below assume a typical throughput of 0.8 bytes per clock for both compression and expansion. Since this is a complex area, please feel free to contact Helion for full background on both compression and throughput performance of these cores.
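The relationship between clock rate and typical throughput can be sketched in a few lines of Python. The function name is hypothetical, and the truncation to whole Mbps is an assumption chosen because it reproduces the throughput figures quoted in this document; 1 Mbps is taken as 10^6 bits per second.

```python
def typical_throughput_mbps(clock_mhz: float, bytes_per_clock: float = 0.8) -> float:
    """Typical throughput in Mbps for a given clock rate in MHz.

    0.8 bytes per clock is the typical figure assumed in this document;
    the peak rate on the uncompressed interface is 1 byte per clock.
    """
    bits_per_byte = 8
    return clock_mhz * bytes_per_clock * bits_per_byte


# Example: a 199 MHz clock gives about 1273.6 Mbps, quoted as 1273 Mbps.
print(int(typical_throughput_mbps(199)))
```

At the peak rate of 1 byte per clock the same 199 MHz clock would correspond to 1592 Mbps, so the typical figure is 80% of the peak.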

Logic Utilisation and Performance

The tables below show typical logic area and performance figures for each version of the core in the latest Altera Stratix device families. All figures shown are for versions of the core supporting a maximum block size of 2K bytes.

<table>
<thead>
<tr>
<th></th>
<th colspan="2">LZRW3 Comp</th>
<th colspan="2">LZRW3 Exp</th>
<th colspan="2">LZRW3 Comp/Exp</th>
</tr>
</thead>
<tbody>
<tr>
<td>technology</td>
<td>StratixIII -C2</td>
<td>StratixIV -C2X</td>
<td>StratixIII -C2</td>
<td>StratixIV -C2X</td>
<td>StratixIII -C2</td>
<td>StratixIV -C2X</td>
</tr>
<tr>
<td>logic resource</td>
<td>880 ALMs 8 M9Ks</td>
<td>903 ALMs 8 M9Ks</td>
<td>844 ALMs 8 M9Ks</td>
<td>829 ALMs 8 M9Ks</td>
<td>980 ALMs 8 M9Ks</td>
<td>1021 ALMs 8 M9Ks</td>
</tr>
<tr>
<td>max clock</td>
<td>199 MHz</td>
<td>236 MHz</td>
<td>208 MHz</td>
<td>237 MHz</td>
<td>201 MHz</td>
<td>228 MHz</td>
</tr>
<tr>
<td>typ throughput</td>
<td>1273 Mbps</td>
<td>1510 Mbps</td>
<td>1331 Mbps</td>
<td>1516 Mbps</td>
<td>1286 Mbps</td>
<td>1459 Mbps</td>
</tr>
</tbody>
</table>

Please note: the cores are available with support for larger block sizes (see later). Area and performance figures are available from Helion on request for all these variants and for all device types and speed grades. Unfortunately, due to technical limitations, this core is not currently available in older Stratix devices or the Cyclone family.

LZRW3 – Choice of Blocksize

The generalised LZRW3 algorithm works on data streams with an unbounded history size. However, for a fast hardware implementation it is desirable to use local storage such as Altera TriMatrix Embedded RAM for the history, and therefore the history size must be bounded to a sensible value by introducing a maximum block size.

The Helion LZRW3 core supports maximum block sizes from 2K bytes upwards in power-of-two increments to 32K bytes. The next section lists the actual amount of RAM required for each blocksize; as you can see the number of RAMs used by the core increases significantly with the supported data block size whilst the logic area increases only slightly. For block sizes 16K and above, the larger M144K RAM blocks are deployed in addition to some M9K RAMs.

For packet-based systems with defined maximum packet sizes, the whole packet can be handled in a single LZRW3 operation. For stream-based systems, very long data streams must be divided into multiple blocks, each of which is compressed using an independent LZRW3 operation. For many applications suitably sized data blocks may already exist within the system, so the core can be sized accordingly. When defining the block size, a trade-off exists between the amount of local history storage required and the resulting compression efficiency, since longer blocks in general tend to compress better. Helion have modified the algorithm slightly (whilst maintaining full compatibility with the LZRW3 standard) to achieve better compression results for very short packets.
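For stream-based systems, the division of a long stream into independently compressed blocks might look like the following Python sketch. The function name `split_into_blocks` is hypothetical, and the 2K byte default is simply the smallest maximum block size supported by the core; how the resulting blocks are framed and delivered to the core is left to the user application.

```python
def split_into_blocks(stream: bytes, max_block_size: int = 2048):
    """Split a byte stream into consecutive blocks of at most max_block_size bytes.

    Each block would then be handled by its own independent LZRW3 operation.
    """
    if max_block_size <= 0:
        raise ValueError("max_block_size must be positive")
    return [stream[offset:offset + max_block_size]
            for offset in range(0, len(stream), max_block_size)]


# Example: a 5000-byte stream becomes blocks of 2048, 2048 and 904 bytes.
blocks = split_into_blocks(bytes(5000))
print([len(block) for block in blocks])
```

Because each block is compressed independently, no history is shared between blocks, which is why a larger block size generally improves the compression ratio at the cost of more local RAM.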

Please feel free to contact Helion to discuss the best choice of blocksize for your particular application.

Internal RAM Requirements

The utilisation of embedded RAM depends only on the chosen blocksize; the RAM requirements are the same for each variant of the core (compress-only, expand-only, compress/expand). The table below lists the required numbers of M9K and M144K RAM blocks for each supported LZRW3 blocksize in Altera Stratix III and Stratix IV technologies.

Note that the logic area also increases slightly for larger blocksizes, and the maximum supported clock rate may be affected. Please contact Helion for full details for the specific variant you are most interested in.

<table>
<thead>
<tr>
<th>blocksize (bytes)</th>
<th>2K</th>
<th>4K</th>
<th>8K</th>
<th>16K</th>
<th>32K</th>
</tr>
</thead>
<tbody>
<tr>
<td>number of M9Ks</td>
<td>8</td>
<td>10</td>
<td>15</td>
<td>7</td>
<td>8</td>
</tr>
<tr>
<td>number of M144Ks</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>1</td>
<td>2</td>
</tr>
</tbody>
</table>
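
For quick resource budgeting, the table above can be captured as a simple lookup, as in the Python sketch below. The names are hypothetical, and the nominal block capacities (9 Kbit per M9K and 144 Kbit per M144K, parity bits included) are an assumption about raw device RAM size; they indicate how much embedded RAM is reserved, not how many bits the core actually uses.

```python
# Nominal raw capacities of the embedded RAM blocks (assumed, parity included).
M9K_BITS = 9 * 1024
M144K_BITS = 144 * 1024

# blocksize in bytes -> (number of M9Ks, number of M144Ks), from the table above.
RAM_BLOCKS = {
    2 * 1024: (8, 0),
    4 * 1024: (10, 0),
    8 * 1024: (15, 0),
    16 * 1024: (7, 1),
    32 * 1024: (8, 2),
}


def nominal_ram_bits(block_size: int) -> int:
    """Total nominal capacity of the RAM blocks reserved for a given blocksize."""
    m9ks, m144ks = RAM_BLOCKS[block_size]
    return m9ks * M9K_BITS + m144ks * M144K_BITS


for size in sorted(RAM_BLOCKS):
    print(size, RAM_BLOCKS[size], nominal_ram_bits(size))
```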

About Helion

Founded in 1992, Helion is a well-established British company based in Cambridge, England, offering a range of product-proven Data Security IP cores backed up by highly experienced and professional design service capabilities.

Although we specialise in providing the highest performance data encryption and authentication IP, our interest does not stop there. Unlike broadline IP vendors who try to supply a very diverse range of solutions, being specialists we can offer much more than just the IP core.

For instance, we are pleased to be able to supply up-front expert advice on any security applications which might take advantage of our technology. Many of our customers are adding data security into their existing systems for the first time, and are looking for a little assistance with how best to achieve this. We are pleased to help with suitable advice and support where necessary, and pride ourselves in our highly personal approach.

In addition, our Design Services team have an impressive track record in the development of real security products for our customers; we are proud to have been involved in the design of numerous highly acclaimed security products. This knowledge and experience is fed back into our IP cores, to ensure that they are easy to integrate into real systems, and perform appropriately for real engineering applications.

Helion has a very long history in working with high performance FPGAs, so we take our Altera implementations very seriously indeed. Our cores have been designed from the ground up to be highly optimal in Altera FPGA; they are not simply based on a generic ASIC design like much of the competition.

Most Helion IP cores make use of Altera-specific architectural features; in fact, in many cases we build up custom internal logic structures by hand in order to achieve the very highest performance and most efficient logic resource utilisation. The benefits of this dedicated approach can be clearly demonstrated by direct comparison between Helion Data Security IP cores and the equivalents from other vendors.

More Information

For more detailed information on this or any of our other products and services, please contact Helion and we will be pleased to discuss how we can assist with your individual requirements.