#### Annual Report by Prof. Dr. Matthias Fertig



# 2019

This report summarizes the research and development activities that I carry out alongside my teaching duties as Professor of Computer Engineering at the Konstanz University of Applied Sciences.

- P.1 Universal Memory Automaton
- P.2 Massively parallel optical simulation
- P.3 Virtual Photonics Laboratory
- P.4 High-tech strategy: The SEPIA project
- P.5 Digital Microprocessor Design

## Universal Memory Automaton

and the development tool VERIGEN

Finite state automata use a state memory with direct and permanent access to determine the state transition and the output. Since many applications use a different memory organisation (e.g. Random Access Memory (RAM) or Content Addressable Memory (CAM)), a simple state memory does not suffice to implement such applications efficiently. The concept of *Universal Memory Automata (UMA, Fertig 2019)* extends finite state and pushdown automata with additional memory concepts such as queue, RAM, CAM and others. The theory of finite state and pushdown automata was extended by the necessary types of memory access, so that the content of one or several simultaneous memory accesses can be taken into account at the time of state transition and output generation. The concept was tested for a subset of the above-mentioned memory concepts as part of a bachelor's thesis, which was subsequently awarded the "MLP Award". UMAs allow for the efficient implementation of complex algorithms that use a range of different memory concepts. The state graphs of a UMA can be translated automatically into the hardware description language (HDL) SystemVerilog. Using the VERIGEN tool developed for this purpose, Universal Memory Automata were tested on the example of a cache coherency protocol. Such protocols are typically used in microprocessors.

Extended state transition condition of a Universal Memory Automaton.
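The idea of a transition condition that depends on memory content can be illustrated with a small sketch. It is neither VERIGEN output nor the formal UMA definition; it is a simplified MESI-style controller for a single cache line, written in Python, in which the next state depends on the current state, the input event and the result of a lookup in a toy content-addressable memory. The names `SharerCam` and `next_state` and the event strings are assumptions chosen for illustration.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


class SharerCam:
    """Toy content-addressable memory: is this address held by another cache?"""

    def __init__(self):
        self._lines = set()

    def insert(self, address):
        self._lines.add(address)

    def hit(self, address):
        return address in self._lines


def next_state(state, event, address, cam):
    """Return the next MESI state of one cache line.

    The guard for a processor read in INVALID consults the CAM, so the
    transition depends on (state, input, memory content) and not only on
    (state, input).
    """
    if event == "PrRd":
        if state is State.INVALID:
            return State.SHARED if cam.hit(address) else State.EXCLUSIVE
        return state
    if event == "PrWr":
        return State.MODIFIED
    if event == "BusRd":
        return State.INVALID if state is State.INVALID else State.SHARED
    if event == "BusRdX":
        return State.INVALID
    raise ValueError(f"unknown event: {event}")


cam = SharerCam()
cam.insert(0x40)                                         # 0x40 is cached elsewhere
state_a = next_state(State.INVALID, "PrRd", 0x40, cam)   # CAM hit
state_b = next_state(State.INVALID, "PrRd", 0x80, cam)   # CAM miss
```

With a plain finite state automaton, the same behaviour would require the sharing information to be passed in as an additional input signal; here it is read from memory as part of the guard.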



State graph of a Universal Memory Automaton for a cache coherency protocol (MESI protocol).



Sequential circuit of a Universal Memory Automaton.

VERIGEN delivers synthesizable HDL at the Register Transfer Level (RTL) and generates a simple verification environment for user-defined tests. A free version of the development tool is available online. Students attending advanced courses may use the development tool to implement what they have learned in class.

Prof. Dr. Matthias Fertig

http://www-home.htwg-konstanz.de/~MFERTIG/pages/verigen.html http://www-home.htwg-konstanz.de/~MFERTIG/files/tools/VERIGEN\_FREE.tar http://www-home.htwg-konstanz.de/~MFERTIG/files/publications/2019/UniversalMemoryAutomaton\_mfertig.pdf

## Massively parallel optical simulation

The Wave Propagation Method on a Graphics Processing Unit

The Vector Wave Propagation Method (VWPM, Fertig 2011) is a 3D computing method based on the Fourier analysis of electromagnetic vector fields. It can be used to compute the propagation of electromagnetic fields in inhomogeneous media, preferably systems with an optical axis. In comparison with the Finite Difference Time Domain Method (FDTD) or the Rigorous Coupled Wave Analysis (RCWA), VWPM shows clear advantages in regard to runtime and memory requirements while maintaining similar precision. While the Beam Propagation Method (BPM) does not allow for the exact computation of vector fields and the Wave Propagation Method (WPM) is limited to scalar unidirectional computations, VWPM delivers an efficient method for computing bidirectional vectorial field propagation without angular restriction in complex systems. These complex systems are composed of various optical elements and, for reasons of runtime and memory requirements, rule out the use of FDTD and RCWA. To increase the processing speed of the above-mentioned Fourier-analytical methods, the WPM for 2D and 3D systems was ported to a Graphics Processing Unit (GPU) in a first step.

3D distribution of complex refractive indices for the definition of the system to simulate.



Graphical User Interface (GUI) for simulations with Beam and Wave Propagation Methods, programmed with C++/Qt.

Porting the existing parallel code, which uses a Simultaneous Multithreading (SMT) paradigm on a general-purpose processor (CPU), to a massively parallel architecture such as a GPU has significantly improved the processing speed. For computation on the GPU, the system is first loaded into GPU memory. The field propagation is then computed by thousands of processing elements working in parallel. As part of selected courses, students can learn the required theoretical and practical skills and apply them in a bachelor's or master's thesis.
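Why a Fourier-based propagation step lends itself to massively parallel hardware can be seen in a minimal sketch. The following Python function performs one step in the spirit of the scalar Wave Propagation Method for a 2D system (one transverse axis): the field is decomposed into plane waves, and each output sample is reassembled using the local refractive index. It is not the GPU code described in this report and not the vectorial VWPM. The function name `wpm_step`, its parameters and the naive O(N²) transform are assumptions made for readability.

```python
import cmath
import math


def wpm_step(field, n_profile, wavelength, dx, dz):
    """Propagate a sampled scalar field by one step dz through n(x).

    field      complex samples E(x_j, z) on a uniform grid of spacing dx
    n_profile  real refractive index n(x_j) of the layer being crossed
    """
    n_samples = len(field)
    k0 = 2.0 * math.pi / wavelength

    # Transverse spatial frequencies in DFT order (positive, then negative).
    kx = [2.0 * math.pi * (m if m < n_samples // 2 else m - n_samples)
          / (n_samples * dx) for m in range(n_samples)]

    # Forward DFT of the input field (naive O(N^2) for clarity).
    spectrum = [sum(field[j] * cmath.exp(-1j * kx[m] * j * dx)
                    for j in range(n_samples)) for m in range(n_samples)]

    def output_sample(j):
        # Each plane-wave component is advanced with the local index n(x_j).
        total = 0j
        for m in range(n_samples):
            kz = cmath.sqrt((n_profile[j] * k0) ** 2 - kx[m] ** 2)
            total += (spectrum[m] * cmath.exp(1j * kz * dz)
                      * cmath.exp(1j * kx[m] * j * dx))
        return total / n_samples

    # Every output sample is independent of the others; on a GPU this loop
    # could be mapped to one thread per sample.
    return [output_sample(j) for j in range(n_samples)]


# Usage: an on-axis plane wave in a homogeneous medium only picks up a phase.
n_samples = 8
plane_wave = [1.0 + 0j] * n_samples
out = wpm_step(plane_wave, [1.5] * n_samples, wavelength=1.0, dx=0.25, dz=0.1)
expected_phase = cmath.exp(1j * 1.5 * 2.0 * math.pi * 0.1)
```

Because `output_sample` reads only the shared spectrum and the local index, the samples of a propagation step can be computed concurrently, which is the property a GPU port exploits.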

Prof. Dr. Matthias Fertig

http://www-home.htwg-konstanz.de/~MFERTIG/pages/gpu.html http://www-home.htwg-konstanz.de/~MFERTIG/pages/coptics.html http://www-home.htwg-konstanz.de/~MFERTIG/files/tools/gpu\_wpm\_2D.bin64 http://www-home.htwg-konstanz.de/~MFERTIG/files/tools/gpu\_wpm\_3D.bin64

## Virtual Photonics Lab

#### Design and optimization of nano-photonic semiconductor components

Since the development of photonic semiconductor components happens at the nanoscale and involves an expensive development process, neither multiple design iterations nor expensive prototypes are practicable. To ascertain the viability and performance of a proposed solution, a precise analysis of the application based on electromagnetic theory is required. A range of simulation methods is available, each particularly well suited to specific applications. The transition from theory to a working simulation tool and a suitable parametrized model of the solution is complex and requires advanced knowledge of theoretical optics and photonics as well as, to some extent, of the theory behind the simulation algorithm.



Spectral analysis of a ring resonator.

For teaching and development purposes, the virtual photonics lab utilizes two common simulation procedures to analyse and optimize three-dimensional electromagnetic field propagations for selected photonic elements. This means that new operating principles can be investigated without having to develop elaborate and expensive prototypes first. The virtual photonics lab only uses non-commercial tools. It draws on the Finite Difference Time Domain Method (FDTD) and the Vector Wave Propagation Method (VWPM), for instance to carry out field propagations and spectrum analyses.
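To give an impression of what an FDTD computation involves, the following sketch advances a one-dimensional Yee grid in free space using normalized units. It is a textbook-style illustration in Python and not one of the tools used in the virtual photonics lab; the function name `fdtd_1d`, the grid size, the Gaussian source parameters and the simple reflecting grid ends are all assumptions.

```python
import math


def fdtd_1d(n_cells=200, n_steps=100, source_cell=100, courant=0.5):
    """Advance a 1D Yee grid in free space and return the electric field."""
    ez = [0.0] * n_cells   # electric field at integer grid points
    hy = [0.0] * n_cells   # magnetic field at half-integer grid points
    pulse_center, pulse_width = 40.0, 12.0   # in time steps

    for step in range(n_steps):
        # Update E from the spatial difference of H.
        for k in range(1, n_cells):
            ez[k] += courant * (hy[k - 1] - hy[k])
        # Soft source: add a Gaussian pulse at one cell.
        ez[source_cell] += math.exp(
            -0.5 * ((step - pulse_center) / pulse_width) ** 2)
        # Update H from the spatial difference of E.
        for k in range(n_cells - 1):
            hy[k] += courant * (ez[k] - ez[k + 1])
    return ez


# Usage: after 100 steps the pulse has left the source in both directions.
ez = fdtd_1d()
peak_right = max(abs(v) for v in ez[120:141])
peak_total = max(abs(v) for v in ez)
```

Each time step touches every cell of the grid, which indicates why runtime and memory grow quickly for three-dimensional systems.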

Students are taught the basic theory of these procedures as well as how to use the simulation tool. For analysis and optimization, parametrized models are either provided or developed from scratch. Ideally, students should already have gained a sound understanding of the theoretical foundations of optics and photonics. However, they are provided the opportunity to familiarise themselves with the theory by attending additional courses.

Prof. Dr. Matthias Fertig

http://www-home.htwg-konstanz.de/~MFERTIG/pages/photonics.html http://www-home.htwg-konstanz.de/~MFERTIG/pages/optics.html



Narrow-band Gaussian beam in ring resonator.

## High-tech strategy: The SEPIA project

A Scalable and Efficient Processor Architecture for Intelligent Applications (SEPIA)

Microelectronics is a major source of Germany's innovative strength. As a leading industry location, Germany will thus continue to require comprehensive electronics expertise in both science and society. The strategy of the Federal Ministry of Education and Research (BMBF) and the framework programme "Mikroelektronik aus Deutschland – Innovationstreiber der Digitalisierung" (microelectronics made in Germany – innovating the digital transformation) have been adopted to significantly increase the creation of electronics value in Europe by 2025. One goal is to translate cutting-edge research into innovative and ready-to-use applications for all relevant industries.

The joint research proposal SEPIA (Scalable and Efficient Processor Architecture for Intelligent Applications), which was submitted to the BMBF for consideration in the above-mentioned framework programme, is based on the freely available instruction set RISC-V. Coordinated by Heidelberg University, a total of 13 universities and research facilities as well as seven industry partners formed a project consortium to realise their shared vision of a scalable modular processor architecture with accelerator parts for the following areas of application: High Performance Computing (HPC), Artificial Intelligence (AI) and Internet of Things (IoT).



Developing a scalable and modular microarchitecture based on an open-source instruction set will help to open up the market, which has been subject to increasing monopolization for decades, and to boost innovation in Germany. The industry partners' expertise across all fields of application as well as their long-standing know-how in the areas of processor development, connection technology and network architecture will be used to develop a versatile and powerful architecture.

The underlying aim is to promote open-source projects while at the same time allowing for economic exploitation. The RISC-V Foundation as well as up-and-coming start-up companies in the United States that use the RISC-V architecture have once again assumed a pioneering role in this regard. Suitable strategies for exploitation in Germany and throughout the European Union have yet to be devised.

Towards the end of 2019, the BMBF decided not to approve the SEPIA proposal for the next application round. Notwithstanding this setback, the project partners plan to take part in actively shaping events and to contribute to further enhancing Germany's position as an internationally acclaimed destination for research and development.

Prof. Dr. Matthias Fertig


https://www.elektronikforschung.de/ https://www.bmbf.de/de/elektroniksysteme-made-in-germany-850.html https://www.hightech-strategie.de/

## Digital Microprocessor Design

#### Design and verification of a RISC-V compatible DP Fused Multiply Add unit

RISC-V, an open-source and extensible instruction set, gives computer architecture new impetus for innovative solutions. In High Performance Computing (HPC) and parallel compute clusters, the Floating Point Unit (FPU) determines performance. Computing power, for instance, is measured in floating-point operations per second (FLOPS); highly efficient FPUs are therefore essential to HPC.

As part of a joint master's thesis supervised by Heidelberg University and HTWG Konstanz, a RISC-V 64-bit Fused Multiply-Add (FMA) unit with an efficient "Single Path" algorithm was developed and verified; it is able to execute a multiplication and an addition per cycle. The thesis implements and verifies an efficient data path for a 64-bit FMA instruction. To manage the large number of input values, the thesis uses the Universal Verification Methodology (UVM) and Specman e as part of a simulation-based, metric-driven verification (MDV) approach. UVM is an industry standard and enables users to design verification components that are modular and reusable.

The innovative approach introduced in the thesis uses and compares two tried and trusted reference models. The primary reference is the FPU of the verification system's Intel processor; the secondary reference is the SoftFloat environment developed at the University of California, Berkeley. The verification is transaction-based and uses the metrics "functional coverage" and "code coverage" in combination with the "constraint random" and "corner case" approaches for the generation of stimuli, all conforming both to the IEEE 754 standard and to the relevant RISC-V architecture standards. Coverage analysis reveals blind spots within the verification, which enables users to design a much more precise verification process. The entire methodology is compatible with the Cadence design environment and thus affords comprehensive opportunities for planning and analysing test environments. It further allows users to exploit additional opportunities provided by the hardware development environment.
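The principle of checking an FMA implementation against a reference model with corner-case and constrained-random stimuli can be sketched in a few lines. The sketch below is written in Python and is not the UVM/Specman e environment of the thesis; instead of the Intel FPU or SoftFloat, it uses exact rational arithmetic as a stand-in reference. The names `fma_reference`, `fma_unfused` and `check` are assumptions, and only finite operands are covered (the sign of zero, infinities and NaNs are not modelled).

```python
from fractions import Fraction
import random


def fma_reference(a, b, c):
    """a*b + c computed exactly, then rounded once to double precision."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def fma_unfused(a, b, c):
    """Stand-in for a faulty design: the product is rounded before the add."""
    return a * b + c


def check(dut, stimuli):
    """Return every stimulus for which the design under test deviates."""
    return [s for s in stimuli if dut(*s) != fma_reference(*s)]


# Corner case: the exact product 1 - 2**-60 is not representable as a double.
corner = (1.0 + 2.0 ** -30, 1.0 - 2.0 ** -30, -1.0)

# Constrained-random stimuli: finite operands of moderate magnitude.
rng = random.Random(0)
random_stimuli = [tuple(rng.uniform(-1e3, 1e3) for _ in range(3))
                  for _ in range(1000)]

mismatches = check(fma_unfused, [corner] + random_stimuli)
```

For the corner case, the fused result is -2⁻⁶⁰, whereas rounding the product first yields 0. A directed test of this kind exposes the difference immediately, while random stimuli alone may or may not hit it.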



Structure of the verification environment (source: F. Kaiser)

The developed simulation environment is capable of verifying ten million floating-point operations per hour against two reference models at the same time. To define suitable test scenarios, the IBM development tool FPGen was used. It organises all relevant tests for various FPU instructions within so-called buckets. The required tests for the FMADD instruction were designed by adapting FPGen. As part of selected courses, students are offered the opportunity to acquire the necessary knowledge and skills to develop and verify digital semiconductor components. To that end, HTWG Konstanz provides access to relevant development tools.

Prof. Dr. Matthias Fertig


https://ra.ziti.uni-heidelberg.de/cag/student-work/master-theses?layout=edit&id=192 http://www-home.htwg-konstanz.de/~MFERTIG/pages/asic.html http://www-home.htwg-konstanz.de/~MFERTIG/pages/sverilog.html http://www-home.htwg-konstanz.de/~MFERTIG/pages/digital.html



Konstanz, January 2020

Text and Layout: Prof. Dr. Matthias Fertig

Translation: Dr. Tullia Giersberg