Matheus F. Pontes<sup>1</sup>, Clayton R. Farias<sup>2</sup>, Rafael B. Schvittz<sup>2</sup>, Paulo F. Butzen<sup>3</sup>, and Leomar S. da Rosa Jr.<sup>1</sup>

<sup>1</sup> Centro de Desenvolvimento Tecnológico, Universidade Federal de Pelotas (UFPel), Pelotas, Brazil

<sup>2</sup> Centro de Ciências Computacionais, Universidade Federal do Rio Grande (FURG), Rio Grande, Brazil

<sup>3</sup> Departamento de Eng. Elétrica, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, Brazil

e-mail: pnt.matheus@gmail.com

Abstract— Aggressive technology scaling has significantly affected circuit reliability. The interaction of environmental radiation with the devices in integrated circuits (ICs) may be the dominant reliability concern in advanced ICs. Several techniques have been explored to mitigate radiation effects and guarantee satisfactory reliability levels. In this context, estimating circuit radiation reliability is crucial, and it remains a challenge that has not yet been overcome. For decades, several different methods have been proposed to estimate circuit reliability. Recently, radiation effects have been more faithfully incorporated into these strategies to estimate circuit susceptibility more accurately. This paper overviews the current trends in estimating the radiation reliability of digital circuits. The survey divides the approaches into two abstraction levels: (i) gate-level approaches, which incorporate layout information, and (ii) circuit-level approaches, which traditionally explore the logic characteristics of the circuit to provide the radiation susceptibility of combinational circuits. We also present an open-source tool that incorporates several previously explored methods. Finally, current research aspects are discussed, covering newly emerging topics such as selective hardening and critical vector identification.

*Index Terms*— Reliability; Fault Tolerance; Radiation Susceptibility; Digital Circuits.

## I. INTRODUCTION

Advances in semiconductor manufacturing technology have made it possible to design transistors at nanometer scales. Some benefits can be highlighted: (1) the density of transistors on a single chip allows more processing power in increasingly smaller systems; (2) energy efficiency makes it possible for SoCs to be increasingly present in everyday life, popularizing concepts such as the Internet of Things. However, technology scaling also has drawbacks. Nanometer circuits are characterized by high operating frequencies, low voltage levels, and limited noise margins, which make them more susceptible to faults [1], mainly those caused by aging, radiation, and electromagnetic noise. As a result, circuits become less reliable and less tolerant to failures.

In this context, circuits manufactured at nanometric scales may be applied in situations requiring high reliability, such as medical, avionics, and space applications, among others, where a failure can put lives at risk or compromise a critical function. Thus, reliability becomes a significant concern for circuit designers as technology scales to a few nanometers [2]. Therefore, fault-tolerant strategies are required to improve circuit reliability. Several techniques have been proposed to deal with the reliability degradation caused by technology scaling [3]. These strategies lead to different tradeoffs between reliability and design penalties.

Evaluating circuit robustness after fabrication is a high-cost task. For this reason, probabilistic methods that estimate circuit reliability from the gate reliability have gained prominence. In addition, these methods are well suited to reliability analysis under multiple-fault scenarios.

Several methods to estimate the reliability of digital circuits are available in the literature. Even so, estimating the reliability of a circuit accurately is still an unsolved problem [4]. The methods can be divided into two groups. In the first, the methods estimate reliability through simulation and statistics: Stochastic Computation Model [5, 6] and Monte Carlo Simulation [7, 8]. In the second, the methods use probabilistic analysis to estimate the circuit reliability: Probabilistic Transfer Matrices Method (PTMM) [9, 10], Signal Probability Reliability (SPR) [11, 12, 13], Probabilistic Gate Model (PGM) [14, 15] and, recently, Neural Networks [16]. However, a known limitation of these methods is the simplifying assumption that all logic gates have the same error probability. The work proposed in [17] presents a transistor-level method to create the logic gate probabilistic matrices incorporating the radiation susceptibility. The resulting matrices show the importance of observing the transistor topology to produce more suitable matrices for the logic gates that feed the probabilistic methods.

The reliability estimation methods need to represent the logic gate behavior to handle logical masking. PTMM and SPR solutions use a matrix called the Probabilistic Transfer Matrix (PTM) to map the input and output probabilities [12]. In a PTM, the reliability and fault probabilities are considered for each input/output combination. However, research involving PTMs has traditionally used fixed values for the reliability of the logic gates. It has been shown that using fixed reliability values for all input/output combinations, regardless of the logic gate, overestimates the overall reliability [17]. Using fixed values implies the unrealistic assumption that every input vector of every logic gate has the same error probability.

This paper presents a survey on Reliability Estimation in Digital Circuits. We aim for three objectives: (i) give background, and present reliability estimation to the reader; (ii) offer a new taxonomy for the related works using a probabilistic and statistical method for reliability estimation; and (iii) give the community some insights about challenges and research directions in this area.

The paper is organized as follows. Section II introduces basic radiation concepts and the theoretical concepts associated with the circuit reliability estimation methods. Section III describes the methodology capable of extracting the radiation susceptibility to create logic gate matrices used in the circuit estimation methods discussed in Section IV. Section V contains methods that estimate the circuit reliability exploring statistical concepts. An open-source reliability estimation tool is introduced in Section VI. Finally, Section VII presents the concluding remarks.

## II. PRELIMINARIES

This section presents the preliminaries needed to follow the discussion. Radiation concepts and the importance of single event effects are explored first. Next, the reliability metrics are presented. Finally, probabilistic transfer matrices are introduced, as they are the basis of several reliability estimation methods.

### A. Radiation Effects

Radiation-induced effects have become one of the most critical failure effects in modern electronic devices. In past technologies, radiation effects were limited to hostile environments, such as space. However, with the advances in the fabrication process, the charge stored at the circuit node decreased dramatically due to the transistor and supply voltage scaling down [18]. Typical sources of ionizing radiation are cosmic radiation, the Van Allen radiation belts, nuclear reactors and explosions, and residual radiation from isotopes in chip packaging materials [19].

When a radiation particle strikes a logic circuit, it may induce a voltage glitch due to charge deposition, which can upset sensitive circuit nodes and propagate to the successive blocks. Electron-hole pairs are generated when energetic particles hit the transistor nodes, resulting in a generated charge that depends on the Linear Energy Transfer (LET) of the ionizing particle [20]. These events are called Single Event Transients (SETs). When they are captured by registers or occur directly in those storage elements, causing an error in the stored bits, they are called Single Event Upsets (SEUs) [21].

Estimating radiation susceptibility is crucial both in space and nuclear applications, because of highly energetic particles such as heavy ions and protons, and in terrestrial applications, since a small amount of charge can upset the digital data in modern technologies.

### B. Metrics

The reliability of a circuit $(R_c)$ is defined as the probability that the circuit operates correctly during a time interval. Its complement is called the probability of failure $(PF_c)$ and is given by $PF_c = 1 - R_c$.

The failure rate $(\lambda)$, presented in eq. 1, and the Mean Time Between Failures (MTBF), presented in eq. 2, are the most common metrics used for the reliability estimation of electronic systems. The failure rate indicates the number of failures that a circuit can present in one hour of operation. The MTBF reflects the mean time between failures in the circuit.

$$\lambda = -\ln(R_c) \tag{1}$$

$$MTBF = \frac{1}{\lambda} \tag{2}$$
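As a quick numerical illustration of eq. 1 and eq. 2, the sketch below computes the probability of failure, the failure rate, and the MTBF from a circuit reliability value. The value `r_c = 0.999` and the function names are hypothetical and chosen only for this example.

```python
import math


def failure_rate(r_c: float) -> float:
    """Eq. 1: failure rate obtained from the circuit reliability (0 < r_c <= 1)."""
    return -math.log(r_c)


def mtbf(r_c: float) -> float:
    """Eq. 2: mean time between failures, the inverse of the failure rate."""
    lam = failure_rate(r_c)
    return math.inf if lam == 0 else 1.0 / lam


r_c = 0.999                # hypothetical circuit reliability
pf_c = 1.0 - r_c           # probability of failure, PF_c = 1 - R_c
print(f"PF_c = {pf_c:.6f}, lambda = {failure_rate(r_c):.6e}, MTBF = {mtbf(r_c):.2f}")
```

Because eq. 1 is logarithmic, a reliability close to 1 gives a failure rate close to $PF_c$, so the MTBF is roughly $1/PF_c$ for highly reliable circuits.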

### C. Probabilistic Transfer Matrix

Probabilistic transfer matrices (PTMs) are used to represent the probability of success and failure of each input vector of a given logic gate [22]. A PTM maps the possible inputs to the respective outputs of a given circuit. To understand how a PTM is generated, it is first necessary to know the ideal transfer matrix (ITM), which represents the behavior of a logic gate or circuit in a fault-free scenario.

From the truth table of a given logic gate, it is possible to determine the ITM and, consequently, the output that is supposed to be correct. The PTM is then filled by assigning the chosen probability to that correct output. In the presence of faults, there are conditions under which the output can present an incorrect value. If it is known how frequently this happens, all possible conditions of the gate can be mapped by using a PTM. Fig. 1 shows how to generate the PTM of a NAND2 logic gate based on its truth table and ITM. In this case, the PTM considers that the correct output occurs with probability q. In the same way, the erroneous output occurs with the complementary probability, 1 - q.



Fig. 1: NAND2 PTM relation to ITM and Truth Table: a) Truth table b) ITM matrix c) PTM matrix
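The construction summarized in Fig. 1 can be sketched in a few lines. The example below assumes rows ordered by input vector (00, 01, 10, 11), columns ordered by output value (0, 1), and a single probability q shared by all input vectors; the function names and the value q = 0.95 are illustrative only.

```python
# Truth table of NAND2 for the input vectors 00, 01, 10, 11.
NAND2_TRUTH = [1, 1, 1, 0]


def itm(truth_table):
    """Ideal transfer matrix: a 1 marks the fault-free output of each input vector."""
    return [[1 - out, out] for out in truth_table]


def ptm(truth_table, q):
    """Probabilistic transfer matrix: the correct output has probability q,
    the erroneous output has probability 1 - q."""
    return [[q if col == out else 1.0 - q for col in (0, 1)] for out in truth_table]


for row_itm, row_ptm in zip(itm(NAND2_TRUTH), ptm(NAND2_TRUTH, q=0.95)):
    print(row_itm, [round(p, 2) for p in row_ptm])
```

Each PTM row sums to 1, since for a given input vector the output is either correct or erroneous.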

## III. RADIATION RELIABILITY AT GATE LEVEL

An accurate evaluation of circuit reliability is fundamental to allow a reliability-aware automated design flow, where the synthesis tool could rapidly cycle through several circuit configurations to assess the best option. The traditional standard cell design flow and most circuit reliability estimation methods use logic gates as the starting point [23]. Circuit reliability estimation methods usually neglect the difference between logic gates reliability [17]. Methods to analyze logic gates susceptibility in different abstraction levels are discussed in the following. These methods for assessing the susceptibility of logic gates to SET (Single Event Transient) explore transistor arrangement level, stick diagram level, and layout level.

As described above, the matrices representing logic functions consider only logical information. In this situation, details about the logic gate design are missed. To improve the circuit reliability estimation, the methods can receive as basic input logic gate matrices that consider how the logic functions are designed. This information can be obtained from the method proposed in [24], an analytical method that evaluates the logic gates based on the sensitive area and particle flux. The flowchart of the methods to estimate the logic gate susceptibility is depicted in Fig. 2. All of them start by evaluating which logic value is expected at the output, given an input vector. The logic value expected at the output defines which logic plane will be analyzed. The next step is to evaluate the sensitive plane, searching for the sensitive areas for the given input vector. By definition, sensitive areas are those that have a low-resistance conductive path to the output and are also reverse-biased PN junctions.



Fig. 2: Flowchart of estimating logic gate susceptibility [25]

With the number of sensitive areas computed, the next step is to compute the susceptibility for the evaluated input vector of the specific logic gate. At this step, the used input abstraction level (transistor arrangement, stick diagram, or layout level) provides a different level of complexity and accuracy. At the transistor arrangement level, the concept of fault as a probabilistic event is used to compute the number of events that cause a fault at the output based on the identified susceptible nodes of previous steps.

The stick diagram model also relies on the probabilistic definition of independent events for the occurrence of faults. The stick diagram model eliminates the imprecision introduced due to shared sensitive areas in parallel arrangement transistors. The initial steps are the same. The main difference is centered on the identification of the sensitive areas. Unlike transistor arrangements, where one node may represent more than a diffusion area and consequently more than one sensitive region, this information is precisely obtained in the stick diagram. From this perspective, adding the physical information increases the method accuracy at the cost of an increment of method complexity.

The method that uses the logic gate layout as input is considered the most precise. It introduces the information of the particle collision rate in the analysis. From the layout, it is possible to extract the exact area of the sensitive diffusions. The sum of the sensitive areas for each input vector, combined with the particle collision rate, provides a precise susceptibility value for the evaluated input vector and logic gate.

Table I.: Average Susceptibility (in $10^{-6}$) and standard deviation ($\sigma$) calculated by the models from 45nm library cell

| Gates    | Transistor Mean | Transistor $\sigma$ | Stick Mean | Stick $\sigma$ | Layout Mean | Layout $\sigma$ |
|----------|-----------------|---------------------|------------|----------------|-------------|-----------------|
| INVERTER | 1.98            | 0.00                | 1.98       | 0.00           | 1.98        | 0.57            |
| NAND2    | 2.47            | 0.99                | 2.47       | 0.99           | 2.49        | 1.09            |
| NOR2     | 2.47            | 0.99                | 2.47       | 0.99           | 3.10        | 1.64            |
| NAND3    | 2.96            | 1.49                | 3.21       | 1.47           | 3.11        | 1.82            |
| NOR3     | 2.96            | 1.49                | 3.21       | 1.47           | 4.13        | 2.32            |
| NAND4    | 3.33            | 1.87                | 3.46       | 1.84           | 3.31        | 2.11            |
| NOR4     | 3.33            | 1.87                | 3.46       | 1.84           | 4.68        | 2.95            |
| AOI21    | 2.96            | 1.06                | 3.70       | 1.96           | 5.05        | 3.15            |
| AOI22    | 3.09            | 1.24                | 3.83       | 2.33           | 4.84        | 3.30            |
| AOI211   | 3.58            | 1.48                | 5.06       | 2.16           | 7.12        | 3.39            |
| AOI221   | 3.58            | 1.46                | 5.12       | 2.35           | 6.83        | 3.42            |
| AOI222   | 3.73            | 1.62                | 6.14       | 3.28           | 8.85        | 4.70            |
| OAI21    | 2.96            | 1.06                | 3.70       | 1.96           | 4.17        | 1.89            |
| OAI22    | 3.09            | 1.24                | 3.83       | 2.33           | 4.87        | 2.46            |
| OAI211   | 3.58            | 1.48                | 5.06       | 2.16           | 5.39        | 2.21            |
| OAI221   | 3.58            | 1.46                | 5.12       | 2.35           | 5.72        | 2.67            |
| OAI222   | 3.73            | 1.62                | 6.14       | 3.28           | 8.73        | 4.50            |
| OAI33    | 3.92            | 1.94                | 4.81       | 2.53           | 6.75        | 3.27            |
| XOR2     | 4.94            | 1.14                | 5.93       | 1.61           | 5.83        | 2.79            |

The three described methods are compared using a 45nm standard cell library, from which nineteen cells were analyzed. The results are reported as the mean susceptibility and the standard deviation ($\sigma$) over the values obtained for each input vector of each function.

To generate the results for the models that analyze the transistor arrangement or the stick diagram, the value $p = 1.98 \times 10^{-6}$ was used as an estimate. This value is the average susceptibility of the input vectors of an inverter logic gate, calculated by the layout model, and it defines the probability of a particle striking a sensitive node. For the analysis with the layout model, the particle collision rate was set to $\phi = 3.6 \times 10^{-11}$. With the particle collision rate and the total sensitive area of each input vector, it is possible to calculate the susceptibility of the functions. The results obtained by applying the three models to the 45nm library are presented in Table I.
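To make the transistor-arrangement computation more concrete, the following sketch treats a strike on each sensitive node as an independent event with probability p and derives the mean and the sample standard deviation over the input vectors of a gate. The expression `1 - (1 - p)**k` and the per-vector sensitive node counts `(1, 1, 1, 2)` are assumptions made for this illustration of a NAND2-like gate; they are not taken from the cited works.

```python
import statistics

P_NODE = 1.98e-6  # probability of a particle striking one sensitive node


def vector_susceptibility(sensitive_nodes: int, p: float = P_NODE) -> float:
    """Probability that at least one of the sensitive nodes is struck,
    assuming independent strikes (an assumption of this sketch)."""
    return 1.0 - (1.0 - p) ** sensitive_nodes


# Hypothetical number of sensitive nodes for the input vectors 00, 01, 10, 11.
nodes_per_vector = (1, 1, 1, 2)
values = [vector_susceptibility(k) for k in nodes_per_vector]

mean = statistics.mean(values)
sigma = statistics.stdev(values)  # sample standard deviation
print(f"mean = {mean * 1e6:.3f}e-6, sigma = {sigma * 1e6:.3f}e-6")
```

Under these assumptions the result is a mean of about 2.475e-6 and a standard deviation of about 0.99e-6, which is close to the transistor-level NAND2 entries of Table I.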

## IV. ANALYTICAL CIRCUIT METHODS

This section presents an overview of several methods that estimate the reliability of combinational circuits. Reliability analysis of a combinational circuit is a computational challenge, mainly because of its dependency on fault masking. In current technologies, the dominant masking effect is logical masking. Fig. 3 illustrates logical masking, where the *NOR2* logic gate does not propagate the fault generated in the *AND2* gate because its other input is at logic value "1". Methods that use simulation, such as Monte Carlo, are covered in Section V. The first subsection presents an overview of several methods proposed in the literature. The following subsections discuss in more detail the methods that are available in the reliability estimation tool described in Section VI.
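Logical masking can be checked exhaustively on a tiny circuit. The sketch below assumes a topology consistent with the description of Fig. 3, an AND2 gate whose output feeds one input of a NOR2 gate, and models the SET as a bit flip at the AND2 output. The signal names `a`, `b`, and `c` are hypothetical.

```python
from itertools import product


def circuit(a: int, b: int, c: int, set_on_and2: bool = False) -> int:
    """y = NOR2(AND2(a, b), c), optionally with a bit flip at the AND2 output."""
    n = a & b
    if set_on_and2:
        n ^= 1  # transient fault modeled as a bit flip
    return 1 - (n | c)


# A fault is logically masked when the faulty output equals the golden output.
masked = {
    (a, b, c): circuit(a, b, c) == circuit(a, b, c, set_on_and2=True)
    for a, b, c in product((0, 1), repeat=3)
}
for vector, is_masked in masked.items():
    print(vector, "masked" if is_masked else "propagated")
```

Every vector with `c = 1` masks the fault, because a 1 at the other NOR2 input forces the output to 0 regardless of the AND2 value.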

### A. Methods Overview

Fig. 3: SET logical masking example

A logic gate can fail due to various external factors, such as radiation and thermal noise. In addition, problems in manufacturing processes can also impair the functioning of circuits. Carrying out evaluations after the circuit is manufactured is financially unfeasible, while simulations at the design stage can have high time complexity. Thus, mathematical equations, among other numerical techniques, can be a viable solution for reliability estimation in digital circuits. The bit-flip in a logic gate, also known as the *von Neumann* error [26], has been modeled in some works by mathematical expressions that proved feasible in terms of time complexity. According to Beg et al. [27], the reliability of a logic gate and, consequently, of an entire circuit can be determined from the following equations:

$$PF_{gate} = 1 - (1 - PF_{tr})^{\delta} \tag{3}$$

$$PF_c = 1 - (1 - PF_{gate})^{\gamma} \tag{4}$$

where $PF_{gate}$ is the probability of failure of the logic gate, $PF_{tr}$ is the probability of failure of the transistor, $\delta$ is the number of transistors, and $\gamma$ is the total number of gates in the circuit. However, eq. 3 and eq. 4 are considered unrealistic, since they assume that all logic gates have the same number of transistors and that all transistors have the same probability of failure. Thus, the equations can be rewritten as [28]:

$$PF_{gate} = 1 - \prod_{i=0}^{N-1} (1 - PF_{tr,i})^{\delta_i} \tag{5}$$

$$PF_c = 1 - \prod_{j=0}^{G-1} (1 - PF_{gate,j})^{\gamma_j} \tag{6}$$

where $\delta_i$ is the counter of N transistors and $\gamma_j$ is the counter of G gates. Even so, the characteristics and particularities of the gates and transistors are entirely abstracted.
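A small numerical sketch may help compare eq. 3 and eq. 4 with eq. 5 and eq. 6. It assumes that the exponents $\delta_i$ and $\gamma_j$ act as multiplicities, that is, how many transistors or gates share a given failure probability. The transistor failure probability of 1e-6, the four-transistor gate, and the six-gate circuit are hypothetical values.

```python
import math


def pf_uniform(pf_unit: float, count: int) -> float:
    """Eq. 3 / eq. 4: 'count' identical units, each failing with probability pf_unit."""
    return 1.0 - (1.0 - pf_unit) ** count


def pf_heterogeneous(pf_units, multiplicities) -> float:
    """Eq. 5 / eq. 6: unit i fails with pf_units[i]; multiplicities[i] is the
    exponent (interpreted here as how many units share that probability)."""
    return 1.0 - math.prod(
        (1.0 - pf) ** k for pf, k in zip(pf_units, multiplicities)
    )


pf_tr = 1e-6                              # hypothetical transistor failure probability
pf_gate = pf_uniform(pf_tr, count=4)      # eq. 3 for a gate with 4 transistors
pf_c = pf_uniform(pf_gate, count=6)       # eq. 4 for a circuit with 6 such gates

# Eq. 5 with two transistor groups that have different failure probabilities.
pf_gate_mixed = pf_heterogeneous([1e-6, 3e-6], [2, 2])
print(pf_gate, pf_c, pf_gate_mixed)
```

When all units share the same probability, the heterogeneous form reduces to the uniform one, which is the sense in which eq. 5 and eq. 6 generalize eq. 3 and eq. 4.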

PONTES et al.: Survey on Reliability Estimation in Digital Circuits.

In the Probabilistic Gate Model (PGM), the input signals of each gate in the circuit are considered independent. The output probability of each gate is obtained from the input signal probabilities and the gate error rate [14]. The complexity of this algorithm increases linearly with the number of gates in the circuit. The overall reliability of a circuit is obtained by multiplying the individual reliabilities of each output. However, assuming signal independence leads to inaccuracy when calculating reliability in circuits with reconvergent fanouts. A precise version of the PGM was proposed, which considers two auxiliary circuits at fanout nodes. Although this version leads to exact results, its computational complexity is an exponential function of the number of fanouts and of the circuit size, making it impossible to use in large circuits.
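The idea of propagating probabilities gate by gate can be sketched as follows. The example assumes the commonly used PGM formulation in which a gate with error rate `eps` inverts its fault-free output with probability `eps`, and it evaluates a hypothetical two-gate chain y = NAND2(NAND2(a, b), c) for the input vector a = b = c = 1. The function names and the value `eps = 0.05` are illustrative.

```python
def pgm_output(p_ideal_one: float, eps: float) -> float:
    """Probability that a faulty gate outputs 1, given the probability
    p_ideal_one that the fault-free gate function evaluates to 1."""
    return (1.0 - eps) * p_ideal_one + eps * (1.0 - p_ideal_one)


def nand2_pgm(p_a: float, p_b: float, eps: float) -> float:
    """NAND2 under PGM, with independent inputs that are 1 with
    probabilities p_a and p_b."""
    return pgm_output(1.0 - p_a * p_b, eps)


eps = 0.05
p_g1 = nand2_pgm(1.0, 1.0, eps)   # fault-free value is 0, so P(g1 = 1) = eps
p_y = nand2_pgm(p_g1, 1.0, eps)   # fault-free value of y is 1

# For this input vector the correct output is 1, so the reliability is P(y = 1).
print(f"P(g1 = 1) = {p_g1:.3f}, reliability of y = {p_y:.3f}")
```

The result, 0.905, equals the probability that both gates are correct or both fail, which is what one expects for a chain without reconvergent fanout.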

Scalability is the major challenge faced by the different approaches. Thus, in Choudhury's work [29], three methods were proposed that aim at a tradeoff between accuracy and scalability. The versions of the technique are based on (i) observability, (ii) a single pass, and (iii) a maximum number of simultaneous gate failures.

Zivanov & Marculescu [30] proposed a framework for estimating circuit reliability. They focused on combinational circuits and used binary decision diagrams (BDDs) and algebraic decision diagrams (ADDs) to model the logic gates and their respective behaviors.

Neural Networks have also been used to estimate circuit reliability. In the work of Beg et al. [27], the reliability estimation was modeled as a linear regression problem, where the characteristics of parts of several circuits formed the dataset. The comparisons were based on a tool developed with Bayesian Networks [31] and on equations similar to eq. 5 and eq. 6. The results showed that the estimate was better than the one obtained with the mathematical equations, but still far from the results of the tool.

### B. Probabilistic Transfer Matrices Method - PTMM

Many methods to estimate the reliability of a circuit have been proposed in the literature [32]. The Probabilistic Transfer Matrix Method (PTMM) framework, proposed by Patel et al. [22], can produce an exact reliability evaluation of a logic circuit in a straightforward process [12]. The method was extensively explored by Krishnaswamy et al. [9]. In the PTMM, the reliability of a circuit is obtained by combining the reliability of the individual gates with the topology of the circuit. The reliability of the individual gates and of the circuit is represented by PTM and ITM matrices, which were described previously.

In a simplified way, each gate can be modeled by a PTM, and the PTM of the whole circuit can be computed by multiplying the PTMs of logic functions in series and applying the Kronecker (tensor) product to the PTMs of logic functions in parallel (i.e., at the same depth level in the circuit). It is worth remembering that the PTMM deals with the reconvergent fanout problem, since each logic level of the circuit generates a PTM that accumulates the probabilities of all its logic gates. The circuit reliability is extracted according to eq. 7, where p(i) denotes the probability of input vector i [33]. If all input vectors have the same probability, eq. 7 can be simplified to eq. 8.

The main limitation of the method is the size of the matrices that must be stored and manipulated. Each level of a logic circuit is represented by a PTM. The size of a PTM is a function of the number of inputs and outputs that are being modeled by the PTM. The number of rows in a PTM is equal to $2^n$, where *n* is the number of inputs in the circuit level. The number of columns in a PTM is equal to $2^m$, where *m* is the number of outputs in the circuit level. Then, for a circuit level with 24 inputs and 12 outputs, for example, the dimensions of the PTM of the level will be $2^{24}$ rows by $2^{12}$ columns, which requires 512 GB of storage space for an 8-byte floating-point representation of probabilities. Given this scenario, the application of the PTMM is limited to small circuits, even with techniques that improve the efficiency of the method [33].

$$R_c = \sum_{ITM_c(i,j)=1} p(j|i)p(i) \tag{7}$$

$$R_{c} = \frac{1}{2^{n}} \sum_{ITM_{c}(i,j)=1} p(j|i) \tag{8}$$
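The PTMM flow can be sketched on a small example. The circuit below is hypothetical: two NAND2 gates in parallel at the first level feed a third NAND2 at the second level, so the circuit has n = 4 inputs and one output. Parallel gates are combined with the Kronecker product, levels in series are combined with a matrix product, and the reliability follows eq. 8 under the assumption of equally probable input vectors. Plain Python lists are used to keep the sketch dependency-free.

```python
NAND2_TRUTH = [1, 1, 1, 0]  # outputs for the input vectors 00, 01, 10, 11


def gate_ptm(truth_table, q):
    """PTM of a single-output gate; q = 1.0 yields the ITM."""
    return [[q if col == out else 1.0 - q for col in (0, 1)] for out in truth_table]


def kron(a, b):
    """Kronecker product, used for gates placed in parallel."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    """Matrix product, used for logic levels connected in series."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def circuit_matrix(q):
    level1 = kron(gate_ptm(NAND2_TRUTH, q), gate_ptm(NAND2_TRUTH, q))  # 16 x 4
    level2 = gate_ptm(NAND2_TRUTH, q)                                  # 4 x 2
    return matmul(level1, level2)                                      # 16 x 2


def reliability(q, n_inputs=4):
    """Eq. 8: average of the PTM entries selected by the 1s of the circuit ITM."""
    ptm_c, itm_c = circuit_matrix(q), circuit_matrix(1.0)
    total = sum(
        p for p_row, i_row in zip(ptm_c, itm_c) for p, i in zip(p_row, i_row) if i == 1.0
    )
    return total / 2 ** n_inputs


print(f"R_c = {reliability(0.95):.6f}")
```

Even in this tiny case the first-level matrix already has 16 rows, which hints at the exponential growth that limits the method for wide circuit levels.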

### C. Signal Probability Reliability - SPR

The SPR estimates the reliability of a circuit from the probabilities of the input signals and the logic gates [11]. As in the PTMM, PTM and ITM matrices are used to map the behavior of logic gates. Each signal is represented by a $2 \times 2$ signal probability matrix that holds the four possible states of a signal: a correct 0 ($\#_0$), a correct 1 ($\#_3$), an incorrect 0 ($\#_2$), and an incorrect 1 ($\#_1$), as shown in Fig. 4. The SPR complexity is linear in the number of gates [34], which makes the method scalable and applicable to circuits with thousands of logic gates.

$$SIGNAL_{4} = \begin{bmatrix} signal_{0} & signal_{1} \\ signal_{2} & signal_{3} \end{bmatrix}$$

$$P_{2\times2}(signal) = \begin{bmatrix} P(signal = correct \ 0) & P(signal = incorrect \ 1) \\ P(signal = incorrect \ 0) & P(signal = correct \ 1) \end{bmatrix}$$

Fig. 4: Matrix representation of a four-state signal probabilities [11]

The reliability of the entire circuit can be extracted according to eq. 9 [11], where $R_j$ is the reliability of output j and m is the number of circuit outputs. Despite these advantages, the SPR does not take into account the probability dependence of reconvergent signals, producing reliability values whose inaccuracy depends on the number of reconvergent fanout signals in the circuit [35].

$$R_c = \prod_{j=0}^{m-1} R_j \tag{9}$$
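The four-state propagation can be sketched as follows. Each signal is stored as a dictionary keyed by `(ideal, actual)` values, which carries the same information as the matrix of Fig. 4. For every combination of input states, the ideal output comes from the truth table and the actual output is correct with probability q. This is a simplified reading of the SPR written for illustration, not the implementation of the cited works; the names and the value q = 0.95 are hypothetical.

```python
def nand2(a: int, b: int) -> int:
    return 1 - (a & b)


def fault_free_input(p_one: float = 0.5) -> dict:
    """Primary input: only the 'correct 0' and 'correct 1' states are possible."""
    return {(0, 0): 1.0 - p_one, (0, 1): 0.0, (1, 0): 0.0, (1, 1): p_one}


def propagate(gate, q: float, sig_a: dict, sig_b: dict) -> dict:
    """Four-state output of a two-input gate with reliability q, assuming the
    two input signals are independent."""
    out = {(i, a): 0.0 for i in (0, 1) for a in (0, 1)}
    for (ideal_a, actual_a), p_a in sig_a.items():
        for (ideal_b, actual_b), p_b in sig_b.items():
            ideal = gate(ideal_a, ideal_b)       # value in the fault-free circuit
            computed = gate(actual_a, actual_b)  # value from the actual inputs
            out[(ideal, computed)] += p_a * p_b * q
            out[(ideal, 1 - computed)] += p_a * p_b * (1.0 - q)
    return out


def signal_reliability(sig: dict) -> float:
    """Probability of a correct 0 plus probability of a correct 1."""
    return sig[(0, 0)] + sig[(1, 1)]


q = 0.95
out_1 = propagate(nand2, q, fault_free_input(), fault_free_input())
out_2 = propagate(nand2, q, fault_free_input(), fault_free_input())
r_c = signal_reliability(out_1) * signal_reliability(out_2)  # eq. 9 with m = 2
print(f"R_0 = {signal_reliability(out_1):.4f}, R_c = {r_c:.4f}")
```

For a single gate with fault-free inputs, the output reliability equals q, and the two-output circuit reliability is the product required by eq. 9.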

### D. Signal Probability Reliability Multi-Pass - SPR-MP

Considering the accuracy limitations of the SPR, which is a one-pass algorithm, an alternative based on multiple passes of probability propagation was proposed in [36] and is referred to as the SPR Multi-pass, or SPR-MP. In the SPR-MP, the probabilities associated with each reconvergent signal are propagated 4 times, with a single signal state being propagated at a time. The values computed at each pass of the algorithm are accumulated to produce the final value.

As with the SPR, there is no memory limitation associated with the SPR-MP, but the processing time depends on the number of reconvergent fanout signals [35]. Eq. 10 shows that computing the reliability of a circuit with F reconvergent fanouts requires $4^F$ passes of the algorithm, where $R_c(f)$ denotes the partial reliability value computed in pass f. The main advantage of the SPR-MP is the possibility to restrict the number of fanouts (and so, the number of passes of the algorithm) to be considered in the reliability computation. This characteristic allows a tradeoff between processing time and accuracy, leading to better scalability than the PTMM and better accuracy than the SPR [37].

$$R_c = \sum_{f=1}^{4^F} R_c(f) \tag{10}$$
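The effect of the multiple passes can be seen on the smallest reconvergent structure. In the hypothetical circuit below, the signal s = NAND2(a, b) fans out to both inputs of a second NAND2, so F = 1 and four passes are needed. In each pass the fanout signal is fixed to one of its four states, the result is propagated, and the partial reliability is weighted by the probability of that state. This is a simplified sketch of the multi-pass idea; the helper names and q = 0.95 are illustrative.

```python
STATES = [(0, 0), (0, 1), (1, 0), (1, 1)]  # (ideal value, actual value)


def nand2(a: int, b: int) -> int:
    return 1 - (a & b)


def propagate(gate, q, sig_a, sig_b):
    """Single-pass four-state propagation, assuming independent inputs."""
    out = {state: 0.0 for state in STATES}
    for (ideal_a, actual_a), p_a in sig_a.items():
        for (ideal_b, actual_b), p_b in sig_b.items():
            ideal, computed = gate(ideal_a, ideal_b), gate(actual_a, actual_b)
            out[(ideal, computed)] += p_a * p_b * q
            out[(ideal, 1 - computed)] += p_a * p_b * (1.0 - q)
    return out


def reliability(sig):
    return sig[(0, 0)] + sig[(1, 1)]


def point_signal(state):
    """Signal known to be in a single state, used for one pass."""
    return {s: 1.0 if s == state else 0.0 for s in STATES}


q = 0.95
fault_free = {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.5}
s = propagate(nand2, q, fault_free, fault_free)  # reconvergent fanout signal

# Single pass: both inputs of the second gate are wrongly treated as independent.
r_spr = reliability(propagate(nand2, q, s, s))

# Multi-pass: one pass per state of s (4**1 passes), accumulated as in eq. 10.
r_spr_mp = sum(
    p_state * reliability(propagate(nand2, q, point_signal(st), point_signal(st)))
    for st, p_state in s.items()
)
print(f"SPR = {r_spr:.6f}, SPR-MP = {r_spr_mp:.6f}")
```

Here the multi-pass value is 0.905, which matches the exact result for this structure (both gates correct or both failing), while the single-pass value is about 0.8845 because the correlation between the two branches is ignored.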

## V. STATISTICAL RELIABILITY METHODS


Statistical methodologies are widely used in science, engineering, astrophysics, and industry to solve problems based on the randomness of events. As exhaustive evaluations are not possible for complex hardware designs, statistical approaches provide a tradeoff between accuracy and runtime. Monte Carlo (MC) methodologies based on statistical sampling and uniform distributions have been used for this purpose.

The MC is a statistical method capable of inferring a confidence interval based on sampling and random events [38]. As the sample size grows, the simulation time of MC simulations increases, while the standard deviation of the results decreases. In other words, the more simulations are performed, the closer the arithmetic mean gets to the real probability. It is important to note that any numerical method that solves a problem with random or pseudo-random numbers can be considered an MC-based method [39].

Modern hardware designs are complex and dense. Reliability evaluation has proven to be essential in the early design stages for improving system lifetime [40]. Fast and accurate reliability evaluation procedures can help embedded system designers produce targeted fault-tolerant designs. Most reliability procedures aim to provide a compelling tradeoff between runtime and accuracy.

The state of the art offers methodologies that evaluate reliability considering masking effects, physical layout, and particle flux. For combinational circuits, the authors of [41] propose an MC methodology to evaluate reliability by exploring the masking effects under multiple SETs. The work in [42] statically analyzes the susceptibility of arbitrary combinational circuits to SEUs, formally encoding and propagating the error pulses using BDDs, and [43] presents a framework for efficient estimation of the soft error susceptibility.

More accurate frameworks use layout-based approaches through cell libraries to provide a more realistic analysis. The work in [44] introduces a layout-based soft error estimation MC framework that takes into account multiple transient faults (MTFs) from the device level to the circuit level, according to the circuit layout, modeling transient faults through nuclear reactions. Also, [45] proposes MC-based simulations that consider a detailed grid analysis of the circuit layout to identify the vulnerable areas of a circuit, as well as temperature and pulse width.

The Geant4 toolkit [46] simulates radiation effects caused by the passage of particles through matter. The physics processes offered cover a comprehensive range, including electromagnetic, hadronic, and optical processes, a large set of long-lived particles, materials, and elements, over a wide energy range. It has been widely used for experiments and projects in a variety of application domains, including high energy physics, astrophysics, medical physics, and radiation protection, among others. Industry frameworks such as TIARA [47] provide radiation-based electrical simulation for terrestrial and space environments and have been used to evaluate different manufacturing technologies: bulk CMOS, FinFET, and FD-SOI. The collaborative research project between industry and academia, FLO-DAM [48], provides a novel cross-layer reliability analysis from the semiconductor layer up to the application layer, able to quantify the risks of faults under environmental conditions.

### A. Open-Source Statistic Reliability Estimation

To provide more details on the MC-based method, the rest of this section presents the implementation available in the reliability estimation tool CREsT, presented in Section VI. This efficient MC methodology evaluates the reliability of combinational circuits against single event transients by simulating a radiation particle flux hitting random active areas of the circuit. Its main components are:

- Fault-tolerance analysis, independent of technology, based on the logical masking effect
- Sensitive library, derived from the fabrication technology, which provides the physical layout used to extract the sensitive areas of the cells
- Environment particle flux directly ionizing the active regions of the circuit

The analyses are decomposed into different components. The number of faults (N) determines the MC confidence level, as presented by eq. 11, where the critical value $Z_c^2$ is extracted from a statistical confidence interval, p represents the intended fault proportion based on the total fault population, and e is the error margin [49]. Thus, N is defined according to the desired confidence level.
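As a rough guide to the magnitude of N, the sketch below assumes that eq. 11 has the usual sample-size form $N = Z_c^2 \, p (1 - p) / e^2$, which is consistent with the symbols described above but is an assumption of this example. The confidence level, proportion, and error margin are hypothetical inputs.

```python
import math
from statistics import NormalDist


def sample_size(confidence: float, p: float, e: float) -> int:
    """Number of injected faults N for a two-sided confidence level,
    expected fault proportion p, and error margin e (assumed form of eq. 11)."""
    z_c = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # critical value
    return math.ceil(z_c ** 2 * p * (1.0 - p) / e ** 2)


# 95% confidence, worst-case proportion p = 0.5, and a 1% error margin.
print(sample_size(confidence=0.95, p=0.5, e=0.01))  # 9604
```

Choosing p = 0.5 maximizes p(1 - p) and therefore gives a conservative N when the real fault proportion is unknown.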

The first step is to run the fault-tolerance model. The fault simulator performs two simulations in parallel: the golden (fault-free) simulation and the fault-injection simulation, in which a SET is modeled as a bit flip. Both results are compared. If they differ, the fault was not logically masked by any gate and reached at least one circuit output, so the detected faults counter (Ne) is incremented. When the simulation finishes, the Failure Rate (FR) is calculated based on eq. 12. The respective Fault Masking Ratio (FMR) is given by eq. 13 and reflects the proportion of faults that are logically masked, in other words, the share of particle strikes that do not reach any output of the logic circuit.
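The golden versus faulty comparison can be sketched with a few lines of simulation. The example assumes FR = Ne / N and FMR = 1 - FR, uses a hypothetical circuit y = NOR2(AND2(a, b), c), and picks the fault site uniformly among the two gate outputs rather than weighting it by sensitive area. It illustrates the procedure and is not the CREsT implementation.

```python
import random

FAULT_SITES = ("AND2", "NOR2")


def simulate(a: int, b: int, c: int, flip: str = "") -> int:
    """y = NOR2(AND2(a, b), c); 'flip' names the gate output hit by the SET."""
    n = a & b
    if flip == "AND2":
        n ^= 1
    y = 1 - (n | c)
    if flip == "NOR2":
        y ^= 1
    return y


def monte_carlo(n_faults: int, seed: int = 0):
    """Inject n_faults random SETs and return (FR, FMR)."""
    rng = random.Random(seed)
    detected = 0  # Ne: faults that reach the circuit output
    for _ in range(n_faults):
        a, b, c = (rng.randint(0, 1) for _ in range(3))
        site = rng.choice(FAULT_SITES)
        golden = simulate(a, b, c)             # fault-free run
        faulty = simulate(a, b, c, flip=site)  # run with the injected bit flip
        if golden != faulty:
            detected += 1
    fr = detected / n_faults
    return fr, 1.0 - fr


fr, fmr = monte_carlo(n_faults=9604, seed=1)
print(f"FR = {fr:.3f}, FMR = {fmr:.3f}")
```

For this circuit the estimate should approach FR = 0.75: a flip at the NOR2 output always reaches the output, while a flip at the AND2 output is masked whenever c = 1.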

The second step uses a sensitive library database based on reverse-biased PN junction areas, defined as sensitive areas. These sensitive areas are extracted from the logic cell layouts in GDSII format. The extraction process identifies sensitive areas that have a low-resistance conductive path to the output and are reverse-biased PN junctions [17].

In order to illustrate the identification of the sensitive areas, the NAND2 layout from a 45nm technology standard cell library [50] was used. Fig. 5 presents the six active regions of this cell, and Table II shows the sensitive areas for each input vector. Each area is calculated as length times width, extracted from the PMOS and NMOS sensitive regions found for the specific input vector of the gate. All this information is saved in a database. The total sensitive area of the circuit, $AS_c$, is computed by summing over all gates of the circuit, multiplying the number of occurrences of each logic gate, $N_{Gate}$, by the sensitive area of the gate for its input vectors, $AS_{GV}$. Eq. 14 represents this total sensitive area.



Fig. 5: Active Regions from NAND2 cell

Table II.: Active Regions NAND2 Cell

| A | B | Active Region | Sensitive Area ($\mu m^2$) |
|---|---|---------------|----------------------------|
| 0 | 0 | Region 5 | 0.044 |
| 0 | 1 | Region 4 and Region 5 | 0.102 |
| 1 | 0 | Region 5 | 0.044 |
| 1 | 1 | Region 1 | 0.088 |
| Average Sensitive Area | | | 0.069 |
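Using the NAND2 values in Table II, the computation of eq. 14 can be sketched as follows. This is only an illustration under stated assumptions: it takes the average over the four input vectors as the $AS_{GV}$ of the cell (that is, it assumes equiprobable input vectors), and the example circuit of six NAND2 gates is hypothetical. The function names are not part of CREsT.

```python
# Sensitive area of NAND2 per input vector (A, B), in um^2, copied from Table II.
NAND2_AREA_UM2 = {(0, 0): 0.044, (0, 1): 0.102, (1, 0): 0.044, (1, 1): 0.088}


def average_sensitive_area(area_by_vector):
    """Mean sensitive area over all input vectors of one cell."""
    return sum(area_by_vector.values()) / len(area_by_vector)


def total_sensitive_area(gate_counts, area_by_gate):
    """AS_c as in eq. 14: sum over cell types of AS_GV * N_Gate."""
    return sum(area_by_gate[gate] * count for gate, count in gate_counts.items())


if __name__ == "__main__":
    nand2_avg = average_sensitive_area(NAND2_AREA_UM2)
    # Hypothetical circuit made of six NAND2 gates.
    print(total_sensitive_area({"NAND2": 6}, {"NAND2": nand2_avg}))
```

The computed NAND2 average agrees with the value tabulated in Table II up to rounding.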

In the next step, the Failure Rate FR, the total sensitive area of the circuit $AS_c$, and the particle flux $\theta$ are combined. These parameters model the distribution of transient faults over the areas of the circuit that are susceptible to ionizing radiation and are essential for obtaining the Mean Time Between Failures (MTBF) metric, given by eq. 15. The MTBF is used to determine the reliability of electronic systems, reflecting the mean operating time between errors in the circuit under evaluation. As output, a simulation database is created containing the input vectors, sensitive nodes, detected faults, FMR, MTBF, and simulation logs.

$$N = \frac{Z_c^2 \times p \times (1-p)}{e^2} \tag{11}$$

$$FR = \frac{N_e}{N} \tag{12}$$

$$FMR = 1 - FR \tag{13}$$

$$AS_c = \sum_{n=1}^{cell} AS_{GV} \times N_{Gate} \tag{14}$$

$$MTBF = \frac{1}{FR \times AS_c \times \theta} \tag{15}$$
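Equations 12, 13, and 15 chain together directly, as the short sketch below shows. The function names are illustrative, and the sketch assumes that $AS_c$ and $\theta$ are expressed in consistent units (for example, $cm^2$ and particles per $cm^2$ per second), in which case the MTBF comes out in the time unit of the flux.

```python
def failure_rate(detected_faults: int, injected_faults: int) -> float:
    """FR = N_e / N (eq. 12)."""
    if injected_faults <= 0:
        raise ValueError("the number of injected faults must be positive")
    return detected_faults / injected_faults


def fault_masking_ratio(fr: float) -> float:
    """FMR = 1 - FR (eq. 13)."""
    return 1.0 - fr


def mtbf(fr: float, sensitive_area: float, particle_flux: float) -> float:
    """MTBF = 1 / (FR * AS_c * theta) (eq. 15)."""
    rate = fr * sensitive_area * particle_flux
    if rate <= 0.0:
        raise ValueError("FR, AS_c and theta must all be positive")
    return 1.0 / rate


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the formulas.
    fr = failure_rate(detected_faults=25, injected_faults=100)
    print(fault_masking_ratio(fr), mtbf(fr, sensitive_area=2.0, particle_flux=0.5))
```

The explicit check on the denominator avoids a division by zero when no fault reaches an output, a case in which eq. 15 gives no finite MTBF.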

Table III relates the previously cited frameworks. They are classified according to abstraction layer, fault model, and software development access. Only one of these frameworks is open source, and it is the main focus of the next section.

Table III.: Comparison between state-of-the-art frameworks

| Ref. | Layer: Gate | Layer: Layout Aware | Fault Model: SEU | Fault Model: SET | Fault Model: MTF | Soft. Development: Open Source |
|------|--------------|--------------|--------------|--------------|--------------|--------------|
| [44] | $\checkmark$ | -            | -            | $\checkmark$ | $\checkmark$ | -            |
| [45] | $\checkmark$ | $\checkmark$ | -            | $\checkmark$ | $\checkmark$ | -            |
| [46] | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | -            |
| [47] | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | -            |
| [48] | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | -            |
| [51] | $\checkmark$ | $\checkmark$ | -            | $\checkmark$ | $\checkmark$ | $\checkmark$ |

## VI. CREST - CIRCUIT RELIABILITY ESTIMATION TOOL

This section presents CREsT, an open-source tool that provides a set of methods and algorithms to estimate circuit reliability and perform several circuit analyses. CREsT aims to evaluate the reliability of digital circuits through a user-friendly interface that is easily understood by designers. All the algorithms in CREsT were developed in Java, making the tool operating-system independent. All the tool code is available in a public repository on GitHub [51].

The graphical user interface (GUI) of CREsT is presented in Fig. 6. The tool is divided into modules, where each module introduces different functionalities and output information related to reliability estimation and circuit analysis. All available modules are listed below:

- Logs and Command Line
- Menus
- Reliability Outputs
- Settings Options



Fig. 6: CREsT - Circuit Reliability Estimation Tool

In the Settings tab, users interact directly with the tool by providing the circuit file, described in the Verilog hardware description language, and the library used to map the circuit. The tool also allows choosing between a fixed reliability value for all logic gates and estimated matrices for each logic gate. These inputs are required by all circuit analyses, which are divided into two possible executions:

1. Reliability Estimation
   - PTMM
   - SPR
   - SPR-MP
   - Statistical
2. Circuit Analysis
   - Critical Gates Identification
   - Worst Input Vector

To estimate the circuit reliability, the user provides the settings parameters previously mentioned and selects one of the reliability estimation methods listed in the Reliability subtab: PTMM, SPR, or SPR-MP. The reliability estimate is then computed, and all commands used are echoed in the command line. The results are shown in the Reliability Output frame.

The critical logic gates identification is an important circuit analysis available in the presented tool. Although the SPR does not provide accurate reliability results, it has already been shown in [52] that it can be used to identify the critical blocks of a circuit. The steps for identifying critical logic gates implemented in the presented tool are described as follows:

1. Read the circuit and obtain a list of all gates.
2. Set the reliability of every gate to the default value.
3. Select one gate at a time and increase its reliability. Then, calculate the circuit reliability ($R_c$) using the SPR.
4. After iterating over all gates, create a list of gates in descending order of $R_c$ (from most to least critical).
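The four steps above can be sketched on c17 with a simplified SPR evaluator. This is a minimal illustration under several assumptions, not the CREsT code: primary inputs are equiprobable and error-free, every signal carries the joint probability of its (ideal, actual) values, gate inputs are treated as independent (the known SPR simplification for reconvergent fanout), and the circuit reliability is taken as the product of the output reliabilities. The default reliability of 0.99 and the improved value of 1.0 are illustrative.

```python
from itertools import product

# c17 (ISCAS85) as topologically ordered NAND2 gates: (output, input_a, input_b).
C17_GATES = [
    ("N10", "N1", "N3"), ("N11", "N3", "N6"), ("N16", "N2", "N11"),
    ("N19", "N11", "N7"), ("N22", "N10", "N16"), ("N23", "N16", "N19"),
]
C17_INPUTS = ["N1", "N2", "N3", "N6", "N7"]
C17_OUTPUTS = ["N22", "N23"]


def spr_reliability(gate_reliability):
    """Single-pass estimate of R_c; signals map (ideal, actual) -> probability."""
    signals = {name: {(0, 0): 0.5, (1, 1): 0.5} for name in C17_INPUTS}
    for out, a, b in C17_GATES:
        q = gate_reliability[out]
        dist = {}
        for ((ia, aa), pa), ((ib, ab), pb) in product(
            signals[a].items(), signals[b].items()
        ):
            ideal = 1 - (ia & ib)      # fault-free NAND of the ideal inputs
            computed = 1 - (aa & ab)   # NAND of the possibly wrong inputs
            # The gate delivers `computed` with probability q, else its flip.
            for actual, pg in ((computed, q), (1 - computed, 1.0 - q)):
                key = (ideal, actual)
                dist[key] = dist.get(key, 0.0) + pa * pb * pg
        signals[out] = dist
    reliability = 1.0
    for name in C17_OUTPUTS:
        reliability *= sum(p for (i, act), p in signals[name].items() if i == act)
    return reliability


def rank_critical_gates(default_reliability=0.99, improved_reliability=1.0):
    """Steps 1-4: improve one gate at a time and sort by the resulting R_c."""
    names = [gate for gate, _, _ in C17_GATES]                    # step 1
    ranking = []
    for gate in names:
        reliabilities = {g: default_reliability for g in names}  # step 2
        reliabilities[gate] = improved_reliability                # step 3
        ranking.append((gate, spr_reliability(reliabilities)))
    ranking.sort(key=lambda item: item[1], reverse=True)          # step 4
    return ranking


if __name__ == "__main__":
    for gate, r_c in rank_critical_gates():
        print(f"{gate}: R_c = {r_c:.6f}")
```

Gates at the top of the list are those whose hardening yields the largest gain in $R_c$, which makes them natural candidates for selective hardening.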

Another functionality is the identification of critical input vectors of the circuit. An algorithm based on the heuristic proposed in [56] has been implemented. The SPR is used to calculate the reliability for each input vector. The algorithm starts with the random generation of n input vectors and computes $R_c$ for them. For each investigated input vector, a search is executed among its neighboring vectors. A list of the ten most critical vectors is kept up to date. This list is used to determine the consensus between the bits, and the vector with the smallest $R_c$ is kept. The initial and minimal consensus values are input parameters configured in the critical vectors tab. If no bits are discovered with the current consensus value, the consensus is reduced and the procedure is repeated until the number of undiscovered bits falls below the set value or the consensus falls below the minimal consensus. In the next step, the algorithm tries to find a vector with $R_c$ smaller than the





Fig. 7: Small circuits sample: (a) c17, (b) eleven\_SC, (c) twenty\_SC

Table IV.: CREsT reliability methods output

| Circuit   | Gates | Levels | I/O  | MTBF (PTMM) | MTBF (SPR) | MTBF (SPR-MP) | Time PTMM ($\mu s$) | Time SPR ($\mu s$) | Time SPR-MP ($\mu s$) |
|-----------|-------|--------|------|-------------|------------|---------------|---------------------|--------------------|-----------------------|
| c17       | 6     | 3      | 5/2  | 102545      | 90011      | 102545        | 1320                | 110                | 658                   |
| eleven_SC | 11    | 5      | 10/1 | 198343      | 190438     | 198343        | 530890              | 121                | 1576                  |
| twenty_SC | 20    | 7      | 16/1 | 436970      | 333900     | 436970        | 89546091            | 2597               | 19716                 |

Table V.: CREsT SPR results

| Circuit         | Gates | Levels | I/O     | MTBF  |
|-----------------|-------|--------|---------|-------|
| b02 [53]        | 21    | 5      | 4/4     | 37515 |
| b01 [53]        | 35    | 5      | 5/5     | 22979 |
| CD-25-16 [54]   | 128   | 11     | 25/1    | 34446 |
| alu [55]        | 141   | 7      | 7/26    | 6061  |
| decoder [55]    | 296   | 3      | 8/256   | 1710  |
| lkh-router [55] | 290   | 18     | 60/30   | 36953 |
| g25 [54]        | 337   | 17     | 25/25   | 1546  |
| cavlc [55]      | 646   | 17     | 10/11   | 7893  |
| LEKU-CB [54]    | 701   | 15     | 25/25   | 1800  |
| adder [55]      | 1530  | 256    | 256/129 | 402   |
| arbiter [55]    | 11951 | 88     | 256/129 | 1038  |

critical vector identified in the previous step. To perform this task, a sequential search over up to $2^u$ vectors is performed, where u is the number of undiscovered bits. If the number of undiscovered bits is greater than the set limit, the search is limited to $2^{17}$ vectors.
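The consensus step and the bounded sequential search can be sketched as follows. This is a simplified reading of the procedure, not the heuristic of [56] nor the CREsT code: consensus is assumed to mean the fraction of the top vectors that agree on a bit value, the reliability estimator is passed in as a callable (`reliability_of`, hypothetical), and the cap of $2^{17}$ candidates is applied to every search.

```python
from itertools import product


def consensus_bits(top_vectors, threshold):
    """Return {bit_index: value} for bits whose agreement reaches threshold."""
    count = len(top_vectors)
    fixed = {}
    for index in range(len(top_vectors[0])):
        ones = sum(vector[index] for vector in top_vectors)
        if ones / count >= threshold:
            fixed[index] = 1
        elif (count - ones) / count >= threshold:
            fixed[index] = 0
    return fixed


def search_undiscovered(width, fixed, reliability_of, max_candidates=2 ** 17):
    """Enumerate the undiscovered bits and keep the vector with the smallest R_c."""
    free = [index for index in range(width) if index not in fixed]
    best_vector, best_rc = None, float("inf")
    for tried, assignment in enumerate(product((0, 1), repeat=len(free))):
        if tried >= max_candidates:
            break  # bounded search when 2^u is too large
        vector = [0] * width
        for index, value in fixed.items():
            vector[index] = value
        for index, value in zip(free, assignment):
            vector[index] = value
        r_c = reliability_of(tuple(vector))
        if r_c < best_rc:
            best_vector, best_rc = tuple(vector), r_c
    return best_vector, best_rc


if __name__ == "__main__":
    # Toy estimator standing in for the SPR: more ones means lower reliability.
    def toy_rc(vector):
        return 1.0 - 0.1 * sum(vector)

    top = [(1, 1, 0, 1), (1, 1, 1, 1), (1, 0, 0, 1)]
    fixed = consensus_bits(top, threshold=1.0)
    print(fixed, search_undiscovered(4, fixed, toy_rc))
```

Passing the estimator as a callable keeps the search independent of the reliability method, so the same loop could be driven by SPR, SPR-MP, or any other per-vector estimate.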

Each action performed in the GUI is linked to a command. Thus, the user can create script files with these commands to run tasks in batch mode from the Command Line. All inputs, activities, and methods performed are stored in the Logs module. The Reliability Outputs frame is the most informative one in the tool, since all circuit estimations and analyses are reported there.

The Save menu saves the output data from the Reliability Output frame into a text file and also saves the command sequence to be used in batch mode. The Config menu sets the tool parameters. The View options show circuit characteristics, such as the number of gates, the number of logic levels, the libraries, and the circuit signals. Finally, the Help menu lists all commands available in CREsT that can be entered in the Command Line tab.

As an example of the results obtained with CREsT, reliability estimates are presented for the three previously mentioned methods. All MTBF values were obtained using a fixed failure rate of $1.975 \times 10^{-6}$ [17] for all logic gates. Due to the limitations of the PTMM, the first demonstration uses circuits with a limited number of inputs and logic gates. The samples are illustrated in Fig. 7. The C17 is the smallest benchmark of ISCAS85. The other two are randomly generated circuits targeting a fixed number of logic gates: the "eleven sample circuit" (*eleven\_SC*) and the "twenty sample circuit" (*twenty\_SC*). The information generated by CREsT is grouped in Table IV. The results show the advantage of the SPR over the other methods regarding processing time. This advantage comes at the cost of a loss in accuracy. However, Pontes et al. demonstrate in [12] the suitability of SPR for the reliability estimation task. In this sense, Table V presents a sample of larger circuits, taken from the EPFL15 [55], LEKO&LEKU [54], and ITC99 [53] benchmarks. The technology mapping was performed using ABC [57] with a cell library containing the INV, BUFF, NAND2, NAND3, NAND4, NOR2, NOR3, and NOR4 logic gates. The purpose of this analysis is to show the use of the tool on large benchmarks, demonstrating the scalability of the SPR method, rather than to discuss the obtained MTBF values themselves.
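The trade-off between processing time and accuracy can be quantified from the numbers in Table IV. The sketch below computes, for each sample circuit, the relative deviation of the SPR MTBF with respect to the PTMM MTBF and the speedup of SPR over PTMM. Taking PTMM as the reference is an assumption of this sketch, and the function name is illustrative.

```python
# (MTBF PTMM, MTBF SPR, time PTMM in us, time SPR in us), copied from Table IV.
TABLE_IV = {
    "c17": (102545, 90011, 1320, 110),
    "eleven_SC": (198343, 190438, 530890, 121),
    "twenty_SC": (436970, 333900, 89546091, 2597),
}


def spr_vs_ptmm(mtbf_ptmm, mtbf_spr, time_ptmm, time_spr):
    """Return (relative MTBF deviation of SPR, speedup of SPR over PTMM)."""
    deviation = abs(mtbf_ptmm - mtbf_spr) / mtbf_ptmm
    speedup = time_ptmm / time_spr
    return deviation, speedup


if __name__ == "__main__":
    for circuit, row in TABLE_IV.items():
        deviation, speedup = spr_vs_ptmm(*row)
        print(f"{circuit}: deviation = {deviation:.1%}, speedup = {speedup:.0f}x")
```

Run on the table data, the sketch makes visible how quickly the speedup grows with circuit size, while the deviation stays within the same order of magnitude for the three samples.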

The presented tool provides a set of estimation methods existing in the literature. All these methods and algorithms offer a complete solution to designers that need to deal with circuit reliability. Also, critical gates identification and worst input vector ordering enrich the available analysis. All this information provides a complete set of data to produce more reliable circuits and efficiently apply selective hardening techniques.

Table VI.: Methods and Approaches Summary

| Category                         | Feature/Method               | References                  |
|----------------------------------|------------------------------|-----------------------------|
| **Reliability at Gate-Level**    | Transistor Arrangement       | [24]                        |
|                                  | Stick Diagram                | [25]                        |
|                                  | Layout Evaluation            | [17] [23]                   |
|                                  | Mathematical Equations       | [26] [27] [28]              |
| **Reliability at Circuit-Level** | PGM                          | [14] [15]                   |
|                                  | PTMM                         | [9] [10] [12] [22] [33]     |
|                                  | SPR                          | [11] [12] [13] [34]         |
|                                  | SPR-MP                       | [11] [12] [34] [35] [37]    |
|                                  | Bayesian Networks            | [31]                        |
|                                  | Observability Based Methods  | [29]                        |
|                                  | BDD and/or ADD Based Methods | [9] [30]                    |
|                                  | Neural Networks              | [16] [27]                   |
| **Statistical Based Reliability**| Monte Carlo                  | [7] [8] [44] [45] [48] [47] |
|                                  | SET                          | [44] [45] [48] [47]         |
|                                  | SEU                          | [48] [47]                   |
|                                  | Layout Aware                 | [45] [48] [47]              |
|                                  | Physics Models               | [46] [47]                   |
|                                  | Open Source Development      | [51]                        |

## VII. FINAL CONSIDERATIONS

This survey presents the main aspects related to the radiation reliability of digital circuits. The main methods found in the literature to estimate circuit reliability are discussed. One of the main points addressed in this paper is the issue of making reliability estimation more realistic. In addition to considering electrical simulations and design features of logic gates, a subject gaining much attention is the critical vectors [58]. Ibrahim [56] has shown that this critical value could be orders of magnitude higher than the average reliability. Identifying the critical vector is a task whose complexity grows exponentially with the number of circuit input signals. The Criticality Score (CS) is a metric that ranks input vectors based on the probability of failure at the output [59]. In [56], Ibrahim proposed a heuristic that finds the critical vector using the CS to calculate the scores.

Another relevant point is the set of solutions used to increase circuit reliability. In this sense, techniques that exploit redundancy have been explored. However, these techniques have high area, delay, and power penalties. Selective hardening is an alternative to achieve a better tradeoff between reliability and cost. It consists of protecting only the most exposed part of the circuit [60]. Previous work presents a solution based on partial duplication and gate sizing capable of improving reliability with low area overhead [61]. The main weakness of this technique is related to the choice of which part of the circuit should be protected: the choice follows an initial ranking that usually changes as the blocks are protected [52].

Table VI shows the most relevant research projects based on the topics discussed in the previous sections. This taxonomy classifies the main feature of each method according to the estimation type and abstraction level. With this final analysis, we expect to summarize all the discussions in this survey. Moreover, we provide an extensive starting point for researchers interested in radiation reliability estimation methods.

## ACKNOWLEDGEMENTS

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES), by the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), and by the Fundação de Amparo à Pesquisa do Estado do Rio Grande do Sul (FAPERGS).

## REFERENCES

- [1] A. Zimpeck, C. Meinhardt, L. Artola, and R. Reis, "Reliability challenges in finfets," in *Mitigating Process Variability and Soft Errors at Circuit-Level for FinFETs*. Springer, 2021, pp. 29–63.
- [2] W. Ibrahim and H. Ibrahim, "Multithreaded and reconvergent aware algorithms for accurate digital circuits reliability estimation," *IEEE Transactions on Reliability*, vol. 68, no. 2, pp. 514–525, 2018.
- [3] N. O. Vasilyev, M. A. Zapletina, and G. A. Ivanova, "The analysis of logic resynthesis methods to increase the fault tolerance of combinational circuits for single failures," in 2021 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (ElConRus). IEEE, 2021, pp. 2050–2053.
- [4] H. Jahanirad, "Cc-spra: Correlation coefficients approach for signal probability-based reliability analysis," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 27, no. 4, pp. 927–939, 2019.
- [5] J. Han, H. Chen, J. Liang, P. Zhu, Z. Yang, and F. Lombardi, "A stochastic computational approach for accurate and efficient reliability evaluation," *IEEE Transactions on Computers*, vol. 63, no. 6, pp. 1336–1350, 2012.
- [6] A. Alaghi, W. Qian, and J. P. Hayes, "The promise and challenge of stochastic computing," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 37, no. 8, pp. 1515–1531, 2017.
- [7] B. Liu and L. Cai, "Monte carlo reliability model for single-event transient on combinational circuits," *IEEE Transactions on Nuclear Science*, vol. 64, no. 12, pp. 2933–2937, 2017.
- [8] T. Thery, G. Gasiot, V. Malherbe, J.-L. Autran, and P. Roche, "Tiara: Industrial platform for monte carlo single-event simulations in planar bulk, fd-soi, and finfet," *IEEE Transactions on Nuclear Science*, vol. 68, no. 5, pp. 603–610, 2021.

- [9] S. Krishnaswamy, G. F. Viamontes, I. L. Markov, and J. P. Hayes, "Accurate reliability evaluation and enhancement via probabilistic transfer matrices," in *Proceedings of the conference on Design, Automation and Test in Europe-Volume 1*. IEEE Computer Society, 2005, pp. 282–287.
- [10] V. N. Remala, A. A. Reddy, R. P. Vidyadhar, and S. N. Bandi, "Circuit reliability through probabilistic transfer matrix," in 2021 6th International Conference on Communication and Electronics Systems (ICCES). IEEE, 2021, pp. 165–172.
- [11] D. T. Franco, M. C. Vasconcelos, L. Naviner, and J.-F. Naviner, "Signal probability for reliability evaluation of logic circuits," *Microelectronics Reliability*, vol. 48, no. 8-9, pp. 1586–1591, 2008.
- [12] M. F. Pontes, P. F. Butzen, R. B. Schvittz, S. L. Rosa, and D. T. Franco, "The suitability of the spr-mp method to evaluate the reliability of logic circuits," in 2018 25th IEEE International Conference on Electronics, Circuits and Systems (ICECS). IEEE, 2018, pp. 433–436.
- [13] S. Cai, B. He, W. Wang, P. Liu, F. Yu, L. Yin, and B. Li, "Soft error reliability evaluation of nanoscale logic circuits in the presence of multiple transient faults," *Journal of Electronic Testing*, vol. 36, no. 4, pp. 469–483, 2020.
- [14] J. Han, H. Chen, E. Boykin, and J. Fortes, "Reliability evaluation of logic circuits using probabilistic gate models," *Microelectronics Reliability*, vol. 51, no. 2, pp. 468–476, 2011.
- [15] J. Jiang, T. Wang, and Z. Wang, "Probability gate model based methods for approximate arithmetic circuits reliability estimation," *CCF Transactions on High Performance Computing*, pp. 1–19, 2021.
- [16] R. Farjaminezhad, S. Safari, and A. M. E. Moghadam, "Recurrent neural networks models for analyzing single and multiple transient faults in combinational circuits," *Microelectronics Journal*, vol. 112, p. 104993, 2021.
- [17] R. B. Schvittz, P. F. Butzen, and L. S. da Rosa, "Methods for susceptibility analysis of logic gates in the presence of single event transients," in 2020 IEEE International Test Conference (ITC). IEEE, 2020, pp. 1–9.
- [18] G. Hubert, L. Artola, and D. Regis, "Impact of scaling on the soft error sensitivity of bulk, fdsoi and finfet technologies due to atmospheric radiation," *Integration*, vol. 50, pp. 39–47, 2015.
- [19] M. Nicolaidis and R. Perez, "Measuring the width of transient pulses induced by ionising radiation," in 2003 IEEE International Reliability Physics Symposium Proceedings, 2003. 41st Annual. IEEE, 2003, pp. 56–59.
- [20] R. D. Schrimpf, K. M. Warren, R. A. Weller, R. A. Reed, L. W. Massengill, M. L. Alles, D. M. Fleetwood, X. J. Zhou, L. Tsetseris, and S. T. Pantelides, "Reliability and radiation effects in ic technologies," in 2008 IEEE International Reliability Physics Symposium, 2008, pp. 97–106.
- [21] Y. Zhao, L. Wang, S. Yue, D. Wang, X. Zhao, Y. Sun, D. Li, F. Wang, X. Yang, H. Zheng *et al.*, "Seu and set of 65 bulk cmos flip-flops and their implications for rhbd," *IEEE Transactions on Nuclear Science*, vol. 62, no. 6, pp. 2666–2672, 2015.
- [22] K. N. Patel, I. L. Markov, and J. P. Hayes, "Evaluating circuit reliability under probabilistic gate-level fault models," in *Proceedings of the International Workshop on Logic and Synthesis*, 2003, pp. 59–64.
- [23] R. Schvittz, L. Soares, and P. F. Butzen, "Exploring logic gates layout to improve the accuracy of circuit reliability estimation," in 2019 IFIP/IEEE 27th International Conference on Very Large Scale Integration (VLSI-SoC), 2019, pp. 234–235.

- [24] R. B. Schvittz, D. T. Franco, L. S. da Rosa, and P. F. Butzen, "An improved technique for logic gate susceptibility evaluation of single event transient faults," in *IFIP/IEEE International Conference on Very Large Scale Integration-System on a Chip*. Springer, 2019, pp. 69–88.
- [25] R. B. Schvittz, D. T. Franco, L. S. da Rosa Jr, and P. F. Butzen, "An improved technique for logic gate susceptibility evaluation of single event transient faults," in VLSI-SoC: New Technology Enabler: 27th IFIP WG 10.5/IEEE International Conference on Very Large Scale Integration, VLSI-SoC 2019, Cusco, Peru, October 6–9, 2019, Revised and Extended Selected Papers, vol. 586. Springer Nature, 2020, p. 69.
- [26] A. Beg, "Accurate calculation of unreliability of cmos logic cells and circuits," *Journal of Circuits, Systems and Computers*, vol. 29, no. 13, p. 2050202, 2020.
- [27] A. Beg, F. Awwad, W. Ibrahim, and F. Ahmed, "On the reliability estimation of nano-circuits using neural networks," *Microprocessors and Microsystems*, vol. 39, no. 8, pp. 674–685, 2015.
- [28] K. Nikolic, A. Sadek, and M. Forshaw, "Architectures for reliable computing with unreliable nanodevices," in *Proceedings of the 2001 1st IEEE Conference on Nanotechnology. IEEE-NANO 2001 (Cat. No. 01EX516)*. IEEE, 2001, pp. 254–259.
- [29] M. R. Choudhury and K. Mohanram, "Reliability analysis of logic circuits," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 28, no. 3, pp. 392–405, 2009.
- [30] N. Miskov-Zivanov and D. Marculescu, "Circuit reliability analysis using symbolic techniques," *IEEE transactions on computer-aided design of integrated circuits and systems*, vol. 25, no. 12, pp. 2638–2649, 2006.
- [31] W. Ibrahim, A. Beg, and H. Amer, "A bayesian based eda tool for accurate vlsi reliability evaluations," in 2008 International Conference on Innovations in Information Technology. IEEE, 2008, pp. 101–105.
- [32] R. Xiao and C. Chen, "Gate-level circuit reliability analysis: A survey," VLSI Design, vol. 2014, 2014.
- [33] H. Cai, K. Liu, L. A. de Barros Naviner, Y. Wang, M. Slimani, and J.-F. Naviner, "Efficient reliability evaluation methodologies for combinational circuits," *Microelectronics Reliability*, vol. 64, pp. 19–25, 2016.
- [34] S. N. Pagliarini, G. dos Santos, L. d. B. Naviner, and J.-F. Naviner, "Exploring the feasibility of selective hardening for combinational logic," *Microelectronics Reliability*, vol. 52, no. 9-10, pp. 1843–1847, 2012.
- [35] J. T. Flaquer, J.-M. Daveau, L. Naviner, and P. Roche, "Fast reliability analysis of combinatorial logic circuits using conditional probabilities," *Microelectronics Reliability*, vol. 50, no. 9-11, pp. 1215–1218, 2010.
- [36] D. T. Franco, M. C. Vasconcelos, L. Naviner, and J.-F. Naviner, "Reliability analysis of logic circuits based on signal probability," in 2008 15th IEEE International Conference on Electronics, Circuits and Systems. IEEE, 2008, pp. 670–673.
- [37] S. N. Pagliarini, T. Ban, L. A. d. B. Naviner, and J.-F. Naviner, "Reliability assessment of combinational logic using first-order-only fanout reconvergence analysis," in 2013 IEEE 56th International Midwest Symposium on Circuits and Systems (MWSCAS). IEEE, 2013, pp. 113–116.
- [38] M. Smithson and P. Smithson, *Confidence Intervals*, ser. Confidence Intervals. SAGE Publications, 2003, no. N° 140. [Online]. Available: https://books.google.com.br/books?id=1ZEMXC-Xc9gC
- [39] Y. Dodge, Monte Carlo Method. New York, NY: Springer Science & Business Media, 2008. [Online]. Available: https://doi.org/10.1007/978-0-387-32833-1\_270

- [40] M. Anglada, R. Canal, J. L. Aragón, and A. González, "Fast and Accurate SER Estimation for Large Combinational Blocks in Early Stages of the Design," *IEEE Trans. Sustainable Comput.*, vol. 6, no. 3, pp. 427–440, Dec 2018.
- [41] B. Liu and L. Cai, "Monte carlo reliability model for single-event transient on combinational circuits," *IEEE Transactions on Nuclear Science*, vol. 64, no. 12, pp. 2933–2937, 2017.
- [42] B. Zhang, W.-S. Wang, and M. Orshansky, "FASER: fast analysis of soft error susceptibility for cell-based designs," in 7th International Symposium on Quality Electronic Design (ISQED'06). IEEE, Mar 2006, pp. 6pp.–760.
- [43] N. Miskov-Zivanov and D. Marculescu, "MARS-C: modeling and reduction of soft errors in combinational circuits," in *DAC '06: Proceedings of the 43rd annual Design Automation Conference.* New York, NY, USA: Association for Computing Machinery, Jul 2006, pp. 767–772.
- [44] H.-M. Huang and C. H.-P. Wen, "Layout-Based Soft Error Rate Estimation Framework Considering Multiple Transient Faults—From Device to Circuit Level," *IEEE Trans. Comput. Aided Des. Integr. Circuits Syst.*, vol. 35, no. 4, pp. 586–597, Aug 2015.
- [45] G. I. Paliaroutis, P. Tsoumanis, N. Evmorfopoulos, G. Dimitriou, and G. I. Stamoulis, "A Placement-Aware Soft Error Rate Estimation of Combinational Circuits for Multiple Transient Faults in CMOS Technology," in 2018 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT). IEEE, Oct 2018, pp. 1–6.
- [46] J. Allison, K. Amako, J. Apostolakis, P. Arce, M. Asai, T. Aso, E. Bagli, A. Bagulya, S. Banerjee *et al.*, "Recent developments in geant4," *Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment*, vol. 835, pp. 186–225, 2016.
- [47] T. Thery, G. Gasiot, V. Malherbe, J.-L. Autran, and P. Roche, "Tiara: Industrial platform for monte carlo single-event simulations in planar bulk, fd-soi, and finfet," *IEEE Transactions on Nuclear Science*, vol. 68, no. 5, pp. 603–610, 2021.
- [48] A. Kritikakou, O. Sentieys, G. Hubert, Y. Helen, J.-F. Coulon, and P. Deroux-Dauphin, "FLODAM: Cross-Layer Reliability Analysis Flow for Complex Hardware Designs," pp. 1–6, Mar 2022.
- [49] I. Tuzov, D. de Andrés, and J.-C. Ruiz, "Accurate Robustness Assessment of HDL Models Through Iterative Statistical Fault Injection," in 2018 14th European Dependable Computing Conference (EDCC). IEEE, Sep 2018, pp. 1–8.
- [50] Silvaco's, "45nm open library nangate," 2020. [Online]. Available: https://si2.org/
- [51] "Crest tool github repository," https://github.com/GSDE-FURG/CREsT-Development, accessed: 2022-01-27.
- [52] S. N. Pagliarini, L. A. d. B. Naviner, and J.-F. Naviner, "Selective hardening methodology for combinational logic," in 2012 13th latin American test workshop (LATW). IEEE, 2012, pp. 1–6.
- [53] F. Corno, M. S. Reorda, and G. Squillero, "Rt-level itc'99 benchmarks and first atpg results," *IEEE Design & Test of computers*, vol. 17, no. 3, pp. 44–53, 2000.
- [54] J. Cong and K. Minkovich, "Optimality study of logic synthesis for lut-based fpgas," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 26, no. 2, pp. 230–239, 2007.
- [55] L. Amarú, P.-E. Gaillardon, and G. De Micheli, "The epfl combinational benchmark suite," in *Proceedings of the 24th International Workshop on Logic & Synthesis (IWLS)*, no. CONF, 2015.

- [56] W. Ibrahim, "Identifying the worst reliability input vectors and the associated critical logic gates," *IEEE Transactions on Computers*, vol. 65, no. 6, pp. 1748–1760, 2015.
- [57] R. Brayton and A. Mishchenko, "Abc: An academic industrial-strength verification tool," in *International Conference on Computer Aided Verification*. Springer, 2010, pp. 24–40.
- [58] J. Xiao, W. Chen, J. Lou, J. Jiang, and Q. Zhou, "Identifying reliability-critical primary inputs of combinational circuits based on the model of gate-sensitive attributes," *IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems*, 2022.
- [59] W. Ibrahim, M. Shousha, and J. W. Chinneck, "Accurate and efficient estimation of logic circuits reliability bounds," *IEEE Transactions on Computers*, vol. 64, no. 5, pp. 1217–1229, 2014.
- [60] I. Polian and J. P. Hayes, "Selective hardening: Toward cost-effective error tolerance," *IEEE Design & Test of Computers*, vol. 28, no. 3, pp. 54–63, 2010.
- [61] M. Raji, M. A. Sabet, and B. Ghavami, "Soft error reliability improvement of digital circuits by exploiting a fast gate sizing scheme," *IEEE Access*, vol. 7, pp. 66485–66495, 2019.