Guys, I have a project to summarize a short research paper (4 pages) and present it. The paper is in the attachment below.
1. The presentation should be 7 to 10 slides (not counting references).
Attachments:
Reliability and Fault Tolerance Analysis of FPGA Platforms
Mihaela Radu, Department of Electrical and Computer Engineering Technology, Farmingdale State College, Farmingdale, USA, firstname.lastname@example.org
Abstract-This paper presents reliability and fault tolerance analyses of FPGA platforms. FPGA stands for Field Programmable Gate Array. It is a digital technology designed to be configured by a customer or a designer after manufacturing, hence "field-programmable". The FPGA configuration is generally specified using a Hardware Description Language (HDL), similar to that used for an Application-Specific Integrated Circuit (ASIC). The FPGA platforms investigated are Digilent Nexys digital design platforms, built around Xilinx FPGA technology. They are ideal platforms for any engineer to gain experience with Xilinx's latest technologies, and are well suited to the classroom. These platforms are used intensively in learning environments, such as colleges and universities worldwide, so they need to be reliable and robust, surviving intensive use over the years. Advanced reliability estimations for these FPGA platforms are presented, using various modules and prediction methods of the CARE CAD tool, a powerful software tool for reliability analysis (Mean Time Between Failures Module, Fault Tree Analysis Module, Reliability Block Diagram Module). After reliability estimations are performed for every functional block of the FPGA platforms, fault tolerance analyses are performed. Fault tolerant techniques, based on hardware redundancy, are suggested for the less reliable blocks of the platforms. New reliability analyses are then performed to see whether the addition of redundant blocks and components is justified, that is, whether it improves the reliability of the platforms for short and long mission times. Estimating the reliability of the various blocks of the FPGA platforms helps the designer make the necessary changes and design more robust products.
Keywords-FPGA platforms; Reliability models; Failure rate; Redundancy; Fault Tolerance
I. INTRODUCTION
The use of Field Programmable Gate Array platforms is growing in terrestrial and space-based applications because of the low application development cost, the short time to market and the reprogramming flexibility that they offer. Given the current demand for feasible, reconfigurable and application-specific functionality among a host of space and terrestrial applications, high-density SRAM-based FPGAs provide a low-cost solution which can be readily developed for the desired constraints. The current industry trend of shrinking device size helps SRAM-based FPGAs become faster and denser, but at the same time also makes them less reliable. As transistor size shrinks, the charge required to switch a transistor, and thus induce a single bit error, also shrinks, making transistors more prone to radiation-induced errors. FPGA devices are susceptible to radiation-induced Single-Event Upsets (SEUs). If not corrected, an SEU can affect the system's performance, and in the worst case can result in system failure. In addition to space-based applications, there are many ground-level systems, such as bank servers, telecommunication servers and avionics, which require SEU tolerance and high reliability. To increase the reliability of FPGAs for both space and terrestrial applications, several techniques have been developed to overcome SEU errors. These techniques are referred to as device hardening techniques. The most common ones are: configuration scrubbing (memory scrubbing), error detecting and correcting codes, and redundancy. There is a rich body of research dedicated to the study of the reliability of FPGA architectures used in space and terrestrial applications, trying to increase the reliability at various design levels, from the microarchitecture to the system level [1-5].
In contrast with the previously mentioned references, this paper presents an estimation of the reliability parameters for FPGA platforms (hardware blocks). The FPGA platforms investigated are Nexys design platforms, built around Xilinx FPGA technology and manufactured by Digilent Inc. These platforms are used intensively in learning environments, such as colleges and universities worldwide, so they need to be reliable and robust, surviving intensive use over the years. To estimate the reliability of the FPGA platforms, the CARE (Computer Aided Reliability Engineering) CAD tool is used. The rest of the paper is organized as follows: Section II presents the FPGA platforms. Section III is an overview of the reliability metrics. Section IV presents the reliability estimation using the CARE CAD tool and the fault tolerance analysis. Section V presents the conclusions.
II. FPGA ARCHITECTURE
FPGA architectures based on SRAM (static RAM) contain programmable look-up tables (LUTs), programmable interconnects and flip-flops (FFs). A LUT is an SRAM with k address lines that can be programmed to realize any single-output Boolean function of up to k variables. The contents of the LUTs, and the interconnections between them, are programmed by serially downloading a bitstream to the FPGA. Along with the programmable LUTs, FPGA devices also contain FFs to implement sequential logic. In addition to LUTs and FFs, FPGAs contain synchronous memory blocks. The Nexys-2 is a powerful digital system design platform built around a Xilinx Spartan 3E FPGA. With 16 Mbytes of fast SDRAM and 16 Mbytes of Flash ROM, the Nexys-2 is ideally suited to embedded processors like Xilinx's 32-bit RISC MicroBlaze™. The on-board high-speed USB2 port, together with a collection of I/O devices, data ports and expansion connectors, allows a wide range of designs to be completed without the need for any additional components.
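The LUT described in Section II can be modelled in a few lines. The sketch below is only an illustration of the idea (a 2^k-entry memory addressed by the inputs); the class name `LUT`, its methods and the majority-function example are assumptions made here and do not come from the paper or from any Xilinx tool.

```python
from itertools import product


class LUT:
    """Hypothetical model of a k-input look-up table (names are illustrative)."""

    def __init__(self, k, contents):
        # A k-input LUT stores one output bit per input combination.
        if len(contents) != 2 ** k:
            raise ValueError("a k-input LUT needs exactly 2**k stored bits")
        self.k = k
        self.contents = list(contents)

    @classmethod
    def from_function(cls, k, func):
        # "Program" the LUT by evaluating func on every input combination,
        # in address order (first input is the most significant address bit).
        bits = [int(bool(func(*inputs))) for inputs in product((0, 1), repeat=k)]
        return cls(k, bits)

    def read(self, *inputs):
        # The inputs act as address lines into the stored bits.
        if len(inputs) != self.k:
            raise ValueError("wrong number of inputs")
        address = 0
        for bit in inputs:
            address = (address << 1) | (bit & 1)
        return self.contents[address]


# Example: a 3-input majority function stored in a 3-input LUT.
majority = LUT.from_function(3, lambda a, b, c: (a & b) | (a & c) | (b & c))
print(majority.contents)
print(majority.read(1, 0, 1))
```

A single flipped bit in `contents` changes the implemented function, which is why configuration-memory upsets matter for SRAM-based FPGAs.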
The complete schematics of the Nexys-2 platform, used for the reliability analyses presented in this paper, can be found online; see the references.
Figure 1. NEXYS-2 FPGA platform block diagram
III. RELIABILITY METRICS
The reliability of a system follows an exponential distribution law during the useful life phase of the system:
$R(t) = e^{-\lambda t}$ (1)
where $\lambda$ represents the failure rate and is assumed to have a constant value for electronic components during the useful life of the system. The mean time to failure (MTTF) of a system is the expected time until the first failure of the system. If the reliability function is defined by (1), then:
$\mathrm{MTTF} = 1/\lambda$ (2)
The estimation (prediction) of the failure rate $\lambda$ of electronic components is performed using military or commercial standards (handbooks), widely accepted in the reliability engineering world. Examples of such standards are Mil-Std-217 (US DoD), HRD5 (British Telecom) and Bellcore TR-332 (Bell Communications). The reliability estimations presented in this paper are based on the MIL-HDBK-217 standard. The failure rate of the electronic components is predicted using experimental data obtained by analyzing the failures of actual devices. There are two possible methods: the Part Counts method and the Full (Part Stress) method. The Part Counts method, known as C217F2, makes general assumptions about the stresses applied to each electronic component. It is most applicable early in the design phase and during proposal formulation. It requires less information than Part Stress Analysis: part quantities, quality level and application environment. It uses a "simplified" formula to calculate the failure rate of the components. The Full (Part Stress) method, known as S217F2, evaluates the thermal and electrical stresses that are applied to a component under given environmental conditions (operational environment and ambient temperature).
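Equations (1) and (2) can be checked numerically with a short sketch. The failure rate used below is a made-up value chosen only for illustration; it is not taken from the paper or from MIL-HDBK-217, and the function names are likewise assumptions.

```python
import math


def reliability(failure_rate, t_hours):
    """Eq. (1): R(t) = exp(-lambda * t), with lambda in failures per hour."""
    return math.exp(-failure_rate * t_hours)


def mttf(failure_rate):
    """Eq. (2): MTTF = 1 / lambda, valid for a constant failure rate."""
    return 1.0 / failure_rate


# Hypothetical failure rate (failures per hour), for illustration only.
lam = 2.0e-6

print("MTTF (hours):", mttf(lam))
for t in (1_000, 10_000, 100_000):
    print(t, "hours -> R(t) =", round(reliability(lam, t), 4))

# At t = MTTF the reliability is exp(-1), roughly 0.37, whatever lambda is.
print("R(MTTF) =", reliability(lam, mttf(lam)))
```

The last line is a useful point for a presentation: under the exponential model, only about 37% of units are expected to survive up to the MTTF.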
The Part Stress method is used in the final phases of the design, when there is concrete information about the type of components and the thermal and electrical stresses. To evaluate the reliability of the system, the serial model, the parallel model or a combination of both models can be used. The serial system assumes that all the components must survive for the system to operate correctly. The reliability of the system is:
$R_{\mathrm{series}}(t) = \prod_{i=1}^{n} R_i(t)$ (3)
where $R_i(t)$ represents the reliability of component $i$. Assuming the exponential failure law for each component:
$R_{\mathrm{series}}(t) = e^{-\sum_{i=1}^{n} \lambda_i t} = e^{-\lambda_{\mathrm{system}} t}$ (4)
$\lambda_{\mathrm{system}} = \sum_{i=1}^{n} \lambda_i$ (5)
The parallel system assumes that the components in the system have spares. As soon as a fault occurs in a component (module), the faulty component is replaced by a spare. Only one component needs to survive for the system to operate correctly. For the parallel system, the probability of failure is:
$Q_{\mathrm{parallel}}(t) = \prod_{i=1}^{n} Q_i(t)$ (6)
where $Q_i(t)$ represents the failure probability of component $i$. The reliability of the parallel system is:
$R_{\mathrm{parallel}}(t) = 1.0 - Q_{\mathrm{parallel}}(t) = 1.0 - \prod_{i=1}^{n} Q_i(t) = 1.0 - \prod_{i=1}^{n} \left(1.0 - R_i(t)\right)$ (7)
IV. RELIABILITY ESTIMATION USING CARE TOOL
A new method for estimating the reliability parameters of the FPGA platforms is presented, using the CARE (Computer Aided Reliability Engineering) software tool. CARE is an engineering tool, developed by BQR Reliability Ltd., intended for complex reliability analyses of commercial and military electronic and mechanical systems. The first analyses presented in this paper were performed using the MTBF (Mean Time Between Failures) module of the CARE software, using the Mil-Std-217 (US DoD) standard. Using the schematics of the Nexys2 board, the serial reliability model was created, using the Library Editor of CARE for the components. When electronic components of the Nexys2 schematics (FPGA platform) were not available in the CARE component library, components with similar reliability characteristics were used, after careful consideration and investigation. See Figure 2.
Figure 2. Library Editor of CARE
First, the C217F2 method was used, as a starting point. This prediction method does not take temperature into account when calculating the failure rate.
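The serial model used for the board reduces to summing component failure rates, as in Eqs. (4) and (5). The sketch below shows that bookkeeping in the style of a parts-count prediction. The parts list, quantities and failure rates are entirely hypothetical; they are not the Nexys2 bill of materials, not CARE output and not MIL-HDBK-217 values.

```python
import math

# Hypothetical parts list: (part type, quantity, failure rate per 1e6 hours).
# These figures are invented for illustration only.
PARTS = [
    ("resistor", 100, 0.002),
    ("capacitor", 60, 0.004),
    ("FPGA", 1, 0.5),
    ("memory IC", 2, 0.1),
]


def system_failure_rate(parts):
    """Eq. (5) with quantities: sum of quantity * rate, in failures per hour."""
    return sum(qty * rate for _, qty, rate in parts) * 1e-6


def series_reliability(parts, t_hours):
    """Eq. (4): reliability of the serial system at time t."""
    return math.exp(-system_failure_rate(parts) * t_hours)


def failure_share(parts):
    """Fraction of the system failure rate contributed by each part type."""
    total = sum(qty * rate for _, qty, rate in parts)
    return {name: qty * rate / total for name, qty, rate in parts}


lam_sys = system_failure_rate(PARTS)
print("system failure rate (per hour):", lam_sys)
print("MTBF (hours):", 1.0 / lam_sys)
print("R(10,000 h):", series_reliability(PARTS, 10_000))
for name, share in failure_share(PARTS).items():
    print(f"{name}: {share:.1%} of failures")
```

The `failure_share` breakdown is the same kind of information a failure-distribution-by-component chart conveys: a part type with a tiny individual failure rate can still contribute a large share when many copies are on the board.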
Because CARE provides a limited component library for this method, component substitutions were made for the less common resistors, capacitors, ICs, etc., found on the Nexys2 schematics. Figures 3 and 4 present the representation of the FPGA platform using the MTBF module and the results of the C217F2 prediction method. The MTBF module uses the serial reliability model for any system.
Figure 3. MTBF reliability serial model for the FPGA platform
Figure 4. Reliability prediction for the FPGA platform using C217F2
The failure distribution by components, based on this prediction method, is given in Figure 5. The resistors present the highest risk of failure, due to the large number of resistors on the platform.
Figure 5. Failure distribution by components
The second method used to perform reliability analyses (predictions) was the S217F2 prediction method. The library of components corresponding to this prediction method was used, and a series system similar to the one presented in Figure 3 was created. The analyses were performed for the GB condition (Ground Benign environment) and at different temperatures. In this case, the failure rate was more reasonably distributed. The actual FPGA (a complex IC) is now the component most prone to failure, rather than the other modules. Figures 6 and 7 present the results of the analysis.
Figure 6. Reliability prediction for the FPGA platform using S217F2
Figure 7. Failure distribution by components
For advanced reliability analysis, the VRBD module of the CARE tool was used. The Visual Reliability Block Diagram (VRBD) module of the CARE tool is intended to define and calculate Reliability, MTBF, MTTR and Availability for systems as a hierarchic combination of different functional components in a very simple way. This module offers more flexibility in building the reliability model of the system. It allows spares to be added to the main components, components to be repaired, etc.
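The effect of adding a spare to a block can be sketched with the parallel model of Eq. (7). This is a simplified illustration, not a reproduction of the VRBD calculation: it assumes one always-on (hot) spare identical to the main block, independent failures, perfect switch-over and no repair, and the failure rate is a made-up number.

```python
import math


def block_reliability(failure_rate, t_hours):
    """Eq. (1) for a single block with a constant failure rate (per hour)."""
    return math.exp(-failure_rate * t_hours)


def parallel_reliability(reliabilities):
    """Eq. (7): the parallel system fails only if every block fails."""
    q = 1.0
    for r in reliabilities:
        q *= 1.0 - r
    return 1.0 - q


# Hypothetical failure rate for one block (failures per hour), illustration only.
lam_block = 5.0e-6

for t in (1_000, 10_000, 100_000):
    single = block_reliability(lam_block, t)
    spared = parallel_reliability([single, single])  # main block + one hot spare
    print(f"t = {t:>7} h  no spare: {single:.4f}  one spare: {spared:.4f}")
```

Comparing the two columns at short and long mission times gives a quick sense of whether the extra hardware is worth its cost.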
Figure 8 presents the reliability model of the system using the VRBD module. The Core Database Manager was used to export the system created previously using the MTBF module (S217 prediction method approach). Several analyses performed with the MTBF and VRBD modules of the CARE tool indicate that some building blocks, such as the memory block, the connector block and the FPGA programming block, have the highest failure rates. These blocks can benefit from added fault tolerant features. The fault tolerant features can be implemented by adding redundant blocks (hardware redundancy). Adding a second memory block (hot spare) decreases the failure rate of the memory banks. Figure 9 presents the reliability curves for the system without and with a hot spare for the memory block.
Figure 8. RBD diagram of the system
Figure 9. Reliability graphs for memory block, without and with one spare
V. CONCLUSIONS
In this paper, reliability estimations for FPGA platforms are presented, using modules and prediction methods of the CARE CAD tool. Fault tolerant techniques, based on hardware redundancy, are suggested for the less reliable blocks of the platforms. This work was initiated as part of a graduate course, "Design of Fault Tolerant Systems", at RHIT, IN. The study was suggested by the manufacturer of the Nexys2 FPGA platforms, Digilent Inc. Estimating the reliability of the various blocks of the FPGA platforms helps the designer make the necessary changes and design more robust products.
REFERENCES
[1] A. Tiwari, K. Tomko, "Enhanced Reliability of Finite State Machines in FPGA Through Efficient Fault Detection and Correction", IEEE Transactions on Reliability, vol. 54, no. 3, pp. 459-467, September 2005.
[2] M. Normand, "Single event upset at ground level", IEEE Trans. on Nuclear Science, NY, vol. 43, pp. 2742-2750, December 1996.
[3] M. Wirthlin, N. Rollins, M. Caffrey, P. Graham, "Hardness By Design Techniques for Field Programmable Gate Arrays", 11th NASA Symposium on VLSI Design, Coeur d'Alene, Idaho, 2003.
[4] F. Lima Kastensmidt, G. Neuberger, L. Carro, R. Reis, "Designing and Testing Fault-Tolerant Techniques for SRAM based FPGA", ACM International Conference on Computing Frontiers, Ischia, Italy, 2004.
[5] B. Pratt, M. Caffrey, "Improving FPGA Design Robustness with Partial TMR", MAPLD Conference 2005, Washington DC.
[6] www.digilentinc.com
[7] I. Koren, C. Mani Krishna, "Fault-Tolerant Systems", Morgan Kaufmann Publishers, 2007.
[8] E. Dubrova, "Fault-Tolerant Design: An Introduction", course notes, Royal Institute of Technology, Stockholm, Sweden, 2013.
[9] http://www.sre.org/pubs/Mil-Hdbk-217F.pdf
[10] BQR, CARE-8-RBD-V8.8 User Manual, 2012.
[11] D. Siewiorek, R. Swarz, "Reliable Computer Systems: Design and Evaluation", AK Peters Ltd, 1998.