Reiner - The Xputer Lab Page
Keynote, VIPSI-2012 Conference, Dec. 31, 2012 to Jan. 1, 2013, Becici, Montenegro
Reiner Hartenstein, TU Kaiserslautern, Germany; IEEE fellow, FPL fellow, SDPS fellow
http://hartenstein.de, reiner@hartenstein.de, 28 December 2012
© 2012, reiner@hartenstein.de

VIPSI-2012 MONTENEGRO
Hotel Splendid in Becici, Dec 31, 2012 to Jan 1, 2013
The Tunnel Vision Syndrome: Challenging Computer Science Education
Reiner Hartenstein, TU Kaiserslautern

Preface
• Energy-efficient ICT infrastructures are urgently required, but impossible without reinventing ECS practices and education
• The main problem: the Tunnel Vision Syndrome

Outline (1)
• The Survival of our important ICT infrastructures
• The Tunnel Vision Syndrome
• The von Neumann Syndrome
• Reconfigurable Computing: data-stream-based
• Reinvent Computing to fully cover the Taxonomy
• Conclusions

Important ICT infrastructures
• Lufthansa Reservation anno 1960 [Courtesy Ernst Denert]
• http://wiki.answers.com/Q/Why_are_computers_important_in_the_world

PATMOS 2013: 23rd International Workshop on Power And Timing Modeling, Optimization and Simulation
• co-located with VARI 2013: 4th European Workshop on CMOS Variability
• Now with extended scope: energy-efficient ICT infrastructures are a survival issue of our economy
• http://xputer.de/PATMOS/

Beyond Oil: Predictions

Beyond Oil: Literature
• US: ~3 $
• … hundreds of books
• … post petroleum …
• Power consumption by internet: x30 until 2030 if trends continue [G. Fettweis, E. Zimmermann: ICT Energy Consumption - Trends and Challenges; WPMC'08, Lapland, Finland, 8-11 Sep 2008]

Google's Electricity Bill
• [Photo: at Dallas, © New York Times]
• Patent for water-based data centers
• Google going to sell electricity
• Cost of a data center determined by the monthly power bill
• "The possibility of computer equipment power consumption spiraling out of control could have serious consequences for the overall affordability of computing." [L. A. Barroso, Google]
• http://hartenstein.de/ComputerStromverbrauch.pdf

Outline (2)
• The survival of our important ICT infrastructures
• The Tunnel Vision Syndrome
• The von Neumann Syndrome
• Reconfigurable Computing: data-stream-based
• Reinvent Computing to fully cover the Taxonomy
• Conclusions

Systolic Arrays (1)
• Historic example of the Tunnel Vision Syndrome
• M. J. Foster and H. T. Kung: The Design of Special-Purpose VLSI Chips ...; IEEE 7th ISCA, La Baule, France, May 6-8, 1980
• Why not a general purpose methodology?

What Synthesis Method? (2)
• Of course algebraic (linear projection): supports only applications with strictly regular data dependencies
• 1995: Rainer Kress replaced it by simulated annealing*: this also supports any irregular and wild forms of pipe networks
• *) KressArray [ASP-DAC-1995], http://kressarray.de/

Who generates the data streams?
• Who generates (or receives) the data streams? "It's not our job"
• Without a sequencer, the machine paradigm was left undefined
• [Figure: data stream pattern diagram]
• http://xputer.de/
• http://data-streams.org/

The Data Stream Machine (anti-machine): an example
• Uses data counters, no program counter
• Pipeline network example: any irregular pipe network structure is supported (supersystolic array)
• asM: auto-sequencing memory, with a reconfigurable address generator (GAG) inside
• The asM blocks are programmed by Flowware
• GAG and enabling technology: published 1989; survey: [M. Herz et al.: IEEE ICECS 2003, Dubrovnik]
• [Figure: pipe network surrounded by asM blocks]

Systolic Arrays (2)
• M. J. Foster and H. T. Kung: The Design of Special-Purpose VLSI Chips ...; IEEE 7th ISCA, La Baule, France, May 6-8, 1980
• From La Baule to the airport; Oct. 23, 2012
• Mario Barbacci: "VAX? That's why it is so slow"

"Mini"computers: VAX-11/750
• Too many terminals
• Sorrowful experiences with the VAX-11/750, the quasi-standard around 1980
• Personally: NATO ASI on VLSI at SOGESTA, Urbino, Italy, 1981
• UC Berkeley CS department
• At Kaiserslautern: my Xputer lab, E.I.S. project

Outline (3)
• The survival of our important ICT infrastructures
• The Tunnel Vision Syndrome
• The von Neumann Syndrome
• Reconfigurable Computing: data-stream-based
• Reinvent computing to fully cover the taxonomy
• Conclusions

The von Neumann Syndrome
• von Neumann: by far the most inefficient machine paradigm

The History of Computing
• The first electrical computer, prototyped and ready for mass production: which year, which company?
• Prototype 1884: Herman Hollerith; used in the 1890 US census
• The first reconfigurable computer: datastream-based!
• Not yet invented in 1884: magnetic tape (1898*), the vacuum tube (1904), magnetic drum (1932), the transistor (1934), ferrite core memory (1949), hard disc (1956)
• *) wire only

Punched Card Data Memory
• Non-volatile!
• … state of the art …
• The LUT (lookup table) size: 2 refrigerators
• First Xilinx FPGA 100 years later

Tunnel Vision: EDSAC 2
• 6 decades later: paradigm shift from data streams to instruction streams
• Other paradigms fully invisible; even hardware design went von Neumann
• EDSAC 2, 1958: first microprogrammable computer, proposed 1951
• About 3 hours MTBF; 30 tons, 178 kW; almost 1000 square feet of floor space
• Microprogramming: nested von Neumann machines (instruction streams + microinstruction streams)
• Trailblazing Reconfigurable Computing? No: nested von Neumann bottlenecks, multiple multiplexing overhead [Günter Koch et al.: "The universal Bus considered harmful"; 1st EUROMICRO Symp., June 1975, Nice, France]
• Brief History of Microprogramming: http://cs.clemson.edu/~mark/uprog.html

Outline (4)
• The Survival of our important ICT infrastructures
• The Tunnel Vision Syndrome
• The von Neumann Syndrome
• Reconfigurable Computing: data-stream-based
• Reinvent Computing to fully cover the Taxonomy
• Conclusions

Speed-up factors are not new
• PISA project: >15000
• DPLA replacing 256 FPGAs, 1984 (E.I.S. project)
• Power save ~10%

Speed-up Factors by Software to FPGA migration (avoiding the von Neumann syndrome)
• [Chart: speed-up factor on a logarithmic axis, 10^3 to 10^6, plotted per application]
• Application areas: image processing, pattern matching, multimedia; DSP and wireless; bioinformatics; crypto; astrophysics
• Applications labelled: Reed-Solomon decoding, video-rate stereo vision, MAC, pattern recognition, SPIHT wavelet-based image compression, real-time face detection, FFT, Viterbi decoding, BLAST, protein identification, DNA seq., Smith-Waterman pattern matching, molecular dynamics simulation, CT imaging, DES breaking, GRAPE
• Plotted values as transcribed (unpaired): 28500, 3439, 6000, 730, 1000, 900, 400, 52, 40, 288, 457, 88, 2400, 1116*, 8723, 3000, 1000, 100, 20, 100
• *) DES breaking: equipment size

RC*: the intensive Impact
• [Tarek El-Ghazawi et al.: IEEE COMPUTER, Febr. 2008]: SGI Altix 4700 with RC 100 RASC compared to Beowulf cluster

  Application     Speed-up factor     Savings factors
                                      Power     Cost     Size
  DES breaking    28514               3439      96       1116

• Massively saving energy; much less equipment needed
• *) RC = Reconfigurable Computing

Data-Flow (Stream) Execution Models for Extreme Scale Computing (DFM 2012)
• Minneapolis, USA, Sep 19-23, 2012, in conjunction with PACT 2012
• http://www.cs.ucy.ac.cy/dfmworkshop/

RC: why it's so efficient
• Its efficiency is data-stream-based: it avoids the extremely memory-cycle-hungry von Neumann syndrome
• The anti-machine paradigm: no instruction streams going through the FPGA fabrics at run time

A Clean Terminology, please

  Program source     Compilation result
  Software           instruction streams
  Flowware           data streams
  Configware         datapath structures configured

FPGA's Semiconductor Market Share [courtesy Nick Tredennick]
• Why stalled? Still < 2%: the RC paradox
• FPGAs' Achilles' heel: long development time; VHDL/Verilog still dominant
• Design software unusable except by experts
• FPGA companies' wrong top-level management*: first circuit designers, now logic designers; should be programmers
• Nick Tredennick: "a generation behind in required expertise."
• Evolution of FPGAs
• [Peter Thorwartl]

Outline (5)
• The survival of our important ICT infrastructures
• The Tunnel Vision Syndrome
• The von Neumann Syndrome
• Reconfigurable Computing: data-stream-based
• Reinvent computing to fully cover the taxonomy
• Conclusions

Going beyond the tunnel
• Creating energy-efficient ICT infrastructures is far more than just a circuit design issue
• Energy-efficient programming: not with curricula from the mainframe age!

A huge design space
• Programmability crisis: a solution is impossible without mastering the entire design space

Mike Flynn's taxonomy
• The tunnel view of the pre-manycore age: Instruction vs. Data, Single vs. Multiple
• Extending Flynn's taxonomy by going heterogeneous:
• Reiner's Taxonomy
• Reconfigurable or not (Diana Göhringer's Ph.D. thesis: Diana's Taxonomy)
• Datastream-based (anti-machine): noI versus SI or MI

The Mead-&-Conway strategy: Removal of the education dilemma
• [Diagram: design levels (application, RT, logic, switching, circuit, layout) with submit/reject hand-offs between them]
• Fragmentation: in-house technology, large width of specialization
• Coherence: Silicon Foundry (external technology), reduced width of specialization
• Clearing out and intuitive models
• Division of specialization: the "tall thin man"

Education Revolution: the M-&-C Design Revolution [1980]
• Carver Mead, Lynn Conway
• VLSI Design Education Spreading Rapidly 1980 - 1983 world-wide
• Incubator of workstation and EDA industry etc.
• The most effective project in the history of modern computer science
• The E.I.S. project: http://xputer.de/EIS/

Outline (6)
• The survival of our important ICT infrastructures
• The Tunnel Vision Syndrome
• The von Neumann Syndrome
• Reconfigurable Computing: data-stream-based
• Reinvent computing to fully cover the taxonomy
• Conclusions

We need "une Levée en Masse"

thank you

backup for discussion

What form of Parallelism? [Hartenstein's watering can model]
• Instruction-stream-based approach: many von Neumann bottlenecks
• Data-stream-based approach: no von Neumann bottleneck
• I used this picture in several earlier talks since it is popular. Other speakers use it too: see next slide.
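
The data-stream-based side of this contrast can be made concrete with a small simulation. What follows is only a sketch under stated assumptions, not the Xputer implementation: `scan_addresses`, `AutoSequencingMemory`, `run_pipe` and `datapath` are hypothetical names, and the row-by-row 2-D scan is merely one pattern that a reconfigurable address generator inside an asM might be configured to produce. The point it illustrates is that data counters step through memory and feed a fixed, pre-configured datapath, with no instruction fetched at run time.

```python
from typing import Callable, Iterator, List


def scan_addresses(base: int, width: int, rows: int, row_stride: int) -> Iterator[int]:
    """Hypothetical address generator: a 2-D scan pattern, row by row.

    The two loop variables play the role of data counters; there is no
    program counter and no instruction fetch in this model.
    """
    for r in range(rows):
        for c in range(width):
            yield base + r * row_stride + c


class AutoSequencingMemory:
    """Toy model of an asM: a memory bank plus its own address generator."""

    def __init__(self, cells: List[int], addresses: Iterator[int]) -> None:
        self.cells = cells
        self.addresses = addresses

    def stream(self) -> Iterator[int]:
        # Emit one data item per step, in the order the generator dictates.
        for addr in self.addresses:
            yield self.cells[addr]


def run_pipe(source: AutoSequencingMemory, datapath: Callable[[int], int]) -> List[int]:
    """Push the data stream through a fixed (already configured) datapath."""
    return [datapath(item) for item in source.stream()]


# A 4x4 array stored row-major; scan only the inner 2x2 window.
memory = list(range(16))
asm = AutoSequencingMemory(
    memory, scan_addresses(base=5, width=2, rows=2, row_stride=4)
)
result = run_pipe(asm, datapath=lambda x: 2 * x + 1)
print(result)  # prints [11, 13, 19, 21]
```

Changing the scan pattern only changes the address generator's parameters; the datapath stays untouched, which is the separation between Flowware (data streams) and Configware (datapath structure) that the terminology slide describes.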
Parallelism solution (Copyright ⓒ 2005 J.D.Cho)
• The instruction-stream-based approach: von Neumann bottlenecks
• The data-stream-based approach has no von Neumann bottleneck

Dual paradigm mind set: an old hat, but ignored
• Time to space mapping: procedural to structural
• [Figure: token bit "evoke" flip-flop (FF) circuits, labelled 1967 and 1971]
• W. A. Clark: Macromodular Computer Systems; 1967 SJCC, AFIPS Conf. Proc.
• C. G. Bell et al.: The Description and Use of Register-Transfer Modules (RTM's); IEEE Trans-C21/5, May 1972

Duality of procedural Languages

  Software Languages                 Flowware Languages
  program counter                    data counter(s)
  read next instruction              read next data item
  goto (instruction address)         goto (data address)
  jump to (instruction address)      jump to (data address)
  instruction loop                   data loop
  instruction loop nesting           data loop nesting
  instruction loop escape            data loop escape
  instruction stream branching       data stream branching
  no: internally parallel loops      yes: internally parallel loops

• Flowware is simpler: no ALU tasks
• But there is an asymmetry

All but ALU is overhead: x20 efficiency
• [R. Hameed et al.: Understanding Sources of Inefficiency in General-Purpose Chips; 37th ISCA, June 19-23, 2010, St. Malo, France]
• Just one of several overhead layers (data cache)

Program Engineering (2): The Generalization of Software Engineering
• vN versus anti-machine (data stream machine*)
• PE: Program Engineering
• SE: Software Engineering (CPU)
• FE: Flowware Engineering (asM: auto-sequencing memory)
• CE: Configware Engineering (DPU: Data-Path Unit; DPA: Data-Path Array; pipe network model etc.)
• Program Engineering structures
• *) do not confuse with "dataflow"!

The Bubble Sort algorithm
• [Figure: chain of conditional swap units]

Parallelized Bubble Sort (Shuffle Sort)
• [Figure: array of conditional swap units]
• Direct time to space mapping, modified by a shuffle function
• Accessing conflicts: partly back to time mapping
• Shuffle Sort* (animation)
• *) http://xputers.informatik.uni-kl.de/papers/publications/diploma-theses.html#Duhl

Other voices
• The Hardware Architecture Challenge: more parallelism needed by orders of magnitude
• Entirely new software stack needed: new scalable and robust OS needed
• Fundamental programming issues: new software architectures required for OS, RS, APIs and compilers
• The High Cost of Data Movement: hardware mechanisms for efficient communication
• How to provide a non-disruptive path for existing application code?
• A programming model expressing all available parallelism and locality
• Compilers and run time systems exploiting parallelism and locality

• IBM Roadrunner: 2,483 kW; ASCI Red: 850 kW
• BTW: July 2005: an early trailblazer
• Exascale is no longer some vague over-the-horizon notion but rather an aggressively sought-after goal.
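
Returning to the Bubble Sort and Shuffle Sort slides: their conditional-swap structure can be illustrated in software. The sketch below does not reproduce the Shuffle Sort wiring from the cited diploma thesis. As a stand-in it uses the well-known odd-even transposition sort, in which every phase consists of conditional swaps on disjoint neighbour pairs, so a row of data-path units could in principle perform a whole phase at once. The names `conditional_swap` and `odd_even_transposition_sort` are chosen for illustration only.

```python
from typing import List, Tuple


def conditional_swap(a: int, b: int) -> Tuple[int, int]:
    """One conditional-swap unit: smaller value goes left, larger goes right."""
    return (a, b) if a <= b else (b, a)


def odd_even_transposition_sort(values: List[int]) -> List[int]:
    """Sort with n phases of conditional swaps on disjoint neighbour pairs.

    Within one phase no two swaps touch the same element, so in hardware
    they could all run in parallel; here they are simulated one by one.
    """
    data = list(values)
    n = len(data)
    for phase in range(n):
        # Even phases use pairs (0,1), (2,3), ...; odd phases (1,2), (3,4), ...
        start = phase % 2
        for i in range(start, n - 1, 2):
            data[i], data[i + 1] = conditional_swap(data[i], data[i + 1])
    return data


print(odd_even_transposition_sort([7, 3, 9, 1, 4, 8, 2]))  # prints [1, 2, 3, 4, 7, 8, 9]
```

Mapping each phase onto its own row of swap units gives a pure time to space mapping; reusing a single row for all phases trades area for time, which is the kind of "partly back to time mapping" compromise the slide mentions.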
The future of computing has never seemed so uncertain

Language designer's tunnel vision
• Absurdly incomprehensible abstractions in "standard" languages
• Removing paradigm domains and abstraction layers hides critical sources of efficiency limits: memory mapping issues, overhead and bottlenecks
• We must change how programmers think, also by …
• Teaching students the tunnel vision of language designers?

Data-Flow (Stream) Execution Models for Extreme Scale Computing (DFM 2012)
• DFM: the ubiquitous Memory Wall
• The High Cost of Movement of Data (and Instructions)
• DF systems could be simpler and more power efficient in handling concurrency and latencies
• It's time to revisit data-driven computation and bring it to multi-core and extreme scale computing: an overall system concept including hardware and software
• Novel DF-inspired models, paradigms, architectures, compilers and tools for multi-core and supercomputing
• No evolutionary extension of current models

Thomas Sterling
• Radical "disruptive research" is required in programmability
• Operating Systems for Exascale Computing and Beyond

Supercomputer High-End Programmer Productivity
• The Law of More: programmer productivity declines disproportionately with increasing parallelism
• In particular HPC application domains, massive parallelism requires 10 - 30 professionals in multi-disciplinary, multi-institutional teams for 5 - 10 years [Douglass Post, DoD HPCMP, panelist at SC07]

Will we never reach Zettaflops?

  Name          FLOPS
  yottaFLOPS    10^24
  zettaFLOPS    10^21
  exaFLOPS      10^18
  petaFLOPS     10^15
  teraFLOPS     10^12
  gigaFLOPS     10^9
  megaFLOPS     10^6
  kiloFLOPS     10^3

Trailblazing (the Xputer)
• 1975, Nice: Universal Bus considered harmful
• ICCAD 1984, Santa Clara: PISA
• 1988: Workshop on HW accelerators, Oxford
• MoM
• Killarney 1989
• Super systolic MoM
• COMPEURO Hamburg 1989
• ICPP-90
• Xputer
• HICSS Koloa, 1991
• Auto-sequencing 2-dim memory space

Links to Reinvent Computing
• The Grand Challenge to Reinvent Computing: http://xputer.de/pucminas/
• Reinvent Computing? This idea is not new. See the keynote by Burton Smith (former Cray CTO): http://xputer.de/reinvent/
• Invasic Computing: aggressive 30-person project: http://xputer.de/invasic/
• Invasive Computing: An Overview: http://xputer.de/invasive/
• KAHRISMA: KArlsruhe's Hypermorphic Reconfigurable-Instruction-Set Multigrained-Array Processor: http://www.kahrisma.de/ http://xputer.de/kahr/
• ARAMiS (Automotive, Railway and Avionics Multicore Systems), a large German/European project: http://xputer.de/aramis/
• 1st Workshop on Data-Flow Execution Models for Extreme Scale Computing (DFM-2011): http://xputer.de/DFM1/
• DFM 2012: http://xputer.de/DFM2/

Bio
Dr.-Ing. Reiner Hartenstein is full professor at TU Kaiserslautern and an independent expert and consultant on EDA in Reconfigurable Computing. A student of Karl Steinbuch, he received all his academic degrees in EE from KIT (Karlsruhe Institute of Technology), where he later was associate professor, working in image processing, computer architecture and hardware description languages. He appreciates a decade of fruitful cooperation with colleagues of the University of Brasilia.
Prof. Hartenstein is FPL fellow, SDPS fellow, IEEE fellow and recipient of other awards. He has given more than 200 invited talks and 40 international keynote addresses. He has published more than 400 papers and authored, edited or co-edited 16 books.