GINORMOUS SYSTEMS
April 30–May 1, 2013, Washington, D.C.
RECONNAISSANCE PAPERS 2013
INDEX TO PAPERS

Sure, Big Data Is Great. But So Is Intuition by Steve Lohr
Are the algorithms shaping our digital world too simple or too smart? There’s no easy answer; listening to data is important, but so is experience and intuition.
http://www.nytimes.com/2012/12/30/technology/big-data-is-great-but-dont-forget-intuition.html?ref=technology&_r=0

Looking to Industry for the Next Digital Disruption by Steve Lohr
This article looks at GE’s big bet on what it calls the “industrial Internet,” bringing digital intelligence to the physical world of industry as never before.
http://www.nytimes.com/2012/11/24/technology/internet/ge-looks-to-industry-for-the-next-digital-disruption.html?pagewanted=all

GE’s Industrial Internet and the Smart Grid in the Cloud by Jeff St. John
Building an Internet for the utility industry that ties together smart meters, grid sensors, and enterprise IT into a cloud-hosted package may be technologically achievable, but who will manage and integrate the back end, legacy billing, and customer service?
http://www.greentechmedia.com/articles/read/ges-industrial-internet-and-the-smart-grid-in-the-cloud

An Internet for Manufacturing by Michael Fitzgerald
Can the industrial Internet go from its current state of facility intranet to a global Internet, feeding information back about weather conditions, supply and demand, and logistics?
http://www.technologyreview.com/news/509331/an-internet-for-manufacturing/

One Big Cluster: How CloudFlare Launched 10 Data Centers in 30 Days by Sean Gallagher
Content delivery provider CloudFlare builds its data centers without ever setting foot in them: it ships the equipment with a how-to manual, then uses open-source software, storage tricks from the world of big data, a bit of network peering arbitrage, and voila!
http://arstechnica.com/information-technology/2012/10/one-big-cluster-how-cloudflare-launched-10-data-centers-in-30-days/

Daniel Tunkelang Talks about LinkedIn’s Data Graph by Paul Miller
Speaker Daniel Tunkelang says not to get too obsessed with building systems that are perfect. Instead, he suggests we communicate with users and offer UIs that guide and help them explore.
http://semanticweb.com/daniel-tunkelang-talks-about-linkedins-data-graph_b29699

Beyond MapReduce: Hadoop Hangs On by Matt Asay
Hadoop helped usher in the big-data movement. Yet the Web world is moving toward real-time, ad-hoc analytics that batch-oriented Hadoop can't match.
http://www.theregister.co.uk/2012/07/10/hadoop_past_its_prime/

10 Predictions about Networking and Moore’s Law from Andy Bechtolsheim
Speaker Andy Bechtolsheim suggests that we are in the golden age of networking, and he predicts that the economics of chips will change, architecture and flexibility will matter even more, and that Moore’s Law is alive and well.
http://venturebeat.com/2012/10/11/bechtolsheims-10-predictions-about-networking-and-moores-law/

Internet Architects Warn of Risks in Ultrafast Networks by Quentin Hardy
The article profiles Arista and its two founders, including speaker Andy Bechtolsheim, both of whom say the promise of having access to mammoth amounts of data instantly, anywhere, is matched by the threat of catastrophe. The company was built with the 10-gigabit world in mind.
http://www.nytimes.com/2011/11/14/technology/arista-networks-founders-aim-to-alter-how-computers-connect.html?pagewanted=all

A “Big Data” Freeway for Scientists by John Markoff
This article looks at a new advanced optical computer network that is intended to serve as a “Big Data freeway system” for next-generation science projects in fields including genomic sequencing, climate science, electron microscopy, oceanography, and physics. The network is at the University of California, San Diego, and was developed by Arista Networks.
http://bits.blogs.nytimes.com/2013/03/20/a-big-data-freeway-for-scientists/

A New Approach to Innovation Will Be Crucial in the Coming Era of Cognitive Systems by Bernard Meyerson
Speaker Bernie Meyerson argues that in the early stages of building cognitive systems, the benefits will arrive sooner and stronger if companies, governments, and universities adopt a culture of innovation that includes making big bets, fostering disruptive innovations, taking a long-term view, and collaborating across institutional boundaries.
http://asmarterplanet.com/blog/2013/01/a-new-approach-to-innovation-will-be-needed-in-the-coming-era-of-cognitive-systems.html

Construction of a Chaotic Computer Chip by William Ditto, K. Murali, and Sudeshna Sinha
Speaker William Ditto and his colleagues discuss progress on the construction of a chaotic computer chip consisting of large numbers of individual chaotic elements that can be individually and rapidly morphed to become the full range of logic gates. Such a chip of arrays of morphing chaotic logic gates can then be programmed to perform higher-order functions and to rapidly switch among such functions.
http://www.imsc.res.in/~sudeshna/Ditto for ICAND.pdf

Panasas Kingpin: What's the Solid State State of Play? by Chris Mellor
Speaker Garth Gibson provides his perspective on what NAND flash can do now for high-performance computing storage and how it will evolve.
http://www.theregister.co.uk/2012/03/29/panasas_on_ssd/

Storage at Exascale: Some Thoughts from Panasas CTO Garth Gibson
What kind of storage performance will need to be delivered to achieve exascale computing? Speaker Garth Gibson answers that question, and others, in this interview.
http://www.hpcwire.com/hpcwire/2011-05-25/storage_at_exascale_some_thoughts_from_panasas_cto_garth_gibson.html

Biff (Bloom Filter) Codes: Fast Error Correction for Large Data Sets by Michael Mitzenmacher and George Varghese
Large data sets are increasingly common in cloud and virtualized environments. There is a need for fast error correction or data reconciliation in such settings, even when the expected number of errors is small. The authors, including speaker Michael Mitzenmacher, consider error correction schemes designed for large data.
http://cseweb.ucsd.edu/~varghese/PAPERS/biffcodes.pdf

Verifiable Computation with Massively Parallel Interactive Proofs by Justin Thaler, Mike Roberts, Michael Mitzenmacher, and Hanspeter Pfister
In the cloud, the need for verifiable computation has grown increasingly urgent. The authors believe their results with verifiable computation demonstrate the immediate practicality of using GPUs for such tasks, and more generally, that protocols for verifiable computation have become sufficiently mature to deploy in real cloud computing systems.
http://arxiv.org/pdf/1202.1350v3.pdf

Unreported Side Effects of Drugs Are Found Using Internet Search Data, Study Finds by John Markoff
Using data drawn from Web-wide search queries, scientists have been able to detect evidence of unreported prescription drug side effects before they were found by the Food and Drug Administration’s warning system.
http://www.nytimes.com/2013/03/07/science/unreported-side-effects-of-drugs-found-using-internet-data-study-finds.html?ref=technology&_r=0

Six Provocations for Big Data by danah boyd and Kate Crawford
With the increased automation of data collection and analysis, as well as algorithms that can extract and inform us of massive patterns in human behavior, it is necessary to ask which systems are driving these practices and which are regulating them. In this essay, the authors offer six provocations that they hope can spark conversations about the issues of big data.
http://softwarestudies.com/cultural_analytics/Six_Provocations_for_Big_Data.pdf

Why Hadoop Is the Future of the Database by Cade Metz
A revamped Hadoop, operating more like a relational database, can now store massive amounts of information and answer questions using SQL significantly faster than before.
http://www.wired.com/wiredenterprise/2013/02/pivotal-hd-greenplum-emc/

Algorithms Get a Human Hand in Steering Web by Steve Lohr
Computers are being asked to be more humanlike in what they figure out. Although algorithms are growing ever more powerful, fast, and precise, computers are not always up to deciphering the ambiguity of human language and the mystery of reasoning.
http://www.nytimes.com/2013/03/11/technology/computer-algorithms-rely-increasingly-on-human-helpers.html?hp&_r=0

How Complex Systems Fail by Richard I. Cook
This essay shows us 18 ways in which we can look at the nature of failure, how failure is evaluated, and how failure is attributed to proximate cause.
http://www.ctlab.org/documents/How%20Complex%20Systems%20Fail.pdf

What Data Can’t Do by David Brooks
Data can be used to make sense of mind-bogglingly complex situations, yet it can obscure values and struggle with context and social cognition.
http://www.nytimes.com/2013/02/19/opinion/brooks-what-data-cant-do.html?hpw

In Mysterious Pattern, Math and Nature Converge by Natalie Wolchover
A universality pattern, arising from an underlying connection to mathematics, is helping to model complex systems from the Internet to Earth’s climate.
http://www.wired.com/wiredscience/2013/02/math-and-nature-universality/all/

Embracing Complexity
We are beginning to understand that complex systems are even more complex than we first thought. Complexity theorists are now studying how physical systems go through phase transitions to try to predict when everyday networks will go through potentially catastrophic changes.
http://fqxi.org/community/articles/display/174

Optimization of Lyapunov Invariants in Verification of Software Systems by Mardavij Roozbehani, Alexandre Megretski, and Eric Feron
The authors of this paper have developed a method for applying principles from control theory to formal verification, a set of methods for mathematically proving that a computer program does what it’s supposed to do.
http://arxiv.org/pdf/1108.0170v1.pdf

ADDITIONAL WEB SITES, REFERENCES, AND PAPERS

Roberto Rigobon Published Papers
This website contains papers on economics by speaker Roberto Rigobon.
http://web.mit.edu/rigobon/www/Robertos Web Page/New.html

The Billion Prices Project
Co-led by speaker Roberto Rigobon, this project is an academic initiative that uses prices collected daily from hundreds of online retailers around the world to conduct economic research.
http://bpp.mit.edu/

Research activities by Katherine Yelick
Speaker Katherine Yelick is involved in a number of HPC projects; the last link is to her publications.
http://upc.lbl.gov/
http://bebop.cs.berkeley.edu/
http://crd.lbl.gov/groups-depts/ftg/projects/current-projects/DEGAS/
http://parlab.eecs.berkeley.edu/
http://www.cs.berkeley.edu/~yelick/papers.html

The Cloudant Blog
Keep up on the latest developments at Cloudant and from speaker Michael Miller.
https://cloudant.com/blog/

A Case for Redundant Arrays of Inexpensive Disks (RAID) by David Patterson, Garth Gibson, and Randy Katz
The original Berkeley RAID paper, co-authored by speaker Garth Gibson.
http://www.cs.cmu.edu/~garth/RAIDpaper/Patterson88.pdf

Michael Mitzenmacher: Publications by Year
Peruse speaker Michael Mitzenmacher’s papers on the general subject of verifiable computation.
http://www.eecs.harvard.edu/~michaelm/ListByYear.html

My Biased Coin
Speaker Michael Mitzenmacher’s take on computer science, algorithms, networking, information theory, and related items.
http://mybiasedcoin.blogspot.com/

HPC Wire
HPCwire is a news and information portal covering the fastest computers in the world and the people who run them.
http://www.hpcwire.com/

Datanami
Datanami is a news portal dedicated to providing insight, analysis, and up-to-the-minute information about emerging trends and solutions in big data.
http://www.datanami.com/

REFERENCES FROM PREVIOUS TTI/VANGUARD CONFERENCES

Previous TTI/Vanguard conferences have contained discussions and presentations on a number of topics related to those being presented at our conference in Washington, D.C. These may be accessed from the Member Archives section of our website (ttivanguard.com) as Reinforcement Papers and as the actual presentations.

Understanding Understanding – TTI/Vanguard Conference, October, 2012 – Pittsburgh, Pennsylvania
Taming Complexity – TTI/Vanguard Conference, October, 2011 – Washington, D.C.
Real Time – TTI/Vanguard Conference, July, 2011 – Paris, France
Matters of Scale – TTI/Vanguard Conference, July, 2010 – London, England
Ahead in the Clouds – TTI/Vanguard Conference, February, 2009 – San Diego, California
Smart(er) Data – TTI/Vanguard Conference, February, 2008
All That Data – TTI/Vanguard Conference, February, 2005 – Washington, D.C.
The Challenge of Complexity – TTI/Vanguard Conference, September, 2004 – Los Angeles, California

We Choose to Go Exascale, Not Because It’s Easy, but Because It’s Hard – Dr. Satoshi Matsuoka
Every year, the information technologies sector consumes an increasing share of global electricity production. While there is the potential for this burden to lessen somewhat as more people turn from desktops and laptops to their smaller-screened cousins—smartphones and tablets—it is incumbent on the computing sector to find ways to become as energy-miserly as possible. There are few places that recognize this as fully as Japan, where the available generating capacity plunged precipitously in the wake of the 3.11 Tohoku earthquake and the subsequent nuclear disaster.
Not only is less electricity there to be had, but quick replacements were of the dirty form, generated from coal-powered plants, which is antithetical to the nation’s commitment to clean energy. Satoshi Matsuoka of the Tokyo Institute of Technology is looking to IT to help solve the problem to which it otherwise contributes. July, 2012 – Tokyo, Japan Energy and Parallelism: The Challenge of Future Computing – Dr. William Dally In terms of transistor size, there remain some generations of Moore’s Law progress ahead, but the performance improvements that have traditionally accompanied the shrinking core are in technology’s rearview mirror. The problem: The computational roadmap has hit the power wall. The development of multicore chips has provided a brief respite from the inevitable, but even these undergo routine power throttling to keep systems from overheating. This fundamental balance of power in vs. waste heat out is as relevant to the world’s most powerful supercomputers as it is to smartphones—no digital technology is immune. William Dally of NVIDIA makes it clear that none of this is news; he also makes it clear that not enough is being done to ameliorate the problem. December, 2011 – Miami, Florida HPC Clouds: Raining Multicore Services – Dr. Daniel Reed Over its history, science, in all its disciplines, has relied on two fundamental approaches for advancement: empiricism and theory. The growth of the digital domain in the past decades has added simulation-based computation to this short list, and more recently still has emerged a fourth paradigm of science: data-intensive science. In this fourth paradigm, researchers dig into huge data stores and ongoing data streams to reveal new patterns to drive science forward. To make progress requires extreme computing; to begin with, the data sets are extreme, but also involved are an extreme number of processors working on the data with extreme parallelism. The problems of high-performance computing (HPC) in science are closely aligned with those of consumer-oriented data center computation and cloud computing, although the sociological underpinnings of the two communities couldn’t be more different. Putting it succinctly, Microsoft Research’s Daniel Reed compares HPC and cloud computing as “twins separated at birth.” Whereas the cloud model makes processor- and data-intensive computation accessible to the masses, both with piecewise business models and useful tools, HPC possesses huge resources but is hardly user-friendly. December, 2009 – Salt Lake City, Utah The Parallel Revolution Has Started – Dr. David Patterson For decades, Moore’s Law has driven down the size of transistors and thereby driven up the speed and capability of the devices that rely on the chips those transistors compose. As chips have gotten faster, individuals and institutions have been eager to replace each current generation of computer, electronic apparatus, or widget with its follow-on because of the new potential it promises, rather than because its predecessor had ceased to perform as intended. Hiding within Moore’s Law was a nasty little secret, however: More transistors demand more power, and more power generates more waste heat. In 2005, Intel’s single-core chip clock rate maxed out at a scant handful of gigahertz and could go no higher without exceeding the power a standard-issue computer could withstand (~100 W). 
The move to multiple-core chips resulted not from a great technological breakthrough, but rather from the necessity of chip makers coming face to face with a power wall. This new way to keep chip speed chugging along at Moore’s Law pace, however, assumes a commensurate ability of programmers to make chips perform to the collective potential of their cores. Assumptions—as everyone learns in grade school—are fraught with danger, and in this case the industry finds itself unprepared to switch on a dime from a sequential-programming mindset to a parallel-programming mindset. David Patterson of the University of California-Berkeley’s Parallel Computing Laboratory (Par Lab) is sounding the alarm.
December, 2008 – Phoenix, Arizona

Reinventing Computing – Mr. Burton Smith
Computers, as universal algorithm-executing devices, genuinely benefit from increases in speed. The steady half-century-long fulfillment of the Moore’s Law promise of ever-increasing on-chip transistor density is nearing the end of its viable lifecycle, however, and chip manufacturers are turning to other techniques to satisfy the seemingly insatiable appetite for performance. The new paradigm adopted by Intel, AMD, and others is multiple cores on the same chip; cores can be homogeneous or imbued with specialized capabilities, such as a graphics processing unit (GPU) and a central processing unit (CPU) on a single die with shared memory. A chip with up to eight cores is dubbed multicore; raise the ante higher, and it is called manycore. As hardware manufacturers forge ahead, the software community is scrambling to catch up. It takes more than parallel hardware to create a parallel computer: it takes parallel code, appropriate languages with which to create it, and a change in zeitgeist from which to approach the matter. Such is the conundrum faced by Microsoft’s Burton Smith.
December, 2007 – Santa Monica, California

On the Interaction of Life and Machines in Ultra-Large-Scale Systems – Dr. Richard Gabriel
Today’s Internet may seem immense, uncontrolled, and almost with a life of its own—just as its founders and the end-to-end principle intended. Yet, Richard Gabriel of Sun Microsystems sees it as a diminutive precursor of the ultra-large-scale (ULS) system—or systems—to come, the comprehension of which is beyond our ken. Comprising trillions of lines of code and an untold number of components—large and small, coming and going, well-defined and ephemeral, tightly integrated and discontinuous—a ULS system is too complex to plan, analyze, or even build at the current juncture. Once conceived, this real-time, embedded, distributed system will inherently exceed familiar management boundaries derived from the philosophical underpinnings and tools of modern-day computer science, which indeed exist for the management of complexity through abstraction.
December, 2005 – Washington, D.C.

UNBOXED
Sure, Big Data Is Great. But So Is Intuition.
By STEVE LOHR
Published: December 29, 2012

It was the bold title of a conference this month at the Massachusetts Institute of Technology, and of a widely read article in The Harvard Business Review last October: “Big Data: The Management Revolution.” Andrew McAfee, principal research scientist at the M.I.T.
Center for Digital Business, led off the conference by saying that Big Data would be “the next big chapter of our business history.” Next on stage was Erik Brynjolfsson, a professor and director of the M.I.T. center and a co-author of the article with Dr. McAfee. Big Data, said Professor Brynjolfsson, will “replace ideas, paradigms, organizations and ways of thinking about the world.”

These drumroll claims rest on the premise that data like Web-browsing trails, sensor signals, GPS tracking, and social network messages will open the door to measuring and monitoring people and machines as never before. And by setting clever computer algorithms loose on the data troves, you can predict behavior of all kinds: shopping, dating and voting, for example.

The results, according to technologists and business executives, will be a smarter world, with more efficient companies, better-served consumers and superior decisions guided by data and analysis.

I’ve written about what is now being called Big Data a fair bit over the years, and I think it’s a powerful tool and an unstoppable trend. But a year-end column, I thought, might be a time for reflection, questions and qualms about this technology.

The quest to draw useful insights from business measurements is nothing new. Big Data is a descendant of Frederick Winslow Taylor’s “scientific management” of more than a century ago. Taylor’s instrument of measurement was the stopwatch, timing and monitoring a worker’s every movement. Taylor and his acolytes used these time-and-motion studies to redesign work for maximum efficiency. The excesses of this approach would become satirical grist for Charlie Chaplin’s “Modern Times.” The enthusiasm for quantitative methods has waxed and waned ever since.

Big Data proponents point to the Internet for examples of triumphant data businesses, notably Google. But many of the Big Data techniques of math modeling, predictive algorithms and artificial intelligence software were first widely applied on Wall Street.

At the M.I.T. conference, a panel was asked to cite examples of big failures in Big Data. No one could really think of any. Soon after, though, Roberto Rigobon could barely contain himself as he took to the stage. Mr. Rigobon, a professor at M.I.T.’s Sloan School of Management, said that the financial crisis certainly humbled the data hounds. “Hedge funds failed all over the world,” he said.

The problem is that a math model, like a metaphor, is a simplification. This type of modeling came out of the sciences, where the behavior of particles in a fluid, for example, is predictable according to the laws of physics.

In so many Big Data applications, a math model attaches a crisp number to human behavior, interests and preferences. The peril of that approach, as in finance, was the subject of a recent book by Emanuel Derman, a former quant at Goldman Sachs and now a professor at Columbia University. Its title is “Models. Behaving. Badly.”

Claudia Perlich, chief scientist at Media6Degrees, an online ad-targeting start-up in New York, puts the problem this way: “You can fool yourself with data like you can’t with anything else. I fear a Big Data bubble.”

The bubble that concerns Ms. Perlich is not so much a surge of investment, with new companies forming and then failing in large numbers. That’s capitalism, she says.
She is worried about a rush of people calling themselves “data scientists,” doing poor work and giving the field a bad name.

Indeed, Big Data does seem to be facing a work-force bottleneck. “We can’t grow the skills fast enough,” says Ms. Perlich, who formerly worked for I.B.M. Watson Labs and is an adjunct professor at the Stern School of Business at New York University.

A report last year by the McKinsey Global Institute, the research arm of the consulting firm, projected that the United States needed 140,000 to 190,000 more workers with “deep analytical” expertise and 1.5 million more data-literate managers, whether retrained or hired.

Thomas H. Davenport, a visiting professor at the Harvard Business School, is writing a book called “Keeping Up With the Quants” to help managers cope with the Big Data challenge. A major part of managing Big Data projects, he says, is asking the right questions: How do you define the problem? What data do you need? Where does it come from? What are the assumptions behind the model that the data is fed into? How is the model different from reality?

Society might be well served if the model makers pondered the ethical dimensions of their work as well as studying the math, according to Rachel Schutt, a senior statistician at Google Research. “Models do not just predict, but they can make things happen,” says Ms. Schutt, who taught a data science course this year at Columbia. “That’s not discussed generally in our field.”

Models can create what data scientists call a behavioral loop. A person feeds in data, which is collected by an algorithm that then presents the user with choices, thus steering behavior. Consider Facebook. You put personal data on your Facebook page, and Facebook’s software tracks your clicks and your searches on the site. Then, algorithms sift through that data to present you with “friend” suggestions.

Understandably, the increasing use of software that microscopically tracks and monitors online behavior has raised privacy worries. Will Big Data usher in a digital surveillance state, mainly serving corporate interests?

Personally, my bigger concern is that the algorithms that are shaping my digital world are too simple-minded, rather than too smart. That was a theme of a book by Eli Pariser, titled “The Filter Bubble: What the Internet Is Hiding From You.”

It’s encouraging that thoughtful data scientists like Ms. Perlich and Ms. Schutt recognize the limits and shortcomings of the Big Data technology that they are building. Listening to the data is important, they say, but so is experience and intuition. After all, what is intuition at its best but large amounts of data of all kinds filtered through a human brain rather than a math model?

At the M.I.T. conference, Ms. Schutt was asked what makes a good data scientist. Obviously, she replied, the requirements include computer science and math skills, but you also want someone who has a deep, wide-ranging curiosity, is innovative and is guided by experience as well as data. “I don’t worship the machine,” she said.

Looking to Industry for the Next Digital Disruption
Photo: William Ruh, a vice president at General Electric, and Sharoda Paul, an expert in social computing. (Peter DaSilva for The New York Times)
By STEVE LOHR
Published: November 23, 2012

SAN RAMON, Calif. — When Sharoda Paul finished a postdoctoral fellowship last year at the Palo Alto Research Center, she did what most of her peers do — considered a job at a big Silicon Valley company, in her case, Google. But instead, Ms. Paul, a 31-year-old expert in social computing, went to work for General Electric.

Ms. Paul is one of more than 250 engineers recruited in the last year and a half to G.E.’s new software center here, in the East Bay of San Francisco. The company plans to increase that work force of computer scientists and software developers to 400, and to invest $1 billion in the center by 2015. The buildup is part of G.E.’s big bet on what it calls the “industrial Internet,” bringing digital intelligence to the physical world of industry as never before.

The concept of Internet-connected machines that collect data and communicate, often called the “Internet of Things,” has been around for years. Information technology companies, too, are pursuing this emerging field. I.B.M. has its “Smarter Planet” projects, while Cisco champions the “Internet of Everything.”

But G.E.’s effort, analysts say, shows that Internet-era technology is ready to sweep through the industrial economy much as the consumer Internet has transformed media, communications and advertising over the last decade.

In recent months, Ms. Paul has donned a hard hat and safety boots to study power plants. She has ridden on a rail locomotive and toured hospital wards. “Here, you get to work with things that touch people in so many ways,” she said. “That was a big draw.”

G.E. is the nation’s largest industrial company, a producer of aircraft engines, power plant turbines, rail locomotives and medical imaging equipment. It makes the heavy-duty machinery that transports people, heats homes and powers factories, and lets doctors diagnose life-threatening diseases.

G.E. resides in a different world from the consumer Internet. But the major technologies that animate Google and Facebook are also vital ingredients in the industrial Internet — tools from artificial intelligence, like machine-learning software, and vast streams of new data. In industry, the data flood comes mainly from smaller, more powerful and cheaper sensors on the equipment.

Smarter machines, for example, can alert their human handlers when they will need maintenance, before a breakdown. It is the equivalent of preventive and personalized care for equipment, with less downtime and more output.

“These technologies are really there now, in a way that is practical and economic,” said Mark M. Little, G.E.’s senior vice president for global research.

G.E.’s embrace of the industrial Internet is a long-term strategy. But if its optimism proves justified, the impact could be felt across the economy.

The outlook for technology-led economic growth is a subject of considerable debate. In a recent research paper, Robert J. Gordon, a prominent economist at Northwestern University, argues that the gains from computing and the Internet have petered out in the last eight years.

Since 2000, Mr. Gordon asserts, invention has focused mainly on consumer and communications technologies, including smartphones and tablet computers. Such devices, he writes, are “smaller, smarter and more capable, but do not fundamentally change labor productivity or the standard of living” in the way that electric lighting or the automobile did.
But others say such pessimism misses the next wave of technology. “The reason I think Bob Gordon is wrong is precisely because of the kind of thing G.E. is doing,” said Andrew McAfee, principal research scientist at M.I.T.’s Center for Digital Business. Today, G.E. is putting sensors on everything, be it a gas turbine or a hospital bed. The mission of the engineers in San Ramon is to design the software for gathering data, and the clever algorithms for sifting through it for cost savings and productivity gains. Across the industries it covers, G.E. estimates such efficiency opportunities at as much as $150 billion. Some industrial Internet projects are already under way. First Wind, an owner and operator of 16 wind farms in America, is a G.E. customer for wind turbines. It has been experimenting with upgrades that add more sensors, controls and optimization software. The new sensors measure temperature, wind speeds, location and pitch of the blades. They collect three to five times as much data as the sensors on turbines of a few years ago, said Paul Gaynor, chief executive of First Wind. The data is collected and analyzed by G.E. software, and the operation of each turbine can be tweaked for efficiency. For example, in very high winds, turbines across an entire farm are routinely shut down to prevent damage from rotating too fast. But more refined measurement of wind speeds might mean only a portion of the turbines need to be shut down. In wintry conditions, turbines can detect when they are icing up, and speed up or change pitch to knock off the ice. Upgrades on 123 turbines on two wind farms have so far delivered a 3 percent increase in energy output, about 120 megawatt hours per turbine a year. That translates to $1.2 million in additional revenue a year from those two farms, Mr. Gaynor said. “It’s not earthshaking, but it is meaningful,” he said. “These are real commercial investments for us that make economic sense now.” For the last few years, G.E. and Mount Sinai Medical Center have been working on a project to optimize the operations of the 1,100-bed hospital in New York. Hospitals, in a sense, are factories of health care. The challenge for hospitals, especially as cost pressures tighten, is to treat more patients more efficiently, while improving the quality of care. Technology, said Wayne Keathley, president of Mount Sinai, can play a vital role. At Mount Sinai, patients get a black plastic wristband with a location sensor and other information. Similar sensors are on beds and medical equipment. An important advantage, Mr. Keathley said, is to be able to see the daily flow of patients, physical assets and treatment as it unfolds. But he said the real benefit was how the data could be used to automate and streamline operations and then make better decisions. For example, in a typical hospital, getting a patient who shows up in an emergency room into an assigned bed in a hospital ward can take several hours and phone calls. At Mount Sinai, G.E. has worked on optimization and modeling software that enables admitting officers to see beds and patient movements throughout the hospital, to help them more efficiently match patients and beds. Beyond that, modeling software is beginning to make predictions about likely patient admission and discharge numbers over the next several hours, based on historical patterns at the hospital and other circumstances — say, in flu season. 
The software, which Mount Sinai has been trying out in recent months, acts as an intelligent assistant to admitting officers. “It essentially says, ‘Hold off, your instinct is to give this bed to that guy, but there might be a better choice,’ ” Mr. Keathley explained. At a hospital like Mount Sinai, G.E. estimates that the optimization and modeling technologies can translate into roughly 10,000 more patients treated a year, and $120 million in savings and additional revenue over several years. The origins of G.E.’s industrial Internet strategy date back to meetings at the company’s headquarters in Fairfield, Conn., in May 2009. In the depths of the financial crisis, Jeffrey R. Immelt, G.E.’s chief executive, met with his senior staff to discuss long-term growth opportunities. The industrial Internet, they decided, built on G.E.’s strength in research and could be leveraged across its varied industrial businesses, adding to the company’s revenue in services, which reached $42 billion last year. Now G.E. is trying to rally support for its vision from industry partners, academics, venture capitalists and start-ups. About 250 of them have been invited to a conference in San Francisco, sponsored by the company, on Thursday. Mr. Immelt himself becomes involved in recruiting. His message, he says, is that if you want to have an effect on major societal challenges like improving health care, energy and transportation, consider G.E. An early convert was William Ruh, who joined G.E. from Cisco, to become vice president in charge of the software center in San Ramon. And Mr. Ruh is taking the same message to high-tech recruits like Ms. Paul. “Here, they are working on things they can explain to their parents and grandparents,” he said. “It’s not a social network,” even if the G.E. projects share some of the same technology. General Electric ties smart meters, grid sensors, and enterprise IT into a cloud-hosted package. But will utilities buy in? JEFF ST. JOHN: DECEMBER 12, 2012 Two weeks ago, General Electric made a big splash in the world of the Internet of Things, or, as GE likes to call it, the “industrial internet.” In a series of high-profile announcements, the global energy and engineering giant laid out its plan to add networking and distributed intelligence capabilities to more and more of its devices, ranging from aircraft engines to industrial and grid control systems, and start analyzing all that data to drive big new gains in efficiency across the industries it serves. That includes the smart grid, of course. GE is a massive grid player, alongside such competitors as Siemens, ABB, Alstom, Schneider Electric and Eaton/Cooper Power. But in terms of scope, GE and Siemens stand apart in that they make everything from natural gas and wind turbines, to the heavy transmission and distribution gear -- transformers, sensors, switches and the like -- that delivers it to end users. GE and its competitors also have their own lines of industrial communication, networking and control gear for distribution automation (DA) tasks on the grid, of course. Unlike most of the above-named competitors, however, GE is also a big maker of smart meters – although the networking technology that links up all those meters tends to come from other partners. So we’ve got the technological underpinnings for a true Internet of things environment on the smart grid. But who’s managing it all on the back end? Right now, utilities tend to run their own data centers and back-office control rooms. 
But legacy billing, customer service and enterprise resource planning systems don’t easily integrate with the new breed of data coming at them from the smart grid. Indeed, we’ve got a host of IT giants like Cisco, IBM, Microsoft, Oracle, SAP, Infosys, Wipro and many more offering smart grid software services and integration, aimed at making sure data from smart meters, grid sensors and other formerly siloed technologies can be freely shared across the enterprise. Perhaps the most important stepping stone for GE in moving its smart grid business into the “industrial internet” age is to capture its own share of this future market in smart grid integration. GE’s “Grid IQ Solutions as a Service” business, launched last year, represents that effort. In a move increasingly being rolled out by grid giants and startups alike, GE is moving the smart grid to the cloud -- in this case, dedicated servers in its GE Digital Energy data center in Atlanta, Ga. -- and offering utilities the opportunity to choose from a list of products and functions they’d like to deploy, all for a structured fee. In the year since it launched, GE’s smart grid service has landed two city utilities, Norcross, Ga. and Leesburg, Fla., as named customers for its first SaaS product line, the Grid IQ Connect platform. That’s essentially a smart meter deployment run and managed by GE working with unnamed AMI partners, Todd Jackson, SaaS product line leader for GE Digital Energy, said in a Tuesday interview. GE has lined up partners to provide a host of AMI networking flavors, including the mesh networking that dominates in U.S. smart meter deployments to date, as well as point-to-multipoint and cellular solutions, Jackson said. That’s not unlike GE’s current smart metering business model, in which it works with partners such as Silver Spring Networks, Trilliant, and others that add their own communications gear to GE’s core meters. GE’s new role as back-end IT services provider to its Grid IQ Connect customers means that GE is also bringing a lot more software expertise to the fore, Jackson noted. While its AMI partners tend to provide the networking and meter data management aspects of the deployment, GE is providing about half of the remaining IT functionality, he said -- including the core task of hosting all its partners’ software on its own dedicated servers. GE has also been rolling out new feature sets for its smart-grid-as-a-service platform, including prepay options for smart meters, as well as its Grid IQ Restore, which adds outage detection and management to the array of options for its customers. Earlier this year, GE also took a step beyond the utility and into the homes and businesses that they serve, launching its Grid IQ Respond platform. Essentially, it’s a version of GE’s demand response technology offered over the cloud, and is currently being rolled out with three unnamed utilities, two in the United States and one in Europe, Jackson said. Right now the projects are mostly focused on homes, he explained, and most of those are connecting to load control switches, attached to major household loads like pool pumps, water heaters and air conditioners, that the utility can switch off and on to help manage peak power demands. A few million homes across the U.S. have these kinds of radio or pager-operated load control switches installed, usually in exchange for rebates or cheaper power rate offers from utilities desperate to curb their customers’ appetite for expensive peak power. 
At the same time, competitors in this business, such as Honeywell, Eaton/Cooper Power, Comverge and others, have been busy working on their own softwareas-a-service models, complete with cloud-hosted applications and increasing options for networking end-devices in homes and businesses. And of course, we’ve got literally dozens of startups competing for the still-nascent market for in-home energy management devices and the networks that can connect them to utilities, as well as the internet at large. GE, which is a huge appliance maker, has its own version of a home energy management device, called the Nucleus. But it hasn’t rolled it out to market yet, preferring to keep it in pilot projects so far, and Jackson said there aren’t any immediate plans to include it in GE’s Grid IQ Respond offerings. As for target markets, GE is largely looking at municipal utilities and cooperatives, which tend to lack the big budgets and capital expenditure recovery mechanisms of larger investor-owned utilities (IOUs), Jackson said. At the same time, GE does offer its smart grid platform in a so-called “boosted model,” in which utilities can put the capital equipment on their balance sheets, as well as a managed service model where GE owns the hardware, he said. So far, utilities are about evenly split in their interest between the two business models, he said. So how does this tie into the Internet of Things concept? Well, “Once the network is deployed, there are other things that municipal utilities can tie in there and benefit from,” Jackson noted. Some examples include the ability to connect streetlights or traffic cameras to the same network that supports smart meters, he said. That’s a concept that we’ve seen deployed by such smart grid players as Sensus and Santa Clara, Calif.-based startup Tropos Networks, which was bought by grid giant ABB earlier this year. On the backend IT side, GE is also tackling challenges like connecting smart meter data to customer service platforms and other utility business software platforms, Jackson said. That’s led to integration that allows customer service reps to tie directly into an individual customer’s smart meter during an outage, to figure out whether or not it’s a utility problem or a blown fuse, for example -- the kind of incremental improvement that only comes when data is freely shared. Whether or not utilities will catch on to the smart-grid-as-a-service model remains to be seen. Jackson said that GE has been talking to multiple utilities that haven't announced themselves yet. Amidst a general slowdown in North American smart meter deployments expected next year, smaller municipal and cooperative utilities stand out as a relatively untapped sector -- and one that will need some help in managing the IT behind an AMI or DA deployment at a cost commensurate with the smaller scale of their projects, in the tens or hundreds of thousands of meters, rather than millions. Utilities do face some regulatory challenges and uncertainties in turning over key parts of their operations to a third party. At the same time, they're under pressure to meet a whole new array of requirements, including smart grid security and data privacy, that may well be better managed by a big central provider like GE than by each small utility. In the end, services will be the key to unlocking the small utility smart grid market, to be sure. 
But GE faces plenty of competition in establishing itself as the platform to trust -- and as with every shift in the way utilities do business, it's going to take years to develop.

One Big Cluster: How CloudFlare Launched 10 Data Centers in 30 Days
With high-performance computing, "pixie-booting" servers a half-world away.
by Sean Gallagher - Oct 18 2012, 3:13pm PDT
Photo: The inside of Equinix's co-location facility in San Jose—the home of CloudFlare's primary data center. (Peter McCollough/Wired.com)

On August 22, CloudFlare, a content delivery network, turned on a brand new data center in Seoul, Korea—the last of ten new facilities started across four continents in a span of thirty days. The Seoul data center brought CloudFlare's number of data centers up to 23, nearly doubling the company's global reach—a significant feat in itself for a company of just 32 employees.

But there was something else relatively significant about the Seoul data center and the other 9 facilities set up this summer: despite the fact that the company owned every router and every server in their racks, and each had been configured with great care to handle the demands of CloudFlare's CDN and security services, no one from CloudFlare had ever set foot in them. All that came from CloudFlare directly was a six-page manual instructing facility managers and local suppliers on how to rack and plug in the boxes shipped to them.

"We have nobody stationed in Stockholm or Seoul or Sydney, or a lot of the places that we put these new data centers," CloudFlare CEO Matthew Prince told Ars. "In fact, no CloudFlare employees have stepped foot in half of the facilities where we've launched."

The totally remote-controlled data center approach used by the company is one of the reasons that CloudFlare can afford to provide its services for free to most of its customers—and still make a 75 percent profit margin. In the two years since its launch, the content delivery network and denial-of-service protection company has helped keep all sorts of sites online during global attacks, both famous and infamous—including recognition from both Davos and LulzSec. And all that attention has amounted to Yahoo-sized traffic—the CloudFlare service has handled over 581 billion pageviews since its launch.

Yet CloudFlare does all this without the sort of Domain Name Service "black magic" that Akamai and other content delivery networks rely on. To reach that level of efficiency, CloudFlare has done some black magic of a different sort, relying on open-source software from the realm of high-performance computing, storage tricks from the world of "big data," a bit of network peering arbitrage, and clever use of a core Internet routing technology. In the process, it has created an ever-expanding army of remote-controlled service points around the globe that can eat 60-gigabit-per-second distributed denial of service attacks for breakfast.

CloudFlare's CDN is based on Anycast, a standard defined in the Border Gateway Protocol—the routing protocol that's at the center of how the Internet directs traffic. Anycast is part of how BGP supports the multi-homing of IP addresses, in which multiple routers connect a network to the Internet; through the broadcasts of IP addresses available through a router, other routers determine the shortest path for network traffic to take to reach that destination. Using Anycast means that CloudFlare makes the servers it fronts appear to be in many places, while only using one IP address.
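To make the anycast idea concrete, here is a toy sketch of shortest-path selection over a shared prefix. It is purely illustrative: the PoP names, the prefix, and the path lengths are invented, and real BGP path selection weighs far more than path length, so this is not CloudFlare's routing configuration.

```python
# Toy model of anycast route selection (illustrative assumptions only).
# Several points of presence (PoPs) advertise the *same* prefix; each
# client network follows whichever advertisement reaches it over the
# shortest AS path, so one IP address resolves to the "nearest" PoP.

ANYCAST_PREFIX = "198.51.100.0/24"  # documentation prefix standing in for a fronted site

# Hypothetical AS-path lengths from three client networks to three PoPs.
AS_PATH_LENGTHS = {
    "client-in-seoul":   {"seoul": 1, "frankfurt": 7, "san-jose": 5},
    "client-in-vienna":  {"seoul": 8, "frankfurt": 2, "san-jose": 6},
    "client-in-chicago": {"seoul": 6, "frankfurt": 5, "san-jose": 3},
}

def chosen_pop(client: str) -> str:
    """Pick the PoP whose advertisement of the shared prefix has the shortest path."""
    paths = AS_PATH_LENGTHS[client]
    return min(paths, key=paths.get)

if __name__ == "__main__":
    for client in AS_PATH_LENGTHS:
        print(f"{client}: traffic for {ANYCAST_PREFIX} lands in {chosen_pop(client)}")
```

This is why, in the traceroute example that follows, clients in different parts of the world reach different data centers yet see the same IP address.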
"If you do a traceroute to Metallica.com (a CloudFlare customer), depending on where you are in the world, you would hit a different data center," Prince said. "But you're getting back the same IP address." That means that as CloudFlare adds more data centers, and those data centers advertise the IP addresses of the websites that are fronted by the service, the Internet's core routers automatically re-map the routes to the IP addresses of the sites. There's no need to do anything special with the Domain Name Service to handle load-balancing of network traffic to sites other than point the hostname for a site at CloudFlare's IP address. It also means that when a specific data center needs to be taken down for an upgrade or maintenance (or gets knocked offline for some other reason), the routes can be adjusted on the fly. That makes it much harder for distributed denial of service attacks to go after servers behind CloudFlare's CDN network; if they're geographically widespread, the traffic they generate gets spread across all of CloudFlare's data centers—as long as the network connections at each site aren't overcome. In September, Prince said, "there was a brand new botnet out there launching big attacks, and it targeted one of our customers. It generated 65 gigabits per second of traffic hitting our network. But none of that traffic was focused in one place—it was split fairly evenly across our 23 data centers, so each of those facilities only had to deal with about 3 gigs of traffic. That's much more manageable." Making CloudFlare's approach work requires that it put its networks as close as possible to the core routers of the Internet—at least in terms of network hops. While companies like Google, Facebook, Microsoft, and Yahoo have gone to great lengths to build their own custom data centers in places where power is cheap and where they can take advantage of the economies of scale, CloudFlare looks to use existing facilities that "your network traffic would be passing through even if you weren't using our service," Prince said. As a result, the company's "data centers" are usually at most a few racks of hardware, installed at co-location facilities that are major network exchange points. Prince said that most of his company's data centers are set up at Equinix IBX co-location facilities in the US, including CloudFlare's primary facility in San Jose—a facility also used by Google and other major cloud players as an on-ramp to the Internet. CloudFlare looks for co-location facilities with the same sort of capabilities wherever it can. But these sorts of facilities tend to be older, without the kind of power distribution density that a custom-built data center would have. "That means that to get as much compute power as possible into any given rack, we're spending a lot of time paying attention to what power decisions we make," Prince said. The other factor driving what goes into those racks is the need to maximize the utilization of CloudFlare's outbound Internet connections. CloudFlare buys its bandwidth wholesale from network transit providers, committing to a certain level of service. "We're paying for that no matter what," Prince said, "so it's optimal to fill that pipe up." That means that the computing power of CloudFlare's servers is less of a priority than networking and cache input/output and power consumption. 
And since CloudFlare depends heavily on the facility providers overseas or other partners to do hardware installations and swap-outs, the company needed to make its servers as simple as possible to install—bringing it down to that six-page manual. To make that possible, CloudFlare's engineering team drew on experience and technology from the high-performance computing world. "A lot of our team comes from the HPC space," Prince said. "They include people who built HPC networks for the Department of Energy, where they have an 80 thousand node cluster, and had to figure out how to get 80,000 computers, fit them into one space, cable them in a really reliable way, and make sure that you can manage them from a single location." One of the things that CloudFlare brought over from the team's DoE experience was the Perceus Provisioning System, an open-source provisioning system for Linux used by DoE for its HPC environments. All of CloudFlare's servers are "pixie-booted" (using a Preboot eXecution Environment, or PXE) across a virtual private network between data centers; servers are delivered with no operating system or configuration whatsoever, other than a bootloader that calls back to Perceus for provisioning. "The servers come from whatever equipment vendor we buy them from completely bare," Prince said. "All we get from them is the MAC address." CloudFlare's servers run on a custom-built Linux distribution based on Debian. For security purposes, the servers are "statelessly" provisioned with Perceus—that is, the operating system is loaded completely in RAM. The mass storage on CloudFlare servers (which is universally based on SSD drives) is used exclusively for caching data from clients' sites. The gear deployed to data centers that gets significant pre-installation attention from CloudFlare's engineers is the routers—primarily supplied by Juniper Networks, which works with CloudFlare to preconfigure them before being shipped to new data centers. Part of the configuration is to create virtual network connections over the Internet to the other CloudFlare data centers, which allows each data center to use its nearest peer to pull software from during provisioning and updating. "When we booted up Vienna, for example," said Prince, "the closest data center was Frankfurt, so we used the Frankfurt facility to boot the new Vienna facility." One server in Vienna was booted first as the "master node," with provisioning instructions for each of the other machines. Once the servers are all provisioned and loaded, "they call back to our central facility (in San Jose) and say, 'Here are our MAC addresses, what do you need us to do?'" Once the machines have passed a final set of tests, each gets designated with an operational responsibility: acting as a proxy for Web requests to clients' servers, managing the cache of content to speed responses, DNS and logging services. Each of those services can be run on any server in the stack and step up to take over a service if one of its comrades fails. Caching is part of the job for every server in each CloudFlare facility, and being able to scale up the size of the cache is another reason for the modular nature of how the company thinks of servers. Rather than storing cached webpage objects in a traditional database or file system, CloudFlare uses a hash-based database that works in a fashion similar to "NoSQL" databases like 10gen's MongoDB and Amazon's Dynamo storage system. 
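As a rough sketch of the hash-keyed caching scheme described here and detailed in the next paragraphs (the server names, hash function, and range-splitting rule below are assumptions for illustration, not CloudFlare's implementation; a production system would more likely use a true consistent-hash ring so that adding a server reshuffles as few keys as possible):

```python
# Sketch of a hash-range cache: each URL hashes to a value; the value both
# picks the server responsible for that slice of the key space and serves
# as the key under which the fetched content is stored.
# (Illustrative assumptions only, not CloudFlare's code.)

import hashlib

SERVERS = ["server-a", "server-b", "server-c", "server-d"]
HASH_BITS = 256
caches = {s: {} for s in SERVERS}   # one dict stands in for each server's SSD cache

def url_hash(url: str) -> int:
    """Stable hash of the URL; doubles as the storage key."""
    return int(hashlib.sha256(url.encode()).hexdigest(), 16)

def owner(key: int) -> str:
    """Map the hash into one server's contiguous range of the key space."""
    index = key * len(SERVERS) >> HASH_BITS
    return SERVERS[index]

def fetch(url: str, origin_fetch) -> bytes:
    """Return cached content for the URL, pulling from the origin site on a miss."""
    key = url_hash(url)
    server_cache = caches[owner(key)]
    if key not in server_cache:
        server_cache[key] = origin_fetch(url)   # first request: retrieve the site contents
    return server_cache[key]

if __name__ == "__main__":
    body = fetch("http://example.com/logo.png", lambda u: b"<bytes from origin>")
    print(owner(url_hash("http://example.com/logo.png")), len(body))
```

Note the trade-off described below: nothing in this scheme records which site a given hash belongs to, which keeps overhead low but means the operator cannot enumerate a customer's cached files.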
When a request for a webpage comes in for the first time, CloudFlare retrieves the site contents. A consistent hashing algorithm in CloudFlare's caching engine then converts the URL used to call each element into a value, which is used as the key under which the content is stored locally at each data center. Each server in the stack is assigned a range of hashes to store content for, and subsequent requests for the content are routed to the appropriate server for that cache.

Unlike most database applications, the cache stored at each CloudFlare facility has an undefined expiration date—and because of the nature of those facilities, it isn't a simple matter to add more storage. To keep the utilization level of installed storage high, the cache system simply purges older cache data when it needs to store new content.

The downside of the hash-based cache's simplicity is that it has no built-in logging system to track content. CloudFlare can't tell customers which data centers have copies of which content they've posted. "A customer will ask me, 'Tell me all of the files you have in cache,'" Prince said. "For us, all we know is there are a whole bunch of hashes sitting on a disk somewhere—we don't keep track of which object belongs to what site."

The upside, however, is that the system has a very low overhead as a result and can retrieve site content quickly and keep those outbound pipes full. And when you're scaling a 32-person company to fight the speed of light worldwide, it helps to keep things as simple as possible.

2012 SEMTECHBIZ WEST
Daniel Tunkelang Talks about LinkedIn’s Data Graph
By Paul Miller on June 7, 2012

Daniel Tunkelang, Principal Data Scientist at LinkedIn, delivered the final keynote at SemTechBiz in San Francisco this morning, exploring the way in which “semantics emerge when we apply the right analytical techniques to a sufficient quality and quantity of data.”

Daniel began by offering his key takeaways for the presentation: Communication trumps knowledge representation. Communication is the problem and the solution. Knowledge representation, and the systems that support it, are possibly over-rated. We get too obsessed, Tunkelang suggested, with building systems that are ‘perfect,’ and in setting out to assemble ‘comprehensive’ sets of data. On the flip side, computation is underrated – machines can do a lot to help us cope with incomplete or messy data, especially at scale.

We have a communication problem. Daniel goes back to the dream of AI, referencing Arthur C. Clarke’s HAL 9000 and Star Trek’s android, Data. Both, he suggests, were “constructed by their authors as intelligent computers.” Importantly, they “supported natural language interfaces” to communicate with humans. Their creators, Tunkelang suggested, believed that the computation and the access to knowledge were the hard part – communication was an ‘easy’ after-thought.

Moving on, we reach Vannevar Bush’s Memex from the 1940s. And in the 1980s we reach Cyc. Loaded with domain-specific knowledge and more, but “this approach did not and will not get us” anywhere particularly useful. Moving closer to the present, Freebase.
"One of the best practical examples of semantic technologies in the semantic web sense… doing relations across a very large triple store… and making the result available in an open way." But Freebase has problems, and "they are fundamental in nature." When you're dealing with structured data acquired from all over the world, it is difficult to ensure consistency or completeness. "We're unlikely to achieve perfection, so we shouldn't make perfection a requirement for success." Wolfram Alpha, starting from a proprietary collection of knowledge, is enabling reasoning and analysis over a growing collection of data. Wolfram Alpha is very good when it's good, but extremely weak when it comes to guiding users toward 'appropriate' sources; there is a breakdown in communication, and a failure to manage or guide user expectations. "Today's knowledge repositories are incomplete, inconsistent, and inscrutable." "They are not sustained by economic incentives." Computation is under-rated. IBM's Deep Blue, for example. A feat of brute-force computation rather than semantics, intelligence or cleverness. "Chess isn't that hard." Also IBM – Watson and its win at Jeopardy. "A completely different ball of wax to playing chess" that is far more freeform and unpredictable than rules-based chess. Although Stephen Wolfram's blog post from 2011 suggests that internet search engines can also actually do pretty well in identifying Jeopardy answers. Google's Alon Halevy, Peter Norvig and Fernando Pereira suggested in 2009 that "more data beats clever algorithms." Where can we go from here? "We have a glut of semi-structured data." LinkedIn has a lot of semi-structured data from 160 million members, predominantly in the form of free-text descriptive profile text; marked-up (but typically incomplete and ambiguous) statements of employment, education, promotion etc; (also typically incomplete) graph data representing the relationships between people and roles. Semi-structured search is a killer app. Faceted search UI on LinkedIn, letting the user explore and follow connections, without the need for data to be entered in accordance with a strict classification system or data model. There is no perfect schema or vocabulary. And even if there were, not everyone would use it. Knowledge representation only tends to succeed in narrowly scoped areas. Brute force computation can be surprisingly successful. Machines don't have to be perfect. Structure doesn't have to be perfect. We don't have to be perfect. Communicate with the user. Offer a UI that guides them and helps them explore. Don't aim for perfection. Offer just enough to help the user move forward. "More data beats clever algorithms, but better data beats more data." Computation isn't the enemy. Make sure 'better' data – from the SemTech community and others – is available to these machines and then we'll see something remarkable. For more from Daniel, listen to April's episode of the Semantic Link podcast in which he was our guest.

Beyond MapReduce: Hadoop hangs on Tooling up By Matt Asay Posted in Developer, 10th July 2012 13:03 GMT Open ... and Shut Hadoop is all the rage in enterprise computing, and has become the poster child for the big-data movement.
But just as the enterprise consolidates around Hadoop, the web world, including Google – which originated the technology ideas behind Hadoop – is moving on to real-time, ad-hoc analytics that batch-oriented Hadoop can't match. Is Hadoop already outdated? As Cloudant chief scientist Mike Miller points out, Google's MapReduce approach to big data analytics may already be passé. It certainly is at Google: [Google's MapReduce] no longer holds such prominence in the Google stack... Google seems to be moving past it. In fact, many of the technologies [Google now uses like Percolator for incremental indexing and analysis of frequently changing datasets and Dremel for ad-hoc analytics] aren’t even new; they date back the second half of the last decade, mere years after the seminal [MapReduce] paper was in print. By one estimate, Hadoop, which is an open-source implementation of Google's MapReduce technology, hasn't even caught up to Google's original MapReduce framework. And now people like Miller are arguing that a MapReduce approach to Big Data is the wrong starting point altogether. For a slow-moving enterprise, what to do? The good news is that soon most enterprises likely won't have to bother with Hadoop at all, as Hadoop will be baked into the cloud applications that enterprises buy. And as those vendors figure out better technologies to handle real-time (like Storm) or ad hoc analysis (like Dremel), they, too, will be baked into cloud applications. As an interim step to such applications, big-data tools vendors like Datameer and Karmasphere are already releasing cloud-based tools for analyzing Hadoop data. This is critical to Hadoop's short-term success as Forrester notes that Hadoop is still "an immature technology with many moving parts that are neither robust nor well integrated." Good tooling helps. But is Hadoop the right place to start, good tooling or no? Cloudscale chief executive Bill McColl, writing back in 2010, says "definitely not." He argues: Simple batch processing tools like MapReduce and Hadoop are just not powerful enough in any one of the dimensions of the big data space that really matters. Sure, Hadoop is great for simple batch processing tasks that are “embarrassingly parallel”, but most of the difficult big data tasks confronting companies today are much more complex than that. McColl isn't a neutral observer of Hadoop: his company competes with vanilla Hadoop deployments. My own company, Nodeable, offers a real-time complement to Hadoop, based on the open-source Storm project, but I'm much more sanguine about Hadoop's medium-term prospects than either McColl or Miller. But his point is well-taken, especially in light of Miller's observation that even the originator of MapReduce, Google, has largely moved on for faster, more responsive analytical tools. Does it matter? Probably not. At least, not anytime soon. It has long been the case that web giants like Facebook and Google have moved faster than enterprise IT, which tends to be much more risk-averse and more prone to hanging onto technology once it's made to work. So it's a Very Good Thing, as Businessweek highlights, that the web's technology of today is being open sourced to fuel the enterprise technology of tomorrow. Hadoop still has several kinks to work out before it can go truly mainstream in the enterprise. It's not as if enterprises are going to go charging ahead into Percolator or other more modern approaches to big data when they have yet to squeeze Hadoop for maximum value. 
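For readers who have not used it, the batch pattern at issue is easy to sketch. The toy word count below illustrates the map/shuffle/reduce model that Hadoop implements (an in-memory illustration written for this summary, not Hadoop's API), and makes plain why results only appear once the whole pass over the data has finished.

```python
from collections import defaultdict

# Toy illustration of the MapReduce pattern: map each record to key/value
# pairs, group by key, then reduce each group. The answer is only available
# after the full batch has been processed -- hence "batch-oriented".

def map_phase(records):
    for line in records:
        for word in line.split():
            yield word.lower(), 1

def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    return {key: sum(values) for key, values in groups.items()}

docs = ["Hadoop ushered in the big data movement",
        "the web world is moving to real time analytics"]
print(reduce_phase(shuffle(map_phase(docs))))
```

Real-time systems such as Storm invert this model by pushing each new record through the computation as it arrives, which is why the two approaches end up complementing rather than replacing each other.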
Enterprise IT managers like to travel in packs, and the pack is currently working on Hadoop. There may be better options out there, but they're going to need to find ways to complement Hadoop, not displace it. Hadoop simply has too much momentum going for it. I suspect we'll see Hadoop continue forward as the primary engine of big data analytics. We're looking at many years of dominance for Hadoop. However, I think we'll also see add-on technologies offered by cloud vendors to augment the framework. Hadoop is never going to be a real-time system, so things like Storm will come to be viewed as must-have tools to provide real-time insight alongside Hadoop's timely, deep analytics. Some early adopters will figure these tools out on their own without help from cloud application vendors. But for most, they're going to buy, not build, and that "buy" decision will include plenty of Hadoop, whether from Cloudera or Metamarkets or Hortonworks or EMC or anybody else. That's why Forrester pegs today's Hadoop ecosystem at $1bn, a number that is only going to grow, no matter what Google thinks is a better approach to big data. ® Matt Asay is senior vice president of business development at Nodeable, offering systems management for managing and analysing cloud-based data. He was formerly SVP of biz dev at HTML5 start-up Strobe and chief operating officer of Ubuntu commercial operation Canonical. With more than a decade spent in open source, Asay served as Alfresco's general manager for the Americas and vice president of business development, and he helped put Novell on its open source track. Asay is an emeritus board member of the Open Source Initiative (OSI). His column, Open...and Shut, appears three times a week on The Register.

Sun Microsystems cofounder and networking guru Andy Bechtolsheim predicted that networking chips — which determine how quickly you can surf the Internet — will keep following the path of progress that they have followed for decades. Moore's Law, the prediction in 1965 by Intel's Gordon Moore that the number of transistors on a chip will double every two years, is still holding up. In the next 20 years, Bechtolsheim expects an improvement of 1,000 times in chip performance. We should all greet that with relief, since the $1 trillion-plus electronics economy depends on the continuous efficiencies gained from making chips smaller, faster, and cheaper. "We are in the Golden Age of networking, driven by Moore's Law," said Bechtolsheim in a keynote speech at the Linley Tech Processor conference in San Jose, Calif. Bechtolsheim is worth listening to. He is the founder, chairman, and chief development officer at networking hardware firm Arista Networks, which builds cloud-based appliances for large data centers. He was the chief system architect at Sun and became famous as the angel who funded Google. He also started Granite Systems, which Cisco acquired in 1996. He developed a series of switches at Cisco and also founded Kealia, which Sun acquired in 2004. Bechtolsheim talked a lot about leaf switches and buffers and spines and other stuff that was way over my head. But he closed his talk with a series of predictions about the future of Moore's Law and its relevance to the future of networking, which depends on data centers with tons of servers, each with lots of chips, powered by multiple computing brains (or cores).
In data centers, keeping the flow of data moving as fast as possible between the outside world, through memory, into processors and into long-term storage is of paramount concern. It's a realm in which nanoseconds matter. Today's networking chips transfer data at rates of 10 gigabits a second, 40 gigabits a second, or 100 gigabits a second. Part of that depends on chips, but it also depends on optical components that transfer data using laser components, which are harder to improve compared to silicon chip technology. "Optics, unfortunately, is not on Moore's Law," said Bechtolsheim. But he remained optimistic about progress in the future. Bechtolsheim predicted:

1. Moore's Law is alive and well. By doubling the number of transistors per chip every two years, chip designers will be able to keep feeding faster and cheaper chips to networking-gear designers. In the next 12 years, the path ahead is clear, Bechtolsheim said. That will give us almost 100 times more transistors — the basic on-off switches of digital computers — on every chip.
2. The economics of chips are changing. Each generation of chip design is getting more expensive as it takes more engineers to craft designs from the greater number of transistors available. Designing a new complex switch chip for networks can cost $20 million. Making chips that sell in low volumes is no longer viable. Chip startups often can't afford to do this anymore. And in-house designs make less sense.
3. Merchant-network silicon vendors will gain market share. Those who design chips that many different system companies use will likely prevail over those who design in-house chips for just one vendor. Moreover, the differentiation now happens in the software that runs on the chip, not in the hardware. And internal design teams often can't keep up with advances in silicon design tools on the merchant market.
4. Custom designs lead the way. Custom designs can get more bang for the buck out of the available transistors. So even the merchant silicon vendors will have to modify solutions for each customer.
5. Using the best available silicon manufacturing technology is the key. With each new manufacturing generation, chips become faster, smaller, and cheaper. Today's silicon chip designs have to be built in 28-nanometer technology or better. Those designs must use less power, access more memory, and perform faster. "No one wants to roll the clock back, and the silicon march is relentless," Bechtolsheim said.
6. Product life cycles are shorter. Each new silicon chip has a shorter life, but it can ship in higher volumes. The days of 10-year product life cycles are gone and will never come back. Chip designers and system makers can count on frequent upgrade cycles, but they'll face more competition.
7. Architecture matters. Having a faster internal engine makes a car run faster. That's also true for a chip. With better design at the component level, the overall chip and system run better. This requires rethinking approaches that worked in the past for a more modern technology. Keeping the data flowing within the chip is critical.
8. Flexibility matters. Chips are becoming more versatile and programmable. They can support a variety of protocols and use cases. Flexibility allows for reuse over generations and expansion to new markets.
9. Building blocks matter. In the age of multitasking, multiple components matter. Replicating cores, or brains, across a chip is the way to faster, more reliable, and lower-power chips. Every component is reusable.
10. The system is the chip. In the future, with more efficient manufacturing technology, future switch chips will be single-chip designs. That requires close communication between makers of systems, software designers, and chip vendors. Anyone who tries to lock out any of the other parties will likely be doomed.

"In conclusion, Moore's Law is alive and well," Bechtolsheim said.

Internet Architects Warn of Risks in Ultrafast Networks By QUENTIN HARDY Published: November 13, 2011
[Photos: Jim Wilson/The New York Times. The Arista Networks founders, Andreas Bechtolsheim, left, David Cheriton and Kenneth Duda, with a data-routing switch at the company's headquarters in Santa Clara, Calif.; Lorenz Redlefsen, an Arista engineer, with a data-routing switch.]
SANTA CLARA, Calif. — If nothing else, Arista Networks proves that two people can make more than $1 billion each building the Internet and still be worried about its reliability. David Cheriton, a computer science professor at Stanford known for his skills in software design, and Andreas Bechtolsheim, one of the founders of Sun Microsystems, have committed $100 million of their money, and spent half that, to shake up the business of connecting computers in the Internet's big computing centers. As the Arista founders say, the promise of having access to mammoth amounts of data instantly, anywhere, is matched by the threat of catastrophe. People are creating more data and moving it ever faster on computer networks. The fast networks allow people to pour much more of civilization online, including not just Facebook posts and every book ever written, but all music, live video calls, and most of the information technology behind modern business, into a worldwide "cloud" of data centers. The networks are designed so it will always be available, via phone, tablet, personal computer or an increasing array of connected devices. Statistics dictate that the vastly greater number of transactions among computers in a world 100 times faster than today will lead to a greater number of unpredictable accidents, with less time in between them. Already, Amazon's cloud for businesses failed for several hours in April, when normal computer routines faltered and the system overloaded. Google's cloud of e-mail and document collaboration software has been interrupted several times. "We think of the Internet as always there. Just because we've become dependent on it, that doesn't mean it's true," Mr. Cheriton says. Mr. Bechtolsheim says that because of the Internet's complexity, the global network is impossible to design without bugs. Very dangerous bugs, as they describe them, capable of halting commerce, destroying financial information or enabling hostile attacks by foreign powers. Both were among the first investors in Google, which made them billionaires, and, before that, they created and sold a company to the networking giant Cisco Systems for $220 million. Wealth and reputations as technology seers give their arguments about the risks of faster networks rare credibility. More transactions also mean more system attacks. Even though he says there is no turning back on the online society, Mr.
Cheriton worries most about security hazards. “I’ve made the claim that the Chinese military can take it down in 30 seconds, no one can prove me wrong,” he said. By building a new way to run networks in the cloud era, he says, “we have a path to having software that is more sophisticated, can be self-defending, and is able to detect more problems, quicker.” The common connection among computer servers, one gigabit per second, is giving way to 10-gigabit connections, because of improvements in semiconductor design and software. Speeds of 40 gigabits, even 100 gigabits, are now used for specialty purposes like consolidating huge data streams among hundreds of thousands of computers across the globe, and that technology is headed into the mainstream. An engineering standard for a terabit per second, 1,000 gigabits, is expected in about seven years. Arista, which is based here, was built with the 10-gigabit world in mind. It now has 250 employees, 167 of them engineers, building a fast data-routing switch that could isolate problems and fix them without ever shutting down the network. It is intended to run on inexpensive mass-produced chips. In terms of software and hardware, it was a big break from the way things had been done in networking for the last quarter-century. “Companies like Cisco had to build their own specialty chips to work at high speed for the time,” Mr. Bechtolsheim said. Because of improvements in the quality and capability of the kind of chips used in computers, phones and cable television boxes, “we could build a network that is a lot more software-enabled, something that is a lot easier to defend and modify,” he said. For Mr. Cheriton, who cuts his own hair despite his great wealth, Arista was an opportunity to work on a new style of software he said he had been thinking about since 1989. No matter how complex, software is essentially a linear system of commands: Do this, and then do that. Sometimes it is divided into “objects” or modules, but these tend to operate sequentially. From 2004 to 2008, when Arista shipped its first product, Mr. Cheriton developed a five million-line system that breaks operations into a series of tasks, which when completed, other parts of the program can check on and pick up if everything seems fine. If it does not, the problem is rapidly isolated and addressed. Mr. Bechtolsheim worked with him to make the system operate with chips that were already on the market. The first products were sold to financial traders looking to shave 100 nanoseconds off their high-frequency trades. Arista has more than 1,000 customers now, including telecommunications companies and university research laboratories. “They have created something that is architecturally unique in networking, with a lot of value for the industry,” says Nicholas Lippis, who tests and evaluates switching equipment. “They built something fast that has a unique value for the industry.” Kenneth Duda, another founder, said, “What drives us here is finding a new way to do software.” Mr. Duda also worked with Mr. Cheriton and Mr. Bechtolsheim at Granite Systems, the company they sold to Cisco. “The great enemy is complexity, measured in lines of code, or interactions,” he said. In the world of cloud computing, “there is no person alive who can understand 10 percent of the technology involved in my writing and printing out an online shopping list.” Not surprisingly, Cisco, which dominates the $5 billion network switching business, disagrees. 
"You don't have to reinvent the Internet," says Ram Velaga, vice president for product management in Cisco's core technology group. "These protocols were designed to work even if Washington is taken out. That is in the architecture." Still, Cisco's newest data center switches have rewritten software in a way more like Arista's. A few products are using so-called merchant silicon, instead of its typical custom chips. "Andy made a bet that Cisco would never use merchant silicon," Mr. Velaga says. Mr. Cheriton and Mr. Bechtolsheim have known each other since 1981, when Mr. Cheriton arrived from his native Canada to teach at Stanford. Mr. Bechtolsheim, a native of Germany, was studying electrical engineering and building what became Sun's first product, a computer workstation. The two became friends and intellectual compatriots, and in 1994 began Granite Systems, which made one of the first gigabit switches. Cisco bought the company two years later. With no outside investors in Arista, they could take as long as they wanted on the product, Mr. Bechtolsheim said. "Venture capitalists have no patience for a product to develop," he said. "Pretty soon they want to bring in their best buddy as the C.E.O. Besides, this looked like a good investment." Mr. Cheriton said, "Not being venture funded was definitely a competitive advantage." Besides, he said, "Andy never told me it would be $100 million."

MARCH 20, 2013, 1:29 PM A 'Big Data' Freeway for Scientists By JOHN MARKOFF The University of California, San Diego, this week plans to announce that it has installed an advanced optical computer network that is intended to serve as a "Big Data freeway system" for next-generation science projects in fields including genomic sequencing, climate science, electron microscopy, oceanography and physics. The new network, which is funded in part by a $500,000 grant from the National Science Foundation and based on an optical switch developed by Arista Networks, a start-up firm founded by the legendary Silicon Valley computer designer Andreas Bechtolsheim, is intended to move from an era where networks moved billions of bits of data each second to the coming age of trillion-bit-per-second data flows. (A terabit network has the capacity to move roughly the equivalent of 2.5 Blu-ray videodiscs each second.) However, the new ultrahigh speed networks are not just about moving files more quickly, or even moving larger files. Increasingly, computers used by scientific researchers are starting to escape the boundaries of a single box or even cluster and spread out to become "virtual," in some cases across thousands of miles. The new network, known as Prism, is intended for a new style of scientific computing characterized both by "big data" data sets and optical networks that make it possible to compute on data that is stored at a distant location from the computer's processor, said Philip M. Papadopoulos, program director for computing systems at the San Diego Supercomputer Center, and the principal investigator for the new network. The Prism network "enables users to simply not care where their data is located on campus," he said.
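The parenthetical above is easy to sanity-check: a terabit per second is 125 gigabytes per second, and a dual-layer Blu-ray disc holds about 50 GB (the disc capacity is my assumption, not stated in the article).

```python
# Rough check of "a terabit network moves ~2.5 Blu-ray discs per second".
terabit_per_s_in_gbytes = 1000 / 8          # 1 Tbit/s = 125 GB/s
bluray_gb = 50                              # dual-layer Blu-ray disc (assumed)
print(terabit_per_s_in_gbytes / bluray_gb)  # ~2.5 discs per second
```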
The Prism network is targeted at speeds of 100 billion bits a second and is intended as a bypass network that allows scientists to move data without affecting the performance of the normal campus network, which is based on a 10 billion-bit capacity and is near saturation. There is a range of scientific users with requirements that have easily outstripped the capacity of current-day computer networks, he said. For example he pointed to work being done in medicine by the National Center for Microscopy Imaging Research, with both light and electron microscopes that now generate three-dimensional images that may range up to 10 terabytes of data. The laboratory stores several petabytes (a petabyte is one thousand terabytes) and will require Prism to move data between different facilities on campus. A previous optical network, known as Quartzite, was installed at San Diego beginning in 2004. That network was built on an earlier, less powerful, model of the Arista switch. The new version of the switch will handle up to 576 simultaneous 10 billion-bit connections. In some cases the links can be "bonded" to support even higher capacity data flows. During an event last month to introduce the network on campus, Larry Smarr, an astrophysicist who is the director of the California Institute for Telecommunications and Information Technology, a U.C.S.D. laboratory that is the focal point for the new network, demonstrated the ability to share data and scientific visualization information with other scientists by holding a videoconference with researchers at the Electronic Visualization Laboratory at the University of Illinois at Chicago. At one point he showed a three-dimensional image created from an M.R.I. of his own abdomen, demonstrating how it was possible to view and manipulate the digital image remotely. "The radiologists are used to reading the two dimensional scans and turning it into 3-D in their heads, but the doctors and surely the patients have never been able to see what is in their bodies," he said. "I'm turning the insides of my body into a video game." This post has been revised to reflect the following correction: Correction: March 20, 2013 An earlier version of this post misstated the name of the organization where Philip M. Papadopoulos works as a program director. It is the San Diego Supercomputer Center, not the San Diego Computing Center. It also misstated the location of the University of Illinois' Electronic Visualization Laboratory. It is at the university's Chicago campus, not the Urbana-Champaign campus.

January 10th, 2013 8:15 Bernard Meyerson, IBM Chief Innovation Officer By Bernard Meyerson As IBM's chief innovation officer, I'm especially proud to reveal today that the company has accomplished a remarkable achievement: It has been awarded the largest number of United States patents for the 20th year in a row. IBM's scientists and engineers racked up 6,478 patents last year, and nearly 67,000 patents over the past two decades. The sheer number and diversity of these patents matters.
It shows that a lot of truly novel thinking is going on at IBM's global research and development labs in a wide variety of fields—from nanotechnology and computer systems design to business analytics and artificial intelligence, and beyond. Yet volume alone doesn't tell the whole story. What good is a pile of patents if they don't change the world? That's why we set our research priorities and make our investments with the goal of producing maximum global impact. Today, we're focused on a new era in Information Technology that is now in its early stages, but one that will continue to roll out over the next two decades. We call it the era of cognitive systems. We believe that the benefits of this new era will arrive sooner and stronger if companies, governments and universities adopt a culture of innovation that includes making big bets, fostering disruptive innovations, taking a long-term view and collaborating across institutional boundaries. That last part is crucial. What's needed is radical collaboration—large-scale efforts to find common cause and share resources, expertise and ideas across the borders between companies and institutions. Innovation isn't about "me" anymore—one person, one company, or even one country. It's about "we." First, a little bit about the new era. Today's computers are programmed by humans to perform specific tasks. They are designed to calculate rapidly. In contrast, cognitive systems will learn from their interactions with data and humans—continuously reprogramming themselves to become ever more accurate and efficient in their outputs. They optimize their functions with each cycle of learning. They will be uniquely designed to analyze vast quantities of information. Today's computers are processing centric; tomorrow's will be data centric. Because of these improvements, the machines of the future will be able to draw insights from data to help us learn how the world really works, making sense of all of its complexity. They will provide trusted advice to humans—whether heads of state or individuals trying to manage their careers or finances. At IBM, we're producing some of the scientific advances that will enable the era of cognitive systems. Our early work has already shown up in patents granted last year. Consider Watson, the groundbreaking computer that defeated two former grand champions on the TV quiz show Jeopardy!. Watson's creators in IBM Research programmed the machine to read millions of pages of information about all manner of things, and then, during the game, dig into that huge database for potential answers and come up with the most likely answer in less than 3 seconds. The U.S. Patent & Trademark Office has awarded several patents for elements of Watson, including U.S. Patent #8,275,803 – System and method for providing answers to questions. The scientists whose names are listed on that patent were all IBMers, but Watson was by no means an IBM-only effort. The project managers enlisted help from researchers at eight universities. They included natural language processing specialists Eric Nyberg of Carnegie Mellon University and Boris Katz of MIT. You can think of Watson as the left brain of cognitive computing. It's where a lot of language-related work takes place. Separately, IBM Researchers in California, Texas and New York are working on a right-brain project, code-named SyNAPSE. They're designing a cognitive chip that's really good at taking sensory input from the world around us and turning it into insights.
The team has received a number of patents, including U.S. Patent #8,311,965 – Area efficient neuromorphic circuits using field effect transistors (FET) and variable resistance material. The SyNAPSE project has had even more input from outside IBM than did Watson. Team leader Dharmendra Modha formed a collaborative partnership with university faculty members who brought expertise that IBM doesn’t possess internally. The first phases of the project included circuit designer Rajit Manohar of Cornell University, psychiatrist Giulio Tononi of University of Wisconsin-Madison, neuroscientist Stefano Fusi of Columbia University and robotics specialist Christopher Kello of University of California-Merced. Interesting, but not yet radical. To understand what radical collaboration is consider the workings of America’s largest “smart grid” research initiative, the Pacific Northwest Smart Grid Project. Participants include the Battelle Memorial Institute Pacific Northwest Division, the U.S. Department of Energy, eleven utilities, five technology partners (including IBM Research) and 60,000 customers across the states of Idaho, Montana, Oregon, Washington and Wyoming. The $178 million cost is being split between the government and private sector. The project is deploying and testing a two-way data communications system that is designed to lower costs, improve reliability, reduce emissions, and increase integration of renewable energy sources like wind and solar. The secret sauce is getting everybody involved to focus on common goals and a shared outcome, rather than on their own parochial interests. This project provides a model for how public-private partnerships can address large and complex problems to the benefit of consumers, companies, and society. In the past, science and technology researchers typically took on problems one piece at a time. Each expert or group developed solutions to address one aspect of the problem. We can’t do things that way anymore. There are simply too many interrelationships and interdependencies to work things independently. Coming up with solutions to today’s biggest problems requires a lot of different skills from a lot of different people and organizations. Think of this as an innovation mashup. Some people think that the process of teaming runs the risk of producing a mediocre, consensus result. I believe the opposite to be true, as teams more often build on one another’s expertise to create solutions neither could have created on its own. I see these new radical collaborations as an opportunity for talented teams to grapple with the large and hard-to-solve problems that defied solutions before. Sure, these projects are going to be difficult to structure and coordinate, and they will require leaders with clear visions and strong management skills. But innovating the old fashioned way has become unsustainable, so we’ve got to try something new. Construction of a Chaotic Computer Chip William L. Ditto, K. Murali and Sudeshna Sinha abstract Chaotic systems are great pattern generators and their defining feature, sensitivity to initial conditions, allows them to switch between patterns exponentially fast. We exploit such pattern generation by “tuning” representative continuous and discrete chaotic systems to generate all logic gate functions. We then exploit exponential sensitivity to initial conditions to achieve rapid switching between all the logic gates generated by each representative chaotic element. 
With this as a starting point we will present our progress on the construction of a chaotic computer chip consisting of large numbers of individual chaotic elements that can be individually and rapidly morphed to become all logic gates. Such a chip of arrays of morphing chaotic logic gates can then be programmed to perform higher order functions (such as memory, arithmetic logic, input/output operations, . . .) and to rapidly switch between such functions. Thus we hope that our reconfigurable chaotic computer chips will enable us to achieve the flexibility of field programmable gate arrays (FPGA), the optimization and speed of application specific integrated circuits (ASIC) and the general utility of a central processing unit (CPU) within the same computer chip architecture. Results on the construction and commercialization of the ChaoLogix™ chaotic computer chip will also be presented to demonstrate progress being made towards the commercialization of this technology (http://www.chaologix.com).

William L. Ditto
J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville, FL 32611-6131, USA, and ChaoLogix, Inc., 101 S.E. 2nd Place, Suite 201-A, Gainesville, FL 32601, USA
e-mail: william.ditto@bme.ufl.edu

K. Murali
Department of Physics, Anna University, Chennai 600 025, INDIA
e-mail: kmurali@annauniv.edu

Sudeshna Sinha
Institute of Mathematical Sciences, C.I.T. Campus, Taramani, Chennai 600 113, INDIA
e-mail: sudeshna@imsc.res.in

1 Introduction

It was proposed in 1998 that chaotic systems may be utilized to design computing devices [1]. In the early years the focus was on proof-of-principle schemes that demonstrated the capability of chaotic elements to do universal computing. The distinctive feature of this alternate computing paradigm was that it exploited the sensitivity and pattern formation features of chaotic systems. In subsequent years, it was realized that one of the most promising directions of this computing paradigm is its ability to exploit a single chaotic element to reconfigure into different logic gates through a threshold based morphing mechanism [2, 3]. In contrast to a conventional field programmable gate array element, where reconfiguration is achieved through switching between multiple single purpose gates, reconfigurable chaotic logic gates (RCLGs) are comprised of chaotic elements that morph (or reconfigure) logic gates through the control of the pattern inherent in their nonlinear element. Two-input RCLGs have recently been realized and shown to be capable of reconfiguring between all logic gates in discrete circuits [4, 5, 6]. Additionally, such RCLGs have been realized in prototype VLSI circuits (0.13 µm CMOS, 30 MHz clock cycles) that employ two-input reconfigurable chaotic logic gate arrays (RCGA) to morph between higher order functions such as those found in a typical arithmetic logic unit (ALU) [7]. In this article we first recall the theoretical scheme for flexible implementation of all these fundamental logical operations utilizing low dimensional chaos [2], and the specific realisation of the theory in a discrete-time and a continuous-time chaotic circuit. Then we will present new results on the design of reconfigurable multiple input gates.
Note that multiple input logic gates are preferred mainly for reasons of space in circuits, and also many combinational and sequential logic operations can be realized with these logic gates, in which one can minimize the propagation delay. Such a multiple input CGA would make RCLGs more power efficient, increase their performance and widen their range of applications. Here we specifically demonstrate a three input RCLG by implementing representative fundamental NOR and NAND gates with a continuous-time chaotic system.

2 Concept

In order to use the rich temporal patterns embedded in a nonlinear time series efficiently one needs a mechanism to extract different responses from the system, in a controlled manner, without much run-time effort. Here we employ a threshold based scheme to achieve this [8]. Consider the discrete-time chaotic map, with its state represented by a variable x, as our chaotic chip or chaotic processor. In our scheme all the basic logic gate operations (AND, OR, XOR, NAND, NOR, NOT) involve the following simple steps:

1. Inputs: x → x0 + I1 + I2 for 2-input gates such as the AND, OR, XOR, NAND and NOR operations, and x → x0 + I for the 1-input gate such as the NOT operation. Here x0 is the initial state of the system, and the input value I = 0 when logic input is 0 and I = Vin when logic input is 1 (where Vin is a positive constant).
2. Dynamical update, i.e. x → f(x) where f(x) is a strongly nonlinear function.
3. Threshold mechanism to obtain output V0: V0 = 0 if f(x) ≤ E, and V0 = f(x) − E if f(x) > E, where E is the threshold. This is interpreted as logic output 0 if V0 = 0 and logic output 1 if V0 ∼ Vin.

Since the system is chaotic, in order to specify the initial x0 accurately one needs a controlling mechanism. Here we will employ a threshold controller to set the initial x0. So in this example we use the clipping action of the threshold controller to achieve the initialization, and subsequently to obtain the output as well. Note that in our implementation we demand that the input and output have equivalent definitions (i.e. 1 unit is the same quantity for input and output), as well as among various logical operations. This requires that constant Vin assumes the same value throughout a network, and this will allow the output of one gate element to easily couple to another gate element as input, so that gates can be "wired" directly into gate arrays implementing compounded logic operations. In order to obtain all the desired input-output responses of the different gates, we need to satisfy the conditions enumerated in Table 1 simultaneously. So given a dynamics f(x) corresponding to the physical device in actual implementation, one must find values of threshold and initial state satisfying the conditions derived from the Truth Tables to be implemented. For instance, Table 2 shows the exact solutions of the initial x0 and threshold E which satisfy the conditions in Table 1 when f(x) = 4x(1 − x). The constant Vin = 1/4 is common to both input and output and to all logical gates.
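To make the discrete-time scheme concrete, the short Python sketch below (written for this summary, not taken from the paper) applies steps 1-3 with the logistic map f(x) = 4x(1 − x) and Vin = 1/4, and checks the (x0, E) pairs listed in Table 2 below against the truth table of each gate.

```python
# Illustrative check of the threshold-based logic scheme described above,
# using f(x) = 4x(1 - x) and Vin = 1/4 (values from the text and Table 2).
# This is a sketch for intuition, not the authors' code.

VIN = 0.25

def f(x):
    return 4.0 * x * (1.0 - x)

def gate_output(x0, E, inputs):
    """Steps 1-3: add inputs to x0, iterate the map once, threshold the result."""
    x = x0 + sum(VIN if bit else 0.0 for bit in inputs)
    fx = f(x)
    v0 = 0.0 if fx <= E else fx - E          # excess above threshold
    return 1 if abs(v0 - VIN) < 1e-9 else 0  # logic 1 when V0 ~ Vin

# (x0, E) solutions from Table 2, with the truth table each pair should realize.
gates = {
    "AND":  (0.0,   3/4,  {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}),
    "OR":   (1/8,  11/16, {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}),
    "XOR":  (1/4,   3/4,  {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}),
    "NAND": (3/8,  11/16, {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 0}),
    "NOT":  (1/2,   3/4,  {(0,): 1, (1,): 0}),
}

for name, (x0, E, truth) in gates.items():
    ok = all(gate_output(x0, E, ins) == out for ins, out in truth.items())
    print(f"{name:4s} x0={x0:<6} E={E:<7} reproduces truth table: {ok}")
```

Each pair reproduces its gate's truth table, which is the sense in which a single chaotic map can be morphed between logic operations simply by shifting its initial state and threshold.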
Logic Operation | Input Set (I1, I2) | Output | Necessary and Sufficient Condition
AND   | (0,0)        | 0 | f(x0) < E
AND   | (0,1)/(1,0)  | 0 | f(x0 + Vin) < E
AND   | (1,1)        | 1 | f(x0 + 2Vin) − E = Vin
OR    | (0,0)        | 0 | f(x0) < E
OR    | (0,1)/(1,0)  | 1 | f(x0 + Vin) − E = Vin
OR    | (1,1)        | 1 | f(x0 + 2Vin) − E = Vin
XOR   | (0,0)        | 0 | f(x0) < E
XOR   | (0,1)/(1,0)  | 1 | f(x0 + Vin) − E = Vin
XOR   | (1,1)        | 0 | f(x0 + 2Vin) < E
NOR   | (0,0)        | 1 | f(x0) − E = Vin
NOR   | (0,1)/(1,0)  | 0 | f(x0 + Vin) < E
NOR   | (1,1)        | 0 | f(x0 + 2Vin) < E
NAND  | (0,0)        | 1 | f(x0) − E = Vin
NAND  | (0,1)/(1,0)  | 1 | f(x0 + Vin) − E = Vin
NAND  | (1,1)        | 0 | f(x0 + 2Vin) < E
NOT   | 0            | 1 | f(x0) − E = Vin
NOT   | 1            | 0 | f(x0 + Vin) < E

Table 1 Necessary and sufficient conditions, derived from the logic truth tables, to be satisfied simultaneously by the nonlinear dynamical element, in order to have the capacity to implement the logical operations AND, OR, XOR, NAND, NOR and NOT with the same computing module.

Operation | AND | OR    | XOR | NAND  | NOT
x0        | 0   | 1/8   | 1/4 | 3/8   | 1/2
E         | 3/4 | 11/16 | 3/4 | 11/16 | 3/4

Table 2 One specific solution of the conditions in Table 1 which yields the logical operations AND, OR, XOR, NAND and NOT, with Vin = 1/4. Note that these theoretical solutions have been fully verified in a discrete electrical circuit emulating a logistic map [4].

3 Continuous-time Nonlinear System

We now present a somewhat different scheme for obtaining logic responses from a continuous-time nonlinear system. Our processor is now a continuous time system described by the evolution equation dx/dt = F(x, t), where x = (x1, x2, . . ., xN) are the state variables and F is a nonlinear function. In this system we choose a variable, say x1, to be thresholded. Whenever the value of this variable exceeds a threshold E it resets to E, i.e. when x1 > E then (and only then) x1 = E. Now the basic 2-input 1-output logic operation on a pair of inputs I1, I2 in this scheme simply involves the setting of an inputs-dependent threshold, namely the threshold voltage E = VC + I1 + I2, where VC is the dynamic control signal determining the functionality of the processor. By switching the value of VC one can switch the logic operation being performed. Again I1/I2 has value 0 when logic input is 0 and has value Vin when logic input is 1. So the threshold E is equal to VC when logic inputs are (0, 0), VC + Vin when logic inputs are (0, 1) or (1, 0), and VC + 2Vin when logic inputs are (1, 1). The output is interpreted as logic output 0 if x1 < E, i.e. the excess above threshold V0 = 0. The logic output is 1 if x1 > E, and the excess above threshold V0 = (x1 − E) ∼ Vin. The schematic diagram of this method is displayed in Fig. 1.

Fig. 1 Schematic diagram for implementing a morphing 2-input logic cell with a continuous-time dynamical system: the chaotic circuit variable is compared against the threshold level E = VC + I1 + I2, and the excess above threshold is the output V0. Here VC determines the nature of the logic response, and the 2 inputs are I1 and I2.

Now for a NOR gate implementation (VC = VNOR) the following must hold true: (i) when input set is (0, 0), output is 1, which implies that for threshold E = VNOR, output V0 = (x1 − E) ∼ Vin; (ii) when input set is (0, 1) or (1, 0), output is 0, which implies that for threshold E = VNOR + Vin, x1 < E so that output V0 = 0; (iii) when input set is (1, 1), output is 0, which implies that for threshold E = VNOR + 2Vin, x1 < E so that output V0 = 0. For a NAND gate (VC = VNAND) the following must hold true: (i) when input set is (0, 0), output is 1, which implies that for threshold E = VNAND, output V0 = (x1 − E) ∼ Vin;
(ii) when input set is (0, 1) or (1, 0), output is 1, which implies that for threshold E = VNAND + Vin, output V0 = (x1 − E) ∼ Vin; (iii) when input set is (1, 1), output is 0, which implies that for threshold E = VNAND + 2Vin, x1 < E so that output V0 = 0. In order to design a dynamic NOR/NAND gate one has to find values of VC that will satisfy all the above input-output associations in a robust and consistent manner. A proof-of-principle experiment of the scheme was realized with the double-scroll chaotic Chua's circuit given by the following set of (rescaled) 3 coupled ODEs [9]:

dx1/dt = α (x2 − x1 − g(x1))   (1)
dx2/dt = x1 − x2 + x3          (2)
dx3/dt = −β x2                 (3)

where α = 10 and β = 14.87, and the piecewise linear function g(x) = bx + (1/2)(a − b)(|x + 1| − |x − 1|) with a = −1.27 and b = −0.68. We used the ring structure configuration of the classic Chua's circuit [9]. In the experiment we implemented minimal thresholding on variable x1 (this is the part in the "control" box in the schematic figure). We clipped x1 to E, if it exceeded E, only in Eqn. 2. This has very easy implementation, as it avoids modifying the value of x1 in the nonlinear element g(x1), which is harder to do. So then all we need to do is to implement dx2/dt = E − x2 + x3 instead of Eqn. 2, when x1 > E, and there is no controlling action if x1 ≤ E. A representative example of a dynamic NOR/NAND gate can be obtained in this circuit implementation with parameters Vin = 2V. The NOR gate is realized around VC = 0V. At this value of control signal, we have the following: for input (0,0) the threshold level is at 0V, which yields V0 ∼ 2V; for inputs (1,0) or (0,1) the threshold level is at 2V, which yields V0 ∼ 0V; and for input (1,1) the threshold level is at 4V, which yields V0 = 0 as the threshold is beyond the bounds of the chaotic attractor. The NAND gate is realized around VC = −2V. The control signal yields the following: for input (0,0) the threshold level is at −2V, which yields V0 ∼ 2V; for inputs (1,0) or (0,1) the threshold level is at 0V, which yields V0 ∼ 2V; and for input (1,1) the threshold level is at 2V, which yields V0 ∼ 0V [5]. So the knowledge of the dynamics allowed us to design a control signal that can select out the temporal patterns emulating the NOR and NAND gates [6]. For instance in the example above, as the dynamic control signal VC switches from 0V to −2V, the module first yields the NOR and then a NAND logic response. Thus one has obtained a dynamic logic gate capable of switching between two fundamental logic responses, namely the NOR and NAND.
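To make the control rule concrete, here is a rough Python sketch (written for this summary, not taken from the paper) that integrates the rescaled Chua equations (1)-(3) and applies the thresholding exactly as described: x1 is replaced by E in Eqn. (2) whenever x1 > E, and the output is read as the excess of x1 above the input-dependent threshold E = VC + I1 + I2. The integration scheme, time step, initial condition, run length and peak read-out are assumptions for illustration, and will not necessarily reproduce the exact voltage levels reported for the physical circuit.

```python
# Rough numerical sketch of the threshold-controlled Chua system described
# above. Parameters are taken from the text; everything else is illustrative.

ALPHA, BETA = 10.0, 14.87
A, B = -1.27, -0.68
VIN = 2.0  # logic-1 input level in the representative 2-input example

def g(x):
    # Chua piecewise-linear nonlinearity
    return B * x + 0.5 * (A - B) * (abs(x + 1.0) - abs(x - 1.0))

def peak_excess(vc, inputs, dt=1e-3, steps=200_000):
    """Largest excess of x1 above the input-dependent threshold E = VC + I1 + I2."""
    E = vc + sum(VIN if bit else 0.0 for bit in inputs)
    x1, x2, x3 = 0.1, 0.0, 0.0
    peak = 0.0
    for _ in range(steps):
        x1_clipped = E if x1 > E else x1      # threshold acts only in Eqn. (2)
        dx1 = ALPHA * (x2 - x1 - g(x1))
        dx2 = x1_clipped - x2 + x3
        dx3 = -BETA * x2
        x1, x2, x3 = x1 + dt * dx1, x2 + dt * dx2, x3 + dt * dx3
        peak = max(peak, x1 - E)
    return peak

# VC = 0 V corresponds to the NOR setting and VC = -2 V to the NAND setting;
# an excess near Vin is read as logic 1, an excess near zero as logic 0.
for vc, name in [(0.0, "NOR"), (-2.0, "NAND")]:
    for inputs in [(0, 0), (0, 1), (1, 1)]:
        print(f"{name} {inputs}: peak excess ~ {peak_excess(vc, inputs):.2f}")
```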
4 Design and Construction of a Three-Input Reconfigurable Chaotic Logic Gate

As in Section 3, consider a single chaotic element (for inclusion into a RCLG) to be a continuous time system described by the evolution equation dx/dt = F(x; t), where x = (x1, x2, . . ., xN) are the state variables, and F is a strongly nonlinear function. Again in this system we choose a variable, say x1, to be thresholded. So whenever the value of this variable exceeds a critical threshold E (i.e. when x1 > E), it re-sets to E. In accordance with our basic scheme, the logic operation on a set of inputs I1, I2 and I3 simply involves the setting of an inputs-dependent threshold, namely the threshold voltage E = VC + I1 + I2 + I3, where VC is the dynamic control signal determining the functionality of the processor. By switching the value of VC, one can switch the logic operation being performed. I1,2,3 has value ∼ 0V when logic input is zero, and value Vin when logic input is one. So for input (0,0,0) the threshold level is at VC; for inputs (0,0,1) or (0,1,0) or (1,0,0) the threshold level is at VC + Vin; for inputs (0,1,1) or (1,1,0) or (1,0,1) the threshold level is at VC + 2Vin; and for input (1,1,1) the threshold level is VC + 3Vin. As before, the output is interpreted as logic output 0 if x1 < E, and the excess above threshold V0 ∼ 0. The logic output is 1 if x1 > E, and V0 = (x1 − E) ∼ Vin. Now for the 3-input NOR and NAND gate implementations the input-output relations given in Tables 3 and 4 must hold true. Again, in order to design the NOR or NAND gates, one has to use the knowledge of the dynamics of the nonlinear system to find the values of VC and V0 that will satisfy all the input-output associations in a consistent and robust manner. Consider again the simple realization of the double-scroll chaotic Chua's attractor represented by the set of (rescaled) 3 coupled ODEs given in Eqns. 1-3.

Input Set (I1, I2, I3)          | Threshold E  | Output                  | Logic Output
(0,0,0)                         | VNOR         | V0 = (x1 − E) ∼ Vin     | 1
(0,0,1) or (0,1,0) or (1,0,0)   | VNOR + Vin   | V0 ∼ 0V as x1 < E       | 0
(0,1,1) or (1,1,0) or (1,0,1)   | VNOR + 2Vin  | V0 ∼ 0V as x1 < E       | 0
(1,1,1)                         | VNOR + 3Vin  | V0 ∼ 0V as x1 < E       | 0

Table 3 Truth table for NOR gate implementation (Vin = 1.84V, VNOR = 0V)

Input Set (I1, I2, I3)          | Threshold E  | Output                  | Logic Output
(0,0,0)                         | VNAND        | V0 = (x1 − E) ∼ Vin     | 1
(0,0,1) or (0,1,0) or (1,0,0)   | VNAND + Vin  | V0 = (x1 − E) ∼ Vin     | 1
(0,1,1) or (1,1,0) or (1,0,1)   | VNAND + 2Vin | V0 = (x1 − E) ∼ Vin     | 1
(1,1,1)                         | VNAND + 3Vin | V0 ∼ 0V as x1 < E       | 0

Table 4 Truth table for NAND gate implementation (Vin = 1.84V, VNAND = −3.68V)

This system was implemented by the circuit shown in Fig. 3, with circuit component values: [L = 18mH, R = 1710Ω, C1 = 10nF, C2 = 100nF, R1 = 220Ω, R2 = 220Ω, R3 = 2.2kΩ, R4 = 22kΩ, R5 = 22kΩ, R3 = 3.3kΩ, D = 1N4148, B1, B2 = buffers, OA1 - OA3: opamp µA741]. The x1 dynamical variable (corresponding to the voltage V1 across the capacitor C1) is thresholded by a control circuit shown in the dotted box in Fig. 3, with voltage E setting varying thresholds. In the circuit, VT corresponds to the output signal from the threshold controller. Note that, as in the implementation of 2-input gates, we are only replacing dx2/dt = x1 − x2 + x3 by dx2/dt = E − x2 + x3 in Eq. (2), when x1 > E, and there is no controlling action if x1 ≤ E. The schematic diagram for the NAND/NOR gate implementation is depicted in Fig. 2. In the representative example shown here, Vin = 1.84V. The NOR gate is realized around VC = VNOR = 0V and the NAND gate is realized with VC = VNAND = −3.68V (see Tables 3 and 4).

Fig. 2 Symbolic diagram for dynamic 3-input NOR/NAND logic cell. Dynamic control signal VC determines the logic operation. In our example, VC can switch between VNAND, giving a NAND gate, and VNOR, giving a NOR gate.

Fig. 3 Circuit module implementing a RCLG that morphs between NAND and NOR logic gates. The diagram represented in the dotted region is the threshold controller. Here E = VC + I1 + I2 + I3 is the dynamically varying threshold voltage. VT is the output signal from the threshold controller and V0 is the difference voltage signal.
Fig. 4 Voltage timing sequences from top to bottom (PSPICE simulation): (a) First input I1, (b) Second input I2, (c) Third input I3, (d) Dynamic control signal VC, where VC switches between VNAND = −3.68V and VNOR = 0V, (e) Output signal V1 (corresponding to x1(t)) from the Chua's circuit, (f) Recovered logic output signal from V0. The fundamental period of oscillation of this circuit is 0.33 ms.

Thus the nonlinear evolution of the element has allowed us to obtain a control signal that selects out temporal patterns corresponding to NOR and NAND gates. For instance in Fig. 4, as the dynamic control signal VC switches from −3.68V to 0V, the element yields first a NAND gate and then morphs into a NOR gate. The fundamental period of oscillation of the Chua's circuit is 0.33 ms. The average latency of morphing between logic gates is 48% of this period.

5 VLSI Implementation of Chaotic Computing Architectures – Proof of Concept

Recently ChaoLogix Inc. designed and fabricated a proof of concept chip that demonstrates the feasibility of constructing reconfigurable chaotic logic gates, henceforth ChaoGates, in standard CMOS based VLSI (0.18 µm TSMC process operating at 30 MHz with a 3.1 × 3.1 mm die size and a 1.8V digital core voltage). The basic building block ChaoGate is shown schematically in Fig. 5. ChaoGates were then incorporated into a ChaoGate Array in the VLSI chip to demonstrate higher order morphing functionality including:

1. A small Arithmetic Logic Unit (ALU) that morphs between higher order arithmetic functions (multiplier and adder/accumulator) in less than one clock cycle. An ALU is a basic building block of computer architectures.
2. A Communications Protocols (CP) Unit that morphs between two different complex communications protocols in less than one clock cycle: Serial Peripheral Interface (SPI, a synchronous serial data link) and an Inter Integrated Circuit Control bus implementation (I2C, a multi-master serial computer bus).

While the design of the ChaoGates and ChaoGate Arrays in this proof of concept VLSI chip was not optimized for performance, it clearly demonstrates that ChaoGates can be constructed and organized into reconfigurable chaotic logic gate arrays capable of morphing between higher order computational building blocks. Current efforts are focused upon optimizing the design of a single ChaoGate to levels where it is comparable to or smaller than a single NAND gate in terms of power and size yet is capable of morphing between all gate functions in under a single computer clock cycle. Preliminary designs indicate that this goal is achievable and that all gates currently used to design computers may be replaced with ChaoGates to provide added flexibility and performance.

Acknowledgments We acknowledge the support of the Office of Naval Research [N000140211019].

Fig. 5 (Left) Schematic of a two-input, one-output morphable ChaoGate.
The gate logic functionality (NOR, NAND, XOR, ...) is controlled (morphed), in the current VLSI design, by global thresholds connected to VT1, VT2 and VT3 through analog multiplexing circuitry, and (Right) a size comparison between the current ChaoGate circuitry implemented in the ChaoLogix VLSI chaotic computing chip and a typical NAND gate circuit (Courtesy of ChaoLogix Inc.).

References
1. Sinha, S. and Ditto, W.L., Phys. Rev. Lett. 81 (1998) 2156.
2. Sinha, S., Munakata, T. and Ditto, W.L., Phys. Rev. E 65 (2002) 036214; Munakata, T., Sinha, S. and Ditto, W.L., IEEE Trans. Circ. and Systems 49 (2002) 1629.
3. Sinha, S. and Ditto, W.L., Phys. Rev. E 59 (1999) 363; Sinha, S., Munakata, T. and Ditto, W.L., Phys. Rev. E 65 (2002) 036216.
4. Murali, K., Sinha, S. and Ditto, W.L., Proceedings of the STATPHYS-22 Satellite Conference "Perspectives in Nonlinear Dynamics", special issue of Pramana 64 (2005) 433.
5. Murali, K., Sinha, S. and Ditto, W.L., Int. J. Bif. and Chaos (Letts.) 13 (2003) 2669; Murali, K., Sinha, S. and Raja Mohamed, I.R., Phys. Letts. A 339 (2005) 39.
6. Murali, K., Sinha, S. and Ditto, W.L., Proceedings of the Experimental Chaos Conference (ECC9), Brazil (2006), published in Philosophical Transactions of the Royal Society of London (Series A) (2007).
7. Ditto, W., Sinha, S. and Murali, K., US Patent Number 07096347 (August 22, 2006).
8. Sinha, S., in Nonlinear Systems, Eds. R. Sahadevan and M.L. Lakshmanan (Narosa, 2002) 309-328; Murali, K. and Sinha, S., Phys. Rev. E 68 (2003) 016210; Ditto, W.L. and Sinha, S., Philosophical Transactions of the Royal Society of London (Series A) 364 (2006) 2483-2494.
9. Dimitriev, A.S. et al., J. Comm. Tech. Electronics 43 (1998) 1038.

Panasas kingpin: What's the solid state state of play? Garth Gibson on HPC - buffering the brute-force burst By Chris Mellor Posted in Storage, 29th March 2012 16:28 GMT Interview What can NAND flash do now for high-performance computing (HPC) storage and how will it evolve? Garth Gibson, the co-founder and chief technology officer for Panasas, the HPC storage provider, has definite views on it. Here's a snapshot of them. El Reg: How can solid state technology benefit HPC in general? Garth Gibson: The most demanding science done in HPC is high resolution simulation. This manifests as distributed shared memory -- the science is limited by memory size. Memory is typically 40 per cent of capital costs and 40 per cent of power costs, and can be said to be the limiting technology in future HPC systems. Solid state promises new choices in larger, lower power memory systems, possibly enabling advances in science better and faster. More narrowly, solid state technology does not have mechanical positioning delays, so small random accesses can have latencies that are two orders of magnitude shorter. El Reg: Does Panasas have any involvement with NAND in its products? If so, how and why? Garth Gibson: Panasas uses NAND flash to accelerate small random accesses. In HPC storage, the bulk of the data is sequentially accessed, so this means that the primary use of small random access acceleration is file system metadata (directories, allocation maps, file attributes) and small files. But we also use this space for small random accesses into large files, which, although rare, can lead to disproportionately large fragmentation and read access slowdown. El Reg: What are your views on SLC NAND and MLC NAND in terms of speed (IOPS, MB/sec), endurance, form factor and interfaces?
Garth Gibson: Our experience is that the NAND flash technologies are becoming more mature, and we can increasingly trust in the reliability mechanisms provided. This means that enterprise MLC is sufficiently durable and reliable to be used, although SLC continues to be faster when that extra speed can be fully exploited. El Reg: Where in the HPC server-to-storage 'stack' could NAND be used and why? Garth Gibson: The driving use of NAND flash in HPC by the end of this decade is likely to be so called "burst buffers". These buffers are the target of memory to memory copies, enabling checkpoints (defensive IO enabling a later "restart from checkpoint" after a failure) to be captured faster. The compute can then resume when the burst buffer drains to less expensive storage, typically on magnetic hard disk. But shortly after that use is established I expect scientists to want to do data analytics on multiple sequential checkpoints while these are still held in the burst buffer, because the low latency random access of NAND flash will allow brute-force analysis computations not effective in main memory or on magnetic disk. El Reg: Does Panasas develop its own NAND controller technology? If yes or no - why? Garth Gibson: Panasas is using best-in-class NAND flash controller technology today. But changes in NAND flash technology and vendors are rapid and important and we continue to track this technology closely, with an open mind to changing the way we use solid state. El Reg: What does Panasas think of the merits and demerits of TLC NAND (3-bit MLC)? Garth Gibson: TLC NAND flash is a new technology, not yet ready for use in Panasas equipment. As it evolves, it might become appropriate for burst buffers ... hard to say now. El Reg: How long before NAND runs out of steam? Garth Gibson: As usual, technologists can point to challenges with current technology that seem to favor alternative technologies in a timeframe of 2 to 4 generations in the future. I'm told in such discussions that 2024 looks quite challenging for NAND flash, and much better for its competitors. However, with that much time, the real issue is how much quality investment is made in the technology. The market impact of NAND flash is large enough now to ensure that significant effort will go into continued advances in NAND flash. This is not as clear for its competitors. El Reg: What do you think of the various post-NAND technology candidates such as Phase Change Memory, STT-RAM, memristor, Racetrack and the various Resistive-RAMs? Garth Gibson: I am totally enamored of STT-RAM because it promises infinite rewrite and DRAM-class speeds. Totally magic! I just hope the technology pans out, because it has a long way to go. Phase change is much more real, and suffering disappointing endurance improvement so far. El Reg: Any other pertinent points? Garth Gibson: Magnetic disk bits are small compared to solid state bits, and solid directions are available to continue to make them smaller. As long as society's appetite for online data continues to grow, I would expect magnetic disk to continue to play an important role. However, I would expect that the memory hierarchy - on-chip to RAM to disk will become deeper, with NAND flash and its competitors between RAM and disk. Not such good news in his views on memristor technology. Maybe HP will surprise us all. ® Storage at Exascale: Some Thoughts from Panasas CTO Garth Gibson May 25, 2011 Exascale computing is not just about FLOPS. 
It will also require a new breed of external storage capable of feeding these exaflop beasts. Panasas co-founder and chief technology officer Garth Gibson has some ideas on how this can be accomplished, and we asked him to expound on the topic in some detail.

HPCwire: What kind of storage performance will need to be delivered for exascale computing?

Garth Gibson: The top requirement for storage in an exascale supercomputer is the capability to store a checkpoint in approximately 15 minutes or less so as to keep the supercomputer busy with computational tasks most of the time. If you do a checkpoint in 15 minutes, your compute period can be as little as two and a half hours and you still spend only 10 percent of your time checkpointing. The size of the checkpoint data is determined by the memory sizing, something that some experts expect will be approximately 64 petabytes based on the power and capital costs involved. Based on that memory size, we estimate the storage system must be capable of writing at 70 terabytes per second to support a 15-minute checkpoint.

HPCwire: Given the slower performance slope of disk compared to compute, what types of hardware technologies and storage tiering will be required to provide such performance?

Gibson: While we have seen peak rates of throughput in the hundreds of gigabytes per second range today, we have to scale 1000x to get to the required write speed for exascale compute. The challenge with the 70 terabyte-per-second write requirement is that traditional disk drives will not get significantly faster over the coming decade, so it will require almost 1000x the number of spindles to sustain this level of write capability. After all, we can only write as fast as the sum of the individual disk drives. We can look at other technologies like flash storage -- such as SSDs -- with faster write capabilities. The challenge with this technology, however, is the huge cost delta between flash-based solutions compared to ones based on traditional hard drives. Given that the scratch space will likely be at least 10 times the size of main memory, we are looking at 640 petabytes of scratch storage, which translates to over half a billion dollars of cost in flash-based storage alone. The solution is a hybrid approach where the data is initially copied to flash at 70 terabytes per second, but the second layer gets 10 times as much time to write from flash to disk, lowering storage bandwidth requirements to 7 terabytes per second, and storage components to only about 100x today's systems. You get the performance out of flash and the capacity out of spinning disk. In essence, the flash layer is really temporary "cheap memory," possibly not part of the storage system at all, with little or no use of its non-volatility, and perhaps not using a disk interface like SATA.
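The bandwidth figures Gibson quotes follow from straightforward arithmetic. The sketch below reproduces it using only the numbers given in the interview (64 PB of memory, a 15-minute checkpoint window, a flash tier that drains to disk over ten times that window, and scratch capacity of ten times memory); it is a back-of-the-envelope illustration, not a sizing tool.

PB = 10 ** 15                    # petabyte, in bytes
memory = 64 * PB                 # assumed exascale memory footprint
checkpoint_window = 15 * 60      # seconds allowed per checkpoint

flash_bw = memory / checkpoint_window         # ~71 TB/s: the "70 TB/s" flash tier
disk_bw = memory / (10 * checkpoint_window)   # ~7 TB/s: the slower disk tier
scratch = 10 * memory                         # 640 PB of scratch capacity

print(f"flash tier: {flash_bw / 1e12:.0f} TB/s")
print(f"disk tier:  {disk_bw / 1e12:.0f} TB/s")
print(f"scratch:    {scratch / PB:.0f} PB")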
HPCwire: What types of software technologies will have to be developed?

Gibson: If we solve the performance/capacity/cost issue with a hybrid model using flash as a temporary memory dump before data is written off to disk, it will require a significant amount of intelligent copy and tiering software to manage the data movement between main memory and the temporary flash memory and from there on to spinning disks. It is not even clear what layers of the application, runtime system, operating system or file system manage this flash memory. Perhaps more challenging, there will have to be a significant amount of software investment in building reliability into the system. An exascale storage system is going to have two orders of magnitude more components than current systems. With a lot more components comes a significantly higher rate of component failure. This means more RAID reconstructions needing to rebuild bigger drives and more media failures during these reconstructions. Exascale storage will need higher tolerance for failure as well as the capability for much faster reconstruction, such as is provided by Panasas' parallel reconstruction, in addition to improved defense against media failures, such as is provided by Panasas' vertical parity. And more importantly, it will need end-to-end data integrity checking of stored data, data in transit, data in caches, data pushed through servers and data received at compute nodes, because there is just so much data flowing that detection of the inevitable flipped bit is going to be key. The storage industry has started on this type of high-reliability feature development, but exascale computing will need exascale mechanisms years before the broader engineering marketplaces will require it.

HPCwire: How will metadata management need to evolve?

Gibson: At Carnegie Mellon University we have already seen, with tests run at Oak Ridge National Laboratory, that it doesn't take a very big configuration before it starts to take thousands of seconds to open all the files, end-to-end. As you scale up the supercomputer size, the increased processor count puts tremendous pressure on your available metadata server concurrency and throughput. Frankly, this is one of the key pressure points we have right now – just simply creating, opening and deleting files can really eat into your available compute cycles. This is the base problem with metadata management. Exascale is going to mean 100,000 to 250,000 nodes or more. With hundreds to thousands of cores per node and many threads per core -- GPUs in the extreme -- the number of concurrent threads in exascale computing can easily be estimated in the billions. With this level of concurrent activity, a highly distributed, scalable metadata architecture is a must, with dramatically superior performance over what any vendor offers today. While we at Panasas believe we are in a relatively good starting position, it will nevertheless require a very significant software investment to adequately address this challenge.

HPCwire: Do you believe there is a reasonable roadmap to achieve all this? Do you think the proper investments are being made?

Gibson: I believe that there is a well reasoned and understood roadmap to get from petascale to exascale. However, it will take a lot more investment than is currently being put into getting to the roadmap goals. The challenge is the return on investment for vendors. When you consider that the work will take most of the time running up to 2018, when the first exascale systems will be needed, and that there will barely be more than 500 publicly known petascale computers at that time (based on TOP500.org's historical 7-year lag on the scale of the 500th largest computer), it is going to be hard to pay for systems development on that scale now, knowing that there are going to be only a few implementations to apportion the cost against this decade and that it will take most of the decade after that for the exascale installed base to grow to 500.
We know that exascale features are a viable program at a time far enough down the line to spread the investment cost across many commercial customers such as those in the commercial sector doing work like oil exploration or design modeling. However, in the mean time, funding a development project like exascale storage systems could sink a small company and it would be highly unattractive on the P&L of a publicly traded company. What made petascale storage systems such as Panasas and Lustre a reality was the investment that the government made with DARPA in the 1990’s and with the DOE Path Forward program this past decade. Similar programs are going to be required to make exascale a reality. The government needs to share in this investment if it wants production quality solutions to be available in the target exascale timeframe. HPCwire: What do you think is the biggest hurdle for exascale storage? Gibson: The principal challenge for this type of scale will be the software capability. Software that can manage these levels of concurrency, streaming at such high levels of bandwidth without bottlenecking on metadata throughput, and at the same time ensure high levels of reliability, availability, integrity, and ease-of-use, and in a package that is affordable to operate and maintain is going to require a high level of coordination and cannot come from stringing together a bunch of open-source modules. Simply getting the data path capable of going fast by hooking it together with bailing wire and duct tape is possible but it gives you a false confidence because the capital costs look good and there is a piece of software that runs for awhile and appears to do the right thing. But in fact, having a piece of software that maintains high availability, doesn’t lose data, and has high integrity and a manageable cost of operation is way harder than many people give it credit for being. You can see this tension today in the Lustre open source file system which seems to require a non-trivial, dedicated staff trained to keep the system up and effective. HPCwire: Will there be a new parallel file system for exascale? Gibson: The probability of starting from scratch today and building a brand new production file system deployable in time for 2018 is just about zero. There is a huge investment in software technology required to get to exascale and we cannot get there without significant further investment in the parallel file systems available today. So if we want to hit the timeline for exascale, it is going to have to take investment in new ideas and existing implementations to hit the exascale target. Biff (Bloom Filter) Codes : Fast Error Correction for Large Data Sets Michael Mitzenmacher and George Varghese Abstract—Large data sets are increasingly common in cloud and virtualized environments. For example, transfers of multiple gigabytes are commonplace, as are replicated block of such sizes. There is a need for fast error-correction or data reconciliation in such settings even when the expected number of errors is small. Motivated by such cloud reconciliation problems, we consider error-correction schemes designed for large data, after explaining why previous approaches appear unsuitable. We introduce Biff codes, which are based on Bloom filters and are designed for large data. For Biff codes with a message of length L and E errors, the encoding time is O(L), decoding time is O(L + E) and the space overhead is O(E). 
Biff codes are low-density parity-check codes; they are similar to Tornado codes, but are designed for errors instead of erasures. Further, Biff codes are designed to be very simple, removing any explicit graph structures and based entirely on hash tables. We derive Biff codes by a simple reduction from a set reconciliation algorithm for a recently developed data structure, invertible Bloom lookup tables. While the underlying theory is extremely simple, what makes this code especially attractive is the ease with which it can be implemented and the speed of decoding, which we demonstrate with a prototype implementation.

I. INTRODUCTION

Motivated by the frequent need to transfer and reconcile large data sets in virtualized and cloud environments, we provide a very simple and fast error-correcting code designed for very large data streams. For example, consider the specific problem of reconciling two memories of 2 Gbytes whose contents may differ by a small number of 32-bit words. Alternatively, one can picture transferring a memory of this size, and needing to check for errors after it is written to the new storage. We assume errors are mutation errors; data order remains intact. Other possible applications include deduplication, as exemplified by the Difference Engine [6]. While storage may seem cheap, great cost savings can be effected by replacing redundant copies of data with a single copy and pointers in other locations. For example, in virtualized environments, it is not surprising that two virtual machines might have virtual memories with a great deal of redundancy. For example, both VMs may include similar copies of the operating system. More generally, we are concerned with any setting with large data transfers over networks. In this setting, our primary notion of efficiency differs somewhat from standard coding. While we still want the redundancy added for the code to be as small as possible, speed appears to be a more important criterion for large data sets. In particular, for a message of length L and E errors, while we may want close to the minimum overhead of E words, O(E) words with a reasonably small constant should suffice. More importantly, we require very fast encoding and decoding times; encoding should be O(L) and decoding should be O(L + E), with very small constant factors implied in the asymptotic notation. Typically, E will be very small compared to L; we expect very small error rates, or even subconstant error rates (such as a bounded number of errors). In this paper, we describe new codes that are designed for large data. We also show why other approaches (such as Reed-Solomon codes or Tornado codes with block-based checksums) are unsuitable. Our codes are extremely attractive from the point of view of engineering effectiveness: our software prototype implementation is very fast, decoding messages of 1 million words with thousands of errors in under a second. We call our codes Biff codes, where Biff denotes how we pronounce BF, for Bloom filter. Biff codes are motivated by recent Bloom filter variations, the invertible Bloom filter [3] and invertible Bloom lookup table [5], and their uses for set reconciliation [4], as explained below. Alternatively, Biff codes are similar to Tornado codes [1], [9], and can be viewed as a practical, randomized low-density parity-check code with an especially simple structure designed specifically for word-level mutation errors.
Also, while Tornado codes were designed using multiple levels of random graphs with carefully chosen degree distributions, Biff codes reduce this structure to its barest elements; our basic structure is single-layer, and regular, in that each message symbol takes part in the same number of encoded symbols. As a result, programming efficient encoding and decoding routines can easily be done in a matter of hours. We expect this simplicity will be prized as a virtue in practical settings; indeed, we believe Biff codes reflect the essential ideas behind Tornado codes and subsequent related low-density parity-check (LDPC) codes, in their simplest form. We also provide a simple (and apparently new) general reduction from error-correcting codes to set reconciliation. While reductions from erasure-correcting codes to set reconciliation are well known [7], [10], our reduction may be useful independent of Biff codes.

II. FROM SET RECONCILIATION TO ERROR CORRECTING OMISSION ERRORS

We now describe how to construct Biff codes from invertible Bloom lookup tables (IBLTs). The source of the stream of ideas we exploit is a seminal paper called Invertible Bloom Filters by Eppstein and Goodrich that invented a streaming data structure for the so-called straggler problem [3]. The basic idea was generalized for set reconciliation by Eppstein, Goodrich, Uyeda, and Varghese in [4] and generalized and improved further by Goodrich and Mitzenmacher to IBLTs [5]. We choose to use the framework of IBLTs in the exposition that follows. We start by reviewing the main aspects of IBLTs that we require from [5]. We note that we do not require the full IBLT structure for our application, so we discuss only the elements that we need, and refer readers to [5] for further details on IBLT performance.

A. IBLTs via Hashing

Our IBLT construction uses a table T of m cells, and a set of k random hash functions, h1, h2, ..., hk, to store a collection of key-value pairs. In our setting, keys will be distinct, and each key will have a value determined by the key. On an insertion, each key-value pair is placed into cells T[h1(x)], T[h2(x)], ..., T[hk(x)]. We assume the hash functions are fully random; in practice this assumption appears suitable (see, e.g., [11], [13] for related work on this point). For technical reasons, we assume that distinct hash functions yield distinct locations. This can be accomplished in various ways, such as by splitting the m cells into k subtables each of size m/k, and having each hash function choose one cell (uniformly) from each subtable. Such splitting does not affect the asymptotic behavior in our analysis. In a standard IBLT, each cell contains three fields: a keySum field, which is the exclusive-or (XOR) of all the keys that have been inserted that map to this cell; a valueSum field, which is the XOR of all the values of the keys that have been inserted that map to this cell; and a count field, which counts the number of keys that have been inserted into the cell. As all operations are XORs, deletions are handled in an equivalent manner: on deletion of a previously inserted key-value pair, the IBLT XORs the key and value with the fields in the appropriate cells, and the count is decremented. This reverses a corresponding insertion. We will discuss later how to deal with deletions without corresponding insertions, a case that can usefully occur in our setting.
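As a concrete picture of the cell structure just described, here is a minimal IBLT sketch in Python. Field names follow the text; the hash functions are stand-ins built from hashlib (one cell chosen per subtable), not the fully random functions assumed in the analysis, and keys and values are assumed to be small non-negative integers.

import hashlib

class IBLT:
    """Minimal IBLT sketch: k subtables of m/k cells; each cell holds
    keySum, valueSum and count. Deletion repeats the same XOR updates
    with the count moving in the opposite direction."""

    def __init__(self, m, k):
        self.k, self.sub = k, m // k
        self.cells = [[0, 0, 0] for _ in range(k * self.sub)]  # [keySum, valueSum, count]

    def _cell(self, j, key):
        # Stand-in for the j-th random hash function, restricted to subtable j.
        h = hashlib.blake2b(key.to_bytes(8, "big"), salt=bytes([j])).digest()
        return j * self.sub + int.from_bytes(h, "big") % self.sub

    def _update(self, key, value, sign):
        for j in range(self.k):
            c = self.cells[self._cell(j, key)]
            c[0] ^= key
            c[1] ^= value
            c[2] += sign

    def insert(self, key, value):
        self._update(key, value, +1)

    def delete(self, key, value):
        self._update(key, value, -1)

Listing the stored pairs back out of such a table is the subject of the next subsection.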
B. Listing Set Entries

We now consider how to list the entries of the IBLT. The approach is straightforward. We do a first pass through the cells to find cells with a count of 1, and construct a list of those cells. We recover the key and corresponding value from this cell, and then delete the corresponding pair from the table. In the course of performing deletions, we check the count of the relevant cells. If a cell's count becomes 1, we add it to the list; if it drops from 1 to 0, we can remove it from the list. This approach can easily be implemented in O(m) time. If at the end of this process all the cells have a count of 0, then we have succeeded in recovering all the entries in the IBLT. Otherwise, the method only outputs a partial list of the key-value pairs in B. This "peeling process" is well known in the context of random graphs and hypergraphs as the process used to find the 2-core of a random hypergraph (e.g., see [2], [12]). This peeling process is similarly used for various codes, including Tornado codes and their derivatives (e.g., see [9]). Previous results therefore give tight thresholds: when the number of hash values k for each pair is at least 2, there are constants ck > 1 such that if m > (ck + ε)n for any constant ε > 0, the listing process succeeds with probability 1 − o(1); similarly, if m < (ck − ε)n for any constant ε > 0, the listing process fails with probability o(1). As shown in [2], [12], these values are given by

ck^(−1) = sup{ α : 0 < α < 1; ∀x ∈ (0, 1), 1 − e^(−kαx^(k−1)) < x }.

Numerical values for k ≥ 3 are given in Table I.

TABLE I
Thresholds for the 2-core, rounded to four decimal places.
k     3       4       5       6       7
ck    1.222   1.295   1.425   1.570   1.721

The choice of k affects the probability of the listing process failing. By choosing k sufficiently large and m above the 2-core threshold, standard results give that the bottleneck is the possibility of having two key-value pairs with the same collection of hash values, giving a failure probability of O(m^(−k+2)). We note that, with some additional effort, there are various ways to save space with the IBLT structure that are known in the literature, including using compressed arrays, quotienting, and irregular constructions (where different keys can utilize a different number of hash values, as in irregular LDPC codes). In practice the constant factors are small, and such approaches may interfere with the simplicity we aim for with the IBLT approach; we therefore do not consider them further here.
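The thresholds in Table I can be recovered numerically straight from the characterization above. The sketch below does a simple binary search over α with the condition checked on a grid of x values; the grid size and iteration count are arbitrary illustrative choices.

import math

def two_core_threshold(k, iters=60, grid=10000):
    """Estimate ck from ck^-1 = sup{a in (0,1): 1 - exp(-k*a*x^(k-1)) < x
    for all x in (0,1)}, the characterization quoted above."""
    xs = [i / grid for i in range(1, grid)]
    def ok(a):
        return all(1.0 - math.exp(-k * a * x ** (k - 1)) < x for x in xs)
    lo, hi = 0.0, 1.0
    for _ in range(iters):          # the condition is monotone in a, so bisect
        mid = (lo + hi) / 2
        if ok(mid):
            lo = mid
        else:
            hi = mid
    return 1.0 / lo

for k in (3, 4, 5, 6, 7):
    print(k, round(two_core_threshold(k), 3))
# Should land close to the Table I values 1.222, 1.295, 1.425, 1.570, 1.721.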
C. Set Reconciliation with IBLTs

We consider two users, Alice and Bob, referred to as A and B. Suppose Alice and Bob hold distinct but similar sets of keys, and they would like to reconcile the differences. This is the well-known set reconciliation problem. To achieve such a reconciliation with low overhead [4], Alice constructs an IBLT. The value associated with each key is a fingerprint (or checksum) obtained from the key. In what follows, we assume the value is taken by hashing the key, yielding a uniform value over all b-bit values for an appropriate b, and that the hash function is shared between Alice and Bob. Alice sends Bob her IBLT, and he deletes the key-value pairs corresponding to his own set from it. In this setting, when Bob deletes a key-value pair not held by Alice, it is possible for a cell count to become negative. The remaining key-value pairs left in the IBLT correspond exactly to items that exactly one of Alice or Bob has. Bob can use the IBLT structure to recover these pairs efficiently. For lack of space, we present the argument informally; the IBLT properties we use were formally derived in [5].

We first note that, in this setting, because deletions may reduce the count of cells, it is possible that a cell can have a count of 1 but not contain a single key-value pair. For example, if two pairs are inserted by Alice into a cell, and Bob deletes a pair that does not match Alice's in that cell, the count will be 1. Hence, in this setting, the proper way to check if a cell contains a valid pair is to test that the checksum for the keySum field matches the valueSum. In fact, because the value corresponds to a checksum, the count field is extraneous. (It can be a useful additional check, but strictly speaking it is unnecessary. Moreover, it may not be space-effective, since the counts will depend on the number of items inserted, not on the size of the difference between the sets.) Instead, the list of cells that allow us to recover a value in our listing process is determined by a match of the key and checksum value. Importantly, because Bob's deletion operation is symmetric to Alice's insertion operation, this holds true for cells containing a pair deleted by Bob as well as cells containing a pair inserted by Alice. (In this case, the corresponding count, if used, should be −1 for cells with a deleted pair.) Bob can therefore use the IBLT to recover these pairs efficiently. (Strictly speaking, Bob need only recover Alice's keys, but this simplification does not make a noticeable difference in our context.) If ∆ is an upper bound on the number of keys not shared between Alice and Bob, then from the argument sketched above, an IBLT with only O(∆) cells is necessary, with the constant factor dependent on the success probability desired.

III. ERROR-CORRECTING CODES WITH IBLTS

We now show how to use the above scheme to obtain a computationally efficient error-correcting code. Our error-correcting code can be viewed as a reduction using set reconciliation. Let B have a message for A corresponding to the sequence of values x1, x2, ..., xn. Then B sends A the message along with set reconciliation information – in our case, the IBLT – for the set of ordered pairs (x1, 1), (x2, 2), ..., (xn, n). For now we assume the set reconciliation information A obtains is without error; errors only occur in message values. When A obtains the sequence y1, y2, ..., yn, she constructs her own set of pairs (y1, 1), (y2, 2), ..., (yn, n), and reconciles the two sets to find erroneous positions. Notice that this approach requires random symbol errors as opposed to adversarial errors for our IBLT approach, as we require the checksums to accurately determine when key-value pairs are valid. However, there are standard approaches that overcome this problem that would make it suitable for adversarial errors with a suitably limited adversary (by applying a pseudorandom permutation on the symbols that is secret from the adversary; see, for example, [8]). Also, the positions of the errors can be anywhere in the message (as long as the positions are chosen independently of the method used to generate the set reconciliation information). If there are no errors in the data for the IBLT structure, then this reduction can be directly applied. However, assuming the IBLT is sent over the same channel as the data, then some cells in the IBLT will have erroneous keySum or valueSum fields. If errors are randomly distributed and the error rate is sufficiently small, this is not a concern; as shown in [5], IBLT listing is quite robust against errors in the IBLT structure.
Specifically, an error will cause the keySum and valueSum fields of an IBLT cell not to match, and as such it will not be used for decoding; this can be problematic if all the cells hashed to by an erroneous message symbol are themselves in error, as the value cannot then be recovered, but under appropriate parameter settings this will be rare in practice. As a summary, using the 1-layer scheme, where errors can occur in the IBLT, the main contribution to the failure probability is when an erroneous symbol suffers from all k of its hash locations in the IBLT being in error. If z is the fraction of IBLT cells in error, the expected number of such symbols is Ez^k, and the distribution of such failures is binomial (and approximately Poisson, when the expectation is small). Hence, when such errors occur, there is usually only one of them, and instead of using recursive error correction on the IBLT one could use a very small amount of error correction in the original message. For bursty errors or other error models, we may need to randomly intersperse the IBLT structure with the original message; note, however, that the randomness used in hashing the message values protects us from bursty errors over the message.

Basic pseudocode for encoding and decoding of Biff codes is given below (using C-style notation in places); the code is very simple, and is written entirely in terms of hash table operations.

• ENCODE
for i = 1 ... n do
  for j = 1 ... k do
    Tj[hj((xi, i))].keySum ^= (xi, i)
    Tj[hj((xi, i))].valueSum ^= Check((xi, i))

• DECODE
for i = 1 ... n do
  for j = 1 ... k do
    Tj[hj((yi, i))].keySum ^= (yi, i)
    Tj[hj((yi, i))].valueSum ^= Check((yi, i))
while ∃ a, j with (Tj[a].keySum ≠ 0) and (Tj[a].valueSum == Check(Tj[a].keySum)) do
  (z, i) = Tj[a].keySum
  if z ≠ yi then set yi to z when decoding terminates
  for j = 1 ... k do
    Tj[hj((z, i))].keySum ^= (z, i)
    Tj[hj((z, i))].valueSum ^= Check((z, i))

In our pseudocode, there is some leeway in how one implements the while statement. One natural implementation would keep a list (such as a linked list) of pairs a, j that satisfy the conditions. This list can be initialized by a walk through the arrays, and then updated as the while loop modifies the contents of the table. The total work will clearly be proportional to the size of the tables, which will be O(E) when the table size is chosen appropriately.
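The pseudocode translates almost line for line into a runnable sketch. The version below makes several simplifying assumptions that are not part of the paper: symbols and positions are packed into a single 64-bit integer, the hash functions and the Check fingerprint are stand-ins built from hashlib rather than the fully random functions assumed in the analysis, and the while loop is realized as repeated full scans of the table instead of the linked-list worklist just described.

import hashlib

K = 4              # number of hash functions / subtables
SUB = 8192         # cells per subtable; sized to the expected number of errors
LOW32 = 0xFFFFFFFF

def pack(x, i):
    # 32-bit symbol in the high word, position (stored as i+1 so a packed
    # pair is never the all-zero word) in the low word.
    return ((x & LOW32) << 32) | ((i + 1) & LOW32)

def check(packed):
    # Stand-in fingerprint of a packed pair (any checksum shared by both sides would do).
    return int.from_bytes(hashlib.blake2b(packed.to_bytes(8, "big"),
                                          digest_size=8).digest(), "big")

def cell(j, packed):
    h = hashlib.blake2b(packed.to_bytes(8, "big"), salt=bytes([j]),
                        digest_size=8).digest()
    return j * SUB + int.from_bytes(h, "big") % SUB

def xor_in(table, packed):
    for j in range(K):
        c = table[cell(j, packed)]
        c[0] ^= packed          # keySum
        c[1] ^= check(packed)   # valueSum

def encode(msg):
    """Sender: build the IBLT over the pairs (x_i, i)."""
    table = [[0, 0] for _ in range(K * SUB)]
    for i, x in enumerate(msg):
        xor_in(table, pack(x, i))
    return table

def decode(received, table):
    """Receiver: XOR own pairs into the sender's IBLT, then peel."""
    y = list(received)
    for i, x in enumerate(received):
        xor_in(table, pack(x, i))
    changed = True
    while changed:              # repeated scans stand in for the worklist
        changed = False
        for c in table:
            packed = c[0]
            if packed and c[1] == check(packed):   # cell holds one valid pair
                x, i = packed >> 32, (packed & LOW32) - 1
                if received[i] != x:               # pair came from the sender,
                    y[i] = x                       # so x is the corrected symbol
                xor_in(table, packed)              # peel the pair out
                changed = True
    return y

Corrupting a few hundred positions of a message of a few hundred thousand random 20-bit symbols and running decode(received, encode(msg)) recovers the original message, on the same pattern as the experiments reported below.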
We may also recursively apply a further IBLT, treating the first IBLT as data; or we can use a more expensive error-correcting code, such as a Reed-Solomon code, to protect the much smaller IBLT. This approach is similar to that used under the original scheme for Tornado codes, but appears unnecessary for many natural error models. For ease of exposition, we assume random locations for errors henceforth. The resulting error-correcting code is not space-optimal, but the overhead in terms of the space required for the error-correction information is small when the error rate is small. If there are e errors, then there will be 2e key-value pairs in the IBLT; the overhead with having 3, 4, or 5 choices, as seen from Table I, will then correspond to less than 3e cells. Each cell contains both a keySum and a valueSum, each of which will be (depending on the implementation) roughly the same size as the original key. Note here the key in our setting includes a position as well as the original message symbol, so this is additional overhead. Putting these together, we can expect that the error-correction overhead is roughly a factor of 6 over the optimal amount of overhead, which would be e times the size of a message symbol. While this is a non-trivial price, it is important to place it in context. For large keys, with a 1% error rate, even an optimal code for a message of length M bytes would require at least (1/0.99)M ≈ 1.01M bytes to be sent, and a standard Reed-Solomon code (correcting E errors with 2E additional values) would require at least 1.02M bytes. Biff codes would require about 1.06M bytes. The resulting advantages, again, are simplicity and speed. We expect that in many engineering contexts, the advantages of the IBLT approach will outweigh the small additional space cost. For very large messages, parallelization can speed things up further; key-value pairs can be inserted or deleted in parallel easily, with the bottleneck being atomic writes when XORing into a cell. The listing step also offers opportunities for parallelization, with threads being based on cells, and cells becoming active when their checksum value matches the key. We don't explore parallelization further here, but we note the simple, regular framework at the heart of Biff codes. We also note that, naturally, the approach of using IBLTs can be applied to design a simple erasure-correcting code. This corresponds to a set reconciliation problem where one set is slightly larger than the other; nothing is inserted at A's end for missing elements. Other error models may also be handled using the same technique.

IV. ISSUES WITH OTHER APPROACHES

Other natural approaches fail to have both fast encoding and decoding, and maintain O(E) overhead. While asymptotically faster algorithms exist, the computational overhead of Reed-Solomon codes is generally Θ(EL) in practice, making a straightforward implementation infeasible in this setting, once the number of errors is non-trivial. Breaking the data into blocks and encoding each would be ineffective with bursty errors. One could randomly permute the message data before breaking it into blocks, to randomize the position of errors and thereby spread them among blocks. In practice, however, taking a large memory block and then permuting it is extremely expensive as it destroys natural data locality. Once a memory or disk page is read it is almost "free" to read the remaining words in sequence; randomizing positions becomes hugely expensive. Finally, there are issues in finding a suitable field size to compute over, particularly for large messages. The problems we describe above are not original; similar discussions, for example, appear with the early work on Tornado codes [1]. Experiments comparing Reed-Solomon codes for erasures with Tornado codes from the original paper demonstrate that Reed-Solomon codes are orders of magnitude slower at this scale. An alternative approach is to use Tornado codes (or similar LDPC codes) directly, using checksums to ensure that suitably sized blocks are accurate. For example, we could divide the message of length L into L/B blocks of B symbols and add an error-detection checksum of c bits to each block. If we assume blocks with detected errors are dropped, then E errors could result in EB symbols being dropped, requiring the code to send at least an additional kEB bits for a suitably small constant k.
The total overhead would then be Lc/B + kEB; simple calculus yields that the minimum overhead 2√(ckLE) is achieved when B = √(cL/(kE)), i.e., with block sizes of O(√(L/E)) and a resulting space overhead of O(√(LE)). On the other hand, for Biff codes the redundancy overhead is O(E) with small constants hidden in the O notation, because only the values in the cells of the hash table, and not the original data, require checksums. This is a key benefit of the Biff code approach; only the hash table cells need to be protected with checksums.

V. EXPERIMENTAL RESULTS

In order to test our approach, we have implemented Biff codes in software. Our code uses pseudorandom hash values generated from the C drand function (randomly seeded using the clock), and therefore our timing information does not include the time to hash. However, we point out that hashing is unlikely to be a major bottleneck. For example, even if one wants 4 hash locations for each key into 4 subtables of size 1024, and an additional 24-bit hash for the checksum for each key, all the necessary values can be obtained with a single 64-bit hash operation.

Setup: Our hash table is split into k equal subtables. As mentioned, to determine locations in each subtable, we use pseudorandom hash values. For convenience we use random 20-bit keys as our original message symbols and 20 bits to describe the location in the sequence. While these keys are small, this allows us to do all computation with 64-bit operations. For a checksum, we use a simple invertible function: the pair (xi, i) gives a checksum of (2i + 1) · xi + i². One standard test case uses 1 million 20-bit message symbols and an IBLT of 30000 cells, with errors introduced in 10000 message symbols and 600 IBLT cells. Note that with 20-bit keys and 20 bits to record the location, an IBLT cell is actually 4 times the size of a message cell; however, we use a 2% error rate in the IBLT as we expect message symbols will generally be much longer. For example, in practice a key might be a 1KB packet, in which case 1 million message symbols would correspond to a gigabyte.

Timing: Our results show Biff codes to be extremely fast. There are two decoding stages, as can be seen in the previously given pseudocode. First, the received sequence values must be placed into the hash table. Second, the hash table must be processed and the erroneous values recovered. Generally, the bulk of the work will actually be in the first stage, when the number of errors is small. We had to utilize messages of 1 million symbols in order to obtain suitable timing data; otherwise processing was too fast. On our standard test case over 1000 trials, using 4 hash functions the first stage took 0.0561 seconds on average and the second took 0.0069 seconds on average. With 5 hash functions, the numbers were 0.0651 seconds and 0.0078 seconds.

Thresholds: Our threshold calculations are very accurate. For example, in a setting where no errors are introduced in the IBLT, with 4 hash functions and 10000 errors we would expect to require approximately 26000 cells in order to recover fully. (Recall that 10000 errors means 20000 keys are placed into the IBLT.) Our experiments yielded that with an IBLT of 26000 cells, complete recovery occurred in 803 out of 1000 trials; for 26500 cells, complete recovery occurred in 10000 out of 10000 trials.

Failure probabilities: We have purposely chosen parameters that would lead to failures, in order to check our analysis.
Under our standard test case with four hash functions, we estimate the probability of failure during any single trial as 10000 · (600/30000)^4 = 1.6 × 10^−3. Over an experiment with 10000 trials, we indeed found 16 trials with failures, and in each failure, there was just one unrecovered erroneous message symbol. Reducing to 500 errors in the IBLT reduces the failure probability to 10000 · (500/30000)^4 ≈ 7.7 × 10^−4; an experiment with 10000 trials led to seven failures, each with just one unrecovered erroneous message symbol. Finally, with 5 hash functions and 600 IBLT errors, we would estimate the failure probability as 10000 · (600/30000)^5 = 3.2 × 10^−5; a run of 10000 trials yielded no failures.
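These estimates are just the expected number of symbols whose k hash locations all land on corrupted IBLT cells, as derived in Section III; the two-line check below reproduces the first figure.

n_errors, cells, bad_cells, k = 10000, 30000, 600, 4
p_fail = n_errors * (bad_cells / cells) ** k   # E * z^k from Section III
print(p_fail)   # about 0.0016 per trial, i.e. roughly 16 failing trials in 10000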
VI. CONCLUSIONS

Our goal was to design an error-correcting code that would be extremely fast and simple for use in networking applications such as large-scale data transfer and reconciliation in cloud computing systems. While not optimal in terms of rate, the amount of redundancy used is a small constant factor more than optimal; we expect this will be suitable for many applications, given the other advantages. Although we have focused on error correction of large data, Biff codes may also be useful for smaller messages, in settings where computational efficiency is paramount and where small block sizes were introduced at least partially to reduce Reed-Solomon decoding overheads. We note that in the large data setting we can adapt the sampling technique described in [4] to estimate the number of errors E in O(log L) time. This allows the Biff code to be sized correctly to O(E) without requiring any a priori bound on E to be known in advance. For example, when two large virtual memories are to be reconciled it is difficult to have a reasonable bound on the number of errors or differences. In the communications setting this is akin to estimating the channel error rate and adapting the code. However, such error rate estimation in the communication setting is done infrequently to reduce overhead. In our large data setting, the cost of estimation is so cheap that it can be done on each large data reconciliation. Finally, we note that modern low-density parity-check codes are sufficiently complex that they are difficult to teach without going through a number of preliminaries. By contrast, Biff codes are sufficiently simple that we believe they could be taught in an introductory computer science class, and even introductory level programmers could implement them. Beyond their practical applications, Biff codes might prove worthwhile as a gateway to modern coding techniques.

REFERENCES
[1] J.W. Byers, M. Luby, and M. Mitzenmacher. A digital fountain approach to asynchronous reliable multicast. IEEE Journal on Selected Areas in Communications, 20:8, pp. 1528-1540, 2002.
[2] M. Dietzfelbinger, A. Goerdt, M. Mitzenmacher, A. Montanari, R. Pagh, and M. Rink. Tight thresholds for cuckoo hashing via XORSAT. In Proceedings of ICALP, pp. 213–225, 2010.
[3] D. Eppstein and M. T. Goodrich. Straggler identification in round-trip data streams via Newton's identities and invertible Bloom filters. IEEE Trans. on Knowledge and Data Engineering, 23(2):297-306, 2011.
[4] D. Eppstein, M. T. Goodrich, F. Uyeda, and G. Varghese. What's the Difference? Efficient Set Reconciliation without Prior Context. Proceedings of SIGCOMM 2011, pp. 218-229, 2011.
[5] M. Goodrich and M. Mitzenmacher. Invertible Bloom Lookup Tables. In Proceedings of the 49th Allerton Conference, pp. 792-799, 2011.
[6] D. Gupta, S. Lee, M. Vrable, S. Savage, A.C. Snoeren, G. Varghese, G.M. Voelker, and A. Vahdat. Difference engine: Harnessing memory redundancy in virtual machines. Communications of the ACM, 53:10, pp. 85-93, 2010.
[7] M. Karpovsky, L. Levitin, and A. Trachtenberg. Data verification and reconciliation with generalized error-correction codes. IEEE Transactions on Information Theory, 49(7):1788–1793, 2003.
[8] M. Luby and M. Mitzenmacher. Verification-Based Decoding for Packet-Based Low-Density Parity-Check Codes. IEEE Transactions on Information Theory, 51(1):120–127, 2005.
[9] M. Luby, M. Mitzenmacher, M. Shokrollahi, and D. Spielman. Efficient erasure correcting codes. IEEE Transactions on Information Theory, 47(2):569–584, 2001.
[10] Y. Minsky, A. Trachtenberg, and R. Zippel. Set Reconciliation with Nearly Optimal Communication Complexity. IEEE Transactions on Information Theory, 49(9):2213–2218, 2003.
[11] M. Mitzenmacher and S. Vadhan. Why simple hash functions work: exploiting the entropy in a data stream. In Proc. of the 19th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 746–755, 2008.
[12] M. Molloy. The pure literal rule threshold and cores in random hypergraphs. In Proc. of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 672–681, 2004.
[13] M. Patrascu and M. Thorup. The power of simple tabulation hashing. In Proc. of the 43rd Annual ACM Symposium on Theory of Computing, pp. 1-10, 2011.

Verifiable Computation with Massively Parallel Interactive Proofs
Justin Thaler∗ Mike Roberts† Michael Mitzenmacher‡ Hanspeter Pfister§
arXiv:1202.1350v3 [cs.DC] 22 Feb 2012

Abstract As the cloud computing paradigm has gained prominence, the need for verifiable computation has grown increasingly urgent. The concept of verifiable computation enables a weak client to outsource difficult computations to a powerful, but untrusted, server. Protocols for verifiable computation aim to provide the client with a guarantee that the server performed the requested computations correctly, without requiring the client to perform the requested computations herself. By design, these protocols impose a minimal computational burden on the client. However, existing protocols require the server to perform a very large amount of extra bookkeeping, on top of the requested computations, in order to enable a client to easily verify the results. Verifiable computation has thus remained a theoretical curiosity, and protocols for it have not been implemented in real cloud computing systems. In this paper, our goal is to leverage GPUs to reduce the server-side slowdown for verifiable computation. To this end, we identify abundant data parallelism in a state-of-the-art general-purpose protocol for verifiable computation, originally due to Goldwasser, Kalai, and Rothblum [10], and recently extended by Cormode, Mitzenmacher, and Thaler [8]. We implement this protocol on the GPU, and we obtain 40-120× server-side speedups relative to a state-of-the-art sequential implementation. For benchmark problems, our implementation thereby reduces the slowdown of the server to within factors of 100-500× relative to the original computations requested by the client. Furthermore, we reduce the already small runtime of the client by 100×. Similarly, we obtain 20-50× server-side and client-side speedups for related protocols targeted at specific streaming problems.
We believe our results demonstrate the immediate practicality of using GPUs for verifiable computation, and more generally, that protocols for verifiable computation have become sufficiently mature to deploy in real cloud computing systems.

1 Introduction

A potential problem in outsourcing work to commercial cloud computing services is trust. If we store a large dataset with a server, and ask the server to perform a computation on that dataset – for example, to compute the eigenvalues of a large graph, or to compute a linear program on a large matrix derived from a database – how can we know the computation was performed correctly? Obviously we don't want to compute the result ourselves, and we might not even be able to store all the data locally. Despite these constraints, we would like the server to not only provide us with the answer, but to convince us the answer is correct. Protocols for verifiable computation offer a possible solution to this problem. The ultimate goal of any such protocol is to enable the client to obtain results with a guarantee of correctness from the server much more efficiently than performing the computations herself. Another important goal of any such protocol is to enable the server to provide results with guarantees of correctness almost as efficiently as providing results without guarantees of correctness. Interactive proofs are a powerful family of protocols for establishing guarantees of correctness between a client and server. Although they have been studied in the theory community for decades, there had been no significant efforts to implement or deploy such proof systems until very recently.

∗ Harvard University, School of Engineering and Applied Sciences, jthaler@seas.harvard.edu. Supported by the Department of Defense (DoD) through the National Defense Science & Engineering Graduate Fellowship (NDSEG) Program, and in part by NSF grants CCF-0915922 and IIS-0964473.
† Harvard University, School of Engineering and Applied Sciences, mroberts@seas.harvard.edu. This work was partially supported by the Intel Science and Technology Center for Visual Computing, NVIDIA, and the National Science Foundation under Grant No. PHY-0835713.
‡ Harvard University, School of Engineering and Applied Sciences, michaelm@eecs.harvard.edu. This work was supported by NSF grants CCF-0915922 and IIS-0964473.
§ Harvard University, School of Engineering and Applied Sciences, pfister@seas.harvard.edu. This work was partially supported by the Intel Science and Technology Center for Visual Computing, NVIDIA, and the National Science Foundation under Grant No. PHY-0835713.

A recent line of work (e.g., [5, 6, 7, 8, 9, 10, 19]) has made substantial progress in advancing the practicality of these techniques. In particular, prior work of Cormode, Mitzenmacher, and Thaler [8] demonstrates that: (1) a powerful general-purpose methodology due to Goldwasser, Kalai and Rothblum [10] approaches practicality; and (2) special-purpose protocols for a large class of streaming problems are already practical. In this paper, we clearly articulate this line of work to researchers outside the theory community. We also take things one step further, leveraging the parallelism offered by GPUs to obtain significant speedups relative to state-of-the-art implementations of [8]. Our goal is to invest the parallelism of the GPU to obtain correctness guarantees with minimal slowdown, rather than to obtain raw speedups, as is the case with more traditional GPU applications.
We believe the insights of our GPU implementation could also apply to a multi-core CPU implementation. However, GPUs are increasingly widespread, cost-effective, and power-efficient, and they offer the potential for speedups in excess of those possible with commodity multi-core CPUs [17, 14]. We obtain server-side speedups ranging from 40-120× for the general-purpose protocol due to Goldwasser et al. [10], and 20-50× speedups for related protocols targeted at specific streaming problems. Our general-purpose implementation reduces the server-side cost of providing results with a guarantee of correctness to within factors of 100-500× relative to a sequential algorithm without guarantees of correctness. Similarly, our implementation of the special-purpose protocols reduces the server-side slowdown to within 10-100× relative to a sequential algorithm without guarantees of correctness. We believe the additional costs of obtaining correctness guarantees demonstrated in this paper would already be considered modest in many correctness-critical applications. For example, at one end of the application spectrum is Assured Cloud Computing for military contexts: a military user may need integrity guarantees when computing in the presence of cyber attacks, or may need such guarantees when coordinating critical computations across a mixture of secure military networks and insecure networks owned by civilians or other nations [1]. At the other end of the spectrum, a hospital that outsources the processing of patients’ electronic medical records to the cloud may require guarantees that the server is not dropping or corrupting any of the records. Even if every computation is not explicitly checked, the mere ability to check the computation could mitigate trust issues and stimulate users to adopt cloud computing solutions. Our source code is available at [20]. 2 2.1 Background What are interactive proofs? Interactive proofs (IPs) were introduced within the computer science theory community more than a quarter century ago, in seminal papers by Babai [11] and Goldwasser, Micali and Rackoff [3]. In any IP, there are two parties: a prover P, and a verifier V. P is typically considered to be computationally powerful, while V is considered to be computationally weak. In an IP, P solves a problem using her (possibly vast) computational resources, and tells V the answer. P and V then have a conversation, which is to say, they engage in a randomized protocol involving the exchange of one or more messages between the two parties. The term interactive proofs derives from the back-and-forth nature of this conversation. During this conversation, P’s goal is to convince V that her answer is correct. IPs naturally model the problem of a client (whom we model as V) outsourcing computation to an untrusted server (who we model as P). That is, IPs provide a way for a client to hire a cloud computing service to store and process data, and to efficiently check the integrity of the results returned by the server. This is useful whenever the server is not a trusted entity, either because the server is deliberately deceptive, or is simply buggy or inept. We therefore interchange the terms server and prover where appropriate. Similarly, we interchange the terms client and verifier where appropriate. Any IP must satisfy two properties. Roughly speaking, the first is that if P answers correctly and follows the prescribed protocol, then P will convince V to accept the provided answer. 
The second property is a security guarantee, which says that if P is lying, then V must catch P in the lie and reject the provided answer with high probability. A trivial way to satisfy this property is to have V compute the answer to the problem herself, and accept only if her answer matches P's. But this defeats the purpose of having a prover. The goal of an interactive proof system is to allow V to check P's answer using resources considerably smaller than those required to solve the problem from scratch. At first blush, this may appear difficult or even impossible to achieve. However, IPs have turned out to be surprisingly powerful. We direct the interested reader to [2, Chapter 8] for an excellent overview of this area.

Figure 1: High-level depiction of an execution of the GKR protocol.

2.2 How do interactive proofs work?

At the highest level, many interactive proof methods (including the ones in this paper) work as follows. Suppose the goal is to compute a function f of the input x. First, the verifier makes a single streaming pass over the input x, during which she extracts a short secret s. This secret is actually a single (randomly chosen) symbol of an error-corrected encoding Enc(x) of the input. To be clear, the secret does not depend on the problem being solved; in fact, for many interactive proofs, it is not necessary that the problem be determined until after the secret is extracted. Next, P and V engage in an extended conversation, during which V sends P various challenges, and P responds to the challenges (see Figure 1 for an illustration). The challenges are all related to each other, and the verifier checks that the prover's responses to all challenges are internally consistent. The challenges are chosen so that the prover's response to the first challenge must include a (claimed) value for the function of interest. Similarly, the prover's response to the last challenge must include a claim about what the value of the verifier's secret s should be. If all of P's responses are internally consistent, and the claimed value of s matches the true value of s, then the verifier is convinced that the prover followed the prescribed protocol and accepts. Otherwise, the verifier knows that the prover deviated at some point, and rejects. From this point of view, the purpose of all intermediate challenges is to guide the prover from a claim about f(x) to a claim about the secret s, while maintaining V's control over P. Intuitively, what gives the verifier surprising power to detect deviations is the error-correcting properties of Enc(x). Any good error-correcting code satisfies the property that if two strings x and x′ differ in even one location, then Enc(x) and Enc(x′) differ in almost every location. In the same way, interactive proofs ensure that if P flips even a single bit of a single message in the protocol, then P either has to make an inconsistent claim at some later point, or else has to lie almost everywhere in her final claim about the value of the secret s. Thus, if the prover deviates from the prescribed protocol even once, the verifier will detect this with high probability and reject.
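The error-corrected encoding is where the verifier's leverage comes from, and the effect is easy to see in code. The GKR protocol itself uses multilinear extensions over a large field; the sketch below substitutes the simplest classical example, a univariate low-degree extension (essentially a Reed-Solomon encoding), with an arbitrarily chosen prime field and toy data, just to show why one random symbol of Enc(x) catches any change to x with overwhelming probability.

import random

P = 2 ** 61 - 1      # an arbitrary large prime; the field F_P for this toy example

def low_degree_extension(data, r):
    """Evaluate, at the point r, the unique polynomial of degree < len(data)
    passing through (i, data[i]) for i = 0..n-1, over F_P. One such evaluation
    plays the role of the verifier's secret symbol of Enc(data)."""
    n, total = len(data), 0
    for i, v in enumerate(data):
        num, den = 1, 1
        for j in range(n):
            if j != i:
                num = num * (r - j) % P
                den = den * (i - j) % P
        total = (total + v * num * pow(den, -1, P)) % P
    return total

x = [3, 1, 4, 1, 5, 9, 2, 6]
x_bad = list(x)
x_bad[5] = 8                     # flip a single symbol

r = random.randrange(P)          # the verifier's random evaluation point
print(low_degree_extension(x, r) == low_degree_extension(x_bad, r))
# Prints False except with probability at most (n - 1) / P: two distinct
# polynomials of degree < n agree on at most n - 1 points.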
2.3 Previous work

Unfortunately, despite their power, IPs have had very little influence on real systems where integrity guarantees on outsourced computation would be useful. There appears to have been a folklore belief that these methods are impractical [19]. As previously mentioned, a recent line of work (e.g., [5, 6, 7, 8, 9, 10, 19]) has made substantial progress in advancing the practicality of these techniques. In particular, Goldwasser et al. [10] described a powerful general-purpose protocol (henceforth referred to as the GKR protocol) that achieves a polynomial-time prover and nearly linear-time verifier for a large class of computations. Very recently, Cormode, Mitzenmacher, and Thaler [8] showed how to significantly speed up the prover in the GKR protocol [10]. They also implemented this protocol, and demonstrated experimentally that their implementation approaches practicality. Even with their optimizations, the bottleneck in the implementation of [8] is the prover's runtime, with all other costs (such as verifier space and runtime) being extremely low. A related line of work has looked at protocols for specific streaming problems. Here, the goal is not just to save the verifier time (compared to doing the computation without a prover), but also to save the verifier space. This is motivated by cloud computing settings where the client does not even have space to store a local copy of the input, and thus uses the cloud to both store and process the data. The protocols developed in this line of work do not require the client to store the input, but rather allow the client to make a single streaming pass over the input (which can occur, for example, while the client is uploading data to the cloud). Throughout this paper, whenever we mention a streaming verifier, we mean the verifier makes a single pass over the input, and uses space significantly sublinear in the size of the data. The notion of a non-interactive streaming verifier was first put forth by Chakrabarti et al. [6] and studied further by Cormode et al. [7]. These works allow the prover to send only a single message to the verifier (e.g., as an attachment to an email, or posted on a website), with no communication in the reverse direction. Moreover, these works present protocols achieving provably optimal tradeoffs between the size of the proof and the space used by the verifier for a variety of problems, ranging from matrix-vector multiplication to graph problems like bipartite perfect matching. Later, Cormode, Thaler, and Yi extended the streaming model of [6] to allow an interactive prover and verifier, who actually have a conversation. They demonstrated that interaction allows for much more efficient protocols in terms of client space, communication, and server running time than are possible in the one-message model of [6, 7]. It was also observed in this work that the general-purpose GKR protocol works with just a streaming verifier. Finally, the aforementioned work of Cormode, Thaler, and Mitzenmacher [8] also showed how to use sophisticated Fast Fourier Transform (FFT) techniques to drastically speed up the prover's computation in the protocols of [6, 7]. Also relevant is work by Setty et al. [19], who implemented a protocol for verifiable computation due to Ishai et al. [13]. To set the stage for our results using parallelization, in Section 6 we compare our approach with [19] and [8] in detail. As a summary, the implementation of the GKR protocol described in both this paper and in [8] has several advantages over [19].
For example, the GKR implementation saves space and time for the verifier even when outsourcing a single computation, while [19] saves time for the verifier only when batching together several dozen computations at once and amortizing the verifier's cost over the batch. Moreover, the GKR protocol is unconditionally secure against computationally unbounded adversaries who deviate from the prescribed protocol, while the Ishai et al. protocol relies on cryptographic assumptions to obtain security guarantees. We present experimental results demonstrating that the prover in the sequential implementation of [8] based on the GKR protocol runs significantly faster than the prover in the implementation of [19] based on the Ishai et al. protocol [13]. Based on this comparison, we use the sequential implementation of [8] as our baseline. We then present results showing that our new GPU-based implementation runs 40-120× faster than the sequential implementation in [8].

3 Our interactive proof protocols

In this section, we give an overview of the methods implemented in this paper. Due to their highly technical nature, we seek only to convey a high-level description of the protocols relevant to this paper, and deliberately avoid rigorous definitions or theorems. We direct the interested reader to prior work for further details [6, 7, 8, 10].

Figure 2: A small arithmetic circuit.

3.1 GKR protocol

The prover and verifier first agree on a layered arithmetic circuit of fan-in two over a finite field F computing the function of interest. An arithmetic circuit is just like a boolean circuit, except that the inputs are elements of F rather than boolean values, and the gates perform addition and multiplication over the field F, rather than computing AND, OR, and NOT operations. See Figure 2 for an example circuit. In fact, any boolean circuit can be transformed into an arithmetic circuit computing an equivalent function over a suitable finite field, although this approach may not yield the most succinct arithmetic circuit for the function.

Suppose the output layer of the circuit is layer d, and the input layer is layer 0. The protocol of [10] proceeds in iterations, with one iteration for each layer of the circuit. The first iteration follows the general outline described in Section 2.2, with V guiding P from a claim about the output of the circuit to a claim about a secret s, via a sequence of challenges and responses. The challenges sent by V to P are simply random coins, which are interpreted as random points in the finite field F. The prescribed responses of P are polynomials, where each prescribed polynomial depends on the preceding challenge. Such a polynomial can be specified either by listing its coefficients, or by listing its evaluations at several points. However, unlike in Section 2.2, the secret s is not a symbol in an error-corrected encoding of the input, but rather a symbol in an error-corrected encoding of the gate values at layer d − 1.

Unfortunately, V cannot compute this secret s on her own. Doing so would require evaluating all previous layers of the circuit, and the whole point of outsourcing is to avoid this. So V has P tell her what s should be. But now V has to make sure that P is not lying about s. This is what the second iteration accomplishes, with V guiding P from a claim about s to a claim about a new secret s′, which is a symbol in an encoding of the gate values at layer d − 2. This continues until we get to the input layer.
At this point, the secret is actually a symbol in an error-corrected encoding of the input, and V can compute this secret in advance from the input easily on her own. Figure 1 illustrates the entirety of the GKR protocol at a very high level.

We take this opportunity to point out an important property of the protocol of [10], which was critical in allowing our GPU-based implementation to scale to large inputs. Namely, any iteration of the protocol involves only two layers of the circuit at a time. In the ith iteration, the verifier guides the prover from a claim about gate values at layer d − i to a claim about gate values at layer d − i − 1. Gates at higher or lower layers do not affect the prescribed responses within iteration i.

3.2 Special-purpose protocols

As mentioned in Section 2.3, efficient problem-specific non-interactive verifiable protocols have been developed for a variety of problems of central importance in streaming and database processing, ranging from linear programming to graph problems like shortest s−t path. The central primitive in many of these protocols is itself a protocol originally due to Chakrabarti et al. [6], for a problem known as the second frequency moment, or F2. In this problem, the input is a sequence of m items from a universe U of size n, and the goal is to compute F2(x) = Σ_{i∈U} f_i^2, where f_i is the number of times item i appears in the sequence. As explained in [8], speeding up this primitive immediately speeds up protocols for all of the problems that use the F2 protocol as a subroutine.

The aforementioned F2 protocol of Chakrabarti et al. [6] achieves provably optimal tradeoffs between the length of the proof and the space used by the verifier. Specifically, for any positive integer h, the protocol can achieve a proof length of just h machine words, as long as the verifier uses v = O(n/h) words of space. For example, we may set both h and v to be roughly √n, which is substantially sublinear in the input size n. Very roughly speaking, this protocol follows the same outline as in Section 2.2, except that in order to remove the interaction from the protocol, the verifier needs to compute a more complicated secret. Specifically, the verifier's secret s consists of v symbols in an error-corrected encoding of the input, rather than a single symbol. To compute the prescribed proof, the prover has to evaluate 2n symbols in the error-corrected encoding of the input. The key insight of [8] is that these 2n symbols need not be computed independently (which would require substantially superlinear time), but instead can be computed in O(n log n) time using FFT techniques. More specifically, the protocol of [8] partitions the universe into a v × h grid, and it performs a sophisticated FFT variant known as the Prime Factor Algorithm [4] on each row of the grid. The final step of P's computation is to compute the sum of the squared entries for each column of the (transformed) grid; these values form the actual content of P's prescribed message.

4 Parallelizing our protocols

In this section, we explain the insights necessary to parallelize the computation of both the prover and the verifier for the protocols we implemented.

4.1 GKR protocol

4.1.1 Parallelizing P's computation

In every one of P's responses in the GKR protocol, the prescribed message from P is defined via a large sum over roughly S^3 terms, where S is the size of the circuit, and so computing this sum naively would take Ω(S^3) time. Roughly speaking, Cormode et al.
in [8] observe that each gate of the circuit contributes to only a single term of this sum, and thus this sum can be computed via a single pass over the relevant gates. The contribution of each gate to the sum can be computed in constant time, and each gate contributes to logarithmically many messages over the course of the protocol. Using these observations carefully reduces P's runtime from Ω(S^3) to O(S log S), where again S is the circuit size.

The same observation reveals that P's computation can be parallelized: each gate contributes independently to the sum in P's prescribed response. Therefore, P can compute the contribution of many gates in parallel, save the results in a temporary array, and use a parallel reduction to sum the results. We stress that all arithmetic is done within the finite field F, rather than over the integers. Figure 3 illustrates this process.

4.1.2 Parallelizing V's computation

The bulk of V's computation (by far) consists of computing her secret, which consists of a single symbol s in a particular error-corrected encoding of the input x. As observed in prior work [9], each symbol of the input contributes independently to s. Thus, V can compute the contribution of many input symbols in parallel, and sum the results via a parallel reduction, just as in the parallel implementation of P's computation. This speedup is perhaps of secondary importance, as V runs extremely quickly even in the sequential implementation of [8]. However, parallelizing V's computation is still an appealing goal, especially as GPUs are becoming more common on personal computers and mobile devices.

Figure 3: Illustration of parallel computation of the server's message to the client in the GKR protocol.

4.2 Special-purpose protocols

4.2.1 Parallelizing P's computation

Recall that the prover in the special-purpose protocols can compute the prescribed message by interpreting the input as a v × h grid, where h is roughly the proof length and v is the amount of space used by the verifier. The prover then performs a sophisticated FFT on each row of the grid independently. This can be parallelized by transforming multiple rows of the grid in parallel. Indeed, Cormode et al. [8] achieved roughly a 7× speedup for this problem by using all eight cores of a multicore processor. Here, we obtain a much larger 20-50× speedup using the GPU. (Note that [8] did not develop a parallel implementation of the GKR protocol, only of the special-purpose protocols.)

4.2.2 Parallelizing V's computation

Recall that in the special-purpose protocols, the verifier's secret s consists of v symbols in an error-corrected encoding of the input, rather than a single symbol. Just as in Section 3.1, this computation can be parallelized by noting that each input symbol contributes independently to each entry of the encoded input. This requires V to store a large buffer of input symbols to work on in parallel. In some streaming contexts, V may not have the memory to accomplish this. Still, there are many settings in which this is feasible. For example, V may have several hundred megabytes of memory available, and seek to outsource processing of a stream that is many gigabytes or terabytes in length. Thus, parallel computation combined with buffering can help a streaming verifier keep up with a live stream of data: V splits her memory into two buffers, and at all times one buffer will be collecting arriving items (a minimal host-side sketch of this two-buffer scheme appears below).
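The host-side sketch below is a minimal illustration of the two-buffer scheme, not the implementation evaluated in Section 6; the buffer capacity, the fake stream, and process_buffer are placeholders invented for the example. One buffer collects arriving items while a worker thread folds the other, full buffer into the verifier's running state.

// Illustrative double buffering for a streaming verifier (placeholder work:
// in the real protocols the full buffer would update V's symbols of Enc(x)).
#include <cstdint>
#include <cstdio>
#include <functional>
#include <numeric>
#include <thread>
#include <vector>

typedef uint64_t u64;
static const size_t kBufferCapacity = 1 << 20;   // illustrative choice

static void process_buffer(const std::vector<u64>& items, u64& state) {
    // Placeholder for the verifier's real per-buffer work.
    state += std::accumulate(items.begin(), items.end(), u64(0));
}

int main() {
    std::vector<u64> buffers[2];
    buffers[0].reserve(kBufferCapacity);
    buffers[1].reserve(kBufferCapacity);
    u64 state = 0;
    int active = 0;        // buffer currently collecting arriving items
    std::thread worker;    // processes the other (full) buffer in parallel

    for (u64 item = 0; item < u64(8) * kBufferCapacity; ++item) {  // fake stream
        buffers[active].push_back(item);
        if (buffers[active].size() == kBufferCapacity) {
            if (worker.joinable()) worker.join();   // previous batch must finish
            const int full = active;
            active = 1 - active;
            buffers[active].clear();
            worker = std::thread(process_buffer, std::cref(buffers[full]),
                                 std::ref(state));
        }
    }
    if (worker.joinable()) worker.join();
    process_buffer(buffers[active], state);         // leftover items
    std::printf("state = %llu\n", (unsigned long long)state);
    return 0;
}

In the real protocols, process_buffer would update V's symbols of the error-corrected encoding Enc(x), ideally on the GPU.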
As long as V can process the full buffer (aided by parallelism) before her other buffer overflows, V will be able to keep up with the live data stream. Notice that this discussion applies to the client in the GKR protocol as well, as the GKR protocol also enables a streaming verifier.

5 Architectural considerations

5.1 GKR protocol

The primary issue with any GPU-based implementation of the prover in the GKR protocol is that the computation is extremely memory-intensive: for a circuit of size S (which corresponds to S arithmetic operations in an unverifiable algorithm), the prover in the GKR protocol has to store all S gates explicitly, because she needs the values of these gates to compute her prescribed messages. We investigate three alternative strategies for managing the memory overhead of the GKR protocol, which we refer to as the no-copying approach, the copy-once-per-layer approach, and the copy-every-message approach.

5.1.1 The no-copying approach

The simplest approach is to store the entire circuit explicitly on the GPU. We call this the no-copying approach. However, this means that the entire circuit must fit in device memory, a requirement which is violated even for relatively small circuits consisting of roughly tens of millions of gates.

5.1.2 The copy-once-per-layer approach

Another approach is to keep the circuit in host memory, and only copy information to the device when it is needed. This is possible because, as mentioned in Section 3.1, at any point in the protocol the prover only operates on two layers of the circuit at a time, so only two layers of the circuit need to reside in device memory. We refer to this as the copy-once-per-layer approach. This is the approach we used in the experiments in Section 6. Care must be taken with this approach to prevent host-to-device copying from becoming a bottleneck. Fortunately, in the protocol for each layer there are several dozen messages to be computed before the prover moves on to the next layer, and this ensures that the copying from host to device makes up a very small portion of the runtime.

This method is sufficient to scale to very large circuits for all of the problems considered in the experimental section of [8], since no single layer of the circuits is significantly larger than the problem input itself. However, this method remains problematic for circuits that have (one or several) layers which are particularly wide, as an explicit representation of all the gates within a single wide layer may still be too large to fit in device memory.

5.1.3 The copy-every-message approach

In the event that there are individual layers which are too large to reside in device memory, a third approach is to copy part of a layer at a time from the host to the device, and compute the contribution of each gate in the part to the prover's message before swapping the part back to host memory and bringing in the next part. We call this the copy-every-message approach. This approach is viable, but it raises a significant issue, alluded to in its name. Namely, this approach requires host-to-device copying for every message, rather than just once per layer of the circuit. That is, in any iteration i of the protocol, P cannot compute her jth message until after the (j − 1)th challenge from V is received. Thus, for each message j, the entirety of the ith layer must be loaded piece-by-piece into device memory, swapping each piece back to host memory after the piece has been processed.
In contrast, the copy-once-per-layer approach allows P to copy an entire layer i to the device and leave the entire layer in device memory for the entirety of iteration i (which will consist of several dozen messages). Thus, the slowdown inherent in the copy-every-message approach is not just that P has to break each layer into parts, but that P has to do host-to-device and device-to-host copying for each message, instead of copying an entire layer and computing several messages from that layer. We leave a full implementation of the copy-every-message approach for future work, but preliminary experiments suggest that this approach is viable in practice, resulting in less than a 3× slowdown compared to the copy-once-per-layer approach. Notice that even after paying this slowdown, our GPU-based implementation would still achieve a 10-40× speedup compared to the sequential implementation of [8].

5.1.4 Memory access

Recall that for each message in the ith iteration of the GKR protocol, we assign a thread to each gate g at the ith layer of the circuit, as each gate contributes independently to the prescribed message of the prover. The contribution of gate g depends only on the index of g, the indices of the two gates feeding into g, and the values of the two gates feeding into g. Given this data, the contribution of gate g to the prescribed message can be computed using roughly 10-20 additions and multiplications within the finite field F (the precise number of arithmetic operations required varies over the course of the iteration). As described in Section 6, we choose to work over a field which allows for extremely efficient arithmetic; for example, multiplying two field elements requires three machine multiplications of 64-bit data types, and a handful of additions and bit shifts.

In all of the circuits we consider, the indices of g's in-neighbors can be determined with very little arithmetic and no global memory accesses. For example, if the wiring pattern of the circuit forms a binary tree, then the first in-neighbor of g has index 2 · index(g), and the second in-neighbor of g has index 2 · index(g) + 1. For each message, the thread assigned to g can compute this information from scratch without incurring any memory accesses. In contrast, obtaining the values of g's in-neighbors requires fetching 8 bytes per in-neighbor from global memory. These memory accesses are necessary because it is infeasible to compute the value of each gate's in-neighbors from scratch for each message, and so we store these values explicitly.

As these global memory accesses can be a bottleneck in the protocol, we strive to arrange the data in memory to ensure that adjacent threads access adjacent memory locations. To this end, for each layer i we maintain two separate arrays, with the jth entry of the first (respectively, second) array storing the first (respectively, second) in-neighbor of the jth gate at layer i. During iteration i, the thread assigned to the jth gate accesses location j of the first and second arrays to retrieve the values of its first and second in-neighbors respectively. This ensures that adjacent threads access adjacent memory locations. For all layers, the corresponding arrays are populated with in-neighbor values when we evaluate the circuit at the start of the protocol (we store each layer i's arrays on the host until the ith iteration of the protocol, at which point we transfer the arrays from host memory to device memory as described in Section 5.1.2).
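The device-side sketch below ties these pieces together. It is a minimal illustration rather than the kernel evaluated in Section 6: the per-gate arithmetic stands in for the 10-20 field operations of the real protocol, and challenge_weights is a hypothetical array summarizing the challenge-dependent factors. It shows multiplication modulo the Mersenne prime p = 2^61 − 1 using the low and high 64-bit halves of the product plus a few shifts, and one thread per gate reading its in-neighbors' values from the two per-layer arrays so that adjacent threads touch adjacent memory locations.

// Illustrative CUDA sketch: one thread per gate at the current layer computes
// a toy "contribution" in F_p, p = 2^61 - 1, reading in-neighbor values from
// coalesced per-layer arrays. Not the kernel evaluated in Section 6.
typedef unsigned long long u64;

__device__ const u64 kPrime = 2305843009213693951ULL;   // 2^61 - 1

// Multiply in F_p using the full 128-bit product (low word plus __umul64hi
// for the high word) and the identity 2^61 = 1 (mod p), so 2^64 = 8 (mod p).
__device__ u64 field_mul(u64 a, u64 b) {
    u64 lo = a * b;                 // low 64 bits of the product
    u64 hi = __umul64hi(a, b);      // high 64 bits of the product
    u64 r = (hi << 3) + (lo >> 61) + (lo & kPrime);
    if (r >= kPrime) r -= kPrime;
    if (r >= kPrime) r -= kPrime;
    return r;
}

__device__ u64 field_add(u64 a, u64 b) {
    u64 r = a + b;
    return (r >= kPrime) ? r - kPrime : r;
}

// in1_vals[j] and in2_vals[j] hold the values of gate j's first and second
// in-neighbors, so thread j's loads sit next to thread j+1's (coalesced).
__global__ void gate_contributions(const u64* in1_vals, const u64* in2_vals,
                                   const u64* challenge_weights,  // hypothetical
                                   u64* contributions, int num_gates) {
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (j >= num_gates) return;
    u64 g = field_mul(in1_vals[j], in2_vals[j]);        // toy per-gate term
    contributions[j] = field_mul(g, challenge_weights[j]);
}

The contributions array can then be summed with a parallel reduction (for example, thrust::reduce with a binary functor that performs the field addition), matching the temporary-array-plus-reduction pattern of Section 4.1.1.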
Notice that this methodology sometimes requires data duplication: if many gates at layer i share the same in-neighbor g1, then g1's value will appear many times in layer i's arrays. We feel that slightly increased space usage is a reasonable price to pay to ensure memory coalescing.

5.2 Special-purpose protocols

5.2.1 Memory access

Recall that the prover in our special-purpose protocols views the input as a v × h grid, and performs a sophisticated FFT on each row of the grid independently. Although the independence of calculations in each row offers abundant opportunities for task-parallelism, extracting the data-parallelism required for high performance on GPUs requires care due to the irregular memory access pattern of the specific FFT algorithm used. We observe that although each FFT has a highly irregular memory access pattern, this memory access pattern is data-independent. Thus, we can convert abundant task-parallelism into abundant data-parallelism by transposing the data grid into column-major rather than row-major order. This simple transformation ensures perfect memory coalescing despite the irregular memory access pattern of each FFT, and improves the performance of our special-purpose prover by more than 10×.

6 Evaluation

6.1 Implementation details

Except where noted, we performed our experiments on an Intel Xeon 3 GHz workstation with 16 GB of host memory. Our workstation also has an NVIDIA GeForce GTX 480 GPU with 1.5 GB of device memory. We implemented all our GPU code in CUDA and Thrust [12] with all compiler optimizations turned on.

Similar to the sequential implementations of [8], both our implementation of the GKR protocol and the special-purpose F2 protocol due to [6, 8] work over the finite field F_p with p = 2^61 − 1. We chose this field for a number of reasons. Firstly, the integers embed naturally within it. Secondly, the field is large enough that the probability that the verifier fails to detect a cheating prover is tiny (roughly proportional to the reciprocal of the field size). Thirdly, arithmetic within the field can be performed efficiently with simple shifts and bit-wise operations [21]. We remark that no floating-point operations were necessary in any of our implementations, because all arithmetic is done over finite fields.

Finally, we stress that in all reported costs below, we do count the time taken to copy data between the host and the device, and all reported speedups relative to sequential processing take this cost into account. We do not count the time to allocate memory for scratch space, because this can be done in advance.

6.2 Experimental methodology for the GKR protocol

We ran our GPU-based implementation of the GKR protocol on four separate circuits, which together capture several different aspects of computation, from data aggregation, to search, to linear algebra. The first three circuits were described and evaluated in [8] using the sequential implementation of the GKR protocol. The fourth problem was described and evaluated in [19] based on the Ishai et al. protocol [13]. Below, [n] denotes the integers {0, 1, . . . , n − 1}.

• F2: Given a stream of m elements from [n], compute Σ_{i∈[n]} a_i^2, where a_i is the number of occurrences of i in the stream.

• F0: Given a stream of m elements from [n], compute the number of distinct elements (i.e., the number of i with a_i ≠ 0, where again a_i is the number of occurrences of i in the stream).

• PM: Given a stream representing a text T = (t_0, . . . , t_{n−1}) ∈ [n]^n and a pattern P =
(p_0, . . . , p_{q−1}) ∈ [n]^q, the pattern P is said to occur at location i in T if, for every position j in P, p_j = t_{i+j}. The pattern-matching problem is to determine the number of locations at which P occurs in T.

• MATMULT: Given three matrices A, B, C ∈ [n]^{m^2}, determine whether AB = C. (In practice, we do not expect C to truly be part of the input data stream. Rather, prior work [9, 8] has shown that the GKR protocol works even if A and B are specified from a stream, while C is given later by P.)

The first two problems, F2 and F0, are classical data aggregation queries which have been studied for more than a decade in the data streaming community. F0 is also a highly useful subroutine in more complicated computations, as it effectively allows for equality testing of vectors or matrices (by subtracting two vectors and seeing if the result is equal to the zero vector). We make use of this subroutine when designing our matrix-multiplication circuit below. The third problem, PM, is a classic search problem, and is motivated, for example, by clients wishing to store (and search) their email on the cloud. Cormode et al. [8] considered the PATTERN MATCHING WITH WILDCARDS problem, where the pattern and text can contain wildcard symbols that match any character, but for simplicity we did not implement this additional functionality.

We chose the fourth problem, matrix multiplication, for several reasons. First was its practical importance. Second was a desire to experiment on problems requiring super-linear time to solve (in contrast to F2 and F0): running on a super-linear problem allowed us to demonstrate that our implementation, as well as that of [8], saves the verifier time in addition to space, and it also forced us to grapple with the memory-intensive nature of the GKR protocol (see Section 4). Third was its status as a benchmark enabling us to compare the implementations of [8] and [19]. Although there are also efficient special-purpose protocols to verify matrix multiplication (see Freivalds' algorithm [16, Section 7.1], as well as Chakrabarti et al. [6, Theorem 5.2]), it is still interesting to see how a general-purpose implementation performs on this problem. Finally, matrix multiplication is an attractive primitive to have at one's disposal when verifying more complicated computations using the GKR protocol.

6.2.1 Description of circuits

We briefly review the circuits for our benchmark problems. The circuit for F2 is by far the simplest (see Figure 4 for an illustration). This circuit simply computes the square of each input wire using a layer of multiplication gates, and then sums the results using a single sum-gate of very large fan-in. We remark that the GKR protocol typically assumes that all gates have fan-in two, but [8] explains how the protocol can be modified to handle a single sum-gate of very large fan-in at the output.

Figure 4: The circuit for F2.

The circuit for F0 exploits Fermat's Little Theorem, which says that for prime p, a^{p−1} ≡ 1 mod p if and only if a ≠ 0. Thus, this circuit computes the (p − 1)th power of each input wire (taking all non-zero inputs to 1, and leaving all 0-inputs at 0), and sums the results via a single sum-gate of high fan-in.

The circuit for PM is similar to that for F0: essentially, for each possible location of the pattern, it computes a value that is 0 if the pattern occurs at that location, and non-zero otherwise.
It then computes the (p − 1)th power of each such value and sums the results (i.e., it uses the F0 circuit as a subroutine) to determine the number of locations where the pattern does (not) appear in the input.

Our circuit for MATMULT uses similar ideas. We could run a separate instance of the GKR protocol to verify each of the n^2 entries in the output matrix AB and compare them to C, but this would be very expensive for both the client and the server. Instead, we specify a suitable circuit with a single output gate, allowing us to run a single instance of the protocol to verify the output. Our circuit computes the n^2 entries in AB via naive matrix multiplication, and subtracts the corresponding entry of C from each. It then computes the number of non-zero values using the F0 circuit as a subroutine. The final output of the circuit is zero if and only if C = AB.

6.2.2 Scaling to large inputs

As described in Section 5, the memory-intensive nature of the GKR protocol made it challenging to scale to large inputs, especially given the limited amount of device memory. Indeed, with the no-copying approach (where we simply keep the entire circuit in device memory), we were only able to scale to inputs of size roughly 150,000 for the F0 problem, and to 32 × 32 matrices for the MATMULT problem, on a machine with 1 GB of device memory. Using the copy-once-per-layer approach, we were able to scale to inputs with over 2 million entries for the F0 problem, and 128 × 128 matrices for the MATMULT problem. By running on an NVIDIA Tesla C2070 GPU with 6 GB of device memory, we were able to push to 256 × 256 matrices for the MATMULT problem; the data from this experiment is reported in Table 2.

6.2.3 Evaluation of previous implementations

To our knowledge, the only existing implementation for verifiable computation that can be directly compared to that of Cormode et al. [8] is that of Setty et al. [19]. We therefore performed a brief comparison of the sequential implementation of [8] with that of [19]. This provides important context in which to evaluate our results: our 40-120× speedups compared to the sequential implementation of [8] would be less interesting if the sequential implementation of [8] were slower than alternative methods. Prior to this paper, these implementations had never been run on the same problems, so we picked a benchmark problem (matrix multiplication) evaluated in [19] and compared to the results reported there.

We stress that our goal is not to provide a rigorous quantitative comparison of the two implementations. Indeed, we only compare the implementation of [8] to the numbers reported in [19]; we never ran the implementations on the same system, leaving this more rigorous comparison for future work. Moreover, both implementations may be amenable to further optimization. Despite these caveats, the comparison between the two implementations seems clear. The results are summarized in Table 1.

Implementation  | Matrix Size | P Time     | V Time       | Total Communication
[8]             | 512 × 512   | 3.11 hours | 0.12 seconds | 138.1 KB
[19], Pepper    | 400 × 400   | 8.1 years* | 14 hours*    | Not Reported
[19], Habanero  | 400 × 400   | 17 days†   | 2.1 minutes† | 17.1 GB†

Table 1: Comparison of the costs for the sequential implementations of [8] and [19]. Entries marked with * indicate that the costs given are total costs over 45,000 queries. Entries marked with † indicate that the costs are total costs over 111 queries.
Problem  | Input Size (number of entries) | Circuit Size (number of gates) | GPU P Time (s) | Sequential P Time (s) | Circuit Evaluation Time (s) | GPU V Time (s) | Sequential V Time (s) | Unverified Algorithm Time (s)
F2       | 8.4 million | 25.2 million  | 3.7   | 424.6   | 0.1 | 0.035 | 3.600 | 0.028
F0       | 2.1 million | 255.8 million | 128.5 | 8,268.0 | 4.2 | 0.009 | 0.826 | 0.005
PM       | 524,288     | 76.0 million  | 38.9  | 1,893.1 | 1.2 | 0.004 | 0.124 | 0.006
MATMULT  | 65,536      | 42.3 million  | 39.6  | 1,658.0 | 0.9 | 0.003 | 0.045 | 0.080

Table 2: Prover and verifier runtimes in the GKR protocol for all four problems considered.

In Table 1, Pepper refers to an implementation in [19] which is actually proven secure against polynomial-time adversaries under cryptographic assumptions, while Habanero is an implementation in [19] which runs faster by allowing for a very high soundness probability of 7/9 that a deviating prover can fool the verifier, and by utilizing what the authors themselves refer to as heuristics (not proven secure in [19], though the authors indicate this may be due to space constraints). In contrast, the soundness probability in the implementation of [8] is roughly proportional to the reciprocal of the field size p = 2^61 − 1, and hence astronomically small, and the protocol is unconditionally secure even against computationally unbounded adversaries.

The implementation of [19] has very high set-up costs for both P and V, and therefore the costs of a single query are very high. But this set-up cost can be amortized over many queries, and the most detailed experimental results provided in [19] give the costs for batches of hundreds or thousands of queries. The costs reported in the second and third rows of Table 1 are therefore the total costs of the implementation when run on a large number of queries.

When we run the implementation of [8] on a single 512 × 512 matrix, the server takes 3.11 hours, the client takes 0.12 seconds, and the total length of all messages transmitted between the two parties is 138.1 KB. In contrast, the server in the heuristic implementation of [19], Habanero, requires 17 days amortized over 111 queries when run on considerably smaller matrices (400 × 400). This translates to roughly 3.7 hours per query, but the cost of a single query without batching is likely about two orders of magnitude higher. The client in Habanero requires 2.1 minutes to process the same 111 queries, or a little over 1 second per query, while the total communication is 17.1 GB, or about 157 MB per query. Again, the per-query costs will be roughly two orders of magnitude higher without the batching.

We conclude that, even under large batching, the per-query time for the server of the sequential implementation of [8] is competitive with the heuristic implementation of [19], while the per-query time for the verifier is about two orders of magnitude smaller, and the per-query communication cost is between two and three orders of magnitude smaller. Without the batching, the per-query time of [8] is roughly 100× smaller for the server and 1,000× smaller for the client, and the communication cost is about 100,000× smaller. Likewise, the implementation of [8] is over five orders of magnitude faster for the client than the non-heuristic implementation Pepper, and four orders of magnitude faster for the server.

6.2.4 Evaluation of our GPU-based implementation

Figure 5 demonstrates the performance of our GPU-based implementation of the GKR protocol. Table 2 also gives a succinct summary of our results, showing the costs for the largest instance of each problem we ran on.
We consider the main takeaways of our experiments to be the following.

[Figure 5: five log-log panels plotting computation time (seconds) against input size, with sequential and parallel curves for the F2, F0, PM, and MATMULT provers and for the verifier.]

Figure 5: Comparison of prover and verifier runtimes between the sequential implementation of the GKR protocol due to [8] and our GPU-based implementation. Note that all plots are on a log-log scale. Plots (a), (b), (c), and (d) depict the prover runtimes for F0, F2, PM, and MATMULT respectively. Plot (e) depicts the verifier runtimes for the GKR protocol. We include only one plot for the verifier, since its dominant cost in the GKR protocol is problem-independent.

Server-side speedup obtained by GPU computing. Compared to the sequential implementation of [8], our GPU-based server implementation ran close to 115× faster for the F2 circuit, about 60× faster for the F0 circuit, 45× faster for PM, and about 40× faster for MATMULT (see Figure 5).

Notice that for the first three problems, we need to look at large inputs to see the asymptotic behavior of the curve corresponding to the parallel prover's runtime. Due to the log-log scale in Figure 5, the curves for both the sequential and parallel implementations are asymptotically linear, and the 45-120× speedup obtained by our GPU-based implementation is manifested as an additive gap between the two curves. The explanation for this is simple: there is considerable overhead relative to the total computation time in parallelizing the computation at small inputs, but this overhead is more effectively amortized as the input size grows. In contrast, notice that for MATMULT the slope of the curve for the parallel prover remains significantly smaller than that of the sequential prover throughout the entire plot. This is because our GPU-based implementation ran out of device memory well before the overhead in parallelizing the prover's computation became negligible. We therefore believe the speedup for MATMULT would be somewhat higher than the 40× speedup observed if we were able to run on larger inputs.

Could a parallel verifiable program be faster than a sequential unverifiable one? The very first step of the prover's computation in the GKR protocol is to evaluate the circuit. In theory this can be done efficiently in parallel, by proceeding sequentially layer by layer and evaluating all gates at a given layer in parallel. However, in practice we observed that the time it takes to copy the circuit to the device exceeds the time it takes to evaluate the circuit sequentially. This observation suggests that on the current generation of GPUs, no GPU-based implementation of the prover could run faster than a sequential unverifiable algorithm.
This is because sequentially evaluating the circuit takes at least as long as the unverifiable sequential algorithm, and copying the data to the GPU takes longer than sequentially evaluating the circuit. This observation applies not just to the GKR protocol, but to any protocol that uses a circuit representation of the computation (which is a standard technique in the theory literature [13, 18]). Nonetheless, we can certainly hope to obtain a GPU-based implementation that is competitive with sequential unverifiable algorithms.

Server-side slowdown relative to unverifiable sequential algorithms. For F2, the total slowdown for the prover was roughly 130× (3.7 seconds compared to 0.028 seconds for the unverifiable algorithm, which simply iterates over all entries of the frequency vector and computes the sum of the squares of each entry). We stress that it is likely that we overestimate the slowdown resulting from our protocol, because we did not count the time it takes for the unverifiable implementation to compute the number of occurrences of each item i, that is, to aggregate the stream into its frequency-vector representation (a_1, . . . , a_n). Instead, we simply generated the vector of frequencies at random (we did not count the generation time), and calculated the time to compute the sum of their squares. In practice, this aggregation step may take much longer than the time required to compute the sum of the squared frequencies once the stream is in aggregated form.

For F0, our GPU-based server implementation ran roughly 25,000× slower than the obvious unverifiable algorithm, which simply counts the number of non-zero items in a vector. The larger slowdown compared to the F2 problem is unsurprising: since F0 is a less arithmetic problem than F2, its circuit representation is much larger. Once again, it is likely that we overestimate the slowdowns for this problem, as we did not count the time for an unverifiable algorithm to aggregate the stream into its frequency-vector representation. Despite the substantial slowdown incurred for F0 compared to a naive unverifiable algorithm, it remains valuable as a primitive for use in heavier-duty computations like PM and MATMULT.

For PM, the bulk of the circuit consists of an F0 subroutine, and so the runtime of our GPU-based implementation was similar to that for F0. However, the sequential unverifiable algorithm for PM takes longer than that for F0. Thus, our GPU-based server implementation ran roughly 6,500× slower than the naive unverifiable algorithm, which exhaustively searches all possible locations for occurrences of the pattern.

For MATMULT, our GPU-based server implementation ran roughly 500× slower than naive matrix multiplication for 256 × 256 matrices. Moreover, this number is likely inflated due to cache effects from which the naive unverifiable algorithm benefited. That is, the naive unverifiable algorithm takes only 0.09 seconds for 256 × 256 matrices, but takes 7.1 seconds for 512 × 512 matrices, likely because the algorithm experiences very few cache misses on the smaller matrix. We therefore expect the slowdown of our implementation to fall to under 100× if we were to scale to larger matrices. Furthermore, the GKR protocol is capable of verifying matrix multiplication over the finite field F_p rather than over the integers at no additional cost.
Naive matrix multiplication over this field is between 2-3× slower than matrix multiplication over the integers (even using the fast arithmetic operations available for this field). Thus, if our goal were to work over this finite field rather than the integers, our slowdown would fall by another 2-3×. It is therefore possible that our server-side slowdown may be less than 50× at larger inputs, compared to naive matrix multiplication over F_p.

Client-side speedup obtained by GPU computing. The bulk of V's computation consists of evaluating a single symbol in an error-corrected encoding of the input; this computation is independent of the circuit being verified. For reasonably large inputs (see the row for F2 in Table 2), our GPU-based client implementation performed this computation over 100× faster than the sequential implementation of [8]. For smaller inputs the speedup was unsurprisingly smaller, due to increased overhead relative to total computation time. Still, we obtained a 15× speedup even for an input of length 65,536 (256 × 256 matrix multiplication).

Client-side speedup relative to unverifiable sequential algorithms. Our matrix-multiplication results clearly demonstrate that for problems requiring super-linear time to solve, even the sequential implementation of [8] will save the client time compared to doing the computation locally. Indeed, the runtime of the client is dominated by the cost of evaluating a single symbol in an error-corrected encoding of the input, and this cost grows linearly with the input size. Even for relatively small matrices of size 256 × 256, the client in the implementation of [8] saved time. For matrices with tens of millions of entries, our results demonstrate that the client will still take just a few seconds, while performing the matrix multiplication computation would require orders of magnitude more time. Our results demonstrate that GPU computing can be used to reduce the verifier's computation time by another 100×.

[Figure 6: two log-log panels plotting computation time (seconds) against input size, with sequential and parallel curves for the special-purpose F2 prover and verifier.]

Figure 6: Comparison of prover (a) and verifier (b) runtimes in the sequential and GPU-based implementations of the special-purpose F2 protocol. Note that all plots are on a log-log scale. Throughout, the verifier's space usage and the proof length are both set to √n.

V space (KB) | Proof length (KB) | GPU P Time (s) | Sequential P Time (s) | GPU V Time (s) | Sequential V Time (s)
39.1         | 78.1              | 2.901          | 43.773                | 0.019          | 0.858
78.2         | 39.1              | 1.872          | 43.544                | 0.010          | 0.639
156.5        | 19.5              | 1.154          | 37.254                | 0.010          | 0.577
313.2        | 9.8               | 0.909          | 36.554                | 0.008          | 0.552
1953.1       | 0.78              | 0.357          | 20.658                | 0.007          | 0.551

Table 3: Prover and verifier runtimes for the special-purpose F2 protocol. All results are for a fixed universe size n = 25 million, varying the tradeoff between proof length and the client's space usage. This universe size corresponds to 190.7 MB of data.

6.3 Special-purpose protocols

We implemented both the client and the server of the non-interactive F2 protocol of [6, 8] on the GPU. As described in Section 2.3, this protocol is the fundamental building block for a host of non-interactive protocols achieving optimal tradeoffs between the space usage of the client and the length of the proof.
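As a reminder of the layout point from Section 5.2.1 before turning to the measurements, the toy kernel below is an illustration only (not the implementation evaluated here): each thread walks one row of the v × h grid, and because the grid is stored in column-major order, the threads of a warp read consecutive words at every step. The per-element work is just a placeholder for the actual Prime Factor FFT.

// Toy illustration of the column-major layout used for memory coalescing:
// element (row, col) of the v-by-h grid is stored at index col * v + row,
// so neighboring threads (rows) always read neighboring words.
#include <cstddef>
typedef unsigned long long u64;

__global__ void per_row_pass(const u64* grid_colmajor, u64* row_out,
                             int v /* rows */, int h /* cols */) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= v) return;
    u64 acc = 0;
    for (int col = 0; col < h; ++col) {
        u64 x = grid_colmajor[(size_t)col * (size_t)v + (size_t)row];
        acc += x * x;   // placeholder; the real prover runs an FFT per row
    }
    row_out[row] = acc;
}

With a row-major layout, the same loop would have consecutive threads reading addresses h words apart on every iteration, which is exactly the uncoalesced pattern the transpose avoids.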
Figure 6 demonstrates the performance of our GPU-based implementation of this protocol. Our GPU implementation obtained a 20-50× server-side speedup relative to the sequential implementation of [8]. This speedup was only possible after transposing the data grid into column-major order so as to achieve perfect memory coalescing, as described in Section 5.2.1.

The server-side speedups we observed depended on the desired tradeoff between proof length and space usage. That is, the protocol partitions the universe [n] into a v × h grid, where h is roughly the proof length and v is the verifier's space usage. The prover processes each row of the grid independently (many rows in parallel). When v is large, each row requires a substantial amount of processing. In this case, the overhead of parallelization is effectively amortized over the total computation time. If v is smaller, then the overhead is less effectively amortized and we see less impressive speedups.

We note that Figure 6 depicts the prover runtime for both the sequential implementation of [8] and our GPU-based implementation with the parameters h = v = √n. With these parameters, our GPU-based implementation achieved roughly a 20× speedup relative to the sequential program. Table 3 shows the costs of the protocol for a fixed universe size n = 25 million as we vary the tradeoff between h and v. The data in this table shows that our parallel implementation enjoys a 40-60× speedup relative to the sequential implementation when v is substantially larger than h. This indicates that we would see similar speedups even when h = v = √n if we scaled to larger input sizes n.

Notice that the universe size n = 25 million corresponds to over 190 MB of data, while the verifier's space usage and the proof length are hundreds or thousands of times smaller in all our experiments. An unverifiable sequential algorithm for computing the second frequency moment over this universe required 0.031 seconds; thus, our parallel server implementation achieved a slowdown of 10-100× relative to an unverifiable algorithm. In contrast, the verifier's computation was much easier to parallelize, as its memory access pattern is highly regular. Our GPU-based implementation obtained 40-70× speedups relative to the sequential verifier of [8] across all input lengths n, including when we set h = v = √n.

7 Conclusions

This paper adds to a growing line of work focused on obtaining fully practical methods for verifiable computation. Our primary contribution in this paper was in demonstrating the power of parallelization, and GPU computing in particular, to obtain robust speedups in some of the most promising protocols in this area. We believe the additional costs of obtaining correctness guarantees demonstrated in this paper would already be considered modest in many correctness-critical applications. Moreover, it seems likely that future advances in interactive proof methodology will also be amenable to parallelization. This is because the protocols we implement utilize a number of common primitives (such as the sum-check protocol [15]) as subroutines, and these primitives are likely to appear in future protocols as well.

Several avenues for future work suggest themselves. First, the GKR protocol is rather inefficient for the prover when applied to computations which are non-arithmetic in nature, as the circuit representation of such a computation is necessarily large. Developing improved protocols for such problems (even special-purpose ones) would be interesting.
Prime candidates include many graph problems like minimum spanning tree and perfect matching. More generally, a top priority is to further reduce the slowdown or the memory intensity for the prover in general-purpose protocols. Both of these goals could be accomplished by developing an entirely new construction that avoids the circuit representation of the computation; it is also possible that the prover within the GKR construction can be further optimized without fundamentally altering the protocol.

References

[1] J. Applequist. New assured cloud computing center to be established at Illinois. http://cs.illinois.edu/news/2011/May6-01, May 2011.

[2] S. Arora and B. Barak. Computational Complexity: A Modern Approach. Cambridge University Press, 2009.

[3] L. Babai. Trading group theory for randomness. In ACM Symp. Theory of Computing (STOC '85), pages 421–429, 1985.

[4] C. Burrus and P. Eschenbacher. An in-place, in-order prime factor FFT algorithm. IEEE Trans. Acoustics, Speech and Signal Processing, 29(4):806–817, 1981.

[5] R. Canetti, B. Riva, and G. N. Rothblum. Practical delegation of computation using multiple servers. In ACM Conf. Computer and Communications Security (CCS '11), pages 445–454, 2011.

[6] A. Chakrabarti, G. Cormode, and A. McGregor. Annotations in data streams. In Intl. Colloq. Automata, Languages and Programming (ICALP '09), pages 222–234, 2009.

[7] G. Cormode, M. Mitzenmacher, and J. Thaler. Streaming graph computations with a helpful advisor. In European Symp. Algorithms (ESA '10), pages 231–242, 2010.

[8] G. Cormode, M. Mitzenmacher, and J. Thaler. Practical verified computation with streaming interactive proofs. In Innovations in Theoretical Computer Science (ITCS '12), 2012.

[9] G. Cormode, J. Thaler, and K. Yi. Verifying computations with streaming interactive proofs. Proc. VLDB Endowment, 5(1):25–36, 2011.

[10] S. Goldwasser, Y. T. Kalai, and G. N. Rothblum. Delegating computation: Interactive proofs for muggles. In ACM Symp. Theory of Computing (STOC '08), pages 113–122, 2008.

[11] S. Goldwasser, S. Micali, and C. Rackoff. The knowledge complexity of interactive proof systems. SIAM J. Computing, 18(1):186–208, 1989.

[12] J. Hoberock and N. Bell. Thrust: A parallel template library, 2011. Version 1.3.0.

[13] Y. Ishai, E. Kushilevitz, and R. Ostrovsky. Efficient arguments without short PCPs. In IEEE Conf. Computational Complexity (CCC '07), pages 278–291, 2007.

[14] V. W. Lee, C. Kim, J. Chhugani, M. Deisher, D. Kim, A. D. Nguyen, N. Satish, M. Smelyanskiy, S. Chennupaty, P. Hammarlund, R. Singhal, and P. Dubey. Debunking the 100X GPU vs. CPU myth: An evaluation of throughput computing on CPU and GPU. In Proc. 37th Annual International Symposium on Computer Architecture (ISCA '10), pages 451–460, New York, NY, USA, 2010. ACM.

[15] C. Lund, L. Fortnow, H. Karloff, and N. Nisan. Algebraic methods for interactive proof systems. Journal of the ACM, 39(4):859–868, 1992.

[16] R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press, 1995.

[17] J. Owens, M. Houston, D. Luebke, S. Green, J. Stone, and J. Phillips. GPU computing. Proceedings of the IEEE, 96(5):879–899, 2008.

[18] S. Setty, A. J. Blumberg, and M. Walfish. Toward practical and unconditional verification of remote computations. In Hot Topics in Operating Systems (HotOS '11), 2011.

[19] S. Setty, R. McPherson, A. J. Blumberg, and M. Walfish. Making argument systems for outsourced computation practical (sometimes).
In Network & Distributed System Security Symposium (NDSS '12), 2012.

[20] J. Thaler, M. Roberts, M. Mitzenmacher, and H. Pfister. Source code. http://people.seas.harvard.edu/~jthaler/TRMPcode.htm, 2012.

[21] M. Thorup. Even strongly universal hashing is pretty fast. In ACM-SIAM Symp. Discrete Algorithms (SODA '00), pages 496–497, 2000.

Unreported Side Effects of Drugs Are Found Using Internet Search Data, Study Finds
By JOHN MARKOFF
Published: March 6, 2013

Using data drawn from queries entered into Google, Microsoft and Yahoo search engines, scientists at Microsoft, Stanford and Columbia University have for the first time been able to detect evidence of unreported prescription drug side effects before they were found by the Food and Drug Administration's warning system.

Using automated software tools to examine queries by six million Internet users taken from Web search logs in 2010, the researchers looked for searches relating to an antidepressant, paroxetine, and a cholesterol-lowering drug, pravastatin. They were able to find evidence that the combination of the two drugs caused high blood sugar.

The study, which was reported in the Journal of the American Medical Informatics Association on Wednesday, is based on data-mining techniques similar to those employed by services like Google Flu Trends, which has been used to give early warning of the prevalence of the sickness to the public.

The F.D.A. asks physicians to report side effects through a system known as the Adverse Event Reporting System. But its scope is limited by the fact that data is generated only when a physician notices something and reports it.

The new approach is a refinement of work done by the laboratory of Russ B. Altman, the chairman of the Stanford bioengineering department. The group had explored whether it was possible to automate the process of discovering "drug-drug" interactions by using software to hunt through the data found in F.D.A. reports. The group reported in May 2011 that it was able to detect the interaction between paroxetine and pravastatin in this way. Its research determined that the patient's risk of developing hyperglycemia was increased compared with taking either drug individually.

The new study was undertaken after Dr. Altman wondered whether there was a more immediate and more accurate way to gain access to data similar to what the F.D.A. had access to. He turned to computer scientists at Microsoft, who created software for scanning anonymized data collected from a software toolbar installed in Web browsers by users who permitted their search histories to be collected. The scientists were able to explore 82 million individual searches for drug, symptom and condition information.

The researchers first identified individual searches for the terms paroxetine and pravastatin, as well as searches for both terms, in 2010.
They then computed the likelihood that users in each group would also search for hyperglycemia as well as roughly 80 of its symptoms — words or phrases like "high blood sugar" or "blurry vision." They determined that people who searched for both drugs during the 12-month period were significantly more likely to search for terms related to hyperglycemia than were those who searched for just one of the drugs. (About 10 percent, compared with 5 percent and 4 percent for just one drug.)

They also found that people who did the searches for symptoms relating to both drugs were likely to do the searches in a short time period: 30 percent did the search on the same day, 40 percent during the same week and 50 percent during the same month.

"You can imagine how this kind of combination would be very, very hard to study given all the different drug pairs or combinations that are out there," said Eric Horvitz, a managing co-director of Microsoft Research's laboratory in Redmond, Wash.

The researchers said they were surprised by the strength of the "signal" that they detected in the searches and argued that it would be a valuable tool for the F.D.A. to add to its current system for tracking adverse effects. "There is a potential public health benefit in listening to such signals," they wrote in the paper, "and integrating them with other sources of information."

The researchers said that they were now thinking about how to add new sources of information, like behavioral data and information from social media sources. The challenge, they noted, was to integrate new sources of data while protecting individual privacy.

Currently the F.D.A. has financed the Sentinel Initiative, an effort begun in 2008 to assess the risks of drugs already on the market. Eventually, that project plans to monitor drug use by as many as 100 million people in the United States. The system will be based on information collected by health care providers on a massive scale.

"I think there are tons of drug-drug interactions — that's the bad news," Dr. Altman said. "The good news is we also have ways to evaluate the public health impact. This is why I'm excited about F.D.A. involvement here. They do have mechanisms and ways to pick up the things that we find and triage them based on anticipated public health impact."

Paper to be presented at Oxford Internet Institute's "A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society" on September 21, 2011.

Six Provocations for Big Data
danah boyd, Microsoft Research (dmb@microsoft.com)
Kate Crawford, University of New South Wales (k.crawford@unsw.edu.au)

Technology is neither good nor bad; nor is it neutral...technology's interaction with the social ecology is such that technical developments frequently have environmental, social, and human consequences that go far beyond the immediate purposes of the technical devices and practices themselves.
Melvin Kranzberg (1986, p. 545)

We need to open a discourse – where there is no effective discourse now – about the varying temporalities, spatialities and materialities that we might represent in our databases, with a view to designing for maximum flexibility and allowing as possible for an emergent polyphony and polychrony. Raw data is both an oxymoron and a bad idea; to the contrary, data should be cooked with care.
Geoffrey Bowker (2005, p. 183-184)

The era of Big Data has begun.
Computer scientists, physicists, economists, mathematicians, political scientists, bio-informaticists, sociologists, and many others are clamoring for access to the massive quantities of information produced by and about people, things, and their interactions. Diverse groups argue about the potential benefits and costs of analyzing information from Twitter, Google, Verizon, 23andMe, Facebook, Wikipedia, and every space where large groups of people leave digital traces and deposit data. Significant questions emerge. Will large-scale analysis of DNA help cure diseases? Or will it usher in a new wave of medical inequality? Will data analytics help make people's access to information more efficient and effective? Or will it be used to track protesters in the streets of major cities? Will it transform how we study human communication and culture, or narrow the palette of research options and alter what 'research' means? Some or all of the above?

Big Data is, in many ways, a poor term. As Lev Manovich (2011) observes, it has been used in the sciences to refer to data sets large enough to require supercomputers, although now vast sets of data can be analyzed on desktop computers with standard software. There is little doubt that the quantities of data now available are indeed large, but that's not the most relevant characteristic of this new data ecosystem. Big Data is notable not because of its size, but because of its relationality to other data. Due to efforts to mine and aggregate data, Big Data is fundamentally networked. Its value comes from the patterns that can be derived by making connections between pieces of data, about an individual, about individuals in relation to others, about groups of people, or simply about the structure of information itself. Furthermore, Big Data is important because it refers to an analytic phenomenon playing out in academia and industry. Rather than suggesting a new term, we are using Big Data here because of its popular salience and because it is the phenomenon around Big Data that we want to address.

Big Data tempts some researchers to believe that they can see everything at a 30,000-foot view. It is the kind of data that encourages the practice of apophenia: seeing patterns where none actually exist, simply because massive quantities of data can offer connections that radiate in all directions. Due to this, it is crucial to begin asking questions about the analytic assumptions, methodological frameworks, and underlying biases embedded in the Big Data phenomenon.

While databases have been aggregating data for over a century, Big Data is no longer just the domain of actuaries and scientists. New technologies have made it possible for a wide range of people – including humanities and social science academics, marketers, governmental organizations, educational institutions, and motivated individuals – to produce, share, interact with, and organize data. Massive data sets that were once obscure and distinct are being aggregated and made easily accessible. Data is increasingly digital air: the oxygen we breathe and the carbon dioxide that we exhale. It can be a source of both sustenance and pollution.
How we handle the emergence of an era of Big Data is critical: while it is taking place in an environment of uncertainty and rapid change, current decisions will have considerable impact in the future. With the increased automation of data collection and analysis – as well as algorithms that can extract and inform us of massive patterns in human behavior – it is necessary to ask which systems are driving these practices, and which are regulating them. In Code, Lawrence Lessig (1999) argues that systems are regulated by four forces: the market, the law, social norms, and architecture – or, in the case of technology, code. When it comes to Big Data, these four forces are at work and, frequently, at odds. The market sees Big Data as pure opportunity: marketers use it to target advertising, insurance providers want to optimize their offerings, and Wall Street bankers use it to get better readings of market temperament. Legislation has already been proposed to curb the collection and retention of data, usually over concerns about privacy (for example, the Do Not Track Online Act of 2011 in the United States). Features like personalization allow rapid access to more relevant information, but they present difficult ethical questions and fragment the public in problematic ways (Pariser 2011). There are some significant and insightful studies currently being done that draw on Big Data methodologies, particularly studies of practices in social network sites like Facebook and Twitter. Yet, it is imperative that we begin asking critical questions about what all this data means, who gets access to it, how it is deployed, and to what ends. With Big Data come big responsibilities. In this essay, we are offering six provocations that we hope can spark conversations about the issues of Big Data. Social and cultural researchers have a stake in the computational culture of Big Data precisely because many of its central questions are fundamental to our disciplines. Thus, we believe that it is time to start critically interrogating this phenomenon, its assumptions, and its biases. 1. Automating Research Changes the Definition of Knowledge. In the early decades of the 20th century, Henry Ford devised a manufacturing system of mass production, using specialized machinery and standardized products. Simultaneously, it became the dominant vision of technological progress. Fordism meant automation and assembly lines, and for decades onward, this became the orthodoxy of manufacturing: out with skilled craftspeople and slow work, in with a new machine-made era (Baca 2004). But it was more than just a new set of tools. The 20th century was marked by Fordism at a cellular level: it produced a new understanding of labor, the human relationship to work, and society at large. Big Data not only refers to very large data sets and the tools and procedures used to manipulate and analyze them, but also to a computational turn in thought and research (Burkholder 1992). Just as Ford changed the way we made cars – and then transformed work itself – Big Data has emerged as a system of knowledge that is already changing the objects of knowledge, while also having the power to inform how we understand human networks and community. 
‘Change the instruments, and you will change the entire social theory that goes with them,’ Latour reminds us (2009, p. 9). We would argue that Big Data creates a radical shift in how we think about research. Commenting on computational social science, Lazer et al. argue that it offers ‘the capacity to collect and analyze data with an unprecedented breadth and depth and scale’ (2009, p. 722). But it is not just a matter of scale. Nor is it enough to consider it in terms of proximity, or what Moretti (2007) refers to as distant or close analysis of texts. Rather, it is a profound change at the levels of epistemology and ethics. It reframes key questions about the constitution of knowledge, the processes of research, how we should engage with information, and the nature and the categorization of reality. Just as du Gay and Pryke note that ‘accounting tools...do not simply aid the measurement of economic activity, they shape the reality they measure’ (2002, pp. 12-13), so Big Data stakes out new terrains of objects, methods of knowing, and definitions of social life. Speaking in praise of what he terms ‘The Petabyte Age’, Chris Anderson, Editor-in-Chief of Wired, writes: This is a world where massive amounts of data and applied mathematics replace every other tool that might be brought to bear. Out with every theory of human behavior, from linguistics to sociology. Forget taxonomy, ontology, and psychology. Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity. With enough data, the numbers speak for themselves. (2008) Do numbers speak for themselves? The answer, we think, is a resounding ‘no’. Significantly, Anderson’s sweeping dismissal of all other theories and disciplines is a tell: it reveals an arrogant undercurrent in many Big Data debates where all other forms of analysis can be sidelined by production lines of numbers, privileged as having a direct line to raw knowledge. Why people do things, write things, or make things is erased by the sheer volume of numerical repetition and large patterns. This is not a space for reflection or the older forms of intellectual craft. As David Berry (2011, p. 8) writes, Big Data provides ‘destabilising amounts of knowledge and information that lack the regulating force of philosophy.’ Instead of philosophy – which Kant saw as the rational basis for all institutions – ‘computationality might then be understood as an ontotheology, creating a new ontological “epoch” as a new historical constellation of intelligibility’ (Berry 2011, p. 12). We must ask difficult questions of Big Data’s models of intelligibility before they crystallize into new orthodoxies. If we return to Ford, his innovation was using the assembly line to break down interconnected, holistic tasks into simple, atomized, mechanistic ones. He did this by designing specialized tools that strongly predetermined and limited the action of the worker. Similarly, the specialized tools of Big Data also have their own inbuilt limitations and restrictions. One is the issue of time. ‘Big Data is about exactly right now, with no historical context that is predictive,’ observes Joi Ito, the director of the MIT Media Lab (Bollier 2010, p. 19). 
For example, Twitter and Facebook are examples of Big Data sources that offer very poor archiving and search functions, where researchers are much more likely to focus on something in the present or immediate past – tracking reactions to an election, TV finale or natural disaster – because of the sheer difficulty or impossibility of accessing older data. If we are observing the automation of particular kinds of research functions, then we must consider the inbuilt flaws of the machine tools. It is not enough to simply ask, as Anderson suggests ‘what can science learn from Google?’, but to ask how Google and the other harvesters of Big Data might change the meaning of learning, and what new possibilities and new limitations may come with these systems of knowing. 2. Claims to Objectivity and Accuracy are Misleading ‘Numbers, numbers, numbers,’ writes Latour (2010). ‘Sociology has been obsessed by the goal of becoming a quantitative science.’ Yet sociology has never reached this goal, in Latour’s view, because of where it draws the line between what is and is not quantifiable knowledge in the social domain. Big Data offers the humanistic disciplines a new way to claim the status of quantitative science and objective method. It makes many more social spaces quantifiable. In reality, working with Big Data is still subjective, and what it quantifies does not necessarily have a closer claim on objective truth – particularly when considering messages from social media sites. But there remains a mistaken belief that qualitative researchers are in the business of interpreting stories and quantitative researchers are in the business 4 Paper to be presented at Oxford Internet Institute’s “A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society” on September 21, 2011. of producing facts. In this way, Big Data risks reinscribing established divisions in the long running debates about scientific method. The notion of objectivity has been a central question for the philosophy of science and early debates about the scientific method (Durkheim 1895). Claims to objectivity suggest an adherence to the sphere of objects, to things as they exist in and for themselves. Subjectivity, on the other hand, is viewed with suspicion, colored as it is with various forms of individual and social conditioning. The scientific method attempts to remove itself from the subjective domain through the application of a dispassionate process whereby hypotheses are proposed and tested, eventually resulting in improvements in knowledge. Nonetheless, claims to objectivity are necessarily made by subjects and are based on subjective observations and choices. All researchers are interpreters of data. As Lisa Gitelman (2011) observes, data needs to be imagined as data in the first instance, and this process of the imagination of data entails an interpretative base: ‘every discipline and disciplinary institution has its own norms and standards for the imagination of data.’ As computational scientists have started engaging in acts of social science, there is a tendency to claim their work as the business of facts and not interpretation. A model may be mathematically sound, an experiment may seem valid, but as soon as a researcher seeks to understand what it means, the process of interpretation has begun. The design decisions that determine what will be measured also stem from interpretation. 
For example, in the case of social media data, there is a ‘data cleaning’ process: making decisions about what attributes and variables will be counted, and which will be ignored. This process is inherently subjective. As Bollier explains, As a large mass of raw information, Big Data is not self-explanatory. And yet the specific methodologies for interpreting the data are open to all sorts of philosophical debate. Can the data represent an ‘objective truth’ or is any interpretation necessarily biased by some subjective filter or the way that data is ‘cleaned?’ (2010, p. 13) In addition to this question, there is the issue of data errors. Large data sets from Internet sources are often unreliable, prone to outages and losses, and these errors and gaps are magnified when multiple data sets are used together. Social scientists have a long history of asking critical questions about the collection of data and trying to account for any biases in their data (Cain & Finch, 1981; Clifford & Marcus, 1986). This requires understanding the properties and limits of a dataset, regardless of its size. A dataset may have many millions of pieces of data, but this does not mean it is random or representative. To make statistical claims about a dataset, we need to know where data is coming from; it is similarly important to know and account for the weaknesses in that data. Furthermore, researchers must be able to account for the biases in their interpretation of the data. To do so requires recognizing that one’s identity and perspective informs one’s analysis (Behar & Gordon, 1996). 5 Paper to be presented at Oxford Internet Institute’s “A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society” on September 21, 2011. Spectacular errors can emerge when researchers try to build social science findings into technological systems. A classic example arose when Friendster chose to implement Robin Dunbar’s (1998) work. Analyzing gossip practices in humans and grooming habits in monkeys, Dunbar found that people could only actively maintain 150 relationships at any time and argued that this number represented the maximum size of a person's personal network. Unfortunately, Friendster believed that people were replicating their pre-existing personal networks on the site, so they inferred that no one should have a friend list greater than 150. Thus, they capped the number of ‘Friends’ people could have on the system (boyd, 2006). Interpretation is at the center of data analysis. Regardless of the size of a data set, it is subject to limitation and bias. Without those biases and limitations being understood and outlined, misinterpretation is the result. Big Data is at its most effective when researchers take account of the complex methodological processes that underlie the analysis of social data. 3. Bigger Data are Not Always Better Data Social scientists have long argued that what makes their work rigorous is rooted in their systematic approach to data collection and analysis (McClosky, 1985). Ethnographers focus on reflexively accounting for bias in their interpretations. Experimentalists control and standardize the design of their experiment. Survey researchers drill down on sampling mechanisms and question bias. Quantitative researchers weigh up statistical significance. These are but a few of the ways in which social scientists try to assess the validity of each other’s work. 
Unfortunately, some who are embracing Big Data presume the core methodological issues in the social sciences are no longer relevant. There is a problematic underlying ethos that bigger is better, that quantity necessarily means quality. Twitter provides an example in the context of a statistical analysis. First, Twitter does not represent ‘all people’, although many journalists and researchers refer to ‘people’ and ‘Twitter users’ as synonymous. Neither is the population using Twitter representative of the global population. Nor can we assume that accounts and users are equivalent. Some users have multiple accounts. Some accounts are used by multiple people. Some people never establish an account, and simply access Twitter via the web. Some accounts are ‘bots’ that produce automated content without involving a person. Furthermore, the notion of an ‘active’ account is problematic. While some users post content frequently through Twitter, others participate as ‘listeners’ (Crawford 2009, p. 532). Twitter Inc. has revealed that 40 percent of active users sign in just to listen (Twitter, 2011). The very meanings of ‘user’ and ‘participation’ and ‘active’ need to be critically examined. Due to uncertainties about what an account represents and what engagement looks like, researchers stand on precarious ground when they sample Twitter accounts and make claims about people and users. Twitter Inc. can make claims about all accounts or all tweets or a random sample thereof as they have access to the central database. Even so, they cannot easily account for lurkers, people who have multiple accounts, or groups of people who all access one account. Additionally, the central database is also prone to outages, and tweets are frequently lost and deleted. Twitter Inc. makes a fraction of its material available to the public through its APIs.1 The ‘firehose’ theoretically contains all public tweets ever posted and explicitly excludes any tweet that a user chose to make private or ‘protected.’ Yet, some publicly accessible tweets are also missing from the firehose. Although a handful of companies and startups have access to the firehose, very few researchers have this level of access. Most either have access to a ‘gardenhose’ (roughly 10% of public tweets), a ‘spritzer’ (roughly 1% of public tweets), or have used ‘white-listed’ accounts where they could use the APIs to get access to different subsets of content from the public stream.2 It is not clear what tweets are included in these different data streams or what sampling them represents. It could be that the API pulls a random sample of tweets or that it pulls the first few thousand tweets per hour or that it only pulls tweets from a particular segment of the network graph. Given this uncertainty, it is difficult for researchers to make claims about the quality of the data that they are analyzing. Is the data representative of all tweets? No, because it excludes tweets from protected accounts.3 Is the data representative of all public tweets? Perhaps, but not necessarily. These are just a few of the unknowns that researchers face when they work with Twitter data, yet these limitations are rarely acknowledged. 
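To make the sampling concern concrete, here is a minimal simulation (not based on Twitter's actual, undocumented sampling): when a nominally 1% sample quietly under-represents one kind of tweet, topical frequencies estimated from it are biased, and nothing in the sample itself reveals the skew. The topics, rates, and suppression rule below are invented for illustration.

```python
# Minimal simulation of the sampling concern above: a nominally 1% sample
# that quietly drops one topic produces biased topical frequencies, and the
# sample alone gives no hint of the bias. Topics, rates, and the suppression
# rule are invented; Twitter's real sampling behavior is not public.
import random

random.seed(0)
TOPICS = {"politics": 0.30, "sports": 0.30, "adult": 0.10, "other": 0.30}

def make_tweets(n):
    topics, weights = zip(*TOPICS.items())
    return random.choices(topics, weights=weights, k=n)

def opaque_sample(tweets, rate=0.01, suppressed="adult"):
    """A 'spritzer'-like sample that silently excludes one topic."""
    return [t for t in tweets if t != suppressed and random.random() < rate]

def topical_frequency(tweets):
    return {t: round(tweets.count(t) / len(tweets), 3) for t in TOPICS}

population = make_tweets(200_000)
sample = opaque_sample(population)
print("true frequencies:  ", topical_frequency(population))
print("sample frequencies:", topical_frequency(sample))
# The suppressed topic appears at 0% in the sample and every other topic is
# inflated, even though the sample "looks like" a random 1% of public tweets.
```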
Even those who provide a mechanism for how they sample from the firehose or the gardenhose rarely reveal what might be missing or how their algorithms or the architecture of Twitter’s system introduces biases into the dataset. Some scholars simply focus on the raw number of tweets: but big data and whole data are not the same. Without taking into account the sample of a dataset, the size of the dataset is meaningless. For example, a researcher may seek to understand the topical frequency of tweets, yet if Twitter removes all tweets that contain problematic words or content – such as references to pornography – from the stream, the topical frequency would be wholly inaccurate. Regardless of the number of tweets, it is not a representative sample as the data is skewed from the beginning. Twitter has become a popular source for mining Big Data, but working with Twitter data has serious methodological challenges that are rarely addressed by those who embrace it. When researchers approach a dataset, they need to understand – and publicly account for – not only the limits of the dataset, but also the limits of which questions they can ask of a dataset and what interpretations are appropriate. 1 API stands for application programming interface; this refers to a set of tools that developers can use to access structured data. 2 Details of what Twitter provides can be found at https://dev.twitter.com/docs/streaming-api/methods White-listed accounts were a common mechanism of acquiring access early on, but they are no longer available. 3 The percentage of protected accounts is unknown. In a study of Twitter where they attempted to locate both protected and public Twitter accounts, Meeder et al (2010) found that 8.4% of the accounts they identified were protected. 7 Paper to be presented at Oxford Internet Institute’s “A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society” on September 21, 2011. This is especially true when researchers combine multiple large datasets. Jesper Anderson, co-founder of open financial data store FreeRisk, explains that combining data from multiple sources creates unique challenges: ‘Every one of those sources is errorprone…I think we are just magnifying that problem [when we combine multiple data sets]’ (Bollier 2010, p. 13). This does not mean that combining data doesn’t have value – studies like those by Alessandro Acquisti and Ralph Gross (2009), which reveal how databases can be combined to reveal serious privacy violations are crucial. Yet, it is imperative that such combinations are not without methodological rigor and transparency. Finally, in the era of the computational turn, it is increasingly important to recognize the value of ‘small data’. Research insights can be found at any level, including at very modest scales. In some cases, focusing just on a single individual can be extraordinarily valuable. Take, for example, the work of Tiffany Veinot (2007), who followed one worker - a vault inspector at a hydroelectric utility company - in order to understand the information practices of blue-collar worker. In doing this unusual study, Veinot reframed the definition of ‘information practices’ away from the usual focus on early-adopter, white-collar workers, to spaces outside of the offices and urban context. Her work tells a story that could not be discovered by farming millions of Facebook or Twitter accounts, and contributes to the research field in a significant way, despite the smallest possible participant count. 
The size of data being sampled should fit the research question being asked: in some cases, small is best. 4. Not All Data Are Equivalent Some researchers assume that analyses done with small data can be done better with Big Data. This argument also presumes that data is interchangeable. Yet, taken out of context, data lose meaning and value. Context matters. When two datasets can be modeled in a similar way, this does not mean that they are equivalent or can be analyzed in the same way. Consider, for example, the rise of interest in social network analysis that has emerged alongside the rise of social network sites (boyd & Ellison 2007) and the industry-driven obsession with the ‘social graph’. Countless researchers have flocked to Twitter and Facebook and other social media to analyze the resultant social graphs, making claims about social networks. The study of social networks dates back to early sociology and anthropology (e.g., Radcliffe-Brown 1940), with the notion of a ‘social network’ emerging in 1954 (Barnes) and the field of ‘social network analysis’ emerging shortly thereafter (Freeman 2006). Since then, scholars from diverse disciplines have been trying to understand people’s relationships to one another using diverse methodological and analytical approaches. As researchers began interrogating the connections between people on public social media, there was a surge of interest in social network analysis. Now, network analysts are turning to study networks produced through mediated communication, geographical movement, and other data traces. 8 Paper to be presented at Oxford Internet Institute’s “A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society” on September 21, 2011. However, the networks produced through social media and resulting from communication traces are not necessarily interchangeable with other social network data. Just because two people are physically co-present – which may be made visible to cell towers or captured through photographs – does not mean that they know one another. Furthermore, rather than indicating the presence of predictable objective patterns, social network sites facilitate connectedness across structural boundaries and act as a dynamic source of change: taking a snapshot, or even witnessing a set of traces over time does not capture the complexity of all social relations. As Kilduff and Tsai (2003, p. 117) note, ‘network research tends to proceed from a naive ontology that takes as unproblematic the objective existence and persistence of patterns, elementary parts and social systems.’ This approach can yield a particular kind of result when analysis is conducted only at a fixed point in time, but quickly unravels as soon as broader questions are asked (Meyer et al. 2005). Historically speaking, when sociologists and anthropologists were the primary scholars interested in social networks, data about people’s relationships was collected through surveys, interviews, observations, and experiments. Using this data, social scientists focused on describing one’s ‘personal networks’ – the set of relationships that individuals develop and maintain (Fischer 1982). These connections were evaluated based on a series of measures developed over time to identify personal connections. 
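As a toy illustration of how such measures of personal connection can diverge from trace-based counts, the sketch below compares a self-reported closeness ranking with a ranking inferred purely from message frequency; the names and counts are hypothetical. The point, developed in the next passage, is that the two orderings need not agree.

```python
# Toy illustration (hypothetical people and counts): a ranking of contacts by
# message frequency versus a self-reported closeness ranking of the kind a
# survey-based personal-network study might collect. The two orderings need
# not agree, which is why frequency is a poor proxy for tie strength.
message_counts = {"colleague": 240, "spouse": 35, "old friend": 3}
self_reported_closeness = {"spouse": 1, "old friend": 2, "colleague": 3}  # 1 = closest

by_frequency = sorted(message_counts, key=message_counts.get, reverse=True)
by_closeness = sorted(self_reported_closeness, key=self_reported_closeness.get)

print("ranked by message frequency: ", by_frequency)   # ['colleague', 'spouse', 'old friend']
print("ranked by reported closeness:", by_closeness)   # ['spouse', 'old friend', 'colleague']
```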
Big Data introduces two new popular types of social networks derived from data traces: ‘articulated networks’ and ‘behavioral networks.’ Articulated networks are those that result from people specifying their contacts through a mediating technology (boyd 2004). There are three common reasons in which people articulate their connections: to have a list of contacts for personal use; to publicly display their connections to others; and to filter content on social media. These articulated networks take the form of email or cell phone address books, instant messaging buddy lists, ‘Friends’ lists on social network sites, and ‘Follower’ lists on other social media genres. The motivations that people have for adding someone to each of these lists vary widely, but the result is that these lists can include friends, colleagues, acquaintances, celebrities, friends-of-friends, public figures, and interesting strangers. Behavioral networks are derived from communication patterns, cell coordinates, and social media interactions (Meiss et al. 2008; Onnela et al. 2007). These might include people who text message one another, those who are tagged in photos together on Facebook, people who email one another, and people who are physically in the same space, at least according to their cell phone. Both behavioral and articulated networks have great value to researchers, but they are not equivalent to personal networks. For example, although often contested, the concept of ‘tie strength’ is understood to indicate the importance of individual relationships (Granovetter, 1973). When a person chooses to list someone as their ‘Top Friend’ on MySpace, this may or may not be their closest friend; there are all sorts of social reasons to not list one’s most intimate connections first (boyd, 2006). Likewise, when mobile phones recognize that a worker spends more time with colleagues than their spouse, this 9 Paper to be presented at Oxford Internet Institute’s “A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society” on September 21, 2011. does not necessarily mean that they have stronger ties with their colleagues than their spouse. Measuring tie strength through frequency or public articulation is a common mistake: tie strength – and many of the theories built around it – is a subtle reckoning in how people understand and value their relationships with other people. Fascinating network analysis can be done with behavioral and articulated networks. But there is a risk in an era of Big Data of treating every connection as equivalent to every other connection, of assuming frequency of contact is equivalent to strength of relationship, and of believing that an absence of connection indicates a relationship should be made. Data is not generic. There is value to analyzing data abstractions, yet the context remains critical. 5. Just Because it is Accessible Doesn’t Make it Ethical In 2006, a Harvard-based research project started gathering the profiles of 1,700 collegebased Facebook users to study how their interests and friendships changed over time (Lewis et al. 2008). This supposedly anonymous data was released to the world, allowing other researchers to explore and analyze it. What other researchers quickly discovered was that it was possible to de-anonymize parts of the dataset: compromising the privacy of students, none of whom were aware their data was being collected (Zimmer 2008). 
The case made headlines, and raised a difficult issue for scholars: what is the status of socalled ‘public’ data on social media sites? Can it simply be used, without requesting permission? What constitutes best ethical practice for researchers? Privacy campaigners already see this as a key battleground where better privacy protections are needed. The difficulty is that privacy breaches are hard to make specific – is there damage done at the time? What about twenty years hence? ‘Any data on human subjects inevitably raise privacy issues, and the real risks of abuse of such data are difficult to quantify’ (Nature, cited in Berry 2010). Even when researchers try to be cautious about their procedures, they are not always aware of the harm they might be causing in their research. For example, a group of researchers noticed that there was a correlation between self-injury (‘cutting’) and suicide. They prepared an educational intervention seeking to discourage people from engaging in acts of self-injury, only to learn that their intervention prompted an increase in suicide attempts. For some, self-injury was a safety valve that kept the desire to attempt suicide at bay. They immediately ceased their intervention (Emmens & Phippen 2010). Institutional Review Boards (IRBs) – and other research ethics committees – emerged in the 1970s to oversee research on human subjects. While unquestionably problematic in implementation (Schrag, 2010), the goal of IRBs is to provide a framework for evaluating the ethics of a particular line of research inquiry and to make certain that checks and balances are put into place to protect subjects. Practices like ‘informed consent’ and protecting the privacy of informants are intended to empower participants in light of 10 Paper to be presented at Oxford Internet Institute’s “A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society” on September 21, 2011. earlier abuses in the medical and social sciences (Blass, 2004; Reverby, 2009). Although IRBs cannot always predict the harm of a particular study – and, all too often, prevent researchers from doing research on grounds other than ethics – their value is in prompting scholars to think critically about the ethics of their research. With Big Data emerging as a research field, little is understood about the ethical implications of the research being done. Should someone be included as a part of a large aggregate of data? What if someone’s ‘public’ blog post is taken out of context and analyzed in a way that the author never imagined? What does it mean for someone to be spotlighted or to be analyzed without knowing it? Who is responsible for making certain that individuals and communities are not hurt by the research process? What does consent look like? It may be unreasonable to ask researchers to obtain consent from every person who posts a tweet, but it is unethical for researchers to justify their actions as ethical simply because the data is accessible. Just because content is publicly accessible doesn’t mean that it was meant to be consumed by just anyone (boyd & Marwick, 2011). There are serious issues involved in the ethics of online data collection and analysis (Ess, 2002). The process of evaluating the research ethics cannot be ignored simply because the data is seemingly accessible. Researchers must keep asking themselves – and their colleagues – about the ethics of their data collection, analysis, and publication. 
In order to act in an ethical manner, it is important that scholars reflect on the importance of accountability. In the case of Big Data, this means both accountability to the field of research, and accountability to the research subjects. Academic researchers are held to specific professional standards when working with human participants in order to protect their rights and well-being. However, many ethics boards do not understand the processes of mining and anonymizing Big Data, let alone the errors that can cause data to become personally identifiable. Accountability to the field and to human subjects requires rigorous thinking about the ramifications of Big Data, rather than assuming that ethics boards will necessarily do the work of ensuring people are protected. Accountability here is used as a broader concept than privacy, as Troshynski et al. (2008) have outlined, where the concept of accountability can apply even when conventional expectations of privacy aren’t in question. Instead, accountability is a multi-directional relationship: there may be accountability to superiors, to colleagues, to participants and to the public (Dourish & Bell 2011). There are significant questions of truth, control and power in Big Data studies: researchers have the tools and the access, while social media users as a whole do not. Their data was created in highly context-sensitive spaces, and it is entirely possible that some social media users would not give permission for their data to be used elsewhere. Many are not aware of the multiplicity of agents and algorithms currently gathering and storing their data for future use. Researchers are rarely in a user’s imagined audience, nor are users necessarily aware of all the multiple uses, profits and other gains that come from information they have posted. Data may be public (or semi-public) but this does not simplistically equate with full permission being given for all uses. There is a considerable difference between being in public and being public, which is rarely acknowledged by Big Data researchers. 6. Limited Access to Big Data Creates New Digital Divides In an essay on Big Data, Scott Golder (2010) quotes sociologist George Homans (1974): ‘The methods of social science are dear in time and money and getting dearer every day.’ Historically speaking, collecting data has been hard, time-consuming, and resource intensive. Much of the enthusiasm surrounding Big Data stems from the perception that it offers easy access to massive amounts of data. But who gets access? For what purposes? In what contexts? And with what constraints? While the explosion of research using data sets from social media sources would suggest that access is straightforward, it is anything but. As Lev Manovich (2011) points out, ‘only social media companies have access to really large social data - especially transactional data. An anthropologist working for Facebook or a sociologist working for Google will have access to data that the rest of the scholarly community will not.’ Some companies restrict access to their data entirely; others sell the privilege of access for a high fee; and others offer small data sets to university-based researchers. This produces considerable unevenness in the system: those with money – or those inside the company – can produce a different type of research than those outside. 
Those without access can neither reproduce nor evaluate the methodological claims of those who have privileged access. It is also important to recognize that the class of the Big Data rich is reinforced through the university system: top-tier, well-resourced universities will be able to buy access to data, and students from the top universities are the ones most likely to be invited to work within large social media companies. Those from the periphery are less likely to get those invitations and develop their skills. The result is that the divisions between those who went to the top universities and the rest will widen significantly. In addition to questions of access, there are questions of skills. Wrangling APIs, scraping and analyzing big swathes of data is a skill set generally restricted to those with a computational background. When computational skills are positioned as the most valuable, questions emerge over who is advantaged and who is disadvantaged in such a context. This, in its own way, sets up new hierarchies around ‘who can read the numbers’, rather than recognizing that computer scientists and social scientists both have valuable perspectives to offer. Significantly, this is also a gendered division. Most researchers who have computational skills at the present moment are male and, as feminist historians and philosophers of science have demonstrated, who is asking the questions determines which questions are asked (Forsythe 2001; Harding 1989). There are complex questions about what kinds of research skills are valued in the future and how those skills are taught. How can students be educated so that they are equally comfortable with algorithms and data analysis as well as with social analysis and theory? 12 Paper to be presented at Oxford Internet Institute’s “A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society” on September 21, 2011. Finally, the difficulty and expense of gaining access to Big Data produces a restricted culture of research findings. Large data companies have no responsibility to make their data available, and they have total control over who gets to see it. Big Data researchers with access to proprietary data sets are less likely to choose questions that are contentious to a social media company, for example, if they think it may result in their access being cut. The chilling effects on the kinds of research questions that can be asked - in public or private - are something we all need to consider when assessing the future of Big Data. The current ecosystem around Big Data creates a new kind of digital divide: the Big Data rich and the Big Data poor. Some company researchers have even gone so far as to suggest that academics shouldn’t bother studying social media - as in-house people can do it so much better.4 Such explicit efforts to demarcate research ‘insiders’ and ‘outsiders’ – while by no means new – undermine the utopian rhetoric of those who evangelize about the values of Big Data. ‘Effective democratisation can always be measured by this essential criterion,’ Derrida claimed, ‘the participation in and access to the archive, its constitution, and its interpretation’ (1996, p. 4). Whenever inequalities are explicitly written into the system, they produce class-based structures. Manovich writes of three classes of people in the realm of Big Data: ‘those who create data (both consciously and by leaving digital footprints), those who have the means to collect it, and those who have expertise to analyze it’ (2011). 
We know that the last group is the smallest, and the most privileged: they are also the ones who get to determine the rules about how Big Data will be used, and who gets to participate. While institutional inequalities may be a forgone conclusion in academia, they should nevertheless be examined and questioned. They produce a bias in the data and the types of research that emerge. By arguing that the Big Data phenomenon is implicated in some much broader historical and philosophical shifts is not to suggest it is solely accountable; the academy is by no means the sole driver behind the computational turn. There is a deep government and industrial drive toward gathering and extracting maximal value from data, be it information that will lead to more targeted advertising, product design, traffic planning or criminal policing. But we do think there are serious and wide-ranging implications for the operationalization of Big Data, and what it will mean for future research agendas. As Lucy Suchman (2011) observes, via Levi Strauss, ‘we are our tools.’ We should consider how they participate in shaping the world with us as we use them. The era of Big Data has only just begun, but it is already important that we start questioning the assumptions, values, and biases of this new wave of research. As scholars who are invested in the production of knowledge, such interrogations are an essential component of what we do. 4 During his keynote talk at the International Conference on Weblogs and Social Media (ICWSM) in Barcelona on July 19, 2011, Jimmy Lin – a researcher at Twitter – discouraged researchers from pursuing lines of inquiry that internal Twitter researchers could do better given their preferential access to Twitter data. 13 Paper to be presented at Oxford Internet Institute’s “A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society” on September 21, 2011. Acknowledgements We wish to thank Heather Casteel for her help in preparing this article. We are also deeply grateful to Eytan Adar, Tarleton Gillespie, and Christian Sandvig for inspiring conversations, suggestions, and feedback. References Acquisti, A. & Gross, R. (2009) ‘Predicting Social Security Numbers from Public Data’, Proceedings of the National Academy of Science, vol. 106, no. 27, pp. 10975-10980. Anderson, C. (2008) ‘The End of Theory, Will the Data Deluge Makes the Scientific Method Obsolete?’, Edge, <http://www.edge.org/3rd culture/anderson08/ anderson08 index.html>. [25 July 2011] Baca, G. (2004) ‘Legends of Fordism: Between Myth, History, and Foregone Conclusions’, Social Analysis, vol. 48, no.3, pp. 169-178. Barnes, J. A. (1954) ‘Class and Committees in a Norwegian Island Parish’, Human Relations, vol. 7, no. 1, pp. 39–58. Barry, A. and Born, G. (2012) Interdisciplinarity: reconfigurations of the Social and Natural Sciences. Taylor and Francis, London. Behar, R. and Gordon, D. A., eds. (1996) Women Writing Culture. University of California Press, Berkeley, California. Berry, D. (2011) ‘The Computational Turn: Thinking About the Digital Humanities’, Culture Machine. vol 12. <http://www.culturemachine.net/index.php/cm/article/view/440/470>. [11 July 2011]. Blass, T. (2004) The Man Who Shocked the World: The Life and Legacy of Stanley Milgram. Basic Books, New York, New York. Bollier, D. (2010) ‘The Promise and Peril of Big Data’, <http:// www.aspeninstitute.org/sites/default/files/content/docs/pubs/ The Promise and Peril of Big Data.pdf>. [11 July 2011]. boyd, d. 
(2004) ‘Friendster and Publicly Articulated Social Networks’, Conference on Human Factors and Computing Systems (CHI 2004). ACM, April 24-2, Vienna. boyd, d. (2006) ‘Friends, Friendsters, and Top 8: Writing community into being on social network sites’, First Monday vol. 11, no. 12, article 2. boyd, d. and Ellison, N. (2007) ‘Social Network Sites: Definition, History, and Scholarship’, Journal of Computer-Mediated Communication, vol. 13, no.1, article 11. 14 Paper to be presented at Oxford Internet Institute’s “A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society” on September 21, 2011. boyd, d. and Marwick, A. (2011) ‘Social Privacy in Networked Publics: Teens’ Attitudes, Practices, and Strategies,’ paper given at Oxford Internet Institute Decade in Time Conference. Oxford, England. Bowker, G. C. (2005) Memory Practices in the Sciences. MIT Press, Cambridge, Massachusetts. Burkholder, L, ed. (1992) Philosophy and the Computer, Boulder, San Francisco, and Oxford: Westview Press. Cain, M. and Finch, J. (1981) Towards a Rehabilitation of Data. In: P. Abrams, R. Deem, J. Finch, & P. Rock (eds.), Practice and Progress: British Sociology 1950-1980, George Allen and Unwin, London. Clifford, J. and Marcus, G. E., eds. (1986) Writing Culture: The Poetics and Politics of Ethnography. University of California Press, Berkeley, California. Crawford, K. (2009) ‘Following you: Disciplines of listening in social media’, Continuum: Journal of Media & Cultural Studies vol. 23, no. 4, 532-33. Du Gay, P. and Pryke, M. (2002) Cultural Economy: Cultural Analysis and Commercial Life, Sage, London. Dunbar, R. (1998) Grooming, Gossip, and the Evolution of Language, Harvard University Press, Cambridge. Derrida, J. (1996) Archive Fever: A Freudian Impression. Trans. Eric Prenowitz, University of Chicago Press, Chicago & London. Emmens, T. and Phippen, A. (2010) ‘Evaluating Online Safety Programs’, Harvard Berkman Center for Internet and Society, <http://cyber.law.harvard.edu/sites/cyber.law.harvard.edu/files/Emmens Phippen EvaluatingOnline-Safety-Programs 2010.pdf>. [23 July 2011]. Ess, C. (2002) ‘Ethical decision-making and Internet research: Recommendations from the aoir ethics working committee,’ Association of Internet Researchers, <http://aoir.org/reports/ethics.pdf >. [12 September 2011]. Fischer, C. (1982) To Dwell Among Friends: Personal Networks in Town and City. University of Chicago, Chicago. Forsythe, D. (2001) Studying Those Who Study Us: An Anthropologist in the World of Artificial Intelligence, Stanford University Press, Stanford. Freeman, L. (2006) The Development of Social Network Analysis, Empirical Press, Vancouver. Gitelman, L. (2011) Notes for the upcoming collection ‘Raw Data’ is an Oxymoron, <https://files.nyu.edu/lg91/public/>. [23 July 2011]. Golder, S. (2010) ‘Scaling Social Science with Hadoop’, Cloudera Blog, <http://www.cloudera.com/blog/2010/04/scaling-social-science-with-hadoop/>. [June 18 2011]. 15 Paper to be presented at Oxford Internet Institute’s “A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society” on September 21, 2011. Granovetter, M. S. (1973) ‘The Strength of Weak Ties,’ American Journal of Sociology vol. 78, issue 6, pp. 1360-80. Harding, S. (2010) ‘Feminism, science and the anti-Enlightenment critiques’, in Women, knowledge and reality: explorations in feminist philosophy, eds A. Garry and M. Pearsall, Boston: Unwin Hyman, 298–320. Homans, G.C. 
(1974) Social Behavior: Its Elementary Forms, Harvard University Press, Cambridge, MA. Isbell, C., Kearns, M., Kormann, D., Singh, S. & Stone, P. (2000) ‘Cobot in LambdaMOO: A Social Statistics Agent’, paper given at the 17th National Conference on Artificial Intelligence (AAAI-00). Austin, Texas. Kilduff, M. and Tsai, W. (2003) Social Networks and Organizations, Sage, London. Kranzberg, M. (1986) ‘Technology and History: Kranzberg's Laws’, Technology and Culture vol. 27, no. 3, pp. 544-560. Latour, B. (2009). ‘Tarde’s idea of quantification’, in The Social After Gabriel Tarde: Debates and Assessments, ed M. Candea, London: Routledge, pp. 145-162.< http:// www.brunolatour.fr/articles/article/116-TARDE-CANDEA.pdf>. [19 June 2011]. Lazer, D., Pentland, A., Adamic, L., Aral, S., Barabási, A., Brewer, D.,Christakis, N., Contractor, N., Fowler, J.,Gutmann, M., Jebara, T., King, G., Macy, M., Roy, D., & Van Alstyne, M. (2009). ‘Computational Social Science’. Science vol. 323, pp. 721-3. Lewis, K., Kaufman, J., Gonzalez, M.,Wimmer, A., & Christakis, N. (2008) ‘Tastes, ties, and time: A new social network dataset using Facebook.com’, Social Networks vol. 30, pp. 330-342. Manovich, L. (2011) ‘Trending: The Promises and the Challenges of Big Social Data’, Debates in the Digital Humanities, ed M.K.Gold. The University of Minnesota Press, Minneapolis, MN <http://www.manovich.net/DOCS/Manovich trending paper.pdf>.[15 July 2011]. McCloskey, D. N. (1985) ‘From Methodology to Rhetoric’, In The Rhetoric of Economics au D. N. McCloskey, University of Wisconsin Press, Madison, pp. 20-35. Meeder, B., Tam, J., Gage Kelley, P., & Faith Cranor, L. (2010) ‘RT @IWantPrivacy: Widespread Violation of Privacy Settings in the Twitter Social Network’, Paper presented at Web 2.0 Security and Privacy, W2SP 2011, Oakland, CA. Meiss, M.R., Menczer, F., and A. Vespignani. (2008) ‘Structural analysis of behavioral networks from the Internet’, Journal of Physics A: Mathematical and Theoretical, vol. 41, no. 22, pp. 220224. Meyer D, Gaba, V., Colwell, K.A., (2005) ‘Organizing Far from Equilibrium: Nonlinear Change in Organizational Fields’, Organization Science, vol. 16, no. 5, pp.456-473. Moretti, F. (2007) Graphs, Maps, Trees: Abstract Models for a Literary History. Verso, London. 16 Paper to be presented at Oxford Internet Institute’s “A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society” on September 21, 2011. Onnela, J. P., Saramäki, J., Hyvönen, J., Szabó, G., Lazer, D., Kaski, K., & Kertész, J., Barabási, A.L. (2007) ‘Structure and tie strengths in mobile communication networks’, Proceedings from the National Academy of Sciences, vol.104, no.18, pp. 7332-7336. Pariser, E. (2011) The Filter Bubble: What the Internet is Hiding from You. Penguin Press, New York, NY. Radcliffe-Brown, A.R. (1940) ‘On Social Structure’, The Journal of the Royal Anthropological Institute of Great Britain and Ireland vol.70, no.1, pp.1–12. Reverby, S. M. (2009) Examining Tuskegee: The Infamous Syphilis Study and Its Legacy. University of North Carolina Press. Schrag, Z. M. (2010) Ethical Imperialism: Institutional Review Boards and the Social Sciences, 1965-2009. Johns Hopkins University Press, Baltimore, Maryland. Suchman, L. (2011) ‘Consuming Anthropology’, in Interdisicpinarity: Reconfigurations of the social and natural sciences, eds Andrew Barry and Georgina Born, Routledge, London and New York. Twitter. 
(2011) ‘One hundred million voices’, Twitter blog, <http://blog.twitter.com/2011/09/one-hundred-million-voices.html>. [12 September 2011] Veinot, T. (2007) ‘The Eyes of the Power Company: Workplace Information Practices of a Vault Inspector’, The Library Quarterly, vol.77, no.2, pp.157-180. Zimmer, M. (2008) ‘More on the ‘Anonymity’ of the Facebook Dataset – It’s Harvard College’, MichaelZimmer.org Blog, <http://www.michaelzimmer.org/2008/01/03/more-on-the-anonymityof-the-facebook-dataset-its-harvard-college/>. [20 June 2011]. By Cade Metz 02.25.13 Scott Yara, the co-founder of Greenplum, a company that seeks to reinvent data analysis under the aegis of tech giant EMC. Jeff Hammerbacher says that Facebook tried them all. And none of them did what the web giant needed them to do. Hammerbacher is the Harvard-trained mathematician Facebook hired in 2006. His job was to harness all the digital data generated by Mark Zuckerberg’s social network — to make sense of what people were doing on the service and find new ways of improving the thing. But as the service expanded to tens of millions of people, Hammerbacher remembers, it was generating more data than the company could possibly analyze with the software at hand: a good old-fashioned Oracle database. At the time, a long line of startups were offering a new breed of database designed to store and analyze much larger amounts of data. Greenplum. Vertica. Netezza. Hammerbacher and Facebook tested them all. But they weren’t suited to the task either. In the end, Facebook turned to a little-known open source software platform that had only just gotten off the ground at Yahoo. It was called Hadoop, and it was built to harness the power of thousands of ordinary computer servers. Unlike the Greenplums and the Verticas, Hammerbacher says, Hadoop could store and process the ever-expanding sea of data generated by what was quickly becoming the world’s most popular social network. Over the next few years, Hadoop reinvented data analysis not only at Facebook and Yahoo but at so many other web services. And then an army of commercial software vendors started selling the thing to the rest of the world. Soon, even the likes of Oracle and Greenplum were hawking Hadoop. These companies still treated Hadoop as an adjunct to the traditional database — as a tool suited only to certain types of data analysis. But now, that’s changing too. On Monday, Greenplum — now owned by tech giant EMC — revealed that it has spent the last two years building a new Hadoop platform that it believes will leave the traditional database behind. Known as Pivotal HD, this tool can store the massive amounts of information Hadoop was created to store, but it’s designed to ask questions of this data significantly faster than you can with the existing open source platform. “We think we’re on the verge of a major shift where businesses are looking at a set of canonical applications that can’t be easily run on existing data fabrics and relational databases,” says Paul Maritz, the former Microsoft exec who now oversees Greenplum. Businesses need a new data fabric, Maritz says, and the starting point for that fabric is Hadoop. That’s a somewhat surprising statement from a company whose original business was built around a relational database — software that stores data in neat rows and columns. 
But Greenplum and EMC are just acknowledging what Jeff Hammerbacher and Facebook learned so many years ago: Hadoop — for all its early faults — is well suited to storing and processing the massive amounts of data facing the modern business. What’s more, Greenplum is revamping Hadoop to operate more like a relational database, letting you rapidly ask questions of data using the structured query language, or SQL, which has been a staple of the database world for decades. “When we were acquired [by EMC], we really believed that the two worlds were going to fuse together,” says Greenplum co-founder Scott Yara. “What was going to be exciting is if you could take the massively parallel query processing technology in a database system [like Greenplum] and basically fuse it with the Hadoop platform.” The trouble with Hadoop has always been that it takes so much time to analyze data. It was a “batch system.” Using a framework called Hadoop MapReduce, you had the freedom to build all sorts of complex programs that crunch enormous amounts of data, but when you gave it a task, you could wait hours — or even days — for a response. With its new system, Greenplum has worked to change that. A team led by former Microsoft database designer Florian Waas has designed a new “query engine” that can more quickly run SQL queries on data stored across a massive cluster of systems using the Hadoop Distributed File System, or HDFS. Open source tools such as Hive have long provided ways of running SQL queries on Hadoop data, but this too was a batch system that needed a fair amount of time to complete queries. This query engine will make its debut later this year as part of Pivotal HD. Greenplum is now a key component of an EMC subsidiary called The Pivotal Initiative, which seeks to bring several new age web technologies and techniques to the average business. This time, Greenplum is in lock-step with Jeff Hammerbacher. After leaving Facebook, Hammerbacher helped found a Hadoop startup known as Cloudera, and late last year, he unveiled a system called Impala, which also seeks to run real-time queries atop Hadoop. But according to Waas and Yara, Pivotal HD is significantly faster than Impala and the many other tools that run SQL queries atop Hadoop. Yara claims that it’s at least 100 times faster than Impala. The caveat, says Waas, is that if a server crashes when Pivotal HD is running a query, you’re forced to restart the query. This is a little different from what people have come to expect when running jobs on Hadoop, which was specifically designed to keep running across a large cluster of servers even as individual machines started to fail — as they inevitably do. “The query extensions of Pivotal HD behave slightly differently in that they require a restart of the query when a machine is lost,” he says. “An individual query needs to be restarted but the integrity, accessibility and functionality of the system is guaranteed to continue. We consider this a small price to pay for several orders of magnitude performance enhancement as we do not materialize any results during processing.” The traditional database will always have its place. Even Greenplum will continue to offer its original data warehouse tool, which was based on the open source PostgreSQL database. But the company’s new query engine is yet another sign that Hadoop will continue to reinvent the way businesses crunch their data. Not just web giants. But any business. 
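For readers unfamiliar with the batch model the article contrasts with these new SQL engines, here is a minimal word count written in the Hadoop Streaming style: a mapper and reducer that read stdin and write tab-separated pairs, launched over HDFS data as a single batch job. This is a generic illustration rather than Greenplum, Hive, or Impala code, and the cluster paths shown in the comment are placeholders.

```python
#!/usr/bin/env python
# Minimal Hadoop Streaming-style word count, illustrating the batch MapReduce
# model discussed above (generic illustration; not Greenplum/Pivotal code).
# Hadoop sorts the mapper output by key before feeding it to the reducer.
#
# Placeholder cluster invocation:
#   hadoop jar hadoop-streaming.jar \
#     -input /data/raw -output /data/wordcount \
#     -mapper "python wordcount.py map" -reducer "python wordcount.py reduce" \
#     -file wordcount.py
import sys

def mapper():
    for line in sys.stdin:
        for word in line.split():
            print(f"{word.lower()}\t1")

def reducer():
    current, count = None, 0
    for line in sys.stdin:
        word, n = line.rstrip("\n").split("\t")
        if word != current:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = word, 0
        count += int(n)
    if current is not None:
        print(f"{current}\t{count}")

if __name__ == "__main__":
    # Local test: cat somefile.txt | python wordcount.py map | sort | python wordcount.py reduce
    mapper() if sys.argv[1:] == ["map"] else reducer()
```

Even a job this small runs as one pass over the full dataset, which is the latency that interactive SQL-on-HDFS engines such as the one described above aim to avoid.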
Update: This story has been updated with additional comment from Florian Waas and to clarify how Pivotal HD deals with hardware failures. March 10, 2013 By STEVE LOHR Trading stocks, targeting ads, steering political campaigns, arranging dates, besting people on “Jeopardy” and even choosing bra sizes: computer algorithms are doing all this work and more. But increasingly, behind the curtain there is a decidedly retro helper — a human being. Although algorithms are growing ever more powerful, fast and precise, the computers themselves are literal-minded, and context and nuance often elude them. Capable as these machines are, they are not always up to deciphering the ambiguity of human language and the mystery of reasoning. Yet these days they are being asked to be more humanlike in what they figure out. “For all their brilliance, computers can be thick as a brick,” said Tom M. Mitchell, a computer scientist at Carnegie Mellon University. And so, while programming experts still write the step-by-step instructions of computer code, additional people are needed to make more subtle contributions as the work the computers do has become more involved. People evaluate, edit or correct an algorithm’s work. Or they assemble online databases of knowledge and check and verify them — creating, essentially, a crib sheet the computer can call on for a quick answer. Humans can interpret and tweak information in ways that are understandable to both computers and other humans. Question-answering technologies like Apple’s Siri and I.B.M.’s Watson rely particularly on the emerging machine-man collaboration. Algorithms alone are not enough. Twitter uses a far-flung army of contract workers, whom it calls judges, to interpret the meaning and context of search terms that suddenly spike in frequency on the service. For example, when Mitt Romney talked of cutting government money for public broadcasting in a presidential debate last fall and mentioned Big Bird, messages with that phrase surged. Human judges recognized instantly that “Big Bird,” in that context and at that moment, was mainly a political comment, not a reference to “Sesame Street,” and that politics-related messages should pop up when someone searched for “Big Bird.” People can understand such references more accurately and quickly than software can, and their judgments are fed immediately into Twitter’s search algorithm. “Humans are core to this system,” two Twitter engineers wrote in a blog post in January. Even at Google, where algorithms and engineers reign supreme in the company’s business and culture, the human contribution to search results is increasing. Google uses human helpers in two ways. Several months ago, it began presenting summaries of information on the right side of a search page when a user typed in the name of a well-known person or place, like “Barack Obama” or “New York City.” These summaries draw from databases of knowledge like Wikipedia, the C.I.A. World Factbook and Freebase, whose parent company, Metaweb, Google acquired in 2010. These databases are edited by humans. When Google’s algorithm detects a search term for which this distilled information is available, the search engine is trained to go fetch it rather than merely present links to Web pages. “There has been a shift in our thinking,” said Scott Huffman, an engineering director in charge of search quality at Google. 
“A part of our resources are now more human curated.” Other human helpers, known as evaluators or raters, help Google develop tweaks to its search algorithm, a powerhouse of automation, fielding 100 billion queries a month. “Our engineers evolve the algorithm, and humans help us see if a suggested change is really an improvement,” Mr. Huffman said. Katherine Young, 23, is a Google rater — a contract worker and a college student in Macon, Ga. She is shown an ambiguous search query like “what does king hold,” presented with two sets of Google search results and asked to rate their relevance, accuracy and quality. The current search result for that imprecise phrase starts with links to Web pages saying that kings typically hold ceremonial scepters, a reasonable inference. Her judgments, Ms. Young said, are “not completely black and white; some of it is subjective.” She added, “You try to put yourself in the shoes of the person who typed in the query.” I.B.M.’s Watson, the powerful question-answering computer that defeated “Jeopardy” champions two years ago, is in training these days to help doctors make diagnoses. But it, too, is turning to humans for help. To prepare for its role in assisting doctors, Watson is being fed medical texts, scientific papers and digital patient records stripped of personal identifying information. Instead of answering questions, however, Watson is asking them of clinicians at the Cleveland Clinic and medical school students. They are giving answers and correcting the computer’s mistakes, using a “Teach Watson” feature. Watson, for example, might come across this question in a medical text: “What neurological condition contraindicates the use of bupropion?” The software may have bupropion, an antidepressant, in its database, but stumble on “contraindicates.” A human helper will confirm that the word means “do not use,” and Watson returns to its data trove to reason that the neurological condition is seizure disorder. “We’re using medical experts to help Watson learn, make it smarter going forward,” said Eric Brown, a scientist on I.B.M.’s Watson team. Ben Taylor, 25, is a product manager at FindTheBest, a fast-growing start-up in Santa Barbara, Calif. The company calls itself a “comparison engine” for finding and comparing more than 100 topics and products, from universities to nursing homes, smartphones to dog breeds. Its Web site went up in 2010, and the company now has 60 full-time employees. Mr. Taylor helps design and edit the site’s education pages. He is not an engineer, but an English major who has become a self-taught expert in the arcane data found in Education Department studies and elsewhere. His research methods include talking to and e-mailing educators. He is an information sleuth. On FindTheBest, more than 8,500 colleges can be searched quickly according to geography, programs and tuition costs, among other criteria. Go to the page for a university, and a wealth of information appears in summaries, charts and graphics — down to the gender and race breakdowns of the student body and faculty. Mr. Taylor and his team write the summaries and design the initial charts and graphs. From hundreds of data points on college costs, for example, they select the ones most relevant to college students and their parents. But much of their information is prepared in templates and tagged with code a computer can read. So the process has become more automated, with Mr. Taylor and others essentially giving “go fetch” commands that the computer algorithm obeys. 
The algorithms are getting better. But they cannot do it alone. “You need judgment, and to be able to intuitively recognize the smaller sets of data that are most important,” Mr. Taylor said. “To do that, you need some level of human involvement.” How Complex Systems Fail (Being a Short Treatise on the Nature of Failure; How Failure is Evaluated; How Failure is Attributed to Proximate Cause; and the Resulting New Understanding of Patient Safety) Richard I. Cook, MD, Cognitive Technologies Laboratory, University of Chicago 1) Complex systems are intrinsically hazardous systems. All of the interesting systems (e.g. transportation, healthcare, power generation) are inherently and unavoidably hazardous by their own nature. The frequency of hazard exposure can sometimes be changed but the processes involved in the system are themselves intrinsically and irreducibly hazardous. It is the presence of these hazards that drives the creation of defenses against hazard that characterize these systems. 2) Complex systems are heavily and successfully defended against failure. The high consequences of failure lead over time to the construction of multiple layers of defense against failure. These defenses include obvious technical components (e.g. backup systems, ‘safety’ features of equipment) and human components (e.g. training, knowledge) but also a variety of organizational, institutional, and regulatory defenses (e.g. policies and procedures, certification, work rules, team training). The effect of these measures is to provide a series of shields that normally divert operations away from accidents. 3) Catastrophe requires multiple failures – single point failures are not enough. The array of defenses works. System operations are generally successful. Overt catastrophic failure occurs when small, apparently innocuous failures join to create opportunity for a systemic accident. Each of these small failures is necessary to cause catastrophe but only the combination is sufficient to permit failure. Put another way, there are many more failure opportunities than overt system accidents. Most initial failure trajectories are blocked by designed system safety components. Trajectories that reach the operational level are mostly blocked, usually by practitioners. 4) Complex systems contain changing mixtures of failures latent within them. The complexity of these systems makes it impossible for them to run without multiple flaws being present. Because these are individually insufficient to cause failure they are regarded as minor factors during operations. Eradication of all latent failures is limited primarily by economic cost but also because it is difficult before the fact to see how such failures might contribute to an accident. The failures change constantly because of changing technology, work organization, and efforts to eradicate failures. 5) Complex systems run in degraded mode. A corollary to the preceding point is that complex systems run as broken systems. The system continues to function because it contains so many redundancies and because people can make it function, despite the presence of many flaws. After-accident reviews nearly always note that the system has a history of prior ‘proto-accidents’ that nearly generated catastrophe. Arguments that these degraded conditions should have been recognized before the overt accident are usually predicated on naïve notions of system performance. 
System operations are dynamic, with components (organizational, human, technical) failing and being replaced continuously. 6) Catastrophe is always just around the corner. Complex systems possess potential for catastrophic failure. Human practitioners are nearly always in close physical and temporal proximity to these potential failures – disaster can occur at any time and in nearly any place. The potential for catastrophic outcome is a hallmark of complex systems. It is impossible to eliminate the potential for such catastrophic failure; the potential for such failure is always present by the system’s own nature. 7) Post-accident attribution of accidents to a ‘root cause’ is fundamentally wrong. Because overt failure requires multiple faults, there is no isolated ‘cause’ of an accident. There are multiple contributors to accidents. Each of these is necessary but insufficient in itself to create an accident. Only jointly are these causes sufficient to create an accident. Indeed, it is the linking of these causes together that creates the circumstances required for the accident. Thus, no isolation of the ‘root cause’ of an accident is possible. Evaluations based on such reasoning as ‘root cause’ do not reflect a technical understanding of the nature of failure but rather the social, cultural need to blame specific, localized forces or events for outcomes. [1] 8) Hindsight biases post-accident assessments of human performance. Knowledge of the outcome makes it seem that events leading to the outcome should have appeared more salient to practitioners at the time than was actually the case. This means that ex post facto accident analysis of human performance is inaccurate. The outcome knowledge poisons the ability of after-accident observers to recreate the view that practitioners had of those same factors before the accident. It seems that practitioners “should have known” that the factors would “inevitably” lead to an accident. [2] Hindsight bias remains the primary obstacle to accident investigation, especially when expert human performance is involved. 9) Human operators have dual roles: as producers & as defenders against failure. The system practitioners operate the system in order to produce its desired product and also work to forestall accidents. This dynamic quality of system operation, the balancing of demands for production against the possibility of incipient failure, is unavoidable. Outsiders rarely acknowledge the duality of this role. In non-accident-filled times, the production role is emphasized. After accidents, the defense against failure role is emphasized. At either time, the outsider’s view misapprehends the operator’s constant, simultaneous engagement with both roles. 10) All practitioner actions are gambles. After accidents, the overt failure often appears to have been inevitable and the practitioner’s actions as blunders or deliberate willful disregard of certain impending failure. But all practitioner actions are actually gambles, that is, acts that take place in the face of uncertain outcomes. The degree of uncertainty may change from moment to moment. That practitioner actions are gambles appears clear after accidents; in general, post hoc analysis regards these gambles as poor ones. But the converse, that successful outcomes are also the result of gambles, is not widely appreciated. Notes: [1] Anthropological field research provides the clearest demonstration of the social construction of the notion of ‘cause’; cf. Goldman L (1993), The Culture of Coincidence: Accident and Absolute Liability in Huli, New York: Clarendon Press; and also Tasca L (1990), The Social Construction of Human Error, unpublished doctoral dissertation, Department of Sociology, State University of New York at Stony Brook. [2] This is not a feature of medical judgements or technical ones, but rather of all human cognition about past events and their causes. 11) Actions at the sharp end resolve all ambiguity. Organizations are ambiguous, often intentionally, about the relationship between production targets, efficient use of resources, economy and costs of operations, and acceptable risks of low and high consequence accidents. All ambiguity is resolved by actions of practitioners at the sharp end of the system. After an accident, practitioner actions may be regarded as ‘errors’ or ‘violations’ but these evaluations are heavily biased by hindsight and ignore the other driving forces, especially production pressure. 12) Human practitioners are the adaptable element of complex systems. Practitioners and first line management actively adapt the system to maximize production and minimize accidents. These adaptations often occur on a moment by moment basis. Some of these adaptations include: (1) Restructuring the system in order to reduce exposure of vulnerable parts to failure. (2) Concentrating critical resources in areas of expected high demand. (3) Providing pathways for retreat or recovery from expected and unexpected faults. (4) Establishing means for early detection of changed system performance in order to allow graceful cutbacks in production or other means of increasing resiliency. 13) Human expertise in complex systems is constantly changing. Complex systems require substantial human expertise in their operation and management. This expertise changes in character as technology changes but it also changes because of the need to replace experts who leave. In every case, training and refinement of skill and expertise is one part of the function of the system itself. At any moment, therefore, a given complex system will contain practitioners and trainees with varying degrees of expertise. Critical issues related to expertise arise from (1) the need to use scarce expertise as a resource for the most difficult or demanding production needs and (2) the need to develop expertise for future use. 14) Change introduces new forms of failure. The low rate of overt accidents in reliable systems may encourage changes, especially the use of new technology, to decrease the number of low consequence but high frequency failures. These changes may actually create opportunities for new, low frequency but high consequence failures. When new technologies are used to eliminate well understood system failures or to gain high precision performance they often introduce new pathways to large scale, catastrophic failures. Not uncommonly, these new, rare catastrophes have even greater impact than those eliminated by the new technology. These new forms of failure are difficult to see before the fact; attention is paid mostly to the putative beneficial characteristics of the changes. 
Because these new, high consequence accidents occur at a low rate, multiple system changes may occur before an accident, making it hard to see the contribution of technology to the failure. 15) Views of ‘cause’ limit the effectiveness of defenses against future events. Post-accident remedies for “human error” are usually predicated on obstructing activities that can “cause” accidents. These end-of-the-chain measures do little to reduce the likelihood of further accidents. In fact, the likelihood of an identical accident is already extraordinarily low because the pattern of latent failures changes constantly. Instead of increasing safety, post-accident remedies usually increase the coupling and complexity of the system. This increases the potential number of latent failures and also makes the detection and blocking of accident trajectories more difficult. 16) Safety is a characteristic of systems and not of their components. Safety is an emergent property of systems; it does not reside in a person, device or department of an organization or system. Safety cannot be purchased or manufactured; it is not a feature that is separate from the other components of the system. This means that safety cannot be manipulated like a feedstock or raw material. The state of safety in any system is always dynamic; continuous systemic change ensures that hazard and its management are constantly changing. 17) People continuously create safety. Failure-free operations are the result of activities of people who work to keep the system within the boundaries of tolerable performance. These activities are, for the most part, part of normal operations and superficially straightforward. But because system operations are never trouble free, human practitioner adaptations to changing conditions actually create safety from moment to moment. These adaptations often amount to just the selection of a well-rehearsed routine from a store of available responses; sometimes, however, the adaptations are novel combinations or de novo creations of new approaches. 18) Failure-free operations require experience with failure. Recognizing hazard and successfully manipulating system operations to remain inside the tolerable performance boundaries requires intimate contact with failure. More robust system performance is likely to arise in systems where operators can discern the “edge of the envelope”. This is where system performance begins to deteriorate, becomes difficult to predict, or cannot be readily recovered. In intrinsically hazardous systems, operators are expected to encounter and appreciate hazards in ways that lead to overall performance that is desirable. Improved safety depends on providing operators with calibrated views of the hazards. It also depends on providing calibration about how their actions move system performance towards or away from the edge of the envelope. Other materials: Cook, Render, Woods (2000). Gaps in the continuity of care and progress on patient safety. British Medical Journal 320: 791-4. Cook (1999). A Brief Look at the New Look in error, safety, and failure of complex systems. (Chicago: CtL). Woods & Cook (1999). Perspectives on Human Error: Hindsight Biases and Local Rationality. In Durso, Nickerson, et al., eds., Handbook of Applied Cognition. (New York: Wiley) pp. 141-171. Woods & Cook (1998). Characteristics of Patient Safety: Five Principles that Underlie Productive Work. 
(Chicago: CtL). Cook & Woods (1994), “Operating at the Sharp End: The Complexity of Human Error,” in MS Bogner, ed., Human Error in Medicine, Hillsdale, NJ; pp. 255-310. Woods, Johannesen, Cook, & Sarter (1994), Behind Human Error: Cognition, Computers and Hindsight, Wright Patterson AFB: CSERIAC. Cook, Woods, & Miller (1998), A Tale of Two Stories: Contrasting Views of Patient Safety, Chicago, IL: NPSF (available as PDF file on the NPSF web site at www.npsf.org). Copyright © 1998, 1999, 2000 by R.I. Cook, MD, for CtL. Revision D (00.04.21). OP-ED COLUMNIST What Data Can’t Do By DAVID BROOKS Published: February 18, 2013 Not long ago, I was at a dinner with the chief executive of a large bank. He had just had to decide whether to pull out of Italy, given the weak economy and the prospect of a future euro crisis. The C.E.O. had his economists project out a series of downside scenarios and calculate what they would mean for his company. But, in the end, he made his decision on the basis of values. His bank had been in Italy for decades. He didn’t want Italians to think of the company as a fair-weather friend. He didn’t want people inside the company thinking they would cut and run when times got hard. He decided to stay in Italy and ride out any potential crisis, even with the short-term costs. He wasn’t oblivious to data in making this decision, but ultimately, he was guided by a different way of thinking. And, of course, he was right to be. Commerce depends on trust. Trust is reciprocity coated by emotion. People and companies that behave well in tough times earn affection and self-respect that is extremely valuable, even if it is hard to capture in data. I tell this story because it hints at the strengths and limitations of data analysis. The big novelty of this historic moment is that our lives are now mediated through data-collecting computers. In this world, data can be used to make sense of mind-bogglingly complex situations. Data can help compensate for our overconfidence in our own intuitions and can help reduce the extent to which our desires distort our perceptions. But there are many things big data does poorly. Let’s note a few in rapid-fire fashion: Data struggles with the social. Your brain is pretty bad at math (quick, what’s the square root of 437), but it’s excellent at social cognition. People are really good at mirroring each other’s emotional states, at detecting uncooperative behavior and at assigning value to things through emotion. Computer-driven data analysis, on the other hand, excels at measuring the quantity of social interactions but not the quality. 
Network scientists can map your interactions with the six co-workers you see during 76 percent of your days, but they can’t capture your devotion to the childhood friends you see twice a year, let alone Dante’s love for Beatrice, whom he met twice. Therefore, when making decisions about social relationships, it’s foolish to swap the amazing machine in your skull for the crude machine on your desk. Data struggles with context. Human decisions are not discrete events. They are embedded in sequences and contexts. The human brain has evolved to account for this reality. People are really good at telling stories that weave together multiple causes and multiple contexts. Data analysis is pretty bad at narrative and emergent thinking, and it cannot match the explanatory suppleness of even a mediocre novel. Data creates bigger haystacks. This is a point Nassim Taleb, the author of “Antifragile,” has made. As we acquire more data, we have the ability to find many, many more statistically significant correlations. Most of these correlations are spurious and deceive us when we’re trying to understand a situation. Falsity grows exponentially the more data we collect. The haystack gets bigger, but the needle we are looking for is still buried deep inside. One of the features of the era of big data is the number of “significant” findings that don’t replicate: the expansion, as Nate Silver would say, of noise to signal. Big data has trouble with big problems. If you are trying to figure out which e-mail produces the most campaign contributions, you can do a randomized control experiment. But let’s say you are trying to stimulate an economy in a recession. You don’t have an alternate society to use as a control group. For example, we’ve had huge debates over the best economic stimulus, with mountains of data, and as far as I know not a single major player in this debate has been persuaded by data to switch sides. Data favors memes over masterpieces. Data analysis can detect when large numbers of people take an instant liking to some cultural product. But many important (and profitable) products are hated initially because they are unfamiliar. Data obscures values. I recently saw an academic book with the excellent title, “ ‘Raw Data’ Is an Oxymoron.” One of the points was that data is never raw; it’s always structured according to somebody’s predispositions and values. The end result looks disinterested, but, in reality, there are value choices all the way through, from construction to interpretation. This is not to argue that big data isn’t a great tool. It’s just that, like any tool, it’s good at some things and not at others. As the Yale professor Edward Tufte has said, “The world is much more interesting than any one discipline.” By Natalie Wolchover, Simons Science News 02.06.13 9:30 AM In Cuernavaca, Mexico, a “spy” network makes the decentralized bus system more efficient. As a consequence, the departure times of buses exhibit a ubiquitous pattern known as “universality.” (Photo: Marco de Leija) In 1999, while sitting at a bus stop in Cuernavaca, Mexico, a Czech physicist named Petr Šeba noticed young men handing slips of paper to the bus drivers in exchange for cash. It wasn’t organized crime, he learned, but another shadow trade: Each driver paid a “spy” to record when the bus ahead of his had departed the stop. If it had left recently, he would slow down, letting passengers accumulate at the next stop. If it had departed long ago, he sped up to keep other buses from passing him. 
This system maximized profits for the drivers. And it gave Šeba an idea. “We felt here some kind of similarity with quantum chaotic systems,” explained Šeba’s co-author, Milan Krbálek, in an email. Original story reprinted with permission from Simons Science News, an editorially independent division of SimonsFoundation.org whose mission is to enhance public understanding of science by covering research developments and trends in mathematics and the computational, physical and life sciences. After several failed attempts to talk to the spies himself, Šeba asked his student to explain to them that he wasn’t a tax collector, or a criminal — he was simply a “crazy” scientist willing to trade tequila for their data. The men handed over their used papers. When the researchers plotted thousands of bus departure times on a computer, their suspicions were confirmed: The interaction between drivers caused the spacing between departures to exhibit a distinctive pattern previously observed in quantum physics experiments. “I was thinking that something like this could come out, but I was really surprised that it comes exactly,” Šeba said. Subatomic particles have little to do with decentralized bus systems. But in the years since the odd coupling was discovered, the same pattern has turned up in other unrelated settings. Scientists now believe the widespread phenomenon, known as “universality,” stems from an underlying connection to mathematics, and it is helping them to model complex systems from the internet to Earth’s climate. The red pattern exhibits a precise balance of randomness and regularity known as “universality,” which has been observed in the spectra of many complex, correlated systems. In this spectrum, a mathematical formula called the “correlation function” gives the exact probability of finding two lines spaced a given distance apart. (Illustration: Simons Science News) The pattern was first discovered in nature in the 1950s in the energy spectrum of the uranium nucleus, a behemoth with hundreds of moving parts that quivers and stretches in infinitely many ways, producing an endless sequence of energy levels. In 1972, the number theorist Hugh Montgomery observed it in the zeros of the Riemann zeta function, a mathematical object closely related to the distribution of prime numbers. In 2000, Krbálek and Šeba reported it in the Cuernavaca bus system. And in recent years it has shown up in spectral measurements of composite materials, such as sea ice and human bones, and in signal dynamics of the Erdös–Rényi model, a simplified version of the internet named for Paul Erdös and Alfréd Rényi. Each of these systems has a spectrum — a sequence like a bar code representing data such as energy levels, zeta zeros, bus departure times or signal speeds. In all the spectra, the same distinctive pattern appears: The data seem haphazardly distributed, and yet neighboring lines repel one another, lending a degree of regularity to their spacing. This fine balance between chaos and order, which is defined by a precise formula, also appears in a purely mathematical setting: It defines the spacing between the eigenvalues, or solutions, of a vast matrix filled with random numbers. “Why so many physical systems behave like random matrices is still a mystery,” said Horng-Tzer Yau, a mathematician at Harvard University. 
“But in the past three years, we have made a very important step in our understanding.” By investigating the “universality” phenomenon in random matrices, researchers have developed a better sense of why it arises elsewhere — and how it can be used. In a flurry of recent papers, Yau and other mathematicians have characterized many new types of random matrices, which can conform to a variety of numerical distributions and symmetry rules. For example, the numbers filling a matrix’s rows and columns might be chosen from a bell curve of possible values, or they might simply be 1s and –1s. The top right and bottom left halves of the matrix might be mirror images of one another, or not. Time and again, regardless of their specific characteristics, the random matrices are found to exhibit that same chaotic yet regular pattern in the distribution of their eigenvalues. That’s why mathematicians call the phenomenon “universality.” “It seems to be a law of nature,” said Van Vu, a mathematician at Yale University who, with Terence Tao of the University of California, Los Angeles, has proven universality for a broad class of random matrices. Universality is thought to arise when a system is very complex, consisting of many parts that strongly interact with each other to generate a spectrum. The pattern emerges in the spectrum of a random matrix, for example, because the matrix elements all enter into the calculation of that spectrum. But random matrices are merely “toy systems” that are of interest because they can be rigorously studied, while also being rich enough to model real-world systems, Vu said. Universality is much more widespread. Wigner’s hypothesis (named after Eugene Wigner, the physicist who discovered universality in atomic spectra) asserts that all complex, correlated systems exhibit universality, from a crystal lattice to the internet. The more complex a system is, the more robust its universality should be, said László Erdös of the University of Munich, one of Yau’s collaborators. “This is because we believe that universality is the typical behavior.” Mathematicians are using random matrix models to study and predict some of the internet’s properties, such as the size of typical computer clusters. (Illustration: Matt Britt) In many simple systems, individual components can assert too great an influence on the outcome of the system, changing the spectral pattern. With larger systems, no single component dominates. “It’s like if you have a room with a lot of people and they decide to do something, the personality of one person isn’t that important,” Vu said. Whenever a system exhibits universality, the behavior acts as a signature certifying that the system is complex and correlated enough to be treated like a random matrix. “This means you can use a random matrix to model it,” Vu said. “You can compute other parameters of the matrix model and use them to predict that the system may behave like the parameters you computed.” This technique is enabling scientists to understand the structure and evolution of the internet. Certain properties of this vast computer network, such as the typical size of a cluster of computers, can be closely estimated by measurable properties of the corresponding random matrix. “People are very interested in clusters and their locations, partially motivated by practical purposes such as advertising,” Vu said. A similar technique may lead to improvements in climate change models. 
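A minimal numerical sketch may make the spacing statistic concrete. This is an editor’s illustration, not code from any of the researchers quoted here, and every identifier in it is invented. It samples many 2×2 real symmetric matrices with Gaussian entries (a miniature of the Gaussian Orthogonal Ensemble), computes the gap between each matrix’s two eigenvalues, rescales the gaps to unit mean, and histograms them. The histogram should track the Wigner surmise p(s) = (π s / 2) exp(−π s² / 4), which vanishes at s = 0: neighboring levels repel, which is the balance of randomness and regularity the article calls universality.

/* Editor's sketch (not from the article): empirical check of the random-matrix
 * spacing law using 2x2 symmetric Gaussian matrices. Diagonal entries ~ N(0,1),
 * off-diagonal ~ N(0,1/2). The eigenvalue gap, rescaled to unit mean, should
 * approximately follow the Wigner surmise p(s) = (pi*s/2)*exp(-pi*s^2/4).
 * Compile with: cc wigner.c -lm
 */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

static double gauss(void) {               /* Box-Muller: one N(0,1) sample */
    double u1 = (rand() + 1.0) / (RAND_MAX + 2.0);
    double u2 = (rand() + 1.0) / (RAND_MAX + 2.0);
    return sqrt(-2.0 * log(u1)) * cos(2.0 * M_PI * u2);
}

int main(void) {
    enum { N = 200000, BINS = 30 };
    static double gap[N];
    long hist[BINS] = {0};
    double mean = 0.0;
    int i, k;

    for (i = 0; i < N; i++) {
        double a = gauss(), c = gauss();
        double b = gauss() / sqrt(2.0);   /* variance 1/2 off the diagonal */
        /* eigenvalues of [[a,b],[b,c]] differ by sqrt((a-c)^2 + 4b^2) */
        gap[i] = sqrt((a - c) * (a - c) + 4.0 * b * b);
        mean += gap[i];
    }
    mean /= N;

    for (i = 0; i < N; i++) {             /* histogram of normalized spacings */
        double s = gap[i] / mean;
        k = (int)(s / 3.0 * BINS);        /* bins cover s in [0, 3) */
        if (k >= 0 && k < BINS) hist[k]++;
    }
    for (k = 0; k < BINS; k++) {
        double s = (k + 0.5) * 3.0 / BINS;
        double empirical = hist[k] / (N * 3.0 / BINS);
        double wigner = 0.5 * M_PI * s * exp(-M_PI * s * s / 4.0);
        printf("s=%.2f  empirical=%.3f  wigner=%.3f\n", s, empirical, wigner);
    }
    return 0;
}

The 2×2 case is used only because its eigenvalue gap has a closed form; for large N-by-N matrices essentially the same curve emerges whether the entries are Gaussians, coin flips, or bus-departure intervals, which is what makes the pattern “universal.”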
Scientists have found that the presence of universality in features similar to the energy spectrum of a material indicates that its components are highly connected, and that it will therefore conduct fluids, electricity or heat. Conversely, the absence of universality may show that a material is sparse and acts as an insulator. In new work presented in January at the Joint Mathematics Meetings in San Diego, Ken Golden, a mathematician at the University of Utah, and his student, Ben Murphy, used this distinction to predict heat transfer and fluid flow in sea ice, both at the microscopic level and through patchworks of Arctic melt ponds spanning thousands of kilometers. When Arctic melt ponds are sufficiently connected, as pictured here, they exhibit a property called universality that researchers believe is common to all complex, correlated systems. (Photo: Don Perovich) The spectral measure of a mosaic of melt ponds, taken from a helicopter, or a similar measurement taken of a sample of sea ice in an ice core, instantly exposes the state of either system. “Fluid flow through sea ice governs or mediates very important processes that you need to understand in order to understand the climate system,” Golden said. “The transitions in the eigenvalue statistics presents a brand new, mathematically rigorous approach to incorporating sea ice into climate models.” The same trick may also eventually provide an easy test for osteoporosis. Golden, Murphy and their colleagues have found that the spectrum of a dense, healthy bone exhibits universality, while that of a porous, osteoporotic bone does not. “We’re dealing with systems where the ‘particles’ can be on the millimeter or even on the kilometer scale,” Murphy said, referring to the systems’ component parts. “It’s amazing that the same underlying mathematics describes both.” The reason a real-world system would exhibit the same spectral behavior as a random matrix may be easiest to understand in the case of the nucleus of a heavy atom. All quantum systems, including atoms, are governed by the rules of mathematics, and specifically by those of matrices. “That’s what quantum mechanics is all about,” said Freeman Dyson, a retired mathematical physicist who helped develop random matrix theory in the 1960s and 1970s while at Princeton’s Institute for Advanced Study. “Every quantum system is governed by a matrix representing the total energy of the system, and the eigenvalues of the matrix are the energy levels of the quantum system.” The matrices behind simple atoms, such as hydrogen or helium, can be worked out exactly, yielding eigenvalues that correspond with stunning precision to the measured energy levels of the atoms. But the matrices corresponding to more complex quantum systems, such as a uranium nucleus, quickly grow too thorny to grasp. According to Dyson, this is why such nuclei can be compared to random matrices. Many of the interactions inside uranium — the elements of its unknown matrix — are so complex that they become washed out, like a mélange of sounds blending into noise. Consequently, the unknown matrix that governs the nucleus behaves like a matrix filled with random numbers, and so its spectrum exhibits universality. Scientists have yet to develop an intuitive understanding of why this particular random-yet-regular pattern, and not some other pattern, emerges for complex systems. “We only know it from calculations,” Vu said. 
Another mystery is what it has to do with the Riemann zeta function, whose spectrum of zeros exhibits universality. The zeros of the zeta function are closely tied to the distribution of the prime numbers — the irreducible integers out of which all others are constructed. Mathematicians have long wondered at the haphazard way in which the primes are sprinkled along the number line from one to infinity, and universality offers a clue. Some think there may be a matrix underlying the Riemann zeta function that is complex and correlated enough to exhibit universality. Discovering such a matrix would have “big implications” for finally understanding the distribution of the primes, said Paul Bourgade, a mathematician at Harvard. Or perhaps the explanation lies deeper still. “It may happen that it is not a matrix that lies at the core of both Wigner’s universality and the zeta function, but some other, yet undiscovered, mathematical structure,” Erdös said. “Wigner matrices and zeta functions may then just be different representations of this structure.” Many mathematicians are searching for the answer, with no guarantee that there is one. “Nobody imagined that the buses in Cuernavaca would turn out to be an example of this. Nobody imagined that the zeroes of the zeta function would be another example,” Dyson said. “The beauty of science is it’s completely unpredictable, and so everything useful comes out of surprises.” FQXI ARTICLE March 12, 2013 Ideas inspired by microscopic physics and magnetism could one day help predict the spread of disease, financial markets, and the fates of Facebook friendships. by Graeme Stemp Morlock November 6, 2012 Your grade ten math teacher probably wrote this several times on your tests: SIMPLIFY. And, for much of science, that’s part of the work: SIMPLIFY. The universe can be broken down into smaller and smaller chunks in an attempt to find its most basic level and functions. But what do you do when that doesn’t work? Complex systems that defy reduction are all around us, from the elaborate workings of an ant colony—which could never be predicted from the physiology of a single ant—to fluctuations in the financial system that can send ripples around the globe. When broken into their constituent pieces, examined and put back together, such systems do not behave as expected. The sum of the parts does not equal the whole. Surprisingly, the best way to analyze these decidedly large-scale systems may be by exploiting techniques first developed not in biology or economics, but in microscopic physics. Raissa D’Souza, a complexity scientist at UC Davis and an external professor at the Santa Fe Institute, is applying lessons learned from studying how physical systems go through phase transitions—becoming magnetized, for instance—to try to predict when everyday networks will go through potentially catastrophic changes. Her work has implications for the spread of disease, sustaining the energy infrastructure, the financial health of countries, and for the way we connect with our friends in online communities. While completing her PhD in statistical physics at MIT in the 1990s, D’Souza became fascinated with complex systems and the behavior patterns that emerge from them. 
Since she did not know of anyone who specialized in the subject, she went to the library and searched the entire Boston area for someone who did, before finding Norm Margolus, who it turned out was handily also at MIT and with whom she studied pattern formation and computing in natural systems. D’Souza’s background in statistical physics introduced her to the prototypical phase transition. It considers a collection of atoms, each with a magnetic moment, that could either line up with each other—so that the overall system becomes magnetized—or remain in a disordered mess. There is a tension in this case: on the one hand, the atoms want to line up, lowering the system’s energy; on the other hand, the laws of thermodynamics tell us that systems prefer to move to a state of increasing disorder, mathematically expressed as having a higher entropy. It was first discovered experimentally that the outcome depends on temperature. At high temperatures entropy rules, the atoms remain disordered and the system does not become magnetized. But below some critical temperature, the system undergoes a phase transition and the atoms align. That sounds simple enough, but the phase transitions that change a system’s behaviour so profoundly are often unpredictable, especially if you only study the system in its smallest components. How, for instance, could you predict what the critical temperature would be, in theory, if you only focus your attention down onto one isolated atom? Instead, you’ve got to see the big picture. And sometimes that picture is very big. The Power of Networking Taking a step back, D’Souza sees everything as being interconnected. Her job is to work out when linked objects or entities will go through profound phase transitions, which could lead to a negative (or sometimes positive) outcome. For instance, the United States power grid was once a collection of small isolated grids, made of a few power plants run by some municipality or corporation. Then, local grids were connected to create state-wide and regional grids that remained largely independent. Distinct regions were then interconnected to allow power transfer in emergency situations. But, with deregulation, those interconnections now transfer massive amounts of power bought and sold on power auction markets each day. As D’Souza points out, this interdependence has changed networks in ways that were originally never intended, leading to unforeseen bad consequences. The U.S. power grid has grown to a point where it has seemingly encountered a phase transition and now occasionally suffers from large cascading blackouts, where a problem in one area can drag many areas down. Worse, a failure in one network can actually drag down many different networks. So a failure in the power grid can cause a failure in the telecommunications grid which causes a failure in the transportation grid and the impact keeps rippling through time and space. Speaking at FQXi’s Setting Time Aright meeting, D’Souza discussed the conception of time that emerges from considering examples of interconnected networks in terms of complexity theory: "In the last 3-4 years, I’ve been working to take the ideas from single networks, such as the structure of the network and the dynamics happening on top of the network substrate, and extending it to this bigger context where we have networks of networks," says D’Souza. "I’ve been trying to understand what it does to emergent properties like phase transitions and what it means for the vulnerability of these systems. 
So, if there is a small ripple in one layer can it go into other layers and how long does it take?" Understanding how networks interconnect and evolve has huge implications for public health, for instance. D’Souza cites the example of pandemics, where infection rates have changed drastically over time based on advancements in our transportation networks. In the Middle Ages, the bubonic plague took years to spread across Europe, for example; by contrast the Spanish flu pandemic of 1918 killed over 20 million people across the globe, taking only a matter of weeks or months to spread. But now, with the arrival of mass air travel, it only takes hours for SARS, swine flu or bird flu to reach new countries. Pinpointing the critical point of a phase transition is not easy in the world of networked networks, but part of D’Souza’s work has been to find a generalised model, or set of equations that will apply to many different examples, not just the power grid or the transportation network. In February 2012, D’Souza and colleagues published a paper in the Proceedings of the National Academy of Sciences (PNAS) in which they analysed such a universal model and predicted where the optimal level of connection and interdependence would be—and that, ultimately, too much connectivity would be detrimental. There are drawbacks to basing your mathematical analyses on equations inspired by mathematical physics that are usually used to analyse the collective behavior of atoms and molecules, however. Such statistical equations usually work by considering a collection of around 10^26 atoms (that’s 10 followed by 26 zeros, Avogadro’s number). By contrast, even the biggest real-world networks today only get up to about a billion (10^9), which makes it difficult to take theoretical predictions from the equations and apply them directly to real-world networks. Nonetheless, independent network scientists aiming to forecast financial crises have found intriguing evidence that backs D’Souza’s theoretical predictions about interdependence and phase transitions. Financial Contagion Soon after D’Souza’s PNAS paper appeared, Stefano Battiston, at ETH Zurich, and colleagues published an independent study in the Journal of Economic Dynamics and Control that investigated the dominant view in finance that diversification is good. The idea is that it is better to spread your money around, so that even if one investment goes bad, you will still win overall. However, Battiston’s group found that diversification may not be the best strategy. In fact, they calculated that it could actually spread "financial contagion." What Battiston’s group realized was that a naturally occurring financial mechanism known as trend reinforcement was enough to push a single failure through the entire system. Trend reinforcement works through a rating body that evaluates an entity’s performance. In the financial case that Battiston’s group evaluated, when the market was disappointed by a company’s returns, they downgraded that company, which resulted in additional selling, which caused the company to underperform and disappoint the rating body again. This negative cycle and penalization altered the probability of the company doing well and magnified the initial shock. Furthermore, they found that if the shock became big enough, then a phase transition would occur, as D’Souza hypothesizes, allowing the shock to travel through the entire system. Dangerous Liaisons? Networks of networks share the good and the bad. (Credit: aleksandarvelasevic) "There are some benefits in diversification and connections of course," says Battiston, "but there are serious threats that come from connecting everything in a single system that behaves in synchrony because recovering from a complete collapse is obviously very costly, not just economically but socially as well." Extending their tentacles beyond the financial world, networks can also help expose the way politicians and nation-states act. Zeev Maoz, a political scientist at UC Davis and a distinguished fellow at the Interdisciplinary Center in Herzliya, Israel, has found that geopolitical networks have significant spillover to other networks, for instance security and trade. Importantly, Maoz has also shown that nations are not connected equally; often smaller states are connected through more central players. So, you get a situation where there are a few major players each with a large cadre of states connected on their periphery, and this can be destabilizing. "The uneven structure is a cause of potential instability because if everyone is connected to a few major partners and the major powers experience a shock then everyone suffers," says Maoz. Unfortunately, there aren’t any levers that can help mitigate a shock like that because of the nature of connectivity, he explains. Take, for instance, Greece, which is dependent on Germany and France and the United States. If a shock because of the recession hits the big players, then Greece suffers more than Germany, France, or the USA, because Greece is dependent on them and does not have trading partners of its own. Complex Conceptions of Time All these studies converge on one conclusion: complex systems are, fittingly, even more complex than first thought. Complexity theorists have long accepted that you cannot just look at components and understand the whole system—their discipline is based on that assumption, after all. But now complexity scientists have learned that you cannot even look at a single system and understand it without the context of all the other systems it interacts with. "So we’re at the point where we can begin to analyze systems of systems," says D’Souza, each evolving on its own timescale, with feedbacks between these systems. Take, for instance, online social networks that evolve much faster than say social norms or transportation networks. Users of Facebook or Twitter typically develop a web of "friends" or "followers" that extends well beyond the number of people they would have time to physically meet up with and interact with face-to-face, says D’Souza: "How do we characterize time in these disparate systems?" At first sight, the ability of online social networks to bring people around the world closer together and shrink the time that it takes to interact may seem like an unambiguously positive thing. But even social networks are vulnerable to phase transitions, so D’Souza urges caution: At some point that connectivity might backfire and potentially cause the network to collapse. "Maybe we will find that Facebook becomes a great big mush and isn’t interesting anymore because there is no way to differentiate who is a true friend and who is someone that you used to know 20 years ago and you’re just overwhelmed with information," D’Souza says. "That could be one way that a network like Facebook could fail. 
It could break down if it got so dense that you couldn’t distinguish meaningful information from noise anymore." And your number of Facebook friends is only going to increase, according to D’Souza. In fact, she believes that to be almost a rule in thinking about networks of networks. "I firmly believe networks become more interdependent in time," says D’Souza. "We see the global economy becoming more interdependent. We see Facebook making everyone more interconnected. We’re relying increasingly on technologies like the Internet and communications networks, for instance, the smart-grid, a cyber-physical system. All these networks that used to operate more independently are now becoming more interconnected, and to me that is really a signature of time." Optimization of Lyapunov Invariants in Verification of Software Systems Mardavij Roozbehani, Member, IEEE, Alexandre Megretski, Member, IEEE, and Eric Feron, Member, IEEE (arXiv:1108.0170v1 [cs.SY] 31 Jul 2011. Mardavij Roozbehani and Alexandre Megretski are with the Laboratory for Information and Decision Systems (LIDS), Massachusetts Institute of Technology, Cambridge, MA. E-mails: {mardavij,ameg}@mit.edu. Eric Feron is professor of aerospace software engineering at the School of Aerospace Engineering, Georgia Institute of Technology, Atlanta, GA. E-mail: feron@gatech.edu.) Abstract: The paper proposes a control-theoretic framework for verification of numerical software systems, and puts forward software verification as an important application of control and systems theory. The idea is to transfer Lyapunov functions and the associated computational techniques from control systems analysis and convex optimization to verification of various software safety and performance specifications. These include but are not limited to absence of overflow, absence of division-by-zero, termination in finite time, presence of dead-code, and certain user-specified assertions. Central to this framework are Lyapunov invariants. These are properly constructed functions of the program variables, and satisfy certain properties—resembling those of Lyapunov functions—along the execution trace. The search for the invariants can be formulated as a convex optimization problem. If the associated optimization problem is feasible, the result is a certificate for the specification. Index Terms: Software Verification, Lyapunov Invariants, Convex Optimization. I. INTRODUCTION Software in safety-critical systems implements complex algorithms and feedback laws that control the interaction of physical devices with their environments. Examples of such systems are abundant in aerospace, automotive, and medical applications. The range of theoretical and practical issues that arise in analysis, design, and implementation of safety-critical software systems is extensive, see, e.g., [26], [37], and [22]. While safety-critical software must satisfy various resource allocation, timing, scheduling, and fault tolerance constraints, the foremost requirement is that it must be free of run-time errors. A. Overview of Existing Methods 1) Formal Methods: Formal verification methods are model-based techniques [44], [41], [36] for proving or disproving that a mathematical model of a software (or hardware) system satisfies a given specification, i.e., a mathematical expression of a desired behavior. The approach adopted in this paper, too, falls under this category. Herein, we briefly review model checking and abstract interpretation. 
a) Model Checking: In model checking [14] the system is modeled as a finite state transition system and the specifications are expressed in some form of logic formulae, e.g., temporal or propositional logic. The verification problem then reduces to a graph search, and symbolic algorithms are used to perform an exhaustive exploration of all possible states. Model checking has proven to be a powerful technique for verification of circuits [13], security and communication protocols [33], [38] and stochastic processes [3]. Nevertheless, when the program has non-integer variables, or when the state space is continuous, model checking is not directly applicable. In such cases, combinations of various abstraction techniques and model checking have been proposed [2], [17], [54]; scalability, however, remains a challenge. b) Abstract Interpretation: Abstract interpretation is a theory for formal approximation of the operational semantics of computer programs in a systematic way [15]. Construction of abstract models involves abstraction of domains—typically in the form of a combination of sign, interval, polyhedral, and congruence abstractions of sets of data—and functions. A system of fixed-point equations is then generated by symbolic forward/backward executions of the abstract model. An iterative equation solving procedure, e.g., Newton’s method, is used for solving the nonlinear system of equations, the solution of which results in an inductive invariant assertion, which is then used for checking the specifications. In practice, to guarantee finite convergence of the iterates, widening (outer approximation) operators are used to estimate the solution, followed by narrowing (inner approximation) to improve the estimate [16]. This compromise can be a source of conservatism in analysis. Nevertheless, these methods have been used in practice for verification of limited properties of embedded software of commercial aircraft [7]. Alternative formal methods can be found in the computer science literature mostly under deductive verification [32], type inference [45], and data flow analysis [23]. These methods share extensive similarities in that a notion of program abstraction and symbolic execution or constraint propagation is present in all of them. Further details and discussions of the methodologies can be found in [16] and [41]. 2) System Theoretic Methods: While software analysis has been the subject of an extensive body of research in computer science, treatment of the topic in the control systems community has been less systematic. The relevant results in the systems and control literature can be found in the field of hybrid systems [11]. Most of the available techniques for safety verification of hybrid systems are explicitly or implicitly based on computation of the reachable sets, either exactly or approximately. These include but are not limited to techniques based on quantifier elimination [29], ellipsoidal calculus [27], and mathematical programming [5]. Alternative approaches aim at establishing properties of hybrid systems through barrier certificates [46], numerical computation of Lyapunov functions [10], [24], or by combined use of bisimulation mechanisms and Lyapunov techniques [20], [28], [54], [2]. Inspired by the concept of Lyapunov functions in stability analysis of nonlinear dynamical systems [25], in this paper we propose Lyapunov invariants for analysis of computer programs. 
While Lyapunov functions and similar concepts have been used in verification of stability or temporal properties of system level descriptions of hybrid systems [48], [10], [24], to the best of our knowledge, this paper is the first to present a systematic framework based on Lyapunov invariance and convex optimization for verification of a broad range of code-level specifications for computer programs. Accordingly, it is in the systematic integration of new ideas and some well-known tools within a unified software analysis framework that we see the main contribution of our work, and not in carrying through the proofs of the underlying theorems and propositions. The introduction and development of such a framework provides an opportunity for the field of control to systematically address a problem of great practical significance and interest to both computer science and engineering communities. The framework can be summarized as follows: 1) Dynamical system interpretation and modeling (Section II). We introduce generic dynamical system representations of programs, along with specific modeling languages which include Mixed-Integer Linear Models (MILM), Graph Models, and MIL-over-Graph Hybrid Models (MIL-GHM). 2) Lyapunov invariants as behavior certificates for computer programs (Section III). Analogous to a Lyapunov function, a Lyapunov invariant is a real-valued function of the program variables, and satisfies a difference inequality along the trace of the program. It is shown that such functions can be formulated for verification of various specifications. 3) A computational procedure for finding the Lyapunov invariants (Section IV). The procedure is standard and consists of these steps: (i) Restricting the search space to a linear subspace. (ii) Using convex relaxation techniques to formulate the search problem as a convex optimization problem, e.g., a linear program [6], semidefinite program [9], [55], or a SOS program [42]. (iii) Using convex optimization software for numerical computation of the certificates. II. DYNAMICAL SYSTEM INTERPRETATION AND MODELING OF COMPUTER PROGRAMS We interpret computer programs as discrete-time dynamical systems and introduce generic models that formalize this interpretation. We then introduce MILMs, Graph Models, and MIL-GHMs as structured cases of the generic models. The specific modeling languages are used for computational purposes. A. Generic Models 1) Concrete Representation of Computer Programs: We will consider generic models defined by a finite state space set X with selected subsets X0 ⊆ X of initial states, and X∞ ⊂ X of terminal states, and by a set-valued state transition function f : X → 2^X, such that f(x) ⊆ X∞ for all x ∈ X∞. We denote such dynamical systems by S(X, f, X0, X∞). Definition 1: The dynamical system S(X, f, X0, X∞) is a C-representation of a computer program P, if the set of all sequences that can be generated by P is equal to the set of all sequences X = (x(0), x(1), . . . , x(t), . . . ) of elements from X, satisfying x(0) ∈ X0 ⊆ X, x(t + 1) ∈ f(x(t)), ∀t ∈ Z+. (1) The uncertainty in x(0) allows for dependence of the program on different initial conditions, and the uncertainty in f models dependence on parameters, as well as the ability to respond to real-time inputs. Example 1: Integer Division (adopted from [44]): The functionality of Program 1 is to compute the result of the integer division of dd (dividend) by dr (divisor). A C-representation of the program is displayed below (Program 1). 
Note that if dd ≥ 0, and dr ≤ 0, then the program never exits the “while” loop and the value of q keeps increasing, eventually leading to either an overflow or an erroneous answer. The program terminates if dd and dr are positive. 5 int IntegerDivision ( int dd, int dr ) {int q = {0}; int r = {dd}; while (r >= dr) { q = q + 1; r = r − dr; } return r; } Z = Z∩ [−32768, 32767] X = Z4 X0 = {(dd, dr, q, r) ∈ X | q = 0, r = dd} X∞ = {(dd, dr, q, r) ∈ X | r < dr} ( (dd, dr, q + 1, r − dr), f : (dd, dr, q, r) 7→ (dd, dr, q, r), (dd, dr, q, r) ∈ X\X∞ (dd, dr, q, r) ∈ X∞ Program 1: The Integer Division Program (left) and its Dynamical System Model (right) 2) Abstract Representation of Computer Programs: In a C-representation, the elements of the state space X belong to a finite subset of the set of rational numbers that can be represented by a fixed number of bits in a specific arithmetic framework, e.g., fixed-point or floating-point arithmetic. When the elements of X are non-integers, due to the quantization effects, the set-valued map f often defines very complicated dependencies between the elements of X, even for simple programs involving only elementary arithmetic operations. An abstract model over-approximates the behavior set in the interest of tractability. The drawbacks are conservatism of the analysis and (potentially) undecidability. Nevertheless, abstractions in the form of formal over-approximations make it possible to formulate computationally tractable, sufficient conditions for a verification problem that would otherwise be intractable. Definition 2: Given a program P and its C-representation S(X, f, X0 , X∞ ), we say that S(X, f , X 0 , X ∞ ) is an A-representation, i.e., an abstraction of P, if X ⊆ X, X0 ⊆ X 0 , and f (x) ⊆ f (x) for all x ∈ X, and the following condition holds: X ∞ ∩ X ⊆ X∞ . (2) Thus, every trajectory of the actual program is also a trajectory of the abstract model. The definition of X ∞ is slightly more subtle. For proving Finite-Time Termination (FTT), we need to be able to infer that if all the trajectories of S eventually enter X ∞ , then all trajectories of S will eventually enter X∞ . It is tempting to require that X ∞ ⊆ X∞ , however, this may not be possible as X∞ is often a discrete set, while X ∞ is dense in the domain of real numbers. The definition of X ∞ as in (2) resolves this issue. 6 Construction of S(X, f , X 0 , X ∞ ) from S(X, f, X0 , X∞ ) involves abstraction of each of the elements X, f, X0 , X∞ in a way that is consistent with Definition 2. Abstraction of the state space X often involves replacing the domain of floats or integers or a combination of these by the domain of real numbers. Abstraction of X0 or X∞ often involves a combination of domain abstractions and abstraction of functions that define these sets. Semialgebraic set-valued abstractions of some commonly-used nonlinearities are presented in Appendix I. Interested readers may refer to [49] for more examples including abstractions of fixed-point and floating point operations. B. Specific Models of Computer Programs Specific modeling languages are particularly useful for automating the proof process in a computational framework. Here, three specific modeling languages are proposed: Mixed-Integer Linear Models (MILM), Graph Models, and Mixed-Integer Linear over Graph Hybrid Models (MIL-GHM). 
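Before turning to the specific modeling languages, the C-representation of Program 1 can be written out executably. The Python sketch below mirrors the dynamical-system model displayed alongside the program: the state is (dd, dr, q, r), the transition map applies the loop body outside X∞ and is the identity on X∞, and the 16-bit range reflects the finite state space Z ∩ [−32768, 32767]. The overflow check is our illustrative choice of how to surface a range violation; it is not part of the formal model.

LO, HI = -32768, 32767                 # Z = Z ∩ [-32768, 32767]

def in_Xinf(state):
    dd, dr, q, r = state
    return r < dr                      # X∞ = {(dd, dr, q, r) | r < dr}

def step(state):
    """One transition of the C-representation of Program 1."""
    if in_Xinf(state):
        return state                   # f is the identity on X∞
    dd, dr, q, r = state
    return (dd, dr, q + 1, r - dr)     # loop body: q = q + 1; r = r - dr

def run(dd, dr, max_steps=70000):
    """Iterate from an initial state in X0 = {q = 0, r = dd}."""
    x = (dd, dr, 0, dd)
    for t in range(max_steps):
        if in_Xinf(x):
            return x, t                # reached a terminal state
        x = step(x)
        if not all(LO <= v <= HI for v in x):
            raise OverflowError(f"overflow at step {t}: {x}")
    return x, max_steps

print(run(17, 5))                      # terminates with q = 3, r = 2 after 3 steps

Running run(17, 0) exercises the failure mode noted above for dd >= 0 and dr <= 0: the loop never reaches X∞ and q grows until the 16-bit range is exceeded.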
1) Mixed-Integer Linear Model (MILM): Proposing MILMs for software modeling and analysis is motivated by the observation that by imposing linear equality constraints on boolean and continuous variables over a quasi-hypercube, one can obtain a relatively compact representation of arbitrary piecewise affine functions defined over compact polytopic subsets of Euclidean spaces (Proposition 1). The earliest reference to the statement of universality of MILMs appears to be [39], in which a constructive proof is given for the one-dimensional case. A constructive proof for the general case is given in [49]. Proposition 1: Universality of Mixed-Integer Linear Models. Let f : X 7→ Rn be a piecewise affine map with a closed graph, defined on a compact state space X ⊆ [−1, 1]n , consisting of a finite union of compact polytopes. That is: f (x) ∈ 2Ai x + 2Bi subject to x ∈ Xi , i ∈ Z (1, N ) where, each Xi is a compact polytopic set. Then, f can be specified precisely, by imposing linear equality constraints on a finite number of binary and continuous variables ranging over compact intervals. 7 Specifically, there exist matrices F and H, such that the following two sets are equal: G1 = {(x, f (x)) | x ∈ X} T T G2 = {(x, y) | F [ x w v 1 ] = y, H[ x w v 1 ] = 0, (w, v) ∈ [−1, 1]nw × {−1, 1}nv } Mixed Logical Dynamical Systems (MLDS) with similar structure were considered in [4] for analysis of a class of hybrid systems. The main contribution here is in the application of the model to software analysis. A MIL model of a computer program is defined via the following elements: 1) The state space X ⊂ [−1, 1]n . 2) Letting ne = n + nw + nv + 1, the state transition function f : X 7→ 2X is defined by two matrices F, and H of dimensions n-by-ne and nH -by-ne respectively, according to: T T nw nv f (x) ∈ F [ x w v 1 ] | H[ x w v 1 ] = 0, (w, v) ∈ [−1, 1] × {−1, 1} . (3) 3) The set of initial conditions is defined via either of the following: a) If X0 is finite with a small cardinality, then it can be conveniently specified by its elements. We will see in Section IV that per each element of X0 , one constraint needs to be included in the set of constraints of the optimization problem associated with the verification task. b) If X0 is not finite, or |X0 | is too large, an abstraction of X0 can be specified by a matrix H0 ∈ RnH0 ×ne which defines a union of compact polytopes in the following way: T X0 = {x ∈ X | H0 [ x w v 1 ] = 0, (w, v) ∈ [−1, 1]nw × {−1, 1}nv }. (4) 4) The set of terminal states X∞ is defined by T X∞ = {x ∈ X | H[ x w v 1 ] 6= 0, ∀w ∈ [−1, 1]nw , ∀v ∈ {−1, 1}nv }. (5) Therefore, S(X, f, X0 , X∞ ) is well defined. A compact description of a MILM of a program is either of the form S (F, H, H0 , n, nw , nv ) , or of the form S (F, H, X0 , n, nw , nv ). The MILMs can represent a broad range of computer programs of interest in control applications, including but not limited to control programs of gain scheduled linear systems in embedded applications. In addition, generalization of the model to programs with piecewise affine dynamics subject to quadratic constraints is straightforward. 
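The role of the data (F, H, n, nw, nv) in (3) can be made concrete with a small numerical sketch. The helper below is our construction, not a procedure from the paper: for each binary vector v it solves the equality constraint for the continuous variable w (when the constraint pins w down) and returns the corresponding successors F[x; w; v; 1]. It is a simulation aid only, not the verification machinery of Section IV, and it returns a single representative successor per v when w is not uniquely determined.

import itertools
import numpy as np

def milm_successors(F, H, x, nw, nv, tol=1e-9):
    """Enumerate (a subset of) f(x) in (3): for each v in {-1,1}^nv, solve
    H @ [x; w; v; 1] = 0 for w by least squares and keep F @ [x; w; v; 1]
    whenever the residual is ~0 and w lies in [-1,1]^nw."""
    n = len(x)
    Hx, Hw = H[:, :n], H[:, n:n + nw]
    Hv, H1 = H[:, n + nw:n + nw + nv], H[:, -1]
    succ = []
    for v in itertools.product((-1.0, 1.0), repeat=nv):
        v = np.array(v)
        rhs = -(Hx @ x + Hv @ v + H1)
        w, *_ = np.linalg.lstsq(Hw, rhs, rcond=None)
        feasible = (np.linalg.norm(Hw @ w - rhs, np.inf) <= tol
                    and np.all(np.abs(w) <= 1 + tol))
        if feasible:
            xe = np.concatenate([x, w, v, [1.0]])
            succ.append(F @ xe)
    return succ

# Toy MILM with n = nw = nv = 1 (ne = 4): the constraint row enforces w = 0.5*x,
# so f(x) = 0.8*x + 0.2*w + 0.1*v = 0.9*x +/- 0.1.
F = np.array([[0.8, 0.2, 0.1, 0.0]])
H = np.array([[-0.5, 1.0, 0.0, 0.0]])
print(milm_successors(F, H, np.array([0.5]), nw=1, nv=1))   # ~[0.35, 0.55]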
Example 2: A MILM of an abstraction of the IntegerDivision program (Program 1: Section II-A), with 8 all the integer variables replaced with real variables, is given by S (F, H, H0 , 4, 3, 0) , where H0 = H= F = 1 0 0 0 0 0 0 1 0 0 −1 0 0 0 0 0 2 0 −2 1 0 0 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 , 0 −2 0 0 0 1 0 1, 0 −2 0 0 0 1 0 0 0 0 0 0 1 0 1 −2 0 0 0 0 0 1 1 0 −1 0 1 0 0 0 −2 0 0 0 0 0 1 1 0 0 1/M 0 Here, M is a scaling parameter used for bringing all the variables within the interval [−1, 1] . 2) Graph Model: Practical considerations such as universality and strong resemblance to the natural flow of computer code render graph models an attractive and convenient model for software analysis. Before we proceed, for convenience, we introduce the following notation: Pr (i, x) denotes the projection operator defined as Pr (i, x) = x, for all i ∈ Z∪ {o n} , and all x ∈ Rn . A graph model is defined on a directed graph G (N , E) with the following elements: 1) A set of nodes N = {∅} ∪ {1, . . . , m} ∪ {o n} . These can be thought of as line numbers or code locations. Nodes ∅ and o n are starting and terminal nodes, respectively. The only possible transition from node o n is the identity transition to node o n. 2) A set of edges E = {(i, j, k) | i ∈ N , j ∈ O (i)} , where the outgoing set O (i) is the set of all nodes to which transition from node i is possible in one step. Definition of the incoming set I (i) is analogous. The third element in the triplet (i, j, k) is the index for the kth edge between i and j, and Aji = {k | (i, j, k) ∈ E} . 3) A set of program variables xl ∈ Ω ⊆ R, l ∈ Z (1, n) . Given N and n, the state space of a graph model is X = N × Ωn . The state x e = (i, x) of a graph model has therefore, two components: The discrete component i ∈ N , and the continuous component x ∈ Ωn ⊆ Rn . k k k 4) A set of transition labels T ji assigned to every edge (i, j, k) ∈ E, where T ji maps x to the set T ji x = k {Tjik (x, w, v) | (x, w, v) ∈ Sji }, where (w, v) ∈ [−1, 1]nw × {−1, 1}nv , and Tjik : Rn+nw +nv 7→ Rn k k k is a polynomial function and Sji is a semialgebraic set. If T ji is a deterministic map, we drop Sji k and define T ji ≡ Tjik (x). 5) A set of passport labels Πkji assigned to all edges (i, j, k) ∈ E, where Πkji is a semialgebraic set. A state transition along edge (i, j, k) is possible if and only if x ∈ Πkji . 9 6) A set of semialgebraic invariant sets Xi ⊆ Ωn , i ∈ N are assigned to every node on the graph, such that Pr (i, x) ∈ Xi . Equivalently, a state x e = (i, x) satisfying x ∈ X\Xi is unreachable. Therefore, a graph model is a well-defined specific case of the generic model S(X, f, X0 , X∞ ), with X = N × Ωn , X0 = {∅} × X∅ , X∞ = {o n} × Xno and f : X 7→ 2X defined as: o n k f (e x) ≡ f (i, x) = (j, T ji x) | j ∈ O (i) , x ∈ Πkji ∩ Xi . (6) Conceptually similar models have been reported in [44] for software verification, and in [1], [12] for modeling and verification of hybrid systems. Interested readers may consult [49] for further details regarding treatment of graph models with time-varying state-dependent transitions labels which arise in modeling operations with arrays. Remarks – The invariant set of node ∅ contains all the available information about the initial conditions of the program variables: Pr (∅, x) ∈ X∅ . – Multiple edges between nodes enable modeling of logical ”or” or ”xor” type conditional transitions. This allows for modeling systems with nondeterministic discrete transitions. 
k – The transition label T ji may represent a simple update rule which depends on the real-time input. T For instance, if T = Ax + Bw, and S = Rn × [−1, 1] , then x 7→ {Ax + Bw | w ∈ [−1, 1]} . k In other cases, T ji may represent an abstraction of a nonlinearity. For instance, the assignment T x 7→ sin (x) can be abstracted by x 7→ {T (x, w) | (x, w) ∈ S} (see Eqn. (46) in Appendix I). Before we proceed, we introduce the following notation: Given a semialgebraic set Π, and a polynomial function τ : Rn 7→ Rn , we denote by Π (τ ) , the set: Π(τ ) = {x | τ (x) ∈ Π} . a) Construction of Simple Invariant Sets: Simple invariant sets can be included in the model if they are readily available or easily computable. Even trivial invariants can simplify the analysis and improve the chances of finding stronger invariants via numerical optimization. – Simple invariant sets may be provided by the programmer. These can be trivial sets representing simple algebraic relations between variables, or they can be more complicated relationships that reflect the programmer’s knowledge about the functionality and behavior of the program. 10 – Invariant Propagation: Assuming that Tijk are deterministic and invertible, the set −1 S Xi = Πkij Tijk (7) j∈I(i), k∈Aij is an invariant set for node i. Furthermore, if the invariant sets Xj are strict subsets of Ωn for all j ∈ I (i) , then (7) can be improved. Specifically, the set S k k −1 k −1 Xi = Πij Tij ∩ Xj Tij (8) j∈I(i), k∈Aij is an invariant set for node i. Note that it is sufficient that the restriction of Tijk to the lower dimensional spaces in the domains of Πkij and Xj be invertible. – Preserving Equality Constraints: Simple assignments of the form Tijk : xl 7→ f (ym ) result in invariant sets of the form Xi = {x | xl − f (ym ) = 0} at node i, provided that Tijk does not simultaneously update ym . Formally, let Tijk be such that (Tijk x)l −xl is non-zero for at most one element ˆl ∈ Z (1, n) , and that (Tijk x)l̂ is independent of xl̂ . Then, the following set is an invariant set at node i : Xi = S x | Tijk − I x = 0 j∈I(i), k∈Aij 3) Mixed-Integer Linear over Graph Hybrid Model (MIL-GHM): The MIL-GHMs are graph models in which the effects of several lines and/or functions of code are compactly represented via a MILM. As a result, the graphs in such models have edges (possibly self-edges) that are labeled with matrices F and H corresponding to a MILM as the transition and passport labels. Such models combine the flexibility provided by graph models and the compactness of MILMs. An example is presented in Section V. C. Specifications The specification that can be verified in our framework can generically be described as unreachability and finite-time termination. Definition 3: A Program P ≡ S(X, f, X0 , X∞ ) is said to satisfy the unreachability property with respect to a subset X− ⊂ X, if for every trajectory X ≡ x (·) of (1), and every t ∈ Z+ , x(t) does not belong to X− . A program P ≡ S(X, f, X0 , X∞ ) is said to terminate in finite time if every solution X = x (·) of (1) satisfies x(t) ∈ X∞ for some t ∈ Z+ . Several critical specifications associated with runtime errors are special cases of unreachability. 11 1) Overflow: Absence of overflow can be characterized as a special case of unreachability by defining: X− = x ∈ X | α−1 x∞ > 1, α = diag {αi } where αi > 0 is the overflow limit for variable i. 
2) Out-of-Bounds Array Indexing: An out-of-bounds array indexing error occurs when a variable exceeding the length of an array, references an element of the array. Assuming that xl is the corresponding integer index and L is the array length, one must verify that xl does not exceed L at location i, where referencing occurs. This can be accomplished by defining X− = {(i, x) ∈ X | |xl | > L} over a graph model and proving that X− is unreachable. This is also similar to “assertion checking” defined next. 3) Program Assertions: An assertion is a mathematical expression whose validity at a specific location in the code must be verified. It usually indicates the programmer’s expectation from the behavior of the program. We consider assertions that are in the form of semialgebraic set memberships. Using graph models, this is done as follows: at location i : assert x ∈ Ai ⇒ define X− = {(i, x) ∈ X | x ∈ X\Ai } , at location i : assert x ∈ / Ai ⇒ define X− = {(i, x) ∈ X | x ∈ Ai } . In particular, safety assertions for division-by-zero or taking the square root (or logarithm) of positive variables are standard and must be automatically included in numerical programs (cf. Sec. III-A, Table I). 4) Program Invariants: A program invariant is a property that holds throughout the execution of the program. The property indicates that the variables reside in a semialgebraic subset XI ⊂ X. Essentially, any method that is used for verifying unreachability of a subset X− ⊂ X, can be applied for verifying invariance of XI by defining X− = X\XI , and vice versa. D. Implications of the Abstractions For mathematical correctness, we must show that if an A-representation of a program satisfies the unreachability and FTT specifications, then so does the C-representation, i.e., the actual program. This is established in the following proposition. The proof is omitted for brevity but can be found in [49]. Proposition 2: Let S(X, f , X 0 , X ∞ ) be an A-representation of program P with C-representation S(X, f, X0 , X∞ ). Let X− ⊂ X and X − ⊂ X be such that X− ⊆ X − . Assume that the unreachability 12 property w.r.t. X − has been verified for S. Then, P satisfies the unreachability property w.r.t. X− . Moreover, if the FTT property holds for S, then P terminates in finite time. Since we are not concerned with undecidability issues, and in light of Proposition 2, we will not differentiate between abstract or concrete representations in the remainder of this paper. III. LYAPUNOV I NVARIANTS AS B EHAVIOR C ERTIFICATES Analogous to a Lyapunov function, a Lyapunov invariant is a real-valued function of the program variables satisfying a difference inequality along the execution trace. Definition 4: A (θ, µ)-Lyapunov invariant for S(X, f, X0 , X∞ ) is a function V : X 7→ R such that V (x+ ) − θV (x) ≤ −µ ∀x ∈ X, x+ ∈ f (x) : x ∈ / X∞ . (9) where (θ, µ) ∈ [0, ∞) × [0, ∞). Thus, a Lyapunov invariant satisfies the difference inequality (9) along the trajectories of S until they reach a terminal state X∞ . It follows from Definition 4 that a Lyapunov invariant is not necessarily nonnegative, or bounded from below, and in general it need not be monotonically decreasing. While the zero level set of V defines an invariant set in the sense that V (xk ) ≤ 0 implies V (xk+l ) ≤ 0, for all l ≥ 0, monotonicity depends on θ and the initial condition. For instance, if V (x0 ) ≤ 0, ∀x0 ∈ X0 , then (9) implies that V (x) ≤ 0 along the trajectories of S, however, V (x) may not be monotonic if θ < 1, though it will be monotonic for θ ≥ 1. 
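Because (9) is a pointwise inequality over transitions, a candidate Lyapunov invariant is easy to falsify by simulation before any optimization is attempted. The short sketch below is ours and is only a falsification aid: sampling can refute a candidate V but can never establish (9) on all of X\X∞.

def check_condition_9(V, transitions, theta, mu):
    """Return the sampled transitions (x, x+) violating V(x+) - theta*V(x) <= -mu;
    an empty list means no violation was found on this sample."""
    return [(x, xp) for x, xp in transitions if V(xp) - theta * V(x) > -mu]

# Toy loop "while x >= 1: x = x - d" with d in {1, 2}.  V(x) = x satisfies (9)
# with (theta, mu) = (1, 1) outside X_inf = {x < 1}; V(x) = x**2 fails for mu = 3
# near x = 1 because its decrease is not bounded away from zero by 3 there.
transitions = [(x, x - d) for x in range(1, 50) for d in (1, 2)]
print(check_condition_9(lambda x: x, transitions, theta=1.0, mu=1.0))         # []
print(check_condition_9(lambda x: x ** 2, transitions, theta=1.0, mu=3.0)[:2])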
Furthermore, the level sets of a Lyapunov invariant need not be bounded closed curves. Proposition 3 (to follow) formalizes the interpretation of Definition 4 for the specific modeling languages. Natural Lyapunov invariants for graph models are functions of the form V (e x) ≡ V (i, x) = σi (x) , i ∈ N, (10) which assign a polynomial Lyapunov function to every node i ∈ N on the graph G (N , E) . Proposition 3: Let S (F, H, X0 , n, nw , nv ) and properly labeled graph G (N , E) be the MIL and graph models for a computer program P. The function V : [−1, 1]n 7→ R is a (θ, µ)-Lyapunov invariant for P if it satisfies: V (F xe ) − θV (x) ≤ −µ, ∀ (x, xe ) ∈ [−1, 1]n × Ξ, 13 where T Ξ = {(x, w, v, 1) | H[ x w v 1 ] = 0, (w, v) ∈ [−1, 1]nw × {−1, 1}nv }. The function V : N ×Rn 7→ R, satisfying (10) is a (θ, µ)-Lyapunov invariant for P if k σj (x+ ) − θσi (x) ≤ −µ, ∀ (i, j, k) ∈ E, (x, x+ ) ∈ (Xi ∩ Πkji ) × T ji x. (11) Note that a generalization of (9) allows for θ and µ to depend on the state x, although simultaneous search for θ (x) and V (x) leads to non-convex conditions, unless the dependence of θ on x is fixed a-priori. We allow for dependence of θ on the discrete component of the state in the following way: k k σi (x) ≤ −µji , ∀ (i, j, k) ∈ E, (x, x+ ) ∈ (Xi ∩ Πkji ) × T ji x σj (x+ ) − θji (12) A. Behavior Certificates 1) Finite-Time Termination (FTT) Certificates: The following proposition is applicable to FTT analysis of both finite and infinite state models. Proposition 4: Finite-Time Termination. Consider a program P, and its dynamical system model S(X, f, X0 , X∞ ). If there exists a (θ, µ)-Lyapunov invariant V : X 7→ R, uniformly bounded on X\X∞ , satisfying (9) and the following conditions V (x) ≤ −η ≤ 0, where kV k∞ = ∀x ∈ X0 (13) µ + (θ − 1) kV k∞ > 0 (14) max (µ, η) > 0 (15) sup V (x) < ∞, then P terminates in finite time, and an upper-bound on the number x∈X\X∞ of iterations is given by log (µ + (θ − 1) kV k∞ ) − log (µ) , θ 6= 1, µ > 0 log θ log (kV k∞ ) − log (η) Tu = , θ 6= 1, µ = 0 log θ kV k /µ , θ=1 ∞ Proof: The proof is presented in Appendix II. (16) 14 When the state-space X is finite, or when the Lyapunov invariant V is only a function of a subset of the variables that assume values in a finite set, e.g., integer counters, it follows from Proposition 4 that V being a (θ, µ)-Lyapunov invariant for any θ ≥ 1 and µ > 0 is sufficient for certifying FTT, and uniform boundedness of V need not be established a-priori. Example 3: Consider the IntegerDivision program presented in Example 1. The function V : X 7→ R, defined according to V : (dd, dr, q, r) 7→ r is a (1, dr)-Lyapunov invariant for IntegerDivision: at every step, V decreases by dr > 0. Since X is finite, the program IntegerDivision terminates in finite time. This, however, only proves absence of infinite loops. The program could terminate with an overflow. 2) Separating Manifolds and Certificates of Boundedness: Let V be a Lyapunov invariant satisfying def (9) with θ = 1. The level sets of V, defined by Lr (V ) = {x ∈ X : V (x) < r}, are invariant with respect to (1) in the sense that x(t + 1) ∈ Lr (V ) whenever x(t) ∈ Lr (V ). However, for r = 0, the level sets Lr (V ) remain invariant with respect to (1) for any nonnegative θ. This is an important property with the implication that θ = 1 (i.e., monotonicity) is not necessary for establishing a separating manifold between the reachable set and the unsafe regions of the state space (cf. Theorem 1). Theorem 1: Lyapunov Invariants as Separating Manifolds. 
Let V denote the set of all (θ, µ)Lyapunov invariants satisfying (9) for program P ≡ S(X, f, X0 , X∞ ). Let I be the identity map, and for h ∈ {f, I} define h−1 (X− ) = {x ∈ X|h (x) ∩ X− 6= ∅} . A subset X− ⊂ X, where X− ∩ X0 = ∅ can never be reached along the trajectories of P, if there exists V ∈ V satisfying sup V (x) < x∈X0 inf V (x) x∈h−1 (X− ) (17) and either θ = 1, or one of the following two conditions hold: (I) θ < 1 and (II) θ > 1 and inf x∈h−1 (X− ) V (x) > 0. sup V (x) ≤ 0. x∈X0 (18) (19) 15 Proof: The proof is presented in Appendix II. The following corollary is based on Theorem 1 and Proposition 4 and presents computationally implementable criteria for simultaneously establishing FTT and absence of overflow. Corollary 1: Overflow and FTT Analysis Consider a program P, and its dynamical system model S(X, f, X0 , X∞ ). Let α > 0 be a diagonal matrix specifying the overflow limit, and let X− = {x ∈ X | kα−1 xk∞ > 1}. Let q ∈ N∪{∞} , h ∈ {f, I} , and let the function V : X 7→ R be a (θ, µ)-Lyapunov invariant for S satisfying V (x) ≤ 0 ∀x ∈ X0 . n o −1 ∀x ∈ X. V (x) ≥ sup α h (x) q − 1 (20) (21) Then, an overflow runtime error will not occur during any execution of P. In addition, if µ > 0 and µ + θ > 1, then, P terminates in at most Tu iterations where Tu = µ−1 if θ = 1, and for θ 6= 1 we have: Tu = where kV k∞ = log (µ + (θ − 1) kV k∞ ) − log µ log (µ + θ − 1) − log µ ≤ log θ log θ sup (22) |V (x)| . x∈X\{X− ∪X∞ } Proof: The proof is presented in Appendix II. Application of Corollary 1 with h = f typically leads to much less conservative results compared with h = I, though the computational costs are also higher. See [49] for remarks on variations of Corollary 1 to trade off conservativeness and computational complexity. a) General Unreachability and FTT Analysis over Graph Models: The results presented so far in this section (Theorem 1, Corollary 1, and Proposition 4) are readily applicable to MILMs. These results will be applied in Section IV to formulate the verification problem as a convex optimization problem. Herein, we present an adaptation of these results to analysis of graph models. Definition 5: A cycle Cm on a graph G (N , E) is an ordered list of m triplets (n1 , n2 , k1 ) , (n2 , n3 , k2 ) , ..., (nm , nm+1 , km ) , where nm+1 = n1 , and (nj , nj+1 , kj ) ∈ E, ∀j ∈ Z (1, m) . A simple cycle is a cycle with no strict sub-cycles. 16 Corollary 2: Unreachability and FTT Analysis of Graph Models. Consider a program P and its graph model G (N , E) . Let V (i, x) = σi (x) be a Lyapunov invariant for G (N , E) , satisfying (12) and σ∅ (x) ≤ 0, ∀x ∈ X∅ (23) and either of the following two conditions: (I) : σi (x) > 0, (II) : σi (x) > 0, ∀x ∈ Xi ∩ Xi− , i ∈ N \ {∅} (24) n ko ∀x ∈ Xj ∩ T −1 (Xi− ) , i ∈ N \ {∅} , j ∈ I (i) , T ∈ T ij (25) where T −1 (Xi− ) = {x ∈ Xi |T (x) ∩ Xi− 6= ∅} Then, P satisfies the unreachability property w.r.t. the collection of sets Xi− , i ∈ N \ {∅} . In addition, if for every simple cycle C ∈ G, we have: (θ (C) − 1) kσ (C)k∞ + µ (C) > 0, and µ (C) > 0, and kσ (C)k∞ < ∞, (26) where θ (C) = k θij , Q (i,j,k)∈C µ (C) = max µkij , (i,j,k)∈C kσ (C)k∞ = max (i,.,.)∈C |σi (x)| sup (27) x∈Xi \Xi− then P terminates in at most Tu iterations where Tu = X C∈G:θ(C)6=1 log ((θ (C) − 1) kσ (C)k∞ + µ (C)) − log µ (C) + log θ (C) X C∈G:θ(C)=1 kσ (C)k∞ . µ (C) Proof: The proof is presented in Appendix II. For verification against an overflow violation specified by a diagonal matrix α > 0, Corollary 2 is applied with X− = {x ∈ Rn | kα−1 xk∞ > 1}. 
Hence, (24) becomes σi (x) ≥ p (x) (kα−1 xkq − 1), ∀x ∈ Xi , i ∈ N \ {∅} , where p (x) > 0. User-specified assertions, as well as many other standard safety specifications such as absence of division-by-zero can be verified using Corollary 2 (See Table I). – Identification of Dead Code: Suppose that we wish to verify that a discrete location i ∈ N \ {∅} in a graph model G (N , E) is unreachable. If a function satisfying the criteria of Corollary 2 with Xi− = Rn can be found, then location i can never be reached. Condition (24) then becomes σi (x) ≥ 0, ∀x ∈ Rn . 17 TABLE I A PPLICATION OF C OROLLARY 2 TO THE VERIFICATION OF VARIOUS SAFETY SPECIFICATIONS . apply Corollary 2 with: At location i: assert x ∈ Xa ⇒ Xi− := {x ∈ Rn | x ∈ Rn \Xa } At location i: assert x ∈ / Xa ⇒ Xi− := {x ∈ Rn | x ∈ Xa } At location i: ⇒ Xi− := {x ∈ Rn | xo = 0} At location i: (expr.)/xo √ 2k x o ⇒ Xi− := {x ∈ Rn | xo < 0} At location i: log (xo ) ⇒ Xi− := {x ∈ Rn | xo ≤ 0} At location i: dead code ⇒ Xi− := Rn Example 4: Consider the following program void ComputeTurnRate (void) L0 : {double x = {0}; double y = {∗PtrToY}; L1 : while (1) L2 : { y = ∗PtrToY; L3 : x = (5 ∗ sin(y) + 1)/3; L4 : if x > −1 { L5 : L6 : L7 : L8 : x = x + 1.0472; TurnRate = y/x; } else { TurnRate = 100 ∗ y/3.1416 }} Program 3. Graph of an abstraction of Program 3 Note that x can be zero right after the assignment x = (5 sin(y) + 1)/3. However, at location L6, x cannot be zero and division-by-zero will not occur. The graph model of an abstraction of Program 3 is shown next to the program and is defined by the following elements: T65 : x 7→ x + 1.0472, and T41 : x 7→ [−4/3, 2] . The rest of the transition labels are identity. The only non-universal passport labels are Π54 and Π84 as shown in the figure. Define σ6 (x) = −x2 − 100x + 1, σ5 (x) = −(x + 1309/1250)2 − 100x − 2543/25 σ0 (x) = σ1 (x) = σ4 (x) = σ8 (x) = −x2 + 2x − 3. It can be verified that V (x) = σi (x) is a (θ, 1)-Lyapunov invariant for Program 3 with variable rates: 18 θ65 = 1, and θij = 0 ∀ (i, j) 6= (6, 5). Since −2 = sup σ0 (x) < inf σ6 (x) = 1 x∈X0 x∈X− the state (6, x = 0) cannot be reached. Hence, a division by zero will never occur. We will show in the next section how to find such functions in general. IV. C OMPUTATION OF LYAPUNOV I NVARIANTS It is well known that the main difficulty in using Lyapunov functions in system analysis is finding them. Naturally, using Lyapunov invariants in software analysis inherits the same difficulties. However, the recent advances in hardware and software technology, e.g., semi-definite programming [18], [53], and linear programming software [19] present an opportunity for new approaches to software verification based on numerical optimization. A. Preliminaries 1) Convex Parameterization of Lyapunov Invariants: The chances of finding a Lyapunov invariant are increased when (9) is only required on a subset of X\X∞ . For instance, for θ ≤ 1, it is tempting to replace (9) with V (x+ ) − θV (x) ≤ −µ, ∀x ∈ X\X∞ : V (x) < 1, x+ ∈ f (x) (28) In this formulation V is not required to satisfy (9) for those states which cannot be reached from X0 . However, the set of all functions V : X 7→ R satisfying (28) is not convex and finding a solution for (28) is typically much harder than (9). Such non-convex formulations are not considered in this paper. 
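As a concrete point of reference before the computational machinery is set up, the node-wise certificate of Example 4 can be spot-checked numerically. The grid check below is ours and is a sampling-based sanity check that complements, but does not replace, the algebraic argument given above: it evaluates the separating-manifold conditions and the decrease condition (12) for the functions sigma_i of Example 4 over the abstracted range x in [-4/3, 2].

import numpy as np

# Node-wise certificates from Example 4:
sig6 = lambda x: -x**2 - 100*x + 1
sig5 = lambda x: -(x + 1309/1250)**2 - 100*x - 2543/25
sig_other = lambda x: -x**2 + 2*x - 3          # sigma_0 = sigma_1 = sigma_4 = sigma_8

xs = np.linspace(-4/3, 2, 10001)               # range of x after T41: x -> [-4/3, 2]

# Separating-manifold conditions: sigma_0 <= 0 on X0, sigma_6 > 0 on X- = {x = 0}.
assert sig_other(xs).max() <= 0 and sig6(0.0) > 0

# Decrease condition (12): the only edge with theta = 1 is (5 -> 6) with
# T65: x -> x + 1.0472, enabled when x > -1; on the remaining edges theta = 0,
# so it suffices that sigma_j(x+) <= -1 on the enabled states.
x5 = xs[xs > -1]
assert np.all(sig6(x5 + 1.0472) - sig5(x5) <= -1 + 1e-9)
assert np.all(sig5(x5) <= -1) and np.all(sig_other(xs) <= -1)
print("all sampled conditions of the Example 4 certificate hold")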
The first step in the search for a function V : X 7→ R satisfying (9) is selecting a finite-dimensional linear parameterization of a candidate function V : n X V (x) = Vτ (x) = τk Vk (x) , τ = (τk )nk=1 , τk ∈ R, (29) k=1 where Vk : X 7→ R are fixed basis functions. Next, for every τ = (τk )N k=1 let φ(τ ) = max x∈X\X∞ , x+ ∈f (x) Vτ (x+ ) − θVτ (x), (assuming for simplicity that the maximum does exist). Since φ (·) is a maximum of a family of linear 19 functions, φ (·) is a convex function. If minimizing φ (·) over the unit disk yields a negative minimum, the optimal τ ∗ defines a valid Lyapunov invariant Vτ ∗ (x). Otherwise, no linear combination (29) yields a valid solution for (9). The success and efficiency of the proposed approach depend on computability of φ (·) and its subgradients. While φ (·) is convex, the same does not necessarily hold for Vτ (x+ ) − θVτ (x). In fact, if X\X∞ is non-convex, which is often the case even for very simple programs, computation of φ (·) becomes a non-convex optimization problem even if Vτ (x+ ) − Vτ (x) is a nice (e.g. linear or concave and smooth) function of x. To get around this hurdle, we propose using convex relaxation techniques which essentially lead to computation of a convex upper bound for φ (τ ). 2) Convex Relaxation Techniques: Such techniques constitute a broad class of techniques for constructing finite-dimensional, convex approximations for difficult non-convex optimization problems. Some of the results most relevant to the software verification framework presented in this paper can be found in [31] for SDP relaxation of binary integer programs, [34] and [40] for SDP relaxation of quadratic programs, [56] for S-Procedure in robustness analysis, and [43],[42] for sum-of-squares relaxation in polynomial non-negativity verification. We provide a brief overview of the latter two techniques. a) The S-Procedure : The S-Procedure is commonly used for construction of Lyapunov functions for nonlinear dynamical systems. Let functions φi : X 7→ R, i ∈ Z (0, m) , and ψj : X 7→ R, j ∈ Z (1, n) be given, and suppose that we are concerned with evaluating the following assertions: (I): φ0 (x) > 0, ∀x ∈ {x ∈ X | φi (x) ≥ 0, ψj (x) = 0, i ∈ Z (1, m) , j ∈ Z (1, n)} + (II): ∃τi ∈ R , ∃µj ∈ R, such that φ0 (x) > m X i=1 τi φi (x) + n X µj ψj (x) . (30) (31) j=1 The implication (II) → (I) is trivial. The process of replacing assertion (I) by its relaxed version (II) is called the S-Procedure. Note that condition (II) is convex in decision variables τi and µj . The implication (I) → (II) is generally not true and the S-Procedure is called lossless for special cases where (I) and (II) are equivalent. A well-known such case is when m = 1, n = 0, and φ0 , φ1 are quadratic functionals. A comprehensive discussion of the S-Procedure as well as available results on its losslessness can be found 20 in [21]. Other variations of S-Procedure with non-strict inequalities exist as well. b) Sum-of-Squares (SOS) Relaxation : The SOS relaxation technique can be interpreted as the generalized version of the S-Procedure and is concerned with verification of the following assertion: fj (x) ≥ 0, ∀j ∈ J, gk (x) 6= 0, ∀k ∈ K, hl (x) = 0, ∀l ∈ L ⇒ −f0 (x) ≥ 0, (32) where fj , gk , hl are polynomial functions. It is easy to see that the problem is equivalent to verification of emptiness of a semialgebraic set, a necessary and sufficient condition for which is given by the Positivstellensatz Theorem [8]. 
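To make the role of the parameterization (29) concrete, the following sketch restricts (9) to a finite sample of transitions and searches for the coefficients tau by linear programming (scipy's linprog is assumed to be available; the toy program and basis are ours). Sampling only relaxes (9), so a vector tau found this way is not a sound certificate; the S-procedure and SOS relaxations discussed here are precisely what replace the sampling step in the actual framework.

import numpy as np
from scipy.optimize import linprog

def fit_invariant_on_samples(basis, transitions, theta=1.0, mu=1.0, bound=10.0):
    """Find tau with V_tau = sum_k tau_k * V_k satisfying
    V_tau(x+) - theta*V_tau(x) <= -mu on the sampled transitions only."""
    A = np.array([[Vk(xp) - theta * Vk(x) for Vk in basis] for x, xp in transitions])
    b = -mu * np.ones(len(transitions))
    res = linprog(c=np.zeros(len(basis)), A_ub=A, b_ub=b,
                  bounds=[(-bound, bound)] * len(basis), method="highs")
    return res.x if res.success else None

# Toy loop "while x >= 1: x = x - d" with d in {1, 2}, basis {1, x, x^2}:
rng = np.random.default_rng(0)
transitions = [((x,), (x - float(rng.integers(1, 3)),))
               for x in rng.uniform(1.0, 100.0, size=200)]
basis = [lambda s: 1.0, lambda s: s[0], lambda s: s[0] ** 2]
print(fit_invariant_on_samples(basis, transitions))    # e.g. tau giving V(x) ~ c*x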
In practice, sufficient conditions in the form of nonnegativity of polynomials are formulated. The non-negativity conditions are in turn relaxed to SOS conditions. Let Σ [y1 , . . . , ym ] denote the set of SOS polynomials in m variables y1 , ..., ym , i.e. the set of polynomials that can be represented as p = t P p2i , pi ∈ Pm , where Pm is the polynomial ring of m variables with real coefficients. i=1 Then, a sufficient condition for (32) is that there exist SOS polynomials τ0 , τi , τij ∈ Σ [x] and polynomials ρl , such that τ0 + X i τi fi + X i,j τij fi fj + X l ρl hl + ( Y gk )2 = 0 Matlab toolboxes SOSTOOLS [47], or YALMIP [30] automate the process of converting an SOS problem to an SDP, which is subsequently solved by available software packages such as LMILAB [18], or SeDumi [53]. Interested readers are referred to [42], [35], [43], [47] for more details. B. Optimization of Lyapunov Invariants for Mixed-Integer Linear Models Natural Lyapunov invariant candidates for MILMs are quadratic and affine functionals. 1) Quadratic Invariants: The linear parameterization of the space of quadratic functionals mapping Rn to R is given by: ( Vx2 = n V : R 7→ R | V (x) = x 1 T P x 1 ) n+1 ,P ∈S , (33) where Sn is the set of n-by-n symmetric matrices. We have the following lemma. Lemma 1: Consider a program P and its MILM S (F, H, X0 , n, nw , nv ) . The program admits a quadratic (θ, µ)-Lyapunov invariant V ∈ Vx2 , if there exists a matrix Y ∈ Rne ×nH , ne = n + nw + nv + 1, a w diagonal matrix Dv ∈ Dnv , a positive semidefinite diagonal matrix Dxw ∈ Dn+n , and a symmetric + 21 matrix P ∈ Sn+1 , satisfying the following LMIs: LT1 P L1 − θLT2 P L2 He (Y H) + LT3 Dxw L3 + LT4 Dv L4 − (λ + µ) LT5 L5 λ = Trace Dxw + Trace Dv where In F L1 = , L2 = 01×(ne −1) L5 T 0(n+nw )×nv 0n×(ne −n) In+nw , L4 = Inv , L3 = 0(nv +1)×(n+nw ) 1 0 1×nv T T 0(ne −1)×1 , L5 = 1 Proof: The proof is presented in Appendix II The following theorem summarizes our results for verification of absence of overflow and/or FTT for MILMs. The result follows from Lemma 1 and Corollary 1 with q = 2, h = f, though the theorem is presented without a detailed proof. Theorem 2: Optimization-Based MILM Verification. Let α : 0 ≺ α In be a diagonal positive definite matrix specifying the overflow limit. An overflow runtime error does not occur during any execution of P if there exist matrices Yi ∈ Rne ×nH , diagonal matrices Div ∈ Dnv , positive semidefinite w diagonal matrices Dixw ∈ Dn+n , and a symmetric matrix P ∈ Sn+1 satisfying the following LMIs: + [ x0 1 ]P [ x0 1 ]T ≤ 0, ∀x0 ∈ X0 LT1 P L1 − θLT2 P L2 He (Y1 H) + LT3 D1xw L3 + LT4 D1v L4 − (λ1 + µ) LT5 L5 LT1 ΛL1 − LT2 P L2 He (Y2 H) + LT3 D2xw L3 + LT4 D2v L4 − λ2 LT5 L5 (34) (35) (36) where Λ = diag {α−2 , −1} , λi = Trace Dixw + Trace Div , and 0 Dixw , i = 1, 2. In addition, if µ + θ > 1, then P terminates in a most Tu steps where Tu is given in (22). 2) Affine Invariants: Affine Lyapunov invariants can often establish strong properties, e.g., boundedness, for variables with simple uncoupled dynamics (e.g. counters) at a low computational cost. For variables with more complicated dynamics, affine invariants may simply establish sign-invariance (e.g., xi ≥ 0) or more generally, upper or lower bounds on some linear combination of certain variables. 
As we will observe in Section V, establishing these simple behavioral properties is important as they can be recursively added to the model (e.g., the matrix H in a MILM, or the invariant sets Xi in a graph 22 model) to improve the chances of success in proving stronger properties via higher order invariants. The linear parameterization of the subspace of linear functionals mapping Rn to R, is given by: n Vx1 = V : Rn 7→ R | V (x) = K T [x o 1]T , K ∈ Rn+1 . (37) It is possible to search for the affine invariants via semidefinite programming or linear programming. Proposition 5: SDP Characterization of Linear Invariants: There exists a (θ, µ)-Lyapunov invariant V ∈ Vx1 for a program P ≡ S (F, H, X0 , n, nw , nv ) , if there exists a matrix Y ∈ Rne ×nH , a diagonal (n+nw )×(n+nw ) matrix Dv ∈ Dnv , a positive semidefinite diagonal matrix Dxw ∈ D+ , and a matrix K ∈ Rn+1 satisfying the following LMI: He(LT1 KL5 − θLT5 K T L2 ) ≺ He(Y H) + LT3 Dxw L3 + LT4 Dv L4 − (λ + µ) LT5 L5 (38) where λ = Trace Dxw + Trace Dv and 0 Dxw . Proposition 6: LP Characterization of Linear Invariants: There exists a (θ, µ)-Lyapunov invariant for a program P ≡ S (F, H, X0 , n, nw , nv ) in the class Vx1 , if there exists a matrix Y ∈ R1×nH , and nonnegative matrices Dv , Dv ∈ R1×nv , Dxw , Dxw ∈ R1×(n+nw ) , and a matrix K ∈ Rn+1 satisfying: K T L1 − θK T L2 − Y H − (Dxw − Dxw )L3 − (Dv − Dv )L4 − (D1 + µ) L5 = 0 (39a) D1 + Dv + Dv 1r + Dxw + Dxw 1n+nw ≤ 0 (39b) Dv , Dv , Dxw , Dxw ≥ 0 (39c) where D1 is either 0 or −1. As a special case of (39), a subset of all the affine invariants is characterized by the set of all solutions of the following system of linear equations: K T L1 − θK T L2 + L5 = 0 (40) Remark 1: When the objective is to establish properties of the form Kx ≥ a for a fixed K, (e.g., when establishing sign-invariance for certain variables), matrix K in (38)−(40) is fixed and thus one can make θ a decision variable subject to θ ≥ 0. Exploiting this convexity is extremely helpful for successfully establishing such properties. 23 The advantage of using semidefinite programming is that efficient SDP relaxations for treatment of binary variables exists, though the computational cost is typically higher than the LP-based approach. In contrast, linear programming relaxations of the binary constraints are more involved than the corresponding SDP relaxations. Two extreme remedies can be readily considered. The first is to relax the binary constraints and treat the variables as continuous variables vi ∈ [−1, 1] . The second is to consider each of the 2nv different possibilities (one for each vertex of {−1, 1}nv ) separately. This approach can be useful if nv is small, and is otherwise impractical. More sophisticated schemes can be developed based on hierarchical linear programming relaxations of binary integer programs [52]. C. Optimization of Lyapunov Invariants for Graph Models A linear parameterization of the subspace of polynomial functionals with total degree less than or equal to d is given by: n+d = V : R 7→ R | V (x) = K Z (x) , K ∈ R , N = (41) d where Z (x) is a vector of length n+d , consisting of all monomials of degree less than or equal to d in n d Vxd n T N variables x1 , ..., xn . A linear parametrization of Lyapunov invariants for graph models is defined according d(i) to (10), where for every i ∈ N , we have σi (·) ∈ Vx , where d (i) is a selected degree bound for σi (·) . 
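For the quadratic case, the S-procedure machinery behind Lemma 1 reduces to a single small LMI when the model is a one-node graph with one self-loop. The sketch below is a simplified illustration written by us (the w, v and H data of a full MILM are dropped, and cvxpy with an SDP-capable solver is assumed to be available; SOSTOOLS or YALMIP play this role in the paper): it searches for a quadratic sigma satisfying the decrease condition (12) for the loop "while x >= 0.1: x = 0.5*x - 0.2" on the invariant set [-1, 1], using diagonal and scalar multipliers for the box constraint and the loop guard.

import numpy as np
import cvxpy as cp

theta, mu = 1.0, 0.2
M = np.array([[0.5, -0.2],
              [0.0,  1.0]])             # [x+; 1] = M @ [x; 1]
E_last = np.diag([0.0, 1.0])            # [x;1]' E_last [x;1] = 1
E_box  = np.diag([-1.0, 1.0])           # [x;1]' E_box  [x;1] = 1 - x^2 >= 0 on [-1,1]
G_loop = np.array([[0.0, 0.5],
                   [0.5, -0.1]])        # [x;1]' G_loop [x;1] = x - 0.1 >= 0 (guard)

P   = cp.Variable((2, 2), symmetric=True)   # sigma(x) = [x;1]' P [x;1]
tau = cp.Variable(nonneg=True)              # S-procedure multiplier for the box
nu  = cp.Variable(nonneg=True)              # S-procedure multiplier for the guard

# Sufficient matrix condition for sigma(x+) - theta*sigma(x) <= -mu on
# {x in [-1,1] : x >= 0.1}:
lmi = theta * P - M.T @ P @ M - mu * E_last - tau * E_box - nu * G_loop
prob = cp.Problem(cp.Minimize(0), [0.5 * (lmi + lmi.T) >> 0])   # symmetrized PSD
prob.solve()
print(prob.status)                          # 'optimal': a (1, 0.2)-invariant exists
print(P.value)

Any P returned here certifies a decrease of at least 0.2 per iteration on the loop region; since the resulting sigma is bounded on [-1, 1], Proposition 4 then bounds the number of iterations.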
Depending on the dynamics of the model, the degree bounds d (i) , and the convex relaxation technique, the corresponding optimization problem will become a linear, semidefinite, or SOS optimization problem. 1) Node-wise Polynomial Invariants: We present generic conditions for verification over graph models using SOS programming. Although LMI conditions for verification of linear graph models using quadratic invariants and the S-Procedure for relaxation of non-convex constraints can be formulated, we do not present them here due to space limitations. Such formulations are presented in the extended report [49], along with executable Matlab code in [57]. The following theorem follows from Corollary 2. Theorem 3: Optimization-Based Graph Model Verification. Consider a program P, and its graph d(i) model G (N , E) . Let V : Ωn 7→ R, be given by (10), where σi (·) ∈ Vx . Then, the functions σi (·) , 24 i ∈ N define a Lyapunov invariant for P, if for all (i, j, k) ∈ E we have: k −σj (Tjik (x, w)) + θji σi (x) − µkji ∈ Σ [x, w] subject to (x, w) ∈ k Xi ∩ Πkji × [−1, 1]nw ∩ Sji (42) Furthermore, P satisfies the unreachability property w.r.t. the collection of sets Xi− , i ∈ N \ {∅} , if there exist εi ∈ (0, ∞) , i ∈ N \ {∅} , such that −σ∅ (x) ∈ Σ [x] subject to x ∈ X∅ (43) σi (x) − εi ∈ Σ [x] subject to x ∈ Xi ∩ Xi− , i ∈ N \ {∅} (44) As discussed in Section IV-A2b, the SOS relaxation techniques can be applied for formulating the search problem for functions σi satisfying (42)–(44) as a convex optimization problem. For instance, if k Xi ∩ Πkji × [−1, 1]nw ∩ Sji = {(x, w) | fp (x, w) ≥ 0, hl (x, w) = 0} , then, (42) can be formulated as an SOS optimization problem of the following form: k −σj (Tjik (x, w)) + θji σi (x) − µkji − X p τ p fp − X τpq fp fq − X p,q ρl hl ∈ Σ [x, w] , s.t. τp , τpq ∈ Σ [x, w] . l Software packages such as SOSTOOLS [47] or YALMIP [30] can then be used for formulating the SOS optimization problems as semidefinite programs. V. C ASE S TUDY In this section we apply the framework to the analysis of Program 4 displayed below. / ∗ EuclideanDivision.c ∗ / F0 : int IntegerDivision ( int dd, int dr ) F1 : {int q = {0}; int r = {dd}; F2 : while (r >= dr) { F3 : q = q + 1; F4 : r = r − dr; Fo n: return r; } L0 : int main ( int X, int Y ) { L1 : int rem = {0}; L2 : while (Y > 0) { L3 : rem = IntegerDivision (X , Y); L4 : X = Y; L5 : Y = rem; Lo n: return X; }} Program 4: Euclidean Division and its Graph Model 25 Program 4 takes two positive integers X ∈ [1, M] and Y ∈ [1, M] as the input and returns their greatest common divisor by implementing the Euclidean Division algorithm. Note that the M AIN function in Program 4 uses the I NTEGER D IVISION program (Program 1). A. Global Analysis A global model can be constructed by embedding the dynamics of the I NTEGER D IVISION program within the dynamics of M AIN. A labeled graph model is shown alongside the text of the program. This model has a state space X = N × [−M, M]7 , where N is the set of nodes as shown in the graph, and the global state x = [X, Y, rem, dd, dr, q, r] is an element of the hypercube [−M, M]7 . A reduced graph model can be obtained by combining the effects of consecutive transitions and relabeling the reduced graph model accordingly. While analysis of the full graph model is possible, working with a reduced model is computationally advantageous. Furthermore, mapping the properties of the reduced graph model to the original model is algorithmic. Interested readers may consult [51] for further elaboration on this topic. 
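Before any certificates are computed, the two claims of interest for Program 4 (termination, and all variables remaining in [-M, M] for inputs X, Y in [1, M]) can be exercised by brute-force simulation over all input pairs for a small M. The sketch below is only such a sanity check; the certificates constructed in the remainder of this section are what establish the properties for every M >= 1.

import itertools

def integer_division(dd, dr, seen):
    """The IntegerDivision function of Program 4, recording intermediate values."""
    q, r = 0, dd
    while r >= dr:
        q, r = q + 1, r - dr
        seen.extend([q, r])
    return r

def euclid(X, Y, seen):
    """The MAIN function of Program 4: gcd via repeated Euclidean division."""
    rem = 0
    while Y > 0:
        rem = integer_division(X, Y, seen)
        X, Y = Y, rem
        seen.extend([X, Y, rem])
    return X

M = 50
for X, Y in itertools.product(range(1, M + 1), repeat=2):
    seen = [X, Y]
    g = euclid(X, Y, seen)
    assert X % g == 0 and Y % g == 0              # the result divides both inputs
    assert all(-M <= v <= M for v in seen)        # no value leaves [-M, M]
print("all", M * M, "input pairs terminate within bounds")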
For the graph model of Program 4, a reduced model can be obtained by first eliminating nodes Fno, L4 , L5 , L3 , F0 , F1 , F3 , F4 , and L1 , (Figure 1 Left) and composing the transition and passport labels. Node L2 can be eliminated as well to obtain a further reduced model with only three nodes: F2 , L0 , Lno. (Figure 1 Right). This is the model that we will analyze. The passport and transition labels associated with the reduced model are as follows: Fig. 1. Two reduced models of the graph model of Program 4. 26 1 2 T F2F2 : x 7→ [X, Y, rem, dd, dr, q + 1, r − dr] T F2F2 : x 7→ [Y, r, r, Y, r, 0, Y] T L0F2 : x 7→ [X, Y, 0, X, Y, 0, X] T F2Lo n : x 7→ [Y, r, r, dd, dr, q, r] Π2F2F2 : {x | 1 ≤ r ≤ dr − 1} Π1F2F2 : {x | r ≥ dr} ΠF2Lo n : {x | r ≤ dr − 1, r ≤ 0} Finally, the invariant sets that can be readily included in the graph model (cf. Section II-B2a) are: XL0 = {x | M ≥ X, M ≥ Y, X ≥ 1, Y ≥ 1} , XF2 = {x | dd = X, dr = Y} , XLo n = {x | Y ≤ 0} . We are interested in generating certificates of termination and absence of overflow. First, by recursively searching for linear invariants we are able to establish simple lower bounds on all variables in just two rounds (the properties established in the each round are added to the model and the next round of search begins). For instance, the property X ≥ 1 is established only after Y ≥ 1 is established. These results, which were obtained by applying the first part of Theorem 3 (equations (42)-(43) only) with linear functionals are summarized in Table II. TABLE II Property Proven in Round σF2 (x) = 1 θF2F2 , µ1F2F2 2 θF2F2 , µ2F2F2 q≥0 Y≥1 dr ≥ 1 rem ≥ 0 dd ≥ 1 X≥1 r≥0 I I I I II II II −q 1−Y 1 − dr −rem 1 − dd 1−X −r (1, 1) (1, 0) (1, 0) (1, 0) (0, 0) (0, 0) (0, 0) (0, 0) (0, 0) (0, 0) (0, 0) (0, 0) (0, 0) (0, 0) We then add these properties to the node invariant sets to obtain stronger invariants that certify FTT and boundedness of all variables in [−M, M]. By applying Theorem 3 and SOS programming using YALMIP [30], the following invariants are found1 (after post-processing, rounding the coefficients, and reverifying): σ1F2 (x) = 0.4 (Y − M) (2 + M − r) σ2F2 (x) = (q × Y + r)2 − M2 σ3F2 (x) = (q + r)2 − M2 σ4F2 (x) = 0.1 (Y − M + 5Y × M + Y2 − 6M2 ) σ5F2 (x) = Y + r − 2M + Y × M − M2 σ6F2 (x) = r × Y + Y − M2 − M The properties proven by these invariants are summarized in the Table III. The specifications that the 1 Different choices of polynomial degrees for the Lyapunov invariant function and the multipliers, as well as different choices for θ, µ and different rounding schemes lead to different invariants. Note that rounding is not essential. 27 program terminates and that x ∈ [−M, M]7 for all initial conditions X, Y∈ [1, M] , could not be established in one shot, at least when trying polynomials of degree d ≤ 4. For instance, σ1F2 certifies boundedness of all the variables except q, while σ2F2 and σ3F2 which certify boundedness of all variables including q do not certify FTT. Furthermore, boundedness of some of the variables is established in round II, relying on boundedness properties proven in round I. Given σ (x) ≤ 0 (which is found in round I), second round verification can be done by searching for a strictly positive polynomial p (x) and a nonnegative polynomial q (x) ≥ 0 satisfying: 2 q (x) σ (x) − p (x) ( T x i − M2 ) ≥ 0, 1 2 T ∈ {T F2F2 , T F2F2 } (45) where the inequality (45) is further subject to boundedness properties established in round I, as well as the usual passport conditions and basic invariant set conditions. 
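The round-I entries of Table II are easy to spot-check against the reduced model: each listed function must satisfy the decrease condition (12) with the listed rates along both self-loops at F2. The sketch below is ours and performs that check for three of the entries on a grid of states consistent with the node invariant dd = X, dr = Y; grid sampling illustrates the certificates but does not prove them.

import itertools

# Transition and passport labels of the two F2 self-loops of the reduced model:
T1 = lambda X, Y, rem, dd, dr, q, r: (X, Y, rem, dd, dr, q + 1, r - dr)
T2 = lambda X, Y, rem, dd, dr, q, r: (Y, r, r, Y, r, 0, Y)
pass1 = lambda X, Y, rem, dd, dr, q, r: r >= dr              # Pi^1_F2F2
pass2 = lambda X, Y, rem, dd, dr, q, r: 1 <= r <= dr - 1     # Pi^2_F2F2

# Three round-I certificates from Table II: sigma and (theta, mu) per self-loop.
cases = [(lambda s: -s[5],    (1, 1), (0, 0)),               # proves q >= 0
         (lambda s: 1 - s[1], (1, 0), (0, 0)),               # proves Y >= 1
         (lambda s: 1 - s[4], (1, 0), (0, 0))]               # proves dr >= 1

M = 12
grid = ((X, Y, 0, X, Y, q, r)                                # node invariant: dd=X, dr=Y
        for X, Y, q, r in itertools.product(range(1, M + 1), range(1, M + 1),
                                            range(0, M + 1), range(0, M + 1)))
for s in grid:
    for sigma, (t1, m1), (t2, m2) in cases:
        if pass1(*s):
            assert sigma(T1(*s)) - t1 * sigma(s) <= -m1
        if pass2(*s):
            assert sigma(T2(*s)) - t2 * sigma(s) <= -m2
print("round-I decrease conditions (12) hold on the sampled grid")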
TABLE III Invariant σF2 (x) = σ1F2 (x) σ2F2 (x) , σ3F2 (x) σ4F2 (x) σ5F2 (x) , σ6F2 (x) 1 θF2F2 , µ1F2F2 (1, 0) (1, 0) (1, 0) (1, 1) 2 θF2F2 , µ2F2F2 (1, 0.8) (0, 0) (1, 0.7) (1, 1) Y, X, r, dr, rem, dd q, Y, dr, rem Y, X, r, dr, rem, dd Y, dr, rem Round I: x2i ≤ M2 for xi = Round II: x2i ≤ M2 for xi = Certificate for FTT X, r, dd NO NO X, r, dd NO YES, Tu = 2M2 In conclusion, σ2F2 (x) or σ3F2 (x) in conjunction with σ5F2 (x) or σ6F2 (x) prove finite-time termination of the algorithm, as well as boundedness of all variables within [−M, M] for all initial conditions X, Y ∈ [1, M] , for any M ≥ 1. The provable bound on the number of iterations certified by σ5F2 (x) and σ6F2 (x) is Tu = 2M2 (Corollary 2). If we settle for more conservative specifications, e.g., x ∈ [−kM, kM]7 for all initial conditions X, Y ∈ [1, M] and sufficiently large k, then it is possible to prove the properties in one shot. We show this in the next section. B. MIL-GH Model For comparison, we also constructed the MIL-GH model associated with the reduced graph in Figure 1. The corresponding matrices are omitted for brevity, but details of the model along with executable Matlab 28 verification codes can be found in [57]. The verification theorem used in this analysis is an extension of Theorem 2 to analysis of MIL-GHM for specific numerical values of M, though it is certainly possible to perform this modeling and analysis exercise for parametric bounded values of M. The analysis using the MIL-GHM is in general more conservative than SOS optimization over the graph model presented earlier. This can be attributed to the type of relaxations proposed (similar to those used in Lemma 1) for analysis of MILMs and MIL-GHMs. The benefits are simplified analysis at a typically much less computational cost. The certificate obtained in this way is a single quadratic function (for each numerical value of M), establishing a bound γ (M) satisfying γ (M) ≥ X2 + Y2 + rem2 + dd2 + dr2 + q2 + r2 1/2 . Table IV summarizes the results of this analysis which were performed using both Sedumi 1 3 and LMILAB solvers. TABLE IV 102 103 104 105 106 Solver: LMILAB [18]: γ (M) 5.99M 6.34M 6.43M 6.49M 7.05M Solver: SEDUMI [53]: γ (M) 6.00M 6.34M 6.44M 6.49M NAN 1, 10−3 1, 10−3 1, 10−3 1, 10−3 1, 10−3 1, 10−3 1, 10−3 1, 10−3 M −3 1 θF2F2 , µ1F2F2 1, 10 2 θF2F2 , µ2F2F2 1, 10−3 Upperbound on iterations Tu = 2e4 Tu = 8e4 Tu = 8e5 Tu = 1.5e7 Tu = 3e9 C. Modular Analysis The preceding results were obtained by analysis of a global model which was constructed by embedding the internal dynamics of the program’s functions within the global dynamics of the Main function. In contrast, the idea in modular analysis is to model software as the interconnection of the program’s ”building blocks” or ”modules”, i.e., functions that interact via a set of global variables. The dynamics of the functions are then abstracted via Input/Output behavioral models, typically constituting equality and/or inequality constraints relating the input and output variables. In our framework, the invariant sets of the terminal nodes of the modules (e.g., the set Xno associated with the terminal node Fno in Program 4) provide such I/O model. Thus, richer characterization of the invariant sets of the terminal nodes of the modules are desirable. Correctness of each sub-module must be established separately, while correctness 29 of the entire program will be established by verifying the unreachability and termination properties w.r.t. 
the global variables, as well as verifying that a terminal global state will be reached in finite-time. This way, the program variables that are private to each function are abstracted away from the global dynamics. This approach has the potential to greatly simplify the analysis and improve the scalability of the proposed framework as analysis of large size computer programs is undertaken. In this section, we apply the framework to modular analysis of Program 4. Detailed analysis of the advantages in terms of improving scalability, and the limitations in terms of conservatism the analysis is an important and interesting direction of future research. The first step is to establish correctness of the I NTEGER D IVISION module, for which we obtain σ7F2 (dd, dr, q, r) = (q + r)2 − M2 The function σ7F2 is a (1, 0)-invariant proving boundedness of the state variables of I NTEGER D IVISION. Subject to boundedness, we obtain the function σ8F2 (dd, dr, q, r) = 2r − 11q − 6Z which is a (1, 1)-invariant proving termination of I NTEGER D IVISION . The invariant set of node Fno can thus be characterized by Xno = (dd, dr, q, r) ∈ [0, M]4 | r ≤ dr − 1 The next step is construction of a global model. Given Xno, the assignment at L3: L3 : rem = IntegerDivision (X , Y) can be abstracted by rem = W, s.t. W ∈ [0, M] , W ≤ Y − 1, allowing for construction of a global model with variables X, Y, and rem, and an external state-dependent input W ∈ [0, M] , satisfying W ≤ Y − 1. Finally, the last step is analysis of the global model. We obtain the function σ9L2 (X, Y, rem) = Y ×M−M2 , which is (1, 1)-invariant proving both FTT and boundedness of all variables within [M, M] . 30 VI. C ONCLUDING R EMARKS We took a systems-theoretic approach to software analysis, and presented a framework based on convex optimization of Lyapunov invariants for verification of a range of important specifications for software systems, including finite-time termination and absence of run-time errors such as overflow, out-of-bounds array indexing, division-by-zero, and user-defined program assertions. The verification problem is reduced to solving a numerical optimization problem, which when feasible, results in a certificate for the desired specification. The novelty of the framework, and consequently, the main contributions of this paper are in the systematic transfer of Lyapunov functions and the associated computational techniques from control systems to software analysis. The presented work can be extended in several directions. These include understanding the limitations of modular analysis of programs, perturbation analysis of the Lyapunov certificates to quantify robustness with respect to round-off errors, extension to systems with software in closed loop with hardware, and adaptation of the framework to specific classes of software. A PPENDIX I Semialgebraic Set-Valued Abstractions of Commonly-Used Nonlinearities: – Trigonometric Functions: Abstraction of trigonometric functions can be obtained by approximation of the Taylor series expansion followed by representation of the absolute error by a static bounded uncertainty. For instance, an abstraction of the sin (·) function can be constructed as follows: Abstraction of sin (x) x ∈ [− π2 , π2 ] x ∈ [−π, π] sin (x) ∈ {x + aw | w ∈ [−1, 1]} a = 0.571 a = 3.142 sin (x) ∈ {x − 16 x3 + aw | w ∈ [−1, 1]} a = 0.076 a = 2.027 Abstraction of cos (·) is similar. 
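The error constants in the table above can be confirmed numerically. The sketch below evaluates the worst-case residuals of the two abstractions of sin on a fine grid; it is a sanity check of the constants, not a proof of the set inclusions.

import numpy as np

def max_abs_error(f, approx, lo, hi, n=200_001):
    """Worst-case |f(x) - approx(x)| on a uniform grid over [lo, hi]."""
    x = np.linspace(lo, hi, n)
    return np.max(np.abs(f(x) - approx(x)))

half = (-np.pi / 2, np.pi / 2)
full = (-np.pi, np.pi)
print(max_abs_error(np.sin, lambda x: x, *half))               # ~0.5708 <= 0.571
print(max_abs_error(np.sin, lambda x: x, *full))               # ~3.1416 <= 3.142
print(max_abs_error(np.sin, lambda x: x - x**3 / 6, *half))    # ~0.0752 <= 0.076
print(max_abs_error(np.sin, lambda x: x - x**3 / 6, *full))    # ~2.0261 <= 2.027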
It is also possible to obtain piecewise linear abstractions by first approximating the function by a piece-wise linear (PWL) function and then representing the absolute error by a bounded uncertainty. Section II-B (Proposition 1) establishes universality of representation of generic PWL functions via binary and continuous variables and an algorithmic construction can be found in [49]. For instance, if x ∈ [0, π/2] then a piecewise linear approximation with absolute error less than 31 0.06 can be constructed in the following way: S = (x, v, w) |x = 0.2 [(1 + v) (1 + w2 ) + (1 − v) (3 + w2 )] ,(w, v) ∈ [−1, 1]2 × {−1, 1} sin (x) ∈ {T xE | xE ∈ S} , T : xE 7→ 0.45 (1 + v) x + (1 − v) (0.2x + 0.2) + 0.06w1 (46a) (46b) – The Sign Function (sgn) and the Absolute Value Function (abs): The sign function (sgn(x) = 1I[0,∞) (x) − 1I(−∞,0) (x)) may appear explicitly or as an interpretation of if-then-else blocks in computer programs (see [49] for more details). A particular abstraction of sgn (·) is as follows: sgn(x) ∈ {v | xv ≥ 0, v ∈ {−1, 1}}. Note that sgn (0) is equal to 1, while the abstraction is multi-valued at zero: sgn (0) ∈ {−1, 1} . The absolute value function can be represented (precisely) over [−1, 1] in the following way: abs (x) = {xv | x = 0.5 (v + w) , (w, v) ∈ [−1, 1] × {−1, 1}} More on the systematic construction of function abstractions including those related to floating-point, fixed-point, or modulo arithmetic can be found in the report [49]. A PPENDIX II of Proposition 4: Note that (13)−(15) imply that V is negative-definite along the trajectories of S, except possibly for V (x (0)) which can be zero when η = 0. Let X be any solution of S. Since V is uniformly bounded on X, we have: − kV k∞ ≤ V (x (t)) < 0, ∀x (t) ∈ X , t > 1. Now, assume that there exists a sequence X ≡ (x(0), x(1), . . . , x(t), . . . ) of elements from X satisfying (1), but not reaching a terminal state in finite time. That is, x (t) ∈ / X∞ , ∀t ∈ Z+ . Then, it can be verified that if t > Tu , where Tu is given by (16), we must have: V (x (t)) < − kV k∞ , which contradicts boundedness of V. of Theorem 1: Assume that S has a solution X =(x (0) , ..., x (t− ) , ...) , where x (0) ∈ X0 and x (t− ) ∈ X− . Let γh = inf x∈h−1 (X− ) V (x) 32 First, we claim that γh ≤ max {V (x (t− )), V (x (t− − 1))} . If h = I, we have x (t− ) ∈ h−1 (X− ) and γh ≤ V (x (t− )). If h = f, we have x (t− − 1) ∈ h−1 (X− ) and γh ≤ V (x (t− − 1)), hence the claim. Now, consider the θ = 1 case. Since V is monotonically decreasing along solutions of S, we must have: γh = inf V (x) ≤ max {V (x (t− )), V (x (t− − 1))} ≤ V (x (0)) ≤ sup V (x) x∈h−1 (X− ) (47) x∈X0 which contradicts (17). Note that if µ > 0 and h = I, then (47) holds as a strict inequality and we can replace (17) with its non-strict version. Next, consider case (I) , for which, V need not be monotonic along the trajectories. Partition X0 into two subsets X 0 and X 0 such that X0 = X 0 ∪ X 0 and V (x) ≤ 0 ∀x ∈ X 0 , and V (x) > 0 ∀x ∈ X 0 Now, assume that S has a solution X = (x (0) , ..., x (t− ) , ...) , where x (0) ∈ X 0 and x (t− ) ∈ X− . Since V (x (0)) > 0 and θ < 1, we have V (x (t)) < V (x (0)) , γh = inf ∀t > 0. Therefore, V (x) ≤ max {V (x (t− )), V (x (t− − 1))} ≤ V (x (0)) ≤ sup V (x) x∈h−1 (X− ) x∈X0 which contradicts (17). Next, assume that S has a solution X = (x (0) , ..., x (t− ) , ...) , where x (0) ∈ X 0 and x (t− ) ∈ X− . In this case, regardless of the value of θ, we must have V (x (t)) ≤ 0, ∀t, implying that γh ≤ 0, and hence, contradicting (18). 
Note that if h = I and either µ > 0, or θ > 0, then (18) can be replaced with its non-strict version. Finally, consider case (II). Due to (19), V is strictly monotonically decreasing along the solutions of S. The rest of the argument is similar to the θ = 1 case. of Corollary 1: It follows from (21) and the definition of X− that: n o V (x) ≥ sup α−1 h (x)q − 1 ≥ sup α−1 h (x)∞ − 1 > 0, ∀x ∈ X. (48) It then follows from (48) and (20) that: inf V (x) > 0 ≥ sup V (x) x∈h−1 (X− ) x∈X0 Hence, the first statement of the Corollary follows from Theorem 1. The upperbound on the number of 33 iterations follows from Proposition 4 and the fact that supx∈X\{X− ∪X∞ } |V (x)| ≤ 1. of Corollary 2: The unreachability property follows directly from Theorem 1. The finite time termination property holds because it follows from (12), (23) and (26) along with Proposition 4, that the maximum number of iterations around every simple cycle C is finite. The upperbound on the number of iterations is the sum of the maximum number of iterations over every simple cycle. of Lemma 1: Define xe = (x, w, v, 1)T , where x ∈ [−1, 1]n , w ∈ [−1, 1]nw , v ∈ {−1, 1}nv . Recall that (x, 1)T = L2 xe , and that for all xe satisfying Hxe = 0, there holds: (x+ , 1) = (F xe , 1) = L1 xe . It follows from Proposition 3 that (9) holds if: xTe LT1 P L1 xe − θxTe LT2 P L2 xe ≤ −µ, s.t. Hxe = 0, L3 xe ∈ [−1, 1]n+nw , L4 xe ∈ {−1, 1}nv . (49) Recall from the S-Procedure ((30) and (31)) that the assertion σ (y) ≤ 0, ∀y ∈ [−1, 1]n holds if there exists nonnegative constants τi ≥ 0, i = 1, ..., n, such that σ (y) ≤ P τi (yi2 − 1) = y T τ y − Trace (τ ) , where τ = diag (τi ) 0. Similarly, the assertion σ (y) ≤ 0, ∀y ∈ {−1, 1}n holds if there exists a diagonal matrix µ such that σ (y) ≤ P µi (yi2 − 1) = y T µy − Trace (µ) . Applying these relaxations to (49), we obtain sufficient conditions for (49) to hold: xTe LT1 P L1 xe −θxTe LT2 P L2 xe ≤ xTe Y H + H T Y T xe +xTe LT3 Dxw L3 xe +xTe LT4 Dv L4 xe −µ−Trace(Dxw +Dv ) Together with 0 Dxw , the above condition is equivalent to the LMIs in Lemma 1. 34 R EFERENCES [1] R. Alur, C. Courcoubetis, N. Halbwachs, T. A. Henzinger, P.-H. Ho X. Nicollin, A. Oliviero, J. Sifakis, and S. Yovine. The algorithmic analysis of hybrid systems, Theoretical Computer Science, vol. 138, pp. 3–34, 1995. [2] R. Alur, T. Dang, and F. Ivancic. Reachability analysis of hybrid systems via predicate abstraction. In Hybrid Systems: Computation and Control. LNCS v. 2289, pp. 35–48. Springer Verlag, 2002. [3] C. Baier, B. Haverkort, H. Hermanns, and J.-P. Katoen. Model-checking algorithms for continuous-time Markov chains. IEEE Trans. Soft. Eng., 29(6):524–541, 2003. [4] A. Bemporad, and M. Morari. Control of systems integrating logic, dynamics, and constraints. Automatica, 35(3):407–427, 1999. [5] A. Bemporad, F. D. Torrisi, and M. Morari. Optimization-based verification and stability characterization of piecewise affine and hybrid systems. LNCS v. 1790, pp. 45–58. Springer-Verlag, 2000. [6] D. Bertsimas, and J. Tsitsikilis. Introduction to Linear Optimization. Athena Scientific, 1997. [7] B. Blanchet, P. Cousot, R. Cousot, J. Feret, L. Mauborgne, A. Miné, D. Monniaux, and X. Rival. Design and implementation of a special-purpose static program analyzer for safety-critical real-time embedded software. LNCS v. 2566, pp. 85–108, Springer-Verlag, 2002. [8] J. Bochnak, M. Coste, and M. F. Roy. Real Algebraic Geometry. Springer, 1998. [9] S. Boyd, L.E. Ghaoui, E. Feron, and V. Balakrishnan. 
REFERENCES

[1] R. Alur, C. Courcoubetis, N. Halbwachs, T. A. Henzinger, P.-H. Ho, X. Nicollin, A. Olivero, J. Sifakis, and S. Yovine. The algorithmic analysis of hybrid systems. Theoretical Computer Science, 138:3–34, 1995.
[2] R. Alur, T. Dang, and F. Ivancic. Reachability analysis of hybrid systems via predicate abstraction. In Hybrid Systems: Computation and Control, LNCS v. 2289, pp. 35–48. Springer-Verlag, 2002.
[3] C. Baier, B. Haverkort, H. Hermanns, and J.-P. Katoen. Model-checking algorithms for continuous-time Markov chains. IEEE Trans. Soft. Eng., 29(6):524–541, 2003.
[4] A. Bemporad and M. Morari. Control of systems integrating logic, dynamics, and constraints. Automatica, 35(3):407–427, 1999.
[5] A. Bemporad, F. D. Torrisi, and M. Morari. Optimization-based verification and stability characterization of piecewise affine and hybrid systems. LNCS v. 1790, pp. 45–58. Springer-Verlag, 2000.
[6] D. Bertsimas and J. Tsitsiklis. Introduction to Linear Optimization. Athena Scientific, 1997.
[7] B. Blanchet, P. Cousot, R. Cousot, J. Feret, L. Mauborgne, A. Miné, D. Monniaux, and X. Rival. Design and implementation of a special-purpose static program analyzer for safety-critical real-time embedded software. LNCS v. 2566, pp. 85–108. Springer-Verlag, 2002.
[8] J. Bochnak, M. Coste, and M. F. Roy. Real Algebraic Geometry. Springer, 1998.
[9] S. Boyd, L. El Ghaoui, E. Feron, and V. Balakrishnan. Linear Matrix Inequalities in System and Control Theory. SIAM, 1994.
[10] M. S. Branicky. Multiple Lyapunov functions and other analysis tools for switched and hybrid systems. IEEE Trans. Aut. Ctrl., 43(4):475–482, 1998.
[11] M. S. Branicky, V. S. Borkar, and S. K. Mitter. A unified framework for hybrid control: model and optimal control theory. IEEE Trans. Aut. Ctrl., 43(1):31–45, 1998.
[12] R. W. Brockett. Hybrid models for motion control systems. In Essays in Control: Perspectives in the Theory and its Applications. Birkhäuser, 1994.
[13] E. M. Clarke, O. Grumberg, H. Hiraishi, S. Jha, D. E. Long, K. L. McMillan, and L. A. Ness. Verification of the Futurebus+ cache coherence protocol. Formal Methods in System Design, 6(2):217–232, 1995.
[14] E. M. Clarke, O. Grumberg, and D. A. Peled. Model Checking. MIT Press, 1999.
[15] P. Cousot and R. Cousot. Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In 4th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 238–252, 1977.
[16] P. Cousot. Abstract interpretation based formal methods and future challenges. LNCS v. 2000, pp. 138–143. Springer, 2001.
[17] D. Dams. Abstract Interpretation and Partition Refinement for Model Checking. Ph.D. Thesis, Eindhoven University of Technology, 1996.
[18] P. Gahinet, A. Nemirovskii, and A. Laub. LMILAB: A Package for Manipulating and Solving LMIs. South Natick, MA: The MathWorks, 1994.
[19] ILOG Inc. ILOG CPLEX 9.0 User's Guide. Mountain View, CA, 2003.
[20] A. Girard and G. J. Pappas. Verification using simulation. LNCS v. 3927, pp. 272–286. Springer, 2006.
[21] S. V. Gusev and A. L. Likhtarnikov. Kalman–Popov–Yakubovich Lemma and the S-procedure: A historical essay. Automation and Remote Control, 67(11):1768–1810, 2006.
[22] B. S. Heck, L. M. Wills, and G. J. Vachtsevanos. Software technology for implementing reusable, distributed control systems. IEEE Control Systems Magazine, 23(1):21–35, 2003.
[23] M. S. Hecht. Flow Analysis of Computer Programs. Elsevier Science, 1977.
[24] M. Johansson and A. Rantzer. Computation of piecewise quadratic Lyapunov functions for hybrid systems. IEEE Trans. Aut. Ctrl., 43(4):555–559, 1998.
[25] H. K. Khalil. Nonlinear Systems. Prentice Hall, 2002.
[26] H. Kopetz. Real-Time Systems: Design Principles for Distributed Embedded Applications. Kluwer, 2001.
[27] A. B. Kurzhanski and I. Valyi. Ellipsoidal Calculus for Estimation and Control. Birkhäuser, 1996.
[28] G. Lafferriere, G. J. Pappas, and S. Sastry. Hybrid systems with finite bisimulations. LNCS v. 1567, pp. 186–203. Springer, 1999.
[29] G. Lafferriere, G. J. Pappas, and S. Yovine. Symbolic reachability computations for families of linear vector fields. Journal of Symbolic Computation, 32(3):231–253, 2001.
[30] J. Löfberg. YALMIP: A toolbox for modeling and optimization in MATLAB. In Proc. of the CACSD Conference, 2004. URL: http://control.ee.ethz.ch/˜joloef/yalmip.php
[31] L. Lovász and A. Schrijver. Cones of matrices and set-functions and 0-1 optimization. SIAM Journal on Optimization, 1(2):166–190, 1991.
[32] Z. Manna and A. Pnueli. Temporal Verification of Reactive Systems: Safety. Springer-Verlag, 1995.
[33] W. Marrero, E. Clarke, and S. Jha. Model checking for security protocols. In Proc. DIMACS Workshop on Design and Formal Verification of Security Protocols, 1997.
[34] A. Megretski. Relaxations of quadratic programs in operator theory and system analysis. In Operator Theory: Advances and Applications, v. 129, pp. 365–392. Birkhäuser, 2001.
[35] A. Megretski. Positivity of trigonometric polynomials. In Proc. 42nd IEEE Conference on Decision and Control, pages 3814–3817, 2003.
[36] S. Mitra. A Verification Framework for Hybrid Systems. Ph.D. Thesis, Massachusetts Institute of Technology, September 2007.
[37] C. S. R. Murthy and G. Manimaran. Resource Management in Real-Time Systems and Networks. MIT Press, 2001.
[38] G. Naumovich, L. A. Clarke, and L. J. Osterweil. Verification of communication protocols using data flow analysis. In Proc. 4th ACM SIGSOFT Symposium on the Foundations of Software Engineering, pages 93–105, 1996.
[39] G. L. Nemhauser and L. A. Wolsey. Integer and Combinatorial Optimization. Wiley-Interscience, 1988.
[40] Y. E. Nesterov, H. Wolkowicz, and Y. Ye. Semidefinite programming relaxations of nonconvex quadratic optimization. In Handbook of Semidefinite Programming: Theory, Algorithms, and Applications, pp. 361–419. Kluwer Academic Publishers, Dordrecht, 2000.
[41] F. Nielson, H. Nielson, and C. Hankin. Principles of Program Analysis. Springer, 2004.
[42] P. A. Parrilo. Minimizing polynomial functions. In Algorithmic and Quantitative Real Algebraic Geometry, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, v. 60, pp. 83–100, 2003.
[43] P. A. Parrilo. Structured Semidefinite Programs and Semialgebraic Geometry Methods in Robustness and Optimization. Ph.D. Thesis, California Institute of Technology, 2000.
[44] D. A. Peled. Software Reliability Methods. Springer-Verlag, 2001.
[45] B. C. Pierce. Types and Programming Languages. MIT Press, 2002.
[46] S. Prajna. Optimization-Based Methods for Nonlinear and Hybrid Systems Verification. Ph.D. Thesis, California Institute of Technology, 2005.
[47] S. Prajna, A. Papachristodoulou, P. Seiler, and P. A. Parrilo. SOSTOOLS: Sum of squares optimization toolbox for MATLAB, 2004. URL: http://www.mit.edu/˜parrilo/sostools
[48] S. Prajna and A. Rantzer. Convex programs for temporal verification of nonlinear dynamical systems. SIAM Journal on Control and Optimization, 46(3):999–1021, 2007.
[49] M. Roozbehani, A. Megretski, and E. Feron. Optimization of Lyapunov invariants in analysis of software systems. Available at http://web.mit.edu/mardavij/www/publications.html; also available at http://arxiv.org
[50] M. Roozbehani, A. Megretski, and E. Feron. Safety verification of iterative algorithms over polynomial vector fields. In Proc. 45th IEEE Conference on Decision and Control, pages 6061–6067, 2006.
[51] M. Roozbehani, A. Megretski, E. Frazzoli, and E. Feron. Distributed Lyapunov functions in analysis of graph models of software. In Hybrid Systems: Computation and Control, LNCS v. 4981, pp. 443–456. Springer, 2008.
[52] H. D. Sherali and W. P. Adams. A hierarchy of relaxations and convex hull characterizations for mixed-integer zero-one programming problems. Discrete Applied Mathematics, 52(1):83–106, 1994.
[53] J. F. Sturm. Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones. Optimization Methods and Software, 11–12:625–653, 1999. URL: http://sedumi.mcmaster.ca
[54] A. Tiwari and G. Khanna. Series of abstractions for hybrid automata. In Hybrid Systems: Computation and Control, LNCS v. 2289, pp. 465–478. Springer, 2002.
[55] L. Vandenberghe and S. Boyd. Semidefinite programming. SIAM Review, 38(1):49–95, 1996.
[56] V. A. Yakubovich. S-procedure in nonlinear control theory. Vestnik Leningrad University, 4(1):73–93, 1977.
[57] http://web.mit.edu/mardavij/www/Software