Code Audit and Suricata Extension by CASEC
Authors: Lauritz P. Sømme, Levi A. Tobiassen, Stian H. Bergseth, Vinjar V. Hillestad
Bachelor in Information Security
Bachelor in Software Engineering
20 ECTS
Department of Computer Science and Media Technology
Norwegian University of Science and Technology, 18.05.2016
Supervisor: Basel Katt

Summary of the Bachelor Thesis (Sammendrag av Bacheloroppgaven)

Title: Code Audit and Extension of Suricata Performed by CASEC
Date: 18.05.2016
Participants: Lauritz P. Sømme, Levi A. Tobiassen, Stian H. Bergseth, Vinjar V. Hillestad
Supervisor: Basel Katt
Employer: Kongsberg Defence and Aerospace
Contact person: Erik Hjelmås, erik.hjelmas@ntnu.no, 61135000
Keywords: Suricata, IDS, code audit, SMTP, CASEC
Pages: 139
Availability: Open
Summary: Suricata is a Network Intrusion Detection and Prevention System. Suricata is developed by its users and the open source community. The thesis describes what we did and how we did it when we audited Suricata’s code base and developed new functionality. The functionality we developed gives Suricata support for using Lua scripts against Suricata’s SMTP and MIME data structures. In addition, the thesis describes how we carried this out using agile development methodology and The Value Method to increase the security quality of our result.

Summary of Graduate Project

Title: Code Audit and Suricata Extension by CASEC
Date: 18.05.2016
Authors: Lauritz P. Sømme, Levi A. Tobiassen, Stian H. Bergseth, Vinjar V. Hillestad
Supervisor: Basel Katt
Employer: Kongsberg Defence and Aerospace
Contact Person: Erik Hjelmås, erik.hjelmas@ntnu.no, 61135000
Keywords: Suricata, IDS, Code Review, SMTP, CASEC
Pages: 139
Availability: Open
Abstract: Suricata is an open source Network Intrusion Detection and Prevention System. It is developed by its users and its community. This thesis describes what we did, and how we did it, when we audited the Suricata code base and extended Suricata’s functionality.
The purpose of the extension is to enable support for Lua scripts towards Suricata’s SMTP and MIME data structures. Additionally, the thesis describes how we did this using Agile Software Development with an amplified focus on security using The Value Method.

Preface

First and foremost, we would like to thank our advisor Basel Katt for excellent feedback and guidance. Without his help our thesis would not be as good as it is. We would also like to thank Gaute Wangen for his help in establishing contact with Kongsberg Defence & Aerospace. We would like to thank Anders Sand Frogner and Torgeir Natvig at Kongsberg Defence & Aerospace for good cooperation and feedback throughout our work, and for providing the thesis assignment. We would also like to thank the lead programmer of Suricata, Victor Julien, along with the rest of the Suricata community, for excellent help and support over IRC and email. Finally, thanks to Christian Sømme for great advice on the phrasing of the introductory parts of the thesis.

Contents

Preface
Contents
List of Figures
List of Tables
1 Introduction
  1.1 Background
  1.2 Project goals
    1.2.1 Quantitative Goals
    1.2.2 Qualitative Goals
    1.2.3 Learning Outcome
  1.3 Project Framing
    1.3.1 Financial Framework
    1.3.2 Time Frame
    1.3.3 Jurisdictional Framework
  1.4 Scope
    1.4.1 Thesis
    1.4.2 Limitations
  1.5 Risk Analysis
    1.5.1 Threats
    1.5.2 Risk Assessment
  1.6 Academic Background
  1.7 Group Roles
  1.8 Thesis Structure
2 Background
  2.1 Open-Source Software
  2.2 Intrusion Detection Systems
    2.2.1 Suricata
  2.3 Lua
    2.3.1 LuaJIT
    2.3.2 LuaJIT C API
    2.3.3 Lua Scripts in Suricata
  2.4 Protocols
    2.4.1 SMTP
    2.4.2 MIME
3 Methodology
  3.1 Project Methodology
    3.1.1 Visualizing the Workflow
  3.2 Project Phases
    3.2.1 Research
    3.2.2 Code Review
    3.2.3 Development
    3.2.4 Thesis Writing
    3.2.5 Plan Proposals
  3.3 Development Methodology
    3.3.1 Methodology implementation
  3.4 Code Review Methodology
    3.4.1 Manual Code Review
    3.4.2 Static Code Analysis
  3.5 Secure Software Development Process
    3.5.1 Assessing SSDPs
    3.5.2 SSDP Choice Rationalization
  3.6 The Value Method
    3.6.1 SDL
    3.6.2 Touchpoints
4 Development
  4.1 Requirements
    4.1.1 Use Case Diagram
    4.1.2 Value #2: Security and Privacy Requirements
    4.1.3 Value #3: Security and Privacy Risk Assessment
    4.1.4 Value #4: Attack Surface Analysis
  4.2 Design
    4.2.1 Principles and Standards
    4.2.2 System Architecture
    4.2.3 Value #5 Threat Modeling and Abuse Cases
  4.3 Implementation
    4.3.1 Data Structure Architecture
    4.3.2 Suricata and SMTP
    4.3.3 Suricata and MIME
    4.3.4 Refactoring
  4.4 Testing
    4.4.1 Value #8 Perform Dynamic Analysis
    4.4.2 Value #9 Perform Fuzz Testing
    4.4.3 Value #10 Conduct Attack Surface Review
    4.4.4 Value #11 Risk-Based Security Testing
    4.4.5 Performance Testing
    4.4.6 Automatic Builds for Testing
  4.5 Delivery
    4.5.1 Quality Assurance by OISF
    4.5.2 Pull Request to Suricata
    4.5.3 Value #13 Certify Release and Archive
    4.5.4 API Documentation
  4.6 Summary
5 Code review
  5.1 Manual analysis
    5.1.1 Review: alert-unified2-alert.c
    5.1.2 Review: decode-ipv6.c
    5.1.3 Review: decode-ipv4.c
    5.1.4 Selective Review
  5.2 Value Practice #7: Perform Static Analysis
    5.2.1 Introduction
    5.2.2 Metrics
    5.2.3 Result Discussion
    5.2.4 Important Results
    5.2.5 Result Summary
  5.3 Summary
6 Conclusions
  6.1 Research Conclusion
  6.2 Development Conclusion
  6.3 Code Review Conclusion
  6.4 Main conclusion
7 Future Work and Lessons Learned
  7.1 Future Work
  7.2 Lessons Learned
Bibliography
A util-lua-smtp.c
B Pull Request Feedback
C Suricata Source Code Research
  C.1 Research util-lua-http
    C.1.1 Introduction
    C.1.2 Functions
  C.2 Research util-lua
    C.2.1 Introduction
    C.2.2 Functions
    C.2.3 Description
  C.3 Research util-lua-ssh.c
    C.3.1 Functions
  C.4 Research util-lua-dns
    C.4.1 Introduction
    C.4.2 Functions
  C.5 Research util-lua-common
    C.5.1 Description and functions
  C.6 Research output-lua.c
    C.6.1 Description
    C.6.2 Functions
  C.7 Research detect-lua-extensions.c
    C.7.1 Introduction
    C.7.2 Function Description
  C.8 Research detect-lua
    C.8.1 detect-lua.h
    C.8.2 Research detect-lua.c
D Manual Code Analysis Results
  D.1 Result of grep function
    D.1.1 Audit: source-nfq.c
E Static Code Analysis Results
F Skyhigh implementation
  F.1 Create Instance:
  F.2 Manage RSA keys:
  F.3 Install Git:
  F.4 Install dev-tools and other libraries from the Ubuntu repository.
  F.5 Clone our version of Suricata from Github:
  F.6 Get LuaJIT:
  F.7 Compile LuaJIT:
  F.8 Clone libhtp from Github:
  F.9 Compile Suricata
  F.10 Skyhigh resources:
G Data Structure Documentation
  G.1 SMTPState
  G.2 SMTPTransaction
  G.3 MimeDecEntity
H Testing
  H.1 Raw SMTP Transaction
I Gantt diagram
J Meeting attendance log

List of Figures

1 Contributing actors
2 Suricata SMTP rule
3 Dynamic Typing
4 Domain Matching Example
5 Self Signed Certificate Matching [1]
6 SMTP exchange between client and server [2]
7 The three different approaches to the project implementation
8 The work flow of development
9 V-model, explaining testing connections [3]
10 Use case diagram of extension to Suricata
11 SMTP related structs in Suricata
12 Excerpt of SMTPState and SMTPTransaction
13 Excerpt of MimeDecField
14 After refactoring
15 Before refactoring
16 Excerpt from regression testfile
17 SMTPGetMailFrom example
18 SMTPGetMimeField example
19 SMTPGetRcptList example
20 SMTPGetAttachmentFilename example
21 SMTPGetMimeList example

List of Tables

1 Impact level
2 Probability level
3 Risk scores by threat
4 Development board
5 Research Board
6 Code Audit Board
7 Template high level use case
8 High level use case 1
9 Low level use case 1
10 Low level use case 2
11 Low level use case 3
12 Low level use case 4
13 Low level use case 5
14 Probability classes
15 Impact Classes
16 Risk Matrix: Bug 36
17 Risk Matrix: Bug 38
18 Risk Matrix: Bug 39
19 Risk Matrix: Bug 45

Our Problem Statement:

Security monitoring systems are an integral part of a modern and secure network design. Having confidence in the secure implementation of the security monitoring systems gives confidence in the security of the network.
Therefore it is a necessity to have a security focus while developing security monitoring systems and to ensure that existing software is securely implemented. Our goal is to verify the implementation of such a system and add to its feature set in a secure way by using established secure development practices.

1 Introduction

1.1 Background

The digital threat landscape is ever changing, but one constant attack vector has been network based attacks. These network based attacks have been an increasing threat to business and private actors [4]. One way of getting an edge in preventing such attacks, or responding to them after they have occurred, is to deploy a network monitoring system. Some of these systems are developed by the open source community while others are managed by traditional corporations. We believe that the value of having openly available tools that can combat complex problems without direct cost is intangible. This cyber landscape assessment, combined with the intrigue of the collaborative security community, serves as the main motivation for this thesis.

What made this thesis possible is that it is a part of our bachelor degree. We were also given the freedom to contact external entities and ask them to guide us through the project as a customer, enabling us to help them by providing them with a product. We contacted Gaute Wangen, PhD student at NTNU Gjøvik, and asked him to get us in touch with Kongsberg Defence & Aerospace. We explained our main motivations and collectively agreed on a product that would be beneficial to both of us: the work in this thesis, which is to extend and audit Suricata, a Network Security Monitoring engine (NSM engine) [5].

1.2 Project goals

1.2.1 Quantitative Goals

• 100% of the code will be accepted into the Suricata project.
• The code addition will result in a 0% decrease in general Suricata performance.
• The Lua engine running a simple Lua script will be able to process 10 000 Simple Mail Transfer Protocol (SMTP) packets a second.
• We will audit all code with interfaces to the code we are going to write.
• We will perform static analysis of 100% of the Suricata code base.
• We will supply code fixes for 100% of all serious security related findings.
• We will submit all documentation we create ourselves to the Suricata development team.

1.2.2 Qualitative Goals

• We will create code that makes it possible to write SMTP rules in Lua script.
• We will contribute to the Suricata community as best as possible.
• Our code audit findings will improve the general security of the Suricata project.
• Our improvements to the Suricata project will be helpful to KDA.

1.2.3 Learning Outcome

The course description for IMT3912, Bachelor’s thesis, describes a set of learning goals [6]. Based on the goals set forth by NTNU we created our own list of areas where the thesis work should enable us to acquire more knowledge and experience.

• Contributing to open source software development
• Executing a code review
• Secure code development
• IDS technology
• Working with an external actor and employer
• Advanced C programming
• Working with projects of great magnitude
• Project management
• Documenting findings in a scientific manner

1.3 Project Framing

1.3.1 Financial Framework

• KDA will provide all equipment required by the bachelor thesis work.
• For testing the software developed, the school’s experimental cloud, Skyhigh, has been made available for our group.

1.3.2 Time Frame

The bachelor thesis has a final due date of 18th of May 2016. As the thesis amounts to 20 credits, a minimum of 600 work hours per group member is expected, which amounts to 30 work hours per member per week.

1.3.3 Jurisdictional Framework

• KDA will provide a Non-Disclosure Agreement (NDA) for all group members to sign.
• All group members who intend to deliver the bachelor thesis are also obligated to sign a project agreement provided by NTNU Gjøvik.
• The course description of IMT3912 is a binding contract agreement between the group members and the university.
• Open Information Security Foundation (OISF) requires that all contributors to the Suricata project sign the OISF contribution agreement and donate the material to the project. After the submission of the material it will be placed under the current software license of Suricata.

1.4 Scope

1.4.1 Thesis

Kongsberg Defence & Aerospace wants us to test the viability of Suricata, an open source NSM engine. Suricata is a software based system that monitors network traffic and compares it with predefined rules and signatures [7]. KDA has requested a security audit of the Suricata code base. As the code base of Suricata contains more than four hundred thousand lines of code, we have to limit the scope of the manual audit to a part of Suricata. A code audit is a manual or automatic analysis of source code that reveals syntactic or semantic mistakes (bugs) made by the programmer, and it can also be used to discover security issues.

Suricata has a built-in Lua scripting engine that allows more complex logic to be applied in the matching of traffic than the original rule based system. Our task will be to extend this functionality to allow Lua scripts to act upon SMTP traffic.

1.4.2 Limitations

Due to the size of Suricata’s code base and our inexperience in auditing code, we will have to limit the size of the manual audit. We will manually audit our own code and all code with interfaces directly connected to it. Additional manual auditing will be done if we find it necessary. The scope for the Lua scripting functionality will be defined in the requirements from KDA. This project will not focus on performing complete comparisons or analysis of programming methodologies or auditing tools.
The group’s equipment budget is mostly limited to what NTNU Gjøvik and KDA can provide us with.

1.5 Risk Analysis

Any threat affecting this project could potentially have a significant impact on the grading of our project. Grading can in turn affect how hirable group members are, internal group relations, and relations to the school and the project owner. Not adequately assessing potential risks before entering the project would be an unforgivable oversight given the importance of the subject. The following are the risks we believe the project work may face from inception to final presentation. Threat impact is based on how a threat will affect the final report, as the final report is the product of this project. We have set a risk threshold for this analysis at a risk score of 5. We will need to consider mitigations for any threat with a risk score above the threshold.

Table 1: Impact level
Impact            Description
1. Low            This threat will not have a significant impact on the final report but it will make our work noticeably harder.
2. Moderate       The threat will have a noticeable impact on our final thesis.
3. High           The threat will have a significant negative impact on the final thesis.
4. Catastrophic   The effect of this threat will make it very unlikely for us to finish the thesis.

Table 2: Probability level
Probability       Description
1. Low            This threat has a 0-25% chance of occurring during the project.
2. Medium         This threat has a 25-75% chance of occurring during the project.
3. High           This threat is likely to occur at least once during the project.

1.5.1 Threats

1. An essential group member is stricken by sickness for a limited amount of time.
2. The scope of the development is too large. We are unable to complete the work.
3. The development is too technically hard to be completed.
4. The code quality is too bad to be accepted by the Suricata project.
5. Auditing finds little to no results.
6. Auditing is scoped too large and consumes too much time.
7. Fuzzing is too technically hard to be done.
8. Our hardware is too limited or we cannot set up the proper environment to perform realistic testing.
9. Implementing methodologies is so time consuming that it affects our ability to complete the practical tasks.
10. Our methodology choices are not optimal, and the argumentation for our choices is not sufficient.
11. Group conflicts degrade the working environment.
12. Group members leave the group.
13. Work from other sources is not cited properly.
14. Group members refuse to contribute to the project.
15. We are unable to gain contact with KDA.
16. Our advisor is unable to help us with our problems.
17. The Suricata code base is too large and complex for us to properly model the program.

1.5.2 Risk Assessment

After analyzing each threat and its potential impact we assigned an impact level and a probability level to each threat. The risk score is the product of the two. The results are found in Table 3. The most significant risk to us is threat 3, “The development is too technically hard to be completed”, which with a risk score of 6 is the only threat above the threshold of 5. The other threats are within the acceptable risk threshold. To reduce the probability of threat 3 we will spend a lot of effort in our research phase to get acquainted with the Suricata code base.

Table 3: Risk scores by threat
Threat nr.   Impact   Probability   Risk score
1            1        2             2
2            2        2             4
3            3        2             6
4            2        2             4
5            2        1             2
6            2        2             4
7            2        2             4
8            2        2             4
9            2        2             4
10           3        1             3
11           1        3             3
12           3        1             3
13           4        1             4
14           3        1             3
15           2        1             2
16           2        1             2
17           2        2             4

1.6 Academic Background

Out of the four group members, three are students of the Bachelor of Information Security and one is studying the Bachelor of Software Engineering at NTNU Gjøvik. Both are bachelor programs amounting to a total of 180 credits, scheduled to be completed in three years of full-time study. Although these bachelor programs specialize in different directions, both are built upon a foundation of general informatics and computer science courses.
Both programs have courses in common, such as introduction to programming, systems engineering, operating systems, data communication and network security, database systems and software security.

All group members have basic programming knowledge in C and C++ as well as in scripting languages like bash, PowerShell and Python. More important than knowing the syntax of different programming and scripting languages is the understanding of semantics and logic. The group members have good experience with imperative thinking and are well prepared to contribute both syntactically and logically sound code to the Suricata project.

The information security students have knowledge of risk assessment and analysis, incident management, in-depth network administration and security, ethical hacking and penetration testing, as well as first hand experience with the ISO 27001 security framework. This will affect the project positively, as the results of the audit, as well as the requirements and design sections of development, will need to be analyzed with security in mind in order to develop and audit code of adequate quality.

The student of software engineering has experience with software development methodology, system requirements, IT governance, in-depth system engineering, multi-threaded programming, and secure software development. As Suricata already consists of more than four hundred thousand lines of code, the developed code has to build upon an existing system and fit in with all data structures and files it will interact with. This will require a substantial amount of research into the existing platform and all interacting libraries the system uses.

1.7 Group Roles

Group Leader
The group leader is responsible for making the room reservations for the scheduled meetings. The group leader also has a double vote.
Group Leader: Lauritz P. Sømme
Substitute group leader: Stian H. Bergseth

OISF communication representative
The representative from CASEC is responsible for communicating with OISF, the creator and maintainer of Suricata. The representative must also convey all communication with OISF to the rest of the group members. At every group meeting the OISF communication representative presents the latest communication between OISF and CASEC.
OISF communication representative: Levi A. Tobiassen

Employer communication representative
The employer communication representative from CASEC is responsible for communicating with the employer, KDA. The representative must also convey all communication with the employer to the rest of the group members. At every group meeting the communication representative presents the latest communication between the employer and CASEC.
Employer communication representative: Stian H. Bergseth

NTNU communication representative
The NTNU communication representative from CASEC is responsible for communicating with NTNU. The representative must also convey all communication between the group and the university to the rest of the group members. At every group meeting the communication representative presents the latest communication between the university and CASEC.
NTNU communication representative: Lauritz P. Sømme

Figure 1: Contributing actors

As seen in Figure 1 there are external contributing actors to our bachelor thesis. We have received help through meetings, email conversations and Internet Relay Chat. From our perspective this has helped raise the quality of the thesis considerably. All external actors have contributed in different ways to help us reach our goals. The Suricata community has years of experience on the platform and has been a very good source of information and a sparring partner for our questions.
The employer has been available throughout the project with product feedback, system requirements, and feature specifications. With constant availability and meetings every other week, our supervisor has helped us form a proper thesis to base our report on, and provided us with excellent academic guidance.

1.8 Thesis Structure

Our thesis is divided into several chapters and sections. This is a description of the different chapters. All chapters, sections and subsections are listed in the Table of Contents.

Problem statement
This page is dedicated to the problem statement.

Introduction
This chapter covers the background of the thesis, the goals we want to reach, the framing and scope of the thesis, and a risk analysis for the project work.

Background
This chapter describes the background for the thesis. This includes a description of the different technologies that the reader should know about in order to understand the context and content of the thesis.

Methodology
This chapter describes the methodology used for the report itself, the methodology of its different phases, the supplementing frameworks for the different methodologies, and the process of selecting them.

Development
This chapter describes the development phase of our thesis, our findings, and the results for each phase of the development life cycle.

Code Review
This chapter describes the code review phase of our thesis, and our findings and results from this phase. The phase is divided into static analysis and manual review.

Conclusions
In this chapter we conclude on the overall results of each phase for the entire thesis.

2 Background

2.1 Open-Source Software

“Open-Source software (OSS) is computer software with its source code made available with a license in which the copyright holder provides the rights to study, change, and distribute the software to anyone and for any purpose.
Open-source software may be developed in a collaborative public manner.” [8]

OSS differs from proprietary software in that the source code is made available to everyone. This enables others to learn from the source code, make their own personal adjustments, or help the developers with code contributions for bug fixes or other improvements. Another benefit of OSS is that users can personally verify that the code is secure. This way of thinking is popularly condensed into “Linus’s Law”, which goes “given enough eyeballs, all bugs are shallow” [9].

A less permissive alternative to the OSS licenses is Free Open-Source Software (FOSS). FOSS mainly differs from OSS on an ideological basis. OSS only guarantees that you will be able to read the source code, while FOSS also guarantees, among other things, that you can modify it for your own use and distribute it with your own changes [10].

2.2 Intrusion Detection Systems

“Intrusion detection is the process of monitoring the events occurring in a computer system or network and analyzing them for signs of possible incidents, which are violations or imminent threats of violation of computer security policies, acceptable use policies, or standard security practices.” [11]

Intrusion detection in computer networks is done by using network intrusion detection systems (NIDS). These detection systems are split into two main types: anomaly based and rule based. Anomaly based systems compare traffic patterns against a baseline built from protocol types and bandwidth used. Rule based systems, also called signature based systems, use a set of predefined rules to match traffic patterns. Each type has its strengths and weaknesses. Anomaly based detection can detect unknown threats, but it requires training periods to establish the baseline, and it needs to be retrained for each change in traffic pattern. Rule based detection does not require a training period and is effective from day one, but it cannot detect unknown threats.
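The difference between the two detection types can be illustrated with a minimal Python sketch. It is a toy model only and not how Suricata works internally: the signature table, the function names and the three standard deviation threshold are assumptions made for the illustration.

```python
from statistics import mean, stdev

# Hypothetical signature set: byte pattern -> description of the threat.
SIGNATURES = {
    b"cmd.exe": "suspicious shell invocation",
    b"/etc/passwd": "attempt to read password file",
}


def match_signatures(payload):
    """Rule based detection: report every known pattern found in the payload."""
    return [label for pattern, label in SIGNATURES.items() if pattern in payload]


def is_anomalous(baseline, observed, threshold=3.0):
    """Anomaly based detection: flag a value that lies far from the trained baseline.

    `baseline` holds measurements from the training period (for example
    bandwidth per minute). The threshold of three standard deviations is an
    arbitrary choice for this sketch.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        return observed != mu
    return abs(observed - mu) / sigma > threshold
```

The rule based function works without any training but only finds patterns that are already in the table, while the anomaly based function can flag traffic nobody has written a rule for, provided the baseline is kept up to date.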
Both types of systems can produce a high number of false positive alarms if configured incorrectly. Intrusion Detection and Prevention Systems (IDPS) are a variant of IDS that allows traffic blocking. The system will block traffic when it detects an anomaly or matches a traffic pattern. The Suricata project uses the term Network Security Monitoring engine, which is another name for a NIDS.

2.2.1 Suricata

Suricata is a rule based NIDS and can also be used as an IPS if it is placed in line [12]. It uses a signature language to match on the different types of traffic. The Suricata code base is multi-threaded, modern, clean and highly scalable, making a single instance of the program capable of inspecting multi-gigabit traffic [13]. This is the key feature of Suricata, although there are many more. Its features range from different types of engines and operating system support to several protocol parsers and different forms of output. The features of Suricata most relevant to this thesis are the use of the Lua scripting language, the protocol parsers for SMTP and MIME, and the use of C in the source code.

Rules
Signatures have a very central role in Suricata, as they are the only way of matching traffic. In most cases users will use predefined rule sets, the most common being the Emerging Threats rule set [14]. Emerging Threats is a collection of several security projects, usually related to intrusion detection and network traffic analysis. Their main project is the Emerging Threats rule set, which addresses several attack vectors with different rules/signatures [15]. These signatures are loaded into Suricata and matched against data on the networks that Suricata is set to listen on. In general, a rule consists of three properties: an action, a header and rule options. An example rule can be seen in Figure 2. There are four types of actions: pass, drop, reject and alert. An action determines what Suricata will do when a rule matches.
A header consists of a protocol and different rule variables [16]. The rule options include a message. This message can be set to whatever the user sees fit, but most commonly it should relate to the rest of the rule, or identify it.

    alert tcp any any -> any ![25,587,465] (msg:"SURICATA SMTP but not tcp port 25,587,465";
    flow:to_server; app-layer-protocol:smtp; sid:2271006; rev:1;)

Figure 2: Suricata SMTP rule

Licensing and Suricata
Suricata is considered FOSS because it uses the GNU GPLv2 license, which the GNU project recognizes as FOSS [17][18]. The Suricata project is managed entirely by the Open Information Security Foundation (OISF), which claims Suricata will remain FOSS forever. The purpose of creating OISF was itself to have a safe haven for Suricata [19]. OISF is registered as a non-profit organization and is funded by donations from all over the world [20]. All parts of Suricata, including Suricata's bug tracker, development road map and code, are available for all to see at any given time. Discussions and decisions about the further development of Suricata are made in the open, and anyone is free to submit their ideas for new features.

The fact that Suricata is publicly developed and uses a free open license is essential for us to be able to carry out this thesis. Contributing to Suricata would likely not be possible for us if it were under a proprietary license, nor would we be able to learn from the source code. Openly shared documentation, source code, and developer discussions, combined with an active and accessible developer community, amount to a huge relief in our workload.

2.3 Lua

Lua is an interpreted scripting language.
“An interpreted language is a programming language for which most of its implementations execute instructions directly, without previously compiling a program into machine-language instructions.” [21] Lua scripts are thus not compiled to binary code and run directly by the computer. They are instead parsed and executed by a Lua interpreter [22]. Other examples of interpreted scripting languages are Python and JavaScript. Lua’s memory footprint and compiled size are small compared to Python and similar languages. Lua’s interpreter, excluding libraries, is under 100KB, compared to the Python core which is 824KB [23]. From this we conclude that Lua generates less memory overhead and has less bloated code.

Lua types are dynamic and largely based on associative tables. Dynamic typing means that the types are deduced during runtime [24]. This allows developers to write quick and simple scripts without explicitly defining the types of variables. It also means Lua is able to perform operations on different types without explicit type casting in the source code.

    A = 1            -- number ("int")
    B = "2"          -- string ("char")
    C = A + B        -- magic typecasting on B
    assert(C == 3)   -- C is a number ("int")

Figure 3: Dynamic Typing

2.3.1 LuaJIT

The LuaJIT project provides a just-in-time compiler for Lua [25]. This just-in-time compiler is a substitute for an interpreter and instead compiles the Lua code to machine code, like a traditional ahead-of-time compiler would [26]. The difference is that LuaJIT does it during run time. This allows the code to keep its flexibility and still gain a speed-up during execution. There is a latency drawback compared to ahead-of-time compiling, but this mostly occurs during startup and is negligible when the code runs for an extended period of time.

2.3.2 LuaJIT C API

LuaJIT can be embedded into other applications. This is how it is used in Suricata, where it extends the functionality available in C. The LuaJIT C API is based on a virtual stack.
If one wants to interact with LuaJIT from the C code, it is necessary to interact with the provided virtual stack. Data is made available in Lua by pushing it onto the stack from the C code. Likewise, values are popped from the stack when the goal is to retrieve data from the Lua script. It is also possible to access values on the stack using their index.

2.3.3 Lua Scripts in Suricata

Lua in Suricata
Lua is mainly used in Suricata for writing rules that do content matching. This is done by writing a Suricata rule that refers to the name of the Lua script using the keyword for the just-in-time compiler, LuaJIT. It is also possible to use the regular Suricata rule to do preliminary matching before the Lua script is used. Matching content in a Lua script is much more powerful than using plain Suricata rules, because the matching can be done programmatically. One requirement for the Lua script is that it returns either 1 or 0, where 1 means a match and 0 means no match. If needed, the script can decide not to match on the incoming data.

Lua scripts in Suricata
Lua scripts in Suricata have a general structure. An init function is called to select which data to extract from the network flow in Suricata. Suricata extracts the data requested by the init function and makes it available to the "match" function in the Lua script. It is then possible to use Suricata's functions to extract specific header fields. The user can also request the raw packet through the “needs” table in the Lua script, which enables the user to interact with the packet directly. This functionality is needed when the available functions are not adequate. Scripts are generally separated into two functional categories: scripts for rule matching and scripts for creating specified output [27]. Both output and rule scripts are written the same way; they are only categorized by their purpose.
Lua scripts are a powerful tool for defining rules and output programmatically, and we believe that the functionality provided by this scripting option can make writing, reading and documenting Suricata rules much easier. The documentation provided by Suricata on this subject is sparse and does not give a complete view of all the currently available functionality.

The match part of the script is where the content matching is done and where it is decided whether or not to trigger the rule. This simple match function triggers the rule if the requested host was "google.com".

    function init(args)
        local needs = {}
        needs["http.request_line"] = tostring(true)
        return needs
    end

    function match(args)
        local bad_domain = "google.com"
        local req_domain = HttpGetRequestHost() -- This function gets the requested host

        if req_domain == bad_domain then
            return 1 -- Triggers the calling rule
        end

        return 0 -- The content did not match what we were looking for
    end

Figure 4: Domain Matching Example

Figure 4 depicts the general structure of most Suricata Lua rules. The real difference lies in what data is chosen for retrieval using the needs option. The "packet" option allows one to get the entire packet including headers, while the "payload" option gets the packet payload (not the stream) [28]. The official documentation seems to be somewhat lacking, as all the functions for HTTP and TLS documented in [27] are available for rule writing as well.

Example of a Suricata rule where the Lua script self-signed-cert.lua is invoked:

    alert tls any any -> any any (msg:"SURICATA TLS Self Signed Certificate";
    flow:established; luajit:self-signed-cert.lua;
    tls.store; classtype:protocol-command-decode;
    sid:999666111; rev:1;)

Example of a very simple Lua script for checking self-signed certificates using the TLS functions. It should be noted that the init function is not included in this example.

    function match(args)
        version, subject, issuer, fingerprint = TlsGetCertInfo();

        if subject == issuer then
            return 1
        else
            return 0
        end
    end

Figure 5: Self Signed Certificate Matching [1].

2.4 Protocols

2.4.1 SMTP

“The objective of the Simple Mail Transfer Protocol (SMTP) is to transfer mail reliably and efficiently. SMTP is independent of the particular transmission subsystem and requires only a reliable ordered data stream channel.” [29]

This means that SMTP can be used over any transport that provides a reliable, ordered data stream. What SMTP does is relay and send e-mail. The idea is that a user, A, can send a message to a user, B, no matter where user A or user B connects from. To achieve this we need a globally unique identifier for each user, known as an e-mail address. The layout of an e-mail address is specified in RFC 5322 [30], but the gist of it is: a local identifier, an “@”, followed by a domain, a “.”, and a Top-Level Domain (TLD) specifier. This results in a string formatted like this: "userof@domain.TLD".

User A and user B might not have their mailboxes at the same host, and there might not be a direct connection between the hosts. To still be able to deliver mail, SMTP supports mail relaying. If user A has their mailbox at smtp1 and user B has their mailbox at smtp3, but there exists no direct connection between smtp1 and smtp3, a relay server smtp2 can serve as a midway point if smtp2 has a direct connection with both smtp1 and smtp3 [31].

SMTP transports a mail object. A mail object contains an envelope and content. The SMTP envelope is sent as a series of SMTP protocol units.
It consists of an originator address to which error reports should be directed, one or more recipient addresses, and optional protocol extension material. Figure 6 is an example transaction for e-mails. It shows how the client and server communicate with each other.

    S: 220 smtp.server.com Simple Mail Transfer Service Ready
    C: HELO client.example.com
    S: 250 Hello client.example.com
    C: MAIL FROM:<mail@samlogic.com>
    S: 250 OK
    C: RCPT TO:<john@mail.com>
    S: 250 OK
    C: DATA
    S: 354 Send message content; end with <CRLF>.<CRLF>
    C: <The message data (body text, subject, e-mail header, attachments etc) is sent>
    C: .
    S: 250 OK, message accepted for delivery: queued as 12345
    C: QUIT
    S: 221 Bye

Figure 6: SMTP exchange between client and server [2]

2.4.2 MIME

Multipurpose Internet Mail Extensions (MIME) is an internet standard that is mainly designed for extending the Internet Message Format, or in other terms, email. MIME extends the email functionality by enabling a set of features not originally found. Sending non-text attachments, like video or audio, and text in character sets other than ASCII are some of the features included in MIME. MIME functionality and its integration with SMTP and email is specified in several RFCs [32]. Examples of MIME header fields are Content-Type, Content-Transfer-Encoding, and encoded-word [33]. The MIME header fields contain the information of the different fields in an email, defined in RFC 2822 [34]. The MIME standard does not, however, limit the potential header fields to the ones defined in an RFC. Anyone can define their own custom MIME fields, giving increased functionality and relevance.

In conclusion: a MIME entity and its MIME header fields consist of information that describes an email, which in turn is valuable information to a security system with the goal of detecting suspicious traffic, malicious traffic and policy violations in a network. In other words, MIME information is important for the function of Suricata.
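To make the value of MIME header fields for detection concrete, the following sketch uses the `email` package from the Python standard library to list the MIME entities of a message. It is an illustration only and is unrelated to Suricata's own MIME parser: the sample message, the attachment name `report.exe` and the two helper functions are assumptions made for this example, while the addresses are borrowed from Figure 6.

```python
from email import message_from_string

# Hypothetical message content, as it would be sent after the DATA command.
RAW_MESSAGE = (
    "From: mail@samlogic.com\r\n"
    "To: john@mail.com\r\n"
    "Subject: Report\r\n"
    "MIME-Version: 1.0\r\n"
    'Content-Type: multipart/mixed; boundary="sep"\r\n'
    "\r\n"
    "--sep\r\n"
    "Content-Type: text/plain; charset=utf-8\r\n"
    "\r\n"
    "See attachment.\r\n"
    "--sep\r\n"
    "Content-Type: application/octet-stream\r\n"
    "Content-Transfer-Encoding: base64\r\n"
    'Content-Disposition: attachment; filename="report.exe"\r\n'
    "\r\n"
    "TVqQAAMAAAA=\r\n"
    "--sep--\r\n"
)


def summarize_parts(raw):
    """Return (content type, transfer encoding, file name) for each leaf MIME entity."""
    msg = message_from_string(raw)
    summary = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # skip the container, keep only the actual entities
        summary.append((
            part.get_content_type(),
            part.get("Content-Transfer-Encoding"),
            part.get_filename(),
        ))
    return summary


def has_executable_attachment(raw):
    """Toy policy check: does any attachment have a file name ending in .exe?"""
    return any(
        name is not None and name.lower().endswith(".exe")
        for _, _, name in summarize_parts(raw)
    )
```

A rule or script with access to the same fields could, for instance, alert on executable attachments without having to search the raw byte stream for them.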
3 Methodology

3.1 Project Methodology

There are different factors to focus on when choosing and defining the methodology to use for this project. The entire group consists of just four people. All group members have limited experience with practicing and committing to a strict methodology. The time frame is limited to four months. After reviewing the project phases, the time frame, and the group's current knowledge of code review, contributing to an open source project, and writing optimal C code, the decision was made to choose an agile methodology. This is based on the fact that the group as a whole takes part in this project as a learning process, and we will therefore have to revisit previous or early work as the group members gain more knowledge. A methodology that could work with different phases, and was agile enough to adapt between phases, was chosen, as the project does not purely consist of software development.

When choosing an agile methodology, the choice was between Scrum and Kanban. Scrum is a more prescribed tool in many ways because it comes with a framework of "rules" for the team to follow, such as indirect limitations on Work in Progress (WIP), since each sprint is locked and defined. Building Scrum teams with specialized roles and activities within the iteration process would also have been too large in scale for our project. We therefore believe that Scrum would fit a larger, more development focused project better than our diverse project. Scrum has constraints and a defined framework to follow, and it is not possible to change this basic framework and still call it Scrum [35]. To use it in our project we would have to follow this framework, or define what we do as not being Scrum.

Kanban is defined as a more non-prescribed methodology, with only two solid rules:
1. Visualize the work flow
2. Directly limit the WIP lists, or swim lanes, on the Kanban board to a certain number.
It is possible to build upon Kanban with self-defined activities and to edit the methodology between the phases of our project. We perceive Kanban as an agile methodology that, if used correctly, fits our project. Kanban does not define a time-boxed iterative working process, but meetings twice every week (regular Monday and Thursday meetings) will be used to go through the work that has been done on each task. The meetings will be used to demo current tasks, review these tasks, and plan and decide which tasks should enter the board. The workload will be reviewed every Thursday, and room will be made for redefining the WIP limits. This is referred to as the Retrospective meeting.

There is a consensus in our group against keeping a time log, because we feel that it adds unnecessary overhead and we do not think time is a good measurement of contribution effort. Participation will be tracked during the mandatory meetings, as this will allow necessary changes to be made in order to improve participation.

3.1.1 Visualizing the Workflow

The workflow will be visualized by creating four different implementations of Kanban boards in the Trello collaboration tool [36]. Three of these will be connected to the actual project phases: research, code review, and development. These will be named after their respective phases. The last board will represent all tasks not directly related to the main phases, such as organizing meetings, sending emails, etc. All group members can claim tasks from all the lanes on all boards. Exceptions are listed in the table comments. Boards include a product backlog. Lists inside boards also have a work in progress limit. These numbers limit the number of tasks allowed in each column simultaneously. All tasks in the code-review list will be sent to the code audit board and have to be completed there before they can continue on the development board and enter the test lane.
    WIP limits:  3  2  1  2  2
    Lanes:       To do | Design | Development | Code-review | Test | Release

Table 4: Development board

The group member that claims a task from the to do list is responsible for that task until it reaches the done lane.

    WIP limits:  4  4  4
    Lanes:       To do | Research | Present | Done

Table 5: Research Board

The person that has written the code cannot do the code review of that specific piece of code. This board lacks lane limits because we are not familiar enough with audit methodology to set these with confidence.

    Lanes:       To do | Review | Solutions | Done

Table 6: Code Audit Board

3.2 Project Phases

Three different high level plans have been considered for the project work. All three consist of the same components; the difference lies in the order they are completed in. The plan components are as follows:

3.2.1 Research

The goal for this phase is to perform all research needed to start the development or code review phase. The Suricata source code will be the main reading subject, starting with a Lua scripting module comparable to the one we will write, and then all source code with interfaces towards that code. A product of this research will be a conceptual model of the Suricata program.

3.2.2 Code Review

The code review is one of the two main phases of this project. Our goal with the code review is to further the overall security of Suricata. We also need to make sure that our newly written code is secure and well written, so that it can be accepted into the project. The code review will be split into manual code review and static code analysis.

3.2.3 Development

This is the other main phase of the project. The development will be run as an individual development project with its own methodology. All code will be written and tested during this phase. While not directly related to programming, a testing environment will be required during this phase.

3.2.4 Thesis Writing

This phase will be used to collect all documentation from the previous phases, and use that to build the thesis.
The thesis presentation will also be finalized during this phase. Any bugs or flaws in our code that may surface from late testing or feedback will be fixed. We will work to get our code accepted for merging into the Suricata main project. The documentation and code review findings will also be sent to the Suricata project.

3.2.5 Plan Proposals

Plan 1
This plan is sequential and is based on doing the code review before the development. Having the code review first is beneficial because the knowledge gained from the code review will help us during the development phase. We can also focus our work on just one of the two main phases at a time. This plan also guarantees that everyone in the group gets to work with both subjects. One issue with this plan is that we need a smaller code review phase after the development either way, because we need to review our own source code. Having four people working on the same phase at the same time is also probably not the most effective. This plan also makes the project less agile. It will be harder for us to apply what we have learned from the code review in the development, or vice versa.

Plan 2
In this plan the code audit and development phases will be run simultaneously. The group will be split into two teams, and each will work on one phase in parallel. With this approach we can use knowledge gained from one phase in the other. We can also ensure that each group member gets experience in both phases by swapping assigned tasks between the teams. Thus the code will to some extent be naturally peer reviewed. An issue with this approach is that swapping teams will be time consuming. The teams will have to read up on the changes that have been made by the other team. Team members will also not be able to specialize if the teams are swapped.

Figure 7: The three different approaches to the project implementation

Plan 3
This plan is very similar to plan 2 in that the code review and development will be run simultaneously.
The main difference between the two is that plan 3 does not have dedicated groups for each phase. Tasks for the code review will be gathered on one board and tasks for the development on another. Group members can then choose the tasks they want to work on themselves. This plan allows us to have all the positive aspects of plan 2 while avoiding the time consuming swapping phase. A potential issue with this plan is that project members may not get an equal amount of experience in both phases. This is a minor issue, though, because the individuals would have chosen that themselves. Being unable to properly prioritize the chosen tasks would be a more serious issue if the most pressing tasks were left unchosen.

Plan Choice Rationalization
Plan 3 was chosen for this project. The freedom of choosing what subject to work on was valued highly. The transition time needed to swap subjects between the groups in Plan 2 also seemed like a major issue. Plan 1 was also heavily considered, but we concluded that splitting the phases into two individual time blocks would compromise the quality of the first phase in favor of the last.

3.3 Development Methodology

The development is the second of our two main phases. It is very important that the group implements, and operates in, a suitable development methodology. A defined and customized methodology will in most cases increase the quality of general project management. It may also help visualize the work flow, both through supporting tools and as a framework built around cooperation. The methodology in our project and this development phase is implemented to ensure the best possible environment for cooperation, learning and quality assurance. In the project plan the group selected Kanban as the agile methodology to use throughout the entire project. Because Kanban is defined as a customizable and lightweight methodology, building on two fundamental principles, it will fit with our development phase.

Agile vs. Traditional
At this moment in time there are two very different categories of development methodologies. The oldest of them is the traditional methodologies. The main philosophy behind many of the traditional software development methodologies is to plan and schedule all processes before they are started, making them plan-driven processes [37]. One of the traditional methodologies, and probably the best known, is the Waterfall method. This methodology is based upon five fundamental phases of software development: requirements, design, implementation, testing and deployment. The Waterfall method dictates that each of these phases should be completed in its entirety before a new phase is started. It is not allowed to go back and edit one phase while the project is already within, or in between, later phases.

Agile development methodology is a more product and customer focused category of methodologies. Methods like eXtreme Programming, Scrum and Kanban are examples of agile development methodologies where the process itself is not a goal. Satisfying the customer with working software made in close contact with the customer, the ability to respond to change, and a focus on individuals and interactions are the most important pillars of agile methodology. The Agile Manifesto was formed by seventeen pioneers within the computer science community, and has also been signed by thousands of people [38]. The manifesto states the values of agile software development. It was created to help others within the profession think about software development, methodologies, and organization in new, more agile ways. Agile methodologies may be experienced as freer than the traditional methods, and although this may be true, all agile methods have a process framework of some sort, explaining and describing how to use the specific methodology.
3.3.1 Methodology implementation

As described in the project methodology part, the group members will assign tasks to themselves based on whether they prefer to participate in the development or in the audit part. This may also shift, as group members are free to change their project area during this period of the project. To make this work, it was agreed that all group members should be kept informed of the current status of the development phase. The methodology supports this by making sure that the work flow is visualized and available to all group members. Using the first part of the weekly meetings to share the current status of the development was also made a common practice. This helps ensure that everyone can take part in the development at any point without extensive training.

The project development phase will also be affected by the implementation of The Value Method as a secure software development process in parallel with Kanban. This includes implementing obligatory practices in each phase of the development and iterating over them. As defined by the software development life cycle, there are five parts of development: requirements, design, implementation, testing, and deployment. Each part will be iterated multiple times during the project, with the exception of the deployment phase. The deployment part of this project consists of delivering the product to both the employer and the project community. In practice, this means that implementation and delivery will be split into two different iterative processes, as we want to deliver the product with all functionality included and only iterate over improvements or feedback the larger community may discover.
Figure 8: The work flow of development

The requirement phase of the development should consist of creating both a conceptual overview and a specification of what is to be implemented, how it should work, and the requirements of the software. This should include both user requirements and system requirements, where the latter ideally work as an expansion of the former, with the addition of technical specifications and a detailed explanation of how the user requirements are to be met. A dedicated software requirements document extending beyond the project plan will not be created, as Kanban is a methodology with a focus on agile work flow. The project plan will be used as the foundation for the requirements from the employer, and we will iterate over the more technical details and the functionality to be implemented in meetings with the employer as the project progresses. The software requirements are expected to gradually grow in quantity as a result of this, but still keep clear connections to their foundation in the project plan.

Architectural design will be performed as a part of the software design process, which will in practice function as a link between the system requirements and the design engineering. The architectural design will map out the actual model of how the implementation should look, with regard to the functional and non-functional requirements established in the previous phase. This design process will seek to discover potential design patterns to use, functional and non-functional design principles to follow, and the architectural structure of the software to be developed.

Test suite
The test phase of development will include unit, regression, integration and acceptance tests. Unit testing is the process of testing the different components of a system. This can be done both automatically and manually. The components being tested can range in size from simple functions and methods to larger pieces of implemented functionality.
A unit test is usually run with defined input and asserts the expected output, thereby deeming the test a success or a failure. A regression test suite is made to ensure that no new bugs or flaws are introduced to the software while the implementation is ongoing. This test suite should be run whenever untested code is introduced, or when fixing software bugs, to check that the changes did not introduce new problems. The test backlog should grow incrementally as new functionality enters the software. Integration tests check that subsystems integrate properly into larger parts of the software, and that the new software does not break any key components. Acceptance testing is the final test suite to be run. It will ensure that the developed software meets the requirements set in earlier phases of the development, covering both system requirements and user tests.

Our methods of testing will try to follow the V-model, as seen in Figure 9. This connects each testing practice to a specific part of the development process. We believe that using these testing methods in an agile development methodology will enhance the quality of the product and make it easier to discover bugs or flaws in early stages of development. As each iteration of Kanban will include each of the four development phases (probably more than once), the V-model will follow each iteration. This means that each iteration will try to include all tests, except for the acceptance tests. As shown in the V-model, there is a connection between each type of test and each high-level phase in the SDLC: unit tests to implementation, integration tests to design, and so on. This ensures that all phases of the SDLC are tested against and quality assured.
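The difference between a unit test and a regression test can be illustrated with a minimal sketch. It is written in Python for brevity rather than in the C and Lua used in the project, and the function parse_mail_from together with its test cases is hypothetical and not part of Suricata. Each test feeds defined input to one small component and asserts the expected output. The third test stands for a case that is kept in the suite after an imagined earlier bug, so that it is re-run whenever new code is added.

```python
import unittest


def parse_mail_from(line):
    """Hypothetical helper: return the address in an SMTP MAIL FROM line, or None."""
    prefix = "MAIL FROM:"
    if not line.upper().startswith(prefix):
        return None
    rest = line[len(prefix):].strip()
    # Ignore optional parameters after the address, e.g. "SIZE=1024".
    candidate = rest.split(" ", 1)[0]
    # An empty address is treated as missing in this sketch.
    if len(candidate) > 2 and candidate.startswith("<") and candidate.endswith(">"):
        return candidate[1:-1]
    return None


class ParseMailFromTests(unittest.TestCase):
    def test_plain_address(self):
        # Unit test: defined input, asserted expected output.
        self.assertEqual(
            parse_mail_from("MAIL FROM:<alice@example.com>"), "alice@example.com"
        )

    def test_other_command_is_rejected(self):
        self.assertIsNone(parse_mail_from("RCPT TO:<bob@example.com>"))

    def test_regression_lowercase_with_parameters(self):
        # Regression test: stays in the suite so an (imagined) old bug with
        # lower-case commands and trailing parameters cannot silently return.
        self.assertEqual(
            parse_mail_from("mail from:<bob@example.com> SIZE=1024"),
            "bob@example.com",
        )


def run_suite():
    """Run every test collected so far and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParseMailFromTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_suite()
```

Running the whole suite after every change is what turns the accumulated unit tests into a regression suite: new functionality adds new test methods, and none of the old ones are removed.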
Functionality will be manually unit tested on the fly as it is implemented. This also results in a need to run regression tests, both because unit tests are expected to uncover bugs or flaws in our code and because more code is continuously being developed. Integration and acceptance testing will include performance tests to ensure that the developed code does not slow down, or in any way drastically affect, the performance of Suricata. They will also check that all functional and non-functional requirements are fulfilled, that all design principles are properly followed, and that the architectural model of the result is correct. As the project framework does not include deployment of the code, the last milestone of this project will be delivery of the product to OISF and the Suricata project. The delivery will most likely be iterated at the end of the project, as the developed code will be reviewed by the Suricata community.

The developed code will undergo manual unit tests, regression tests, integration tests, and realistically configured performance tests. To accomplish unit testing with good code coverage we had to discuss how to design and execute the tests.

Figure 9: V-model, explaining testing connections [3]

Pcap files were produced that contain raw SMTP test traffic, accompanied by a rule set that triggers the Lua script we wrote to test our functionality. This was done because our code will work as an API towards Lua that makes SMTP data available. The Lua script will contain all API calls needed to test the different functions, and will be designed to log and print both successful output and error messages in case anything goes wrong. The script will also work as the regression test suite, as it can keep tests for all functionality and run through all previous tests when new functionality is introduced.

3.4 Code Review Methodology

The code review is, together with development, one of the two main phases of this project.
Our goal with the review process is to further the overall security of Suricata. We also need to make sure that our newly written code is secure and well written so that it can be accepted into the project. A good starting point is to have a methodology in place. The review will be split into manual code review and static code analysis.

3.4.1 Manual Code Review

“You might be amused to note that using grep to search code for words like "bug," "XXX," "fix," "here," and best of all "assume" often reveals interesting and relevant tidbits. Any good security source code review should start with that.” [39]

The manual code review needs to start as soon as possible so that we get the best possible understanding of C programming and the project source code. The main challenge with the manual audit will probably be to identify the source files with the most potential, so that we do not waste our time. There will be two different approaches for selecting entry points in the code to review. The first focuses on reviewing the code with the most potential for findings. It involves searching through the source code for promising comments such as "hack", "bug", "XXX" and "TODO". The results should give plenty of entry points for code review. The second approach is to define what we believe is the most critical code in Suricata. That approach may not yield the most findings, but it covers what is probably the most important code to review. Together, these two approaches should amount to a balanced code review that covers a reasonable amount of code of reasonable importance within our limited time.

The methodology for the manual review is based upon a set of different practices and checklists. The checklist we will use is from Liberty University and contains points to be checked in six categories [40]:
• Structure
• Documentation
• Variables
• Arithmetic Operations
• Loops and Branches
• Defensive Programming

SmartBear's list, "Best practices for code review", will be used as part of the methodology for the manual review. SmartBear is a company that specializes in the development of tools for testing, monitoring, and more. Their list contains a set of rules for completing a successful code review [41]. A key point in their list is that a code review should be done in pairs, so there will be two group members doing code review on the same files. The other practices in their list are:

• Review fewer than 400 lines of code at a time
• Inspection rates should be under 500 lines of code per hour
• Do not review for more than 60 minutes at a time
• Set goals and capture metrics
• Authors should annotate source code before the review
• Use checklists
• Establish a process for fixing defects found
• Foster a positive code review culture
• Embrace the subconscious implications of peer review
• Practice lightweight code reviews

Not all of the practices mentioned will be followed, just the ones that are relevant to the project.

3.4.2 Static Code Analysis

Only about half of our total review time frame will go to manual code review; the other half will be used for static analysis. This decision came naturally, as the Suricata code base contains more than four hundred thousand lines of code and a manual review of the entire code base would be a next to impossible task. The quality of static code analyzers has reached a level where we believe it is possible to use a static analysis tool for the initial analysis and then manually iterate over the findings to validate true positives. We plan to run a static analysis tool on the entire Suricata code base and manually validate the results. Even though static analyzers are a mature part of the auditing tool chain, their results cannot be treated directly as actual security findings.
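The manual validation step described above can be sketched as a small bookkeeping structure. This is only an illustration in Python: the Finding record, the Verdict values, the tool names and the file paths are all hypothetical and do not reflect the output format of any particular analyzer. Ordering the review queue by how many tools report the same location is an assumption of this sketch, not a rule taken from the methodology.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    UNREVIEWED = "unreviewed"
    TRUE_POSITIVE = "true positive"
    FALSE_POSITIVE = "false positive"


@dataclass
class Finding:
    tool: str      # name of the analyzer that produced the finding (hypothetical)
    path: str      # source file the finding points at
    line: int      # line number reported by the tool
    message: str   # the tool's own description
    verdict: Verdict = Verdict.UNREVIEWED


def group_by_location(findings):
    """Collect findings that point at the same file and line."""
    grouped = defaultdict(list)
    for finding in findings:
        grouped[(finding.path, finding.line)].append(finding)
    return dict(grouped)


def review_queue(findings):
    """Return locations that still need manual review.

    Locations reported by more distinct tools come first (sketch assumption).
    """
    pending = {
        location: items
        for location, items in group_by_location(findings).items()
        if any(item.verdict is Verdict.UNREVIEWED for item in items)
    }
    return sorted(
        pending,
        key=lambda location: (-len({f.tool for f in pending[location]}), location),
    )


def confirmed(findings):
    """Only manually confirmed findings count as actual results."""
    return [f for f in findings if f.verdict is Verdict.TRUE_POSITIVE]


if __name__ == "__main__":
    raw = [
        Finding("tool_a", "src/example-a.c", 10, "possible null dereference"),
        Finding("tool_b", "src/example-a.c", 10, "pointer may be NULL"),
        Finding("tool_a", "src/example-b.c", 5, "unchecked return value"),
    ]
    print(review_queue(raw))
```

The point of the structure is that a raw tool report never becomes a finding by itself: it starts as unreviewed and only enters the confirmed list after a reviewer has set its verdict.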
The depth of the manual confirmation and documentation of the results will probably depend on the number of results. We also hope to cross-correlate the results of multiple static analyzers; this may be critical if the number of results we get initially is either very low or very high.

3.5 Secure Software Development Process

System Development Life Cycle (SDLC) is a term used to describe the process of planning, creating, testing and deploying an information system. An SDLC ensures the quality of the system development process, and it consists of phases such as system investigation, system analysis, design, testing and more. All of the phases aim to ensure a high quality result based on the customer requirements [42]. KDA is the “customer” in this process, although we added some requirements ourselves. Part of our goal is to develop secure software, so securing the different processes during development is an important requirement. We discovered that few SDLCs have an adequate focus on security, and decided that a Secure Software Development Process (SSDP) would fulfill the security requirements. Part of the research was therefore to discover and decide on an SSDP that would complement our SDLC with a satisfying amount of security in its phases and processes. This part of the thesis assesses and compares the different SSDPs we found, and concludes with which SSDP we decided to use.

3.5.1 Assessing SSDPs

We looked at the three most acknowledged secure software development frameworks. The frameworks assessed in this section are "Security Development Lifecycle (SDL)" by Microsoft, "Comprehensive, Lightweight Application Security Process (CLASP)" by OWASP, and "Touchpoints" by McGraw. These are described in the paper “On the secure software development process: CLASP, SDL and Touchpoints compared."
[43] This paper was key for gathering knowledge about the processes and for deciding which of the processes was most suitable for our development phase. The paper was provided to us by our thesis supervisor, so we consider it a reliable resource.

Requirements for the choice of SSDP
• Easy to use
• Fit for a small project
• Easily applicable to smaller groups
• Easily accessible
• Clear and transparent
• Having adequate security practices

SDL

The SDL is a result of Microsoft’s proclaimed commitment to trustworthy computing in 2002. SDL is a set of activities which complement Microsoft’s development process and are particularly aimed at addressing security issues. SDL is made up of seven phases: Training, Requirements, Design, Implementation, Verification, Release and Response. Each phase contains at least one “SDL Practice”. The goal of each “SDL Practice” is to ensure that security is thoroughly addressed in its phase. The practices are described in detail, making them easy to complete and use. Some of the practices also come with tools and software that help the user complete the practice. An example is “SDL Practice 12: Perform Fuzz Testing", which has resources specific to the practice, including tools for fuzz testing, videos, training and documentation. [44] [?]

Key characteristics of SDL
• Security as a support to functional quality.
• Several activities have continuous characteristics so you can improve on intermediate results.
• Good methods that guide you through executing activities.

CLASP

CLASP stands for Comprehensive, Lightweight Application Security Process and was originally defined by Secure Software and later donated to OWASP. CLASP is a lightweight process for building secure software. It is the result of analyzing several development life cycles and their system resources, and decomposing these in order to create a set of security requirements.
The comprehensive size of this set and its requirements are the foundation of CLASP’s best practices, which enable users to systematically address vulnerabilities and thereby prevent key security services from being compromised. In summary, CLASP consists of five features, which in turn branch further into other parts of CLASP. These five features are: CLASP Views, CLASP Best Practices, 24 CLASP Activities, CLASP Resources, and Taxonomy of CLASP. Each of these consists of different practices, activities and information, all aimed at developing secure systems and software. [45] [46]

Key characteristics of CLASP
• Having security in a central role.
• Keeping the process light by only supplying activities without defining which activities to select or when to execute them.
• There is an extensive set of resources that facilitate the implementation of activities.

Software Security Touchpoints

The Seven Touchpoints are described in Gary McGraw's book Software Security: Building Security In [47]. We refer to The Seven Touchpoints as Touchpoints. Here are the Touchpoints in order of effectiveness:

1. Code review
2. Architectural risk analysis
3. Penetration testing
4. Risk-based security tests
5. Abuse cases
6. Security requirements
7. Security operations

The touch points specify activities or processes that should be done in each phase of an SDLC, resulting in secure software and systems. For instance, security requirements are set in the requirements and use case phase of a project. All of the activities are done for each iteration of the SDLC. The activities and processes in Touchpoints consist of different best practices and have been adopted by large organizations such as Cigital, the U.S. Department of Homeland Security, and Ernst and Young. McGraw divides the touch points into two categories, destructive and constructive activities. Destructive activities are related to attacks, exploits and breaking software.
In contrast, the constructive ones are about design, defense and functionality. [48]

Key characteristics of Touchpoints
• It incorporates a risk management framework into the software development life cycle.
• Activities are prioritized, allowing for a gradual approach.
• Touchpoints includes resources and examples on how to execute activities.

3.5.2 SSDP Choice Rationalization

Many of the SSDPs have valuable practices and information that are suitable for our thesis. We decided to make our own SSDP based on the most valuable practices and sections from the other SSDPs. This resulted in an SSDP that we named The Value Method.

The paper referred to earlier in this section concludes that “SDL is very thorough in architectural threat analysis” [43]. One of our concerns was that we would change the data received from other parts of Suricata, as we state in 4.1.2. It is also important that our code is secure, as stated in the problem statement. This is why the SDL practices that we found adequate were added to The Value Method. Many practices from SDL were added to The Value Method, including, but not limited to, static analysis, attack surface analysis, threat modeling and attack surface review. Not only will these activities increase our knowledge of the code base, they will also give us the confidence to write code without being afraid of causing collateral damage to other parts of Suricata.

On the other hand, the conclusion for CLASP states that its greatest strength is its good support for architectural design. This is not of great concern to us, due to the already well established design of Suricata. By comparing SDL with CLASP, we found that the structure of SDL makes it easier to get a good overview of the processes. CLASP seemed too big, and it was harder to grasp its functions and framework. This made the functionality of SDL and Touchpoints a more viable addition to The Value Method, and we therefore excluded CLASP entirely from our methodology.
Touchpoints, on the other hand, with its flexibility and its mix of black hat and white hat activities, will be a viable addition to The Value Method. Some of the touch points overlap with the functionality of SDL. We therefore decided to add to The Value Method only the parts of Touchpoints that we found relevant to the thesis and that were not present in SDL. In the end this resulted in the addition of touch point five, Abuse cases, to The Value Method. Some of the practices in SDL are meant for bigger projects; these will be discarded from The Value Method or modified to fit into it. Most of the SDL practices fulfill our requirements, and that is why The Value Method is based on SDL. Which practices and touch points are included in The Value Method, and which of them overlap, will be elaborated on in the sections to come.

3.6 The Value Method

The Value Method is our own take on an SSDP. It is a combination of selected parts from two different SSDPs, Touchpoints and SDL. We decided to make our own SSDP because we could not use all of SDL, and there were also activities from Touchpoints that would benefit our process. Another reason for making our own SSDP was the time constraint and the scope of our development phase. The existing SSDPs are made for creating entire systems, while our development goal is to extend the feature set of Suricata. Many of the processes and practices in these SSDPs are meant for big projects and would not be feasible to complete during our project. Adjustments were therefore made to both SDL and Touchpoints to make them fit a smaller project, resulting in The Value Method. The Value Method will be closer to SDL than to Touchpoints. However, many of the touch points and the SDL practices overlap, and this will be highlighted in the explanation of the SDL practices and phases.
Our SSDP has the word "Value" in its name because we get the best value for our thesis by combining selected parts of two SSDPs instead of carrying out both of them in full. Practices in The Value Method:

• Value Practice #1: Security Training
• Value Practice #2: Security Requirements
• Value Practice #3: Security and Privacy risk assessments
• Value Practice #4: Attack Surface Analysis
• Value Practice #5: Threat Modeling and Abuse Cases
• Value Practice #6: Deprecate Unsafe Functions
• Value Practice #7: Perform Static Analysis
• Value Practice #8: Perform Dynamic Analysis
• Value Practice #9: Perform Fuzz Testing
• Value Practice #10: Conduct Attack Surface Review
• Value Practice #11: Risk-Based Security Testing
• Value Practice #12: Conduct Final Security Review
• Value Practice #13: Certify Release and Archive

3.6.1 SDL

SDL consists of several phases, and each phase has at least one practice. Each phase and its practices correspond with a phase of the SDLC, and their function is to apply adequate security in that phase. For instance, the requirements phase of SDL corresponds with the requirements phase of the SDLC. It is, however, not a given that an SDL phase always corresponds with an SDLC phase. An example is the training phase of SDL, which works as a prerequisite that ensures that the software development team members have appropriate knowledge of how to develop secure software. This section explains which practices and processes were chosen and why. All of the practices specify when in the development process they should be done. Our use of SDL is limited to the SDL practices only. This is due to the size of the development phase.

SDL in Agile Development

SDL was first developed as an addition to a more traditional way of creating software, the SDLC. In recent times Agile development methods have become very popular, with Scrum leading the way. Our development methodology, Kanban, is an Agile development method, and we need to fit a secure development strategy into it.
Microsoft has a recommended approach for utilizing SDL in Agile [49]. Their recommendation is to do some practices for each iteration:

• SDL practice #1
• SDL practice #7
• SDL practice #8
• SDL practice #9
• SDL practice #10
• SDL practice #15
• SDL practice #16

Some can be done for more than one iteration at a time:

• SDL practice #3
• SDL practice #11
• SDL practice #12
• SDL practice #13

And some are meant to be done just once:

• SDL practice #2
• SDL practice #4
• SDL practice #5
• SDL practice #6
• SDL practice #14

After Microsoft released SDL there were some discussions around the web about how good it was, and whether it is viable to retrofit a framework for traditional development methodologies into Agile methods [50][51]. With this in mind we will evaluate each practice in SDL and select the best parts for our project.

Phase 1: Training phase

SDL Practice #1: Core Security Training

Microsoft describes this practice as a prerequisite for implementing the SDL. This practice includes training in how to use SDL, but also training in secure design, threat modeling, secure coding, security testing, and best practices surrounding privacy [52].

The entire group has completed the course “Software security”, and we believe that this, among other courses, has provided us with a sufficient academic background for completing the development phase of our thesis in a secure way. This includes training in concepts like secure coding, security design, threat modeling, security testing, how to use SDL, and best practices surrounding the security of the SDLC. Parts of the research phase will also serve as Core Security Training, so this practice is completed within that phase. Touchpoints also addresses training, but in a different way: through a knowledge management framework. However, the book does not fully elaborate on how this knowledge can be used throughout the different touch points [43].
This practice is therefore not affected by Touchpoints, and is completed in the SDL way.

Phase 2: Requirements Phase

SDL Practice #2: Establish security and privacy requirements

This practice defines and integrates security and privacy requirements. It also defines minimum security and privacy criteria for the application [53].

With security being a main focus both in the software we are developing and in the thesis, it was decided that this SDL practice will be applied and added to The Value Method. This ensures that our contribution to Suricata has well established security and privacy requirements. This practice overlaps with the touch point Security requirements [43].

SDL practice #3: Create quality gates/bug bars

This practice defines the accepted level of security and privacy quality. Its purpose is to help the development team understand the risks associated with security issues, and to identify and fix security bugs during development [53].

Our contribution to Suricata consists of a manageable amount of code. This practice will not be necessary, and is not added to The Value Method, as the final contribution should not include any known bugs. Because of the small size, it will be easier and faster to do a thorough code review and audit. If we discover bugs in our code, they will be fixed and documented. If we were to create quality gates and bug bars, the bug bar would be set to zero bugs of any kind, and the quality bar would be that our code should be of equal quality to the rest of the Suricata code base.

SDL practice #4: Perform security and privacy risk assessments

Investigating the software design with cost and known rules and regulations in mind will help a team identify which parts of a project require threat modeling and security design reviews before release. The team should also determine the Privacy Impact Rating of a feature, product or service [53].

As mentioned, the security standards of the Suricata project are high.
Even so, we believe that a security risk assessment will help identify parts of the project that need either further attention to security or detailed threat modeling. It will also make us more familiar with the code base and better able to address the threats discovered. This practice is therefore added to The Value Method.

Phase 3: Design Phase

SDL practice #5: Establish design requirements

The idea of this practice is to address security and privacy early, and thereby reduce the risk of breaking the schedule and reduce project expenses. This is done by accurately producing complete design specifications, including security relevant aspects such as minimal cryptographic design requirements and a specification review [54].

One threat mentioned in the project plan is lost time. There is no budget for our project, so reducing project expenses is not one of our concerns. Time loss is addressed through the use of Kanban. We therefore conclude that validating our design specifications against a functional specification is not relevant to us.

SDL practice #6: Perform attack surface analysis/reduction

The purpose of performing an attack surface analysis and reducing the risks found is to reduce the opportunities for attackers to exploit potential weak spots and vulnerabilities in the system or in the code. The practice also includes disabling or restricting access to system services, applying the principle of least privilege, and employing layered defenses wherever possible [54].

The part of our code that processes data retrieved from Lua is considered an attack surface. This is because we do not control the other surfaces where data enters Suricata. We also do not control the actual Lua scripts written by the user, and we do not know which system Suricata is deployed in.
This makes it impossible for us to address risks in the user's system services or to apply the principle of least privilege in the user's system. This practice has been reduced to focus on analyzing the attack surfaces of our code, and it is added to The Value Method under the name attack surface analysis.

SDL practice #7: Use Threat modeling

This practice states that it is important to have a structured approach during the design process. It helps the team identify security vulnerabilities more effectively and less expensively. It also identifies threats, so that they can be addressed and mitigated [54].

We believe threat modeling will help identify security vulnerabilities and determine the risks from those threats. Threat modeling will also help establish appropriate mitigations, and it is included in The Value Method. This practice corresponds with the Architectural risk analysis touch point. The Architectural risk analysis of Touchpoints is somewhat bigger than SDL’s threat modeling practice. However, both include threat modeling through STRIDE.

Phase 4: Implementation phase

Similar activities exist in Touchpoints.

SDL practice #8: Use Approved Tools

This practice recommends that the users of SDL make and maintain a list of approved tools and associated security checks, such as compiler options and warnings. SDL claims that this practice will help automate and enforce security practices at a low cost [55].

Microsoft recommends using one set of approved compiler options and warnings. This will be adhered to, because the Suricata build files that all members use come with compiler flags for displaying all errors and warnings. Our source code will also need to be validated by the automated build check tools the Suricata developers use before it will be approved. These tools run the GCC and Clang compilers with another set of flags. There will, however, not be a list of approved developer tools such as compilers, IDEs, or git clients.
This is because we believe that developer freedom helps productivity, and we trust the competence of the group members. Potential disagreements about tool usage, or incompatibilities, can be dealt with in a more informal manner.

SDL practice #9: Deprecate Unsafe Functions

This practice suggests analyzing all the functions in the project and its APIs. The ones determined to be unsafe should be banned and replaced with safer alternatives. This will reduce the risk of potential bugs [55].

We agree with Microsoft that everyone doing programming and auditing should be mindful of deprecated functions. Functions are deprecated for a reason, but may still be available for backwards compatibility. Using unsafe functions is a common pitfall, especially in C programming. We therefore include this practice in The Value Method.

SDL practice #10: Perform Static Analysis

Static analysis of the source code helps ensure that secure coding policies are being followed, and also helps reveal bugs and logical errors [55].

It is a requirement that our code is secure and bug free. The Suricata source code should also be secure, as stated in the problem statement. Static analysis will therefore be a significant part of our project. It will help make sure that our newly written code is secure and well written so that it can be accepted into the project, fulfilling one of our goals. This practice is therefore added to The Value Method.

Phase 5: Verification phase

Similar activities exist in Touchpoints.

SDL practice #11: Perform Dynamic Analysis

This practice recommends that the team performs run-time verification of the software's functionality. This should be done with a tool that monitors application behavior for memory corruption, user privilege issues and other security issues [?].

Our code base is so small, relative to the size of Suricata, that it would be hard to track potential dynamic analysis findings back to our code.
Dynamic analysis could, however, prove valuable to Suricata in general, so it will be performed and it is added to The Value Method.

SDL practice #12: Perform Fuzz Testing

This practice suggests that the team should fuzz test their software. A fuzz test introduces malformed or random data as input to the software in order to induce program failure. This can reveal potential vulnerabilities in the software [?].

Fuzz testing is valuable when hunting for potential vulnerabilities, and its results are easier to track back to our own code. We hope to be able to perform this practice due to its valuable nature, and it will be a great addition to The Value Method.

SDL practice #13: Conduct Attack Surface Review

Reviewing the attack surface upon code completion helps ensure that the team has taken changes in the design and implementation of the program or system into account. This can reveal newly created attack vectors. The findings should then be assessed accordingly [?].

This practice could bring valuable information to our project. Whether the practice is completed or not depends on the results of Practice #6: Perform attack surface analysis/reduction. It is nevertheless added to The Value Method.

Phase 6: Release

The effect of Touchpoints is limited in this phase.

SDL practice #14: Create an Incident Response Plan

This practice is about making an Incident Response Plan. Such a plan helps address new threats that emerge over time [56].

This practice will not be performed. An incident response plan should be tailored to a specific deployed operation, and our code could be deployed anywhere due to the nature of open source projects. We also believe that a general incident response plan is hard to get right, and other work will be prioritized.

SDL practice #15: Conduct Final Security Review

This practice is about reviewing all security activities that were completed during the different phases of the SDLC.
This will ensure that the product, in this case code for Suricata, is ready for release. It includes analyzing threat models, tool outputs and performance against the quality gates and bug bars from the Requirements phase of SDL [56].

Although we do not use all the practices, an extensive security review will help ensure that the required security practices have been performed, and performed well. It will also help ensure that our code is consistent with the security requirements set. It is added to The Value Method. This practice should coincide with our conclusion phase.

SDL practice #16: Certify Release and Archive

This practice recommends certifying software prior to release, which in turn helps ensure that the security and privacy requirements were met. The practice also includes archiving all relevant data needed for post-release servicing tasks. Archiving covers many of the results of the different practices, as well as the results of completing the software [56].

This practice is covered to the full extent by the API documentation provided together with our source code, both to OISF and KDA. Our source code will be archived, as it needs to be publicly available due to the software license. The bachelor thesis itself should also serve as rather comprehensive documentation of our work. This practice is automatically completed in the process of writing this thesis, but it is added to The Value Method anyway.

Phase 7: Response

SDL practice #17: Execute Incident Response Plan

The last practice suggests that you should execute the Incident Response Plan made in Practice #14 [?].

Since we will not make an Incident Response Plan, this practice is superfluous and will not be included in The Value Method.

3.6.2 Touchpoints

Touchpoints is the second SSDP that we add to The Value Method. Many of the touch points do the same as some of the SDL practices.
Therefore it was decided not to add the corresponding ones to The Value Method, nor the non-relevant ones. The SDL practice descriptions explain where the touch points correspond with the SDL practices, and where specific points from Touchpoints are located.

Abuse cases
Abuse cases are made in the Requirements phase of the SDLC. The concept of an abuse case is to write a use case whose result is a negative one. In other words, an abuse case can describe an unwanted feature in a software, system or program [57]. Writing abuse cases makes it possible to mitigate them or to create solutions that prevent them from happening. That is the sole purpose of this practice, and it is a good way to create awareness of potential threats in the software or system, which makes it easier to discover other abuse cases later in the development process.

Risk-Based Security Testing
This is another practice from Touchpoints that we want to incorporate into The Value Method. Risk-Based Security Testing covers more aspects than we need; what we want to take with us are the aspects around abuse cases and the tests that cover them: “... abuses cases developed earlier in the life cycle should be used to enhance a test plan with adversarial tests based on plausible abuse scenarios.” [58] Bringing this point with us allows us to test our code based on the assessments from the abuse cases.

Background & Methodology Summary
The background and methodology chapter is largely based on the work we did during the research phase. We knew before going into this project that we would need to spend a lot of time looking into both technical and organizational subjects. We also had to look into the main software project we were going to interact with, Suricata. The underlying software development philosophy of Suricata, free open source software development, is extensively documented.
Suricata’s code base consists of more than four hundred thousand lines of code. The documentation of Suricata varies in quantity and is of limited quality, especially regarding design and high-level documentation. A brief look at the code revealed that code comments were sparsely used. The development phase of our bachelor thesis will result in an addition to the Lua related functionality of Suricata. However, the Lua functionality represents only a minor part of the complete Suricata code base. Because of the size of the code base, it would be too time-consuming to familiarize ourselves with all of it. We decided to focus on the program files we thought we would be most likely to interface with. Initially, a couple of core files were chosen for review and research because they were related to extending Lua functionality in Suricata. Afterwards we branched out to other files as we gained more knowledge about their functionality, how they were related, and whether the code was relevant to the thesis.

Given the advanced level of the program code, combined with the lacking documentation, the group deemed it unwise to start the development phase of the project right away. We lacked knowledge of how the Suricata code base was structured, which made it difficult to assess where to add our code. Also, at the time, we had insufficient understanding of which code was relevant to our addition and which parts of the code we had to interface with to produce well-integrated code of high quality.

Suricata is written in C, and none of the group members had any previous experience with writing or auditing C code for production. The C code used in Suricata at times performs low-level operations and uses coding practices that we had not learned prior to this project.
We had to advance our understanding of C programming because of our limited knowledge of advanced C and the complexity of the code base. It might have been possible to learn "on the fly" as the auditing and development were performed, but it was decided that this would likely either result in unacceptable quality in our early work or require us to redo it, causing a significant loss of time. This was another motivation for doing extensive research on the C code used in Suricata.

Our final technical research subject was the scripting language Lua. Researching Lua came naturally, as the goal of our development was to integrate Lua scripting functionality into Suricata. Thus we had to look into what Lua is, how it works in relation to C code in general, and more specifically how it works in relation to Suricata. We would also most likely need to know how to write our own Lua scripts for testing. In addition to working with Lua and Suricata, we also knew that we needed to work with e-mail traffic. Thus we needed to look into the underlying Internet Engineering Task Force standards SMTP and MIME.

In addition to technical subjects, we also researched different methodologies for development, audit and project management. Getting a proper understanding of the different methodologies and choosing the one most suitable for our project is essential, and we dedicated a suitable amount of time to making an educated choice.

We researched three different possible SSDPs that we could add to our project. These were Microsoft SDL, CLASP and Touchpoints. CLASP was too extensive for our project and its strengths covered areas of little concern, like architectural design. SDL and Touchpoints were deemed the most suitable for the project, but neither was a complete fit. We decided that the best option for this project would be to select the most valuable touch points and SDL activities and define our own methodology. This resulted in The Value Method.
Kanban was chosen as the overall management methodology. The choice was partly made because we believed the group would not fit well with a strict methodology, and Kanban allowed us to custom fit the methodology to our needs.

Another methodology aspect was the organization of the project phases. There were three proposed sequences the phases could appear in. The one that allowed all group members to get experience in every phase was selected.

We also had to define the methodology that specifically addressed the review phase. We did not find any methodology that said how to practically perform static and manual analysis, so we defined our own approach for both. All these findings helped create the final high level schedule.

4 Development
This chapter will familiarize the reader with the development phase of our project and discuss how the extension we made has been specified, implemented, and tested. The chapter also includes the API documentation of our product and a short summary discussing the methodology used, developing software with a focus on security first, and implementing our own SSDP in the development phase.

4.1 Requirements
This part of the development process has inherited the functional requirements, goals, scope, and limitations from the project plan. The requirements specification for this functionality extension of Suricata is largely based on the thesis description and communication with the employer.

Functional Requirements:
• The system extension will facilitate the extraction of SMTP data between Suricata and Lua.
• The extension will not decrease the general performance of Suricata.
• The extension will allow users to extract the sender address and recipient address of any given mail parsed by Suricata.
• The extension will allow users to extract all fields related to the MIME protocol in any given mail parsed by Suricata.
• The extension will not break the portability of Suricata [7].

Non-functional Requirements:
• It should be intuitive from the function name what the purpose of the function is.
• The extended functionality should be usable by people with only basic knowledge of and training in Lua.

Use Case
This document will contain a list of use cases described at both a high and a low level of detail. The low-level use cases will optimally contain the following properties:

Name: A fitting 2-5 word name of the use case. What part of the system does this use case belong to?
Purpose: What is the purpose of the use case?
Actors: The different entities involved in the use case.
Description: A fitting description of the use case and how it will work.
Preconditions: What conditions need to be fulfilled before the use case/scenario is set in motion?
Postconditions: Conditions that must be met for a valid end of the use case.
Trigger: Chain of events that initiates this use case.
Alternative outcome: If variations of a "happy day scenario" occur, what is then the outcome or behavior of the program?

Table 7: Template high level use case

For a high level use case, brief input to the above "form" will suffice for the Name, Actors, Purpose, and Description properties. Low level use cases will require a more detailed and in-depth description and input to all the attributes above. A sequence diagram will also be needed with the low level use cases, visualizing the preferred chain of events.

4.1.1 Use Case Diagram

Figure 10: Use case diagram of extension to Suricata

Name: SMTP support for Lua in Suricata
Actors: Suricata, Lua.
Purpose: The system administrator will be able to write Lua scripts and create rules for Suricata, matching on SMTP.
Description: Make Suricata support Lua scripts to define rules matching on SMTP fields and attributes. The code will essentially become middleware allowing Lua scripts to use data from SMTP.
The main SMTP fields the customer needs to match on are: mail_from, mail_to, subject, and attachment.

Table 8: High level use case 1

Name: Extract field: subject from SMTP packet.
Actors: Lua, Suricata, User.
Purpose: Allow a Lua script to access the Subject field of an SMTP packet.
Description: This function will:
• Check that the Lua state is accessible/existing. If not, write error output.
• Extract the transaction from the Lua state.
• Check that the transaction is not null.
• Create a temporary variable sufficient to hold any string residing in SMTPTransaction->subject.
• Extract the subject field from the transaction into the temporary variable.
• Check that the temporary variable is filled.
• Push the temporary variable to the Lua state and return 1 (the number of values pushed).
If any check fails, the function returns an error.

Table 9: Low level use case 1

Name: Extract list of all MIME fields in current SMTP transaction.
Actors: Lua, Suricata, user.
Purpose: Allow developers to gain knowledge of which MIME fields the current SMTPTransaction contains.
Description: The functionality will allow developers to gain knowledge of which MIME fields the current SMTPTransaction contains. It will write to the console a list of the names of all MIME fields found in the data structure of the SMTPTransaction currently being parsed by Suricata. A further extension could log the names of all MIME fields found and the transaction ID of where they were found. This list will be extracted from flow->smtpState->smtpTransaction->mimedecEntity->listof(mimedecField).
Preconditions: Suricata must be installed with LuaJIT support enabled. The function must be called from a Lua script. A Suricata rule must refer to the Lua script in order for the function to be reachable. An SMTP transaction needs to be parsed by Suricata to trigger the alert function.
Postconditions: The use case must write all currently existing MIME fields to the console.
Trigger: The use case is triggered when the preconditions are met and Suricata generates an alert based on the rule referencing the Lua script that includes the initial function call.
Table 10: Low level use case 2

Name: Specified MIME field extraction.
Actors: Lua, Suricata, user
Purpose: Allow the user to extract the information/value of a specified MIME field residing in a transaction.
Description: This functionality will take the name of a MIME field as input and return the value of the specified MIME field to the user. It will enter the transaction currently being parsed by Suricata and use existing Suricata support functions to find and extract the correct MIME field. As this is the only functionality in our development that will receive variable input in the form of a string from the user, this input will need to be controlled and validated by an ’input validation manager’, as it may raise security concerns.
Preconditions: Suricata must be installed with LuaJIT support enabled. The function must be called from a Lua script. A Suricata rule must refer to the Lua script in order for the function to be reachable. An SMTP transaction will need to be parsed by Suricata to trigger the alert function. A string (the name of the MIME field) needs to be passed as an argument to the function call in the Lua script.
Postconditions: The functionality will return the value of the discovered MIME field. If nothing is found, the function returns an error string to the Lua state stack.
Trigger: The use case is triggered when the preconditions are met and Suricata generates an alert based on the rule referencing the Lua script that includes the initial function call with the variable string attached as a function argument.
Sequence Diagram:

Table 11: Low level use case 3

Name: Extract MailFrom field from SMTPTransaction.
Actors: Lua, Suricata, user
Purpose: As the MailFrom attribute of an SMTP packet is duplicated to exist both in the fundamental SMTPTransaction struct and as a mimeDecField in the list of MIME fields connected to an SMTPTransaction, this functionality will extract the attribute from the SMTPTransaction. (This is added because the MailFrom attribute is, among other attributes of interest, often a point of interest in discovering security breaches or threats.)
Description: This functionality will extract the MailFrom attribute pointed to by the current SMTPTransaction being parsed. This will cause the string value pointed to to be pushed to the Lua state stack. The function will then, on the Lua side, output the MailFrom attribute value.
Preconditions: Suricata must be installed with LuaJIT support enabled. The function must be called from a Lua script. A Suricata rule must refer to the Lua script in order for the function to be reachable. An SMTP transaction will need to be parsed by Suricata to trigger the alert function.
Postconditions: For this use case to be successful, the function should output the value pointed to by the MailFrom attribute pointer in the currently parsed transaction. If the MailFrom attribute is not found or has not yet been assigned its value, the function will return an error message to the Lua state stack.
Trigger: The use case is triggered when the preconditions are met and Suricata generates an alert based on the rule referencing the Lua script that includes the initial function call.

Table 12: Low level use case 4

Name: Extract list of rcpt from SMTPTransaction.
Actors: Lua, Suricata, user.
Purpose: As the list of rcpt (recipients) is duplicated (same as MailFrom) as a list of strings in the fundamental transaction struct and as a MIME field residing inside the list of mimeDecEntities, we will write a function to extract the rcpt-list.
This is so that the user does not have to write rule sets which depend on the implementation of MIME, but also because the rcpt list is often the second parameter communicated to the SMTP server, thereby being the addresses the email is actually sent to.
Description: Rcpt-list extraction will work by navigating down through flow and smtpState to the SMTPTransaction. The list will then have to be iterated, pushing each value to the Lua state stack. This should preferably be done by pushing the values as a table. On the Lua side, the result will have to be placed in a local table variable defined as a = {}
Preconditions: Suricata must be installed with LuaJIT support enabled. The function must be called from a Lua script. A Suricata rule must refer to the Lua script in order for the function to be reachable. An SMTP transaction will need to be parsed by Suricata to trigger the alert function. The proper reception of the table in the Lua script will also have to be defined.
Postconditions: For this use case to be successful, the function should deliver a table of strings referring to the addresses of the recipients.
Trigger: The use case triggers when the preconditions are met and Suricata generates an alert based on the rule referencing the Lua script that includes the initial function call.
Sequence Diagram:

Table 13: Low level use case 5

4.1.2 Value #2: Security and Privacy Requirements
• Our developed software will not leave traces of processed or unprocessed data.
• Our software will not influence the integrity of the data being parsed.
• The software contribution should not introduce new security vulnerabilities to the Suricata project.
• All potential compiler warnings generated by our code will be documented and given dedicated attention and time for fixing.
• All code we produce will be manually reviewed for security issues.

4.1.3 Value #3: Security and Privacy Risk Assessment
We trust data received internally from Suricata. The most vulnerable area of our code base is the data that our code might receive from the Lua scripts.
Where input is allowed through parameters to the functions we write, input validation is needed on those parameters.

4.1.4 Value #4: Attack Surface Analysis
All code written by us that interacts with data received from both Suricata and Lua scripts is recognized as attack surface. We discussed reducing our attack surface by never allowing user generated input to reach the Suricata platform. Through the system specifications we found that the best solution for extracting specific MIME fields is to match the MIME field name against user generated input, which generalizes the functionality. This, however, would contradict our attack surface reduction method. The alternative that does not allow user input through our code would be to create one extraction function per MIME field. This breaks with the principle of modularity and generalized functions. It would also make it impossible to allow users to extract all existing MIME fields, as we do not possess a complete list of all MIME fields that can reside within an email transaction. Our solution to this problem is to allow user generated input through only one function, but to ensure that the input is properly validated before entering the Suricata data structure, and to devote special attention to this area. Only the documented external interface needs to be visible outside the source.

4.2 Design

4.2.1 Principles and Standards
The basic security design principles to be implemented have to be carefully followed and evaluated before, during, and after the development of this functionality suite, as do more general design principles of modern software development. This includes the security design principles from our defined Value Method and other more general design principles for procedural programming.
Our implementation phase and the code we write will be heavily influenced by the coding standards and style of the Suricata development team [59]. These will also dictate some of the design features of our code, both aesthetically and to some extent also functionally. This includes the use of static function declarations whenever possible, specific indentation and syntax for control statements, and commenting practices that cause code documentation to be added to Doxygen. Also, OISF has published a list of banned library functions that have been deemed unsafe or OS specific and are thereby not allowed in the development.

4.2.2 System Architecture
The extension our group is building for Suricata will reside inside the Suricata code base and will not be viewed as an external program. Our code contribution to Suricata will be an API extension between Suricata and Lua to allow access to data from Suricata. Modularity in code is desirable for our extension in such a large code base, as the development is done in a procedural programming language. As much existing functionality as possible will be reused. In order to get our developed code accepted into the Suricata main development and release repository, there is a defined manual and automatic quality assurance (QA) process that the submitted code needs to pass. Most of the automatic QA is handled by a continuous integration testing and build platform named Travis CI. This is triggered by a pull request being submitted to the GitHub repository; the process is explained in its entirety in the delivery phase. Only the documented external interface needs to be visible outside the source.

4.2.3 Value #5 Threat Modeling and Abuse Cases
STRIDE is a threat model developed and used by Microsoft [60]. When threat modeling, it can be quite useful to ask questions like “Can the identifier I check in this code get spoofed?” Asking all the right questions can be hard or impossible; this is where the STRIDE model helps.
It is an acronym standing for Spoofing, Tampering with data, Repudiation, Information disclosure, Denial of service, and Elevation of privilege. This checklist can then be used so that the threat modeler at least covers the basic threat categories provided by the STRIDE model. Using the ’I’ in STRIDE as a basis can help form a question like “Can the potentially sensitive information processed by my code be accessed from somewhere else in the system?” STRIDE is recommended by the Microsoft SDL for threat modeling, and it will be used as a part of our SSDP for threat modeling.

STRIDE

Spoofing identity: There is no identity management associated with Suricata.

Tampering: Our code should not make changes to any data it retrieves from Suricata. We need to make sure that data integrity is not affected, as this is a real threat. Integrity issues should be impossible without a fault in our code. Thus, data sent internally in Suricata using the internal Lua stack can only be tampered with if someone can write to the program memory. This is not an attack vector that can be easily mitigated, and it has a low probability.

Repudiation: Our code is mostly library functions for the Lua scripts. Logging function calls is not required or standard for Suricata, and there is no identity management to link function calls to.

Information disclosure: Data is only sent internally in Suricata using the internal Lua stack. Data can only be disclosed if someone can read from the program memory. This is not an attack vector that can be easily mitigated, and it has a low probability.

Denial of service: Data is only sent to Lua using the internal Lua stack. The only way to prevent data from being transferred is to stop the Suricata process. This is not an attack vector that can be easily mitigated, and it has a low risk.
We supply a number of functions to the Lua script. These functions, or Suricata itself, might halt if a malicious Lua script abuses the function calls. Such a script could, for example, call an expensive function in an endless loop. This is a real threat, but it is hard to mitigate because we do not control how Lua scripts call our code. Another potential threat is that we may have to work on the Flow structure provided in Suricata, which contains resources that can be accessed from multiple threads. This might cause a deadlock if we do not use the available mutexes correctly.

Elevation of privilege: Our code should not have any means to cause privilege escalation. There is no identity management associated with our code either.

Abuse Cases

1. Input mismatch
With this release of our code, the only function written that accepts user input is SMTPGetMimeField(String name), and as the declaration states, it receives a string as its only argument. This string argument is used to match and find the expected MIME field located in the SMTP traffic. For this function to work, the argument has to match the actual MIME field name; if it does not, the function will not return any data to the Lua script. As the user input is parsed by our code inside Suricata, this was quickly recognized as a possible point of entry for attackers. If a potential attacker had access to the Lua scripts that rules in Suricata refer to, the attacker would also have an entry point for input to the running Suricata instance. If the attacker passes unintended input to the function, the only response our function will return is that it could not find any MIME field matching the input. This mismatch could be anything from a typo in the argument to a fuzzed string with adaptive length handed to the function.

2. Input where it should not be
With the same prerequisite as the previous abuse case, the attacker needs to have access to the Lua script referred to by a Suricata rule.
The attacker could then deliver arguments to a function that is not designed to handle arguments. If the arguments are formatted according to Lua standards with correct syntax, the function will not accept the unexpected arguments; it ignores them and runs as intended.

3. Input not existing
If an attacker intentionally calls SMTPGetMimeField with no argument, Suricata loops through the existing MIME fields trying to match with NULL, runs past the last item of the list, finds that the entire field points to NULL, and then tries to return NULL, thereby causing a segmentation fault in Suricata. This needs to be fixed by checking whether the input our function gets from the call is NULL. This abuse case is an example of how abuse cases help discover existing bugs and flaws in the software. When one is found, the group needs to identify whether there is a need for mitigation and, if so, mitigate the bug.
As of April 29th, a fix for this issue has been implemented.

4. Malformed SMTP data
If an attacker sends malformed SMTP data to intentionally crash or critically injure the running version of Suricata, the point of entry will be the data parser engine of Suricata. This data parser engine will then decide how Suricata should respond. If it feeds malformed data into its data structure, for example MIME-field->value = " ", our code will do no more than hand this information to the user.

4.3 Implementation
The coding style provided by OISF and Suricata will be followed to the best of our ability, as stated in the design section of the development phase [59]. It covers formatting, flow, functions, variables, comments, filenames, control statements, gotos, unit tests and banned functions.

Value #6 Deprecate Unsafe Functions
The Value Method includes deprecating unsafe functions as an obligatory practice to implement.
Deprecating unsafe functions has to some extent already been done through the Suricata project contribution documentation. As the research phase unveiled many common C coding practices in Suricata, it also highlighted which safe functions to use in order to reach specific goals and solve problems programmatically without compromising the security of the system.

Tools and Structure
The tools used for implementing the code are for each individual developer to decide, as mentioned in approved tools, and may be anything from a lightweight editor like Notepad, Vim, Vi, Ed, Emacs, and Nano to full-fledged enterprise IDEs like Eclipse, Xcode, and JetBrains' CLion. Three of the four group members took part in the implementation, with tasks ranging from feature implementation, testing, bug fixing, and documentation to refactoring and auditing. As the development phase has been an iterative process, the requirements, design, implementation, and to varying extents the testing phase have been part of almost every iteration.

4.3.1 Data Structure Architecture
Before we started the implementation of our code, we needed to familiarize ourselves with the data structure of Suricata. This was to a large extent done in the research phase of this project, but it needed constant iteration throughout the implementation. As a result of the research phase, we had the following perception of Suricata's SMTP data structure, visualized in Figure 11.

Figure 11: SMTP related structs in Suricata

The application layer data is found in a Flow; this is the first struct of the SMTP layer. Within the flow there is an SMTPState which contains all SMTP data parsed by Suricata. To reach a specific mail transaction, it is necessary to extract the SMTPTransaction currently being parsed by the state. Within that transaction lies the data specific to each mail and pointers to the lists of MIME fields connected to that mail transaction.
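The traversal described above can be illustrated with a minimal sketch in plain C. The types and functions below (DemoFlow, DemoSMTPState, DemoTransaction, DemoMimeField, DemoGetCurrentTx, DemoCountMimeFields) are hypothetical, heavily simplified stand-ins and are not the Suricata definitions; the sketch only assumes the nesting from Figure 11 and shows how each level can be checked for NULL before stepping down to the next one.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the structs in Figure 11.
 * The real Suricata definitions contain many more members. */
typedef struct DemoMimeField_ {
    const char *name;
    const char *value;
    struct DemoMimeField_ *next;   /* next MIME field of the same mail */
} DemoMimeField;

typedef struct DemoTransaction_ {
    const char *mail_from;         /* data specific to one mail */
    DemoMimeField *field_list;     /* MIME fields connected to that mail */
} DemoTransaction;

typedef struct DemoSMTPState_ {
    DemoTransaction *curr_tx;      /* transaction currently being parsed */
} DemoSMTPState;

typedef struct DemoFlow_ {
    void *app_state;               /* application layer state, here SMTP */
} DemoFlow;

/* Step from the flow to the state and on to the current transaction.
 * Returns NULL as soon as one level is missing. */
DemoTransaction *DemoGetCurrentTx(const DemoFlow *flow)
{
    if (flow == NULL || flow->app_state == NULL)
        return NULL;
    DemoSMTPState *state = (DemoSMTPState *)flow->app_state;
    return state->curr_tx;         /* may itself be NULL */
}

/* Count the MIME fields of the current transaction, or return -1
 * when no transaction can be reached from the flow. */
int DemoCountMimeFields(const DemoFlow *flow)
{
    DemoTransaction *tx = DemoGetCurrentTx(flow);
    if (tx == NULL)
        return -1;
    int count = 0;
    for (DemoMimeField *f = tx->field_list; f != NULL; f = f->next)
        count++;
    return count;
}
```

Keeping the NULL checks in one helper means every function that needs the transaction handles a missing level in the same way.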
4.3.2 Suricata and SMTP
SMTP is one of the protocols that Suricata has support for parsing. This means that SMTP packets are recognized by Suricata as such and relevant data is extracted. Having SMTP and MIME data made easily available in Suricata is essential for our bachelor thesis. The already existing SMTP parser makes the job of extracting specific SMTP and MIME fields much easier.

SMTP is also supported on a superficial level for normal Suricata rules. One can match on "smtp" instead of a port number while writing rules, for good readability and rule management. The "smtp" keyword in rules will however not detect traffic on ports other than port 25.

Our Lua scripting project is not the only part of Suricata that benefits from the available SMTP resources. Suricata has options that can easily be toggled in the configuration files for automated logging of specific SMTP and MIME data.

SMTP data in Suricata is separated into two logical structures. The first is SMTPState and the second is SMTPTransaction. SMTPState contains the state of the SMTP parser and is persistent throughout the parsing session. This structure and the related functions are responsible for parsing the data flow and extracting the relevant SMTP data. When the SMTP parser detects a new SMTP transaction, it will create a new SMTPTransaction struct for the SMTP data exclusive to that transaction. That could be, for example, the "mail from" message in the SMTP standard. This means that the data for each SMTP transaction is stored in its respective SMTPTransaction structure. These are in turn stored as a linked list in the SMTPState structure.

All text data from the SMTP transaction that only shows up once, like "mail from", is stored as pointers to allocated memory holding the text. Data fields that can show up multiple times are however mostly stored in linked lists. If the data has its own defined structure, then it will most likely have a list of those structures.
An example of this is the files_ts pointer in SMTPState that points to a list of FileContainer structures. They contain all the files sent to the SMTP server in the parsed traffic. If the data field does not have a dedicated structure, a general structure like SMTPString can be linked together. This is the case for the rcpt_to_list, where each SMTPString structure in the list contains a pointer to an allocated space with the name of one recipient and the length of the name.

The hierarchy for these structures is LuaState->Flow->SMTPState->SMTPTransaction. The structures not yet introduced are LuaState and Flow. LuaState is our C view into the Lua parser, and it contains the data that Suricata supplied it with when it called the Lua script that called our C functions. LuaState contains the Flow structure; this is a general structure for a network flow, and it contains the traffic that we are doing content matching on.

This is what SMTPState and SMTPTransaction look like, only including the variables we use.

    typedef struct SMTPState_ {
        SMTPTransaction *curr_tx;
        FileContainer *files_ts;
    } SMTPState;

    typedef struct SMTPTransaction_ {
        MimeDecEntity *msg_tail;
        uint8_t *mail_from;
        uint16_t mail_from_len;
        TAILQ_HEAD(, SMTPString_) rcpt_to_list;  /**< rcpt to string list */
    } SMTPTransaction;

Figure 12: Excerpt of SMTPState and SMTPTransaction

4.3.3 Suricata and MIME
Suricata extracts MIME information similarly to SMTP data. The parser state information is separated out into one structure, MimeDecParseState in SMTPTransaction. Transaction specific data is stored in a linked list of MimeDecEntity structures. This includes for example the actual MIME fields and the filename of attachments.
The head and tail of this list are also located in SMTPTransaction, which makes for a convenient mapping between an SMTP transaction and the MIME transaction belonging to it.

The actual MIME header fields are accessible from the MIME entity’s "field_list". It contains all the MIME fields that were set in the SMTP transaction and their values. This includes both the ones in RFC 4021 and the custom ones [61]. Finally, the structures in "field_list" that contain the data are called MimeDecField, and they hold a pointer to the field name and its length, and a pointer to the value and its length.

The hierarchy for the MIME fields is LuaState->Flow->SMTPState->SMTPTransaction->MimeDecEntity->MimeDecField. We can however avoid accessing the MimeDecField linked list directly and looping through it, because we use the available MimeDecFindField function. This is what MimeDecField looks like, only including the variables we use.

    typedef struct MimeDecField {
        uint8_t *name;       /**< Name of the header field */
        uint32_t name_len;   /**< Length of the name */
        uint32_t value_len;  /**< Length of the value */
        uint8_t *value;      /**< Value of the header field */
    } MimeDecField;

Figure 13: Excerpt of MimeDecField

4.3.4 Refactoring

Refactoring is the act of modifying source code without changing the feature it delivers [62]. The goal is to make the source code easier to comprehend, extend and maintain. This is usually achieved by a few simple activities like moving code into smaller functions, using better names for variables, and moving code to a more suitable location.

Figure 15 is an excerpt of the SMTPGetrcptList function. The function repeats a lot of functionality in both branches of the if-else statement, and this can be extracted into a smaller function that provides the same functionality. The result of this can be seen in Figure 14.
    if (lock_hint == LUA_FLOW_NOT_LOCKED_BY_PARENT) {
        /* lock flow */
        FLOWLOCK_RDLOCK(flow);
        GetrcptList(luastate, flow);
        FLOWLOCK_UNLOCK(flow);
    } else {
        GetrcptList(luastate, flow);
    }
    return 1;

Figure 14: After refactoring

    if (lock_hint == LUA_FLOW_NOT_LOCKED_BY_PARENT) {
        /* lock flow */
        FLOWLOCK_RDLOCK(flow);
        SMTPState *state = (SMTPState *) FlowGetAppState(flow);
        if (state == NULL) {
            FLOWLOCK_UNLOCK(flow);
            return LuaCallBackError(luastate, "Internal Error: state not found");
        }
        /* Removed some code with more checks and unlocks */
        TAILQ_FOREACH(rcpt, &smtp_tx->rcpt_to_list, next) {
            /* Foreach content */
        }
        FLOWLOCK_UNLOCK(flow);
        return 1;
    } else {
        SMTPState *state = (SMTPState *) FlowGetAppState(flow);
        if (state == NULL) {
            return LuaCallBackError(luastate, "Internal Error: state not found");
        }
        /* Removed some code */
        TAILQ_FOREACH(rcpt, &smtp_tx->rcpt_to_list, next) {
            /* Foreach content */
        }
        return 1;
    }

Figure 15: Before refactoring

4.4 Testing

To accomplish unit testing with good code coverage we had to discuss how to design and execute the tests. Our code works as an API towards Lua that makes SMTP data available. Pcap files containing raw SMTP test traffic were therefore produced, together with a rule set that triggers the Lua test script containing the tests. This Lua script contained all API calls needed to test the different functions, and was designed to log and print both successful output and error messages in case anything went wrong.
The script also served as the regression test suite: it kept tests for all functionality as it was produced, and it ran through all previous tests whenever new functionality was introduced. An excerpt of this Lua script is shown in Figure 16.

We also discussed researching and using different tool sets for creating automated unit and regression tests, but decided that this would extend the research even further, leaving less time for development and most likely consuming time from the finalization of the project report. As we had already figured out one way of implementing the tests, this seemed unwise and unnecessary.

4.4.1 Value #8 Perform Dynamic Analysis

During the regression runs, following tips from the Suricata development community, we downloaded and ran a tool named Valgrind on our test build of Suricata. Valgrind is a framework for building dynamic analysis tools, and it includes a suite of tools that automatically detect memory management and threading bugs [63]. This did not reveal any bugs in this category, but it proved helpful both by assuring us that our code did not introduce such bugs and as a way to exclude possible causes of other bugs and flaws.

    function match(args)
        file = io.open("./logfile.dat", "a")
        io.output(file)
        io.write(os.date())

        -- Test SMTPGetAttachmentInfo
        -- prints filename: 'name'
        --        md5: 'md5'
        local aa = SMTPGetAttachmentInfo()
        for i, v in pairs(aa) do
            for k, b in pairs(v) do
                print(k, b)
                io.write(k)
                io.write(b)
            end
        end

        -- Test SMTPGetMailFrom
        -- prints attribute MailFrom
        local bb = SMTPGetMailFrom()
        print(bb)
        io.write(bb)

        -- Test SMTPGetMimeField
        -- prints content/value of the specified MIME field if found
        local cc = SMTPGetMimeField("date")
        print(cc)
        io.write(cc)

        -- Test SMTPGetRcptList
        -- prints table entries.
        local dd = SMTPGetRcptList()
        for d, e in pairs(dd) do
            print(d, e)
            io.write(d)
            io.write(e)
        end

        io.write("Test finished, new entry = new test")
        io.close(file)
    end

Figure 16: Excerpt from regression test file

4.4.2 Value #9 Perform Fuzz Testing

We did not devote much of our time to this testing practice, as fuzz testing is a part of the quality assurance process. We trust the lead developers of Suricata to have enough experience with this testing practice that we could focus on other areas of testing.

4.4.3 Value #10 Conduct Attack Surface Review

The development phase did not really deviate from the conclusions made during the 4th practice of The Value Method. Because of this, the attack surface review was conducted in a formal manner.

4.4.4 Value #11 Risk-Based Security Testing

Building the abuse cases paid off, as it led to the discovery of bugs of substantial criticality in our extension. The risk-based testing based on the defined abuse cases was what revealed these bugs. An example of this is the third abuse case, which defined a case where a possible attacker calls our function SMTPGetMimeField in a script without passing an argument. Our specially built tests uncovered that this scenario would cause Suricata to crash with a segmentation fault. After supplying a fix for the bug in question, we re-ran the tests to be sure that the bug was resolved. This practice was considered useful in that we uncovered security issues with the code we had written.

“QA is about making sure good things happen.
Security testing is about making sure bad things don’t happen.” [64]

4.4.5 Performance Testing

The Suricata software has to operate effectively in a myriad of different environments. These can range from a small single user setup to enormous 10 gigabit business network environments. We therefore need our code to perform quickly, efficiently, and with high reliability, and we decided to run a performance test to evaluate our code against the general Suricata performance. Our goal is that we should in no way limit the Suricata performance.

A small test environment was created to make sure our code is production ready. The goal was to create emails, send them to a dummy SMTP server locally, and capture the traffic with Wireshark. The emails were created using the mutt email client, which allows us to send emails using the CLI [65]. Mutt was used as a front end for the msmtp SMTP client, which in turn sent the emails to a FakeSMTP server [66] [67]. This setup allowed us to automate the process of sending the emails, enabling the generation of pcap files that can simulate high volume environments.

Three different pcap files were run for simulation during this test. These contain 100, 1000 and 10000 SMTP transactions. The layout of these transactions is located in appendix H.1. Each transaction consists of one file attachment, two recipients, a subject and a message body. We run pcaps with 100, 1000 and 10000 transactions because we need to infer how much the startup time of Suricata affects the results. The time used running the pcap files should not increase by a factor of 10 with each pcap, as the one with 100 transactions should be more affected by startup time.

The tests and the test file generation were performed on a Lenovo Thinkpad T440s. GPU acceleration in Suricata was not enabled, but multi threading was. The Suricata version is Suricata build version 3.0dev and the CPU is an Intel(R) Core(TM) i7-4600U CPU running at 2.10GHz.
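The startup-time argument can be made concrete with a simple linear cost model, in which the total run time is a fixed startup cost plus a constant cost per transaction. The sketch below is only an illustration and was not part of the test setup: the names RunModel, FitRunModel, PredictRunTime and StartupShare are hypothetical, and the assumption that run time grows linearly with the number of transactions is a simplification.

```c
#include <stddef.h>

/* Hypothetical linear cost model: total = startup_s + per_tx_s * n. */
typedef struct RunModel_ {
    double startup_s; /* fixed cost of starting and stopping the engine */
    double per_tx_s;  /* marginal cost of one transaction */
} RunModel;

/* Fit the model through two (transactions, seconds) measurements.
 * Returns 0 on success, -1 if the two runs have the same size. */
int FitRunModel(unsigned n1, double t1, unsigned n2, double t2, RunModel *out)
{
    if (out == NULL || n1 == n2)
        return -1;
    out->per_tx_s = (t2 - t1) / ((double)n2 - (double)n1);
    out->startup_s = t1 - out->per_tx_s * (double)n1;
    return 0;
}

/* Predicted wall-clock time for a pcap with n transactions. */
double PredictRunTime(const RunModel *m, unsigned n)
{
    return m->startup_s + m->per_tx_s * (double)n;
}

/* Fraction of the predicted run time that is spent on startup. */
double StartupShare(const RunModel *m, unsigned n)
{
    double total = PredictRunTime(m, n);
    return total > 0.0 ? m->startup_s / total : 0.0;
}
```

With made-up measurements of 0.05 s for 100 transactions and 0.14 s for 1000 transactions, the fit gives a startup cost of about 0.04 s. Startup would then account for roughly 80 % of the smallest run but only about 4 % of a run with 10000 transactions, which is why a tenfold increase in transactions does not give a tenfold increase in run time.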
It is stated both in our development requirements and in our quantitative project goals that we do not want our code to have a negative effect on the Suricata performance in any way. That is the main motivation for this performance test.

Test results

The real time usage for 100, 1000 and 10000 transactions is measured 40 times each. To get rid of outliers, only results occurring more than once are shown, and up to five results are shown per test. No output is printed to the terminal. This test does not take into account the time it takes to print the data in the terminal or to loop through the retrieved data structures on the Lua side. All functions are called as such: print(function()), and the needs field is set to "packet".

Legend: each entry reads "N x TIME", meaning that N of the runs took TIME, formatted as MIN:SEC.HSEC (minutes, seconds, hundredths of a second).

Baseline score. No smtp rule.
    100 transactions:   29 x 0:00.05, 6 x 0:00.06
    1000 transactions:  13 x 0:00.17, 10 x 0:00.16, 6 x 0:00.15, 5 x 0:00.18, 2 x 0:00.14
    10000 transactions: 4 x 0:01.31, 4 x 0:01.30, 3 x 0:01.32, 3 x 0:01.14, 2 x 0:01.47

Baseline score. No smtp functions enabled, empty Lua script.
    100 transactions:   34 x 0:00.05
    1000 transactions:  20 x 0:00.11, 13 x 0:00.12, 6 x 0:00.10
    10000 transactions: 12 x 0:00.71, 12 x 0:00.70, 7 x 0:00.72, 3 x 0:00.69, 2 x 0:00.73

Score with printing SMTPGetMailFrom enabled.
    100 transactions:   37 x 0:00.05
    1000 transactions:  18 x 0:00.12, 10 x 0:00.11, 9 x 0:00.13
    10000 transactions: 13 x 0:00.83, 7 x 0:00.82, 6 x 0:00.84, 2 x 0:00.99, 2 x 0:00.86

Score with printing SMTPGetMimeField("date") enabled.
    100 transactions:   33 x 0:00.05, 3 x 0:00.06
    1000 transactions:  19 x 0:00.13, 18 x 0:00.12
    10000 transactions: 7 x 0:00.88, 6 x 0:00.89, 6 x 0:00.87, 4 x 0:01.05, 3 x 0:00.90

Score with printing SMTPGetrcptList enabled.
    100 transactions:   35 x 0:00.05, 2 x 0:00.07
    1000 transactions:  18 x 0:00.12, 13 x 0:00.13, 2 x 0:00.14
    10000 transactions: 8 x 0:00.89, 8 x 0:00.88, 5 x 0:00.87, 4 x 0:00.90, 2 x 0:01.07, 2 x 0:01.06

Score with printing SMTPGetMimeList enabled.
    100 transactions:   35 x 0:00.05, 3 x 0:00.06
    1000 transactions:  18 x 0:00.12, 15 x 0:00.13, 2 x 0:00.14
    10000 transactions: 10 x 0:00.90, 5 x 0:00.89, 5 x 0:00.88, 3 x 0:00.91, 3 x 0:00.87

Score with printing SMTPGetAttachmentFilename enabled.
    100 transactions:   34 x 0:00.05
    1000 transactions:  19 x 0:00.12, 16 x 0:00.13, 3 x 0:00.14
    10000 transactions: 7 x 0:00.89, 6 x 0:00.91, 4 x 0:01.09, 4 x 0:00.92, 4 x 0:00.90

Score with printing all the above functions.
    100 transactions:   30 x 0:00.05, 4 x 0:00.06
    1000 transactions:  31 x 0:00.16, 5 x 0:00.17
    10000 transactions: 5 x 0:01.30, 4 x 0:01.57, 3 x 0:01.56, 3 x 0:01.32, 3 x 0:01.31

Result Conclusions

There are some conclusions we can draw from this test. The most important one is that our code clearly has no significant impact on the performance level of Suricata. This test was done using pcap files exclusively containing SMTP and MIME packets, which is not comparable to a production environment. There is a small deterioration in performance once all functions are enabled, but the processing speed and packet density make us conclude that the practical effect should be negligible.

The main outlier in the data is the baseline score with no Suricata rules enabled. These tests probably took more time because Suricata treats having no rules as an error, which may invoke some non-optimized code.

The rough doubling in running time between 100 and 1000 transactions is a clear indicator of how the time spent starting and stopping Suricata affects the results. A large percentage of the 0.05 seconds used to parse 100 transactions probably stems from that.

4.4.6 Automatic Builds for Testing

Achieving a consistent build of Suricata requires an automated build process [68]. The benefit of this is that every test is run under the same conditions. Since every group member has experience using Skyhigh, the OpenStack based cloud solution run on campus, we decided to use that as our infrastructure.
Having an automatic process to build Suricata will save time and give us some other benefits over building it manually. The biggest advantage is a consistent environment each time Suricata is built, as this allows for more focus on the actual code instead of on reproducing the steps required to build it. The recipe for how it was done can be found in Appendix F, but the gist of it was to accomplish all the tasks listed in the installation guide from the Suricata documentation [69].

1. Create Instance
2. Manage RSA keys for access
3. Install Git
4. Install dev-tools and other libraries needed from the Ubuntu repository
5. Clone our version of Suricata from Github
6. Get LuaJIT to enable LUA support in Suricata
7. Compile LuaJIT
8. Clone libhtp from Git
9. Compile Suricata

4.5 Delivery

The delivery part of this project has one uncommon property compared to many software development projects. We are delivering our product to two quite different actors, one being OISF and the other being our employer, KDA.

The delivery part of our project is highly affected by modern open source development trends, the main one being the usage of the Git version control software and the Git server hosting and managing service github.com [70]. Suricata is hosted by Github, as is our version of Suricata with our own code. Our final version of the code is pushed to the Suricata repository for approval, and KDA will get their product from a Github repository as well. One representative from KDA has also had access to our development repository for the latter half of development, meaning they had, and will have, access to the latest available version of our code. This means that the entire delivery process is completed through Github.

Both parties are also supplied with user documentation for our code. KDA can choose to use this documentation internally, while OISF has expressed interest in hosting it on their wiki page [71].
Some documentation about data structures in Suricata and how they relate to each other has also been created, which may be relevant for OISF.

4.5.1 Quality Assurance by OISF

The delivery starts with a quality assurance process by OISF. Quality assurance is commonly implemented in almost any software project of substantial size, both open source and not. It often consists of both a manual and an automatic part. The QA process is mainly a platform for reviewing work committed to a project in order to prevent flaws or bugs, but also for comparing the resulting work to the documented requirements and for assuring that the customer receives the desired product.

In the Suricata project the QA process is described in detail on the front page of its repository and in the Redmine documentation wiki, which makes the process well established and known to the development community. Anyone can contribute by creating a pull request. This could potentially generate a lot of noise and contributed code of lesser quality, and it would be a waste of developer resources to manually reject code that does not compile. Familiarizing the developers with the QA process before it is set in motion may drastically raise the quality of the delivered product, even before it reaches the QA stage.

Regular contributors will automate this process with a pull request script and an automated build and quality assurance process. This does however require an account on the build system, which we did not need as this is a one time contribution. OISF provides plenty of guidance and regulations for pull requests even if the automated tools are not used. We have followed the project's coding standard and its guide for Github contributions [59] [72].

Once a pull request is made, the code will automatically be built using the clang and GCC compilers with a predefined set of flags. The success or failure result of the compilations will be attached to the pull request.
This could help potential contributors fix errors, based on the error messages produced by build failures, even before experienced programmers give their feedback.

High level QA

“On a high level, the steps are:

• Travis-CI based build & unit testing. This runs automatically when a pull request is made.
• Review by devs from the team and community
• QA runs.

Overview of Suricata’s QA

Trusted developer and core team members are able to submit builds to our (semi) public Buildbot instance. It will run a series of build tests and a regression suite to confirm no existing features break.

The final QA run takes a few hours minimally, and is started by Victor. It currently runs:

• extensive build tests on different OS’, compilers, optimization levels, configure features
• static code analysis using cppcheck, scan-build
• runtime code analysis using valgrind, DrMemory, AddressSanitizer, LeakSanitizer
• regression tests for past bugs
• output validation of logging
• unix socket testing
• pcap based fuzz testing using ASAN and LSAN

Next to these tests, based on the type of code change further tests can be run manually:

• traffic replay testing (multi gigabit)
• large pcap collection processing (multi terabytes)
• AFL based fuzz testing (might take multiple days or even weeks)
• pcap based performance testing
• live performance testing
• various other manual tests based on evaluation of the proposed changes

It’s important to realize that almost all of the tests above are used as acceptance tests. If something fails, it’s up to you to address this in your code.

One step of the QA is currently run post-merge. We submit builds to the Coverity Scan program. Due to limitations of this (free) service, we can submit once a day max. Of course it can happen that after the merge the community will find issues.
For both cases we request you to help address the issues as they may come up.” [73]

4.5.2 Pull Request to Suricata

We did two iterations of pull requests to Suricata before this thesis was finished. Victor Julien was the OISF representative giving feedback on both iterations. He has the title of Lead Programmer, meaning that he is probably the most competent person we could hope to get feedback from. The first iteration resulted in feedback on 5 sections of the code, and in our second iteration 3 sections received feedback.

The feedback was mostly on minor code guideline mistakes like indentation or bracket placement. There were also some misunderstandings regarding return values and other details. Some results were very interesting and valuable as well. Julien did for instance comment on our one usage of strlen. strlen requires a NUL-terminated string, and it was used in a situation where the string was not guaranteed to be NUL-terminated. This could have resulted in a security issue. It is also an example of how having experienced C programmers with knowledge of common pitfalls in the quality assurance process can have a significant impact on the overall quality of a project. More details, including all feedback in both iterations, are located in Appendix B.

4.5.3 Value #13 Certify Release and Archive

As previously stated, the API documentation, together with the delivery of our source code, covers this Value to its full extent. This bachelor thesis also serves as a rather comprehensive documentation of our work and is openly available to the public.

4.5.4 API Documentation

The following section features a complete list of all functions made available to Lua scripts, a description of the I/O of each function, and usage examples. The features made available through functions to the users of Suricata are as follows:

SMTPGetMailFrom()
    Allows users access to the MailFrom attribute of an SMTPTransaction.
SMTPGetMimeField(string)
    Allows users access to specified MIME fields in the SMTPTransaction by passing the name of the desired MIME field as a string argument to the function. If the specified field is found, the function returns the value of the desired MIME field.

SMTPGetRcptList()
    Allows users access to the complete list of recipients of an SMTPTransaction.

SMTPGetAttachmentFilename()
    Allows users access to the names of all attachments and the MD5 hash of the body of all attachments of an SMTPTransaction.

SMTPGetMimeList()
    Currently used only for development purposes. Writes console output on all MIME fields of an SMTPTransaction. A possible extension could push the results to a table in the Lua script, thereby allowing the user to build script logic around the return values of this functionality.

SMTPGetMailFrom

SMTPGetMailFrom() - Returns string buffer. May be used both by assigning the output to a variable and by placing the function call straight into control statements.

    ex1:
    local a = SMTPGetMailFrom()
    if a then
        print(a)
    end

    ex2:
    if true then
        print(SMTPGetMailFrom())
    end

    output:
    <example@domain.xyz>

Figure 17: SMTPGetMailFrom example

SMTPGetMimeField

SMTPGetMimeField(string) - Returns string buffer. May be used both by assigning the output to a variable and by placing the function call straight into control statements. The string argument should refer to the name of the desired MIME field and is mandatory. The function returns the value of the specified MIME field if the MIME field is found.

    ex1:
    local a = SMTPGetMimeField("x-mailer")
    if a then
        print(a)
    end

    ex2:
    if true then
        print(SMTPGetMimeField("subject"))
    end

    output:
    1: Microsoft Office Outlook 12.0
    2: Re-negotiating usage of project management systems

Figure 18: SMTPGetMimeField example

SMTPGetRcptList

SMTPGetRcptList() - Returns a table consisting of keys and values. The table variable has to be initialized before calling the function.
The data returned by the function consists of the separate addresses found in the list of recipients, also known as the rcpt-list.

    local a = {}
    a = SMTPGetRcptList()
    for i, v in pairs(a) do
        print(i, v)
    end

    output:
    0    <recipient@exampledomain.xyz>
    1    <devel@hig.ntnu.xyz>

Figure 19: SMTPGetRcptList example

SMTPGetAttachmentFilename

SMTPGetAttachmentFilename() - Returns a nested table consisting of two different key-value pairs. The outer table contains an index as key, pointing to the nested table with the file information. Both the filename pair (name and value) and the MD5-body pair (name and value) are located within this nested table. Just as with SMTPGetRcptList(), the table variable has to be initialized before calling the function. Since this is a nested table, working on the inner data (filename and MD5) requires another for loop.

    local a = {}
    a = SMTPGetAttachmentFilename()
    for i, v in pairs(a) do
        for j, k in pairs(v) do
            print(j, k)
        end
    end

    output:
    filename    report.pdf
    md5-body    b63de299836f54017ea6f0801b8c0be6

Figure 20: SMTPGetAttachmentFilename example

SMTPGetMimeList

SMTPGetMimeList() - This function returns a table with the names of all MIME fields residing within a mail transaction. Table = index int => mimefieldname

    local a = {}
    a = SMTPGetMimeList()
    for i, v in pairs(a) do
        print(i, v) -- could also use print(v) to get name output only
    end

    output:
    1    subject
    2    user-agent

Figure 21: SMTPGetMimeList example

4.6 Summary

Development is one of the main phases in our project and it ran in parallel with the code review. This development phase has one special property in that we have extended our general methodology, Kanban, with a secure software development methodology. It is a methodology we have defined ourselves and named The Value Method. It should be noted that the Kanban board was not actively used once the development got going. Task delegation and task progress were organized between members and communicated orally, as the work happened on campus.
The use cases were used as a basis for the functionality we were going to develop, while the functional and non-functional requirements served as a basis for testing and helped shape general development decisions. We knew from our research that the Suricata high level documentation was a bit lacking, especially when it comes to relations between files and data structures. These relations were documented, as we believe the information will prove valuable for future work.

The Value Method is of course extensively incorporated into the development. One activity we performed and got concrete results from was Value Practice #5: Threat Modeling and Abuse Cases. The abuse cases under this section were tested against, and a segmentation fault bug was found as a result. Other activities were Security and Privacy Requirements, Security and Privacy Risk Assessment and Attack Surface Analysis. These tasks, combined with the extensive quality assurance and iterative feedback process in the delivery section, have given us confidence in the quality of our source code and in the Suricata contribution process.

One essential part of this development was the extensive performance test. All our code functionality was tested with generated pcap files. These pcap files contained nothing but SMTP traffic and allowed us to adequately test the performance. The results largely supported our goals, as there was only a minuscule decline in performance with extreme traffic data.

The actual development went according to plan and reached feature parity with our use cases. Writing C code for a large and mature project was a technological challenge. Extra time and personnel resources were used to combat this challenge, and the development was completed in a timely fashion. Another factor that helped keep the development phase in line with our schedule was that most of our research into the location of functions, data structures and files related to Lua, SMTP and MIME in Suricata proved really valuable. We did not spend a lot of time searching for data structures and helper functions.

All source code is readily available to our employer, KDA. They also have access to the user documentation. The first two rounds of pull requests were sent to the Suricata developers as well. There was not enough time to iterate through the pull request process enough times for our code to be accepted into Suricata before we had to prioritize the thesis writing.

5 Code review

5.1 Manual analysis

We had two main approaches to manual auditing. The first method was, as described in our methodology, to search for potentially suspicious source code comments. This method returned too many results for us to manually check the related source code for each line. We got 476 results from using the “grep” method. The following bash command was used to find the suspicious code lines:

    find *.c | xargs grep -Hni "todo\|bug\|hack\|fix\|xxx\|temporary"

We looked over the list and selected the lines that had the most potential for actual errors. Many of these lines were however too entangled in advanced logic outside of our competence. Understanding the lines retrieved would in many cases require us to spend a significant amount of time backtracking complex logic in related code. In many cases we could not justify the time for such extensive auditing, given the number of findings and the small amount of confidence we could place in the grep results. This assumption was supported by the findings from the grep results that were looked into more extensively. Most of the more manageable grep source lines that were looked into proved to be false positives at a quick glance.
This led us to believe that the false positive rates would be similar in the more complex cases, so it would not be worth the time to investigate these.

Three grep source lines that looked like a potential threat, even after a first glance, were selected and a more extensive review was performed. The review notes, conclusions and peer feedback follow. These sections contain raw audit notes, but were kept as is to showcase the methodology and thought process behind the audit.

5.1.1 Review: alert-unified2-alert.c

Filename: alert-unified2-alert.c
Triggering line: 1401
Line content: filename = SCMalloc(PATH_MAX); /* XXX some sane default? */

This file was selected by running the grep command on all source code files searching for "XXX". The particular output was selected because the lack of a sane default can be a security risk if there is a chance of no parameter being specified.

The offending comment is in the function "Unified2AlertOpenFileCtx", which reads the config, sets the file pointer and opens the file. The line that is commented on allocates memory for a file name and points a char pointer to it. Memory of size PATH_MAX is allocated. PATH_MAX is not a sane default here, as it is defined to be 4096 in linux/limits.h [74]. In the definition of LogFileCtx (the struct containing filename), the comment above the variable filename says that it is the name of the file. Thus we can assume that the issues raised in this blog post won’t be a problem [75]. If the filename is over 4096 characters the buffer will however overflow. This wiki page shows that this 4096 byte buffer is well within the limits of a probable filename [76]. This might be the reason why PATH_MAX is not a sane default: it is too large.

Fix solution: Set the max to 255 bytes, which is the common denominator for most file systems as seen here [76].

Audit feedback

The comment states that PATH_MAX isn’t the sane default here, and that might seem to be right.
A filename != path, so in theory a new value for the filename maximum (NAME_MAX in linux/limits.h) would be nice to have. 255 is the normal maximum filename length in most file systems [76]. On a related note, PATH_MAX for filenames is used quite a bit throughout the Suricata code base: log-pcap.c, log-tcp-data.c, util-debug.c, util-logopenfile.c, util-profiling.c and a few more. With this in mind it would be wrong to change this in just one place, but it might be a good idea to change it everywhere.

5.1.2 Review: decode-ipv6.c

./decode-ipv6.c:185 /todo move into own function and load on demand.
./decode-ipv6.c:187 / xxx unused and broken, original packet is modified in the memcpy
./decode-ipv6.c:196 /todo do this without memcpy since it’s expensive.

This file was selected by running the grep command on all source code files searching for "todo" and "xxx". This particular output was selected because the comments hint at potential optimization.

The comments apply to the function DecodeIPV6ExtHdrs. The file's general purpose is decoding IPv6 packets, and the function extracts headers from the decoded IPv6 packet. The problematic lines are for now commented out, so the code is not in use. It is commented out because the memcpy call on line 197 will modify the content of the packet. Suricata just "sniffs" packets, it never modifies them. The commented-out lines could be moved into their own function and called only on demand for packets with the special header size, which would thereby be the correct protocol for this memcpy call. Moving the code into its own function would be quite simple. The memcpy would however have to be changed and tested so that it never modifies any packets in the system.

Audit feedback

Unless there is a specific issue/bug in the tracker, we should just leave this as it is.
On a related note in the same function, at line 209 there is an empty case in a switch statement (case PROTO_HOPOPTS). As a result, the next case takes effect when there is a hit on PROTO_HOPOPTS. That next case contains an if statement that looks for PROTO_HOPOPTS, so this has been considered. This reveals that there is some intent behind the fall-through, but is it the best solution? I would conclude that this might raise issues if the code needs to be modified at some point.

5.1.3 Review: decode-ipv4.c

Filename: decode-ipv4.c
Triggering line: 319 and 332
Line content:
/** \todo What if more data exist after EOL (possible covert channel or data leakage)? */
/** \todo What if padding is non-zero (possible covert channel or data leakage)? */

This file was selected by running the grep command on all source code files searching for "todo". This particular output was selected because the comment in question alludes to a possible security issue. These are comments for code in the function DecodeIPV4Options. The entire file is for decoding IPv4 packets, and this function is for decoding and validating IPv4 options. The lines that might be a problem are the ones responsible for validating the length of the IPv4 packet. This length value is later used to loop through the packet options. The length is validated by checking for an End Of List (EOL) on line 319. A possible issue here, raised in the comment, is that there might be data hidden after the EOL. I could not find any research about hiding data after EOL, but it should be theoretically possible. The recommended mitigation would be checking that plen is 1 and failing hard if it’s not. The second issue raised is the case where the packet’s remaining options length has less than 2 bytes left. This check comes after the EOL check and after the No Operation padding check. There are no valid fields left that it could be.
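The two conditions raised by the TODO comments can be illustrated with a small standalone validator. This is a sketch under stated assumptions, not Suricata’s DecodeIPV4Options: the function and enum names are hypothetical, only the End of Option List (0) and No Operation (1) type values from RFC 791 are assumed, and instead of requiring plen to be exactly 1 at EOL it accepts zero padding after EOL and flags any non-zero byte there.

```c
#include <stddef.h>
#include <stdint.h>

#define OPT_EOL 0x00 /* End of Option List, RFC 791 */
#define OPT_NOP 0x01 /* No Operation, RFC 791 */

enum opt_result {
    OPTS_OK = 0,
    OPTS_DATA_AFTER_EOL, /* non-zero bytes follow the EOL marker */
    OPTS_DANGLING_BYTE,  /* one byte left where type + length are needed */
    OPTS_BAD_LENGTH      /* option length below 2 or past the end */
};

/* Walk the options field of plen bytes and classify the first problem. */
enum opt_result check_ipv4_options(const uint8_t *pkt, size_t plen)
{
    while (plen > 0) {
        if (*pkt == OPT_EOL) {
            /* Everything after EOL is padding and should be zero. */
            for (size_t i = 1; i < plen; i++) {
                if (pkt[i] != 0)
                    return OPTS_DATA_AFTER_EOL;
            }
            return OPTS_OK;
        }
        if (*pkt == OPT_NOP) {
            /* NOP is a single byte without a length field. */
            pkt++;
            plen--;
            continue;
        }
        /* Any other option needs at least a type and a length byte. */
        if (plen < 2)
            return OPTS_DANGLING_BYTE;

        uint8_t optlen = pkt[1];
        if (optlen < 2 || optlen > plen)
            return OPTS_BAD_LENGTH;

        pkt += optlen;
        plen -= optlen;
    }
    return OPTS_OK; /* options ended exactly at the end of the header */
}
```

A caller could log OPTS_DATA_AFTER_EOL and OPTS_DANGLING_BYTE as anomalies rather than dropping the packet, which keeps the single leaked byte visible without changing how valid traffic is handled.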
A missing EOL is not a problem and fits the standard if the end of the options field is also the end of the IP header. This should be checked for. The maximum covert channel leakage would be one byte per packet, which is in my opinion quite insignificant, though that one byte should be logged.

Audit feedback
The comment on line 319 pertains to a possible data leakage or covert channel, but it is quite difficult to see any significant threat here. The if statement checks if the pkt is equal to IPV4_OPT_EOL, and that should in theory mean that there is no more information after this. The comment on line 322 asks about a possible data leakage or covert channel if the padding is non-zero. The check here is if plen is less than 2, so if it is 1 we might skip 1 byte of data. This 1 byte can be utilized as a covert channel, but it is a pretty small amount of data. There is also no clear indication of how this channel could be exploited in the wild. My suggested fix would be to add another else if statement where we check if plen is 1, log it and break out.

5.1.4 Selective Review

Our second approach to manual analysis was to select what we believed was the most critical code in Suricata and audit that code. This approach has one obvious fault: that code has probably been written with care, so the chances of finding potential threats could be slim. There is however also the argument that the most critical code should probably be audited more than once, and that was the approach we went with. We deduced that one highly relevant attack surface for Suricata in general is the entry point for network traffic. This code will in many cases deal with unfiltered network traffic from potentially malicious actors and is the first line of defense in many network designs. The investigations into this code, located in source-nfq.c, proved our first assumption to be true. This code seemed to handle the initial parsing with adequate care.
This supports our general impression, namely that the Suricata source code is managed by competent developers. Another issue with the manual audit was the fact that the data traffic quickly disappeared into structures, obfuscating the continuation of the data flow. We also did manual review, in the form of peer reviews, continuously during the development phase. These audits were done in a more informal manner as a way to support the development and keep a continuous, high code security level during the development.

5.2 Value Practice #7: Perform Static Analysis

5.2.1 Introduction

Static analysis means analyzing the source code and not the binary file the source code compiles to. Static analyzers work similarly to compilers in that they perform many of the same steps. The code is compiled to an intermediate language where the static analyzer is able to understand the program flow. Then certain checks are run on logical patterns: checking that variables are used, that functions are called with proper parameters, that pointers are set before they are used, and so on [77] [78]. We used an automated tool for our static analysis, and this section presents the results of the static analysis we performed. We decided to use the clang-analyzer, which is a part of the Clang project, a front-end for the LLVM compiler. clang-analyzer has existed for a long time and is a part of the very reputable Clang/LLVM project. Clang/LLVM has an active community and is under continued development. Other alternatives like Fortify® were also considered, but the cost of the product and an unclear feature list made us go for a FOSS alternative. The entire Suricata code base was run through the static analyzer and each finding has been validated manually. The results have also been evaluated for the potential risk they pose to Suricata, and remedies are suggested. All findings we have deemed insignificant are located in the appendix.
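The kinds of checks listed in the introduction can be made concrete with a small C function. This is an illustrative sketch only: it is neither output from the clang-analyzer nor code from Suricata, and the function name dup_prefix is hypothetical. Each guard corresponds to a defect class that a path-sensitive analyzer looks for, and removing the guard would leave a path on which that defect can occur.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Duplicate at most the first n bytes of src into a new NUL-terminated
 * string. Returns NULL when src is NULL or when allocation fails.
 */
char *dup_prefix(const char *src, size_t n)
{
    /* Guard 1: src may be NULL on some call path (null dereference). */
    if (src == NULL)
        return NULL;

    /* Guard 2: never read past the end of src. */
    size_t len = strlen(src);
    if (n > len)
        n = len;

    /* n + 1 is at least 1, so this is never a zero-byte allocation. */
    char *out = malloc(n + 1);

    /* Guard 3: malloc may fail, so out is checked before it is used. */
    if (out == NULL)
        return NULL;

    memcpy(out, src, n);
    out[n] = '\0';
    return out;
}
```

The caller owns the returned string and releases it with free.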
5.2.2 Metrics

We have selected the most critical bugs and made individual risk evaluations for each. The risk is computed as the product of the probability of the bug occurring, on a scale from 1 to 3, and the potential consequence of the bug on the same scale. We consider risk scores over 3 to be a potential threat to Suricata. The risk is presented as a matrix with impact as the X-axis and probability as the Y-axis. The risk score is shown in the cell where the two intersect.

Probability  Description
1            It is theoretically possible but highly unlikely, or impossible, for the bug to occur
2            The bug can possibly occur
3            The bug will occur
Table 14: Probability classes

Impact  Description
1       Almost inconsequential to the Suricata performance or code quality
2       Will cause memory leaks or other instability to Suricata. May impact availability
3       Will most likely stop the Suricata process
Table 15: Impact classes

5.2.3 Result Discussion

The clang-analyzer organized the results into four categories based on the type of bug it found. The first category is dead store; it contains bugs where a variable is assigned a value that is never used afterwards. The logic error category groups bugs that occur because of a logical misstep by the developer. Memory errors are bugs where the code tries to use memory it should not access. The Unix API category contains bugs that use the Unix API incorrectly.

35 bugs were in the Dead store category:
30 Dead assignment
5 Dead increment

1 bug was in the Logic error category:
1 Dereference of null pointer

21 bugs were in the Memory error category:
21 Use-after-free

3 bugs were in the Unix API category:
3 Undefined allocation of 0 bytes

5.2.4 Important Results

Bug 36.
Type: Unix API. Undefined allocation of 0 bytes.
File: detect-engine-mpm.c
Function: DetectSetFastPatternAndItsId
Line: 1345
Conclusion: True positive.
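The finding type named here, an allocation whose computed size can be zero, matters because the C standard leaves malloc(0) implementation-defined: the call may return NULL or a pointer that must not be dereferenced. A minimal sketch of guarding such a call is given below. It is not the Suricata code: alloc_pattern_area is a hypothetical helper, plain malloc stands in for SCMalloc, and the overflow check is an addition that the analyzer finding does not mention.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocate one buffer sized from two running totals.
 * Returns NULL without calling malloc when both totals are zero, or when
 * their sum would overflow size_t.
 */
uint8_t *alloc_pattern_area(size_t struct_total, size_t content_total)
{
    if (struct_total == 0 && content_total == 0)
        return NULL; /* nothing to store, so do not request 0 bytes */

    if (struct_total > SIZE_MAX - content_total)
        return NULL; /* the sum would wrap around */

    return malloc(sizeof(uint8_t) * (struct_total + content_total));
}
```

With this shape the caller has a single failure value to test for, regardless of how the platform’s allocator treats a zero-byte request.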
There are two values at the start of this function that cause this issue, struct_total_size and content_total_size. These are set to 0 at the start of the function and get their actual values in the for loop on line 1335. The precondition for this bug is that sig_list in the DetectEngineCtx struct is set to NULL. If this is the case, the program flow will not enter the for loop, leaving the total_size variables at 0. The SCMalloc on line 1345 causes the actual bug because it will allocate a number of bytes equal to sizeof(uint8_t) * (struct_total_size + content_total_size). In our case this would be an allocation of sizeof(uint8_t) * 0, which results in an allocation of 0 bytes. We recommend adding the following code above line 1344.

if((struct_total_size != 0) || (content_total_size != 0))

Table 16: Risk Matrix: Bug 36. Score: 4 (probability 2, impact 2)

Bug 38.
Type: Unix API. Undefined allocation of 0 bytes.
File: util-radix-tree.c
Function: SCRadixAddKey
Line: 749
Conclusion: True positive.
Probability: 2
Impact: 2
Risk score: 4

This is another bug where arithmetic in an SCMalloc call makes an undefined allocation of 0 bytes possible. The SCMalloc call allocates sizeof(uint8_t) * (node->netmask_cnt - i). node->netmask_cnt appears to have a constant value, while "i" is set in a for loop on line 744. If the "if" statement on line 745 inside the "for" loop never triggers, "i" is in fact guaranteed to end up equal to node->netmask_cnt, causing the undefined allocation. Whether this can happen is hard to determine, because the "if" condition is based on arcane logic in the radix tree generation function. It is quite likely that the mentioned "if" statement is designed to always trigger, since otherwise i always ends up equal to node->netmask_cnt and the allocation bug always happens. We can however not guarantee it, and we therefore recommend that the following code is added above line 749.
if(i == node->netmask_cnt) return -1;

Table 17: Risk Matrix: Bug 38. Score: 4 (probability 2, impact 2)

Bug 39.
Type: Unix API. Undefined allocation of 0 bytes.
File: util-pool.c
Function: PoolInit
Line: 166
Conclusion: True positive.
Probability: 1
Impact: 2
Risk score: 2

This bug requires the argument elt_size for PoolInit to be set to 0. This is not the case in any PoolInit call in the Suricata source code as of yet. If elt_size however is set to 0, there could be a case where an undefined allocation of 0 bytes would be made. We recommend changing line 165 to the following to fix this bug.

} else if (elt_size > 0) {

Table 18: Risk Matrix: Bug 39. Score: 2

Bug 45.
Type: Memory error. Use-after-free.
File: unix-manager.c
Function: UnixCommandRun
Line: 545
Conclusion: True positive.
Probability: 2
Impact: 1
Risk score: 2

If UnixCommandRun receives a command that is longer than what’s allowed, it will close the connection to the client and free all related information. The program flow does not break if this happens, so after the information is freed it will continue with the command execution flow. Finally it will try to access the freed memory, making this a true positive. Practical testing of this bug reveals that the input gets truncated to the size of the buffer by the recv system call that retrieves the remote command. This means that the terminating characters of the string will get removed in most cases, causing the json parsing of the command to fail and stopping this bug from triggering. This bug can only be exploited if someone sends a command that is longer than the buffer size but contains a terminating character before the truncation. Practical tests show that Suricata survives receiving a command matching the required prerequisites for the bug, probably because the json library that is responsible for the function accessing the freed memory does proper checks. We recommend adding the following after line 543.
else {

and an extra closing bracket at the end of the function.

Table 19: Risk Matrix: Bug 45. Score: 2 (probability 2, impact 1)

5.2.5 Result Summary

The clang-analyzer returned a total of 60 results. This may seem like quite a lot at first, but most of these were in fact the same type of bug located in different parts of the source code. 35 of the 39 true positives came from essentially the same dead store bug. A dead store is when a variable is set but never used. This might for example occur when code is removed, but some variables are left in place. We obviously do not regard these types of bugs as a significant threat, but they should be fixed. They also skew the results towards a better true positive rate. The false positive results were also affected by bugs occurring multiple times throughout the source code. 16 out of the 21 false positive bugs stemmed from the same clang-analyzer issue. The clang-analyzer failed to correctly interpret one macro in the source code called TAILQ_FIRST. It believed that one pointer could be accessed after it was freed, but this was not the case. This skewed the results back towards equilibrium, but not enough to compensate for the dead store bugs. Summing the dead store and use-after-free bugs gives us a total of 51 out of 60 bugs spread over two bug categories. That result was a bit disappointing, but the tool cannot be blamed, as it should display similar bugs. It is noteworthy that three of the four bugs we deemed important were outside these two categories. We did not evaluate any of the important bugs to have a risk score over four, even though the bugs had some interesting aspects and were true positives. The broader conclusion is that we are pleased with the results, as false positives and other issues should be expected with the use of automated tools.

5.3 Summary

We had two main approaches to code reviewing.
The first was to search through the source code for comments that hinted at potential vulnerabilities. The second was to pinpoint the most critical code in Suricata and audit that. Combining these two methods meant that we could look into source code that had a higher probability of containing vulnerabilities but of a lower impact, and code that had a lower probability of containing vulnerabilities but with a higher impact. The source code comment based approach yielded a lot of initial results, but auditing these proved difficult. We admit that this is partly due to a lack of competence in advanced C code, but we also believe that we could not have compensated for this any better than we did during the research phase. Another contributing factor was that the source code is really extensive, and one interesting line of code is likely just a small part of a larger logic that could span thousands of lines. Tracking input and output parameters, variables and such through the source code proved to be challenging. We did however get some interesting results that were investigated thoroughly and yielded conclusive results. Our second approach, pinpointing critical code, was the most lacking in true positives and concrete results. Looking through the Suricata source code and interacting with the developers has left the impression of a mature development project which takes security seriously. This was reflected in the code we defined as critical, as we did not manage to unearth any security related issues. We note that the data flow into Suricata quickly disappeared into abstract data structures, which made us unable to trace the data flow far. The difficulty we had throughout the audit in the more complex parts of the code can in part be attributed to the lacking documentation, both in source code and design documents. We believe that we are able to draw conclusions about the security state of Suricata based on our code review phase.
Though we wish we had found more results for the sake of our learning outcome.

6 Conclusions

6.1 Research Conclusion

The purpose of the research phase of our thesis was to lay a good foundation for reaching our quantitative goals, qualitative goals, and the goals in our problem statement. The research phase included research on Suricata, relevant technologies and protocols in Suricata, project and development methodologies, audit methodologies and secure software development processes. On top of that, we created our own secure software development process, The Value Method. During the research phase we planned how we would work, and which methodologies we would use to reach our goals and conclude on our problem statement. Because of the project phases, the time frame, and the group’s knowledge of code audit and of contributing to an open source project, we chose the Agile development methodology Kanban. Kanban was chosen for our thesis and for each of the project phases. This decision was based on the fact that we wanted every group member involved in each phase, and it meant that the knowledge gained during the thesis work was distributed among the group members. This was made possible by reiterating on previous work. This also partly completed our learning outcome for project management. We also decided which methodologies we would use in the code review section, enabling us to conclude on the code review parts of our problem statement. We made our own secure software development process to give the development phase adequate focus on security, and to ensure that the requirement of developing secure code, stated in our problem statement and in our quantitative goals, was fulfilled. This also set the baseline for the learning outcomes of code review and secure code development. In the research phase, we used our methodology as planned, and held two weekly meetings.
Our estimates for the Kanban limits in this phase were correct. Specifically, limiting the research lane to a maximum of four Suricata files or topics being researched at a time resulted in group members finishing tasks before starting another. We used weekly meetings for group members to present their research on the Suricata source code and other topics to the rest of the group, and to delegate new tasks. The meetings worked as intended, and all group members gained knowledge that was essential for the next phases. However, we failed to estimate the length of some of the Suricata source code files, which resulted in fewer than four presentations at each meeting. The reason for this was the variable length of each source file, which could vary from two hundred lines to two thousand lines of code. Following the research on Suricata’s source code came research on technologies related to Suricata and our planned work of extending Suricata. Among the knowledge gained was the implementation of SMTP, MIME and Lua in Suricata. The result of this can be read about in the background section, and in the appendix. The research on these technologies was harder to split between group members, and didn’t really fit into our methodology. This resulted in a longer research phase. However, we consider the extra time spent in the research phase valuable because of the importance of the knowledge gained from it. We used the results of this research as a base for a lot of our code structure, and we reused some code sections for our code. Looking into other protocols with support for Lua was essential for understanding how to interact with the Lua stack from our extension to Suricata. Having the other protocol implementations for Lua as a starting point gave us insight into the location of the other source files that define Lua related functionality in Suricata.
We also got a good understanding of where the MIME and SMTP data was located and how to extract and use it. Along with the knowledge gained from the research on this code came the underlying learning process of advanced C, helping us towards the learning outcome of Advanced C programming. The research done in the Background and Methodology sections set the baseline for achieving the stated learning outcomes. The research phase set the strong foundation we were hoping for, and there is no doubt that the result of the research phase is in direct correlation with the goals reached in the other phases of the thesis.

6.2 Development Conclusion

The development phase was one of our two main phases. The goal for the development phase was, as described in the problem statement, to develop an extension to Suricata in a secure way by using established secure development practices. The development phase was completed with the agile development method Kanban, along with The Value Method. There are five different phases under development: Requirements, Design, Implementation, Testing and Delivery. The use cases for the extension were made in cooperation with KDA. The Value Method was used throughout the entire development phase. In the requirements phase, security requirements were added through Value practice 2: Security Requirements, and Value practice 3: Perform Security and Privacy risk assessment. These practices ensured that security was thoroughly addressed in this phase. The security requirements were taken into account throughout the development phase. In the design phase security was addressed with two Value practices: Attack Surface Analysis, and Use Threat Modeling and Abuse Cases. The Attack Surface Analysis practice was, even when tailored to our project, extensive. However, we got one result out of it and a solution was found. The Threat Modeling and Abuse Case practice was valuable. It identified threats and they were mitigated accordingly.
The abuse cases created the wanted awareness, and uncovered a serious bug during the risk-based security testing that was not revealed during ordinary testing. Security was addressed in the implementation phase with two Value practices: Deprecate Unsafe Functions, and Perform Static Analysis. The Suricata project contribution elaborates on the use of deprecated functions. It also highlights which safe functions to use instead. Static analysis is covered in the code review conclusion. In the testing phase security was addressed with three Value practices: Perform Dynamic Analysis, Perform Fuzz Testing and Risk-Based Security Testing. The dynamic analysis did not yield any results, which was reassuring to us since that meant that our code did not introduce such bugs. The other practices are elaborated in their full extent under implementation. All of our testing implies that our security requirements are fulfilled, especially after the pull request iterations are completed. We can then conclude that our security requirements were fulfilled. We did not leave traces of processed or unprocessed data. Our extension did not influence the integrity of the data being parsed. And we did not introduce new security vulnerabilities to the Suricata project. The last phase, Delivery, consists of having our code accepted into the Suricata project, delivering our user documentation and completing the QA process. The QA process is run and facilitated by OISF. It contains a lot of checks to ensure the quality of any submitted code and keeps the code quality in Suricata at a very high level. This is a step we have not completed yet, due to our need to prioritize completing this thesis. Value practices #12 and #13 combined with the OISF QA process will ensure, and verify, the quality of our code. The OISF QA process also covers Value practice #9: Perform Fuzz Testing.
We did not have sufficient time to complete a fuzz test, but we are glad to see that the OISF QA covers this part of The Value Method. A few issues were identified throughout the development phase. The development was more time consuming than anticipated, affecting our code review phase in a negative way. We believe this could have been remediated by using our project management methodology, Kanban, more strictly. There were also two code related issues that were not anticipated and caused some problems during the phase execution. The first was that we were informed during the pull request feedback that our email attachment extraction functionality extracted attachment information that was persistent across SMTP sessions. That could potentially have been fixed had it been discovered earlier, but it was too late for rewriting code under delivery. Not discovering that fault earlier was a shortcoming in our testing, though we believe it is hard to discover such an edge case. The other code related issue is tangentially related. It was discovered during testing, and confirmed with Suricata developers, that Suricata does not have application layer support for recognizing SMTP sessions. It would be optimal if Lua rules triggered once per SMTP session, but this is not the case, as they trigger multiple times. These two issues are however the only known shortcomings of the developed code. The functional requirement, “the extension will facilitate the extraction of SMTP-data between Suricata and Lua”, was completed by all of our functions. The other requirements are fulfilled by specific functions. The extension will allow users to extract all fields related to the MIME protocol in any given mail parsed by Suricata. This is fulfilled by the SMTPGetMimeList function and the SMTPGetMimeField function. The extension does not decrease the general performance of Suricata, as seen by our performance tests.
The extension will allow users to extract the sender address and recipient addresses of any given mail parsed by Suricata. This is fulfilled by the SMTPGetMailFrom and SMTPGetRcptList functions. The system will still be portable to all operating systems supported by Suricata [7]. To help us achieve our non-functional requirements we created an extensive set of user documentation. The user documentation will be delivered to KDA, OISF and the Suricata project, as stated in our project goals. This documentation will help people with basic knowledge of and training in Lua. The other requirement pertained to function names, and that the intention behind a function should be clear from its name. We believe that this was fulfilled because we did not get any comments on it during our pull requests, which indicates that there were no issues with the naming. In general, The Value Method provided adequate security focus for the development phase. Our belief is that we had a great learning outcome regarding secure code development. Although some of the practices were too extensive for the size of our project, we still believe it was a valuable addition to our development phase. As there is still some work to be done to get our code accepted into Suricata, we cannot say that we completed our goal of extending Suricata in a secure way. But due to the OISF QA process and the relevant Value practices, we can be confident that our extension will meet OISF’s security expectations when it gets accepted. When this is done, we will have reached the goal of extending Suricata in a secure way by using established secure development practices. The development phase resulted in five new API functions providing support for writing Lua scripts to extract data from Suricata’s SMTP and MIME data structures (see the appendix), thus solving part of our problem statement. We have iterated through the pull request process two times, but there are still a few things that require our attention.
However, we had to prioritize writing our thesis. We will complete the goal of submitting documentation after our pull request has gone through. Both of these goals will be future work. We did not reach our goal of zero percent decrease in general Suricata performance. Still, we believe that the performance decrease is of an insignificant nature, and that users will not be affected. Our performance test on the Lua engine shows that Suricata, running a simple Lua script, can process ten thousand SMTP packets a second, so this is a goal we achieved. From the development phase we had a great learning outcome in: contributing to open source software development, advanced C programming, working with projects of great magnitude, project management, documenting findings in a scientific way, IDS technology, and working with an external actor and employer.

6.3 Code Review Conclusion

Code review is one of the two main phases in this thesis. Two different approaches to code reviewing were used. The first was manual analysis and the second was automated static analysis. Both static and manual analysis were performed according to a predefined methodology. It is stated in Value practice #2: Security and Privacy Requirements, that “All code we produce will be manually reviewed for security issues.” Manual analysis was used continuously during the development, but in an informal manner. It was however always performed by a different group member than the one who wrote the code. This was to ensure the continuous high security and quality level of the code under development. The goal of the code review phase was, as described in the problem statement, “to verify the implementation of such as system,” referring to Suricata. Manually reviewing source code can be an extensive task, so selecting entry points for reviewing is essential for the outcome of the review. Two different methods were used for this. One approach had a focus on a high probability for results but with a low potential impact.
The method here was to search through the source code looking for comments that alluded to security issues. This approach yielded the best and most concrete results. No high impact vulnerabilities were found, but the information gathered helped form an opinion of the general security level in the Suricata source code. The second approach to code entry points was to evaluate the most critical code and review it. This method had a low probability for results, as such code would hopefully be written with care, but potential vulnerabilities would have a high impact. The second approach yielded few concrete results, as the code quality was good and the code complexity was challenging. We conclude that the first and second approaches complemented each other and resulted in a balanced manual analysis. Static analysis is in our custom methodology as Value practice #7 and was performed with the tool clang-analyzer. This activity was performed as intended and the results were satisfying. All results were manually validated and reviewed, and solutions were suggested. Few high impact bugs were found, supporting our conclusion about the quality of the Suricata implementation. Kanban was used throughout the code review phase as a group management methodology. It was however not strictly enforced, which may have been a factor in some faults with the review phase. The time spent before a peer reviewed the results of a code review was too long. Shortening that time could have helped to make the phase more effective, allowing for more results. We believe that using Kanban more strictly would have forced more time to be spent in the code review phase. The methodology has tools like backlog and to-do lanes for tasks, forcing correct prioritization. The methodologies we used during manual and static analysis, for example the peer review, gave the results more strength. The entry points used during the manual audit were especially good and allowed for the most effective auditing.
We conclude that the sparse high impact results of the manual code review phase were due to the high quality of care put into the Suricata implementation, and to the expertise of its community. The community surrounding Suricata is alive and proactive. Suricata’s issue tracker is active and new tickets are closed within a reasonable amount of time, which is an indicator of the well-being of an open source project [79]. One essential result of the manual code review was the discovery of Suricata’s sparse documentation and commenting. There were few comments in the code, and the ones that were found were usually high level descriptions of functions. That was definitely a large contributing factor to the hardships faced during the manual code review. Overall the high impact results of the code review phase were sparse, though this is also a result in itself. This observation is important, because it shows that the Suricata project keeps security in mind during all aspects of development, keeping the code base clean of bugs and other potential risks and threats. This is reflected throughout both the manual and the static code review. The manual and static audits reflect the high quality of work that lies behind the Suricata project and the knowledge in its community. Another cause of the sparse results is our limited experience with advanced C code. Manual code audit is rarely practiced in our Bachelor courses in Information Security and in Software Engineering. For this reason we believe that the result of the manual audit would have been better if our knowledge and skills in the subjects had been better. Even though our results are sparse, we managed to complete all the goals we set for our review. Static analysis was performed on the entire code base, and code fixes have been suggested for all discovered bugs or flaws. We also believe that some of our audit findings have helped raise the general security of the Suricata project.
Performing the code review has been a learning process, and the learning outcome goal of executing a code review has most definitely been fulfilled. In the end, we believe we verified the Suricata project to the best of our ability. The results from the code review phase support our general belief, namely that the Suricata implementation is secure, which is our answer to the problem statement.

6.4 Main conclusion

The problem of our thesis was to verify a security system and add an extension to such a system in a secure way by using established secure development practices. The thesis is divided into two main phases. In these phases we show that we have verified a security monitoring system, and that we have added an extension to such a system in a secure way by using established secure development practices.

The result of the first main phase, the development phase, shows that the extension enables users of Suricata to write Lua scripts that act upon SMTP traffic. It has been shown, during this phase, that the extension was developed with established security practices.

The result of the second main phase, the code review phase, shows that the thesis has verified the implementation of a security system, in our case Suricata.

The results show that users can have confidence in using our extension of the security monitoring system Suricata, and in using Suricata in general.

7 Future Work and Lessons Learned

7.1 Future Work

Extracting filename and MD5 for attachments

One requested feature was to extract the filename and MD5 sum for all email attachments. This feature was implemented, tested, documented and delivered to the Suricata developers. However, we had made an error, as there are two locations in the Suricata data structures where filenames are stored.
The filenames we retrieved were persistent across SMTP transactions, making our approach for filename and MD5 sum extraction unsuitable. The feature is not currently implemented in our code and is left as future work.

Completion of the pull request process

We have not completed the pull request process with Suricata. The result is that our code has not been added to Suricata yet. This is part of the future work that we need to complete.

Rework the SMTP parser

Suricata does not support recognition of an SMTP session on the application layer; it is only capable of recognizing SMTP packets. Support for general stream sessions is included, but it does not work optimally with SMTP. As a result, Lua rules for SMTP data trigger several times during a single SMTP session. To fix this, one has to rewrite the SMTP parser, which was too extensive for our thesis. Fixing this would enhance the SMTP-related Lua functionality in Suricata.

Complete manual review of the entire Suricata source code

Because we did not complete a full manual audit of the entire Suricata code base, we believe that this should be done in the future. This would give a more accurate verification of the security of Suricata. The process would be quite extensive, and could prove hard to do because of Suricata’s constant updates and changes, but it is possible with an adequate amount of resources and knowledge.

Static analysis with different tools

We performed static analysis with clang-analyzer, but we believe that the results should be cross-referenced using several static analysis tools for an even better verification of Suricata’s security.

Lua support for other protocols

Creating Lua support for other protocols in Suricata could be valuable functionality for Suricata users.

7.2 Lessons Learned

One lesson we learned through the project was that requiring group work to happen on campus was valuable.
It made communication in the group faster, and important decisions were quickly addressed. We saw a significant drop in productivity when the group was not physically together.

We used the git version control software for the entire project. It was reassuring to know that work could always be restored to previous versions. It was helpful that everyone had access to each other’s work so that knowledge could be spread among the members.

We learned that investing a lot of time in research is important and valuable when working on large projects such as this. Our results would have been of significantly lower quality if it were not for the thorough research phase.

We learned that it is important to use methodologies strictly to ensure correct prioritization. We used Kanban for our development phase, but we had a loose relation to it during this phase. We lost valuable time because of this, which was an important lesson.

Starting to write code in large, well-established projects is time consuming. We learned that one has to put a significant amount of time into becoming familiar with the data structures and functions used in the code if one wants to contribute to it. We also learned that a QA process is time consuming, and that it can take a substantial amount of time and many iterations before getting your code included in production.

Security reviewing a program is more difficult than we were prepared for. It was very time consuming to get a good understanding of the software’s structure and functionality, especially in our case, as the software was poorly documented.

With the entire process and project completed, the group feels that we gained a lot of new, useful knowledge, both for creating and verifying software with security in mind, and for managing and participating in large projects. We also feel that we managed to meet, or exceed, the desired learning outcome in all the areas we outlined at the start of our thesis project.
Bibliography

[1] Finding self signed tls certificates suricata and luajit scripting. https://www.stamus-networks.com/2015/07/24/finding-self-signed-tls-certificates-suricata-and-luajit-scripting/. (Visited May 2016).
[2] Smtp commands reference. http://www.samlogic.net/articles/smtp-commands-reference.htm. (Visited May 2016).
[3] 2016. V-model. https://commons.wikimedia.org/w/index.php?title=File:V-model.svg&oldid=158567139. (Visited May 2016).
[4] 2016. The global state of information security survey 2016. http://www.pwc.com/gsiss. (Visited May 2016).
[5] Suricata - top 3 reasons you should try suricata. https://suricata-ids.org/. (Visited May 2016).
[6] Bachelor’s thesis - imt 3912. http://english.hig.no/course_catalogue/student_handbook/2013_2014/courses/avdeling_for_informatikk_og_medieteknikk/imt3912_bachelor_s_thesis. (Visited May 2016).
[7] 2016. Complete list of suricata features. https://suricata-ids.org/features/all-features/. (Visited May 2016).
[8] Open-source software. https://en.wikipedia.org/w/index.php?title=Open-source_software&oldid=715799413. (Visited April 2016).
[9] Linus’s law. https://en.wikipedia.org/w/index.php?title=Linus%27s_Law&oldid=713638891. (Visited March 2016).
[10] What is free software? https://www.gnu.org/philosophy/free-sw.en.html. (Visited March 2016).
[11] Guide to intrusion detection and prevention systems (idps). http://csrc.nist.gov/publications/nistpubs/800-94/SP800-94.pdf. (Visited May 2016).
[12] Suricata - open source ids/ips/nsm engine. https://suricata-ids.org/. (Visited May 2016).
[13] 2016. Suricata - features. https://suricata-ids.org/features/. (Visited May 2016).
[14] Suricata rules. https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Suricata_Rules. (Visited May 2016).
[15] Emerging threats faq. http://doc.emergingthreats.net/bin/view/Main/EmergingFAQ. (Visited May 2016).
[16] Suricata.yaml - rule-vars. https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Suricatayaml#Rule-vars. (Visited May 2016).
[17] Suricata gnu general public license. https://raw.githubusercontent.com/inliniac/suricata/master/LICENSE. (Visited May 2016).
[18] Gnu - various licenses and comments about them. https://www.gnu.org/licenses/license-list.en.html. (Visited May 2016).
[19] Suricata - open source. https://suricata-ids.org/about/open-source/. (Visited May 2016).
[20] Oisf - about us. https://oisf.net/about-us. (Visited May 2016).
[21] Interpreted language. https://en.wikipedia.org/w/index.php?title=Interpreted_language&oldid=711174830. (Visited March 2016).
[22] Lua - about. https://www.lua.org/about.html. (Visited May 2016).
[23] Lua versus python. http://lua-users.org/wiki/LuaVersusPython. (Visited May 2016).
[24] Type system. https://en.wikipedia.org/w/index.php?title=Type_system&oldid=709690048. (Visited March 2016).
[25] Luajit. http://luajit.org/luajit.html. (Visited May 2016).
[26] Just-in-time compilation. https://en.wikipedia.org/wiki/Just-in-time_compilation. (Visited May 2016).
[27] Lua output. https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Lua_Output. (Visited May 2016).
[28] Lua scripting. https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Lua_scripting. (Visited May 2016).
[29] Simple mail transfer protocol - transport of electronic mail. https://tools.ietf.org/html/rfc5321#section-1.1. (Visited May 2016).
[30] Internet message format - addr-spec specification. https://tools.ietf.org/html/rfc5322#section-3.4.1. (Visited May 2016).
[31] Simple mail transfer protocol. https://www.ietf.org/rfc/rfc2821.txt. (Visited May 2016).
[32] Mime. https://en.wikipedia.org/w/index.php?title=MIME&oldid=715538720. (Visited March 2016).
[33] Mime (multipurpose internet mail extension) part three: Message header extension for non-ascii text. https://tools.ietf.org/html/rfc2047. (Visited May 2016).
[34] Internet message format. https://www.ietf.org/rfc/rfc2822.txt. (Visited May 2016).
[35] Henrik Kniberg, M. S. Kanban and Scrum - making the most of both, 10. InfoQ, 2010.
[36] 2016. Trello. http://www.trello.com/. (Visited May 2016).
[37] Sommerville, I. Software engineering - 9th edition, 29–36. Addison-Wesley, 2011.
[38] 2016. Manifesto for agile software development. http://agilemanifesto.org/. (Visited May 2016).
[39] McGraw, G. Software Security, Building Security In, 110. Addison Wesley, 2006.
[40] 2016. Generic checklist for code reviews. http://www.liberty.edu/media/1414/%5B6401%5Dcode_review_checklist.pdf. (Visited May 2016).
[41] 2016. Generic checklist for code reviews. http://www.liberty.edu/media/1414/%5B6401%5Dcode_review_checklist.pdf. (Visited May 2016).
[42] System development life cycle. https://en.wikipedia.org/w/index.php?title=Systems_development_life_cycle&oldid=714934957. (Visited May 2016).
[43] Joosen, B. D. W. R. S. K. B. J. G. W. 2016. On the secure software development process: Clasp, sdl and touchpoints compared. https://lirias.kuleuven.be/bitstream/123456789/242084/1/comparison.pdf. (Visited May 2016).
[44] Microsoft sdl. https://www.microsoft.com/en-us/sdl/. (Visited May 2016).
[45] Owasp: Clasp concepts. https://www.owasp.org/index.php?title=CLASP_Concepts&oldid=48320. (Visited May 2016).
[46] Owasp: Clasp project. https://www.owasp.org/index.php?title=Category:OWASP_CLASP_Project&oldid=209288. (Visited May 2016).
[47] McGraw, G. 2006. Software Security, Building Security In. Addison Wesley.
[48] Mcgraw touchpoints. http://www.swsec.com/resources/touchpoints/. (Visited May 2016).
[49] Sdl for agile. https://www.microsoft.com/en-us/SDL/discover/sdlagile.aspx. (Visited May 2016).
[50] Critic of agile sdl. http://www.ikangae.net/mood-post/critic-of-agile-sdl/. (Visited May 2016).
[51] Firestarter: Agile development and security. https://securosis.com/blog/agile-development-and-security. (Visited May 2016).
[52] Sdl process: Training. https://www.microsoft.com/en-us/SDL/process/training.aspx. (Visited May 2016).
[53] Sdl process: Requirements. https://www.microsoft.com/en-us/sdl/process/requirements.aspx. (Visited May 2016).
[54] Sdl process: Design. https://www.microsoft.com/en-us/sdl/process/design.aspx. (Visited May 2016).
[55] Sdl process: Implementation. https://www.microsoft.com/en-us/sdl/process/implementation.aspx. (Visited May 2016).
[56] Sdl process: Release. https://www.microsoft.com/en-us/sdl/process/release.aspx. (Visited May 2016).
[57] McGraw, G. Software Security, Building Security In, 205–222. Addison Wesley, 2006.
[58] McGraw, G. Software Security, Building Security In, 187–189. Addison Wesley, 2006.
[59] 2016. Oisf coding style. https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Coding_Style. (Visited Feb. 2016).
[60] Stride. https://en.wikipedia.org/wiki/STRIDE_%28security%29. (Visited May 2016).
[61] 2016. Rfc 4021: Permanent mail header field registrations. https://tools.ietf.org/html/rfc4021#section-2.1. (Visited May 2016).
[62] Wikipedia. 2016. Code refactoring - wikipedia, the free encyclopedia. https://en.wikipedia.org/w/index.php?title=Code_refactoring&oldid=717403162. [Online; accessed 10-May-2016].
[63] Valgrind home. http://valgrind.org/. (Visited May 2016).
[64] McGraw, G. Software Security, Building Security In, 87. Addison Wesley, 2006.
[65] 2016. The mutt e-mail client. http://www.mutt.org. (Visited May 2016).
[66] 2016. msmtp. http://msmtp.sourceforge.net. (Visited May 2016).
[67] 2016. Fakesmtp. https://nilhcem.github.io/FakeSMTP/. (Visited May 2016).
[68] 2016. Automated builds: The key to consistency. http://www.infoq.com/articles/Automated-Builds. (Visited May 2016).
[69] 2016. Installation from git with luajit. https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Installation_from_GIT_with_luajit. (Visited May 2016).
[70] 2016. Suricata github repository. https://github.com/inliniac/suricata. (Visited May 2016).
[71] Suricata documentation. https://redmine.openinfosecfoundation.org/projects/suricata/wiki. (Visited May 2016).
[72] 2016. Code submission quality criteria. https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Code_Submission_Quality_Criteria. (Visited May 2016).
[73] Julien, V. 2016. Suricata github readme. https://github.com/inliniac/suricata/blob/master/README.md. (Visited Apr. 2016).
[74] 2016. Where is path max defined in linux. https://stackoverflow.com/questions/9449241/where-is-path-max-defined-in-linux. (Visited Feb. 2016).
[75] 2016. Path_max simply isn’t. http://insanecoding.blogspot.no/2007/11/pathmax-simply-isnt.html. (Visited Feb. 2016).
[76] 2016. Comparison of file systems. https://en.wikipedia.org/w/index.php?title=Comparison_of_file_systems&oldid=719823852. (Visited May 2016).
[77] Clang analyzer. http://clang-analyzer.llvm.org/. (Visited May 2016).
[78] Static analysis. https://en.wikipedia.org/w/index.php?title=Static_program_analysis&oldid=711385436. (Visited May 2016).
[79] Suricata - issues. https://redmine.openinfosecfoundation.org/projects/suricata/issues. (Visited May 2016).
[80] 2016. Suricata user guide. https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Suricata_User_Guide. (Visited May 2016).
[81] Queue handling. http://www.netfilter.org/projects/libnetfilter_queue/doxygen/group__Queue.html. (Visited May 2016).
[82] Installation from git with luajit. https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Installation_from_GIT_with_luajit. (Visited May 2016).
[83] Registration of mail and mime header fields. https://tools.ietf.org/html/rfc4021. (Visited May 2016).
82 CASEC A 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 util-lua-smtp.c /∗ C o p y r i g h t (C) 2014 Open I n f o r m a t i o n S e c u r i t y F o u n d a t i o n ∗ ∗ You can copy , r e d i s t r i b u t e o r m o d i f y t h i s Program under t h e t e r m s o f ∗ t h e GNU G e n e r a l P u b l i c L i c e n s e v e r s i o n 2 a s p u b l i s h e d by t h e F r e e ∗ Software Foundation . ∗ ∗ T h i s program i s d i s t r i b u t e d i n t h e hope t h a t i t w i l l be u s e f u l , ∗ b u t WITHOUT ANY WARRANTY ; w i t h o u t e v e n t h e i m p l i e d warranty o f ∗ MERCHANTABILITY o r FITNESS FOR A PARTICULAR PURPOSE . S e e t h e ∗ GNU G e n e r a l P u b l i c L i c e n s e f o r more d e t a i l s . ∗ ∗ You s h o u l d have r e c e i v e d a c o p y o f t h e GNU G e n e r a l P u b l i c L i c e n s e ∗ v e r s i o n 2 a l o n g w i t h t h i s program ; i f not , w r i t e t o t h e F r e e S o f t w a r e ∗ Foundation , I n c . , 51 F r a n k l i n S t r e e t , F i f t h F l o o r , Boston , MA ∗ 02110−1301, USA . ∗/ /∗∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗/ \file \ author \ author \ author \ author \ author c a s e c B a c h e l o r s group L a u r i t z Prag Sømme <lauritz24@me . com> L e v i T o b i a s s e n < l e v i . t o b i a s s e n @ g m a i l . com> S t i a n Hoel B e r g s e t h <s t i a n . b e r g s e t h @ h i g . no> V i n j a r H i l l e s t a d <v i n j a r . h i l l e s t a d @ h i g . no> #include " s u r i c a t a −common . h " #include " debug . h " #include " c o n f . h " #include #include #include #include " threads . h" " threadvars . h" " tm−t h r e a d s . h " " output . h " #include " app−l a y e r −smtp . h " #ifdef HAVE_LUA #include <l u a . h> #include <l u a l i b . h> #include " u t i l −l u a . h " #include " u t i l −lua−common . h " #include " u t i l −f i l e . 
h " /∗ ∗ \ b r i e f i n t e r n a l f u n c t i o n u s e d by SMTPGetMimeField ∗ ∗ \param l u a s t a t e l u a s t a t e s t a c k t o u s e and push a t t r i b u t e s t o ∗ \param f l o w network f l o w o f SMTP p a c k e t s ∗ \param name name o f t h e a t t r i b u t e t o e x t r a c t from M i m e D e c F i e l d ∗ ∗ \ r e t v a l 1 i f s u c c e s s m i m e f i e l d found and pushed t o s t a c k . R e t u r n s e r r o r ∗ i n t and msg pushed t o l u a s t a t e s t a c k i f e r r o r o c c u r s . ∗/ static int GetMimeDecField ( l u a _ S t a t e ∗ l u a s t a t e , Flow ∗flow , const char ∗name) { /∗ e x t r a c t s t a t e from f l o w ∗/ SMTPState ∗ s t a t e = ( SMTPState ∗) FlowGetAppState ( flow ) ; /∗ c h e c k t h a t s t a t e e x s i s t s ∗/ if ( s t a t e == NULL) { return L u a C a l l b a c k E r r o r ( l u a s t a t e , " I n t e r n a l e r r o r : no s t a t e i n flow " ) ; } /∗ p o i n t e r t o c u r r e n t t r a n s a c t i o n i n s t a t e ∗/ SMTPTransaction ∗smtp_tx = s t a t e −>c u r r _ t x ; if ( smtp_tx == NULL) { 83 CASEC 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 return L u a C a l l b a c k E r r o r ( l u a s t a t e , " T r a n s a c t i o n ending or not found " ) ; } /∗ p o i n t e r t o t a i l o f msg l i s t o f M i m e D e c E n t i t y s i n c u r r e n t t r a n s a c t i o n . ∗/ MimeDecEntity ∗mime = smtp_tx−>m s g _ t a i l ; /∗ c h e c k i f m s g _ t a i l was h i t ∗/ if (mime == NULL) { return L u a C a l l b a c k E r r o r ( l u a s t a t e , " I n t e r n a l e r r o r : no f i e l d s i n t r a n s a c t i o n " ) ; } /∗ e x t r a c t MIME f i e l d b a s e d on s p e s i f i c f i e l d name . 
∗/ MimeDecField ∗ f i e l d = MimeDecFindField (mime , name) ; /∗ c h e c k MIME f i e l d ∗/ if ( f i e l d == NULL) { return L u a C a l l b a c k E r r o r ( l u a s t a t e , " E r r o r : m i m e f i e l d not found " ) ; } /∗ r e t u r n e x t r a c t e d f i e l d . ∗/ if ( f i e l d −>v a l u e == NULL || f i e l d −>v a l u e _ l e n == 0) { return L u a C a l l b a c k E r r o r ( l u a s t a t e , " E r r o r , p o i n t e r e r r o r " ) ; } return L u a P u s h S t r i n g B u f f e r ( l u a s t a t e , f i e l d −>value , f i e l d −>v a l u e _ l e n ) ; } /∗∗ ∗ \ b r i e f F u n c t i o n e x t r a c t s s p e c i f i c MIME f i e l d b a s e d on argument from l u a s t a t e ∗ s t a c k then pushing the a t t r i b u t e onto the l u a s t a t e s t a c k . ∗ ∗ \param l u a s t a t e l u a s t a t e s t a c k t o pop and push a t t r i b u t e s f o r I /O t o l u a ∗ ∗ \ r e t v a l 1 i f s u c c e s s m i m e f i e l d found and pushed t o s t a c k . R e t u r n s e r r o r ∗ i n t and msg pushed t o l u a s t a t e s t a c k i f e r r o r o c c u r s . ∗/ static int SMTPGetMimeField ( l u a _ S t a t e ∗ l u a s t a t e ) { if ( ! 
( LuaStateNeedProto ( l u a s t a t e , ALPROTO_SMTP) ) ) { return L u a C a l l b a c k E r r o r ( l u a s t a t e , " e r r o r : p r o t o c o l not SMTP" ) ; } int l o c k _ h i n t = 0 ; Flow ∗ flow = LuaStateGetFlow ( l u a s t a t e , &l o c k _ h i n t ) ; /∗ c h e c k t h a t f l o w e x i s t ∗/ if ( flow == NULL) { return L u a C a l l b a c k E r r o r ( l u a s t a t e , " E r r o r : no flow found " ) ; } const char ∗name = LuaGetStringArgument ( l u a s t a t e , 1) ; /∗ l o c k c h e c k ∗/ if ( l o c k _ h i n t == LUA_FLOW_NOT_LOCKED_BY_PARENT) { FLOWLOCK_RDLOCK( flow ) ; /∗ g e t s p e c i f i c MIME f i e l d ∗/ GetMimeDecField ( l u a s t a t e , flow , name) ; /∗ u n l o c k f l o w mutex t o a l l o w f o r m u l t i t h r e a d i n g ∗/ FLOWLOCK_UNLOCK( flow ) ; /∗ r e t u r n number o f f i e l d s pushed t o l u a s t a t e ∗/ } else { /∗ i f mutex a l r e a d y l o c k e d ∗/ GetMimeDecField ( l u a s t a t e , flow , name) ; } return 1 ; } /∗∗ ∗ \ brief ∗ ∗ \param l u a s t a t e l u a s t a t e s t a c k t o pop and push a t t r i b u t e s f o r I /O t o l u a ∗ \param f l o w network f l o w o f SMTP p a c k e t s ∗ ∗ \ r e t v a l 1 i f t h e m i m e l i s t t a b l e i s pushed t o l u a s t a t e s t a c k . ∗ R e t u r n s e r r o r i n t and msg pushed t o l u a s t a t e s t a c k i f e r r o r o c c u r s . 
∗/ static int GetMimeList ( l u a _ S t a t e ∗ l u a s t a t e , Flow ∗ flow ) { SMTPState ∗ s t a t e = ( SMTPState ∗) FlowGetAppState ( flow ) ; if ( s t a t e == NULL) { return L u a C a l l b a c k E r r o r ( l u a s t a t e , " E r r o r : no SMTP s t a t e " ) ; } /∗ C r e a t e a p o i n t e r t o t h e c u r r e n t S M T P t r a n s a c t i o n ∗/ SMTPTransaction ∗smtp_tx = s t a t e −>c u r r _ t x ; if ( smtp_tx == NULL) { return L u a C a l l b a c k E r r o r ( l u a s t a t e , " E r r o r : no SMTP t r a n s a c t i o n found " ) ; } 84 CASEC 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 /∗ C r e a t e a p o i n t e r t o t h e t a i l o f M i m e D e c E n t i t y l i s t ∗/ MimeDecEntity ∗mime = smtp_tx−>m s g _ t a i l ; if (mime == NULL) { return L u a C a l l b a c k E r r o r ( l u a s t a t e , " E r r o r : no mime e n t i t y found " ) ; } MimeDecField ∗ f i e l d = mime−>f i e l d _ l i s t ; if ( f i e l d == NULL) { return L u a C a l l b a c k E r r o r ( l u a s t a t e , " E r r o r : no f i e l d _ l i s t found " ) ; } /∗ C o u n t e r o f MIME f i e l d s found ∗/ int num = 1 ; /∗ l o o p t r o u g h t h e l i s t o f m i m e F i e l d s , p r i n t i n g e a c h name found ∗/ lua_newtable ( l u a s t a t e ) ; while ( f i e l d != NULL) { if ( f i e l d −>name != NULL) { l u a _ p u s h i n t e g e r ( l u a s t a t e , num++); L u a P u s h S t r i n g B u f f e r ( l u a s t a t e , f i e l d −>name , f i e l d −>name_len ) ; l u a _ s e t t a b l e ( l u a s t a t e ,−3) ; } f i e l d = f i e l d −>n e x t ; } return 1 ; } /∗∗ ∗ \ b r i e f L i s t s name and v a l u e t o a l l MIME f i e l d s which ∗ i s i n c l u d e d i n a SMTP t r a n s a c t i o n . 
∗ ∗ \param l u a s t a t e l u a s t a t e s t a c k t o pop and push a t t r i b u t e s f o r I /O t o l u a . ∗ ∗ \ r e t v a l 1 i f t h e t a b l e i s pushed t o l u a . ∗ R e t u r n s e r r o r i n t and msg pushed t o l u a s t a t e s t a c k i f e r r o r o c c u r s ∗ ∗/ static int SMTPGetMimeList ( l u a _ S t a t e ∗ l u a s t a t e ) { /∗ Check i f r i g h t p r o t o c o l ∗/ if ( ! ( LuaStateNeedProto ( l u a s t a t e , ALPROTO_SMTP) ) ) { return L u a C a l l b a c k E r r o r ( l u a s t a t e , " E r r o r : p r o t o c o l not SMTP" ) ; } /∗ mutex l o c k i n d i c a t o r v a r ∗/ int l o c k _ h i n t = 0 ; /∗ E x t r a c t network f l o w ∗/ Flow ∗ flow = LuaStateGetFlow ( l u a s t a t e , &l o c k _ h i n t ) ; if ( flow == NULL) { return L u a C a l l b a c k E r r o r ( l u a s t a t e , " E r r o r : no flow found " ) ; } /∗ c h e c k i f f l o w a l r e a d y l o c k e d ∗/ if ( l o c k _ h i n t == LUA_FLOW_NOT_LOCKED_BY_PARENT) { /∗ m u t e x l o c k f l o w ∗/ FLOWLOCK_RDLOCK( flow ) ; GetMimeList ( l u a s t a t e , flow ) ; FLOWLOCK_UNLOCK( flow ) ; } else { GetMimeList ( l u a s t a t e , flow ) ; } return 1 ; } /∗∗ ∗ \ b r i e f i n t e r n a l f u n c t i o n u s e d by SMTPGetMailFrom ∗ ∗ \param l u a s t a t e l u a s t a t e s t a c k t o pop and push a t t r i b u t e s f o r I /O t o l u a . ∗ \param f l o w f l o w t o g e t s t a t e f o r SMTP ∗ ∗ \ r e t v a l 1 i f mailfrom f i e l d found . 
∗ R e t r u n s e r r o r i n t and msg pushed t o l u a s t a t e s t a c k i f e r r o r o c c u r s ∗/ static int GetMailFrom ( l u a _ S t a t e ∗ l u a s t a t e , Flow ∗ flow ) { /∗ E x t r a c t SMTPstate from c u r r e n t f l o w ∗/ SMTPState ∗ s t a t e = ( SMTPState ∗) FlowGetAppState ( flow ) ; if ( s t a t e == NULL) { return L u a C a l l b a c k E r r o r ( l u a s t a t e , " I n t e r n a l E r r o r : no s t a t e " ) ; } SMTPTransaction ∗smtp_tx = s t a t e −>c u r r _ t x ; if ( smtp_tx == NULL) { 85 CASEC 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 return L u a C a l l b a c k E r r o r ( l u a s t a t e , " I n t e r n a l E r r o r : no SMTP t r a n s a c t i o n " ) ; } if ( smtp_tx−>mail_from == NULL || smtp_tx−>m a i l _ f r o m _ l e n == 0) { return L u a C a l l b a c k E r r o r ( l u a s t a t e , " MailFrom not found " ) ; } L u a P u s h S t r i n g B u f f e r ( l u a s t a t e , smtp_tx−>mail_from , smtp_tx−>m a i l _ f r o m _ l e n ) ; /∗ R e t u r n s 1 b e c a u s e we n e v e r push more t h e n 1 i t e m t o t h e l u a s t a c k ∗/ return 1 ; } /∗∗ ∗ \ b r i e f E x t r a c t s mail_fr om p a r a m e t e r from SMTPState . ∗ A t t r i b u t e may a l s o be a v a i l a b l e from m i m e f i e l d s , a l t h o u g h t h e r e i s no ∗ g u a r a n t e e o f i t e x i s t i n g a s mime . ∗ ∗ \param l u a s t a t e l u a s t a t e s t a c k t o pop and push a t t r i b u t e s f o r I /O t o l u a . ∗ ∗ \ r e t v a l 1 i f mailfrom f i e l d found . 
∗ R e t r u n s e r r o r i n t and msg pushed t o l u a s t a t e s t a c k i f e r r o r o c c u r s ∗/ static int SMTPGetMailFrom ( l u a _ S t a t e ∗ l u a s t a t e ) { /∗ c h e c k p r o t o c o l ∗/ if ( ! ( LuaStateNeedProto ( l u a s t a t e , ALPROTO_SMTP) ) ) { return L u a C a l l b a c k E r r o r ( l u a s t a t e , " E r r o r : p r o t o c o l not SMTP" ) ; } /∗ u s e l o c k _ h i n t t o c h e c k f o r m u t e x l o c k on f l o w ∗/ int l o c k _ h i n t = 0 ; /∗ E x t r a c t f l o w , w i t h l o c k h i n t t o c h e c k m u t e x l o c k i n g ∗/ Flow ∗ flow = LuaStateGetFlow ( l u a s t a t e , &l o c k _ h i n t ) ; if ( flow == NULL) { return L u a C a l l b a c k E r r o r ( l u a s t a t e , " I n t e r n a l E r r o r : no flow " ) ; } /∗ c h e c k i f a l r e a d y m u t e x l o c k e d by p a r e n t s ∗/ if ( l o c k _ h i n t == LUA_FLOW_NOT_LOCKED_BY_PARENT) { /∗ m u t e x l o c k f l o w ∗/ FLOWLOCK_RDLOCK( flow ) ; GetMailFrom ( l u a s t a t e , flow ) ; FLOWLOCK_UNLOCK( flow ) ; } else { GetMailFrom ( l u a s t a t e , flow ) ; } return 1 ; } /∗∗ ∗ \ b r i e f i n t e r n f u n c t i o n u s e d by SM T P Ge t Rc p L is t ∗ ∗ \ params l u a s t a t e l u a s t a t e s t a c k f o r i n t e r n a l communication w i t h Lua . ∗ Used t o hand o v e r data t o t h e r e c i e v e i n g l u a s c r i p t . ∗ ∗ \ r e t v a l 1 i f t h e t a b l e i s pushed t o l u a . 
∗ R e t u r n s e r r o r i n t and msg pushed t o l u a s t a t e s t a c k i f e r r o r o c c u r s ∗/ static int G e t R c p t L i s t ( l u a _ S t a t e ∗ l u a s t a t e , Flow ∗ flow ) { SMTPState ∗ s t a t e = ( SMTPState ∗) FlowGetAppState ( flow ) ; if ( s t a t e == NULL) { return L u a C a l l b a c k E r r o r ( l u a s t a t e , " I n t e r n a l e r r o r , no s t a t e " ) ; } SMTPTransaction ∗smtp_tx = s t a t e −>c u r r _ t x ; if ( smtp_tx == NULL) { return L u a C a l l b a c k E r r o r ( l u a s t a t e , " No more tx , or t x not found " ) ; } /∗ C r e a t e a new t a b l e i n l u a s t a t e f o r r c p t l i s t ∗/ lua_newtable ( l u a s t a t e ) ; /∗ r c p t v a r f o r i t e r a t o r ∗/ int u = 1 ; SMTPString ∗ r c p t ; TAILQ_FOREACH( r c p t , &smtp_tx−>r c p t _ t o _ l i s t , n e x t ) { L u a P u s h S t r i n g B u f f e r ( l u a s t a t e , r c p t −>s t r , r c p t −>l e n ) ; l u a _ p u s h i n t e g e r ( l u a s t a t e , u++); l u a _ s e t t a b l e ( l u a s t a t e , −3) ; } 86 CASEC 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 /∗ r e t u r n 1 s i n c e we a l l w a y s push one t a b l e t o l u a s t a t e ∗/ return 1 ; } /∗∗ ∗ \ b r i e f f u n c t i o n l o o p s t h r o u g h r c p t −l i s t l o c a t e d i n ∗ f l o w−>SMTPState−>SMTPTransaction , a d d i n g a l l i t e m s t o a t a b l e . ∗ Then p u s h i n g i t t o t h e l u a s t a t e s t a c k . ∗ ∗ \ params l u a s t a t e l u a s t a t e s t a c k f o r i n t e r n a l communication w i t h Lua . ∗ Used t o hand o v e r data t o t h e r e c i e v e i n g l u a s c r i p t . ∗ ∗ \ r e t v a l 1 i f t h e t a b l e i s pushed t o l u a . 
∗ R e t u r n s e r r o r i n t and msg pushed t o l u a s t a t e s t a c k i f e r r o r o c c u r s ∗/ static int SMTPGetRcptList ( l u a _ S t a t e ∗ l u a s t a t e ) { /∗ c h e c k p r o t o c o l ∗/ if ( ! ( LuaStateNeedProto ( l u a s t a t e , ALPROTO_SMTP) ) ) { return L u a C a l l b a c k E r r o r ( l u a s t a t e , " E r r o r : p r o t o c o l not SMTP" ) ; } /∗ c r e a t e l o c k h i n t v a r f o r f l o w l o c k c h e c k . ∗/ int l o c k _ h i n t = 0 ; /∗ E x t r a c t f l o w , w i t h l o c k h i n t t o c h e c k m u t e x l o c k i n g ∗/ Flow ∗ flow = LuaStateGetFlow ( l u a s t a t e , &l o c k _ h i n t ) ; if ( flow == NULL) { return L u a C a l l b a c k E r r o r ( l u a s t a t e , " I n t e r n a l e r r o r : no flow " ) ; } /∗ c h e c k i f a l r e a d y m u t e x l o c k e d by p a r e n t s ∗/ if ( l o c k _ h i n t == LUA_FLOW_NOT_LOCKED_BY_PARENT) { /∗ l o c k f l o w ∗/ FLOWLOCK_RDLOCK( flow ) ; G e t R c p t L i s t ( l u a s t a t e , flow ) ; /∗ open f l o w ∗/ FLOWLOCK_UNLOCK( flow ) ; } else { G e t R c p t L i s t ( l u a s t a t e , flow ) ; } /∗ r e t u r n 1 s i n c e we a l l w a y s push one t a b l e t o l u a s t a t e ∗/ return 1 ; } /∗∗ ∗ \ b r i e f i n t e r n f u n c t i o n u s e d by SMTPGetAttachmentInfo ∗ ∗ \ params l u a s t a t e , l u a s t a t e f o r i n t e r n a l communication t o w a r d s t h e ∗ l u a s c r i p t i n g engine . ∗ ∗ \ r e t v a l 1 i f t h e t a b l e i s pushed t o l u a . 
 * Returns error int and msg pushed to luastate stack if error occurs
 */
static int GetAttachmentInfo(lua_State *luastate, Flow *flow)
{
    /* Extract SMTPState from flow */
    SMTPState *state = (SMTPState *) FlowGetAppState(flow);
    if (state == NULL) {
        return LuaCallbackError(luastate, "error: state not found");
    }

    /* get FileContainer in SMTPState */
    FileContainer *file_con = state->files_ts;
    if (file_con == NULL) {
        return LuaCallbackError(luastate, "error: no files found");
    }

    /* point to start of list for iterating through */
    File *file = file_con->head;
    if (file == NULL) {
        return LuaCallbackError(luastate, "error: no file(s) in container");
    }

    int u = 1;
    /* create new table for placement of findings */
    lua_newtable(luastate);

    /* loop through and push filename to luastate table on stack */
    while (file != NULL) {
        lua_pushinteger(luastate, u++);
        lua_newtable(luastate);
        lua_pushstring(luastate, "filename");
        LuaPushStringBuffer(luastate, file->name, file->name_len);
        lua_settable(luastate, -3);

        /* creating for loop temp vars */
#ifdef HAVE_NSS
        char smd5[256];
        int i;
        size_t x;
        /* loops through md5 int array, ports it to char */
        for (i = 0, x = 0; x < sizeof(file->md5); x++) {
            i += snprintf(&smd5[i], 255-i, "%02x", file->md5[x]);
        }
        /* push md5 char array to luastate stack */
        lua_pushstring(luastate, "md5-field");
        lua_pushstring(luastate, smd5);
        lua_settable(luastate, -3);
        /* set self to next in list */
#endif
        lua_settable(luastate, -3);
        file = file->next;
    }
    /* return 1 since we always push one table to luastate */
    return 1;
}

/**
 * \brief Function grabs possible list of file-structs residing inside
 * flow->SMTPState->FileContainer then loops through this list, pushing a table for
 * each entity, containing the filename and MD5 checksum.
 *
 * \param luastate luastate for internal communication towards the
 * lua scripting engine.
 *
 * \retval 1 if the table is pushed to lua.
 * Returns error int and msg pushed to luastate stack if error occurs
 */
static int SMTPGetAttachmentInfo(lua_State *luastate)
{
    /* check protocol */
    if (!(LuaStateNeedProto(luastate, ALPROTO_SMTP))) {
        return LuaCallbackError(luastate, "Error: protocol not SMTP");
    }

    /* create lock hint var for flow lock check.
     * rcpt var for iterator */
    int lock_hint = 0;

    /* Extract flow with luastate */
    Flow *flow = LuaStateGetFlow(luastate, &lock_hint);
    if (flow == NULL) {
        return LuaCallbackError(luastate, "internal error: no flow");
    }

    /* check if flow already mutexlocked */
    if (lock_hint == LUA_FLOW_NOT_LOCKED_BY_PARENT) {
        /* mutexlock flow */
        FLOWLOCK_RDLOCK(flow);
        GetAttachmentInfo(luastate, flow);
        FLOWLOCK_UNLOCK(flow);
    } else {
        GetAttachmentInfo(luastate, flow);
    }

    return 1;
}

int LuaRegisterSmtpFunctions(lua_State *luastate)
{
    lua_pushcfunction(luastate, SMTPGetMailFrom);
    lua_setglobal(luastate, "SMTPGetMailFrom");

    lua_pushcfunction(luastate, SMTPGetRcptList);
    lua_setglobal(luastate, "SMTPGetRcptList");

    lua_pushcfunction(luastate, SMTPGetMimeList);
    lua_setglobal(luastate, "SMTPGetMimeList");

    lua_pushcfunction(luastate, SMTPGetMimeField);
    lua_setglobal(luastate, "SMTPGetMimeField");

    lua_pushcfunction(luastate, SMTPGetAttachmentInfo);
    lua_setglobal(luastate, "SMTPGetAttachmentInfo");

    /* all functions that needs be reachable from lua have to be pushed and
     * set globally here.
     * ex:
     * lua_pushcfunction(luastate, SmtpGetSmtpState);
     * lua_setglobal(luastate, "SmtpGetSmtpState");
     */
    return 0;
}

#endif /* HAVE_LUA */

B Pull Request Feedback

Code Section One

    if (smtp_tx == NULL) {
        return LuaCallbackError(luastate, "No more tx, or tx not found");
    }
    /* Create a new table in luastate for rcpt list */
    lua_newtable(luastate);
    /* rcpt var for iterator */
    int u = 1;
    SMTPString *rcpt;
    TAILQ_FOREACH(rcpt, &smtp_tx->rcpt_to_list, next) {
        LuaPushStringBuffer(luastate, rcpt->str, rcpt->len);
        lua_pushinteger(luastate, u++);
        lua_settable(luastate, -3);
    }
    /* Returns 1 because we never push more then 1 item to the lua stack */

“Feedback 1: this seems incorrect, the code above walks a list and could add more than one item to the stack”

“Feedback 2: after discussing in IRC it does seem this is correct. The comment wording confused me. It would be good to state there that we return 1 because we push one table to the stack.”

This is one instance where Julian misunderstood our code because of a comment in it. Every function is supposed to return the number of items pushed to the Lua stack. We pushed one table to the stack and looped through a list, filling the table with a variable amount of data. The function was hard-coded to return 1 no matter what, and this, paired with the unfortunate wording of the comment on the return statement, caused the confusion. We talked with Julian on IRC and he agreed that we should return 1, as we only return one table.
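The convention at the centre of this exchange, that a Lua C function returns the number of values it pushed and not the number of entries inside a pushed table, can be sketched without the Lua library. The block below is only an illustration under stated assumptions: `ToyStack`, `ToyTable`, `Rcpt` and `get_rcpt_list` are hypothetical stand-ins and not part of Suricata or Lua, and `<sys/queue.h>` is assumed to be available as it is on glibc and the BSDs.

```c
#include <stddef.h>
#include <sys/queue.h>

#define TOY_TABLE_MAX 8

/* Hypothetical recipient node, loosely modelled on SMTPString. */
typedef struct Rcpt_ {
    const char *str;
    TAILQ_ENTRY(Rcpt_) next;
} Rcpt;

TAILQ_HEAD(RcptList, Rcpt_);

/* Toy stand-in for a Lua table: a 1-based array of strings. */
typedef struct {
    const char *entries[TOY_TABLE_MAX + 1]; /* slot 0 unused, as in Lua arrays */
    int count;
} ToyTable;

/* Toy stand-in for the Lua stack: it can hold a single table. */
typedef struct {
    ToyTable table;
    int top; /* number of values currently on the stack */
} ToyStack;

/*
 * Returns the number of values pushed onto the stack. The recipients are
 * stored inside the one table, so the result is 1 however long the list is.
 */
static int get_rcpt_list(ToyStack *stack, struct RcptList *list)
{
    Rcpt *rcpt;
    int u = 1;

    stack->table.count = 0;
    stack->top++; /* plays the role of lua_newtable(): one value pushed */

    TAILQ_FOREACH(rcpt, list, next) {
        if (u > TOY_TABLE_MAX)
            break; /* the toy table has a fixed capacity */
        stack->table.entries[u++] = rcpt->str; /* next free index = recipient */
        stack->table.count++;
    }
    return 1;
}
```

With three recipients in the list, the stack grows by exactly one value while the table holds three entries, which is the distinction the reviewer asked to have spelled out in the comment.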
Code Section Two

        int i;
        size_t x;
        /* loops through md5 int array, ports it to char */
        for (i = 0, x = 0; x < sizeof(file->md5); x++) {
            i += snprintf(&smd5[i], 255-i, "%02x", file->md5[x]);
        }
        /* push md5 char array to luastate stack */
        lua_pushstring(luastate, "md5-field");
        lua_pushstring(luastate, smd5);
        lua_settable(luastate, -3);
        /* set self to next in list */
#endif
        lua_settable(luastate, -3);
        file = file->next;
    }
    return 1;

“Feedback 1: This shouldn’t we return more than 1 if there are more files? ”

“Feedback 2: dito”

Here we have exactly the same case as in Section One, except in another function. We came to the same agreement as well: the code was fine. The “dito”, or ditto, refers to the reviewer's second reply in Section One.

Code Section Three

    LuaRegisterDnsFunctions(lua_state);
    LuaRegisterTlsFunctions(lua_state);
    LuaRegisterSshFunctions(lua_state);
    LuaRegisterSmtpFunctions(lua_state);

“indent looks off here”

This was good feedback and relates to an indentation error during development, where one line was indented too much. C does not care about indentation, so it would not affect the compiled code, but this shows that the developers enforce their coding standard.

Code Section Four

#include "util-file.h"

/*
 * \brief Function extracts MimeDecField from
 * flow->SMTPState->SMTPTransaction->MimeDecEntity->MimeDecField
 * based on parameter name, set in previous, spesified function.
 *
 * \param luastate luastate stack to use and push attributes to
 * \param flow network flow of SMTP packets
 * \param name name of the attribute to extract from MimeDecField
 *
 * \retval returns number of attributes pushed to luastate stack,
 * or error int + error msg to stack
 */

static int GetMimeDecField(lua_State *luastate, Flow *flow, const char *name) {

“please put bracket on a new line. See https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Coding_Style”

This is another coding standard issue and a slip-up during development. A consistent code structure throughout the source makes the code easier to read and maintain.

Code Section Five

        return LuaCallbackError(luastate, "Transaction ending or not found");
    }
    /* pointer to tail of msg list of MimeDecEntitys in current transaction. */
    MimeDecEntity *mime = smtp_tx->msg_tail;
    /* check if msg_tail was hit */
    if (mime == NULL) {
        return LuaCallbackError(luastate, "Internal error: no fields in transaction");
    }
    /* extract MIME field based on spesific field name. */
    MimeDecField *field = MimeDecFindField(mime, name);
    /* check MIME field */
    if (field == NULL) {
        return LuaCallbackError(luastate, "Error: mimefield not found");
    }
    /* return extracted field. */
    if (!(strlen((const char *) field->value) == field->value_len)) {

“I’m not sure it’s safe to use strlen (which expects a nul-terminated string) on this field. In other places (e.g.
output-json-email-common.c I see that BytesToString is used first.)”

This is really good feedback, as this could have been a potential security issue. strlen was not on the list of deprecated functions, and we did not consider the possibility that the terminating NUL byte could be missing. Situations like these highlight the value of experienced programmers.

It should be noted that our API documentation, found in Section 4.5.4, was supplied together with the pull request. We were recommended to add it to the wiki page for our code after it has been merged into the project, and to create a new pull request with the incorporated changes [80]. We incorporated all the feedback and sent another pull request. The iterative feedback loop between us and the developers was a classic example of open source development.

Our second pull request gathered a second round of feedback from Julian. We got comments on another three sections of code.

Code Section One

    if (field == NULL) {
        return LuaCallbackError(luastate, "Error: no field_list found");
    }
    /* Counter of MIME fields found */
    int num = 1;
    /* loop trough the list of mimeFields, printing each name found */
    lua_newtable(luastate);
    while (field != NULL) {
        if (field->name != NULL) {
            lua_pushinteger(luastate, num++);
            LuaPushStringBuffer(luastate, field->name, field->name_len);
            lua_settable(luastate, -3);
        }
        field = field->next;
    }
    return 1;

“is this correct if we just had one ’field’ and it’s name was NULL? In this case we’d not push anything to the stack.”

This is another comment about return values. We return 1 no matter what at one point in the code, but there is one potential flow in the program where no value is pushed to the Lua stack.
In that case we should have returned 0. This is, however, an easy fix.

Code Section Two

    /* Extract SMTPstate from current flow */
    SMTPState *state = (SMTPState *) FlowGetAppState(flow);
    if (state == NULL) {
        return LuaCallbackError(luastate, "Internal Error: no state");
    }
    SMTPTransaction *smtp_tx = state->curr_tx;
    if (smtp_tx == NULL) {
        return LuaCallbackError(luastate, "Internal Error: no SMTP transaction");
    }
    if (smtp_tx->mail_from == NULL || smtp_tx->mail_from_len == 0) {
        return LuaCallbackError(luastate, "MailFrom not found");
    }
    LuaPushStringBuffer(luastate, smtp_tx->mail_from, smtp_tx->mail_from_len);
    /* Returns 1 because we never push more then 1 item to the lua stack */
    return 1;

“we can probably have ’return LuaPushStringBuffer(luastate, smtp_tx->mail_from, smtp_tx->mail_from_len);’ instead?”

This comment is about returning the value of the function call directly, instead of calling the function and then returning a hard-coded 1. This saves one line of source code and should not change the behaviour. Returning function calls directly is a matter of coding style, but it was not listed in the coding standard. Changing it is not an issue, though, and will help with the consistency of the code.
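The reviewer's suggestion can be sketched in isolation. The block below is a minimal illustration, not Suricata code: `ToyStringStack`, `push_string_buffer`, `callback_error` and `get_mail_from` are hypothetical names, and the assumption that the error helper accounts for two pushed values (a nil and a message) is ours. The push helper copies exactly `input_len` bytes, so it never depends on a NUL terminator, and the caller forwards the helper's count rather than hard-coding it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the Lua stack: one slot holding a byte string. */
typedef struct {
    uint8_t bytes[64];
    size_t len;
    int top; /* number of values pushed so far */
} ToyStringStack;

/* Plays the role of LuaPushStringBuffer: copies exactly input_len bytes and
 * reports how many values were pushed (always 1 in this sketch). */
static int push_string_buffer(ToyStringStack *s, const uint8_t *input,
                              size_t input_len)
{
    if (input_len > sizeof(s->bytes))
        input_len = sizeof(s->bytes); /* the toy slot has a fixed size */
    memcpy(s->bytes, input, input_len);
    s->len = input_len;
    s->top++;
    return 1;
}

/* Plays the role of LuaCallbackError; assumed here to push two values. */
static int callback_error(ToyStringStack *s)
{
    s->top += 2;
    return 2;
}

static int get_mail_from(ToyStringStack *s, const uint8_t *mail_from,
                         size_t mail_from_len)
{
    if (mail_from == NULL || mail_from_len == 0)
        return callback_error(s);
    /* Forward the helper's count instead of a separate "return 1". */
    return push_string_buffer(s, mail_from, mail_from_len);
}
```

Whichever path is taken, the value returned to the caller then matches the number of values that were actually placed on the stack.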
Code Section Three

    }
    /* get FileContainer in SMTPState */
    FileContainer *file_con = state->files_ts;
    if (file_con == NULL) {
        return LuaCallbackError(luastate, "error: no files found");
    }
    /* point to start of list for iterating trough */
    File *file = file_con->head;
    if (file == NULL) {
        return LuaCallbackError(luastate, "error: no file(s) in container");
    }
    int u = 1;
    /* create new table for placement of findings */
    lua_newtable(luastate);
    /* loop through and push filename to luastate table on stack */
    while (file != NULL) {

“files are stored per smtp state, not per ’transaction’. Here simply all files are returned. I think it would make more sense to act only on the files belong to the current tx?”

This is true, and it is a logic slip-up made during our development. The filenames that belong to each transaction are located elsewhere in the code, but extracting them requires reworking a larger part of the function.

C Suricata Source Code Research

C.1 Research util-lua-http

C.1.1 Introduction

This file contains functions for retrieving specific HTTP header information, the HTTP body, or the entire header. All of this information is reached through the lua_State pointer, a required parameter of every function. The LibHTP HTTP parser library is used heavily to retrieve the data. Everything that is retrieved is pushed back onto luastate, and the return value of each function mostly reflects whether that push succeeded. Luastate contains the traffic. All information is retrieved and passed on with LuaDoXYZ(luastate, orLuastateCastedToSomethingElse) functions.
tx = htp_tx_t content

C.1.2 Functions

static int HttpGetRequestHost(lua_State *luastate)
    Checks if the protocol is http
    Retrieves and checks tx from luastate
    Checks that the hostname exists
    Hostname and hostname length is pushed to a string buffer in luastate
        - as the type bstr_ptr and bstr_len
    Host = en.wikipedia.org
    The domain name of the server (for virtual hosting), and the TCP port number on which
    the server is listening. The port number may be omitted if the port is the standard
    port for the service requested [1].

static int HttpGetRequestUriRaw(lua_State *luastate)
    Checks if the protocol is http
    Retrieves and checks tx from luastate
    Checks that the URI exists
    URI and URI length is pushed to a string buffer in luastate
        - as the type bstr_ptr and bstr_len
    RequestURI:
    Request-URI = "*" | absoluteURI | abs_path | authority
    GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.1

static int HttpGetRequestUriNormalized(lua_State *luastate)
    Checks if the protocol is http
    Retrieves and checks tx from luastate
    Retrieves and checks HtpTxUserData from tx
    Checks that the URI exists
    URI and URI length is pushed to a string buffer in luastate
    RequestURI:
    Request-URI = "*" | absoluteURI | abs_path | authority
    GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.1
    Normalized URI: http://w3.org/pub/www/theproject.html

static int HttpGetRequestLine(lua_State *luastate)
    Checks if the protocol is http
    Retrieves and checks tx from luastate
    Checks that the request line exists
    Request line and request line length is pushed to a string buffer in luastate
        - as the type bstr_ptr and bstr_len
    The Request-Line begins with a method token, followed by the Request-URI and the
    protocol version, and ending with CRLF. The elements are separated by SP characters.
    No CR or LF is allowed except in the final CRLF sequence [2].
    Request-Line = Method SP Request-URI SP HTTP-Version CRLF
    Method = GET/POST/etc
    Request-Line = Method + URI

static int HttpGetResponseLine(lua_State *luastate)
    Checks if the protocol is http
    Retrieves and checks tx from luastate
    Checks that the response-line exists
    Response-line and response-line length is pushed to a string buffer in luastate
        - as the type bstr_ptr and bstr_len
    Response-line = Response Status-Line?
    The first line of a Response message is the Status-Line, consisting of the protocol
    version followed by a numeric status code and its associated textual phrase, with each
    element separated by SP characters. No CR or LF is allowed except in the final CRLF
    sequence [3].
    Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF

static int HttpGetHeader(lua_State *luastate, int dir)
    Checks if the protocol is http
    Retrieves and checks tx from luastate
    Checks if the "first argument" in luastate exists
    Gets the response header if dir was 1, else it gets the request header
        - as the type bstr_ptr and bstr_len
    bstr_ptr and bstr_len of the value of the header is pushed to luastate
    Does a lot of htp_x_y casting of elements in luastate to get to the header.

static int HttpGetRequestHeader(lua_State *luastate)
    Calls HttpGetHeader(lua_State *luastate, int dir)
        - with dir as 0

static int HttpGetResponseHeader(lua_State *luastate)
    Calls HttpGetHeader(lua_State *luastate, int dir)
        - with dir as 1

static int HttpGetRawHeaders(lua_State *luastate, int dir)
    Checks if the protocol is http
    Retrieves and checks tx from luastate
    Retrieves and checks HtpTxUserData from tx
    Retrieves request header and request header length from HtpTxUserData
    Gets response instead if dir is set to 1
    Header and header length is pushed to luastate
    LuaPushStringBuffer(luastate, uint8_t *, uint32_t), has no casting to bstr_len/ptr in
    this function.

static int HttpGetRawRequestHeaders(lua_State *luastate)
    Calls HttpGetRawHeaders(lua_State *luastate, int dir)
        - with dir as 0

static int HttpGetRawResponseHeaders(lua_State *luastate)
    Calls HttpGetRawHeaders(lua_State *luastate, int dir)
        - with dir as 1

static int HttpGetHeaders(lua_State *luastate, int dir)
    Checks if the protocol is http
    Retrieves and checks tx from luastate
    Gets request header from tx->request_headers
    Gets response instead if dir is set to 1
    Header and header length is pushed to luastate
    Resets luastate with lua_newtable(luastate) (?)
    Loops through headers and puts them into luastate
    Getting raw headers is more complicated

static int HttpGetRequestHeaders(lua_State *luastate)
    Calls HttpGetHeaders(lua_State *luastate, int dir)
        - with dir as 0

static int HttpGetResponseHeaders(lua_State *luastate)
    Calls HttpGetHeaders(lua_State *luastate, int dir)
        - with dir as 1

static int HttpGetBody(lua_State *luastate, int dir)
    Checks if the protocol is http
    Retrieves and checks tx from luastate
    Retrieves and checks HtpTxUserData from tx
    Gets request_body from HtpTxUserData if dir was 0
    Gets response_body if dir was not
    luastate gets reset (?)
    Loops through the body pushing body chunks to luastate
    Finally it checks if (body->first && body->last)
    Then it pushes body offsets to luastate
    Last operation is a bit vague

static int HttpGetRequestBody(lua_State *luastate)
    Calls HttpGetBody(lua_State *luastate, int dir)
        - with dir as 0

static int HttpGetResponseBody(lua_State *luastate)
    Calls HttpGetBody(lua_State *luastate, int dir)
        - with dir as 1

int LuaRegisterHttpFunctions(lua_State *luastate)
    Does one lua_pushcfunction(luastate, HttpGetRequestHeader)
        - for each function mentioned above.
    Does one lua_setglobal(luastate, "HttpGetRequestHeader") for each as well
    returns 0
    Allows for the functions to be called through luastate?

[1] hxxps://en.wikipedia.org/wiki/List_of_HTTP_header_fields
[2] hxxps://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html
[3] hxxps://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html

C.2 Research util-lua

C.2.1 Introduction

Declares 9 strings, or keys, for different pointers. The pointers are tv, tx, p, flow, packet alert, file pointer and streaming buffer, in addition to a key for the flow lock hint bool and a key for direction.

C.2.2 Functions

The functions:

ThreadVars *LuaStateGetThreadVars(lua_State *luastate)
Packet *LuaStateGetPacket(lua_State *luastate)
void *LuaStateGetTX(lua_State *luastate)
PacketAlert *LuaStateGetPacketAlert(lua_State *luastate)
File *LuaStateGetFile(lua_State *luastate)
LuaStreamingBuffer *LuaStateGetStreamingBuffer(lua_State *luastate)
int LuaStateGetDirection(lua_State *luastate)

The seven getters above all use the functions “lua_pushlightuserdata” and “lua_gettable”.
In addition:

void LuaStateSetThreadVars(lua_State *luastate, ThreadVars *tv)
void LuaStateSetPacket(lua_State *luastate, Packet *p)
void LuaStateSetTX(lua_State *luastate, void *txptr)
void LuaStateSetPacketAlert(lua_State *luastate, PacketAlert *pa)
void LuaStateSetFile(lua_State *luastate, File *file)
void LuaStateSetStreamingBuffer(lua_State *luastate, LuaStreamingBuffer *b)
void LuaStateSetDirection(lua_State *luastate, int direction)

C.2.3 Description

The seven setters above all use two functions, “lua_pushlightuserdata” and “lua_settable”, where “lua_pushlightuserdata” is called twice before “lua_settable” is called. The two calls pass a key and the pointer that is to be set. Finally the last function is called, with luastate and LUA_REGISTRYINDEX as arguments.

One pair of functions stands out:

Flow *LuaStateGetFlow(lua_State *luastate, int *lock_hint)
void LuaStateSetFlow(lua_State *luastate, Flow *f, int need_flow_lock)

In GetFlow a flow pointer is created and set to NULL. It uses “lua_pushlightuserdata” in the same way as the other getters, except that what is returned is stored as a Flow pointer and not a void pointer. In the same function the flow lock hint is fetched with “lua_pushlightuserdata” in the same way as in the other getters, except that the function that returns the value is “lua_toboolean(luastate, -1)”, whose return value is stored through the pointer “lock_hint”; nothing more is done with this variable in the function after it has been stored. The flow pointer is then returned. The function takes one extra parameter compared to the other getters.

In SetFlow the same functions are used as in the other setters. The only differences are the keys that are used (as in the others) and that two variables are set: flow and the flow lock status hint. The function therefore naturally takes one extra parameter compared to the other setters.
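The key-based getter and setter pattern described above can be modelled in plain C. This is only a sketch under stated assumptions: `ToyRegistry`, `reg_set`, `reg_get`, `ToyFlow`, `state_set_flow` and `state_get_flow` are hypothetical stand-ins for the Lua registry and the Suricata helpers, and the real code goes through the Lua API. What it shows is the idea of using the address of a static object as a unique key, and why the flow pair needs one extra parameter: it stores and reads two values.

```c
#include <stddef.h>
#include <stdint.h>

#define REG_SLOTS 8

/* Toy stand-in for the Lua registry: raw pointer keys and values. */
typedef struct {
    const void *keys[REG_SLOTS];
    void *vals[REG_SLOTS];
    size_t used;
} ToyRegistry;

/* Only the addresses of these objects matter; each one is a unique key. */
static const char key_flow[] = "toy:flow:ptr";
static const char key_lock_hint[] = "toy:flow:lock_hint";

/* Store val under key, overwriting an existing entry. Returns -1 if full. */
static int reg_set(ToyRegistry *r, const void *key, void *val)
{
    for (size_t i = 0; i < r->used; i++) {
        if (r->keys[i] == key) {
            r->vals[i] = val;
            return 0;
        }
    }
    if (r->used == REG_SLOTS)
        return -1;
    r->keys[r->used] = key;
    r->vals[r->used] = val;
    r->used++;
    return 0;
}

/* Look a key up by address; returns NULL when it was never set. */
static void *reg_get(const ToyRegistry *r, const void *key)
{
    for (size_t i = 0; i < r->used; i++) {
        if (r->keys[i] == key)
            return r->vals[i];
    }
    return NULL;
}

typedef struct {
    int id;
} ToyFlow;

/* Setter with one extra parameter: it stores the flow and the lock hint. */
static void state_set_flow(ToyRegistry *r, ToyFlow *f, int need_flow_lock)
{
    (void)reg_set(r, key_flow, f);
    (void)reg_set(r, key_lock_hint, (void *)(intptr_t)(need_flow_lock != 0));
}

/* Getter with one extra parameter: the hint is handed back via lock_hint. */
static ToyFlow *state_get_flow(const ToyRegistry *r, int *lock_hint)
{
    ToyFlow *f = reg_get(r, key_flow);
    *lock_hint = ((intptr_t)reg_get(r, key_lock_hint) != 0);
    return f;
}
```

A caller sets the flow once together with the hint and later retrieves both in a single call, which mirrors how the wrappers in the SMTP code obtain the flow and then decide whether to take the lock.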
At the end of the file we find the functions:

void LuaPrintStack(lua_State *state)
int LuaPushStringBuffer(lua_State *luastate, const uint8_t *input, size_t input_len)

void LuaPrintStack(lua_State *state) receives a luastate and reads the size of the Lua stack, which is an int. A for loop then runs up to that size, and inside the loop there is a switch on the type of the stack element at the current loop index i. This is done with the function lua_type, which returns an int. Size, level and type are printed. The type can be:

LUA_TFUNCTION
LUA_TBOOLEAN
LUA_TNUMBER
LUA_TSTRING
LUA_TTABLE

The default case of the switch prints "other". A string is printed depending on which type is found in luastate.

int LuaPushStringBuffer(lua_State *luastate, const uint8_t *input, size_t input_len) takes a luastate pointer, a pointer to unsigned bytes called input, and an unsigned length called input_len.

util-lua.h defines the struct LuaStreamingBuffer. It also has getters for some of the pointers we use in the .c file (Flow, PacketAlert, File and Direction). It declares all the functions.

C.3 Research util-lua-ssh.c

C.3.1 Functions

Function: static int GetServerProtoVersion(lua_State *luastate, const Flow *f)
Mainly fetches the protocol version. It does this using a state pointer obtained from the Flow that is passed into the function, which is checked for NULL. If the check passes, the state is cast to an SSH state pointer, which is then used to check the server's SSH version; an error is returned if no server protocol version exists. Finally the SSH state's protocol version and its length are pushed to the Lua string buffer through luastate, and a variable is returned.

Function: static int SshGetServerProtoVersion(lua_State *luastate)
Receives a lua state pointer and checks whether the protocol in question is SSH; if not, the function reports that the lua_State is not SSH. The next step creates a flow pointer, passing luastate as an argument together with the int lock_hint, which is 0.
If the flow is NULL an error is returned and a message is written out. The next if checks whether the Lua flow is locked by a parent process; if not, a FLOWLOCK is taken, and then GetServerProtoVersion is called. This call is made regardless of whether the parent holds the lock or not. Finally the result of GetServerProtoVersion is returned as an int.

Function: static int GetServerSoftwareVersion(lua_State *luastate, const Flow *f)
Receives luastate and creates a state pointer by passing a Flow pointer to FlowGetAppState. It then checks whether state is NULL. It finds the software version, pushes it to the Lua buffer and returns an int.

Function: static int SshGetServerSoftwareVersion(lua_State *luastate)
Uses the function above; the difference is that it receives a lua state instead of a flow. It checks whether the protocol of the luastate is SSH and whether the luastate's flow exists. It locks the critical section in order to check the server version with the function above. Possible error in the if/else??

The remaining functions operate in the same way, except that the code is tied to the client instead of the server. The code has the same traits, such as locking the critical section and checking for the flow, whether the protocol is SSH, and whether the app-layer state exists. The last function, int LuaRegisterSshFunctions(lua_State *luastate), registers SSH as a protocol in the luastate and pushes the functions to Lua.

util-lua-ssh.h
Declares one function: int LuaRegisterSshFunctions(lua_State *luastate);

C.4 Research util-lua-dns

C.4.1 Introduction

This file contains functions for retrieving specific information about DNS packets. The information about the specific packet is located within a lua_State pointer, which is passed to all functions as a parameter, e.g. DnsGetTxId(lua_State *luastate). From the luastate pointer a tx pointer is pulled using LuaStateGetTX(); this tx pointer is of type DNSTransaction struct, which is defined in app-layer-dns-common.c. All information retrieved is pushed back into luastate.
C.4.2 Functions

static int DnsGetDnsRrname(lua_State *luastate)
    Checks if the protocol is DNS.
    Creates DNSTransaction pointer named TX.
    Gets tx from luastate with LuaStateGetTX(luastate)
    Checks if tx exists.
    Sets a DNSQueryEntry *query = NULL
    Enters an internal foreach function passing the "query", the "query list" from TX,
    and some unused const or var named "next".
    creates a char pointer, gets a bytestream presumably of length query->len, then adds
    the int sizeof(DNSQueryEntry) - the int size of the entire dns query.
    It checks if the char pointer c contains anything, if so, places strlen of c into
    input_len variable, then checks if input_len is larger than 2 * query->len.
    if so it is considered an invalid length and returns lua callback error.
    else, it places the luastate, pointer c, and input_len into ret (int variable) through
    a LuaPushStringBuffer function. Then frees the c pointer.
    General purpose of this function: Push each query name of the query-list in the
    bytestream to the luastate.
    Main attribute from pckg: rrname

static int DnsGetTxid(lua_State *luastate)
    Checks if dns protocol in luastate.
    creates tx pointer of DNSTransaction through LuaStateGetTX(luastate)
    uses lua_pushinteger(luastate, tx->tx_id) to add the id of TX to luastate var.
    then returns 1 if success.
    General purpose of this function: Pushes the txId of the pckg to the luastate.
static int DnsGetRcode(lua_State *luastate)
    If the tx from luastate has its rcode variable set (non-zero), a char rcode[16] is created.
    Then the DNSCreateRcodeString function is called with tx->rcode, the rcode char array and sizeof(rcode).
    It then returns the int from LuaPushStringBuffer.

static int DnsGetRecursionDesired(lua_State *luastate)
    Checks if the protocol is DNS.
    Creates *tx from LuaStateGetTX(luastate).
    Checks if tx is NULL.
    Pushes a boolean from tx->recursion_desired to luastate.
    Returns the success state.

static int DnsGetQueryTable(lua_State *luastate)
    Checks the protocol.
    Gets the tx pointer from luastate.
    Checks tx.
    Initializes a 32-bit unsigned int u = 0.
    Calls lua_newtable(luastate).
    Creates a DNSQueryEntry pointer *query and sets it to NULL.
    TAILQ_FOREACH(query, &tx->query_list, next) is basically a foreach loop through the query list (potentially a lot of data).
    On each element in this list:
        Calls lua_pushinteger(luastate, u++); for the first element the int 0 is pushed to the state.
        Calls lua_newtable(luastate), which creates a new table for the entry.
        Creates a char array record[16].
        Calls DNSCreateTypeString(query->type, record, sizeof(record)).
        Then lua_pushstring is used to push both the hardcoded key "type" and the record char array onto the luastate.
        lua_settable(luastate, -3) stores the key/value pair just pushed in the table at stack index -3 and pops both.
        Creates a new local scope with an inner { }.
        Creates a char pointer c and writes the BytesToString result for the query to it.
        Saves the strlen of c.
        Checks if the length of c is longer than 2 * query->len.
        If so, the char pointer is freed and an error is returned.
        lua_pushstring(luastate, "rrname") is called to push the attribute name.
        Then lua_pushstring is called on the variable c, which contains the actual rrname.
        lua_settable(luastate, -3) is called.
        Frees the char pointer c.

static int DnsGetAnswerTable(lua_State *luastate)
    Checks the protocol.
    Creates the *tx pointer from LuaStateGetTX(luastate).
    Sets a new int "u" to 0.
    Calls lua_newtable(luastate).
    Creates DNSAnswerEntry *answer = NULL.
    Starts a foreach loop through the answers in tx->answer_list:
        Pushes the int u++ to the luastate.
        Creates another new table from luastate.
        Creates a char array record[16].
        DNSCreateTypeString(answer->type, record, sizeof(record)) is called.
        The above function probably takes the type from answer and writes it as a string into the record char array.
        Then pushes the hardcoded "type" to luastate together with the record char array.
        Does the same for the ttl attribute.
        Then defines a new scope inside the foreach loop for getting the rest of the attributes:
        rrname, addr, addr, addr. Most probably three different IP addresses.
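The notes above guess that DNSCreateTypeString turns the numeric record type into text inside a caller-supplied buffer. A helper of that shape might look as follows; this is a hypothetical sketch, not Suricata's implementation. The function name and the numeric fallback are assumptions, and only the type numbers themselves are standard DNS values.

```c
#include <stdint.h>
#include <stdio.h>

/* Writes a short text form of a DNS record type into out.
 * Unknown types fall back to their decimal number (an assumption
 * made for this sketch). */
void type_to_string(uint16_t type, char *out, size_t out_size)
{
    switch (type) {
        case 1:
            snprintf(out, out_size, "A");
            break;
        case 5:
            snprintf(out, out_size, "CNAME");
            break;
        case 15:
            snprintf(out, out_size, "MX");
            break;
        case 16:
            snprintf(out, out_size, "TXT");
            break;
        case 28:
            snprintf(out, out_size, "AAAA");
            break;
        default:
            snprintf(out, out_size, "%u", (unsigned)type);
            break;
    }
}
```

A 16-byte buffer such as the record[16] array mentioned above is large enough for every string this sketch can produce, and snprintf keeps the result NUL-terminated even if it were not.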
static int DnsGetAuthorityTable(lua_State *luastate)
    Checks if the protocol is DNS.
    Creates the *tx pointer.
    As with the answers, loops through a foreach and pushes the attributes type and ttl to luastate.
    Then opens another scope and fetches the rrname attribute through BytesToString().

int LuaRegisterDnsFunctions(lua_State *luastate)
    Inserts and defines all functions in this file so they can be used and called from luastate globally.
    Pushes these functions onto the luastate.

C.5 Research util-lua-common

C.5.1 Description and functions

This file contains functions commonly used by all or some protocol-specific util-lua files.

int LuaCallbackError(lua_State *luastate, const char *msg)
    Default error function. An error message is received and pushed to the luastate.

const char *LuaGetStringArgument(lua_State *luastate, int argc)
    Gets the argument as a string and returns it.
    Checks if argc is a string, returns NULL if not.
    Creates const char *str and assigns the string at argc to str.
    Returns NULL if str is not filled, or !strlen(str).
    Else, it returns the string str, ergo the argument passed.

void LuaPushTableKeyValueInt(lua_State *luastate, const char *key, int value)
    Only pushes key and value to the luastate. The value is pushed as a number.
    Then uses lua_settable(luastate, -3);

void LuaPushTableKeyValueString(lua_State *luastate, const char *key, const char *value)
    Pushes key to the luastate stack, then checks if value is NULL and pushes either "(null)" or the content of value to the luastate stack.
    lua_settable(luastate, -3) afterwards.

void LuaPushTableKeyValueArray(lua_State *luastate, const char *key, const uint8_t *value, size_t len)
    Pushes key.
    Runs LuaPushStringBuffer(luastate, value, len), which pushes value with length len to the luastate.
    lua_settable(luastate, -3) afterwards.

static int LuaCallbackStreamingBufferPushToStack(lua_State *luastate, const LuaStreamingBuffer *b)
    Uses lua_pushstring with b->data as char *, ergo fills the Lua stack with the payload.
    Pushes booleans using lua_pushboolean, adding b->flags & the output streaming open/close flags to the luastate stack.

static int LuaCallbackStreamingBuffer(lua_State *luastate)
    Creates const LuaStreamingBuffer *b = LuaStateGetStreamingBuffer(luastate);
    Returns the int of the above function.

static int LuaCallbackPacketPayloadPushToStackFromPacket(lua_State *luastate, const Packet *p)
    Receives a Packet struct pointer *p, whose packet payload gets pushed to the luastate stack by
    lua_pushlstring(luastate, (const char *)p->payload, p->payload_len);
    Then it returns 1 no matter what.
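The flag handling noted for LuaCallbackStreamingBufferPushToStack above, where two booleans are derived from one flags field, comes down to masking individual bits. A minimal sketch follows; the flag names and values are hypothetical, not Suricata's constants.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values chosen for this sketch only. */
#define STREAM_FLAG_OPEN  0x01u
#define STREAM_FLAG_CLOSE 0x02u

typedef struct StreamFlags {
    bool open;    /* this chunk opens the stream */
    bool close;   /* this chunk closes the stream */
} StreamFlags;

/* Masks each bit separately so both booleans are independent. */
StreamFlags decode_stream_flags(uint8_t flags)
{
    StreamFlags r;
    r.open = (flags & STREAM_FLAG_OPEN) != 0;
    r.close = (flags & STREAM_FLAG_CLOSE) != 0;
    return r;
}
```

A chunk that both opens and closes a stream sets both bits, which is why the two values are pushed separately rather than as a single state.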
static int LuaCallbackPacketPayload(lua_State *luastate)
    Creates Packet *p via LuaStateGetPacket(luastate).
    Returns the result of the function above, passing in the packet just retrieved from the luastate.

static int LuaCallbackTimeStringPushToStackFromPacket(lua_State *luastate, const Packet *p)
    Fills the luastate stack with header information, in this case the timestamp from the packet.
    Creates a char array timebuf[64].
    CreateTimeString(&p->ts (cast to timeval), timebuf, sizeof(timebuf));
    Pushes the timebuf string to the luastate.
    Returns 1 no matter what.

static int LuaCallbackPacketTimeString(lua_State *luastate)
    Creates the packet like the payload function does, then returns the result of the function above, passing in the packet just retrieved from the luastate.

static int LuaCallbackTimeStringPushToStackFromFlow(lua_State *luastate, const Flow *flow)
    Creates a timebuf array of 64 chars.
    Uses the CreateTimeString function with the flow's timestamp.
    Pushes the timebuf array onto the luastate stack again.
    Returns 1.

static int LuaCallbackFlowTimeString(lua_State *luastate)
    Initializes int locked = 0; this is used to check whether the flow should be locked or not.
    When creating the Flow struct pointer, locked is passed by reference to LuaStateGetFlow();
    within this, the locked int is determined by the luastate when getting the flow.
    If luastate index -1 is false or null, the locked var is set to 0; in all other cases the locked int is set to 1.
    Thereby entering the locking mechanism.
    When the mutex is locked, the above function is called in a race-condition "safe" environment.
    The function returns 1 or an error.

static int LuaCallbackTuplePushToStackFromPacket(lua_State *luastate, const Packet *p)
    Brief: fills the luastate stack with header info.
    Returns: the number of data items placed on the stack.
    Determines the IP version (4 or 6) from the packet, then pushes the ipver number to the luastate.
    Initializes source IP and destination IP as char arrays, then uses PrintInet functions to get the attributes srcip and dstip and place them in the respective char arrays:
    PrintInet(AF_INET, (const void *)GET_IPV4_SRC_ADDR_PTR(p), srcip, sizeof(srcip));
    If IPv6, the same call with a slightly different parameter structure and parameter function.
    Pushes srcip and dstip to the luastate stack.
    Checks the protocol of the packet: if TCP or UDP, pushes the packet's sp and dp (source port and destination port) to the luastate stack;
    else if ICMP or ICMPv6, pushes the packet's type and code.
    If none of the above, pushes the number 0 twice.

static int LuaCallbackTuple(lua_State *luastate)
    Creates const Packet *p from LuaStateGetPacket() in util-lua.c.
    If there is a packet, returns the result of the above function.
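The tuple logic above has two parts that are easy to isolate: turning a raw address into text, and choosing the last two values by protocol. The sketch below uses the POSIX inet_ntop call in place of Suricata's PrintInet. The helper names format_addr and tuple_tail are hypothetical, and the values go into a plain array rather than onto a Lua stack.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

/* Formats a raw IPv4 or IPv6 address as text.
 * ipver must be 4 or 6. Returns 0 on success, -1 on failure. */
int format_addr(int ipver, const void *addr, char *out, size_t out_size)
{
    int af;
    if (ipver == 4)
        af = AF_INET;
    else if (ipver == 6)
        af = AF_INET6;
    else
        return -1;
    return inet_ntop(af, addr, out, (socklen_t)out_size) != NULL ? 0 : -1;
}

/* Picks the last two tuple values: ports for TCP/UDP,
 * type and code for ICMP/ICMPv6, otherwise two zeros. */
void tuple_tail(uint8_t proto, uint16_t sp, uint16_t dp,
                uint8_t icmp_type, uint8_t icmp_code, int out[2])
{
    if (proto == IPPROTO_TCP || proto == IPPROTO_UDP) {
        out[0] = sp;
        out[1] = dp;
    } else if (proto == IPPROTO_ICMP || proto == IPPROTO_ICMPV6) {
        out[0] = icmp_type;
        out[1] = icmp_code;
    } else {
        out[0] = 0;
        out[1] = 0;
    }
}
```

Pushing two zeros for other protocols keeps the number of returned values constant, so a script can always unpack the same number of fields.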
static int LuaCallbackTuplePushToStackFromFlow(lua_State *luastate, const Flow *f)
    Basically the same as LuaCallbackTuplePushToStackFromPacket(), only that this function works with a Flow struct pointer and not a Packet struct pointer.

static int LuaCallbackTupleFlow(lua_State *luastate)
    Almost identical to LuaCallbackFlowTimeString().
    Uses locking mechanisms when getting information on the Flow pointer and placing this information on the luastate stack.
    Returns the result of LuaCallbackTuplePushToStackFromFlow();

static int LuaCallbackAppLayerProtoPushToStackFromFlow(lua_State *luastate, const Flow *f)
    Creates a const string pointer to AppProtoToString(f->alproto).
    Checks if the string is NULL; in that case, sets it to "unknown".
    Then pushes the app layer protocol attribute to the luastate stack.

static int LuaCallbackAppLayerProtoFlow(lua_State *luastate)
    Almost identical to LuaCallbackFlowTimeString().
    Uses locking mechanisms when getting information on the Flow pointer and placing this information on the luastate stack.
    Returns the result of LuaCallbackAppLayerProtoPushToStackFromFlow();

static int LuaCallbackStatsPushToStackFromFlow(lua_State *luastate, const Flow *f)
    Pushes four numbers to the Lua stack, then returns the int 4.
    The variables pushed are:
        f->todstpktcnt
        f->todstbytecnt
        f->tosrcpktcnt
        f->tosrcbytecnt
    Presumably these are:
        to-destination packet count
        to-destination bytes count
        to-source packet count
        to-source bytes count

static int LuaCallbackStatsFlow(lua_State *luastate)
    Almost identical to LuaCallbackFlowTimeString().
    Uses locking mechanisms when getting information on the Flow pointer and placing this information on the luastate stack.
    Returns the result of LuaCallbackStatsPushToStackFromFlow();

static int LuaCallbackRuleIdsPushToStackFromPacketAlert(lua_State *luastate, const PacketAlert *pa)
    Brief: fills the luastate stack with alert information.
    Pushes three numbers/attributes to the luastate stack.
    Attributes pushed:
        pa->s->id
        pa->s->rev
        pa->s->gid
    Returns 3, the number of data items pushed to the stack.

static int LuaCallbackRuleIds(lua_State *luastate)
    Creates a PacketAlert pointer and assigns it by using LuaStateGetPacketAlert(luastate).
    Passes this packet alert into the function in the return statement as parameter.
    Returns the result of the above function, ergo 3.

static int LuaCallbackRuleMsgPushToStackFromPacketAlert(lua_State *luastate, const PacketAlert *pa)
    Pushes the attribute pa->s->msg to the luastate stack.
    pa->s->msg = rule message.
    Returns 1.

static int LuaCallbackRuleMsg(lua_State *luastate)
    Creates a PacketAlert pointer and assigns it by using LuaStateGetPacketAlert(luastate).
    Passes this packet alert into the function in the return statement as parameter.
    Returns the result of the above function, ergo 1.

static int LuaCallbackRuleClassPushToStackFromPacketAlert(lua_State *luastate, const PacketAlert *pa)
    Pushes two attributes from the PacketAlert to the luastate stack:
        pa->s->class_msg
        pa->s->prio
    These are self-explanatory.
    Returns 2, the number of attributes pushed to the stack.

static int LuaCallbackRuleClass(lua_State *luastate)
    Creates a PacketAlert pointer and assigns it by using LuaStateGetPacketAlert(luastate).
    Passes this packet alert into the function in the return statement as parameter.
    Returns the result of the above function, ergo 2.

static int LuaCallbackLogPath(lua_State *luastate)
    Assigns the log directory to a char pointer ld through the function ConfigGetLogDirectory();
    Returns the result of LuaPushStringBuffer(luastate, (const uint8_t *)ld, strlen(ld)).

static int LuaCallbackLogDebug(lua_State *luastate)
    Extracts the message as a char pointer from the function LuaGetStringArgument(luastate, 1).
    Runs the SCLogDebug() function with that msg as parameter.
    Returns 0.

static int LuaCallbackLogInfo(lua_State *luastate)
    Extracts msg as a char pointer from the function LuaGetStringArgument(luastate, 1).
    Runs SCLogInfo("%s", msg) on this attribute.
    Returns 0.

static int LuaCallbackLogNotice(lua_State *luastate)
    Extracts msg as a char pointer from the function LuaGetStringArgument(luastate, 1).
    Runs SCLogNotice("%s", msg) on this attribute.
    Returns 0.

static int LuaCallbackLogWarning(lua_State *luastate)
    Extracts msg as a char pointer from the function LuaGetStringArgument(luastate, 1).
    Runs SCLogWarning(SC_WARN_LUA_SCRIPT, "%s", msg) on this attribute.
    Returns 0.

static int LuaCallbackLogError(lua_State *luastate)
    Extracts msg as a char pointer from the function LuaGetStringArgument(luastate, 1).
    Runs SCLogError(SC_WARN_LUA_SCRIPT, "%s", msg) on this attribute.
    Returns 0.

static int LuaCallbackFileInfoPushToStackFromFile(lua_State *luastate, const File *file)
    #ifdef HAVE_NSS {
        Creates and empties a char array [33] named md5.
        Creates the md5ptr char pointer and assigns it to the beginning of the md5 array.
        If check for file->flags & FILE_MD5:
            A for loop goes through the sizeof(file->md5) elements.
            Creates a char array [3] named one.
            Uses snprintf(one, sizeof(one), "%02x", file->md5[x]) to fill the buffer one[3] with content from file->md5[loop round].
            The "%02x" format specifier means: print at least two hex digits, and if there are fewer than two, prepend a 0.
            strlcat(md5, one, sizeof(md5)) appends one to md5 and keeps md5 NUL-terminated.
    } else
        Creates a NULL char pointer *md5ptr.
    Then pushes file->file_id, file->txid, file->name, file->name_len, file->size, file->magic and md5ptr to the luastate stack.
    Returns 6, the number of attributes pushed to the stack (name and name_len presumably count as one pushed string).

static int LuaCallbackFileInfo(lua_State *luastate)
    Creates the const File *file pointer via LuaStateGetFile(luastate).
    If the file pointer exists, returns the result of LuaCallbackFileInfoPushToStackFromFile(luastate, file).

static int LuaCallbackFileStatePushToStackFromFile(lua_State *luastate, const File *file)
    Creates a const char pointer *state and initializes it to "UNKNOWN".
    Switch on file->state:
        case closed file: sets state to "closed"
        case truncated file: sets state to "truncated"
        case error file: sets state to "error"
    Pushes state and file->flags & FILE_STORED to the luastate stack.
    Returns 2, the number of attributes pushed to the Lua stack.

static int LuaCallbackFileState(lua_State *luastate)
    Initializes the const File pointer *file.
    If file is not NULL, returns LuaCallbackFileStatePushToStackFromFile(luastate, file).

static int LuaCallbackThreadInfoPushToStackFromThreadVars(lua_State *luastate, const ThreadVars *tv)
    Creates a u_long tid variable and sets it to SCGetThreadIdLong();
    Pushes the int tid and the strings tv->name and tv->thread_group_name to the luastate stack.
    Returns 3, the number of attributes pushed to the Lua stack.
static int LuaCallbackThreadInfo(lua_State *luastate)
    Creates and assigns the const ThreadVars pointer *tv, setting it to LuaStateGetThreadVars(luastate).
    If tv does not equal NULL, returns LuaCallbackThreadInfoPushToStackFromThreadVars(luastate, tv);

int LuaRegisterFunctions(lua_State *luastate)
    Registers all callbacks.
    Does this with a push function call followed by a set global call with "SCpartOfFunctionName" as parameter.
    This is repeated for all functions of the file.

C.6 Research output-lua.c

C.6.1 Description

The file starts with three different structs, named “LogLuaMasterCtx_”, “LogLuaCtx_” and “LogLuaThreadCtx_”. The first struct defines a path to a script directory; it is held in a char array called “path”. “LogLuaCtx_” has three members: a mutex, a Lua state pointer and an integer. The last struct, “LogLuaThreadCtx_”, contains only a LogLuaCtx pointer. The first function is named LuaTxLogger. It takes seven parameters. This function is a TX logger for Lua scripts. A single call to this function will run one script on a single transaction. The function locks a mutex and calls the Lua state set functions on four different pointers, so that four of the seven parameters are passed on. After that the lua_getglobal function is called with the hardcoded name "log". Then the lua_pcall function is called, which returns an integer. If this integer is not equal to 0, an error message is displayed. After this test the mutex is unlocked and the SCReturnInt function is called with 0 as argument. The next function is also a logger function. It is called LuaStreamingLogger; it hooks into the streaming logger API and gets called for each chunk of new streaming data.
That function locks the mutex, checks the flags for the output streaming transaction flag, then sets the Lua TX pointer and after that the other pointers. Below this, the lua_getglobal and lua_newtable functions are called with the Lua state, lua_getglobal with the name "log". After this the flags are checked again. Finally the mutex is unlocked and the function SCReturnInt is called with an int declared elsewhere in the code. The next function is LuaPacketLoggerAlerts. A single call to this function will run one script for a single packet. If it is called, it means that the registered condition function has returned true. The script is called once for each alert stored in the packet. The function takes three parameters. It checks whether the packet received is either IPv4 or IPv6; if it is one of them, the function continues. After that it checks if the protocol is a known protocol. If the protocol is not known, it uses IP_GET_IPPROTO and prints the result. After this check the function loops through the alerts stored in the packet. This is once more done with the lua_getglobal function with the string “log”, and four of the Lua state set functions are called. This is all done while the mutex is locked. After going through the loop the mutex is unlocked and SCReturnInt is called with 0.

Line 228

C.6.2 Functions

static int LuaPacketConditionAlerts(ThreadVars *tv, const Packet *p)
    Checks whether the packet's alert count (cnt) is greater than zero; returns true or false.

static int LuaPacketLoggerTls(ThreadVars *tv, void *thread_data, const Packet *p)
    Runs a script for a single packet. Sets a number of packets and flows. Returns an int. Called [...]

static int LuaPacketConditionTls(ThreadVars *tv, const Packet *p)
    Checks whether the condition for the packet holds; if not, false is returned.
static int LuaPacketLoggerSsh(ThreadVars *tv, void *thread_data, const Packet *p)
static int LuaPacketConditionSsh(ThreadVars *tv, const Packet *p)
static int LuaPacketLogger(ThreadVars *tv, void *thread_data, const Packet *p)
static int LuaPacketCondition(ThreadVars *tv, const Packet *p)
static int LuaFileLogger(ThreadVars *tv, void *thread_data, const Packet *p, const File *ff)
static int LuaFlowLogger(ThreadVars *tv, void *thread_data, Flow *f)
    These work in the same way as the TLS and alerts functions, with small variations.

static int LuaStatsLogger(ThreadVars *tv, void *thread_data, const StatsTable *st)
    Creates a Lua table with a for loop that pushes a number of strings and values to the Lua state. Ends [...]

The struct LogLuaScriptOptions is defined with ints.

static int LuaScriptInit(const char *filename, LogLuaScriptOptions *options)
    Checks whether the functions in question have an init function and which scripts need one. Receives the file name [...]

static lua_State *LuaScriptSetup(const char *filename)
    Sets up the Lua state with the script received. Checks for errors and returns the Lua state if it is successful [...]

C.7 Research detect-lua-extensions.c

C.7.1 Introduction

Luastate is a blank sheet for Lua interaction with C; it is a part of the Lua library (lua.h, lauxlib.h). The library is not included in this file, so it must be included somewhere else. All functions from the Lua library start with lua_ and the lauxlib functions start with luaL_.

Function Enumeration

The file contains the definitions of these functions:
• LuaGetFlowvar
• LuaSetFlowvar
• LuaGetFlowint
• LuaSetFlowint
• LuaIncrFlowint
• LuaDecrFlowint
• LuaExtensionsMatchSetup
• LuaRegisterExtensions

C.7.2 Function Description

LuaGetFlowvar takes one parameter, a pointer to lua_State.
    The function checks for multiple configuration options from the luastate stack.
    If there is a problem, each check will push an error message onto the luastate stack so Lua scripts can check for and act on these errors.

LuaSetFlowvar takes one parameter, a pointer to lua_State.
    This function seems to be a mirror of GetFlowvar. It checks a lot of the same information, but it also creates a buffer at the end of the function.
    It also returns 0, instead of 1 as GetFlowvar does. In my mind that means that the functions are called for different purposes somewhere else in the Suricata codebase.

LuaGetFlowint takes one parameter, a pointer to lua_State.
    This function looks like a pure copy of Flowvar, doing several checks on information in the luastate and returning errors if it doesn't find what it looks for.

LuaSetFlowint takes one parameter, a pointer to lua_State.
    This function does a lot of the same checks as GetFlowint. It either returns an error and adds error messages to the luastate if it doesn't find what it looks for, or, if it finds all the things it looks for, returns 0 to the calling function.

LuaIncrFlowint takes one parameter, a pointer to lua_State.
    The function does the same checks as the previous functions, but at the end it manipulates the return value from FlowVarGet() by either setting it to 1 (if it is NULL) or incrementing what was there by 1.

LuaDecrFlowint takes one parameter, a pointer to lua_State.
    The function does all of the same checks, but at the end it manipulates the return value from FlowVarGet() by either setting it to 0 (if it is NULL) or decrementing the number it finds.

LuaExtensionsMatchSetup takes 7 parameters: a pointer to the luastate, a pointer for DetectLuaData, a pointer for DetectEngineThreadCtx, a Flow pointer, an int called flow_locked, a Packet pointer p, and a uint8_t called flags.
    The function pushes a set of predefined strings onto the luastate stack (suricata:luajitdata and suricata:det_ctx).
    It also gets pointers from AppLayerParseGetTx and puts them onto the luastate stack if the return value is not 0.
    The last action is to call LuaStateSetDirection with a flag called STREAM_TOSERVER.

LuaRegisterExtensions takes one parameter, a pointer to lua_State.
    The purpose of this function is to register the names of all the functions in this file in the luastate stack.
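The increment and decrement behaviour described for LuaIncrFlowint and LuaDecrFlowint can be sketched on a plain struct. FlowInt and both helpers are hypothetical stand-ins for the flow variable returned by FlowVarGet(), and clamping at 0 and at UINT32_MAX is an assumption of this sketch rather than something the notes above state.

```c
#include <stdint.h>

/* Hypothetical stand-in for a flow integer variable. */
typedef struct FlowInt {
    int is_set;       /* 0 models "FlowVarGet() returned NULL" */
    uint32_t value;
} FlowInt;

/* An unset variable starts at 1, otherwise the value is incremented. */
uint32_t flowint_incr(FlowInt *fi)
{
    if (!fi->is_set) {
        fi->is_set = 1;
        fi->value = 1;
    } else if (fi->value < UINT32_MAX) {
        fi->value++;
    }
    return fi->value;
}

/* An unset variable starts at 0, otherwise the value is decremented
 * without wrapping below zero. */
uint32_t flowint_decr(FlowInt *fi)
{
    if (!fi->is_set) {
        fi->is_set = 1;
        fi->value = 0;
    } else if (fi->value > 0) {
        fi->value--;
    }
    return fi->value;
}
```

The different starting values matter to rule writers: a counter that is first touched by an increment reads 1, while one first touched by a decrement reads 0.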
C.8 Research detect-lua

C.8.1 detect-lua.h

typedef struct DetectLuaThreadData {
    lua_State *luastate;
    uint32_t flags;
    int alproto;
} DetectLuaThreadData;

typedef struct DetectLuaData {
    int thread_ctx_id;
    int negated;
    char *filename;
    uint32_t flags;
    int alproto;
    char *buffername; /* buffer name in case of a single buffer */
    uint16_t flowint[DETECT_LUAJIT_MAX_FLOWINTS];
    uint16_t flowints;
    uint16_t flowvar[DETECT_LUAJIT_MAX_FLOWVARS];
    uint16_t flowvars;
    uint32_t sid;
    uint32_t rev;
    uint32_t gid;
} DetectLuaData;

C.8.2 Research detect-lua.c

Short introduction

This file does a lot of things. It checks if the Suricata install actually supports Lua. It manages and populates the pool of Lua states. It runs the Lua scripts to see if they match traffic. It also contains 6 unit tests for testing functionality, namely payload buffer, packet buffer, HTTP buffer and flowints.

Description

# If HAVE_LUA is NOT defined

static int DetectLuaSetupNoSupport(DetectEngineCtx *a, Signature *b, char *c)
    Prints an error about Lua not being supported.

void DetectLuaRegister(void)
    Sets some values in sigmatch_table[DETECT_LUA],
    probably to signify that Lua support is not enabled.

# If HAVE_LUA and HAVE_LUAJIT are defined
    Creates a lua_State pool.
    States are required to be allocated in the lower 2GB of memory.
    The pool is protected by mutex locks.

# HAVE_LUAJIT is not required

void DetectLuaRegister(void)
    Sets some values in sigmatch_table[DETECT_LUA],
    probably to signify that Lua support is enabled.

# HAVE_LUAJIT needs to be defined again

static void *LuaStatePoolAlloc(void)
    Returns a new lua_State.

static void LuaStatePoolFree(void *d)
    Closes the lua_State held by the pool.

// Populates the luastates pool:
int DetectLuajitSetupStatesPool(int num, int reloads)
    Locks the luajit states mutex.
    Gets a ConfNode with ConfGetNode("detect-engine").
    Loops through the nodes getting "luajit-states".
    Calculates the number of threads to know how many luajit states to allocate.
    Allocates the states.
    Unlocks the luajit states mutex.
    Returns -1 if the allocation fails, else 0.

# Does NOT require HAVE_LUAJIT anymore

static lua_State *DetectLuaGetState(void)
    Creates a lua_State pointer.
    #if HAVE_LUAJIT
        Locks the luajit states mutex.
        Gets one lua_State from the luajit states pool.
        Unlocks the mutex.
    #else
        The lua_State is set to a new state.
    # Does not require HAVE_LUAJIT anymore
    Returns the lua_State.

static void DetectLuaReturnState(lua_State *s)
    #if HAVE_LUAJIT
        Locks the luajit states mutex.
        Calls PoolReturn on the lua_State.
        Unlocks the luajit states mutex.
    #else
        Uses lua_close on the lua_State.

# Does NOT require HAVE_LUAJIT

void LuaDumpStack(lua_State *state)
    Loops through all entries on the stack of the lua_State and prints stack dump information.
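The get and return steps described above for the LuaJIT states pool follow a common shape: lock the mutex, take or give back one entry, unlock. A minimal sketch of that shape follows. It uses a fixed array and a pthread mutex rather than Suricata's Pool API, and State, pool_get and pool_return are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>

#define POOL_SIZE 2

/* Hypothetical stand-in for a preallocated lua_State. */
typedef struct State {
    int id;
    int in_use;
} State;

static State pool[POOL_SIZE] = { { 0, 0 }, { 1, 0 } };
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hands out the first free state, or NULL if the pool is exhausted. */
State *pool_get(void)
{
    State *s = NULL;
    pthread_mutex_lock(&pool_lock);
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            s = &pool[i];
            break;
        }
    }
    pthread_mutex_unlock(&pool_lock);
    return s;
}

/* Marks a state as free again so another caller can reuse it. */
void pool_return(State *s)
{
    if (s == NULL)
        return;
    pthread_mutex_lock(&pool_lock);
    s->in_use = 0;
    pthread_mutex_unlock(&pool_lock);
}
```

Keeping the critical sections this short means threads only contend while an entry is being marked, not while a script runs on the state.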
    // Does the signature matching
    int DetectLuaMatchBuffer(DetectEngineThreadCtx *det_ctx, Signature *s, SigMatch *sm,
            uint8_t *buffer, uint32_t buffer_len, uint32_t offset, Flow *f, int flow_lock)
        Gets DetectLuaData from SigMatch->ctx
        Gets DetectLuaThreadData with a function on DetectLuaData (calls it "luajit")
        Uses LuaExtensionsMatchSetup to set some data parameters in DetectLuaThreadData->luastate
        Passes some data to the luastate: "match", "offset", the offset variable and the buffer variable
        Then it tries to run the script through lua_pcall on the luastate
        Gets return value from the script when it's done running
        Retrieves a table from the script in luastate
        Gets the return value of the script
        Clears the luastate stack
        Returns the return value of the script

    static int DetectLuaMatch(ThreadVars *tv, DetectEngineThreadCtx *det_ctx, Packet *p,
            Signature *s, const SigMatchCtx *ctx)
        Gets DetectLuaData from ctx parameter
        Gets DetectLuaThreadData with a function on DetectLuaData
        Sets flags variable depending on what the state of p->flowflags is
        Uses LuaExtensionsMatchSetup to set some data parameters in DetectLuaThreadData->luastate
        Passes some data to the luastate (packet payload and packet data)
        Checks if the protocol is HTTP
            Locks the packet->flow
            Puts the packet->flow->alstate in a HtpState
            Gets some transaction details (index and total transactions)
            Loops through the HTTP packets
                Pushes the request line to luastate
            Unlocks the packet->flow
        Then it tries to run the script through lua_pcall on the luastate
        Gets return value from the script when it's done running
        Retrieves a table from the script in luastate
        Gets the return value of the script
        Clears the luastate stack
        Returns the return value of the script

    static int DetectLuaAppMatchCommon(ThreadVars *t, DetectEngineThreadCtx *det_ctx,
            Flow *f, uint8_t flags, void *state, const Signature *s, const SigMatchCtx *ctx)
        Gets DetectLuaData from SigMatchCtx
        Gets DetectLuaThreadData from a function on DetectLuaData
        Uses LuaExtensionsMatchSetup to set some data parameters in DetectLuaThreadData->luastate
        Checks if the protocol is HTTP
            Gets HtpState from the state parameter
            Gets tx from HtpState
            Pushes the request line to luastate
        Runs the lua script and gets the return value
        Returns the return value if the script returns a number
        Checks if the script returns a table
            Prints the values in the table as debug info
            Gets the return value from the table
        Returns the return value of the script

    static int DetectLuaAppMatch(ThreadVars *t, DetectEngineThreadCtx *det_ctx,
            Flow *f, uint8_t flags, void *state, Signature *s, SigMatch *m)
        Only calls DetectLuaAppMatchCommon with the same parameters

    static int DetectLuaAppTxMatch(ThreadVars *t, DetectEngineThreadCtx *det_ctx,
            Flow *f, uint8_t flags, void *state, void *txv, const Signature *s,
            const SigMatchCtx *ctx)
        Only calls DetectLuaAppMatchCommon with the same parameters

    Creates a ut_script pointer to a buffer that should contain a lua script (unittest)
    if UNITTESTS is defined

    static void *DetectLuaThreadInit(void *data)
        Casts data to DetectLuaData
        Allocates memory the size of DetectLuaThreadData for a DetectLuaThreadData pointer and sets it to 0x00
        Sets the protocol and flags of the pointer to the values in DetectLuaData
        Sets DetectLuaThreadData->luastate to the result of DetectLuaGetState()
        Opens the libraries associated with luastate
        Registers the extensions of the luastate
        Pushes sid, rev and gid from DetectLuaData to the luastate
        #if UNITTESTS is defined
            Loads the unittest into luastate (the ut_script variable)
        #if UNITTESTS is not defined
            Loads a file defined in DetectLuaData to the luastate
        Runs the lua script (the luastate)

    static void DetectLuaThreadFree(void *ctx)
        Casts ctx to DetectLuaThreadData
        Calls DetectLuaReturnState on the luastate in DetectLuaThreadData
        Frees the DetectLuaThreadData pointer

    static DetectLuaData *DetectLuaParse(const DetectEngineCtx *de_ctx, char *str)
        Creates a DetectLuaData pointer and allocates data for it
        Sets the file name attribute of the pointer with DetectLoadCompleteSigPath(de_ctx, str)
        Returns the pointer

    static int DetectLuaSetupPrime(DetectEngineCtx *de_ctx, DetectLuaData *ld)
        Creates a new luastate
        Opens the luastate libraries
        #if UNITTESTS is defined
            Loads the unittest into luastate
        #if UNITTESTS is not defined
            Loads a file defined in DetectLuaData to the luastate
        Runs the lua script
        Pushes the nil value onto the stack
        Loops through the lua stack
            Gets a string from the stack (1)
            Checks if the string is "flowvar" and if stack index -1 is a table
                Pushes the nil value on the stack
                Loops through the flowvar table
                    Prints the values as debug info
                    Pops it afterwards but keeps the key in the key value pair
                    Gets the idx of the flowvar variable name from the DetectEngineCtx
                    Creates a new array element in the flowvar array in DetectLuaData and sets it to the retrieved flowvar
                Pops one value from the stack but keeps the key in the key value pair if it was not already done
            Checks if the string is "flowint" and if stack index -1 is a table
                Pushes the nil value on the stack
                Loops through the flowint table
                    Prints the values as debug info
                    Pops one value from the stack but keeps the key in the key value pair
                    Gets the idx of the flowint variable name from the DetectEngineCtx
                    Creates a new array element in the flowint array in DetectLuaData and sets it to the retrieved flowint
                Pops one value from the stack but keeps the key in the key value pair if it was not already done
            Gets a string from the stack (2)
            Pops a value from the stack
            Prints the two strings as debug values
            Checks if the values are "packet" and "true"
                Sets bit 1<<0 in DetectLuaData->flags (bitwise inclusive OR)
            Checks if the values are "payload" and "true"
                Sets bit 1<<1 in DetectLuaData->flags (bitwise inclusive OR)
            Checks if the values are "stream" and "true"
                Sets bit 1<<2 in DetectLuaData->flags (bitwise inclusive OR)
                Sets DetectLuaData->buffername to "stream"
            Checks if the values are "http" (first 4 characters) and "true"
                Sets DetectLuaData->alproto to ALPROTO_HTTP
                Sets flags depending on what value (1) was after http (e.g. http.uri),
                format: DetectLuaData->flags |= x<<y (x and y are integers)
                Sets DetectLuaData->buffername to value (1)
            Checks if the values are "dns" (first 3 characters) and "true"
                Sets DetectLuaData->alproto to ALPROTO_DNS
                Sets flags depending on what value (1) was after dns (e.g. dns.response),
                format: DetectLuaData->flags |= x<<y (x and y are integers)
                Sets DetectLuaData->buffername to value (1)
            Checks if the values are "tls" (first 3 characters) and "true"
                Sets DetectLuaData->alproto to ALPROTO_TLS
                Sets bit 1<<18 in DetectLuaData->flags (bitwise inclusive OR)
            Checks if the values are "ssh" (first 3 characters) and "true"
                Sets DetectLuaData->alproto to ALPROTO_SSH
                Sets bit 1<<19 in DetectLuaData->flags (bitwise inclusive OR)
        Pops one element from the lua stack
        Closes the luastate

    static int DetectLuaSetup(DetectEngineCtx *de_ctx, Signature *s, char *str)
        Creates a DetectLuaData pointer and sets it to DetectLuaParse(de_ctx, str)
        Creates and allocates a SigMatch pointer
        Sets s->alproto to DetectLuaData->alproto if both are not unknown and don't differ from each other
        Sets the SigMatch->type to DETECT_LUA
        Sets the SigMatch->ctx to DetectLuaData (luajit)
        Does some protocol checks and appends a sigmatch to the list in DetectLuaData, depending on
        which DetectLuaData->flags are set and the protocol
        Increments DetectEngineCtx->detect_luajit_instances by 1

    // Post-sigparse function to set the sid, rev, gid into the
    // ctx, as this isn't available yet during parsing.
    void DetectLuaPostSetup(Signature *s)
        Loops through the SigMatches
        Gets DetectLuaData from Signature->sm_list[i]->ctx
        Sets DetectLuaData->sid, rev and gid from s->id, rev and gid

    static void DetectLuaFree(void *ptr)
        Frees the ptr->buffername and ptr->filename
        Frees the pointer sent

    #ifdef UNITTESTS
    static int LuaMatchTest01(void) // Tests the http buffer
        Creates a char pointer with the testing lua script
        Creates a char pointer with the sig
        Creates two buffers with the test request lines
        Gets the length of the request lines
        Creates a TcpSession, Packet, Flow, Signature, ThreadVars, DetectEngineThreadCtx and AppLayerParserThreadCtx variable.
        Sets the ut_script global variable to the testing lua script
        Initializes the Flow variable with FLOW_INITIALIZE
        Sets Flow->protoctx to TcpSession
        Sets Flow->proto to IPPROTO_TCP
        Sets Flow->flags to the bitwise inclusive OR with FLOW_PKT_TOSERVER
        Sets Flow->alproto to ALPROTO_HTTP
        Sets flow, flowflags and flags on the two Packet variables
        Initializes a TCP stream with StreamTcpInitConfig(TRUE)
        Initializes a DetectEngineCtx pointer with DetectEngineCtxInit()
        Puts the signature char pointer in the Signature variable and adds it into the Detect Engine Context signature list
        Calls SigGroupBuild(DetectEngineCtx)
        Initializes the thread specific detection engine
        Locks the flow mutex
        Parses the flow with AppLayerParserParse for request line 1
        Unlocks the flow mutex
        Creates a HtpState pointer to Flow->alstate
        Matches the signature against Packet 1 with SigMatchSignature
        Locks the flow mutex
        Parses the flow with AppLayerParserParse for request line 2
        Unlocks the flow mutex
        Matches the signature against Packet 2 with SigMatchSignature
        Creates a FlowVar pointer and sets it to FlowVarGet(&Flow, 1), gets flowvar with idx = 1 from Flow
        Puts "2" into FlowVar->data.fv_str.value
        Frees pointers and does general cleanup

    static int LuaMatchTest02(void) // Tests the payload buffer
        Creates a char pointer with the testing lua script
        Creates a char pointer with the sig
        Creates two buffers with the test request lines
        Gets the length of the request lines
        Creates a TcpSession, Packet, Flow, Signature, ThreadVars and DetectEngineThreadCtx variable.
        Sets the ut_script global variable to the testing lua script
        Initializes the Flow variable with FLOW_INITIALIZE
        Sets Flow->protoctx to TcpSession
        Sets Flow->proto to IPPROTO_TCP
        Sets Flow->flags to the bitwise inclusive OR with FLOW_PKT_TOSERVER
        Sets Flow->alproto to ALPROTO_HTTP
        Sets flow, flowflags and flags on the two Packet variables
        Initializes a TCP stream with StreamTcpInitConfig(TRUE)
        Initializes a DetectEngineCtx pointer
        Puts the signature char pointer in the Signature variable and adds it into the Detect Engine Context signature list
        Calls SigGroupBuild(DetectEngineCtx)
        Initializes the thread specific detection engine
        Matches the signature against Packet 1 with SigMatchSignature
        Matches the signature against Packet 2 with SigMatchSignature
        Creates a FlowVar pointer and sets it to FlowVarGet(&Flow, 1), gets flowvar with idx = 1 from Flow
        Puts "2" into FlowVar->data.fv_str.value
        Frees pointers and does general cleanup

    static int LuaMatchTest03(void) // Tests the packet buffer
        Creates a char pointer with the testing lua script
        Creates a char pointer with the sig
        Creates two buffers with the test request lines
        Gets the length of the request lines
        Creates a TcpSession, Packet, Flow, Signature, ThreadVars and DetectEngineThreadCtx variable.
        Sets the ut_script global variable to the testing lua script
        Initializes the Flow variable with FLOW_INITIALIZE
        Sets Flow->protoctx to TcpSession
        Sets Flow->proto to IPPROTO_TCP
        Sets Flow->flags to the bitwise inclusive OR with FLOW_PKT_TOSERVER
        Sets Flow->alproto to ALPROTO_HTTP
        Sets flow, flowflags and flags on the two Packet variables
        Initializes a TCP stream with StreamTcpInitConfig(TRUE)
        Initializes a DetectEngineCtx pointer
        Puts the signature char pointer in the Signature variable and adds it into the Detect Engine Context signature list
        Calls SigGroupBuild(DetectEngineCtx)
        Initializes the thread specific detection engine
        Matches the signature against Packet 1 with SigMatchSignature
        Matches the signature against Packet 2 with SigMatchSignature
        Creates a FlowVar pointer and sets it to FlowVarGet(&Flow, 1), gets flowvar with idx = 1 from Flow
        Puts "2" into FlowVar->data.fv_str.value
        Frees pointers and does general cleanup

    static int LuaMatchTest04(void) // Tests the http buffer, flowints
        Creates a char pointer with the testing lua script
        Creates a char pointer with the sig
        Creates two buffers with the test request lines
        Gets the length of the request lines
        Creates a TcpSession, Packet, Flow, Signature, ThreadVars, DetectEngineThreadCtx and AppLayerParserThreadCtx variable.
        Sets the ut_script global variable to the testing lua script
        Initializes the Flow variable with FLOW_INITIALIZE
        Sets Flow->protoctx to TcpSession
        Sets Flow->proto to IPPROTO_TCP
        Sets Flow->flags to the bitwise inclusive OR with FLOW_PKT_TOSERVER
        Sets Flow->alproto to ALPROTO_HTTP
        Sets flow, flowflags and flags on the two Packet variables
        Initializes a TCP stream with StreamTcpInitConfig(TRUE)
        Initializes a DetectEngineCtx pointer
        Puts the signature char pointer in the Signature variable and adds it into the Detect Engine Context signature list
        Calls SigGroupBuild(DetectEngineCtx)
        Initializes the thread specific detection engine
        Locks the flow mutex
        Parses the flow with AppLayerParserParse for request line 1
        Unlocks the flow mutex
        Creates a HtpState pointer to Flow->alstate
        Matches the signature against Packet 1 with SigMatchSignature
        Locks the flow mutex
        Parses the flow with AppLayerParserParse for request line 2
        Unlocks the flow mutex
        Matches the signature against Packet 2 with SigMatchSignature
        Creates a FlowVar pointer and sets it to FlowVarGet(&Flow, 1), gets flowvar with idx = 1 from Flow
        Puts "2" into FlowVar->data.fv_str.value
        Frees pointers and does general cleanup

    static int LuaMatchTest05(void) // Test http buffer, flowints
        Creates a char pointer with the testing lua script
        Creates a char pointer with the sig
        Creates two buffers with the test request lines
        Gets the length of the request lines
        Creates a TcpSession, Packet, Flow, Signature, ThreadVars, DetectEngineThreadCtx and AppLayerParserThreadCtx variable.
        Sets the ut_script global variable to the testing lua script
        Initializes the Flow variable with FLOW_INITIALIZE
        Sets Flow->protoctx to TcpSession
        Sets Flow->proto to IPPROTO_TCP
        Sets Flow->flags to the bitwise inclusive OR with FLOW_PKT_TOSERVER
        Sets Flow->alproto to ALPROTO_HTTP
        Sets flow, flowflags and flags on the two Packet variables
        Initializes a TCP stream with StreamTcpInitConfig(TRUE)
        Initializes a DetectEngineCtx pointer
        Puts the signature char pointer in the Signature variable and adds it into the Detect Engine Context signature list
        Calls SigGroupBuild(DetectEngineCtx)
        Initializes the thread specific detection engine
        Locks the flow mutex
        Parses the flow with AppLayerParserParse for request line 1
        Unlocks the flow mutex
        Creates a HtpState pointer to Flow->alstate
        Matches the signature against Packet 1 with SigMatchSignature
        Locks the flow mutex
        Parses the flow with AppLayerParserParse for request line 2
        Unlocks the flow mutex
        Matches the signature against Packet 2 with SigMatchSignature
        Creates a FlowVar pointer and sets it to FlowVarGet(&Flow, 1), gets flowvar with idx = 1 from Flow
        Puts "2" into FlowVar->data.fv_str.value
        Frees pointers and does general cleanup

    static int LuaMatchTest06(void) // Test http buffer, flowints
        Creates a char pointer with the testing lua script
        Creates a char pointer with the sig
        Creates two buffers with the test request lines
        Gets the length of the request lines
        Creates a TcpSession, Packet, Flow, Signature, ThreadVars, DetectEngineThreadCtx and AppLayerParserThreadCtx variable.
        Sets the ut_script global variable to the testing lua script
        Initializes the Flow variable with FLOW_INITIALIZE
        Sets Flow->protoctx to TcpSession
        Sets Flow->proto to IPPROTO_TCP
        Sets Flow->flags to the bitwise inclusive OR with FLOW_PKT_TOSERVER
        Sets Flow->alproto to ALPROTO_HTTP
        Sets flow, flowflags and flags on the two Packet variables
        Initializes a TCP stream with StreamTcpInitConfig(TRUE)
        Initializes a DetectEngineCtx pointer
        Puts the signature char pointer in the Signature variable and adds it into the Detect Engine Context signature list
        Calls SigGroupBuild(DetectEngineCtx)
        Initializes the thread specific detection engine
        Locks the flow mutex
        Parses the flow with AppLayerParserParse for request line 1
        Unlocks the flow mutex
        Creates a HtpState pointer to Flow->alstate
        Matches the signature against Packet 1 with SigMatchSignature
        Locks the flow mutex
        Parses the flow with AppLayerParserParse for request line 2
        Unlocks the flow mutex
        Matches the signature against Packet 2 with SigMatchSignature
        Creates a FlowVar pointer and sets it to FlowVarGet(&Flow, 1), gets flowvar with idx = 1 from Flow
        Puts "2" into FlowVar->data.fv_str.value
        Frees pointers and does general cleanup

    void DetectLuaRegisterTests(void)
        Registers the unittests with UtRegisterTest("LuaMatchTest0X", LuaMatchTest0X, 1)

D Manual Code Analysis Results

D.1 Result of grep function

./alert-debuglog.c:111: * \\todo const Packet ptr, requires us to change the
./alert-debuglog.c:150:/** \\todo doc
./alert-debuglog.c:308: /** \\todo improve the order selection policy */
./alert-fastlog.c:26: * \\todo Support classifications
./alert-fastlog.c:27: * \\todo Support more than just IPv4/IPv6 TCP/UDP.
./alert-prelude.c:850: /* XXX which one to add to this alert? Lets see how Snort solves this.
./alert-unified2-alert.c:1044: /* TODO inverse order if needed, this should be done on a
./alert-unified2-alert.c:759: /* Fake datalink to avoid bug with old barnyard2 */
./app-layer.c:233: * \\todo We need to figure out a more robust solution for this,
./app-layer.c:643:/** \\brief HACK to work around our broken unix manager (re)init loop
./app-layer-dcerpc.c:1229: /* just a hack to get thing working. We shouldn't be setting
./app-layer-dcerpc.c:1468: * \\todo - Currently the parser is very generic. Enable target based
./app-layer-dcerpc.c:1487: /* temporary use. we will get rid of this later, once we have ironed out
./app-layer-dcerpc.c:1609: /* temporary fix */
./app-layer-dcerpc.c:1762: /* temporary fix */
./app-layer-dcerpc.c:1845: /* temporary fix */
./app-layer-dcerpc.c:26: * \\todo Remove all the unnecessary per byte incremental loops with a full one
./app-layer-dcerpc.c:2904: * \\todo Needs to be rewritten
./app-layer-dcerpc-udp.c:6: * \\todo Updated by AS: Inspect the possibilities of sending junk start at the
./app-layer-detect-proto.c:109: /* \\todo Change this into a non-pointer */
./app-layer-detect-proto.c:125: /* \\todo we don't need this except at setup time. Get rid of it. */
./app-layer-detect-proto.c:139: * \\todo Modify ctx_ipp to hold for only tcp and udp. The rest can be
./app-layer-detect-proto.c:1546: * \\todo incomplete. Need more work.
./app-layer-detect-proto.c:69: /* \\todo don't really need it. See if you can get rid of it */
./app-layer-detect-proto.c:71: /* \\todo calculate at runtime and get rid of this var */
./app-layer-detect-proto.c:73: /* \\todo check if we can reduce the bottom 2 vars to uint16_t */
./app-layer-dns-common.c:437: DNSDecrMemcap(0xffff, dns_state); /** TODO update if/once we alloc
./app-layer-dns-common.c:80: BUG_ON(size > state->memuse); /**< TODO remove later */
./app-layer-dns-common.c:84: BUG_ON(size > SC_ATOMIC_GET(dns_memuse)); /**< TODO remove later */
./app-layer-dns-tcp.c:114: /** \\todo set event? */
./app-layer-dns-tcp.c:157: /** \\todo be smarter about this, like use a pool or several pools for
./app-layer-dns-tcp.c:211: /** \\todo set event?*/
./app-layer-dns-tcp.c:230: /** \\todo set event? */
./app-layer-dns-tcp.c:287: /** \\todo remove this when PP is fixed to enforce ipproto */
./app-layer-dns-tcp.c:501: /** \\todo remove this when PP is fixed to enforce ipproto */
./app-layer-dns-tcp.c:97: /** \\todo set event?*/
./app-layer-dns-udp.c:120: /** \\todo set event? */
./app-layer-dns-udp.c:179: /** \\todo remove this when PP is fixed to enforce ipproto */
./app-layer-dns-udp.c:69: /** \\todo remove this when PP is fixed to enforce ipproto */
./app-layer-dns-udp.c:98: /** \\todo set event?*/
./app-layer-ftp.h:74: /** \\todo more if missing.. */
./app-layer-htp.c:1568: * \\todo really needed? */
./app-layer-htp.c:174: * \\todo This needs to be a libhtp function.
./app-layer-htp.c:203: * \\todo This needs to be a libhtp function.
./app-layer-htp.c:408: /* hack: even if libhtp considers the tx incomplete, we want to
./app-layer-htp.c:411: * we hack around it here. */
./app-layer-htp.c:5081:/** \\test Test \\ char in query profile IDS. Bug 739
./app-layer-htp.c:5191:/** \\test Test + char in query. Bug 1035
./app-layer-htp.c:5301:/** \\test Test + char in query. Bug 1035
./app-layer-parser.c:1100: /** \\todo bug 719 */
./app-layer-smtp.c:721: /** \\todo decoder event */
./app-layer-smtp.c:764: /* kinda like a hack. The mail sent in DATA mode, would be
./app-layer-ssl.c:1009: /* \\todo fix the event from invalid rule to unknown rule */
./app-layer-ssl.c:1056: * \\todo On reaching an inconsistent state, check if the input has
./app-layer-ssl.c:1272: /* \\todo Detect the 2 byte ones */
./app-layer-ssl.c:3878: * \\test Test for bug #955 and CVE-2013-5919. The data is from the
./app-layer-tls-handshake.c:177: // TODO maybe an event here?
./app-layer-tls-handshake.c:185: // TODO do we need an event here?
./conf.c:1283: * used to trigger a bug that caused the second level of the name
./conf.c:31: * \\todo Consider having the in-memory configuration database a direct
./conf.c:35: * \\todo Get rid of allow override and go with a simpler first set,
./counters.c:618: /** temporary local table to merge the per thread counters,
./counters.c:758: * \\todo reimplement this, probably based on stats-json
./decode.c:441: * \\todo IPv6
./decode-ethernet.c:142: * \\todo More Ethernet tests
./decode-gre.c:136: * \\todo We need to make sure this does not allow bypassing
./decode-gre.c:72: * \\todo We need to make sure this does not allow bypassing
./decode.h:1008: * \\todo we need more & maybe put them in a separate file? */
./decode-icmpv4.h:263: * \\todo This check is used in the flow engine and needs to be as
./decode-icmpv6.c:1593: * \\todo More ICMPv6 tests
./decode-ipv4.c:210: /** \\todo Aparently a DOI of zero is fine in practice - verify. */
./decode-ipv4.c:273: /** \\todo Wireshark marks this a padding, but spec says reserved. */
./decode-ipv4.c:279: /** \\todo May not want to return error here on unknown tag type (at least not for 3|4) */
./decode-ipv4.c:53: * \\todo This function needs removed in favor of specific validation.
./decode-ipv4.h:61: /** \\todo We may want to break type up into its 3 fields
./decode-ipv6.c:293:/** \\todo move into own function to loaded on demand */
./decode-ipv6.h:178: /* XXX */
./decode-ipv6.h:183: /* XXX */
./decode-ipv6.h:228: /* XXX */
./decode-ipv6.h:232: /* XXX */
./decode-ipv6.h:243: /* XXX */
./decode-ipv6.h:247: /* XXX */
./decode-ipv6.h:251: /* XXX */
./decode-ipv6.h:255: /* XXX */
./decode-ipv6.h:91: /* XXX */
./decode-pppoe.c:444: * \\todo More PPPOE tests
./decode-raw.c:220: * \\todo More Raw tests
./decode-sctp.h:30: /* XXX RAW* needs to be really 'raw', so no ntohs there */
./decode-tcp.h:22: * \\todo RAW* macro's should be returning the raw value, not the host order
./decode-template.c:55: /* TODO add counter for your type of packet to DecodeThreadVars,
./decode-udp.h:29: /* XXX RAW* needs to be really 'raw', so no ntohs there */
./decode-vlan.c:143:/** \\todo Must GRE+VLAN and Multi-Vlan packets to
./defrag.c:2161: * fail. The fix was simple, but this unit test is just to make sure
./defrag.c:28: * \\todo pool for frag packet storage
./defrag.c:29: * \\todo policy bsd-right
./defrag.c:30: * \\todo profile hash function
./defrag.c:31: * \\todo log anomalies
./defrag.c:500: * \\todo Allocate packet buffers from a pool.
./detect-bytejump.c:198: * \\todo Should this validate it is still in the *payload*?
./detect-bytejump.c:307: * \\todo Should this validate it is still in the *payload*?
./detect-bytejump.c:437: /** \\todo Error on dups? */
./detect-bytejump.c:833: * \\todo This fails becuase we can only have 9 captures and there are 10.
./detect-bytejump.c:96: /* XXX */
./detect-bytejump.h:110: * \\todo The return seems backwards. We should return a non-zero error code.
./detect-bytejump.h:57: * \\todo add support for no_stream and stream_only
./detect-bytetest.c:158: * \\todo Should this validate it is in the *payload*?
./detect-bytetest.c:379: /** \\todo Error on dups? */
./detect-bytetest.c:49: /* * \\todo We probably just need a simple tokenizer here */
./detect-bytetest.c:98: /* XXX */
./detect-bytetest.h:121: * \\todo The return seems backwards. We should return a non-zero error code. One of the error codes is "no match". As-is if someone accidentally does: if (DetectBytetestMatch(...)) { match }, then they catch an error as a match.
./detect-bytetest.h:66: * \\todo add support for no_stream and stream_only
./detect.c:11823: * Bug #611 */
./detect.c:12090:/** \\test bug #815 */
./detect.c:1967: /* hack: if we are in pass the entire flow mode, we need to still
./detect.c:4171: /* XXX fix this */
./detect.c:4690: /* setting it to default. You've gotta remove it once you fix the state table thing */
./detect.c:4900: * \\todo Support this. */
./detect.c:9147: // XXX TODO
./detect-content.h:74: * casting. \\todo check this and fix it if posssible */
./detect-dce-opnum.c:1268: /* todo chop the request frag length and change the
./detect-dce-opnum.c:727: /* todo chop the request frag length and change the
./detect-dce-stub-data.c:166: /* todo chop the request frag length and change the
./detect-dce-stub-data.c:722: /* todo chop the request frag length and change the
./detect-dns-query.c:108: * \\todo what should we return? Just the fact that we matched?
./detect-dns-query.c:1142: /** \\todo should not alert, bug #839
./detect-dsize.c:89: /* XXX */
./detect-engine-address.c:1162: * \\todo do the same for IPv6
./detect-engine-address.c:1270: /* XXX more ??? */
./detect-engine-address.c:1299: if (r == ADDRESS_EQ || r == ADDRESS_EB) { /* XXX more ??? */
./detect-engine-address.c:1648: * \\todo array should be ordered, so we can break out of the loop
./detect-engine-address.c:1682: * \\todo array should be ordered, so we can break out of the loop
./detect-engine-address.c:1787: /* XXX figure out a way to not need to do this ntohl if we switch to
./detect-engine-address.c:1877: /* XXX should we really do this check every time we run this function? */
./detect-engine-address.c:25: * \\todo Move this out of the detection plugin structure
./detect-engine-address.c:271: * XXX current sorting only works for overlapping nets
./detect-engine-address.c:508: /* XXX */
./detect-engine-address.c:556: * \\todo I think for the final section: while (cidr > 0), we can simply
.
/ d e t e c t−engine−a d d r e s s . c : 5 7 : ∗ \\ todo not MT s a f e ∗/ . / d e t e c t−engine−a d d r e s s . c : 6 3 7 : /∗ 1 . 2 . 3 . 4 / xxx format ( e i t h e r d o t t e d or c i d r n o t a t i o n ∗/ . / d e t e c t−engine−a d d r e s s . c : 8 9 0 : /∗ XXX cleanup ∗/ . / d e t e c t−engine−a d d r e s s . c : 8 9 8 : ∗ \\ todo We don ’ t seem t o be h a n d l i n g negated c a s e s , l i k e [ addr , ! [ ! addr , addr ] ] , . / d e t e c t−engine−a d d r e s s . c : 9 3 5 : " P l e a s e f i l e a bug r e p o r t on this .") ; . / d e t e c t−engine−address−i p v 4 . c : 1 3 4 : /∗ g e t a p l a c e t o temporary put s i g s l i s t s ∗/ . / d e t e c t−engine−address−i p v 6 . c : 3 8 1 : /∗ g e t a p l a c e t o temporary put s i g s l i s t s ∗/ . / d e t e c t−engine−a n a l y z e r . c : 7 6 6 : // todo : warning i f c o n t e n t i s weak , s e p a r a t e warning f o r p c r e + weak c o n t e n t . / d e t e c t−engine−a n a l y z e r . c : 8 6 8 : // todo : warning i f c o n t e n t i s weak , s e p a r a t e warning f o r p c r e + weak c o n t e n t . / d e t e c t−engine . c : 1 3 9 9 : /∗∗ \\ todo we s t i l l depend on t h e g l o b a l mpm_ctx here . / d e t e c t−engine . c : 1 6 1 6 : /∗∗ \\ todo g e t r i d o f t h i s s t a t i c ∗/ . / d e t e c t−engine . c : 1 7 1 1 : /∗ XXX ∗/ . / d e t e c t−engine . c :1820:/∗∗ TODO l o c k i n g ? Not needed i f t h i s i s a one time s e t t i n g a t s t a r t u p ∗/ . / d e t e c t−engine . c :3226:/∗∗ \\ t e s t bug 892 bad v a l u e s ∗/ . / d e t e c t−engine−c o n t e n t−i n s p e c t i o n . c : 1 2 5 : /∗ \\ todo u n i f y t h i s which i s phase 2 o f payload i n s p e c t i o n u n i f i c a t i o n ∗/ . / d e t e c t−engine−c o n t e n t−i n s p e c t i o n . c : 2 7 8 : /∗ \\ todo Add a n o t h e r o p t i m i z a t i o n here . I f cd−>c o n t e n t _ l e n i s . / d e t e c t−engine−hhd . 
c :3487:/∗∗ \\ t e s t r e a s s e m b l y bug where h e a d e r s with names o f l e n g t h 6 were . / d e t e c t−engine−i p o n l y . c :1886:/∗ \\ todo f i x i t . We have d i s a b l e d t h i s u n i t t e s t because 599 e x p o s e s 608 , . / d e t e c t−engine−i p o n l y . c : 1 8 8 7 : ∗ which i s why t h e s e u n i t t e s t s f a i l . When we f i x 608 , we need t o r e n a b l e . / d e t e c t−engine−i p o n l y . c :2039:/∗ \\ todo f i x i t . We have d i s a b l e d t h i s u n i t t e s t because 599 e x p o s e s 608 , . / d e t e c t−engine−i p o n l y . c : 2 0 4 0 : ∗ which i s why t h e s e u n i t t e s t s f a i l . When we f i x 608 , we need t o r e n a b l e . / d e t e c t−engine−i p o n l y . c :2294:/∗ \\ todo f i x i t . We have d i s a b l e d t h i s u n i t t e s t because 599 e x p o s e s 608 , . / d e t e c t−engine−i p o n l y . c : 2 2 9 5 : ∗ which i s why t h e s e u n i t t e s t s f a i l . When we f i x 608 , we need t o r e n a b l e . / d e t e c t−engine−i p o n l y . c :2304:/∗ \\ todo f i x i t . We have d i s a b l e d t h i s u n i t t e s t because 599 e x p o s e s 608 , . / d e t e c t−engine−i p o n l y . c : 2 3 0 5 : ∗ which i s why t h e s e u n i t t e s t s f a i l . When we f i x 608 , we need t o r e n a b l e . / d e t e c t−engine−i p o n l y . c : 8 9 1 : /∗ XXX : how a r e we going t o p r i n t t h e s t a t s now? ∗/ 115 CASEC . / d e t e c t−engine−mpm. c : 1 9 3 1 : ∗ \\ todo determine i f a c o n t e n t match can s e t t h e ’ s i n g l e ’ flag . / d e t e c t−engine−mpm. c : 1 9 3 2 : ∗ \\ todo do e r r o r c h e c k i n g . / d e t e c t−engine−mpm. c : 1 9 3 3 : ∗ \\ todo r e w r i t e t h e COPY s t u f f . / d e t e c t−engine−mpm. c : 6 3: / ∗ ∗ \\ todo make i t p o s s i b l e t o use m u l t i p l e p a t t e r n matcher algorithms next to . / d e t e c t−engine−payload . 
c : 3 0 2 : u i n t 8 _ t ∗ buf = ( u i n t 8 _ t ∗) " we need t o f i x t h i s and y e s f i x t h i s now " ; . / d e t e c t−engine−payload . c : 9 3 4 : ∗ \\ t e s t T e s t p c r e r e c u r s i v e matching − bug #529 . / d e t e c t−engine−payload . c : 9 4 : ∗ \\ todo we might a l s o p a s s t h e p a c k e t t o t h i s f u n c t i o n f o r the pktvar . / d e t e c t−engine−p o r t . c : 1 0 0 8 : ∗ \\ todo We don ’ t seem t o be h a n d l i n g negated c a s e s , l i k e [ port , ! [ ! port , p o r t ] ] , . / d e t e c t−engine−p o r t . c : 1 2 9 9 : i f ( r == PORT_EQ || r == PORT_EB) { /∗ XXX more ??? ∗/ . / d e t e c t−engine−p o r t . c : 1 4 5 1 : /∗ XXX b e t t e r i n p u t v a l i d a t i o n ∗/ . / d e t e c t−engine−p o r t . c : 1 9 6 : ∗ \\ todo XXX c u r r e n t s o r t i n g o n l y works f o r o v e r l a p p i n g ranges . / d e t e c t−engine−p o r t . c : 2 5 : ∗ \\ todo move t h i s out o f t h e d e t e c t i o n p l u g i n s t r u c t u r e . / d e t e c t−engine−p o r t . c : 2 6 : ∗ \\ todo more u n i t t e s t i n g . / d e t e c t−engine−p o r t . c : 3 5 9 : /∗ XXX ∗/ . / d e t e c t−engine−p o r t . c : 3 8 9 : /∗ g e t a p l a c e t o temporary put s i g s l i s t s ∗/ . / d e t e c t−engine−p r o t o . c : 1 3 4 : /∗∗ \\ todo a r e numeric p r o t o c o l s even v a l i d ? ∗/ . / d e t e c t−engine−p r o t o . c : 1 4 0 : // XXX . / d e t e c t−engine−p r o t o . c : 2 5 : ∗ \\ todo move t h i s out o f t h e d e t e c t i o n p l u g i n s t r u c t u r e . / d e t e c t−engine−s i g o r d e r . c :2130:/∗∗ \\ t e s t Bug 1061 ∗/ . / d e t e c t−f i l e s i z e . c : 7 1 : s i g m a t c h _ t a b l e [ DETECT_FILESIZE ] . f l a g s |= SIGMATCH_PAYLOAD ; /∗∗ XXX n e c e s s a r y ? ∗/ . / d e t e c t−f i l e s t o r e . c : 1 0 2 : /∗ XXX ∗/ . / d e t e c t−f i l e s t o r e . c : 2 6 9 : ∗ \\ todo when we s t a r t s u p p o r t i n g more p r o t o c o l s , t h e l o g i c i n this function . / d e t e c t−flow . 
c : 8 9 : /∗ XXX ∗/ . / d e t e c t−f t p b o u n c e . c : 2 2 1 : ∗ TODO: As a s u g g e s t i o n , maybe we can add a f l a g i n t h e flow . / d e t e c t−f t p b o u n c e . c : 6 2 : ∗ \\ todo add s u p p o r t f o r no_stream and s t r e a m _ o n l y . / d e t e c t−g e o i p . c : 5 0 : ∗ \\ todo add s u p p o r t f o r s r c _ o n l y and d s t _ o n l y . / d e t e c t−g e o i p . c : 7 3 : ∗ \\ todo add s u p p o r t f o r s r c _ o n l y and d s t _ o n l y . / d e t e c t . h :758:/∗∗ \\ todo review how many we a c t u a l l y need here ∗/ . / d e t e c t−h o s t b i t s . c : 3 6 1 : // TODO . / d e t e c t−h o s t b i t s . c : 5 6 :TODO: . / d e t e c t−h o s t b i t s . c : 7 1 0 : / ∗ TODO r e e n a b l e a f t e r both i s s u p p o r t e d . / d e t e c t−icmp−i d . c : 1 9 8 : /∗∗ \\ todo can B y t e E x t r a c t S t r i n g U i n t 1 6 do t h i s ? ∗/ . / d e t e c t−i p o p t s . c : 1 1 9 : /∗ Loop through i n s t e a d o f u s i n g o_xxx d i r e c t a c c e s s f i e l d s so that . / d e t e c t−i s d a t a a t . c : 1 0 2 : ∗ \\ todo We need t o add s u p p o r t f o r rawbytes . / d e t e c t−i s d a t a a t . c : 9 6 : /∗ XXX ∗/ . / d e t e c t−l u a . c : 7 1 4 : /∗ h a c k i s h , needed t o a l l o w u n i t t e s t s t o p a s s b u f f e r s as s c r i p t s i n s t e a d o f f i l e s ∗/ . / d e t e c t−l u a . c : 8 0 5 : /∗ h a c k i s h , needed t o a l l o w u n i t t e s t s t o p a s s b u f f e r s as s c r i p t s i n s t e a d o f f i l e s ∗/ . / d e t e c t−metadata . c : 2 5 : ∗ \\ todo Do we need t o do a n y t h i n g more t h i s i s used i n s n o r t host a t t r i b u t e table . / d e t e c t−msg . c : 6 5 : /∗ XXX do t h i s p a r s i n g i n a b e t t e r way ∗/ . / d e t e c t−msg . c : 7 0 : // p r i n t f ( " DetectMsgSetup : format hack a p p l i e d : \\ ’\% s \ \ ’ \ \ n " , str ) ; . / d e t e c t−p a r s e . c :2000:/∗∗ \\ t e s t P a r s i n g bug debugging a t 2010−03−18 ∗/ . 
/ d e t e c t−p a r s e . c :2396:/∗∗ \\ t e s t s i d v a l u e too l a r g e . Bug #779 ∗/ . / d e t e c t−p a r s e . c :2415:/∗∗ \\ t e s t g i d v a l u e too l a r g e . R e l a t e d t o bug #779 ∗/ . / d e t e c t−p a r s e . c :2434:/∗∗ \\ t e s t r e v v a l u e too l a r g e . R e l a t e d t o bug #779 ∗/ . / d e t e c t−p a r s e . c : 3 2 0 1 : ∗ \\ t e s t check v a l i d n e g a t i o n bug 1079 . / d e t e c t−p a r s e . c : 3 6 4 : /∗ as t h i s i s a bug we should a b o r t t o e a s e debugging ∗/ . / d e t e c t−p a r s e . c : 6 4 6 : /∗ XXX VJ e x c l u d e h a n d l i n g t h i s f o r none UDP/TCP proto ’ s ∗/ . / d e t e c t−p c r e . c : 1 6 4 : /∗ XXX ∗/ . / d e t e c t−p c r e . c :1666:/∗∗ \\ t e s t Bug 1098 ∗/ . / d e t e c t−rawbytes . c : 2 5 : ∗ \\ todo P r o v i d e un−normalized t e l n e t dce / r p c b u f f e r s t o match on . / d e t e c t−r p c . c : 5 5 3 : s = s−>n e x t = S i g I n i t ( de_ctx , " a l e r t udp any any −> any any ( msg : \ \ " RPC Get XXX C a l l . . no match \ \ " ; r p c :123456 , ∗ , 3 ; s i d : 5 ; ) " ) ; . / d e t e c t−r p c . c : 8 9 : /∗ XXX ∗/ . / d e t e c t−sameip . c : 4 7 : ∗ \\ todo add s u p p o r t f o r no_stream and s t r e a m _ o n l y . / d e t e c t−t a g . c : 2 3 8 : /∗ TODO: l o a d DETECT_TAG_MAX_PKTS from c o n f i g ∗/ . / d e t e c t−t a g . c : 9 7 : /∗ XXX ∗/ . / d e t e c t−t a g . h : 4 1 : ∗ TODO: l o a d i t from c o n f i g ( v a r t a g g e d _ p a c k e t _ l i m i t ) ∗/ . / d e t e c t−u r i c o n t e n t . c : 2 4 5 : ∗ \\ todo what should we r e t u r n ? J u s t t h e f a c t t h a t we matched ? . / d e t e c t−window . c : 9 2 : /∗ XXX ∗/ . / d e t e c t−w i t h i n . c : 6 5 : ∗ \\ todo a p p l y t o u r i c o n t e n t . / d e t e c t−x b i t s . c : 2 5 4 : // TODO . / flow−b i t . c : 2 7 : ∗ \\ todo move away from a l i n k e d l i s t i mple ment ati on . / flow−b i t . 
c : 2 8 : ∗ \\ todo use d i f f e r e n t d a t a t y p e s , such as s t r i n g , i n t , e t c . . / flow−b i t . c : 2 9 : ∗ \\ todo have more than one i n s t a n c e o f t h e same var , and be a b l e t o match on a 116 CASEC . / flow . c : 2 3 4 : ∗ \\ todo we can ’ t r e s t o r e t h e l a s t t s . / flow . c : 8 7 8 : ∗ \\TODO handle UDP . / flow . c : 8 9 9 : /∗ todo : handle p a s s c a s e ( a l s o f o r UDP ! ) ∗/ . / flow−manager . c : 1 0 3 : ∗ \\ todo Kinda h a c k i s h s i n c e i t u s e s t h e t v name t o i d e n t i f y flow manager . / flow−manager . c : 9 1 5 : ∗ \\ todo Kinda h a c k i s h s i n c e i t u s e s t h e t v name t o i d e n t i f y flow recycler . / flow−t i m e o u t . c : 5 4 2 : /∗ \\ todo A l s o s k i p f l o w s t h a t shouldn ’ t be i n s p e c t e d ∗/ . / flow−u t i l . c : 1 4 6 : /∗ XXX handle d e f a u l t ∗/ . / flow−u t i l . c : 1 5 2 : i f ( p−>t c p h != NULL) { /∗ XXX MACRO ∗/ . / flow−u t i l . c : 1 5 5 : } e l s e i f ( p−>udph != NULL) { /∗ XXX MACRO ∗/ . / flow−u t i l . c : 1 6 4 : } e l s e i f ( p−>s c t p h != NULL) { /∗ XXX MACRO ∗/ . / flow−u t i l . c : 1 6 7 : } /∗ XXX handle d e f a u l t ∗/ . / host−b i t . c : 2 7 : ∗ \\ todo move away from a l i n k e d l i s t i mple ment ati on . / host−b i t . c : 2 8 : ∗ \\ todo use d i f f e r e n t d a t a t y p e s , such as s t r i n g , i n t , e t c . . / i p p a i r −b i t . c : 2 7 : ∗ \\ todo move away from a l i n k e d l i s t i mple ment ati on . / i p p a i r −b i t . c : 2 8 : ∗ \\ todo use d i f f e r e n t d a t a t y p e s , such as s t r i n g , i n t , e t c . . / log−pcap . c : 4 0 2 : /∗ XXX pcap handles , nfq , p f r i n g , can o n l y have one l i n k t y p e ipfw ? we do . / log−pcap . c : 4 4 5 : /∗ s e t t i n g s TODO move t o g l o b a l c f g s t r u c t ∗/ . / log−t l s l o g . c : 3 0 8 : /∗ todo : l o g i c t o l o g once ∗/ . / output−f i l e . c : 2 0 4 : / ∗ todo ∗/ BUG_ON( t s == NULL) ; . 
/ output−f i l e d a t a . c : 3 3 0 : / ∗ todo ∗/ BUG_ON( t s == NULL) ; . / output−flow . c : 1 5 0 : / ∗ todo ∗/ BUG_ON( t s == NULL) ; . / output−j s o n−drop . c : 8 9 : j s o n _ t ∗ j s = CreateJSONHeader ( ( P a c k e t ∗)p , 0 , " drop " ) ; / /TODO const . / output−j s o n−f i l e . c : 1 5 9 : /∗ o r i g i n a l l y j u s t ’ f i l e ’ , but due t o bug 1127 naming i t f i l e i n f o ∗/ . / output−j s o n−f i l e . c : 8 7 : j s o n _ t ∗ j s = CreateJSONHeader ( ( P a c k e t ∗)p , 0 , " f i l e i n f o " ) ; //TODO c o n s t . / output−j s o n−flow . c :113:# i f 0 // TODO . / output−j s o n−flow . c : 3 2 1 : j s o n _ t ∗ j s = CreateJSONHeaderFromFlow ( f , " flow " ) ; //TODO const . / output−j s o n−flow . c : 4 2 6 : a f t −>f l o w l o g _ c t x = ( ( OutputCtx ∗) i n i t d a t a )−>data ; //TODO . / output−j s o n−h t t p . c : 3 7 6 : j s o n _ t ∗ j s = CreateJSONHeaderWithTxId ( ( P a c k e t ∗)p , 1 , " h t t p " , t x _ i d ) ; //TODO c o n s t . / output−j s o n−h t t p . c : 5 5 6 : a f t −>h t t p l o g _ c t x = ( ( OutputCtx ∗) i n i t d a t a )−>data ; //TODO . / output−j s o n−n e t f l o w . c :122:# i f 0 // TODO . / output−j s o n−n e t f l o w . c : 2 9 6 : j s o n _ t ∗ j s = CreateJSONHeaderFromFlow ( f , " n e t f l o w " , 0) ; //TODO c o n s t . / output−j s o n−n e t f l o w . c : 3 0 7 : j s = CreateJSONHeaderFromFlow ( f , " n e t f l o w " , 1) ; //TODO const . / output−j s o n−n e t f l o w . c : 4 0 8 : a f t −>f l o w l o g _ c t x = ( ( OutputCtx ∗) i n i t d a t a )−>data ; //TODO . / output−j s o n−s s h . c : 1 1 7 : j s o n _ t ∗ j s = CreateJSONHeader ( ( P a c k e t ∗)p , 1 , " s s h " ) ; / /TODO . / output−j s o n−s s h . c : 3 1 0 : /∗ todo : l o g i c t o l o g once ∗/ . / output−j s o n−t l s . c : 1 5 3 : j s o n _ t ∗ j s = CreateJSONHeader ( ( P a c k e t ∗)p , 0 , " t l s " ) ; / /TODO . / output−j s o n−t l s . c : 3 7 0 : /∗ todo : l o g i c t o l o g once ∗/ . / output−l u a . 
c : 4 5 5 : ∗ TODO non−h t t p s u p p o r t . / output−l u a . c : 4 7 2 : ∗ TODO hardcoded t o HTTP c u r r e n t l y ∗/ . / output−l u a . c : 6 1 0 : /∗ h a c k i s h , needed t o a l l o w u n i t t e s t s t o p a s s b u f f e r s as s c r i p t s i n s t e a d o f f i l e s ∗/ . / output−l u a . c : 7 5 5 : /∗ h a c k i s h , needed t o a l l o w u n i t t e s t s t o p a s s b u f f e r s as s c r i p t s i n s t e a d o f f i l e s ∗/ . / output−p a c k e t . c : 1 4 5 : / ∗ todo ∗/ BUG_ON( t s == NULL) ; . / output−s t a t s . c : 1 4 0 : / ∗ todo ∗/ BUG_ON( t s == NULL) ; . / output−s t r e a m i n g . c : 1 6 0 : f o r ( t x _ i d = 0 ; t x _ i d < t o t a l _ t x s ; t x _ i d++) { // TODO optimization store log tx . / output−s t r e a m i n g . c : 3 7 8 : / ∗ todo ∗/ BUG_ON( t s == NULL) ; . / output−t x . c : 2 1 5 : / ∗ todo ∗/ BUG_ON( t s == NULL) ; . / pkt−v a r . c : 2 5 : ∗ \\ todo move away from a l i n k e d l i s t i mple men tati on . / pkt−v a r . c : 2 6 : ∗ \\ todo use d i f f e r e n t d a t a t y p e s , such as s t r i n g , i n t , e t c . . / pkt−v a r . c : 2 7 : ∗ \\ todo have more than one i n s t a n c e o f t h e same var , and be a b l e t o match on a . / queue . h : 3 3 8 : / ∗ XXX ∗/ . / r e p u t a t i o n . h : 9 2 : / /TODO: Add a timestamp here t o know t h e l a s t update o f t h i s reputation . . / respond−r e j e c t . c : 2 5 : ∗ \\ todo RespondRejectFunc r e t u r n s 1 on e r r o r , 0 on ok . . . why? For now i t should . / respond−r e j e c t . c : 2 6 : ∗ j u s t r e t u r n 0 always , e r r o r h a n d l i n g i s a TODO i n t h e t h r e a d i n g model ( VJ ) . / respond−r e j e c t −l i b n e t 1 1 . c : 1 4 7 : /∗ TODO come up with t t l c a l c f u n c t i o n ∗/ . / respond−r e j e c t −l i b n e t 1 1 . c : 2 3 9 : /∗ TODO come up with t t l c a l c f u n c t i o n ∗/ . / respond−r e j e c t −l i b n e t 1 1 . c : 2 7 : ∗ \\ todo c a l c u l a t e TTL base on a v e r a g e from stream tracking . 
/ respond−r e j e c t −l i b n e t 1 1 . c : 2 8 : ∗ \\ todo come up with a way f o r u s e r s t o s p e c i f y icmp unreachable type . / respond−r e j e c t −l i b n e t 1 1 . c : 2 9 : ∗ \\ todo P o s s i b l y d e f a u l t t o p o r t u n r e a c h a b l e f o r UDP t r a f f i c t h i s seems . / respond−r e j e c t −l i b n e t 1 1 . c : 3 1 : ∗ \\ todo implement i p v 6 r e s e t s . / respond−r e j e c t −l i b n e t 1 1 . c : 3 2 : ∗ \\ todo implement pre−a l l o c r e s e t s f o r speed 117 CASEC . / respond−r e j e c t −l i b n e t 1 1 . c : 3 5 9 : /∗ TODO come up with t t l c a l c f u n c t i o n ∗/ . / respond−r e j e c t −l i b n e t 1 1 . c : 4 5 1 : /∗ TODO come up with t t l c a l c f u n c t i o n ∗/ . / runmodes . c : 7 7 3 : // TODO i f module == parent , f i n d i t ’ s c h i l d r e n . / runmodes . c : 8 2 3 : BUG_ON( s c r i p t s == NULL) ; //TODO . / runmode−unix−s o c k e t . c : 5 0 6 : // TODO cleanup . / runmode−unix−s o c k e t . c : 5 8 8 : // TODO cleanup . / runmode−unix−s o c k e t . c : 6 6 3 : // TODO cleanup . / runmode−unix−s o c k e t . c : 7 4 0 : // TODO cleanup . / runmode−unix−s o c k e t . c : 7 9 1 : // TODO cleanup . / source−af−p a c k e t . c : 1 1 9 : ∗ \\ todo U n i t t e s t s a r e needed f o r t h i s module . . / source−af−p a c k e t . c : 1 6 5 9 : ∗ \\ todo C r e a t e a g e n e r a l AFP s e t u p f u n c t i o n . . / source−af−p a c k e t . c : 1 8 3 1 : /∗ XXX HACK: flow t i m e o u t can c a l l us f o r i n j e c t e d pseudo packets . / source−af−p a c k e t . c : 2 6 6 : ∗ \\ todo U n i t t e s t s a r e needed f o r t h i s module . . / source−af−p a c k e t . c : 3 1 : ∗ \\ todo watch o t h e r i n t e r f a c e e v e n t t o d e t e c t s u p p r e s s i o n o f t h e monitored . / source−af−p a c k e t . c : 4 8 7 : ∗ \\ todo U n i t t e s t s a r e needed f o r t h i s module . . / source−af−p a c k e t . 
c : 5 3 4 : /∗ XXX should t r y t o use read t h a t g e t d i r e c t l y t o p a c k e t ∗/ . / source−af−p a c k e t . c : 5 9 1 : /∗ XXX t h i s i s m i n i m a l i s t , but t h i s seems enough ∗/ . / source−af−p a c k e t . c : 8 0 0 : /∗ XXX t h i s i s m i n i m a l i s t , but t h i s seems enough ∗/ . / source−e r f −dag . c : 6 1 8 : /∗ XXX HACK: flow t i m e o u t can c a l l us f o r i n j e c t e d pseudo packets . / source−e r f −f i l e . c : 2 8 1 : /∗ XXX HACK: flow t i m e o u t can c a l l us f o r i n j e c t e d pseudo packets . / source−ipfw . c : 1 6 5 : SC_CAP_NET_BROADCAST ; /∗∗ \\ todo u n t e s t e d ∗/ . / source−ipfw . c : 1 8 2 : SC_CAP_NET_BIND_SERVICE ; /∗∗ \\ todo u n t e s t e d ∗/ . / source−ipfw . c : 4 4 9 : /∗ XXX HACK: flow t i m e o u t can c a l l us f o r i n j e c t e d pseudo p a c k e t s . / source−ipfw . c : 5 8 6 : /∗∗ \\ todo For d i v e r t s o c k e t s , dropping means not w r i t i n g t h e p a c k e t back t o t h e s o c k e t . . / source−ipfw . c : 7 7 8 : ∗ T h i s f u n c t i o n i s temporary used as c o n f i g u r a t i o n p a r s e r . . / source−mpipe . c : 1 0 3 6 : /∗ XXX HACK: flow t i m e o u t can c a l l us f o r i n j e c t e d pseudo packets . / source−mpipe . c : 2 0 3 : // TODO: Check f o r dual mPipes . . / source−mpipe . c : 7 8 4 : // TODO − Save t h e r e s t o f t h e Huge Page f o r o t h e r a l l o c a t i o n s . . / source−napatech . c : 3 5 4 : /∗ XXX HACK: flow t i m e o u t can c a l l us f o r i n j e c t e d pseudo packets . / source−netmap . c : 1 0 3 3 : /∗ XXX HACK: flow t i m e o u t can c a l l us f o r i n j e c t e d pseudo packets . / source−nfq . c : 1 0 4 1 : /∗∗ \\ todo add a t e s t on v a l i d i t y o f t h e e n t r y NFQQueueVars c o ul d have been . / source−nfq . c : 1 2 2 9 : /∗ XXX HACK: flow t i m e o u t can c a l l us f o r i n j e c t e d pseudo p a c k e t s . / source−nfq . 
c : 1 8 1 : /∗ XXX c r e a t e a g e n e r a l NFQ s e t u p f u n c t i o n ∗/ . / source−nfq . c : 2 8 : ∗ \\ todo t e s t i f R e c e i v e and V e r d i c t i f both a r e p r e s e n t . / source−nfq . c : 8 5 2 : ∗ T h i s f u n c t i o n i s temporary used as c o n f i g u r a t i o n p a r s e r . . / source−nfq . c : 8 7 8 : /∗ XXX what happens on r v == 0? ∗/ . / source−pcap . c : 3 6 4 : ∗ \\ todo C r e a t e a g e n e r a l pcap s e t u p f u n c t i o n . . / source−pcap . c : 4 0 1 : /∗ XXX c r e a t e a g e n e r a l pcap s e t u p f u n c t i o n ∗/ . / source−pcap . c : 7 0 2 : /∗ XXX HACK: flow t i m e o u t can c a l l us f o r i n j e c t e d pseudo p a c k e t s . / source−pcap−f i l e . c : 4 1 1 : /∗ XXX HACK: flow t i m e o u t can c a l l us f o r i n j e c t e d pseudo packets . / source−p f r i n g . c : 1 2 3 : / ∗ XXX r e p l a c e with u s e r c o n f i g u r a b l e o p t i o n s ∗/ . / source−p f r i n g . c : 2 6 : ∗ \\ todo remove r e q u i r e m e n t f o r s e t t i n g c l u s t e r so o l d 3 . x v e r s i o n s are supported . / source−p f r i n g . c : 2 7 : ∗ \\ todo implement DNA s u p p o r t . / source−p f r i n g . c : 2 8 : ∗ \\ todo Allow r i n g o p t i o n s such as s n a p l e n e t c , t o be u s e r configurable . . / source−p f r i n g . c : 3 9 1 : ∗ \\ todo add a c o n f i g o p t i o n f o r s e t t i n g c l u s t e r i d . / source−p f r i n g . c : 3 9 2 : ∗ \\ todo C r e a t e a g e n e r a l p f r i n g s e t u p f u n c t i o n . . / source−p f r i n g . c : 6 1 3 : ∗ \\ todo V e r i f y t h a t PF_RING o n l y d e a l s with e t h e r n e t t r a f f i c . / source−p f r i n g . c : 6 2 3 : /∗ XXX HACK: flow t i m e o u t can c a l l us f o r i n j e c t e d pseudo packets . / stream . c : 1 8 5 : e x i t ( EXIT_FAILURE ) ; /∗ XXX ∗/ . / stream . c : 2 2 9 : ∗ \\ todo we may want t o c o n s i d e r non empty queue ’ s . / stream−t c p . c : 1 3 1 4 : /∗∗ \\ todo ∗/ . / stream−t c p . 
c : 1 4 4 2 : ∗ \\ todo improve r e s e t t i n g t h e s e s s i o n ∗/ . / stream−t c p . c : 1 4 9 6 : /∗∗ \\ todo check i f i t ’ s c o r r e c t or s e t e v e n t ∗/ . / stream−t c p . c : 2 6 : ∗ \\ todo − 4WHS: what i f a f t e r t h e 2nd SYN we t u r n out t o be normal 3WHS anyway? . / stream−t c p . c : 3 9 1 7 : /∗∗ \\ todo ∗/ . / stream−t c p . c : 4 0 4 4 : /∗∗ \\ todo ∗/ . / stream−t c p . c : 4 3 7 1 : ∗ See bug 1238. . / stream−t c p . c : 5 0 1 3 : // TODO r e s e t p a c k e t f l a g wrt flow : d i r e c t i o n , HAS_FLOW e t c . / stream−t c p . c : 5 1 4 9 : /∗ XXX ∗/ . / stream−tcp−i n l i n e . c : 1 3 3 : ∗ \\ todo What about r e a s s e m b l e d f r a g m e n t s ? . / stream−tcp−i n l i n e . c : 1 3 4 : ∗ \\ todo What about unwrapped t u n n e l p a c k e t s ? . / stream−tcp−i n l i n e . c : 1 4 9 : /∗∗ \\ todo review l o g i c ∗/ . / stream−tcp−r e a s s e m b l e . c : 2 1 6 3 : ∗ \\ todo t h i s f u n c t i o n i s too long , we need t o break i t up . I t needs i t BAD 118 CASEC . / stream−tcp−r e a s s e m b l e . c : 2 3 6 8 : XXX we need a s e t u p f u n c t i o n ∗/ . / stream−tcp−r e a s s e m b l e . c : 2 6 2 6 : ∗ TODO i f i n i t i a l data i s b i g enough f o r p r o t o d e t e c t , we co u l d do t h e . / stream−tcp−r e a s s e m b l e . c : 2 8 8 1 : ∗ \\ todo t h i s f u n c t i o n i s too long , we need t o break i t up . I t needs i t BAD . / stream−tcp−r e a s s e m b l e . c : 3 0 8 9 : u i n t 3 2 _ t s m s g _ o f f s e t ; // TODO d i f f with smsg−>d a t a _ l e n ? . / stream−tcp−r e a s s e m b l e . c : 3 1 7 8 : r e t u r n 1 ; // TODO . / stream−tcp−r e a s s e m b l e . c : 3 2 4 5 : XXX we need a s e t u p f u n c t i o n ∗/ . / stream−tcp−r e a s s e m b l e . c : 3 3 0 2 : ∗ \\ todo t h i s f u n c t i o n i s too long , we need t o break i t up . I t needs i t BAD . / stream−tcp−r e a s s e m b l e . c : 3 4 7 6 : ∗ \\ todo VJ We can remove t h e a b o r t ( ) s l a t e r . . 
/ stream−tcp−r e a s s e m b l e . c : 3 4 7 7 : ∗ \\ todo VJ Why not memcpy? . / stream−tcp−r e a s s e m b l e . c : 3 5 7 6 : ∗ \\ todo VJ wouldn ’ t a memcpy be more a p p r o p r i a t e here ? . / stream−tcp−r e a s s e m b l e . c :5867:/∗∗ \\ t e s t T e s t t h e bug 56 c o n d i t i o n ∗/ . / stream−tcp−r e a s s e m b l e . c :5936:/∗∗ \\ t e s t T e s t t h e bug 57 c o n d i t i o n ∗/ . / stream−tcp−r e a s s e m b l e . c :6004:/∗∗ \\ t e s t T e s t t h e bug 76 c o n d i t i o n ∗/ . / stream−tcp−r e a s s e m b l e . c : 6 1 9 : ∗ a hack though , we ’ r e going t o check n e x t how we end up with . / stream−tcp−r e a s s e m b l e . c : 6 6 8 : ∗ a hack though , we ’ r e going t o check n e x t how we end up with . / stream−tcp−r e a s s e m b l e . c : 8 7 3 1 : U t R e g i s t e r T e s t ( " StreamTcpReassembleTest32 −− Bug t e s t " , StreamTcpReassembleTest32 , 1) ; . / stream−tcp−r e a s s e m b l e . c : 8 7 3 2 : U t R e g i s t e r T e s t ( " StreamTcpReassembleTest33 −− Bug t e s t " , StreamTcpReassembleTest33 , 1) ; . / stream−tcp−r e a s s e m b l e . c : 8 7 3 3 : U t R e g i s t e r T e s t ( " StreamTcpReassembleTest34 −− Bug t e s t " , StreamTcpReassembleTest34 , 1) ; . / stream−tcp−s a c k . c : 2 6 9 : /∗∗ \\ todo need a m e t r i c t o a check f o r a r i g h t edge l i m i t ∗/ . / s u r i c a t a . c :2237: /∗∗ \\ todo we need an a p i f o r t h e s e ∗/ . / s u r i c a t a . c :2511: /∗∗ TODO t h i s can do i n t o i t ’ s own fun c ∗/ . / s u r i c a t a . c : 3 7 2 : / ∗ XXX hack : make s u r e t h r e a d s can s t o p t h e engine by c a l l i n g t h i s . / s u r i c a t a . h : 1 2 3 : ∗ XXX move t o t h e TmQueue s t r u c t u r e l a t e r . / t h r e a d v a r s . h : 4 4 : ∗ o f a hack f o r s o l v i n g stream−timeout−shutdown . I s s e t by t h e main t h r e a d . ∗/ . / tmqh−nfq . c : 4 5 : / ∗ XXX not s c a l i n g ∗/ . / tmqh−p a c k e t p o o l . c : 4 5 0 : /∗∗ \\ todo make t h i s a c a l l b a c k . 
/tm-threads.c:1036: /* XXX create separate function for this: allocate a thread container */
./tm-threads.c:116: * \todo Deal with post_pq for slots beyond the first.
./tm-threads.c:389: * \todo Only the first "slot" currently makes the "post_pq" available
./unix-manager.c:557: * \todo Kinda hackish since it uses the tv name to identify unix manager
./unix-manager.c:967: * \todo Kinda hackish since it uses the tv name to identify unix manager
./util-binsearch.c:25: * \todo replace this by a better algo
./util-byte.c:102: /** \todo Need standard return values */
./util-byte.c:123: /** \todo Need standard return values */
./util-byte.c:81: /** \todo Need standard return values */
./util-byte.h:266: /** \todo Need standard return values */
./util-byte.h:273: /** \todo Probably a more efficient way to do this. */
./util-classification-config.c:122: /* if it is not NULL, use the file descriptor. The hack so that we can
./util-coredump-config.c:78: /* todo: use the registry to get/set dump configuration */
./util-cpu.c:179: * \todo We'll have to deal with removig ticks from the extra cpuids inbetween
./util-daemon.c:112: /** \todo We should check if wie allow more than 1 instance
./util-daemon.h:27: /** \todo Adjust path */
./util-debug.c:1454: * to have a single bug here (not that you can afford to have a bug
./util-debug.c:178: * \todo syslog is thread-safe according to POSIX manual and glibc code, but we
./util-debug.c:503: return SC_ERR_LOG_FG_FILTER_MATCH; // bit hacky, but just return !0
./util-debug-filters.c:527: * \retval 1 Since it is a hack to get things working inside the macros
./util-decode-der.c:215: /* fix the length for unknown objects, else
./util-decode-mime.c:1137: // TODO
./util-error.c:25: * \todo Needs refining of the error codes. Renaming with a prefix of SC_ERR,
./util-file.c:85: /** \todo make this size configurable */
./util-hash-lookup3.c:870: uint8_t qqqq[] = "xxxThis is the time for all good men to come to the aid of their country...";
./util-logopenfile.c:551: /* TODO go async here? */
./util-logopenfile-tile.c:111: * TODO: Check errors
./util-logopenfile-tile.c:329: /* TODO: Need to count open files and close when reaches zero. */
./util-memcmp.h:445: /* TODO Check for already aligned cases. To optimize. */
./util-mem.h:25: * \todo Add wrappers for functions that allocate/free memory here.
./util-mpm-ac-bs.c:1432: /* \todo tried loop unrolling with register var, with no perf increase. Need
./util-mpm-ac-bs.c:1434: /* \todo Change it for stateful MPM. Supply the state using mpm_thread_ctx */
./util-mpm-ac-bs.c:36: * \todo - Do a proper analyis of our existing MPMs and suggest a good one based
./util-mpm-ac-bs.c:379: /* TODO figure out how we can be called multiple times for the same CTX with the same sid */
./util-mpm-ac-bs.c:449: /* \todo using it temporarily now during dev, since I have restricted
./util-mpm-ac-bs.c:591: "Fatal Error. Exiting. Please file a bug report on this");
./util-mpm-ac-bs.c:605: "Fatal Error. Exiting. Please file a bug report on this");
./util-mpm-ac-bs.c:630: "Fatal Error. Exiting. Please file a bug report on this"); \
./util-mpm-ac-bs.c:639: "file a bug report on this"), \
./util-mpm-ac.c:1312: /* \todo tried loop unrolling with register var, with no perf increase. Need
./util-mpm-ac.c:1314: /* \todo Change it for stateful MPM. Supply the state using mpm_thread_ctx */
./util-mpm-ac.c:1499:/* \todo Technically it's generic to all mpms, but since we use ac only, the
./util-mpm-ac.c:1761:/* \todos
./util-mpm-ac.c:1791:/* \todo Reduce offset buffer size. Probably a 100,000 entry would be sufficient. */
./util-mpm-ac.c:36: * \todo - Do a proper analyis of our existing MPMs and suggest a good one based
./util-mpm-ac.c:371: /* TODO figure out how we can be called multiple times for the same CTX with the same sid */
./util-mpm-ac.c:442: /* \todo using it temporarily now during dev, since I have restricted
./util-mpm-ac.c:662: "Fatal Error. Exiting. Please file a bug report on this");
./util-mpm-ac.c:676: "Fatal Error. Exiting. Please file a bug report on this");
./util-mpm-ac.c:701: "Fatal Error. Exiting. Please file a bug report on this"); \
./util-mpm-ac.c:710: "file a bug report on this"), \
./util-mpm-ac-cuda-kernel.cu:25: * \todo - This is a basic version of the kernel.
./util-mpm-ac-gfbs.c:1305: /* \todo Change it for stateful MPM. Supply the state using mpm_thread_ctx */
./util-mpm-ac-gfbs.c:1313: /* \todo tried loop unrolling with register var, with no perf increase. Need
./util-mpm-ac-gfbs.c:1440: /* \todo Change it for stateful MPM. Supply the state using mpm_thread_ctx */
./util-mpm-ac-gfbs.c:1447: /* \todo tried loop unrolling with register var, with no perf increase. Need
./util-mpm-ac-gfbs.c:35: * \todo - Do a proper analyis of our existing MPMs and suggest a good one based
./util-mpm-ac-gfbs.c:372: /* TODO figure out how we can be called multiple times for the same CTX with the same sid */
./util-mpm-ac-gfbs.c:442: /* \todo using it temporarily now during dev, since I have restricted
./util-mpm-ac-gfbs.c:584: "Fatal Error. Exiting. Please file a bug report on this");
./util-mpm-ac-gfbs.c:598: "Fatal Error. Exiting. Please file a bug report on this");
./util-mpm-ac-gfbs.c:623: "Fatal Error. Exiting. Please file a bug report on this"); \
./util-mpm-ac-gfbs.c:632: "file a bug report on this"), \
./util-mpm-ac-gfbs.c:642: * \todo Use a better way to find union of 2 sets.
./util-mpm-ac-tile.c:1083: /* TODO - Find better way to store this. */
./util-mpm-ac-tile.c:1174: /* TODO: Could be made more compact */
./util-mpm-ac-tile.c:395: /* Fix up translation table for uppercase */
./util-mpm-ac-tile.c:520: /* TODO figure out how we can be called multiple times for the same CTX with the same sid */
./util-mpm-ac-tile.c:53: * \todo - Do a proper analyis of our existing MPMs and suggest a good
./util-mpm-ac-tile.c:752: "Fatal Error. Exiting. Please file a bug report on this");
./util-mpm-ac-tile.c:764: "Fatal Error. Exiting. Please file a bug report on this");
./util-mpm-b2g.c:30: * \todo Try to get the S0 calculation right.
./util-mpm-b2g.c:394: /* TODO figure out how we can be called multiple times for the
./util-mpm-b2g.h:55: uint16_t len; /**< \todo we're limited to 32/64 byte lengths, uint8_t would be fine here */
./util-mpm-b3g.c:30: * \todo Try to get the S0 calculation right.
./util-mpm-b3g.c:341: /* TODO figure out how we can be called multiple times for the
./util-mpm-b3g.c:76: /** \todo XXX Unused??? */
./util-mpm.c:433: pmq->pattern_id_array_size = 32; /* Intial size, TODO Make this configure option */
./util-mpm.c:453: pmq->rule_id_array_size = 128; /* Initial size, TODO: Make configure option. */
./util-mpm.c:592: /** \todo now set merged flag? */
./util-mpm.c:601: * \todo memset is expensive, but we need it as we merge pmq's. We might use
./util-mpm.c:614: /* TODO: Realloc the rule id array smaller at some size? */
./util-mpm-wumanber.c:2351:/** \todo VJ disabled because it tests the old match storage */
./util-mpm-wumanber.c:29: * \todo make hash1 a array of ptr and get rid of the flag field in the
./util-mpm-wumanber.c:31: * \todo remove exit() calls
./util-mpm-wumanber.c:32: * \todo only calc prefixci_buf for nocase patterns? -- would be in a
./util-mpm-wumanber.c:34: * \todo make sure runtime counters can be disabled (at compile time)
./util-mpm-wumanber.c:387: /* TODO figure out how we can be called multiple times for the same CTX with the same sid */
./util-mpm-wumanber.c:693: /* TODO VJ these values are chosen pretty much randomly, so
./util-reference-config.c:120: /* if it is not NULL, use the file descriptor. The hack so that we can
./util-rohash.c:30: * \todo a bloomfilter in the ROHashTableOffsets could possibly prevent
./util-rohash.c:33: * \todo maybe add a user ctx to be returned instead, something like a
./util-runmodes.c:466: /* \todo Set threads number in config to 1 */
./util-storage.c:204: * \todo we could return -1 when registration isn't closed yet, however
./util-threshold-config.c:28: * \todo Need to support suppress
./util-threshold-config.c:377: "sid > 0 and gid == 0. Please fix this"
./util-threshold-config.c:576: "sid > 0 and gid == 0. Please fix this"
./util-threshold-config.c:67: /* TODO: "apply_to" */
./util-threshold-config.c:880: /* TODO: implement option "apply_to" */
./util-unittest-helper.c:233: /* TODO: Add more protocols */
./util-unittest-helper.c:427: /* TODO: Add more protocols */

D.1.1 Audit: source-nfq.c

Rough audit findings

Remove the comment on line 878. It is no longer an issue.

Note: check threading in NFQQueueVars (NFQVerdictCacheFlush, NFQVerdictCacheLen). There should possibly be a mutex lock on line 898.

nfq_set_verdict_mark is used on lines 1115, 1119, 1147 and 1151. "This function is deprecated since it is broken, its use is highly discouraged. Please, use nfq_set_verdict2 instead." [81] It is, however, only used if the function that replaces it is not available on the machine running Suricata.

nfq_set_verdict_batch and nfq_set_verdict_batch2 are deprecated with no further information. They will also fail silently on kernel 3.1.

We might need a mutex on line 318, since qh is accessed from multiple threads.

E Static Code Analysis Results

Bug 1. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1655 Conclusion: True positive.
This is a true positive. The variable input_len is set to 0 and the function returns shortly after, leaving the assigned value unused. As far as we can tell, input_len is not used outside the scope of this function, so it should not matter that it is set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1655.

Bug 2. Type: Dead store. Dead assignment.
File: app-layer-dcerpc.c Function: DCERPCParser Line: 1723 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "parsed" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1723.

Bug 3. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1877 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "parsed" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1877.

Bug 4. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1581 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "parsed" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1581.

Bug 5. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1724 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "input_len" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1724.

Bug 6. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1636 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "input_len" variable, also set to 0.
The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1636.

Bug 7. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1681 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "parsed" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1681.

Bug 8. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1608 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "parsed" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1608.

Bug 9. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1701 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "input_len" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1701.

Bug 10. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1682 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "input_len" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it.
The remediating fix would, however, be to remove the assignment on line 1682.

Bug 11. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1635 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "parsed" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1635.

Bug 12. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1785 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "parsed" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1785.

Bug 13. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1582 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "input_len" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1582.

Bug 14. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1761 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "parsed" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1761.

Bug 15. Type: Dead store. Dead assignment.
File: app-layer-dcerpc.c Function: DCERPCParser Line: 1609 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "input_len" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1609.

Bug 16. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1845 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "input_len" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1845.

Bug 17. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1787 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "input_len" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1787.

Bug 18. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1762 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "input_len" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1762.

Bug 19. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1654 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "parsed" variable, also set to 0.
The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1654.

Bug 20. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1803 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "input_len" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1803.

Bug 21. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1555 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "parsed" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1555.

Bug 22. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1878 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "input_len" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1878.

Bug 23. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1844 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "parsed" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it.
The remediating fix would, however, be to remove the assignment on line 1844.

Bug 24. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1556 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "input_len" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1556.

Bug 26. Type: Dead store. Dead assignment. File: app-layer-dcerpc.c Function: DCERPCParser Line: 1802 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "parsed" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1802.

Bug 27. Type: Dead store. Dead assignment. File: util-mpm-ac-bs.c Function: SCACBSSearch Line: 1278 Conclusion: True positive.
The variable "state" is set to 0 on line 1278. It is then assigned in a "while" loop on line 1282. If the code does not enter the while loop, the variable is assigned on line 1290 either way. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1278.

Bug 28. Type: Dead store. Dead assignment. File: util-mpm-ac-bs.c Function: SCACBSCreateModDeltaTable Line: 723 Conclusion: True positive.
The variable "state" is set to 0 on line 723. It is then used in a for loop on line 726, though it is initialized to 0 in the for loop as well. This makes the assignment on line 723 redundant. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it.
The remediating fix would, however, be to remove the assignment on line 723.

Bug 29. Type: Dead store. Dead assignment. File: util-mpm-ac-bs.c Function: SCACBSCreateModDeltaTable Line: 798 Conclusion: True positive.
This is another true positive similar to bug 28 above, except that the assignment is on line 798. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 798.

Bug 30. Type: Dead store. Dead assignment. File: util-mpm-ac-bs.c Function: SCACBSSearch Line: 1196 Conclusion: True positive.
The variable "state" is set to 0 on line 1196, but the value is never read before another value is assigned to the variable. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1196.

Bug 31. Type: Dead store. Dead increment. File: app-layer-smb.c Function: SMBParseByteCount Line: 836 Conclusion: True positive.
This is another true positive where a variable called "input_len" is set, but not used. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 836.

Bug 32. Type: Dead store. Dead increment. File: app-layer-smb.c Function: SMBParseByteCount Line: 835 Conclusion: True positive.
This is another true positive where a variable called "parsed" is set, but not used. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 835.

Bug 33. Type: Dead store. Dead assignment. File: app-layer-smb.c Function: DataParser Line: 680 Conclusion: True positive.
This is another true positive where a variable called "parsed" is set, but not used. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 680.

Bug 34. Type: Dead store. Dead increment. File: app-layer-ssl.c Function: SSLv3ParseHandshakeType Line: 309 Conclusion: True positive.
This is another true positive similar to bug 1, only this time it refers to the "parsed" variable, also set to 0. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 309.

Bug 35. Type: Dead store. Dead increment. File: app-layer-ssl.c Function: SSLv3Decode Line: 1048 Conclusion: True positive.
This is another true positive where a variable called "input_len" is set, but not used. The only effect of this bug is an extra variable assignment, and in our opinion it is not worth the developers' time to remove it. The remediating fix would, however, be to remove the assignment on line 1048.

Bug 36. Type: Logic error. Dereference of null pointer. File: app-layer-dns-common.c Function: DNSStoreAnswerInState Line: 595 Conclusion: False positive.
This bug refers to a DNSTransaction pointer being used while pointing to NULL. The function DNSTransactionAlloc is, however, run on line 592, and the resulting allocation is assigned to the tx pointer. There is also an if check on the following line that tests whether tx is NULL; the function returns immediately if this is the case. The offending use of the pointer comes after the if check, so we have concluded that this is a false positive.

Bug 37. Type: Unix API. Undefined allocation of 0 bytes. File: detect-engine-mpm.c Function: DetectSetFastPatternAndItsId Line: 1345 Conclusion: True positive.
Probability: 2 Consequence: 2 Risk score: 4
Two values cause this issue: struct_total_size and content_total_size. Both are set to 0 at the start of the function and get their actual values in the for loop on line 1335. The precondition for this bug is that sig_list in the DetectEngineCtx struct is set to NULL. If this is the case, the program flow does not enter the for loop, leaving the total_size variables at 0. The SCMalloc on line 1345 causes the actual bug, because it allocates a number of bytes equal to sizeof(uint8_t) * (struct_total_size + content_total_size). In our case this would be an allocation of sizeof(uint8_t) * 0, which results in an allocation of 0 bytes. We recommend adding the following code above line 1344.
if((struct_total_size != 0) || (content_total_size != 0))

Bug 38. Type: Unix API. Undefined allocation of 0 bytes. File: util-radix-tree.c Function: SCRadixAddKey Line: 749 Conclusion: True positive. Probability: 2 Consequence: 2 Risk score: 4
This is another bug where arithmetic in an SCMalloc call makes an undefined allocation of 0 bytes possible. The SCMalloc call allocates sizeof(uint8_t) * (node->netmask_cnt - i). node->netmask_cnt appears to have a constant value, while "i" is set in a for loop on line 744. "i" is in fact guaranteed to be equal to node->netmask_cnt, causing the undefined allocation, if the "if" statement on line 745 in the "for" loop never triggers. This is hard to determine because the "if" condition is based on arcane logic in the radix tree generation function. It is quite likely that the "if" statement is designed to always trigger, since if it does not, "i" always equals node->netmask_cnt and the allocation bug always happens. We cannot, however, guarantee this, and we therefore recommend that the following code is added above line 749.
if(i == node->netmask_cnt) return -1;

Bug 39. Type: Unix API.
Undefined allocation of 0 bytes. File: util-pool.c Function: PoolInit Line: 166 Conclusion: True positive. Probability: 1 Consequence: 2 Risk score: 2
This bug requires the argument elt_size for PoolInit to be set to 0. This is not the case for any PoolInit call in the Suricata source code as of yet. If elt_size were set to 0, however, an undefined allocation of 0 bytes could be made. We recommend changing line 165 to the following to fix this bug.
} else if (elt_size > 0) {

Bug 40. Type: Memory error. Use-after-free. File: util-rohash.c Function: ROHashInitFinalize Line: 243 Conclusion: False positive.
This is a false positive. Clang fails to interpret the TAILQ_FIRST macro on line 242 correctly. The macro sets the item variable, which clang-analyzer believes is used after it is freed, to a new value on each iteration of the while loop. Thus item holds a new value each time it is used.

Bug 41. Type: Memory error. Use-after-free. File: app-layer-dns-common.c Function: DNSTransactionFree Line: 299 Conclusion: False positive.
This is a false positive. Clang fails to interpret the TAILQ_FIRST macro on line 298 correctly. The macro sets the "a" variable, which clang-analyzer believes is used after it is freed, to a new value on each iteration of the while loop. Thus "a" holds a new value each time it is used.

Bug 42. Type: Memory error. Use-after-free. File: app-layer-dns-common.c Function: DNSStateFree Line: 432 Conclusion: False positive.
This is a false positive. Clang fails to interpret the TAILQ_FIRST macro on line 431 correctly. The macro sets the tx variable, which clang-analyzer believes is used after it is freed, to a new value on each iteration of the while loop. Thus tx holds a new value each time it is used.

Bug 43. Type: Memory error. Use-after-free. File: app-layer-dns-common.c Function: DNSTransactionFree Line: 304 Conclusion: False positive.
This is a false positive. Clang fails to interpret the TAILQ_FIRST macro on line 303 correctly.
The macro sets the "a" variable, which clang-analyzer believes is used after it is freed, to a new value on each iteration of the while loop. Thus "a" holds a new value each time it is used.

Bug 44. Type: Memory error. Use-after-free. File: app-layer-dns-common.c Function: DNSTransactionFree Line: 292 Conclusion: False positive.
This is a false positive. Clang fails to interpret the TAILQ_FIRST macro on line 291 correctly. The macro sets the "q" variable, which clang-analyzer believes is used after it is freed, to a new value on each iteration of the while loop. Thus "q" holds a new value each time it is used.

Bug 45. Type: Memory error. Use-after-free. File: unix-manager.c Function: UnixCommandRun Line: 545 Conclusion: True positive. Probability: 2 Consequence: 1 Risk score: 2
If UnixCommandRun receives a command that is longer than what is allowed, it closes the connection to the client and frees all related information. The program flow does not break when this happens, so after the information is freed it continues with the command execution flow. Finally it tries to access the freed memory, making this a true positive. Practical testing of this bug reveals that the input is truncated to the size of the buffer by the recv system call that retrieves the remote command. This means that the terminating characters of the string are removed in most cases, causing the JSON parsing of the command to fail and stopping this bug from triggering. The bug can only be exploited if someone sends a command that is longer than the buffer size but contains a terminating character before the truncation point. Practical tests show that Suricata survives receiving a command that meets the prerequisites for the bug, probably because the JSON library responsible for the function that accesses the freed memory does proper checks. We recommend adding the following after line 543.
else {
And an extra closing bracket at the end of the function.

Bug 46. Type: Memory error. Use-after-free.
File: app-layer-smtp.c Function: SMTPTransactionFree Line: 1378 Conclusion: False positive.
This is a false positive. Clang fails to interpret the TAILQ_FIRST macro on line 1378 correctly. The macro sets the str variable, which clang-analyzer believes is used after it is freed, to a new value on each iteration of the while loop. Thus str holds a new value each time it is used.

Bug 47. Type: Memory error. Use-after-free. File: app-layer-smtp.c Function: SMTPStateFree Line: 1416 Conclusion: False positive.
This is a false positive. Clang fails to interpret the TAILQ_FIRST macro on line 1415 correctly. The macro sets the tx variable, which clang-analyzer believes is used after it is freed, to a new value on each iteration of the while loop. Thus tx holds a new value each time it is used.

Bug 48. Type: Memory error. Use-after-free. File: app-layer-dcerpc.c Function: DCERPCUuidListFree Line: 1967 Conclusion: False positive.
This is a false positive. Clang fails to interpret the TAILQ_FIRST macro on line 1966 correctly. The macro sets the "entry" variable, which clang-analyzer believes is used after it is freed, to a new value on each iteration of the while loop. Thus "entry" holds a new value each time it is used.

Bug 49. Type: Memory error. Use-after-free. File: util-decode-mime.c Function: MimeDecFreeEntity Line: 194 Conclusion: False positive.
This function loops through a linked list of MimeDecEntity structs, freeing each. There is an if statement that checks whether the entity before this one in the list was not freed, so that parts of the linked list are not accidentally skipped. The clang-analyzer, however, first assumes that one MimeDecEntity is freed, and then in the next loop iteration assumes that the child of the current MimeDecEntity is not NULL, which it is, since it was freed just before. That makes this a false positive.

Bug 50. Type: Memory error. Use-after-free. File: detect-engine-address.c Function: DetectAddressMergeNot Line: 1222 Conclusion: False positive.
This is another case of a false positive. We are looping through a linked list and checking whether the next element is NULL. For the reported path to occur, clang-analyzer requires an if check in the loop body, which tests whether the next element is NULL, to be true and the loop to run another iteration anyway. This could only happen if an element in the linked list were skipped, which as far as we can tell is not possible.

Bug 51. Type: Memory error. Use-after-free.
File: runmodes.c Function: RanOutputFreeList Line: 445 Conclusion: False positive.
This is a false positive. Clang fails to interpret the TAILQ_FIRST macro on line 442 correctly. The macro gives the "output" variable, which clang-analyzer believes is used after it is freed, a new value on each iteration of the while loop. Thus "output" has been set to a new value before it is used again.

Bug 52. Type: Memory error. Use-after-free.
File: runmodes.c Function: RunModeShutDown Line: 488 Conclusion: False positive.
This is a false positive. Clang fails to interpret the TAILQ_FIRST macro on line 486 correctly. The macro gives the "output" variable, which clang-analyzer believes is used after it is freed, a new value on each iteration of the while loop. Thus "output" has been set to a new value before it is used again.

Bug 53. Type: Memory error. Use-after-free.
File: util_var.c Function: CleanVariableResolveList Line: 167 Conclusion: False positive.
This is a false positive. Clang fails to interpret the TAILQ_FIRST macro on line 166 correctly. The macro gives the "p_item" variable, which clang-analyzer believes is used after it is freed, a new value on each iteration of the while loop. Thus "p_item" has been set to a new value before it is used again.

Bug 54. Type: Memory error. Use-after-free.
File: detect-engine-port.c Function: DetectPortParseMergeNotPorts Line: 1262 Conclusion: False positive.
A NULL check is done on the variable before the for loop is entered, so there should be no scenario where the code tries to use ag2 after it has been passed to free.

Bug 55. Type: Memory error. Use-after-free.
File: detect-engine-port.c Function: DetectPortMergeNotPorts Line: 1237 Conclusion: False positive.
A NULL check is done on the variable before the for loop is entered, so there should be no scenario where the code tries to use ag2 after it has been passed to free.

Bug 56. Type: Memory error. Use-after-free.
File: output.c Function: OutputDeregisterAll Line: 598 Conclusion: False positive.
This is a false positive. Clang fails to interpret the TAILQ_FIRST macro on line 597 correctly. The macro gives the "module" variable, which clang-analyzer believes is used after it is freed, a new value on each iteration of the while loop. Thus "module" has been set to a new value before it is used again.

Bug 57. Type: Memory error. Use-after-free.
File: app-layer-ssl.c Function: SSLStateFree Line: 1312 Conclusion: False positive.
This is a false positive. Clang fails to interpret the TAILQ_FIRST macro on line 1311 correctly. The macro gives the "item" variable, which clang-analyzer believes is used after it is freed, a new value on each iteration of the while loop. Thus "item" has been set to a new value before it is used again.

Bug 58. Type: Memory error. Use-after-free.
File: app-layer-template.c Function: TemplateStateFree Line: 113 Conclusion: False positive.
This is a false positive. Clang fails to interpret the TAILQ_FIRST macro on line 112 correctly. The macro gives the "tx" variable, which clang-analyzer believes is used after it is freed, a new value on each iteration of the while loop. Thus "tx" has been set to a new value before it is used again.

Bug 59. Type: Memory error. Use-after-free.
File: app-layer-dcerpc-udp.c Function: DCERPCUDPStateFree Line: 795 Conclusion: False positive.
This is a false positive. Clang fails to interpret the TAILQ_FIRST macro on line 793 correctly. The macro gives the "item" variable, which clang-analyzer believes is used after it is freed, a new value on each iteration of the while loop. Thus "item" has been set to a new value before it is used again.

Bug 60. Type: Memory error. Use-after-free.
File: conf.c Function: ConfNodeFree Line: 157 Conclusion: False positive.
This is a false positive. Clang fails to interpret the TAILQ_FIRST macro on line 156 correctly. The macro gives the "tmp" variable, which clang-analyzer believes is used after it is freed, a new value on each iteration of the while loop. Thus "tmp" has been set to a new value before it is used again.

F Skyhigh implementation

We want a system in which building Suricata is easy, fast and consistent every time we compile it. To achieve this we want to automate the entire build process. To create this process we use NTNU Gjøvik's cloud, Skyhigh. Skyhigh is an OpenStack-powered solution that we have all used as part of IMT 3441 - Database- og applikasjonsdrift. Multiple automatic build systems exist, but our needs are simple and can be met with a few homemade scripts. What we need is a script that can create virtual machines, called instances, in Skyhigh and then initiate a script that pulls in all the dependencies and our code base and compiles Suricata for us. To achieve this we need to complete every step in this list [82]:

1. Create Instance
2. Manage RSA keys for access
3. Install Git
4. Install dev-tools and other libraries needed from the Ubuntu repository
5. Clone our version of Suricata from Github
6. Get LuaJIT to enable Lua support in Suricata
7. Compile LuaJIT
8. Clone libhtp from Git
9. Compile Suricata

F.1 Create Instance:

    nova boot --flavor m1.large --image "Ubuntu Server 14.04.3 (Trusty Tahr) amd64" \
        --key-name internal --security-groups default \
        --nic "net-id=a3ea7663-ffd8-42f5-840f-4425eed38715" \
        --user-data install.sh build-box

This creates an instance of flavor m1.large, with 8 GB RAM, 60 GB of disk space and 4 vCPUs. It installs Ubuntu 14.04.3, an image that was provided by Skyhigh.
It adds the public key of the "internal" key pair to authorized_keys in the new instance and places the instance in the default security group, which allows SSH traffic on port 22. The --nic flag attaches the instance to our network inside Skyhigh. The command then supplies the new instance with install.sh, the script that compiles Suricata. Finally it gives the instance the name "build-box".

F.2 Manage RSA keys:

The first step in install.sh is to add a premade RSA key to the instance. This allows the instance to clone our private GitHub repository.

F.3 Install Git:

    apt-get update && apt-get install git -y

The second part of install.sh updates the package index files and fetches Git from the official Ubuntu repository.

F.4 Install dev-tools and other libraries from the Ubuntu repository:

    apt-get -y install libpcre3 libpcre3-dbg libpcre3-dev build-essential \
        autoconf automake libtool libpcap-dev libnet1-dev \
        libyaml-0-2 libyaml-dev zlib1g zlib1g-dev libcap-ng-dev \
        libcap-ng0 make libmagic-dev libjansson-dev libjansson4 \
        pkg-config libnetfilter-queue-dev libnetfilter-queue1 \
        libnfnetlink-dev libnfnetlink0

This installs all the packages from the Ubuntu repository that we need to compile and run Suricata.

F.5 Clone our version of Suricata from Github:

    git clone git@github.com:Syoc/suricata-casec.git

F.6 Get LuaJIT:

    wget http://luajit.org/download/LuaJIT-2.0.4.tar.gz -O ~/suricata-casec/LuaJIT-2.0.4.tar.gz

As this comes compressed and archived, we need to extract it.

    tar -xf ~/suricata-casec/LuaJIT-2.0.4.tar.gz

F.7 Compile LuaJIT:

    cd ~/suricata-casec/LuaJIT-2.0.4/
    sudo make && sudo make install

This moves into the LuaJIT folder, compiles LuaJIT and installs it.
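Steps F.6 and F.7 can be combined into one fragment of install.sh. The following is a minimal sketch under stated assumptions, not the script we actually used: the helper names (repo_dir, luajit_tarball, luajit_srcdir, run, fetch_and_build_luajit) and the DRY_RUN switch are hypothetical, and the sketch passes -C to tar and make so that the result does not depend on the current directory, which the commands above do not do. It also omits sudo on the assumption that install.sh already runs as root.

```shell
#!/bin/sh
# Hedged sketch of steps F.6 and F.7. The helper names and the DRY_RUN
# switch are hypothetical and not part of the original install.sh.

LUAJIT_VERSION="2.0.4"   # version downloaded in F.6

# Directory created by the clone in F.5. The /root fallback assumes the
# script runs as root without HOME being set.
repo_dir() { printf '%s\n' "${REPO_DIR:-${HOME:-/root}/suricata-casec}"; }
luajit_tarball() { printf '%s\n' "$(repo_dir)/LuaJIT-$LUAJIT_VERSION.tar.gz"; }
luajit_srcdir() { printf '%s\n' "$(repo_dir)/LuaJIT-$LUAJIT_VERSION"; }

# Print a command, then execute it unless DRY_RUN=1.
run() {
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# Download, unpack, compile and install LuaJIT. Each step only runs if the
# previous one succeeded. tar -C unpacks into the repository directory and
# make -C builds there, whatever the current directory is.
fetch_and_build_luajit() {
    run wget "http://luajit.org/download/LuaJIT-$LUAJIT_VERSION.tar.gz" -O "$(luajit_tarball)" &&
    run tar -xf "$(luajit_tarball)" -C "$(repo_dir)" &&
    run make -C "$(luajit_srcdir)" &&
    run make -C "$(luajit_srcdir)" install
}
```

Calling fetch_and_build_luajit with DRY_RUN=1 only prints the four commands, which makes it possible to check the paths before starting an instance.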
F.8 Clone libhtp from Github:

    git clone https://github.com/OISF/libhtp.git -b 0.5.x

This clones libhtp at the latest version of the 0.5.x branch.

F.9 Compile Suricata:

    ./autogen.sh && ./configure --prefix=/usr/ --sysconfdir=/etc/ \
        --localstatedir=/var/ --enable-luajit \
        --with-libnss-libraries=/usr/lib --with-libnss-includes=/usr/include/nss/ \
        --with-libnspr-libraries=/usr/lib \
        --with-libnspr-includes=/usr/include/nspr \
        --with-libluajit-includes=/usr/local/include/luajit-2.0/ \
        --with-libluajit-libraries=/usr/lib/
    make clean && make && make install && ldconfig

This step creates the required makefiles and compiles Suricata. Finally it runs ldconfig, which updates the shared library links and cache so that Suricata can find its libraries when it starts.

F.10 Skyhigh resources:

• 5 instances
• 10 vCPUs
• 20 GB RAM
• 10 volumes, total space 1 TB

G Data Structure Documentation

During our development process we rely on the existing Suricata infrastructure for extracting relevant SMTP data. Here we document the data structures Suricata provides for storing SMTP-related information and where they are located. We also document the related helper functions for extracting the data, and how to use them. The following MIME fields are confirmed to be located in MimeDecEntity->field_list:

• date: Message date and time
• to: Primary recipient mailbox
• from: Mailbox of message author
• cc: Carbon-copy recipient mailbox
• reply_to: Mailbox for replies to message
• references: Message-IDs of messages in the preceding reply chain
• bcc: Blind-carbon-copy recipient mailbox
• message_id: Message identifier
• subject: Topic of message
• received: Mail transfer trace information
• priority: Message priority (normal, urgent, non-urgent)
• sensitivity: Message content sensitivity (personal, private, company confidential)
• importance: Message importance (high, normal, low)
• organization: The organization associated with the sender
• content_md5: The MD5 sum of the content
• x_originating_ip: The IP address of the user if a web front end is used. Not in the MIME RFC
• x_mailer: The software used to send the mail. Not in the MIME RFC
• user_agent: This seems similar to x_mailer

Descriptions are from RFC 4021 [83]. Other fields may also be accessible, but these are the ones that are retrieved elsewhere in the code.

The data structures containing this information, and the functions for retrieving and managing it, are primarily located in two files. First we have app-layer-smtp.h and app-layer-smtp.c. The header file contains the general structures that hold the SMTP data. The two most relevant structures are SMTPState and SMTPTransaction.

G.1 SMTPState

This structure contains the current state of the SMTP parser. It can be thought of as a way to gain access to the data of the SMTP packet that is currently being parsed; however, it also contains all parsed SMTP sessions. Relevant variables are:

    SMTPTransaction *curr_tx
        A pointer to a struct containing one SMTP transaction, or session, currently being parsed.
        This struct is explained later.
    tx_list
        A list of all the transactions or sessions.
    uint64_t tx_cnt
        The number of transactions making up the total SMTP transaction count.
    FileContainer *files_ts
        This struct pointer contains the list of all the files being sent to the server.
    uint8_t *helo
        This is a character array containing the current helo message in the parser
        (could be client name or IP).

The SMTPState struct is accessible if you have a flow pointer. You can then call:

    SMTPState *smtp_state = (SMTPState *)FlowGetAppState(flow);

G.2 SMTPTransaction

This is the other relevant SMTP structure. It contains data about one SMTP transaction.

    uint64_t tx_id
        The number for this transaction (e.g. the number this packet had in the session).
    MimeDecEntity *msg_head
        A pointer to the head of a list containing a struct for the individual MIME packets
        in the transaction.
    MimeDecEntity *msg_tail
        A pointer to the tail of the list.
    uint8_t *mail_from
        The character array containing the MAIL FROM parameter.

SMTPTransaction can be accessed through SMTPState with:

    SMTPTransaction *tx = smtp_state->curr_tx;

This is how you would loop through all the SMTPTransactions in the SMTPState. Note that this form always fetches the first element, so it only terminates if the loop body removes tx from the list.

    SMTPTransaction *tx = NULL;
    while ((tx = TAILQ_FIRST(&smtp_state->tx_list))) {
        /* use tx, then remove it from tx_list */
    }

Or you could use this instead, which walks the list without modifying it.

    TAILQ_FOREACH(tx, &smtp_state->tx_list, next) {
        /* use tx */
    }

There are also the functions SMTPStateGetTxCnt, which returns the number of transactions, and SMTPStateGetTx, which returns the transaction with a given number. Together these two allow you to loop through all transactions easily.

The other relevant file is util-decode-mime.h. This file mainly contains the structs that hold the MIME header information. From these we can extract the most relevant data for rule matching. The main struct is MimeDecEntity.

G.3 MimeDecEntity

This struct contains the MIME Entity (MIME information) for one packet.
    MimeDecField *field_list
        This is a list containing all header fields.
    MimeDecUrl *url_list
        A list of all URLs in the Entity.
        This means host names, IPs and URLs to executable files.
    (Next come the lengths of the body before and after decoding, and the header,
    content and anomaly flags.)
    uint8_t *filename
        The name of a possible file attachment.
    uint8_t *ctnt_type
        Pointer to the content type field.
    uint8_t *msg_id
        Message id.
    struct MimeDecEntity *next
        Pointer to the next entity in the list.
    struct MimeDecEntity *child
        Pointer to a list of child entities.

We will probably only need MimeDecEntity, and it can be retrieved with the following line, where tx is an SMTPTransaction.

    MimeDecEntity *entity = tx->msg_tail;

MIME fields can easily be retrieved with MimeDecFindField, which returns a pointer to the matching field:

    MimeDecField *field = MimeDecFindField(entity, "subject");

H Testing

H.1 Raw SMTP Transaction

    220 debian.studby.hig.no ESMTP SubEthaSMTP null
    EHLO localhost
    250-debian.studby.hig.no
    250-8BITMIME
    250-AUTH LOGIN
    250 Ok
    MAIL FROM:<vinjar>
    250 Ok
    RCPT TO:<test@test.com>
    250 Ok
    RCPT TO:<levi@com.com>
    250 Ok
    DATA
    354 End data with <CR><LF>.<CR><LF>
    Date: Tue, 26 Apr 2016 13:59:50 +0200
    From: demo <demo@test.com>
    To: test@test.com, levi@com.com
    Subject: test
    Message-ID: <20160426115950.GA14177@debian.studby.hig.no>
    MIME-Version: 1.0
    Content-Type: multipart/mixed; boundary="tThc/1wpZn/ma/RB"
    Content-Disposition: inline
    User-Agent: Mutt/1.5.24 (2015-08-30)

    --tThc/1wpZn/ma/RB
    Content-Type: text/plain; charset=us-ascii
    Content-Disposition: inline

    tesssst

    --tThc/1wpZn/ma/RB
    Content-Type: text/x-csrc; charset=us-ascii
    Content-Disposition: attachment; filename="fizz.c"

    #include <stdio.h>
    int main(void)
    {
        int i = 1;
        for(; i <= 100; i++) {
            if(!(i % 3))
                printf("fizz");
            if(!(i % 5))
                printf("buzz");
            if(i % 3 && i % 5)
                printf("%d", i);
            putchar('\n');
        }
        return 0;
    }
    --tThc/1wpZn/ma/RB-
    .
    250 Ok
    QUIT
    221 Bye

I Gantt diagram

[Figure: Bachelor Thesis Gantt chart covering weeks 2 to 23. Tasks and milestones shown: Project Plan, Project Plan Completed, Research, Research Completed, Code Audit and Development, Code Audit and Development Completed, Finalizing Report, Deadline, Presentation.]

J Meeting attendance log

11.01.2016 - Everyone was there
25.01.2016 - Everyone was there
29.01.2016 - Everyone was there
08.02.2016 - Everyone was there
11.02.2016 - Everyone was there
16.02.2016 - Everyone was there
22.02.2016 - Everyone was there
25.02.2016 - Lauritz gone
29.02.2016 - Stian and Lauritz gone
03.03.2016 - Lauritz gone
07.03.2016 - Everyone was there
10.03.2016 - Stian gone
14.03.2016 - Stian gone
17.03.2016 - Stian gone
31.03.2016 - Stian gone
04.04.2016 - Everyone was there
07.04.2016 - Vinjar gone
18.04.2016 - Lauritz gone
21.04.2016 - Everyone was there
25.04.2016 - Everyone was there
02.05.2016 - Stian gone