Creating an R Package
Matteo Quartagno
Department of Medical Statistics, London School of Hygiene and Tropical Medicine
EMERGE Group meeting, 2015

Outline: Introduction; Create package; Submitting to CRAN; Conclusion.

Terminology
Package: an extension of the R base system with code, data and documentation;
  Source: the original version, with human-readable text and code;
  Binary: the compiled version, with machine-readable code; it may work only on a specific platform;
Library: a directory containing installed packages;
Repository: a website providing packages for installation (GitHub, CRAN...).

Choice of the Repository
Assume we already have the functions and data we want to share. Which repository should we use?
GitHub: hosts various types of software packages, not R specific;
  Suited to sharing with specific people, colleagues...
  Less strict policies;
CRAN: the "official" way to publish an R package;
  Makes the package available to everyone;
  The package needs to respect the CRAN policies.

Create package
Suppose we want to create the package LSHTM. There are two ways:
1 Use the package.skeleton() function;
  Load all functions and data into a clean R session;
  Run: package.skeleton("LSHTM");
  Some files and folders are created automatically;
2 Create the source (folders and files) manually.

Source folders
data: contains all the datasets we want to include in the package;
R: contains all the R functions;
man: contains all the help files for both datasets and R functions (the manual);
src: contains uncompiled C/C++/Fortran code. Optional;
inst: contains miscellaneous other material, e.g. the citation format for the package.

man folder
Help files are in R documentation format (.Rd), a LaTeX-style format with the following fields:
name: name of the documented function or dataset;
title: a short title, should be only one line;
description: this could be one or two paragraphs;
arguments: a short description of all the inputs of a function;
details: a longer description of the algorithm used in the function;
value: the output(s) returned by the function;
references;
examples: a few examples. These should not run for more than 4 or 5 seconds.
package.skeleton() creates a skeleton help file containing all of these fields for each function.

DESCRIPTION file
DESCRIPTION: a mandatory file with a brief description of the package. Example:
  Package: jomo
  Type: Package
  Title: Multilevel Joint Modelling Multiple Imputation
  Version: 0.1-2
  Date: 2014-01-15
  Author: Matteo Quartagno, James Carpenter
  Maintainer: Matteo Quartagno <matteo.quartagno@lshtm.ac.uk>
  Description: Building on Schafer’s package pan and on the standalone program REALCOM, jomo is a package for multilevel joint modelling multiple imputation. Binary and categorical variables are handled through latent normal variables and algorithms for cluster-specific covariance matrices are introduced.
  License: GPL-2
  Suggests: BaBooN
Notes on the fields:
Version: the first number marks major changes, the second minor changes, the third bug fixes;
Maintainer: use a valid email account; this is the only place where you write it;
Description: one or two paragraphs;
Depends: other R packages that are necessary for your package to work;
Suggests: other R packages that are suggested but not strictly necessary.

NAMESPACE file
NAMESPACE: a mandatory file listing the objects you want to import/export. Example:
  exportPattern(".")
  useDynLib(jomo)

R CMD build
When you have finished preparing the package, make sure you have the latest version of R installed;
Download the latest version of R-devel;
Open a command prompt and run R CMD build LSHTM;
This creates a tarball file that you can install locally, send to someone else or upload to GitHub;
If you want to submit to CRAN, you still need to check that you meet all the guidelines.

R CMD check
Run R CMD check --as-cran LSHTM;
This carefully checks the whole package:
  Help files are created automatically;
  Examples are run and their output is printed to a separate file;
  A file with all Errors, Warnings and Notes is created;
If you want to submit to CRAN, not only all of the errors but also all of the warnings and notes need to be fixed;
Remember to run it with the latest version of R;
When everything is fine, take your time to read the whole CRAN Policies page again...
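
The skeleton, build and check steps described above can be sketched from within an R session. This is only a minimal sketch under stated assumptions: the toy function lshtm_mean() and the temporary directory are hypothetical and not part of the talk, and the build and check calls are left commented out because they are meant to be run only after the generated help files and DESCRIPTION have been edited by hand.

```r
# Toy function to put in the package (hypothetical, for illustration only).
lshtm_mean <- function(x) sum(x) / length(x)

# Work in a temporary directory so that nothing in the current
# working directory is overwritten.
pkg_parent <- file.path(tempdir(), "pkg_demo")
dir.create(pkg_parent, showWarnings = FALSE)

# Create the source skeleton for a package called LSHTM.
# 'list' names the objects to include; by default they are looked up
# in the global environment.
utils::package.skeleton(
  name  = "LSHTM",
  list  = "lshtm_mean",
  path  = pkg_parent,
  force = TRUE
)

# Inspect what was generated: DESCRIPTION, NAMESPACE, R/ and man/.
pkg_dir <- file.path(pkg_parent, "LSHTM")
print(list.files(pkg_dir, recursive = TRUE))
cat(readLines(file.path(pkg_dir, "DESCRIPTION")), sep = "\n")

# Build and check, equivalent to typing the commands at a prompt.
# Uncomment once the skeleton files have been completed.
# r_bin <- file.path(R.home("bin"), "R")
# system2(r_bin, c("CMD", "build", shQuote(pkg_dir)))
# tarball <- list.files(pattern = "^LSHTM_.*\\.tar\\.gz$")
# system2(r_bin, c("CMD", "check", "--as-cran", tarball))
```

package.skeleton() also writes a file called Read-and-delete-me into the package folder, listing the steps that remain before the package can be built.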

Uploading the package
Upload the package to the CRAN repository;
The package is then checked by the CRAN maintainers. Be advised that they receive tens of packages every day and they work for free... They might not be in the best mood when they reply... For example:
  > > The maintainer confirms that he or she
  > has read and agrees to the CRAN policies.
  > Please do also follow them.

Publication
You will have to resubmit your package through the same procedure until everything is OK;
When your package is accepted, you will receive an email confirming the publication of the source on CRAN;
Binary versions of the package are then published within a couple of days.

Aftermath
Start advertising your package;
If there are bugs in the code, be ready to receive thousands of emails...
In theory, you can submit a new version of the package each month;
Before the package is established, which may take several rounds, more frequent submissions are accepted;
When a new version of R becomes available, all packages are tested;
The author is solely responsible for updating the package if the update causes trouble for it.

R Journal
When your package is online and you are confident enough, you can write a paper for the R Journal;
Impact factor 1, but continuously increasing in recent years;
Remember that without outreach activity, your package will slowly die...
Example: the orcutt package for Cochrane-Orcutt estimation... After 4 years, apparently no citations, almost never used...

Acknowledgements
This work was supported by funding from the European Community’s Seventh Framework Programme FP7/2011: Marie Curie Initial Training Network MEDIASRES ("Novel Statistical Methodology for Diagnostic/Prognostic and Therapeutic Studies and Systematic Reviews"; www.mediasres-itn.eu) with the Grant Agreement Number 290025.