R

go/thirdpartyr

This page gives R-specific guidance for checking code into //piper/third_party/R.

IMPORTANT: Read go/thirdparty first.

Before submitting new code see:

  • //piper/third_party/OWNERS
  • //piper/third_party/R/OWNERS
  • //piper/third_party/README

as well as the rest of this file.

Introduction

This file describes how to add code to the //piper/third_party/R directory.

Third-party code for the R language should go in Piper under //piper/third_party/R. This makes it easier to keep track of third-party code and to ensure that we comply with software licenses. For more details about third-party code at Google, see go/thirdparty. Installing packages from CRAN using install.packages is possible but discouraged for reasons of security, compatibility, …; see go/r-install.

The directory contains code for packages and for R itself, at:

//third_party/R/packages/PACKAGENAME/...
//third_party/R/bioconductor/PACKAGENAME/...
//third_party/R/R/...

Sections here include:

  • Using packages if you just want to use packages that are already incorporated here.
  • Adding a package from CRAN or Bioconductor describes that common case.

Using packages

To use packages that are already incorporated in //piper/third_party/R, see the installation instructions at go/r-install#installing-packages-from-common-locations

There is an automated list of available packages at go/rdocs, and a list of Google-specific packages for interfacing between R and Google computing infrastructure at go/rlang/google3-infrastructure/google3-r-packages.

Licensing

By R community standards, use of one of the standard licenses is indicated by one of the following short specifications in the 'License' field of a package’s DESCRIPTION file:

  • GPL-2
  • GPL-3
  • LGPL-2
  • LGPL-2.1
  • LGPL-3
  • AGPL-3 (read go/agpl first)
  • Artistic-2.0
  • BSD_2_clause
  • BSD_3_clause
  • MIT

These specifications indicate a direct link to the corresponding license file at http://linkremoved/ and in //piper/third_party/R/R/R_3_2_2/share/licenses. If an R package indicates one of these standard licenses in its DESCRIPTION file, that is sufficient to satisfy third_party license requirements, so long as you add the full text of the referenced license to the top-level LICENSE file and explain how you got it in the METADATA file. The section on manually adding packages below has a more in-depth discussion of these files and the layout of the package directory.

The standard R licensing practice is an exception to the overall //piper/third_party policy, which requires either a license file or link in the upstream code. References to non-standard licenses without accompanying license text do not satisfy //piper/third_party requirements. If there’s ever any confusion or dispute over this exception please email emailremoved@ to resolve.

Also, the last three licenses above are usually specified in the form “MIT + file LICENSE”, where the package-provided LICENSE contains the copyright year and copyright holder needed to complete the license template. For an example, see the original license template in //piper/third_party/R/R/R_3_2_2/share/licenses/MIT. This is acceptable as long as there are no additional licensing conditions in the package-provided LICENSE file.
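
The "no additional licensing conditions" requirement can be spot-checked by hand. The following is a minimal sketch, not part of any existing tool. It assumes the package-provided LICENSE follows the usual CRAN stub layout of YEAR and COPYRIGHT HOLDER lines (plus ORGANIZATION for BSD_3_clause); the function name extra_license_lines is hypothetical.

```shell
# Hypothetical check: prints every line of a package-provided LICENSE file
# that is not a template field, and fails if any such line exists.
# Assumption: the stub uses "YEAR:", "COPYRIGHT HOLDER:" and, for
# BSD_3_clause, "ORGANIZATION:" lines.
extra_license_lines() {
  [ -r "$1" ] || { echo "cannot read: $1" >&2; return 2; }
  # -v inverts the match, -n adds line numbers; blank lines are allowed.
  if grep -nvE '^(YEAR|COPYRIGHT HOLDER|ORGANIZATION):|^[[:space:]]*$' "$1"; then
    return 1   # grep printed at least one extra line
  fi
  return 0
}
```

Any line this prints still needs a human to read it; the check only narrows down where to look.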

The import_from_cran tool described in the following section checks DESCRIPTION files for the above licenses. If one is found, it copies the appropriate LICENSE file to the package’s directory.
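
To see what such a check involves, here is a rough sketch of a License field lookup. It is not the import_from_cran implementation. The function name standard_license is hypothetical, and the sketch assumes the License field fits on one line.

```shell
# Hypothetical helper, not part of import_from_cran: reads the License field
# of a DESCRIPTION file and reports whether its base license is one of the
# short specifications listed in this section.
standard_license() {
  description_file=$1
  # Take the first "License:" line and strip the field name.
  license=$(sed -n 's/^License:[[:space:]]*//p' "$description_file" | head -n 1)
  # Drop a trailing "+ file LICENSE" and trailing whitespace.
  base=$(printf '%s\n' "$license" |
    sed -e 's/[[:space:]]*[+][[:space:]]*file[[:space:]].*$//' -e 's/[[:space:]]*$//')
  case "$base" in
    GPL-2|GPL-3|LGPL-2|LGPL-2.1|LGPL-3|AGPL-3|Artistic-2.0|BSD_2_clause|BSD_3_clause|MIT)
      echo "standard: $base"
      ;;
    *)
      echo "non-standard: $license"
      return 1
      ;;
  esac
}
```

For example, standard_license foo/DESCRIPTION prints "standard: MIT" for a package that declares "MIT + file LICENSE". Anything reported as non-standard needs its own license text in the upstream code.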

Automatically adding or updating third-party packages

There is a tool that mostly automates adding packages from the CRAN and Bioconductor repositories. Alias it with (you can add the same line to your .bashrc):

alias import_from_cran='/path/to/.../import_from_cran'

Next, navigate to a google3 root directory. You might also want to create a clean CITC client at the same time.

mkclient -f <packagename>

Then, to install a CRAN / Bioconductor package:

import_from_cran --package=<packagename>

To update a package, use the same command as for installing (the --update flag is no longer required):

import_from_cran --package=<packagename>

To import a non-CRAN / non-Bioconductor package:

import_from_cran --package=<packagename> --url=<download_url>

The URL should point to the package source, e.g.

import_from_cran --package=rethinking \
  --url=https://github.com/rmcelreath/rethinking/archive/master.zip

To install your own copy of a package (before submitting the CL):

blaze run third_party/R/packages/<packagename>:<packagename>_install

Because the entire Bioconductor ecosystem is synchronized to release cycles, our internal repository of Bioconductor packages is synchronized to a single release. This ensures compatibility among imported packages, enabling smoother use and imports. Please see //piper/third_party/R/bioconductor/README.md for details.

After completion, the tool will describe the remaining manual steps that are needed. Some packages will need modification before they can be submitted. However, you should still submit a “pristine” copy, with no manual changes, to ********************, and follow up on those changes as part of the approval process.
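
One way to confirm that a copy is still pristine is to compare it against a freshly unpacked upstream tarball. The sketch below is only an illustration and is not part of the import tool. The function name is_pristine is hypothetical, and the sketch assumes the tarball unpacks into a single top-level directory.

```shell
# Hypothetical check: unpacks an upstream source tarball into a scratch
# directory and compares it with the checked-in copy of the package source.
# Usage: is_pristine foo.tgz third_party/R/packages/foo/foo
is_pristine() {
  tarball=$1
  checked_in_dir=$2
  scratch=$(mktemp -d)
  tar xzf "$tarball" -C "$scratch"
  # Assumption: the archive contains exactly one top-level directory.
  upstream_dir=$(find "$scratch" -mindepth 1 -maxdepth 1 -type d | head -n 1)
  result=0
  # diff -r prints every difference and exits non-zero if there is one.
  diff -r "$upstream_dir" "$checked_in_dir" || result=$?
  rm -rf "$scratch"
  return "$result"
}
```

No output and a zero exit status mean the two trees are identical.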

If the package depends on other packages that have not already been imported, please import those first. The tool creates a CL (changelist); each package should be in a separate CL.
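
To find out which dependencies still need importing, something like the following sketch may help. It is an illustration only, and the function name missing_deps is hypothetical. It assumes each imported package has a directory named after it under the packages root, and it does not know which packages ship with R itself.

```shell
# Hypothetical helper: lists packages named in the Depends, Imports, and
# LinkingTo fields of a DESCRIPTION file that have no directory under the
# given packages root. Base packages that ship with R may also be listed,
# so treat the output as a starting point.
# Usage: missing_deps foo/DESCRIPTION third_party/R/packages
missing_deps() {
  description_file=$1
  packages_root=$2
  awk '
    # A line that does not start with whitespace begins a new field.
    /^[^ \t]/ { in_field = ($0 ~ /^(Depends|Imports|LinkingTo):/) }
    in_field  { sub(/^[A-Za-z]+:/, ""); print }
  ' "$description_file" |
    tr ',' '\n' |
    sed -e 's/([^)]*)//' -e 's/[[:space:]]//g' |   # drop "(>= 1.0)" and spaces
    grep -v -e '^$' -e '^R$' |                     # skip blanks and R itself
    sort -u |
    while read -r dep; do
      [ -d "$packages_root/$dep" ] || echo "$dep"
    done
}
```

Each name printed is a candidate for its own import_from_cran run, and therefore its own CL, before the package that needs it.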

Manually adding a package from CRAN

To add a package foo from CRAN:

  1. Create a Piper client that includes //piper/third_party/R/packages.

    g4 client -a //piper/.../... && g4 sync
    

    or

    git5 start --import-empty third_party/R/packages/zipcode
    
  2. Download the package source to //piper/third_party/R/packages/foo/foo.

    mkdir third_party/R/packages/foo && cd third_party/R/packages/foo
    

    (use web search to find the package and download the foo.tgz)

    tar xzvf foo.tgz
    
  3. Create four additional files under //piper/third_party/R/packages/foo:

    • BUILD
    • LICENSE
    • METADATA
    • OWNERS
  4. Test that the package installs; from the google3 directory, run:

    blaze run third_party/R/packages/foo:foo_install
    
  5. In R, load the package using library(foo) and test that it works.

  6. Create a CL and request approval:

    g4 mail -m ********************,******************
    
  7. After receiving approval from both groups, submit the CL.

Rules for the BUILD file are at Building R Packages and Binaries. You will probably find it simplest to mimic an existing package. See //piper/third_party/R/packages/bit/BUILD for a simple example for a package that includes both R and C code.

The LICENSE file should be a copy of the license file from the original package, if there is one. Otherwise, the DESCRIPTION file in the package should describe the license, e.g. GPL-2. In that case, include the standard GPL-2 license.

The OWNERS file must list at least two full-time employees; that is typically you and at least one more person from your team.

The METADATA file should document the package and can be auto-built at go/thirdparty/metadata. See akima for a simple example for a package with no local modifications.

For more formal requirements for these files, see go/thirdparty.
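
Before mailing the CL, the layout described above can be checked with a few lines of shell. This is a minimal sketch with a hypothetical function name (check_package_layout). It only tests that the files exist and are non-empty, not that their contents meet the go/thirdparty requirements.

```shell
# Hypothetical pre-mail check for the layout described above.
# Usage: check_package_layout third_party/R/packages/foo
check_package_layout() {
  pkg_dir=${1%/}
  pkg_name=$(basename "$pkg_dir")
  status=0
  # The four additional files sit next to the unpacked source directory.
  for f in BUILD LICENSE METADATA OWNERS; do
    if [ ! -s "$pkg_dir/$f" ]; then
      echo "missing or empty: $f"
      status=1
    fi
  done
  # The package source is nested one level down (foo/foo).
  if [ ! -f "$pkg_dir/$pkg_name/DESCRIPTION" ]; then
    echo "missing: $pkg_name/DESCRIPTION"
    status=1
  fi
  return "$status"
}
```

A silent run with exit status 0 means all five paths are in place.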

Manually updating a package in //third_party/R

This is for packages with the nested structure, with no real local modifications. If there are local modifications then you may need a versioned subdirectory and a few more steps.

cd third_party/R/packages/foo
rm -rf foo
/path/to/.../updatemd -version $VERSION METADATA
tar xzvf ~/Downloads/new-foo.tgz
g4 edit `g4 diff -se ...`
g4 delete `g4 diff -sd ...`
g4 add `g4 nothave`

Manually update BUILD.
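
Before running the g4 commands above, it can be useful to preview what an update will change. The sketch below is independent of g4 and compares two unpacked trees directly. The function name preview_update is hypothetical. Its three output categories roughly correspond to the files that g4 nothave, g4 diff -sd, and g4 diff -se would pick up.

```shell
# Hypothetical preview: compares the old and new unpacked package trees and
# prints which files were added, removed, or changed.
# Usage: preview_update /tmp/old/foo /tmp/new/foo
preview_update() {
  old_dir=$1
  new_dir=$2
  old_list=$(mktemp)
  new_list=$(mktemp)
  # Sorted relative file lists; LC_ALL=C keeps sort and comm consistent.
  (cd "$old_dir" && find . -type f | LC_ALL=C sort) > "$old_list"
  (cd "$new_dir" && find . -type f | LC_ALL=C sort) > "$new_list"
  # comm -13: only in the new list; comm -23: only in the old list.
  LC_ALL=C comm -13 "$old_list" "$new_list" | sed 's/^/added: /'
  LC_ALL=C comm -23 "$old_list" "$new_list" | sed 's/^/removed: /'
  # Files present in both trees: report those whose contents differ.
  LC_ALL=C comm -12 "$old_list" "$new_list" | while read -r f; do
    cmp -s "$old_dir/$f" "$new_dir/$f" || echo "changed: $f"
  done
  rm -f "$old_list" "$new_list"
}
```

Added or removed source files under src/ or R/ are a hint that BUILD may need the manual update mentioned above.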

The R directory

The R directory contains source and binary versions of R that have been built with Google’s build infrastructure. The binary versions of R in //piper/third_party/R/R/binaries will be used automatically by r_binary and r_test targets so that the build system does not have to recompile R from source for every R target build using blaze. The binaries in this directory can also be used in packages or .Rar files when running R locally or on Borg. For information about R packages, see Rebuilding r-google.

A copy of the R binary, slightly modified for interactive desktop use, can be installed on a goobuntu desktop machine via:

sudo apt-get remove r-base-core
sudo apt-get install r-google

The instructions used by the emailremoved@ team for building and releasing the binaries are also available.

Advanced topics: vendor branches for packages

A small minority of packages have a substantial number of changes that are necessary for the packages to be used inside the Google environment. Keeping those local changes synchronized with updated releases of the open source package can be a maintenance problem without the use of vendor branches. RProtoBuf and R itself are two examples where we maintain an extensive list of Google-specific changes and could use vendor branches to smooth the upgrade process.

The procedure for maintaining an R package with this method is:

  1. Create a new Piper client on local disk mapping:

    //piper/.../OWNERS
    //piper/.../README
    //piper/.../RProtoBuf
    
  2. Unpack and mail clean vendor sources to emailremoved@ and cc emailremoved@ for review:

    cd RProtoBuf
    tar xzvf ~/Downloads/RProtoBuf_0.2.5.tar.gz
    find RProtoBuf -type f|xargs g4 add
    g4 mail -m ****************** -cc ******************** ...
    

    (See changelist …131)

  3. Label the clean vendor sources with a release label with version suffix for official versions, or revision numbers from SVN (e.g. RProtoBuf_r500, RProtoBuf_0.2.5, etc.)

    g4 label RProtoBuf_0.2.5
    (set spec to //piper/.../...)
    (set revision number to @CL_number_of_submitted_change_from_step_2)
    
  4. Branch the vendor branch over to where our local changes will be kept

    This is only necessary the first time you create a new vendor_src package in this way. Subsequent updates can skip this step.

    g4 branch RProtoBuf
    (set spec from //piper/.../... to //piper/.../...)
    
  5. Sync and resolve from the vendor branch over to the google_vendor_src_branch branch with our modified sources.

    g4 integrate -b RProtoBuf
    g4 resolve
    
  6. Apply any other local changes to the google_vendor_src_branch branch and submit changes

    For example, you probably need to update METADATA with the new location of the source for this version you are importing.

  7. Integrate over to //piper/third_party/R/packages/RProtoBuf, and send to emailremoved@ and emailremoved@ for review.
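
The label name used in step 3 can be derived mechanically instead of typed by hand. The following is a small sketch with hypothetical function names (label_from_tarball, label_from_description). It covers only official release versions, not SVN revision labels such as RProtoBuf_r500.

```shell
# Hypothetical helper: derives a label such as RProtoBuf_0.2.5 from a source
# tarball name by stripping the directory and the archive suffix.
label_from_tarball() {
  name=$(basename "$1")
  name=${name%.tar.gz}
  name=${name%.tgz}
  echo "$name"
}

# Hypothetical helper: builds the same label from the Package and Version
# fields of the unpacked DESCRIPTION file, as a cross-check.
label_from_description() {
  pkg=$(sed -n 's/^Package:[[:space:]]*//p' "$1" | head -n 1)
  ver=$(sed -n 's/^Version:[[:space:]]*//p' "$1" | head -n 1)
  echo "${pkg}_${ver}"
}
```

If the two helpers disagree for the same download, the tarball name and the unpacked sources do not describe the same version, which is worth resolving before labeling.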

Making additional google-specific changes to RProtoBuf or other vendor branch R packages

  1. You will need a Piper client with at least two directories mapped:

    //piper/.../...
    //piper/.../...
    
  2. The first step is to make your change to RProtoBuf in //piper/.../...

  3. Then update the METADATA file in that directory to describe your local change, and send the change to emailremoved@ or another emailremoved@ reviewer with g4 mail. emailremoved@ approval is not required, because there are no license implications for our Google changes here. If someone from emailremoved@ is added, you can ignore it and submit anyway.

  4. After review, submit your change with g4 submit -c as usual.

  5. Now you must integrate the change over to the //piper/third_party/R/packages/RProtoBuf directory tree. This is accomplished by g4 integrate from the local path to your google_vendor_src_branch version of RProtoBuf to the local path of the third_party RProtoBuf path in your client. E.g. something like:

    g4 integrate ../google_vendor_src_branch/RProtoBuf/... third_party/R/packages/RProtoBuf/...
    g4 resolve
    
  6. Then prepare a CL of the integrate and send it to emailremoved@ to review. In your CL message, please specify whether the merge was clean (e.g. no conflicts that had to be modified by hand, you did not edit the files at all) or whether there were conflicts that you resolved manually with an editor.

  7. After review, submit your change with presubmit -p third_party.r,r_tools -s -c. No emailremoved@ approval is required.

Importing updated versions of RProtoBuf or other vendor branch R packages from GitHub

  1. You will need either a citc client or a Piper client with at least 3 directories mapped:

    //piper/.../...
    //piper/.../...
    //piper/.../...
    
  2. You will need to download the latest version of the package into a clean workspace from GitHub:

    cd /tmp
    git clone https://github.com/eddelbuettel/rprotobuf.git
    
  3. Then move the old RProtoBuf directory from the clean vendor branch out of the way and copy the clean version from the GitHub clone in its place:

    cd ${CLIENT}/vendor_src/RProtoBuf
    mv RProtoBuf RProtoBuf.old
    cd /tmp/rprotobuf
    find . -type f | grep -v /.git/ | cpio -dump ${CLIENT}/vendor_src/RProtoBuf/RProtoBuf
    
  4. Add the new files into a CL, review, compare against the previous version, and if it is correct, send it to emailremoved@ to review or just submit it.

    cd ${CLIENT}/vendor_src/RProtoBuf/RProtoBuf
    g4 edit `g4 diff -se ...`
    g4 delete `g4 diff -sd ...`
    g4 add `g4 nothave`
    g4 diff ...
    g4 submit ...
    
  5. Integrate the updated vendor_src changes with our local changes

    g4 integrate -b RProtoBuf
    g4 submit ${CLIENT}/google_vendor_src_branch/RProtoBuf/...
    
  6. Integrate the change over to the //piper/third_party/R/packages/RProtoBuf directory tree. This is accomplished by g4 integrate from the local path to your google_vendor_src_branch version of RProtoBuf to the local path of the third_party RProtoBuf path in your client. E.g. something like:

    g4 integrate google_vendor_src_branch/RProtoBuf/... //piper/third_party/R/packages/RProtoBuf...
    g4 resolve
    g4 edit //piper/third_party/R/packages/RProtoBuf/BUILD //piper/third_party/R/packages/RProtoBuf/METADATA
    emacs //piper/third_party/R/packages/RProtoBuf/BUILD //piper/third_party/R/packages/RProtoBuf/METADATA
    (update version numbers)
    
  7. Review and send out the CL for review to emailremoved@. Indicate that there are no compliance changes because it is an update of an existing package with no license changes. You can submit as soon as your emailremoved@ reviewer approves it.

    g4 mail -m ********************,****************** -cc '' -b '' //piper/third_party/R/packages/RProtoBuf...
    
  8. After review, submit your change with presubmit -p third_party.r,r_tools -s -c. No emailremoved@ approval is required.

Instructions for package reviewers

These instructions are for members of emailremoved@ reviewing package CLs created by import_from_cran.

g4 mail now automatically tests that packages build. Run

/path/to/.../check_package_import -c <CL>

to:

  1. install the package on your machine,
  2. run the examples,
  3. check that the contents match the download URL from the METADATA file, and
  4. grep for ‘http’.

You should then:

  1. Check that the additional files BUILD, LICENSE, METADATA, and OWNERS look good and match the DESCRIPTION file.
  2. Check that the package doesn’t have the libraries that it depends on bundled into it.
  3. Check any http calls for security.
  4. Check that importing the package generally makes sense. We generally don’t want packages with obvious security issues or that are known to be poor quality or unmaintained.
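
If you want to repeat the ‘http’ scan by hand, for instance after the author uploads a new snapshot, a plain recursive grep is enough. This sketch is a stand-in, not the code that check_package_import runs, and the function name list_http_references is hypothetical.

```shell
# Hypothetical stand-in for the 'http' scan; not the check_package_import code.
# Prints file:line:text for each occurrence, skipping binary files (-I).
# Usage: list_http_references third_party/R/packages/foo/foo
list_http_references() {
  grep -rnI 'http' "$1" || echo "no http references found"
}
```

URLs in documentation and DESCRIPTION fields are usually harmless. Calls that download or execute content at install time or run time are the ones to look at closely.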

You can LGTM the CL with a message like:

This matches CRAN, the four extra files look OK, the only instances of “http” look OK (a small security check), it installs, and examples in help files run. LGTM for emailremoved@. Please wait for LGTM & approval from emailremoved@ before submitting.

Updating the R build tools

The import_from_cran tool executes a script located at //piper/third_party/R/packages/import_from_cran.R. It uses the import.from.cran package, which is at //piper/…/import_from_cran.

The check_package_import tool executes a script located at //piper/third_party/R/packages/check_package_import.sh.

If you update any of the package import/check tools, you need to update the shared copies in x20 (the /path/to/.../... directories mentioned above). We manage these files via go/build_copier; you update the copy in x20 by asking any member of the r-platform emailremoved@ to run

/path/to/.../build_copier --config=r-tools import_from_cran

(replacing the tool name as needed). The complete list of configurations is maintained in Piper.

For more information

  • R Extensions Manual
  • R Development In google3: go/r-development
  • Creating and Installing Google R Packages go/r-install
  • Building R Packages in google3 go/rlang/getting-started/old_building
  • R Intergrouplet Newsletters: go/rlang/intergrouplet/newsletters

Except as otherwise noted, the content of this page is licensed under the CC-BY-4.0 license. Third-party product names and logos may be the trademarks of their respective owners.