This article and accompanying slides cover fundamentals of working with PPI data and the problem of study bias in protein interactions networks.

Outline

This workshop covers the following topics:
0. Brief intro into R Markdown
1. The fundamentals of protein-protein interactions (PPI) and networks (lecture)
2. Getting PPI data into R (from IntAct, using PSICQUIC)
3. Finding publication (Pubmed) IDs for every protein
4. Why should you care about the study bias in your network and how to evaluate it

0. Brief intro into R Markdown

Learn more:
http://rmarkdown.rstudio.com/

Cheat sheet:
https://www.rstudio.com/wp-content/uploads/2015/02/rmarkdown-cheatsheet.pdf

Online course on Reproducible data analysis: https://www.coursera.org/learn/reproducible-research/lecture/5NzHN/r-markdown

Slides:
https://github.com/vitkl/PPI_biases_workshop/blob/master/PPI_networks_and_study_bias.pptx

1. The fundamentals of protein-protein interactions (PPI) and networks (lecture)

The PowerPoint presentation for this lecture can be found in the project folder (PPI_networks_and_study_bias.pptx).

2. Getting PPI data into R (from IntAct, using PSICQUIC)

You can download protein-protein interaction data directly from IntAct or using PSICQUIC and PSICQUIC client implemented in PSICQUIC R package. Both ways are implemented in PItools package. Downloading data from IntAct gets you all data from all IMEx consortium databases. PSICQUIC route allows to query non-IMEx databases and construct complex search queries.
As implemented in PItools, PSICQUIC route is slow to download (limit of 2500 interactions per second to lower the load on servers). IntAct route is faster but requires memory intensive processing locally (3 GB table).

First, let’s focus on the IntAct route. Later we will download molecular interactions from a large non-IMEx database called BioGRID.

2.1 Getting PPI data into R from IntAct

Load full human-human interactome and count interactions

By species search is taxonomy-hierarchy aware.

# find the date of the latest locally available release
human = fullInteractome(taxid = 9606, database = "IntActFTP", # 9606 - human taxid
                        clean = TRUE,
                        protein_only = TRUE,
                        directory = "./data_files/", # NULL to keep data files inside R library - default
                        releaseORdate = NULL) # useful to keep track of the release date e.g. 2019Mar23, but for the first download set to NULL

## ... looking for the date of the latest IntAct release ...
## ... looking for the date of the latest IntAct release ...

## ... loading local copy ...

## Warning in fread(file_name, header = T, stringsAsFactors = F): Found
## and resolved improper quoting out-of-sample. First healed line 25383:
## <<uniprotkb:P16054 uniprotkb:Q05769 intact:EBI-298451 intact:EBI-298933|
## uniprotkb:Q543K3 psi-mi:kpce_mouse(display_long)|uniprotkb:Prkce(gene
## name)|psi-mi:Prkce(display_short)|uniprotkb:Pkce(gene name synonym)|
## uniprotkb:Pkcea(gene name synonym)|uniprotkb:nPKC-epsilon(gene name
## synonym) psi-mi:pgh2_mouse(display_long)|uniprotkb:Ptgs2(gene name)|
## psi-mi:Ptgs2(display_short)|uniprotkb:Cox-2(gene name synonym)|
## uniprotkb:Cox2(gene name synonym)|uniprotkb:Pghs-b(gene name synonym)|
## uniprotkb:Tis10(gene na>>. If the fields are not quoted (e.g. field
## separator does not appear within any field), try quote="" to avoid this
## warning.

Load all human-viral interactions from IntAct, BioGRID, VirHostNet

10239 is the top viral clade identifier.

# read all IntAct data (filter later)
IntAct = loadIntActFTP(dir = "./data_files/IntActRelease_2019Mar23/")

## ... looking for the date of the latest IntAct release ...

## ... loading local copy ...

## Warning in fread(file_name, header = T, stringsAsFactors = F): Found
## and resolved improper quoting out-of-sample. First healed line 25383:
## <<uniprotkb:P16054 uniprotkb:Q05769 intact:EBI-298451 intact:EBI-298933|
## uniprotkb:Q543K3 psi-mi:kpce_mouse(display_long)|uniprotkb:Prkce(gene
## name)|psi-mi:Prkce(display_short)|uniprotkb:Pkce(gene name synonym)|
## uniprotkb:Pkcea(gene name synonym)|uniprotkb:nPKC-epsilon(gene name
## synonym) psi-mi:pgh2_mouse(display_long)|uniprotkb:Ptgs2(gene name)|
## psi-mi:Ptgs2(display_short)|uniprotkb:Cox-2(gene name synonym)|
## uniprotkb:Cox2(gene name synonym)|uniprotkb:Pghs-b(gene name synonym)|
## uniprotkb:Tis10(gene na>>. If the fields are not quoted (e.g. field
## separator does not appear within any field), try quote="" to avoid this
## warning.

# filter for viral-human interaction
human_viral = interSpeciesInteractome(MITABdata = IntAct,
                                      taxid1 = 9606,  taxid2 = 10239,
                                      database = "IntActFTP",
                                      clean = TRUE, protein_only = TRUE, 
                                      directory = "./data_files/",
                        releaseORdate = "2019Mar23")

## Warning in fread(filename): Found and resolved improper quoting out-of-
## sample. First healed line 5839: <<542959 Bpp1virus "BPP-1-like viruses";
## BPP-1-like phages; Bpp-1-like viruses; Bppunalikevirus Genus Viruses; dsDNA
## viruses, no RNA stage; Caudovirales; Podoviridae 10744 >>. If the fields
## are not quoted (e.g. field separator does not appear within any field), try
## quote="" to avoid this warning.

## Warning in fread(filename): Found and resolved improper quoting out-of-
## sample. First healed line 5839: <<542959 Bpp1virus "BPP-1-like viruses";
## BPP-1-like phages; Bpp-1-like viruses; Bppunalikevirus Genus Viruses; dsDNA
## viruses, no RNA stage; Caudovirales; Podoviridae 10744 >>. If the fields
## are not quoted (e.g. field separator does not appear within any field), try
## quote="" to avoid this warning.

uniqueNinteractions(human_viral)

## [1] 15003

uniqueNinteractors(human_viral, taxid = 9606)

## [1] 4644

uniqueNinteractors(human_viral, taxid = 9606, inverse_filter = T)

## [1] 894

# retrieve all VirHostNet data 
# which does not understand taxonomy hierarhy so interSpeciesInteractome() will not work
VirHostNet_raw = queryPSICQUICrlib(query = "taxid:9606",
                                   format = "tab27", database = "VirHostNet",
                                   directory = "./data_files/",
                                   releaseORdate = NULL)

## downloading using PSICQUIC

VirHostNet = interSpeciesInteractome(taxid1 = 9606,  taxid2 = 10239, 
                                     MITABdata = VirHostNet_raw,
                                     clean = TRUE, protein_only = TRUE, 
                                     directory = "./data_files/")

## ... looking for the date of the latest IntAct release ...

## Warning in fread(filename): Found and resolved improper quoting out-of-
## sample. First healed line 5839: <<542959 Bpp1virus "BPP-1-like viruses";
## BPP-1-like phages; Bpp-1-like viruses; Bppunalikevirus Genus Viruses; dsDNA
## viruses, no RNA stage; Caudovirales; Podoviridae 10744 >>. If the fields
## are not quoted (e.g. field separator does not appear within any field), try
## quote="" to avoid this warning.

## Warning in fread(filename): Found and resolved improper quoting out-of-
## sample. First healed line 5839: <<542959 Bpp1virus "BPP-1-like viruses";
## BPP-1-like phages; Bpp-1-like viruses; Bppunalikevirus Genus Viruses; dsDNA
## viruses, no RNA stage; Caudovirales; Podoviridae 10744 >>. If the fields
## are not quoted (e.g. field separator does not appear within any field), try
## quote="" to avoid this warning.

# filter out human-human and viral-viral interactions 
VirHostNet$data = VirHostNet$data[!Taxid_interactor_A == Taxid_interactor_B]
uniqueNinteractions(VirHostNet)

## [1] 23109

uniqueNinteractors(VirHostNet, taxid = 9606)

## [1] 6216

uniqueNinteractors(VirHostNet, taxid = 9606, inverse_filter = T)

## [1] 766

# Find which interactions are uniquely present in VirHostNet (not IntAct)
VirHostNet$data = VirHostNet$data[!pair_id %in% human_viral$data$pair_id]
uniqueNinteractions(VirHostNet)

## [1] 17822

uniqueNinteractors(VirHostNet, taxid = 9606)

## [1] 5512

uniqueNinteractors(VirHostNet, taxid = 9606, inverse_filter = T)

## [1] 679

Load all human-mouse interactions

# filter for viral-mouse interaction
human_mouse = interSpeciesInteractome(taxid1 = 9606,  taxid2 = 10090,
                                      database = "IntActFTP", MITABdata = IntAct,
                                      clean = TRUE, protein_only = TRUE,
                                      directory = "./data_files/",
                        releaseORdate = "2019Mar23")
# save RData file to save time processing this data if you want to proceed to later sections immediately
rm(IntAct)
save(list = ls(), file = "./RData_steps/study_bias_doc_1.RData")

You can find out any other taxid if you know species name using UniProt taxonomy search: https://www.uniprot.org/taxonomy/

Count interactors and interactions

# load previosly saved RData
load("./RData_steps/study_bias_doc_1.RData")

uniqueNinteractions(human)

## [1] 196473

uniqueNinteractors(human)

## [1] 21521

uniqueNinteractions(human_viral)

## [1] 15003

uniqueNinteractors(human_viral)

## [1] 5538

uniqueNinteractions(human_mouse)

## [1] 24960

uniqueNinteractors(human_mouse)

## [1] 10114

# how many of these proteins are human?
uniqueNinteractors(human_mouse, taxid = 9606)

## [1] 7060

filter human-human data by detection method and recount interactors and interactions

# subset two-hybrid interactions
human_two_hybrid = subsetMITABbyMethod(MITABdata = human,
                                       Interaction_detection_methods = "MI:0018")

## loading local copy of MI ontology

uniqueNinteractions(human_two_hybrid)

## [1] 83919

uniqueNinteractors(human_two_hybrid)

## [1] 14396

# subset all interactions but two-hybrid
human_NOT_two_hybrid = subsetMITABbyMethod(MITABdata = human,
                                           Interaction_detection_methods = "MI:0018", inverse_filter = T)

## loading local copy of MI ontology

uniqueNinteractions(human_NOT_two_hybrid)

## [1] 116651

uniqueNinteractors(human_NOT_two_hybrid)

## [1] 16616

# subset affinity purification - mass spectrometry interactions
human_AP_MS = subsetMITABbyMethod(MITABdata = human,
                                  Interaction_detection_methods = "MI:0004",  Identification_method = "MI:0433")

## loading local copy of MI ontology

uniqueNinteractions(human_AP_MS)

## [1] 69044

uniqueNinteractors(human_AP_MS)

## [1] 12465

filter human-human data by PMID and recount interactors and interactions

# subset both published and unpublished Vidal group data
Vidal_all = subsetMITABbyPMIDs(MITABdata = human,
                               PMIDs = c("25416956", "unassigned1304"))
uniqueNinteractions(Vidal_all)

## [1] 59037

uniqueNinteractors(Vidal_all)

## [1] 9756

# subset Mattias Mann 2015 paper data
Mann = subsetMITABbyPMIDs(MITABdata = human,
                          PMIDs = "26496610")
uniqueNinteractions(Mann)

## [1] 15589

uniqueNinteractors(Mann)

## [1] 4950

You can get help and more details on these functions (for example, how to find which molecular ontology terms correspond to which methods): ?subsetMITABbyMethod

Find interactions between components of the mediator complex in the Vidal dataset

mediator_complex_proteins = fread("http://www.uniprot.org/uniprot/?query=GO:0016592%20AND%20taxonomy:9606&format=tab&columns=id")
mediator_complex = subsetMITABbyID(Vidal_all,
                                   ID_seed = mediator_complex_proteins$Entry,
                                   within_seed = T, only_seed2nonseed = F)
uniqueNinteractions(mediator_complex)

## [1] 5

uniqueNinteractors(mediator_complex)

## [1] 7

Find interactions of the components of the mediator complex with other proteins

mediator_complex_interactions = subsetMITABbyID(Vidal_all,
                                                ID_seed = mediator_complex_proteins$Entry,
                                                within_seed = F, only_seed2nonseed = T)
uniqueNinteractions(mediator_complex_interactions)

## [1] 179

uniqueNinteractors(mediator_complex_interactions)

## [1] 179

2.2 Getting PPI data from non-IMEx databases into R using PSICQUIC

Let’s retrieve all human-viral interactions from BioGrid database. But first we need to find correct database names.

Find correct database names.

client = PSICQUIC()
providers(client)

##  [1] "BioGrid"           "bhf-ucl"           "ChEMBL"           
##  [4] "HPIDb"             "InnateDB"          "InnateDB-All"     
##  [7] "IntAct"            "IMEx"              "mentha"           
## [10] "MPIDB"             "iRefIndex"         "MatrixDB"         
## [13] "MINT"              "Reactome"          "Reactome-FIs"     
## [16] "EBI-GOA-miRNA"     "UniProt"           "MBInfo"           
## [19] "VirHostNet"        "BAR"               "EBI-GOA-nonIntAct"
## [22] "ZINC"

# Don't run
human_viral_BioGrid = interSpeciesInteractome(taxid1 = 9606,  taxid2 = 10239,
                                              database = "BioGrid",
                                              clean = TRUE, protein_only = TRUE,
                                              directory = "./data_files/")
human_BioGrid = fullInteractome(taxid = 9606, database = "BioGrid", # 9606 - human taxid
                        clean = TRUE,
                        protein_only = TRUE,
                        directory = "./data_files/")

2.3 Advanced search

This can be useful in you need interactions for a small number of proteins or if you want to query non-IMEx databases. Note that queryPSICQUIC doesn’t keep track of database version data, while queryPSICQUICrlib does.

# Query for interactions of bacterial RNA polymerase sigma factor SigA identified using two-hybrid methods in all imex databases
queryPSICQUIC(query = "id:P74565 AND detmethod:\"MI:0018\"",
              format = "tab27",
              database = "imex",
              file = "./data_files/P74565_2H_interactions_imex_tab27.tsv")

## Warning in if (N_interactions > 0) {: the condition has length > 1 and only
## the first element will be used

## Warning in N_SPECIES_ID_interactome[indices] <- N_interactions: number of
## items to replace is not a multiple of replacement length

##                                     query
##  1: id:P74565 AND detmethod:%22MI:0018%22
##  2: id:P74565 AND detmethod:%22MI:0018%22
##  3: id:P74565 AND detmethod:%22MI:0018%22
##  4: id:P74565 AND detmethod:%22MI:0018%22
##  5: id:P74565 AND detmethod:%22MI:0018%22
##  6: id:P74565 AND detmethod:%22MI:0018%22
##  7: id:P74565 AND detmethod:%22MI:0018%22
##  8: id:P74565 AND detmethod:%22MI:0018%22
##  9: id:P74565 AND detmethod:%22MI:0018%22
## 10: id:P74565 AND detmethod:%22MI:0018%22
## 11: id:P74565 AND detmethod:%22MI:0018%22
##                                                   file format
##  1: ./data_files/P74565_2H_interactions_imex_tab27.tsv  tab27
##  2: ./data_files/P74565_2H_interactions_imex_tab27.tsv  tab27
##  3: ./data_files/P74565_2H_interactions_imex_tab27.tsv  tab27
##  4: ./data_files/P74565_2H_interactions_imex_tab27.tsv  tab27
##  5: ./data_files/P74565_2H_interactions_imex_tab27.tsv  tab27
##  6: ./data_files/P74565_2H_interactions_imex_tab27.tsv  tab27
##  7: ./data_files/P74565_2H_interactions_imex_tab27.tsv  tab27
##  8: ./data_files/P74565_2H_interactions_imex_tab27.tsv  tab27
##  9: ./data_files/P74565_2H_interactions_imex_tab27.tsv  tab27
## 10: ./data_files/P74565_2H_interactions_imex_tab27.tsv  tab27
## 11: ./data_files/P74565_2H_interactions_imex_tab27.tsv  tab27
##     all.databases n.interactions.in.database database.not.active
##  1:        IntAct                         44                    
##  2:          MINT                          0                    
##  3:       bhf-ucl                          0                    
##  4:         MPIDB                          0                    
##  5:      MatrixDB                     <html>                    
##  6:         HPIDb                          0                    
##  7:      I2D-IMEx                                       I2D-IMEx
##  8: InnateDB-IMEx                                  InnateDB-IMEx
##  9:        MolCon                                         MolCon
## 10:       UniProt                          0                    
## 11:        MBInfo                          0

# Query for interactions of sigma factor SigA identified using two-hybrid methods in mentha (a database that aggregates data from all primary databases, but does no interaction predition)
queryPSICQUIC(query = "id:P74565 AND detmethod:\"MI:0018\"",
              format = "tab25",
              database = "mentha",
              file = "./data_files/P74565_2H_interactions_mentha_tab25.tsv")

##                                    query
## 1: id:P74565 AND detmethod:%22MI:0018%22
##                                                    file format
## 1: ./data_files/P74565_2H_interactions_mentha_tab25.tsv  tab25
##    all.databases n.interactions.in.database database.not.active
## 1:        mentha                         44

# Query for interactions of sigma factor SigA in mentha
queryPSICQUIC(query = "id:P74565",
              format = "tab25",
              database = "mentha",
              file = "./data_files/P74565_2H_interactions_mentha_tab25.tsv")

##        query                                                 file format
## 1: id:P74565 ./data_files/P74565_2H_interactions_mentha_tab25.tsv  tab25
##    all.databases n.interactions.in.database database.not.active
## 1:        mentha                         44

# Retrieve interaction of any proteins encoded by Beta-adrenergic receptor kinase 1 gene (Entrez GeneID 156) from BioGRID (which recognises only this type of ID)
queryPSICQUIC(query = "id:156",
              format = "tab25",
              database = "BioGrid",
              file = "./data_files/entrezgene156_interactions_BioGrid_tab25.tsv")

##     query                                                      file format
## 1: id:156 ./data_files/entrezgene156_interactions_BioGrid_tab25.tsv  tab25
##    all.databases n.interactions.in.database database.not.active
## 1:       BioGrid                         82

# The function return the report of how many interaction were found in each database, not the data itself. Reading data into R.
fread("./data_files/entrezgene156_interactions_BioGrid_tab25.tsv", header = T, stringsAsFactors = F)[1:5]

##                            V1                         V2
## 1:  entrez gene/locuslink:156 entrez gene/locuslink:5579
## 2:  entrez gene/locuslink:156 entrez gene/locuslink:5148
## 3:  entrez gene/locuslink:156 entrez gene/locuslink:6714
## 4: entrez gene/locuslink:6714  entrez gene/locuslink:156
## 5: entrez gene/locuslink:5148  entrez gene/locuslink:156
##                                                                             V3
## 1:                                 biogrid:106665|entrez gene/locuslink:ADRBK1
## 2:                                 biogrid:106665|entrez gene/locuslink:ADRBK1
## 3:                                 biogrid:106665|entrez gene/locuslink:ADRBK1
## 4: biogrid:112592|entrez gene/locuslink:SRC|entrez gene/locuslink:RP5-823N20.1
## 5:                                  biogrid:111174|entrez gene/locuslink:PDE6G
##                                                                             V4
## 1:                                  biogrid:111565|entrez gene/locuslink:PRKCB
## 2:                                  biogrid:111174|entrez gene/locuslink:PDE6G
## 3: biogrid:112592|entrez gene/locuslink:SRC|entrez gene/locuslink:RP5-823N20.1
## 4:                                 biogrid:106665|entrez gene/locuslink:ADRBK1
## 5:                                 biogrid:106665|entrez gene/locuslink:ADRBK1
##                                                                                                                                                                                            V5
## 1:                                            entrez gene/locuslink:BARK1(gene name synonym)|entrez gene/locuslink:BETA-ARK1(gene name synonym)|entrez gene/locuslink:GRK2(gene name synonym)
## 2:                                            entrez gene/locuslink:BARK1(gene name synonym)|entrez gene/locuslink:BETA-ARK1(gene name synonym)|entrez gene/locuslink:GRK2(gene name synonym)
## 3:                                            entrez gene/locuslink:BARK1(gene name synonym)|entrez gene/locuslink:BETA-ARK1(gene name synonym)|entrez gene/locuslink:GRK2(gene name synonym)
## 4: entrez gene/locuslink:ASV(gene name synonym)|entrez gene/locuslink:SRC1(gene name synonym)|entrez gene/locuslink:c-SRC(gene name synonym)|entrez gene/locuslink:p60-Src(gene name synonym)
## 5:                                                                                                entrez gene/locuslink:PDEG(gene name synonym)|entrez gene/locuslink:RP57(gene name synonym)
##                                                                                                                                                                                                 V6
## 1: entrez gene/locuslink:PKC-beta(gene name synonym)|entrez gene/locuslink:PKCB(gene name synonym)|entrez gene/locuslink:PRKCB1(gene name synonym)|entrez gene/locuslink:PRKCB2(gene name synonym)
## 2:                                                                                                     entrez gene/locuslink:PDEG(gene name synonym)|entrez gene/locuslink:RP57(gene name synonym)
## 3:      entrez gene/locuslink:ASV(gene name synonym)|entrez gene/locuslink:SRC1(gene name synonym)|entrez gene/locuslink:c-SRC(gene name synonym)|entrez gene/locuslink:p60-Src(gene name synonym)
## 4:                                                 entrez gene/locuslink:BARK1(gene name synonym)|entrez gene/locuslink:BETA-ARK1(gene name synonym)|entrez gene/locuslink:GRK2(gene name synonym)
## 5:                                                 entrez gene/locuslink:BARK1(gene name synonym)|entrez gene/locuslink:BETA-ARK1(gene name synonym)|entrez gene/locuslink:GRK2(gene name synonym)
##                                                    V7             V8
## 1: psi-mi:MI:0004(affinity chromatography technology) Yang XL (2003)
## 2: psi-mi:MI:0004(affinity chromatography technology)  Wan KF (2003)
## 3: psi-mi:MI:0004(affinity chromatography technology)  Wan KF (2003)
## 4: psi-mi:MI:0004(affinity chromatography technology)  Wan KF (2003)
## 5: psi-mi:MI:0004(affinity chromatography technology)  Wan KF (2003)
##                 V9        V10        V11
## 1: pubmed:12679936 taxid:9606 taxid:9606
## 2: pubmed:12624098 taxid:9606 taxid:9606
## 3: pubmed:12624098 taxid:9606 taxid:9606
## 4: pubmed:12624098 taxid:9606 taxid:9606
## 5: pubmed:12624098 taxid:9606 taxid:9606
##                                     V12                     V13
## 1: psi-mi:MI:0915(physical association) psi-mi:MI:0463(biogrid)
## 2: psi-mi:MI:0915(physical association) psi-mi:MI:0463(biogrid)
## 3: psi-mi:MI:0915(physical association) psi-mi:MI:0463(biogrid)
## 4: psi-mi:MI:0915(physical association) psi-mi:MI:0463(biogrid)
## 5: psi-mi:MI:0915(physical association) psi-mi:MI:0463(biogrid)
##               V14 V15
## 1: biogrid:259364   -
## 2: biogrid:259971   -
## 3: biogrid:259972   -
## 4: biogrid:259974   -
## 5: biogrid:259977   -

All the same operations can be done using function queryPSICQUICrlib but with the convienience of automatic tracking of database release date and the exact query text. This function also return the data in object of class RAW_MItab that after cleaned make data ready for use with other tools in the package.

BioGrid_156 = queryPSICQUICrlib(query = "id:156",
                                format = "tab25",
                                database = "BioGrid",
                                directory = "./data_files/")

## downloading using PSICQUIC

# The same protein, but only two-hybrid interactions
BioGrid_156_2H = queryPSICQUICrlib(query = "id:156 AND detmethod:\"MI:0018\"",
                                   format = "tab25",
                                   database = "BioGrid",
                                   directory = "./data_files/")

## downloading using PSICQUIC

# The data returned by queryPSICQUICrlib constains auxillary information that is not necessary for most analysis. Let's clean the data.
cleanMITAB(BioGrid_156)[1:5]

## $data
##     IDs_interactor_A IDs_interactor_B interactor_IDs_databases_A
##  1:              156             5579      entrez gene/locuslink
##  2:              156             5148      entrez gene/locuslink
##  3:              156             6714      entrez gene/locuslink
##  4:              156             2776      entrez gene/locuslink
##  5:              156             2769      entrez gene/locuslink
##  6:              156             5159      entrez gene/locuslink
##  7:             1460              156      entrez gene/locuslink
##  8:              156            25489      entrez gene/locuslink
##  9:              156             9815      entrez gene/locuslink
## 10:              156            28964      entrez gene/locuslink
## 11:              156             6010      entrez gene/locuslink
## 12:              156              857      entrez gene/locuslink
## 13:             1233              156      entrez gene/locuslink
## 14:              156             1910      entrez gene/locuslink
## 15:              156              185      entrez gene/locuslink
## 16:              156             2492      entrez gene/locuslink
## 17:              154              156      entrez gene/locuslink
## 18:              156            28964      entrez gene/locuslink
## 19:              156             5082      entrez gene/locuslink
## 20:              156             5132      entrez gene/locuslink
## 21:              156             6622      entrez gene/locuslink
## 22:              156             4193      entrez gene/locuslink
## 23:              156              408      entrez gene/locuslink
## 24:              156             2782      entrez gene/locuslink
## 25:              156             2782      entrez gene/locuslink
## 26:              156            30818      entrez gene/locuslink
## 27:              156            26260      entrez gene/locuslink
## 28:              154              156      entrez gene/locuslink
## 29:              156             5037      entrez gene/locuslink
## 30:              156           223864      entrez gene/locuslink
## 31:              156             4193      entrez gene/locuslink
## 32:              156             2776      entrez gene/locuslink
## 33:              156             1656      entrez gene/locuslink
## 34:              156            54919      entrez gene/locuslink
## 35:            10979              156      entrez gene/locuslink
## 36:            10914              156      entrez gene/locuslink
## 37:              156            23398      entrez gene/locuslink
## 38:            11212              156      entrez gene/locuslink
## 39:              156             5792      entrez gene/locuslink
## 40:              156             2911      entrez gene/locuslink
## 41:              156             1969      entrez gene/locuslink
## 42:              156             5923      entrez gene/locuslink
## 43:              156             3320      entrez gene/locuslink
## 44:              156             3326      entrez gene/locuslink
## 45:              156              157      entrez gene/locuslink
## 46:              156               31      entrez gene/locuslink
## 47:              156             4343      entrez gene/locuslink
## 48:              156             4193      entrez gene/locuslink
## 49:              156             2840      entrez gene/locuslink
## 50:              156             2783      entrez gene/locuslink
## 51:              156            16554      entrez gene/locuslink
## 52:              156            53970      entrez gene/locuslink
## 53:              156           238799      entrez gene/locuslink
## 54:              156             6257      entrez gene/locuslink
## 55:              156             9128      entrez gene/locuslink
## 56:              156             9646      entrez gene/locuslink
## 57:              156            57606      entrez gene/locuslink
## 58:              156            79731      entrez gene/locuslink
## 59:            10013              156      entrez gene/locuslink
## 60:           814753           816413      entrez gene/locuslink
## 61:           814753           829142      entrez gene/locuslink
## 62:           814753           829480      entrez gene/locuslink
## 63:              156            26051      entrez gene/locuslink
## 64:              156              157      entrez gene/locuslink
## 65:              156            57533      entrez gene/locuslink
##     IDs_interactor_A IDs_interactor_B interactor_IDs_databases_A
##     interactor_IDs_databases_B Taxid_interactor_A Taxid_interactor_B
##  1:      entrez gene/locuslink               9606               9606
##  2:      entrez gene/locuslink               9606               9606
##  3:      entrez gene/locuslink               9606               9606
##  4:      entrez gene/locuslink               9606               9606
##  5:      entrez gene/locuslink               9606               9606
##  6:      entrez gene/locuslink               9606               9606
##  7:      entrez gene/locuslink               9606               9606
##  8:      entrez gene/locuslink               9606              10116
##  9:      entrez gene/locuslink               9606               9606
## 10:      entrez gene/locuslink               9606               9606
## 11:      entrez gene/locuslink               9606               9606
## 12:      entrez gene/locuslink               9606               9606
## 13:      entrez gene/locuslink               9606               9606
## 14:      entrez gene/locuslink               9606               9606
## 15:      entrez gene/locuslink               9606               9606
## 16:      entrez gene/locuslink               9606               9606
## 17:      entrez gene/locuslink               9606               9606
## 18:      entrez gene/locuslink               9606               9606
## 19:      entrez gene/locuslink               9606               9606
## 20:      entrez gene/locuslink               9606               9606
## 21:      entrez gene/locuslink               9606               9606
## 22:      entrez gene/locuslink               9606               9606
## 23:      entrez gene/locuslink               9606               9606
## 24:      entrez gene/locuslink               9606               9606
## 25:      entrez gene/locuslink               9606               9606
## 26:      entrez gene/locuslink               9606               9606
## 27:      entrez gene/locuslink               9606               9606
## 28:      entrez gene/locuslink               9606               9606
## 29:      entrez gene/locuslink               9606               9606
## 30:      entrez gene/locuslink               9606              10090
## 31:      entrez gene/locuslink               9606               9606
## 32:      entrez gene/locuslink               9606               9606
## 33:      entrez gene/locuslink               9606               9606
## 34:      entrez gene/locuslink               9606               9606
## 35:      entrez gene/locuslink               9606               9606
## 36:      entrez gene/locuslink               9606               9606
## 37:      entrez gene/locuslink               9606               9606
## 38:      entrez gene/locuslink               9606               9606
## 39:      entrez gene/locuslink               9606               9606
## 40:      entrez gene/locuslink               9606               9606
## 41:      entrez gene/locuslink               9606               9606
## 42:      entrez gene/locuslink               9606               9606
## 43:      entrez gene/locuslink               9606               9606
## 44:      entrez gene/locuslink               9606               9606
## 45:      entrez gene/locuslink               9606               9606
## 46:      entrez gene/locuslink               9606               9606
## 47:      entrez gene/locuslink               9606               9606
## 48:      entrez gene/locuslink               9606               9606
## 49:      entrez gene/locuslink               9606               9606
## 50:      entrez gene/locuslink               9606               9606
## 51:      entrez gene/locuslink               9606              10090
## 52:      entrez gene/locuslink               9606              10090
## 53:      entrez gene/locuslink               9606              10090
## 54:      entrez gene/locuslink               9606               9606
## 55:      entrez gene/locuslink               9606               9606
## 56:      entrez gene/locuslink               9606               9606
## 57:      entrez gene/locuslink               9606               9606
## 58:      entrez gene/locuslink               9606               9606
## 59:      entrez gene/locuslink               9606               9606
## 60:      entrez gene/locuslink               3702               3702
## 61:      entrez gene/locuslink               3702               3702
## 62:      entrez gene/locuslink               3702               3702
## 63:      entrez gene/locuslink               9606               9606
## 64:      entrez gene/locuslink               9606               9606
## 65:      entrez gene/locuslink               9606               9606
##     interactor_IDs_databases_B Taxid_interactor_A Taxid_interactor_B
##     Publication_Identifiers Confidence_values       pair_id
##  1:                12679936                NA      156|5579
##  2:                12624098                NA      156|5148
##  3:                12624098                NA      156|6714
##  4:                12885252                NA      156|2776
##  5:                12885252                NA      156|2769
##  6:                12381737                NA      156|5159
##  7:                21900206                NA      1460|156
##  8:                19953087                NA     156|25489
##  9:                10896954                NA      156|9815
## 10:                10896954                NA     156|28964
## 11:                10722682                NA      156|6010
## 12:                10085129                NA       156|857
## 13:                10754297                NA      1233|156
## 14:                 9211925                NA      156|1910
## 15:                 9211925                NA       156|185
## 16:                10379886                NA      156|2492
## 17:                11309381                NA       154|156
## 18:                 9826657                NA     156|28964
## 19:                10884381                NA      156|5082
## 20:                10884381                NA      156|5132
## 21:                17893145                NA      156|6622
## 22:                21081496                NA      156|4193
## 23:                21081496                NA       156|408
## 24:                17548351                NA      156|2782
## 25:                10744734                NA      156|2782
## 26:                17102134                NA     156|30818
## 27:                23940030                NA     156|26260
## 28:                14654844                NA       154|156
## 29:                14654844                NA      156|5037
## 30:                20861385                NA    156|223864
## 31:                17006543                NA      156|4193
## 32:                21940795                NA      156|2776
## 33:                26344197                NA      156|1656
## 34:                26344197                NA     156|54919
## 35:                26344197                NA     10979|156
## 36:                26344197                NA     10914|156
## 37:                26344197                NA     156|23398
## 38:                26344197                NA     11212|156
## 39:                26344197                NA      156|5792
## 40:                16091361                NA      156|2911
## 41:                21988832                NA      156|1969
## 42:                21988832                NA      156|5923
## 43:                23455922                NA      156|3320
## 44:                23455922                NA      156|3326
## 45:                23455922                NA       156|157
## 46:                23455922                NA        156|31
## 47:                22658674                NA      156|4343
## 48:                26228571                NA      156|4193
## 49:                26228571                NA      156|2840
## 50:                25982117                NA      156|2783
## 51:                26496610                NA     156|16554
## 52:                26496610                NA     156|53970
## 53:                26496610                NA    156|238799
## 54:                26496610                NA      156|6257
## 55:                26496610                NA      156|9128
## 56:                26496610                NA      156|9646
## 57:                26496610                NA     156|57606
## 58:                26344197                NA     156|79731
## 59:                22193721                NA     10013|156
## 60:                26071421                NA 814753|816413
## 61:                26071421                NA 814753|829142
## 62:                26071421                NA 814753|829480
## 63:                28514442                NA     156|26051
## 64:                28514442                NA       156|157
## 65:                28514442                NA     156|57533
##     Publication_Identifiers Confidence_values       pair_id
## 
## $metadata
##     query                                                            file
## 1: id:156 ./data_files/DownloadDate_20190411/query_id.156tab25BioGrid.tsv
##    format all.databases n.interactions.in.database database.not.active
## 1:  tab25       BioGrid                         82                    
##    date_time
## 1: 2019Mar23
## 
## $<NA>
## NULL
## 
## $<NA>
## NULL
## 
## $<NA>
## NULL

3. Finding publication (Pubmed) IDs for every protein

There are many ways to find publications for a given protein (all proteins). We will discuss 3 and focus on one (UniProtKB references).

UniProtKB references. Details: https://www.uniprot.org/help/publications_section
Quote:
“The set of publications fully curated in UniProtKB/Swiss-Prot and publications imported in UniProtKB/TrEMBL is complemented by additional publications that have been computationally mapped from other resources to UniProtKB entries.”

https://www.uniprot.org/help/publications_section#additional_bibliography:
“As a comprehensive and high-quality resource of protein sequence and functional information, UniProtKB strives to provide comprehensive literature citations associated with protein sequences and their characterization. Currently about 2 thirds of the UniProtKB PubMed citations are found in UniProtKB/Swiss-Prot, as a result of active integration in the course of manual curation.

In order to keep up with the explosive growth of literature and to give our users access to additional publications, we decided to integrate additional sources of literature from other annotated databases into UniProtKB. For this purpose we selected a number external databases, e.g. Entrez Gene (GeneRIFs), SGD, MGI, GAD and PDB, and extracted citations that were mapped to UniProtKB entries. This additional protein bibliography information helps our users to better explore the existing knowledge of their proteins of interest."

Europe PMC text-mining efforts yeld multiple identifier-publication associations. Details can be found here: https://europepmc.org/downloads. UniProtKB to pubmedID mapping can be downloaded from ftp: ftp://ftp.ebi.ac.uk/pub/databases/pmc/TextMinedTerms/
NCBI portal: references for entrez geneID. Similar to UniProtKB references, but different ID, documentation about how these links are maintained is provided here: https://www.ncbi.nlm.nih.gov/entrez/query/static/entrezlinks.html#gene. GeneID to pubmedID mapping can be downloaded from NCBI ftp: ftp://ftp.ncbi.nih.gov/gene/DATA/gene2pubmed.gz

3.1 Retrive UniProtKB references using Uniprot REST API

Let’s download the list of NCBI PubMed identifiers (PubMed IDs) associated with the UniProtKB entry (human proteins) and computationally mapped to the UniProtKB entry (human proteins).

# retrieve human interactions (taxonomy:9606) from SwissProt, a manually reviewed part or the UniProt (reviewed:yes)
url = "https://www.uniprot.org/uniprot/?query=taxonomy:9606&compress=yes&format=tab&columns=id,citation,citationmapping"
filename.gz = "./data_files/uniprot2pubmed_human.tsv.gz"
filename = "./data_files/uniprot2pubmed_human.tsv"
if(!file.exists(filename.gz)) download(url, filename.gz)
R.utils::gunzip(filename.gz, filename, remove = F, overwrite = T)

# Read into R
uniprot2pubmed_human = fread(filename, header = T, stringsAsFactors = F)

# We need to do some processing
## 1. Merge manually associated and computationally mapped publications 
uniprot2pubmed_human[, PMIDs := paste0(`PubMed ID`,";",`Mapped PubMed ID`)]
uniprot2pubmed_human[`PubMed ID` == "" & `Mapped PubMed ID` == "", PMIDs := ""]
uniprot2pubmed_human[`PubMed ID` == "" & `Mapped PubMed ID` != "", PMIDs := `Mapped PubMed ID`]
uniprot2pubmed_human[`PubMed ID` != "" & `Mapped PubMed ID` == "", PMIDs := `PubMed ID`]
uniprot2pubmed_human$`PubMed ID` = NULL
uniprot2pubmed_human$`Mapped PubMed ID` = NULL
## 2. Split concatenated PubMed IDs and count their number per protein
uniprot2pubmed_human = uniprot2pubmed_human[, .(N_PMIDs = uniqueN(unlist(tstrsplit(PMIDs,";")))), by = Entry]

# remove objects necessary for the next part, save
rm(list = ls()[!ls() %in% c("human", "uniprot2pubmed_human", "Vidal_all", "Mann", "human_two_hybrid", "human_AP_MS")])
save(list = ls(), file = "./RData_steps/study_bias_doc_2.RData")

4. Evaluating bias of protein popularity in the literature

Why should you care about the study bias in your network and how to evaluate it

Back to the presentation, slide 61. Study bias is the phenomenon that causes certain properties of proteins to appear biologically meaningful but actually confounded with how well studies proteins are overall. Study bias may cause certain proteins to appear as hubs in the protein interaction network. This may lead to false conclusions regarding the importance of the protein in the network. Therefore, it is important to estimate and if possible correct the effect of study bias on the property of your interest.

First, we need to bin proteins by how often they are mentioned in the literature

Let’s use the number of UniProt reference as a proxy for how well studied the protein is. We will bin all proteins that have interactions in our network by the number of publications.

load("./RData_steps/study_bias_doc_2.RData")
uniprot2pubmed_human = uniprot2pubmed_human[order(N_PMIDs, decreasing = T)]
uniprot2pubmed_human_w = uniprot2pubmed_human[Entry %in% extractInteractors(human)]

# Most proteins have very low number of publications:
hist(log10(uniprot2pubmed_human_w$N_PMIDs), xlab = "log10(number of publications) per protein")

n_bins = 50
n_proteins = nrow(uniprot2pubmed_human_w)
n_proteins_per_bin = round(n_proteins/n_bins)
bins = rep(1:50, each = n_proteins_per_bin)
if(n_proteins > length(bins)){
  bins = c(bins, rep(50, n_proteins - length(bins)))
} else if(n_proteins < length(bins)){
  bins = bins[1:n_proteins]
}


uniprot2pubmed_human_w[, bins := bins]
bin_lists = split(uniprot2pubmed_human_w$Entry, uniprot2pubmed_human_w$bins)

Second, we need to count interactions within and between bins

This method shows the gist of the problem of study bias but needs further improvement.

unique_interactions = unique(human$data[, .(IDs_interactor_A, IDs_interactor_B)])
N_interaction_table = lapply(1:length(bin_lists), function(bin_list1_N, unique_interactions, bin_lists){
    N_interaction_table = lapply(1:length(bin_lists), function(bin_list2_N, bin_list1_N, unique_interactions, bin_lists) {
      bin_list1 = bin_lists[[bin_list1_N]]
      bin_list2 = bin_lists[[bin_list2_N]]
      N_proteins = uniqueN(unique_interactions[(IDs_interactor_A %in% bin_list1 &
                                     IDs_interactor_B %in% bin_list2) |
                                     (IDs_interactor_A %in% bin_list2 &
                                     IDs_interactor_B %in% bin_list1), c(IDs_interactor_A, IDs_interactor_B)])
      N_interactions = unique_interactions[,sum((IDs_interactor_A %in% bin_list1 &
                                     IDs_interactor_B %in% bin_list2) |
                                     (IDs_interactor_A %in% bin_list2 &
                                     IDs_interactor_B %in% bin_list1))]
      data.table(N_interactions_per_protein = N_interactions / N_proteins,
                 x = bin_list1_N, y = bin_list2_N,
                 N_interactions = N_interactions, N_proteins = N_proteins)
    }, bin_list1_N, unique_interactions, bin_lists)
    Reduce(rbind, N_interaction_table)
}, unique_interactions, bin_lists)
N_interaction_table = Reduce(rbind, N_interaction_table)

Third, we need to plot our results

cols = c(colorRampPalette(c("white", "#95cbee", "#0099dc", "#4ab04a", "#ffd73e"), bias = 2)(10), #"#c9e2f6"
            colorRampPalette(c("#eec73a", "#e29421", "#e29421", "#f05336","#ce472e"))(20))
ggplot(N_interaction_table, aes(x, y)) +
 geom_raster(aes(fill = N_interactions_per_protein)) +
    xlab("rank by number of studies") +
  ylab("rank by number of studies") +
  theme_bw() +
    scale_fill_gradientn(colours=cols,
                         na.value=rgb(246, 246, 246, max=255),
                         guide=guide_colourbar(ticks=T, nbin=50,
                                               barheight=15, label=T))

Finally, we can write a function to analyse other datasets

First, let’s put what we just did into a function.

interactionBias = function(ppi_network, uniprot2pubmed, n_bins = 50){
  uniprot2pubmed = copy(uniprot2pubmed)
  uniprot2pubmed = uniprot2pubmed[order(N_PMIDs, decreasing = T)]
  uniprot2pubmed = uniprot2pubmed[Entry %in% extractInteractors(human)]
  
  n_proteins = nrow(uniprot2pubmed)
  n_proteins_per_bin = round(n_proteins/n_bins)
  bins = rep(1:n_bins, each = n_proteins_per_bin)
  if(n_proteins > length(bins)){
    bins = c(bins, rep(n_bins, n_proteins - length(bins)))
  } else if(n_proteins < length(bins)){
    bins = bins[1:n_proteins]
  }
  
  uniprot2pubmed[, bins := bins]
  bin_lists = split(uniprot2pubmed$Entry, uniprot2pubmed$bins)
  unique_interactions = unique(ppi_network$data[, .(IDs_interactor_A, IDs_interactor_B)])
  N_interaction_table = lapply(1:length(bin_lists), function(bin_list1_N, unique_interactions, bin_lists){
    N_interaction_table = lapply(1:length(bin_lists), function(bin_list2_N, bin_list1_N, unique_interactions, bin_lists) {
      bin_list1 = bin_lists[[bin_list1_N]]
      bin_list2 = bin_lists[[bin_list2_N]]
      N_proteins = uniqueN(unique_interactions[(IDs_interactor_A %in% bin_list1 &
                                                  IDs_interactor_B %in% bin_list2) |
                                                 (IDs_interactor_A %in% bin_list2 &
                                                    IDs_interactor_B %in% bin_list1), c(IDs_interactor_A, IDs_interactor_B)])
      N_interactions = unique_interactions[,sum((IDs_interactor_A %in% bin_list1 &
                                                   IDs_interactor_B %in% bin_list2) |
                                                  (IDs_interactor_A %in% bin_list2 &
                                                     IDs_interactor_B %in% bin_list1))]
      data.table(N_interactions_per_protein = N_interactions / N_proteins,
                 x = bin_list1_N, y = bin_list2_N,
                 N_interactions = N_interactions, N_proteins = N_proteins)
    }, bin_list1_N, unique_interactions, bin_lists)
    Reduce(rbind, N_interaction_table)
  }, unique_interactions, bin_lists)
  N_interaction_table = Reduce(rbind, N_interaction_table)
  return(N_interaction_table)
}

4.2 You can see that Marc Vidal two-hybrid and Mattias Mann AP-MS data is substantially less biased by protein popularity among in the literature

N_interaction_table[, title := "All IntAct data"]
N_interaction_Mann = interactionBias(ppi_network = Mann, uniprot2pubmed = uniprot2pubmed_human, n_bins = 50)
N_interaction_Mann[, title := "Mattias Mann unbiased AP-MS dataset"]

N_interaction_Vidal = interactionBias(ppi_network = Vidal_all, uniprot2pubmed = uniprot2pubmed_human, n_bins = 50)
N_interaction_Vidal[, title := "Marc Vidal unbiased two-hybrid dataset"]

N_interaction_two_hybrid = interactionBias(ppi_network = human_two_hybrid, uniprot2pubmed = uniprot2pubmed_human, n_bins = 50)
N_interaction_two_hybrid[, title := "all two-hybrid data"]

N_interaction_AP_MS = interactionBias(ppi_network = human_AP_MS, uniprot2pubmed = uniprot2pubmed_human, n_bins = 50)
N_interaction_AP_MS[, title := "all AP-MS data"]

N_interaction = Reduce(rbind, list(N_interaction_Mann, N_interaction_Vidal,
                                   N_interaction_two_hybrid, N_interaction_AP_MS, N_interaction_table))

ggplot(N_interaction, aes(x, y)) +
  geom_raster(aes(fill = N_interactions_per_protein)) +
  xlab("rank by number of studies") + ylab("rank by number of studies") +
  facet_wrap( ~ title, ncol = 2) +
  theme_bw() +
    scale_fill_gradientn(colours=cols,
                         na.value=rgb(246, 246, 246, max=255),
                         guide=guide_colourbar(ticks=T, nbin=50,
                                               barheight=20, label=T))

R and system details

Sys.Date()

## [1] "2019-04-11"

devtools::session_info()

## ─ Session info ──────────────────────────────────────────────────────────
##  setting  value                       
##  version  R version 3.5.1 (2018-07-02)
##  os       macOS High Sierra 10.13.6   
##  system   x86_64, darwin15.6.0        
##  ui       X11                         
##  language (EN)                        
##  collate  en_GB.UTF-8                 
##  ctype    en_GB.UTF-8                 
##  tz       Europe/London               
##  date     2019-04-11                  
## 
## ─ Packages ──────────────────────────────────────────────────────────────
##  package       * version   date       lib source        
##  AnnotationDbi   1.44.0    2018-10-30 [1] Bioconductor  
##  assertthat      0.2.1     2019-03-21 [1] CRAN (R 3.5.1)
##  backports       1.1.3     2018-12-14 [1] CRAN (R 3.5.0)
##  Biobase         2.42.0    2018-10-30 [1] Bioconductor  
##  BiocGenerics  * 0.28.0    2018-10-30 [1] Bioconductor  
##  biomaRt       * 2.38.0    2018-10-30 [1] Bioconductor  
##  Biostrings      2.50.2    2019-01-03 [1] Bioconductor  
##  bit             1.1-14    2018-05-29 [1] CRAN (R 3.5.0)
##  bit64           0.9-7     2017-05-08 [1] CRAN (R 3.5.0)
##  bitops          1.0-6     2013-08-17 [1] CRAN (R 3.5.0)
##  blob            1.1.1     2018-03-25 [1] CRAN (R 3.5.0)
##  callr           3.2.0     2019-03-15 [1] CRAN (R 3.5.2)
##  cli             1.1.0     2019-03-19 [1] CRAN (R 3.5.2)
##  colorspace      1.4-1     2019-03-18 [1] CRAN (R 3.5.2)
##  commonmark      1.7       2018-12-01 [1] CRAN (R 3.5.0)
##  crayon          1.3.4     2017-09-16 [1] CRAN (R 3.5.0)
##  curl            3.3       2019-01-10 [1] CRAN (R 3.5.2)
##  data.table    * 1.12.0    2019-01-13 [1] CRAN (R 3.5.2)
##  DBI             1.0.0     2018-05-02 [1] CRAN (R 3.5.0)
##  desc            1.2.0     2018-05-01 [1] CRAN (R 3.5.0)
##  devtools        2.0.1     2018-10-26 [1] CRAN (R 3.5.1)
##  digest          0.6.18    2018-10-10 [1] CRAN (R 3.5.0)
##  downloader    * 0.4       2015-07-09 [1] CRAN (R 3.5.0)
##  dplyr           0.8.0.1   2019-02-15 [1] CRAN (R 3.5.2)
##  evaluate        0.13      2019-02-12 [1] CRAN (R 3.5.2)
##  fs              1.2.7     2019-03-19 [1] CRAN (R 3.5.2)
##  ggplot2       * 3.1.1     2019-04-07 [1] CRAN (R 3.5.1)
##  glue            1.3.1     2019-03-12 [1] CRAN (R 3.5.2)
##  gsubfn        * 0.7       2018-03-16 [1] CRAN (R 3.5.0)
##  gtable          0.3.0     2019-03-25 [1] CRAN (R 3.5.1)
##  hms             0.4.2     2018-03-10 [1] CRAN (R 3.5.0)
##  htmltools       0.3.6     2017-04-28 [1] CRAN (R 3.5.0)
##  httr          * 1.4.0     2018-12-11 [1] CRAN (R 3.5.0)
##  IRanges       * 2.16.0    2018-10-30 [1] Bioconductor  
##  jsonlite      * 1.6       2018-12-07 [1] CRAN (R 3.5.0)
##  knitr           1.22      2019-03-08 [1] CRAN (R 3.5.1)
##  labeling        0.3       2014-08-23 [1] CRAN (R 3.5.0)
##  lazyeval        0.2.2     2019-03-15 [1] CRAN (R 3.5.2)
##  magrittr        1.5       2014-11-22 [1] CRAN (R 3.5.0)
##  MASS            7.3-50    2018-04-30 [2] CRAN (R 3.5.1)
##  memoise         1.1.0     2017-04-21 [1] CRAN (R 3.5.0)
##  munsell         0.5.0     2018-06-12 [1] CRAN (R 3.5.0)
##  ontologyIndex * 2.5       2019-01-08 [1] CRAN (R 3.5.2)
##  pillar          1.3.1     2018-12-15 [1] CRAN (R 3.5.0)
##  PItools       * 0.1.41    2019-04-11 [1] local         
##  pkgbuild        1.0.3     2019-03-20 [1] CRAN (R 3.5.1)
##  pkgconfig       2.0.2     2018-08-16 [1] CRAN (R 3.5.0)
##  pkgdown         1.3.0     2018-12-07 [1] CRAN (R 3.5.0)
##  pkgload         1.0.2     2018-10-29 [1] CRAN (R 3.5.0)
##  plyr          * 1.8.4     2016-06-08 [1] CRAN (R 3.5.0)
##  prettyunits     1.0.2     2015-07-13 [1] CRAN (R 3.5.0)
##  processx        3.3.0     2019-03-10 [1] CRAN (R 3.5.2)
##  progress        1.2.0     2018-06-14 [1] CRAN (R 3.5.0)
##  proto         * 1.0.0     2016-10-29 [1] CRAN (R 3.5.0)
##  ps              1.3.0     2018-12-21 [1] CRAN (R 3.5.0)
##  PSICQUIC      * 1.20.0    2018-10-30 [1] Bioconductor  
##  purrr           0.3.2     2019-03-15 [1] CRAN (R 3.5.2)
##  R.methodsS3   * 1.7.1     2016-02-16 [1] CRAN (R 3.5.0)
##  R.oo          * 1.22.0    2018-04-22 [1] CRAN (R 3.5.0)
##  R.utils       * 2.8.0     2019-02-14 [1] CRAN (R 3.5.2)
##  R6              2.4.0     2019-02-14 [1] CRAN (R 3.5.2)
##  Rcpp            1.0.1     2019-03-17 [1] CRAN (R 3.5.2)
##  RCurl           1.95-4.12 2019-03-04 [1] CRAN (R 3.5.2)
##  remotes         2.0.2     2018-10-30 [1] CRAN (R 3.5.0)
##  rlang           0.3.1     2019-01-08 [1] CRAN (R 3.5.2)
##  rmarkdown       1.12      2019-03-14 [1] CRAN (R 3.5.2)
##  roxygen2        6.1.1     2018-11-07 [1] CRAN (R 3.5.0)
##  rprojroot       1.3-2     2018-01-03 [1] CRAN (R 3.5.0)
##  RSQLite         2.1.1     2018-05-06 [1] CRAN (R 3.5.0)
##  rstudioapi      0.10      2019-03-19 [1] CRAN (R 3.5.2)
##  S4Vectors     * 0.20.1    2018-11-09 [1] Bioconductor  
##  scales          1.0.0     2018-08-09 [1] CRAN (R 3.5.0)
##  sessioninfo     1.1.1     2018-11-05 [1] CRAN (R 3.5.0)
##  stringi         1.4.3     2019-03-12 [1] CRAN (R 3.5.2)
##  stringr         1.4.0     2019-02-10 [1] CRAN (R 3.5.2)
##  testthat        2.0.1     2018-10-13 [1] CRAN (R 3.5.1)
##  tibble          2.1.1     2019-03-16 [1] CRAN (R 3.5.2)
##  tidyselect      0.2.5     2018-10-11 [1] CRAN (R 3.5.0)
##  usethis         1.5.0     2019-04-07 [1] CRAN (R 3.5.1)
##  withr           2.1.2     2018-03-15 [1] CRAN (R 3.5.0)
##  xfun            0.6       2019-04-02 [1] CRAN (R 3.5.1)
##  XML             3.98-1.19 2019-03-06 [1] CRAN (R 3.5.2)
##  xml2            1.2.0     2018-01-24 [1] CRAN (R 3.5.0)
##  XVector         0.22.0    2018-10-30 [1] Bioconductor  
##  yaml            2.2.0     2018-07-25 [1] CRAN (R 3.5.0)
##  zlibbioc        1.28.0    2018-10-30 [1] Bioconductor  
## 
## [1] /Users/vk7/Library/R/3.5/library
## [2] /Library/Frameworks/R.framework/Versions/3.5/Resources/library

Study bias in the protein network

Vitalii Kleshchevnikov

11/01/2018