Commit 336b02c

v0.1.3 - Add OpenTarget Gold Standards generator
- Add GoldStandards.generate_ot_goldstandards()
1 parent 35a03af commit 336b02c

4 files changed, with 519 additions and 10 deletions

README.md

Lines changed: 18 additions & 2 deletions
@@ -1,5 +1,5 @@
 # pyCfS
-Version 0.1.2 <br>
+Version 0.1.3 <br>
 The aggregation of Lichtarge Lab genotype-phenotype validation experiments<br>

 ## Installation
@@ -17,7 +17,7 @@ conda activate pyCfS<br>
 conda install -c conda-forge r-base r-ggplot2 r-deldir r-rcppeigen r-interp rpy2 rasterio r-tzdb r-vroom r-readr r-cowplot r-tidyverse <br>

 ### Install pyCfS (in anaconda environment)
-pip install git+https://github.com/kevwilhelm95/pyCfS.git <br>
+pip install git+https://github.com/LichtargeLab/pyCfS_Package.git <br>
 (Ensure pip is pointing to anaconda environment, if it is not, use anaconda environment pip: /path/to/env/../bin/pip install git+...) <br>

 #### Examples
@@ -34,6 +34,7 @@ See "example.ipynb" for help
 - `functional_clustering`
 - `statistical_combination`
 - `pyCFS.GoldStandards`
+- `generate_ot_goldstandards`
 - `string_enrichment`
 - `goldstandard_overlap`
 - `ndiffusion`
@@ -129,6 +130,21 @@ Statistical p-value combination methods, including Cauchy, MCM, CMC, Minimum p,


 ## pyCFS.GoldStandards
+### `generate_ot_goldstandards()`
+Identifies high-confidence genes associated with a given phenotype by pulling data from the OpenTargets platform.
+#### Parameters
+- `efo_id` (str): The EFO ID for a given phenotype. Must contain "EFO_####" or "MONDO_#####". Can be found at https://platform.opentargets.org/
+- **Optional**:
+- `min_goldstandards` (int) : Minimum number of gold standards to pull. If filtering for coding variants, the genetic and overall score bins (0.8-1.0, 0.6-0.8, 0.4-0.6, 0.2-0.4, 0.0-0.2) will each be searched in full, from highest to lowest, until the gold standard list is larger than the minimum. If min_genetic_association or min_overall_association is set, this will be overridden. (Default = 10)
+- `min_genetic_association` (float) : The minimum level of evidence for genetic scores to be considered. 0.0 means no evidence, 1.0 means extremely strong evidence. If > 0.0, min_goldstandards will be overridden. (Default = 0.0)
+- `min_overall_association` (float) : The minimum level of evidence for overall scores to be considered. 0.0 means no evidence, 1.0 means extremely strong evidence. If > 0.0, min_goldstandards will be overridden. (Default = 0.0)
+- `filter_for_coding` (bool) : Toggle to check for coding variant evidence, determined by coding variant lead SNPs in GWAS (OT Genetics) or Gene Burden testing. If False, will take the top genes according to min_goldstandards, or all genes passing min_genetic_association/min_overall_association. (Default = True)
+- `savepath` (str): Path to save files.
+#### Returns
+- `list` : List of gold standard genes
+- `pd.DataFrame`: Dataframe containing supporting evidence and filtering criteria of gold standard set.
+
+
 ### `string_enrichment()`
 Assess gene set network connectivity and functional enrichment using the STRING API. Returns the same results as if you were using the web-browser website (string-db.org).
 #### Parameters:
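
The selection rules that the added README text describes for `generate_ot_goldstandards()` (ID format, bin-by-bin search, and threshold override) can be illustrated with a small standalone sketch. This is not the package's implementation: the helper names `is_valid_phenotype_id`, `select_by_bins`, and `select_genes` are hypothetical, a single score per gene stands in for the separate genetic and overall scores, and no OpenTargets data is fetched. Putting the score 1.0 in the top bin is also an assumption.

```python
import re
from typing import Dict, List

# Hypothetical check for the ID shape the README asks for ("EFO_####" or "MONDO_#####").
def is_valid_phenotype_id(efo_id: str) -> bool:
    return re.search(r"(EFO|MONDO)_\d+", efo_id) is not None

# Score bins from the README, ordered from strongest to weakest evidence.
BINS = [(0.8, 1.0), (0.6, 0.8), (0.4, 0.6), (0.2, 0.4), (0.0, 0.2)]

def select_by_bins(scores: Dict[str, float], min_goldstandards: int = 10) -> List[str]:
    """Add whole bins, strongest first, until the list exceeds the minimum."""
    selected: List[str] = []
    for low, high in BINS:
        # Assumption: a score of exactly 1.0 belongs to the top bin.
        in_bin = [g for g, s in scores.items()
                  if low <= s < high or (high == 1.0 and s == 1.0)]
        # Order genes within a bin by descending score, then by name for stability.
        in_bin.sort(key=lambda g: (-scores[g], g))
        selected.extend(in_bin)
        if len(selected) > min_goldstandards:
            break
    return selected

def select_genes(scores: Dict[str, float],
                 min_goldstandards: int = 10,
                 min_association: float = 0.0) -> List[str]:
    """A threshold above 0.0 overrides the bin search, as the README states."""
    if min_association > 0.0:
        kept = [g for g, s in scores.items() if s >= min_association]
        return sorted(kept, key=lambda g: (-scores[g], g))
    return select_by_bins(scores, min_goldstandards)

# Made-up scores, for illustration only.
example_scores = {"A": 0.95, "B": 0.85, "C": 0.7, "D": 0.65, "E": 0.3, "F": 1.0}
print(select_genes(example_scores, min_goldstandards=3))
print(select_genes(example_scores, min_association=0.7))
```

Because each bin is taken in full, the returned list can be noticeably longer than `min_goldstandards`; in the sketch, a minimum of 3 yields five genes because the second bin is added whole.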
