Commit 49aec1f

Merge pull request #272 from Roche/dev
Dev
2 parents edc1f64 + 6836a6e commit 49aec1f

50 files changed: 6951 additions and 5140 deletions

CITATION.cff

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ authors:
 given-names: "Otto"
 orcid: "https://orcid.org/0000-0002-3363-9287"
 title: "Pyreadstat"
-version: 1.2.7
+version: 1.2.8
 doi: 10.5281/zenodo.6612282
 date-released: 2018-09-24
 url: "https://github.com/Roche/pyreadstat"

README.md

Lines changed: 3 additions & 0 deletions
@@ -16,6 +16,9 @@ If you would like to read R RData and Rds files into python in an easy way,
 take a look to [pyreadr](https://github.com/ofajardo/pyreadr), a wrapper
 around the C library [librdata](https://github.com/WizardMac/librdata)
 
+If you would like to effortlessly produce beautiful summaries from pandas dataframes take
+a look to [pysummaries](https://github.com/Genentech/pysummaries)!
+
 
 **DISCLAIMER**
 

change_log.md

Lines changed: 6 additions & 0 deletions
@@ -1,3 +1,9 @@
+# 1.2.8 (github, pypi and conda 2024.10.18)
+* Added Multiple Reponse Data Sets for SAV files #259
+* Fixed pyreadstat not raising error if folder does not exists when writing #269
+* Fixed tests for numpy 2 changes # 266
+* Readstat sources updated to commit ba4392e9d48c4d997d2737719f4cf6320fb66990 on dev branch
+
 # 1.2.7 (github, pypi and conda 2024.03.14)
 * Fixing warnings in new pandas version 2.2.1 fixes #252
 * Remove string encoding for read_por fixes #253
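
The #269 entry above concerns writing to a folder that does not exist. A minimal Python sketch of that kind of guard follows. `check_parent_folder_exists` is a hypothetical helper written for illustration; it is not taken from the pyreadstat sources and may differ from how the library performs the check.

```python
from pathlib import Path
import tempfile


def check_parent_folder_exists(dst_path):
    """Hypothetical guard: raise if the folder that should hold dst_path is missing."""
    parent = Path(dst_path).expanduser().resolve().parent
    if not parent.is_dir():
        raise FileNotFoundError(f"folder does not exist: {parent}")
    return parent


with tempfile.TemporaryDirectory() as tmp:
    # Existing folder: the guard returns the parent directory.
    ok_parent = check_parent_folder_exists(Path(tmp) / "out.sav")
    # Missing folder: the guard raises before any file is written.
    try:
        check_parent_folder_exists(Path(tmp) / "missing" / "out.sav")
        raised = False
    except FileNotFoundError:
        raised = True
```

Failing early like this avoids a partially written or silently missing output file.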
-347 KB
Binary file not shown.

docs/_build/doctrees/index.doctree

2.59 KB
Binary file not shown.

docs/_build/html/.buildinfo

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 # Sphinx build info version 1
 # This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
-config: f22f85c58ffb6d644f63f9173b9d1b95
+config: 3684e396bb8bf7df6bbcdabbae81c680
 tags: 645f666f9bcd5a90fca523b33c5a78b7

docs/_build/html/_sources/index.rst.txt

Lines changed: 7 additions & 0 deletions
@@ -52,6 +52,13 @@ object contains the following fields:
 * variable_storage_width: a dict with keys being variable names and values being the storage width
 * variable_display_width: a dict with keys being variable names and values being the display width
 * variable_measure: a dict with keys being variable names and values being the measure: nominal, ordinal, scale or unknown
+* mr_sets: a dictionary representing the definitions of multiple-response (MR)
+variables in the dataset (currently only supported for SAV format). MR variables
+are arrays composed of several other variables. This metadata entry, `mr_sets`,
+specifies which variables are combined to form each array. Each entry includes:
+`type`, `is_dichotomy`, `counted_value`, `label`, and `variable_list`. The meaning
+of these fields is based on the SPSS specification for multiple-response sets.
+
 
 There are two functions to deal with value labels: set_value_labels and set_catalog_to_sas. You can read about them
 in the next section.
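
The `mr_sets` entry described in this hunk can be walked like any nested dictionary. The sketch below assumes only the five field names listed in the documentation text. The set name, the field values and the `summarize_mr_sets` helper are hypothetical and exist only for illustration.

```python
def summarize_mr_sets(mr_sets):
    """Return one human-readable line per multiple-response set."""
    lines = []
    for name, definition in mr_sets.items():
        kind = "dichotomy" if definition["is_dichotomy"] else "category"
        variables = ", ".join(definition["variable_list"])
        lines.append(f"{name} ({kind}): {definition['label']} <- {variables}")
    return lines


# Hypothetical data shaped like the fields named in the docs. With a real SAV
# file, the dictionary would come from the metadata object, for example:
#   df, meta = pyreadstat.read_sav("survey.sav"); meta.mr_sets
example_mr_sets = {
    "features": {
        "type": "D",  # assumed value, for illustration only
        "is_dichotomy": True,
        "counted_value": 1,  # assumed value, for illustration only
        "label": "Features used",
        "variable_list": ["feat_a", "feat_b", "feat_c"],
    }
}

for line in summarize_mr_sets(example_mr_sets):
    print(line)
```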

docs/_build/html/_static/basic.css

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
  *
  * Sphinx stylesheet -- basic theme.
  *
- * :copyright: Copyright 2007-2023 by the Sphinx team, see AUTHORS.
+ * :copyright: Copyright 2007-2024 by the Sphinx team, see AUTHORS.
  * :license: BSD, see LICENSE for details.
  *
  */

docs/_build/html/_static/doctools.js

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
  *
  * Base JavaScript utilities for all Sphinx HTML documentation.
  *
- * :copyright: Copyright 2007-2023 by the Sphinx team, see AUTHORS.
+ * :copyright: Copyright 2007-2024 by the Sphinx team, see AUTHORS.
  * :license: BSD, see LICENSE for details.
  *
  */

docs/_build/html/_static/documentation_options.js

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 const DOCUMENTATION_OPTIONS = {
-    VERSION: '1.2.7',
+    VERSION: '1.2.8',
     LANGUAGE: 'en',
     COLLAPSE_INDEX: false,
     BUILDER: 'html',

docs/_build/html/_static/language_data.js

Lines changed: 2 additions & 2 deletions
@@ -5,15 +5,15 @@
  * This script contains the language-specific data used by searchtools.js,
  * namely the list of stopwords, stemmer, scorer and splitter.
  *
- * :copyright: Copyright 2007-2023 by the Sphinx team, see AUTHORS.
+ * :copyright: Copyright 2007-2024 by the Sphinx team, see AUTHORS.
  * :license: BSD, see LICENSE for details.
  *
  */
 
 var stopwords = ["a", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in", "into", "is", "it", "near", "no", "not", "of", "on", "or", "such", "that", "the", "their", "then", "there", "these", "they", "this", "to", "was", "will", "with"];
 
 
-/* Non-minified version is copied as a separate JS file, is available */
+/* Non-minified version is copied as a separate JS file, if available */
 
 /**
  * Porter Stemmer

docs/_build/html/_static/searchtools.js

Lines changed: 108 additions & 62 deletions
@@ -4,7 +4,7 @@
  *
  * Sphinx JavaScript utilities for the full-text search.
  *
- * :copyright: Copyright 2007-2023 by the Sphinx team, see AUTHORS.
+ * :copyright: Copyright 2007-2024 by the Sphinx team, see AUTHORS.
  * :license: BSD, see LICENSE for details.
  *
  */
@@ -99,7 +99,7 @@ const _displayItem = (item, searchTerms, highlightTerms) => {
       .then((data) => {
         if (data)
           listItem.appendChild(
-            Search.makeSearchSummary(data, searchTerms)
+            Search.makeSearchSummary(data, searchTerms, anchor)
           );
         // highlight search terms in the summary
         if (SPHINX_HIGHLIGHT_ENABLED) // set in sphinx_highlight.js
@@ -116,8 +116,8 @@ const _finishSearch = (resultCount) => {
     );
   else
     Search.status.innerText = _(
-      `Search finished, found ${resultCount} page(s) matching the search query.`
-    );
+      "Search finished, found ${resultCount} page(s) matching the search query."
+    ).replace('${resultCount}', resultCount);
 };
 const _displayNextItem = (
   results,
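
The `_finishSearch` change above passes a literal string containing the `${resultCount}` placeholder to the translation function and substitutes the number afterwards, presumably so that the lookup key stays constant. A rough Python analogue of that pattern is sketched below. The `translate` helper and the German catalog entry are hypothetical stand-ins, not part of Sphinx.

```python
TEMPLATE = "Search finished, found ${resultCount} page(s) matching the search query."


def translate(message, catalog):
    """Hypothetical stand-in for a gettext-style lookup: fall back to the key."""
    return catalog.get(message, message)


def finished_message(result_count, catalog):
    # Look up the untouched template first, then fill in the placeholder.
    return translate(TEMPLATE, catalog).replace("${resultCount}", str(result_count))


# Hypothetical catalog entry, keyed by the literal template string.
catalog = {TEMPLATE: "Suche beendet, ${resultCount} Seite(n) gefunden."}

print(finished_message(3, catalog))
print(finished_message(3, {}))
```

If the number were interpolated before the lookup, the key would change with every result count and would never match a catalog entry.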
@@ -137,6 +137,22 @@ const _displayNextItem = (
   // search finished, update title and status message
   else _finishSearch(resultCount);
 };
+// Helper function used by query() to order search results.
+// Each input is an array of [docname, title, anchor, descr, score, filename].
+// Order the results by score (in opposite order of appearance, since the
+// `_displayNextItem` function uses pop() to retrieve items) and then alphabetically.
+const _orderResultsByScoreThenName = (a, b) => {
+  const leftScore = a[4];
+  const rightScore = b[4];
+  if (leftScore === rightScore) {
+    // same score: sort alphabetically
+    const leftTitle = a[1].toLowerCase();
+    const rightTitle = b[1].toLowerCase();
+    if (leftTitle === rightTitle) return 0;
+    return leftTitle > rightTitle ? -1 : 1; // inverted is intentional
+  }
+  return leftScore > rightScore ? 1 : -1;
+};
 
 /**
  * Default splitQuery function. Can be overridden in ``sphinx.search`` with a
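
The comparator added in this hunk sorts results in reverse display order because items are later retrieved with `pop()`. The Python sketch below mirrors that logic to show the resulting display order. The sample rows are invented for illustration and follow the `[docname, title, anchor, descr, score, filename]` layout named in the comment.

```python
from functools import cmp_to_key


def order_by_score_then_name(a, b):
    """Ascending by score; ties ordered reverse-alphabetically by title."""
    left_score, right_score = a[4], b[4]
    if left_score == right_score:
        left_title, right_title = a[1].lower(), b[1].lower()
        if left_title == right_title:
            return 0
        return -1 if left_title > right_title else 1  # inverted on purpose
    return 1 if left_score > right_score else -1


# Invented sample rows: [docname, title, anchor, descr, score, filename]
results = [
    ["doc_b", "Beta", "", None, 10, "b.html"],
    ["doc_a", "Alpha", "", None, 10, "a.html"],
    ["doc_c", "Gamma", "", None, 25, "c.html"],
]
results.sort(key=cmp_to_key(order_by_score_then_name))

# Popping from the end yields the highest score first, then A-to-Z on ties.
display_order = []
while results:
    display_order.append(results.pop()[1])
print(display_order)
```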
@@ -160,13 +176,26 @@ const Search = {
   _queued_query: null,
   _pulse_status: -1,
 
-  htmlToText: (htmlString) => {
+  htmlToText: (htmlString, anchor) => {
     const htmlElement = new DOMParser().parseFromString(htmlString, 'text/html');
-    htmlElement.querySelectorAll(".headerlink").forEach((el) => { el.remove() });
+    for (const removalQuery of [".headerlink", "script", "style"]) {
+      htmlElement.querySelectorAll(removalQuery).forEach((el) => { el.remove() });
+    }
+    if (anchor) {
+      const anchorContent = htmlElement.querySelector(`[role="main"] ${anchor}`);
+      if (anchorContent) return anchorContent.textContent;
+
+      console.warn(
+        `Anchored content block not found. Sphinx search tries to obtain it via DOM query '[role=main] ${anchor}'. Check your theme or template.`
+      );
+    }
+
+    // if anchor not specified or not found, fall back to main content
     const docContent = htmlElement.querySelector('[role="main"]');
-    if (docContent !== undefined) return docContent.textContent;
+    if (docContent) return docContent.textContent;
+
     console.warn(
-      "Content block not found. Sphinx search tries to obtain it via '[role=main]'. Could you check your theme or template."
+      "Content block not found. Sphinx search tries to obtain it via DOM query '[role=main]'. Check your theme or template."
     );
     return "";
   },
@@ -239,16 +268,7 @@ const Search = {
     else Search.deferQuery(query);
   },
 
-  /**
-   * execute search (requires search index to be loaded)
-   */
-  query: (query) => {
-    const filenames = Search._index.filenames;
-    const docNames = Search._index.docnames;
-    const titles = Search._index.titles;
-    const allTitles = Search._index.alltitles;
-    const indexEntries = Search._index.indexentries;
-
+  _parseQuery: (query) => {
     // stem the search terms and add them to the correct list
     const stemmer = new Stemmer();
     const searchTerms = new Set();
@@ -284,21 +304,38 @@ const Search = {
     // console.info("required: ", [...searchTerms]);
     // console.info("excluded: ", [...excludedTerms]);
 
-    // array of [docname, title, anchor, descr, score, filename]
-    let results = [];
+    return [query, searchTerms, excludedTerms, highlightTerms, objectTerms];
+  },
+
+  /**
+   * execute search (requires search index to be loaded)
+   */
+  _performSearch: (query, searchTerms, excludedTerms, highlightTerms, objectTerms) => {
+    const filenames = Search._index.filenames;
+    const docNames = Search._index.docnames;
+    const titles = Search._index.titles;
+    const allTitles = Search._index.alltitles;
+    const indexEntries = Search._index.indexentries;
+
+    // Collect multiple result groups to be sorted separately and then ordered.
+    // Each is an array of [docname, title, anchor, descr, score, filename].
+    const normalResults = [];
+    const nonMainIndexResults = [];
+
     _removeChildren(document.getElementById("search-progress"));
 
-    const queryLower = query.toLowerCase();
+    const queryLower = query.toLowerCase().trim();
     for (const [title, foundTitles] of Object.entries(allTitles)) {
-      if (title.toLowerCase().includes(queryLower) && (queryLower.length >= title.length/2)) {
+      if (title.toLowerCase().trim().includes(queryLower) && (queryLower.length >= title.length/2)) {
         for (const [file, id] of foundTitles) {
-          let score = Math.round(100 * queryLower.length / title.length)
-          results.push([
+          const score = Math.round(Scorer.title * queryLower.length / title.length);
+          const boost = titles[file] === title ? 1 : 0; // add a boost for document titles
+          normalResults.push([
             docNames[file],
             titles[file] !== title ? `${titles[file]} > ${title}` : title,
             id !== null ? "#" + id : "",
             null,
-            score,
+            score + boost,
             filenames[file],
           ]);
         }
@@ -308,46 +345,47 @@ const Search = {
     // search for explicit entries in index directives
     for (const [entry, foundEntries] of Object.entries(indexEntries)) {
       if (entry.includes(queryLower) && (queryLower.length >= entry.length/2)) {
-        for (const [file, id] of foundEntries) {
-          let score = Math.round(100 * queryLower.length / entry.length)
-          results.push([
+        for (const [file, id, isMain] of foundEntries) {
+          const score = Math.round(100 * queryLower.length / entry.length);
+          const result = [
             docNames[file],
             titles[file],
             id ? "#" + id : "",
             null,
             score,
             filenames[file],
-          ]);
+          ];
+          if (isMain) {
+            normalResults.push(result);
+          } else {
+            nonMainIndexResults.push(result);
+          }
         }
       }
     }
 
     // lookup as object
     objectTerms.forEach((term) =>
-      results.push(...Search.performObjectSearch(term, objectTerms))
+      normalResults.push(...Search.performObjectSearch(term, objectTerms))
     );
 
     // lookup as search terms in fulltext
-    results.push(...Search.performTermsSearch(searchTerms, excludedTerms));
+    normalResults.push(...Search.performTermsSearch(searchTerms, excludedTerms));
 
     // let the scorer override scores with a custom scoring function
-    if (Scorer.score) results.forEach((item) => (item[4] = Scorer.score(item)));
-
-    // now sort the results by score (in opposite order of appearance, since the
-    // display function below uses pop() to retrieve items) and then
-    // alphabetically
-    results.sort((a, b) => {
-      const leftScore = a[4];
-      const rightScore = b[4];
-      if (leftScore === rightScore) {
-        // same score: sort alphabetically
-        const leftTitle = a[1].toLowerCase();
-        const rightTitle = b[1].toLowerCase();
-        if (leftTitle === rightTitle) return 0;
-        return leftTitle > rightTitle ? -1 : 1; // inverted is intentional
-      }
-      return leftScore > rightScore ? 1 : -1;
-    });
+    if (Scorer.score) {
+      normalResults.forEach((item) => (item[4] = Scorer.score(item)));
+      nonMainIndexResults.forEach((item) => (item[4] = Scorer.score(item)));
+    }
+
+    // Sort each group of results by score and then alphabetically by name.
+    normalResults.sort(_orderResultsByScoreThenName);
+    nonMainIndexResults.sort(_orderResultsByScoreThenName);
+
+    // Combine the result groups in (reverse) order.
+    // Non-main index entries are typically arbitrary cross-references,
+    // so display them after other results.
+    let results = [...nonMainIndexResults, ...normalResults];
 
     // remove duplicate search results
     // note the reversing of results, so that in the case of duplicates, the highest-scoring entry is kept
@@ -361,7 +399,12 @@ const Search = {
       return acc;
     }, []);
 
-    results = results.reverse();
+    return results.reverse();
+  },
+
+  query: (query) => {
+    const [searchQuery, searchTerms, excludedTerms, highlightTerms, objectTerms] = Search._parseQuery(query);
+    const results = Search._performSearch(searchQuery, searchTerms, excludedTerms, highlightTerms, objectTerms);
 
     // for debugging
     //Search.lastresults = results.slice(); // a copy
@@ -466,14 +509,18 @@ const Search = {
       // add support for partial matches
       if (word.length > 2) {
         const escapedWord = _escapeRegExp(word);
-        Object.keys(terms).forEach((term) => {
-          if (term.match(escapedWord) && !terms[word])
-            arr.push({ files: terms[term], score: Scorer.partialTerm });
-        });
-        Object.keys(titleTerms).forEach((term) => {
-          if (term.match(escapedWord) && !titleTerms[word])
-            arr.push({ files: titleTerms[word], score: Scorer.partialTitle });
-        });
+        if (!terms.hasOwnProperty(word)) {
+          Object.keys(terms).forEach((term) => {
+            if (term.match(escapedWord))
+              arr.push({ files: terms[term], score: Scorer.partialTerm });
+          });
+        }
+        if (!titleTerms.hasOwnProperty(word)) {
+          Object.keys(titleTerms).forEach((term) => {
+            if (term.match(escapedWord))
+              arr.push({ files: titleTerms[term], score: Scorer.partialTitle });
+          });
+        }
       }
 
       // no match but word was a required one
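
In this hunk, partial matches are collected only when the word has no exact entry in the index, and the files are now read from the matching term (`titleTerms[term]`) rather than from the missing word (`titleTerms[word]`). A small Python sketch of that behavior follows. The `partial_matches` helper and the sample index are hypothetical and simplify the JavaScript (a plain substring test replaces the escaped regular expression).

```python
def partial_matches(word, index):
    """Return the file lists of index terms that contain `word` as a substring.

    Mirrors the guard in the diff: words of length <= 2 and words that
    already have an exact index entry produce no partial matches.
    """
    if len(word) <= 2 or word in index:
        return []
    # Look files up by the matching term, not by the (absent) search word.
    return [index[term] for term in index if word in term]


# Hypothetical term index: term -> list of file ids
index = {"reading": [0], "readstat": [1, 2], "write": [3]}

print(partial_matches("read", index))
print(partial_matches("write", index))
```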
@@ -496,9 +543,8 @@ const Search = {
 
       // create the mapping
       files.forEach((file) => {
-        if (fileMap.has(file) && fileMap.get(file).indexOf(word) === -1)
-          fileMap.get(file).push(word);
-        else fileMap.set(file, [word]);
+        if (!fileMap.has(file)) fileMap.set(file, [word]);
+        else if (fileMap.get(file).indexOf(word) === -1) fileMap.get(file).push(word);
       });
     });

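The mapping fix in the hunk above matters when a word is seen twice for the same file: the removed branch fell through to `else` and reset the list to `[word]`. The Python sketch below contrasts the two versions. Both helper names are made up for illustration.

```python
def add_word_old(file_map, file, word):
    """Removed logic: a repeated word resets the file's list."""
    if file in file_map and word not in file_map[file]:
        file_map[file].append(word)
    else:
        file_map[file] = [word]


def add_word_new(file_map, file, word):
    """Added logic: create the list once, then append only unseen words."""
    if file not in file_map:
        file_map[file] = [word]
    elif word not in file_map[file]:
        file_map[file].append(word)


old_map, new_map = {}, {}
for w in ["read", "write", "read"]:
    add_word_old(old_map, "a.html", w)
    add_word_new(new_map, "a.html", w)

print(old_map)  # the earlier "write" entry is lost
print(new_map)  # both words are kept
```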
@@ -549,8 +595,8 @@ const Search = {
    * search summary for a given text. keywords is a list
    * of stemmed words.
    */
-  makeSearchSummary: (htmlText, keywords) => {
-    const text = Search.htmlToText(htmlText);
+  makeSearchSummary: (htmlText, keywords, anchor) => {
+    const text = Search.htmlToText(htmlText, anchor);
     if (text === "") return null;
 
     const textLower = text.toLowerCase();
docs/_build/html/genindex.html

Lines changed: 3 additions & 3 deletions
@@ -3,7 +3,7 @@
 <head>
 <meta charset="utf-8" />
 <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-<title>Index &mdash; pyreadstat 1.2.7 documentation</title>
+<title>Index &mdash; pyreadstat 1.2.8 documentation</title>
 <link rel="stylesheet" type="text/css" href="_static/pygments.css?v=fa44fd50" />
 <link rel="stylesheet" type="text/css" href="_static/css/theme.css?v=19f00094" />
 
@@ -12,8 +12,8 @@
 <script src="_static/js/html5shiv.min.js"></script>
 <![endif]-->
 
-<script src="_static/documentation_options.js?v=a5753347"></script>
-<script src="_static/doctools.js?v=888ff710"></script>
+<script src="_static/documentation_options.js?v=4d6f9085"></script>
+<script src="_static/doctools.js?v=9a2dae69"></script>
 <script src="_static/sphinx_highlight.js?v=dc90522c"></script>
 <script src="_static/js/theme.js"></script>
 <link rel="index" title="Index" href="#" />
