A parser for SEC EDGAR .nc files,
each of which represents an SEC filing.
These are SGML files, but the format appears to have drifted from the most recent publicly available DTD files I could find, so the parser implemented here is derived partly from the real-world data in the filings themselves.
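To illustrate the kind of SGML this involves, here is a minimal sketch of splitting one header-style line into a tag name and an inline value. It assumes lines of the form `<TYPE>10-K`, where a value follows the tag with no closing tag. The `TagLine` type and `parse_tag_line` function are hypothetical names used only for this illustration, not this crate's API.

```rust
/// One line split into its tag name and optional inline value.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
struct TagLine<'a> {
    /// Tag name without angle brackets or a leading slash.
    name: &'a str,
    /// True for a closing tag such as `</DOCUMENT>`.
    closing: bool,
    /// Text following the tag on the same line, if any.
    value: Option<&'a str>,
}

/// Hypothetical helper: parses a line such as `<TYPE>10-K`.
/// Returns `None` when the line does not start with a well-formed tag.
fn parse_tag_line(line: &str) -> Option<TagLine<'_>> {
    let rest = line.trim().strip_prefix('<')?;
    let end = rest.find('>')?;
    let raw_name = &rest[..end];
    let after = &rest[end + 1..];

    // A leading slash marks a closing tag.
    let (closing, name) = match raw_name.strip_prefix('/') {
        Some(stripped) => (true, stripped),
        None => (false, raw_name),
    };
    if name.is_empty() {
        return None;
    }

    // Assumption: anything after `>` on the same line is the tag's value.
    let value = after.trim();
    Some(TagLine {
        name,
        closing,
        value: if value.is_empty() { None } else { Some(value) },
    })
}
```

A real parser also has to decide which tags open nested sections and which carry a single value. That is where a DTD would normally help, and where the real-world data has to fill in.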
It attempts to provide a lossless Rust struct representation of each filing. Dates and datetimes
are represented as chrono types.
This is currently a work in progress and is not yet on crates.io, but it successfully
parses every non-corrupt .nc filing I have fed into it, with filings ranging from 1995 to 2021.
It decodes binary files when they are provided. Included XBRL (enclosed in <XBRL></XBRL> tags) is extracted
as a String, but no attempt is made to parse it, since XBRL is an entirely separate format.
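As a rough sketch of what extracting XBRL as a String can look like, the function below collects the raw text between each `<XBRL>` and `</XBRL>` pair without interpreting it. The name `extract_xbrl` is hypothetical and not necessarily how this crate does it. The sketch also assumes the tags are uppercase and not nested.

```rust
/// Hypothetical helper: returns the raw text found between each
/// `<XBRL>` / `</XBRL>` pair, verbatim and unparsed.
/// Assumes uppercase, non-nested tags.
fn extract_xbrl(document: &str) -> Vec<String> {
    const OPEN: &str = "<XBRL>";
    const CLOSE: &str = "</XBRL>";

    let mut blocks = Vec::new();
    let mut rest = document;

    while let Some(start) = rest.find(OPEN) {
        let after_open = &rest[start + OPEN.len()..];
        match after_open.find(CLOSE) {
            Some(end) => {
                // Keep the payload exactly as written, including whitespace.
                blocks.push(after_open[..end].to_string());
                rest = &after_open[end + CLOSE.len()..];
            }
            // An unterminated block is ignored in this sketch.
            None => break,
        }
    }
    blocks
}
```

Keeping the payload as an opaque String leaves the caller free to hand it to a dedicated XBRL parser.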