finagg package
Subpackages
- finagg.bea package
- finagg.fred package
- finagg.sec package
- Subpackages
- Submodules
- finagg.sec.api module
Concept, Frame, SubmissionsResult, API, CompanyConcept, CompanyFacts, Exchanges, Frames, Submissions, Tickers, company_concept, company_facts, exchanges, frames, submissions, tickers, popular_frames, popular_concepts, compute_financial_ratios(), filter_original_filings(), get_cik(), get_ticker(), get_ticker_set(), group_and_pivot_filings()
- finagg.sec.sql module
- Module contents
Submodules
finagg.config module
finagg configuration and global SQLAlchemy setup. Backend file paths
and SQLAlchemy engine database URLs are configured in this module at runtime
according to environment variables.
Environment variables should ideally be configured using an .env file
in the desired working directory. Running finagg install will
automatically set up the .env file for you according to your input values.
Environment variables assigned in the .env file are loaded on the
finagg module’s first instantiation.
- finagg.config.root_path
Parent directory of the findata directory where the backend database and API cache file will be stored (unless otherwise configured according to the relevant environment variables). This can be set with the FINAGG_ROOT_PATH environment variable. It defaults to, and is typically set to, the current working directory. It’s recommended you permanently set this value using the finagg install CLI.
- finagg.config.disable_http_cache
Whether to disable the HTTP requests cache. If disabled, a default, uncached user session is used for all requests instead of a cacheable session.
- finagg.config.http_cache_path
Path to the API cache file. This can be set with the FINAGG_HTTP_CACHE_PATH environment variable and should NOT include a file extension. All API implementations share the same cache backend.
- finagg.config.database_path
Default path to the database file. The FINAGG_DATABASE_URL environment variable will take precedence over this value.
- finagg.config.database_url
SQLAlchemy URL to the database. This can be set with the FINAGG_DATABASE_URL environment variable and should include a file extension. This defaults to f"sqlite:///{finagg.config.database_path}".
- finagg.config.engine
The default SQLAlchemy engine for the backend database. By default, all feature and SQL submodules use this engine, with the database URL configured by database_url, for reading from and writing to the database.
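As a rough illustration of the precedence described above, the following sketch resolves the same settings from a mapping of environment variables. It is not finagg's implementation: the helper name resolve_config_sketch and the file names http_cache and finagg.sqlite are assumptions made for the example. Only the environment variable names and the sqlite:/// default come from the descriptions above.

```python
from __future__ import annotations

import os
from pathlib import Path


def resolve_config_sketch(env: dict[str, str] | None = None) -> dict[str, str]:
    """Resolve paths and the database URL from environment variables."""
    env = dict(os.environ) if env is None else env
    # The root path falls back to the current working directory.
    root_path = Path(env.get("FINAGG_ROOT_PATH", os.getcwd()))
    data_dir = root_path / "findata"
    # Assumed default file names (hypothetical, chosen for this example).
    http_cache_path = env.get("FINAGG_HTTP_CACHE_PATH", str(data_dir / "http_cache"))
    database_path = str(data_dir / "finagg.sqlite")
    # An explicit database URL takes precedence over the default database path.
    database_url = env.get("FINAGG_DATABASE_URL", f"sqlite:///{database_path}")
    return {
        "root_path": str(root_path),
        "http_cache_path": http_cache_path,
        "database_path": database_path,
        "database_url": database_url,
    }
```

Passing an explicit mapping instead of reading os.environ directly keeps the sketch easy to exercise without touching the real environment.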
finagg.ratelimit module
Customizable rate-limiting for requests-style getters.
The definitions within this submodule are used throughout finagg for
respecting 3rd party API rate limits to avoid server-side throttling.
- class finagg.ratelimit.RateLimit(limit: float, period: float | timedelta, /, *, buffer: float = 0.0)[source]
Bases: ABC
Interface for defining a rate limit for an external API getter.
You can create a custom rate-limiter by inheriting from this class and implementing a custom eval() method.
- Parameters:
limit – Max limit within period (e.g., max number of requests, errors, size in memory, etc.).
period – Time interval for evaluating limit.
buffer – Reduce limit by this fraction. Adds a bit of leeway to ensure limit is not reached. Useful for enforcing response size limits.
See also
guard(): For the intended usage of getting a RateLimitGuard instance.
RequestLimit: For an example of a request rate limiter.
- limit: float
Max quantity allowed within period. The quantity type being limited is dependent on what’s returned by eval().
- abstract eval(response: Response, /) float | dict[str, float][source]
Evaluate a response and determine how much it contributes to the max limit imposed by this instance.
This is the main method that should be overridden by subclasses to create custom rate-limiters. This method is called with each request’s response to determine how much that request/response contributes to the rate-limiting.
- Parameters:
response – Request response (possibly cached).
- Returns:
A number indicating the request/response’s contribution to the rate limit OR a dictionary containing:
"limit": a number indicating the request/response’s contribution to the rate limit
"wait": time to wait before a new request can be made
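The two return forms of eval() can be sketched with a small custom limiter. This is an illustration only: RateLimitSketch and FakeResponse are hypothetical stand-ins so the example runs without finagg or requests, and RetryAfterLimit is an invented subclass. In real use you would inherit from finagg.ratelimit.RateLimit instead. Multiplying limit by (1 - buffer) is one reading of "reduce limit by this fraction", and treating "wait" as seconds is also an assumption.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from datetime import timedelta


@dataclass
class FakeResponse:
    """Stand-in for a requests-style response (only the fields used here)."""

    status_code: int = 200
    headers: dict[str, str] = field(default_factory=dict)


class RateLimitSketch(ABC):
    """Hypothetical stand-in mirroring the documented constructor signature."""

    def __init__(
        self, limit: float, period: float | timedelta, /, *, buffer: float = 0.0
    ) -> None:
        # Assumed interpretation: shrink the limit by the buffer fraction.
        self.limit = limit * (1.0 - buffer)
        self.period = (
            period.total_seconds() if isinstance(period, timedelta) else float(period)
        )

    @abstractmethod
    def eval(self, response: FakeResponse, /) -> float | dict[str, float]:
        """Return the response's contribution to the limit."""


class RetryAfterLimit(RateLimitSketch):
    """Count each request once and honor a server-provided Retry-After."""

    def eval(self, response: FakeResponse, /) -> float | dict[str, float]:
        retry_after = response.headers.get("Retry-After")
        if response.status_code == 429 and retry_after is not None:
            # Dictionary form: contribution plus an explicit wait time.
            return {"limit": 1.0, "wait": float(retry_after)}
        # Plain number form: contribution only.
        return 1.0
```

The dictionary form is useful when the server itself says how long to back off. Otherwise, a plain number is enough.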
- class finagg.ratelimit.RequestLimit(limit: float, period: float | timedelta, /, *, buffer: float = 0.0)[source]
Bases: RateLimit
Limit the number of requests made by the underlying getter.
- eval(response: Response, /) float | dict[str, float][source]
Evaluate a response and determine how much it contributes to the max limit imposed by this instance.
This is the main method that should be overridden by subclasses to create custom rate-limiters. This method is called with each request’s response to determine how much that request/response contributes to the rate-limiting.
- Parameters:
response – Request response (possibly cached).
- Returns:
A number indicating the request/response’s contribution to the rate limit OR a dictionary containing:
"limit": a number indicating the request/response’s contribution to the rate limit
"wait": time to wait before a new request can be made
- class finagg.ratelimit.ErrorLimit(limit: float, period: float | timedelta, /, *, buffer: float = 0.0)[source]
Bases: RateLimit
Limit the number of errors that occur when using the underlying getter.
- eval(response: Response, /) float | dict[str, float][source]
Evaluate a response and determine how much it contributes to the max limit imposed by this instance.
This is the main method that should be overridden by subclasses to create custom rate-limiters. This method is called with each request’s response to determine how much that request/response contributes to the rate-limiting.
- Parameters:
response – Request response (possibly cached).
- Returns:
A number indicating the request/response’s contribution to the rate limit OR a dictionary containing:
"limit": a number indicating the request/response’s contribution to the rate limit
"wait": time to wait before a new request can be made
- class finagg.ratelimit.SizeLimit(limit: float, period: float | timedelta, /, *, buffer: float = 0.0)[source]
Bases: RateLimit
Limit the size of responses when using the underlying getter.
- eval(response: Response, /) float | dict[str, float][source]
Evaluate a response and determine how much it contributes to the max limit imposed by this instance.
This is the main method that should be overridden by subclasses to create custom rate-limiters. This method is called with each request’s response to determine how much that request/response contributes to the rate-limiting.
- Parameters:
response – Request response (possibly cached).
- Returns:
A number indicating the request/response’s contribution to the rate limit OR a dictionary containing:
"limit": a number indicating the request/response’s contribution to the rate limit
"wait": time to wait before a new request can be made
- class finagg.ratelimit.RateLimitGuard(f: Callable[[_P], Response], limits: tuple[finagg.ratelimit.RateLimit, ...], /, *, warn: bool = False)[source]
Bases: Generic[_P]
Wraps requests-like getters to introduce blocking functionality when requests are getting close to violating call limits.
- Parameters:
f – Requests-style getter that’s wrapped and rate-limited.
limits – Limits to apply to the requests-style getter.
warn – Whether to print a message to stdout whenever client-side throttling is occurring to respect limits.
See also
guard(): For the intended usage of getting a RateLimitGuard instance.
RequestLimit: For an example of a request rate limiter.
- limits: tuple[finagg.ratelimit.RateLimit, ...]
Limits to apply to requests/responses.
- finagg.ratelimit.guard(limits: Sequence[RateLimit], /, *, warn: bool = False) Callable[[Callable[[_P], Response]], RateLimitGuard[_P]][source]
Apply limits to a requests-style getter.
- Parameters:
limits – Rate limits to apply to the requests-style getter.
warn – Whether to print a message when client-side throttling is occurring.
- Returns:
A decorator that wraps the original requests-style getter in a RateLimitGuard to avoid exceeding limits.
Examples
Limit requests to Google to 5 per second.
>>> import requests
>>> from datetime import timedelta
>>> from finagg.ratelimit import RequestLimit, guard
>>> @guard([RequestLimit(5, timedelta(seconds=1))])
... def get() -> requests.Response:
...     return requests.get("https://google.com")
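To make the blocking behavior more concrete, here is a minimal sketch of a client-side throttling decorator built on a sliding window of call timestamps. It is not how RateLimitGuard is implemented: guard_sketch, its max_calls parameter, and the injectable clock and sleep arguments are all hypothetical names chosen for this example. It only counts calls, whereas the real guard evaluates each response against every configured limit.

```python
from __future__ import annotations

import time
from collections import deque
from functools import wraps
from typing import Any, Callable


def guard_sketch(
    max_calls: int,
    period: float,
    *,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Block before a call that would exceed max_calls within period seconds."""

    def decorator(f: Callable[..., Any]) -> Callable[..., Any]:
        calls: deque[float] = deque()  # timestamps of calls inside the window

        def prune(now: float) -> None:
            # Forget calls that have left the sliding window.
            while calls and now - calls[0] >= period:
                calls.popleft()

        @wraps(f)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            now = clock()
            prune(now)
            if len(calls) >= max_calls:
                # Wait until the oldest call in the window expires.
                sleep(period - (now - calls[0]))
                now = clock()
                prune(now)
            calls.append(now)
            return f(*args, **kwargs)

        return wrapper

    return decorator
```

Injecting the clock and sleep functions is a design choice that lets the throttling logic be exercised without actually waiting.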
finagg.testing module
Testing utils used for finagg’s own unit tests.
- finagg.testing.sqlite_engine(path: str, /, *, metadata: None | MetaData = None, table: None | Table = None) Generator[Engine, None, None][source]
Yield a test database engine that’s cleaned-up after usage.
- Parameters:
path – Path to SQLite database file.
metadata – Optional metadata for creating and dropping tables before and after yielding the engine, respectively.
table – Optional table for creating and dropping before and after yielding the engine, respectively.
- Returns:
A database engine that’s subsequently disposed of and whose respective database file is deleted after use.
- Raises:
ValueError – If both metadata and table are provided.
Examples
Using the testing util as a pytest fixture.
>>> import pytest
>>> import finagg
>>> from sqlalchemy.engine import Engine
>>> @pytest.fixture
... def engine() -> Engine:
...     yield from finagg.testing.sqlite_engine("/path/to/db.sqlite")
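The yield-then-clean-up pattern described above can be sketched with the standard library alone. The example below is an analogy, not finagg's code: it uses sqlite3 connections rather than SQLAlchemy engines, and sqlite_connection_sketch is a hypothetical name.

```python
from __future__ import annotations

import os
import sqlite3
from collections.abc import Generator


def sqlite_connection_sketch(path: str) -> Generator[sqlite3.Connection, None, None]:
    """Yield a connection, then close it and delete the database file."""
    conn = sqlite3.connect(path)
    try:
        yield conn
    finally:
        # Runs when the generator is exhausted or closed, even after a test failure.
        conn.close()
        if os.path.exists(path):
            os.remove(path)
```

Because the clean-up sits in a finally block, a pytest fixture that does yield from on such a generator removes the database file whether or not the test passes.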
finagg.utils module
Generic utils used by subpackages.
- finagg.utils.expand_csv(values: str | list[str], /) set[str][source]
Expand the given list of strings into a set of strings, where each value in the list of strings could be:
Comma-separated values
A path that points to a CSV file containing values
A regular ol’ string
- Parameters:
values – List of strings denoting comma-separated values, or CSV files containing comma-separated values.
- Returns:
A set of all strings found within the given list.
Examples
>>> import finagg
>>> ts = finagg.utils.expand_csv(["AAPL,MSFT"])
>>> "AAPL" in ts
True
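The three cases listed above (comma-separated values, a path to a CSV file, a plain string) can be sketched as follows. This is an assumed reading of the behavior, not finagg's implementation: expand_csv_sketch is a hypothetical name, and treating a value as a file only when it points to an existing file is a guess at how the cases are told apart.

```python
from __future__ import annotations

import csv
from pathlib import Path


def expand_csv_sketch(values: str | list[str]) -> set[str]:
    """Expand comma-separated strings and CSV file paths into a set of strings."""
    if isinstance(values, str):
        values = [values]
    out: set[str] = set()
    for value in values:
        path = Path(value)
        if path.is_file():
            # Case: a path to a CSV file containing values.
            with path.open(newline="") as f:
                for row in csv.reader(f):
                    out.update(cell.strip() for cell in row if cell.strip())
        else:
            # Cases: comma-separated values, or a single plain string.
            out.update(part.strip() for part in value.split(",") if part.strip())
    return out
```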
- finagg.utils.get_func_cols(table: Table | DataFrame, /) list[str][source]
Return the column names in table that have the format FUNC(arg0, arg1, ...).
- Parameters:
table – SQLAlchemy table or dataframe.
- Returns:
List of functional-style column names in table. Returns an empty list if none are found.
- Raises:
TypeError – If the given object is not a SQLAlchemy table or dataframe.
- finagg.utils.parse_func_call(s: str, /) None | tuple[str, list[str]][source]
Parse a function’s name and its arguments’ names from a string of format FUNC(arg0, arg1, ...).
- Parameters:
s – Any string of format FUNC(arg0, arg1, ...).
- Returns:
A tuple containing the parsed function’s name and its arguments’ names. Returns None if the string doesn’t match the expected format.
Examples
>>> import finagg
>>> finagg.utils.parse_func_call("LOG_CHANGE(high, open)")
('LOG_CHANGE', ['high', 'open'])
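One plausible way to parse the FUNC(arg0, arg1, ...) format is a single regular expression, which also gives a simple filter in the spirit of get_func_cols(). The pattern and the names parse_func_call_sketch and func_cols_sketch are assumptions for illustration. The actual parsing rules used by finagg may differ, for example in how they treat whitespace or nested parentheses.

```python
from __future__ import annotations

import re

# Hypothetical pattern: an identifier followed by a flat, parenthesized argument list.
_FUNC_CALL = re.compile(r"^\s*([A-Za-z_]\w*)\(([^()]*)\)\s*$")


def parse_func_call_sketch(s: str) -> None | tuple[str, list[str]]:
    """Return (function name, argument names), or None if s doesn't match."""
    match = _FUNC_CALL.match(s)
    if match is None:
        return None
    name, raw_args = match.groups()
    args = [arg.strip() for arg in raw_args.split(",") if arg.strip()]
    return name, args


def func_cols_sketch(columns: list[str]) -> list[str]:
    """Keep only the column names that look like function calls."""
    return [col for col in columns if parse_func_call_sketch(col) is not None]
```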
- finagg.utils.resolve_col_order(table: Table, df: DataFrame, /, *, extra_ignore: None | list[str] = None) DataFrame[source]
Reorder the columns in df to match the order of the columns in table.
- Parameters:
table – SQLAlchemy table that defines the column order. Primary keys are excluded from the column order as they’re assumed to be used as part of the index in df.
df – Dataframe to reorder.
extra_ignore – Extra columns to ignore in the reordering. Sometimes columns aren’t used as primary keys but are used as part of the index in the dataframe. Those columns should be provided in this option.
- Returns:
Dataframe with columns ordered according to the column order in table.
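The ordering rule can be sketched on plain lists of column names. resolve_col_order_sketch is a hypothetical helper, and skipping table columns that are missing from the dataframe is an assumption, since the description above does not say how that case is handled.

```python
from __future__ import annotations


def resolve_col_order_sketch(
    table_cols: list[str],
    primary_keys: list[str],
    df_cols: list[str],
    *,
    extra_ignore: None | list[str] = None,
) -> list[str]:
    """Return the dataframe's columns in the table's column order."""
    # Primary keys and extra ignored columns are assumed to live in the index.
    ignore = set(primary_keys) | set(extra_ignore or [])
    present = set(df_cols)
    return [col for col in table_cols if col not in ignore and col in present]
```

With a real dataframe, the resulting list could then be used to select columns in that order.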
- finagg.utils.resolve_func_cols(table: Table, df: DataFrame, /, *, drop: bool = False, inplace: bool = False) DataFrame[source]
Inspect table and apply functions to columns that exist in table and df according to columns named like FUNC(col0, col1, ...) within table, such that new columns in df are the result of the applied functions and have names matching the function call signatures.
- Parameters:
table – SQLAlchemy table that defines a superset of columns that should exist in df.
df – Dataframe that contains a subset of columns within table that will be updated with columns defined by table that have names like FUNC(col0, col1, ...).
drop – Whether to drop all other columns on the returned dataframe except for the columns in table.
inplace – Whether to perform operations in-place and use df as the output dataframe.
- Returns:
A new dataframe with columns from df and columns according to columns named within table like FUNC(col0, col1, ...) where columns col0 and col1 exist in df.
- Raises:
ValueError – If the function parsed from the column name has no supported and corresponding function.
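A rough sketch of the resolution step, using a dictionary of lists in place of a dataframe so it runs without pandas. Everything named here is hypothetical: resolve_func_cols_sketch, the SUPPORTED registry, and its single LOG_CHANGE entry are assumptions for illustration, and the set of functions finagg actually supports is not listed in this reference.

```python
from __future__ import annotations

import math
import re

_FUNC_CALL = re.compile(r"^([A-Za-z_]\w*)\(([^()]*)\)$")


def _log_change(a: list[float], b: list[float]) -> list[float]:
    return [math.log(x / y) for x, y in zip(a, b)]


# Hypothetical registry mapping function names to implementations.
SUPPORTED = {"LOG_CHANGE": _log_change}


def resolve_func_cols_sketch(
    table_cols: list[str], data: dict[str, list[float]], *, drop: bool = False
) -> dict[str, list[float]]:
    """Add a column for each FUNC(...) name in table_cols whose inputs exist."""
    out = dict(data)
    for col in table_cols:
        match = _FUNC_CALL.match(col)
        if match is None:
            continue  # a regular column, nothing to compute
        name, raw_args = match.groups()
        if name not in SUPPORTED:
            raise ValueError(f"No supported function for column {col!r}")
        args = [arg.strip() for arg in raw_args.split(",")]
        if all(arg in data for arg in args):
            out[col] = SUPPORTED[name](*(data[arg] for arg in args))
    if drop:
        # Keep only the columns that the table defines.
        out = {col: vals for col, vals in out.items() if col in table_cols}
    return out
```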
- finagg.utils.safe_log_change(series: Series, other: None | Series = None) Series[source]
Safely compute log change between two columns.
Replaces Inf values with NaN and forward-fills. This function is meant to be used with pd.Series.apply.
- Parameters:
series – Series of values.
other – Reference series to compute change against. Defaults to series shifted forward one index.
- Returns:
A series representing log changes of series.
- finagg.utils.safe_pct_change(series: Series, other: None | Series = None) Series[source]
Safely compute percent change between two columns.
Replaces Inf values with NaN and forward-fills. This function is meant to be used with pd.Series.apply.
- Parameters:
series – Series of values.
other – Reference series to compute change against. Defaults to series shifted forward one index.
- Returns:
A series representing percent changes of series.
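The "safe" part of both change functions can be sketched in plain Python on lists of floats. The functions below are hypothetical stand-ins, not finagg's pandas-based code. They assume the change is computed as (series - other) / other for percent change and log(series / other) for log change, that undefined or infinite results become NaN, and that NaN values are then forward-filled.

```python
from __future__ import annotations

import math


def _ffill(values: list[float]) -> list[float]:
    """Replace each NaN with the most recent non-NaN value, if there is one."""
    out: list[float] = []
    last = math.nan
    for value in values:
        if not math.isnan(value):
            last = value
        out.append(last)
    return out


def _shift(series: list[float]) -> list[float]:
    """Shift values forward one index, padding the front with NaN."""
    return [math.nan, *series[:-1]]


def pct_change_sketch(series: list[float], other: None | list[float] = None) -> list[float]:
    ref = _shift(series) if other is None else other
    raw = []
    for cur, prev in zip(series, ref):
        if math.isnan(cur) or math.isnan(prev) or prev == 0:
            raw.append(math.nan)  # undefined or infinite change
        else:
            raw.append((cur - prev) / prev)
    return _ffill(raw)


def log_change_sketch(series: list[float], other: None | list[float] = None) -> list[float]:
    ref = _shift(series) if other is None else other
    raw = []
    for cur, prev in zip(series, ref):
        if math.isnan(cur) or math.isnan(prev) or cur <= 0 or prev <= 0:
            raw.append(math.nan)  # log is undefined or infinite here
        else:
            raw.append(math.log(cur / prev))
    return _ffill(raw)
```

Note that the first element stays NaN when other is omitted, because there is no earlier value to compare against and nothing to forward-fill from.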
- finagg.utils.setenv(name: str, value: str, /, *, exist_ok: bool = False) Path[source]
Set the value of the environment variable name to value.
The environment variable is permanently set in the environment and in the current process.
- Parameters:
name – Environment variable name.
value – Environment variable value.
exist_ok – Whether it’s okay if an environment variable of the same name already exists. If True, it will be overwritten.
- Returns:
Path to the file the environment variable was written to.
- Raises:
RuntimeError – If exist_ok is False and an environment variable of the same name already exists.
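The behavior described above (set the variable in the current process, persist it to a file, and refuse to overwrite unless told to) can be sketched as follows. This is not finagg's implementation: setenv_sketch and its env_file parameter are hypothetical, writing NAME=value lines to a .env file is an assumption based on the configuration notes earlier in this page, and checking os.environ for an existing variable is also a guess.

```python
from __future__ import annotations

import os
from pathlib import Path


def setenv_sketch(
    name: str,
    value: str,
    /,
    *,
    exist_ok: bool = False,
    env_file: Path = Path(".env"),
) -> Path:
    """Set name=value in this process and persist it to env_file."""
    if not exist_ok and name in os.environ:
        raise RuntimeError(f"Environment variable {name} already exists")
    # Set the variable for the current process.
    os.environ[name] = value
    # Persist it: drop any previous assignment of the same name, then append.
    lines = env_file.read_text().splitlines() if env_file.exists() else []
    lines = [line for line in lines if not line.startswith(f"{name}=")]
    lines.append(f"{name}={value}")
    env_file.write_text("\n".join(lines) + "\n")
    return env_file
```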
- finagg.utils.today
Today’s date. Used by a number of submodules as the default end date when getting data from APIs or SQL tables.
Module contents
Main package interface.