finagg.sec.feat package
Module contents
Features from SEC sources.
- finagg.sec.feat.annual
The most popular way for accessing
finagg.sec.feat.Annual
.
- finagg.sec.feat.quarterly
The most popular way for accessing
finagg.sec.feat.Quarterly
.
- finagg.sec.feat.submissions
The most popular way for accessing
finagg.sec.feat.Submissions
.
- finagg.sec.feat.tags
The most popular way for accessing
finagg.sec.feat.Tags
.
- class finagg.sec.feat.Annual[source]
Bases:
object
Methods for gathering annual features from SEC EDGAR data.
The module variable
finagg.sec.feat.annual
is an instance of this feature set implementation and is the most popular interface for calling feature methods.Examples
It doesn’t matter which data source you use to gather features. They all return equivalent dataframes.
>>> df1 = finagg.sec.feat.annual.from_api("AAPL").head(5) >>> df2 = finagg.sec.feat.annual.from_raw("AAPL").head(5) >>> df3 = finagg.sec.feat.annual.from_refined("AAPL").head(5) >>> pd.testing.assert_frame_equal(df1, df2, rtol=1e-4) >>> pd.testing.assert_frame_equal(df1, df3, rtol=1e-4)
- industry
Annual features aggregated for an entire industry. The most popular way for accessing the
finagg.sec.feat.IndustryAnnual
feature set.
- normalized
A company’s annual features normalized by its industry. The most popular way for accessing the
finagg.sec.feat.NormalizedAnnual
feature set.
- classmethod from_api(ticker: str, /, *, start: None | str = None, end: None | str = None) DataFrame [source]
Get annual features directly from the SEC API.
- Parameters:
ticker – Company ticker.
start – The start date of the observation period. Defaults to the first recorded date.
end – The end date of the observation period. Defaults to the last recorded date.
- Returns:
Annual data dataframe with each tag as a separate column. Sorted by filing date.
Examples
>>> finagg.sec.feat.annual.from_api("AAPL").head(5) LOG_CHANGE(Assets) LOG_CHANGE(AssetsCurrent) LOG_CHANGE(CommonStockSharesOutstanding) ... fy filed ... 2010 2010-10-27 -0.089864 -0.023676 0.012840 ... 2011 2011-10-26 0.272493 0.278241 0.017805 ... 2012 2012-10-31 0.896033 0.076422 0.014423 ... 2013 2013-10-30 0.414064 0.248046 0.010630 ... 2014 2014-10-27 0.161871 0.239927 1.902394 ...
- classmethod from_raw(ticker: str, /, *, start: None | str = None, end: None | str = None, engine: None | Engine = None) DataFrame [source]
Get annual features from a local SEC SQL table.
Not all data series are published at the same rate or time. Missing rows for less-frequent annual publications are forward filled.
- Parameters:
ticker – Company ticker.
start – The start date of the observation period. Defaults to the first recorded date.
end – The end date of the observation period. Defaults to the last recorded date.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
Annual data dataframe with each tag as a separate column. Sorted by filing date.
Examples
>>> finagg.sec.feat.annual.from_raw("AAPL").head(5) LOG_CHANGE(Assets) LOG_CHANGE(AssetsCurrent) LOG_CHANGE(CommonStockSharesOutstanding) ... fy filed ... 2010 2010-10-27 -0.089864 -0.023676 0.012840 ... 2011 2011-10-26 0.272493 0.278241 0.017805 ... 2012 2012-10-31 0.896033 0.076422 0.014423 ... 2013 2013-10-30 0.414064 0.248046 0.010630 ... 2014 2014-10-27 0.161871 0.239927 1.902394 ...
- classmethod from_refined(ticker: str, /, *, start: None | str = None, end: None | str = None, engine: None | Engine = None) DataFrame [source]
Get features from the refined SQL table.
This is the preferred method for accessing features for offline analysis (assuming data in the local SQL table is current).
- Parameters:
ticker – Company ticker.
start – The start date of the observation period. Defaults to the first recorded date.
end – The end date of the observation period. Defaults to the last recorded date.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
Annual data dataframe with each tag as a separate column. Sorted by filing date.
- Raises:
NoResultFound – If there are no rows for
ticker
in the refined SQL table.
Examples
>>> finagg.sec.feat.annual.from_refined("AAPL").head(5) LOG_CHANGE(Assets) LOG_CHANGE(AssetsCurrent) LOG_CHANGE(CommonStockSharesOutstanding) ... fy filed ... 2010 2010-10-27 -0.089864 -0.023676 0.012840 ... 2011 2011-10-26 0.272493 0.278241 0.017805 ... 2012 2012-10-31 0.896033 0.076422 0.014423 ... 2013 2013-10-30 0.414064 0.248046 0.010630 ... 2014 2014-10-27 0.161871 0.239927 1.902394 ...
- classmethod get_candidate_ticker_set(lb: int = 1, *, start: None | str = None, end: None | str = None, engine: None | Engine = None) set[str] [source]
Get all unique tickers in the raw SQL table that MAY BE ELIGIBLE to be in the feature’s SQL table.
- Parameters:
lb – Minimum number of rows required to include a ticker in the returned set.
start – The start date of the observation period to include when searching for tickers. Defaults to the first recorded date.
end – The end date of the observation period to include when searching for tickers. Defaults to the last recorded date.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
All unique tickers that may be valid for creating annual features that also have at least
lb
rows for each tag used for constructing the features.
Examples
>>> "AAPL" in finagg.sec.feat.annual.get_candidate_ticker_set() True
- classmethod get_ticker_set(lb: int = 1, *, start: None | str = None, end: None | str = None, engine: None | Engine = None) set[str] [source]
Get all unique tickers in the feature’s SQL table.
- Parameters:
lb – Minimum number of rows required to include a ticker in the returned set.
start – The start date of the observation period to include when searching for tickers. Defaults to the first recorded date.
end – The end date of the observation period to include when searching for tickers. Defaults to the last recorded date.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
All unique tickers that contain all the columns for creating annual features that also have at least
lb
rows.
Examples
>>> "AAPL" in finagg.sec.feat.annual.get_ticker_set() True
- classmethod install(tickers: None | set[str] = None, *, processes: int = 3, engine: None | Engine = None, recreate_tables: bool = False) int [source]
Install data associated with
tickers
by pulling data from the raw SQL tables, transforming them into annual features, and then writing to the refined annual SQL table.Tables associated with this method are created if they don’t already exist.
- Parameters:
tickers – Set of tickers to install features for. Defaults to all the candidate tickers from
Annual.get_candidate_ticker_set()
.processes – Number of background processes to run in parallel when processing
tickers
.engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.recreate_tables – Whether to drop and recreate tables, wiping all previously installed data.
- Returns:
Number of rows written to the feature’s SQL table.
- classmethod to_refined(ticker: str, df: DataFrame, /, *, engine: None | Engine = None) int [source]
Write the given dataframe to the refined feature table while using the ticker
ticker
.- Parameters:
ticker – Company ticker.
df – Dataframe to store as rows in a local SQL table
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
Number of rows written to the SQL table.
- class finagg.sec.feat.IndustryAnnual[source]
Bases:
object
Methods for gathering industry-averaged annual data from SEC features.
The class variable
finagg.sec.feat.Annual.industry
is an instance of this feature set implementation and is the most popular interface for calling feature methods.Examples
You can aggregate this feature set using a ticker or an industry code directly.
>>> df1 = finagg.sec.feat.annual.industry.from_refined(ticker="MSFT").head(5) >>> df2 = finagg.sec.feat.annual.industry.from_refined(code=73).head(5) >>> pd.testing.assert_frame_equal(df1, df2, rtol=1e-4)
- classmethod from_refined(*, ticker: None | str = None, code: None | str = None, level: Literal[2, 3, 4] = 2, start: None | str = None, end: None | str = None, engine: None | Engine = None) DataFrame [source]
Get annual features from the feature store, aggregated for an entire industry.
The industry can be chosen according to a company or by an industry code directly. If a company is provided, then the first
level
digits of the company’s SIC code is used for the industry code.- Parameters:
ticker – Company ticker. Lookup the industry associated with this company. Mutually exclusive with
code
.code – Industry SIC code to use for industry lookup. Mutually exclusive with
ticker
.level –
Industry level to aggregate features at. The industry used according to
ticker
orcode
is subsampled according to this value. Options include:2 = major group (e.g., furniture and fixtures)
3 = industry group (e.g., office furnitures)
4 = industry (e.g., wood office furniture)
start – The start date of the observation period. Defaults to the first recorded date.
end – The end date of the observation period. Defaults to the last recorded date.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
Annual data dataframe with each tag as a separate column. Sorted by filing date.
- Raises:
ValueError – If neither a
ticker
norcode
are provided.NoResultFound – If there are no rows for
ticker
orcode
in the refined SQL table.
Examples
>>> df = finagg.sec.feat.annual.industry.from_refined(ticker="AAPL").head(5) >>> df["mean"] name AssetCoverageRatio DebtEquityRatio EarningsPerShareBasic ... fy filed ... 2010 2011-02-24 1.577924 1.112079 3.630000 ... 2011 2012-02-24 1.998633 0.886148 1.851429 ... 2012 2013-02-26 1.997748 1.413677 3.750833 ... 2013 2014-03-03 2.232804 1.176639 4.475000 ... 2014 2015-03-17 2.019007 1.080055 2.742143 ... >>> df["std"] name AssetCoverageRatio DebtEquityRatio EarningsPerShareBasic ... fy filed ... 2010 2011-02-24 0.320778 0.149843 3.436292 ... 2011 2012-02-24 1.371792 0.331815 3.837792 ... 2012 2013-02-26 1.480323 1.085814 3.975254 ... 2013 2014-03-03 1.787131 0.575039 7.236695 ... 2014 2015-03-17 1.303284 0.458704 3.015323 ...
- class finagg.sec.feat.NormalizedAnnual[source]
Bases:
object
Annual features from SEC EDGAR data normalized according to industry averages and standard deviations.
The class variable
finagg.sec.feat.Annual.normalized
is an instance of this feature set implementation and is the most popular interface for calling feature methods.Examples
It doesn’t matter which data source you use to gather features. They both return equivalent dataframes.
>>> df1 = finagg.sec.feat.annual.normalized.from_other_refined("AAPL").head(5) >>> df2 = finagg.sec.feat.annual.normalized.from_refined("AAPL").head(5) >>> pd.testing.assert_frame_equal(df1, df2, rtol=1e-4)
- classmethod from_other_refined(ticker: str, /, *, level: Literal[2, 3, 4] = 2, start: None | str = None, end: None | str = None, engine: None | Engine = None) DataFrame [source]
Get features from other feature SQL tables.
- Parameters:
ticker – Company ticker.
level –
Industry level to aggregate relative features at. The industry used according to
ticker
is subsampled according to this value. Options include:2 = major group (e.g., furniture and fixtures)
3 = industry group (e.g., office furnitures)
4 = industry (e.g., wood office furniture)
start – The start date of the observation period. Defaults to the first recorded date.
end – The end date of the observation period. Defaults to the last recorded date.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
Relative annual data dataframe with each tag as a separate column. Sorted by filing date.
Examples
>>> finagg.sec.feat.annual.normalized.from_other_refined("AAPL").head(5) NORM(LOG_CHANGE(Assets)) NORM(LOG_CHANGE(AssetsCurrent)) NORM(LOG_CHANGE(CommonStockSharesOutstanding)) ... fy filed ... 2010 2010-10-27 -0.707107 0.707107 0.707107 ... 2011 2011-10-26 0.549435 0.485644 0.574215 ... 2012 2012-10-31 2.160612 -0.609935 0.802452 ... 2013 2013-10-30 0.756833 0.336215 -0.148804 ... 2014 2014-10-27 0.244212 1.384738 2.836717 ...
- classmethod from_refined(ticker: str, /, *, start: None | str = None, end: None | str = None, engine: None | Engine = None) DataFrame [source]
Get features from the features SQL table.
This is the preferred method for accessing features for offline analysis (assuming data in the local SQL table is current).
- Parameters:
ticker – Company ticker.
start – The start date of the observation period. Defaults to the first recorded date.
end – The end date of the observation period. Defaults to the last recorded date.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
Annual data dataframe with each tag as a separate column. Sorted by filing date.
- Raises:
NoResultFound – If there are no rows for
ticker
in the refined SQL table.
Examples
>>> finagg.sec.feat.annual.normalized.from_refined("AAPL").head(5) NORM(LOG_CHANGE(Assets)) NORM(LOG_CHANGE(AssetsCurrent)) NORM(LOG_CHANGE(CommonStockSharesOutstanding)) ... fy filed ... 2010 2010-10-27 -0.707107 0.707107 0.707107 ... 2011 2011-10-26 0.549435 0.485644 0.574215 ... 2012 2012-10-31 2.160612 -0.609935 0.802452 ... 2013 2013-10-30 0.756833 0.336215 -0.148804 ... 2014 2014-10-27 0.244212 1.384738 2.836717 ...
- classmethod get_candidate_ticker_set(lb: int = 1, *, start: None | str = None, end: None | str = None, engine: None | Engine = None) set[str] [source]
Get all unique tickers in the annual SQL table that MAY BE ELIGIBLE to be in the feature’s SQL table.
This is just an alias for
finagg.sec.feat.Annual.get_ticker_set()
.- Parameters:
lb – Minimum number of rows required to include a ticker in the returned set.
start – The start date of the observation period to include when searching for tickers. Defaults to the first recorded date.
end – The end date of the observation period to include when searching for tickers. Defaults to the last recorded date.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
All unique tickers that may be valid for creating industry-normalized annual features that also have at least
lb
rows for each tag used for constructing the features.
Examples
>>> "AAPL" in finagg.sec.feat.annual.normalized.get_candidate_ticker_set() True
- classmethod get_ticker_set(lb: int = 1, *, start: None | str = None, end: None | str = None, engine: None | Engine = None) set[str] [source]
Get all unique tickers in the feature’s SQL table.
- Parameters:
lb – Minimum number of rows required to include a ticker in the returned set.
start – The start date of the observation period to include when searching for tickers. Defaults to the first recorded date.
end – The end date of the observation period to include when searching for tickers. Defaults to the last recorded date.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
All unique tickers that contain all the columns for creating industry-normalized annual features that also have at least
lb
rows.
Examples
>>> "AAPL" in finagg.sec.feat.annual.normalized.get_ticker_set() True
- classmethod get_tickers_sorted_by(column: str, /, *, ascending: bool = True, year: int = -1, engine: None | Engine = None) list[str] [source]
Get all tickers in the feature’s SQL table sorted by a particular column.
- Parameters:
column – Feature column to sort by.
ascending – Whether to return results in ascending order according to the values in
column
.year – Year to select from. Defaults to the most recent year that has data available.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
Tickers sorted by a feature column for a particular year.
Examples
>>> ts = finagg.sec.feat.annual.normalized.get_tickers_sorted_by( ... "NORM(EarningsPerShareBasic)", ... year=2020, ... ) >>> "AMD" == ts[0] True
- classmethod install(tickers: None | set[str] = None, *, processes: int = 3, engine: None | Engine = None, recreate_tables: bool = False) int [source]
Install data associated with
tickers
by pulling data from the annual SQL tables, transforming them into normalized features, and then writing to the refined annual normalized SQL table.Tables associated with this method are created if they don’t already exist.
- Parameters:
tickers – Set of tickers to install features for. Defaults to all the candidate tickers from
NormalizedAnnual.get_candidate_ticker_set()
.processes – Number of background processes to run in parallel when processing
tickers
.engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.recreate_tables – Whether to drop and recreate tables, wiping all previously installed data.
- Returns:
Number of rows written to the feature’s SQL table.
- classmethod to_refined(ticker: str, df: DataFrame, /, *, engine: None | Engine = None) int [source]
Write the dataframe to the feature store for
ticker
.- Parameters:
ticker – Company ticker.
df – Dataframe to store completely as rows in a local SQL table.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
Number of rows written to the SQL table.
- class finagg.sec.feat.Quarterly[source]
Bases:
object
Methods for gathering quarterly features from SEC EDGAR data.
The module variable
finagg.sec.feat.quarterly
is an instance of this feature set implementation and is the most popular interface for calling feature methods.Examples
It doesn’t matter which data source you use to gather features. They all return equivalent dataframes.
>>> df1 = finagg.sec.feat.quarterly.from_api("AAPL").head(5) >>> df2 = finagg.sec.feat.quarterly.from_raw("AAPL").head(5) >>> df3 = finagg.sec.feat.quarterly.from_refined("AAPL").head(5) >>> pd.testing.assert_frame_equal(df1, df2, rtol=1e-4) >>> pd.testing.assert_frame_equal(df1, df3, rtol=1e-4)
- industry
Quarterly features aggregated for an entire industry. The most popular way for accessing the
IndustryQuarterly
feature set.
- normalized
A company’s quarterly features normalized by its industry. The most popular way for accessing the
NormalizedQuarterly
feature set.
- classmethod from_api(ticker: str, /, *, start: None | str = None, end: None | str = None) DataFrame [source]
Get quarterly features directly from the SEC API.
Not all data series are published at the same rate or time. Missing rows for less-frequent quarterly publications are forward filled.
- Parameters:
ticker – Company ticker.
start – The start date of the observation period. Defaults to the first recorded date.
end – The end date of the observation period. Defaults to the last recorded date.
- Returns:
Quarterly data dataframe with each tag as a separate column. Sorted by filing date.
Examples
>>> finagg.sec.feat.quarterly.from_api("AAPL").head(5) LOG_CHANGE(Assets) LOG_CHANGE(AssetsCurrent) LOG_CHANGE(CommonStockSharesOutstanding) ... fy fp filed ... 2010 Q1 2010-01-25 0.182629 -0.023676 0.012840 ... Q2 2010-04-21 0.000000 0.000000 0.000000 ... Q3 2010-07-21 0.000000 0.000000 0.000000 ... 2011 Q1 2011-01-19 0.459174 0.278241 0.017805 ... Q2 2011-04-21 0.000000 0.000000 0.000000 ...
- classmethod from_raw(ticker: str, /, *, start: None | str = None, end: None | str = None, engine: None | Engine = None) DataFrame [source]
Get quarterly features from a local SEC SQL table.
Not all data series are published at the same rate or time. Missing rows for less-frequent quarterly publications are forward filled.
- Parameters:
ticker – Company ticker.
start – The start date of the observation period. Defaults to the first recorded date.
end – The end date of the observation period. Defaults to the last recorded date.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
Quarterly data dataframe with each tag as a separate column. Sorted by filing date.
Examples
>>> finagg.sec.feat.quarterly.from_raw("AAPL").head(5) LOG_CHANGE(Assets) LOG_CHANGE(AssetsCurrent) LOG_CHANGE(CommonStockSharesOutstanding) ... fy fp filed ... 2010 Q1 2010-01-25 0.182629 -0.023676 0.012840 ... Q2 2010-04-21 0.000000 0.000000 0.000000 ... Q3 2010-07-21 0.000000 0.000000 0.000000 ... 2011 Q1 2011-01-19 0.459174 0.278241 0.017805 ... Q2 2011-04-21 0.000000 0.000000 0.000000 ...
- classmethod from_refined(ticker: str, /, *, start: None | str = None, end: None | str = None, engine: None | Engine = None) DataFrame [source]
Get features from the refined SQL table.
This is the preferred method for accessing features for offline analysis (assuming data in the local SQL table is current).
- Parameters:
ticker – Company ticker.
start – The start date of the observation period. Defaults to the first recorded date.
end – The end date of the observation period. Defaults to the last recorded date.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
Quarterly data dataframe with each tag as a separate column. Sorted by filing date.
- Raises:
NoResultFound – If there are no rows for
ticker
in the refined SQL table.
Examples
>>> finagg.sec.feat.quarterly.from_refined("AAPL").head(5) LOG_CHANGE(Assets) LOG_CHANGE(AssetsCurrent) LOG_CHANGE(CommonStockSharesOutstanding) ... fy fp filed ... 2010 Q1 2010-01-25 0.182629 -0.023676 0.012840 ... Q2 2010-04-21 0.000000 0.000000 0.000000 ... Q3 2010-07-21 0.000000 0.000000 0.000000 ... 2011 Q1 2011-01-19 0.459174 0.278241 0.017805 ... Q2 2011-04-21 0.000000 0.000000 0.000000 ...
- classmethod get_candidate_ticker_set(lb: int = 1, *, start: None | str = None, end: None | str = None, engine: None | Engine = None) set[str] [source]
Get all unique tickers in the raw SQL table that MAY BE ELIGIBLE to be in the feature’s SQL table.
- Parameters:
lb – Minimum number of rows required to include a ticker in the returned set.
start – The start date of the observation period to include when searching for tickers. Defaults to the first recorded date.
end – The end date of the observation period to include when searching for tickers. Defaults to the last recorded date.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
All unique tickers that may be valid for creating quarterly features that also have at least
lb
rows for each tag used for constructing the features.
Examples
>>> "AAPL" in finagg.sec.feat.quarterly.get_candidate_ticker_set() True
- classmethod get_ticker_set(lb: int = 1, *, start: None | str = None, end: None | str = None, engine: None | Engine = None) set[str] [source]
Get all unique tickers in the feature’s SQL table.
- Parameters:
lb – Minimum number of rows required to include a ticker in the returned set.
start – The start date of the observation period to include when searching for tickers. Defaults to the first recorded date.
end – The end date of the observation period to include when searching for tickers. Defaults to the last recorded date.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
All unique tickers that contain all the columns for creating quarterly features that also have at least
lb
rows.
Examples
>>> "AAPL" in finagg.sec.feat.quarterly.get_ticker_set() True
- classmethod install(tickers: None | set[str] = None, *, processes: int = 3, engine: None | Engine = None, recreate_tables: bool = False) int [source]
Install data associated with
tickers
by pulling data from the raw SQL tables, transforming them into quarterly features, and then writing to the refined quarterly SQL table.Tables associated with this method are created if they don’t already exist.
- Parameters:
tickers – Set of tickers to install features for. Defaults to all the candidate tickers from
Quarterly.get_candidate_ticker_set()
.processes – Number of background processes to run in parallel when processing
tickers
.engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.recreate_tables – Whether to drop and recreate tables, wiping all previously installed data.
- Returns:
Number of rows written to the feature’s SQL table.
- classmethod to_refined(ticker: str, df: DataFrame, /, *, engine: None | Engine = None) int [source]
Write the given dataframe to the refined feature table while using the ticker
ticker
.- Parameters:
ticker – Company ticker.
df – Dataframe to store as rows in a local SQL table
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
Number of rows written to the SQL table.
- class finagg.sec.feat.IndustryQuarterly[source]
Bases:
object
Methods for gathering industry-averaged quarterly data from SEC features.
The class variable
finagg.sec.feat.Quarterly.industry
is an instance of this feature set implementation and is the most popular interface for calling feature methods.Examples
You can aggregate this feature set using a ticker or an industry code directly.
>>> df1 = finagg.sec.feat.quarterly.industry.from_refined(ticker="MSFT").head(5) >>> df2 = finagg.sec.feat.quarterly.industry.from_refined(code=73).head(5) >>> pd.testing.assert_frame_equal(df1, df2, rtol=1e-4)
- classmethod from_refined(*, ticker: None | str = None, code: None | str = None, level: Literal[2, 3, 4] = 2, start: None | str = None, end: None | str = None, engine: None | Engine = None) DataFrame [source]
Get quarterly features from the feature store, aggregated for an entire industry.
The industry can be chosen according to a company or by an industry code directly. If a company is provided, then the first
level
digits of the company’s SIC code is used for the industry code.- Parameters:
ticker – Company ticker. Lookup the industry associated with this company. Mutually exclusive with
code
.code – Industry SIC code to use for industry lookup. Mutually exclusive with
ticker
.level –
Industry level to aggregate features at. The industry used according to
ticker
orcode
is subsampled according to this value. Options include:2 = major group (e.g., furniture and fixtures)
3 = industry group (e.g., office furnitures)
4 = industry (e.g., wood office furniture)
start – The start date of the observation period. Defaults to the first recorded date.
end – The end date of the observation period. Defaults to the last recorded date.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
Quarterly data dataframe with each tag as a separate column. Sorted by filing date.
- Raises:
ValueError – If neither a
ticker
norcode
are provided.NoResultFound – If there are no rows for
ticker
orcode
in the refined SQL table.
Examples
>>> df = finagg.sec.feat.quarterly.industry.from_refined(ticker="AAPL").head(5) >>> df["mean"] name AssetCoverageRatio DebtEquityRatio EarningsPerShareBasic ... fy fp filed ... 2010 Q1 2010-04-29 1.718796 0.956111 0.865000 ... Q2 2010-07-30 1.753165 0.874013 0.380000 ... Q3 2010-11-04 1.830832 0.842663 1.062857 ... 2011 Q1 2011-05-05 2.164743 0.757230 1.022500 ... Q2 2011-08-08 2.222168 0.718292 2.016667 ... >>> df["std"] name AssetCoverageRatio DebtEquityRatio EarningsPerShareBasic ... fy fp filed ... 2010 Q1 2010-04-29 0.394667 0.327296 1.139079 ... Q2 2010-07-30 0.310804 0.285554 2.954901 ... Q3 2010-11-04 0.351052 0.273552 3.075227 ... 2011 Q1 2011-05-05 1.166501 0.311317 1.137287 ... Q2 2011-08-08 1.104677 0.313766 2.075428 ...
- class finagg.sec.feat.NormalizedQuarterly[source]
Bases:
object
Quarterly features from SEC EDGAR data normalized according to industry averages and standard deviations.
The class variable
finagg.sec.feat.Quarterly.normalized
is an instance of this feature set implementation and is the most popular interface for calling feature methods.Examples
It doesn’t matter which data source you use to gather features. They both return equivalent dataframes.
>>> df1 = finagg.sec.feat.quarterly.normalized.from_other_refined("AAPL").head(5) >>> df2 = finagg.sec.feat.quarterly.normalized.from_refined("AAPL").head(5) >>> pd.testing.assert_frame_equal(df1, df2, rtol=1e-4)
- classmethod from_other_refined(ticker: str, /, *, level: Literal[2, 3, 4] = 2, start: None | str = None, end: None | str = None, engine: None | Engine = None) DataFrame [source]
Get features from other feature SQL tables.
- Parameters:
ticker – Company ticker.
level –
Industry level to aggregate relative features at. The industry used according to
ticker
is subsampled according to this value. Options include:2 = major group (e.g., furniture and fixtures)
3 = industry group (e.g., office furnitures)
4 = industry (e.g., wood office furniture)
start – The start date of the observation period. Defaults to the first recorded date.
end – The end date of the observation period. Defaults to the last recorded date.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
Relative quarterly data dataframe with each tag as a separate column. Sorted by filing date.
Examples
>>> finagg.sec.feat.quarterly.normalized.from_other_refined("AAPL").head(5) NORM(LOG_CHANGE(Assets)) NORM(LOG_CHANGE(AssetsCurrent)) NORM(LOG_CHANGE(CommonStockSharesOutstanding)) ... fy fp filed ... 2010 Q2 2010-04-21 0.000000 0.000000 0.000000 ... Q3 2010-07-21 0.000000 0.000000 0.000000 ... 2011 Q1 2011-01-19 0.978816 0.074032 0.407282 ... Q2 2011-04-21 0.000000 0.000000 0.000000 ... Q3 2011-07-20 -0.353553 -0.353553 0.051602 ...
- classmethod from_refined(ticker: str, /, *, start: None | str = None, end: None | str = None, engine: None | Engine = None) DataFrame [source]
Get features from the features SQL table.
This is the preferred method for accessing features for offline analysis (assuming data in the local SQL table is current).
- Parameters:
ticker – Company ticker.
start – The start date of the observation period. Defaults to the first recorded date.
end – The end date of the observation period. Defaults to the last recorded date.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
Quarterly data dataframe with each tag as a separate column. Sorted by filing date.
- Raises:
NoResultFound – If there are no rows for
ticker
in the refined SQL table.
Examples
>>> finagg.sec.feat.quarterly.normalized.from_refined("AAPL").head(5) NORM(LOG_CHANGE(Assets)) NORM(LOG_CHANGE(AssetsCurrent)) NORM(LOG_CHANGE(CommonStockSharesOutstanding)) ... fy fp filed ... 2010 Q2 2010-04-21 0.000000 0.000000 0.000000 ... Q3 2010-07-21 0.000000 0.000000 0.000000 ... 2011 Q1 2011-01-19 0.978816 0.074032 0.407282 ... Q2 2011-04-21 0.000000 0.000000 0.000000 ... Q3 2011-07-20 -0.353553 -0.353553 0.051602 ...
- classmethod get_candidate_ticker_set(lb: int = 1, *, start: None | str = None, end: None | str = None, engine: None | Engine = None) set[str] [source]
Get all unique tickers in the quarterly SQL table that MAY BE ELIGIBLE to be in the feature’s SQL table.
This is just an alias for
finagg.sec.feat.Quarterly.get_ticker_set()
.- Parameters:
lb – Minimum number of rows required to include a ticker in the returned set.
start – The start date of the observation period to include when searching for tickers. Defaults to the first recorded date.
end – The end date of the observation period to include when searching for tickers. Defaults to the last recorded date.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
All unique tickers that may be valid for creating industry-normalized quarterly features that also have at least
lb
rows for each tag used for constructing the features.
Examples
>>> "AAPL" in finagg.sec.feat.quarterly.normalized.get_candidate_ticker_set() True
- classmethod get_ticker_set(lb: int = 1, *, start: None | str = None, end: None | str = None, engine: None | Engine = None) set[str] [source]
Get all unique tickers in the feature’s SQL table.
- Parameters:
lb – Minimum number of rows required to include a ticker in the returned set.
start – The start date of the observation period to include when searching for tickers. Defaults to the first recorded date.
end – The end date of the observation period to include when searching for tickers. Defaults to the last recorded date.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
All unique tickers that contain all the columns for creating industry-normalized quarterly features that also have at least
lb
rows.
Examples
>>> "AAPL" in finagg.sec.feat.quarterly.normalized.get_ticker_set() True
- classmethod get_tickers_sorted_by(column: str, /, *, ascending: bool = True, year: int = -1, quarter: int = -1, engine: None | Engine = None) list[str] [source]
Get all tickers in the feature’s SQL table sorted by a particular column.
- Parameters:
column – Feature column to sort by.
ascending – Whether to return results in ascending order according to the values in
column
.year – Year to select from. Defaults to the most recent year that has data available.
quarter – Quarter to select from. Defaults to the most recent quarter that has data available.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
Tickers sorted by a feature column for a particular year and quarter.
Examples
>>> ts = finagg.sec.feat.quarterly.normalized.get_tickers_sorted_by( ... "NORM(EarningsPerShareBasic)", ... year=2020, ... quarter=3 ... ) >>> "AMD" == ts[0] True
- classmethod install(tickers: None | set[str] = None, *, processes: int = 3, engine: None | Engine = None, recreate_tables: bool = False) int [source]
Install data associated with
tickers
by pulling data from the quarterly SQL tables, transforming them into normalized features, and then writing to the refined quarterly normalized SQL table.Tables associated with this method are created if they don’t already exist.
- Parameters:
tickers – Set of tickers to install features for. Defaults to all the candidate tickers from
NormalizedQuarterly.get_candidate_ticker_set()
.processes – Number of background processes to run in parallel when processing
tickers
.engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.recreate_tables – Whether to drop and recreate tables, wiping all previously installed data.
- Returns:
Number of rows written to the feature’s SQL table.
- classmethod to_refined(ticker: str, df: DataFrame, /, *, engine: None | Engine = None) int [source]
Write the dataframe to the feature store for
ticker
.- Parameters:
ticker – Company ticker.
df – Dataframe to store completely as rows in a local SQL table.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
Number of rows written to the SQL table.
- class finagg.sec.feat.Submissions[source]
Bases:
object
Get a single company’s metadata as-is from raw SEC data.
The module variable
finagg.sec.feat.submissions
is an instance of this feature set implementation and is the most popular interface for calling feature methods.- classmethod from_raw(ticker: str, /, *, engine: None | Engine = None) DataFrame [source]
Get a single company’s metadata as-is from raw SEC data.
The metadata provided for each company varies and each company’s metadata may be incomplete. Only their SEC CIK and SIC industry code are guaranteed to be provided.
- Parameters:
ticker – Company ticker.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
A dataframe containing the company’s metadata.
- Raises:
NoResultFound – If there are no rows for
ticker
in the raw SQL table.
Examples
>>> finagg.sec.feat.submissions.from_raw("AAPL") cik ticker entity_type sic sic_description ... 0 0000320193 AAPL None 3571 None ...
- classmethod get_ticker_set(*, engine: None | Engine = None) set[str] [source]
Get all unique ticker symbols in the raw SQL submissions table.
This method is convenient for accessing the tickers that have raw SQL data associated with them so the data associated with those tickers can be further refined. A common pattern is to use this method and other
get_ticker_set
methods (such as those found infinagg.sec.feat
) to determine which tickers are missing data from other tables or features.- Parameters:
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
Examples
>>> "AAPL" in finagg.sec.feat.submissions.get_ticker_set() True
- classmethod install(tickers: None | set[str] = None, *, engine: None | Engine = None, recreate_tables: bool = False) int [source]
Install data associated with
tickers
by pulling data from the API, and then writing the data to the raw submissions SQL table.Tables associated with this method are created if they don’t already exist.
- Parameters:
tickers – Set of tickers to install features for. Defaults to all the tickers from
finagg.indices.api.get_ticker_set()
.engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.recreate_tables – Whether to drop and recreate tables, wiping all previously installed data.
- Returns:
Number of rows written to the feature’s SQL table.
- classmethod install_from_zip(tickers: None | set[str] = None, *, engine: None | Engine = None, recreate_tables: bool = False) int [source]
Install all submissions data by downloading the bulk submissions zip file from the API, and then writing the data to the raw submissions SQL table.
Tables associated with this method are created if they don’t already exist.
- Parameters:
tickers – Set of tickers to install features for. Defaults to all the tickers from
finagg.indices.api.get_ticker_set()
.engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.recreate_tables – Whether to drop and recreate tables, wiping all previously installed data.
- Returns:
Number of rows written to the feature’s SQL table.
- classmethod to_raw(df: DataFrame, /, *, engine: None | Engine = None) int [source]
Write the given dataframe to the raw feature table.
- Parameters:
df – Dataframe to store as rows in a local SQL table
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
Number of rows written to the SQL table.
- class finagg.sec.feat.Tags[source]
Bases:
object
Get a single company concept tag as-is from raw SEC data.
The module variable
finagg.sec.feat.tags
is an instance of this feature set implementation and is the most popular interface for calling feature methods.- classmethod from_raw(ticker: str, tag: str, /, *, form: str = '10-Q', start: None | str = None, end: None | str = None, engine: None | Engine = None) DataFrame [source]
Get a single company concept tag as-is from raw SEC data.
This is the preferred method for accessing raw SEC data without using the SEC API.
- Parameters:
ticker – Company ticker.
tag – Company concept tag to retreive.
form –
SEC filing form to retrieve rows for. Options include:
”10-Q” = quarterly filings
”10-K” = annual filings
start – The start date of the observation period. Defaults to the first recorded date.
end – The end date of the observation period. Defaults to the last recorded date.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
A dataframe containing the company concept tag values across the specified period.
- Raises:
NoResultFound – If there are no rows for
ticker
andtag
in the raw SQL table.
Examples
>>> finagg.sec.feat.tags.from_raw("AAPL", "EarningsPerShareBasic").head(5) units value fy fp filed 2009 Q3 2009-07-22 USD/shares 4.20 2010 Q1 2010-01-25 USD/shares 2.54 Q2 2010-04-21 USD/shares 4.35 Q3 2010-07-21 USD/shares 6.40 2011 Q1 2011-01-19 USD/shares 3.74
- classmethod get_ticker_set(lb: int = 1, *, start: None | str = None, end: None | str = None, engine: None | Engine = None) set[str] [source]
Get all unique ticker symbols in the raw SQL tables that have at least
lb
rows.This method is convenient for accessing the tickers that have raw SQL data associated with them so the data associated with those tickers can be further refined. A common pattern is to use this method and other
get_ticker_set
methods (such as those found infinagg.sec.feat
) to determine which tickers are missing data from other tables or features.- Parameters:
lb – Lower bound number of rows that a company must have for its ticker to be included in the set returned by this method.
start – The start date of the observation period to include when searching for tickers. Defaults to the first recorded date.
end – The end date of the observation period to include when searching for tickers. Defaults to the last recorded date.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
Examples
>>> "AAPL" in finagg.sec.feat.tags.get_ticker_set() True
- classmethod group_and_pivot_from_raw(ticker: str, tags: list[str], /, *, form: str = '10-Q', start: None | str = None, end: None | str = None, engine: None | Engine = None) DataFrame [source]
Get one or more company concept tags from raw SEC data.
Joins all the tags into one table, pivoting the columns such that each tag is in its own column. Tags are forward-filled to fill gaps.
- Parameters:
ticker – Company ticker.
tags – Company concept tags to retreive.
form –
SEC filing form to retrieve rows for. Options include:
”10-Q” = quarterly filings
”10-K” = annual filings
start – The start date of the observation period. Defaults to the first recorded date.
end – The end date of the observation period. Defaults to the last recorded date.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
A dataframe containing the company concept tag values across the specified period.
- Raises:
NoResultFound – If there are no rows for
ticker
or any of the tags intags
forticker
in the raw SQL table.
Examples
>>> finagg.sec.feat.tags.group_and_pivot_from_raw( ... "AAPL", ... ["Assets", "EarningsPerShareBasic"], ... form="10-Q" ... ).head(5) Assets EarningsPerShareBasic fy fp filed 2009 Q3 2009-07-22 3.957200e+10 4.20 2010 Q1 2010-01-25 4.750100e+10 2.54 Q2 2010-04-21 4.750100e+10 4.35 Q3 2010-07-21 4.750100e+10 6.40 2011 Q1 2011-01-19 7.518300e+10 3.74
- classmethod install(tickers: None | set[str] = None, *, engine: None | Engine = None, recreate_tables: bool = False) int [source]
Install data associated with
tickers
by pulling data from the API, and then writing the data to the raw tags SQL table.Tables associated with this method are created if they don’t already exist.
- Parameters:
tickers – Set of tickers to install features for. Defaults to all the tickers from
Submissions.get_ticker_set()
.engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.recreate_tables – Whether to drop and recreate tables, wiping all previously installed data.
- Returns:
Number of rows written to the feature’s SQL table.
- classmethod install_from_zip(tickers: None | set[str] = None, *, processes: int = 3, engine: None | Engine = None, recreate_tables: bool = False) int [source]
Install all popular tags data by downloading the bulk company facts zip file from the API, and then writing the data to the raw tags SQL table.
Tables associated with this method are created if they don’t already exist.
- Parameters:
tickers – Set of tickers to install features for. Defaults to all the tickers from
Submissions.get_ticker_set()
.processes – Number of background processes to use when installing data.
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.recreate_tables – Whether to drop and recreate tables, wiping all previously installed data.
- Returns:
Number of rows written to the feature’s SQL table.
- classmethod to_raw(df: DataFrame, /, *, engine: None | Engine = None) int [source]
Write the given dataframe to the raw feature table.
- Parameters:
df – Dataframe to store as rows in a local SQL table
engine – Feature store database engine. Defaults to the engine at
finagg.backend.engine
.
- Returns:
Number of rows written to the SQL table.