module finance.astock
Short summary¶
module pyensae.finance.astock
Downloads stock prices (from Yahoo website) and other prices.
Classes¶
| class | truncated documentation |
|---|---|
| StockPrices | Defines a class containing stock prices, provides basic functions, the class uses pandas to load the data. |
| StockPricesException | Raised by StockPrices classes. |
| StockPricesHTTPException | Raised by StockPrices classes. |
Properties¶
| property | truncated documentation |
|---|---|
| dataframe | Returns the dataframe. |
| tick | Returns the tick name. |
Static Methods¶
| staticmethod | truncated documentation |
|---|---|
| available_dates | Returns the list of values (Open or High or Low or Close or Volume) from each stock for all the available_dates … |
| covariance | Computes the covariance matrix (of returns). |
| draw | Draws a graph showing one or several time series. The example was taken from date_demo.py. … |
Methods¶
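| method | truncated documentation |
|---|---|
| __getitem__ | Overloads the getitem operator … |
| df | Returns the dataframe. |
| FirstDate | Returns the first date. |
| head | Returns the first rows of the dataframe (usual head). |
| keep_dates | Removes undesired dates. |
| LastDate | Returns the last date. |
| missing | Returns the list of missing dates from a superset of trading dates. |
| plot | See draw. |
| returns | Builds the series of returns. |
| tail | Returns the last rows of the dataframe (usual tail). |
| to_csv | Saves the file in text format, see to_csv … |
| to_excel | Saves the file in Excel format, see to_excel … |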
Documentation¶
Downloads stock prices (from Yahoo website) and other prices.
- class pyensae.finance.astock.StockPrices(tick, url='google', folder='cache', begin=None, end=None, sep=',', intern=False, use_dtime=False)¶
Bases:
object
Defines a class containing stock prices and provides basic functions. The class uses pandas to load the data.
Retrieve stock prices from the Yahoo source
    from pyensae.finance import StockPrices

    prices = StockPrices(tick="NASDAQ:MSFT")
    print(prices.dataframe.head())
The class loads a stock price from either a url or a folder where the data was cached. If a file named
<folder>/<tick>.<day1>.<day2>.txt
already exists, the data is read from it. Otherwise, it is downloaded.
A couple of providers have been implemented but it is not easy to keep them up to date as website policies change on a regular basis. If url is 'yahoo', the data is downloaded from Yahoo Finance (see CAC 40). The CAC 40 composition is described by Wikipedia CAC 40. However, Yahoo Finance introduced the use of cookies in May 2017 and the download is not so easy to automate. The default provider could be Google Finance, which has now been integrated into the search engine. Tick names depend on the data provider. More details: European Markets Information. You can also go to quandl and get the tick for the module quandl.
As of May 14th, the following error appears when using url='yahoo'. It comes from an error in pandas-datareader:
    ImmediateDeprecationError(DEP_ERROR_MSG.format('Yahoo Daily'))
    pandas_datareader.exceptions.ImmediateDeprecationError: Yahoo Daily has been immediately deprecated due to large breaks in the API without the introduction of a stable replacement. Pull Requests to re-enable these data connectors are welcome. See https://github.com/pydata/pandas-datareader/issues
url='yahoo_new' should solve the issue. It relies on yahoo-historical. Data can be downloaded for a specific period of time. If not specified, the largest available period is used.
Compute the average returns and correlation matrix
    import pyensae, pandas
    from pyensae.finance import StockPrices
    from pyensae.datasource import download_data

    # download the CAC 40 composition from my website (for Yahoo)
    download_data('cac40_2013_11_11.txt', website='xd')

    # download all the prices (if not already done) and store them into files
    actions = pandas.read_csv("cac40_2013_11_11.txt", sep="\t")

    # we remove stocks with not enough historical data
    stocks = {k: StockPrices(tick=k) for k, v in actions.values}
    dates = StockPrices.available_dates(stocks.values())
    stocks = {k: v for k, v in stocks.items() if len(v.missing(dates)) <= 10}
    print("nb left", len(stocks))

    # we remove dates with missing prices
    dates = StockPrices.available_dates(stocks.values())
    ok = dates[dates["missing"] == 0]
    print("all dates before", len(dates), " after:", len(ok))
    for k in stocks:
        stocks[k] = stocks[k].keep_dates(ok)

    # we compute correlation matrix and returns
    ret, cor = StockPrices.covariance(stocks.values(), cov=False, ret=True)
You should also look at pyensae and notebook. If you use Google Finance as a provider, the tick name is usually prefixed by the marketplace (NASDAQ for example). The export does not work for all marketplaces. Another provider was added, yahoo_new, which delegates the task of getting data from Yahoo Finance to the module yahoo-historical.
- Parameters:
tick – tick name, e.g. NASDAQ:MSFT
url – data provider: if 'yahoo', the data is downloaded from there if it was not done before; a url is also possible; 'google', 'yahoo_new' and 'quandl' are predefined values
folder – cache folder (created if it does not exist)
begin – first day (datetime), see below
end – last day (datetime), see below
sep – column separator
intern – do not use unless you know what you are doing (see __getitem__)
use_dtime – if True, use DateTime instead of string
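The caching rule described above (read `<folder>/<tick>.<day1>.<day2>.txt` if it exists, download otherwise) can be sketched with the standard library alone. This is not the code of StockPrices: `cache_path`, `load_or_download` and `fake_download` are hypothetical names, the ISO date format and the replacement of `:` in the tick are assumptions, and the price in the fake file is made up.

```python
from datetime import date
from pathlib import Path
import tempfile


def cache_path(folder, tick, begin, end):
    """Build <folder>/<tick>.<day1>.<day2>.txt (ISO dates are an assumption)."""
    safe_tick = tick.replace(":", "_")  # assumption: keeps the file name portable
    return Path(folder) / f"{safe_tick}.{begin.isoformat()}.{end.isoformat()}.txt"


def load_or_download(folder, tick, begin, end, download):
    """Return the cached text if the file exists, otherwise download and cache it."""
    path = cache_path(folder, tick, begin, end)
    if path.exists():
        return path.read_text(encoding="utf-8")
    path.parent.mkdir(parents=True, exist_ok=True)  # cache folder created if missing
    text = download(tick, begin, end)
    path.write_text(text, encoding="utf-8")
    return text


calls = []


def fake_download(tick, begin, end):
    """Stand-in for a real provider; records each call and returns made-up data."""
    calls.append(tick)
    return "Date,Close\n2020-01-02,100.0\n"


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "cache"
    begin, end = date(2020, 1, 1), date(2020, 1, 31)
    first = load_or_download(folder, "NASDAQ:MSFT", begin, end, fake_download)
    second = load_or_download(folder, "NASDAQ:MSFT", begin, end, fake_download)
```

The second call finds the file written by the first one, so the provider is only contacted once.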
- FirstDate()¶
Returns the first date.
- LastDate()¶
Returns the last date.
- __getitem__(key)¶
Overloads the getitem operator to get a StockPrices object.
- Parameters:
key – key
- Returns:
StockPrices
- __init__(tick, url='google', folder='cache', begin=None, end=None, sep=',', intern=False, use_dtime=False)¶
- Parameters:
tick – tick name, e.g. NASDAQ:MSFT
url – data provider: if 'yahoo', the data is downloaded from there if it was not done before; a url is also possible; 'google', 'yahoo_new' and 'quandl' are predefined values
folder – cache folder (created if it does not exist)
begin – first day (datetime), see below
end – last day (datetime), see below
sep – column separator
intern – do not use unless you know what you are doing (see __getitem__)
use_dtime – if True, use DateTime instead of string
- __len__()¶
- Returns:
number of observations
- static available_dates(listStockPrices, missing=True, field='Close')¶
Returns the values of one field (Open, High, Low, Close or Volume) of each stock for all the available dates of a list of stock prices.
A missing date is a date for which there is at least one stock price and one missing stock price.
If missing is True, a column is added which gives the number of missing stock prices for each date.
- Parameters:
listStockPrices – list of StockPrices
missing – True or False
field – which field to use to fill the matrix
- Returns:
matrix with the available dates for each stock
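The definition of a missing date given above can be illustrated with plain pandas. This is a minimal sketch, not the implementation of available_dates: the tick names `AAA` and `BBB` and all prices are made up, and only the column name `missing` is taken from the example earlier on this page.

```python
import pandas as pd

# Made-up Close prices; BBB has no quote on 2020-01-03.
close = {
    "AAA": pd.Series({"2020-01-02": 10.0, "2020-01-03": 10.5, "2020-01-06": 10.4}),
    "BBB": pd.Series({"2020-01-02": 20.0, "2020-01-06": 20.6}),
}

# One row per date seen in at least one series, one column per stock.
matrix = pd.DataFrame(close).sort_index()

# Number of stocks without a price on each date.
matrix["missing"] = matrix.isna().sum(axis=1)

# Dates for which every stock has a price.
ok = matrix[matrix["missing"] == 0]
print(matrix)
```

Filtering on `missing == 0` mirrors the `dates[dates["missing"] == 0]` step of the correlation example.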
- static covariance(listStockPrices, missing=True, field='Close', cov=True, ret=False)¶
Computes the covariance matrix (of returns).
- Parameters:
listStockPrices – list of StockPrices
field – which field to use to fill the matrix
cov – if True, returns the covariance, otherwise, the correlations
ret – if True, also add the returns
- Returns:
square dataframe, or two dataframes (returns, covariance or correlation) if ret is True
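As an illustration of what a covariance or correlation matrix of returns looks like, here is a minimal pandas sketch. It assumes simple returns (`pct_change`); the formula actually used by covariance is not stated on this page, and the tick names and prices below are made up.

```python
import pandas as pd

# Made-up Close prices on common dates (no missing value).
prices = pd.DataFrame(
    {"AAA": [100.0, 102.0, 101.0, 105.0], "BBB": [50.0, 51.0, 50.0, 53.0]},
    index=["2020-01-02", "2020-01-03", "2020-01-06", "2020-01-07"],
)

# Simple returns r_t = p_t / p_{t-1} - 1 (assumption); the first row is undefined.
ret = prices.pct_change().dropna()

cov = ret.cov()   # square dataframe of covariances
cor = ret.corr()  # square dataframe of correlations
print(ret.mean())
print(cor)
```

Both matrices are symmetric with one row and one column per stock, which matches the "square dataframe" mentioned in the return value.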
- property dataframe¶
Returns the dataframe.
- df()¶
Returns the dataframe.
- static draw(listStockPrices, begin=None, end=None, field='Close', date_format=None, existing=None, axis=1, ax=None, label_prefix=None, color=None, **args)¶
Draws a graph showing one or several time series. The example was taken from date_demo.py.
- Parameters:
listStockPrices – list of StockPrices (or one StockPrices if it is the only one)
begin – first date (datetime) or None to take the first one
end – last included date (datetime) or None to take the last one
field – Open, High, Low, Close, Adj Close, Volume
date_format – %Y or %Y-%m or %Y-%m-%d or None if you prefer the function to choose
axis – 1 or 2, it only works if existing is not None. If axis is 2, the function draws the curves on the second axis.
label_prefix – to prefix curve label
color – curve color
args – other parameters to give to the method plt.subplots
ax – use existing axes
- Returns:
The parameter figsize of the method subplots can change the graph size (see the example below).
Graph of a financial series
    from pyensae.finance import StockPrices

    cache = "cache"  # cache folder, same value as the default
    stocks = [StockPrices("NASDAQ:MSFT", folder=cache),
              StockPrices("NASDAQ:GOOGL", folder=cache),
              StockPrices("NASDAQ:AAPL", folder=cache)]
    fig, ax, plt = StockPrices.draw(stocks)
    fig.savefig("image.png")
    fig, ax, plt = StockPrices.draw(stocks, begin="2010-01-01", figsize=(16, 8))
    plt.show()
You can also chain the graphs and add a series on a second graph:
    from pyensae.finance import StockPrices

    cache = "cache"  # cache folder, same value as the default
    stock = StockPrices("NASDAQ:MSFT", folder=cache)
    stock2 = StockPrices("NASDAQ:GOOGL", folder=cache)
    fig, ax, plt = stock.plot(figsize=(16, 8))
    fig, ax, plt = stock2.plot(existing=(fig, ax), axis=2)
    plt.show()
Changed in version 1.1: Parameter existing was removed and parameter ax was added. If the dates overlap, the method autofmt_xdate should be called.
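The note about existing axes and autofmt_xdate can be tried out with matplotlib alone. The sketch below does not call StockPrices: it draws two made-up series on a first and a second y-axis, then rotates the date labels. The Agg backend, the series values and the labels are assumptions chosen so that the code runs without a display.

```python
from datetime import date, timedelta

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up daily values, for illustration only.
days = [date(2020, 1, 1) + timedelta(days=i) for i in range(10)]
series_a = [100 + i for i in range(10)]
series_b = [50 + 0.5 * i * i for i in range(10)]

# figsize is forwarded to plt.subplots, as described for draw.
fig, ax = plt.subplots(figsize=(16, 8))
ax.plot(days, series_a, color="tab:blue", label="A Close")
ax.set_ylabel("A")

# Second y-axis sharing the same dates.
ax2 = ax.twinx()
ax2.plot(days, series_b, color="tab:orange", label="B Close")
ax2.set_ylabel("B")

# Rotate and align the date labels so that they do not overlap.
fig.autofmt_xdate()
```

Calling `fig.savefig("image.png")` afterwards writes the figure to a file, as in the example above.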
- head()¶
Returns the first rows of the dataframe (usual head).
- keep_dates(trading_dates)¶
Removes undesired dates.
- Parameters:
trading_dates – dates
- Returns:
new series
- missing(trading_dates)¶
Returns the list of missing dates from a superset of trading dates.
- Parameters:
trading_dates – trading dates (DataFrame having the dates in the column Date or in the index)
- Returns:
missing dates (or None if issues)
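What keep_dates and missing compute can be expressed with pandas index operations. This is a sketch of the idea under simple assumptions (dates stored as strings in an index named Date, made-up prices), not the code of either method.

```python
import pandas as pd

# Superset of trading dates (made up).
trading_dates = pd.Index(["2020-01-02", "2020-01-03", "2020-01-06"], name="Date")

# Prices of one stock: no quote on 2020-01-03, one extra quote on 2020-01-07.
stock = pd.DataFrame(
    {"Close": [10.0, 10.4, 10.1]},
    index=pd.Index(["2020-01-02", "2020-01-06", "2020-01-07"], name="Date"),
)

# Trading dates for which the stock has no price.
missing_dates = trading_dates.difference(stock.index)

# Same stock restricted to the trading dates (undesired dates removed).
kept = stock.loc[stock.index.intersection(trading_dates)]
print(list(missing_dates))
print(kept)
```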
- plot(begin=None, end=None, field='Close', date_format=None, existing=None, axis=1, ax=None, label_prefix=None, color=None, **args)¶
See draw.
- returns()¶
Builds the series of returns.
- Parameters:
col – column to use to compute the returns
- Returns:
StockPrices
- property shape¶
- Returns:
number of observations
- tail()¶
Returns the last rows of the dataframe (usual tail).
- property tick¶
Returns the tick name.
- to_csv(filename, sep='\t', index=False, **params)¶
Saves the file in text format, see to_csv
- Parameters:
filename – filename
sep – separator
index – to keep or drop the index
params – other parameters
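The defaults `sep='\t'` and `index=False` correspond to parameters of the same name in pandas `DataFrame.to_csv`. The sketch below calls pandas directly on a made-up dataframe and writes to an in-memory buffer; that StockPrices.to_csv forwards its arguments in exactly this way is an assumption.

```python
import io

import pandas as pd

# Made-up prices, for illustration only.
prices = pd.DataFrame({"Date": ["2020-01-02", "2020-01-03"], "Close": [10.0, 10.5]})

# Tab-separated text without the index column.
buffer = io.StringIO()
prices.to_csv(buffer, sep="\t", index=False)
text = buffer.getvalue()
lines = text.splitlines()
print(text)
```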
- exception pyensae.finance.astock.StockPricesException¶
Bases:
Exception
Raised by StockPrices classes.
- exception pyensae.finance.astock.StockPricesHTTPException¶
Bases:
StockPricesException
Raised by StockPrices classes.