API Reference

Running a Backtest

zipline.run_algorithm(...)[source]

Run a trading algorithm.

Parameters:

start : datetime

The start date of the backtest.

end : datetime

The end date of the backtest.

initialize : callable[context -> None]

The initialize function to use for the algorithm. This is called once at the very beginning of the backtest and should be used to set up any state needed by the algorithm.

capital_base : float

The starting capital for the backtest.

handle_data : callable[(context, BarData) -> None], optional

The handle_data function to use for the algorithm. This is called every minute when data_frequency == 'minute' or every day when data_frequency == 'daily'.

before_trading_start : callable[(context, BarData) -> None], optional

The before_trading_start function for the algorithm. This is called once before each trading day (after initialize on the first day).

analyze : callable[(context, pd.DataFrame) -> None], optional

The analyze function to use for the algorithm. This function is called once at the end of the backtest and is passed the context and the performance data.

data_frequency : {‘daily’, ‘minute’}, optional

The data frequency to run the algorithm at.

data : pd.DataFrame, pd.Panel, or DataPortal, optional

The OHLCV data to run the backtest with. This argument is mutually exclusive with bundle and bundle_timestamp.

bundle : str, optional

The name of the data bundle to use to load the data to run the backtest with. This defaults to ‘quantopian-quandl’. This argument is mutually exclusive with data.

bundle_timestamp : datetime, optional

The datetime to lookup the bundle data for. This defaults to the current time. This argument is mutually exclusive with data.

default_extension : bool, optional

Should the default zipline extension be loaded? This is found at $ZIPLINE_ROOT/extension.py.

extensions : iterable[str], optional

The names of any other extensions to load. Each element may either be a dotted module path like a.b.c or a path to a python file ending in .py like a/b/c.py.

strict_extensions : bool, optional

Should the run fail if any extensions fail to load? If this is False, a warning will be raised instead.

environ : mapping[str -> str], optional

The os environment to use. Many extensions use this to get parameters. This defaults to os.environ.

Returns:

perf : pd.DataFrame

The daily performance of the algorithm.

See also

zipline.data.bundles.bundles
The available data bundles.
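The calling order of the user-supplied functions can be sketched as a plain-Python loop. This is illustrative only: the names simulate, sessions, and bars_for are stand-ins, and run_algorithm of course does far more (data loading, order simulation, performance tracking).

```python
# Illustrative sketch of the order in which run_algorithm invokes the
# user-supplied callbacks; `sessions` and `bars_for` are invented names,
# not part of the zipline API.

def simulate(initialize, handle_data, before_trading_start, analyze,
             sessions, bars_for):
    context = {}   # zipline passes a TradingAlgorithm-backed context object
    calls = []     # trace of callback invocations, for illustration

    initialize(context)
    calls.append("initialize")
    for session in sessions:
        if before_trading_start is not None:
            before_trading_start(context, None)
            calls.append("before_trading_start")
        for bar in bars_for(session):
            if handle_data is not None:
                handle_data(context, bar)
                calls.append("handle_data")
    if analyze is not None:
        analyze(context, None)  # second argument is the performance frame
        calls.append("analyze")
    return calls

trace = simulate(
    initialize=lambda ctx: None,
    handle_data=lambda ctx, data: None,
    before_trading_start=lambda ctx, data: None,
    analyze=lambda ctx, perf: None,
    sessions=["2016-01-04", "2016-01-05"],
    bars_for=lambda session: [session],  # one "daily" bar per session
)
# initialize once, then before_trading_start + handle_data per session,
# then analyze once at the end.
```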

Algorithm API

The following methods are available for use in the initialize, handle_data, and before_trading_start API functions.

In all listed functions, the self argument is implicitly the currently-executing TradingAlgorithm instance.

Data Object

class zipline.protocol.BarData

Provides methods to access spot value or history windows of price data. Also provides some utility methods to determine if an asset is alive, has recent trade data, etc.

This is what is passed as data to the handle_data function.

Parameters:

data_portal : DataPortal

Provider for bar pricing data.

simulation_dt_func : callable

Function which returns the current simulation time. This is usually bound to a method of TradingSimulation.

data_frequency : {‘minute’, ‘daily’}

The frequency of the bar data, i.e. whether the data is daily or minute bars.

universe_func : callable, optional

Function which returns the current ‘universe’. This is for backwards compatibility with older API concepts.

can_trade()

For the given asset or iterable of assets, returns True if all of the following are true:

  1. The asset is alive for the session of the current simulation time (if the current simulation time is not a market minute, we use the next session).
  2. (If we are in minute mode) the asset’s exchange is open at the current simulation time or at the simulation calendar’s next market minute.
  3. There is a known last price for the asset.

Parameters:

assets : Asset or iterable of Assets

Returns:

can_trade : bool or pd.Series[bool] indexed by asset.

Notes

The second condition above warrants some further explanation:
  • If the asset’s exchange calendar is identical to the simulation calendar, then this condition always returns True.
  • If there are market minutes in the simulation calendar outside of this asset’s exchange’s trading hours (for example, if the simulation is running on the CME calendar but the asset is MSFT, which trades on the NYSE), this condition returns False during those minutes (for example, at 3:15 am Eastern on a weekday, when the CME is open but the NYSE is closed).

current()

Returns the current value of the given assets for the given fields at the current simulation time. Current values are the as-traded price and are usually not adjusted for events like splits or dividends (see notes for more information).

Parameters:

assets : Asset or iterable of Assets

fields : str or iterable[str].

Valid values are: “price”, “last_traded”, “open”, “high”, “low”, “close”, “volume”, or column names in files read by fetch_csv.

Returns:

current_value : Scalar, pandas Series, or pandas DataFrame.

See notes below.

Notes

If a single asset and a single field are passed in, a scalar float value is returned.

If a single asset and a list of fields are passed in, a pandas Series is returned whose indices are the fields, and whose values are scalar values for this asset for each field.

If a list of assets and a single field are passed in, a pandas Series is returned whose indices are the assets, and whose values are scalar values for each asset for the given field.

If a list of assets and a list of fields are passed in, a pandas DataFrame is returned, indexed by asset. The columns are the requested fields, filled with the scalar values for each asset for each field.

If the current simulation time is not a valid market time, we use the last market close instead.

“price” returns the last known close price of the asset. If there is no last known value (either because the asset has never traded, or because it has delisted) NaN is returned. If a value is found, and we had to cross an adjustment boundary (split, dividend, etc) to get it, the value is adjusted before being returned.

“last_traded” returns the date of the last trade event of the asset, even if the asset has stopped trading. If there is no last known value, pd.NaT is returned.

“volume” returns the trade volume for the current simulation time. If there is no trade this minute, 0 is returned.

“open”, “high”, “low”, and “close” return the relevant information for the current trade bar. If there is no current trade bar, NaN is returned.
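The four return shapes described in the notes can be reproduced with a small pandas sketch. The snapshot frame, asset names, and values below are invented for illustration; BarData.current itself reads spot values from the DataPortal.

```python
import pandas as pd

# Toy bar snapshot: rows are assets, columns are fields (values invented).
snapshot = pd.DataFrame(
    {"price": [10.0, 20.0], "volume": [100, 200]},
    index=["AAPL", "MSFT"],
)

def current(assets, fields):
    """Mimic BarData.current's return-shape rules (illustrative)."""
    single_asset = isinstance(assets, str)
    single_field = isinstance(fields, str)
    if single_asset and single_field:
        return snapshot.at[assets, fields]           # scalar
    if single_asset:
        return snapshot.loc[assets, list(fields)]    # Series indexed by field
    if single_field:
        return snapshot.loc[list(assets), fields]    # Series indexed by asset
    return snapshot.loc[list(assets), list(fields)]  # DataFrame

assert current("AAPL", "price") == 10.0                              # scalar
assert list(current("AAPL", ["price", "volume"]).index) == ["price", "volume"]
assert list(current(["AAPL", "MSFT"], "price").index) == ["AAPL", "MSFT"]
assert current(["AAPL", "MSFT"], ["price", "volume"]).shape == (2, 2)
```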

history()

Returns a window of data for the given assets and fields.

This data is adjusted for splits, dividends, and mergers as of the current algorithm time.

The semantics of missing data are identical to the ones described in the notes for get_spot_value.

Parameters:

assets : Asset or iterable of Assets

fields : str or iterable[str]

Valid values are “open”, “high”, “low”, “close”, “volume”, “price”, and “last_traded”.

bar_count : int

The number of bars of trade data to return.

frequency : str

“1m” for minutely data or “1d” for daily data.

Returns:

history : Series or DataFrame or Panel

Return type depends on the dimensionality of the ‘assets’ and ‘fields’ parameters.

If single asset and field are passed in, the returned Series is indexed by dt.

If multiple assets and single field are passed in, the returned DataFrame is indexed by dt, and has assets as columns.

If a single asset and multiple fields are passed in, the returned DataFrame is indexed by dt, and has fields as columns.

If multiple assets and multiple fields are passed in, the returned Panel is indexed by field, has dt as the major axis, and assets as the minor axis.

Notes

If the current simulation time is not a valid market time, we use the last market close instead.
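A common use of history is computing indicators over the returned window, such as a moving average. The Series below stands in for the result of a history call like data.history(asset, "price", bar_count=5, frequency="1d"); the prices are invented.

```python
import pandas as pd

# Stand-in for a 5-bar daily "price" history window, indexed by dt
# (values invented for illustration).
prices = pd.Series(
    [10.0, 11.0, 12.0, 13.0, 14.0],
    index=pd.date_range("2016-01-04", periods=5, freq="B"),
)

short_mavg = prices.tail(3).mean()  # 3-day moving average of the window
long_mavg = prices.mean()           # 5-day moving average of the window
assert short_mavg == 13.0
assert long_mavg == 12.0
```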

is_stale()

For the given asset or iterable of assets, returns true if the asset is alive and there is no trade data for the current simulation time.

If the asset has never traded, returns False.

If the current simulation time is not a valid market time, we use the current time to check if the asset is alive, but we use the last market minute/day for the trade data check.

Parameters:

assets : Asset or iterable of Assets

Returns:

is_stale : bool or pd.Series[bool] indexed by asset.

Scheduling Functions

zipline.api.schedule_function(self, func, date_rule=None, time_rule=None, half_days=True, calendar=None)

Schedules a function to be called according to some timed rules.

Parameters:

func : callable[(context, data) -> None]

The function to execute when the rule is triggered.

date_rule : EventRule, optional

The rule for the dates to execute this function.

time_rule : EventRule, optional

The rule for the times to execute this function.

half_days : bool, optional

Should this rule fire on half days?

class zipline.api.date_rules[source]
every_day

alias of Always

static month_end(days_offset=0)[source]
static month_start(days_offset=0)[source]
static week_end(days_offset=0)[source]
static week_start(days_offset=0)[source]
class zipline.api.time_rules[source]
every_minute

alias of Always

market_close

alias of BeforeClose

market_open

alias of AfterOpen
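As a sketch of how a date rule such as date_rules.week_start(days_offset=1) selects sessions, consider picking the (days_offset+1)-th trading day of each week from a session calendar. The sessions below are invented and assume no holidays; the real rules also handle half days and calendar quirks.

```python
import datetime as dt
from itertools import groupby

# Invented trading sessions spanning two ISO weeks (no holidays).
sessions = [dt.date(2016, 1, d) for d in (4, 5, 6, 7, 8, 11, 12, 13)]

def week_start(sessions, days_offset=0):
    """Pick the (days_offset+1)-th session of each trading week
    (illustrative, not the real EventRule machinery)."""
    picked = []
    for _, week in groupby(sessions, key=lambda d: d.isocalendar()[1]):
        week = list(week)
        if days_offset < len(week):
            picked.append(week[days_offset])
    return picked

assert week_start(sessions) == [dt.date(2016, 1, 4), dt.date(2016, 1, 11)]
assert week_start(sessions, days_offset=1) == [dt.date(2016, 1, 5),
                                               dt.date(2016, 1, 12)]
```

In an algorithm, the equivalent scheduling would be written as schedule_function(rebalance, date_rules.week_start(days_offset=1), time_rules.market_open()).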

Orders

zipline.api.order(self, asset, amount, limit_price=None, stop_price=None, style=None)

Place an order.

Parameters:

asset : Asset

The asset that this order is for.

amount : int

The amount of shares to order. If amount is positive, this is the number of shares to buy or cover. If amount is negative, this is the number of shares to sell or short.

limit_price : float, optional

The limit price for the order.

stop_price : float, optional

The stop price for the order.

style : ExecutionStyle, optional

The execution style for the order.

Returns:

order_id : str

The unique identifier for this order.

Notes

The limit_price and stop_price arguments provide shorthands for passing common execution styles. Passing limit_price=N is equivalent to style=LimitOrder(N). Similarly, passing stop_price=M is equivalent to style=StopOrder(M), and passing limit_price=N and stop_price=M is equivalent to style=StopLimitOrder(N, M). It is an error to pass both a style and limit_price or stop_price.
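The shorthand rules in the note can be expressed as a small resolver. The function below is illustrative; only the execution-style names mirror the classes documented later in this reference.

```python
def resolve_style(limit_price=None, stop_price=None, style=None):
    """Map the limit_price/stop_price shorthands to an execution style
    (illustrative; returns a tuple naming the style and its arguments)."""
    if style is not None:
        # Mixing an explicit style with the shorthands is an error.
        if limit_price is not None or stop_price is not None:
            raise ValueError("pass either style or limit/stop prices, not both")
        return style
    if limit_price is not None and stop_price is not None:
        return ("StopLimitOrder", limit_price, stop_price)
    if limit_price is not None:
        return ("LimitOrder", limit_price)
    if stop_price is not None:
        return ("StopOrder", stop_price)
    return ("MarketOrder",)

assert resolve_style() == ("MarketOrder",)
assert resolve_style(limit_price=10.0) == ("LimitOrder", 10.0)
assert resolve_style(stop_price=9.0) == ("StopOrder", 9.0)
assert resolve_style(limit_price=10.0, stop_price=9.0) == \
    ("StopLimitOrder", 10.0, 9.0)
```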

zipline.api.order_value(self, asset, value, limit_price=None, stop_price=None, style=None)

Place an order by desired value rather than desired number of shares.

Parameters:

asset : Asset

The asset that this order is for.

value : float

If the requested asset exists, the requested value is divided by its price to imply the number of shares to transact. If the Asset being ordered is a Future, the ‘value’ calculated is actually the exposure, as Futures have no ‘value’.

value > 0 :: Buy/Cover
value < 0 :: Sell/Short

limit_price : float, optional

The limit price for the order.

stop_price : float, optional

The stop price for the order.

style : ExecutionStyle

The execution style for the order.

Returns:

order_id : str

The unique identifier for this order.

Notes

See zipline.api.order() for more information about limit_price, stop_price, and style
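The value-to-shares conversion amounts to dividing the requested value by the asset's price. The sketch below truncates toward zero so the order never exceeds the requested value; that rounding choice is an illustrative assumption, not necessarily zipline's exact policy.

```python
def value_to_shares(value, price):
    """Convert a requested dollar value into a share count.
    Truncation toward zero is an illustrative assumption."""
    return int(value / price)

assert value_to_shares(10000.0, 25.0) == 400     # buy 400 shares
assert value_to_shares(-10000.0, 25.0) == -400   # sell/short 400 shares
```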

zipline.api.order_percent(self, asset, percent, limit_price=None, stop_price=None, style=None)

Place an order in the specified asset corresponding to the given percent of the current portfolio value.

Parameters:

asset : Asset

The asset that this order is for.

percent : float

The percentage of the portfolio value to allocate to asset. This is specified as a decimal, for example: 0.50 means 50%.

limit_price : float, optional

The limit price for the order.

stop_price : float, optional

The stop price for the order.

style : ExecutionStyle

The execution style for the order.

Returns:

order_id : str

The unique identifier for this order.

Notes

See zipline.api.order() for more information about limit_price, stop_price, and style

zipline.api.order_target(self, asset, target, limit_price=None, stop_price=None, style=None)

Place an order to adjust a position to a target number of shares. If the position doesn’t already exist, this is equivalent to placing a new order. If the position does exist, this is equivalent to placing an order for the difference between the target number of shares and the current number of shares.

Parameters:

asset : Asset

The asset that this order is for.

target : int

The desired number of shares of asset.

limit_price : float, optional

The limit price for the order.

stop_price : float, optional

The stop price for the order.

style : ExecutionStyle

The execution style for the order.

Returns:

order_id : str

The unique identifier for this order.

Notes

order_target does not take into account any open orders. For example:

order_target(sid(0), 10)
order_target(sid(0), 10)

This code will result in 20 shares of sid(0) because the first call to order_target will not have been filled when the second order_target call is made.

See zipline.api.order() for more information about limit_price, stop_price, and style
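A common workaround for the caveat above is to count unfilled open orders before targeting. The helper below is hypothetical (not a zipline API); in a real algorithm the open-order amount would come from get_open_orders(), documented later in this reference.

```python
def shares_to_order(target, current_position, open_order_amount):
    """Shares still needed once the current position and any unfilled
    open orders are both counted (hypothetical helper, not a zipline API)."""
    return target - current_position - open_order_amount

# First call: no position, no open orders -> order 10 shares.
assert shares_to_order(10, 0, 0) == 10
# Second call in the same bar: the first order (10 shares) is still open,
# so no additional shares should be ordered.
assert shares_to_order(10, 0, 10) == 0
```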

zipline.api.order_target_value(self, asset, target, limit_price=None, stop_price=None, style=None)

Place an order to adjust a position to a target value. If the position doesn’t already exist, this is equivalent to placing a new order. If the position does exist, this is equivalent to placing an order for the difference between the target value and the current value. If the Asset being ordered is a Future, the ‘target value’ calculated is actually the target exposure, as Futures have no ‘value’.

Parameters:

asset : Asset

The asset that this order is for.

target : float

The desired total value of asset.

limit_price : float, optional

The limit price for the order.

stop_price : float, optional

The stop price for the order.

style : ExecutionStyle

The execution style for the order.

Returns:

order_id : str

The unique identifier for this order.

Notes

order_target_value does not take into account any open orders. For example:

order_target_value(sid(0), 10)
order_target_value(sid(0), 10)

This code will result in 20 dollars of sid(0) because the first call to order_target_value will not have been filled when the second order_target_value call is made.

See zipline.api.order() for more information about limit_price, stop_price, and style

zipline.api.order_target_percent(self, asset, target, limit_price=None, stop_price=None, style=None)

Place an order to adjust a position to a target percent of the current portfolio value. If the position doesn’t already exist, this is equivalent to placing a new order. If the position does exist, this is equivalent to placing an order for the difference between the target percent and the current percent.

Parameters:

asset : Asset

The asset that this order is for.

target : float

The desired percentage of the portfolio value to allocate to asset. This is specified as a decimal, for example: 0.50 means 50%.

limit_price : float, optional

The limit price for the order.

stop_price : float, optional

The stop price for the order.

style : ExecutionStyle

The execution style for the order.

Returns:

order_id : str

The unique identifier for this order.

Notes

order_target_percent does not take into account any open orders. For example:

order_target_percent(sid(0), 0.10)
order_target_percent(sid(0), 0.10)

This code will result in 20% of the portfolio being allocated to sid(0) because the first call to order_target_percent will not have been filled when the second order_target_percent call is made.

See zipline.api.order() for more information about limit_price, stop_price, and style

class zipline.finance.execution.ExecutionStyle[source]

Abstract base class representing a modification to a standard order.

exchange

The exchange to which this order should be routed.

get_limit_price(is_buy)[source]

Get the limit price for this order. Returns either None or a numerical value >= 0.

get_stop_price(is_buy)[source]

Get the stop price for this order. Returns either None or a numerical value >= 0.

class zipline.finance.execution.MarketOrder(exchange=None)[source]

Class encapsulating an order to be placed at the current market price.

class zipline.finance.execution.LimitOrder(limit_price, exchange=None)[source]

Execution style representing an order to be executed at a price equal to or better than a specified limit price.

class zipline.finance.execution.StopOrder(stop_price, exchange=None)[source]

Execution style representing an order to be placed once the market price reaches a specified stop price.

class zipline.finance.execution.StopLimitOrder(limit_price, stop_price, exchange=None)[source]

Execution style representing a limit order to be placed with a specified limit price once the market reaches a specified stop price.
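The interaction of the four styles can be sketched as a tiny fill check for a buy order. This is illustrative only; zipline's real order matching lives in the slippage models, and sell-side logic mirrors the comparisons below.

```python
def buy_fill_price(style, market_price):
    """Return the fill price for a buy under each style, or None if the
    order does not execute at this market price (illustrative only)."""
    kind = style[0]
    if kind == "market":
        return market_price
    if kind == "limit":                      # ("limit", limit_price)
        return market_price if market_price <= style[1] else None
    if kind == "stop":                       # ("stop", stop_price)
        return market_price if market_price >= style[1] else None
    if kind == "stoplimit":                  # ("stoplimit", limit, stop)
        triggered = market_price >= style[2]
        return market_price if triggered and market_price <= style[1] else None
    raise ValueError(kind)

assert buy_fill_price(("market",), 10.0) == 10.0
assert buy_fill_price(("limit", 9.5), 10.0) is None    # above the limit
assert buy_fill_price(("stop", 10.5), 10.0) is None    # stop not reached
assert buy_fill_price(("stoplimit", 11.0, 10.5), 10.75) == 10.75
```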

zipline.api.get_order(self, order_id)

Lookup an order based on the order id returned from one of the order functions.

Parameters:

order_id : str

The unique identifier for the order.

Returns:

order : Order

The order object.

zipline.api.get_open_orders(self, asset=None)

Retrieve all of the current open orders.

Parameters:

asset : Asset

If passed and not None, return only the open orders for the given asset instead of all open orders.

Returns:

open_orders : dict[list[Order]] or list[Order]

If no asset is passed this will return a dict mapping Assets to a list containing all the open orders for the asset. If an asset is passed then this will return a list of the open orders for this asset.

zipline.api.cancel_order(self, order_param)

Cancel an open order.

Parameters:

order_param : str or Order

The order_id or order object to cancel.

Order Cancellation Policies

zipline.api.set_cancel_policy(self, cancel_policy)

Sets the order cancellation policy for the simulation.

Parameters:

cancel_policy : CancelPolicy

The cancellation policy to use.

class zipline.finance.cancel_policy.CancelPolicy[source]

Abstract cancellation policy interface.

should_cancel(event)[source]

Should all open orders be cancelled?

Parameters:

event : enum-value

An event type, one of:
  • zipline.gens.sim_engine.BAR
  • zipline.gens.sim_engine.DAY_START
  • zipline.gens.sim_engine.DAY_END
  • zipline.gens.sim_engine.MINUTE_END
Returns:

should_cancel : bool

Should all open orders be cancelled?

zipline.api.EODCancel(warn_on_cancel=True)[source]

This policy cancels open orders at the end of the day. For now, Zipline will only apply this policy to minutely simulations.

Parameters:

warn_on_cancel : bool, optional

Should a warning be raised if this causes an order to be cancelled?

zipline.api.NeverCancel()[source]

Orders are never automatically canceled.

Assets

zipline.api.symbol(self, symbol_str)

Lookup an Equity by its ticker symbol.

Parameters:

symbol_str : str

The ticker symbol for the equity to lookup.

Returns:

equity : Equity

The equity that held the ticker symbol on the current symbol lookup date.

Raises:

SymbolNotFound

Raised when the symbol was not held on the current lookup date.

zipline.api.symbols(self, *args)

Lookup multiple Equities as a list.

Parameters:

*args : iterable[str]

The ticker symbols to lookup.

Returns:

equities : list[Equity]

The equities that held the given ticker symbols on the current symbol lookup date.

Raises:

SymbolNotFound

Raised when one of the symbols was not held on the current lookup date.

zipline.api.future_symbol(self, symbol)

Lookup a futures contract with a given symbol.

Parameters:

symbol : str

The symbol of the desired contract.

Returns:

future : Future

The future that trades with the name symbol.

Raises:

SymbolNotFound

Raised when no contract named ‘symbol’ is found.

zipline.api.future_chain(self, root_symbol, as_of_date=None, offset=0)

Look up a future chain.

Parameters:

root_symbol : str

The root symbol of a future chain.

as_of_date : datetime.datetime or pandas.Timestamp or str, optional

Date at which the chain determination is rooted. If this date is not passed in, the current simulation session (not minute) is used.

offset : int, optional

Number of sessions to shift as_of_date. Positive values shift forward in time; negative values shift backward in time.

Returns:

chain : FutureChain

The future chain matching the specified parameters.

Raises:

RootSymbolNotFound

If a future chain could not be found for the given root symbol.

zipline.api.set_symbol_lookup_date(self, dt)

Set the date for which symbols will be resolved to their assets (symbols may map to different firms or underlying assets at different times)

Parameters:

dt : datetime

The new symbol lookup date.

zipline.api.sid(self, sid)

Lookup an Asset by its unique asset identifier.

Parameters:

sid : int

The unique integer that identifies an asset.

Returns:

asset : Asset

The asset with the given sid.

Raises:

SidsNotFound

When a requested sid does not map to any asset.

Trading Controls

Zipline provides trading controls to help ensure that the algorithm is performing as expected. These functions help protect the algorithm from certain bugs that could cause undesirable behavior when trading with real money.

zipline.api.set_do_not_order_list(self, restricted_list)

Set a restriction on which assets can be ordered.

Parameters:

restricted_list : container[Asset]

The assets that cannot be ordered.

zipline.api.set_long_only(self)

Set a rule specifying that this algorithm cannot take short positions.

zipline.api.set_max_leverage(self, max_leverage)

Set a limit on the maximum leverage of the algorithm.

Parameters:

max_leverage : float

The maximum leverage for the algorithm. If not provided there will be no maximum.

zipline.api.set_max_order_count(self, max_count)

Set a limit on the number of orders that can be placed in a single day.

Parameters:

max_count : int

The maximum number of orders that can be placed on any single day.

zipline.api.set_max_order_size(self, asset=None, max_shares=None, max_notional=None)

Set a limit on the number of shares and/or dollar value of any single order placed for sid. Limits are treated as absolute values and are enforced at the time that the algo attempts to place an order for sid.

If an algorithm attempts to place an order that would result in exceeding one of these limits, raise a TradingControlException.

Parameters:

asset : Asset, optional

If provided, this sets the guard only on positions in the given asset.

max_shares : int, optional

The maximum number of shares that can be ordered at one time.

max_notional : float, optional

The maximum value that can be ordered at one time.

zipline.api.set_max_position_size(self, asset=None, max_shares=None, max_notional=None)

Set a limit on the number of shares and/or dollar value held for the given sid. Limits are treated as absolute values and are enforced at the time that the algo attempts to place an order for sid. This means that it’s possible to end up with more than the max number of shares due to splits/dividends, and more than the max notional due to price improvement.

If an algorithm attempts to place an order that would result in increasing the absolute value of shares/dollar value exceeding one of these limits, raise a TradingControlException.

Parameters:

asset : Asset, optional

If provided, this sets the guard only on positions in the given asset.

max_shares : int, optional

The maximum number of shares to hold for an asset.

max_notional : float, optional

The maximum value to hold for an asset.

Simulation Parameters

zipline.api.set_benchmark(self, benchmark)

Set the benchmark asset.

Parameters:

benchmark : Asset

The asset to set as the new benchmark.

Notes

Any dividends paid out for that new benchmark asset will be automatically reinvested.

Commission Models

zipline.api.set_commission(self, commission)

Sets the commission model for the simulation.

Parameters:

commission : CommissionModel

The commission model to use.

class zipline.finance.commission.CommissionModel[source]

Abstract commission model interface.

Commission models are responsible for accepting order/transaction pairs and calculating how much commission should be charged to an algorithm’s account on each transaction.

calculate(order, transaction)[source]

Calculate the amount of commission to charge on order as a result of transaction.

Parameters:

order : zipline.finance.order.Order

The order being processed.

The commission field of order is a float indicating the amount of commission already charged on this order.

transaction : zipline.finance.transaction.Transaction

The transaction being processed. A single order may generate multiple transactions if there isn’t enough volume in a given bar to fill the full amount requested in the order.

Returns:

amount_charged : float

The additional commission, in dollars, that we should attribute to this order.

class zipline.finance.commission.PerShare(cost=0.0075, min_trade_cost=1.0)[source]

Calculates a commission for a transaction based on a per share cost with an optional minimum cost per trade.

Parameters:

cost : float, optional

The amount of commissions paid per share traded.

min_trade_cost : float, optional

The minimum amount of commissions paid per trade.
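The per-share model with a minimum can be sketched as an incremental charge per fill, mirroring the shape of CommissionModel.calculate above. The exact bookkeeping for the minimum (topping the order up to min_trade_cost on early fills) is an assumption made for illustration.

```python
def per_share_commission(shares_filled, already_charged,
                         cost=0.0075, min_trade_cost=1.0):
    """Additional commission for one fill of `shares_filled` shares, given
    the commission already charged on the same order. The min-cost
    bookkeeping here is an illustrative assumption."""
    per_share = abs(shares_filled) * cost
    total_so_far = already_charged + per_share
    # Top the order up to the minimum until it has been covered.
    if total_so_far < min_trade_cost:
        return min_trade_cost - already_charged
    return per_share

# A 100-share fill at $0.0075/share is $0.75, below the $1 minimum,
# so the first fill is topped up to $1.00 in total.
assert per_share_commission(100, 0.0) == 1.0
# A later 400-share fill on the same order adds 400 * 0.0075 = $3.00.
assert per_share_commission(400, 1.0) == 3.0
```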

class zipline.finance.commission.PerTrade(cost=1.0)[source]

Calculates a commission for a transaction based on a per trade cost.

Parameters:

cost : float, optional

The flat amount of commissions paid per trade.

class zipline.finance.commission.PerDollar(cost=0.0015)[source]

Calculates a commission for a transaction based on a per dollar cost.

Parameters:

cost : float

The flat amount of commissions paid per dollar of transaction value.

Slippage Models

zipline.api.set_slippage(self, slippage)

Set the slippage model for the simulation.

Parameters:

slippage : SlippageModel

The slippage model to use.

class zipline.finance.slippage.SlippageModel[source]

Abstract interface for defining a slippage model.

process_order

Process how orders get filled.

Parameters:

data : BarData

The data for the given bar.

order : Order

The order to simulate.

Returns:

execution_price : float

The price to execute the trade at.

execution_volume : int

The number of shares that could be filled. This may not be all the shares ordered in which case the order will be filled over multiple bars.

class zipline.finance.slippage.FixedSlippage(spread=0.0)[source]

Model slippage as a fixed spread.

Parameters:

spread : float, optional

spread / 2 will be added to buys and subtracted from sells.

class zipline.finance.slippage.VolumeShareSlippage(volume_limit=0.025, price_impact=0.1)[source]

Model slippage as a function of the volume of shares traded.
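The volume-share rule can be sketched as follows: the fill is capped at volume_limit times the bar's volume, and the price moves against the order in proportion to the square of the consumed volume share. The defaults match the signature above, but treat the exact formula as an assumption for illustration.

```python
def volume_share_slippage(order_amount, bar_volume, price,
                          volume_limit=0.025, price_impact=0.1):
    """Sketch of volume-share slippage: fill volume is capped at
    volume_limit * bar_volume, and the price moves against the order by
    price_impact * (volume_share ** 2) * price. Formula is illustrative."""
    max_volume = int(volume_limit * bar_volume)
    fill = min(abs(order_amount), max_volume)
    volume_share = fill / bar_volume
    impact = price_impact * (volume_share ** 2) * price
    sign = 1 if order_amount > 0 else -1  # buys pay more, sells receive less
    return fill * sign, price + sign * impact

fill, exec_price = volume_share_slippage(1000, 20000, 10.0)
assert fill == 500                         # capped at 2.5% of 20,000 shares
assert round(exec_price, 6) == 10.000625   # 0.1 * 0.025**2 * 10 added
```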

Pipeline

For more information, see Pipeline API

zipline.api.attach_pipeline(self, pipeline, name, chunksize=None)

Register a pipeline to be computed at the start of each day.

Parameters:

pipeline : Pipeline

The pipeline to have computed.

name : str

The name of the pipeline.

chunksize : int, optional

The number of days to compute pipeline results for. Increasing this number will make it take longer to get the first results, but may improve the total runtime of the simulation.

Returns:

pipeline : Pipeline

Returns the pipeline that was attached unchanged.

zipline.api.pipeline_output(self, name)

Get the results of the pipeline that was attached with the name: name.

Parameters:

name : str

Name of the pipeline for which results are requested.

Returns:

results : pd.DataFrame

DataFrame containing the results of the requested pipeline for the current simulation date.

Raises:

NoSuchPipeline

Raised when no pipeline with the name name has been registered.

See also

zipline.api.attach_pipeline(), zipline.pipeline.engine.PipelineEngine.run_pipeline()

Miscellaneous

zipline.api.record(self, *args, **kwargs)

Track and record values each day.

Parameters:

**kwargs

The names and values to record.

Notes

These values will appear in the performance packets and the performance dataframe passed to analyze and returned from run_algorithm().
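Recorded values end up as columns of the daily performance frame. The pandas sketch below shows that bookkeeping; the column names and values are invented, and the real record() keys values by simulation day internally.

```python
import pandas as pd

daily_records = []  # one dict per simulation day, as record() would build

def record(day, **kwargs):
    """Illustrative stand-in for zipline.api.record: stores the values
    recorded under each name for the given day."""
    daily_records.append({"day": day, **kwargs})

record("2016-01-04", short_mavg=10.5, long_mavg=10.2)
record("2016-01-05", short_mavg=10.7, long_mavg=10.3)

# The recorded names become columns of the performance DataFrame.
perf = pd.DataFrame(daily_records).set_index("day")
assert list(perf.columns) == ["short_mavg", "long_mavg"]
assert perf.loc["2016-01-05", "short_mavg"] == 10.7
```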

zipline.api.get_environment(self, field='platform')

Query the execution environment.

Parameters:

field : {‘platform’, ‘arena’, ‘data_frequency’, ‘start’, ‘end’, ‘capital_base’, ‘*’}

The field to query. The options have the following meanings:
arena : str

The arena from the simulation parameters. This will normally be 'backtest' but some systems may use this to distinguish live trading from backtesting.

data_frequency : {‘daily’, ‘minute’}

data_frequency tells the algorithm if it is running with daily data or minute data.

start : datetime

The start date for the simulation.

end : datetime

The end date for the simulation.

capital_base : float

The starting capital for the simulation.

platform : str

The platform that the code is running on. By default this will be the string ‘zipline’. This can allow algorithms to know if they are running on the Quantopian platform instead.

‘*’ : dict[str -> any]

Returns all of the fields in a dictionary.

Returns:

val : any

The value for the field queried. See above for more information.

Raises:

ValueError

Raised when field is not a valid option.

zipline.api.fetch_csv(self, url, pre_func=None, post_func=None, date_column='date', date_format=None, timezone='UTC', symbol=None, mask=True, symbol_column=None, special_params_checker=None, **kwargs)

Fetch a csv from a remote url and register the data so that it is queryable from the data object.

Parameters:

url : str

The url of the csv file to load.

pre_func : callable[pd.DataFrame -> pd.DataFrame], optional

A callback to allow preprocessing the raw data returned from fetch_csv before dates are parsed or symbols are mapped.

post_func : callable[pd.DataFrame -> pd.DataFrame], optional

A callback to allow postprocessing of the data after dates and symbols have been mapped.

date_column : str, optional

The name of the column in the preprocessed dataframe containing datetime information to map the data.

date_format : str, optional

The format of the dates in the date_column. If not provided fetch_csv will attempt to infer the format. For information about the format of this string, see pandas.read_csv().

timezone : tzinfo or str, optional

The timezone for the datetime in the date_column.

symbol : str, optional

If the data is about a new asset or index then this string will be the name used to identify the values in data. For example, one may use fetch_csv to load data for VIX, then this field could be the string 'VIX'.

mask : bool, optional

Drop any rows which cannot be symbol mapped.

symbol_column : str

If the data is attaching some new attribute to each asset then this argument is the name of the column in the preprocessed dataframe containing the symbols. This will be used along with the date information to map the sids in the asset finder.

**kwargs

Forwarded to pandas.read_csv().

Returns:

csv_data_source : zipline.sources.requests_csv.PandasRequestsCSV

A requests source that will pull data from the url specified.
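The order of operations (read, pre_func, date parsing, post_func) can be sketched with pandas. The CSV payload and function name below are invented, and the symbol-mapping step against the asset finder is omitted.

```python
import io
import pandas as pd

# Invented CSV payload standing in for the remote url's contents.
raw = io.StringIO(
    "date,symbol,signal\n"
    "2016-01-04,AAPL,1.5\n"
    "2016-01-05,AAPL,1.7\n"
)

def fetch_csv_sketch(buf, pre_func=None, post_func=None,
                     date_column="date", date_format=None):
    """Illustrative order of operations: read -> pre_func -> parse dates
    -> post_func. Symbol mapping against the asset finder is omitted."""
    df = pd.read_csv(buf)
    if pre_func is not None:
        df = pre_func(df)          # runs before dates are parsed
    df[date_column] = pd.to_datetime(df[date_column], format=date_format)
    if post_func is not None:
        df = post_func(df)         # runs after dates are parsed
    return df

df = fetch_csv_sketch(raw, post_func=lambda d: d.assign(signal=d.signal * 2))
assert df["signal"].tolist() == [3.0, 3.4]
```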

Pipeline API

class zipline.pipeline.Pipeline(columns=None, screen=None)[source]

A Pipeline object represents a collection of named expressions to be compiled and executed by a PipelineEngine.

A Pipeline has two important attributes: ‘columns’, a dictionary of named Term instances, and ‘screen’, a Filter representing criteria for including an asset in the results of a Pipeline.

To compute a pipeline in the context of a TradingAlgorithm, users must call attach_pipeline in their initialize function to register that the pipeline should be computed each trading day. The outputs of a pipeline on a given day can be accessed by calling pipeline_output in handle_data or before_trading_start.

Parameters:

columns : dict, optional

Initial columns.

screen : zipline.pipeline.term.Filter, optional

Initial screen.

add(term, name, overwrite=False)[source]

Add a column.

The results of computing term will show up as a column in the DataFrame produced by running this pipeline.

Parameters:

column : zipline.pipeline.Term

A Filter, Factor, or Classifier to add to the pipeline.

name : str

Name of the column to add.

overwrite : bool

Whether to overwrite the existing entry if we already have a column named name.

remove(name)[source]

Remove a column.

Parameters:

name : str

The name of the column to remove.

Returns:

removed : zipline.pipeline.term.Term

The removed term.

Raises:

KeyError

If name is not in self.columns.

set_screen(screen, overwrite=False)[source]

Set a screen on this Pipeline.

Parameters:

filter : zipline.pipeline.Filter

The filter to apply as a screen.

overwrite : bool

Whether to overwrite any existing screen. If overwrite is False and self.screen is not None, we raise an error.

show_graph(format='svg')[source]

Render this Pipeline as a DAG.

Parameters:

format : {‘svg’, ‘png’, ‘jpeg’}

Image format to render with. Default is ‘svg’.

to_execution_plan(screen_name, default_screen, all_dates, start_date, end_date)[source]

Compile into an ExecutionPlan.

Parameters:

screen_name : str

Name to supply for self.screen.

default_screen : zipline.pipeline.term.Term

Term to use as a screen if self.screen is None.

all_dates : pd.DatetimeIndex

A calendar of dates to use to calculate starts and ends for each term.

start_date : pd.Timestamp

The first date of requested output.

end_date : pd.Timestamp

The last date of requested output.

to_simple_graph(screen_name, default_screen)[source]

Compile into a simple TermGraph with no extra row metadata.

Parameters:

screen_name : str

Name to supply for self.screen.

default_screen : zipline.pipeline.term.Term

Term to use as a screen if self.screen is None.

columns

The columns registered with this pipeline.

screen

The screen applied to the rows of this pipeline.

class zipline.pipeline.CustomFactor(*args, **kwargs)[source]

Base class for user-defined Factors.

Parameters:

inputs : iterable, optional

An iterable of BoundColumn instances (e.g. USEquityPricing.close), describing the data to load and pass to self.compute. If this argument is not passed to the CustomFactor constructor, we look for a class-level attribute named inputs.

outputs : iterable[str], optional

An iterable of strings which represent the names of each output this factor should compute and return. If this argument is not passed to the CustomFactor constructor, we look for a class-level attribute named outputs.

window_length : int, optional

Number of rows to pass for each input. If this argument is not passed to the CustomFactor constructor, we look for a class-level attribute named window_length.

mask : zipline.pipeline.Filter, optional

A Filter describing the assets on which we should compute each day. Each call to CustomFactor.compute will only receive assets for which mask produced True on the day for which compute is being called.

Notes

Users implementing their own Factors should subclass CustomFactor and implement a method named compute with the following signature:

def compute(self, today, assets, out, *inputs):
   ...

On each simulation date, compute will be called with the current date, an array of sids, an output array, and an input array for each expression passed as inputs to the CustomFactor constructor.

The specific types of the values passed to compute are as follows:

today : np.datetime64[ns]
    Row label for the last row of all arrays passed as `inputs`.
assets : np.array[int64, ndim=1]
    Column labels for `out` and `inputs`.
out : np.array[self.dtype, ndim=1]
    Output array of the same shape as `assets`.  `compute` should write
    its desired return values into `out`. If multiple outputs are
    specified, `compute` should write its desired return values into
    `out.<output_name>` for each output name in `self.outputs`.
*inputs : tuple of np.array
    Raw data arrays corresponding to the values of `self.inputs`.

compute functions should expect to be passed NaN values for dates on which no data was available for an asset. This may include dates on which an asset did not yet exist.

For example, if a CustomFactor requires 10 rows of close price data, and asset A started trading on Monday June 2nd, 2014, then on Tuesday, June 3rd, 2014, the column of input data for asset A will have 9 leading NaNs for the preceding days on which data was not yet available.

Examples

A CustomFactor with pre-declared defaults:

class TenDayRange(CustomFactor):
    """
    Computes the difference between the highest high in the last 10
    days and the lowest low.

    Pre-declares high and low as default inputs and `window_length` as
    10.
    """

    inputs = [USEquityPricing.high, USEquityPricing.low]
    window_length = 10

    def compute(self, today, assets, out, highs, lows):
        from numpy import nanmin, nanmax

        highest_highs = nanmax(highs, axis=0)
        lowest_lows = nanmin(lows, axis=0)
        out[:] = highest_highs - lowest_lows

# Doesn't require passing inputs or window_length because they're
# pre-declared as defaults for the TenDayRange class.
ten_day_range = TenDayRange()

A CustomFactor without defaults:

class MedianValue(CustomFactor):
    """
    Computes the median value of an arbitrary single input over an
    arbitrary window.

    Does not declare any defaults, so values for `window_length` and
    `inputs` must be passed explicitly on every construction.
    """

    def compute(self, today, assets, out, data):
        from numpy import nanmedian
        out[:] = nanmedian(data, axis=0)

# Values for `inputs` and `window_length` must be passed explicitly to
# MedianValue.
median_close10 = MedianValue([USEquityPricing.close], window_length=10)
median_low15 = MedianValue([USEquityPricing.low], window_length=15)

A CustomFactor with multiple outputs:

class MultipleOutputs(CustomFactor):
    inputs = [USEquityPricing.close]
    outputs = ['alpha', 'beta']
    window_length = N

    def compute(self, today, assets, out, close):
        computed_alpha, computed_beta = some_function(close)
        out.alpha[:] = computed_alpha
        out.beta[:] = computed_beta

# Each output is returned as its own Factor upon instantiation.
alpha, beta = MultipleOutputs()

# Equivalently, we can create a single factor instance and access each
# output as an attribute of that instance.
multiple_outputs = MultipleOutputs()
alpha = multiple_outputs.alpha
beta = multiple_outputs.beta

Note: If a CustomFactor has multiple outputs, all outputs must have the same dtype. For instance, in the example above, if alpha is a float then beta must also be a float.

class zipline.pipeline.factors.Factor(*args, **kwargs)[source]

Pipeline API expression producing a numerical or date-valued output.

Factors are the most commonly-used Pipeline term, representing the result of any computation producing a numerical result.

Factors can be combined, both with other Factors and with scalar values, via any of the builtin mathematical operators (+, -, *, etc). This makes it easy to write complex expressions that combine multiple Factors. For example, constructing a Factor that computes the average of two other Factors is simply:

>>> f1 = SomeFactor(...)  
>>> f2 = SomeOtherFactor(...)  
>>> average = (f1 + f2) / 2.0  

Factors can also be converted into zipline.pipeline.Filter objects via comparison operators: (<, <=, !=, eq, >, >=).

There are many natural operators defined on Factors besides the basic numerical operators. These include methods identifying missing or extreme-valued outputs (isnull, notnull, isnan, notnan), methods for normalizing outputs (rank, demean, zscore), and methods for constructing Filters based on rank-order properties of results (top, bottom, percentile_between).

eq(other)

Binary Operator: ‘==’

rank(method='ordinal', ascending=True, mask=sentinel('NotSpecified'), groupby=sentinel('NotSpecified'))[source]

Construct a new Factor representing the sorted rank of each column within each row.

Parameters:

method : str, {‘ordinal’, ‘min’, ‘max’, ‘dense’, ‘average’}

The method used to assign ranks to tied elements. See scipy.stats.rankdata for a full description of the semantics for each ranking method. Default is ‘ordinal’.

ascending : bool, optional

Whether to return sorted rank in ascending or descending order. Default is True.

mask : zipline.pipeline.Filter, optional

A Filter representing assets to consider when computing ranks. If mask is supplied, ranks are computed ignoring any asset/date pairs for which mask produces a value of False.

groupby : zipline.pipeline.Classifier, optional

A classifier defining partitions over which to perform ranking.

Returns:

ranks : zipline.pipeline.factors.Rank

A new factor that will compute the ranking of the data produced by self.

See also

scipy.stats.rankdata(), zipline.pipeline.factors.factor.Rank

Notes

The default value for method is different from the default for scipy.stats.rankdata. See that function’s documentation for a full description of the valid inputs to method.

Missing or non-existent data on a given day will cause an asset to be given a rank of NaN for that day.
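The 'ordinal' default means that tied values still receive distinct ranks, assigned by position. A numpy sketch of ordinal ranking across one day's factor values (the data is illustrative; a stable sort is used so ties break deterministically by position):

```python
import numpy as np

values = np.array([3.0, 1.0, 2.0, 2.0])  # one day's factor values across assets

# Ordinal ranks: the argsort of a (stable) argsort gives each value its
# 1-based position in ascending sorted order, breaking ties by position.
ordinal_ranks = values.argsort(kind="stable").argsort() + 1
```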

top(N, mask=sentinel('NotSpecified'), groupby=sentinel('NotSpecified'))[source]

Construct a Filter matching the top N asset values of self each day.

If groupby is supplied, returns a Filter matching the top N asset values for each group.

Parameters:

N : int

Number of assets passing the returned filter each day.

mask : zipline.pipeline.Filter, optional

A Filter representing assets to consider when computing ranks. If mask is supplied, top values are computed ignoring any asset/date pairs for which mask produces a value of False.

groupby : zipline.pipeline.Classifier, optional

A classifier defining partitions over which to perform ranking.

Returns:

filter : zipline.pipeline.filters.Filter

bottom(N, mask=sentinel('NotSpecified'), groupby=sentinel('NotSpecified'))[source]

Construct a Filter matching the bottom N asset values of self each day.

If groupby is supplied, returns a Filter matching the bottom N asset values for each group.

Parameters:

N : int

Number of assets passing the returned filter each day.

mask : zipline.pipeline.Filter, optional

A Filter representing assets to consider when computing ranks. If mask is supplied, bottom values are computed ignoring any asset/date pairs for which mask produces a value of False.

groupby : zipline.pipeline.Classifier, optional

A classifier defining partitions over which to perform ranking.

Returns:

filter : zipline.pipeline.Filter

percentile_between(min_percentile, max_percentile, mask=sentinel('NotSpecified'))[source]

Construct a new Filter representing entries from the output of this Factor that fall within the percentile range defined by min_percentile and max_percentile.

Parameters:

min_percentile : float [0.0, 100.0]

Return True for assets falling above this percentile in the data.

max_percentile : float [0.0, 100.0]

Return True for assets falling below this percentile in the data.

mask : zipline.pipeline.Filter, optional

A Filter representing assets to consider when calculating percentile thresholds. If mask is supplied, percentile cutoffs are computed each day using only assets for which mask returns True. Assets for which mask produces False will produce False in the output of this filter as well.

Returns:

out : zipline.pipeline.filters.PercentileFilter

A new filter that will compute the specified percentile-range mask.

See also

zipline.pipeline.filters.filter.PercentileFilter
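A numpy sketch of the underlying cutoff logic, assuming the percentile cutoffs are computed per day over the (masked) values with np.nanpercentile (the data is illustrative):

```python
import numpy as np

values = np.array([10.0, 20.0, 30.0, 40.0, 50.0])  # one day's factor values
min_percentile, max_percentile = 25.0, 75.0

# Compute the cutoff values for the requested percentile band.
lower = np.nanpercentile(values, min_percentile)
upper = np.nanpercentile(values, max_percentile)

# True for assets whose value falls inside the band, inclusive.
in_band = (values >= lower) & (values <= upper)
```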

isnan()[source]

A Filter producing True for all values where this Factor is NaN.

Returns:

nanfilter : zipline.pipeline.filters.Filter

notnan()[source]

A Filter producing True for values where this Factor is not NaN.

Returns:

nanfilter : zipline.pipeline.filters.Filter

isfinite()[source]

A Filter producing True for values where this Factor is anything but NaN, inf, or -inf.

__add__(other)

Binary Operator: ‘+’

__div__(other)

Binary Operator: ‘/’

__ge__(other)

Binary Operator: ‘>=’

__gt__(other)

Binary Operator: ‘>’

__le__(other)

Binary Operator: ‘<=’

__lt__(other)

Binary Operator: ‘<’

__mod__(other)

Binary Operator: ‘%’

__mul__(other)

Binary Operator: ‘*’

__ne__(other)

Binary Operator: ‘!=’

__pow__(other)

Binary Operator: ‘**’

__sub__(other)

Binary Operator: ‘-’

class zipline.pipeline.factors.Latest(*args, **kwargs)[source]

Factor producing the most recently-known value of inputs[0] on each day.

The .latest attribute of DataSet columns returns an instance of this Factor.

class zipline.pipeline.factors.MaxDrawdown(*args, **kwargs)[source]

Max Drawdown

Default Inputs: None

Default Window Length: None

class zipline.pipeline.factors.Returns(*args, **kwargs)[source]

Calculates the percent change in close price over the given window_length.

Default Inputs: [USEquityPricing.close]
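A numpy sketch of the computation, assuming the percent change is measured from the first close in the window to the last (the data is illustrative):

```python
import numpy as np

# A window_length=3 window of closes for 2 assets (oldest row first).
closes = np.array([[100.0, 50.0],
                   [101.0, 49.0],
                   [103.0, 51.0]])

# Percent change from the oldest close in the window to the newest.
returns = (closes[-1] - closes[0]) / closes[0]
```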

class zipline.pipeline.factors.RSI(*args, **kwargs)[source]

Relative Strength Index

Default Inputs: [USEquityPricing.close]

Default Window Length: 15

class zipline.pipeline.factors.SimpleMovingAverage(*args, **kwargs)[source]

Average Value of an arbitrary column

Default Inputs: None

Default Window Length: None

class zipline.pipeline.factors.VWAP(*args, **kwargs)[source]

Volume Weighted Average Price

Default Inputs: [USEquityPricing.close, USEquityPricing.volume]

Default Window Length: None
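VWAP weights each close by that bar's volume; a numpy sketch of the computation over one window (the data is illustrative):

```python
import numpy as np

closes = np.array([10.0, 11.0, 12.0])
volumes = np.array([100.0, 200.0, 100.0])

# Volume-weighted average price over the window: total traded value
# divided by total volume.
vwap = (closes * volumes).sum() / volumes.sum()
```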

class zipline.pipeline.factors.WeightedAverageValue(*args, **kwargs)[source]

Helper for VWAP-like computations.

Default Inputs: None

Default Window Length: None

class zipline.pipeline.factors.ExponentialWeightedMovingAverage(*args, **kwargs)[source]

Exponentially Weighted Moving Average

Default Inputs: None

Default Window Length: None

Parameters:

inputs : length-1 list/tuple of BoundColumn

The expression over which to compute the average.

window_length : int > 0

Length of the lookback window over which to compute the average.

decay_rate : float, 0 < decay_rate <= 1

Weighting factor by which to discount past observations.

When calculating historical averages, rows are multiplied by the sequence:

decay_rate, decay_rate ** 2, decay_rate ** 3, ...

See also

pandas.ewma()

Notes

  • This class can also be imported under the name EWMA.

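A numpy sketch of the weighting described above, under the assumption that the most recent row receives weight decay_rate and each older row one higher power, with the weights normalized to sum to 1 (the data is illustrative):

```python
import numpy as np

window_length, decay_rate = 3, 0.5
closes = np.array([10.0, 12.0, 14.0])  # oldest row first

# Oldest row gets decay_rate ** window_length; the newest gets decay_rate ** 1.
weights = decay_rate ** np.arange(window_length, 0, -1)
weights /= weights.sum()  # normalize so the weights sum to 1

ewma = (weights * closes).sum()
```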
class zipline.pipeline.factors.ExponentialWeightedMovingStdDev(*args, **kwargs)[source]

Exponentially Weighted Moving Standard Deviation

Default Inputs: None

Default Window Length: None

Parameters:

inputs : length-1 list/tuple of BoundColumn

The expression over which to compute the average.

window_length : int > 0

Length of the lookback window over which to compute the average.

decay_rate : float, 0 < decay_rate <= 1

Weighting factor by which to discount past observations.

When calculating historical averages, rows are multiplied by the sequence:

decay_rate, decay_rate ** 2, decay_rate ** 3, ...

See also

pandas.ewmstd()

Notes

  • This class can also be imported under the name EWMSTD.

class zipline.pipeline.factors.AverageDollarVolume(*args, **kwargs)[source]

Average Daily Dollar Volume

Default Inputs: [USEquityPricing.close, USEquityPricing.volume]

Default Window Length: None

class zipline.pipeline.factors.BollingerBands(*args, **kwargs)[source]

Bollinger Bands technical indicator. https://en.wikipedia.org/wiki/Bollinger_Bands

Default Inputs: zipline.pipeline.data.USEquityPricing.close

Parameters:

inputs : length-1 iterable[BoundColumn]

The expression over which to compute bollinger bands.

window_length : int > 0

Length of the lookback window over which to compute the bollinger bands.

k : float

The number of standard deviations to add or subtract to create the upper and lower bands.
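A numpy sketch of the band arithmetic over one window: the middle band is the mean of the input, and the upper and lower bands sit k standard deviations away. Whether a population or sample standard deviation is used is not stated here; this sketch uses numpy's default (ddof=0), and the data is illustrative:

```python
import numpy as np

closes = np.array([10.0, 12.0, 11.0, 13.0, 14.0])
k = 2.0

middle = closes.mean()          # the middle band
band_width = k * closes.std()   # k standard deviations (ddof=0)

lower, upper = middle - band_width, middle + band_width
```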

class zipline.pipeline.factors.RollingPearsonOfReturns(*args, **kwargs)[source]

Calculates the Pearson product-moment correlation coefficient of the returns of the given asset with the returns of all other assets.

Pearson correlation is what most people mean when they say “correlation coefficient” or “R-value”.

Parameters:

target : zipline.assets.Asset

The asset to correlate with all other assets.

returns_length : int >= 2

Length of the lookback window over which to compute returns. Daily returns require a window length of 2.

correlation_length : int >= 1

Length of the lookback window over which to compute each correlation coefficient.

mask : zipline.pipeline.Filter, optional

A Filter describing which assets should have their correlation with the target asset computed each day.

class zipline.pipeline.factors.RollingSpearmanOfReturns(*args, **kwargs)[source]

Calculates the Spearman rank correlation coefficient of the returns of the given asset with the returns of all other assets.

Parameters:

target : zipline.assets.Asset

The asset to correlate with all other assets.

returns_length : int >= 2

Length of the lookback window over which to compute returns. Daily returns require a window length of 2.

correlation_length : int >= 1

Length of the lookback window over which to compute each correlation coefficient.

mask : zipline.pipeline.Filter, optional

A Filter describing which assets should have their correlation with the target asset computed each day.

class zipline.pipeline.factors.RollingLinearRegressionOfReturns(*args, **kwargs)[source]

Perform an ordinary least-squares regression predicting the returns of all other assets on the given asset.

Parameters:

target : zipline.assets.Asset

The asset to regress against all other assets.

returns_length : int >= 2

Length of the lookback window over which to compute returns. Daily returns require a window length of 2.

regression_length : int >= 1

Length of the lookback window over which to compute each regression.

mask : zipline.pipeline.Filter, optional

A Filter describing which assets should be regressed against the target asset each day.

Notes

Computing this factor over many assets can be time consuming. It is recommended that a mask be used in order to limit the number of assets over which regressions are computed.

This factor is designed to return five outputs:

  • alpha, a factor that computes the intercepts of each regression.
  • beta, a factor that computes the slopes of each regression.
  • r_value, a factor that computes the correlation coefficient of each regression.
  • p_value, a factor that computes, for each regression, the two-sided p-value for a hypothesis test whose null hypothesis is that the slope is zero.
  • stderr, a factor that computes the standard error of the estimate of each regression.

For more help on factors with multiple outputs, see zipline.pipeline.factors.CustomFactor.
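The five outputs correspond to the quantities produced by an ordinary least-squares fit such as scipy.stats.linregress; a sketch of that underlying regression with illustrative return series:

```python
from scipy.stats import linregress

target_returns = [0.01, 0.02, 0.03, 0.04]     # regressor (the target asset)
asset_returns = [0.015, 0.025, 0.035, 0.045]  # regressand (some other asset)

result = linregress(target_returns, asset_returns)
# result.intercept -> alpha, result.slope -> beta,
# result.rvalue -> r_value, result.pvalue -> p_value,
# result.stderr -> stderr
```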

class zipline.pipeline.filters.Filter(*args, **kwargs)[source]

Pipeline expression computing a boolean output.

Filters are most commonly useful for describing sets of assets to include or exclude for some particular purpose. Many Pipeline API functions accept a mask argument, which can be supplied a Filter indicating that only values passing the Filter should be considered when performing the requested computation. For example, zipline.pipeline.Factor.top() accepts a mask indicating that ranks should be computed only on assets that passed the specified Filter.

The most common way to construct a Filter is via one of the comparison operators (<, <=, !=, eq, >, >=) of Factor. For example, a natural way to construct a Filter for stocks with a 10-day VWAP less than $20.0 is to first construct a Factor computing 10-day VWAP and compare it to the scalar value 20.0:

>>> from zipline.pipeline.factors import VWAP
>>> vwap_10 = VWAP(window_length=10)
>>> vwaps_under_20 = (vwap_10 <= 20)

Filters can also be constructed via comparisons between two Factors. For example, to construct a Filter producing True for asset/date pairs where the asset’s 10-day VWAP was greater than its 30-day VWAP:

>>> short_vwap = VWAP(window_length=10)
>>> long_vwap = VWAP(window_length=30)
>>> higher_short_vwap = (short_vwap > long_vwap)

Filters can be combined via the & (and) and | (or) operators.

&-ing together two filters produces a new Filter that produces True if both of the inputs produced True.

|-ing together two filters produces a new Filter that produces True if either of its inputs produced True.

The ~ operator can be used to invert a Filter, swapping all True values with Falses and vice-versa.

Filters may be set as the screen attribute of a Pipeline, indicating asset/date pairs for which the filter produces False should be excluded from the Pipeline’s output. This is useful both for reducing noise in the output of a Pipeline and for reducing memory consumption of Pipeline results.
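Filters behave like per-asset boolean masks; a plain-pandas sketch of the &, |, and ~ combinations and of screening rows to those for which the filter produced True (the data is illustrative):

```python
import pandas as pd

vwap_10 = pd.Series([18.0, 25.0, 15.0], index=["A", "B", "C"])
volume = pd.Series([1e6, 5e4, 2e6], index=["A", "B", "C"])

cheap = vwap_10 <= 20       # a comparison produces a boolean "filter"
liquid = volume > 1e5

tradeable = cheap & liquid  # True only where both inputs are True
either = cheap | liquid     # True where at least one input is True
expensive = ~cheap          # inversion swaps True and False

# Screening: keep only the rows for which the filter produced True.
screened = vwap_10[tradeable]
```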

__and__(other)

Binary Operator: ‘&’

__or__(other)

Binary Operator: ‘|’

class zipline.pipeline.data.USEquityPricing[source]

Dataset representing daily trading prices and volumes.

close = USEquityPricing.close::float64
high = USEquityPricing.high::float64
low = USEquityPricing.low::float64
open = USEquityPricing.open::float64
volume = USEquityPricing.volume::float64

Asset Metadata

class zipline.assets.Asset(int sid, exchange, symbol='', asset_name='', start_date=None, end_date=None, first_traded=None, auto_close_date=None, exchange_full=None)
first_traded

first_traded: object

from_dict(type cls, dict_)

Build an Asset instance from a dict.

is_alive_for_session(self, session_label)

Returns whether the asset is alive at the given dt.

Parameters:

session_label: pd.Timestamp

The desired session label to check. (midnight UTC)

Returns:

boolean: whether the asset is alive at the given dt.

is_exchange_open(self, dt_minute)

Parameters:

dt_minute: pd.Timestamp (UTC, tz-aware)

The minute to check.

Returns:

boolean: whether the asset’s exchange is open at the given minute.

to_dict(self)

Convert to a python dict.

class zipline.assets.Equity
security_end_date

DEPRECATION: This property should be deprecated and is only present for backwards compatibility

security_name

DEPRECATION: This property should be deprecated and is only present for backwards compatibility

security_start_date

DEPRECATION: This property should be deprecated and is only present for backwards compatibility

class zipline.assets.Future(int sid, exchange, symbol='', root_symbol='', asset_name='', start_date=None, end_date=None, notice_date=None, expiration_date=None, auto_close_date=None, first_traded=None, tick_size='', float multiplier=1.0, exchange_full=None)
to_dict(self)

Convert to a python dict.

class zipline.assets.AssetConvertible[source]

ABC for types that are convertible to integer-representations of Assets.

Includes Asset, six.string_types, and Integral

Trading Calendar API

zipline.utils.calendars.get_calendar(self, name)

Retrieves an instance of a TradingCalendar whose name is given.

Parameters:

name : str

The name of the TradingCalendar to be retrieved.

Returns:

TradingCalendar

The desired calendar.

class zipline.utils.calendars.TradingCalendar(start=Timestamp('1990-01-01 00:00:00+0000', tz='UTC'), end=Timestamp('2017-09-08 17:47:02.261654+0000', tz='UTC'))[source]

A TradingCalendar represents the timing information of a single market exchange.

The timing information is made up of two parts: sessions, and opens/closes.

A session represents a contiguous set of minutes, and has a label that is midnight UTC. It is important to note that a session label should not be considered a specific point in time, and that midnight UTC is just being used for convenience.

For each session, we store the open and close time in UTC time.

is_open_on_minute(dt)[source]

Given a dt, return whether this exchange is open at the given dt.

Parameters:

dt: pd.Timestamp

The dt for which to check if this exchange is open.

Returns:

bool

Whether the exchange is open on this dt.

is_session(dt)[source]

Given a dt, returns whether it’s a valid session label.

Parameters:

dt: pd.Timestamp

The dt that is being tested.

Returns:

bool

Whether the given dt is a valid session label.

minute_index_to_session_labels(index)[source]

Given a sorted DatetimeIndex of market minutes, return a DatetimeIndex of the corresponding session labels.

Parameters:

index: pd.DatetimeIndex or pd.Series

The ordered list of market minutes we want session labels for.

Returns:

pd.DatetimeIndex (UTC)

The list of session labels corresponding to the given minutes.

minute_to_session_label(dt, direction='next')[source]

Given a minute, get the label of its containing session.

Parameters:

dt : pd.Timestamp or nanosecond offset

The dt for which to get the containing session.

direction: str

“next” (default) means that if the given dt is not part of a session, return the label of the next session.

“previous” means that if the given dt is not part of a session, return the label of the previous session.

“none” means that a KeyError will be raised if the given dt is not part of a session.

Returns:

pd.Timestamp (midnight UTC)

The label of the containing session.

minutes_for_session(session_label)[source]

Given a session label, return the minutes for that session.

Parameters:

session_label: pd.Timestamp (midnight UTC)

A session label whose session’s minutes are desired.

Returns:

pd.DateTimeIndex

All the minutes for the given session.

minutes_for_sessions_in_range(start_session_label, end_session_label)[source]

Returns all the minutes for all the sessions from the given start session label to the given end session label, inclusive.

Parameters:

start_session_label: pd.Timestamp

The label of the first session in the range.

end_session_label: pd.Timestamp

The label of the last session in the range.

Returns:

pd.DatetimeIndex

The minutes in the desired range.

minutes_in_range(start_minute, end_minute)[source]

Given start and end minutes, return all the calendar minutes in that range, inclusive.

The given minutes need not be calendar minutes.

Parameters:

start_minute: pd.Timestamp

The minute representing the start of the desired range.

end_minute: pd.Timestamp

The minute representing the end of the desired range.

Returns:

pd.DatetimeIndex

The minutes in the desired range.

next_close(dt)[source]

Given a dt, returns the next close.

Parameters:

dt: pd.Timestamp

The dt for which to get the next close.

Returns:

pd.Timestamp

The UTC timestamp of the next close.

next_minute(dt)[source]

Given a dt, return the next exchange minute. If the given dt is not an exchange minute, returns the next exchange open.

Parameters:

dt: pd.Timestamp

The dt for which to get the next exchange minute.

Returns:

pd.Timestamp

The next exchange minute.

next_open(dt)[source]

Given a dt, returns the next open.

If the given dt happens to be a session open, the next session’s open will be returned.

Parameters:

dt: pd.Timestamp

The dt for which to get the next open.

Returns:

pd.Timestamp

The UTC timestamp of the next open.

next_session_label(session_label)[source]

Given a session label, returns the label of the next session.

Parameters:

session_label: pd.Timestamp

A session whose next session is desired.

Returns:

pd.Timestamp

The next session label (midnight UTC).

Notes

Raises ValueError if the given session is the last session in this calendar.

open_and_close_for_session(session_label)[source]

Returns a tuple of timestamps of the open and close of the session represented by the given label.

Parameters:

session_label: pd.Timestamp

The session whose open and close are desired.

Returns:

(Timestamp, Timestamp)

The open and close for the given session.

previous_close(dt)[source]

Given a dt, returns the previous close.

Parameters:

dt: pd.Timestamp

The dt for which to get the previous close.

Returns:

pd.Timestamp

The UTC timestamp of the previous close.

previous_minute(dt)[source]

Given a dt, return the previous exchange minute.

Raises KeyError if the given timestamp is not an exchange minute.

Parameters:

dt: pd.Timestamp

The dt for which to get the previous exchange minute.

Returns:

pd.Timestamp

The previous exchange minute.

previous_open(dt)[source]

Given a dt, returns the previous open.

Parameters:

dt: pd.Timestamp

The dt for which to get the previous open.

Returns:

pd.Timestamp

The UTC timestamp of the previous open.

previous_session_label(session_label)[source]

Given a session label, returns the label of the previous session.

Parameters:

session_label: pd.Timestamp

A session whose previous session is desired.

Returns:

pd.Timestamp

The previous session label (midnight UTC).

Notes

Raises ValueError if the given session is the first session in this calendar.

regular_holidays

Returns:

pd.AbstractHolidayCalendar : A calendar containing the regular holidays for this calendar.

session_distance(start_session_label, end_session_label)[source]

Given a start and end session label, returns the distance between them. For example, for three consecutive sessions Mon., Tues., and Wed, session_distance(Mon, Wed) would return 2.

Parameters:

start_session_label: pd.Timestamp

The label of the start session.

end_session_label: pd.Timestamp

The label of the ending session.

Returns:

int

The distance between the two sessions.
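The Monday-to-Wednesday example above can be sketched with pandas business days (a hypothetical calendar: real trading calendars also skip holidays, which pd.bdate_range does not):

```python
import pandas as pd

# Three consecutive business-day "sessions": Mon, Tue, Wed.
sessions = pd.bdate_range("2017-09-04", "2017-09-06")

# The distance between the first and last session, as described above:
# the number of steps between them, not the inclusive session count.
distance = len(sessions) - 1
```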

sessions_in_range(start_session_label, end_session_label)[source]

Given start and end session labels, return all the sessions in that range, inclusive.

Parameters:

start_session_label: pd.Timestamp (midnight UTC)

The label representing the first session of the desired range.

end_session_label: pd.Timestamp (midnight UTC)

The label representing the last session of the desired range.

Returns:

pd.DatetimeIndex

The desired sessions.

sessions_window(session_label, count)[source]

Given a session label and a window size, returns a list of sessions of size count + 1, that either starts with the given session (if count is positive) or ends with the given session (if count is negative).

Parameters:

session_label: pd.Timestamp

The label of the initial session.

count: int

Defines the length and the direction of the window.

Returns:

pd.DatetimeIndex

The desired sessions.

special_closes

A list of special close times and corresponding HolidayCalendars.

Returns:

list : List of (time, AbstractHolidayCalendar) tuples.

special_closes_adhoc

Returns:

list : List of (time, DatetimeIndex) tuples that represent special closes that cannot be codified into rules.

special_opens

A list of special open times and corresponding HolidayCalendars.

Returns:

list : List of (time, AbstractHolidayCalendar) tuples.

special_opens_adhoc

Returns:

list : List of (time, DatetimeIndex) tuples that represent special opens that cannot be codified into rules.

zipline.utils.calendars.register_calendar(self, name, calendar, force=False)

Registers a calendar for retrieval by the get_calendar method.

Parameters:

name: str

The key with which to register this calendar.

calendar: TradingCalendar

The calendar to be registered for retrieval.

force : bool, optional

If True, old calendars will be overwritten on a name collision. If False, name collisions will raise an exception. Default: False.

Raises:

CalendarNameCollision

If a calendar is already registered with the given calendar’s name.

zipline.utils.calendars.register_calendar_type(self, name, calendar_type, force=False)

Registers a calendar by type.

Parameters:

name: str

The key with which to register this calendar.

calendar_type: type

The type of the calendar to register.

force : bool, optional

If True, old calendars will be overwritten on a name collision. If False, name collisions will raise an exception. Default: False.

Raises:

CalendarNameCollision

If a calendar is already registered with the given calendar’s name.

zipline.utils.calendars.deregister_calendar(self, name)

If a calendar is registered with the given name, it is de-registered.

Parameters:

cal_name : str

The name of the calendar to be deregistered.

zipline.utils.calendars.clear_calendars(self)

Deregisters all currently registered calendars.

Data API

Writers

class zipline.data.minute_bars.BcolzMinuteBarWriter(rootdir, calendar, start_session, end_session, minutes_per_day, default_ohlc_ratio=1000, ohlc_ratios_per_sid=None, expectedlen=1474200)[source]

Class capable of writing minute OHLCV data to disk into bcolz format.

Parameters:

rootdir : string

Path to the root directory into which to write the metadata and bcolz subdirectories.

calendar : zipline.utils.calendars.trading_calendar.TradingCalendar

The trading calendar on which to base the minute bars. Used to get the market opens used as a starting point for each periodic span of minutes in the index, and the market closes that correspond with the market opens.

minutes_per_day : int

The number of minutes in each period. Defaults to 390, the mode of minutes in NYSE trading days.

start_session : datetime

The first trading session in the data set.

end_session : datetime

The last trading session in the data set.

default_ohlc_ratio : int, optional

The default ratio by which to multiply the pricing data to convert from floats to integers that fit within np.uint32. If ohlc_ratios_per_sid is None or does not contain a mapping for a given sid, this ratio is used. Default is OHLC_RATIO (1000).

ohlc_ratios_per_sid : dict, optional

A dict mapping each sid in the output to the ratio by which to multiply that sid’s pricing data to convert it from floats to integers that fit within np.uint32.

expectedlen : int, optional

The expected length of the dataset, used when creating the initial bcolz ctable.

If expectedlen is not provided, the chunksize and corresponding compression ratios may be suboptimal.

Defaults to supporting 15 years of NYSE equity market data. See: http://bcolz.blosc.org/opt-tips.html#informing-about-the-length-of-your-carrays

Notes

Writes a bcolz directory for each individual sid, all contained within a root directory which also contains metadata about the entire dataset.

Each individual asset’s data is stored as a bcolz table with a column for each pricing field: (open, high, low, close, volume)

The open, high, low, and close columns are integers which are 1000 times the quoted price, so that the data can be represented and stored as an np.uint32, supporting market prices quoted up to the thousands place.

volume is stored as an np.uint32, unscaled.

The ‘index’ for each individual asset is a repeating period of minutes of length minutes_per_day starting from each market open. The file format does not account for half-days. e.g.: 2016-01-19 14:31, 2016-01-19 14:32, ..., 2016-01-19 20:59, 2016-01-19 21:00, 2016-01-20 14:31, 2016-01-20 14:32, ..., 2016-01-20 20:59, 2016-01-20 21:00

All assets are written with a common ‘index’, sharing a common first trading day. Assets that do not begin trading until after the first trading day will have zeros for all pricing data up until the first minute in which they trade.

‘index’ is in quotations, because bcolz does not provide an index. The format allows index-like behavior by writing each minute’s data into the corresponding position of the enumeration of the aforementioned datetime index.

The datetimes which correspond to each position are written in the metadata as integer nanoseconds since the epoch into the minute_index key.
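The repeating per-day minute ‘index’ described above can be built by hand with pandas; the dates below match the example in the Notes, and the 390-minute day is an assumption taken from the NYSE default:

```python
import pandas as pd

# Illustration of the repeating per-day minute 'index' described above:
# minutes_per_day (here 390) consecutive minutes starting at each market
# open, with no adjustment for half-days (times in UTC, as in the Notes).
opens = pd.to_datetime(['2016-01-19 14:31', '2016-01-20 14:31'], utc=True)
index = pd.DatetimeIndex(
    [minute
     for market_open in opens
     for minute in pd.date_range(market_open, periods=390, freq='min')]
)
```

Each day contributes exactly 390 positions, so a minute's slot in the file is its day offset times 390 plus its offset from that day's open.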

last_date_in_output_for_sid(sid)[source]
pad(sid, date)[source]

Fill sid container with empty data through the specified date.

If the last recorded trade is not at the close, then that day will be padded with zeros until its close. Any day after that (up to and including the specified date) will be padded with minutes_per_day worth of zeros.

set_sid_attrs(sid, **kwargs)[source]

Write all the supplied kwargs as attributes of the sid’s file.

sidpath(sid)[source]
write(data, show_progress=False)[source]

Write a stream of minute data.

Parameters:

data : iterable[(int, pd.DataFrame)]

The data to write. Each element should be a tuple of sid, data where data has the following format:

columns : (‘open’, ‘high’, ‘low’, ‘close’, ‘volume’)

open : float64
high : float64
low : float64
close : float64
volume : float64|int64

index : DatetimeIndex of market minutes.

A given sid may appear more than once in data; however, the dates must be strictly increasing.

show_progress : bool, optional

Whether or not to show a progress bar while writing.
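A minimal sketch of one element of the data stream expected by write(). The frame construction is runnable; the writer call is commented out because it needs a real rootdir and trading calendar, and the prices shown are made up:

```python
import pandas as pd

# One element of `data`: a (sid, frame) tuple whose frame has the five
# OHLCV columns and a DatetimeIndex of market minutes.
minutes = pd.date_range('2016-01-19 14:31', periods=3, freq='min', tz='UTC')
frame = pd.DataFrame({
    'open':   [10.00, 10.10, 10.20],
    'high':   [10.10, 10.20, 10.30],
    'low':    [ 9.90, 10.00, 10.10],
    'close':  [10.05, 10.15, 10.25],
    'volume': [1000, 1500, 900],
}, index=minutes)

data = [(1, frame)]  # sid 1; a sid may repeat if its dates strictly increase
# writer = BcolzMinuteBarWriter(rootdir, calendar, start_session,
#                               end_session, minutes_per_day=390)
# writer.write(data, show_progress=True)
```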

write_cols(sid, dts, cols)[source]

Write the OHLCV data for the given sid. If there is no bcolz ctable yet created for the sid, create it. If the bcolz ctable does not extend to the day before the first day provided, fill the ctable with 0s up to that date.

write_sid(sid, df)[source]

Write the OHLCV data for the given sid. If there is no bcolz ctable yet created for the sid, create it. If the bcolz ctable does not extend to the day before the first day provided, fill the ctable with 0s up to that date.

class zipline.data.us_equity_pricing.BcolzDailyBarWriter(filename, calendar, start_session, end_session)[source]

Class capable of writing daily OHLCV data to disk in a format that can be read efficiently by BcolzDailyBarReader.

Parameters:

filename : str

The location at which we should write our output.

calendar : zipline.utils.calendar.trading_calendar

Calendar to use to compute asset calendar offsets.

start_session: pd.Timestamp

Midnight UTC session label.

end_session: pd.Timestamp

Midnight UTC session label.

write(data, assets=None, show_progress=False, invalid_data_behavior='warn')[source]
Parameters:

data : iterable[tuple[int, pandas.DataFrame or bcolz.ctable]]

The data chunks to write. Each chunk should be a tuple of sid and the data for that asset.

assets : set[int], optional

The assets that should be in data. If this is provided we will check data against the assets and provide better progress information.

show_progress : bool, optional

Whether or not to show a progress bar while writing.

invalid_data_behavior : {‘warn’, ‘raise’, ‘ignore’}, optional

What to do when data is encountered that is outside the range of a uint32.

Returns:

table : bcolz.ctable

The newly-written table.

write_csvs(asset_map, show_progress=False, invalid_data_behavior='warn')[source]

Read CSVs as DataFrames from our asset map.

Parameters:

asset_map : dict[int -> str]

A mapping from asset id to file path with the CSV data for that asset

show_progress : bool

Whether or not to show a progress bar while writing.

invalid_data_behavior : {‘warn’, ‘raise’, ‘ignore’}

What to do when data is encountered that is outside the range of a uint32.
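A sketch of the asset_map argument to write_csvs. The file paths are hypothetical, and the writer call is commented out because it needs a real calendar and session range:

```python
# asset_map maps each asset id to the path of that asset's CSV file.
asset_map = {
    1: '/data/csv/AAPL.csv',   # hypothetical path
    2: '/data/csv/MSFT.csv',   # hypothetical path
}
# writer = BcolzDailyBarWriter('daily_bars.bcolz', calendar,
#                              start_session, end_session)
# writer.write_csvs(asset_map, invalid_data_behavior='warn')
```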

class zipline.data.us_equity_pricing.SQLiteAdjustmentWriter(conn_or_path, equity_daily_bar_reader, calendar, overwrite=False)[source]

Writer for data to be read by SQLiteAdjustmentReader

Parameters:

conn_or_path : str or sqlite3.Connection

A handle to the target sqlite database.

equity_daily_bar_reader : BcolzDailyBarReader

Daily bar reader to use for dividend writes.

overwrite : bool, optional, default=False

If True and conn_or_path is a string, remove any existing files at the given path before connecting.

calc_dividend_ratios(dividends)[source]

Calculate the ratios to apply to equities when looking back at pricing history so that the price is smoothed over the ex_date, when the market adjusts to the change in equity value due to upcoming dividend.

Returns:

DataFrame

A frame in the same format as splits and mergers, with columns:

sid : the id of the equity

effective_date : the date, in seconds, on which to apply the ratio

ratio : the ratio to apply to backwards-looking pricing data

write(splits=None, mergers=None, dividends=None, stock_dividends=None)[source]

Writes data to a SQLite file to be read by SQLiteAdjustmentReader.

Parameters:

splits : pandas.DataFrame, optional

Dataframe containing split data. The format of this dataframe is:
effective_date : int

The date, represented as seconds since Unix epoch, on which the adjustment should be applied.

ratio : float

A value to apply to all data earlier than the effective date. For open, high, low, and close those values are multiplied by the ratio. Volume is divided by this value.

sid : int

The asset id associated with this adjustment.

mergers : pandas.DataFrame, optional

DataFrame containing merger data. The format of this dataframe is:
effective_date : int

The date, represented as seconds since Unix epoch, on which the adjustment should be applied.

ratio : float

A value to apply to all data earlier than the effective date. For open, high, low, and close those values are multiplied by the ratio. Volume is unaffected.

sid : int

The asset id associated with this adjustment.

dividends : pandas.DataFrame, optional

DataFrame containing dividend data. The format of the dataframe is:
sid : int

The asset id associated with this adjustment.

ex_date : datetime64

The date on which an equity must be held to be eligible to receive payment.

declared_date : datetime64

The date on which the dividend is announced to the public.

pay_date : datetime64

The date on which the dividend is distributed.

record_date : datetime64

The date on which the stock ownership is checked to determine distribution of dividends.

amount : float

The cash amount paid for each share.

Dividend ratios are calculated as: 1.0 - (dividend_value / "close on day prior to ex_date")

stock_dividends : pandas.DataFrame, optional

DataFrame containing stock dividend data. The format of the dataframe is:

sid : int

The asset id associated with this adjustment.

ex_date : datetime64

The date on which an equity must be held to be eligible to receive payment.

declared_date : datetime64

The date on which the dividend is announced to the public.

pay_date : datetime64

The date on which the dividend is distributed.

record_date : datetime64

The date on which the stock ownership is checked to determine distribution of dividends.

payment_sid : int

The asset id of the shares that should be paid instead of cash.

ratio : float

The ratio of currently held shares in the held sid that should be paid with new shares of the payment_sid.
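A sketch of a splits frame in the format documented above. The sid and dates are made up; the adjustment-writer call is commented out because it needs a live SQLiteAdjustmentWriter:

```python
import pandas as pd

# A 2-for-1 split for sid 1: effective_date is seconds since the Unix
# epoch, and the 0.5 ratio halves all prices before that date.
effective = int(pd.Timestamp('2016-06-01', tz='UTC').timestamp())
splits = pd.DataFrame({
    'effective_date': [effective],
    'ratio': [0.5],
    'sid': [1],
})
# adjustment_writer.write(splits=splits)
```

Mergers and dividends frames follow the analogous column layouts described above.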

write_dividend_data(dividends, stock_dividends=None)[source]

Write both dividend payouts and the derived price adjustment ratios.

write_dividend_payouts(frame)[source]

Write dividend payout data to SQLite table dividend_payouts.

class zipline.assets.AssetDBWriter(engine)[source]

Class used to write data to an assets db.

Parameters:

engine : Engine or str

An SQLAlchemy engine or path to a SQL database.

init_db(txn=None)[source]

Connect to database and create tables.

Parameters:

txn : sa.engine.Connection, optional

The transaction to execute in. If this is not provided, a new transaction will be started with the engine provided.

Returns:

metadata : sa.MetaData

The metadata that describes the new assets db.

write(equities=None, futures=None, exchanges=None, root_symbols=None, chunk_size=999)[source]

Write asset metadata to a sqlite database.

Parameters:

equities : pd.DataFrame, optional

The equity metadata. The columns for this dataframe are:

symbol : str

The ticker symbol for this equity.

asset_name : str

The full name for this asset.

start_date : datetime

The date when this asset was created.

end_date : datetime, optional

The last date we have trade data for this asset.

first_traded : datetime, optional

The first date we have trade data for this asset.

auto_close_date : datetime, optional

The date on which to close any positions in this asset.

exchange : str, optional

The exchange where this asset is traded.

The index of this dataframe should contain the sids.

futures : pd.DataFrame, optional

The future contract metadata. The columns for this dataframe are:

symbol : str

The ticker symbol for this futures contract.

root_symbol : str

The root symbol, or the symbol with the expiration stripped out.

asset_name : str

The full name for this asset.

start_date : datetime, optional

The date when this asset was created.

end_date : datetime, optional

The last date we have trade data for this asset.

first_traded : datetime, optional

The first date we have trade data for this asset.

exchange : str, optional

The exchange where this asset is traded.

notice_date : datetime

The date when the owner of the contract may be forced to take physical delivery of the contract’s asset.

expiration_date : datetime

The date when the contract expires.

auto_close_date : datetime

The date when the broker will automatically close any positions in this contract.

tick_size : float

The minimum price movement of the contract.

multiplier: float

The amount of the underlying asset represented by this contract.

exchanges : pd.DataFrame, optional

The exchanges where assets can be traded. The columns of this dataframe are:

exchange : str

The name of the exchange.

timezone : str

The timezone of the exchange.

root_symbols : pd.DataFrame, optional

The root symbols for the futures contracts. The columns for this dataframe are:

root_symbol : str

The root symbol name.

root_symbol_id : int

The unique id for this root symbol.

sector : string, optional

The sector of this root symbol.

description : string, optional

A short description of this root symbol.

exchange : str

The exchange where this root symbol is traded.

chunk_size : int, optional

The number of rows to write to the SQLite table at once. This defaults to the default number of bind params in sqlite. If you have compiled sqlite3 with more or fewer bind params, you may want to pass that value here.

See also

zipline.assets.asset_finder
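A minimal equities frame for write(). The sids live in the index; the symbols and dates below are made up, and the writer call is commented out because it needs a real database engine:

```python
import pandas as pd

# Minimal equities metadata frame; the index carries the sids.
equities = pd.DataFrame({
    'symbol': ['AAPL', 'MSFT'],
    'asset_name': ['Apple Inc.', 'Microsoft Corp.'],
    'start_date': pd.to_datetime(['2002-01-01', '2002-01-01']),
    'end_date': pd.to_datetime(['2016-12-30', '2016-12-30']),
    'exchange': ['NYSE', 'NYSE'],
}, index=[1, 2])
# AssetDBWriter('sqlite:///assets.db').write(equities=equities)
```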

Readers

class zipline.data.minute_bars.BcolzMinuteBarReader(rootdir, sid_cache_size=1000)[source]

Reader for data written by BcolzMinuteBarWriter

get_value(sid, dt, field)[source]

Retrieve the pricing info for the given sid, dt, and field.

load_raw_arrays(fields, start_dt, end_dt, sids)[source]
Parameters:

fields : list of str

‘open’, ‘high’, ‘low’, ‘close’, or ‘volume’

start_dt: Timestamp

Beginning of the window range.

end_dt: Timestamp

End of the window range.

sids : list of int

The asset identifiers in the window.

Returns:

list of np.ndarray

A list with an entry per field of ndarrays with shape (minutes in range, sids) with a dtype of float64, containing the values for the respective field over start and end dt range.

table_len(sid)[source]

Returns the length of the underlying table for this sid.

class zipline.data.us_equity_pricing.BcolzDailyBarReader(table, read_all_threshold=3000)[source]

Reader for raw pricing data written by BcolzDailyBarWriter.

Parameters:

table : bcolz.ctable

The ctable containing the pricing data, with attrs corresponding to the Attributes list below.

read_all_threshold : int

The threshold number of equities that selects the read strategy: below it, data is read by slicing the carray for each asset; at or above it, all data for all assets is pulled into memory and then indexed per day/asset pair. Used to tune read performance for small versus large numbers of equities.

Notes

A Bcolz CTable is comprised of Columns and Attributes. The table with which this loader interacts contains the following columns:

[‘open’, ‘high’, ‘low’, ‘close’, ‘volume’, ‘day’, ‘id’].

The data in these columns is interpreted as follows:

  • Price columns (‘open’, ‘high’, ‘low’, ‘close’) are interpreted as 1000 * as-traded dollar value.
  • Volume is interpreted as as-traded volume.
  • Day is interpreted as seconds since midnight UTC, Jan 1, 1970.
  • Id is the asset id of the row.
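The 1000x price encoding in the bullet list above can be illustrated directly; the price value is arbitrary:

```python
import numpy as np

# Prices are stored as 1000 * the as-traded dollar value so they fit in
# an unsigned 32-bit integer; readers divide to recover the quote.
price = 135.35
stored = np.uint32(round(price * 1000))  # value written to the ctable
decoded = stored / 1000.0                # value a reader reconstructs
```

This supports quotes to the thousandths of a dollar, at the cost of capping representable prices at about 4.29 million dollars (the uint32 maximum divided by 1000).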

The data in each column is grouped by asset and then sorted by day within each asset block.

The table is built to represent a long time range of data, e.g. ten years of equity data, so the lengths of the asset blocks are not equal. The blocks are clipped to the known start and end date of each asset to cut down on the number of empty values that would need to be included to make a regular/cubic dataset.

When reading across columns, entries of open, high, low, close, and volume at the same index represent the same asset and day.

Attributes

The table with which this loader interacts contains the following attributes:

first_row : dict

Map from asset_id -> index of first row in the dataset with that id.

last_row : dict

Map from asset_id -> index of last row in the dataset with that id.

calendar_offset : dict

Map from asset_id -> calendar index of first row.

start_session_ns : int

Epoch ns of the first session used in this dataset.

end_session_ns : int

Epoch ns of the last session used in this dataset.

calendar_name : str

String identifier of the trading calendar used (e.g., “NYSE”).

We use first_row and last_row together to quickly find ranges of rows to load when reading an asset’s data into memory. We use calendar_offset and calendar to orient loaded blocks within a range of queried dates.
get_value(sid, day, colname)[source]
Parameters:

sid : int

The asset identifier.

day : datetime64-like

Midnight of the day for which data is requested.

colname : string

The price field. e.g. (‘open’, ‘high’, ‘low’, ‘close’, ‘volume’)

Returns:

float

The spot price for colname of the given sid on the given day. Raises a NoDataOnDate exception if the given day is outside the date range of the equity. Returns -1 if the day is within the date range, but the price is 0.

sid_day_index(sid, day)[source]
Parameters:

sid : int

The asset identifier.

day : datetime64-like

Midnight of the day for which data is requested.

Returns:

int

Index into the data tape for the given sid and day. Raises a NoDataOnDate exception if the given day is outside the date range of the equity.

class zipline.data.us_equity_pricing.SQLiteAdjustmentReader(conn)[source]

Loads adjustments based on corporate actions from a SQLite database.

Expects data written in the format output by SQLiteAdjustmentWriter.

Parameters:

conn : str or sqlite3.Connection

Connection from which to load data.

class zipline.assets.AssetFinder(engine)[source]

An AssetFinder is an interface to a database of Asset metadata written by an AssetDBWriter.

This class provides methods for looking up assets by unique integer id or by symbol. For historical reasons, we refer to these unique ids as ‘sids’.

Parameters:

engine : str or SQLAlchemy.engine

An engine with a connection to the asset database to use, or a string that can be parsed by SQLAlchemy as a URI.

equities_sids

All of the sids for equities in the asset finder.

futures_sids

All of the sids for futures contracts in the asset finder.

group_by_type(sids)[source]

Group a list of sids by asset type.

Parameters:

sids : list[int]

Returns:

types : dict[str or None -> list[int]]

A dict mapping unique asset types to lists of sids drawn from sids. If we fail to look up an asset, we assign it a key of None.

lifetimes(dates, include_start_date)[source]

Compute a DataFrame representing asset lifetimes for the specified date range.

Parameters:

dates : pd.DatetimeIndex

The dates for which to compute lifetimes.

include_start_date : bool

Whether or not to count the asset as alive on its start_date.

This is useful in a backtesting context where lifetimes is being used to signify “do I have data for this asset as of the morning of this date?” For many financial metrics, (e.g. daily close), data isn’t available for an asset until the end of the asset’s first day.

Returns:

lifetimes : pd.DataFrame

A frame of dtype bool with dates as index and an Int64Index of assets as columns. The value at lifetimes.loc[date, asset] will be True iff asset existed on date. If include_start_date is False, then lifetimes.loc[date, asset] will be false when date == asset.start_date.

See also

numpy.putmask, zipline.pipeline.engine.SimplePipelineEngine._compute_root_mask
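The shape of the lifetimes frame can be shown with a hand-built example; a real frame comes from an AssetFinder, and the sids and dates here are made up:

```python
import pandas as pd

# A lifetimes-shaped frame: bool values, dates as the index, sids as
# columns. lifetimes.loc[date, sid] is True iff the asset existed then.
dates = pd.date_range('2016-01-04', periods=3)
lifetimes = pd.DataFrame(
    {1: [True, True, True],    # sid 1 alive for the whole range
     2: [False, True, True]},  # sid 2 starts on the second date
    index=dates,
)
alive_second_day = lifetimes.loc[dates[1]]
```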

lookup_asset_types(sids)[source]

Retrieve asset types for a list of sids.

Parameters:

sids : list[int]

Returns:

types : dict[sid -> str or None]

Asset types for the provided sids.

lookup_future_chain(root_symbol, as_of_date)[source]

Return the futures chain for a given root symbol.

Parameters:

root_symbol : str

Root symbol of the desired future.

as_of_date : pd.Timestamp or pd.NaT

Date at which the chain determination is rooted. I.e. the existing contract whose notice date/expiration date is first after this date is the primary contract, etc. If NaT is given, the chain is unbounded, and all contracts for this root symbol are returned.

Returns:

list

A list of Future objects, the chain for the given parameters.

Raises:

RootSymbolNotFound

Raised when a future chain could not be found for the given root symbol.

lookup_future_symbol(symbol)[source]

Lookup a future contract by symbol.

Parameters:

symbol : str

The symbol of the desired contract.

Returns:

future : Future

The future contract referenced by symbol.

Raises:

SymbolNotFound

Raised when no contract named ‘symbol’ is found.

lookup_generic(asset_convertible_or_iterable, as_of_date)[source]

Convert an AssetConvertible or iterable of AssetConvertibles into a list of Asset objects.

This method exists primarily as a convenience for implementing user-facing APIs that can handle multiple kinds of input. It should not be used for internal code where we already know the expected types of our inputs.

Returns a pair of objects, the first of which is the result of the conversion, and the second of which is a list containing any values that couldn’t be resolved.

lookup_symbol(symbol, as_of_date, fuzzy=False)[source]

Lookup an equity by symbol.

Parameters:

symbol : str

The ticker symbol to resolve.

as_of_date : datetime or None

Look up the last owner of this symbol as of this datetime. If as_of_date is None, then this can only resolve the equity if exactly one equity has ever owned the ticker.

fuzzy : bool, optional

Should fuzzy symbol matching be used? Fuzzy symbol matching attempts to resolve differences in representations for shareclasses. For example, some people may represent the A shareclass of BRK as BRK.A, where others could write BRK_A.

Returns:

equity : Equity

The equity that held symbol on the given as_of_date, or the only equity to hold symbol if as_of_date is None.

Raises:

SymbolNotFound

Raised when no equity has ever held the given symbol.

MultipleSymbolsFound

Raised when no as_of_date is given and more than one equity has held symbol. This is also raised when fuzzy=True and there are multiple candidates for the given symbol on the as_of_date.
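The fuzzy-matching idea above can be modeled in a couple of lines; this is only a toy normalization, not zipline's actual matching logic:

```python
# Toy model of fuzzy symbol matching: strip shareclass delimiters so
# that alternate representations of the same symbol compare equal.
def fuzzy_key(symbol):
    return symbol.replace('.', '').replace('_', '')

assert fuzzy_key('BRK.A') == fuzzy_key('BRK_A') == 'BRKA'
```

Because multiple distinct symbols can normalize to the same key, fuzzy lookups can surface multiple candidates, which is why MultipleSymbolsFound can also be raised with fuzzy=True.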

map_identifier_index_to_sids(index, as_of_date)[source]

This method is for use in sanitizing a user’s DataFrame or Panel inputs.

Takes the given index of identifiers, checks their types, builds assets if necessary, and returns a list of the sids that correspond to the input index.

Parameters:

index : Iterable

An iterable containing ints, strings, or Assets

as_of_date : pandas.Timestamp

A date to be used to resolve any dual-mapped symbols

Returns:

List

A list of integer sids corresponding to the input index

reload_symbol_maps()[source]

Clear the in-memory symbol lookup maps.

This will make any changes to the underlying db available to the symbol maps.

retrieve_all(sids, default_none=False)[source]

Retrieve all assets in sids.

Parameters:

sids : iterable of int

Assets to retrieve.

default_none : bool

If True, return None for failed lookups. If False, raise SidsNotFound.

Returns:

assets : list[Asset or None]

A list of the same length as sids containing Assets (or Nones) corresponding to the requested sids.

Raises:

SidsNotFound

When a requested sid is not found and default_none=False.

retrieve_asset(sid, default_none=False)[source]

Retrieve the Asset for a given sid.

retrieve_equities(sids)[source]

Retrieve Equity objects for a list of sids.

Users generally shouldn’t need to use this method (instead, they should prefer the more general/friendly retrieve_assets), but it has a documented interface and tests because it’s used upstream.

Parameters:

sids : iterable[int]

Returns:

equities : dict[int -> Equity]

Raises:

EquitiesNotFound

When any requested asset isn’t found.

retrieve_futures_contracts(sids)[source]

Retrieve Future objects for an iterable of sids.

Users generally shouldn’t need to use this method (instead, they should prefer the more general/friendly retrieve_assets), but it has a documented interface and tests because it’s used upstream.

Parameters:

sids : iterable[int]

Returns:

futures : dict[int -> Future]

Raises:

FutureContractsNotFound

When any requested asset isn’t found.

sids

All the sids in the asset finder.

class zipline.data.data_portal.DataPortal(asset_finder, trading_calendar, first_trading_day, equity_daily_reader=None, equity_minute_reader=None, future_daily_reader=None, future_minute_reader=None, adjustment_reader=None)[source]

Interface to all of the data that a zipline simulation needs.

This is used by the simulation runner to answer questions about the data, like getting the prices of assets on a given day or to service history calls.

Parameters:

asset_finder : zipline.assets.assets.AssetFinder

The AssetFinder instance used to resolve assets.

trading_calendar: zipline.utils.calendar.exchange_calendar.TradingCalendar

The calendar instance used to provide minute->session information.

first_trading_day : pd.Timestamp

The first trading day for the simulation.

equity_daily_reader : BcolzDailyBarReader, optional

The daily bar reader for equities. This will be used to service daily data backtests or daily history calls in a minute backtest. If a daily bar reader is not provided but a minute bar reader is, the minutes will be rolled up to serve the daily requests.

equity_minute_reader : BcolzMinuteBarReader, optional

The minute bar reader for equities. This will be used to service minute data backtests or minute history calls. This can be used to serve daily calls if no daily bar reader is provided.

future_daily_reader : BcolzDailyBarReader, optional

The daily bar reader for futures. This will be used to service daily data backtests or daily history calls in a minute backtest. If a daily bar reader is not provided but a minute bar reader is, the minutes will be rolled up to serve the daily requests.

future_minute_reader : BcolzFutureMinuteBarReader, optional

The minute bar reader for futures. This will be used to service minute data backtests or minute history calls. This can be used to serve daily calls if no daily bar reader is provided.

adjustment_reader : SQLiteAdjustmentReader, optional

The adjustment reader. This is used to apply splits, dividends, and other adjustment data to the raw data from the readers.

get_adjusted_value(asset, field, dt, perspective_dt, data_frequency, spot_value=None)[source]

Returns a scalar value representing the value of the desired asset’s field at the given dt with adjustments applied.

Parameters:

asset : Asset

The asset whose data is desired.

field : {‘open’, ‘high’, ‘low’, ‘close’, ‘volume’, ‘price’, ‘last_traded’}

The desired field of the asset.

dt : pd.Timestamp

The timestamp for the desired value.

perspective_dt : pd.Timestamp

The timestamp from which the data is being viewed back from.

data_frequency : str

The frequency of the data to query; i.e. whether the data is ‘daily’ or ‘minute’ bars

Returns:

value : float, int, or pd.Timestamp

The value of the given field for asset at dt with any adjustments known by perspective_dt applied. The return type is based on the field requested. If the field is one of ‘open’, ‘high’, ‘low’, ‘close’, or ‘price’, the value will be a float. If the field is ‘volume’ the value will be an int. If the field is ‘last_traded’ the value will be a Timestamp.

get_adjustments(assets, field, dt, perspective_dt)[source]

Returns a list of adjustments between dt and perspective_dt for the given field and list of assets.

Parameters:

assets : list of type Asset, or Asset

The asset, or assets whose adjustments are desired.

field : {‘open’, ‘high’, ‘low’, ‘close’, ‘volume’, ‘price’, ‘last_traded’}

The desired field of the asset.

dt : pd.Timestamp

The timestamp for the desired value.

perspective_dt : pd.Timestamp

The timestamp from which the data is being viewed back from.

Returns:

adjustments : list[Adjustment]

The adjustments to that field.

get_fetcher_assets(dt)[source]

Returns a list of assets for the current date, as defined by the fetcher data.

Returns:

list: A list of Asset objects.
get_history_window(assets, end_dt, bar_count, frequency, field, ffill=True)[source]

Public API method that returns a dataframe containing the requested history window. Data is fully adjusted.

Parameters:

assets : list of zipline.data.Asset objects

The assets whose data is desired.

end_dt: Timestamp

The end of the desired window of data.

bar_count: int

The number of bars desired.

frequency: string

“1d” or “1m”

field: string

The desired field of the asset.

ffill: boolean

Forward-fill missing values. Only has effect if field is ‘price’.

Returns:

A dataframe containing the requested data.
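The ffill behavior noted above can be illustrated with a hand-built series; a real call requires a DataPortal instance:

```python
import pandas as pd
import numpy as np

# When field is 'price', missing values in the history window are
# forward-filled; other fields leave the gaps as NaN.
prices = pd.Series([10.0, np.nan, 10.2])
filled = prices.ffill()  # the NaN takes the last known price, 10.0
```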

get_last_traded_dt(asset, dt, data_frequency)[source]

Given an asset and dt, returns the last traded dt from the viewpoint of the given dt.

If there is a trade on dt, the answer is dt itself.

get_splits(sids, dt)[source]

Returns any splits for the given sids and the given dt.

Parameters:

sids : container

Sids for which we want splits.

dt : pd.Timestamp

The date for which we are checking for splits. Note: this is expected to be midnight UTC.

Returns:

splits : list[(int, float)]

List of splits, where each split is a (sid, ratio) tuple.

get_spot_value(asset, field, dt, data_frequency)[source]

Public API method that returns a scalar value representing the value of the desired asset’s field at the given dt.

Parameters:

asset : Asset

The asset whose data is desired.

field : {‘open’, ‘high’, ‘low’, ‘close’, ‘volume’, ‘price’, ‘last_traded’}

The desired field of the asset.

dt : pd.Timestamp

The timestamp for the desired value.

data_frequency : str

The frequency of the data to query; i.e. whether the data is ‘daily’ or ‘minute’ bars

Returns:

value : float, int, or pd.Timestamp

The spot value of field for asset. The return type is based on the field requested. If the field is one of ‘open’, ‘high’, ‘low’, ‘close’, or ‘price’, the value will be a float. If the field is ‘volume’ the value will be an int. If the field is ‘last_traded’ the value will be a Timestamp.

get_stock_dividends(sid, trading_days)[source]

Returns all the stock dividends for a specific sid that occur in the given trading range.

Parameters:

sid: int

The asset whose stock dividends should be returned.

trading_days: pd.DatetimeIndex

The trading range.

Returns:

list: A list of objects with all relevant attributes populated.

All timestamp fields are converted to pd.Timestamps.

handle_extra_source(source_df, sim_params)[source]

Extra sources always have a sid column.

We expand the given data (by forward filling) to the full range of the simulation dates, so that lookup is fast during simulation.

Bundles

zipline.data.bundles.register()

Register a data bundle ingest function.

Parameters:

name : str

The name of the bundle.

f : callable

The ingest function. This function will be passed:

environ : mapping

The environment this is being run with.

asset_db_writer : AssetDBWriter

The asset db writer to write into.

minute_bar_writer : BcolzMinuteBarWriter

The minute bar writer to write into.

daily_bar_writer : BcolzDailyBarWriter

The daily bar writer to write into.

adjustment_writer : SQLiteAdjustmentWriter

The adjustment db writer to write into.

calendar : zipline.utils.calendars.TradingCalendar

The trading calendar to ingest for.

start_session : pd.Timestamp

The first session of data to ingest.

end_session : pd.Timestamp

The last session of data to ingest.

cache : DataFrameCache

A mapping object to temporarily store dataframes. This should be used to cache intermediates in case the load fails. This will be automatically cleaned up after a successful load.

show_progress : bool

Show the progress for the current load where possible.

calendar : zipline.utils.calendars.TradingCalendar or str, optional

The trading calendar to align the data to, or the name of a trading calendar. This defaults to ‘NYSE’, in which case we use the NYSE calendar.

start_session : pd.Timestamp, optional

The first session for which we want data. If not provided, or if the date lies outside the range supported by the calendar, the first_session of the calendar is used.

end_session : pd.Timestamp, optional

The last session for which we want data. If not provided, or if the date lies outside the range supported by the calendar, the last_session of the calendar is used.

minutes_per_day : int, optional

The number of minutes in each normal trading day.

create_writers : bool, optional

Should the ingest machinery create the writers for the ingest function. This can be disabled as an optimization for cases where they are not needed, like the quantopian-quandl bundle.

Notes

This function may be used as a decorator, for example:

@register('quandl')
def quandl_ingest_function(...):
    ...
zipline.data.bundles.ingest(name, environ=os.environ, date=None, show_progress=True)

Ingest data for a given bundle.

Parameters:

name : str

The name of the bundle.

environ : mapping, optional

The environment variables. By default this is os.environ.

timestamp : datetime, optional

The timestamp to use for the load. By default this is the current time.

assets_versions : Iterable[int], optional

Versions of the assets db to which to downgrade.

show_progress : bool, optional

Tell the ingest function to display the progress where possible.

zipline.data.bundles.load(name, environ=os.environ, date=None)

Loads a previously ingested bundle.

Parameters:

name : str

The name of the bundle.

environ : mapping, optional

The environment variables. Defaults to os.environ.

timestamp : datetime, optional

The timestamp of the data to lookup. Defaults to the current time.

Returns:

bundle_data : BundleData

The raw data readers for this bundle.

zipline.data.bundles.unregister(name)

Unregister a bundle.

Parameters:

name : str

The name of the bundle to unregister.

Raises:

UnknownBundle

Raised when no bundle has been registered with the given name.

zipline.data.bundles.bundles

The bundles that have been registered as a mapping from bundle name to bundle data. This mapping is immutable and should only be updated through register() or unregister().

zipline.data.bundles.yahoo_equities(symbols, start=None, end=None)[source]

Create a data bundle ingest function from a set of symbols loaded from yahoo.

Parameters:

symbols : iterable[str]

The ticker symbols to load data for.

start : datetime, optional

The start date to query for. By default this pulls the full history for the calendar.

end : datetime, optional

The end date to query for. By default this pulls the full history for the calendar.

Returns:

ingest : callable

The bundle ingest function for the given set of symbols.

Notes

The sids for each symbol will be the index into the symbols sequence.

Examples

This code should be added to ~/.zipline/extension.py

from zipline.data.bundles import yahoo_equities, register

symbols = (
    'AAPL',
    'IBM',
    'MSFT',
)
register('my_bundle', yahoo_equities(symbols))

Utilities

Caching

class zipline.utils.cache.CachedObject[source]

A simple struct for maintaining a cached object with an expiration date.

Parameters:

value : object

The object to cache.

expires : datetime-like

Expiration date of value. The cache is considered invalid for dates strictly greater than expires.

class zipline.utils.cache.ExpiringCache(cache=None)[source]

A cache of multiple CachedObjects, which returns the wrapped value, or deletes the CachedObject and raises if the value has expired.

Parameters:

cache : dict-like, optional

An instance of a dict-like object which needs to support at least: __del__, __getitem__, __setitem__. If None, then a dict is used as a default.
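The expiry semantics described above can be sketched with the stdlib alone; method and attribute names here are illustrative, not zipline's exact API:

```python
from datetime import datetime, timedelta

# Sketch of a cached value with an expiration date: the cache is
# invalid for lookup dates strictly greater than `expires`.
class CachedObject:
    def __init__(self, value, expires):
        self.value, self.expires = value, expires

    def unwrap(self, dt):
        if dt > self.expires:
            raise KeyError('expired')
        return self.value

# Sketch of a cache of CachedObjects that drops expired entries.
class ExpiringCache:
    def __init__(self, cache=None):
        self._cache = {} if cache is None else cache

    def set(self, key, value, expires):
        self._cache[key] = CachedObject(value, expires)

    def get(self, key, dt):
        try:
            return self._cache[key].unwrap(dt)
        except KeyError:
            self._cache.pop(key, None)  # delete the expired entry
            raise
```

A lookup on or before the expiration date returns the value; a later lookup removes the entry and raises.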

class zipline.utils.cache.dataframe_cache(path=None, lock=None, clean_on_failure=True, serialization='msgpack')[source]

A disk-backed cache for dataframes.

dataframe_cache is a mutable mapping from string names to pandas DataFrame objects. This object may be used as a context manager to delete the cache directory on exit.

Parameters:

path : str, optional

The directory path to the cache. Files will be written as path/<keyname>.

lock : Lock, optional

Thread lock for multithreaded/multiprocessed access to the cache. If not provided no locking will be used.

clean_on_failure : bool, optional

Should the directory be cleaned up if an exception is raised in the context manager.

serialization : {‘msgpack’, ‘pickle:<n>’}, optional

How the data should be serialized. If 'pickle' is passed, an optional pickle protocol can be appended like 'pickle:3', which says to use pickle protocol 3.

Notes

The syntax cache[:] will load all key:value pairs into memory as a dictionary. The cache uses a temporary file format that is subject to change between versions of zipline.
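The core idea of a disk-backed, dict-like cache can be sketched with the stdlib alone; this simplified version stores any picklable value (DataFrames included) one file per key, and is an illustration rather than dataframe_cache's actual implementation:

```python
import os
import pickle
import shutil
import tempfile

# Sketch: a mutable mapping backed by files at path/<keyname>, usable as
# a context manager that deletes the cache directory on exit.
class DiskCache:
    def __init__(self, path=None):
        self.path = path or tempfile.mkdtemp()
        os.makedirs(self.path, exist_ok=True)

    def _keyfile(self, key):
        return os.path.join(self.path, key)

    def __setitem__(self, key, value):
        with open(self._keyfile(key), 'wb') as f:
            pickle.dump(value, f)

    def __getitem__(self, key):
        try:
            with open(self._keyfile(key), 'rb') as f:
                return pickle.load(f)
        except FileNotFoundError:
            raise KeyError(key)

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        shutil.rmtree(self.path)  # clean up the cache directory on exit
```

Values round-trip through the filesystem, and leaving the with-block removes the directory and everything in it.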

class zipline.utils.cache.working_file(final_path, *args, **kwargs)[source]

A context manager for managing a temporary file that will be moved to a non-temporary location if no exceptions are raised in the context.

Parameters:

final_path : str

The location to move the file when committing.

*args, **kwargs

Forwarded to NamedTemporaryFile.

Notes

The file is moved on __exit__ if there are no exceptions. working_file uses shutil.move() to move the actual file, so it provides the same guarantees as shutil.move().
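The write-then-commit pattern can be sketched as follows; this simplified version omits the *args/**kwargs forwarding to NamedTemporaryFile and is an illustration, not zipline's implementation:

```python
import os
import shutil
import tempfile

# Sketch: write to a temporary file, and move it to final_path only if
# the with-block completes without raising.
class working_file:
    def __init__(self, final_path):
        self._final_path = final_path
        self._tmp = tempfile.NamedTemporaryFile(delete=False)

    def __enter__(self):
        return self._tmp

    def __exit__(self, exc_type, exc_value, tb):
        self._tmp.close()
        if exc_type is None:
            shutil.move(self._tmp.name, self._final_path)  # commit
        else:
            os.unlink(self._tmp.name)  # discard on failure
```

On a clean exit the temporary file lands at the final path; on an exception it is deleted and the final path is never touched.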

class zipline.utils.cache.working_dir(final_path, *args, **kwargs)[source]

A context manager for managing a temporary directory that will be moved to a non-temporary location if no exceptions are raised in the context.

Parameters:

final_path : str

The location to move the directory when committing.

*args, **kwargs

Forwarded to tmp_dir.

Notes

The directory is moved on __exit__ if there are no exceptions. working_dir uses dir_util.copy_tree() to move the actual files, so it provides the same guarantees as dir_util.copy_tree().

Command Line

zipline.utils.cli.maybe_show_progress(it, show_progress, **kwargs)[source]

Optionally show a progress bar for the given iterator.

Parameters:

it : iterable

The underlying iterator.

show_progress : bool

Should progress be shown.

**kwargs

Forwarded to the click progress bar.

Returns:

itercontext : context manager

A context manager whose enter is the actual iterator to use.

Examples

with maybe_show_progress([1, 2, 3], True) as ns:
    for n in ns:
        ...
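The "optionally wrap an iterator" idea can be sketched with contextlib alone; the real function delegates to click's progress bar, so this stand-in (which just prints a label) is an illustration of the shape, not the actual implementation:

```python
from contextlib import contextmanager

# Sketch: yield a progress-reporting iterator when show_progress is
# True, otherwise yield the underlying iterator unchanged.
@contextmanager
def maybe_show_progress(it, show_progress, label=''):
    if show_progress:
        print(label)  # stand-in for click.progressbar(it, **kwargs)
        yield iter(it)
    else:
        yield iter(it)
```

Either way, the object bound by the with-statement is the iterator the caller should loop over.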