### abstract ###
Symbolic dynamics has proven to be an invaluable tool in analyzing the mechanisms that lead to unpredictability and random behavior in nonlinear dynamical systems
Surprisingly, a discrete partition of continuous state space can produce a coarse-grained description of the behavior that accurately describes the invariant properties of an underlying chaotic attractor
In particular, measures of the rate of information production---the topological and metric entropy rates---can be estimated from the outputs of Markov or generating partitions
Here we develop Bayesian inference for SYMBOL -th order Markov chains as a method for finding generating partitions and estimating entropy rates from finite samples of discretized data produced by coarse-grained dynamical systems
### introduction ###
Research on chaotic dynamical systems during the last forty years produced a new vision of the origins of randomness
It is now widely understood that observed randomness can be generated by low-dimensional deterministic systems that exhibit a chaotic attractor
Today, when confronted with what appears to be a high-dimensional stochastic process, one asks whether the process is instead a hidden low-dimensional, but nonlinear, dynamical system
This awareness, though, requires a new way of looking at apparently random data, since chaotic dynamics are very sensitive to the measurement process CITATION, which turns out to be both a blessing and a curse
Symbolic dynamics, one of a suite of tools in dynamical systems theory, in its most basic form addresses this issue by considering a coarse-grained view of a continuous dynamics
In this sense, any finite-precision instrument that measures a chaotic system induces a symbolic representation of the underlying continuous-valued behavior
To effectively model time series of discrete data from a continuous-state system, two concerns must be addressed
First, we must consider the measurement instrument and the representation of the true dynamics which it provides
Second, we must consider the inference of models based on this data
The relation between these steps is more subtle than one might expect
As we will demonstrate, on the one hand, in the measurement of chaotic data, the instrument should be designed to maximize the entropy rate of the resulting data stream
This allows one to extract as much information from each measurement as possible
On the other hand, model inference strives to minimize the apparent randomness (entropy rate) over a class of alternative models
This reflects a search for determinism and structure in the data
Here we address the interplay between optimal instruments and optimal models by analyzing a relatively simple nonlinear system
We consider the design of binary-output instruments for chaotic maps with additive noise
We then use Bayesian inference of a  SYMBOL -th order Markov chain to model the resulting data stream
Our model system is a one-dimensional chaotic map with additive noise CITATION: SYMBOL, where SYMBOL, SYMBOL, and SYMBOL is a Gaussian random variable with mean zero and variance SYMBOL
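As a minimal sketch of such a model system, the following iterates a noisy one-dimensional map. The map itself is elided in the text above; purely for illustration we assume the logistic map with a hypothetical parameter `r`, and we assume that noise draws which would push the state outside the unit interval are rejected and redrawn. Neither choice is taken from the source.

```python
import numpy as np

def noisy_logistic(n_steps, r=3.7, sigma=1e-3, x0=0.3, seed=0):
    """Iterate x_{t+1} = r x_t (1 - x_t) + xi_t with Gaussian noise xi_t.

    Assumptions (not from the text): the logistic form of the map, and
    the rule that a noise draw taking the state outside [0, 1] is
    rejected and redrawn so that the orbit stays bounded.
    """
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps)
    x[0] = x0
    for t in range(n_steps - 1):
        fx = r * x[t] * (1.0 - x[t])
        while True:
            candidate = fx + rng.normal(0.0, sigma)
            if 0.0 <= candidate <= 1.0:
                break
        x[t + 1] = candidate
    return x

trajectory = noisy_logistic(1000)
```

Setting `sigma=0.0` recovers the deterministic map, which is the zero-noise limit considered next.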
To start we consider the design of instruments in the zero-noise limit
This is the regime of most previous work in symbolic dynamics and provides a convenient frame of reference
The construction of a symbolic dynamics representation of a continuous-state system goes as follows  CITATION
We assume time is discrete and consider a map  SYMBOL  from the  state space   SYMBOL  to itself  SYMBOL
This space can be partitioned into a finite set SYMBOL of nonoverlapping regions in many ways
The most powerful is called a  Markov partition  and must satisfy two conditions
First, the image of each region  SYMBOL  must be a union of intervals:  SYMBOL
Second, the map  SYMBOL , restricted to an interval, must be one-to-one and onto
If a Markov partition cannot be found for the system under consideration, the next best coarse-graining is called a  generating partition
For one-dimensional maps, these are often easily found using the extrema of SYMBOL, its critical points
The critical points in the map are used to divide the state space into intervals  SYMBOL  over which  SYMBOL  is monotone
Note that Markov partitions are generating, but the converse is not generally true
Given any partition  SYMBOL , then, a series of continuous-valued states  SYMBOL  can be projected onto its symbolic representation  SYMBOL
The latter is simply the associated sequence of partition-element indices
This is done by defining an operator  SYMBOL  that returns a unique symbol  SYMBOL  for each  SYMBOL  from an alphabet  SYMBOL  when  SYMBOL
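The projection described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name `symbolize`, the logistic map with parameter 4, and the single cut point at its critical point 0.5 are all hypothetical choices.

```python
import numpy as np

def symbolize(x, boundaries):
    """Return the index of the partition element containing each state.

    `boundaries` holds the interior cut points in increasing order, so
    k cut points define k + 1 intervals labeled 0, 1, ..., k.
    A state equal to a cut point is assigned to the interval on its right.
    """
    return np.digitize(x, boundaries)

# Hypothetical example: logistic map f(x) = 4 x (1 - x), critical point 0.5.
x = np.empty(10)
x[0] = 0.3
for t in range(9):
    x[t + 1] = 4.0 * x[t] * (1.0 - x[t])

symbols = symbolize(x, [0.5])  # binary sequence over the alphabet {0, 1}
```

Using the critical point as the only cut point gives a binary partition over whose elements the map is monotone, which is the construction described for one-dimensional maps.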
The central result in symbolic dynamics establishes that, using a generating partition, increasingly long sequences of observed symbols identify smaller and smaller regions of the state space
Starting the system in such a region produces the associated measurement symbol sequence
In the limit of infinite symbol sequences, the result is a discrete-symbol representation of a continuous-state system---a representation that, as we will show, is often much easier to analyze
In this way a chosen partition maps the continuous dynamics onto a symbol sequence SYMBOL
The choice of partition then is equivalent to our instrument-design problem
The effectiveness of a partition (in the zero-noise limit) can be quantified by estimating the entropy rate of the resulting symbolic sequence
To do this we consider length- SYMBOL   words   SYMBOL
The block entropy of length- SYMBOL sequences obtained from partition SYMBOL is then SYMBOL, where SYMBOL is the probability of observing the word SYMBOL
From the block entropy the entropy rate can be estimated as the following limit: SYMBOL
In practice it is often more accurate to calculate the length- SYMBOL estimate of the entropy rate using SYMBOL
Another key result in symbolic dynamics says that the entropy of the original continuous system is found using generating partitions CITATION
In particular, the true entropy rate is the maximum of the estimated entropy rates over partitions: SYMBOL
Thus, translated into a statement about experiment design, the results tell us to design an instrument so that it maximizes the observed entropy rate
This reflects the fact that we want each measurement to produce the most information possible
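The block-entropy estimate of the entropy rate can be sketched as below. We assume the standard Shannon block entropy computed from empirical frequencies of overlapping words, and the difference of successive block entropies as the finite-length estimate; the function names, the base-2 logarithm, and the logistic-map example are illustrative choices and are not taken from the text.

```python
from collections import Counter
import math

def block_entropy(symbols, L):
    """Plug-in estimate of the block entropy H(L), in bits.

    Word probabilities are the empirical frequencies of overlapping
    length-L words in `symbols` (an assumption about the estimator).
    """
    if L == 0:
        return 0.0
    n_words = len(symbols) - L + 1
    counts = Counter(tuple(symbols[i:i + L]) for i in range(n_words))
    return -sum((c / n_words) * math.log2(c / n_words) for c in counts.values())

def entropy_rate_estimate(symbols, L):
    """Length-L entropy-rate estimate H(L) - H(L - 1), in bits per symbol."""
    return block_entropy(symbols, L) - block_entropy(symbols, L - 1)

# Hypothetical example: binary symbols from the logistic map f(x) = 4 x (1 - x)
# with the partition cut at the critical point 0.5.
x, seq = 0.3, []
for _ in range(5000):
    seq.append(0 if x < 0.5 else 1)
    x = 4.0 * x * (1.0 - x)

h_est = entropy_rate_estimate(seq, 3)
```

For this particular map and partition the entropy rate is known to be 1 bit per symbol, so `h_est` should come out close to that value up to finite-sample bias. Moving the cut point away from 0.5 is a simple way to observe how the estimate depends on the instrument.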
As a benchmark, available only when we know SYMBOL, Pesin's identity CITATION tells us that the value of SYMBOL is equal to the sum of the positive Lyapunov characteristic exponents: SYMBOL
For one-dimensional maps there is a single Lyapunov exponent, which is numerically estimated from the map SYMBOL and observed trajectory SYMBOL using SYMBOL
Taken altogether, these results tell us how to design our instrument for effective observation of deterministic chaos
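A minimal sketch of the numerical Lyapunov estimate follows, assuming the usual trajectory average of the logarithm of the absolute derivative of the map. The logistic map with parameter 4, the initial condition, and the transient length are hypothetical choices made only for illustration.

```python
import math

def lyapunov_estimate(f_prime, trajectory):
    """Average of log|f'(x_n)| along the trajectory (natural log, in nats)."""
    return sum(math.log(abs(f_prime(x))) for x in trajectory) / len(trajectory)

# Hypothetical example: logistic map f(x) = r x (1 - x) with r = 4.
r = 4.0

def f(x):
    return r * x * (1.0 - x)

def f_prime(x):
    return r * (1.0 - 2.0 * x)

x = 0.3
for _ in range(1000):       # discard an initial transient
    x = f(x)
orbit = []
for _ in range(20000):      # collect states assumed to lie on the attractor
    orbit.append(x)
    x = f(x)

lam = lyapunov_estimate(f_prime, orbit)
```

For this map the exponent is known to equal the natural logarithm of 2, that is, 1 bit per iteration, which gives a benchmark against which a block-entropy estimate in bits can be compared after dividing `lam` by `math.log(2)`.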
Notably, in the presence of noise no such theorems exist
However, CITATION demonstrated that the methods developed above are robust in the presence of noise
In any case, we view the output of the instrument as a stochastic process
A sample realization  SYMBOL  of length  SYMBOL  with measurements taken from a finite alphabet is the basis for our inference problem:  SYMBOL
For our purposes here, the sample is generated by a partition of continuous-state sequences from iterations of a one-dimensional map, and we assume that the states are on a chaotic attractor
This means, in particular, that the stochastic process is stationary
We assume, in addition, that the alphabet is binary  SYMBOL
