Structure sensitive complexity for symbol-free sequences
- Cheng-Yuan Liou^{1},
- Aleksandr A Simak^{1} and
- Jiun-Wei Liou^{2}
DOI: 10.1186/s40535-015-0011-9
© Liou et al. 2015
Received: 15 April 2015
Accepted: 7 June 2015
Published: 3 July 2015
Abstract
This study proposes an extended method for assessing the structural complexity of symbol-free sequences, such as literal texts, DNA sequences, rhythm, and musical input. The method is based on the L-system and on topological entropy for context-free grammars. Inputs are represented as binary trees. Different input features are represented separately by the tree structure and by the actual node contents. Our method infers a tree-generating grammar and estimates its complexity. This study reviews our previous results on texts and DNA sequences and provides new information regarding them. We also show new results on the complexity of Chinese classical texts and of music samples with rhythm and melody components. Our method is sensitive enough to extract quasi-regular structured fragments of Chinese texts and to detect irregularly styled samples of music inputs. To our knowledge, there is no other method that can detect such quasi-regular patterns.
Background
This work introduces a general complexity assessment of structural properties for different types of inputs. Input sequences are represented as binary trees, and the concept of the L-system (Wikipedia 2005) is borrowed to infer rewriting rules and build the corresponding context-free grammars, which are later used to assess the complexity score (Kuich 1970). This complexity score is closely related to the notion of entropy (Shannon 1948). The present work is intended to establish a general view of this kind of structural complexity assessment.
The initial work in this field focused on the complexity of musical rhythm (Liou et al. 2010), for which the binary tree representation fits almost perfectly. Later, our proposed method was applied to the complexity of DNA sequences (Liou et al. 2013a, b). From this arose the question of representation: how can other input types be transformed into a binary tree while keeping the complexity assessment the same? The third study adapted the complexity assessment to general texts encoded as symbol-free sequences (Liou et al. 2013a, b). Symbol-free representation was an important milestone: it allowed the method to be extended to more generic input data, such as Chinese paragraphs. Finally, the present study returns to music in an attempt to reconsider the initial assessment, redefine it, and make the method capable of naturally incorporating both musical melody and rhythm.
Complexity assessment
This section provides a generic version of the previously proposed method for structural complexity assessment (Liou et al. 2010). In essence, our method remains the same; however, the basic data structure and definitions were modified to equip the approach with new capabilities. Previous studies also paid attention to the equivalence of bracketed strings and binary tree rewriting systems. This study considers that equivalence already justified, and bracketed strings no longer appear in the generalized version of the method. Instead, we focus on other issues: updating the notation of our formal grammars and proposing a better view of the classification step. All adjustments remain consistent with the previous conclusions and important statements.
Binary tree
The procedure for transforming an arbitrary input encoding into a binary tree depends on the nature of the input. Nevertheless, one thing remains the same: the resulting binary tree reflects and corresponds to the structure of the input. We do not specify here exactly how to transform different kinds of input into the corresponding binary trees; each of the following sections dedicated to one particular kind of input provides the necessary explanation in detail.
The binary trees used here have the following properties:
- 1. Every sub-tree of a binary tree is itself a binary tree;
- 2. Every node except the root has a parent node;
- 3. Every node has either exactly two child nodes or none;
- 4. Every child node is labeled as left or right;
- 5. Every node can store some content inside.
Using a branching factor of two gives the tree two useful properties: it is relatively simple to maintain, and it is general enough to take into account input properties that are known to be local in linguistics (Gibson 1998) and music (Simonton 1984).
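As an illustration only, the five properties listed above can be captured by a small recursive data type. The following Python sketch is not the authors' implementation; the names `Node` and `depth` are hypothetical, and the parent link of property 2 is left implicit in the nesting rather than stored.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Node:
    """One node of a binary tree; every sub-tree is again a Node (property 1)."""
    content: Any = None              # property 5: optional content stored inside
    left: Optional["Node"] = None    # property 4: children are labeled left/right
    right: Optional["Node"] = None

    def __post_init__(self) -> None:
        # Property 3: a node has either exactly two children or none.
        if (self.left is None) != (self.right is None):
            raise ValueError("a node needs either two children or none")

    def depth(self) -> int:
        """Number of levels in the sub-tree rooted at this node."""
        if self.left is None:
            return 1
        return 1 + max(self.left.depth(), self.right.depth())


# A three-level example: an empty root whose left child carries content 1.
tree = Node(None, Node(1, Node(3), Node(4)), Node(2))
print(tree.depth())
```

Rejecting a node with a single child at construction time keeps every later step free of special cases for half-filled nodes.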
L-system
Every such tree can be considered the result of consecutive development starting from the root. Each development step corresponds to the next tree level, and the nodes at any level are the result of development at the previous level. The process continues gradually until the original tree is replicated identically. Such a development mechanism can be formalized with biology-inspired parallel string rewriting systems, or L-systems (Prusinkiewicz and Lindenmayer 1996). The L-system is a special case of a formal grammar (Chomsky 1956). Its core is a set of rewriting rules, which specify exactly how every element is to be rewritten and which are applied in parallel, naturally reflecting the processes of cell division and plant growth (Lindenmayer 1968). To replicate the tree, it is necessary to construct a complete set of rewriting rules based on the labels of the nodes and to start the rewriting procedure with the root node as the initial symbol.
Methods
Rewriting rules
A rewriting rule is constructed for every node of the binary tree as follows:
- 1. The left-hand side of the rewriting rule contains the non-terminal symbol of the node, preceded on the left by its context, which is obtained by traversing the parent nodes up to and including the root and concatenating their labels.
- 2. The right-hand side of the rule contains the node label itself, which is a terminal symbol, followed by non-terminals if the node has children.
- 3. An additional operation that sets the node content is denoted by brackets on the right-hand side of the rule, immediately after the terminal symbol; the brackets hold the content to be placed inside the node at the moment of rewriting.
List 1 shows the set of rewriting rules for this particular binary tree (Fig. 2) after the procedure above is completed.
Thus, such a parallel rewriting system is an unambiguous context-sensitive formal grammar, which is capable of replicating the original tree identically (Chomsky 1959).
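The construction above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the tuple encoding `(content, left, right)`, the helper `leaf`, and the function `rewriting_rules` are hypothetical names, and the example tree is the one implied by the rules listed in Table 1.

```python
from collections import deque


def leaf(content):
    """A node without children (hypothetical helper)."""
    return (content, None, None)


# Assumed encoding: (content, left, right); content None means "no content".
TREE = (None,
        (None, (1, leaf(3), leaf(4)), leaf(2)),
        (None, leaf(1), (2, leaf(3), leaf(4))))


def rewriting_rules(tree):
    """Return one rule per node, level by level, in 'context+symbol ↦ rhs' form."""
    rules = []
    queue = deque([(tree, "", "p")])      # (node, context, own non-terminal)
    while queue:
        (content, left, right), context, symbol = queue.popleft()
        rhs = symbol.upper()              # the terminal label of the node
        if content is not None:
            rhs += f"({content})"         # content-setting operation in brackets
        if left is not None:
            rhs += "lr"                   # non-terminals for the two children
            child_context = context + symbol.upper()
            queue.append((left, child_context, "l"))
            queue.append((right, child_context, "r"))
        rules.append(f"{context}{symbol} ↦ {rhs}")
    return rules


for rule in rewriting_rules(TREE):
    print(rule)
```

The breadth-first traversal mirrors the level-by-level development of an L-system, so the rules come out in the same order as the tree levels.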
Homomorphism and isomorphism
A curious reader may note two things. Firstly, for every node in a binary tree, there is exactly one corresponding rewriting rule. Secondly, some rewriting rules are quite similar and may appear redundant. The second observation also applies to tree nodes and even to sub-trees. Indeed, some sections of a binary tree may share exactly the same structure and even the same placement of node content. To extract such repeated structures based on their similarity and to bound the redundancy of the rewriting set, two auxiliary definitions are provided:
Homomorphism in rewriting rules
Two rewriting rules are homomorphic if and only if they assign equal contents to their terminals.
In terms of a binary tree, it means that after the rewriting procedure has been completed, homomorphic nodes share the same content.
Isomorphism on level X in rewriting rules
Two rewriting rules are isomorphic on depth X if and only if they are homomorphic and the rules corresponding to their non-terminals are respectively isomorphic on depth X-1. Isomorphism on level 0 is homomorphism.
After the rewriting has been completed, two sub-trees of a binary tree are considered isomorphic (on depth X) if their root nodes share the same content and their descendants form the same structure and respectively share the same content (up to depth X-1).
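The two definitions translate almost literally into a recursive check. The Python sketch below is an illustration under stated assumptions: nodes are encoded as hypothetical tuples `(content, left, right)`, and a node with children is assumed never to be isomorphic on depth 1 or more to a node without children, which agrees with the split of PLl and PRl in Table 1.

```python
def leaf(content):
    """A node without children (hypothetical helper)."""
    return (content, None, None)


def homomorphic(a, b):
    """Two nodes are homomorphic when they carry equal content."""
    return a[0] == b[0]


def isomorphic(a, b, depth):
    """Isomorphism on the given depth; depth 0 reduces to homomorphism."""
    if not homomorphic(a, b):
        return False
    if depth == 0:
        return True
    a_is_leaf, b_is_leaf = a[1] is None, b[1] is None
    if a_is_leaf or b_is_leaf:
        # Assumption: a leaf only matches another leaf once depth >= 1.
        return a_is_leaf and b_is_leaf
    return (isomorphic(a[1], b[1], depth - 1)
            and isomorphic(a[2], b[2], depth - 1))


# The two children of the root in the tree behind Table 1.
pl = (None, (1, leaf(3), leaf(4)), leaf(2))
pr = (None, leaf(1), (2, leaf(3), leaf(4)))
print(isomorphic(pl, pr, 1), isomorphic(pl, pr, 2))
```

On depth 1 the two sub-trees agree because their children carry contents 1 and 2 in the same positions; on depth 2 they differ because the grandchildren are attached on different sides.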
Table 1 Classified rewriting rules with respect to isomorphism levels
Class | Homomorphism | Isomorphism-1 | Isomorphism-2 |
---|---|---|---|
1 | p ↦ Plr; Pl ↦ Llr; Pr ↦ Rlr | p ↦ Plr | p ↦ Plr |
2 | PLl ↦ L(1)lr; PRl ↦ L(1) | Pl ↦ Llr; Pr ↦ Rlr | Pl ↦ Llr |
3 | PLr ↦ R(2); PRr ↦ R(2)lr | PLl ↦ L(1)lr | Pr ↦ Rlr |
4 | PLLl ↦ L(3); PRRl ↦ L(3) | PLr ↦ R(2) | PLl ↦ L(1)lr |
5 | PLLr ↦ R(4); PRRr ↦ R(4) | PRl ↦ L(1) | PLr ↦ R(2) |
6 | | PRr ↦ R(2)lr | PRl ↦ L(1) |
7 | | PLLl ↦ L(3); PRRl ↦ L(3) | PRr ↦ R(2)lr |
8 | | PLLr ↦ R(4); PRRr ↦ R(4) | PLLl ↦ L(3); PRRl ↦ L(3) |
9 | | | PLLr ↦ R(4); PRRr ↦ R(4) |
Table 2 Final rewriting set after the classification is finished; rule positions correspond to Table 1
Class | Homomorphism | Isomorphism-1 | Isomorphism-2 |
---|---|---|---|
1 | C_{1} ↦ C_{1} C_{1}; C_{1} ↦ C_{2} C_{3}; C_{1} ↦ C_{2} C_{3} | C_{1} ↦ C_{2} C_{2} | C_{1} ↦ C_{2} C_{3} |
2 | C_{2} ↦ C_{4} C_{5}; C_{2} ↦ null | C_{2} ↦ C_{3} C_{4}; C_{2} ↦ C_{5} C_{6} | C_{2} ↦ C_{4} C_{5} |
3 | C_{3} ↦ null; C_{3} ↦ C_{4} C_{5} | C_{3} ↦ C_{7} C_{8} | C_{3} ↦ C_{6} C_{7} |
4 | C_{4} ↦ null; C_{4} ↦ null | C_{4} ↦ null | C_{4} ↦ C_{8} C_{9} |
5 | C_{5} ↦ null; C_{5} ↦ null | C_{5} ↦ null | C_{5} ↦ null |
6 | | C_{6} ↦ C_{7} C_{8} | C_{6} ↦ null |
7 | | C_{7} ↦ null; C_{7} ↦ null | C_{7} ↦ C_{8} C_{9} |
8 | | C_{8} ↦ null; C_{8} ↦ null | C_{8} ↦ null; C_{8} ↦ null |
9 | | | C_{9} ↦ null; C_{9} ↦ null |
It is useful to place bounds on the isomorphism depth. The lower bound of the isomorphism depth is 0, while the upper bound is the number of levels of the original binary tree. However, these extreme bounds are not meaningful: the lower bound does not involve any structural information, while the upper bound leaves nothing to compare the whole tree with. Thus, the meaningful lower and upper bounds of the isomorphism depth for the rewriting rules are 1 and the depth of the original tree minus 1, respectively.
Classification
The classification of rewriting rules is one of the most important steps in structural complexity assessment. It makes the hidden redundancy of a binary tree explicit by exploiting the redundancy of the corresponding rewriting set.
All mutually isomorphic rewriting rules are assigned one common class label (Table 1). However, such a simple procedure is computationally expensive, regardless of whether it is carried out on rewriting rules or on tree nodes. The isomorphism check is performed repeatedly on the same inputs, and the number of checks grows by a factor of two with every additional level of the required isomorphism depth. A good analogy is a naive recursive computation of Fibonacci numbers, which recomputes the same values many times.
The new class labels (the final node values) are then propagated to the corresponding rewriting rules to compose a new rewriting set: for each rule, the left-hand side is replaced with its class label and the right-hand side with the class labels of its children (Table 2). Some rules in the set will have duplicates; alternatively, every rule occurs exactly once but carries a counter of how many times it actually appears. This information is required for the subsequent complexity assessment. All labels are considered non-terminal symbols, so additional productions to a dedicated terminal symbol are added to the set for formal completeness. The initial symbol is the class label of the root node.
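One way to avoid the repeated checks, in the same spirit as memoizing Fibonacci numbers, is to compute the class labels depth by depth and to reuse the labels of depth X-1 when building those of depth X. The Python sketch below is our illustration of that idea, not necessarily the authors' procedure; the tuple encoding, the path strings such as `"pl"`, and the function `classify` are hypothetical.

```python
from collections import deque


def leaf(content):
    return (content, None, None)


# Assumed encoding: (content, left, right); this is the tree behind Table 1.
TREE = (None,
        (None, (1, leaf(3), leaf(4)), leaf(2)),
        (None, leaf(1), (2, leaf(3), leaf(4))))


def classify(tree, depth):
    """Map every node path ('p', 'pl', 'plr', ...) to its class label."""
    nodes = {}                      # path -> (content, child paths or None)
    queue = deque([("p", tree)])
    while queue:                    # breadth-first, so labels follow tree levels
        path, (content, left, right) = queue.popleft()
        kids = None
        if left is not None:
            kids = (path + "l", path + "r")
            queue.append((kids[0], left))
            queue.append((kids[1], right))
        nodes[path] = (content, kids)

    def number(keys):
        # Equal keys get the same label; labels are issued in first-seen order.
        seen = {}
        return {path: seen.setdefault(key, len(seen) + 1)
                for path, key in keys.items()}

    labels = number({path: content for path, (content, _) in nodes.items()})
    for _ in range(depth):
        # Reuse the previous depth's labels instead of re-comparing sub-trees.
        labels = number({
            path: (content, kids and (labels[kids[0]], labels[kids[1]]))
            for path, (content, kids) in nodes.items()
        })
    return labels


for d in range(3):
    print(d, max(classify(TREE, d).values()))
```

Each pass touches every node once, so the cost grows linearly with the isomorphism depth instead of doubling with it.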
This new parallel rewriting system is a stochastic context-free formal grammar capable of reproducing the original binary tree as well as many other similar trees.
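The propagation of class labels into a counted rule set can be sketched as follows. This is a minimal illustration, assuming the labels have already been computed; the dictionaries `labels` and `children` are hand-typed from the Homomorphism column of Table 1, and the path strings and the function name `classified_rules` are hypothetical.

```python
from collections import Counter


def classified_rules(labels, children):
    """Count productions (class, (left class, right class)) or (class, None)."""
    rules = Counter()
    for path, label in labels.items():
        kids = children.get(path)
        rhs = None if kids is None else (labels[kids[0]], labels[kids[1]])
        rules[(label, rhs)] += 1   # duplicates are merged into one counter
    return rules


# Class labels of the eleven nodes under homomorphism (Table 1, first column).
labels = {"p": 1, "pl": 1, "pr": 1, "pll": 2, "plr": 3, "prl": 2, "prr": 3,
          "plll": 4, "pllr": 5, "prrl": 4, "prrr": 5}
# Child paths of the five nodes that have children.
children = {"p": ("pl", "pr"), "pl": ("pll", "plr"), "pr": ("prl", "prr"),
            "pll": ("plll", "pllr"), "prr": ("prrl", "prrr")}

for (lhs, rhs), count in classified_rules(labels, children).items():
    print(f"C{lhs} ↦ {rhs if rhs else 'null'}  x{count}")
```

Storing each distinct production once with a counter is the second of the two equivalent representations mentioned above, and it is the form that a relative-frequency computation needs.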
Complexity formula
As mentioned above, the set of classified rewriting rules forms a context-free grammar. Thus, the redundancy in the tree (its hidden structure) can be explored by assessing the complexity of the tree-generating grammar (Liou et al. 2010), which is closely related to the notion of entropy for context-free grammars (Kuich 1970). The complexity is defined as follows:
- 1. Assume that there are n classes of rules and that each class C_{i} contains n_{i} rules. Let V_{i} ∈ {C_{1}, C_{2}, …, C_{n}}, U_{ij} ∈ {R_{ij}, i = 1, 2, …, n, j = 1, 2, …, n_{i}}, and a_{ijk} ∈ {x, x = 1, 2, …, n}, where each U_{ij} has the following form:$$ U_{i1}\to V_{a_{i11}}V_{a_{i12}},\quad U_{i2}\to V_{a_{i21}}V_{a_{i22}},\quad \dots,\quad U_{in_i}\to V_{a_{in_i1}}V_{a_{in_i2}}. $$
- 2. The generating function of V_{i}, V_{i}(z), is defined as:$$ V_i(z)=\frac{\sum_{p=1}^{n_i} n_{ip}\, z\, V_{a_{ip1}}(z)\, V_{a_{ip2}}(z)}{\sum_{q=1}^{n_i} n_{iq}}, $$where n_{ip} is the occurrence count of rule U_{ip}. If V_{i} does not have non-terminals, set V_{i}(z) = 1.
- 3. After formulating the generating function V_{i}(z), we find the largest value of z, z_{max}, at which V_{1}(z_{max}) still converges (V_{1} here denotes the rule of the root node of the binary tree). After obtaining z_{max} of V_{1}(z), we set R = z_{max} (the radius of convergence). We define the complexity of the binary tree as:$$ K_0=-\ln R. $$
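As a worked illustration (our own reading of the definition, not a result reported by the authors), consider the Isomorphism-2 column of Table 2, whose productions are C_{1} ↦ C_{2} C_{3}, C_{2} ↦ C_{4} C_{5}, C_{3} ↦ C_{6} C_{7}, C_{4} ↦ C_{8} C_{9}, and C_{7} ↦ C_{8} C_{9}, with all remaining classes rewriting to null. The classes without non-terminals give V_{5}(z) = V_{6}(z) = V_{8}(z) = V_{9}(z) = 1, and substituting upward yields:

$$ V_4(z)=zV_8(z)V_9(z)=z,\quad V_7(z)=zV_8(z)V_9(z)=z,\quad V_2(z)=zV_4(z)V_5(z)=z^2,\quad V_3(z)=zV_6(z)V_7(z)=z^2,\quad V_1(z)=zV_2(z)V_3(z)=z^5. $$

Here V_{1}(z) is a polynomial and is finite for every z, so this grammar, which reproduces only the original tree, imposes no finite radius of convergence. If z is restricted to the interval between 0 and 1, as in the numerical estimation, then z_{max} = 1 and K_{0} = -ln 1 = 0.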
Numerical estimation
- 1. Rewrite the generating function in iterative form:$$ \left\{\begin{array}{l} V_i^m(z')=\dfrac{\sum_{p=1}^{n_i} n_{ip}\, z'\, V_{a_{ip1}}^{m-1}(z')\, V_{a_{ip2}}^{m-1}(z')}{\sum_{q=1}^{n_i} n_{iq}} \\ V_i^0(z')=1 \end{array}\right. $$
- 2. At each iteration, calculate the values from \( V_i^0(z') \) to \( V_i^m(z') \). When \( V_i^{m-1}(z')=V_i^m(z') \) for all i, we say that \( V_i^m \) has converged for z'. We set m = 200.
- 3. We search for \( z'_{max} \) by dichotomy (bisection) search, checking z' between 0 and 1 for the convergence of \( V_i^m \).
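The three steps above can be sketched in Python as follows. This is a hedged illustration, not the authors' implementation, and it rests on several assumptions that the text does not settle: a null production contributes its bare count to the numerator (so that a class with only null productions evaluates to 1, as stated earlier), exact equality of iterates is replaced by a tolerance `tol`, the bisection runs for a fixed number of `steps`, and convergence is assumed to be monotone in z'. The names `converges` and `estimate_complexity` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def converges(rules, z, m=200, tol=1e-12):
    """Iterate V_i^m(z) from V_i^0 = 1 and compare the last two iterates."""
    by_class = defaultdict(list)
    for (lhs, rhs), count in rules.items():
        by_class[lhs].append((rhs, count))
    values = {c: 1.0 for c in by_class}
    previous = values
    for _ in range(m):
        previous = values
        values = {}
        for c, productions in by_class.items():
            total = sum(count for _, count in productions)
            # Assumption: a null production contributes its bare count, so a
            # class with only null productions evaluates to 1.
            numerator = sum(
                count * (1.0 if rhs is None
                         else z * previous[rhs[0]] * previous[rhs[1]])
                for rhs, count in productions)
            values[c] = numerator / total
    return all(abs(values[c] - previous[c]) <= tol for c in values)


def estimate_complexity(rules, m=200, tol=1e-12, steps=40):
    """Bisection for the largest convergent z' inside (0, 1); returns (z_max, K_0)."""
    low, high = 0.0, 1.0
    for _ in range(steps):
        middle = (low + high) / 2
        if converges(rules, middle, m, tol):
            low = middle
        else:
            high = middle
    return low, (-math.log(low) if low > 0 else math.inf)


# Isomorphism-2 grammar of Table 2, stored as counted productions.
iso2 = Counter({(1, (2, 3)): 1, (2, (4, 5)): 1, (3, (6, 7)): 1,
                (4, (8, 9)): 1, (5, None): 1, (6, None): 1,
                (7, (8, 9)): 1, (8, None): 2, (9, None): 2})
print(estimate_complexity(iso2))
```

For this non-recursive grammar every tested z' converges after a few iterations, so the estimate approaches z' = 1 and a complexity close to 0. The value obtained for other grammars depends on `m` and `tol`, which is why both are exposed as parameters.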
DNA sequences
In modern bioinformatics, finding an efficient way to locate sequence fragments with biological meaning is an important issue. There are two broadly used categories of methods: sequence complexity (Koslicki 2011) and structural pattern analysis (Manna and Liou 2006; Tino 1998; Peng et al. 1992). Koslicki (2011) presented a method for computing the complexity of a sequence using a redefined topological entropy, so that the complexity score does not converge to zero for longer sequences. According to Hao et al. (2000), some rare subsequences can be found with their proposed graphical representation of DNA sequences. Zhang and Zhang (1994) analyzed nucleotide occurrence probabilities using four-nucleotide-related functions to draw 3D curve plots.
All of the following steps, such as rewriting rules extraction, classification, and numerical estimation of complexity scores remain the same as stated in the section above.
The study also compared topological entropy (Koslicki 2011) with the presented structural complexity method and showed the advantages of the latter. Both methods were able to detect statistical properties of the test sequences, but only the structural complexity assessment was sensitive to changes in the order of sequence sub-words. In addition, for some inputs, Koslicki’s method cannot process amino-acid sequences efficiently (the required fragment size grows exponentially with the sub-word length and depends on the alphabet size), whereas structural complexity has no such limitation and can be applied to amino-acid sequences directly.
The study succeeded in representing symbol sequences as binary trees and in encoding sequence symbols with fixed tree structures for the subsequent structural complexity assessment. However, a possible dependency of the final complexity scores on the chosen fixed representations remained a matter for future study at that time.
Text sequences
Extending the method beyond DNA sequences raised two questions:
- 1. How can we efficiently encode a sequence for alphabet cardinalities higher than the number of nucleotide bases? Encoding every alphabet symbol as a fixed tree structure requires deeper trees for larger alphabets, and the complexity assessment then tends to measure the dependencies between those fixed structures;
- 2. How do different encodings affect the complexity scores?
This study compared sequence complexity for both intermediate encodings, BIN and LZW. Interestingly, the complexity for BIN remains quite uniform over the encoded sequence, while LZW tends to have lower complexity scores at the front and higher scores at the rear of the sequence. Since LZW stores regular patterns in the front part in order to absorb them later in the rear part, few regular patterns remain at the end of the sequence. Structural complexity was also compared with linguistic complexity (LC) and topological entropy (TE), which showed similar behavior on the intermediate encodings.
The study analyzed intermediate encodings, but some parts of question 2 still remain open. Theoretically, there should be no difference in the complexity score if all fixed tree replacements are unique and the replacement procedure is a one-to-one function.
Later investigation showed that the intermediate encoding BIN encodes 27 symbols as binary strings of length 5, whereas the fixed tree replacements are aligned to length 2. Because of this misalignment, the original symbols of the input sequence were shredded: some fixed representation substitutions were formed by the last bit of one symbol and the first bit of the next. This is not important when one merely measures the relative complexity of an incoming transmission stream, but it does matter when one has to reveal the structural complexity of the input sequence. Since the fixed tree representation replaces 2-bit fragments of the encoded string, the intermediate encoding should be aligned to a multiple of 2.
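The misalignment can be made concrete with a short Python sketch. It is only an illustration: the 27-symbol alphabet (lowercase letters plus space) and the function names `encode_fixed_width` and `straddling_chunks` are assumptions, since the text does not list the actual BIN code table.

```python
# Assumed alphabet of 27 symbols; the real BIN table is not given in the text.
ALPHABET = "abcdefghijklmnopqrstuvwxyz "


def encode_fixed_width(text, width):
    """Concatenate fixed-width binary codes of the symbols in text."""
    return "".join(format(ALPHABET.index(ch), f"0{width}b") for ch in text)


def straddling_chunks(n_symbols, width, chunk=2):
    """Indices of chunk-sized fragments that mix bits of two adjacent symbols."""
    mixed = []
    for k in range(n_symbols * width // chunk):
        first_bit = k * chunk
        last_bit = first_bit + chunk - 1
        if first_bit // width != last_bit // width:
            mixed.append(k)
    return mixed


print(encode_fixed_width("ab", 5))          # two 5-bit codes back to back
print(straddling_chunks(4, width=5))        # 2-bit fragments crossing a symbol boundary
print(straddling_chunks(4, width=6))        # padding to 6 bits removes them
```

With 5-bit codes, every second symbol boundary falls inside a 2-bit fragment; padding each code to an even width, such as 6 bits, keeps every fragment inside one symbol.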
Chinese texts
In this section, Chinese texts are considered as an extreme case of possible application for structural complexity. The alphabet size (the number of distinct symbols) of such input sequences is on the order of thousands and can easily exceed the length of the input sequence. Such alphabet cardinality may also place some restrictions on the encoding because of the limited memory capacity of modern computers.
Dataset
There are four great classical novels of Chinese literature (Shep 2011), which are commonly regarded as the greatest and most influential works of premodern Chinese fiction. Two of them, “Dream of the Red Chamber” (Trad. Chinese “紅樓夢”) by Cao Xueqin (18th century) and “Romance of the Three Kingdoms” (Trad. Chinese “三國演義”) by Luo Guanzhong (14th century), were selected for analysis with the developed structural complexity method.
Processing
The intermediate encoding of input Unicode symbols (e.g., u4e00, u4e8c) removes the “u” character and treats each of the remaining four hexadecimal digits (two bytes) as an ASCII symbol of 8 bits. Thus, every input symbol was encoded as a 32-bit binary string, and these strings were then concatenated. Next, four fixed tree representations were applied to compose a binary tree for every 1024-bit segment of the binary input string. Those trees were used as input for structural complexity assessment with isomorphism level 8.
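The encoding step can be sketched in Python as shown below. This is an illustration under assumptions: hexadecimal digits are taken in lowercase, as in the examples u4e00 and u4e8c, only code points up to U+FFFF are handled, and a trailing segment shorter than 1024 bits is simply kept. The function names are hypothetical. Since each symbol occupies 32 bits, one 1024-bit segment covers 1024 / 32 = 32 symbols.

```python
def encode_symbol(ch):
    """Encode one character as the 8-bit ASCII codes of its four hex digits."""
    hex_digits = format(ord(ch), "04x")      # e.g. '一' -> '4e00'
    if len(hex_digits) != 4:
        raise ValueError("this sketch only covers code points up to U+FFFF")
    return "".join(format(ord(digit), "08b") for digit in hex_digits)


def encode_text(text):
    """Concatenate the 32-bit encodings of all characters."""
    return "".join(encode_symbol(ch) for ch in text)


def segments(bits, size=1024):
    """Split the bit string into consecutive segments of the given size."""
    return [bits[i:i + size] for i in range(0, len(bits), size)]


print(encode_symbol("一"))                    # 32 bits for U+4E00
print(len(segments(encode_text("一" * 64))))  # 64 symbols give two full segments
```

The subsequent replacement of bit fragments by the four fixed tree representations is not sketched here, because the text does not specify those trees.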
Results and discussion
The most fascinating result we have discovered so far is the significantly lower complexity score of fragments that contain regular structures. When sentences display a more regular structure than ordinary narrative text (for instance, some poetic inserts), the structural complexity score tends to be lower. Below we provide a few instances of this effect for both novels, in descending order from the highest (less regularity) to the lowest (more regularity) complexity scores. For those who do not feel confident in Chinese, we recommend paying attention to the regularities in the sequences of symbols. Some of those regularities are typical of classical Chinese, and some of them go beyond that.
Dream of the Red Chamber:
- 1. Chapter 91: 和他好,他偏不和你好,你怎麼樣?你不和他好,他偏要和你好,你怎麼
- 2. Chapter 5: 「癡情司」,「結怨司」,「朝啼司」,「暮哭司」,「春感司」,「
- 3. Chapter 1: 便是『了』,『了』便是『好』;若不『了』便不『好』;若要『好』,
- 4. Chapter 13: 、賈敕、賈效、賈敦、賈赦、賈政、賈琮、賈 、賈珩、賈珖、賈琛、賈
- 5. Chapter 54: 、太婆婆、媳婦、孫子媳婦、重孫子媳婦、親孫子媳婦、姪孫子、重孫子
Romance of the Three Kingdoms:
- 1. Chapter 20: 建。建生廣陵侯劉哀。哀生膠水侯劉憲。憲生祖邑侯劉舒。舒生祁陽侯劉
- 2. Chapter 23: 也;不讀詩書,是口濁也;不納忠言,是耳濁也;不通古今,是身濁也;
- 3. Chapter 22: 之人,然後有非常之事;有非常之事,然後立非常之功。夫非常者,固
- 4. Chapter 102: ,方者為牛腹。垂者為牛舌,曲者為牛肋。刻者為牛齒,立者為牛角。細
- 5. Chapter 20: 劉昂。昂生漳侯劉祿。祿生沂水侯劉戀。戀生欽陽侯劉英。英生安國侯劉
To our knowledge, there is no other method which can detect such quasi-regular sections.
Music samples
An earlier study (Liou et al. 2010) proposed a complexity measure for musical rhythm, representing it as a binary tree. Such a representation is very natural for rhythm, because note durations are generally obtained by repeated halving (whole, half, quarter, and so on). That study focused only on rhythm and ignored another important component of music, the melody. Melody gives information on tone transitions through time, as specified by the rhythm.
Encoding
Dataset
We decided to test the structural complexity method on a dataset consisting of a collection of drum lessons for three styles: Rock, Blues, and Jazz. The collection was created and published online by Rudy Lievens, a drummer with over 25 years of experience, on his personal website devoted to drums (Lievens 2013). The exercise materials are provided as note sheets and as MP3 or MIDI files for listening and downloading. A few files could not be downloaded properly and were excluded from the assessment. In total, after download, Rock had 7533 exercises, and Blues and Jazz had 8594 and 12609 exercises, respectively. Typical lesson note sheets are provided below (Fig. 10).
Processing
We preprocessed all data. The procedure works as uniformly as possible; a single implementation was used to preprocess all dataset samples. The procedure recognized and fixed the following cases: uncertain note onsets and time lags, upbeats and syncopations, and triplets and grace notes. All notes were adjusted to the most suitable positions. In samples with triplets, the durations were additionally multiplied by 3/2. Detected grace notes served as indicators to extend the notes they are joined to up to the proper length.
All samples are rather short and structurally similar to each other within one style. Thus, a straightforward structural complexity assessment of each sample with isomorphism level 1 does not reveal interesting results. We decided to assess the complexity of each style first and then to distinguish the most atypical samples within each style. To do so, an additional structure called the universal rewriting rules set is required. This universal rules set contains all rewriting rules from all the samples within one style corpus.
The complexity assessment procedure was adapted for the current task and performed in three steps. Step 1 converted each preprocessed MIDI file into its binary tree, extracted the rewriting rules, and classified them with isomorphism level 1. Step 2 placed the classified rewriting rules into the universal rules set while accurately maintaining their relative probabilities (occurrence scores). Step 3 assessed the complexity of each sample and of each universal set. The numerical estimation of structural complexity for individual samples remains the same, with just one difference: instead of the individual rule scores, the corresponding scores from the universal rules set were substituted. To assess the complexity of each style, the numerical estimation was applied to each universal rewriting set directly.
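Steps 2 and 3 can be illustrated with a small Python sketch. It is not the authors' code: it assumes that classified rules are stored as counted productions whose class labels are comparable across samples of one style, and the names `build_universal_set` and `substitute_scores`, as well as the toy labels "A", "B", and "C", are hypothetical.

```python
from collections import Counter


def build_universal_set(samples):
    """Merge the counted productions of all samples of one style."""
    universal = Counter()
    for rules in samples:
        universal.update(rules)        # adds occurrence counts rule by rule
    return universal


def substitute_scores(sample, universal):
    """Keep the sample's productions but take their counts from the universal set."""
    return Counter({rule: universal[rule] for rule in sample})


# Two toy samples; a production is (class, (left class, right class)) or (class, None).
sample_a = Counter({("A", ("B", "B")): 1, ("B", None): 2})
sample_b = Counter({("A", ("B", "C")): 1, ("B", None): 1, ("C", None): 1})

universal = build_universal_set([sample_a, sample_b])
print(universal)
print(substitute_scores(sample_a, universal))
```

A sample whose productions are rare in the universal set receives small counts for them, which is one plausible reason why atypical samples stand out in the subsequent complexity estimation.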
Conclusion
A higher complexity score, as well as a larger universal set for a particular corpus, might be direct evidence of a more comprehensive music style. A larger universal set, regardless of the corresponding complexity score, might reflect the richness of the music and of its overall musical expression.
We also identified some samples with extremely high complexity. Later examination revealed that they differ from all the other samples of the style. The most evident and easiest to understand are several Rock exercises that lack the hi-hat rhythmic line that is standard for the style. Figure 12 shows two samples with and without such a hi-hat pattern.
Declarations
Acknowledgements
This work was supported by the Ministry of Science and Technology under grants MOST 103-2221-E-002-180 and MOST 104-2811-H-001-004. We also greatly appreciate Rudy Lievens’s permission to use his data in our research.
References
- Chomsky N (1956) Three models for the description of language. IRE Trans Inform Theory 2(3):113–124
- Chomsky N (1959) On certain formal properties of grammars. Inf Control 2(2):137–167
- Gibson E (1998) Linguistic complexity: locality of syntactic dependencies. Cognition 68(1):1–76
- Hao B-L, Lee HC, Zhang S-Y (2000) Fractals related to long DNA sequences and complete genomes. Chaos, Solitons Fractals 11(6):825–836
- Koslicki D (2011) Topological entropy of DNA sequences. Bioinformatics 27:1061–1067
- Kuich W (1970) On the entropy of context-free languages. Inf Control 16(2):173–200
- Lievens R (2013) Drum beats, drum lessons and Midi loops. http://www.edrumbeats.com/. Retrieved 2013
- Lindenmayer A (1968) Mathematical models for cellular interactions in development I. Filaments with one-sided inputs. J Theor Biol 18(3):280–299
- Liou C-Y, Wu T-H, Lee C-Y (2010) Modelling complexity in musical rhythm. Complexity 15:19–30
- Liou C-Y, Liou D-R, Simak AA, Huang B-S (2013a) Syntactic sensitive complexity for symbol-free sequence. In: LNCS, 4th International Conference, IScIDE 2013, Beijing, China, July 31 – August 2, 8261. pp 14–21
- Liou C-Y, Tseng S-H, Cheng W-C, Tsai H-Y (2013b) Structural complexity of DNA sequence. Comput Math Methods Med 2013:11
- Manna S, Liou C-Y (2006) Reverse engineering approach in molecular evolution: simulation and case study with enzyme proteins. In: Proceedings of the 2006 International Conference on Bioinformatics & Computational Biology, BIOCOMP'06. Las Vegas, Nevada, pp 529–533
- Peng C-K, Buldyrev SV, Goldberger A, Havlin S, Sciortino F, Simons M et al (1992) Long-range correlations in nucleotide sequences. Nature 356(6365):168–170
- Prusinkiewicz P, Lindenmayer A (1996) The algorithmic beauty of plants. Springer-Verlag, New York
- Shannon C (1948) The mathematical theory of communication. Bell Syst Tech J 27:379–423, 623–656
- Shep SJ (2011) Paper and print technology. In: The Encyclopedia of the Novel, vol 2, Wiley-Blackwell Encyclopedia of Literature. John Wiley & Sons, New Jersey, p 596
- Simonton DK (1984) Melodic structure and note transition probabilities: a content analysis of 15,618 classical themes. Psychol Music 12:3–16
- Tino P (1998) Spatial representation of symbolic sequences through iterative function systems. Systems Man Cybernetics A 29(4):386–393
- U.S. National Library of Medicine (n.d.) National Center for Biotechnology Information. http://www.ncbi.nlm.nih.gov/
- Welch TA (1984) A technique for high-performance data compression. Computer 17(6):8–19
- Wikipedia (2005) L-system. http://en.wikipedia.org/wiki/L-system. Retrieved October 1, 2013
- Zhang R, Zhang C (1994) Z curves, an intuitive tool for visualizing and analyzing the DNA sequences. J Biomol Struct Dyn 11(4):767–782
Copyright
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.