Data Compression and the 6-7-8-9 Coding in the Q-Ching

by John McFnordland, Exerr University

(Originally published in the Spring 1995 issue of JOAC.)

Abstract: The famous 6-7-8-9 coding and associated data compression techniques of the Q-Ching are described. The evolution of this coding from the ill-fated Trigram Project is explored in depth.

One of the most pivotal design decisions for the divinatory computing devices at the Golden Emperor's court, which in turn had a marked influence on the development of the Q-Ching itself, was the choice of a numerical encoding known today as "the 6-7-8-9 coding". This code related the various elements of the Q-Ching symbol system (emblematic lines, digrams, trigrams) to each other through their numerical equivalents. It also provided a system of numerical/geometric operations on the hexagrams that proved absolutely crucial to correctly interpreting the oracle. In fact, a programmer's reputation in the divinatory arts, and often his longevity in the court, depended heavily on a full mastery of the 6-7-8-9 coding.

The origin of this important cornerstone of the Q-Ching has long been a forgotten mystery. However, with the recent excavations at the Golden Emperor site, new data relevant to this mystery has surfaced, revealing a story more fascinating than most scholars expected. It appears this famous coding was the compromise outcome of a failed project that was incredibly advanced conceptually. That project pushed the envelope of Chinese oracular thought too far for the programmers to actually pull off a working system. In an attempt to salvage something of the project before an important ritual deadline, they used the simpler (though still quite sophisticated) 6-7-8-9 as a fall-back design.

The Basics of the 6-7-8-9 Coding

The 6-7-8-9 coding was a system of correspondences between the basic elements of the Q-Ching hexagrams. This hierarchy of symbols included basic lines (yin or yang), emblematic lines (moving and static versions of the basic lines), digrams (2 basic lines, one on top of another), trigrams (3 basic lines in a stack), and finally full hexagrams (6 lines). Since a hexagram can be viewed as 2 trigrams or 3 digrams stacked up, a way of looking at the internal structure of the hexagram became available through this code. Further, by substituting lines, digrams and trigrams for each other through the 6-7-8-9, a complicated divinatory calculus shows through that strongly shaped the interpretations of the Q-Ching hexagrams.

The basis of the 6-7-8-9 coding, like much in the Q-Ching, goes back to the basic correspondence of:

Yin (broken line) = 2, Yang (solid line) = 3,
which is then elaborated upon as we move to more complex symbol elements. The types of operations used at this stage are fairly unsophisticated, usually nothing more complicated than counting and adding. While catering to the abilities of the average soothsayer, this small group of operations had a marked limiting effect on the Q-Ching itself.

In a nutshell, the coding can be summarized by the following table.

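Table 1. The 6-7-8-9 Coding
Value  Emblematic Line  Digram (lower, upper)  Trigrams
6      moving yin       yin, yin               the mother (earth)
7      static yang      yin, yang              the three sons
8      static yin       yang, yin              the three daughters
9      moving yang      yang, yang             the father (heaven)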

One important fact to notice is that 6 and 9 each correspond to a single trigram, while 7 and 8 each correspond to 3 trigrams (known respectively as "the sons" and "the daughters"). This reduction (or loss of information), with 3 trigrams mapping onto the same value, has marked effects throughout the oracle.

Data Compression Techniques during Consultation

Conceptually, when the oracle is consulted and a hexagram is being drawn, each line can be considered the result of 3 random decisions that create a single emblematic line (2 bits of information). In the yarrow stick method, the random choices are made by dividing the pile of sticks into 2 piles and using a certain counting algorithm to produce either a yin=2 or yang=3 decision. After repeating this process two more times, creating 3 choices, a line is drawn. In the 3 coin method, 3 Chinese coins are simultaneously tossed, creating 3 random "head/tail" choices. Traditionally, the engraved side of the coin was yin=2, the blank side yang=3. In either case, the three choices could have been imagined as a single trigram, and the emblematic line drawn would be calculated by:

Algorithm 1:
line = choice1 + choice2 + choice3;
and the value of "line" was converted to an emblematic line via the 6-7-8-9 table (see above). Alternatively, the table directly converts the conceptual trigram into a line.
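As a rough illustration, Algorithm 1 and the table lookup can be sketched in a few lines of Python. The names draw_line and EMBLEMATIC_LINE are hypothetical, and the line names attached to each value follow the traditional assignment (6 and 9 moving, 7 and 8 static), which is an assumption made here for the sake of the example.

```python
# Basic line values used throughout the article.
YIN, YANG = 2, 3

# Value -> emblematic line. The labels are an assumption (traditional assignment).
EMBLEMATIC_LINE = {
    6: "moving yin",
    7: "static yang",
    8: "static yin",
    9: "moving yang",
}

def draw_line(choice1, choice2, choice3):
    """Algorithm 1: add the three yin/yang choices, then look up the line."""
    value = choice1 + choice2 + choice3
    return value, EMBLEMATIC_LINE[value]

# A single yang among two yins gives a 7, whichever choice it was.
print(draw_line(YANG, YIN, YIN))  # (7, 'static yang')
print(draw_line(YIN, YIN, YANG))  # (7, 'static yang')
```

The two calls return the same result even though the yang choice sits in a different position, which is the loss of order information discussed next.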

The choice of addition, while making it easy on the oracle's operator, leads to a loss of information. Technically, this is known as "lossy data compression." Even though a trigram has 3 bits of information, only 2 bits make it to the hexagram in the form of an emblematic line. When dealing with a son or daughter trigram, the position of the "odd line" determines whether it is 1st, 2nd or 3rd in the birth order. But addition destroys this order information, collapsing all the sons into one value (3+2+2 = 2+3+2 = 2+2+3 = 7), and similarly for the daughters. Notice that this order information is lost anyway in the coin method, since the 3 coins are "anonymous" (identical), but it could have been retained in the yarrow stick method, the older and more prestigious technique. There must have been some powerful pressure to ignore the identities of the individual sons and daughters.

(Technically speaking, even more information is lost due to the moving lines occurring less frequently than the static lines. All told, a given line conveys only about 1.81 bits of information, not 2 or 3 bits.)
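The 1.81-bit figure can be checked with a short entropy calculation. The sketch below assumes the 3 coin method with fair, independent coins, so that the values 6, 7, 8 and 9 occur with probabilities 1/8, 3/8, 3/8 and 1/8; the yarrow stick method has different odds, which are not given here and are not modeled. The function name line_entropy_bits is hypothetical.

```python
from collections import Counter
from itertools import product
from math import log2

YIN, YANG = 2, 3

def line_entropy_bits():
    """Shannon entropy of one emblematic line drawn with three fair coins."""
    # Count how many of the 8 equally likely tosses sum to each value 6..9.
    counts = Counter(sum(toss) for toss in product((YIN, YANG), repeat=3))
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())

print(round(line_entropy_bits(), 2))  # 1.81
```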

Data Compression in Interpretation

On the interpretation side, the usage of the 6-7-8-9 coding led to a sophisticated technique for looking at the "inner nature" of the hexagram. A hexagram of six lines can be viewed as either 3 digrams (lines 1-2, 3-4, 5-6) or 2 trigrams (lines 1-2-3, 4-5-6, ignoring the inner trigrams for the moment). Since each digram is equivalent to an emblematic line, a hexagram can be converted into a trigram, revealing a deeper level of symbolism to the original figure. Similarly, 2 trigrams can be converted to 2 emblematic lines via the coding, reducing a hexagram to a digram (and often a single line). Such symbolic gymnastics were commonplace for the advanced operator.

A Revised History of the 6-7-8-9 Coding

The origins of the 6-7-8-9 coding have long been considered to be lost in the mists of time. However, recent excavations at the Golden Emperor's palace complex, particularly at the Oracular Temple itself (also known as "the Programmer's Warehouse", due to its crude accommodations), have cast a whole new light on these origins. Design documents found in the cubicles of the programmers throughout the temple clearly point to an attempt to create a much more complicated design than the 6-7-8-9. Only after repeated setbacks did they finally settle on the simpler paradigm that has come down to us today.

This advanced project is known today as "The Trigram Project." The sages in the temple were attempting to further the state of oracular archeocomputing by centuries in one fell swoop. However, powerful conceptual and practical limitations intervened to make this impossible in the end, much to the loss of generations since their time. It is my contention that they very nearly created the binary number system over two millennia before Leibniz officially "discovered" it in the 17th century.

Hints within the Digrams and Trigrams

Common wisdom has it that the numerical equivalents of 6 through 9 were obtained by adding the 3 lines of the trigrams, as in algorithm 1 above. However, this explains nothing as to why the digrams are also mapped onto 6 to 9. If you add the values of the lines in a digram, you get a number from 4 to 6. Further, both static yin and static yang have a value of 5. It would be very inauspicious for your oracular operations to destroy the distinction of yin and yang. The digrams use some other method besides simple addition to come up with the equivalents of 6 to 9. As for the emblematic lines, any 4 consecutive numbers would serve as a suitable coding; why start at 6?

Judging from the archeological record recently unearthed, some unknown up-and-coming programmer was responsible for developing a new algorithm for evaluating the value of a digram. This new algorithm is as follows:

Algorithm 2:
value = 2*lowerline + 1*upperline;
As the reader can verify, this algorithm correctly calculates the values of the digrams (as given by the 6-7-8-9) without loss of positional information! Those coefficients of 1 and 2 in the formula preserve the order information, the first step towards a binary notation. This raises the important question: which came first in the 6-7-8-9, the digrams or the trigrams?
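For the reader who would rather let a machine do the verifying, the sketch below lists all four digrams under plain addition and under Algorithm 2. The function names are hypothetical.

```python
from itertools import product

YIN, YANG = 2, 3

def digram_by_addition(lower, upper):
    # Plain addition: the two mixed digrams collide at 5.
    return lower + upper

def digram_positional(lower, upper):
    # Algorithm 2: the lower line carries twice the weight of the upper line.
    return 2 * lower + upper

for lower, upper in product((YIN, YANG), repeat=2):
    print(lower, upper, digram_by_addition(lower, upper), digram_positional(lower, upper))
```

Addition yields 4, 5, 5, 6, while the weighted form yields 6, 7, 8, 9, one distinct value per digram.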

Needless to say, a discovery of this magnitude caused quite a stir in the temple. Before long, the lead programmers caught wind of this insight and started wondering if the same approach could be used with the trigrams and perhaps even the hexagrams as well. Thus was born the Trigram Project.

A very tattered parchment of the 1.0 version of the design document for the project has been unearthed and mostly translated. Needless to say, it was a very thin document, full of mostly "blue sky" speculations, more a germ of an idea than a working system. However, it does contain the seminal algorithm that determined the course of the project, a new formula for calculating the value of a trigram using this positional notation. After much work, it was found that

Algorithm 3:
value = 4*lowerline + 2*middleline + 1*upperline;
mapped the trigrams onto the numbers 14 to 21 without loss of information (8 trigrams, 8 values). This coding is summarized in Table 2. The flurry of revisions of this document, each fatter than the last, that followed in quick succession, is testimony to the unsolved problems that were encountered.
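The claim that Algorithm 3 loses no information can be checked by enumerating all eight trigrams. This is a minimal sketch; the name trigram_value is hypothetical, and each trigram is written as (lower, middle, upper).

```python
from itertools import product

YIN, YANG = 2, 3

def trigram_value(lower, middle, upper):
    # Algorithm 3: weights 4, 2, 1 from the lowest line upward.
    return 4 * lower + 2 * middle + 1 * upper

values = {lines: trigram_value(*lines) for lines in product((YIN, YANG), repeat=3)}
for lines, value in sorted(values.items(), key=lambda item: item[1]):
    print(lines, value)
```

Eight trigrams go in and eight different values from 14 to 21 come out, so the order of the lines can always be recovered from the value.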

"My Kingdom for a Nothing"

For modern computer scientists, it seems extremely odd that yin and yang are represented by 2 and 3 instead of the 0 and 1 we use today. It is hard for us to remember that numbers are synonymous with philosophical concepts in old cultures, and that the level of development of ancient mathematics was limited by how enlightened the philosophy was. The highest concept (which is barely a concept at all) for the early Chinese was the Tao or undifferentiated unity, which they represented by the number one. This left the next few numbers, 2 and 3, for the ideas of yin and yang. What they didn't have was a notion of nothing, emptiness, the void or sunya. The idea of nothing didn't occur for many centuries until the Hindus and Buddhists conceived of it around the 6th century B.C. With it came an equally heretical idea, mathematical zero, which totally revolutionized mathematics from that moment to the present. Since the Chinese programmers had no way of imagining such "nothings", they stuck with their more concrete 2's and 3's. Needless to say, this conceptual blind spot made life pretty difficult for the Trigram Project. Notice that if you use yin=0 and yang=1 in algorithm 3, the binary numbers 0 to 7 pop right out. (In fact, the trigram value of algorithm 3 equals the binary value + 14, where 14 = 2*(4+2+1).) The entire history of computing on the face of the planet could have been radically different if only someone on the temple staff had been able to make a clear-cut case for zero. It simply eluded them philosophically.
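The offset of 14 is easy to confirm with a small sketch that evaluates the same weighted sum twice, once with yin=0, yang=1 and once with yin=2, yang=3. The function name weighted is hypothetical.

```python
from itertools import product

def weighted(lower, middle, upper):
    # Same weights as algorithm 3.
    return 4 * lower + 2 * middle + upper

for bits in product((0, 1), repeat=3):
    binary_value = weighted(*bits)                    # yin=0, yang=1
    trigram_value = weighted(*(b + 2 for b in bits))  # yin=2, yang=3
    # Shifting every line by 2 shifts the total by 2*(4+2+1) = 14.
    assert trigram_value == binary_value + 14
    print(bits, binary_value, trigram_value)
```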

Perhaps this explains the poignant and possibly facetious lament from one of the old sages, found on a scrap of parchment: "My Kingdom for a Nothing."

The Scandal of the Transvestite Trigrams

Work proceeded quickly on the design, until the next snag was encountered (around version 1.7). It was noticed that the sons were numbers 15, 16 and 18, skipping over 17. Similarly, the daughters were 17, 19 and 20, omitting 18. Symbolically, it seemed rather scandalous that the first son was hanging out with two of the daughters; it was even more unseemly to have the first daughter in the company of the younger sons. Numerous "solutions" for this problem were proposed, none very convincing. This may explain the cryptic comment written on the cover sheet of one copy of the design document: "Something wrong with number one son..." Some historians have mused this was a reference to a family problem at home, but it now seems clear it refers to the "effeminate" nature of the first son in the new coding.

A further problem with these misplaced trigrams became apparent during the attempt to cleanly map the trigrams (14 to 21) onto the digrams (6 to 9). The designers were hampered by the primitive nature of computing used by the average soothsayer. Rather than retrain every village idiot with a handful of yarrow sticks into a new, advanced paradigm, they tried to build on the simple operations (like counting and adding) that the average operator would be aware of already. Without some higher math, those gaps in the numbers for the sons and daughters were impossible to work with.

The solution that was chosen was in equal parts clever and a cop-out. The sages decided to abandon the strict binary pattern of the trigrams in order to group the sons and daughters together. This involved switching the first son and daughter in the line-up, switching the values of 17 and 18 around from their natural placement. This complicated the trigram algorithm somewhat:

Algorithm 4:
value = 4*lowerline + 2*middleline + 1*upperline;
if (value == 17) value = 18;
else if (value == 18) value = 17;
It's a small bit of coding and it makes subsequent steps easier to program, but this compromise was the first intellectual error that led them away from the binary vision. This fudging around to clarify the genders of the first son and daughter has led some modern wags to call them "transvestite trigrams."
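The effect of the exchange can be seen by sorting the family members by their modified values. In this sketch a son is taken to be a trigram with exactly one yang line and a daughter one with exactly one yin line; the function name modified_trigram_value is hypothetical.

```python
from itertools import product

YIN, YANG = 2, 3

def modified_trigram_value(lower, middle, upper):
    """Algorithm 4: positional value with 17 and 18 exchanged."""
    value = 4 * lower + 2 * middle + 1 * upper
    if value == 17:
        value = 18
    elif value == 18:
        value = 17
    return value

sons, daughters = [], []
for lines in product((YIN, YANG), repeat=3):
    if lines.count(YANG) == 1:    # one yang line among two yins: a son
        sons.append(modified_trigram_value(*lines))
    elif lines.count(YIN) == 1:   # one yin line among two yangs: a daughter
        daughters.append(modified_trigram_value(*lines))

print(sorted(sons))       # [15, 16, 17]
print(sorted(daughters))  # [18, 19, 20]
```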

Mapping Trigrams to Digrams

The next hurdle was to settle on a clear-cut method for associating the trigrams with the digrams. The big problem is how to shoehorn 8 trigrams into 4 digrams without violating any social customs or philosophical dictates, and without requiring any difficult math. A modern student, when faced with putting 8 widgets into 4 holes, would simply slip 2 widgets into each hole. This could be accomplished by:

Algorithm 5:
digram = trigram/2 - 1;
where integer division (no remainders or fractions) is used. Socially, this simply would not work, since it pairs the third son (15) with the mother (14) and the third daughter (20) with the father (21). This was such a violation of social custom, where birth order conveys privilege, that no programmer, sage or entry-level, could see any virtue in this algorithm. It was better to leave the parents in a class by themselves, and treat the sons or daughters as a group, than to undermine the family order like this. A new algorithm, which curiously never seemed to occur to the designers, is as follows:

Algorithm 6:
digram = (trigram/3) + 2;
where integer division is used, again. Note that this method requires the transvestite trigrams to be exchanged (grouping the genders together) for the formula to work. Perhaps the act of dividing by 3 was too imposing for the old sages or else they feared the creation of funny fractions (not common knowledge in those days among the often unskilled users of the Q-Ching), so unconsciously they never considered this solution.
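The two mappings can be compared side by side with a short sketch. The function names are hypothetical; the input to Algorithm 5 is read as the unmodified trigram value and the input to Algorithm 6 as the modified (Algorithm 4) value, both running from 14 to 21.

```python
def digram_by_halving(trigram):
    # Algorithm 5 with integer division.
    return trigram // 2 - 1

def digram_by_thirds(trigram):
    # Algorithm 6 with integer division.
    return trigram // 3 + 2

for trigram in range(14, 22):
    print(trigram, digram_by_halving(trigram), digram_by_thirds(trigram))
```

Halving groups the values in pairs (14-15, 16-17, 18-19, 20-21), while dividing by 3 leaves 14 and 21 alone and groups 15-17 and 18-20 together.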

In an attempt to avoid any fractions, difficult divisions or other awkward mathematics, version 1.12 of the design settled on an approximate integer solution that only required adding and a bit of memorization. The digrams span a length of 3 (= 9-6), while the trigrams span a length of 7 (= 21-14). The fraction 7/3 and its multiples didn't look very friendly, so the sages looked for the nearest integers to these values. Instead of using 0, 2.33, 4.67 and 7, they rounded off to 0, 2, 5 and 7. When you add these lengths to the lowest trigram value of 14, you get the series 14, 16, 19 and 21. The mother is, of course, equal to 14, while the father is 21. Further, 16 is right in the middle of the sons and 19 is right in the middle of the daughters. This was sometimes called "the 2-3-2 solution": start with 14, add 2, add 3, add 2 again. Needless to say, there is no simple integer formula to express this method, so it really boiled down in practice to just memorizing the result. Also, "2-3-2" resembles the trigram of the second son, known as the Abyss. This didn't bode well for eventual acceptance of this proposal.
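The arithmetic behind the 2-3-2 solution can be reproduced as a sketch: scale each step along the digram range by 7/3, round to the nearest integer, and add the result to 14. The function name anchor_values is hypothetical, and exact fractions are used only to keep the rounding free of floating-point noise.

```python
from fractions import Fraction

def anchor_values():
    """Trigram value paired with each digram value under the 2-3-2 solution."""
    lowest_trigram, trigram_span = 14, 21 - 14
    lowest_digram, digram_span = 6, 9 - 6
    anchors = {}
    for digram in (6, 7, 8, 9):
        # Exact offsets are 0, 7/3, 14/3 and 7.
        exact = Fraction(trigram_span, digram_span) * (digram - lowest_digram)
        anchors[digram] = lowest_trigram + round(exact)
    return anchors

print(anchor_values())  # {6: 14, 7: 16, 8: 19, 9: 21}
```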

In theory, this problem is nothing more than finding the equation of a line in the plane; algorithm 6 is a straightforward solution. In practice, due to the conceptual limitations they labored under, no simple solution could be found. The 2-3-2 solution is pretty ad hoc. Growing opposition to this heretical positional notation was becoming evident. Eventually, the entire issue was avoided by adopting the order-destroying algorithm:

Algorithm 7:
value = lowerline + middleline + upperline;
for the trigrams. This maps both trigrams and digrams (using algorithm 2) onto the range from 6 to 9, making it "obvious" how to pair off these symbols. Simple addition sent the binary notation to the back burner. Rather than building on a solid mathematical foundation, the Trigram Project was settling on an arbitrary compromise solution. The 6-7-8-9 was emerging.

Bringing Order to the Coins

One small working group of programmers was intrigued by the possibilities of utilizing the order of the 3 changes within each line of the hexagram. Since all 8 orders of 2's and 3's are possible when throwing the yarrow sticks (though addition destroys that order), they wondered how to take advantage of this fact. Under the mantra "All lines are trigrams", they tried to preserve all the information generated by manipulating the yarrow sticks. It made no sense to this group that one would go to so much trouble to glean information from heaven and then blindly throw away almost half of it.

The main sticking point was the alternative method of consulting the oracle, using 3 coins. All the Chinese coins being used were identical or "anonymous": it's impossible to tell whether it's coin number 1, 2 or 3 that is heads when a single head and two tails shows up. All sons look the same, as do the daughters. It was suggested that the numbers 1, 2 and 4 be etched on the coins to distinguish them (and act as a mnemonic to help the operator remember algorithm 3). This entire movement was squashed when one of the emperor's ministers issued a decree that new coinage would not be issued for oracular use and that anyone caught defacing the emperor's coinage would be severely punished. Without any way to reconcile the extended yarrow stick method with this restricted 3 coin method, the working group quietly disbanded.

The Incident of the Hybrid Digrams

Even as the ambitious Trigram Project was settling into the more modest 6-7-8-9 paradigm, the spirit of intellectual adventure it was generating was spilling over into the interpretive realm as well. As the programmers became enthralled with playing with hexagrams as pure mental symbols, divorced from generations of pious reverence, new avenues of interpreting the oracle started showing up on scraps of parchment and blackboards throughout the temple. In particular, the techniques of data compression mentioned earlier were rapidly gaining popularity.

Inspired by this climate, one precocious initiate started converting hexagrams into digrams by changing the upper and lower trigrams into upper and lower emblematic lines (forming a digram). At times, he even went a step further, transforming the digram into a single emblematic line. He ran into one of those "Zen two-by-four on the head" moments when he started reducing Hx. 3, Golden Code in this way. At step one, he produced the bizarre-looking digram at the left, which left him stunned for quite some time. At step two, his mind hit a brick wall. Nothing in the common paradigm of equating emblematic lines with digrams gave him the slightest clue how to interpret this paradox of a symbol.

News of this discovery spread quickly, jumping many cube walls before the afternoon tea break: a new kind of line had been invented! Over the course of the next few days (judging from a huge stack of memos with hasty calligraphy from this period), much of the temple staff was caught up in the excitement, trying to produce a theory to make sense of these "hybrid digrams" (that's the most polite translation we can think of). The first problem was one of sheer magnitude. One of the sages summarized the situation in his report:

Beginning with the basic lines, yin and yang, two and three, one line is placed upon another. This forms the four digrams, which are the same as the emblematic lines. The line nearest heaven describes the state of the new line at the commencement of the situation, whether it is yin or yang. The line nearest earth determines the state of the new line at the conclusion, whether it moves or stands still. Because they are moving and standing still, yin and yang, the number of the emblematic lines is four. Similarly, if one emblematic line is placed upon another, forming a "hybrid digram", there are many more of them. With careful study, we calculated there to be 16 of them.

It occurred to one of the more daring priests that this process could be repeated another time. If one of the 16 new lines is placed upon another, even stranger digrams are created, leading to a vast multitude of strange kinds of lines. After heated discussion, we agreed their number to be 256.

None among us dared to repeat this process yet again. In only a few days, the number of lines in the oracle had grown greater than the number of the 10,000 things under heaven! Such abundance, we are not prepared to contemplate. Yet who can say that these bizarre creations of our minds, these many lines, are any less real than yin and yang?

In fact, they had an infinite loop that was growing exponentially on their hands. They simply lacked the mathematical tools to grasp the enormity of the situation past a few repetitions of the loop.
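The counting the sages did by hand can be sketched as a loop in which every ordered pair of existing line types becomes a new line type, so the count is squared on each repetition. The function name line_kinds is hypothetical.

```python
def line_kinds(rounds):
    """Number of distinct line types after stacking pairs `rounds` times."""
    kinds = 4                  # the four emblematic lines
    for _ in range(rounds):
        kinds = kinds * kinds  # every ordered pair of existing lines is a new line
    return kinds

print([line_kinds(r) for r in range(4)])  # [4, 16, 256, 65536]
```

The repetition that nobody dared to carry out would have given 65,536 kinds of line, comfortably more than the 10,000 things under heaven.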

The biggest problem with these hybrid digrams and "hyperemblematic lines", however, was one of interpretation. Nobody could agree on their oracular meaning. Such a line is extremely slippery, no matter how much you've meditated. Take a typical hybrid digram, with moving lines in both positions. The initial state, the upper line, is indeterminate, since a moving line can be either yin or yang, depending on when you look at it. (This resembles the modern theory of quantum mechanics in this regard.) But the lower line, which specifies how the digram moves, is also indeterminate. Or maybe it describes 4 separate kinds of "motion" that are possible. All this confusion in the digram makes for a single line with extremely unruly characteristics. Needless to say, the records indicate that programmers would get together in great numbers to discuss the merits of various interpretation schemes. Often the discussions would go on for days, ending in either violent arguments or mass enlightenment experiences. Either way, vast amounts of productivity were lost, which eventually came to the attention of the ministers overseeing the temple. An edict was drafted and distributed ordering the programmers to "quit this useless and unproductive philosophizing" and get back to work. The net effect was simply to drive the discussions underground, away from the temple and into the beer halls.

This incident served to point out that even the 6-7-8-9 paradigm had its unresolved problems and that a certain amount of discretion and "good taste" was required when consulting the oracle. Such is the mark of a Wise One. In particular, reducing the trigrams for heaven or earth to a single moving line, not to mention the whole hybrid hierarchy, was branded an "illegal operation" (similar to our ban on dividing by zero). As the memo from Lao Tse Kaud to all the staff emphasized:

"As for these many diversions, do not even go there! There are too many dragons in those caves. Even a Wise One may lose his footing in the dark."
With that advice, many pathological data compression paradoxes were simply swept under the rug.

(The modern theory of hyperemblematics is still an advanced topic within sacred mathematics. The details of such a theory are simply beyond the scope of this article.)

The Deadline

Needless to say, the emperor's ministers were not impressed by the progress of the Trigram Project. A ministerial edict, posted throughout the temple, appeared to remind the staff that the rituals for the emperor's coronation anniversary were only two months away. Referring to Lao Tse Kaud by name, the minister warned, "And you'd better be ready to throw those damn yarrow sticks for the emperor, or whatever it is you people do." Sensing the inauspiciousness of the situation, the programmers decided to quickly wrap up the project and glean what gems they could from their creative work before having to return to more mundane oracular duties. Thus was the 6-7-8-9 coding adopted as a standard in the final draft (version 1.22) of the design document. This standard also formalized the data compression techniques, supplemented by a long list of "illegal operations" that became known as the "Food for the Dragon List". This standard, which became the controlling document for the Q-Ching for centuries, was then promptly filed in a dark corner of the archival library and has been mostly forgotten until modern times.


Despite the usefulness and even brilliance of the 6-7-8-9 coding used in the Q-Ching, it is only a pale reflection of the possibilities that could have emerged from the Oracle Temple. They nearly invented binary numbers, millennia before they appeared in our times, a development that could have revolutionized civilization for generations to come. They produced new and daring speculations on the oracle itself, which we are only now beginning to uncover. If they had been working with a more robust mathematical and philosophical foundation, the Trigram Project might have succeeded. For lack of the proper tools, an amazing opportunity was lost.

Table 2. Evolution of the Trigram Project paradigm.
Trigram         Binary             Trigram Value      Modified Value  6-7-8-9 Value
Algorithm:      Alg. 3 using 0, 1  Alg. 3 using 2, 3  Alg. 4          Alg. 7
first daughter  3                  17                 18 (*)          8
first son       4                  18                 17 (*)          7

* The two "transvestite trigrams" that are switched around.
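Since every column of Table 2 follows mechanically from the algorithm named in its header, the rows for all eight trigrams can be regenerated with a short sketch. The function name table2_rows is hypothetical, and listing the rows in binary order is an assumption about the table's layout.

```python
from itertools import product

def table2_rows():
    """Rows of (binary, trigram value, modified value, 6-7-8-9 value)."""
    rows = []
    for bits in product((0, 1), repeat=3):             # (lower, middle, upper), yin=0, yang=1
        lower, middle, upper = (b + 2 for b in bits)   # same lines with yin=2, yang=3
        binary = 4 * bits[0] + 2 * bits[1] + bits[2]   # Alg. 3 using 0, 1
        value = 4 * lower + 2 * middle + upper         # Alg. 3 using 2, 3
        modified = {17: 18, 18: 17}.get(value, value)  # Alg. 4
        coded = lower + middle + upper                 # Alg. 7
        rows.append((binary, value, modified, coded))
    return rows

for row in table2_rows():
    print(*row)
```

The rows for binary 3 and 4 come out as 3 17 18 8 and 4 18 17 7, the two exchanged trigrams.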