The Famous Cantor Ternary Set

Many mathematicians consider the Mandelbrot set to be the most complicated and interesting set in all of mathematics. In contrast, the Cantor set is deceptively simple, but it has properties that are just as counter-intuitive and astonishing as some of the properties of the Mandelbrot set. Henry John Stephen Smith discovered the Cantor set in 1874, and Cantor introduced it to the world in 1883. As mentioned in a previous post, Cantor’s work helped lay the foundation upon which much of modern mathematical analysis rests. After studying some of Cantor’s mathematics in grad school, I gained a deeper understanding of fundamental concepts in calculus. When you think that you understand a concept fairly well and later learn that your understanding is a bit lacking, it can be humbling. If you have not read my first two Cantor posts, Infinity Does Not Necessarily Equal Infinity, and Why There Are Infinitely Many Different Levels of Infinity, I strongly suggest that you read them before continuing. The concepts presented in this post heavily depend on an understanding of the concepts presented in those posts. As you continue to read this post, please slow down and read carefully. My intention is to be as clear and precise as possible with very limited use of the symbols of abstract mathematics. Cantor is very deep, and therefore this post will not be easy reading for many. Don’t get frustrated and give up if you don’t get it on the first or second read. If you are persistent, you will get it. You will find that the struggle was well worth your time and effort.

As I see it, part of the difficulty in understanding Cantor is that there is a vast difference between mathematical objects and physical objects. There is a theorem in mathematics that tells us that no matter how close one real number is to another real number, say 10^(-100) apart, there are infinitely many rational numbers between them. Mathematicians elegantly describe this theorem as follows: The rational numbers are dense in the real numbers. This theorem is certainly not obvious to a normal person, and it seems to be counter-intuitive nonsense. We forget that mathematical points/numbers and curves have no thickness. The set of real numbers is far more complicated and mysterious than we realize. A pure mathematician can cut any interval on a real number line into infinitely many pieces without blinking an eye, but he or she can only cut a piece of lumber into a finite number of pieces. If two atoms are sufficiently close to each other, there is no room to fit another atom between them because atoms have a thickness property. As far as I know, no physical object contains infinitely many atoms. The speed of everything is finite. Pure mathematical knowledge is obtained through a mental activity of logical reasoning from a set of postulates or axioms. Knowledge in physics is obtained through a process of observation, inductive logic, deductive logic and experimentation with physical objects. To say that two real numbers are “just a little bit apart” is imprecise nonsense to a pure mathematician. For a pure mathematician, two reals are equal or they are not equal, but never just a little bit apart. A math professor once told me: “A woman is pregnant or she is not pregnant, but she is never just a little bit pregnant.” Cantor’s work reflects the mind of a pure mathematician who deals with mathematical objects that exist only in the mind of man and God.
If God directly communicates math concepts to humans, as Cantor believed, God must be a jokester who is roaring with laughter as He watches us struggle to understand Cantor. If you understand and accept Cantor’s definition of equal cardinality of two sets, his counter-intuitive and absurd theorems are not so counter-intuitive and absurd after all. My three Cantor posts reflect my struggle to understand Cantor.

The purpose of this post is to give the reader a description of the Cantor set and some of its basic properties which seem counter-intuitive, preposterous, absurd, and astonishing. I will avoid the language of formal abstract mathematics as much as possible, and provide numerous explicit examples that illustrate the concepts presented.  Keep in mind the following concepts, definitions, and facts as the construction of the Cantor set is explained.

  • Open interval (p, q) equals all real numbers between p and q, but not including p or q.
  • Closed interval [p, q] equals all real numbers between p and q, and including both p and q.
  • Two sets are disjoint if and only if the intersection of the sets equals the empty set.
  • Numbers expressed in binary or base 2 format have only the digits 0 or 1.
  • Numbers expressed in ternary or base 3 format have only the digits 0, 1, or 2.
  • Numbers expressed in decimal or base 10 format have only digits 0 through 9.
  • The terms “all”, “each”, and “every” mean without exception.
  • “If and only if” means whenever the first statement is true, the second statement is true and vice versa.
  • A real number is a Cantor number if and only if it’s in the Cantor set.

To get started, let’s see how the Cantor ternary set is constructed. Like the Mandelbrot set, the Cantor ternary set is a fractal because it’s created by an infinite iterative procedure that determines what numbers are in the set. Numbers in the Mandelbrot set are complex numbers (a + bi) in a region of the complex coordinate plane. All Cantor numbers are real numbers contained in [0, 1]. Like the Mandelbrot set, construction of the Cantor set can only take place in the human intellect.

Construction of the Cantor set starts with the closed interval [0, 1] on the real number line with nothing removed. This interval is represented by the top solid bar in the graph below. At each step in the construction, the open middle 1/3 of each closed interval is removed to produce a new set of closed intervals. The Cantor set equals the set of points/numbers remaining in the closed interval [0, 1] after infinitely many iterations. The graph below depicts the closed intervals remaining after each of the first six iterations in the construction. Every horizontal line of the graph depicts the union of a set of closed intervals, and this union of closed intervals contains all of the Cantor numbers. It might appear that the Cantor set is empty, but you will later see that there are as many numbers in the Cantor set as there are real numbers in [0,1]. We end up with the same number of points we started with! In other words, the cardinal number of the Cantor set equals the cardinal number of the set of the real numbers in [0,1]. What remains in [0, 1] is fractal dust. So the Cantor set is nothing more than fractal dust that has the same cardinality as the set of reals in [0, 1].  (That is absurd! How can that be possible?)

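[Figure: the closed intervals remaining after each of the first six iterations in the construction of the Cantor set]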

The text box below lists the closed intervals remaining after each of the first 5 iterations in the construction of the Cantor set. Note that the endpoints of the closed intervals are Cantor numbers, and the number of endpoints doubles on each iteration. After the 5th iteration, we have found at least 64 Cantor numbers. Later you will see that there are numbers between the endpoints of closed intervals that are also in the Cantor set. Furthermore, all Cantor numbers are contained somewhere inside the union of the disjoint closed intervals at every step in the construction. Example: 12/13 is contained in [8/9, 9/9], 12/13 is in the Cantor set, and 12/13 is not an endpoint of any of the closed intervals. Later you will see why 12/13 is a Cantor number. As an exercise, you could explicitly list all 64 closed intervals remaining after the 6th iteration. This may help you better understand how the Cantor set is constructed.

[Text box: the closed intervals remaining after each of the first 5 iterations]
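The construction described above can be sketched in a few lines of Python. This is only an illustration, not the author's original text box; the function name cantor_intervals is a hypothetical choice, and exact fractions are used so the endpoints print as 1/3, 2/9, and so on.

```python
from fractions import Fraction

def cantor_intervals(n):
    """Return the closed intervals left after n middle-third removals."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        next_intervals = []
        for left, right in intervals:
            third = (right - left) / 3
            # Keep the left and right closed thirds; drop the open middle third.
            next_intervals.append((left, left + third))
            next_intervals.append((right - third, right))
        intervals = next_intervals
    return intervals

# Print the intervals for the first few iterations.
for step in range(4):
    print(step, [f"[{a}, {b}]" for a, b in cantor_intervals(step)])
```

Calling cantor_intervals(6) produces the list of closed intervals suggested as an exercise above.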

Before I go any further, I need to discuss a theorem that tells us how to calculate the value of an infinite geometric sum. In this post, the infinite geometric sum formula is used to calculate the value of a binary or ternary expansion of a number with infinitely many digits. Infinite geometric sums are calculated as follows:

  • Let a equal the value of first term of the sum.
  • Let r equal the constant multiplier of the terms where |r| < 1.
  • Sum = a + ar + ar^2 + ar^3 + . . . = a(1/(1 – r))

The text box below shows how to apply the infinite geometric sum formula to calculate the binary or ternary expansion of a real number that has infinitely many digits.

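[Text box: applying the infinite geometric sum formula to binary and ternary expansions with infinitely many digits]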
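As a rough sketch of how the formula applies to a repeating expansion, the Python below treats one copy of the repeating block as the first term a and the shift between copies as the multiplier r. The function name repeating_expansion_value and its parameters are assumptions made for this illustration.

```python
from fractions import Fraction

def repeating_expansion_value(block, base):
    """Value of (0.block block block . . .) in the given base, as an exact fraction."""
    k = len(block)
    # First term a: the value of one copy of the block, e.g. (0.220) in base 3.
    a = sum(Fraction(d, base ** (i + 1)) for i, d in enumerate(block))
    # Each later copy sits k places further right, so r = 1 / base^k.
    r = Fraction(1, base ** k)
    # Infinite geometric sum: a + ar + ar^2 + . . . = a(1/(1 - r)).
    return a / (1 - r)

print(repeating_expansion_value([2, 2, 0], 3))  # ternary 0.220220220...
print(repeating_expansion_value([1, 0], 2))     # binary 0.101010...
print(repeating_expansion_value([1], 3))        # ternary 0.111...
```

For the ternary block 220, a = 2/3 + 2/9 = 8/9 and r = 1/27, so the sum is (8/9)(27/26) = 12/13.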

We can now see why the sum of the lengths of all open intervals removed from the interval [0, 1] equals 1. This is astonishing and leads to counter-intuitive conclusions. The text box below lists all of the disjoint open intervals removed from [0, 1] in the first 5 iterations. None of these open intervals contains a Cantor number. The sum of the lengths of all removed open intervals = 1/3 + 2/9 + 4/27 + 8/81 + . . . = (1/3)(1 / (1 – 2/3)) = 1. The length of the interval [0, 1] = 1, and the total of the lengths of all of the infinitely many disjoint open intervals removed equals 1. Therefore after infinitely many iterations, the sum of the lengths of the disjoint closed intervals that contain the Cantor set must equal 0. All that remains is fractal dust. Mathematicians say that the Cantor set has Lebesgue measure zero. Later you will see that the cardinal number of [0, 1] equals the cardinal number of the Cantor set. (How is it possible that the cardinal number of fractal dust equals the cardinal number of all reals in [0, 1]? That is absurd!)

[Text box: the disjoint open intervals removed from [0, 1] in the first 5 iterations]
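The partial sums of the removed lengths can be checked numerically. The sketch below is an illustration only (the name removed_length is hypothetical); it adds up the first n terms with a = 1/3 and r = 2/3 and shows the total creeping toward 1.

```python
from fractions import Fraction

def removed_length(n):
    """Total length of the open intervals removed in the first n iterations."""
    total = Fraction(0)
    term = Fraction(1, 3)        # first term a = 1/3
    for _ in range(n):
        total += term
        term *= Fraction(2, 3)   # constant multiplier r = 2/3
    return total

for n in (1, 5, 10, 50):
    removed = removed_length(n)
    print(n, float(removed), "remaining:", float(1 - removed))
```

After n iterations the length that remains is (2/3)^n, which shrinks toward 0 as n grows.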

It turns out that we can easily determine whether or not a real number x is a Cantor number if we know the ternary expansion of x. An important theorem about Cantor numbers states that every real number x in [0, 1] is a Cantor number if and only if there exists a ternary expansion of x that uses only digits 0 and 2. The proof of Cantor’s theorem hinges on this theorem. We will accept this theorem without a proof. The text box below shows the ternary expansion of various rational numbers in the Cantor set. Notice that some Cantor numbers like 1/27 and 1/3 have two equivalent ternary expansions. For example, 1/3 = (0.1)_3 = (0.0222 . . .)_3, and only the second expansion avoids the digit 1. What’s important to understand is that the ternary expansion of all Cantor numbers, rational or irrational, can be uniquely expressed using only ternary digits 0 and 2. The ternary expansion of 1/2 = (0.111 . . .)_3, and 1/2 is not in the Cantor set. It’s not important to know how to convert an arbitrary number to ternary format. Note that the ternary expansion of 12/13 = (0.220220220 . . .)_3. If you are bored and want to add a little spice to your life, find the first 100 digits of the ternary expansion of 1/π or 1/e.

[Text box: ternary expansions of various rational numbers in the Cantor set]
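For rational numbers, the 0-and-2 digit test can be carried out mechanically. The sketch below is one possible approach, not something taken from the original text boxes: it peels off ternary digits one at a time, always choosing digit 0 or 2 when that is possible, and stops when the remainder repeats. The name is_cantor_number is hypothetical.

```python
from fractions import Fraction

def is_cantor_number(x):
    """Decide whether a rational x has a ternary expansion using only 0 and 2."""
    x = Fraction(x)
    if not 0 <= x <= 1:
        return False
    seen = set()
    # A rational has only finitely many possible remainders, so this loop ends.
    while x not in seen:
        seen.add(x)
        if x <= Fraction(1, 3):
            x = 3 * x          # next digit is 0
        elif x >= Fraction(2, 3):
            x = 3 * x - 2      # next digit is 2
        else:
            return False       # x lies in an open middle third: digit 1 is forced
    return True

for value in (Fraction(1, 3), Fraction(1, 27), Fraction(12, 13), Fraction(1, 2)):
    print(value, is_cantor_number(value))
```

Allowing a remainder of exactly 1 (all later digits equal to 2) is what lets endpoints such as 1/3 = (0.0222 . . .)_3 pass the test.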

Before we can get to the proof of Cantor’s theorem, we need to understand one more important idea. Every real number, rational or irrational, in the closed interval [0, 1] can be expressed as a binary coded number of the form (0.b1b2b3 . . .)_2 where each binary digit bi equals 0 or 1. Some examples: 0 = (0.0)_2, 1 = 1/2 + 1/4 + 1/8 + 1/16 + . . . = (0.1111 . . . )_2, 2/3 = (0.10101010 . . . )_2 and 0.875 = (0.111)_2. What’s important to understand is that there is a binary expansion of the form just described for every real number in [0, 1]. A few numbers have two such expansions, for example 1/2 = (0.1)_2 = (0.0111 . . .)_2, but only countably many reals are doubled up in this way, which is too few to change the cardinal number of the set of binary expansions. How to find the binary expansion of an arbitrary number is not important; we just need to know that it can be done.

Cantor’s theorem states that the cardinal number of the set of Cantor numbers equals the cardinal number of the set of reals in [0, 1]. In other words, the number of Cantor numbers equals the number of reals in [0, 1]. A proof of Cantor’s remarkable theorem can now be given and it goes something like this:

  • Let C equal the set of ternary expansions, using only the digits 0 and 2, of all reals in [0, 1]. Therefore C equals the set of Cantor numbers and C is a proper subset of the reals in [0, 1]. C is the fractal dust that is contained in the closed interval [0, 1].
  • Let R equal the set of all binary expansions of the reals in [0, 1]. Therefore R equals the set of all reals in [0,1].
  • Construct a one-to-one function f(x) with domain C and range R that matches all elements of C with all elements of R as follows: (This construction is so simple.) Let x equal any element of C. If the nth ternary digit of x = 0, then set the nth binary digit of f(x) = 0. If the nth ternary digit of x = 2, then set the nth binary digit of f(x) = 1.

Examples: f((0.20022202)_3) = (0.10011101)_2 and f((0.020220222)_3) = (0.010110111)_2

  • For every element y in R, there is an element x in C such that f(x) = y.

Example: If y = (0.11000101)_2, then x = (0.22000202)_3.

  • From the results discussed above and the definition of function f, Cantor’s theorem easily follows. Since the cardinal number of the reals in [0, 1] equals the cardinal number of the set of all real numbers, it follows that the cardinal number of the Cantor set equals the cardinal number of the set of all real numbers.
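The digit-matching rule in the proof can be tried out on finite digit strings. The sketch below is only an illustration: it works on the digits after the point, treats a finite string as if it were followed by zeros forever, and uses the hypothetical helper name value to turn a digit string into an exact fraction.

```python
from fractions import Fraction

def f(ternary_digits):
    """Apply the matching rule: ternary digit 0 -> binary 0, ternary digit 2 -> binary 1."""
    if set(ternary_digits) - {"0", "2"}:
        raise ValueError("only ternary digits 0 and 2 are allowed")
    return ternary_digits.replace("2", "1")

def value(digits, base):
    """Exact value of (0.digits) in the given base."""
    return sum(Fraction(int(d), base ** (i + 1)) for i, d in enumerate(digits))

for x in ("20022202", "020220222"):
    y = f(x)
    print(f"(0.{x})_3 = {value(x, 3)}  ->  (0.{y})_2 = {value(y, 2)}")
```

Running the rule backwards (binary 0 to ternary 0, binary 1 to ternary 2) recovers x from y, which is the inverse matching used in the proof.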

A Couple of Comments:

How to construct the inverse function of f(x) is obvious. I don’t know what the graph of f(x) looks like, and I really don’t care. It’s only important to know that f(x) is a one-to-one function that pairwise maps set C to set R. Perhaps it’s a bit too dramatic and somewhat misleading to say “The number of Cantor numbers equals the number of reals in [0, 1].” It’s probably better to just say “There is a one-to-one function that pairwise matches the set of Cantor numbers with the real numbers in [0, 1].” On the other hand, how else can we compare the number of elements in two sets? If you understand and accept Cantor’s definition of equal cardinality, Cantor’s work makes more sense. Note that the above proof did not use the technique of proof by contradiction.

There is also a diagonal proof of Cantor’s theorem which uses the technique of proof by contradiction. My post Infinity Does Not Necessarily Equal Infinity gives Cantor’s famous diagonal proof, which shows that the cardinality of the set of real numbers is strictly greater than the cardinality of the set of counting numbers. The diagonal proof can be easily modified to show that the cardinality of the Cantor set is strictly greater than the cardinality of the set of counting numbers. This is easily accomplished by just replacing the strings of binary digits with strings of ternary digits consisting of 0 or 2 only. Therefore the cardinality of the Cantor set and the cardinality of the set of real numbers are both strictly greater than the cardinality of the set of counting numbers. The continuum hypothesis states that there is no cardinal number between the cardinal number of the counting numbers and the cardinal number of all real numbers. If we accept the continuum hypothesis, it follows that the cardinality of the Cantor set equals the cardinality of the set of all real numbers; not just the reals in [0, 1].

I will close this post with a short discussion of the Cantor ternary function. Warning! This is not the function that was defined in the proof of Cantor’s theorem above. Basic properties of the ternary function and its graph are shown below. After a student studies the Cantor ternary function in a graduate level math course, he or she gains a deeper understanding of concepts learned in undergraduate level math courses. Functions are no longer just some formula like f(x) = 3x^2 – 2x + 1 or g(x) = 3cos(x) – 5. The first derivative of the ternary function can’t be found by applying the standard differentiation rules because there is no explicit formula for it. Wikipedia has an excellent article on the Cantor function.

[Text box: basic properties of the Cantor ternary function]

[Figure: graph of the Cantor ternary function]
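Although there is no simple formula, the Cantor ternary function can be approximated digit by digit. The sketch below assumes the standard textbook definition (read off ternary digits, stop at the first digit 1, turn each 2 into a binary 1); it is not taken from the text box above, and the name cantor_function and the depth parameter are choices made for this illustration.

```python
from fractions import Fraction

def cantor_function(x, depth=40):
    """Approximate the Cantor ternary function at x in [0, 1] using `depth` digits."""
    x = Fraction(x)
    if not 0 <= x <= 1:
        raise ValueError("x must lie in [0, 1]")
    result = Fraction(0)
    weight = Fraction(1, 2)            # value of the next binary digit
    for _ in range(depth):
        if x < Fraction(1, 3):
            x = 3 * x                  # ternary digit 0 -> binary digit 0
        elif x > Fraction(2, 3):
            result += weight           # ternary digit 2 -> binary digit 1
            x = 3 * x - 2
        else:
            # x is in a closed middle third, where the function is constant.
            return result + weight
        weight /= 2
    return result

for value in (Fraction(1, 3), Fraction(1, 2), Fraction(1, 4), Fraction(7, 9)):
    print(value, float(cantor_function(value)))
```

On every removed middle third the loop exits early with the same value, which is why the graph is flat there even though the function climbs from 0 to 1.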

Why There Are Infinitely Many Different Levels of Infinity

Georg Cantor circa 1870

Because a recent post about the German mathematician Georg Cantor (1845-1918) generated a great deal of interest, I decided to do two more posts about Cantor’s contributions to mathematics. Of course, I find Cantor’s mathematics fascinating, and apparently many readers found the content of that post intellectually stimulating as well. You might think that Cantor’s work amounts to a bunch of clever and interesting mathematical mind games, but this is not the case. His work helped lay the foundation upon which much of modern mathematical analysis rests. If you have not read my first Cantor post, Infinity Does Not Necessarily Equal Infinity, I strongly suggest that you read it before continuing. The concepts presented in this post heavily depend on an understanding of the concepts presented in that post.

The purpose of this post is to show how Cantor proved that there are infinitely many levels of infinity or there are infinitely many different infinite cardinal numbers! The purpose of the next Cantor post will be to give readers a general description of the Cantor ternary set which is counter-intuitive, preposterous, absurd, and astonishing. I will not delve deeply into formal abstract mathematics, because my understanding of Cantor’s work only scratches the surface of his deep mathematics.

To get started, I will do a quick review of some basic definitions and concepts in set theory.

  • A set can be any collection of objects such as numbers, character symbols, cars, people, cats, etc.
  • Set A is a subset of set B if and only if every element of set A is an element of set B.
  • Let A equal any subset of B. A is a proper subset of B if and only if there is an element in B that is not in A.
  • The null or empty set is a set that contains no elements. The symbols { } or Ø denote the empty set.
  • Sets {0}, {Ø}, and {{ }} are not the empty set because each of the three sets contains an element.
  • The null set is a subset of every set and a proper subset of every set except itself.
  • Set A equals set B if and only if the sets are subsets of each other.
  • In set theory and Boolean algebra, the word “or” means “one or the other and possibly both.” In contrast, when a parent uses the word “or” with a child, the parent means “one or the other, but not both.”
  • In set theory and Boolean algebra, the word “and” means “both are in the set” or “both are true.”
  • The union of sets A and B, denoted by A ∪ B, is the set of elements that are in A or B.
  • The intersection of sets A and B, denoted by A ∩ B, is the set of elements that are in A and B.
The text box below uses sets of numbers to illustrate the set definitions above.
[Text box: sets of numbers illustrating the set definitions above]

The power set of set A, denoted by P(A), equals the set of all possible distinct subsets of A. In other words, P(A) is just another set that contains all of the distinct subsets of set A. To get a better idea of what P(A) means, the text box below gives P(A) for different finite sets of counting numbers. Note that if set A has n elements, then P(A) has 2^n elements.

To see why increasing the number of elements in a set by one causes the number of elements in the power set to double, consider how you could go about creating a list of all 32 subsets of the set {1, 2, 3, 4, 5}. The first 16 subsets of {1, 2, 3, 4, 5} are given by the power set of {1, 2, 3, 4}. The other 16 subsets of {1, 2, 3, 4, 5} can be obtained by forming the union of {5} with each of the 16 subsets of {1, 2, 3, 4}. You should later go ahead and list all 32 subsets of {1, 2, 3, 4, 5} and then all 64 subsets of {1, 2, 3, 4, 5, 6}. I’m very serious about this suggestion because it will help you learn to think in a different way and help you better understand the fundamental counting principle, permutations, and combinations. You may find this task tedious and boring, but you will be rewarded with a better understanding of fundamental counting concepts.

[Text box: P(A) for different finite sets of counting numbers]
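The doubling argument above translates directly into code. The following is a minimal sketch (the name power_set is a hypothetical choice): each new element copies every subset found so far and adds the new element to the copy.

```python
def power_set(elements):
    """Build P(A) by doubling: each new element e pairs every old subset with a copy containing e."""
    subsets = [frozenset()]              # P of the empty set holds only the empty set
    for e in elements:
        # Old subsets stay as they are; the new half is each old subset united with {e}.
        subsets = subsets + [s | {e} for s in subsets]
    return subsets

for n in range(1, 7):
    print(n, len(power_set(range(1, n + 1))))

# The subsets of {1, 2, 3}, listed in the order the doubling produces them.
print([sorted(s) for s in power_set([1, 2, 3])])
```

Printing power_set([1, 2, 3, 4, 5]) is a quick way to check a hand-written list of all 32 subsets.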

I can now explain the proof of Cantor’s theorem, which states that the cardinal number of P(A) is strictly greater than the cardinal number of A where A is any finite or infinite set. Cantor’s theorem can be used to show that there are infinitely many different infinite cardinal numbers. Recall from my first Cantor post that the cardinal number of a set equals the number of elements the set contains. Therefore the cardinal number of a googol of water molecules equals 10^100, and the cardinal number of a MLB active roster equals 25. Also recall from my first Cantor post that the symbol for the cardinal number of the counting numbers is ℵ0, the symbol for the cardinal number of the real numbers is ℵ1, and ℵ0 < ℵ1. (Strictly speaking, calling the cardinal number of the real numbers ℵ1 assumes the continuum hypothesis.)

For finite sets, Cantor’s theorem is obvious. If the cardinal number of finite set A equals n, then the cardinal number of P(A) equals 2^n. For infinite sets Cantor’s theorem might seem obvious, but it’s much more difficult to prove. To make Cantor’s proof more comprehensible for infinite sets, I will first give a proof that shows that the cardinal number of P(C) is strictly greater than the cardinal number of C where C equals the set of counting numbers. Like many deep abstract mathematical proofs, Cantor’s proof uses the sophisticated technique of proof by contradiction. For set C, his proof goes something like this:

  • Assume that there is a one-to-one function f with domain C and range P(C) that matches the counting numbers in C with all of the elements of P(C). The text box below shows one of the infinitely many possible ways that we could create a matching rule for f. The order in which domain and range elements are listed makes no difference. The important point is that there exists a one-to-one function f such that the domain of f equals set C and the range of the function, all function values, equals the set P(C).
    [Text box: one possible matching rule for f between the counting numbers and elements of P(C)]
  • Construct a special subset M of the counting numbers as follows: (See the text box above.) Let set M equal all counting numbers n such that n is not contained in f(n). M is not the empty set because counting number j, such that f(j) = {   }, must be an element of M by definition.
  • Set M raises a contradiction as follows: There must be a unique counting number k such that f(k) = M. Either k is contained in M or k is not contained in M. If k is contained in M, then by the definition of set M, k is not an element of M. If k is not contained in M, then by the definition of set M, k is an element of M. Therefore the function f can’t exist. Hence there is no counting number k that matches with M, and the cardinal number of set P(C) is greater than the cardinal number of set C. (The contradiction is somewhat like damned if you do and damned if you don’t.)
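The special set M can be explored on a small finite set, where every possible function f can be checked by brute force. The sketch below is an illustration only, with hypothetical names; it tries all 512 functions from {1, 2, 3} into its power set and confirms that M is never one of the function values.

```python
from itertools import combinations, product

def all_subsets(elements):
    """Every subset of the given list, as frozensets."""
    return [frozenset(c)
            for size in range(len(elements) + 1)
            for c in combinations(elements, size)]

def diagonal_set(domain, f):
    """M = all n in the domain such that n is not contained in f(n)."""
    return frozenset(n for n in domain if n not in f[n])

def diagonal_set_always_missed(domain):
    """Check every function from the domain into its power set."""
    subsets = all_subsets(domain)
    for images in product(subsets, repeat=len(domain)):
        f = dict(zip(domain, images))
        if diagonal_set(domain, f) in images:
            return False   # would mean some k has f(k) = M
    return True

example = {1: frozenset({1}), 2: frozenset(), 3: frozenset({1, 2})}
print(sorted(diagonal_set([1, 2, 3], example)))
print(diagonal_set_always_missed([1, 2, 3]))
```

A finite check like this proves nothing about infinite sets, but it shows the same mechanism at work: whatever f is chosen, M disagrees with f(n) about the element n.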

Now let’s see how we can prove that the cardinal number of the set of real numbers is strictly less than the cardinal number of the power set of real numbers. We need only to modify the last proof a little bit to give us our proof. The following algorithm describes how to create the modified proof:

  • Let R equal the set of real numbers and the variable x equal any real number or element of R.
  • Assume that there is a one-to-one function f with domain R and range P(R) that matches the real numbers in R with all of the elements of P(R).
  • Construct a special subset M of the real numbers as follows:
    1. Let set M equal all real numbers x such that x is not contained in f(x).
    2. M is not the empty set because real number y, such that f(y) = {   }, must be an element of M by definition.
  • Set M raises a contradiction as follows:
    1. There must be a unique real number k such that f(k) = M. Either k is an element of M or k is not an element of M.
    2. If k is an element of M, then by definition of set M, k is not an element of M.
    3. If k is not an element of M, then by definition of set M, k is an element of M.
    4. Therefore function f can’t exist. Hence there is no real number k that matches with M, and the cardinal number of set P(R) is greater than the cardinal number of set R.
Using the proofs described above as a model, it’s relatively easy to prove that the cardinal number of P(A) is strictly greater than the cardinal number of set A where A is any infinite set. By letting our imaginations run wild and considering set expressions such as P(P(A)) and P(P(P(P(A)))), we can create as many different infinite cardinal numbers as we wish.

 

If you were persistent enough to follow the logic of Cantor’s proofs, your head is probably about ready to explode from all of the mental exercise. You might be asking these questions:
  1. How will I ever apply set theory and cardinal numbers in my daily life? A: Probably never.
  2. Do many engineers and scientists use set theory and cardinal numbers? A: Very few.
  3. Name a math class that uses set theory. A: Statistics, which uses Venn diagrams to model probabilities of events.
  4. Who uses set theory on a regular basis? A: People who design computer logic circuits use Boolean algebra.
  5.  Is the study of set theory and cardinal numbers really just a mind game played by crackpot mathematicians? A: No – Cantor’s work helped create the foundation upon which much of modern mathematics rests.
  6. Should very bright and curious high school math students be exposed to some of Cantor’s ideas? A: Yes.
I will close this post with a bit of personal information about Cantor. Cantor was a devout Lutheran who acknowledged the Absolute infinity of God. He believed that his theories about different levels of infinity were communicated to him by God. Some contemporary Christian theologians viewed Cantor’s work as a direct challenge to the idea that there is a unique infinity that only resides in God. For about the last 35 years of his life, Cantor suffered recurring bouts of depression. Most likely, the numerous vicious attacks on his work by many of his contemporaries contributed to his bouts of depression. Eventually Cantor’s work received praise and accolades from prominent contemporaries. In 1904, the Royal Society awarded Cantor the Sylvester Medal, its highest honor for work in mathematics. The brilliant mathematician David Hilbert said “No one can expel us from the Paradise that Cantor has created.” Cantor spent the last year of his life in a sanatorium where he died on January 6, 1918.

 

Infinity Does Not Necessarily Equal Infinity

Georg Cantor (1845-1918)

A light year is about 6 trillion miles and the U.S. national debt reached 18 trillion dollars in 2015. Numbers of this magnitude are almost impossible to comprehend, but compared to infinity they are rather small. The German mathematician Georg Cantor (1845-1918) invented set theory and the mathematics of infinite numbers, which in Cantor’s time were considered counter-intuitive, utter nonsense, and simply wrong. Many of Cantor’s contemporaries considered him to be nothing more than a charlatan. Set theory and the mathematics of infinite numbers are now part of mainstream mathematics.

To understand why Cantor upset so many mathematicians, I need to explain a basic concept of Cantor’s set theory. The cardinal number of a set or collection of objects equals the number of objects the set contains. Therefore the cardinal number of a gross of pencils equals 144 and the cardinal number of a mole of atoms is about 6.022 x 10^23. For finite sets, the concept of cardinality is simple and straightforward, but for infinite sets the concept of cardinality can seem counter-intuitive and like utter nonsense. Cantor proved that the cardinal number of one infinite set can be greater than the cardinal number of another infinite set; infinity no longer necessarily equals infinity. Cantor also proved that there are infinitely many levels of infinity. In other words, there are infinitely many different infinite cardinal numbers!

What does Cantor mean when he says that two seemingly different infinite sets can have the same cardinality? Why are there as many real numbers between 0 and 1 as there are from –∞ to +∞? Why are there as many counting numbers as there are rational numbers? Why is the cardinality of the set of rational numbers less than the cardinality of the set of real numbers? The purpose of this post is to provide answers to these questions without delving into formal abstract mathematics. If you want a mathematically rigorous discussion of cardinal numbers and set theory, take a graduate level course in set theory or point-set topology. My previous post Relations, Functions, and One-to-One Functions discussed concepts that will be used in this post and therefore readers may find it helpful.

To get started, I will do a quick review of the different types of real numbers. All real numbers are either rational or irrational. The set of rational numbers is composed of counting numbers, whole numbers, integers, and numbers that can be expressed as the ratio of two integers. The decimal expansion of all rational numbers starts to repeat in a pattern of fixed finite length at some point. The decimal expansion of an irrational number never starts to repeat in a pattern of fixed finite length. The text boxes below give examples of the different types of real numbers. Note that there is a pattern in the decimal expansion of n, but the length of the pattern increases. The term “real number” is unfortunate because it suggests that some numbers are valid and other numbers like the imaginary numbers are fake numbers.

[Text box: examples of the different types of real numbers]

[Text box: further examples of the different types of real numbers]

Cantor uses one-to-one functions to define when two sets have the same cardinality. Set A has the same cardinality as set B if and only if there is a one-to-one function that matches elements of A with elements of B such that the domain of the function is set A and the range of the function is set B. When both sets have a finite number of elements, this definition makes perfect intuitive sense. Example: When two bags of golf balls contain an equal number of golf balls, it’s easy to see how we can match the golf balls one-to-one, to show that the two bags of golf balls have the same cardinality. When both sets A and B have infinitely many elements, Cantor’s definition leads to a new and profound understanding of the nature of infinity. The matching rule for the one-to-one function in Cantor’s definition may be described by an equation or general algorithm that tells us how to match domain elements with range elements.

Using Cantor’s definition, let’s see why it makes sense to say that the set of real numbers between 0 and 1 has the same cardinality as the set of real numbers greater than 1. Initially, this seems preposterous. Two numbers are reciprocals if and only if the product of the two numbers equals 1. It’s a mathematical fact that every nonzero real number has a unique reciprocal and the reciprocals of two numbers are different if the numbers are different. If 0 < n < 1, then 1/n > 1. If n > 1, then 0 < 1/n < 1. Therefore the one-to-one function y = 1/x with domain equal to the open interval (0, 1) and range equal to the open interval (1, ∞) leads us to the conclusion that the open intervals (0, 1) and (1, ∞) have the same cardinality. Graph A below shows the graph of this function. If you think about it, the only practical way to show that two sets have the same cardinality is to show that there exists a one-to-one function that pairwise matches the elements of the two sets.

[Graph A: y = 1/x with domain (0, 1) and range (1, ∞)]

I will now demonstrate that the open interval (p, q) has the same cardinality as the open interval (-∞, +∞) for any pair of real numbers p and q such that q > p. Let the width of the interval w = q – p and midpoint of the interval m = (p + q)/2. The one-to-one function y = tan((π/w)(x – m)) with domain (p, q) has a range equal to (-∞, +∞), because (π/w)(x – m) runs through every value between -π/2 and π/2 as x runs from p to q. From Cantor’s definition, it follows that the cardinality of the open interval (p, q) equals the cardinality of the open interval (-∞, +∞). Graph B below illustrates that the open interval (0.75, 1.25) has the same cardinality as the open interval (-∞, +∞).

[Graph B: y = tan((π/w)(x – m)) with domain (0.75, 1.25) and range (-∞, +∞)]
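The tangent matching rule and its inverse can be tried numerically. The sketch below is an illustration under floating-point arithmetic, so it only approximates the exact one-to-one function; the names to_real_line and from_real_line are hypothetical.

```python
import math

def to_real_line(x, p, q):
    """Match x in the open interval (p, q) with a real number via the tangent function."""
    w = q - p                # width of the interval
    m = (p + q) / 2          # midpoint of the interval
    return math.tan(math.pi / w * (x - m))

def from_real_line(y, p, q):
    """Inverse matching: send any real y back into the open interval (p, q)."""
    w = q - p
    m = (p + q) / 2
    return m + w / math.pi * math.atan(y)

p, q = 0.75, 1.25
for x in (0.76, 0.9, 1.0, 1.1, 1.24):
    y = to_real_line(x, p, q)
    print(x, y, from_real_line(y, p, q))
```

Numbers near the endpoints 0.75 and 1.25 are sent to very large negative and positive values, which is how a short interval manages to cover the whole real line.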

The next part of this post will demonstrate that the cardinality of the counting numbers equals the cardinality of the positive rational numbers. This will be accomplished by showing that there is a one-to-one function, f, that matches the counting numbers with the positive rational numbers. The matching rule of the one-to-one function is an algorithm that describes how we can systematically go about matching every counting number with a unique positive rational number in such a manner that every positive rational number gets matched with a counting number. Mathematicians say that the rational numbers are countable.

The algorithm for the matching rule of the one-to-one function is as follows:

1) Organize the positive rational numbers in a rectangular grid as shown below.

2) Start in the upper left corner of the grid. Set the counting number n = 1 and let f(n) = 1/1.

3) Continue moving from grid element to grid element forever as indicated in the diagram. If grid element p/q is not equivalent to a previous function value, then increase n by 1 and let f(n) = p/q. If grid element p/q is equivalent to a previous function value, then skip the grid element and go to the next grid element. (Note that skipped grid elements in the diagram are crossed out.)

infinitytxbx3
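The matching rule can be sketched as a short program. This is only an illustration under an assumption: the exact path through the grid is shown in the diagram, which is not reproduced here, so the sketch walks each diagonal of the grid in one fixed direction. Any path that eventually visits every grid element works the same way. The name positive_rationals is made up for this example.

```python
from fractions import Fraction
from itertools import islice

def positive_rationals():
    """Yield each positive rational exactly once, diagonal by diagonal."""
    seen = set()
    s = 2                              # p + q is constant along each diagonal
    while True:
        for p in range(1, s):
            q = s - p
            value = Fraction(p, q)     # Fraction reduces p/q to lowest terms
            if value not in seen:      # skip elements equivalent to earlier ones
                seen.add(value)
                yield value
        s += 1

# f(1), f(2), ..., f(10) under this particular path through the grid
first = list(islice(positive_rationals(), 10))
print([str(r) for r in first])
```

Every positive rational p/q sits on the diagonal where p + q is constant, so it is reached after finitely many steps and therefore gets matched with some counting number.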

Now for Cantor’s famous diagonal proof that the real numbers are not countable. His proof used the sophisticated technique of proof by contradiction which is commonly used by mathematicians to prove a theorem. The diagonal proof goes something like this.

  • Assume that there is a one-to-one function f(n) that matches the counting numbers with all of the real numbers. The box below shows the start of one of the infinitely many possible matching rules for f(n) that matches the counting numbers with all of the real numbers. The real numbers in the range of the function are represented as strings of base 2 real number digits or binary digits (i.e. consisting only of zeros and ones).
  • Now construct a real number p as follows: Let n equal any counting number and f(n) equal the corresponding function value.
  • If the nth binary digit of f(n) = 0, then set the nth binary digit of p = 1.
  • If the nth binary digit of f(n) = 1, then set the nth binary digit of p = 0. (See the text box below.) Because p differs from f(n) in the nth binary digit for every counting number n, the real number p is not in the range of f(n), which contradicts the original assumption about f(n). Therefore the cardinality of the real numbers is greater than the cardinality of the counting numbers, and the real numbers are not countable.
infinitytxbx4
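The digit-flipping step can be illustrated on a finite list. This is only a sketch: the five binary strings below are made-up stand-ins for the first five digits of f(1) through f(5), and a finite list can only hint at the infinite argument.

```python
def diagonal_number(rows):
    """Build a binary string whose nth digit differs from the nth digit of rows[n]."""
    return "".join("1" if row[n] == "0" else "0" for n, row in enumerate(rows))

# Hypothetical function values: digits after the binary point only.
listing = [
    "01101",
    "10010",
    "11100",
    "00011",
    "10101",
]

p = diagonal_number(listing)
print(p)

# p disagrees with every listed string in at least one position.
for n, row in enumerate(listing):
    assert p[n] != row[n]
```

No matter which strings are placed in the list, the constructed string disagrees with the nth entry at position n, so it can never appear in the list.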

 

Some Comments Regarding Cardinal Numbers and Real Numbers:

  • The symbol for the cardinal number of the counting numbers is ℵ0. (aleph naught)
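  • The cardinal number of the real numbers is 2^ℵ0, often written c (the cardinality of the continuum).
  • ℵ1 denotes the smallest cardinal number greater than ℵ0. The continuum hypothesis states that there is no cardinal number strictly between ℵ0 and c; in other words, c = ℵ1.
  • The continuum hypothesis can be neither proved nor disproved from the standard axioms of set theory (Gödel, 1940; Cohen, 1963).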
  • There are infinitely many rational numbers between any two real numbers.
  • The rational numbers are an infinitely small fraction of the real numbers.
  • To work with irrational numbers in practical applications, we use rational numbers to approximate irrational numbers. (3.1416 ≠ π)
  • The points, lines and curves that we draw on a chalkboard or computer screen are just crude approximations of true mathematical points, lines, and curves.
  • True mathematical points and curves are infinitely thin, and therefore they can’t reflect light which in turn tells us that we really can’t see true mathematical points and curves in the physical sense.
  • Mathematical objects only exist in the mind of man and God.

Applying the Order of Operation Rules to Solve an Equation

chalkboard1

Experienced teachers know that some students seem to have a natural feel of how to solve an equation. They just know how and when an operation should be applied to both sides of the equation. Capable math students may not be fully aware of why they are using a particular strategy to solve an equation, but they know how to apply the strategy. Students who struggle with solving basic equations ask questions like the following: How do you know whether to add or subtract the same number to both sides of the equation? How do you know whether to multiply or divide both sides of the equation by the same number? How do you know in what order various operations need to be applied to both sides of an equation? What does “find x” or “what is x” really mean? The purpose of this post is to provide answers to these types of questions. Readers who have read the posts Inverses of Relations and Functions and Inverse of a Matrix will immediately see the connection between those posts and this post.

To get started, you can download the free handouts Basic Equation Solving Strategies, Strategies for Solving Exponential and Logarithmic Equations, and Basics of Solving Inequality Relations. These free handouts (and many more) can also be accessed by visiting mathteachersresource.com/instruction-content. Depending on the course I’m teaching, I give one or more of these handouts to my students. Basic Equation Solving Strategies breaks down the basic algebraic equations that students will encounter in a lower level math course into six equation types which I call “case 1” through “case 6”. When solving different types of equations, I routinely ask my students to identify the type of equation being solved. When students can identify the type of equation being solved and know the basic algorithm to solve that type of equation, they can quickly and efficiently solve the equation; no wasted time. In application problems, solving an equation is usually just one small step in finding a solution of a problem.

This post will focus on how to solve a case 1 type of equation by the “work backwards” method. Case 1 equations are equations in which the variable appears only once in the equation. The work backwards technique of solving an equation is well known, but the manner in which it’s presented in this post is different. I use the following analogy to explain the basic reasoning behind the work backwards method: Suppose that you have established a base camp on a camping trip and take a day hike in the woods. You can always get back to your base camp by simply reversing your steps.

Sample case 1 equations are shown in the text box below. I certainly would not have a beginning algebra student initially solve equations of this complexity; however, it’s my hope that they will eventually learn how to solve equations of this complexity. It takes a while for students to remember that equations involving the squaring or absolute value operations can have two solutions or no solutions. From personal experience, the work backwards method as presented in this post is effective with both low and high ability students. I remember one of my students saying to a classmate, “The work backwards method really works, but you need to know the order of operation rules.” Is the work backwards method effective with all students? Of course not!

The text box below gives five examples of case 1 equation types. When solving for a variable in terms of other variables, think of the other variables as constants.

equationsolvetxtbx1

To use the work backwards method, it’s necessary to understand which operation reverses a given operation. After looking at specific examples of each type of operation, most students quickly develop an understanding of what reversing an operation means. The text box below gives a summary of common operations and the corresponding inverse operation. Note that the reciprocal operation is its own reverse or inverse operation.

equationsolvetxtbx2a

There are three major steps in solving a case 1 type of equation.

Step 1) Recognize that an equation describes a process that starts with an unknown value of a variable, then performs a series of back-to-back mathematical operations, according to the order of operation rules. This process is described in what I call “the equation solve plan” which may or may not be explicitly stated by the student. Equation solve plans are road maps that lead to the solution of an equation. Once the plan is known, the student knows what operation needs to be applied to both sides of the equation and when the operation needs to be applied. Initially, I require beginning students to explicitly state the equation solve plan as shown in the examples below where the solve plan is described in a box to the left of the list of equations. The equation solve plan is the series of back-to-back operations in the equation that are performed, according to the order of operation rules, on the variable being solved for.

Step 2) Follow the equation solve plan in reverse order from the last step to the first step. At each step, apply the reverse or inverse operation to both sides of the equation. Continue working backwards until the solution appears. This process is like peeling back an onion layer by layer.

Step 3) Check numerical solutions by plugging the solutions into the original equation. This is very important because it helps students better understand the equation and catch errors. I have no sympathy for a student who gives an incorrect solution and has not bothered to check the solution. With modern calculators, there is no reason that solutions can’t be checked.
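The three steps can be sketched in a few lines of code. This is only an illustration, not one of the examples from the text boxes: the equation 5(x - 3)/2 + 7 = 22 and the names solve_plan, work_backwards, and evaluate are made up for this sketch.

```python
from fractions import Fraction

# Step 1: the equation solve plan for the hypothetical equation 5(x - 3)/2 + 7 = 22.
# Each entry is (description, forward operation, inverse operation).
solve_plan = [
    ("subtract 3",    lambda v: v - 3, lambda v: v + 3),
    ("multiply by 5", lambda v: v * 5, lambda v: v / 5),
    ("divide by 2",   lambda v: v / 2, lambda v: v * 2),
    ("add 7",         lambda v: v + 7, lambda v: v - 7),
]

def work_backwards(plan, right_side):
    """Step 2: apply the inverse operations from the last plan step to the first."""
    value = right_side
    for _, _, inverse in reversed(plan):
        value = inverse(value)
    return value

def evaluate(plan, x):
    """Run the plan forwards, which computes the left side of the equation."""
    for _, forward, _ in plan:
        x = forward(x)
    return x

x = work_backwards(solve_plan, Fraction(22))
print(x)

# Step 3: check the solution in the original equation.
assert evaluate(solve_plan, x) == 22
```

Working backwards from 22, the values are 15, 30, 6, and finally 9, which mirrors the list of equations a student would write down. Operations such as squaring or absolute value would need extra care because their inverses can give two values.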

The next six text boxes illustrate how the work backwards method works. The text in the box to the left of the list of equations is the equation solve plan. The operation symbol to the right of a step indicates what operation was applied to both sides of the equation. The check mark certifies that the solution was checked. Notice that the solution of the sixth equation uses a somewhat different approach.

equationsolvetxtbx3

equationsolvetxtbx4

equationsolvetxtbx5

equationsolvetxtbx6

equationsolvetxtbx7

equationsolvetxtbx8

The handouts previously mentioned in this post are intended to provide a set of efficient algorithms that students can use to solve the types of equations and inequalities found in lower level math courses. Of course, some students will find a faster way to solve an equation or inequality in a special situation. However, I have observed students who seem to have a knack for making easy problems difficult. One time I saw a student rewrite the equation and then use the quadratic formula to solve the equation. The solution could have been obtained in 3 or 4 steps by using the work backwards method!

Some Personal Observations:

  • Equation solving is not an end in itself, but a small step in a larger application problem.
  • Practice solving equations is really nothing more than good mental gymnastics.
  • Equation solving is an essential skill, but not creative mathematics.
  • Discovering an equation that models a law of nature is creative mathematics.
  • Discovering a new algorithm to solve a math problem is creative mathematics.
  • Students primarily take algebra to start learning how to reason abstractly with symbols; not to learn how to manipulate polynomial expressions, graph equations and solve equations. Of course, they don’t realize this.
  • Professionals such as engineers, scientists, writers, artists, musicians, educators, company managers, business executives, military planners, etc. routinely think and reason abstractly with symbols, but they never or seldom factor a polynomial or use the quadratic formula.
  • Beginning students should be required to express solutions in decimal format because an expression like 3 + √(29) has no real meaning for them.
  • It’s necessary to remind beginning students that dividing by a number is the same as multiplying by the reciprocal of the number and vice versa.
  • When the solution of an equation is an algebraic expression, imagine replacing the single variable that was solved for with the expression. If you carefully study the resulting equation, you will see the equation magically transformed itself into an identity. It’s amazing.

Inverse of a Matrix

inversematrixleadfig

Matrices are rectangular grids of numbers arranged in rows and columns. Applications of matrices are found in many areas of mathematics and science. Areas of application include: 2D and 3D computer graphics, linear programming, Markov chains in probability theory, finite state automata theory, graph theory, network theory, multiple dimension vector spaces, economic modeling, solving systems of linear equations, calculus, and statistics.

Objects or shapes are drawn and moved in 2D or 3D space on a computer screen by multiplying matrices. Numerically intensive matrix computations are offloaded to video cards that are optimized to carry out millions of matrix multiplications per second. Generally speaking, computer animation is achieved as follows by repeating steps 1, 2, and 3 below.

Step 1) Use the coordinates of the current geometric object or shape to calculate the coordinates of a new geometric object. The coordinates of a new object are calculated by multiplying the coordinates of the current object by a matrix or series of matrices so that the resulting product gives us the coordinates of an object that is to be translated, rotated about a point, reflected over a line or plane, horizontally stretched/shrunk, or vertically stretched/shrunk.

Step 2) Erase the object from the screen.

Step 3) Use the coordinates of the new object to draw the object on the screen and then make the coordinates of the current object equal to the coordinates of the new object.

A previous post discussed the mathematical concepts of inverse of a relation and inverse of a function where the matching rule of the relation was an x-y variable equation. Like many concepts in mathematics, the concept of the inverse of a relation and the inverse of a matrix are connected. The purpose of this post is to show how the concept of the inverse of a geometric transformation matrix dovetails with the general concept of the inverse of a relation. Because of time and space considerations, it’s not possible to provide a review of matrix algebra and 2D geometric transformation matrices in this post, but you can find a concise summary of these topics by downloading the free handouts Introduction to Matrices and Geometric Matrix Transformations (student version) or by visiting mathteachersresource.com/instruction-content. The Matrix Geometric Transformations (student version) handout contains a list of student exercises which are designed to drive home key concepts. The teacher version of this handout provides detailed solutions of the exercises.

I will start the discussion by describing three basic 2D geometric transformation matrices and the structure of a preimage matrix [P] which contains the vertices of the object or shape to be transformed. Notice that all 2D transformation matrices have three horizontal rows and three vertical columns. Preimage matrices have three horizontal rows and n vertical columns; one column for each of the n vertices of the geometric object. In passing, 3D geometric transformation matrices have four rows and four columns.

inversematrixtxtbx1

inversematrixtxtbx2

I will now show you how to use transformation matrices to rotate an object 120° counterclockwise and 120° clockwise about (0, 0). Notice that the transformation matrices [T1] and [T2] must be inverses of each other because a clockwise rotation of 120° reverses a counterclockwise 120° rotation. The vertices of the green colored preimage polygon are contained in matrix [P]. For the red polygon, the matrix of vertices [P’] = [T1][P]. For the blue polygon, the matrix of vertices [P’] = [T2][P]. Matrix elements are rounded to three decimal places.

inversematrixtxtbx3

inversematrixfig1
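The rotation example can be checked with a short program. This is a sketch under stated assumptions: the vertices in P are made up (the polygon in the figure is not reproduced here), each vertex is stored as a column with a 1 in the third row, and matmul and rotation are helper names invented for this illustration.

```python
import math

def matmul(a, b):
    """Multiply matrix a (r x k) by matrix b (k x c); matrices are lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def rotation(degrees):
    """3x3 matrix for a counterclockwise rotation about (0, 0)."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [[c, -s, 0.0],
            [s,  c, 0.0],
            [0.0, 0.0, 1.0]]

T1 = rotation(120)     # counterclockwise
T2 = rotation(-120)    # clockwise

# Hypothetical preimage: one column per vertex, third row all ones.
P = [[2.0, 5.0, 4.0],
     [1.0, 1.0, 3.0],
     [1.0, 1.0, 1.0]]

P_image = matmul(T1, P)            # rotated vertices
P_back = matmul(T2, P_image)       # the second rotation undoes the first
identity = matmul(T1, T2)          # product of a matrix and its inverse

print([[round(v, 3) for v in row] for row in T1])
print([[round(v, 3) for v in row] for row in identity])
```

Up to rounding, the product of the two rotation matrices is the identity matrix, and applying both rotations returns the original vertices.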

Now let’s take a look to see how a translation matrix can be used to slide a figure 7 units to the left and 5 units up. Notice that the transformation matrices [T1] and [T2] must be inverses of each other because sliding a figure in the opposite direction reverses a slide operation. The green colored polygon is the preimage object. For the red polygon, the matrix of vertices [P’] = [T1][P].

inversematrixtxtbx4

inversematrixfig2
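The same check works for the translation example. In this sketch the vertices in P are made up, each vertex is stored as a column with a 1 in the third row (an assumption about the layout of the text box), and matmul and translation are helper names invented for the illustration.

```python
def matmul(a, b):
    """Multiply matrix a (r x k) by matrix b (k x c); matrices are lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def translation(dx, dy):
    """3x3 matrix that slides a figure dx units horizontally and dy units vertically."""
    return [[1, 0, dx],
            [0, 1, dy],
            [0, 0, 1]]

T1 = translation(-7, 5)    # 7 units left, 5 units up
T2 = translation(7, -5)    # 7 units right, 5 units down

# Hypothetical preimage: one column per vertex, third row all ones.
P = [[2, 5, 4],
     [1, 1, 3],
     [1, 1, 1]]

P_image = matmul(T1, P)
print(P_image)                     # vertices of the slid figure
print(matmul(T2, P_image) == P)    # sliding back recovers the preimage
print(matmul(T1, T2))              # the identity matrix
```

Notice that the inverse of a translation matrix can be written down by inspection: just change the signs of the two slide amounts.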

I will conclude this post by showing you how basic geometric transformation matrices can be chained together to get a desired geometric transformation of an object. In this example, the green colored polygon is rotated 180° about the point (1, 3). To do this, the polygon is first slid 1 unit to the left and 3 units down. Next, the translation image is rotated 180° about the origin. The final step in the transformation is to reverse the effects of the initial translation by multiplying by [T2]. Because of the nature of matrix multiplication, the factors in a matrix product must be written in a specific order; the transformation that is applied first is the rightmost factor, so the product is read from right to left. When doing algebraic manipulations of matrix expressions, it’s easy to forget that matrix multiplication is NOT commutative. In fact, the matrix expression [P][T1] is undefined unless [P] happens to have exactly three columns, but [T1][P] is perfectly legal.

inversematrixtxtbx5

inversematrixfig3
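Here is a sketch of the chained transformation. The four vertices in P are made up, the name R for the rotation matrix is an assumption (the text box may label it differently), and matmul is a helper written for this illustration that refuses to multiply matrices whose sizes do not match.

```python
def matmul(a, b):
    """Multiply a (r x k) by b (k x c); raise ValueError if the sizes do not match."""
    if len(a[0]) != len(b):
        raise ValueError("columns of the left matrix must equal rows of the right matrix")
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

T1 = [[1, 0, -1], [0, 1, -3], [0, 0, 1]]   # slide 1 left and 3 down
R  = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]   # rotate 180 degrees about the origin
T2 = [[1, 0, 1], [0, 1, 3], [0, 0, 1]]     # slide 1 right and 3 up (undoes T1)

# Hypothetical preimage with four vertices, one per column.
P = [[1, 2, 4, 0],
     [3, 3, 5, 0],
     [1, 1, 1, 1]]

# The first transformation applied is the rightmost factor.
M = matmul(T2, matmul(R, T1))
P_image = matmul(M, P)
print(M)
print(P_image)

# Order matters: R followed by T1 is a different transformation.
print(matmul(R, T1) == matmul(T1, R))

# With four vertices, [P][T1] is not defined.
try:
    matmul(P, T1)
except ValueError as error:
    print("undefined:", error)
```

The vertex (1, 3) stays where it is, which is what we expect for the center of the rotation.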

Comments Regarding the Inverse of a Matrix:

  • Only square matrices (equal number of rows and columns) can have an inverse.
  • Not all square matrices have an inverse.
  • The basic geometric transformation matrices (translation, rotation, reflection, and stretch/shrink by a nonzero factor) have an inverse because each of these transformations can be reversed.
  • If the functions f(x) and g(x) are inverses of each other, then f(g(x)) = g(f(x)) = x, the identity function.
  • If the matrices [A] and [B] are inverses of each other, then [A][B] = [B][A] = [I], the identity matrix.
  • Before digital computers, finding the inverse of a large matrix was a daunting task.
  • Students can use a graphing calculator to find the inverse of a matrix if the matrix is not too large.
  • The inverse of a basic affine geometric transformation matrix can be found by just inspecting the matrix.

Inverses of Functions and Relations

InverseFuncLeader1

My previous post discussed the mathematical concepts of function and relation. Because the content of this post heavily depends on an understanding of the ideas presented in that post, you may find it helpful to read it before continuing.

The concept of the inverse of a relation is a natural extension of the important concept of a relation. The central idea is that an inverse relation is about reversing a relationship by exchanging variables, reversing/undoing an operation, or reversing/undoing a series of operations in a specific order. The following five questions and situations illustrate how a person uses the concept of the inverse of a relation to solve a problem.

(1) If we know a formula to convert Fahrenheit temperatures to Celsius, what formula converts Celsius temperatures to Fahrenheit?

(2) If we know a formula that tells us how to calculate the area of a circle from its radius, what formula will tell us how to calculate the radius of the circle from its area?

(3) A diner in a restaurant uses the restaurant’s menu function in inverse mode to determine what food items on the menu he/she can afford.

(4) A criminal investigator uses the one-to-one function that matches people with DNA molecules in inverse mode to match a sample of DNA molecules with a criminal.

(5) When solving for the sides and angles of a triangle, a trig student uses the inverse trig functions on his/her calculator to find the measure of an angle that has a specific trig function value.

The purpose of this post is to discuss inverse functions and relations when the matching rule is given by an x-y variable equation where both the domain and the range are subsets of the real numbers. These concepts will be discussed from algebraic and geometric points of view.

I will begin by looking at inverses of functions and relations from a geometric point of view. The two text boxes below summarize the geometric relationships between a relation and the inverse of a relation. The companion graphs illustrate the geometric relationships described in the text boxes. Notice that exchanging the variables in an equation gives us the equation of the inverse relation. These observations, of course, follow from the definition of the inverse of a relation, midpoint formula, definition of slope, and the fact that the product of the slopes of two perpendicular lines equals -1.

InverseFunctxtbx1

InverseFuncFig1

InverseFuncFig2

The text box below shows examples of elementary functions and the corresponding inverse relation which may or may not be a function. Notice that the inverses of the functions y = x^2 and y = |x| are relations, but not functions, since y = x^2 and y = |x| are not one-to-one functions. As a reminder, the symbol √(x) means take the positive square root of x, and positive real numbers have a positive square root and a negative square root. Also note that the function y = Sin(x) is not one-to-one, and therefore the inverse relation is not a function. Calculators get around this problem by restricting the range of the function Sin^-1(x) to values from –π/2 to π/2.

The next part explains how I teach the inverses of the trig functions y = Sin(x) and y = Cos(x). Initially, students struggle with the definitions of the inverse trig functions. Consider the equations listed in the text box and graphs below. Because the trig functions are periodic, there are infinitely many solutions for each equation. Because the calculator keys Cos^-1(x) and Sin^-1(x) are function keys, the calculator should display only one of the infinitely many possible output values. When x ranges from 0 to π, Cos(x) is one-to-one in adjacent quadrants I and II, and all possible output values of Cos(x) from -1 to 1 can be generated in quadrants I and II. Therefore Cos^-1(x) is a function if the output is restricted to range values from 0 to π radians. When x ranges from –π/2 to π/2, Sin(x) is one-to-one in adjacent quadrants I and IV, and all possible output values of Sin(x) from -1 to 1 can be generated in quadrants I and IV. Therefore Sin^-1(x) is a function if the output is restricted to range values from –π/2 to π/2 radians. I have my trig students find six solutions of simple trig equations. Example: Find six angles β in degrees in quadrant III, 3 positive and 3 negative, such that Cos(β) = -0.951056516. Round solutions to the nearest tenth of a degree.

InverseFunctxtbx4

InverseFuncFig3
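The six-angle exercise can be worked with a few lines of code. This is one possible approach, sketched with Python’s math module standing in for the calculator; the variable names are made up for the illustration.

```python
import math

target = -0.951056516

# Like the calculator's Cos^-1 key, math.acos returns one angle from 0 to 180 degrees.
principal = math.degrees(math.acos(target))   # a quadrant II angle
reference = 180.0 - principal                 # reference angle
quadrant_3 = 180.0 + reference                # same cosine value, in quadrant III

# Adding or subtracting full turns gives coterminal angles in quadrant III.
angles = [round(quadrant_3 + 360.0 * k, 1) for k in (-3, -2, -1, 0, 1, 2)]
print(angles)

# Check each solution, as a student would with a calculator in degree mode.
for beta in angles:
    print(beta, math.cos(math.radians(beta)))
```

The inverse cosine key only supplies the quadrant II angle. The quadrant III angle comes from the reference angle, and the remaining solutions come from the period of the cosine function.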

I will conclude this post by showing you how I teach my students to find the inverse of a function when the function is composed of basic functions. The steps in the algorithm involve applying inverse operations in the reverse order of the order of operation rules. Exercises of this type reinforce concepts and are a good way to practice algebra skills. If you want to add some rigor to your course, have students check their solution by showing f(f^-1(x)) = f^-1(f(x)) = x. I remind students that an initial equation like x = y/(3y – 4) is an equation of the inverse relation, but it’s not expressed as a function of x. When a relationship is expressed as a function of x, we can graph the relation with a graphing utility. This is one of the reasons that we teach kids to solve an equation for a given variable. Sometimes I tell students to rearrange the equation for some variable because it makes more sense to them.
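As a sketch of the check described above, take the inverse relation x = y/(3y – 4). Exchanging the variables back shows that it comes from the function f(x) = x/(3x – 4), and solving for y gives y = 4x/(3x – 1). The code below, with made-up names f and f_inverse, verifies the composition at a few sample inputs using exact fractions.

```python
from fractions import Fraction

def f(x):
    """The original function, y = x/(3x - 4)."""
    return x / (3 * x - 4)

def f_inverse(x):
    """Inverse found by solving x = y/(3y - 4) for y:
    x(3y - 4) = y  ->  3xy - y = 4x  ->  y(3x - 1) = 4x  ->  y = 4x/(3x - 1)."""
    return 4 * x / (3 * x - 1)

# Sample inputs that avoid the excluded values x = 4/3 and x = 1/3.
samples = [Fraction(-2), Fraction(0), Fraction(1, 2), Fraction(1), Fraction(5)]
for x in samples:
    print(x, f(f_inverse(x)), f_inverse(f(x)))
```

A numerical check at a handful of inputs is not a proof, but it quickly catches algebra mistakes before students attempt the symbolic verification.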

Useful tools from Math Teacher’s Resource:

•   The graphs in my posts are created with my software, Basic Trig Functions. I think that you will find it very useful for teaching mathematical concepts in your classroom and developing custom instructional content. Relations can be entered as an explicitly defined function of x, an explicitly defined function of y, or as an implicitly defined x-y variable relation. Check it out at mathteachersresource.com/trigonometry.

•   There are a wide variety of free handouts that teachers can use to create lessons or give to students as a handy reference handout. Among these handouts are Inverse Relations and Functions, Even and Odd Functions, and Relations and Functions Introduction handouts. Go to mathteachersresource.com/instructional-content to download MTR handouts. All content is available for immediate download. No sign-up required; no strings attached!

Comments Regarding My Previous Post:

•   Some readers wanted to know the equation of the lead graph in my previous post. The equation of the graph is Cos(x) + Cos(y) >= 0.4 where both x and y range from -15 to 15. In view of the fact that Cos(x) is an even function, it should be no surprise that the graph has symmetry with respect to the x-axis, the y-axis, and the origin.

•   The equation of the strange graph at the end of my previous post is 2xSin(3x) + 2y <= 3yCos(x + 2y) + 1. If you are skeptical, here are six solutions that you can plug into the equation to verify that the equation really does have solutions that satisfy the equality relationship. Just make sure that your calculator is in radian angle mode.

(-5.4, 5.195 577 636)

(5.5, 5.976 946 313)

(8.680 865 276, -5.2)

(-6.8, -6.786 215 284)

(0.578 827 17, -3)

(0.051 781 64, 5.8)
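If you would rather let a computer do the plugging in, here is a small sketch. The function name residual is made up; it returns the left side minus the right side of the equation, using radians. If the listed values are accurate, each printed residual should be very close to zero.

```python
import math

def residual(x, y):
    """Left side minus right side of 2x Sin(3x) + 2y = 3y Cos(x + 2y) + 1 (radians)."""
    return 2 * x * math.sin(3 * x) + 2 * y - (3 * y * math.cos(x + 2 * y) + 1)

points = [
    (-5.4, 5.195577636),
    (5.5, 5.976946313),
    (8.680865276, -5.2),
    (-6.8, -6.786215284),
    (0.57882717, -3),
    (0.05178164, 5.8),
]

for x, y in points:
    print((x, y), residual(x, y))
```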

 

Modeling Limited Population Growth with the Logistic Function

250px-Pierre_Francois_Verhulst[1]

Because of limits on food, living space, disease, current technology, war, and other factors, most populations have limited growth as opposed to unlimited exponential growth which is modeled by the classic exponential growth equation P = P0·b^(t/k). A limited growth population starts growing almost exponentially, but reaches a critical point in time where its growth rate slows, and the population starts to asymptotically approach an upper limit as time increases. There are several models that are used to describe limited growth of a population.

In this post, I will discuss the logistic function which was used by the Belgian mathematician Pierre Francois Verhulst (1804-1849) to study limited population growth. The logistic function also has applications in artificial neural networks, biology, chemistry, demography, ecology, economics, biomathematics, geoscience, mathematical psychology, sociology, political science, probability, and statistics.

The two text boxes below describe the key parameters and relationships between the parameters of a logistic function. Graph (A) shows a typical logistic function curve and how equation parameters can be calculated from known characteristics of the population. If pmax, p0, and tc are known, then a, b, and k can be calculated. Likewise, if a, b, and k are known, then pmax, p0, and tc can be calculated. Keep in mind that all limited growth models can only give us a good approximation of a population value at some point in time.

Graph A - JPEG

Text Box 1 - JPEG

Text Box 2 - JPEG
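The two-way conversion between the parameter sets can be sketched in code. The text boxes are not reproduced here, so this sketch assumes the common form p = a / (1 + b·e^(-kt)), which matches the shape of the equations used in the problems of this post. Under that assumption pmax = a, p0 = a/(1 + b), and the growth rate starts to slow at tc = ln(b)/k, where p = pmax/2. The function names and sample values are made up.

```python
import math

def curve_from_population(p_max, p_0, t_c):
    """Assumed form p = a / (1 + b*e^(-k*t)). Needs 0 < p_0 < p_max/2 and t_c > 0."""
    a = p_max
    b = p_max / p_0 - 1
    k = math.log(b) / t_c
    return a, b, k

def population_from_curve(a, b, k):
    """Recover the upper limit, initial value, and critical time from a, b, k."""
    p_max = a
    p_0 = a / (1 + b)
    t_c = math.log(b) / k
    return p_max, p_0, t_c

# Hypothetical population: upper limit 1000, initial value 50, critical time 10.
a, b, k = curve_from_population(p_max=1000.0, p_0=50.0, t_c=10.0)
print(a, b, k)
print(population_from_curve(a, b, k))
```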

In most cases, the key parameters of a logistic equation are unknown, but an observed set of data-pairs is known. The least-squares logistic equation of a data set is the best of all possible logistic equations that describes the relationship between the data-pair variables. Best possible equation means that the sum of the squared errors (difference between observed value and predicted value) is minimized. Modern graphing calculators have the capability of finding a least-squares equation for a variety of models such as linear, quadratic, cubic, quartic, sinusoidal, log, exponential, and logistic. When given a logistic type data set, I will use a graphing calculator to find the least-squares logistic equation of the data set, and then calculate various characteristics and properties of the resulting logistic model. I will now take a look at two problems that illustrate how the logistic function can be used to describe limited population growth.

Text Box 3 - JPEG

Problem 1 solution: Use math software to do a scatter plot of the data, find the least-squares logistic equation p = 12.0121 / (1 + 10.6694e^(-0.023856x)) of the data set, and then do the appropriate calculations. Refer to graph (B) below. From graph (B) we see that the world population growth rate started to slow in 1999, and the upper limit of the world population is about 12 billion. Keep in mind that this least-squares equation is our current best description of world population growth. Future unknowable events will alter this model.

Graph B - JPEG
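The two numbers read from graph (B) can also be computed directly from the least-squares equation. This sketch assumes the equation has the form p = a / (1 + b·e^(-kx)); the problem statement is in a text box that is not reproduced here, so the meaning of x is also an assumption. If x counts years after 1900, the critical point below corresponds to 1999.

```python
import math

def logistic(x, a, b, k):
    """Logistic curve p = a / (1 + b*e^(-k*x))."""
    return a / (1 + b * math.exp(-k * x))

a, b, k = 12.0121, 10.6694, 0.023856

upper_limit = a                 # the curve approaches a as x increases
x_critical = math.log(b) / k    # where the growth rate starts to slow

print(upper_limit)
print(x_critical)
print(logistic(x_critical, a, b, k))   # half of the upper limit
```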

Problem 2: The logistic function N(t) = 3,600 / (1 + 29.4e^(-0.2t)) models the spread of a disease in a town. N(t) = the total number of people infected at time t, and t = the number of days after the first reported infections.

(a) How many people were initially infected?

(b) How many people were infected after 10 days and after 30 days?

(c) When did the rate of infection start to slow?

(d) What is the upper limit of the number of infected people?

 

Problem 2 solution: Use math software to graph the equation, and then do the appropriate calculations. Refer to graph (C) below.

(a) About 118 people were initially infected.

(b) After 10 days about 723 people were infected, and after 30 days about 3,355 people were infected.

(c) About day 17, the infection rate started to slow.

(d) The upper limit of the number of people infected = 3,600.

Graph C - JPEG
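The four answers can be reproduced without a graph. A minimal sketch, with made-up variable names, using the fact that a logistic curve of this form starts to slow where N(t) is half of its upper limit, which happens at t = ln(29.4)/0.2:

```python
import math

def infected(t):
    """N(t) = 3,600 / (1 + 29.4*e^(-0.2*t))"""
    return 3600 / (1 + 29.4 * math.exp(-0.2 * t))

initial = infected(0)                  # part (a)
day_10 = infected(10)                  # part (b)
day_30 = infected(30)                  # part (b)
slow_day = math.log(29.4) / 0.2        # part (c)
upper_limit = 3600                     # part (d): N(t) approaches 3,600 as t grows

print(round(initial), round(day_10), round(day_30), round(slow_day), upper_limit)
```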

Comments:

  • It’s fun and interesting to experiment with different logistic function parameters. Experimentation always gives a better learning experience.
  • With my graphing calculator, it took about 8 seconds to compute the parameters of a logistic equation. This is an indication of the complexity of the algorithms for computing the parameters of a least-squares equation. I tell my students that they should be ever thankful that they have access to such wonderful computation tools.
  • Computer math software allows students to focus on math concepts, and not get lost in gory computational details. This is why graphing calculators have revolutionized the way we teach statistics. Just getting the ‘answer’ is no longer sufficient. Students must be able to interpret and explain the meaning of the answer in the context of the problem.

Derivation of Continuous Compound Interest Formula without Calculus

Jacob Bernoulli, 1654-1705

My students, like most people, like money and find the topic of compound interest interesting. After completing a unit on simple, compound and continuous compound interest, one of my students told me that math is useful and interesting after all.

This post will discuss the derivation of the formula for the future value of an investment when interest is compounded continuously, FV = Pe^(rt). No prior understanding of the limit concept in calculus is required. I will be using the limit concept, but I will give an informal intuitive explanation of the limit concept as it comes up in the discussion. A recent post discussed an approach for deriving an equation that models exponential growth/decay. Problem (2) in that post showed the derivation of the compound interest formula FV = P(1 + r/k)^(kt) where FV = the future value of the investment account, P = principal or one-time lump-sum investment, r = annual percent rate of return expressed as a decimal, k = the number of times per year interest is compounded, and time t = the number of years the principal is invested.

Before I can get to the derivation of the equation FV = Pe^(rt), I need to explain what continuous compound interest means. Let’s consider an investment where P = $10,000, average annual rate of return = 7% = 0.07, and the investment collects interest over a period of 20 years. I adopted the standard banking convention rule that 1 year = 360 days. (Whether we use 365 or 360 days in a year makes no significant difference. Apparently banks like 30-day months.) The text box below shows how increasing the number of times per year interest is compounded affects the future value of an investment.

compound_interest_txtbx1a
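A table like the one in the text box can be generated with a few lines of code. This is a sketch: the particular compounding periods listed are my own choice and may not match the rows of the text box, and future_value is a made-up name. It uses FV = P(1 + r/k)^(kt) with 1 year = 360 days.

```python
import math

P, r, t = 10_000, 0.07, 20

def future_value(k):
    """FV = P(1 + r/k)^(kt) when interest is compounded k times per year."""
    return P * (1 + r / k) ** (k * t)

periods = [
    ("annually", 1),
    ("quarterly", 4),
    ("monthly", 12),
    ("daily", 360),
    ("hourly", 360 * 24),
    ("every minute", 360 * 24 * 60),
]

for label, k in periods:
    print(f"{label:>12}  k = {k:>7}  FV = ${future_value(k):,.2f}")

continuous = P * math.exp(r * t)       # FV = Pe^(rt)
print(f"  continuous               FV = ${continuous:,.2f}")
```

The printed values climb toward the continuous compounding value and then barely change, which is the pattern students notice in the text box.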

Students immediately notice that there is a point where it makes no difference how often interest is compounded, and they completely understand the difference between simple interest and compound interest. I tell them that the future value of the $10,000 investment, $40,552.00 in this example, represents the upper limit of one’s greed. As interest is compounded more times per year (k approaches infinity), it is compounded over smaller and smaller time intervals: say every second, every microsecond, or continuously. No matter what the principal is or the annual interest rate, there is always an upper limit of the future value of an investment, and the upper limit is reached when interest is compounded continuously.

In 1683 in the course of his study of continuous compound interest, Jacob Bernoulli (1654-1705) wanted to find the number that was the limiting value of the expression (1 + 1/n)^n as n approaches infinity. This was the first time that a number was defined as the limiting value of an expression. Bernoulli determined that this special number is bounded and lies between 2 and 3. In 1748 Leonhard Euler (pronounced Oil-er) (1707-1783) published a document in which he named this special number e. He showed that e is the limiting value of the expression (1 + 1/n)^n as n approaches infinity, and is approximately equal to 2.718281828459045235. He also gave another definition of e as the limiting value of the infinite sum 1 + 1/1! + 1/2! + 1/3! + . . . . Euler is generally given credit as the first to prove e is an irrational number.
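Both definitions of e are easy to experiment with. The following sketch, with made-up function names, evaluates the expression (1 + 1/n)^n for increasing n and sums the first twenty terms of the infinite series.

```python
import math

def limit_form(n):
    """The expression (1 + 1/n)^n."""
    return (1 + 1 / n) ** n

def series_form(terms):
    """Partial sum 1 + 1/1! + 1/2! + ... using the given number of terms."""
    total, factorial = 0.0, 1
    for i in range(terms):
        if i > 0:
            factorial *= i        # factorial now equals i!
        total += 1 / factorial
    return total

for n in (1, 10, 100, 10_000, 1_000_000):
    print(n, limit_form(n))

print(series_form(20))
print(math.e)
```

The series closes in on e after only a handful of terms, while the expression (1 + 1/n)^n approaches e much more slowly and always stays between 2 and 3.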

To help you better understand the definition of the irrational number e, I will start by comparing the graphs of functions of the form y = (1 + 1/k)^x, where k is a fixed constant, with the graph of the function y = (1 + 1/x)^x. Refer to graphs (A) and (B) and the companion text box below. A quantity that approaches infinity gets bigger and bigger without any upper bound. A quantity that approaches a fixed constant gets arbitrarily close to that constant.

compound_interest_fig1

 

compound_interest_fig2

compound_interest_txtbx2a
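The contrast between the two kinds of expressions can also be tabulated. In this sketch the fixed constant k = 10 is an arbitrary choice for illustration, and `fixed_base` and `moving_base` are names invented here.

```python
import math

def fixed_base(x, k):
    """y = (1 + 1/k)^x with k held constant: ordinary exponential growth."""
    return (1 + 1 / k) ** x

def moving_base(x):
    """y = (1 + 1/x)^x: the base shrinks toward 1 as the exponent grows."""
    return (1 + 1 / x) ** x

k = 10  # arbitrary fixed constant, chosen only for illustration
for x in (1, 10, 100, 1_000):
    print(f"x = {x:>5}: (1 + 1/{k})^x = {fixed_base(x, k):.6g}, "
          f"(1 + 1/x)^x = {moving_base(x):.6f}")

print(f"e = {math.e:.6f}")
```

With the base held fixed, the values grow without bound. When the base and exponent change together, the shrinking base and the growing exponent balance each other and the values settle near e.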

The purpose of the above graphs and the comments in the text box is to demonstrate that a subtle difference between the expressions (1 + 1/k)^x and (1 + 1/x)^x results in very different behavior as x approaches ∞. The key result needed in the derivation of the continuous compound interest formula is the fact that e = the limiting value of (1 + 1/x)^x as x approaches ∞, where x may be any positive real number. Considering that the expression (1 + 1/n)^n is a rational number for every positive integer n, it is astonishing that (1 + 1/n)^n approaches an irrational number as n approaches ∞. I can now show you the derivation of the continuous compound interest formula FV = Pe^(rt).

compound_interest_txtbx3
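For readers who cannot see the text box, one standard way to carry out the derivation is sketched below. It assumes the periodic-compounding formula FV = P(1 + r/k)^(kt), where k is the number of compounding periods per year and r > 0, and it uses the substitution x = k/r, a symbol introduced here for illustration. The steps in the text box may be organized differently.

```latex
\begin{align*}
FV &= P\left(1 + \frac{r}{k}\right)^{kt}
   && \text{interest compounded } k \text{ times per year} \\
   &= P\left(1 + \frac{1}{x}\right)^{xrt}
   && \text{substitute } x = k/r, \text{ so } k = xr \\
   &= P\left[\left(1 + \frac{1}{x}\right)^{x}\right]^{rt}
   && \text{power rule for exponents} \\
   &\to P\,e^{rt}
   && \text{as } k \to \infty, \text{ and hence } x \to \infty
\end{align*}
```

Because x = k/r is usually not an integer, the last step needs the limit of (1 + 1/x)^x over all positive real x, which is why that version of the definition of e was emphasized above.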

Comments:

• When I did the calculations for compounding every minute and compounding every second with my graphing calculator, I got results that were slightly different from the expected results. When I used double-precision floating-point numbers in a computer program, the program output agreed with the expected results. We need to constantly remind ourselves that when a calculator or computer evaluates expressions that involve very large numbers, or that require a large number of iterations to arrive at a solution, the results may differ slightly from the expected or theoretical value.

• Using problems similar to the examples in this post, I show my students how compound interest works and what continuous compounding of interest means. I have them enter the expressions into their graphing calculators as the lesson progresses. This gives them practice using their calculators, and they gain a better understanding and appreciation of what compound interest is all about. They are astonished when I show them that $10,000*e^(0.07*20) = $40,552.00.

• For a class of curious or advanced students, it’s not wasted class time to show the derivation of the continuous compound interest formula. Less advanced students are usually content with learning how to use the formula. My handout, Basic Financial Formulas, provides an overview of useful financial formulas that you can use in your classroom.

• The derivation of the continuous compound interest formula is a great opportunity to expose advanced high school algebra, college algebra and pre-calculus students to the limit concept in calculus.

• As mentioned earlier, every term of the sequence a_n = (1 + 1/n)^n is a rational number, but the sequence itself converges to the irrational number e. Most calculus students find this very counterintuitive. What a great opportunity to launch a discussion of any number of related math concepts!

• The constants 0, 1, π, e, and i, where i^2 = -1, are the five most important constants in mathematics because they are widely used in equations that describe relationships in all branches of mathematics and science. The equation e^(πi) + 1 = 0, which is due to Leonhard Euler, is one of the most interesting and intriguing equations in mathematics. Euler used the symbol e for the irrational constant, and in his honor, e is named Euler’s number.

• Both Bernoulli and Euler were prolific mathematical giants. Much of what is routinely used in mathematics and science can be traced back to the work of these two great men. L’Hospital’s Rule in calculus is due to Johann Bernoulli, Jacob’s younger brother, not L’Hospital. L’Hospital published the rule, but Johann Bernoulli discovered it and gave it, for a fee, to L’Hospital.
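Returning to the first comment, about calculator results differing from double-precision results: the effect can be imitated with Python's `decimal` module by limiting the number of significant digits carried in each step. This is only a sketch. The 14-digit setting is an assumption chosen for illustration, not a claim about any particular calculator, and `future_value_limited` is a name invented here.

```python
import math
from decimal import Decimal, Context

def future_value_limited(principal, rate, years, periods_per_year, digits):
    """Compound interest where every step keeps only `digits` significant digits."""
    ctx = Context(prec=digits)
    growth = ctx.add(Decimal(1), ctx.divide(Decimal(rate), Decimal(periods_per_year)))
    factor = ctx.power(growth, Decimal(periods_per_year * years))
    return ctx.multiply(Decimal(principal), factor)

SECONDS_PER_YEAR = 360 * 24 * 60 * 60   # 360-day year, as in this post
P, r, t = 10_000, 0.07, 20

# Ordinary double-precision floats.
double_result = P * (1 + r / SECONDS_PER_YEAR) ** (SECONDS_PER_YEAR * t)

# The same formula with an assumed 14 significant digits per step.
limited_result = future_value_limited(P, "0.07", t, SECONDS_PER_YEAR, digits=14)

continuous = P * math.exp(r * t)

print(f"double precision : {double_result:.4f}")
print(f"14-digit decimal : {float(limited_result):.4f}")
print(f"continuous limit : {continuous:.4f}")
```

The loss happens in the very first step: 1 + r/k is so close to 1 that most digits of r/k are rounded away, and raising the result to a power in the hundreds of millions magnifies that small error.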

Because of limits on food, living space, disease, existing technology, war, and other factors, most populations have limited growth as opposed to unlimited exponential growth, which is modeled by the classic exponential growth equation P = P0*b^(t/k). A limited growth population starts growing almost exponentially, but it reaches a critical point in time where its growth rate slows, and the population starts to exponentially and asymptotically approach an upper limit. There are several models that are used to describe limited growth of a population. In my next post, I will discuss the logistic function, which was used by the Belgian mathematician Pierre François Verhulst (1804-1849) to study limited population growth. The logistic function also has applications in artificial neural networks, biology, chemistry, demography, ecology, economics, biomathematics, geoscience, mathematical psychology, sociology, political science, probability, and statistics.

Solving Newton’s Law of Cooling/Heating Problems without Differential Calculus

GodfreyKneller-IsaacNewton-sm
Sir Isaac Newton (portrait by Godfrey Kneller, 1689)

My last post discussed how to find an exponential growth/decay equation that expresses a relationship between two variables by first constructing a table of data-pairs to better understand and derive the fundamental growth/decay equation A = A0*b^(t/k). Because the content of this post depends on the concepts developed in my last post, I strongly suggest that you read that post before continuing.

This post shows how to solve Newton’s law of cooling and heating problems without any understanding of differential calculus, which makes this post different from the descriptions found in differential calculus textbooks. Newton’s Law of Cooling describes the relationship between the temperature of an object and time t when the object is placed in an environment where the ambient (or surrounding) temperature is held constant. Newton’s law of cooling and heating is described as follows:

(a) If the initial temperature of the object equals the ambient temperature, the temperature of the object remains constant as time t increases.

(b) If the initial temperature of the object is greater than the ambient temperature, the object cools and its temperature exponentially and asymptotically approaches the ambient temperature as time t increases.

(c) If the initial temperature of the object is less than the ambient temperature, the object heats up and its temperature exponentially and asymptotically approaches the ambient temperature as time t increases.

I will use two familiar cooling/heating problems to illustrate how the table data-pair approach can be applied to solve a Newton’s law of cooling or heating problem. The key step in solving a cooling/heating problem is to carefully read the problem and then apply what Newton tells us about cooling and heating to create a rough sketch of the growth/decay graph of the model with key points labeled. Even if you don’t know the equation of the graph, the rough sketch will enable you to determine the parameters of the growth/decay equation. From this rough sketch, recognize that the graph is just the result of a vertical translation of an exponential decay graph in the form A = A0*b^(t/k). (In view of what Newton tells us about cooling and heating, the rough graph makes perfect sense to students.)

Problem 1: A pot of boiling soup is put into a sink filled with cold water. The temperature of the soup was 100° C when it was first put into the sink. By adding ice and stirring the water, the temperature of the water was maintained at a constant temperature of 5° C. If the temperature of the soup was 60° C after 10 minutes, how many minutes will it take for the temperature of the soup to reach a room temperature of 20° C?

Solution: Refer to graphs A and B below where x = time t in minutes and y = the temperature of the soup in degrees Celsius. Similar to graph A, first draw a rough sketch of the model with key points labeled. Recognizing that graph A is just the result of a 5 unit vertical translation of an exponential decay graph, use the information from the first rough sketch to draw a rough sketch of the exponential decay graph with key points labeled, similar to graph B. Now use the key points on the sketch of graph B to find the equation of graph B, and then apply the equation transformation rules to find the equation of graph A. To find out how many minutes it will take for the temperature of the soup to reach 20° C, use a computer graphing program to find the intersection point of the graphs y = 20 and y = 95(55/95)^(x/10) + 5. Graph A tells us the temperature of the soup equals 20° C when time t = 33.77 minutes, or about 34 minutes.

newton_fig1

newton_fig2
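The intersection point in Problem 1 can also be found algebraically with logarithms. The sketch below reads the temperatures as 100, 5, 60 and 20 degrees Celsius, which is consistent with the equation y = 95(55/95)^(x/10) + 5 in the solution. The function name `time_to_reach` and its parameter names are illustrative.

```python
import math

def time_to_reach(initial, ambient, temp_at_k, k, target):
    """Solve ambient + (initial - ambient) * b**(t/k) = target for t using logs.

    b is the fraction of the temperature gap that remains after k time units.
    """
    gap0 = initial - ambient               # starting gap (95 in Problem 1)
    b = (temp_at_k - ambient) / gap0       # periodic decay factor (55/95)
    remaining = (target - ambient) / gap0  # fraction of the gap left at the target
    return k * math.log(remaining) / math.log(b)

minutes = time_to_reach(initial=100, ambient=5, temp_at_k=60, k=10, target=20)
print(f"Problem 1: about {minutes:.2f} minutes")
```

The same function handles heating problems, because the signs of the two gaps cancel when the ratios are formed.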

Problem 2: A 40° F roast is put into an oven that is set to bake at 350° F. After 2 hours, the temperature of the roast is 125° F. The roast is considered done when its internal temperature reaches 165° F. How many hours will it take to cook the roast?

Solution: Refer to graphs C and D below where x = time t in hours and y = the temperature of the roast in degrees Fahrenheit. The strategy is to first draw a rough sketch of the model with key points labeled, similar to graph C below. Recognizing that graph C is the result of a 350 unit vertical translation of an exponential decay graph that was reflected over the x-axis, use the information from the sketch of graph C to draw a rough sketch of the flipped exponential decay graph with key points labeled, similar to graph D. Now use the key points on the sketch of graph D to find the equation of graph D, and then apply the equation transformation rules to find the equation of graph C. To find out how many hours it will take to cook the roast, use a computer graphing program to find the intersection point of the graphs y = 165 and y = -310(225/310)^(x/2) + 350. Graph C tells us that it will take 3.22 hours, or about 3 hours and 13 minutes, to cook the roast.

newton_fig3

newton_fig4
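A graphing program finds the intersection point numerically. A rough stand-in for that step is a bisection search, sketched below for Problem 2 using the equation of graph C with x in hours. The helper `bisect_crossing` and the search interval from 0 to 10 hours are assumptions made for this example.

```python
def roast_temp(x):
    """Graph C from the solution: x in hours, result in degrees Fahrenheit."""
    return -310 * (225 / 310) ** (x / 2) + 350

def bisect_crossing(f, level, lo, hi, tol=1e-6):
    """Find x in [lo, hi] where an increasing function f crosses `level`."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < level:
            lo = mid    # still too cold: the crossing lies to the right
        else:
            hi = mid    # warm enough: the crossing lies to the left
    return (lo + hi) / 2

hours = bisect_crossing(roast_temp, level=165, lo=0, hi=10)
whole, minutes = divmod(round(hours * 60), 60)
print(f"Done after about {hours:.2f} hours ({whole} h {minutes} min)")
```

Bisection only needs the function to be increasing on the interval and to start below and end above the target level, which the sketch of graph C makes clear.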

Here are four exercises that you can give to your students. The solutions are provided. (See my third comment below.) You or your students shouldn’t be too disappointed if you fail to correctly solve all four exercises on your first attempt.

Exercise 1: When first removed from an oven and placed in a 70° F room to cool, the temperature of a cake was 180° F. Three minutes later the temperature of the cake dropped to 160° F.

(a) What is the temperature of the cake after 20 minutes? (A: 98.87° F or about 99° F)
(b) How many minutes will it take for the cake to cool to 90° F? (A: 25.49 minutes or about 26 minutes)

Exercise 2: The temperature of a very small metal bar was 30° C when it was dropped into a large barrel of hot water having a 75° C temperature. After 1 second, the temperature of the bar was 31° C.

(a) How long will it take for the temperature of the bar to reach 70° C? (A: 97.77 seconds or about 98 seconds)
(b) How long will it take for the temperature of the bar to reach 74° C? (A: 169.39 seconds or about 170 seconds)

Exercise 3: Find the equation of graph A below. (A: y = 40(1/5)^(x/10) + 30)

Exercise 4: Find the equation of graph B below. (A: y = -90(2/3)^(x/5) + 160)

newton_fig5a
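Students who want to check their answers to Exercises 1 and 2 without a graphing program could build the temperature function from the three numbers in each problem and evaluate it at the stated times. This is a minimal sketch; `newton_model` is a made-up helper name, and the temperatures are read as 180, 70 and 160 degrees Fahrenheit for the cake and 30, 75 and 31 degrees Celsius for the bar.

```python
def newton_model(initial, ambient, temp_at_k, k):
    """Return T(t) = ambient + (initial - ambient) * b**(t/k)."""
    gap0 = initial - ambient
    b = (temp_at_k - ambient) / gap0   # fraction of the gap left after k time units
    return lambda t: ambient + gap0 * b ** (t / k)

cake = newton_model(initial=180, ambient=70, temp_at_k=160, k=3)   # Exercise 1
bar = newton_model(initial=30, ambient=75, temp_at_k=31, k=1)      # Exercise 2

print(f"Cake after 20 minutes:     {cake(20):.2f} F")
print(f"Cake after 25.49 minutes:  {cake(25.49):.2f} F")
print(f"Bar after 97.77 seconds:   {bar(97.77):.2f} C")
print(f"Bar after 169.39 seconds:  {bar(169.39):.2f} C")
```

Plugging a claimed time back into the model and seeing the target temperature come out is a quick way to confirm an answer.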

Comments:

• In the two sample problems above, the final step in the solution involved finding the intersection point of two graphs. This gives us the solution from a geometric point of view. The solution from an algebraic point of view involves log functions which would enable you to find the solution faster. As I mentioned in previous posts, whenever possible, solutions to problems should be understood from both an algebraic and geometric point of view.

• Solving exponential growth and decay problems naturally leads to a need to understand logarithms and log functions.

• All modern physicists know that the equations they discovered can only give us an approximation of how nature’s laws work. In reference to problem (2) above, if we conducted an experiment with a roast by measuring its internal temperature at various points in time, we would find a discrepancy between the experimental results and the predicted results. No matter how accurately we measure the internal temperature of the roast and time, the errors can’t be taken out of the experimental observations. We can only say that the internal temperature of the roast at some specific point in time lies in an area of uncertainty, which is the area under a probability distribution curve. This is why least-squares regression equations are used to describe the relationship between two variables.

• I have used the handout Newton’s Law of Cooling with college algebra and pre-calculus students, and with more advanced students that I tutor. To download the free student and teacher versions of the handout, go to mathteachersresource.com/instructional-content. There are other free handouts on properties of exponents, properties of logarithms, solving exponential/logarithmic equations, and logarithmic base conversion.

• Using the approach presented in my last post and this post, I believe it’s possible to teach how to solve exponential growth/decay problems to younger mathematically capable students. From my own experience, students find these types of problems interesting and practical.

• All graphs in this post were created with my program Basic Trig Functions. I designed the program to make it easy for teachers to create content for their own courses.

My next post will discuss the derivation of the formula for the future value of an investment when interest is compounded continuously, FV = Pe^(rt). The post will assume that the reader has no understanding of the limit concept in calculus.

Exponential Growth and Decay from a Data-Pairs Approach

header_expgrwthdcy

My last two posts discussed the mathematics of linear growth and decay. If you have not read those posts, you might find it helpful to read them before continuing. This post focuses on finding an exponential equation that expresses a relationship between two variables by first constructing a table of data-pairs to better understand the relationship and see the pattern in the relationship.

Most exponential growth/decay relationships involve a time variable t and the amount A of some quantity at time t. Amount could be the current value of an investment account, population of a city, remaining kilograms of radioactive material, assessed value of a truck, etc. The text box and observations below explain how and why the fundamental exponential growth/decay formula A = A0*b^(t/k) works, and the role that the parameters A0, b, and k play in the equation. Periodic growth factor is another way to think of the base multiplier b.

txtbx1_expgrwthdcy

Some observations about A = A0*b^(t/k) where b > 0:
• The point (0, A0) is the intercept on the vertical axis of the graph.
• Base multiplier b is a periodic growth or decay factor.
• If 0 < b < 1, the equation models exponential decay.
• If b > 1, the equation models exponential growth.
• Exponential growth/decay is about repeated multiplication by growth/decay factor b.
• A0 and any other point on the graph determine a unique exponential equation.
• If A0 is positive, the graph is above and asymptotic to the horizontal axis.
• If A0 is negative, the graph is below and asymptotic to the horizontal axis.
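The observation that A0 and one other point pin down the equation can be turned into a short calculation. In the sketch below, `exponential_through` is a hypothetical helper, the period length k is chosen by the user (any positive k gives the same curve with a different base b), and the sample point (4, 90) is just an illustration.

```python
def exponential_through(a0, point, k=1.0):
    """Return (b, f) where f(t) = a0 * b**(t/k) passes through (0, a0) and `point`.

    k is the length of one growth/decay period; the default of 1 is an
    arbitrary choice made for this sketch.
    """
    t1, a1 = point
    if t1 == 0 or a0 == 0 or a1 / a0 <= 0:
        raise ValueError("the second point must be off the vertical axis "
                         "and on the same side of the horizontal axis as A0")
    b = (a1 / a0) ** (k / t1)   # growth factor per period of length k
    return b, (lambda t: a0 * b ** (t / k))

# Illustration: start at 60 and reach 90 at t = 4, using a period of k = 4.
b, f = exponential_through(a0=60, point=(4, 90), k=4)
print(f"b = {b}")
print(f"f(0) = {f(0)}, f(4) = {f(4)}, f(8) = {f(8)}")
```

Here b comes out as 1.5, so the quantity grows 50% during each period of length 4.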

Most discussions about finding the equation of an exponential relationship don’t start by looking at data-pairs in a table. After only a couple of demonstrations of how to apply the data-pairs approach, students quickly develop the ability to find the three key parameters of an exponential growth/decay relationship. Exponential equations of the form A = A0*b^(t/k) where base b is a rational number are much easier to comprehend than equations of the form A = A0*e^(kt) where e is the irrational math constant 2.718281828459045 . . . . I will use four familiar math problems that involve an exponential relationship to illustrate the table data-pairs approach. In the comments section of this post, you will find an example that further clarifies my reason for expressing most exponential growth/decay equations as A = A0*b^(t/k) where b is a rational number, and the reason that the solutions of population and radioactive growth/decay problems tend to be expressed in terms of base e only.

Problem 1: Consider a population of bacteria that is growing exponentially 50% every 4 hours and the current population is 60 bacteria. Let t = number of hours in the future and N = the number of bacteria after t hours.
(a) Find an equation that expresses N as a function of t.
(b) Find the population 10 hours from now and the population 45 minutes ago.
(c) Express N as a function of t if the population is increasing 5% every 15 minutes.

The solution is given in the text box below. Problem solvers should carefully read the problem, create a table of data-pairs, determine the equation parameters, and then write the equation that models the problem situation. A companion exponential growth graph with a series of slope/rate triangles is provided to show the role that the equation parameters play in the relationship. Of course, the problem solver should always check the solution by using a computer graphing program to graph the equation.

txtbx2_expgrwthdcy

fig1_expgrwthdcy
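As a check on Problem 1, the data-pairs (0, 60), (4, 90), (8, 135), ... lead to A0 = 60, b = 1.5 and k = 4. The sketch below evaluates that model. It reads part (b) as asking for the population 10 hours from now and 45 minutes ago (t = -0.75 hours), which is an interpretation, and the function names are invented for this example.

```python
def bacteria(t_hours):
    """N = 60 * 1.5^(t/4): 60 bacteria now, growing 50% every 4 hours."""
    return 60 * 1.5 ** (t_hours / 4)

def bacteria_fast(t_hours):
    """Part (c): growing 5% every 15 minutes, i.e. every 0.25 hours."""
    return 60 * 1.05 ** (t_hours / 0.25)

# A few rows of the data-pairs table.
for t in (0, 4, 8, 12):
    print(f"t = {t:>2} h: N = {bacteria(t):.1f}")

print(f"10 hours from now: {bacteria(10):.2f}")
print(f"45 minutes ago:    {bacteria(-0.75):.2f}")
print(f"Part (c) after 1 hour: {bacteria_fast(1):.2f}")
```

A negative value of t simply runs the repeated multiplication backward, which is why the same equation also describes the past.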

Problem 2: Suppose a person invests $10,000 in a CD that will earn interest at 6%/year and interest is compounded monthly. Let t = the number of years in the future and V = the value of the investment after t years.
(a) Express V as a function of t.
(b) Find the value of the investment after 10 years and 20 years.
(c) Express V as a function of t if interest is compounded 360 times per year.

txtbx3_expgrwthdcy
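Problem 2 fits the same A = A0*b^(t/k) pattern: each month the balance is multiplied by b = 1 + 0.06/12 = 1.005, and one period lasts k = 1/12 year. The sketch below is one way to compute the values. The function name `cd_value` and its default arguments are illustrative.

```python
def cd_value(t_years, periods_per_year=12, principal=10_000, annual_rate=0.06):
    """V = P * b^(t/k) with b = 1 + rate/periods and k = 1/periods years."""
    b = 1 + annual_rate / periods_per_year   # growth factor per compounding period
    k = 1 / periods_per_year                 # length of one period in years
    return principal * b ** (t_years / k)

print(f"Monthly compounding, 10 years: {cd_value(10):,.2f}")
print(f"Monthly compounding, 20 years: {cd_value(20):,.2f}")
print(f"360 periods per year, 10 years: {cd_value(10, periods_per_year=360):,.2f}")
```

Part (c) only changes the number of periods per year, so the same function covers it.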

Problem 3: The half-life of a radioactive substance equals the time it takes (20 days, 149 years, 5,700 years, etc.) for the substance to lose half its mass. Consider a radioactive substance with a half-life of 60 days that currently has a 100 kg mass. Let time t = the number of days in the future and A = the mass of the remaining substance in kg at time t. Refer to the table and companion graph below.


(a) Find a formula that expresses A as a function of time t in days.
(b) Find the mass of the substance after 135 days.
(c) Find a formula for A(t) if the half-life = 6 hours instead of 60 days.

txtbx5_expgrwthdcy

fig2_expgrwthdcy
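For Problem 3 the data-pairs (0, 100), (60, 50), (120, 25), ... give A0 = 100, b = 0.5 and k = 60. A minimal sketch of the resulting model follows. The function name `remaining_mass` is made up, and in part (c) the 6-hour half-life is written as 0.25 days so that t can stay in days.

```python
def remaining_mass(t, half_life, initial=100):
    """A = initial * 0.5^(t / half_life), with t and half_life in the same units."""
    return initial * 0.5 ** (t / half_life)

# Parts (a) and (b): half-life of 60 days, t in days.
for t in (0, 60, 120, 135):
    print(f"t = {t:>3} days: A = {remaining_mass(t, 60):.2f} kg")

# Part (c): half-life of 6 hours = 0.25 days, t still in days.
print(f"After 1 day with a 6-hour half-life: {remaining_mass(1, 0.25):.2f} kg")
```

Keeping t and the half-life in the same units is the only thing that changes between parts (a) and (c).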

Problem 4: The two exponential growth/decay graphs along with key points on the graphs are shown below.
(a) For graph A: Write an equation that expresses y as a function of x.
(b) For graph B: Write an equation that expresses y as a function of x.

fig3_expgrwthdcy

txtbx6_expgrwthdcy

Here are four exercises that you can give to your students. The graphs are a mixture of linear and exponential growth/decay graphs. Using the points on the graph, find the equation of the graph. If you wish, remind them that they should first create a table of data-pairs. Let them do the exercises with a partner and then check their answers by using a computer to graph the equations. We want to create a safe environment in which kids feel free to experiment and check their answers for understanding. It’s OK to make a mistake, just fix it. If the first attempt to fix a mistake fails, so what? Try again. This is how real people learn to do anything that is worthwhile. The solutions are given at the end of this post.

fig4_expgrwthdcy

Comments:
• Consider the two mathematically equivalent equations below that model the population growth of a small town where t equals the number of years after 2010.

P = 5,200(1.08)^(t/4)  and  P = 5,200e^(0.019240260t)

The first equation immediately tells us the population of the town was 5,200 in 2010, and the population is increasing 8% every 4 years. The second equation tells us the population of the town was 5,200 in 2010, but by just inspecting the second equation, only God can figure out that the population is increasing 8% every four years. (Increasing 8% every 4 years is slightly less than increasing 2% every year.)

• It’s a snap to find the derivative of functions of the form y = Ae^(kt). To find the derivative of functions of the form y = Ab^(x/k) where base b is a rational number requires a little more work. I suspect this is the reason that the solutions of population and radioactive growth/decay problems tend to be expressed in terms of base e only. From my point of view, this is not a sufficient reason to do so because converting an exponential function from one base to another base is a simple procedure. My free handout Logarithmic Base Conversion shows how to do this.

• All modern physicists know that the equations they discovered can only give us an approximation of how nature’s laws work. The brilliant physicist Richard Feynman, over and over again, stated this fundamental fact in his lectures and talks. In reference to problem (3) above, if we conducted an experiment with a radioactive material by measuring the remaining mass of the material at various points in time, we would find a discrepancy between the experimental results and the predicted results. No matter how accurately we measure mass and time, the errors can’t be taken out of the experimental observations. We can only say that the remaining mass of radioactive material at time t lies in an area of uncertainty, which is the area under a probability distribution curve. This is why least-squares regression equations are used to describe the relationship between two variables.

• The formula for calculating the future value of an account after t years when interest is compounded continuously is FV = Pe^(rt) where P = the principal and r = the annual interest rate expressed as a decimal. It’s impossible to express this relationship with a base that is a rational number. In a future post, I will give a derivation of this formula in a manner that does not require an understanding of concepts in calculus.

• I have used the handout, Introduction to Exponential Growth and Decay, with college algebra students, pre-calculus students, and as a review for more advanced students. To download the free student and teacher versions of the handout, go to mathteachersresource.com/instructional-content.html. There are other free handouts on properties of exponents, properties of logarithms, solving exponential/logarithmic equations, and logarithmic base conversion.

• All graphs in this post were created with my software, Basic Trig Functions. I designed the software to help teachers quickly make custom content for their classrooms. This software allows you to easily copy any graphic and then import it directly into a document (e.g. lesson plan, class handout, test) or further manipulate it in various graphic processing programs.
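The first comment's claim that the two population equations are equivalent can be checked with a short base conversion. The sketch below relies on the identity b^(t/k) = e^((ln b / k)t). It is not taken from the handout, and the function names `to_base_e` and `from_base_e` are invented for this example.

```python
import math

def to_base_e(b, k):
    """Return r such that b**(t/k) == e**(r*t) for every t."""
    return math.log(b) / k

def from_base_e(r, k):
    """Return b such that e**(r*t) == b**(t/k) for every t."""
    return math.exp(r * k)

r = to_base_e(1.08, 4)
print(f"r = {r:.9f}")
print(f"growth factor over 4 years = {from_base_e(r, 4):.4f}")

# Both forms of the population model agree, for example 10 years after 2010.
t = 10
print(f"{5200 * 1.08 ** (t / 4):.2f}  vs  {5200 * math.exp(r * t):.2f}")
```

Going from base e back to a rational-looking base requires choosing the period k first, which is exactly the information that is hidden when the equation is written in base e.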

My next post will show how to solve Newton’s Law of Cooling problems without understanding differential calculus.

Solutions to exercises:
Graph A: y = -2x + 40
Graph B: y = 40*0.5^(x/2.5)
Graph C: y = 20*1.5^(x/5)
Graph D: y = x + 10